Research glossary.

Plain-language definitions of phylogenetics, chromosome evolution, population genetics, and genomics terms. A working reference, not an exhaustive textbook, written the way we'd explain each concept to a new lab member.

Phylogenetics and comparative methods

Phylogeny: A branching diagram representing the evolutionary relationships among taxa; the "tree of life" for a group of organisms. Branch lengths often represent evolutionary time or amount of change.
Clade: A group consisting of an ancestor and all of its descendants; a monophyletic group. The foundation of phylogenetic classification.
Synapomorphy: A shared derived character state inherited from a common ancestor; the evidence used to define clades.
Brownian motion (BM): A null model of continuous trait evolution in which trait values change by random, undirected increments, so that the variance among lineages accumulates in proportion to branch length. Widely used as a baseline for trait evolution under drift.
Ornstein-Uhlenbeck (OU) model: An extension of Brownian motion in which a trait is pulled toward an optimal value (θ) with strength α. Captures stabilizing selection.
Phylogenetic signal: The tendency for closely related species to resemble each other more than expected by chance. Measured by Blomberg's K or Pagel's λ.
Pagel's λ (lambda): A scalar (0 to 1) that multiplies the off-diagonal elements of the phylogenetic variance-covariance matrix. λ = 1 matches the Brownian motion expectation; λ = 0 is equivalent to a star phylogeny (no signal).
PGLS (Phylogenetic Generalized Least Squares): Regression that accounts for non-independence among species due to shared evolutionary history. Uses the phylogenetic variance-covariance matrix as the error structure.
Mk model: A continuous-time Markov model for discrete character evolution, parameterized by transition rates between states. The simplest version assumes equal rates (ER); others give every transition its own rate (all rates different, ARD) or use structured rate matrices.
Ancestral state reconstruction (ASR): Inference of the most likely character states at internal nodes of a phylogeny, given the tip states and a model of evolution. Can be marginal (per-node) or joint.
Stochastic character mapping (SIMMAP): A Bayesian approach that samples complete histories of character state changes over the entire phylogeny, not just node states; used to calculate time in each state per branch.
Birth-death model: A macroevolutionary model in which lineages speciate at rate λ (birth) and go extinct at rate μ (death). Net diversification rate r = λ − μ. The Yule model is a special case with μ = 0.
BAMM: Bayesian Analysis of Macroevolutionary Mixtures; a program that estimates branch-specific speciation and extinction rates across a phylogeny.
Likelihood ridge: A region of parameter space where many different parameter combinations yield nearly identical (high) likelihood values, making optimization difficult. Common in discrete trait models at high rate values, where transition probabilities approach their equilibrium values and the tip data can no longer distinguish one high rate from another.
Pagel's test: A likelihood ratio test for correlated evolution between two binary characters on a phylogeny, comparing a model of independent evolution to one of correlated evolution.
chromePlus: An R package (Blackmon Lab) implementing Markov models for chromosome number evolution (gain, loss, polyploidy, demiploidy) with the option to correlate rates with a binary trait.
ChromEvol: An earlier standalone C++ program (Mayrose et al.) for ancestral chromosome number reconstruction using probabilistic models.
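
Two of the entries above reduce to a few lines of arithmetic. The sketch below is a minimal illustration in plain Python, not the code of any package named in this glossary: it applies Pagel's λ to a variance-covariance matrix and evaluates the closed-form transition probabilities of an equal-rates Mk model. The function names and the three-species tree are hypothetical, chosen only for the example.

```python
import math

def pagel_lambda_transform(vcv, lam):
    """Multiply the off-diagonal elements of a VCV matrix by lam; keep the diagonal."""
    n = len(vcv)
    return [[vcv[i][j] if i == j else lam * vcv[i][j] for j in range(n)]
            for i in range(n)]

def mk_equal_rates_probs(k, q, t):
    """Equal-rates Mk model with k states and rate q to each other state.

    Returns (probability of ending in the starting state,
             probability of ending in one specific other state) after time t.
    """
    decay = math.exp(-k * q * t)
    p_same = 1.0 / k + (k - 1.0) / k * decay
    p_diff = 1.0 / k - decay / k
    return p_same, p_diff

# Hypothetical tree ((A:1,B:1):1,C:2): A and B share one unit of history.
vcv = [[2.0, 1.0, 0.0],
       [1.0, 2.0, 0.0],
       [0.0, 0.0, 2.0]]
star = pagel_lambda_transform(vcv, 0.0)  # lambda = 0: covariances vanish
same = pagel_lambda_transform(vcv, 1.0)  # lambda = 1: matrix unchanged

# As q grows, both probabilities approach 1/k, so very different high
# rates predict almost the same data (one source of likelihood ridges).
for q in (0.1, 1.0, 10.0, 100.0):
    print(q, mk_equal_rates_probs(2, q, 1.0))
```

In practice these calculations would come from an established phylogenetics package; the sketch only makes the definitions concrete.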

Chromosome and karyotype evolution

Karyotype: The complete chromosomal complement of an organism, typically described by chromosome number (2n), morphology, and sex chromosome system (e.g., XY, ZW, X0).
2n (diploid number): The total number of chromosomes in a somatic cell of a diploid organism.
Haploid number (n): Half the diploid number; the number of chromosomes in a gamete.
Sex chromosomes: A pair of chromosomes that differ between sexes and carry the primary sex-determining locus; in XY systems males are heterogametic (XY), in ZW systems females are heterogametic (ZW).
Dosage compensation: Mechanisms that equalize gene expression between males and females despite differences in sex chromosome copy number (e.g., X-inactivation in mammals).
Fission: Division of one chromosome into two, increasing the chromosome number.
Fusion (Robertsonian): Union of two chromosomes (typically at centromeres) into one, reducing the chromosome number.
Polyploidy: Whole-genome duplication, resulting in more than two complete chromosome sets. Common in plants; rarer in animals.
Demiploidy: Addition of a single extra haploid genome set, resulting in a triploid-like karyotype (3n). Occurs in some insect lineages.
B chromosomes: Supernumerary chromosomes found in some individuals within a population; not essential for normal development; often heterochromatic and selfish.
Non-recombining region (NRR): A portion of a sex chromosome that does not recombine during meiosis; typically degenerates over time due to Hill-Robertson interference.
Heteromorphic sex chromosomes: Sex chromosomes that differ visibly in size or morphology between homologs (e.g., the mammalian X and Y).
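
The four chromosome-number events defined above (fission, fusion, polyploidy, demiploidy) can be read as simple operations on the haploid number n. The sketch below is a hypothetical helper written for this glossary, not the API of chromePlus or ChromEvol. It assumes demiploidy multiplies n by 1.5 and, as a simplification, only handles even n.

```python
def apply_event(n, event):
    """Return the haploid chromosome number after a single event.

    Simplifying assumption (not from the glossary): demiploidy is only
    handled for even n, so that 1.5 * n stays a whole number.
    """
    if event == "fission":
        return n + 1          # one chromosome splits into two
    if event == "fusion":
        if n < 2:
            raise ValueError("fusion needs at least two chromosomes")
        return n - 1          # two chromosomes join into one
    if event == "polyploidy":
        return 2 * n          # whole-genome duplication
    if event == "demiploidy":
        if n % 2:
            raise ValueError("this sketch assumes an even n for demiploidy")
        return n + n // 2     # gain of half the current complement
    raise ValueError(f"unknown event: {event}")

n = 10
for event in ("fission", "fusion", "polyploidy", "demiploidy"):
    print(event, n, "->", apply_event(n, event))
```

Models of chromosome number evolution assign a rate to each of these events; this helper only shows the state change each one produces.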

Population genetics

Genetic drift: Random changes in allele frequencies due to sampling variation in finite populations. Stronger in small populations; primary driver of fixation of neutral alleles.
Effective population size (Ne): The size of an ideal Wright-Fisher population that experiences the same rate of genetic drift as the actual population; often much smaller than census size.
Fixation: The process by which an allele rises to 100% frequency in a population.
Fixation probability: The probability that a new mutation ultimately spreads to fixation. For a neutral mutation this equals its initial frequency, 1/(2N), which is 1/(2Ne) in an ideal population where N = Ne; for a beneficial mutation with small selection coefficient s in a large population, ≈ 2s (Kimura, 1962).
Selection coefficient (s): A measure of the fitness effect of an allele relative to a reference allele. s > 0 = beneficial; s < 0 = deleterious; s = 0 = neutral.
Wright-Fisher model: A discrete-generation model of allele frequency change in a finite population, incorporating drift as binomial sampling. Foundation of most population genetic theory.
Hardy-Weinberg equilibrium (HWE): The expected genotype frequencies (p², 2pq, q²) in an infinitely large, randomly mating population with no selection, mutation, or migration.
Coalescence: A backwards-in-time view of genealogies; any two gene copies in a population eventually "coalesce" at their most recent common ancestor. Coalescent theory relates genealogy shape to population size and history.
FST: A measure of genetic differentiation between populations (0 = no differentiation; 1 = complete differentiation). Based on partitioning of heterozygosity within vs. among populations.
Linkage disequilibrium (LD): Non-random association between alleles at different loci; decreases over time with recombination.
Hill-Robertson interference: Reduced efficacy of selection at one locus due to LD with variants at other loci; especially severe in non-recombining regions like sex chromosomes.
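
The Wright-Fisher, genetic drift, and fixation probability entries can be tied together with a small simulation. The sketch below is a minimal illustration under stated assumptions: an ideal diploid population of constant size N (so N = Ne), no selection, and a hypothetical population size and replicate count picked only to keep the run short. It follows a single new neutral mutation until it is lost or fixed and compares the observed fixation fraction with 1/(2N).

```python
import random

def wright_fisher_fixes(n_diploid, rng):
    """Follow one new neutral mutation; return True if it fixes, False if lost."""
    copies = 2 * n_diploid
    count = 1  # a new mutation starts as a single gene copy
    while 0 < count < copies:
        p = count / copies
        # Binomial sampling: each of the 2N copies in the next generation
        # is the mutant allele with probability p, independently.
        count = sum(1 for _ in range(copies) if rng.random() < p)
    return count == copies

def estimate_fixation_probability(n_diploid, replicates, seed=1):
    """Fraction of replicate populations in which the mutation fixes."""
    rng = random.Random(seed)
    fixed = sum(wright_fisher_fixes(n_diploid, rng) for _ in range(replicates))
    return fixed / replicates

estimate = estimate_fixation_probability(n_diploid=10, replicates=2000)
print("simulated:", estimate, "expected:", 1 / (2 * 10))
```

With more replicates the simulated fraction should settle closer to the expectation; the fixed seed just makes the run repeatable.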

Genomics

Synteny: Conservation of gene order along chromosomes between different species; implies homology of chromosomal segments.
Ortholog: One of two or more genes in different species that descended from a single gene in their common ancestor via speciation (vs. paralog, which arises by duplication).
Read depth / coverage: The average number of sequencing reads that cover each base of a genome assembly; higher depth = more reliable variant calls.
SNP (single nucleotide polymorphism): A single base-pair difference between individuals or populations; the most common form of genetic variation surveyed in population genomic studies.
RADseq (Restriction-site Associated DNA sequencing): A reduced-representation sequencing approach that samples thousands of loci distributed across the genome; widely used for population genomics when a reference genome is unavailable.
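
For the read depth entry, a common back-of-the-envelope calculation is the expected average depth of a sequencing run: total sequenced bases divided by genome size. The sketch below assumes every read maps and that reads are spread evenly across the genome, which real data only approximate. The function name and the example numbers are hypothetical.

```python
def expected_depth(n_reads, read_length, genome_size):
    """Expected average depth = total sequenced bases / genome size (in bp)."""
    return n_reads * read_length / genome_size

# Hypothetical run: 1 million reads of 150 bp against a 5 Mb genome.
depth = expected_depth(n_reads=1_000_000, read_length=150, genome_size=5_000_000)
print(f"{depth:.1f}x")
```

For paired-end data, count both mates as reads (or double the read length) before applying the formula.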

Related resources