Phylogenetic comparative methods, a guide.
Closely related species are not independent observations. Treat them as if they are and you will overstate your evidence. This guide walks from first principles to working analyses: how to read a tree, how to model continuous and discrete traits, how to combine them, and how to simulate the null expectation from a birth-death process. It is meant for a graduate student starting their first comparative project, not a specialist looking for the latest method.
Why this matters
Drop two squirrel species and a bat into the same scatter plot and the squirrels will sit near each other, not because of any process you care about, but simply because they share a recent common ancestor. The data look like 3 points but contain perhaps 1.5 independent observations. Ordinary regression and correlation treat every species as an independent draw, so they overstate the evidence whenever species share ancestry. This is the phylogenetic pseudoreplication problem (Felsenstein 1985). The chapters below are the toolkit for handling it correctly.
The chapters
Foundational. Phylogenies: how to read the tree. Topology, branch lengths, why they matter.
Before anything else you need to know what a phylogeny is and how to read one. This chapter covers interpreting the tree of life, understanding branch lengths, and the ways topology changes inference.
Interactive. Birth-death simulator. Paired trees, identical rates, very different outcomes.
Run paired birth-death simulations with identical rates and watch how different the two trees turn out. A hands-on way to see why stochasticity alone produces very different species counts and why you cannot read rates off a single empirical tree.
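The idea can be sketched in a few lines of Python. This is a minimal illustration, not the simulator itself: it tracks only the number of living lineages rather than building a tree, and the function name `simulate_species_count` and the rate values are assumptions chosen for the example.

```python
import random

def simulate_species_count(birth, death, t_max, rng, n_start=1):
    """Constant-rate birth-death process, simulated event by event.

    Returns the number of lineages alive at time t_max (0 if the clade
    went extinct first). Only the count is tracked, not the tree shape.
    """
    n = n_start
    t = 0.0
    while n > 0:
        # With n lineages, the waiting time to the next event is
        # exponential with total rate n * (birth + death).
        t += rng.expovariate(n * (birth + death))
        if t >= t_max:
            break
        # The event is a speciation with probability birth / (birth + death),
        # otherwise an extinction.
        if rng.random() < birth / (birth + death):
            n += 1
        else:
            n -= 1
    return n

# Hypothetical rates: every run below uses exactly the same parameters.
rng = random.Random(1)
birth, death, t_max = 1.0, 0.5, 5.0
for _ in range(5):
    count_a = simulate_species_count(birth, death, t_max, rng)
    count_b = simulate_species_count(birth, death, t_max, rng)
    print(f"tree A: {count_a:4d} species   tree B: {count_b:4d} species")
```

Each printed pair shares its rates, so any difference between the two counts comes from chance alone.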
Intermediate. Discrete traits. Wings vs no wings. XY vs X0. The Mk model.
How do we model traits that come in distinct states? This chapter covers the Mk model, ancestral state reconstruction, and the practical questions around rate symmetry and hidden states.
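As a small illustration of what the Mk model computes, the sketch below gives the transition probabilities along a single branch for the equal-rates case, where every state changes to every other state at the same rate `q`. The function name `mk_transition_probs` and the parameter values are assumptions made for this example. Asymmetric rates need a matrix exponential of the rate matrix and are not shown.

```python
import math

def mk_transition_probs(k, q, t):
    """Equal-rates Mk model with k states and rate q between any two states.

    Returns (p_same, p_diff): the probability of ending a branch of
    length t in the starting state, and the probability of ending in
    one particular other state.
    """
    # The only non-zero eigenvalue of the equal-rates Q matrix is -k*q,
    # so every transition probability decays toward 1/k at that rate.
    decay = math.exp(-k * q * t)
    p_same = 1.0 / k + (k - 1.0) / k * decay
    p_diff = 1.0 / k - 1.0 / k * decay
    return p_same, p_diff

# Two states (for example winged vs wingless), hypothetical rate q = 0.1.
for t in (0.0, 1.0, 10.0, 100.0):
    p_same, p_diff = mk_transition_probs(k=2, q=0.1, t=t)
    print(f"t = {t:6.1f}   P(no net change) = {p_same:.3f}   P(change) = {p_diff:.3f}")
```

On long branches both probabilities approach 1/k, which is one reason ancestral state reconstruction becomes uninformative deep in a tree when rates are high.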
Intermediate. Continuous traits. Brownian motion, phylogenetic signal, PGLS.
Body size, chromosome number, metabolic rate. How do we analyze traits that vary on a continuum? Brownian motion as the null, phylogenetic signal as a measurement, and phylogenetic generalized least squares as the workhorse regression.
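To make the Brownian motion null concrete, here is a minimal simulation on a three-species tree with two squirrels and a bat. The tree, its branch lengths, and the function names are all made up for illustration. Under Brownian motion the change along a branch is normal with variance proportional to branch length, so the expected correlation between two tips is their shared branch length divided by the total root-to-tip length (here 4/5 for the squirrels and 0 for a squirrel and the bat).

```python
import math
import random

# Hypothetical tree ((squirrel_A:1, squirrel_B:1):4, bat:5).
STEM, TIP, BAT = 4.0, 1.0, 5.0

def simulate_tips(sigma2, rng):
    """Evolve one trait by Brownian motion from a root value of 0."""
    # Change along a branch is normal with variance sigma2 * branch length.
    stem = rng.gauss(0.0, math.sqrt(sigma2 * STEM))  # shared by both squirrels
    squirrel_a = stem + rng.gauss(0.0, math.sqrt(sigma2 * TIP))
    squirrel_b = stem + rng.gauss(0.0, math.sqrt(sigma2 * TIP))
    bat = rng.gauss(0.0, math.sqrt(sigma2 * BAT))
    return squirrel_a, squirrel_b, bat

def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(42)
draws = [simulate_tips(1.0, rng) for _ in range(20000)]
a, b, bat = zip(*draws)
r_squirrels = correlation(a, b)
r_squirrel_bat = correlation(a, bat)
print(f"squirrel A vs squirrel B: r = {r_squirrels:.2f} (expected 0.80)")
print(f"squirrel A vs bat:        r = {r_squirrel_bat:.2f} (expected 0.00)")
```

The trait here evolves with no selection and no link to any other variable, yet the two squirrels end up strongly correlated. That shared-history covariance is what PGLS builds into its error structure.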
Advanced. Discrete and continuous together. Threshold models, phylogenetic ANCOVA.
What happens when you need both types of data in the same analysis? This chapter covers phylogenetic ANCOVA, threshold models, and the patterns to watch for when you combine data types.
Advanced. Likelihood ridges. When the optimizer looks happy but the model is not identified.
A short chapter on likelihood ridges, the common failure mode where the optimizer returns a confident answer to an underdetermined problem. If you run comparative analyses long enough you will encounter one, and this chapter shows how to notice it.
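A toy example of a ridge, offered as a sketch under strong assumptions rather than as the chapter's own example: suppose tip values behave like independent draws from the stationary distribution of an Ornstein-Uhlenbeck process, which has variance sigma2 / (2 * alpha). The Ornstein-Uhlenbeck model is not otherwise covered in this overview and is used here only because its ridge is easy to write down. The data values and the function name `stationary_ou_loglik` are invented for illustration.

```python
import math

def stationary_ou_loglik(values, theta, alpha, sigma2):
    """Log-likelihood of independent draws from N(theta, sigma2 / (2 * alpha)).

    Assumption: tips are treated as independent samples from the
    stationary distribution, so the tree itself plays no role here.
    """
    var = sigma2 / (2.0 * alpha)
    n = len(values)
    ss = sum((x - theta) ** 2 for x in values)
    return -0.5 * n * math.log(2.0 * math.pi * var) - ss / (2.0 * var)

values = [1.2, 0.7, 1.9, 1.4, 0.3]  # made-up trait values
theta = sum(values) / len(values)

# Three very different parameter pairs with the same ratio sigma2 / (2 * alpha).
ridge_points = [(0.5, 1.0), (5.0, 10.0), (50.0, 100.0)]
logliks = [stationary_ou_loglik(values, theta, alpha, sigma2)
           for alpha, sigma2 in ridge_points]
for (alpha, sigma2), ll in zip(ridge_points, logliks):
    print(f"alpha = {alpha:5.1f}   sigma2 = {sigma2:6.1f}   log-likelihood = {ll:.6f}")
```

All three rows report the same log-likelihood because the data inform only the ratio of the two parameters. An optimizer started from different points can stop anywhere along that line and still report convergence, so one practical check is to rerun the fit from several starting values and compare the parameter estimates, not just the likelihoods.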