Continuous Trait Evolution
What is a continuous trait?
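Continuous traits are characters that vary along a spectrum. Body size, brain mass, metabolic rate, and genome size are typical examples. Counts with many possible values, such as chromosome number, are strictly discrete but are sometimes analyzed the same way. These are not discrete categories. A species can have a body size of 2.3 kg, or 4.7 kg, or any value in between.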
When you measure a continuous trait across species, you get a dataset of real numbers. The question in phylogenetic comparative methods is: how much of the variation in this trait is explained by shared ancestry, and how much is explained by independent evolution in different lineages?
The problem with naive regression
Suppose you measure body size and home range for 50 species of carnivores. You want to know: do larger animals have larger home ranges? A simple linear regression seems like the obvious approach.
The problem: your 50 species are not independent data points. If 20 of them are closely related cats, they inherited similar body sizes and home ranges from their common ancestor. Your regression treats each as an independent observation, which overstates the effective sample size and gives you false confidence in your result.
This is the essence of the phylogenetic comparative problem. We need methods that acknowledge the tree structure.
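The inflation is easy to see in a small simulation. The sketch below is an illustration under stated assumptions, not an analysis of real carnivore data: it invents two clades of 10 species, lets every species inherit most of its value from its clade's ancestor, and generates two traits with no true relationship. The names n_per_clade and simulate_trait and the variances 0.9 and 0.1 are made up for the example.

```r
set.seed(1)
n_per_clade <- 10   # hypothetical: two clades of 10 species each

# Each species inherits its clade ancestor's value plus a small independent change.
simulate_trait <- function() {
  ancestor <- rnorm(2, sd = sqrt(0.9))
  rep(ancestor, each = n_per_clade) + rnorm(2 * n_per_clade, sd = sqrt(0.1))
}

# Simulate two traits with NO true relationship and record the OLS p-value.
p_values <- replicate(2000, {
  x <- simulate_trait()
  y <- simulate_trait()
  summary(lm(y ~ x))$coefficients["x", "Pr(>|t|)"]
})

# Share of "significant" slopes at the 5% level; a valid test would give about 0.05.
false_positive_rate <- mean(p_values < 0.05)
false_positive_rate
```

Because the 20 species behave more like two data points than twenty, ordinary regression reports a significant slope far more often than the nominal 5%.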
Brownian Motion
The foundation for most continuous trait analysis is the Brownian motion model. This is a random walk in continuous time: over any short interval of length dt, the trait changes by a random amount drawn from a normal distribution with mean 0 and variance sigma-squared times dt, where sigma-squared is the rate of evolution. Changes in non-overlapping intervals are independent of one another.
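A minimal sketch of this random walk, assuming an arbitrary rate (sigma2 = 0.5) and an arbitrary elapsed time (4 units), both chosen only for illustration. It simulates many independent lineages that all start at 0 and checks that the variance among them at the end is close to the rate multiplied by the elapsed time.

```r
set.seed(42)
sigma2     <- 0.5    # assumed rate of evolution (variance per unit time)
total_time <- 4      # assumed length of the lineage
n_steps    <- 400    # number of small time steps
n_lineages <- 5000   # independent replicate lineages
dt <- total_time / n_steps

# Each step is a normal draw with mean 0 and variance sigma2 * dt.
steps <- matrix(rnorm(n_lineages * n_steps, mean = 0, sd = sqrt(sigma2 * dt)),
                nrow = n_lineages)

# The trait value at the end of a lineage is the sum of its steps.
final_values <- rowSums(steps)

var(final_values)      # should be close to the value on the next line
sigma2 * total_time
```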
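Under Brownian motion, the variance of a trait at the tips of the tree is proportional to time: after time t, the expected variance among independent lineages is sigma-squared times t. A longer branch accumulates more variance and a shorter branch accumulates less. Two species also covary in proportion to the branch length they share before their lineages split, which is why close relatives tend to resemble each other.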
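The same logic extends from a single branch to a whole tree. As a sketch, assume a hypothetical three-species tree written in Newick form as ((A:1,B:1):2,C:3), with the rate set to 1. Simulating an independent Brownian change on every branch shows that each tip has a variance equal to its root-to-tip distance (3), and that A and B covary by the length of the branch they share (2), while C is independent of both.

```r
set.seed(7)
sigma2 <- 1       # assumed rate of evolution
n_rep  <- 20000   # number of simulated histories

# Brownian change accumulated along a branch of the given length.
bm_change <- function(branch_length) {
  rnorm(n_rep, mean = 0, sd = sqrt(sigma2 * branch_length))
}

# Hypothetical tree ((A:1,B:1):2,C:3); the root value is 0.
stem_AB <- bm_change(2)            # branch shared by A and B
sp_A <- stem_AB + bm_change(1)     # A's own terminal branch
sp_B <- stem_AB + bm_change(1)     # B's own terminal branch
sp_C <- bm_change(3)               # C evolves independently from the root

tips <- cbind(A = sp_A, B = sp_B, C = sp_C)
round(cov(tips), 2)   # diagonal near 3, cov(A, B) near 2, other entries near 0
```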
Phylogenetic signal
Traits differ in how closely they track the phylogeny. Some show strong phylogenetic signal: closely related species are very similar, and the trait is predictable from phylogeny. Others show weak signal: trait values are scattered across the tree, with no obvious relationship to phylogeny.
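One of the most common ways to quantify signal is Blomberg's K, which compares the observed distribution of trait values with what Brownian motion on the same tree would produce. Under Brownian motion, the expected value of K is 1. If K > 1, close relatives are more similar than expected under BM (strong signal). If K < 1, they are less similar than expected (weak signal, more convergence or recent change).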
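To make the statistic concrete, here is a base-R sketch of K, assuming its usual definition as a ratio of two mean squared errors divided by the value that ratio is expected to take under Brownian motion. The function name blomberg_k and the eight-species, two-clade covariance matrix are invented for illustration; in real work you would take the matrix from your tree and would normally use an established implementation rather than this one.

```r
# x: trait values; C: phylogenetic covariance matrix (shared branch lengths),
# with rows and columns in the same order as x.
blomberg_k <- function(x, C) {
  n <- length(x)
  invC <- solve(C)
  a <- sum(invC %*% x) / sum(invC)              # phylogenetically weighted mean
  r <- x - a
  observed <- sum(r^2) / drop(t(r) %*% invC %*% r)
  expected <- (sum(diag(C)) - n / sum(invC)) / (n - 1)
  observed / expected
}

# Hypothetical tree: two clades of 4 species, depth 3, clade members share 2 units.
C <- kronecker(diag(2), matrix(2, 4, 4)) + diag(8)
L <- t(chol(C))   # L %*% z has covariance C when z is standard normal

set.seed(3)
k_bm     <- replicate(2000, blomberg_k(drop(L %*% rnorm(8)), C))  # Brownian traits
k_random <- replicate(2000, blomberg_k(rnorm(8), C))              # no tree structure

mean(k_bm)       # averages close to 1
mean(k_random)   # averages well below 1
```

Individual values of K vary a good deal on a tree this small, so a single estimate is best read alongside a significance test or a simulation like this one.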
Phylogenetic Generalized Least Squares (PGLS)
When you regress one continuous trait against another (e.g., body size vs. home range), you want to account for the phylogenetic non-independence. This is what PGLS does.
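PGLS is a regression method that replaces the assumption of independent errors with the covariance matrix expected from the phylogeny. Under Brownian motion, species that are closer in the tree are expected to have more similar residuals, with a covariance proportional to the branch length they share. PGLS uses this to down-weight the redundant information carried by close relatives, so the coefficients and standard errors reflect the amount of independent evolutionary change rather than the raw number of tips.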
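The weighting can be written out directly. The sketch below is an illustration in base R, not the routine you would use in practice: it assumes a hypothetical four-species tree ((A:1,B:1):2,(C:1,D:1):2), made-up trait values, and a helper called gls_fit that applies the standard generalized least squares estimator, beta = (X' C^-1 X)^-1 X' C^-1 y.

```r
species <- c("A", "B", "C", "D")

# Covariance implied by the hypothetical tree: 3 on the diagonal (root-to-tip
# distance), 2 for sister species (shared branch length), 0 across clades.
C <- matrix(c(3, 2, 0, 0,
              2, 3, 0, 0,
              0, 0, 3, 2,
              0, 0, 2, 3),
            nrow = 4, byrow = TRUE, dimnames = list(species, species))

# Made-up trait values, for illustration only.
x <- c(A = 1.0, B = 1.2, C = 3.1, D = 2.9)
y <- c(A = 2.1, B = 2.0, C = 4.2, D = 4.5)

# Generalized least squares: beta = (X' C^-1 X)^-1 X' C^-1 y
gls_fit <- function(x, y, C) {
  X <- cbind(intercept = 1, slope = x)
  invC <- solve(C)
  drop(solve(t(X) %*% invC %*% X, t(X) %*% invC %*% y))
}

gls_fit(x, y, C)         # phylogenetically weighted coefficients
gls_fit(x, y, diag(4))   # identity matrix: reduces to ordinary least squares
coef(lm(y ~ x))          # same numbers as the line above
```

Passing an identity matrix recovers ordinary regression, which makes explicit that the only thing PGLS changes is the assumed covariance among species.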
What you actually do in R
In practice, you use the gls function from the nlme package with a correlation structure from ape:
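library("nlme")
library("ape")

# tree: your phylogeny (a "phylo" object)
# data: your data frame, one row per species, with the trait columns
# The rows of data must be in the same order as tree$tip.label. Recent
# versions of ape can instead match species by name through the form
# argument of corBrownian.
model <- gls(trait2 ~ trait1,
             data = data,
             correlation = corBrownian(phy = tree))
summary(model)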
The corBrownian function encodes the phylogenetic covariance matrix. The gls function (generalized least squares) uses this matrix to compute regression coefficients that properly account for phylogenetic non-independence.
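To try the call end to end without your own data, you can simulate a tree and two traits. This is a sketch under several assumptions: the tree is a random 40-tip coalescent tree from ape's rcoal, the traits come from ape's rTraitCont, the true slope of 0.8 and the column name species are arbitrary choices, and the form argument of corBrownian requires a reasonably recent version of ape.

```r
library(nlme)
library(ape)

set.seed(1)
tree <- rcoal(40)   # random ultrametric tree with tips t1 ... t40

# Predictor and residual both evolve by Brownian motion on the tree.
x <- rTraitCont(tree, model = "BM", sigma = 1)
e <- rTraitCont(tree, model = "BM", sigma = 0.5)

dat <- data.frame(species = tree$tip.label,
                  trait1  = x[tree$tip.label],
                  trait2  = 2 + 0.8 * x[tree$tip.label] + e[tree$tip.label],
                  row.names = tree$tip.label)

# form = ~species tells corBrownian which column holds the tip labels.
fit <- gls(trait2 ~ trait1,
           data = dat,
           correlation = corBrownian(1, phy = tree, form = ~species))

coef(fit)   # the trait1 coefficient should land near the simulated slope of 0.8
```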