Comparative Methods
Current understanding
Comparative methods allow evolutionary biologists to model trait evolution across phylogenies, but their reliability depends heavily on the statistical models employed and the structure of the underlying tree. A recurring methodological concern is the inflation of false-positive rates when trait-dependent models are applied to phylogenies with particular topological or branch-length properties. Work on chromosome-number evolution in Carnivora illustrates this problem concretely: neutral binary traits simulated on the Carnivora phylogeny produced false-positive associations with elevated rates of chromosomal fusion in 22% of cases and with elevated rates of fission in 33% of cases under the chromePlus framework. This suggests that researchers relying solely on significance thresholds from such models may frequently mistake stochastic rate variation for genuine trait-dependent effects.
Importantly, effect size can serve as a partial corrective. Although the raw false-positive rates are high, only 7% of neutral simulations produced ΔR magnitudes as extreme as those observed in the empirical data. Filtering results by effect size, and not only by statistical significance, therefore substantially reduces the risk of spurious inference. This distinction between a statistically detectable association and a biologically meaningful one is critical when applying trait-dependent comparative models to any clade.
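The two quantities contrasted above, the raw false-positive rate and the share of neutral simulations whose effect size matches the empirical one, can be computed from the same set of simulation summaries. The following is a minimal sketch under stated assumptions, not the study's actual pipeline: it assumes ΔR is a signed rate difference between trait states, the names `NeutralResult`, `false_positive_rate`, `effect_size_tail`, and `joint_rate` are hypothetical, and the toy values are invented for illustration and do not come from the Carnivora analysis.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class NeutralResult:
    """Hypothetical summary of one model fit to a neutrally simulated trait."""
    delta_r: float      # assumed: estimated rate difference between trait states
    significant: bool   # assumed: whether the fit passed the significance threshold


def false_positive_rate(results: Sequence[NeutralResult]) -> float:
    """Fraction of neutral simulations flagged as significant."""
    if not results:
        raise ValueError("need at least one simulation")
    return sum(r.significant for r in results) / len(results)


def effect_size_tail(results: Sequence[NeutralResult], observed_delta_r: float) -> float:
    """Fraction of neutral simulations with |delta_r| >= |observed_delta_r|."""
    if not results:
        raise ValueError("need at least one simulation")
    threshold = abs(observed_delta_r)
    return sum(abs(r.delta_r) >= threshold for r in results) / len(results)


def joint_rate(results: Sequence[NeutralResult], observed_delta_r: float) -> float:
    """Fraction that are both significant and at least as extreme as observed."""
    if not results:
        raise ValueError("need at least one simulation")
    threshold = abs(observed_delta_r)
    hits = sum(r.significant and abs(r.delta_r) >= threshold for r in results)
    return hits / len(results)


# Invented toy data: ten neutral simulations, three of them nominally significant.
toy_results = [
    NeutralResult(0.10, False),
    NeutralResult(-0.80, True),
    NeutralResult(0.30, False),
    NeutralResult(-0.05, False),
    NeutralResult(1.20, True),
    NeutralResult(0.45, True),
    NeutralResult(-0.20, False),
    NeutralResult(0.15, False),
    NeutralResult(-0.60, False),
    NeutralResult(0.02, False),
]

raw_rate = false_positive_rate(toy_results)
tail_rate = effect_size_tail(toy_results, observed_delta_r=1.0)
both_rate = joint_rate(toy_results, observed_delta_r=1.0)
```

In this toy set, three of ten simulations are significant but only one reaches the assumed observed magnitude of 1.0, which mirrors the pattern described above: conditioning on effect size discards most spurious hits that pass a significance threshold alone.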
Supporting evidence
- Drift drives the evolution 2024, Finding 1: Using neutral simulations on the Carnivora phylogeny, this study directly quantified the false-positive susceptibility of chromePlus-style inference, finding that 22% of neutral traits appeared associated with elevated fusion rates and 33% with elevated fission rates.
Contradictions / open disagreements
The finding raises a tension between the headline false-positive rates and the more reassuring effect-size distribution. A reader emphasizing the raw rates (22–33%) would conclude that chromePlus inferences are poorly calibrated and routinely misleading. A reader emphasizing the ΔR distribution would argue that the method performs acceptably when users condition on large effect sizes, not merely on nominal significance. These two positions are not fully reconciled in the literature surveyed here, and it remains unclear how well the Carnivora-specific calibration transfers to phylogenies that differ in size, balance, or branch-length distribution.
Tealc’s citation-neighborhood suggestions
- Literature on phylogenetic signal and type-I error inflation in BiSSE-family models (e.g., Rabosky & Goldberg 2015) would contextualize these findings within the broader debate about trait-dependent diversification methods.
- Studies benchmarking stochastic character mapping (e.g., SIMMAP) on empirical versus simulated trees would complement the chromosome-specific simulation results described here.