Mk Model

One-sentence definition. The Mk model is a continuous-time Markov chain for discrete trait evolution on a phylogeny, where each trait state can transition to any other at a rate governed by a rate matrix Q, and likelihood is computed by integrating over all possible ancestral histories.

One-sentence analogy. The Mk model treats trait evolution like a random walk in which a ball in one state can jump to any other state at a fixed rate; the phylogeny is the track, and the observed tip states let you infer the rates and the likely starting points.

Why it matters. The Mk model is the foundational engine behind ancestral state reconstruction, stochastic character mapping (SIMMAP), BiSSE, and most comparative discrete-trait analyses in the lab. Key modeling choices (equal-rates vs. all-rates-different, reversible vs. irreversible, number of hidden states) can substantially change inferences. In haplodiploidy evolution, for example, permitting reversions (two-rate model) versus forbidding them (one-rate model) changes the estimated number of independent origins from 12.9 to 7.9. Getting the model right is not a formality.

Where you meet it in the wiki.

Primary citation. null

Prerequisites: none

Next, learn about: ancestral state reconstruction, SIMMAP, BiSSE

Background

Pagel (1994) introduced maximum-likelihood fitting of continuous-time Markov models to discrete characters on phylogenies. The name “Mk,” short for “Markov k-state model,” is usually credited to Lewis (2001), who applied the equal-rates version to morphological data. The model is a continuous-time Markov chain (CTMC): a character in state i can switch to state j at any instant, at a rate set by the corresponding entry of a rate matrix. The chain has no memory; the probability of the next transition depends only on the current state, not on how long the lineage has been in that state or how it got there. This memoryless property is both the model’s strength and its central assumption.

The Mk model also assumes that the character evolves independently of other traits and independently along each branch of the tree. For chromosome-count evolution this matters: we treat chromosome number transitions as a single process running along the tree, not something coupled to body size or mating system, unless we explicitly model that coupling with tools like BiSSE or chromePlus.

How it works

The Mk model specifies a rate matrix Q of size k × k, where k is the number of character states. Each off-diagonal entry q_ij is the non-negative instantaneous rate of transitioning from state i to state j. Each diagonal entry q_ii is set to minus the sum of the other entries in its row, so that every row sums to zero. The probability of transitioning from state i to state j over a branch of length t is the (i, j) entry of the matrix exponential e^(Qt).
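As a minimal sketch of that calculation, the Python below fills in the diagonal of a small rate matrix and approximates e^(Qt) with a truncated Taylor series plus repeated squaring. The helper names (make_q, matmul, transition_matrix) and the three-state rates are hypothetical and chosen only for illustration; for real analyses a tested library routine such as scipy.linalg.expm would be the safer choice.

```python
import math

def make_q(rates):
    """Copy a k x k table of rates and set each diagonal so the row sums to zero."""
    k = len(rates)
    q = [row[:] for row in rates]
    for i in range(k):
        q[i][i] = -sum(q[i][j] for j in range(k) if j != i)
    return q

def matmul(a, b):
    """Plain matrix product of two square matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

def transition_matrix(q, t, squarings=10, terms=12):
    """Approximate P(t) = exp(Q t): Taylor series on Q t / 2^s, then square s times."""
    k = len(q)
    scale = t / (2 ** squarings)
    a = [[q[i][j] * scale for j in range(k)] for i in range(k)]
    identity = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    result = [row[:] for row in identity]
    term = [row[:] for row in identity]
    for n in range(1, terms + 1):
        # term becomes A^n / n!
        term = [[x / n for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(k)] for i in range(k)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

# Hypothetical three-state rates; the diagonal is filled in by make_q.
Q = make_q([[0.0, 0.2, 0.1],
            [0.4, 0.0, 0.3],
            [0.1, 0.5, 0.0]])
P = transition_matrix(Q, t=1.5)
for row in P:
    print([round(x, 4) for x in row])
```

Each row of P should sum to one, and because the chain is memoryless, the matrix for a branch of length 1.5 should equal the product of the matrices for lengths 1.0 and 0.5.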

Three common parameterizations appear in the literature. The equal-rates (ER) model forces all off-diagonal entries to a single rate. The symmetric (SYM) model gives each pair of states its own rate but constrains the forward and reverse rates within a pair to be equal. The all-rates-different (ARD) model estimates every off-diagonal entry independently. We choose among them by comparing fitted models with AIC or BIC, which penalize the maximized likelihood by the number of free parameters.
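The parameter counts behind that comparison can be sketched as follows. The function names are hypothetical, and the log-likelihoods are made-up numbers used only to show the arithmetic. What to use as the sample size in BIC for a phylogeny is a judgment call; here n_obs is simply assumed to be the number of tips.

```python
import math

def n_free_rates(k, model):
    """Number of free rate parameters in a k-state Mk model."""
    if model == "ER":
        return 1                      # one rate shared by every transition
    if model == "SYM":
        return k * (k - 1) // 2       # one rate per unordered pair of states
    if model == "ARD":
        return k * (k - 1)            # one rate per ordered pair of states
    raise ValueError(f"unknown model: {model}")

def aic(log_lik, n_params):
    return 2 * n_params - 2 * log_lik

def bic(log_lik, n_params, n_obs):
    # n_obs is assumed here to be the number of tips in the tree.
    return n_params * math.log(n_obs) - 2 * log_lik

# Made-up maximized log-likelihoods for a three-state character.
fits = {"ER": -42.0, "SYM": -40.5, "ARD": -39.8}
k = 3
scores = {model: aic(ll, n_free_rates(k, model)) for model, ll in fits.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

With these invented numbers the richer models improve the likelihood too little to pay for their extra parameters, so the equal-rates model has the lowest AIC.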

Likelihood across the whole tree comes from Felsenstein’s pruning algorithm. Starting at the tips with observed states, we propagate conditional likelihoods from the tips toward the root, summing over all possible states at each internal node. At the root, the conditional likelihoods are weighted by a prior on the root state and summed. The result is the probability of observing the tip data given Q and the tree.
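The recursion can be sketched on a three-tip tree. To stay self-contained, this sketch assumes an equal-rates model, whose transition probabilities have a closed form, and a flat prior on the root state; the Node class, the function names, and the example tree are all hypothetical.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    length: float = 0.0                  # length of the branch leading to this node
    state: Optional[int] = None          # observed state at a tip, None otherwise
    children: List["Node"] = field(default_factory=list)

def p_er(i, j, rate, t, k):
    """Closed-form transition probability for a k-state equal-rates model."""
    decay = math.exp(-k * rate * t)
    return 1.0 / k + (k - 1.0) / k * decay if i == j else (1.0 - decay) / k

def conditional_likelihoods(node, rate, k):
    """L[s] = probability of the tip data below `node`, given `node` is in state s."""
    if not node.children:
        return [1.0 if s == node.state else 0.0 for s in range(k)]
    like = [1.0] * k
    for child in node.children:
        child_like = conditional_likelihoods(child, rate, k)
        for s in range(k):
            # Sum over every state the child could be in at the end of its branch.
            like[s] *= sum(p_er(s, c, rate, child.length, k) * child_like[c]
                           for c in range(k))
    return like

def tree_likelihood(root, rate, k):
    """Weight the root's conditional likelihoods by a flat prior (an assumption)."""
    return sum(conditional_likelihoods(root, rate, k)) / k

def example_tree(a, b, c):
    """Hypothetical tree ((A:0.3, B:0.3):0.2, C:0.5) with tip states a, b, c."""
    inner = Node(0.2, children=[Node(0.3, state=a), Node(0.3, state=b)])
    return Node(children=[inner, Node(0.5, state=c)])

lik = tree_likelihood(example_tree(0, 0, 1), rate=0.5, k=2)
print(lik)
```

Normalizing the root's conditional likelihoods so they sum to one gives the marginal probability of each root state under the same flat prior, which is the starting point for ancestral state reconstruction.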

For chromosome number the state space spans dozens of integer values. The chromePlus package extends the Mk framework by building a structured Q matrix with separate rate parameters for gains, losses, and polyploidy events, and by coupling chromosome number to a binary trait so that rate parameters can differ across trait states.

A worked example

Suppose we study chromosome number evolution in Coleoptera, where haploid counts range from 5 to 30. We define 26 states, one per count. The Q matrix has a banded structure: nonzero off-diagonal entries appear only between adjacent counts (gains: i to i+1; losses: i to i-1) and from i to 2i for polyploidy. All other entries are zero by assumption, which keeps the model identifiable.
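A sketch of how such a matrix could be assembled is shown below. The function name chromosome_q and the rate values are hypothetical, and this is not the chromePlus implementation. One boundary choice is an assumption of the sketch: a doubling that would exceed the largest count is simply dropped.

```python
def chromosome_q(min_n, max_n, gain, loss, poly):
    """Build a banded rate matrix over haploid counts min_n..max_n."""
    counts = list(range(min_n, max_n + 1))
    index = {n: i for i, n in enumerate(counts)}
    k = len(counts)
    q = [[0.0] * k for _ in range(k)]
    for n in counts:
        i = index[n]
        if n + 1 <= max_n:
            q[i][index[n + 1]] += gain      # gain: n -> n + 1
        if n - 1 >= min_n:
            q[i][index[n - 1]] += loss      # loss: n -> n - 1
        if 2 * n <= max_n:
            q[i][index[2 * n]] += poly      # polyploidy: n -> 2n, dropped if out of range
        q[i][i] = -sum(q[i][j] for j in range(k) if j != i)
    return counts, q

# Illustrative rates only, not estimates from any data set.
counts, Q = chromosome_q(5, 30, gain=0.01, loss=0.04, poly=0.001)
k = len(counts)
nonzero = sum(1 for i in range(k) for j in range(k) if i != j and Q[i][j] > 0)
print(k, nonzero)
```

With counts from 5 to 30 there are 25 possible gains, 25 possible losses, and 11 doublings that stay in range (from 5 through 15), so only 61 of the 650 off-diagonal entries are nonzero, and they are governed by just three parameters.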

We fit the model with MCMC. The sampler proposes new values for the gain rate, loss rate, and polyploidy rate, then computes the pruning likelihood for each proposal. After the chain converges, we have a posterior distribution over each rate. A typical result shows the loss rate three to five times higher than the gain rate, and the polyploidy rate an order of magnitude lower. This ordering is consistent with chromosome loss being common and polyploidy being rare but largely irreversible. We then reconstruct the most probable chromosome count at each internal node by integrating those posterior rates through the pruning algorithm.
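The propose-evaluate-accept loop can be sketched with a deliberately small stand-in. Instead of the three chromosome rates and a full pruning likelihood, this toy samples a single rate of a two-state equal-rates model from independent sister pairs, where the likelihood has a closed form. The data, the Exponential(1) prior, the step size, and all function names are assumptions made for illustration.

```python
import math
import random

def log_likelihood(rate, pairs):
    """Two-state equal-rates likelihood for independent sister pairs.

    Each pair is (path_length_between_tips, tips_differ)."""
    total = 0.0
    for t, differ in pairs:
        p_diff = -0.5 * math.expm1(-2.0 * rate * t)   # 1/2 - 1/2 * exp(-2 * rate * t)
        total += math.log(p_diff if differ else 1.0 - p_diff)
    return total

def log_posterior(theta, pairs):
    """Log posterior of theta = log(rate) under an assumed Exponential(1) prior."""
    rate = math.exp(theta)
    # -rate is the log prior; +theta is the Jacobian of the log transform.
    return log_likelihood(rate, pairs) - rate + theta

def metropolis(pairs, n_steps=5000, step=0.5, seed=1):
    """Random-walk Metropolis sampler on log(rate); returns sampled rates."""
    rng = random.Random(seed)
    theta = 0.0
    current = log_posterior(theta, pairs)
    samples = []
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0.0, step)       # symmetric proposal
        candidate = log_posterior(proposal, pairs)
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate      # accept the move
        samples.append(math.exp(theta))
    return samples

# Made-up data: 40 sister pairs one time unit apart, 8 of which differ in state.
pairs = [(1.0, True)] * 8 + [(1.0, False)] * 32
samples = metropolis(pairs)
posterior = samples[1000:]                            # discard burn-in
mean_rate = sum(posterior) / len(posterior)
print(mean_rate)
```

A sampler for the chromosome model would follow the same loop, with the closed-form likelihood replaced by the pruning likelihood and the single rate replaced by the gain, loss, and polyploidy rates.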

Common misconceptions

How to spot it in papers

Further reading

Within-wiki concepts and methods that build directly on the Mk model:
