Beetle genomics.
Coleoptera is the most species-rich order of animals, with about 400,000 described species and probably more than a million yet undescribed. Almost none have reference genomes. That is the opportunity we work in, and it is why our lab builds karyotype databases, chromosome-level assemblies, and comparative frameworks centered on beetles.
Why beetles
Foundations: The most species-rich order of life, and the most genomically under-sampled. ~25% of all described animal species, ~40% of insects, with reference genomes for a vanishingly small fraction.
Beetles occupy nearly every terrestrial and freshwater habitat on the planet. They include critical pollinators (scarabs, longhorns), decomposers (burying beetles, dung beetles), agricultural pests (Colorado potato beetle, boll weevil, bark beetles), and forest ecosystem engineers (bark beetles that drive succession in conifer forests). Yet genomic resources for Coleoptera have lagged far behind Diptera and Lepidoptera. For years Tribolium castaneum (the red flour beetle) was essentially the only high-quality beetle reference genome.
That is changing. Long-read sequencing (PacBio HiFi, Oxford Nanopore) combined with Hi-C scaffolding has made chromosome-level beetle assemblies achievable for almost any species, not just model organisms. The bottleneck is no longer sequencing. It is choosing which beetles to prioritize and which comparative questions to answer first.
Why it matters: Pest management, conservation, and the genomic basis of biodiversity. Beetles span 325+ million years of evolution, a massive comparative framework.
Bark beetles, crop pests, and stored-product beetles cause billions of dollars in damage annually, and genomics enables targeted control strategies. Endangered beetle species need genomic baselines for population viability and genetic rescue. And the big fundamental question sits right on top of all of this: why are there so many beetle species, and what about their genomes made the radiation possible? The 1KITE consortium phylogenomic framework (Misof et al. 2014, Science) gives us the timescale; our job is to fill in the genomic content at scale.
Stevens elements, the chromosomal building blocks
Chromosomal architecture: Conserved linkage groups across 300+ million years of beetle evolution. Named for Nettie Stevens, who discovered sex chromosomes in Tenebrio molitor, a beetle.
Just as Drosophila has Muller elements A through F, beetles have their own set of conserved chromosomal elements. We call these Stevens elements, named after Nettie Stevens, who in 1905 made the first clear cytological-genetic case for sex chromosomes while studying the mealworm beetle Tenebrio molitor. (In 1891 Henking had observed an unpaired "X element" during spermatogenesis in the firebug Pyrrhocoris apterus, without recognizing its role in sex determination.) Stevens elements represent ancestral linkage groups conserved across beetle evolution despite hundreds of millions of years of divergence. Even when chromosome numbers change dramatically through fusions and fissions, the genes of each element tend to stay together as syntenic blocks.
The Tribolium castaneum genome (2n = 20, or 10 chromosome pairs) provided the initial framework for defining these elements. Note that 2n = 20 is a reduced karyotype for beetles. The inferred ancestral coleopteran has 2n ≈ 22, so Tribolium reflects fusion/fission history, not an ancestral baseline.
What the elements are used for: Tracing rearrangements, sex-chromosome homology, and ancestral karyotypes. The same genes across very different chromosome numbers.
Stevens elements let us do four things that are otherwise hard. We can trace the history of chromosome rearrangements, asking which fusions and fissions occurred in a lineage. We can determine sex chromosome homology, identifying which element became the X or Y in different species. We can ask whether the same genes are dosage-compensated when different elements become sex-linked. And we can reconstruct ancestral karyotypes, building up a picture of what the beetle ancestor's chromosomes looked like. Stevens herself published the foundational work in the Carnegie Institution of Washington's 1905 monograph Studies in spermatogenesis with especial reference to the "accessory chromosome" (Publication No. 36).
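As a rough illustration of how element assignments can expose fusions and fissions, the sketch below assumes two hypothetical inputs: a map from ortholog IDs to element labels (defined on a reference such as Tribolium) and a map from the same orthologs to chromosomes in a target species. The gene IDs, the element labels "A" to "C", the function names, and the 20% minimum-share threshold are all illustrative assumptions, not the lab's actual pipeline.

```python
from collections import Counter, defaultdict

def element_composition(gene_to_element, gene_to_chrom):
    """Count, per target chromosome, how many orthologs come from each element."""
    comp = defaultdict(Counter)
    for gene, element in gene_to_element.items():
        chrom = gene_to_chrom.get(gene)
        if chrom is not None:  # skip orthologs missing from the target assembly
            comp[chrom][element] += 1
    return comp

def call_rearrangements(comp, min_share=0.2):
    """Flag candidate fusions and fissions (min_share is an assumed threshold).

    A chromosome drawing at least min_share of its orthologs from two or more
    elements is a candidate fusion. An element that reaches min_share on two
    or more chromosomes is a candidate fission.
    """
    major = {}
    for chrom, counts in comp.items():
        total = sum(counts.values())
        major[chrom] = sorted(e for e, n in counts.items() if n / total >= min_share)
    fusions = {c: els for c, els in major.items() if len(els) > 1}
    chroms_by_element = defaultdict(list)
    for chrom, els in major.items():
        for element in els:
            chroms_by_element[element].append(chrom)
    fissions = {e: sorted(cs) for e, cs in chroms_by_element.items() if len(cs) > 1}
    return fusions, fissions

# Toy data: elements A and B share chr1; element C is split across chr2 and chr3.
gene_to_element = {"g1": "A", "g2": "A", "g3": "B", "g4": "B",
                   "g5": "C", "g6": "C", "g7": "C", "g8": "C"}
gene_to_chrom = {"g1": "chr1", "g2": "chr1", "g3": "chr1", "g4": "chr1",
                 "g5": "chr2", "g6": "chr2", "g7": "chr3", "g8": "chr3"}

fusions, fissions = call_rearrangements(element_composition(gene_to_element, gene_to_chrom))
print("candidate fusions:", fusions)
print("candidate fissions:", fissions)
```

Calls like these only suggest candidates. Telling a true fission from a translocation, or polarizing an event as a fusion rather than a fission, still needs an outgroup and a phylogeny.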
Genome assemblies from the lab
Lab assemblies: Chromosome-level genomes chosen to fill phylogenetic and biological gaps. PacBio HiFi + Hi-C scaffolding + RNA-seq, validated against karyotype records.
The lab generates chromosome-level beetle assemblies with PacBio HiFi sequencing (>99.9% accuracy, 15 to 20 kb reads), Hi-C scaffolding to order and orient contigs into chromosome-level scaffolds, and RNA-seq for annotation. We evaluate assemblies with BUSCO completeness (typically >95%) and cross-check them against karyotypes from our databases of 14,000+ beetle records. BUSCO measures gene-content completeness, not contiguity, so whenever available we also report N50, the number of scaffolds anchored to chromosomes, and concordance with cytogenetic karyotypes, in line with Vertebrate Genomes Project and Earth BioGenome Project standards.
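For readers unfamiliar with the contiguity metrics, here is a minimal sketch of N50 and of the fraction of assembly length anchored to chromosome-scale scaffolds (a length-based relative of the scaffold count mentioned above). The function names and toy scaffold lengths are hypothetical; real values would come from an assembly's FASTA index.

```python
def n50(lengths):
    """Return N50: the largest length L such that sequences of length >= L
    together cover at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no sequence lengths given")

def anchored_fraction(length_by_scaffold, chromosome_scaffolds):
    """Fraction of total assembly length placed on chromosome-scale scaffolds."""
    total = sum(length_by_scaffold.values())
    anchored = sum(length_by_scaffold[name] for name in chromosome_scaffolds)
    return anchored / total

# Toy assembly (lengths in Mb): two chromosome-scale scaffolds plus leftovers.
toy = {"chr1": 50, "chr2": 40, "unplaced1": 6, "unplaced2": 4}
print("N50:", n50(toy.values()))
print("anchored fraction:", anchored_fraction(toy, ["chr1", "chr2"]))
```

A high N50 with a low anchored fraction usually means long contigs that Hi-C could not place, which is one reason to report both.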
Notable assemblies:
- Chrysina gloriosa (Scarabaeidae, Rutelinae), the glorious scarab. Iridescent green and gold structural coloration from chiral nanostructures in the exoskeleton that selectively reflect circularly polarized light. Sylvester et al. 2024, G3 14(6)
- Cheirotonus formosanus, an endemic, endangered Taiwanese long-armed scarab. Chromosome-level assembly with discovery of a putative Y-linked scaffold and demographic history reconstruction. Chien et al. 2026, Ecology and Evolution
- Dynastes grantii, the western Hercules beetle. One of the largest beetles in North America, with extreme sexual dimorphism and horn elaboration.
- Dendroctonus frontalis, the southern pine beetle. A forest pest whose genome assembly revealed the origins of gene content reduction in Dendroctonus. Copeland et al. 2024, Royal Society Open Science
Why these species: Each assembly answers a question the model organisms cannot. Structural coloration, sexual selection, pest genomics, conservation baselines.
Our genome choices are strategic, not opportunistic. Chrysina gives us a high-quality scarab reference and a system for studying the genetics of structural coloration. Dendroctonus frontalis lets us ask why the genus has reduced gene content and what that means for pest biology under climate change (Bracewell et al. 2018, Molecular Ecology 27: 677 to 691; see also Keeling et al. 2013, Genome Biology 14: R27, for D. ponderosae). Dynastes grantii supports work on sexual selection, horn development, and genome size. The Cheirotonus formosanus assembly is a flagship conservation genomics project. Together these fill phylogenetic gaps in Scarabaeidae, Curculionidae, and beyond.
Dosage compensation in Tribolium and beyond
Gene regulation: Most beetles have XY; some have Xyp, neo-XY, or X0. Coleoptera has more sex chromosome system diversity than almost any other order.
Dosage compensation addresses a fundamental problem: males have one X chromosome while females have two. This means X-linked genes are at half dosage in the heterogametic sex, potentially halving the expression of hundreds of genes. Different clades solved this problem differently. Drosophila uses the MSL complex to up-regulate the single male X to match two-X female expression. Mammals silence one female X (the Barr body) so both sexes express from a single X. Caenorhabditis elegans XX hermaphrodites halve expression from both Xs.
Tribolium castaneum shows incomplete dosage compensation: some X-linked genes are compensated (expression ratio near 1:1 between sexes), while others are not (ratio near 0.5 in males). Whether this pattern is ancestral for beetles or derived in Tribolium is an open question. Comparative dosage compensation data across coleopteran lineages are still sparse, which is why the lab's comparative assemblies matter here.
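The gene-by-gene distinction described here can be sketched as a simple classification on the log2 male:female expression ratio, where a value near 0 corresponds to the 1:1 case and a value near -1 to the 0.5 case. The gene IDs, the pseudocount, and the -0.5 cutoff (the midpoint between the two expectations) are illustrative assumptions, and the inputs are assumed to be already-normalized expression values such as TPM.

```python
import math

def classify_compensation(male_expr, female_expr, pseudocount=1.0, cutoff=-0.5):
    """Label each X-linked gene 'compensated' or 'uncompensated'.

    log2(M/F) near 0 means equal expression between the sexes; near -1 means
    the single male X yields about half the female level. The cutoff is an
    assumed midpoint, not an established standard.
    """
    calls = {}
    for gene in sorted(male_expr.keys() & female_expr.keys()):
        ratio = (male_expr[gene] + pseudocount) / (female_expr[gene] + pseudocount)
        log_ratio = math.log2(ratio)
        calls[gene] = "compensated" if log_ratio > cutoff else "uncompensated"
    return calls

# Hypothetical X-linked genes with normalized expression in each sex.
male = {"geneX1": 98.0, "geneX2": 50.0}
female = {"geneX1": 100.0, "geneX2": 100.0}
print(classify_compensation(male, female))
```

A real analysis would also compare X-linked ratios against the autosomal distribution from the same samples, since sex-biased expression on autosomes shifts what counts as "near 1:1".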
Methods: Coverage-based sex chromosome identification without a genetic map. Male versus female read depth, done right, tells you which scaffolds are sex-linked.
We investigate dosage compensation patterns across beetle species using RNA-seq and coverage-based sex chromosome identification. By comparing male and female read depth across the genome, we can identify sex-linked scaffolds without any prior genetic map. Once each sample is normalized for sequencing depth, autosomal scaffolds show F:M coverage near 1:1, X-linked scaffolds show F:M near 2:1, and Y-linked scaffolds show female coverage near zero. Paired with RNA-seq, we then ask, gene by gene, whether male X expression is compensated.
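The depth-ratio logic can be sketched as below. The sketch assumes male heterogamety (XY or X0) and assumes that each scaffold's mean depth has already been divided by that sample's median autosomal depth, so an autosome sits near 1.0 in both sexes. The function name, the tolerance, and the Y threshold are hypothetical choices for illustration.

```python
import math

def classify_scaffold(female_depth, male_depth, tolerance=0.3, y_female_max=0.1):
    """Classify one scaffold from normalized female and male read depth.

    Expected log2(F/M): about 0 for autosomes, about 1 for X-linked scaffolds.
    Y-linked scaffolds have female depth near zero relative to male depth.
    """
    if male_depth <= 0:
        return "unassigned"  # no male coverage: cannot be placed by this logic
    if female_depth <= y_female_max * male_depth:
        return "Y-linked"
    log_ratio = math.log2(female_depth / male_depth)
    if abs(log_ratio) <= tolerance:
        return "autosomal"
    if abs(log_ratio - 1.0) <= tolerance:
        return "X-linked"
    return "unassigned"

# Hypothetical scaffolds: (normalized female depth, normalized male depth).
depths = {
    "scaffold_1": (1.02, 0.98),
    "scaffold_2": (0.99, 0.51),
    "scaffold_3": (0.01, 0.48),
    "scaffold_4": (1.00, 0.70),
}
for name, (f_depth, m_depth) in depths.items():
    print(name, classify_scaffold(f_depth, m_depth))
```

Scaffolds that fall between the expected ratios are left unassigned on purpose. They are often pseudoautosomal regions, collapsed repeats, or misassemblies, and are better inspected by hand than forced into a class.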
Population genetics and Ne variation
Population genetics: Beetle Ne spans five orders of magnitude. From 10^2 in isolated flightless endemics to 10^7+ in continent-wide pests.
Effective population size (Ne) determines the relative power of genetic drift versus natural selection, the rate of adaptive evolution, the level of standing genetic variation, and the probability of fixing new mutations. Beetles span an enormous range of population sizes: widespread agricultural pests with Ne in the millions (Colorado potato beetle, bark beetles during outbreaks) at one end; narrowly endemic montane species with Ne in the hundreds (flightless ground beetles on isolated sky islands) at the other.
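To make the drift-versus-selection point concrete, the sketch below evaluates Kimura's diffusion approximation for the fixation probability of a new mutation under additive (genic) selection, u = (1 - e^(-4·Ne·s·p0)) / (1 - e^(-4·Ne·s)) with starting frequency p0 = 1/(2N). It assumes census size equals Ne unless stated otherwise, and the selection coefficient and population sizes are illustrative values, not estimates for any beetle.

```python
import math

def fixation_probability(s, ne, n_census=None):
    """Kimura's approximation for a single new mutation with additive effect s.

    Assumes a diploid population and an initial frequency of 1/(2N), with
    N taken equal to Ne unless n_census is given.
    """
    n = ne if n_census is None else n_census
    p0 = 1.0 / (2.0 * n)
    x = 4.0 * ne * s
    if abs(x) < 1e-12:
        return p0      # effectively neutral: fixation probability is 1/(2N)
    if x < -700.0:
        return 0.0     # strongly deleterious: avoid overflow, probability ~ 0
    return math.expm1(-x * p0) / math.expm1(-x)

s = -1e-4  # a mildly deleterious mutation (illustrative)
for ne in (1e2, 1e4, 1e7):
    u = fixation_probability(s, ne)
    neutral = 1.0 / (2.0 * ne)
    print(f"Ne={ne:.0e}  u/neutral={u / neutral:.3g}")
```

With Ne in the hundreds the mildly deleterious mutation fixes almost as often as a neutral one, while at Ne in the millions its fixation probability is effectively zero. The underdominant case relevant to chromosomal rearrangements needs a different formula, but the qualitative dependence on Ne is similar.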
Flightless beetles tend to have smaller effective population sizes, stronger genetic structure (higher Fst), higher rates of allopatric speciation, and potentially faster chromosome evolution. This last prediction connects directly to the drift barrier hypothesis for karyotype change. Our recent work in mammals (Jonika et al. 2024, Journal of Heredity) shows that range size, a proxy for effective population size, predicts chromosome number evolution rates in Carnivora, with smaller-range species showing faster karyotype turnover. The mechanistic implication is that drift, not selection, dominates fixation of mildly underdominant chromosomal rearrangements. We are now testing the same prediction in beetles using our karyotype databases.
Approaches: π, Fst, PSMC, SMC++, admixture. The modern pop-gen toolkit, applied at beetle scale.
Modern beetle population genomics uses several complementary approaches: nucleotide diversity (π, which under neutrality estimates 4Neμ in diploids), Fst between populations to measure genetic differentiation, PSMC and SMC++ to reconstruct Ne through time (PSMC from a single diploid genome, SMC++ from multiple unphased genomes), and admixture analysis to reveal hybridization and introgression. The same approaches, applied consistently across our lab-assembled genomes and published reference genomes, let us compare demographic histories across suborders and biological guilds.
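Two of these summary statistics are simple enough to sketch directly. The code below computes π as mean pairwise differences per site, converts it to a rough Ne using the diploid neutral expectation π = 4Neμ, and computes an Fst of the form 1 - Hw/Hb (within- versus between-population pairwise diversity, in the style of Hudson and colleagues). The toy sequences are invented, and the mutation rate is a placeholder rather than a measured beetle rate.

```python
from itertools import combinations

def _diffs(x, y):
    """Number of mismatching sites between two aligned sequences."""
    return sum(a != b for a, b in zip(x, y))

def nucleotide_diversity(seqs):
    """Mean pairwise differences per site among aligned, equal-length sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    return sum(_diffs(x, y) for x, y in pairs) / (len(pairs) * length)

def ne_from_pi(pi, mu):
    """Solve pi = 4 * Ne * mu for Ne (diploid, neutral, equilibrium assumptions)."""
    return pi / (4.0 * mu)

def fst_within_between(pop1, pop2):
    """Fst as 1 - Hw/Hb from within- and between-population pairwise diversity."""
    within = (nucleotide_diversity(pop1) + nucleotide_diversity(pop2)) / 2.0
    length = len(pop1[0])
    between = sum(_diffs(x, y) for x in pop1 for y in pop2) / (len(pop1) * len(pop2) * length)
    return 1.0 - within / between

pop_a = ["AAAA", "AAAT"]   # toy haplotypes, population A
pop_b = ["TTAA", "TTAT"]   # toy haplotypes, population B
MU_PLACEHOLDER = 3e-9      # assumed per-site, per-generation rate (not measured)

print("pi (A):", nucleotide_diversity(pop_a))
print("Fst (A vs B):", round(fst_within_between(pop_a, pop_b), 3))
print("Ne from pi = 0.012:", round(ne_from_pi(0.012, MU_PLACEHOLDER)))
```

Genome-scale work would use a dedicated tool that accounts for missing data and invariant sites, since naive π from variant-only files is biased. The point here is only the relationship between the quantities.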
Beetle subordinal diversity
Systematics: Four suborders, one of which dominates. Adephaga, Archostemata, Myxophaga, Polyphaga.
Coleoptera is divided into four suborders, each with a distinctive body plan and ecology:
- Adephaga: ground beetles, tiger beetles, diving beetles (~45,000 species). Predatory, with strong mandibles.
- Archostemata: reticulated beetles (~40 species). Ancient relicts, often treated as one of the earliest-diverging living beetle lineages.
- Myxophaga: tiny aquatic beetles (~100 species). Algae feeders in water films.
- Polyphaga: the vast majority (~350,000 species). Includes weevils, scarabs, longhorns, ladybugs, fireflies, and most beetles encountered in the field.
Genomic phylogenetics has resolved several long-standing debates: the placement of weevils (Curculionoidea) deep within Polyphaga; the rapid radiation of phytophagous beetles coinciding with angiosperm diversification; a Carboniferous origin approximately 325 million years ago, with beetles surviving the end-Permian mass extinction; and most modern beetle families diversifying during the Cretaceous terrestrial revolution. Foundational references include McKenna et al. 2019, PNAS 116(49): 24729 to 24737 (4,818 genes from transcriptomes of 146 species representing all major beetle lineages); Zhang et al. 2018, Systematic Entomology 43(1): 34 to 53; and the Handbook of Zoology, Coleoptera, Beetles (Beutel & Leschen, eds., De Gruyter 2016).
Comparative genomics: Genome size, gene families, synteny, repeats. 13-fold genome size variation, dynamic gene families, conserved synteny blocks.
Beetle genome sizes span roughly 150 Mb to over 2 Gb, a greater than 13-fold variation driven by transposable element dynamics, DNA deletion rates, and (rarely) polyploidy in some weevil lineages. Specific gene families have expanded or contracted relative to other insects: olfactory receptors in bark beetles (host and pheromone detection), gustatory receptors in phytophagous species (host-plant chemistry), and cytochrome P450s in species feeding on chemically defended plants. Despite chromosome numbers ranging from 2n = 4 in some bark beetles to 2n = 72+ in some longhorns, many synteny blocks are conserved across hundreds of millions of years, consistent with the Stevens element framework. See Herndon et al. 2020, BMC Genomics 21: 47, for the enhanced Tribolium assembly and gene set.
If one could conclude as to the nature of the Creator from a study of creation, it would appear that God has an inordinate fondness for beetles. (attributed, apocryphally, to J.B.S. Haldane)
Future directions
Looking ahead: Pan-genomes, population resequencing, CRISPR, conservation, microbiomes. Earth BioGenome Project and i5K are accelerating the raw data. The analytic gap is where we sit.
Large-scale initiatives are rapidly expanding beetle genomic resources. The Earth BioGenome Project (Lewin et al. 2018, PNAS 115(17): 4325 to 4333), i5K (an initiative to sequence 5,000 arthropod genomes), and national genome projects around the world are making beetle genomes available at an accelerating pace. Coming frontiers include pan-genomes that capture structural variation within species, population-level resequencing of hundreds of individuals across hundreds of species, CRISPR-based functional genomics in non-model beetles, conservation genomics for endangered species, and metagenomics of beetle-microbiome interactions (especially the obligate symbioses in bark beetles and grain beetles).
The big question: Why are there so many beetle species? Adaptive radiation, key innovations, geographic opportunity, or the genomes themselves.
Is it adaptive radiation into new ecological niches? Key innovations (phytophagy, complete metamorphosis, elytra protecting the flight wings)? Geographic opportunity? Or is it something about the genomes themselves, such as high rates of chromosome rearrangement that promote speciation or gene family expansions that enable rapid adaptation? Comparative genomics across the order may help explain the "inordinate fondness for beetles" of Haldane's quip. Several factors almost certainly contribute, and teasing them apart needs the kind of integrative work we do: karyotype databases, genome assemblies, population genomics, and phylogenetic comparative methods.