AI in service of the biology.
Most of evolutionary biology is a data problem. Five projects in the lab use AI agents to read the literature, pull structured data out of it, and help with the comparative work that follows. Two of those agents run autonomously, always on. One is an experiment in handing a full research problem to an agent and letting it run. We expect it to fail in interesting ways, and that is the point.
AI handles search, extraction, and verification at scale. Researchers supply the questions, the skepticism, and the judgment.
Projects
Five projects are active. Two are always-on autonomous agents; one is an experiment in handing an agent a full research problem; one is a teaching tool built almost entirely by AI-assisted coding.
TraitTrawler: a literature-mining agent pointed at any trait, any clade. Searches four sources, cascades through a 12-source PDF retriever, and requires double-entry verification before a row is written.
A general-purpose pipeline for building trait datasets from the primary literature. It starts from a keyword search across PubMed, OpenAlex, bioRxiv, and Crossref, retrieves full-text PDFs, and enforces a grounding invariant: every extracted claim must appear verbatim in the cited page of a SHA256-hashed source PDF before the row is written. TraitTrawler generalizes from version one, which was built only to collect karyotype data, to any trait at any phylogenetic scale.
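The grounding invariant reduces to a small check. A minimal sketch, assuming the cited page's text has already been extracted from the PDF; the function name and the provenance record format are illustrative, not TraitTrawler's actual code:

```python
import hashlib

def grounding_check(claim: str, page_text: str, pdf_bytes: bytes) -> dict:
    """Refuse to emit a row unless the claim appears verbatim on the cited page.

    Returns a provenance record to store alongside the row; raises otherwise.
    (Illustrative sketch, not the pipeline's real implementation.)
    """
    if claim not in page_text:
        raise ValueError("claim not found verbatim on cited page; row rejected")
    return {
        "claim": claim,
        # SHA-256 of the source file ties the row back to the exact PDF used.
        "source_sha256": hashlib.sha256(pdf_bytes).hexdigest(),
    }
```

A row that fails the verbatim check is never written; the hash lets anyone later confirm which exact file the surviving row was grounded in.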
AI as lead investigator: can an agent carry a research problem start to finish? Grown out of TraitTrawler, the same scaffolding pointed at the whole research arc.
TraitTrawler proved an agent can extract structured trait data from the literature at scale. The natural next question is whether an agent can reason about that data. We are building an experiment to find out, designing scaffolding that lets a Claude-powered agent carry a research problem from hypotheses through comparative analysis to a manuscript draft. We are still stress-testing each step, and every breakdown teaches us where the next version needs work. The goal is not a robot PI. It is a collaborator that can do the programming and literature triage that would otherwise take a lab of students and postdocs.
Agent-readable lab: everything we publish is meant to be readable by other researchers' agents. llms.txt, llms-full.txt, JSON exports for every database, JSON-LD on every page, an open data/ directory.
If AI tools can read our work, they can use it to help other researchers. So we publish everything in formats agents can consume: llms.txt and a longer llms-full.txt as a single-file snapshot of the whole site, JSON exports of every database (CUREs, karyotypes, tau, news, publications), structured JSON-LD on every page, and an open data/ directory. Open formats are a form of contribution, and they should be standard practice for any field that wants its work to still be useful in ten years.
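As a concrete illustration of what structured JSON-LD on a database page looks like, here is a minimal schema.org Dataset record of the kind a crawler or agent can parse without scraping HTML. The name, URL, and field values below are placeholders, not the lab's real metadata:

```python
import json

# Hypothetical schema.org Dataset markup for one database page.
# All values are placeholders for illustration only.
dataset_jsonld = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Karyotype database",
    "distribution": {
        "@type": "DataDownload",
        "encodingFormat": "application/json",
        "contentUrl": "https://example.org/data/karyotypes.json",  # placeholder URL
    },
}

# Serialized, this is what gets embedded in a <script type="application/ld+json"> tag.
print(json.dumps(dataset_jsonld, indent=2))
```

An agent that understands schema.org can discover the machine-readable export from the page itself, without any site-specific scraping logic.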
Population genetics simulator: a Wright-Fisher teaching tool built through AI-assisted coding. Drift, selection, mutation, migration, bottlenecks. A test of what a coding agent can produce when the task is carefully scoped.
An interactive Wright-Fisher simulator that supports drift, selection, mutation, migration, and bottlenecks. We built it almost entirely through AI-assisted coding, both as a teaching tool for our classes and as a test of what a coding agent can produce when the task is carefully scoped and each output is reviewed.
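The core update such a simulator implements is short. A minimal single-locus, two-allele sketch covering drift, selection, and mutation (migration and bottlenecks are omitted here); the function name and parameters are illustrative, not the simulator's actual code:

```python
import random

def wright_fisher(pop_size, p0, generations, s=0.0, mu=0.0, seed=None):
    """Return the allele-frequency trajectory of a Wright-Fisher population.

    Minimal teaching sketch: selection coefficient s on allele A,
    symmetric mutation rate mu, genetic drift from finite sampling.
    """
    rng = random.Random(seed)
    p = p0
    traj = [p]
    for _ in range(generations):
        # Selection: weight allele A by fitness 1 + s.
        w = p * (1 + s) / (p * (1 + s) + (1 - p))
        # Mutation: symmetric A <-> a at rate mu.
        w = w * (1 - mu) + (1 - w) * mu
        # Drift: resample 2N gametes binomially from the expected frequency.
        count = sum(1 for _ in range(2 * pop_size) if rng.random() < w)
        p = count / (2 * pop_size)
        traj.append(p)
    return traj
```

Each generation applies selection and mutation deterministically to the expected frequency, then resamples 2N gametes; the binomial resampling step is where drift comes from, and shrinking `pop_size` mid-run is how a bottleneck would be modeled.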
Teaching
Courses and curricula at Texas A&M that give biology students practical fluency with AI tools, and the skepticism to evaluate the outputs.
Biology & AI CURE: a course where every student runs an original evolutionary biology project. Course-based undergraduate research built around phylogenetic comparative methods and AI-assisted analysis.
A course-based undergraduate research experience where each student picks a clade, extracts data with agents, runs comparative analyses, and writes up the results. The spring 2026 cohort is in progress and several of their projects are tracking toward publication.
AI in Biology concentration: a formal 10-credit concentration in the TAMU Biology BS and PhD programs. Skills that transfer across tools, not just familiarity with whatever is current.
A concentration built with Texas A&M Biology to give students a formal credential alongside their degree. Required courses cover AI fundamentals, computational biology, data literacy, and critical evaluation of model outputs. Designed so the curriculum ages well as the tools change.
AI tools and prompting guides: practical guides for biologists. Literature review, data analysis, coding assistance, writing workflows.
Practical guides to AI tools and prompting techniques written for biologists. They cover literature review, data analysis, coding assistance, and writing workflows. Everything we use internally to train students is online.
Principles
A few commitments we hold ourselves to. Some are guardrails, some are the reason we do this work in the first place.
01 Validate before trusting. Every AI output is verified before it enters a publication, dataset, or decision.
Every AI output is verified before it enters a publication, dataset, or decision. The form of verification (computational check, statistical test, expert review) depends on the task, but some form of verification is always required.
02 AI amplifies effort; researchers supply judgment. Search, extraction, reformatting, summarizing, consistency checking: good uses. Deciding what to conclude stays with the researcher.
Search, extraction, reformatting, summarizing, and consistency checking are good uses of AI. Deciding what question is worth asking, whether a result makes biological sense, and what to conclude: that stays with the researcher.
03 Document how the result was produced. Prompts, model versions, and pipeline configurations belong in the methods section.
Reproducible science requires knowing not just what AI generated, but what it was asked, with what model, and under what constraints. Prompts, model versions, and pipeline configurations are part of the methods section.
04 Characterize failure modes, not just capabilities. Testing the limits of a tool is as valuable as deploying it successfully.
We care as much about documenting where AI reasoning breaks down as about demonstrating what it can do. Systematically testing the limits of a tool is as scientifically valuable as deploying it successfully. The lead-investigator project is designed around this idea.
05 Make domain knowledge explicit. A general-purpose model on a specialized problem gives generic results.
A general-purpose AI applied to a specialized scientific problem gives you generic results. Getting something useful means putting the expert knowledge into your prompts, constraints, and validation rules, not assuming the model already has it.
06 Build for the commons. Tools, data, and ideas leave this lab in formats other people can pick up and run with.
The point of science is to add to what we collectively understand. Tools, data, and ideas leave this lab in formats other people (and the AI tools they use) can pick up and run with. We are not the last people who will work on these questions, and we want whoever comes next to be better equipped than we were.
We think of this as a release, not a revolution. An incremental but real change in how science gets done, rolled out carefully: agent-readable data, validated pipelines, honest documentation of what broke, and students who know how to use the tools.