Why We're Building AI Tools for Microbiome Research
And why we think the next wave of scientific breakthroughs will come from better software, not just better experiments
If you work in microbiome research, you already know the data problem.
A typical study generates 16S rRNA amplicon or shotgun metagenomic sequences, taxonomic abundance tables, sample metadata, maybe metabolomics. To make sense of it, you cross-reference against taxonomy databases, disease association databases, metabolite databases, pathway databases, and a mountain of literature. Each of these lives in a different place, uses different identifiers, and was last updated at a different time.
The actual science — forming hypotheses, finding patterns, making discoveries — gets buried under data wrangling.
We started Graphomics because we lived this problem. Our team comes from computational biology and bioinformatics, and we kept rebuilding the same infrastructure: scripts to query databases, pipelines to process sequences, tools to visualize results. We figured if we were doing it, everyone else was too.
So we built three things:
MicroMap is a knowledge graph that connects 1.1 million microbial taxa to diseases, metabolites, pathways, and drugs. It integrates data from 12+ public databases (NCBI Taxonomy, Disbiome, BugSigDB, HMDB, KEGG, ChEMBL, CARD, and more) into a single queryable resource. Every association traces back to its source paper. It’s free to use via API.
Workbench is a visual pipeline builder for bioinformatics. Drag analysis steps onto a canvas, connect them, run them. Under the hood, each step runs as a containerized task orchestrated by Apache Airflow. We have 50+ pre-built nodes for everything from BIOM file processing to diversity analysis to publication-ready figures. No scripting required, but there’s a Jupyter escape hatch for when you need it.
Nexus is an AI research assistant grounded in structured scientific data. Ask it a question in natural language — “What gut bacteria are depleted in Parkinson’s disease patients?” — and it queries the knowledge graph, retrieves cited associations, and synthesizes an answer. Upload a BIOM file and it’ll compute diversity metrics, run statistical tests, and explain the results. It’s not a chatbot that guesses — it’s a research tool that shows its work.
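For readers who haven't computed a diversity metric by hand, here is a minimal sketch of two common alpha diversity measures: observed richness and the Shannon index. It is only an illustration of the math, not how Nexus or Workbench implement it. The function names and the read counts in the sample are hypothetical, made up for this example.

```python
import math


def observed_richness(counts):
    """Number of taxa with at least one read in the sample."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over taxa with nonzero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    # Taxa with zero reads contribute nothing to the sum, so skip them.
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical read counts for a single sample (illustrative numbers only).
sample = {
    "Bacteroides": 40,
    "Faecalibacterium": 40,
    "Prevotella": 20,
    "Akkermansia": 0,
}
counts = list(sample.values())

print("Observed richness:", observed_richness(counts))
print("Shannon index:", round(shannon_index(counts), 3))
```

A perfectly even community of N taxa gives the maximum Shannon value of ln(N), which is a handy sanity check when reading these numbers in a report.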
What this newsletter will be about:
This is where we’ll share what we’re learning — about building AI tools for science, about the microbiome field, about the engineering challenges of integrating biological data at scale. Some posts will be technical deep-dives. Some will be opinions about where the field is heading. Some will be short notes on interesting papers or tools we’ve come across.
If you’re a researcher, a bioinformatician, a biotech founder, or just someone interested in the intersection of AI and biology, this is for you.
Subscribe if that sounds interesting. And if you want to try the tools: graphomics.com.
— The Graphomics Team
