Today we’re releasing Zoogle — a new web portal that leverages billions of years of evolutionary data to help predict which organisms are the most relevant models for different aspects of human biology. Although we’ve been building this technical framework for a while now, we finally put the parts together so that anyone can more easily tap into our workflows with the click of a button.
Zoogle is our first stab at a more user-friendly, data-driven way to search for organismal models. Use Zoogle to see how 60 of the more developed research organisms (all with some genetic tools) are ranked by our multi-dimensional analyses as models for Your Favorite Gene in humans. And vice versa — which human genes are predicted to be most strategically served by Your Favorite Organism.
You don’t have to study something expensive and low-throughput to approximate human disease in the lab. There are simpler models, and they’re not just simpler — they may sometimes actually be better.
We expand on this and why we think this approach matters below. But feel free to skip ahead and come back to that later. Start Zoogling here.
The success of biomedical efforts often depends on selecting the right research model organisms. You need model systems to investigate complex biology in vivo without causing harm to human subjects.
When this goes well, the upside is huge. Models can lead to awe-inspiring, serendipitous discoveries. They’re responsible for our basic understanding of many aspects of human biology — cell cycle, gene regulation, secretion pathways, developmental programs, brain circuitry, tissue regeneration, the cytoskeleton, and much, much more. Without these fundamentals in place, we wouldn’t have cures for nearly as many diseases as we have today.
Getting models wrong can also have big effects in the opposite direction. About 90% of drugs that enter clinical trials ultimately fail. The research informing trials leans heavily on non-human models, which don’t always deliver on predicting safety and efficacy. Why? Human disease biology is complex, and regulatory hurdles make it hard to innovate beyond best practices.
Relying on traditionally accepted organisms like mice can even, in some cases, be clinically dangerous due to key species differences. Patients are subject to disastrous outcomes, such as catastrophic systemic organ failure or irreversible brain damage. Yet we soldier on to the drumbeat of an estimated 50 billion dollars spent per year in the U.S. alone. We really need to figure out how to do better.
The fact that we’ve made as much progress as we have based on early intuitive choices around organisms is a pretty compelling testament to how much this approach has to offer. Comparative biology is a superpower. Full stop.
But even superpowers can be improved. To do this, it’s first helpful to think about how we ended up here. Historically, model selection has been a largely intuitive practice. This makes sense, given that most modern biomedical research organisms were initially chosen and developed in the first half of the 20th century. Tools, funding trends, and community efforts helped some of these become even more prevalent, creating a positive feedback loop that reinforced “supermodels” like mice and flies.
Our intuition has also led us to a widely held belief that there’s an intrinsic trade-off between better models and experimental throughput. We tend to assume that species that look or behave more like us are better models for human biology. Simple sequence-based comparisons reinforce this bias, since more recently diverging organisms naturally have a higher baseline sequence similarity to us. If we don’t account for that baseline, the data simply mirrors what we intuitively expect.
Because of this, we tell ourselves we have to compromise by working with “lesser than” models to make faster and cheaper headway in the lab. Rather than work on a monkey, we work on mice, flies, or yeast to speed up understanding. It feels too practically difficult (and ethically murky) to do otherwise.
As a result, we as a scientific community have specialized in a relatively small number of organisms. Over 90% of research funded by the NIH can be claimed by one of six canonical research organisms (calculated based on these informative posts), and those six species represent less than 0.0001% of what biology has to offer. We can do better, but part of why we haven’t is that we haven’t articulated to ourselves a sufficiently strong, data-driven argument for how else to choose the organisms we work on.
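For a sense of scale, here's the rough arithmetic behind that "less than 0.0001%" figure. The species estimate below is ours (a commonly cited figure of roughly 8.7 million eukaryotic species), not a number from the posts linked above, and the conclusion doesn't change much with a different denominator.

```python
# Back-of-the-envelope: what fraction of biodiversity do six canonical
# research organisms represent? The ~8.7 million figure is a commonly cited
# estimate of eukaryotic species; swap in your preferred estimate.
canonical_models = 6
estimated_eukaryotic_species = 8.7e6

fraction = canonical_models / estimated_eukaryotic_species
print(f"{fraction:.5%} of species")  # ~0.00007%, i.e. less than 0.0001%
```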
We now have more data, understanding, and tools to conduct the kinds of unbiased, comparative analyses that move us beyond a narrow canon. It’s long overdue for us to take a bird’s-eye view of the tree of life in the lab.
The best research models aren't just easy to work with — they have proteins that function like ours do. While our closest evolutionary relatives naturally share more of our DNA, we've found some surprising functional similarities in unexpected places. Our analytical methods, detailed here and here, look at both gene-level evolutionary distance and predicted protein structure to spot these hidden similarities.
A key concept that leaves room for improvement is that most analyses ignore the fact that an organism's individual genes evolve under different forces and at different rates. You therefore lose a lot of information by aggregating and comparing whole genomes. By zooming in on individual genes instead, we're finding research models that mimic human biology in ways traditional approaches might miss.
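To make that concrete, here's a minimal sketch of the kind of per-gene comparison we mean. It is not Zoogle's actual pipeline (see the pubs linked above for that); the organisms, numbers, metric names, and weighting below are hypothetical and only meant to show the shape of the calculation: compare gene by gene, and correct for how diverged each organism's genome already is overall.

```python
# Illustrative sketch (not Zoogle's real pipeline): score each organism as a
# model for one human gene, rather than averaging over the whole genome.
# All values and weights below are made up for illustration.

from dataclasses import dataclass

@dataclass
class OrthologComparison:
    organism: str
    percent_identity: float          # identity of this gene's ortholog to the human gene
    genome_baseline_identity: float  # typical identity of this organism's proteins to human proteins
    structural_similarity: float     # predicted-structure similarity (0-1), e.g. a TM-score-like metric

def gene_level_score(c: OrthologComparison) -> float:
    """Reward genes that are more conserved than the organism's genome-wide baseline,
    and weight in structural similarity, so slowly diverging lineages don't win by default."""
    excess_conservation = c.percent_identity - c.genome_baseline_identity
    return 0.5 * excess_conservation + 0.5 * c.structural_similarity * 100

candidates = [
    OrthologComparison("mouse", percent_identity=88, genome_baseline_identity=85, structural_similarity=0.92),
    OrthologComparison("zebrafish", percent_identity=80, genome_baseline_identity=70, structural_similarity=0.95),
    OrthologComparison("fruit fly", percent_identity=55, genome_baseline_identity=45, structural_similarity=0.90),
]

for c in sorted(candidates, key=gene_level_score, reverse=True):
    print(f"{c.organism}: {gene_level_score(c):.1f}")
```

In this toy example, the higher-throughput organisms edge out the mouse for this particular (made-up) gene once the genome-wide baseline is accounted for, which is the kind of gene-by-gene result we're after.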
Our results have disabused us of the simple trade-off narrative (that cheaper/easier models are biologically less useful) — check out the example results below for DMD, the gene whose mutation causes Duchenne muscular dystrophy, to see what we mean.
As you can see, depending on the specific human gene under investigation, our data suggests that many experimentally simpler organisms are predicted to punch above their weight as biological models. Our analyses have surfaced many examples of human genes that are well-modeled, or even better-modeled, by species that are also much higher-throughput experimentally than standard animal models like mice. The results can be surprising enough to force you to reexamine your assumptions about the relevant biology, which can lead to different, and possibly better, hypotheses.
In many ways, what we’re saying is already intuitive if you work in drug development. “Mouse is a terrible model” is one of the worst-kept secrets in pharma. We all sense that there’s a better way. And Zoogle is our first attempt at a rational, data-driven path forward.
Ultimately, this all gets at the core motivation behind Arcadia. We want to leverage the full diversity of biology to solve problems. We're testing the hypothesis that more data-driven approaches to organismal selection will improve scientific and real-world outcomes while moving us away from organismal specialization.
Figuring out how useful Zoogle really is will take more than just us, because Zoogle predictions are merely that — predictions. Our guesses aren’t worth much without empirical testing, and we’re doing some of that testing in-house through our own research and startups.
But more users and testers are needed to make this maximally useful to everyone. This is why Zoogle, our lessons, and our workflows are all open. This is the practical value of our open science principles in action: utility, speed, and impact. We hope this portal will draw scientists and non-scientists alike to tell us about the work they'd like to do with Zoogle predictions. We're committed to troubleshooting with you, and we'll keep tinkering with our prediction algorithms and updating the platform as we learn more from our own work and from others.
We’re optimistic that being more open-minded and data-driven about organisms may lead us to the thing we're most unapologetically fanatic about — exploring more organisms. Ironically, embracing some agnosticism may be the key to fully unlocking our childlike obsession with the wonders of biology. We hope you’ll join us.
Dive into Zoogle and submit feedback through the forms at the bottom of the page, or comment directly on our pubs.