Data-Generating AI Platform Conducts Huge Number Of Microbial Experiments

June 7, 2023


By Deborah Borfitz 

June 5, 2023 | Bioengineers at the University of Michigan (U-M) have used robots and an artificial intelligence (AI) system to seriously scale up data collection on bacterial species known to affect human health. The system learns from scratch without the need for big databases of knowledge to train on. Dubbed BacterAI, the platform generates its own data and quickly gets smarter by playing a game that awards points for accurately predicting the outcome of an experiment, according to Paul Jensen, assistant professor of biomedical engineering.

An AI agent pumps out hundreds if not thousands of instructions for conducting microbial experiments every day, Jensen says, which get turned into commands followed by an interconnected system of robots. It proved impossible for humans in the lab to keep the robots running at that pace because they couldn’t think up enough experiments. “Most of the time the robots were waiting on the humans to tell them what to do, so... [we became] the bottleneck.”  

In a study newly published in Nature Microbiology (DOI: 10.1038/s41564-023-01376-0), the U-M team showed that BacterAI needed only nine days and 3,024 microbial experiments before its prediction accuracy exceeded 90%. The aim of the game, and the way the AI scores points, is to figure out the rules for feeding Streptococcus gordonii and Streptococcus sanguinis, two beneficial microbes found in the mouth, starting with no baseline information, says Jensen.

It was a big ask, given that there are more than a million possible combinations of the 20 amino acids needed to support life. That initial training enabled the system to reliably predict the results of all other untested combinations of amino acids over the next day or two. 

Using transfer learning based on the amino acid growth requirements of S. sanguinis, the researchers also demonstrated how BacterAI could go from media with 20 to 39 ingredients, a problem “half a million times larger,” in only four days, he says. “AI is very good at taking prior knowledge and building on it. We think that’s promising.” The challenge now is for scientists to start thinking about the next big questions they want the platform to answer. 
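Both figures are consistent with a simple count. The sketch below assumes that each ingredient is either present in or absent from a medium, so n ingredients give 2^n possible media. The article does not spell out how the combinations were counted, and the function name is illustrative only.

```python
def count_media(n_ingredients: int) -> int:
    """Distinct present/absent combinations of n ingredients.

    Assumption for illustration: a medium is fully described by which
    ingredients it contains, so there are 2**n possibilities.
    """
    return 2 ** n_ingredients


amino_acid_media = count_media(20)   # the 20 amino acids
expanded_media = count_media(39)     # the 39-ingredient media

print(f"{amino_acid_media:,}")                    # 1,048,576
print(f"{expanded_media // amino_acid_media:,}")  # 524,288
```

Under that assumption, 20 amino acids give just over a million media, and moving to 39 ingredients multiplies the search space by 2^19, or roughly half a million.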

As for Jensen and his team, which relocated to U-M from the University of Illinois Urbana-Champaign last August, they’re expanding from individual bacteria to groups of them. They are interested in learning how different inputs change the shape of a microbial community over time, as well as the effects of stressors such as antibiotics. 

Human Bottleneck 

“The big problem is that AI takes a lot of data,” says Jensen in explaining the value of conducting such a high volume of microbial experiments. Over the last decade, scientists have discovered thousands of previously unknown bacterial species, living in and outside the body, that affect human health, but AI can’t be used to predict anything about them because there is no data available to train the models. It is impractical to have microbiologists do all the work because the learning process would take centuries.

When his lab decided to develop computational models of bacteria to speed up the process, using robots to conduct experiments during the workday, it became immediately apparent that “to keep the robots running ... we had to get rid of the humans in the loop.” Three years ago, Jensen and his team developed BacterAI to tell those robots what to do next, based on an analysis of data from previous experiments. 

The agent uses the same AI technique that can teach a computer how to play chess by having it make moves randomly and, if it wins the game, repeat those plays more often in subsequent matches, explains Jensen. As two of these AI agents play against each other, they remember the good strategies and forget the bad ones and eventually get better at playing chess than any human. The key difference is that with BacterAI the agents are playing to learn something about microbes based on results of real-life laboratory experiments they conjure up for the robots. 

Other software generates individual instructions for the robots, such as where to place a pipette and what goes in an incubator, he adds. The whole operation takes 10 to 12 robots, depending on the game at hand. 

Each day, the AI system is asked to come up with a couple hundred experiments that it thinks will be informative, as measured by how surprising the result would be, says Jensen. It also must guess the result; that is, whether bacteria cultured in a certain medium are going to grow. Predictions and actual results are then compared, with correct guesses rewarded with points. “The AI’s goal is to try to win the game by maximizing the number of points, so it gets very good at predicting what the outcome is going to be because that’s the only way it can score.”
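The daily guess-and-score loop can be illustrated with a toy simulation. This is a sketch under stated assumptions, not the BacterAI implementation. The ingredient labels and the hidden growth rule are made up. A simple elimination rule stands in for the neural network. Experiments are drawn at random rather than chosen for how surprising their results would be.

```python
import itertools
import random

INGREDIENTS = ("A", "B", "C", "D", "E", "F")  # hypothetical ingredient labels
HIDDEN_REQUIRED = frozenset({"A", "D"})        # made-up ground truth for the demo


def run_experiment(medium):
    """Stand-in for the robots: growth iff every required ingredient is present."""
    return HIDDEN_REQUIRED <= medium


def all_media():
    """Every present/absent combination of the ingredients."""
    for r in range(len(INGREDIENTS) + 1):
        for combo in itertools.combinations(INGREDIENTS, r):
            yield frozenset(combo)


def play(days=4, batch_size=8, seed=0):
    """Play the prediction game; return daily points and the learned belief."""
    rng = random.Random(seed)
    untested = list(all_media())
    rng.shuffle(untested)
    believed_required = set(INGREDIENTS)  # start by assuming everything is needed
    daily_scores = []
    for _ in range(days):
        batch, untested = untested[:batch_size], untested[batch_size:]
        points = 0
        for medium in batch:
            guess = believed_required <= medium  # predict before seeing the result
            grew = run_experiment(medium)
            points += int(guess == grew)         # one point per correct prediction
            if grew:
                # Anything missing from a medium that grew cannot be required.
                believed_required &= medium
        daily_scores.append(points)
    return daily_scores, believed_required


scores, belief = play()
print("points per day:", scores)
print("believed required:", sorted(belief))
```

In this toy version the only way to score is to predict growth correctly, so the belief about which ingredients are required can only improve as results arrive.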

The agent has a neural network inside modeled after a human brain and, much like a competitive chess player who intuitively knows the next best move, eventually figures out how to predict whether bacteria are going to grow, he continues. Scientists can’t understand how AI makes its predictions any more than chess players can explain their moves, so they must distill the information into a form a human can understand—readable, play-by-play instructions. 

This second step in the process is accomplished by having the AI agent play another game where it is rewarded for building logical rules matching what it already knows. “That’s as close as I can get to seeing what is inside the neural network,” Jensen says. 
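One way to picture this second game is as a search over candidate rules, each scored by how often it agrees with what the model already predicts. The sketch below is an illustration only. The black-box predictor, the ingredient labels, and the restriction to simple "all of these must be present" rules are assumptions, not details from the study.

```python
from itertools import combinations

INGREDIENTS = ("A", "B", "C", "D")  # hypothetical ingredient labels


def black_box(medium):
    """Stand-in for a trained model's growth prediction (made up for this demo)."""
    return "A" in medium and "C" in medium


def subsets(items):
    """Every subset of items, as frozensets."""
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def rule_score(required, model):
    """Points: number of media where 'all of `required` present' matches the model."""
    return sum((required <= m) == model(m) for m in subsets(INGREDIENTS))


def best_rule(model):
    """Candidate rule that agrees with the model on the most media."""
    return max(subsets(INGREDIENTS), key=lambda req: rule_score(req, model))


rule = best_rule(black_box)
print("Growth requires: " + " AND ".join(sorted(rule)))  # Growth requires: A AND C
```

The winning rule is a readable summary of the predictor's behavior, which is the kind of output a human can inspect even when the model itself is opaque.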

A single human technician oversees the robots, including their calibration, and disposes of waste from the experiments. But this individual has no idea what experiments the robots are doing. Jensen likens this to someone working in an Amazon factory who picks items to be put in a box not knowing why someone ordered those things or where they’re going.

Productivity Aid 

The real novelty of the platform is that it maps the metabolism of microbes starting from a blank slate, notes Jensen. Other groups have automated science platforms using everything known about an organism to predict what will be useful experiments, but that approach relies on having big databases of knowledge as a starting point.  

The U-M team started with S. gordonii and S. sanguinis because they are known to be found in the mouths of people who have good oral health outcomes, he says. Cavities filled by dentists are the result of a microbial war where bad bacteria have overgrown and produced a lot of acid that breaks down the enamel of teeth. Researchers want to understand how to prevent this dysregulation of the microbial community by tipping the scales in favor of the good bacteria.

Most of the work being done by the U-M bioengineers focuses on the oral microbiome, which is already being regularly reengineered by toothpastes containing baking soda to neutralize acids, fluoride to control the growth of bad bacteria, and sweeteners to promote saliva that is naturally rich in antimicrobial compounds, he points out. Beyond toothpaste additives, people might also be given advice about what they should and shouldn’t eat to help the good bacteria flourish.  

On a broader scale, the data-generating AI platform could be applied to all microbes affecting our health, says Jensen. “We know there are things in our gut that show up in people who are healthier, and we want to promote those bacteria, but first we need to know what they like to eat.” If beneficial bacteria are to grow in higher abundance, they need to be fed the right diet, and at present no one knows what they like, what they don’t, or what inhibits their growth.

As for the application of AI in science, the future isn’t likely to include microbiology being done in massive robotic centers but rather AI at the benchtop with small robots assisting individual scientists with whatever they happen to be studying, says Jensen. That’s what happened over time with computer mainframes, which were once feared as replacements for people but instead evolved into smaller, distributed technology in the form of smartphones and personal computers that make everyone more productive.