Healthy Gut Microbiome Reference Database Initiated

November 6, 2019

By Paul Nicolaus

Trillions of microorganisms live in and on our bodies, impacting crucial processes like digestion and offering up new possibilities for the treatment of an array of ailments. Despite the spike in interest in this realm of research, some experts have pointed out that there is still a need to better understand the types and ratios of microbes found in the healthy human gut.

In a paper published in September in PLOS ONE (doi: 10.1371/journal.pone.0206484), a group of researchers presents their efforts to address this issue. They describe an initial baseline healthy gut microbiome reference database and put forth a prototype reporting template. The goal behind the work is to map the healthy gut microbiome so that they and other researchers can use the data to develop disease-specific prediction models.

To arrive at their findings, the researchers genetically sequenced 48 fecal samples from 16 healthy volunteers recruited from the George Washington University campus and combined those data with 50 fecal metagenomic samples downloaded from the Human Microbiome Project (HMP).

The resulting GutFeeling KnowledgeBase (GutFeelingKB), a healthy human reference microbiome list and abundance profile, describes 157 organisms in all (8 phyla, 18 classes, 23 orders, 38 families, 59 genera, 109 species, and additional subspecies and strains) that make up the baseline biome and can potentially be used as healthy controls for studies related to dysbiosis. Firmicutes was the largest phylum of bacteria. The most represented groups were Clostridia (20%), Bacteroidia (19%), Bifidobacteriales (17%), Enterobacterales (14%), and Lactobacillales (14%), which span several phyla rather than belonging to Firmicutes alone. Out of 109 species, 84 were found in all of the samples.
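The "found in all of the samples" figure amounts to a set intersection across samples. The short Python sketch below illustrates that idea only; the function name, species labels, and read counts are hypothetical and are not taken from GutFeelingKB or the paper's pipeline.

```python
def species_in_all_samples(samples):
    """Return the set of species detected (count > 0) in every sample.

    `samples` is a list of dicts mapping species name -> read count.
    """
    present = [{name for name, count in sample.items() if count > 0}
               for sample in samples]
    if not present:
        return set()
    return set.intersection(*present)


# Hypothetical toy data: three samples, three species labels.
toy_samples = [
    {"species_x": 120, "species_y": 4, "species_z": 0},
    {"species_x": 80, "species_y": 0, "species_z": 15},
    {"species_x": 200, "species_y": 9, "species_z": 3},
]

core = species_in_all_samples(toy_samples)
print(sorted(core))
```

In this toy case only one of the three species appears in every sample; in the study, the same kind of count gave 84 of 109 species.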

Having this overarching umbrella in place “allows us to slowly start understanding which pathways these organisms express,” said Raja Mazumder, professor of biochemistry and molecular medicine at George Washington University. Understanding pathways would be the next step toward better understanding how humans interact with these microorganisms.

A person’s microbiome is potentially tied to a whole array of factors, such as diet, exercise, stress, and antibiotic use. “All of these things can change our microbiome in our gut,” Mazumder told Bio-IT World. “If we can look at many, many, many people, I believe what we will find out is that the healthy definition is broad, and there will be different microorganisms in different people.”

Toward Clinical Practice

Bringing microbiome science into routine clinical practice requires a standard report that enables the comparison of an individual’s microbiome to the expanding knowledge base of “normal” microbiome data, according to the researchers. Their Fecal Biome Population Report (FecalBiome) and the underlying GutFeeling KnowledgeBase (GutFeelingKB) are meant to address this need.

Their tool could potentially be useful to regulatory agencies as they assess fecal transplant and other microbiome products, for example, or to clinicians looking to evaluate the gut microbial status of patients. “The goal of the database and report is to connect lab results with outcomes,” the researchers noted in the paper.

While creating FecalBiome, a standard reporting template that includes absolute and relative abundance information about a sample compared to an average across the database, the researchers drew inspiration from the reporting of blood tests that take place as part of a routine physical exam.
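To make the blood-test analogy concrete, here is a minimal Python sketch of how one row of such a report might be computed. It is not the FecalBiome implementation: the function names, the read counts, and the reference averages are all hypothetical. Only the idea of reporting absolute and relative abundance alongside a database average comes from the paper.

```python
def relative_abundance(counts):
    """Convert absolute read counts per organism into fractions of the sample total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {name: n / total for name, n in counts.items()}


def report_rows(counts, reference_mean):
    """Build one row per organism: absolute count, relative abundance,
    reference average, and the difference between the two."""
    rel = relative_abundance(counts)
    rows = []
    for name in sorted(counts):
        ref = reference_mean.get(name)  # None if the organism is not in the reference
        delta = None if ref is None else rel[name] - ref
        rows.append((name, counts[name], rel[name], ref, delta))
    return rows


# Hypothetical sample and reference averages (illustrative numbers only).
sample_counts = {"Bacteroides": 600, "Bifidobacterium": 100, "Escherichia": 300}
reference_mean = {"Bacteroides": 0.50, "Bifidobacterium": 0.15}

for name, absolute, rel, ref, delta in report_rows(sample_counts, reference_mean):
    ref_text = "n/a" if ref is None else f"{ref:.2%}"
    delta_text = "n/a" if delta is None else f"{delta:+.2%}"
    print(f"{name:16s} {absolute:6d} {rel:8.2%} {ref_text:>8s} {delta_text:>8s}")
```

Organisms missing from the reference are shown as "n/a" rather than compared against zero, since absence from a reference list is not the same as a measured average of zero.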

While some companies now offer fecal microbiome tests that compare results to other healthy individuals (and recommend products), more data will yield better-validated results. “This is the first step towards understanding that this can become a test,” Mazumder said.

The research was supported in part by funds from the National Science Foundation (NSF), the NIH National Center for Advancing Translational Sciences, and the McCormick Genomic and Proteomic Center (MGPC) at George Washington University.

Reactions to Findings

“The findings provide an important baseline for what a microbiome in a ‘healthy’ person looks like,” according to Amesh Adalja, a senior scholar at Johns Hopkins University who was not involved in the study.

Science and medicine are just starting to understand the microbiome and the role it plays in health and disease. “At this early stage, it is essential to understand what a ‘normal’ microbiome is and how it changes,” he added. Any study that puts forth a framework to understand dysbiosis and evaluate treatments like fecal transplants is “an important step forward.”

“There is still a lot we don’t know about the microbiome of healthy individuals,” agreed osteopathic physician Bryan Tran of health and nutrition company DrFormulas, “so studies on this topic do make significant contributions to the current body of knowledge.”

From his perspective, one interesting finding is that Bacteroides made up between 0.37% and 98.82% of all gut flora in healthy individuals. “This is quite a large range,” he pointed out, and any subsequent studies should attempt to explain why.

Also, Bifidobacteria, which are commonly used as probiotics, made up between 0.004% and 12.21% of all gut flora in healthy individuals. It would be interesting, he noted, to compare these abundances across conditions like diarrhea, constipation, irritable bowel syndrome, or obesity to explore whether there is a relationship between certain gut microbes and disease states.

Meanwhile, microbiologist Alex Berezow of the American Council on Science and Health expressed skepticism, given that the researchers collected fecal samples from students at just one campus over a short period of time. “That’s problematic because a healthy microbiome probably changes over time,” he explained.

In addition, there is no reason to expect that George Washington University students are a suitable representation of all humanity. “The microbiome of a healthy person from the United States will likely differ from that of a healthy person from, say, Asia or Africa,” Berezow pointed out.

The effort is noteworthy in the sense that it assigns value to genes in the microbiome that cannot be mapped to specific organisms (also known as “dark matter”), according to Raja Dhir, co-founder of Seed Health, a microbial sciences company.

It is too soon, however, to say whether these genes are relevant in determining the health or disease status of a human. From his vantage point, other recent efforts to quantify the function of the microbiome are probably even more significant within this field of study.

Engaging a Wider Community

Fecal samples from healthy individuals have been collected by plenty of other groups as part of other projects, Mazumder said of the bigger picture of gut microbiome research.

Most studies currently use study-specific control groups and reporting methods, however, and the studies that do create clinically relevant results are based upon marker genes, he and colleagues pointed out in their paper, which means they do not shed light on the origin of dark matter and cannot be integrated with whole-genome shotgun sequencing studies.

These issues are exacerbated by the fact that different bioinformatics pipelines lead to different results, mainly because “all current pipelines use a limited number of ad hoc reference organisms to determine abundance,” they explained. The understanding of the baseline healthy microbiome can, in turn, be flawed. “As such, there is a need for aggregation, validation for interoperability, and eventual standardization of methods and reporting.”

“We want to make sure that the bioinformatics pipeline is the same pipeline,” Mazumder said, “so that we can actually compare the results and create a catalog of these organisms.” The metagenomic analysis pipeline he and colleagues developed includes three software tools and one sequence database called Filtered-nt. All the software tools (CensuScope, HIVE-Hexagon, and IDBA-UD) are integrated into the HIVE platform.

Although there is an expectation that other studies will find additional organisms present in healthy gut microbiomes, they view the GutFeelingKB as an important first step. The reference list and abundance information can be used for comparative analysis of samples from healthy people across the globe and help understand differences due to factors like diet, disease, and therapy.

The group at George Washington University intends to continue working on their project by adding to the resource any time they collect samples from healthy individuals while pursuing research on diseases such as diabetes, epilepsy, or colon cancer. “We want to continue increasing our footprint in our microbiome research as we move along,” he said.

They also hope to attract other microbiome researchers. They wanted to publish this paper because there are many other researchers in this space, he added, and “everything that we have done so far is freely available” so that others can build upon it.

The goal is to engage the wider community to help better define or annotate these organisms, Mazumder continued, so that anybody—including the Human Microbiome Project or a company like Thryve Inside—can use the application programming interfaces to pull knowledge out of the database in the reporting system.

“I feel that is where we can bring in value in terms of curation of the information into our research and whatever is known elsewhere to make it useful for people—for physicians, for bioinformaticians, for researchers,” he added.

Difficulties Remain

There are several challenges looking ahead, though, according to Mazumder. The first is that DNA extraction from fecal samples uses one common procedure for all of the 150 or so microorganisms.

That single procedure may extract DNA well from some microorganisms and less well from others. This means “the final total DNA could have a bias depending on the type of organism which is in your gut,” he said, “which can skew the results.”
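A small worked example shows how this kind of bias can distort a profile. The Python sketch below assumes two hypothetical organisms present in equal numbers and invents an extraction efficiency for each; none of these values come from the study.

```python
def observed_fractions(true_cells, extraction_efficiency):
    """Relative abundance seen after extraction, assuming each organism
    yields DNA in proportion to (cell count * extraction efficiency)."""
    recovered = {
        name: true_cells[name] * extraction_efficiency[name]
        for name in true_cells
    }
    total = sum(recovered.values())
    return {name: amount / total for name, amount in recovered.items()}


# Hypothetical: equal true abundance, but organism_B releases DNA half as readily.
true_cells = {"organism_A": 1000, "organism_B": 1000}
efficiency = {"organism_A": 0.9, "organism_B": 0.45}

fractions = observed_fractions(true_cells, efficiency)
print(fractions)
```

Under these assumed numbers, a true 50/50 mix would be reported as roughly two-thirds organism_A and one-third organism_B, even though nothing about the gut itself differs.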

“The second challenge is the dark matter,” he said, since large amounts of next-generation sequencing data cannot be matched to any known organism. There is a need to develop methods for de novo assembly, for example, or design primers to better understand that dark matter. Exploring those organisms will be crucial because they likely have some effect on us.

The last notable challenge, from his vantage point, relates to the analysis of data. Sequencing is getting cheaper and cheaper, which means there is a need for commercial, well-packaged algorithms and hardware to run these large datasets. Sometimes generating the data is less expensive than analyzing the data.

It is essential for universities, funding agencies, and commercial companies to realize that the analysis of data is just as critical as the generation of data, he added. “Otherwise, that research is not going to move as fast as we want.”

Paul Nicolaus is a freelance writer specializing in science, nature, and health. Learn more at