April 12, 2007 | This year, the journal Nucleic Acids Research published its 14th annual molecular biology database issue. The 2007 compendium covered 174 databases: 106 new entries and 68 updates. This brings the total number of databases in the NAR online Molecular Biology Database Collection to 968, an impressive increase of 110 on the previous year.
Although each of these repositories has its diehard aficionados, one entrant has come in for some unusually heavy scrutiny lately. The Human Metabolome Database (HMDB) is the work of a 40-member all-Canadian team, led by David Wishart, of the department of computing science at the University of Alberta, Edmonton.
Motivated by the absence of a metabolomic equivalent of GenBank that could provide information and possibly even samples of metabolites, researchers secured $7.5 million in funding from Genome Canada in 2005 for the “Human Metabolome Project.” Their goal is “to improve disease identification, prognosis, and monitoring; provide insight into drug metabolism and toxicology; provide a linkage between the human metabolome and the human genome; and develop software tools for metabolomics.”
Metabolomics is the effort to identify and catalogue the thousands of metabolites found in human blood and urine, as well as in other organisms. A full metabolite catalogue would improve efforts to understand the production of metabolic biomarkers, the course of disease progression, and the metabolic actions and toxicity of new drugs in preclinical research.
According to Wishart and colleagues, HMDB is “the most complete and comprehensive curated collection of human metabolite and human metabolism data in the world.” The database contains records for more than 2,500 human metabolites culled from “thousands of books, journal articles, and electronic databases,” and the Canadian team estimates the final tally will be more than double the current number. There are typically dozens of data fields, including synonyms, structural and physico-chemical data, NMR and MS spectra, disease associations, pathway information, sequence and SNP data, and external links. The HMDB is available at: www.hmdb.ca.
“Fundamentally,” Wishart and colleagues write, “HMDB is a multi-purpose bioinformatics-cheminformatics-medical informatics database with a strong focus on quantitative, analytic or molecular-scale information about metabolites, their associated enzymes or transporters and their disease-related properties. HMDB combines the data-rich molecular biology content normally found in curated sequence databases such as SwissProt and UniProt with the equally rich data found in KEGG (about metabolism) and OMMBID (about clinical conditions).”
But in London, Imperial College’s Jeremy Nicholson, the founding father of metabolomics, says the significance of the HMDB has been overblown. “It is just a database of compounds that were mostly known to be involved in human metabolism,” he says. “The project did not address the key issue — the variance of metabolites between tissues, cells, and biofluids. That is the important thing which classifies diseases and phenotypes.”
Last year, Nicholson’s team collaborated with scientists at Pfizer to publish a paper in Nature on “Pharmaco-metabonomic phenotyping and personalized drug treatment.” The goal is to predict drug-related outcomes in humans by measuring metabolic signatures in body fluids. Indeed, Nicholson notes that humans vary considerably in their metabolic profile according to a host of factors, including age, sex, time of day, time of month (in women), diet, gut flora, pollutant exposure, ethnicity, fitness, and more. Nicholson acknowledges that the HMDB is “a useful catalogue,” but one of limited novelty. For him, HMDB “only forms the index for the book of the human metabolome, not the text.”
Wishart professes the utmost respect for Nicholson, but says criticism of HMDB as a mere list is unfair. “It would be like saying the Encyclopedia Britannica is just a list or that GenBank is just a list,” he says. His group has released a new database called FooDB (food components and additives), which complements HMDB and the DrugBank database. “These databases, if printed off, would be 100,000 pages long. They contain an enormous amount of biological, chemical, clinical, biochemical data.” He adds, “We’ve had to upgrade our servers to deal with the heavy load,” of some 200,000 hits per month.
See related article: The Human Metabolome Contretemps