The study of metabolomics is attracting a flurry of biotechs and academics, with research implications ranging from plant biology to drug discovery
By Karen Hopkin
July 14, 2004 | In the beginning, there was the genome, which was good — but not good enough. Soon scientists were talking about transcriptomes and proteomes, too.
Now some biologists are venturing beyond central dogma-land and entering metabolome country, a territory populated by the hordes of small molecules that govern living systems. Armed with mass spectrometers and million-dollar nuclear magnetic resonance (NMR) machines, researchers in industry and academia are working to catalog these collections of biochemical compounds — a pursuit dubbed metabolomics — and determine how they respond when organisms are challenged by drugs, disease, or stress, an approach some call metabonomics (see "L-M-N-Omics").
For biotech companies, metabolomics and metabonomics could speed drug development and uncover biomarkers for diagnosing disorders. In university labs, they provide another means for decoding the information locked in the genome. As the results begin rolling in, scientists now face the formidable challenge of learning how to handle, process, and extract useful information from these mountains of metabolic data.
Peaks and Patterns

Measuring metabolites is not new. For decades, clinicians have charted chemistries in blood, urine, and other bodily fluids — using glucose to track diabetes and cholesterol to monitor heart disease, for example. What's new is that researchers are now casting a wider net, attempting to gather an unbiased sample of metabolites that can serve as a snapshot of an organism's physiology. "It's essentially global biochemistry," says John Ryals, president of Metabolon in Durham, N.C.
The goal is to be able to distinguish between an individual who is healthy and someone who has — or might develop — a disease. To do that, researchers start by measuring as many metabolites as they can in samples from different patient populations. They then compare these metabolic profiles and look for patterns, molecules whose concentrations deviate in a disease. They can then use these molecules to develop diagnostic tools or to derive a deeper understanding of the disease mechanism.
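As a rough illustration of that comparison step, here is a minimal Python sketch. It is not any company's actual method: the concentrations are invented, the function names are hypothetical, and Welch's t-statistic with an arbitrary cutoff stands in for whatever statistics a real lab would choose.

```python
from math import sqrt
from statistics import mean, stdev


def welch_t(a, b):
    """Welch's t-statistic for two independent samples with nonzero variance."""
    var_a, var_b = stdev(a) ** 2, stdev(b) ** 2
    return (mean(a) - mean(b)) / sqrt(var_a / len(a) + var_b / len(b))


def flag_deviating(healthy, disease, cutoff=3.0):
    """Return {metabolite: t} for metabolites whose |t| exceeds the cutoff."""
    flagged = {}
    # Only compare metabolites measured in both populations.
    for name in healthy.keys() & disease.keys():
        t = welch_t(disease[name], healthy[name])
        if abs(t) > cutoff:
            flagged[name] = t
    return dict(sorted(flagged.items()))


# Invented concentrations in arbitrary units, one list per patient group.
healthy = {"glucose": [5.0, 5.2, 4.9, 5.1], "lactate": [1.0, 1.1, 0.9, 1.0]}
disease = {"glucose": [7.9, 8.3, 8.1, 7.7], "lactate": [1.0, 0.9, 1.1, 1.0]}

flagged = flag_deviating(healthy, disease)
print(flagged)  # only glucose is flagged in this toy data set
```

With thousands of metabolites measured at once, a real analysis would also need to correct for multiple comparisons, which this sketch leaves out.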
Researchers at Metabolon are working on tests for Alzheimer's disease and other neurological disorders. And scientists at Metabometrix, a company founded by Jeremy Nicholson and colleagues at Imperial College London, have devised a method for predicting cardiovascular disease. By analyzing NMR profiles obtained from a blood sample, these researchers can distinguish between people with normal coronary arteries and those who have triple blockages, a sign of advanced cardiovascular disease.
LONDON CALLING: Jeremy Nicholson of Imperial College and Metabometrix next to a superconducting magnet, the central component of the NMR system used to analyze biofluid samples.
Some companies have formed research alliances with academics or big pharmas. Beyond Genomics, in Waltham, Mass., has partnered with GlaxoSmithKline, Novartis, AstraZeneca, and others to develop tools for monitoring drug response and toxicity and to search for biomarkers for disorders such as rheumatoid arthritis and atherosclerosis. Other companies swap metabolic analyses for cash.
"People send samples, and they get back data," says Oliver Fiehn, a researcher at the Max Planck Institute of Molecular Plant Physiology and founder of MetaProfile, a small fee-for-service company based in Germany. "They want the data, but they don't want to go through the pain."
Metabometrix also does a bustling business, processing samples and generating mathematical models and detailed biological analyses for customers who are running studies on drug toxicity or disease diagnosis. "We were formed in response to demand," Nicholson says. "Metabometrix doesn't even have a marketing department — which is good because I hate marketing people!"
Having metabolic profiles that indicate whether a drug has the desired specificity or undesired toxicity would help pharmaceutical companies determine which leads to pursue. Such decisions are sometimes based on nonmedical factors such as "patentability" because companies have "no real data" showing how their proto-drugs work in people, Ryals says.
The same sort of data can help researchers select a population of patients that will benefit most from a medication. "Many drugs have remarkable effects on a subset of patients," says Robert McBurney of Beyond Genomics. A number of anticancer drugs — Iressa, Erbitux, Tarceva, and Avastin, for example — are "very effective but in no more than 30 percent of patients," he says. Many of these meds run $10,000 or more for a course of treatment, so "to treat 100 patients, it could be $1 million, of which $700,000 is wasted expenditure," he says. Identifying who will respond to a drug should cut costs, McBurney says, and enable doctors to "get the right treatment to the right patient at the right time."
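McBurney's arithmetic can be checked in a few lines of Python. This is only a sketch of the back-of-the-envelope calculation in the quote; the function name and parameters are made up for illustration, and the figures are the round numbers he cites.

```python
def treatment_spend(patients, cost_per_course, response_pct):
    """Return (total spend, spend on patients who do not respond)."""
    total = patients * cost_per_course
    responders = patients * response_pct // 100  # integer math avoids float rounding
    wasted = (patients - responders) * cost_per_course
    return total, wasted


# 100 patients, $10,000 per course, 30 percent response rate (figures from the quote).
total, wasted = treatment_spend(100, 10_000, 30)
print(total, wasted)  # 1000000 700000
```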
META ANALYSIS: The study of metabolomics is the natural progression from genomics and proteomics. Metabolome analysis is a valuable means of inferring gene function.
The Lure of Metabolomics
Many companies are turning to metabolomics because metabolites are themselves the products of genes and proteins, and so provide a more direct readout of physiology. And in some ways, small molecules are easier to track than gene transcripts or proteins. "Glucose is glucose," Ryals says. "Whether it's in humans or E. coli, it's the exact same molecule." So an organism doesn't need to have its genome sequenced before it can be studied.
And the techniques tend to be reproducible. Nicholson heads up a consortium of drug companies that have collaborated to profile the toxicity of some 150 compounds. In lab-to-lab comparisons, the reproducibility of their NMR- and MS-based methods was at least "20 times better than the best genomics studies ever published," Nicholson says.
Academics are also jumping on the metabolomics bandwagon. The National Institutes of Health has put out a call for project proposals. And many researchers are already knee-deep in metabolic data, particularly those who study plants. At the Max Planck Institute of Molecular Plant Physiology, Fiehn and his colleagues use metabolic profiling to tease apart the roles that individual enzyme isoforms play in regulating the development and metabolism of potato plants. Down the hall, Alisdair Fernie's group is combing through measurements of metabolites, hormones, and mRNAs to get a handle on the physiology of fruit ripening in tomatoes.
Work at the Max Planck Institute spawned another company that is tackling the entire Arabidopsis genome, one mutant at a time. Researchers at Metanomics in Berlin have generated 55,000 knockout lines, each missing a single gene. In addition, they've engineered another 155,000 lines of Arabidopsis that contain individual genes from yeast, E. coli, or crop plants. By examining the metabolic profiles of these unique populations, researchers hope to identify genes that boost vitamin concentrations, increase crop yield, eliminate toxins, and produce plants that are tolerant to drought or cold.
And Canadian researchers are taking on the human metabolome. With a three-year, $7.5-million grant from Genome Canada, David Wishart and his colleagues at the University of Alberta plan to identify and quantify all of the small molecules present in the body at concentrations greater than 1 µM, which is the cutoff sensitivity for NMR, their weapon of choice (see "Six of One ...").
The sheer volume of all these data can be overwhelming. At Metanomics alone, 50 mass spectrometers run day and night, generating some 3 terabytes to 4 terabytes of data per year. "We would need an army of analysts to sit in front of the mass specs and just look at peaks," says Arno Krotzky, the company's CEO. Instead, bioinformatics experts at companies and universities have developed software to process the profiles, extract information, search for patterns — even to load the data into a database. "You can't plug a mass spec into a database," Ryals says.
Once the data are in, and the interesting patterns have been flagged, the challenge becomes figuring out which metabolites are there. "The primary bottleneck is being able to identify all the things that are changing," says James Willett of George Mason University in Virginia, who is using metabolomics to study how the communication at neuromuscular junctions changes with age. "If you know the metabolite, you immediately know where it came from, where it's going, and what other portions of biochemical networks it affects."
The task is hampered by the lack of a public database. "There's no GenBank of all metabolites typically found in blood or urine or cerebrospinal fluid or other tissues," Wishart says. He and his colleagues have founded a company called Chenomx, which is developing software that will provide a list of reference standards alongside their NMR spectra. Such databases must be made publicly available, Fernie says, or "every group of researchers will have to re-invent the wheel."
Some companies, such as Metabolon and Metabometrix, specialize in collecting metabolic data. Others are gathering and processing several different kinds of data — gene sequences and expression profiles, proteomic data, metabolic profiles, and clinical information — in an attempt to assemble a more detailed physiological picture. "We're interested in any molecules that change as an indication of things that are happening to organisms," says Aram Adourian of Beyond Genomics. "It makes sense to go into these analyses with as many clues as possible, so we can triangulate on what pathways are important, and understand why we are seeing what we're seeing."
MIXED SIGNALS: Shown are some of the key factors that affect the likelihood of disease outcomes. Disease results from a complex interaction of host and environmental factors, the field of metabonomics.
Integrating this information is a problem for which "there's no shrink-wrapped solution," Adourian says. But many organizations are working on it. Paradigm Genetics, a systems biology company in Research Triangle Park, N.C., recently received $11.7 million from the National Institute of Standards and Technology to develop programs for blending these different data streams. And Pedro Mendes at the Virginia Bioinformatics Institute is putting together a "database of 'omes" called DOME, which he hopes will help large labs organize and interpret all their 'omic information.
The need for such tools is considerable. Think of a simple bacterium with 4,000 genes, 4,000 proteins, and 600 metabolites. "That's 8,600 things that can be correlated to one another," says Roy Goodacre of the University of Manchester Institute of Science and Technology (UMIST) in England. "Mining those data will be really, really hard work." And handling the equivalent information from humans, who have some 30,000 genes, 1 million proteins, and an estimated 2,500 small molecules, will be "a huge challenge to biology," he says.
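To get a feel for the scale, one can count how many pairwise comparisons those totals imply. The Python sketch below is an illustration, not Goodacre's own calculation: it assumes every gene, protein, and metabolite could be correlated with every other, and it uses the article's rough counts.

```python
from math import comb


def pairwise_comparisons(genes, proteins, metabolites):
    """Number of distinct pairs among all measured entities."""
    entities = genes + proteins + metabolites
    return entities, comb(entities, 2)  # n * (n - 1) / 2


bacterium = pairwise_comparisons(4_000, 4_000, 600)
human = pairwise_comparisons(30_000, 1_000_000, 2_500)

print(bacterium)  # (8600, 36975700)
print(human[1] // bacterium[1])  # how many times more pairs in the human case
```

Even the simple bacterium yields nearly 37 million possible pairs, and the human figure is thousands of times larger, which is one way to read "really, really hard work."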
Once the data are harvested and bundled, researchers need to find a way to get a bird's-eye view. "If you test 40 or 50 drugs, you'll wind up with 1,800 spreadsheets of data," Ryals says. "You need pretty good tools to go through that." Several organizations are working on programs for visualizing 'omic-scale data. Most contain maps of reaction pathways onto which gene-expression information and metabolic data can be painted. The result: a heat map of metabolism that indicates which pathways have been switched on and which have been suppressed.
HEART HEALTH: NMR comparison of serum metabolites from healthy (red triangles) and atherosclerosis (blue squares) patients provides rapid, noninvasive diagnosis of heart disease.
Nonprofit research organization SRI International has produced databases containing pathways for 15 different organisms, including E. coli and humans. Mark Stitt and his colleagues at the Max Planck Institute developed MapMan, a program that lays out Arabidopsis metabolism. And Mendes built a program that spits out mini-maps of metabolic neighborhoods. "You choose your metabolite, and then look for all reactions it takes part in," he says. This block-by-block approach is useful, he says, because "cells don't know about pathways. Everything goes on all at once. But if you look at a whole metabolic network, it's just a mess of arrows. So you center on one metabolite and look at maps one neighborhood at a time." Mendes will make the program publicly available.
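The neighborhood idea is simple enough to sketch in Python. What follows is not Mendes's program, whose internals the article does not describe; it is a toy lookup over a hand-written table of four textbook reactions, with a hypothetical `neighborhood` function that returns the reactions a chosen metabolite takes part in and the other molecules those reactions touch.

```python
# Toy reaction table: name -> (substrates, products). Four reactions only.
REACTIONS = {
    "hexokinase": ({"glucose", "ATP"}, {"glucose-6-phosphate", "ADP"}),
    "phosphoglucose isomerase": ({"glucose-6-phosphate"}, {"fructose-6-phosphate"}),
    "G6P dehydrogenase": (
        {"glucose-6-phosphate", "NADP+"},
        {"6-phosphogluconolactone", "NADPH"},
    ),
    "pyruvate kinase": ({"phosphoenolpyruvate", "ADP"}, {"pyruvate", "ATP"}),
}


def neighborhood(metabolite, reactions=REACTIONS):
    """Return (reactions involving the metabolite, neighboring metabolites)."""
    hits = sorted(
        name
        for name, (substrates, products) in reactions.items()
        if metabolite in substrates | products
    )
    neighbors = set()
    for name in hits:
        substrates, products = reactions[name]
        neighbors |= substrates | products
    neighbors.discard(metabolite)  # the center is not its own neighbor
    return hits, sorted(neighbors)


hits, neighbors = neighborhood("glucose-6-phosphate")
print(hits)
print(neighbors)
```

Centering on glucose-6-phosphate pulls in three of the four reactions and leaves pyruvate kinase off the map, which is the point of looking at one neighborhood at a time.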
In the meantime, researchers keep reminding themselves why they set out to scale these mountains of data: to get a better view of the biology. "It's easy to generate a ton of numbers. It's a hypnotic job," says Lionel Hill of the John Innes Centre in Norwich. "But it's not healthy to go and measure things just for the sake of measuring them. You have to attach meaning."
Then the real fun begins. "We're in late lag phase now," Goodacre says. "And things are about to go boom."
Karen Hopkin has a Ph.D. in biochemistry and is co-author of the textbook Essential Cell Biology (2nd ed.).
PHOTO OF RYALS BY BOB RIVES; PHOTO/GRAPH OF META ANALYSIS BY PHYTOCHEMISTRY; PHOTO OF NICHOLSON BY CAROLYN DJANOGLY/AURORA; PHOTO/GRAPH OF MIXED SIGNALS BY NATURE REVIEWS; PHOTO/GRAPH OF HEART HEALTH BY NATURE MEDICINE