By Mark D. Uehling
September 9, 2002 | The scientific spotlight is shifting from merely identifying genes to studying their activity, from just recording gene sequences to understanding their functions.
A new suite of online tools called SAGE Genie promises to have a major impact on the study of the highly variable expression of human genes in different cells, at different stages of development, and in health and disease. A systematic comparison of the total repertoire of gene expression -- commonly known as the transcriptome -- from, say, a normal colon cell and its cancerous counterpart can yield important insights into pathogenesis and diagnosis, and can point the way to new drugs.
The best-known method to study gene expression in a cell or organism is through DNA microarrays. But another powerful technique, originally developed by Bert Vogelstein’s group at Johns Hopkins University Medical School, is called serial analysis of gene expression (SAGE). As published recently in the Proceedings of the National Academy of Sciences, SAGE Genie (a suite of five applications and multiple databases) is essentially a Web browser for gene activity. “We’re defining the transcriptomes of cancer and normal tissues,” explains geneticist Gregory Riggins of Duke University. “We wanted a place where we could archive all the SAGE information and view it in an intuitive format.”
SAGE Genie was sponsored by the National Cancer Institute’s (NCI) Cancer Genome Anatomy Project, and the team behind it includes physicians, geneticists and bioinformaticians at Duke, Brazil’s Ludwig Institute, Harvard University and the National Institute on Aging. In the new report, scientists started with 6.8 million SAGE “tags” -- short sequence fragments, each taken from a defined position in an RNA transcript and used to identify it -- derived from 171 different cell types. Then the team performed a sort of biological indexing, linking a subset of the tags to genetic data in seven published databases of expressed genes. That produced a monstrous 102MB, 5-million-record Oracle database that allows SAGE Genie to match SAGE tags with published gene names and accession numbers in a searchable way.
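The indexing idea can be illustrated with a small sketch. It is not the SAGE Genie schema, which the report summarized here does not spell out; the tag sequences, gene names and accession numbers below are invented placeholders, and the function names are hypothetical. It shows only the general shape of a two-way lookup between tags and gene identifiers.

```python
from collections import defaultdict

# Hypothetical records: (tag sequence, gene name, accession number).
# Every value here is an invented placeholder, not real SAGE data.
RECORDS = [
    ("AAAAAAAAAA", "GENE_A", "ACC0001"),
    ("CCCCCCCCCC", "GENE_B", "ACC0002"),
    ("CCCCCCCCCC", "GENE_C", "ACC0003"),  # one tag can match more than one gene
]


def build_indexes(records):
    """Build a tag -> genes index and a gene-name/accession -> tags index."""
    tag_to_genes = defaultdict(list)
    key_to_tags = defaultdict(set)
    for tag, gene, accession in records:
        tag_to_genes[tag].append((gene, accession))
        # Index both identifiers case-insensitively so either can be searched.
        key_to_tags[gene.upper()].add(tag)
        key_to_tags[accession.upper()].add(tag)
    return tag_to_genes, key_to_tags


def tags_for(query, key_to_tags):
    """Return the sorted tags linked to a gene name or accession number."""
    return sorted(key_to_tags.get(query.strip().upper(), ()))


if __name__ == "__main__":
    tag_to_genes, key_to_tags = build_indexes(RECORDS)
    print(tags_for("gene_a", key_to_tags))
    print(tag_to_genes["CCCCCCCCCC"])
```

The third record is there to show why matching is hard: a single short tag can point to more than one candidate gene, which is the ambiguity a best-match step has to resolve.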
The Color of Cancer
Scientists visiting the SAGE Genie Web site (http://cgap.nci.nih.gov/SAGE) can type in a gene name or accession number and see which organs in the human body actually express that gene in normal or cancerous tissue. Different colors indicate the relative abundance of the gene’s transcripts. “If one of your cancer tissues shows up in red, and everything else is blue, that indicates the gene is turned on highly in that particular type of cancer,” explains Riggins.
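A rough sketch of how such a color display could be driven by tag counts follows. It rests on assumptions: the article does not describe SAGE Genie's actual scale, so the per-million normalization, the thresholds, the intermediate color and the library figures are all hypothetical and chosen only for illustration.

```python
def tags_per_million(tag_count, library_total):
    """Scale a raw tag count by library size so libraries can be compared."""
    if library_total <= 0:
        raise ValueError("library_total must be positive")
    return tag_count * 1_000_000 / library_total


def color_for(tpm, low=10.0, high=100.0):
    """Map normalized abundance to a color; thresholds are arbitrary placeholders."""
    if tpm >= high:
        return "red"      # highly expressed
    if tpm < low:
        return "blue"     # weakly expressed or absent
    return "yellow"       # hypothetical intermediate bin


if __name__ == "__main__":
    # Invented libraries: (count of one gene's tag, total tags sequenced).
    libraries = {
        "colon_normal": (1, 200_000),
        "colon_tumor": (50, 100_000),
    }
    for name, (count, total) in libraries.items():
        tpm = tags_per_million(count, total)
        print(name, tpm, color_for(tpm))
```

With these made-up numbers, the tumor library lands in the red bin and the normal library in the blue one, the pattern Riggins describes.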
At Johns Hopkins, the appeal of the new Web site is clear. “SAGE Genie takes an extra several steps in the organization and presentation of these data to make it extremely easy to use,” says Victor Velculescu, an assistant professor of oncology who helped develop the technique. The SAGE Genie team is using statistical algorithms to find the best gene-tag match, he notes. “Historically, there have been difficulties linking SAGE tags to genes because of all the genetic data that wasn’t categorized correctly,” he says. “The probability of errors is going to be much smaller now with SAGE Genie.”
Knocking the Chips
The new results may help SAGE emerge from the long shadow cast by microarray-based tools for analyzing gene expression. While undeniably popular, microarray data are not easily reproducible in different labs, whereas “the expression levels in SAGE are absolute,” says Velculescu.
For SAGE partisans, this makes the new online data more useful to other scientists and especially well suited to computerized manipulation and analysis. Says Velculescu: “The data from any experiment can be compared to historical data that has been accumulated. They’re not just data that are useful on a particular day and then you throw them away.” Velculescu notes the growing popularity of SAGE among scientists, pointing to more than 200 published articles and to the annual SAGE conferences, which are also drawing more interest.
The SAGE Genie project has also excited Genzyme Corp., which licenses SAGE technology from Johns Hopkins to the pharmaceutical industry. “People are realizing they want a lot more information than they can get from an Affymetrix or chip system,” says Antony Newton, director of commercial development at Genzyme Molecular Oncology.
Newton concedes that SAGE costs more than microarray approaches to gene expression. For now, SAGE remains a niche technology with most of its fans in academia and government. If Affymetrix is the Microsoft of gene expression, Newton proposes, SAGE is the Linux -- open, transparent, but something of an acquired taste. The sheer ease of use of SAGE Genie may begin to change that.