By Kevin Davies
May 15, 2003 | Given Francis Collins’ keynote admonition at the Bio-IT World Conference & Expo that we are still very much in the “genome era” (see “What Next?” April 2003 Bio-IT World, page 27), he would probably have disapproved of a symposium entitled “Post-Genomic Frontiers.” But regardless of the title, new vistas opened up by the genome project signal a golden era for scientists, especially in the areas of diagnostics and therapeutics.
Kerstin Lindblad-Toh of the Whitehead Institute Center for Genome Research was one of the leaders of the consortium that sequenced the mouse genome – a feat requiring 41 million DNA sequence reads spanning 96 percent of the genome. Comparative studies indicate that 5 percent of the mouse genome is under selection, meaning it is strongly conserved with the human sequence. Curiously, however, genes make up only one-third of this DNA.
What are these other evolutionarily conserved sequences? Some are genes that code for RNAs; others are regulatory or structural elements. Insights into their function will come from detailed study of various organisms, including yeast, with several strains now fully sequenced. Another source of insight is ENCODE (Encyclopedia of DNA Elements) – a plan to study 1 percent of the human genome in rich detail.
Meanwhile, Lindblad-Toh’s team is moving on to other mammalian genomes, notably the chimpanzee and the dog. The choice of canine breed, she admitted, is a politically charged topic. The National Human Genome Research Institute has yet to decide, but the top two candidates are the boxer and the beagle.
Mark Daly, Whitehead/Pfizer computational biology fellow at the Whitehead Institute, discussed computational approaches to unmasking the genes behind common diseases such as heart disease. Such genes are notoriously hard to find because of the complex mixture of genetic and environmental risk factors at play. The classic example is the APOE gene in late-onset Alzheimer’s disease – inheritance of a double dose of e4 variants increases the risk of Alzheimer’s 4-fold. Recent work on Crohn’s (inflammatory bowel) disease susceptibility has incriminated the caspase recruitment domain gene (CARD15), which codes for a protein that may function as an anti-bacterial agent. Several other exciting examples are also emerging, including genetic factors in diabetes and bipolar disorder.
Hip, Hip, Array
The availability of the full genome has revolutionized the field of cancer diagnostics. Todd Golub, director of the cancer genomics program at the Whitehead Institute Center for Genome Research and Charles A. Dana investigator in human cancer genetics at the Dana-Farber Cancer Institute, reviewed molecular profiling studies that have not only reclassified acute lymphocytic leukemia, but also pinpointed FLT3 kinase as an exciting drug candidate. A Novartis drug recently entered clinical trials, just one year after the original FLT3 discovery. “The path to the clinic is using these [expression] patterns from experiments designed for diagnosis to bona fide therapeutic targets,” said Golub.
In recently published work on lung cancer, Golub said his team has found that “some primary tumors express [a] metastasis signature” – a group of 17 genes whose altered activity is strongly correlated with metastasis, and which also “transcend different tumor types,” including breast and prostate. “Tumors either have high or low metastasis potential,” Golub concluded; the presence of this signature is a strong “prediction of future metastatic potential” – and death. Golub also addressed the need for “robust biomarkers” in cancer, which could be found using proteomics, for instance by comparing normal and leukemia serum samples.
In another promising example of how diagnostics are impacting therapeutics, new studies have revealed a 5-gene signature associated with myeloid cell maturation. A screen for small molecules that switch on this signature could push leukemic cells to a more normal physiology.
Despite the excitement surrounding gene expression, Steve Gygi, assistant professor of cell biology at Harvard Medical School, cautioned that RNA quantitation is not a surefire method to monitor protein expression – “the biological effector molecule of the cell.” For that, you need proteomics.
(Gygi demonstrated the catch-all nature of proteomics – protein identification, quantification, activity, structure, and molecular interactions – by showing pictures of a Japanese market displaying square watermelons, produced by growing the fruit in a box. “If they weren’t produced by genomics,” quipped Gygi, “it must have been proteomics!”)
Recent progress in mass spectrometry has been remarkable. Six years ago, it took a day to sequence a dozen or so building blocks in a single peptide. Today, mass spectrometry can sequence one peptide per second.
In the closing talk, Gene Codes President Howard Cash delivered a poignant summary of the forensic and bioinformatic efforts marshaled to identify the victims of the September 11, 2001, attacks, when 110 floors of the World Trade Center collapsed into a mere 70 inches, killing more than 2,700 people. Software designed by Gene Codes within three months of the disaster was used to type DNA extracted from bone and dental samples painstakingly recovered from ground zero. The remains were compared to reference DNA samples submitted by family members, derived from hairbrushes, razors, and other sources. “The data were there, there was just no way to get at them,” said Cash.
To date, about half of the victims have been positively identified, and in most cases the (partial) remains have been transferred to the families. Cash said he hoped that the procedures established in this extraordinary effort could be adapted to help the victims of a future earthquake in Turkey or a hurricane in the Philippines.
Other potential applications were best left unsaid.