May 12, 2006 | From microarrays to sequencing technology, molecular diagnostics to the interactome, this year’s Bio-IT World Conference showcased exciting advances in genome technology applications, in which software analysis and data management play critical roles.
David Craig of the Translational Genomics Research Institute (TGen) said his group has produced more than 25,000 Affymetrix expression profiles in the past seven years and identified a dozen single-gene mutations for various disorders. An epilepsy mutation in the CNTNAP2 gene recently published in the New England Journal of Medicine “is not a common mutation but a new mechanism that exposes new drug targets.” Although TGen conducts genome-wide association studies using individual genotyping, “pooling offers a great alternative,” Craig said, for a fraction of the cost. This approach has identified a pair of genes involved in episodic memory.
Bill Cheliak, VP of business development at Genizon Biosciences, said the company is using a large Illumina workstation, with similarly positive results. The keys are low-cost genotyping (“without this, you might as well go home”) and a premium IT infrastructure capable of handling tens of terabytes of data, much of which must be shared with pharma partners. Genizon’s first association study in Crohn’s disease has been replicated in a German population, netting a handful of novel druggable targets.
Steve Lincoln, Affymetrix’s VP Informatics, offered a glimpse of new tools and applications. New chips will house probes for every gene exon, and eventually embrace all 3 billion bases in the human genome. The prevalence of alternative splicing “complicates microarray data analysis tremendously,” Lincoln said. The new Exon 1.0 ST Array is fostering new primary analysis methods, with Affymetrix releasing the source code to encourage further development.
The Broad Institute’s chief informatics officer, Jill Mesirov, also discussed advances in gene expression analysis. “Biological invariance acts at the level of pathways or coherent sets of genes, not single genes,” she said. A re-analysis of a year-old diabetes data set using the Kolmogorov-Smirnov test, which here assesses whether the members of a gene set are randomly distributed across a ranked list of genes, revealed a set of 106 genes down-regulated by 20 percent. Similar studies in cancer data sets have identified significant variations in specific pathways in leukemia and functional gene sets in lung cancer.
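The gene-set idea can be illustrated with a minimal sketch of an unweighted Kolmogorov-Smirnov-style running sum. This is not the Broad Institute's implementation; the function name `ks_running_sum`, the gene labels, and the toy gene sets are hypothetical, and a real analysis would add a permutation step to judge significance.

```python
def ks_running_sum(ranked_genes, gene_set):
    """Signed maximum deviation of a KS-style running sum.

    ranked_genes: genes ordered from most up-regulated to most
    down-regulated (hypothetical input for illustration).
    gene_set: the pathway or gene set being tested.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(hits) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    running = 0.0
    extreme = 0.0
    for is_hit in hits:
        # Step up on a set member, down on any other gene, so the sum
        # returns to zero at the end of the list.
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(extreme):
            extreme = running
    return extreme


# Toy ranked list: g1 is most up-regulated, g10 most down-regulated.
ranked = ["g%d" % i for i in range(1, 11)]

# A set clustered at the bottom of the list gives a deviation close to -1.
clustered = ks_running_sum(ranked, {"g8", "g9", "g10"})

# A set scattered through the list stays much closer to zero.
scattered = ks_running_sum(ranked, {"g1", "g5", "g10"})

print(clustered, scattered)
```

A large negative score means the set's genes sit together near the down-regulated end of the ranking, which is how a modest but coordinated shift across many genes can stand out even when no single gene does.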
Susan Hardin, co-founder and CEO of Visigen, said her ultimate goal is a “$1,000 genome per day.” Visigen’s technology involves single-molecule fluorescence detection systems using engineered nucleotides and enzymes, but “the most important part of the work is the software team,” she said, which must handle terabytes of data. Solexa’s Tony Smith agreed — Solexa even recruits programmers with backgrounds in astronomy, given the similarity of sequence data and star field patterns. Solexa has selected a Yoruban male from Nigeria, who was also part of the HapMap project, for its first human genome resequencing program.
Harvard Medical School’s George Church discussed the “open-source” Personal Genome Project. Church listed seven alarming examples whereby personal anonymity in genetics studies has been breached, such as the teenager who was able to trace his biological sperm donor father.
Walter Koch, head of research at Roche Molecular Diagnostics, outlined the company’s AmpliChip technology, which genotypes two cytochrome P450 genes, CYP2D6 and CYP2C19, that are responsible for processing common drugs including beta blockers and antidepressants. Individuals may carry anywhere from 0 to 13 copies of the CYP2D6 gene, profoundly affecting the metabolism of drugs such as the antipsychotic risperidone. “In principle, physicians know to adjust the dose,” said Koch, “but translating genotype to direct physician action is not straightforward.”
Informatics and Society
The international Genographic Project is proceeding at a spectacular rate, according to IBM’s Kris Lichter. In under 12 months, more than 135,000 DNA kits have been sold — exceeding the initial five-year target of just 100,000 and raising $2 million for the legacy fund and field research. The eventual goal is to create a virtual museum of human history and share the primary data with fellow researchers.
Howard Cash, whose company Gene Codes played a major role in the DNA identification effort after 9/11, contrasted that effort with the recovery program following the 2004 South Asian tsunami. Although the death toll was double that of Ground Zero, the casualties were more dispersed. “Somehow, it didn’t seem as bad,” said Cash. Because more entire families were lost, there were fewer reference samples.
Fiona Murray discussed her recent analysis of the intellectual property landscape of the human genome. Her analysis suggests that 4,000 of the estimated 23,000 human genes are claimed by U.S. patents, with applications peaking in the late 1990s. Of 291 known cancer genes, fully 45 percent are patent-pending, as are 35 percent of known disease-related genes.
At the Center for Cancer Systems Biology at the Dana-Farber Cancer Institute, Marc Vidal’s team is producing “a wiring diagram” for human cells: an interactome of proteins, DNA, RNA, and metabolites. Vidal says the interactions his team has mapped so far overlap significantly with those gleaned from the literature, but are “far from identical.” Once completed, he believes, the human interactome will reveal properties similar to those of the Internet and aviation networks.
Finally, Glenn Rice, CEO of Bridge Pharmaceuticals, spotlighted the advantages of outsourcing preclinical research to China, where costs are one-tenth of those in the United States. “In discovery, [China] is not quite ready, but you can’t overlook brute force,” said Rice, noting the huge research talent pool emerging from China.