Vast quantities of new data types need to be combined and analyzed now

By Malorye Branca

Oct. 9, 2002 | With so much sophisticated technology on hand, it's surprising to realize that pharmacogenomics is still in its infancy, with the promise of many new developments ahead, particularly in informatics.

Several companies with genotyping platforms are methodically sifting through millions of SNPs (single nucleotide polymorphisms) to confirm that they are sufficiently common to be useful markers. "There are a lot of junk SNPs out there, where you don't have any level of validation," warns Mark Pohl, director of informatics at Orchid BioSciences Inc.

Both Orchid and Sequenom have developed Web-accessible databases in which researchers can browse the genome for useful SNPs. Sequenom's Web portal was launched earlier this year; Orchid just released a beta version of Chromosome Browser. Its full launch is planned for this quarter.

Chromosome Browser covers more than 2 million SNPs from the main public sources (dbSNP, the SNP Consortium, and the Japanese SNP Database) as well as Orchid's own SNPs. Through the Web site, researchers will be able to locate SNPs and obtain primers. More than 80,000 of these SNPs are validated. "Our validation includes 144 samples from three ethnic groups," Pohl says.
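To make "sufficiently common" concrete, here is a minimal sketch of one widely used measure, minor allele frequency, computed from a panel of sample genotypes. The function names, the two-letter genotype encoding, and the 5 percent cutoff are illustrative assumptions; the article does not describe the actual validation criteria Orchid or Sequenom apply.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Return the frequency of the less common allele at a biallelic SNP.

    genotypes: iterable of two-character strings such as "AG",
    one per sample (hypothetical encoding chosen for this sketch).
    """
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no genotypes supplied")
    if len(counts) > 2:
        raise ValueError("expected at most two distinct alleles")
    if len(counts) < 2:
        return 0.0  # every sample carries the same allele
    return min(counts.values()) / total


def is_common(genotypes, threshold=0.05):
    """Flag a SNP as common if its minor allele frequency meets the cutoff.

    The 0.05 default is an assumed, illustrative threshold.
    """
    return minor_allele_frequency(genotypes) >= threshold


if __name__ == "__main__":
    panel = ["AA", "AG", "GG", "AA"]
    print(minor_allele_frequency(panel), is_common(panel))
```

A SNP seen in only one chromosome out of a few hundred would fall below such a cutoff, which is one way a marker can turn out to be too rare to be useful.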

This helps SNP enthusiasts who had been using multiple databases to find SNPs, and trial and error to design their assays. "If someone is currently doing assays with microsatellites and they want to switch to SNPs, before there was no tool that allowed them to do that easily," Pohl says.

Improved tools place new demands on informatics. "It is easy to get overwhelmed by the data," Pohl says. "You need very high-throughput servers and software that can deal with this information in huge batches." Yet some important steps are still performed manually, such as grading the scatter plots that represent the readout from the company's genotyping instrument. "Being able to do that automatically is the next step," Pohl says.

Things will get even more interesting as new types of data are introduced to the pharmacogenomic mix. These data are coming from new databases and technologies, such as proteomics and metabolomics, but are also emerging from previously untapped sources.

Israel-based Compugen Ltd., for example, is adding patient medical record information to its discovery data mix. Several health-care providers are giving the informatics company access to de-identified data. Compugen can only connect with the patients through the providers.

"If we want to predict patients' responses to a drug we need to take into account all sources of variation," says Michal Preminger, Compugen's vice president of new research directions. "We can get valuable phenotypic information from patient medical histories, and it is only through the combination of all that with genotype that we can have an accurate prediction." The company collects patient DNA samples "only once we have a homogenous and clean population, so we can look at a response that is specific to the drug, not based on other environmental factors," Preminger says.

But clinical data in general present many challenges, says Michael Liebman, chief scientific officer of ProSanos Corp., a La Jolla, Calif., startup that generates complex models of disease processes based on clinical and other data. For example, patient records are not synchronized because people seek treatment at different points in the course of their disease.

Pathological definitions of many diseases are imprecise, and health and disease are themselves complex, dynamic concepts. Hormonal fluctuations related to menopause, for example, can influence a woman's risk of breast cancer, but it's difficult to include this information in genetic analyses. "Biology and disease involve a continuous series of events and feedback," Liebman says. "We need better software and knowledge to model these."

Fortunately, companies are delivering these new tools. One of the keys will be combining analysis of multiple data types. Bozeman, Mont.-based Golden Helix Inc. says its Helix Tree software can relate thousands of interacting genes and environmental factors to clinical outcomes. Cambridge, Mass.-based Xpogen's new PathlinX application can reveal connections between genes, phenotypes, and "any other quantifiable measures," the company says.

Liebman has tested the tool and is impressed. "[PathlinX] uses relevance networks, which provide a very useful way to look at this data," he says. "It's a step forward in the evolution of our approaches to managing this type of information."
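For readers unfamiliar with the term, a relevance network, as generally described in the literature, links any two measured variables whose pairwise association exceeds a chosen threshold. The sketch below shows that general idea using Pearson correlation; it is not a description of how PathlinX works, and the variable names, sample values, and 0.8 cutoff are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / (spread_x * spread_y)


def relevance_network(features, threshold=0.8):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold.

    features: dict mapping a variable name to its per-patient values.
    The 0.8 default is an assumed cutoff for illustration only.
    """
    edges = []
    for (name_a, vals_a), (name_b, vals_b) in combinations(features.items(), 2):
        r = pearson(vals_a, vals_b)
        if abs(r) >= threshold:
            edges.append((name_a, name_b, r))
    return edges


if __name__ == "__main__":
    # Made-up measurements for five patients, purely for demonstration.
    data = {
        "gene_a": [1, 2, 3, 4, 5],
        "gene_b": [2, 4, 6, 8, 10],
        "dose_response": [5, 4, 3, 2, 1],
        "noise": [3, 1, 4, 1, 5],
    }
    for a, b, r in relevance_network(data):
        print(f"{a} -- {b}: r = {r:+.2f}")
```

Because the inputs are just named columns of numbers, the same approach can mix gene expression, phenotype scores, and other quantifiable measures in a single network.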
