Roche: Computers Map Genes 1,000x Faster

GENOTYPING TOOLS · Swiss company publishes technique in Science linking SNPs, haplotypes to phenotypes in mice

By Mark D. Uehling

December 15, 2004 | No, Roche won't mail you a shrink-wrapped box of super-duper gene-mapping software. Their lawyers wouldn't sign off on that. But in October, the company's scientists in Palo Alto, Calif., did the next best thing. They published their technique to map mouse genes a thousand times faster. In a single afternoon, Roche colleagues believe, any scientist who follows their recipe should be able to zero in on the location of a gene responsible for a key phenotypic trait.

"This computational method will greatly improve our ability to identify the genetic basis for a variety of phenotypic traits, ranging from the qualitative trait information to quantitative gene-expression data," Gary Peltz and colleagues from Washington University, Stanford University, and elsewhere wrote in Science.

Peltz's group had previously demonstrated in silico techniques to zero in on 60-megabase chromosomal regions housing genes of interest. Now, Peltz says, proprietary software called Hapmapper allows Roche to mine its own data and immediately identify much smaller candidate regions of just 30,000 bases — about the size of a typical gene.


Get It Done, Fast 


COOL TOOL: Hapmapper software, developed by a multidisciplinary team at Roche, offers a glimpse into how the company can combine genetic analysis and gene expression to avoid years of breeding mice.
SOURCE: ADAPTED FROM SCIENCE 306, 690-5; 2004
At Roche and elsewhere, Peltz says, narrowing a trait down to a candidate gene often took years of interbreeding mice, followed by tedious statistical analysis. A recent paper on osteoporosis, Peltz notes, took one group of Roche collaborators a decade. With Hapmapper, Peltz says, "once you measure a phenotypic difference, some measurable trait difference among the mice strains, you can carry out your mapping literally in an afternoon. You can order your mice from the supplier, make the measurements, and do the computational analysis."
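The article does not describe Hapmapper's internals, so the following is only a minimal sketch of the general idea of linking haplotypes to a measured trait across inbred strains. It assumes a one-way ANOVA F statistic as the score, which may differ from what Roche's software does. The strain names, block names, haplotype labels, and trait values are all hypothetical.

```python
from collections import defaultdict

def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups of trait values."""
    k = len(groups)                      # number of haplotype groups
    n = sum(len(g) for g in groups)      # number of strains with data
    if k < 2 or n <= k:
        return 0.0                       # nothing to compare
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def rank_blocks(phenotype, haplotypes):
    """Rank haplotype blocks by how well strain groupings track the trait.

    phenotype:  dict of strain -> measured trait value
    haplotypes: dict of block -> dict of strain -> haplotype label
    """
    scores = []
    for block, calls in haplotypes.items():
        groups = defaultdict(list)
        for strain, hap in calls.items():
            if strain in phenotype:
                groups[hap].append(phenotype[strain])
        scores.append((f_statistic(list(groups.values())), block))
    scores.sort(key=lambda pair: pair[0], reverse=True)
    return [block for _, block in scores]

# Hypothetical data: four strains, one trait, three haplotype blocks.
phenotype = {"strain_A": 1.0, "strain_B": 1.2, "strain_C": 5.0, "strain_D": 5.3}
haplotypes = {
    "block_1": {"strain_A": 0, "strain_B": 0, "strain_C": 1, "strain_D": 1},
    "block_2": {"strain_A": 0, "strain_B": 1, "strain_C": 0, "strain_D": 1},
    "block_3": {"strain_A": 0, "strain_B": 0, "strain_C": 0, "strain_D": 0},
}
ranked = rank_blocks(phenotype, haplotypes)
print(ranked)
```

In this toy data, the strains split on `block_1` exactly as they split on the trait, so that block ranks first. A real analysis would also need a significance threshold and some correction for testing many blocks at once, which this sketch leaves out.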

Roche is using the software to explore a variety of therapeutic areas. Citing just one example, Peltz says the company is partnering with academics to learn why inhaled anesthetics work, a 140-year-old scientific mystery. That sort of arrangement may explain why Roche publicized some of its computational prowess: to lure academic investigators into collaborative projects.

Peltz concedes that figuring out the relevance of a mouse gene to human disease remains a major challenge. Even so, the technique may begin to undercut some of the doubts about gene-expression data.

Peltz doesn't dispute that there have been issues with the reproducibility of microarray experiments. "Microarray technology has advanced significantly in the past couple of years," he says. "It got off to a rocky start in the literature, with publications that had results that were either not internally validated or the experiments were not well-designed. The quality of the array data now, currently, is really quite high. The challenge is to design the proper experiments and learn to utilize the data in the most productive way."

Peltz continues: "Use of the array data alone has only a certain amount of value. But when you couple it with genetic analysis, it becomes very, very powerful. Rather than look at all the problems in the literature, I'd look forward a bit and say with the new tools, such as this program and other things, we'll be able to make much more sense out of gene-expression data."


Happier Hunting 
Not long ago, Peltz recalls, it was common to find hundreds or thousands of genes differentially expressed. Now it's possible to focus the search much more tightly. In part, that is because of Roche's own publicly available SNP database — one of the more popular ones, with hundreds of daily visitors, and now containing 150,000 SNPs and coverage of 2,214 genes.

Within the company, Peltz says, the Hapmapper software has been in use for several years, and is in some respects inextricable from the data it relies upon. "We're starting to cover some of the critical gene families," he says. "For pharmacogenetic analysis, we've covered the 200 genes that are known to be involved in drug metabolism. We have a complete map of those genes. We're putting in all the proteases, all the ion channel [genes]. As the breadth, the number of strains, and the depth, the number of genes, grows, we've seen the performance of the tool markedly improve."

Peltz's SNP discovery research was partly funded by the NIH, which helped the company create a database of 16 commonly used mouse strains. But in the next few years, the expected tripling of the number of mouse strains, not to mention a doubling in the size of the public SNP database, will only make the software more potent. "If you can go into the human studies armed with information, and knowing where to look from the mouse," Peltz says, "you can work much more efficiently and productively."




Featured Paper
Liao, G. et al. "In silico genetics: Identification of a functional element regulating H2-Ea gene expression." Science 306, 690-5; 2004.
 




