A Genome-Wide Quantum Leap



By Kevin Davies

August 8, 2007 | I was recently invited to London as a guest of my former journal Nature Genetics at its 15th anniversary celebratory dinner held, rather impressively, at the Houses of Parliament. The evening was highlighted when Alan Keen, MP, led the guests on a private tour of the floor of the House of Commons chamber. Rather less memorable was the meal itself — where is Gordon Ramsay when you need him?

The occasion immediately preceded a major conference on the genetics of common diseases. Indeed, this is proving to be a golden year for genomewide association (GWA) studies. After years of bumbling around like “Keystone Cops,” in the view of gene mappers David Altshuler and Mark Daly (Nat Gen. 39:813-815;2007), GWA studies are pouring out, identifying gene variants for the most common and complex diseases, including cancer, heart disease, obesity, autism, and depression.

Fifteen years ago, we talked about the “Gene of the Week” syndrome, as geneticists routinely used linkage analysis to map rare disease genes and pinpoint the rogue genes. Around the turn of the century, it was “another week, another genome,” in the words of the Economist. Now, thanks to both technological and cultural factors — as discussed in this issue’s cover story (see page 28) — the results are astonishing.

One need look no further than the largest GWA study so far, published in June by the Wellcome Trust Case Control Consortium (WTCCC; Nature 447:661-678;2007). In this tour de force featuring some 17,000 individual samples and showcasing Affymetrix 500K GeneChips, an alliance of more than 50 U.K. research groups identified genetic risk factors for seven common diseases at extreme levels of significance.

“If you think of the human genome as a very long road where you are trying to find something in the dark, previously we were able to turn the lights on only in a very few places,” said WTCCC chairman, Oxford’s Peter Donnelly. “By turning on half a million lights along the genome, as we have done, you get to see a very large proportion of the variation that is there.”

Light Reading
The seven diseases in the WTCCC analysis were: bipolar disorder (1 gene identified), coronary artery disease (CAD, 1), Crohn’s disease (9), hypertension (0), rheumatoid arthritis (3), type 1 diabetes (T1D, 7) and type 2 diabetes (T2D, 3). The WTCCC stressed the importance of appropriately large samples because of the “modest effect sizes observed at most loci identified.” Other developments contributing to the success of GWA mapping include the International HapMap project, which facilitates the design and analysis of GWA studies, and the improved availability of well-characterized clinical samples for many diseases. But much of the credit rests with the microarray manufacturers for the huge increase in SNP density, facilitating genome-wide studies on thousands of cases and controls. Even better, the latest chips from Affymetrix (GeneChip 6.0) and Illumina (1M array) contain 1.8 and 1 million probes, respectively. These arrays will help ferret out loci that are not in linkage disequilibrium with the SNPs studied so far, and will also improve representation of polymorphisms in non-European populations.

The statistical evaluation of GWA data remains a thorny issue. As Harvard statisticians David Hunter and Peter Kraft point out (NEJM July 19, 2007), the 500,000-SNP chips produce orders of magnitude more data than before, but with 500,000 comparisons, “the potential for false positive results is unprecedented.” Hence most researchers are applying very stringent genomewide statistical thresholds: dividing the standard P value of 0.05 by 500,000 gives a significance level of 10⁻⁷. (The WTCCC used a threshold of P < 5 × 10⁻⁷.)
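
For readers who want to check the arithmetic, here is a minimal Python sketch of the correction described above (commonly called a Bonferroni correction). The function names are illustrative only and do not come from any of the cited studies. The false-positive estimate assumes, for simplicity, that every SNP is truly unassociated with disease.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level: the overall alpha divided by the number of tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


def expected_false_positives(per_test_threshold: float, n_tests: int) -> float:
    """Expected number of chance hits, ASSUMING every tested SNP is truly null."""
    return per_test_threshold * n_tests


n_snps = 500_000   # SNPs on the chip, as quoted in the column
alpha = 0.05       # the conventional single-test P value

corrected = bonferroni_threshold(alpha, n_snps)
uncorrected_hits = expected_false_positives(alpha, n_snps)
corrected_hits = expected_false_positives(corrected, n_snps)
wtccc_hits = expected_false_positives(5e-7, n_snps)  # WTCCC's quoted threshold

print(f"Corrected threshold: {corrected:.1e}")
print(f"Expected chance hits at P < 0.05: {uncorrected_hits:,.0f}")
print(f"Expected chance hits at the corrected threshold: {corrected_hits:.2f}")
print(f"Expected chance hits at P < 5e-7: {wtccc_hits:.2f}")
```

Under that all-null assumption, an uncorrected threshold of 0.05 would be expected to flag tens of thousands of SNPs by chance alone, which is why the stringent cutoff matters.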

More important than the initial significance threshold, however, is the degree of replication. Here, there is more exciting news. Even as the ink was drying on the WTCCC study, replication reports were flying off the presses. For example, the CAD locus on chromosome 9 has been confirmed in three independent studies, making it the most highly replicated locus for CAD so far. And John Todd and colleagues in Cambridge, U.K., reported they had “confirmed unequivocally the association” of four of the genes for T1D. Ditto Crohn’s disease and T2D. These replication studies won’t be the last word, however, for the groups have probably skipped over some initial false negative associations. Finding a robust, replicated association is just the beginning. Many associations implicate regions with no known genes or genes of unknown function. Translating DNA into drugs won’t be easy or cheap, but at least we’re finally learning where to start.

Email Kevin Davies at kevin_davies@bio-itworld.com.
