A Genome-Wide Quantum Leap


By Kevin Davies

August 8, 2007 | I was recently invited to London as a guest of my former journal Nature Genetics at its 15th anniversary celebratory dinner held, rather impressively, at the Houses of Parliament. The evening was highlighted when Alan Keen, MP, led the guests on a private tour of the floor of the House of Commons chamber. Rather less memorable was the meal itself — where is Gordon Ramsay when you need him?

The occasion immediately preceded a major conference on the genetics of common diseases. Indeed, this is proving to be a golden year for genome-wide association (GWA) studies. After years of bumbling around like “Keystone Cops,” in the view of gene mappers David Altshuler and Mark Daly (Nat Genet. 39:813-815;2007), GWA studies are pouring out, identifying gene variants for the most common and complex diseases, including cancer, heart disease, obesity, autism, and depression.

Fifteen years ago, we talked about the “Gene of the Week” syndrome, as geneticists routinely used linkage analysis to map rare disease genes and pinpoint the rogue genes. Around the turn of the century, it was “another week, another genome,” in the words of the Economist. Now, thanks to both technological and cultural factors — as discussed in this issue’s cover story (see page 28) — the results are astonishing.

One need look no further than the largest GWA study so far, published in June by the Wellcome Trust Case Control Consortium (WTCCC; Nature 447:661-678;2007). In this tour de force featuring some 17,000 individual samples and showcasing Affymetrix 500K GeneChips, an alliance of more than 50 U.K. research groups identified genetic risk factors for seven common diseases at extreme levels of significance.

“If you think of the human genome as a very long road where you are trying to find something in the dark, previously we were able to turn the lights on only in a very few places,” said WTCCC chairman, Oxford’s Peter Donnelly. “By turning on half a million lights along the genome, as we have done, you get to see a very large proportion of the variation that is there.”

Light Reading
The seven diseases in the WTCCC analysis were: bipolar disorder (1 gene identified), coronary artery disease (CAD, 1), Crohn’s disease (9), hypertension (none), rheumatoid arthritis (3), type 1 diabetes (T1D, 7) and type 2 diabetes (T2D, 3). The WTCCC stressed the importance of appropriately large samples because of the “modest effect sizes observed at most loci identified.” Other developments contributing to the success of GWA mapping include the International HapMap project, which facilitates the design and analysis of GWA studies, and the improved availability of well-characterized clinical samples for many diseases. But much of the credit rests with the microarray manufacturers for the huge increase in SNP density, facilitating genome-wide studies on thousands of cases and controls. Even better, the latest chips from Affymetrix (GeneChip 6.0) and Illumina (1M array) contain 1.8 and 1 million probes, respectively. These arrays will help ferret out loci that are not in linkage disequilibrium with the SNPs studied so far, and will also improve representation of polymorphisms in non-European populations.

The statistical evaluation of GWA data remains a thorny issue. As Harvard statisticians David Hunter and Peter Kraft point out (NEJM July 19, 2007), the 500,000-SNP chips produce orders of magnitude more data than before, but with 500,000 comparisons, “the potential for false positive results is unprecedented.” Hence most researchers are applying very stringent genome-wide statistical thresholds, dividing the standard P value of 0.05 by 500,000 for a significance level of 10⁻⁷. (The WTCCC used a threshold of P < 5 × 10⁻⁷.)
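
For readers who want to check the arithmetic, here is a minimal Python sketch of that division, which is the standard Bonferroni correction. The function names are illustrative only and do not come from any of the studies cited. The sketch assumes one test per SNP, and it does not attempt to reproduce how the WTCCC arrived at its own threshold.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that holds the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def expected_false_positives(per_test_alpha: float, n_tests: int) -> float:
    """Expected number of chance hits if no SNP is truly associated."""
    return per_test_alpha * n_tests


alpha = 0.05      # conventional single-test significance level
n_snps = 500_000  # SNPs on the chip, one test per SNP (assumption)

threshold = bonferroni_threshold(alpha, n_snps)
print(f"{threshold:.1e}")  # 1.0e-07

# Without any correction, roughly 25,000 SNPs would pass P < 0.05 by chance.
print(expected_false_positives(alpha, n_snps))
```

The second figure is the reason for the stringency: at an uncorrected 0.05, chance alone would be expected to flag tens of thousands of SNPs on a 500,000-SNP chip.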

More important than the initial significance threshold, however, is the degree of replication. Here, there is more exciting news. Even as the ink was drying on the WTCCC study, replication reports were flying off the presses. For example, the CAD locus on chromosome 9 has been confirmed in three independent studies, making it the most highly replicated locus for CAD so far. And John Todd and colleagues in Cambridge, U.K., reported they had “confirmed unequivocally the association” of four of the genes for T1D. Ditto Crohn’s disease and T2D. These replication studies won’t be the last word, however, for the groups have probably skipped over some initial false negative associations. Finding a robust, replicated association is just the beginning. Many associations implicate regions with no known genes or genes of unknown function. Translating DNA into drugs won’t be easy or cheap, but at least we’re finally learning where to start.

Email Kevin Davies at kevin_davies@bio-itworld.com.
