August 8, 2007 | I was recently invited to London as a guest of my former journal Nature Genetics at its 15th anniversary celebratory dinner held, rather impressively, at the Houses of Parliament. The evening was highlighted when Alan Keen, MP, led the guests on a private tour of the floor of the House of Commons chamber. Rather less memorable was the meal itself — where is Gordon Ramsay when you need him?
The occasion immediately preceded a major conference on the genetics of common diseases. Indeed, this is proving to be a golden year for genomewide association (GWA) studies. After years of bumbling around like “Keystone Cops,” in the view of gene mappers David Altshuler and Mark Daly (Nat Genet. 39:813-815;2007), GWA studies are pouring out, identifying gene variants for the most common and complex diseases, including cancer, heart disease, obesity, autism, and depression.
Fifteen years ago, we talked about the “Gene of the Week” syndrome, as geneticists routinely used linkage analysis to map rare disease genes and pinpoint the rogue genes. Around the turn of the century, it was “another week, another genome,” in the words of the Economist. Now, thanks to both technological and cultural factors — as discussed in this issue’s cover story (see page 28) — the results are astonishing.
One need look no further than the largest GWA study so far, published in June by the Wellcome Trust Case Control Consortium (WTCCC; Nature 447:661-678;2007). In this tour de force featuring some 17,000 individual samples and showcasing Affymetrix 500K GeneChips, an alliance of more than 50 U.K. research groups identified genetic risk factors for seven common diseases at extreme levels of significance.
“If you think of the human genome as a very long road where you are trying to find something in the dark, previously we were able to turn the lights on only in a very few places,” said WTCCC chairman, Oxford’s Peter Donnelly. “By turning on half a million lights along the genome, as we have done, you get to see a very large proportion of the variation that is there.”
The seven diseases in the WTCCC analysis were: bipolar disorder (1 gene identified), coronary artery disease (CAD, 1), Crohn’s disease (9), hypertension (none), rheumatoid arthritis (3), type 1 diabetes (T1D, 7) and type 2 diabetes (T2D, 3). The WTCCC stressed the importance of appropriately large samples because of the “modest effect sizes observed at most loci identified.” Other developments contributing to the success of GWA mapping include the International HapMap project, which facilitates the design and analysis of GWA studies, and the improved availability of well-characterized clinical samples for many diseases. But much of the credit rests with the microarray manufacturers for the huge increase in SNP density, facilitating genome-wide studies on thousands of cases and controls. Even better, the latest chips from Affymetrix (GeneChip 6.0) and Illumina (1M array) contain 1.8 and 1 million probes, respectively. These arrays will help ferret out loci that are not in linkage disequilibrium with the SNPs studied so far, and will also improve representation of polymorphisms in non-European populations.
The statistical evaluation of GWA data remains a thorny issue. As Harvard statisticians David Hunter and Peter Kraft point out (NEJM July 19, 2007), the 500,000-SNP chips produce orders of magnitude more data than before, but with 500,000 comparisons, “the potential for false positive results is unprecedented.” Hence most researchers are applying very stringent genomewide statistical thresholds, dividing the standard P value of 0.05 by 500,000 for a significance level of 10⁻⁷. (The WTCCC used a threshold of P < 5 × 10⁻⁷.)
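For readers who want to see the arithmetic, the threshold described above is a Bonferroni-style correction: the conventional cutoff divided by the number of tests. The short Python sketch below reproduces that division and shows roughly how many false positives an uncorrected 0.05 cutoff would be expected to produce across 500,000 SNPs. The function names are illustrative, not drawn from any GWA software package, and the calculation treats each SNP as an independent test, a simplifying assumption given that neighboring SNPs are often correlated.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold: the overall alpha divided by the number of tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def expected_false_positives(per_test_threshold: float, n_tests: int) -> float:
    """Expected count of chance 'hits' if no SNP is truly associated.

    Assumes independent tests (an illustrative simplification).
    """
    return per_test_threshold * n_tests


if __name__ == "__main__":
    n_snps = 500_000   # SNPs on the chip, as cited in the text
    alpha = 0.05       # the standard P value cutoff

    corrected = bonferroni_threshold(alpha, n_snps)
    print(f"Corrected per-SNP threshold: {corrected:.1e}")
    print(f"Expected chance hits at P < 0.05: {expected_false_positives(alpha, n_snps):,.0f}")
    print(f"Expected chance hits at corrected threshold: {expected_false_positives(corrected, n_snps):.2f}")
```

Under these assumptions, an uncorrected cutoff would let through on the order of 25,000 spurious associations, which is why the stringent threshold matters.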
More important than the initial significance threshold, however, is the degree of replication. Here, there is more exciting news. Even as the ink was drying on the WTCCC study, replication reports were flying off the presses. For example, the CAD locus on chromosome 9 has been confirmed in three independent studies, making it the most highly replicated locus for CAD so far. And John Todd and colleagues in Cambridge, U.K., reported they had “confirmed unequivocally the association” of four of the genes for T1D. Ditto Crohn’s disease and T2D. These replication studies won’t be the last word, however, for the groups have probably skipped over some initial false negative associations. Finding a robust, replicated association is just the beginning. Many associations implicate regions with no known genes or genes of unknown function. Translating DNA into drugs won’t be easy or cheap, but at least we’re finally learning where to start.
Email Kevin Davies at firstname.lastname@example.org.
Subscribe to Bio-IT World magazine.