Next-Generation Sequencing Invades Microarray Turf



Two new papers reveal a fresh dimension to commercial next-generation sequencing applications, one that could threaten more-established microarray technologies. Using the Genome Analyzer from Illumina/Solexa, two groups working independently have mapped the locations across the genome where a specific DNA-binding protein latches onto the DNA.

The new method is called ChIP-sequencing (ChIPSeq) – a combination of chromatin immunoprecipitation and next-generation, or parallel, sequencing. The feat was performed “with a speed and precision that goes beyond what has been achieved with previous technologies,” comments University of Washington geneticist Stanley Fields, in an accompanying essay in Science.

The precisely choreographed interplay of cellular gene activity is controlled by a vast cast of DNA-binding proteins, mostly transcription factors and enzymes. ChIP is a well-established lab technique for identifying the specific sites where proteins latch onto the DNA. Cells are treated with a chemical to fossilize the links between DNA and protein. The chromatin is then isolated, the DNA is broken up, and the attached proteins are immunoprecipitated. Finally, the DNA stuck to the protein is released and analyzed. Until now, the most high-throughput application of this technique used microarrays containing thousands of gene spots to identify binding sites for transcription factors and the like.

On and Off
Writing in Science, David Johnson and colleagues at Stanford University used ChIPSeq to identify the binding sites for a transcription factor, NRSF (neuron-restrictive silencer factor), which turns off neuronal genes in non-neuronal cells. The DNA motif that NRSF recognizes consists of a 21-base pair core fragment. Using the Solexa/Illumina platform, “because high-read numbers contribute to high sensitivity and comprehensiveness in large genomes,” Johnson et al. performed a ChIP experiment in a T-cell line. They sequenced the released DNA fragments (some 2 to 5 million per sample), of which about half were successfully mapped back to the reference genome sequence.

The Stanford group discovered a total of 1946 NRSF-binding locations in the human genome, including DNA motifs controlling more than 100 other transcription factors and 22 micro-RNAs. The most common binding target was identified more than 6700 times in the experiment. Most of the sites matched expectations, but the experiment also turned up some previously unrecognized binding motifs that did not fit the known rules for NRSF binding.
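
To make the idea of locating binding sites from mapped reads concrete, here is a minimal sketch in Python. It is not the method used by Johnson et al.; it is a toy illustration that assumes reads have already been mapped to a single chromosome and simply counts read start positions in fixed-size windows. The function name, window size, threshold, and read positions are all hypothetical.

```python
from collections import Counter


def call_enriched_windows(read_starts, window=200, min_reads=5):
    """Return (window_start, read_count) for windows with at least min_reads reads.

    Toy illustration only: real ChIP-seq analysis also uses a control
    sample and a statistical model, which this sketch omits.
    """
    # Assign each read to the fixed-size window that contains its start position.
    counts = Counter((pos // window) * window for pos in read_starts)
    # Keep only windows whose read count reaches the threshold.
    return sorted((start, n) for start, n in counts.items() if n >= min_reads)


# Hypothetical mapped read start positions on one chromosome.
reads = [1010, 1025, 1040, 1090, 1150, 1199, 5300, 9800, 9810]
print(call_enriched_windows(reads, window=200, min_reads=5))
```

In this toy data set, only the window starting at position 1000 collects enough reads to be reported; the isolated reads elsewhere are treated as background.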

The authors conclude that ChIPSeq is a cost-effective alternative to microarray methods, with a significant upside. “Other ultrahigh-throughput sequencing platforms, such as the one from 454 Life Sciences, could also be used to assay ChIP products, but whatever sequencing platform is used, our results indicate that read number capacity and input ChIP DNA size are key parameters,” Johnson et al. write.

Meanwhile, Gordon Robertson, Steven Jones, and colleagues at the British Columbia Cancer Agency Genome Sciences Centre in Vancouver performed a similar analysis, again using the Illumina Genome Analyzer because of its high throughput. They looked at the binding of a transcription factor called STAT1. The Vancouver group generated a total of more than 28 million fragments (in two types of cells), identifying more than 42,000 putative STAT1-binding regions.

The group suggests that ChIPSeq might be an order of magnitude cheaper than microarray alternatives, with the eight flow cell lanes in the Genome Analyzer offering excellent design flexibility. Fewer materials are required, and the method can be applied to any organism – it is not restricted to available gene arrays. 

Changing ChIPs
According to Fields, the advantages of ChIPSeq over ChIP-chip include the ability to interrogate the entire genome rather than just the genes represented on a microarray. (For example, Johnson et al. point out that a similar experiment using Affymetrix-style microarrays would require roughly 1 billion features per array.) There is also the benefit of sidestepping known hybridization complications with microarray platforms. “Perhaps most usefully,” writes Fields, “ChIPSeq can immediately be applied to any of those [available] genomes, rather than only those for which microarrays are available.”

Fields anticipates that similar experiments will quickly identify the binding locales of numerous other transcription factors, structural chromatin components, histone proteins, and various enzymes. The addition of ChIPSeq to the next-generation sequencing repertoire, as well as the ability to quantify captured gene sequences in a single sample, illustrates the growing breadth of next-generation sequencing applications.

Fields concludes his essay with a provocative thought: “The technology that is most threatened by the widespread adoption of ultrahigh-throughput sequencing? The DNA microarray.”

Further Reading:
D. S. Johnson et al. “Genome-wide mapping of in vivo protein-DNA interactions.” Science 316, 1497-1502 (2007).

G. Robertson et al. “Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing.” Nature Methods (published online 071107).

 
