Next-Generation Sequencing Invades Microarray Turf


By Kevin Davies

Two papers unveil a new dimension to commercial next-generation sequencing applications – one that could threaten more-established microarray technologies. Using the Genome Analyzer from Illumina/Solexa, two groups working independently have mapped the locations across the genome where a specific DNA-binding protein latches onto the DNA.

The new method is called ChIP-sequencing (ChIPSeq) – a combination of chromatin immunoprecipitation and next-generation, or parallel, sequencing. The feat was performed “with a speed and precision that goes beyond what has been achieved with previous technologies,” comments University of Washington geneticist Stanley Fields, in an accompanying essay in Science.

The precisely choreographed interplay of cellular gene activity is controlled by a vast cast of DNA-binding proteins – mostly transcription factors and enzymes. ChIP is a well-established lab technique for identifying the specific sites where proteins latch onto the DNA. Cells are treated with a chemical to fossilize the links between DNA and protein; the chromatin is then isolated, the DNA broken up, and the attached proteins immunoprecipitated. Finally, the DNA stuck to the protein can be released and analyzed. Until now, the highest-throughput application of this technique used microarrays containing thousands of gene spots to identify binding sites for transcription factors and the like.

On and Off
Writing in Science, David Johnson and colleagues at Stanford University use ChIPSeq to identify the binding sites for a transcription factor – NRSF (neuron-restrictive silencer factor), which turns off neuronal genes in non-neuronal cells. The DNA motif that NRSF recognizes consists of a 21-base pair core fragment. Using the Solexa/Illumina platform, “because high-read numbers contribute to high sensitivity and comprehensiveness in large genomes,” Johnson et al. performed a ChIP experiment in a T-cell line. They sequenced the released DNA fragments – some 2 to 5 million per sample – of which about half were successfully mapped back to the reference genome sequence.

The Stanford group discovered a total of 1946 NRSF-binding locations in the human genome, including DNA motifs controlling more than 100 other transcription factors and 22 micro-RNAs. The most common binding target was identified more than 6700 times in the experiment. Most of the sites were identified as expected, but so too were some previously unrecognized binding motifs that did not fit the previously known rules for NRSF binding.
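The general idea of turning mapped reads into candidate binding locations can be illustrated with a toy sketch. This is not the method used by Johnson et al.; it is a deliberately simplified, hypothetical example that assumes each mapped read is reduced to a (chromosome, position) pair, counts reads in fixed-size windows, and keeps windows that reach an arbitrary read-count threshold. The function name, window size, threshold, and data are all invented for illustration; real analyses also account for background signal and control samples.

```python
from collections import Counter


def call_enriched_windows(read_positions, window_size=200, min_reads=5):
    """Toy enrichment call (hypothetical helper, not the published method).

    read_positions: iterable of (chromosome, position) pairs for mapped reads.
    Returns a list of (chromosome, window_start, window_end, read_count)
    for windows holding at least min_reads reads, strongest first.
    Window coordinates are half-open: [window_start, window_end).
    """
    # Assign each read to a fixed-size window and count reads per window.
    counts = Counter(
        (chrom, pos // window_size) for chrom, pos in read_positions
    )
    enriched = [
        (chrom, idx * window_size, (idx + 1) * window_size, n)
        for (chrom, idx), n in counts.items()
        if n >= min_reads
    ]
    # Sort by read count (descending), then by chromosome and start.
    return sorted(enriched, key=lambda w: (-w[3], w[0], w[1]))


# Invented example data: four reads pile up in one chr1 window.
reads = [
    ("chr1", 1010), ("chr1", 1050), ("chr1", 1120), ("chr1", 1199),
    ("chr1", 5400),
    ("chr2", 300), ("chr2", 310),
]

for chrom, start, end, n in call_enriched_windows(reads, window_size=200, min_reads=3):
    print(f"{chrom}:{start}-{end}\t{n} reads")
```

In a sketch like this, more sequenced reads mean more counts per window, which is one way to see why read number matters for sensitivity.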

The authors conclude that ChIPSeq is a cost-effective alternative to microarray methods, with a significant upside. “Other ultrahigh-throughput sequencing platforms, such as the one from 454 Life Sciences, could also be used to assay ChIP products, but whatever sequencing platform is used, our results indicate that read number capacity and input ChIP DNA size are key parameters,” Johnson et al. write.

Meanwhile, Gordon Robertson, Steven Jones, and colleagues at the British Columbia Cancer Agency Genome Sciences Centre in Vancouver performed a similar analysis, again using the Illumina Genome Analyzer because of its high throughput. Here, they looked at binding of a transcription factor called STAT1. The Vancouver group generated a total of more than 28 million fragments (in two types of cells), identifying more than 42,000 putative STAT1-binding regions.

The group suggests that ChIPSeq might be an order of magnitude cheaper than microarray alternatives, with the eight flow cell lanes in the Genome Analyzer offering excellent design flexibility. Fewer materials are required, and the method can be applied to any organism – it is not restricted to available gene arrays. 

Changing ChIPs
According to Fields, the advantages of ChIPSeq over ChIP-chip include the ability to interrogate the entire genome rather than just the genes represented on a microarray. (For example, Johnson et al. point out that a similar experiment using Affymetrix-style microarrays would require roughly 1 billion features per array.) There is also the benefit of sidestepping known hybridization complications with microarray platforms. “Perhaps most usefully,” writes Fields, “ChIPSeq can immediately be applied to any of those [available] genomes, rather than only those for which microarrays are available.”

Fields anticipates that similar experiments will quickly identify the binding locales of numerous other transcription factors, structural chromatin components, histone proteins, and various enzymes. The addition of ChIPSeq to the next-generation sequencing repertoire, as well as the ability to quantify captured gene sequences in a single sample, illustrates the growing breadth of next-generation sequencing applications.

Fields concludes his essay with a provocative thought: “The technology that is most threatened by the widespread adoption of ultrahigh-throughput sequencing? The DNA microarray.”

Further Reading:
D. S. Johnson et al. “Genome-wide mapping of in vivo protein-DNA interactions.” Science 316, 1497-1502 (2007).

G. Robertson et al. “Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing.” Nature Methods (published online 071107).

 


For reprints and/or copyright permission, please contact RMS, 1808 Colonial Village Lane, Lancaster, PA;

(717) 399-1900 ext 100 or via email to bio-itworld@theygsgroup.com.