Accelrys Pipeline Pilot Guides ONT’s Nascent NGS Data Handling

2011 Best Practices Awards: Research and Discovery
Winner: Oxford Nanopore Technologies
Nominator: Accelrys
Project: Data Pipelines for Next Generation Sequencing Applications

By Kevin Davies

August 2, 2011 | Although still in stealth mode, Oxford Nanopore Technologies (ONT) recently revealed details of the GridION hardware that will form the basis of its next-generation sequencing technology as well as protein analysis and other applications. And as its Best Practices Award shows, it has been laying the groundwork for an effective and flexible informatics solution as well.

“In the face of staggering estimates for the all-inclusive cost and complexity of NGS analysis, simply providing a new instrument is only half of the story,” says ONT senior scientist Richard Carter. The British company believes in offering simple ways for scientists to analyze NGS data while retaining the flexibility to adapt to “a rapidly shifting landscape of analysis methods and algorithms.”

After assessing several commercial and public options, ONT elected to partner with Accelrys, agreeing to offer a version of the Pipeline Pilot NGS Collection as its recommended platform for NGS data analysis. Already deployed in some 1,300 institutions, the Pipeline Pilot workflow software appears to be a good choice. After all, “Pipeline Pilot is the computational underpinning for all Accelrys products,” says Clifford Baron, product marketing director.

A bioinformatician himself, Carter collaborated with Accelrys to develop the NGS Collection and created a series of workflows that reflect analyses reported in a broad range of publications. “It’s relatively simple even for a novice user of Pipeline Pilot to create useful and powerful applications using the NGS Collection,” says Carter. “In little time and without requiring scientists to learn sophisticated analysis software, Pipeline Pilot helps scientists ask relevant, scientific questions about [NGS] data.”

With the growing number of NGS software algorithms available, selecting and configuring the best tool is a tricky, even risky business. “There is no universal ‘best answer’ when it comes to NGS analysis algorithms,” says Carter. “Analysis of NGS data is far from a settled science.” What bioinformatics teams need, he says, are systems to compare analysis algorithms and organize data processing workflows for their various user groups quickly and efficiently, minimizing repetition.

No Best Answer

Launched in early 2011, the NGS Collection for Pipeline Pilot consists of some 150 components for analyzing NGS data, including quality assessment and processing, assembly and mapping, variant detection and profiling, and transcript and ChIP-Seq analysis. From ONT’s standpoint, Pipeline Pilot’s use of graphical application development and application integration provides the data management and algorithmic building blocks needed to develop customized NGS analyses in a relatively accessible environment.

“It’s all about empowering your bench scientists,” says Carter. For example, one user ran an analysis of the publicly available Escherichia coli data from the German food poisoning outbreak. With a handful of mouse clicks and a couple of hours, the user had performed a de novo assembly and compared the sequence with other strains in GenBank.

Carter has created several NGS workflows using out-of-the-box Pipeline Pilot components. One calculates GC content in a genome and compares it to depth of coverage, helping scientists to spot outliers. Carter also integrated the popular Circos plot (now a standard component in the NGS Collection) for visualizing genomic variation such as SNP prevalence or gene density.
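To make the GC-versus-coverage idea concrete, here is a minimal Python sketch of that kind of analysis. It is not the Pipeline Pilot component or Carter's workflow; the function names, the fixed-size windows, and the z-score cutoff are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def gc_fraction(seq):
    """Fraction of called bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / called


def window_stats(genome, depth, window):
    """Return (start, GC fraction, mean depth) for each full window.

    `genome` is a sequence string and `depth` is a per-base list of
    read depths of the same length (both hypothetical inputs).
    """
    rows = []
    for start in range(0, len(genome) - window + 1, window):
        end = start + window
        rows.append((start, gc_fraction(genome[start:end]), mean(depth[start:end])))
    return rows


def coverage_outliers(rows, z_cutoff=2.0):
    """Flag windows whose mean depth lies more than z_cutoff sample
    standard deviations from the mean depth across all windows."""
    depths = [row[2] for row in rows]
    if len(depths) < 2:
        return []
    mu, sigma = mean(depths), stdev(depths)
    if sigma == 0:
        return []
    return [row for row in rows if abs(row[2] - mu) / sigma > z_cutoff]
```

Each flagged row carries its GC fraction, so a scientist could see at a glance whether unusual coverage coincides with unusually GC-rich or GC-poor sequence. A fuller analysis might instead fit depth against GC and flag large residuals.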

The software appears well suited to ONT’s technology, once it is launched. ONT’s GridION is designed to acquire and analyze data in real time so that experiments can be monitored and adjusted as they are being performed. The range of analyses in the NGS Collection facilitates the “Run until…” function, in which users choose to sequence until a predetermined experimental outcome has been achieved.
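The “Run until…” idea can be illustrated with a short Python sketch. This is purely hypothetical and says nothing about how the GridION or Pipeline Pilot implements it: the function name, the batches of reads, and the choice of mean coverage as the stopping criterion are all assumptions.

```python
def run_until(read_batches, genome_length, target_coverage):
    """Consume batches of reads until mean coverage reaches the target.

    `read_batches` is any iterable of lists of read strings (a stand-in
    for data arriving in real time). Returns (batches_used, coverage),
    or (None, coverage) if the data ran out before the target was met.
    """
    total_bases = 0
    for batches_used, batch in enumerate(read_batches, start=1):
        total_bases += sum(len(read) for read in batch)
        coverage = total_bases / genome_length
        if coverage >= target_coverage:
            # Stop early: the predetermined outcome has been reached.
            return batches_used, coverage
    return None, total_bases / genome_length
```

Because the loop returns as soon as the criterion is met, later batches are never requested, which mirrors the notion of ending a run once the experimental goal has been achieved.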

“Our customer surveys indicate that Pipeline Pilot saves 30-70% development time,” says Baron. Trevor Heritage, Accelrys’ senior VP, adds that he is “really excited” about the new NGS Collection. “We’re not prescribing an out-of-the-box packaged solution... We’re offering a workflow-oriented platform with the scientific brains to read the data in an intelligent way and do the analysis on top.”

ONT believes that Pipeline Pilot can help address the rising challenges of data analysis and the level of expertise required to perform it. “That’s what makes it such a powerful and important tool,” says Carter.

This article also appeared in the 2011 July-August issue of Bio-IT World.

