Whole Genome Assembly with Nanopore Data

June 16, 2015

Nature Methods has published a paper by researchers at the University of Birmingham and the University of Toronto demonstrating a complete de novo assembly of an E. coli genome using only data from an Oxford Nanopore MinION sequencer. The MinION, a portable sequencer that plugs into a laptop computer, has previously been used to identify bacterial species by comparing their DNA to previously sequenced reference genomes, but had not before been shown to assemble a genome from scratch.

In some ways, the MinION should be well suited to de novo assembly: because it delivers sequence data in extremely long reads, covering tens of thousands of DNA bases at a stretch, those reads are easier to overlap into a sequence that spans an organism's entire genome (in the case of E. coli, over four megabases). However, the device's high error rate at the level of individual bases, and the non-random nature of its errors, are a major obstacle to reliable de novo assembly. A second sequencing method would normally be recommended to check for errors, especially if the species under study had no existing reference genome.
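To make the idea of overlapping reads concrete, here is a minimal Python sketch that finds the longest suffix of one read matching the prefix of another and merges the two. It is a toy illustration only: it assumes error-free reads and exact matches, which real nanopore data does not provide, and the function names are hypothetical rather than taken from the paper or its tools.

```python
def longest_overlap(a: str, b: str, min_len: int = 3) -> int:
    """Return the length of the longest suffix of `a` that equals a prefix of `b`.

    Only exact matches of at least `min_len` bases count. Real assemblers
    must tolerate sequencing errors; this toy version does not.
    """
    max_len = min(len(a), len(b))
    # Try the longest candidate overlap first and shrink until one matches.
    for k in range(max_len, min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def merge_reads(a: str, b: str, min_len: int = 3):
    """Join two reads on their overlap, or return None if they do not overlap."""
    k = longest_overlap(a, b, min_len)
    if k == 0:
        return None
    return a + b[k:]


if __name__ == "__main__":
    read_1 = "ACGTTGCA"
    read_2 = "TGCAGGTC"
    print(longest_overlap(read_1, read_2))  # 4 (the shared "TGCA")
    print(merge_reads(read_1, read_2))      # ACGTTGCAGGTC
```

Longer reads make this kind of matching less ambiguous, because a long overlap is far less likely to occur by chance or to fall entirely inside a repeated region of the genome.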

The authors of the Nature Methods paper wrote two computational tools that reduce the MinION's errors by referring back to the raw data from the sequencer. The tools, called nanocorrect and nanopolish, have been posted to GitHub. Using these correction methods, the authors claim to have assembled an E. coli genome from MinION data alone that has 99.5% identity with the E. coli reference genome. This result provides some hope that nanopore sequencing could one day be reliably used to study completely novel genomes.
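For readers unfamiliar with the identity figure, the short Python sketch below shows one common way such a percentage can be computed: count the matching columns in a pairwise alignment and divide by the alignment length. This article does not state how the authors defined identity, so the formula here is an assumption, and the function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns where both sequences carry the same base.

    Inputs are two rows of a pairwise alignment, equal in length, with '-'
    marking gaps. Gap columns count toward the length but never as matches.
    This is one common definition of identity, assumed for illustration.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Toy alignment: 10 columns, one gap and one mismatch.
    print(percent_identity("ACGT-ACGTA", "ACGTTACGAA"))  # 80.0
```

Under this definition, 99.5% identity corresponds to about five differing columns per thousand aligned bases.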

Oxford Nanopore has also recently announced updates to its technology that the company claims will improve the MinION's error rate in future releases.