Clive Brown on Field Biopsies, New Chemistry, Short Reads
By Allison Proffitt
December 3, 2021 | At Oxford Nanopore’s virtual Community Meeting this week, before CTO Clive Brown talked about new chemistries, before he previewed the Apple iPad Pro-powered Mk1D with an integrated MinION, before he teased the diagnostic capacities of “outy” sequencing, before he unveiled a browser base caller or even gave shipping updates on the forthcoming PromethION 2, he talked short reads.
“We believe that long reads have more utility. We believe that for the vast majority of applications you’d want long reads, not short reads,” he began. “But that doesn’t mean we can’t do short reads. In fact, we’ve always been able to do short reads… Physically, the pore itself can probably process down to about 20 or 30 bases. Actually we see very high capture rates for short fragments. Some of the best throughput numbers we get come from short fragments rather than long fragments.”
The data, he said, is generally the same, though the increased number of files requires some software adaptations. “We’ve never really done that, but we’ve looked at it recently,” Brown conceded. Oxford Nanopore’s forthcoming Short Fragment Mode consists of software updates that improve scaling for short reads. “It’s always been there; we’ve tidied it up,” he said, adding that the updates will be available next year and will support as many as 250M native human reads, at ~200 bp, on a PromethION flow cell. And again: “It’s really always been there. It’s just a software update.”
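As a rough check on what those figures imply, the arithmetic can be sketched as below. The read count and read length are the numbers quoted in the talk; the 3.1 Gb human genome size is an assumption added here for illustration, and the calculation ignores filtering, failed reads, and uneven coverage.

```python
# Back-of-envelope yield for the Short Fragment Mode figures quoted in the talk.
READS = 250_000_000        # "as many as 250M" reads per PromethION flow cell
READ_LENGTH_BP = 200       # approximate read length (~200 bp)
HUMAN_GENOME_BP = 3.1e9    # assumption: approximate haploid human genome size


def total_yield_bp(reads: int, read_length_bp: int) -> int:
    """Total bases sequenced, ignoring filtering and failed reads."""
    return reads * read_length_bp


def mean_coverage(yield_bp: float, genome_bp: float) -> float:
    """Average depth if the reads were spread evenly over the genome."""
    return yield_bp / genome_bp


yield_bp = total_yield_bp(READS, READ_LENGTH_BP)
print(f"Yield: {yield_bp / 1e9:.0f} Gb")
print(f"Mean coverage: {mean_coverage(yield_bp, HUMAN_GENOME_BP):.1f}x")
```

Under these assumptions the quoted figures work out to about 50 Gb per flow cell, or roughly 16x average coverage of a human genome.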
With that “dealt with”, Brown moved on to other company news.
Machine Learning and Base Calling
Oxford Nanopore has released “Kit 12”, a new package that includes the latest “Q20+” chemistry and enables “Duplex” sequencing. This is coupled with the release of R10.4 flow cells. The company reports that with these products users can now achieve >Q20 raw read (“simplex”) accuracy or around Q30 Duplex accuracy, along with improved high-accuracy consensus sequencing and variant calling.
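For readers less familiar with the “Q” notation, these are Phred-scaled quality scores, where Q = -10 · log10(p) and p is the probability that a base call is wrong. The short sketch below converts between the two; the function names are illustrative and do not come from any Oxford Nanopore tool.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score to a per-base error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert a per-base error probability to a Phred quality score."""
    return -10 * math.log10(p)


for q in (20, 30):
    p = phred_to_error_prob(q)
    print(f"Q{q}: about 1 error in {round(1 / p)} bases")
```

On this scale Q20 corresponds to about one error per 100 bases (99% accuracy) and Q30 to about one error per 1,000 bases (99.9% accuracy).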
“You win, generally, with 10/12 [the R10.4 flow cell and Kit 12] over 9/10 [the earlier flow cell and kit],” Brown said.
There has also been a “huge flurry” of activity from the Oxford Nanopore machine learning team, Brown said, exploiting the latest in neural networks to improve the accuracy of base calling. He recommended running the latest base caller on older data. “I urge you to re-call old data, because you will get a significant uplift from the latest caller.”
The neural network research has been particularly fruitful for calling modifications. “The underlying problem here,” Brown said, “is a ground truth problem. Because we use a machine learning base caller, we need to have something to learn from. We’ve been solving that in the background… using a variety of methods, more recently using synthetic DNA where we put modifications in, but also on modified references. This is quite a significant area of internal research for Nanopore now.”
The company plans to start with modified DNA, but Brown said when that research is mature the company will “really have a go at decoding direct RNA, where there are many more modifications.”
Remora, a new tool for methylation analysis, offers this enhanced neural network-based method and is used concurrently with standard base calling at no additional cost. It’s relatively lightweight in terms of computing, Brown said, and fast to run. After a first pass calls canonical bases at standard accuracy, a second pass by Remora annotates modifications. Remora provides industry-leading performance at only 20X coverage, the company says. It is available now in Bonito and will be rolled out to other base callers and to MinKNOW directly.
Browser Calling on an Integrated Tablet
The base calling update that Brown called “massive” is the Bonito base caller online. “I’m interested in going back to running things in browsers,” he said. “The latest browsers can access the GPU on your computer; they’re getting quite powerful.” With just a modern browser and a GPU, users can drag and drop .fast5 files into Web Bonito, and the base calling will be done locally. Brown says it works quite nicely on his iPad Pro. “We don’t need to install drivers; we don’t need to install libraries for GPUs.”
He encouraged users to try the fully featured Bonito to re-call old data. “You don’t need to be a bioinformatician to take that old data that you’ve got… You can drag and drop the files onto Web Bonito and bring them all up to date.”
Browser base calling is only the first step though. Brown hopes to replace the Mk1C with the MinION Mk1D—combining a MinION with an Apple iPad Pro. Calling it the next generation of portable sequencing, Brown highlighted that the Mk1D has no cables, runs off battery, and can use 5G mobile capabilities globally.
In the promotional images, the iPad Pro case includes two wells above the keyboard for additional devices. One will be for a MinION, the second could be another MinION or perhaps a sample prep device derived from VolTRAX.
“Most importantly, that web-based Bonito—the experimental version—we’ve been able to run it on [Apple’s] M1 processors and other versions of the base callers. And M1 is Apple’s new silicon which actually has the GPU and other neural network things on the processor. They’ve integrated it with the processor. We’re getting ‘keep-up’ base calling on the iPad Pro—that’s 100 kb/s. On the M1-Max, we’re getting close to P2-level performance.”
His engineers predict late 2022 for the Mk1D, but Brown has other ideas. “I’d like to do this much more quickly.”
The Outy Option
Shifting to sequencing chemistry, Brown described two of the three options for moving DNA or RNA through a pore: using an enzyme motor to either drive it in (“inny”) or pull it out (“outy”). In the early days of the platform, Brown said, “I settled on the inny scheme, because my belief was it would be easier to implement with double-stranded DNA… It felt more like it was going to be high-throughput… But it’s been on my mind for a long time to circle back to outy. Some of our earliest and best data was done using outy.”
Brown reported current outy sequencing running at about 200 bases per second, and, because the enzyme motor does not immediately detach, a single strand can be read again and again. Even with “fairly naïve software” and a “fairly primitive pore that isn’t optimized,” averaging the base calls from this cyclical reading is yielding a very high-accuracy consensus on a single molecule. “Because this is under software control, largely, you have real-time control” over when to repeat a read and when to release the sample, he said.
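To illustrate why repeated passes over one molecule help, here is a toy consensus by per-position majority vote. It is only a sketch under strong assumptions: the passes are taken to be already aligned and of equal length, errors are substitutions only, and the function name is hypothetical. It is not Oxford Nanopore’s method, which was not described in detail.

```python
from collections import Counter


def majority_consensus(passes: list[str]) -> str:
    """Per-position majority vote across equal-length, pre-aligned passes."""
    if not passes:
        raise ValueError("need at least one pass")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("passes must be pre-aligned to the same length")
    consensus = []
    for column in zip(*passes):
        # On a tie, most_common returns the base that appeared first.
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Five simulated passes over the same strand, two with a single error each.
passes = [
    "ACGTTGCA",
    "ACGATGCA",  # substitution at index 3
    "ACGTTGCA",
    "ACCTTGCA",  # substitution at index 2
    "ACGTTGCA",
]
print(majority_consensus(passes))  # ACGTTGCA
```

Because the two errors fall at different positions, each is outvoted by the other four passes, which is the intuition behind re-reading a strand until the consensus is trustworthy.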
What really matters, Brown emphasized, is the “ability to definitively call a mutation or a modification at very high confidence.” The outy sequencing mode will enable adaptive sampling—base calling and aligning the first hundred bases of a sample and choosing to either keep or reject the strand based on the findings. “What I can do is, I can fish for interesting molecules in one step,” he said. Then for interesting molecules, the outy sequencing enables “adaptive accuracy”: “Once we’ve found one, we can hold it on the pore. We can accumulate interesting molecules on the pore.”
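The keep-or-reject decision in adaptive sampling can be sketched in a few lines. This toy version replaces real alignment with shared k-mer matching against a set of target sequences. Every name and threshold here (`build_target_index`, `decide`, `k=8`, `min_fraction=0.5`) is hypothetical, and it does not use or represent Oxford Nanopore’s software.

```python
def kmers(seq: str, k: int) -> set[str]:
    """All overlapping substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_target_index(targets: list[str], k: int = 8) -> set[str]:
    """Collect the k-mers of every region of interest."""
    index: set[str] = set()
    for target in targets:
        index |= kmers(target, k)
    return index


def decide(read_prefix: str, index: set[str], k: int = 8,
           min_fraction: float = 0.5) -> str:
    """Return 'keep' if enough k-mers of the read's first bases hit a target."""
    prefix_kmers = kmers(read_prefix, k)
    if not prefix_kmers:
        return "reject"  # prefix shorter than k: nothing to judge
    hits = len(prefix_kmers & index)
    return "keep" if hits / len(prefix_kmers) >= min_fraction else "reject"


# Hypothetical target region and two read prefixes.
target = "ACGTACGGTCAGTTAGCCATGGATCCGATTACAGGCTA"
index = build_target_index([target])
on_target = target[5:30]   # a fragment drawn from the target
off_target = "T" * 25      # unrelated sequence

print(decide(on_target, index))   # keep
print(decide(off_target, index))  # reject
```

In a real system the decision would come from aligning base-called sequence to a reference in real time; the set lookup here only stands in for that step.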
Paired with sample preparation workflows in the lab or in the field, the outy sequencing mode could enable easy-to-use remote “liquid biopsy” sequencing. “It’s the essence of looking at fragments of blood; it’s the essence of looking at fragments of river; it’s the essence of looking at viruses in air droplets,” Brown said. “Anywhere where you get a heterogenous mixture of fragments and you’re looking for a subset with high confidence.”
The hardware for this application stays the same, though the software, front-end, and kit will need to be customized. The company is targeting mid-2022 for a demonstration.
Enabling the work without a lab at all is Brown’s goal. “I intend to make a liquid biopsy platform,” he said, showing prototypes that use a swab input and are compatible with the MinION platform. Small molecule sensing will be feasible, Brown said, as well as cell-free DNA. “There’s still a lot of work to do on these; it’s very early days. But I’m much happier with the direction it’s going in.”
Finally, Brown gave a shipping update on the company’s PromethION 2 (P2), a device that he says falls between GridION and PromethION on the product spectrum. It can run up to two high-throughput PromethION Flow Cells and is appropriate for individuals or single labs. He expects shipping (barring now-common delays) to begin in mid-Q2 of 2022. The P2 is designed to offer low-cost nanopore sequencing for large genomes or high-throughput long-read transcriptomics without the requirement for capital investment and with minimal infrastructure. P2 will be available as a standalone device with integrated compute or as “P2 Solo”, a sequencing device that can be connected to existing compute, including GridION. P2 Solo is listed for $10,455 and will be shipped from Q2 2022.