Molecular Tagging System With Real-World Potential
By Deborah Borfitz
November 17, 2020 | A decades-old technology initially used to hide secret messages in dots of DNA could get a reboot as a tracking system for commodities that might be too small or numerous for tagging with QR codes or radio-frequency identification, such as experimental drugs and controlled substances. The “molecular bits” (molbits) it creates could provide more distinct barcodes for multiplex sequencing. Leveraging advancements in DNA-based data storage, sequencing technologies and raw signal processing tools, University of Washington (UW) researchers recently introduced the first end-to-end molecular tagging system designed for do-it-yourselfers in real-world scenarios across multiple industries, according to Kathryn Doroschak, a UW computer science and engineering doctoral student.
The new tagging system, known as Porcupine, enables rapid, on-demand encoding and decoding of DNA tags at scale, says Doroschak, lead author of a paper recently published in Nature Communications (DOI: 10.1038/s41467-020-19151-8) that describes the innovation in detail. The nano-sized tags can be programmed and read within seconds using Oxford Nanopore Technologies’ MinION, a low-cost sequencing device the size of a candy bar.
Porcupine uses dehydrated strands of synthetic DNA to tag objects in lieu of bulky plastic or printed barcodes, much as the actual rodent uses its quills to defend itself against a predator, explains Doroschak. After being shipped or stored, the molecular tags get rehydrated using a buffer solution prior to readout.
Molecular tagging has traditionally required access to specialized labs and expensive, benchtop-sized sequencers, which has made it an impractical choice for real-time use cases or when large quantities of tags are needed, says Jeff Nivala, a UW research scientist in computer science and engineering and senior author of the newly published paper. The technology has historically been used in steganography to hide protected information and as an offender marking spray to deter and prevent criminal behavior.
All seven scientists on the larger research team are affiliated with the same lab at UW where DNA data storage is a primary focus. Co-author Karin Strauss is also senior principal research manager at Microsoft Research.
Porcupine helps to address the main holdup in the adoption of DNA as an information storage medium, namely the prohibitively high cost of chemically synthesizing DNA, and Microsoft has a down-the-road interest in building a DNA-based archival storage system, says Doroschak. Although Microsoft has made financial gifts to the lab, the patent application for Porcupine does not include Microsoft.
DNA is inherently more stable than paper or anything a QR code might be printed on, lasting hundreds if not thousands of years under the right conditions, Nivala says. Dehydrating the tags helps improve their survivability, although the upper bound of their lifespan has yet to be pinned down. But researchers did recently ship a tag via the U.S. Postal Service from Washington to California and successfully recovered it after four weeks, he notes.
Porcupine is a hybrid molecular-electronic system that employs molecular engineering, new sensing technology, and machine learning. The hard parts of the tagging process are done in advance, including engineering the DNA fragments to unwind and flow through a nanopore to create a telltale current change, Doroschak says. These ionic current traces allow Porcupine to avoid the lengthy, complex computational process of “basecalling” that is typically required to identify the base sequences from raw current measurements—and is the largest source of error when working with nanopore data.
“We’re not trying to identify any DNA, just the DNA we know is already there so we can turn this into a classification problem instead of a decoding problem,” she continues. “That saves a lot of time and increases our accuracy dramatically. We’ve designed molbits to have distinct nanopore signals, using a tool called Scrappie that is produced by Oxford Nanopore.”
An “evolutionary process” is used in designing these signal-producing sequences for the molbits, which involves simulating what they will look like and computing how different they are from one another, Doroschak says. Identifying the molbits is a non-issue, she adds. “Training accuracy is nearly 100%, validation accuracy is 97.7%, and testing accuracy is nearly 97%.”
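The design loop Doroschak describes (simulate each candidate's signal, measure how different the signals are, keep the changes that spread them apart) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the team's method: the per-base current levels in `LEVELS` are made up to stand in for a real signal simulator such as Scrappie, and the distance measure, population size and mutation scheme are all hypothetical.

```python
import random
from itertools import combinations
from typing import List, Tuple

# Hypothetical per-base current levels standing in for a real signal simulator.
LEVELS = {"A": 0.0, "C": 1.0, "G": 2.0, "T": 3.0}

def simulate(seq: str) -> List[float]:
    """Toy 'simulated signal': one made-up current level per base."""
    return [LEVELS[base] for base in seq]

def distance(a: str, b: str) -> float:
    """Sum of absolute differences between two simulated signals."""
    return sum(abs(x - y) for x, y in zip(simulate(a), simulate(b)))

def fitness(pop: List[str]) -> float:
    """Score a set of sequences by its closest (most confusable) pair."""
    return min(distance(a, b) for a, b in combinations(pop, 2))

def mutate(seq: str, rng: random.Random) -> str:
    """Replace one randomly chosen base with a random base."""
    i = rng.randrange(len(seq))
    return seq[:i] + rng.choice("ACGT") + seq[i + 1:]

def evolve(n_seqs: int = 6, length: int = 12, steps: int = 500,
           seed: int = 0) -> Tuple[List[str], float, float]:
    """Mutate one sequence at a time; keep a mutation only if the
    closest pair in the set does not get any closer."""
    rng = random.Random(seed)
    pop = ["".join(rng.choice("ACGT") for _ in range(length))
           for _ in range(n_seqs)]
    start = fitness(pop)
    best = start
    for _ in range(steps):
        k = rng.randrange(n_seqs)
        candidate = pop.copy()
        candidate[k] = mutate(pop[k], rng)
        score = fitness(candidate)
        if score >= best:
            pop, best = candidate, score
    return pop, start, best

pop, start, best = evolve()
print(f"closest-pair distance: {start} at start, {best} after evolution")
```

Scoring the set by its closest pair reflects the goal stated in the article: the molbits only need to be easy to tell apart from one another, so the most confusable pair is what limits the design.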
Some “error correction” has been added into the encoding process by using a portion of the molbits to produce a codeword that also gets put in the molecular tag, says Doroschak. This effectively allows multiple molbit mistakes without worrying about getting the entire tag wrong.
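To show how redundancy lets a tag survive a few misread molbits, here is a minimal sketch using a three-fold repetition code with majority voting. This is a deliberately simple stand-in, not the code Porcupine uses, and the split of 32 data bits across 96 molbits is an assumption made for illustration; the article does not state how many molbits carry the codeword.

```python
from typing import List

# Illustrative only: a 3x repetition code, NOT Porcupine's actual scheme.
REPEATS = 3  # assumed: 32 data bits x 3 copies = 96 molbits

def encode(data_bits: List[int], repeats: int = REPEATS) -> List[int]:
    """Lay out `repeats` full copies of the data bits back to back."""
    return list(data_bits) * repeats

def decode(read_bits: List[int], repeats: int = REPEATS) -> List[int]:
    """Recover each data bit by majority vote across its copies."""
    if len(read_bits) % repeats != 0:
        raise ValueError("tag length must be a multiple of the repeat count")
    k = len(read_bits) // repeats
    decoded = []
    for i in range(k):
        votes = sum(read_bits[i + j * k] for j in range(repeats))
        decoded.append(1 if 2 * votes > repeats else 0)
    return decoded

data = [1, 0] * 16          # 32 data bits (assumed size)
codeword = encode(data)     # 96 bits, one per molbit
noisy = codeword.copy()
for pos in (0, 40, 95):     # three misread molbits, each in a different data bit
    noisy[pos] ^= 1

print(len(codeword))            # 96
print(decode(noisy) == data)    # True
```

A repetition code only tolerates errors that land on different data bits; two misreads among the three copies of the same bit would flip it. Codes used in practice generally give stronger guarantees for the same amount of redundancy.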
Molecular tags encode information as 1s and 0s just like any other digital tag, except those numbers are converted to molbits, says Doroschak. Each molbit is represented by a different type of DNA and the part encoded as 1 gets added to the molecular tag mixture while the part encoded as 0 gets left out completely.
New DNA does not need to be generated for every new piece of information because DNA can be copied over and over again, Doroschak says. She likens the approach to braille, except that Porcupine encodes with the presence or absence of DNA strings rather than dots.
As envisioned, end users of Porcupine would receive a small well plate filled with the 96 molbits, says Doroschak. When a tag is needed, the system would instruct them to randomly combine a portion of the dehydrated molbits to create a unique identifier that could then be applied to the surface of an object.
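The presence-or-absence scheme can be illustrated with a short Python sketch. The function names and the eight-molbit example are hypothetical and chosen only to keep the output readable; the mapping itself (a 1 means the molbit goes into the mixture, a 0 means it is left out) follows the description above.

```python
from typing import List, Set

def bits_to_mixture(bits: List[int]) -> Set[int]:
    """Return the indices of the molbits to add to the tag (bits equal to 1)."""
    if any(b not in (0, 1) for b in bits):
        raise ValueError("bits must be 0 or 1")
    return {i for i, b in enumerate(bits) if b == 1}

def mixture_to_bits(present: Set[int], n_molbits: int) -> List[int]:
    """Rebuild the bit string from the set of molbits detected at readout."""
    return [1 if i in present else 0 for i in range(n_molbits)]

tag_bits = [1, 0, 1, 1, 0, 0, 0, 1]   # hypothetical 8-molbit tag
mixture = bits_to_mixture(tag_bits)

print(sorted(mixture))                                      # [0, 2, 3, 7]
print(mixture_to_bits(mixture, len(tag_bits)) == tag_bits)  # True
```

Because a 0 is simply a missing strand, reading a tag comes down to asking, for each known molbit, whether it was seen at all.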
Next, the object would either be shipped or stored. When the tag needs to be read, the designated individual would add water to rehydrate it and place it in the nanopore sequencing device, and specially created software would indicate whether the actual and expected tags match, she continues. The MinION natively plugs into a laptop via a USB cable, although Oxford Nanopore Technologies has plans to make the device smartphone-compatible.
With the initial 96 molbits, Porcupine can produce roughly 4.2 billion unique tags using basic laboratory equipment, as reported in the Nature Communications paper.
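A quick arithmetic check helps put that figure in context. If all 96 positions carried independent data, the number of possible presence-or-absence patterns would be far larger than 4.2 billion; the reported figure is in line with about 32 bits of actual data, with the remaining positions presumably given over to error correction. The 32-bit reading is an inference from the numbers, not something stated in this article.

```python
# Arithmetic check only; the 32-data-bit split is an inference, not stated in the article.
total_molbits = 96
assumed_data_bits = 32

all_patterns = 2 ** total_molbits       # every possible presence/absence pattern
data_patterns = 2 ** assumed_data_bits  # patterns if only 32 bits carry data

print(f"{all_patterns:.2e}")   # 7.92e+28
print(f"{data_patterns:,}")    # 4,294,967,296
```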
The methodology has many potential applications, such as commodity tracking to ensure the authenticity of a particular variety of cotton (as is already being done by Applied DNA Sciences) or valuable works of art where it would be important for tags to be undetectable by sight or touch, says Doroschak. Molecular tags might also be sprayed onto drugs during the manufacturing process, Nivala adds, so pills could be tracked back to their source to establish provenance.
Porcupine could conceivably be used as part of a blockchain. “The two technologies are both able to use identifiers to establish provenance,” Doroschak says. “What’s missing with our system is the distributed nature of the ledger.”
The novel tagging system is effectively ready for real-world use, says Doroschak, although “circumstantial testing” has yet to begin for specific use cases. Much work remains to be done to determine what kind of surfaces DNA tags can be attached to and how long they last.
Intriguing work with molecular tagging is also happening elsewhere, says Nivala, including experimental track-and-trace systems based on environmental DNA unique to specific geographic locations. Recently, scientists have also sprayed barcoded microbes onto produce as a way to track food contamination back to its source.