TriPrint Method Watermarks DNA



Malcolm Simons proposes codon changes to identify DNA.

By Graeme O’Neill

July 14, 2008 | MELBOURNE - In the fall of 2001, five people died in the anthrax attacks in New York and Boca Raton, Florida. Although the crime remains unsolved, the Bacillus anthracis strain used in the attacks was virtually identical to one held at the U.S. Army Medical Research Institute of Infectious Diseases (USAMRIID) at Fort Detrick, Maryland. The episode highlights the need for a means of readily distinguishing modified strains of bacteria, viruses, cell lines, and transgenic plants and animals from naturally occurring forms.

Australian immunologist Malcolm Simons—who made headlines a few years ago with his controversial “junk DNA” patent (see, “Simons Says”)—is not the first to propose a method for “watermarking” proprietary DNA, but his idea is simple… and waterproof.

An obvious way to watermark a gene or genome is to insert a distinctive, non-coding DNA sequence. J. Craig Venter recently watermarked the name of his research institute into the genome of the prototype of the world’s first synthetic living organism, a genetically minimal Mycoplasma. His group inserted a synthetic sequence of triplet codons to spell out “VenterInstitvte”, along with his own name and the names of several colleagues, using the single-letter abbreviations of the 20 amino acids. (The amino acid “alphabet” lacks six of the 26 letters in the English alphabet, hence Venter’s substitution of a valine (V) in “Institvte”.)
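The amino-acid alphabet trick is easy to sketch in code. The example below is illustrative only: the codon picked for each letter is an arbitrary choice among synonyms, the function names are hypothetical, and it does not reproduce the actual sequence Venter's group inserted.

```python
import string

# One representative DNA codon per amino-acid letter (standard genetic code).
# The choice among synonymous codons is arbitrary, for illustration only.
CODON_FOR = {
    "A": "GCT", "C": "TGT", "D": "GAT", "E": "GAA", "F": "TTT",
    "G": "GGT", "H": "CAT", "I": "ATT", "K": "AAA", "L": "CTG",
    "M": "ATG", "N": "AAT", "P": "CCT", "Q": "CAA", "R": "CGT",
    "S": "TCT", "T": "ACT", "V": "GTT", "W": "TGG", "Y": "TAT",
}
LETTER_FOR = {codon: letter for letter, codon in CODON_FOR.items()}


def missing_letters():
    """Letters of the English alphabet that have no amino-acid code."""
    return sorted(set(string.ascii_uppercase) - set(CODON_FOR))


def encode(text):
    """Spell a word as DNA, three bases per letter."""
    text = text.upper()
    bad = sorted({ch for ch in text if ch not in CODON_FOR})
    if bad:
        raise ValueError(f"no amino-acid letter for: {bad}")
    return "".join(CODON_FOR[ch] for ch in text)


def decode(dna):
    """Read the DNA back three bases at a time."""
    return "".join(LETTER_FOR[dna[i:i + 3]] for i in range(0, len(dna), 3))
```

Here `encode("INSTITUTE")` fails because there is no amino acid abbreviated U, while `encode("INSTITVTE")` succeeds, which is the substitution the article describes.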

“The issue is conserved sequence,” says Simons. “Everyone understands that it is only possible to hide cryptic information in conserved sequences. There are ultra-conserved sequences in non-coding DNA, but within the protein-coding region of a gene, any change in a single nucleotide may change the function of the protein.”

Simons’ solution, called TriPrint, exploits the variation available in synonymous codons within protein-coding exons to “imprint” the watermark within the gene, without changing the encoded protein. The four-base DNA code yields 64 unique triplet codons, which specify 20 amino acids. Most amino acids have between two and six synonymous codons (methionine and tryptophan have just one), and even the universal “stop” codon that terminates genes has three variants: UAA, UAG and UGA. TriPrint would combine synonymous variants of the stop codon, and serine codons in each protein-coding exon, to create unique watermarks. (Serine has six codons, so there are five alternatives to any wild-type serine codon.)
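The idea of swapping synonymous codons can be sketched as follows. This is a minimal illustration, not the patented method: the toy reading frame, the function names, and the choice to alter only the first serine are assumptions, and the table uses DNA letters (T in place of U) for the standard genetic code.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def split_codons(dna):
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate(dna):
    """Translate an in-frame DNA sequence ('*' marks a stop codon)."""
    return "".join(CODON_TABLE[c] for c in split_codons(dna))


def synonyms(aa):
    """All codons that encode the given amino acid (or '*' for stop)."""
    return sorted(c for c, a in CODON_TABLE.items() if a == aa)


def watermark(orf, serine_codon, stop_codon):
    """Swap the first serine codon and the final stop codon for chosen synonyms."""
    if CODON_TABLE[serine_codon] != "S" or CODON_TABLE[stop_codon] != "*":
        raise ValueError("replacement codons must encode serine and stop")
    codons = split_codons(orf)
    for i, codon in enumerate(codons):
        if CODON_TABLE[codon] == "S":
            codons[i] = serine_codon
            break
    if CODON_TABLE[codons[-1]] == "*":
        codons[-1] = stop_codon
    return "".join(codons)
```

For the hypothetical reading frame `ATGGCTTCTAAAAGCTAA`, calling `watermark(orf, "AGT", "TGA")` changes the DNA, yet `translate` returns the same protein before and after.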

Patented System
“The stop codon is the least mutable codon, and the least mutable amino acid of the 20 has got to be serine,” Simons said. “You would have to change two nucleotides in a transversion manner and the probability of simultaneous substitutions in the same serine is extremely remote.”

He said that a watermark pattern comprising just the first occurrence of serine in each exon of the gene, plus the stop codon, would provide many more unique permutations than required to distinguish between all naturally occurring variants of the gene. Indeed, even in a small gene with just a handful of exons, the number of potential permutations rapidly approaches Avogadro’s number.
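A back-of-envelope count can be sketched under the simplest reading of the scheme: six possible serine codons at one position per exon, times three possible stop codons. The function names are hypothetical and the patent may count positions differently. Under this reading, five exons give 23,328 combinations, and roughly 30 watermarked serine positions are needed before the count passes Avogadro's number.

```python
SERINE_CODONS = 6   # TCT, TCC, TCA, TCG, AGT, AGC
STOP_CODONS = 3     # TAA, TAG, TGA
AVOGADRO = 602_214_076_000_000_000_000_000  # 6.02214076e23, as an integer


def watermark_count(serine_positions):
    """Distinct watermarks with one free serine codon per position plus the stop."""
    return STOP_CODONS * SERINE_CODONS ** serine_positions


def positions_needed(target):
    """Smallest number of serine positions whose watermark count reaches target."""
    n = 0
    while watermark_count(n) < target:
        n += 1
    return n
```

Integer arithmetic is used throughout so that the large counts stay exact.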

“In practice, it might be sufficient just to modify the first serine codon in each exon of the gene, plus the stop codon,” Simons said. “The particular combination of serine variant and the stop codon could even serve to identify the company that created the gene.”

Simons’ TriPrint patent, filed with the Australian Patent Office on April 17, relates specifically to transversion mutations: changes that substitute a purine (A or G) for a pyrimidine (T or C), or vice versa, at any of the three codon positions. “The improbability of a dual mutation in the first two nucleotides can be regarded as infinitesimally small,” he said. “There is no need for an error-checking mechanism.”
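The transition/transversion distinction is mechanical enough to check in a few lines. The sketch below uses hypothetical helper names. It classifies each base change between two codons and shows that moving between the two serine codon families (TCN and AGY in DNA letters) takes transversions at both of the first two positions.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(a, b):
    """Label a single-base change as a transition or a transversion."""
    if a == b:
        return "identical"
    same_class = {a, b} <= PURINES or {a, b} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def codon_changes(src, dst):
    """List (position, from, to, kind) for every base that differs."""
    return [
        (pos, s, d, classify(s, d))
        for pos, (s, d) in enumerate(zip(src, dst), start=1)
        if s != d
    ]
```

For example, `codon_changes("TCT", "AGT")` reports a T to A change at position 1 and a C to G change at position 2, both transversions.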

Simons proposes that DNA watermarks would be recorded in a central database. “Given the concerns of some consumers about ‘contamination’ of conventional crops by pollen from genetically modified crops, TriPrint provides a way of individually watermarking every transgenic variety,” he said.

A DNA watermark in the presumptive Fort Detrick strain of B. anthracis might not have prevented the anthrax attack, but Simons said the ability to rapidly trace a transgenic microbe to its source might give terrorists pause for thought.

Simons Says

Malcolm Simons is the inventor named on a controversial patent covering the use of conserved DNA markers in non-coding DNA to identify haplotypes associated with inherited disorders (see, “Malcolm in the Middle,” Bio•IT World, August 2003). Simons asserts that the field of genome-wide association mapping was founded on his discovery that non-coding SNPs occur as haplotypes that enable gene mapping without pedigrees. In 2000, Simons quit Genetic Technologies, the Australian biotechnology company he co-founded, and relinquished his interest in both patents. He subsequently co-founded Haplomic Technologies to develop techniques to isolate and characterize the separate haploid DNA contributions of parents to their offspring, a prerequisite to understanding how the two “haplomes” interact to create phenotypically unique offspring. --G.O.

_________________________________________________

This article appeared in Bio-IT World Magazine.

