By Allison Proffitt
November 30, 2010 | SINGAPORE—The Moorea Biocode Project, a species inventory of the island of Moorea, is making publicly available its LIMS as a free beta version. Moorea is an island in French Polynesia just east of Tahiti. Funded through a $5.2 million grant to UC Berkeley from the Gordon and Betty Moore Foundation, and based out of the American and French research stations on Moorea, the Moorea Biocode Project aims to create a comprehensive inventory of all of the coral reef and terrestrial species on Moorea larger than a microbe. The project brings together researchers from the Smithsonian Institution, France’s National Center for Scientific Research (CNRS), and other partners.
To develop the LIMS and data analysis components of the project, Biocode researchers collaborated with Biomatters on a plugin for Biomatters’ Geneious Pro sequence analysis software. The Geneious Biocode LIMS will give biologists around the world a tool to use in their own research as well as free access to the Moorea project’s final database. An accompanying “Biocode Genbank Submission” plugin will allow researchers to upload their own sequence data from inside Geneious Pro directly to GenBank, the world’s largest public DNA sequence database.
The unique database of Moorea’s biodiversity will be publicly shared as a resource for ecologists and evolutionary biologists around the world. So far, the Biocode LIMS has tracked over 24,000 specimens from over 30 phyla of algae, fungi, plants and animals in the first two years of the project.
“The Moorea Biocode Project was created with the intention of providing a model system for similar comprehensive genetic inventories. In addition to tracking down all the biodiversity in this tropical island ecosystem, one of the promised deliverables has been an informatics tool to allow easy access to those data and to aid other genetic barcoding initiatives,” said Christopher Meyer of the Smithsonian Institution, director of the Biocode Project, in a press release.
“We see no sense in reinventing the wheel,” said Meyer. “We want to share our best practices from this ambitious project with anyone, from single researchers, to a principal investigator's lab, to large scale initiatives like our own.”
Barcoding Pipeline Tool
The Biocode LIMS provides an informatics pipeline for batch processing of samples from DNA extraction through sequencing. It helps researchers identify and re-run failed reactions and spot systematic errors that can be strategically addressed. It integrates with Geneious Pro’s existing sequence assembly tools and with various Field Information Management Systems (FIMS), including those using TAPIR standard access protocols.
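To make the idea of re-running failed reactions and spotting systematic errors concrete, here is a minimal Python sketch. It is not the Biocode LIMS code or its data model: the `Reaction` record, its fields, and the `threshold` and `min_count` parameters are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    """One lab reaction for one specimen (hypothetical record layout)."""
    specimen_id: str
    plate: str
    primer: str
    passed: bool


def needs_rerun(reactions):
    """Return specimens that have failed reactions and no passing one."""
    passed = {r.specimen_id for r in reactions if r.passed}
    failed = {r.specimen_id for r in reactions if not r.passed}
    return sorted(failed - passed)


def suspect_groups(reactions, key, threshold=0.5, min_count=3):
    """Flag plates or primers whose failure rate suggests a systematic error.

    `key` is the name of a Reaction field, e.g. "plate" or "primer".
    Groups with fewer than `min_count` reactions are ignored.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for r in reactions:
        group = getattr(r, key)
        totals[group] += 1
        if not r.passed:
            failures[group] += 1
    return sorted(
        g for g, n in totals.items()
        if n >= min_count and failures[g] / n >= threshold
    )
```

Calling `needs_rerun(reactions)` would give the list of specimens to queue again, while `suspect_groups(reactions, "plate")` would point at whole plates that failed together, which is the kind of pattern that is hard to see when results are spread across notebooks.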
Once the specimen reaches the end of the pipeline, the Biocode Genbank submission plugin automates the submission of completed contigs to make the DNA sequences publicly available. As a quality control mechanism, the reaction data from the LIMS database is combined with the field metadata from the FIMS database, including the complete tracked history.
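The combination of lab and field records described here amounts to a join on a shared specimen identifier. The following Python sketch shows one way such a check could work. The dictionary keys (`specimen_id`, `contig`, `locality`) and the function name are assumptions made for this example, not the actual schema or API of the Biocode plugins.

```python
def build_submission_records(lims_rows, fims_rows):
    """Join lab (LIMS) rows to field (FIMS) rows by specimen ID.

    Returns (complete, missing): merged records ready for review, and
    the IDs of specimens that have lab data but no field metadata.
    """
    fims_by_id = {row["specimen_id"]: row for row in fims_rows}
    complete, missing = [], []
    for row in lims_rows:
        field = fims_by_id.get(row["specimen_id"])
        if field is None:
            # Quality control: hold back sequences without field data.
            missing.append(row["specimen_id"])
        else:
            complete.append({**field, **row})
    return complete, missing
```

In this sketch, a sequence whose specimen has no matching field record is held back rather than submitted, which is one plausible reading of using the merge as a quality control step.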
“This is the first freely-available, broadly-applicable software tool to assist tracking materials through the DNA barcoding pipeline. No other freely-available program allows the level of tracking and data quality assurance through a lab system. Importantly, it goes beyond DNA barcoding to accommodate multiple genetic markers for use in a broad range of biodiversity and ecogenomic studies,” said Neil Davies, director of UC Berkeley’s Gump South Pacific Research Station and principal investigator of the Biocode Project.
Biomatters and the Moorea Biocode Project have been working together for two years and the LIMS software has been developed under “demanding real world circumstances,” said Candace Toner, Biomatters’ CEO.
“Most sequencing projects focus on one or two species, while Biocode focuses on an entire ecosystem and involves researchers from multiple international institutions. This creates challenges to organise the vast amounts of information produced and manage the sheer volume of human handling involved. For reproducibility and transparency, the Biocode LIMS needs to store a full, publicly available lab workflow for each specimen,” Toner said.
Meyer and Davies have found that starting with an existing product facilitated their research. “Our collaboration with Geneious has been absolutely critical for tracking workflows and maintaining the data integrity of the Biocode Project,” said Meyer. “Instead of starting from scratch, we got a leg up by joining forces with Biomatters and integrating with their existing Geneious platform.”
Davies added, “[Biomatters’] workflow approach has greatly simplified the process of identifying reaction failures, and setting them up to be run again. It manages data that used to be spread across multiple individuals and notebooks so we can search it, report success, or look for patterns. It has significantly reduced the human error that has been problematic in large-scale sequencing projects such as this in the past.”
A beta version of the free Biocode LIMS and Biocode Genbank plugins is available from http://software.mooreabiocode.org. Users of the free Geneious Basic software will be able to access and view the Biocode database upon completion of the project, but a commercial copy of Geneious Pro is required for data creation and analysis.