June 14, 2006 | Later this summer, coinciding with the anticipated publication of a major peer-reviewed paper on results from J. Craig Venter’s worldwide voyage sampling ocean genomes, researchers will gain access to version 0.5 of CAMERA — the Community Cyberinfrastructure for Advanced Marine Microbial Ecology Research and Analysis — a platform replete with a wealth of data, analysis tools, and high-speed computational infrastructure.
“We’ll point readers to the portal from the paper,” says Paul Gilna, who was appointed executive director of CAMERA last month. Gilna is an experienced science program administrator who helped launch GenBank, worked on the Protein Data Bank (PDB), and was a director of DOE’s Joint Genome Institute at Los Alamos National Laboratory. He hopes CAMERA will help jumpstart and grow the nascent field of metagenomics.
“I’ve noticed it’s one of those fields that’s beginning to take on the ‘in-the-eye-of-the-beholder’ definition,” says Gilna. Broadly, the term metagenomics has been used in two ways: one refers to the sequencing of microbial communities that contain large mixtures of microbial species; the other refers to the capture of data associated with any given microbe, microbial sequence, or microbial genome. CAMERA will emphasize the latter, says Gilna: pure sequence data will be submitted to GenBank, while the accompanying metadata will also be housed at CAMERA.
“When we lift a sample out of soils or out of the ocean, we know we are sequencing parts of many thousands, even millions of microbes,” Gilna says. “So for any one sequence, getting the context of what was around it at the time in terms of other sequences is important. In the case of genome sequences garnered from the ocean surveys, there is information associated with spatial location such as GPS coordinates as well as satellite imagery of the particular sites. There are data associated with ocean conditions such as depth, temperature, salinity.”
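To make the idea of sequence context concrete, here is a minimal sketch in Python of how a sequence might be linked to the environmental conditions of the sample it came from. It is an illustration only, not CAMERA’s actual schema: the class names, field names, units, and the `context_for` helper are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class SampleContext:
    """Hypothetical environmental context recorded at one sampling site."""
    latitude_deg: float    # GPS latitude of the site
    longitude_deg: float   # GPS longitude of the site
    depth_m: float         # sampling depth in meters
    temperature_c: float   # water temperature in degrees Celsius
    salinity_psu: float    # salinity in practical salinity units


@dataclass(frozen=True)
class SequenceRecord:
    """Hypothetical pointer from a sequence to the sample it came from."""
    sequence_id: str       # identifier of the sequence, stored elsewhere
    sample_id: str         # key of the sample that produced the sequence


def context_for(
    record: SequenceRecord, contexts: Dict[str, SampleContext]
) -> Optional[SampleContext]:
    """Return the environmental context for a sequence, or None if unknown."""
    return contexts.get(record.sample_id)
```

Keeping the sequence and its context in separate records mirrors the split Gilna describes, in which the sequence itself lives in one archive and the surrounding metadata in another, joined here by a shared sample key.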
Soon, scientists may have those data at their fingertips. CAMERA is a partnership among the J. Craig Venter Institute (JCVI), the California Institute for Telecommunications and Information Technology (Calit2), UCSD’s Center for Earth Observation and Applications based at the Scripps Institution of Oceanography, and the San Diego Supercomputer Center.
JCVI is providing data, analysis tools, and visualization tools. Calit2 is providing computational infrastructure based on its high-performance optical networked computing system, known as OptIPuter, which builds on a national high-speed 10-gigabit grid. Funding of $24.5 million over seven years comes from the Gordon and Betty Moore Foundation, which also partly funded Venter’s ocean genome sampling project.
“Phase one, which we sort of term the sprint phase, is to really get the system up and running, knowing what we do today. The second phase, which we’ve characterized as the marathon, is one in which the facilities we offer, the tools that we offer, will in large part be defined by the community itself. We know our assembled partnership here can say and posit a lot of what could be done to harness this new data set, but part of what we need to do is engage with the community,” says Gilna.
In fact, a key CAMERA goal, says its new boss, is to create a platform on which members of the community can not only develop needed tools but also use that same platform to provide them to the community at large.
The project began in earnest in February, and Gilna says a series of release cycles has already been planned. Roughly 30 people are working full time at JCVI on the project, and a similar number in San Diego are engaged in the day-to-day project management of getting the first release in place to coincide with publication of the papers. Gilna didn’t specify which journal, but said the June-July timeframe was likely.
“If you look at GenBank, which is where I really started this career, I came to Los Alamos in 1988 to join the project. The growth of data at the time had clearly begun to demonstrate an exponential growth pattern, when in fact at the project’s initiation everybody believed, certainly the funders, that the growth rate would at best be linear,” says Gilna.
He expects CAMERA to experience the same sort of explosive data influx and to produce a great deal of fundamental science, of the kind hinted at by Venter’s proof-of-concept sequencing effort in the Sargasso Sea. In that study, Venter reported finding 70,000 entirely novel genes from an estimated 1,800 genomic species, including 148 novel bacterial phylotypes (see “Venter Makes Waves — Again,” April 2004 Bio-IT World, page 1).
Turning the bicoastal effort into a cohesive team will require both advanced technology (“state-of-the-art videoconferencing”) and surmounting cultural challenges, agrees Gilna, whose “line of sight” bosses include Larry Smarr, director of Calit2, and Ramesh Rao, UCSD division director of Calit2. No doubt Venter will share his opinions with the new director.
“This is a huge and welcome challenge for me,” says Gilna. “I’ve sort of known these people throughout my career in one way or another. I’m most energized when I’m working on projects where success or failure is judged by how well the community believes it has been served.” If the past is prologue, CAMERA is likely in good hands.