iDEA conference names data viz winners.
By Allison Proffitt
August 2, 2011 | SAN DIEGO—Illumina CEO Jay Flatley kicked off the iDEA (Illumina Data Excellence Awards) conference* with a striking prediction: we would be at the $1,000 genome, all in, within three to five years, with no need for any new technology. Then he passed the microphone to the dozen finalists competing in the inaugural iDEA awards challenge.
Scott Kahn, chief informatics officer at Illumina, assured me over lunch the next day that Flatley’s promises didn’t scare him. “Being the one responsible for the informatics part, I’m not shuddering,” he said. “Jay said $1,000 all in. ‘All in’ means everything to do with the sample, everything to do with the analysis. Go back two years, what everyone did was take the images off the machine and they had this huge informatics pipeline. They spent probably more than $1,000 to $2,000 just in raw CPU cycles to do the analysis and image processing.”
Flatley insisted that the next-generation sequencing field needs improved technologies, not new ones; faster cameras, for instance, not different ones. “Jay’s alluding to the fact that… you’re going to see more of the downstream alignment and variant calling move onto the instrument because it can, because it reduces costs and the time to result,” said Kahn. He was adamant that Flatley’s “all in” doesn’t include the cost of interpretation. “It’s just the genome; that’s why I’m not shuddering.”
Even as the technical costs continue to fall and sequencing gets faster (Illumina’s mini MiSeq, which launches later this year, will generate more than one gigabase per run in about a day), Kahn acknowledges that there are still huge problems that need to be addressed in the interpretation and use of genomic data.
To help bridge the gap between data and interpretation, Illumina hosted its first iDEA challenge and conference. Announced in May 2010, the competition was designed to challenge both commercial and academic entrants to develop new and creative visualization and data analysis techniques.
From 30 entries, judges selected 12 finalists based on technical merit: 7 academic entries and 5 commercial (see “Top Twelve”). Each finalist gave a presentation and answered questions at the conference. Just before the winners were announced, Kahn said he was pleased with the quality of the talks and the entries, and that he had learned a couple of new things. “Some of the methods are very cool, good ideas… I like the idea of a nonlinear representation of the genome, like the [Strand Life Sciences Avadis entry] elastic browser stuff. I liked the talk that Stephan Schuster [from Penn State] gave on inGAP… The challenge the judges are going to have is how to weight entries that explore very different variables.”
“There are clearly some entries that are scratching an itch that people didn’t know was there, and there are others that are very novel approaches to solving problems that other people have solved,” Kahn said. “You can solve an unmet problem incompletely, but at least it’s a partial solution. How do you weigh that against, ‘Here’s something that takes a capability and enhances it significantly?’”
Kahn had his own ideas, but the iDEA entries were evaluated by an independent group of judges, including Steven Jones (Genome Science Centre at the British Columbia Cancer Research Centre, Canada), Jared Maguire (Broad Institute), John Quackenbush (Dana-Farber Cancer Institute), Steven Salzberg (University of Maryland), Gavin Sherlock (Stanford University), and Bang Wong (Broad Institute).
The Envelope Please
The judges awarded sculptures from glass artist Barry Entner to the six winners. One of Kahn’s favorites, inGAP from Pennsylvania State University, in conjunction with Fudan University and the Beijing Institutes of Life Science (BioLS), Chinese Academy of Sciences, won the overall academic award and a $50,000 grant from Illumina to further develop the software. inGAP, an Integrated Next-gen Genome Analysis Pipeline, started in 2007 as a SNP calling tool for Sanger sequence data. It now includes aligners; detects SNPs, indels, and structural variation; and performs comparative genome assembly, all through a graphical user interface. The award grant will be used to extend inGAP to metagenomics and transcriptomics studies, said Fangqing Zhao at BioLS.
Enlis Genomics received the overall award in the commercial category and a one-year co-marketing agreement with Illumina. Another entry that impressed Kahn, Enlis hopes to enable “point-and-click genomics” for biologists rather than bioinformaticians. “Existing software packages have been focused on the bioinformatic tasks of assembling a genome, but our software is the first commercial package to recognize that for many biologists, the work of connecting genomic data to biology starts after variants have been called,” said Devon Jensen, Enlis’ founder. Fast algorithms enable variation filtering and genome comparisons, and Enlis’ .genome file format wraps all genomic data into a single compact and efficient file.
GenomeRing from the University of Tübingen (Germany) and Partek won awards for the most creative algorithms. GenomeRing is an interactive tool to visualize indels, SNPs, and other changes in dozens of genomes in a circular, rather than linear, view by constructing a “SuperGenome” and using the structure to compare different genomes. “We feel really honored,” said Kay Nieselt, head of the integrative transcriptomics group. “We now hope to be able to apply for new funding in the area of Visual Analytics in Bioinformatics. We are very motivated to continue our work to create new innovative algorithms and visualizations in the area of next-generation sequencing technologies.”
Partek debuted Gene-Specific Modeling for the iDEA Challenge along with the company’s Flow, Genomics Suite, and Pathway products. Gene-Specific Modeling takes the position that one model will not fit all genes; age, for example, affects some genes but not others. By using the algorithm to select the best model for each gene, users can identify which genes, and how many, are affected by which factors, and can produce more accurate statistical analyses because each model fits its gene better. “For years Partek has been recognized as a leader in making powerful statistical methods easily accessible to medical researchers,” said Tom Downey, president. “So, to be recognized for doing that again by a panel of renowned scientists is very satisfying. I’m proud of our team and glad that their hard work paid off.”
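The idea of picking a separate model for each gene can be illustrated with a small sketch. This is not Partek’s algorithm, which the article does not describe; it is a minimal illustration that assumes ordinary least-squares fits and uses the Akaike information criterion (AIC) as the selection rule. The function names, gene names, and expression values are all hypothetical.

```python
import math

def fit_rss(y, x=None):
    """Least-squares fit; returns (residual sum of squares, parameter count).

    With x=None the model is intercept-only; otherwise it is y = a + b*x.
    Assumes x is not constant (otherwise the slope is undefined).
    """
    n = len(y)
    mean_y = sum(y) / n
    if x is None:
        return sum((v - mean_y) ** 2 for v in y), 1
    mean_x = sum(x) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return rss, 2

def aic(rss, k, n):
    # AIC for Gaussian errors, up to an additive constant.
    # The floor on rss avoids log(0) for a perfect fit.
    return n * math.log(max(rss, 1e-12) / n) + 2 * k

def best_model_per_gene(expression, factors):
    """Pick, for each gene, the candidate model with the lowest AIC.

    expression: gene name -> list of expression values (one per sample)
    factors:    factor name -> list of covariate values (one per sample)
    """
    choice = {}
    for gene, y in expression.items():
        n = len(y)
        candidates = {"intercept_only": aic(*fit_rss(y), n)}
        for name, x in factors.items():
            candidates[name] = aic(*fit_rss(y, x), n)
        choice[gene] = min(candidates, key=candidates.get)
    return choice

# Hypothetical data: gene_a rises with age, gene_b stays flat.
factors = {"age": [20, 30, 40, 50, 60, 70]}
expression = {
    "gene_a": [1.0, 2.1, 2.9, 4.2, 5.0, 6.1],
    "gene_b": [3.1, 2.9, 3.0, 3.2, 2.8, 3.0],
}
result = best_model_per_gene(expression, factors)
print(result)
```

In this toy data the age model is chosen for gene_a and the intercept-only model for gene_b, which mirrors the point that a factor such as age may matter for some genes and not for others. A production tool would likely consider richer models and multiple factors at once.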
GenomeView from VIB (Flanders, Belgium) and Genomatix received the most creative visualization awards. GenomeView enables users to dynamically browse high volumes of aligned short read data, with dynamic navigation and semantic zooming, viewing whole genome alignments of dozens of genomes relative to a reference sequence. “There is still a lot of work to be done. Everybody agrees there is a clear need for visualization tools for genomics data and GenomeView has at least part of the solution. The iDEA award tells me that we’re doing something right. Now the trick is to convert that information into papers and grant money and we’re good to go,” said Thomas Abeel, a postdoctoral/Broad fellow at VIB.
Finally, Genomatix presented several workflow tools that one judge called “very intuitive” to cover the complete analysis of the iDEA datasets from mapping to the generation of biological networks. These included Transcriptome Viewer to interactively inspect transcript expression, splicing graphs and paired-end coverages in one view; a one-step mapping approach; and ElDorado, Genomatix’ genomic annotation database. “The iDEA challenge really sparked our interest from the very first moment we heard about it,” said Jochen Supper, project manager. “We felt that getting our hands on a high quality, diverse dataset like the one Illumina provided would be ideally suited to try and test our approach of combining multiple lines of evidence to get from sequencing data to biological results.”
Top Twelve iDEAs
Enlis Genomics
Genomatix Software
Harvard University, Seqeyes
ImmunoProfiles
Partek
Pennsylvania State University, inGAP
Strand Scientific Intelligence, Avadis NGS
University of California, San Diego, STAR Genome Browser
University of Delaware
University of Georgia, DawgPack
University of Tübingen, GenomeRing
VIB, GenomeView
This story also appeared in the 2011 July-August issue of Bio-IT World magazine.