Making DREAM(s) Come True Isn’t Easy


By John Russell

July 14, 2008 | How good is your algorithm for turning experimental data—mainly gene expression or protein interactions—into accurate pictures of biological networks?  If the results from the first DREAM challenge are any indication, plenty of progress is needed to improve pathway predictions from data.

The DREAM initiative—Dialogue for Reverse Engineering Assessments and Methods—is intended to help the life science community improve computational techniques for inferring networks. Datasets from known networks are provided and researchers are invited to uncover the true underlying networks using computational techniques. In this first “competition,” there were five challenges, 36 participating teams, and 110 predictions.

“DREAM is trying to understand whether we predict something meaningful when we take high-throughput data like gene expression and we say, well this is the network of interactions,” says one of DREAM’s organizers, Gustavo Stolovitzky. “Usually you validate with five or six connections that you either do in the lab or validate through literature. But actually you have made 1,000 predictions and you only cherry pick the five that match your data, so in a way, we don’t know whether we are fooling ourselves or whether really we have something in those predictions.”

On balance, this year’s predictions were lousy. Talking about one specific challenge, Stolovitzky says, “These were 200 genes of which 53 were true positives and the rest were not. Some people did well in the first 10 or 15, but then really not so well. At the same time many, many teams, some of which are very well known people that have one of these algorithms as their favorite algorithm [did very poorly] and their favorite algorithm is very bad, really bad, as bad as random so if you put a random predictor you will have predicted better.”
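To see what "as bad as random" means with those numbers, here is a minimal Python sketch. It is not the scoring DREAM used; the gene names, the placeholder gold standard, and the helper functions are all assumptions made for illustration. It only shows that with 53 true targets among 200 candidates, a random ranking is expected to get about 26.5% of its top picks right, so a useful algorithm has to beat that figure.

```python
import random


def precision_at_k(ranked_genes, true_targets, k):
    """Fraction of the top-k ranked genes that appear in the gold standard."""
    hits = sum(1 for gene in ranked_genes[:k] if gene in true_targets)
    return hits / k


def mean_random_precision(genes, true_targets, k, trials=2000, seed=0):
    """Average precision@k over many random rankings (the chance baseline)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        shuffled = genes[:]
        rng.shuffle(shuffled)
        total += precision_at_k(shuffled, true_targets, k)
    return total / trials


# Hypothetical data mirroring the quote: 200 candidate genes, 53 true targets.
genes = [f"gene_{i}" for i in range(200)]
true_targets = set(genes[:53])  # placeholder gold standard, not DREAM data

# Expected precision of a random predictor: 53 / 200 = 0.265
chance_level = len(true_targets) / len(genes)

# Empirical check of the chance baseline for the top 15 predictions.
random_top15 = mean_random_precision(genes, true_targets, k=15)

print(f"chance level: {chance_level:.3f}")
print(f"mean precision@15 of random rankings: {random_top15:.3f}")
```

A team whose precision in its top 10 or 15 predictions sits near this chance level has, in effect, learned nothing from the data.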

Predictions of protein interactions were even worse. Stolovitzky says simply, “If you, today, go with a set of proteins and sequences and you give it to someone who says, ‘I can predict which of these proteins are interacting,’ probably he is going to give you garbage. And that is a truth, but it is important to know that this is the case. It’s not to just roast the people who didn’t do well. It’s just to understand where we should improve.”

Ouch. Not surprisingly, no commercial tool providers tackled the challenges. And let’s be clear, as Stolovitzky is: the idea isn’t to roast anyone. The goal is to learn which algorithms work best, to assist in developing better algorithms, and to help the entire community move forward. In releasing DREAM2 results, the names of the teams were not revealed.

Stolovitzky is adjunct associate professor of Biomedical Informatics at Columbia University, and manager of Functional Genomics & Systems Biology at IBM Research. He notes that despite the poor results of the first set of challenges, much good work is being done by researchers using similar techniques.

Asked for his thoughts on recent work led by Merck researcher Eric Schadt in identifying key networks involved in metabolic disease by interpreting gene expression and other data (see “Merck’s Informatics Mission,” Bio-IT World, May 2008), he says, “They are great. I think he’s doing something that we all should be doing, and we are not doing in DREAM, which is he puts together a lot of [different kinds of] information like quantitative analysis, gene expression, and clinical information.”

Round Three
Plans are already afoot for DREAM3, which will be held in conjunction with the 5th Annual RECOMB Satellite on Regulatory Genomics and the 4th Annual RECOMB Satellite on Systems Biology from October 29 to November 2 (http://compbio.mit.edu/recombsat/). The meeting is jointly organized by the Broad Institute of MIT and Harvard, and the MIT Computer Science and Artificial Intelligence Lab (CSAIL). “We would like to cordially invite anyone to submit network inferences and answers to our new biological prediction challenges. To access the data sets and descriptions of the challenges, please go to http://wiki.c2b2.columbia.edu/dream/index.php/The_DREAM3_Challenges,” say Stolovitzky and DREAM3 co-organizer Andrea Califano.

Interest in DREAM is growing, reports Stolovitzky. The International Society for Computational Biology is creating a team of students to participate in DREAM, he says. He also says the organizers will probably ask researchers if they want to be identified next year. Given the suggestion that some well-regarded researchers fared less well, this might be an opportunity for others to gain notice.

In any case, for the foreseeable future, consistent prediction of underlying networks from just a few data types and using only computational techniques is still something of a dream.

___________________________________________________

This article appeared in Bio-IT World Magazine.
