DREAM3 Predictions (and Their Grades) Are In




SYSTEMS BIOLOGY

Oct. 23, 2008 | The results of the third DREAM (Dialogue on Reverse Engineering Assessment and Methods) challenges are in, and this year DREAM organizers are identifying the top performers while the others remain anonymous. Forty teams participated, making 413 predictions. The full results can be seen here, but a summary of the challenges and the top predictors, where there were any, is listed below.

“While DREAM2 was all about inference of biological networks (gene regulatory and protein-protein interactions), this year we concentrated more on modeling,” explains DREAM co-organizer Gustavo Stolovitzky. “We did so because last year there were some ‘voices of reason’ that suggested the somewhat philosophical point that assessments should be made on prediction of measurements (because that is what we obtain from the biological systems), and not on the underlying networks, which are abstractions. While I believe that there is merit to the inference of networks, I also appreciate the positivistic (in a philosophical sense) view of those who emphasize the observable over the theoretical.”

“This year there were two challenges on signaling pathways, and two on gene regulatory networks. Of the latter, one was designed in such a way that we can make a direct comparison with last year’s in silico network challenge. As a ‘community of predictors’ the average performance (measured with a scoring system that we developed) was no different this year than last year. However, the best performer in Challenge 4 this year did much better (that means with p-values orders of magnitude smaller) than the best performer of the equivalent challenge (also Challenge 4) last year. Both challenges used the same kind of datasets, except for one difference: this year’s in silico data were ‘noisy’, that is, measurement noise was added to the data to simulate the actual measurement noise. This made this year’s dataset a little more difficult than last year’s. Hats off to Kevin Yip, Roger Alexander, Koon-Kiu Yan, and Mark Gerstein, who developed the best-performing algorithm. They will present their methods at the upcoming DREAM3 conference on Oct. 31 at the Broad Institute.”

Stolovitzky adds, “It’s hard not to think in terms of evolution: eventually, the fittest algorithms will survive. And in DREAM, fitness is measured by this scoring system, which, even though it is not perfect, is a rather objective measure of performance. For each data set there may be a better adapted set of methods. And the combination of dataset (with mutants and time courses simulated as if measured by gene expression) and methods of the Challenge 4 best performers this year was phenomenal. So even though the community didn’t progress as a whole, there were a few gems this year that did show progress, and this is what DREAM aims at: to find the needle in the haystack.”

The answers to this year’s challenges can be downloaded here. (You have to register to download the data and the gold standards.) The best-performing teams will give talks explaining their winning strategies at the upcoming DREAM conference at the Broad Institute on Oct. 31, 2008. The program for the triconference RECOMB Systems Biology/RECOMB Regulatory Genomics/DREAM is here.

Challenge 1 - Signaling cascade prediction
The concentrations of four intracellular proteins or phospho-proteins (X1, X2, X3 and X4) participating in a signaling cascade were measured in about 10^4 cells by antibody staining and flow cytometry. The idea of this challenge is to explore what key aspects of the dynamics and topology of interactions of a signaling cascade can be inferred from incomplete flow cytometry data.
No team did better than a p-value of 0.11, so no team has been deemed a best performer.

Challenge 2 - Signaling response prediction
Approximately 10,000 intracellular measurements (fluorescence signals proportional to the concentrations of phosphorylated proteins) and extracellular measurements (concentrations of cytokines released in response to cell stimulation) were acquired in normal human hepatocytes and in cells of the hepatocellular carcinoma line HepG2. The datasets consist of measurements of 17 phospho-proteins (at 30 min and 3 hrs) and 20 cytokines (at 3 hrs and 24 hrs) in two cell types (normal and cancer) after perturbations to the pathway induced by the combinatorial treatment of seven stimuli and seven selective inhibitors. The goal of this signaling response challenge is to predict the response to perturbations of a signaling pathway in normal and cancer human hepatocytes, and there were two sub-challenges:

2A. The phospho-proteomics challenge - This challenge consists of predicting a subset of data points that have been measured but removed from the normal and cancer hepatocytes datasets. “Specifically…predict the concentration of the 17 phospho-proteins at two time points (at 30 min and 3 hrs) in each one of seven combinations of ligands and inhibitors for both the normal and cancer hepatocytes. As data, we provide the concentrations of all those 17 phospho-proteins for all the other combinations of ligands and inhibitors for both the normal and cancer hepatocytes…”

2B. The cytokine-release challenge - This challenge consists of predicting a subset of data points that have been measured but removed from the normal and cancer hepatocytes datasets. “Specifically, we ask the participants to predict the concentration of the 20 cytokines at two time points (3 hrs and 24 hrs) in each one of 7 combinations of ligands and inhibitors for both the normal and cancer hepatocytes. As data, we provide the concentrations of all those 20 cytokines for all the other combinations of ligands and inhibitors for both the normal and cancer hepatocytes…”

Two Top Teams: Team VITAL_SIB (Nicolas Guex, Eugenia Migliavacca and Ioannis Xenarios, Swiss Institute of Bioinformatics, Switzerland); Team GenomeSingapore (Guillaume Bourque and Neil D. Clarke, Genome Institute of Singapore)

Challenge 3 - Gene expression prediction
Gene expression time course data are provided for four different strains of yeast (S. cerevisiae) after perturbation of the cells. The challenge is to predict the rank order of induction/repression of a small subset of genes (the “prediction targets”) in one of the four strains, given complete data for three of the strains and data for all genes except the prediction targets in the other strain. Predictors are also allowed to use any information that is in the public domain but are expected to be forthcoming about what information was used.

Two Top Teams: Team GustafssonHörnquistSweden (Mika Gustafsson and Michael Hörnquist, Linköping University, Sweden); Team dreamteam2008 (Jianhua Ruan, The University of Texas at San Antonio, San Antonio)
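
To make the Challenge 3 task concrete, here is a minimal sketch of how a predicted rank order could be compared with a measured one. It uses Spearman's rank correlation, which is an assumption chosen for illustration; the article does not describe the actual DREAM3 scoring system. The gene names and ranks are hypothetical.

```python
def spearman_rho(predicted_rank, true_rank):
    """Spearman rank correlation between two tie-free rankings.

    Both arguments map gene name -> rank (1 = most induced).
    Assumes each ranking is a permutation of 1..n with no ties.
    """
    genes = list(true_rank)
    n = len(genes)
    if n < 2 or set(predicted_rank) != set(genes):
        raise ValueError("rankings must cover the same set of >= 2 genes")
    # Sum of squared rank differences over all prediction targets.
    d2 = sum((predicted_rank[g] - true_rank[g]) ** 2 for g in genes)
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical prediction targets and ranks, for illustration only.
true_rank = {"geneA": 1, "geneB": 2, "geneC": 3, "geneD": 4}
predicted_rank = {"geneA": 2, "geneB": 1, "geneC": 3, "geneD": 4}

rho = spearman_rho(predicted_rank, true_rank)
print(f"Spearman rho = {rho:.2f}")
```

A value near 1 means the predicted order closely follows the measured order, a value near -1 means it is reversed, and a value near 0 means the two orders are unrelated.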

Challenge 4 - In silico network inference
The goal of the in silico challenges is the reverse engineering of gene networks from steady state and time series data. Participants are challenged to predict the directed unsigned network topology from the given in silico generated gene expression datasets. There are three in silico challenges, corresponding to gene networks with 10, 50, and 100 genes.

Predictions are assessed independently for each challenge. Thus, teams may choose to submit predictions for only one or two of the challenges. However, we encourage teams to participate in all three challenges in order to compare how well different methods perform on different network sizes. Each challenge consists of five gold standard networks. In order to participate in a challenge, predictions for all five networks of this challenge must be submitted. The rationale is that in this way it will be possible to assess how consistently a method predicts the topology in five independent networks of the same type and size.

Top Team: Kevin Y. Yip, Roger P. Alexander, Koon-Kiu Yan and Mark Gerstein, Yale University
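
As an illustration of what assessing a directed, unsigned topology prediction against a gold standard network can look like, the sketch below ranks predicted edges by confidence and computes precision, recall, and average precision. This is a generic stand-in and not the DREAM scoring system, which the article does not detail. The gene names (G1, G2, G3) and the edge lists are hypothetical.

```python
def precision_recall_curve(ranked_edges, gold_edges):
    """Return (precision, recall) after each prediction in the ranked list.

    Edges are (regulator, target) tuples, so (A, B) and (B, A) are
    different edges (directed). No activation/repression sign is
    recorded (unsigned).
    """
    gold = set(gold_edges)
    hits = 0
    curve = []
    for k, edge in enumerate(ranked_edges, start=1):
        if edge in gold:
            hits += 1
        curve.append((hits / k, hits / len(gold)))
    return curve


def average_precision(ranked_edges, gold_edges):
    """Mean of the precision values at the rank of each recovered gold edge."""
    gold = set(gold_edges)
    hits = 0
    total = 0.0
    for k, edge in enumerate(ranked_edges, start=1):
        if edge in gold:
            hits += 1
            total += hits / k
    return total / len(gold)


# Hypothetical 3-gene gold standard and a prediction ranked by confidence.
gold_edges = [("G1", "G2"), ("G2", "G3")]
ranked_edges = [("G1", "G2"), ("G2", "G1"), ("G2", "G3"), ("G3", "G1")]

curve = precision_recall_curve(ranked_edges, gold_edges)
ap = average_precision(ranked_edges, gold_edges)
print(f"average precision = {ap:.3f}")
```

Because each of the three challenges has five gold standard networks, a measure like this would be computed once per network and then compared across the five to see how consistently a method performs.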

----------------------------------

This article first appeared in Bio-IT World’s Predictive Biomedicine newsletter.
