Oct. 23, 2008 | The results for the third DREAM (Dialogue on Reverse Engineering Assessment and Methods) challenges are in, and this year DREAM organizers are identifying the top performers while others remain anonymous. Forty teams participated, submitting 413 predictions. The full results can be seen here, but a summary of the challenges and the top predictors, where there were any, is listed below.
“While DREAM2 was all about inference of biological networks (gene regulatory and protein–protein interactions), this year we concentrated more on modeling,” explains DREAM co-organizer Gustavo Stolovitzky. “We did so because last year there were some ‘voices of reason’ who suggested the somewhat philosophical point that assessments should be made on predictions of measurements (because that is what we obtain from the biological systems), and not on the underlying networks, which are abstractions. While I believe that there is merit to the inference of networks, I also appreciate the positivistic (in a philosophical sense) view of those who emphasize the observable over the theoretical.”
“This year there were two challenges on signaling pathways and two on gene regulatory networks. Of the latter, one was designed in such a way that we can make a direct comparison with last year’s in silico network challenge. As a ‘community of predictors,’ the average performance (measured with a scoring system that we developed) was no different this year than last. However, the best performer in Challenge 4 this year did much better (that is, with p-values orders of magnitude smaller) than the best performer of the equivalent challenge (also Challenge 4) last year. Both challenges used the same kind of datasets, except for one difference: this year’s in silico data were ‘noisy,’ that is, measurement noise was added to the data to simulate actual measurement noise. This made this year’s dataset a little more difficult than last year’s. Hats off to Kevin Yip, Roger Alexander, Koon-Kiu Yan, and Mark Gerstein, who developed the best-performing algorithm. They will present their methods at the upcoming DREAM3 conference on Oct. 31 at the Broad Institute.”
Stolovitzky adds, “It’s hard not to think in terms of evolution: eventually, the fittest algorithms will survive. And in DREAM, fitness is measured by this scoring system, which, even though it is not perfect, is a rather objective measure of performance. For each dataset there may be a better-adapted set of methods. And the combination of dataset (with mutants and time courses simulated as if measured by gene expression) and methods of this year’s Challenge 4 best performers was phenomenal. So even though the community didn’t progress as a whole, there were a few gems this year that did show progress, and this is what DREAM aims at: to find the needle in the haystack.”
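The article does not detail the scoring system Stolovitzky mentions. As a rough illustrative sketch only: DREAM-style assessments typically assign each predicted network a p-value against a null of random predictions and aggregate across networks. The aggregation rule and numbers below are assumptions for illustration, not the organizers' actual formula.

```python
import math

def overall_score(p_values):
    """Combine per-network p-values into one score as the mean of
    -log10(p). Larger scores mean the predictions are less likely
    to have arisen by chance. (Illustrative convention, not the
    official DREAM3 formula.)"""
    return sum(-math.log10(p) for p in p_values) / len(p_values)

# Five hypothetical per-network p-values for one team's submission
print(overall_score([1e-4, 1e-6, 1e-3, 1e-5, 1e-7]))
```

Under this kind of convention, a best performer whose p-values are "orders of magnitude smaller" than last year's translates directly into a visibly higher overall score.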
The answers to this year’s challenges can be downloaded here. (You have to register to download the data and the gold standards.) The best performer teams will give talks explaining their winning strategies at the upcoming DREAM conference at the Broad Institute on Oct. 31, 2008. The program for the triconference RECOMB Systems Biology/RECOMB Regulatory Genomics/DREAM is here.
Challenge 1 - Signaling cascade prediction
The concentrations of four intracellular proteins or phospho-proteins (X1, X2, X3, and X4) participating in a signaling cascade were measured in about 10,000 cells by antibody staining and flow cytometry. The idea of this challenge is to explore what key aspects of the dynamics and topology of interactions of a signaling cascade can be inferred from incomplete flow cytometry data.
No team did better than a p-value of 0.11, so no team has been deemed a best performer.
Challenge 2 - Signaling response prediction
Approximately 10,000 intracellular measurements (fluorescence signals proportional to the concentrations of phosphorylated proteins) and extracellular measurements (concentrations of cytokines released in response to cell stimulation) were acquired in normal human hepatocytes and in the hepatocellular carcinoma cell line HepG2. The datasets consist of measurements of 17 phospho-proteins (at 30 min and 3 hrs) and 20 cytokines (at 3 hrs and 24 hrs) in two cell types (normal and cancer) after perturbations to the pathway induced by the combinatorial treatment of seven stimuli and seven selective inhibitors. The goal of this signaling response challenge is to predict the response to perturbations of a signaling pathway in normal and cancer human hepatocytes, and there were two sub-challenges:
2A. The phospho-proteomics challenge - This challenge consists of predicting a subset of data points that have been measured but removed from the normal and cancer hepatocytes datasets. “Specifically…predict the concentration of the 17 phospho-proteins at two time points (at 30 min and 3 hrs) in each one of seven combinations of ligands and inhibitors for both the normal and cancer hepatocytes. As data, we provide the concentrations of all those 17 phospho-proteins for all the other combinations of ligands and inhibitors for both the normal and cancer hepatocytes…”
2B. The cytokine-release challenge - This challenge consists of predicting a subset of data points that have been measured but removed from the normal and cancer hepatocytes datasets. “Specifically, we ask the participants to predict the concentration of the 20 cytokines at two time points (3 hrs and 24 hrs) in each one of seven combinations of ligands and inhibitors for both the normal and cancer hepatocytes. As data, we provide the concentrations of all those 20 cytokines for all the other combinations of ligands and inhibitors for both the normal and cancer hepatocytes…”
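Both sub-challenges are, in effect, matrix-completion tasks: predict held-out measurements from the observed combinations of ligands and inhibitors. A minimal baseline, assuming the data are arranged as per-condition analyte measurements (the condition and protein names below are illustrative, not the challenge's actual format), is to copy each missing value from the observed condition whose shared measurements are most similar:

```python
import math

def impute(data, missing):
    """data: {condition: {analyte: value}}, where held-out conditions
    lack some analytes. For each held-out condition, find the most
    similar fully observed condition (1-nearest neighbor by Euclidean
    distance over shared analytes) and copy its values for the missing
    analytes. Assumes the neighbor actually measured those analytes."""
    predictions = {}
    for cond, absent in missing.items():
        best, best_d = None, float("inf")
        for other, vals in data.items():
            if other == cond:
                continue
            shared = [a for a in data[cond] if a in vals]
            if not shared:
                continue
            d = math.sqrt(sum((data[cond][a] - vals[a]) ** 2 for a in shared))
            if d < best_d:
                best, best_d = other, d
        predictions[cond] = {a: data[best][a] for a in absent}
    return predictions

# Toy example: predict 'p-ERK' for the held-out condition 'TNF+inhibitor'
obs = {
    "TNF":           {"p-AKT": 1.0, "p-ERK": 2.0},
    "IL6":           {"p-AKT": 5.0, "p-ERK": 9.0},
    "TNF+inhibitor": {"p-AKT": 1.1},  # p-ERK held out
}
print(impute(obs, {"TNF+inhibitor": ["p-ERK"]}))
```

Real entries would exploit pathway structure and the inhibitor/stimulus design rather than raw similarity, but this captures the shape of the prediction task.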
Two Top Teams: Team VITAL_SIB (Nicolas Guex, Eugenia Migliavacca and Ioannis Xenarios, Swiss Institute of Bioinformatics, Switzerland); Team GenomeSingapore (Guillaume Bourque and Neil D. Clarke, Genome Institute of Singapore)
Challenge 3 - Gene expression prediction
Gene expression time course data are provided for four different strains of yeast (S. cerevisiae) after perturbation of the cells. The challenge is to predict the rank order of induction/repression of a small subset of genes (the “prediction targets”) in one of the four strains, given complete data for three of the strains and data for all genes except the prediction targets in the fourth strain. Predictors are also allowed to use any information that is in the public domain but are expected to be forthcoming about what information was used.
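One simple baseline for this kind of rank-order prediction, assuming the target genes behave similarly across strains (gene names and values below are illustrative, not the challenge data), is to rank the targets by their mean log2 fold change in the three fully observed strains:

```python
def predicted_rank(fold_changes_by_strain, targets):
    """Rank target genes from most induced to most repressed by their
    mean log2 fold change across the observed strains.
    fold_changes_by_strain: {strain: {gene: log2 fold change}}"""
    n = len(fold_changes_by_strain)
    mean_fc = {
        g: sum(fc[g] for fc in fold_changes_by_strain.values()) / n
        for g in targets
    }
    return sorted(targets, key=lambda g: mean_fc[g], reverse=True)

# Toy data for three observed strains (illustrative gene names)
observed = {
    "strain1": {"GAT1": 2.0, "MEP2": -1.0, "DAL5": 0.5},
    "strain2": {"GAT1": 1.5, "MEP2": -0.5, "DAL5": 1.0},
    "strain3": {"GAT1": 2.5, "MEP2": -1.5, "DAL5": 0.0},
}
print(predicted_rank(observed, ["GAT1", "MEP2", "DAL5"]))
# → ['GAT1', 'DAL5', 'MEP2']
```

Competitive entries would also model how the fourth strain's perturbation differs from the others; this baseline ignores strain-specific effects entirely.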
Two Top Teams: Team GustafssonHörnquistSweden (Mika Gustafsson, Michael Hörnquist, Linköping University, Sweden); Team dreamteam2008 (Jianhua Ruan, The University of Texas at San Antonio, San Antonio)
Challenge 4 – In silico network inference
The goal of the in silico challenges is the reverse engineering of gene networks from steady state and time series data. Participants are challenged to predict the directed unsigned network topology from the given in silico generated gene expression datasets. There are three in silico challenges, corresponding to gene networks with 10, 50, and 100 genes.
Predictions are assessed independently for each challenge. Thus, teams may choose to submit predictions for only one or two of the challenges. However, we encourage teams to participate in all three challenges in order to compare how well different methods perform on different network sizes. Each challenge consists of five gold standard networks. In order to participate in a challenge, predictions for all five networks of this challenge must be submitted. The rationale is that in this way it will be possible to assess how consistently a method predicts the topology in five independent networks of the same type and size.
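Many entries to network-inference challenges like this start from a simple relevance-network baseline: score every ordered gene pair by the absolute correlation of their expression profiles and submit the edge list ranked by confidence. A hedged sketch follows (this is a generic baseline, not the winning team's method; correlation is symmetric, so it scores both directions of an edge equally, and the gene names and data are illustrative):

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rank_edges(expr):
    """expr: {gene: [expression values across conditions]}.
    Score every ordered gene pair by |Pearson correlation| and return
    the edges sorted most-confident first, mirroring the challenge's
    ranked-edge-list submission format."""
    genes = list(expr)
    edges = [
        (g1, g2, abs(pearson(expr[g1], expr[g2])))
        for g1 in genes for g2 in genes if g1 != g2
    ]
    return sorted(edges, key=lambda e: e[2], reverse=True)

# Toy 3-gene example: G2 tracks G1, G3 is unrelated
expr = {
    "G1": [1.0, 2.0, 3.0, 4.0],
    "G2": [1.1, 2.1, 2.9, 4.2],
    "G3": [3.0, 1.0, 4.0, 1.5],
}
print(rank_edges(expr)[0][:2])  # top-ranked ordered pair
```

Baselines like this cannot orient edges or separate direct from indirect regulation, which is precisely where the best-performing methods distinguish themselves.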
Top Team: Kevin Y. Yip, Roger P. Alexander, Koon-Kiu Yan and Mark Gerstein, Yale University
This article first appeared in Bio-IT World’s Predictive Biomedicine newsletter. Click here for a free subscription.