By John Russell
Feb. 1, 2008 | In the closing days of 2007, an impressive piece of systems biology work was published in the journal Cell, in which researchers developed a predictive model for a free-living cell, in this case the archaeon Halobacterium salinarum NRC-1. What’s more, the authors suggest that even though their model is for a relatively simple organism (~2400 genes), the approach used to build it can probably be used to tackle more complex organisms.
The authors of the paper describe a predictive model called EGRIN (Environmental and Gene Regulatory Influence Network). They used a data-driven discovery approach to determine regulatory and functional interrelationships among roughly 80 percent of NRC-1’s genes. As they write:
“Using relative changes in 72 transcription factors and 9 environmental factors (EFs) this model accurately predicts dynamic transcriptional responses of all these genes in 147 newly collected experiments representing completely novel genetic backgrounds and environments — suggesting a remarkable degree of network completeness. Using this model we have constructed and tested hypotheses critical to this organism’s interaction with its changing hypersaline environment. This study supports the claim that the high degree of connectivity within biological and EF networks will enable the construction of similar models for any organism.”
It is perhaps unsurprising that much of the work was done at the Institute for Systems Biology (ISB), led by current ISB researcher Nitin Baliga and by Richard Bonneau, a former ISB researcher now at the Center for Comparative Functional Genomics, New York University. Indeed, this was a classic systems biology exercise, as espoused by ISB founder Lee Hood, a co-author (see What Is Systems Biology? Bio•IT World, September 2007). The work involved global (genome-wide) measurements; quantitative and dynamic measurements; careful genetic and environmental perturbation of the system; integration of different data types; and, of course, adherence to the systems biology cycle of perturbation-measurement-model-hypothesis-perturbation.
Construction and Validation
There were several hurdles. For example, roughly 38 percent of NRC-1’s genes had little or no functional assignment. The group incorporated functional relationships from comparative genomics, as well as predicted structural and domain similarities, until achieving “nearly 90 percent... meaningful association with either a characterized protein, a protein family, or a structural fold.” Similar techniques were used to boost the number of putative transcription factors.
The networks were constructed from 266 microarray experiments, and a further 147 microarray experiments were used to validate the model’s predictions. Network construction was based on the “Inferelator algorithm” (catchy name), developed in large measure by Bonneau. The authors note that the number of experiments required was relatively modest given EGRIN’s high accuracy. They suggest this reflects the interdependence of many networks and the possibility that, at least for metabolism, cells usually function in one or a few dominant states.
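To make the construct-then-validate design concrete, here is a minimal sketch in Python. It is not the Inferelator and does not use the study's data: it fits a plain ridge regression (a stand-in technique) on synthetic values, borrowing only the counts quoted in this article (72 transcription factors plus 9 environmental factors as predictors, 266 experiments for fitting, 147 for validation). The function names, the noise level, and the handful of nonzero "true" influences are all hypothetical.

```python
import numpy as np

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression: solve (X^T X + alpha*I) w = X^T y."""
    n_features = X.shape[1]
    A = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)

def make_synthetic(n_experiments, true_weights, rng, noise=0.1):
    """Simulate predictor changes and one gene's response (illustrative only)."""
    X = rng.normal(size=(n_experiments, true_weights.size))
    y = X @ true_weights + noise * rng.normal(size=n_experiments)
    return X, y

rng = np.random.default_rng(0)

# 72 transcription factors + 9 environmental factors, as quoted in the article.
n_predictors = 72 + 9

# Assumption: only a few predictors truly influence this hypothetical gene.
true_weights = np.zeros(n_predictors)
true_weights[:5] = [1.0, -0.8, 0.6, -0.5, 0.4]

# 266 experiments to build the model, 147 new ones to validate it.
X_train, y_train = make_synthetic(266, true_weights, rng)
X_test, y_test = make_synthetic(147, true_weights, rng)

weights = fit_ridge(X_train, y_train)
predicted = X_test @ weights

# Agreement between predicted and measured response on unseen experiments.
correlation = np.corrcoef(predicted, y_test)[0, 1]
print(f"held-out correlation: {correlation:.2f}")
```

The point of the held-out set is that the model is scored only on experiments it never saw during fitting, which is the sense in which the 147 validation experiments test prediction rather than fit.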
“What is powerful about this approach is that it took under six years to move from genome sequence to this level of understanding for a relatively poorly-studied organism. Indeed, it would be significantly quicker to implement the same approach with a newly sequenced organism given that much of the scientific methods including experimental procedures, algorithms, and software have been delineated through our study,” write the authors.
Of course, there’s still work to be done on EGRIN. Many other regulatory mechanisms (small RNAs, epigenetic modifications, post-translational modifications, metabolite-based feedback) are not included, and their absence may account, at least in part, for the model’s failure to predict the behavior of the remaining 20 percent or so of the genes.
Further Reading: Bonneau et al. 2007. Cell 131: 1354-65.
This article appeared in Bio-IT World Magazine.