Nov. 13, 2007 | Computational biology took a significant step forward recently as a group of researchers led by David Baker developed an in silico approach to accurately predicting the 3D structure of small proteins using only their amino acid sequences and NMR data. In fact, their new method was able to improve on some structures previously determined by X-ray crystallography.
The work is presented in a new Nature paper, “High-resolution structure prediction and the crystallographic phase problem” (Qian, B. et al., Nature, 14 October 2007, doi: 10.1038/nature06249).
The effort to develop computational approaches to protein structure prediction has a rich history, and Baker, a University of Washington researcher and Howard Hughes Medical Institute investigator, has long been a prominent figure in those efforts (see Computational Biologists Join the Fold, Bio•IT World, June 2002).
Over the past decade, much of Baker’s work has wound up in Rosetta, a software package for predicting protein structure. The code is free to academics and can be licensed by commercial organizations from the University of Washington.
Virtually all computational approaches to protein structure prediction attempt to identify minimum energy configurations. Often though, there are vexing protein segments for which the predicted models have a high degree of variability. Among other things, the new work tackled this problem.
“It’s as if you have this complex coil of rope, and there is a section that you think just doesn’t behave the way it should,” says Baker. “So you just cut it out, reconnect the ends, and computationally explore different conformations of just that section until you have a better model of its behavior.”
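The cut-and-rebuild idea Baker describes can be illustrated with a toy sketch. This is not Rosetta's algorithm or its energy function: each model is reduced to a list of per-residue numbers, and `toy_energy`, `most_variable_window`, `rebuild_segment` and the sample data are all hypothetical names and values chosen for illustration. The sketch finds the stretch where an ensemble of models disagrees most, then resamples only that stretch while keeping the rest of the model fixed.

```python
import random
from statistics import pvariance


def per_residue_variance(models):
    """Variance of each residue's value across an ensemble of models."""
    return [pvariance(column) for column in zip(*models)]


def most_variable_window(variances, width):
    """Start index of the contiguous window with the largest summed variance."""
    sums = [sum(variances[i:i + width])
            for i in range(len(variances) - width + 1)]
    return max(range(len(sums)), key=sums.__getitem__)


def toy_energy(model):
    """Hypothetical stand-in for an energy function (assumption, not physics):
    penalizes large differences between neighboring residues."""
    return sum((a - b) ** 2 for a, b in zip(model, model[1:]))


def rebuild_segment(model, start, width, steps=2000, seed=0):
    """Perturb only model[start:start + width]; keep a change when energy drops."""
    rng = random.Random(seed)
    best = list(model)
    best_energy = toy_energy(best)
    for _ in range(steps):
        trial = list(best)
        i = rng.randrange(start, start + width)  # index inside the segment only
        trial[i] += rng.gauss(0.0, 0.5)
        energy = toy_energy(trial)
        if energy < best_energy:
            best, best_energy = trial, energy
    return best, best_energy


# Made-up ensemble: the models agree everywhere except residues 2 and 3.
models = [
    [0.0, 0.1, 2.0, -1.5, 0.2, 0.1],
    [0.0, 0.1, -1.8, 1.9, 0.2, 0.1],
    [0.0, 0.1, 0.5, -0.4, 0.2, 0.1],
]
variances = per_residue_variance(models)
start = most_variable_window(variances, width=2)
refined, energy = rebuild_segment(models[0], start, width=2)
```

The accept-only-improvements loop is the simplest possible search and was chosen for brevity. A real refinement protocol would use a physically meaningful all-atom energy and a far more thorough sampling strategy.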
Baker says it is now possible to quite accurately predict the shape of smaller proteins. Indeed, his group accurately predicted the 3D shape of a protein with 112 amino acids using only its sequence data. The new approach is likely to be quickly adopted by both academic and commercial researchers, says Baker.
Here’s an excerpt: “[W]e present a new energy-based rebuilding and refinement method that consistently improves models derived from NMR, from sequence-distant templates, and from de novo folding methods. The final models include high-resolution features not present in the starting models, including the packing of core side chains. Bringing together these results from all-atom structure prediction with state-of-the-art algorithms for molecular replacement and automated rebuilding, we show that distant-template-based and de novo models can reach the accuracy required to solve the X-ray crystallographic phase problem.”
The paper “represents a real breakthrough,” wrote structural biologist Eleanor Dodson in a News & Views editorial also published online by Nature. Dodson writes, “This approach demonstrates real progress in several respects: the use of enormous computational power; the exploitation of known three-dimensional structures; the development of powerful search algorithms that relate those structures to new sequences; and the steadily improving tactics used to determine low-energy conformations of molecules.”
Interestingly, much of the actual computation was accomplished using the Rosetta@home project (http://boinc.bakerlab.org/rosetta/), in which more than 150,000 home computer users “donated” compute cycles through the distributed Berkeley Open Infrastructure for Network Computing (BOINC) platform. Not everyone has access to such a grand computing grid, but that’s not always necessary. “It depends on the problem. A modern computing cluster with tens of nodes would suffice in some cases,” says Baker.
Baker credits the biennial CASP (Critical Assessment of Techniques for Protein Structure Prediction) competition, in which groups computationally predict the 3D shapes of an array of target proteins from their amino acid sequences, with driving the protein structure prediction community forward. “[CASP] is absolutely critical. It provides an objective blind test of methods, and provides a clear assessment of what the current challenges in the field are,” he says.
So what’s next? “Good question!” he says wryly. “But predicting the future of structure prediction is even harder than structure prediction itself!” Cute.