As genome-sequencing projects continue to churn out reams of data every day, researchers may come to depend on such automated processors to predict the structures of the thousands of proteins produced by even the simplest organisms. In CASP4, the automated servers did well on targets whose sequences were at least 40 percent similar to those of proteins of known structure, but their predictions fell short on proteins with unfamiliar sequences or novel folds. "We still need an expert to make judgments and push the right buttons," says Ajay Royyuru, head of the protein structure prediction group at IBM's Thomas J. Watson Research Center.
And maybe a human expert is all you need — if that expert is Alexey Murzin of the Medical Research Council's Laboratory of Molecular Biology in Cambridge, England. In a process he describes as part theory and part intuition, Murzin can eyeball an amino-acid sequence and see subtle patterns, telltale signs of a protein's evolutionary origin, that allow him to place the target into an already existing protein superfamily. Using other family members as templates, he then cobbles together a 3-D structure that looks good. In CASP4, Murzin's models placed among the top predictions for each target. "If we could codify the information in his head," says Howard Hughes Medical Institute investigator David Baker, "we'd be in great shape."
Until then, most CASP participants will continue to work toward automating their approach to structure prediction. If programs could do the bulk of the heavy lifting — examining a protein's sequence and choosing the most appropriate method for approximating its structure — without human intervention, researchers could turn their attention to more challenging problems, such as designing new proteins or learning how mutant proteins precipitate disease.