May 12, 2006 | The Bioinformatics Organization presented its 2006 Benjamin Franklin Award to Professor Michael Ashburner of the University of Cambridge. Jeff Bizzaro, president of Bioinformatics.Org, presented the award to Ashburner at Bio-IT World’s Life Sciences Conference + Expo, held in Boston on April 3-5.
A noted Drosophila researcher who helped lead the project to sequence the fruit fly genome in the late 1990s, Ashburner was lauded for his steadfast championing of open-source resources for the genetics and informatics communities.
He is a driving force behind the Gene Ontology (GO) Consortium, which was launched in 1998 with three charter databases dedicated to gene ontologies for Drosophila, yeast, and mouse. That number has grown to about 20 databases, including all major model organisms, the TIGR database, and numerous human databases, including Ensembl, NCBI, Incyte, Celera, and Compugen.
GO now includes some 20,000 terms, covering all aspects of molecular function. “About half a million gene products [are] annotated across the entire living universe from viruses to humans by a skilled curator with a GO term,” said Ashburner. For example, if a scientist wants to find information on tyrosine kinases, these data can be retrieved despite the primary annotations being stored in 20 different databases.
“This is what we set out to achieve, a de facto integration of these very disparate databases,” said Ashburner.
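As a rough illustration of what that integration means in practice, the Python sketch below looks up one GO term across several annotation tables. The database names, gene product names, and table layout are all hypothetical stand-ins, not any real database's schema; only the term ID GO:0004713 (protein tyrosine kinase activity) is a real GO identifier.

```python
from collections import defaultdict

# Toy annotation tables, one per hypothetical source database.
# Each maps a made-up gene product name to the GO term IDs a curator assigned.
ANNOTATIONS = {
    "fly_db": {"fly_gene_1": {"GO:0004713"}, "fly_gene_2": {"GO:0003677"}},
    "mouse_db": {"mouse_gene_1": {"GO:0004713", "GO:0003677"}},
    "yeast_db": {"yeast_gene_1": {"GO:0003677"}},
}


def find_by_go_term(annotations, go_id):
    """Return {database: sorted gene products} annotated with go_id."""
    hits = defaultdict(list)
    for db_name, table in annotations.items():
        for gene_product, terms in table.items():
            if go_id in terms:
                hits[db_name].append(gene_product)
    return {db: sorted(genes) for db, genes in hits.items()}


# GO:0004713 is "protein tyrosine kinase activity".
tyrosine_kinases = find_by_go_term(ANNOTATIONS, "GO:0004713")
print(tyrosine_kinases)
```

Because every table uses the same term identifiers, one query works across all of them without any per-database translation.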
Despite his disdain for financially driven science, Ashburner singled out Ken Fasman, vice president of R&D informatics at AstraZeneca, for his role in providing GO seed money at the end of 1998. Ashburner said, “It was the most extraordinary contract the European Molecular Biology Laboratory ever signed with a commercial company: Unless everything [produced] with this money was put in the public domain…they wanted their money back!”
Ashburner discussed some new ontology initiatives. OBO, which stands for “open biomedical ontologies,” is a “meta-ontology….[and] includes about 50 ontologies contributed by the academic community.” OBO is moving to the National Center for Biomedical Ontology (NCBO), funded by NHGRI last fall.
Say It Ain’t SO
Surprisingly, in this postgenomic era, Ashburner noted that there is no good Sequence Ontology (SO). “Other than the GenBank feature table, there are no [communally] agreed sequence definitions for sequence annotation,” such as how to define a pseudogene.
Ashburner and colleagues have established a Sequence Ontology that allows formal definitions of coding regions, promoter regions, splice sites, and other sequence motifs. “Damn it, you have to be consistent if you want to compute on the data,” said Ashburner. “You cannot now go to GenBank or the model org databases and say, ‘I want to know the number of alternatively spliced genes in four model organisms.’ It can’t be done! But when the SO is done, it will be about a half a line of Perl.”
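To make the point concrete, here is a minimal Python sketch of the kind of query Ashburner describes, assuming every database labeled its features with the same agreed types. The records, identifiers, and the working definition used here (a gene with more than one mRNA transcript counts as alternatively spliced) are illustrative assumptions, not the Sequence Ontology's own definitions or any database's real format.

```python
from collections import defaultdict

# Toy feature records: (organism, gene_id, transcript_id, feature_type).
# All identifiers are invented; feature_type stands in for a shared SO term.
FEATURES = [
    ("fly", "g1", "t1", "mRNA"),
    ("fly", "g1", "t2", "mRNA"),
    ("fly", "g2", "t3", "mRNA"),
    ("mouse", "g3", "t4", "mRNA"),
    ("mouse", "g3", "t5", "mRNA"),
    ("mouse", "g4", "t6", "pseudogene"),
]


def count_alternatively_spliced(features):
    """Count genes per organism that have more than one mRNA transcript."""
    transcripts = defaultdict(set)
    for organism, gene_id, transcript_id, feature_type in features:
        if feature_type == "mRNA":  # relies on consistent typing across sources
            transcripts[(organism, gene_id)].add(transcript_id)

    counts = defaultdict(int)
    for (organism, _gene_id), ids in transcripts.items():
        if len(ids) > 1:
            counts[organism] += 1
    return dict(counts)


print(count_alternatively_spliced(FEATURES))
```

The counting itself is trivial; the hard part, and the reason for the ontology, is getting every source to mean the same thing by each feature type.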
Ashburner concluded by noting that 2006 is the centenary of the first paper published on Drosophila. He recalled how his efforts to help sequence the fruit fly genome were “thrown off course in May 1998” after Craig Venter formed Celera Genomics.
“That involved a very different sort of problem when considering openness in science…we had no option but to collaborate with Craig on sequencing the fly genome. There was no way we could compete. Craig’s motivation was to make money for the shareholders… Our motivation was to provide the sequence openly.”
The collaboration between Celera and the fruit fly genome researchers, Ashburner among them, led to the publication of the Drosophila genome in Science in March 2000. Somewhat bashfully, Ashburner plugged his new book, Won for All: How the Drosophila Genome Was Sequenced, published by Cold Spring Harbor Laboratory Press. One reviewer, he noted with a wink, has likened the book to “Bridget Jones for Geeks.”