August 2, 2011 | Collaborative Drug Discovery has developed a molecular library database to serve a network of over 100 tuberculosis researchers in the U.S. and Europe, helping their users mine and collaborate on tuberculosis data. Their efforts, nominated by the Tuberculosis Research Section, NIAID, NIH and the Global Alliance for TB Drug Development, earned them the 2011 Bio•IT World Best Practices Editors’ Choice Award.
With funding from the Bill and Melinda Gates Foundation and other investors, CDD collated at least 15 public datasets on Mycobacterium tuberculosis, representing well over 300,000 compounds derived from patents, literature and high-throughput screening data.
“Any new chemoinformatic system should provide, at minimum, capabilities for fundamental data storage, retrieval, and analysis of diverse data originating from chemistry, biology, pharmacology, and toxicology activities,” explained the TB Alliance in its nomination. “Ideally such a system would be Web-based so that any participating laboratory could use it without further investment in hardware. The system should be intuitive so that new participants can learn the system with minimum training. In addition to fundamental chemoinformatic tools, such a system should be able to enhance collaboration among researchers in the same field, the community.”
Enter Collaborative Drug Discovery. “The Gates Foundation made us a grant to fund specific groups that needed to use the software for collaborations either within the institute or between institutes or between institutes and companies,” explains Sean Ekins, CDD’s collaborations director. CDD took their cloud-based application and developed software specifically for the TB community. The initial two-year grant, awarded in 2008, has since been extended to five years.
CDD’s database allows collaborators to share research data securely within and across organizations without the need to install and maintain complex software. CDD runs on a fault-tolerant infrastructure providing redundant storage, compute nodes, power, HVAC, and backbone connections. The infrastructure is also protected by multiple layers of host-based, network and physical security measures. CDD software runs on a MySQL database and was written in the Ruby and Java programming languages. The tool was built using an agile development process with an integrated design-build-test cycle.
CDD’s TB database fosters data archiving and selective sharing within the research community and enhances creation of computational models, said the Tuberculosis Research Section in its nomination.
Having public and private screening data available against M. tuberculosis enables researchers to analyze the biological activity vs. physicochemical properties of compounds in the database, said the TB Alliance. “Consequently, this database has also been used to build novel computational machine learning and pharmacophore models that could be used to filter other libraries of molecules to rapidly identify potential M. tuberculosis-active compounds.”
The software allows users to segment their data into vaults that enable sharing with specific collaborators, says Ekins. “But there’s another component of the database. There’s a public side where we have some datasets, and we’ve done annotations around TB—sort of curation of data from the literature around compounds,” he says.
“This award acknowledges two years of software development and support for TB research groups funded by the Gates Foundation, and is a credit to all CDD users and community members who have helped guide our technology over the past seven years in the cloud,” said Barry Bunin, CDD’s founder.
“This gives me a rare opportunity to publicly recognize the exceptional accomplishments of our software development and product team. We would like to thank our nominators and collaborators, as well as the editors of Bio•IT World for this prestigious award!”
This article also appeared in the 2011 July-August issue of Bio-IT World.