November 16, 2010 | While there are occasional glimmers of pre-competitive cooperation between big pharma companies, few projects can match the tangible benefits achieved lately by Barry Bunin, Sean Ekins and colleagues at Collaborative Drug Discovery (CDD).
Chief executive Bunin says CDD has reached a tipping point following the recent release of chemical datasets on malaria by GlaxoSmithKline (GSK) and tuberculosis (TB) by Novartis. But without a doubt the project Bunin and Ekins are most excited about is a new open model with Pfizer, which suggests that CDD’s approach can be extended to commercially relevant drug discovery, without disclosing proprietary chemical structures. CDD currently houses information on more than 3 million molecules.
“We think this is game changing,” says Bunin, “an obvious experiment no one has tried before. Now others will see this and it can catalyze something beyond what we’ve done with Pfizer for the whole industry.”
One can think of CDD as a scientific Facebook or LinkedIn, “a conversation around the data,” says Bunin, highlighting the matchmaking aspect of his organization that puts biologists in touch with chemists and vice versa. “We have no wet labs, and we don’t need to—there are enough wet labs out there! There’s more work to do than we have hours in the day, just doing the informatics damn good!”
“A lot of academics had great ideas but couldn’t do drug development,” says Bunin. “We want to provide the infrastructure of a big pharma for the little guy—a full solution on the informatics and collaboration front. That was the idea of CDD.”
CDD has signed up thousands of people, says Bunin, including UCSF, Columbia, Harvard, Cornell, Johns Hopkins, UCLA, and many academic screening centers. “A lot of innovation is happening at the outskirts,” says Bunin, “and we want to handle all their data. If they’re the world’s best biologist but don’t know the Lipinski Rule-of-Five, we have a computational chemist—Sean Ekins—who understands the science and can form hypotheses and come up with new ideas. We complement people, either with technology they don’t have or the people or community they don’t have.”
Earlier this year, CDD added a new dimension to its offerings when GSK decided to open up its malaria data, making some 13,500 compounds from its Tres Cantos facility in Spain (see “Genomics provides the kick inside,” Bio•IT World, Nov 2003) freely available to scientists (through CDD and other sources) in the hope that other groups might approach certain candidates in ways that complement or differ from its own approach. The Wall Street Journal likened the idea to the pharma equivalent of the Linux operating system. CDD’s Sylvia Ernst blogged, “The world of drug discovery has officially changed today… WOW!”
The GSK release showed how “public data sharing invigorates the entire drug discovery ecosystem,” says Bunin, prompting CDD to provide free access to researchers interested in archiving and publishing all their data for the greater good of the community. “It’s a beautiful story, the intersection of commercial, economics, and humanitarian goals to combat malaria,” says Bunin. “With the barriers to precompetitive data sharing coming down, it is our wish to continue to receive and publish new data sets useful both to our client base and to the scientific community at large.”
Tackling TB Worldwide
Thanks to a $2 million grant from the Gates Foundation for TB research (on top of support from Lilly and the Founders Fund), Bunin has made CDD available to TB researchers worldwide. For TB, which claims some 1.8 million lives annually, “the main challenge is overcoming resistance and shortening the therapy,” says Bunin. “If you have to walk 20 miles to get a drug, you’re not going to do that.”
The CDD community database is currently home to chemical screening data on nearly 1.5 million small molecules with associated cheminformatics properties from pharma, academia, literature and patents – ranging from malaria SAR data dating back to World War II to gene-family wide G-protein coupled receptor SAR, to the most recent results on Novartis’ anti-bacterial compounds.
The Pfizer collaboration that has Bunin and Ekins so excited dates back to late last year, when the pair asked Pfizer’s Chris Waller and Eric Gifford if they had ever tried open tools or descriptors for building computational models for various molecular properties such as ADME/Tox. They said no. Ekins recalls, “This set in motion rigorous comparative studies by Pfizer’s Rishi Gupta that leveraged their massive high-throughput screening datasets for things like absorption, metabolic stability, toxicity etc.”
Gupta found that comparable quality computational models could be generated using very large datasets (50,000 to 200,000 compounds) whether using open-type descriptors or commercial molecular descriptors. “We expected the commercial descriptors to be so much better than anything free, but in these examples, that was not the case,” says Ekins.
Why should this matter to a company like CDD, which is primarily interested in fostering data sharing and collaboration for neglected diseases? “Open descriptors and algorithms could enable the sharing of computational models between groups such as pharmas and academics working on neglected diseases like TB, malaria etc,” says Ekins. “These neglected disease researchers don’t have the luxury of such ADME/Tox models that could provide insights that might help produce better clinical candidates faster. Pfizer provided a proof of concept.”
Bunin thinks the next big idea is to facilitate pharmas sharing their computational models (based on molecules they probably don’t want to share) with outside scientists to score their compounds for ADME/tox issues. “That may be a way off, but the proof-of-concept work now shows we do not need to use expensive commercial tools... This will make it easy to share and make such models available without any commercial software with anyone in the world.”
In addition to benefiting neglected disease constituencies, CDD hopes to encourage pharmas to share metadata with other pharmas without revealing the composition of matter that underpins the margins on successful drugs. This could help pharmas examine the molecular properties that can make the difference between clinical success and failure.
After ensuring that no one can back-calculate the actual molecular structures from the models, Ekins says they hope to “enable coverage of the massive chemical space and ultimately enable better predictions.” Provided the compound structures stay protected, that could “open up priceless data” drawn from the best datasets of the biggest pharmas, data that academics could never generate on this scale.
“Free technologies on the web for this kind of thing are just as good as commercial software costing big companies millions of dollars in license fees. Therefore they can do the same modeling at zero cost. If this is the case here, there may be other places they can cut costs using free tools that the companies have not explored aggressively”—something shareholders should demand, Ekins says.
He adds that the study shows that pharmas can collaborate and share their models, so they don’t have to spend large sums of money doing the same kinds of repetitive work. “Folks will be competing where and only where they are really innovating,” he says. “This is how the open source meme plays out in a way that works in this complex IP space.”
This raises the question: Could pharmas collaborate and share their models so that they do not all have to spend money doing the same kinds of repetitive work separately in each company? “We think this is game changing,” says Ekins. “We just did the obvious experiment no one tried before. Now others will see this and it could catalyze something beyond what we did in this collaboration with Pfizer.”
Bunin says that more than six years ago CDD became one of the first organizations to use the cloud for drug discovery data, through a hosted system it calls the CDD Vault. “We had GSK’s private data for six months before they went public with it,” he says. The CDD Vault allows his team to selectively share data with anyone around the world. “You can log into multiple vaults, push experiments and collaborate from vault to vault.” (Bunin says anyone with Excel can get data in and out just by hovering over the molecule, with no plug-ins required.)
“We don’t host it in our own backyard,” he says. “We have a professional co-location facility, with armed security guards and concrete walls, similar to Facebook. The process of having a hosted system has become a commodity today. In our case, it’s such a custom app, we didn’t want it out in Amazon. We have our own physical server, it’s backed up. If we had ten times as many molecules, we might do it differently, but it’s worked for thousands of people so far.”
CDD Public contains some 52 datasets (at last count), including 1,700 drugs from ex-Pfizer chemist Chris Lipinski—the first such submission. “We don’t have any labs, we’re just facilitating, but we decided on our own dime to buy the compounds,” says Bunin. “And we found known drugs that could almost completely reverse resistance against these strains of malaria. It’s been through Phase 2 trials, and that could save years off a new drug from scratch. If you’re a 4-year-old with a resistant form of malaria, maybe you could take this drug that already exists!”
Meanwhile, Brian Roth (University of North Carolina), whom Bunin calls “one of the best CNS researchers in the world,” has supplied data on more than 45,000 compounds targeting G-protein-coupled receptors. “It’s a dataset we’re proud of, but it’s one of over 52 public datasets,” says Bunin, with the Novartis data being the latest addition.
Bunin is anxious to get his story out in the hope of bringing other pharmas along, which he says “will allow the whole industry (and thus human health) to take a giant leap forward. Now folks can both collaborate and compete at the same time!” As for future goals, Bunin hopes to keep attracting more users and more content, and spurring further collaboration within the pharmaceutical community.
This article also appeared in the November-December 2010 issue of Bio-IT World Magazine.