
Technology Overload - Inundated with new tools and mountains of data, the pharmaceutical industry struggles to pull it all together and streamline drug discovery.  

By Mark D. Uehling

March 10, 2003 | Gautam Sanyal is head of biochemistry and protein science at AstraZeneca Pharmaceuticals LP. For all the leads his group of infectious-disease researchers has mined from genomics, the group's output — its production of potential new drugs — has not risen in tandem. "The productivity has not gone up," said Sanyal, addressing a biotech seminar in Boston. In fact, there are more potential leads to investigate than before: "It's a bad sign that that's going up," he conceded. The good news? The leads are better than in the past: "The quality has gone up," he said.

The proliferation of biological targets is occurring in companies beyond AstraZeneca — and is symptomatic of a deeper problem that is delaying the development of new drugs. Wondrous as they are, genomic databases and automated laboratory instruments have uncovered more intriguing, tantalizing avenues than it is possible to explore. Scientists have succeeded in inundating themselves in mounds of new data. It's the shovels that are evolving — but imperfectly.

Software to analyze DNA sequence, say, may be exemplary on its own terms but might not interact with gene expression tools, which doesn't help physicists figure out the structure of a complementary protein or help biologists predict what a potential new drug will do in the liver of a mouse or a pregnant woman. The upshot is that the process is obstructed not only by scientific uncertainty but also by the very tools that facilitate the process. Thus technology is not just failing to improve the discovery of new drugs. It may, perversely, be making the odds of finding new drugs worse.

At the level of specific projects, the key to accelerating drug discovery boils down to integration, defined with galactic breadth. In-house code needs to be integrated with commercial software. Scientific disciplines need to be integrated. Data from very public, government databases need to be integrated with a company's crown jewels in an Oracle database. Excel spreadsheets must be merged with molecular diagrams, which must be synchronized with charts depicting biological pathways.

Can more of it happen in a computer? "It's very hard to put it all together and very expensive to put it all together," says Roger Edwards, director of life sciences at TIAX LLC, the successor to Arthur D. Little. "People underestimated the complexity of integrating data when the scientific tools themselves were evolving."

Edwards adds that his clients are, in private, optimistic about their efforts — just more severe in their cross-examination of vendors. "You have a little more sophistication from the buyer's side," he says. "They're going, 'OK, you've said these things. Show me.'" The bad news? Edwards says it might take until 2010 or 2020 for the genomic and proteomic tools to be fully vindicated.

Got to Get to the Data 
Every vendor and analyst, every biotech and pharmaceutical company, insists that their own tools are perfect. It's those other fellows' applications or instruments that don't seem to be working. "Some companies say, 'We're going to build 27 new data marts,'" says David Butler, Spotfire Inc.'s vice president for product strategy. "There is this constant problem that nothing is ever integrated."

Spotfire, for its part, is not alone in trying to address the integration issue with, among other tools, its DecisionSite Posters. It's a central digital workbench for researchers in different departments across a company. Greg Tucker-Kellogg, associate director of informatics at Millennium Pharmaceuticals Inc., says that Spotfire is becoming a Microsoft-like standard at his company, where it is used in genomics and to assess high-throughput screening data and most other research data. Some of the bottlenecks have been dissolved, he says. "Spotfire has enabled us to make decisions about that quality-control network that just weren't there before."

Trouble is, the pharmaceutical industry has been investing in many platforms and tools and data warehouses for years — yet productivity is still steadily declining. So there is a teaspoon of skepticism about any assurances from the vendor community. Complicating the picture is the never-discussed quality of internal software projects within the biotech and pharmaceutical industries.

To be fair, the veil of secrecy over such IT projects is occasionally lifted at scientific meetings. A speaker from a major company will take 20 minutes to showcase a particularly promising application. But the veil drops again. No major company has shared enough information to allow objective observers to discern whether any pharmaceutical company has done a reasonable job of tying together data from disparate scientific fields, laboratory instruments, and software platforms. No one who knows where the bodies are buried — consultants, vendors, analysts — is at liberty to talk.

What is clear is that the pace of discovery research has slowed. "The number of new chemical entities (NCEs) for 2002 is going to be less than 2000," says Anthony J. Sinskey, a microbiologist at MIT and co-director of its program on the pharmaceutical industry. "This is an industry with severe problems. There is a lack of innovation."

Sinskey believes the industry needs better IT, better knowledge management, better bioinformatics expertise — in short, it needs to integrate data more effectively. "The pharmaceutical industry does not do a good job at this," Sinskey says. He notes that just 18 percent of drugs survived clinical trials in the 1990s, versus 23 percent in the old-fashioned 1980s, before anyone placed any great hope in genomics or bioinformatics.

Sinskey is polite, but others in the life sciences believe there are fundamentally dysfunctional aspects about the ways that vendors and large pharmas relate. At least one uber-bottleneck is obvious: scientific data congeal in one place — a lab, a machine, a database — before a researcher can analyze them in another.

For all its many IT wizards, the drug-development community understands it does not have the resources (financially or technically) to make data flow easily from one instrument or department to another. But the biotech and pharma companies are irritated that vendors do not always appear to understand the complexity of drug discovery.

George Laszlo does get it. As a director of life science solutions at CSC Global Health Solutions, he describes the present standoff between vendors and drug companies succinctly: "Large software firms don't listen enough to their clients and simply assume that some internally baked great technical idea will get the job done. Small firms develop an application that one client wants and then assume that everyone else will also want it."

Which is not to say Laszlo lets drug developers off the hook. "Biotech and pharma companies are also naive on one or more fronts," he says. "They deliberately stay away from the hard questions related to culture, politics, resistance to change, ROI, and concentrate too much on technology. They make buy decisions for the wrong reasons: 'The vendor is a safe choice. The sales person is a good friend. The money needs to be spent this year.'"

Trouble in Platform City 
Jack Crawford makes a similar point. A database veteran in the United Kingdom and co-founder of Managed Ventures LLC, he says, bluntly, "Small and large firms need to share scientific data and are hitting roadblocks at every turn. It seems that the predominant key players in biology and chemistry software (IDBS, Accelrys Inc., MDL Information Systems Inc.) are fighting each other for dominance of their client's business; however, they don't allow or promote their integration."

If there is any consolation, it is that at some to-be-determined, not impossibly distant date in the future, the same technologies that are complicating drug development may begin to simplify it. There are, thankfully, hints that this may have already begun. At Vertex Pharmaceuticals Inc., for example, the company believes it is empowering scientists to be their own IT support staff. Tom Cleveland, senior manager of discovery informatics, says that products from Tripos Inc. have been especially helpful in bottleneck busting. That's impressive, since his colleagues routinely screen 50,000 to 100,000 compounds a day.

"The product enables scientists to do some IT types of tasks," Cleveland says, "taking data from disparate data sources and integrating it, merging it together, processing it in some way and putting that data that they're generating into information that they can then use to make their decisions. It's sometimes very difficult to do those types of tasks on your own without an IT background."

Cleveland notes that he personally had to write scripts to integrate data when he first joined the company. No longer. "With Tripos," he says, "we've been able to offload some of that task from the IT programmer-type people onto scientists." Cleveland is keen to impress upon a reporter that computers by themselves can't find new drugs. "You can write all the software you want, and it's probably not going to give you a drug," he says. "This tool does allow people to ask some questions that they wouldn't normally be asking, because it allows the scientists to do tasks that they wouldn't normally be able to do. And one of those things is the merging of information."
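
The kind of merging script Cleveland describes can be pictured with a minimal Python sketch. It is not the Tripos product or any Vertex code; the two CSV exports, their column names, and the compound IDs are all invented for illustration. It joins a screening export with a compound registry on a shared ID and reports the IDs that appear in only one source.

```python
import csv
import io

# Hypothetical exports: the column names and values are invented for illustration.
SCREENING_CSV = """compound_id,percent_inhibition
CPD-001,82.5
CPD-002,12.0
CPD-004,67.1
"""

REGISTRY_CSV = """compound_id,molecular_weight
CPD-001,312.4
CPD-002,287.3
CPD-003,401.9
"""


def read_keyed(text, key):
    """Parse CSV text into a dict mapping each row's key column to the row."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(text))}


def merge_sources(screening_text, registry_text, key="compound_id"):
    """Join two sources on a shared key and list IDs found in only one of them."""
    screening = read_keyed(screening_text, key)
    registry = read_keyed(registry_text, key)

    # Inner join: keep only compounds present in both sources.
    merged = [
        {**registry[cid], **screening[cid]}
        for cid in sorted(screening.keys() & registry.keys())
    ]
    # IDs that appear in exactly one source need a person to look at them.
    unmatched = sorted(screening.keys() ^ registry.keys())
    return merged, unmatched


merged, unmatched = merge_sources(SCREENING_CSV, REGISTRY_CSV)
for row in merged:
    print(row)
print("Unmatched IDs:", unmatched)
```

Listing the unmatched IDs instead of silently dropping them is a deliberate choice in this sketch: mismatched identifiers between sources are exactly the kind of problem that stays hidden when data sit in separate silos.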

There are skeptics. Tom Jacobs, a columnist for Nature Biotechnology and a pharmaceutical analyst at The Motley Fool Inc., cites the lists of cash-burning genomics companies at Recombinant Capital and says the tool makers may be in trouble if a clear vindication of their merit is not apparent. "How many drug discovery platforms do we need?" he asks. "Whose platform is better than others? I was amazed when Merck [& Co.] bought Rosetta [Inpharmatics Inc.]. The platform companies are the ones that are going to see huge bankruptcies."

Tool companies, needless to say, insist they are already solving the data integration issues across scientific disciplines and organizations. The vendors say there is recognition of the problem of controlling surging torrents of data. "The real challenge in a research environment is [determining] what the data mean," says Scott Kahn, chief scientific officer at Accelrys. "It's the domain integration of these data that is the real challenge." The scientists typically create and protect silos of data relating to their projects, he says, while research directors want to connect those silos.

Kahn says that while younger researchers are very comfortable with computers, there is still a gut-level mistrust of automating what is to some chemists a magical or creative process that can't be coded into software. "Scientists have the familiarity that serendipity and good fortune and insight — things that you don't normally associate with a machine — have played a role in the finding of some of the most significant drugs." Dispelling the idea that computers and happenstance are incompatible is a sales challenge for the company, he admits.

At NuGenesis Technologies Corp., the company says it is dealing with the integration mess. John Helfrich is program manager of the company's discovery and drug development group. He says his company is well along the way to uniting data from instruments, applications, and laboratory information management systems (LIMS). Indeed, the company's "print to database" patents, he says, will enable customers to begin to forget about formatting the data and just worry about research. Unlike some competitors, Helfrich notes, NuGenesis will already export data into an XML format. "The IT people are getting this," he says. "They know they are going to save hundreds of hours of man-months."

Analyze This with Algorithms 
For Advanced Chemistry Development Inc., the solution to the integration issue is better software. Director of Marketing Robert DeWitte describes a recent lab visit in which an employee literally dropped off printouts of mass spectroscopy and nuclear magnetic resonance data. "The true bottleneck is not running the spectrum, running the sample, and getting the result. The true bottleneck is analyzing the result," he says. "The technology we have is a set of algorithms to streamline the analysis of the data."

DeWitte rejects the idea that the people comprise one system, the computers another. For Advanced Chemistry's algorithms, he says, the Ph.D. running the system can teach the computer what he or she knows. Most of the time, the computer will identify the molecules correctly. "It can process plates of compounds automatically and flag those where the human should get involved. Each time the expert gets involved, he or she can teach the algorithm what he or she knows."
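
The workflow DeWitte outlines, automatic processing with a handoff to the expert and a way for the expert to teach the system, might look roughly like the Python sketch below. It does not represent Advanced Chemistry Development's algorithms. The confidence score, the 0.8 cutoff, the "fingerprint" used to recognize a previously reviewed spectrum, and all class and field names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Result:
    well: str           # plate position, e.g. "A1"
    fingerprint: str    # hypothetical key identifying a spectrum pattern
    proposed: str       # structure proposed by the automated analysis
    confidence: float   # hypothetical score between 0.0 and 1.0


class PlateReviewer:
    """Accept confident results, flag the rest, and remember expert answers."""

    def __init__(self, threshold=0.8):  # assumed cutoff, not from the article
        self.threshold = threshold
        self.expert_answers = {}  # fingerprint -> structure confirmed by an expert

    def process(self, results):
        accepted, flagged = [], []
        for r in results:
            if r.fingerprint in self.expert_answers:
                # An expert has already resolved this pattern: reuse the answer.
                accepted.append((r.well, self.expert_answers[r.fingerprint]))
            elif r.confidence >= self.threshold:
                accepted.append((r.well, r.proposed))
            else:
                flagged.append(r)  # needs human attention
        return accepted, flagged

    def teach(self, fingerprint, structure):
        """Record an expert's decision so the same pattern is not flagged again."""
        self.expert_answers[fingerprint] = structure


plate = [
    Result("A1", "fp1", "caffeine", 0.95),
    Result("A2", "fp2", "unknown", 0.40),
]
reviewer = PlateReviewer()
first_accepted, first_flagged = reviewer.process(plate)

# The expert resolves the flagged well, and the next pass uses that knowledge.
reviewer.teach("fp2", "theobromine")
second_accepted, second_flagged = reviewer.process(plate)
```

In this sketch the "teaching" is only a lookup table of expert decisions; a real system would presumably generalize from them, which the article does not describe.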

Just as NuGenesis and Advanced Chemistry Development are integrating data from multiple instruments, SciMagix Inc. is working to do the same with images. The images can come from a variety of instruments, including microscopes and mass spectroscopy equipment. "The historical methodology has been to take a picture and paste it into a notebook," says SciMagix CEO Robert Dunkle. "What we offer as part of our database management system for images is a way that they can put that data set into a common repository."

When Images Gel 
At Pfizer Inc., he reports, "gels" from proteomic research — clear plastic sheets covered with dark spots, bands, and blurs — were literally spilling out onto the floor, overwhelming the staff. SciMagix designed software that automatically scanned the gels and found patterns between clumps of spots. "By clicking on half a dozen spots, that becomes a pattern that the researcher can identify," he says. Those patterns are crucial to identifying related proteins of interest to drug researchers, but the sheer number of gels, not to mention the tedious task of examining them, had defeated Pfizer. Compounding that problem is the fact that different microscope manufacturers use slightly different formats for their images.

Upon taking delivery of the SciMagix software, Pfizer recognized patterns that had remained invisible in the welter of unanalyzed data. "The researchers have an opportunity to communicate that they didn't have before and to have insights they didn't have the opportunity to have before," Dunkle says. "That really is the power of image informatics — this whole notion of using information content and context to have a better understanding of biology."

Even Dunkle, however, admits that while his company's applications such as ProteinMine can grab disparate images from a variety of manufacturers, they cannot always pass that information down the line to other scientists. "I don't have a good answer to that," he says candidly. "One of the things we see as a focus for our activity is the ability to communicate results to a whole department or to a therapeutic team across disciplines."

At Tripos, CEO John P. McAlister agrees that assimilating data is the foremost concern. "The biggest issue is data integration," he says. "Virtually all the pharmaceutical companies realize there is a problem."

Companies can no longer fit as many molecules as they need to analyze in an Excel spreadsheet. "It's very recent history that this couldn't be handled in an e-mail-and-paper world," McAlister notes, citing the surging torrent of data crashing over the life sciences. "You need ways to clean up those data to find out if there is any signal in the noise. That's not something you eyeball and say, 'Oh, I see where there is a signal here.' It's made the use of innovative data-mining and analysis tools more important in the data-processing arena."

Behind the scenes, he reports, the scientists are resisting IT-intensive solutions. That presents managers with a problem: "The challenge," McAlister says, "is to preserve creativity but to embed it in a process that allows creativity to work efficiently. You can never coerce people into changing their process. It's a matter of figuring out how to capture that creativity for their organization and make it more available throughout the company. That demands an informatics infrastructure that hasn't previously existed."

Asked about where the Tripos data go once they've been fully analyzed, inspected, and validated within the company's universe of 50-odd different software modules, McAlister admits they may end up in a silo not easily accessed by others. But he's optimistic the problem will be solved within five years. "The data will be routinely available throughout the organization," he predicts. "They will be in data warehouses. The data will not be an issue at that point. The thing that is likely to be an issue is the process by which one goes about moving from target identification efficiently to a preclinical development."

Special Report: Resolving Bottlenecks 
This is the first of three reports in a Bio·IT World series on bottlenecks in life science research. The second installment, on lead optimization, will appear in our April issue. The final installment, on bottlenecks in clinical trials, will appear in our May issue.

One of the thorniest integration issues may be close to being solved. IBM Corp. has announced DB2 Information Integrator, which the company promises can simultaneously combine data from not only BLAST searches but also Oracle databases and Excel spreadsheets. It's been beta-tested at Aventis Pharmaceuticals Inc. and seems to be working well. "This idea of integrating data from various sources is really fundamental to them," says Laura Haas, an IBM distinguished engineer and development manager for the Integrator project.

Far from competing with software houses such as Accelrys, she says, IBM is working with them, hoping to handle back-end tasks and leave scientific functionality to the specialists. The goal: to speed internal development by as much as 50 percent. "We're trying to provide the tools of the trade to make it easier for folks to develop new applications," Haas says. "I do feel very confident that we have a fundamental infrastructure that will be flexible enough and extensible enough to work with the changes coming down the road."

