By Vicki Glaser
February 16, 2009 | Has high-throughput screening (HTS) suffused drug discovery pipelines with “hits” of sufficient quality and quantity to warrant the millions of dollars invested in the technology since the mid-1990s? That debate has been raging for years. Anemic productivity, a surplus of me-too drugs, and costly and embarrassing late-stage and post-market failures continue to plague the pharmaceutical industry, bolstering critics of HTS who challenge its return on investment. But HTS is evolving; users are learning how to apply HTS technology for optimal output, and the technologies HTS has spawned are extending downstream to lead optimization and compound profiling.
The hype that accompanied the introduction of HTS in the 1990s virtually assured its failure to meet expectations. HTS was predicted to revolutionize drug discovery. It did not. In 2004, the perception of HTS as a failed business strategy was captured in a front-page story in the Wall Street Journal under the headline “Drug Industry’s Big Push into Technology Falls Short,” with critics calling the combination of HTS and combinatorial chemistry “an expensive fiasco.” Those technologies were widely blamed for the drought in new drug approvals that accompanied the first decade of their use.
But given the lengthy drug development cycle, compounds that emerged from HTS campaigns in the early 2000s are still traveling through the pipeline. Drugs in development today represent “the fruits of our labors 10 to 15 years ago,” says Dejan Bojanic, HTS leader at Novartis. “About 55% of the compounds [now] in Novartis’ lead optimization pipeline are projects that came out of HTS,” he adds.
Bojanic, a biochemical pharmacologist, has two decades of experience using HTS for small-molecule drug discovery, including previous stints at Pfizer and Millennium. He points to Pfizer’s antiretroviral agent maraviroc (Selzentry), approved by the FDA in 2007, as an example of a successful HTS program. In December 1997, a Pfizer HTS campaign generated a hit list that included the molecular scaffold destined to become maraviroc. Although the drug took nearly ten years to reach market, Bojanic describes that pace as “extremely fast.”
In fact, HTS has done what it was designed to do, namely to screen increasingly large compound libraries and determine which compounds bind to a target. Most of the problems attributed to HTS lie on either side of the automated screening process. As these issues are steadily rectified, HTS grows more robust. Improvements upstream of screening have resulted in higher quality compound collections stored in smaller, more thoughtfully designed libraries to feed the HTS engine. With the application of Chris Lipinski’s Rule of Five and other algorithms, “the promiscuous hitters and bad actors have been removed,” says Lucile White, manager of the High-Throughput Screening Center at Southern Research Institute, an original member of the NIH’s Molecular Libraries Screening Center Network (MLSCN). Additionally, more physiologically relevant, functional assays are providing a window into the cellular milieu and yielding data more predictive of drug activity in the human body. Cell-based assays are now routinely performed in a high-throughput screening format (see, “Hard Cell Made Easier”).
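Lipinski’s Rule of Five, mentioned above, is simple enough to sketch: a compound is typically flagged as drug-like if it violates at most one of four physicochemical thresholds. A minimal illustration (in practice the property values would come from a cheminformatics toolkit rather than being typed in by hand):

```python
# Minimal sketch of a Lipinski Rule of Five filter for triaging a
# screening library. Property values are plain inputs here; a real
# pipeline would compute them with a cheminformatics toolkit.

def passes_rule_of_five(mol_weight, log_p, h_bond_donors, h_bond_acceptors):
    """Return True if a compound violates at most one Lipinski criterion.

    Thresholds: MW <= 500 Da, logP <= 5,
    H-bond donors <= 5, H-bond acceptors <= 10.
    """
    violations = sum([
        mol_weight > 500,
        log_p > 5,
        h_bond_donors > 5,
        h_bond_acceptors > 10,
    ])
    # Lipinski's original formulation tolerates a single violation.
    return violations <= 1

# A drug-like compound passes; a large, lipophilic one does not.
print(passes_rule_of_five(350.4, 2.1, 2, 5))   # True
print(passes_rule_of_five(720.0, 6.3, 7, 12))  # False
```

Filters like this are one reason the “promiscuous hitters and bad actors” White describes could be purged from screening collections.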
Downstream, various labeling techniques and detection systems are enabling multiplexing, analysis of drug activity at the single-cell and subcellular level, and characterization of the effects of target binding on signaling pathways and protein-protein interactions, as well as off-target drug activity. HTS formats are becoming increasingly prevalent in secondary assays, enabling earlier studies of compound attributes such as ADME/tox, solubility, or microsomal stability and prediction of serious adverse drug effects such as hERG potassium channel inhibition. Both HTS itself and access to in vitro ADME/tox and other data earlier in the drug development cycle “are shortening timelines for getting compounds into the clinic,” says White.
Economic realities are also favoring HTS as cost-cutting efforts and downsizing spotlight the advantages of automated processes, maximizing output while minimizing manpower needs. The push in recent years toward assay miniaturization has also helped reduce screening costs and conserve valuable compound stores.
HTS has climbed the steep part of its learning curve. Combined with the synergies realized through parallel technology advances and strategic shifts in the implementation of screening tools, HTS is finally beginning to gain traction. But an essential challenge remains: how to extract, interpret, and apply the hidden nuggets of information in the deluge of raw data that HTS produces?
Losing its Luster
“Lots of mistakes were made in the 1990s,” notes Robert Hertzberg, VP of screening and compound profiling at GlaxoSmithKline (GSK). Two key problem areas early on were compound collections (insufficient diversity and a lack of appropriate chemical properties) and screening strategies (suboptimal assays and problems with false positives and negatives).
“By the late 1990s, we had solved most of these problems,” says Hertzberg. For example, chemical libraries are now more diverse, with 10-15 analogs per series instead of 50-100, and more chemical series represented. Vendors have “dramatically improved the liquid handling technology and the detection technology, and we now have much more confidence in the data.”
“HTS is now seen as a tool rather than a technology,” says Alan Fletcher, VP of business development at PerkinElmer, suggesting that its value may lie more in filtering out inactive compounds than in identifying active ones. It soon became evident that primary screens based on biochemical assays that detect target binding offer limited ability to predict what will be active in the body.
White describes HTS as a “shotgun approach” that is most useful with new targets about which little is known of the types of compounds that might bind to them. If the screeners are lucky, they will uncover several novel structural classes that interact with a target’s active site to pass along to the medicinal chemists.
One of the first challenges for HTS was miniaturization, to reduce the cost of large screening campaigns and preserve “the crown jewels”—pharma’s rapidly growing libraries of synthetic small molecules and natural compounds. This is now routine, and the cost of doing HTS has leveled off. Says Hertzberg: “Most of the industry has settled on a 5 µl [assay] volume, which can be run in a 384- or 1536-well plate.” The cost is about 7¢ per well, and has been for the past couple of years.
However, “we are approaching the limit of how many wells we can cut in a sheet of plastic,” says White. In the future, microtiter plates will give way to microarray-type dot arrays on a glass or plastic surface, she predicts.
The early limitations of HTS led to a more realistic view of the technology as merely a starting point for identifying small molecules to progress into medicinal chemistry programs. “The screening part is not what keeps me up at night,” says Bojanic. A greater concern is: “Have we got the right compounds in our libraries? Have we got the right targets?”
HTS is not a shortcut, emphasizes Bojanic, and it cannot replace a thorough up-front understanding of the biology to be targeted, the chemical space to be explored, and the attributes of the molecule to be created. All of this information should guide the design of the screening library. Indeed, library quality is more important than quantity—a lesson the industry learned the hard way. Companies have begun the painstaking task of whittling down their compound collections, purging them of compounds of dubious identity and purity and more thoroughly characterizing those remaining.
“In the beginning, when there was a lot of low-hanging fruit, companies would run a huge number of screens and they did not have to be particularly smart about their experimental bandwidth,” says Norman Packard, CEO of ProtoLife. “Now it is more important to get the biggest bang for the buck.”
ProtoLife’s Predictive Design Technology (PDT) is an intelligent modeling tool that finds optimal hits in complex experimental spaces through iterative screening, versus a roll-of-the-dice, one-shot screening approach. At each iteration, PDT’s algorithms select the type and complexity of models to use for a particular experimental system based on multiple parameters. The technology models the experimental space using data accumulated from previous screening runs and various statistical learning algorithms, including neural and Bayesian networks.
The techniques underlying PDT belong to the broader field known as Design of Experiments (DoE). Traditional approaches to DoE “relied on dimension reduction,” says Packard, identifying and exhaustively screening the most relevant subspace. PDT eliminates this need. To screen a 1-million compound library, for example, PDT may apply its modeling techniques to the first 100 compounds—either selected randomly or chosen to achieve the best possible coverage of the chemical space—and use the results to predict the next set of 100 compounds, and the next set, and so on.
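PDT itself is proprietary, but the general loop it exemplifies—screen a batch, fit a surrogate model, let the model pick the next batch—can be sketched with a toy example. Everything below (the k-nearest-neighbor surrogate, the synthetic “assay,” the descriptor vectors) is an invented stand-in for illustration, not ProtoLife’s actual algorithms:

```python
import random

# Toy sketch of model-guided iterative screening. A surrogate model
# trained on already-screened compounds selects each successive batch,
# instead of screening the whole library in one shot.

random.seed(0)

def make_library(n, dims=3):
    """Toy 'chemical space': each compound is a random descriptor vector."""
    return [[random.random() for _ in range(dims)] for _ in range(n)]

def assay(x):
    """Stand-in for the wet-lab assay: activity peaks in one region."""
    return 1.0 - sum((xi - 0.7) ** 2 for xi in x)

def knn_predict(x, screened, k=5):
    """k-nearest-neighbor surrogate fitted to the results screened so far."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, cx)), act)
        for cx, act in screened
    )
    return sum(act for _, act in dists[:k]) / k

library = make_library(1000)
screened, tested = [], set()

# Round 0: a random starter batch, as in the article's 100-compound example.
batch = random.sample(range(len(library)), 100)
for _ in range(4):  # a few model-guided iterations
    for i in batch:
        screened.append((library[i], assay(library[i])))
        tested.add(i)
    untested = [i for i in range(len(library)) if i not in tested]
    # Next batch: the 100 untested compounds the surrogate scores highest.
    untested.sort(key=lambda i: knn_predict(library[i], screened), reverse=True)
    batch = untested[:100]

best = max(act for _, act in screened)
print(f"best activity after screening {len(screened)} of 1000: {best:.3f}")
```

The point of the design choice: each round’s model concentrates the next round’s wells in the most promising region of chemical space, so far fewer than 1,000 compounds ever need to hit a plate.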
Then there is the critical issue of target selection. Bojanic stresses the ongoing need for more and better target validation. As biochemical assays increasingly share the limelight with cell-based screens, and the initial focus on G-protein coupled receptors (GPCRs) and kinases expands to a broader spectrum of target classes, target selection and assay design become an increasing challenge.
“We want to look at protein-protein interactions in a cellular environment, and we need simple readouts to determine if a compound is affecting a cellular pathway,” says Richard Eglen, PerkinElmer’s president of bio-discovery. Ion channels are a good example of a new, “rich mine for drug discovery,” notes Eglen, “but they are difficult to do in HTS.”
Where’s the Beef?
Despite an improving perception of HTS, many still question its ability to contribute to a knowledge base rather than simply produce massive amounts of data. Extracting value from HTS depends on data analysis and management strategies and applying those data going forward.
Stephan Heyse, head of screening informatics at Genedata, describes HTS as “a big, organized roulette game for finding actives without the need of a priori knowledge of structures and binding motifs.” However, the reliability of early HTS data was questionable. “Companies gathered the hits and then threw away the data—they didn’t trust it,” he says. With improved assays come better readouts and more reliable data, but the increasing complexity of the data continues to put pressure on the computational toolkit. HTS data about individual compounds and chemical series might prove useful not only for hit identification but also at later stages of lead selection, optimization, and preclinical drug development, Heyse says.
Collecting and managing these data throughout the drug development process makes it easier for team members to share and benefit from the information. Heyse emphasizes the importance of being able to annotate screening data across an organization. In this way, biologists, chemists, and pharmacologists can understand and interpret the data without having to review the history of a screen or a particular compound series or to look up the experimental parameters. As new data emerge, users can add additional annotations, create detailed compound profiles, rank compounds, and update their knowledge base.
Screener 6.0 is Genedata’s enterprise-wide data management platform for HTS and high-content screening (HCS). Genedata recently signed a collaboration with the Drug Discovery Center at the Genome Research Institute, University of Cincinnati, to fully integrate Screener into the center’s HTS workflow.
Screener includes the company’s new Kinetics Analyzer module, which provides tools for analyzing and aggregating time series data, in which hundreds of data points may be collected from a single well. The module optimizes parameters and aggregation rules for individual assays and systematically gathers mechanistic information from the curve shape.
Screener provides “exhaustive data quality control,” says Heyse. It contains algorithms that subtract artifacts in the data. In cell-based assays, for example, cells in the center of a plate tend to grow differently than those nearer the edges, yielding subtle differences in output. Screener automatically corrects these gradients, and applies a computational, systematic strategy to deal with outliers.
“HTS generates as much misinformation as information,” says Frank Brown, VP and CSO of Accelrys, but iteratively screening more carefully designed compound sets aimed at building a structure-activity relationship (SAR) will produce higher quality hits and valuable, predictive information.
Confirmation rates for HTS campaigns are usually below 50%, which Brown explains as follows. Consider a screen that yields a 1% hit rate (Fig. 1, red line) and a 1% error rate—which Brown says would be “phenomenal, a best-case scenario.” Only half of the flagged hits will be true positives; the other half are false positives. If the active rate drops to 0.5% (yellow line), then only 33% will be real hits. A 0.5% hit rate combined with a 2% error rate generates only 20% correct information and 80% misinformation.
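Brown’s percentages follow from simple arithmetic: the confirmation rate is the true positives divided by everything the screen flags. Assuming, for simplicity, that every true active is detected:

```python
# Worked version of Brown's confirmation-rate arithmetic.
# hit_rate:   fraction of the library that is genuinely active
# error_rate: fraction of inactives falsely flagged as hits
# Assumes every true active is detected (no false negatives here).

def confirmation_rate(hit_rate, error_rate):
    true_pos = hit_rate
    false_pos = error_rate * (1 - hit_rate)
    return true_pos / (true_pos + false_pos)

print(f"{confirmation_rate(0.01, 0.01):.0%}")   # ~50% true hits
print(f"{confirmation_rate(0.005, 0.01):.0%}")  # ~33% true hits
print(f"{confirmation_rate(0.005, 0.02):.0%}")  # ~20% true hits
```

Even a tiny error rate swamps a rare-hit screen because the inactives so vastly outnumber the actives—the same base-rate effect familiar from diagnostic testing.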
If one were to use the full dataset generated by screening a large random compound library, Brown says, one could not build a model capable of predicting which hits are actually false positives and which “inactives” are actually false negatives. The range of activities represented is too diverse—“the data is all over the place,” he says. However, if one were to use the data points generated from a selected set of about 1,000 compounds to build a model for designing subsequent, iterative screens, that model would accurately predict about 75% of the false positive (inactive) compounds and 80% of the false negatives (actives). “The model is almost perfect,” asserts Brown, because 100% accuracy is not possible due to singletons in the dataset.
Using a computational strategy that combines various statistical methods, including Bayesian algorithms, molecular fingerprints, and chemical functionality, Accelrys’s SciTegic Pipeline Pilot can predict 74% of the active compounds in a large library while screening only 31% of the dataset. It does this by designing iterative screens that build an SAR.
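Pipeline Pilot’s internals are proprietary, but the general flavor of fingerprint-based Bayesian ranking can be sketched. The fingerprint bits, training data, and smoothing choice below are invented for illustration only:

```python
import math

# Sketch of naive Bayes scoring over binary molecular fingerprints:
# learn, per bit, how much more often it appears in actives than
# inactives, then rank untested compounds by the summed evidence.

def train_bayes(fingerprints, labels, n_bits):
    """Per-bit log-likelihood ratios, with Laplace smoothing."""
    n_act = sum(labels)
    n_inact = len(labels) - n_act
    weights = []
    for b in range(n_bits):
        act_on = sum(fp[b] for fp, y in zip(fingerprints, labels) if y)
        inact_on = sum(fp[b] for fp, y in zip(fingerprints, labels) if not y)
        p_act = (act_on + 1) / (n_act + 2)        # smoothed P(bit | active)
        p_inact = (inact_on + 1) / (n_inact + 2)  # smoothed P(bit | inactive)
        weights.append(math.log(p_act / p_inact))
    return weights

def score(fp, weights):
    """Higher score = more active-like; used to pick the next screen set."""
    return sum(w for bit, w in zip(fp, weights) if bit)

# Toy data: bit 0 marks the active scaffold; bits 1 and 2 are noise.
fps = [[1, 0, 1], [1, 1, 0], [0, 1, 1], [0, 0, 1]]
labels = [1, 1, 0, 0]
w = train_bayes(fps, labels, 3)
print(score([1, 0, 0], w) > score([0, 1, 1], w))  # True: scaffold bit wins
```

Ranking a whole library by such scores, then physically screening only the top slice, is how a model can “find” most of the actives while plating a fraction of the compounds.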
So why are companies still doing HTS “the wrong way”? Because, Brown says, they have the machinery built for sampling and screening entire libraries. Without the appropriate infrastructure and automation in place to cherry-pick select compound sets efficiently and run iterative screens, the iterative approach could be too costly for many companies.
Brown also contends that the problem with predictive computational modeling lies more with the false negatives than with the false positives. The typical activity cut-off companies use to decide which compounds to move forward to hit confirmation is 50%, mainly because that produces a “manageable workload,” Brown says (Fig. 2). Shifting the cut-off to 30% would double the workload, and a 20% breakpoint would quadruple the workload. But Brown argues that new information continues to be revealed as the activity cut-off approaches 20%-30%, primarily because of eliminating the false negatives.
Although expanding the search area to include all the false negatives may be cost-inefficient, Brown says the use of informatics and computational modeling systems makes it feasible to broaden the search to the 20% to 30% mark. SciTegic orchestrates this process by creating internal workflows based on the integration of multiple functions: security/QC/inventory control; predictive modeling; identification of actives; and selection of compounds to take forward into iterative screens. “Invest in building an SAR rather than amassing a lot of data points,” and HTS will pay off, advises Brown.
In Silico vs. Wet Lab
Because raw HTS data are not trustworthy enough for building predictive models, scientists are instead using the high-quality data generated by biologists within project campaigns to design models, says Brown. They then rely on virtual screening to identify well-defined compound sets, often a focused library, which can be used to guide the preparation of HTS plates to screen against a particular target class.
Novartis’ Bojanic views HTS and in silico screening as complementary techniques. “The vision for the future is that we should be able to do all experiments in silico. But, at the moment, that is just a vision. The wet biology is still very important,” particularly to validate hypotheses generated from virtual screens.
These midstream course corrections and strategic shifts in designing and implementing high-throughput screens are a natural outcome of the maturation of HTS. Although the numbers are still small, HTS has identified hits that have successfully made it into late-stage clinical trials and even to the marketplace.
“Is that a sufficient pay-off at this point in time? Relative to the hype of the ’90s, no,” says GSK’s Hertzberg. “Relative to what most realists thought, yes.”
HTS Plus a Systems Approach
Jeffrey Hermes, director of Merck Automated Biotechnology (North Wales, Penn.), sees HTS working for Merck in conjunction with a parallel systems biology approach.
Bio•IT World: How has the view and implementation of HTS changed in a large pharmaceutical company such as Merck?
Hermes: We continue to see both the promise and the value of HTS every day. The investments we have made in parallel screening technologies are providing us with an unprecedented breadth and depth of information on candidates across our therapeutic areas of focus. With these technologies we can assay 5,000 to 7,000 candidates in 40-100 different types of biology at the same time. We can also identify novel molecular targets and pathways by “knocking down” and/or “knocking in” gene expression in multiple cell lines using various combinations of test compounds and perturbing agents.
Critics argue that HTS has not fulfilled its promise to fill drug discovery pipelines with high value hits. Do you agree?
There has been a great convergence of technologies and disciplines in recent years, which is moving the entire field forward. For example, our high-throughput biology division works closely with the Molecular Profiling and RNA Therapeutics groups to find new ways to treat common diseases. Through these collaborations, we are dissecting disease pathways with great precision and identifying novel molecular targets in diseases such as obesity, atherosclerosis, diabetes, and bone disorders.
How do you see the role of HTS continuing to evolve, and will it remain a core technology in the drug discovery toolbox?
Merck has launched a “poly-target” profiling initiative that uses systems biology to improve the productivity of drug discovery. Whereas traditional high-throughput screening is a linear, empirical process in which multiple drug candidates are screened against one target at a time, Merck has invested in technology that allows us to analyze many critical cellular parameters—including gene and protein expression and metabolites—in parallel. This parallel, systems biology approach generates rich information for the simultaneous interrogation of many targets. By probing multiple cellular functions on an industrial scale and employing sophisticated data analysis tools, we are transforming high-throughput screening into high-throughput biology, and we are finding targets we couldn’t find using traditional linear screening. V.G.
Hard Cell Made Easier
Sailaja Kuchibhatla, senior VP business development at DiscoveRx, describes cell-based assays as “the missing link” between HTS and predictive, functional data. Cell-based assays shed light on the effects of target binding on downstream second messengers. With DiscoveRx’s PathHunter beta-arrestin assays, “we are still looking at GPCRs, but we are looking in multiple different pathways,” says Kuchibhatla.
The assays incorporate a target cytoplasmic protein fused to a β-galactosidase (β-gal) peptide, while another portion of β-gal is present only in the cell nucleus. If GPCR activation causes the target protein to translocate to the nucleus, the two β-gal peptides can produce an active enzyme and generate a signal. DiscoveRx has developed a palette of assays that encompass more than 150 known receptors and 80 orphan receptors. Kuchibhatla describes growing interest in applying the company’s technology to perform HTS for the purpose of compound profiling.
The growing interest in data-rich high-content, cell-based screening has encouraged imaging system vendors to improve throughput, cost efficiency, and accessibility. Genedata’s Heyse foresees improvements in capturing the microscopic image readouts from HCS, which can reveal subcellular activity and physiological responses. In Heyse’s vision, image analysis produces data and information without requiring scientists to analyze each image manually. This information then feeds back into the experimental design process. “We need to automate and make sense of all this multidimensional data to understand the activity and mechanism of action of a compound,” Heyse says.
“HCS will eventually go plug-and-play,” predicts PerkinElmer’s Richard Eglen, noting growing demand for simpler HCS software that can be used by a spectrum of drug discovery researchers. They want high-throughput assays for pathway analysis, cell signaling, and evaluating a range of preclinical biomarkers, says Eglen. PerkinElmer has strengthened its HCS portfolio with the acquisition of Evotec and the Opera HCS image analysis platform, and the addition of Improvision’s Volocity software, which enables viewing of live cell images in three dimensions. V.G.
This article also appeared in the Jan-Feb 2009 issue of Bio-IT World Magazine.