“The toxicology community is very conservative, with much reliance on well established histopathological diagnoses and clinical chemistry endpoints,” says Peter Lord, a consultant at DiscoTox. The safety of drug candidates is often assessed late in the development phase—after considerable resources have been invested—because of the lack of early-stage toxicity assessment tools. In spite of stringent safety testing, liver toxicity alone accounts for 40% of drug failures in clinical trials and 27% of market withdrawals.
Conventional toxicology methods, such as histopathology and clinical chemistry, are too expensive to conduct on a large scale during the discovery phase. Toxicogenomics (TGx)—the branch of genomics that analyzes the interactions among the genome, exogenous chemicals, and disease—has progressed significantly in the past decade. Open-source and commercially available TGx databases now catalogue the genomic signatures of tens of thousands of chemical entities that serve as reference compounds for investigational drugs and chemicals. Advances in computational tools and data mining software have also facilitated early-stage safety assessment and elucidation of new pathways of toxicity.
The development of TGx tools began over a decade ago, but their role in safety assessment is still debated. Some toxicologists view TGx as mostly hype with few results, while others think it is only a matter of time before TGx is added to the battery of early-stage toxicity tests. Philip Hewitt, head of molecular toxicology at Merck Serono in Germany, thinks the high cost of TGx technologies is discouraging, but positive results are beginning to persuade companies to incorporate them into their pipelines.
“The major hindrance [to adopting TGx technologies] is management acceptance and costs of performing such expensive gene expression profiling studies,” he says. “The only way this can change is for more success stories where a drug was pushed forward (or stopped) and saved the company money. The costs of performing these experiments must fall and, of course, new low-gene number assays will be pushed.”
According to Lord, who also has previous experience at Johnson & Johnson, GlaxoSmithKline, and AstraZeneca, the skepticism is rooted in the traditional toxicology community, which is resistant to incorporating new technologies into its protocols. “Many of the more experienced toxicologists have little molecular biology background and understandably it is a challenge for them to get a realistic sense of how to assess and integrate new molecular technologies. With the advances in computational technology in the last ten years, TGx analysis has become much faster and more easily incorporated with other data for biological context.” Examples of markers that toxicologists look for include changes in the expression of genes for cytochromes P450, secondary drug metabolism enzymes, and proteins involved in apoptosis and cell proliferation.
“In a former company I saw TGx used to resolve conflicting data from early rodent safety studies. Several compounds that produced no liver damage according to histopathology, nevertheless showed an increase in liver enzymes indicative of liver damage. After the TGx analysis suggested no liver toxicity, we were more confident in moving the drug into the next phase of development and we set up investigations into the reasons for the liver enzyme increases,” Lord recalled.
The experimental methods for TGx analysis begin with the collection of RNA 24 hours after dosing with a test compound (see Figure). Toxicology-specific microarrays from Affymetrix and GE Healthcare with only a few thousand oligonucleotides significantly simplify the analysis of data. The interpretation of the data, however, still presents a bioinformatics challenge. Microarray experiments generate hundreds of thousands of data points, and typical TGx databases integrate thousands of microarray experiments, aggregating hundreds of millions of data points overall. Data processing and biostatistical analysis software gradually reduce the data to thousands of data points for interpretation by systems biology tools such as MetaCore from GeneGo and Genedata’s Expressionist System.
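The data-reduction step described above can be sketched in a few lines. In this toy example, probes are winnowed to a short hit list by a simple fold-change cutoff; the gene names and expression values are invented for illustration, and real pipelines also apply normalization and statistical significance testing before any cutoff is imposed.

```python
import math

# Toy expression values (arbitrary units) for a handful of probes;
# gene symbols and signal levels are invented for illustration only.
control = {"Cyp1a1": 120.0, "Gadd45a": 80.0, "Actb": 500.0, "Mki67": 60.0}
treated = {"Cyp1a1": 960.0, "Gadd45a": 250.0, "Actb": 510.0, "Mki67": 22.0}

def log2_fold_change(t, c):
    """log2 ratio of treated to control signal."""
    return math.log2(t / c)

# Keep probes whose absolute log2 fold change exceeds a cutoff (here 1.0,
# i.e. a two-fold change) -- a stand-in for the full statistical filter.
cutoff = 1.0
hits = {g: round(log2_fold_change(treated[g], control[g]), 2)
        for g in control
        if abs(log2_fold_change(treated[g], control[g])) > cutoff}
print(hits)  # {'Cyp1a1': 3.0, 'Gadd45a': 1.64, 'Mki67': -1.45}
```

Unchanged housekeeping genes (here the invented `Actb` entry) fall out of the list, leaving only the probes worth passing on to pathway-analysis tools.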
To build their internal TGx databases, academic and industry research groups frequently use open-source and commercial TGx databases as reference. Open-source TGx databases, which catalog tens of thousands of microarray experiments, are now available from several government-sponsored sites. The Comparative Toxicogenomics Database (CTD) from the National Institute of Environmental Health Sciences (NIEHS) integrates information from public sites such as ChemIDplus, DrugBank, and PubMed and contains over 22,000 references as of August 2010. The National Center for Computational Toxicology (NCCT), a division of the EPA, provides the Distributed Structure-Searchable Toxicity (DSSTox) Public Database Network as a public forum for searching and publishing toxicity data, and focuses primarily on the effect of environmental chemicals on gene expression and disease.
Commercially available toxicogenomics databases such as DrugMatrix from Iconix Biosciences, ToxExpress System from Gene Logic, and the Ingenuity Knowledge Database from Ingenuity Systems have been developed specifically to facilitate early-stage toxicity assessment in drug discovery. With genomics signatures, as well as histopathological and clinical chemistry endpoints from thousands of known compounds, these databases serve as references for the analysis of novel candidates. The databases also enable users to build their internal TGx databases, including the genomics fingerprints of their investigational compounds.
Commercial databases are complemented by predictive modeling software such as IXIS and ToxShield Suite from Iconix and Gene Logic, respectively, which provide detailed toxicity reports based on established biomarkers and a rank ordering of lead compounds. Biomarkers identified by these packages can serve as leads in later stage preclinical studies, including histopathology, clinical chemistry, and molecular pharmacology, to confirm suspected pathological endpoints. Most computational platforms are now Web-based and allow researchers to share results with other investigators. Furthermore, with the help of pathway analysis tools such as IPA-Tox from Ingenuity, researchers can make predictions about organ-specific toxicities particularly for the heart, kidneys, and the liver.
To evaluate how well TGx methods predict long-term toxicity, in 2008 the Predictive Safety Testing Consortium examined the correlations between the genomic fingerprints and the carcinogenicity of over 150 compounds. The study assessed the accuracies of two published hepatic gene expression signatures, by Mark Fielden and Alex Nie. The evaluations were conducted in two laboratories with different microarray platforms, and the accuracies of the genomic signatures for predicting carcinogenicity were estimated at 55–64% and 63–69%, respectively. Interestingly, the internal validation estimates for the signatures had been over 85%; the drop was attributed to differences in experimental methods. These results prompted the consortium to establish standardized carcinogenicity signatures based on quantitative PCR (qPCR) to aid in the validation of results across laboratories. Overall, the study supported the application of TGx in early-stage safety assessment, but the accuracies were not considered sufficient for regulatory decision making.
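Accuracy figures like those above come from comparing a signature's calls against the outcomes of long-term rodent bioassays. A minimal sketch of how such figures are derived from a confusion matrix follows; the counts are invented for illustration and do not come from the consortium study.

```python
# Hypothetical confusion counts for a gene expression signature scored
# against 2-year rodent bioassay outcomes (all numbers invented).
true_pos, false_neg = 45, 30   # carcinogens called positive / missed
true_neg, false_pos = 50, 25   # non-carcinogens called negative / flagged

total = true_pos + false_neg + true_neg + false_pos
accuracy    = (true_pos + true_neg) / total      # overall agreement
sensitivity = true_pos / (true_pos + false_neg)  # carcinogens caught
specificity = true_neg / (true_neg + false_pos)  # safe compounds cleared

print(f"accuracy={accuracy:.0%} sensitivity={sensitivity:.0%} "
      f"specificity={specificity:.0%}")
# accuracy=63% sensitivity=60% specificity=67%
```

Reporting sensitivity and specificity alongside overall accuracy matters in this setting, since the cost of missing a carcinogen differs from the cost of falsely flagging a safe compound.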
In another report, which evaluated TGx in acute toxicity, the correlation between the adverse effects of acetaminophen and its genomic fingerprint was compared across five different studies. While each study identified a different set of affected genes, the results were encouraging: despite variations in experimental methods, all of the laboratories reported changes in stress response genes known to be involved in acetaminophen toxicity.
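This kind of cross-study concordance check reduces to a set intersection over each laboratory's gene list. In the sketch below the gene symbols and lab lists are invented stand-ins, not the lists from the cited studies; the point is that a reproducible core signal can survive even when the full lists disagree.

```python
# Differentially expressed gene lists from three hypothetical labs
# studying the same compound (symbols are illustrative only).
lab_a = {"Hmox1", "Gclc", "Nqo1", "Myc", "Ccnd1"}
lab_b = {"Hmox1", "Gclc", "Mt1", "Egr1"}
lab_c = {"Hmox1", "Gclc", "Nqo1", "Fos"}

# Genes reported by every lab -- the reproducible "core" signal
core = lab_a & lab_b & lab_c
print(sorted(core))  # ['Gclc', 'Hmox1']
```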
As TGx is slowly incorporated into early-stage safety assessment, epigenomics is also gaining attention from drug development companies. Epigenetic changes include modifications in the genome that do not affect the DNA sequence, such as DNA methylation, histone modification, and RNA silencing. DNA methylation, in particular, has been shown to be involved in the development of diseases such as cancer, multiple sclerosis, diabetes, and schizophrenia.
“The application of epigenomic profiling technologies within the field of drug safety sciences has great potential for providing novel insights into the molecular basis of a wide range of long-lasting cellular perturbations including increased susceptibility to disease and/or toxicity, memory of prior immune stimulation and/or drug exposure, and transgenerational effects,” says Jonathan Moggs, head of molecular toxicology and translational sciences at the Novartis Institutes for Biomedical Research.
Of all the epigenetic changes, DNA methylation is the simplest to measure; traditional detection methods include bisulfite DNA sequencing, methylation-specific PCR, and MALDI mass spectrometry. A recently developed high-throughput method from Illumina combines the GoldenGate genotyping assay with universal bead arrays, and has been shown to distinguish normal from cancerous lung tissue samples. As more epigenomics methods are tested and optimized, the hope is that they can be applied to detect epigenetic changes in large populations for the diagnosis of cancers and other diseases.
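Bead-array methylation readouts are commonly summarized per CpG site as a beta value, the methylated signal as a fraction of total signal with a small stabilizing constant. The formula below is the conventional one for Illumina-style arrays, but the probe intensities in the example are made up.

```python
# Beta value for a CpG site: beta = M / (M + U + alpha), where M and U
# are the methylated and unmethylated probe intensities and alpha
# (conventionally 100) damps noise at low-intensity probes.
# Intensity values below are invented for illustration.
def beta_value(meth, unmeth, alpha=100):
    return meth / (meth + unmeth + alpha)

# A heavily methylated site vs. a mostly unmethylated one
print(beta_value(1800, 100))             # 0.9  -> ~90% methylated
print(round(beta_value(160, 1740), 2))   # 0.08 -> largely unmethylated
```

Beta values near 1 indicate full methylation and values near 0 indicate none, which makes site-by-site comparison between, say, tumor and normal tissue straightforward.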
“Epigenomics has significant potential to impact translational sciences in the coming years. In particular, there is an opportunity to exploit and enhance emerging knowledge from epigenome mapping initiatives on the dynamic range of epigenetic marks in normal tissues versus disease states and also to investigate the extent of epigenome perturbation by xenobiotics,” Moggs says. The Innovative Medicines Initiative (IMI), funded by the EU, is one of the organizations that is working on elucidating mechanisms of nongenotoxic carcinogenesis (www.imi-marcar.eu).
While there has been a revolution in high-throughput technologies in the last ten years, methods for interpreting large genomics datasets are lagging behind. One of the newest data management tools for whole genome analysis is Genedata’s Expressionist System, which stores, analyzes, and manages profiling data from all major commercial technology vendors. The Expressionist System “supports mRNA profiling using microarrays, PCR and next gen sequencing technologies, proteomic profiling using 2D gels, antibody arrays and mass spectrometry, metabolomic profiling based on mass spectrometry and NMR and genomic profiling using next generation sequencing and SNP arrays,” says Jens Hoefkens, head of Genedata Expressionist Business Unit. Genedata’s other package, ExpressMap, can interpret data from different omics platforms. ExpressMap “enables scientists to easily combine data from different technology sources and use the integrated data for statistical analysis without going through tedious matching of biological entities across technologies,” adds Hoefkens.
In September 2010, Genedata announced a collaboration with the Salk Institute to validate the new Expressionist Refiner module for whole genome analysis, including epigenetic modifications. “We can collect data much faster than we can analyze it, and a bioinformatics tool such as Genedata’s Refiner Genome makes it possible for us to integrate data from multiple data sets including RNA sequences, DNA methylation and histone modifications, and visualize them fast,” says Bob Schmitz, research associate at the Genomics Analysis Laboratory at the Salk Institute for Biological Studies.
Another all-inclusive omics data management package is GeneGo’s MetaCore, which includes pathway analysis and data mining tools to facilitate integration of genomics, proteomics, and metabonomics data. GeneGo’s systems toxicology package, ToxHunter, includes a TGx database combined with pathway analysis tools for lead optimization and biomarker validation, and is suitable for the investigation of environmental contaminants and drug candidates. To improve its systems toxicology packages, GeneGo launched a partnership with the FDA known as MetaTox, which allows industry and government representatives to discuss safety assessment issues, including TGx data analysis.
As drug development companies are incorporating high throughput technologies for safety assessment, the FDA is also developing its own tools to review data from the Voluntary Genomics Data Submission (VGDS) reports. Until recently, the FDA relied on ArrayTrack, a comprehensive microarray data management, analysis and interpretation system. The disadvantages of ArrayTrack are that 1) it is based on expensive database software (Oracle), 2) it was not designed to integrate data from different ’omics platforms, and 3) it is not a public repository, and cannot easily incorporate data from other laboratories.
ArrayTrack’s successor, ebTrack, has a wider scope of analysis tools, covering genomics, proteomics, metabonomics, and in vivo/in vitro toxicological data. ebTrack is based on the open-source PostgreSQL database engine and programmed in Java. Its design integrates three modules: 1) databases, 2) analysis tools, and 3) functional data modules that compile large amounts of data from the public domain. While the tool was developed primarily for toxicogenomics-driven environmental health research, it is also designed to handle data from the early-stage drug development process.
To validate TGx as a new method in safety assessment, regulatory agencies and large pharmaceutical companies have formed collaborations in the United States and Europe. The Predictive Safety Testing Consortium, initiated by the Critical Path Institute in Arizona, consists of 16 pharmaceutical companies including Pfizer, Novartis, and Merck. A similar organization in Europe, InnoMed PredTox, is a joint consortium between industry and the European Commission composed of 14 pharmaceutical companies, three academic institutions and two technology providers. These organizations are working toward combining data from ’omics technologies and conventional toxicology methods to facilitate decision making in preclinical safety evaluation.
“Both U.S. and EU have significant investment in TGx and other new technologies, stimulated by the need to improve drug development and get more medicines to meet medical need,” says Lord from DiscoTox. “This has been recognized globally by governments, regulators and the pharmaceutical industry. In Europe there is also increasing sensitivity to (and legislation on) the use of animals in drug and, especially, chemical safety assessment and this is driving efforts to use TGx and complementary technologies to reduce animal experimentation. The major pharmaceutical companies are multinational, providing good cross-talk between the initiatives with Europeans working in U.S.-based collaborations and U.S. colleagues working in EU-based programs.”
TGx is a rapidly evolving field as government, industry and academic institutions are developing and validating methods for early-stage safety assessment. While TGx is not expected to replace traditional toxicology methods, the hope is that it will aid in the elimination of toxic compounds from the drug pipeline and the discovery of new pathways of toxicity.
“TGx will be a standard part of the toxicology package in 10 years’ time, both in terms of prioritizing compounds in early discovery, as well as validating leads in later stages of development,” says Hewitt of Merck Serono. “But it will probably not be replacing existing toxicology studies, just be added as a ‘weight of evidence’ approach to add information on top of the gold standard histopathology.” •