Square Pegs in Round Holes?


By Thaddeus H. Grasela

Sept. 13, 2007 | A curious researcher stumbles upon a cache of data on the Internet and after a quick analysis discovers a dangerous public health threat.

Sound like the plot of a new medical thriller? In fact, it is the backdrop for a recent series of newspaper articles on the potential safety risks of Avandia, a medication for Type 2 diabetes mellitus. A meta-analysis performed on data from multiple clinical trials suggested that Avandia might increase the risk of heart attacks in diabetics. The report generated significant concern among patients and physicians about the safety of Avandia. In May, the FDA issued a safety alert for Avandia, but decided to keep the drug on the market. In August, the FDA announced that Avandia, along with other antidiabetic drugs, would feature new warnings on their packaging.

The idea of analyzing data combined from multiple clinical trials (i.e., meta-analysis) is an attractive strategy for monitoring and assuring drug safety. While each clinical trial is designed to answer a specific question (or questions) in a specific patient population, a meta-analysis of the pooled data from multiple clinical trials potentially provides a more complete picture of the risk-to-benefit tradeoffs. A critical consideration in the performance of a meta-analysis involves the choice of studies to pool. Factors that figure prominently in the selection of trials include the study designs, patient populations, and outcome metrics captured in each trial. Thus, a successful meta-analysis requires the availability of comprehensive information about the types of patients enrolled in clinical trials, such as information on demographic characteristics, disease severity, treatment regimens, and use of concomitant medications, among other factors.
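To make the idea of pooling concrete, here is a minimal sketch of one common approach, a fixed-effect, inverse-variance pooling of log odds ratios. It is an illustration only and not necessarily the method used in the Avandia analysis; the trial counts and the function names are hypothetical, and the sketch assumes every trial has at least one event and one non-event in each arm.

```python
import math

def log_odds_ratio(events_trt, n_trt, events_ctl, n_ctl):
    """Return the log odds ratio and its approximate variance for one trial."""
    a, b = events_trt, n_trt - events_trt   # treated: events, non-events
    c, d = events_ctl, n_ctl - events_ctl   # control: events, non-events
    log_or = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d
    return log_or, variance

def pooled_odds_ratio(trials):
    """Pool trials by weighting each log odds ratio by 1 / variance.

    Returns (pooled odds ratio, lower 95% bound, upper 95% bound).
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for trial in trials:
        log_or, variance = log_odds_ratio(*trial)
        weight = 1 / variance
        weighted_sum += weight * log_or
        total_weight += weight
    pooled = weighted_sum / total_weight
    std_err = math.sqrt(1 / total_weight)
    return (
        math.exp(pooled),
        math.exp(pooled - 1.96 * std_err),
        math.exp(pooled + 1.96 * std_err),
    )

# Hypothetical trials: (events_treated, n_treated, events_control, n_control)
trials = [(10, 500, 6, 500), (4, 200, 3, 210), (15, 1000, 11, 990)]
estimate, lower, upper = pooled_odds_ratio(trials)
print(f"pooled OR = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

The arithmetic is the easy part. The sketch silently assumes that an "event" means the same thing in every trial and that the populations are comparable, which is exactly the information a meta-analyst needs the underlying data standards to preserve.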

The informatics challenges to performing a successful meta-analysis are some of the key driving forces for the pursuit of semantic interoperability and the development of data standards by organizations such as the Clinical Data Interchange Standards Consortium (CDISC). The efforts directed at data standardization come at an important time. The cost of drug development has soared in recent years, and challenges regarding drug safety, such as the potential for an increased risk with Avandia, have drawn the scrutiny of Congress.

Data standardization, as it is currently being practiced, involves bringing a group of experts together to share their experiences and personal perspectives with respect to specific concepts of interest. While this exercise is valuable in exploring nuances, a problem arises when the group moves to develop a standard definition. Often, instead of retaining the rich granularity revealed during the discussions of the concepts, the group moves to achieve consensus by developing a definition that satisfies a majority of the experts.

Imagine a group of experts called together to develop a definition for “happy.” Individuals drawing upon their recent experiences might describe feelings such as glad, content, cheerful, joyful, beaming, ecstatic, jubilant, and rapturous. The consensus definition (in this case drawn from the Oxford English Dictionary, ninth edition) might be “feeling or showing pleasure or contentment.” Unfortunately, the granularity that gave Shakespeare the tools to represent the human condition is lost in the consensus-forming process. So while current efforts at data standardization ensure that the primary statistical calculations for a study can be replicated, the loss of granularity reduces the ability to represent nuances that can be essential for the interpretation of future meta-analyses.

Premature Standards’ Problems
The desire of medical researchers to achieve the promise of semantic interoperability has created a sense of urgency for the development and deployment of data standards. This urgency provides the justification for distributing early versions of a standard, with the idea that the early versions will be improved in subsequent releases.

This rush to implement a standard has two important consequences. First, late adopters have a reason for holding back from implementation because of instability in the standard. Second, and perhaps more importantly, the early adopters are forced to use what is available, and different dialects emerge as they accomplish tasks not anticipated by the initial version. This need to pound square pegs into round holes creates an obstacle in the pursuit of semantic interoperability because it is difficult to rectify these issues once a premature standard has come into widespread use.

The goal of semantic interoperability, which includes the goal of facilitating analyses across trials for drug safety assessments, will require several changes to the current strategy of data standardization. First, the short-term goal of data standardization must shift from a focus on promulgating standards to an emphasis on unraveling the meanings behind complex concepts. Second, the output of this process must then be encoded in a scientific ontology built on standard formats and methodologies for ontology development, maintenance, and use in order to foster the creation of principled ontologies. (Additional information can be found at www.obofoundry.org.)

Semantic interoperability may very well remain elusive for the foreseeable future. One approach to incrementally achieve this goal might be to adopt a short-term focus on developing a strategy to learn about ambiguities sooner so that we can get to a higher level of semantic interoperability faster. This process, known to informaticians as disambiguation, involves the unraveling of complexities that are often implicitly represented in a particular data standard term.
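As a rough illustration of what learning about ambiguities sooner could look like in practice, the following sketch flags standard terms that are hiding more than one underlying meaning. It reuses the "happy" example from earlier; the record layout, the trial names, and the function name are all hypothetical and are not drawn from any actual data standard.

```python
from collections import defaultdict

# Hypothetical records: (source, standard term, verbatim term as originally recorded)
records = [
    ("trial_A", "happy", "content"),
    ("trial_A", "happy", "cheerful"),
    ("trial_B", "happy", "ecstatic"),
    ("trial_B", "sad", "sad"),
]

def find_ambiguous_terms(records):
    """Return standard terms that stand in for more than one verbatim meaning."""
    meanings = defaultdict(set)
    for _source, standard_term, verbatim in records:
        meanings[standard_term].add(verbatim)
    return {
        term: sorted(verbatims)
        for term, verbatims in meanings.items()
        if len(verbatims) > 1
    }

print(find_ambiguous_terms(records))
```

A report like this only works if the verbatim terms are retained alongside the standard ones. Once they are discarded in favor of the consensus definition, there is nothing left to disambiguate.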

A growing number of ontologies are being created to address various scientific domains. Of particular importance to the complex data standardization efforts in the biomedical sciences is the implementation of a curation effort. This effort aims to consolidate the terms generated from disparate ontologies in order to ensure their reusability, and to ensure compatibility between neighboring ontologies.

This effort has been a critically valuable component in the development of the Gene Ontology for organizing and mining newly elucidated genomic information. New approaches to drug development must evolve if we are to see continued improvement in research productivity and drug safety. The move towards scientific ontologies as a basis for developing data standards is one approach to preparing for these changes and allowing for the evolution of the informatics backbone for the pharmaceutical and biotechnology industries.

Thaddeus H. Grasela is president and CEO of Cognigen Corp. Email: ted.grasela@cognigencorp.com.
