Square Pegs in Round Holes?



By Thaddeus H. Grasela

Sept. 13, 2007 | A curious researcher stumbles upon a cache of data on the Internet and after a quick analysis discovers a dangerous public health threat.

Sound like the plot of a new medical thriller? In fact, it is the backdrop for a recent series of newspaper articles on the potential safety risks of Avandia, a medication for Type 2 diabetes mellitus. A meta-analysis performed on data from multiple clinical trials suggested that Avandia might increase the risk of heart attacks in diabetics. The report generated significant concern among patients and physicians about the safety of Avandia. In May, the FDA issued a safety alert for Avandia, but decided to keep the drug on the market. In August, the FDA announced that Avandia, along with other antidiabetic drugs, would feature new warnings on their packaging.

The idea of analyzing data combined from multiple clinical trials (i.e., meta-analysis) is an attractive strategy for monitoring and assuring drug safety. While each clinical trial is designed to answer a specific question (or questions) in a specific patient population, a meta-analysis of the pooled data from multiple clinical trials potentially provides a more complete picture of the risk-to-benefit tradeoffs. A critical consideration in performing a meta-analysis is the choice of studies to pool. Factors that weigh heavily in the selection of trials include the study designs, patient populations, and outcome metrics captured in each trial. Thus, a successful meta-analysis requires comprehensive information about the types of patients enrolled in each clinical trial, including demographic characteristics, disease severity, treatment regimens, and use of concomitant medications, among other factors.
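For readers who want to see what pooling means in practice, the short Python sketch below combines per-trial odds ratios using fixed-effect, inverse-variance weighting, one common pooling method among several. It is an illustration only: the trial counts are invented, the function names are hypothetical, and nothing here reproduces the Avandia analysis or its data.

```python
import math

# Hypothetical counts per trial:
# (events_treated, n_treated, events_control, n_control).
# These numbers are invented for illustration and describe no real drug or trial.
trials = [
    (12, 400, 8, 400),
    (5, 150, 4, 150),
    (20, 900, 15, 900),
]

def log_odds_ratio(a, n1, c, n2):
    """Return the log odds ratio and its variance for one trial."""
    b, d = n1 - a, n2 - c  # patients without the event in each arm
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d

def pooled_odds_ratio(trials):
    """Fixed-effect (inverse-variance) pooled odds ratio with a 95% interval."""
    estimates = [log_odds_ratio(*t) for t in trials]
    weights = [1 / var for _, var in estimates]  # precise trials count for more
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    return tuple(math.exp(x) for x in (pooled, pooled - 1.96 * se, pooled + 1.96 * se))

odds_ratio, lower, upper = pooled_odds_ratio(trials)
print(f"pooled OR = {odds_ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

The arithmetic is the easy part. The calculation assumes that "event" means the same thing in every trial and that the patient populations are comparable, which is exactly the information the pooled data must carry.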

The informatics challenges to performing a successful meta-analysis are some of the key driving forces for the pursuit of semantic interoperability and the development of data standards by organizations such as the Clinical Data Interchange Standards Consortium (CDISC). The efforts directed at data standardization come at an important time. The cost of drug development has soared in recent years, and challenges regarding drug safety, such as the potential for an increased risk with Avandia, have drawn the scrutiny of Congress.

Data standardization, as it is currently being practiced, involves bringing a group of experts together to share their experiences and personal perspectives with respect to specific concepts of interest. While this exercise is valuable in exploring nuances, a problem arises when the group moves to develop a standard definition. Often, instead of retaining the rich granularity revealed during the discussions of the concepts, the group moves to achieve consensus by developing a definition that satisfies a majority of the experts.

Imagine a group of experts called together to develop a definition for “happy.” Individuals drawing upon their recent experiences might describe feelings such as glad, content, cheerful, joyful, beaming, ecstatic, jubilant, and rapturous. The consensus definition (in this case drawn from the Oxford English Dictionary, ninth edition) might be “feeling or showing pleasure or contentment.” Unfortunately, the granularity that gave Shakespeare the tools to represent the human condition is lost in the consensus-forming process. So while current efforts at data standardization ensure that the primary statistical calculations for a study can be replicated, the loss of granularity reduces the ability to represent nuances that can be essential for the interpretation of future meta-analyses.
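The same loss can be sketched in a few lines of Python. The term list, the consensus code, and the two "sites" below are all hypothetical and come from no real data standard; the second function shows one possible mitigation (storing the original term next to the code), offered as an assumption rather than as a description of how any existing standard works.

```python
# Hypothetical mapping from recorded terms to a single consensus code.
# Neither the terms nor the code come from a real standard.
CONSENSUS = {
    term: "happy"
    for term in ("glad", "content", "cheerful", "joyful", "ecstatic", "jubilant")
}

def standardize(records):
    """Replace each recorded term with its consensus code, discarding the original."""
    return [CONSENSUS[r] for r in records]

def standardize_keep_source(records):
    """Store the consensus code alongside the original term so detail survives."""
    return [{"code": CONSENSUS[r], "verbatim": r} for r in records]

site_a = ["content", "content", "glad"]
site_b = ["ecstatic", "jubilant", "ecstatic"]

# After consensus coding, the two sites can no longer be told apart.
print(standardize(site_a) == standardize(site_b))
# Keeping the original term preserves the difference for later analyses.
print(standardize_keep_source(site_a) == standardize_keep_source(site_b))
```

Both sites still support the same headline count of "happy" subjects, which is why the primary calculation can be replicated. Only the version that retains the original terms lets a later analyst ask whether the two sites were really observing the same thing.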

Premature Standards’ Problems
The desire of medical researchers to achieve the promise of semantic interoperability has created a sense of urgency for the development and deployment of data standards. This urgency provides the justification for distributing early versions of a standard, with the idea that the early versions will be improved in subsequent releases.

This rush to implement a standard has two important consequences. First, late adopters have a reason to hold back from implementation because the standard is unstable. Second, and perhaps more importantly, early adopters are forced to use what is available, so different dialects emerge as they tackle tasks the initial version did not anticipate. This need to pound square pegs into round holes creates an obstacle in the pursuit of semantic interoperability because it is difficult to rectify these issues once a premature standard has come into widespread use.

The goal of semantic interoperability, which includes the goal of facilitating analyses across trials for drug safety assessments, will require several changes to the current strategy of data standardization. First, the short-term goal of data standardization must shift from a focus on promulgating standards to an emphasis on unraveling the meanings behind complex concepts. Second, the output of this process must then be encoded in a scientific ontology built on standard formats and methodologies for ontology development, maintenance, and use in order to foster the creation of principled ontologies. (Additional information can be found at www.obofoundry.org.)

Semantic interoperability may very well remain elusive for the foreseeable future. One way to approach this goal incrementally might be to focus, in the short term, on learning about ambiguities sooner so that we can reach a higher level of semantic interoperability faster. This process, known to informaticians as disambiguation, involves unraveling the complexities that are often implicitly represented in a particular data standard term.

A growing number of ontologies are being created to address various scientific domains. Of particular importance to the complex data standardization efforts in the biomedical sciences is the implementation of a curation effort. This effort aims to consolidate the terms generated from disparate ontologies in order to ensure their reusability, and to ensure compatibility between neighboring ontologies.

This kind of curation has been a critically valuable component in the development of the Gene Ontology for organizing and mining newly elucidated genomic information. New approaches to drug development must evolve if we are to see continued improvement in research productivity and drug safety. The move toward scientific ontologies as a basis for developing data standards is one approach to preparing for these changes and allowing for the evolution of the informatics backbone for the pharmaceutical and biotechnology industry.

Thaddeus H. Grasela is president and CEO of Cognigen Corp. Email: ted.grasela@cognigencorp.com.
