Oracle OpenWorld 2006: Pharma Stuck on Semantic Web



By Wendy Wolfson

Nov. 15, 2006 | SAN FRANCISCO — Among the star attractions of the 2006 Oracle OpenWorld conference* for its 42,000 attendees — aside from sessions on product lines including Oracle E-Business Suite, Oracle technology, Oracle Fusion Middleware, PeopleSoft Enterprise, and more — was a concert by Sir Elton John. Not far behind was a session in which pharma executives offered evidence that some, at least, are using the semantic web to work smarter and lower drug development costs.

The session, “Semantic Data Integration in Life Sciences,” was chaired by Susie Stephens, principal product manager at Oracle. Stephens said the semantic web can help life sciences companies integrate what they know and how they know it. Traditional approaches to knowledge management involved standardization of terms, but the focus now is less on a priori standardization, as defining a proper schema for knowledge can be tricky. Rather than waste time and energy on defining what to say, Stephens said, the underlying approach focuses on how to say things, emphasizing data sharing and explicit and ad hoc information.

The semantic web can integrate heterogeneous data by using explicit semantics to make data shareable and available. Oracle is working with the Resource Description Framework (RDF), the World Wide Web Consortium (W3C) standard data format. RDF expresses statements as triples, each a “node-link-node” structure whose terms each have their own URI. RDF-S (RDF Schema) provides support for vocabularies. Because each component of a triple has a unique identifier, data from different sources can be merged.
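
To illustrate why unique identifiers make merging straightforward, here is a minimal Python sketch. It is not Oracle's implementation; the `http://example.org/` URIs, the predicate names, and the `merge` and `describe` helpers are all hypothetical and chosen only for illustration.

```python
# Hypothetical URIs for illustration; none come from a real dataset.
EX = "http://example.org/"

# Each triple is (subject, predicate, object): the "node-link-node" structure.
lab_a = {
    (EX + "gene/G1", EX + "encodes", EX + "protein/P1"),
    (EX + "protein/P1", EX + "locatedIn", EX + "compartment/nucleus"),
}
lab_b = {
    (EX + "protein/P1", EX + "associatedWith", EX + "disease/D1"),
    # The same statement as in lab_a, asserted independently.
    (EX + "gene/G1", EX + "encodes", EX + "protein/P1"),
}


def merge(*graphs):
    """Merge graphs by set union; identical triples collapse into one."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, subject):
    """Return the sorted (predicate, object) pairs stated about a subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


merged = merge(lab_a, lab_b)
for predicate, obj in describe(merged, EX + "protein/P1"):
    print(predicate, obj)
```

Because both sources refer to the protein by the same URI, the merged graph answers questions about it that neither source could answer alone, and no schema mapping step is needed.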

Stephens said the Oracle RDF data model supports RDF and RDF-S. Although triples are stored in an informational table, users can interact with them as objects. Links represent complete RDF triples. A table function allows graph queries to be embedded in an SQL query, enabling searches for arbitrary patterns against RDF data. Researchers can thus query RDF data and also draw inferences based on RDF, RDF-S and user-defined rules. Oracle does performance testing with UniProt.
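
The combination of pattern queries and inferencing can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Oracle's table function or query syntax: the `ex:` terms, the `match`, `infer_rdfs` and `proteins` helpers are hypothetical, and only two RDF-S rules (subclass transitivity and type propagation) are applied.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

# Hypothetical facts, written with short prefixed names instead of full URIs.
triples = {
    ("ex:P1", RDF_TYPE, "ex:Kinase"),
    ("ex:Kinase", SUBCLASS, "ex:Enzyme"),
    ("ex:Enzyme", SUBCLASS, "ex:Protein"),
    ("ex:P2", RDF_TYPE, "ex:Protein"),
}


def match(graph, pattern):
    """Yield variable bindings for one triple pattern; '?name' is a variable."""
    for triple in graph:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield binding


def infer_rdfs(graph):
    """Forward-chain two RDF-S rules until no new triples appear."""
    inferred = set(graph)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                if o != s2 or p2 != SUBCLASS:
                    continue
                if p == SUBCLASS:      # A subClassOf B, B subClassOf C
                    new.add((s, SUBCLASS, o2))
                elif p == RDF_TYPE:    # x type A, A subClassOf B
                    new.add((s, RDF_TYPE, o2))
        if new <= inferred:
            return inferred
        inferred |= new


def proteins(graph):
    """Query: which resources are typed as ex:Protein?"""
    return sorted(b["?x"] for b in match(graph, ("?x", RDF_TYPE, "ex:Protein")))


print(proteins(triples))              # stated facts only
print(proteins(infer_rdfs(triples)))  # stated plus inferred facts
```

Run against the stated facts alone, the query finds only `ex:P2`; after inferencing it also finds `ex:P1`, because a kinase is an enzyme and an enzyme is a protein. A production triple store would use indexes rather than the nested loops shown here.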

Putting the Web to Work
“The semantic web is a very exciting technology to companies like Pfizer,” said Giles Day, site head for research informatics at Pfizer’s Research Technology Center, in Cambridge, MA. “Pfizer has been through some extraordinary growth, but to maintain growth we will have to change the way we operate.” Day said Pfizer was working to reduce attrition, noting that company scientists must trawl through 85 candidates for each successful drug.

“That type of investment is not sustainable,” Day said. Pfizer continues to acquire companies at a “phenomenal” rate, turning to smaller biotechs such as Rinat Pharmaceuticals in South San Francisco, while also outsourcing to more CROs and collaborating with remote chemistry groups, especially in China. “That provides a challenge in securely passing information,” Day said. 

Day said the significance of the semantic web is expanding because of the growth in the scale and complexity of data. Research programs generate complex data at high speed, and that data is hard to integrate across an organization. “[Pfizer] is a global operation, the sun never sets on what we do,” said Day. Pfizer builds huge data warehouses and has many siloed data sets. The company intends to start breaking down these divisional boundaries so that researchers can communicate and make decisions more effectively.

Day cited a study of an unnamed new medicine in first human trials. “We might see unusual events,” Day said. “Blood pressure might be dropping, we might see something in brain pathology. Are these events linked to another biological pathway? Can this jumpstart a whole new research program that could discover new indications?” Currently events are monitored by hand with doctors using different terms to describe the same phenomena. Having a semantic layer could make data accessible to researchers throughout the company. 

Day issued an important caveat: be careful about what inferences one derives from technologies. You can look up vampires, for example, and find that they are hematophagic and that you can stab them with wooden stakes. But you might not see anywhere that they don’t actually exist. In laying ontology layers on top of data, one wants to have confidence that the information is good and the inferences are valid. But with, say, 30,000 objects, one can get “a great big hairball,” Day pointed out.

Eli Lilly’s Patrick Hartman, team leader of discovery informatics, said Lilly has chosen the RDF approach “because pharma has a dilemma.” Getting a drug from bench to market takes, on average, 5,000 screened compounds, 15 years and $1 billion, and that doesn’t even include the competition once the drug is on the market.

Lilly wants to reduce the development attrition curve by cutting risks earlier in the pipeline. Informatics can be used to identify and validate promising targets. Starting with a therapeutic class and disease state, what are the biological pathways? How good is the target? Unfortunately, sources of data on pharmacology, druggability, ligands and toxicology are heterogeneous and may lack sufficient statistical power to draw real conclusions. “We tried data warehousing but it’s too expensive,” Hartman said. “How do we federate?”

Lilly uses RDF to relate ontologies to public data such as Entrez Gene. Lilly also uses a resource called Lingua Franca, comprising its Discovery Target Assessment Tool (TAT), which provides one-stop access to integrated information for target assessment. TAT accesses key content bases including pathways, disease associations, competitive chemical entities, and detailed target analysis. Discovery TAT is built on the Lilly Science Grid (LSG), a single technical architecture for integration of plug-ins.

Hartman anticipates RDF enabling semantic description and comparisons of patients and cellomics data, as in semantically describing cellular localization with other properties, such as how cell size relates to gene expression, and relating gene exons to transcription. Hartman said Lilly is now serializing data to XML, taking a federated approach and leaving it in its original sources.

Both Lilly and Pfizer believe the semantic web will also aid in alternative drug indication discovery. RDF may help them better pick through their databases and researchers’ notebooks for promising compounds that might otherwise be left by the wayside. Both Day and Hartman are mum about results so far, but hint at promise.

Wendy Wolfson is a science and technology writer based in Oakland, CA.

*Oracle OpenWorld 2006; San Francisco, October 23-26, 2006.
