Experiments in Data Integration

By Robert M. Frederickson

Feb 15, 2006 | The increasing cost of drug development has put pressure on pharmaceutical and biotech companies to make better use of their vast stores of research data — and to manage the information and knowledge contained within these data. However, the technical difficulty in managing, integrating, and sharing data and knowledge between different labs and groups has plagued these efforts, as evidenced by the lack of data search tools for scientists.

Laboratory researchers tend to collect, sort, and analyze data using spreadsheets and word-processing programs, which generally require that researchers format, import, or reenter data manually from instruments into separate files. Because of the idiosyncrasies of the products and systems used in different organizations, integration can involve significant IT development time and cost. Even if the disparate programs could be integrated, the data usually lack necessary background information, such as experimental parameters and sample data.

Seattle-based Teranode was founded in 2002 with the aim of creating a dynamic platform for data and workflow management. Using an open modeling language and format called VLX (Visual Language of Experimentation), Teranode developed software that allows scientists to perform experiments using a schematic workflow definition that is both virtual and real. The software allows scientists to design and automate experiments and to manage the resulting data through a user-friendly set of tools and icons without having to program. The open VLX data can be easily stored or indexed into any number of repositories or search engines. Data can thus be easily searched and reused by scientists and application developers. The potential impact of open data is an ability to improve drug safety, accelerate project schedules, and reduce development costs.

The company evolved from a group of University of Washington scientists and has rapidly become a leading developer of informatics software for scientists. The first product developed using the VLX format, TERANODE XDA, was delivered only in Q4 2004. Teranode’s customer list is growing and includes the Fred Hutchinson Cancer Research Center, MIT, and the NIH Chemical Genomics Center, as well as Pfizer, Amgen, and GlaxoSmithKline.

Component Parts
TERANODE XDA comprises three main components: Design Suite (TDS), Model Server (TMS), and Integration. TDS provides a standard visual user interface to document both research protocols and biological pathways, through the Protocol Modeler and Biological Modeler tools, respectively. Protocol Modeler allows users to custom-design, manage, and analyze laboratory protocols and data and to create workflow processes made up of groups of protocols. Biological Modeler allows biological systems data to be visualized and analyzed; biological computations and additional relevant annotation can be attached to the data. A key attribute of the system is crosstalk between experimentation and pathway modeling, such that new results can drive new hypotheses and development of new models. Upon completion, the entire research activity created and employed by these tools can be packaged and archived into a Web-accessible repository through the TMS, which allows controlled, multi-user access to TDS. The openness of VLX allows data to be easily indexed by search engines or imported into an existing data warehouse.

TERANODE Integration is essentially a library of software plug-ins that connects TDS or TMS with external databases and existing laboratory systems, analytics, and instrumentation to facilitate data and information import. Drivers have been developed for the most common laboratory instrumentation in addition to widely used external databases such as KEGG, which provides information on 34,000 biological and biochemical pathways. KEGG integration allows users to import whole pathways into TDS, which can then be edited and integrated with the set of genes or disease markers under study.

A new version of TERANODE XDA that will incorporate Semantic Web technologies is due in early 2006. These technologies will increase the sources of data with which TERANODE XDA can integrate and make it easier for external search engines and data warehouses to mine VLX data. As a result, scientists will be able to search and aggregate data across labs and the Internet.

For reprints and/or copyright permission, please contact Angela Parsons, 781.972.5467.