GUEST COMMENTARY | By Pete Smietana and Xing Jian Lou
May 15, 2004 | Bottlenecks prevent complete automation of laboratory workflow. Potential snags abound: Instrument-control software does not recognize upstream tagging information or control parameters, which must be manually entered; acquired data do not contain upstream tagging information and are usually in a proprietary format, requiring manual storage; and analysis programs that process disparate data are limited because of proprietary or unavailable binary data formats.
These bottlenecks typically happen because upstream tagging information or control parameters and experimental conditions cannot be automatically read by the downstream analysis programs. Researchers are usually reluctant to enter such tagging information and parameters during the experiment process. Hence, the workflow cannot really flow.
Great Communicators: XML & XSLT
To solve this "communication" problem within a workflow, HTML, XML, and XSLT (extensible stylesheet language transformations) can be used as the "communicators."
The syntax structure of XML is similar to that of HTML, except that authors can define their own keywords (element names) and attributes, which extends its capabilities. These keywords and attributes can be used to represent hierarchies of information. In addition, XML syntax provides ways to describe the type of data and to specify validation checks for the data that are passed from an informatics data source to an instrument's controller. Therefore, XML can be used to send instruments data that are typed and easy to read and parse.
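As a rough illustration of the points above, the following Python sketch parses a small XML document that carries tagging information and typed control parameters, and applies a simple validation check to each value. The element and attribute names (run, instrument, parameter, sample_id, type) are invented for this example and do not come from any instrument vendor; in practice a schema language such as XML Schema would usually do the type checking that the CASTS table stands in for here.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: names and values are assumptions for illustration.
RUN_XML = """<run sample_id="S-001" plate="P-12">
  <instrument name="reader-1">
    <parameter name="temperature" type="float" unit="C">37.0</parameter>
    <parameter name="cycles" type="int">30</parameter>
  </instrument>
</run>"""

# Maps the declared type of a parameter to a Python conversion function.
CASTS = {"float": float, "int": int, "string": str}


def read_parameters(xml_text):
    """Return (tags, parameters); raise ValueError if a value fails its check."""
    root = ET.fromstring(xml_text)
    tags = dict(root.attrib)  # upstream tagging information
    params = {}
    for node in root.iter("parameter"):
        declared = node.get("type", "string")
        cast = CASTS.get(declared)
        if cast is None:
            raise ValueError(f"unknown type {declared!r}")
        # float() and int() raise ValueError themselves on malformed text.
        params[node.get("name")] = cast((node.text or "").strip())
    return tags, params


tags, params = read_parameters(RUN_XML)
print(tags)
print(params)
```

A controller that receives a malformed value, such as text where a number is declared, would get a ValueError instead of silently running with a bad setting.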
XSLT files are used to transform XML files into other forms (HTML files, other XML files, or application-specific text files). A transform file is a script written in the extensible stylesheet language, itself using XML syntax, that processes XML files and creates an output file to be read by a target application. That is, the same set of XML files would have a separate XSLT file for each type of target application. Instrument manufacturers could provide file input specifications, and informatics system developers could create XSLT files specific to each manufacturer. Alternatively, instrument manufacturers could create XSLT files specific to the XML files generated by informatics systems. Either way, XSLT provides a flexible mechanism for rendering and extracting data contained within XML files in a format compatible with the target instrument controller.
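To make the idea concrete, here is a minimal sketch of an XSLT 1.0 stylesheet that turns a run description into a key=value text file of the kind an instrument controller might accept. The run document, its element names, and the output layout are all assumptions made up for this example. Python's standard library can parse the stylesheet (it is ordinary XML) but cannot execute it, so the apply_stylesheet helper assumes the third-party lxml package is installed.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# Hypothetical run description produced by an upstream task.
RUN_XML = """<run sample_id="S-001">
  <parameter name="temperature" unit="C">37.0</parameter>
  <parameter name="duration" unit="s">120</parameter>
</run>"""

# One stylesheet per target application; this one emits plain text.
STYLESHEET = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/run">
    <xsl:text>SAMPLE=</xsl:text><xsl:value-of select="@sample_id"/>
    <xsl:text>&#10;</xsl:text>
    <xsl:for-each select="parameter">
      <xsl:value-of select="@name"/>
      <xsl:text>=</xsl:text>
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>"""


def describe_stylesheet(xslt_text):
    """List the template match patterns, using only the standard library."""
    root = ET.fromstring(xslt_text)
    return [t.get("match") for t in root.iter(f"{{{XSL_NS}}}template")]


def apply_stylesheet(xml_text, xslt_text):
    """Run the transform. Assumes the third-party lxml package is available."""
    from lxml import etree  # imported lazily so the rest works without lxml

    transform = etree.XSLT(etree.fromstring(xslt_text))
    return str(transform(etree.fromstring(xml_text)))


patterns = describe_stylesheet(STYLESHEET)
print(patterns)
```

With lxml installed, calling apply_stylesheet(RUN_XML, STYLESHEET) should yield one line for the sample identifier followed by one line per parameter. Supporting a second instrument would mean writing a second stylesheet, not changing the XML.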
Once the XSLT file is created, the upstream tasks simply generate the appropriate XML files that contain the data-tagging and instrument-control parameters. The instrument-control software then has several options: It can use its browser to process and display the XML files with the XSLT file; it can read in a text file that the XSLT file generated from the XML data; or it can use XPath expressions to navigate parent/child XML nodes and locate specific information, which an XML reader function can then retrieve directly from the XML file.
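The XPath route can be sketched with Python's standard xml.etree.ElementTree module, which supports a limited subset of XPath. The document below and its element names are hypothetical; a full XPath implementation would accept richer expressions than the ones shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical control file covering two instruments.
RUN_XML = """<run sample_id="S-001">
  <instrument name="reader-1">
    <parameter name="temperature" unit="C">37.0</parameter>
    <parameter name="cycles">30</parameter>
  </instrument>
  <instrument name="washer-2">
    <parameter name="cycles">3</parameter>
  </instrument>
</run>"""

root = ET.fromstring(RUN_XML)

# Child navigation: one parameter of one named instrument.
temperature = root.find(
    "./instrument[@name='reader-1']/parameter[@name='temperature']"
)

# The same relative path evaluated against each instrument node.
cycles = {
    inst.get("name"): int(inst.find("parameter[@name='cycles']").text)
    for inst in root.findall("instrument")
}

# Parent navigation: which instruments define a temperature parameter?
owners = [
    node.get("name")
    for node in root.findall(".//parameter[@name='temperature']/..")
]

print(temperature.text, temperature.get("unit"))
print(cycles)
print(owners)
```

Each instrument controller needs to know only the path to its own settings, so the same XML file can serve several instruments.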
In addition, the tagging data contained in the XML files can be attached to or incorporated within the acquired output data file, which allows informatics software to store the data in the repository automatically by using the tags to create the relational links.
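A minimal sketch of that last step, assuming the instrument writes a small XML wrapper around each acquired file and the repository is a relational database. The wrapper layout, table names, and columns are invented for illustration, and an in-memory SQLite database stands in for the real repository.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical wrapper the instrument emits alongside its data file.
ACQUIRED_XML = """<acquisition file="run_0001.dat">
  <tags sample_id="S-001" plate="P-12"/>
</acquisition>"""

SCHEMA = """
CREATE TABLE samples (sample_id TEXT PRIMARY KEY, plate TEXT);
CREATE TABLE acquisitions (
    id INTEGER PRIMARY KEY,
    sample_id TEXT NOT NULL REFERENCES samples(sample_id),
    file_name TEXT NOT NULL
);
"""


def store_acquisition(conn, xml_text):
    """File the acquisition under its sample using only the embedded tags."""
    root = ET.fromstring(xml_text)
    tags = root.find("tags").attrib
    # Create the sample row if an upstream step has not already done so.
    conn.execute(
        "INSERT OR IGNORE INTO samples (sample_id, plate) VALUES (?, ?)",
        (tags["sample_id"], tags["plate"]),
    )
    # The sample_id tag becomes the relational link to the sample.
    conn.execute(
        "INSERT INTO acquisitions (sample_id, file_name) VALUES (?, ?)",
        (tags["sample_id"], root.get("file")),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
store_acquisition(conn, ACQUIRED_XML)

rows = conn.execute(
    "SELECT s.plate, a.file_name FROM acquisitions a "
    "JOIN samples s ON s.sample_id = a.sample_id"
).fetchall()
print(rows)
```

No one has to type the sample identifier at the instrument; the link is made from tags that traveled with the data.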
XML data files coupled with XSLT scripting files can help eliminate automation workflow bottlenecks. Therefore, instrument controllers should be designed to process XML/XSLT files for dynamic control and tagging of acquired data. Instrument manufacturers should consider their instruments as peripheral components to laboratory workflow informatics data systems by providing easily loadable device drivers (e.g., data readers and writers for proprietary data formats). This is similar in concept to those drivers used by peripheral device manufacturers in the computer industry. Many bottlenecks could be avoided if informatics data systems provided mechanisms for installing these device drivers so that new instruments could be seamlessly integrated into laboratory workflow.
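One way an informatics system might accept such drivers is a simple registry keyed by data format, sketched below. Everything here is hypothetical: the format name, the toy comma-separated "proprietary" format, and the register_reader and load functions are assumptions for illustration, not an existing API.

```python
from typing import Callable, Dict, List

Reader = Callable[[bytes], List[float]]

# Installed drivers, keyed by the data format each one understands.
_READERS: Dict[str, Reader] = {}


def register_reader(format_name: str):
    """Decorator a vendor-supplied module would use to install its reader."""

    def decorator(func: Reader) -> Reader:
        _READERS[format_name] = func
        return func

    return decorator


@register_reader("vendor-a-csv")
def read_vendor_a(raw: bytes) -> List[float]:
    # Stand-in for a proprietary format: comma-separated ASCII numbers.
    return [float(x) for x in raw.decode("ascii").split(",") if x.strip()]


def load(format_name: str, raw: bytes) -> List[float]:
    """Decode acquired data with whichever driver claims the format."""
    try:
        reader = _READERS[format_name]
    except KeyError:
        raise LookupError(f"no driver installed for {format_name!r}") from None
    return reader(raw)


values = load("vendor-a-csv", b"1.0,2.5,3")
print(values)
```

Adding a new instrument then means installing one more reader, with no change to the informatics system itself.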
Pete Smietana is president and chief software architect, and Xing Jian Lou is scientific advisor, at BioXing, in Danville, Calif. E-mail: psmietana@bioxing.com.