Bridging Gaps with Web Services

By BIO-IT World


 

November 19, 2004 | Researchers need to be able to migrate their processes from local development environments to large compute facilities. Organizations need to be able to address global computing requirements with flexible and interoperable systems. Data-processing pipelines and workflows need to be transferable and repeatable. IT needs to encourage collaboration, not isolate communities.

The emerging technologies known as Web services address portions of all these needs. They allow access to remote code at the level of a function or an object call. (This is far from a new idea: A pedantic case could be made that remote access to functions started with the first "jump" instruction in early microcode.)

At the core, Web services are really nothing more than a generic way to provide an application programming interface (API) to remote function calls in a language-independent manner. A server advertises its functions using a document in Web Services Description Language (WSDL), an XML document type. Client/server communication is entirely via documents written in Simple Object Access Protocol (SOAP), another dialect of XML.
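To make the idea of "a function call expressed as a document" concrete, here is a minimal sketch in Python that builds a SOAP 1.1 request envelope and reads it back. The service namespace `urn:example:sequence-tools`, the operation name `reverseComplement`, and its `sequence` parameter are hypothetical, invented only for illustration; a real service would define these in its WSDL document.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:sequence-tools"  # hypothetical service namespace


def build_request(operation, params):
    """Return a SOAP 1.1 request document (as a string) for one remote call."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # The operation becomes an element; its arguments become child elements.
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_request(document):
    """Recover the operation name and its parameters from a request document."""
    body = ET.fromstring(document).find(f"{{{SOAP_ENV}}}Body")
    call = body[0]
    operation = call.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
    return operation, {child.tag: child.text for child in call}


request = build_request("reverseComplement", {"sequence": "ATGC"})
print(request)
print(parse_request(request))
```

Nothing in the document depends on the language that produced it, which is the point: a Perl client and a Java server only need to agree on the XML.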

Because the API is specified in a formal yet generic manner, and because all communication passes through an intermediate, XML-based protocol, Web services are entirely independent of language, operating system, client, and wire protocol.

The majority of implementations use a Web server to provide access to WSDL documents, and to pass SOAP documents from client to server and back again. From our perspective, ease of use is the only reason to implement Web services over the Web. For debugging purposes, The BioTeam has used everything from Simple Mail Transfer Protocol (SMTP) to flat files in moving SOAP documents from client to server and back again. Our experience has shown that these alternative transports are particularly convenient when a firewall or other network barrier makes direct Web access impossible. In terms of client and server languages, there is a SOAP module or library for every language we've checked. We have experimented with Perl, Ruby, C++, C#, and Java, but there are many others.
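The flat-file approach mentioned above can be sketched as follows. This is an assumption-laden illustration, not The BioTeam's actual tooling: the spool file names, the `serve_once` helper, the `reverseComplement` operation, and the `urn:example:sequence-tools` namespace are all hypothetical. The client writes a SOAP request to a file, a server-side function reads it, dispatches the call, and writes the SOAP response to a second file.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:sequence-tools"  # hypothetical service namespace


def reverse_complement(sequence):
    """Toy service function: reverse-complement an uppercase DNA string."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(sequence))


HANDLERS = {"reverseComplement": reverse_complement}  # operation name -> function


def wrap(element_name, fields):
    """Wrap one payload element and its fields in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    payload = ET.SubElement(body, f"{{{SERVICE_NS}}}{element_name}")
    for name, value in fields.items():
        ET.SubElement(payload, name).text = value
    return ET.tostring(envelope, encoding="unicode")


def unwrap(document):
    """Return (payload element name, {field: text}) from a SOAP envelope."""
    payload = ET.fromstring(document).find(f"{{{SOAP_ENV}}}Body")[0]
    return payload.tag.split("}", 1)[1], {c.tag: c.text for c in payload}


def serve_once(request_path, response_path):
    """Read one request file, run the named operation, write the response file."""
    operation, args = unwrap(request_path.read_text(encoding="utf-8"))
    result = HANDLERS[operation](**args)
    response = wrap(operation + "Response", {"result": result})
    response_path.write_text(response, encoding="utf-8")


with tempfile.TemporaryDirectory() as spool:
    request_file = Path(spool) / "request.xml"
    response_file = Path(spool) / "response.xml"
    # Client side: drop the request document into the spool directory.
    request_file.write_text(
        wrap("reverseComplement", {"sequence": "ATGC"}), encoding="utf-8"
    )
    # Server side: no HTTP involved, only files.
    serve_once(request_file, response_file)
    reply = unwrap(response_file.read_text(encoding="utf-8"))

print(reply)
```

Swapping the file reads and writes for an SMTP message or an HTTP POST would change the transport without touching the documents themselves.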

On a recent project, we had the opportunity to add a Web services interface to an existing cluster that was already using PISE (the open-source software for building interfaces) to automatically provide Web access to a number of bioinformatics applications. Our experience with that project convinced us of the value and power provided by a formally published API backed up by a SOAP interface. Web services provided a middle ground between the raw power and matching complexity of the command line, and the powerful but sometimes limiting interface of Web pages.

SOAP and WSDL are not a panacea, and they do not completely eliminate the subtle and time-consuming process of debugging. We tested our system using a variety of client languages, which revealed differences in interpretation and implementation. Multiple return values, implicit data types and default values, large data streams, and many forms of overloading were examples that bit us. In terms of usability, the SOAP::Lite serialization from Perl is (naturally) much more forgiving about syntax than the same serialization implemented in C++.

It is not that we have found any particular client to be more mature or correct than the others; it's just that it is impossible to debug an interoperability standard in a single language, and as we move forward from "Hello World," the bugs get more difficult and more subtle. We are excited about the forthcoming version 2.0 of the WSDL specification, since it clarifies a number of these issues. It also explicitly addresses service discovery.


Interoperability Benefits 
Public APIs like those used to implement Web services can help IT purchasers avoid being locked into a single vendor. APIs published via WSDL and SOAP will remain accessible in the face of flux in software vendors and versions. They are also amenable to integration with the similarly enabled products of other vendors. Web services can, at some level, be thought of as a "plug-and-play" protocol. The big win comes when an organization knows that most or all of its services can work together smoothly and can be combined in novel and unexpected ways. This also allows each component to be tuned and debugged independent of the others, which is always a plus for IT.

Some vendors of client and/or server software may perceive Web services as a threat to their existing client base. This is shortsighted: When clients and servers communicate through open protocols with free technology, developers can focus on providing powerful services and compelling interfaces rather than on constantly tweaking a custom API.


Workflows, Workgroups, and Taverna 
Web services can be used to enable workflows because the atomic units of functionality from various systems are published in a generic, machine-readable, network-accessible format. The best workflow editors will be Web services clients. They will dynamically read interface specifications (including WSDL) to discover services, and will present them to the user in an easy-to-use format. The less static, client-side information a workflow editor has to keep, the better.

We have had success with Taverna, a free, open-source client used in the MyGrid project. While Taverna is still a little rough around the edges, it was inspiring to open up the Taverna application, point it to our WSDL document, and see glyphs appear for the various services that we had published. As the services evolved, a "reload" on the client side was all that was needed to discover new functionality and have it available to the user.
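The discovery step described here, in which a client reads a WSDL document and lists the operations it finds, might look roughly like the sketch below. It is not how Taverna is implemented. The WSDL text is a hypothetical, heavily trimmed fragment (a complete WSDL 1.1 document would also declare messages, bindings, and a service endpoint), and the service and operation names are invented for illustration.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"  # WSDL 1.1 namespace

# Hypothetical, trimmed WSDL fragment: only the parts needed to list operations.
EXAMPLE_WSDL = """
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             name="SequenceTools"
             targetNamespace="urn:example:sequence-tools">
  <portType name="SequenceToolsPort">
    <operation name="reverseComplement">
      <documentation>Reverse-complement a DNA sequence.</documentation>
    </operation>
    <operation name="translate">
      <documentation>Translate DNA to protein.</documentation>
    </operation>
  </portType>
</definitions>
"""


def list_operations(wsdl_text):
    """Return (portType name, operation name, documentation) for each operation."""
    root = ET.fromstring(wsdl_text.strip())
    operations = []
    for port_type in root.findall(f"{{{WSDL_NS}}}portType"):
        for op in port_type.findall(f"{{{WSDL_NS}}}operation"):
            doc = op.findtext(f"{{{WSDL_NS}}}documentation", default="")
            operations.append((port_type.get("name"), op.get("name"), doc.strip()))
    return operations


for port, name, doc in list_operations(EXAMPLE_WSDL):
    print(f"{port}.{name}: {doc}")
```

Because the list is rebuilt from the WSDL text each time, re-reading the document after the server changes is enough to pick up new operations, which mirrors the "reload" behavior described above.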

Beyond the level of a single organization, groups can use Web services technology to share resources with the world and enable processes that might literally span the globe. Several major bioinformatics groups are already providing Web services interfaces to their tools and data resources, including KEGG, EBI, and the SeqHound and BioMOBY projects. Of course, a performance penalty is associated with running one step of a process in, say, Japan and the next in the United Kingdom. However, given that the alternative is installing and maintaining every single one of the software packages locally, the network delay doesn't seem too high a cost.

As with grid computing, Web services have been the victim of marketing hyperbole. Under the hood, though, there is a rapidly maturing technology with immediate benefits to both developers and users of scientific software.

Chris Dwan is a senior consultant with The BioTeam. E-mail: cdwan@bioteam.net. 



