By Salvatore Salamone
May 7, 2002 | Buoyed by recent successful demonstrations of distributed computing’s power to tackle life science problems (March 2002 Bio-IT World, p. 12), life science companies are taking a closer look at the technology for their in silico research.
After all, distributed computing can pool the idle computing power of desktop computers and corporate servers to produce the equivalent processing power of supercomputers.
But life science managers should note: Introducing distributed computing systems into a corporation requires special considerations.
Simply aggregating idle computer processing power, as such systems have done for several years, is no longer enough. Managers now need the ability to monitor jobs in progress, allocate resources, and schedule jobs based on the company’s business priorities.
To that end, distributed computing software vendors AVAKI Corp., Blackstone Computing Inc., Entropia Inc., Platform Computing Inc., and United Devices Inc. have enhanced the management capabilities of their distributed computing software platforms to appeal to the corporate user.
For example, Platform recently introduced software that manages application licenses, monitors applications, and automates management through a manager-defined, rules-based system. And United Devices rolled out a Web-based interface for managing applications and agents, as well as a software development kit (SDK) to help port existing applications to its distributed computing platform.
The main purpose of such recent enhancements is to make distributed computing systems more manageable and efficient, which has not necessarily been the case in the past.
For example, Oxford University’s recently completed anthrax cure research project, which used volunteers’ Internet-connected computers to screen more than 3.5 billion molecules, had a 5x redundancy rate to ensure a high level of accuracy and quality.
Such multiple submissions of jobs to a distributed system are necessary with large-scale, Internet-based life science projects because of the uncertainties involved, such as users quitting the project or computers not connecting to the Internet after completing a simulation.
But multiple submissions are “not efficient on a corporate network,” says Martin Stuart, vice president of life sciences at Entropia. “In the corporate environment, you need to be able to intelligently manage and schedule jobs.”
For instance, one negative consequence of resending jobs is high bandwidth consumption, which was a concern for Novartis Pharmaceuticals when it started its distributed computing project.
Novartis installed United Devices’ MetaProcessor platform to accelerate its in silico research projects. To evaluate the system’s performance and network load, the company deployed it on 620 computers and ran HMMER, a hidden Markov model application, around the clock for a week.
“With this distributed system of 620 computers we aggregated the equivalent of 3.18 years of processing time in the first seven days,” says Manuel Peitsch, Novartis’ head of information and knowledge management. “The bandwidth usage increased only a maximum of 2.4 percent.” And there were no performance problems perceived by workers in the company, he says.
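As a rough check on those figures, the sketch below estimates what share of the 620 machines’ total time was actually harvested during the week. It assumes that “processing time” means CPU time summed across all machines and that a year is 365 days. Neither assumption comes from Novartis.

```python
# Back-of-the-envelope check of the Novartis pilot figures quoted above.
# Assumptions (not from the article): a year is 365 days, and "processing
# time" means CPU time summed across all participating machines.

DAYS_PER_YEAR = 365


def harvested_fraction(machines: int, days: float, harvested_years: float) -> float:
    """Share of total available machine time that was actually used."""
    available_machine_days = machines * days
    harvested_machine_days = harvested_years * DAYS_PER_YEAR
    return harvested_machine_days / available_machine_days


fraction = harvested_fraction(machines=620, days=7, harvested_years=3.18)
print(f"Harvested share of machine time: {fraction:.1%}")
```

Under these assumptions the pilot captured roughly a quarter of the machines’ total time, which is consistent with a system that runs only on idle cycles.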
One challenge in making distributed systems more efficient and minimizing network load is ensuring that jobs run the first time they are submitted, eliminating the need to resubmit a failed job.
But ensuring first-time execution goes well beyond the scope of most distributed computing systems today. Better management tools, such as those that manage application licenses, are needed. “Many times in a distributed environment a life science software failure is not really a failure,” says Yury Rozenman, director of life sciences business development at Platform. “Running a job often becomes an issue of license availability.”
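One way to picture the license problem Rozenman describes: a scheduler that checks for a free license before dispatching treats a shortage as a reason to wait, not as a failure. The sketch below is purely illustrative. `LicensePool`, `dispatch`, the application name, and the seat count are all hypothetical and do not represent Platform’s product.

```python
from collections import deque


class LicensePool:
    """Tracks free license seats per application (hypothetical)."""

    def __init__(self, seats):
        self.free = dict(seats)

    def acquire(self, app):
        # Take a seat if one is free; report whether that succeeded.
        if self.free.get(app, 0) > 0:
            self.free[app] -= 1
            return True
        return False

    def release(self, app):
        self.free[app] = self.free.get(app, 0) + 1


def dispatch(queue, pool):
    """Start every queued job that can get a license; keep the rest queued."""
    started = []
    waiting = deque()
    while queue:
        job_id, app = queue.popleft()
        if pool.acquire(app):
            started.append(job_id)
        else:
            waiting.append((job_id, app))  # held, not failed
    queue.extend(waiting)
    return started


pool = LicensePool({"docking_app": 2})
queue = deque([("job-1", "docking_app"),
               ("job-2", "docking_app"),
               ("job-3", "docking_app")])
first = dispatch(queue, pool)    # two seats, so two jobs start
pool.release("docking_app")      # one running job finishes
second = dispatch(queue, pool)   # the held job now starts
print(first, second)
```

The third job is never resubmitted. It stays in the queue until a seat frees up, so no extra copy of the job crosses the network.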
Moving Up the Ladder
Job execution management is just one step toward a more efficient distributed computing system. The bigger trend is bringing distributed computing systems in line with business priorities.
Managers need a way to prioritize jobs based on business needs and evaluate available resources. “Today you see scientists spending more time doing computer activities,” says Ted Bardasz, vice president of R&D at Blackstone. “They’re asking questions like, ‘Which server can I run this job on?’”
This is not a good use of a scientist’s time. “We need to get this out of the life scientists’ hands,” says Bardasz.
Managers need to know simple things like the minimum amount of memory and processing power required to run a job. But they also need other information. “You need to build a history of individual clients’ usage,” says Entropia’s Stuart. “Is it on 24 hours a day, or just from 9 to 5?”
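The matching described above can be sketched in a few lines: take jobs in order of business priority and give each one a client that meets its minimum memory and processor requirements and whose usage history shows it is normally on for the whole run. All names, fields, and numbers here are hypothetical illustrations, not any vendor’s interface.

```python
from dataclasses import dataclass


@dataclass
class Client:
    name: str
    memory_mb: int
    cpu_mhz: int
    on_hours: range   # hours of the day the machine is usually on (from usage history)


@dataclass
class Job:
    name: str
    priority: int     # lower number = more important to the business
    min_memory_mb: int
    min_cpu_mhz: int
    start_hour: int
    duration_hours: int


def eligible(job, client):
    """True if the client meets the job's minimums and stays on for its whole run."""
    hours_needed = range(job.start_hour, job.start_hour + job.duration_hours)
    return (client.memory_mb >= job.min_memory_mb
            and client.cpu_mhz >= job.min_cpu_mhz
            and all((h % 24) in client.on_hours for h in hours_needed))


def schedule(jobs, clients):
    """Assign each job, most important first, to the first free eligible client."""
    free = list(clients)
    plan = {}
    for job in sorted(jobs, key=lambda j: j.priority):
        for client in free:
            if eligible(job, client):
                plan[job.name] = client.name
                free.remove(client)
                break
    return plan


clients = [Client("desk-a", 256, 800, range(9, 17)),     # on from 9 to 5
           Client("server-b", 1024, 1000, range(0, 24))]  # on 24 hours a day
jobs = [Job("overnight-screen", 2, 512, 900, start_hour=20, duration_hours=8),
        Job("urgent-search", 1, 128, 500, start_hour=10, duration_hours=2)]
plan = schedule(jobs, clients)
print(plan)
```

In this example the short, high-priority job lands on the 9-to-5 desktop, leaving the always-on server free for the long overnight run. A scientist never has to ask which machine to use.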
Several of the vendors have recently introduced tools in this area. For instance, Blackstone’s PowerCloud software lets a manager view the status of all computers within a distributed computing system.
Such capabilities are helping move distributed and grid computing systems beyond their role as simple CPU aggregators to their next evolutionary state. “[Grid computers] will become almost like a subscription system,” says Michael Capellas, chairman and CEO of Compaq. “They will become like a utility.”