By Salvatore Salamone
January 15, 2003 | IBM’s Blue Gene, the protein folding supercomputer that promises to break the petaflop performance barrier, is several years away, but that doesn’t mean high-performance computing will stand still until then.
In 2003, distributed computing systems will rival supercomputers in power. Moreover, blades, grids, clusters, a new wave of commodity server buying, and, as always, more power will punctuate the year.
If a life science company is already using clusters or grids, it can expect increased performance simply owing to the normal computer replacement process that occurs in all companies.
“The performance of installed distributed systems gets better with time automatically,” says Ed Hubbard, CEO and co-founder of distributed computing software company United Devices Inc. He notes, for example, that with the 1.7 million-node public grid his company pulled together and managed last year for Oxford University’s cancer and anthrax research projects, “we saw the performance of the grid increase 50 percent in six months just from this replacement process.”
Such performance gains are reasonable to expect for corporate grids too. “We’ve found that computers in [distributed systems] turn over on average once every three years, leading to an increase in computing power,” Hubbard says. Similar performance gains can be expected with clusters.
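As a rough illustration of what Hubbard's figure would imply if it held steady, the short Python sketch below compounds the 50 percent six-month gain over a full year. The assumption that the rate stays constant is ours, not Hubbard's, and the function name is hypothetical.

```python
def compound_growth(gain_per_period: float, periods: int) -> float:
    """Return the overall speedup factor after compounding a fractional gain.

    gain_per_period: fractional gain per period (0.50 means 50 percent).
    periods: number of consecutive periods the gain is assumed to repeat.
    """
    return (1.0 + gain_per_period) ** periods


# Figure quoted in the article: 50 percent in six months.
six_month_gain = 0.50

# ASSUMPTION: the same gain repeats in the following six months.
annual_factor = compound_growth(six_month_gain, 2)

print(f"Implied one-year speedup: {annual_factor:.2f}x")
```

Under that assumption the grid would be a little more than twice as powerful after a year without any deliberate upgrade project.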
Budgetary constraints industrywide might slow the periodic replacement of computers, but that does not seem to be the case for the life sciences. A purchasing study of Bio·IT World readers found that 42 percent will buy servers, 52 percent will buy new PCs, and 29 percent will buy high-performance computers this year.
A second trend in distributed computing this year is the move toward more dedicated distributed infrastructures. For many years, life scientists have cobbled together PCs and servers into clusters and grids. Most of these systems were not dedicated to specific research projects but were used on an ad hoc basis whenever the computers were available or a particular problem needed more computational power than a single PC offered.
While the ad hoc approach to distributed computing can yield significant computing power, common problems can keep computational jobs from running. For instance, a server may be down or busy, or other researchers may tie up analytical software licenses required in a calculation.
This is one reason distributed computing infrastructures dedicated to specific applications and problems will be more common this year. Specifically, there is a trend in clustering to move to what some call production-grade bioclusters. In a talk last year at the O’Reilly Bioinformatics Technology Conference, Chris Dagdigian, co-founder of the consultancy BioTeam Inc., defined some of the attributes of such clusters. Dagdigian noted that the key properties of production systems are high reliability, availability, and security.
“Up to 1,000 CPUs can be managed by the equivalent of one full-time administrator,” Dagdigian says. To accomplish this, individual elements in a biocluster must be interchangeable and disposable. “Think of the compute elements in a cluster like light bulbs,” says BioTeam co-founder William Van Etten. “When one [server] breaks, throw it in a FedEx box, and have someone else fix it.”
Production-grade bioclusters such as the ones advocated by BioTeam use a high-end server as the host computer for submitting and managing computer jobs (as a distributed resource manager) and a separate file server for databases and applications such as BLAST that will run on the cluster. The elements of the cluster are on a private network and are used exclusively for the scientific computing applications. This architecture has the potential to increase the number of computational jobs that are run successfully by increasing the availability of clustered computers.
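The layout described above can be sketched as a toy Python model: one head node that queues and dispatches jobs, one file server, and a pool of interchangeable compute nodes. This is only an illustration of the idea, not BioTeam's software or any real distributed resource manager; every class, host, and job name here is hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ComputeNode:
    name: str
    healthy: bool = True  # a failed node is simply skipped, like a dead light bulb


@dataclass
class BioCluster:
    head_node: str    # hypothetical host that accepts and schedules jobs
    file_server: str  # hypothetical host holding databases and applications
    nodes: list = field(default_factory=list)
    queue: deque = field(default_factory=deque)

    def submit(self, job: str) -> None:
        """Add a job to the head node's queue."""
        self.queue.append(job)

    def dispatch(self) -> dict:
        """Assign queued jobs to healthy nodes, one job per node per pass."""
        assignments = {}
        for node in self.nodes:
            if not self.queue:
                break
            if node.healthy:
                assignments[node.name] = self.queue.popleft()
        return assignments


cluster = BioCluster(
    "head01", "nfs01", [ComputeNode(f"node{i:02d}") for i in range(1, 5)]
)
cluster.nodes[1].healthy = False  # node02 has failed and awaits replacement

for job in ["blast_a", "blast_b", "blast_c", "blast_d"]:
    cluster.submit(job)

first_pass = cluster.dispatch()
print(first_pass)
print("still queued:", list(cluster.queue))
```

Because the nodes are interchangeable, the failed node only reduces capacity for one pass; the remaining job stays queued until a node is free rather than failing outright.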
Many companies are realizing that distributed systems can deliver significant high-performance computing power. “You can buy a lot of CPUs running Linux for a fraction of the cost of a high-end computer such as a Sun or SGI system,” says Bruce Ling, director of bioinformatics at the biopharmaceutical company Tularik Inc. The company uses several Linux clusters to conduct research, including a 150-CPU Linux NetworX Evolocity cluster.
For the first time, two PC clusters -- neither of which is used for life sciences -- rank in the top 10 of the most powerful supercomputers in the world, according to the latest list of the top 500 supercomputers compiled by Hans Meuer of the University of Mannheim, Germany; Erich Strohmaier and Horst Simon of the U.S. Department of Energy's National Energy Research Scientific Computing Center at Lawrence Berkeley National Laboratory; and Jack Dongarra of the University of Tennessee.
This signals a new era for the volume commodity server, which costs in the low four figures. In fact, these low-cost volume servers, which can be configured into huge clusters of 1,000 or even 2,000 machines, are one of the hottest growth areas in the server market. Research firm IDC reported all server spending was up three points to 40 percent in the third quarter over the same period a year ago.
Blades at the Cutting Edge
A number of small technical advances will make clusters even more powerful. For example, new blade servers from RLX Technologies, Dell Computer Corp., IBM, and Hewlett-Packard will give life science managers an easy way to physically build a biocluster. These vendors are targeting their blade servers at the life sciences by combining optional software packages such as BLAST; Platform Computing Inc.’s job scheduling software; and software that manages the exchange of data between systems in a clustered environment, such as MPI Software Technology Inc.’s MPI/Pro.
Blade server systems are just starting to be adopted by some life science companies as the hardware core for their clusters. For instance, The Wellcome Trust Sanger Institute is now using RLX blade server clusters in conjunction with existing supercomputers to perform genomic sequencing and annotation. Phil Butcher, Sanger’s head of systems, notes that one reason for the interest in blade server clusters was the institute’s need for more computational power. “As we move into more complex computational arenas, our computer needs can only be met with flexible solutions that allow for growth,” he says. That flexibility comes from combining traditional supercomputers with easy-to-manage clusters.
Researchers at Sanger found that their blade system was better than past clustering efforts that used individual servers linked together. “The [blade cluster] had a 102 percent [performance] improvement over our 1U system,” says James Cuff, group leader of Sanger’s informatics systems group. “This means we doubled our throughput rate, allowing annotation and gene sequencing analysis to be done faster.”
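Cuff's arithmetic can be checked with a few lines of Python. The sketch assumes the 102 percent figure is an improvement in throughput relative to the 1U system, which is how his "doubled" remark reads; the function name is hypothetical.

```python
def speedup_from_improvement(percent_improvement: float) -> float:
    """Convert a percent improvement in throughput into a speedup factor."""
    return 1.0 + percent_improvement / 100.0


# ASSUMPTION: 102 percent is measured against the 1U system's throughput.
speedup = speedup_from_improvement(102.0)

# For a fixed workload, run time shrinks by the inverse of the speedup.
relative_run_time = 1.0 / speedup

print(f"Speedup: {speedup:.2f}x")
print(f"Run time for the same workload: {relative_run_time:.1%} of the original")
```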
Life science managers will have one other option this year when looking to meet their high-performance computing needs: pay-as-you-go, on-demand services. Both United Devices and IBM have recently introduced such services.