High-Speed Computing: Clusters, FLOPS, Creativity


By Salvatore Salamone

December 15, 2003 | PHOENIX – One of the biggest challenges life scientists face is mining useful information out of mountains of raw data.

It takes more than raw computing power. That was the message delivered by one supercomputer user at the festival of raw computing power, SC2003.*

While many organizations are using teraFLOPS to process terabytes of data, true insight requires creativity, said Donna Cox of the National Center for Supercomputing Applications (NCSA) in her keynote speech here. Cox, a senior research scientist and associate director of experimental technologies, said the real challenge is representing data in such a way as to make it useful for discovery.

Things were simpler when computing “was a small localized group activity,” Cox said. Now, with multidisciplinary groups, dispersed internationally, collaborating virtually, new approaches are needed to visualize large amounts of incongruous data. One of the keys, she said, is creating visual metaphors that convey characteristics or concepts.

She used as an example an advertisement for a German beer. In the ad, a bottle of beer is shown in a champagne ice bucket. “[With] the beer engulfed in a champagne bucket, qualities of champagne are associated with the beer,” Cox said. “We don’t see the champagne, but the attributes are mapped onto the beer.”

500 Approaches
Still, researchers love their floating-point operations. One of the highlights of SC2003 was the latest list of the world’s top 500 supercomputers. Several innovative systems tailored for life sciences made the grade this time. Roaring in at number 3: Virginia Tech’s Terascale Computing Facility, a homebuilt supercomputing cluster composed of 1,100 Apple G5 computers, where each node has dual 2GHz 64-bit PowerPC processors, 4 GB of memory, and 160 GB of storage. The Virginia Tech system, just announced last month, has a peak measured performance of 10.28 teraFLOPS (10.28 trillion floating-point operations per second).

Virginia Tech will use the cluster to support work in computational chemistry, molecular statistics, and molecular modeling of proteins. The Virginia Tech system is only the third system ever to be benchmarked with a peak performance that exceeds 10 teraFLOPS.

The other notable life science system to make the list for the first time is IBM’s Blue Gene/L prototype. The system is ranked 73rd on the list with an official measured peak performance of 1.435 teraFLOPS. IBM’s Blue Gene program is devoted to developing new hardware and new protein folding algorithms.

A number of university supercomputer systems, which will be dedicated to scientific research, also made the list for the first time. Among the new university entrants in the top 100 are the Chinese Academy of Science, the Korea Institute of Science and Technology, and the University of Liverpool.

Adoption of clusters continues to increase in high-performance computing environments. Seven of the top 10 computers on the new Top500 list are clusters. On last November’s Top500 list, there were only two. All told, 208 cluster systems made the most recent list. For sheer performance, the combined processing power of all 500 supercomputers on the list is 528 teraFLOPS. Six months ago, when the previous list was released, the combined total was 375 teraFLOPS.
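To put those figures in proportion, here is a minimal sketch in Python that works out the share of cluster systems on the list and the six-month growth in combined performance. The input numbers are the ones quoted above. The function and variable names are illustrative only and do not come from the Top500 organizers.

```python
# Figures quoted in the article for the latest Top500 list.
TOTAL_SYSTEMS = 500
CLUSTER_SYSTEMS = 208
TOTAL_TFLOPS_NOW = 528.0       # combined performance, current list
TOTAL_TFLOPS_PREVIOUS = 375.0  # combined performance, six months earlier


def percent_share(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def percent_growth(old: float, new: float) -> float:
    """Return the percentage increase from old to new."""
    return 100.0 * (new - old) / old


cluster_share = percent_share(CLUSTER_SYSTEMS, TOTAL_SYSTEMS)
six_month_growth = percent_growth(TOTAL_TFLOPS_PREVIOUS, TOTAL_TFLOPS_NOW)

print(f"Clusters on the list: {cluster_share:.1f}%")
print(f"Growth in combined performance: {six_month_growth:.1f}%")
```

By this arithmetic, clusters account for roughly two-fifths of the list, and combined performance grew by about 40 percent in six months.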

SC2003 had the slick booths associated with most trade shows, but attendees could be seen in intense discussions with vendors about high-performance computing issues. The conference network, SCinet, required installation of 55 miles of fiber, supporting a 40 Gbps backbone. Next year, organizers hope to undertake a new initiative called StorCloud that would provide bandwidth of 1 terabyte per second and demonstrate innovative management and allocation technologies.
--------------------------- 

* SC2003, Nov. 15-21, Phoenix


