Notes from the Lab: Multicore and More



Feb 15, 2006 | The first column we wrote in this space two years ago (see Lies, Liars, and Benchmarking, Jan. 2004 Bio•IT World, page 20) dealt with benchmarking. The caveats spelled out in that article are still firmly in effect:

• Benchmarks must be representative of the actual intended use of the compute system

• Manufacturer benchmarks tend to highlight the positive and omit the negative

• Generic measures of “computing power” have only a passing relationship to the real-world needs of life scientists

The BioTeam recently had the opportunity to perform a set of benchmarks on one of the new quad-chip, dual-core Xeon systems from Intel. Our complete analysis is available online at bioteam.net/intel_benchmarks.

The most interesting thing to me about multicore systems is that they represent a significant push back in the ever-present debate between single system images and compute farms. For the past several years, the most cost-effective way to build a personal compute server at the 8- or 16-CPU level was to build a small cluster out of dual-CPU systems. The Intel system that we evaluated contained four physical chips, each with two cores, which made it an 8-CPU system before even counting any virtual CPUs. In terms of performance, administrative overhead, and cost per CPU, a single system is clearly a win over a cluster. The only major downsides are that a single machine is a single point of failure and that adding a ninth CPU is expensive.

All of the major chip manufacturers have gone multicore, and all the roadmaps I’ve seen clearly call for quad core and beyond in coming years. This means that in pretty short order we will be seeing 16 and 32 cores in a single chassis. This will simplify the hardware aspects of parallel computing and swing the pendulum back in favor of developers who exploit both message passing and thread-based parallelism in their code.

Naturally, one application of these highly multicore systems will be to build clusters with ever-increasing CPU count. On the other hand, the market for a “personal” supercomputer exists at approximately the 4 to 16 CPU level, regardless of the exact configuration that gets the user there. Underlying this is a comforting reality: The same interface can be used to manage workflows on an SMP machine as is used on a larger cluster.

Web Services
We’ve been talking about Web services interfaces to cluster tools for over a year now, and the second tier of services is finally starting to emerge. I recently learned that the University of Minnesota’s Center for Computational Genomics and Bioinformatics is using the Web services interface on its cluster to build “semantic services” that add information from the BioMOBY and Gene Ontology projects to raw cluster computations. In addition, the center is publishing services integrated with the legume genome annotation databases that it maintains. It recently demonstrated this technology at the Plant and Animal Genome conference. Tying systems together at well-defined, standard interfaces makes it possible (though still far from simple) to build a truly integrated computational universe for genomic information.

[Photo: Sony PSP] HAIL HANDHELDS: The Sony PSP makes a "great mobile monitoring platform for IT staff," says BioTeam's Chris Dagdigian.

Getting the Most from Grid Engine
Last year, Chris Dagdigian used this space to talk about his open-source “xml-qstat” tool, which transforms raw Grid Engine XML status data into a variety of publishable forms including Web pages and syndicated XML feeds (RSS). (See Adventures in XML Transformation, July 2005 Bio•IT World, page 40.) Since then, xml-qstat has been rewritten from the ground up and now plugs directly into the Apache Cocoon XML publishing framework. New features include sensible XML data caching to avoid stressing the Grid Engine subsystem, Atom 1.0-compliant XML syndication feeds, an XSL-template-driven documentation framework, and even automatic detection of, and special mobile device output for, Sony PlayStation Portable (PSP) systems.

Choosing to support the Sony handheld gaming device was not a joke or a design afterthought. According to Chris Dagdigian, “The Sony PSP has a large high-quality color display, built-in wireless networking, and a Web browser that supports almost all of the XHTML and CSS1/CSS2 Web publishing standards. It makes a great mobile monitoring platform for IT staff who need to keep a constant eye on grid and cluster status information.”

In addition to continuing development of xml-qstat, Chris also recently launched a new Web site and community wiki for Grid Engine users that can be found at http://gridengine.info. The new site aggregates and consolidates links, documentation, resources, and HOWTOs previously buried deep within mailing list archives and other hard-to-find locations.
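To make the xml-qstat idea concrete, here is a minimal sketch of the same pattern: cache the status XML so the scheduler is not queried on every page view, then transform it into something readable. This is not how xml-qstat itself is built (it uses XSL templates inside Apache Cocoon); the sketch swaps in plain Python for illustration. The sample XML is hand-written to resemble Grid Engine's `qstat -xml` output and carries far fewer fields than the real thing, and the names `CachedStatus`, `summarize`, and `render_text` are hypothetical.

```python
import time
import xml.etree.ElementTree as ET
from collections import Counter

# Hand-written sample that approximates `qstat -xml` output (an assumption;
# real Grid Engine output contains many more elements per job).
SAMPLE_QSTAT_XML = """<job_info>
  <queue_info>
    <job_list state="running">
      <JB_job_number>101</JB_job_number>
      <JB_name>blastall</JB_name>
      <JB_owner>alice</JB_owner>
      <slots>4</slots>
    </job_list>
    <job_list state="running">
      <JB_job_number>102</JB_job_number>
      <JB_name>hmmsearch</JB_name>
      <JB_owner>bob</JB_owner>
      <slots>1</slots>
    </job_list>
  </queue_info>
  <job_info>
    <job_list state="pending">
      <JB_job_number>103</JB_job_number>
      <JB_name>clustalw</JB_name>
      <JB_owner>alice</JB_owner>
      <slots>2</slots>
    </job_list>
  </job_info>
</job_info>"""


class CachedStatus:
    """Call `fetch` at most once per `ttl` seconds; otherwise reuse the last result."""

    def __init__(self, fetch, ttl=30.0, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._value = None
        self._stamp = None

    def get(self):
        now = self._clock()
        if self._stamp is None or now - self._stamp >= self._ttl:
            self._value = self._fetch()  # the only place the scheduler is queried
            self._stamp = now
        return self._value


def summarize(xml_text):
    """Flatten every <job_list> element into a plain dictionary."""
    root = ET.fromstring(xml_text)
    jobs = []
    for job in root.iter("job_list"):
        jobs.append({
            "id": job.findtext("JB_job_number"),
            "name": job.findtext("JB_name"),
            "owner": job.findtext("JB_owner"),
            "state": job.get("state"),
            "slots": int(job.findtext("slots", default="1")),
        })
    return jobs


def render_text(jobs):
    """Render a compact plain-text report suitable for a small screen."""
    counts = Counter(j["state"] for j in jobs)
    lines = [f"{counts.get('running', 0)} running, {counts.get('pending', 0)} pending"]
    for j in jobs:
        lines.append(f"{j['id']:>6}  {j['state']:<8} {j['owner']:<8} {j['name']}")
    return "\n".join(lines)


if __name__ == "__main__":
    # In a real deployment, fetch would run the scheduler's status command;
    # here it simply returns the sample document.
    cache = CachedStatus(lambda: SAMPLE_QSTAT_XML, ttl=30.0)
    print(render_text(summarize(cache.get())))
```

The cache takes its clock as a parameter so that expiry can be exercised without waiting, and the renderer is kept separate from the parser so that a Web page or feed generator could consume the same job list.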

Coming Soon: Server Virtualization Bakeoff
One of the newest arrivals in our hardware lab is a fully loaded Rackable Systems 3118 Storage Server, which we plan to use as a platform for evaluating server and OS virtualization products including Xen, VMware, and Microsoft Virtual Server. Server virtualization is becoming more and more popular for a number of use cases including server consolidation, software development, QA testing, and training applications. The Rackable 3118 is well suited for testing virtualization products: its two multicore Opteron CPUs hit the pricing/licensing sweet spot for the commercial products, and its sixteen 250GB disk drives and pair of 3Ware SATA controllers will allow each virtual server to access a dedicated block-level storage volume. Expect a full column covering the results of our virtualization trials in the future.
