HPC Harvard’s Way


By Allison Proffitt, Digital HealthCare & Productivity staff

October 14, 2008 | High performance computing (HPC) at Harvard Medical School isn’t just about IT; it’s about bridging clinical and research needs to support research, applications, and storage across the enterprise. John Halamka, Chief Information Officer at Beth Israel-Deaconess Medical Center, painted a picture of collaboration and growth around HPC that he believes will be transformative for Harvard Medical School.

Halamka gave the second morning keynote address last week at the Harvard Biomedical HPC Leadership Summit 2008, October 6-7, at Harvard Medical School. “To start with, HPC at Harvard is really a story of Field of Dreams, build it and they will come. Instead of a baseball field we built clusters and it worked really well,” Halamka said.

When faced with the challenge of supporting not only a medical center, but also a research institution, Halamka found that the historic model of decentralized storage and processing was not working. “Every lab wants to do its own thing,” he says of the past view. “Every lab is building a cluster in a closet, before you know it we’re going to end up with dozens of small data centers scattered around the [campus] and clearly that’s not a scalable, or sustainable architecture.”

The solution was a jointly built high performance computing service supporting research and the medical school. “If you build central shared research infrastructure, everybody wins,” Halamka enthuses. “You avoid building dozens of local data centers, you avoid the problem of having orphan systems that aren’t supported when people leave, but you also understand the needs of the local research.”

Since the HPC service was put together in 2004, its foundation has been a centralized infrastructure that the entire community can benefit from. Nodes of storage are bought by individual labs, but contributed to the community cluster. “And then of course we’ll use scheduling software to give you priority on those nodes, but when you’re not using them, the whole community is benefiting from those nodes,” he explains.
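The policy Halamka describes (owners get priority on the nodes they contributed, and idle nodes serve everyone) can be illustrated with a minimal Python sketch. This is not the scheduling software used at Harvard, which the article does not name; the `Node` class, the `pick_node` function, and the lab names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    owner_lab: str                     # lab that bought and contributed the node
    running_lab: Optional[str] = None  # lab whose job currently occupies the node


def pick_node(nodes: List[Node], requesting_lab: str) -> Optional[Node]:
    """Return an idle node for requesting_lab, preferring nodes that lab owns."""
    idle = [n for n in nodes if n.running_lab is None]
    # False sorts before True, so nodes owned by the requester come first.
    # Idle nodes owned by other labs remain available to the whole community.
    idle.sort(key=lambda n: n.owner_lab != requesting_lab)
    return idle[0] if idle else None


# Hypothetical cluster: lab_a contributed one node, lab_b contributed two.
nodes = [Node("n1", "lab_a"), Node("n2", "lab_b"), Node("n3", "lab_b")]

own = pick_node(nodes, "lab_b")       # lab_b is steered to a node it owns
shared = pick_node(nodes, "lab_c")    # lab_c owns nothing, so it borrows an idle node
```

A production scheduler would also need queuing or preemption so that an owner can reclaim a node someone else is using; the sketch leaves that out.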

The system works. “Over the course of the last couple of years, the school, sure, has provided some storage infrastructure, the data center, and the networks, but the nodes have been largely contributed by the community.”

In the medical school today, 500 active users are spread across all of the labs. The system houses almost 1,000 cores and 130 terabytes of network cached storage. With such buy-in, Halamka’s next big problem is, of course, storage. “Storage is one of the things that’s keeping me up at night,” he jokes. “Who would have thought a couple of years ago that the petabytes would be not enough?”

Halamka sees storage at the heart of many of the challenges for Harvard Medical School and other clinical and research facilities, especially as researchers and clinicians are generating massive amounts of data each day. “We’ve put in de-duplication compression storage devices that have a 1:20 compression ratio, so that we’re able to archive this data very effectively. And it de-dupes at the block level, and so we’ve found a really quite effective method to store, compress, and if necessary retrieve relatively rapidly.”
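Block-level de-duplication in general works by splitting data into blocks and keeping only one copy of each distinct block. The Python sketch below illustrates that idea only; it is not the vendor device Halamka refers to, and the fixed 4096-byte block size, the SHA-256 fingerprint, and the function names are all assumptions made for the example.

```python
import hashlib
from typing import Dict, List

BLOCK_SIZE = 4096  # assumed fixed block size; real devices may use other sizes


def dedupe_store(data: bytes, store: Dict[str, bytes]) -> List[str]:
    """Split data into blocks, keep each unique block once, return the block recipe."""
    recipe = []
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # a repeated block is not stored again
        recipe.append(digest)
    return recipe


def restore(recipe: List[str], store: Dict[str, bytes]) -> bytes:
    """Rebuild the original data by looking up each block in order."""
    return b"".join(store[digest] for digest in recipe)


# Hypothetical data: 19 identical blocks followed by one different block.
store: Dict[str, bytes] = {}
data = b"A" * (BLOCK_SIZE * 19) + b"B" * BLOCK_SIZE
recipe = dedupe_store(data, store)
logical_blocks = len(recipe)  # blocks the data occupies logically
stored_blocks = len(store)    # unique blocks actually kept
```

The ratio of `logical_blocks` to `stored_blocks` depends entirely on how repetitive the data is, so the figure from this toy input says nothing about the ratio quoted in the article.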

The success doesn’t stop at Harvard. Halamka says, “We’ll continue to expand the existing high performance computer cluster, trying to build it into a regional computing resource.”

 
