HPC Harvard’s Way


By Allison Proffitt, Digital HealthCare & Productivity staff

October 14, 2008 | High performance computing (HPC) at Harvard Medical School isn’t just about IT. It’s about bridging clinical and research needs to support research, applications, and storage across the enterprise. John Halamka, Chief Information Officer at Beth Israel-Deaconess Medical Center, painted a picture of collaboration and growth around HPC that he believes will be transformative for Harvard Medical School.

Halamka gave the second morning keynote address last week at the Harvard Biomedical HPC Leadership Summit 2008, October 6-7, at Harvard Medical School. “To start with, HPC at Harvard is really a story of Field of Dreams, build it and they will come. Instead of a baseball field we built clusters and it worked really well,” Halamka said.

When faced with the challenge of supporting not only a medical center but also a research institution, Halamka found that the historic model of decentralized storage and processing was not working. “Every lab wants to do its own thing,” he says of the past view. “Every lab is building a cluster in a closet. Before you know it, we’re going to end up with dozens of small data centers scattered around the [campus], and clearly that’s not a scalable or sustainable architecture.”

The solution was a jointly built high performance computing service supporting research and the medical school. “If you build central shared research infrastructure, everybody wins,” Halamka enthuses. “You avoid building dozens of local data centers, you avoid the problem of having orphan systems that aren’t supported when people leave, but you also understand the needs of the local research.”

Since the HPC service was put together in 2004, its foundation has been a centralized infrastructure that the entire community can benefit from. Nodes are bought by individual labs but contributed to the community cluster. “And then of course we’ll use scheduling software to give you priority on those nodes, but when you’re not using them, the whole community is benefiting from those nodes,” he explains.
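The owner-priority arrangement Halamka describes can be illustrated with a minimal sketch. The article does not name the scheduling software or its policy details, so everything here (the `Node` class, the `assign_node` function, and the three-step policy) is a hypothetical illustration of the general idea, not a description of Harvard's actual configuration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    owner_lab: str                 # lab that bought and contributed the node
    running: Optional[str] = None  # lab whose job currently occupies the node


def assign_node(nodes: List[Node], requesting_lab: str) -> Optional[Node]:
    """Pick a node for a job from requesting_lab, or None if none is available.

    Hypothetical policy: owners get priority on their own nodes,
    and idle nodes are shared with the whole community.
    """
    # 1. Prefer an idle node that the requesting lab owns.
    for node in nodes:
        if node.owner_lab == requesting_lab and node.running is None:
            node.running = requesting_lab
            return node
    # 2. Owner priority: reclaim an owned node that a guest job is borrowing.
    #    (A real scheduler would requeue or suspend the guest job.)
    for node in nodes:
        if node.owner_lab == requesting_lab and node.running != requesting_lab:
            node.running = requesting_lab
            return node
    # 3. Otherwise borrow any idle node contributed by another lab.
    for node in nodes:
        if node.running is None:
            node.running = requesting_lab
            return node
    return None
```

Under this assumed policy, a lab that contributes nothing can still run jobs whenever contributed nodes sit idle, which is the community benefit described above.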

The system works. “Over the course of the last couple of years, the school, sure, has provided some storage infrastructure, the data center, and the networks, but the nodes have been largely contributed by the community.”

In the medical school today, 500 active users are spread across all of the labs. The system houses almost 1,000 cores and 130 terabytes of network cached storage. With such buy-in, Halamka’s next big problem is, of course, storage. “Storage is one of the things that’s keeping me up at night,” he jokes. “Who would have thought a couple of years ago that the petabytes would be not enough?”

Halamka sees storage at the heart of many of the challenges for Harvard Medical School and other clinical and research facilities, especially as researchers and clinicians are generating massive amounts of data each day. “We’ve put in de-duplication compression storage devices that have a 1:20 compression ratio, so that we’re able to archive this data very effectively. And it de-dupes at the block level, and so we’ve found a really quite effective method to store, compress, and if necessary retrieve relatively rapidly.”
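As a rough illustration of what block-level de-duplication with compression means, the sketch below stores each unique block once, keyed by its hash, and compresses it. The article does not identify the storage devices or how they work internally, so the fixed 4 KB block size, the use of SHA-256 and zlib, and the `DedupStore` class are all assumptions made for this example.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed fixed block size; real devices may differ


class DedupStore:
    """Toy block-level de-duplicating, compressing store (illustrative only)."""

    def __init__(self):
        self.blocks = {}  # SHA-256 digest -> compressed block bytes
        self.files = {}   # file name -> ordered list of block digests

    def put(self, name: str, data: bytes) -> None:
        digests = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            # Store and compress a block only the first time it is seen.
            if digest not in self.blocks:
                self.blocks[digest] = zlib.compress(block)
            digests.append(digest)
        self.files[name] = digests

    def get(self, name: str) -> bytes:
        # Rebuild the file by decompressing its blocks in order.
        return b"".join(zlib.decompress(self.blocks[d]) for d in self.files[name])

    def stored_bytes(self) -> int:
        # Physical bytes held, counting each unique block once.
        return sum(len(b) for b in self.blocks.values())
```

Archiving two identical data sets with `put` adds no new blocks the second time, and `get` returns the original bytes, which mirrors the store, compress, and retrieve cycle Halamka mentions.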

The success doesn’t stop at Harvard. Halamka says, “We’ll continue to expand the existing high performance computer cluster, trying to build it into a regional computing resource.”

 


For reprints and/or copyright permission, please contact The YGS Group, 1808 Colonial Village Lane, Lancaster, PA;

(717) 399-1900 ext. 125, or via email to Ashley.Zander@theYGSgroup.com.