TIGR Unplugs HP, Switches to Sun for Genome Assembly



The Institute for Genomic Research (TIGR), one of the most prestigious genome research centers in the United States, has hauled out its long-serving HP Alpha servers and switched to a new architecture from Sun Microsystems.

Two months ago, TIGR staff finally removed the 15 historic HP Alpha servers from the non-profit in Rockville, Maryland. It was a “happy event,” TIGR’s director of IT, Vadim Sapiro, told Bio-IT World. “There was something really finite about that.”

The replacements—three Sun Fire servers running the 64-bit Linux OS—had been running alongside the Alpha servers for some time before the HP servers were finally switched off six months ago. These servers are used to power TIGR’s complex genomic assembler for sequence research. In addition to significant economies of size, efficiency, energy utilization and maintenance, Sapiro says the new system’s increased performance is boosting research speed and quality.

TIGR was founded by J. Craig Venter in 1992, and claimed many landmarks in its early years, including the compilation of hundreds of thousands of expressed sequence tags (ESTs) and the first microbial genome sequence in 1995. The shotgun sequencing approach pioneered in that assembly propelled Venter to launch Celera Genomics in 1998.

In the meantime, TIGR has become best known for the sequencing, assembly, and analysis of countless microbial genomes, which currently require 80 terabytes of data storage. Since 1999, that process had relied upon HP’s Alpha clusters. But as those clusters grew less reliable and economical to operate, especially in terms of services support and cooling requirements, Sapiro and colleagues sought other solutions.

Sapiro is in his second stint at TIGR, having started in 1994 as a “classic nerdy” UNIX systems administrator. Since 1999, he’s been in charge of managing the IT department of about 20 staff, supporting over 300 employees.

HP Dependency
The dependency on HP’s Alpha servers traces back to the work of Gene Myers and Granger Sutton at Celera Genomics on the Celera Assembler pipeline. Those efforts were performed on the proprietary Alpha platform. “TIGR was allowed to use the software,” says Sapiro. “Our computational needs were increasing, so we started using the software, but had to build the Alpha architecture.”

Sapiro says the HP architecture “served us well for the first three to four years, but after that, we realized that the HP infrastructure is very expensive to buy and operate. Plus, refreshing the technology and replacing the servers was a very expensive proposition.”

Aside from operating cost, sluggishness and reliability, another motivation to change was sharing. As a non-profit, Sapiro says TIGR’s mantra is that “everything belongs back to the public—not just the data, but distributing the tools. That [HP] proprietary platform was not an exportable tool.”

Sapiro wanted to migrate to an open-source standard. “When we were ready to take the project, we needed to identify infrastructure (64-bit chips) and have open-source. AMD Opteron was somewhat of a logical choice.” Sapiro says they looked at IBM servers, but decided against migrating from one proprietary platform to another.

Given the specs—32 GB RAM servers with AMD architecture—Sapiro says that Sun was a logical choice, especially given its relationship with AMD: “We ran a proof of concept on a Sun-AMD server. We looked at other AMD-based servers, but we enjoy a very special relationship with Sun. Sun treats non-profits like they would a university campus.”

Sun’s support organization is first-rate. “Engineering support is quite good,” says Sapiro. “We needed that as we were embarking on something a little scary!” The engineering team was able to deal with numerous technical issues, including overheating CPUs, during the proof-of-concept trials, which began in 2004. Once the first set of three Sun servers was installed, the code porting was completed. “But for [the] next 18 months, we had the two systems running side by side, working out kinks and bugs,” says Sapiro.

About six months ago, the HP cluster was unplugged. “Give all the kudos to Alpha—it did serve us for six years and is responsible for many scientific breakthroughs,” says Sapiro.

Compare and Contrast
Sapiro says the three Sun servers were “orders of magnitude” cheaper to acquire, “and a lot cheaper to operate.”

Contrast TIGR’s first Alpha servers from 1999 with the new Sun system. The HP servers (model 4100), which cost more than $100,000, had 4 CPUs at 500 MHz, 5 GB RAM, and took up half a normal datacenter rack. Moreover, being a proprietary HP platform, “you needed IT people trained in the intricacies of that [Unix-based] operating system,” says Sapiro.

By contrast, each of the three new Linux-running Sun Fire servers has 4 CPUs at 2.4GHz, 32 GB RAM, and occupies three rack units in a normal datacenter rack. “Each cost less than $30,000, with three years’ maintenance included,” says Sapiro. Moreover, Sapiro conservatively estimates power savings of about 70 percent, and probably the same efficiencies for cooling. “Datacenter floor space is very expensive. Freeing up this much space is a big deal as well,” Sapiro adds.
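For readers who want the per-server comparison as numbers, the figures quoted above can be worked through in a short Python sketch. It is illustrative only: the `Server` class and `ratios` function are hypothetical names, the prices are the bounds quoted in the article ("more than $100,000" and "less than $30,000") rather than exact costs, and "half a normal datacenter rack" is assumed here to mean half of a 42U rack. Clock speed ratios across different chip architectures are not a measure of real-world performance.

```python
from dataclasses import dataclass

# Assumption: a "normal" datacenter rack is taken to be 42 rack units (42U).
RACK_UNITS_PER_RACK = 42


@dataclass(frozen=True)
class Server:
    name: str
    cpus: int
    clock_mhz: int
    ram_gb: int
    rack_units: float
    price_usd: int  # bound quoted in the article, not an exact price


# Figures as quoted in the article; the Alpha's rack_units relies on the 42U assumption.
alpha = Server("HP Alpha 4100 (1999)", 4, 500, 5, RACK_UNITS_PER_RACK / 2, 100_000)
sun = Server("Sun Fire (AMD Opteron)", 4, 2400, 32, 3, 30_000)


def ratios(old: Server, new: Server) -> dict:
    """Return simple per-server ratios between the old and new machines."""
    return {
        "clock_speed_ratio": new.clock_mhz / old.clock_mhz,
        "ram_ratio": new.ram_gb / old.ram_gb,
        "rack_space_ratio": old.rack_units / new.rack_units,
        # A lower bound: the old price is "more than", the new price "less than".
        "min_price_ratio": old.price_usd / new.price_usd,
    }


if __name__ == "__main__":
    for label, value in ratios(alpha, sun).items():
        print(f"{label}: {value:.1f}x")
```

The price ratio is a floor rather than an estimate, because the article gives only a lower bound for the Alpha and an upper bound for the Sun server.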

Since the initial Sun installation, TIGR has added two more Sun servers to the internal grid. “This architecture lends itself to being expanded,” says Sapiro. “As our computational demands grow, there’s no mystery or huge cost problem to resolve it. I just buy another AMD server from Sun, put it in the grid, and it’s up and running in less than a day.”

Sapiro says that TIGR scientists have noticed the improved performance. “Aside from vastly improved reliability, the ability to turn out a much greater number of genomes has been increased. By combining high-performance and high-throughput computing into one infrastructure, you can run more assemblies, and we’re able to dig deeper and provide better quality data,” he says.
