
Behind the knoSYS 100: Building the Genome Supercomputer

By Michael Fein 
July 22, 2014 | In the fall of 2012, when Knome launched the knoSYS 100 “genome supercomputer,” the company envisioned that every next-generation sequencing platform would one day sit alongside an analysis and interpretation engine (see "Knome Launches knoSYS 100 Genome Supercomputer to Enhance Interpretation"). 
Knome's then-president and CEO Martin Tolar called the launch “an evolution of our thinking.” While the larger genomics research organizations have dedicated teams and datacenters to handle genome data, for the majority of Knome’s clients, Tolar said, “you really want to have integrated hardware and software systems.”   
To build a supercomputer meant to sit in the lab next to the sequencer (at least in theory), Knome enlisted the help of Silicon Mechanics. 
Knome provides human genome interpretation systems and services, specializing in reporting variations in a person’s genome as compared to the standard human reference genome, providing annotation from a large collection of reference data. 
In a clinical setting, processing and comparing large numbers of genomes is not as important as speed, confidentiality, data protection, and the ability to organize and control revisions. Genome information is confidential data, and clinics are concerned about the risk of exposing private consumer data. While some turn to cloud services like Amazon Web Services (AWS), others want to keep their sensitive patient data behind their firewall.
Ultimately, privacy, security, version control, and transfer speed drove Knome to develop an in-house approach to meet the demands of the newly emerging clinical market. They decided to develop an end-to-end system that would enable a lab to effectively handle the computational and interpretation requirements of next-generation sequencing-based tests within its own facility. 
Once they made that decision, they then had to decide which hardware would be needed to perform all the computational tasks that AWS had previously performed for them in the cloud. Figuring out what hardware would be required to optimize the intensive computational tasks required by whole genome-level informatics was no easy task. The company’s software package, knoSYS, is not a single application—rather, it can be likened to an “ecosystem” of about 50 applications all running together.
To make all of that work seamlessly, system development required a thorough understanding of power consumption, temperature, noise, and application performance requirements in order to pick the best balance of price and performance from the 20 available processor options.
The hardware ultimately selected is a high-performance grid computing system that integrates eight servers in a rack—all optimized to support 10 or more simultaneous users running Knome’s interpretation software and informatics engine. The use of industry-standard components, including the Intel Xeon Processor E5-2600 product family, reduces upfront capital and long-term support costs compared to other commonly available hardware solutions. The appliance includes a high-performance computing cluster with four nodes, each with two 8-core/16-thread, 2.4 GHz, 64-bit Intel Xeon Processors E5-2665 with 20MB cache. The same chips were used in the 1-node database server included in the knoSYS 100. 
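As a rough illustration of what those compute-cluster figures add up to, the short Python sketch below multiplies out the per-node numbers quoted above (four nodes, two processors per node, 8 cores and 16 threads per processor). The names `NodeSpec` and `cluster_totals` are hypothetical helpers introduced only for this example, and the calculation covers the four-node compute cluster alone, since the article does not state how many processors the database server carries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Per-node processor layout (hypothetical helper, not part of knoSYS)."""
    sockets: int             # processors per node
    cores_per_socket: int    # physical cores per processor
    threads_per_socket: int  # hardware threads per processor


def cluster_totals(node: NodeSpec, node_count: int) -> tuple[int, int]:
    """Return (total cores, total threads) across node_count identical nodes."""
    cores = node.sockets * node.cores_per_socket * node_count
    threads = node.sockets * node.threads_per_socket * node_count
    return cores, threads


# Figures taken from the article: two 8-core/16-thread E5-2665 chips per node.
compute_node = NodeSpec(sockets=2, cores_per_socket=8, threads_per_socket=16)

# Four compute nodes, as described for the cluster.
total_cores, total_threads = cluster_totals(compute_node, node_count=4)
print(f"Compute cluster: {total_cores} cores, {total_threads} threads")
```

Under those assumptions the compute cluster works out to 64 physical cores and 128 hardware threads, which gives a sense of the headroom available to the 10 or more simultaneous users the system is sized for.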
The Intel Xeon Processor E5-2600 product family gave Knome performance gains of up to 80 percent over a previous-generation Intel Xeon processor-based server, and its Intel Integrated I/O provides the reduced I/O latency Knome required. In addition, the processors are upgradable to the next-generation Intel Xeon Processor E5-2600 v2 product family. 
Storage is provided by a ClusterStor appliance, a commercial version of the Lustre storage system developed by Seagate’s Xyratex division. ClusterStor Lustre storage was used because its I/O rate is matched to the systems’ intensive computational capacity and it provides a simple way to integrate storage with the existing appliance. Each application running within the ecosystem has its own characteristics that must peacefully coexist with the others—some need memory, some need fast disk interaction, and some just need a lot of CPU. Lustre offers sufficient I/O support to manage this wide variety of usage scenarios.
Silicon Mechanics provides the assembly and integration for the solution, starting with assembling the component servers into a rack and then installing Xyratex ClusterStor. Once all of the servers are tested, the middleware is installed, and the power, networking, switching, and cabling are integrated, Knome installs the suite of software tools. 
The customer simply unpacks one box and plugs in their “genome supercomputer.”
Michael Fein is the Director of Sales for Silicon Mechanics.  
