Dec. 2006 / Jan. 2007 | Data storage remains unglamorous outside the IT realm, yet few computing requirements are more important or under greater stress. Particularly problematic is the growing flood of unstructured digital data spilling out of laboratory instruments, often consisting of images, text, and even rich media. Enter clustered storage, a fairly recent addition to the storage management landscape, which its pioneer, Isilon Systems, says offers distinct cost and performance advantages over SAN and NAS approaches.
In November, the company landed two significant life science deals, one with UCLA’s Laboratory of Neuro Imaging (LONI) to store the world’s largest repository of high-resolution neuro imaging data, and a second with the San Diego Supercomputer Center (SDSC) as part of a cellular imaging portal for cancer scientists. Other major clients include MySpace.com, with its huge data management needs, and NBC, which used Isilon storage to manage the 2004 Olympics data.
This growing market traction has other storage suppliers such as Network Appliance and EMC taking notice, adding clustered storage offerings to their product lines. What’s all the fuss about?
A New Paradigm
In general terms, clustered storage architecture uses a distributed file operating system to create a single namespace for multiple clusters of hardware, and data are striped across all of the devices. It’s a cousin to SAN, which also manages multiple remote storage resources, but not typically under a single namespace. Isilon’s core product is its IQ platform series (1920 – 5 nodes, 9.6 TB; 3000 – 5 nodes, 288 TB; and 6000 – 30 TB). They are basically “storage appliances” stuffed with inexpensive, standard drives, another advantage, says the company.
The secret sauce, says CTO and founder Sujal Patel, is Isilon’s distributed file operating system, OneFS, which, among other things, parallelizes and optimizes input/output from the hardware. In October, Isilon released OneFS 4.5, which Patel says enables Isilon to deliver systems with one petabyte of storage and 10 gigabytes per second of performance in a single file system and single volume.
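The striping and single-namespace ideas described above can be illustrated with a toy sketch. This is not how OneFS works internally, which the article does not describe; the class name `ToyCluster`, the tiny `STRIPE_SIZE`, the round-robin placement, and the example path are all assumptions chosen for illustration, and the sketch leaves out the redundancy a real system would need.

```python
from typing import Dict, List, Tuple

STRIPE_SIZE = 4  # bytes per stripe unit; deliberately tiny (illustrative assumption)


class ToyCluster:
    """Hypothetical model: several storage nodes behind one namespace."""

    def __init__(self, node_count: int) -> None:
        # Each node stores stripes keyed by (path, stripe index).
        self.nodes: List[Dict[Tuple[str, int], bytes]] = [{} for _ in range(node_count)]
        # The single namespace: one entry per file, whatever nodes hold its stripes.
        self.namespace: Dict[str, int] = {}

    def write(self, path: str, data: bytes) -> None:
        stripes = [data[i:i + STRIPE_SIZE] for i in range(0, len(data), STRIPE_SIZE)]
        for index, stripe in enumerate(stripes):
            # Round-robin placement spreads consecutive stripes over all nodes.
            self.nodes[index % len(self.nodes)][(path, index)] = stripe
        self.namespace[path] = len(stripes)

    def read(self, path: str) -> bytes:
        count = self.namespace[path]
        # Reassemble by visiting the nodes in the same round-robin order.
        return b"".join(
            self.nodes[index % len(self.nodes)][(path, index)] for index in range(count)
        )


cluster = ToyCluster(node_count=3)
cluster.write("/scans/subject-001.img", b"ABCDEFGHIJKLMNOPQRST")
print(cluster.read("/scans/subject-001.img"))
print([len(node) for node in cluster.nodes])  # stripes held by each node
```

Because every file's stripes are spread over all nodes, many readers of the same file draw on every node's drives at once rather than queuing on one device, which is the kind of concurrent-access benefit the vendor claims.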
Clustered storage, argues Patel, represents a new paradigm and Isilon has early mover advantage. Customers seem to agree.
“Performance was the key criterion for selection,” says Rico Magispoc, CTO for LONI. “We couldn’t get data fast enough from our spindles. We looked at Isilon, Network Appliance, SGI, and Sun storage systems. Scalability was also an important issue.”
Prior to adopting clustered storage, LONI relied on an array of SAN storage. LONI serves as the hub of a national neuro imaging resource that supports more than 60 national and international brain imaging collaborations. A single imaging subject can consume 200 gigabytes. Because researchers are often accessing the same datasets at the same time, data throughput was a major challenge. Managing the SAN also proved demanding and required a full-time, dedicated storage manager.
The evaluation process took roughly three months, but deployment was quite easy, says Magispoc. LONI consolidated large repositories of neuro imaging data into a single volume, speeding concurrent access for hundreds of researchers worldwide, who can retrieve, collaborate on, and analyze the 16,000-plus subject scans stored on-site at LONI’s Los Angeles laboratory. Performance has increased three-fold, says Magispoc, and the system no longer requires a dedicated administrator.
“Whereas it used to take four days to add storage to our SAN, we can now add Isilon nodes in about 10 minutes, without any system downtime,” he says. That’s a skill he may soon need. Roughly a month after the system went into service, LONI researchers had already filled 40 percent of its 18-terabyte capacity.
Interestingly, Magispoc told Isilon that LONI was interested in acquiring its own hard drives, thinking he might get better pricing, but was told that Isilon already used plain vanilla hardware and that he was unlikely to achieve better savings.
Isilon’s recent deal with SDSC is to support visualization and analytics applications used by UCSD Cancer Center scientists. They will be able to upload microscopic cellular images onto the portal, run a series of automated visualizations and models of the cellular images on SDSC’s supercomputer, and then save both the original images and the new animations directly onto an Isilon cluster.
The Cedars-Sinai Prostate Cancer Center is another Isilon user. The Center uses advanced mass spectroscopy to obtain fingerprints of all proteins in blood, gathering more than 60 gigabytes of data from a single drop of a patient’s blood. The comprehensive patient record, of which the protein data comprises only one field, is stored on Isilon IQ clustered storage along with relevant information from other research projects.
Email John Russell.
Subscribe to Bio-IT World magazine.