June 13, 2007
As life science organizations increasingly pursue Web 2.0 and Semantic Web applications, they must watch for the impact on their storage requirements.
“Web 2.0 users and companies need large storage,” said James Reaney, director of research markets at BlueArc. Reaney says the field is already seeing adoption of Web 2.0-type technology, for example in microscopy and microarray imaging, where users are tagging, characterizing, and indexing the data. BlueArc offers the BlueArc Titan line of high-performance, scalable storage systems.
Others are also seeing similar storage growth trends — and noticing that companies are having a hard time planning to meet the ensuing storage demands. “When talking to customers about the move to Web 2.0, they say these applications make data growth more unpredictable,” said Brett Goodwin, vice president of marketing and business development at Isilon Systems. “Users generate content, so there is a difficulty predicting how storage demands will grow.” Isilon offers the Isilon IQ line of clustered storage systems that offer scalability, high performance, and the ability to quickly add incremental storage to a live system (See “Clustered Storage Gaining Strength,” Bio•IT World, Dec. 2006/Jan. 2007).
Databases associated with Web 2.0 applications include much more metadata than traditional applications. Moreover, many Web 2.0 applications are designed so that the data are available online all the time, rather than being taken off primary storage and archived to tape, as other data is.
Another consequence of Web 2.0 applications, Goodwin notes, is that as the community grows, so too do performance demands. As a result, traditional storage approaches become more complicated to manage if performance is to be maintained.
In traditional storage systems, performance is maintained as capacity grows by simply adding new discrete storage systems. Applications end up directly linked to physical storage devices in a one-to-one manner. When an application outgrows a device’s capacity, a new device replaces it, and every user application must be updated so it can still “find” the data. This is an area where storage systems that use a clustered or virtualized data volume approach are becoming more popular, as capacity can be added without changing the user’s applications.
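The decoupling described above can be illustrated with a minimal sketch. This is not any vendor's implementation; the class, method names, and placement policy are hypothetical, chosen only to show how a logical namespace lets physical devices be added without touching the applications that address the data.

```python
# Hypothetical sketch of a virtualized data volume. Applications address data
# by logical path only; the mapping to physical devices lives inside the
# volume, so new devices can join a live system transparently.

class VirtualVolume:
    """Presents one logical namespace over many physical storage devices."""

    def __init__(self):
        self.devices = []     # physical devices backing the volume
        self.placement = {}   # logical path -> device holding the data

    def add_device(self, device_id):
        # Capacity grows without any application-side change.
        self.devices.append(device_id)

    def write(self, logical_path):
        # Illustrative placement policy: pick the least-loaded device.
        if not self.devices:
            raise RuntimeError("no storage devices in the volume")
        load = {d: 0 for d in self.devices}
        for dev in self.placement.values():
            load[dev] += 1
        target = min(self.devices, key=lambda d: load[d])
        self.placement[logical_path] = target
        return target

    def locate(self, logical_path):
        # Applications only ever ask the volume, never a physical device.
        return self.placement[logical_path]


volume = VirtualVolume()
volume.add_device("array-1")
volume.write("/projects/microarray/run42.img")

# Capacity is added while the system is live; existing paths still resolve.
volume.add_device("array-2")
print(volume.locate("/projects/microarray/run42.img"))  # prints "array-1"
```

In the one-to-one model the article contrasts this with, replacing a full device would force every application to learn a new address; here only the volume's internal `placement` map changes.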
Others are seeing similar trends. “Content is growing virally, so capacity planning is out the window,” said Craig Nunes, vice president of marketing at 3PAR, which offers utility storage systems that include massively scalable storage arrays and software that enables new storage to be added quickly.
Nunes says companies want to add capacity to meet performance demands without complicating management. “The issues we hear with Web 2.0 companies are that they are concerned with storage growth and change management.”
Beyond meeting unpredictable storage growth rates and requiring high performance, early Web 2.0 adopters in other industries point to additional issues that companies need to be aware of when deploying Web 2.0 applications.
“Change management is critical,” said Nunes. He noted that 3PAR customers (which include MySpace) are rapidly adding large amounts of storage capacity, but must be careful not to disrupt the actions of the vast number of online community members using the storage.
And Reaney points out that besides raw storage performance (including the ability to meet the demands of a large number of simultaneous user and application requests for data), there is another performance issue that comes into play with many Web 2.0 applications: high-speed search capability. The twist with Web 2.0 applications is that searches must include the often unstructured metadata.
“Search speed is a key to many of these applications,” said Reaney. “[People] need to search quickly and want ultra-fast results.”
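One common way to make searches over unstructured, user-supplied tags fast is an inverted index, which maps each tag word to the records carrying it. The sketch below is illustrative only; the function names and the sample microscopy data are assumptions, not drawn from any product mentioned in the article.

```python
# Hypothetical sketch: an inverted index over free-form metadata tags,
# the kind of structure that keeps tag searches fast as a collection grows.

from collections import defaultdict

def build_index(records):
    """Map each lowercase tag word to the set of record IDs carrying it."""
    index = defaultdict(set)
    for record_id, tags in records.items():
        for tag in tags:
            for word in tag.lower().split():
                index[word].add(record_id)
    return index

def search(index, query):
    """Return record IDs matching every word in the query (AND semantics)."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# User-tagged imaging data, echoing the article's microscopy example.
images = {
    "img-001": ["confocal microscopy", "hela cells"],
    "img-002": ["microarray scan", "expression data"],
    "img-003": ["confocal microscopy", "expression data"],
}
index = build_index(images)
print(sorted(search(index, "confocal")))             # ['img-001', 'img-003']
print(sorted(search(index, "expression confocal")))  # ['img-003']
```

Because lookups touch only the posting sets for the query words rather than scanning every record, query time stays roughly flat as the tagged collection grows.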