By Salvatore Salamone
May 9, 2003 | FOR YEARS, companies have prepared for the worst. And that was before Sept. 11, 2001. In the wake of that tragedy, perceptions of how to safeguard corporate data have changed. The lessons learned from the World Trade Center catastrophe have formed the blueprints that life science companies must adopt to ensure survival after a disaster.
Planning for Armageddon
Most disaster recovery plans include several common features, such as the use of multiple networks to provide remote sites with redundant paths to a data center, secure off-site storage of backup data, and online hot backup data center sites.
The key shift in disaster-recovery planning since Sept. 11 is termed business continuance planning, which builds upon the time-honored, back-up-everything-on-tape disaster-recovery method. In the event of a major disaster, business continuance planning looks for ways to quickly restore day-to-day operations and prevent the loss of critical company data. Business continuance planning also strives to improve the efficiency of normal day-to-day operations, providing methods to smooth out small glitches that can lead to loss of data, productivity, or business.
While backing up data to tapes that are stored securely offsite remains essential, business continuance planning relies on additional methods to facilitate disaster recovery.
Data mirroring — where data are replicated to a second live data center — is a crucial component of a thorough business continuance plan. The idea is to have a copy of all vital data online at a second site. This does not preclude backing up the data to tapes — indeed, the combination of mirroring and tape backup inspires confidence that no data will be lost in a major disaster.
"Backing data up to tapes on a daily basis is still our safety net, but we were concerned that the time required to retrieve data from offsite storage after a major disaster would be significant," says David O'Neill, network administrator at a specialty drug manufacturer, which O'Neill did not want named. "That's one reason we looked at replicating data to multiple sites. It ensures that data are always available to our staff."
But data mirroring is not cheap. "Tapes are still relatively inexpensive compared to disk drives," O'Neill says. "Data mirroring requires almost double the disk space as is required to store the original data."
One approach to data mirroring is real-time data replication to offsite storage devices, which has the advantage of virtually no data loss if the primary site is out of commission. The downside, however, is that moving large files (medical images, for example) throughout the business day could clog WAN connections and slow applications that access the data. Alternatively, data can be copied to offsite storage systems during off-peak hours, when network usage is lower.
In either case, technology can help expedite the data transfer process. Data caching and compression are commonly applied to files before offsite transfers. Products from companies such as Peribit Networks reduce the amount of data that needs to be sent between two locations. Peribit's tools use algorithms that search data for patterns and build tables that store identifying labels. Any time the same patterns are detected after the labeling process takes place, only the labels (and not the actual data) are transmitted to the secondary site.
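The pattern-and-label idea can be illustrated with a minimal sketch. This is not Peribit's actual algorithm, which the article does not detail. It is a generic dictionary-based scheme under stated assumptions: fixed 64-byte chunks stand in for "patterns," a truncated SHA-256 hash stands in for the "identifying label," and the names `encode`, `decode`, and `wire_size` are hypothetical.

```python
from __future__ import annotations

import hashlib

CHUNK_SIZE = 64   # bytes per pattern; an arbitrary value chosen for illustration
LABEL_SIZE = 8    # bytes per label; also an assumption


def _label(chunk: bytes) -> bytes:
    """Derive a short identifying label from a chunk's contents."""
    return hashlib.sha256(chunk).digest()[:LABEL_SIZE]


def encode(data: bytes, table: dict[bytes, bytes]) -> list[tuple[str, bytes]]:
    """Primary site: replace chunks already in the table with their labels.

    `table` maps label -> chunk and persists across calls.
    """
    messages = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        label = _label(chunk)
        if table.get(label) == chunk:
            messages.append(("ref", label))   # seen before: send the label only
        else:
            table.setdefault(label, chunk)    # first sighting: remember it
            messages.append(("raw", chunk))   # and send the actual data
    return messages


def decode(messages: list[tuple[str, bytes]], table: dict[bytes, bytes]) -> bytes:
    """Secondary site: rebuild the data, learning new patterns as they arrive."""
    out = bytearray()
    for kind, payload in messages:
        if kind == "raw":
            table.setdefault(_label(payload), payload)
            out += payload
        else:
            out += table[payload]             # look the label up in the local table
    return bytes(out)


def wire_size(messages: list[tuple[str, bytes]]) -> int:
    """Payload bytes sent, ignoring the small per-message type tag."""
    return sum(len(payload) for _, payload in messages)


if __name__ == "__main__":
    sender_table: dict[bytes, bytes] = {}
    receiver_table: dict[bytes, bytes] = {}
    data = bytes(range(64)) * 10              # 640 bytes, one pattern repeated 10 times
    msgs = encode(data, sender_table)
    assert decode(msgs, receiver_table) == data
    print(len(data), "bytes reduced to", wire_size(msgs))  # 640 bytes reduced to 136
```

In this toy run the repeated pattern crosses the link once in full and nine times as an 8-byte label. Real traffic rarely repeats so neatly, so the savings shown here should not be read as typical.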
One salutary lesson learned following Sept. 11 was that business continuance requires more than brute-force technical solutions. For example, some large companies that were affected by the World Trade Center collapse and had secondary data centers in other cities discovered that the existence of those centers was not enough to restore normal operations.
Hidden Costs of Downtime
Downed systems cost companies money. Here are some points to consider when trying to cost-justify a disaster-recovery system.
"There was always the assumption that if a major data center was lost, you could always put IT staff on an airplane and get them out to a backup data center, so they could get vital business systems up and running," says Raymond Lopez, a consultant at Rosewall and Associates, an IT consulting firm. "This was not possible for several days after September 11, since domestic travel was completely shut down." Of course, this assumes that IT staff would be available to travel. Several companies lost many or all of their IT personnel on Sept. 11.
But correctly following backup procedures does not ensure that all vital data will be preserved, either. "Firms were more vulnerable than expected to the amount of data still stored on paper or on users' desktops," says Nicholas Parks, an analyst at TowerGroup, a global services research firm.
For these reasons, Parks says, business continuance planning "now considers human and environmental factors in a disaster, with the goal of continuous availability and performance of the business, rather than just the restoration of operations in hours or days following a catastrophe."
Lopez agrees: "Disaster recovery is as much about developing policies and practices and putting those things into place, as it is installing hardware and software systems."
Life science companies would do well to follow the procedures implemented by the financial industry post-Sept. 11. For instance, to minimize the loss of crucial business information stemming from damage to papers or desktop computers, many companies are scanning paper documents as soon as they are created or received. Additionally, many firms are limiting the amount of data an employee can store locally on his or her desktop, transferring more data to networked storage systems that are regularly backed up.
"Preparation is the key to ensure that businesses can quickly rebound after a disaster," says Tony Adams, principal analyst at Gartner's IT services group. "Businesses now more widely understand that they must prepare in advance to meet the challenges of a disaster."
But many companies are finding they simply do not have the money to get started. In a recent Gartner survey of 205 IT managers, 24 percent of the respondents said that lack of funds was preventing implementation of a disaster-recovery plan. One in three companies even admitted they would lose critical data or operational capability if a disaster occurred. And 37 percent indicated they needed additional funding to carry out their disaster-recovery plan.
The irony here is that if companies do not spend money in advance on disaster-recovery planning, they will spend far more after a disaster. Knowing this, some life science managers are justifying the costs of disaster-recovery planning and implementation by showing upper management tangible savings that such systems and procedures bring in other areas. "I try to show my CIO that any investment in disaster recovery saves us money in our day-to-day operations," says Charles Mitchell, director of IT at a New Jersey pharmaceutical company research center. "The same systems and procedures that would give us access to data in a disaster can be used to improve normal systems availability."
Mitchell notes that disaster-recovery data-mirroring techniques can also be used to access data during routine maintenance or backup of storage systems. "The selling point isn't, 'Let's spend all this money in case of catastrophe,' it's more, 'Here's what we need to keep everyone happy during normal conditions, and, by the way, for no additional money we cover our [behinds] in case of a major disaster.'"
Mitchell is not alone in employing this strategy. "Developing a [return on investment] proposal for security or disaster-recovery systems is difficult," Lopez says. "But showing management that investments in IT systems will improve the resiliency of existing systems and reduce day-to-day operational costs makes a much stronger case."
Most companies will thankfully never experience a disaster akin to Sept. 11. But virtually all companies experience short-term disruptions of their systems and consequently are developing plans that deal with mundane problems to reduce downtime.
Life science IT managers need to ensure the availability and resiliency of their systems. Virtually all companies have systems in place to keep the business running when minor glitches, such as power disruptions, play havoc with operations.
Question of Survival
In a survey of 163 business managers by the consultancy Contingency Planning Research and Contingency Planning & Management magazine, 7 percent of companies said their survival would be at risk if systems went down for one hour. An additional 17 percent would be at risk if the outage lasted one business day (see "Emergency Response," below).
"We could probably survive an outage for several hours or a day or so," says Jon Benson, network systems administrator at Neurome, which studies gene-expression patterns in brain function and diseases. But it would be highly disruptive. "We do data acquisition at night where the data is written to [storage] by robotic microarray systems, and we do data analysis during the day," Benson says. With company systems running 24 hours a day, "lost time means lost opportunity. It would also hold up research for our customers." Benson uses APC's InfraStruXure system for power protection.
Common problems can provide useful ways to justify the cost of business continuance planning. Take, for example, overheating. When data centers housed primarily mainframes, it was assumed that cooling requirements were uniform throughout the center. Today, most data centers are built around racks of high-performance equipment, which instead create "heat islands" (see "It's Getting Hot in Here," page 1).
Surprisingly, however, few companies are searching for more efficient ways to cool their data centers, according to Hewlett-Packard. Many apparently believe that cooling will draw more attention as higher-performance, high-density systems are deployed. Hewlett-Packard Laboratories is working on more intelligent cooling and believes it can cut data center energy consumption by 25 percent, perhaps saving a company with an average data center about $1 million annually.
Improving the normal operations of a data center can yield quantifiable savings in operational costs. This in turn can help an IT manager make a case for investing in systems that will save a company in the event of the unthinkable.
About 60 percent of companies would face economic ruin if they could not restore their systems in two to three days after a major disaster, according to an online survey of 163 business managers.
[Chart: At what point is the survival of your company at risk?]
Source: Contingency Planning Research, a division of Eagle Rock Alliance, and Contingency Planning & Management magazine