Making Data Storage More Secure, More Flexible and Less Expensive for Medical and Scientific Applications

December 15, 2023

Contributed Commentary by Eric Herzog, Infinidat 

One of the major side effects of the massive increase in the use of artificial intelligence (AI) for basic research, translational research, clinical research, pharmaceutical development, and patient care is the tsunami of data that will be created. All of this data needs to be stored somewhere, whether on-premises, in the cloud, or, ideally, in a combination of the two so that IT gets the best of both worlds.

The exponential, unprecedented growth of data that comes with the industry’s embrace of AI creates challenges for medical and scientific institutions. Capacity, performance, availability, complexity, cyber resilience, and cost are core issues at the forefront of any information technology plan that must keep a laboratory’s or a medical facility’s data always available, secure, and affordable.

For example, the data-intensive tools used to map genomic code can send sequencing data volumes, and the associated costs, spiraling. In fact, any kind of omics-based data, especially with the rise of spatial transcriptomics, various forms of proteomics, and digital pathology, can easily force a facility to scale to multiple petabytes of data capacity at any given time.

Another example is when a healthcare data steward aims to leverage the large datasets they have to both drive innovation and monetize the data by allowing AI algorithm developers to compute on those real-world, high-quality datasets. A zero-trust approach with confidential computing is recommended. What’s needed to support it is a highly reliable, enterprise-grade storage platform that delivers 100% availability.  

Neither data stewards nor AI developers would want to see a disruption in the availability of the datasets that are being used to train AI algorithms. Neither would they want to see any security breaches of the data from cyberattacks. They also need to control costs and protect intellectual property, while simultaneously reducing complexities.  

The keys to success for IT teams in the biotech, pharma, and life science fields are to plan ahead and understand how the evolution of enterprise storage dovetails with the evolving needs for super high-performance, higher-capacity storage capabilities to handle various workloads and applications. 

Hybrid Cloud

Experienced CIOs and IT teams have come to realize that a hybrid cloud approach is smarter than simply “putting everything in the cloud.” By keeping data on-prem, enterprises retain more control of the data, more control of the costs associated with storing and managing it, and more flexibility in how it is used. Availability of data can also reach 100% on-prem, but it does not go anywhere near as high in the public cloud, where it is usually a few nines. However, the public cloud is a suitable option for backup, disaster recovery/business continuity use cases, and archived data that a facility does not need to access often.
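To put the availability figures above in concrete terms, the following minimal Python sketch converts an availability percentage into the downtime it allows per year. The function name and the sample percentages are illustrative assumptions; they are not figures quoted by any cloud provider or storage vendor.

```python
# Illustrative arithmetic only: the helper name and sample values are assumptions.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def annual_downtime_minutes(availability_percent: float) -> float:
    """Return the minutes per year a service may be down at this availability."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    return MINUTES_PER_YEAR * (1.0 - availability_percent / 100.0)


if __name__ == "__main__":
    # Compare two, three, and four nines against 100% availability.
    for pct in (99.0, 99.9, 99.99, 100.0):
        minutes = annual_downtime_minutes(pct)
        print(f"{pct}% available: about {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, each additional nine cuts the allowed downtime by a factor of ten, which is why the gap between "a few nines" and 100% matters for datasets that must always be reachable.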

A fact that is too often missed is that putting data that needs to be readily available into the public cloud, and then bringing it back, is very expensive. Even if the cloud provider promises “no hidden costs” for the cloud service, the escalating network transaction costs of moving data back and forth are an unexpected, additional expense.

Utilizing a mix of private and public cloud gives an enterprise the ability to get the best of both worlds – the control, availability, security, and flexibility of the on-prem, private cloud, as well as the scalability and less time-sensitive use cases of the public cloud. This hybrid cloud approach undergirds always-on availability of applications and workloads, and it best supports collaboration among scientists and medical professionals who are geographically dispersed. In addition, it negates the need for a healthcare institution or biotech company to make “trade-offs” across the data infrastructure.  

Cost Savings

Storage is often the largest single IT cost of a biotech company. However, there are ways to reduce costs while maintaining, or even increasing, storage capacity and high performance. One of the best strategies is storage consolidation.

A biotech company may have 12 storage arrays that were purchased and added at different times over the years. By consolidating the 12 arrays down to two arrays at petabyte scale, the biotech company will substantially lower CAPEX and OPEX. The consolidated environment requires less power, less space, less manpower, and less time-consuming management. An IT team sets itself up for success when it consolidates onto a multi-petabyte platform for a very rapid return on investment (ROI) and lower ongoing expenses in the face of rising amounts of data.

At the same time, a flexible consumption model can be adopted. This creates a pay-as-you-go option: a biotech company pays only for the storage capacity it uses and does not need to make a huge capital expenditure upfront. Even if an on-prem storage solution is being used, the private cloud delivers a cloud-like experience. The model could be Capacity-on-Demand or another flavor of Storage-as-a-Service (STaaS) for ultimate flexibility. STaaS has advanced significantly over the past year. The best solutions come with AIOps capabilities and integrations with a wide variety of complementary software packages.

Cyber Resilience

Cyberattacks are rampant. A ransomware attack could cripple a pharmaceutical company’s ability to conduct a clinical trial. Cyber criminals can corrupt data or hold it for ransom, such as the data used for patient stratification in a Phase 3 clinical trial. The security and privacy of the patient data are of utmost importance as well.

This year cyberattacks are expected to cost enterprises $8 trillion worldwide. Not only were large health systems and research institutions held for ransom by cybercriminals during the COVID pandemic, but the threats have also gotten worse since then.

During this past summer, an increased number of ransomware groups launched attacks against health systems, even to the point of disrupting patient care. John Riggi, national advisor for cybersecurity and risk for the American Hospital Association, revealed in July that, as of June 2023, more than 220 cyberattacks had targeted health systems and hospitals, affecting approximately 36 million people. To put it in context, 44 million people were affected by cyberattacks in all of 2022.

To address this security threat, medical and pharmaceutical companies need to enhance their cyber resilience and recovery. Rapid recovery of data nullifies the potential adverse impact of a ransomware attack.  

Cyber resilience has become a fundamental part of industry-leading enterprise storage solutions. The components of a well-rounded solution include immutable snapshots, cyber detection, logical air-gapping, the use of a forensic environment to identify a known clean copy of the data, and recovery at near-instantaneous speed. This helps enterprises avoid a disruption, knock out the cybercriminal threat, and restore the vital data, safely and securely.  
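The recovery step described above, identifying a known clean copy among immutable snapshots, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the Snapshot fields, the scan verdict, and the latest_known_clean helper are hypothetical and do not represent any vendor's actual API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Sequence


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical snapshot record; field names are assumptions for illustration."""
    name: str
    taken_at: datetime
    immutable: bool   # True if the snapshot cannot be altered or deleted
    scan_clean: bool  # verdict from a hypothetical scan in the forensic environment


def latest_known_clean(snapshots: Sequence[Snapshot]) -> Optional[Snapshot]:
    """Return the most recent immutable snapshot that passed the scan, or None."""
    candidates = [s for s in snapshots if s.immutable and s.scan_clean]
    return max(candidates, key=lambda s: s.taken_at, default=None)


if __name__ == "__main__":
    history = [
        Snapshot("monday", datetime(2023, 12, 11), immutable=True, scan_clean=True),
        Snapshot("tuesday", datetime(2023, 12, 12), immutable=True, scan_clean=True),
        Snapshot("wednesday", datetime(2023, 12, 13), immutable=True, scan_clean=False),
    ]
    chosen = latest_known_clean(history)
    print(chosen.name if chosen else "no clean snapshot available")
```

The design choice in this sketch is to recover from the newest snapshot that is both immutable and verified clean, rather than simply the newest snapshot, because the most recent copy may already contain corrupted data.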

Ease of Use

When a major laboratory in a healthcare institution is doing important clinical research into cancer biomarkers with advanced imaging and spatial biology tools to study the tumor microenvironment, the staff does not want the burden of a complicated storage infrastructure. The logical thing to do is to inject automation, but it’s beneficial to go one step further—autonomous automation.  

Storage for enterprises has become much easier to use and easier to manage. The adoption of autonomous automation provides the ability for the system to direct the data and find optimal paths of utilization. The system knows whether a workload needs a higher-performance capability with the lowest possible latency in an all-flash system versus a different workload that could go to a disk-based system or go out to the cloud to be archived.  
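As a rough illustration of the placement decision described above, the Python sketch below routes a workload to all-flash, disk, or cloud archive. The Workload fields, the tier names, and both thresholds are assumptions chosen for the example; a real autonomous system would learn or tune such rules rather than hard-code them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical workload description; field names are assumptions."""
    name: str
    max_latency_ms: float        # latency the application can tolerate
    days_since_last_access: int  # how long the data has gone untouched


# Illustrative thresholds only; not taken from any product.
FLASH_LATENCY_MS = 1.0
ARCHIVE_AFTER_DAYS = 180


def choose_tier(workload: Workload) -> str:
    """Pick a storage tier from the workload's access age and latency needs."""
    if workload.days_since_last_access >= ARCHIVE_AFTER_DAYS:
        return "cloud-archive"  # rarely accessed data goes out to the cloud
    if workload.max_latency_ms <= FLASH_LATENCY_MS:
        return "all-flash"      # latency-sensitive work stays on flash
    return "disk"               # everything else fits a disk-based system


if __name__ == "__main__":
    for w in (
        Workload("live-imaging-analysis", 0.5, 0),
        Workload("batch-reprocessing", 20.0, 14),
        Workload("completed-trial-data", 50.0, 400),
    ):
        print(f"{w.name}: {choose_tier(w)}")
```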

There is no need for human intervention. The storage system “thinks for itself,” identifies problems before they have a tangible impact, and can “heal itself” when necessary. Utilizing AI capabilities and integrations, the built-in automation simplifies the storage of data that is generated and examined in clinical analysis. It allows the scientists to focus on the science, not on difficulties created by a complex storage estate.

All in all, the combination of hybrid cloud, flexible consumption models, cyber resilience, and autonomous automation creates a formula for super high-performance, more reliable and less expensive storage, while allowing you to achieve all of your scientific goals or patient care delivery objectives.  


Eric Herzog is the Chief Marketing Officer at Infinidat. Prior to joining Infinidat, Herzog was CMO and VP of Global Storage Channels at IBM Storage Solutions. His executive leadership experience also includes: CMO and Senior VP of Alliances for all-flash storage provider Violin Memory, and Senior Vice President of Product Management and Product Marketing for EMC’s Enterprise & Mid-range Systems Division. He can be reached at