Inova Translational Medicine Institute Deploys Cloudera

June 6, 2017

By Bio-IT World Staff

June 6, 2017 | Inova Translational Medicine Institute today announced that it has deployed Cloudera Enterprise to securely analyze massive collections of clinical and genomic data at unprecedented speeds and scale for faster innovations in translational medicine research.

The Institute collects clinical data from thousands of Inova patients born in more than 110 countries, and is assembling what is expected to be one of the world’s largest whole genome sequence databases connected to patient information in a healthcare system. The Cloudera platform enabled ITMI to streamline its genomic data analysis for discovery. In the past, given the massive size of whole genomes, this process could take ITMI about two months. Using Cloudera, ITMI can complete end-to-end data analysis in one week. In the future, ITMI expects to complete these analyses in just hours.

“The challenge for ITMI researchers and scientists was to analyze our highly complex, massive collection of raw data faster and more efficiently and translate insights into practical patient care. We’re now able to get answers in minutes and seconds and can find correlations that we couldn’t see before,” said Aaron Black, chief data officer of ITMI, in a press release. “Our researchers used to spend 80% of their time on data wrangling and only a sliver of time on the analytics. We’re in the process of reversing that. We can now accelerate the pace of genomic discovery and dramatically change the way we interact with our research teams. We believe that will improve our ability to provide the right treatments to the right patients and ultimately, improve outcomes. What Cloudera has done is made this imminently possible.”


Working with Cloudera, ITMI built a world-class bioinformatics infrastructure for the Institute's rapidly growing collection of genomes paired with clinical records. The infrastructure was designed to store and process this convergence of biological data at speed and scale, well into the future.


ITMI currently tracks approximately 9,000 sequenced whole genomes and plans to scale to 15,000. Cloudera’s modern analytic database, powered by Apache Impala (incubating), brings high-performance SQL analytics to big data. With the flexibility, scale, and speed Cloudera provides, ITMI’s team will be able to run concurrent, multi-user, high-performance analyses of genomic data gathered from mothers, fathers and infants enrolled in various family-based studies. For example, ITMI has been able to leverage its clinical and genomic analysis expertise to help discover previously undiagnosed congenital anomalies in infants. This is a time-consuming and iterative process, but with tools like Cloudera, ITMI anticipates accelerating these discoveries to help these families.


“Inova’s unique and leading edge big data architecture matches the diversity in their patient community and their breadth of innovation. Cloudera is proud to work with these pioneers in clinical genetics at scale, who are advancing genomic research and personalized healthcare,” said Shawn Dolley, industry leader, health and life science at Cloudera, in the statement. “ITMI is advancing the way researchers and clinicians can consume and manage genomic and molecular data. Combining clinical and genetic data and layering in machine learning is how we will transform the decisions we make in patient care, disease prevention and precision public health.”