WekaIO, PetaGene To Showcase An End-to-End Optimized Genomics Workflow At Bio-IT World

April 18, 2019

By Bio-IT World News Staff

April 18, 2019 | WekaIO demonstrated an optimized genomics data analysis workflow at Bio-IT World in Boston, Mass., this week. The demonstration featured WekaIO Matrix, the world’s fastest parallel file system; PetaGene, the maker of award-winning genomics data compression solutions and a recent Bio-IT World Best of Show winner; Sentieon’s award-winning genomics tools for accelerated genomic data analysis; and Western Digital’s ActiveScale cloud object storage for long-term storage and archival.

The cost of genome sequencing has dropped dramatically, resulting in an explosion of genomic data that can be prohibitively expensive to store on legacy NAS systems. Current analytics platforms struggle to process these massive amounts of data in a timely manner, and storage costs dominate the budgets of large genomics applications. As storage costs escalate and money is diverted to pay for infrastructure, the pace of discovery slows. Together, WekaIO, Western Digital, Sentieon, and PetaGene offer the genomics industry a scalable, robust solution that delivers performance legacy NAS systems cannot match, along with cost savings that let organizations fund research rather than additional storage infrastructure.

WekaIO’s Matrix file system reduces time to discovery by providing low-latency data access and fast delivery of data to compute servers, eliminating the I/O bottleneck and the CPU starvation problems common to genomic and cancer research workloads. With a single namespace that spans local storage and the cloud, Matrix delivers simplified management and data protection. Its performance is 3x that of local file systems and 10x that of traditional NAS. Combined with PetaGene compression and with integrated tiering and remote backup to the cloud on Western Digital ActiveScale object storage, WekaIO provides unprecedented storage performance and capacity scaling for genome sequencing workloads.

“We are excited to share our work with PetaGene for the life sciences community at BioIT World,” said David Hiatt, Director of Business Development at WekaIO, in a press release. “Genomic workloads are among the most challenging for storage systems with billions of small files and intense metadata operations. Our software delivers extremely high bandwidth and IOPS performance at a fraction of the cost of NAS appliances. Combined with PetaGene’s groundbreaking compression technology, an integrated solution reduces total storage costs and dramatically improves data accessibility, helping to accelerate the pace of research and discovery.”

PetaGene genomic data compression reduces BAM and FASTQ.gz file sizes by up to 90% without any loss of information, resulting in net savings of more than 50% in overall storage costs. In addition, PetaGene compression technology reduces transfer times of genomic data by 60% to 90%. Whether these compressed files are stored locally or in the cloud, PetaGene’s PetaLink technology gives all applications, tools, and pipelines transparent and secure access to this genomic data without modifications to established workflows.
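
As a rough illustration of how a file-size reduction carries over to transfer time, the arithmetic can be sketched in a few lines of Python. This is a back-of-the-envelope sketch, not PetaGene's method: the 100 TB dataset and the 10 Gb/s link are hypothetical figures chosen for illustration, and the model assumes an idealized link where transfer time is proportional to file size.

```python
def compressed_size_tb(original_tb: float, reduction: float) -> float:
    """Size after compression; `reduction` is the fraction removed (0.9 means 90%)."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return original_tb * (1.0 - reduction)


def transfer_hours(size_tb: float, link_gbps: float) -> float:
    """Idealized transfer time in hours over a link of `link_gbps` gigabits per second."""
    bits = size_tb * 8e12  # 1 TB = 1e12 bytes = 8e12 bits
    return bits / (link_gbps * 1e9) / 3600.0


# Hypothetical inputs, not taken from the article.
original_tb = 100.0
link_gbps = 10.0

for reduction in (0.6, 0.9):
    size = compressed_size_tb(original_tb, reduction)
    before = transfer_hours(original_tb, link_gbps)
    after = transfer_hours(size, link_gbps)
    print(f"{reduction:.0%} reduction: {size:.0f} TB, {before:.1f} h -> {after:.1f} h")
```

Under these assumptions, the percentage saved in transfer time equals the percentage saved in file size. The net storage savings figure depends on costs that are not modeled here.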

“Our work with WekaIO results in a storage solution for genomics and life sciences that is easy to manage and combines industry-leading storage density and performance with breakthrough scale and economics,” Vaughan Wittorff, CCO and Co-founder of PetaGene, said in an official statement. “Our ability to provide lossless compression and workflow transparency of genomic data combined with the high performance of both PetaLink and Matrix is an infrastructure improvement that will benefit the entire genomics industry. Furthermore, the Sentieon tools offer speed and accuracy of the analysis of genomic data. PetaGene seeks to work strategically to provide the genomics industry with novel solutions and we are excited to support WekaIO with this capability demonstration.”

PetaGene also announced that it has become a NetApp Alliance Partner. Combining PetaGene’s expertise in compression techniques specifically designed for genomic data with NetApp’s leadership in data services across hybrid and multi-cloud environments offers powerful new ways to simplify genomic data management. The result is improved performance and reduced costs when conducting research using the enormous datasets created by the explosion in genomic sequencing.

In an official statement, Dan Greenfield, CEO & Co-founder of PetaGene, commented, “Partnering with a leading data services provider such as NetApp will allow holders of genomic data to access ready to deploy solutions for the storage and management of their data, with the benefits of our compression technology already built-in.”