Bio-IT World Honors New Products

April 7, 2016

By Bio-IT World Staff

April 7, 2016 | BOSTON—A panel of expert judges and the Bio-IT community chose to honor six new products yesterday evening at the Bio-IT World Best of Show competition. A record-breaking 46 new products in IT and the life sciences were considered this year from the 190 Bio-IT World exhibiting companies.

The Best of Show awards program at the Bio-IT World Conference and Expo recognizes the top innovative product solutions for the life sciences industry on display at the conference in Boston. New products released or significantly improved in the past year are eligible for consideration.

“This was a record year for innovation within bio-IT,” said Allison Proffitt, Bio-IT World’s Editorial Director. “The Best of Show program honors the creativity and commitment that is pushing IT and the life sciences forward.”  

The Best of Show program relies on a panel of expert judges from academia and industry who screen eligible new products and hear presentations on site. This year judges considered 46 new products, and viewed presentations on site from 16 finalists.

The 2016 judging panel included Michael Barmada, University of Pittsburgh; Catherine Brownstein, Boston Children’s Hospital; Joe Cerro, BostonCIO; Chris Dwan, Broad Institute; Martin Gollery, Tahoe Informatics; Richard Holland, New Forest Ventures; Eleanor Howe, Broad Institute; Alan Louie, IDC; Aaron Krol, Bio-IT World; Phillips Kuhl, Cambridge Innovation Institute; Katya Mantrova, Independent Consultant; Jerry Schindler, Merck; Alex Sherman, Massachusetts General Hospital; and Proffitt.

Winners were named in four categories this year: Research & Clinical Data Management; Informatics Tools & Data Analytics; Knowledge Management; and Optimizing Speed & Storage. For the second year, the Bio-IT World community voted on a People’s Choice Award winner, and the judges also chose to name an In Silico Futures Award winner and an honorable mention.

The 2016 Bio-IT World Best of Show Winners Are:

Research & Clinical Data Management

Winner: Wiley, ChemPlanner 1.0.4

Wiley ChemPlanner can make creating routes faster and easier. Using a combination of novel reactions and curated information, ChemPlanner delivers computer-aided synthesis design backed up by millions of empirical reactions.  Wiley ChemPlanner builds the route for you! Simply plug in your target compound and your starting material and ChemPlanner delivers a wide variety of diverse and viable routes in a matter of minutes.

The tool, launched in September 2015, is currently sold as a Software as a Service (SaaS) solution hosted on secure servers, with a local installation version coming this year.

Supported browsers are Firefox 30+, Chrome 35+, Internet Explorer 9+, Safari 6+, and Opera 23+. Note: Java is not required.

Informatics Tools & Data Analytics

Winner: Seven Bridges, The Cancer Genomics Cloud

The CGC allows researchers to immediately and securely access the complete TCGA dataset on the cloud, perform analyses, visualize the results, and do so collaboratively. The dataset includes raw and processed data from whole genome, whole exome, RNA, microRNA and bisulfite sequencing studies, as well as array-based studies. TCGA data has two tiers: Controlled Access data, in which patients are potentially identifiable, and Open Access data, in which they are deidentified. Users’ dbGaP approvals are automatically associated with their CGC account using eRA Commons credentials, so data can be easily found and used. Private data can also be imported to the CGC using one of several rapid mechanisms.

Researchers using the CGC can interactively query TCGA data based on cases, file types, and their associated clinical metadata with the data browser. A visual case explorer allows users to browse the mutation status and expression levels of a gene in all patients with a particular disease. Users can privately analyze their own data alongside TCGA using pre-built bioinformatics workflows. Moreover, the CGC software development kit allows users to port their own tools to easily run in a cloud environment. Every analysis is fully reproducible, as the CGC captures the data, parameters, and specific version of each tool for every execution. Project management tools allow teams to collaborate simply, securely, and transparently.
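The idea of capturing data, parameters, and tool version for each execution can be illustrated with a minimal sketch. This is not the CGC's API or its internal record format; `ExecutionRecord`, `make_record`, and the example tool and file names are all hypothetical. The sketch also assumes that the order of input files does not affect the result.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ExecutionRecord:
    """Hypothetical provenance record for one tool execution."""
    tool: str
    tool_version: str
    inputs: tuple       # input file identifiers, sorted
    parameters: tuple   # (name, value) pairs, sorted by name

    def fingerprint(self) -> str:
        # Serialize deterministically, then hash, so that two runs with the
        # same tool version, inputs, and parameters share one fingerprint.
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def make_record(tool, tool_version, inputs, parameters):
    # Assumption: input order is irrelevant, so inputs are sorted.
    return ExecutionRecord(
        tool=tool,
        tool_version=tool_version,
        inputs=tuple(sorted(inputs)),
        parameters=tuple(sorted(parameters.items())),
    )


run_a = make_record("aligner", "1.2.0",
                    ["sample_2.fastq", "sample_1.fastq"],
                    {"threads": 4, "seed": 7})
run_b = make_record("aligner", "1.2.0",
                    ["sample_1.fastq", "sample_2.fastq"],
                    {"seed": 7, "threads": 4})
run_c = make_record("aligner", "1.3.0",
                    ["sample_1.fastq", "sample_2.fastq"],
                    {"seed": 7, "threads": 4})

# Same tool version, data, and parameters give the same fingerprint;
# changing only the tool version gives a different one.
print(run_a.fingerprint() == run_b.fingerprint())
print(run_a.fingerprint() == run_c.fingerprint())
```

Recording the tool version alongside inputs and parameters is what lets a later rerun be matched against the original execution.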


Honorable Mention: Genestack, Genestack Platform

Genestack is a universal enterprise-level genomics applications platform. It is a next generation operating system for big data problems, designed to run on heterogeneous compute architectures (cloud, cluster, PC, custom hardware), with bioinformatics-specific features. It helps build interactive applications and flexible computational pipelines within a secure collaborative ecosystem. Our “smart file” based virtual file system makes tools compatible across data format limitations and simplifies computations.

Genestack includes numerous computational pipelines for common workflows: whole genome and exome analysis, quality control, variant calling and annotation, transcriptomics for differential gene/isoform expression, transcriptome assembly, isoform discovery, methylation analysis and others. Interactive apps built and available on Genestack include a novel genome browser with computable tracks, interactive variation explorer for on-the-fly filtering and analysis of variants across populations, visual tools for quality control assessment and outlier detection, and others.

The platform has a powerful metadata system, making use of controlled vocabularies to help users annotate and harmonise data. We index public data from repositories worldwide, and map it to major ontologies. Our data browser can search data across private and public domains; the format-independent architecture makes it easy to combine and compute on data from diverse sources.

An SDK and powerful APIs exist for building Genestack applications. We collaborated with hospitals to put our platform on local servers and deliver patient reports, and with companies on interactive applications for exploring complex datasets. It is easy to get started and build and distribute amazing apps.

Knowledge Management

Winner: Eagle Genomics, eaglediscover

At BioIT, Eagle Genomics is announcing the release of a new solution, eaglediscover, to directly enable pharmaceutical and biotech R&D executives and scientists to most effectively exploit their scientific data (internal, collaborator, and public data). eaglediscover builds upon the data sharing and collaboration capabilities of eaglecore, which was brought to market in 2014. eaglediscover brings to the industry a unique capability to attribute economic and scientific value to data through a statistical and probabilistic measurement framework and a learning-based biocuration engine. This approach effectively yields “smart data”.

eaglediscover is an expert-guided learning system that solves the industry problem by enabling the exploration of the meta-data rather than the raw data itself. Industry standard and proprietary ontologies are used to bootstrap the system. In this way, a contextually-relevant meta-data catalog is created, with the data being statistically scored for relevance based on the scientific and business questions being posed. eaglecore may then be employed to manage the data, e.g. enabling the generation of study-specific data marts.

eaglediscover can be deployed as a public, private, or hybrid cloud-based solution. Users interact with the system through an advanced web-based conversational interface that has been designed to enable rapid exploration of the data to identify the most valuable data sets.

At BioIT, Eagle will be demonstrating eaglediscover, operating on the ICGC data set.

Optimizing Speed & Storage

Winner: PetaGene, PetaSuite 1.0

PetaSuite is a set of complementary software tools that significantly reduce the size and cost of NGS data for storage and transfer. PetaSuite lets researchers and clinicians continue using their FASTQ, BAM, and CRAM files in their existing tools and pipelines, but benefit from a reduced backend storage footprint. It can integrate into most existing storage infrastructures to provide transparent compression.

Unlike generic storage software, PetaSuite understands the internals of genomics files. For lossless storage, PetaSuite offers cost reductions of up to 4:1 compared to BAM or gzipped FASTQ files. When used with our revolutionary Bayesian approach to genomic quality score compression, genotype accuracy is preserved or even improved while reducing storage size by 5:1.

For example, on Illumina Hi-Seq-X 30x WGS human sample NA12878, the original FASTQ.GZ files are 73.73GiB in size, whereas with PetaSuite this is reduced to 13.69GiB (5.3x smaller). Moreover, it is still accessible in FASTQ format for pipelines to use.
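The quoted figures can be checked with a short calculation. The sketch below is only arithmetic on the two sizes reported above; the function name is illustrative and not part of PetaSuite. With these rounded sizes the ratio comes out slightly above the quoted 5.3x, which suggests the published figure was truncated or computed from unrounded sizes.

```python
def compression_ratio(original_size, compressed_size):
    """Return how many times smaller the compressed data is."""
    return original_size / compressed_size


original_gib = 73.73    # FASTQ.GZ size reported for NA12878
compressed_gib = 13.69  # size reported after PetaSuite compression

ratio = compression_ratio(original_gib, compressed_gib)
saved_gib = original_gib - compressed_gib
percent_saved = 100 * saved_gib / original_gib

print(f"ratio: {ratio:.2f}x")
print(f"saved: {saved_gib:.2f} GiB ({percent_saved:.1f}% of the original)")
```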

PetaSuite consists of several complementary software tools:

- FasterQ: FASTQ compression at 140MB/sec (4-core i7), smaller than CRAM, uses 4GB of RAM. Streaming compression/decompression for file transfer acceleration.
- BayesCal: revolutionary approach to quality score refinement for BAM/CRAM/FASTQ, calculates a more complete Bayesian estimation of sequencer error. Improves compressibility by 2-3x while preserving/improving genotyping accuracy. It requires 24GB of memory.
- PetaVFS: virtual file system that provides high performance random access BAM/FASTQ virtual files representing CRAM/FasterQ compressed data. It also can split out internals of NGS data across storage tiers for lossy and lossless access.


In Silico Futures Award

Winner: Dassault Systèmes, BIOVIA, The Living Heart Model

Advanced imaging modalities 3DCTA and DT-MRI were used to define the physical attributes of the heart structure, while the conservation laws of continuum mechanics capture the physical response. The LHM contains well-defined anatomic details including internal structures (e.g., heart valves, chordae tendineae, coronary arteries and veins) and proximal vasculature (e.g., aortic arch, pulmonary trunk, and SVC). Muscle fiber orientations, which vary across the surface and thickness of the heart, are included, as are anatomically accurate representations of special cardiac electrical channels (bundle of His and Purkinje network).

Cardiac contraction is driven by waves of electrical excitation traveling across the heart to generate physiologically observed wave propagation patterns. The mechanical behavior of heart tissue uses an anisotropic hyperelastic formulation for passive behavior and a time-varying elastance model for the active response. The LHM benefits from SIMULIA’s extensive library of nonlinear material behaviors and experience with biological materials.

A closed system of fluid cavities and fluid links models blood flow. The fluid and solid models are directly coupled, and the systemic and pulmonary circuits are endowed with vascular compliances and flow resistances that can be modified to simulate exercise, hypertension, and other physiological states. Optionally, 3D blood flow modeling is available using smoothed particle hydrodynamics (SPH) for computational efficiency, or by coupling the LHM with traditional CFD solvers if needed. To give users a choice of balance between accuracy and efficiency, the LHM is available with three mesh variants, and computation times range from 4 to 24 hours on a 64-CPU workstation.

People's Choice Award

Winner: Illumina, BaseSpace Suite

Illumina developed BaseSpace Suite, a comprehensive, streamlined and fully integrated informatics solution to support end-to-end genomic sequencing. BaseSpace Suite consists of four key components built on an integrated software platform, and the suite as a whole is tightly integrated with Illumina sequencing instruments.

The first component of BaseSpace Suite is Clarity LIMS. Clarity is used to manage and track samples throughout the entire workflow, reducing errors and ensuring the traceability necessary in regulated environments.

Once generated, sequence data is automatically uploaded into Sequence Hub (previously BaseSpace Cloud), a safe and secure cloud environment to hold the increasing volume of sequence data. Sequence Hub has over 70 apps that can be pipelined together to perform secondary analysis, including customer-developed pipelines. Sequence Hub is also a collaboration environment allowing researchers to share data and analyses to advance genomic knowledge.

Once secondary analysis has been done, variant calling, based on rules defined by users, identifies variants of interest in the sample. BaseSpace Suite provides Variant Interpreter (currently shipping in Beta) for this task.

The final key stage of genomic data analysis is to move from the individual patient to a cohort view. BaseSpace Cohort Analyzer allows users without advanced bioinformatics experience, or without access to bioinformatics resources, to aggregate and analyze cohorts of patients by integrating complete clinical records with genomics data.

BaseSpace Suite provides the streamlined, comprehensive acquisition and analysis capabilities necessary to derive more and higher-quality answers from genomic data.