By John Russell
July 16, 2009 | The use of leading edge informatics tools is part of Celera’s DNA. In the race to sequence the human genome, high performance computing and sophisticated home-grown informatics were as important as high speed sequencing machines to Celera’s success. Celera is hardly that company today, having redefined its mission to become a leader in diagnostics and “personalized disease management.” What hasn’t changed is its use of leading edge informatics.
While widespread adoption of automated workflow tools has been sluggish in life sciences, unlike in many other industries, Celera has jumped onboard, building on its several-years-old relationship with workflow platform vendor InforSense. Making sense of vast quantities of data has always been central to Celera’s mission; today that generally means performing GWA studies and associated functional genomics work to identify and help validate biomarkers.
As explained by John Sninsky, Celera’s VP, Discovery Research, the turn to automated workflows for informatics analysis occurred as a result of its business plan evolution. Famously founded to sell access to its treasure trove of high quality data, Celera transitioned its business model “over the years to have solely a diagnostic and pharmacogenomic focus.” Roughly two years ago Celera purchased a CLIA-approved clinical reference laboratory, the Berkeley Heart Lab, enabling it to also offer services and have the potential to generate new in vitro diagnostic products.
In changing direction, Celera sought to exploit unmet medical needs and initially settled on six areas, including autoimmune disease, neurological disease, liver disease, cancers, and cardiovascular disease. “Some were combinations of diseases, cardiovascular diseases such as not only myocardial infarction but also stroke,” says Sninsky. A discovery team for each area was established and work progressed.
“Over the years we have found important associations but in some cases the therapeutics area hasn’t developed as rapidly as we would like. For example, we were hoping that new drugs would have come on board for Alzheimer’s disease so that our risk markers would have had value in early evaluation of treatment. Unfortunately, those therapeutics have not come along so we’ve discontinued our Alzheimer’s work,” he says.
Celera has pragmatically pared back in areas in which the payoff seemed more distant. “The three areas that we’re focusing on now are cardiovascular disease, cancer, and liver disease. So we went from eight different teams down to three teams. [Nevertheless], you can imagine having eight teams do analyses; very quickly their processes diverged.”
Two major challenges prompted the adoption of automated workflows. “One is the industrial scale in which we have operated since our inception whether that’s from a sequencing point of view or from a SNP discovery or messenger RNA profiling perspective. That industrial approach generates large amounts of data and complex data that you need to filter, sort, analyze, and interpret,” he says.
The second issue “was that as we pushed some of these analysis tools out to the disease area teams within Celera there began to be idiosyncratic modifications and decisions made about how one analysis would be done and what kind of filters that would be used. We started ending up with a non-standardized analysis, very similar but different for different disease indications.”
Adopting a platform able to automate workflows was an attractive solution. Sninsky knew InforSense CSO Jonathan Sheldon from when they both worked at Roche and called him to learn more. The two companies soon began collaborating. The basic idea was to be able to rapidly build, archive, and re-use workflows, which would bring efficiency and better control over the processes Celera scientists used. So it has turned out.
David Ross, Celera’s director of computational biology, says, “The standardization and uniformity led to remarkable improvements in how we as an organization dealt with the data.” He cites one workflow developed for expression analysis. “We sat down for a couple of afternoons and hammered out a fairly complex workflow and visual [report] that would have taken at least a couple of weeks for a developer to do. It’s also reduced the amount of time that the patent group needed to address the question of how a particular analysis was done. They don’t need to ask ‘Was this done? How was that done?’ It was done in a standardized way.”
Sninsky estimates efficiency has jumped perhaps five-fold, adding “over the last six months we’ve started to demonstrate to other parts of the company - whether that’s development or the clinical reference laboratory - the kind of productivity improvements that come with using these workflows. Although we served as the entrée for the Celera organization to the InforSense tools, my expectation is they are going to be embraced by a larger number of people in other groups.”
It is interesting to note that despite their power and roughly a decade to mature, informatics workflow automation tools haven’t been widely embraced yet. InforSense and SciTegic (Pipeline Pilot) were both founded in 1999 to bring the technology to life science research. Teranode was formed in 2002 with a similar strategy. But scaling up business has proven difficult. In 2004, Accelrys purchased SciTegic, and IDBS has just acquired InforSense.
No doubt there are many reasons. The platforms, though powerful, could be tricky to use. The fast pace of change in experimental and analysis technology has sometimes made companies reluctant to invest in workflow automation tools, thinking there will always be too much manual work required. The ability to easily integrate a sufficient diversity of third party analytical tools is also important. Even conveying clearly what the platforms do can be challenging. (Both InforSense and Accelrys/SciTegic have increasingly positioned themselves as business intelligence/analytics platforms suitable for many industries.)
It is perhaps useful to offer a description of an informatics workflow. “To me,” says Crosby, “the [elements of a] workflow are you grab data, most of the times from a database, which is both internal data and public data that we’ve put in, we manipulate that data [and] by that I mean pivot it or transform it or whatever we need to do to get it into the form we need to submit it for an analysis procedures or maybe a set of different analyses procedures that may be parallel or serial, then they are displayed and they can be manipulated even in display or displays are static. That pretty much encapsulates everything that we do.”
“[The InforSense platform] allows us to put a number of different analytical engines in the middle of that very easily. It allows us to add additional visuals downstream, so it’s an organic system. Things can be done in SAS or R or other things like Matlab and we can easily grab those new analytical procedures and try them ourselves. We’re particularly interested in the new semantic languages and databases and I think that’s a future area.”
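For readers who think in code, the pattern described above (fetch data, reshape it, pass it to interchangeable analysis steps, then display the results) can be sketched in a few lines of plain Python. This is only an illustration of the general idea and not InforSense's API: every function name, the record layout, and the sample values below are hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Callable, Dict, List, Tuple

# Hypothetical long-format records: (sample_id, marker, value).
Record = Tuple[str, str, float]
Table = Dict[str, Dict[str, float]]      # sample_id -> {marker: value}
Analysis = Callable[[Table], Dict[str, float]]


def fetch_records() -> List[Record]:
    """Stand-in for a database query; the values are made up."""
    return [
        ("s1", "geneA", 2.0), ("s1", "geneB", 5.0),
        ("s2", "geneA", 4.0), ("s2", "geneB", 7.0),
    ]


def pivot(records: List[Record]) -> Table:
    """Reshape long-format rows into one row per sample."""
    table: Table = defaultdict(dict)
    for sample, marker, value in records:
        table[sample][marker] = value
    return dict(table)


def markers_in(table: Table) -> List[str]:
    """All marker names that appear in any sample, sorted."""
    return sorted({m for row in table.values() for m in row})


def mean_per_marker(table: Table) -> Dict[str, float]:
    """One interchangeable analysis step: mean value per marker."""
    return {m: mean(row[m] for row in table.values() if m in row)
            for m in markers_in(table)}


def max_per_marker(table: Table) -> Dict[str, float]:
    """A second analysis step: maximum value per marker."""
    return {m: max(row[m] for row in table.values() if m in row)
            for m in markers_in(table)}


def run_workflow(source: Callable[[], List[Record]],
                 analyses: Dict[str, Analysis]) -> Dict[str, Dict[str, float]]:
    """Fetch, reshape, then run each registered analysis on the same table."""
    table = pivot(source())
    return {name: analysis(table) for name, analysis in analyses.items()}


def render(results: Dict[str, Dict[str, float]]) -> str:
    """Static text display: one tab-separated line per analysis and marker."""
    return "\n".join(
        f"{name}\t{marker}\t{value:.2f}"
        for name, values in results.items()
        for marker, value in sorted(values.items())
    )
```

Calling `run_workflow(fetch_records, {"mean": mean_per_marker, "max": max_per_marker})` and passing the result to `render` produces a small text report. Adding another analysis means registering one more function in the dictionary, which is the kind of easy substitution the quote describes; a real platform would add scheduling, provenance tracking, and calls out to external engines such as SAS or R.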
InforSense CSO Sheldon agrees, “In an area like genetics where it’s evolving at such a pace, there really isn’t a set of five standard workflows that do genome wide association. You need a very open platform. You need the ability to rapidly integrate a whole variety of different algorithms from different data sources. Workflow has been the mechanism by which you [Celera] could rapidly prototype different approaches to analyze the data.”
Working together early has benefited both companies. InforSense was still developing what would become its Translational Research Solution (TRS), aimed at biomarker discovery activity. Celera was able to influence its direction, for example suggesting early on inclusion of a SAS node to meet Celera’s needs.
Sheldon says, “It’s worth saying that at the start of our relationship we were working on a genetics module for the platform and clearly a lot of the input that John and David have given us over the years has really helped kind of fine tune GenSense so we’re able to cope with the data types that you see in genetic analysis and we have in the system analytical methods which are appropriate for genome wide associations.”
The TRS includes a variety of modules for various omic analyses plus ClinicalSense “for carrying out cohort identification and patient stratification, which is typically one of the first steps that you carry out in a translational research study to identify biomarkers,” says Sheldon. VisualSense is the module for report generation, although Spotfire is also supported.
In addition to Celera, “we worked a lot with large medical institutes like the Mayo Clinic and Dana Farber and learned a tremendous amount about translational research over the last four or five years and we’ve encapsulated into the product,” says Sheldon. The fact that there is a community of InforSense users willing to share best practices is another plus, says Sninsky.
The surge in GWA studies and translational medicine approaches may boost demand for informatics workflow tools, and Sheldon says he is seeing evidence of this now. InforSense is working closely with two major pharma companies, with several others at earlier stages. Time will tell how the molecular diagnostics field evolves. Celera has at least one test approved (KIF6) and is busy looking for more.
--
This article first appeared in Bio-IT World’s Predictive Biomedicine newsletter.