Oct 17, 2005 | The National Cancer Institute (NCI) has selected workflow-based integrative analytics technology from InforSense to speed development and deployment of high-throughput genetic data analysis applications.
Specifically, researchers at the NCI’s Core Genotyping Facility (CGF) will use InforSense KDE, a workflow-based grid-computing platform for integrating data sources, analytic methods, and computational services, for rapid application development. The group also plans to use the technology to make its analysis methods available to other researchers by publishing them via a Web portal. This will help accelerate cancer research, according to the NCI.
“Investigators require access to the latest analysis techniques to advance their research,” says Meredith Yeager, scientific director of the NCI’s CGF. “Therefore, creating the informatics infrastructure to compose and deliver these methods is a high priority at the CGF.”
In addition to supporting the latest analysis techniques, the CGF needs to deal with a growing amount of data from new lab technologies. “We are going to start generating more and more data — that will need to be analyzed,” says Yeager. “We need these analyses to be automated so that we can scale up to include massive amounts of data.”
Accomplishing all of this required attention to more than the workflow itself. “The workflow is critical, but other factors come into play, too,” says Yike Guo, CEO and founder of InforSense. He notes that most research environments today must also account for the life cycle of their analysis methods. “The analytical process is always evolving, so you must accommodate change and you must be auditable,” says Guo.
This point is echoed by Yeager. She notes that the facility’s work is always changing; for example, the research incorporates new high-throughput genotyping systems and techniques as they come along.
Essentially, the CGF needed a platform for rapidly developing analytic applications. The platform would also need to schedule jobs in a way that accounts for the computationally intensive nature of the analysis routines, and it would need to scale.
Many life science organizations find themselves in a similar situation today. The choices are to build a workflow platform from scratch, use open-source software and modify it to meet the particular needs of the organization, or buy commercial software.
The impetus behind the move to the InforSense platform was speed. “Time was critical,” says Yeager. “We could move more quickly [with the commercial software] than the other approaches.”
Another factor was that whichever platform was adopted had to be compatible with the NCI’s caBIG (Cancer Biomedical Informatics Grid) initiative. Guo notes that InforSense KDE meets the compatibility requirements for integration, not only by handling the range of data types and associated metadata involved but also by providing a way to rapidly plug in Web services and even ontologies.