New Informatics Tools Will Speed Personalized Therapies


By Catherine Varmazis

Feb. 11, 2008 | In a major collaboration, researchers from the Cancer Institute of New Jersey (CINJ), several U.S. universities, and IBM are creating grid-enabled informatics tools that perform high-throughput analysis of tissue microarrays to dramatically improve the accuracy and speed of cancer diagnoses.

"Years ago, most patients would go through the same treatment: chemotherapy, for example," says David Foran, director of the Center for Biomedical Imaging & Informatics at CINJ and lead investigator for the project. "If that didn't work they'd move on to drug 1, and if that didn't work they'd try drug 2, and so on. Now we can bypass all these trials and go directly to the therapy that is most appropriate based on [a patient's] expression signature."

The project, which received a $2.5 million grant from the National Institutes of Health (NIH) last October, makes use of tissue microarrays, pattern recognition algorithms, and grid-based supercomputing.

Foran says each tiny tissue plug on a microarray contains different types of tissue. His lab developed software that can distinguish between these heterogeneous bits of tissue and detect the presence of a specific cancer biomarker. "If it's present we have computer vision techniques and software that we've developed which will tell us if it's located in a specific tissue or in a certain sub-cellular compartment, like the nucleus or cytoplasm. All of these things have bearing on the clinical outcome of the specific patient we're looking at."

To conduct the proof-of-concept required to get funding for the project, CINJ researchers took a set of "retrospective studies" of over 100,000 patient tissues for which the diagnoses were already known, and analyzed them using their specialized software. Programmers from IBM grid-enabled the software and ran the analysis over the World Community Grid (WCG), a virtual supercomputer established by IBM to conduct this kind of work. Computation of this magnitude would have taken a single desktop computer 2,900 years to complete, but it took the WCG less than six months, says Robin Willner, IBM's vice president of global community initiatives.
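To put those two figures side by side, the short calculation below works out the minimum speedup they imply. It uses only the numbers quoted above; the function name is illustrative, and treating "less than six months" as an upper bound of 0.5 years is an assumption made for the arithmetic.

```python
def minimum_speedup(single_machine_years, grid_years_upper_bound):
    """Return the smallest speedup consistent with the quoted figures.

    Because the grid run took *less than* the upper bound, the true
    speedup is at least this ratio.
    """
    return single_machine_years / grid_years_upper_bound


# Figures quoted in the article.
desktop_years = 2900   # estimated time on a single desktop computer
grid_years = 0.5       # assumption: "less than six months" read as at most 0.5 years

speedup = minimum_speedup(desktop_years, grid_years)
print(f"Implied speedup: at least {speedup:,.0f}x")
```

In other words, the grid did the work of at least several thousand desktop machines running continuously, although the real number of participating computers is not stated here.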

When the analysis was complete, "we were able to compare the signatures we had generated and that we hoped would correlate with different stages and types of disease," says Foran. "We compared them with the patient outcomes and profiles in terms of diagnosis and histologic types and found there was a very strong correlation."

Foran now plans to expand the number of disorders being investigated, grow the reference library of expression patterns, and build a clinical decision support system so oncologists at cancer centers around the world can download the CINJ client and analyze their own tissue specimens. Going forward, the computation will be done on caGrid, an open-source software infrastructure that has been developed as the main grid architecture of the NCI-sponsored cancer Biomedical Informatics Grid (caBIG) program. In addition, IBM is donating a high-performance supercomputer to the CINJ's new Center for High-Throughput Data Analysis for use in examining the digitally archived cancer specimens and genomic data.

Joel Saltz, professor and chair of the Department of Biomedical Informatics, and professor in the Department of Computer Science and Engineering at Ohio State University (OSU), where most of the caGrid software has been developed, says, "One of our roles in this project is to develop a caGrid-compliant infrastructure that supports the data and algorithms [that Foran's group developed] so the tissue microarray and virtual slide data can be integrated with other kinds of experiments and translational research data types."

For data from different data sets to be compatible, there has to be a mechanism for standardizing the naming of biological terms and another for standardizing how complex data structures from different types of experiments are represented in XML schema. To this end, Saltz's group is developing standard data models and well-defined biomedical ontologies that will be harmonized with the caBIG processes, to avoid isolated "information islands."
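As a rough illustration of why standardized naming matters, the sketch below maps the different labels two sites might use for the same biomarker onto one canonical term before pooling results. It is a toy example, not the caBIG vocabulary services or any caGrid API: the lookup table, site names, and scores are all hypothetical, and only the gene-symbol synonyms (HER2/neu for ERBB2, ER for ESR1) reflect real usage.

```python
from collections import defaultdict

# Hypothetical controlled vocabulary: lower-cased local label -> canonical term.
CANONICAL_TERMS = {
    "her2": "ERBB2",
    "her2/neu": "ERBB2",
    "erbb2": "ERBB2",
    "er": "ESR1",
    "estrogen receptor": "ESR1",
}


def normalize_term(term):
    """Return the canonical term for a local label, or None if unmapped."""
    return CANONICAL_TERMS.get(term.strip().lower())


def group_by_canonical_term(records):
    """Pool (site, label, score) records under their canonical term."""
    grouped = defaultdict(list)
    for site, label, score in records:
        canonical = normalize_term(label) or "UNMAPPED"
        grouped[canonical].append((site, score))
    return dict(grouped)


# Made-up records from two sites that name the same marker differently.
records = [
    ("site_a", "HER2/neu", 0.82),
    ("site_b", "ErbB2", 0.79),
    ("site_b", "ER", 0.40),
    ("site_a", "mystery marker", 0.10),
]

grouped = group_by_canonical_term(records)
for term, values in grouped.items():
    print(term, values)
```

Without the shared mapping, the first two records would look like results for two different markers, which is the kind of "information island" the standardization effort is meant to prevent.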

"The caGrid infrastructure is designed to connect databases as well as computational procedures, so it's like having a worldwide programming environment of databases and procedures," explains Saltz. "But for this environment to work, you need to know how to call your procedures, and what the query language is, and that's where all this language and ontology stuff is important, because otherwise if I tell you, 'We've got this wonderful tissue microarray environment, feel free to use it,' you'd say, 'Well, thanks, but how am I going to find out how to? And what do you have in there?'"

The complexity and scope of this work made multidisciplinary collaboration involving many organizations essential. "A lot of big science today requires many different levels of expertise," says Foran. "In fact, when we received our critiques from the NIH, they stated explicitly that this group of individuals [involved in the project] is unique in what they bring to the table."

Although still in the early stages, the tools are already being used by oncologists at CINJ. The plan for the coming year is to have a prototype system up and running that will be deployed at Arizona State University, Rutgers University, the University of Pennsylvania School of Medicine, Ohio State University, and CINJ. "That will serve as our test bed for iterative prototyping, and then within the next three years, we'd be constantly updating the software as it becomes refined and optimized, and we're hoping we'll have a product to put out to the research and clinical communities by year 4," says Foran.

