NCI’s five-year effort to link cancer centers is growing up.
By Wendy Wolfson
January 20, 2010 | When caBIG was named the Bio•IT World Best Practices Awards Editor’s Choice winner for 2008, Kenneth Buetow, associate director for bioinformatics and information technology at the National Cancer Institute, said that he hoped the award would draw more attention to caBIG (see “Connecting the Cancer Community caBIG Time,” Bio•IT World, July 2008). Bio•IT World’s Best Practices program probably isn’t fully responsible, but since the win, the National Cancer Institute’s cancer Biomedical Informatics Grid (caBIG) has certainly made good on the judges’ predictions.
For the past five years, caBIG has been connecting cancer centers in a collaborative network to exchange data on patient outcomes, discovery research, genetics, tissue and imaging, and clinical trials. caBIG’s premise and promise is that sharing information can make basic research more efficient, speed translation of research into therapies, and improve patient outcomes.
The caBIG network includes caGRID, the standardized grid for information-sharing run out of Ohio State University; over 40 software tools; interoperable software enabling systems to communicate; as well as standardized vocabularies. CaBIG provides a framework for projects including Clinical Trial Management Systems, the Cancer Genome Atlas and the Cardiovascular Research Grid. Tens of millions of dollars have been invested by NCI and partnering institutions in the multi-year effort.
So far, caBIG has linked 63 NCI cancer centers, 16 community hospitals, and several NIH institutes. CaBIG is also initiating international collaborations with the UK’s National Cancer Research Initiative, the Beijing Cancer Hospital and Shanghai Center for BioInformatics Technology, and Jordan’s King Hussein Institute for Biotechnology and Cancer.
Concrete results in terms of novel cancer biomarkers and therapies are still a long way off, but at this point cancer researchers are starting to realize the benefit of asking basic research questions in new ways.
Breaking Down Data Silos at UAB
The University of Alabama at Birmingham Comprehensive Cancer Center (UAB) is using caBIG to commingle clinical data sequestered in its clinical systems with research data so scientists can access a wealth of outcomes data. According to John Sandefur, information systems manager at UAB, this means dealing with an enormous amount of data that is difficult to query because it was not generated for research purposes. “Hospital data is patient-centric but not population-centric,” says Sandefur. To make the implementation work, it is essential to get support from all parties involved, as well as funding. Moreover, it is crucial to have a champion for the project who is a researcher, and who can define specific requirements and help advocate for the project.
UAB is using caBIG’s caTissue Suite biospecimen repository application for brain cancer specimens. UAB was previously tracking specimens in separate databases, but researchers couldn’t query the patient cancer databases to get information on specific cancers, or properly access patient clinical data to look at population patterns.
Sandefur’s group is currently tracking and expanding the inventory of brain cancer specimens and linking the biospecimen data to clinical outcomes data that resides in the production systems of the academic health care system. Later they will start putting imaging data on the grid. Establishing a uniform descriptive vocabulary is a challenge, as tissue banking terms vary between technicians and pathologists. Sandefur’s group is managing this locally by using a drop-down menu and establishing common terminology and data elements. Future projects include the implementation of the caTissue project, building a data warehouse, defining data architecture, putting data into use, and deploying a clinical trials management system that is caBIG compatible. “Most universities are sitting on a gold mine of data and what caBIG is about is how to mine that data,” says Sandefur. “If we can share that data then a lot of new and intelligent eyes can look at that data in a new way.”
Sharing Imaging Data
Eliot Siegel, radiologist at the University of Maryland School of Medicine, is getting the caBIG In Vivo Imaging Workspace up and running as part of The Cancer Genome Atlas (TCGA), a collaborative effort between the NCI and the National Human Genome Research Institute to identify the molecular basis of cancer. NCI is coordinating the creation of an integrated database of clinical, genomic, and proteomic data for users around the world. Siegel’s team is creating a template for researchers and radiologists to fill with phenotypic data associated with brain tumor images. The radiologists can annotate and analyze images. They can extract visual data and match it to anonymized clinical data.
“It is reinventing the way we do diagnostics,” says Siegel. Radiologists can use caBIG to access the cancer genome database, REMBRANDT (REpository of Molecular BRAin Neoplasia DaTa)—another database of adult and pediatric primary brain tumors—and the national biomedical archive, and display data across three different workstations in common use.
“We can look to see who lives longer and the types of tumors,” says Siegel. Doctors can also look for patterns in genomic data, correlating it with clinical, surgical, and oncology records to predict an evidence-based course of treatment. By logging who responds to particular therapies, it is possible to create a decision-support tool based on real statistical information. The database can also be used for discovery. Given the tumor DNA and radiology findings, it may be possible to predict whether a biopsy is indicated. It could also be useful as a treatment database for rare tumors. The eventual goal is that every person who develops brain cancer gets in the database.
The team intends to extend this platform to breast cancer. “We are hoping as time goes on to demonstrate this idea of personalized medicine and how the work that we are doing can reinvent the electronic medical record, so instead of it being qualitative, the description is quantitative,” says Siegel.
Ohio State University (OSU) is developing an application to collect longitudinal data for treatment programs for adults with leukemia and osteoarthritis. OSU is in the process of putting a database of clinical data to characterize tissues on caGRID. caGRID is also used by the OSU Center for Translational Science (CTSA) for a shared clinical trials management system. Normally data would be in different repositories. Now clinical trial investigators can query different clinical trials. “What would have taken weeks now takes a few mouse clicks,” says Philip Payne, assistant professor of bioinformatics.
According to Payne, caBIG was slow going at first, and the early stage of establishing models was time-consuming. Now people are starting to talk about novel science and tools being delivered to the point of care. But while caBIG is providing the tools and the CTSA is starting to extend them, growth is an organic process. Having researchers themselves champion applying the caBIG tools in the clinic is crucial.
Waging Another Kind of War
Columbus, Ohio, like other cities, has a higher rate of premature births in certain ethnic groups, in part due to socioeconomic status. For example, African American women have a 2.5 times higher rate of prematurity than white women. “We were astounded,” said Kelly Kelleher, professor of pediatrics at Ohio Nationwide Children’s Hospital.
Nationwide Children’s and Ohio State University Hospitals are now “waging a war on prematurity,” said Kelleher. The hospitals’ outreach programs have so far reduced the prematurity rate for high-risk groups, by as much as 12.5 to 20 percent for African American women. “Part of it is understanding the etiology of prematurity,” said Kelleher. “We can do these prevention programs but don’t understand how they work.” In conjunction with Payne’s team, Kelleher and colleagues initiated a project in November using a caGRID application to conduct longitudinal studies of maternal-fetal outcomes, taking into account environmental, clinical, and epigenetic factors.
The study is measuring access to care, taking tissue samples from mothers and children, and looking for biomarkers that can predict premature birth. Currently the samples are processed by hand, but this work is expected to move onto the grid. This type of study was previously not possible because the adult and pediatric hospital information infrastructures were not linked.
“I think the maturity of the caBIG and caGRID go beyond cancer,” Payne said. “It is allowing us to query long-term, longitudinal data sets not possible before. It allows us to take a look at new cohorts of patients.” Payne now sees the first green shoots of applications enabling new kinds of scientific questions to be asked. “I think the barriers to entry are lower now,” Payne said. “The most successful programs are the ones that solve the most basic questions.”
This article also appeared in the January-February 2010 issue of Bio-IT World Magazine.