NCI Launches GDC

June 15, 2016

New platform makes genomic data publicly available in support of the “moonshot” to cure cancer

By Paul Nicolaus

June 15, 2016 | The push to cure cancer has, to paraphrase Neil Armstrong, taken one giant leap forward with the launch of the National Cancer Institute’s (NCI) Genomic Data Commons (GDC), which was officially unveiled June 6 at the American Society of Clinical Oncology (ASCO) Annual Meeting.

Part of the National Cancer Moonshot Initiative, the data sharing platform will support precision medicine efforts by enabling the access, standardization, analysis, and submission of cancer genomic data.

“I think it’s one of the few systems in the world, if not the only one, that is open for broad data sharing in cancer genomics,” said Louis Staudt, M.D., Ph.D., director of the NCI Center for Cancer Genomics. “We feel that this is necessary to make progress on many fronts in cancer, but particularly in precision medicine approaches to cancer.”

Raw Data Sets GDC Apart

Cancer is a genetically heterogeneous disease, and to fully understand how patients are responding to drugs, researchers must look at more than a few hundred subjects with a particular type of tumor. The Genomic Data Commons was designed to help researchers share as much data as possible.

“One unique aspect [of the GDC] is that we made the decision to start from the raw sequencing data off the sequencers so that we could reanalyze it using common software pipelines so that all the data in the GDC would be comparable,” said Staudt. “The other major difference, I would say, is that we are truly an open system in the sense that any qualified researcher who has applied for access to data and has a reasonable research request will get access to these data.”

The GDC is being managed by the University of Chicago Center for Data Intensive Science along with the Ontario Institute for Cancer Research, all under contract with Leidos Biomedical Research. As a starting point, the GDC is using a set of cases from large-scale NCI projects such as The Cancer Genome Atlas (TCGA) and Therapeutically Applicable Research to Generate Effective Treatments (TARGET), which make up some of the most comprehensive data sets in existence.

Big Picture of Big Data

While there’s no way of knowing exactly what the rate of accrual will be moving forward, there are a number of ways in which the data pool will continue to grow. Cancer genomics is a hot topic right now, which means papers are being published regularly, and the journals require that the underlying data be shared.

“We think that the GDC would be an appropriate place and hopefully people would find it a good place to share those data as required for publication,” Staudt said, “so we think we’re going to get a number of new cases that way.”

There’s also been plenty of interest from patient-focused organizations, which tend to have the ability to gather the patients, data, and samples and even pay for the sequencing. What they do not tend to have is the bioinformatics support to make that data readily available to researchers. The new platform could be a mutually beneficial outlet: these organizations share their data sets while benefiting from the infrastructure of the GDC.

Initially, the main audience is expected to be cancer researchers, who will have the ability to delve into the data they’re most interested in. As the platform moves toward a knowledge base of cancer, the next constituencies would likely be oncologists faced with next-generation sequencing data or regulatory agencies trying to evaluate genomic tests for markers in cancer.

“Ultimately the final constituency when we amass all this knowledge would be the patient together with his or her physician,” Staudt said. Users will be able to see whether a particular variant occurred in other patients with the same type of cancer and to determine how patients with a particular genetic abnormality responded to certain therapies. “We’re a ways from that,” he added, “but that is the big picture.”

Forward Thrust

One of the biggest remaining impediments to success, according to Staudt, is that genomic sequencing data generated in the course of clinical care is difficult, if not impossible, to share in the way the GDC shares because it is protected by various privacy regulations.

A relatively straightforward fix for that, at least for research institutions, is to ask if patients would be willing to share the data from their tumor biopsies. “If there is consent to do so, then by definition those privacy restrictions fall away,” Staudt said, “and we’re able to use the data.”

Challenges aside, there’s plenty to be encouraged about in terms of the flight path of this particular Cancer Moonshot ship. “I think the most exciting thing about the GDC is to get together enough cases of cancer that we can now discern these subtypes of cancer and define a new molecular diagnosis of cancer that has clinical utility,” Staudt said. “That’s where we’re going.”

Paul Nicolaus is a Wisconsin-based freelance writer.