Annai Hosts International Cancer Genome Consortium Data

December 11, 2014

By Bio-IT World Staff

Data generated by the International Cancer Genome Consortium (ICGC), including whole genomes, exomes, and RNA-seq, will now be made available to researchers through Annai Systems' ShareSeq platform. The agreement was reached with the Ontario Institute for Cancer Research, whose President and Scientific Director, Thomas Hudson, is a co-founder and member of the executive committee of the ICGC. By hosting data on more than 10,000 human tumors in the ShareSeq cloud, Annai will provide a venue where ICGC collaborators, and independent groups working with ICGC data, can quickly access and retrieve information.

The ShareSeq facility is housed at the San Diego Supercomputer Center, which also hosts the Cancer Genomics Hub, through which scientists can access the best-known data repository for tumor genomes, The Cancer Genome Atlas (TCGA). With the co-location of ICGC data at the center, "researchers will now be able to bring compute to the data in an environment where they can aggregate samples from both projects, thus having convenient and cost-effective access to the largest collection of reference cancer genome data," says Francisco De La Vega, CSO of Annai.

Annai launched ShareSeq this April at the Bio-IT World Conference & Expo in Boston. The resource includes the Annai-GNOS platform for managing genomic data and metadata, as well as a suite of open-source informatics tools for analyzing data. While Annai-GNOS has already been deployed to manage ICGC data at six institutions for specific projects, today's agreement marks the first time these resources will be available to the wider community of ICGC research partners. Scientists interested in working with ICGC data can apply to the consortium's international data access committee and, if approved, will be directed to the repository of data in ShareSeq. Users can then download ICGC data directly for free, subscribe to ShareSeq for continuing access to data hosted in Annai's cloud, or run individual computations on a pay-for-use basis.

"All NIH-funded data managed by dbGaP [such as TCGA data], like the ICGC data, is deemed protected patient identifying data and is not allowed in public cloud services like Amazon and Google," says De La Vega. "Thus, Annai is the first commercial entity that has been deemed secure to serve this data to others and offer services on top of it for researchers."