Coriell Life Sciences Prepares for the Whole-Genome Health Environment
By Aaron Krol
February 24, 2014 | When the Coriell Institute launched its Coriell Personalized Medicine Collaborative (CPMC) research project in 2007, its mission was to understand how patients and their physicians react to having genetic information when planning out their health care strategies. The CPMC has now interviewed thousands of participants about their genetic disease risks, and has tended toward the center in debates about how well this information serves the patients who see it. Representatives from Coriell have sounded a more stringent tone than many direct-to-consumer testing companies about how risk information should be validated and presented, but remain confident that patients can understand their own genetic data if given the right tools. (See “Our Genomes Are Talking to Us – Are We Good Listeners?”)
As the CPMC looked more and more into the landscape of genetic testing, however, one surprising realization stood out: even while the cost of genetic testing continued to plummet year after year, no one in the personal sequencing market seemed to be laying the groundwork for dealing with whole genomes.
In the CPMC’s own genetic tests, the organization uses a pair of SNP chips that together capture around one million potential SNPs (single-nucleotide polymorphisms) from across the genome – comparable to the amount and type of variation characterized by 23andMe, the market leader in direct-to-consumer testing. But that’s a tiny fraction of the millions of sites in the genome where SNPs are known to occur, even leaving aside other sources of variation like insertions, deletions, and copy-number variations. With the price difference between one of these SNP chip tests and whole-genome sequencing eroding, it will soon make more economic sense to offer people their entire genomes than a limited panel.
Yet, according to Michael Christman, President and CEO of the Coriell Institute, “the whole infrastructure to manage what are massive amounts of data, in the era of whole genome sequencing, really isn’t there.” The ability to store many individuals’ entire genomes over long periods of time – and just as importantly, to keep dipping into those sequences for new health information as the state of genetic knowledge progresses – requires fundamentally different architecture than storing a limited number of SNPs with well-defined health associations.
While the CPMC tried to define best practices in the reporting of genetic information, its members found a gap in the market for a company that was willing to embrace the complexity of whole genomes – and the responsibility of validating new knowledge so it can be dependably put to use in health care. “We realized fairly recently that there was a commercial activity here,” Christman told Bio-IT World. “The gulf that exists between the expert genomic scientists, on the one hand, and physicians in a community practice on the other hand, is huge. And there’s not adequate infrastructure to get those two groups to talk to each other. There’s a big need for someone to be an expert custodian of genome information.”
From this need was born Coriell Life Sciences, a for-profit spinoff of the CPMC that aims to translate that project’s findings into a service that can scale up genetic health reporting to the level of whole genomes.
The company’s vision is of a genetic storage-and-interpretation environment split into three pieces, called GeneVault, GeneDose, and GeneExchange. GeneVault holds the genetic data; GeneDose is a Coriell-built tool for predicting individuals’ drug responses based on their genetic profiles; and GeneExchange is a sort of app store for analysis tools, where partner companies can offer their own glimpses into the medical implications of the genome.
A Safe Space for Genomes
GeneVault is the foundational level of Coriell’s service, which as a whole is called the Genomic Data Ecosystem. Permanent storage of any significant number of users’ genomes – especially if that storage has to be both secure and accessible to repeated data mining – demands a huge amount of hardware, which is one of the major barriers to moving whole genomes out of the exclusive domain of research and into the commercial space. To meet this challenge, Coriell Life Sciences teamed up with an IT giant that has been leaning more and more into the health care space in recent years.
“We began conversations with IBM about some of the big data storage problems that would really become particularly important once we moved to an era of whole-genome sequencing, rather than just genotyping, being the standard of care,” says Christman. IBM, which had already partnered with the Coriell Institute to help manage and track biological samples, agreed to host GeneVault in the IBM cloud. Storage of a genome in GeneVault is free to the patient, and offers a constant point of access for new analysis even if the patient changes health providers, insurers, or employers.
Security is a major concern for data in GeneVault. Genetic information has always been hard to conceal: research projects usually anonymize sequences, stripping them of obvious identifiers like a subject’s name or location, but the genome is, of course, a form of identifying information itself. Scott Megill, who is now President and CEO of Coriell Life Sciences after serving as the Chief Information Officer of the Coriell Institute, explained to Bio-IT World that his new company is taking elaborate security measures with its users’ data.
“It’s got multiple layers of encryption, and we’re also doing a salting of the data,” says Megill – referring to a method of protection by which random strings of data are combined with the meaningful data before it is encrypted, making it much more difficult to check guessed inputs against the encrypted outputs. “And we also store all the personally identifying information in a completely separate database, so they only come together at the time that the diagnostic is produced.”
“We’ve been really taking a very hard line on data in transit as well, as far as how we actually transfer information to third parties,” he adds – a crucial consideration in an environment like GeneExchange, where data is meant to travel through many analytical tools developed by multiple partners. “For one, we’re generally only providing information that’s necessary for the given interpretation that the third party is doing, so we’re not just providing wholesale access to the full sequence data… We’re also only operating across secured VPN or other encrypted channels, and always where the channels both at rest and in transit are HIPAA-compliant.”
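The protections Megill describes can be illustrated with a minimal Python sketch. It is not Coriell's implementation: the article does not say which algorithms or databases the company uses, so the sketch substitutes a salted PBKDF2 hash from the Python standard library for the unspecified encryption scheme. Every name in it (salted_digest, identity_db, genotype_db, enroll, subset_for_partner, build_report, and the placeholder patient and genotype values) is hypothetical.

```python
import hashlib
import secrets
from typing import Dict, Iterable, Optional, Tuple


def salted_digest(value: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is generated when none is given."""
    if salt is None:
        salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", value.encode("utf-8"), salt, 100_000)
    return salt, digest


# Assumption: two in-memory dicts stand in for two physically separate databases.
identity_db: Dict[str, Dict[str, str]] = {}
genotype_db: Dict[str, Dict[str, str]] = {}


def enroll(name: str, genotypes: Dict[str, str]) -> str:
    """Store identity and genotypes separately, linked only by a random pseudonym."""
    pseudonym = secrets.token_hex(16)
    identity_db[pseudonym] = {"name": name}
    genotype_db[pseudonym] = dict(genotypes)
    return pseudonym


def subset_for_partner(pseudonym: str, genes_needed: Iterable[str]) -> Dict[str, str]:
    """Return only the genes one interpretation needs, with no identity attached."""
    record = genotype_db[pseudonym]
    return {gene: record[gene] for gene in genes_needed if gene in record}


def build_report(pseudonym: str, genes_needed: Iterable[str]) -> Dict[str, object]:
    """Join identity and genotype data only at the moment a report is produced."""
    return {
        "patient": identity_db[pseudonym]["name"],
        "results": subset_for_partner(pseudonym, genes_needed),
    }


# Placeholder data; gene and variant names are invented for illustration.
patient_id = enroll("Jane Doe", {"GENE_A": "variant_1", "GENE_B": "variant_2"})
print(subset_for_partner(patient_id, ["GENE_A"]))
```

Because each call to salted_digest draws a new random salt, the same input produces a different digest every time, which is what makes it hard to test guessed inputs against stored outputs. The partner-facing function never touches identity_db, mirroring the idea of sharing only what a given interpretation requires.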
Scott Megill (left) and Michael Christman (right) speaking in the Coriell Data Center. Image credit: Coriell Life Sciences
Addressing the security and storage demands of working with whole genomes creates a resource that can travel with Coriell’s users through life. So far, though, Coriell Life Sciences isn’t taking full advantage of GeneVault’s depth. The company is launching with just the GeneDose tool live, so the first customers will only be sequenced for genes with known relevance to drug reactions, a much faster and less expensive proposition than whole-genome sequencing with modern instruments. But as GeneExchange gets up and running with new applications, the infrastructure will already be in place to add larger amounts of genetic information to the cloud.
The intention, eventually, is that new customers will want to place their whole genomes in GeneVault from the outset. “Ultimately,” says Megill, “the infrastructure that we’ve built is in support of whole-genome sequencing.”
Consumer Product, or Medical Tool?
The Genomic Data Ecosystem is not a direct-to-consumer service. Instead of allowing users to browse through their genomes and run analytics at will, Coriell operates through physicians, who order interpretations of their patients’ genomes like any other medical tests.
That’s a stark contrast with the services that might be considered Coriell Life Sciences’ nearest equivalents – companies like 23andMe that also try to take a comprehensive look at users’ genetics, but cater more to general curiosity and the entertainment value of skimming through your genome. The Genomic Data Ecosystem looks more like the established medical environment for doctors who want to run genetic tests today, where patient samples are shipped out to certified labs to check a specific piece of genetic information. The difference, of course, is that the sample only has to be collected once, after which new tests can be run as quickly as they’re added to GeneExchange.
This model spares Coriell some of the regulatory concerns that threaten the direct-to-consumer testing companies: the FDA has recently made clear that it will treat any health information those companies provide to their customers as a “diagnostic test” requiring approval for sale. However, the decision also places Coriell in untested waters for a for-profit company. Since the patients aren’t themselves being charged for the storage and interpretation of their data, Coriell has to convince insurers or health care systems that genome-wide analysis is worth paying for – even for patients with no specific indication for testing.
“Right now we are focused on institutional contracts,” says Megill. Coriell Life Sciences’ first customer, PACE, is one such institution that sees an immediate value in the genome. PACE, or the Program for All-Inclusive Care for the Elderly, signed up for Coriell’s service in November 2013. The organization serves around 50,000 frail elderly patients nationwide, and sees GeneDose as an effective way to tailor drug regimens to its members’ unique needs.
“For PACE, it’s really an economic play,” says Megill. A typical member of PACE may be taking five to ten different medications, not all of which are necessarily best-suited to that person’s genetic profile. “For them, the cost of readmittance to the hospital, the cost of adverse drug reaction care, is extremely high, and so because they’re a capitated system with pooled funds across all of their members, it makes an awful lot of economic sense for them to proactively test individuals to make sure they’re being put on the right medications.”
Of course, to provide true medical value to this population, GeneDose needs to be both broad in scope and extremely well-validated in each call it makes. A large number of drugs have been shown to have different efficacies, or different risks of adverse reactions, in patients with certain genetic variants, but often these relationships have only been shown weakly, or in one or two isolated studies. The field of pharmacogenomics is young, and as with all genetic health disciplines, prone to uncertainty.
At present, 26 different drugs are represented in GeneDose, with 53 more under close consideration. But deciding when a drug-gene interaction is certain enough to be used in real medical practice is painstaking work – especially for a company like Coriell, which has staked its reputation on adhering to the best practices in risk reporting developed under the CPMC.
“We have a team that we call the Risk Reporting Team, that essentially scours the world’s scientific and medical literature, identifies things that are credible scientifically, [and] factors in things like whether the associations have been replicated,” says Christman. The Risk Reporting Team comes up with a quantitative score for every drug-gene interaction they look at. “The top end of the spectrum would be that there’s been a randomized clinical trial, showing a clinical outcome difference if genetics is used with a given drug, versus not using it,” says Christman. “And then the low end of the evidence might be that a molecule is known to bind to or have some association with a P450 drug-metabolizing enzyme” – molecular evidence that has never been demonstrated to have an effect in vivo. These latter kinds of relationships would not make it into GeneDose.
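As a rough illustration of how an evidence spectrum like this could be turned into a number, here is a minimal Python sketch. The article does not disclose the Risk Reporting Team's actual scoring method, so everything here is an assumption: the three tiers (the middle one is invented), the weights, the replication bonus, the cutoff, and the placeholder drug and gene names.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Evidence(IntEnum):
    """Hypothetical evidence tiers, ordered from weakest to strongest."""
    MOLECULAR_ONLY = 1     # e.g. binding to a P450 enzyme, no effect shown in vivo
    OBSERVATIONAL = 2      # assumed middle tier; not described in the article
    RANDOMIZED_TRIAL = 3   # clinical outcome difference shown in a randomized trial


@dataclass(frozen=True)
class Interaction:
    drug: str
    gene: str
    best_evidence: Evidence
    replicated: bool


REPORTING_THRESHOLD = 5  # assumed cutoff, chosen only for illustration


def score(interaction: Interaction) -> int:
    """Weight the evidence tier, then add a point if the finding was replicated."""
    return 2 * int(interaction.best_evidence) + (1 if interaction.replicated else 0)


def reportable(interactions: List[Interaction]) -> List[Interaction]:
    """Keep only interactions whose score reaches the reporting threshold."""
    return [i for i in interactions if score(i) >= REPORTING_THRESHOLD]


# Placeholder candidates; none of these correspond to real drugs or genes.
candidates = [
    Interaction("drug_a", "GENE_A", Evidence.RANDOMIZED_TRIAL, replicated=True),
    Interaction("drug_b", "GENE_B", Evidence.MOLECULAR_ONLY, replicated=True),
    Interaction("drug_c", "GENE_C", Evidence.OBSERVATIONAL, replicated=False),
]
for interaction in reportable(candidates):
    print(interaction.drug, score(interaction))
```

With these assumed weights, only drug_a clears the cutoff. The molecular-only finding is excluded even though it was replicated, matching Christman's point that such relationships would not be reported.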
The analysis that goes into each patient’s GeneDose results is involved, but the resulting report is stripped down to the essentials. “We strongly feel that adoption is dependent on the information being relatively simple,” says Christman. This was a major lesson of the CPMC: that genetic risk reporting can be complex and well-informed, and at the same time easy to comprehend at a glance. The front page of a GeneDose report is simply a list of drugs, color-coded with different recommendations: green for drugs that should be safe and effective, red for those that are likely to fail or cause dangerous side effects, and blue if more subtle considerations need to be taken into account.
An excerpt from page 1 of a sample report from GeneDose, showing various drug calls. Image credit: Coriell Life Sciences
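The color-coded front page described above maps naturally onto a small data structure. The following Python sketch is only an illustration of that idea, not the GeneDose report format: the Call enum, the front_page function, and the placeholder drug names are all hypothetical, and only the three colors and their meanings come from the article.

```python
from enum import Enum
from typing import Dict, List


class Call(Enum):
    """The three colors described in the article, with their stated meanings."""
    GREEN = "should be safe and effective"
    RED = "likely to fail or cause dangerous side effects"
    BLUE = "more subtle considerations apply"


def front_page(calls: Dict[str, Call]) -> List[str]:
    """Return one summary line per drug, sorted by drug name."""
    ordered = sorted(calls.items(), key=lambda item: item[0])
    return [f"{drug}: {call.name} ({call.value})" for drug, call in ordered]


# Placeholder calls; these are not real drug recommendations.
sample = {"drug_b": Call.RED, "drug_a": Call.GREEN, "drug_c": Call.BLUE}
for line in front_page(sample):
    print(line)
```

Keeping the front page to one line per drug reflects the design goal Christman describes, with supporting literature and variant details left to deeper pages of the report.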
Of course, if a physician wants to drill into the medical literature to understand why a particular call was made, or learn exactly which genetic variants a patient is carrying, those resources are included deeper in the report. Coriell encourages physicians to treat its reports as part of a much larger picture of each patient.
“It’s not prescriptive,” Christman stresses. “We are providing information to physicians, from which they need to factor in, potentially, other information about the patient to make a decision themselves. An example might be, the physician may know that a patient has had a kidney removed, or has liver disease – something that could profoundly impact appropriate drug prescribing or dosing, that we don’t know.”
That question of additional information about patients is a driving one for the leadership of Coriell. Like other players in medical genomics, Coriell Life Sciences sees the endgame as being the connection of each person’s whole genome to her or his electronic health records, allowing clinicians – or even computers – to look at both sets of information together.
“It’s a really challenging issue,” says Megill. “The hospital systems, by and large, are responsible for that phenotypic information. It’s pretty unlikely they would have whole-genome sequence data sitting inside of those hospitals themselves, and so where does all of this go?”
The hope is to plug the Genomic Data Ecosystem into the EHR system, and Coriell is already taking the first steps in that direction by designing its reports to correspond to existing HL7 fields – the standard elements of a health record built to make communication between separate EHR systems easier. This practice will grow more difficult, however, as Coriell brings new analytical tools on board. Not every piece of useful information residing in the genome will correspond neatly to an HL7 field.
Coriell decided to manage GeneDose themselves because of prior expertise in drug-gene interactions gained from the CPMC. “We’ve built a panel of experts that’s really the world’s leading scientists in pharmacogenomics,” says Megill. “So we felt that we had a position that was authoritative in this space, and had been looking at it from the perspective of what really has clinical utility, and what really has enough scientific evidence that we would return it to a physician.”
For other types of genetic health risk, the knowledge may be out there, but Coriell doesn’t have any specific experience curating it. For this, the company relies on third-party developers who want to build their own tools in GeneExchange – finding customers through the Genomic Data Ecosystem while giving a small cut of the proceeds from each use back to Coriell.
(Coriell isn’t ready to divulge which partners might join GeneExchange, but, says Christman, “there’s a lot in discussion” – emphasis on a lot.)
This puts Coriell in the curious position of validating its partners’ validation processes. It’s by no means an unworkable model; members of Coriell have been studying best practices in risk reporting for six years, and should be able to recognize when potential partners are adhering to the highest standards of evidence in genetic health. But it does highlight how intensive the curation process can be: Coriell’s Risk Reporting Team will be working hard just to keep up with the latest research in pharmacogenomics, and will need external companies to expand the tests available in GeneExchange.
With IBM’s help, Coriell is already exploring how curation could be accelerated in the future. IBM’s Watson supercomputer, one of the most powerful systems in the world for combining rapid data mining with natural language processing, has already been put to use in oncology, helping doctors find the best treatments for particular cancer cases by combing the medical literature. Coriell Life Sciences has entered into a collaboration with IBM’s newly-formed Watson division to apply some of the same principles to growing the Genomic Data Ecosystem.
One potential application, says Megill, is to use Watson as “the front line of defense [for] keeping current on what’s actually available in the medical literature.” In this scenario, Watson would pass promising sources of information to the Risk Reporting Team, while saving those experts the effort of reading through the most easily-eliminated studies. “That doesn’t circumvent the use of our expert advisory panel,” says Megill, “but what it does do is really shortcut a lot of the manual effort that’s involved in having PhD-level scientists curate all of these new publications all the time.”
More ambitiously, Watson could even become a gateway between GeneVault and EHRs. “We’re also working with the IBM Watson team on developing a more real-time physician’s assistant that can help with more precision prescribing,” says Megill, “with the idea that Watson can be aware of a patient’s medical history, their genotyping information or genomic information, and then marry that with information that’s broader than just genomics… to come up with recommendations for the physician that would help them to tailor an overall medication regimen for that patient.”
“That’s in the very early stages,” he adds. “But it’s something we’re really excited about.”
For now, the emphasis will have to fall on bringing the first apps into GeneExchange, so the Genomic Data Ecosystem can begin living up to its genome-wide ambitions. The other key to success, of course, will be proving that Coriell’s service can positively impact patient care. Savings at PACE from better-targeted prescriptions could help convince new providers to offer their patients access to GeneVault.
The coming year should begin to show GeneDose’s potential as a clinical tool. Coriell Life Sciences is already genotyping its first members: the physicians in PACE, who will learn the ins and outs of the service firsthand.