Speed Heals: The 26-Hour Diagnostic Genome
By Allison Proffitt
September 29, 2015 | Stephen Kingsmore and his colleagues at the Center for Pediatric Genomic Medicine at Children’s Mercy in Kansas City announced 26-hour diagnostic whole genome sequencing in a paper published today in Genome Medicine, an improvement over the 50-hour whole genome sequencing the group published in 2012.
The paper is published just one day after Kingsmore took his new post as President and CEO of the Rady Pediatric Genomics and Systems Medicine Institute at Rady Children’s Hospital in San Diego.
The second-generation STATseq pipeline comprises 18-hour whole genome sequencing on an Illumina HiSeq 2500 in rapid run mode; read mapping, alignment, and variant calling with the Edico DRAGEN pipeline; and a trio of software tools for analysis: VIKING, RUNES, and SSAGA.
The 26-hour clock runs from blood sample to provisional diagnosis. Of the four samples run with the new pipeline and published in the Genome Medicine paper (DOI 10.1186/s13073-015-0221-8), the shortest total time was 26:08 and the two longest were 26:47.
Although the time has been cut nearly in half, Kingsmore is particularly pleased with the increased sensitivity. The analytic sensitivity and specificity are both 99.5%. “That’s a significant improvement over the 2012 marks,” Kingsmore told Bio-IT World. “But also it’s actually a pretty significant improvement over what you might call the research standard of BWA-mem plus GATK. It really is a much more sensitive platform.”
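For readers less familiar with the two metrics, here is a minimal sketch of how analytic sensitivity and specificity are conventionally computed from counts of correct and incorrect calls. The function names and the counts are hypothetical, chosen only so that both values come out to 99.5%; they are not figures from the paper.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real variants that the pipeline detected."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-variant sites that the pipeline correctly left uncalled."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for illustration only (not data from the paper).
sens = sensitivity(true_pos=995, false_neg=5)
spec = specificity(true_neg=1990, false_pos=10)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

A pipeline that filters too aggressively raises the false-negative count, which lowers sensitivity even when specificity stays high.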
In fact, Kingsmore added, “The new variants that we’re seeing tend to be rare, deleterious variants, so that the filtering mechanisms that were being employed with the standard pipeline tended to have a bias against deleterious rare variants, which of course are the only thing you’re really interested in if you’re doing diagnostics.”
Kingsmore and his former team at Children’s Mercy have been working closely with Illumina for years on the problem of quick and accurate diagnoses for babies in the NICU. The 50-hour pipeline was published in conjunction with Illumina’s team in Essex, UK, and the current work has been ongoing since then. In fact, Kingsmore said he’s had data from the Essex team for “well over a year,” though he wanted to wait to publish “because we felt it was important to be able to transfer the technology from Essex at Illumina’s own shop so that it actually worked in our hospital.”
The sequencing speed improvements, from 26 hours to 18, were achieved by “tweaking the recipe,” Kingsmore said. Cycles were run faster, which then required recalibrating heating, cooling, and other run parameters. The ultra-fast run was first performed by Illumina in Essex, and Kingsmore said it wasn’t seamlessly transferable to the hospital setting. “We did have a lot of missteps initially in deploying that in Kansas City. We actually had to have some of the guys come over from Essex and spend a week in our lab working on things like ramp rates and heating and cooling and pump settings before we were happy with it.”
The largest speed gains were made in alignment, variant detection, and genotyping. What took 15 hours in the first generation of the STATseq pipeline took Kingsmore’s team only 40 minutes with Edico’s DRAGEN aligner and variant caller. Maximum analytic sensitivity was achieved by combining the variant calls of both the DRAGEN pipeline and the GSNAP/GATK 3.2-VQSR pipeline.
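As a rough illustration of what combining two call sets can look like, the sketch below takes the union of variants from two callers and records which caller reported each one. It is a minimal sketch under stated assumptions, not the STATseq implementation: the `Variant` record, the `combine_calls` function, and the example coordinates are all hypothetical, and the paper's actual merging rules may differ.

```python
from typing import Dict, Iterable, NamedTuple, Set


class Variant(NamedTuple):
    """Hypothetical minimal variant key: position plus alleles."""
    chrom: str
    pos: int
    ref: str
    alt: str


def combine_calls(call_sets: Dict[str, Iterable[Variant]]) -> Dict[Variant, Set[str]]:
    """Union the call sets, mapping each variant to the callers that reported it."""
    support: Dict[Variant, Set[str]] = {}
    for caller, calls in call_sets.items():
        for variant in calls:
            support.setdefault(variant, set()).add(caller)
    return support


# Made-up example calls; the coordinates carry no biological meaning.
dragen_calls = [Variant("chr1", 1000, "A", "G"), Variant("chr2", 500, "C", "T")]
gatk_calls = [Variant("chr1", 1000, "A", "G"), Variant("chr3", 42, "G", "A")]

merged = combine_calls({"dragen": dragen_calls, "gatk": gatk_calls})
for variant, callers in sorted(merged.items()):
    print(variant, sorted(callers))
```

A union keeps any variant seen by either caller, which favors sensitivity; a variant reported by only one caller can then be flagged for closer review.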
Kingsmore—much like Shawn Levy at HudsonAlpha—wasn’t lukewarm on Edico’s DRAGEN pipeline; he twice used the word “solved” to describe its impact:
“DRAGEN technology solves one of the larger bottlenecks, which was up until now everybody has had to spend increasing money either on cloud compute or on supercomputing,” Kingsmore said. “Now the major driver for that, which was alignment and variant calling—that problem has been solved.”
Both Kingsmore and Levy report that the DRAGEN pipeline takes about 40 minutes to process a genome. “Much to our surprise, DRAGEN not only gave us speed but it gave us improvements in analytic performance,” Kingsmore said. “That was something we had not expected. It was a nice bonus to have not only a faster result, but a better result.”
The final step at Children’s Mercy might be the most challenging: coming up with a clinical diagnosis.
Three software tools developed by Neil Miller, Director of Informatics and Software Development at Children’s Mercy, were used in the pipeline. Slightly older (and slower by about 30 minutes) versions of SSAGA (Symptom and Sign Associated Genome Analysis) and RUNES (Rapid Understanding of Nucleotide variant Effect Software) were already part of the 50-hour pipeline.
The phenotypic analysis of a patient (or trio if possible) is done in parallel with the sequencing, Kingsmore said. While blood samples are progressing through DNA isolation, library prep, and sequencing, clinicians are teasing apart symptoms that may have no relation to one another.
“For some of the babies the clinical picture is very, very simple. The baby has seizures and that’s it. But other babies may have been in the unit for a while and they have a rather complex clinical picture. And you do need to read through the record and extract what’s relevant and meaningful. It is a bit subjective, because many of these babies have more than one thing going on… There is a little bit of an art there to knowing whether you’re dealing with one thing or two concurrent diseases.”
SSAGA is a web-based tool that helps with that, providing a differential diagnosis from diseases listed in OMIM, Orphanet, and DECIPHER (Database of genomic variation and phenotype in humans using Ensembl Resources) based on Human Phenotype Ontology terms that a clinician enters for the patient.
RUNES identifies known disease-causing variants and uses predictive tools to estimate variant consequence. RUNES then assigns a score that estimates how likely the variant is to be disease-causing.
Previously, the outputs of the two programs were compared via spreadsheet. But in the new 26-hour pipeline, outputs from both programs feed into VIKING (variant integration and knowledge interpretation in genomes). VIKING allows dynamic filtering of variants based on clinical features, disease genes, ACMG-type pathogenicity category, allele frequency, genotype, and inheritance pattern.
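To make the idea of dynamic filtering concrete, here is a minimal sketch of narrowing a variant list by several of the criteria named above. Everything in it is an assumption made for illustration: the `AnnotatedVariant` fields, the numeric category scale (lower meaning more likely pathogenic), the default thresholds, and the placeholder gene names. It does not reflect VIKING's actual data model or interface.

```python
from dataclasses import dataclass
from typing import Collection, List


@dataclass(frozen=True)
class AnnotatedVariant:
    """Hypothetical annotated variant record."""
    gene: str
    category: int            # assumed scale: 1 = most likely pathogenic
    allele_frequency: float  # population frequency, 0.0 to 1.0
    genotype: str            # "het" or "hom"


def filter_variants(
    variants: Collection[AnnotatedVariant],
    candidate_genes: Collection[str],
    max_category: int = 2,
    max_allele_frequency: float = 0.01,
    genotypes: Collection[str] = ("het", "hom"),
) -> List[AnnotatedVariant]:
    """Keep variants in candidate genes that are rare and likely pathogenic."""
    return [
        v for v in variants
        if v.gene in candidate_genes
        and v.category <= max_category
        and v.allele_frequency <= max_allele_frequency
        and v.genotype in genotypes
    ]


# Placeholder data; the gene names are not real gene symbols.
variants = [
    AnnotatedVariant("GENE_A", 1, 0.0001, "hom"),
    AnnotatedVariant("GENE_A", 4, 0.2000, "het"),  # common and likely benign
    AnnotatedVariant("GENE_B", 2, 0.0005, "het"),  # not a candidate gene
]

# Candidate genes would come from the differential diagnosis for the patient.
shortlist = filter_variants(variants, candidate_genes={"GENE_A"})
recessive_only = filter_variants(variants, {"GENE_A", "GENE_B"}, genotypes=("hom",))
print(shortlist)
```

Because each criterion is a separate parameter, a reviewer can loosen or tighten one filter at a time and see how the shortlist changes.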
The three-tool package can “automate much of the downstream analysis and interpretation, meaning that a single lab director can relatively rapidly interpret and report results. So much of the heavy lifting is done in terms of analysis and interpretation,” said Kingsmore. “It’s not fully automated. But it’s at the point now where a lab director can make a diagnosis anywhere from three seconds to three hours depending on how tricky and elusive that diagnosis is.”
Though proprietary to Children’s Mercy, RUNES, SSAGA, and VIKING “are on a journey,” Kingsmore said. “By the end of the year, I’m hopeful that they will be available as freeware because the mission is to see this technology provided to kids all over the country. I know that Neil Miller and others are also talking about providing that on a software-as-a-service basis.”
While Kingsmore calls the particular software tools used in the pipeline “incredibly useful,” he also praises analysis software built by Liz Worthey at HudsonAlpha, Ingenuity’s Variant Analysis Suite, and NextCODE’s GOR system. “I think there are a number of commercial and noncommercial systems that are evolving.”
FDA and Validation
All of the testing reported in the paper was research testing. The protocol hasn’t yet been validated for CLIA and CAP guidelines, though much of the software has been validated independently. Whether or not a diagnosis from the STATseq pipeline should undergo confirmatory testing will be up to the lab director, Kingsmore said.
The first generation pipeline was granted a “non-significant risk” status by FDA. “They said that where a delay in returning a result was life-threatening, they gave us NSR—non-significant risk status—for return of a provisional result to a neonatologist,” Kingsmore said.
He views the 26-hour test as the next generation of the original test, but regulatory waters are murky right now for laboratory-developed tests. “[The 26-hour test] is a more accurate and sensitive test. It would seem to me that if they were willing to do it [i.e. grant NSR status] for a less-accurate and sensitive test it should be a no-brainer.”
Having just completed the second generation pipeline, Kingsmore’s move to California was the logical next step to scaling the pipelines he’d worked to develop at Children’s Mercy, he said.
“Genetic diseases are the leading cause of death in infants, they’re the leading cause of death in the NICU, in the PICU. They’re responsible for 15% of pediatric hospitalizations. Across the nation we’re looking at literally hundreds of thousands of kids who could benefit from this technology,” he said.
And for the technology to ever reach those children, it needs to be not only quick, but scalable.
“A key ingredient for the breakthrough application of genomic medicine is speed at scale. In medical practice, the value of information is proportionate to its immediacy relative to the acuity of the clinical situation,” Kingsmore wrote in a Genome Medicine opinion piece in July, from which this article’s title is borrowed (Genome Medicine 2015, 7:82 doi:10.1186/s13073-015-0201-z).
Rady and San Diego are the perfect spot to facilitate this scaling, Kingsmore believes. The Institute was created in April 2014 with a $120 million gift from the Rady Family and a $40 million investment from Rady Children’s Hospital and is affiliated with the University of California San Diego. “[The hospital has] got all of the clinical stuff in place that you need to do something truly significant. And then we’ve got companies like Illumina and Edico Genomics right in our backyard!” Kingsmore adds.
Kingsmore is planning a close relationship between Rady and the team that has stayed in Kansas City. “The big Insight grant that we had that I won at Children’s Mercy is moving with me to San Diego, and then we’re going to send half of it back to Children’s Mercy so we’ll be able to collaborate,” he said. He envisions Rady as being one of the earliest users of SSAGA, RUNES, and VIKING on a software-as-a-service model. “I think that’s one that will work well for many children’s hospitals,” he said.
Kingsmore is also working to set up a consortium of children’s hospitals nationwide to transform pediatric care using genomics. He hopes the first members will be announced this fall.