By Allison Proffitt
January 22, 2013 | The Undiagnosed Diseases Program (UDP) at the National Institutes of Health (NIH) has partnered with Appistry to release the diploid aligner genetic-analysis pipeline NIH developed for rare disease diagnosis.
Rare diseases are defined in the Orphan Drug Act as those maladies that affect fewer than 200,000 individuals in the US. Of these individuals, just 150–170 qualify each year for the UDP, which often sees patients with diseases occurring in fewer than 50 people in the world.
For these patients, comparing the exome to the standard reference genome doesn’t usually reveal the cause of disease. “We can improve that, by creating a better [reference] genome,” explained William A. Gahl, NHGRI clinical director, and director of the NIH UDP.
The pipeline that NIH developed makes a diploid reference genome that reflects the parental heritage of the patient. “The system creates a diploid genome that takes into consideration the haploblocks of each of the parents, and in that fashion those roughly hundred base pair reads are aligned better and we have fewer false positives and fewer false negatives. It gives us a more accurate set of putative variants to look at,” Gahl said.
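The idea Gahl describes can be illustrated with a toy sketch. This is not the NIH pipeline, which works on phased haplotype blocks with a production aligner. It is a minimal illustration that assumes single-base substitutions only, a fixed read position, and hypothetical names (`apply_snvs`, `mismatches`, `best_mismatches`) and sequences invented for the example. It shows why a read drawn from a parental haplotype matches a parent-aware diploid reference more closely than the standard haploid reference.

```python
from typing import Dict, List


def apply_snvs(reference: str, snvs: Dict[int, str]) -> str:
    """Return a copy of `reference` with single-base substitutions applied.

    Positions are 0-based. Insertions and deletions are not handled here.
    """
    bases = list(reference)
    for pos, alt in snvs.items():
        bases[pos] = alt
    return "".join(bases)


def mismatches(read: str, target: str, offset: int) -> int:
    """Count mismatching bases when `read` is placed at `offset` on `target`."""
    window = target[offset:offset + len(read)]
    return sum(1 for r, t in zip(read, window) if r != t)


def best_mismatches(read: str, targets: List[str], offset: int) -> int:
    """Score the read against every haplotype and keep the best (lowest) count."""
    return min(mismatches(read, t, offset) for t in targets)


# Hypothetical toy data: a short "standard" reference and parental variants.
reference = "ACGTACGTACGTACGT"
maternal_snvs = {3: "C", 9: "A"}   # assumed variants on the maternal haplotype
paternal_snvs = {12: "G"}          # assumed variant on the paternal haplotype

# The "diploid reference" is simply the pair of parent-specific sequences.
maternal = apply_snvs(reference, maternal_snvs)
paternal = apply_snvs(reference, paternal_snvs)

# A read sequenced from the patient's maternally inherited chromosome.
read = maternal[2:12]

haploid_score = mismatches(read, reference, 2)
diploid_score = best_mismatches(read, [maternal, paternal], 2)

print("mismatches vs. standard reference:", haploid_score)
print("mismatches vs. diploid reference: ", diploid_score)
```

In this toy case the read carries two inherited variants, so it mismatches the standard reference twice but matches the maternal haplotype exactly. A real aligner would also search for the read's position and handle indels, which this sketch deliberately leaves out.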
The pipeline is built on Broad Best Practices for standard exome analysis, Gahl said, and includes GATK, a Broad tool also commercialized and supported by Appistry, along with other open-source and commercial pieces. In principle, the pipeline is platform independent, but it was designed using data from Illumina GAIIx and HiSeq 2000 instruments, and should work well with HiSeq 2500 RapidRun or MiSeq data.
The NIH team has been working on the pipeline for about two years. For the past year, Appistry has been working to make the pipeline “production-ready,” said Kevin Haar, CEO of Appistry.
“A lot of times when we work with great scientists and great doctors, they’ve developed an idea, an approach, a method for the scientific or medicinal approach they want to take. They’ve got a way to get to an answer. But typically we find that those solutions are not really production ready. You can’t just pick it up and give it to somebody else—another institution—and expect them to just be able to run it. It may not work well in their platform, it may have glitches, it may have integration problems between tools. It may have a lot of things that are not very transparent to the next user.”
“So what we were able to do in this case,” Haar continued, “is that we teamed with those scientists to really make [the pipeline] more of a commercial-grade product. There’s more documentation; it’s very reliable; it’s robust. It scales. You can run it not just once or twice a week but hundreds or thousands of times. Now you can create a community where you can share this science.”
Appistry’s goal, Haar said, is to enable NIH to pass the pipeline to others.
“We’re really about fixing any issues, any integration issues… make it supportable, provide documentation, provide commercial-grade support, and also provide universal access. We host [the pipeline] in our cloud environment. We give people the opportunity to have access to that technology without struggling with it, or having to know exactly what’s underneath the covers.”
The UDP began using the pipeline in its entirety this month, though changes are still being made. Since Appistry has worked to smooth some of the pipeline integration, “NIH has been able to make rapid changes and really begin tweaking things to do what they need it to do,” explained Deborah Ausman, Appistry’s director of marketing communications. “We are currently speaking with other UDP partners, however, about implementing the pipeline at their sites. Obviously each site may have slightly different requirements, which is the beauty of this work... we can optimize the science a site wishes to use, while considering the best practices defined by NIH UDP.”
NIH is happy for Appistry to disseminate and promote the pipeline, Gahl said. “We would like to offer this to the community, and you need a company to do that. They’re going to promote it. They’re going to suggest its use and test it out on a large scale with other users—I assume universities and sequencing centers. That’s not a function of the NIH.”
NIH, in turn, has an ongoing contract with Appistry to use the diploid aligner for the analysis of past and some future exomes in the Undiagnosed Diseases Program, Gahl said. “But it’s true, we’re not really in control of this [anymore], but that’s sort of alright. That’s what we do in a government agency: we basically do things for the common good.”