By Kevin Davies
May 28, 2009 | Eric Schadt, the Merck biomathematician whose groundbreaking studies on disease pathways and genome data integration have drawn plaudits from the scientific community, is joining the next-generation sequencing company Pacific Biosciences as Chief Scientific Officer.
The hiring, which was confirmed today by PacBio as word of Schadt’s recruitment began to percolate late last week, is both a surprise and a coup. Earlier this year, Schadt joined Merck vice president Stephen Friend at CHI’s Molecular Medicine Tri-Conference to announce the creation of a new non-profit organization, Sage, which would be based on Schadt’s groundbreaking work. (The two were founding members of Rosetta Inpharmatics in Seattle, acquired by Merck in 2001 for $600 million.) That project is still a go, with Friend remaining 100% committed to it and Schadt helping to secure its launch. In retrospect, however, Schadt did not dwell on Sage during his keynote address at Bio-IT World Expo in late April. “Things were brewing,” he admitted.
“Pacific Biosciences is at an amazing point, they’ve assembled an absolutely amazing team focused on this [next-generation sequencing] problem,” Schadt told Bio-IT World. “What I learned at Rosetta is that coming up with a new technology… it’s one thing to have a paper in Science or Nature that describes the technology; it’s quite another to get to a point where you can run assays routinely and get something onto the market.”
“Eric represents probably some of the most bleeding edge brilliant thinking on how biological problems are going to be unraveled,” said PacBio CEO Hugh Martin.
Since 2007, Schadt’s research has appeared in numerous high-impact journals and magazine profiles. Following Merck’s decision to close its Rosetta Inpharmatics subsidiary in Seattle late last year, Schadt and Friend decided to form Sage. Merck agreed to make contributions of money and intellectual property. “Sage is still screaming ahead,” Schadt insists. “We’ll be transitioned out of Merck July 1st… it’ll be sitting here at the Hutch. The Board of Directors is enthusiastic about that all moving and my role, even though it’s shifting.”
Friend remains “100% committed,” while Schadt will devote a portion of his time over the coming months to helping Sage get off the ground. “It would have been a ‘no go’ for me if my making this move [to PacBio] wasn’t seen as synergistic at this stage; I wouldn’t have done it,” said Schadt.
Schadt says the primary objective in launching Sage was building an “open access platform that enables researchers around the planet to interact with complex models of biology in a way that doesn’t demand they understand the complexity but can drive their decision making and make their models better.”
He calls his move to Pacific Biosciences a “divide-and-conquer” strategy. “What Sage needs to make it are large-scale data; complex model-building expertise that can be represented on their platform; and then the platform piece. I decided that I will spend most of my time on the data generation and help enabling the model building.”
Of most importance to Sage is producing “large-scale data that informs models in the best ways, and the model building.” Schadt says he intends to spend more time “on the generation of the kinds of data that are going to take the model building to the next level. And the best avenue I saw to lead that kind of revolution was with Pacific Biosciences.”
Although Schadt thought about building a next-generation sequencing organization in Seattle (Sage’s primary hub), he was drawn to PacBio’s third-generation sequencing technology. “They call them ‘generations’ because they’re dramatic changes in the technology that are going to be hugely enabling,” says Schadt. “With PacBio, there is a tremendous gap between their technology and the second-generation technologies like Illumina, which is going to be completely game changing. Being within the PacBio space made more sense than waiting for that to become ultimately accessible enough to where I could set up a big lab in Seattle.”
Friend says Schadt’s decision was not a shock. “For the past year, I have known of Eric’s interest in making sure that Sage was powered up properly, but also that his desire to be rich to the new technologies was going to be something that always had some possibility,” he told Bio-IT World from London. “For a while, we thought, we can just do it all… I think we appreciated over the last couple of months that that was tricky.” (see "Stephen Friend on the Road from Merck to Sage")
PacBio CEO Hugh Martin says his company had always planned to hire a CSO when the time was right. Of co-founder and chief technology officer Steve Turner, Martin said, “Steve has artfully guided us from a couple of really cool technology principles at Cornell University to the point now that we’re actually in the middle of building the first commercial system. He’s done a phenomenal job and will remain our CTO, focusing more on how we sequence and what other applications for this technology are there.”
Martin says his firm feels a sense of obligation to help the industry identify new genomics applications. “Second generation was all about brute force,” says Martin. “How much massive throughput can we get without a whole lot of understanding,” because of short read lengths and high error rates. “The third generation—of which we’re the first and most credible—is a whole new world. It’s important that we have an obligation to provide leadership in thinking about applications and how we’re going to approach these problems.”
“The leverage Eric gets by being here on solving these problems is far greater than if he built a state-of-the-art sequencing center in Seattle,” Martin continued. “Eric is going to be on the front lines for us, talking to all these [scientists] around the world, helping them think about what they can do with the sequencing capabilities third-generation represents and how to deal with all this data.”
Martin said that Schadt, while helping PacBio engage with the community, will “initially spend most of his scientific time working with Sage, because we think it’s extremely important that the platform gets launched.” Once that is established, “we’ll be building a PacBio lab, which will give [Schadt] an actual budget and people to go after some of the problems that he’d like to address.”
Filling the Gaps
For Schadt, the prospect of integrating next-generation sequencing data into his brand of data integration and model building is enticing. “What we learned in generating data on a scale no-one else has generated [is] that we’re only glimpsing a fraction of the biology in those systems,” he says. “We’re using technologies like microarrays that give a very limited view of the transcriptome,” which reduces the overall accuracy of the networks because of the paucity of information.
“I see the PacBio technology providing a way to go from glimpsing maybe 1% of biology in these populations to an order of magnitude beyond that,” Schadt continued. By sequencing “entire genomes on a very short timescale,” including transcriptomes, non-coding RNAs and gene isoforms, “all of which predispose in different ways to disease and drug response, it will absolutely elevate the accuracy of the models to a place we haven’t seen before.”
“Look at the badmouthing that’s started on the GWAS [genomewide association studies] because it’s failing to deliver clinically useful models. The reason for that is you don’t have enough information to fill in the gaps. It’s absolutely the third-generation technologies like PacBio that’re going to fill in those gaps.”
From a technology perspective, Schadt cites the speed and longer read lengths as the major attributes of the PacBio platform. “We’ve spent a lot of time looking at short-read technologies like Illumina for putting together these networks,” he says. “It’s very difficult to take those short-read technologies and disambiguate what isoforms were represented for a given transcript and how those isoforms were being controlled by different genetic loci.” On a ‘per base’ basis, Schadt estimates that the PacBio technology is about 25,000 times faster. Moreover, he says “people are way under-appreciating” how important longer read lengths are going to be.
Martin adds that PacBio’s inherent read length is “hundreds of times longer than existing platforms,” but acknowledges that, “for now… there are some photochemical processes which tend to end the reads prematurely” after several thousand bases, whereas the natural processivity of the enzyme is tens of thousands of bases. He stresses “for now,” however. One solution, presented by Turner earlier this month in a poster at the annual Cold Spring Harbor Laboratory Biology of Genomes conference, is to use strobe sequencing to produce longer reads with a dark spacer in between, or to navigate repetitive regions of DNA.
“We can have the illumination on, so we can do the reading, and then turn it off, let the enzyme continue to run, and then turn the laser back on,” says Martin. “You can actually strobe the reads on or off, throughout the reads.” It is just one of many advantages of third-generation sequencing that Martin anticipates in the coming 12 months; another is redundant sequencing of circular DNA templates to obtain consensus reads on a single molecule.
Schadt anticipates that “partner-type nodes” will be set up in New Haven and the U.K., with “real possibilities” for a bigger presence in the Bay Area. “The aim would be to have PacBio as a strong collaborator,” he says. “There’s huge interest in the Bay Area for seeing this type of effort migrate, particularly when you talk about the high-performance computing angles and all the IT expertise that’s in the Bay Area.”