By Aaron Krol
January 15, 2014 | Yesterday, the genetics world was rocked by a surprise announcement from its most prominent company: Illumina is secretly owned by Apple.
At least, that would explain the company’s obvious joy at making their previous products look obsolete, and the way their competitors have to constantly protest that, yes, Illumina’s latest gadget is obviously very cool, but wouldn’t you like to compare prices and specifications? (This theory also squares neatly with the BaseSpace app store project.)
All this is to say that Illumina’s CEO, Jay Flatley, declared yesterday at the annual JP Morgan Healthcare Conference that the company’s newest sequencer will deliver the long-sought-after “thousand dollar genome,” the arbitrary but terribly exciting benchmark that the industry has been using for years to talk about a tipping point in the pace of genetic discovery. The platform in question, the HiSeq X, will be capable of sequencing five human genomes in a single day – making the HiSeq 2500, the barely two-year-old predecessor platform whose one genome a day was until recently the industry record, look like a slouch in comparison.
Oh, and for good measure, the new NextSeq 500 (which is already on sale) will be as fast as a HiSeq 2500, but as small and easy to use as the baby of the Illumina family, the MiSeq benchtop sequencer. All in all, even if Illumina really were a part of Apple, it’s hard to see how they could have made more of a splash with their new product line.
To be clear, you shouldn’t be pulling a cool grand out of your bank account and rushing out to get your genome sequenced. The HiSeq X retails for $1 million a unit – and Illumina won’t sell you fewer than ten of them. As Joel Fellis, Illumina’s Senior Manager for Product Marketing in Systems and Genomic Services, told Bio-IT World, “This instrument was purpose-built from the very inception of it to enable large-scale, whole-genome sequencing. Every aspect of the system was really optimized for that one goal.” That means the customer Illumina had in mind was a huge institution, if not a country, that wants to sequence thousands of genomes for population-wide research projects. The instrument won’t even support any application other than whole human genomes.
Those customers are out there; in fact, Illumina has already found three of them. The Garvan Institute, headquartered in Sydney, Australia, and Macrogen, Inc. of South Korea, have each ordered the standard battery of ten HiSeq X instruments, a system that Illumina in a fit of lexical insanity refers to as a HiSeq X Ten. (Pronounced “ex ten,” not “ten ten.”) The Broad Institute of MIT and Harvard, meanwhile, has ordered no fewer than fourteen of the sequencers, at a cost of $14 million.
Despite the massive upfront costs, the math seems to check out. The price of the consumables required to perform one run on a fully-loaded HiSeq X at its highest-output setting – 1.8 terabases of data, enough to sequence 16 whole human genomes at standard 30x coverage – is $12,750. That comes to just a tiny bit under $800 per genome. “That’s typically how people measure the cost of a genome,” says Fellis, “or at least how vendors report it out. However, we wanted to enable a fully-loaded $1000 genome, not just a theoretical one.” So they factor in the depreciation cost of the hardware over four years, and divide that by the number of genomes you could reasonably sequence in that time, arriving at $135. Then they add in the sample prep and labor, at $65 a genome. Added to roughly $800 in consumables, that brings the total to $1000. In his announcement, Flatley said that this calculation was in line with the National Human Genome Research Institute’s equation for the cost of sequencing. (That would be the equation that produced the famous graph of falling sequencing prices as compared to Moore’s Law, which Illumina has already rather audaciously updated with a new data point for 2014.)
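For readers who want to check the arithmetic, here is a minimal sketch of the per-genome calculation using only the figures quoted above. The function and constant names are illustrative, not anything Illumina or the NHGRI publishes.

```python
# Figures as quoted in the article; all names here are illustrative.
RUN_CONSUMABLES_USD = 12_750          # consumables for one full high-output run
GENOMES_PER_RUN = 16                  # whole human genomes at 30x per run
DEPRECIATION_PER_GENOME_USD = 135     # hardware depreciation, as quoted
PREP_AND_LABOR_PER_GENOME_USD = 65    # sample prep and labor, as quoted


def consumables_per_genome(run_cost=RUN_CONSUMABLES_USD,
                           genomes=GENOMES_PER_RUN):
    """Consumables cost of one genome: run cost split across the genomes in a run."""
    return run_cost / genomes


def fully_loaded_cost_per_genome():
    """Consumables plus quoted depreciation plus quoted prep and labor."""
    return (consumables_per_genome()
            + DEPRECIATION_PER_GENOME_USD
            + PREP_AND_LABOR_PER_GENOME_USD)


if __name__ == "__main__":
    print(f"Consumables per genome:  ${consumables_per_genome():.2f}")
    print(f"Fully loaded per genome: ${fully_loaded_cost_per_genome():.2f}")
```

Without rounding, the consumables come to $796.88 per genome and the fully loaded figure to $996.88, which is where the round “$800” and “$1000” come from.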
It did not take long for this graph to start making the rounds on Twitter.
It’s worth keeping in mind that any institution with a HiSeq X Ten that chooses not to sequence genomes around the clock will not be able to keep its hardware depreciation under that $135 per genome figure, which assumes 18,000 genomes will be sequenced every year. That’s a lot of genomes: so many that just four such facilities would sequence more human genomes in their first year than the entire world has sequenced up to the present day. The calculation also doesn’t factor in operational overhead, which can vary considerably. Still, Illumina is being quite candid about the costs involved, and with the core consumable prices decisively set at $800 per genome, it’s fair to say that the cost of a genome on the HiSeq X is not going to climb very far above the $1000 figure – which, again, is just a rather arbitrary way of saying genomes will be cheap enough to sequence tens of thousands of them. One might also add that any facility with ten copies of the fastest sequencer in the world that doesn’t run them every day is being a bit irresponsible.
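To get a feel for how sensitive that figure is to utilization, here is a rough sketch. It assumes the hardware cost is fixed, so that the quoted $135 depreciation simply scales inversely with the number of genomes actually sequenced per year, and, like Illumina’s own calculation, it leaves out operational overhead. The names are illustrative.

```python
# Figures as quoted in the article; names are illustrative.
CONSUMABLES_PER_GENOME_USD = 800       # rounded consumables figure
PREP_AND_LABOR_PER_GENOME_USD = 65
QUOTED_DEPRECIATION_USD = 135          # valid at the assumed throughput below
ASSUMED_GENOMES_PER_YEAR = 18_000      # throughput behind the $135 figure


def depreciation_per_genome(genomes_per_year):
    """Scale the quoted depreciation inversely with actual yearly throughput.

    Assumption: the hardware cost is fixed, so fewer genomes means each one
    carries a proportionally larger share of it.
    """
    if genomes_per_year <= 0:
        raise ValueError("genomes_per_year must be positive")
    return QUOTED_DEPRECIATION_USD * ASSUMED_GENOMES_PER_YEAR / genomes_per_year


def cost_per_genome(genomes_per_year):
    """Consumables + prep and labor + utilization-adjusted depreciation."""
    return (CONSUMABLES_PER_GENOME_USD
            + PREP_AND_LABOR_PER_GENOME_USD
            + depreciation_per_genome(genomes_per_year))


if __name__ == "__main__":
    for genomes in (18_000, 13_500, 9_000):
        print(f"{genomes:>6} genomes/year -> ${cost_per_genome(genomes):,.0f} per genome")
```

Under these assumptions, a facility running at half the assumed throughput would see depreciation double to $270 per genome, for a total of about $1,135.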
There’s one more important qualifier here. The HiSeq X is still essentially a second-generation sequencer. Its read length, at 150 base pairs on each end of a paired-end read, is very good for its ilk, but will still leave some significant gaps in the genome. It’s certainly not good enough for de novo assembly. If you’re really interested in telomeres, or the tandem repeats that cause Huntington’s disease, or have some burning vendetta against reference genomes, you’re likely to protest that you’re not really getting the whole human genome for your thousand dollars. That’s why the board of PacBio is probably taking this announcement in stride in a way that the board of Life Technologies shouldn’t be.
But the fact is, this is how most genome sequencing is done today; it’s good enough for most purposes. No one is railing against Illumina’s previous claims of a $4000 genome on the HiSeq 2500 just because it relies on reference assembly.
So no, you can’t get your genome sequenced on demand for a thousand dollars, but that was only ever half the point of the “thousand dollar genome.” Even if you could easily order your whole genome today, it wouldn’t tell you very much; there’s just too much uncertainty about what rare variants signify, or how predictive any genetic information is of future health outcomes. That’s why we need to support massive, population-wide studies that look at the whole human genome, and that’s exactly what the first round of HiSeq X Ten customers wants to undertake. The Garvan Institute has already released a statement about its purchase and research plans, and we can no doubt look forward to similar announcements from Macrogen and the Broad. Most importantly, these studies will have a scope and breadth of possibility that would previously have required decades and hundreds of millions of dollars in funding, instead of years and tens of millions.
And unlike previous announcements heralding the dawn of the thousand dollar genome, this one is apparently already a done deal. The HiSeq X is currently being manufactured, with the first units set to ship in March. As Fellis told us, “We’ve been working on this project for quite some time, and sequenced hundreds if not thousands of human genomes with this platform.” (No word, unfortunately, on whose was the first thousand dollar genome.)
A HiSeq X Ten sequencing system. Image credit: Illumina
The HiSeq X relies on three main types of innovation to power its leap forward in productivity over the HiSeq 2500. The sequencing-by-synthesis chemistry is reportedly four times faster than previous kits. The optical scanning is a good six times faster than on a 2500, partly because the HiSeq X scans bidirectionally instead of one way across the flow cell. Most important, the flow cell loaded into the instrument has been redesigned. In a HiSeq 2500, synthesis occurs at random positions across the surface of the flow cell; the HiSeq X uses a patterned flow cell studded with billions of nanowells at known locations, which offers some control over where synthesis occurs. “That allows us to efficiently pack the flow cell,” says Fellis, “and ultimately get more reads and more data out of the flow cell. So that’s probably the main innovation that’s driving the sheer data output of the system.” In the coming months, we’ll see how much more detail Illumina cares to release.
From World's Fastest Sequencer to World's Fastest Benchtop
Naturally, the “thousand dollar genome” pitch will fetch the HiSeq X a lot of headlines (a ploy to which Bio-IT World is not immune), but for the average genetics lab, it’s the NextSeq 500 that may actually change budget allocations for 2014. This is the first benchtop sequencer that can deliver a whole human genome in a day – actually 29 hours or so – and unlike the HiSeq X, it also has plenty of other applications. The NextSeq 500 will garner some partially-deserved comparisons to a miniaturized HiSeq 2500. In fact, its high-output mode is basically equivalent to the HiSeq 2500’s less efficient rapid run mode; if you’re doing large volumes of complex sequencing, you’ll want to stick with the clunky production instrument.
Yet there’s a real market need for a high-throughput sequencer that will easily fit into the workflow of a small lab. As Fellis rightly points out, “Smaller labs and individual PIs… want to do high-throughput applications, but they either don’t have the budget or they don’t have the sample volume to warrant a production platform like the 2500. So really we see this enabling individual PIs as well as some of the smaller core labs that are looking for that flexibility.” The NextSeq is more adaptable than a HiSeq, with both mid-output and high-output flow cells, ranging from 20 Gb to 120 Gb of output; that’s a range of about six targeted gene panels in 15 hours to six whole exomes in a day. It also has the simple touch-screen interface and minimal servicing demands of a MiSeq, and in a serious boon for small labs, it eliminates temperature cycles.
The NextSeq 500. Image credit: Illumina
One of the technological changes is pretty clever. Traditionally, Illumina’s sequencers have used four different dyes to represent the four bases, and the scanners have read out each of those dyes as they are incorporated into the DNA molecule during synthesis. The NextSeq chemistry, however, uses just two dyes: one for cytosine, another for thymine, both together for adenine, and a “dark” read, where no dye is incorporated, for guanine. That lets Illumina get away with lower-resolution optics and half as many images – meaning fewer cameras adding bulk to the machine – apparently without compromising accuracy: the company claims that 75% of bases are called at Q30 (99.9% accuracy) or greater.
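The logic of the two-dye scheme can be sketched in a few lines. This is a toy illustration, not Illumina’s base-calling software: real instruments work from image intensities rather than clean on/off signals. The specific assignment of bases to channels in the table is an assumption made for the sketch; the point is that two on/off signals give four distinguishable states. The second function shows the standard Phred relationship behind the “Q30 means 99.9%” figure.

```python
# Toy model of two-channel base calling. Keys are
# (signal in channel 1, signal in channel 2). The base-to-channel assignment
# below is an assumption for illustration only.
TWO_CHANNEL_CALLS = {
    (True, False): "C",   # first dye only
    (False, True): "T",   # second dye only
    (True, True): "A",    # both dyes
    (False, False): "G",  # "dark" read, no dye
}


def call_base(channel_1, channel_2):
    """Map one cycle's pair of on/off signals to a base."""
    return TWO_CHANNEL_CALLS[(bool(channel_1), bool(channel_2))]


def call_read(signals):
    """Call a read from a sequence of (channel_1, channel_2) pairs, one per cycle."""
    return "".join(call_base(c1, c2) for c1, c2 in signals)


def phred_to_accuracy(q):
    """Convert a Phred quality score to per-base accuracy: 1 - 10^(-Q/10)."""
    return 1.0 - 10 ** (-q / 10)


if __name__ == "__main__":
    cycles = [(1, 1), (1, 0), (0, 1), (0, 0)]
    print(call_read(cycles))
    print(f"Q30 accuracy: {phred_to_accuracy(30):.3%}")
```

Two images per cycle are enough to separate all four outcomes, which is why the scheme needs half as many images as a four-dye system.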
The NextSeq 500 is already in production and ready to ship, and retails at $250,000. That’s not only much less expensive than a HiSeq, but also very close to the price of Life Technologies’ production instrument, the Ion Proton, which has so far maintained a major competitive edge on price. In the same announcement, Flatley said the company was dropping the price of the MiSeq to $99,000, which starts to encroach on the price range of an Ion PGM. All in all, it looks like a fairly aggressive commercial move got snuck into a speech that will mostly be covered for its “thousand dollar genome” highlight.
There were other interesting announcements, including that BaseSpace will finally offer private clouds with encrypted data storage, for users who want to take advantage of Illumina’s built-in informatics environment but don’t want their data living in a public cloud; and that the company will be submitting the HiSeq 2500 to the FDA for clearance as a diagnostic instrument, with the Verinata non-invasive prenatal test for chromosomal disorders as the first testing application. Those developments are being drowned out now, but the races to create a complete sample-to-analysis pipeline, and to put sequencing in the clinic, are far from finished, and both BaseSpace and the FDA’s reaction to the HiSeq should be watched closely.
For right now, though, feel free to get excited about the thousand dollar genome. Cautiously, of course. With footnotes.