Quake Sequences Personal Genome Using Helicos Single-Molecule Sequencing



By Kevin Davies

August 10, 2009 | Six years ago, Stanford University bioengineering professor Stephen Quake published a new method for sequencing single molecules of DNA, in which his team (then at Caltech) proudly managed to sequence precisely five nucleotides. This week, working with a pair of colleagues, Quake has a bit more to celebrate: the sequence of 2.5 billion bases (about 90%) of his personal genome.

Apart from being a major milestone in single-molecule sequencing (SMS), Quake says his group’s paper points to the democratization of genomic research. “This is the first case you haven’t needed a genome center to sequence a human genome,” Quake told Bio-IT World on the eve of his landmark publication. “What we’ve shown is that you can do it with a pretty modest set of resources—a single professor’s lab, one person doing the sequencing, one instrument, lower cost. Those are all order-of-magnitude improvements over what’s been published recently.” (See accompanying interview, “The Single Life: Stephen Quake Q&A.”)

The Quake genome report, which is published online today in Nature Biotechnology, also marks a major milestone for Helicos Biosciences, the company Quake co-founded in 2004 to commercialize his SMS technology.

“The vast majority of the whole-genome sequencing on [other next-gen] platforms requires it to be done in genome centers, because you need that infrastructure,” commented Helicos president Steve Lombardi. “Literally three people did this work. That’s a real harbinger of what we see the direction of this market going. It’ll be very interesting to see what Francis Collins, in his officially appointed role [as NIH Director], does with that!”

It took research technical manager Norma Neff just four runs on the Stanford HeliScope, at an estimated cost of $48,000, while a physics PhD student, Dmitry Pushkarev, developed a new variant-calling algorithm called UMKA for the bioinformatics analysis. “It’s the first large statement from someone other than the company that the technology works. That’s a really important thing for us,” said Lombardi.

Kevin Ulmer, a pioneer of single-molecule sequencing and a former consultant to Helicos, commented: “Little did I realize that the young postdoc working next to me at the bench in Steve Chu's lab at Stanford in 1993 would become the first person to have his genome sequenced by direct single-molecule methods.” Ulmer says he was considered “completely crazy” to have proposed such a scheme back in 1987, but feels vindicated now.

“I think Helicos deserves some kudos,” commented Clive Brown, vice president of informatics and IT at Oxford Nanopore Technologies. “They’ve stuck with it, and they’ve made it work about as good as it can work with single-molecule fluorescence and the camera they have. People have taken it outside and they’ve used it. That’s not trivial. If I was them, I’d have stuck to that. They should stick to the high ground – you can quote me on that.”

Brown, who was formerly with Solexa and Illumina, said it was misleading to compare the three co-authors on the Stanford paper with the 250 or so on the landmark 2008 Illumina publication in Nature on the first African genome, because “that paper was the culmination of eight years work.” He noted that an earlier 2008 Helicos publication had more than 20 co-authors to sequence a tiny viral genome. Brown also pointed out that some of the platform price comparisons were out-of-date, noting that Illumina has introduced a personal genome sequencing service, coincidentally for the same $48,000 price.

More Than Zero

The identity of “Patient Zero” -- the Caucasian DNA donor -- is not actually specified in the paper. “We wanted to retain some semblance of dignity for the scientific literature,” Quake joked. Earlier this year, however, Quake penned an op-ed in the New York Times announcing that he had sequenced his own genome. One of his prime motivations, he wrote, was to try to understand why his daughters suffered severe peanut allergies.

Quake got access to the HeliScope after a machine was purchased by the Stanford University Stem Cell Institute last year, and volunteered to be the whole-genome subject. He could not obtain a machine for his own lab as Howard Hughes Medical Institute and Stanford University conflict-of-interest policies bar collaborations with biotech companies. “The reason they bought it was not to sequence my genome, but to sequence cancer, tumor stem cell genomes,” Quake explains. “Mine was just to practice, to show that we could do it and to get the informatics into place.”

In four HeliScope runs, Neff generated 148 billion bases of raw sequence in reads ranging from 24 to 70 bases in length, with an average length of 32 bases. Of those reads, 63% could be aligned to about 90% of the reference human genome, using an open-source aligner called IndexDP, for a total useful genome coverage of 28X.
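As a rough check on how those figures relate, here is a back-of-envelope sketch of mean coverage. It is not the authors' calculation. It assumes the 148 billion figure counts raw bases, that the 63% alignment rate can be applied to bases as well as reads, and that the reference genome is about 3.1 billion bases (a figure not given in the article). The function name `mean_coverage` is made up for illustration.

```python
def mean_coverage(raw_bases, aligned_fraction, genome_size, covered_fraction):
    """Estimate mean depth: aligned bases divided by the bases they cover."""
    aligned_bases = raw_bases * aligned_fraction   # bases that found a home
    covered_bases = genome_size * covered_fraction  # part of the genome reached
    return aligned_bases / covered_bases

# Article figures, plus an assumed 3.1-billion-base reference genome.
estimate = mean_coverage(
    raw_bases=148e9,
    aligned_fraction=0.63,
    genome_size=3.1e9,      # assumption, not from the article
    covered_fraction=0.90,
)
print(f"Estimated mean coverage: {estimate:.1f}X")
```

Under these assumptions the estimate comes out at roughly 33X, the same order as the reported 28X. A gap of that size is unsurprising given the simplifications, and the published figure should be taken as authoritative.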

The error rate is put at 3.5%, which is higher than other next-gen platforms; more than half of those errors are deletions, attributable to the sporadic incorporation of “dark” non-fluorescing bases. The read alignment ratio of 63% is on the low side; however, the data were generated six months ago using only single reads, and the ratio is likely to improve quickly.

Pushkarev designed the UMKA program with the HeliScope’s known error profile in mind. It called 97% of SNPs with 99% accuracy, which the authors say is slightly better than the first leukemia genome and comparable to recent publications on the Chinese, Korean and Yoruban genomes, all sequenced on the Illumina GA II platform. Selecting a fairly stringent quality threshold, the authors documented more than 2.8 million SNPs, of which 76% are found in dbSNP. (Similar ratios have been reported for other personal genomes.)

By assessing the depth of read coverage along each chromosome in 1-kilobase windows, Quake’s team was also able to detect copy number variants (CNV). It found 752 CNVs totaling 16 megabases in this way, of which only 54% have previously been catalogued in the Database of Genomic Variants.
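The general idea of read-depth CNV detection can be sketched in a few lines. This is a minimal illustration, not the method used in the paper: it counts read start positions in 1-kilobase windows, compares each window with the chromosome's median depth, and merges adjacent outlier windows. The function names and the 0.5 and 1.5 ratio cut-offs are hypothetical choices; a real pipeline would also correct for GC content and mappability.

```python
from collections import Counter
from statistics import median

WINDOW = 1_000  # 1-kilobase windows, as described in the article


def window_depths(read_starts, chrom_length, window=WINDOW):
    """Count reads starting in each fixed-size window along one chromosome."""
    n_windows = (chrom_length + window - 1) // window
    counts = Counter(
        pos // window for pos in read_starts if 0 <= pos < chrom_length
    )
    return [counts.get(i, 0) for i in range(n_windows)]


def label_windows(depths, low=0.5, high=1.5):
    """Label windows 'loss', 'gain' or 'normal' by depth relative to the median.

    The low/high ratio thresholds are illustrative assumptions.
    """
    typical = median(depths)
    if typical == 0:
        return ["normal"] * len(depths)
    labels = []
    for depth in depths:
        ratio = depth / typical
        if ratio < low:
            labels.append("loss")
        elif ratio > high:
            labels.append("gain")
        else:
            labels.append("normal")
    return labels


def merge_calls(labels, window=WINDOW):
    """Merge runs of identical non-normal labels into (start, end, label)."""
    segments = []
    start, current = None, "normal"
    # A trailing 'normal' sentinel closes any run that reaches the last window.
    for i, label in enumerate(labels + ["normal"]):
        if label != current:
            if current != "normal":
                segments.append((start * window, i * window, current))
            start, current = i, label
    return segments


# Toy example: five windows with depths 10, 20, 20, 10 and 2.
reads = [0] * 10 + [1000] * 20 + [2500] * 20 + [3999] * 10 + [4000] * 2
depths = window_depths(reads, chrom_length=5000)
calls = merge_calls(label_windows(depths))
print(calls)
```

In the toy example the two double-depth windows merge into a single candidate gain and the final low-depth window is flagged as a candidate loss.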

David vs Goliaths

Since going public in 2007, Helicos has struggled to overcome technical difficulties and to convince the market to accept a machine priced over $1 million in the face of more established, more affordable competition from Illumina, Applied Biosystems (Life Technologies) and Roche/454. Lay-offs, a tumbling stock price, and cash concerns hardly bode well. Nor did the return last year, by contract research organization Expression Analysis, of the first instrument Helicos shipped to a customer.

However, fortunes appear to be turning around of late. After Ron Lowy took over as CEO from founder Stan Lapidus (who remains on the board as chairman), the instrument cost was lowered below the $1-million mark, and the company recently announced its first HeliScope sale to a biotech company. Quake’s genome, following the recent publication of an African genome on the Life Technologies SOLiD platform, makes this the fourth next-gen platform to sequence a human genome.

Quake characterizes the sequencing market as a “David vs Goliath” battle. “There are four commercial platforms out there right now, and three of them are billion-dollar companies. The fourth is Helicos, which is a scrappy little bunch -- they’re trying to hang on! I think they’re fantastic, and I’m hoping they’re going to end up at the top of the heap.”

Helicos chief science officer Patrice Milos says the company is continuing to focus on improvements in reagent chemistry and patterned surfaces, “both of which will allow us to further improve the performance of the instrument, read lengths, and strand aligned yield.”

Lombardi stresses this is still the “first generation of the technology.” With improvements in strand length and alignment yields, as well as greater strand density on the flow cells, Lombardi says: “We think we could easily move from where we are today – which is 20-25 Gigabases/run -- well into the several hundreds of Gigabases/run, with just chemistry. No hardware changes at all.”

Quake Traits

The Quake sequence data were generated earlier this year, and Quake says, “We already have three more genomes in the can related to leukemia and cancer. We’re neck deep trying to analyze those and understand what they mean.”

Although not discussed in the Nature Biotechnology paper, Quake and his colleagues have been scouring his DNA sequence and trying to draw some preliminary conclusions about health and genetic traits. “Some of the doctors are starting to poke and prod me to see how they can couple my genome with medicine,” he said.

Among the early discoveries is a rare mutation associated with a heart disorder, for which there may be some family history. “If you know your uncle had something, you kind of discount that you can get it, but to see you’ve inherited the mutation for that is another matter altogether,” he said.

Quake also carries a variant in the CLOCK circadian rhythm gene tentatively associated with increased disagreeability. “You don’t need my genome to tell you that,” Quake quipped. “My wife could have told you that and certainly the dean could have as well.”

 
