The Quest to Make Sequence Sense


By Kevin Davies

Nov. 15, 2006 | With the human genome sequenced several years ago, the challenge for biopharma organizations mining this invaluable trove of data is evolving.

New questions are emerging:

  • What is the patent landscape of human genome data?
  • How can I search GenBank faster and more effectively?
  • How can I produce accurate maps of SNPs and exons?
  • How can I annotate resequenced genome data?

GenomeQuest
Many of these questions can be addressed by GenomeQuest, the flagship product of Gene-IT. The company was founded in 1998 by bioinformatician Jean-Jacques Codani, the Paris-based chief science officer. President and CEO Ronald Ranauro, a software engineer and founder of Blackstone Computing, took over in 2002. GenomeQuest debuted in 2004.

GenomeQuest 3.0 is a biological sequence search product that creates integrated views of genome data, allowing biologists and intellectual property lawyers to evaluate sequence data and their associated patent status. The product includes an indexed archive of GenBank, EMBL databases, and 30 million patented sequences from the US and abroad, which are updated daily. “IP is where the rubber meets the road,” says Ranauro, “which programs to advance, which experiments to fund?”

The service can be hosted on a client’s in-house servers or accessed as a secure, hosted Internet service if customers don’t have the requisite IT capacity or comfort. Often customers start by using the hosted service, then deploy the system in-house. The hosted “GenomeQuestLive” resource consists of 20 nodes with 80 Opteron CPUs and 80 GB of RAM. Gene-IT can run three jobs simultaneously across the entire resource, with users able to filter and refine the results using sequence, alignment, and annotation properties. Alerts can be set up to flag new sequences as they are uploaded, and results can be mined further.

Among the latest additions to Gene-IT’s 60-strong customer roster is Biogen Idec. Others include international patent offices; biotechs such as Celera, Millennium, and Roche Diagnostics; big pharmas including Pfizer, Novartis, and Sanofi-Aventis; and ten biotechnology law practices such as Foley Hoag. Several customers come from the diagnostics area, where FDA approval moves faster yet can still consume $10 million in 6 months. “This is what keeps product managers up at night,” says Ranauro.

As Ranauro sees it, the virtues of GenomeQuest aren’t so much about raw speed as offering a unique view of the genome and patentome that affords scientists, business developers and lawyers the ability to view and mine the same data. The Biogen Idec deal, he says, underscores “our eminence as the leading solution provider for IP sequence search.”

More than two-thirds of Gene-IT customers use GenomeQuest for IP-related searches covering genes, proteins, probes, and primers, helping them prioritize research products or abandon programs where rivals may hold stronger IP. Other search applications include high-throughput annotation of resequenced genomes, and validating and aligning SNP and exon-intron data against public databases such as dbSNP.

Ranauro says Gene-IT is evolving GenomeQuest from an application to a platform. The company is focusing on three major enhancements to the product: simplifying access; adding diverse biological archive information to the patent content; and most importantly, providing simple web-level API access to initiate searches via URLs.

SlimSpeed
For sheer speed in alignment analysis, few can surpass New Zealand’s Cartesian Gridspeed, which is preparing to release its SLIM Search software. SLIM Search, which does sequence alignments thousands of times faster than BLAST, just completed an international beta phase.

Two months ago, Cartesian signed Agencourt as a major customer. “Agencourt has been extremely happy,” says company founder and CEO Leonard Bloksberg. As part of its contract sequencing service, Agencourt provides pre-analyzed results to its customers. “They were scheduled to buy another $1-million rack to keep up with the growing volume of searches. Instead, they ended up purchasing the SLIM Search software,” says Bloksberg.

Agencourt’s beta evaluation also prompted a couple of key enhancements to the product. One was the need for cluster compatibility, so “we wrote a distribution mode to run across a cluster,” says Bloksberg, noting that this will be a standard feature in a future release. “Some companies just want you to sit on their clusters,” says Bloksberg. “There are big jobs where you can require large amounts of RAM at peak performance that could take just minutes on a cluster.”

Cartesian also incorporated a module to format search results in a manner identical to the familiar BLAST output. The software also supports Mac users for the first time.

Bloksberg recounts a demo he gave for a prospective customer a few months ago. He was running the software live on his two-year-old Dell laptop (1.4 GHz Pentium M, with 1 GB of RAM), on battery power. Asked to do a comprehensive search using the entire C. elegans EST transcriptome, Bloksberg started to sweat as his laptop’s system monitor showed all resources running at maximum capacity. After four and a half minutes, the result sputtered out. Bloksberg was embarrassed, until the customer said, “Oh my God, the same search takes 5-10 hours on our 700-node Opteron cluster!” The electricity savings alone could pay for the laptop used to run the search.

So far, however, the response at major biopharmas “has been somewhat varied,” Bloksberg acknowledges. At one firm, the evaluator was not allowed to change the way the sequences were put through the system to take advantage of SLIM Search’s improvements. “In general, big pharmas move slower, so if they have a system to deal with sequence data, they can’t change that quickly.” New features are in development to conform to biopharma preferences, which Bloksberg says should be ready before the end of the year.

Sequencher Thirst
After years of being sidetracked working on forensic software, Ann Arbor-based Gene Codes Corporation is enhancing its flagship Sequencher desktop DNA sequence assembly and analysis product. In recent years, Gene Codes has made headlines for its forensic identification database work after 9/11 and other disasters (see “Soul Searching,” Bio-IT World, Sept. 2003).

Sequencher version 4.7 features an enhanced Variance Table that allows data to be edited within table cells, with those changes automatically propagated to the sample’s sequence and chromatograms, making the identification of SNPs and heterozygotes even easier. Further improvements include enhanced GenBank feature handling, updated HTML help, and expanded file export capabilities. The new version also offers improved forensic capabilities for mtDNA analysis.

Gene Codes founder and President Howard Cash says the response to the latest release has been outstanding. “Nobody can really touch Gene Codes in the small-to-medium sized sequencing market and I have to say I take some personal pleasure in watching other companies scramble to try to copy last year’s functions. We keep our upgrades dirt cheap so no customer ever feels penalized for buying a version too soon, and our users really appreciate that,” Cash told Bio-IT World.

Cash maintains that Sequencher is the community’s “premier DNA sequencing tool,” just as the company’s forensic DNA tools are “several generations ahead” of anything else. “One thing that sets us apart is we don’t scour the free tools on the web and add odds and ends to our programs just to announce a new ‘feature.’ We’re not in the business of adding features. We’re in the business of meeting clearly definable needs in the laboratory.”

