The Quest to Make Sequence Sense



By Kevin Davies

Nov. 15, 2006 | With the human genome sequenced several years ago, the challenge for biopharma organizations mining this invaluable trove of data is evolving.

New questions are emerging:

• What is the patent landscape of human genome data?

• How can I search GenBank faster and more effectively?

• How can I produce accurate maps of SNPs and exons?

• How can I annotate resequenced genome data?

GenomeQuest
Many of these questions can be addressed by GenomeQuest, the flagship product of Gene-IT. The company was founded in 1998 by bioinformatician Jean-Jacques Codani, the Paris-based chief science officer. President and CEO Ronald Ranauro, a software engineer and founder of Blackstone Computing, took over in 2002. GenomeQuest debuted in 2004.

GenomeQuest 3.0 is a biological sequence search product that creates integrated views of genome data, allowing biologists and intellectual property lawyers to evaluate sequence data and their associated patent status. The product includes an indexed archive of GenBank, EMBL databases, and 30 million patented sequences from the US and abroad, which are updated daily. “IP is where the rubber meets the road,” says Ranauro. “Which programs to advance, which experiments to fund?”

The service can be hosted on a client’s in-house servers or accessed as a secure, hosted Internet service if customers don’t have the requisite IT capacity or comfort. Often customers start by using the hosted service, then deploy the system in-house. The hosted “GenomeQuestLive” resource consists of a 20-node cluster with 80 Opteron CPUs and 80 GB of RAM. Gene-IT can run three jobs simultaneously across the entire resource, with users able to filter and refine the results using sequence, alignment, and annotation properties. Alerts can be set up to fire as new sequences are uploaded, and the results can be mined further.

Among the latest additions to Gene-IT’s 60-strong customer roster is Biogen Idec. Others include international patent offices; biotechs such as Celera, Millennium, and Roche Diagnostics; big pharmas including Pfizer, Novartis, and Sanofi-Aventis; and ten biotechnology law practices such as Foley Hoag. Several customers come from the diagnostics area, where FDA approval moves faster yet can still consume $10 million in 6 months. “This is what keeps product managers up at night,” says Ranauro.

As Ranauro sees it, the virtues of GenomeQuest aren’t so much about raw speed as offering a unique view of the genome and patentome that affords scientists, business developers and lawyers the ability to view and mine the same data. The Biogen Idec deal, he says, underscores “our eminence as the leading solution provider for IP sequence search.”

More than two thirds of Gene-IT customers use GenomeQuest for IP-related searches including genes, proteins, probes and primers, helping to prioritize research products or abandon programs where rivals may have greater IP. Other search applications include high-throughput annotation of resequenced genomes, and validating and aligning SNP and exon-intron data over public databases such as dbSNP.

Ranauro says Gene-IT is evolving GenomeQuest from an application to a platform. The company is focusing on three major enhancements to the product: simplifying access; adding diverse biological archive information to the patent content; and most importantly, providing simple web-level API access to initiate searches via URLs.

SlimSpeed
For sheer speed in alignment analysis, few can surpass New Zealand’s Cartesian Gridspeed, which is preparing to release its SLIM Search software. SLIM Search, which does sequence alignments thousands of times faster than BLAST, just completed an international beta phase.

Two months ago, Cartesian signed Agencourt as a major customer. “Agencourt has been extremely happy,” says company founder and CEO Leonard Bloksberg. As part of its contract sequencing service, Agencourt provides pre-analyzed results to its customers. “They were scheduled to buy another $1-million rack to keep up with the growing volume of searches. Instead, they ended up purchasing the SLIM Search software,” says Bloksberg.

Agencourt’s beta evaluation also prompted a couple of key enhancements to the product. One was the need for cluster compatibility, so “We wrote a distribution mode to run across a cluster,” says Bloksberg, noting that this will be a standard feature in a future release. “Some companies just want you to sit on their clusters,” says Bloksberg. “There are big jobs where you can require large amounts of RAM at peak performance that could take just minutes on a cluster.”

Cartesian also incorporated a module to format search results in a manner identical to the familiar BLAST output. Additional functionality includes adding Mac user support for the first time.

Bloksberg recounts a demo he gave for a prospective customer a few months ago. He was running the software live on his two-year-old Dell laptop (1.4 GHz Pentium M, with 1 GB of RAM), on battery power. Asked to do a comprehensive search using the entire C. elegans EST transcriptome, Bloksberg started to sweat as his laptop’s system monitor showed all resources running at maximum capacity. After four and a half minutes, the result sputtered out. Bloksberg was embarrassed, until the customer said, “Oh my God, the same search takes 5-10 hours on our 700-node Opteron cluster!” The electricity savings alone could pay for the laptop used to run the search.
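To put the anecdote's figures in perspective, the short Python sketch below works out the implied speedup. It uses only the numbers quoted above (4.5 minutes on the laptop, 5 to 10 hours on the 700-node cluster). The per-node figure rests on an assumption that is not in the account: that the cluster search actually used all 700 nodes. Treat it as a rough upper bound, not a benchmark.

```python
# Figures quoted in the demo anecdote.
laptop_minutes = 4.5
cluster_hours_low, cluster_hours_high = 5, 10

# Assumption (not stated in the anecdote): the cluster job used every node.
cluster_nodes = 700


def speedup(cluster_hours: float, laptop_min: float) -> float:
    """Wall-clock ratio of the cluster run to the laptop run."""
    return cluster_hours * 60 / laptop_min


wall_low = speedup(cluster_hours_low, laptop_minutes)
wall_high = speedup(cluster_hours_high, laptop_minutes)

# Naive per-node comparison: one laptop versus 700 nodes working in parallel.
per_node_low = wall_low * cluster_nodes
per_node_high = wall_high * cluster_nodes

print(f"Wall-clock speedup: {wall_low:.0f}x to {wall_high:.0f}x")
print(f"Per-node speedup (naive): {per_node_low:,.0f}x to {per_node_high:,.0f}x")
```

In wall-clock terms the laptop finished roughly 67 to 133 times sooner. Only when the 700-to-1 hardware difference is factored in does the ratio reach the tens of thousands.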

So far, however, the response at major biopharmas “has been somewhat varied,” Bloksberg acknowledges. At one firm, the evaluator was not allowed to change the way the sequences were put through the system to take advantage of SLIM Search’s improvements. “In general, big pharmas move slower, so if they have a system to deal with sequence data, they can’t change that quickly.” New features are in development to conform to biopharma preferences, which Bloksberg says should be ready before the end of the year.

Sequencher Thirst
After years of being sidetracked by work on forensic software, Ann Arbor-based Gene Codes Corporation is enhancing its flagship Sequencher desktop DNA sequence assembly and analysis product. In recent years, Gene Codes has made headlines for its forensic identification database efforts after 9/11 and other disasters (see “Soul Searching,” Bio-IT World, Sept. 2003).

Sequencher version 4.7 features an enhanced Variance Table that allows data to be edited within table cells, with those changes automatically propagated to the sample sequences and chromatograms, making the identification of SNPs and heterozygotes even easier. Further improvements include enhanced GenBank feature handling, updated HTML help, and expanded file export capabilities. The new version also offers improved forensic capabilities for mtDNA analysis.

Gene Codes founder and President Howard Cash says the response to the latest release has been outstanding. “Nobody can really touch Gene Codes in the small-to-medium sized sequencing market and I have to say I take some personal pleasure in watching other companies scramble to try to copy last year’s functions. We keep our upgrades dirt cheap so no customer ever feels penalized for buying a version too soon, and our users really appreciate that,” Cash told Bio-IT World.

Cash maintains that Sequencher is the community’s “premier DNA sequencing tool,” just as the company’s forensic DNA tools are “several generations ahead” of anything else. “One thing that sets us apart is we don’t scour the free tools on the web and add odds and ends to our programs just to announce a new ‘feature.’ We’re not in the business of adding features. We’re in the business of meeting clearly definable needs in the laboratory.”


