Breaking Informatics Barriers


Vast quantities of new data types need to be combined and analyzed now

By Malorye Branca

Oct. 9, 2002 | With so much sophisticated technology on hand, it's surprising to realize that pharmacogenomics is still in its infancy, with the promise of many new developments ahead, particularly in informatics.

Several companies with genotyping platforms are methodically sifting through millions of SNPs (single nucleotide polymorphisms) to confirm that they are sufficiently common to be useful markers. "There are a lot of junk SNPs out there, where you don't have any level of validation," warns Mark Pohl, director of informatics at Orchid BioSciences Inc.
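To make "sufficiently common" concrete, here is a minimal sketch of one way such a screen could work: compute each SNP's minor allele frequency (MAF) from genotype calls and keep only those above a cutoff. The 5 percent cutoff, the SNP identifiers, and the function names are assumptions for illustration, not Orchid's actual validation criteria, and the sketch assumes biallelic SNPs.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Return the frequency of the less common allele at one SNP.

    genotypes: list of two-character strings such as "AA", "AG", "GG".
    Assumes a biallelic SNP; returns 0.0 if only one allele is observed.
    """
    counts = Counter(allele for g in genotypes for allele in g)
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0  # empty or monomorphic: no minor allele observed
    return min(counts.values()) / total


def common_snps(calls, min_maf=0.05):
    """Keep the SNP ids whose minor allele frequency is at least min_maf."""
    return [snp for snp, gts in calls.items()
            if minor_allele_frequency(gts) >= min_maf]


# Hypothetical genotype calls for three made-up SNPs.
calls = {
    "snp_a": ["AA", "AG", "GG", "AG"],   # both alleles equally common
    "snp_b": ["CC"] * 19 + ["CT"],       # one T among 40 alleles
    "snp_c": ["TT", "TT", "TT"],         # monomorphic in this sample
}
print(common_snps(calls))
```

In this toy data only snp_a passes the screen; snp_b is too rare and snp_c shows no variation at all.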

Both Orchid and Sequenom have developed Web-accessible databases in which researchers can browse the genome for useful SNPs. Sequenom's RealSNP.com Web portal was launched earlier this year; Orchid just released a beta version of Chromosome Browser. Its full launch is planned for this quarter.

Chromosome Browser covers more than 2 million SNPs from the main public sources — dbSNP, the SNP Consortium, and the Japanese SNP Database, as well as Orchid's own SNPs. Through the Web site, researchers will be able to locate SNPs and obtain primers. More than 80,000 of these SNPs are validated. "Our validation includes 144 samples from three ethnic groups," Pohl says.

This helps SNP enthusiasts who had been using multiple databases to find SNPs, and trial and error to design their assays. "If someone is currently doing assays with microsatellites and they want to switch to SNPs, before there was no tool that allowed them to do that easily," Pohl says.

Improved tools place new demands on informatics. "It is easy to get overwhelmed by the data," Pohl says. "You need very high-throughput servers and software that can deal with this information in huge batches." Yet some important steps are still performed manually, such as grading the scatter plots that represent the readout from Orchid's genotyping instrument. "Being able to do that automatically is the next step," Pohl says.
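As a rough idea of what automated grading of such scatter plots might involve, the sketch below treats each sample as a point whose coordinates are the signal intensities for the two alleles, and calls a genotype from the angle of that point. Weak or ambiguous points are flagged for manual review. The angle bands, the signal cutoff, and all names are hypothetical; the article does not describe how Orchid's instrument or software actually works.

```python
import math


def grade_point(x, y, min_signal=0.2, hom_max=20.0, het_min=35.0, het_max=55.0):
    """Grade one scatter-plot point from its two allele signals.

    x, y: non-negative signal intensities for allele A and allele B.
    Returns "AA", "AB", "BB", or "review" when the point is too weak
    or falls between clusters.
    """
    if math.hypot(x, y) < min_signal:
        return "review"  # too close to the origin to trust
    # 0 degrees = pure allele A signal, 90 degrees = pure allele B signal.
    angle = math.degrees(math.atan2(y, x))
    if angle <= hom_max:
        return "AA"
    if angle >= 90.0 - hom_max:
        return "BB"
    if het_min <= angle <= het_max:
        return "AB"
    return "review"  # between clusters: leave for a human grader


# Hypothetical wells: (allele A signal, allele B signal).
wells = {
    "A1": (1.00, 0.05),  # hugs the allele A axis
    "A2": (0.05, 1.00),  # hugs the allele B axis
    "A3": (0.80, 0.80),  # on the diagonal
    "A4": (0.05, 0.05),  # barely any signal
    "A5": (1.00, 0.50),  # between the AA and AB clusters
}
grades = {well: grade_point(x, y) for well, (x, y) in wells.items()}
print(grades)
```

Fixed angle bands are the simplest possible rule. A production system would more likely fit the clusters to each plate's data, but the idea of separating confident calls from points that still need a human eye is the same.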

Things will get even more interesting as new types of data are introduced to the pharmacogenomic mix. These data are coming from new databases and technologies, such as proteomics and metabolomics, but are also emerging from previously untapped sources.

Israel's Compugen Ltd., for example, is adding patient medical record information to its discovery data mix. Several health-care providers are giving the informatics company access to de-identified data. Compugen can only connect with the patients through the providers.

"If we want to predict patients' responses to a drug we need to take into account all sources of variation," says Michal Preminger, Compugen's vice president of new research directions. "We can get valuable phenotypic information from patient medical histories, and it is only through the combination of all that with genotype that we can have an accurate prediction." The company collects patient DNA samples "only once we have a homogenous and clean population, so we can look at a response that is specific to the drug, not based on other environmental factors," Preminger says.

But clinical data in general present a lot of challenges, says Michael Liebman, chief scientific officer for ProSanos Corp., a La Jolla, Calif.-based startup that generates complex models of disease processes based upon clinical and other data. For example, patient records are not synchronized because people seek treatment at different points in the course of their disease.
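One simple way to approach the synchronization problem is to re-express every visit relative to a shared reference event, so that records can be compared by time elapsed rather than by calendar date. The sketch below does this using the date of diagnosis as the reference. That choice, along with every identifier and value shown, is an assumption for illustration; the article does not say how ProSanos handles the problem, and a diagnosis date only approximates where a patient really is in the course of a disease.

```python
from datetime import date


def align_to_index_event(visits, index_dates):
    """Re-express each visit as days elapsed since the patient's index event.

    visits: list of (patient_id, visit_date, measurement) tuples.
    index_dates: dict mapping patient_id to the date of the index event.
    Visits for patients with no index date are skipped.
    Returns (patient_id, days_since_index, measurement) tuples sorted by
    elapsed days, then by patient id.
    """
    aligned = []
    for patient_id, visit_date, value in visits:
        start = index_dates.get(patient_id)
        if start is None:
            continue  # cannot place this visit on the shared timeline
        aligned.append((patient_id, (visit_date - start).days, value))
    return sorted(aligned, key=lambda row: (row[1], row[0]))


# Hypothetical de-identified records.
index_dates = {"p1": date(2001, 1, 10), "p2": date(2001, 6, 1)}
visits = [
    ("p2", date(2001, 7, 1), 6.0),
    ("p1", date(2001, 1, 10), 5.1),
    ("p1", date(2001, 2, 9), 4.8),
    ("p3", date(2001, 3, 3), 7.2),  # no index date on file
]
for row in align_to_index_event(visits, index_dates):
    print(row)
```

After alignment, the second visit for p1 and the only visit for p2 both fall 30 days after diagnosis, even though they took place months apart on the calendar.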

Pathological definitions of many diseases are imprecise, and overall, health and disease are complex, dynamic concepts. Hormonal fluctuations related to menopause, for example, can influence a woman's risk of breast cancer, but it's difficult to include this information in genetic analyses. "Biology and disease involve a continuous series of events and feedback," Liebman says. "We need better software and knowledge to model these."

Fortunately, companies are delivering these new tools. One of the keys will be combining analysis of multiple data types. Bozeman, Mont.-based Golden Helix Inc. says its Helix Tree software can relate thousands of interacting genes and environmental factors to clinical outcomes. Cambridge, Mass.-based Xpogen's new PathlinX application can reveal connections between genes, phenotypes, and "any other quantifiable measures," the company says.

Liebman has tested the tool and is impressed. "[PathlinX] uses relevance networks, which provide a very useful way to look at this data," he says. "It's a step forward in the evolution of our approaches to managing this type of information."
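For readers unfamiliar with relevance networks, the general idea is to score every pair of measured features for association and then link only the pairs whose score clears a threshold. The sketch below uses Pearson correlation as the score, which is one common choice (mutual information is another). It is a generic illustration with made-up data, feature names, and threshold, and should not be read as how PathlinX itself is implemented.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def relevance_network(features, threshold=0.9):
    """Link every pair of features whose absolute correlation meets the threshold.

    features: dict mapping a feature name to a list of values, one per patient.
    Returns a list of (feature_a, feature_b, correlation) edges.
    """
    edges = []
    for a, b in combinations(sorted(features), 2):
        r = pearson(features[a], features[b])
        if abs(r) >= threshold:
            edges.append((a, b, r))
    return edges


# Hypothetical measurements for five patients: two expression levels,
# a clinical outcome, and a demographic variable.
features = {
    "gene_x": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_y": [2.0, 4.0, 6.0, 8.0, 10.5],
    "drug_response": [5.0, 4.0, 3.0, 2.0, 1.0],
    "age": [50.0, 30.0, 60.0, 40.0, 55.0],
}
for a, b, r in relevance_network(features):
    print(f"{a} -- {b}: r = {r:+.3f}")
```

Because the score is computed the same way for any pair of numeric columns, genes, clinical outcomes, and other quantifiable measures can all sit in one network. In this toy data the two genes and the drug response end up linked to one another, while age stays unconnected.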




For reprints and/or copyright permission, please contact  Tim McLucas, (781) 972-1342, tmclucas@healthtech.com .