DogBox Aims to Take a Bite Out of Discovery Time



Unleashed Informatics, a spin-off of the Blueprint Initiative at Mount Sinai Hospital in partnership with Sun Microsystems of Canada, announced DogBox, a self-updating bioinformatics warehouse.

The DogBox is a hardware/software appliance that includes the Blueprint SeqHound and other public bioinformatics databases.

SeqHound is a database of biological sequences and structures. Specifically, the database combines 3D structure, annotation, sequence, and taxonomy information. SeqHound is updated daily by gathering information from a number of sources, including the National Center for Biotechnology Information and the Gene Ontology Consortium. Additional databases included with DogBox are PDB, GenBank, SwissProt, and several other public databases.

The pre-configured DogBox system includes a SunFire V20z dual-Opteron server with 4 GB of memory and a 3 TB Sun StorEdge FC3511 storage system. The SeqHound database is automatically updated on a regular basis (typically, each night after new entries have been added by Blueprint researchers).

Why Pay?

One obvious question about the DogBox is, why pay for a database that is publicly available and offered for free?

There are three reasons why a company would consider a product like DogBox: performance, integration, and security.

Using a dedicated internal device to address performance, integration, and security issues is an approach that is increasingly being adopted in many areas of research. A well-known example of the approach is the Google Search Appliance, a dedicated search device offered to companies.

On the performance front, a dedicated, internal device like the DogBox eliminates Internet-related delays that can occur when a query must travel over the public network to reach a database server. Additionally, a public database's performance is not guaranteed and can degrade significantly when it is handling many simultaneous requests. A dedicated device on the company network avoids both problems.

With regard to integration, life science companies often use the data in a public resource as part of a larger application. For instance, an informatics application might have a workflow in which the results of an experiment are used in one step to query a database, and the answer returned is then used in the next step of the computation.

Incorporating calls to public databases in a workflow is common, but the DogBox allows for tighter coupling between an application and the SeqHound database. For example, the DogBox includes application programming interfaces (APIs) that let a company write applications that directly query the system.
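To make the workflow idea concrete, here is a minimal sketch in Python of a two-step pipeline: identifiers produced by an experiment are used to query a locally hosted database, and each returned sequence feeds the next computational step. Everything in it is an assumption made for illustration. The identifiers, the sequences, and the lookup_sequence function are hypothetical stand-ins and are not part of the SeqHound or DogBox APIs.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a locally hosted sequence database.
# The identifiers and sequences below are invented for illustration.
LOCAL_SEQUENCES: Dict[str, str] = {
    "ID_A": "MKTAYIAKQR",
    "ID_B": "GAVLIPFMW",
}


def lookup_sequence(identifier: str) -> str:
    """Hypothetical query function; a real appliance would expose its own API."""
    return LOCAL_SEQUENCES[identifier]


def hydrophobic_fraction(sequence: str) -> float:
    """Next workflow step: fraction of residues drawn from a small hydrophobic set."""
    hydrophobic = set("AVLIPFMW")
    return sum(1 for residue in sequence if residue in hydrophobic) / len(sequence)


def run_workflow(hit_ids: List[str], query: Callable[[str], str]) -> Dict[str, float]:
    """Step 1: query the database for each hit. Step 2: analyze the answer."""
    return {hit_id: hydrophobic_fraction(query(hit_id)) for hit_id in hit_ids}


# Identifiers that an earlier experimental step is assumed to have produced.
results = run_workflow(["ID_A", "ID_B"], lookup_sequence)
for hit_id, fraction in results.items():
    print(f"{hit_id}: {fraction:.2f}")
```

Because the query function is passed in as a parameter, the same workflow could in principle point at a public server or at an internal appliance without changing the analysis step.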

The third reason for using an internal appliance is security. A query sent to a public database could, in theory, be intercepted by a hacker. A hacker capturing these outbound queries could get information about the research efforts going on within the company, such as which molecules and potential drug targets are being investigated.

While this may seem far-fetched, some life science companies are taking this threat to their intellectual property and research efforts very seriously. One industry analyst noted that he recently visited a company that maintains roughly 120 public databases internally to protect information that might be derived from queries sent to these databases.

The DogBox bioinformatics warehouse appliance was announced in May and is available now.



For reprints and/or copyright permission, please contact  Tim McLucas, (781) 972-1342, tmclucas@healthtech.com .