August 8, 2008
A Broad View


April 1, 2008 | The Broad Institute of Harvard and MIT is running 20 Illumina Genome Analyzers, three 454 GS FLX instruments in production, and three ABI SOLiDs, according to Toby Bloom. She manages the informatics pipelines for the Broad's sequencing platforms - old and new - for applications ranging from medical re-sequencing to epigenomics to pathogen genome sequencing.

Bloom says most next-gen vendors provide "fairly sophisticated pieces of software." The Broad staff uses much of it, including the image processing, while recommending improvements to certain vendors, for example on quality scoring. "We may come up with our own algorithms and feed that back to the vendors," she says. "Of course, for assemblies, alignments, mutation calling, we're looking at our own software as well."

Despite the Broad's considerable resources, Bloom's team has made sweeping changes to its data pipeline of late. "On the data management side, the old pipeline dealt with one read at a time," says Bloom. "Now, we deal with plate by plate or region by region or lane by lane. The data aren't stored in individual files but in batches." Another issue, she says, is that "you're dealing with large numbers of small reads, not small numbers of large reads."
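The shift Bloom describes, from handling one read at a time to handling a whole lane as a unit, can be illustrated with a short sketch. This is not the Broad's pipeline: the `Read` record, the `batch_by_lane` and `write_batch` helpers, and the FASTA-style output are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Read(NamedTuple):
    """Hypothetical record for one short read (not the Broad's schema)."""
    lane: int
    name: str
    bases: str


def batch_by_lane(reads):
    """Group reads into one batch per lane, preserving input order."""
    batches = defaultdict(list)
    for read in reads:
        batches[read.lane].append(read)
    return dict(batches)


def write_batch(reads):
    """Serialize a whole batch as a single FASTA-style text blob,
    rather than one file per read."""
    return "".join(f">{r.name}\n{r.bases}\n" for r in reads)


reads = [
    Read(1, "r1", "ACGT"),
    Read(2, "r2", "GGCA"),
    Read(1, "r3", "TTAG"),
]
batches = batch_by_lane(reads)
```

With millions of short reads per run, one batch per lane keeps the number of stored objects small, which is the practical point of the change.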

Store 24/7
Bloom says the core LIMS includes "added information about the new steps to help the lab track what their orders are. It's very different managing the lab to do large numbers of small projects. A mammalian genome would take several months to go through the lab using older technology... They now need more support for keeping track of everything."

To handle the storage demand, the Broad has 300 TB of Isilon high-speed parallel access storage, with more on the way. "We do a bunch of our work on SunFire 4500s, or Thumpers," says Bloom. These are reasonably inexpensive file servers, each with 15-20 TB of usable storage. "We actually use them to pull the images off the machines as they're being generated, so we don't have to stop the sequencers to do any processing on them between runs."

Bloom says the SunFires have "enough processing capability that we can do cycle by cycle processing." Once the image data are processed, the results are fed into the Isilon storage and core compute facility. Bloom says the images are stored "in case we need to go back to them, for a month or two. We leave them behind on the Thumpers - they never go anywhere else."

But even the Broad Institute can't store image files forever. "I don't think it's particularly useful; it's rare we'd ever go back to them," says Bloom. "What we do store forever is a sampling of the images on each run." Archiving a few images from each cycle enables troubleshooting of potential machine problems. --K.D.
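The retention approach described above (full image sets kept for a month or two, with a few images per cycle kept permanently) can be sketched as follows. The 60-day window, the two-images-per-cycle sample size, and the function names are assumptions for illustration; the article gives no exact figures or implementation details.

```python
import random
from datetime import date, timedelta

RETENTION = timedelta(days=60)  # assumed stand-in for "a month or two"
SAMPLES_PER_CYCLE = 2           # assumed stand-in for "a few images"


def pick_archive_sample(images_by_cycle, k=SAMPLES_PER_CYCLE, seed=0):
    """Choose up to k images from each cycle to keep permanently."""
    rng = random.Random(seed)  # fixed seed keeps the choice reproducible
    return {
        cycle: sorted(rng.sample(sorted(images), min(k, len(images))))
        for cycle, images in images_by_cycle.items()
    }


def is_expired(run_date, today, retention=RETENTION):
    """True once a run's full image set is older than the retention window."""
    return today - run_date > retention


run_images = {1: ["a.tif", "b.tif", "c.tif"], 2: ["d.tif"]}
archive = pick_archive_sample(run_images)
```

The sampled images would be copied to long-term storage when the run completes, and the remaining images deleted once `is_expired` returns true for the run.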


___________________________________________________

 This article appeared in Bio-IT World Magazine.

