June 14, 2006 | Researchers at the Broad Institute have released an updated version of the popular GenePattern software suite. The version 2.0 release was announced by Michael Reich, Jill Mesirov, and colleagues in the May issue of Nature Genetics, and is freely available for download from the Broad Institute Web site.
The software, originally released in March 2004, has more than 2,300 registered users at more than 500 institutions and 30 biopharma companies. GenePattern won the Editor’s Choice Award in last year’s Bio•IT World Best Practices competition. The key features in GenePattern 2.0 are a new suite of modules for the analysis and visualization of proteomics data and additional support for reproducible research, allowing captured analyses to be rerun with different parameters substituted.
Michael Reich, GenePattern team leader and head of cancer informatics development at the Broad, commented: “The strengths that GenePattern brings to gene expression analysis can now be similarly realized for proteomic data. We have also added features to improve its ability to capture and reproduce analyses, which is vital to researchers both individually and as a community.”
“These new capabilities are a vivid illustration of the tool’s flexibility and adaptability, as well as our commitment to the goal of reproducible research,” added Mesirov, Broad Institute chief informatics officer.
Further enhancements are said to be waiting in the wings, including modules to analyze single nucleotide polymorphism (SNP) and genotype data, and templates for creating text documents. In addition, there are plans to provide infrastructure to support remote access over the Web as part of the National Cancer Institute’s Cancer Biomedical Informatics Grid (caBIG) project (see “Is caBIG Ready to Bloom?” April 2006 Bio•IT World, p. 33).
In their Nature Genetics correspondence, Reich and Mesirov trace the origins of GenePattern back to 1999, when Todd Golub, Mesirov, and their colleagues at what is now the Broad Institute (formerly the Whitehead Institute’s Genome Center) published in Science a landmark molecular profiling study. The study distinguished two forms of leukemia, acute lymphoblastic leukemia (ALL) and acute myelogenous leukemia (AML), on the basis of differential gene expression (Golub, T.R. et al., Science 286, 531-7; 1999). But data handling and analysis were tedious, requiring researchers to run software programs manually and export the results into Excel for visualization.
Following publication of the paper, Mesirov and colleagues were deluged with e-mails posing “questions on how to reproduce the results and the analysis details that were not included in the paper.” The development of GenePattern provided a “reproducible research” version of the analysis method by capturing all the necessary steps in an automated, form-based environment.
GenePattern supports researchers with widely varying levels of computational expertise and offers easy access to a range of analytic and visualization tools. Reich and colleagues suggest that another key asset is that GenePattern “supports a mechanism to guarantee the capture and independent replication of published computational methods and in silico results.”