Hardware: Heavy number-crunching takes place on 400 CPUs: 200 Compaq Alphas and a 200-CPU Dell Linux cluster. The database is housed on an ES40 Sun Server with a 160GB capacity. Flat files linked to the database are stored in a disk array. Desktop machines are either Linux- or Windows-based; a handful of SGI workstations used by the crystallographers will probably be replaced soon by Linux machines with dual 1GHz processors, according to Howard Hackworth, senior database architect.
Software: A homegrown Oracle database includes 200 different tables and relies heavily on large objects (LOBs), which give users access to large amounts of unstructured data stored in external files. The database, currently only about 60GB thanks to data reduction techniques and strict selectivity about what is included, is used throughout the company. It contains every shred of useful information about each protein and co-crystal studied, from the initial PCR (polymerase chain reaction) through the final annotation of the completed protein structure, which is stored in a separate but linked database called the Structure Information Repository (SIR).
Staff: Of SGX's 130 employees, 15 are in the bioinformatics group and 10 are in a computational chemistry group based in San Francisco (a result of the acquisition of computer-modeling specialist Prospect Genomics in April 2001). Half a dozen of the bioinformatics staff worked on development of the database. Two are occupied full-time developing data-mining techniques.
Automation: SGX collaborated with RoboDesign International Inc., of Carlsbad, Calif., on developing two automated systems: one for storing and retrieving crystals, and one for inspecting plate samples to monitor, score, and report on crystal growth. The storage-and-retrieval system can house up to 15 million samples in 10,000 multiwell plates, and can process as many as 360 plates per hour. The inspection system can process one image per second. Both systems feed information into databases that are compatible with the company's laboratory information management system (LIMS).
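To put the automation figures in perspective, the short Python sketch below works through the arithmetic they imply. It uses only the numbers quoted above; the idea of a "full pass" (handling every plate once at the peak rate) is an illustrative assumption, not something SGX or RoboDesign has described.

```python
# Figures quoted in the Automation paragraph.
MAX_SAMPLES = 15_000_000   # storage capacity, in samples
MAX_PLATES = 10_000        # storage capacity, in multiwell plates
PLATES_PER_HOUR = 360      # peak storage-and-retrieval throughput
IMAGES_PER_SECOND = 1      # inspection system throughput

# Average samples per plate implied by the two capacity figures.
samples_per_plate = MAX_SAMPLES / MAX_PLATES

# Assumption (hypothetical scenario): every plate is handled once
# at the peak rate, with no idle time in between.
hours_for_full_pass = MAX_PLATES / PLATES_PER_HOUR

# Images the inspection system could process in one hour.
images_per_hour = IMAGES_PER_SECOND * 3600

print(f"Samples per plate: {samples_per_plate:.0f}")
print(f"Hours to handle every plate once: {hours_for_full_pass:.1f}")
print(f"Images inspected per hour: {images_per_hour}")
```

On these numbers, a fully loaded store averages 1,500 samples per plate, and cycling through all 10,000 plates at the peak rate would take a little under 28 hours.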
Network: A T1 line connects the San Diego crystallization facility with the beamline at Argonne National Laboratory in Illinois. Once the samples are loaded for study, the beamline operation can be controlled either locally or from San Diego, and all data collected goes immediately into the system.