Oct. 9, 2002 | The Protein Data Bank (PDB) is not the first database that Director Helen Berman has run. She and several other members of the Research Collaboratory for Structural Bioinformatics (RCSB) team cut their teeth on the Nucleic Acid Database (NDB), which was established in 1991.

Although DNA, RNA, and protein-nucleic acid structures have always been welcome in the PDB (despite the noninclusive nature of its name), the NDB team realized early on that a relational database would be of greater benefit to the nucleic acid research community than the flat files found in the PDB. So they set out to build one. The NDB vision was a well-curated database of primary structural results and derivative data that would allow complex queries and comparisons of nucleic acid structures.

The NDB team tested the mmCIF (macromolecular Crystallographic Information File) format and found that it worked well. They introduced structure validation software and automated data processing procedures, and added resource links to their Web site to help scientists. In effect, the NDB experience made the RCSB highly qualified to take over the larger, more complex PDB in 1998. The NDB's software and systems were easily transferable to the PDB, since most of the major bugs had already been eliminated.

Berman still oversees the NDB, which, unlike the PDB, is funded by a research grant that enables it to act as an incubator of ideas. Berman is cautious about altering the PDB archive but can play around a little more with the NDB.

To come up with new ways of doing things, biological databases need those rare individuals who are fluent in two languages: science and computers. This past summer, the NDB rooms were full to overflowing with just such a group of undergraduate interns, double majors in biology and computer science who were eagerly collaborating on new tools for the site. Berman clearly approves: "I wish I knew how to bottle this recipe so that we'd have it every year."

