Scientific insight is still a precious commodity, but those "computing machines" have matured into indispensable components of modern biosciences research. The greatest triumph in genetics since the double helix—the completion of the Human Genome Project—was only possible through extraordinary advances in computational speed, storage capacity, and custom software. Now, the marriage of computational and traditional biology—gray technology and green technology, as physicist Freeman Dyson puts it—is helping to annotate and interpret the genome sequences of humans, mice, and many other model organisms by identifying novel genes and their families; interpreting patterns of gene expression using biochips; and cataloguing millions of single-base differences (SNPs) between unrelated individuals that shape susceptibility to common diseases such as cancer and heart disease, and hold the key to a new world of personalized medicine.
The Human Genome Project was widely touted as the greatest technological accomplishment since the invention of the wheel or the Apollo moon landings, but as geneticist Sydney Brenner points out, "Landing a man on the moon is the easy part; it's getting him back that's the problem." Mining the human genome is a massive computational problem, but nothing compared to the daunting problems posed by proteomics—the total characterization of the identities, structures, complexes, networks, and locations of all the proteins in the body. (In a bygone era, it would have sufficed to term this initiative biochemistry, but marketing has its place in biotechnology as in any other growth industry.) Understanding the properties of a single protein is hard enough. It takes a Cray T3 a couple of months to simulate the folding of an average protein in silico; the natural process takes mere microseconds in vivo. And even if we possess only a paltry 40,000 or 50,000 genes, there are likely to be ten times as many proteins expressed in the human body, each linked in a maze of complexes, pathways, and networks. The storage capacity required for proteomics is orders of magnitude greater than the tens of terabytes required for the human genome.
But even proteomics is just one more step—albeit a large one—in the quest to comprehend all cellular and physiological processes. After the proteome, scientists will seek to understand the function of whole cellular compartments and organelles, before embarking on models of intact cells and entire organs. Indeed, computational biologists are already talking about initiatives to tackle the "metabolome," the "morphome," and yes, the "physiome."
Managing this deluge of data is a top priority for researchers in academia and the biotechnology and pharmaceutical industries, particularly to assist the onerous process of drug discovery. According to a recent Tufts University study, developing a drug now takes an average of 15 years and costs over $800 million. Consider cancer, the second leading cause of death in the United States after heart disease. According to the American Cancer Society, the lifetime risk of cancer is now two in five. One person dies from cancer every 60 seconds. Against this backdrop, any technology that can expedite target selection, validation, and ultimately approval will provide not only a profound competitive advantage but also—and more importantly—save lives. Advances in IT have already helped reveal thousands of new potential drug targets, and are improving the efficiency of screening, the sharing of data, and the management and administration of clinical trials. The first tantalizing hints of a new era in rational drug design are already apparent: a mere six years after publication of the first research paper on an anti-leukemia drug named STI-571, Novartis' Gleevec (as the drug is now popularly known) won approval in record time from the U.S. Food and Drug Administration.
Given these extraordinary opportunities, it is no surprise that many giants in the computer industry have become entranced by the new biology. "My industry is going to become pretty boring soon," muses Oracle CEO Larry Ellison, adding, "life sciences is really important ... and we want to be a part of it." Compaq built the supercomputer at Celera Genomics that was needed to store and assemble millions of DNA fragments. IBM is building "Blue Gene," the first petaflop computer, to study protein folding. And Hitachi and Oracle have formed a $185-million alliance with Myriad Genetics to analyze the human proteome. Many other companies are following suit. While IT growth in virtually all other sectors is modest at best, IT expenditure in the life sciences is predicted to reach $28 billion by 2005, according to research from IDC (International Data Corp.), growing annually at some 22 percent. IBM's Lou Gerstner puts it this way: "The marriage of information technology with life sciences research and genetics ... represents the next major revolution."
The convergence of the life sciences and IT industries will be the centerpiece of a new era of cross-disciplinary research and global data sharing that will fundamentally transform biomedical research. With such sweeping changes, there is an urgent need for new sources of information that can illuminate the opportunities in the biosciences for IT companies, and educate life scientists about the latest advances in computer science and technology and their relevance.
With this goal in mind, we are proud to introduce Bio·IT World—an ambitious new publication dedicated to providing authoritative and comprehensive coverage of the exciting convergence of IT and life sciences research. Each month, Bio·IT World will present a combination of news, reviews, features, and opinions that will highlight not only the key business and scientific trends across the Bio-IT arena, but also the latest tools and technologies to help researchers identify and take advantage of the opportunities before them. A recurring theme in each issue will be developments in the core "technology pillars" of the Bio-IT marketplace: computer architecture and databases, knowledge management, storage, networks and consulting on the one hand; bioinformatics, genomics, proteomics, and computational biology on the other.
Bio·IT World is the newest publication from our parent company IDG (International Data Group), publisher of ComputerWorld, NetworkWorld, PCWorld, CIO, and more than 300 publications around the world. We have assembled a truly distinguished group of editors and writers at Bio·IT World with extensive experience in all facets of IT, bioinformatics, and life sciences research. In addition, we are working closely with an international network of IDG news correspondents stationed in Asia, Europe, and the Americas, dedicated to bringing you important developments wherever they occur. Our intent is not merely to report the news and events of the industry, but to help define, interpret, and predict new trends, critical issues, and significant advances across the Bio-IT field. In particular, we will strive to close the enormous communication gap between most IT specialists and life scientists—a gap that could seriously compromise the Bio-IT revolution.
In that spirit, the centerpiece of this special launch issue is a feature entitled "Champions of the Bio·IT World," a fascinating look at the priorities, goals, and ambitions of an extraordinarily diverse group of 60 companies that represent the convergence of IT and life sciences research, in the words of chief executives and senior officers from each firm.