Dec. 2006 / Jan. 2007 | Ushering out the old year and ringing in the new prompts us to reflect on technologies gone by and to extrapolate what we might expect in the coming year. Each of us at The BioTeam has contributed a few topics (in no particular order) that we think are important to watch closely in 2007.
1 Power and Cooling Costs (both in dollars and CO2) — A typical 1U server costs $1 to $2 per day to both heat (provide computing) and cool. On the surface this may not seem like much. Over a year, however, a typical 42U rack can consume as much as $30,000 in electricity and generate CO2 equivalent to burning 4,000 gallons of gasoline. These factors (among others) are driving hardware manufacturers to increase the number of FLOPS per watt, and the introduction of two- and four-core processors has dramatically increased that figure. Look for the number of cores per processor to hit double digits.
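The rack figure follows from the per-server range by simple multiplication. The short Python sketch below reproduces it, assuming a fully populated rack of 42 1U servers running every day of the year. The function name and the full-rack assumption are illustrative only and are not taken from any vendor's figures.

```python
SERVERS_PER_RACK = 42   # assumption: one 1U server in every slot of a 42U rack
DAYS_PER_YEAR = 365


def annual_rack_cost(cost_per_server_per_day,
                     servers=SERVERS_PER_RACK,
                     days=DAYS_PER_YEAR):
    """Yearly power-and-cooling cost in dollars for one rack."""
    return cost_per_server_per_day * servers * days


low = annual_rack_cost(1.0)    # $1 per server per day
high = annual_rack_cost(2.0)   # $2 per server per day

print(f"Low estimate:  ${low:,.0f} per rack per year")    # $15,330
print(f"High estimate: ${high:,.0f} per rack per year")   # $30,660
```

Under these assumptions the $30,000 figure corresponds to the upper end of the $1 to $2 range. A rack that also holds switches or storage shelves would contain fewer servers and land somewhat lower.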
2 “Data-aware” Computing — Companies such as Exludus can layer awareness of file and data location onto common distributed resource management systems such as Platform LSF and Sun Grid Engine. The result is that scientific workflows automatically migrate towards systems where the required datasets already reside or can be quickly obtained. For environments with I/O-heavy scientific computing requirements, the throughput gains can be impressive. Expect to see more “data-aware” computing techniques and technologies in 2007 as a means of reducing network traffic on already saturated storage networks.
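To make the idea concrete, here is a minimal Python sketch of a data-aware placement decision: send the job to the host that would have to fetch the fewest bytes. It is not how Exludus, Platform LSF, or Sun Grid Engine actually work. The function `pick_host`, the host names, the dataset names, and the sizes are all hypothetical.

```python
def pick_host(required, dataset_sizes, host_cache):
    """Return the host that must fetch the fewest bytes to run the job.

    required      -- set of dataset names the job needs
    dataset_sizes -- dict mapping dataset name to size (any consistent unit)
    host_cache    -- dict mapping host name to the set of datasets it already holds
    """
    def bytes_to_fetch(host):
        missing = required - host_cache[host]
        return sum(dataset_sizes[name] for name in missing)

    # Iterate hosts in sorted order so ties are broken deterministically by name.
    return min(sorted(host_cache), key=bytes_to_fetch)


# Hypothetical cluster state; sizes are in gigabytes.
dataset_sizes = {"db_a": 20, "db_b": 5, "reads": 1}
host_cache = {
    "node1": {"db_a"},
    "node2": {"db_b", "reads"},
    "node3": set(),
}

job_needs = {"db_a", "reads"}
print(pick_host(job_needs, dataset_sizes, host_cache))  # node1
```

Here `node1` wins because it only has to pull the 1 GB `reads` dataset, while the other hosts would need to transfer 20 GB or more. A real scheduler would weigh this against CPU load, queue depth, and network topology.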
3 Unified Identity Management — Research computing environments are often managed as independent silos, kept far apart from organizational and enterprise systems. Expect IT efficiency efforts and government reporting requirements to push for unified access control, single sign-on and identity management systems that span Windows, Mac and Unix systems. Companies with identity management systems built on Microsoft’s Active Directory will want to take a serious look at software products from Centrify and Quest Software. Both offer solutions for extending Active Directory authentication and policy-based access control onto non-Windows systems.
4 10-Gigabit Ethernet — Watch for 10-Gigabit networking to spread further into the research computing environment. Previously found mostly in large core datacenter switches, the technology is becoming increasingly affordable and obtainable, with multiple vendors offering reasonably priced 10-Gb edge and workgroup switches.
5 Reconfigurable Accelerator Boards (FPGAs) — FPGA technology has been the “next big thing” in bioinformatics for a while, and 2007 will be no exception. SGI is including an FPGA offering in its new product line, and companies like Mitrionics are providing a development environment, which will support expanded use. As the computational load required for genomics becomes more stable, specific common tools like BLAST will be moved to dedicated, special-purpose hardware.
6 Virtualization — Virtual servers and workstations matured in 2006, and they will see massive adoption in 2007. The imbalance between the growing number of processing cores on a standard workstation and the ability of software to make effective use of more than a couple of those cores will lead users to consolidate more of their computing onto their desktops. The simplest way to do this will be to start virtual servers on the desktop, rather than porting server software to a whole new environment.
7 High-Throughput Genomics, Part 2 — Next-generation sequencing technologies from companies like Solexa and 454 will cause another sharp increase in the sheer volume of data available to researchers. With the output of a single lab machine at around 1 billion bases per day, data centers will again be struggling to keep up with demand.
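For a rough sense of scale, the Python sketch below turns the 1 billion bases per day figure into yearly storage for the called sequence alone. The one-byte-per-base encoding and the machine counts are assumptions chosen for illustration. Quality scores, raw instrument output, and downstream analysis files are not counted here.

```python
BASES_PER_DAY = 1_000_000_000   # per machine, the figure quoted above


def sequence_bytes_per_year(machines=1,
                            bases_per_day=BASES_PER_DAY,
                            bytes_per_base=1,   # assumption: one ASCII character per base
                            days=365):
    """Bytes of called sequence produced per year, before any compression."""
    return machines * bases_per_day * bytes_per_base * days


one_machine = sequence_bytes_per_year()
ten_machines = sequence_bytes_per_year(machines=10)

print(f"1 machine:   {one_machine / 1e9:,.0f} GB per year")     # 365 GB
print(f"10 machines: {ten_machines / 1e12:,.2f} TB per year")   # 3.65 TB
```

Changing `bytes_per_base` is the quickest way to explore other storage formats under the same assumptions.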
8 Commodity SMP machines replacing small clusters — Dual-processor, quad-core systems with 32GB RAM and terabytes of local disk have become commodity systems. Clusters of less powerful systems permitted affordable supercomputing but dramatically affected the way bioinformaticists had to map their computing requirements to the cluster architecture. As 16- and 32-way commodity SMP systems hit the market, will clustering make sense anymore?
9 Multi-terabyte personal storage and effects — Half-terabyte disks can be purchased cheaply at Staples. Desktop systems can be ordered with multiple terabytes of local disk, and soon laptops will be too. As users come to possess terabytes of data, how will this affect a) backup, b) sharing data between users, and c) simply finding their own data?
10 Commercial application of BitTorrent as a P2P file system — Most folks don’t generate multiple terabytes of data, but many will have this much disk on their desk needing to be filled up. Sooner or later, there will be much more disk out there than original content. Is it time to reduce the number of duplicate copies, and backups of duplicate copies, and simply share better? What if we used BitTorrent as a P2P file system, rather than just a file sharing system?
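One way to see how much of that disk holds duplicate content is to hash files piece by piece, the same basic idea BitTorrent uses to identify pieces of a shared file. The Python sketch below is only an illustration of that idea, not an implementation of the BitTorrent protocol or of a file system. The function names, file names, and the tiny piece size are hypothetical.

```python
import hashlib

PIECE_SIZE = 4  # bytes; deliberately tiny so the example is easy to follow


def piece_hashes(data, piece_size=PIECE_SIZE):
    """Split data into fixed-size pieces and return the SHA-1 digest of each."""
    return [hashlib.sha1(data[i:i + piece_size]).hexdigest()
            for i in range(0, len(data), piece_size)]


def unique_piece_ratio(files, piece_size=PIECE_SIZE):
    """Fraction of all pieces that are distinct (1.0 means no duplication)."""
    all_hashes = [h
                  for data in files.values()
                  for h in piece_hashes(data, piece_size)]
    if not all_hashes:
        return 1.0  # nothing stored, so nothing duplicated
    return len(set(all_hashes)) / len(all_hashes)


# Hypothetical desktops holding copies of the same sequence file.
files = {
    "alice/genome.fa": b"ACGTACGTTTTT",
    "bob/genome_copy.fa": b"ACGTACGTTTTT",
    "carol/other.fa": b"GGGGCCCC",
}

print(unique_piece_ratio(files))  # 0.5
```

In this toy example only half of the stored pieces are distinct, so a system that shared pieces by hash would need to keep half as much data. Real savings depend entirely on how much content users actually have in common.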
Email the BioTeam at firstname.lastname@example.org.
Subscribe to Bio-IT World magazine.