By Gillian Law, IDG News Service
February 10, 2003 | British scientists will soon have access to a national e-science computing grid for both exchanging information and sharing processing power. The e-science grid connects a national center in Edinburgh to eight e-science centers at universities around the United Kingdom. The latest center, at the University of Southampton, in England, was connected last October.
National e-Science Centre (NeSC) Research Manager Dave Berry says the first applications will be run in March of this year. “There’s still work to do in making it robust,” he says. “But the infrastructure is there, and it’s just low-level engineering work that needs done, making sure that centers can communicate through one another’s firewalls and setting up the PKI [public key infrastructure] to recognize and authorize people.”
NeSC was established in April 2001 as a joint project between the University of Glasgow and the University of Edinburgh. It receives funding from the U.K. Department of Trade and Industry’s e-Science program, the Scottish Higher Education Funding Council, and the universities themselves. The Department of Trade and Industry’s Office of Science and Technology has allocated approximately £200 million over five years for the e-science project, according to Berry.
The eight e-science centers are at Queen’s University in Belfast, Northern Ireland; the universities of Cambridge, Oxford, Manchester, Newcastle upon Tyne, and Southampton, in England; the University of Cardiff, in Wales; and Imperial College of Science, Technology and Medicine in London. The Rutherford Appleton Laboratory and Daresbury Laboratory, both in England, are also connected. A Grid Support Center has been established at the Rutherford Appleton Laboratory to provide support to the centers and to anyone developing e-science grid applications.
The grid will have two main functions, Berry says: “There’s the notion of using remote computing facilities, where you need a large amount of processing power for some work but can’t buy a whole system yourself. Instead, you can use the grid. And then there’s the database side that we’re particularly interested in. Data manipulation, data curation, just making information available for scientists.”
The project will link scientists and allow them to exchange expertise, knowledge, and data, Berry says. It will not be made available to outside users, such as companies needing computing resources, says John Gordon, group leader at the e-Science Centre at the Rutherford Appleton Laboratory.
“Mainly for political reasons, it will be used by only the universities involved. They have built the thing, and they’re not going to be selling time on it. Though it’s feasible that this could act as a prototype for a commercial project, where someone sets up an aircraft hangar full of computers and sells time,” Gordon says, adding that companies will have access to the resource only by financing university research projects.
Richard Baldock is on temporary assignment to NeSC from the Medical Research Council Human Genetics Unit in Edinburgh, working several days a week on building applications for genetic research using the e-science grid. “I look at gene expression data, look at genes, and work out what they do,” he says. “The goal is to identify what each does and how [it controls] the development of an embryo. When we do experiments, we get lots of little snapshots of data, and we need to bring all that together and be able to perform searches and queries on it.”
‘The Interesting Stuff’
A spatial framework has to be developed on four dimensions -- three being space, the fourth being time -- so that researchers can map the information they have and look for patterns and relationships, Baldock says. “The grid comes into that in lots of ways -- in terms of computing resources for capturing large amounts of data and then presenting them in a form where they can be visualized. And also, all that information is very computing intensive. All embryos differ, so there’s a lot of computing power needed to bring them together on the same framework and be able to make comparisons.”
“Then, you get to the interesting stuff. Analysis -- sharing data across large databases, interoperability between gene sequence and protein databases. Eventually, we want to be able to do modeling and simulation using all that information,” Baldock says.
Several grid projects have been developed so far, according to Berry. “There are applications for particle physics and for astronomy being developed, but they’re only running on restricted sets of machines at the moment,” he says, adding that nothing will go live until March.
Other countries, including the United States, are developing grid systems of their own, and the U.K. project is working with others to look at ways of exchanging data and linking resources, Berry says. The U.K. grid is being developed using Globus 2.0 software from the Globus Project, which allows each center to connect its own hardware and software setups. According to Berry, later expansion to connect to U.S. centers should be no problem.
The U.K. universities are predominantly connecting Linux environments to the grid, along with Sun Solaris, IBM AIX, and Hewlett-Packard Tru64 Unix platforms. The hardware being connected includes Sun, SGI Origin, and IBM supercomputers; desktop PCs; and a Sony PlayStation at the University of Manchester.
The computing power available once the grid is complete will vary according to which machines are connected at any one time and how the resource is shared between universities and disciplines, Gordon says. “Eventually, we will have supercomputers on it -- possibly including one of the 10 biggest nonmilitary supercomputers in the world -- plus large Linux clusters, like the 500 CPUs we have here at Rutherford for particle physics research.”
As the grid is being built, each center is making only some of its resources available, “but it’s very simple to add more once real applications are being demonstrated and used,” Gordon says.
The United Kingdom’s e-science grid is on par with work being done elsewhere, Berry says, “although there are areas where we’re ahead of the United States. For example, there’s a lot more focus here on databases, on data manipulation, and making information available.”
The e-science grid is the “oil in the work -- the technological solution to help all the biological work being done,” Baldock says. “This will take 10 to 20 years -- it’s a very big, open-ended thing. But the building blocks are being put in place. Ultimately, the grid could help biologists come together and work together, collaborating. The next few years will be very exciting.”