CONVERSATION | Charles DeLisi, who helped conceive the Human Genome Project, turns to systems biology and an AIDS vaccine

Interview by Kevin Davies

Sept 16 2004 | According to the citation that accompanied his Presidential Citizens Medal, Boston University's Charles DeLisi was the first government scientist to conceive and outline the feasibility, goals, and parameters of the Human Genome Project. Currently the senior associate provost for biosciences at Boston University, DeLisi remains a leading figure in computational biology, with interests ranging from biosimulation to AIDS vaccine development. His long-range goal is "to relate expression patterns to pathways, pathways to networks, and networks to function." His bioinformatics graduate program will shortly move into a new 10-story, 184,000-square-foot multidisciplinary research building. Kevin Davies spoke with DeLisi in his office.

Q: What is your research background?
A: My A.B. and Ph.D. (from City College of New York and NYU) are both in physics. I had an early interest in genetics, but never took biology in college. I did watch it from a distance, was fascinated by its increasing conceptualization, and by the end of my graduate school days was pretty much hooked. At the time I felt my biggest obstacle to really delving into biology was not knowing chemistry — I had taken only a semester in college. So I spent the next three years in the Yale chemistry department learning the language of organic chemistry, and developing methods to calculate RNA secondary structure. My first position was in the theoretical division at Los Alamos, where, ironically, I developed an interest in immunology and cell biology.

After three years, I moved to the NIH intramural program as a visiting scientist. It was — and still is — an immunological mecca, and when I was offered tenure a year later, I couldn't resist staying. At NIH, I was doing mostly immunology, but I was watching molecular biology, and spent some time in the director's office when they were starting to consider the development of GenBank. NIH understood the importance of computers in storing and managing data, but the extramural system didn't appear ready to accept the computer as an analytical engine. The idea of devising algorithms that would look for biological function was pretty much beyond the way most people thought about things. The intramural program offered more flexibility, and I invited Minoru Kanehisa (now a professor at Kyoto University, Japan) — a brilliant computer person and first-rate biologist — to NIH, where we developed the first relational database management system for protein sequences, DNA sequences, structures, etc.

Q: How did you get involved in planning the Human Genome Project?
A: The obvious question [around 1982] was, would we ever get the whole genome sequence? I dismissed the possibility because I didn't feel the biomedical culture was going to accept a project of that size and complexity. We were developing codes for intron-exon boundaries, using Bayesian statistics in the early '80s, but when I gave talks, I'd get these blank stares. It didn't take hold, it was so foreign to people ... People didn't feel overwhelmed by data — they felt they could still handle things in the laboratory. And at that particular moment in history, they were right — but they didn't see the tsunami.

I went to DOE (Department of Energy) [in 1985] as director of their health and environmental research programs, and after three months a report appeared on my desk from the Congressional Office of Technology Assessment (OTA). That report mentioned sequencing the entire human genome. People on that committee were fairly prominent biologists, so I got some indication that someone else in the world thought this wasn't a totally nutty thing to do. Having a reference genome would be spectacularly important. Robert Sinsheimer had held a workshop in May 1985, but nothing had come of it. They couldn't convert that interest into policy. At DOE, we decided to sponsor a workshop [in Santa Fe] of leading molecular biologists and geneticists to get our own sense of what the community felt.

Q: This was initially just a DOE project?
A: When we held the Santa Fe workshop, we invited every federal agency to it, and no one was interested! All these agencies now, they have genomics as their underpinning, they had no idea what was going on. NIH was hesitant ... but a National Academy of Sciences committee strongly endorsed the project, so that induced NIH to move forward. I was pretty comfortable by then that the project had a life of its own. I moved to BU in 1990 and basically returned to structurally based immunology, and didn't look to genomics until about six years ago. When I started reorienting my own research, it was like waking up after 20 years ... Everything that was an important problem 20 years ago was unrecognizable, just unbelievable.

Q: How important to the pace of the genome project was Celera?
A: I think industrial involvement has been very important to setting the pace. If [J. Craig] Venter and Celera hadn't come on the scene, the genome project would still be going on. When I left Washington in 1987, I had estimated 2000-2001 as the completion date. It was, in my opinion, unnecessarily stretched out to 2006. The feds accelerated their schedule in response to Celera. The reason it went as well as it did is that the economy was spectacular. If not, venture money would not have been as plentiful, and Celera might never have been formed.

Q: What are you working on now?
A: I'm still interested in immunology, including infectious diseases. I'm starting to work with teams in India, Maui, and Thailand to develop a vaccine for AIDS. The research goes from basic chemistry to clinical testing, with several points where you can't do things without high-intensity computing, especially at the end of the project, which involves a lot of integration. We'd like to develop a vaccine component for cellular immunity, but an effective vaccine is probably a decade away, assuming it's achievable.

Q: How is computation involved?
A: Our goal is to develop an epitope vaccine: we look for immunogenic peptides encoded by HIV, carrying out an exhaustive experimental search of the genome. The virus mutates under selective pressure, so a lot of immunogenic sites have mutated away. In order to get a good vaccine, you may need to go backwards in time ... to examine the genome not just as it now is, but as it once was. It's not going to be easy. Therapy (as opposed to a vaccine) is possible, but when you have a disease with essentially no cases of natural immunity, you have to be doubtful whether a vaccine will be found.

One aspect of the computational component is in designing immunogenic peptides. You need to screen computationally because there are too many combinations to explore them all experimentally. They have to be designed so that they bind with high affinity ... it's not likely we'll find a single vaccine that covers the whole human population. Selecting a set of immunogenic peptides — from a group of candidates that have been obtained by an exhaustive search — that optimizes cost, efficacy, and population coverage is a very complex computational problem.
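[Editor's illustration: The selection problem DeLisi describes resembles a budgeted coverage problem. The sketch below is a minimal greedy heuristic under loud assumptions, not the method used by his group: the peptide names, costs, population groups, and frequencies are all hypothetical, each person is assumed to belong to exactly one group, and efficacy is not modeled at all.]

```python
from typing import Dict, List, Optional, Set, Tuple


def select_peptides(
    candidates: Dict[str, Tuple[float, Set[str]]],
    group_freq: Dict[str, float],
    budget: float,
) -> Tuple[List[str], float]:
    """Greedy budgeted coverage (a heuristic, not an exact optimizer).

    candidates maps a peptide name to (cost, population groups it covers).
    group_freq maps each group to its fraction of the population.
    Costs are assumed to be strictly positive.
    """
    chosen: List[str] = []
    covered: Set[str] = set()
    spent = 0.0
    while True:
        best_name: Optional[str] = None
        best_ratio = 0.0
        for name in sorted(candidates):  # sorted for deterministic ties
            cost, groups = candidates[name]
            if name in chosen or spent + cost > budget:
                continue
            # Newly covered population fraction per unit cost.
            gain = sum(group_freq[g] for g in groups - covered)
            ratio = gain / cost
            if ratio > best_ratio:
                best_name, best_ratio = name, ratio
        if best_name is None:  # nothing affordable adds coverage
            break
        cost, groups = candidates[best_name]
        chosen.append(best_name)
        covered |= groups
        spent += cost
    coverage = sum(group_freq[g] for g in covered)
    return chosen, coverage


# Hypothetical inputs, invented purely for illustration.
group_freq = {"G1": 0.4, "G2": 0.3, "G3": 0.2, "G4": 0.1}
candidates = {
    "pepA": (1.0, {"G1", "G2"}),
    "pepB": (1.0, {"G2", "G3"}),
    "pepC": (2.0, {"G3", "G4"}),
    "pepD": (1.0, {"G4"}),
}
chosen, coverage = select_peptides(candidates, group_freq, budget=2.5)
print(chosen, round(coverage, 3))
```

[A greedy pass like this is fast but can miss the best combination; a realistic formulation would weigh efficacy and binding affinity as well, and would likely call for integer programming or a similar exact method.]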

Q: What is the focus of the BU bioinformatics program?
A: We're doing systems work, not just bioinformatics. We cover the gamut. It's primarily computational/mathematical, but with some engineering of new high-throughput technologies, and numerous collaborations with experimental biologists, clinicians, and engineers. We're really beginning to map out the circuitry of the cell and model it. We look at all genomes, and network orthologies. We have our own clusters, and the Center for Computational Science supports high-performance computing, using IBM. The bioinformatics program spans engineering, arts and sciences, and medicine. We have over 100 students in our program, about 70 Ph.D. students. I wanted to plateau at around 50 Ph.D. students, but we've had some spectacular students coming through this program. We share students with faculty at Harvard and MIT, and have a number of industrial partners, including Pfizer, Serono, and IBM.

Q: What would you consider a milestone in systems biology?
A: Mapping the transcriptional network of a eukaryotic cell: understanding its functional organization, along with a predictive understanding (probably at some higher-order supramolecular level) of response to changes in environment. Data alone won't get us there; it's going to require much better methods than we currently have for data integration and analysis.
