February 10, 2012
| Bio-IT World > Computing the Genome


Computing the Genome


Horizons

CONVERSATION | Charles DeLisi, who helped conceive the Human Genome Project, turns to systems biology and an AIDS vaccine

Interview by Kevin Davies
 

Sept. 16, 2004 | According to the citation that accompanied his Presidential Citizens Medal, Boston University's Charles DeLisi was the first government scientist to conceive and outline the feasibility, goals, and parameters of the Human Genome Project. Currently the senior associate provost for biosciences at Boston University, DeLisi remains a leading figure in computational biology, with interests ranging from biosimulation to AIDS vaccine development. His long-range goal is "to relate expression patterns to pathways, pathways to networks, and networks to function." His bioinformatics graduate program will shortly move into a new 10-story, 184,000-square-foot multidisciplinary research building. Kevin Davies spoke with DeLisi in his office.

Q: What is your research background?
A: My A.B. and Ph.D. (from City College of New York and NYU) are both in physics. I had an early interest in genetics, but never took biology in college. I did watch it from a distance, was fascinated by its increasing conceptualization, and by the end of my graduate school days was pretty much hooked. At the time I felt my biggest obstacle to really delving into biology was not knowing chemistry — I had taken only a semester in college. So I spent the next three years in the Yale chemistry department learning the language of organic chemistry, and developing methods to calculate RNA secondary structure. My first position was in the theoretical division at Los Alamos, where, ironically, I developed an interest in immunology and cell biology.

After three years, I moved to the NIH intramural program as a visiting scientist. It was — and still is — an immunological mecca, and when I was offered tenure a year later, I couldn't resist staying. At NIH, I was doing mostly immunology, but I was watching molecular biology, and spent some time in the director's office when they were starting to consider the development of GenBank. NIH understood the importance of computers in storing and managing data, but the extramural system didn't appear ready to accept the computer as an analytical engine. The idea of devising algorithms that would look for biological function was pretty much beyond the way most people thought about things. The intramural program offered more flexibility, and I invited Minoru Kanehisa (now a professor at Kyoto University, Japan) — a brilliant computer person and first-rate biologist — to NIH, where we developed the first relational database management system for protein sequences, DNA sequences, structures, etc.

Q: How did you get involved in planning the Human Genome Project?
A: The obvious question [around 1982] was, would we ever get the whole genome sequence? I dismissed the possibility because I didn't feel the biomedical culture was going to accept a project of that size and complexity. We were developing codes for intron-exon boundaries, using Bayesian statistics in the early '80s, but when I gave talks, I'd get these blank stares. It didn't take hold, it was so foreign to people ... People didn't feel overwhelmed by data — they felt they could still handle things in the laboratory. And at that particular moment in history, they were right — but they didn't see the tsunami.


I went to DOE (Department of Energy) [in 1985] as director of their health and environmental research programs, and after three months a report appeared on my desk from the Congressional Office of Technology Assessment (OTA). That report mentioned sequencing the entire human genome. People on that committee were fairly prominent biologists, so I got some indication that someone else in the world thought this wasn't a totally nutty thing to do. Having a reference genome would be spectacularly important. Robert Sinsheimer had held a workshop in May 1985, but nothing had come of it. They couldn't convert that interest into policy. At DOE, we decided to sponsor a workshop [in Santa Fe] of leading molecular biologists and geneticists to get our own sense of what the community felt.

Q: This was initially just a DOE project?
A: When we held the Santa Fe workshop, we invited every federal agency to it, and no one was interested! All these agencies that now have genomics as their underpinning had no idea what was going on. NIH was hesitant ... but a National Academy of Sciences committee strongly endorsed the project, so that induced NIH to move forward. I was pretty comfortable by then that the project had a life of its own. I moved to BU in 1990 and basically returned to structurally based immunology, and didn't look to genomics until about six years ago. When I started reorienting my own research, it was like waking up after 20 years ... Everything that was an important problem 20 years ago was unrecognizable, just unbelievable.

Q: How important to the pace of the genome project was Celera?
A: I think industrial involvement has been very important to setting the pace. If [J. Craig] Venter and Celera hadn't come on the scene, the genome project would still be going on. When I left Washington in 1987, I had estimated 2000-2001 as the completion date. It was, in my opinion, unnecessarily stretched out to 2006. The feds accelerated their schedule in response to Celera. The reason it went as well as it did is that the economy was spectacular. If not, venture money would not have been as plentiful, and Celera might never have been formed.

Q: What are you working on now?
A: I'm still interested in immunology, including infectious diseases. I'm starting to work with teams in India, Maui, and Thailand to develop a vaccine for AIDS. The research goes from basic chemistry to clinical testing, with several points where you can't do things without high-intensity computing, especially at the end of the project, which involves a lot of integration. We'd like to develop a vaccine component for cellular immunity, but an effective vaccine is probably a decade away, assuming it's achievable.

Q: How is computation involved?
A: Our goal is to develop an epitope vaccine: we look for immunogenic peptides encoded by HIV, carrying out an exhaustive experimental search of the genome. The virus mutates under selective pressure, so a lot of immunogenic sites have mutated away. In order to get a good vaccine, you may need to go backwards in time ... to examine the genome not just as it now is, but as it once was. It's not going to be easy. Therapy (as opposed to a vaccine) is possible, but when you have a disease with essentially no cases of natural immunity, you have to be doubtful whether a vaccine will be found.

One aspect of the computational component is in designing immunogenic peptides. You need to screen computationally because there are too many combinations to explore them all experimentally. They have to be designed so that they bind with high affinity ... it's not likely we'll find a single vaccine that covers the whole human population. Selecting a set of immunogenic peptides — from a group of candidates that have been obtained by an exhaustive search — that optimizes cost, efficacy, and population coverage is a very complex computational problem.
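[Editor's illustration] The selection problem DeLisi describes resembles a budgeted coverage problem. The Python sketch below is only a toy under stated assumptions, not the method his group uses: the peptide names, costs, population segments, and their shares are all hypothetical, efficacy is not modeled, and a simple greedy heuristic (pick the candidate with the best added coverage per unit cost until the budget runs out) stands in for whatever optimization would be applied in practice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peptide:
    """A hypothetical candidate peptide (all fields are illustrative)."""
    name: str
    cost: float          # assumed positive cost of including this peptide
    covers: frozenset    # population segments the peptide is assumed to protect


def select_peptides(candidates, segment_share, budget):
    """Greedy budgeted coverage: each round, take the affordable candidate
    with the largest newly covered population share per unit cost."""
    chosen, covered, spent = [], set(), 0.0
    remaining = list(candidates)
    while remaining:
        best, best_ratio = None, 0.0
        for p in remaining:
            if spent + p.cost > budget:
                continue  # cannot afford this candidate
            gain = sum(segment_share[s] for s in p.covers - covered)
            ratio = gain / p.cost
            if ratio > best_ratio:
                best, best_ratio = p, ratio
        if best is None:
            break  # nothing affordable adds coverage
        chosen.append(best)
        covered |= best.covers
        spent += best.cost
        remaining.remove(best)
    coverage = sum(segment_share[s] for s in covered)
    return chosen, coverage, spent


# Hypothetical inputs: four population segments and four candidate peptides.
SEGMENT_SHARE = {"A": 0.4, "B": 0.3, "C": 0.2, "D": 0.1}
CANDIDATES = [
    Peptide("P1", 2.0, frozenset({"A", "B"})),
    Peptide("P2", 1.0, frozenset({"B", "C"})),
    Peptide("P3", 1.0, frozenset({"A"})),
    Peptide("P4", 3.0, frozenset({"D"})),
]

chosen, coverage, spent = select_peptides(CANDIDATES, SEGMENT_SHARE, budget=3.0)
print([p.name for p in chosen], round(coverage, 3), spent)
```

A greedy pass like this is fast but not guaranteed to be optimal, which hints at why the full problem, with efficacy and binding affinity included, is computationally hard.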

Q: What is the focus of the BU bioinformatics program?
A: We're doing systems work, not just bioinformatics. We cover the gamut. It's primarily computational/mathematical, but with some engineering of new high-throughput technologies, and numerous collaborations with experimental biologists, clinicians, and engineers. We're really beginning to map out the circuitry of the cell and model it. We look at all genomes, and network orthologies. We have our own clusters, and the Center for Computational Science supports high-performance computing, using IBM. The bioinformatics program spans engineering, arts and sciences, and medicine. We have over 100 students in our program, about 70 Ph.D. students. I wanted to plateau at around 50 Ph.D. students, but we've had some spectacular students coming through this program. We share students with faculty at Harvard and MIT, and have a number of industrial partners, including Pfizer, Serono, and IBM.

Q: What would you consider a milestone in systems biology?
A: Mapping the transcriptional network of a eukaryotic cell: understanding its functional organization, along with a predictive understanding, probably at some higher-order supramolecular level, of response to changes in environment. Data alone won't get us there; it's going to require much better methods than we currently have for data integration and analysis.






PICTURE OF DELISI BY: MICHAEL MANNING



