CONVERSATION · TurboWorx CEO says technology gap is too wide
BY SALVATORE SALAMONE
January 15, 2005 | TurboWorx president and CEO Jeff Augen not only combines computational and biology expertise, but also has a clear vision of how to advance life science discovery. Augen has two decades of experience in computational biochemistry and IT. A co-founder (with Caroline Kovac) of IBM's life sciences division, where he says he coined the term "information-based medicine," Augen played a prominent role at the I3C and The SNP Consortium. He is also the author of Bioinformatics in the Post-Genomic Era: Genome, Transcriptome, Proteome, and Information-Based Medicine, recently published by Addison-Wesley.
Senior IT editor Salvatore Salamone asked Augen about the state of the life sciences and the challenges facing information-based medicine.
Q: Describe the state of our industry today and how we got here.
A: From 1999 to 2001, the focus was on high-performance computing to determine sequences and protein structure. At that time, there was a rapid evolution in technology. Computer hardware got a lot better, algorithms matured quickly, and there was an influx of very sophisticated computer programmers into the space ... In the mid-1990s, you had people who were interested in computational biology entering grad school with strong mathematical skills or a computer science background. A whole group of people graduated around 2000 who were very sophisticated [in] computer science and biology. That drove this very rapid advance right at the time of the sequencing of the human genome. We got a lot of new algorithms and new techniques in informatics. A lot of the focus in that time frame was on high-throughput sequencing, protein folding, and structure determination. That was very important to the launch of computational biology.
The era that we're in now — that's just beginning — builds on [this work] in an interesting way. The focus ultimately is to do more and more of what we used to do in wet chemistry, in silico. And the second important thrust of all this activity is to be able to accurately diagnose diseases.
Is this being done today?
Not to any great extent. Diagnostics has always lagged well behind the science. I feel like the gap has grown larger than ever before. As an example, people have been doing gene-expression profiling for about a decade. In research labs today, people use microarrays all the time. There's nothing exotic about it. Yet, when you go to a doctor, the doctor doesn't take out a gene chip. So the gap has widened to more than a decade in the use of a new technology.
Why is that?
Because the interpretation of results is getting more and more complex. There are probably 20 different classes of melanoma based on the expression profile of the tumor. We don't yet have a set of standards where someone can take a microarray, plug it into a computer program that weighs the regulation of [thousands] of those genes, and says this is type 1 or type 2 or type 3. And that's what we need for microarray technology to be used commonly in a clinical sense.
What's needed to get to that stage?
Ultimately, we need to have standard array profiles that are used to generate mathematical weighting functions that use neural networks or some other techniques to classify a disease. That weighting function would be stored in a [microarray] reader or machine. The doctor or a lab would take a tumor and do an expression array profile. The result, which is a string of ones or zeros, would then be piped through a small processor that compares them using the agreed weighting function. [The system] would say that this is a type 3 melanoma and it should be treated this way. We're far away from that.
But isn't some of this being done already?
It's being done in an experimental fashion. You read journal papers where people take a particular type of leukemia and do expression profiling and use a neural network or some sophisticated statistical technique and they are able to differentiate between different tumor types that they couldn't distinguish before.
How does this work move to a clinical setting?
Just like we industrialized gene sequencing, we need to industrialize medical diagnostics - disease by disease, category by category. We have to standardize the way the expression arrays are run, the way the weighting functions are generated, and the way the analysis is done.
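As a rough illustration of the stored weighting function Augen describes, the Python sketch below scores a binary expression profile against one weight vector per tumor class and reports the highest-scoring class. The four-gene array, the weights, the class labels, and the `classify` function are all hypothetical, and a single linear layer with a softmax stands in for whatever neural network or statistical technique an agreed standard would actually specify.

```python
from math import exp

# Hypothetical weighting function: one weight per gene on the array, per class.
# A real standard would derive these from reference array profiles.
WEIGHTS = {
    "type 1": [2.0, -1.0, 0.5, -0.5],
    "type 2": [-1.0, 2.0, -0.5, 0.5],
    "type 3": [0.5, -0.5, 2.0, -1.0],
}


def classify(profile):
    """Return (best_class, probabilities) for a binary expression profile."""
    if any(bit not in (0, 1) for bit in profile):
        raise ValueError("profile must contain only ones and zeros")

    # Weighted sum of the expressed genes for each class.
    scores = {}
    for label, weights in WEIGHTS.items():
        if len(weights) != len(profile):
            raise ValueError("profile length does not match the weighting function")
        scores[label] = sum(w * x for w, x in zip(weights, profile))

    # Softmax turns the raw scores into values that sum to 1.
    top = max(scores.values())
    exps = {label: exp(score - top) for label, score in scores.items()}
    total = sum(exps.values())
    probabilities = {label: value / total for label, value in exps.items()}

    best = max(probabilities, key=probabilities.get)
    return best, probabilities


# Example: only the third gene on the (hypothetical) array is expressed.
label, probabilities = classify([0, 0, 1, 0])
```

Keeping the weights in a table separate from the scoring code mirrors the idea that a reader could receive an improved weighting function without any change to the device's logic.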
We also need more patient data. [Because of privacy issues], we haven't thought about how to take the enormous amount of data that could be collected from hospitals and medical centers, combine that data in some way, and make it available to those who would be doing this kind of work to build these standardized diagnostics. The only places where we see progress are the large medical centers, like the Mayo Clinic, where they have an enormous patient population. They've moved forward on this kind of problem just with their own patients. We're not going to see enough of that unless we do something to change the rules.
Is this type of work being done in other places around the world?
In countries like Canada, where they have socialized medicine, the data is available. There are privacy rules to protect people, and no one is discriminated against [based on clinical information]. With that type of data, you can do this kind of in silico research on the development of new diagnostics. I'm not proposing any particular medical care system; I'm just saying we need the data. We need a lot of work to be done on anonymization of data, so that people feel confident that nothing will happen to them.
What are the key technology enablers to do this?
Fast-forward 10 years and let's assume we're starting to deploy expression profiling as a diagnostic tool in dozens of diseases. Let's also assume that we have standardized and approved tests, the arrays are widely available, the price has come down, and the doctors know how to take samples and prepare them. Let's say all of that is out of the way. What's missing is the compute infrastructure — the database infrastructure, the networks, the connectivity, the infrastructure for distributing the software on a regular basis, and teaching people how to use it. All of that has got to get done!
You need this infrastructure to [handle] upgrading software when somebody perfects a more precise weighting function that gives you a more accurate reading on the diagnostic. You need a way to distribute new weighting functions and for doctors to upload the new software onto their machines. So there will probably be an industry that forms just around building microarray readers and distributing software. People will have service contracts that will keep the machines up to date.
Would such an infrastructure also play a role once a medical test is completed?
Yes. The other thing that's missing [today] is the connectivity, the availability, the interfaces that would allow a clinician to access a dozen different databases so that for a given patient with a certain expression profile, clinical history, demographic history, and set of symptoms, [the doctor] could search a broad array of databases and say, 'I found five other patients just like you, same age, same profile, same clinical symptoms, and based on the outcome of how these patients did on various drugs, I know the best way to treat you.'
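The cross-database search described here can be sketched in a few lines of Python, purely as an illustration. Everything in it is an assumption made for the example: the `Record` fields, the matching thresholds, the drug names, and the two in-memory lists standing in for separate databases. It finds records close to a query patient in age, expression profile, and symptoms, then tallies how those patients responded to each drug.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    age: int
    profile: tuple        # binary expression profile, e.g. (1, 0, 1, 1)
    symptoms: frozenset
    drug: str = ""
    responded: bool = False


def is_match(query, other, max_age_gap=5, max_profile_diff=1, min_overlap=0.5):
    """Hypothetical similarity rule: close age, near-identical profile, shared symptoms."""
    if abs(query.age - other.age) > max_age_gap:
        return False
    if len(query.profile) != len(other.profile):
        return False
    # Count the genes whose on/off state differs between the two profiles.
    diff = sum(a != b for a, b in zip(query.profile, other.profile))
    if diff > max_profile_diff:
        return False
    # Jaccard overlap of the two symptom sets.
    union = query.symptoms | other.symptoms
    overlap = len(query.symptoms & other.symptoms) / len(union) if union else 1.0
    return overlap >= min_overlap


def response_rates(query, databases):
    """Return {drug: (responders, matched_patients)} across all databases."""
    tallies = defaultdict(lambda: [0, 0])
    for database in databases:
        for record in database:
            if is_match(query, record):
                tallies[record.drug][0] += int(record.responded)
                tallies[record.drug][1] += 1
    return {drug: tuple(counts) for drug, counts in tallies.items()}


# Made-up records standing in for two separate clinical databases.
db1 = [
    Record(52, (1, 0, 1, 1), frozenset({"fatigue", "lesion"}), "drug A", True),
    Record(50, (1, 0, 1, 0), frozenset({"lesion"}), "drug A", True),
]
db2 = [
    Record(54, (1, 0, 1, 1), frozenset({"fatigue", "lesion"}), "drug B", False),
    Record(30, (1, 0, 1, 1), frozenset({"fatigue", "lesion"}), "drug B", True),
]
query = Record(51, (1, 0, 1, 1), frozenset({"fatigue", "lesion"}))
rates = response_rates(query, [db1, db2])
```

The sketch leaves out the hard parts the interview points to: shared interfaces across institutions, connectivity, and the anonymization needed before records like these could be pooled at all.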
The only way to get to that [level] is to build a national compute infrastructure that includes databases, interfaces, desktop machines, connectivity, and so forth. A national infrastructure for information-based medicine is as ambitious a project as the construction of the freeway system back in the 1950s. And just like the freeway system, that kind of infrastructure can only be funded by the government. Right now we're depending on independent entities — medical centers, computer companies — to put together that infrastructure, and that's never going to happen. There isn't a consolidated approach, which is what we need.
Where will innovation and funding for such a project come from?
I don't know what the nature of the organization that would put [such an infrastructure] in place would be, and I don't know what the timeframe would be. But I expect it would happen over a decade, and the expense would rival something like the space program over the course of a decade. But I think it's money well spent. I think that the way that most of our technology has advanced over the last 50 years has been by focusing on these key, grand-scale projects. Just like we invented Teflon for the space program, we'll invent new interfaces, algorithms, and methodologies for connecting databases.
How would such a project justify its funding?
The government would benefit from the reduced cost of long-term medical care. I suspect that the cost of funding such an infrastructure would be considerably less than the cost of taking care of very sick people with critical conditions near the end of their lives. Proactive medical care is one way to dramatically cut the cost. [For a regular person], the benefit would be the wider availability of better care. You wouldn't have to go to the largest, most exotic medical center to get the same level of diagnostics as someone who does.
PHOTO BY TRACEY KROLL