By John Russell
April 16, 2008 | Personal and consumer genomics are boiling right now. The excitement, fueled by the plummeting costs of DNA sequencing and the rise of companies like 23andMe, feels tangibly different from many earlier biotech trends. It feels more concrete, as if, for good or ill, something interesting and vast is about to happen. It’s easy to imagine a raucous Wild West period during which traditional therapeutics, nutraceuticals, and “lifestyle” applications create a boisterous, buyer-beware marketplace and stir regulatory confusion.
Yet sifting through the coming data flood and connecting the dots in ways that accurately link SNPs to outcomes remains a huge challenge. Gene Network Sciences CEO Colin Hill says flatly, "GNS is gunning to be the first group that really breaks this open, by having a scalable, supercomputer-driven automated platform that can turn that raw data into discoveries of the key SNPs driving the outcomes."
To hit that target, this pioneer in data-driven systems biology (see GNS Charts “Unknown” Biology) is pushing ahead forcefully. Just this week Gene Network Sciences (GNS) announced that biosimulation pioneer and MacArthur award winner James Collins, currently at Boston University, will join the company as Chief Scientific Officer, a new position. This is an important gain for GNS and adds more scientific heft to the company (see sidebar: Jim Collins Joins GNS as CSO).
GNS has also expanded its reverse-engineering/forward-simulation platform to accommodate DNA sequence data in addition to traditional molecular data. GNS doesn’t need more money, more partners, or more computational power, says Hill (though more data is always welcome); instead, GNS needs simply to put the platform to work to prove its power. SBNL editor John Russell recently spoke with Hill about GNS’s steady progress and its ambitious plans.
SBNL: The last time we met, we were talking about GNS plans to emphasize the commercial side of the business. How is GNS different today from what it was a year and a half ago, in terms of both technology and business?
Hill: It’s a good question. In the last year and a half the technology’s become a lot more robust, more scalable. We’ve gone through rewrites of the code to make it more robust. It’s the same platform -- reverse engineering and forward simulation is the core technology of the company -- but it’s become faster, stronger. There are a greater number of interaction forms, which are really the building blocks that describe interactions between components: between drugs and genes, between genes and other genes, between proteins and outcomes, clinical variables.
SBNL: These are the inputs basically?
Hill: These are the inputs, yes, but our approach is a data-driven and machine-driven approach. So, unlike most of the other systems biology efforts that start with literature information [describing] what’s connected to what, our approach really starts with raw data, components measured under various conditions. The interaction forms are the building blocks, the various ways in which components may interact, that the software uses.
SBNL: Is GNS attacking different questions than it was a year and a half ago?
Hill: The new question that’s become a key focus area for the company is going from “SNPs to outcomes.” We weren’t as focused on DNA sequence information [before]. We hired somebody who’s driving that effort. We’re also driving a lot of our own discovery and we’ve found some partnerships with academic groups such as the Moffitt Cancer Center in Florida and the Weill Cornell Medical School in New York. That’s enabling us to go after some of our own discoveries in addition to the big pharma and biotech collaborations.
But this SNP-to-outcome problem is a really big one. With all the progress that groups like Steve Turner’s company (Pacific Biosciences) are making, and other groups like deCODE or 23andMe, the data is going to be there. We have a lot of information on the variations that make us all different and determine our disease progression and response to therapeutics. But we have a big problem determining which of the three million genetic variations are causative of the outcomes. That’s a very difficult computational problem that nobody has solved. You’re not hearing about that in all the articles in The New York Times about the race to very fast genomes.
SBNL: Can you build causal relationships from SNP data a priori without reference to the literature?
Hill: The answer is yes under certain conditions, depending on the information you have about inheritance and other information such as gene expression together with genomic variation and outcomes. Eric Schadt leads a group at Merck, specifically from the Rosetta faction, and they’ve been gunning for this problem for some time and have certainly made some breakthroughs related to metabolic disease. GNS is gunning to be the first group that really breaks this open, by having a scalable, supercomputer-driven automated platform that can turn that raw data into discoveries of the key SNPs driving the outcomes.
SBNL: When we talked before you said one of the things you did not want to do long-term was to be a grant shop or have that reputation. You wanted to be a commercially viable venture. How have you progressed along that path and what constitutes viability?
Hill: This is a central issue for all platform companies: to what extent are you going to be a software, service, or content provider, where you really get judged in terms of revenues and profitability and growth, versus to what extent are you more of a discovery shop, where you’re capturing biological IP, drug- or diagnostic-related IP, and attempting to advance that forward.
We’ve seen many companies that started as platform companies make a transition to become drug discovery entities. The question about being commercially focused and how well you’re doing there is not necessarily about just revenue growth and profitability, because if you’re investing in drug discovery that’s a longer-term play and you have a different financial profile.
SBNL: Is that what GNS is trying to do? Are you trying to fund your platform development through R&D collaborations while the real goal is to generate, capture, and commercialize biological IP?
Hill: You’re mainly right. We’re not planning to become a drug company. We understand where our expertise is. We think we’re the best in the world at data-driven computation in this sphere. We have no desire to try to bring on capabilities that are well outside of that, [such as] medicinal chemistry and such. I’ve been thinking about this for ten years. I can’t say I have the ultimate solution and when you look around at the industry it’s hard to say that anybody’s really cracked the nut on this.
SBNL: No one has.
Hill: Everybody agrees the drug discovery industry has to change. Pharma’s in the toilet with Wall Street and everybody’s calling for gloom and doom and such. Everyone agrees there need to be new tools to advance the state of the art. The pharma companies know this better than anybody. However, companies that have breakthrough technologies still have a hard time commercializing those technologies and capturing some of the upside from them. From an investor’s point of view a lot of these platform companies have not performed well.
There ends up being tension over the extent to which one needs to take the platform and go down the drug discovery pathway in order to capture real value, versus the value of the technology [itself]. The classic thing is firing all your discovery people and then hiring regulatory folks, or even in-licensing a compound and becoming a drug maker, and I feel like that can’t be the right answer.
SBNL: So, is the idea wrong? Are systems biology companies, like all platform companies, naïve to think they are going to change the drug game in a significant way in a five- or even six-year window? Is it simply that technology development doesn’t pay off, or that drug discovery is inherently too messy and chancy and takes too long?
Hill: No. No. I don’t think so. I honestly think it’s the technology. I think many of the approaches that dominated the early days of systems biology were off the mark. People will look back some years from now and say those approaches couldn’t have worked. There was too much unknown about biology, it was too complex, and there wasn’t enough data. I’m referring to approaches based on literature as the starting point, whether it’s assembling that information together into databases so you can visualize your molecular profiling data in this context or it’s doing the simulation models based on literature information. I think those approaches have inherent limitations.
I’ve said to many people that for a number of years GNS was misguided. [Our] approach of trying to model all of the known pathways involving cancer cell biology had its merits, certainly as an academic effort, and had some use in the commercial setting, but I think it was limited. We’ve come to the realization that most of the biological circuitry of human cells is unknown; 95% of it’s unknown. So, knowing only 5% of the information from all of the tens of thousands of papers that you could assemble from Nature, Cell, Science, and such over the last few decades, how and why do we think we should have accurate simulations and predictions about the phenotypes and clinical outcomes based upon such fragmentary knowledge? The answer is we shouldn’t.
If I gave you a computer chip and 95% of the circuitry was missing and I asked you, John, can you predict what happens to this computer chip when I perturb that part and that part and that part, you’d say hell no. How could you? Most of the circuitry is unknown. I feel like the classic approach to systems biology, which GNS was a part of, didn’t really have a chance at having great impact on the discovery and development process. Without more of the guts of the system known, your solving a bunch of nonlinear differential equations wasn’t really going to cut it.
We first need to discover what are the key molecules driving disease progression. We have to discover what the key molecules are driving drug response, both from efficacy and the safety perspective. There’s been some recent papers from the Cancer Genome Anatomy Project and [Bert] Vogelstein’s group at Johns Hopkins showing a huge amount of heterogeneity in human tumors; lung cancer, breast cancer. Assuming we believe those results, this is telling us that something is misguided about the view that there is this canonical uber model that controls disease progression and is going to be common to everybody.
SBNL: If that’s true, what does it mean for the GNS value proposition?
Hill: These realizations were a big part of GNS shifting gears four years ago and really focusing on an inference-based approach, or combining inference with simulation. We don’t believe in simulation [alone]. You just have to have the more complete system to start with, which is the big difference from engineering approaches that have been able to start with a complete blueprint. One of my favorite examples is a blueprint of the circuitry of a Nokia cell phone. It’s a very complicated circuit diagram, but the point is an engineer built it. They know it. They know all the parts of it so they can predict what happens when you perturb things. It’s not the case in biology. God didn’t hand us down a blueprint and say this is how human physiology works. We’ve discovered little bits and pieces here and there. We have a big challenge, which is to infer that circuitry before we can simulate outcomes.
So back to your question. The value proposition for our partners, whether pharma partners or academic clinics, is we now have the tool. It scales with the power of IBM’s largest supercomputers, which allows us to take in data from a variety of sources, heterogeneous data, and actually discover the causal regulatory models connecting either genetic perturbation or drug perturbation to the molecular entities, be they genes or proteins or metabolites, and the clinical outcomes that they’re driving.
SBNL: Is the model still basically based on up or down regulation of genomic components?
Hill: That’s a typical kind of perturbation one would do to a model that’s been reverse engineered. So, for example, in a project around Type II diabetes there are a number of endpoints such as blood chemistries, body weight, cholesterol, a number of things being measured. In this case, I think there were 18 different endpoints. The inputs were various drugs and data discovered from proteomics profiling from serum in animals and a number of entities. We inferred the causal connections between the drugs, proteins, and a variety of endpoints from this reverse-engineered model. I’m drawing a single picture, but we actually reverse engineer a thousand models to create an ensemble of models that have a lot of overlap, but some real differences.
Perturbations are made through the system, so turning the dose of a drug up and then knocking up or knocking down one of the proteins and measuring how it’s changing some number of endpoints. What you end up with are outcomes. Let’s say it was body weight you were tracking. We can dial up the dose of drug one in silico and observe the body weight. Say knock down of protein 47 may result in a further shift in body weight. Going through a systematic knockdown of these various protein components, we now reveal that protein 47 is having this major effect on body weight in the presence of drug one, and therefore, it’s a key marker, potential co-drug with drug number one and we can tell you that with x-amount of confidence. Or we can tell you there isn’t enough data because instead of a nicely clustered histogram of results across all one thousand models, we’re seeing something that’s all over the place.
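The ensemble logic Hill describes can be illustrated with a toy sketch. Everything below is an assumption made for illustration, not GNS's REFS platform: the linear model form, the function names, the made-up weights, and the "mean exceeds twice the spread" confidence rule are all hypothetical. The sketch builds 1,000 simple models, knocks down each protein in silico at a fixed dose, and checks whether the predicted shift in an endpoint is tightly clustered across the ensemble or all over the place.

```python
import random
import statistics

N_MODELS = 1000      # ensemble size mentioned in the interview
N_PROTEINS = 50      # hypothetical number of measured proteins
KEY_PROTEIN = 47     # toy stand-in for the "protein 47" example


def make_ensemble(n_models=N_MODELS, n_proteins=N_PROTEINS, seed=0):
    """Build a toy ensemble; each model is a list of per-protein weights.

    Assumption: one protein has a consistently strong weight in every
    model, while all others vary around zero from model to model.
    """
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_models):
        weights = [rng.gauss(0.0, 1.0) for _ in range(n_proteins)]
        weights[KEY_PROTEIN] = rng.gauss(5.0, 0.5)
        ensemble.append(weights)
    return ensemble


def predict(weights, dose, levels):
    """Toy endpoint (say, body weight shift): dose times weighted protein levels."""
    return dose * sum(w * x for w, x in zip(weights, levels))


def knockdown_effects(ensemble, dose, protein):
    """Endpoint change in each model when one protein is set to zero."""
    n = len(ensemble[0])
    baseline = [1.0] * n
    knocked = list(baseline)
    knocked[protein] = 0.0
    return [predict(w, dose, knocked) - predict(w, dose, baseline)
            for w in ensemble]


def summarize(effects):
    """Return (mean, spread, confident) for one perturbation.

    'confident' is a hypothetical rule: the mean effect must exceed
    twice the standard deviation across the ensemble.
    """
    mean = statistics.fmean(effects)
    spread = statistics.stdev(effects)
    return mean, spread, abs(mean) > 2 * spread


if __name__ == "__main__":
    ensemble = make_ensemble()
    for protein in range(N_PROTEINS):
        mean, spread, confident = summarize(
            knockdown_effects(ensemble, dose=1.0, protein=protein))
        if confident:
            print(f"protein {protein}: mean shift {mean:.2f}, spread {spread:.2f}")
```

In this toy setup only the protein given a consistent weight should pass the confidence rule; the others produce a wide histogram of effects centered near zero, which corresponds to the "not enough data" case in the interview.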
SBNL: Would there be at least some starting hypothesis -- for example, that the result of down regulating protein 47 is that glucose transformation is slowed?
Hill: The hypothesis generation and testing is completely automated at the outset, although once models are built, the user can test many hypotheses. Only at the very end do we annotate the molecular components you’ve discovered to be the key drivers. This is an attempt to reconstruct the system that gave rise to the data. Essentially it’s directed, data-driven, high-throughput guessing based on some very solid statistical and mathematical principles. But I think it’s a rather profound thing that a lot of biologists have a hard time digesting; the concept of discovering things by computer and discovering models, discovering biological mechanisms not in the traditional wet lab.
SBNL: Is most of GNS’s current work in discovery, or is it in the comparison of compounds?
Hill: That’s a very good question. I want to say it’s about half and half; there is a good mix at this point and across a variety of data types. Like I said, we’re doing our first set of projects in genomics, genomics being sequence as opposed to molecular profiling. The team is now operating at a different level of test in terms of the number of projects they can execute on simultaneously. It’s putting the platform we’ve been investing in to the test.
This is what we were practicing for and developing and investing for all these years and we’re starting to see it really pay off. I mean the scalability of this approach goes well beyond whatever you can do manually. Part of the beauty of this approach is it is automated. You have to do some statistical analysis of the data ahead of time. You have to understand the experimental design. Often we work with a collaborator to design the experiments in the first place. But once the data is in the right form, the process of reverse engineering the models and then doing the simulations to discover the key molecules driving outcomes, that part’s pretty fast.
SBNL: Do you have sufficient staff to take on a lot more work, or would you need to scale staff-wise to take on more projects?
Hill: Our operational model scales very well with additional projects. Investing in this machine driven, data driven approach, it’s a different bet — you’re investing in the automation of the software, you’re investing in hardware, you’re investing in the access to data, we’re buying more data, funding with collaborators, new data, and that’s getting cheaper. The cost of computing power is a third of what it was only a few years ago. So you have to kind of take a step back and ask, well what’s going on here and where is this going in the long run? At some point machines will win. I believe in machines. I believe that computers will be the main driver of discoveries in the biomedical world in the near term.
SBNL: When you say near term, what does that mean? Is it ten years, five years?
Hill: No, near term is two-to-five years.
Hill: I think the growing amounts of data [will drive it] – I don’t mind being a contrarian.
SBNL: Aren’t you the one who says that 95% of biology is unknown? It seems to me that it’s a lot of ground to cover in two to five years.
Hill: It’s not that one has to unravel all of the unknown biology and anticipate every next discovery of small interfering RNAs and micro RNAs and things that people are finding. That’s not what I’m talking about. Our challenge, the industry’s challenge of developing a deeper mechanistic understanding of disease progression and the drug response, doesn’t necessarily need every last piece of biological knowledge. But we do need to be able to discover the key things really driving the outcomes.
SBNL: Is biopharma more or less enthusiastic about this approach today?
Hill: More enthusiastic, but it’s a sober enthusiasm. It’s more specific about solving problems. For example, combo therapies in cancer. This is something that a lot of companies need to solve. They can’t run enough clinical trials to explore all the combinations with standard of care therapies with their new targeted cancer drug. So here’s an area where this kind of approach has a clear win, from single drugs applied at multiple doses in your biological system. We have a platform that can combine those drugs in two-way combinations or three-way combinations and determine the most synergistic combinations and the dose ratios needed to get to those results.
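To see why trials alone cannot cover the space of combinations, a rough count helps. The sketch below is illustrative only: the drug and dose numbers are made up, and Bliss independence is a standard textbook reference model for two drugs acting independently, not something the interview says GNS uses. A combination is commonly called synergistic under that model when its observed fractional effect exceeds the Bliss expectation.

```python
from math import comb


def count_regimens(n_drugs: int, k: int, n_doses: int) -> int:
    """Number of k-drug regimens when each chosen drug has n_doses dose levels."""
    return comb(n_drugs, k) * n_doses ** k


def bliss_expected(effect_a: float, effect_b: float) -> float:
    """Expected fractional effect (0 to 1) of two drugs under Bliss independence."""
    return effect_a + effect_b - effect_a * effect_b


def bliss_excess(observed: float, effect_a: float, effect_b: float) -> float:
    """Positive values suggest synergy relative to the Bliss expectation."""
    return observed - bliss_expected(effect_a, effect_b)


if __name__ == "__main__":
    # Hypothetical numbers: 100 candidate drugs, 10 dose levels each.
    for k in (2, 3):
        print(f"{k}-way regimens:", count_regimens(n_drugs=100, k=k, n_doses=10))
    # Two drugs that each inhibit 50% alone: the Bliss expectation is 75%.
    print("Bliss expected effect:", bliss_expected(0.5, 0.5))
```

With these assumed inputs the count is 495,000 two-way regimens and 161,700,000 three-way regimens, far more than any clinical program could test directly.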
SBNL: What projects is GNS currently working on? Can you give us a picture of the business? How long is a typical project?
Hill: We try to be able to turn most projects around in a few months. And a lot of that time is to set up the statistical analysis and interfacing with the partner, that part doesn’t go away. Right now there are on the order of a dozen projects going on with a mixture of pharma, biotech, and clinical research labs.
SBNL: Who are some big commercial collaborators?
Hill: So I can cite Pfizer and CombinatoRx. The academic partners are also important and are becoming more commercially focused these days, that’s clear. You see more and more partnerships between big pharma and these groups.
SBNL: How does GNS start developing its own biological IP?
Hill: About a year and a half ago we hired a patent lawyer from New York, Tom Neyarapally, he’s our VP of Corporate Strategy in IP, and his main mandate was to really broaden the GNS business model, and to monetize the REFS (Reverse Engineering Forward Simulation) platform in other ways. He really led the effort to start the collaborations with groups like Moffitt and Weill Cornell Medical School and other such places. The main thrust of the business is capturing this IP, diagnostic or therapeutic related.
SBNL: That’s through the academic collaborations where they might be more willing to share the IP?
Hill: That was the idea. It’s turning out that commercial collaborations were also able to [yield] more or less, this IP.
SBNL: Biological IP?
Hill: Yes. That’s actually really important and it is an important part of the business model going forward. We’re not going to turn into a drug maker or a diagnostic maker. But we see ourselves as a discovery engine. That is what GNS is about.
SBNL: What are the therapeutic areas in which you are working?
Hill: The big focuses, in terms of our internal discovery, are oncology, naturally, metabolic disease, meaning diabetes, Types I and II, and Alzheimer’s.
SBNL: Is there a dedicated internal group to internal research as it were? If so how big is it?
Hill: Yes. It’s pretty small. We’ve been dealing with the question of how we allocate expenses to these different groups but it’s the same platform. I like to cut things up because it’s easier to explain the business model to investors and other people, but I’ve kind of come full circle with that thought. At the level of the platform, the process, and the work being done, it actually doesn’t make a difference; the scientists doing the work probably don’t even know what a contract says about which aspect of the IP we own as a result.
We know we’re never going to become a multibillion-dollar company just doing service deals. We believe that data-driven computational discovery is the way forward and is becoming a bigger and bigger part of the process. The challenge of how to capture the upside is a challenge everybody has dealt with, from deCODE to Millennium to Entelos, across the board.
SBNL: Repositioning drugs seems fashionable right now; do you have any activity there?
Hill: Some of the work we’re doing in combination therapies can fit there, and our collaboration with CombinatoRx as well.
SBNL: Will you need to raise additional funds?
Hill: We do not have to raise money this year. We probably will. We did a little raise last year and ended up being over-subscribed by a lot, so that actually put us in a good position. We’re still cautious. My feeling is that there’s more competition coming from outside of the usual suspects in systems biology coming after some of these kinds of problems. Going from SNPs to outcomes, you better believe the competition is heating up there, and from groups you may not normally associate with this problem, such as Microsoft or Google.
My feeling is they are starting to think about these things a lot more and they’re coming with some really heavy machinery. I think you’re going to see some surprises in the next year or two as far as who tries to stake a claim in this area. Their approaches are going to be completely different than the classical systems biology based approaches.
SBNL: Milestones for the next 12 to 18 months?
Hill: SNPs to outcomes across a few different therapeutic areas; that’s what we’re gunning for, really being able to relate SNPs to changes and changes to outcomes. Essentially, to be able to do in an automated fashion in weeks or months what Eric Schadt at the Rosetta/Merck group did over the course of a couple of years. Number two is combo therapies in oncology: to be the first to take single-drug, multiple-dose data sets and explore very quickly billions and trillions of drug and dose combinations of cancer drugs, and discover the most efficacious combinations and the corresponding markers that indicate the patients that will have the strongest response.
That’s all I care about. We have doubled down on our investments in technology. We are out there buying up data, partnering to get data, and the things that were clear bottlenecks to GNS a year and a half ago, two years ago, they’re not there anymore. Could we do more with more money? Absolutely. We’d love to blow this out in a bigger way and I think the issue I’ll be dealing with over the next year, 18 months, will be when is the time to possibly pull the trigger and accelerate. It’s a bet. If we’re right, and this is the way forward, this will yield discoveries at a pace and a scale that have never been seen before.
Jim Collins Joins GNS as CSO
Gene Network Sciences announced today that Dr. James Collins, professor of biomedical engineering and University Professor at Boston University and co-director of Boston University’s Center for Biodynamics, will join the company as chief scientific officer. He will retain his full-time position at BU while acting as GNS CSO.
Collins is a MacArthur Fellow, an original Technology Review TR100 award recipient, a recipient of an NIH Director's Pioneer Award, and a Rhodes Scholar. Professor Collins will also co-chair GNS’s Scientific Advisory Board. He is the founder and former CSO of Cellicon Biotechnologies, a leading systems biology company, and Afferent Corporation, a medical device company.
“Professor Collins’ broad experience as a scientific leader in systems biology and the reverse-engineering of genetic networks makes him uniquely qualified to lead GNS’s scientific efforts,” said Colin Hill, President and Chief Executive Officer of GNS.
"During the past year, GNS has made important advances in the validation of its technology," said Collins in a release. "In addition to discovering drug mechanism of action and biomarkers used by its pharma partners in clinical development, GNS’s formation of discovery alliances with leading academic clinics and its discovery of potential therapeutic targets ab initio from human data has established GNS as a unique computational biology company."
Supporting Collins in his new role as CSO of GNS is a newly constituted Scientific Advisory Board that brings expertise in key areas that will be invaluable in evolving GNS’s discovery engine and advancing GNS’s pipeline of gene network based targets and diagnostics programs from discovery to market. Members of the new SAB are leading pharmaceutical executives and academic systems biology pioneers with significant pharmaceutical drug discovery and development experience.
In addition to Professor Collins, the members of the SAB include:
* Jeffrey Besterman, Ph.D., Executive Vice President of Research & Development and Chief Scientific Officer, Methylgene.
* Jeffrey Hanke, Ph.D., Vice President, Cancer Care Discovery, AstraZeneca.
* Stuart Kauffman, M.D., Chair of Biocomplexity and Informatics, University of Calgary. Dr. Kauffman is the co-founder of Santa Fe Institute, Darwin Molecular, BiosGroup, GenPathway, and is a MacArthur Fellow.
* Hiroaki Kitano, Director of Sony Computer Science (Tokyo). Dr. Kitano is the President of the Institute for Systems Biology and is the visionary behind Aibo, the robotic dog.
* Christophe Lengauer, Ph.D., Executive Director, Cancer Discovery, Novartis.
* Matej Oresic, Ph.D., Principal Investigator, Systems Biology & Bioinformatics, VTT Technical Research Center of Finland. Dr. Oresic is the founder of Zora Biosciences and is the former head of computational biology at BG Medicine.
* Hans Winkler, Ph.D., Senior Director, Functional Genomics, Johnson & Johnson.