In searching 400 years of French-Canadian history for genetic clues to diseases among Quebec's founding population, Genizon BioSciences — formerly Galileo Genomics — is rapidly becoming the bio-IT company du jour.
BY KEVIN DAVIES
February 15, 2005 | 1608. A Dutch spectacle-maker named Hans Lipperhey invents the "spyglass." The device is enhanced by the Italian mathematician Galileo Galilei, who turns his telescope to the heavens. He is eventually put under house arrest, courtesy of Pope Urban VIII.
That same year, a troupe of 28 French farmers led by Samuel de Champlain establishes a trading settlement along the St. Lawrence River. The province, later named Quebec, becomes a haven for thousands of French Catholics escaping Protestant persecution. While many return home or venture further west, several thousand remain and prosper ...
Four centuries later, Montreal-based Genizon BioSciences — originally Galileo Genomics ("Genomique" north of the border) until its name change last December — believes that mining the genomes of today's population of 6 million Quebecois (French Canadians) holds the key to identifying genes and drug targets for more than two dozen debilitating disorders, including diabetes, asthma, and arthritis. "We're using a combination of biology and IT to change the way medicine is practiced and the way drugs are developed," says CEO John Hooper.
The French-Canadian gene pool, which began with just a few thousand founders, provides almost ideal circumstances for Genizon's gambit. Whether studying strokes in Iceland or obesity in Southeast Asia, researchers know that reducing the genetic background or "noise" by selecting an isolated, homogenous population increases the chances of finding a significant "signal" — an SNP or mutation that pre-disposes to disease (see Feb. 2004 Bio·IT World, page 34).
QUARTET CANADA: (clockwise from left) John Raelson, John Hooper, Nathalie Laplante, and Majid Belouchi hope to benefit from, and give back to, the Quebecois population.
Such signals can be detected with far fewer patients, and in much less time. In Iceland, the rich genealogical history of today's population of fewer than 300,000 people has enabled deCODE to map genes for schizophrenia, diabetes, and stroke, and move into the clinic (see Jan. 2003 Bio·IT World, page 22).
In some senses, Genizon is the deCODE of Quebec. Both are exploring a lengthy list of disease interests in a homogenous study population with an arsenal of bio-IT tools, from high-throughput genotyping to custom bioinformatics programs.
Unlike deCODE, however, Genizon has aroused little public dissent to its patient recruitment efforts. And the mapping approach is different. "We use high-density SNPs, whereas deCODE uses microsatellites. They rely on inferential pedigrees; we rely on open population case-control association analysis," says Bill Cheliak, vice president of business development.
Perhaps the most critical difference involves money. Although Genizon has raised more than CDN$50 million in two rounds of financing, it has yet to partner with Big Pharma, as deCODE did with Roche a few years ago. But that may be about to change.
Guilt by Association
Galileo was founded in 1999 by molecular biologist Majid Belouchi, geneticist John Raelson, and psychologist Nathalie Laplante, who had all been colleagues at Canadian biotech Algene. Using a method called linkage disequilibrium (LD) mapping, Algene mapped a putative gene for schizophrenia. "That's how we knew we could do it," Raelson says. After the company folded, Belouchi persuaded his friends to set up Galileo. "We wouldn't have done it ourselves, but Majid was crazy enough to persuade us," Raelson says. Myriad Genetics provided $1.5 million in seed funding.
When Londoner John Hooper arrived in Canada in 1967 with a Ph.D. in organic chemistry, he only intended to stay for a couple of years.
Hooper joined six months later and made the fateful decision to expand the original focus from a handful of diseases to two dozen. His concern was that if one or two projects "bombed or we were beaten, we'd be out of business. But if we chose 20, we could afford to fail on some of them." Belouchi, chief scientific officer, sees it somewhat differently: "If we'd done just three diseases, and today we announce we find genes, what would happen? Everyone would come and compete with us in Quebec."
As director of ethics and clinical recruitment, Laplante has already obtained blood samples from more than 30,000 participants, courtesy of a network of 900 physicians and 250 nurses, so the barrier to entry is virtually insurmountable. "Most physicians are very enthusiastic," she says. "They're used to working with pharmas, so they're being paid more money than we offer. But they see it as refreshing, approaching the disease on a genetic level more than a symptomatic level."
Genizon will ultimately enroll one percent of the Quebecois population — some 60,000 people. Recruiting for 27 disorders is not always straightforward. "Collecting people with obsessive-compulsive disorder is quite difficult — they tend not to come out a lot," Hooper says. "Panic disorder and essential tremor are difficult diseases to recruit as well."
The criteria for selecting diseases included prevalence, importance, and unmet medical need. Epilepsy was excluded, Raelson says, as its heritability "is very complicated." Genizon also prefers not to compete unduly with academic researchers — a claim that amuses University of Toronto's Kathy Siminovitch, who studies Crohn's disease. "Sure, are you kidding?" she says, laughing, when asked if she is competing with Genizon. But in the case of bipolar disorder, Raelson says, "There are researchers in Quebec with huge samples. We'd rather work with them than against them."
For each disease, patients are recruited in trios. (DNA from both parents, or a spouse and child, of each patient helps determine which markers are associated with the disease.) To qualify, each patient must have four French-Canadian grandparents.
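One standard way to use parent-parent-child trios is the transmission disequilibrium test (TDT). The article does not say that Genizon uses this particular test, so treat the following as a minimal sketch of the general idea. It assumes a single biallelic marker, genotypes coded as the number of copies of a hypothetical allele A, and only the two-parents-plus-patient design (not the spouse-and-child variant). The function names are illustrative.

```python
def count_transmissions(trios):
    """Count transmissions of allele A from heterozygous parents.

    Each trio is (parent1, parent2, child), with genotypes coded as the
    number of copies of allele A (0, 1, or 2).
    """
    transmitted = 0
    not_transmitted = 0
    for p1, p2, child in trios:
        parents = (p1, p2)
        het_parents = sum(1 for g in parents if g == 1)
        # Homozygous AA parents always pass on A; aa parents never do.
        from_homozygous = sum(1 for g in parents if g == 2)
        from_het = child - from_homozygous
        if not 0 <= from_het <= het_parents:
            raise ValueError(f"Mendelian inconsistency in trio {(p1, p2, child)}")
        transmitted += from_het
        not_transmitted += het_parents - from_het
    return transmitted, not_transmitted


def tdt_statistic(transmitted, not_transmitted):
    """McNemar-style chi-square statistic with one degree of freedom."""
    informative = transmitted + not_transmitted
    if informative == 0:
        raise ValueError("no heterozygous parents, so no informative transmissions")
    return (transmitted - not_transmitted) ** 2 / informative


# Made-up example data: five trios with an affected child.
example_trios = [(1, 0, 1), (1, 0, 1), (1, 2, 2), (1, 1, 2), (1, 0, 0)]
t, n = count_transmissions(example_trios)
print(t, n, tdt_statistic(t, n))
```

If heterozygous parents pass allele A to affected children much more often than the 50 percent expected by chance, the statistic grows large, which suggests the marker sits near a disease variant.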
DRESS CODE: Genizon scientists must don easily distinguishable lab coats (to minimize contamination risk) while tending to Illumina BeadStations (center).
Each day, some 50 bar-coded blood samples arrive for processing at Genizon's headquarters, in St-Laurent, near Montreal. The lobby resembles a greenhouse, courtesy of Raelson, who was originally a botanist. Even Hooper doesn't enjoy automatic access to the spacious laboratory, where the samples are processed, then stored in -80°C freezers named after various mountain ranges.
Redundancy is de rigueur. Duplicate samples are stored offsite, and a gas-powered generator stands by in the event of a power failure. Duplicate air-conditioning systems make the small server room feel like a cold room, but one can't be too careful. On the coldest day of the year in Montreal last year, -35°C, the system failed, briefly sending the mercury up to +35°C.
Powered by Perlegen
The history and structure of the Quebecois population make it almost ideal for high-throughput genotyping (see "The French Connection," page 28). "The beauty of the analysis," Cheliak says, "is linkage-based disequilibrium," which can pinpoint a disease mutation with higher resolution than standard pedigree-based linkage methods. LD mapping surveys genetic recombination events dating back many generations. Essentially, the idea is to scan the genomes of patients with SNPs spaced according to the genetic sharing of the population (the Quebec LD Map) to search for blocks of markers, or haplotypes, associated with an ancestral French mutation that pre-dates the Quebec settlement.
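Linkage disequilibrium between two SNPs is usually summarized with the textbook measures D, D', and r². The sketch below computes them from phased haplotype counts. It illustrates the general concept only and is not Genizon's HSS or LDSTATS code; the function name and the example counts are assumptions.

```python
def ld_measures(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D', r^2) for two biallelic SNPs from haplotype counts.

    n_AB is the number of haplotypes carrying allele A at the first SNP
    and allele B at the second, and so on for the other three counts.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total
    variance_term = p_A * (1 - p_A) * p_B * (1 - p_B)
    if variance_term == 0:
        raise ValueError("at least one SNP is monomorphic in this sample")
    d = p_AB - p_A * p_B
    # D' rescales |D| by the largest value it could take at these allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max
    r_squared = d * d / variance_term
    return d, d_prime, r_squared


# Two SNPs that always travel together, then two that are independent.
print(ld_measures(50, 0, 0, 50))
print(ld_measures(25, 25, 25, 25))
```

Markers in strong LD with one another form the haplotype blocks the text describes. In a young founder population, fewer generations of recombination have broken those blocks apart, so they tend to be longer and easier to detect with a sparser set of SNPs.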
THE FRENCH PATIENT: Pamphlets help recruit volunteers.
"In the Quebec population, you have 12 to 16 generations of recombination events. We're looking for regions around genes from a common ancestor that are shared by groups of patients," Cheliak says. "We'll narrow the candidate gene region significantly more than is possible in a family." These regions typically span just a few hundred kilobases, a fraction of the candidate intervals identified using linkage methods.
The greatest progress so far has been in Crohn's disease, a chronic inflammatory disorder of the bowel, chosen because of excellent patient recruitment and the prior identification of two susceptibility genes, NOD2 and OCTN, which serve as positive controls.
In the summer of 2003, Genizon issued a request for proposal for its genotyping requirements. Applied Biosystems, Affymetrix, Beckman, Illumina, and Sequenom all tendered bids. "Capital cost structure was significantly different for each supplier, and not all could reach Genizon's target cost," says René Paulussen, vice president of operations. Discussions with Affymetrix led to collaboration with the chipmaker's spinoff, Perlegen Sciences.
Last summer, Genizon announced the results of its collaboration with Perlegen. The genotyping, which involved 1,500 Quebecois, utilized 248,000 SNPs spaced roughly every 15 kilobases, yielding 372 million genotypes. "We believe it's one of the largest genetic databases in the world," Raelson says. Using a pair of proprietary algorithms, HSS and LDSTATS, Raelson's group identified 10 new candidate regions, including several possibly stronger than NOD2 and OCTN. Siminovitch and other academics have similar results, but she acknowledges that "[Genizon's] population is going to be a strength."
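The size of the Crohn's data set follows from simple multiplication, reproduced here as a quick check. The genome length of about 3.0 billion bases is an assumption added for the spacing estimate and does not come from the article.

```python
individuals = 1_500
snps_per_individual = 248_000

# One genotype per SNP per individual.
total_genotypes = individuals * snps_per_individual
print(f"{total_genotypes:,} genotypes")

# Assumption: a human genome of roughly 3.0 billion bases, with SNPs spread evenly.
assumed_genome_bases = 3_000_000_000
average_spacing_kb = assumed_genome_bases / snps_per_individual / 1_000
print(f"about {average_spacing_kb:.0f} kb between SNPs on average")
```

Under that even-spacing assumption the average gap comes out near 12 kb, the same order as the "roughly every 15 kilobases" quoted above. Real SNP panels are not evenly spaced, so the two figures need not match exactly.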
Having pinpointed the Crohn's candidate regions, Genizon is now fine-mapping, saturating the regions with multimarker haplotypes — SNPs spaced every 2,000 to 4,000 bases. For this, Genizon uses two Illumina 500GX BeadStations, costing about $0.5 million, which sit cordoned off in the laboratory. The BeadStations process a total of 1.2 million genotypes per day.
While the Crohn's DNA sequencing continues, attention is turning to psoriasis, "then it's one disease a month after that," Cheliak says, including attention deficit/hyperactivity disorder, and acne. Progress will be quicker now that Perlegen has completed the Quebec LD map — 80,000 SNPs spanning the genome at 1 SNP every 10 to 40 kilobases. "That will form the basis for the map for all the subsequent 26 diseases that we'll do the analysis on," Cheliak says. "We've got 12 million genotypes per day reserved for Genizon."
The French Connection: Quebec's population of 6 million is ideal for genetic mapping studies
Genizon's patient population and genotyping moxie would mean little without a superb IT infrastructure to handle the data deluge. "When you're in the data-generation business, you'd better have good IT," Hooper says (see "Hooper Dreams," page 30). "If you can't crunch the data, you get nothing out of them." Of his vice president of IT, Hooper says simply, "Jean-François Levesque is the best IT professional I've worked with. Were it not for him and the way he's set this up, we'd have a lot of data and nothing to do with them."
With a background in hardware engineering and telecommunications, Levesque initially found it difficult to adjust. His staff of 17 has to adapt genetic algorithms, whether written in-house or imported, to cope with the volume of data. Levesque says Genizon "lets the science people do their thing and come up with some brilliant algorithms and ideas to process the data. Our job in IT is to put these things in production. This means sometimes rewriting the code completely, optimizing it, making it scalable." As Cheliak quips, "You can't go to Wal-Mart and buy the software."
For example, Levesque says, "you might find some code that can handle 200 SNPs great, but to handle 200,000 SNPs is a whole different ball game — that's where professional software engineers make the difference." Building those collaborations "took a while. I'd never worked with scientists before. They're very particular ... they're used to working in a more individual approach, and not worrying about anybody else."
One of the challenges was to adapt the LDSTATS program that was originally developed to simulate a few thousand virtual subjects. "When we scaled up the simulation to the Quebec population of 6 million, the hard drives started burning up from all the I/O activity," recalls IT systems architect Borivoj Stojkovic. "Expensive drives! They kept burning up, running 24 hours a day, nonstop. We re-developed and optimized the simulation program, which as a result no longer burns our drives and runs the simulation in less than an hour."
A big challenge was fixing another program that was producing different results on different operating systems. Levesque recalls, "We could get a good result on Windows, but once we tried it on the Linux machines certain data sets were failing." With Extreme Programming — IT and genetics team members working together on a single PC — Levesque's team traced the problem to floating-point precision — a discrepancy in rounding-off errors that varied with the operating system. As a result, Stojkovic says, "the program is now portable; we can throw it at any resource — Windows, Linux, Solaris."
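The article gives no detail on the bug itself, so the following is a generic illustration rather than a reconstruction of Genizon's fix. It shows how the same floating-point numbers can add up differently when the order of operations changes, which is one way compilers and platforms come to disagree, and two common defenses: an exactly rounded sum and tolerance-based comparison. The helper names are hypothetical.

```python
import math


def naive_sum(values):
    """Add values left to right, rounding after every step."""
    total = 0.0
    for value in values:
        total += value
    return total


def results_agree(a, b, rel_tol=1e-9, abs_tol=1e-12):
    """Compare two results within a tolerance instead of bit for bit."""
    return math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)


data = [1e16, 1.0, -1e16]
reordered = [1e16, -1e16, 1.0]

# Same numbers, different order, different answer: the 1.0 is lost when it
# is added to 1e16 before the large terms cancel.
print(naive_sum(data), naive_sum(reordered))

# math.fsum tracks the rounding error and gives the same result either way.
print(math.fsum(data), math.fsum(reordered))

# Exact equality is fragile; a tolerance is more portable.
print(0.1 + 0.2 == 0.3, results_agree(0.1 + 0.2, 0.3))
```

Code that branches on exact floating-point equality, or that relies on one particular summation order, can pass on one platform and fail on another. Fixing the order of operations or comparing within a tolerance removes that dependence.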
MONSIEUR IT: Jean-François Levesque is adapting quickly to gene hunting.
Another example was a program licensed from an Ivy League academic institution, which Stojkovic says had "serious performance issues with the size of our data sets" and would have required "close to 250 days on one of our CPUs to crunch a single disease." Editing the code was a last resort, but ultimately it required "getting under the hood, and working with the genetics department to understand the algorithm." One function was re-allocating a small piece of memory billions of times. The fix was easy, and the program now runs on the computing cluster in a few hours for a given disease.
Hunting with Beowulf
Genizon uses a 256-node Beowulf cluster run by CLUMEQ, an association of Quebec universities and based at McGill University. "It's one of the largest clusters in Canada, and was on the top 500 list of computing resources," Levesque says. McGill was interested in collaborating with different industries, while Genizon wanted to optimize bioinformatics programs. "Having access to this cluster without having to maintain a huge system is great," Levesque says.
In addition to on-demand capacity, the CLUMEQ experience has helped Genizon build its own internal grid, which incorporates various platforms, including Windows workstations, Linux, and Sun Solaris servers. To ensure uniform response, Levesque says it is essential to "build code that will be able to execute on any of those [platforms] when that CPU is available." He relies on Platform Computing's LSF software for cluster and grid management, which automatically re-schedules jobs if a server fails.
When it came to choosing a LIMS that could manage potentially 55,000 samples in Genizon's biobank, Genizon opted to buy, not build. From 100 candidates, Levesque selected LabVantage Solutions because of its flexibility, customization, and service. Levesque also likes the "good API to get to the database." Customization ensures it is fully integrated with Genizon's Clinical Data Management and Genetic Analysis systems. "At the end of the day, we'll have 4 billion to 4.5 billion genotypes that will need to be managed from those 55,000 individuals," Cheliak says.
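Cheliak's totals imply a per-person genotype count that can be checked with a short calculation, using only the figures quoted in the article.

```python
individuals = 55_000
total_low = 4_000_000_000
total_high = 4_500_000_000

# Genotypes per individual implied by the projected totals.
per_person_low = total_low / individuals
per_person_high = total_high / individuals
print(f"{per_person_low:,.0f} to {per_person_high:,.0f} genotypes per individual")
```

That works out to roughly 73,000 to 82,000 genotypes per person, which is in line with the 80,000-SNP Quebec LD map described earlier.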
The Crohn's genotyping data from Perlegen was a major test, requiring 1.2 million jobs on the cluster, 20,000 computing hours, and about 1 TB of raw data. Entering those data into Oracle was another obstacle, until Levesque's team realized that "you have to drop all the constraints on the database entities, and just dump the data into the table." Once there, "you can rebuild the indices and constraints, to make sure everything is correct, but you do that at the end. You learn these kind of things along the way."
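The load-first, constrain-later pattern Levesque describes can be sketched with Python's built-in sqlite3 module. SQLite stands in for Oracle here purely so the example is self-contained, and the table and column names are hypothetical.

```python
import sqlite3


def bulk_load(rows):
    """Load genotype rows into a bare table, then add the uniqueness check."""
    conn = sqlite3.connect(":memory:")
    # Step 1: create the table with no indices or constraints to slow down inserts.
    conn.execute(
        "CREATE TABLE genotype (sample_id INTEGER, snp_id INTEGER, alleles TEXT)"
    )
    # Step 2: dump all rows inside a single transaction.
    with conn:
        conn.executemany("INSERT INTO genotype VALUES (?, ?, ?)", rows)
    # Step 3: build the index once, at the end. A unique index also validates
    # the data, because SQLite refuses to create it if duplicates exist.
    conn.execute(
        "CREATE UNIQUE INDEX idx_genotype_sample_snp ON genotype (sample_id, snp_id)"
    )
    return conn


# Made-up data: 3 samples by 4 SNPs.
rows = [(sample, snp, "AG") for sample in range(3) for snp in range(4)]
conn = bulk_load(rows)
print(conn.execute("SELECT COUNT(*) FROM genotype").fetchone()[0])
```

Checking a constraint and updating an index on every insert costs a little each time. Over hundreds of millions of rows, deferring that work to one pass at the end is usually far cheaper, at the price of discovering bad rows only after the load completes.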
Hooper has big plans, but remains realistic. "The ambition of many of us is to become a drug discovery company. We just can't do it for 27 diseases. Initially, we're going to have to license." Since publicizing the Crohn's data last summer, three of the top five pharmas have visited Montreal for talks. Hooper puts a value of $40 million to $150 million for each validated susceptibility gene, citing the $150-million deal inked by fellow Canadian biotech Xenon Pharmaceuticals with Novartis in 2004 for the rights to an obesity gene target and accompanying lead.
The key word is validated. Genizon has to obtain proof that its target genes are druggable — or find associated genes that are. "We're in the gene discovery business," Hooper says. "We can add some value to it, by doing analysis in silico and a little bit of lab work, and we can work with partners as well, such as Gene Logic."
Through the Lens: A partial inventory of Genizon BioSciences' key IT and software suppliers.
To that end, Genizon recruited some experienced scientists from Genome Therapeutics, including Randy Little, Paul van Eerdewegh, and Tim Keith, who spearheaded the identification of the ADAM33 asthma gene. Keith will be integrating "additional capabilities downstream to ensure that susceptibility genes we do identify can be translated into effective targets for therapeutic development." (Indeed, it is this additional genes-to-target capability that prompted the company's new name.)
Some techniques, such as yeast two-hybrid analysis, will be introduced to define disease gene networks. Others, such as mass spectrometry, will likely be outsourced. There is equal emphasis on computational approaches, from statistical genetic methods to identify interacting disease genes, to new text-mining tools. "There will be many genes to do literature mining on, and there has to be a systematic way to do that. Software to do 'literature blasts' will make a difference," Keith says.
Hooper offers three reasons why Genizon will succeed where other startups studying isolated populations, such as Sweden's UmanGenomics, have failed. "SNP maps — without those, we wouldn't know where to look. Second, genotyping power, speed, and price. Third, the human genome sequence — it's only [in 2004] that we had enough confidence that we can select SNPs throughout the genome and know they're in the right place. Had we tried to do this two to three years ago, it would have been a disaster."
The goal is to eventually repay the Quebecois population. "To do pharma in Quebec would be the best thing we can do, to give back to the community," Belouchi says. "First, we need to sell five to 10 disease genes," which could allow expansion into infectious diseases and oncology.
Once successful, Genizon will put 3 percent of the cumulative net profits into a foundation for the Quebec population. "The recommendations of the Human Genome Organisation were that people involved in exploiting the population of its genetic content should feed money back to it," Hooper explains. "We feel that quite strongly. You've got to make sure everybody benefits."
PHOTOGRAPHS BY JULIE DUROCHER