Ensemble Discovery believes a new class of therapeutic compounds - and the novel means to synthesize them - could transform the drug discovery process.
By Kevin Davies
April 1, 2008 | For decades, most drug discovery programs have essentially focused on two very different classes of molecules. On the one hand, small molecular weight compounds plucked from massive libraries, the traditional domain of big pharma. At the opposite end of the scale, much larger biologics, such as recombinant antibodies, have found increasing success in recent years.
But a small biopharma in Cambridge, Mass., Ensemble Discovery, believes that a new class of compounds - not very imaginatively called the Ensemblins - could provide new leads against a variety of intractable targets, even protein-protein interactions.
Ensemble was founded in 2004, based on an ingenious concept of DNA-programmable chemistry developed by Harvard University professor (and Howard Hughes Medical Institute Investigator) David Liu. Liu is a "brilliant young chemist and tenured professor" who was seeking a commercial outlet for the technology, says Ensemble's Chief Business Officer, Laurence Reid.
Over a couple of years, Doug Cole, a partner at Flagship Ventures, "worked with David, [and] gained his trust, watching the technology blossom," says Reid, who spent a decade with Millennium Pharmaceuticals. "It's a sophisticated technology - it took a while for it to be ready to roll into a company." Liu is now chair of the scientific advisory board.
"We have exclusive rights to his technology in all industries," says Reid. "Long term, there is potential for discovery of new chemical modalities in other industries, but right now, we're focused on new compound discovery in drug discovery." Other interests include various diagnostic and biodetection applications.
Ensemble's CSO, Nick Terrett, joined in 2006, moving a few blocks north from Pfizer's Research Technology Center. (See "Pfizer's Pursuit of Technology," Bio•IT World, April 2006) Terrett spent two decades at Pfizer and played a lead role in the discovery of Viagra. He says it's a refreshing challenge to be part of a small (35-member) start-up, and to help define a new therapeutic modality. He talks in the precise terms one would expect of a Cambridge-educated organic chemist.
"Ensemblins are a range of structural classes that are inspired by nature," Terrett explains. That said, about 18 months ago, Ensemble elected to focus on producing libraries of macrocyclic ringed structures. Naturally occurring examples include cyclosporine, erythromycin, and vancomycin, which contain closed rings of at least 12 or 13 atoms. "This ring structure is a way of generating large molecules that can bind to a protein surface," says Terrett. "You don't get a major entropic loss upon binding, because they're not as flexible as an open-chained molecule."
Using Liu's DNA-based synthesis methods, Ensemble builds collections of macrocyclic compounds. These are composed of "diverse building blocks with variable ring sizes, a measure of conformation rigidity, and potential for distributed binding to protein," says Terrett.
Last year, Ensemble made its first big library of 20,000 compounds, ranging in molecular weight from 500 to 1000, with a median of about 650. That is well above the 500-molecular-weight threshold made famous by former Pfizer chemist Chris Lipinski's "Rule of Five" (Ro5).
Indeed, none of Ensemble's macrocyclics meet all of Lipinski's Ro5 criteria. There's overlap, certainly. For example, most molecules have good lipophilicity and an acceptable number of hydrogen bond donors and acceptors. But in virtually all cases, the molecular weight is significantly larger.
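As a rough illustration of the overlap described above, the sketch below checks a compound against the four commonly cited Ro5 criteria (molecular weight at most 500, logP at most 5, at most 5 hydrogen bond donors, at most 10 hydrogen bond acceptors). The `Descriptors` class, the `ro5_violations` function, and the example values are hypothetical and are not taken from Ensemble's data or software.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Hypothetical container for the four properties the Ro5 looks at."""
    mol_weight: float       # Daltons
    log_p: float            # octanol-water partition coefficient (lipophilicity)
    h_bond_donors: int
    h_bond_acceptors: int


def ro5_violations(d):
    """Return the names of the Rule of Five criteria that the compound fails."""
    checks = {
        "mol_weight <= 500": d.mol_weight <= 500,
        "log_p <= 5": d.log_p <= 5,
        "h_bond_donors <= 5": d.h_bond_donors <= 5,
        "h_bond_acceptors <= 10": d.h_bond_acceptors <= 10,
    }
    return [name for name, passed in checks.items() if not passed]


# Invented values for a macrocycle near the library's median weight of about 650.
example = Descriptors(mol_weight=650.0, log_p=3.5, h_bond_donors=4, h_bond_acceptors=9)
print(ro5_violations(example))
```

A compound like this invented one would pass on lipophilicity and hydrogen bonding and fail only on molecular weight, which is the pattern the article describes for most of the library.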
Nevertheless, Ensemble views this as a distinct advantage. "We have evidence that these compounds still have drug-like properties, despite falling outside Lipinski's Rule-of-Five parameter," says Terrett. Such properties have been demonstrated in dozens of published studies, showing significant enhancements in potency, selectivity, physicochemistry, and pharmacokinetics when open-chained molecules are converted to closed rings.
Moreover, Terrett argues that there is "probably more diversity in all macrocyclic compounds from molecular weight 500-2000 than in all molecules below 500," because of the greater variation in larger molecules. "Lipinski says nothing about ability to bind to a target. The Rule of Five is about the ability to get to the target," he stresses.
"It's a magical couple of hundred Daltons [extra]," adds Reid. "People have bought into the Rule of Five as a guide to drug discovery way beyond what Lipinski ever imagined or advocated, because it was a look back at what people had done historically."
"Very few groups have cottoned onto the opportunity," says Terrett. "We are targeting protein-protein interactions, proteases, phosphatases, many of which are significant [therapeutic] opportunities." (See sidebar: "Target Practice," below) One example is the binding of epidermal growth factor to its receptor. The ligand spreads across the receptor surface, with lots of low-energy interactions. "Small molecules won't have sufficient binding energy to disrupt that," says Terrett. "They're much better for binding to kinases."
As for protein-protein interactions, the contact surface area could be thousands of square angstroms, says Terrett. "Small molecular-weight compounds don't have enough size to actually prevent the interaction. So size is an important part of this." Reid uses the analogy of a spider, albeit one with only four legs, stretching out over its protein target.
Until now, says Reid, macrocyclic compounds could only be made in small batches. "Making diverse collections has never been done before," he says. "We're accessing numbers way beyond what people have done before." Ensemble synthesized some 30,000 compounds in 2007, and is now producing 100,000 with a team of three chemists.
Closing the Loop
The key to making these large diverse collections of macrocyclic compounds is the ingenious method developed by Liu at Harvard. "Making macrocyclics is synthetically very challenging," Terrett explains, particularly the final ring closure step. "Using DNA-programmed chemistry, because the molecule is tethered to DNA at both ends, we can actually control that ring-closure reaction."
The genius of Liu's method is to tether chemical domains to custom DNA strands. "Every compound we make in the library has a unique DNA sequence attached to it," says Terrett. "It's more than a barcode, because it actually directs the synthesis."
The compounds are synthesized in a step-by-step process, with the tethered DNA serving two key purposes. First, each incoming reagent molecule has a short DNA tag that hybridizes specifically to the template DNA strand (usually 59 bases in length). This duplex formation brings the two chemical groups into close proximity, and as Terrett explains: "You get a massive effective increase in molarity. This drives the specificity of the chemistry, and the formation of new covalent bonds."
"The specificity of those DNA codons is key," says Nathan Walsh, the company's senior informatics scientist. He spent considerable time "trying to figure out which pieces of DNA you can have next to each other, such that when you come in with a complementary piece [of DNA], it wouldn't stick in places it shouldn't."
This process then repeats two or three times. The chemists introduce the next reagent, with its own unique DNA tag. The DNA strands anneal, and the budding macrocyclic compound adds another piece. At each step, a variety of chemical groups can be introduced, sharply increasing the diversity. The final step is to close the macrocyclic ring, which is done with a handful of different compounds and a classic chemical procedure such as the Wittig reaction.
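The codon-specificity problem Walsh describes, deciding which pieces of DNA can sit next to each other on a template, can be sketched as a simple sequence check. The code below is a deliberately simplified illustration, not Ensemble's method: it assumes all codons have the same length, uses mismatch counts as a stand-in for real hybridization behavior (a real design would also weigh factors such as melting temperature), and every name and sequence in it is invented.

```python
from itertools import product


def hamming(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def junction_conflicts(codons, max_mismatches=1):
    """Find adjacent codon pairs whose junction resembles a codon in the set.

    For every ordered pair (left, right), slide a codon-length window across
    the concatenated sequence and report each window that straddles the
    junction and lies within max_mismatches of some codon. A reagent tag
    complementary to that codon could then anneal in the wrong place.
    Assumes every codon has the same length.
    """
    size = len(codons[0])
    conflicts = []
    for left, right in product(codons, repeat=2):
        joined = left + right
        for offset in range(1, size):  # only windows that span the junction
            window = joined[offset:offset + size]
            for codon in codons:
                if hamming(window, codon) <= max_mismatches:
                    conflicts.append((left, right, offset, codon))
    return conflicts


# Invented 6-base codons: placing the first next to the second recreates the third.
codons = ["AACCGG", "GGTTAA", "CGGGGT"]
for left, right, offset, codon in junction_conflicts(codons, max_mismatches=0):
    print(f"{left}+{right}: window at offset {offset} matches {codon}")
```

A check of this kind would be run over candidate codon sets before any templates are ordered, discarding or reordering codons until no junction conflicts remain.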
The synthetic process really begins with a large library of different DNA sequences. "You make every combination of codon sequence. It's like a translation from a DNA sequence to a macrocyclic structure," says Walsh.
It took Ensemble's team several months to optimize the process, but now Terrett says, "We can make anything from a few hundred to 10,000 compounds. It's very hard to do in large numbers by other means." He adds that, "In theory, there's nothing stopping you from going to 20,000, 100,000, 1 million compounds. You just need to have enough DNA sequences to start off with."
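Walsh's "every combination of codon sequence" is a Cartesian product, which is why library size scales so quickly with the number of DNA sequences available. The sketch below uses invented codon labels and counts; the article does not say how many codons Ensemble uses at each step.

```python
from itertools import product
from math import prod

# Hypothetical codon labels for three synthesis steps (counts are invented).
codons_per_step = [
    ["A1", "A2", "A3"],
    ["B1", "B2"],
    ["C1", "C2"],
]


def enumerate_templates(codon_sets):
    """Yield one template label for every combination of one codon per step."""
    for combo in product(*codon_sets):
        yield "-".join(combo)


def library_size(codon_sets):
    """Number of distinct templates: the product of the per-step codon counts."""
    return prod(len(step) for step in codon_sets)


templates = list(enumerate_templates(codons_per_step))
print(library_size(codons_per_step), templates[:3])
```

Because the size is a product, adding a handful of codons at each step multiplies the total rather than adding to it, which fits Terrett's point that reaching larger libraries mainly requires enough starting DNA sequences.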
With a respectable library of compounds now at its disposal, Ensemble can screen those compounds against a protein of interest, using affinity selection to identify the strongest binding compounds. The protein target is denatured to elute the binders, which are identified in turn via their DNA tags. Subsequently, Ensemble will evaluate whether their hits exhibit agonist or antagonist properties.
Walsh explains that the binding profile is compared "pre" and "post," meaning before and after attachment of the target protein to the column. The "fold enrichment," or strength of binding, is calculated from the ratio of post to pre binding.
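The post-to-pre ratio can be sketched in a few lines. This example assumes, purely for illustration, that each compound's signal is a count of its DNA tag in each sample and that counts are normalized to frequencies before taking the ratio; the article does not specify how Ensemble measures or normalizes the signal, and the function name and data are invented.

```python
def fold_enrichment(pre_counts, post_counts):
    """Ratio of each compound's post-selection frequency to its pre-selection frequency.

    Both arguments map a compound's DNA tag to a count. Normalizing to
    frequencies is an assumption made here so that samples of different
    total size can be compared.
    """
    pre_total = sum(pre_counts.values())
    post_total = sum(post_counts.values())
    if pre_total == 0 or post_total == 0:
        raise ValueError("both samples need at least one count")

    result = {}
    for tag, pre in pre_counts.items():
        if pre == 0:
            continue  # no pre-selection signal, so no ratio can be formed
        pre_freq = pre / pre_total
        post_freq = post_counts.get(tag, 0) / post_total
        result[tag] = post_freq / pre_freq
    return result


# Invented counts: four compounds start out equally abundant, one dominates afterwards.
pre = {"cmpd_A": 100, "cmpd_B": 100, "cmpd_C": 100, "cmpd_D": 100}
post = {"cmpd_A": 700, "cmpd_B": 100, "cmpd_C": 100, "cmpd_D": 100}

enrichment = fold_enrichment(pre, post)
for tag, value in sorted(enrichment.items(), key=lambda item: item[1], reverse=True):
    print(f"{tag}: {value:.2f}-fold")
```

Values above 1 mark compounds that became more common after selection against the target, and values below 1 mark compounds that were depleted.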
Ensemble's biggest library of 20,000 compounds is actually divided into four pools of 5,000 compounds, each pool representing a different linker molecule - lysine, ornithine, aminophenylalanine, or diaminobutanoic acid. The ring sizes range from 15-27 atoms.
In a typical screening, perhaps 20-30 compounds would meet certain criteria and qualify as hits. The compounds' pharmacokinetic and physicochemical properties - solubility, lipophilicity, and so on - are measured externally. "A lot of the parts of the process we start in house, and once we're comfortable it works in our hands, we let other people do it for us," says Walsh. That includes some of the chemistry, DNA sequencing, and metabolic profiling.
Ensemble has identified a couple of dozen attractive targets for its internal research, including phosphatases, proteases, and protein-protein interactions. Reid explains that they are prioritizing cases where there is "significant medical need and commercial opportunity, but for the majority of targets, there's a validated mechanism."
Potential targets include PTP1B (diabetes), BACE (Alzheimer's disease), oncology targets (such as the Bcl-2 and XIAP proteins) and immune system targets (such as TNF). "They're all attractive," says Terrett. "We've not applied any hierarchy in house. We've worked up the assays, run the screens. If we see activity we pursue it. If we don't see activity, then too bad."
Adds Reid: "Over the next year or two, we'll clearly select 1-2 diseases to become more biologically expert in, but right now, our sense is that there's a whole set of different kinds of biochemistries that could be uniquely addressed by this technology. There's a lot of great biology out there that's either effectively been undruggable or is only druggable if you can bring an antibody against it." So while Ensemble concentrates on its chemistry and technology platform, "we want the biology to be as validated as possible."
Ensemble is still in the hit validation stage, but Terrett says he aims to have validated leads by the end of 2008. From there, he says it's "quite possible" that they will identify clinical candidates in 2009, and move into the clinic in 2010.
Reid says wryly that Ensemble would benefit from the "validation and expertise that a major pharma would bring to the table. In the last 18 months, we've honed this strategy and validated the platform. We've shown that you can drive affinities up. Pharma has said this is exciting, but show me it works in your version of drug discovery."
Reid and Terrett are open to partnerships with other companies who have refractory targets, and would be interested to strike alliances in which Ensemble would lead the screening and identification of active compounds.
"We're at the beginning of a real expansion of the program," Reid says. "We have the data in hand such that pharma companies are now saying, 'I can believe that the Ensemblins have the potential to be drug like. Let's talk about my favorite targets that have been hard for me to drug. How can we work together?'"
Sidebar: Target Practice
One of Ensemble's early drug target prospects is a phosphatase called PTP1B, which Terrett calls "a very tough target." After screening it against its 20,000 compound library, scientists used data visualization software called Miner 3D to perform "instant SAR." The compound hits are neatly categorized by shared chemical groups, sorted along lines and planes on the 3D image. The four colors distinguish the four different linker groups used to close the macrocyclic ring.
Once hits have been identified, Ensemble synthesizes milligram quantities of the pure compound, using solid-phase and solution-phase chemistries.
The best hit so far for PTP1B has a binding efficiency of 2 µM. Terrett concedes: "That's not great, but it's very good for this target and type of molecule. We're making more analogues, trying to push the potency up. But we're [also] trying to keep the compound that has good pharmacokinetic properties." Ideally, he'd like to be able to do more in silico visualization, but no one really knows the exact binding site on the phosphatase yet.
A second example concerns a major oncology target. The library screening was performed in the presence (or absence) of the protein's natural peptide target, thus providing a means to assess specificity.
Twenty compounds bound and passed the initial criteria. Based on those, Ensemble chemists made a follow-up library of 1700 compounds, used in an iterative library screening. The best candidate shows 650-fold enrichment, compared to the natural peptide's 800-fold enrichment. Further medicinal chemistry is underway in the hope of optimizing the molecule to achieve nanomolar binding. --K.D.
This article appeared in Bio-IT World Magazine.