Fragile Expedition

December 18, 2013

UPDATE 12/19/2013: A correction has been made to this article regarding the discovery that AGG inserts in the tandem repeat region of FMR1 can affect the heritability of fragile X syndrome. The article originally credited Paul Hagerman's lab with this discovery; however, the credit instead belongs to the lab of Dr. Flora Tassone, also of UC Davis. The text has been edited accordingly. 

By Aaron Krol 

December 18, 2013 | At the farthest reaches of the long arm of the X chromosome, there sits a stretch of DNA that looks like something a cat would type by sitting on the Ctrl and V keys. Translated into standard nucleic acid notation, it reads: 

CGG CGG CGG CGG CGG CGG CGG CGG

and so on. This kind of genetic duplication – common in non-coding regions, but rarer in active genes – is called a tandem repeat, and just how many repeats can be found in this position varies from person to person. Some people have only five copies of this sequence, while the longest recorded stretch of repeats went on for well over a thousand copies. And although it might seem strange for anyone but a deeply postmodern poet to describe a long cycle of three repeating letters as “interesting” or “mysterious,” this genetic locus is both.

It’s interesting because it houses the single most common known cause of autism – in fact, the single most common cause of mental impairment of any type that can be passed from parent to child. The gene at this locus is called FMR1, and the protein it codes for, FMRP, is essential for normal brain development. Any X chromosome with more than 45 copies of the CGG repeat on FMR1 is considered to harbor a mutant variant of the gene. At around 200 copies, the tandem repeats silence the gene, shutting down production of FMRP and causing “fragile X syndrome.” Individuals affected by fragile X, particularly boys, will show slowed cognitive development, sometimes to the extent of autism, as well as behavioral quirks like hyperactivity and extremely repetitive rituals. People with fragile X also have distinctive physical features, including an elongated face, stuck-out ears, a skinny physique, and unusually flexible fingers. Oddly, if you look under an electron microscope, their X chromosomes have some telltale visual signs of their own: a “pinch” at the locus where their tandem repeats have multiplied uncontrollably, which causes a short knob of genetic material to dangle off the end of the chromosome like an earlobe.

The locus is also mysterious, because it’s almost impossible to sequence – even though we know pretty much what the sequence will look like. That’s because modern sequencers work by reading DNA in small fragments, on the order of 100-400 bases long. By doing this enough times, sequencers generate a huge set of overlapping fragments. A computer finds the stretches where the fragments match, and stitches them back together to produce the unified sequence.

This works very well for most of the genome, but if a sequencer runs into a 600-base stretch of nothing but CGG, it won’t have any idea how to align the fragments: many of them will be nothing but CGG from start to end. This makes it impossible to tell how long the repeats go on; and if there is an interruption somewhere down the line, say an AGG slipped in, it won’t be clear where it fits, or even whether these interruptions occur more than once. Sequencers just discard this information, and leave this section of FMR1 a black hole in their readouts.
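The alignment problem can be illustrated with a toy model. The sketch below is only an illustration, not how any real assembler works: the flanking sequences are invented, reads are taken at every offset, and sequencing errors are ignored. It shows that two loci with different repeat counts yield exactly the same set of 100-base reads, while a read long enough to span the whole locus tells them apart.

```python
def make_locus(repeat_count, left_flank="ATGCATTGAC", right_flank="TTGACCGATA"):
    """Build a toy locus: flank + CGG repeats + flank.

    The flanks are made up for illustration; they are not real FMR1 sequence.
    """
    return left_flank + "CGG" * repeat_count + right_flank


def reads_of_length(sequence, read_length):
    """Return the set of distinct reads of a given length, one per offset.

    If the read length covers the whole molecule, the single read is the
    molecule itself (a simplified stand-in for a long read).
    """
    if read_length >= len(sequence):
        return {sequence}
    last_start = len(sequence) - read_length
    return {sequence[i:i + read_length] for i in range(last_start + 1)}


locus_200 = make_locus(200)  # 600 bases of CGG
locus_300 = make_locus(300)  # 900 bases of CGG

# Short reads: the two loci produce identical read sets.
print(reads_of_length(locus_200, 100) == reads_of_length(locus_300, 100))

# Reads that span the entire locus: the difference is visible.
print(reads_of_length(locus_200, 5000) == reads_of_length(locus_300, 5000))
```

In this toy model the first comparison prints True and the second prints False: nothing in the short reads records how many repeats sat between the flanks.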

Dr. Paul Hagerman, an expert in fragile X genetics at the UC Davis School of Medicine, Dept. of Biochemistry and Molecular Medicine. Image credit: Hagerman lab 

That’s a problem Dr. Paul Hagerman has grappled with for years. His lab at UC Davis has been working on fragile X for over a decade, during a fruitful period for genetic discovery. Among his lab’s contributions was the discovery of a new condition connected with carrying an intermediate number of CGG repeats, between about 55 and 200, called the "premutation" range. The condition is fragile X-associated tremor/ataxia syndrome (FXTAS), and it presents with late-onset neurodegenerative symptoms and sometimes childhood seizures. Having an intermediate number of repeats can also result in fragile X-associated primary ovarian insufficiency (FXPOI), which causes women to experience early menopause. These discoveries have shifted our understanding of the premutation phenotype, making it more important than ever to understand where FMR1 variants fall on the spectrum.
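The repeat ranges mentioned so far can be collected into a small lookup. This is a rough sketch built only from the approximate figures quoted in this article (more than 45, about 55 to 200, and around 200); the function name, the exact boundary handling, and the label for the 46-to-54 band, which the article does not name, are assumptions for illustration and not clinical cutoffs.

```python
def classify_fmr1_repeats(cgg_count):
    """Map a CGG repeat count to the ranges described in the article.

    Boundaries are approximate; how the edge values are assigned here is
    an assumption, not a clinical standard.
    """
    if cgg_count < 0:
        raise ValueError("repeat count cannot be negative")
    if cgg_count <= 45:
        return "normal"
    if cgg_count < 55:
        # More than 45 copies, but below the premutation range.
        return "above normal"
    if cgg_count < 200:
        return "premutation"
    return "full mutation"


for count in (30, 50, 90, 600):
    print(count, classify_fmr1_repeats(count))
```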

Yet without sequencers, it’s hard to tell where an individual sits in this range. For years, the Hagerman lab could rely only on imprecise measures like Southern blotting – good enough to diagnose fragile X syndrome, but little help nailing down a carrier’s genotype.

Unmasking the Fragile X Gene 

The progress of genetics has been characterized by huge, much-heralded leaps forward, followed by laborious backtracking to make some useful sense of the terrain that was hurdled past. When the human genome was first sequenced in 2003, scientific boosters predicted a revolution in medicine. A decade later, it’s a simple matter for the average person to have her genome tested for up to a million common mutations, yet it would be foolhardy in the extreme to base a health regimen on the results.

For fragile X, the leap came almost as soon as Dr. Hagerman got wind of SMRT technology. Designed by Pacific Biosciences, or PacBio for short, and released commercially in 2011, this sequencing method breaks the genome into much larger fragments than alternative sequencers: the record read length on a PacBio instrument, set in October of this year, topped 40,000 bases. Even an average read on a SMRT sequencer is thousands of bases long. This is technology capable of eating through hundreds of CGG repeats in a single bite.

UC Davis acquired an early SMRT sequencer in 2011, and in October of 2012, the Hagerman lab published the first fully characterized sequences of the FMR1 gene in the normal, premutation, and full mutation ranges. The achievement was a milestone in fragile X research, rescuing FMR1 from a nebulous realm of genetic guesswork. For the first time, cutting-edge technology seemed to promise a world where anyone could learn exactly what fragile X-related variants lurked in their DNA, and what this meant for their future health and that of their children.

Today, Dr. Hagerman is still working to make good on that promise.

New Challenges 

“We have a dual interest [in fragile X sequencing],” says Dr. Hagerman. “One is to develop a diagnostic procedure, and the second is to do screening.” Hagerman wants to design two separate genetic tests. The first will offer a complete description of the FMR1 locus that any doctor would accept as clinically valid; the second, a preliminary screen, will flag individuals with any level of FMR1 mutation.

Both these projects differ in key ways from the headline sequencing runs of Hagerman’s 2012 paper. A diagnostic procedure has to clear some demanding hurdles before either clinicians or the FDA will accept it for use on actual patients. It has to preserve its accuracy across the entire conceivable range of repeats that people might carry. It has to reliably capture rare, single-point mutations that can affect how fragile X and related disorders develop. It has to work on mosaic individuals, for whom different cells may carry different numbers of repeats: a common phenomenon in FMR1 disorders, because large numbers of tandem repeats are unstable and can spontaneously duplicate. “All of those features are important in developing a clinical diagnostic tool so you can use this for genetic counseling,” says Dr. Hagerman.

Hagerman’s earlier work also used a common shortcut that needs to be ironed out of the process. To obtain enough genetic material to sequence, Hagerman’s lab performed PCR amplification on their samples, using polymerases to copy the relevant DNA many times over. This boosts the signal fed into the SMRT sequencers, but it can also add extra CGG repeats to the sequence, again because tandem repeats are naturally prone to duplication. A diagnostic test will have to work with an unamplified raw sample to be considered valid.

The need for a genetic diagnostic is pressing. Although existing methods can diagnose fragile X syndrome itself, they say relatively little about the risks to family members of affected individuals. Relatives of those with fragile X tend to carry FMR1 genes in the premutation range, putting them at risk for FXTAS or FXPOI, conditions that almost always go undiagnosed today. Carriers can also pass full mutations on to their children, as the tandem repeats multiply during meiosis – a risk that grows greater with age. To assess all these dangers, the precise length of the repeat needs to be quantified.

“The tests that currently exist have well-known limitations,” Hagerman told Bio-IT World. Southern blots, for instance, “are very inaccurate in the carrier range… PCR methods are much more accurate in the premutation range, but they’re much less accurate in the full mutation range.” A full genetic description will also capture single-point mutations that can make a major difference in the fragile X family of disorders. For instance, another UC Davis lab, under the direction of Dr. Flora Tassone, discovered that just one or two AGG sequences scattered in the CGG repeats can dramatically reduce the risk of a carrier passing on a full mutation. “The probability of a mother… giving birth to a son who would have the full fragile X syndrome can be reduced as much as tenfold by a single AGG interruption,” says Dr. Stephen Turner, founder and CTO of PacBio. Current diagnostics don’t detect AGG inserts, but SMRT sequencers can.
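Once a read covers the whole repeat tract, describing it is simple bookkeeping. The following is a minimal sketch, assuming the tract has already been cut out of the read, starts in frame, and contains no sequencing errors; the function name and the choice to count AGG triplets toward the total length are assumptions for illustration, not the Hagerman or Tassone labs' method.

```python
def describe_repeat_tract(tract):
    """Return (total triplets, 1-based positions of AGG interruptions).

    Assumes an error-free, in-frame tract made only of CGG and AGG triplets.
    """
    if len(tract) % 3 != 0:
        raise ValueError("tract length must be a multiple of three")
    triplets = [tract[i:i + 3] for i in range(0, len(tract), 3)]
    unexpected = sorted(set(triplets) - {"CGG", "AGG"})
    if unexpected:
        raise ValueError(f"unexpected triplets in tract: {unexpected}")
    agg_positions = [
        position
        for position, triplet in enumerate(triplets, start=1)
        if triplet == "AGG"
    ]
    return len(triplets), agg_positions


# A made-up 50-triplet tract with two AGG interruptions.
example = "CGG" * 9 + "AGG" + "CGG" * 9 + "AGG" + "CGG" * 30
print(describe_repeat_tract(example))  # (50, [10, 20])
```

A real pipeline would also have to find the tract boundaries and tolerate read errors, which this sketch leaves out.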

A Key Partner 

Hagerman’s early success in sequencing FMR1 has been encouraging enough to attract an NIH grant to develop a diagnostic test. Awarded in September of this year, the grant will culminate in a 300-person trial of the fine-tuned test before UC Davis prepares it for clinical release. PacBio is closely involved in the project, which could establish a unique position for the company in clinical use at a time when the sequencing industry is scrambling to enter the clinical market. Illumina, the world’s largest sequencing company, recently received FDA approval for a cystic fibrosis diagnostic, a landmark in medical genetics. QIAGEN, a multibillion-dollar diagnostics company, is preparing to release the new GeneReader instrument specifically for clinical use.

PacBio is smaller than either of these companies, but its SMRT sequencers are the only instruments that can access fragile X mutations. “We are right now in the business of facilitating sequencing that can’t be done using other techniques,” Turner, who is also principal investigator for the NIH grant, told Bio-IT World. “The longest read length of any technology other than PacBio is about one thousand bases, so they’re far short of being able to cover [the range of FMR1 mutations].”

Two aspects of SMRT technology – the acronym stands for single molecule, real time – give PacBio the coverage needed to delve into the black boxes of the genome. The first is a DNA polymerase engineered from the φ29 bacteriophage. Bacteriophages, viruses that invade bacteria and replicate inside their hosts, use polymerases to copy their genetic sequences. The φ29 phage is remarkable for replicating its entire, nearly 20,000-base genome in a single step, using just one enzyme. Armed with this highly accurate, long-reading polymerase, a SMRT sequencer can chew through thousands of bases using just one enzyme and one molecule of sample DNA.

“The more important thing is that we’re watching it in real time,” says Turner. “The other technologies have very brittle and regimented recipes, where they apply the polymerase to a mixture of molecules for a preset period of time.” A typical process is to flood the DNA sample with enzymes and nucleotides, wash away the mixture, and then retroactively determine the sequence by checking which nucleotides successfully bonded with the sample. SMRT sequencing, however, checks the bases one by one as the polymerase incorporates them into the growing strand. “[We give] each base in the sequence precisely the amount of time it needs to incorporate,” says Turner. “No more, no less.”
  

A manufacturing floor for SMRT sequencers. Image credit: Pacific Biosciences 

 

Both these elements serve to give a clearer picture of fragile X genetics. Because SMRT sequencing targets just one molecule of sample DNA at a time, it is ideally suited to detecting mosaicism: if an individual has different FMR1 alleles in different cells, the SMRT sequencer will record those sequences separately, rather than returning blended results. In addition, the real-time analysis lets SMRT sequencers detect DNA methylation, a chemical modification of nucleotides that causes the gene silencing in fragile X syndrome. The PacBio polymerase works through methylated bases at a predictably different rate than unmodified bases, a difference that is automatically recorded as the sequencer reads the DNA in real time. This means that a SMRT diagnostic test should be able to tell individuals not only the sequence of their FMR1 genes, but also the extent to which they are silenced.
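Because each read comes from a single molecule, a mixed sample shows up as several distinct repeat lengths rather than one blended value. The sketch below illustrates the idea with invented numbers; the function name and the 5 percent noise cutoff are assumptions, and it makes no attempt to separate true mosaicism from the two X chromosomes a woman normally carries.

```python
from collections import Counter


def repeat_length_profile(per_read_counts, min_fraction=0.05):
    """Return {repeat count: fraction of reads}, dropping rare values.

    per_read_counts holds one CGG repeat count per single-molecule read.
    min_fraction is an arbitrary cutoff used here to ignore stray reads.
    """
    tally = Counter(per_read_counts)
    total = sum(tally.values())
    return {
        count: reads / total
        for count, reads in sorted(tally.items())
        if reads / total >= min_fraction
    }


# Invented sample: most molecules carry 30 repeats, a large minority carry 95,
# and two stray reads report 31.
reads = [30] * 58 + [95] * 40 + [31] * 2
print(repeat_length_profile(reads))  # {30: 0.58, 95: 0.4}
```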

To PacBio, this precision is a validation of the company’s recent move into clinical functions. “Clearly at some point in the future, Pacific Biosciences technology is poised to play an important role [in the clinic],” Turner told Bio-IT World. PacBio signed a $75 million licensing deal with Roche this September to develop in vitro diagnostics based on SMRT technology, making it clear that the fragile X diagnostic is not an isolated project.

Turner emphasizes that this deal will not prohibit outside groups from marketing their own SMRT-based tests, so if a fragile X diagnostic does emerge from Dr. Hagerman’s lab, it is unlikely to fall under the terms of the Roche agreement.

Statewide Screening 

A genetic diagnostic for fragile X syndrome will be a major step forward both for the families affected by the disorder and for future research into its genetics. “We’re going to position ourselves to understand new findings much better,” says Turner, “because we have the full sequence.”

However, even the best clinical test wouldn’t address one of the most important limitations to diagnosis today. Fragile X is just one of many causes of autism and cognitive impairment, and although its physical signs help clinicians identify the disorder, many cases go undiagnosed for months or years before the symptoms become obvious. This can set back treatments that have a chance to improve the lives of individuals with fragile X. There is no cure for the syndrome, says Dr. Hagerman, but “it’s very clear that early intervention is beneficial. Both medical intervention and behavioral, educational intervention has a very good effect on outcome.” Certain medications can help lessen children’s anxiety, OCD-like symptoms, or hyperactivity. More importantly, early behavioral therapy often makes a crucial difference in acclimating children with fragile X to social situations.

Those with full fragile X mutations are at least generally diagnosed during childhood. People with premutations, however, may never receive a diagnosis. Children may have unexplained seizures; adults with FXTAS may undergo loss of memory and motor control that could have been alleviated by early intervention; women with FXPOI may delay having children only to discover that they have a condition causing early menopause. These individuals may never have any cause to seek out the genetic test that could have warned them about their carrier status.

That’s why Dr. Hagerman is also trying to develop a cheaper, faster screening test, with the ultimate goal of testing all infants in his home state of California. The screen would only detect whether the FMR1 repeat region is longer than normal, but those individuals flagged in screening could then receive the full diagnostic.

“At present,” says Hagerman, “there is no way to do that test on a cost-effective basis for large numbers of individuals.” The challenges are very different from those facing a clinical diagnostic. Mosaicism, methylation, and single-point mutations could all be ignored, and Hagerman’s 2012 sequencing was already more than accurate enough. Instead, he says, the questions are, “How low can you push the cost? Can you get it down to, say, a dollar a test? And can you do tens of thousands in a reasonable timeframe?”

These are questions PacBio hopes to be well-positioned to address. “The nice thing about the PacBio method and SMRT sequencing is that you, in principle, can do a high degree of multiplexing,” adds Hagerman. “So you can pull large numbers of samples in the same tube and sequence.” Users of SMRT sequencers can “barcode” their samples when multiplexing, attaching unique 16-base pair sequences to the DNA samples so that, after sequencing, it’s easy to tell which sequence came from which source.
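The barcoding idea can be sketched in a few lines. This is a simplified illustration only: the barcodes and sample names are invented, the barcode is assumed to sit at the very start of each read, and matching is exact, whereas real demultiplexing software has to tolerate sequencing errors.

```python
BARCODE_LENGTH = 16  # barcode length mentioned in the article


def demultiplex(reads, barcode_to_sample):
    """Sort pooled reads by their leading barcode.

    Returns (reads grouped by sample with the barcode trimmed off,
    list of reads whose barcode matched no sample).
    """
    by_sample = {sample: [] for sample in barcode_to_sample.values()}
    unassigned = []
    for read in reads:
        barcode, insert = read[:BARCODE_LENGTH], read[BARCODE_LENGTH:]
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            unassigned.append(read)
        else:
            by_sample[sample].append(insert)
    return by_sample, unassigned


# Hypothetical barcodes and sample names, for illustration only.
barcodes = {
    "ACGTACGTACGTACGT": "infant_A",
    "TTGGCCAATTGGCCAA": "infant_B",
}
pooled = [
    "ACGTACGTACGTACGT" + "CGG" * 30,
    "TTGGCCAATTGGCCAA" + "CGG" * 95,
    "GGGGGGGGGGGGGGGG" + "CGG" * 40,  # unknown barcode
]
grouped, leftover = demultiplex(pooled, barcodes)
print({sample: [len(r) // 3 for r in rs] for sample, rs in grouped.items()})
print(len(leftover))
```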

This makes it easier to run mass screenings, but the price of testing remains a concern. One of the reasons SMRT technology remains favored for niche applications that other sequencers can’t cover is that PacBio’s costs can be much higher than the industry standard. With hundreds of thousands of children born in California every year, Dr. Hagerman’s screening ambitions rest heavily on that $1-per-test figure. And while the fragile X diagnostic is covered by the NIH grant, Hagerman’s lab has yet to locate a funding source to help develop their screen.

Screening also faces political obstacles. Newborn screening can sometimes turn up unwelcome results. Huntington’s disease, for example, is caused by a similar tandem repeat mutation in the HTT gene on chromosome 4. However, no early treatment or intervention has ever been proven effective in treating Huntington’s – meaning that a screen is likely to cause serious anxiety to those carrying the mutation, without helping them fight the disease. To secure support for a fragile X screen, Hagerman will need to address this possibility. “I think the issue,” he says, “is going to be, what is the benefit of screening? What the state will want to know – what any state would want to know – is, why is there an advantage? Is there an early intervention that would justify newborn screening?”

To Dr. Hagerman, these questions are already settled. “I think most people now are accepting of the fact that early intervention for fragile X is beneficial,” he says. “Even for the premutation individuals, there’s very clearly a benefit of early intervention – particularly because some significant fraction of children will develop seizures… You want to get at those early and aggressively.” Still, implementing a screening program large enough to make a widespread difference will require some political persuasion, in addition to the scientific challenge.

Hagerman is prepared to break down those barriers. “We’re optimistic that we can meet the milestones… We’re really talking about, eighteen months to two years from now, having a clinical diagnostic test,” he says. Just having the diagnostic available could make the screen easier to add into the mix, because there will be a clear next step for those infants flagged in screening. It could also encourage broader adoption of the technology needed to run the screens, as SMRT sequencers remain fairly uncommon pieces of equipment. Dr. Hagerman sometimes has to ship his samples to Washington State when the UC Davis instrument is unavailable. “In the scheme of things,” he says, however, “putting a PacBio sequencer in [a diagnostic] facility is not such an onerous task. After all, we see mass spectrometers that cost way more in such facilities for doing screening and protein analysis.”

California is a well-chosen location to test the feasibility of Dr. Hagerman’s vision. Earlier this year, the state made another genetic screen – a noninvasive prenatal test for chromosomal disorders like Down syndrome – available to all pregnant women considered at elevated risk. The state is also large and influential, and a statewide screen for fragile X could serve as a model for other regions ready to embrace genetic diagnostics. Meanwhile, other previously unsequenceable tandem repeat disorders, like myotonic dystrophy and Friedreich's ataxia, now look open to diagnostics using SMRT sequencers. Whether sooner or later, these kinds of genetic tests are likely to play a major role in improving public health from birth to old age.

As the example of fragile X shows, the road to that future will not always be smooth – but if researchers are determined enough, medical feats once thought impossible can gradually become routine.