February 10, 2003 | IT TAKES A mighty special mouse to grace the cover of Nature. It helps to be dramatically oversized (the first transgenic mouse, reported in 1982), morbidly obese (the leptin gene, 1994), or to have undergone a sex-change operation (the testis determining factor, 1991). The composite white mouse on the Nature cover late last year marked the completion of 96 percent of the mouse genome. What had taken 13 years to produce for humans had taken just 13 months for the mouse.
The publication arrived on Dec. 5, 2002 — fittingly, Walt Disney's 101st birthday — and was greeted with great fanfare. "Its significance matches that of the human genome," gushed Allan Bradley, director of the Wellcome Trust Sanger Institute in Cambridge, England. A predictable response from one of the world's leading mouse geneticists, but is it merited?
The international mouse genome consortium applied Celera Genomics Group's controversial whole-genome shotgun approach to produce a rapid initial draft of the mouse genome, although this method suffers in aligning strongly repetitive sections of DNA. The composite sequence was assembled using the Arachne program, developed at the Whitehead Institute Center for Genome Research.
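To see why repeats trouble a shotgun assembly, consider a toy sketch in Python. It is not Arachne's algorithm, and the sequence, read lengths, and function names are all invented for illustration. It chops a short "genome" into error-free overlapping reads and then lists every read that could be extended by more than one other read. When the reads are shorter than the repeat, the assembler cannot tell which copy it is in.

```python
from collections import defaultdict

def reads_from(genome, read_len):
    """Return the set of all substrings of length read_len (idealized, error-free reads)."""
    return {genome[i:i + read_len] for i in range(len(genome) - read_len + 1)}

def ambiguous_extensions(reads, overlap):
    """Find reads whose last `overlap` bases match the start of more than one other read."""
    by_prefix = defaultdict(set)
    for read in reads:
        by_prefix[read[:overlap]].add(read)
    ambiguous = {}
    for read in reads:
        successors = by_prefix.get(read[-overlap:], set()) - {read}
        if len(successors) > 1:
            ambiguous[read] = sorted(successors)
    return ambiguous

# Hypothetical 20-base sequence containing the 7-base repeat GATTACA twice.
TOY_GENOME = "CCGATTACATTGATTACAGG"

# Reads shorter than the repeat: one read has two possible continuations.
short_reads = ambiguous_extensions(reads_from(TOY_GENOME, 5), overlap=4)

# Reads longer than the repeat: every overlap is unique.
long_reads = ambiguous_extensions(reads_from(TOY_GENOME, 10), overlap=9)

print(short_reads)
print(long_reads)
```

In this toy case the 5-base reads leave one fork, at the end of the repeat, while the 10-base reads span the repeat and leave none. Real genomes contain repeats far longer than any single read, which is why the problem persists at scale.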
The genome shuffle: The 20 chromosomes of the mouse genome (top) color-coded to highlight the related segments of the human genome (below).
There are some glaring differences between the genomes of mouse and man: Mice have three fewer pairs of chromosomes and about 0.5 billion fewer bases of DNA than humans. Over the course of 75 million years of evolution since the last common ancestor, the mouse and human genomes have been shuffled such that there are 342 conserved sequence blocks between the species, with fewer than 300 chromosomal rearrangements required to produce the current human genome.
At the gene level, however, the similarities between human and mouse are startling. At 96 percent, the published sequence of mouse strain C57BL/6J, available from the same three Web portals as the human sequence (see Paper View, November 2002 Bio·IT World, page 56), is far more complete than the human sequence was at its unveiling two years ago. The new tally of murine genes agrees remarkably well with the latest estimate of human genes, which has been lowered 30 percent since publication of the first draft, down from 32,000 to just 22,808 predicted genes. Using the Ensembl gene prediction pipeline (featuring GeneWise and Genscan), the consortium estimates the mouse genome contains 22,011 genes, including about 40 percent predicted transcripts that are related to proteins in other species.
However, the authors of the report stress that these numbers remain tentative while both genome sequences are wrapped up: Real genes (particularly those that are small, expressed in low abundance, or both) have been overlooked, while other putative genes will doubtless prove to be pseudogenes. The emerging consensus points to a maximum total gene count in humans of 30,000, barely 10 percent more than the mustard weed.
Mouse Genome Sequencing Consortium. "Initial sequencing and comparative analysis of the mouse genome." Nature 420, 520-562 (2002).
Full text available at: www.nature.com/nature/mousegenome
Z. Xuan, J. Wang, and M.Q. Zhang. "Computational comparison of two mouse draft genomes and the human golden path." Genome Biology 4, R1 (2002).
Full text available at: www.genomebiology.com/2002/4/1/R1
Only 1 percent of mouse genes — 118 to be precise — appear to lack a human counterpart, and the final number may be even smaller, as human counterparts could be lurking in uncharacterized regions of the genome, or their sequences may have diverged significantly. One example that had previously escaped notice is a distantly related homologue (in both mouse and human) of dystrophin, the gene mutated in the most severe form of muscular dystrophy.
Using two new dual-genome gene prediction programs — Twinscan and SGP2 — the survey of the mouse sequence reveals about 1,000 potential new coding regions, a conclusion validated in a small trial PCR (polymerase chain reaction) experiment to see if these putative genes exist as RNA transcripts.
Despite the conservation of genes, the rest of the genome has undergone turbulent shifts during evolution. The mouse genome contains only 2.5 billion bases, 14 percent less than the human. The chief reason appears to be a higher rate of deletion among the repetitive DNA elements that constitute the "junk DNA." Having said that, genomewide sequence comparisons reveal that about 2.5 percent of noncoding, or junk, mouse DNA has, in the words of the Whitehead Institute's Eric Lander, been "lovingly preserved by evolution" compared to human (that is, it exhibits a greater degree of conservation than would be expected by chance). Evaluating these sequences will be a high priority in the coming years.
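The size figures can be checked with a little arithmetic. The short Python sketch below takes only the two numbers quoted above (2.5 billion bases, 14 percent less than the human) and works out the human genome size they imply. The variable and function names are illustrative, and the result is only as precise as the rounded inputs.

```python
MOUSE_BASES = 2.5e9   # mouse genome size quoted in the article
SHORTFALL = 0.14      # mouse is quoted as 14 percent smaller than human

def implied_human_bases(mouse_bases, shortfall):
    """If mouse = human * (1 - shortfall), then human = mouse / (1 - shortfall)."""
    return mouse_bases / (1.0 - shortfall)

human_bases = implied_human_bases(MOUSE_BASES, SHORTFALL)
gap_bases = human_bases - MOUSE_BASES

print(f"Implied human genome: {human_bases / 1e9:.2f} billion bases")
print(f"Mouse shortfall:      {gap_bases / 1e9:.2f} billion bases")
```

The implied human total is roughly 2.9 billion bases and the gap roughly 0.4 billion, in line with the "about 0.5 billion fewer bases" cited earlier once rounding is allowed for.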
The public consortium's mouse sequence begs comparison with the Celera assembly, based on three different mouse strains, which has been available to subscribers for the past 18 months. Writing in Genome Biology, Michael Zhang and co-workers from Cold Spring Harbor Laboratory report that the two mouse genome assemblies differ by about 10 percent. On the other hand, Zhang's team says homologies between the two assemblies predict 6,000 novel genes.
In the next few years, the genomes of rat, dog, and chimpanzee will round out the mammalian top five, but none will match the immediate impact of the mouse genome, which provides powerful testimony to the value of comparative genomics. "Evolution's crucible is a far more sensitive instrument than any other available to modern experimental science," the consortium concludes in Nature. With the mouse sequence laid virtually alongside its mammalian counterpart, the road to mining the human genome just moved one giant mouse click closer.
ART BY NATURE