How Common Data Could Lead To Uncommon Alzheimer’s Discoveries

By Paul Nicolaus

January 14, 2019 | Talk of a data tsunami may be cliché, but it has become a fitting metaphor. Imaging and genomic technologies have dramatically increased the amount of information generated and used to make clinical decisions, and emerging sources, like wearables, only add to this ever-growing wave.

Now that healthcare and IT are so closely intertwined, the onslaught of data presents possibilities like never before. Those caught up in the effort to save current and future generations from the ravages of Alzheimer’s disease wonder how all this data might be better used to pinpoint patterns of interest and discover pathways for treatments.

“There is an untapped opportunity to leverage existing data from longitudinal cohorts, from the postmortem human brain, and from clinical trials to help the field advance our shared goals more effectively than we otherwise could,” said Eric Reiman, executive director of the Banner Alzheimer’s Institute.

That untapped potential carries with it plenty of challenges, though. Traditionally, there has been a lack of incentive to share data. And for those who do want to share, there can be a lack of technical or financial support needed to clean information and make it widely available. Then there’s the added difficulty of figuring out how to compile data in a common format.

The good news, according to Reiman, is that many of these hurdles are being addressed, and the result is that we’re starting to see just how valuable shared data can be in the field of Alzheimer’s research.

Benefits of Shared Alzheimer’s Data

Reiman’s many affiliations mean he has been aware of and involved in a wide array of research projects in recent years. In addition to his role with the Banner Alzheimer’s Institute, he is also CEO of Banner Research, clinical director of neurogenomics at the Translational Genomics Research Institute (TGen), director of the Arizona Alzheimer’s Consortium, and professor at both the University of Arizona and Arizona State University.

As he considers progress made in the realm of shared Alzheimer’s data, Reiman points to a Brain and Body Donation Program in the greater Phoenix metropolitan area that included 10 elderly volunteers who met clinical and neuropathological criteria for Alzheimer’s disease as well as 10 others who were cognitively unimpaired and did not meet those criteria. The program studies participants’ function during life as well as their organs and tissue after death.

“Because this is in the heart of the largest, most-highly concentrated senior citizen community, it is very high-quality brain tissue,” he pointed out, noting that the average interval between death and autopsy, during which the brain is removed and frozen before it further deteriorates, is just three hours.

The decision was made to provide a common, public resource of gene expression data from neurons that came from these 10 disease cases and 10 controls. Over a decade ago, the data were made publicly available on the TGen website and have since paid dividends many times over. “That data has been used in nearly 800 publications,” Reiman said. “It has been used to provide additional support for the discovery of more than two dozen Alzheimer’s susceptibility genes, and that’s from a small dataset.”

There is a related effort called the Accelerating Medicines Partnership - Alzheimer's Disease (AMP-AD), a collaboration among government, industry, and nonprofit organizations. The difference is that the Target Discovery and Preclinical Validation Project (one of the AMP-AD projects) uses chunks of brain tissue—not individual neurons—to characterize more detailed gene expression data from greater numbers of cases and controls.

This AMP-AD Target Discovery and Preclinical Validation Project dataset has come from different brain tissues and different groups, Reiman explained, and there has been an effort to harmonize those data and provide a resource that would make it possible to compare information. The sharing of data and analytical tools has been enabled by a portal developed and maintained by Seattle-based Sage Bionetworks and hosted on Synapse, an informatics platform that allows for storage, access, and analyses among the various academic and industry partners involved.

The idea behind this work is to use large datasets to reach a better understanding of Alzheimer’s disease and treatments that are more likely to work in people and not just in genetically engineered mice. Grant funding from the Switzerland-based NOMIS Foundation is allowing Reiman and colleagues to pursue this idea using RNA sequencing data in 100 Alzheimer’s disease cases and controls. The study examines five different brain cell types in six different brain regions that differ in their vulnerability or resistance to the disease.

“Big data analysts on our team, and others in the field, will use these data in part to find disease networks and drivers of those networks that could be targeted by new treatments,” he explained. The approach is meant to foster a push-pull relationship between the use of human data and experimental work from animal and cellular models, and the intent is to “learn more by capitalizing on complementary approaches than we can by using either approach alone.”

On another front, the Global Alzheimer’s Association Interactive Network (GAAIN), a system for open access data powered by the Laboratory of Neuro Imaging (LONI) at the University of Southern California in Los Angeles, links scientists with databases from all over the world, enabling a researcher to go into GAAIN and perform a search that might pull information from several different sources.

Yet another example of progress made can be found in the Alzheimer’s Disease Neuroimaging Initiative (ADNI), a longitudinal multicenter study that looks to develop biomarkers for early detection and tracking of the disease. Initiated back in 2004 and funded as a private-public partnership, the ADNI dataset consists of brain imaging, cerebrospinal fluid, and cognitive information gathered from participants who have been followed over time. “Not only has the data been harmonized once it’s collected,” Reiman said, “but it’s been standardized from the beginning even though it’s been collected from multiple sites.”

Getting Pharmaceutical Companies Involved

There are now similar ADNI-like programs around the world, according to Reiman, and there is an effort on the part of the Collaboration for Alzheimer’s Prevention (CAP) to accomplish much of the same data sharing and collaboration in clinical trials. In 2016, the group (which includes leaders from FDA, the National Institute on Aging, and the Alzheimer’s Association, among others) published principles to guide data and sample sharing in clinical trials in Alzheimer’s & Dementia: the Journal of the Alzheimer’s Association (doi: 10.1016/j.jalz.2016.04.001).

“We think there’s a lot that can be learned in these trials besides developing a company’s drug that would help the entire field develop faster ways to test prevention therapies,” he said. “And we’re hoping that those standards increasingly become the standard for sharing.”

Information is gathered every time a research lab or drug company performs a study, but to date, it has not been common for researchers overseeing large studies to share their data widely in a user-friendly, accessible way. This has been even less likely in clinical trials considering drug companies’ concerns about threats to intellectual property, trial integrity, and approval chances. They also wonder about the cost of technical support and who will foot that bill.

“A lot of those issues are being worked out,” Reiman said. “I think you’ll see a growing emphasis on that from both NIH and philanthropy working with academic researchers and industry to share more data in meaningful ways.”

One example is the Alzheimer’s Prevention Initiative (API) Autosomal Dominant Alzheimer's Disease (ADAD) Trial, which includes cognitively healthy individuals who, because of their genetic history, are expected to develop Alzheimer’s disease. The trial, which was proposed as a public/private partnership involving NIH, philanthropy, and industry, focuses on whether an anti-amyloid antibody treatment can ward off the disease.

“We secured an agreement from Genentech and Roche to share the datasets,” Reiman noted, and this is just one of several prevention trials that have secured commitments from the companies involved to share baseline data.

Could Ontologies Be a Difference Maker?

Once all of these data are compiled, will big data analysis and artificial intelligence (AI) glean insights that couldn’t be learned before? Hopefully, but training algorithms requires commonality between datasets.

Different researchers use different ways to measure cognitive decline, describe patient observations, and store information. These differences make it difficult for an algorithm to pick up on patterns across datasets. Peter Robinson, a computational biologist at The Jackson Laboratory, sees ontologies as a potential remedy. These systematic representations of knowledge not only describe the properties of terms but also indicate the relationships between them.

“Computers are great at crunching numbers or at searching texts for individual words, but they do not function well out of the box to compute over human knowledge,” he explained, “so ontologies are really a tool to represent human knowledge computationally such that a computer program can infer things that you didn’t put into the program.”
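The kind of inference Robinson describes can be illustrated with a toy example. The sketch below models an ontology as a directed "is-a" graph, so a program can recognize that two datasets annotated with different terms are describing compatible observations; the term IDs and labels are purely illustrative, loosely styled after HPO identifiers, not actual HPO entries.

```python
# A tiny "is-a" hierarchy: each term points to its parent term.
# IDs and labels are hypothetical, for illustration only.
IS_A = {
    "X:0003": "X:0002",  # "Episodic memory impairment" is-a "Memory impairment"
    "X:0002": "X:0001",  # "Memory impairment" is-a "Cognitive impairment"
    "X:0004": "X:0001",  # "Impaired executive function" is-a "Cognitive impairment"
}

def ancestors(term):
    """Return the term together with all of its ancestors up the is-a chain."""
    result = {term}
    while term in IS_A:
        term = IS_A[term]
        result.add(term)
    return result

def related(term_a, term_b):
    """Two annotations are compatible if one term subsumes the other."""
    return term_a in ancestors(term_b) or term_b in ancestors(term_a)

# One study records a specific finding, another a broader one; subsumption
# reasoning lets an algorithm treat them as matching observations.
print(related("X:0003", "X:0001"))  # specific vs. general term: True
print(related("X:0003", "X:0004"))  # sibling branches: False
```

Even this minimal structure shows the point: the program was never told explicitly that the two specific terms relate to the same broad category, yet it can infer it from the encoded relationships, which is what makes ontology-annotated datasets comparable across research groups.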

Robinson’s group developed the Human Phenotype Ontology (HPO) to serve as a standardized vocabulary of phenotypic abnormalities seen in human disease, and it is now an international standard used by the Sanger Institute, a number of NIH-funded groups, Genome Canada, the rare diseases section of the UK’s 100,000 Genomes Project, and others.

In a paper published this fall in the New England Journal of Medicine (doi: 10.1056/NEJMra1615014), he and colleagues describe the types of changes that could help revise the medical data infrastructure and better enable the integration and analysis of large amounts of data.

Although ontologies can help align data across patients and systems while providing consistency across huge numbers of terms and concepts, Robinson acknowledges that ontologies will not solve all the issues that need to be addressed in order to harness the true potential of medical data, or Alzheimer’s data in particular.

Electronic health records (EHRs), for example, provide an opportunity to collect many types of data, but they can also hinder the analysis of patient-level, high-throughput, or molecular data in conjunction with clinical information because the records are oftentimes incomplete, incorrect, or lacking in detail. This, he says, all stems from their design.

“The current generation of electronic health records, unfortunately, isn’t really built to help doctors do their job,” he said. “It’s made to be really good at billing.” A challenge that needs to be overcome moving forward is “to capture data in electronic health records that is comprehensive and accurate and semantically harmonized.”

Another difficulty relates to patient privacy and confidentiality. How can these concerns be appropriately balanced with efforts toward scientific progress and the betterment of society? “The whole topic of data privacy, of deidentification of data, is really important,” Robinson added, “and I think it’s still an unsolved problem.”

Paul Nicolaus is a freelance writer specializing in science, nature, and health. Learn more at
