Deanna Church Brings Reference Genome Expertise to Personalis

February 5, 2014 | Fifteen years ago, Deanna Church joined the National Center for Biotechnology Information (NCBI) and eventually took the lead of the NCBI’s group contributing to the Genome Reference Consortium (GRC). Church has been instrumental in creating and advocating for the reference assemblies (see “Deanna Church on the Reference Genome Past, Present and Future”).

But at the end of last year, Church left NCBI and joined Personalis, a Stanford startup offering genome interpretation services and genome-scale diagnostics. Suddenly, Church said, she’s getting to use the resources she spent so many years developing at NCBI.

Church and Bio-IT World editor Allison Proffitt spoke about Church’s role and goals for Personalis.

Bio-IT World: I saw that the formal announcement went out last week that you had joined the Personalis team. But you’ve been there for a month now, a month and a half?

Deanna Church: Yes, I started in early December, basically right after Thanksgiving, or right around Thanksgiving. I think it’s going great. So I’ve been working with the team here. My department is really the genomics and content area. As you know, Personalis really has done a great job of amassing a huge number of databases and data for annotation and interpretation. So I’m going to be involved in starting to help take that to the next level and continue to improve the really great annotation data they’ve already put in place. And then, perhaps not too surprisingly, I’ll also be working to try and help on the pipeline and annotation side of things to try and use many of the improvements the GRC has been making to the reference assembly over the past few years.

One of the biggest challenges in terms of personal genomics is not just identifying the variants, but really associating those variants with phenotype – or in our case, since we’ve really been focusing a lot more in the Mendelian space, really trying to find that causative variant. And so you really need to try and aggregate all of the available information that’s out there, taking advantage of resources like ClinVar and OMIM and other resources that are out there, to try and integrate that so you can get the most knowledge that you can to interpret that genome.

Personalis’ focus is on Mendelian diseases. Why that and not another area of focus?

I think there are a lot of pretty compelling reasons. If you look at the field in general, there’s a lot of utility in looking at these Mendelian disorders, because I think the whole general genome interpretation of healthy individuals is still really, really challenging. Whereas if somebody comes to you and they clearly have a genetic disorder, and you can try and identify the cause of that, that’s a much more tractable problem to address right now, and you can actually really do some immediate good – especially as the field transitions from these more panel-type tests to whole exome. If you can shorten the time that it takes that person with that disorder to understand what the source of the problem is, you’ve really helped improve their clinical outcome and their clinical care. Even if there’s not necessarily a treatment, there’s a lot of evidence to suggest that having these families at least get diagnosis is really, really helpful for them.

You’re bringing in many years of expertise in building the reference genome at NCBI. What’s your goal here that you couldn’t have done there before?

There’s been a real fundamental difference between my role at NCBI and my role here. And that is, at NCBI, my role was really to facilitate research for other researchers. So my group was building tools that facilitated other people’s research. Which is great, and I think it was important, but here it’s sort of a fundamentally different role where I can build tools for both my team and other teams here at Personalis, and actually try and apply them to genome interpretation. So to me, it’s finally getting to use a lot of the resources that I helped contribute to building while I was at NCBI. So that’s kind of one of the reasons for the switch: to do a little bit more application, and get a little bit closer to the data than I was at NCBI.

Did the timing just happen to line up well with the end of Build 38 or had Personalis approached you earlier and you just needed to finish that first?

Actually, I think the timing for both Personalis and myself just happened to be pretty good. I definitely was looking to see what other opportunities were with [Build] 38 coming out, because I felt like that was a pretty significant milestone. And really, in fact, the day-to-day running of the GRC had been handed off to Valerie Schneider well over a year before I left NCBI anyway (see “Getting to Know the New Reference Genome Assembly”), so I really had already moved into much more of a consulting role for the GRC, rather than the day-to-day management of it. So the timing seemed to be quite good with respect to that.

Why did you think Personalis was the best fit after NCBI?

I think the GRC has done a lot of work on improving the assembly in a lot of ways, and very few places have really taken advantage of that to the fullest extent. One of the things that I really loved about Personalis when I was talking with them was, they’ve really had a very intense focus on data quality. They really had a very sophisticated understanding of many of the problems with the reference assembly. They’ve been using GRCh37 for a while and had a pretty good understanding of many of its shortcomings and some of the issues surrounding that, and I was very impressed by the terms in which they talked about it; I hadn’t heard that from a lot of other people. And so trying to join a team that was already interested in trying to get the most out of the reference assembly was really exciting for me on a personal level, and actually getting to be more involved in thinking about ways that we can better use the assembly for analysis is pretty exciting.

So many people in the field of genome analysis sort of take the reference assembly for granted, and don’t necessarily acknowledge the fact on a day-to-day basis that, even though you might be doing whole genome sequencing, you’re very rarely doing a whole genome analysis. And the same is true for exomes. So if you do standard exome analysis, there are typically many parts of the genome that are missing from that analysis. And there are obviously technical and sequencing reasons for that, but there are also analysis and reference assembly reasons for that. And [Personalis] had a really very good and thoughtful approach to dealing with that, and an understanding of some of the shortcomings, so they were already very interested in trying to take full advantage of it.

Personalis wants raw data only. Even if you’ve already had your data aligned, they don’t want that, they want to do all of it themselves. Why is that?

Yes, that’s right. I think that Personalis has put a lot of work into trying to improve that – to get as much out of the informatics as they can. The gaps that currently exist in individual genome interpretation span all aspects of the pipeline. So it can start with sequencing: you might have parts of the genome that aren’t even represented in the sequence you produced. There can be problems with the reference, problems with the alignment, problems with the interpretation, and so they’ve really worked hard to try and approach this problem as an integrated problem, rather than just say, well, we’re just going to work on the sequencing or we’re just going to work on the bioinformatics. Because you can’t. If you really want to do good genome interpretation, you can’t just focus on one part of the problem. You have to think of it as a holistic problem.

They’ve actually done a lot of work in terms of improving the alignment process or improving the interpretation process. And so it’s a very integrated process. It’s not just, well, we’re going to improve in one place or we’re going to improve in the other place.

This is going to be a slightly unfair question, because you’ve been there two months. But if you could see Personalis do anything at all in the next year, what would your hope be for the company?

Well, I would certainly like to see us grow and start doing a lot more clinical exomes. One of the things I certainly would love to see us doing is using things like GRCh38 to do analysis, as opposed to just GRCh37, and really being able to take advantage of the full assembly, not just the primary assembly, which is largely what people use right now. So really being able to robustly analyze a multi-allelic reference as opposed to just the primary reference would be fantastic for Personalis.