Toby Bloom on the New York Genome Center’s Crystallizing Clinical Vision

By Allison Proffitt

March 26, 2014 | Since its inception, the New York Genome Center has been evolving. The Center was legally incorporated in April of 2010, and was unveiled in a ceremony in November of 2011. In September 2013, NYGC moved into its Manhattan headquarters at its “official launch.” And last week, the Center announced its first founding technology member (as opposed to previously named technology collaborators).

Now, nearly four years after the legal incorporation, there’s the feeling that the Center is picking up momentum and crystallizing its identity. Since January, two faculty members have been hired (Tuuli Lappalainen and Joe Pickrell, both of whom also have appointments at Columbia University), and a Chief Operating Officer and a Chief Communications Officer have joined the team.

The new hires make Toby Bloom, the Center’s Deputy Science Director, Informatics, look like an NYGC veteran with almost a year under her belt (she was hired in May 2013).

Though relatively new to New York, Bloom is a veteran of informatics and genomics. Her Ph.D. in computer science from MIT led her to IBM and Clinsoft, among other appointments, after graduate school. Before joining NYGC, she spent more than ten years back in Kendall Square, managing business analytics and data warehousing at the Broad Institute.

With that expertise in hand, at the New York Genome Center she is responsible for the Central Computational Biology Group, all of the software engineering and informatics infrastructure needed for the sequencing center, and all of IT and research computing: “Everything vaguely related to computers.”

In defining NYGC’s identity, Bloom is quick to clarify what the Center is not. “Initially the New York Genome Center looked to a lot of people like just a fee-for-service sequencing center. We are not that,” she says. “We are an academic research center with faculty members with joint appointments at various New York institutions.”

The distinction is important to Bloom and NYGC. She stresses that NYGC is involved in research projects on its own and for its member organizations—which currently number 17 including founding members, associate members, and technology members.

But the list of what the New York Genome Center does do is long. The Center was a startup of six people when she joined, Bloom says, serving a community of collaborators in New York and beyond. (NYGC now has a staff of about 75, and its website lists 20 job openings.) The Center is focused on clinical genomics and on using genomics to improve health in the short term, staying close to the patients and close to the doctors who treat them. The Center provides pilot funding to researchers and maintains lab space for visiting researchers. The Center is not yet CLIA certified, though Bloom says that the CLIA audit has been completed, and NYGC is waiting on final approval. The Center organizes working groups on topics of interest and hosts a seminar series open to the community. Bloom says that the Center intends to be a trusted broker and neutral territory for data, and a facilitator of partnerships.

And it does some fee-for-service sequencing.

Pure Mix

The projects that NYGC takes on are a mix—“a pure mix”—of ones driven by the community and ones driven by NYGC, Bloom says. For example, NYGC and six other large medical institutions in New York have a contract from PCORI, the Patient Centered Outcomes Research Institute, to merge all of the de-identified patient medical records from the hospitals. Bloom is expecting somewhere between 2.5 million and 6.5 million longitudinal medical records.

“That’s more data than any hospital can get on their own, and we are putting in place all the infrastructure, the regulations and security, data use agreements, and everything we need to be able to do that to let researchers come to one place and get all of the clinical data they need,” she says.

The goal is to share the data with other PCORI sites around the country, Bloom says. Initially the collection should enable large retrospective studies of clinical data, but she hopes that genomic data will one day be incorporated and that prospective studies can be done.

The PCORI project was driven by the partner institutions, Bloom recalls, but the NYGC is driving two other large research projects.

The first is a project Bloom is very excited about: a large autoimmune project covering four diseases. Though the project is in its early stages, Bloom expects to take genome samples at the outset and then weekly blood samples from patients undergoing treatment for autoimmune diseases. Microbiome samples will be gathered periodically, and possibly other data as well. In addition, the patients will see their doctors monthly, and clinical measures of inflammation and other markers will be collected. Participants will be invited to download an app on their smartphones that will track their movement via GPS (including how far they moved and how quickly) and relay that information to the study.

The project is the future of clinical research, Bloom says.

“We’ve got five or six different kinds of data here. The multimodal nature of this data and the fact that it’s all longitudinal, that we’re taking it at different time points…makes it very challenging from the data analysis point of view, from the data storage point of view, from the data querying point of view.”

Bloom hopes that by querying the data, researchers will find patterns that can predict flares in Crohn’s Disease or rheumatoid arthritis.

“One of the things we’re trying to do here is see whether we can help patients better manage their own chronic diseases, at the same time that we’re trying to use genomics to figure out what the biological mechanism of those diseases is.”

Working with Watson

The second project being driven by NYGC is the collaboration with IBM announced last week. With nine other institutions, NYGC is leading a glioblastoma trial that currently involves over 20 neurologists and neuro-oncologists.

The Genome Center and its consortium members will sequence tumors of glioblastoma patients and combine that genomic data with RNA-seq data and clinical data. Watson’s role will be to take that sequencing and clinical data, and scan the available literature to try to identify what pathways are involved and which are the relevant mutations, so that the physicians can choose on which to focus.

The study is the first of its kind, said Robert Darnell, NYGC’s President, CEO, and Scientific Director, at the press conference held last week, and will “connect the passion of the physician with the power of technology.”

Both projects are led by Darnell, who is also an attending neuro-oncologist at Memorial Sloan Kettering Cancer Center, and senior physician and professor of cancer biology at Rockefeller University.

“It’s hard to talk about whether they’re driven by the New York Genome Center or driven by Rockefeller because the President of the New York Genome Center is a Howard Hughes Medical Investigator and Rockefeller Professor,” Bloom explains. “They’re being driven by Bob Darnell, which makes them driven by us on some level,” she says.

Clinical Culture

Darnell’s very hands-on leadership style shapes the Genome Center’s culture. “With the glioblastoma project, Bob, I think, basically contacted every neurologist in the city he knew of,” Bloom says. To launch the IBM partnership, Darnell went to IBM and presented the vision for the study himself, recalled John Kelly III, Senior Vice President and Director of IBM Research, during the launch press conference.

The Center also has an active scientific and clinical steering committee, made up of representatives from the member organizations, that meets regularly to discuss projects and the needs of the community.

The combined effect, Bloom says, is that the Center is “so much closer to clinical care” than other institutions. “There’s more of a sense that we have clinical projects that are associated with real patients.”

The New York refrain also runs deep in the Center’s identity. In the new faculty interviews posted on the NYGC blog, the second question asks why researchers wanted to move to New York City to practice science.

When discussing the glioblastoma study, Bloom points out the strengths of the New York City location. “One of the things about being in New York City is we have among the most diverse populations in the world, and if we’re looking at all of these hospitals—not just Rockefeller University or MSKCC—we’re going to get much more diverse populations.”

But the Center is not limiting itself to collaborations or partnerships within New York City or State, Bloom says. Indeed, the Jackson Lab, headquartered in Maine, was one of the institutional founding members. Bloom says the Center is “talking to some institutions on the West Coast right now,” and that facilities are even open to non-members.

The environment was a draw for Bloom, she says. “The culture of a startup is just different… Moving fast and not having the grand plans but trying to build them day by day is just very different and very exciting and lots of fun.”

Database Love

Yet Bloom does have grand plans for the Center’s infrastructure.

“One of the first things I’m going to get done is that database [for the PCORI project] to hold the clinical data for all of those hospitals and to be able to answer queries for the PCORI network against that data,” she says.

The next item on the to-do list concerns the Illumina HiSeq X Ten system that’s on its way. NYGC purchased one of the systems a week after Illumina announced them, making it one of the first four X Ten customers.

Bloom expects the first of the ten instruments to arrive within the month. It will join 16 HiSeqs already in house in a sequencing facility that has the physical capacity for 80 instruments. But physical capacity and infrastructure are different animals.

“I’m looking at a brand new analysis pipeline management project to try to optimize all of the computation that we have to do to make sure that it will scale with the X Ten,” Bloom says. “I’m just making sure the computational infrastructure can scale.”

When NYGC completes a sequencing project, the Center stores all data for two years and analyzes as much of the data as the collaborators want analyzed—through interpretation if necessary. That will require a hefty infrastructure on its own, but Bloom says that the Center plans to provide space for data and compute capacity even to groups who haven’t done sequencing with the Center.

“We’re hoping to build that out as an informatics hub for the New York biomedical community,” Bloom says. “Again, it’s a way to foster collaboration among all of those hospitals because if they can all get approval to exchange data, they have a place where they can easily share data.”

That’s the kind of project that Bloom loves.

“Building multimodal databases, multidimensional databases—whatever terminology you want to use—it’s not something we’ve done at this scale with this much data,” she says. “I want to build that database.”

And it seems that’s the kind of project the New York Genome Center loves as well.

“I’m an old data warehousing person from before I was a genomics person. I want to build the be-all, end-all of multimodal databases that holds not only the genomics data but the clinical data you need, the personally-provided data you need, the information coming off devices—all of it,” Bloom says. “To bring all of that together in a way that you can look through cohorts you need, that you could answer the queries you need, that you could find the data you need, or the patients you need, or the other doctors who want to work with you on finding them.”

Editor’s Note: Toby Bloom will lead a panel discussion on the Big Data Storage and Security Maze: Balancing Collaboration and Privacy at the 2014 Bio-IT World Conference & Expo, May 1, in Boston.