Tour of the Cloud: Emerald Cloud Labs’ Hardware and Software Vision

September 24, 2015

By Allison Proffitt

September 25, 2015 | It was more than a year ago when I first spoke with Brian Frezza and DJ Kleinbaum about their vision for Emerald Cloud Labs. It’s plug and play experimentation at its most elegant: Let robots handle the tedious, troublesome bits—pipetting, mixing, and timing—and automate everything. Researchers enter the parameters of the experiment they want done, and Emerald does it. Exactly.

But seeing it is something else.

I visited ECL’s first facility, ECL-1, in South San Francisco earlier this month. The lab is housed in a former post office processing facility, where robotic pipetting arms and automated protocols are busy doing whatever is ordered. The labs are bright and clean. Just a few “orchestrators” (lab techs) move through their paces with rolling carts, lab coats, and bar code readers. Every reagent is tracked. Robotic arms move exactingly in climate-controlled boxes. “Sensornet,” the lab’s sensor system, records temperature, humidity, light levels, pressure readings, CO2 saturation, liquid levels, pH, and other environmental inputs. Every data point is recorded, and ECL customers have access to all of it.

ECL chose not to build instrumentation, instead relying on names you know. “It's a very large and mature industry we're joining, and there is fantastic hardware that we have no intention of reinventing,” Kleinbaum said. “Waters makes outstanding Mass Spectrometers; Hamilton makes fantastic liquid handlers; Bruker makes the best NMRs on the market; and ProteinSimple makes a revolutionary new Western Blotting device that we love. It is our preference to integrate the best in class hardware off the shelf.” All of the connective pieces are made from a “bucket of parts,” Kleinbaum stressed. Every hose and gasket and arm and nozzle is uniform and swappable. There are no “lucky” instruments here, he said.

Emerald isn’t the only cloud lab. Transcriptic in Menlo Park, Calif., has a similar vision and has been live a bit longer. But that hasn’t stopped ECL’s funding push. In June, Emerald Cloud Labs closed a $20.5 million series C funding round led by Schooner Capital, bringing its total funding to $34 million. Today ECL has about $5 million worth of equipment and offers 46 types of experiments. In the next 18 months, Kleinbaum and Frezza expect to have about $13 million worth of equipment and will add NMR, electron microscopy, x-ray crystallography, and DNA sequencing to their menu.

ECL has already run more than one million samples since the lab went online in April of this year, and is now running 24 hours/day on Tuesday, Wednesday and Thursday and 18 hours/day the rest of the week. Beta users include big pharma, academic labs, and startups. Kleinbaum says there are 700 labs on the waiting list, and many of those will move into production as the lab moves to 24/7 productivity by Thanksgiving.

“While our near term focus is more on… getting all the experiments online rather than massively scaling up capacity of the experiments we already offer, we'll also be ramping up capacity and hope to get everyone on the wait list onto the ECL-1 within a year or so,” said Frezza. At full capacity, the present facility should be able to run 4.5 million samples each year.
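To put the schedule and capacity figures above in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted in this article; spreading the annual capacity evenly over 365 days is an assumption made for illustration, not something ECL stated.

```python
# Figures quoted in the article.
FULL_DAYS, FULL_HOURS = 3, 24        # Tuesday, Wednesday and Thursday
PARTIAL_DAYS, PARTIAL_HOURS = 4, 18  # the rest of the week
ANNUAL_CAPACITY = 4_500_000          # samples per year at full capacity


def weekly_hours(full_days, full_hours, partial_days, partial_hours):
    """Total operating hours in one week under the current schedule."""
    return full_days * full_hours + partial_days * partial_hours


current_hours = weekly_hours(FULL_DAYS, FULL_HOURS, PARTIAL_DAYS, PARTIAL_HOURS)
round_the_clock = 7 * 24  # hours per week once the lab runs 24/7
utilization = current_hours / round_the_clock

# Assumption: capacity is spread evenly over 365 days.
samples_per_day = ANNUAL_CAPACITY / 365

print(f"{current_hours} of {round_the_clock} hours per week ({utilization:.0%})")
print(f"about {samples_per_day:,.0f} samples per day at full capacity")
```

By this estimate the lab is already running about six-sevenths of the hours it would run at 24/7 operation.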

Building a Global Platform

But for Frezza and Kleinbaum, the goal isn’t just volume, reproducibility, and experimental design granularity. Of ECL’s staff of 29, only five operators facilitate experiments. For all of the instrumentation and custom-designed workflows and SensorNet infrastructure, Emerald is a software enterprise.

Kleinbaum and Frezza want to not only generate data, but facilitate the data analysis afterward. “Right away we realized that it’s goofy that every instrument has its own data plotting program,” Frezza said. The solution has been to export data to Excel, but that’s not a solution that scales when researchers can put 20 experiments with 10-15 samples each into the queue a day, Frezza argues.

ECL built its Integrated Software Environment (ISE, now in version 3) to view and work with all of the data coming out of the cloud lab. ECL uses the Wolfram Language on the back end (Stephen Wolfram is an advisor), and all of the compute happens in Amazon EC2. To plot data, just type “plot” into the command line and all data are visualized and interactive, Frezza says.

“This works for every different data type we have in the system. It works for PAGE [polyacrylamide gel electrophoresis] data, microscope data,” Frezza says. “It can handle crystal structures, RNA folding. You can do mass spec; you can do fluorescent spectroscopy. We have a probe design tool, you can do flow cytometry data in glorious and totally unnecessary 3D if you want it.” And, he adds, it’s really fast.

“Imagine if you worked at a giant pharmaceutical company,” says Kleinbaum. “If you could—before you ran an experiment—say, ‘What are the ten most similar experiments that have been run in the history of Pfizer?’ That would be hugely valuable in making the decision about the next experiment you’re about to do.”

The system opens up a new level of data query, Frezza says: “Show me every microscope image I’ve taken of this cell line stained with this antibody. Show me examples of this cell line after 20 or more splits compared to the first and second split. Show me every chemical reaction I’ve run on this functional group.”  

All of this happens without any data entry; the data is indexed coming out of the lab.
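The kind of query Frezza describes can be pictured as filtering over metadata that was captured automatically with each result. The Python sketch below is purely illustrative: ECL's actual system is built on the Wolfram Language, and the record fields, the `find_images` helper, and the sample data are all hypothetical, invented here to show the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRecord:
    # Hypothetical metadata fields; the article does not describe ECL's schema.
    image_id: str
    cell_line: str
    antibody: str
    split_count: int


def find_images(records, cell_line, antibody=None, min_splits=0, max_splits=None):
    """Return the records for one cell line that pass the optional filters."""
    matches = []
    for record in records:
        if record.cell_line != cell_line:
            continue
        if antibody is not None and record.antibody != antibody:
            continue
        if record.split_count < min_splits:
            continue
        if max_splits is not None and record.split_count > max_splits:
            continue
        matches.append(record)
    return matches


# Made-up sample records, standing in for data indexed as it leaves the lab.
records = [
    ImageRecord("img-001", "HeLa", "anti-tubulin", 1),
    ImageRecord("img-002", "HeLa", "anti-tubulin", 22),
    ImageRecord("img-003", "HeLa", "anti-actin", 2),
    ImageRecord("img-004", "HEK293", "anti-tubulin", 25),
]

# "Every image of this cell line stained with this antibody."
stained = find_images(records, "HeLa", antibody="anti-tubulin")

# "This cell line after 20 or more splits compared to the first and second split."
late = find_images(records, "HeLa", min_splits=20)
early = find_images(records, "HeLa", max_splits=2)
```

Because the metadata is attached when the data is generated, nobody has to fill in a field by hand before a search like this works.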

ECL intends for the platform to be “the global platform” that runs all of your experiments and analysis. “You don’t need to open any other software unless you want to,” Frezza says. “This gives you every facility you need to have to do your daily work.”

This is the kind of functionality that Frezza says is secondary to startups that have no data yet and are desperate for instrumentation, but Kleinbaum says it blows away big pharma companies, which are drowning in their own data. “They understand the value of this data, and the value of their scientists in Edison, New Jersey, being able to know what their scientists in Zurich did the previous day.”

Programmers can connect to the ISE through a RESTful API, but ECL has also built a graphical user interface that will be released within the quarter. All the talk of coding could put some users off, but Kleinbaum and Frezza don’t want that to be the case. ECL doesn’t require programming skills, Frezza stressed; it’s not really a command line. “Everything in the ECL has to convert to these one-line commands,” Frezza says, but the GUI will be all point and click. “People are kind of coding, but we don’t want to trigger people’s code-a-phobia… On their very first session a scientist can be instantly productive and feel their way through things by just clicking around.”

Of course there are downsides, Frezza says. “Your organizational skills need to go way up because you can run huge numbers of experiments. Your data becomes extremely valuable when you have this level of capture! You start being able to search through it; you want it all the time.”