The Next Evolution of BaseSpace
By Aaron Krol
November 1, 2013 | In a long-anticipated move to expand the range of its cloud-based analytics services, Illumina announced that BaseSpace has opened its Native App Engine to any prospective developers who want to add an app to the BaseSpace cloud. Alex Dickinson, Senior VP of Cloud Genomics, made the announcement at Illumina’s first worldwide developers meeting held in Boston on October 21, where Illumina representatives and app developers from a range of corporations had an opportunity to share their cloud offerings and test out the engine.
It’s been just over a year since the first BaseSpace app went live on October 15, 2012, signaling a decisive move by Illumina into the analytics end of the sequencing pipeline. While Illumina’s core products remain its pair of next-generation sequencers, the MiSeq and rapid-run HiSeq devices that together account for 90% of all genetic sequencing performed worldwide, the company saw a chance to leverage its dominance of the market to provide a seamless, end-to-end informatics pathway integrated directly with the raw data reads. Since December of 2011, all users of Illumina instruments have had the option to store their sequencing data securely in the BaseSpace cloud, where it can be accessed remotely for long-distance collaboration and use on mobile devices. From the beginning, however, Illumina envisioned BaseSpace as more than just a storage option. The Illumina management team realized that, with the advent of next-generation technology, sequencing even very large sets of genetic data had become far faster than analyzing that data, creating a bottleneck in the pipeline that slowed the use of Illumina devices. “Increasingly, what we’re hearing from the marketplace is people don’t just want lots and lots of raw data,” said Nick Naclerio, the Senior VP of Corporate Development at Illumina, in an interview with Bio-IT World. “They need to be able to put that data into a biological context and convert it into some actionable information.”
By creating an app store within BaseSpace, Illumina could offer streamlined access to informatics software from a variety of providers, and easy movement of data between tools, speeding up the stream of data through analysis. “What we intend to do is make it seamless for users to go from the sequencer all the way up into enterprise-level informatics,” says Naclerio, who can point to around a dozen apps already housed in BaseSpace for niche functions ranging from de novo genome assembly to HLA typing. The majority of these apps were written by third-party companies looking to connect with the large customer base inside the Illumina cloud, although a few are provided by Illumina itself.
Still, many of the apps released so far are proof-of-concept programs with limited functionality compared to commercial alternatives, and despite more than 13,000 app launches to date and the introduction of an ecommerce system this May, BaseSpace has yet to really take off as a marketplace for genetic analysis. At this early, speculative stage, established developers still regard BaseSpace more as an introduction to potential users than a sales floor. “BaseSpace is a very rich, well-designed, evolving environment that it’s just natural for us to want to play in,” Matt Landry, the CTO of software developer Biomatters, told Bio-IT World. Biomatters is the first company to create two apps on BaseSpace: a simple genome browser released in 2012, and just in the past week a more robust informatics tool called Melanoma Profiler for exploring the mutations and known drug associations found in cancer cell sequences. Yet Biomatters has no intention of charging for the use of these apps, nor of moving its core business of providing researchers with full analytical services for specific projects into BaseSpace. “The way we look at BaseSpace,” adds Landry, “is it’s a good platform for us to make some of the sub-tools available… We work with a new informatics tool, and we expose it on BaseSpace, and make it available to everybody.” Meanwhile, the smaller startups who have the most to gain from using BaseSpace as a central platform and marketplace have faced the largest barriers to entry, not least of which is the need to operate an independent cloud where their apps can run after being launched in BaseSpace.
Letting the Apps In
The public release of the Native App Engine addresses those barriers, marking a shift in the evolution of BaseSpace from its experimental phase to something more akin to an iPhone’s app store: an environment that attracts its own dedicated businesses seeking to compete directly with the dominant software developers’ independent platforms. The Native App Engine is the same tool Illumina uses to create its own apps, and it eases the demands on app designers in two major ways. First, it allows apps to be hosted inside the BaseSpace cloud, rather than redirecting users to third-party clouds. And second, it integrates code written in any language into a common app portal, allowing developers to write and test their apps locally in a comfortable format, without the use of a software development kit. “We need to provide an environment where everyone can be incented to participate,” said Scott Kahn, CIO of Illumina, at the developers meeting.
The Native App Engine proceeds in three steps, beginning with a form builder where developers select the data inputs they want their apps to process. Data can be collected from a user’s own hardware, from BaseSpace’s cloud storage of raw reads, from data offered publicly within BaseSpace, or even from other apps; Illumina insists that users have continuing access to data generated during app runs. Next, the app’s code is fed into Docker, a container engine that allows the app to run in any Linux environment, regardless of the coding language it’s written in. Finally, a report builder provides options for displaying the results of the app’s analysis to users. This segregation of the app’s functionality from its input and output settings ensures that apps created through the Native App Engine have a unified look and feel, with an interface that BaseSpace users will recognize even as they travel between apps written in different languages by widely disparate designers.
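As a rough illustration of how those three stages separate, the Python sketch below models a form definition, a container invocation, and a report renderer as independent pieces. It is not Illumina's API: the names InputForm, docker_command and build_report, the mount paths, and the image name are all hypothetical, chosen only to show the separation of inputs, app code, and outputs. The only real interface referenced is the standard `docker run` command line, which the sketch builds as a list of arguments but does not execute.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InputForm:
    """Step 1 (hypothetical): the data inputs an app asks the user for."""
    # Maps each field name to a label for where its data comes from.
    fields: Dict[str, str] = field(default_factory=dict)

    def validate(self, values: Dict[str, str]) -> Dict[str, str]:
        """Keep only the declared fields; fail if any declared field is missing."""
        missing = [name for name in self.fields if name not in values]
        if missing:
            raise ValueError(f"missing inputs: {missing}")
        return {name: values[name] for name in self.fields}


def docker_command(image: str, input_dir: str, output_dir: str,
                   args: List[str]) -> List[str]:
    """Step 2: build (but do not run) a `docker run` invocation.

    The container paths /data/input and /data/output are assumptions
    made for this sketch, not paths documented by BaseSpace.
    """
    return [
        "docker", "run", "--rm",
        "-v", f"{input_dir}:/data/input:ro",   # inputs mounted read-only
        "-v", f"{output_dir}:/data/output",    # results written here
        image, *args,
    ]


def build_report(title: str, results: Dict[str, object]) -> str:
    """Step 3 (hypothetical): render results in one uniform text layout."""
    lines = [title, "=" * len(title)]
    lines += [f"{key}: {value}" for key, value in sorted(results.items())]
    return "\n".join(lines)


if __name__ == "__main__":
    form = InputForm(fields={"sample": "cloud_storage", "reference": "public_data"})
    inputs = form.validate({"sample": "sample-001", "reference": "hg19"})
    cmd = docker_command("example/variant-app:1.0", "/tmp/in", "/tmp/out",
                         ["--sample", inputs["sample"]])
    print(" ".join(cmd))
    print(build_report("Run summary", {"sample": inputs["sample"], "status": "ok"}))
```

Because the app itself is reduced to a container image plus arguments, the form and report code never need to know what language the analysis is written in, which is the property the article attributes to the engine's design.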
Illumina is in a unique position to host an app store on this communal model, because it views its informatics services less as an independent driver of revenue, and more as crucial support for its primary business of making sequencers. If app designers want to make a profit from their contributions to BaseSpace, Illumina will take a small cut of what they charge users; but if designers want to make their apps freely available, or only charge for the computing power required to run an analysis, Illumina is happy to host their programs at no fee. “The end user benefits,” Kahn said to Bio-IT World, “because they’re going to get a richer set of tools to operate on their data. But I also think that the developers benefit, because they have a very simple way to access people that need their algorithm.” Illumina is also able to offer large volumes of free storage to attract users to BaseSpace, because the company’s primary concern with storage is that it not be an impediment to use of the MiSeq and HiSeq devices.
The first third-party apps built with the Native App Engine likely won’t appear for a few months, as Illumina maintains a quality control process including expert review for all submissions to BaseSpace, which must be cleared before an app becomes available to users. “We definitely have to fill the role of curation,” says Kahn, stressing the importance of a “level of standards that users will come to trust and rely upon.” In the meantime, Illumina is encouraging designers to begin the process through incentivized contests, offering prizes to developers who create the first apps with specific capabilities. Illumina’s first BaseSpace contest, for app ideas that fulfill novel functions, just closed on October 23, producing two winners.
But simply giving prospective developers access to the huge Illumina customer base, with the largest companies and smallest startups operating on the same platform, may be incentive enough. Landry, for one, sees the terms of competition between informatics companies being altered in this new ecosystem. “Some of the areas that we ended up competing on in the desktop market are around ease of use, [and] ability to pull in data and export data,” he says. “A lot of that goes away with something like BaseSpace… It does change the way that you would compete trying to offer a comprehensive set of analytics tools.”
A screenshot of the Melanoma Profiler app in use. Image credit: Biomatters
Betting on the Cloud
The pace of change in Illumina’s analytics capabilities has been accelerating since BaseSpace was first conceived two years ago. On October 29, just eight days after the Native App Engine launch, Illumina finalized the acquisition of big data company NextBio, which will be incorporated into the newly formed Enterprise Informatics Business Unit led by Naclerio. NextBio offers two major assets to Illumina: a cloud-based research platform that performs metadata analysis on the level of genes, molecular pathways, clinical data and population studies; and the world’s largest curated dataset of genomic information to power that platform, collected from sources both public and exclusive to NextBio. “By having that rich collection of datasets,” Naclerio told Bio-IT World, “it allows you to interpret your genome or your experiment in the context of all the prior work” performed in your field. The NextBio dataset also promises deeper and more informed analyses through BaseSpace. “BaseSpace was designed… to provide functionality through a rich set of Illumina-developed and third-party apps,” says Naclerio. “With NextBio, we can extend what people can do with that genomic information by bringing it together with systems biology and allow them to interpret their samples in the context of a bigger database.”
Over the next year, Illumina hopes to see NextBio merged with existing Illumina informatics assets, approve an increasing number of BaseSpace apps from designers both large and small, and begin hosting BaseSpace through Amazon cloud sites outside the U.S. In betting on an environment where cutting-edge genetic research will increasingly be able to take place in the cloud, Illumina seems to be in step with companies like Biomatters that have been exclusively devoted to the analytic side for years. “We think [cloud technology] maps very well onto what we see as the future applied use of genomics,” says Landry, who is eager to see how Biomatters’ partners adapt to cloud software. “That’s one of the interesting things that BaseSpace will help uncover for everybody: how easy and seamless it will be to try to do research on the cloud.”