Update: On January 24, additional details about the Science DMZ were added to paragraph 5. Nothing was removed from the original story.
By Allison Proffitt
January 17, 2014 | At Aspera, it’s business as usual, Michelle Munson told Bio-IT World this morning, after the announcement that the software company’s acquisition by IBM closed. The acquisition is an “incredible opportunity for Aspera,” Munson said, an opportunity to “continue doing what we’re doing, but with a broader reach.”
IBM announced the acquisition in mid-December, and Munson said that, practically, it represents few changes. “We’re the same group of people, the same leadership.” Munson remains President and CEO. “Our technology strategy and product strategy will remain the same,” she continued. Aspera will stay in its Emeryville, California headquarters and will keep its France-based offices as well. Financial terms were not disclosed.
The technology strategy is surely what attracted Big Blue in the first place. Aspera’s patented, highly efficient bulk data transport technology began life in Munson’s garage in 2004 and is now used for big data transfer by Netflix, Apple, the Broad Institute, and a wide range of other companies and markets. The fasp protocol (fast, adaptive, secure protocol) enables much faster file transfers than any other software or hardware wide area network (WAN) acceleration solution, all on commodity hardware (see Aspera’s fasp Track for High-Speed Data Delivery). Aspera serves users in entertainment and digital media (the company has won an Emmy for its work in the Outstanding Achievement in Engineering Development category); energy, oil and gas; finance; and life sciences.
Under IBM’s ownership, the company’s focus will stay much the same, Munson told Bio-IT World. With the fasp transport technology in use in all types of industries, Aspera’s software development is focused on continuing to increase raw performance; improving ease of use; integration; and developing its software development kit and APIs.
It’s an approach that is very much in line with efforts to achieve a “science DMZ,” Munson said. Taking its name from the standard acronym for demilitarized zone, a science DMZ is a computer subnetwork that is secure, but is optimized for speed without firewalls. The concept was created by the Department of Energy’s ESnet (Energy Sciences Network). “It’s a combination of technologies that allow for unmitigated fast, efficient, and secure flow of the large data that powers life sciences research in and out of private network domains,” Munson explained.
“The three points that we emphasize in our software development—unmitigated performance, ease of access, usability; and integration with the rest of the infrastructure—those are the three key points of the science DMZ concept.”
First, Aspera continues to refine its performance benchmarks. The speed of fasp has always been good and it continues to look better. Partnering with Intel, Aspera benchmarked its technology using the Intel Xeon processor E5-2600 product family. The tests achieved 40 Gbps transfers on commodity hardware, and 10 Gbps WAN speeds on virtual machines.
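For a sense of scale, the arithmetic behind those rates can be sketched in a few lines of Python. This is an illustration only: the function name and the dataset sizes are assumptions chosen for the example, and the calculation assumes an ideal, sustained line rate with no protocol overhead, which is not something the benchmark itself claims.

```python
def transfer_time_seconds(size_bytes: float, rate_gbps: float) -> float:
    """Ideal time to move size_bytes at a sustained rate of rate_gbps.

    Uses decimal units (1 Gbps = 1e9 bits per second) and ignores
    protocol overhead, disk speed, and any ramp-up time.
    """
    bits = size_bytes * 8
    return bits / (rate_gbps * 1e9)


TB = 1e12  # one terabyte, decimal (illustrative dataset size)
PB = 1e15  # one petabyte, decimal (illustrative dataset size)

# The two rates reported in the benchmark: 10 Gbps and 40 Gbps.
for label, size in (("1 TB", TB), ("1 PB", PB)):
    for rate in (10, 40):
        hours = transfer_time_seconds(size, rate) / 3600
        print(f"{label} at {rate} Gbps: {hours:.2f} hours")
```

Under these idealized assumptions, a terabyte moves in minutes at either rate, while a petabyte takes roughly 9.3 days at a sustained 10 Gbps and roughly 2.3 days at 40 Gbps.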
Data volumes are growing in life sciences, Munson noted, and Aspera continues to look for ways to speed data transfer on commodity hardware. Munson said Aspera is seeing improvements from both encryption and synchronization technologies. “We’ve been able to achieve a 2-3 times speed up in our standard software around encryption, just taking advantage of the standard encryption optimizations in market hardware,” she said. “We built a synchronization engine that… delivers not only very fast initial sync of data, but also updates.”
Ease of use continues to be an area of focus for Aspera. The company launched Aspera Drive last year, which brings together all transfer and synchronization paradigms of the faspex 4.0 backend directly into the desktop file explorer. Munson calls it, “a major innovation for us.” In both Windows and Mac environments, users have access to Aspera software to enable transparent sync and drag and drop file transfer. “It makes getting these giant datasets to and from long distance endpoints very easy… the thought is that the end user has the utility and ease of something like DropBox for example, but is able to deal with giant datasets with very fast speed and security.” Munson expects Aspera Drive to appeal to all types of users—from an individual researcher, “who doesn’t want to be troubled with advanced… IT” to a clinician, “who might need to get at some patient data for pathology analysis for example.”
“We want to be able to keep that performance and power, while being able to offer an extremely easy, transparent user experience,” she said.
The area of emphasis for Aspera that echoes much of the IBM acquisition coverage is direct integration of Aspera into cloud or object storage platforms. “The key issue with this opportunity,” Munson said, speaking of the growth of cloud platforms from Amazon and its competitors, “is how in the world do you get these large file sets to and from where they originate?... We’re dealing with not only gigabytes and terabytes but petabytes at a time!”
Aspera’s object storage integration was built a few years ago to replace the native HTTP input/output and it continues to be updated. “What started in Amazon S3 is now fully commercialized on top of Google Storage, which has a heavy life sciences emphasis, Microsoft Azure, OpenStack, and… Akamai.”
In each case, fasp is directly integrated with the storage platform, allowing users to send and receive large datasets at high speeds, with encryption both in transit and at rest. Without Aspera, these transfers would be, “an http put or get, which is encumbered by slowness over the LAN, the inability to resume large data, challenges around encryption both on the wire and at rest,” Munson said. “It would be really impossible to use it at scale and at global distances.”
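One general reason HTTP transfers slow down over distance is that HTTP conventionally runs over TCP, and a single TCP connection can have at most one window of unacknowledged data in flight per round trip. The sketch below illustrates that well-known bound (window size divided by round-trip time). It is not a description of fasp or of any storage provider’s endpoint; the function name, the 64 KiB window (the classic limit without window scaling), and the round-trip times are assumptions chosen for illustration.

```python
def max_tcp_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on single-connection TCP throughput, in Mbps.

    At most one window of data can be in flight per round trip, so
    throughput <= window / RTT. Packet loss lowers the real figure further.
    """
    bits_per_round_trip = window_bytes * 8
    rtt_seconds = rtt_ms / 1000
    return bits_per_round_trip / rtt_seconds / 1e6


WINDOW = 64 * 1024  # 64 KiB: hypothetical unscaled TCP window

# Hypothetical round-trip times: a local network versus a long-haul link.
for label, rtt_ms in (("local, 1 ms RTT", 1.0), ("long-haul, 100 ms RTT", 100.0)):
    mbps = max_tcp_throughput_mbps(WINDOW, rtt_ms)
    print(f"{label}: at most {mbps:.1f} Mbps")
```

With these illustrative numbers the same connection is capped at about 524 Mbps locally but only about 5 Mbps across a 100 ms path, regardless of how much bandwidth the link provides. Modern TCP stacks raise the ceiling with window scaling, but the dependence on round-trip time remains.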
Finally, Munson said, Aspera is focused on further developing its software development kit and APIs: “really making those able to meet the needs of the life sciences community in every capacity.”
“There are many genomics portals that integrate the browser plugin and server software and have their own, if you will, private experience,” Munson said. “National Cancer Institute has both public and private sites. EBI in Cambridge [UK] has their own integrated site. Broad Institute has one. Our SDK [software development kit] and the APIs we develop allow for that type of integration.”
Aspera has also released a new API with a Linux or UNIX command-line publishing utility for Aspera faspex. The genomics group at Stanford is using it to build a publishing portal, Munson said. “They’re very interested to use faspex because it’s very easy to distribute large research datasets to people locally that are not necessarily that IT savvy to do it with speed and security. This new integration allows for completely automated publishing from their backend cluster, to get those datasets out. It preserves the security model, it has ease of integration directly with where the data lies on the compute cluster, and it allows them to maintain the automate disk process between a very, if you will, UNIX-y cluster high performance cluster world with a very user-friendly web application for data sharing.”