AWS’s Singh talks passions: science and the cloud.
May 18, 2010 | In the closing keynote at the 2010 Bio-IT World Conference & Expo, Deepak Singh named his two passions: life sciences, where he has spent most of his career, and large-scale distributed computing, his current focus as business development manager at Amazon Web Services (AWS).
Appropriately, Singh presented his talk on the infrastructure and life science applications of Cloud computing from the cloud – at least that’s how it seemed when the live video feed from Seattle threatened to deteriorate. Thanks to a crack IT staff, full connectivity was quickly restored.
It was unlikely that many members of the audience had not done business with Amazon’s retail Web site. To build a robust commercial site, Singh said, “You had to create a massively scalable infrastructure... that doesn’t break down every Christmas.” That infrastructure hinges on availability, efficiency, services orientation, and security.
In Singh’s estimation, Amazon was the perfect company to create a massively scalable Cloud computing solution. Having done the hard part, Amazon started asking, “What happens if you take the lessons we have learned in building this very large scale, very available infrastructure and make it available to anyone with a credit card?”
AWS has become what Singh calls a “toolkit that gets stuff done.” It’s made up of a compute tool (EC2); a storage tool (S3); and a database tool (see “Amazon’s Cloud Raining Gifts for 2010,” Bio•IT World, Jan 2010). On top of those are monitoring services, tracking, and much more. AWS is growing at an astounding pace. Only a couple of years after launch, the bandwidth AWS used far exceeded that consumed by all of Amazon’s global Web sites. By the end of 2009, S3 housed over 100 billion objects. That growth is also good news for customers.
“The larger we get, the cheaper it gets for us to run our operations,” said Singh. Since its inception, AWS has reduced prices 16 times, and it now has three payment models: on-demand instances, reserved instances, and spot instances, which act as a cloud market.
“You put in a request, and if the market price is lower than the price you put in, your instances will start and you can get the job done,” explained Singh. “You pay the market price.” The caveat is that if the market price rises above the price you put in, you lose your instances. Singh said this is an excellent option for applications that “can handle failure gracefully.” (A Web site called cloudexchange.org tracks instance prices.)
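The spot rules Singh described can be sketched as a small simulation. This is an illustration only, not AWS code or an AWS API: the names `SpotRequest`, `step`, and `simulate` are hypothetical, the prices are made up, and two details are assumptions not stated in the talk (a market price equal to the bid keeps the instances running, and a lost instance starts again once the price falls back under the bid).

```python
from dataclasses import dataclass


@dataclass
class SpotRequest:
    """Hypothetical spot request: the most the user will pay per period."""
    max_price: float
    running: bool = False


def step(request: SpotRequest, market_price: float) -> float:
    """Apply one pricing period and return the amount charged for it.

    Rules as reported from the talk:
      * instances run while the market price does not exceed the bid
        (treating a tie as "running" is an assumption)
      * the user pays the market price, not the bid
      * if the market price rises above the bid, the instances are lost
    """
    request.running = market_price <= request.max_price
    return market_price if request.running else 0.0


def simulate(max_price, market_prices):
    """Run a request through a series of market prices.

    Returns the total charged and, per period, whether instances ran.
    Assumes the request stays open, so instances restart when the
    price drops back under the bid.
    """
    request = SpotRequest(max_price)
    total = 0.0
    history = []
    for price in market_prices:
        total += step(request, price)
        history.append(request.running)
    return total, history


if __name__ == "__main__":
    # Illustrative numbers only; these are not real AWS prices.
    total, history = simulate(0.10, [0.04, 0.05, 0.12, 0.06])
    print(history)
    print(round(total, 2))
```

In the sample run the third period prices the request out, which is the interruption an application would need to "handle gracefully," for example by checkpointing work so it can resume when instances return.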
The Fourth Paradigm
Singh chose not to dwell on security, a chief concern of companies entering the cloud, because he’s “not a security guru.” But he did point out that Amazon holds many customer credit cards and had been investing in security long before AWS came into existence. He quoted a colleague: “It’s more difficult to get into an AWS data center than it was to get into FBI headquarters.”
Amazon has moved the firewall to the server, Singh said. The user is separated from the cloud by the hypervisor, which allows multiple operating systems to run concurrently. “The key thing is we control everything hypervisor down. We manage it. We are responsible for it. You manage everything hypervisor up. So the guest operating system... that is your responsibility. It’s a shared responsibility model.”
What Singh is excited about, though, is more than just secure and elastic compute power. “Data-intensive scientific discovery is becoming the fourth paradigm of research,” he said (experiment, theory, and computational simulation being the first three).
This is a season of changes and challenges in life sciences: data types and volume are rapidly increasing and research is becoming more collaborative. It’s imperative that data be shared.
That’s where the AWS tools come into play: providing scalable, elastic, and available compute and storage power to enable public-private partnerships, collaborations between big pharma and small biotech, and nonprofit and academic endeavors.
Singh listed many companies now working to facilitate data management and data processing in the cloud for the life sciences, including Recombinant, MathWorks, Mathematica, Univa UD, Sun, Cycle Computing, and RightScale.
“This whole concept of resources that are relevant and are required only for a task, is very powerful,” Singh said. “It’s efficient, it’s cost effective, and to me, that’s one of those compelling reasons I moved to AWS a couple of years ago.”