Clouds and More Clouds in Seattle



AWS’s Singh talks passions: science and the cloud. 

May 18, 2010 | In the closing keynote at the 2010 Bio-IT World Conference & Expo, Deepak Singh named his two passions: life sciences, where he has spent most of his career, and large-scale distributed computing, the focus of his current job as business development manager at Amazon Web Services (AWS).

Appropriately, Singh presented his talk on the infrastructure and life science applications of Cloud computing from the cloud – at least that’s how it seemed when the live video feed from Seattle threatened to deteriorate. Thanks to a crack IT staff, full connectivity was quickly restored.

Few members of the audience were likely never to have done business with Amazon’s retail Web site. To build a robust commercial site, Singh said, “You had to create a massively scalable infrastructure... that doesn’t break down every Christmas.” That infrastructure hinges on availability, efficiency, services orientation, and security.

In Singh’s estimate, Amazon was the perfect creator for a massively scalable Cloud computing solution. Having done the hard part, it started asking, “What happens if you take the lessons we have learned in building this very large scale, very available infrastructure and make it available to anyone with a credit card?”

AWS has become what Singh calls a “toolkit that gets stuff done.” It’s made up of a compute tool (EC2); a storage tool (S3); and a database tool (see “Amazon’s Cloud Raining Gifts for 2010,” Bio•IT World, Jan 2010). On top of those are monitoring services, tracking, and much more. AWS is growing at an astounding pace. Only a couple of years after launch, the bandwidth consumed by AWS far exceeded that consumed by all of Amazon’s global websites. By the end of 2009, S3 housed over 100 billion objects. That growth is also good news for customers.

“The larger we get, the cheaper it gets for us to run our operations,” said Singh. Since its inception, AWS has reduced prices 16 times, and now has three payment models: on demand, reserved instances, and spot instances, or the cloud market.

“You put in a request, and if the market price is lower than the price you put in, your instances will start and you can get the job done,” explained Singh. “You pay the market price.” The caveat is that if the market price rises above the price you put in, you lose your instances. Singh said this is an excellent option for applications that “can handle failure gracefully.” (A Web site called cloudexchange.org tracks instance prices.)
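The spot pricing rule Singh describes can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not AWS's actual mechanism or API: the function name `simulate_spot`, the `SpotResult` fields, the hourly granularity, and the sample prices are all hypothetical, and treating a market price equal to the bid as "running" is an assumption as well.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SpotResult:
    hours_run: int       # hours in which the instance was running
    interruptions: int   # times a running instance was lost
    total_cost: float    # sum of the market prices paid for hours run


def simulate_spot(bid: float, hourly_prices: Sequence[float]) -> SpotResult:
    """Replay a (hypothetical) hourly price history against one maximum bid.

    Assumed rule: the instance runs in any hour where the market price does
    not exceed the bid, the customer pays the market price (not the bid),
    and the instance is lost when the market price rises above the bid.
    """
    hours_run = 0
    interruptions = 0
    total_cost = 0.0
    running = False
    for price in hourly_prices:
        if price <= bid:
            running = True
            hours_run += 1
            total_cost += price      # pay the market price, not the bid
        else:
            if running:
                interruptions += 1   # price rose above the bid: instance lost
            running = False
    return SpotResult(hours_run, interruptions, total_cost)


# Made-up prices in dollars per hour, for illustration only.
result = simulate_spot(bid=0.10, hourly_prices=[0.05, 0.08, 0.12, 0.06])
print(result)
```

In this made-up history the instance runs for three of the four hours and is interrupted once, which is why the model suits only work that can checkpoint or restart, the kind of application Singh says can "handle failure gracefully."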

The Fourth Paradigm

Singh chose not to dwell on security, a chief concern of companies entering the cloud, because he’s “not a security guru.” But he did point out that Amazon holds many customer credit cards and was invested in security long before AWS came into existence. He quoted a colleague: “It’s more difficult to get into an AWS data center than it was to get into FBI headquarters.”

Amazon has moved the firewall to the server, Singh said. The user is separated from the cloud by the hypervisor, which allows multiple operating systems to run concurrently. “The key thing is we control everything hypervisor down. We manage it. We are responsible for it. You manage everything hypervisor up. So the guest operating system... that is your responsibility. It’s a shared responsibility model.”

What Singh is excited about, though, is more than just secure and elastic compute power. “Data-intensive scientific discovery is becoming the fourth paradigm of research,” he said (experiment, theory, and computational simulation being the first three).

This is a season of changes and challenges in life sciences: data types and volume are rapidly increasing and research is becoming more collaborative. It’s imperative that data be shared.

That’s where the AWS tools come into play: providing scalable, elastic, and available compute and storage power to enable public-private partnerships, collaborations between big pharma and small biotech, and nonprofit and academic endeavors.

Singh listed many companies now working to facilitate data management and data processing in the cloud for the life sciences, including Recombinant, MathWorks, Mathematica, Univa UD, Sun, Cycle Computing, and RightScale.

“This whole concept of resources that are relevant and are required only for a task, is very powerful,” Singh said. “It’s efficient, it’s cost effective, and to me, that’s one of those compelling reasons I moved to AWS a couple of years ago.”


This article also appeared in the May-June 2010 issue of Bio-IT World Magazine.






