CLOUD COMPUTING
INFRASTRUCTURE
TAKE A SEAT & PREPARE TO FLY
Anh M. Nguyen
CS525, UIUC, Spring 2009
GOALS
Define
Cloud: public cloud, private cloud
Cloud Computing
Why cloud computing?
Obstacles & opportunities
Current state of cloud computing
Amazon EC2
Google AppEngine
WHAT IS CLOUD COMPUTING?
I don’t understand what we would do differently
in the light of Cloud Computing other than change
the wording of some of our ads
Larry Ellison, Oracle’s CEO
I have not heard two people say the same thing
about it [cloud]. There are multiple definitions out
there of “the cloud”
Andy Isherwood, HP’s Vice President of European Software Sales
It’s stupidity. It’s worse than stupidity: it’s a marketing
hype campaign.
Richard Stallman, Free Software Foundation founder
SOFTWARE AS A SERVICE (SAAS)
An application is used as an on-demand service, often
provided via the Internet
Think on-demand TV programs
Example: Google Apps (online office)
Benefits to users
Reduce expenses: multiple computers, multiple users
Ease of usage: easy installation, access everywhere
Benefits to providers
Easier to maintain
Control usage (no illegal copies)
UTILITY COMPUTING (UC)
Computing resources (CPU hours, memory, network) and
a platform to run software are provided as an on-demand
service
Think electricity service
The same evolution happened
Hardware as a service (HaaS), Infrastructure as a service
(IaaS), Platform as a Service (PaaS)
Examples of UC providers: Amazon EC2, Google
AppEngine …
Who will use UC? Is UC the end of the high-end PC?
People who would otherwise have to build their own data center:
SaaS providers, analytics & batch processing
UTILITY COMPUTING - BENEFIT TO
USERS
Mitigate the risks of over-provisioning and under-
provisioning
No up-front cost, invest in other aspects
(marketing, technology…)
Less maintenance & operational cost
Save time, time = money
In summary: Reduce cost
UTILITY COMPUTING – MITIGATE RISKS
Real-world utilization: 5%-20%
Animoto demand surge: from 50 servers to 3500 servers in 3 days
Black Friday sales
[Figure: capacity vs. demand over time (t), three panels: over-provisioning; under-provisioning; on demand, scalable]
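A minimal sketch of the two risks, assuming a fixed capacity and a made-up demand series (the numbers and the function name `provisioning_report` are hypothetical, not from the slides). Sizing for the peak leaves servers idle; sizing for the typical load turns demand away.

```python
def provisioning_report(demand, capacity):
    """Compare a fixed capacity against a demand series (servers needed per period)."""
    served = [min(d, capacity) for d in demand]
    unmet = sum(d - s for d, s in zip(demand, served))  # under-provisioning: demand turned away
    idle = sum(capacity - s for s in served)            # over-provisioning: paid for but unused
    utilization = sum(served) / (capacity * len(demand))
    return {"utilization": utilization, "idle": idle, "unmet": unmet}

# Hypothetical demand: mostly quiet, with one surge.
demand = [10, 10, 20, 100, 20, 10]
peak_sized = provisioning_report(demand, capacity=100)  # sized for the peak
small = provisioning_report(demand, capacity=20)        # sized for the typical load

print(peak_sized)
print(small)
```

With these numbers the peak-sized setup serves everything but runs well below full utilization, while the small setup is busy most of the time and drops the surge.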
UTILITY COMPUTING – BENEFIT TO
PROVIDERS
Make money
Economies of scale
Resource         Cost for medium scale    Cost for large scale     Ratio
Network          $95 / Mbps / month       $13 / Mbps / month       ~7x
Storage          $2.20 / GB / month       $0.40 / GB / month       ~6x
Administration   ≈140 servers/admin       >1000 servers/admin      ~7x
Time diversity: different peaks for different services
Geographical diversity: choice of best location
Electricity price in Idaho = 1/5 of that in Hawaii
Existing infrastructure & expertise
Google, Amazon: utilize off-peak capacity
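As a rough check of the economies-of-scale ratios, the arithmetic can be redone from the medium-scale and large-scale figures in the table. This is only a sketch: ">1000 servers/admin" is treated as exactly 1000, which is an assumption, so that ratio is a lower bound.

```python
# Figures from the economies-of-scale table: (medium-scale cost, large-scale cost).
costs = {
    "network ($/Mbps/month)": (95.0, 13.0),
    "storage ($/GB/month)": (2.20, 0.40),
}
ratios = {name: medium / large for name, (medium, large) in costs.items()}

# Administration is given as servers per admin, so the advantage is large / medium.
# The table says ">1000" for large scale; 1000 is used here as a lower bound.
ratios["administration (servers/admin)"] = 1000 / 140

for name, ratio in ratios.items():
    print(f"{name}: ~{ratio:.1f}x")
```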
UTILITY COMPUTING – AMAZON EC2
Elastic Compute Cloud
Rent virtual machine instances to run your software.
Monitor and increase / decrease the number of VMs
as demand changes
How to use:
Create an Amazon Machine Image (AMI): applications,
libraries, data and associated settings
Upload AMI to Amazon S3 (simple storage service)
Use Amazon EC2 web service to configure security and
network access
Choose OS, start AMI instances
Monitor & control via web interface or APIs
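The "monitor and increase / decrease the number of VMs" step can be pictured as a small decision rule. The sketch below is hypothetical: the function name, thresholds and limits are assumptions for illustration, and it makes no EC2 API calls. In a real deployment the returned count would drive the start and terminate requests.

```python
def desired_instances(current, avg_cpu, low=0.30, high=0.70, min_n=1, max_n=20):
    """Return the instance count to run next, given average CPU load (0.0 to 1.0)."""
    if avg_cpu > high:
        target = current * 2   # scale out quickly under load
    elif avg_cpu < low:
        target = current - 1   # scale in slowly when idle
    else:
        target = current       # load is within the comfort band: no change
    # Clamp to the allowed range so the fleet never drops to zero or grows unbounded.
    return max(min_n, min(max_n, target))

print(desired_instances(4, 0.90))   # overloaded
print(desired_instances(4, 0.10))   # mostly idle
```

Doubling on the way up and stepping down by one is a deliberately cautious choice: under-provisioning costs users, over-provisioning only costs a few instance hours.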
AMAZON EC2
Characteristics:
Elastic: increase or decrease capacity within minutes
Monitor and control via EC2 APIs
Completely controlled: root access to each instance
Flexible: choose your OS, software packages…
Red Hat, Ubuntu, openSUSE, Windows Server 2003,…
Small, large, extra large instances
Reliable: Amazon datacenters, high availability and redundancies
Secure: web interface to configure firewall settings
Cost:
CPU: small instance, $0.10 per hour for Linux, $0.125 per hour for
Windows (1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor)
Bandwidth: in $0.10, out $0.17 per GB
Storage: $0.10 per GB-month, $0.10 per 1 million I/O requests
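To make the price list concrete, here is a small estimator built only from the prices on this slide. The workload (2 Linux instances for a 30-day month, plus traffic and storage figures) and the function name `monthly_cost` are hypothetical.

```python
# Prices as listed on the slide (small instance).
PRICE_PER_HOUR = {"linux": 0.10, "windows": 0.125}
PRICE_IN_PER_GB = 0.10
PRICE_OUT_PER_GB = 0.17
PRICE_STORAGE_PER_GB_MONTH = 0.10
PRICE_PER_MILLION_IO = 0.10

def monthly_cost(instances, hours, os="linux", gb_in=0.0, gb_out=0.0,
                 gb_stored=0.0, io_requests=0):
    """Sum compute, bandwidth, storage and I/O charges for one month, in USD."""
    compute = instances * hours * PRICE_PER_HOUR[os]
    bandwidth = gb_in * PRICE_IN_PER_GB + gb_out * PRICE_OUT_PER_GB
    storage = gb_stored * PRICE_STORAGE_PER_GB_MONTH
    io = io_requests / 1_000_000 * PRICE_PER_MILLION_IO
    return compute + bandwidth + storage + io

# Hypothetical workload: 2 Linux instances running for a 30-day month.
example = monthly_cost(instances=2, hours=24 * 30, gb_in=50, gb_out=200,
                       gb_stored=100, io_requests=5_000_000)
print(f"${example:.2f}")
```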
UTILITY COMPUTING - GOOGLE
APPENGINE
Write your web program in Python and submit to
Google. It will take care of the rest
How to use
Download AppEngine SDK
Develop your program locally
A set of Python programs, input = requested URL, output =
returned message
Debug locally
Register for an application ID
Submit your application to Google
GOOGLE APPENGINE – HELLO WORLD
Creating a Simple Request Handler
Create a file helloworld.py:
    # CGI-style handler (Python 2 print statements): headers, a blank line, then the body
    print 'Content-Type: text/plain'
    print ''
    print 'Hello, world!'
Map url to handler
Edit configuration file app.yaml
    application: helloworld
    version: 1
    handlers:
    - url: /.*
      script: helloworld.py
Data storage:
Distributed file system
Store using AppEngine API, retrieve using GQL
Debug: http://localhost:8080/
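The url-to-handler mapping in app.yaml can be pictured as a first-match lookup over regular expressions. The sketch below is plain Python 3 and does not use the AppEngine SDK. It assumes patterns are tried in order and must match the whole request path. The `/admin/.*` entry, `admin.py` and the function name `resolve_script` are hypothetical.

```python
import re

# (url pattern, script) pairs in the order they would appear under "handlers:".
# "/admin/.*" and admin.py are hypothetical; "/.*" -> helloworld.py is from the slide.
HANDLERS = [
    (r"/admin/.*", "admin.py"),
    (r"/.*", "helloworld.py"),
]

def resolve_script(path, handlers=HANDLERS):
    """Return the script of the first handler whose pattern matches the whole path."""
    for pattern, script in handlers:
        if re.fullmatch(pattern, path):
            return script
    return None  # no handler matched

print(resolve_script("/"))
print(resolve_script("/admin/users"))
```

Because the first match wins, the catch-all `/.*` pattern belongs last; placed first, it would shadow every other handler.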
GOOGLE APPENGINE
Register for an application ID
http://appengine.google.com
Verification code sent to your mobile
Uploading the Application
appcfg.py update helloworld/
Enter your Google username and password at the prompts
http://application-id.appspot.com
Manage using Administration Console
Set up domain name
Invite other people to be developers
View error logs, traffic logs
Switch between different versions
GOOGLE APPENGINE
Characteristics
Easy to start, little administration
Scale automatically
Reliable
Integrate with Google user service: get user nickname, request
login…
Cost:
Can set daily quota
CPU hour: 1.2 GHz Intel x86 processor
Free quotas are going to be reduced soon
Resource             Unit                  Unit cost   Free (daily)
Outgoing Bandwidth   gigabytes             $0.12       10GB
Incoming Bandwidth   gigabytes             $0.10       10GB
CPU Time             CPU hours             $0.10       46 hours
Stored Data          gigabytes per month   $0.15       1GB (all)
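A sketch of how the daily quotas and unit costs combine, assuming that only usage above the free quota is billed. The usage figures and the function name `daily_charge` are hypothetical. Stored data is left out because it is billed per month, not per day.

```python
# (unit price in USD, free daily quota) per resource, as listed on the slide.
DAILY_RATES = {
    "outgoing_bandwidth_gb": (0.12, 10),
    "incoming_bandwidth_gb": (0.10, 10),
    "cpu_hours": (0.10, 46),
}

def daily_charge(usage):
    """Charge only for usage above the free daily quota of each resource."""
    total = 0.0
    for resource, used in usage.items():
        price, free = DAILY_RATES[resource]
        total += max(0.0, used - free) * price
    return total

# Hypothetical day of traffic: outgoing bandwidth and CPU exceed their free quotas.
charge = daily_charge({"outgoing_bandwidth_gb": 25,
                       "incoming_bandwidth_gb": 4,
                       "cpu_hours": 60})
print(f"${charge:.2f}")
```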
SPECTRUM OF ABSTRACTIONS
Different levels of abstraction
Instruction Set VM: Amazon EC2
Framework VM: Google AppEngine
Similar to languages
Higher-level abstractions can be built on top of lower ones
[Figure: spectrum from lower-level to higher-level: EC2, Azure, AppEngine, Force.com]
Lower-level: more flexibility, more management, not scalable by default
Higher-level: less flexibility, less management, automatically scalable
DETAILED COMPARISON
Computation model
Amazon:
•x86 Instruction Set Architecture
•Not scalable by default. Can use 3rd party service such as RightScale
Google AppEngine:
•Predefined 3-tier Web app structure
•Fixed language: Python
•Automatic scaling up and down
Storage model
Amazon:
•Scaling varies from none (EBS) to fully automatic (SimpleDB, S3)
Google AppEngine:
•Fixed API: BigTable
•Automatic scaling
Networking model
Amazon:
•Define network access policies
•Choose availability zones, independent network failure
•Elastic IP addresses, persistently routable name
•Automatic scaling
Google AppEngine:
•Fixed topology for 3-tier Web app structure
•Automatic scaling
WHAT IS A CLOUD?
Software and hardware to operate datacenters
Public cloud: cloud used to provide utility computing
Amazon EC2: Amazon datacenters, Xen, EC2 APIs and
administrative interface
Google AppEngine: Google data center, GFS,
AppEngine APIs, administrative interface…
Batch processing software: MapReduce, Hadoop, Pig,
Dryad
Private cloud: datacenters, not available for rental
How about the academic clouds? Protected clouds
Cloud Computing
A combination of existing concepts
[Figure: SaaS users receive SaaS from SaaS providers; SaaS providers, acting as PaaS users, receive utility computing from PaaS providers]
CLOUD COMPUTING
Cloud Computing = SaaS + PaaS (utility computing)
[Figure labels: Cloud (Cloud Computing); TV; Video On Demand (SaaS); Electricity On Demand (PaaS)]
WHAT IS NEW IN CLOUD COMPUTING
The illusion of infinite computing resources
The elimination of an up-front commitment by users
The ability to use and pay on demand
Cloud Computing vs P2P?
Both take advantage of remote resources
P2P: does not use clouds (datacenters), peers do not
get paid, lower reliability
Cloud Computing vs Grid Computing?
Both use clouds
Grid Computing requires commitment, share based on
common interests. Not public cloud
CLOUD KILLER APPS
Mobile and web applications
Mobile devices: low memory & computation power
Extensions of desktop software
Matlab, Mathematica
Batch processing / MapReduce
Peter Harkins at The Washington Post: 200 EC2 instances
(1,407 server hours) converted 17,481 pages of Hillary Clinton’s
travel documents within 9 hours
The New York Times used 100 Amazon EC2 instances and a
Hadoop application to convert 4TB of raw TIFF images into
1.1 million PDFs in 24 hours ($240)
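The $240 figure is consistent with the small-instance price of $0.10 per hour from the EC2 slide: 100 instances for 24 hours. The slides give no dollar figure for the Washington Post job, so the estimate below applies the same hourly price, which is an assumption.

```python
SMALL_INSTANCE_PER_HOUR = 0.10  # Linux small-instance price from the EC2 slide

# The New York Times: 100 instances for 24 hours.
nyt_cost = 100 * 24 * SMALL_INSTANCE_PER_HOUR

# The Washington Post: 1,407 server hours across 200 instances.
# No cost is stated on the slide; this assumes the same small-instance price.
wapo_cost = 1407 * SMALL_INSTANCE_PER_HOUR
wapo_avg_hours_per_instance = 1407 / 200

print(f"NYT: ${nyt_cost:.2f}")
print(f"Washington Post (estimated): ${wapo_cost:.2f}, "
      f"about {wapo_avg_hours_per_instance:.1f} hours per instance")
```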
SHOULD I MOVE INTO A CLOUD
Does it really save money?
Cost_cloud > Cost_datacenter, balanced by utilization
UserHours_cloud > UserHours_datacenter (under-provisioning)
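One way to read "balanced by utilization": an owned server hour is paid for whether or not it is used, so its cost per useful hour is the raw cost divided by utilization. The sketch below assumes that reading. The prices and function names are hypothetical. The utilization values 5% and 20% are the real-world range quoted earlier in the slides.

```python
def effective_datacenter_cost(cost_per_server_hour, utilization):
    """Cost per useful server hour when only a fraction of capacity is used."""
    return cost_per_server_hour / utilization

def cloud_is_cheaper(cost_cloud, cost_datacenter, utilization):
    """True if a cloud hour costs less than a useful datacenter hour."""
    return cost_cloud < effective_datacenter_cost(cost_datacenter, utilization)

# Hypothetical prices: the cloud hour costs 2.5x the owned hour.
cost_cloud, cost_datacenter = 0.10, 0.04
results = {u: cloud_is_cheaper(cost_cloud, cost_datacenter, u)
           for u in (0.05, 0.20, 0.80)}
print(results)
```

With these numbers the break-even utilization is 40%: below it the cloud wins despite the higher hourly price, above it the datacenter does.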
Other factors
Re-implement programs
Move data into cloud
What else?
Example:
Upload rate 20 Mbit/s: 500GB takes about 55 hours
If the data can be processed locally in less than 55 hours, moving into a cloud would not
save time
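The 55-hour figure follows from the data size and the upload rate. A minimal sketch, assuming decimal units (1 GB = 8000 Mbit) and a sustained rate; the function names and the processing times in the usage lines are hypothetical.

```python
def upload_hours(size_gb, rate_mbit_per_s):
    """Hours needed to upload size_gb (decimal GB) at a sustained rate."""
    megabits = size_gb * 1000 * 8   # 1 GB = 1000 MB = 8000 Mbit
    return megabits / rate_mbit_per_s / 3600

def cloud_saves_time(size_gb, rate_mbit_per_s, local_hours, cloud_hours):
    """True if upload plus cloud processing beats processing locally."""
    return upload_hours(size_gb, rate_mbit_per_s) + cloud_hours < local_hours

hours = upload_hours(500, 20)
print(f"{hours:.1f} hours to upload 500 GB at 20 Mbit/s")

# Hypothetical processing times: 50 hours locally vs 1 hour in the cloud.
print(cloud_saves_time(500, 20, local_hours=50, cloud_hours=1))
```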
ADOPTION CHALLENGES
Challenge                               Opportunity
Availability                            Multiple providers
Data lock-in                            Standardization
Data Confidentiality and Auditability   Encryption, VLANs, Firewalls
Coghead, a cloud vendor, closed its business a week ago
Customers need to rewrite their applications
Online storage service The Linkup closed July 10, 2008
20,000 paying subscribers lost their data
ADOPTION CHALLENGES
Cloud Control, InformationWeek Reports, 2009
GROWTH CHALLENGES
Challenge                           Opportunity
Data transfer bottlenecks           FedEx-ing disks, reuse data multiple times
Performance unpredictability        Improved VM support, flash memory
Scalable storage                    Invent scalable storage
Bugs in large distributed systems   Invent Debugger using Distributed VMs
Scaling quickly                     Invent Auto-Scaler
GROWTH CHALLENGES
Data transfer bottleneck
WAN cost falls the slowest:
2003 to 2008: WAN 2.7x, CPU 16x, storage 10x
Fastest way to transfer large data: send the disks
Performance unpredictability
Large variation in I/O operations
Inefficiency in I/O virtualization
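To compare the 2003 to 2008 factors on a yearly basis, each can be converted into an equivalent compound annual rate. This sketch reads the factors as cost-performance improvements over the five years, which is an interpretation of the slide and not something it states.

```python
# Improvement factors for 2003 to 2008, from the slide.
improvement = {"WAN": 2.7, "CPU": 16.0, "storage": 10.0}
years = 2008 - 2003

# Equivalent compound annual improvement: factor ** (1 / years) - 1.
annual = {name: factor ** (1 / years) - 1 for name, factor in improvement.items()}

for name, rate in annual.items():
    print(f"{name}: about {rate:.0%} per year")
```

On this reading WAN improves by roughly a fifth per year while CPU and storage improve by well over half, which is why moving data becomes the bottleneck as data sets grow.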
POLICY AND BUSINESS CHALLENGE
Challenge                 Opportunity
Reputation Fate Sharing   Offer reputation-guarding services like those for email
Software Licensing        Pay-for-use licenses; Bulk use sales
Reputation: Many blacklists use IP addresses and
IP ranges
Software licensing:
Open source software readily applicable
Windows and IBM software offered per hour on EC2
THE FUTURE?
Application software:
Cloud & client parts, disconnection tolerance
Infrastructure software:
Resource accounting, VM awareness
Hardware systems:
Containers, energy proportionality
DISCUSSION
Is their definition correct?
What applications of cloud computing can you think of
in your research area?
Which service would you choose, EC2 or
Google AppEngine?
Can you predict the future of cloud computing?
REFERENCES
Above the Clouds: A Berkeley View of Cloud Computing, Michael
Armbrust et al., Feb 2009 (white paper and presentation)
Google AppEngine: http://code.google.com/appengine/
Amazon EC2: http://aws.amazon.com/ec2/
Lessons From The Demise Of A Cloud Startup, John Foley, Feb 2009
Cloud Control, InformationWeek Reports, 2009
ARE YOU READY FOR A RIDE?
BACKUP SLIDES
RIGHTSCALE
$2500 initial fee
$500 monthly