1.1 - System Models
Shajulin Benedict
[email protected]
Indian Institute of Information Technology Kottayam
www.sbenedictglobal.com
1.1 Cloud Computing - Introduction
Cloud Computing - Industries
A company wants its data in the Cloud…
What is Cloud Computing?
https://en.wikipedia.org/wiki/Blind_men_and_an_elephant
What is Cloud Computing? - Definition
• According to Forrester:
• Cloud computing is a form of standardized IT-based capability – such as Internet-based services, software, or IT infrastructure – offered by a service provider that is accessible via Internet protocols from any computer, is always available and scales automatically to adjust to demand, is either pay-per-use or advertising-based, has Web- or programmatic-based control interfaces, and enables full customer self-service.
• According to NIST:
• Cloud computing is a model for enabling on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.
What is Cloud Computing? - Definition
• According to Buyya:
• A Cloud is a parallel and distributed computing system consisting of a collection of inter-connected and virtualized computers that are dynamically provisioned and presented as one or more unified computing resources, based on service-level agreements established through negotiation between the service provider and consumers.
• According to McKinsey & Co.:
• Clouds are hardware-based services offering compute, network, and storage capacity where: hardware management is highly abstracted from the buyer; buyers incur infrastructure costs as variable costs; and infrastructure capacity is highly elastic.
Cloud Computing and Services
Who needs Cloud?
• Almost everyone!!!
• Pay as you go – for developers or users
History – Cloud Computing
Challenges
• Scalability issues
• Inefficient much of the time – replication
• Data security
• Ownership, proprietary concerns, trust, …
• Fault-tolerance issues
• Green (energy) issues
• Data management issues – locality, zones, replication
1.2 Base Technologies – Cloud Computing
• Contents (5 parts):
• Computing Domains
• CPU, GPU, Memory, and Networking Systems
• Scalable System Models
• Technology Convergence
• Virtualization
1.2.1 Computing Domains
• Computing Domains
– Evolutionary changes – HTC / HPC
– Age of Internet computing
Evolutionary Changes in Computing
Computing distinctions
• Centralized computing – computer resources are centralized in one physical system.
• Parallel computing – all processors are tightly coupled with centralized shared memory.
• Distributed computing – computers with private memories are connected and communicate through a network.
• Cloud computing – the resources can be either a centralized or a distributed computing system; a form of utility or service computing.
Age of Internet Computing
• Billions of people use the Internet.
• Thus, Internet users require HPC capabilities at their desks.
• The emergence of clouds indeed requires HTC built over parallel and distributed computing technologies.
Generations
• Computer technology has gone through five generations!
• Each generation lasted 10 to 20 years.
• Adjacent generations overlapped by around 10 years.
• 1950 – 1970 → mainframes
• 1960 – 1980 → lower-cost minicomputers
• 1970 – 1990 → PCs built with VLSI microprocessors
• 1980 – 2000 → massive numbers of portable computers, wired and wireless
• Since 1990 → HPC and HTC systems emerged (in the form of clusters, grids, clouds, …)
Evolution of HTC and HPC over Internet
(Figure: the evolution of HPC and HTC systems, annotated with the characteristics of HTC and HPC, virtualization, and RFID and sensors.)
Scalable Computing Trends
• Several factors drive scalable computing applications:
• Moore's law
• The number of transistors per square inch on integrated circuits doubles roughly every two years;
• processor speed doubles roughly every 18 months.
• Gilder's law
• Bandwidth grows at least three times faster than computing power.
• Price/performance ratio of systems
• The performance a system delivers per unit of cost.
• Degree of parallelism (DoP)
• Indicates how many operations can be, or are being, simultaneously executed by a computer.
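As a rough, hedged illustration of these two growth laws (the starting values and doubling periods below are assumptions for illustration, not measured data), a short Python sketch:

# A minimal sketch: projecting transistor counts under Moore's law and
# bandwidth under Gilder's law from assumed starting points.

def project(initial, years, doubling_period_years):
    """Value after `years` if it doubles every `doubling_period_years`."""
    return initial * 2 ** (years / doubling_period_years)

# Hypothetical starting points, not vendor data.
transistors_2000 = 42e6        # a ~2000-era CPU
bandwidth_2000_gbps = 1.0      # a 1 Gb/s link

for year in (2010, 2020):
    t = project(transistors_2000, year - 2000, 2.0)     # Moore: doubles ~every 2 years
    b = project(bandwidth_2000_gbps, year - 2000, 2/3)  # Gilder: ~3x faster growth than compute
    print(f"{year}: ~{t:.2e} transistors, ~{b:.1f} Gb/s")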
Degree of Parallelism
• Degree of parallelism (DoP) has grown over time:
• bit-serial processing (about 50 years ago)
• bit-level parallelism – word-serial processing
• instruction-level parallelism (ILP) – enabled by 8-, 16-, 32-, and 64-bit processors
– used in modern CPUs – demands hardware and compiler support
– multiple-issue superscalar architectures
– dynamic branch prediction
• data-level parallelism (DLP) – exploited by GPUs (many cores)
• task-level parallelism (TLP) – coarser-grained tasks spread across cores
• job-level parallelism
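A minimal Python sketch of data-level parallelism, assuming NumPy is available: the same addition expressed element by element versus as one whole-array operation that the library can dispatch to vectorized (SIMD-capable) kernels.

import numpy as np

a = np.random.rand(1_000_000)
b = np.random.rand(1_000_000)

# Element-at-a-time: no data-level parallelism is expressed.
serial = np.empty_like(a)
for i in range(len(a)):
    serial[i] = a[i] + b[i]

# Whole-array operation: the data-level parallelism is visible to the library.
vectorized = a + b

assert np.allclose(serial, vectorized)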
Scientific Applications – Possibility to Scale
• Science and engineering
• Simulations, genomic analysis, seismic and earthquake simulations, weather forecasting, HEP (high-energy physics)
• Business, education, and health-care services
• Content delivery, transaction processing, distance education, …
• Internet and web services
• Traffic monitoring, digital government, online tax-return processing
• Mission-critical applications
• Crisis management, recovery, intelligent systems, …
Application Requirements
1.2 Base Technologies – Cloud Computing
• Contents (5 parts): Computing Domains; CPU, GPU, Memory, and Networking Systems; Scalable System Models; Technology Convergence; Virtualization.
1.2.2 CPU-GPU-Memory-Networks
Multicore CPU Technology
• Recent processors have multiple processing cores (dual, quad, six, eight, and more).
• Manufacturers try to shrink the processor feature size and increase the core count.
• Intel introduced the tick-tock model in 2007.
• Tick / Tock duration → 12 to 18 months each
• Tick → shrink the processor with a new fabrication process
• Tock → frame a new microarchitecture
Multicore technology – contd.
Tick-tock model examples:
• Nehalem (Tock) – 45 nm – 2008
• Cannon Lake – 10 nm – planned Q3-2017, delayed (2019?)
(Figure: multicore processor layout with L3 cache and DRAM.)
Multi-core technology – contd.
• The clock rate has kept increasing:
• from 10 MHz for the Intel 286 to 6 GHz for 14th-generation Intel Raptor Lake (with 20+ cores).
• It increased steadily until about the last decade.
• Clock-rate growth limitations:
• Limits of chip manufacturing technology reduced the growth.
• Too much heat is dissipated at high frequencies or high voltages.
• Thus, many-core CPUs evolved.
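A hedged sketch of the reasoning behind the heat limit, using the standard CMOS dynamic-power relation P ≈ C·V²·f; the relation is textbook material but the capacitance, voltage, and frequency values below are purely illustrative assumptions.

def dynamic_power(c_farads, v_volts, f_hz):
    # Standard CMOS dynamic-power approximation: P ≈ C * V^2 * f
    return c_farads * v_volts**2 * f_hz

C = 1e-9                           # illustrative switched capacitance
p_low = dynamic_power(C, 1.0, 2e9)   # 2 GHz at 1.0 V
p_high = dynamic_power(C, 1.2, 4e9)  # 4 GHz, which typically also needs more voltage
print(f"2x clock (plus higher voltage) => ~{p_high / p_low:.1f}x dynamic power")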
Manycore GPU
• GPUs are widely used as accelerators.
• A GPU is a graphics coprocessor (accelerator) mounted on a computer's graphics or video card.
• Compute-intensive instructions are offloaded from the CPU to the GPU.
• The first GPU was NVIDIA's GeForce 256 (1999).
• The HPC community uses GPGPUs (General-Purpose computing on GPUs).
• Top500 lists contain many GPGPU systems.
• Typical GPGPU:
• An NVIDIA GPU may have 128 GPU cores on a chip.
• Each core may handle 8 threads of instructions.
• Thus, nearly 1,024 threads of parallelism.
• Data-intensive calculations are offloaded from the CPU and executed in parallel on these threads.
• Multiple GPUs can be clustered to form a massively parallel GPU system.
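A minimal sketch of CPU-to-GPU offloading, assuming an NVIDIA GPU with the CuPy library installed; CuPy is just one illustrative choice and is not part of the slide material.

import numpy as np

try:
    import cupy as cp
    x_cpu = np.random.rand(4096, 4096).astype(np.float32)
    x_gpu = cp.asarray(x_cpu)     # copy data from host (CPU) to device (GPU)
    y_gpu = x_gpu @ x_gpu         # matrix multiply runs on thousands of GPU threads
    y_cpu = cp.asnumpy(y_gpu)     # copy the result back to the host
    print("GPU result shape:", y_cpu.shape)
except Exception as exc:          # CuPy missing or no GPU available
    print("GPU offload unavailable; the same computation would run on the CPU:", exc)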
Skylake Architecture
Intel Kaby Lake Processor
Xeon Phi – Knights Landing
Intel Xeon Phi generations:
• Knights Corner, 2012
• Knights Landing, 2016
• Knights Hill (announced, later cancelled)
Quantum Computers
Intel, IBM, Google, Microsoft, and other giants are developing quantum computers.
Quantum Computing – History
• 1981 – The Physics of Computation conference, co-organized by MIT and IBM
• 1995 – Error correction using quantum physics
• 2016 – IBM Q Experience launched
• Key question of the 1981 conference: the relationship between physics and information…
• Key theme: simulating information based on nature…
Quantum Computing
Quantum Computing - Qubits
• Classical computers deal with bits that are either 1 or 0 (two states).
• A quantum computer (QC) deals with qubits that can be 1, 0, or both at the same time.
• A qubit can stay in a superposition state (i.e., simultaneously in 1 and 0).
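A minimal sketch, not tied to any particular quantum SDK, of a single qubit modeled as a two-component state vector, placed in an equal superposition and then "measured":

import numpy as np

ket0 = np.array([1, 0], dtype=complex)                        # |0>
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)   # Hadamard gate

state = H @ ket0                  # (|0> + |1>)/sqrt(2): an equal superposition
probs = np.abs(state) ** 2        # Born rule: measurement probabilities
print("P(0) =", probs[0], "P(1) =", probs[1])    # ~0.5 each

outcome = np.random.choice([0, 1], p=probs)      # measurement collapses to 0 or 1
print("measured:", outcome)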
Qubits - Kinds
• Spin of electrons
• Based on magnetic fields (spin up / spin down)
• Atomic energy levels
• High and low energy levels
• Photons
• Horizontal or vertical polarization of light
• Superconducting circuits
• At very low temperatures, some materials carry current without any resistance
• (clockwise / anticlockwise current flow; also trapped atoms)
https://www.youtube.com/watch?v=KhrTTqwKjn4
Quantum Computer
What does it look like?
IBM-Q: (Figure: the IBM Q quantum computer.)
Memory, Storage, and WAN
• DRAM chip capacity: 16 KB (1976) → 64 GB (2011)
• Hard-disk capacity: 260 MB (1981) → 3 TB (2011, Seagate Barracuda XT)
https://www.youtube.com/watch?v=DLM20pWqMyU
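A quick back-of-the-envelope check on the DRAM figures above; the arithmetic uses only the numbers in the table.

import math

kb = 1024
dram_1976 = 16 * kb            # 16 KB in bytes
dram_2011 = 64 * kb**3         # 64 GB in bytes
years = 2011 - 1976

growth = dram_2011 / dram_1976
doubling_time = years / math.log2(growth)
print(f"DRAM grew {growth:.1e}x in {years} years "
      f"=> capacity doubled roughly every {doubling_time:.1f} years")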
Memory, Storage, and WAN
• Storage technology
• Rapid growth in flash memory and solid-state drives (SSDs)
• NAND flash (non-volatile)
• Emerging memory technologies
• Nanowire-based storage
• These growth factors impact the future of HPC and HTC.
• Typically, an SSD can handle 300,000 to 1 million write cycles per block.
• HDD – Hard Disk Drive – (capacities approaching 30 TB announced by 2022)
• Tape units are dead!!!
• Disks are tape units!!!
• Flashes are disks!!!
• And memory is cache!!! – now…
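A rough endurance estimate from the quoted write-cycle figure; the 1 TB drive capacity below is an assumption for illustration, and ideal wear leveling is assumed.

capacity_bytes = 1 * 10**12       # assume a 1 TB SSD (illustrative)
cycles_per_block = 300_000        # lower bound quoted above
total_writes = capacity_bytes * cycles_per_block   # with ideal wear leveling
print(f"~{total_writes / 10**15:.0f} PB can be written before wear-out")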
Memory, Storage, WAN
• System interconnects
• More concurrent computers are required, connected via the Internet in a hierarchical fashion.
1.2 Base Technologies – Cloud Computing
• Contents (5 parts): Computing Domains; CPU, GPU, Memory, and Networking Systems; Scalable System Models; Technology Convergence; Virtualization.
1.2.3 System Models
Scalable Computing – Basics
Underlying scalable technologies behind cloud computing:
• Cluster computing
• Computational grids
• P2P networks
• Internet technologies
• Virtualization
Scalable Cloud Computing - Models
• A gigantic computer with an elephantine specification might not solve complex problems.
• We need scalable distributed computing solutions.
• These scalable systems involve hundreds, thousands, or even millions of computers.
• These computers are involved collectively, cooperatively, or collaboratively at various levels of execution.
Cluster Computing - Architecture
(Figure: cluster nodes with disks and server I/O, interconnected by Ethernet, Myrinet, InfiniBand, or another high-speed network, and connected to the Internet via a gateway.)
Cluster Computing
• Benefits
• Scalable performance
• Efficient message passing
• High system availability
• Seamless fault tolerance
• Cluster-wide job management
• Challenges
• A dedicated OS that operates the cluster as a single system image (SSI) is not available.
• Software and applications must rely on middleware to achieve high performance.
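A minimal sketch of middleware-based message passing on a cluster, assuming the mpi4py package (one possible middleware binding, not prescribed by the slides); it would be launched with something like mpirun -np 4 python hello_cluster.py, where the script name is just an example.

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()            # this process's ID within the cluster job
size = comm.Get_size()            # total number of cooperating processes

if rank == 0:
    data = {"task": "cluster-wide job", "workers": size}
else:
    data = None
data = comm.bcast(data, root=0)   # the middleware distributes the job description
print(f"rank {rank}/{size} on {MPI.Get_processor_name()} received: {data}")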
Supercomputer
• System:
• A large collection of clusters housed in a single building; clouds utilize similar supercomputer-scale installations to build data centers.
• The computing units are tied together with high-speed networks.
• OS:
• Early supercomputers had a single OS (such as the Chippewa OS – mostly a job-sharing system).
• Software costs increased tremendously, so designers switched to Unix-style operating systems.
• Modern supercomputer OS:
• Each node may be installed with a different version of the OS (e.g., lightweight kernels on compute nodes, a full OS on service nodes).
• Drawbacks of supercomputers:
• Very expensive.
SuperMUC @ Leibniz Supercomputing Centre (LRZ)
LRZ Supercomputer
• A massively parallel system with more than 241,000 cores.
• Two phases of implementation (2011 – 2014).
• It reaches about 6.8 petaflop/s (combined peak of both phases).
• Ranked 27th in the Top500 list.
• For comparison, Sunway TaihuLight from NRCPC delivers 93 petaflop/s (LINPACK).
• FLOPS – floating-point operations per second (FLOPS for supercomputers, MIPS for PCs).
• Prefixes: Mega = 10^6 (million); Giga = 10^9 (billion); Tera = 10^12 (trillion); Peta = 10^15 (quadrillion); Exa = 10^18 (quintillion); Zetta = 10^21 (sextillion).
LRZ Supercomputer
• Phase I – Fat Nodes: 205 nodes, 8,200 cores, 1 island
• Phase I – Thin Nodes: 9,216 nodes, 147,456 cores, 18 islands
• Phase I – Many-Core Nodes: 32 nodes, 3,840 cores (Phi), 1 island
• Phase II – Haswell Nodes: 3,072 nodes, 86,016 cores, 6 islands
Phase I – LRZ SuperMUC
• A distributed-memory architecture with 19 partitions
• 18 partitions called islands + 1 I/O partition
• Each node is a shared-memory system with 2 processors
• Sandy Bridge-EP Intel Xeon E5-2680 8C
• 2.7 GHz nominal (3.5 GHz turbo)
• 32 GByte memory per node
• InfiniBand network interface
• Each processor has 8 cores
• 2-way hyperthreading
• 21.6 GFlops @ 2.7 GHz per core
• 172.8 GFlops per processor
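The per-core and per-processor peak figures above follow from simple arithmetic, assuming 8 double-precision floating-point operations per cycle per core (256-bit AVX on Sandy Bridge):

clock_ghz = 2.7
flops_per_cycle = 8            # 4 adds + 4 multiplies per cycle with 256-bit AVX
cores_per_processor = 8

gflops_per_core = clock_ghz * flops_per_cycle                   # 21.6 GFlops
gflops_per_processor = gflops_per_core * cores_per_processor    # 172.8 GFlops
print(gflops_per_core, gflops_per_processor)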
Sandy Bridge Processor
• L3 cache – shared among the cores of a processor (a node contains two such processors).
NUMA Node
(Figure: a NUMA node with two Sandy Bridge sockets, each attached to 8 × 4 GB of local memory, linked by 2 QPI links; the InfiniBand adapter is attached via 8× PCIe 3.0 at 8 GB/s.)
Performance of Phase-I
(Figure: Phase I nodes and interconnect with 126 links.)
Interconnection Network
• InfiniBand FDR-10
• FDR means Fourteen Data Rate: 14 Gb/s per lane
• FDR-10 has an effective data rate of 38.79 Gbit/s
• Latency: 100 ns per switch, 1 µs for MPI
• Vendor: Mellanox
• Intra-island topology: non-blocking tree
• 256 communication pairs can talk in parallel
• Inter-island topology: pruned tree 4:1
• 128 links per island to the next level
MPI Performance – IBM MPI over InfiniBand
(Figure: achieved bandwidth in MB/s versus message size.)
(Photo: cold corridor with InfiniBand (red) and Ethernet (green) cabling.)
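A hedged sketch of how such a bandwidth curve is typically measured: a ping-pong between two MPI ranks (assumes mpi4py and NumPy, which are illustrative choices; run with mpirun -np 2).

import time
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
reps = 100

for size_bytes in (1 << 10, 1 << 16, 1 << 20, 1 << 24):   # 1 KB .. 16 MB
    buf = np.zeros(size_bytes, dtype=np.uint8)
    comm.Barrier()
    t0 = time.perf_counter()
    for _ in range(reps):
        if rank == 0:
            comm.Send(buf, dest=1)
            comm.Recv(buf, source=1)
        else:
            comm.Recv(buf, source=0)
            comm.Send(buf, dest=0)
    dt = time.perf_counter() - t0
    if rank == 0:
        mb_s = (2 * reps * size_bytes) / dt / 1e6   # bytes moved both ways per second
        print(f"{size_bytes:>9} B : {mb_s:,.0f} MB/s")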
InfiniBand Interconnect
• 19 islands with 126 spine switches
I/O System
(Figure: GPFS file systems for $WORK and $SCRATCH, login nodes, $HOME, and the archive.)
SuperMUC Power Consumption
Linpack (HPL) run, May 17, 2012 – 2.582 PF
• Run start: 17.05.2012 20:56, 965.40 kW
• Run end: 18.05.2012 08:37, 711.02 kW
• Duration: 42,045 s (11.68 hours)
• Average power: 2,758.87 kW
• Energy: 32,308.68 kWh
• Subsystems included in the PDU measurements: computational nodes and interconnect network
(Figure: power (machine room and infrastructure, kW) and energy (machine room, kWh) over the course of the run.)
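A quick consistency check on the quoted numbers: energy should be roughly average power times duration.

avg_power_kw = 2758.87
duration_hours = 42045 / 3600          # 42,045 s ~= 11.68 h
energy_kwh = avg_power_kw * duration_hours
print(f"{energy_kwh:,.0f} kWh")        # ~32,221 kWh, close to the reported 32,308.68 kWh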
SuperMUC - Costs
• 1 million euro ≈ 9 crore Rs.
High-end system, Phase 1 (2010-2014) / Phase 2 (2014-2016):
• Investment costs (hardware and software): 53 Mio € / ~19 Mio €
• Operating costs (electricity and maintenance for hardware and software, some additional personnel): 32 Mio € / ~29 Mio €
• Sum: 85 Mio € / ~48 Mio €
• Extension buildings (construction and infrastructure): 49 Mio €
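Applying the conversion rate quoted above to the summed costs (a rough illustration only; the rate is the one stated on the slide):

crore_per_mio_eur = 9            # 1 Mio EUR ~= 9 crore INR, as quoted above
for name, cost_mio_eur in (("Phase 1", 85), ("Phase 2", 48)):
    print(f"{name}: {cost_mio_eur} Mio EUR ~= {cost_mio_eur * crore_per_mio_eur} crore INR")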
Energy Capping in Contract with IBM
• New funding scheme: energy included.
• The contract includes energy costs for 5 years.
• Power consumption of the system varies between 1 and 3 MW depending on the usage by the applications.
• Power bands
• LRZ has agreed upper and lower power bands within which the applications should keep the system.
• Contract penalty: 100,000 euros.
• The contract is based on the energy consumed by a benchmark suite agreed between IBM and LRZ.
IBM iDataPlex dx360 M4: Water Cooling
• The Sandy Bridge and Ivy Bridge processor-based machines use the IBM iDataPlex system (thin nodes and many-core nodes).
• Heat is removed by water – free cooling.
• Power advantage over an air-cooled node:
• Warm-water cooling: ~10% (cold-water cooling: ~15%)
• Water inlet = 18 to 45 °C, outlet < 50 °C
• due to lower component temperatures and no fans
• Typical operating conditions:
• Tair = 25 – 35 °C
• Twater = 18 – 45 °C
• No fans
• Less energy and noise
LRZ Infrastructure layout
https://www.youtube.com/watch?v=LzTedSh51Tw (cooling process)
(Figure: building layout – high-performance computer hall (Höchstleistungsrechner, column-free), entrance bridge (Zugangsbrücke), server/network rooms, archive/backup, cooling (Klima), and electrical systems.)
P2P Networks
• P2P computing has hierarchical levels of networking – logical and physical.
• In the P2P architecture, P2P systems sit at the physical level and overlay networks at the logical level.
• In P2P, every node acts as both a client and a server.
• All clients autonomously choose their peers (they join or leave on their own).
• No central coordination or central database is needed.
P2P Architecture
(Figure: overlay networks at the logical level, built on top of the physical network and computers.)
• Clients contribute resources to the tasks; they are involved in the ongoing P2P computation.
• Task-level parallelism can be achieved using P2P.
• Dynamic formation of topologies is possible using P2P.
P2P Architecture
• Data items or files are distributed among the participating peers.
• Based on communication, IDs are initialized and (logical) overlay networks are formed.
• 2 types of overlay networks:
• Unstructured – a random graph, no fixed routes, flooding-based search
• Structured – fixed connection topology, deterministic routing algorithms
• 3 design objectives:
• Data locality
• Network proximity (nearness in the network space)
• Interoperability
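A minimal sketch of the structured-overlay idea in the Chord style: peers and keys are hashed onto one ID ring and each key is routed to its successor peer. The peer names, file names, and ID size below are illustrative assumptions, not from the slides.

import hashlib
from bisect import bisect_right

ID_BITS = 16                                   # small ID space for the example

def ring_id(name: str) -> int:
    """Hash a peer or data name onto the ID ring."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (1 << ID_BITS)

peers = sorted(ring_id(p) for p in ["peerA", "peerB", "peerC", "peerD"])

def responsible_peer(key: str) -> int:
    """Route a key to its successor peer on the ring."""
    kid = ring_id(key)
    idx = bisect_right(peers, kid) % len(peers)  # wrap around the ring
    return peers[idx]

for f in ["song.mp3", "video.avi", "paper.pdf"]:
    print(f"{f} (id {ring_id(f)}) -> stored on peer id {responsible_peer(f)}")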
Scalable P2P Grid - Vishwa
Vishwa P2P Grid – IIT Madras (Prof. Janakiram)
P2P Application Families
• Four application groups are identified in P2P systems:
• Distributed file sharing
– e.g., BitTorrent
– e.g., content distribution of MP3 music and video
• Collaborative networking
– e.g., MSN, Skype, instant messaging
• Distributed computing
– e.g., Vishwa, SETI@home (search for extra-terrestrial intelligence)
• Other purposes, such as discovery, communication, security, and resource aggregation
– e.g., JXTA (programming), .NET, FightAIDS@Home (finding new drugs for HIV)
P2P Computing Challenges
• Hardware / software / network issues:
• P2P needs to handle too many hardware models and architectures.
• Incompatibilities exist between software and operating systems.
• It needs to handle different network connections and protocols (scalability issues).
• P2P performance is affected by routing efficiency and the self-organization of participating peers.
• As P2P is not centralized, manageability is also difficult.
Computing Grids
• Based on the power-grid concept.
• Resource sharing and coordinated problem solving in dynamic, multi-institutional virtual organizations.
• The difference from clustering is that grids are widely spread across the globe.
• Distributed vs. grid computing:
• Grid is an evolution of distributed computing
• Dynamic
• Geographically independent
• Built around standards
• Uses the Internet as its backbone
Why Grids?
Large-scale science and engineering are done through
the interaction of people, heterogeneous computing
resources, information systems, and instruments, all
of which are geographically and organizationally
dispersed.
Grid computing
• Special instruments may also be involved.
• Eg. Radio telescope in SETI@home
• Grid is often constructed across LAN, WAN, or
internet backbone networks at a regional, national, or
global scale.
• 3 kinds
• Computational grids
• Data grids
• P2P grids.
Grid Architecture – Hourglass Model
(Figure: hourglass model – user applications on top; services such as directory, brokering, and monitoring below.)
1.2 Base Technologies – Cloud Computing
• Contents (5 parts): Computing Domains; CPU, GPU, Memory, and Networking Systems; Scalable System Models; Technology Convergence; Virtualization.
Technology convergence
(Figure: computing paradigms converging, characterized by their attributes / capabilities.)