Distributed Systems
Chapter 1: Introduction to Distributed Systems
November 5, 2018
Presentation Outline
Introduction and Definition of Distributed Systems
Characteristics of Distributed Systems
Organization and Goals of DSs
The Client-Server Model
Types of Distributed Systems
Advantages and Challenges of DSs
Hardware and Software Concepts
1.1. Introduction
From a Single Computer to DS
Before the mid-80s, computers were
very expensive (hundreds of thousands or even millions
of dollars)
very slow (a few thousand instructions per second)
not connected among themselves
After the mid-80s: two major developments
cheap and powerful microprocessor-based computers
appeared
computer networks
LANs at speeds ranging from 10 to 1000 Mbps
WANs at speeds ranging from 64 Kbps to gigabits/sec
Consequence
feasibility of using a large network of computers to work for the
same application; this is in contrast to the old centralized systems
where there was a single computer with its peripherals
…Introduction
Networks of computers are everywhere!
Mobile phone networks
Corporate networks
Factory networks
Campus networks
Home networks
In-car networks
On board networks in planes and trains
This subject aims:
to cover characteristics of networked
computers that impact system designers
and implementers, and
to present the main concepts and
techniques that have been developed to
help in the tasks of designing and
implementing systems and applications that
are based on them (networks).
What Is a Distributed System?
Definition:
Operational perspective:
“A system in which hardware or software components
located at networked computers communicate and
coordinate their actions only by message passing.”
[Coulouris].
User perspective:
A distributed system is:
a collection of independent computers that appears to its
users as a single coherent system - computer (Tanenbaum
& Van Steen)
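The operational definition can be illustrated with a minimal sketch in which a component keeps its state private and is reachable only through messages. Threads and in-process queues stand in for networked computers here; the names `counter_node` and `run_demo` are hypothetical and not taken from the cited texts.

```python
import queue
import threading


def counter_node(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Hypothetical component: its counter is changed or read only via messages."""
    count = 0  # private state, never accessed directly by other components
    while True:
        msg = inbox.get()
        if msg == "stop":
            break
        if msg == "increment":
            count += 1
        elif msg == "read":
            outbox.put(count)


def run_demo() -> int:
    """Send three 'increment' messages, then ask the node for its value."""
    to_node: queue.Queue = queue.Queue()
    from_node: queue.Queue = queue.Queue()
    node = threading.Thread(target=counter_node, args=(to_node, from_node))
    node.start()
    for _ in range(3):
        to_node.put("increment")
    to_node.put("read")
    value = from_node.get(timeout=5)
    to_node.put("stop")
    node.join()
    return value
```

In a real distributed system the queues would be replaced by network channels, and messages could be delayed or lost, which this sketch does not model.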
This definition has two aspects:
1. Hardware: autonomous machines
2. Software: a single system view for the users
(Middleware)
Examples:
Cluster:
“A type of parallel or distributed processing system, which consists of a
collection of interconnected stand-alone computers cooperatively working
together as a single, integrated computing resource” [Buyya].
Cloud:
“a type of parallel and distributed system consisting of a collection of
interconnected and virtualised computers that are dynamically
provisioned and presented as one or more unified computing resources
based on service-level agreements established through negotiation
between the service provider and consumers” [Buyya].
Why Distributed?
Resource and Data Sharing
printers, databases, multimedia servers, ...
Availability, Reliability
the loss of some instances can be hidden
Scalability, Extensibility
the system grows with demand (e.g., extra servers)
Performance
huge power (CPU, memory, ...) available
Inherent distribution, communication
organizational distribution, e-mail, video
1.2. Characteristics of Distributed Systems
Differences between the computers and the ways they
communicate are hidden from users
Users and applications can interact with a distributed system
in a consistent and uniform way regardless of location
Distributed systems should be easy to expand and scale
a distributed system is normally continuously available, even
if there may be partial failures
- Users and applications should not notice that parts are
being replaced or fixed, or that new parts are added to serve
more users or applications
1.3. Organization and Goals of a Distributed System
to support heterogeneous computers and networks and to
provide a single-system view, a distributed system is
often organized by means of a layer of software called
middleware that extends over multiple machines
Same interface everywhere
Fig: a distributed system organized as middleware; note that the middleware
layer extends over multiple machines
Goals of a distributed system:
a distributed system should
make resources accessible (printers, computers, storage
facilities, data, files, Web pages, ...)
reasons: economics, to collaborate and exchange
information
be transparent: hide the fact that the resources and
processes are distributed across multiple computers.
be open
be scalable
Transparency in a Distributed System
a distributed system that is able to present itself to users
and applications as if it were only a single computer
system is said to be transparent
different forms of transparency in a distributed system
Transparency   Description
Access         Hide differences in data representation and how a resource is accessed
Location       Hide where a resource is physically located; where is http://www.prenhall.com/index.html? (naming)
Migration      Hide that a resource may move to another location
Relocation     Hide that a resource may be moved to another location while in use; e.g., mobile users using their wireless laptops
Replication    Hide that a resource is replicated
Concurrency    Hide that a resource may be shared by several competitive users; a resource must be left in a consistent state
Failure        Hide the failure and recovery of a resource
Persistence    Hide whether a (software) resource is in memory or on disk
Openness in a Distributed System
an Open Distributed System is a system that offers services
according to standard rules that describe the syntax and
semantics of those services; e.g., protocols in networks
a distributed system should be open
we need well-defined interfaces
interoperability
components of different origin can communicate
portability
components work on different platforms
another goal of an open distributed system is that it should
be flexible and extensible; easy to configure the system out
of different components; easy to add new components,
replace existing ones
in distributed systems, such services are often specified
through interfaces often described using an Interface
Definition Language (IDL)
specify only syntax: the names of the functions, types
of parameters, return values, possible exceptions, ...
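As a rough analogy to an IDL, the sketch below uses Python's `typing.Protocol` to state only the syntax of a service (function name, parameter types, return type, possible exception) separately from any implementation. `FileService`, `InMemoryFileService`, and `first_bytes` are hypothetical names chosen for illustration; a real IDL would be language-neutral.

```python
from typing import Protocol


class FileService(Protocol):
    """Hypothetical interface: names, parameter types, return types only."""

    def read(self, path: str, offset: int, length: int) -> bytes:
        """May raise FileNotFoundError."""
        ...


class InMemoryFileService:
    """One possible implementation; clients depend only on FileService."""

    def __init__(self) -> None:
        self._files = {"/motd": b"hello, world"}

    def read(self, path: str, offset: int, length: int) -> bytes:
        if path not in self._files:
            raise FileNotFoundError(path)
        return self._files[path][offset:offset + length]


def first_bytes(service: FileService, path: str, n: int) -> bytes:
    """Client code written against the interface, not the implementation."""
    return service.read(path, 0, n)
```

Because the interface fixes only syntax, two implementations can both satisfy it and still behave differently, which is why the semantics of a service usually have to be documented separately.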
Scalability in Distributed Systems
Scalability in three dimensions
a distributed system should be scalable:
in size: adding more users and resources to the system
geographically: users and resources may be far apart
Administratively: should be easy to manage even if it
spans many administrative organizations
Scalability Problems
Problems with size scalability: performance problems caused by
limited capacity of servers and networks
Often caused by centralized solutions
Concept                  Example
Centralized services     Single server for all users, mostly for security reasons
Centralized data         A single on-line telephone book
Centralized algorithms   Doing routing based on complete information
Table: examples of scalability limitations
Problems with geographical scalability:
traditional synchronous communication in LAN
unreliable communications in WAN
Problems with administrative scalability:
Conflicting policies, complex management, security problems
Scaling Techniques
how to solve scaling problems
the problem is mainly performance, and arises as a result
of limitations in the capacity of servers and networks (for
geographical scalability)
three possible solutions: hiding communication latencies,
distribution, and replication
a. Hiding Communication Latencies
try to avoid waiting for responses to remote service
requests
let the requester do other useful work
i.e., construct requesting applications that use only
asynchronous communication instead of synchronous
communication; when a reply arrives the application is
interrupted
good for batch processing and parallel applications but
not for interactive applications
for interactive applications, move part of the job to the
client to reduce communication; e.g. filling a form and
checking the entries
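The idea of not blocking on each remote request can be sketched with Python's `asyncio`. This is only an illustration under stated assumptions: `remote_call` is a hypothetical stand-in that simulates network latency with `asyncio.sleep`, and the 0.2-second delay is an arbitrary value.

```python
import asyncio
import time
from typing import List, Tuple


async def remote_call(name: str, latency: float) -> str:
    """Hypothetical remote request; the sleep simulates network latency."""
    await asyncio.sleep(latency)
    return f"reply from {name}"


async def fetch_all() -> List[str]:
    # Issue all requests up front instead of waiting for each reply in turn.
    tasks = [asyncio.create_task(remote_call(n, 0.2)) for n in ("a", "b", "c")]
    _local_result = sum(range(1000))  # other useful work done while requests are pending
    return list(await asyncio.gather(*tasks))


def timed_run() -> Tuple[List[str], float]:
    """Return the replies and the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    replies = asyncio.run(fetch_all())
    return replies, time.perf_counter() - start
```

With synchronous calls the three requests would take roughly three times the single-request latency; issued asynchronously, the waits overlap.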
b. Replication
replicate components across a distributed system to
increase availability and for load balancing, leading to
better performance
decided by the owner of a resource
caching (a special form of replication) also reduces
communication latency; decided by the user
but, caching and replication may lead to consistency
problems (see Chapter 6 - Consistency and Replication)
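A minimal sketch of the trade-off described above: a client-side cache avoids repeated remote reads, but returns a stale value once the original changes. `Origin` and `CachingClient` are hypothetical classes, and explicit invalidation is just one of several possible consistency strategies.

```python
class Origin:
    """Hypothetical remote resource; counts how often it is actually read."""

    def __init__(self) -> None:
        self.data = {"price": 10}
        self.reads = 0

    def read(self, key: str) -> int:
        self.reads += 1
        return self.data[key]

    def write(self, key: str, value: int) -> None:
        self.data[key] = value


class CachingClient:
    """Keeps a local copy of each value after the first remote read."""

    def __init__(self, origin: Origin) -> None:
        self.origin = origin
        self.cache = {}

    def read(self, key: str) -> int:
        if key not in self.cache:
            self.cache[key] = self.origin.read(key)  # remote access only on a miss
        return self.cache[key]

    def invalidate(self, key: str) -> None:
        """Drop the local copy so the next read goes back to the origin."""
        self.cache.pop(key, None)
```

After `origin.write("price", 12)`, a client that cached the old value keeps returning 10 until `invalidate("price")` is called, which is the consistency problem in miniature.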
1.4. The Client-Server Model
how are processes organized in a system
thinking in terms of clients requesting services from
servers
general interaction between a client and a server
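The request-reply interaction can be sketched as follows. To keep the example self-contained, a connected socket pair from `socket.socketpair()` stands in for a network connection and the server runs in a thread of the same process; `serve_one` and `request_reply` are hypothetical names.

```python
import socket
import threading


def serve_one(conn: socket.socket) -> None:
    """Server side: wait for one request, then send back one reply."""
    with conn:
        request = conn.recv(1024).decode()
        conn.sendall(f"echo: {request}".encode())


def request_reply(message: str) -> str:
    """Client side: send a request and block until the reply arrives."""
    client_end, server_end = socket.socketpair()
    server = threading.Thread(target=serve_one, args=(server_end,))
    server.start()
    with client_end:
        client_end.sendall(message.encode())
        reply = client_end.recv(1024).decode()
    server.join()
    return reply
```

The client blocks in `recv` while the server handles the request, which is the synchronous pattern the figure describes. The sketch assumes short messages that arrive in a single `recv` call.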
1.4.1. Application Layering
no clear distinction between a client and a server; for
instance a server for a distributed database may act as a
client when it forwards requests to different file servers
three levels exist
the user-interface level: implemented by clients and
contains all that is required by a client; usually
through GUIs, but not necessarily
the processing level: contains the applications
the data level: contains the programs that maintain
the actual data dealt with
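The three levels can be sketched as three small pieces of code that only talk to the level directly below them. The banking scenario and the names `DataLevel`, `ProcessingLevel`, and `user_interface` are assumptions made for illustration.

```python
class DataLevel:
    """Data level: maintains the actual data."""

    def __init__(self) -> None:
        self._balances = {"alice": 120, "bob": 75}

    def balance_of(self, user: str) -> int:
        return self._balances[user]


class ProcessingLevel:
    """Processing level: application logic, independent of presentation."""

    def __init__(self, data: DataLevel) -> None:
        self._data = data

    def can_withdraw(self, user: str, amount: int) -> bool:
        return amount <= self._data.balance_of(user)


def user_interface(processing: ProcessingLevel, user: str, amount: int) -> str:
    """User-interface level: formats the result (text here, not a GUI)."""
    verdict = "approved" if processing.can_withdraw(user, amount) else "declined"
    return f"{user}: withdrawal of {amount} {verdict}"
```

In a distributed deployment each level could run on a different machine, with the calls between levels replaced by remote requests.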
1.5. Types of Distributed Systems
1. Distributed computing systems
Used for high performance computing tasks
Cluster and Cloud computing systems
Grid computing systems
2. Distributed information systems
Systems mainly for management and integration of business functions
Transaction processing systems
Enterprise application integration
– Goal: Distribute information across several servers
3. Distributed pervasive (ubiquitous) systems
– Focus on mobile, embedded, communicating systems
– Goal: Populate a real-life environment with a large variety of smart devices.
1. Distributed Computing Systems
a) Cluster Computing Systems
Essentially a group of systems connected through a LAN.
Homogeneous
o Same OS, near-identical hardware
A collection of computing nodes + master node
Master runs middleware: parallel execution and management
Centralized job management & scheduling system
b) Grid Computing Systems
Lots of nodes (including clusters across multiple subnets) from
everywhere.
Federation of autonomous and heterogeneous computer
systems (HW,OS,...), several admin domains
Heterogeneous
Dispersed across several organizations
To allow for collaborations, grids generally use virtual
organizations.
Distributed job management & scheduling
Fig: A layered architecture for grid computing systems
c) Cloud Computing Systems
Over 20 definitions:
http://cloudcomputing.sys-con.com/read/612375_p.htm
Renting “remote storage” backup
Renting “remote server” hosting Web server
Renting more “remote servers” to manage a large workload
Scientific definition of Cloud Computing
“Cloud is a market-oriented distributed computing system
consisting of a collection of inter-connected and virtualized
computers that are dynamically provisioned and presented as
one or more unified computing resources based on service-
level agreements (SLAs) established through negotiation
between the service provider and consumers.”
SLA = {negotiated and agreed QoS parameters + rewards
+ penalties for violation of agreement....}
(taken from www.cloudbus.org and www.buyya.com)
Cloud Services
Infrastructure as a Service (IaaS)
CPU, Storage: Amazon.com, Google Compute, ….
Platform as a Service (PaaS)
Google App Engine, Microsoft Azure,..
Software as a Service (SaaS)
Gmail.com, Facebook.com, Youtube.com, SalesForce.Com,…
…..Cloud Services
Fig: Cloud Service architecture
Cloud Deployment Models
Public/Internet Clouds
3rd party, multi-tenant Cloud infrastructure & services
* available on subscription basis
Private/Enterprise Clouds
Cloud model run within a company’s own Data Center / infrastructure for internal and/or partners use.
Hybrid/Inter Clouds
Mixed usage of private and public Clouds: leasing public cloud services when private cloud capacity is insufficient
Cloud Applications
• Scientific/Tech Applications
• Business Applications
• Consumer/Social Applications
Transaction Processing Systems
A transaction is a collection of operations on the state of an object
(database, object composition, etc.) that satisfies the following properties
(ACID):
Atomicity: All operations either succeed, or all of them fail.
- When the transaction fails, the state of the object will remain
unaffected by the transaction.
Consistency: A transaction establishes a valid state transition.
- This does not exclude the possibility of invalid,
intermediate states during the transaction’s execution.
Isolation: Concurrent transactions do not interfere with each other.
- It appears to each transaction T that other transactions occur either
before T, or after T, but never both.
Durability: After the execution of a transaction, its effects are
made permanent:
- Changes to the state survive failures.
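Atomicity and consistency can be demonstrated on a single machine with Python's built-in `sqlite3` module. This is a sketch, not a distributed transaction: the `account` table, the non-negative balance rule, and the function names are assumptions for illustration.

```python
import sqlite3


def make_bank() -> sqlite3.Connection:
    """Create an in-memory database with two accounts (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER CHECK (balance >= 0))"  # validity rule for consistency
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)",
                     [("alice", 100), ("bob", 50)])
    conn.commit()
    return conn


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move money as one transaction: both updates happen, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn: sqlite3.Connection) -> dict:
    return dict(conn.execute("SELECT name, balance FROM account"))
```

When a transfer would overdraw the source account, the second update violates the `CHECK` constraint, the whole transaction is rolled back, and the credit already applied to the destination is undone as well.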
Transaction Processing Monitor
In many cases, the data involved in a transaction is distributed across several
servers. A TP Monitor is responsible for coordinating the execution of a
transaction
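One common way to coordinate a transaction across several servers is a two-phase, vote-then-decide protocol. The slide does not say which protocol a TP monitor uses, so the sketch below is an assumption for illustration only; `Participant` and `coordinate` are hypothetical names, and failures of the coordinator itself are not modelled.

```python
from typing import List


class Participant:
    """Hypothetical server holding part of the transaction's data."""

    def __init__(self, name: str, will_accept: bool = True) -> None:
        self.name = name
        self.will_accept = will_accept
        self.state = "idle"

    def prepare(self) -> bool:
        """Phase 1: vote on whether the local part can be committed."""
        self.state = "prepared" if self.will_accept else "aborted"
        return self.will_accept

    def commit(self) -> None:
        self.state = "committed"

    def abort(self) -> None:
        self.state = "aborted"


def coordinate(participants: List[Participant]) -> bool:
    """Phase 2: commit everywhere only if every participant voted yes."""
    votes = [p.prepare() for p in participants]
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False
```

The point of the sketch is that a single "no" vote aborts the transaction on every server, so the distributed transaction stays atomic.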
Mobile Computing Systems
Mobile computing systems are generally a subclass of ubiquitous
computing systems and meet all of the five requirements.
Typical characteristics
Many different types of mobile devices: smart phones, remote controls,
car equipment, and so on
Wireless communication
Devices may continuously change their location =>
o setting up a route may be problematic, as routes can change
frequently
o devices may easily be temporarily disconnected=>
disruption-tolerant networks
Sensor Networks
Consists of spatially distributed autonomous sensors to
cooperatively monitor physical or environmental conditions, such
as temperature, sound, vibration, pressure, motion or pollutants,
etc.
Characteristics
The nodes to which sensors are attached are:
• Many (10s-1000s)
• Simple (small memory/compute/communication capacity)
• Often battery-powered (or even battery-less)
Distributed Pervasive Systems: Examples
Electronic Health Systems
Devices are physically close to a person
Where and how should monitored data be stored?
How can we prevent loss of crucial data?
What infrastructure is needed to generate and
propagate alerts?
How can security be enforced?
How can physicians provide online feedback?
Pros and Cons of Distributed Systems
Pros of Distributed Systems
Performance: Very often a collection of processors can provide higher
performance (and better price/performance ratio) than a centralized
computer.
Distribution: many applications involve, by their nature, spatially
separated machines (banking, commercial, automotive system).
Reliability (fault tolerance): if some of the machines crash, the system
can survive.
Incremental growth: as requirements on processing power grow, new
machines can be added incrementally.
Sharing of data/resources: shared data is essential to many
applications (banking, computer supported cooperative work,
reservation systems); other resources can be also shared (e.g. expensive
printers).
Communication: facilitates human-to-human communication.
Cons of Distributed Systems
Difficulties of developing distributed software: what should
operating systems, programming languages and
applications look like?
Networking problems: several problems are created by
the network infrastructure, which have to be dealt
with: loss of messages, overloading, ...
Security problems: sharing generates the problem of data
security.
1.6. Hardware and Software Concepts
o Hardware Concepts
different classification schemes exist
Multiprocessors - with shared memory
Multicomputers - that do not share memory
can be homogeneous or heterogeneous
Heterogeneous Multicomputer Systems
most distributed systems are built on heterogeneous
multicomputer systems
the computers could be different in processor type,
memory size, architecture, power, operating system, etc.
and the interconnection network may be highly
heterogeneous as well
the distributed system provides a software layer to hide the
heterogeneity at the hardware level; i.e., provides
transparency
o Software Concepts
OSs in relation to distributed systems
tightly-coupled systems, referred to as distributed OSs
(DOS)
the OS tries to maintain a single, global view of the
resources it manages
used for multiprocessors and homogeneous
multicomputers
loosely-coupled systems, referred to as network OSs
(NOS)
a collection of computers each running its own OS; they
work together to make their services and resources
available to others
used for heterogeneous multicomputers
Middleware: to enhance the services of NOSs so that a
better support for distribution transparency is provided
Distributed Operating Systems
two types
multiprocessor operating system: to manage the
resources of a multiprocessor
multicomputer operating system: for homogeneous
multicomputers
Uniprocessor Operating Systems
separating applications from operating system code
through a microkernel
Network Operating Systems
possibly heterogeneous underlying hardware
constructed from a collection of uniprocessor systems, each
with its own operating system and connected to each other
in a computer network
general structure of a network operating system
Services offered by network operating systems
remote login (rlogin)
remote file copy (rcp)
shared file systems through file servers
two clients and a server in a network operating system