
CS 3551 Distributed Computing
Unit 1
What is distributed computing?
The HOD wants to assign paper correction (500 papers in total).

Scenario 1: assign all papers to a single faculty member.
Arun => 500 papers => 5 days

Scenario 2: assign the papers to 5 faculty members.
Arun => 100 papers
Gopal => 100 papers
Ashif => 100 papers
Hari => 100 papers
Murali => 100 papers

Eg 1 - Online banking transactions for 10 million users
Eg 2 - Image rendering: resize, filter, color, effects
Definition

● Autonomous processors communicating over a communication network

● Some characteristics
► No common physical clock
► No shared memory
► Geographical separation
► Autonomy and heterogeneity
Characteristics
Relation to Computer System Components
Key Points
● The distributed software is also termed middleware.
● A distributed execution is the execution of processes across the distributed
system to collaboratively achieve a common goal. An execution is also
sometimes termed a computation or a run.
● The middleware is the distributed software that drives the distributed system,
while providing transparency of heterogeneity at the platform level.
● The middleware layer does not contain the traditional application-layer functions of
the network protocol stack, such as HTTP, mail, FTP, and telnet.
● Various primitives and calls to functions defined in various libraries of the
middleware layer are embedded in the user program code.
● There exist several libraries to choose from to invoke primitives for the more
common functions – such as reliable and ordered multicasting – of the
middleware layer.
● There are several standards such as Object Management Group’s (OMG)
common object request broker architecture (CORBA) [36], and the
remote procedure call (RPC) mechanism.
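As a concrete illustration of middleware primitives embedded in user program code, the sketch below uses Python's built-in `xmlrpc` library as a stand-in RPC middleware; the function name `add` and the use of `localhost` are illustrative assumptions, not part of any particular middleware standard.

```python
# Sketch: an RPC call embedded in user code, with Python's built-in
# xmlrpc library standing in for the middleware layer.
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

def add(a, b):
    # A function exported by the remote (server) process.
    return a + b

# Middleware side: register the function and serve requests.
# Port 0 lets the OS pick a free port.
server = SimpleXMLRPCServer(("localhost", 0), logRequests=False)
server.register_function(add, "add")
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# User program side: the RPC call looks like a local function call;
# the middleware handles marshalling and network transport.
proxy = ServerProxy(f"http://localhost:{port}")
result = proxy.add(2, 3)
print(result)  # -> 5
server.shutdown()
```

The key point is transparency: the user code invokes `proxy.add(2, 3)` as if it were local, while the library performs the network communication underneath.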
Motivation/ Benefit of Distributed Computing
Key Points
1. Inherently distributed computations
a. In many applications such as money transfer in banking, or reaching consensus among parties that are
geographically distant, the computation is inherently distributed.
2. Resource sharing
a. Resources such as peripherals, complete data sets in databases, special libraries, as well as data
(variables/files) cannot be fully replicated at all the sites because doing so is often neither practical nor
cost-effective.
b. Further, they cannot be placed at a single site because access to that site might prove to be a
bottleneck.
c. Therefore, such resources are typically distributed across the system. For example, distributed databases
such as DB2 partition the data sets across several servers, in addition to replicating them at a few sites
for rapid access as well as reliability.
3. Access to geographically remote data and resources
a. In many scenarios, the data cannot be replicated at every site participating in the distributed execution
because it may be too large or too sensitive to be replicated. For example, payroll data within a
multinational corporation is both too large and too sensitive to be replicated at every branch office/site. It is
therefore stored at a central server which can be queried by branch offices. Similarly, special resources
such as supercomputers exist only in certain locations, and to access such supercomputers, users need
to log in remotely.
5. Enhanced reliability
a. A distributed system has the inherent potential to provide increased reliability because of the possibility
of replicating resources and executions, as well as the reality that geographically distributed resources
are not likely to crash/malfunction at the same time under normal circumstances.
b. Reliability entails several aspects:
i. availability, i.e., the resource should be accessible at all times;
ii. integrity, i.e., the value/state of the resource should be correct, in the face of concurrent
access from multiple processors, as per the semantics expected by the application;
iii. fault-tolerance, i.e., the ability to recover from system failures, where such failures may be
defined to occur in one of many failure models.
5. Increased performance/cost ratio
a. By resource sharing and accessing geographically remote data and resources, the
performance/cost ratio is increased.
b. Although higher throughput has not necessarily been the main objective behind using a
distributed system, nevertheless, any task can be partitioned across the various computers in
the distributed system.
c. Such a configuration provides a better performance/cost ratio than using special parallel
machines.
6. Scalability
a. As the processors are usually connected by a wide-area network, adding more processors
does not pose a direct bottleneck for the communication network.
7. Modularity and incremental expandability
a. Heterogeneous processors may be easily added into the system without affecting the
performance, as long as those processors are running the same middleware algorithms.
b. Similarly, existing processors may be easily replaced by other processors.
Distributed Vs Parallel Computing
Message Passing vs Shared Memory
Key Points
● Shared memory systems are those in which there is a (common) shared address
space throughout the system.
● Communication among processors takes place via shared data variables, and
control variables for synchronization among the processors.
● Semaphores and monitors that were originally designed for shared memory
uniprocessors and multiprocessors are examples of how synchronization can be
achieved in shared memory systems.
● All multicomputer (NUMA as well as message-passing) systems that do not have
a shared address space provided by the underlying architecture and hardware
necessarily communicate by message passing.
● When the shared memory abstraction is provided over a distributed system, it is
called distributed shared memory. Implementing this abstraction has a certain cost
but it simplifies the task of the application programmer.
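To make the shared-memory model concrete, here is a minimal sketch of synchronization via shared variables, using Python threads and a semaphore (the counter and thread counts are illustrative choices, not from the text).

```python
# Sketch: synchronization in a shared-memory system. The threads
# communicate via a shared data variable (counter) and a control
# variable (a semaphore) for mutual exclusion.
import threading

counter = 0                   # shared data variable
sem = threading.Semaphore(1)  # control variable for synchronization

def worker(n):
    global counter
    for _ in range(n):
        sem.acquire()   # enter critical section
        counter += 1    # safe update of the shared variable
        sem.release()   # leave critical section

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # -> 40000
```

Without the semaphore guarding the read-modify-write, concurrent updates could be lost; this is exactly the role semaphores and monitors play in shared-memory systems.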
Emulating MP in SM
[Figure: processors P1, P2, P3 accessing a common shared memory]
1. The shared address space can be partitioned into disjoint parts, one part
being assigned to each processor.
2. “Send” and “receive” operations can be implemented by writing to and reading
from the destination/sender processor’s address space, respectively.
Specifically, a separate location can be reserved as the mailbox for each
ordered pair of processes.
3. A Pi–Pj message-passing can be emulated by a write by Pi to the mailbox and
then a read by Pj from the mailbox.
4. The write and read operations need to be controlled using synchronization
primitives to inform the receiver/sender after the data has been sent/received.
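The four steps above can be sketched as follows; a `Mailbox` object stands in for the reserved location for one ordered pair of processes, and a condition variable plays the role of the synchronization primitive (class and variable names are illustrative).

```python
# Sketch: emulating message passing on shared memory. One mailbox is
# reserved per ordered pair of processes; send/receive are writes/reads
# of that location, guarded by a condition variable.
import threading

class Mailbox:
    def __init__(self):
        self.slot = None                   # the reserved shared location
        self.cond = threading.Condition()  # synchronization primitive

    def send(self, msg):
        # "Send" = write into the mailbox once it is empty.
        with self.cond:
            while self.slot is not None:
                self.cond.wait()
            self.slot = msg
            self.cond.notify_all()         # inform the receiver

    def receive(self):
        # "Receive" = read (and clear) the mailbox once it is full.
        with self.cond:
            while self.slot is None:
                self.cond.wait()
            msg, self.slot = self.slot, None
            self.cond.notify_all()         # inform the sender
            return msg

# Mailbox reserved for the ordered pair (P1, P2).
box_12 = Mailbox()
received = []

def p2():
    received.append(box_12.receive())      # Pj reads from the mailbox

t = threading.Thread(target=p2)
t.start()
box_12.send("hello from P1")               # Pi writes to the mailbox
t.join()
print(received)  # -> ['hello from P1']
```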
Emulating SM in MP
1. This involves the use of “send” and “receive” operations for “write” and “read”
operations.
2. Each shared location can be modeled as a separate process;
a. “write” to a shared location is emulated by sending an update message to the
corresponding owner process;
b. a “read” to a shared location is emulated by sending a query message to the owner
process.
3. The latencies involved in read and write operations may be high even when
using shared memory emulation.
4. An application can of course use a combination of shared memory and
message-passing.
5. In a MIMD message-passing multicomputer system, each “processor” may be
a tightly coupled multiprocessor system with shared memory. Within the
multiprocessor system, the processors communicate via shared memory.
Between two computers, the communication is by message passing.
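Steps 1 and 2 above can be sketched with an owner thread per shared location serving "write" (update) and "read" (query) messages over queues; the location name `x`, the message tuple format, and the helper names are illustrative assumptions.

```python
# Sketch: emulating shared memory on a message-passing system. Each
# shared location is modeled as a separate owner process (here a thread)
# that serves update and query messages sent over a queue.
import queue
import threading

requests = queue.Queue()  # channel carrying messages to the owner of x

def owner():
    value = 0  # the state of shared location x, private to its owner
    while True:
        op, arg, reply = requests.get()
        if op == "write":
            value = arg        # "write" arrives as an update message
        elif op == "read":
            reply.put(value)   # "read" arrives as a query message
        elif op == "stop":
            break

threading.Thread(target=owner, daemon=True).start()

def write(v):
    # Emulated "write": send an update message to the owner process.
    requests.put(("write", v, None))

def read():
    # Emulated "read": send a query message and wait for the value.
    reply = queue.Queue()
    requests.put(("read", None, reply))
    return reply.get()

write(42)
result = read()
print(result)  # -> 42
requests.put(("stop", None, None))
```

Note that every emulated read and write costs a message round trip, which illustrates why the latencies of shared memory emulation can be high, as point 3 above observes.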
