COMP3358 Distributed and Parallel Computing Final Notes

The document outlines various concepts related to client-server architecture, including the RMI process for remote procedure calls, the ACID properties of transactions, and consistency models. It also discusses caching performance in systems like Facebook's photo cache, the Raft consensus protocol for fault tolerance, and the scalability of MapReduce dataflow. Key takeaways include the importance of parallelization, data locality, and fault tolerance in distributed systems.


L02 Client Server

L02 RMI → Example of object middleware


In summary, the process follows this order:
1. Parameter collection: the client stub gathers the parameters for the remote procedure call.
2. Marshalling: the parameters are prepared for transmission (including type information, order, etc.).
3. Serialization: as part of marshalling, the parameters are converted into a stream of bytes.
4. Network transmission: the serialized data is sent over the network.
5. Deserialization: the server converts the byte stream back into data structures.
6. Unmarshalling: the server reconstructs the original parameter values.
7. Calling the server procedure/method.
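The seven steps above can be sketched in Python. This is a hedged, single-process sketch: `pickle` stands in for RMI's Java serialization, and the dispatch table plus the bytes hand-off (in place of a real socket) are illustrative assumptions.

```python
import pickle

def add(a, b):
    # The "remote" procedure implemented on the server side.
    return a + b

# 1-3. Client stub: collect the parameters, marshal them (method name,
#      argument order), and serialize the whole request into bytes.
request = {"method": "add", "args": (2, 3)}
wire_bytes = pickle.dumps(request)

# 4. Network transmission (simulated here as a plain bytes hand-off).

# 5-6. Server: deserialize the byte stream and reconstruct the parameters.
msg = pickle.loads(wire_bytes)

# 7. Call the actual server procedure.
dispatch = {"add": add}
result = dispatch[msg["method"]](*msg["args"])
print(result)  # 5
```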
L03 ACID & Transactions

Atomicity. In a txn involving two or more discrete pieces of info, either all of the pieces are committed or none are. (Example: transaction 1 is effectively incomplete, because transaction 2 has overwritten its effect.)
Consistency. A transaction either creates a new and valid state of data, or, if any failure occurs, returns all data to its state before the transaction was started. (Example: two people have booked the seat, but only one booking is recorded in the database.)
Isolation. A txn in process and not yet committed must remain isolated from any other txn. (Example: both transactions update the same location; transaction 1 is not isolated from transaction 2.)
Durability. Committed data is saved by the system such that, even in the event of a failure and system restart, the data is available in its correct state. (Example: the effects of transaction 1 have not endured because they have been overwritten by transaction 2.)
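A minimal sketch of atomicity using Python's `sqlite3` (the bank-transfer table and the simulated crash are illustrative assumptions, not from the lecture):

```python
import sqlite3

# Toy bank transfer: the debit and the credit must commit together or not at all.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 0)])
conn.commit()

try:
    with conn:  # context manager: commit on success, roll back on exception
        conn.execute("UPDATE accounts SET balance = balance - 50 "
                     "WHERE name = 'alice'")
        # Simulated crash before the matching credit to bob ever runs:
        raise RuntimeError("crash mid-transaction")
except RuntimeError:
    pass

# Atomicity: the lone debit was rolled back, so no money disappeared.
row = conn.execute("SELECT balance FROM accounts WHERE name = 'alice'").fetchone()
print(row[0])  # 100
```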
L04 Consistency Models

Parallelization and scalability, Amdahl's law

Synchronization, consistency
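Amdahl's law, listed above, bounds the speedup from parallelization: with parallel fraction p on n processors, speedup = 1 / ((1 − p) + p/n). A quick numerical sketch:

```python
def amdahl_speedup(p, n):
    """Speedup with parallel fraction p on n processors: 1 / ((1-p) + p/n)."""
    return 1.0 / ((1.0 - p) + p / n)

# Even with 95% of the work parallelizable, 1000 processors give
# under 20x speedup: the serial 5% caps speedup at 1/0.05 = 20.
print(amdahl_speedup(0.95, 1000))
```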
L05 Facebook photo cache
Mutual exclusion, locking, and issues related to locking
Key takeaway: FIFO needs a cache of size X to reach a 59% hit ratio, while S4LRU needs only ⅓ X to reach the same performance.

Conclusion:
• Quantify caching performance
• Quantify popularity changes across layers of caches
• Recency, frequency, age, and social factors impact
cache
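The FIFO-vs-recency takeaway can be approximated with a toy simulation. This sketch compares plain FIFO and plain LRU (not the layered S4LRU from the paper) on an assumed Zipf-like trace; the item count, skew, and cache sizes are illustrative assumptions:

```python
import random
from collections import OrderedDict

def hit_ratio(trace, capacity, policy):
    """Simulate a cache over `trace`; policy is 'fifo' or 'lru'."""
    cache = OrderedDict()
    hits = 0
    for key in trace:
        if key in cache:
            hits += 1
            if policy == "lru":
                cache.move_to_end(key)  # recency update; FIFO ignores hits
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict front: oldest / least recent
            cache[key] = True
    return hits / len(trace)

# Zipf-like popularity, as in photo workloads: a few items are very hot.
random.seed(0)
items = list(range(1000))
weights = [1 / (rank + 1) for rank in items]
trace = random.choices(items, weights=weights, k=20000)

print(hit_ratio(trace, 100, "fifo"), hit_ratio(trace, 100, "lru"))
```

On skewed traces like this, LRU typically reaches a given hit ratio with a smaller cache than FIFO, which is the direction of the S4LRU result above.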
L06 RAFT - http://thesecretlivesofdata.com/raft/
Reliable, Replicated, Redundant, And Fault-Tolerant.
Raft consensus is among the strongest protocols in terms of reliability: it always preserves sequential consistency, but progress is best effort (not 100% guaranteed); in some cases it can fail to elect a leader or to approve some operations.

If the network is really bad, the Raft protocol can make no progress at all because no node can achieve consensus. This is acceptable: no progress still counts as sequential consistency (better than making a local decision and diverging).
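A minimal sketch of the majority rule behind Raft leader election (the peer structure and `request_votes` helper are hypothetical simplifications; real Raft also compares log up-to-dateness and updates terms on each RPC):

```python
def request_votes(candidate_term, peers):
    """A candidate wins only with a strict majority of the whole cluster."""
    votes = 1  # the candidate votes for itself
    for peer in peers:
        # Simplified grant rule: the peer's term is not ahead of the
        # candidate's, and the peer has not yet voted in this term.
        if candidate_term >= peer["term"] and not peer["voted"]:
            votes += 1
    cluster_size = len(peers) + 1
    return votes > cluster_size // 2  # majority => elected leader

peers = [{"term": 3, "voted": False}, {"term": 3, "voted": False},
         {"term": 5, "voted": False}, {"term": 3, "voted": True}]
print(request_votes(4, peers))  # 3 of 5 votes: majority, so True
```

This is also why a bad network stalls Raft rather than corrupting it: without a majority, no leader is elected and no operation is approved.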
L07 MapReduce

What makes MapReduce dataflow so scalable?
Parallelization:
- Both map and reduce operations run in parallel across many machines
- Input data is automatically partitioned, allowing independent processing
Locality optimization:
- The system tries to assign mappers to machines where the input data is stored
- This minimizes network transfer and utilizes data locality
The shuffle mechanism:
- Though network-intensive, it's a structured communication pattern
- Data with the same keys is routed to the same reducer regardless of scale
Fault tolerance:
- Failed tasks are automatically redistributed and restarted
- Intermediate results are stored for recovery purposes

What is meant by "dataflow" in MapReduce?


In the context of MapReduce, "dataflow" refers to the controlled movement and transformation of data through the entire processing pipeline. Specifically, it describes:
- The path data takes: how data flows from input sources through mappers, through the shuffle phase, to reducers, and finally to output storage.
- Transformation stages: how data changes form at each stage, from raw input data to key-value pairs after mapping, to grouped key-value pairs after shuffling, to aggregated results after reducing.
- Data exchange patterns: how data moves between distributed components, particularly during the shuffle phase where data is reorganized across the network.
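The map → shuffle → reduce dataflow can be sketched with word count, the classic MapReduce example (function names and the tiny document set are illustrative):

```python
from collections import defaultdict

def map_phase(doc):
    # Map: raw input -> intermediate key-value pairs.
    return [(word, 1) for word in doc.split()]

def shuffle(pairs):
    # Shuffle: group values by key; the same key is always routed
    # to the same reducer, regardless of scale.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Reduce: aggregate each key's values into a final result.
    return key, sum(values)

docs = ["the quick fox", "the lazy dog", "the fox"]
intermediate = [p for doc in docs for p in map_phase(doc)]     # parallel per doc
grouped = shuffle(intermediate)
counts = dict(reduce_phase(k, v) for k, v in grouped.items())  # parallel per key
print(counts["the"])  # 3
```

Each stage only depends on the output of the previous one, which is what lets maps and reduces run independently across machines.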
L09 HLF (Hyperledger Fabric) and BFT (Byzantine Fault
Tolerance)

Hyperledger Fabric (HLF)


SAMPLE FINAL

L08, L10, L11


L08 - Spark

Extra time:
BIDL: A High-throughput, Low-latency Permissioned
Blockchain Framework for Datacenter Networks
