Unit 8

The document discusses fault tolerance in distributed systems, outlining its importance and various types, including process resilience and failure detection. It explains failure models, reliable communication methods, and the necessity for redundancy to mask failures. Additionally, it covers the mechanisms for detecting failures and recovery strategies to ensure system reliability despite faults.

Uploaded by

hike.praji

Fault Tolerance

Introduction to fault tolerance

This unit covers four fault tolerance topics:
Process resilience
Reliable communication and reliable multicasting
Failure detection
Distributed commit and recovery

General Background
a. Basic concept
Failures can happen for a variety of reasons: hardware faults, software bugs, operator
errors, and network errors or outages.

A characteristic feature of distributed systems (DS) that distinguishes them from single-machine
systems is the notion of a partial failure: one component fails while the others keep running.

b. Failure Models
c. Failure masking by redundancy
d. Reliable communication
There are two types of reliable communication:
• Reliable request-reply communication
It is designed to support the roles and message exchanges in typical client-server
interactions.
There are five classes of failure in request-reply communication:
• The client is unable to locate the server
• The request message from the client to the server is lost
• The server crashes after receiving a request
• The reply message from the server to the client is lost
• The client crashes after sending a request
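
Two of the failure classes above, a lost request and a lost reply, are usually handled by retransmission. The following is a minimal sketch, assuming a simulated lossy link; the names `Server`, `lossy_call` and `reliable_call` are hypothetical and do not come from the notes. The client resends the same request identifier until a reply arrives, and the server caches replies so that a retransmitted request is not executed twice.

```python
import random


class Server:
    """Hypothetical server that caches replies so duplicates are not re-executed."""

    def __init__(self):
        self.counter = 0
        self.replies = {}  # request_id -> reply already sent for that request

    def handle(self, request_id, amount):
        if request_id in self.replies:       # retransmitted request: resend old reply
            return self.replies[request_id]
        self.counter += amount               # execute the operation exactly once
        self.replies[request_id] = self.counter
        return self.counter


def lossy_call(server, request_id, amount, rng, loss_rate):
    """One attempt over a simulated link; returns None if a message was lost."""
    if rng.random() < loss_rate:
        return None                          # request lost: server never sees it
    reply = server.handle(request_id, amount)
    if rng.random() < loss_rate:
        return None                          # reply lost after the server executed
    return reply


def reliable_call(server, request_id, amount, rng, loss_rate=0.3, max_retries=50):
    """Client side: retransmit with the same request_id until a reply arrives."""
    for _ in range(max_retries):
        reply = lossy_call(server, request_id, amount, rng, loss_rate)
        if reply is not None:
            return reply
    raise TimeoutError("no reply from server")


rng = random.Random(42)
server = Server()
results = [reliable_call(server, request_id=i, amount=10, rng=rng) for i in range(5)]
```

Because duplicates are filtered by request identifier, the counter advances once per request no matter how many retransmissions were needed. The sketch does not cover server or client crashes, which need separate treatment.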

• Reliable group communication

Having considered reliable request-reply communication, we also need to consider
reliable multicasting services. Reliable group communication is discussed under
three headings:
• The basic reliable multicasting scheme
• Scalability in reliable multicasting
• Atomic multicast
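
As an illustration of the first item, here is a minimal sketch of a basic reliable multicasting scheme, assuming a single sender, a fixed group, and in-memory message passing. The classes `Sender` and `Receiver` and the `drop` parameter are hypothetical. The sender numbers each message and keeps it in a history buffer; a receiver that sees a gap in the sequence numbers asks the sender to retransmit the missing messages.

```python
class Sender:
    """Numbers every message and keeps a history buffer for retransmission."""

    def __init__(self):
        self.history = {}   # sequence number -> message
        self.next_seq = 0

    def multicast(self, msg, receivers, drop=()):
        seq = self.next_seq
        self.next_seq += 1
        self.history[seq] = msg
        for r in receivers:
            if (r.name, seq) not in drop:    # simulate loss on selected links
                r.receive(seq, msg, self)

    def retransmit(self, seq, receiver):
        receiver.receive(seq, self.history[seq], self)


class Receiver:
    """Delivers messages in sequence order and requests any it has missed."""

    def __init__(self, name):
        self.name = name
        self.expected = 0    # next sequence number to deliver
        self.pending = {}    # messages received but not yet delivered
        self.delivered = []

    def receive(self, seq, msg, sender):
        if seq < self.expected or seq in self.pending:
            return                           # duplicate: ignore
        self.pending[seq] = msg
        for missing in range(self.expected, seq):
            if missing not in self.pending:  # gap detected: ask for retransmission
                sender.retransmit(missing, self)
        while self.expected in self.pending: # deliver everything that is now in order
            self.delivered.append(self.pending.pop(self.expected))
            self.expected += 1


sender = Sender()
r1, r2 = Receiver("r1"), Receiver("r2")
lost = {("r2", 1)}                           # message 1 never reaches r2 directly
for text in ["a", "b", "c"]:
    sender.multicast(text, [r1, r2], drop=lost)
```

In this sketch a receiver only notices a loss when a later message arrives, so a lost final message goes undetected; real schemes add acknowledgements or periodic status messages for that case.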
Goal of Fault Tolerance
An overall goal in DS is to construct the system in such a way that it can automatically recover
from partial failures.

Figure page no 8
Fault tolerance is the property that enables a system to continue operating properly in the event
of the failure of some of its components.

Faults, errors and failures

Figure
A fault is the cause of an error, and an error is a part of the system state that may lead to a failure.
A system is said to be fault tolerant if it can provide its services even in the presence of
faults.

Fault tolerance requirement

A robust fault-tolerant system requires:
• No single point of failure
• Fault isolation
• Availability of reversion modes

Figure

Failure Models
Figure

Failure masking by redundancy

The key technique for masking faults is redundancy.
With information redundancy, for example, extra bits are added to allow recovery from garbled bits.
Figure
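
To make the idea of extra bits concrete, the sketch below uses a triple repetition code with majority voting. This is one simple form of redundancy chosen for illustration, not necessarily the scheme shown in the figure; the function names `encode` and `decode` are hypothetical.

```python
def encode(bits):
    """Send every data bit three times (two redundant copies per bit)."""
    return [b for b in bits for _ in range(3)]


def decode(coded):
    """Recover each data bit by majority vote over its three copies."""
    decoded = []
    for i in range(0, len(coded), 3):
        triple = coded[i:i + 3]
        decoded.append(1 if sum(triple) >= 2 else 0)
    return decoded


data = [1, 0, 1, 1]
sent = encode(data)

garbled = sent.copy()
garbled[1] ^= 1    # flip one copy of the first data bit
garbled[5] ^= 1    # flip one copy of the second data bit

recovered = decode(garbled)
```

The vote masks one flipped copy per data bit; if two copies of the same bit are flipped, the wrong value wins, which shows that redundancy masks only a bounded number of faults.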
Process resilience
• The key approach to tolerating a faulty process is to organize several identical processes
into a group.

• If one process in a group fails, hopefully some other process can take over.

Caveats
• A process can join a group or leave one during system operation.
• A process can be a member of several groups at the same time.

Flat vs. hierarchical groups

Flat group                        Hierarchical group
(+) Symmetrical                   (+) Decision making is simple
(+) No single point of failure    (-) Asymmetrical

An important distinction between different groups has to do with their internal structure.
How can we achieve a k-fault-tolerant system?
A system is k-fault tolerant if it can survive faults in k components and still meet its specification.
Achieving this requires an agreement protocol applied to a process group.
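
The group size needed for k-fault tolerance depends on the failure model. The sketch below encodes the figures usually given in the standard textbook treatment (they are not stated in these notes, so treat them as an assumption): k + 1 processes when processes fail by crashing, and 2k + 1 when faulty processes may return wrong answers and the correct result is obtained by voting. The function name `min_group_size` is hypothetical.

```python
def min_group_size(k, failure_model):
    """Smallest process group that masks k faulty members.

    Assumption: textbook bounds for masking faults by replication.
    Reaching agreement *among* the processes themselves under
    Byzantine failures needs more (3k + 1).
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    if failure_model == "crash":
        return k + 1        # one survivor is enough to keep providing the service
    if failure_model == "byzantine":
        return 2 * k + 1    # k + 1 correct replies outvote k wrong ones
    raise ValueError("unknown failure model: " + failure_model)


crash_group = min_group_size(2, "crash")
byzantine_group = min_group_size(2, "byzantine")
```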
Agreement in faulty system 1
Process groups need to reach agreement in many situations, for example:
• Electing a coordinator
• Deciding whether or not to commit a transaction
• Dividing tasks among workers
• Synchronization
Agreement in faulty system 2
Goal: have all non-faulty processes reach consensus on some issue, and establish that consensus within
a finite number of steps.
Whether this is possible depends on the assumptions made about the underlying system:
• Synchronous versus asynchronous systems
• Communication delay is bounded or not
• Message delivery is ordered or not
• Message transmission is done through unicasting or multicasting

Agreement in faulty system 3

Page number 25

Agreement in faulty system 4

• Processes behave asynchronously
• Message transmission is unicast
• Communication delays are unbounded

Byzantine agreement problem 1

• Lamport's assumptions: processes are synchronous
• Communication delay is bounded
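
Under synchronous processes and bounded delays, agreement is possible when fewer than a third of the processes are faulty. The following is a minimal sketch of one round of Lamport's oral-messages algorithm, OM(1), for one commander and three lieutenants with at most one traitor. The function names, the `traitor_relays` parameter and the default value "retreat" are hypothetical choices for illustration.

```python
from collections import Counter

DEFAULT = "retreat"   # value chosen when no strict majority exists


def majority(values):
    """Return the strict-majority value, or DEFAULT if there is none."""
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else DEFAULT


def om1(commander_sends, traitor_relays=None):
    """One round of OM(1).

    commander_sends: lieutenant -> value the commander sent to it.
    traitor_relays:  {traitor: {receiver: forged value}} for a traitorous
                     lieutenant; loyal lieutenants relay what they received.
    Returns the decision of every loyal lieutenant.
    """
    traitor_relays = traitor_relays or {}
    lieutenants = list(commander_sends)
    decisions = {}
    for me in lieutenants:
        if me in traitor_relays:
            continue                               # a traitor's decision is irrelevant
        seen = [commander_sends[me]]               # value received from the commander
        for other in lieutenants:
            if other == me:
                continue
            if other in traitor_relays:
                seen.append(traitor_relays[other][me])   # forged relay
            else:
                seen.append(commander_sends[other])      # honest relay
        decisions[me] = majority(seen)
    return decisions


# Case A: loyal commander, lieutenant L3 is a traitor and relays "retreat".
case_a = om1(
    {"L1": "attack", "L2": "attack", "L3": "attack"},
    traitor_relays={"L3": {"L1": "retreat", "L2": "retreat"}},
)

# Case B: traitorous commander sends conflicting orders, all lieutenants loyal.
case_b = om1({"L1": "attack", "L2": "retreat", "L3": "attack"})
```

In case A the loyal lieutenants follow the loyal commander's order despite the forged relay; in case B they still agree with each other even though the commander lied.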

Process Failure Detection

Before we can mask failures, we generally need to detect them.
Timeout mechanism
• Failure detection usually involves a timeout mechanism.
• Set a timer when a response is expected; if nothing arrives within the specified period, a timeout is triggered.
Page number 37
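
A timeout-based failure detector can be sketched as follows, assuming processes send periodic heartbeat messages. The class `HeartbeatDetector` is hypothetical, and timestamps are passed in explicitly so the example does not depend on a real clock.

```python
class HeartbeatDetector:
    """Suspects a process when no heartbeat arrived within `timeout` time units."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}   # process name -> time of its latest heartbeat

    def heartbeat(self, process, now):
        self.last_seen[process] = now

    def suspected(self, now):
        """Processes whose latest heartbeat is older than the timeout."""
        return sorted(
            p for p, seen in self.last_seen.items() if now - seen > self.timeout
        )


detector = HeartbeatDetector(timeout=3.0)
detector.heartbeat("p1", now=0.0)
detector.heartbeat("p2", now=0.0)
detector.heartbeat("p1", now=2.0)     # p1 keeps reporting, p2 goes silent
suspects = detector.suspected(now=4.0)
```

A timeout only produces a suspicion: a slow process or a slow network looks the same as a crash, so the timeout value trades detection speed against false positives.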
Failure Detection
Distributed Commit
• Atomic multicasting is an example of a more general problem known as distributed commit.
• Distributed commit is often established by means of a coordinator and participants.
• There are two kinds of commit protocol:
• One-phase commit protocol
In a simple scheme, the coordinator tells all participants whether or not to locally
perform the operation in question.

• Two-phase commit protocol

Assuming that no failures occur, the two-phase commit protocol (2PC) consists
of the following two phases: a voting phase, in which the coordinator asks every participant
whether it can commit and collects the votes, and a decision phase, in which the coordinator
sends a global commit if all participants voted to commit and a global abort otherwise.
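
The two phases of 2PC are commonly described as a voting phase followed by a decision phase. Below is a minimal failure-free sketch under that description; the class `Participant`, the function `two_phase_commit` and the message names are hypothetical labels, and timeouts and crash recovery are deliberately left out.

```python
class Participant:
    """Hypothetical participant that votes and then follows the global decision."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "INIT"

    def vote(self):
        # Phase 1: answer the coordinator's vote request.
        self.state = "READY" if self.can_commit else "ABORT"
        return "VOTE_COMMIT" if self.can_commit else "VOTE_ABORT"

    def decide(self, decision):
        # Phase 2: apply the coordinator's global decision.
        self.state = "COMMIT" if decision == "GLOBAL_COMMIT" else "ABORT"


def two_phase_commit(participants):
    """Coordinator: collect all votes, then broadcast one global decision."""
    votes = [p.vote() for p in participants]                   # phase 1: voting
    unanimous = all(v == "VOTE_COMMIT" for v in votes)
    decision = "GLOBAL_COMMIT" if unanimous else "GLOBAL_ABORT"
    for p in participants:                                     # phase 2: decision
        p.decide(decision)
    return decision


group_ok = [Participant("a"), Participant("b"), Participant("c")]
outcome_ok = two_phase_commit(group_ok)

group_bad = [Participant("a"), Participant("b", can_commit=False)]
outcome_bad = two_phase_commit(group_bad)
```

A single abort vote is enough to abort the whole operation, which is what makes the commit atomic across the group.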
Recovery
So far we have mainly concentrated on algorithms that allow us to tolerate faults. Recovery
is discussed under three headings:
• Error recovery
• Checkpointing
• Message logging
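
Checkpointing and message logging are often combined: a process saves its state from time to time and logs the messages it receives afterwards, so that after a crash it can restore the last checkpoint and replay the log. The sketch below assumes the checkpoint and the log live on stable storage that survives the crash; the class `RecoverableProcess` and its methods are hypothetical.

```python
class RecoverableProcess:
    """Keeps a running total; recovers by restoring a checkpoint and replaying the log."""

    def __init__(self):
        self.total = 0        # volatile state, lost in a crash
        self.checkpoint = 0   # assumed to be on stable storage
        self.log = []         # messages since the checkpoint, also on stable storage

    def deliver(self, amount):
        self.log.append(amount)   # log the message before applying it
        self.total += amount

    def take_checkpoint(self):
        self.checkpoint = self.total
        self.log.clear()          # older messages are covered by the checkpoint

    def crash(self):
        self.total = None         # simulate loss of volatile state

    def recover(self):
        self.total = self.checkpoint
        for amount in self.log:   # replay messages received after the checkpoint
            self.total += amount


proc = RecoverableProcess()
proc.deliver(10)
proc.deliver(20)
proc.take_checkpoint()
proc.deliver(5)
proc.crash()
proc.recover()
```

Replaying only works if message handling is deterministic, which is the usual assumption behind message logging.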

Questions
What is fault tolerance? Briefly explain process resilience.
