Unit9 Transaction Management

Transaction management involves a set of operations that represent changes in a database, such as transferring money between accounts. The ACID properties (Atomicity, Consistency, Isolation, Durability) ensure that transactions are processed reliably and maintain database integrity. Various transaction states (Active, Committed, Failed, etc.) and concurrency control methods are discussed to manage simultaneous operations without interference.


Unit 6

Transaction Management

1
Introduction
• A transaction is a set of logically related operations. It
generally represents a change in the database.
• For example, if you are transferring money from your bank account to your friend's
account, the set of operations would be as follows:
• Simple Transaction Example
1. Read your account balance
2. Deduct the amount from your balance
3. Write the remaining balance to your account
4. Read your friend’s account balance
5. Add the amount to his account balance
6. Write the new updated balance to his account.
• This whole set of operations can be called a transaction.

2
Cont.
• In DBMS, we write the above 6-step transaction like this: let's say
your account is A and your friend's account is B, and you are transferring
10000 from A to B. The steps of the transaction are:
1. R(A);
2. A = A - 10000;
3. W(A);
4. R(B);
5. B = B + 10000;
6. W(B);
7. commit
• In the above transaction R refers to the Read operation and W
refers to the write operation.
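The seven steps above can be sketched as one real database transaction. Below is a minimal sketch using Python's sqlite3 module; the `accounts` table and the account names are illustrative assumptions, not part of the slides. If any step fails, the rollback undoes all earlier steps.

```python
import sqlite3

# Autocommit mode; we issue BEGIN/COMMIT/ROLLBACK ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('A', 50000), ('B', 1000)")

def transfer(conn, src, dst, amount):
    cur = conn.cursor()
    cur.execute("BEGIN")
    try:
        a = cur.execute("SELECT balance FROM accounts WHERE name = ?", (src,)).fetchone()[0]  # 1. R(A)
        cur.execute("UPDATE accounts SET balance = ? WHERE name = ?", (a - amount, src))      # 2-3. A = A - amount; W(A)
        b = cur.execute("SELECT balance FROM accounts WHERE name = ?", (dst,)).fetchone()[0]  # 4. R(B)
        cur.execute("UPDATE accounts SET balance = ? WHERE name = ?", (b + amount, dst))      # 5-6. B = B + amount; W(B)
        cur.execute("COMMIT")                                                                 # 7. commit
    except Exception:
        cur.execute("ROLLBACK")  # atomicity: undo every step if any step fails
        raise

transfer(conn, "A", "B", 10000)
print(conn.execute("SELECT name, balance FROM accounts ORDER BY name").fetchall())
# → [('A', 40000), ('B', 11000)]
```

Wrapping all six reads and writes between one BEGIN and one COMMIT is what makes them a single transaction rather than six independent operations.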

3
ACID Properties
• A transaction is a single logical unit of work
which accesses and possibly modifies the
contents of a database.
• Transactions access data using read and write
operations.
• ACID properties: in order to maintain
consistency in a database before and after
a transaction, certain properties must be
followed.

4
5
Atomicity
• By this, we mean that either the entire transaction
takes place at once or doesn’t happen at all.
• There is no midway i.e. transactions do not occur
partially.
• Each transaction is considered as one unit and either
runs to completion or is not executed at all.
• It involves the following two operations.
 Abort:
– If a transaction aborts, changes made to database are not visible.
 Commit:
– If a transaction commits, changes made are visible.
– Atomicity is also known as the ‘All or nothing rule’.

6
• Consider the following transaction T consisting of
T1 and T2: transfer of 100 from
account X to account Y.
• If the transaction fails after completion of T1
but before completion of T2 (say, after write(X) but
before write(Y)), then the
amount has been deducted from X but not
added to Y.
• This results in an inconsistent database state.
• Therefore, the transaction must be executed in its
entirety in order to ensure correctness of the
database state.
7
Consistency
• This means that integrity constraints must be
maintained so that the database is consistent before
and after the transaction. It refers to the correctness of a
database.

8
Isolation
• This property ensures that multiple transactions can
occur concurrently without leading to the
inconsistency of database state.
• Transactions occur independently without
interference.
• Changes occurring in a particular transaction will not
be visible to any other transaction until that
particular change in that transaction is written to
memory or has been committed.
• This property ensures that the concurrent
execution of transactions results in a state that is
equivalent to a state achieved if these transactions
were executed serially in some order.

9
• Let X = 50000, Y = 500.
• Suppose T has been executed till Read(Y) and then T'' starts.
• As a result, interleaving of operations takes place, due to
which T'' reads the correct value of X but an incorrect value
of Y, and the sum computed by
– T'': (X + Y = 50,000 + 500 = 50,500) is
thus not consistent with the sum
at the end of transaction T: (X + Y =
50,000 + 450 = 50,450).
• This results in database inconsistency, due to a loss of 50
units.
• Hence, transactions must take place in isolation, and changes
should be visible only after they have been written to main
memory.

10
Durability
• This property ensures that once the transaction has
completed execution, the updates and modifications
to the database are stored in and written to disk and
they persist even if a system failure occurs.
• These updates become permanent and are stored in
non-volatile memory; because of this, completed
transactions are never lost.
• The ACID properties, in totality, provide a mechanism
to ensure the correctness and consistency of a database in
such a way that each transaction is a group of
operations that acts as a single unit, produces consistent
results, acts in isolation from other operations, and whose
updates are durably stored.

11
Transaction States
• A transaction in DBMS can be in one of the following states:
1. Active
2. Partially committed
3. Committed
4. Failed
5. Aborted
6. Terminated

Fig: Transaction States


12
Cont.
• Active State:
– As discussed in the DBMS transaction introduction, a
transaction is a sequence of operations.
– If a transaction is in execution, then it is said to be in the
active state.
– It doesn't matter which step is in execution; as long as the
transaction is executing, it remains in the active state.
• Committed State:
– If a transaction completes its execution successfully, then all
the changes made in local memory during the partially
committed state are permanently stored in the database.
– You can also see in the above diagram that a transaction goes
from the partially committed state to the committed state
when everything is successful.

13
• Partially Committed:
– The transaction has executed all its operations successfully, but
the changes have not yet been made permanent in the
database.
– As we can see in the above diagram, a transaction goes into the
"partially committed" state from the active state once all the
read and write operations present in the transaction have executed.
– A transaction contains a number of read and write operations.
Once the whole transaction is successfully executed, the
transaction goes into the partially committed state, where all
the read and write operations have been performed on main
memory (local memory) instead of the actual database.
– The reason we have this state is that a transaction can
fail during execution, so if we made the changes in the
actual database instead of local memory, the database might be
left in an inconsistent state in case of any failure.
– This state helps us roll back the changes made to the database
in case of a failure during execution.

14
• Failed State:
– If a transaction is executing and a failure occurs, either a
hardware failure or a software failure, then the transaction
goes into the failed state from the active state.
• Aborted State:
– As we have seen above, if a transaction fails during execution,
then the transaction goes into the failed state.
– The changes made in local memory (or buffer) are rolled back
to the previous consistent state, and the transaction goes
into the aborted state from the failed state.
• Terminated State:
– The transaction releases all the resources it held.

15
System log
• The system log, sometimes called the trail or journal, is a sequence of log
records recording all the update activities on the database.
• Logs for each transaction are maintained in stable storage.
• Any operation performed on the database is recorded in the log.
• Prior to performing any modification to the DB, an update log record is
created to reflect that modification.
• An update log record has these fields:
– Transaction identifier
– Data item
– Old value
– New value
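The four fields above can be sketched as a small record structure; the field names and the write-ahead discipline shown are illustrative assumptions, not a real DBMS implementation.

```python
from dataclasses import dataclass
from typing import Any

# Sketch of an update log record with the four fields listed above.
@dataclass(frozen=True)
class LogRecord:
    txn_id: str      # transaction identifier
    item: str        # data item
    old_value: Any   # value before the write (used for undo)
    new_value: Any   # value after the write (used for redo)

log = []  # append-only system log (a real DBMS keeps this in stable storage)

def write_item(txn_id, db, item, new_value):
    """Write-ahead: append the log record *before* modifying the database."""
    log.append(LogRecord(txn_id, item, db.get(item), new_value))
    db[item] = new_value

db = {"A": 50000}
write_item("T1", db, "A", 40000)
print(log[-1])
# → LogRecord(txn_id='T1', item='A', old_value=50000, new_value=40000)
```

Keeping the old value lets recovery undo an aborted transaction; keeping the new value lets it redo a committed one after a crash.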

16
Types of failures
• A computer failure (system crash)
• A transaction or system error (logical programming errors)
• Local errors or exception conditions detected by the
transaction (insufficient account balance, data for the transaction
not found)
• Concurrency control enforcement
• Disk failure
• Physical problems and catastrophes

17
Cont.

18
Serial Schedules:
• Schedules in which the transactions are executed non-interleaved,
i.e., in which no transaction starts until the currently running
transaction has ended, are called serial schedules.
• In other words, in a serial schedule a transaction is executed
completely before the execution of another transaction starts.
• A schedule is defined as a time-ordered sequence of the operations
of two or more transactions.
• This type of transaction execution is also known as
non-interleaved execution.
• The example shown below is a serial schedule.

19
Cont.

• where R(A) denotes that a read operation is performed
on data item 'A'. This is a serial schedule since the
transactions execute serially in the order T1 → T2.

20
Non-Serial Schedule:
• This is a type of Scheduling where the operations of multiple
transactions are interleaved.
• This might lead to a rise in the concurrency problem.
• The transactions are executed in a non-serial manner, keeping the
end result correct and same as the serial schedule.
• Unlike a serial schedule, where one transaction must wait for
another to complete all its operations, in a non-serial schedule
a transaction proceeds without waiting for the previous
transaction to complete.
• A serial schedule provides no benefit of concurrent execution,
whereas a non-serial schedule does.
• It can be of two types namely,
A. Serializable
B. Non-Serializable Schedule.

21
(A) Serializable
– Serializability is used to maintain the consistency of the database.
– It is mainly used in non-serial scheduling to verify whether the
schedule will lead to any inconsistency or not.
– A serial schedule, on the other hand, does not need a serializability check,
because it starts a transaction only when the previous transaction is complete.
– A non-serial schedule of n transactions is said to be serializable only
when it is equivalent to some serial schedule of those n
transactions.
– Since concurrency is allowed in this case, multiple transactions can
execute concurrently.
– Serializable schedules behave exactly the same as serial schedules. Thus,
serializable schedules are always:
• Consistent
• Recoverable
• Cascadeless
• Strict

22
• Serializable schedules are of two types:
I. Conflict Serializable
II. View Serializable
I. Conflict Serializable:
– A schedule is called conflict serializable if it can be transformed
into a serial schedule by swapping non-conflicting operations.
– Suppose a non-serial schedule is given.
– Solution: the above non-serial schedule is a conflict serializable
schedule.
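A standard way to test conflict serializability is to build a precedence graph (an edge Ti → Tj for each pair of conflicting operations where Ti's comes first) and check it for cycles. Below is a sketch; the encoding of a schedule as (transaction, operation, item) triples is an assumption for illustration.

```python
def precedence_graph(schedule):
    """schedule: list of (txn, op, item) in execution order, op in {'R', 'W'}.
    Add edge Ti -> Tj when an operation of Ti conflicts with a later operation
    of Tj (same item, different transactions, at least one of them a write)."""
    edges = set()
    for i, (ti, opi, xi) in enumerate(schedule):
        for tj, opj, xj in schedule[i + 1:]:
            if ti != tj and xi == xj and (opi == 'W' or opj == 'W'):
                edges.add((ti, tj))
    return edges

def is_conflict_serializable(schedule):
    """Conflict serializable iff the precedence graph is acyclic (DFS check)."""
    edges = precedence_graph(schedule)
    nodes = {t for t, _, _ in schedule}
    adj = {n: [b for a, b in edges if a == n] for n in nodes}
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in nodes}
    def dfs(n):
        color[n] = GRAY
        for m in adj[n]:
            if color[m] == GRAY or (color[m] == WHITE and dfs(m)):
                return True  # back edge found: cycle
        color[n] = BLACK
        return False
    return not any(color[n] == WHITE and dfs(n) for n in nodes)

# Serializable: every conflicting pair orders T1 before T2 (edges only T1 -> T2).
s1 = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]
# Not serializable: T1 -> T2 on item A and T2 -> T1 on item B form a cycle.
s2 = [("T1", "W", "A"), ("T2", "W", "A"), ("T2", "W", "B"), ("T1", "W", "B")]
print(is_conflict_serializable(s1), is_conflict_serializable(s2))
# → True False
```

An acyclic precedence graph also tells you an equivalent serial order: any topological sort of the graph.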

23
II. View Serializable:
• A schedule is called view serializable if it is view
equivalent to a serial schedule (one with no overlapping
transactions).
• If a given schedule is found to be view equivalent
to some serial schedule, then it is called a view
serializable schedule.
• Every conflict serializable schedule is also view serializable,
but if the schedule contains blind writes (a write without a
preceding read), a view serializable schedule may not be
conflict serializable.
24
Cont.

• Two schedules are view equivalent if they satisfy the
following conditions on each data item:
1. Initial Read: the first read must be performed by the same
transaction in both schedules.
2. Final Write: the last write must be performed by the same
transaction in both schedules.
3. Updated Read: the producer-consumer sequence must be maintained,
i.e., a read that sees a transaction's update in one schedule must
see the same transaction's update in the other.
• Suppose two schedules S1 and S2; we test view
serializability as follows.
25
Two Schedule S1 and S2

26
Cont.
• Solution:
• For variables A and B:

Variable | FR(S1, S2) | LW(S1, S2) | Updated Read
For A    | T1, T1     | T2, T2     | T1→T2 (S1), T1→T2 (S2)
For B    | T1, T1     | T2, T2     | T1→T2 (S1), T1→T2 (S2)

• We can see that all three conditions above are met, so we
can say S1 and S2 are view equivalent, and hence view serializable.

27
Cont.
• Suppose two schedules S1 and S2 are given below. Test
whether the following schedule is view
serializable or not.

28
(B) Non-Serializable
– The non-serializable schedule is divided into two
types,
I. Recoverable
II. Non-recoverable Schedule.

29
I. Recoverable Schedule::-
• Schedules in which transactions commit only
after all transactions whose changes they read
have committed (no dirty reads) are called recoverable
schedules.
• In other words, if some transaction Tj is reading a
value updated or written by some other
transaction Ti, then the commit of Tj must occur
after the commit of Ti.
• Example: Consider the following schedule
involving two transactions T1 and T2.
30
Cont.

• This is a recoverable schedule since T1 commits before
T2, which makes the value read by T2 correct.
31
• Non-Recoverable Schedule:
• Example: Consider the following schedule involving two
transactions T1 and T2.
• T2 read the value of A written by T1 and committed. T1 later
aborted; therefore the value read by T2 is wrong. But since T2
has already committed, this schedule is non-recoverable.

32
Basic Concept of Concurrency Control
and Recovery
Concurrency Control:-
• is the process of managing simultaneous operations on a shared
database without having them interfere with one another.
• It prevents interference when two or more users are
accessing the database simultaneously and at least one is
updating data.
• Although two transactions may be correct in themselves,
interleaving of their operations may produce an incorrect result.
• Finally, the technique used to protect data when multiple
users are accessing the same data concurrently (at the same time)
is called concurrency control.

33
Transaction isolation level
• Dirty read:- a dirty read is the situation in which a transaction reads data that has
not yet been committed.
• Non-repeatable read:- occurs when a transaction reads the same row twice and gets
different values, because another transaction updated the row in between.
T1 T2
R(X)
update (X) and then commit
re-read
(got a different value of X)

• Phantom read:- occurs when the same query is executed twice, but the rows retrieved
by the two executions are different.
T1 T2
R(X)
R(X)
delete(X)
R(X) => undefined

34
Transaction isolation level
• Repeatable read:- guarantees that when a transaction reads the same row
twice, it gets the same value both times. It allows only committed data to be
read and requires that no other transaction update the row between the two reads.
• Read committed:- allows only committed data to be read, but does
not require repeatable reads.
• Read uncommitted:- allows uncommitted data to be read. It is the
lowest isolation level allowed by SQL.

35
Cont.
• The advantages of concurrency control are as
follows
– Waiting time will be decreased.
– Response time will decrease.
– Resource utilization will increase.
– System performance & Efficiency is increased.

36
Problem in Concurrency Control
• Three examples of potential problems caused
by concurrency:
– Lost update problem.
– Dirty Read Problem
– Unrepeatable Read Problem

37
Lost Update Problem (write-write problem)
• A lost update problem occurs due to the update of
the same record by two different transactions at the
same time.
• In simple words, when two transactions are updating
the same record at the same time in a DBMS then a
lost update problem occurs.
• The first transaction updates a record and the second
transaction updates the same record again, which
nullifies the update of the first transaction.
• As the update by the first transaction is lost this
concurrency problem is known as the lost update
problem.

38
Example
• T1 T2
R(A)
A++
W(A)
W(A) => [blind write]
commit
commit
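The lost update can be reproduced with two uncontrolled threads doing read-modify-write on a shared balance; this is an illustrative sketch, and the sleep is an artificial device to widen the race window so both reads happen before either write.

```python
import threading
import time

# Two "transactions" each read the balance, compute, and write back,
# with no concurrency control at all.
balance = 100

def deposit(amount):
    global balance
    local = balance           # R(A): read the current balance
    time.sleep(0.1)           # widen the race window: both threads read 100
    balance = local + amount  # W(A): blind write based on the stale read

t1 = threading.Thread(target=deposit, args=(50,))
t2 = threading.Thread(target=deposit, args=(50,))
t1.start(); t2.start()
t1.join(); t2.join()
print(balance)
# → 150, not the expected 200: the second write overwrote the first
```

Both threads read 100, so both write 150; one of the two +50 updates is lost, exactly as in the W(A)/W(A) schedule above.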

39
Cont.

40
Dirty Read Problem
• The dirty read problem in DBMS occurs when a
transaction reads the data that has been updated
by another transaction that is still uncommitted.
• It arises due to multiple uncommitted
transactions executing simultaneously.
• Example: Consider two transactions A and B
performing read/write operations on a data item DT in
the database DB. The current value of DT is 1000.
The following table shows the read/write
operations in transactions A and B.
41
Cont.

42
Cont.
• Transaction A reads the value of data item DT as 1000
and modifies it to 1500, which gets stored in the
temporary buffer.
• Transaction B reads DT as 1500 and commits, so the
value of DT permanently changes to 1500 in the
database DB.
• Then a server error occurs in transaction A, and it
must be rolled back to its initial value, i.e., 1000;
this is where the dirty read problem occurs.
43
Unrepeatable Read Problem
• The unrepeatable read problem occurs when
two or more different values of the same data
are read during the read operations in the
same transaction.
• Example: Consider two transactions A and B
performing read/write operations on a data
item DT in the database DB. The current value of
DT is 1000. The following table shows the
read/write operations in transactions A and B.

44
Cont.

45
Cont.
• Transaction A and B initially read the value of
DT as 1000.
• Transaction A modifies the value of DT from
1000 to 1500 and then again transaction B
reads the value and finds it to be 1500.
• Transaction B finds two different values of DT
in its two different read operations.

46
To overcome these problems, we use
• Lock-Based Protocols -
– To attain consistency, isolation between the transactions is the most important tool.
– Isolation is achieved by preventing a transaction from performing a read/write
operation on data locked by another. This is known as locking an operation in a
transaction. Through lock-based protocols, desired operations are freely allowed
while locking blocks the undesired conflicting operations.

• Time-Based Protocols -
– According to this protocol, every transaction has a timestamp attached to it. The
timestamp is based on the time at which the transaction entered the system.
– There are read and write timestamps associated with every data item, which record
the times at which the latest read and write operations were performed, respectively.

• Validation-Based Protocols -
– In this protocol, we have certain phases, i.e., the Read Phase, Validation Phase, and
Write Phase, during each of which certain checks are made to ensure that problems
such as dirty reads do not occur.

47
Assignment
• Purpose of concurrency control?

48
Locking Protocols
• A lock is a mechanism
– to control concurrent access to data items.
– Locking is a concurrency control technique used to ensure serializability.
– Locking is the simplest idea for achieving isolation.
– A locking protocol is a set of rules that a transaction follows to attain
serializability.
• A transaction needs to acquire a lock before performing an
operation, i.e., first obtain a lock on a data item, then perform the
desired operation, then unlock it.
• To provide better concurrency along with isolation, we use different
modes of lock:
a) Binary locking
b) Shared lock (S-lock), also known as a read lock
c) Exclusive lock (X-lock), also known as a read/write lock

49
a) Binary locking :- has only two states, i.e., locked (1) or unlocked (0).
Other transactions cannot access a locked item, not even for a read
operation. Because of this restriction, binary locks are generally no
longer used.
b) Shared lock (Lock-S):-
– Denoted by lock-S(A).
– The transaction can perform read operations only; another
transaction can also obtain a shared lock on the same data item at the
same time.
– Example
T1 T2
lock-S(A)
lock-S(A)
c) Exclusive lock (Lock-X):-
 Denoted by lock-X(A).
 The transaction can perform both read and write operations; no other
transaction can obtain either a shared or an exclusive lock on the item.

50
Locking Protocols

• Lock-compatibility matrix
• Any number of transactions can hold shared locks on an item, but if
any transaction holds an exclusive lock on the item, no other
transaction may hold any lock on that item.
• If a lock cannot be granted, the requesting transaction is made to
wait till all incompatible locks held by other transactions have been
released. The lock is then granted.
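The compatibility matrix can be sketched as a per-item lock entry built on a condition variable. This is a minimal single-item illustration (no request queues, fairness, or deadlock handling), not a full lock manager.

```python
import threading

class LockEntry:
    """One lock-table entry: any number of S-locks may coexist,
    but an X-lock is compatible with nothing."""
    def __init__(self):
        self.cond = threading.Condition()
        self.sharers = 0         # number of S-locks currently held
        self.exclusive = False   # whether an X-lock is currently held

    def lock_s(self):
        with self.cond:
            while self.exclusive:               # S conflicts only with X
                self.cond.wait()
            self.sharers += 1

    def lock_x(self):
        with self.cond:
            while self.exclusive or self.sharers:  # X conflicts with S and X
                self.cond.wait()
            self.exclusive = True

    def unlock_s(self):
        with self.cond:
            self.sharers -= 1
            if self.sharers == 0:
                self.cond.notify_all()          # a waiting X-lock may proceed

    def unlock_x(self):
        with self.cond:
            self.exclusive = False
            self.cond.notify_all()

a = LockEntry()
a.lock_s(); a.lock_s()    # two transactions share item A: both granted
print(a.sharers)          # → 2
a.unlock_s(); a.unlock_s()
a.lock_x()                # exclusive lock: granted only once no sharers remain
print(a.exclusive)        # → True
a.unlock_x()
```

The two `while` loops are the compatibility matrix in code: a requester waits exactly while an incompatible lock is held, and is granted the lock once the incompatible holders release.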

51
Cont.
• Ex:
T1:
lock-S(A); //Grant-S(A,T1)
Read(A);
Unlock(A);
Lock-S(B); //Grant-S(B,T1)
Read(B);
Unlock(B);
Display (A+B)

52
Executing with Locks
• Transactions request locks
(or upgrades).
• Lock manager grants or
blocks requests.
• Transactions release locks.
• Lock manager updates its
internal lock-table.
• It keeps track of what
transactions hold what locks
and what transactions are
waiting to acquire any locks.

53
Properties of Locks based approach
• If we unlock early, inconsistency may arise; if we never unlock,
concurrency will be poor.
• Transactions therefore follow a set of rules for locking and unlocking
of data items, e.g., 2PL or graph-based protocols.
• We say a schedule is legal under a protocol if it can be generated using
the rules of the protocol.
• Problems with shared/exclusive locking alone:
– It may produce non-serializable schedules
– It may not be free from recoverability problems
– It may not be free from deadlock
– It may not be free from starvation

54
Lock conversion
• Changing the mode of a lock that is already held is called lock
conversion.
• A transaction that holds a lock on an item A is allowed, under
certain conditions, to change the lock state from one mode to
another. The mechanism for upgrading a shared lock to an exclusive
lock, and downgrading an exclusive lock to a shared lock, is called
lock conversion.
• Upgrading:- conversion from a shared lock to an exclusive lock. A
lock-S(A) can be upgraded to lock-X(A) only if Ti is the only
transaction holding a shared lock on element A; otherwise the
upgrade operation is rejected.
• Downgrading:- conversion from an exclusive lock to a shared lock.
We downgrade lock-X to lock-S when we no longer need to write
data item A. As we are already holding lock-X on A, no conditions
need to be checked.

55
Two-Phase Locking (2PL) Protocol
• Two-phase locking (2PL) defines the rules for how to acquire locks on
data items and how to release them so as to ensure serializability. 2PL
requires that a transaction pass through two phases: a growing phase
and a shrinking phase.
• Phase 1: Growing:-
– The transaction can only obtain locks.
– It cannot release any lock.
– From now on it has no option but to keep acquiring all the locks it will need. It
cannot release any lock in this phase, even if it has finished working with a
locked data item. Ultimately the transaction reaches a point where all the locks
it needs have been acquired. This point is called the LOCK point.

56
Two-Phase Locking (2PL) Protocol
• Phase 2: Shrinking :-
Once the lock point has been reached, the transaction enters the shrinking phase.
– The transaction can only release locks that it previously acquired.
– It cannot obtain/acquire new locks.
– The transaction enters the shrinking phase as soon as it releases its first lock
after crossing the lock point. From now on it has no option but to keep
releasing all the acquired locks.

57
Cont.

• The 2PL locking protocol divides the execution phase of the transaction
into three parts.
• In the first part, when the execution of the transaction starts, it seeks
permission for the lock it requires.
• In the second part, the transaction acquires all the locks.
• The third phase is started as soon as the transaction releases its first lock.
• In the third phase, the transaction cannot demand any new locks. It only
releases the acquired locks.
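The two phases can be enforced on the transaction side with a simple guard; this sketch omits all lock-manager details (blocking, conflicts) and only checks the 2PL rule that no lock may be acquired after the first release.

```python
class TwoPhaseTxn:
    """Guard that rejects any lock acquisition after the first unlock,
    i.e., once the shrinking phase has begun."""
    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False   # becomes True at the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(
                f"{self.name}: 2PL violation - cannot lock {item} in shrinking phase")
        self.held.add(item)      # growing phase: acquiring is allowed

    def unlock(self, item):
        self.held.remove(item)
        self.shrinking = True    # the lock point has been passed

t = TwoPhaseTxn("T1")
t.lock("A"); t.lock("B")   # growing phase
t.unlock("A")              # first release: shrinking phase begins
try:
    t.lock("C")            # violates 2PL
except RuntimeError as e:
    print(e)
# → T1: 2PL violation - cannot lock C in shrinking phase
```

Strict and rigorous 2PL would tighten `unlock` further, deferring some or all releases until commit/abort.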

58
Cont.
• Two-phase locking does not ensure freedom from deadlocks.
• Cascading rollback is possible under two-phase locking.
• To avoid this, 2PL can be modified into the following variants:
a) Conservative 2PL:- prevents deadlock by locking all desired data items
before the transaction begins execution, and can avoid cascading rollback.
b) Strict two-phase locking:- here a transaction must hold all its exclusive
locks until it commits/aborts, i.e., unlocking of exclusive locks is performed
only after the transaction terminates. It avoids cascading rollback, but
deadlock may still occur.
c) Rigorous two-phase locking:- a more restrictive form of strict 2PL. Here a
transaction does not release any of its locks until after it commits or aborts
(all locks are held till commit/abort). In this protocol transactions can be
serialized in the order in which they commit. It avoids cascading rollback,
but deadlock may still occur.

59
Deadlock
• A deadlock is a situation where a set of processes is blocked
because each process is holding a resource while waiting for
another resource acquired by some other process.
• A system is in a deadlock state if there exists a set of
transactions such that every transaction in the set is waiting
for another transaction in the set.
• E.g., if there exists a set of waiting transactions T0, T1, ..., Tn
such that T0→T1, T1→T2, ..., Tn→T0, then no transaction can
progress in such a situation.
• There are two principal approaches for dealing with the deadlock problem:
a) Prevention
b) Detection and recovery

60
Deadlock
a) Prevention:-
• First, ensure no cyclic wait can occur by ordering the requests for locks
(or requiring all locks to be acquired together), which requires that each
transaction lock all its data items before it begins execution: either all
are locked in one step or none of them are locked.
• The second approach is closer to deadlock recovery: it performs
transaction rollback instead of waiting for a lock, whenever the wait
could potentially result in a deadlock. Another way of preventing
deadlocks is to impose an ordering on all data items, and to require that
transactions lock data items only in a sequence consistent with that ordering.
• The following schemes use transaction timestamps to order transactions
for the sake of deadlock prevention:-
I. Wait-die
II. Wound-wait
III. Lock-timeout

61
Deadlock
I. Wait-die :- only an older transaction may wait for a younger one; otherwise
the requesting transaction is aborted (dies) and restarted with the same
timestamp, so that eventually it will become the oldest active transaction
and will not die.
Suppose Ti requests data item Q and Tj holds Q:
e.g. If TS(Ti) < TS(Tj) (Ti is older), then Ti waits.
If TS(Ti) > TS(Tj) (Ti is younger), then Ti rolls back.
Wait-die allows only the older transaction to wait for the younger one;
otherwise the requesting transaction aborts (rolls back).
II. Wound-wait:- only a younger transaction can wait for an older one. If
an older transaction requests a lock held by a younger one, the younger
one is aborted.
e.g. If TS(Ti) < TS(Tj) (Ti is older), then Tj rolls back.
If TS(Ti) > TS(Tj) (Ti is younger), then Ti waits.
III. Lock-timeout:- a transaction waits for a lock only up to a time limit
(e.g., 5 minutes); if the lock is not granted within the limit, the transaction
rolls back.
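The two timestamp-based decisions can be sketched as pure functions of the two timestamps, with the convention that a smaller timestamp means an older transaction; the function names and return strings are illustrative.

```python
def wait_die(ts_requester, ts_holder):
    """Wait-die: the requester waits only if it is older than the holder;
    a younger requester dies (rolls back) and restarts with its old timestamp."""
    return "wait" if ts_requester < ts_holder else "requester rolls back"

def wound_wait(ts_requester, ts_holder):
    """Wound-wait: an older requester wounds (rolls back) the younger holder;
    a younger requester waits for the older holder."""
    return "holder rolls back" if ts_requester < ts_holder else "wait"

# Ti has timestamp 100 (older); Tj has timestamp 200 (younger) and holds Q.
print(wait_die(100, 200))    # → wait
print(wait_die(200, 100))    # → requester rolls back
print(wound_wait(100, 200))  # → holder rolls back
print(wound_wait(200, 100))  # → wait
```

In both schemes it is always the younger transaction that is sacrificed, which is why neither can produce a cyclic wait.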
62
Deadlock
b) Detection and recovery:-
When the deadlock-detection algorithm determines that a deadlock exists,
the system must recover from the deadlock. The most common solution is
to roll back one or more transactions to break the deadlock. These three
actions need to be taken:-
I. Choice of deadlock victim
II. Total/partial rollback
III. Avoid starvation:- starvation happens if the same transaction is
always chosen as the victim
Assignment :- difference between starvation and deadlock??

63
Timestamp ordering
• The timestamp ordering protocol is a concurrency control
mechanism used in database systems to manage concurrent
transactions and ensure that the database remains in a
consistent state. It is a key technique in maintaining
serializability, which is a property that guarantees
transactions' outcomes are the same as if they had been
executed sequentially, one after the other.
• A timestamp tells the order in which transactions entered the system.
• Concurrency control techniques based on timestamp ordering
do not use locks; hence deadlocks cannot occur.

64
Timestamp ordering
• Read-TS(X):- the timestamp of the last (youngest) transaction that
performed a read on X successfully.
– E.g., T1 (10:00 am) reads A, then T2 (11:12 am) reads A:
Read-TS(A) = 11:12
• Write-TS(X):- the timestamp of the last transaction that performed a
write on X successfully.
– Write-TS(A) = the last transaction that wrote A
• E.g., with T1 [100] and T2 [200]:
– T1 R(A) (older) followed by T2 W(A) (younger) is allowed; T1 W(A)
(older) followed by T2 R(A) (younger) is allowed; and T1 W(A) (older)
followed by T2 W(A) (younger) is also allowed.

65
Timestamp ordering
• Basic Timestamp Ordering (TSO) is a fundamental concurrency
control protocol used in database systems to ensure
serializability of transactions.
• It uses timestamps to manage the order of transaction
operations and ensure consistency.

66
Timestamp ordering
Protocol Details
• Transaction Timestamps:
– When a transaction T begins, it is assigned a unique
timestamp TS(T).
• Read Operation:
– For a read operation by transaction Ti​ on a data item X:
• If TS(Ti)>WTS(X), where WTS(X) is the timestamp of the last write
operation on X, then the read is allowed.
• Otherwise, if TS(Ti) ≤ WTS(X), the read is aborted because it would
see an inconsistent state.

67
Timestamp ordering
Protocol Details
• Write Operation:
– For a write operation by transaction Ti​ on a data item X:
• If TS(Ti)>RTS(X), where RTS(X) is the timestamp of the last
read operation on X, and TS(Ti)>WTS(X), then the write is
allowed.
• Otherwise, if TS(Ti) ≤ RTS(X) or TS(Ti)≤WTS(X) , the write is
aborted to prevent inconsistency.

• Commit and Abort:


– If a transaction completes without any aborts, it is allowed to commit.
If a transaction is aborted due to timestamp conflicts, it may be
restarted with a new timestamp.
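The read and write checks above can be sketched as follows. This uses the textbook convention that rejection requires a strictly younger timestamp on the item (equality, e.g., a transaction re-accessing its own earlier write, is allowed); the table names `rts`/`wts` are illustrative.

```python
# Per-item read/write timestamps and a toy data store (0 = never accessed).
rts, wts, db = {}, {}, {}

def read(ts, x):
    if ts < wts.get(x, 0):
        return "abort"               # a younger transaction already overwrote x
    rts[x] = max(rts.get(x, 0), ts)  # record the youngest successful reader
    return db.get(x)

def write(ts, x, value):
    if ts < rts.get(x, 0) or ts < wts.get(x, 0):
        return "abort"               # a younger transaction already read or wrote x
    wts[x] = ts
    db[x] = value
    return "ok"

print(write(100, "X", 5))   # T1 (TS=100) writes X              → ok
print(read(200, "X"))       # T2 (TS=200) reads X: 200 > WTS=100 → 5
print(write(200, "X", 7))   # T2 writes X: 200 ≥ RTS and WTS     → ok
print(write(100, "X", 9))   # T1 writes X too late: 100 < RTS=200 → abort
```

The aborted transaction would then be restarted with a new (larger) timestamp, as described above.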

68
Timestamp ordering
Example
• Consider two transactions, T1 and T2, with timestamps 100
and 200 respectively:
• T1 starts and writes to data item X. Timestamp of the write
operation is recorded as WTS(X)=100.
• T2 starts and wants to read X:
– Since TS(T2)=200 and TS(T2)>WTS(X), the read is allowed.
T2 can read the value written by T1.
• If T2 tries to write to X:
– TS(T2)=200 and RTS(X)=100 (timestamp of last read by T1).
– Since TS(T2)>RTS(X) and TS(T2)>WTS(X), the write is
allowed.

69
Timestamp ordering
• Thomas's Write Rule is an extension of the Basic Timestamp
Ordering protocol, designed to handle certain situations more
efficiently and avoid some of the performance issues associated
with Basic Timestamp Ordering.
• It specifically addresses how to handle write operations when
there are conflicts with other transactions.
a) If R_TS(X) > TS(T), then abort and roll back T and reject the
operation.
b) If W_TS(X) > TS(T), then don’t execute the Write Operation and
continue processing. This is a case of Outdated or Obsolete Writes.
Remember, outdated writes are ignored in Thomas Write Rule but
a Transaction following the Basic TO protocol will abort such a
Transaction.
c) If neither the condition in 1 or 2 occurs, then and only then
execute the W_item(X) operation of T and set W_TS(X) to TS(T).
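Thomas's write rule differs from the basic write check only in case (b), where the obsolete write is skipped rather than causing an abort. A sketch, reusing the same per-item timestamp tables as in the basic TO example (names illustrative):

```python
rts, wts, db = {}, {}, {}

def write_thomas(ts, x, value):
    if ts < rts.get(x, 0):
        return "abort"    # rule (a): a younger transaction already read X
    if ts < wts.get(x, 0):
        return "ignore"   # rule (b): obsolete write, skipped instead of aborted
    wts[x] = ts           # rule (c): perform the write and advance W_TS(X)
    db[x] = value
    return "ok"

print(write_thomas(200, "X", 7))  # → ok      (W_TS(X) becomes 200)
print(write_thomas(100, "X", 5))  # → ignore  (basic TO would abort the TS=100 txn)
print(db["X"])                    # → 7       (the obsolete value 5 was never applied)
```

Ignoring the obsolete write is safe because in any equivalent serial order the older write would have been overwritten by the younger one anyway.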

70
Timestamp ordering
Advantages and Disadvantages
Advantages:
• Ensures serializability, which is a strong consistency
guarantee.
• Simple to implement in terms of logic and does not require
locking mechanisms.
Disadvantages:
• Can lead to a high rate of transaction aborts if conflicts are
frequent.
• May not be as efficient in environments with a high volume of
concurrent transactions due to the overhead of checking
timestamps and handling aborts.

71
Introduction to data recovery
• Data recovery is the process of retrieving lost, corrupted,
damaged, or inaccessible data, or restoring a database to a
consistent state when a failure occurs.

72
Introduction to data recovery
Need of recovery
• In a DBMS, recovery techniques play a vital role in maintaining
the consistency and reliability of data.
• There are basically the following types of failures that may
occur and lead to the failure of a transaction:
– Transaction Failure, System Failure/Crash, System Error, Local Error,
Disk Failure, Catastrophe, Network Failure
• Techniques such as backup and restoration, log-based
recovery, checkpoints, and shadow paging ensure that even in
the case of unexpected system failures or errors, your data
remains secure and consistent.
73
Introduction to data recovery
Failure Classification
• There are basically the following types of failures that may
occur and lead to the failure of the transaction such as:
– Transaction Failure
• System Error
• Local Error
– System Failure/Crash
– Disk Failure
– Catastrophe
– Network Failure
74
Introduction to data recovery
Failure Classification
• Transaction Failure:- A failure that occurs when a transaction is not able
to complete.
– System Error:- The transaction fails because of an erroneous parameter
value or a logical programming error, e.g. integer overflow or division
by zero.
– Local Error:- Data needed by the transaction may not be found. For
example, trying to debit money from an account with an insufficient
balance leads to cancellation of the transaction.
• System Failure/Crash:- A hardware, software, or network error, e.g. a H/W failure.
• Disk Failure:- Occurs when a disk loses its data because of a read or
write malfunction or a disk read/write head crash. This may happen
during a read/write operation of the transaction.
• Catastrophe:- Fire, theft, etc.
• Network Failure:- Occurs when there is a disruption in the
communication channel between the client and the database server.
75
Recovery Concepts
1. Log based recovery
2. Caching (buffering) of Disk Blocks
3. Write- Ahead Logging
4. Check pointing
76
Recovery Concepts
1. Log based recovery:-
– In this recovery technique, a log is maintained, in which all the
modifications of the database are kept.
– A log consists of log records.
– For each activity of the DB, a separate log record is made. Log
records are maintained serially, in the order in which the
different activities happened.
• Undo and Redo Operations
a) Undo: using a log record, sets the data item specified in the log
record to its old value.
b) Redo: using a log record, sets the data item specified in the log
record to its new value.
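The Undo and Redo operations above can be sketched over a simple log. The record layout `(txn, item, old_value, new_value)` is an assumption for illustration; real log formats carry more fields (LSNs, page ids, etc.).

```python
# A toy "database" and a log with one record: T1 changed X from 5 to 8.
db = {"X": 5}
log = [("T1", "X", 5, 8)]

def undo(db, record):
    """Undo: set the data item in the log record back to its old value."""
    txn, item, old, new = record
    db[item] = old

def redo(db, record):
    """Redo: set the data item in the log record to its new value."""
    txn, item, old, new = record
    db[item] = new

redo(db, log[0])
assert db["X"] == 8   # the update is re-applied
undo(db, log[0])
assert db["X"] == 5   # the update is rolled back
```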
77
Recovery Concepts
• For any transaction Ti, various log
records are:-
• [Ti start]: It contains information
about when a transaction Ti
starts.
• [Ti commit]: It contains
information about when a
transaction Ti commits.
• [Ti abort]: It contains information
about when a transaction Ti
aborts.
78
Recovery Concepts
Advantages of Log based Recovery
• Durability: In the event of a breakdown, the log file offers a dependable
and long-lasting method of recovering data. It guarantees that in the event
of a system crash, no committed transaction is lost.
• Faster Recovery: Since log-based recovery recovers databases by replaying
committed transactions from the log file, it is typically faster than
alternative recovery methods.
• Incremental Backup: Backups can be made in increments using log-based
recovery. Just the changes made since the last backup are kept in the log
file, rather than creating a complete backup of the database each time.
• Lowers the Risk of Data Corruption: By making sure that all transactions
are correctly committed or canceled before they are written to
the database, log-based recovery lowers the risk of data corruption.
79
Recovery Concepts
Disadvantages of Log based Recovery
• Additional overhead: Maintaining the log file incurs an additional
overhead on the database system, which can reduce the performance of
the system.
• Complexity: Log-based recovery is a complex process that requires careful
management and administration. If not managed properly, it can lead to
data inconsistencies or loss.
• Storage space: The log file can consume a significant amount of storage
space, especially in a database with a large number of transactions.
• Time-Consuming: The process of replaying the transactions from the log
file can be time-consuming, especially if there are a large number of
transactions to recover.
80
Recovery Concepts
2. Caching (Buffering ) of Disk Blocks:-
• Caching, or buffering, of disk blocks is a technique used in
computer systems to improve performance and efficiency by
temporarily storing data in a faster-access memory (cache) rather
than repeatedly accessing slower storage devices like hard drives
or SSDs.
• Two main strategies can be employed when flushing a modified buffer back to disk.
• First, in-place updating:- write the buffer back to the same original disk
location, thus overwriting the old value of any changed data items on disk.
• Second, shadowing:- write the updated buffer to a different disk location,
so multiple versions of data items can be maintained; this approach is not
typically used in practice.
81
Recovery Concepts
3. Write-Ahead Logging:-
• The core idea behind Write-Ahead Logging is that any
changes made to the database must first be recorded in a log
before they are applied to the actual database. This log,
known as the WAL, serves as a record of all changes made to
the database, allowing the system to recover from failures by
replaying or undoing these changes as needed.
– This technique involves writing changes to a log before updating the
actual database.
– If a transaction fails before completing, the changes can be rolled back
using the log.
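The write-ahead rule can be sketched as follows: the log record is appended and forced to stable storage before the database item is updated. File handling is simplified and the JSON record layout is an assumption for illustration.

```python
import json, os, tempfile

def wal_update(log_path, db, item, new_value):
    """Apply an update WAL-style: log first, then modify the database."""
    record = {"item": item, "old": db.get(item), "new": new_value}
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")
        f.flush()
        os.fsync(f.fileno())   # force the log record to disk FIRST
    db[item] = new_value       # only then apply the change to the "database"

db = {"balance": 100}
log_path = os.path.join(tempfile.mkdtemp(), "wal.log")
wal_update(log_path, db, "balance", 70)
assert db["balance"] == 70
```

Because the old value reaches the log before the database changes, a crash between the two steps can always be repaired by replaying or undoing from the log.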
82
Recovery Concepts
4. Check pointing :-
 Why do we need checkpoint ?
• Transaction logs created in a real-time environment consume a
lot of storage space, and keeping track of every update and its
maintenance increases the physical space requirements of the
system. Eventually, the transaction log file becomes unmanageable
as its size keeps growing. This can be addressed with
checkpoints. The methodology of removing all previous
transaction logs and storing them in permanent storage is
called a Checkpoint.
• Check point is the point of synchronization between the
database and the transaction log file.
83
Recovery Concepts
• Steps to Use Checkpoints in the Database
a) Write the begin_checkpoint record into a log.
b) Collect checkpoint data in stable storage.
c) Write the end_checkpoint record into a log.
• The behavior when the system crashes and recovers while
concurrent transactions are executing is shown below:
84
• The recovery system reads the logs
backward from the end to the last
checkpoint i.e. from T4 to T1.
• It will keep track of two lists – Undo and
Redo.
• Whenever the log contains both <Tn, start> and <Tn, commit>
instructions, or only <Tn, commit>, the transaction is put in the
Redo list. T2 and T3 contain <Tn, Start> and <Tn, Commit>, whereas
T1 has only <Tn, Commit>. Here, T1, T2, and T3 are in the redo
list.
• Whenever a log record with no commit or abort instruction is
found, that transaction is put in the Undo list. Here, T4 has
<Tn, Start> but no <Tn, Commit>, as it is an ongoing transaction,
so T4 is put on the undo list.
• All the transactions in the redo list are redone from their log
records and their logs saved; all the transactions in the undo
list are undone and their logs deleted.
85
Recovery Techniques
Types of Recovery Techniques in DBMS
• Database recovery techniques are used in database management systems
(DBMS) to restore a database to a consistent state after a failure or error
has occurred.
• The main goal of recovery techniques is to ensure data integrity and
consistency and prevent data loss.
• There are two types of recovery techniques:-
a) Rollback :- If a transaction fails, the recovery management
system recovers it with the help of rollback. The rollback takes
place with the help of transaction logs: a log is a file that keeps a
record of all the activity done by the transaction. The rollback
undoes all the modifications that took place during the transaction.
86
Recovery Techniques
Types of Recovery Techniques in DBMS
b) Cascading Rollback :- The term combines two words, cascade and
rollback. Cascade means "waterfall" and rollback means "an act of
making an action change back to what it was before".
• The failure of a single transaction causes a cascade of transaction
rollbacks. This is known as cascading rollback; the transaction shown
below is an example.
 What role do transaction logs play in Rollback?
– Transaction logs record all activities in a transaction, allowing the rollback process to
undo modifications and recover from transaction failures.
• What is the difference between Rollback and Commit in DBMS
transactions?
– Rollback in DBMS cancels or undoes a transaction’s changes, while Commit finalizes a
transaction, making its changes permanent.
87
Recovery based on deferred update
Deferred Update:
• It is a technique for the maintenance of the transaction log files of the
DBMS.
• It is also called NO-UNDO/REDO technique.
• It is used for the recovery of transaction failures that occur due to
power, memory, or OS failures. Whenever any transaction is executed,
the updates are not made immediately to the database.
• They are first recorded on the log file and then those changes are applied
once the commit is done. This is called the “Re-doing” process. Once the
rollback is done none of the changes are applied to the database and the
changes in the log file are also discarded. If the commit is done before
crashing the system, then after restarting the system the changes that
have been recorded in the log file are thus applied to the database.
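A minimal sketch of the NO-UNDO/REDO idea: writes go only to the log until commit, and commit applies (redoes) them. The `DeferredTxn` class name and in-memory log are illustrative assumptions.

```python
class DeferredTxn:
    """Deferred-update transaction: the database is untouched until commit."""

    def __init__(self, db):
        self.db = db
        self.log = []                # pending (item, new_value) pairs

    def write(self, item, value):
        self.log.append((item, value))   # recorded in the log only

    def commit(self):
        for item, value in self.log:     # the "re-doing" process
            self.db[item] = value

    def rollback(self):
        self.log.clear()                 # nothing in the database to undo

db = {"X": 1}
t = DeferredTxn(db)
t.write("X", 9)
assert db["X"] == 1   # no change visible before commit
t.commit()
assert db["X"] == 9   # logged changes applied at commit
```

Note that rollback never has to touch the database, which is exactly why no undo is required.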
88
Recovery based on Immediate update
Immediate Update
• It is a technique for the maintenance of the transaction log files of the
DBMS. It is also called UNDO/REDO technique. It is used for the recovery
of transaction failures that occur due to power, memory, or OS failures.
• Whenever any transaction is executed, the updates are made directly to
the database and the log file is also maintained which contains both old
and new values. Once the commit is done, all the changes get stored
permanently in the database, and records in the log file are thus
discarded. Once rollback is done the old values get restored in the
database and all the changes made to the database are also discarded.
This is called the “Un-doing” process. If the commit is done before
crashing the system, then after restarting the system the changes are
stored permanently in the database.
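A matching UNDO/REDO sketch: writes hit the database immediately, while the log keeps both old and new values so rollback can restore them. The `ImmediateTxn` class name is an illustrative assumption.

```python
class ImmediateTxn:
    """Immediate-update transaction: writes apply at once, log enables undo."""

    def __init__(self, db):
        self.db = db
        self.log = []                    # (item, old_value, new_value) triples

    def write(self, item, value):
        self.log.append((item, self.db.get(item), value))
        self.db[item] = value            # applied to the database right away

    def rollback(self):
        for item, old, _new in reversed(self.log):   # the "un-doing" process
            self.db[item] = old          # restore old values, newest first
        self.log.clear()

db = {"X": 1}
t = ImmediateTxn(db)
t.write("X", 9)
assert db["X"] == 9   # change visible immediately
t.rollback()
assert db["X"] == 1   # old value restored from the log
```

The contrast with the deferred-update sketch is the point: here rollback must undo real database changes, so the log must store old values as well as new ones.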
89
Assignment
• Difference between Deferred update and
Immediate update??
90
Shadow Paging
• Shadow paging is a technique primarily used to implement durability in
DBMS .
• Shadow Paging is an alternative recovery technique, used to overcome
the disadvantages of the log-based recovery technique.
• The main idea behind shadow paging is to keep two page tables in the DB:
one is used for current operations and the other is used in case of recovery.
• Page Table:- total no. of entries in page table is equal to the number of
pages in DB. Each entry in page table contains a pointer to the physical
location of pages.
I. Shadow page table:- table cannot be changed during any transaction.
II. Current page table:- this table may be changed during transaction.
91
Shadow Paging
• Shadow Paging is a recovery technique used to recover the DB. In this
technique, the database is considered to be made up of fixed-size logical
units of storage referred to as pages. Pages are mapped to physical blocks
of storage with the help of a page table, which has one entry for each
logical page of the database.
• The method uses two page tables, named the current page table and the
shadow page table. The entries in the current page table point to the most
recent database pages on disk. The shadow page table is created when the
transaction starts, by copying the current page table. The shadow page
table is then saved on disk, and the current page table is used for the
transaction. Entries in the current page table may change during execution,
but the shadow page table never changes. After the transaction, both tables
become identical again. This technique is also known as out-of-place
updating.
92
Shadow Paging
93
To understand the concept, consider the figure above, in which two write
operations are performed, on pages 3 and 5. Before the write operation on
page 3 starts, the current page table points to the old page 3. When the
write operation starts, the following steps are performed:
1. First, a search is made for an available free block among the disk blocks.
2. After a free block is found, page 3 is copied to it; the copy is
represented by Page 3 (New).
3. The current page table now points to Page 3 (New) on disk, but the shadow
page table still points to the old page 3, because it is never modified.
4. The changes are then propagated to Page 3 (New), which is pointed to by
the current page table.
94
COMMIT Operation : To commit a transaction, the following steps
are performed:
I. All the modifications made by the transaction that are still
present in buffers are transferred to the physical database.
II. The current page table is output to disk.
III. The disk address of the current page table is output to the fixed
location in stable storage that holds the address of the shadow page
table, overwriting the address of the old shadow page table. With this,
the current page table becomes the shadow page table and the
transaction is committed.
95
Database backup and recovery from
catastrophic Failures
• A catastrophic failure is one where a stable, secondary storage device gets
corrupt. With the storage device, all the valuable data that is stored inside
is lost.
• The recovery manager of a DBMS must also be equipped to handle more
catastrophic failures such as a disk crash. The main technique used to
handle such crashes is a database backup, in which the whole database and
the log are periodically copied onto a cheap storage medium such as
magnetic tape or another large-capacity offline storage device. In case of
a catastrophic system failure, the latest backup copy can be re-loaded from
the tape to the disk, and the system can be restored.
96
Assignment
• Question page no. 243, 244, 245, 246
97