National Institute of Technology Karnataka, Surathkal
Department of Information Technology
IT 252 DATABASE SYSTEMS
Transaction Processing
Dr. Jayashree T R
Course Outline
Course Plan: Theory:
Part A: Parallel Computer Architectures
Week 1: Introduction to the course, highlighting data and databases, and related basic
concepts. Advantages of and need for a database system.
Week 2: Attributes, tuples, relational schema, conceptual model, introduction to SQL (DML, DDL),
creating relations/tables, accessing and manipulating data.
Week 3,4: Data model, relational model, and SQL, relational data model and relational database
constraints, SQL data definition and data types, specifying constraints, basic retrieval queries, complex
queries, triggers, views, schema modification.
Week 5-6: ER model development, entity types/sets, attribute, relationship types/sets, simple employee
database conceptual design using ER concepts, ER to relational mapping algorithm, EER model.
Week 7,8: Functional dependency definition, need for normalization, anomalies and
redundant information in tuples, normalization steps (1NF, 2NF, 3NF),
BCNF, introduction to the higher normal forms.
Week 9: Procedural query languages: relational algebra and SQL (SELECT, PROJECT, RENAME, binary
and JOIN operations) with examples.
Non-procedural query languages: relational calculus and SQL (tuple relational calculus, existential and
universal quantifiers) with examples.
Week 10,11: Transaction management, schedules and serializability, concurrency control, two-phase locking,
recovery mechanism: undo/redo values.
Week 12,13: Disk storage: basic file structures, ordered/unordered files, binary search, hashing, indexing,
importance of indexes, primary and secondary indexing methods, clustered indexes, B-trees, B+ trees.
Week 14: Current trends in database systems, introduction to data warehousing, data mining.
Practical: MySQL commands
Project: Team of 2 or 3 members
Introduction to Transaction Processing Concepts and Theory
Introduction to Transaction Processing
• Single-user DBMS (system): At most one user at a time can use the system
• Example: home computer
• Multiuser DBMS (system): Many users can access the system (database)
concurrently
• Example: airline reservations system
• Concurrency:
• Multiprogramming:
• Allows operating system to execute multiple processes concurrently
• Executes commands from one process, then suspends that process and
executes commands from another process, etc.
Note: Multiprogramming is a concept related to concurrent program
execution, but they are not exactly the same.
In Multiprogramming the CPU switches between programs (processes) quickly, giving
the illusion that they are running simultaneously.
Concurrency is the ability of a system to make progress on multiple tasks over the same
period of time, so that they appear to run simultaneously. A single-core CPU can run
concurrent tasks by switching between them quickly (e.g.: multitasking on your mobile
phone).
Parallelism is about executing multiple tasks simultaneously. It requires
multiple processing units (like multi-core processors) where tasks are truly
running at the same time. E.g.: weather forecasting, data processing, video rendering,
ML-based applications.
Introduction to Transaction Processing
Concurrency:
• Interleaved processing: Concurrent execution of processes is interleaved
in a single CPU.
• Parallel processing: Processes are concurrently executed in multiple CPUs.
Introduction to Transaction Processing
• Transaction: is an executing program that forms a logical unit of database
processing.
• A transaction includes one or more database access operations—these can
include insertion, deletion, modification (update), or retrieval operations.
• The database operations that form a transaction can either be embedded within
an application program or they can be specified interactively via a high-level
query language such as SQL.
• Specifying transaction boundaries: use explicit begin transaction
and end transaction statements in an application program.
• An application program may contain several transactions separated by the Begin
and End transaction boundaries.
e.g.:
BEGIN TRANSACTION; -- Start the transaction
-- Perform your operations here
INSERT INTO accounts (account_id, balance) VALUES (1, 1000);
UPDATE accounts SET balance = balance - 200 WHERE account_id = 1;
-- If everything is successful, commit the transaction
COMMIT;
-- If an error occurs, roll back the transaction instead of committing:
-- ROLLBACK;
Note: Explore how to handle transactions in Java (JDBC), Python, and React (Node.js +
Express).
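As a starting point for the Python case, here is a minimal sketch using the standard-library sqlite3 module with an in-memory database. The withdraw function and the "insufficient funds" rule are illustrative assumptions, not part of the slide; only the accounts table and its columns come from the SQL example above.

```python
import sqlite3

def withdraw(conn, account_id, amount):
    """Withdraw `amount` inside one transaction; roll back on any error.

    Hypothetical helper for illustration only.
    """
    try:
        conn.execute("BEGIN")  # explicit transaction boundary
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE account_id = ?",
            (amount, account_id),
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE account_id = ?", (account_id,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")  # assumed business rule
        conn.execute("COMMIT")  # make the update permanent
        return True
    except Exception:
        conn.execute("ROLLBACK")  # undo every change made since BEGIN
        return False

# isolation_level=None disables implicit transactions, so BEGIN/COMMIT are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE accounts (account_id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts (account_id, balance) VALUES (1, 1000)")

ok = withdraw(conn, 1, 200)       # commits
failed = withdraw(conn, 1, 5000)  # rolls back, balance is unchanged
final_balance = conn.execute(
    "SELECT balance FROM accounts WHERE account_id = 1"
).fetchone()[0]
```

The second call shows the point of the transaction boundary: the UPDATE is executed, but the rollback leaves the stored balance as it was after the first call.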
Introduction to Transaction Processing
• Transaction processing systems:
• Systems with large databases and hundreds of concurrent
users
• Require high availability and fast response time
Read-only transaction: If the database operations in a transaction do not
update the database but only retrieve data, the transaction is called a read-only
transaction.
Read-write transaction: If the database operations in a transaction involve both
reading (retrieving) and writing (updating) the database, the transaction is called
a read-write transaction.
Introduction to Transaction Processing
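A data item can be named at different levels of granularity:
• Record: the record id can be the data item name.
• Disk block: the disk block address is used as the data item name.
• Attribute value of a record: record id + attribute name is used as the data item name.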
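The basic database access operations that a transaction can include are as
follows:
• read_item(X): reads a database item named X into a program variable.
• write_item(X): writes the value of a program variable into the database item named X.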
Read and Write Operations in detail:
1. Find disk block
2. Load into buffer
3. Update in memory
4. Flush to disk
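The four steps can be sketched with two dictionaries standing in for the disk and the main-memory buffer. This is a toy model under stated assumptions: the block layout, the item_location map, and the flush function are hypothetical, and a real DBMS decides on its own when to flush.

```python
# Toy "disk": block address -> items stored in that block (hypothetical layout).
disk = {0: {"X": 100, "Y": 50}, 1: {"Z": 7}}
item_location = {"X": 0, "Y": 0, "Z": 1}  # item name -> block address

buffer = {}    # block address -> in-memory copy of the block
dirty = set()  # addresses of buffered blocks modified since the last flush

def load_block(item):
    addr = item_location[item]           # 1. find the disk block holding the item
    if addr not in buffer:
        buffer[addr] = dict(disk[addr])  # 2. load the block into the buffer
    return addr

def read_item(item):
    addr = load_block(item)
    return buffer[addr][item]            # copy the value to the caller

def write_item(item, value):
    addr = load_block(item)
    buffer[addr][item] = value           # 3. update the buffered copy in memory
    dirty.add(addr)

def flush():
    for addr in sorted(dirty):
        disk[addr] = dict(buffer[addr])  # 4. write the modified block back to disk
    dirty.clear()

x = read_item("X")
write_item("X", x - 20)
on_disk_before_flush = disk[0]["X"]  # the disk still holds the old value
flush()
on_disk_after_flush = disk[0]["X"]   # the disk now holds the updated value
```

Between step 3 and step 4 the buffer and the disk disagree, which is why recovery has to care about when modified blocks reach the disk.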
DBMS Buffers:
• In the context of a Database Management System (DBMS), buffers are
used to temporarily store data that is being transferred between disk storage and
main memory (RAM).
• Their purpose is to optimize the performance of the DBMS by reducing the
number of costly disk accesses. Since accessing data from the disk is much
slower than accessing it from memory, buffers help in improving the overall efficiency
of data operations.
• A buffer hit occurs when the requested data is already in the buffer, and can be returned
quickly. A buffer miss happens when the requested data is not in the buffer, requiring it to be fetched
from the disk.
• The buffer pool is managed using replacement policies such as Least Recently Used (LRU)
and Most Recently Used (MRU), which decide which buffered block to evict when the pool is full.
MySQL Buffer Pool: InnoDB Buffer Pool
PostgreSQL Buffer Pool: Shared Buffers
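To make buffer hits, misses, and LRU replacement concrete, here is a small sketch of a read-only buffer pool. The class name LRUBufferPool and the dictionary "disk" are assumptions for illustration; this is not how InnoDB or PostgreSQL implement their pools, and writing dirty blocks back to disk is left out.

```python
from collections import OrderedDict

class LRUBufferPool:
    """Illustrative read-only buffer pool with Least Recently Used eviction."""

    def __init__(self, capacity, disk):
        self.capacity = capacity
        self.disk = disk             # block address -> block contents
        self.frames = OrderedDict()  # buffered blocks, least recently used first
        self.hits = 0
        self.misses = 0

    def get(self, addr):
        if addr in self.frames:
            self.hits += 1                   # buffer hit: no disk access needed
            self.frames.move_to_end(addr)    # mark as most recently used
            return self.frames[addr]
        self.misses += 1                     # buffer miss: fetch from disk
        if len(self.frames) >= self.capacity:
            self.frames.popitem(last=False)  # evict the least recently used block
        self.frames[addr] = self.disk[addr]
        return self.frames[addr]

disk = {n: f"block-{n}" for n in range(5)}
pool = LRUBufferPool(capacity=2, disk=disk)
for addr in [0, 1, 0, 2, 1]:
    pool.get(addr)
```

With a capacity of two blocks, the request for block 2 evicts block 1 (block 0 was used more recently), so the final request for block 1 is a miss again.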
Why Concurrency Control is needed in a Transaction:
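Transactions submitted by various users may execute concurrently. In such cases, they may
• access and update the same database items, so
• some form of concurrency control is needed.
Furthermore, the following problems can occur when concurrent execution is
uncontrolled:
• Lost update problem
• Temporary update (dirty read) problem
• Incorrect summary problem
• Unrepeatable read problem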
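A well-known instance is the lost update: two transactions read the same item, each computes a new value from its own copy, and the later write overwrites the earlier one. The sketch below simulates this in a hypothetical step format (the "read"/"add"/"write" tuples and the run function are assumptions made for illustration).

```python
def run(schedule, db):
    """Execute (op, txn, item[, delta]) steps; each transaction works on private copies."""
    db = dict(db)
    local = {}  # (txn, item) -> the transaction's private copy of the item
    for step in schedule:
        op, txn, item = step[:3]
        if op == "read":
            local[(txn, item)] = db[item]
        elif op == "add":
            local[(txn, item)] += step[3]
        elif op == "write":
            db[item] = local[(txn, item)]
    return db

initial = {"X": 80}

# T1 subtracts 5 from X, T2 adds 4 to X.
serial = [
    ("read", 1, "X"), ("add", 1, "X", -5), ("write", 1, "X"),
    ("read", 2, "X"), ("add", 2, "X", 4), ("write", 2, "X"),
]
interleaved = [
    ("read", 1, "X"), ("add", 1, "X", -5),
    ("read", 2, "X"), ("add", 2, "X", 4),  # T2 reads X before T1 has written it
    ("write", 1, "X"),
    ("write", 2, "X"),                     # overwrites T1's update
]

serial_result = run(serial, initial)["X"]
interleaved_result = run(interleaved, initial)["X"]
```

The serial execution yields 79, while the interleaved one yields 84: the subtraction performed by T1 has been lost.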
Transaction and System Concepts
The system must keep track of when each transaction starts, terminates,
commits, and/or aborts, using the following operations:
• BEGIN_TRANSACTION
• READ or WRITE
• END_TRANSACTION
• COMMIT_TRANSACTION
• ROLLBACK (or ABORT)
Schedules (Histories) of Transactions
A schedule (or history) S of n transactions T1, T2, … , Tn is an ordering of the
operations of the transactions. Operations from different transactions can be
interleaved in the schedule S. However, for each transaction Ti that participates in
the schedule S, the operations of Ti in S must appear in the same order in which they
occur in Ti.
A schedule preserves the order of operations within each individual
transaction.
The order of operations in S is considered to be a total ordering, meaning that for
any two operations in the schedule, one must occur before the other.
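The order-preservation requirement can be checked mechanically: projecting the schedule onto one transaction must give back exactly that transaction's operations in their original order. The sketch below assumes a hypothetical representation in which a schedule is a list of (transaction id, operation) pairs.

```python
def is_valid_schedule(schedule, transactions):
    """Check that `schedule` interleaves `transactions` without reordering any of them.

    schedule:     list of (txn_id, op) pairs in execution order
    transactions: dict mapping txn_id -> list of ops in program order
    """
    if any(txn_id not in transactions for txn_id, _ in schedule):
        return False
    for txn_id, ops in transactions.items():
        projected = [op for t, op in schedule if t == txn_id]
        if projected != ops:  # an operation is missing, duplicated, or reordered
            return False
    return True

transactions = {
    1: ["r(X)", "w(X)", "r(Y)", "w(Y)"],
    2: ["r(X)", "w(X)"],
}
interleaved = [(1, "r(X)"), (2, "r(X)"), (1, "w(X)"),
               (1, "r(Y)"), (2, "w(X)"), (1, "w(Y)")]
reordered = [(1, "w(X)"), (1, "r(X)"), (2, "r(X)"),
             (1, "r(Y)"), (2, "w(X)"), (1, "w(Y)")]

valid = is_valid_schedule(interleaved, transactions)
invalid = is_valid_schedule(reordered, transactions)
```

The second list is rejected because it places T1's write of X before T1's read of X, which is not the order in which they occur in T1.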
Schedules (Histories) of Transactions
A shorthand notation for describing a schedule uses the symbols b, r, w, e, c, and a for the operations
begin_transaction, read_item, write_item, end_transaction, commit, and abort, respectively, and appends
as a subscript the transaction id (transaction number) to each operation in the schedule.
Schedules (Histories) of Transactions
The schedule Sa can be written as follows in this notation:
Sa: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y);
Schedule Sb can be written as follows, if we assume that transaction T1 aborted after its
read_item(Y) operation:
Sb: r1(X); w1(X); r2(X); w2(X); r1(Y); a1;
Schedules (Histories) of Transactions
Conflicting Operations in a Schedule
Two operations in a schedule are said to conflict if they satisfy all three of the following
conditions:
• they belong to different transactions;
• they access the same item X; and
• at least one of the operations is a write_item(X).
e.g. In Sa: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y);
Conflicting operations are: r1(X) and w2(X) ;
r2(X) and w1(X), and
w1(X) and w2(X).
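The three conditions translate directly into a pairwise check over the schedule. The following sketch parses the shorthand notation and lists every conflicting pair; the parse and conflicts functions are illustrative names, and the parser only recognises read and write operations.

```python
import re

def parse(text):
    """Parse 'r1(X); w2(X);' into [('r', 1, 'X'), ('w', 2, 'X')]."""
    return [(op, int(txn), item)
            for op, txn, item in re.findall(r"([rw])(\d+)\((\w+)\)", text)]

def conflicts(schedule):
    """Return all conflicting pairs, in the order they appear in the schedule."""
    pairs = []
    for i, (op1, t1, x1) in enumerate(schedule):
        for op2, t2, x2 in schedule[i + 1:]:
            different_txns = t1 != t2
            same_item = x1 == x2
            one_is_write = "w" in (op1, op2)
            if different_txns and same_item and one_is_write:
                pairs.append((f"{op1}{t1}({x1})", f"{op2}{t2}({x2})"))
    return pairs

sa = parse("r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y);")
pairs = conflicts(sa)
```

For Sa this produces exactly the three pairs listed above; no pair involves Y because only T1 accesses Y.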
Schedules (Histories) of Transactions
Non-conflicting Operations are:
r1(X) and r2(X);
w2(X) and w1(Y);
r1(X) and w1(X)
A read-write conflict happens when:
• One transaction reads a data item.
• Another transaction writes to the same data item.
• These operations belong to different transactions.
e.g.: Consider two transactions:
T1: Read(A), Write(A)
T2: Write(A), Read(A)
Schedule : T1: Read(A) → (R1(A))
T2: Write(A) → (W2(A))
T1: Write(A) → (W1(A))
A write-write conflict occurs when two transactions attempt to write to the same data item
simultaneously, i.e., a write-write conflict happens when:
• Two transactions, say T1 and T2, both perform write operations on the same data item.
• These operations belong to different transactions.
Schedule : T1: Write(A = 10) → (W1(A))
T2: Write(A = 20) → (W2(A))
SERIAL AND SERIALIZABLE SCHEDULE
Serializability
In a DBMS, showing that a non-serial schedule is equivalent to some serial schedule is
essential for ensuring consistency and correctness of concurrent transactions.
Here are the 2 approaches:
• Conflict serializability: the non-serial schedule can be converted into a
serial schedule by swapping adjacent non-conflicting operations.
• View serializability: the schedule maintains the same data view as some
serial schedule.
Conflict Serializability
Since there are no cycles in the precedence graph generated in the previous slide,
the schedule is conflict-serializable.
Now, we can create a serial schedule based on the topological order of the
graph.
The precedence graph has the following edges:
• T2 → T1
• T2 → T3
• T3 → T1
T2 must come before both T1 and T3, and T3 must come before T1.
Hence, the serial schedule is: T2 → T3 → T1 (i.e., all operations of T2, then all
operations of T3, then all operations of T1)
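The test can be sketched in Python: build an edge Ti → Tj for every conflicting pair in which Ti's operation comes first, then attempt a topological sort. The function names and the tuple representation are assumptions for illustration; graphlib is in the standard library from Python 3.9.

```python
from graphlib import CycleError, TopologicalSorter

def precedence_edges(schedule):
    """Return edges (Ti, Tj): an operation of Ti conflicts with a later one of Tj."""
    edges = set()
    for i, (op1, t1, x1) in enumerate(schedule):
        for op2, t2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (op1, op2):
                edges.add((t1, t2))
    return edges

def serial_order(transactions, edges):
    """Return an equivalent serial order, or None if the graph has a cycle."""
    sorter = TopologicalSorter({t: set() for t in transactions})
    for before, after in edges:
        sorter.add(after, before)  # `before` must precede `after`
    try:
        return list(sorter.static_order())
    except CycleError:
        return None                # cycle: not conflict-serializable

# Edges of the precedence graph discussed above.
order = serial_order([1, 2, 3], {(2, 1), (2, 3), (3, 1)})

# Schedule Sa: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y);
sa = [("r", 1, "X"), ("r", 2, "X"), ("w", 1, "X"),
      ("r", 1, "Y"), ("w", 2, "X"), ("w", 1, "Y")]
sa_edges = precedence_edges(sa)
sa_order = serial_order([1, 2], sa_edges)
```

For the three edges above the only topological order is T2, T3, T1. Schedule Sa, by contrast, yields both T1 → T2 and T2 → T1, so its graph has a cycle and no equivalent serial order exists.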
Conflict Serializability
Homework: Construct the precedence graphs for schedules A to D from Figure 20.5 to test for
conflict serializability.
Conflict Equivalent Serializability
View Equivalence and View Serializability
View equivalence of two schedules:
• As long as each read operation of a transaction reads the result of the same write
operation in both schedules, the write operations of each transaction must produce the
same results.
• Read operations are said to see the same view in both schedules.
• Two schedules S and S′ are said to be view equivalent if the following three conditions hold:
1. The same set of transactions participates in S and S′, and S and S′ include the same
operations of those transactions.
2. For any operation ri(X) of Ti in S, if the value of X read by the operation has been written
by an operation wj(X) of Tj (or if it is the original value of X before the schedule started), the same
condition must hold for the value of X read by operation ri(X) of Ti in S′.
3. If the operation wk(Y) of Tk is the last operation to write item Y in S, then wk(Y) of Tk
must also be the last operation to write item Y in S′.
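The three conditions can be checked by comparing, for each schedule, which write every read sees and which write is last for every item. The sketch below is one way to do this under an assumed representation of operations as (op, transaction, item) tuples; view_profile and view_equivalent are illustrative names.

```python
def view_profile(schedule):
    """Return (reads-from relation, final write per item) for a schedule."""
    last_write = {}   # item -> (txn, k): the k-th write of txn is the latest so far
    write_count = {}
    read_count = {}
    reads_from = set()
    for op, txn, item in schedule:
        if op == "r":
            k = read_count.get((txn, item), 0)
            read_count[(txn, item)] = k + 1
            # None as the source means the read sees the original value of the item.
            reads_from.add((txn, item, k, last_write.get(item)))
        elif op == "w":
            k = write_count.get((txn, item), 0)
            write_count[(txn, item)] = k + 1
            last_write[item] = (txn, k)
    return reads_from, last_write

def view_equivalent(s1, s2):
    same_operations = sorted(s1) == sorted(s2)             # condition 1
    return same_operations and view_profile(s1) == view_profile(s2)  # conditions 2, 3

# S: r1(X); w2(X); w1(X); w3(X);
s = [("r", 1, "X"), ("w", 2, "X"), ("w", 1, "X"), ("w", 3, "X")]
serial_t1_t2_t3 = [("r", 1, "X"), ("w", 1, "X"), ("w", 2, "X"), ("w", 3, "X")]
serial_t2_t1_t3 = [("w", 2, "X"), ("r", 1, "X"), ("w", 1, "X"), ("w", 3, "X")]

equivalent = view_equivalent(s, serial_t1_t2_t3)
not_equivalent = view_equivalent(s, serial_t2_t1_t3)
```

S is view equivalent to the serial order T1, T2, T3: in both, r1(X) reads the original value and T3 writes X last. In the order T2, T1, T3 the read sees the value written by T2, so condition 2 fails.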
View Equivalence and View Serializability
View serializability: The definition of view serializability is based on view equivalence, i.e., a
schedule is view serializable if it is view equivalent to a serial schedule.
THANK YOU