Dbms Unit 3

The document discusses functional dependency in relational databases, explaining its types such as trivial, non-trivial, multivalued, transitive, fully functional, and partial dependencies. It also covers normalization processes to reduce redundancy and improve data consistency, detailing normal forms from 1NF to BCNF. Additionally, the document addresses transaction management, including operations like read, write, commit, and rollback, along with the states of transactions in a DBMS.
UNIT 3

Functional Dependency
In relational database management, a functional dependency is a constraint between two sets of attributes
of a relation, where the value of one attribute set uniquely determines the value of the other. It is
denoted as X → Y, where the attribute set X on the left side of the arrow is called the determinant and Y is
called the dependent. X → Y holds when any two rows that agree on X also agree on Y.
Types of Functional Dependencies in DBMS
1. Trivial functional dependency
2. Non-Trivial functional dependency
3. Multivalued functional dependency
4. Transitive functional dependency
5. Fully functional dependency
6. Partial functional dependency
1. Trivial Functional Dependency
In Trivial Functional Dependency, a dependent is always a subset of the determinant. i.e. If X → Y and Y is
the subset of X, then it is called trivial functional dependency.
Symbolically: A→B is trivial functional dependency if B is a subset of A.
The following dependencies are also trivial: A→A & B→B
Example 1:
• ABC → AB
• ABC → A
• ABC → ABC
Example 2:

roll_no  name  age
42       abc   17
43       pqr   18
44       xyz   18

Here, {roll_no, name} → name is a trivial functional dependency, since the dependent name is a subset of
determinant set {roll_no, name}. Similarly, roll_no → roll_no is also an example of trivial functional
dependency.

2. Non-trivial Functional Dependency


In Non-trivial functional dependency, the dependent is strictly not a subset of the determinant. i.e. If X →
Y and Y is not a subset of X, then it is called Non-trivial functional dependency.
Example 1:
• Id → Name
• Name → DOB
Example 2:

roll_no  name  age
42       abc   17
43       pqr   18
44       xyz   18

Here, roll_no → name is a non-trivial functional dependency, since the dependent name is not a subset of
determinant roll_no. Similarly, {roll_no, name} → age is also a non-trivial functional dependency, since age
is not a subset of {roll_no, name}
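
The two definitions above can be checked mechanically on a table instance. The following is a minimal sketch, assuming rows are stored as Python dictionaries; the helper names `holds` and `is_trivial` are illustrative and not part of any DBMS. Note that a table instance can only refute a dependency: a dependency that holds on three sample rows is not guaranteed to hold for all valid data.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Two rows that agree on lhs must also agree on rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_trivial(lhs, rhs):
    """A dependency is trivial when the dependent is a subset of the determinant."""
    return set(rhs) <= set(lhs)


# The student table from the examples above.
students = [
    {"roll_no": 42, "name": "abc", "age": 17},
    {"roll_no": 43, "name": "pqr", "age": 18},
    {"roll_no": 44, "name": "xyz", "age": 18},
]

print(holds(students, ["roll_no"], ["name"]))       # roll_no -> name
print(holds(students, ["age"], ["name"]))           # age -> name (age 18 maps to two names)
print(is_trivial(["roll_no", "name"], ["name"]))    # {roll_no, name} -> name
```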
3. Multivalued Functional Dependency
In a multivalued dependency, written a →→ b, each value of a is associated with a set of values of b, and
that set does not depend on the other attributes of the relation. i.e. If a →→ b and a →→ c hold and there
exists no dependency between b and c, then b and c are said to be multivalued dependent on a.
Example:

bike_model  manuf_year  color
tu1001      2007        Black
tu1001      2007        Red
tu2012      2008        Black
tu2012      2008        Red
tu2222      2009        Black
tu2222      2009        Red

In this table:
• X: bike_model
• Y: color
• Z: manuf_year
For each bike model (bike_model):
1. There is a group of colors (color) and a group of manufacturing years (manuf_year).
2. The colors do not depend on the manufacturing year, and the manufacturing year does not depend
on the colors. They are independent.
3. The sets of color and manuf_year are linked only to bike_model.
That’s what makes it a multivalued dependency.
In this case these two columns are said to be multivalued dependent on bike_model.
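
The independence described above can be tested on a table: for every value of X, the (Y, Z) pairs that occur must be every combination of the Y values and the Z values seen for that X. The following is a rough sketch, assuming a relation with exactly three single attributes; the function name `mvd_holds` and the `mixed` rows are hypothetical and used only for illustration.

```python
from itertools import product


def mvd_holds(rows, x, y, z):
    """Check x ->> y (and x ->> z) in a relation with attributes x, y, z."""
    groups = {}
    for row in rows:
        groups.setdefault(row[x], set()).add((row[y], row[z]))
    for pairs in groups.values():
        ys = {p[0] for p in pairs}
        zs = {p[1] for p in pairs}
        # Independence: every y value must appear with every z value.
        if pairs != set(product(ys, zs)):
            return False
    return True


bikes = [
    {"bike_model": "tu1001", "manuf_year": 2007, "color": "Black"},
    {"bike_model": "tu1001", "manuf_year": 2007, "color": "Red"},
    {"bike_model": "tu2012", "manuf_year": 2008, "color": "Black"},
    {"bike_model": "tu2012", "manuf_year": 2008, "color": "Red"},
    {"bike_model": "tu2222", "manuf_year": 2009, "color": "Black"},
    {"bike_model": "tu2222", "manuf_year": 2009, "color": "Red"},
]

# Hypothetical counter-example: color and year are tied together for one model.
mixed = [
    {"bike_model": "m1", "manuf_year": 2007, "color": "Black"},
    {"bike_model": "m1", "manuf_year": 2008, "color": "Red"},
]

print(mvd_holds(bikes, "bike_model", "color", "manuf_year"))
print(mvd_holds(mixed, "bike_model", "color", "manuf_year"))
```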
4. Transitive Functional Dependency
In transitive functional dependency, dependent is indirectly dependent on determinant. i.e.
If a → b & b → c, then according to axiom of transitivity, a → c. This is a transitive functional dependency.
Example:

enrol_no  name  dept  building_no
42        abc   CO    4
43        pqr   EC    2
44        xyz   IT    1
45        abc   EC    2

Here, enrol_no → dept and dept → building_no. Hence, according to the axiom of transitivity, enrol_no →
building_no is a valid functional dependency. This is an indirect functional dependency, hence called
Transitive functional dependency.
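
Transitive dependencies can be derived by computing the closure of an attribute set, that is, everything it determines directly or indirectly. The following is a minimal sketch of the standard closure computation; the function name `closure` and the representation of dependencies as pairs of sets are assumptions made for this example.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know all of lhs, we also know all of rhs.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# Dependencies from the example: enrol_no -> dept and dept -> building_no.
fds = [({"enrol_no"}, {"dept"}), ({"dept"}, {"building_no"})]

# building_no appears in the closure of enrol_no, so enrol_no -> building_no holds.
print(sorted(closure({"enrol_no"}, fds)))
```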
5. Fully Functional Dependency
In a full functional dependency, an attribute or set of attributes depends on the whole of the determinant
and not on any proper subset of it. If a relation R has attributes X, Y, Z and {X, Y} → Z holds while neither
X → Z nor Y → Z holds, then Z is fully functionally dependent on {X, Y}.
6. Partial Functional Dependency
In a partial functional dependency, a non-key attribute depends on only a part of the composite key, rather
than on the whole key. If a relation R has attributes X, Y, Z, where {X, Y} is the composite key and Z is a
non-key attribute, and X → Z holds, then X → Z is a partial functional dependency.
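
Whether a dependency is full or partial can be checked by testing every proper subset of the determinant. The sketch below assumes dependencies are given as pairs of sets; the names `closure` and `is_fully_dependent`, and the extra attribute W, are hypothetical and introduced only for this illustration.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_fully_dependent(lhs, attr, fds):
    """True if lhs -> attr holds and no proper subset of lhs determines attr."""
    lhs = set(lhs)
    if attr not in closure(lhs, fds):
        return False
    for size in range(1, len(lhs)):
        for subset in combinations(sorted(lhs), size):
            if attr in closure(subset, fds):
                return False  # a part of lhs is enough: partial dependency
    return True


# Assumed dependencies: {X, Y} -> W (full) and X -> Z (partial w.r.t. key {X, Y}).
fds = [({"X", "Y"}, {"W"}), ({"X"}, {"Z"})]

print(is_fully_dependent({"X", "Y"}, "W", fds))
print(is_fully_dependent({"X", "Y"}, "Z", fds))
```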

Normalization
Normalization is the process of organizing the data in a database to minimize redundancy and improve data
consistency, and to avoid insertion, update and deletion anomalies. To do this, a table is divided into two
or more tables.
In database design, there are different normal forms, defined in terms of the keys and functional
dependencies of a table. These include:

First Normal Form (1NF)


A relation is said to be in 1NF only if every attribute contains only single (atomic) values and each row is
uniquely identified by a primary key. This means that a table cannot have repeating groups or arrays as
columns: an attribute of a table cannot hold multiple values.
Example: Consider a table that lists customers and their phone numbers.

Customer ID  Name     Phone Numbers
1            John     555-1234, 555-5678
2            Jane     555-9876
3            Michael  555-5555

This violates 1NF because the Phone Numbers column holds more than one value in a single cell.
To normalize this table to 1NF, we can split the Phone Numbers column into separate rows, with one phone
number per row. Each row is then identified by the combination (Customer ID, Phone Number).

Customer ID  Name     Phone Number
1            John     555-1234
1            John     555-5678
2            Jane     555-9876
3            Michael  555-5555
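
The conversion to 1NF shown above can be sketched in a few lines of Python, assuming the multi-valued cell is a comma-separated string. The function name `to_1nf` is illustrative only.

```python
# The unnormalized table: the third field holds a comma-separated list.
customers = [
    (1, "John", "555-1234, 555-5678"),
    (2, "Jane", "555-9876"),
    (3, "Michael", "555-5555"),
]


def to_1nf(rows):
    """Emit one row per phone number so that every cell is atomic."""
    result = []
    for customer_id, name, phones in rows:
        for phone in phones.split(","):
            result.append((customer_id, name, phone.strip()))
    return result


normalized = to_1nf(customers)
for row in normalized:
    print(row)
```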

Second Normal Form (2NF): (Remove Partial Dependency)

A relation is said to be in 2NF if and only if (iff) it is in 1NF and every non-key attribute is fully
functionally dependent on the primary key. Any partial dependency that exists is removed by dividing the
table into 2 or more tables.
Example
Consider a table that lists orders and their line items:
Order ID  Customer ID  Customer Name  Item ID  Item Name  Quantity
1         1            John           1        Shirt      2
1         1            John           2        Pants      1
2         2            Jane           1        Shirt      1
2         2            Jane           3        Hat        3

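This violates 2NF. The primary key is the combination (Order ID, Item ID), but Customer ID and Customer
Name depend only on Order ID, and Item Name depends only on Item ID, so each of them depends on only part
of the primary key. To normalize this table to 2NF, we can split it into the following tables.

Order ID  Item ID  Quantity
1         1        2
1         2        1
2         1        1
2         3        3

Order ID  Customer ID
1         1
2         2

Customer ID  Customer Name
1            John
2            Jane

Item ID  Item Name
1        Shirt
2        Pants
3        Hat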
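
A decomposition like this one can be viewed as a set of projections of the original table onto subsets of its columns, with duplicate rows removed. The following is a minimal sketch; the snake_case column names and the helper `project` are assumptions made for this example.

```python
order_lines = [
    {"order_id": 1, "customer_id": 1, "customer_name": "John",
     "item_id": 1, "item_name": "Shirt", "quantity": 2},
    {"order_id": 1, "customer_id": 1, "customer_name": "John",
     "item_id": 2, "item_name": "Pants", "quantity": 1},
    {"order_id": 2, "customer_id": 2, "customer_name": "Jane",
     "item_id": 1, "item_name": "Shirt", "quantity": 1},
    {"order_id": 2, "customer_id": 2, "customer_name": "Jane",
     "item_id": 3, "item_name": "Hat", "quantity": 3},
]


def project(rows, columns):
    """Project rows onto columns, dropping duplicates and keeping first-seen order."""
    seen = []
    for row in rows:
        projected = tuple(row[c] for c in columns)
        if projected not in seen:
            seen.append(projected)
    return seen


customers = project(order_lines, ["customer_id", "customer_name"])
items = project(order_lines, ["item_id", "item_name"])
quantities = project(order_lines, ["order_id", "item_id", "quantity"])

print(customers)
print(items)
print(quantities)
```

Each customer name and item name is now stored once instead of once per order line.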

Third Normal Form (3NF): (Remove Transitive Functional Dependency)


A relation is said to be in 3NF iff it is in 2NF and there is no Transitive functional dependency. This means
that a table should not have transitive dependencies, where a non-primary key column depends on
another non-primary key column.
Example: To explain 3NF further, let's consider an example of a table that lists customer orders.

Order ID  Customer ID  Customer Name  Customer City  Order Date  Order Total
1         100          John Smith     New York       2022-01-01  100
2         101          Jane Doe       Los Angeles    2022-01-02  200
3         102          Bob Johnson    San Francisco  2022-01-03  300



In this example, the non-primary key column "Customer City" is transitively dependent on the primary key. That is, it
depends on "Customer ID", which is not part of the primary key, instead of depending directly on the primary key
"Order ID". To bring this table to 3NF, we can split it into two tables.

Table 1: Customers

Customer ID  Customer Name  Customer City
100          John Smith     New York
101          Jane Doe       Los Angeles
102          Bob Johnson    San Francisco

Table 2: Orders

Order ID  Customer ID  Order Date  Order Total
1         100          2022-01-01  100
2         101          2022-01-02  200
3         102          2022-01-03  300

Now, the "Customer City" column is no longer transitively dependent on the primary key and is instead in a
separate table that has a direct relationship with the primary key.
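
The two 3NF tables can be tried out with Python's built-in `sqlite3` module. The following is a sketch only: the table and column names are assumptions derived from the headings above, and an in-memory database is used so nothing is written to disk. Joining on Customer ID reproduces the rows of the original table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL,
    customer_city TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    order_date  TEXT NOT NULL,
    order_total INTEGER NOT NULL
);
""")
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", [
    (100, "John Smith", "New York"),
    (101, "Jane Doe", "Los Angeles"),
    (102, "Bob Johnson", "San Francisco"),
])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", [
    (1, 100, "2022-01-01", 100),
    (2, 101, "2022-01-02", 200),
    (3, 102, "2022-01-03", 300),
])
conn.commit()

# Joining the two tables gives back the columns of the original table.
rows = conn.execute("""
    SELECT o.order_id, o.customer_id, c.customer_name, c.customer_city,
           o.order_date, o.order_total
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()

for row in rows:
    print(row)
```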

Boyce-Codd Normal Form (BCNF)


BCNF is a stricter form of 3NF. A table is in BCNF if, for every non-trivial functional dependency X → Y
that holds in it, the determinant X is a candidate key (or a superset of one). In other words, every
determinant must be a candidate key. The difference from 3NF usually shows up only in tables that have
more than one candidate key. BCNF removes the redundancy that can be detected using functional
dependencies.
Example
Consider a table that lists information about books and their authors.

Book ID  Title                 Author ID  Author Name          Author Nationality
1        Crime and Punishment  100        Fyodor Dostoevsky    Russian
2        The Great Gatsby      101        F. Scott Fitzgerald  American
3        Pride and Prejudice   102        Jane Austen          British
Table: Books
In this example, the functional dependency Author ID → Author Name, Author Nationality violates BCNF
because its determinant, Author ID, is not a candidate key of the Books table (the only candidate key is
Book ID). To bring this table to BCNF, we can split it into two tables.
Table 1: Authors

Author ID  Author Name          Author Nationality
100        Fyodor Dostoevsky    Russian
101        F. Scott Fitzgerald  American
102        Jane Austen          British

Table 2: Books

Book ID  Title                 Author ID
1        Crime and Punishment  100
2        The Great Gatsby      101
3        Pride and Prejudice   102

Now every determinant is a candidate key of its own table (Author ID in Authors, Book ID in Books), and
both tables are in BCNF.
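
The BCNF condition can be checked mechanically: a non-trivial dependency violates BCNF when the closure of its determinant does not cover all attributes of the table. The sketch below assumes the dependencies Book ID → Title, Author ID and Author ID → Author Name, Author Nationality, which are inferred from the example and not stated as such in it; the function names are illustrative.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """List the non-trivial dependencies whose determinant is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial dependencies never violate BCNF
        if not set(attributes) <= closure(lhs, fds):
            violations.append((lhs, rhs))
    return violations


book_fd = (("book_id",), ("title", "author_id"))
author_fd = (("author_id",), ("author_name", "author_nationality"))

original = ["book_id", "title", "author_id", "author_name", "author_nationality"]
print(bcnf_violations(original, [book_fd, author_fd]))

# After the split, each table only carries its own dependency.
print(bcnf_violations(["book_id", "title", "author_id"], [book_fd]))
print(bcnf_violations(["author_id", "author_name", "author_nationality"], [author_fd]))
```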
Advantages of Normalization
• Reduced Data Redundancy
• Improved Data Consistency
• Simplified Database Maintenance
• Improved Query Performance
Lossless Decomposition
Lossless decomposition in DBMS refers to breaking a relation (table) into two or more sub-relations in
such a way that no information is lost when the sub-relations are joined back together.
When a relation is not in an appropriate normal form, it needs to be decomposed. Breaking a table into
multiple tables is known as decomposition. It is done to eliminate redundancy and inconsistency. A good
decomposition should have two properties: it should be a lossless join decomposition and it should be
dependency preserving.
In short, the original relation can be obtained by using joins on the decomposed relations, and the data
after reconstruction must be the same as the original data.
Criteria of Lossless Join Decomposition in DBMS
A decomposition of a relation R into R1 and R2 is a lossless join decomposition if:
1. The union of the attributes of R1 and R2 contains all the attributes of R.
2. R1 and R2 have at least one common attribute.
3. The common attribute (or set of attributes) is a key of at least one of R1 and R2.
Example 1: Consider the following relations- R = (D, E, F)
R1 = (D, E)
R2 = (E, F)
The relation R has 3 attributes D, E, and F. It is decomposed into two relations R1 and R2, each with two
attributes. Both have the common attribute 'E'.
For the third condition, the values in column E must be unique in at least one of the sub-relations. If E has
duplicate values in both, joining on E can produce rows that were not in R, and the decomposition is not
lossless.
Now, let us draw a table of relation R with raw data −
R = (D, E, F)
D   E   F
78  19  16
39  76  91
78  29  44

It is decomposed as follows-
R1(D, E)
D   E
78  19
39  76
78  29

R2(E, F)
E   F
19  16
76  91
29  44

Let us check the conditions. The union of the attributes of R1 and R2 is {D, E, F}, which is all the
attributes of the original relation R; the common attribute is E; and every value of E is unique, so E is a
key of R2. Joining R1 and R2 on E gives:
R1 ⋈ R2
D   E   F
78  19  16
39  76  91
78  29  44
The relation obtained above is the same as the original relation R. We can say that it is an example
of lossless-join decomposition.
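
The check can be reproduced with a small natural join over Python tuples. This is a sketch under the assumption that R1 and R2 are the projections of R shown above; the relation `R_bad` is a hypothetical counter-example in which E has duplicate values.

```python
def natural_join(r1, r2):
    """Join R1(D, E) with R2(E, F) on the common attribute E."""
    return sorted({(d, e, f) for d, e in r1 for e2, f in r2 if e == e2})


def decompose(r):
    """Project R(D, E, F) onto R1(D, E) and R2(E, F)."""
    r1 = sorted({(d, e) for d, e, _ in r})
    r2 = sorted({(e, f) for _, e, f in r})
    return r1, r2


R = [(78, 19, 16), (39, 76, 91), (78, 29, 44)]
R1, R2 = decompose(R)
joined = natural_join(R1, R2)
print(joined == sorted(R))  # lossless: the join gives back exactly R

# Hypothetical relation where E is not unique: the join invents extra rows.
R_bad = [(1, 5, 10), (2, 5, 20)]
bad_joined = natural_join(*decompose(R_bad))
print(bad_joined)
```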

Transaction Management
In a Database Management System (DBMS), a transaction is a sequence of operations performed as a
single logical unit of work. These operations may involve reading, writing, updating, or deleting data in the
database. A transaction is considered complete only if all its operations are successfully executed.
Otherwise the transaction must be rolled back, ensuring the database remains in a consistent state.
Operations of Transaction:
A user can make different types of requests to access and modify the contents of a database. So, we have
different types of operations relating to a transaction. They are discussed as follows:
1) Read(X)
Reads the value of data item X from the database into a buffer in main memory.
Example: For a banking system, when a user checks their balance, a Read operation is performed on
their account balance:
SELECT balance FROM accounts WHERE account_id = 'A123';
This reads the balance of the user's account without changing it.
2) Write(X)
Writes the value of data item X from the buffer back to the database.
Example: For the banking system, if a user withdraws money, a Write operation is performed to store the
updated balance:
UPDATE accounts SET balance = balance - 100 WHERE account_id = 'A123';
This updates the balance of the user’s account after withdrawal.
3) Commit
Example: After a successful money transfer in a banking system, a Commit operation finalizes the
transaction: COMMIT;
Once the transaction is committed, the changes to the database are permanent, and the transaction is
considered successful.
4) Rollback
Example: Suppose during the money transfer process, the system encounters an issue, like insufficient
funds in the sender’s account. In that case, the transaction is rolled back:
ROLLBACK;
This will undo all the operations performed so far and ensure that the database remains consistent.
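
The four operations can be seen together in a small money-transfer sketch using Python's built-in `sqlite3` module. The `accounts` table and account `A123` follow the SQL above; the second account `B456`, the starting balances and the function `transfer` are assumptions made for this illustration. The debit is deliberately written before the balance check so that the rollback has something to undo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (account_id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A123", 300), ("B456", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; commit on success, roll back on failure."""
    try:
        # Write(X): debit the sender.
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE account_id = ?",
                     (amount, src))
        # Read(X): check the resulting balance.
        (balance,) = conn.execute("SELECT balance FROM accounts WHERE account_id = ?",
                                  (src,)).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        # Write(Y): credit the receiver.
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE account_id = ?",
                     (amount, dst))
        conn.commit()      # Commit: make both updates permanent
        return True
    except ValueError:
        conn.rollback()    # Rollback: undo the debit made above
        return False


ok = transfer(conn, "A123", "B456", 100)       # succeeds
failed = transfer(conn, "A123", "B456", 1000)  # not enough money, rolled back
balances = dict(conn.execute("SELECT account_id, balance FROM accounts").fetchall())
print(ok, failed, balances)
```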
Transaction States in DBMS
A transaction in a DBMS refers to a series of operations that are executed together as a single unit to
perform a task, such as transferring money between accounts in a banking system. A transaction state
refers to the current phase or condition of a transaction during its execution in a database. It represents
the progress of the transaction and determines whether it will successfully complete (commit) or fail
(abort).
• A transaction goes through several states during its lifetime.
• A transaction log is used to keep a record of all the steps taken during the transaction.
• A transaction involves two main operations, Read Operation and Write Operation.
  • Read Operation: Reads data from the database, stores it temporarily in memory (buffer), and uses
    it as needed.
  • Write Operation: Updates the database with the changed data using the buffer.
Note: From the start of executing instructions to the end, the Read and Write operations are treated as a
single transaction. This ensures the database remains consistent and reliable throughout the process.
Different Types of Transaction States in DBMS
1. Active State
2. Partially Committed State
3. Committed State
4. Failed State
5. Aborted State
6. Terminated State

1. Active State
• It is the first state of any transaction, entered when it begins to execute. The execution of the
  transaction takes place in this state.
• While the instructions of the transaction are executing, the transaction is in the active state. If all
  the instructions execute, the transaction goes to the "partially committed state"; otherwise it goes
  to the "failed state".
• Operations such as insertion, deletion, or update are performed during this state.
• During this state, the data records are under manipulation and are not yet saved to the database;
  they remain in a buffer in main memory.
2. Partially Committed
After the execution of all the instructions of a transaction, the transaction enters the partially committed
state from the active state.
At this stage the transaction can still fail, because all the changes made by the transaction are still
stored in the buffer of main memory.
• The transaction has finished its final operation, but the changes are still not saved to the database.
• After completing all read and write operations, the modifications are initially stored in main
  memory or a local buffer. If the changes are made permanent in the database, the state changes
  to the “committed state”; in case of failure it goes to the “failed state”.
3. Committed: This state of transaction is achieved when all the transaction-related operations have
been executed successfully along with the Commit operation, i.e. data is saved into the database after the
required manipulations in this state. This marks the successful completion of a transaction.
4. Failed State: If any of the transaction-related operations cause an error during the active or partially
committed state, further execution of the transaction is stopped and it is brought into a failed state. Here,
the database recovery system makes sure that the database is in a consistent state.
5. Aborted State: Once a transaction has reached the failed state, the database recovery system rolls
back its changes and restores the database to the consistent state it was in before the transaction started.
After the rollback the transaction is in the aborted state, and it can either be restarted or cancelled.
6. Terminated State: It refers to the final state of a transaction, indicating that it has completed its
execution. Once a transaction reaches this state, it has either been successfully committed or aborted, and
the system is ready to execute a new transaction.

Example of Transaction States


Imagine a bank transaction where a user wants to transfer $500 from Account A to Account B. The system
should handle the following transaction states:
1. Active State:
• The transaction begins. It reads the balance of Account A and checks if it has enough funds.
• Example: Read balance of Account A = $1000.
2. Partially Committed State:
• The transaction performs all its operations but hasn’t yet saved (committed) the changes to the
  database.
• Example: Deduct $500 from Account A’s balance ($1000 – $500 = $500) and temporarily update
  Account B’s balance (add $500).
3. Committed State:
• The transaction successfully completes, and the changes are saved permanently in the database.
• Example: Account A’s new balance = $500; Account B’s new balance = $1500 (Account B started
  with $1000). Changes are written to the database.
4. Failed State:
• If something goes wrong during the transaction (e.g., power failure, system crash), the transaction
  moves to this state.
• Example: System crashes after deducting $500 from Account A but before adding it to Account B.
5. Aborted State:
• The failed transaction is rolled back, and the database is restored to its original state.
• Example: Account A’s balance is restored to $1000, and no changes are made to Account B.
These states ensure that either the transaction completes successfully (committed) or the database is
restored to its original state (aborted), maintaining consistency and preventing errors.
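
The allowed moves between states can be written down as a small state machine. This is a sketch based on the descriptions above, not the implementation of any particular DBMS; the names `State`, `TRANSITIONS` and `run` are illustrative.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    COMMITTED = "committed"
    FAILED = "failed"
    ABORTED = "aborted"
    TERMINATED = "terminated"


# Which states can follow each state, as described in the text.
TRANSITIONS = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},
    State.COMMITTED: {State.TERMINATED},
    State.ABORTED: {State.TERMINATED},
    State.TERMINATED: set(),
}


def run(path):
    """Follow a sequence of states from ACTIVE, rejecting illegal moves."""
    state = State.ACTIVE
    for nxt in path:
        if nxt not in TRANSITIONS[state]:
            raise ValueError(f"illegal transition: {state.value} -> {nxt.value}")
        state = nxt
    return state


# Successful transfer: active -> partially committed -> committed -> terminated.
print(run([State.PARTIALLY_COMMITTED, State.COMMITTED, State.TERMINATED]))
# Crash during the transfer: active -> failed -> aborted -> terminated.
print(run([State.FAILED, State.ABORTED, State.TERMINATED]))
```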
ACID Properties
ACID stands for Atomicity, Consistency, Isolation, and Durability. These four key properties define how a
transaction should be processed in a reliable and predictable manner, ensuring that the database remains
consistent, even in cases of failures.
What Are Transactions in DBMS?
A transaction in DBMS refers to a sequence of operations performed as a single unit of work. These
operations may involve reading or writing data to the database. To maintain data integrity, DBMS ensures
that each transaction adheres to the ACID properties. Think of a transaction like an ATM withdrawal.
When you withdraw money from your account, the transaction involves several steps:
• Checking your balance.
• Deducting the money from your account.
• Adding the money to the bank's record.
For the transaction to be successful, all steps must be completed. If any part of this process fails (e.g., if
there’s a system crash), the entire transaction should fail, and no data should be altered. This ensures the
database remains in a consistent state.

The Four ACID Properties


1. Atomicity: "All or Nothing"
Atomicity means that either the entire transaction completes fully or it does not execute at all. There is
no in-between state, i.e. transactions do not occur partially, and the database must never be left in a state
where a transaction is partially completed. The state of the database is defined only before the execution
of the transaction or after the execution of the transaction.

2. Consistency: Maintaining Valid Data States


The database must remain in a consistent state after any transaction. No transaction should have any
adverse effect on the data residing in the database. If the database is in a consistent state before the
execution of a transaction, it must remain consistent after the execution of the transaction as well.
It means a transaction begins with the consistent database and finishes with the consistent state.

3. Isolation: Ensuring Concurrent Transactions Don't Interfere


This property ensures that multiple transactions can occur concurrently without leading to
inconsistency of the database state. Transactions occur independently without interference. Changes
made in a particular transaction will not be visible to any other transaction until that transaction has
been committed.

4. Durability: Persisting Changes


This property ensures that once the transaction has committed, its updates and modifications to the
database are written to disk and persist even if a system failure occurs. These updates are now
permanent and are stored in non-volatile memory.
Deadlock in DBMS
In a database management system (DBMS), a deadlock occurs when two or more transactions are waiting
for each other to release resources, such as locks on database objects, that they need to complete their
operations. As a result, none of the transactions can proceed, leading to a situation where they are stuck
or “deadlocked.” Deadlock handling involves techniques to avoid, detect and resolve such situations.

Features of deadlock in a DBMS:

A deadlock can arise only when the first four conditions below hold at the same time; indefinite blocking
is the result.
• Mutual Exclusion: Each resource can be held by only one transaction at a time, and other
  transactions must wait for it to be released.
• Hold and Wait: Transactions can request resources while holding on to resources already allocated
  to them.
• No Preemption: Resources cannot be taken away from a transaction forcibly, and the transaction
  must release them voluntarily.
• Circular Wait: Transactions are waiting for resources in a circular chain, where each transaction is
  waiting for a resource held by the next transaction in the chain.
• Indefinite Blocking: Transactions are blocked indefinitely, waiting for resources to become
  available, and no transaction can proceed.

[Figure: Deadlock in DBMS]

Deadlock Avoidance: It is always better to avoid a deadlock than to abort or restart transactions after
the database is already stuck in one. The deadlock avoidance method is suitable for smaller
databases whereas the deadlock prevention method is suitable for larger databases.
In the above example, transactions that access Students and Grades should always access the tables
in the same order. In this way, Transaction T1 simply waits for Transaction T2 to release the lock on
Grades before it begins. When Transaction T2 releases the lock, Transaction T1 can proceed freely.

Deadlock Detection: When a transaction waits indefinitely to obtain a lock, the database management
system should detect whether the transaction is involved in a deadlock or not.

Wait-for-graph is one of the methods for detecting the deadlock situation. This method is suitable for
smaller databases. In this method, a graph is drawn based on the transaction and its lock on the resource.
If the graph created has a closed loop or a cycle, then there is a deadlock.
For the above-mentioned scenario, the Wait-For graph is drawn below:
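
Independently of the figure, the cycle test on a wait-for graph can be sketched in Python with a depth-first search. The graph is assumed to be a dictionary that maps each transaction to the transactions it is waiting for; the function name `has_deadlock` and the sample graphs are illustrative.

```python
def has_deadlock(wait_for):
    """Return True if the wait-for graph contains a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    nodes = set(wait_for) | {t for targets in wait_for.values() for t in targets}
    colour = {n: WHITE for n in nodes}

    def visit(node):
        colour[node] = GREY
        for nxt in wait_for.get(node, ()):
            if colour[nxt] == GREY:
                return True  # reached a transaction already on the path: cycle
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in nodes)


# T1 waits for T2 and T2 waits for T1: a closed loop, so a deadlock.
print(has_deadlock({"T1": ["T2"], "T2": ["T1"]}))
# T1 waits for T2, but T2 waits for nobody: no deadlock.
print(has_deadlock({"T1": ["T2"], "T2": []}))
```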

Deadlock prevention: For a large database, the deadlock prevention method is suitable. A deadlock
can be prevented if the resources are allocated in such a way that a deadlock never occurs.
The DBMS analyzes the operations to determine whether they can create a deadlock situation. If they can,
that transaction is not allowed to proceed. The deadlock prevention mechanism proposes two schemes, both
based on transaction timestamps (an older transaction has a smaller timestamp):
• Wait-Die Scheme: In this scheme, if a transaction requests a resource that is locked by another
  transaction, the DBMS compares the timestamps of both transactions. If the requesting
  transaction is older, it is allowed to wait until the resource is available. If it is younger, it is
  aborted ("dies") and restarted later with the same timestamp.
• Wound-Wait Scheme: In this scheme, if an older transaction requests a resource held by a
  younger transaction, the older transaction forces the younger transaction to abort ("wounds" it)
  and release the resource. The younger transaction is restarted after a short delay
  but with the same timestamp. If a younger transaction requests a resource that is held by
  an older one, the younger transaction waits until the older one releases it.
The following table lists the differences between the Wait-Die and Wound-Wait prevention schemes:

Wait-Die | Wound-Wait
It is based on a non-preemptive technique. | It is based on a preemptive technique.
In this, older transactions must wait for the younger one to release its data items. | In this, older transactions never wait for younger transactions.
The number of aborts and rollbacks is higher in this technique. | In this, the number of aborts and rollbacks is lower.
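
The two schemes reduce to a comparison of timestamps. The sketch below assumes that a smaller timestamp means an older transaction; the function names and the returned strings are illustrative only.

```python
def wait_die(requester_ts, holder_ts):
    """Wait-Die: an older requester waits, a younger requester is aborted."""
    return "wait" if requester_ts < holder_ts else "abort requester"


def wound_wait(requester_ts, holder_ts):
    """Wound-Wait: an older requester aborts the holder, a younger requester waits."""
    return "abort holder" if requester_ts < holder_ts else "wait"


# Transaction with timestamp 1 is older than the one with timestamp 2.
print(wait_die(1, 2), "|", wait_die(2, 1))
print(wound_wait(1, 2), "|", wound_wait(2, 1))
```

In both schemes it is always the younger transaction that is aborted, which is why neither scheme can produce a cycle of waiting transactions.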

Effects on applications:
• Delayed Transactions: Deadlocks can cause transactions to be delayed, as the resources they need
  are being held by other transactions. This can lead to slower response times and longer wait times
  for users.
• Lost Transactions: In some cases, deadlocks can cause transactions to be lost or aborted, which can
  result in data inconsistencies or other issues.
• Reduced User Satisfaction: Deadlocks can lead to a perception of poor system performance and
  can reduce user satisfaction with the application. This can have a negative impact on user adoption
  and retention.

Disadvantages:
• System downtime: Deadlock can cause system downtime, which can result in loss of productivity
  and revenue for businesses that rely on the DBMS.
• Resource waste: When transactions are waiting for resources, these resources are not being used,
  leading to wasted resources and decreased system efficiency.
• Reduced concurrency: Deadlock can lead to a decrease in system concurrency, which can result in
  slower transaction processing and reduced throughput.
• Complex resolution: Resolving deadlock can be a complex and time-consuming process.

Serializability in DBMS
Serializability in DBMS guarantees that the execution of multiple transactions in parallel does not produce
any unexpected or incorrect results. This is accomplished by enforcing a set of rules that ensure that each
transaction is executed as if it were the only transaction running in the system.
What is a Schedule in DBMS: When several transactions are running concurrently, the order of execution
of their instructions is known as a schedule. A schedule can include multiple transactions running
concurrently, each performing a series of read and write operations on the database. The order of
execution of these operations can significantly impact the final state of the database and the correctness
of the results. Schedules are of 2 types:

1. Serial Schedule in DBMS: The serial schedule is a type of schedule where each transaction is
executed completely before the next transaction begins. In the Serial schedule, when the first transaction
completes its cycle, then the next transaction is executed. This schedule is viewed as simplest. Due to the
fact that transactions only execute one after the other, serial schedules are always serializable. Example
of Serial Schedule:
Transaction 1 Transaction 2

R(x)

W(x)

R(y)

W(y)

R(y)

W(y)

R(x)

W(x)

2. Non-Serial Schedule in DBMS: The Non-Serial schedule is a type of schedule where transactions
are executed concurrently, with some overlap in time. Unlike a serial schedule, where transactions are
executed one after the other with no overlap, a non-serial schedule allows transactions to execute
simultaneously. Example of Non-Serial Schedule:

Transaction 1 Transaction 2

R(x)

W(x)

R(y)

W(y)

R(y)

R(x)

W(y)

W(x)

Types of Serializability in DBMS


There are mainly two types of serializability in DBMS:
1. View Serializability in DBMS: A schedule is view serializable if it is view equivalent to a serial schedule,
that is, a schedule where the transactions are executed one after the other without any overlap. Two
schedules are view equivalent if, for every data item, (a) the same transactions read its initial value,
(b) each read operation reads the value written by the same transaction in both schedules, and (c) the
same transaction performs the final write. To test for view serializability, we first identify the read and
write operations of each transaction and then compare them against these conditions. Example of View
Serializability in DBMS:

Transaction 1 Transaction 2

R(a)

W(a)

R(a)

W(a)

R(b)

W(b)

R(b)

W(b)

Let’s swap the read-write operations in the middle of the two transactions to create the view equivalent
schedule.
Transaction 1 Transaction 2

R(a)

W(a)

R(b)

W(b)

R(a)

W(a)

R(b)

W(b)
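
View equivalence of two schedules can be checked by comparing which write each read sees and which transaction writes each item last. The following is a rough sketch, assuming a schedule is a list of (transaction, operation, item) tuples and that no transaction reads the same item twice. The assignment of operations to T1 and T2 below is an assumption, since the column layout of the tables above is not preserved.

```python
def view_info(schedule):
    """Return (reads-from relation, final writer of each item) for a schedule."""
    last_writer = {}
    reads_from = set()
    for txn, op, item in schedule:
        if op == "R":
            # None means the read sees the initial value of the item.
            reads_from.add((txn, item, last_writer.get(item)))
        else:
            last_writer[item] = txn
    return reads_from, last_writer


def view_equivalent(s1, s2):
    """Same operations, same reads-from relation and same final writes."""
    return sorted(s1) == sorted(s2) and view_info(s1) == view_info(s2)


interleaved = [
    ("T1", "R", "a"), ("T1", "W", "a"), ("T2", "R", "a"), ("T2", "W", "a"),
    ("T1", "R", "b"), ("T1", "W", "b"), ("T2", "R", "b"), ("T2", "W", "b"),
]
serial_t1_t2 = [
    ("T1", "R", "a"), ("T1", "W", "a"), ("T1", "R", "b"), ("T1", "W", "b"),
    ("T2", "R", "a"), ("T2", "W", "a"), ("T2", "R", "b"), ("T2", "W", "b"),
]
serial_t2_t1 = serial_t1_t2[4:] + serial_t1_t2[:4]

print(view_equivalent(interleaved, serial_t1_t2))
print(view_equivalent(interleaved, serial_t2_t1))
```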

2. Conflict Serializability in DBMS: A schedule is conflict serializable if it can be transformed into a serial
schedule by swapping adjacent non-conflicting operations, so that the order of the conflicting operations,
and therefore the result, is unchanged. Two operations conflict if they belong to different transactions,
access the same data item, and at least one of them is a write. Example of Conflict Serializability:
Transaction 1 Transaction 2 Transaction 3

R(x)

R(y)

R(x)

R(y)

R(z)

W(y)

W(z)

R(z)

W(x)

W(z)
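
A common way to test conflict serializability is to build a precedence graph, with an edge Ti → Tj for every pair of conflicting operations in which Ti's operation comes first, and check it for cycles. The sketch below assumes a schedule is a list of (transaction, operation, item) tuples; the two sample schedules are hypothetical and are not the schedule in the table above, whose column layout is not preserved.

```python
def conflict_serial_order(schedule):
    """Return an equivalent serial order of transactions, or None if there is none."""
    edges = {}
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if ti != tj and item_i == item_j and "W" in (op_i, op_j):
                edges.setdefault(ti, set()).add(tj)

    nodes = {txn for txn, _, _ in schedule}
    indegree = {n: 0 for n in nodes}
    for targets in edges.values():
        for t in targets:
            indegree[t] += 1

    # Topological sort: it succeeds only if the precedence graph has no cycle.
    ready = sorted(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        node = ready.pop(0)
        order.append(node)
        for t in sorted(edges.get(node, ())):
            indegree[t] -= 1
            if indegree[t] == 0:
                ready.append(t)
    return order if len(order) == len(nodes) else None


good = [("T1", "R", "x"), ("T1", "W", "x"), ("T2", "R", "x"), ("T2", "W", "x"),
        ("T1", "R", "y"), ("T1", "W", "y"), ("T2", "R", "y"), ("T2", "W", "y")]
bad = [("T1", "R", "x"), ("T2", "W", "x"), ("T1", "W", "x")]

print(conflict_serial_order(good))
print(conflict_serial_order(bad))
```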

Failure Classification in DBMS


A failure of a database can be defined as its inability to execute a specified transaction, or a loss of
data from the database. Many things can cause database failures, such as network failure,
system crash, natural disasters, carelessness, software errors, etc. Failures in a DBMS can be
classified as:

Transaction Failure:
If a transaction is not able to execute or it comes to a point from where the transaction becomes incapable
of executing further then it is termed as a failure in a transaction.
Reason for a transaction failure in DBMS:
1. Logical error: A logical error occurs if a transaction is unable to execute because of some mistakes
in the code or due to the presence of some internal faults.
2. System error: Where the termination of an active transaction is done by the database system itself
due to some system issue or because the database management system is unable to proceed with
the transaction. For example- The system ends an operating transaction if it reaches a deadlock
condition or if there is an unavailability of resources.
System Crash:
A system crash usually occurs when there is some sort of hardware or software breakdown. Other problems
that cause the system to stop abruptly or eventually crash include failure of a transaction, operating
system errors, power cuts, and main memory crashes.
These types of failures are often termed soft failures and are responsible for the loss of data in volatile
memory. It is assumed that a system crash does not have any effect on the data stored in non-volatile
storage.
Data-transfer Failure:
When a disk failure occurs during a data-transfer operation and results in loss of content from disk storage,
the failure is categorized as a data-transfer failure. Other reasons for disk failures include disk head
crashes, disk unreachability, formation of bad sectors, and read-write errors on the disk.
To recover quickly from a disk failure caused by a data-transfer operation, the backup copy of the data
stored on other tapes or disks can be used.

Recovery and Atomicity


When a system crashes, it may have several transactions being executed and various files opened for them
to modify the data items. Transactions are made of various operations, which are atomic in nature. But
according to ACID properties of DBMS, atomicity of transactions as a whole must be maintained, that is,
either all the operations are executed or none.
When a DBMS recovers from a crash, it should do the following:
• It should check the states of all the transactions that were being executed.
• A transaction may be in the middle of some operation; the DBMS must ensure the atomicity of the
transaction in this case.
• It should check whether the transaction can be completed now or needs to be rolled back.
• No transaction should be allowed to leave the DBMS in an inconsistent state.
Two techniques can help a DBMS recover while maintaining the atomicity of a transaction:
• Maintaining a log of each transaction, and writing it to stable storage before actually modifying the
database.
• Maintaining shadow paging, where the changes are made on a separate copy of the data pages, and the
actual database is updated later.
Log-based Recovery
A log is a sequence of records that captures the actions performed by transactions. Log-based recovery is a
widely used approach in database management systems to recover from system failures and maintain the
atomicity and durability of transactions. The fundamental idea is to keep a log of all changes made to the
database, so that after a failure the system can use the log to restore the database to a consistent state.

Log Based Recovery

What is log-based recovery in DBMS?


1. As the name suggests, a log is a sequence of records, maintained on a stable storage device, that notes
down all the changes made by transactions in sequential order. This log is used to recover a
transaction in case of failure.
2. Any operation performed by a transaction on the database is recorded in the log.
3. It is important to write the log record before the actual operation is performed on the database; this
ensures that if an operation fails, it is already recorded in the log.
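
The log-before-write rule in point 3 can be sketched as follows. This is a minimal illustration under stated
assumptions: the class name LoggedDatabase, its methods, and the tuple layout of the log records are all
hypothetical, and a Python list stands in for the log on stable storage.

```python
class LoggedDatabase:
    """Toy database that records every change in a log before applying it."""

    def __init__(self, data):
        self.data = dict(data)  # current database contents
        self.log = []           # stand-in for the log kept on stable storage

    def start(self, txn):
        self.log.append((txn, "Start"))

    def write(self, txn, item, new_value):
        old_value = self.data.get(item)
        # Record the change first, then apply it to the database.
        self.log.append((txn, "Write", item, old_value, new_value))
        self.data[item] = new_value

    def commit(self, txn):
        self.log.append((txn, "Commit"))


db = LoggedDatabase({"A": 100})
db.start("T1")
db.write("T1", "A", 80)
db.commit("T1")
print(db.log)
```

Because the write record holds both the old and the new value, the same log can later be used either to
undo the change or to redo it.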

Logs for different database modification approaches


There are two database modification approaches used by the transactions. Here we will learn how the logs
are maintained for each approach:

1. Deferred Database Modification


In this approach, the transaction does not apply its changes to the database until it has completed
successfully.
In this approach, all the log records are created and stored before any change reaches the database.

2. Immediate Database Modification


In this approach, the transaction applies each change to the database immediately after the operation is
performed.
In this approach, the log record is written just before the transaction performs each operation on the
database.
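
The difference between the two approaches can be sketched as below. This is an illustrative sketch only:
the class names DeferredTxn and ImmediateTxn, the dictionary used as the database, and the log record
layout are assumptions, not something the notes prescribe.

```python
class DeferredTxn:
    """Deferred modification: writes are logged but held back until commit."""

    def __init__(self, name, db, log):
        self.name, self.db, self.log = name, db, log
        self.pending = {}
        log.append((name, "Start"))

    def write(self, item, value):
        self.log.append((self.name, "Write", item, value))  # new value only
        self.pending[item] = value                           # database untouched

    def commit(self):
        self.log.append((self.name, "Commit"))
        self.db.update(self.pending)                         # apply deferred writes


class ImmediateTxn:
    """Immediate modification: each write is logged, then applied at once."""

    def __init__(self, name, db, log):
        self.name, self.db, self.log = name, db, log
        log.append((name, "Start"))

    def write(self, item, value):
        # Log the old value too, so the change can be undone after a failure.
        self.log.append((self.name, "Write", item, self.db.get(item), value))
        self.db[item] = value

    def commit(self):
        self.log.append((self.name, "Commit"))


deferred_db, immediate_db = {"A": 100}, {"A": 100}
t1 = DeferredTxn("T1", deferred_db, [])
t2 = ImmediateTxn("T2", immediate_db, [])
t1.write("A", 80)
t2.write("A", 80)
before_commit = (deferred_db["A"], immediate_db["A"])  # (100, 80)
t1.commit()
t2.commit()
after_commit = (deferred_db["A"], immediate_db["A"])   # (80, 80)
```

Before commit, only the immediate transaction's change is visible in its database; after commit both
databases hold the new value.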
How Log-Based Recovery Works
1. Transaction Logging
2. Writing to the Log
3. Checkpointing
4. Recovery Process
5. Commit/Rollback
Recovery using Log Records
In case of a transaction failure, the log is consulted to recover the transaction by either rolling back or
redoing all the changes it made.
• If the log contains the entries <Tn, Start> and <Tn, Commit>, or <Tn, Start> and <Tn, Abort>, then the
transaction Tn needs to be redone based on the log entries recorded for each of its operations.
• If the log contains the entry <Tn, Start> but no entry for <Tn, Commit> or <Tn, Abort>, then the
transaction needs to be rolled back.
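
The two rules above can be sketched as a small recovery routine. This is a simplified illustration under
assumptions: the function name recover and the record layout (transaction, action, item, old value, new
value) are hypothetical, and the sample log contains only Start, Write, and Commit records, so the Abort
case is left out.

```python
def recover(log, db):
    """Undo incomplete transactions and redo committed ones, using the log."""
    started = {rec[0] for rec in log if rec[1] == "Start"}
    committed = {rec[0] for rec in log if rec[1] == "Commit"}
    redo, undo = started & committed, started - committed

    # Undo: scan the log backward and restore the old value of each write.
    for rec in reversed(log):
        if rec[1] == "Write" and rec[0] in undo:
            _, _, item, old_value, _ = rec
            db[item] = old_value
    # Redo: scan the log forward and reapply the new value of each write.
    for rec in log:
        if rec[1] == "Write" and rec[0] in redo:
            _, _, item, _, new_value = rec
            db[item] = new_value
    return redo, undo


log = [("T1", "Start"), ("T1", "Write", "A", 100, 80), ("T1", "Commit"),
       ("T2", "Start"), ("T2", "Write", "B", 50, 70)]
# State found after the crash: T1's write was lost, T2's write reached the disk.
db = {"A": 100, "B": 70}
redo, undo = recover(log, db)
print(redo, undo, db)
```

Here T1 has both a Start and a Commit record, so its write is redone, while T2 has only a Start record, so
its write is rolled back.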
Benefits of Log-Based Recovery:
• Atomicity: Guarantees that even if a system fails in the middle of a transaction, the transaction can
be rolled back using the log.
• Durability: Ensures that once a transaction is committed, its effects are permanent and can be
reconstructed even after a system failure.
• Efficiency: Since logging typically involves sequential writes, it is generally faster than random-
access writes to a database.

Recovery with concurrent Transaction


When concurrency control allows multiple transactions to execute at the same time, their log records
become interleaved. Because the transactions' results can depend on their order of execution, that order
must be maintained. During recovery, it would be very difficult for the recovery system to backtrack through
all the logs and then start recovering. Recovery with concurrent transactions can be done in the following
four ways.
1. Interaction with concurrency control
2. Transaction rollback
3. Checkpoints
4. Restart recovery

1. Interaction with concurrency control: In this scheme, the recovery scheme depends greatly on the
concurrency control scheme that is used. So, to roll back a failed transaction, we must undo the updates
performed by the transaction.
2. Transaction rollback:
• In this scheme, we roll back a failed transaction by using the log.
• The system scans the log backward for the failed transaction; for every log record of that transaction
found in the log, the system restores the data item to its old value.
3. Checkpoints:
• Checkpointing is the process of saving a snapshot of the application's state so that it can restart from
that point in case of failure.
• A checkpoint is a point in time at which records are written to the database from the buffers.
• A checkpoint shortens the recovery process.
• When execution reaches a checkpoint, the transactions' updates are written to the database, and the
log records up to that point are removed from the log file. The log file is then updated with the
subsequent transaction steps until the next checkpoint, and so on.
• A checkpoint acts like a bookmark.
When more than one transaction is being executed in parallel, the logs are interleaved. At the time of
recovery, it would become hard for the recovery system to backtrack through all the logs and then start
recovering. To ease this situation, most modern DBMSs use the concept of 'checkpoints'.
Keeping and maintaining logs in real time and in a real environment may fill up all the storage space
available in the system. As time passes, the log file may grow too big to be handled at all. A checkpoint is
a mechanism where all the previous logs are removed from the system and stored permanently on a storage
disk. A checkpoint declares a point before which the DBMS was in a consistent state and all the
transactions were committed.
Recovery: When a system with concurrent transactions crashes and recovers, it behaves in the following
manner:

• The recovery system reads the logs backwards from the end to the last checkpoint.
• It maintains two lists, an undo-list and a redo-list.
• If the recovery system sees a log with <Tn, Start> and <Tn, Commit>, or just <Tn, Commit>, it puts
the transaction in the redo-list.
• If the recovery system sees a log with <Tn, Start> but finds no commit or abort log, it puts the
transaction in the undo-list.
All the transactions in the undo-list are then undone and their logs are removed. All the transactions in the
redo-list have their previous logs removed and are then redone before their logs are saved.
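
The backward scan described above can be sketched as follows. This is a simplified illustration under
assumptions: the function name build_lists, the (None, "Checkpoint") marker, and the record layout are
hypothetical. It follows the simplified procedure in these notes; real systems typically also record the
transactions that were active at the checkpoint, which this sketch does not model.

```python
def build_lists(log):
    """Scan the log backward to the last checkpoint and fill the two lists."""
    redo_list, undo_list = [], []
    for record in reversed(log):
        txn, action = record[0], record[1]
        if action == "Checkpoint":
            break  # everything before the last checkpoint is already safe
        if action == "Commit":
            redo_list.append(txn)
        elif action == "Start" and txn not in redo_list:
            # Started after the checkpoint but never committed.
            undo_list.append(txn)
    return redo_list, undo_list


log = [("T1", "Start"), ("T1", "Write", "A", 100, 80), ("T1", "Commit"),
       (None, "Checkpoint"),
       ("T2", "Start"), ("T2", "Write", "B", 50, 70), ("T2", "Commit"),
       ("T3", "Start"), ("T3", "Write", "C", 10, 20)]
redo_list, undo_list = build_lists(log)
print(redo_list, undo_list)  # ['T2'] ['T3']
```

T1 committed before the checkpoint, so the scan never reaches it and it appears in neither list.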
Introduction to Security and Authorization
Database security means keeping sensitive information safe and preventing the loss of data. The security of
the database is controlled by the Database Administrator (DBA).
The following are the main control measures used to provide security of data in databases:
1. Authentication
2. Access control
3. Flow control
4. Database Security applying Statistical Method
5. Encryption
These are explained below:
1. Authentication :
Authentication is the process of confirming that a user logs in only according to the rights provided to
them for performing database activities. A particular user can log in only up to their privilege level and
cannot access other sensitive data.
Authentication tools based on biometrics, such as retina scans and fingerprints, can protect the
database from unauthorized or malicious users.
2. Access Control :
The security mechanism of a DBMS must include provisions for restricting access to the database by
unauthorized users. Access control is done by creating user accounts and having the DBMS control the
login process, so that sensitive data can be accessed only by those people (database users) who are
allowed to access it, and access is denied to unauthorized persons.
The database system must also keep track of all operations performed by a given user throughout the
entire login time.
3. Flow Control :
This prevents information from flowing in a way that it reaches unauthorized users. Pathways that let
information flow implicitly in ways that violate a company's privacy policy are called covert
channels.
4. Database Security applying Statistical Method :
Statistical database security focuses on protecting confidential individual values that are stored for
statistical purposes and used to retrieve summaries of values based on categories. Such databases do not
permit the retrieval of individual information.
For example, a user may access the database to get statistical information about the number of employees
in the company, but not the detailed confidential or personal information about a specific individual
employee.
5. Encryption :
This method is mainly used to protect sensitive data such as credit card numbers, OTP numbers, and other
sensitive numbers. The data is encoded using an encryption algorithm.
An unauthorized user who tries to access the encoded data will have difficulty decoding it, while
authorized users are given decoding keys to decode the data.
Authorization
Authorization is the process where the database manager gets information about the authenticated
user. Part of that information is determining which database operations the user can perform and which
data objects the user can access.
Equivalently, authorization is a set of rules that can be used to determine which user has what type of
access to which portion of the database.
Authorization determines what actions a user, already authenticated, is allowed to perform on the
database and which data they can access. It is a crucial part of database security, ensuring data integrity
and preventing unauthorized access. Authorization follows authentication, where the user's identity is
verified, and grants access based on the user's role and privileges.
Authorization is a privilege provided by the Database Administrator. Users of the database can only view
the contents they are authorized to view.
The different permissions for authorization available are:
• Primary Permission - This is granted to users publicly and directly.
• Secondary Permission - This is granted to groups and automatically awarded to a user if he is a
member of the group.
• Public Permission - This is publicly granted to all the users.
• Context sensitive permission - This is related to sensitive content and only granted to select
users.
The following forms of authorization are permitted on database items:
1. READ: It allows reading of a data object, but not its modification, deletion or insertion.
2. INSERT: It allows insertion of new data, but not the modification of existing data, e.g., insertion of a
tuple in a relation.
3. UPDATE: It allows modification of data, but not its deletion. However, data items such as primary-key
attributes may not be modified.
4. DELETE: It allows deletion of data only.
In addition to these manipulation operations, a user may be granted control operations like:
(i) Add: It allows adding new objects such as new relations.
(ii) Drop: It allows the deletion of relations in a database.
(iii) Alter: It allows the addition of new attributes to a relation or the deletion of existing attributes from
a relation.
The categories of authorization that can be given to users are:
System Administrator - This is the highest administrative authorization for a user. Users with this
authorization can also execute some database administrator commands such as restore or upgrade a
database.
System Control - This is the highest control authorization for a user. This allows maintenance operations
on the database but not direct access to data.
System Maintenance - This is the lower level of system control authority. It also allows users to maintain
the database but within a database manager instance.
System Monitor - Using this authority, the user can monitor the database and take snapshots of it.
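
How direct, group, and public permissions combine with the READ, INSERT, UPDATE, and DELETE forms of
authorization can be sketched as a simple lookup. This is only an illustration: every user, group, and
relation name below, and the function is_authorized itself, are hypothetical and do not come from any
particular DBMS.

```python
# Hypothetical grants, for illustration only.
PUBLIC_GRANTS = {("holidays", "READ")}                      # granted to all users
GROUP_GRANTS = {"clerks": {("orders", "READ"), ("orders", "INSERT")}}
DIRECT_GRANTS = {"asha": {("orders", "UPDATE")}}            # granted to one user
MEMBERSHIP = {"asha": {"clerks"}, "ravi": set()}

def is_authorized(user, relation, operation):
    """Return True if the user holds the privilege directly, via a group, or publicly."""
    wanted = (relation, operation)
    if wanted in PUBLIC_GRANTS:
        return True
    if wanted in DIRECT_GRANTS.get(user, set()):
        return True
    return any(wanted in GROUP_GRANTS.get(group, set())
               for group in MEMBERSHIP.get(user, set()))

print(is_authorized("asha", "orders", "INSERT"))  # True, through the clerks group
print(is_authorized("asha", "orders", "DELETE"))  # False, never granted
print(is_authorized("ravi", "holidays", "READ"))  # True, public permission
```

Keeping the check separate from the data access itself mirrors the order described above: the user is
authenticated first, and each requested operation is then tested against the granted privileges.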
