DBMS Unit 3
Functional Dependency
In relational database management, a functional dependency is a constraint between two sets of
attributes in which the value of one set determines the value of the other. It is denoted as X → Y,
where the attribute set on the left side of the arrow, X, is called the determinant, and Y is called
the dependent.
Types of Functional Dependencies in DBMS
1. Trivial functional dependency
2. Non-Trivial functional dependency
3. Multivalued functional dependency
4. Transitive functional dependency
5. Fully functional dependency
6. Partial functional dependency
1. Trivial Functional Dependency
In a trivial functional dependency, the dependent is always a subset of the determinant, i.e. if X → Y
and Y is a subset of X, then X → Y is called a trivial functional dependency.
Symbolically: A → B is a trivial functional dependency if B is a subset of A.
The following dependencies are also trivial: A → A and B → B.
Example 1 :
ABC -> AB
ABC -> A
ABC -> ABC
Example 2:
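In a relation with the attributes roll_no, name, and age, {roll_no, name} → name is a trivial functional
dependency, since the dependent name is a subset of the determinant set {roll_no, name}. Similarly,
roll_no → roll_no is also an example of a trivial functional dependency.
2. Non-Trivial Functional Dependency
In a non-trivial functional dependency, the dependent is not a subset of the determinant, i.e. if X → Y
and Y is not a subset of X, then X → Y is called a non-trivial functional dependency.
Example:
Here, roll_no → name is a non-trivial functional dependency, since the dependent name is not a subset of
the determinant roll_no. Similarly, {roll_no, name} → age is also a non-trivial functional dependency,
since age is not a subset of {roll_no, name}.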
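Whether a dependency holds in a given table, and whether it is trivial, can be checked mechanically. The following is a minimal Python sketch, not a definitive implementation: the function names `holds` and `is_trivial` and the sample `students` rows are assumptions made for illustration, since the original table is not shown in these notes.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are lists of attribute names.
    The dependency holds when equal lhs values always come with equal rhs values.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores val the first time a key is seen,
        # and returns the stored value on later occurrences.
        if seen.setdefault(key, val) != val:
            return False
    return True


def is_trivial(lhs, rhs):
    """A dependency is trivial when the dependent is a subset of the determinant."""
    return set(rhs) <= set(lhs)


# Hypothetical sample data (not from the original notes).
students = [
    {"roll_no": 1, "name": "Asha", "age": 19},
    {"roll_no": 2, "name": "Ravi", "age": 20},
    {"roll_no": 3, "name": "Asha", "age": 21},
]
```

With this data, `holds(students, ["roll_no"], ["name"])` is true, while `holds(students, ["name"], ["age"])` is false because two students named Asha have different ages.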
3. Multivalued Functional Dependency
In a multivalued dependency, the attributes of the dependent set are not dependent on each other, i.e.
if a determines a set of values of b and a set of values of c, and there exists no functional dependency
between b and c, then it is called a multivalued dependency. It is usually written a →→ b and a →→ c.
Example:
In this table:
X: bike_model
Y: color
Z: manuf_year
For each bike model (bike_model):
1. There is a group of colors (color) and a group of manufacturing years (manuf_year).
2. The colors do not depend on the manufacturing year, and the manufacturing year does not depend
on the colors. They are independent.
3. The sets of color and manuf_year are linked only to bike_model.
That’s what makes it a multivalued dependency.
In this case these two columns are said to be multivalued dependent on bike_model.
4. Transitive Functional Dependency
In a transitive functional dependency, the dependent is indirectly dependent on the determinant, i.e.
if a → b and b → c, then according to the axiom of transitivity, a → c. This is a transitive functional
dependency.
Example:
Here, enrol_no → dept and dept → building_no. Hence, according to the axiom of transitivity, enrol_no →
building_no is a valid functional dependency. This is an indirect functional dependency, hence called
Transitive functional dependency.
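The transitive step can be reproduced by computing an attribute closure, i.e. the set of all attributes that a given set determines under the known dependencies. The sketch below is one common way to do this in Python; the function name `closure` and the representation of each dependency as a pair of sets are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies fds.

    fds is a list of (determinant, dependent) pairs, each a set of names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know all of lhs, we also know all of rhs.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# The two dependencies from the example above.
fds = [({"enrol_no"}, {"dept"}), ({"dept"}, {"building_no"})]
```

Here `closure({"enrol_no"}, fds)` contains building_no, which is exactly the transitive dependency enrol_no → building_no.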
5. Fully Functional Dependency
In a full functional dependency, an attribute or set of attributes Y depends on the whole of the
determinant X and not on any proper subset of X. If a relation R has attributes X, Y, Z with the
dependencies X → Y and X → Z, those dependencies are fully functional, because the single attribute X
has no smaller part on which Y or Z could depend.
6. Partial Functional Dependency
In a partial functional dependency, a non-key attribute depends on only a part of the composite key
rather than on the whole key. For example, if a relation R has attributes X, Y, Z, where {X, Y} is the
composite key and Z is a non-key attribute, then X → Z is a partial functional dependency in an RDBMS.
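The difference between a full and a partial dependency can be tested by asking whether some proper subset of the key already determines the attribute. This is a hedged Python sketch under that definition; `closure` and `is_partial` are illustrative names, and the dependencies are written as pairs of sets by assumption.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes determined by attrs under fds (list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_partial(key, attr, fds):
    """True if attr depends on a proper, non-empty subset of the composite key."""
    key = sorted(key)
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            if attr in closure(subset, fds):
                return True
    return False


partial_fds = [({"X"}, {"Z"})]       # X alone determines Z
full_fds = [({"X", "Y"}, {"Z"})]     # only X and Y together determine Z
```

With the key {X, Y}, `is_partial({"X", "Y"}, "Z", partial_fds)` is true, while the same call with `full_fds` is false, so Z is then fully functionally dependent on the key.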
Normalization
Normalization is the process of organizing the data in a database to minimize redundancy and improve
data consistency, and to avoid insertion, update, and deletion anomalies. In it we divide a table into
two or more tables.
In database design, there are different normal forms based on the keys of a table. The ones illustrated
below are 1NF, 2NF, 3NF, and BCNF.
1 John 555-1234, 555-5678
2 Jane 555-9876
3 Michael 555-5555
This violates 1NF because the Phone Numbers column contains repeating groups, i.e. more than one value
in a single cell.
To normalize this table to 1NF, we can split the Phone Numbers column into separate rows and add a
separate primary key column.
1 John 555-1234
1 John 555-5678
2 Jane 555-9876
3 Michael 555-5555
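The 1NF step of splitting a multi-valued cell into separate rows can be sketched in Python as follows. This is an illustration only: the name `to_1nf` is made up, and the unnormalized row for John is an assumption inferred from the two John rows in the 1NF result above.

```python
# Unnormalized rows: (id, name, phone numbers in one cell).
# John's combined cell is assumed from the 1NF rows shown above.
unnormalized = [
    (1, "John", "555-1234, 555-5678"),
    (2, "Jane", "555-9876"),
    (3, "Michael", "555-5555"),
]


def to_1nf(rows):
    """Emit one row per phone number so every cell holds a single value."""
    flat = []
    for cust_id, name, phones in rows:
        for phone in phones.split(","):
            flat.append((cust_id, name, phone.strip()))
    return flat
```

Calling `to_1nf(unnormalized)` yields the four rows listed above, one per phone number.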
1 1 John 1 Shirt 2
1 1 John 2 Pants 1
2 2 Jane 1 Shirt 1
2 2 Jane 3 Hat 3
This violates 2NF because the Customer Name column depends on only part of the primary key (Customer
ID). To normalize this table to 2NF, we can split it into two tables.
Order ID Customer ID Customer Name Customer City Order Date Order Total
Table 1: Customers
Table 2: Orders
Now, the "Customer City" column is no longer transitively dependent on the primary key and is instead in a
separate table that has a direct relationship with the primary key.
Table 2: Books
Book ID Title Author ID
1 Crime and Punishment 100
2 The Great Gatsby 101
3 Pride and Prejudice 102
Now, the "Author Name" and "Author Nationality" columns are not transitively dependent on the
primary key, and the table is in BCNF.
Advantages of Normalization
Reduced Data Redundancy
Improved Data Consistency
Simplified Database Maintenance
Improved Query Performance
Lossless Decomposition
Lossless Decomposition in DBMS refers to breaking a relation (table) into two or more sub-relations in
such a way that no information is lost when the sub-relations are joined back together.
When a relation is not in an appropriate normal form, decomposition of the relation is required.
Breaking a table into multiple tables is known as decomposition. It is done to eliminate redundancy
and inconsistency. Decomposition is characterized by two properties: lossless join decomposition
and dependency preservation.
In short, the original relation can be obtained by using joins on the decomposed relations. Here the original
data is preserved and it is ensured that the original data and the data after reconstruction should be the
same.
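Criteria of Lossless Join Decomposition in DBMS
For lossless join decomposition of R into R1 and R2, we select a common attribute and check the
following conditions:
1. The union of the attributes of R1 and R2 must contain all the attributes of R.
2. R1 and R2 must have at least one attribute in common.
3. The common attribute must be a key of at least one of R1 and R2.
Attributes in DBMS are the descriptive properties which describe an entity.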
Example 1: Consider the following relations- R = (D, E, F)
R1 = (D, E)
R2 = (E, F)
The relation R has 3 attributes D, E, and F. The relation R is decomposed into two relations Relation-1 and
Relation-2. Relation-1 and Relation-2 both have two attributes. Both have a common attribute 'E'.
Also, it is important to remember that the values in the common column E should be unique in at least
one of the sub-relations, i.e. E should act as a key there. If E holds duplicate values, the join can
produce extra (spurious) rows, and the decomposition is then not lossless.
Now, let us draw a table of relation R with raw data:
R = (D, E, F)
D E F
78 19 16
39 76 91
78 29 44
It is decomposed as follows-
R1(D, E)
D E
78 19
39 76
78 29
R2(E, F)
E F
19 16
76 91
29 44
Let us check the first condition: the union of the attributes of the sub-relations R1 and R2 must
contain all the attributes that were in the original relation R before decomposition.
Here R1 ∪ R2 = {D, E, F}, which is the attribute set of R. Joining R1 and R2 on the common attribute E
gives:
D E F
78 19 16
39 76 91
78 29 44
The relation obtained above is the same as the original relation R. We can say that it is an example
of Lossless-join decomposition.
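The check performed above (project R onto the two attribute sets, join the projections on the common attribute, compare with R) can be sketched in Python. The helper names `project`, `natural_join`, and `is_lossless` are assumptions made for illustration; the data is the relation R from the example.

```python
def project(rows, attrs):
    """Projection onto attrs, with duplicate rows removed."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in sorted(seen)]


def natural_join(r1, r2):
    """Join two non-empty relations on the attributes they share."""
    common = [a for a in r1[0] if a in r2[0]]
    joined = []
    for t1 in r1:
        for t2 in r2:
            if all(t1[a] == t2[a] for a in common):
                joined.append({**t1, **t2})
    return joined


def as_set(rows):
    """Order-independent form of a relation, for comparison."""
    return {tuple(sorted(row.items())) for row in rows}


def is_lossless(rows, attrs1, attrs2):
    """True if joining the two projections gives back exactly the original rows."""
    joined = natural_join(project(rows, attrs1), project(rows, attrs2))
    return as_set(joined) == as_set(rows)


R = [
    {"D": 78, "E": 19, "F": 16},
    {"D": 39, "E": 76, "F": 91},
    {"D": 78, "E": 29, "F": 44},
]
```

`is_lossless(R, ["D", "E"], ["E", "F"])` is true, matching the example. Decomposing on D instead, with `is_lossless(R, ["D", "E"], ["D", "F"])`, is false: D holds the duplicate value 78, so the join produces spurious rows.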
Transaction Management
In a Database Management System (DBMS), a transaction is a sequence of operations performed as a
single logical unit of work. These operations may involve reading, writing, updating, or deleting data in the
database. A transaction is considered complete only if all its operations are successfully executed;
otherwise the transaction must be rolled back, so that the database remains in a consistent state.
Operations of Transaction:
A user can make different types of requests to access and modify the contents of a database. So, we have
different types of operations relating to a transaction. They are discussed as follows:
1) Read(X)
Example: For a banking system, when a user checks their balance, a Read operation is performed on
their account balance:
SELECT balance FROM accounts WHERE account_id = 'A123';
This retrieves the current balance of the user's account without changing it.
2) Write(X)
Example: For the banking system, if a user withdraws money, a Write operation is performed after the
balance is updated:
UPDATE accounts SET balance = balance - 100 WHERE account_id = 'A123';
This updates the balance of the user’s account after withdrawal.
3) Commit
Example: After a successful money transfer in a banking system, a Commit operation finalizes the
transaction:
COMMIT;
Once the transaction is committed, the changes to the database are permanent, and the transaction is
considered successful.
4) Rollback
Example: Suppose during the money transfer process, the system encounters an issue, like insufficient
funds in the sender’s account. In that case, the transaction is rolled back:
ROLLBACK;
This will undo all the operations performed so far and ensure that the database remains consistent.
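The four operations can be seen working together in a small money-transfer sketch using Python's built-in `sqlite3` module. It is a minimal illustration under stated assumptions, not a production design: the second account `B456`, the starting balances, and the helper names `open_bank`, `transfer`, and `get_balance` are all hypothetical.

```python
import sqlite3


def open_bank():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (account_id TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A123", 500), ("B456", 100)])
    conn.commit()
    return conn


def get_balance(conn, account_id):
    """Read(X): fetch the balance without changing it."""
    row = conn.execute(
        "SELECT balance FROM accounts WHERE account_id = ?", (account_id,)
    ).fetchone()
    return row[0]


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return True if committed."""
    try:
        # "with conn" commits if the block finishes, and rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE account_id = ?",
                (amount, src),
            )
            if get_balance(conn, src) < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE account_id = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False
```

A transfer of 100 from A123 to B456 commits and changes both balances. A transfer of 1000 fails the funds check, so the first UPDATE is rolled back and both balances stay as they were.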
Transaction States in DBMS
A transaction in a DBMS refers to a series of operations that are executed together as a single unit to
perform a task, such as transferring money between accounts in a banking system. A transaction state
refers to the current phase or condition of a transaction during its execution in a database. It represents
the progress of the transaction and determines whether it will successfully complete (commit) or fail
(abort).
A transaction goes through several states during its lifetime.
A transaction log is used to keep a record of all the steps taken during the transaction.
A transaction involves two main operations, Read Operation and Write Operation.
Read Operation: Reads data from the database, stores it temporarily in memory (buffer), and uses
it as needed.
Write Operation: Updates the database with the changed data using the buffer.
Note: From the start of executing instructions to the end, Read and Write operations are treated as a
single transaction. This ensures the database remains consistent and reliable throughout the process.
Different Types of Transaction States in DBMS
1. Active State
2. Partially Committed State
3. Committed State
4. Failed State
5. Aborted State
6. Terminated State
1. Active State
It is the first state of any transaction, entered when it begins to execute. The execution of the
transaction takes place in this state.
While the instructions of the transaction are executing, the transaction is in the active state. If
all instructions of the transaction execute, it goes to the "Partially committed state";
otherwise it goes from the active state to the "Failed state".
Operations such as insertion, deletion, or update are performed during this state.
During this state, the data records are being manipulated and are not yet saved to the database;
they remain in a buffer in main memory.
2. Partially Committed
After the execution of all instructions of a transaction, the transaction enters the partially
committed state from the active state.
At this stage, the transaction has finished its final operation, but its changes can still be undone,
because all changes made by the transaction are still stored in a buffer in main memory and are not
yet saved to the database.
If the changes are made permanent in the database, the state changes to the “committed state”; in case
of failure it goes to the “failed state”.
3. Committed State: A transaction reaches this state when all its operations, including the Commit
operation, have been executed successfully, i.e. the data is saved into the database after the
required manipulations. This marks the successful completion of a transaction.
4. Failed State: If any of the transaction's operations causes an error during the active or partially
committed state, further execution of the transaction is stopped and it is brought into the failed
state. Here, the database recovery system makes sure that the database is in a consistent state.
5. Aborted State: Once a transaction has reached the failed state, the database recovery system rolls
back its changes to restore the database to the consistent state it was in before the transaction
started. After the rollback, the transaction is in the aborted state and is either restarted or cancelled.
6. Terminated State: This is the final state of a transaction, indicating that it has completed its
execution. Once a transaction reaches this state, it has either been successfully committed or aborted,
and the system is ready to execute a new transaction.
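The six states and the moves between them form a small state machine. The Python sketch below encodes the transitions as described in this section; the table `TRANSITIONS`, the lowercase state names, and the function `is_valid_path` are illustrative assumptions.

```python
# Allowed moves between transaction states, as described in this section.
TRANSITIONS = {
    "active": {"partially_committed", "failed"},
    "partially_committed": {"committed", "failed"},
    "failed": {"aborted"},
    "committed": {"terminated"},
    "aborted": {"terminated"},
    "terminated": set(),
}


def is_valid_path(path):
    """True if path starts in 'active' and every step is an allowed transition."""
    if not path or path[0] != "active":
        return False
    return all(
        nxt in TRANSITIONS.get(cur, set())
        for cur, nxt in zip(path, path[1:])
    )
```

A successful transaction follows active, partially_committed, committed, terminated. A path such as active, committed is rejected because a transaction cannot commit without first being partially committed.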
Deadlock in DBMS
Deadlock Avoidance: It is always better to avoid a deadlock than to restart or abort once the database
is stuck in one. The deadlock avoidance method is suitable for smaller databases, whereas the deadlock
prevention method is suitable for larger databases.
In the example given above, transactions that access Students and Grades should always access the tables
in the same order. In this way, in the scenario described above, transaction T1 simply waits for
transaction T2 to release the lock on Grades before it begins. When transaction T2 releases the lock,
transaction T1 can proceed freely.
Deadlock Detection: When a transaction waits indefinitely to obtain a lock, the database management
system should detect whether the transaction is involved in a deadlock or not.
The wait-for graph is one of the methods for detecting a deadlock. This method is suitable for
smaller databases. In this method, a graph is drawn based on the transactions and their locks on
resources: each transaction is a node, and an edge from one transaction to another means the first is
waiting for a lock held by the second. If the graph has a closed loop (a cycle), then there is a deadlock.
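Finding a cycle in a wait-for graph is a standard depth-first search. The following Python sketch shows one way to do it; the function name `has_deadlock` and the representation of the graph as a dict from each transaction to the set of transactions it waits for are assumptions made for illustration.

```python
def has_deadlock(wait_for):
    """Return True if the wait-for graph contains a cycle.

    wait_for maps each transaction to the set of transactions it is waiting on.
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color = {}

    def visit(node):
        color[node] = GREY
        for nxt in wait_for.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:
                return True        # reached a node already on the current path
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(wait_for))
```

For instance, `has_deadlock({"T1": {"T2"}, "T2": {"T1"}})` is true, because T1 waits for T2 and T2 waits for T1.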
For the above-mentioned scenario, the Wait-For graph is drawn below:
Deadlock prevention: For a large database, the deadlock prevention method is suitable. A deadlock
can be prevented if the resources are allocated in such a way that a deadlock never occurs.
The DBMS analyzes whether the operations can create a deadlock situation; if they can, that
transaction is never allowed to execute. Deadlock prevention mechanisms propose two schemes:
• Wait-Die Scheme: In this scheme, if a transaction requests a resource that is locked by another
transaction, the DBMS checks the timestamps of both transactions. If the requesting
transaction is older, it is allowed to wait until the resource is available; if it is younger,
it is rolled back (it dies) and restarted later with the same timestamp.
• Wound-Wait Scheme: In this scheme, if an older transaction requests a resource held by a
younger transaction, the older transaction forces the younger one to abort and release the
resource. The younger transaction is restarted after a short delay but with the same
timestamp. If a younger transaction requests a resource that is held by an older one, the
younger transaction is asked to wait until the older one releases it.
The following table lists the differences between the Wait-Die and Wound-Wait prevention schemes:
Wait-Die | Wound-Wait
Older transactions must wait for the younger one to release its data items. | Older transactions never wait for younger transactions.
The number of aborts and rollbacks is higher in this technique. | The number of aborts and rollbacks is lower.
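Both schemes reduce to a comparison of two timestamps. The Python sketch below states each rule as a function, assuming that a smaller timestamp means an older transaction; the function names and the returned strings are illustrative choices.

```python
def wait_die(requester_ts, holder_ts):
    """Wait-Die: an older requester waits, a younger requester is rolled back."""
    if requester_ts < holder_ts:       # requester is older
        return "wait"
    return "abort requester"


def wound_wait(requester_ts, holder_ts):
    """Wound-Wait: an older requester aborts the holder, a younger one waits."""
    if requester_ts < holder_ts:       # requester is older
        return "abort holder"
    return "wait"
```

With timestamps 1 (older) and 2 (younger), `wait_die(1, 2)` returns "wait" and `wound_wait(1, 2)` returns "abort holder". In both schemes it is always the younger transaction that is aborted.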
Applications:
• Delayed Transactions: Deadlocks can cause transactions to be delayed, as the resources they need
are being held by other transactions. This can lead to slower response times and longer wait times
for users.
• Lost Transactions: In some cases, deadlocks can cause transactions to be lost or aborted, which can
result in data inconsistencies or other issues.
• Reduced User Satisfaction: Deadlocks can lead to a perception of poor system performance and
can reduce user satisfaction with the application. This can have a negative impact on user adoption
and retention.
Disadvantages:
• System downtime: Deadlock can cause system downtime, which can result in loss of productivity
and revenue for businesses that rely on the DBMS.
• Resource waste: When transactions are waiting for resources, these resources are not being used,
leading to wasted resources and decreased system efficiency.
• Reduced concurrency: Deadlock can lead to a decrease in system concurrency, which can result in
slower transaction processing and reduced throughput.
• Complex resolution: Resolving deadlock can be a complex and time-consuming process.
Serializability in DBMS
Serializability in DBMS guarantees that the execution of multiple transactions in parallel does not produce
any unexpected or incorrect results. This is accomplished by enforcing a set of rules that ensure that each
transaction is executed as if it were the only transaction running in the system.
What is a Schedule in DBMS: When several transactions run concurrently, the order in which their
instructions are executed is known as a schedule. A schedule can include multiple transactions running
concurrently, each performing a series of read and write operations on the database. The order of
execution of these operations can significantly impact the final state of the database and the correctness
of the results. A schedule is of 2 types:
1. Serial Schedule in DBMS: A serial schedule is a type of schedule where each transaction is
executed completely before the next transaction begins, i.e. only when the first transaction
completes its cycle is the next transaction executed. This is the simplest kind of schedule. Because
transactions only execute one after the other, serial schedules are always serializable. Example
of Serial Schedule:
Transaction 1 Transaction 2
R(x)
W(x)
R(y)
W(y)
R(y)
W(y)
R(x)
W(x)
2. Non-Serial Schedule in DBMS: The Non-Serial schedule is a type of schedule where transactions
are executed concurrently, with some overlap in time. Unlike a serial schedule, where transactions are
executed one after the other with no overlap, a non-serial schedule allows transactions to execute
simultaneously. Example of Non-Serial Schedule:
Transaction 1 Transaction 2
R(x)
W(x)
R(y)
W(y)
R(y)
R(x)
W(y)
W(x)
Transaction 1 Transaction 2
R(a)
W(a)
R(a)
W(a)
R(b)
W(b)
R(b)
W(b)
Let’s swap the read-write operations in the middle of the two transactions to create the view equivalent
schedule.
Transaction 1 Transaction 2
R(a)
W(a)
R(b)
W(b)
R(a)
W(a)
R(b)
W(b)
2. Conflict Serializability in DBMS: A schedule is conflict serializable if it can be transformed into
a serial schedule by swapping adjacent non-conflicting operations, so that the order of the conflicting
operations does not change the results of the transactions. Two operations conflict if they belong to
different transactions, access the same data item, and at least one of them is a write.
Example of Conflict Serializability:
Transaction 1 Transaction 2 Transaction 3
R(x)
R(y)
R(x)
R(y)
R(z)
W(y)
W(z)
R(z)
W(x)
W(z)
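Conflict serializability is commonly tested with a precedence graph: add an edge Ti → Tj whenever an operation of Ti conflicts with a later operation of Tj, then check that the graph has no cycle. The Python sketch below follows that approach. The function name, the triple format `(transaction, operation, item)`, and the two sample schedules are assumptions; they are not the schedule from the table above, whose column layout is not recoverable here.

```python
def conflict_serializable(schedule):
    """schedule is a list of (transaction, op, item) with op 'R' or 'W'."""
    # Build the precedence graph from every pair of conflicting operations.
    edges = {}
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            if ti != tj and item_i == item_j and "W" in (op_i, op_j):
                edges.setdefault(ti, set()).add(tj)

    # Repeatedly remove transactions that have no incoming edge.
    # If none can be removed while some remain, the graph has a cycle.
    nodes = {t for t, _, _ in schedule}
    while nodes:
        free = {n for n in nodes
                if not any(n in edges.get(m, ()) for m in nodes)}
        if not free:
            return False
        nodes -= free
    return True


# Hypothetical schedules for illustration.
ok_schedule = [("T1", "R", "x"), ("T1", "W", "x"), ("T2", "R", "x"), ("T2", "W", "x")]
bad_schedule = [("T1", "R", "x"), ("T2", "W", "x"), ("T1", "W", "x")]
```

`ok_schedule` gives only the edge T1 → T2 and is conflict serializable. `bad_schedule` gives both T1 → T2 and T2 → T1, which is a cycle, so it is not.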
Transaction Failure:
If a transaction is not able to execute, or reaches a point from which it cannot proceed any further,
this is termed a transaction failure.
Reasons for a transaction failure in DBMS:
1. Logical error: A logical error occurs if a transaction is unable to execute because of mistakes
in the code or because of some internal fault.
2. System error: The database system itself terminates an active transaction because of some system
issue or because the database management system is unable to proceed with the transaction.
For example, the system terminates an active transaction if it reaches a deadlock condition or
if resources are unavailable.
System Crash:
A system crash usually occurs when there is some sort of hardware or software breakdown. Some other
problems which are external to the system and cause the system to abruptly stop or eventually crash
include failure of the transaction, operating system errors, power cuts, main memory crash, etc.
These types of failures are often termed soft failures and are responsible for the data losses in the volatile
memory. It is assumed that a system crash does not have any effect on the data stored in the non-volatile
storage.
Data-transfer Failure:
When a disk failure occurs during a data-transfer operation and results in loss of content from disk
storage, such failures are categorized as data-transfer failures. Other reasons for disk failures
include disk head crash, disk unreachability, formation of bad sectors, read-write errors on the disk, etc.
In order to quickly recover from a disk failure caused during a data-transfer operation, the backup
copy of the data stored on other tapes or disks can be used.
1. Interaction with concurrency control: The recovery scheme depends greatly on the concurrency
control scheme that is used. To roll back a failed transaction, we must undo the updates
performed by the transaction.
2. Transaction rollback:
• In this scheme, we roll back a failed transaction by using the log.
• The system scans the log backward; for every log record of the failed transaction that it finds,
the system restores the data item to its old value.
3. Checkpoints:
• Checkpointing is the process of saving a snapshot of the application's state so that it can restart
from that point in case of failure.
• A checkpoint is a point in time at which records are written to the database from the buffers.
• Checkpoints shorten the recovery process.
• When a checkpoint is reached, the transactions up to that point are written to the database, and
the log records up to that point are removed from the log file. The log file is then updated with
the new steps of transactions until the next checkpoint, and so on.
• A checkpoint acts like a bookmark.
When more than one transaction is being executed in parallel, the logs are interleaved. At the time of
recovery, it would be hard for the recovery system to backtrack through all the logs and then start
recovering. To ease this situation, most modern DBMSs use the concept of 'checkpoints'.
Keeping and maintaining logs in real time and in a real environment may fill up all the memory space
available in the system. As time passes, the log file may grow too big to be handled at all. A checkpoint
is a mechanism where all the previous logs are removed from the system and stored permanently on a
storage disk. A checkpoint declares a point before which the DBMS was in a consistent state and all the
transactions were committed.
Recovery: When a system with concurrent transactions crashes and recovers, it behaves in the following
manner:
• The recovery system reads the logs backwards from the end to the last checkpoint.
• It maintains two lists, an undo-list and a redo-list.
• If the recovery system sees a log with <Tn, Start> and <Tn, Commit>, or just <Tn, Commit>, it puts
the transaction in the redo-list.
• If the recovery system sees a log with <Tn, Start> but no commit or abort log, it puts the
transaction in the undo-list.
All the transactions in the undo-list are then undone and their logs are removed. For all the
transactions in the redo-list, their previous logs are removed and they are redone before their logs are saved.
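The rules for building the two lists can be sketched in Python. This follows only the simplified rules stated above and is not a full recovery algorithm; the function name `build_lists` and the log format, a list of `(transaction, record)` pairs written after the last checkpoint with the oldest first, are assumptions.

```python
def build_lists(log):
    """Scan the log backwards and return (undo_list, redo_list).

    log holds (transaction, record) pairs written after the last checkpoint,
    oldest first; record is 'start', 'commit', or 'abort'.
    """
    undo, redo = [], []
    resolved = set()   # transactions whose commit or abort record was already seen
    for txn, record in reversed(log):
        if record == "commit":
            redo.append(txn)
            resolved.add(txn)
        elif record == "abort":
            resolved.add(txn)
        elif record == "start" and txn not in resolved:
            # Started but neither committed nor aborted: must be undone.
            undo.append(txn)
    return undo, redo


# Hypothetical log: T1 and T3 committed, T2 was still running at the crash.
sample_log = [
    ("T1", "start"), ("T1", "commit"),
    ("T2", "start"),
    ("T3", "start"), ("T3", "commit"),
]
```

For `sample_log`, the undo-list is `["T2"]` and the redo-list is `["T3", "T1"]`, in the order the backward scan meets them.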
Introduction to Security and Authorization
Database security means keeping sensitive information safe and preventing the loss of data. The security
of a database is controlled by the Database Administrator (DBA).
The following are the main control measures used to provide security of data in databases:
1. Authentication
2. Access control
3. Flow control
4. Database Security applying Statistical Method
5. Encryption
These are explained below:
1. Authentication :
Authentication is the process of confirming that a user who logs in is who they claim to be, so
that they work only within the rights granted to them for database activities. A particular user
can log in only up to their privilege level and cannot access other sensitive data.
Authentication tools based on biometrics, such as retina scans and fingerprints, can protect the
database from unauthorized or malicious users.
2. Access Control :
The security mechanism of a DBMS must include provisions for restricting access to the database
by unauthorized users. Access control is done by creating user accounts and controlling the login
process through the DBMS, so that access to sensitive data is possible only for those people
(database users) who are allowed to access such data and is denied to unauthorized persons.
The database system must also keep track of all operations performed by a user throughout the
entire login session.
3. Flow Control :
This prevents information from flowing in a way that it reaches unauthorized users. Pathways along
which information flows implicitly in ways that violate the privacy policy of a company are called
covert channels.
4. Database Security applying Statistical Method :
Statistical database security focuses on protecting confidential individual values that are stored
for statistical purposes and used to retrieve summaries of values based on categories. Such
databases do not permit retrieval of individual information.
For example, a user may query the database for statistical information such as the number of
employees in the company, but may not access the detailed confidential or personal information
about a specific employee.
5. Encryption :
This method is mainly used to protect sensitive data such as credit card numbers, OTPs, and other
sensitive numbers. The data is encoded using an encryption algorithm.
An unauthorized user who tries to access this encoded data will have difficulty decoding it, while
authorized users are given decryption keys to decode the data.
Authorization
Authorization is the process by which the database manager obtains information about the authenticated
user, including which database operations the user can perform and which data objects the user can
access. Equivalently, authorization is a set of rules that determines which user has what type of
access to which portion of the database.
Authorization determines what actions an already authenticated user is allowed to perform on the
database and which data they can access. It is a crucial part of database security, ensuring data
integrity and preventing unauthorized access. Authorization follows authentication, where the user's
identity is verified, and grants access based on the user's role and privileges.
Authorization is a privilege granted by the Database Administrator. Users of the database can only
view the contents they are authorized to view.
The different permissions for authorizations available are: