Dbms
An Entity-Relationship (ER) Model is a way to visually and logically plan a database before building it. It
helps us identify what data we need to store, how different pieces of data are related, and how to organize
everything efficiently. Once we have an ER model, we can convert it into a relational database design, which
means creating tables and relationships in a way that computers can manage and users can easily query.
What is an ER Model?
Entities: Objects or things we want to store data about (e.g., Student, Teacher, Course).
Clarity: ER diagrams make it easy to see how data is organized and related.
Planning: Helps avoid mistakes and missing data before building the actual database.
1. Entities
Entities are the main objects you want to store information about. Each entity type usually becomes a table in the database.
Examples:
Student
Teacher
Course
2. Attributes
Examples:
3. Relationships
Examples:
4. Keys
Foreign Key: Connects one table to another (e.g., StudentID in the Enrollment table).
Suppose we want to keep track of students and the courses they take.
Enrollment: A student can enroll in many courses, and each course can have many students (a many-to-many
relationship).
+-----------+           +-----------+
|  Student  |           |  Course   |
+-----------+           +-----------+
| StudentID |<--------->| CourseID  |
| Name      |           | Name      |
| Age       |           +-----------+
+-----------+
To represent many-to-many relationships in a relational database, we use a third table called a junction table
(e.g., Enrollment).
Enrollment Table:
StudentID   CourseID
---------   --------
1           101
1           102
2           101
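The junction table above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a prescribed schema: the column types, the sample names and ages, and the composite primary key on Enrollment are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE Student (
    StudentID INTEGER PRIMARY KEY,
    Name      TEXT NOT NULL,
    Age       INTEGER
);
CREATE TABLE Course (
    CourseID INTEGER PRIMARY KEY,
    Name     TEXT NOT NULL
);
-- Junction table: one row per (student, course) pair.
CREATE TABLE Enrollment (
    StudentID INTEGER NOT NULL REFERENCES Student(StudentID),
    CourseID  INTEGER NOT NULL REFERENCES Course(CourseID),
    PRIMARY KEY (StudentID, CourseID)
);
""")

# Sample data (names and ages are made up for illustration).
conn.executemany("INSERT INTO Student VALUES (?, ?, ?)", [(1, "Riya", 20), (2, "Arjun", 21)])
conn.executemany("INSERT INTO Course VALUES (?, ?)", [(101, "Math"), (102, "Science")])
conn.executemany("INSERT INTO Enrollment VALUES (?, ?)", [(1, 101), (1, 102), (2, 101)])

# Count how many courses each student takes.
courses_per_student = conn.execute(
    "SELECT StudentID, COUNT(*) FROM Enrollment GROUP BY StudentID ORDER BY StudentID"
).fetchall()
print(courses_per_student)  # [(1, 2), (2, 1)]
```

The composite primary key stops the same student from being enrolled in the same course twice, which is a common choice for junction tables.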
4. Map Relationships:
o Many-to-Many: Create a new table with foreign keys from both entities.
Clarity: Makes it easier to see what data is needed and how it’s connected.
Real-Life Example
Imagine you are designing a library database. You need to keep track of books, members, and which books are
borrowed.
ER Diagram:
[Member]---<borrows>---[Book]
Relational Tables:
1 Riya Guwahati
MemberID BookID
---------- --------
1 101
Summary Table
Step What to Do
ER modeling is the foundation of good relational database design. By carefully planning entities, attributes, and
relationships, you can create a database that is easy to use, efficient, and far less prone to errors. Once the ER
diagram is ready, converting it into tables and keys is straightforward, which keeps your data well-organized and
reliable for any application.
ER diagrams are powerful tools for designing databases, but sometimes real-world situations are more complex.
To handle these, we use advanced concepts like Generalisation, Specialisation, and Aggregation. These help
us simplify, organize, or add detail to our database designs.
Generalisation
Definition
Generalisation is the process of combining similar entities into a single, more general entity. It’s a bottom-up
approach, where you look for common features among entities and create a super-entity.
Example
Suppose you have two entities: Car and Bike. Both have common attributes like "Color", "EngineNo", and
"Owner". Instead of repeating these attributes in both entities, you can generalize them into a single entity called
Vehicle.
Diagram:
Car     Bike
  \     /
 [Vehicle]
Specialisation
Definition
Specialisation is the opposite of generalisation. It’s a top-down approach, where you start with a general entity
and split it into more specific sub-entities based on unique features.
Example
Suppose you have an entity called Employee. Some employees are Teachers, others are Clerks. Teachers have
"Subject" as an attribute, while Clerks have "Department".
Diagram:
     [Employee]
      /      \
[Teacher]    [Clerk]
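One way to picture specialisation is with class inheritance: the sub-entities keep everything the general entity has and add their own attributes. The sketch below uses Python dataclasses; the field names emp_id and name and the sample values are assumptions, since the text only names the "Subject" and "Department" attributes.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    # Attributes shared by every employee (hypothetical names).
    emp_id: int
    name: str


@dataclass
class Teacher(Employee):
    subject: str  # attribute specific to teachers


@dataclass
class Clerk(Employee):
    department: str  # attribute specific to clerks


staff = [Teacher(1, "Riya", "Math"), Clerk(2, "Arjun", "Accounts")]

# Every sub-entity is still an Employee, but only teachers carry a subject.
teacher_names = [e.name for e in staff if isinstance(e, Teacher)]
print(teacher_names)  # ['Riya']
```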
Aggregation
Definition
Aggregation is used when a relationship itself needs to be treated as an entity. It helps when you want to relate
an existing relationship to another entity or relationship.
Example
Suppose you have Employee and Project entities, and a relationship "Works_On". Now, you want to track
which Department is responsible for each "Works_On" relationship.
Diagram:
[Employee]---(works on)---[Project]
     \________________________/
      [Assignment] (Aggregation)
Real-Life Example
Imagine a university:
Aggregation: "Student participates in Event" can be aggregated if you want to relate "Event
Participation" to "Sponsorship".
Diagram Summary
Generalisation:
Car     Bike
  \     /
  Vehicle
Specialisation:
    Employee
    /      \
Teacher    Clerk
Aggregation:
[Employee]---(works on)---[Project]
     \________________________/
            [Assignment]
Conclusion
Generalisation, specialisation, and aggregation are advanced ER diagram concepts that help you design flexible,
organized, and detailed databases. They make it easier to handle real-world complexity and ensure your database
can grow as requirements change.
A database model is a way to organize and structure data in a database. The three main types are Hierarchical,
Network, and Relational models. Each has its own way of storing data and connecting information.
Hierarchical Model
Structure
Each parent can have multiple children, but each child has only one parent.
Example
Think of a company:
CEO (parent)
Diagram:
CEO
|
Manager
|
Employee
Network Model
Structure
Each record can have multiple parents and children (many-to-many relationships).
Example
Diagram:
Relational Model
Structure
Example
StudentID   Name
---------   ----
1           Riya

CourseID    CourseName
--------    ----------
101         Math

StudentID   CourseID
---------   --------
1           101
Comparison Table
Conclusion
Understanding these models helps you choose the right structure for your data. Today, the relational model is
the most popular because it is flexible, powerful, and easy to use.
Relational database design is about organizing data into tables and defining rules for storing, updating, and
retrieving data efficiently and accurately. It uses relational languages like SQL to interact with data.
Underlying Concepts
Structure
Example Structure:
CourseID    CourseName
--------    ----------
101         Math

StudentID   CourseID
---------   --------
1           101
Relational Languages
1. Relational Algebra
Example:
Commands:
Example:
SELECT Students.Name
FROM Students
JOIN Enrollment ON Students.StudentID = Enrollment.StudentID
JOIN Courses ON Enrollment.CourseID = Courses.CourseID
WHERE Courses.CourseName = 'Math';
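To see the query above run end to end, here is a small sketch using Python's built-in sqlite3 module. The table layouts follow the example structure; the column types and the extra sample rows (Arjun, Science) are assumptions added so the filter has something to exclude.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students   (StudentID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Courses    (CourseID INTEGER PRIMARY KEY, CourseName TEXT);
CREATE TABLE Enrollment (StudentID INTEGER, CourseID INTEGER);

INSERT INTO Students   VALUES (1, 'Riya'), (2, 'Arjun');
INSERT INTO Courses    VALUES (101, 'Math'), (102, 'Science');
INSERT INTO Enrollment VALUES (1, 101), (2, 102);
""")

# The same join as in the text: names of students enrolled in Math.
math_students = conn.execute("""
    SELECT Students.Name
    FROM Students
    JOIN Enrollment ON Students.StudentID = Enrollment.StudentID
    JOIN Courses ON Enrollment.CourseID = Courses.CourseID
    WHERE Courses.CourseName = 'Math'
""").fetchall()
print(math_students)  # [('Riya',)]
```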
Conclusion
Relational database design and languages like SQL make it easy to store, retrieve, and manage data.
Understanding these basics is essential for modern database systems.
Integrity constraints are rules that ensure the accuracy and reliability of data in a database. They prevent
mistakes and keep data consistent.
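As a rough illustration of two kinds of constraint covered in this section, the sketch below uses Python's sqlite3 module to show a domain rule (a CHECK on Age) and referential integrity (a foreign key) rejecting bad rows. The table layout, the rule Age >= 0, and the helper name accepted are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute(
    "CREATE TABLE Student (StudentID INTEGER PRIMARY KEY, Age INTEGER CHECK (Age >= 0))"
)
conn.execute(
    "CREATE TABLE Enrollment ("
    " StudentID INTEGER REFERENCES Student(StudentID),"
    " CourseID INTEGER)"
)
conn.execute("INSERT INTO Student VALUES (1, 20)")


def accepted(sql):
    """Return True if the statement is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


negative_age = accepted("INSERT INTO Student VALUES (2, -5)")          # violates the CHECK
unknown_student = accepted("INSERT INTO Enrollment VALUES (99, 101)")  # no student 99
valid_row = accepted("INSERT INTO Enrollment VALUES (1, 101)")         # student 1 exists
print(negative_age, unknown_student, valid_row)  # False False True
```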
Domain Constraints
Referential Integrity
Diagram:
Assertions
Triggers
Functional Dependencies
Conclusion
Integrity constraints are the backbone of a reliable database. They help maintain trust in your data by preventing
errors and enforcing rules.
Normalization is the process of organizing tables to reduce data repetition and improve data integrity. It uses
rules called normal forms.
Steps of Normalisation
No transitive dependency.
Multivalued Dependencies
Join Dependencies
When a table can be split into smaller tables and joined back without losing information.
Example: Normalisation
Before normalization:
Student   Courses
Riya      Math, Eng
After normalization:
Student   Course
Riya      Math
Riya      Eng
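The split shown in this example can be sketched in a few lines of Python. It assumes the original row stored both courses in a single comma-separated cell ("Math, Eng"), which is what the normalized rows suggest.

```python
# One row whose Courses cell holds several values (assumed starting point).
unnormalized = [("Riya", "Math, Eng")]

# Give every course its own row so each cell holds a single value.
normalized = [
    (student, course.strip())
    for student, courses in unnormalized
    for course in courses.split(",")
]
print(normalized)  # [('Riya', 'Math'), ('Riya', 'Eng')]
```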
Conclusion
Normalization makes your database efficient, easy to manage, and free from unnecessary duplication.
A transaction is a sequence of database operations that must be completed as a single unit. Transaction
processing ensures data remains accurate and reliable even when many users access the database at once.
Transaction Concept
Transaction: A group of operations (like transferring money) that should be completed together.
Example: Transferring ₹100 from account A to B involves deducting from A and adding to B.
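A minimal sketch of the transfer as a single unit, using Python's sqlite3 module. The account table, the starting balances, and the rule that a balance cannot go negative are assumptions for the example. The credit is deliberately applied before the debit so that a failed debit has a change to undo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 500), ("B", 200)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; either both updates happen or neither does."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


first_ok = transfer(conn, "A", "B", 100)    # succeeds: A=400, B=300
second_ok = transfer(conn, "A", "B", 1000)  # debit would make A negative, so it is undone
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(first_ok, second_ok, balances)  # True False {'A': 400, 'B': 300}
```

After the failed transfer, B does not keep the 1000 it was briefly credited, because the whole transaction was rolled back.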
Transaction States
ACID Properties
Serializability
A schedule of concurrent transactions is serializable if it produces the same result as running those transactions one after another in some order.
Recoverability
Conclusion
Transaction processing ensures that databases remain correct, even with many users and operations happening at
the same time.
Concurrency control manages how multiple users access the database at the same time, preventing errors and
conflicts.
Lock-Based Protocols
Timestamp-Based Protocols
Validation-Based Protocols
Multi-Version Schemes
Deadlock Handling
Conclusion
Concurrency control ensures that many users can safely use the database at the same time without causing
problems.
9. Recovery System: Log-Based Recovery, Checkpoints,
Shadow Paging, Buffer Management, Recovery from
Loss, Logging, Rollback, Restart Recovery, Fuzzy
Checkpointing
Introduction
A recovery system restores the database to a correct state after a failure (like a crash or power cut).
Log-Based Recovery
Immediate: Changes are saved as soon as they happen, but can be undone.
Checkpoints
Save the state of the database at certain points for easier recovery.
Shadow Paging
Use copies of data pages to keep the original safe until changes are committed.
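The idea can be sketched with two page tables: a shadow table that always describes the last committed state, and a current table that the running transaction updates. This is a simplified in-memory model, not how any particular DBMS implements it. The class and method names are hypothetical, and reclaiming old pages is left out.

```python
class ShadowPagedDB:
    """Toy model of shadow paging; a dict stands in for pages on disk."""

    def __init__(self, pages):
        self.disk = dict(enumerate(pages))               # physical page number -> contents
        self.shadow_table = {i: i for i in self.disk}    # logical -> physical (committed view)
        self.current_table = dict(self.shadow_table)     # view used by the running transaction

    def read(self, logical_page):
        return self.disk[self.current_table[logical_page]]

    def write(self, logical_page, data):
        # Copy-on-write: put the new contents in a fresh physical page,
        # so the committed page is never overwritten.
        new_physical = len(self.disk)
        self.disk[new_physical] = data
        self.current_table[logical_page] = new_physical

    def commit(self):
        # The current table becomes the new committed state.
        self.shadow_table = dict(self.current_table)

    def abort(self):
        # Discard the changes by going back to the shadow table.
        self.current_table = dict(self.shadow_table)


db = ShadowPagedDB(["a", "b"])
db.write(0, "A2")
db.abort()
after_abort = db.read(0)   # 'a': the original page was never touched
db.write(1, "B2")
db.commit()
after_commit = db.read(1)  # 'B2'
print(after_abort, after_commit)  # a B2
```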
Buffer Management
Transaction Rollback
Restart Recovery
Fuzzy Checkpointing
Conclusion
A good recovery system protects your data from crashes and failures, so that committed changes are not lost.
1. QUERY PROCESSING
Introduction
Query processing is the series of steps a database management system (DBMS) follows to take a user’s query
(usually written in SQL), understand it, find the best way to answer it, and then actually get the results from the
database. This process is like translating a question in English into a step-by-step plan that the computer can
follow, making sure the answer is found quickly and efficiently.
Parsing is the first step. The DBMS breaks the SQL query into pieces called tokens (like SELECT,
FROM, WHERE) and checks if the query follows the correct syntax (grammar rules).
Translation means converting the SQL query into an internal form, often using relational algebra (a
kind of mathematical language the DBMS understands).
The DBMS also checks if the tables and columns mentioned actually exist and if the operations make
sense (semantic analysis).
Example:
The DBMS checks if the employee table and emp_name and salary columns exist.
2. Query Optimization
There are often many ways to answer a query. The DBMS generates different possible “plans” for how to
get the answer.
Query optimization is the process of choosing the best (fastest and least costly) plan.
o How much CPU and disk access will each plan need?
3. Query Evaluation
The DBMS uses physical operators (like scanning a table, using an index, joining tables) to actually
fetch the data.
Example Flow
1. User Query:
SELECT name FROM students WHERE marks > 80;
2. Parsing:
DBMS checks syntax and existence of table/column.
3. Translation:
SQL is converted to relational algebra, e.g.,
π(name)(σ(marks > 80)(students))
4. Optimization:
DBMS considers using an index on marks, or scanning the table.
5. Evaluation:
The best plan is run, and results are shown.
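The optimization step in this flow can be observed with SQLite's EXPLAIN QUERY PLAN statement, run here through Python's sqlite3 module. The table layout, the sample rows, and the index name idx_marks are assumptions for the example, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, marks INTEGER)")
conn.executemany(
    "INSERT INTO students (name, marks) VALUES (?, ?)", [("Riya", 85), ("Arjun", 75)]
)

QUERY = "SELECT name FROM students WHERE marks > 80"


def plan(conn, sql):
    """Return the description of each step in the plan SQLite chose."""
    # The last column of every EXPLAIN QUERY PLAN row is a readable description.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


plan_without_index = plan(conn, QUERY)  # expected to describe a full table scan
conn.execute("CREATE INDEX idx_marks ON students(marks)")
plan_with_index = plan(conn, QUERY)     # expected to mention idx_marks

result = conn.execute(QUERY).fetchall()
print(plan_without_index)
print(plan_with_index)
print(result)  # [('Riya',)]
```

The query text is identical in both runs; only the plan changes once the index exists.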
The DBMS represents the query as a query tree (a diagram showing the order of operations).
Diagram:
Allows users to write simple SQL, while the DBMS handles the complex details.
Conclusion
Query processing is a key part of any DBMS. It turns user questions into efficient actions, making sure the
database is both powerful and easy to use.
A database is stored on disk as files. The way data is stored and organized on disk affects how fast and
efficiently it can be accessed. Understanding storage and file structure is important for designing and using
databases.
Storage Types
1. Primary Storage
2. Secondary Storage
File Structure
Database files are divided into blocks (fixed-size chunks, e.g., 4KB each).
Each block can store several records (rows from a table).
The way records are arranged in blocks affects how quickly they can be read or written.
2. Sequential Files:
3. Hashed Files:
Blocking Factor
The blocking factor is the number of records that fit in one block.
Example: If a block is 4 KB (4096 bytes) and each record is 100 bytes, then blocking factor = floor(4096/100) = 40
records per block, with 96 bytes of each block left unused.
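The arithmetic can be checked with a short Python sketch. It assumes fixed-length records that are not split across blocks. The file size of 10,000 records is a made-up figure used only to show how the blocking factor determines the number of blocks.

```python
import math


def blocking_factor(block_size, record_size):
    """Whole records that fit in one block (records are not split across blocks)."""
    return block_size // record_size


def blocks_needed(num_records, block_size, record_size):
    """Blocks required to store num_records fixed-length records."""
    return math.ceil(num_records / blocking_factor(block_size, record_size))


bfr = blocking_factor(4096, 100)
unused_bytes = 4096 - bfr * 100
blocks = blocks_needed(10_000, 4096, 100)
print(bfr, unused_bytes, blocks)  # 40 96 250
```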
Access Paths
It helps the DBMS find data quickly without scanning the whole file.
Hashing
Very fast for equality searches (e.g., find student with ID=123).
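A hashed file can be sketched as a fixed set of buckets, where a hash function on the key decides which bucket a record goes into. In the toy version below the bucket count, the hash function (ID modulo the number of buckets), and the sample records are all assumptions for illustration.

```python
NUM_BUCKETS = 4
buckets = [[] for _ in range(NUM_BUCKETS)]  # each bucket stands in for a disk block


def bucket_for(student_id):
    """Hash function: map a key to a bucket number."""
    return student_id % NUM_BUCKETS


def insert(record):
    buckets[bucket_for(record["id"])].append(record)


def find(student_id):
    """Equality search: look in one bucket only, never the whole file."""
    for record in buckets[bucket_for(student_id)]:
        if record["id"] == student_id:
            return record
    return None


for rec in [{"id": 123, "name": "Riya"}, {"id": 7, "name": "Arjun"}, {"id": 48, "name": "Meera"}]:
    insert(rec)

print(bucket_for(123))  # 3
print(find(123))        # {'id': 123, 'name': 'Riya'}
print(find(999))        # None
```

A range search such as "IDs between 100 and 200" gets no help from this layout, because neighbouring keys land in unrelated buckets.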
External Sorting
Conclusion
Understanding storage and file structure helps in designing databases that are fast, reliable, and efficient.
Indexing Techniques
Diagram:
Hashing Techniques
External Sorting
Conclusion
Choosing the right file organization and access path can make a huge difference in database performance.
4. TRANSFORMATION OF RELATIONAL
EXPRESSIONS, BREAKING OF QUERIES INTO
SUBQUERIES TO OPTIMISE EXECUTION PLAN,
SELECT, PROJECT AND JOIN OPERATIONS, SET
OPERATIONS, AGGREGATION, COST-BASED
QUERY OPTIMISATION, MEASUREMENT OF
COST OF A QUERY, EVALUATION OF
EXPRESSIONS
Transformation of Relational Expressions
Example:
Selects rows where marks > 80, then projects the name column.
Set Operations
Aggregation
The optimizer estimates the "cost" (time, CPU, disk access) of each possible plan.
Measurement of Cost
Factors:
o CPU time
Evaluation of Expressions
The DBMS chooses the best order and method to evaluate each part of the query.
Conclusion
Transforming and optimizing queries ensures that the database answers questions quickly, even with large
amounts of data.
Heuristic optimization is faster than cost-based optimization, but may not always find the absolute best plan.
Query Tree
Example:
        JOIN
       /    \
  SELECT    SELECT
     |         |
  Table1    Table2
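A query tree like this one can be modelled as nested operator objects that are evaluated from the leaves upward. The sketch below is a simplified illustration in Python. The class names, the sample rows, and the join key student_id are assumptions, and a real DBMS would use far more efficient join algorithms than this nested loop.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Table:
    rows: list

    def evaluate(self):
        return list(self.rows)


@dataclass
class Select:
    predicate: Callable[[dict], bool]
    child: object

    def evaluate(self):
        # Keep only the child's rows that satisfy the condition.
        return [row for row in self.child.evaluate() if self.predicate(row)]


@dataclass
class Join:
    left: object
    right: object
    key: str

    def evaluate(self):
        # Simple nested-loop join on a column both inputs share.
        right_rows = self.right.evaluate()
        return [
            {**l, **r}
            for l in self.left.evaluate()
            for r in right_rows
            if l[self.key] == r[self.key]
        ]


students = [
    {"student_id": 1, "name": "Riya", "marks": 85},
    {"student_id": 2, "name": "Arjun", "marks": 75},
]
enrollment = [
    {"student_id": 1, "course": "Math"},
    {"student_id": 2, "course": "Math"},
    {"student_id": 1, "course": "Science"},
]

# Both selections sit below the join, so each input is filtered before joining.
tree = Join(
    Select(lambda r: r["marks"] > 80, Table(students)),
    Select(lambda r: r["course"] == "Math", Table(enrollment)),
    key="student_id",
)
result = tree.evaluate()
print(result)  # [{'student_id': 1, 'name': 'Riya', 'marks': 85, 'course': 'Math'}]
```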
Query Graph
Representations of Queries
Example: If a constraint guarantees that every student's marks are greater than 0 (and marks cannot be NULL), the condition "WHERE marks > 0" is always true and can be removed from the query.
Conclusion
Heuristic and semantic optimizations help the DBMS answer queries faster by simplifying and rearranging
operations based on rules and data knowledge.
1. QUERY PROCESSING
Introduction
Query processing is the backbone of every Database Management System (DBMS). When a user writes a query
(usually in SQL), the DBMS must interpret it, figure out the best way to execute it, and then return the result.
This process is called query processing. Efficient query processing is crucial for fast and correct data retrieval,
especially when databases are large and queries are complex.
Parsing: The DBMS checks the query’s syntax (grammar) and breaks it into tokens (keywords, table
names, column names, etc.).
Semantic Analysis: It checks if the tables and columns exist and if the operations make sense.
Translation: The query is converted into an internal form, usually relational algebra (a mathematical
way to describe database operations).
Example:
The DBMS checks if the students table and name and marks columns exist.
2. Query Optimization
There are many ways to answer a query. The DBMS creates different “plans” for how to get the answer.
Query optimization is the process of finding the fastest, least costly plan.
o Indexes available
o Table sizes
o Order of joins
Example:
If you join tables A, B, and C, the optimizer tries different join orders and chooses the best.
3. Query Evaluation
The DBMS uses physical operators (like table scans, index scans, joins) to fetch data.
Diagram:
Techniques Used
Example
1 Riya 85
2 Arjun 75
101 Math 1
102 Science 2
Query:
Summary Table
Step Description
Conclusion
Query processing turns user questions into efficient actions, ensuring that databases remain powerful and easy to
use, even as they grow in size and complexity.
A database is stored on disk as files. The way data is stored and organized on disk affects how quickly and
efficiently it can be accessed. Understanding storage and file structure is crucial for designing high-performing
databases.
Types of Storage
1. Primary Storage
2. Secondary Storage
File Structure
Database files are divided into blocks (fixed-size chunks, e.g., 4KB).
2. Sequential Files
3. Hashed Files
Blocking Factor
Number of records per block.
Example: If a block is 4 KB (4096 bytes) and each record is 100 bytes, blocking factor = floor(4096/100) = 40 records
per block, with 96 bytes of each block left unused.
Access Paths
Indexing
It helps the DBMS find data quickly without scanning the whole file.
Hashing
Very fast for equality searches (e.g., find student with ID=123).
External Sorting
Example Diagram
+--------+--------+--------+
| Block1 | Block2 | Block3 |
+--------+--------+--------+
| rec1 | rec5 | rec9 |
| rec2 | rec6 | rec10 |
| rec3 | rec7 | rec11 |
| rec4 | rec8 | rec12 |
+--------+--------+--------+
Conclusion
Efficient storage and file structure design is essential for fast, reliable, and scalable databases.
Diagram:
Hashing Techniques
External Sorting
Merge Sort: Splits data into sorted runs, then merges them.
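The two phases can be sketched in Python as follows. In a real external sort each run would be written to disk because the data does not fit in memory; here the runs are ordinary lists, and run_size stands in for the amount of data that fits in memory at once. Both are simplifications for illustration.

```python
import heapq


def external_merge_sort(records, run_size):
    """Sort by building sorted runs of at most run_size items, then merging them."""
    # Phase 1: cut the input into chunks and sort each chunk on its own.
    runs = [
        sorted(records[i:i + run_size])
        for i in range(0, len(records), run_size)
    ]
    # Phase 2: merge all sorted runs, always taking the smallest head element.
    return list(heapq.merge(*runs))


sorted_records = external_merge_sort([9, 3, 7, 1, 8, 2, 6], run_size=3)
print(sorted_records)  # [1, 2, 3, 6, 7, 8, 9]
```

With run_size=3 the runs are [3, 7, 9], [1, 2, 8] and [6]; the merge phase only ever needs to look at the front of each run.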
Conclusion
Choosing the right file organization and access path can make a huge difference in database performance.
4. TRANSFORMATION OF RELATIONAL
EXPRESSIONS, BREAKING OF QUERIES INTO
SUBQUERIES TO OPTIMISE EXECUTION PLAN,
SELECT, PROJECT AND JOIN OPERATIONS, SET
OPERATIONS, AGGREGATION, COST-BASED
QUERY OPTIMISATION, MEASUREMENT OF
COST OF A QUERY, EVALUATION OF
EXPRESSIONS
Transformation of Relational Expressions
Example:
Selects rows where marks > 80, then projects the name column.
Set Operations
Union: Combines results from two queries (removes duplicates).
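The duplicate removal can be seen by comparing UNION with UNION ALL in SQLite, run here through Python's sqlite3 module. The two tables and their rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE math_students (name TEXT);
CREATE TABLE science_students (name TEXT);
INSERT INTO math_students VALUES ('Riya'), ('Arjun');
INSERT INTO science_students VALUES ('Arjun');
""")

# UNION keeps each distinct row once.
union_rows = conn.execute(
    "SELECT name FROM math_students UNION SELECT name FROM science_students ORDER BY name"
).fetchall()

# UNION ALL keeps every row, including repeats.
union_all_rows = conn.execute(
    "SELECT name FROM math_students UNION ALL SELECT name FROM science_students ORDER BY name"
).fetchall()

print(union_rows)      # [('Arjun',), ('Riya',)]
print(union_all_rows)  # [('Arjun',), ('Arjun',), ('Riya',)]
```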
Aggregation
The optimizer estimates the "cost" (time, CPU, disk access) of each possible plan.
Measurement of Cost
Factors:
o CPU time
Evaluation of Expressions
The DBMS chooses the best order and method to evaluate each part of the query.
Conclusion
Transforming and optimizing queries ensures that the database answers questions quickly, even with large
amounts of data.
Heuristic optimization is faster than cost-based optimization, but may not always find the absolute best plan.
Query Tree
Example:
        JOIN
       /    \
  SELECT    SELECT
     |         |
  Table1    Table2
Query Graph
Example: If a constraint guarantees that every student's marks are greater than 0 (and marks cannot be NULL), the condition "WHERE marks > 0" is always true and can be removed from the query.
Conclusion
Heuristic and semantic optimizations help the DBMS answer queries faster by simplifying and rearranging
operations based on rules and data knowledge.