DBMS Notes
Studying Database Management Systems (DBMS) matters because data is at the core of nearly every modern software application. From social media to
online banking, databases are the backbone that allows systems to store, manage, and retrieve
vast amounts of information reliably and securely. Understanding DBMS provides a
foundational skillset for building robust, scalable, and efficient software.
Importance for Computer Science and Engineering (CSE)
Core technical skills
1. Data Modeling and Design
o Learn to structure and organize data effectively.
o Use techniques like normalization to reduce redundancy and improve
consistency.
2. Efficient Data Retrieval
o Optimize database queries with indexing and other techniques.
o Keep applications fast and responsive as data volumes grow.
3. Transaction Management
o Understand and apply ACID properties (Atomicity, Consistency, Isolation,
Durability).
o Guarantee reliable and correct data operations, even during failures.
4. Concurrency Control
o Manage multiple users accessing/updating data at the same time.
o Prevent data conflicts and corruption in multi-user environments.
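As a rough illustration of atomicity, the sketch below uses Python's built-in sqlite3 module. The account table, its balances, and the transfer function are hypothetical and exist only for this example; the point is that a failed transfer leaves no partial update behind.

```python
import sqlite3

# Hypothetical ACCOUNT table, used only to illustrate atomicity.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30)       # both updates are committed
failed = transfer(conn, 1, 2, 500)  # debit violates the CHECK, so the credit is undone too
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

After the second call, account 2 does not keep the 500 it was credited before the debit failed; the whole transaction is rolled back.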
Real-world application development
Nearly every application uses a database: Whether building a mobile app, a
website, or an enterprise system, you will almost certainly be interacting with a
database. DBMS knowledge is fundamental for a full-stack developer who needs to
manage both the front-end and back-end of an application.
Career opportunities: A solid understanding of DBMS, including proficiency in
Structured Query Language (SQL) and an awareness of NoSQL alternatives like
MongoDB, opens up a wide range of career paths. This includes roles such as:
o Database Administrator (DBA)
o Back-end Developer
o Data Scientist or Analyst
o Big Data Engineer
Database
A database is a collection of related data.
Data means known facts that can be recorded and have meaning.
Example: Names, phone numbers, and addresses of people you know.
You can store this data in an address book, Microsoft Access, or Excel.
Any collection of related data with meaning can be called a database.
For example, even the words on a page of text can be seen as related data.
Properties of a Database:
1. A database represents a part of the real world. Any changes in the real world are
reflected in the database.
2. A database is a logically organized collection of data with meaning. Random data
cannot be called a database.
3. A database is created for a specific purpose, with a set of users and applications in
mind.
Database Management System (DBMS)
A DBMS is a collection of programs that allows users to create, manage, and
maintain a database.
It is a general-purpose software system that supports:
o Defining the database → specifying data types, structures, and constraints.
o Constructing the database → storing the data on storage media controlled by
DBMS.
o Manipulating the database → querying (retrieving data), updating (reflecting
real-world changes), and generating reports.
o Sharing the database → enabling multiple users and applications to access the
data simultaneously.
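The first three functions can be sketched with Python's built-in sqlite3 module. SQLite is used here only as a convenient stand-in for a general-purpose DBMS, and the sample rows are made up. Sharing is not shown, because an in-memory database is private to one connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Defining: declare structure, data types, and constraints.
conn.execute("""
    CREATE TABLE student (
        student_number INTEGER PRIMARY KEY,
        name           TEXT NOT NULL,
        class          INTEGER NOT NULL
    )
""")

# Constructing: store the initial data (sample rows are made up).
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(17, "Smith", 1), (8, "Brown", 2)],
)

# Manipulating: update to reflect a real-world change, then query.
conn.execute("UPDATE student SET class = 2 WHERE student_number = 17")
names_in_class_2 = [
    row[0]
    for row in conn.execute("SELECT name FROM student WHERE class = 2 ORDER BY name")
]
```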
DBMS Components and Functions
1. Database Catalog / Dictionary (Meta-data):
o Stores the database definition (data about data).
o Contains descriptions of structures, types, and constraints.
2. Application Interaction:
o Application programs send queries and requests to the DBMS.
o A query retrieves data, while a transaction may both read and write data.
3. Protection Features:
o System Protection → safeguards against hardware/software failures or
crashes.
o Security Protection → prevents unauthorized or malicious access.
4. Maintenance Features:
o Supports long-term use (databases may last for many years).
o Allows the system to evolve when requirements change over time.
5. Complexity of DBMS:
o A DBMS is usually a very complex software system because it manages
many critical tasks.
o It is not always necessary to use a general-purpose DBMS; one could write
special-purpose DBMS software with custom programs, but that also takes
considerable effort and adds its own complexity.
Example
The UNIVERSITY database example shows how data is organized into different files such as
STUDENT, COURSE, SECTION, GRADE_REPORT, and PREREQUISITE, each storing
specific types of information. Every file has a defined structure with data types (like numbers,
strings, or codes) to represent details clearly. Records from different files are linked, such as a
student’s record being related to their grades or courses with their prerequisites. This example
highlights how databases store structured data and maintain meaningful relationships among
different types of records to manage real-world information efficiently.
File Processing vs. Database Approach
In traditional file processing:
o Each user/application creates and manages their own files.
o Example:
Grade office keeps files on students and grades.
Accounting office keeps files on fees and payments.
o Even though both use student data, they maintain separate files.
o This leads to:
Data redundancy (same data stored multiple times).
Wasted storage space.
Extra effort to keep data up to date.
In the database approach:
o A single shared repository stores all data.
o Data is defined once and then accessed by many users/applications.
o Data names/labels are standardized and used across queries, transactions, and
applications.
Main Characteristics of the Database Approach
1. Self-describing nature – The database contains both the data and its description
(meta-data).
2. Insulation between programs and data – Programs are separate from data; changes
in data structure don’t always require program changes.
3. Multiple views support – Different users can see the same data in different ways,
depending on their needs.
4. Data sharing and multiuser transaction processing – Many users and applications
can share and update the same database safely.
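Multiple views (characteristic 3) can be sketched with an SQL view in SQLite. The columns of grade_report and the sample rows are assumptions made for this example; the notes only name the GRADE_REPORT file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed columns; sample rows are made up.
conn.execute(
    "CREATE TABLE grade_report (student_number INTEGER, section_id INTEGER, grade TEXT)"
)
conn.executemany(
    "INSERT INTO grade_report VALUES (?, ?, ?)",
    [(17, 112, "B"), (17, 119, "C"), (8, 85, "A")],
)

# A view gives one group of users a summarized picture of the same stored data.
conn.execute("""
    CREATE VIEW student_grade_count AS
    SELECT student_number, COUNT(*) AS n_grades
    FROM grade_report
    GROUP BY student_number
""")

view_rows = conn.execute(
    "SELECT student_number, n_grades FROM student_grade_count ORDER BY student_number"
).fetchall()
```

The view stores no data of its own; it is recomputed from grade_report each time it is queried.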
Self-Describing Nature of Database Systems
A key feature of the database approach is that it stores not just the data, but also a
complete description of the database (structure + constraints).
This description is stored in the DBMS catalog.
What the Catalog (Meta-data) Contains
Structure of each file.
Type and storage format of each data item.
Constraints on the data.
This information (called meta-data) describes the structure of the actual database.
Uses of the Catalog
The DBMS software refers to it to understand the database structure.
Users may also use it to find details about data organization.
A general-purpose DBMS is not built for one specific database (e.g., university,
banking, company), so it uses the catalog to handle any type of database.
Example
In a UNIVERSITY database, the catalog stores definitions of files like STUDENT,
COURSE, etc.
When a request is made to access Name of a STUDENT, the DBMS looks into the
catalog to find where "Name" is located and how it is stored.
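To make the catalog concrete, the sketch below asks SQLite for its own meta-data. SQLite is just one DBMS chosen for illustration; other systems expose their catalogs differently. The STUDENT columns follow the relation used in these notes, with data types assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " name TEXT NOT NULL,"
    " student_number INTEGER PRIMARY KEY,"
    " class INTEGER)"
)

# sqlite_master is SQLite's catalog table: one row per table, index, view, or trigger.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk) per column.
columns = {row[1]: row[2] for row in conn.execute("PRAGMA table_info(student)")}
```

Here the catalog is queried with the same mechanism as ordinary data, which is what "self-describing" means in practice.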
Relational Model
The relational model represents a database as a collection of relations (tables).
Relation (Table)
A relation is the main structure used in the relational model.
Represented as a table of rows and columns.
Each table has:
o A name (e.g., STUDENT, COURSE).
o A set of columns (attributes) with names.
o A set of rows (tuples) containing actual data.
Example:
STUDENT (Name, Student_number, Class)
Tuple (Row / Record)
Each row in the table = tuple.
Represents a single fact or entity instance.
Attribute (Column)
Each column in the table = attribute.
Describes a property/characteristic of the entity.
All values in a column belong to the same data type.
Domain
Each attribute has a domain = set of possible valid values.
Example:
o Domain of Student_number → integers (positive).
o Domain of Name → strings of alphabetic characters.
o Domain of Grade → {‘A’, ‘B’, ‘C’, ‘D’, ‘F’}.
o Employee_ages: Possible ages of employees in a company; each must be an
integer value between 15 and 80.
Domains ensure data integrity (only valid values allowed).
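One way a DBMS can enforce domains is with CHECK constraints. The sketch below uses SQLite; the table layouts, the try_insert helper, and the sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        name TEXT NOT NULL,
        age  INTEGER NOT NULL CHECK (age BETWEEN 15 AND 80)
    )
""")
conn.execute("""
    CREATE TABLE grade_report (
        student_number INTEGER NOT NULL CHECK (student_number > 0),
        grade          TEXT NOT NULL CHECK (grade IN ('A', 'B', 'C', 'D', 'F'))
    )
""")


def try_insert(sql, params):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert("INSERT INTO grade_report VALUES (?, ?)", (17, "B")),  # valid
    try_insert("INSERT INTO grade_report VALUES (?, ?)", (17, "E")),  # 'E' is outside the domain
    try_insert("INSERT INTO employee VALUES (?, ?)", ("Lee", 30)),    # valid
    try_insert("INSERT INTO employee VALUES (?, ?)", ("Kim", 12)),    # age below 15
]
```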
2. Candidate Key
A minimal set of attributes that uniquely identifies a record (no attribute can be removed without losing uniqueness).
👉 In this table, possible Candidate Keys are:
RollNo (unique for every student).
Email (unique, but Meera’s is NULL, so not reliable in practice).
Phone (unique for every student).
📌 Candidate Keys = { RollNo, Email, Phone }
3. Alternate Key
Candidate Keys other than the Primary Key.
👉 If we choose RollNo as Primary Key, then:
Email and Phone become Alternate Keys.
4. Super Key
Any set of attributes that can uniquely identify a record (Candidate Key + extra
attributes).
👉 Examples of Super Keys in this table:
{RollNo}
{Phone}
{RollNo, Name}
{RollNo, Email}
{Phone, Course}
📌 All Candidate Keys are Super Keys, but not all Super Keys are Candidate Keys, because a Super Key need not be minimal.
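The difference between super keys and candidate keys can be checked mechanically on sample data. The rows below are hypothetical (the table these notes refer to is not reproduced here), and the function names are made up for this sketch. Uniqueness in one sample only shows that the data does not contradict a key; the real decision is a design constraint.

```python
from itertools import combinations

# Hypothetical rows; not the original table from the notes.
ROWS = [
    {"RollNo": 1, "Name": "Asha", "Email": "asha@example.com", "Phone": "9000000001", "Course": "CSE"},
    {"RollNo": 2, "Name": "Ravi", "Email": "ravi@example.com", "Phone": "9000000002", "Course": "CSE"},
    {"RollNo": 3, "Name": "Meera", "Email": None, "Phone": "9000000003", "Course": "ECE"},
    {"RollNo": 4, "Name": "Ravi", "Email": "ravi2@example.com", "Phone": "9000000004", "Course": "CSE"},
]


def is_super_key(attrs, rows):
    """True if no two rows agree on every attribute in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows):
    """Minimal super keys: no proper subset is itself a super key."""
    attrs = sorted(rows[0])
    found = []
    # Visit smaller attribute sets first so minimal keys are found before their supersets.
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            contains_smaller_key = any(set(k) <= set(combo) for k in found)
            if not contains_smaller_key and is_super_key(combo, rows):
                found.append(combo)
    return found


keys = candidate_keys(ROWS)
```

In this sample, ("RollNo", "Name") passes is_super_key but is not returned by candidate_keys, because RollNo alone already suffices.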
5. Composite Key
A key made of two or more attributes used together to uniquely identify a record.
👉 Example:
Suppose RollNo was not available, then (Name, Phone) together could uniquely
identify a student.
(RollNo, Course) also identifies a student uniquely, but because RollNo alone is already unique it is a Super Key rather than a minimal key.
7. Unique Key
Similar to Primary Key: values must be unique.
But unlike Primary Key, it can contain NULL (only one NULL in some DBMSs such as SQL Server; several in others such as PostgreSQL, MySQL, and SQLite).
👉 Example:
Email can be a Unique Key.
All students must have different emails, but Meera can have NULL.
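How many NULLs a unique column accepts depends on the DBMS. The sketch below shows the behaviour of SQLite specifically, chosen only because it ships with Python; the table and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " email TEXT UNIQUE)"
)
conn.execute("INSERT INTO student VALUES (1, 'Asha', 'asha@example.com')")
conn.execute("INSERT INTO student VALUES (2, 'Meera', NULL)")  # NULL is accepted

# A repeated non-NULL email violates the UNIQUE constraint.
try:
    conn.execute("INSERT INTO student VALUES (3, 'Ravi', 'asha@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# SQLite treats NULLs as distinct from each other, so a second NULL is also accepted.
conn.execute("INSERT INTO student VALUES (4, 'Kiran', NULL)")
null_count = conn.execute(
    "SELECT COUNT(*) FROM student WHERE email IS NULL"
).fetchone()[0]
```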
Entity-Relationship (ER) model
Conceptual modeling is a crucial phase in designing a successful database application.
A database application includes both the database and the programs that handle queries and
updates.
Example: A BANK database application manages customer accounts with programs for
deposits and withdrawals.
Database applications often have user-friendly GUIs (forms, menus) for end users (e.g., bank
tellers).
During conceptual database design the focus is on database structures and constraints.
The Entity-Relationship (ER) model is a widely used high-level conceptual data model for
database design.
ER model introduces basic data-structuring concepts and constraints for designing conceptual
schemas.
ER diagrams are the diagrammatic notation used to represent ER models.
The ER model describes data as entities, relationships, and attributes.
Design Choices for ER Conceptual Design
1. Entities and Their Attributes
EMPLOYEE (Main entity)
Attributes: Fname, Minit, Lname, Bdate, Address, Salary, Sex, Ssn
→ Example of an entity with simple attributes (Name can be composite: Fname +
Minit + Lname).
DEPARTMENT
Attributes: Name, Number, Locations
→ Locations is multivalued (a department can have multiple locations).
PROJECT
Attributes: Name, Number, Location.
DEPENDENT
Attributes: Name, Sex, Birth_date, Relationship.
Composite vs. Simple (Atomic) Attributes
Composite Attribute → An attribute that can be divided into smaller sub-parts.
o Example in diagram: Name of EMPLOYEE → split into Fname, Minit,
Lname.
o Each sub-attribute has meaning and can be stored separately.
Simple (Atomic) Attribute → Cannot be divided further.
o Examples: Ssn, Salary, Sex, Bdate.
Single-Valued vs. Multivalued Attributes
Single-Valued Attribute → Holds only one value for each entity instance.
o Example: An EMPLOYEE has only one Ssn, one Salary, one Address.
Multivalued Attribute → Can have multiple values for each entity.
o Example: A DEPARTMENT can have multiple Locations (shown with a
double oval in the diagram).
Stored vs. Derived Attributes
Stored Attribute → Directly stored in the database.
o Examples: Ssn, Salary, Name, Address, Hours.
Derived Attribute → Value can be computed from other attributes, not stored
permanently.
o Example: Number_of_employees in DEPARTMENT → derived by counting
employees linked via WORKS_FOR.
o Represented by a dashed oval in ER diagrams.
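A derived attribute such as Number_of_employees can be computed by a query instead of being stored. The sketch below assumes one possible relational layout (a dept_no column standing in for WORKS_FOR) and made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (number INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE employee (
        ssn     TEXT PRIMARY KEY,
        lname   TEXT NOT NULL,
        dept_no INTEGER NOT NULL REFERENCES department(number)  -- WORKS_FOR
    );
    INSERT INTO department VALUES (5, 'Research'), (4, 'Administration');
    INSERT INTO employee VALUES ('111', 'Smith', 5), ('222', 'Brown', 5), ('333', 'Lee', 4);
""")

# Number_of_employees is computed on demand; DEPARTMENT has no such column.
number_of_employees = dict(conn.execute("""
    SELECT d.name, COUNT(e.ssn)
    FROM department AS d
    LEFT JOIN employee AS e ON e.dept_no = d.number
    GROUP BY d.number
"""))
```

Because the count is recomputed from the stored rows, it cannot drift out of date when employees are added or removed.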
NULL Values
Used when attribute values are missing, unknown, or not applicable.
o Example:
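An EMPLOYEE without a supervisor → no SUPERVISION instance exists, so a
stored supervisor reference for that employee would be NULL.
A PROJECT not yet assigned to an employee → no WORKS_ON instance exists,
so there is no Hours value to record yet.
An EMPLOYEE without dependents → no DEPENDENTS_OF instance exists for
that employee.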
Complex Attributes
A nested combination of composite and multivalued attributes.
Example (not explicitly shown in this diagram but can be inferred):
o Address could be modeled as a composite attribute (Street, City, State, Zip),
o and if an employee has multiple addresses (e.g., permanent and temporary),
then Address becomes complex (Composite + Multivalued).
o In this diagram, Name (composite) and Locations (multivalued) show the two
ingredients separately; a complex attribute nests one inside the other.
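The attribute kinds above map naturally onto nested data structures. The following sketch uses Python dataclasses; the class layout and the sample values are assumptions made for illustration, not part of the COMPANY diagram.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Name:
    """Composite attribute: made of meaningful sub-parts."""
    fname: str
    minit: str
    lname: str


@dataclass(frozen=True)
class Address:
    """Composite attribute: Street, City, State, Zip."""
    street: str
    city: str
    state: str
    zip: str


@dataclass
class Employee:
    ssn: str    # simple, single-valued
    name: Name  # composite, single-valued
    # Complex attribute: multivalued (a list) of composite values (Address).
    addresses: list[Address] = field(default_factory=list)


emp = Employee(
    ssn="123456789",
    name=Name("John", "B", "Smith"),
    addresses=[
        Address("1 Main St", "Houston", "TX", "77001"),  # permanent (made-up)
        Address("9 Elm Rd", "Austin", "TX", "73301"),    # temporary (made-up)
    ],
)
cities = [a.city for a in emp.addresses]
```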
Entity Types and Entity Sets
Entity Type → A collection of similar objects described by the same attributes.
o In the diagram, the entity types are:
EMPLOYEE (described by Fname, Minit, Lname, Ssn, Bdate,
Address, Salary, Sex)
DEPARTMENT (Name, Number, Locations)
PROJECT (Name, Number, Location)
DEPENDENT (Name, Sex, Birth_date, Relationship)
Entity Set → The actual collection of entities (instances) of an entity type in the
database at a given time.
o Example:
EMPLOYEE set → all employees like John Smith, Mary Brown, etc.
PROJECT set → all projects like Project X, Project Y.
Key Attributes of an Entity Type
Key Attribute → An attribute (or combination) that uniquely identifies each entity in
the set.
o In the diagram:
EMPLOYEE → Ssn (Social Security Number) is the key.
DEPARTMENT → Number (Dept Number) is the key.
PROJECT → Number (Project Number) is the key.
DEPENDENT → Name is not enough; key is a combination of
(Employee Ssn + Dependent Name) since different employees can
have dependents with the same name.
Value Sets (Domains) of Attributes
Value Set (Domain) → The set of possible values an attribute can take.
o In the diagram:
EMPLOYEE →
Ssn → domain = set of 9-digit numbers.
Sex → domain = {Male, Female}.
Salary → domain = positive real numbers.
Bdate → domain = valid dates.
DEPARTMENT →
Number → domain = integers (unique dept numbers).
Name → domain = strings of characters.
Locations → domain = set of city names/addresses.
PROJECT →
Number → domain = integers (unique project IDs).
Location → domain = set of city/location names.
DEPENDENT →
Birth_date → domain = valid dates.
Relationship → domain = {Spouse, Son, Daughter, etc.}.
Attributes of Relationship Types
Sometimes relationships carry their own descriptive attributes.
Examples in diagram:
o WORKS_FOR → attribute Start_date (when employee started in department).
o MANAGES → attribute Start_date (when manager started managing dept).
o WORKS_ON → attribute Hours (how many hours employee works on
project).
Role Names and Recursive Relationships
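Recursive Relationship: The same entity type participates more than once in a
relationship type, each time in a different role.
Role Names: Distinguish the roles played by the same entity type in a
relationship.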
Example:
o In SUPERVISION, both entities are EMPLOYEE.
o Roles: Supervisor and Supervisee.
Without role names, it would be unclear who supervises whom.
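In a relational implementation, a recursive relationship is often represented by a column that refers back to the same table, and the role names reappear as table aliases in queries. The sketch below is one such mapping in SQLite; the super_ssn column and the sample employees are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        ssn       TEXT PRIMARY KEY,
        lname     TEXT NOT NULL,
        super_ssn TEXT REFERENCES employee(ssn)  -- NULL when there is no supervisor
    );
    INSERT INTO employee VALUES ('1', 'Borg', NULL), ('2', 'Wong', '1'), ('3', 'Smith', '2');
""")

# The same table is joined to itself; the aliases play the two roles.
pairs = conn.execute("""
    SELECT supervisor.lname, supervisee.lname
    FROM employee AS supervisee
    JOIN employee AS supervisor ON supervisee.super_ssn = supervisor.ssn
    ORDER BY supervisee.ssn
""").fetchall()
```

Without the aliases supervisor and supervisee, the join condition could not say which copy of EMPLOYEE plays which role.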
Cardinality Ratios for Binary Relationships
Cardinality Ratio = Maximum number of relationship instances an entity can
participate in.
Examples in diagram:
o WORKS_FOR → Many employees (N) work for one department (1) → N:1.
o MANAGES → One employee manages one department → 1:1.
o CONTROLS → One department controls many projects → 1:N.
o WORKS_ON → Many employees work on many projects → M:N.
o DEPENDENTS_OF → One employee can have many dependents, but each
dependent belongs to only one employee → 1:N.
o SUPERVISION → One supervisor can supervise many employees, but each
employee has one supervisor → 1:N.
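Cardinality ratios influence how relationships are usually stored in tables. The sketch below shows one common mapping, not the only one: a foreign key column for N:1 and 1:N, and a separate table for M:N. Column names and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE department (number INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- WORKS_FOR (N:1): each employee row points at exactly one department.
    CREATE TABLE employee (
        ssn     TEXT PRIMARY KEY,
        dept_no INTEGER NOT NULL REFERENCES department(number)
    );

    -- CONTROLS (1:N): each project row points at its one controlling department.
    CREATE TABLE project (
        number  INTEGER PRIMARY KEY,
        dept_no INTEGER NOT NULL REFERENCES department(number)
    );

    -- WORKS_ON (M:N): a separate table of (employee, project) pairs, carrying Hours.
    CREATE TABLE works_on (
        essn  TEXT    NOT NULL REFERENCES employee(ssn),
        pno   INTEGER NOT NULL REFERENCES project(number),
        hours REAL,
        PRIMARY KEY (essn, pno)
    );

    INSERT INTO department VALUES (5, 'Research');
    INSERT INTO employee VALUES ('111', 5), ('222', 5);
    INSERT INTO project VALUES (10, 5), (20, 5);
    INSERT INTO works_on VALUES ('111', 10, 32.5), ('111', 20, 7.5), ('222', 10, 40.0);
""")

projects_of_111 = [
    r[0] for r in conn.execute("SELECT pno FROM works_on WHERE essn = '111' ORDER BY pno")
]
employees_on_10 = [
    r[0] for r in conn.execute("SELECT essn FROM works_on WHERE pno = 10 ORDER BY essn")
]
```

One employee appears with several projects and one project with several employees, which a single foreign key column could not express.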
Participation Constraints and Existence Dependencies
Total Participation: Every entity must participate in the relationship.
Partial Participation: Some entities may not participate.
In the diagram:
o WORKS_FOR: Total participation of EMPLOYEE (every employee must
work for a dept).
o MANAGES: Partial participation of EMPLOYEE (not every employee is a
manager).
o WORKS_ON: Partial participation (some employees may not work on any
project, some projects may not have employees assigned yet).
o DEPENDENTS_OF: Partial participation (not every employee has
dependents).
o SUPERVISION: Partial participation (not every employee supervises others
or is supervised).
Existence Dependency: An entity is existence dependent if it cannot exist unless it is
related to another entity through a relationship.
o Example:
A DEPENDENT cannot exist without being related to an EMPLOYEE
→ existence dependent.
A DEPARTMENT can exist without a manager initially (before
assignment) → not existence dependent.
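In a relational schema, total participation is commonly approximated with a NOT NULL foreign key, and partial participation with a nullable one. The sketch below assumes that mapping; the column names, the insert_employee helper, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE department (
        number  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        mgr_ssn TEXT              -- nullable: a department may not have a manager yet
    );
    CREATE TABLE employee (
        ssn     TEXT PRIMARY KEY,
        dept_no INTEGER NOT NULL REFERENCES department(number)  -- total participation in WORKS_FOR
    );
    INSERT INTO department VALUES (5, 'Research', NULL);
    INSERT INTO employee VALUES ('111', 5);
""")


def insert_employee(ssn, dept_no):
    """Return True if the employee row is accepted, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute("INSERT INTO employee VALUES (?, ?)", (ssn, dept_no))
        return True
    except sqlite3.IntegrityError:
        return False


no_department = insert_employee("222", None)   # rejected: every employee needs a department
unknown_department = insert_employee("333", 9)  # rejected: department 9 does not exist
valid = insert_employee("444", 5)               # accepted
```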
Weak Entity Type (Definition)
A weak entity type is one that cannot be uniquely identified by its own attributes
alone.
It depends on a strong entity type (called its owner) for identification.
It always participates in a total, identifying relationship with its owner.
It usually has a partial key (an attribute that, combined with the owner’s key,
uniquely identifies it).
Weak Entity in the COMPANY ER Diagram
DEPENDENT is a weak entity type.
o Attributes: Name, Sex, Birth_date, Relationship.
o Problem: Just knowing a dependent’s Name (e.g., “John”) is not unique, since
many employees could have dependents named John.
o To uniquely identify a dependent, we need:
The owner’s key (Employee’s Ssn) + the dependent’s partial key
(Name).
Identifying Relationship:
o DEPENDENTS_OF links DEPENDENT to EMPLOYEE.
o DEPENDENT has total participation in it (a dependent cannot exist without
being related to some employee).
o EMPLOYEE is the owner (strong entity).
o DEPENDENT is the weak entity.
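A weak entity type is typically mapped to a table whose primary key combines the owner's key with the partial key. The sketch below shows one such mapping in SQLite. The column names, the sample rows, and the choice of ON DELETE CASCADE to model the existence dependency are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE employee (ssn TEXT PRIMARY KEY, lname TEXT NOT NULL);
    CREATE TABLE dependent (
        essn         TEXT NOT NULL REFERENCES employee(ssn) ON DELETE CASCADE,
        name         TEXT NOT NULL,      -- partial key
        relationship TEXT,
        PRIMARY KEY (essn, name)         -- owner's key + partial key
    );
    INSERT INTO employee VALUES ('111', 'Smith'), ('222', 'Wong');
    -- Two dependents named John are fine because their owners differ.
    INSERT INTO dependent VALUES ('111', 'John', 'Son'), ('222', 'John', 'Son');
""")

# A dependent whose owner does not exist is rejected.
try:
    conn.execute("INSERT INTO dependent VALUES ('999', 'Amy', 'Daughter')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Deleting the owner removes its dependents as well.
conn.execute("DELETE FROM employee WHERE ssn = '111'")
remaining = conn.execute("SELECT essn, name FROM dependent").fetchall()
```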