Database Normalization - Complete Notes
Normalization is the process of organizing data in a database to minimize redundancy and
improve data integrity. It divides larger tables into smaller, related tables and defines relationships
among them. This helps to avoid anomalies and ensures consistency.
Importance of Normalization:
• Reduces redundancy (avoids storing the same data multiple times).
• Ensures data consistency and accuracy.
• Makes maintenance easier.
• Saves storage and makes updates cheaper, although some read queries then need joins across tables.
• Prevents anomalies in database operations.
Goals of Normalization:
1. Eliminate redundant data.
2. Ensure logical data storage.
3. Improve data integrity.
4. Simplify data maintenance.
5. Prevent anomalies (insertion, update, deletion).
Anomalies due to Poor Design:
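1. Insertion Anomaly: A fact cannot be inserted unless some other, unrelated fact is supplied with it.
2. Update Anomaly: A fact is stored in several rows, so changing it means updating every copy; missing one leaves the data inconsistent.
3. Deletion Anomaly: Deleting a row also removes other information that was stored only in that row.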
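The three anomalies can be reproduced on a small unnormalized table. The sketch below is illustrative only: the `enrollments` rows, the column names, and the instructor names are invented for the example.

```python
# Hypothetical unnormalized table: the instructor is repeated on every enrollment row.
enrollments = [
    {"student": "John", "course": "DBMS", "instructor": "Rao"},
    {"student": "Anita", "course": "DBMS", "instructor": "Rao"},
    {"student": "Anita", "course": "AI", "instructor": "Mehta"},
]

# Update anomaly: changing the DBMS instructor on one row only
# leaves two different instructors recorded for the same course.
enrollments[0]["instructor"] = "Singh"
dbms_instructors = {r["instructor"] for r in enrollments if r["course"] == "DBMS"}

# Deletion anomaly: removing the only AI enrollment also loses who teaches AI.
remaining = [r for r in enrollments if r["course"] != "AI"]
ai_known = any(r["course"] == "AI" for r in remaining)

# Insertion anomaly: a new course with no students needs a placeholder student.
new_course_row = {"student": None, "course": "SQL", "instructor": "Iyer"}

print(dbms_instructors, ai_known, new_course_row)
```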
Functional Dependencies:
• A functional dependency (FD) is a constraint between attributes in a relation.
A → B means that A uniquely determines B: any two rows with the same value of A must also have the same value of B.
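As a minimal sketch, the definition can be tested against sample rows. The helper name `fd_holds` and the sample data are assumptions made for illustration. Note that data can only refute a dependency: an FD is a rule about every valid state of the table, so passing this check on one sample does not prove it.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs.

    rows is a list of dicts; lhs and rhs are lists of column names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical sample rows.
sample = [
    {"Roll_No": 101, "Name": "John", "Dept_ID": "D1", "Dept_Name": "Computer Science"},
    {"Roll_No": 102, "Name": "Anita", "Dept_ID": "D1", "Dept_Name": "Computer Science"},
    {"Roll_No": 103, "Name": "Ravi", "Dept_ID": "D2", "Dept_Name": "Physics"},
]

dept_fd = fd_holds(sample, ["Dept_ID"], ["Dept_Name"])  # consistent with the sample
name_fd = fd_holds(sample, ["Dept_ID"], ["Name"])       # refuted by rows 101 and 102
print(dept_fd, name_fd)
```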
Prime Attribute: Attribute that is part of a candidate key.
Non-Prime Attribute: Attribute not part of any candidate key.
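Partial Dependency: When a non-prime attribute depends on only part (a proper subset) of a composite candidate key.
Full Dependency: When a non-prime attribute depends on the whole composite key and not on any proper subset of it.
Transitive Dependency: When a non-prime attribute depends on the key only indirectly, through another non-prime attribute (Key → A and A → B, where A is not a key).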
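These definitions all rest on knowing the candidate keys, which can be derived from the functional dependencies by computing attribute closures. The following is a brute-force sketch suitable only for small relations; the function names `closure` and `candidate_keys` are made up for this example, and the dependency (Student_ID, Course_ID) → Instructor is an assumption added for illustration.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute determined by attrs under fds.

    fds is a list of (lhs, rhs) pairs of frozensets.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Find the minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys

attributes = {"Student_ID", "Course_ID", "Student_Name", "Instructor"}
fds = [
    (frozenset({"Student_ID"}), frozenset({"Student_Name"})),
    # Assumed for illustration: the instructor depends on the whole key.
    (frozenset({"Student_ID", "Course_ID"}), frozenset({"Instructor"})),
]

keys = candidate_keys(attributes, fds)
prime = set().union(*keys)
non_prime = attributes - prime
print(keys, prime, non_prime)
```

Here Student_Name is non-prime and is determined by Student_ID alone, which is a proper subset of the only candidate key, so it is a partial dependency.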
Normal Forms:
Normalization is carried out in steps called normal forms.
First Normal Form (1NF)
Rule: Ensure atomic values, no repeating groups.
Second Normal Form (2NF)
Rule: Table must be in 1NF and there should be no partial dependency.
Third Normal Form (3NF)
Rule: Table must be in 2NF and there should be no transitive dependency.
Boyce-Codd Normal Form (BCNF)
Rule: Table must be in 3NF and the determinant of every non-trivial functional dependency must be a super key.
Summary of Normal Forms:
1NF → Atomic values, no repeating groups.
2NF → 1NF + No partial dependency.
3NF → 2NF + No transitive dependency.
BCNF → 3NF + Every determinant is a super key.
Normalization and Normal Forms
Normalization is a process of organizing data in a database to minimize redundancy and improve
data integrity. It is done in stages called Normal Forms (1NF, 2NF, 3NF, BCNF). Each form has
stricter rules to eliminate anomalies.
First Normal Form (1NF)
Rule: Each cell should have atomic (indivisible) values, no repeating groups.
Example (Not in 1NF):
Student_ID | Name | Courses
1 | John | DBMS, SQL
2 | Anita | Python, DBMS, AI
Converted to 1NF:
Student_ID | Name | Course
1 | John | DBMS
1 | John | SQL
2 | Anita | Python
2 | Anita | DBMS
2 | Anita | AI
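The conversion above can be sketched as a small function that emits one row per course. The helper name `to_1nf` and its parameters are hypothetical; the data is copied from the example.

```python
# Rows from the example above; "Courses" holds a comma-separated list.
unnormalized = [
    {"Student_ID": 1, "Name": "John", "Courses": "DBMS, SQL"},
    {"Student_ID": 2, "Name": "Anita", "Courses": "Python, DBMS, AI"},
]

def to_1nf(rows, multi_col, single_col, sep=","):
    """Emit one row per value found in the multi-valued column."""
    result = []
    for row in rows:
        for value in row[multi_col].split(sep):
            new_row = {k: v for k, v in row.items() if k != multi_col}
            new_row[single_col] = value.strip()
            result.append(new_row)
    return result

first_nf = to_1nf(unnormalized, "Courses", "Course")
for row in first_nf:
    print(row["Student_ID"], row["Name"], row["Course"], sep=" | ")
```

In the resulting table no single column identifies a row any more; the key becomes the pair (Student_ID, Course).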
Second Normal Form (2NF)
Rule: Table must be in 1NF and have no partial dependencies (every non-prime attribute must depend on the
whole composite key, not on just part of it).
Example (Not in 2NF):
Student_ID | Course_ID | Student_Name | Instructor
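Problem: Student_Name depends only on Student_ID, which is just part of the key (Student_ID, Course_ID), so it is a partial dependency.
Converted to 2NF:
Students Table: Student_ID | Student_Name
Enrollment Table: Student_ID | Course_ID | Instructor
Note: The Enrollment table is in 2NF only if Instructor depends on the whole key (Student_ID, Course_ID), for example when a course has several sections. If Course_ID alone determines Instructor, then Instructor must also move to its own table (Courses Table: Course_ID | Instructor).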
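As a rough sketch, the decomposition is a pair of projections of the original table, and joining them back on Student_ID should reproduce the original rows. The helper `project` and the sample data are invented for illustration, and the sample assumes that Instructor depends on the whole key (Student_ID, Course_ID).

```python
def project(rows, columns):
    """Project rows onto columns, dropping duplicate rows (order preserved)."""
    seen = []
    for row in rows:
        projected = {c: row[c] for c in columns}
        if projected not in seen:
            seen.append(projected)
    return seen

# Hypothetical sample data for the table that is not in 2NF.
enrollment_flat = [
    {"Student_ID": 1, "Course_ID": "C1", "Student_Name": "John", "Instructor": "Rao"},
    {"Student_ID": 1, "Course_ID": "C2", "Student_Name": "John", "Instructor": "Mehta"},
    {"Student_ID": 2, "Course_ID": "C1", "Student_Name": "Anita", "Instructor": "Rao"},
]

students = project(enrollment_flat, ["Student_ID", "Student_Name"])
enrollment = project(enrollment_flat, ["Student_ID", "Course_ID", "Instructor"])

# Join the two tables back on Student_ID to confirm that nothing was lost.
name_by_id = {s["Student_ID"]: s["Student_Name"] for s in students}
rejoined = [{**e, "Student_Name": name_by_id[e["Student_ID"]]} for e in enrollment]

print(students)
print(rejoined == enrollment_flat)
```

Each student name is now stored once, so renaming a student touches a single row.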
Third Normal Form (3NF)
Rule: Table must be in 2NF and have no transitive dependencies (a non-prime attribute must not depend
on another non-prime attribute).
Example (Not in 3NF):
Roll_No | Name | Dept_ID | Dept_Name
Problem: Roll_No → Dept_ID, and Dept_ID → Dept_Name (transitive dependency).
Converted to 3NF:
Students Table: Roll_No | Name | Dept_ID
Departments Table: Dept_ID | Dept_Name
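One way to try the 3NF design is with Python's built-in `sqlite3` module and an in-memory database. The column types, the foreign key, and the sample rows are assumptions added for this sketch. It shows that renaming a department becomes a single-row update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
    CREATE TABLE Departments (
        Dept_ID   TEXT PRIMARY KEY,
        Dept_Name TEXT NOT NULL
    );
    CREATE TABLE Students (
        Roll_No INTEGER PRIMARY KEY,
        Name    TEXT NOT NULL,
        Dept_ID TEXT NOT NULL REFERENCES Departments(Dept_ID)
    );
""")

# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO Departments VALUES (?, ?)",
    [("D1", "Computer Science"), ("D2", "Physics")],
)
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [(101, "John", "D1"), (102, "Anita", "D1"), (103, "Ravi", "D2")],
)

# The department name is stored once, so a rename updates exactly one row.
cursor = conn.execute(
    "UPDATE Departments SET Dept_Name = ? WHERE Dept_ID = ?",
    ("Computer Engineering", "D1"),
)
rows_updated = cursor.rowcount

# A join rebuilds the original four-column view.
report = conn.execute("""
    SELECT s.Roll_No, s.Name, s.Dept_ID, d.Dept_Name
    FROM Students AS s
    JOIN Departments AS d ON d.Dept_ID = s.Dept_ID
    ORDER BY s.Roll_No
""").fetchall()
print(rows_updated, report)
```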
Boyce-Codd Normal Form (BCNF)
Rule: Table must be in 3NF and, for every non-trivial functional dependency (X → Y), X must be a super key.
Example (Not in BCNF):
Course | Instructor | Room
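Dependencies: Course → Room, Instructor → Room. The only candidate key is (Course, Instructor), so neither Course nor Instructor is a super key on its own and both dependencies violate BCNF. (Both are also partial dependencies, so this table fails 2NF as well.)
Converted to BCNF:
Course Table: Course | Instructor
Instructor Table: Instructor | Room
Both tables are in BCNF, and joining them on Instructor recovers the original rows because Instructor → Room. Course → Room can no longer be checked within a single table; a BCNF decomposition does not always preserve every dependency.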
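The BCNF rule can be checked mechanically: compute the closure of each determinant and see whether it covers the whole table. The sketch below uses the dependencies stated in the example. The function names `closure` and `bcnf_violations` are hypothetical, and the dependencies that apply to each decomposed table are listed by hand rather than derived.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Return the non-trivial FDs whose determinant is not a super key."""
    violations = []
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue  # trivial dependency, always allowed
        if closure(lhs, fds) != set(attributes):
            violations.append((lhs, rhs))
    return violations

course_fd = (frozenset({"Course"}), frozenset({"Room"}))
instructor_fd = (frozenset({"Instructor"}), frozenset({"Room"}))

# Original table: neither Course nor Instructor determines every attribute.
before = bcnf_violations({"Course", "Instructor", "Room"}, [course_fd, instructor_fd])

# Decomposed tables, with the dependencies that apply to each listed by hand.
course_table = bcnf_violations({"Course", "Instructor"}, [])
instructor_table = bcnf_violations({"Instructor", "Room"}, [instructor_fd])

print(len(before), course_table, instructor_table)
```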