0% found this document useful (0 votes)

15 views53 pages

Chapter 4

Chapter 4 discusses database design theory and normalization, emphasizing the importance of creating well-structured relations to minimize redundancy and avoid anomalies such as insertion, deletion, and updation issues. It outlines the normalization process, detailing the different normal forms (1NF, 2NF, 3NF, BCNF, and 4NF) and their criteria for ensuring data integrity and efficiency. The chapter also covers functional dependencies and the significance of eliminating transitive dependencies to maintain a robust database structure.

Uploaded by

abcde1marc

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PPT, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

15 views53 pages

Chapter 4

Uploaded by

abcde1marc

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PPT, PDF, TXT or read online on Scribd

You are on page 1/ 53

Chapter-4

DATABASE DESIGN THEORY AND

NORMALIZATION
Outline
Well structure relation
In logical database design, a process called normalization is
used to produce a data model that has the properties of simplicity,
non-redundancy and minimal maintenance.

When designing a database, the analyst should aim to design

well-structured relations/tables.
Cont’d...
A well-structured relation contains a minimum amount of
redundancy and allows users to insert, update and delete the
rows in a table without errors or inconsistencies.

It will be designed by using a technique called normalization.

Normalization is the process of decomposing relations with

anomalies to produce smaller, well-structured relations.
Basic concepts of normalization
A technique of organizing a data into multiple related tables to
minimize data redundancy.

 Through normalization we want to design for our relational

database a set of files that
 Have as little redundancy as possible,

 Permit efficient updates of the data in the database, and

 Avoid the danger of losing data unknowingly.

Data redundancy

What is data redundancy? And why should we reduce it?

Table
Row-1 X
X
Row-2
X
Row-3 X
Row-4
In a relation if a fact is stored more than once ,there is data
redundancy.
Cont’d…
Data redundancy also leads to other issues like:
Insertion , Deletion and Updation problems

Student_table
ID Stu-name Dept Hod Phone-no
Ababa CS Mr. x 0918502888
AAU01
Yadiel CS Mr. x 0918502888
AAU02
John CS Mr. x 0918502888
AAU03
Jonatan CS Mr. x 0918502888
AAU04
Cont’d…
Repetition of data increases the size of databases and leads to
other more issues such as
Insertion anomalies

Deletion anomalies and

Updation anomalies

These types of error can creep into databases that are

insufficiently normalized.
Insertion anomalies

An "Insertion Anomaly" is a failure to place information about a

new database entry into all the places in the database where
information about that new entry needs to be stored.

In a properly normalized database, information about a new entry

needs to be inserted into only one place in the database.

In an inadequately normalized database, information about a new

entry may need to be inserted into more than one place.
Deletion anomalies
A "Deletion Anomaly" is a failure to remove information about
an existing database entry when it is time to remove that entry.

In an inadequately normalized database, information about that

old entry may need to be deleted from more than one place, and,
human fallibility being what it is, some of the needed additional
deletions may be missed.

Loss of related data when some other data is deleted

Updation anomalies
 Modification Anomaly: changing data in a row forces
changes to other rows because of duplication.
 Leads to data inconsistency .

 All three kinds of anomalies are highly undesirable.

 The main purpose of normalization is to reduce the chances

for anomalies to occur in a database
Purpose of normalization
The main goal of database normalization is to restructure the
logical data model of a database to:
Eliminate redundancy
Organize data efficiently
Reduce the potential for data anomalies
Cont’d…

There are well-accepted principles and rules that are used to

normalize a data model.

Types of normalizations are :

First normal form
Second normal form
Third normal form, BCNF.

A databases must be normalize to at least the level of the 3rd Normal

Form (3 NF).
First normal form(1NF)

oA table (relation) is in 1NF if

 There are no duplicated rows in the table.

Each cell is single-valued (i.e., there are no repeating

groups).
 Entries in a column (attribute, field) are of the same
kind.
Each column should have a unique name
Example: Look the following Employee table below
Cont’d…
Second Normal Form (2NF)

A table is in 2NF if it is in 1NF and if all non-key attributes

are dependent on all of the key attributes.

FD = x  y

If t1.x = t2.x then t1.y = t2.y

Functional dependency
Functional dependency is a relationship that exists when one
attribute uniquely determines another attributes.

For a relation R, attribute B is functionally dependent on attribute

A if for every valid instance of A, that value of A uniquely defines
the value of B.

If B is functionally dependent on A we write A  B

The attribute on the left-hand side of the arrow is called a

determinant.
Cont’d...
An attribute may be functionally dependent on two or more attributes

Example

The relation EMP_COURSE (Emp_ID, Course Title,

Date_Completed) can have its functional dependencies described as:
Emp_ID, Course_Title  Date_Completed

i.e., the date of course completed is completely determined by the identity of

the employee and the title of the course
Cont’d...
Other examples:
VIN  Make, Model, Colour, i.e. the make, model and colour
of a vehicle are functionally dependent on the vehicle
identification number

ISBN  Title, First_Author_Name, i.e. the title of a book and

the name of the first author are functionally dependent on the
book’s ISBN
Second normal form (2NF)
2NF = already in 1NF and every non key attribute is fully
functionally dependent on the primary key.

If 1NF will be in 2NF if any one of the following conditions

applies:
Cont’d...
The primary key consists of only one attribute(such as
Emp_ID in Employee)

thus all of the attributes in the relation are components of the

primary key

Every non key attribute is functionally dependent on the full

set of primary key attributes
Cont’d...
Employee is an example of a relation that is not in 2NF.

The primary key for this relation is the composite key Emp_ID
and Course_Title.

Therefore the non key attributes Name, Dept_Name and Salary

are functionally dependent on part of the primary key (Emp_ID)
but not on Course_Title.
Functional Dependencies in Employee

Dependency on entire primary key

EmpID CourseTitle Name DeptName Salary Date_Completed

Dependency on only part of the key

EmpID, CourseTitle  DateCompleted

EmpID  Name, DeptName, Salary

Therefore, NOT in 2nd Normal Form

Partial dependency
Partial functional dependency is a functional dependency in
which one or more nonkey attributes (such as Name) are
functionally dependent on part (but not all) of the primary key

In Employee this creates redundancy in that relation, which

results in anomalies when that table is updated (as noticed
previously: Insertion, Deletion and Modification anomalies)
Converting to 2NF
To do this we decompose the relation into new relations that
satisfy one (or more) of the conditions described above.
Employee relation is decomposed into the following 2 relations:

EMPLOYEE (Emp_ID, Name, Dept_Name, Salary). This

satisfies the first condition above and is in 2NF

EMP_COURSE(Emp_ID, Course_Title, Date_Completed)

This satisfies the third property above and is also in 2NF.
Cont’d…
Decomposed into two separate relations

EmpID Name DeptName Salary

EmpID CourseTitle DateCompleted

Both are full functional dependencies

Third normal form (3NF)
3NF = already in 2NF and no transitive dependencies exist.

Transitive dependency = a functional dependency between two (or more)

nonkey attributes.

E,g. SALES(Cust_ID, Name, Salesperson, Region) – Cust_ID the

primary key so that all the remaining attributes are functionally dependent
on this

However there is a transitive dependency, as Region is dependent on

Salesperson and Salesperson is dependent on Cust_ID

As a result there would be update anomalies for this .

Relation with transitive dependency
Relation with transitive dependency

CustID  Name
CustID  Salesperson BUT
CustID  Region
CustID  Salesperson  Region
All this is OK Transitive dependency
(2nd NF) (not 3rd NF)
Anomalies
Insertion anomaly because a new salesperson assigned to the
‘North Region’ cannot be entered until a customer has been
assigned, because a value for Cust_ID must be provided to insert a
row in the table
Deletion anomaly because if customer 6837 is deleted from the
table, we lose the information that salesperson Hernandez is
assigned to the ‘East Region’
Modification anomaly because if salesperson Smith is reassigned
to the ‘East Region’, several rows must be changed to reflect that
fact.
Removing transitive
dependencies
This can be done by decomposing SALES into two relations
(see following table.)

Salesperson, which is the determinant in the transitive

dependency in SALES, becomes the primary key in Sperson

Salesperson becomes a foreign key in SALES1

Also uses less storage space because the dependent data (such
as Region) do not have to repeat for each customer
Decomposing the SALES relation
Relations in 3NF

Salesperson  Region

CustID  Name
CustID  Salesperson

Now, there are no transitive dependencies, both relations are

in 3rd NF
BCNF
Boyce–Codd normal form (or BCNF or 3.5NF) is a normal form
used in database normalization.

It is a slightly stronger version of the third normal form (3NF).

Rules for BCNF
For a table to satisfy the Boyce-Codd Normal Form, it should satisfy
the following two conditions:
It should be in the Third Normal Form.
And, for any dependency A → B, A should be a super key.
it means, that for a dependency A → B, A cannot be a non-prime
attribute, if B is a prime attribute.
Example
student_id subject professor
101 Java P.Java
101 C++ P.Cpp
102 Java P.Java2
103 C# P.Chash
104 Java P.Java

Student id and subject together form PK. one professor teaches only
one subject, but one subject may have two different professors.

Hence, subject depends on the professor name.

To make this relation(table) satisfy BCNF, we will decompose this table
into two tables, student table and professor table.
Student Table

student_id p_id

And, Professor Table

p_id professor subject

Inference rules(Armstrong’s
Axioms)
The following six rules IR1 through IR6 are well-known inference rules for functional
dependencies:

IR1 (reflexive rule)1: If X ⊇ Y, then X →Y.

IR2 (augmentation rule)2: {X → Y} |=XZ → YZ.

IR3 (transitive rule): {X → Y, Y → Z} |=X → Z.

IR4 (decomposition, or projective rule): {X → YZ} |=X → Y.

IR5 (union, or additive rule): {X → Y, X → Z} |=X → YZ.

IR6 (pseudotransitive rule): {X → Y, WY → Z} |=WX → Z.

4NF
For a table to satisfy the Fourth Normal Form, it should satisfy the following
two conditions:
1. It should be in the Boyce-Codd Normal Form.
2. And, the table should not have any Multi-valued Dependency.
For a dependency A → B, if for a single value of A, multiple value of B
exists, then the table may have multi-valued dependency.
a table should have at-least 3 columns for it to have a multi-valued
dependency.
And, for a relation R(A,B,C), if there is a multi-valued dependency between,
A and B, then B and C should be independent of each other.
If all these conditions are true for any relation(table), it is said to have multi-
valued dependency.
Example
Below we have a college enrolment table with columns s_id, course
and hobby.
student with s_id 1 has opted for two courses, Science and Maths, and
has two hobbies, Cricket and Hockey.

s_id course hobby

1 Science Cricket
1 Maths Hockey
2 C# Cricket
2 Php Hockey
 The two records for student with s_id 1, will give rise to two more
records, as shown below, because for one student, two hobbies exists,
hence along with both the courses, these hobbies should be specified.

s_id course hobby

1 Science Cricket
1 Maths Hockey
1 Science Hockey
1 Maths Cricket
CourseOpted Table Hobbies Table
s_id course s_id hobby
1 Science 1 Cricket
1 Maths 1 Hockey
2 C# 2 Cricket
2 Php 2 Hockey
Attribute closure(closure set)
set of attributes which can be functionally determined from it.
Example: R(A,B,C,D,E)

FD: {A → B, B → C , C → D, D → E }

 A+ = {ABCDE}

AD + = {ADBEC}

B + = {BCDE}
Equivalence of Sets of Functional
Dependencies
 A set of functional dependencies F is said to cover another set of
functional dependencies E if every FD in E is also in F+; that is, if every
dependency in E can be inferred from F; alternatively, we can say that E is
covered by F.

Two sets of functional dependencies E and F are equivalent if E+ = F+.

Therefore, equivalence means that every FD in E can be inferred from F,

and every FD in F can be inferred from E; that is, E is equivalent to F if
both the conditions—E covers F and F covers E—hold.
Minimal Sets(canonical) of
Functional Dependencies
 F to be minimal if it satisfies the following conditions:

1. singleton RHS

2. No extraneous(two or more) attribute in LHS

3. Eliminate redundant
Example
 R(A,B,C,D,E)
FD = { A → D, BC → AD, C → B, E → A, E → D}
Step 1 singleton RHS
{ A → D, BC → A, BC → D, C → B, E → A, E → D}
Step 2 extraneous attribute in LHS
BC → A, BC → D (eliminate either B,C in the first and B,C in the second)

To eliminate first compute B and C closure using the above FD

{ A → D, BC → A, BC → D, C → B, E → A, E → D}

B+ = B , C+ = CBAD so the C closure already contains the B so we can

eliminate B(this means we can reach to B using the C closure)
Example
{ A → D, BC → A, BC → D, C → B, E → A, E → D} eliminate B

= { A → D, C → A, BC → D, C → B, E → A, E → D}

B+ = B , C+ = CBAD so the C closure already contains the B so we

can eliminate B(this means we can reach to B using the C closure)
once again

{ A → D, C → A, BC → D, C → B, E → A, E → D} eliminate B

= { A → D, C → A, C → D, C → B, E → A, E → D}
Example
FD = { A → D, C → A, C → D, C → B, E → A, E → D}

Step 3: starting from first FD finding the redundant one so we take A→D and if A+
contains the D then we can eliminate other wise we can’t. when computing A + assuming
we eliminate A → D we get = A , we can’t eliminate so this stays and move to the second
one.

{ A → D, C → A, C → D, C → B, E → A, E → D} compute C + assuming you eliminate

C → A , C + = CDB it doesn’t contain A so we need this.

{ A → D, C → A, C → D, C → B, E → A, E → D} , compute C +
assuming you
eliminate C → D, C + = CADB , now we can eliminate C → D

Applying this to all FD we get = { A → D, C → A, C → B, E → A}

Properties of Relational Decompositions

In a database, it breaks the table into multiple tables. If the relation has no proper
decomposition, then it may lead to problems like loss of information. Decomposition
is used to eliminate some of the problems of bad design like anomalies,
inconsistencies, and redundancy.

The properties of a relational decomposition are

1.Attribute Preservation: The attributes in R will appear in at least one relation

schema Ri in the decomposition, i.e., no attribute is lost.

2.Dependency preservation :If a decomposition is not dependency preserving some

dependency is lost in decomposition.

If R decomposed into R1,R2,R3 then all the F1,F2,F3 Union should cover FD in R
Cont’d

3. Non Additive Join Property: ensures that no spurious tuples are generated when
a NATURAL JOIN operation is applied to the relations resulting from the
decomposition.

4. No redundancy: Decomposition is used to eliminate some of the problems of bad

design like anomalies, inconsistencies, and redundancy. If the relation has no proper
decomposition, then it may lead to problems like loss of information.

5. Lossless join property: It is the ability to ensure that any instance of the original
relation can be identified from corresponding instances in the smaller relations.
Cont’d

We must carefully consider the problems associated with NULLs when designing a

relational database schema. There is no fully satisfactory relational design theory as

yet that includes NULL values.

A tuple that does not participate in a natural join is called a dangling tuple.
Dangling tuples may indicate a consistency problem in the database.
Example
R (A,B,C,D,E)
FD: {A → B, B → C, C → D, D → A}
Check if these two decompositions preserve FD. R1(A,B,C) and R2(C,D,E) First, find the FD set
for R1
A+ = ABCD (A →BC) Since D IS NOT IN R1 we eliminate D from FD1
B+ = BCDA (B →CA)
C+ = CDAB (C →AB)
AB+ = ABCD (AB →C)
BC+ = BCDA (BC →A) These two are already in the above 3 closure above so FD1 will be
{A →BC, B →CA, C →AB}
Following same process for R2 we get FD2 {C → D, D → C}
Example
Then FD1 U FD2 should be equivalent to FD
i.e FD {A → B, B → C, C → D, D → A} should be a member of
FD1 U FD2 {A →BC, B →CA, C →AB, C → D, D → C}
Therefore, check each FD exist in FD1 U FD2, All the first 3FD exist in the union relation. but what
about the fourth? It does not seem to exist directly so we will check if it exists indirectly by finding
out the determinant D closure
D+ = DCAB since A is a member in closure set of D then we can say D → A is member of FD1 U
FD2 {A →BC, B →CA, C →AB, C → D, D → C}
We can conclude that the decomposed relation preserve dependency.

Air Commands
No ratings yet
Air Commands
400 pages
IP.21 Development & Digitization Guide
No ratings yet
IP.21 Development & Digitization Guide
16 pages
What Is A Relational Database?
No ratings yet
What Is A Relational Database?
13 pages
Normalization
100% (1)
Normalization
51 pages
Database Normalization Guide
No ratings yet
Database Normalization Guide
31 pages
Normalization 1st To 5th NF With Example
No ratings yet
Normalization 1st To 5th NF With Example
33 pages
Normalization: Database Management System
No ratings yet
Normalization: Database Management System
33 pages
Normalization and Denormalization
No ratings yet
Normalization and Denormalization
44 pages
IS202 Data Management: Part 2: Normalization
No ratings yet
IS202 Data Management: Part 2: Normalization
39 pages
Chapter 7 Normalization
No ratings yet
Chapter 7 Normalization
24 pages
Relational Design
No ratings yet
Relational Design
50 pages
Functional Dependency & Normalization Guide
No ratings yet
Functional Dependency & Normalization Guide
46 pages
Topic 07
No ratings yet
Topic 07
56 pages
File Renaming Utility Guide
No ratings yet
File Renaming Utility Guide
23 pages
Normalization
No ratings yet
Normalization
27 pages
Chapter Four
No ratings yet
Chapter Four
47 pages
Lec 5 Normalization
No ratings yet
Lec 5 Normalization
25 pages
Lecture Note 4 - Dependencies and Normalization
No ratings yet
Lecture Note 4 - Dependencies and Normalization
35 pages
Unit Iv Data Normalization: Semantics of Attributes Should Be Easy To Interpret
No ratings yet
Unit Iv Data Normalization: Semantics of Attributes Should Be Easy To Interpret
14 pages
Advanced Databases Course Guide
No ratings yet
Advanced Databases Course Guide
721 pages
MYSQL DAY - 20 (Normalization)
No ratings yet
MYSQL DAY - 20 (Normalization)
13 pages
Unit 4 Relational Database Design
No ratings yet
Unit 4 Relational Database Design
22 pages
Normalization
No ratings yet
Normalization
30 pages
Dbms Assignment ON Normalization: Submitted By, R.Kiruba Sankar
No ratings yet
Dbms Assignment ON Normalization: Submitted By, R.Kiruba Sankar
10 pages
Database Anamolies
No ratings yet
Database Anamolies
39 pages
Chapter 4 DB
No ratings yet
Chapter 4 DB
30 pages
Fundamental of Database CH-5
No ratings yet
Fundamental of Database CH-5
34 pages
Chapter 9 NORMALIZATION
No ratings yet
Chapter 9 NORMALIZATION
45 pages
ch4 Slide
No ratings yet
ch4 Slide
32 pages
DBMS Module 3
No ratings yet
DBMS Module 3
24 pages
ADBMS Lec4
No ratings yet
ADBMS Lec4
35 pages
RDBMS Unit 4
No ratings yet
RDBMS Unit 4
15 pages
DBMS 5 FDB Functional Dependency
No ratings yet
DBMS 5 FDB Functional Dependency
30 pages
Normalization
No ratings yet
Normalization
31 pages
Chapter 4
No ratings yet
Chapter 4
25 pages
Lecture 8 - Normalisation
No ratings yet
Lecture 8 - Normalisation
7 pages
Databases Lecture 5
No ratings yet
Databases Lecture 5
34 pages
Unit 3 (KCS501)
No ratings yet
Unit 3 (KCS501)
13 pages
Library Cataloging Standards Guide
No ratings yet
Library Cataloging Standards Guide
39 pages
PHP and MYSQL Interview Questions and Answers PART 1
No ratings yet
PHP and MYSQL Interview Questions and Answers PART 1
25 pages
Lesson Plan Class: 12: Session 2019 - 2020
No ratings yet
Lesson Plan Class: 12: Session 2019 - 2020
21 pages
Magnetic Storage
No ratings yet
Magnetic Storage
4 pages
Unit 3 1
No ratings yet
Unit 3 1
11 pages
Unit IV Database Normalization
No ratings yet
Unit IV Database Normalization
36 pages
Basededados
No ratings yet
Basededados
6 pages
Lec 5
No ratings yet
Lec 5
12 pages
List of Mysql
No ratings yet
List of Mysql
67 pages
Ramnit Analysis
No ratings yet
Ramnit Analysis
14 pages
Disaster Recovery - Srinivas
No ratings yet
Disaster Recovery - Srinivas
5 pages
Chapter 4
No ratings yet
Chapter 4
25 pages
Semester 3 Winforms Exam Guide
No ratings yet
Semester 3 Winforms Exam Guide
4 pages
Metrics Specification Template
No ratings yet
Metrics Specification Template
29 pages
Dbms Theory Notes Unit IV
No ratings yet
Dbms Theory Notes Unit IV
73 pages
Chapter 9 NORMALIZATION 1
No ratings yet
Chapter 9 NORMALIZATION 1
45 pages
Editorial Board - 2021 - Annals of Anatomy - Anatomischer Anzeiger
No ratings yet
Editorial Board - 2021 - Annals of Anatomy - Anatomischer Anzeiger
1 page
DB Lecture 9&10
No ratings yet
DB Lecture 9&10
50 pages
CLass 12 CS Practical File 2022-23
0% (1)
CLass 12 CS Practical File 2022-23
4 pages
DBMS Unit-III
No ratings yet
DBMS Unit-III
42 pages
How To Diagnose A Problem in The Item Catalog PDF
No ratings yet
How To Diagnose A Problem in The Item Catalog PDF
8 pages
Normalization
No ratings yet
Normalization
30 pages
Data Normalization
No ratings yet
Data Normalization
25 pages
Bhavik and Parth
No ratings yet
Bhavik and Parth
12 pages
Persepsi Pemustaka Terhadap Aplikasi E-Library Di UPT Perpustakaan UIN Raden Fatah Palembang Berdasarkan Teori Information System Success Model
No ratings yet
Persepsi Pemustaka Terhadap Aplikasi E-Library Di UPT Perpustakaan UIN Raden Fatah Palembang Berdasarkan Teori Information System Success Model
9 pages
Analytics Tools-Wps & Sas
100% (1)
Analytics Tools-Wps & Sas
11 pages
Lecture 7 - 8 - Normalization
No ratings yet
Lecture 7 - 8 - Normalization
30 pages
Normalization
No ratings yet
Normalization
33 pages
Database Unit 4 Normilization 1 1
No ratings yet
Database Unit 4 Normilization 1 1
38 pages
Chapter 4
No ratings yet
Chapter 4
45 pages
22it305 - Database Management Systems: 1.1 Relation For Every Entity
No ratings yet
22it305 - Database Management Systems: 1.1 Relation For Every Entity
11 pages
DBMS Notes
No ratings yet
DBMS Notes
27 pages
Chapter 4 - Functional Dependency and Normalization
No ratings yet
Chapter 4 - Functional Dependency and Normalization
17 pages
Allinone
No ratings yet
Allinone
9 pages
Big Data With Spark and Hadoop
No ratings yet
Big Data With Spark and Hadoop
9 pages
12 Normalization
No ratings yet
12 Normalization
41 pages
Relational Database Degin - FD and Normalization
No ratings yet
Relational Database Degin - FD and Normalization
32 pages
DATA4
No ratings yet
DATA4
259 pages
DBMS
No ratings yet
DBMS
8 pages
Log
No ratings yet
Log
178 pages
Dbms Lab Manual Reg2021 24-25
No ratings yet
Dbms Lab Manual Reg2021 24-25
73 pages
HPE0-J68 - Demo
No ratings yet
HPE0-J68 - Demo
16 pages
Chapter 5
No ratings yet
Chapter 5
29 pages
MS Excel Formulas and There Usage On 26-2-2025
No ratings yet
MS Excel Formulas and There Usage On 26-2-2025
6 pages
Bapanapalli Raghavendra Resume
No ratings yet
Bapanapalli Raghavendra Resume
2 pages
Database Normalization
No ratings yet
Database Normalization
10 pages
Unit 3 Relational Database Designs
No ratings yet
Unit 3 Relational Database Designs
36 pages
Chapter 4
No ratings yet
Chapter 4
19 pages
Database Normalization A Step by Step Guide With Examples
No ratings yet
Database Normalization A Step by Step Guide With Examples
12 pages
DB CH 4
No ratings yet
DB CH 4
29 pages