Normalization
Oum Saokosal
Master's Degree in Information Systems, South
Korea
012-252-752 / 010-878-992
[email protected]

Normalization
> The biggest problem to solve in a database is data
redundancy.
» Why is data redundancy a problem? Because it causes:
» Insert Anomaly
» Update Anomaly
» Delete Anomaly
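The three anomalies can be sketched in a few lines of Python using the three sample rows on this slide. The field names (name, subject, degree, tel) and the extra student "Chan Dara" are assumptions made for illustration, not part of the original table.

```python
# One flat table: a student's degree and phone number repeat for every subject.
rows = [
    {"name": "Sok San", "subject": "Database", "degree": "Master's", "tel": "012666777"},
    {"name": "Van Sokhen", "subject": "Database", "degree": "Bachelor's", "tel": "017678678"},
    {"name": "Sok San", "subject": "E-Commerce", "degree": "Master's", "tel": "012666777"},
]


def update_tel_first_match(table, name, new_tel):
    """A careless update that changes only the first matching row."""
    for row in table:
        if row["name"] == name:
            row["tel"] = new_tel
            return


# Update anomaly: the same student now has two different phone numbers.
updated = [dict(r) for r in rows]
update_tel_first_match(updated, "Sok San", "099000111")
conflicting_tels = {r["tel"] for r in updated if r["name"] == "Sok San"}

# Delete anomaly: dropping Van Sokhen's only subject also erases his details.
after_delete = [r for r in rows if r["name"] != "Van Sokhen" or r["subject"] != "Database"]
van_sokhen_known = any(r["name"] == "Van Sokhen" for r in after_delete)

# Insert anomaly: a student with no subject yet needs an empty placeholder.
# (Hypothetical student, for illustration only.)
new_student = {"name": "Chan Dara", "subject": None, "degree": "Bachelor's", "tel": "011222333"}

print(conflicting_tels, van_sokhen_known, new_student["subject"])
```

Each problem comes from storing the same fact (a student's phone number) in more than one row.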
Name        Subject     Degree      Tel
Sok San     Database    Master's    012666777
Van Sokhen  Database    Bachelor's  017678678
Sok San     E-Commerce  Master's    012666777

Normalization (Cont.)
> Normalization is the process of removing redundant data
from your tables to improve storage efficiency, data
integrity, and scalability.
» Normalization generally involves splitting existing tables
into multiple ones, which must be re-joined or linked
each time a query is issued.
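A minimal sketch of "split, then re-join" using Python's built-in sqlite3 module and the student rows from the previous slide. The table names, column names, and the student_id surrogate key are assumptions chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Student facts are stored once per student.
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        degree     TEXT NOT NULL,
        tel        TEXT NOT NULL
    );
    -- One row per (student, subject) pair.
    CREATE TABLE enrolment (
        student_id INTEGER NOT NULL REFERENCES student(student_id),
        subject    TEXT NOT NULL,
        PRIMARY KEY (student_id, subject)
    );
    INSERT INTO student VALUES (1, 'Sok San', 'Master''s', '012666777');
    INSERT INTO student VALUES (2, 'Van Sokhen', 'Bachelor''s', '017678678');
    INSERT INTO enrolment VALUES (1, 'Database');
    INSERT INTO enrolment VALUES (2, 'Database');
    INSERT INTO enrolment VALUES (1, 'E-Commerce');
""")

# The original flat view is rebuilt with a join whenever it is needed.
joined = conn.execute("""
    SELECT s.name, e.subject, s.degree, s.tel
    FROM enrolment AS e
    JOIN student AS s ON s.student_id = e.student_id
    ORDER BY e.subject, s.name
""").fetchall()

for row in joined:
    print(row)
```

The join costs a little at query time, but each phone number now lives in exactly one row.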
» Why normalization?
The relation derived from a user view or data store will most
likely be unnormalized.
The problem usually arises when an existing system keeps its
data in unstructured files, e.g. MS Excel spreadsheets.

Steps of Normalization
> First Normal Form (1NF)
» Second Normal Form (2NF)
> Third Normal Form (3NF)
» Boyce-Codd Normal Form (BCNF)
> Fourth Normal Form (4NF)
» Fifth Normal Form (5NF)
In practice, 1NF, 2NF, and 3NF are enough for most databases.

First Normal Form (1NF)
The official qualifications for 1NF are:
Each attribute name must be unique.
Each attribute value must be atomic (a single value).
Each row must be unique.
There are no repeating groups.
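The qualifications above can be turned into a rough automated check. The sketch below is only an approximation: it assumes a table is given as a header list plus row lists, and it treats a list, tuple, or set inside a cell as a non-atomic value. The function name and sample rows are hypothetical.

```python
def first_normal_form_problems(header, rows):
    """Return a list of 1NF problems found in a table (empty list if none)."""
    problems = []
    if len(set(header)) != len(header):
        problems.append("attribute names are not unique")
    for i, row in enumerate(rows):
        for name, value in zip(header, row):
            # A collection inside one cell means the value is not atomic.
            if isinstance(value, (list, tuple, set)):
                problems.append(f"row {i}: {name} holds more than one value")
    # repr() makes rows comparable even when a cell holds a list.
    if len({repr(row) for row in rows}) != len(rows):
        problems.append("rows are not unique")
    return problems


header = ["group", "topic", "student"]
not_1nf = [
    ["A", "Intro MongoDB", ["Sok San", "Sao Ry"]],
    ["B", "Intro MySQL", ["Chan Tina", "Tith Sophea"]],
]
in_1nf = [
    ["A", "Intro MongoDB", "Sok San"],
    ["A", "Intro MongoDB", "Sao Ry"],
    ["B", "Intro MySQL", "Chan Tina"],
    ["B", "Intro MySQL", "Tith Sophea"],
]

print(first_normal_form_problems(header, not_1nf))
print(first_normal_form_problems(header, in_1nf))
```

A check like this cannot see repeating groups that are spread over several columns (for example student1, student2), so it complements a manual review rather than replacing it.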
» Additional:
Choose a primary key.
» Reminder:
A primary key is unique, not null, and unchanging. A primary
key can be either a single attribute or a combination of attributes.

First Normal Form (1NF) (Cont.)
» Example of a table not in 1NF:
Group    Topic          Student      Score
Group A  Intro MongoDB  Sok San      18 marks
                        Sao Ry       17 marks
Group B  Intro MySQL    Chan Tina    19 marks
                        Tith Sophea  16 marks
It violates 1NF because:
» Attribute values are not atomic (e.g. Student holds both family and given name).
» Repeating groups exist.

First Normal Form (1NF) (Cont.)
» After eliminating:
Group Topic Family Name Given Name
A Intro MongoDB Sok San
A Intro MongoDB Sao Ry
B Intro MySQL Chan Tina
B Intro MySQL Tith Sophea
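The elimination step can be sketched in Python as follows. The sketch assumes the blank Group and Topic cells of the unnormalized table are read in as None, and it drops the Score column so that the output matches the table above; the function name is hypothetical.

```python
# The unnormalized rows: blank cells continue the group from the row above.
raw = [
    ("Group A", "Intro MongoDB", "Sok San", "18 marks"),
    (None, None, "Sao Ry", "17 marks"),
    ("Group B", "Intro MySQL", "Chan Tina", "19 marks"),
    (None, None, "Tith Sophea", "16 marks"),
]


def to_first_normal_form(rows):
    """Fill repeated group cells downward and split the student name."""
    result = []
    group = topic = None
    for g, t, student, _score in rows:
        if g is not None:
            # "Group A" -> "A"; remember it for the rows that follow.
            group, topic = g.split()[-1], t
        family_name, given_name = student.split(" ", 1)
        result.append((group, topic, family_name, given_name))
    return result


for row in to_first_normal_form(raw):
    print(row)
```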
> Now it is in 1NF.
However, it might still violate 2NF and so on.

Functional Dependencies
We say an attribute B has a functional dependency on another
attribute A if any two records that have the same value for A
must also have the same value for B. We write this as:
A → B (read as: A determines B, or B depends on A)
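The definition translates directly into a small check over a set of records. In this sketch the function name `holds` is hypothetical, and the e-mail addresses are made-up placeholders, since the addresses on the slides are not readable.

```python
def holds(records, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in records."""
    seen = {}
    for rec in records:
        key = tuple(rec[a] for a in lhs)
        value = tuple(rec[b] for b in rhs)
        # Two records agree on lhs but differ on rhs: the FD is violated.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical e-mail addresses, for illustration only.
records = [
    {"name": "Sok San", "project": "POS Mart Sys", "email": "sok.san@example.com"},
    {"name": "Sao Ry", "project": "Univ Mgt Sys", "email": "sao.ry@example.com"},
    {"name": "Sok San", "project": "Web Redesign", "email": "sok.san@example.com"},
    {"name": "Chan Sokna", "project": "POS Mart Sys", "email": "chan.sokna@example.com"},
    {"name": "Sao Ry", "project": "DB Design", "email": "sao.ry@example.com"},
]

print(holds(records, ["name"], ["email"]))    # name -> email
print(holds(records, ["name"], ["project"]))  # name -> project
```

Note that a check on sample data can only disprove a dependency. Whether an FD truly holds is a rule about the business, not about the rows that happen to be stored.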
Employee Name  Project       Email
Sok San        POS Mart Sys  [email protected]
Sao Ry         Univ Mgt Sys  [email protected]
Sok San        Web Redesign  [email protected]
Chan Sokna     POS Mart Sys  [email protected]
Sao Ry         DB Design     [email protected]
employee name → email address

Functional Dependencies (cont.)
EmpNum  EmpEmail          EmpFname  EmpLname
123     [email protected]  John      Doe
456     [email protected]  Peter     Smith
555     [email protected]  Alan      Lee
633     [email protected]  Peter     Doe
787     [email protected]  Alan      Lee
If EmpNum is the PK, then the FD
EmpNum → EmpEmail, EmpFname, EmpLname
must hold.

Functional Dependencies (cont.)
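Three different ways you might see this FD written:
1. On one line:
   EmpNum → EmpEmail, EmpFname, EmpLname
2. As a branching arrow:
              → EmpEmail
   EmpNum ----→ EmpFname
              → EmpLname
3. As arrows drawn beneath the table heading, from EmpNum to each dependent attribute:
   EmpNum | EmpEmail | EmpFname | EmpLname

Determinant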
Functional Dependency
EmpNum → EmpEmail
The attribute on the left-hand side is known as the
determinant.
+ EmpNum is a determinant of EmpEmail.

Second Normal Form (2NF)
The official qualifications for 2NF are:
1. The table is already in 1NF.
2. All nonkey attributes are fully dependent on the primary
key.
All partial dependencies are removed and placed in another table.

Example of a table not in 2NF:
CourseID  SemesterID  Num Student  Course Name
IT101     201301      25           Database
IT101     201302      25           Database
IT102     201301      30           Web Prog
IT102     201302      35           Web Prog
IT103     201401      20           Networking
Primary key: {CourseID, SemesterID}
The Course Name depends only on CourseID, a part of the primary key,
not on the whole primary key {CourseID, SemesterID}. This is called a
partial dependency.
Solution:
Remove CourseID and Course Name together to create a new table.
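A minimal Python sketch of this solution, using the five rows of the example table. It assumes the garbled course codes read IT101 to IT103; the variable names are chosen for this example.

```python
# (CourseID, SemesterID, Num Student, Course Name); key is (CourseID, SemesterID).
course_semester_full = [
    ("IT101", "201301", 25, "Database"),
    ("IT101", "201302", 25, "Database"),
    ("IT102", "201301", 30, "Web Prog"),
    ("IT102", "201302", 35, "Web Prog"),
    ("IT103", "201401", 20, "Networking"),
]

# New table: Course Name depends only on CourseID, so keep one row per CourseID.
course = {}
for course_id, _semester, _num, course_name in course_semester_full:
    course.setdefault(course_id, course_name)

# Remaining table: only attributes that depend on the whole key.
course_semester = [(c, s, n) for c, s, n, _name in course_semester_full]

# Joining the two tables on CourseID gives back the original rows.
rejoined = [(c, s, n, course[c]) for c, s, n in course_semester]

print(course)
print(rejoined == course_semester_full)
```

Using a dictionary keyed on CourseID removes the duplicate course rows in the same step as the split.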
CourseID  Course Name
IT101     Database
IT101     Database
IT102     Web Prog
IT102     Web Prog
IT103     Networking

CourseID  SemesterID  Num Student
IT101     201301      25
IT101     201302      25
IT102     201301      30
IT102     201302      35
IT103     201401      20

Done? Oh no, the new course table is still not in 1NF yet.
Remove the duplicate rows too.
Finally, connect the two tables with a relationship on CourseID.

CourseID  Course Name
IT101     Database
IT102     Web Prog
IT103     Networking

Third Normal Form (3NF)
The official qualifications for 3NF are:
1. The table is already in 2NF.
2. Nonprimary key attributes do not depend on other
nonprimary key attributes
(i.e. no transitive dependencies).
All transitive dependencies are removed and placed in
another table.

Example of a table not in 3NF:
ID  Course Name  Teacher Name  Teacher Tel
1   Database     Sok Piseth    012 123 456
2   Database     Sao Kanha     0977 322 111
3   Web Prog     Chan Veasna   012 412 333
4   Web Prog     Chan Veasna   012 412 333
5   Networking   Pou Sambath   077 545 221
Primary key: ID
Teacher Tel is a nonkey attribute, and
Teacher Name is also a nonkey attribute.
But Teacher Tel depends on Teacher Name.
This is called a transitive dependency.
Solution:
Remove Teacher Name and Teacher Tel together
to create a new table.
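A minimal Python sketch of this solution. It assumes the rows are numbered 1 to 5 by an ID primary key, that the second phone number reads 0977 322 111, and that teachers receive generated identifiers T1, T2, and so on; all variable names are chosen for this example.

```python
# (ID, Course Name, Teacher Name, Teacher Tel); ID is the assumed primary key.
courses = [
    (1, "Database", "Sok Piseth", "012 123 456"),
    (2, "Database", "Sao Kanha", "0977 322 111"),
    (3, "Web Prog", "Chan Veasna", "012 412 333"),
    (4, "Web Prog", "Chan Veasna", "012 412 333"),
    (5, "Networking", "Pou Sambath", "077 545 221"),
]

teacher_ids = {}   # (Teacher Name, Teacher Tel) -> generated Teacher ID
course_table = []  # (ID, Course Name, Teacher ID)
for row_id, course_name, teacher_name, teacher_tel in courses:
    # Reuse the ID if this teacher was seen before, otherwise issue the next one.
    teacher_id = teacher_ids.setdefault(
        (teacher_name, teacher_tel), f"T{len(teacher_ids) + 1}"
    )
    course_table.append((row_id, course_name, teacher_id))

# One row per teacher: the repeated Chan Veasna row has disappeared.
teacher_table = [(tid, name, tel) for (name, tel), tid in teacher_ids.items()]

print(course_table)
print(teacher_table)
```

A generated Teacher ID is used instead of Teacher Name as the key because two teachers could share a name.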
Done?
Oh no, the new teacher table is still not in 1NF yet. Remove the repeating row.

Teacher Name  Teacher Tel
Sok Piseth    012 123 456
Sao Kanha     0977 322 111
Chan Veasna   012 412 333
Chan Veasna   012 412 333
Pou Sambath   077 545 221

After removing the repeating row:

Teacher Name  Teacher Tel
Sok Piseth    012 123 456
Sao Kanha     0977 322 111
Chan Veasna   012 412 333
Pou Sambath   077 545 221

Finally, connect the two tables through Teacher ID:

ID  Course Name  Teacher ID
1   Database     T1
2   Database     T2
3   Web Prog     T3
4   Web Prog     T3
5   Networking   T4

Teacher ID  Teacher Name  Teacher Tel
T1          Sok Piseth    012 123 456
T2          Sao Kanha     0977 322 111
T3          Chan Veasna   012 412 333
T4          Pou Sambath   077 545 221

Note about primary key:
- In theory, you can choose Teacher Name to be the primary key.
- But in practice, you should add Teacher ID as the primary key.

Boyce-Codd Normal Form (BCNF) - 3.5NF
The official qualifications for BCNF are:
1. The table is already in 3NF.
2. All determinants must be superkeys.
All determinants that are not superkeys are removed and placed in
another table.

Boyce-Codd Normal Form (BCNF) (Cont.)
> Example of a table not in BCNF:
Student  Course      Teacher
Sok      DB          John
Sao      DB          William
Chan     E-Commerce  Todd
Sok      E-Commerce  Todd
Chan     DB          William
» Key: {Student, Course}
» Functional Dependency:
» {Student, Course} → Teacher
» Teacher → Course
Problem: Teacher is not a superkey but determines Course.
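Whether a determinant is a superkey can be checked mechanically with the standard attribute-closure computation. The sketch below applies it to the two functional dependencies listed above; the function names `closure` and `bcnf_violations` are hypothetical.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Return the non-trivial FDs whose determinant is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(all_attrs)
    ]


attrs = {"Student", "Course", "Teacher"}
fds = [
    (("Student", "Course"), ("Teacher",)),
    (("Teacher",), ("Course",)),
]

print(closure(("Teacher",), fds))
print(bcnf_violations(attrs, fds))
```

A common textbook repair is to decompose on the violating dependency, giving (Teacher, Course) and (Student, Teacher). That differs from the split shown on the next slide, so treat it as an alternative rather than as the author's method.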
Solution: Split Teacher and Course out of the original table into a
new table, leaving (Student, Course) in the original. Finally, connect
the new and old tables to a third table that contains Course.

Student  Course
Sok      DB
Sao      DB
Chan     E-Commerce
Sok      E-Commerce
Chan     DB

Course
DB
E-Commerce

Course      Teacher
DB          John
DB          William
E-Commerce  Todd

Fourth Normal Form (4NF)
The official qualifications for 4NF are:
1. The table is already in BCNF.
2. The table contains no multi-valued dependencies.
> Multi-valued dependency: MVDs occur when two
or more independent multi-valued facts about the
same attribute occur within the same table.
A →→ B (B is multi-valued dependent on A)

Fourth Normal Form (4NF) (Cont.)
> Example of a table not in 4NF:
Student  Major  Hobby
Sok      IT     Football
Sok      IT     Volleyball
Sao      IT     Football
Sao      Med    Football
Chan     IT     NULL
Puth     NULL   Football
Tith     NULL   NULL
» Key: {Student, Major, Hobby}
» MVD: Student →→ Major, Hobby
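Because a student's majors and hobbies are independent facts, the table can be split by projecting it onto (Student, Major) and (Student, Hobby). A minimal Python sketch follows, with NULL represented as None; the helper name `project` is hypothetical.

```python
# (Student, Major, Hobby) rows from the example; None stands for NULL.
rows = [
    ("Sok", "IT", "Football"),
    ("Sok", "IT", "Volleyball"),
    ("Sao", "IT", "Football"),
    ("Sao", "Med", "Football"),
    ("Chan", "IT", None),
    ("Puth", None, "Football"),
    ("Tith", None, None),
]


def project(table, *indexes):
    """Keep only the given column positions and drop duplicate rows."""
    # dict.fromkeys removes duplicates while keeping first-seen order.
    return list(dict.fromkeys(tuple(r[i] for i in indexes) for r in table))


student_major = project(rows, 0, 1)
student_hobby = project(rows, 0, 2)
students = [s for (s,) in project(rows, 0)]

print(student_major)
print(student_hobby)
print(students)
```

After the split, adding a hobby for a student takes one row, whatever the number of majors.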
Solution: Split the table so that each new table contains one MVD.
Finally, connect each to a third table that contains Student.

Student  Major
Sok      IT
Sao      IT
Sao      Med
Chan     IT
Puth     NULL
Tith     NULL

Student
Sok
Sao
Chan
Puth
Tith

Student  Hobby
Sok      Football
Sok      Volleyball
Sao      Football
Chan     NULL
Puth     Football
Tith     NULL

Fifth Normal Form (5NF)
The official qualifications for 5NF are:
1. The table is already in 4NF.
2. The attributes of multi-valued dependencies are related.

Fifth Normal Form (5NF) (Cont.)
> Example of a table not in 5NF:
Seller  Company           Product
Sok     MIAF Trading      Zenya
Sao     Coca-Cola Corp    Coke
Sao     Coca-Cola Corp    Fanta
Sao     Coca-Cola Corp    Sprite
Chan    Angkor Brewery    Angkor Beer
Chan    Cambodia Brewery  Cambodia Beer
> Key: {Seller, Company, Product}
» MVD: Seller →→ Company, Product
» Product is related to Company.
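5NF is usually stated in terms of join dependencies: the table can be replaced by smaller projections if joining them reproduces exactly the original rows. The sketch below, with variable names chosen for this example, projects the table onto its three attribute pairs and checks that the three-way join is lossless.

```python
# (Seller, Company, Product) rows from the example table.
sales = {
    ("Sok", "MIAF Trading", "Zenya"),
    ("Sao", "Coca-Cola Corp", "Coke"),
    ("Sao", "Coca-Cola Corp", "Fanta"),
    ("Sao", "Coca-Cola Corp", "Sprite"),
    ("Chan", "Angkor Brewery", "Angkor Beer"),
    ("Chan", "Cambodia Brewery", "Cambodia Beer"),
}

# The three pairwise projections.
seller_company = {(s, c) for s, c, _p in sales}
company_product = {(c, p) for _s, c, p in sales}
seller_product = {(s, p) for s, _c, p in sales}

# Natural join of the three projections.
rejoined = {
    (s, c, p)
    for s, c in seller_company
    for c2, p in company_product
    if c == c2 and (s, p) in seller_product
}

# True when the join adds no spurious rows and loses none.
lossless = rejoined == sales
print(lossless)
```

If the join produced rows that were never in the original table, the three-way split would lose information and the table should be left as it is.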
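Seller
Sok
Sao
Chan

Seller  Company
Sok     MIAF Trading
Sao     Coca-Cola Corp
Chan    Angkor Brewery
Chan    Cambodia Brewery

Company
MIAF Trading
Coca-Cola Corp
Angkor Brewery
Cambodia Brewery

Company           Product
MIAF Trading      Zenya
Coca-Cola Corp    Coke
Coca-Cola Corp    Fanta
Coca-Cola Corp    Sprite
Angkor Brewery    Angkor Beer
Cambodia Brewery  Cambodia Beer

Seller  Product
Sok     Zenya
Sao     Coke
Sao     Fanta
Sao     Sprite
Chan    Angkor Beer
Chan    Cambodia Beer

Product
Zenya
Coke
Fanta
Sprite
Angkor Beer
Cambodia Beer

Relationships: Seller, Company, and Product are each on the 1 side of a
1-to-M relationship with the two-column tables that reference them.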