Chapter 4
Functional Dependency and Normalization
Outline
Convert ER to relations
Normalization
Physical database design
Converting ER Diagram to Relations
Three basic rules are used to convert an ER diagram into tables.
For a relationship with one-to-one cardinality
Either merge all the attributes into a single table, or post the
primary key or candidate key of one relation as a foreign key in the
other.
For a relationship with one-to-many cardinality
Post the primary key or candidate key of the “one” side as a foreign
key attribute on the “many” side.
For a relationship with many-to-many cardinality
Create a new table (the associative entity) and post the primary
key or candidate key from each entity as attributes in the new table,
along with any additional attributes (if applicable).
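As a rough sketch of the one-to-many and many-to-many rules above, the following Python snippet uses the standard sqlite3 module. The Department, Employee, Project and Works_On tables, their columns, and the sample rows are hypothetical names chosen for illustration; they do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
-- One-to-many: the key of the "one" side (Department) is posted
-- as a foreign key on the "many" side (Employee).
CREATE TABLE Department (
    dept_id   INTEGER PRIMARY KEY,
    dept_name TEXT NOT NULL
);
CREATE TABLE Employee (
    emp_id   INTEGER PRIMARY KEY,
    emp_name TEXT NOT NULL,
    dept_id  INTEGER NOT NULL REFERENCES Department(dept_id)
);
-- Many-to-many: a new associative table (Works_On) holds the keys
-- of both entities plus an additional attribute (hours).
CREATE TABLE Project (
    proj_no   INTEGER PRIMARY KEY,
    proj_name TEXT NOT NULL
);
CREATE TABLE Works_On (
    emp_id  INTEGER NOT NULL REFERENCES Employee(emp_id),
    proj_no INTEGER NOT NULL REFERENCES Project(proj_no),
    hours   REAL,
    PRIMARY KEY (emp_id, proj_no)
);
""")

conn.execute("INSERT INTO Department VALUES (1, 'Research')")
conn.execute("INSERT INTO Employee VALUES (10, 'Jemal', 1)")
conn.execute("INSERT INTO Project VALUES (100, 'Survey')")
conn.execute("INSERT INTO Works_On VALUES (10, 100, 12.5)")


def insert_is_rejected(sql):
    """Return True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Department 99 does not exist, so the foreign key should reject this row.
orphan_rejected = insert_is_rejected("INSERT INTO Employee VALUES (11, 'Rahel', 99)")
```

The composite primary key on Works_On is one possible design choice; it prevents the same employee and project pair from being recorded twice.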
Cont’d
Mapping Regular Entities to Relations
Simple attributes: ER attributes map directly onto the relation.
Composite attributes: Use only their simple component attributes.
Multi-valued attributes: Each becomes a separate relation with a foreign
key taken from the owning entity.
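The three mapping rules above can be sketched in Python as follows. The entity, its attribute names, and the surname "Ahmed" are hypothetical and are used only to illustrate the idea.

```python
# Hypothetical entity instance: "name" is composite, "phones" is multi-valued.
employee = {
    "emp_id": 14,
    "name": {"first": "Jemal", "last": "Ahmed"},
    "phones": ["0972826385", "0964738238"],
}


def map_entity(entity):
    """Map one entity instance to a main row plus rows of a separate phone relation."""
    main_row = {
        "emp_id": entity["emp_id"],             # simple attribute: copied directly
        "first_name": entity["name"]["first"],  # composite: keep only the components
        "last_name": entity["name"]["last"],
    }
    phone_rows = [
        {"emp_id": entity["emp_id"], "phone": phone}  # foreign key back to the owner
        for phone in entity["phones"]
    ]
    return main_row, phone_rows


main_row, phone_rows = map_entity(employee)
```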
Functional Dependency
A functional dependency (FD) is a relationship that exists between two
attributes or sets of attributes. It typically exists between the primary
key and a non-key attribute within a table. It is written as:
X → Y
The left side of an FD is known as the determinant; the right side is
known as the dependent.
For example:
Assume we have an employee table with the attributes Emp_Id,
Emp_Name and Emp_Address.
Here the Emp_Id attribute uniquely identifies the Emp_Name attribute
of the employee table, because if we know the Emp_Id, we can tell the
employee name associated with it.
Functional dependency can be written as:-
Emp_Id → Emp_Name
We can say that Emp_Name is functionally dependent on Emp_Id.
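A minimal sketch of how such a dependency could be checked against sample data is shown below. The function name fd_holds and the sample rows are assumptions made for illustration. Note that a table instance can only refute a functional dependency; it cannot prove that the dependency holds for all possible data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs is not violated by any pair of rows."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # The same determinant value must always map to the same dependent value.
        if seen.setdefault(x, y) != y:
            return False
    return True


# Hypothetical sample rows for the employee table.
employees = [
    {"Emp_Id": 1, "Emp_Name": "Jemal", "Emp_Address": "Addis Ababa"},
    {"Emp_Id": 2, "Emp_Name": "Rahel", "Emp_Address": "Hawassa"},
    {"Emp_Id": 3, "Emp_Name": "Jemal", "Emp_Address": "Semera"},
]

id_determines_name = fd_holds(employees, ["Emp_Id"], ["Emp_Name"])
name_determines_id = fd_holds(employees, ["Emp_Name"], ["Emp_Id"])
```

In this sample, Emp_Id → Emp_Name is not violated, while Emp_Name → Emp_Id is violated because two different employees share the name Jemal.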
Functional Dependency (FD)
Two data items A and B are said to be in a determinant-dependent
relationship if a certain value of data item B always appears with a
certain value of data item A.
If the data item A is the determinant data item and B the dependent data
item then the direction of the association is from A to B and not vice versa.
The essence of this idea is that if the existence of something, call it A,
implies that B must exist and have a certain value, then we say that "B is
functionally dependent on A."
We also often express this idea by saying that "A determines B," or that "B
is a function of A," or that "A functionally governs B." Often, the notions
of functionality and functional dependency are expressed briefly by the
statement, "If A, then B."
The notation is A → B, which is read as: B is functionally dependent
on A.
Partial Dependency
If an attribute which is not a member of the primary key
is dependent on some part of the primary key (if we have
composite primary key) then that attribute is partially
functionally dependent on the primary key.
Let {A,B} be the primary key and C a non-key attribute.
If {A,B} → C and B → C hold, but A → C does not hold,
then C is partially functionally dependent on {A,B}.
Full Dependency
If an attribute which is not a member of the primary key depends on
the whole key and not on any part of it (when we have a composite
primary key), then that attribute is fully functionally dependent on
the primary key.
Let {A,B} be the primary key and C a non-key attribute.
If {A,B} → C holds, but neither B → C nor A → C holds,
then C is fully functionally dependent on {A,B}.
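The difference between partial and full dependency can be sketched with an attribute-closure computation. The helper names closure and dependency_kind are assumptions for this illustration, and the FD sets mirror the {A,B} and C example above.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def dependency_kind(key, attr, fds):
    """Classify attr as 'partial' or 'full' with respect to a composite key."""
    # Partial: some proper, non-empty subset of the key already determines attr.
    for size in range(1, len(key)):
        for part in combinations(key, size):
            if attr in closure(part, fds):
                return "partial"
    # Full: only the whole key determines attr.
    return "full" if attr in closure(key, fds) else "none"


key = ("A", "B")
fds_partial = [(("A", "B"), ("C",)), (("B",), ("C",))]  # B alone determines C
fds_full = [(("A", "B"), ("C",))]                       # only {A,B} determines C

kind_partial = dependency_kind(key, "C", fds_partial)
kind_full = dependency_kind(key, "C", fds_full)
```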
Transitive Dependency
In mathematics and logic, a transitive relationship is a relationship of
the following form: "If A implies B, and if also B implies C, then A
implies C."
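As a small illustration, if each value of A maps to one value of B, and each value of B maps to one value of C, then composing the two mappings shows that A also determines C. The values a1, b1, c1 and so on are placeholders, not data from the slides.

```python
# A -> B: each A value is associated with exactly one B value.
a_to_b = {"a1": "b1", "a2": "b1", "a3": "b2"}
# B -> C: each B value is associated with exactly one C value.
b_to_c = {"b1": "c1", "b2": "c2"}

# Composing the two mappings gives A -> C, the transitive dependency.
a_to_c = {a: b_to_c[b] for a, b in a_to_b.items()}
```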
Normalization
What is Normalization?
Normalization is the process of decomposing the relations into
relations with fewer attributes.
It is the process of organizing the data in the database that is used to
minimize the redundancy from a relation or set of relations.
It is also used to eliminate Insertion, Update, and Deletion
Anomalies.
Normalization divides a larger table into smaller tables and links them
using relationships.
A large, unnormalized database may contain duplicated data. This
repetition makes relations very large and difficult to maintain and
update, wastes disk space and other resources, and increases the
probability of errors and inconsistencies.
To handle these problems, we should analyze the relations with redundant
data and decompose them into smaller, simpler, and well-structured
relations that satisfy desirable properties.
Objective of Normalization
The main reason for normalizing the relations is removing anomalies.
Failure to eliminate anomalies leads to data redundancy and can cause
data integrity and other problems as the database grows.
Normalization consists of a series of guidelines that help you
create a good database structure.
Types of Anomalies
A. Insert Anomaly:- An insert anomaly occurs in a relational database when
some data items cannot be inserted without the presence of other data. For
example, in the Student table below, a new CourseID cannot be inserted until
some student enrolls in that course. This makes it difficult to insert a new
record into the table, hence the name insertion anomaly.
In addition, the records of two students in the table below, Jemal and Rahel,
are repeated each time a new CourseID is entered for them, which repeats the
StudRegistration, StudName and Address attributes.
Cont..
B. Update Anomaly:- This anomaly occurs when duplicated data is
updated in only one place and not in all instances, leaving the table
in an inconsistent state. For example, the student Jemal appears in
more than one row of the Student table. If we want to update one of his
details, such as his address, we must make the same change in every row
for Jemal; otherwise some rows will show the updated value and others
will not.
C. Delete Anomaly:- This anomaly occurs when some data is lost
unintentionally because other records are deleted. For example, if we
remove Teshager from the Student table, we also remove his address,
course and other details. Therefore, deleting one record can remove
other information from the database table.
Cont..
Student Table:
StudRegistration  CourseID  StudName  Address      Course
205               6204      Jemal     Los Angeles  Economics
205               6247      Jemal     Los Angeles  Economics
224               6247      Teshager  New York     Mathematics
230               6204      Rahel     Egypt        Computer
230               6208      Rahel     Egypt        Accounts
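The update and delete anomalies can be reproduced on the Student table above with a few lines of Python. This is only a sketch: the new address "Chicago" and the helper addresses_of are assumptions introduced for illustration.

```python
students = [
    {"StudRegistration": 205, "CourseID": 6204, "StudName": "Jemal",
     "Address": "Los Angeles", "Course": "Economics"},
    {"StudRegistration": 205, "CourseID": 6247, "StudName": "Jemal",
     "Address": "Los Angeles", "Course": "Economics"},
    {"StudRegistration": 224, "CourseID": 6247, "StudName": "Teshager",
     "Address": "New York", "Course": "Mathematics"},
    {"StudRegistration": 230, "CourseID": 6204, "StudName": "Rahel",
     "Address": "Egypt", "Course": "Computer"},
    {"StudRegistration": 230, "CourseID": 6208, "StudName": "Rahel",
     "Address": "Egypt", "Course": "Accounts"},
]


def addresses_of(rows, registration):
    """Collect every address stored for one student."""
    return {r["Address"] for r in rows if r["StudRegistration"] == registration}


# Update anomaly: the address is changed in only the first of Jemal's rows.
for row in students:
    if row["StudRegistration"] == 205:
        row["Address"] = "Chicago"  # hypothetical new address
        break
jemal_addresses = addresses_of(students, 205)  # two conflicting addresses remain

# Delete anomaly: removing Teshager's only row also removes the only
# row that mentions the Mathematics course.
remaining = [r for r in students if r["StudRegistration"] != 224]
courses_left = {r["Course"] for r in remaining}
```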
Types of Normal Forms
The following are the various types of Normal Forms:-
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)
Fourth Normal Form (4NF)
Fifth Normal Form (5NF)
Steps of Normalization
Normal Form  Description
1NF          A relation is in 1NF if every attribute contains only atomic values.
2NF          A relation is in 2NF if it is in 1NF and all non-key attributes are
             fully functionally dependent on the primary key.
3NF          A relation is in 3NF if it is in 2NF and no transitive dependency
             exists.
BCNF         A stronger definition of 3NF is known as Boyce-Codd normal form.
4NF          A relation is in 4NF if it is in Boyce-Codd normal form and has no
             multi-valued dependency.
5NF          A relation is in 5NF if it is in 4NF and every join dependency is
             implied by the candidate keys, so that it cannot be decomposed
             further without loss.
First Normal Form (1NF)
A relation is in 1NF if every attribute contains only atomic values.
This means that an attribute of a table cannot hold multiple values; it
must hold only a single value.
First normal form disallows multi-valued attributes, composite
attributes, and their combinations.
Example:- Relation EMPLOYEE is not in 1NF because of the multi-valued
attribute EMP_PHONE.
EMPLOYEE table below:-
EMP_ID  EMP_NAME  EMP_PHONE               EMP_STATE
14      Jemal     0972826385, 0964738238  SNNPR
20      Habib     0974783832              Amhara
12      Sami      0990372389, 0989830302  AA
Cont..
The decomposition of the EMPLOYEE table into 1NF is shown below:-
EMP_ID  EMP_NAME  EMP_PHONE   EMP_STATE
14      Jemal     0972826385  SNNPR
14      Jemal     0964738238  SNNPR
20      Habib     0974783832  Amhara
12      Sami      0990372389  AA
12      Sami      0989830302  AA
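The conversion above can be sketched in Python by emitting one row per phone number. The function name to_1nf is an assumption; the data is taken from the EMPLOYEE table.

```python
# Each tuple: (EMP_ID, EMP_NAME, list of EMP_PHONE values, EMP_STATE).
unnormalized = [
    (14, "Jemal", ["0972826385", "0964738238"], "SNNPR"),
    (20, "Habib", ["0974783832"], "Amhara"),
    (12, "Sami", ["0990372389", "0989830302"], "AA"),
]


def to_1nf(rows):
    """Produce one row per phone so that every attribute holds a single value."""
    return [
        (emp_id, name, phone, state)
        for emp_id, name, phones, state in rows
        for phone in phones
    ]


employee_1nf = to_1nf(unnormalized)
```

After this step EMP_ID alone no longer identifies a row; the combination of EMP_ID and EMP_PHONE does.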
Second Normal Form (2NF)
No partial dependency of a non key attribute on part of the primary
key.
Any table that is in 1NF and has a single-attribute (i.e., a
non-composite) key is automatically also in 2NF.
Definition of a table (relation) in 2NF
It is in 1NF and
all non-key attributes are dependent on the whole key, i.e. there is
no partial dependency.
Since a partial dependency occurs when a non-key attribute is
dependent on only a part of the (composite) key, the definition of
2NF is sometimes phrased as, "A table is in 2NF if it is in 1NF and
if it has no partial dependencies."
Second Normal form 2NF … Cont’d
Example for 2NF:
This schema is in 1NF since it has no repeating groups or multi-valued
attributes. To convert it to 2NF we need to remove all partial
dependencies of non-key attributes on part of the primary key.
{EmpID, ProjNo} → EmpName, ProjName, ProjLoc, ProjFund, ProjMangID
But in addition to this we have the following dependencies:
EmpID → EmpName
ProjNo → ProjName, ProjLoc, ProjFund, ProjMangID
Second Normal form 2NF … Cont’d
As we can see some non key attributes are partially dependent on
some part of the primary key. Thus these collections of attributes
should be moved to a new relation.
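One way to sketch this decomposition at the schema level is shown below. The function decompose_2nf is a hypothetical helper, not a standard algorithm implementation; it simply moves each partially dependent group of attributes, together with the part of the key that determines it, into its own relation.

```python
def decompose_2nf(attributes, key, partial_fds):
    """Split a 1NF schema into 2NF schemas.

    partial_fds maps a proper subset of the key (a tuple) to the
    non-key attributes that depend on that subset alone.
    """
    relations = []
    moved = set()
    for part, dependents in partial_fds.items():
        if not set(part) < set(key):
            raise ValueError("determinant must be a proper subset of the key")
        relations.append(list(part) + list(dependents))
        moved.update(dependents)
    # What remains is the key plus any attribute that depends on the whole key.
    relations.append([a for a in attributes if a not in moved])
    return relations


attributes = ["EmpID", "ProjNo", "EmpName", "ProjName",
              "ProjLoc", "ProjFund", "ProjMangID"]
key = ("EmpID", "ProjNo")
partial_fds = {
    ("EmpID",): ["EmpName"],
    ("ProjNo",): ["ProjName", "ProjLoc", "ProjFund", "ProjMangID"],
}

relations = decompose_2nf(attributes, key, partial_fds)
```

For this schema the sketch yields three relations: one keyed by EmpID, one keyed by ProjNo, and one that keeps only the pair {EmpID, ProjNo}.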
Second Normal form 2NF … Cont’d
• Example 2: Normalize the following relation.
• The primary key for this table is the composite key (PatientId,
RelativeId).
Second Normal form 2NF … Cont’d
So, to determine whether it satisfies 2NF, you have to find out whether all
other fields in it depend fully on both PatientId and RelativeId; that is,
you need to decide whether the following conditions are true:
(PatientId, RelativeId) → Relationship and
(PatientId, RelativeId) → Patient_tel.
However, given the dependencies in the patient table, only the following
are true:
(PatientId, RelativeId) → Relationship and
(PatientId) → Patient_tel.
Therefore, based on the above dependencies, the relation is divided into
two tables.
Third Normal Form (3NF)
A relation is in 3NF if it is in 2NF and does not contain any transitive
dependency of a non-prime attribute on the key.
3NF is used to reduce data duplication. It is also used to achieve
data integrity.
If there is no transitive dependency for non-prime attributes, then the
relation is in third normal form.
A relation is in third normal form if it holds at least one of the following
conditions for every non-trivial functional dependency X → Y.
X is a super key.
Y is a prime attribute, i.e., each element of Y is part of some candidate
key.
Example:-
Employee_Detail Table:-
EMP_ID  EMP_NAME  EMP_ZIP  EMP_STATE  EMP_CITY
222     Habib     201010   Amhara     Bahardar
333     Senait    02228    Somalia    Jigjiga
444     Leta      60007    Somalia    Jigjiga
555     Kassahun  06389    Sidama     Hawassa
666     Jemal     462007   Afar       Semera
Super keys in the table above:-
{EMP_ID}, {EMP_ID, EMP_NAME}, {EMP_ID, EMP_NAME, EMP_ZIP}, and so on.
Cont..
Candidate key: {EMP_ID}
Non-prime attributes: In the given table, all attributes except
EMP_ID are non-prime.
Here, EMP_STATE and EMP_CITY depend on EMP_ZIP, and EMP_ZIP
depends on EMP_ID. The non-prime attributes (EMP_STATE, EMP_CITY)
are therefore transitively dependent on the super key (EMP_ID),
which violates the rule of third normal form.
That is why we need to move EMP_CITY and EMP_STATE to a new
EMPLOYEE_ZIP table, with EMP_ZIP as its primary key.
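The violation described above can also be checked mechanically against the two 3NF conditions (X is a super key, or Y consists of prime attributes). The sketch below assumes the functional dependencies stated for Employee_Detail; the helper names closure and violations_of_3nf are hypothetical.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def violations_of_3nf(all_attrs, candidate_keys, fds):
    """List the FDs X -> Y where X is not a super key and Y is not all prime."""
    prime = set().union(*candidate_keys)
    found = []
    for lhs, rhs in fds:
        dependents = set(rhs) - set(lhs)  # ignore the trivial part of the FD
        is_superkey = closure(lhs, fds) >= set(all_attrs)
        if dependents and not is_superkey and not dependents <= prime:
            found.append((lhs, rhs))
    return found


all_attrs = ["EMP_ID", "EMP_NAME", "EMP_ZIP", "EMP_STATE", "EMP_CITY"]
candidate_keys = [{"EMP_ID"}]
fds = [
    (("EMP_ID",), ("EMP_NAME", "EMP_ZIP")),
    (("EMP_ZIP",), ("EMP_STATE", "EMP_CITY")),
]

violations = violations_of_3nf(all_attrs, candidate_keys, fds)
```

Only the dependency with EMP_ZIP as determinant is reported, which matches the reasoning in the text.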
Employee Table:-
EMP_ID  EMP_NAME  EMP_ZIP
222     Habib     201010
333     Senait    02228
444     Leta      60007
555     Kassahun  06389
666     Jemal     462007
Employee_Zip Table:-
EMP_ZIP  EMP_STATE  EMP_CITY
201010   Amhara     Bahardar
02228    Somalia    Jigjiga
60007    Somalia    Jigjiga
06389    Sidama     Hawassa
462007   Afar       Semera
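As a quick check that this decomposition loses no information, the two tables can be projected from the original data and joined back on EMP_ZIP. The sketch below stores zip codes as strings so that leading zeros are preserved; the variable names are assumptions.

```python
# Rows of Employee_Detail: (EMP_ID, EMP_NAME, EMP_ZIP, EMP_STATE, EMP_CITY).
employee_detail = [
    (222, "Habib", "201010", "Amhara", "Bahardar"),
    (333, "Senait", "02228", "Somalia", "Jigjiga"),
    (444, "Leta", "60007", "Somalia", "Jigjiga"),
    (555, "Kassahun", "06389", "Sidama", "Hawassa"),
    (666, "Jemal", "462007", "Afar", "Semera"),
]

# Projection onto Employee(EMP_ID, EMP_NAME, EMP_ZIP).
employee = [(emp_id, name, zip_code)
            for emp_id, name, zip_code, _, _ in employee_detail]

# Projection onto Employee_Zip, keyed by EMP_ZIP.
employee_zip = {zip_code: (state, city)
                for _, _, zip_code, state, city in employee_detail}

# Joining on EMP_ZIP reconstructs the original rows.
rejoined = [(emp_id, name, zip_code) + employee_zip[zip_code]
            for emp_id, name, zip_code in employee]
```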
Other Normal Forms
Boyce-Codd Normal Form (BCNF): A stronger form of 3NF in which every
determinant must be a candidate key.
Def.: A table is in BCNF if it is in 3NF and if every determinant is a
candidate key.
Fourth Normal Form (4NF): Isolate Independent Multiple
Relationships - No table may contain two or more 1:N or N:M
relationships that are not directly related. To bring the model into
4th normal form, ensure that all M:M relationships are resolved
independently if they are indeed independent.
Def.: A table is in 4NF if it is in BCNF and if it has no multi-valued
dependencies.
Other Normal Forms … Cont’d
Fifth Normal Form (5NF): Isolate Semantically Related Multiple
Relationships - There may be practical constraints on information that
justify separating logically related many-to-many relationships, leaving
a model limited to only simple (elemental) facts.
Def.: A table is in 5NF, also called "Projection-Join Normal
Form" (PJNF), if it is in 4NF and if every join dependency in the
table is a consequence of the candidate keys of the table.
Domain-Key Normal Form (DKNF): A model free from all
modification anomalies.
Def.: A table is in DKNF if every constraint on the table is a
logical consequence of the definition of keys and domains.
END.