Unit 3 Normalization

The document provides an overview of the relational model in database management systems, detailing concepts such as tables, attributes, tuples, and relational schemas. It explains key constraints, integrity rules, and operations like selection, projection, and joins, emphasizing the importance of domain constraints and referential integrity. Additionally, it introduces relational algebra and various operators used for data retrieval and manipulation.

Uploaded by

mayankgrover846

Data Base Management System

Unit -3
Relational Model
 Main idea:
 Table: relation
 Column header: attribute
 Row: tuple

 Relational schema: name(attributes)

 Example: employee(ssno,name,salary)
 Attributes:
 Each attribute has a domain – domain constraint
 Each attribute is atomic: we cannot refer to or directly see
a subpart of the value.
Relation Example

Account
AccountId  CustomerId  Balance
150        20          11,000
160        23          2,300
180        23          32,000

Customer
Id  Name  Addr
20  Tom   Irvine
23  Jane  LA
32  Jack  Riverside
• Database schema consists of
– a set of relation schema
– Account(AccountId, CustomerId, Balance)
– Customer(Id, Name, Addr)
– a set of constraints over the relation schema
– AccountId and CustomerId must be integers
– Name and Addr must be strings of characters
– CustomerId in Account must be one of the Ids in Customer
– etc.
NULL value

Customer(Id, Name, Addr)

Id Name Addr
20 Tom Irvine
23 Jane LA
32 Jack NULL

 Attributes can take a special value: NULL

 For example, the value is not known: we don’t know Jack’s address
Domain Constraints
 Every attribute has a type:
 integer, float, date, boolean, string, etc.
 An attribute can have a domain. E.g.:
 Id > 0
 Salary > 0
 age < 100
 City in {Irvine, LA, Riverside}
 An insertion can violate the domain constraint.
 The DBMS checks whether an insertion violates a domain constraint and, if so, rejects the insertion.

Id (Integer)  Name (String)  City (String)
20            Tom            Irvine
23            Jane           San Diego    <- violation: City must be in {Irvine, LA, Riverside}
-2            Jack           Riverside    <- violation: Id must be > 0
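The check a DBMS performs on insertion can be sketched in a few lines. This is only an illustration, not how any particular DBMS implements it; the function name `domain_violations` and the dict-based row layout are assumptions made for the example, while the rules (Id > 0, City in {Irvine, LA, Riverside}) come from the slide.

```python
# Hypothetical domain rules for the (Id, Name, City) table above.
ALLOWED_CITIES = {"Irvine", "LA", "Riverside"}

def domain_violations(row):
    """Return a list of domain-constraint violations for one row (a dict)."""
    problems = []
    if not isinstance(row["Id"], int) or row["Id"] <= 0:
        problems.append("Id must be an integer > 0")
    if not isinstance(row["Name"], str):
        problems.append("Name must be a string")
    if row["City"] not in ALLOWED_CITIES:
        problems.append("City must be one of the allowed cities")
    return problems

rows = [
    {"Id": 20, "Name": "Tom", "City": "Irvine"},
    {"Id": 23, "Name": "Jane", "City": "San Diego"},
    {"Id": -2, "Name": "Jack", "City": "Riverside"},
]

# A DBMS would reject the rows that violate a constraint.
accepted = [r for r in rows if not domain_violations(r)]
rejected = [r for r in rows if domain_violations(r)]
```

Only Tom's row is accepted; the other two rows each break one rule.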
Key Constraints

 Superkey: A Super Key is a set of one or more attributes (columns)

that can uniquely identify a row (tuple) in a table. No two rows in
the table can have the same values for a Super Key. Every
Candidate Key is a Super Key, but not every Super Key is a
Candidate Key.

 Any superset of a superkey is also a superkey

 There can be multiple superkeys

Log(LogId, AccountId, Xact#, Time, Amount)

LogID  AccountID  Xact#  Time      Amount
1001   111        4      1/12/02   $100
1001   122        4      12/28/01  $20     Illegal: LogID 1001 appears twice
1003   333        6      9/1/00    $60
Example of Super Key
Table: Employee
Possible Super Keys:
1. {Emp_ID} (unique by itself)
2. {Emp_ID, Name} (contains an extra attribute)
3. {Email} (each email is unique)
4. {Phone} (each phone number is unique)
5. {Emp_ID, Email, Phone, Dept_ID} (still unique but redundant)
Minimal Super Keys like {Emp_ID}, {Email}, {Phone} are Candidate Keys.
Redundant Super Keys contain extra attributes and are not minimal.
Emp_ID Name Email Phone Dept_ID
101 Alice [email protected] 9876543210 HR
102 Bob [email protected] 9876543211 IT
103 Charlie [email protected] 9876543212 HR
Keys

 Key:
 Minimal superkey (no proper subset is a superkey)
 If more than one key: choose one as a primary key
 Example:
 Key 1: LogID (primary key)
 Key 2: AccountId, Xact#
 Superkeys: all supersets of the keys

Log(LogId, AccountId, Xact#, Time, Amount)

LogID  AccountID  Xact#  Time      Amount
1001   111        4      1/12/02   $100    OK
1002   122        4      12/28/01  $20
1003   333        6      9/1/00    $60
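A quick way to see what "uniquely identifies a row" means is to test an attribute set against a table instance. The sketch below is illustrative only: `is_superkey` is a hypothetical helper, and `Xact` stands in for the `Xact#` column. Note that a key is a property of the schema, so a single instance can refute a key claim but cannot prove it.

```python
ATTRS = ("LogId", "AccountId", "Xact", "Time", "Amount")

LOG = [
    (1001, 111, 4, "1/12/02", 100),
    (1002, 122, 4, "12/28/01", 20),
    (1003, 333, 6, "9/1/00", 60),
]

# The illegal instance from the Key Constraints slide: LogId 1001 is repeated.
ILLEGAL_LOG = [
    (1001, 111, 4, "1/12/02", 100),
    (1001, 122, 4, "12/28/01", 20),
    (1003, 333, 6, "9/1/00", 60),
]

def is_superkey(rows, attrs, key):
    """True if no two rows agree on all attributes in `key` (for this instance)."""
    idx = [attrs.index(a) for a in key]
    seen = set()
    for row in rows:
        value = tuple(row[i] for i in idx)
        if value in seen:
            return False  # two rows share the same key value
        seen.add(value)
    return True
```

For example, `is_superkey(LOG, ATTRS, ("AccountId", "Xact"))` holds for the legal instance, while `("Xact",)` alone does not, because two rows share Xact# 4.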
Integrity Rules

There are two Integrity Rules that every relation should follow :
1. Entity Integrity (Rule 1)
2. Referential Integrity (Rule 2)

Entity Integrity states that –

If attribute A of a relation R is a prime attribute of R, then A cannot accept null values. In addition, the primary key as a whole cannot take duplicate values.
Referential Integrity Constraints
 Given two relations R and S, R has a primary key X (a set of attributes)
 A set of attributes Y is a foreign key of S if:
 Attributes in Y have same domains as attributes X
 For every tuple s in S, there exists a tuple r in R: s[Y] = r[X].
 A referential integrity constraint from attributes Y of S to R means that Y is
a foreign key that refers to the primary key of R.
 The foreign key must be either equal to the primary key or be entirely null.

[Figure: a tuple s in S whose foreign key Y refers to the primary key X of a tuple r in R]
Examples of Referential Integrity

Account
AccountId  CustomerId  Balance
150        20          11,000
160        23          2,300
180        23          32,000

Customer
Id  Name  Addr
20  Tom   Irvine
23  Jane  LA
32  Jack  Riverside

Account.customerId to Customer.Id

Student
Id    Name   Dept
1111  Mike   ICS
2222  Harry  CE
3333  Ford   ICS

Dept
Name  Chair
ICS   Tom
CE    Jane
MATH  Jack

Student.dept to Dept.name: every value of Student.dept must also be a

value of Dept.name.
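The Student/Dept constraint can be checked mechanically. The following is a minimal sketch under the assumption that rows are stored as dicts; `dangling_references` is a hypothetical helper name. A NULL foreign key (here `None`) is allowed, as stated in the rule above.

```python
def dangling_references(child_rows, fk, parent_rows, pk):
    """Return child rows whose non-null foreign key matches no parent key."""
    parent_keys = {row[pk] for row in parent_rows}
    return [
        row for row in child_rows
        if row[fk] is not None and row[fk] not in parent_keys
    ]

student = [
    {"Id": 1111, "Name": "Mike", "Dept": "ICS"},
    {"Id": 2222, "Name": "Harry", "Dept": "CE"},
    {"Id": 3333, "Name": "Ford", "Dept": "ICS"},
]
dept = [
    {"Name": "ICS", "Chair": "Tom"},
    {"Name": "CE", "Chair": "Jane"},
    {"Name": "MATH", "Chair": "Jack"},
]

ok = dangling_references(student, "Dept", dept, "Name")

# A made-up row referring to a department that does not exist.
bad_student = student + [{"Id": 4444, "Name": "Ann", "Dept": "EE"}]
bad = dangling_references(bad_student, "Dept", dept, "Name")
```

The original tables satisfy the constraint, so `ok` is empty; the extra row with Dept "EE" is reported in `bad`.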
Relational Algebra

Relational Algebra is :
1. The formal description of how a relational database
operates
2. An interface to the data stored in the database itself.
3. The mathematics which underpin SQL operations

The DBMS must take whatever SQL statements the

user types in and translate them into relational algebra
operations before applying them to the database.
Operators - Retrieval
There are two groups of operations:

1. Mathematical set theory based relations:

UNION, INTERSECTION, DIFFERENCE, and
CARTESIAN PRODUCT.
2. Special database oriented operations:
SELECT , PROJECT and JOIN.
Symbolic Notation
 SELECT σ (sigma)
 PROJECT π (pi)
 PRODUCT × (times)
 JOIN ⋈ (bow-tie)
 UNION ∪ (cup)
 INTERSECTION ∩ (cap)
 DIFFERENCE - (minus)
 RENAME ρ (rho)
SET Operations - requirements
For set operations to function correctly the relations
R and S must be union compatible. Two relations
are union compatible if

They have the same number of attributes

The domain of each attribute in column order is
the same in both R and S.
Set Operations - semantics
Consider two relations R and S.
 UNION of R and S
the union of two relations is a relation that includes all
the tuples that are either in R or in S or in both R and S.
Duplicate tuples are eliminated.

 INTERSECTION of R and S
the intersection of R and S is a relation that includes
all tuples that are both in R and S.

 DIFFERENCE of R and S
the difference of R and S is the relation that contains
all the tuples that are in R but that are not in S.
Union , Intersection , Difference -

Set operators. Relations must have the same

schema.

R(name, dept) S(name, dept)

Name Dept Name Dept
Jack Physics Jack Physics
Tom ICS Mary Math

RS RS R-S

Name Dept Name Dept Name Dept
Jack Physics Jack Physics Tom ICS
Tom ICS
Mary Math
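Because relations are sets of tuples, the three set operators map directly onto Python's built-in set type. This is just a sketch using the R and S tables from the example; both relations have the same schema (name, dept), so they are union compatible.

```python
# Each relation is a set of (name, dept) tuples.
R = {("Jack", "Physics"), ("Tom", "ICS")}
S = {("Jack", "Physics"), ("Mary", "Math")}

union = R | S          # tuples in R or S (duplicates eliminated)
intersection = R & S   # tuples in both R and S
difference = R - S     # tuples in R but not in S
```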
Relational SELECT
SELECT is used to obtain a subset of the tuples of a
relation that satisfy a select condition.
For example, find all employees born after 1st Jan 1950:
SELECT dob > '01/JAN/1950' (employee)
or
σ dob > '01/JAN/1950' (employee)
Conditions can be combined together using ∧ (AND) and ∨ (OR). For example, all employees in department 1 called 'Smith':
σ depno = 1 ∧ surname = 'Smith' (employee)
Selection σ

 σ_c (R): return tuples in R that satisfy condition c.

Emp (name, dept, salary)
Name  Dept     Salary
Jane  ICS      30K
Jack  Physics  30K
Tom   ICS      75K
Joe   Math     40K
Jack  Math     50K

σ_{salary>35K} (Emp)
Name  Dept  Salary
Tom   ICS   75K
Joe   Math  40K
Jack  Math  50K

σ_{dept=ics and salary<40K} (Emp)
Name  Dept  Salary
Jane  ICS   30K
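Selection is a filter over tuples. The sketch below reproduces the two results from the Emp example; `select` is a hypothetical helper, and salaries are written as plain integers (30K becomes 30_000).

```python
EMP = [
    {"name": "Jane", "dept": "ICS", "salary": 30_000},
    {"name": "Jack", "dept": "Physics", "salary": 30_000},
    {"name": "Tom", "dept": "ICS", "salary": 75_000},
    {"name": "Joe", "dept": "Math", "salary": 40_000},
    {"name": "Jack", "dept": "Math", "salary": 50_000},
]

def select(relation, predicate):
    """Keep only the tuples for which the condition holds."""
    return [t for t in relation if predicate(t)]

high_paid = select(EMP, lambda t: t["salary"] > 35_000)
ics_low = select(EMP, lambda t: t["dept"] == "ICS" and t["salary"] < 40_000)
```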
Relational PROJECT
The PROJECT operation is used to select a subset of the attributes of a
relation by specifying the names of the required attributes.

For example, to get a list of all employees with their salary

PROJECT ename, salary (employee)

OR
πename, salary(employee)
Projection π

π_{A1,…,Ak} (R): pick columns of attributes A1,…,Ak of R.

Emp (name, dept, salary)
Name  Dept     Salary
Jane  ICS      30K
Jack  Physics  30K
Tom   ICS      75K
Joe   Math     40K
Jack  Math     50K

π_{name,dept} (Emp)
Name  Dept
Jane  ICS
Jack  Physics
Tom   ICS
Joe   Math
Jack  Math

π_{name} (Emp)
Name
Jane
Jack
Tom
Joe
Duplicates (“Jack”) eliminated.
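Projection keeps the chosen columns and then removes duplicate rows, which is why "Jack" appears only once in the single-column result. A minimal sketch, with `project` as a hypothetical helper that preserves first-seen order:

```python
EMP = [
    {"name": "Jane", "dept": "ICS", "salary": 30_000},
    {"name": "Jack", "dept": "Physics", "salary": 30_000},
    {"name": "Tom", "dept": "ICS", "salary": 75_000},
    {"name": "Joe", "dept": "Math", "salary": 40_000},
    {"name": "Jack", "dept": "Math", "salary": 50_000},
]

def project(relation, attrs):
    """Pick the given attributes and drop duplicate rows."""
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attrs)
        if row not in seen:
            seen.add(row)
            result.append(row)
    return result

name_dept = project(EMP, ["name", "dept"])
names = project(EMP, ["name"])
```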
CARTESIAN PRODUCT
The Cartesian Product is also an operator which
works on two sets. It is sometimes called the
CROSS PRODUCT or CROSS JOIN.

It combines the tuples of one relation with all the

tuples of the other relation.
Cartesian Product: ×
R × S: pair each tuple r in R with each tuple s in S.

Emp (name, dept)
Name  Dept
Jack  Physics
Tom   ICS

Contact (name, addr)
Name  Addr
Jack  Irvine
Tom   LA
Mary  Riverside

Emp × Contact
E.name  Dept     C.Name  Addr
Jack    Physics  Jack    Irvine
Jack    Physics  Tom     LA
Jack    Physics  Mary    Riverside
Tom     ICS      Jack    Irvine
Tom     ICS      Tom     LA
Tom     ICS      Mary    Riverside
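The product can be sketched with `itertools.product`. The helper name `cartesian` and the prefixes "E" and "C" are choices made for this example; prefixing attribute names avoids the clash between the two `name` columns.

```python
from itertools import product

EMP = [
    {"name": "Jack", "dept": "Physics"},
    {"name": "Tom", "dept": "ICS"},
]
CONTACT = [
    {"name": "Jack", "addr": "Irvine"},
    {"name": "Tom", "addr": "LA"},
    {"name": "Mary", "addr": "Riverside"},
]

def cartesian(r, s, r_name, s_name):
    """Pair every tuple of r with every tuple of s, prefixing attribute names."""
    return [
        {
            **{f"{r_name}.{k}": v for k, v in a.items()},
            **{f"{s_name}.{k}": v for k, v in b.items()},
        }
        for a, b in product(r, s)
    ]

emp_x_contact = cartesian(EMP, CONTACT, "E", "C")
```

With 2 tuples in Emp and 3 in Contact the result has 2 × 3 = 6 tuples, in the same order as the table above.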
JOIN Example
 JOIN is used to combine related tuples from two
relations R and S.
 In its simplest form the JOIN operator is just the
cross product of the two relations and is represented
as (R ⋈ S).

JOIN allows you to evaluate a join condition between

the attributes of the relations on which the join is
undertaken.
The notation used is R ⋈_{join condition} S
Join
R ⋈_C S = σ_C (R × S)
• Join condition C is of the form:
<cond_1> AND <cond_2> AND … AND <cond_k>
Each cond_i is of the form A op B, where:
– A is an attribute of R, B is an attribute of S
– op is a comparison operator: =, <, >, ≤, ≥, or ≠.
• Different types:
– Theta-join
– Equi-join
– Natural join
Theta-Join

R ⋈_{R.A>S.C} S

R(A,B)
A  B
3  4
5  7

S(C,D)
C  D
2  7
6  8

R × S
R.A  R.B  S.C  S.D
3    4    2    7
3    4    6    8
5    7    2    7
5    7    6    8

Result
R.A  R.B  S.C  S.D
3    4    2    7
5    7    2    7
Theta-Join

R ⋈_{R.A>S.C, R.B≠S.D} S

R(A,B)
A  B
3  4
5  7

S(C,D)
C  D
2  7
6  8

R × S
R.A  R.B  S.C  S.D
3    4    2    7
3    4    6    8
5    7    2    7
5    7    6    8

Result
R.A  R.B  S.C  S.D
3    4    2    7
Equi-Join

 Special kind of theta-join: C only uses the equality operator.

R(A,B)
A  B
3  4
5  7

S(C,D)
C  D
2  7
6  8

R ⋈_{R.B=S.D} S

R × S
R.A  R.B  S.C  S.D
3    4    2    7
3    4    6    8
5    7    2    7
5    7    6    8

Result
R.A  R.B  S.C  S.D
5    7    2    7
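All three join examples on R(A,B) and S(C,D) are a selection over the cross product, so one small function covers them. In this sketch `theta_join` is a hypothetical helper, and the second condition is taken to be R.A > S.C and R.B ≠ S.D, an assumption that matches the single-row result shown for that example.

```python
R = [(3, 4), (5, 7)]   # tuples (A, B)
S = [(2, 7), (6, 8)]   # tuples (C, D)

def theta_join(r, s, cond):
    """Select the pairs of the cross product that satisfy the join condition."""
    return [a + b for a in r for b in s if cond(a, b)]

# R.A > S.C
greater = theta_join(R, S, lambda a, b: a[0] > b[0])

# R.A > S.C and R.B != S.D (assumed reading of the second example)
greater_and_differ = theta_join(R, S, lambda a, b: a[0] > b[0] and a[1] != b[1])

# Equi-join: R.B = S.D
equi = theta_join(R, S, lambda a, b: a[1] == b[1])
```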
Natural-Join

 Relations R and S. Let L be the union of their attributes.

 Let A1,…,Ak be their common attributes.

R ⋈ S = π_L (R ⋈_{R.A1=S.A1,…,R.Ak=S.Ak} S)
Natural-Join

Emp (name, dept)
Name  Dept
Jack  Physics
Tom   ICS

Contact (name, addr)
Name  Addr
Jack  Irvine
Tom   LA
Mary  Riverside

Emp ⋈ Contact: all employee names, depts, and addresses.

Emp × Contact
Emp.name  Emp.Dept  Contact.name  Contact.addr
Jack      Physics   Jack          Irvine
Jack      Physics   Tom           LA
Jack      Physics   Mary          Riverside
Tom       ICS       Jack          Irvine
Tom       ICS       Tom           LA
Tom       ICS       Mary          Riverside

Result
Name  Dept     Addr
Jack  Physics  Irvine
Tom   ICS      LA
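A natural join matches tuples on every attribute the two relations share and keeps one copy of each shared attribute. The sketch below assumes dict-based rows with the same attributes in every row of a relation; `natural_join` is a hypothetical helper.

```python
EMP = [
    {"name": "Jack", "dept": "Physics"},
    {"name": "Tom", "dept": "ICS"},
]
CONTACT = [
    {"name": "Jack", "addr": "Irvine"},
    {"name": "Tom", "addr": "LA"},
    {"name": "Mary", "addr": "Riverside"},
]

def natural_join(r, s):
    """Join on all common attributes; common attributes appear once."""
    common = [a for a in r[0] if a in s[0]] if r and s else []
    result = []
    for a in r:
        for b in s:
            if all(a[k] == b[k] for k in common):
                result.append({**a, **b})  # merging dicts keeps one copy of `name`
    return result

emp_contact = natural_join(EMP, CONTACT)
```

Mary has no matching Emp tuple, so she does not appear in the result.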
Outer Joins

 Motivation: “join” can lose information

 E.g.: natural join of R and S loses info about Tom, Mike, and Mary, since they do not join with other tuples.
 Called “dangling tuples”.

R
Name  Dept
Jack  Physics
Tom   ICS

S
Name  Addr
Jack  Irvine
Mike  LA
Mary  Riverside
• Outer join: natural join, but use NULL values to fill in dangling tuples.
• Three types: “left”, “right”, or “full”
Left Outer Join

R
Name  Dept
Jack  Physics
Tom   ICS

S
Name  Addr
Jack  Irvine
Mike  LA
Mary  Riverside

Left outer join: R ⟕ S

Name  Dept     Addr
Jack  Physics  Irvine
Tom   ICS      NULL

Pad null value for left dangling tuples.

Right Outer Join

R
Name  Dept
Jack  Physics
Tom   ICS

S
Name  Addr
Jack  Irvine
Mike  LA
Mary  Riverside

Right outer join: R ⟖ S

Name  Dept     Addr
Jack  Physics  Irvine
Mike  NULL     LA
Mary  NULL     Riverside

Pad null value for right dangling tuples.

Full Outer Join

R
Name  Dept
Jack  Physics
Tom   ICS

S
Name  Addr
Jack  Irvine
Mike  LA
Mary  Riverside

Full outer join: R ⟗ S

Name  Dept     Addr
Jack  Physics  Irvine
Tom   ICS      NULL
Mike  NULL     LA
Mary  NULL     Riverside

Pad null values for both left and right dangling tuples.
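The three outer joins differ only in which dangling tuples get padded. The following sketch assumes a single shared join attribute and non-empty relations; `outer_join` is a hypothetical helper and Python's `None` plays the role of NULL.

```python
R = [
    {"name": "Jack", "dept": "Physics"},
    {"name": "Tom", "dept": "ICS"},
]
S = [
    {"name": "Jack", "addr": "Irvine"},
    {"name": "Mike", "addr": "LA"},
    {"name": "Mary", "addr": "Riverside"},
]

def outer_join(r, s, key, how):
    """Join r and s on `key`; `how` is 'left', 'right', or 'full'."""
    r_only = [a for a in r[0] if a != key]   # attributes found only in r
    s_only = [a for a in s[0] if a != key]   # attributes found only in s
    rows = [{**a, **b} for a in r for b in s if a[key] == b[key]]
    matched = {row[key] for row in rows}
    if how in ("left", "full"):
        # pad left dangling tuples with None for s's attributes
        rows += [{**a, **dict.fromkeys(s_only)} for a in r if a[key] not in matched]
    if how in ("right", "full"):
        # pad right dangling tuples with None for r's attributes
        rows += [{**dict.fromkeys(r_only), **b} for b in s if b[key] not in matched]
    return rows

left = outer_join(R, S, "name", "left")
right = outer_join(R, S, "name", "right")
full = outer_join(R, S, "name", "full")
```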
Joins Revised

Result of applying these joins in a query:

INNER JOIN: Select only those rows that have values in common in the
columns specified in the ON clause.
LEFT, RIGHT, or FULL OUTER JOIN: Select all rows from the table on the left (or
right, or both) regardless of whether the other table has values in common
and (usually) enter NULL where data is missing.
Combining Different Operations

 Construct general expressions using basic operations.

 Schema of each operation:
 ∪, ∩, -: same as the schema of the two relations
 Selection σ: same as the relation’s schema
 Projection π: attributes in the projection
 Cartesian product ×: attributes in two relations, use prefix to avoid confusion
 Theta Join ⋈_C: same as ×
 Natural Join ⋈: union of relations’ attributes, merge common attributes
 Renaming ρ: new renamed attributes
Example 1
customer(ssn, name, city)
account(custssn, balance)
“List account balances of Tom.”
π_balance (σ_{custssn=ssn} (account × (σ_{name=tom} customer)))

Tree representation:
π_balance
  σ_{custssn=ssn}
    ×
      account
      σ_{name=tom}
        customer
Example 1 (cont)
customer(ssn, name, city)
account(custssn, balance)
“List account balances of Tom.”

π_balance
  ⋈_{ssn=custssn}
    account
    σ_{name=tom}
      customer
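The two expression trees for "List account balances of Tom" can be evaluated side by side to confirm that they return the same relation. The rows below are made-up sample data (the slides give only the schemas), and the variable names are illustrative.

```python
from itertools import product

# Hypothetical sample data for customer(ssn, name, city) and account(custssn, balance).
customer = [
    {"ssn": 1, "name": "tom", "city": "Irvine"},
    {"ssn": 2, "name": "jane", "city": "LA"},
]
account = [
    {"custssn": 1, "balance": 100},
    {"custssn": 2, "balance": 250},
    {"custssn": 1, "balance": 40},
]

# Selection name = tom on customer (shared by both plans).
toms = [c for c in customer if c["name"] == "tom"]

# Plan 1: cross product, then selection custssn = ssn, then projection on balance.
pairs = list(product(account, toms))
plan1 = {a["balance"] for a, c in pairs if a["custssn"] == c["ssn"]}

# Plan 2: theta-join on ssn = custssn, then projection on balance.
plan2 = {a["balance"] for a in account for c in toms if c["ssn"] == a["custssn"]}
```

Both plans yield the balances 100 and 40; using a set for the projection removes duplicates, as relational algebra requires.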
Comparing RA and SQL
Relational algebra:
 is closed (the result of every expression is a relation)
 has a rigorous foundation
 has simple semantics
 is used for reasoning, query optimisation, etc.
SQL:
 is a superset of relational algebra
 has convenient formatting features, etc.
 provides aggregate functions
 has complicated semantics
 is an end-user language.
Functional Dependencies and Normalization
Schema Normalization

 Decompose relational schemes to

 remove redundancy
 remove anomalies
 Result of normalization:
 Semantically-equivalent relational scheme
 Represent the same information as the original
 Be able to reconstruct the original from
decomposed relations.
Functional Dependencies
 Motivation: avoid redundancy in database design.
Relation R(A1,...,An, B1,...,Bm, C1,...,Cl)
Definition: A1,...,An functionally determine B1,...,Bm, i.e.,
(A1,...,An → B1,...,Bm)
iff for any two tuples r1 and r2 in R,
r1(A1,...,An) = r2(A1,...,An)
implies r1(B1,...,Bm) = r2(B1,...,Bm)
 By definition: a superkey → all attributes of the
relation.
Example
Take(StudentID, CID, Semester, Grade)
FD: (StudentId,Cid,semester) → Grade
StudentId Cid Semester Grade
1111 ICS184 Winter 02 A
1111 ICS184 Winter 02 B Illegal
2222 ICS143 Fall 01 A-

What if FD: (StudentId, Cid) → Semester?

StudentId Cid Semester Grade
1111 ICS184 Winter 02 A
1111 ICS184 Spring 02 A Illegal
2222 ICS143 Fall 01 A-

“Each student can take a course only once.”
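The definition of a functional dependency translates directly into a check over a table instance: group tuples by the left-hand side and make sure each group has a single right-hand-side value. This is a sketch with a hypothetical helper `fd_holds`; like a key, an FD is a schema-level constraint, so an instance can only show a violation.

```python
def fd_holds(rows, lhs, rhs):
    """True if tuples that agree on `lhs` also agree on `rhs` in this instance."""
    seen = {}
    for t in rows:
        key = tuple(t[a] for a in lhs)
        value = tuple(t[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True

# First table: same student, course, and semester with two different grades.
take1 = [
    {"StudentId": 1111, "Cid": "ICS184", "Semester": "Winter 02", "Grade": "A"},
    {"StudentId": 1111, "Cid": "ICS184", "Semester": "Winter 02", "Grade": "B"},
    {"StudentId": 2222, "Cid": "ICS143", "Semester": "Fall 01", "Grade": "A-"},
]
# Second table: same student and course taken in two different semesters.
take2 = [
    {"StudentId": 1111, "Cid": "ICS184", "Semester": "Winter 02", "Grade": "A"},
    {"StudentId": 1111, "Cid": "ICS184", "Semester": "Spring 02", "Grade": "A"},
    {"StudentId": 2222, "Cid": "ICS143", "Semester": "Fall 01", "Grade": "A-"},
]

fd1 = (["StudentId", "Cid", "Semester"], ["Grade"])
fd2 = (["StudentId", "Cid"], ["Semester"])
```

`fd_holds(take1, *fd1)` is false, which is why the first table is illegal; the second table satisfies `fd1` but violates `fd2`.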

FD Sets
 A set of FDs on a relation: e.g., R(A,B,C), {A→B,
B→C, A→C, AB→A}
 Some dependencies can be derived
 e.g., A→C can be derived from {A→B, B→C}.
 Some dependencies are trivial
 e.g., AB→A is “trivial.”
Trivial Dependencies

 Those that are true for every relation

 A1 A2…An → B1 B2…Bm is trivial if B’s are a subset of the
A’s.
 Example: XY → X (here X is a subset of XY)

 Called nontrivial if at least one of the B’s is not one of the A’s, and completely nontrivial if none of the B’s is one of the A’s.

 Example: AB→C is completely nontrivial (no attribute on the right side of the FD also appears on the left side)
Closure of FD Set
 Definition: Let F be a set of FDs of a relation R.
We use F+ to denote the set of all FDs that must
hold over R, i.e.:
F+ = { X → Y | F logically implies X → Y}
 F+ is called the closure of F.
 Example: F = {A→B, B→C}, then A→C is in F+.
Armstrong’s Axioms: Inferring All FDs

Given a set of FDs F over a relation R, how to compute F+?

• Reflexivity:
– If Y is a subset of X, then X →Y.
– Example: AB→A, ABC→AB, etc.

• Augmentation:
– If X→Y, then XZ→YZ.
– Example: If A→B, then AC→BC.

• Transitivity:
– If X→Y, and Y→Z, then X→Z.
– Example: If AB→C, and C→D, then AB→D.
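Enumerating all of F+ is rarely done in practice, since it can be exponentially large. A common shortcut, not shown on the slides, is the attribute closure X+: the FD X → Y is in F+ exactly when Y is a subset of X+. The sketch below assumes single-letter attribute names so that a string such as "AB" can stand for the set {A, B}; `closure` and `implies` are hypothetical helper names.

```python
def closure(attrs, fds):
    """Attribute closure: all attributes functionally determined by `attrs`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # if the left side is already determined, so is the right side
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def implies(fds, lhs, rhs):
    """True if lhs → rhs follows from `fds` (i.e., is in F+)."""
    return set(rhs) <= closure(lhs, fds)

F = [("A", "B"), ("B", "C")]
```

For example, `implies(F, "A", "C")` holds by transitivity, `implies([], "AB", "A")` holds by reflexivity, and `implies([("A", "B")], "AC", "BC")` holds by augmentation.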
More Rules Derived from AAs

 Union Rule( or additivity):

 If X→Y, X→Z, then X→YZ

 Projectivity
 If X→YZ, then X→Y and X→Z

 Pseudo-Transitivity Rule:
 If X→Y, WY→Z, then WX→Z
The Normalization Process
 In relational databases the term normalization refers to a reversible step-
by-step process in which a given set of relations is decomposed into a set
of smaller relations that have a progressively simpler and more regular
structure.

 The objectives of the normalization process are:

 To make it feasible to represent any relation in the

database (applies to First Normal Form).
 To free relations from undesirable insertion, update and deletion anomalies (applies to all normal forms).
The Normalization Process

 The entire normalization process is based

upon

 the analysis of relations

 their schemes
 their primary keys
 their functional dependencies.
Normalization

[Figure: the normalization process as nested stages.
Unnormalized relations
First normal form: functional dependency of nonkey attributes on the primary key; atomic values only
Second normal form: full functional dependency of nonkey attributes on the primary key
Third normal form: no transitive dependency between nonkey attributes
Boyce-Codd and higher: all determinants are candidate keys; single multivalued dependency]
Normal Forms

1st Normal Form No repeating data groups

2nd Normal Form No partial key dependency
3rd Normal Form No transitive dependency
Boyce-Codd Normal Form Reduce keys dependency
Unnormalized Relations

 First step in normalization is to convert the data into a

two-dimensional table

 A relation is said to be unnormalized if it does not contain atomic values.
Eg of Unnormalized Relation
Patient # | Surgeon # | Surg. date | Patient Name | Patient Addr | Surgeon | Surgery | Postop drug | Drug side effects
1111 | 145; 311 | Jan 1, 1995; June 12, 1995 | John White | 15 New St. New York, NY | Beth Little; Michael Diamond | Gallstones removal; Kidney stones removal | Penicillin; none | rash; none
1234 | 243; 467 | Apr 5, 1994; May 10, 1995 | Mary Jones | 10 Main St. Rye, NY | Charles Field; Patricia Gold | Eye Cataract removal; Thrombosis removal | Tetracycline; none | Fever; none
2345 | 189 | Jan 8, 1996 | Charles Brown | Dogwood Lane Harrison, NY | David Rosen | Open Heart Surgery | Cephalosporin | none
4876 | 145 | Nov 5, 1995 | Hal Kane | 55 Boston Post Road, Chester, CN | Beth Little | Cholecystectomy | Demicillin | none
5123 | 145 | May 10, 1995 | Paul Kosher | Blind Brook Mamaroneck, NY | Beth Little | Gallstones Removal | none | none
6845 | 243 | Apr 5, 1994; Dec 15, 1984 | Ann Hood | Hilton Road Larchmont, NY | Charles Field | Eye Cornea Replacement; Eye cataract removal | Tetracycline | Fever
First Normal Form

 To move to First Normal Form, a relation must contain only atomic values at each row and column.
No repeating groups
 A relation in 1NF contains only atomic values.
First Normal Form
 Three formal definitions of First Normal Form

 A relation r is said to be in First Normal Form (1NF) if and only if every entry of the relation (each cell) has at most a single value.

 A relation is in first normal form (1NF) if and only if all underlying simple domains contain atomic values only.

 A relation is in 1NF if and only if all of its attributes are based upon a simple domain.
 These three definitions are equivalent.
 If all relations of a database are in 1NF, we can say that
the database is in 1NF.
Eg of First Normal Form
The normalized representation of the PROJECT table
PROJECT
Proj-ID Proj-Name Proj-Mgr-ID Emp-ID Emp-Name Emp-Dpt Emp-Hrly-Rate Total-Hrs
100 E-commerce 789487453 123423479 Heydary MIS 65 10
100 E-commerce 789487453 980808980 Jones TechSupport 45 6
100 E-commerce 789487453 234809000 Alexander TechSupport 35 6
100 E-commerce 789487453 542298973 Johnson TechDoc 30 12
110 Distance-Ed 820972445 432329700 Mantle MIS 50 5
110 Distance-Ed 820972445 689231199 Richardson TechSupport 35 12
110 Distance-Ed 820972445 712093093 Howard TechDoc 30 8
120 Cyber 980212343 834920043 Lopez Engineering 80 4
120 Cyber 980212343 380802233 Harrison TechSupport 35 11
120 Cyber 980212343 553208932 Olivier TechDoc 30 12
120 Cyber 980212343 123423479 Heydary MIS 65 07
130 Nitts 550227043 340783453 Shaw MIS 65 07
First Normal Form
 This normalized PROJECT table is not a relation
because it does not have a primary key.
 The attribute Proj-ID no longer identifies uniquely
any row.
 To transform this table into a relation a primary key
needs to be defined.
 A suitable PK for this table is the composite key
(Proj-ID, Emp-ID)
No other combination of the attributes of the table
will work as a PK.
Partial Dependencies
 Identifying the partial dependencies in the PROJECT-
EMPLOYEE relation.

 The PK of this relation is formed by the attributes Proj-ID

and Emp-ID.
 This implies that {Proj-ID, Emp-ID} uniquely identifies a
tuple in the relation.
 They functionally determine any individual attribute or
any combination of attributes of the relation.
 However, we only need attribute Emp-ID to functionally
determine the following attributes:
 Emp-Name, Emp-Dpt, Emp-Hrly-Rate.
Second Normal Form

And we need only Proj-Id attribute to functionally determine

proj_name and Proj_Mgr_Id.
So, we decompose the relation into following two relations:

PROJECT
Proj-ID  Proj-Name    Proj-Mgr-ID
100      E-commerce   789487453
110      Distance-Ed  820972445
120      Cyber        980212343
130      Nitts        550227043
Second Normal Form
EMPLOYEE

Emp-ID Emp-Name Emp-Dpt Emp-Hrly-Rate
123423479 Heydary MIS 65
980808980 Jones TechSupport 45
234809000 Alexander TechSupport 35
542298973 Johnson TechDoc 30
432329700 Mantle MIS 50
689231199 Richardson TechSupport 35
712093093 Howard TechDoc 30
834920043 Lopez Engineering 80
380802233 Harrison TechSupport 35
553208932 Olivier TechDoc 30
340783453 Shaw MIS 65
 There are no partial dependencies in either table because the key of each table consists of a single attribute.
 For e.g.: Emp-ID alone determines Emp-Name, Emp-Dpt, and Emp-Hrly-Rate; Proj-ID is not needed.

 To relate these two relations, we create a third table

(relationship table) that consists of the primary keys of
both the relations as foreign key and an attribute ‘Total-
Hrs-Worked’ because it is fully dependent on the key of
the relation {Proj-Id, Emp-Id}.
Second Normal Form
A relation is said to be in Second Normal Form if it is in 1NF and every non-key attribute is fully functionally dependent on the primary key.
Or: no nonprime attribute is partially dependent on any key.

Now, the example relation scheme is in 2NF with following

relations:
Project (Proj-Id, Proj-Name, Proj-Mgr-Id)
Employee (Emp-Id, Emp-Name, Emp_dept, Emp-Hrly-Rate )
Proj_Emp (Proj-id, Emp-Id, Total-Hrs-Worked)
Data Anomalies in 2NF Relations

 Insertion anomalies occur in the EMPLOYEE

relation.
 Consider a situation where we would like to set
in advance the rate to be charged by the
employees of a new department.
 We cannot insert this information until there is an
employee assigned to that department.
Notice that the rate that a department charges
is independent of whether or not it has
employees.
Data Anomalies in 2NF Relations

 The EMPLOYEE relation is also susceptible to

deletion anomalies.

 This type of anomaly occurs whenever we delete

the tuple of an employee who happens to be the
only employee left in a department.
 In this case, we will also lose the information about the rate that the department charges.
Data Anomalies in 2NF Relations

 Update anomalies will also occur in the EMPLOYEE

relation because there may be several employees from
the same department working on different projects.

 If the department rate changes, we need to make sure that the corresponding rate is changed for all employees that work for that department.
Otherwise the database may end up in an
inconsistent state.
Transitive Dependencies
 A transitive dependency is a functional dependency which holds by virtue of
transitivity. A transitive dependency can occur only in a relation that has three
or more attributes. Let A, B, and C designate three distinct attributes and
the following conditions hold:
 A→B (where A is the key of the relation)
 B→C
 Then the functional dependency A → C (which follows from the two dependencies above by the axiom of transitivity) is a transitive dependency.
 For eg: If in a relation Book is the key and
{Book} → {Author}
{Author} → {Nationality}
Therefore {Book} → {Nationality} is a transitive dependency.
 Transitive dependency occurs when a non-key attribute determines another
non-key attribute.
Transitive Dependencies
 Assume the following functional dependencies of attributes A, B and C of relation r(R):

A → B → C
Third Normal Form
 A relation is in 3NF iff it is in 2NF and every non key attribute is non
transitively dependent on the primary key.

 A relation r(R) is in Third Normal Form (3NF) if and only if the following
conditions are satisfied simultaneously:
 r(R) is already in 2NF.
 No nonprime attribute is transitively dependent on the key.

 The objective of transforming relations into 3NF is to remove all transitive

dependencies.
 Given a relation R with FDs F, test if R is in 3NF.
 Compute all the candidate keys of R
 For each X→Y in F, check if it violates 3NF
 If X is not a superkey, and Y is not part of a candidate key, then X→Y violates 3NF.
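The 3NF test above can be sketched as follows. The helper names (`closure`, `candidate_keys`, `violations_3nf`) are hypothetical, candidate keys are found by brute force over attribute subsets (fine for small schemas only), and the example uses the EMPLOYEE relation from the 2NF slides with the dependency Emp-Dpt → Emp-Hrly-Rate assumed as given.

```python
from itertools import combinations

def closure(attrs, fds):
    """All attributes functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """Minimal attribute sets whose closure is the whole schema (brute force)."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            if any(k <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) >= set(schema):
                keys.append(set(combo))
    return keys

def violations_3nf(schema, fds):
    """FDs X → Y where X is not a superkey and Y has a nonprime attribute outside X."""
    prime = set().union(*candidate_keys(schema, fds))
    bad = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) >= set(schema)
        nonprime_rhs = set(rhs) - set(lhs) - prime
        if not is_superkey and nonprime_rhs:
            bad.append((lhs, rhs))
    return bad

EMPLOYEE = ("Emp-ID", "Emp-Name", "Emp-Dpt", "Emp-Hrly-Rate")
FD1 = (("Emp-ID",), ("Emp-Name", "Emp-Dpt"))
FD2 = (("Emp-Dpt",), ("Emp-Hrly-Rate",))
```

`violations_3nf(EMPLOYEE, [FD1, FD2])` reports `FD2`, since Emp-Dpt is not a superkey and Emp-Hrly-Rate is nonprime. Splitting off (Emp-Dpt, Emp-Hrly-Rate) into its own relation leaves no violations in either part.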
Conversion to Third Normal Form

[Figure: R(A*, B, C), with A → B and B → C, is converted to R1(A*, B) and R2(B*, C).]
* indicates the key or the determinant of the relation.
Third Normal Form
 Using the general procedure, we will transform our 2NF
relation example to a 3NF relation.
 The relation EMPLOYEE is not in 3NF because there is a
transitive dependency of a nonprime attribute on the primary
key of the relation.
 In this case, the nonprime attribute Emp-Hrly-Rate is
transitively dependent on the key through the functional
dependency Emp-Dpt → Emp-Hrly-Rate.
 To transform this relation into a 3NF relation:
 it is necessary to remove any transitive dependency of a
nonprime attribute on the key.
 It is necessary to create two new relations.
Third Normal Form

The scheme of the first relation, which we have named EMPLOYEE, is:

EMPLOYEE (Emp-ID, Emp-Name, Emp-Dpt)

The scheme of the second relation, which we have named CHARGES, is:

CHARGES (Emp-Dpt, Emp-Hrly-Rate)

Data Anomalies in Third Normal Form
 The Third Normal Form helped us to get rid of the data
anomalies caused either by
 transitive dependencies on the PK or
 by dependencies of a nonprime attribute on another
nonprime attribute.

 However, relations in 3NF are still susceptible to data

anomalies, particularly when
 the relations have two overlapping candidate keys or
 when a nonprime attribute functionally determines a
prime attribute.
Boyce-Codd Normal Form (BCNF)

• A relation is in BCNF iff every determinant is a candidate key.

OR
• In other words, a relational schema R is in Boyce–Codd normal
form if and only if for every one of its dependencies X → Y, at least
one of the following conditions hold:
• X → Y is a trivial functional dependency (Y ⊆ X)
• X is a superkey for schema R

• The definition of 3NF does not deal with a relation that:

• has multiple candidate keys, where
• those candidate keys are composite, and
• the candidate keys overlap (i.e., have at least one common
attribute)
Example of BCNF

Relation SSP(sid, sname, part_id, qty)

Candidate keys are (sid, part_id) and (sname, part_id).
With following FDs:
1. { sid, part_id } → qty
2. { sname, part_id } → qty
3. sid → sname
4. sname → sid

The relation is in 3NF:

For sid → sname, … sname is in a candidate key.
For sname → sid, … sid is in a candidate key.

However, this leads to redundancy and loss of information

Example of BCNF
If we decompose the schema into
R1 = ( sid, sname ), R2 = ( sid, part_id, qty )
These are in BCNF.

The decomposition is dependency preserving.

{ sname, part_id } → qty can be deduced from

(1) sname → sid (given)

(2) { sname, part_id } → { sid, part_id } (augmentation on (1))
(3) { sid, part_id } → qty (given)

and finally transitivity on (2) and (3).
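The SSP example can be checked mechanically with the BCNF condition (every nontrivial FD must have a superkey on its left side). This is a sketch with hypothetical helper names; for the decomposed relations it assumes that only the FDs whose attributes all lie in the sub-relation apply.

```python
def closure(attrs, fds):
    """All attributes functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(schema, fds):
    """Nontrivial FDs whose left side is not a superkey of `schema`."""
    return [
        (lhs, rhs) for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and not closure(lhs, fds) >= set(schema)
    ]

SSP = ("sid", "sname", "part_id", "qty")
FD1 = (("sid", "part_id"), ("qty",))
FD2 = (("sname", "part_id"), ("qty",))
FD3 = (("sid",), ("sname",))
FD4 = (("sname",), ("sid",))

ssp_bad = bcnf_violations(SSP, [FD1, FD2, FD3, FD4])

# Decomposition R1 = (sid, sname), R2 = (sid, part_id, qty)
r1_bad = bcnf_violations(("sid", "sname"), [FD3, FD4])
r2_bad = bcnf_violations(("sid", "part_id", "qty"), [FD1])

# Dependency preservation: FD2 follows from the FDs kept in R1 and R2.
fd2_preserved = set(FD2[1]) <= closure(FD2[0], [FD1, FD3, FD4])
```

SSP itself violates BCNF through `FD3` and `FD4`, while both decomposed relations pass, and `FD2` is still derivable from the FDs that remain.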

3NF vs BCNF
 Only in rare cases does a 3NF table not meet the
requirements of BCNF. A 3NF table which does not have
multiple overlapping candidate keys is guaranteed to be in
BCNF. Depending on what its functional dependencies are, a
3NF table with two or more overlapping candidate keys may
or may not be in BCNF.
 If a relation schema is not in BCNF
 it is possible to obtain a lossless-join decomposition into a
collection of BCNF relation schemas.
 Dependency-preserving is not guaranteed.
 3NF
 There is always a dependency-preserving, lossless-join
decomposition into a collection of 3NF relation schemas.
Properties of a good Decomposition
A decomposition of a relation R into sub-relations R1, R2,…….,
Rn should possess following properties:

The decomposition should be

• Attribute Preserving (all the attributes in the given relation must occur in at least one of the sub-relations)
• Dependency Preserving ( All the FDs in the given relation
must be preserved in the decomposed relations)
• Lossless join ( The natural join of decomposed relations should
produce the same original relation back, without any spurious
tuples).
• No redundancy ( The redundancy should be minimized in the
decomposed relations).
Lossless Join Decomposition
The set of relation schemas { R1, R2, …, Rn } is a lossless-join decomposition of R if:
for all possible relations r on schema R,
r = π_R1(r) ⋈ π_R2(r) ⋈ … ⋈ π_Rn(r)
Example:
Student = ( sid, sname, major)
F = { sid → sname, sid → major}

{ sid, sname } + { sid, major } is a lossless join decomposition

the intersection = {sid} is a key in both schemas

{sid, major} + { sname, major } is not a lossless join decomposition

the intersection = {major} is not a key in either
{sid, major} or { sname, major }
Another Example

R = { A, B, C, D }
F = { A → B, C → D }.
Key is {AC}.
introduce
Decomposition: { (A, B), (C, D), (A, C) } virtually
Consider it a two step decomposition:
1. Decompose R into R1 = (A, B), R2 = (A, C, D)
2. Decompose R2 into R3 = (C, D), R4 = (A, C)
This is a lossless join decomposition.

If R is decomposed into (A, B), (C, D)

This is a lossy-join decomposition.
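For a decomposition into two schemas, the examples above all use the same test: the decomposition is lossless when the common attributes functionally determine all of R1 or all of R2. A sketch of that test follows; `is_lossless_binary` is a hypothetical helper, and the test as written covers two-way splits under functional dependencies only, so the three-way example is checked in two steps.

```python
def closure(attrs, fds):
    """All attributes functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_lossless_binary(r1, r2, fds):
    """True if the attributes shared by r1 and r2 determine all of r1 or all of r2."""
    common = set(r1) & set(r2)
    determined = closure(common, fds)
    return determined >= set(r1) or determined >= set(r2)

STUDENT_FDS = [(("sid",), ("sname",)), (("sid",), ("major",))]
R_FDS = [(("A",), ("B",)), (("C",), ("D",))]

student_good = is_lossless_binary(("sid", "sname"), ("sid", "major"), STUDENT_FDS)
student_bad = is_lossless_binary(("sid", "major"), ("sname", "major"), STUDENT_FDS)

step1 = is_lossless_binary(("A", "B"), ("A", "C", "D"), R_FDS)   # R into R1, R2
step2 = is_lossless_binary(("C", "D"), ("A", "C"), R_FDS)        # R2 into R3, R4
lossy = is_lossless_binary(("A", "B"), ("C", "D"), R_FDS)        # no common attribute
```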
Fourth Normal Form
A relation R is in 4NF if and only if it satisfies following
conditions:
 R is already in 3NF or in BCNF.
 R contains no multi valued dependencies.

MVDs occur when two or more independent multi valued facts

about the same attribute occur within the same relation.

This means that if in a relation R, having A, B and C attributes,

B and C are multi valued represented as A→→B and A→→C,
then MVD exists only if B and C are independent of each other.
Example: 4NF
Fifth Normal Form
 A relation R is in 5NF (also called Projection-Join Normal form
or PJNF) iff every join dependency in the relation R is implied
by the candidate keys of the relation R.

 A relation decomposed into two relations must have lossless

join property, which ensures that no spurious tuples are
generated when relations are reunited using a natural join.

 There are requirements to decompose a relation into more

than two relations. Such cases are managed by join
dependency and 5NF.

 Implies that relations that have been decomposed in previous

NF can be recombined via natural joins to recreate the
original relation.
Fifth Normal Form
Consider the different case where, if an agent is an agent for a company and that
company makes a product, then he always sells that product for the company.
Under these circumstances, the 'agent company product' table is as shown
below . This relation contains following dependencies.
Agent →→ Company
Agent →→ Product_Name
Company→→Product_Name
agent_company_product_table
Fifth Normal Form
The table is necessary in order to show all the information required.
Suneet, for example, sells ABC's Nuts and Screws, but not ABC's Bolts. Raj is
not an agent for CDE and does not sell ABC's Nuts or Screws. The table is
in 4NF because it contains no multi-valued dependency. It does,
however, contain an element of redundancy in that it records the fact
that Suneet is an agent for ABC twice. Suppose that the table is
decomposed into its two projections, P1 and P2.

The redundancy has been eliminated, but the information about which
companies make which products and which of these products they
supply to which agents has been lost. The natural join of these two
projections will result in some spurious tuples (additional tuples which were
not present in the original relation).
Fifth Normal Form
This table can be decomposed into its three projections without loss of
information, as demonstrated below.

If we take the natural join of these three relations, we get the original
relation back, so this is the correct decomposition.

Decomposition into three projections
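
As a rough check of the three-way decomposition, the sketch below joins the three binary projections back together. The five rows are hypothetical: they were chosen only to obey the rule that an agent for a company sells every product that company makes, and are not taken from the slide's table. The helper name `join3` is likewise an illustration.

```python
# Hypothetical rows obeying the rule "agent of a company sells all of
# that company's products" (ABC makes Nut and Bolt, CDE makes Bolt).
original = {
    ("Suneet", "ABC", "Nut"),
    ("Suneet", "ABC", "Bolt"),
    ("Suneet", "CDE", "Bolt"),
    ("Raj", "ABC", "Nut"),
    ("Raj", "ABC", "Bolt"),
}

# The three binary projections.
agent_company = {(a, c) for a, c, _ in original}
agent_product = {(a, p) for a, _, p in original}
company_product = {(c, p) for _, c, p in original}


def join3(ac, ap, cp):
    """Natural join of the three projections on their shared attributes."""
    return {
        (a, c, p)
        for a, c in ac
        for a2, p in ap
        if a == a2 and (c, p) in cp
    }


# Joining only two projections still yields a spurious tuple ...
two_way = {(a, c, p) for a, c in agent_company
           for a2, p in agent_product if a == a2}
print(sorted(two_way - original))

# ... while joining all three recovers exactly the original rows.
three_way = join3(agent_company, agent_product, company_product)
print(three_way == original)
```

For these rows the third projection, (company, product), is what filters out the tuple claiming that Suneet sells Nuts for CDE.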

THANK
YOU
