Database Systems Course (Mata Kuliah Sistem Basis Data), Academic Year 2016/2017
Database Design
Normal forms & functional dependencies
Source: http://web.stanford.edu/class/cs145/
Stanford University
Database Design (1)
Overview of design theory & normal forms
Data anomalies & constraints
Functional dependencies
Overview of design theory & normal forms
Design Theory
Design theory is about how to represent your data to avoid anomalies.
It is a mostly mechanical process
Tools can carry out routine portions
Normal Forms
1st Normal Form (1NF) = All tables are flat
2nd Normal Form = disused
Boyce-Codd Normal Form (BCNF) and 3rd Normal Form (3NF) = DB designs based on functional dependencies, intended to prevent data anomalies
4th and 5th Normal Forms
1st Normal Form (1NF)
Violates 1NF:
Student  Courses
Mary     {CS145,CS229}
Joe      {CS145,CS106}

In 1st NF:
Student  Courses
Mary     CS145
Mary     CS229
Joe      CS145
Joe      CS106

1NF Constraint: values must be atomic!
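
The flattening shown in the two tables above can be sketched in a few lines of Python. This is only an illustration: the dictionary `nested` and the function name `to_first_normal_form` are hypothetical and not part of the course material.

```python
# Hypothetical in-memory version of the table that violates 1NF:
# each student maps to a *set* of courses (a non-atomic value).
nested = {
    "Mary": {"CS145", "CS229"},
    "Joe": {"CS145", "CS106"},
}


def to_first_normal_form(table):
    """Return one (student, course) row per course, sorted for stable output."""
    return sorted(
        (student, course)
        for student, courses in table.items()
        for course in courses
    )


flat_rows = to_first_normal_form(nested)
for row in flat_rows:
    print(row)
```

Every value in `flat_rows` is a single string, so the result satisfies the atomicity constraint.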
Data anomalies & constraints
Constraints Prevent (some) Anomalies in the Data
A poorly designed database causes anomalies:
Student  Course  Room
Mary     CS145   B01
Joe      CS145   B01
Sam      CS145   B01
..       ..      ..
If every course is in only one room, this table contains redundant information!
Constraints Prevent (some) Anomalies in the Data
A poorly designed database causes anomalies:
Student  Course  Room
..       ..      ..
If everyone drops the class, we lose what room the class is in! = a delete anomaly
Constraints Prevent (some) Anomalies in the Data
A poorly designed database causes anomalies:
Student  Course  Room
Mary     CS145   B01
Joe      CS145   B01
Sam      CS145   B01
         CS229   C12
..       ..      ..
Similarly, we can't reserve a room without students = an insert anomaly
Constraints Prevent (some) Anomalies in the Data
Student  Course
Mary     CS145
Joe      CS145
Sam      CS145
..       ..

Course  Room
CS145   B01
CS229   C12

Is this form better?
Redundancy?
Update anomaly?
Delete anomaly?
Insert anomaly?
Today: develop theory to understand why this design may be better and how to find this decomposition
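
A small Python sketch can make the three questions concrete. It assumes the constraint from the earlier slides (every course is in only one room), which is what allows the Course/Room table to be modeled as a dictionary. The variable names and the new room value "B02" are hypothetical.

```python
# Single-table design (Student, Course, Room) from the earlier slides.
enrolled_rooms = [
    ("Mary", "CS145", "B01"),
    ("Joe", "CS145", "B01"),
    ("Sam", "CS145", "B01"),
]

# Decomposed design: (Student, Course) pairs and a Course -> Room mapping.
enrolled = {(student, course) for student, course, _ in enrolled_rooms}
course_room = {course: room for _, course, room in enrolled_rooms}

# Redundancy: the room of CS145 is now stored once, not once per student.
rooms_stored_for_cs145 = sum(1 for course in course_room if course == "CS145")

# Insert: a room can be reserved for a course that has no students yet.
course_room["CS229"] = "C12"

# Update: one assignment changes the room for every enrolled student.
# "B02" is a made-up value used only for illustration.
course_room["CS145"] = "B02"

# Delete: everyone drops CS145, yet its room is still recorded.
enrolled = {(s, c) for s, c in enrolled if c != "CS145"}

print(enrolled, course_room)
```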
Functional Dependencies
Functional Dependencies (FDs)
A functional dependency X → Y holds over relation schema R if, for every allowable instance r of R:
t1 ∈ r, t2 ∈ r, π_X(t1) = π_X(t2)
implies π_Y(t1) = π_Y(t2)
(where t1 and t2 are tuples; X and Y are sets of attributes)
Explanation: X → Y means:
If for 2 tuples X is the same, then Y must also be the same.
Read "→" as "determines".
CAUTION: The opposite is not true.
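
The definition can be turned into a simple check on one instance. The sketch below is illustrative only: the function name `violates_fd` and the dictionary-per-tuple representation are assumptions. Finding no violation only means this particular instance is consistent with X → Y; it does not show that the FD holds for all allowable instances.

```python
def violates_fd(rows, lhs, rhs):
    """Return two rows that agree on lhs but differ on rhs, or None."""
    seen = {}  # projection on lhs -> (projection on rhs, first row seen)
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if x in seen and seen[x][0] != y:
            return seen[x][1], row
        seen.setdefault(x, (y, row))
    return None


rows = [
    {"Student": "Mary", "Course": "CS145", "Room": "B01"},
    {"Student": "Joe", "Course": "CS145", "Room": "B01"},
    {"Student": "Sam", "Course": "CS145", "Room": "B01"},
]

# Course -> Room: no two rows share a course but differ on room.
print(violates_fd(rows, ["Course"], ["Room"]))

# Course -> Student: Mary and Joe share a course but are different students.
print(violates_fd(rows, ["Course"], ["Student"]))
```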
Functional Dependencies (FDs)
An FD is a statement about all allowable relations.
Identified based on semantics, NOT instances
Given an instance of R, we can disprove an FD, but we cannot verify the validity of an FD.
Question: Are FDs related to keys?
If K → all attributes of R, then K is a superkey for R
(K is not required to be minimal.)
FDs are a generalization of keys.
Functional Dependencies (FDs): Example
Consider relation obtained from Hourly_Emps:
Hourly_Emps (ssn, name, lot, rating, wage_per_hr, hrs_per_wk)
We sometimes denote a relation schema by listing the attributes: e.g., SNLRWH
This is really the set of attributes {S,N,L,R,W,H}.
What are some FDs on Hourly_Emps?
ssn is the key: S → SNLRWH
rating determines wage_per_hr: R → W
lot determines lot: L → L (trivial dependency)
Functional Dependencies (FDs): Example
Problems Due to R → W
Hourly_Emps
S            N          L   R  W   H
123-22-3666  Attishoo   48  8  10  40
231-31-5368  Smiley     22  8  10  30
131-24-3650  Smethurst  35  5  7   30
434-26-3751  Guldu      35  5  7   32
612-67-4134  Madayan    35  8  10  40
R=6, W=?
Update anomaly: Should we be allowed to modify W in only the 1st tuple of SNLRWH?
Insertion anomaly: What if we want to insert an employee and don't know the hourly wage for his or her rating? (or we get it wrong?)
Deletion anomaly: If we delete all employees with rating 5, we lose the information about the wage for rating 5!
Functional Dependencies (FDs): Example
Detecting Redundancy
Hourly_Emps
S            N          L   R  W   H
123-22-3666  Attishoo   48  8  10  40
231-31-5368  Smiley     22  8  10  30
131-24-3650  Smethurst  35  5  7   30
434-26-3751  Guldu      35  5  7   32
612-67-4134  Madayan    35  8  10  40
Q: Why was R → W problematic, but S → W not?
Functional Dependencies (FDs): Example
Decomposing a Relation
Redundancy can be removed by "chopping" the relation into pieces (vertically!)
FDs are used to drive this process.
R → W is causing the problems, so decompose SNLRWH into what relations?
Hourly_Emps2
S            N          L   R  H
123-22-3666  Attishoo   48  8  40
231-31-5368  Smiley     22  8  30
131-24-3650  Smethurst  35  5  30
434-26-3751  Guldu      35  5  32
612-67-4134  Madayan    35  8  40

Wages
R  W
8  10
5  7
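
As a rough check that nothing is lost by this decomposition, the sketch below projects the Hourly_Emps instance onto the two smaller relations and joins them back on R. The tuple layout and variable names are assumptions chosen for illustration; the data is the instance from the slides.

```python
# (ssn, name, lot, rating, wage_per_hr, hrs_per_wk)
hourly_emps = [
    ("123-22-3666", "Attishoo", 48, 8, 10, 40),
    ("231-31-5368", "Smiley", 22, 8, 10, 30),
    ("131-24-3650", "Smethurst", 35, 5, 7, 30),
    ("434-26-3751", "Guldu", 35, 5, 7, 32),
    ("612-67-4134", "Madayan", 35, 8, 10, 40),
]

# Vertical split: Hourly_Emps2(S, N, L, R, H) and Wages(R, W).
hourly_emps2 = [(s, n, l, r, h) for s, n, l, r, w, h in hourly_emps]
# A dict works for Wages because R -> W: each rating has exactly one wage.
wages = {r: w for s, n, l, r, w, h in hourly_emps}

# Join the two pieces back on R to recover the original tuples.
rejoined = [(s, n, l, r, wages[r], h) for s, n, l, r, h in hourly_emps2]

print(wages)
print(rejoined == hourly_emps)
```

Each wage now appears once in `wages` instead of once per employee, and the join reproduces the original instance.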
Functional Dependencies (FDs): Refining an ER Diagram
Before:
[ER diagram: Employees (ssn, name, lot), Works_In (since), Departments (did, dname, budget)]
1st diagram becomes:
Workers(S,N,L,D,Si)
Departments(D,M,B)
Lots associated with workers.
Suppose all workers in a dept are assigned the same lot: D → L
Redundancy; fixed by:
Workers2(S,N,D,Si)
Dept_Lots(D,L)
Departments(D,M,B)
Can fine-tune this:
Workers2(S,N,D,Si)
Departments(D,M,B,L)
After:
[ER diagram: Employees (ssn, name), Works_In (since), Departments (did, dname, budget, lot)]
Functional Dependencies (FDs): Reasoning About FDs
Given some FDs, we can usually infer additional FDs:
title → studio, star implies title → studio and title → star
title → studio and title → star implies title → studio, star
title → studio, studio → star implies title → star
But,
title, star → studio does NOT necessarily imply that title → studio or that star → studio
An FD f is implied by a set of FDs F if f holds whenever all FDs in F hold.
Finding Functional Dependencies
Equivalent to asking: Given a set of FDs, F = {f1, …, fn}, does an FD g hold?
Inference problem: How do we decide?
Answer: Three simple rules called Armstrong's Rules.
1. Split/Combine,
2. Reduction, and
3. Transitivity
(ideas by picture)
1. Split/Combine
A1, …, Am → B1, …, Bn
is equivalent to the following n FDs:
A1, …, Am → Bi for i = 1, …, n
And vice versa, A1, …, Am → Bi for i = 1, …, n
is equivalent to
A1, …, Am → B1, …, Bn
2. Reduction/Trivial
A1, …, Am → Aj for any j = 1, …, m
3. Transitive Closure
A1, …, Am → B1, …, Bn and
B1, …, Bn → C1, …, Ck
implies
A1, …, Am → C1, …, Ck
Finding Functional Dependencies
Example:
Products
Name    Color  Category  Dep     Price
Gizmo   Green  Gadget    Toys    49
Widget  Black  Gadget    Toys    59
Gizmo   Green  Whatsit   Garden  99
Provided FDs:
1. {Name} → {Color}
2. {Category} → {Department}
3. {Color, Category} → {Price}
Which / how many other FDs hold?
Finding Functional Dependencies
Example:
Provided FDs:
1. {Name} → {Color}
2. {Category} → {Dept.}
3. {Color, Category} → {Price}
Inferred FDs:
Inferred FD                                   Rule used
4. {Name, Category} → {Name}                  Trivial
5. {Name, Category} → {Color}                 Transitive (4 → 1)
6. {Name, Category} → {Category}              Trivial
7. {Name, Category} → {Color, Category}       Split/combine (5 + 6)
8. {Name, Category} → {Price}                 Transitive (7 → 3)
Which / how many other FDs hold?
Can we find an algorithmic way to do this?
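
The derivation of FDs 4 to 8 can be replayed step by step in Python. This is a minimal sketch: representing an FD as a pair of frozensets and the helper names `fd`, `trivial`, `combine`, and `transitive` are assumptions made for illustration, and the transitivity helper deliberately requires the middle sets to match exactly, as in the rule's statement.

```python
def fd(lhs, rhs):
    """Represent the FD lhs -> rhs as a pair of frozensets."""
    return (frozenset(lhs), frozenset(rhs))


def trivial(lhs, attr):
    """Reduction/trivial rule: X -> A for any attribute A in X."""
    if attr not in lhs:
        raise ValueError("attribute must belong to the left-hand side")
    return fd(lhs, {attr})


def combine(fd_a, fd_b):
    """Combine rule: X -> Y and X -> Z give X -> (Y union Z)."""
    if fd_a[0] != fd_b[0]:
        raise ValueError("left-hand sides must match")
    return (fd_a[0], fd_a[1] | fd_b[1])


def transitive(fd_a, fd_b):
    """Transitivity: X -> Y and Y -> Z give X -> Z."""
    if fd_a[1] != fd_b[0]:
        raise ValueError("right side of the first FD must equal left side of the second")
    return (fd_a[0], fd_b[1])


# Provided FDs 1 and 3 (FD 2 is not needed for this derivation).
fd1 = fd({"Name"}, {"Color"})
fd3 = fd({"Color", "Category"}, {"Price"})

name_category = frozenset({"Name", "Category"})
fd4 = trivial(name_category, "Name")      # {Name, Category} -> {Name}
fd5 = transitive(fd4, fd1)                # {Name, Category} -> {Color}
fd6 = trivial(name_category, "Category")  # {Name, Category} -> {Category}
fd7 = combine(fd5, fd6)                   # {Name, Category} -> {Color, Category}
fd8 = transitive(fd7, fd3)                # {Name, Category} -> {Price}

print(fd8)
```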
Closures
Closure of a set of Attributes
Given a set of attributes A1, …, An and a set of FDs F:
Then the closure, {A1, …, An}+ is the set of attributes B s.t. {A1, …, An} → B
Example: F =
{name} → {color}
{category} → {department}
{color, category} → {price}
Example Closures:
{name}+ = {name, color}
{name, category}+ = {name, category, color, dept, price}
{color}+ = {color}
Closure Algorithm
Start with X = {A1, …, An} and set of FDs F.
Repeat until X doesn't change; do:
  if {B1, …, Bn} → C is entailed by F
  and {B1, …, Bn} ⊆ X
  then add C to X.
Return X as X+
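
The loop above translates almost directly into Python. The sketch below is one possible rendering, not the course's reference implementation: the function name `closure` and the list-of-pairs encoding of F are assumptions, and it only scans the FDs listed in F, which is enough to reach the fixed point.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:  # repeat until X doesn't change
        changed = False
        for lhs, rhs in fds:
            # If the left side is already inside X, add the right side to X.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


F = [
    ({"name"}, {"color"}),
    ({"category"}, {"dept"}),
    ({"color", "category"}, {"price"}),
]

print(sorted(closure({"name"}, F)))
print(sorted(closure({"name", "category"}, F)))
print(sorted(closure({"color"}, F)))
```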
Closure Algorithm
Start with X = {A1, …, An}, FDs F.
Repeat until X doesn't change; do:
  if {B1, …, Bn} → C is in F and {B1, …, Bn} ⊆ X:
  then add C to X.
Return X as X+
F =
{name} → {color}
{category} → {dept}
{color, category} → {price}
{name, category}+ =
{name, category}
{name, category}+ =
{name, category, color}
{name, category}+ =
{name, category, color, dept}
{name, category}+ =
{name, category, color, dept, price}
Example
R(A,B,C,D,E,F)
{A,B} → {C}
{A,D} → {E}
{B} → {D}
{A,F} → {B}
Compute {A,B}+ = {A, B, …}
Compute {A, F}+ = {A, F, …}
After one pass:
{A,B}+ = {A, B, C, D}
{A, F}+ = {A, F, B}
Final result:
{A,B}+ = {A, B, C, D, E}
{A, F}+ = {A, B, C, D, E, F}
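
The two closures can be verified mechanically, and the result ties back to keys: a set K is a superkey when K+ contains every attribute of R. The sketch below assumes single-letter attribute names so that strings can stand in for attribute sets; `closure` and `is_superkey` are hypothetical helper names.

```python
def closure(attrs, fds):
    """Closure of attrs under fds; each FD is a (lhs, rhs) pair of strings."""
    result = set(attrs)
    while True:
        # Attributes derivable in one step that are not yet in the result.
        new = {a for lhs, rhs in fds if set(lhs) <= result for a in rhs} - result
        if not new:
            return result
        result |= new


def is_superkey(attrs, all_attrs, fds):
    """K is a superkey if K+ contains all attributes of the relation."""
    return closure(attrs, fds) == set(all_attrs)


R = "ABCDEF"
F = [("AB", "C"), ("AD", "E"), ("B", "D"), ("AF", "B")]

print(sorted(closure("AB", F)))
print(sorted(closure("AF", F)))
print(is_superkey("AB", R, F), is_superkey("AF", R, F))
```

Under these FDs, {A, F} determines every attribute of R while {A, B} never reaches F.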