Chapter 3
Database Design:
Logical Design-Part2
Introduction
Steps For Designing Database
Conceptual Design: STEP 1
Logical Design: STEP 2, STEP 3
Physical Design: STEP 4, STEP 5, STEP 6, STEP 7, STEP 8, STEP 9
Methodology Overview -
Conceptual Database Design
Step 1 Build local conceptual data model for each user view
Step 1.1 Identify entity types
Step 1.2 Identify relationship types
Step 1.3 Identify and associate attributes with entity or
relationship types
Step 1.4 Determine attribute domains
Step 1.5 Determine candidate and primary key attributes
Step 1.6 Consider use of enhanced modeling concepts (optional
step)
Step 1.7 Check model for redundancy
Step 1.8 Validate local conceptual model against user transactions
Step 1.9 Review local conceptual data model with user
Methodology Overview - Logical
Database Design for Relational
Model
Step 2 Build and validate local logical data model for each view
Step 2.1 Remove features not compatible with the
relational model (optional step)
Step 2.2 Derive relations for local logical data model
Step 2.3 Validate relations using
normalization
Step 2.4 Validate relations against user transactions
Step 2.5 Define integrity constraints
Step 2.6 Review local logical data model with user
Methodology Overview - Logical
Database Design for Relational
Model(cont)
Step 3 Build and validate global logical data model
Step 3.1 Merge local logical data models into
global model
Step 3.2 Validate global logical data model
Step 3.3 Check for future growth
Step 3.4 Review global logical data model with
users
Methodology Overview - Physical
Database Design for Relational
Databases
Step 4 Translate global logical data model for
target DBMS
Step 4.1 Design base relations
Step 4.2 Design representation of derived data
Step 4.3 Design enterprise constraints
Step 5 Design physical representation
Step 5.1 Analyze transactions
Step 5.2 Choose file organization
Step 5.3 Choose indexes
Step 5.4 Estimate disk space requirements
Methodology Overview - Physical
Database Design for Relational
Databases(cont)
Step 6 Design user views
Step 7 Design security mechanisms
Step 8 Consider the introduction of
controlled redundancy
Step 9 Monitor and tune the
operational system
Normalization
Main objective in developing a logical data
model for relational database systems is to
create an accurate representation of
the data, its relationships, and
constraints.
To achieve this objective, we must identify a
suitable set of relations.
Normalization(cont)
Four most commonly used normal forms
are first (1NF), second (2NF) and
third (3NF) normal forms, and
Boyce–Codd normal form (BCNF).
Based on functional dependencies
among the attributes of a relation.
A relation can be normalized to a specific
form to prevent possible occurrence
of anomalies and data redundancy.
Data Redundancy
Data Redundancy(cont)
StaffBranch relation has redundant data:
details of a branch are repeated for
every member of staff.
In contrast, branch information appears
only once for each branch in Branch relation
and only branchNo is repeated in Staff
relation, to represent where each member of
staff works.
Normalization(cont)
A bad database design may suffer from anomalies that make the
database difficult to use:
COMPANIES(company_name, owner_id,
company_address, date_founded,owner_name,
owner_title, #shares )
Suppose Primary Key (company_name, owner_id)
Anomalies:
update anomaly occurs if changing the value of an attribute
leads to an inconsistent database state.
insertion anomaly occurs if we cannot insert a tuple due to
some design flaw.
deletion anomaly occurs if deleting a tuple results in
unexpected loss of information.
Normalization is the systematic process for removing all such
anomalies in database design.
Update Anomaly
If a company has three owners, there are three tuples in the
COMPANIES relation for this company.
If this company moves to a new location, the company’s
address must be updated consistently in all three tuples
updating the company address in just one or two of the tuples
creates an inconsistent database state
It would be better if the company name and address were in a
separate relation so that the address of each company appears in
only one tuple
COMPANIES(company_name, company_address,
date_founded, owner_id, owner_name,
owner_title, #shares )
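The update anomaly can be sketched in a few lines of Python. This is only an illustration: the company name, addresses, and owner ids below are made-up values, and plain lists and dictionaries stand in for relations.

```python
# Hypothetical rows of the COMPANIES relation (only three attributes shown).
companies = [
    {"company_name": "Acme", "owner_id": 1, "company_address": "1 Old Rd"},
    {"company_name": "Acme", "owner_id": 2, "company_address": "1 Old Rd"},
    {"company_name": "Acme", "owner_id": 3, "company_address": "1 Old Rd"},
]

def addresses_of(rows, name):
    """Return the set of distinct addresses recorded for one company."""
    return {r["company_address"] for r in rows if r["company_name"] == name}

# Updating only one of the three tuples leaves two conflicting addresses.
companies[0]["company_address"] = "9 New St"
inconsistent = addresses_of(companies, "Acme")

# With the address held once in its own relation, one update is enough.
company = {"Acme": {"company_address": "1 Old Rd"}}
owners = [("Acme", 1), ("Acme", 2), ("Acme", 3)]
company["Acme"]["company_address"] = "9 New St"
```

After the partial update, `inconsistent` holds two different addresses for the same company, which is exactly the inconsistent state described above.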
Example of Update
Anomalies
To insert a new staff with branchNo B007 into the
StaffBranch relation;
To delete a tuple that represents the last member of staff
located at a branch B007;
To change the address of branch B003.
StaffBranch
staffNo  sName        position    salary  branchNo  bAddress
SL21     John White   Manager     30000   B005      22 Deer Rd, London
SG37     Ann Beech    Assistant   12000   B003      163 Main St, Glasgow
SG14     David Ford   Supervisor  18000   B003      163 Main St, Glasgow
SA9      Mary Howe    Assistant   9000    B007      16 Argyll St, Aberdeen
SG5      Susan Brand  Manager     24000   B003      163 Main St, Glasgow
SL41     Julie Lee    Assistant   9000    B005      22 Deer Rd, London
Figure 1 StaffBranch relation
Example of Update
Anomalies (cont)
Staff
staffNo  sName        position    salary  branchNo
SL21     John White   Manager     30000   B005
SG37     Ann Beech    Assistant   12000   B003
SG14     David Ford   Supervisor  18000   B003
SA9      Mary Howe    Assistant   9000    B007
SG5      Susan Brand  Manager     24000   B003
SL41     Julie Lee    Assistant   9000    B005
Branch
branchNo  bAddress
B005      22 Deer Rd, London
B007      16 Argyll St, Aberdeen
B003      163 Main St, Glasgow
Figure 2 Staff and Branch relations
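As a rough sketch of why the two-relation design behaves better, the Python below splits the StaffBranch rows of Figure 1 into Staff and Branch, joins them back, and then deletes the last member of staff at B007. Tuples and a dictionary stand in for relations; the variable names are illustrative only.

```python
# Rows of the StaffBranch relation from Figure 1.
staff_branch = [
    ("SL21", "John White", "Manager", 30000, "B005", "22 Deer Rd, London"),
    ("SG37", "Ann Beech", "Assistant", 12000, "B003", "163 Main St, Glasgow"),
    ("SG14", "David Ford", "Supervisor", 18000, "B003", "163 Main St, Glasgow"),
    ("SA9", "Mary Howe", "Assistant", 9000, "B007", "16 Argyll St, Aberdeen"),
    ("SG5", "Susan Brand", "Manager", 24000, "B003", "163 Main St, Glasgow"),
    ("SL41", "Julie Lee", "Assistant", 9000, "B005", "22 Deer Rd, London"),
]

# Project onto Staff (first five attributes) and Branch (branchNo -> bAddress).
staff = [row[:5] for row in staff_branch]
branch = {row[4]: row[5] for row in staff_branch}

# Joining on branchNo rebuilds the original relation, so nothing was lost.
rejoined = [s + (branch[s[4]],) for s in staff]

# Deletion anomaly: removing SA9 from the single relation loses branch B007.
single_after = [r for r in staff_branch if r[0] != "SA9"]
b007_lost_in_single = all(r[4] != "B007" for r in single_after)

# In the decomposed design the branch details survive the same deletion.
staff_after = [s for s in staff if s[0] != "SA9"]
b007_kept_in_branch = "B007" in branch
```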
Insert Anomaly
Suppose that three people have just created a new company:
the three founders have no titles yet
stock distributions have yet to be defined
The new company cannot be added to the COMPANIES
relation because there is not enough information to fill in all
the attributes of a tuple
at best, null values can be used to complete a tuple
It would be better if owner and stock information was stored
in a different relation
COMPANIES(company_name, company_address,
date_founded, owner_id, owner_name,
owner_title, #shares )
Delete Anomaly
Suppose that an owner of a company retires so is no longer
an owner but retains stock in the company
If this person’s tuple is deleted from the COMPANIES
relation, then we lose the information about how much
stock the person still owns
If the stock information was stored in a different relation,
then we can retain this information after the person is
deleted as an owner of the company
COMPANIES(company_name, company_address,
date_founded, owner_id, owner_name,
owner_title, #shares )
Relationship Between
Normal Forms
Steps For Normalization
Unnormalized Form (UNF)
  Remove Repeating Groups
First Normalized Form (1NF)
  Remove Partial Dependencies
Second Normalized Form (2NF)
  Remove Transitive Dependencies
Third Normalized Form (3NF)
Unnormalized Form (UNF): A table that contains one or more repeating groups.
First Normal Form (1NF): A relation in which the intersection of each row and
column contains one and only one value.
Full Functional Dependency: If A and B are attributes of a relation,
B is fully functionally dependent on A if B is functionally dependent on A,
but not on any proper subset of A.
Second Normal Form (2NF): A relation that is in 1NF and in which every
non-primary-key attribute is fully functionally dependent on the primary key.
Transitive Dependency: A condition where A, B, C are attributes of a relation
such that if A -> B and B -> C, then C is transitively dependent on A via B
(provided that A is not functionally dependent on B or C).
Third Normal Form (3NF): A relation that is in 1NF and 2NF, and in which no
non-primary-key attribute is transitively dependent on the primary key.
Unnormalized Table/Relation
SALESPERSON  SALESPERSON  SALES  CUSTOMER  CUSTOMER     WAREHOUSE  WAREHOUSE  SALES
NUMBER       NAME         AREA   NUMBER    NAME         NUMBER     LOCATION   AMOUNT
3462         Waters       West   18765     Delta Sys.   4          Fargo      13540
                                 18830     Levy & Sons  3          Bismarck   10600
                                 19242     Ranier Com   3          Bismarck   9700
3593         Dryne        East   18841     R.W.Flood    2          Superior   11560
                                 18899     Seward       2          Superior   2590
                                 19565     Stodola      1          Plymouth   8800
…            …            …      …         …            …          …          …
First Normal Form (1NF)
SALESPERSON
{PK} SALESPERSON NUMBER
SALESPERSON NUMBER  SALESPERSON NAME  SALES AREA
3462                Waters            West
3593                Dryne             East
…                   …                 …
SALESPERSON-CUSTOMER
{PK} SALESPERSON NUMBER, CUSTOMER NUMBER
SALESPERSON  CUSTOMER  CUSTOMER     WAREHOUSE  WAREHOUSE  SALES
NUMBER       NUMBER    NAME         NUMBER     LOCATION   AMOUNT
3462         18765     Delta Sys.   4          Fargo      13540
3462         18830     Levy & Sons  3          Bismarck   10600
3462         19242     Ranier Com   3          Bismarck   9700
3593         18841     R.W.Flood    2          Superior   11560
3593         18899     Seward       2          Superior   2590
3593         19565     Stodola      1          Plymouth   8800
…            …         …            …          …          …
A Data Model Diagram (SALESPERSON-CUSTOMER)
[Diagram: SALESPERSON NUMBER and CUSTOMER NUMBER shown with the attributes
SALES AMOUNT, CUSTOMER NAME, WAREHOUSE NUMBER and WAREHOUSE LOCATION]
Second Normal Form (2NF)
SALES
SALESPERSON NUMBER  CUSTOMER NUMBER  SALES AMOUNT
3462                18765            13540
3462                18830            10600
3462                19242            9700
3593                18841            11560
3593                18899            2590
3593                19565            8800
…                   …                …
CUSTOMER-WAREHOUSE
{PK} CUSTOMER NUMBER
CUSTOMER NUMBER  CUSTOMER NAME  WAREHOUSE NUMBER  WAREHOUSE LOCATION
18765            Delta Sys.     4                 Fargo
18830            Levy & Sons    3                 Bismarck
19242            Ranier Com     3                 Bismarck
18841            R.W.Flood      2                 Superior
18899            Seward         2                 Superior
19565            Stodola        1                 Plymouth
…                …              …                 …
SALESPERSON
{PK} SALESPERSON NUMBER
SALESPERSON NUMBER  SALESPERSON NAME  SALES AREA
3462                Waters            West
3593                Dryne             East
…                   …                 …
SALES
SALESPERSON NUMBER  CUSTOMER NUMBER  SALES AMOUNT
3462                18765            13540
3462                18830            10600
3462                19242            9700
3593                18841            11560
3593                18899            2590
3593                19565            8800
…                   …                …
CUSTOMER-WAREHOUSE
{PK} CUSTOMER NUMBER
CUSTOMER NUMBER  CUSTOMER NAME  WAREHOUSE NUMBER  WAREHOUSE LOCATION
18765            Delta Sys.     4                 Fargo
18830            Levy & Sons    3                 Bismarck
19242            Ranier Com     3                 Bismarck
18841            R.W.Flood      2                 Superior
18899            Seward         2                 Superior
19565            Stodola        1                 Plymouth
…                …              …                 …
A Data Model Diagram (CUSTOMER-WAREHOUSE)
[Diagram: CUSTOMER NUMBER shown with the attributes CUSTOMER NAME,
WAREHOUSE NUMBER and WAREHOUSE LOCATION]
Third Normal Form (3NF)
WAREHOUSE
{PK} WAREHOUSE NUMBER
WAREHOUSE NUMBER  WAREHOUSE LOCATION
4                 Fargo
3                 Bismarck
2                 Superior
1                 Plymouth
…                 …
CUSTOMER
{PK} CUSTOMER NUMBER, {FK} WAREHOUSE NUMBER
CUSTOMER NUMBER  CUSTOMER NAME  WAREHOUSE NUMBER
18765            Delta Sys.     4
18830            Levy & Sons    3
19242            Ranier Com     3
18841            R.W.Flood      2
18899            Seward         2
19565            Stodola        1
…                …              …
SALESPERSON
{PK} SALESPERSON NUMBER
SALESPERSON NUMBER  SALESPERSON NAME  SALES AREA
3462                Waters            West
3593                Dryne             East
…                   …                 …
SALES
SALESPERSON NUMBER  CUSTOMER NUMBER  SALES AMOUNT
3462                18765            13540
3462                18830            10600
3462                19242            9700
3593                18841            11560
3593                18899            2590
3593                19565            8800
…                   …                …
CUSTOMER
{PK} CUSTOMER NUMBER, {FK} WAREHOUSE NUMBER
CUSTOMER NUMBER  CUSTOMER NAME  WAREHOUSE NUMBER
18765            Delta Sys.     4
18830            Levy & Sons    3
19242            Ranier Com     3
18841            R.W.Flood      2
18899            Seward         2
19565            Stodola        1
…                …              …
WAREHOUSE
{PK} WAREHOUSE NUMBER
WAREHOUSE NUMBER  WAREHOUSE LOCATION
4                 Fargo
3                 Bismarck
2                 Superior
1                 Plymouth
…                 …
Boyce-Codd Normal Form (BCNF)
Boyce-Codd Normal Form is a more restricted version of 3NF.
A relation is in BCNF if and only if every determinant is a candidate
key.
Consider the following relation:
StaffProject(StnNo,PjcNo, StaffNo);
StnNo,PjcNo -> StaffNo (primary key)
StaffNo,StnNo -> PjcNo (candidate key)
StaffNo -> PjcNo (not a candidate key)
Boyce-Codd Normal Form for above (split on the violating
dependency StaffNo -> PjcNo, so that the join on StaffNo is lossless):
StnStaff(StnNo, StaffNo);
Staff(StaffNo, PjcNo)
Conceptual Level,
Conceptual Schema,
Logical Design,
Logical Data Model
CASE 1
DreamHome Case Study – An Overview
Please refer to Connolly, T., Begg, C. (2002).
"Database Systems: A Practical Approach to Design,
Implementation, and Management", Addison Wesley, USA,
Chapter 10, Section 10.4.1, p. 309
Functional
Dependencies
Functional dependency describes the relationship between
attributes in a relation.
For example, if A and B are attributes of relation R, B is
functionally dependent on A (denoted A -> B) if each value of
A is associated with exactly one value of B. (A and B may each
consist of one or more attributes.)
A -> B: B is functionally dependent on A
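The definition can be tried out on sample data. The sketch below is a minimal check, assuming a relation is given as a list of dictionaries; the function name `holds` is illustrative. It uses the StaffBranch rows of Figure 1 (a subset of the attributes). Note that sample rows can only show that a dependency is violated; they cannot prove that it holds for every possible state of the relation.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True

# StaffBranch rows from Figure 1 (sName omitted for brevity).
cols = ("staffNo", "position", "salary", "branchNo", "bAddress")
data = [
    ("SL21", "Manager", 30000, "B005", "22 Deer Rd, London"),
    ("SG37", "Assistant", 12000, "B003", "163 Main St, Glasgow"),
    ("SG14", "Supervisor", 18000, "B003", "163 Main St, Glasgow"),
    ("SA9", "Assistant", 9000, "B007", "16 Argyll St, Aberdeen"),
    ("SG5", "Manager", 24000, "B003", "163 Main St, Glasgow"),
    ("SL41", "Assistant", 9000, "B005", "22 Deer Rd, London"),
]
rows = [dict(zip(cols, r)) for r in data]

branch_determines_address = holds(rows, ["branchNo"], ["bAddress"])
position_determines_salary = holds(rows, ["position"], ["salary"])
```

Here `branchNo -> bAddress` is consistent with the sample, while `position -> salary` is refuted because the two managers earn different salaries.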
Determinant
Refers to the attribute or group of
attributes on the left-hand side of
the arrow of a functional
dependency
First Normal Form (1NF)
Unnormalized form (UNF)
A table that contains one or more repeating groups.
clientNo  cName          propertyNo  pAddress                rentStart  rentFinish  rent  ownerNo  oName
CR76      John Kay       PG4         6 Lawrence St, Glasgow  1-Jul-00   31-Aug-01   350   CO40     Tina Murphy
                         PG16        5 Novar Dr, Glasgow     1-Sep-02   1-Sep-02    450   CO93     Tony Shaw
CR56      Aline Stewart  PG4         6 Lawrence St, Glasgow  1-Sep-99   10-Jun-00   350   CO40     Tina Murphy
                         PG36        2 Manor Rd, Glasgow     10-Oct-00  1-Dec-01    370   CO93     Tony Shaw
                         PG16        5 Novar Dr, Glasgow     1-Nov-02   1-Aug-03    450   CO93     Tony Shaw
Figure 3 ClientRental unnormalized table
1NF ClientRental relation with
the first approach
The ClientRental relation is defined as follows:
ClientRental
(clientNo, propertyNo, cName, pAddress, rentStart, rentFinish, rent,
ownerNo, oName)
clientNo  propertyNo  cName          pAddress                rentStart  rentFinish  rent  ownerNo  oName
CR76      PG4         John Kay       6 Lawrence St, Glasgow  1-Jul-00   31-Aug-01   350   CO40     Tina Murphy
CR76      PG16        John Kay       5 Novar Dr, Glasgow     1-Sep-02   1-Sep-02    450   CO93     Tony Shaw
CR56      PG4         Aline Stewart  6 Lawrence St, Glasgow  1-Sep-99   10-Jun-00   350   CO40     Tina Murphy
CR56      PG36        Aline Stewart  2 Manor Rd, Glasgow     10-Oct-00  1-Dec-01    370   CO93     Tony Shaw
CR56      PG16        Aline Stewart  5 Novar Dr, Glasgow     1-Nov-02   1-Aug-03    450   CO93     Tony Shaw
Figure 4 1NF ClientRental relation with the first approach
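The first approach amounts to copying the client data into every row of the repeating group. A minimal Python sketch, assuming the unnormalized table of Figure 3 is held as nested dictionaries (only a few attributes are shown, and the key name `rentals` is an illustrative choice):

```python
# Unnormalized form: each client carries a repeating group of rentals.
unnormalized = [
    {"clientNo": "CR76", "cName": "John Kay", "rentals": [
        {"propertyNo": "PG4", "rentStart": "1-Jul-00"},
        {"propertyNo": "PG16", "rentStart": "1-Sep-02"},
    ]},
    {"clientNo": "CR56", "cName": "Aline Stewart", "rentals": [
        {"propertyNo": "PG4", "rentStart": "1-Sep-99"},
        {"propertyNo": "PG36", "rentStart": "10-Oct-00"},
        {"propertyNo": "PG16", "rentStart": "1-Nov-02"},
    ]},
]

# 1NF, first approach: one flat row per rental, client data repeated.
flat = [
    {"clientNo": c["clientNo"], "cName": c["cName"], **rental}
    for c in unnormalized
    for rental in c["rentals"]
]

# Every cell now holds a single value rather than a nested list.
all_atomic = all(not isinstance(v, list) for row in flat for v in row.values())
```

The flattening removes the repeating group but introduces the redundancy in `cName` that the later normal forms address.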
1NF ClientRental relation with the
second approach
Client (clientNo, cName)
PropertyRentalOwner (clientNo, propertyNo, pAddress,
rentStart, rentFinish, rent, ownerNo,
oName)
With the second approach, we remove the repeating group
(property rented details) by placing the repeating data along
with a copy of the original key attribute (clientNo) in a
separate relation.
Client
clientNo  cName
CR76      John Kay
CR56      Aline Stewart
PropertyRentalOwner
clientNo  propertyNo  pAddress                rentStart  rentFinish  rent  ownerNo  oName
CR76      PG4         6 Lawrence St, Glasgow  1-Jul-00   31-Aug-01   350   CO40     Tina Murphy
CR76      PG16        5 Novar Dr, Glasgow     1-Sep-02   1-Sep-02    450   CO93     Tony Shaw
CR56      PG4         6 Lawrence St, Glasgow  1-Sep-99   10-Jun-00   350   CO40     Tina Murphy
CR56      PG36        2 Manor Rd, Glasgow     10-Oct-00  1-Dec-01    370   CO93     Tony Shaw
CR56      PG16        5 Novar Dr, Glasgow     1-Nov-02   1-Aug-03    450   CO93     Tony Shaw
Figure 5 1NF ClientRental relation with the second approach
2NF ClientRental
relation
The ClientRental relation has the following functional
dependencies:
fd1 clientNo, propertyNo -> rentStart, rentFinish
(Primary Key)
fd2 clientNo -> cName
(Partial dependency)
fd3 propertyNo -> pAddress, rent, ownerNo, oName
(Partial dependency)
fd4 ownerNo -> oName
(Transitive Dependency)
fd5 clientNo, rentStart -> propertyNo, pAddress,
rentFinish, rent, ownerNo, oName
(Candidate key)
fd6 propertyNo, rentStart -> clientNo, cName, rentFinish
(Candidate key)
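The labels on fd1 to fd6 can be checked mechanically with an attribute-closure computation. The sketch below is one possible way to do this in Python; the names `closure`, `FDS` and `partial` are illustrative, and the dependencies are transcribed from the list above.

```python
ALL = {"clientNo", "propertyNo", "cName", "pAddress", "rentStart",
       "rentFinish", "rent", "ownerNo", "oName"}

# fd1..fd6 as (determinant, dependent attributes) pairs.
FDS = [
    ({"clientNo", "propertyNo"}, {"rentStart", "rentFinish"}),
    ({"clientNo"}, {"cName"}),
    ({"propertyNo"}, {"pAddress", "rent", "ownerNo", "oName"}),
    ({"ownerNo"}, {"oName"}),
    ({"clientNo", "rentStart"},
     {"propertyNo", "pAddress", "rentFinish", "rent", "ownerNo", "oName"}),
    ({"propertyNo", "rentStart"}, {"clientNo", "cName", "rentFinish"}),
]

def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

PRIMARY_KEY = {"clientNo", "propertyNo"}

# A key determines every attribute of the relation.
pk_is_key = closure(PRIMARY_KEY, FDS) == ALL

# Partial dependencies: attributes determined by one part of the key alone.
partial = {a: closure({a}, FDS) - {a} for a in sorted(PRIMARY_KEY)}
```

The non-empty entries in `partial` are the attributes that have to move out of ClientRental to reach 2NF.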
2NF ClientRental
relation
Client (clientNo, cName)
Rental (clientNo, propertyNo, rentStart, rentFinish)
PropertyOwner (propertyNo, pAddress, rent, ownerNo, oName)
Client
clientNo  cName
CR76      John Kay
CR56      Aline Stewart
Rental
clientNo  propertyNo  rentStart  rentFinish
CR76      PG4         1-Jul-00   31-Aug-01
CR76      PG16        1-Sep-02   1-Sep-02
CR56      PG4         1-Sep-99   10-Jun-00
CR56      PG36        10-Oct-00  1-Dec-01
CR56      PG16        1-Nov-02   1-Aug-03
PropertyOwner
propertyNo  pAddress                rent  ownerNo  oName
PG4         6 Lawrence St, Glasgow  350   CO40     Tina Murphy
PG16        5 Novar Dr, Glasgow     450   CO93     Tony Shaw
PG36        2 Manor Rd, Glasgow     370   CO93     Tony Shaw
Figure 6 2NF ClientRental relation
3NF ClientRental
relation
The functional dependencies for the Client, Rental and
PropertyOwner relations are as follows:
Client
fd2 clientNo -> cName (Primary Key)
Rental
fd1 clientNo, propertyNo -> rentStart, rentFinish (Primary Key)
fd5 clientNo, rentStart -> propertyNo, rentFinish (Candidate key)
fd6 propertyNo, rentStart -> clientNo, rentFinish (Candidate key)
PropertyOwner
fd3 propertyNo -> pAddress, rent, ownerNo, oName (Primary Key)
fd4 ownerNo -> oName (Transitive Dependency)
3NF ClientRental
relation
The resulting 3NF relations have the forms:
Client
(clientNo, cName)
Rental
(clientNo, propertyNo, rentStart, rentFinish)
PropertyOwner
(propertyNo, pAddress, rent, ownerNo)
Owner
(ownerNo, oName)
3NF ClientRental
relation
Client
clientNo  cName
CR76      John Kay
CR56      Aline Stewart
Rental
clientNo  propertyNo  rentStart  rentFinish
CR76      PG4         1-Jul-00   31-Aug-01
CR76      PG16        1-Sep-02   1-Sep-02
CR56      PG4         1-Sep-99   10-Jun-00
CR56      PG36        10-Oct-00  1-Dec-01
CR56      PG16        1-Nov-02   1-Aug-03
PropertyOwner
propertyNo  pAddress                rent  ownerNo
PG4         6 Lawrence St, Glasgow  350   CO40
PG16        5 Novar Dr, Glasgow     450   CO93
PG36        2 Manor Rd, Glasgow     370   CO93
Owner
ownerNo  oName
CO40     Tina Murphy
CO93     Tony Shaw
Figure 7 3NF ClientRental relation
Boyce-Codd Normal
Form (BCNF)
Boyce-Codd normal form (BCNF)
A relation is in BCNF, if and only if, every determinant
is a candidate key.
The difference between 3NF and BCNF is that for a
functional dependency A -> B, 3NF allows this
dependency in a relation if B is a primary-key
attribute and A is not a candidate key, whereas BCNF
insists that for this dependency to remain in a
relation, A must be a candidate key.
Boyce-Codd Normal Form
(BCNF)(cont)
To test whether a relation is in BCNF, we identify all
the determinants and make sure that they are candidate keys.
A determinant is an attribute, or a group of
attributes, on which some other attribute is fully
functionally dependent.
Violation of BCNF is quite rare, since it can only
happen under specific conditions. The potential to
violate BCNF may occur in a relation that:
Contains two (or more) composite candidate keys;
The candidate keys overlap, that is, have at least
one attribute in common.
Example of BCNF
fd1 clientNo, propertyNo -> rentStart, rentFinish (Primary Key)
fd2 clientNo, rentStart -> propertyNo, rentFinish (Candidate key)
fd3 propertyNo, rentStart -> clientNo, rentFinish (Candidate key)
Client (3NF, already BCNF)
clientNo  cName
CR76      John Kay
CR56      Aline Stewart
Rental (3NF, already BCNF)
clientNo  propertyNo  rentStart  rentFinish
CR76      PG4         1-Jul-00   31-Aug-01
CR76      PG16        1-Sep-02   1-Sep-02
CR56      PG4         1-Sep-99   10-Jun-00
CR56      PG36        10-Oct-00  1-Dec-01
CR56      PG16        1-Nov-02   1-Aug-03
PropertyOwner (3NF, already BCNF)
propertyNo  pAddress                rent  ownerNo
PG4         6 Lawrence St, Glasgow  350   CO40
PG16        5 Novar Dr, Glasgow     450   CO93
PG36        2 Manor Rd, Glasgow     370   CO93
Owner (3NF, already BCNF)
ownerNo  oName
CO40     Tina Murphy
CO93     Tony Shaw
Example of BCNF
fd1 clientNo, interviewDate -> interviewTime, staffNo, roomNo (Primary Key)
fd2 staffNo, interviewDate, interviewTime -> clientNo, roomNo (Candidate key)
fd3 roomNo, interviewDate, interviewTime -> clientNo, staffNo (Candidate key)
fd4 staffNo, interviewDate -> roomNo (not a candidate key)
As a consequence, the ClientInterview relation may suffer from update anomalies. For
example, two tuples have to be updated if the roomNo needs to be changed for staffNo SG5 on
13-May-02.
ClientInterview
ClientNo interviewDate interviewTime staffNo roomNo
CR76 13-May-02 10.30 SG5 G101
CR56 13-May-02 12.00 SG5 G101
CR74 13-May-02 12.00 SG37 G102
CR56 1-Jul-02 10.30 SG5 G102
Figure 8 ClientInterview relation
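The BCNF test for ClientInterview can be sketched in Python: a determinant is a candidate key (or at least a superkey) only if its closure covers every attribute of the relation. The helper name `closure` and the labelled tuples in `FDS` are illustrative choices, with the dependencies transcribed from fd1 to fd4 above.

```python
ATTRS = {"clientNo", "interviewDate", "interviewTime", "staffNo", "roomNo"}

# (label, determinant, dependent attributes) for fd1..fd4.
FDS = [
    ("fd1", {"clientNo", "interviewDate"},
     {"interviewTime", "staffNo", "roomNo"}),
    ("fd2", {"staffNo", "interviewDate", "interviewTime"},
     {"clientNo", "roomNo"}),
    ("fd3", {"roomNo", "interviewDate", "interviewTime"},
     {"clientNo", "staffNo"}),
    ("fd4", {"staffNo", "interviewDate"}, {"roomNo"}),
]

def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for _, lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# A dependency violates BCNF when its determinant is not a superkey.
violations = [name for name, lhs, _ in FDS if closure(lhs, FDS) != ATTRS]
```

Only fd4 is reported, which matches the "not a candidate key" label in the list above.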
Example of BCNF(cont)
To transform the ClientInterview relation to BCNF, we must remove
the violating functional dependency by creating two new relations
called Interview and StaffRoom, as shown below:
Interview (clientNo, interviewDate, interviewTime, staffNo)
StaffRoom (staffNo, interviewDate, roomNo)
Interview
clientNo  interviewDate  interviewTime  staffNo
CR76      13-May-02      10.30          SG5
CR56      13-May-02      12.00          SG5
CR74      13-May-02      12.00          SG37
CR56      1-Jul-02       10.30          SG5
StaffRoom
staffNo  interviewDate  roomNo
SG5      13-May-02      G101
SG37     13-May-02      G102
SG5      1-Jul-02       G102
Figure 9 BCNF Interview and StaffRoom relations
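A short Python sketch of this decomposition, using the ClientInterview rows of Figure 8. Tuples and a dictionary stand in for the relations, and the replacement room G103 is a made-up value used only to show the update.

```python
# Rows of ClientInterview from Figure 8:
# (clientNo, interviewDate, interviewTime, staffNo, roomNo)
client_interview = [
    ("CR76", "13-May-02", "10.30", "SG5", "G101"),
    ("CR56", "13-May-02", "12.00", "SG5", "G101"),
    ("CR74", "13-May-02", "12.00", "SG37", "G102"),
    ("CR56", "1-Jul-02", "10.30", "SG5", "G102"),
]

# Interview keeps everything except roomNo.
interview = [row[:4] for row in client_interview]

# StaffRoom maps (staffNo, interviewDate) to roomNo, following fd4.
staff_room = {(row[3], row[1]): row[4] for row in client_interview}

# Joining on (staffNo, interviewDate) rebuilds the original rows.
rejoined = [i + (staff_room[(i[3], i[1])],) for i in interview]

# Changing the room for SG5 on 13-May-02 now touches a single entry.
staff_room[("SG5", "13-May-02")] = "G103"  # hypothetical new room
rooms_after = {
    staff_room[(i[3], i[1])]
    for i in interview
    if i[3] == "SG5" and i[1] == "13-May-02"
}
```

Both SG5 interviews on 13-May-02 see the new room after one update, so the anomaly described for ClientInterview no longer arises.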
Questions