
01 Data Model DWH

Uploaded by

Alankar Prasad

DATA MODELLING
Introduction to Data Models

I want to build a hybrid automobile.
I want to build a mansion.
I want to build a spaceship.
I want to build an operating system.
I want to build an application.
I want to build a database.
I want to build a warehouse.

What do I need to accomplish my goal? You need a model.

Copyright © Capgemini 2015. All Rights Reserved 2


1.1: Introduction to Data Models

Definition of a Model

• A model is a replica or a representation of particular aspects and segments of the real world.
• Modeling provides effective ways to describe real-world information requirements to the stakeholders in an organization and to verify those requirements with them.
• Modeling is an integral part of the design and development of any system.
• A correct model is essential.


Generally speaking, a model is an abstraction and reflection of the real world.
Modeling gives us the ability to visualize what we cannot yet realize.
It is the same with data modeling.
The primary aim of a data model is to make sure that all data objects required by the business are accurately and fully represented.


Data Modeling

What is a data model?
• A data model is an abstraction of some aspect of the real world (system).

Why do we need a data model?
• It helps to visualize the business.
• A model is a means of communication.
• Models help elicit and document requirements.
• Models reduce the cost of change.
• The model is the essence of the DW architecture, and the data warehouse (DW) is implemented on the basis of it.


1.2: Data Modeling Technique

What is Data Modeling?

• Data modeling is a technique for exploring the data structures needed to support an organization's information needs.
• A data model is a conceptual representation or a replica of the data structure required in the database system.
• A data model focuses on which data is required and how the data should be organized.
• At the conceptual level, the data model is independent of any hardware or software constraints.


Data modeling is important because it specifies the data structure, which can affect all aspects of data usage.
For example, it can have a significant impact on performance. This is particularly true in data warehousing, and the data warehouse is the primary structural element in business intelligence.


What do we want to do with the data?
The model depends on the kind of data analysis we want to do. Different data analysis techniques:
• Query and reporting: display query results.
• Multidimensional analysis: analyze data content by looking at it from different perspectives.
• Data mining: discover patterns and clustering attributes in data.


1.3: Simple Data Model

Example of a Simple Data Model

 The data is divided into two tables: one for policy data and one for
customer data.
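
The two-table split described above can be sketched with Python's built-in `sqlite3` module. This is only a minimal illustration: the column names (`customer_id`, `name`, `policy_id`, `policy_type`) and the sample rows are assumptions, since the slide states only that policy data and customer data live in separate tables.

```python
import sqlite3

# Hypothetical columns: the slide only says that there is one table for
# customer data and one table for policy data.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE policy (
        policy_id   INTEGER PRIMARY KEY,
        policy_type TEXT NOT NULL,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
    );
""")
conn.execute("INSERT INTO customer VALUES (1, 'A. Customer')")
conn.executemany("INSERT INTO policy VALUES (?, ?, ?)",
                 [(10, "motor", 1), (11, "home", 1)])

# Each customer fact is stored once; every policy row points back to it.
rows = conn.execute("""
    SELECT c.name, p.policy_type
    FROM policy AS p JOIN customer AS c USING (customer_id)
    ORDER BY p.policy_id
""").fetchall()
print(rows)
```

Keeping the customer's details in one row and referencing them from each policy is what lets one customer hold several policies without repeating the customer data.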



1.5: Features of a Good Data Model

What Makes a Good Data Model?

• Completeness: Ensure that every piece of information required for a system is recorded and maintained.
• Non-redundancy: One fact should be recorded only once. Repetition may result in inconsistency and increased storage requirements.
• Adherence to business rules: The collected data must be recorded in line with all business rules and must not violate any of them.
• Data reusability: Design the data structure to ensure reusability.
• Stability and flexibility: A model needs to be flexible enough to adapt to new changes without forcing the programmer to rewrite the code.
• Elegance: A data model should neatly present the required data in the fewest possible groups or tables.
• Communication: A model should present the data in a manner understandable to all stakeholders.
• Integration: A good model is compatible with existing and future systems.
• Avoiding conflicting objectives: A good model strikes a balance between groups with different sets of requirements.


1.6: Adding Performance

Performance of a Data Model

• Performance makes a good model better.
• Performance differs from our other criteria because it depends heavily on the software and hardware platforms on which the database will run.
• Performance requirements are usually “added to the mix” at a later stage than the other criteria, and only when necessary.


Where Are Data Models Used?

• Operational Systems: Traditional applications designed to run the day-to-day business of the enterprise.
• External Systems ***: Data used within an enterprise that is obtained from outside sources.
• Staging Areas ***: Created to aid in the collection and transformation of data that is targeted for a data warehouse.
• Operational Data Store ***: W. H. Inmon and Claudia Imhoff definition: “A subject-oriented, integrated, volatile, current valued data store containing only corporate detailed data”.
• Data Warehouse (DW): W. H. Inmon definition: “A subject-oriented, integrated, non-volatile, time-variant collection of data organized to support management needs”.
• Data Mart (DM): TDWI definition: “A data structure that is optimized for access. It is designed to facilitate end-user analysis of data. It typically supports a single analytic application used by a distinct set of workers.”

*** Not discussed here


What Data Modeling is not…

• A waste of time!
• A one-time effort
• The ultimate IT application development cure
• A quick process
• A function solely performed and understood by and for IT professionals


1.7: People involved in Data Modeling

People involved in Data Modeling

• System users, owners, and/or sponsors of the business: To verify that the model meets their requirements.
• Business specialists (subject matter experts or SMEs): To verify the accuracy and stability of the business rules and processes.
• Data modeler: To ensure that the model is designed correctly and that no important requirement is missed.
• Process modelers: To ensure that they will use the model correctly.
• Physical database designer (or DBA): To understand the difference between the logical and physical models, and to design the database to achieve the required performance.
• Systems integration manager and enterprise architect: To understand how the new database will fit into the existing systems, and to think beyond the current project.


1.8: Data Modeling Stages and Deliverables

Data modeling stages and deliverables

A data modeling process goes through various stages and produces the following deliverables:
• Conceptual Model
• Logical Model
• Physical Data Model


Conceptual Data Model

A conceptual data model identifies the highest-level relationships between the different entities.

Features of a conceptual data model include:
• It includes the important entities and the relationships among them.
• No attribute is specified.
• No primary key is specified.


Logical Data Model

A logical data model describes the data in as much detail as possible, without regard to how it will be physically implemented in the database.

Features of a logical data model include:
• It includes all entities and the relationships among them.
• All attributes for each entity are specified.
• The primary key for each entity is specified.
• Foreign keys (keys identifying the relationship between different entities) are specified.
• Normalization occurs at this level.


Logical Data Model (contd.)

The steps for designing the logical data model are as follows:
• Specify primary keys for all entities.
• Find the relationships between different entities.
• Find all attributes for each entity.
• Resolve many-to-many relationships.
• Normalization.


Physical Data Model

• A physical data model represents how the model will be built in the database.
• A physical database model shows all table structures, including column name, column data type, column constraints, primary key, foreign key, and relationships between tables.


Physical Data Model (contd.)

Features of a physical data model include:
• Specification of all tables and columns.
• Foreign keys are used to identify relationships between tables.
• Denormalization may occur based on user requirements.
• Physical considerations may cause the physical data model to be quite different from the logical data model.
• The physical data model will be different for different RDBMSs. For example, the data type for a column may be different between MySQL and SQL Server.


1.9: Classification of Information Level

Levels of Information

Conceptual
• Information content: general ideas
• Human concept of the application domain
• Data system as understood by users

Logical
• Details of the whole information content
• Reference to specific database software
• No details of hardware/software

Physical
• Details at the level of internal data storage
• Intricacies of the specific database
• Details of physical implementation


Data Modeling for Business Intelligence
Lesson 2: Understanding Business Requirements

Lesson Objectives

This lesson provides an overview of various techniques of requirement gathering. We will learn about:
• The need for requirement analysis
• The data life cycle
• Ways of collecting requirements
• Business Requirement Specification (BRS)


2.1: Collecting Requirements

Requirements Collection

• Experts think that requirement gathering should be treated as a separate phase, though some suggest that it should be part of the conceptual design phase.
• The requirement phase is used for the following: collecting the business requirements, and formulating an understanding of the requirements.
• Requirement analysis starts as soon as a business case is prepared or received.


2.2: Understanding Requirements

Understanding Business Requirements

Any system is usually developed in response to a problem, an opportunity, or a requirement. Its statement should be supported by a formal business case. The case is used for the following:
• Understanding the problem statement
• Studying the benefits of the proposed system
• Estimating the cost
• Providing the logical starting point of the project for the modeler

It is important to understand the data life cycle in an application.


2.3: Understanding Data Life Cycle
Study of Data Life Cycle

• Need for data
• Needed data
• Collect needed data
• Store data
• Use data
• Delete obsolete data
• Archive historical data

A data life cycle helps us to state the requirements clearly.


2.4: What is a Good Software Requirement?

Characteristics of a Good Requirement

• Specific
• Correct: A true statement of what the requirement should do
• Complete: Encompasses all requirements of concern to the users
• Unambiguous: Has only one interpretation
• Consistent: Does not conflict with other requirements
• Verifiable: Can be tested to meet the requirements
• Attainable: It should be within the scope of the project
• Understandable: Comprehensible by users, business, and developers
• Detailed/Granular: Granular enough to be implemented in test cases and design
• Explicit: Encompasses all derived requirements
• Traceable: It should be possible to trace a component requirement to its source
• Manageable & Organized: Supports scalability and change management; should be structured


2.5: Collecting Business Requirements

Collation of Business Requirements

Conduct interviews and workshops:
• Avoid using data models in interviews and workshops.
• Prefer UML, use cases, activity diagrams, DFDs, and so on.
• Conduct interviews with senior managers.
• Conduct interviews with subject matter experts (do not let them design).
• Conduct facilitated workshops.

Verify your own understanding of the requirements.


2.5: Interview with Stakeholders and Users

Interviewing Stakeholders and Users

Ask the stakeholders questions at a pre-decided time and venue to gather requirement knowledge:
• Ask open-ended questions.
• Use a structured agenda of fairly open questions.

Interviews are good for documentation and agreement on common or discussed objectives.
Management support is required to obtain time from stakeholders.


2.5: Other Methods of Collecting Requirements

Other Methods of Collecting Requirements

• Direct observation techniques: These allow you to assess users' needs and the problems associated with the use of services. The technique is designed for a specific purpose: to identify a problem, describe a situation, assess user satisfaction, and so on.
• Surveys: These are more suitable when stakeholders are spread globally.
• Data collection and analysis: These are indirect sources of information that provide an approximation of the needs of the user. The source can be public data, marketing data, or any other data.


2.6: Specifying Business Requirements

Business Requirements Specification

The most important task is to define the “statement of requirements”, or Business Requirement Specification.

The issues could be as follows:
• Many requirements are well known, but it is impractical to document them.
• Some requirements are only relevant to specific design alternatives.
• Some requirements may emerge only when the client has seen an actual design.
• High-level business directions and rules cannot be captured directly.


DATA MODELING TECHNIQUES
There are two techniques of data modeling:
• E-R Modeling
• Dimensional Modeling
Entity-Relationship Model
• The Entity-Relationship (ER) Model is based on the notion of real-world entities and the relationships among them.
• When a real-world scenario is formulated into a database model, the ER Model creates entity sets, relationship sets, general attributes, and constraints.
• The ER Model is best used for the conceptual design of a database.
• The ER Model is based on the following: entities and their attributes, and relationships among entities.
ER Model - Example Notation
ENTITY and RELATIONSHIP
• Entity − An entity in an ER Model is a real-world entity having properties called attributes. Every attribute is defined by its set of values, called its domain. For example, in a school database, a student is considered as an entity. A student has various attributes like name, age, class, etc.
• Relationship − The logical association among entities is called a relationship. Relationships are mapped with entities in various ways. Mapping cardinalities define the number of associations between two entities.
TYPES OF DATA MODEL
There are three types of data models, as listed below:
• CONCEPTUAL DATA MODEL
• LOGICAL DATA MODEL
• PHYSICAL DATA MODEL
CONCEPTUAL DATA MODEL
A conceptual data model identifies the highest-level relationships between the different entities.

Features of a conceptual data model include:
• It includes the important entities and the relationships among them.
• No attribute is specified.
• No primary key is specified.

From the figure above, we can see that the only information shown via the conceptual data model is the entities that describe the data and the relationships between those entities. No other information is shown through the conceptual data model.
LOGICAL DATA MODEL
A logical data model describes the data in as much detail as possible, without regard to how it will be physically implemented in the database.

Features of a logical data model include:
• It includes all entities and the relationships among them.
• All attributes for each entity are specified.
• The primary key for each entity is specified.
• Foreign keys (keys identifying the relationship between different entities) are specified.
• Normalization occurs at this level.
Steps for Logical Data Model Design
The steps for designing the logical data model are as follows:
• Specify primary keys for all entities.
• Find the relationships between different entities.
• Find all attributes for each entity.
• Resolve many-to-many relationships.
• Normalization.
LOGICAL DATA MODEL
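
As a hedged illustration of the "resolve many-to-many relationships" step, the sketch below replaces a many-to-many relationship between students and courses with an associative table whose primary key combines the two foreign keys. The table and column names (`student`, `course`, `enrollment`, `roll_no`, `course_id`) and the sample rows are assumptions chosen for the example, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE student (roll_no   INTEGER PRIMARY KEY, name  TEXT NOT NULL);
    CREATE TABLE course  (course_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Associative entity: one row per (student, course) pair.
    CREATE TABLE enrollment (
        roll_no   INTEGER NOT NULL REFERENCES student(roll_no),
        course_id INTEGER NOT NULL REFERENCES course(course_id),
        PRIMARY KEY (roll_no, course_id)
    );
""")
conn.executemany("INSERT INTO student VALUES (?, ?)", [(1, "Asha"), (2, "Ben")])
conn.executemany("INSERT INTO course VALUES (?, ?)",
                 [(100, "Maths"), (200, "Physics")])
conn.executemany("INSERT INTO enrollment VALUES (?, ?)",
                 [(1, 100), (1, 200), (2, 100)])


def courses_of(roll_no):
    """Titles of the courses one student is enrolled in."""
    cur = conn.execute(
        "SELECT c.title FROM enrollment AS e JOIN course AS c USING (course_id) "
        "WHERE e.roll_no = ? ORDER BY c.title", (roll_no,))
    return [title for (title,) in cur]


def students_in(course_id):
    """Names of the students enrolled in one course."""
    cur = conn.execute(
        "SELECT s.name FROM enrollment AS e JOIN student AS s USING (roll_no) "
        "WHERE e.course_id = ? ORDER BY s.name", (course_id,))
    return [name for (name,) in cur]


print(courses_of(1), students_in(100))
```

The many-to-many relationship becomes two one-to-many relationships: each student has many enrollment rows, and each course has many enrollment rows.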
COMPARISON LOGICAL ‘&’ CONCEPTUAL
Comparing the logical data model shown above with the conceptual data model diagram, we see the main differences between the two:
• In a logical data model, primary keys are present, whereas in a conceptual data model, no primary key is present.
• In a logical data model, all attributes are specified within an entity. No attributes are specified in a conceptual data model.
• Relationships between entities are specified using primary keys and foreign keys in a logical data model.
• In a conceptual data model, the relationships are simply stated, not specified, so we simply know that two entities are related, but we do not specify what attributes are used for this relationship.
PHYSICAL DATA MODEL
• A physical data model represents how the model will be built in the database.
• A physical database model shows all table structures, including column name, column data type, column constraints, primary key, foreign key, and relationships between tables.

Features of a physical data model include:
• Specification of all tables and columns.
• Foreign keys are used to identify relationships between tables.
• Denormalization may occur based on user requirements.
• Physical considerations may cause the physical data model to be quite different from the logical data model.
• The physical data model will be different for different RDBMSs. For example, the data type for a column may be different between MySQL and SQL Server.
STEPS FOR DESIGNING PHYSICAL DM
The steps for physical data model design are as follows:
• Convert entities into tables.
• Convert relationships into foreign keys.
• Convert attributes into columns.
• Modify the physical data model based on physical constraints / requirements.
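
The four steps above can be sketched as follows, under stated assumptions: the `department` and `employee` entities, their columns, and the index are hypothetical, and SQLite stands in for whichever RDBMS is actually targeted (column types and constraint syntax would differ elsewhere).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Steps 1 and 3: each entity becomes a table, each attribute a column.
    CREATE TABLE department (
        dept_id INTEGER PRIMARY KEY,
        name    VARCHAR(50) NOT NULL UNIQUE
    );
    CREATE TABLE employee (
        emp_id    INTEGER PRIMARY KEY,
        full_name VARCHAR(100) NOT NULL,
        -- Step 2: the works_at relationship becomes a foreign key.
        dept_id   INTEGER NOT NULL REFERENCES department(dept_id)
    );
    -- Step 4: a physical adjustment, here an index for lookups by department.
    CREATE INDEX idx_employee_dept ON employee(dept_id);
""")

# Inspect what was actually built: (column name, declared type) pairs.
columns = [(row[1], row[2])
           for row in conn.execute("PRAGMA table_info(employee)")]
# (referencing column, referenced table, referenced column) triples.
foreign_keys = [(row[3], row[2], row[4])
                for row in conn.execute("PRAGMA foreign_key_list(employee)")]
print(columns)
print(foreign_keys)
```

The index does not change what the model means; it is the kind of platform-driven change that makes a physical model diverge from its logical model.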
PHYSICAL DATA MODEL
CONCEPTS OF ER - MODELLING
CONCEPTS OF THE ER MODEL
• Entity types
• Relationship types
• Attributes
ENTITY TYPE
• Entity type: A group of objects with the same properties, identified by the enterprise as having an independent existence. For example, in a school database, students, teachers, classes, and courses offered can be considered as entities.
• Entity occurrence: A uniquely identifiable object of an entity type.

Example: For an ENTITY
ER - MODELLING SYMBOLS
E : ENTITY SET
E : WEAK ENTITY SET
R : RELATIONSHIP SET
R : IDENTIFYING RELATIONSHIP FOR A WEAK ENTITY SET
A : ATTRIBUTE
A : MULTI-VALUED ATTRIBUTE
A : DERIVED ATTRIBUTE
A : PRIMARY KEY ATTRIBUTE
1 R 1 : ONE-TO-ONE RELATIONSHIP
M R 1 : MANY-TO-ONE RELATIONSHIP
M R M : MANY-TO-MANY RELATIONSHIP
ATTRIBUTES
• Attribute: A property of an entity or a relationship type. For example, a student entity may have name, class, and age as attributes.
• Attribute domain: The set of allowable values for one or more attributes.

EXAMPLE: For an ATTRIBUTE
ATTRIBUTE TYPES
• Simple attribute: An attribute composed of a single component with an independent existence. For example, a student's phone number is an atomic value of 10 digits.
• Composite attribute: An attribute composed of multiple components, each with an independent existence. For example, a student's complete name may have first_name and last_name.

EXAMPLE: COMPOSITE ATTRIBUTE

• Single-valued attribute: An attribute that holds a single value for each occurrence of an entity type. For example − Social_Security_Number.
• Multi-valued attribute: An attribute that holds multiple values for each occurrence of an entity type. For example, a person can have more than one phone number, email_address, etc.

EXAMPLE: MULTIVALUED ATTRIBUTE

• Derived attribute: An attribute that represents a value that is derivable from the value of a related attribute, or set of attributes, not necessarily in the same entity type. For example, age can be derived from date_of_birth.
EXAMPLE: DERIVED ATTRIBUTE
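
The attribute types described in this section can be sketched in one small Python model. The `Student` and `FullName` classes and their field names are assumptions made for illustration; the point is only that age is computed from `date_of_birth` on demand instead of being stored.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class FullName:
    """Composite attribute: made of first_name and last_name."""
    first_name: str
    last_name: str


@dataclass
class Student:
    roll_number: int          # simple, single-valued attribute
    name: FullName            # composite attribute
    date_of_birth: date       # stored attribute that age is derived from
    phone_numbers: List[str] = field(default_factory=list)  # multi-valued

    def age(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth, never stored."""
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month, self.date_of_birth.day)
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)


student = Student(1, FullName("Asha", "Rao"), date(2000, 6, 15),
                  ["5550100", "5550101"])
print(student.age(date(2024, 6, 14)), student.age(date(2024, 6, 15)))
```

Storing the date of birth and deriving the age avoids a value that would otherwise become stale every year.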
Entity-Set and Keys
A key is an attribute or collection of attributes that uniquely identifies an entity within an entity set. For example, the roll number of a student makes him/her identifiable among students.
• Super Key − A set of attributes within a table that uniquely identifies each record within the table.
• Candidate Key − A minimal super key: an attribute or set of attributes that can act as a primary key for a table to uniquely identify each record in that table. The candidate keys are the set from which the primary key is selected.
• Primary Key − One of the candidate keys, chosen by the database designer to uniquely identify the entity set.
• Composite Key − A key that consists of two or more attributes that together uniquely identify an entity occurrence.
EXAMPLE: PRIMARY KEY
EXAMPLE: COMPOSITE KEY
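
The difference between a super key and a candidate key can be checked mechanically on sample data. The sketch below is an illustration under assumptions: the student rows and attribute names are invented, and uniqueness in a sample can only suggest a key, because a real key is a rule that must hold for every valid row.

```python
from itertools import combinations


def is_super_key(rows, attrs):
    """True if no two rows share the same values for attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, attributes):
    """Minimal super keys, found by trying smaller attribute sets first."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(k) <= set(attrs) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_super_key(rows, attrs):
                keys.append(attrs)
    return keys


STUDENTS = [
    {"roll_no": 1, "email": "a@example.com", "name": "Asha", "class": "X"},
    {"roll_no": 2, "email": "b@example.com", "name": "Ben", "class": "X"},
    {"roll_no": 3, "email": "c@example.com", "name": "Asha", "class": "XI"},
    {"roll_no": 4, "email": "d@example.com", "name": "Asha", "class": "X"},
]
print(candidate_keys(STUDENTS, ["roll_no", "email", "name", "class"]))
```

Here `(roll_no, name)` is a super key but not a candidate key, because `roll_no` alone already identifies each row; the designer would pick the primary key from the candidate keys.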
Relationship
• The association among entities is called a relationship. For example, an employee works_at a department, and a student enrolls in a course. Here, Works_at and Enrolls are called relationships.
• Relationship Set: A set of relationships of similar type is called a relationship set. Like entities, a relationship too can have attributes. These attributes are called descriptive attributes.
Mapping Cardinalities
Cardinality defines the number of entities in one entity set that can be associated with entities of the other set via a relationship set.

• One-to-one − One entity from entity set A can be associated with at most one entity of entity set B, and vice versa.
EXAMPLE: 1 – 1
• One-to-many − One entity from entity set A can be associated with more than one entity of entity set B; however, an entity from entity set B can be associated with at most one entity of entity set A.
EXAMPLE: 1 - N
• Many-to-one − An entity from entity set A can be associated with at most one entity of entity set B; however, an entity from entity set B can be associated with more than one entity from entity set A.
EXAMPLE: N - 1
• Many-to-many − One entity from A can be associated with more than one entity from B, and vice versa.
EXAMPLE: N - N
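
The four mapping cardinalities can be told apart by counting partners on each side of a relationship set. The following is a minimal sketch, assuming the relationship set is given as a list of (A, B) pairs; the function name and the sample pairs are hypothetical.

```python
from collections import defaultdict


def mapping_cardinality(pairs):
    """Classify a relationship set given as (a, b) pairs."""
    a_to_b = defaultdict(set)
    b_to_a = defaultdict(set)
    for a, b in pairs:
        a_to_b[a].add(b)
        b_to_a[b].add(a)
    # Does some A entity have several B partners, and the other way round?
    a_has_many = any(len(partners) > 1 for partners in a_to_b.values())
    b_has_many = any(len(partners) > 1 for partners in b_to_a.values())
    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"
    if b_has_many:
        return "many-to-one"
    return "one-to-one"


print(mapping_cardinality([("dept1", "emp1"), ("dept1", "emp2")]))
print(mapping_cardinality([("s1", "c1"), ("s1", "c2"), ("s2", "c1")]))
```

As with keys, the observed pairs only show what has happened so far; the cardinality recorded in the model is the rule the business allows.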
Normalisation
• Normalisation is a 'fancy' term for a set of rules designed to make sure that a database is organised in the best way possible.
• This allows the data to be processed more efficiently and any query to be processed.
• These rules depend on relationships being established between the entities to create a functional dependency between them.
The normalisation process involves:
• Finding and grouping together all the entities and their attributes.
• Removing repeating groups of data.
• Providing unique keys for each entity in the database system.
The Three Major Stages of Normalisation:
• First Normal Form: 1NF is the first level of normalisation. An entity (table) is in First Normal Form if it contains no repeating attributes (fields) or groups of attributes.
• Second Normal Form: An entity is in 2NF if no attribute (not part of the primary key) is dependent on only part of the primary key. This only applies to entities with concatenated primary keys.
• Third Normal Form: An entity is in 3NF if all attributes are entirely dependent on the primary key and not on any attribute that is not part of the primary key.

In a relational schema, each tuple is divided into fields called Domains.
Functional Dependency
• Functional dependency means that each value of the primary key must map to exactly one value of each attribute that depends on it.
• It defines a relationship in which the existence of one entity/attribute is entirely dependent on the existence of another (one-to-one).
Functional Dependency - Practical Example

SALES (Order Number, Acc. No., Customer, Address, Date, Item, Quantity, Item Price, Total Cost)

• Order Number is the primary key.
• The value for each attribute of SALES, except Item Price, depends upon the value of the primary key.
• All attributes of SALES, except Item Price, are functionally dependent on the primary key.
• Item Price is functionally dependent on the attribute Item.
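
A functional dependency can be tested against sample rows: the dependency X → Y holds in the rows if equal X values never come with different Y values. The sketch below uses the item lines from the Al's Baker Shop order data that appears later in this deck; the function name `fd_holds` and the lowercase column names are assumptions, and a check on sample data can refute a dependency but cannot prove it as a business rule.

```python
def fd_holds(rows, determinant, dependent):
    """Return True if `determinant -> dependent` holds in the given rows."""
    seen = {}
    for row in rows:
        key = tuple(row[attr] for attr in determinant)
        # Remember the first value seen for this key; any later mismatch
        # means the determinant does not fix the dependent attribute.
        if seen.setdefault(key, row[dependent]) != row[dependent]:
            return False
    return True


COLUMNS = ("order_no", "item", "quantity", "item_price")
ROWS = [dict(zip(COLUMNS, values)) for values in [
    (7823, "Bakewell Tart", 20, 0.15),
    (7823, "Danish Pastry", 13, 0.20),
    (7823, "Apple Pie", 45, 0.15),
    (4633, "Butteries", 120, 0.20),
    (2276, "Apple Pie", 130, 0.15),
    (2276, "Cherry Pie", 100, 0.18),
    (2276, "Steak Pie", 30, 0.50),
    (2276, "Meringue Pie", 20, 0.20),
    (1788, "Apple Pie", 15, 0.15),
    (1788, "Danish Pastry", 50, 0.20),
]]

print(fd_holds(ROWS, ["item"], "item_price"))      # Item determines Item Price
print(fd_holds(ROWS, ["order_no"], "item_price"))  # Order No. alone does not
```

This matches the slide: Item Price follows from Item, not from the order number.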
To produce a set of entities in First Normal Form (1NF):
• Remove repeating (multiple) groups within the primary entities (tables) so that each record (row) within the entity is the same length.
• Repeating groups then become new entities, linked together by a one-to-many relationship.
• Relationships are created by including a primary key from one entity as a foreign key in another entity.
Al's Baker Shop

Order No.  Acc. No.  Customer        Address                   Date  Item           Qty.  Item Price  Total Cost
7823       178       Daisy's Café    27 Bay Drive, Cove        16/7  Bakewell Tart  20    0.15        12.35
                                                                     Danish Pastry  13    0.20
                                                                     Apple Pie      45    0.15
4633       526       Smiths          12 Dee View, Aberdeen     19/7  Butteries      120   0.20        24.00
2276       167       Sally's Snacks  3 High Street, Banchory   17/7  Apple Pie      130   0.15        56.50
                                                                     Cherry Pie     100   0.18
                                                                     Steak Pie      30    0.50
                                                                     Meringue Pie   20    0.20
1788       032       Tasty Bite      17 Wood Place, Insch      18/7  Apple Pie      15    0.15        7.50
                                                                     Danish Pastry  50    0.20
Al's Baker Shop

Orders

Order No.  Acc. No.  Customer        Address                   Date  Total Cost
7823       178       Daisy's Café    27 Bay Drive, Cove        16/7  12.35
4633       526       Smiths          12 Dee View, Aberdeen     19/7  24.00
2276       167       Sally's Snacks  3 High Street, Banchory   17/7  56.50
1788       032       Tasty Bite      17 Wood Place, Insch      18/7  7.50

Items Purchased

Order No.  Item           Qty.  Item Price
7823       Bakewell Tart  20    0.15
7823       Danish Pastry  13    0.20
7823       Apple Pie      45    0.15
4633       Butteries      120   0.20
2276       Apple Pie      130   0.15
2276       Cherry Pie     100   0.18
2276       Steak Pie      30    0.50
2276       Meringue Pie   20    0.20
1788       Apple Pie      15    0.15
1788       Danish Pastry  50    0.20
Al's Baker Shop

Orders Table:

Order No.  Acc. No.  Customer        Address                   Date  Total Cost
7823       178       Daisy's Café    27 Bay Drive, Cove        16/7  12.35
4633       526       Smiths          12 Dee View, Aberdeen     19/7  24.00
2276       167       Sally's Snacks  3 High Street, Banchory   17/7  56.50
1788       032       Tasty Bite      17 Wood Place, Insch      18/7  7.50

Order No. can be used to uniquely identify each record and can therefore be made the primary key.

Orders (Order No., Acc. No., Customer, Address, Date, Total Cost)
Al's Baker Shop

Items Purchased Table:

Order No.  Item           Qty.  Item Price
7823       Bakewell Tart  20    0.15
7823       Danish Pastry  13    0.20
7823       Apple Pie      45    0.15
4633       Butteries      120   0.20
2276       Apple Pie      130   0.15
2276       Cherry Pie     100   0.18
2276       Steak Pie      30    0.50
2276       Meringue Pie   20    0.20
1788       Apple Pie      15    0.15
1788       Danish Pastry  50    0.20

No one attribute can be used to uniquely identify a record, so a concatenated key is required. Order No. and Item together can uniquely identify a record.

Items Purchased (*Order No., Item, Quantity, Item Price)
Al's Baker Shop
First Normal Form

Orders (Order No., Acc. No., Customer, Address, Date, Total Cost)
Items Purchased (*Order No., Item, Quantity, Item Price)
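
The move to First Normal Form shown above can be sketched as a small transformation: each order's repeating group of item lines is pulled out into its own rows, and the order number is copied into every item row as the foreign key. The dictionary layout and the function name `to_first_normal_form` are assumptions; the values are two of the orders from the Al's Baker Shop form.

```python
UNNORMALISED = [
    {"order_no": 7823, "acc_no": "178", "customer": "Daisy's Café",
     "address": "27 Bay Drive, Cove", "date": "16/7", "total_cost": 12.35,
     "items": [("Bakewell Tart", 20, 0.15),
               ("Danish Pastry", 13, 0.20),
               ("Apple Pie", 45, 0.15)]},
    {"order_no": 4633, "acc_no": "526", "customer": "Smiths",
     "address": "12 Dee View, Aberdeen", "date": "19/7", "total_cost": 24.00,
     "items": [("Butteries", 120, 0.20)]},
]


def to_first_normal_form(orders):
    """Split each order into one Orders row and one row per item line."""
    order_rows, item_rows = [], []
    for order in orders:
        # The order keeps its single-valued attributes only.
        order_rows.append({k: v for k, v in order.items() if k != "items"})
        # The repeating group becomes rows of a new entity, linked back
        # to the order through the order_no foreign key.
        for item, quantity, item_price in order["items"]:
            item_rows.append({"order_no": order["order_no"], "item": item,
                              "quantity": quantity, "item_price": item_price})
    return order_rows, item_rows


orders, items_purchased = to_first_normal_form(UNNORMALISED)
print(len(orders), len(items_purchased))
```

Every row in each result now has the same set of fields, which is the "same length" condition from the 1NF steps.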
Q&A

1. What is Normalisation?
2. What does the Normalisation process involve?
3. What is First Normal Form?
4. In a relational schema, each tuple is divided into fields called Domains. T/F
The Three Major Stages of Normalisation:
• First Normal Form: 1NF is the first level of normalisation. An entity (table) is in First Normal Form if it contains no repeating attributes (fields) or groups of attributes.
• Second Normal Form: An entity is in 2NF if no attribute (not part of the primary key) is dependent on only part of the primary key. This only applies to entities with concatenated primary keys.
• Third Normal Form: An entity is in 3NF if all attributes are entirely dependent on the primary key and not on any attribute that is not part of the primary key.
To produce a set of entities in Second Normal Form (2NF):
• Test for dependency by testing each particular attribute in turn to check that it can be uniquely identified only by making use of the whole primary key. This test need not be completed unless you have at least one table which requires a concatenated primary key.
• Remove all partially dependent attributes to a new entity.

N.B. A concatenated key occurs when you need two fields together in order to uniquely identify a record.
Al's Baker Shop
First Normal Form

Orders (Order No., Acc. No., Customer, Address, Date, Total Cost)

Items (*Order No., Purchased Item, Quantity, Item Price)

102
Al's Baker Shop

Orders Table:

Order No.  Acc. No.  Customer        Address                  Date  Total Cost
7823       178       Daisy's Café    27 Bay Drive, Cove       16/7  12.35
4633       526       Smiths          12 Dee View, Aberdeen    19/7  24.00
2276       167       Sally's Snacks  3 High Street, Banchory  17/7  56.50
1788       032       Tasty Bite      17 Wood Place, Insch     18/7  7.50

Because this entity has a single attribute as the primary key, there can be no partial dependencies, and therefore the entity is already in 2NF.

103
Al's Baker Shop
(2NF Step 1)

Items Purchased Table (primary key is Order No. and Item):

Order No.  Item           Qty.  Item Price
7823       Bakewell Tart    20  0.15
7823       Danish Pastry    13  0.20
7823       Apple Pie        45  0.15
4633       Butteries       120  0.20
2276       Apple Pie       130  0.15
2276       Cherry Pie      100  0.18
2276       Steak Pie        30  0.50
2276       Meringue Pie     20  0.20
1788       Apple Pie        15  0.15
1788       Danish Pastry    50  0.20

Test for dependency by testing each particular attribute.

Primary Key      Attribute   Fully Functionally Dependent?
Order No., Item  Quantity    YES: Quantity is functionally dependent on Order No. and Item together.
Order No., Item  Item Price  NO: Item Price is functionally dependent on Item alone, so it depends on only part of the key.
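The dependency test above can be sketched in code. This is an illustration only: `determines` and `partial_dependencies` are hypothetical helper names, and sample rows can disprove a dependency but cannot prove that it holds for all future data.

```python
from itertools import combinations

# Rows copied from the Items Purchased table: (order_no, item, qty, item_price).
ITEMS_PURCHASED = [
    (7823, "Bakewell Tart", 20, 0.15),
    (7823, "Danish Pastry", 13, 0.20),
    (7823, "Apple Pie", 45, 0.15),
    (4633, "Butteries", 120, 0.20),
    (2276, "Apple Pie", 130, 0.15),
    (2276, "Cherry Pie", 100, 0.18),
    (2276, "Steak Pie", 30, 0.50),
    (2276, "Meringue Pie", 20, 0.20),
    (1788, "Apple Pie", 15, 0.15),
    (1788, "Danish Pastry", 50, 0.20),
]
COLUMNS = ("order_no", "item", "qty", "item_price")
PRIMARY_KEY = ("order_no", "item")


def determines(rows, columns, lhs, rhs):
    """True if rows that agree on the `lhs` columns always agree on `rhs`."""
    lhs_pos = [columns.index(name) for name in lhs]
    rhs_pos = columns.index(rhs)
    seen = {}
    for row in rows:
        lhs_value = tuple(row[p] for p in lhs_pos)
        if seen.setdefault(lhs_value, row[rhs_pos]) != row[rhs_pos]:
            return False
    return True


def partial_dependencies(rows, columns, key):
    """Map each non-key attribute to the proper key subsets that determine it."""
    found = {}
    for attribute in (c for c in columns if c not in key):
        for size in range(1, len(key)):          # proper subsets of the key only
            for subset in combinations(key, size):
                if determines(rows, columns, subset, attribute):
                    found.setdefault(attribute, []).append(subset)
    return found


print(partial_dependencies(ITEMS_PURCHASED, COLUMNS, PRIMARY_KEY))
```

On this data only Item Price is reported, determined by Item alone, which matches the NO row in the test table.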

106
Al's Baker Shop
(2NF Step 2)

Remove any partially dependent attributes to a new entity

Order No.  Item           Qty.  Item Price
7823       Bakewell Tart    20  0.15
7823       Danish Pastry    13  0.20
7823       Apple Pie        45  0.15
4633       Butteries       120  0.20
2276       Apple Pie       130  0.15
2276       Cherry Pie      100  0.18
2276       Steak Pie        30  0.50
2276       Meringue Pie     20  0.20
1788       Apple Pie        15  0.15
1788       Danish Pastry    50  0.20

Item Price is the partially dependent attribute to be removed.

107
Al's Baker Shop
(2NF Step 2)

Remove any partially dependent attributes to a new entity

Part Order:

Order No.  Item           Qty.
7823       Bakewell Tart    20
7823       Danish Pastry    13
7823       Apple Pie        45
4633       Butteries       120
2276       Apple Pie       130
2276       Cherry Pie      100
2276       Steak Pie        30
2276       Meringue Pie     20
1788       Apple Pie        15
1788       Danish Pastry    50

Price List:

Item Price
0.15
0.15
0.20
0.18
0.20
0.20
0.50

108
Al's Baker Shop
(2NF Step 2)

Create a relationship between the tables and assign Primary Keys

Part Order (Primary Key: Order No. and *Item):

Order No.  Item           Qty.
7823       Bakewell Tart    20
7823       Danish Pastry    13
7823       Apple Pie        45
4633       Butteries       120
2276       Apple Pie       130
2276       Cherry Pie      100
2276       Steak Pie        30
2276       Meringue Pie     20
1788       Apple Pie        15
1788       Danish Pastry    50

Price List (Primary Key: Item):

Item           Item Price
Apple Pie      0.15
Bakewell Tart  0.15
Butteries      0.20
Cherry Pie     0.18
Danish Pastry  0.20
Meringue Pie   0.20
Steak Pie      0.50
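A short sketch of this 2NF step: move Item Price into its own Price List keyed on Item, then confirm that joining the two tables on Item gives back the original rows. The function name `split_items` and the dictionary representation of the Price List are illustrative choices, not part of the slides.

```python
# Rows copied from the Items Purchased table: (order_no, item, qty, item_price).
ITEMS_PURCHASED = [
    (7823, "Bakewell Tart", 20, 0.15),
    (7823, "Danish Pastry", 13, 0.20),
    (7823, "Apple Pie", 45, 0.15),
    (4633, "Butteries", 120, 0.20),
    (2276, "Apple Pie", 130, 0.15),
    (2276, "Cherry Pie", 100, 0.18),
    (2276, "Steak Pie", 30, 0.50),
    (2276, "Meringue Pie", 20, 0.20),
    (1788, "Apple Pie", 15, 0.15),
    (1788, "Danish Pastry", 50, 0.20),
]


def split_items(rows):
    """Return (part_order, price_list) for rows of (order_no, item, qty, item_price)."""
    part_order = []   # (order_no, item, qty), primary key is order_no plus item
    price_list = {}   # item -> item_price, primary key is item
    for order_no, item, qty, item_price in rows:
        # The split is only valid if an item always carries the same price.
        if price_list.setdefault(item, item_price) != item_price:
            raise ValueError(f"conflicting prices for {item!r}")
        part_order.append((order_no, item, qty))
    return part_order, price_list


part_order, price_list = split_items(ITEMS_PURCHASED)

# Join Part Order back to Price List on Item, the foreign key.
rejoined = [(order_no, item, qty, price_list[item]) for order_no, item, qty in part_order]

print(len(part_order), "part order rows,", len(price_list), "price list rows")
print("join reproduces the original table:", rejoined == ITEMS_PURCHASED)
```

Each price is now stored once per item instead of once per order line, so a price change touches a single row.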

109
Al's Baker Shop
Second Normal Form

Orders (Order No., Acc. No., Customer, Address, Date, Total Cost)

Part Order (*Order No., *Item, Quantity)

Price List (Item, Item Price)

110
The Three Major Stages of
Normalisation:

• First Normal Form
  - 1NF is the first level of normalisation. An entity (table) is in First Normal Form if it contains no repeating attributes (fields) or groups of attributes.
• Second Normal Form
  - An entity is in 2NF if it is in 1NF and no attribute that is not part of the primary key is dependent on only part of the primary key. This only applies to entities with concatenated primary keys.
• Third Normal Form
  - An entity is in 3NF if it is in 2NF and every attribute that is not part of the primary key is dependent on the primary key alone, not on any other attribute that is not part of the primary key.

111
To produce a set of entities in Third Normal Form
(3NF):

• Test each attribute in turn to check for dependency on the primary key.

• Remove all transitive dependencies to a new entity.
  - A transitive dependency is where an attribute is dependent on another attribute (or attributes) that is (are) NOT part of the primary key.

112
Al's Baker Shop
Second Normal Form

Orders (Order No., Acc. No., Customer, Address, Date, Total Cost)

Part Order (*Order No., *Item, Quantity)

Price List (Item, Item Price)

113
Al's Baker Shop - 3NF Step1
Test for dependency:

Order No.  Acc. No.  Customer        Address                  Date  Total Cost
7823       178       Daisy's Café    27 Bay Drive, Cove       16/7  12.35
4633       526       Smiths          12 Dee View, Aberdeen    19/7  24.00
2276       167       Sally's Snacks  3 High Street, Banchory  17/7  56.50
1788       032       Tasty Bite      17 Wood Place, Insch     18/7  7.50

Primary Key  Attribute   Transitive Dependency
Order No.    Acc. No.    YES: Acc. No. can be found if we know either Customer or Address
Order No.    Customer    YES: Customer can be found if we know either Acc. No. or Address
Order No.    Address     YES: Address can be found if we know either Customer or Acc. No.
Order No.    Date        NO: Dependent on Order No.
Order No.    Total Cost  NO: Dependent on Order No.
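The transitive test can also be sketched in code, with one caveat: in a four-row sample every column value is unique, so the data alone cannot tell a real dependency from a coincidence. The sketch therefore takes the dependencies as declared rules, read off the reasoning in the table above. `DECLARED_FDS` and `transitive_attributes` are hypothetical names for this illustration.

```python
PRIMARY_KEY = ("order_no",)

# Declared dependencies: determinant -> attributes it determines.
# These encode the slide's reasoning; they are not derived from the data.
DECLARED_FDS = {
    ("order_no",): {"acc_no", "customer", "address", "date", "total_cost"},
    ("acc_no",): {"customer", "address"},
    ("customer",): {"acc_no", "address"},
    ("address",): {"acc_no", "customer"},
}


def transitive_attributes(key, fds):
    """Map each attribute to the non-key determinants it depends on.

    Simplification: any determinant other than the primary key itself is
    treated as a non-key determinant.
    """
    result = {}
    for determinant, dependents in fds.items():
        if determinant == tuple(key):
            continue  # dependency on the primary key is what we want
        for attribute in dependents:
            result.setdefault(attribute, set()).add(determinant)
    return result


found = transitive_attributes(PRIMARY_KEY, DECLARED_FDS)
for attribute in sorted(found):
    print(attribute, "depends on", sorted(found[attribute]))
```

Acc. No., Customer and Address are reported; Date and Total Cost are not, which agrees with the YES and NO rows above.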

119
Al's Baker Shop - 3NF Step2
Remove transitive dependencies to a new entity

Orders:

Order No.  Acc. No.  Customer        Address                  Date  Total Cost
7823       178       Daisy's Café    27 Bay Drive, Cove       16/7  12.35
4633       526       Smiths          12 Dee View, Aberdeen    19/7  24.00
2276       167       Sally's Snacks  3 High Street, Banchory  17/7  56.50
1788       032       Tasty Bite      17 Wood Place, Insch     18/7  7.50

Customers:

Acc. No.  Customer        Address
178       Daisy's Café    27 Bay Drive, Cove
526       Smiths          12 Dee View, Aberdeen
167       Sally's Snacks  3 High Street, Banchory
032       Tasty Bite      17 Wood Place, Insch

120
Al's Baker Shop - 3NF Step2

Remove transitive dependencies to a new entity

Orders:

Order No.  Date  Total Cost
7823       16/7  12.35
4633       19/7  24.00
2276       17/7  56.50
1788       18/7  7.50

Customers:

Acc. No.  Customer        Address
178       Daisy's Café    27 Bay Drive, Cove
526       Smiths          12 Dee View, Aberdeen
167       Sally's Snacks  3 High Street, Banchory
032       Tasty Bite      17 Wood Place, Insch

121
Al's Baker Shop - 3NF Step2

Create a relationship between the tables and assign Primary Keys

Orders (Primary Key: Order No.):

Order No.  Acc. No.  Date  Total Cost
7823       178       16/7  12.35
4633       526       19/7  24.00
2276       167       17/7  56.50
1788       032       18/7  7.50

Customers (Primary Key: Acc. No.):

Acc. No.  Customer        Address
178       Daisy's Café    27 Bay Drive, Cove
526       Smiths          12 Dee View, Aberdeen
167       Sally's Snacks  3 High Street, Banchory
032       Tasty Bite      17 Wood Place, Insch
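A minimal sketch of this 3NF step, assuming the 2NF Orders rows are held as tuples: customer details move to a Customers table keyed on Acc. No., and Orders keeps Acc. No. as the foreign key. The names `ORDERS_2NF` and `split_orders` are illustrative, and the dates are the ones given in the Orders Table slide.

```python
# 2NF Orders rows: (order_no, acc_no, customer, address, date, total_cost).
# Acc. No. is kept as text so the leading zero of "032" is not lost.
ORDERS_2NF = [
    (7823, "178", "Daisy's Café", "27 Bay Drive, Cove", "16/7", 12.35),
    (4633, "526", "Smiths", "12 Dee View, Aberdeen", "19/7", 24.00),
    (2276, "167", "Sally's Snacks", "3 High Street, Banchory", "17/7", 56.50),
    (1788, "032", "Tasty Bite", "17 Wood Place, Insch", "18/7", 7.50),
]


def split_orders(rows):
    """Return (orders, customers) with the transitive dependencies removed."""
    orders = []      # (order_no, acc_no, date, total_cost), acc_no is the foreign key
    customers = {}   # acc_no -> (customer, address)
    for order_no, acc_no, customer, address, date, total_cost in rows:
        # An account must always map to the same customer and address.
        if customers.setdefault(acc_no, (customer, address)) != (customer, address):
            raise ValueError(f"conflicting details for account {acc_no!r}")
        orders.append((order_no, acc_no, date, total_cost))
    return orders, customers


orders, customers = split_orders(ORDERS_2NF)

# Join Orders back to Customers on Acc. No. to check nothing was lost.
rejoined = [
    (order_no, acc_no, *customers[acc_no], date, total_cost)
    for order_no, acc_no, date, total_cost in orders
]
print("join reproduces the original table:", rejoined == ORDERS_2NF)
```

If the same customer places a second order, their name and address are still stored only once.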

122
Al's Baker Shop
Third Normal Form

Customers (Acc. No., Customer, Address)

Orders (Order No., *Acc. No., Date, Total Cost)

Part Order (*Order No., *Item, Quantity)

Price List (Item, Item Price)

Normalisation Complete
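One possible way to express the final four entities as tables is sketched below using Python's built-in sqlite3 module. The column names, column types, and the choice of SQLite are assumptions made for illustration; the slides only define the entities, their primary keys, and the foreign keys marked with *.

```python
import sqlite3

SCHEMA = """
CREATE TABLE customers (
    acc_no   TEXT PRIMARY KEY,
    customer TEXT NOT NULL,
    address  TEXT NOT NULL
);
CREATE TABLE orders (
    order_no   INTEGER PRIMARY KEY,
    acc_no     TEXT NOT NULL REFERENCES customers(acc_no),
    order_date TEXT NOT NULL,
    total_cost REAL NOT NULL
);
CREATE TABLE price_list (
    item       TEXT PRIMARY KEY,
    item_price REAL NOT NULL
);
CREATE TABLE part_order (
    order_no INTEGER NOT NULL REFERENCES orders(order_no),
    item     TEXT NOT NULL REFERENCES price_list(item),
    quantity INTEGER NOT NULL,
    PRIMARY KEY (order_no, item)   -- concatenated key
);
"""


def build_database():
    """Create an in-memory database holding the 3NF tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
    conn.executescript(SCHEMA)
    return conn


conn = build_database()
conn.execute("INSERT INTO customers VALUES (?, ?, ?)",
             ("178", "Daisy's Café", "27 Bay Drive, Cove"))
conn.execute("INSERT INTO orders VALUES (?, ?, ?, ?)", (7823, "178", "16/7", 12.35))
conn.executemany("INSERT INTO price_list VALUES (?, ?)",
                 [("Bakewell Tart", 0.15), ("Danish Pastry", 0.20), ("Apple Pie", 0.15)])
conn.executemany("INSERT INTO part_order VALUES (?, ?, ?)",
                 [(7823, "Bakewell Tart", 20), (7823, "Danish Pastry", 13),
                  (7823, "Apple Pie", 45)])

# Rebuild the order lines for order 7823 by joining Part Order to Price List.
lines = conn.execute(
    "SELECT p.item, p.quantity, l.item_price "
    "FROM part_order AS p JOIN price_list AS l ON l.item = p.item "
    "WHERE p.order_no = ? ORDER BY p.item",
    (7823,),
).fetchall()
for line in lines:
    print(line)
```

With foreign keys switched on, a Part Order row that names an item missing from the Price List is rejected, which is the relationship the * markers describe.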
123
Normalisation -

1. Remove repeating groups to create a new entity.

2. Create a relationship using one of the attributes that are left [usually the primary key].

3. 'Check out' entities with concatenated keys. If any attribute is not fully dependent on both parts of the primary key, remove it to create a new entity.

4. Create a relationship using one of the attributes that are left [usually the primary key].

5. 'Check out' every entity. If any attribute is dependent on any attribute other than the primary key, remove it into a new entity.

6. Create a relationship using one of the attributes that are left [usually the primary key].

124
Q&A

1. What is 2nd Normal Form?

2. In 2nd Normal Form, which dependency is tested?

3. What is 3rd Normal Form?

4. What is a Transitive Dependency?


125
