DATA
MODELLING
Introduction to Data Models
I want to build a Hybrid Automobile
I want to build a mansion
I want to build a Spaceship
I want to build an Operating System
I want to build an Application
I want to build a Database
I want to build a Warehouse
What do I need to Accomplish my Goal?
You Need a Model
Copyright © Capgemini 2015. All Rights Reserved 2
1.1: Introduction to Data Models
Definition of a Model
Model is a replica or a representation of particular aspects and
segments of the real world.
Modeling provides effective ways to describe/verify the real-
world information requirements to/from the stakeholders in an
organization.
Modeling is an integral part of the design and development of
any system.
A correct model is essential.
DATA MODELLING - Cont..,
Generally speaking, a model is an abstraction and reflection of the
real world.
Modeling gives us the ability to visualize what we cannot yet realize.
It is the same with data modeling.
The primary aim of a data model is to make sure that all data objects
required by the business are accurately and fully represented.
Data Modeling
WHAT IS A DATA MODEL?
A data model is an abstraction of some aspect of the real world
(system).
NEED OF DATA MODEL?
Helps to visualize the business
A model is a means of communication.
Models help elicit and document requirements.
Models reduce the cost of change.
Model is the essence of DW architecture based on which DW
will be implemented
1.2: Data Modeling Technique
What is Data Modeling?
Data modeling is a technique for
exploring the data structures
needed to support an
organization’s information need.
It would be a conceptual
representation or a replica of the
data structure required in the
database system.
A data model focuses on which
data is required and how the data
should be organized.
At the conceptual level, the data
model is independent of any
hardware or software constraints.
DATA MODELLING – Cont..,
Data modeling is important because it specifies the data
structure, which can impact all aspects of data usage.
For example, it can have a significant impact on
performance. This is particularly true with data warehousing.
And, the data warehouse is the primary structural element in
business intelligence.
What do we want to do with the data?
Model depends on what kind of data analysis we want to do:
• Different Data Analysis Techniques
Query and reporting
• Display Query Results
Multidimensional analysis
• Analyze data content by looking at it in different
perspectives
Data mining
• Discover patterns and clustering attributes in data
1.3: Simple Data Model
Example of a Simple Data Model
The data is divided into two tables: one for policy data and one for
customer data.
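As a rough illustration of the two-table split described above, the sketch below keeps customer data and policy data in separate structures and links them through a shared identifier. All names here (Customer, Policy, customer_id, policy_type, and the sample rows) are assumptions made for illustration and are not taken from the slide's figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    customer_id: int   # hypothetical primary key of the customer table
    name: str


@dataclass(frozen=True)
class Policy:
    policy_id: int     # hypothetical primary key of the policy table
    customer_id: int   # foreign key pointing at Customer.customer_id
    policy_type: str


# Illustrative sample rows only.
customers = {1: Customer(1, "A. Example")}
policies = [Policy(100, 1, "Motor"), Policy(101, 1, "Home")]


def policies_for(customer_id: int) -> list:
    """Return every policy row that refers to the given customer."""
    return [p for p in policies if p.customer_id == customer_id]
```

Because each customer is stored once and policies only carry the customer identifier, a change to the customer's details needs to be made in one place only.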
1.5: Features of a Good Data Model
What Makes a Good Data Model?
Completeness
Ensure that every piece of information required for a System is recorded and
maintained.
Non-Redundant
One fact should be recorded only once. Repetition may result in inconsistency
and increased storage requirements.
Adherence to Business Rules
The collected data is to be recorded by considering all business rules. It should
not violate any rule.
Data Reusability
Design a data structure to ensure re-usability.
Stability and Flexibility
A model needs to be flexible enough to adapt to new changes without forcing the
programmer to re-write the code.
Elegance
A data model should neatly present the required data in the least possible number
of groups or tables.
Communication
A model should present the data in a manner understandable to all stakeholders.
Integration
A good model is compatible with the existing and future systems.
Avoid Conflicting Objectives
A good model can strike a good balance between groups with different sets of
requirements.
1.6: Adding Performance
Performance of a Data Model
Performance makes a good model better…
Performance differs from our other criteria because it depends
heavily on the software and hardware platforms on which the
database will run.
Performance requirements are usually “added to the mix” at a stage
later than the other criteria, only when necessary.
Where Data Models are used ?
Operational Systems
Traditional Applications designed to run the day-to-day business of the Enterprise
External Systems ***
Data used within an Enterprise that is obtained from outside sources
Staging Areas ***
Created to aid in the collection and transformation of data that is targeted for a
Data Warehouse
Operational Data Store ***
W. H. Inmon and Claudia Imhoff definition: “A subject-oriented, integrated, volatile,
current valued data store containing only corporate detailed data”.
Data Warehouse (DW)
W. H. Inmon definition: “A subject-oriented, integrated, non-volatile, time-variant
collection of data organized to support management needs”.
Data Mart (DM)
TDWI definition: “A data structure that is optimized for access. It is designed to
facilitate end-user analysis of data. It typically supports a single analytic
application used by a distinct set of workers.”
*** - Not discussed here
What Data Modeling is not…
A waste of time!
A one time effort
The ultimate IT application development cure
A quick process
A function solely performed and understood by and for IT
professionals
1.7: People involved in Data Modeling
People involved in Data Modeling
System users, owners, and/or sponsors of business
To verify that the model meets their requirements.
Business specialists (subject matter experts or SMEs)
To verify the accuracy and stability of the business rules and processes.
Data modeler
To design the model correctly without missing any
important requirement.
Process modelers
To ensure that they will use the model correctly.
Physical database designer (or DBA)
To understand the difference between logical and physical model
To design database to achieve the required performance
Systems integration manager and enterprise architect
To understand how the new database will fit into existing system.
To think beyond current project.
1.8: Data Modeling Stages and Deliverables
Data modeling stages and deliverables
A data modeling process
goes through various
stages and produces the
following deliverables:
Conceptual Model
Logical Model
Physical Data Model
Conceptual Data Model
A conceptual data model identifies the highest-level relationships
between the different entities
Features of conceptual data model include:
Includes the important entities and the relationships among them.
No attribute is specified.
No primary key is specified
Logical Data Model
A logical data model describes the data in as much detail as
possible, without regard to how they will be physically implemented in
the database.
Features of a logical data model include:
Includes all entities and relationships among them.
All attributes for each entity are specified.
The primary key for each entity is specified.
Foreign keys (keys identifying the relationship between different entities) are
specified.
Normalization occurs at this level.
The steps for designing the logical data model are as follows:
Specify primary keys for all entities.
Find the relationships between different entities.
Find all attributes for each entity.
Resolve many-to-many relationships.
Normalization.
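The "resolve many-to-many relationships" step can be sketched as follows, assuming a student/course enrolment example. The names `enrolments_m2m`, `resolve_many_to_many`, and the sample identifiers are hypothetical. The idea is that a many-to-many relationship is replaced by a junction entity holding one row per pairing, so that each side relates to the junction in a one-to-many way.

```python
# A many-to-many relationship: a student takes many courses,
# and a course is taken by many students (illustrative data).
enrolments_m2m = {
    "S1": ["Maths", "Physics"],
    "S2": ["Maths"],
}


def resolve_many_to_many(m2m):
    """Flatten the relationship into junction rows (student_id, course).

    Each row pairs exactly one student with exactly one course, so the
    pair itself can serve as a composite key of the junction entity.
    """
    return sorted(
        (student, course)
        for student, courses in m2m.items()
        for course in courses
    )


enrolment_rows = resolve_many_to_many(enrolments_m2m)
```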
Physical Data Model
Physical data model represents
how the model will be built in the
database.
A physical database model
shows all table structures,
including column name, column
data type, column constraints,
primary key, foreign key, and
relationships between tables.
Features of a physical data model include:
Specification of all tables and columns.
Foreign keys are used to identify relationships between tables.
Denormalization may occur based on user requirements.
Physical considerations may cause the physical data model to be quite different
from the logical data model.
Physical data model will be different for different RDBMS. For example, data type
for a column may be different between MySQL and SQL Server
1.9: Classification of Information Level
Levels of Information
Conceptual
• Information Content - General Ideas
• Human Concept of Application Domain
• Data System as Understood by Users
Logical
• Details of whole information Content
• Reference to specific Database Software
• No Details of Hardware/Software
Physical
• Details at Level of Internal data storage
• Intricacies of Specific Database
• Details of Physical implementation
Data Modeling for
Business
Intelligence
Lesson 2: Understanding
Business Requirements
Lesson Objectives
This lesson provides an overview of various
requirement-gathering techniques.
We will learn about:
The need for requirement analysis
The data life cycle
Ways of collecting requirements
Business Requirement Specification (BRS)
2.1: Collecting Requirements
Requirements Collection
Experts think that the requirement gathering should be treated as a
separate phase.
Though, some suggest that it should be a part of the conceptual
design phase.
The requirement phase is used for the following:
Collecting the business requirement
Formulating the understanding of requirement
Requirement analysis starts as soon as a business case is prepared
or received.
2.2: Understanding Requirements
Understanding Business Requirements
Any system is usually developed in response to a problem, an
opportunity, or a requirement.
Its statement should be supported by a formal business case. The
case is used for the following:
Understanding the problem statement
Estimating the cost
Studying the benefits of the proposed system
Providing the logical starting point of the project for the modeler
It is important to understand the data life cycle in an application.
2.3: Understanding Data Life Cycle
Study of Data Life Cycle
Need for data
Needed data
Collect needed data
Store data
Use data
Delete obsolete data
Archive historical data
A data life cycle helps us to
state the requirements
clearly.
2.4: What is a Good Software Requirement?
Characteristics of a Good Requirement
Specific
Correct: A true statement of what the requirement should do
Complete: Encompasses all requirements of concern to the Users
Unambiguous: Has only one interpretation
Consistent: Does not conflict with other Requirements
Verifiable: Can be tested to meet the Requirements
Attainable: It should be within the scope of the project
Understandable: Comprehensible to Users, Business and Developers
Detailed/Granular: Granular enough to be implemented in test cases and design
Explicit: Encompasses all derived requirements
Traceable: It should be possible to trace a component requirement to its source
Manageable & Organized: Scalability and change management, should be structured
2.5: Collecting Business Requirements
Collation of Business Requirements
Conduct interviews and workshops:
Avoid using data model in interviews and workshops.
Prefer UML, Use Cases, Activity Diagrams, DFD, and so on.
Conduct interviews with senior managers.
Conduct interviews with Subject Matter Experts (Do not let them Design.)
Conduct facilitated workshops.
Verify your own understanding about requirements.
2.5: Interview with Stakeholders and Users
Interviewing Stakeholders and Users
Ask questions to the stakeholders at a pre-decided time and venue
to gather requirement knowledge:
Ask open-ended questions
Use structured agenda of fairly open questions
Interviews are good for documentation and agreement on common
or discussed objectives.
Management support is required to obtain time from stakeholders.
2.5: Other Methods of Collecting Requirements
Other Methods of Collecting Requirements
Direct Observation Techniques:
This allows you to assess users’ needs and problems associated with the use of
services.
This technique is designed for a specific purpose; to identify a problem, describe a
situation, assess user satisfaction, and so on.
Surveys
This is more suitable when stakeholders are spread globally.
Data Collection and Analysis
These are indirect sources of information to provide an approximation of the
needs of the user.
Source can be public data, marketing data or any other data.
2.6: Specifying Business Requirements
Business Requirements Specification
The most important task is to define “statement of requirements”
or Business Requirement Specification.
The issues could be as follows:
Many requirements are well known, but it is impractical to document them.
Some requirements are only relevant to specific design alternatives.
Some requirements may emerge only when the client has seen an actual design.
High-level business directions and rules cannot be captured directly.
DATA MODELING TECHNIQUES
There are two techniques of data
modeling:
E-R Modeling
Dimensional Modeling
Entity-Relationship Model
Entity-Relationship (ER) Model is based on the
notion of real-world entities and relationships
among them.
While formulating real-world scenario into the
database model, the ER Model creates entity set,
relationship set, general attributes and
constraints.
ER Model is best used for the conceptual design
of a database. ER Model is based on the following:
Entities and their attributes.
Relationships among entities.
ER Model - Example Notation
ENTITY and RELATIONSHIP
Entity − An entity in an ER Model is a real-world
entity having properties called attributes.
Every attribute is defined by its set of values
called domain.
For example, in a school database, a student is
considered as an entity. Student has various
attributes like name, age, class, etc.
Relationship − The logical association among
entities is called relationship.
Relationships are mapped with entities in
various ways. Mapping cardinalities define the
number of associations between two entities.
TYPES OF DATA MODEL
There are three types of data models, as
listed below:
CONCEPTUAL DATA MODEL
LOGICAL DATA MODEL
PHYSICAL DATA MODEL
CONCEPTUAL DATA MODEL
A conceptual data model identifies the highest-
level relationships between the different entities.
Features of conceptual data model include:
Includes the important entities and the relationships among
them.
No attribute is specified.
No primary key is specified.
CONCEPTUAL DATA MODEL
From the figure above, we can see that the only information shown
via the conceptual data model is the entities that describe the data
and the relationships between those entities. No other information is
shown through the conceptual data model.
LOGICAL DATA MODEL
A logical data model describes the data in as
much detail as possible, without regard to how
they will be physically implemented in the
database.
Features of a logical data model include:
Includes all entities and relationships among them.
All attributes for each entity are specified.
The primary key for each entity is specified.
Foreign keys (keys identifying the relationship between different
entities) are specified.
Normalization occurs at this level.
Steps for Logical Data Model Design
The steps for designing the logical data
model are as follows:
Specify primary keys for all entities.
Find the relationships between different entities.
Find all attributes for each entity.
Resolve many-to-many relationships.
Normalization.
LOGICAL DATA MODEL
COMPARISON LOGICAL ‘&’ CONCEPTUAL
Comparing the logical data model shown above
with the conceptual data model diagram, we see
the main differences between the two:
In a logical data model, primary keys are present,
whereas in a conceptual data model, no primary
key is present.
In a logical data model, all attributes are
specified within an entity. No attributes are
specified in a conceptual data model.
COMPARISON LOGICAL ‘&’ CONCEPTUAL
Relationships between entities are specified
using primary keys and foreign keys in a logical
data model.
In a conceptual data model, the relationships
are simply stated, not specified, so we simply
know that two entities are related, but we do
not specify what attributes are used for this
relationship.
PHYSICAL DATA MODEL
Physical data model represents how the model
will be built in the database.
A physical database model shows all table
structures, including column name, column data
type, column constraints, primary key, foreign
key, and relationships between tables.
Features of a physical data model include:
Specification of all tables and columns.
Foreign keys are used to identify relationships between
tables.
PHYSICAL DATA MODEL
Denormalization may occur based on user
requirements.
Physical considerations may cause the physical
data model to be quite different from the logical
data model.
Physical data model will be different for different
RDBMS. For example, data type for a column may
be different between MySQL and SQL Server.
STEPS FOR DESIGNING PHYSICAL DM
The steps for physical data model design
are as follows:
Convert entities into tables.
Convert relationships into foreign keys.
Convert attributes into columns.
Modify the physical data model based on physical constraints /
requirements.
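The conversion steps above can be sketched with Python's built-in `sqlite3` module, assuming the "employee works at a department" relationship. The table and column names (`department`, `employee`, `dept_id`, and so on) are hypothetical, and the data types are SQLite's; another RDBMS would likely need different types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
-- Entity -> table, attribute -> column, identifier -> primary key
CREATE TABLE department (
    dept_id   INTEGER PRIMARY KEY,
    dept_name TEXT NOT NULL
);
-- Relationship "employee works at department" -> foreign key column
CREATE TABLE employee (
    emp_id   INTEGER PRIMARY KEY,
    emp_name TEXT NOT NULL,
    dept_id  INTEGER NOT NULL REFERENCES department(dept_id)
);
""")

conn.execute("INSERT INTO department VALUES (1, 'Sales')")
conn.execute("INSERT INTO employee VALUES (10, 'A. Example', 1)")


def insert_orphan_employee() -> bool:
    """Try to add an employee whose department does not exist."""
    try:
        conn.execute("INSERT INTO employee VALUES (11, 'B. Example', 99)")
    except sqlite3.IntegrityError:
        return False  # rejected by the foreign key constraint
    return True
```

The last step in the list, modifying the model for physical constraints, is not shown; it would typically involve choices such as indexes or denormalized columns that depend on the target platform.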
PHYSICAL DATA MODEL
CONCEPTS OF
ER - MODELLING
CONCEPTS OF THE ER MODEL
Entity types
Relationship types
Attributes
ENTITY TYPE
Entity type
Group of objects with same properties, identified by
enterprise as having an independent existence.
For example, in a school database, students, teachers,
classes, and courses offered can be considered as entities.
Entity occurrence
Uniquely identifiable object of an entity type.
Example: For an ENTITY
ER - MODELLING SYMBOLS
E ENTITY SET
E WEAK ENTITY SET
R RELATIONSHIP SET
R IDENTIFYING RELATIONSHIP FOR A
WEAK ENTITY SET
ER - MODELLING SYMBOLS
A ATTRIBUTE
A MULTI-VALUED ATTRIBUTE
A DERIVED ATTRIBUTE
A PRIMARY KEY ATTRIBUTE
ER - MODELLING SYMBOLS
R (1 : 1) ONE-TO-ONE RELATIONSHIP
R (M : 1) MANY-TO-ONE RELATIONSHIP
R (M : M) MANY-TO-MANY RELATIONSHIP
ATTRIBUTES
Attribute
Property of an entity or a relationship type.
For example, a student entity may have name, class, and
age as attributes.
Attribute Domain
Set of allowable values for one or more attributes.
EXAMPLE: For an ATTRIBUTE
ATTRIBUTES TYPES
Simple Attribute
Attribute composed of a single component with an
independent existence.
For example, a student's phone number is an atomic value of
10 digits.
Composite Attribute
Attribute composed of multiple components, each with an
independent existence.
For example, a student's complete name may have
first_name and last_name.
EXAMPLE: COMPOSITE ATTRIBUTE
ATTRIBUTES TYPES
Single-valued Attribute
Attribute that holds a single value for each occurrence of an
entity type.
For example − Social_Security_Number.
Multi-valued Attribute
Attribute that holds multiple values for each occurrence of
an entity type.
For example, a person can have more than one phone
number, email_address, etc.
EXAMPLE: MULTIVALUED ATTRIBUTE
ATTRIBUTES TYPES
Derived Attribute
Attribute that represents a value that is derivable from
value of a related attribute, or set of attributes, not
necessarily in the same entity type.
For example, age can be derived from date_of_birth.
EXAMPLE: DERIVED ATTRIBUTE
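A minimal sketch of a derived attribute, assuming a student entity that stores only the date of birth. The class name `Student` and the method `age_on` are hypothetical. The age is computed on request instead of being stored, so it cannot become inconsistent with the stored date.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Student:
    name: str
    date_of_birth: date  # stored attribute

    def age_on(self, today: date) -> int:
        """Derived attribute: age in whole years on the given day."""
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)
```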
Entity-Set and Keys
Key is an attribute or collection of attributes
that uniquely identifies an entity among entity
set.
For example, the roll number of a student
makes him/her identifiable among students.
Super Key − is defined as a set of attributes within
a table that uniquely identifies each record within a
table.
Candidate Key − A candidate key is a minimal set of
attributes that uniquely identifies each record in a
table. The primary key is selected from among the
candidate keys.
Entity-Set and Keys
Primary Key − A primary key is one of the
candidate keys chosen by the database
designer to uniquely identify the entity set.
Composite Key − A key that consists of two or
more attributes that together uniquely identify an
entity occurrence is called a composite key.
EXAMPLE: PRIMARY KEY
EXAMPLE: COMPOSITE KEY
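The key definitions above can be explored on sample data with the sketch below. The function names `is_super_key` and `candidate_keys` and the student rows are assumptions for illustration. Note the limitation: sample data can only show that a set of attributes is not a key. Whether it really is a key depends on the business rules, not on the rows that happen to exist.

```python
from itertools import combinations


def is_super_key(rows, attrs):
    """True if no two rows share the same values for attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows, attributes):
    """Return the minimal attribute sets that are unique in the sample."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            # Skip supersets of a key already found: they are not minimal.
            if any(set(k) <= set(combo) for k in keys):
                continue
            if is_super_key(rows, combo):
                keys.append(combo)
    return keys


# Illustrative sample rows only.
students = [
    {"roll_no": 1, "name": "Asha", "class": "10A"},
    {"roll_no": 2, "name": "Ravi", "class": "10A"},
    {"roll_no": 3, "name": "Asha", "class": "10B"},
]
```

On this sample, `roll_no` alone is unique, and so is the pair of `name` and `class`; a designer would pick one of these candidates (most likely `roll_no`) as the primary key.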
Relationship
The association among entities is called a
relationship. For example, an
employee works_at a department, a
student enrolls in a course. Here, Works_at
and Enrolls are called relationships.
Relationship Set:
A set of relationships of similar type is called a
relationship set. Like entities, a relationship too can have
attributes. These attributes are called descriptive
attributes.
Mapping Cardinalities
Cardinality defines the number of entities in one
entity set, which can be associated with the number of
entities of other set via relationship set.
One-to-one − One entity from entity set A can be
associated with at most one entity of entity set B and
vice versa.
EXAMPLE: 1 – 1
Mapping Cardinalities
One-to-many − One entity from entity set A can
be associated with more than one entity of
entity set B; however, an entity from entity set B
can be associated with at most one entity of entity set A.
EXAMPLE: 1 - N
Mapping Cardinalities
Many-to-one − More than one entity from entity
set A can be associated with at most one entity of
entity set B; however, an entity from entity set B can
be associated with more than one entity from entity
set A.
EXAMPLE: N - 1
Mapping Cardinalities
Many-to-many − One entity from A can be
associated with more than one entity from B and
vice versa.
EXAMPLE: N - N
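The four mapping cardinalities can be told apart by counting partners on each side of a relationship set. The sketch below assumes the relationship set is given as (A, B) pairs; the function name `mapping_cardinality` and the sample pairs are hypothetical. Like any check on sample data, it describes the pairs observed, not the rule the business intends.

```python
from collections import defaultdict


def mapping_cardinality(pairs):
    """Classify a relationship set given as (a, b) pairs."""
    a_to_b = defaultdict(set)
    b_to_a = defaultdict(set)
    for a, b in pairs:
        a_to_b[a].add(b)
        b_to_a[b].add(a)
    # Does some entity of A relate to several entities of B, and vice versa?
    a_has_many = any(len(bs) > 1 for bs in a_to_b.values())
    b_has_many = any(len(as_) > 1 for as_ in b_to_a.values())
    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"
    if b_has_many:
        return "many-to-one"
    return "one-to-one"
```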
Normalisation
Normalisation is a 'fancy' term for a
set of rules, designed to make sure that
a database is organised in the best way
possible
This allows the data to be processed more
efficiently and any query to be processed.
These rules depend on relationships being
established between the entities to create
a functional dependency between them.
The normalisation process involves:
Finding and grouping together all the
entities and their attributes.
Removing repeating groups of data.
Providing unique keys for each entity
in the database system.
The Three Major Stages of Normalisation:
First Normal Form
1NF is the first level of normalisation. An entity (table) is in First Normal
form if it contains no repeating attributes (fields) or groups of attributes.
Second Normal Form
An entity is in 2NF if no attribute (not part of the primary key) is
dependent on only part of the primary key. This only applies to entities
with concatenated primary keys.
Third Normal Form
An entity is in 3NF if all attributes are entirely dependent on the primary
key and not on any attribute that is not part of the primary key.
In a relational schema, each tuple is divided into fields called
Domains.
Functional Dependency
Functional dependency means that each value of
the primary key determines exactly one value of
each attribute that depends on it.
It defines a relationship in which the
value of one attribute is
entirely determined by the value of
another.
Functional Dependency - Practical Example
SALES (Order Number, Acc. No., Customer, Address, Date, Item, Quantity, Item Price, Total Cost)
Order Number is the primary key.
The value for each attribute of SALES, except Item Price, depends upon the value
of the primary key.
All attributes of SALES, except Item Price, are Functionally Dependent on the
primary key.
Item Price is Functionally Dependent on the attribute Item.
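A functional dependency can be tested against sample rows: it holds when equal determinant values never come with different dependent values. The sketch below is one way to express that check. The function name `holds`, the dictionary keys, and the small set of sample rows are assumptions for illustration.

```python
def holds(rows, determinant, dependent):
    """True if the determinant attributes fix the dependent value in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = row[dependent]
        # setdefault stores the first value seen for this key and
        # returns the stored one on later visits.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Illustrative sample of order lines.
sales = [
    {"order_no": 7823, "item": "Bakewell Tart", "quantity": 20, "item_price": 0.15},
    {"order_no": 7823, "item": "Danish Pastry", "quantity": 13, "item_price": 0.20},
    {"order_no": 7823, "item": "Apple Pie", "quantity": 45, "item_price": 0.15},
    {"order_no": 2276, "item": "Apple Pie", "quantity": 130, "item_price": 0.15},
    {"order_no": 1788, "item": "Apple Pie", "quantity": 15, "item_price": 0.15},
]
```

On these rows, `holds(sales, ["item"], "item_price")` is true while `holds(sales, ["order_no"], "item_price")` is false, which matches the statement that Item Price depends on Item.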
To produce a set of entities in First Normal Form
(1NF):
Remove repeating (multiple) groups within the primary
entities (tables) so that each record (row) within the
entity is the same length.
Repeating groups then become new entities, linked
together by a one-to-many relationship.
Relationships are created by including a primary key
from one entity as a foreign key in another entity
Al's Baker Shop
Order No.  Acc. No.  Customer        Address                  Date  Item           Qty.  Item Price  Total Cost
7823       178       Daisy's Café    27 Bay Drive, Cove       16/7  Bakewell Tart  20    0.15        12.35
                                                                    Danish Pastry  13    0.20
                                                                    Apple Pie      45    0.15
4633       526       Smiths          12 Dee View, Aberdeen    19/7  Butteries      120   0.20        24.00
2276       167       Sally's Snacks  3 High Street, Banchory  17/7  Apple Pie      130   0.15        56.50
                                                                    Cherry Pie     100   0.18
                                                                    Steak Pie      30    0.50
                                                                    Meringue Pie   20    0.20
1788       032       Tasty Bite      17 Wood Place, Insch     18/7  Apple Pie      15    0.15        7.50
                                                                    Danish Pastry  50    0.20
Al's Baker Shop
Orders
Order No.  Acc. No.  Customer        Address                  Date  Total Cost
7823       178       Daisy's Café    27 Bay Drive, Cove       16/7  12.35
4633       526       Smiths          12 Dee View, Aberdeen    19/7  24.00
2276       167       Sally's Snacks  3 High Street, Banchory  17/7  56.50
1788       032       Tasty Bite      17 Wood Place, Insch     18/7  7.50
Items Purchased
Order No.  Item           Qty.  Item Price
7823       Bakewell Tart  20    0.15
7823       Danish Pastry  13    0.20
7823       Apple Pie      45    0.15
4633       Butteries      120   0.20
2276       Apple Pie      130   0.15
2276       Cherry Pie     100   0.18
2276       Steak Pie      30    0.50
2276       Meringue Pie   20    0.20
1788       Apple Pie      15    0.15
1788       Danish Pastry  50    0.20
Al's Baker Shop
Orders Table:
Order No.  Acc. No.  Customer        Address                  Date  Total Cost
7823       178       Daisy's Café    27 Bay Drive, Cove       16/7  12.35
4633       526       Smiths          12 Dee View, Aberdeen    19/7  24.00
2276       167       Sally's Snacks  3 High Street, Banchory  17/7  56.50
1788       032       Tasty Bite      17 Wood Place, Insch     18/7  7.50
Order No. can be used to uniquely identify each record and can
therefore be made the primary key.
Orders (Order No., Acc. No., Customer, Address, Date, Total Cost)
Al's Baker Shop
Items Purchased Table:
Order No.  Item           Qty.  Item Price
7823       Bakewell Tart  20    0.15
7823       Danish Pastry  13    0.20
7823       Apple Pie      45    0.15
4633       Butteries      120   0.20
2276       Apple Pie      130   0.15
2276       Cherry Pie     100   0.18
2276       Steak Pie      30    0.50
2276       Meringue Pie   20    0.20
1788       Apple Pie      15    0.15
1788       Danish Pastry  50    0.20
No one attribute can be used to uniquely identify a record.
A concatenated key is required.
Order No. and Item together can uniquely identify a record.
Items Purchased (*Order No., Item, Quantity, Item Price)
Al's Baker Shop
First Normal Form
Orders (Order No., Acc. No., Customer, Address, Date, Total Cost)
Items Purchased (*Order No., Item, Quantity, Item Price)
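The move to First Normal Form can be sketched as splitting each order's repeating group of items into its own rows, with the order number carried along as a foreign key. The dictionary layout, the key names, and the function `to_first_normal_form` are assumptions for illustration; the two sample orders use values from the Al's Baker Shop table.

```python
# Unnormalised input: each order carries a repeating group of items.
unnormalised_orders = [
    {"order_no": 7823, "acc_no": "178", "customer": "Daisy's Café",
     "address": "27 Bay Drive, Cove", "date": "16/7", "total_cost": 12.35,
     "items": [("Bakewell Tart", 20, 0.15),
               ("Danish Pastry", 13, 0.20),
               ("Apple Pie", 45, 0.15)]},
    {"order_no": 4633, "acc_no": "526", "customer": "Smiths",
     "address": "12 Dee View, Aberdeen", "date": "19/7", "total_cost": 24.00,
     "items": [("Butteries", 120, 0.20)]},
]


def to_first_normal_form(orders):
    """Split orders into Orders rows and Items Purchased rows."""
    order_rows, item_rows = [], []
    for order in orders:
        # Orders keeps every attribute except the repeating group.
        order_rows.append({k: v for k, v in order.items() if k != "items"})
        # Each repeated item becomes one row, linked back by order_no.
        for item, quantity, item_price in order["items"]:
            item_rows.append({"order_no": order["order_no"], "item": item,
                              "quantity": quantity, "item_price": item_price})
    return order_rows, item_rows


orders_1nf, items_1nf = to_first_normal_form(unnormalised_orders)
```

After the split every row in each table has the same set of fields, and the pair (order_no, item) identifies each Items Purchased row.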
Q&A
1. What is Normalisation?
2. What does the Normalisation process
involve?
3. What is First Normal Form?
4. In a relational schema, each
tuple is divided into fields called
Domains. T/F
The Three Major Stages of Normalisation:
First Normal Form
1NF is the first level of normalisation. An entity (table) is in First
Normal form if it contains no repeating attributes (fields) or
groups of attributes.
Second Normal Form
An entity is in 2NF if no attribute (not part of the primary key) is
dependent on only part of the primary key. This only applies to
entities with concatenated primary keys.
Third Normal Form
An entity is in 3NF if all attributes are entirely dependent on the
primary key and not on any attribute that is not part of the
primary key.
To produce a set of entities in
Second Normal Form (2NF):
Test for dependency by testing each attribute
in turn to check that it can be uniquely identified only by
making use of the whole primary key. This test need not be
completed unless you have at least one table which
requires a concatenated Primary Key.
Move all partially dependent attributes to a new
entity.
N.B. – A concatenated key occurs when you need two or more fields
together in order to uniquely identify a record.
Al's Baker Shop
Orders Table:
Order No.  Acc. No.  Customer        Address                  Date  Total Cost
7823       178       Daisy's Café    27 Bay Drive, Cove       16/7  12.35
4633       526       Smiths          12 Dee View, Aberdeen    19/7  24.00
2276       167       Sally's Snacks  3 High Street, Banchory  17/7  56.50
1788       032       Tasty Bite      17 Wood Place, Insch     18/7  7.50
Because this entity has a single attribute as the primary key
there can be no partial dependencies and therefore the entity
is already in 2NF.
Al's Baker Shop
(2NF Step 1)
Order Item Qty. Item
No. Price
7823 Bakewell Tart 20 0.15 Items Purchased Table:
7823 Danish Pastry 13 0.20
7823 Apple Pie 45 0.15 Primary key is Order No. and Item
4633 Butteries 120 0.20
2276 Apple Pie 130 0.15
2276 Cherry Pie 100 0.18
2276 Steak Pie 30 0.50 Test for dependency by testing each
2276 Meringue Pie 20 0.20
1788 Apple Pie 15 0.15
particular attribute.
1788 Danish Pastry 50 0.20
Primary Key Attribute Functionally Dependent?
Order No Quantity YES
Item Quantity is functionally
dependent
on Order No. and Item.
Order No Item Price NO
Item Item price is functionally dependent
Item, but not on Order No. and Item
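The dependency test above can be sketched in code. This is a minimal illustration, not part of the original slides: the names `determines` and `partial_dependencies` and the snake_case column names are assumptions made for the example. Note that sample rows can only disprove a dependency; a real decision still relies on knowing the business rules.

```python
# Rows of the Items Purchased table: (order_no, item, qty, item_price).
ITEMS_PURCHASED = [
    (7823, "Bakewell Tart", 20, 0.15),
    (7823, "Danish Pastry", 13, 0.20),
    (7823, "Apple Pie", 45, 0.15),
    (4633, "Butteries", 120, 0.20),
    (2276, "Apple Pie", 130, 0.15),
    (2276, "Cherry Pie", 100, 0.18),
    (2276, "Steak Pie", 30, 0.50),
    (2276, "Meringue Pie", 20, 0.20),
    (1788, "Apple Pie", 15, 0.15),
    (1788, "Danish Pastry", 50, 0.20),
]
COLUMNS = ("order_no", "item", "qty", "item_price")


def determines(rows, columns, determinant, attribute):
    """Return True if equal determinant values always go with equal attribute values."""
    det_idx = [columns.index(c) for c in determinant]
    attr_idx = columns.index(attribute)
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in det_idx)
        # setdefault stores the first value seen for this key and returns it afterwards.
        if seen.setdefault(key, row[attr_idx]) != row[attr_idx]:
            return False
    return True


def partial_dependencies(rows, columns, primary_key):
    """List (key_part, attribute) pairs where one key column alone determines a non-key attribute."""
    non_key = [c for c in columns if c not in primary_key]
    return [
        (part, attr)
        for attr in non_key
        for part in primary_key
        if determines(rows, columns, (part,), attr)
    ]


# Item Price depends on Item alone, so it is reported as a partial dependency.
print(partial_dependencies(ITEMS_PURCHASED, COLUMNS, ("order_no", "item")))
```

Quantity is not reported because neither Order No. nor Item alone fixes its value in these rows; only the two together do.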
Al's Baker Shop
(2NF Step 2)
Remove any partially dependent attributes to a new entity.
Create a relationship between the tables
and assign Primary Keys.
Part Order
Order No.  Item           Qty.
7823       Bakewell Tart    20
7823       Danish Pastry    13
7823       Apple Pie        45
4633       Butteries       120
2276       Apple Pie       130
2276       Cherry Pie      100
2276       Steak Pie        30
2276       Meringue Pie     20
1788       Apple Pie        15
1788       Danish Pastry    50
Primary Key: Order No. and *Item
Price List
Item           Item Price
Apple Pie      0.15
Bakewell Tart  0.15
Butteries      0.20
Cherry Pie     0.18
Danish Pastry  0.20
Meringue Pie   0.20
Steak Pie      0.50
Primary Key: Item
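The split into Part Order and Price List can be sketched as follows. This is an illustrative example only; the variable names are assumptions, and the final check simply confirms that joining the two new tables on Item gives back the original rows, so no information is lost by the split.

```python
# Rows of the Items Purchased table: (order_no, item, qty, item_price).
ITEMS_PURCHASED = [
    (7823, "Bakewell Tart", 20, 0.15),
    (7823, "Danish Pastry", 13, 0.20),
    (7823, "Apple Pie", 45, 0.15),
    (4633, "Butteries", 120, 0.20),
    (2276, "Apple Pie", 130, 0.15),
    (2276, "Cherry Pie", 100, 0.18),
    (2276, "Steak Pie", 30, 0.50),
    (2276, "Meringue Pie", 20, 0.20),
    (1788, "Apple Pie", 15, 0.15),
    (1788, "Danish Pastry", 50, 0.20),
]

# Part Order keeps the whole key plus the attribute that depends on all of it.
part_order = [(order_no, item, qty) for order_no, item, qty, _ in ITEMS_PURCHASED]

# Price List holds each item once, keyed by Item.
price_list = {}
for _, item, _, price in ITEMS_PURCHASED:
    if price_list.setdefault(item, price) != price:
        # Two prices for one item would mean Item does not determine Item Price.
        raise ValueError(f"conflicting prices for {item}")

# Joining the two tables on Item must reproduce the original rows.
rejoined = [(order_no, item, qty, price_list[item]) for order_no, item, qty in part_order]
print(rejoined == ITEMS_PURCHASED)
print(sorted(price_list.items()))
```

Each price is now stored once per item, so a price change is a single update in Price List instead of one update per order line.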
Al's Baker Shop
Second Normal Form
Orders (Order No., Acc. No., Customer, Address, Date, Total Cost)
Price List (Item, Item Price)
Part Order (*Order No., *Item, Quantity)
The Three Major Stages of
Normalisation:
First Normal Form
1NF is the first level of normalisation. An entity (table) is in
First Normal Form if it contains no repeating attributes (fields)
or groups of attributes.
Second Normal Form
An entity is in 2NF if it is in 1NF and no non-key attribute
is dependent on only part of the primary key. This only
applies to entities with concatenated primary keys.
Third Normal Form
An entity is in 3NF if all attributes are entirely dependent on
the primary key and not on any attribute that is not part of the
primary key.
To produce a set of entities in Third Normal Form
(3NF):
Test each non-key attribute in turn to check whether it
depends directly on the primary key.
Remove all transitive dependencies to a new
entity.
A transitive dependency is where an attribute is dependent on
another attribute (or attributes) that is (are) NOT part of the
primary key.
Al's Baker Shop - 3NF Step 1
Test for dependency
Order No.  Acc. No.  Customer        Address                  Date  Total Cost
7823       178       Daisy's Café    27 Bay Drive, Cove       16/7  12.35
4633       526       Smiths          12 Dee View, Aberdeen    19/7  24.00
2276       167       Sally's Snacks  3 High Street, Banchory  17/7  56.50
1788       032       Tasty Bite      17 Wood Place, Insch     18/7   7.50
Primary Key  Attribute   Transitive Dependency?
Order No.    Acc. No.    YES: Acc. No. can be found if we know either
                         Customer or Address
Order No.    Customer    YES: Customer can be found if we know either
                         Acc. No. or Address
Order No.    Address     YES: Address can be found if we know either
                         Customer or Acc. No.
Order No.    Date        NO: dependent on Order No.
Order No.    Total Cost  NO: dependent on Order No.
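The test above can be sketched in code. With only four sample rows every column happens to be unique, so the data alone cannot show which dependencies are real; this sketch therefore starts from dependencies that are declared by hand. The `DEPENDS` mapping restates the judgements in the table above, and the function name `transitive_attributes` is an assumption made for the example.

```python
PRIMARY_KEY = "order_no"

# Declared dependencies: determinant -> attributes it determines.
# The non-key entries restate the YES rows of the test table (an assumption
# about the business, not something derived from the sample data).
DEPENDS = {
    "order_no": {"acc_no", "customer", "address", "date", "total_cost"},
    "acc_no": {"customer", "address"},
    "customer": {"acc_no", "address"},
    "address": {"acc_no", "customer"},
}

ATTRIBUTES = ("acc_no", "customer", "address", "date", "total_cost")


def transitive_attributes(depends, primary_key):
    """Return every attribute that is determined by something other than the primary key."""
    flagged = set()
    for determinant, dependents in depends.items():
        if determinant != primary_key:
            flagged |= dependents
    return flagged


flagged = transitive_attributes(DEPENDS, PRIMARY_KEY)
for attribute in ATTRIBUTES:
    print(attribute, "YES" if attribute in flagged else "NO")
```

Acc. No., Customer and Address are flagged, while Date and Total Cost depend only on Order No., which matches the table.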
Al's Baker Shop - 3NF Step 2
Remove transitive dependencies to a new entity.
Create a relationship between the tables
and assign Primary Keys.
Orders
Order No.  Acc. No.  Date  Total Cost
7823       178       16/7  12.35
4633       526       19/7  24.00
2276       167       17/7  56.50
1788       032       18/7   7.50
Primary Key: Order No.
Customers
Acc. No.  Customer        Address
178       Daisy's Café    27 Bay Drive, Cove
526       Smiths          12 Dee View, Aberdeen
167       Sally's Snacks  3 High Street, Banchory
032       Tasty Bite      17 Wood Place, Insch
Primary Key: Acc. No.
Al's Baker Shop
Third Normal Form
Customers (Acc. No., Customer, Address)
Orders (Order No., *Acc. No., Date, Total Cost)
Price List (Item, Item Price)
Part Order (*Order No., *Item, Quantity)
Normalisation Complete
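One possible way to realise the four 3NF entities as tables is sketched below using Python's built-in sqlite3 module. The table names, column names and column types are assumptions chosen for illustration (Acc. No. is stored as text so that 032 keeps its leading zero), and "Eccles Cake" is a made-up item used only to show a foreign key being enforced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE customers (
    acc_no   TEXT PRIMARY KEY,
    customer TEXT NOT NULL,
    address  TEXT NOT NULL
);
CREATE TABLE orders (
    order_no   INTEGER PRIMARY KEY,
    acc_no     TEXT NOT NULL REFERENCES customers(acc_no),
    date       TEXT NOT NULL,
    total_cost REAL NOT NULL
);
CREATE TABLE price_list (
    item       TEXT PRIMARY KEY,
    item_price REAL NOT NULL
);
CREATE TABLE part_order (
    order_no INTEGER NOT NULL REFERENCES orders(order_no),
    item     TEXT NOT NULL REFERENCES price_list(item),
    quantity INTEGER NOT NULL,
    PRIMARY KEY (order_no, item)
);
""")

# Load order 2276 from the worked example.
conn.execute("INSERT INTO customers VALUES (?, ?, ?)",
             ("167", "Sally's Snacks", "3 High Street, Banchory"))
conn.execute("INSERT INTO orders VALUES (?, ?, ?, ?)", (2276, "167", "17/7", 56.50))
conn.executemany("INSERT INTO price_list VALUES (?, ?)",
                 [("Apple Pie", 0.15), ("Cherry Pie", 0.18),
                  ("Steak Pie", 0.50), ("Meringue Pie", 0.20)])
conn.executemany("INSERT INTO part_order VALUES (?, ?, ?)",
                 [(2276, "Apple Pie", 130), (2276, "Cherry Pie", 100),
                  (2276, "Steak Pie", 30), (2276, "Meringue Pie", 20)])

# Rebuild the original order lines by following the relationships.
rows = conn.execute("""
    SELECT c.customer, p.item, p.quantity, pl.item_price
    FROM part_order AS p
    JOIN orders AS o ON o.order_no = p.order_no
    JOIN customers AS c ON c.acc_no = o.acc_no
    JOIN price_list AS pl ON pl.item = p.item
    WHERE p.order_no = ?
    ORDER BY p.item
""", (2276,)).fetchall()
for row in rows:
    print(row)

# An order line for an item missing from the price list is rejected.
try:
    conn.execute("INSERT INTO part_order VALUES (?, ?, ?)", (2276, "Eccles Cake", 5))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print("rejected:", rejected)
```

The attributes marked with * in the schema become the REFERENCES columns, and the concatenated key of Part Order becomes a two-column PRIMARY KEY.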
Normalisation - Summary
1. Remove repeating groups to create a new entity.
2. Create a relationship using one of the attributes that
are left [usually the primary key].
3. Check entities with concatenated keys. If any
attribute is not fully dependent on all parts of the
primary key, remove it to create a new entity.
4. Create a relationship using one of the attributes that
are left [usually the primary key].
5. Check every entity. If any attribute is dependent
on any attribute other than the primary key, remove it
into a new entity.
6. Create a relationship using one of the attributes that
are left [usually the primary key].
Q&A
1. What is 2nd Normal Form?
2. In 2nd Normal Form, which dependency is tested?
3. What is 3rd Normal Form?
4. What is a Transitive Dependency?