ICT450 Database Design and Development
Topic 2: Data Models
Nor Azlina Binti Aziz Fadzillah
FSKM
UiTM Kampus Seremban, NS
DATA MODELING & DATA MODELS
Database design
focuses on how the database structure
will be used to store and manage end-user data
Data modeling
is the first step in designing a database
DATA MODELING AND DATA MODELS
Data modeling
• Iterative and progressive process of creating a specific data model for a determined problem domain
• Problem domain is a clearly defined area within the real-world environment with well-defined scope and boundaries.
Data model
• Simple representations of complex real-world data structures
• Useful for supporting a specific problem domain
Model
• Abstraction of a real-world object or event
• Model helps us to understand the complexities of the real-world environment
THE IMPORTANCE OF DATA MODELS
Act as a communication tool that facilitates interaction
among the designer, the applications programmer, and
the end user.
Give an overall view of the database.
Organize data for various users.
An abstraction for the creation of a good database
(database blueprint).
DATA MODEL BASIC BUILDING BLOCKS
Business Rules
A business rule is a brief, precise, and unambiguous
description of a policy, procedure, or principle within
an organization
Business rules are derived from a detailed description of an
organization’s operations
Enable the designer to define the basic building blocks
Describe the main and distinguishing characteristics of the
data
Business Rules
Must be rendered in writing/available in written
form
Must be kept up to date
Are sometimes external to the organization
Must be easy to understand and widely
distributed
Describe characteristics of the data as viewed by
the company
Sources of Business Rules
Company managers
Policy makers
Department managers
Written documentation (procedures, standards, operation
manuals)
Direct interviews with end users
Group Project: Fact-Finding Techniques
Commonly Used Fact-Finding Techniques
• Examining documents (document review)
• Questionnaire
• Interviewing
• Observation of the organization in operation
• Research
Reasons for Identifying and Documenting
Business Rules
Help standardize company’s view of data
Communications tool between users and designers
Allow designer to:
• Understand the nature, role, scope of data, and business processes
• Develop appropriate relationship participation rules and constraints
• Create an accurate data model
Business Rules
Example 1
A painter must paint many paintings.
A painting must be painted by one and only one painter.
Example 2
An employee may learn many skills.
A skill may be learnt by many employees.
Example 3
An employee may manage one store.
A store must be managed by one employee.
Translating Business Rules into Data Model
Components
Nouns translate into entities
Verbs translate into relationships among entities
Relationships are bidirectional
Questions to identify the relationship type
• How many instances of B are related to one instance of A?
• How many instances of A are related to one instance of B?
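The two questions above can be sketched as a small helper. This is a minimal illustration only: the function name `relationship_type` and the "one"/"many" answer strings are assumptions made for the example, not part of the course material.

```python
def relationship_type(b_per_a: str, a_per_b: str) -> str:
    """Classify a relationship from the answers to the two questions.

    b_per_a: how many instances of B relate to one instance of A ("one" or "many")
    a_per_b: how many instances of A relate to one instance of B ("one" or "many")
    """
    allowed = {"one", "many"}
    if b_per_a not in allowed or a_per_b not in allowed:
        raise ValueError("answers must be 'one' or 'many'")
    if b_per_a == "one" and a_per_b == "one":
        return "1:1"
    if b_per_a == "many" and a_per_b == "many":
        return "M:N"
    # Exactly one side answered "many"; the "one" side may be either A or B.
    return "1:M"


# A painter paints many paintings; a painting is painted by one painter.
print(relationship_type("many", "one"))
# An employee learns many skills; a skill is learnt by many employees.
print(relationship_type("many", "many"))
# An employee manages one store; a store is managed by one employee.
print(relationship_type("one", "one"))
```

Both directions of the relationship are needed before the type can be named, which is why relationships are described as bidirectional.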
Activity: Translating Business Rules into Data
Model Components
Translate the following business rules into data model
components (entities, relationship verb, relationship
type):
Example 1
A painter must paint many paintings.
A painting must be painted by one and only one painter.
Example 2
An employee may learn many skills.
A skill may be learnt by many employees.
Example 3
An employee may manage one store.
A store must be managed by one employee.
Naming Conventions
Entity names - Required to:
• Be descriptive of the objects in the business environment
• Use terminology that is familiar to the users
• Be written in CAPITAL LETTERS (UPPERCASE)
Attribute names - Required to:
• Be descriptive of the data represented by the attribute
Advantages of proper naming:
• Facilitates communication between parties
• Promotes self-documentation
The Evolution of Data Models
Semantic data: data is organized in such a way that it can be interpreted meaningfully without human intervention
Hierarchical and Network Models
Hierarchical Models
• Manage large amounts of data for complex manufacturing projects
• Represented by an upside-down tree which contains segments
• Segments: Equivalent of a file system’s record type
• Depicts a set of one-to-many (1:M) relationships
Network Models
• Represent complex data relationships
• Improve database performance and impose a database standard
• Depicts both one-to-many (1:M) and many-to-many (M:N) relationships
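The structural difference between the two models can be sketched with plain Python dictionaries. This is an illustration under assumptions: the segment and record names are invented, and real hierarchical or network DBMSs use their own storage structures rather than dictionaries.

```python
# Hierarchical model: every segment has exactly one parent (an upside-down tree),
# so a child -> parent mapping is enough to describe the structure.
hierarchy_parent = {
    "ASSEMBLY_A": "FINAL_ASSEMBLY",
    "ASSEMBLY_B": "FINAL_ASSEMBLY",
    "PART_1": "ASSEMBLY_A",
    "PART_2": "ASSEMBLY_A",
}


def children_of(parent):
    """Return the child segments of one parent (the 'many' side of a 1:M)."""
    return sorted(child for child, p in hierarchy_parent.items() if p == parent)


# Network model: a record may have more than one parent,
# so each record maps to a set of parents instead of a single one.
network_parents = {
    "ORDER_LINE_1": {"ORDER_10", "PRODUCT_X"},
    "ORDER_LINE_2": {"ORDER_10", "PRODUCT_Y"},
}

print(children_of("ASSEMBLY_A"))
print(sorted(network_parents["ORDER_LINE_1"]))
```

Allowing several parents per record is what lets the network model depict relationships that a strict tree cannot.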
Hierarchical Model
Network Model
Standard Database Concepts from Network
Model Still Used by Modern Data Models
Schema: Conceptual organization of the entire database as viewed by the database administrator
Subschema: Portion of the database seen by the application programs that produce the desired information from the data within the database
Data Manipulation Language (DML): Environment in which data can be managed and is used to work with the data in the database
Data Definition Language (DDL): Enables the database administrator to define the schema components
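The DDL/DML split survives in SQL-based systems. The sketch below uses Python's built-in `sqlite3` module to show one statement of each kind; the table, column names, and sample values are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# DDL: define a schema component (a table and its columns).
conn.execute(
    "CREATE TABLE PAINTER (painter_id INTEGER PRIMARY KEY, painter_name TEXT NOT NULL)"
)

# DML: work with the data inside the structure defined above.
conn.execute("INSERT INTO PAINTER (painter_id, painter_name) VALUES (?, ?)", (1, "Aminah"))
conn.execute("UPDATE PAINTER SET painter_name = ? WHERE painter_id = ?", ("Aminah Z.", 1))
rows = conn.execute("SELECT painter_id, painter_name FROM PAINTER").fetchall()
print(rows)
```

The CREATE statement changes the structure of the database, while INSERT, UPDATE, and SELECT only touch the data held in that structure.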
Relational Model
Developed by Codd (IBM) in 1970
▪ considered ingenious but impractical in 1970
▪ Computers lacked power to implement the relational model
Conceptually simple, based on the mathematical concept of a relation
Today, the relational model is the current database implementation
standard
Relational Database Management System (RDBMS)
Performs the same basic functions provided by hierarchical and network
DBMSs, in addition to other functions
Most important advantage of the RDBMS is its ability to hide the
complexities of the relational model from the user
Relational Model
Produced an automatic transmission database that replaced
standard transmission databases
Based on a relation
• Relation or table: Matrix composed of intersecting tuples (rows) and attributes (columns)
Describes a precise set of data manipulation constructs
Advantages
• Structural independence is promoted using independent tables
• Tabular view improves conceptual simplicity
• Ad hoc query capability is based on SQL
• Isolates the end user from physical-level details
• Improves implementation and management simplicity
Disadvantages
• Requires substantial hardware and system software overhead
• Conceptual simplicity gives untrained people the tools to use a good system poorly
• May promote information problems
Relational Model
Rise to dominance due in part to its powerful and
flexible query language
Structured Query Language (SQL) allows the user to
specify what must be done without specifying how it must
be done
SQL-based relational database application involves:
▪ User interface
▪ A set of tables stored in the database
▪ SQL engine
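The three parts listed above can be seen in a few lines of Python using the built-in `sqlite3` module. This is a sketch under assumptions: the script stands in for the user interface, SQLite plays the SQL engine, and the EMPLOYEE table with its sample rows is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite acts as the SQL engine

# A table stored in the database.
conn.execute(
    "CREATE TABLE EMPLOYEE (emp_id INTEGER PRIMARY KEY, emp_name TEXT, store_code TEXT)"
)
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [(1, "Ali", "S01"), (2, "Siti", "S02"), (3, "Chong", "S01")],
)

# The query states WHAT is wanted (names of employees in store S01, sorted);
# the engine decides HOW to scan, filter, and sort the rows.
query = "SELECT emp_name FROM EMPLOYEE WHERE store_code = ? ORDER BY emp_name"
names = [row[0] for row in conn.execute(query, ("S01",))]
print(names)
```

No loop over records or sort routine appears in the query itself, which is the sense in which SQL specifies what must be done without specifying how.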
A Relational Diagram (MS Access)
Cengage Learning © 2015
Entity Relationship (ER) Model
Introduced by Chen in 1976
Widely accepted tool for graphical representation of
entities and their relationships in a database structure
Entity relationship diagram (ERD)
Uses graphic representations to model database
components
Entity instance or entity occurrence
Rows in the relational table
Connectivity: Term used to label the relationship
types (1:1, 1:M, M:N)
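One common way to carry these connectivities into relational tables is a foreign key for 1:M and a bridge table for M:N. The sketch below assumes that approach and uses Python's `sqlite3` module; the table and column names are illustrative, reusing the painter and employee business rules from earlier slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE PAINTER (painter_id INTEGER PRIMARY KEY, painter_name TEXT NOT NULL);
-- 1:M: each PAINTING row points at exactly one PAINTER.
CREATE TABLE PAINTING (
    painting_id INTEGER PRIMARY KEY,
    title       TEXT NOT NULL,
    painter_id  INTEGER NOT NULL REFERENCES PAINTER(painter_id)
);
CREATE TABLE EMPLOYEE (emp_id INTEGER PRIMARY KEY, emp_name TEXT NOT NULL);
CREATE TABLE SKILL (skill_id INTEGER PRIMARY KEY, skill_name TEXT NOT NULL);
-- M:N: a bridge table holds one row per (employee, skill) pairing.
CREATE TABLE EMPLOYEE_SKILL (
    emp_id   INTEGER NOT NULL REFERENCES EMPLOYEE(emp_id),
    skill_id INTEGER NOT NULL REFERENCES SKILL(skill_id),
    PRIMARY KEY (emp_id, skill_id)
);
""")

conn.execute("INSERT INTO PAINTER VALUES (1, 'Painter A')")
conn.execute("INSERT INTO PAINTING VALUES (10, 'Harbour at Dawn', 1)")
try:
    # Painter 99 does not exist, so this painting would have no painter.
    conn.execute("INSERT INTO PAINTING VALUES (11, 'Orphan', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

painting_count = conn.execute("SELECT COUNT(*) FROM PAINTING").fetchone()[0]
print(orphan_rejected, painting_count)
```

The NOT NULL foreign key is what expresses "a painting must be painted by one and only one painter"; the composite primary key on the bridge table prevents the same employee and skill pairing from being recorded twice.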
Entity Relationship (ER) Model
The Object-Oriented Data Model (OODM) or
Semantic Data Model
Object-oriented database management system (OODBMS)
• Based on OODM
Object: Contains data and their relationships, together with the operations that are performed on them
• Basic building block for autonomous structures
• Abstraction of real-world entity
Attributes - Describe the properties of an object
Class: Collection of similar objects with shared structure and behavior organized in a
class hierarchy
• Class hierarchy: Resembles an upside-down tree in which each class has only one parent
Inheritance: Object inherits methods and attributes of parent class
Unified Modeling Language (UML)
• Describes sets of diagrams and symbols to graphically model a system
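The terms object, attribute, class, and inheritance map directly onto constructs in an object-oriented programming language. The Python sketch below is an analogy only, with invented class names; it is not an OODBMS and stores nothing persistently.

```python
class Person:
    """Parent class: structure and behavior shared by all persons."""

    def __init__(self, name):
        self.name = name  # attribute: a property of the object

    def describe(self):
        # method: an operation performed on the object's own data
        return f"{type(self).__name__}: {self.name}"


class Employee(Person):
    """Child class: inherits name and describe() from its single parent."""

    def __init__(self, name, skills=None):
        super().__init__(name)
        self.skills = list(skills or [])

    def learn(self, skill):
        self.skills.append(skill)


emp = Employee("Siti")  # an object: one instance of the Employee class
emp.learn("SQL")
print(emp.describe())   # uses the method inherited from Person
print(emp.skills)
```

Employee defines no `describe` method of its own, yet the call works because the method is inherited from the parent class.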
A Comparison of OO, UML, and ER Models
Advantages of the OODM
• Semantic content is added
• Visual representation includes semantic content
• Inheritance promotes data integrity
Disadvantages of the OODM
• Slow development of standards caused vendors to supply their own enhancements, which compromised a widely accepted standard
• Complex navigational system
• Learning curve is steep
• High system overhead slows transactions
Object/Relational and XML
Extended relational data model (ERDM)
• Supports OO features and complex data representation
Object/Relational Database Management System (O/R DBMS)
• Based on ERDM, focuses on better data management
Extensible Markup Language (XML)
• Manages unstructured data for efficient and effective
exchange of all data types
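A short sketch of XML as an exchange format, using Python's standard `xml.etree.ElementTree` module. The element names, attribute names, and values are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Build a small document: the tags describe the data they carry.
order = ET.Element("order", attrib={"id": "1001"})
ET.SubElement(order, "customer").text = "Kedai Seri"
item = ET.SubElement(order, "item", attrib={"qty": "2"})
item.text = "Batik canvas"

# Serialize to text, the form in which the data would be exchanged.
xml_text = ET.tostring(order, encoding="unicode")
print(xml_text)

# A receiving system parses the same text back into a tree of elements.
parsed = ET.fromstring(xml_text)
print(parsed.find("customer").text, parsed.find("item").get("qty"))
```

Because the structure travels with the data as tags, sender and receiver do not need to share a table layout in advance.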
Big Data
Aims to:
Find new and better ways to manage large amounts of web
and sensor-generated data and derive business insight from it
Provide high performance and scalability at a reasonable cost
Basic characteristics of Big Data databases (3 Vs):
Volume – amount of data
Velocity – speed at which data grows and speed at which it must be
processed
Variety – data comes in multiple different formats
Big Data Challenges
Not always possible to fit unstructured, social media
and sensor-generated data in the conventional
structure of rows and columns
Expensive (multiformat data – more storage,
processing power, sophisticated data analysis tools)
Data analysis based on OLAP tools proved
inconsistent when dealing with unstructured data
Frequently Used Big Data New Technologies
• Hadoop
• Hadoop Distributed File System (HDFS)
• MapReduce
• NoSQL
Frequently Used Big Data New Technologies
Hadoop
• Hadoop is an open source distributed processing framework that manages data processing and storage for big data applications running in clustered systems.
• It is at the center of a growing ecosystem of big data technologies that are primarily used to support advanced analytics initiatives, including predictive analytics, data mining and machine learning applications.
• Hadoop can handle various forms of structured and unstructured data, giving users more flexibility for collecting, processing and analyzing data than relational databases and data warehouses provide.
HDFS
• Primary data storage system used by Hadoop applications.
• It employs a NameNode and DataNode architecture to implement a distributed file system that provides high-performance access to data across highly scalable Hadoop clusters.
• HDFS is a key part of the many Hadoop ecosystem technologies, as it provides a reliable means for managing pools of big data and supporting related big data analytics applications.
MapReduce
• Core component of the Apache Hadoop software framework.
• Hadoop enables resilient, distributed processing of massive unstructured data sets across commodity computer clusters, in which each node of the cluster includes its own storage.
• Serves two essential functions: it filters and parcels out work to various nodes within the cluster or map, a function sometimes referred to as the mapper, and it organizes and reduces the results from each node into a cohesive answer to a query, referred to as the reducer.
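The mapper and reducer roles described for MapReduce can be imitated in a single Python process. This is a teaching sketch only: it does not use the Hadoop API, the function names are assumptions, and nothing here is distributed across nodes.

```python
from collections import defaultdict


def mapper(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    # In a real cluster each line (or block of lines) could be mapped on a different node.
    pairs = [pair for line in lines for pair in mapper(line)]
    return dict(reducer(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["big data big clusters", "data nodes"])
print(counts)
```

Because each call to `mapper` depends only on its own line, and each call to `reducer` only on its own key, the work can be parcelled out and the partial results combined afterwards.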
NoSQL (Not only SQL / Non SQL) Databases
Not based on the relational model
Support distributed database architectures
Provide high scalability, high availability, and fault tolerance
Support large amounts of sparse data
Geared toward performance rather than transaction consistency
Store data in key-value stores
NoSQL (Not only SQL / Non SQL) Databases
Advantages
• High scalability, availability, and fault tolerance are provided
• Uses low-cost commodity hardware
• Supports Big Data
• Key-value model improves storage efficiency
Disadvantages
• Complex programming is required
• There is no relationship support
• There is no transaction integrity support
A Simple Key-value Representation
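A key-value layout can be sketched with an ordinary Python dictionary. The key format, the helper names `put` and `get_entity`, and the sensor readings are assumptions for illustration; real NoSQL stores expose their own interfaces.

```python
# Each entry maps (entity key, attribute name) -> value.
# Attributes an entity does not have are simply absent, which suits sparse data.
store = {}


def put(entity, attribute, value):
    store[(entity, attribute)] = value


def get_entity(entity):
    """Collect every attribute stored for one entity."""
    return {attr: val for (ent, attr), val in store.items() if ent == entity}


put("sensor:17", "temperature", 31.5)
put("sensor:17", "location", "Seremban")
put("sensor:42", "humidity", 0.81)  # a different set of attributes, no empty columns

print(get_entity("sensor:17"))
print(get_entity("sensor:42"))
```

Nothing in the store says how sensor 17 relates to sensor 42, which reflects the listed disadvantage: relationships must be handled in application code.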
Data Models: A Summary
Each new data model capitalized on the
shortcomings of previous models
Common characteristics:
▪ Conceptual simplicity without compromising the
semantic completeness of the database
▪ Represent the real world as closely as possible
▪ Representation of real-world transformations
(behavior) must comply with consistency and
integrity characteristics of any data model
Some models are better suited to some tasks
Data Model Basic Terminology Comparison
Degrees of Data Abstraction
The major purpose of a database system is to
provide users with an abstract view of the system.
Using levels of abstraction can be very helpful in
integrating multiple and conflicting views of data at
different levels of an organization
Many processes begin at high level of abstraction and
proceed to an ever-increasing level of detail.
Designing a usable database follows the same basic
process
The database system hides certain details of how
data is stored, created and maintained
Data Abstraction Levels
The External Model
End users’ view of the data
environment
ER diagrams are used to represent
the external views
External schema: Specific
representation of an external view
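In a relational DBMS, one way to give each group of end users its own view of the data is an SQL view. The sketch below assumes that approach and uses Python's `sqlite3` module; the table, view names, and sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (emp_id INTEGER PRIMARY KEY, emp_name TEXT, dept TEXT, salary REAL);
INSERT INTO EMPLOYEE VALUES (1, 'Ali', 'SALES', 3200.0), (2, 'Siti', 'HR', 4100.0);
-- View for the staff directory: names and departments, no salary.
CREATE VIEW STAFF_DIRECTORY AS SELECT emp_name, dept FROM EMPLOYEE;
-- View for payroll: only the columns payroll works with.
CREATE VIEW PAYROLL AS SELECT emp_id, salary FROM EMPLOYEE;
""")

directory = conn.execute("SELECT * FROM STAFF_DIRECTORY ORDER BY emp_name").fetchall()
payroll = conn.execute("SELECT * FROM PAYROLL ORDER BY emp_id").fetchall()
print(directory)
print(payroll)
```

Both views draw on the same stored table, yet each user group sees only the portion of the data environment relevant to its work.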
The Conceptual Model
Represents a global view of the entire
database by the entire organization
Conceptual schema: Basis for the
identification and high-level description of the
main data objects
Has a macro-level view of data environment
Is software and hardware independent
Logical design: Task of creating a
conceptual data model
The Internal Model
Represents the database as seen by
the DBMS, mapping the conceptual
model to the DBMS
Internal schema: Specific
representation of an internal model
• Uses the database constructs supported by
the chosen database
Is software dependent and
hardware independent
Logical independence: Changing
internal model without affecting the
conceptual model
The Physical Model
Operates at lowest level of abstraction
Describes the way data are saved on storage media
such as disks or tapes
Requires the definition of physical storage and data
access methods
Relational model aimed at logical level
• Does not require physical-level details
Physical independence: Changes in physical model
do not affect internal model
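Physical independence can be illustrated by changing an access method while leaving the table definition and the query alone. The sketch below assumes that adding an index counts as such a physical-level change and uses Python's `sqlite3` module; table, index, and sample data are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PAINTING (painting_id INTEGER PRIMARY KEY, title TEXT, painter_id INTEGER)"
)
conn.executemany(
    "INSERT INTO PAINTING VALUES (?, ?, ?)",
    [(1, "Sunrise", 7), (2, "Harbour", 8), (3, "Market", 7)],
)

query = "SELECT title FROM PAINTING WHERE painter_id = ? ORDER BY title"
before = conn.execute(query, (7,)).fetchall()

# Physical-level change: add an access path. Table definition and query are untouched.
conn.execute("CREATE INDEX idx_painting_painter ON PAINTING (painter_id)")
after = conn.execute(query, (7,)).fetchall()

print(before == after)
```

The query text is identical before and after the index is created and returns the same rows; only the way the DBMS may locate them has changed.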
Levels of Data Abstraction
Levels of Data Abstraction
(3 Tier Architecture)
External Model
• End users’ view (micro-view)
• Hardware independent, software independent
• Views: View 1, View 2, … View n (one per user: User 1, User 2, … User n)
• Represented with an ERD
Conceptual Model
• Designer’s view (macro-view)
• Hardware independent, software independent
• Conceptual schema, represented with an ERD
Internal Model
• DBMS’s view
• Hardware independent, software dependent
• Internal schema, expressed in SQL
Physical Model
• Physical data organization
• Hardware dependent, software dependent
• Database
Summary
A data model is an abstraction of a complex real-world
data environment
Basic data modeling components:
Entities
Attributes
Relationships
Constraints
Business rules identify and define basic modeling
components
Hierarchical model and network model were early
models that are no longer used, but some of the concepts
are found in current data models.
Summary
Relational model
Current database implementation standard
ER model is a tool for data modeling
◼ Complements relational model
Object-oriented data model: object is basic modeling
structure
Relational model adopted object-oriented extensions:
extended relational data model (ERDM)
OO data models depicted using UML
Data-modeling requirements are a function of different
data views and abstraction levels
Four abstraction levels: external, conceptual, internal, and
physical