UNIT II
DATABASE DESIGN
Entity-Relationship model
An E-R diagram expresses the overall logical structure of a
database graphically.
It gives a conceptual design for the database and a simple,
easy-to-design view of the data.
It is easy to visualize and understand.
The E-R data model employs three basic
concepts:
Entity sets
Relationship sets and
Attributes
Entity set
Entity:
- a “thing” or “object” in the real world that is distinguishable
from all other objects.
An entity has a set of properties, and the values for some set of
properties must uniquely identify an entity
For example, in a school database, students, teachers, classes,
and courses offered can be considered as entities.
Entity Set:
An entity set is a collection of similar types of entities.
Entities in an entity set share the same attributes, but each
entity has its own value for every attribute.
For example, the entity set “student” might represent
the set of all students in the university.
Entity
Attributes
Entities are represented by means of their properties,
called attributes.
All attributes have values.
For example, a student entity may have name, class,
and age as attributes.
Types of Attributes:
1) Simple attribute − Simple attributes are atomic values, which
cannot be divided further.
For example, a student's phone number is an atomic value of
10 digits.
2) Composite attribute − Composite attributes are made of
more than one simple attribute.
For example, a student's complete name may have first_name
and last_name.
3) Derived attribute − Derived attributes are attributes that do not
exist in the physical database; their values are derived from other
attributes present in the database.
For example, average_salary in a department should not be saved
directly in the database; instead, it can be derived.
As another example, age can be derived from date_of_birth.
4) Single-value attribute − Single-value attributes contain a
single value.
For example, name and id.
5) Multi-value attribute − Multi-value attributes may contain more
than one value.
For example, a person can have more than one
phone number, email_address, etc.
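The attribute types above can be sketched in code. This is only an illustration, assuming a hypothetical Student record; the class names, field names, and sample values are not from the slides.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class FullName:
    # Composite attribute: made of two simple attributes.
    first_name: str
    last_name: str


@dataclass
class Student:
    roll_number: int        # simple, single-value attribute
    name: FullName          # composite attribute
    date_of_birth: date     # simple attribute, stored
    # Multi-value attribute: a student may have several phone numbers.
    phone_numbers: list[str] = field(default_factory=list)

    def age(self, today: date) -> int:
        # Derived attribute: computed from date_of_birth, never stored.
        before_birthday = (today.month, today.day) < (
            self.date_of_birth.month, self.date_of_birth.day)
        return today.year - self.date_of_birth.year - before_birthday


# Hypothetical sample entity.
student = Student(1, FullName("Ava", "Lee"), date(2004, 5, 20),
                  ["9876543210", "9123456780"])
```

Passing `today` explicitly keeps the derived value reproducible; a real system would usually pass `date.today()`.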
Entity-Set and Keys
A key is an attribute or collection of attributes that uniquely
identifies an entity within an entity set.
For example, the roll_number of a student identifies that student
among all students.
Super Key − A set of attributes (one or more) that collectively
identifies an entity in an entity set.
Candidate Key − A minimal super key is called a candidate key. An
entity set may have more than one candidate key.
Primary Key − A primary key is one of the candidate keys, chosen
by the database designer to uniquely identify entities in the entity set.
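The key definitions above can be tried out on sample data. The sketch below is a minimal illustration, assuming a hypothetical student table; note that it only tests the rows it is given, whereas a real key is a rule that must hold for every valid state of the database.

```python
from itertools import combinations


def is_super_key(rows, attrs):
    """True if no two rows share the same values for attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows):
    """Return all minimal super keys, smallest first."""
    columns = sorted(rows[0])
    keys = []
    for size in range(1, len(columns) + 1):
        for attrs in combinations(columns, size):
            # Skip supersets of a key already found: they are not minimal.
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if is_super_key(rows, attrs):
                keys.append(attrs)
    return keys


# Hypothetical sample rows: two students share the same name.
students = [
    {"roll_number": 1, "email": "a@example.com", "name": "Ava"},
    {"roll_number": 2, "email": "o@example.com", "name": "Olivia"},
    {"roll_number": 3, "email": "v@example.com", "name": "Ava"},
]
```

Here `candidate_keys(students)` finds `email` and `roll_number`; the designer would pick one of them as the primary key.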
Relationship
The association among entities is called a relationship.
For example, an employee works_at a department, a student enrolls in
a course.
Here, Works_at and Enrolls are called relationships.
In an ER diagram, a relationship is represented by a diamond that is
connected to the participating entities by lines.
Degree of a relationship set:
The number of different entity sets participating in a
relationship set is called the degree of the relationship set.
1) Unary Relationship (degree 1)
When only ONE entity set participates in a relationship,
the relationship is called a unary relationship. For example,
one person is married to only one person (both belong to the same entity set).
2) Binary Relationship (degree 2)
When TWO entity sets participate in a relationship, the
relationship is called a binary relationship.
For example, Student is enrolled in Course.
3) Ternary Relationship (degree 3)
When THREE entity sets participate in a relationship, the
relationship is called a ternary relationship.
4) n-ary Relationship (degree n)
When n entity sets participate in a relationship,
the relationship is called an n-ary relationship.
For example, if the five entity sets Teacher, Class, Location, Salary, and
Course take part in one relationship, it is an n-ary relationship with n = 5.
Cardinality Constraints
Cardinality is the number of instances of one entity set that can be
associated with an instance of another entity set through a relationship.
1) One-to-one
When one instance of an entity is associated with at most one instance of
the other entity, and vice versa, it is marked as '1:1'.
2) One-to-many
When one instance of the first entity can be associated with more than
one instance of the second entity, but not the other way round, it is
marked as '1:N'.
3) Many-to-one
When more than one instance of the first entity can be associated
with one instance of the second entity, it is marked as 'N:1'.
4) Many-to-many
When more than one instance of an entity can be associated with
more than one instance of another entity, it is called a many-to-many
relationship and is marked as 'M:N'.
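The four cases can be told apart by counting partners on each side. The sketch below is an illustration only: the function name and the sample pairs are assumptions, and it classifies the pairs it sees, while a real cardinality constraint is a design decision about all possible data.

```python
def cardinality(pairs):
    """Classify a binary relationship from (left, right) instance pairs."""
    rights_of = {}   # left instance  -> set of associated right instances
    lefts_of = {}    # right instance -> set of associated left instances
    for left, right in set(pairs):
        rights_of.setdefault(left, set()).add(right)
        lefts_of.setdefault(right, set()).add(left)
    left_has_many = any(len(rs) > 1 for rs in rights_of.values())
    right_has_many = any(len(ls) > 1 for ls in lefts_of.values())
    if left_has_many and right_has_many:
        return "M:N"
    if left_has_many:
        return "1:N"   # one left instance, many right instances
    if right_has_many:
        return "N:1"   # many left instances, one right instance
    return "1:1"


# Hypothetical instances: employees e1, e2 work at department d1;
# students s1, s2 enroll in courses c1, c2.
works_at = [("e1", "d1"), ("e2", "d1")]
enrolls = [("s1", "c1"), ("s1", "c2"), ("s2", "c1")]
```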
Participation Constraints
Total Participation − Every entity in the entity set is involved in the
relationship. Total participation is represented by double lines.
Partial participation − Not all entities in the entity set are involved in
the relationship. Partial participation is represented by single lines.
Strong Entity:
A strong entity has a primary key. Its existence does not depend on
any other entity. Weak entities depend on a strong entity.
A strong entity is represented by a single rectangle.
Weak Entity:
A weak entity does not have a primary key of its own and depends on
a parent (strong) entity for its identification.
A weak entity is represented by a double rectangle.
Normalization
“Good Database”
Normalization is a technique for organizing the data in a database
into multiple related tables, to minimize data redundancy.
It minimizes the redundancy in a relation or set of relations. It also
eliminates undesirable characteristics such as insertion, update, and
deletion anomalies.
Normalization divides a larger table into smaller tables and links
them using relationships.
Data Redundancy
Repetition of data, which increases the size of the database.
It also causes other issues:
Insertion problems
Deletion problems
Update problems
Atomic Domain and 1st Normal Form
"Atomic" means "cannot be divided or split in smaller parts".
Applied to 1NF, this means that a column should not contain
more than one value in a single row.
Step 1 of Normalization:
1NF gives a scalable table design that can be extended easily.
If a table is not even in 1st Normal Form, it is considered a
poor database design.
4 RULES
Rule 1:
Each Column should have atomic values
Rule 2:
A Column should contain values that are of same type
Do not inter-mix different types of values in any
column
Rule 3:
Each column should have a unique name
Rule 4:
The order in which data is stored doesn’t matter. Using an SQL query, you
can easily fetch data in any order from the table.
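Rule 1 (atomic values) can be illustrated by splitting a column that holds several values into one row per value. This is a minimal sketch, assuming a hypothetical student table where phone numbers were stored as a comma-separated string.

```python
def to_first_normal_form(rows, column, separator=","):
    """Split a multi-valued column so every row holds one atomic value."""
    atomic_rows = []
    for row in rows:
        for value in row[column].split(separator):
            new_row = dict(row)              # copy the other columns as-is
            new_row[column] = value.strip()  # keep exactly one value
            atomic_rows.append(new_row)
    return atomic_rows


# Hypothetical rows: the first one violates 1NF.
students = [
    {"roll_number": 1, "name": "Ava", "phone": "9876543210, 9123456780"},
    {"roll_number": 2, "name": "Olivia", "phone": "9000000001"},
]
atomic = to_first_normal_form(students, "phone")
```

After the split, roll_number alone no longer identifies a row, which is one reason the later normal forms move such data into separate tables.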
2nd Normal Form(2NF)
For a table to be in the Second Normal Form, it must
satisfy two conditions:
The table should be in the First Normal Form.
There should be no Partial Dependency.
Prime Attribute & Non-Prime Attribute
Functional Dependency
Functional dependency is a relationship in which one
attribute or field in a record determines another.
In a database, we often have the case where one field defines
the other.
X → Y
“X determines Y” or “Y is functionally dependent on X”
Example: Marks → Grade
It typically exists between the primary key and a non-key
attribute within a table.
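A functional dependency X → Y can be checked on sample rows: the same X value must never appear with two different Y values. The sketch below is an illustration, assuming a hypothetical results table; the function name `holds` is not from the slides.

```python
def holds(rows, lhs, rhs):
    """True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # The same X value must always map to the same Y value.
        if seen.setdefault(x, y) != y:
            return False
    return True


# Hypothetical sample rows for the Marks -> Grade example.
results = [
    {"Student": "S1", "Marks": 91, "Grade": "A"},
    {"Student": "S2", "Marks": 91, "Grade": "A"},
    {"Student": "S3", "Marks": 75, "Grade": "B"},
]
```

With this data, Marks → Grade holds, while Grade → Student does not, because grade A belongs to two different students.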
StudentID  ProjectID  StudentName  ProjectName
S89        P09        Olivia       Geo Location
S92        P07        Alexandra    Cluster Exploration
S56        P03        Ava          IoT Devices
S92        P05        Alexandra    Cloud Deployment
ProjectID → ProjectName (ProjectID is only part of the primary key)
{StudentID, ProjectID}: primary key
StudentID  ProjectID  StudentName  ProjectName
S89        P09        Olivia       Geo Location
S76        P07        Jacob        Cluster Exploration
S56        P03        Ava          IoT Devices
S92        P05        Alexandra    Cloud Deployment
{StudentID, SubjectID}: primary key
The Teacher column depends only on SubjectID.
{EmployeeID, DepartmentID}: primary key
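Removing the partial dependencies from the StudentID/ProjectID table can be sketched as follows. This is one possible decomposition, written with plain Python collections as an assumption; the variable names are illustrative.

```python
# Rows of the table with primary key {StudentID, ProjectID}.
rows = [
    ("S89", "P09", "Olivia", "Geo Location"),
    ("S76", "P07", "Jacob", "Cluster Exploration"),
    ("S56", "P03", "Ava", "IoT Devices"),
    ("S92", "P05", "Alexandra", "Cloud Deployment"),
]

# StudentName depends only on StudentID and ProjectName only on ProjectID,
# so each moves to a table keyed by the attribute that determines it.
students = {sid: sname for sid, _, sname, _ in rows}
projects = {pid: pname for _, pid, _, pname in rows}
assignments = {(sid, pid) for sid, pid, _, _ in rows}

# Joining the three tables gives back the original rows.
rejoined = {(sid, pid, students[sid], projects[pid])
            for sid, pid in assignments}
```

Each name is now stored once per student or project, so renaming a project means changing a single entry.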
3rd Normal Form(3NF)
A relation is in 3NF if:
it is in 2NF, and
it does not contain any transitive dependency of a non-prime attribute on the primary key.
Functional Dependency (X → Y)
X: primary key or prime attribute; Y: non-prime attribute
Partial Functional Dependency (X → Y)
X: part of the primary key (a prime attribute); Y: non-prime attribute
Transitive Functional Dependency (X → Y)
X: non-prime attribute; Y: non-prime attribute
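The three kinds of dependency differ only in how the determinant X relates to the key. The sketch below is a simplified illustration, assuming a single candidate key and a non-prime Y; the function name and the EmployeeID/ZipCode attributes are hypothetical.

```python
def classify_determinant(lhs, key):
    """Classify X in X -> Y against one candidate key (Y assumed non-prime)."""
    lhs, key = set(lhs), set(key)
    if key <= lhs:
        return "full"         # X contains the whole key
    if lhs < key:
        return "partial"      # X is only part of the key: violates 2NF
    if not (lhs & key):
        return "transitive"   # X is made of non-prime attributes: violates 3NF
    return "mixed"            # X mixes prime and non-prime attributes


# ProjectID -> ProjectName with key {StudentID, ProjectID}.
partial = classify_determinant({"ProjectID"}, {"StudentID", "ProjectID"})
# Hypothetical ZipCode -> City with key {EmployeeID}.
transitive = classify_determinant({"ZipCode"}, {"EmployeeID"})
```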
Boyce-Codd Normal Form (BCNF)
OR
3.5 Normal Form
A relation is in BCNF if:
It is in the 3rd Normal Form.
For any dependency X → Y, X is a super key.
Example: Professor → Subject
Professor is not a super key, so the relation is not in BCNF.
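The super key condition can be tested with an attribute closure: X is a super key when the attributes it determines cover the whole relation. The sketch below assumes a hypothetical relation (Student, Subject, Professor) and a hypothetical first dependency; only Professor → Subject comes from the slides.

```python
def closure(attrs, fds):
    """Attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Non-trivial dependencies whose left side is not a super key."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(rhs) <= set(lhs)              # skip trivial FDs
            and closure(lhs, fds) != set(schema)]


# Assumed relation and dependencies (illustration only).
schema = {"Student", "Subject", "Professor"}
fds = [
    (("Student", "Subject"), ("Professor",)),
    (("Professor",), ("Subject",)),
]
```

Under these assumptions, `bcnf_violations(schema, fds)` reports only Professor → Subject, since Professor alone does not determine Student.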
4TH NORMAL FORM(4NF)
A relation is in 4NF if:
1) it is in Boyce-Codd Normal Form, and
2) it has no multi-valued dependency.
For a dependency A → B, if multiple values of B exist for a
single value of A, then the dependency is a
multi-valued dependency.
Multi-valued dependency
For a table with columns A, B, and C,
if A →→ B is a multi-valued dependency,
then B and C should be independent of each other.
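Independence of B and C can be checked directly: for each value of A, every B value must appear together with every C value. The sketch below is an illustration for a three-column table, assuming hypothetical student, course, and hobby data.

```python
from itertools import product


def mvd_holds(rows):
    """True if A ->> B holds in a set of (a, b, c) rows."""
    rows = set(rows)
    groups = {}
    for a, b, c in rows:
        bs, cs = groups.setdefault(a, (set(), set()))
        bs.add(b)
        cs.add(c)
    # B and C are independent when every (b, c) combination
    # appears for each value of a.
    return all((a, b, c) in rows
               for a, (bs, cs) in groups.items()
               for b, c in product(bs, cs))


# Hypothetical data: a student's courses and hobbies are unrelated facts.
independent = [
    ("S1", "Math", "Chess"), ("S1", "Math", "Music"),
    ("S1", "Physics", "Chess"), ("S1", "Physics", "Music"),
]
dependent = independent[:3]   # the (Physics, Music) combination is missing
```

When the check succeeds, the table can be split into (A, B) and (A, C) without losing information, which is what 4NF asks for.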
Three Rules of Multi Valued Dependency:
5th Normal Form(5NF)
A table is said to be in Fifth Normal Form if it satisfies the following
properties:
1. It is in the Fourth Normal Form, and
2. It does not have a join dependency.
It is also called Project-Join Normal Form (PJNF).
If the table has a join dependency, then it should be broken into
smaller tables.
R: a relation with a join dependency
Break down R into R1 and R2
Join R1 and R2 again
Get the same relation R again
AGENT  COMPANY  PRODUCT
A      FORD     CAR
A      FORD     TRUCK
A      GM       CAR
A      GM       TRUCK
B      FORD     CAR
R1                R2                  R3
AGENT  COMPANY    COMPANY  PRODUCT    AGENT  PRODUCT
A      FORD       FORD     CAR        A      CAR
A      GM         FORD     TRUCK      A      TRUCK
B      FORD       GM       CAR        B      CAR
                  GM       TRUCK
AGENT  COMPANY  PRODUCT
A      FORD     CAR
A      FORD     TRUCK
A      GM       CAR
A      GM       TRUCK
B      FORD     CAR
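The AGENT/COMPANY/PRODUCT example can be replayed in code: project the relation onto each pair of columns, then join the three projections and compare with the original. This is a minimal sketch using Python sets; the variable names are illustrative.

```python
# The original relation R(AGENT, COMPANY, PRODUCT).
r = {
    ("A", "FORD", "CAR"), ("A", "FORD", "TRUCK"),
    ("A", "GM", "CAR"), ("A", "GM", "TRUCK"),
    ("B", "FORD", "CAR"),
}

# Project R onto each pair of columns.
r1 = {(agent, company) for agent, company, _ in r}
r2 = {(company, product) for _, company, product in r}
r3 = {(agent, product) for agent, _, product in r}

# Natural join of R1, R2 and R3.
joined = {
    (agent, company, product)
    for agent, company in r1
    for company2, product in r2
    if company == company2 and (agent, product) in r3
}

# Joining only R1 and R2 is not enough: it adds a row that was never in R.
r1_join_r2 = {
    (agent, company, product)
    for agent, company in r1
    for company2, product in r2
    if company == company2
}
```

Here `joined` equals `r`, while `r1_join_r2` also contains (B, FORD, TRUCK), so all three smaller tables are needed to get the same relation back.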