UNIT 2: RELATIONAL MODEL
Topics:
Relational model
Basic structure of relational databases
Database schema, keys, schema diagrams
Relational query languages
Relational algebra: fundamental, additional and extended operations
RELATIONAL MODEL
The relational model is today the primary data model for
commercial data processing applications.
Relational Model (RM) represents the database as a
collection of relations. A relation is nothing but a table of
values.
Each row in the table denotes a real-world entity or relationship.
The data are represented as a set of relations, and each relation is stored as a table.
The physical storage of the data is independent of the way the
data are logically organized.
Some popular Relational Database management
systems are:
DB2 and Informix Dynamic Server – IBM
Oracle and RDB – Oracle
MySQL --Oracle
SQL Server and Access – Microsoft
Structure of Relational Databases:
A relational database consists of a collection of relations (tables), each of which is assigned a unique name.
Consider a relation STUDENT with attributes ROLL_NO, NAME, ADDRESS, PHONE and AGE, as shown in the STUDENT table.
STRUCTURE OF RELATIONAL DATABASES
Attribute
Tuple
Relation Schema
Relation Instance
Domain
Degree
Cardinality
NULL Values
Attribute (column header):
Attributes are the properties that define a relation, e.g. ROLL_NO, NAME.
Tuple (row):
Each row in the relation is known as a tuple.
Relation Schema:
A relation schema gives the name of the relation together with its attributes.
For example:
STUDENT (ROLL_NO, NAME, ADDRESS, PHONE, AGE)
is the relation schema for STUDENT.
The collection of the relation schemas of all the relations in a database is called the relational (database) schema.
RELATION INSTANCE:
The set of tuples of a relation at a particular instant of time is called a relation instance.
The table shows the relation instance of STUDENT at a particular time.
The instance changes whenever a tuple is inserted, deleted or updated.
Ex. If t1 is the relation instance at one time, then after a deletion the relation has a new instance t2.
DOMAIN:
For each attribute of a relation, there is a set of permitted values, called the domain of that attribute.
The domain is a set of atomic (indivisible) values that the attribute can take, i.e. all possible column values.
Ex1. In the STUDENT table, the domain of NAME is the set of character strings.
Ex2. In an EMPLOYEE table, the domain of age may be restricted to values between 20 and 50.
Degree: The number of attributes in the relation is known
as degree of the relation. The STUDENT relation defined
above has degree 5.
Cardinality: The number of tuples in a relation is known as
cardinality. The STUDENT relation defined above has
cardinality 4.
NULL Values: A value that is not known or unavailable is called a NULL value. In the table it is shown as a blank cell, e.g.
PHONE of the STUDENT with ROLL_NO 4 is NULL.
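The terms above can be checked on a small in-memory relation. The following is a minimal sketch in Python; the four rows are invented for illustration (the original table is not reproduced here), and NULL is modelled with Python's None.

```python
# Relation schema: STUDENT (ROLL_NO, NAME, ADDRESS, PHONE, AGE)
ATTRIBUTES = ("ROLL_NO", "NAME", "ADDRESS", "PHONE", "AGE")

# Hypothetical relation instance; every value below is made up.
student = [
    (1, "Ram", "Delhi", "9455123451", 18),
    (2, "Ramesh", "Gurgaon", "9652431543", 18),
    (3, "Sujit", "Rohtak", "9156253131", 20),
    (4, "Suresh", "Delhi", None, 18),  # PHONE is NULL (unknown)
]

degree = len(ATTRIBUTES)    # number of attributes
cardinality = len(student)  # number of tuples

# ROLL_NO of every tuple whose PHONE value is NULL
phone_idx = ATTRIBUTES.index("PHONE")
missing_phone = [row[0] for row in student if row[phone_idx] is None]

print(degree, cardinality, missing_phone)  # prints: 5 4 [4]
```

Inserting or deleting a row changes the cardinality (a new relation instance), while the degree stays fixed by the schema.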
DATABASE SCHEMA
A database schema is the skeleton structure that
represents the logical view of the entire database.
It defines how the data is organized and how the
relations among them are associated.
It formulates all the constraints that are to be
applied on the data.
A database schema defines its entities and the
relationship among them.
It contains a descriptive detail of the database,
which can be depicted by means of schema
diagrams.
It’s the database designers who design the
schema to help programmers understand the
database and make it useful.
A database schema can be divided broadly into
two categories −
Physical Database Schema − This schema
pertains to the actual storage of data and its
form of storage, such as files and indices. It defines
how the data will be stored in secondary storage.
Logical Database Schema − This schema
defines all the logical constraints that need to be
applied on the data stored. It defines tables,
views, and integrity constraints.
DATABASE SCHEMA AND DATABASE INSTANCE:
It is important to distinguish these two terms.
A database schema is the skeleton of the database. It is designed before the database exists. Once the database is operational, it is very difficult to make any changes to the schema.
A database schema does not contain any data or information.
A database instance is the state of an operational database, with its data, at a given time. It is a snapshot of the database.
Database instances tend to change with time. A DBMS ensures that every instance (state) is valid by enforcing all the validations, constraints, and conditions that the database designers have imposed.
KEYS:
A key in a DBMS is an attribute or set of attributes that helps you identify a row (tuple) in a relation (table).
Keys also let you find the relationship between two tables.
A key uniquely identifies a row in a table by a combination of one or more columns in that table.
Why do we need a key?
Keys help you identify any row of data in a table. In a real-world application, a table could contain thousands of records, and some records could hold the same values. Keys ensure that you can still uniquely identify each record.
Keys allow you to establish and identify relationships between tables.
Keys help you enforce identity and integrity in those relationships.
TYPES OF KEY:
1. PRIMARY KEY
The primary key is the key used to identify one and only one instance of an entity uniquely.
An entity can have multiple keys, as we saw in the PERSON table.
The most suitable key from that list becomes the primary key.
In the EMPLOYEE table, ID can be the primary key since it is unique for each employee. License_Number or Passport_Number could also be selected as the primary key, since they are also unique.
For each entity, the choice of primary key depends on the requirements and on the developers.
2. CANDIDATE KEY
A candidate key is a minimal attribute or set of attributes that can uniquely identify a tuple.
The primary key is chosen from the candidate keys; the candidate keys that are not chosen are also called alternate keys.
The candidate keys are as strong as the primary key.
For example: In the EMPLOYEE table, ID is best suited for the primary key. The other unique attributes, such as SSN, Passport_Number and License_Number, are also candidate keys.
3. SUPER KEY
A super key is a set of one or more attributes that uniquely identifies rows in a table.
A super key may have additional attributes that are not needed for unique identification.
A super key is a superset of a candidate key.
For example: In the EMPLOYEE table, consider (EMPLOYEE_ID, EMPLOYEE_NAME). The names of two employees can be the same, but their EMPLOYEE_ID can't be the same. Hence, this combination is also a super key.
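The difference between a super key and a candidate key can be tested mechanically on sample data. The sketch below is only illustrative: the EMPLOYEE rows and the helper names is_super_key and is_candidate_key are assumptions, and a check on one instance can only refute a key, since keys are really a property of the schema.

```python
from itertools import combinations

# Hypothetical EMPLOYEE rows, invented for illustration.
employee = [
    {"EMPLOYEE_ID": 1, "EMPLOYEE_NAME": "Asha", "SSN": "111"},
    {"EMPLOYEE_ID": 2, "EMPLOYEE_NAME": "Ravi", "SSN": "222"},
    {"EMPLOYEE_ID": 3, "EMPLOYEE_NAME": "Asha", "SSN": "333"},
]

def is_super_key(rows, attrs):
    """True if no two rows agree on all the attributes in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def is_candidate_key(rows, attrs):
    """A candidate key is a super key with no proper subset that is a super key."""
    if not is_super_key(rows, attrs):
        return False
    return not any(
        is_super_key(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )

print(is_super_key(employee, ("EMPLOYEE_ID", "EMPLOYEE_NAME")))      # True
print(is_candidate_key(employee, ("EMPLOYEE_ID", "EMPLOYEE_NAME")))  # False: EMPLOYEE_ID alone suffices
print(is_candidate_key(employee, ("EMPLOYEE_ID",)))                  # True
```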
4. FOREIGN KEY
A foreign key is a column (or set of columns) of one table that points to the primary key of another table.
In a company, every employee works in a specific department, and employee and department are two different entities. So we should not store the department's information in the employee table. Instead, we link the two tables through the primary key of one table.
We add the primary key of the DEPARTMENT table, Department_Id, as a new attribute in the EMPLOYEE table.
Now in the EMPLOYEE table, Department_Id is the foreign key, and the two tables are related.
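A foreign key is only useful if every value it holds actually exists as a primary key in the referenced table. The sketch below checks that rule on invented rows; the helper name dangling_references and all the data are hypothetical, and a real DBMS enforces this constraint itself.

```python
# Hypothetical DEPARTMENT and EMPLOYEE rows, invented for illustration.
department = [
    {"Department_Id": 10, "Name": "Sales"},
    {"Department_Id": 20, "Name": "Research"},
]
employee = [
    {"ID": 1, "Name": "Asha", "Department_Id": 10},
    {"ID": 2, "Name": "Ravi", "Department_Id": 20},
    {"ID": 3, "Name": "Meena", "Department_Id": 30},  # no such department
]

def dangling_references(child, parent, fk, pk):
    """Return child rows whose foreign key value matches no parent primary key."""
    parent_keys = {row[pk] for row in parent}
    return [
        row for row in child
        if row[fk] is not None and row[fk] not in parent_keys
    ]

violations = dangling_references(employee, department, "Department_Id", "Department_Id")
print([row["ID"] for row in violations])  # prints: [3]
```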
SCHEMA DIAGRAMS
A database schema, along with primary key and foreign key dependencies, can be depicted by schema diagrams. The figure shows the schema diagram for our university organization.
Each relation appears as a box, with the relation
name at the top in blue, and the attributes listed
inside the box.
Primary key attributes are shown underlined.
Foreign key dependencies appear as arrows from
the foreign key attributes of the referencing
relation to the primary key of the referenced
relation.
RELATIONAL QUERY LANGUAGES
A relational query language is the language by which a user communicates with the database.
The DBMS uses relational algebra to break a user request into operations and then executes them.
Relational query languages can be procedural or non-procedural.
PROCEDURAL QUERY LANGUAGE
A procedural query language uses a sequence of queries that instruct the DBMS to perform operations in order to meet the user request.
For example, a get_CGPA procedure will have several queries: get the marks of a student in each subject, calculate the total marks, and then decide the CGPA based on the total.
A procedural query language tells the database both what is required and how to get it.
Relational algebra is a procedural query language.
NON-PROCEDURAL QUERY LANGUAGE
A non-procedural query is a single query on one or more tables that gets the result from the database.
For example, getting the name and address of the student with a particular ID takes a single query on the STUDENT table.
Relational calculus is a non-procedural language: it states what is required from the tables, but not how to obtain it.
In both kinds of language, queries are expressed over the tables in the database.
RELATIONAL ALGEBRA
Relational algebra is a procedural query language, which
takes instances of relations as input and yields instances of
relations as output.
Like a mathematical expression, it uses operators and operands (tables) to express queries.
An operator can be either unary or binary. Operators accept relations as their input and yield relations as their output.
Because every intermediate result is also a relation, operations can be nested to build larger expressions.
The fundamental operations of relational algebra are :
Select
Project
Union
Set difference
Cartesian product
Rename
1. SELECT OPERATION:
The select operation selects tuples that satisfy a given predicate.
It is denoted by sigma (σ).
Notation: σ p (r)
Where:
σ is the selection operator
r is the relation
p is the selection predicate, a propositional logic formula that may use connectives like AND, OR and NOT.
The terms of the predicate can use comparison operators like =, ≠, ≥, <, >, ≤.
SELECT OPERATION EXAMPLE:
1. σ BRANCH_NAME = "perryride" (LOAN)
Output − Selects tuples from LOAN where BRANCH_NAME is 'perryride'.
2. σ subject = "database" and price = 450 (Books)
Output − Selects tuples from Books where subject is 'database' and price is 450.
3. σ subject = "database" and price = 450 or year > 2010 (Books)
Output − Selects tuples from Books where subject is 'database' and price is 450, or where the book was published after 2010.
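The Books selections above can be mimicked in Python by filtering a list of rows with a predicate. This is a minimal sketch, not a DBMS implementation; the Books rows and the helper name select are assumptions made for illustration.

```python
# Hypothetical Books rows, invented for illustration.
books = [
    {"title": "DB Concepts", "subject": "database", "price": 450, "year": 2008},
    {"title": "SQL Basics", "subject": "database", "price": 300, "year": 2012},
    {"title": "Networks", "subject": "networking", "price": 450, "year": 2005},
]

def select(relation, predicate):
    """sigma_predicate(relation): keep the tuples for which predicate is true."""
    return [row for row in relation if predicate(row)]

# subject = "database" and price = 450
ex2 = select(books, lambda b: b["subject"] == "database" and b["price"] == 450)

# (subject = "database" and price = 450) or year > 2010
ex3 = select(
    books,
    lambda b: (b["subject"] == "database" and b["price"] == 450) or b["year"] > 2010,
)

print([b["title"] for b in ex2])  # ['DB Concepts']
print([b["title"] for b in ex3])  # ['DB Concepts', 'SQL Basics']
```

Select never changes the columns of a relation; it only decides which rows survive.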
PROJECT OPERATION (∏)
This operation lists the attributes that we wish to appear in the result. The rest of the attributes are eliminated from the table.
It is denoted by ∏.
It projects the named column(s) of a relation; unlike select, it takes no predicate.
Notation − ∏ A1, A2, ..., An (r)
Where A1, A2, ..., An are attribute names of relation r.
Duplicate rows are automatically eliminated, as a relation is a set.
Example 1 −
∏ subject, author (Books)
Projects the columns named subject and author from the relation Books.
Example 2 −
∏ Customer_Name, Customer_City (CUSTOMER)
Projects the columns Customer_Name and Customer_City from the relation CUSTOMER.
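Projection and its duplicate elimination can be sketched in Python by building a set of shortened tuples. The Books rows and the helper name project are hypothetical.

```python
# Hypothetical Books rows, invented for illustration.
books = [
    {"title": "DB Concepts", "subject": "database", "author": "Adam"},
    {"title": "DB Design", "subject": "database", "author": "Adam"},
    {"title": "Networks", "subject": "networking", "author": "Bina"},
]

def project(relation, attrs):
    """Keep only the columns in attrs; using a set drops duplicate rows."""
    return {tuple(row[a] for a in attrs) for row in relation}

result = project(books, ("subject", "author"))
print(len(result))  # prints: 2 (the two 'database'/'Adam' rows collapse into one)
```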
UNION OPERATION (∪)
It performs binary union between two given relations.
Notation − r ∪ s
Where r and s are database relations.
For a union operation to be valid, the following conditions must hold −
r and s must have the same number of attributes.
The domains of corresponding attributes must be compatible.
Duplicate tuples are automatically eliminated.
Example 1:
∏ author (Books) ∪ ∏ author (Articles)
Output − Projects the names of the authors who have written a book, an article, or both.
Example 2:
∏ Student_Name (COURSE) ∪ ∏ Student_Name (STUDENT)
Output − No duplicate names appear in the result, even though a few names are common to both tables and the COURSE table itself contains a duplicate name.
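The Student_Name union can be sketched with Python sets. The names are invented, the helper union is hypothetical, and the compatibility check only compares the number of attributes, not their domains.

```python
# Hypothetical Student_Name projections; COURSE contains a duplicate name.
course_names = [("Asha",), ("Ravi",), ("Asha",), ("Meena",)]
student_names = [("Ravi",), ("Kiran",), ("Meena",)]

def union(r, s):
    """r UNION s for relations with the same number of attributes."""
    arities = {len(t) for t in r} | {len(t) for t in s}
    if len(arities) > 1:
        raise ValueError("relations are not union-compatible")
    return set(r) | set(s)  # a set holds each tuple once

result = union(course_names, student_names)
print(sorted(result))  # [('Asha',), ('Kiran',), ('Meena',), ('Ravi',)]
```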
SET DIFFERENCE (−):
Suppose there are two relations R and S.
The set difference operation contains all tuples that are in R but not in S.
It is denoted by the minus sign (−).
Notation: R − S
Example 1:
∏ author (Books) − ∏ author (Articles)
Output − Provides the names of authors who have written books but not articles.
Example 2: Write a query to select those student names that are present in the STUDENT table but not in the COURSE table.
∏ Student_Name (STUDENT) − ∏ Student_Name (COURSE)
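With relations modelled as Python sets of tuples, set difference is the built-in minus operator. The names below are invented for illustration.

```python
# Hypothetical Student_Name projections of STUDENT and COURSE.
student_names = {("Ravi",), ("Kiran",), ("Meena",)}
course_names = {("Asha",), ("Ravi",), ("Meena",)}

# Names present in STUDENT but not in COURSE
only_students = student_names - course_names

# Set difference is not commutative: the reverse gives a different answer
only_course = course_names - student_names

print(only_students, only_course)  # {('Kiran',)} {('Asha',)}
```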
CARTESIAN PRODUCT (×)
The Cartesian product combines each row in one table with each row in the other table.
It is also known as a cross product.
It combines the information of two different relations into one.
Notation − r × s
Where r and s are relations, and the output is defined as −
r × s = { q t | q ∈ r and t ∈ s }
Example:
σ author = 'ADAM' (Books × Articles)
Output − Yields a relation that shows all the books and articles written by ADAM.
Exercise: Find the Cartesian product of tables R and S.
Query: R × S
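A Cartesian product of two small relations can be sketched with itertools.product from the Python standard library. The contents of r and s are invented, and the helper name cartesian_product is hypothetical.

```python
from itertools import product

# Hypothetical relations R (2 tuples, degree 2) and S (3 tuples, degree 1).
r = [(1, "a"), (2, "b")]
s = [("x",), ("y",), ("z",)]

def cartesian_product(r, s):
    """Concatenate every tuple q of r with every tuple t of s."""
    return [q + t for q, t in product(r, s)]

rs = cartesian_product(r, s)
print(len(rs))  # prints: 6 (cardinalities multiply: 2 * 3)
print(rs[0])    # prints: (1, 'a', 'x') (degrees add: 2 + 1)
```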
RENAME OPERATION:
The results of relational algebra expressions are also relations, but without any name.
The rename operation allows us to name the output relation. It is denoted by the small Greek letter rho (ρ).
Example:
ρ(STUDENT1, STUDENT)
This uses the rename operator to rename the STUDENT relation to STUDENT1.
ADDITIONAL AND EXTENDED RELATIONAL ALGEBRA OPERATIONS
The additional operations can be expressed using the fundamental operations.
The basic relational-algebra operations have been extended in several ways.
A simple extension is to allow arithmetic operations as part of projection.
An important extension is to allow aggregate operations such as computing the sum of the elements of a set, or their average.
Another important extension is the outer-join operation, which allows relational-algebra expressions to deal with null values, which model missing information.
ADDITIONAL RELATIONAL ALGEBRA OPERATIONS
Set Intersection operation
Assignment operation
Natural Join operation
Left Outer Join Operation
Right Outer Join Operation
Full Outer Join Operation
SET INTERSECTION (∩):
Suppose there are two relations R and S. The set intersection operation contains all tuples that are in both R and S.
It is denoted by ∩.
Notation: R ∩ S
Set intersection is not a fundamental operation and does not add any power to the relational algebra. It is simply more convenient to write R ∩ S than to write R − (R − S), i.e.
R ∩ S = R − (R − S)
Example 1:
∏ CUSTOMER_NAME (BORROW) ∩ ∏ CUSTOMER_NAME (DEPOSITOR)
Output − Projects the customer names that appear in both BORROW and DEPOSITOR.
Example 2:
∏ Student_Name (COURSE) ∩ ∏ Student_Name (STUDENT)
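The identity R ∩ S = R − (R − S) can be confirmed on sample data with Python sets. The names are invented for illustration.

```python
# Hypothetical relations of single-attribute tuples.
r = {("Asha",), ("Ravi",), ("Meena",)}
s = {("Ravi",), ("Kiran",), ("Meena",)}

direct = r & s                # built-in set intersection
via_difference = r - (r - s)  # intersection using only set difference

print(direct == via_difference)  # prints: True
```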
ASSIGNMENT OPERATOR (←)
Suppose you wish to assign the result of an expression to a relation variable R. For this, we use the assignment operator (←).
Notation of Assignment Operator
R ← E
Where,
R is a relation variable,
E is the expression whose result we wish to assign to R.
The result of the expression on the right-hand side of ← is assigned to the relation variable on the left-hand side. The relation variable may be used in subsequent expressions.
Example:
R1 ← ∏ name (Customer)
R2 ← ∏ name (Employee)
R ← R1 − R2
NATURAL JOIN OPERATION (⋈)
It is denoted by the join symbol ⋈.
The natural join operation forms a Cartesian product of its two arguments, performs a selection forcing equality on those attributes that appear in both relation schemas, and finally removes duplicate attributes.
Notation of Natural Join Operation
P = R ⋈ S
Where,
P is the resultant relation after applying the natural join operation on R and S, and R and S are relations (names of tables).
Example: joining the relations Employee and Salary:
Result = Employee ⋈ Salary
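The three steps of the natural join (product, equality on shared attributes, one copy of each shared column) can be sketched in Python with rows as dictionaries. The Employee and Salary rows and the helper name natural_join are assumptions; the sketch takes the attribute names from the first row of each relation.

```python
# Hypothetical Employee and Salary rows, invented for illustration.
employee = [
    {"EMP_ID": 1, "NAME": "Asha"},
    {"EMP_ID": 2, "NAME": "Ravi"},
    {"EMP_ID": 3, "NAME": "Meena"},  # has no Salary row
]
salary = [
    {"EMP_ID": 1, "SALARY": 50000},
    {"EMP_ID": 2, "SALARY": 42000},
]

def natural_join(r, s):
    """Pair rows that agree on every shared attribute, keeping one copy of each."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]  # attributes in both schemas
    return [
        {**x, **y}  # merging the dicts keeps a single EMP_ID column
        for x in r
        for y in s
        if all(x[a] == y[a] for a in common)
    ]

result = natural_join(employee, salary)
print(len(result))  # prints: 2 (employee 3 has no match and is dropped)
```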
LEFT OUTER JOIN:
Returns all records from the left table, and the matched records from the right table.
It is denoted by ⟕.
The left outer join returns all the tuples of the left table, together with the values of the right table that match them.
If a left tuple has no match in the right table, the right table's attributes are filled with null values.
Notation: A ⟕ B
RIGHT OUTER JOIN:
Returns all records from the right table, and the matched records from the left table.
The right outer join contains all combinations of tuples in R and S that are equal on their common attribute names, plus the tuples in S that have no matching tuples in R, padded with null values.
It is denoted by ⟖.
Notation: A ⟖ B
FULL OUTER JOIN:
Returns all records from both tables, whether or not they have a match in the other table.
A full outer join is like a left or right outer join except that it keeps all rows from both tables.
In addition to the matching combinations, it contains the tuples in R that have no matching tuples in S and the tuples in S that have no matching tuples in R on their common attribute names, each padded with null values.
It is denoted by ⟗.
Notation: A ⟗ B
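The three outer joins differ only in which unmatched tuples they keep. The sketch below builds the right and full outer joins out of a left outer join; the relations a and b, the attribute lists and all function names are hypothetical, and None stands in for NULL.

```python
# Hypothetical relations A(ID, NAME) and B(ID, CITY), invented for illustration.
a = [{"ID": 1, "NAME": "Asha"}, {"ID": 2, "NAME": "Ravi"}]
b = [{"ID": 2, "CITY": "Pune"}, {"ID": 3, "CITY": "Delhi"}]
A_ATTRS, B_ATTRS = ("ID", "NAME"), ("ID", "CITY")

def left_outer_join(r, s, r_attrs, s_attrs):
    """Natural join plus the unmatched rows of r, padded with None."""
    common = [c for c in r_attrs if c in s_attrs]
    out = []
    for x in r:
        matches = [y for y in s if all(x[c] == y[c] for c in common)]
        if matches:
            out.extend({**x, **y} for y in matches)
        else:
            # fill every attribute of s with None, then overlay x's own values
            out.append({**{c: None for c in s_attrs}, **x})
    return out

def right_outer_join(r, s, r_attrs, s_attrs):
    """A right outer join is a left outer join with the arguments swapped."""
    return left_outer_join(s, r, s_attrs, r_attrs)

def full_outer_join(r, s, r_attrs, s_attrs):
    """Left outer join plus the unmatched rows contributed by the right side."""
    out = left_outer_join(r, s, r_attrs, s_attrs)
    for row in right_outer_join(r, s, r_attrs, s_attrs):
        if row not in out:
            out.append(row)
    return out

left = left_outer_join(a, b, A_ATTRS, B_ATTRS)
right = right_outer_join(a, b, A_ATTRS, B_ATTRS)
full = full_outer_join(a, b, A_ATTRS, B_ATTRS)
print(len(left), len(right), len(full))  # prints: 2 2 3
```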
EXTENDED RELATIONAL ALGEBRA OPERATIONS.
Generalized Projection
Aggregate Functions
GENERALIZED PROJECTION
The generalized-projection operation extends the projection
operation by allowing arithmetic functions to be used in the
projection list.
The generalized projection operation has the form
∏ F1,F2,...,Fn (E)
where
E is any relational-algebra expression, and
each of F1, F2, ..., Fn is an arithmetic expression involving constants and attributes in the schema of E.
An expression can use arithmetic operations such as +, −, ∗, and ÷ on numeric-valued attributes, numeric constants, and expressions that generate a numeric result.
Generalized projection also permits operations on other data types, such as concatenation of strings.
Ex. Given the relation instructor(ID, name, dept_name, salary), where salary is the annual salary, get the same information but with the monthly salary:
∏ ID, name, dept_name, salary/12 (instructor)
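The monthly-salary projection can be sketched in Python by computing each output column from a function of the input row. The instructor rows, the output column name monthly_salary and the helper generalized_project are all assumptions made for illustration.

```python
# Hypothetical instructor rows; salary is the annual salary.
instructor = [
    {"ID": "101", "name": "Asha", "dept_name": "Physics", "salary": 96000},
    {"ID": "102", "name": "Ravi", "dept_name": "Music", "salary": 60000},
]

def generalized_project(relation, expressions):
    """expressions maps each output column name to a function of the input row."""
    return [{col: f(row) for col, f in expressions.items()} for row in relation]

monthly = generalized_project(instructor, {
    "ID": lambda t: t["ID"],
    "name": lambda t: t["name"],
    "dept_name": lambda t: t["dept_name"],
    "monthly_salary": lambda t: t["salary"] / 12,  # the arithmetic expression
})

print(monthly[0]["monthly_salary"])  # prints: 8000.0
```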
AGGREGATE FUNCTIONS
An aggregate function takes a collection of values and returns a single value as a result.
avg: average value
min: minimum value
max: maximum value
sum: sum of values
count: number of values
Aggregate operation in relational algebra:
G1, G2, ..., Gn 𝒢 F1(A1), F2(A2), ..., Fn(An) (E)
where
E is any relational-algebra expression
G1, G2, ..., Gn is a list of attributes on which to group (can be empty)
Each Fi is an aggregate function
Each Ai is an attribute name
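Grouping followed by an aggregate function can be sketched in Python with a dictionary of groups. The instructor rows and the helper names aggregate and avg are hypothetical; an empty grouping list treats the whole relation as one group.

```python
from collections import defaultdict

# Hypothetical instructor rows, invented for illustration.
instructor = [
    {"name": "Asha", "dept_name": "Physics", "salary": 96000},
    {"name": "Ravi", "dept_name": "Music", "salary": 60000},
    {"name": "Meena", "dept_name": "Physics", "salary": 84000},
]

def aggregate(relation, group_by, func, attr):
    """Group rows on the group_by attributes, then apply func to attr per group."""
    groups = defaultdict(list)
    for row in relation:
        key = tuple(row[g] for g in group_by)  # empty tuple when group_by is empty
        groups[key].append(row[attr])
    return {key: func(values) for key, values in groups.items()}

def avg(values):
    return sum(values) / len(values)

avg_by_dept = aggregate(instructor, ["dept_name"], avg, "salary")
total = aggregate(instructor, [], sum, "salary")

print(avg_by_dept)  # {('Physics',): 90000.0, ('Music',): 60000.0}
print(total)        # {(): 240000}
```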