Chapter 3: Relational Model and Relational Algebra
Introduction to the relational model
The relational model was proposed by Edgar F. Codd.
The relational model is the primary data model for commercial data-processing applications.
It is a simple and elegant model with a mathematical basis in set theory.
Most modern DBMSs are relational.
Relational algebra operations play a crucial role in query optimization and execution.
Structure of relational databases
A relational database consists of a collection of tables, each of which has a unique name.
In the relational model, the term
• relation (mathematical concept) is used to refer to a table,
• tuple (a sequence (or list) of values) is used to refer to a row/record and
• attribute refers to a column of a table.
Example: consider an ER diagram with entity sets AUTHOR (AUTHOR_ID, NAME, CONTACT_NUMBER, ADDRESS) and BOOK (BOOK_ID, TITLE, CATEGORY, PRICE), connected by the relationship WRITES.
The ER diagram above can be represented in the form of tables as
AUTHOR:
BOOK:
In general, a row in a table represents a relationship among a set of values.
Relation instance
The term relation instance refers to a specific instance of a relation, i.e., one containing a specific
set of rows.
For example, instance of the relation AUTHOR has 4 tuples corresponding to 4 authors and
instance of the relation BOOK has 4 tuples corresponding to 4 books.
The order in which tuples appear in a relation is irrelevant, since a relation is a set of tuples.
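As a rough illustration, a relation instance can be modelled in Python as a set of tuples. The attribute order (AUTHOR_ID, NAME) and the sample rows are assumptions made for this sketch, not the actual AUTHOR instance.

```python
# A relation instance modelled as a set of tuples.
# Assumed attribute order: (AUTHOR_ID, NAME). Sample rows are hypothetical.
author_a = {(101, "Author A"), (102, "Ruskin Bond")}
author_b = {(102, "Ruskin Bond"), (101, "Author A")}

# Listing the tuples in a different order gives the same relation.
same_relation = author_a == author_b

# Adding an existing tuple again has no effect: a set holds no duplicates.
author_a.add((102, "Ruskin Bond"))
tuple_count = len(author_a)
```

Here `same_relation` is true and `tuple_count` stays at 2, which mirrors the statement that tuple order is irrelevant.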
Domain of attribute
For each attribute of a relation, there is a set of permitted values, called the domain of that
attribute.
Thus, the domain of the Salary attribute of the Faculty relation is the set of all possible salary
values, while the domain of the Name attribute is the set of all possible faculty names.
Attribute values are (normally) required to be atomic; that is, indivisible.
Some attribute values may be null values. The null value is a special value that signifies that
the value is unknown or does not exist.
For example, consider the attribute CONTACT_NUMBER in the AUTHOR relation.
An author may not have a phone number at all, or the contact number may be unlisted.
We would then use the null value to signify that the value is unknown or does not exist.
Relation schema
A relation schema consists of a relation name, list of attributes or column names and their
corresponding domains.
If A1, A2, …, An are attributes of relation R, then
R = (A1, A2, …, An ) is a relation schema.
Example:
AUTHOR = (Author_ID, Name, Contact_Number, Address)
BOOK = (Book_ID, Title, Category, Price)
Example: relation schemas for an ER diagram with entity sets STUDENT (STUDENT_ID, STUDENT_NAME, ADDRESS) and COURSE (COURSE_ID, COURSE_NAME, CREDITS), connected by the relationship ENROLLS (YEAR, SEMESTER).
STUDENT = (STUDENT_ID, STUDENT_NAME, ADDRESS)
COURSE = (COURSE_ID, COURSE_NAME, CREDITS)
ENROLLS = (STUDENT_ID, COURSE_ID, YEAR, SEMESTER)
Database schema & instance
Database schema is the logical structure of the database.
Example:
• AUTHOR = (Author_ID, Name, Contact_Number, Address)
• BOOK = (Book_ID, Title, Category, Price)
Database instance is a snapshot of the data in the database at a given instant in time.
For relations AUTHOR and BOOK database instance can be
Keys in relational model
There must be a way to specify how tuples within a given relation are distinguished.
Key is a set of attributes of relation whose values uniquely identify a tuple in any instance.
In other words, no two tuples in a relation are allowed to have exactly the same value for all
attributes.
There are four types of keys in relational model
• Super key
• Candidate key
• Primary key
• Foreign key
SUPER KEY
A super key is a set of one or more attributes that, taken collectively, allow us to identify
uniquely a tuple in the relation.
For example, the ID attribute of the relation STUDENT is sufficient to distinguish one student
tuple from another.
Thus, ID is a super key. The name attribute of student, on the other hand, is not a super key,
because several students might have the same name.
A super key may contain extraneous attributes.
Example: STUDENT = (STUDENT_ID, NAME, ADDRESS)
For the STUDENT relation, the super keys include
• {STUDENT_ID}
• {STUDENT_ID, NAME}
• {STUDENT_ID, NAME, ADDRESS}
CANDIDATE KEY
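A candidate key is a minimal super key, that is, a super key for which no proper subset is also a super key.
It is possible that several distinct sets of attributes could serve as a candidate key.
Example:
STUDENT = (STUDENT_ID, NAME, ADDRESS)
For the STUDENT relation,
• {STUDENT_ID} is a candidate key.
• {STUDENT_ID, NAME} is a super key but not a candidate key, because its proper subset {STUDENT_ID} is already a super key.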
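The difference between super keys and candidate keys can be sketched in Python, taking a candidate key to be a minimal super key. The helper names `is_super_key` and `candidate_keys` and the sample rows are hypothetical. Note also that a key is a property of the schema: this sketch only tests uniqueness in one given instance, so it can rule a key out but cannot prove one.

```python
from itertools import combinations

def is_super_key(rows, attrs):
    """True if no two rows of this instance agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def candidate_keys(rows, all_attrs):
    """Minimal attribute sets that are unique in this instance."""
    found = []
    for size in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, size):
            # A set containing a smaller key is a super key but not minimal.
            if any(set(k) <= set(attrs) for k in found):
                continue
            if is_super_key(rows, attrs):
                found.append(attrs)
    return found

# Hypothetical STUDENT instance.
students = [
    {"STUDENT_ID": 1, "NAME": "Asha", "ADDRESS": "Pune"},
    {"STUDENT_ID": 2, "NAME": "Asha", "ADDRESS": "Mumbai"},
    {"STUDENT_ID": 3, "NAME": "Ravi", "ADDRESS": "Pune"},
    {"STUDENT_ID": 4, "NAME": "Asha", "ADDRESS": "Pune"},
]
keys = candidate_keys(students, ("STUDENT_ID", "NAME", "ADDRESS"))
```

For this instance `keys` contains only `("STUDENT_ID",)`, while `is_super_key(students, ("STUDENT_ID", "NAME"))` is also true because a super key may carry extraneous attributes.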
PRIMARY KEY
Primary key is a candidate key that is chosen by the database designer as the principal
means of identifying tuples within a relation.
Examples:
• STUDENT = (STUDENT_ID, STUDENT_NAME,ADDRESS)
STUDENT_ID is primary key.
• AUTHOR = (AUTHOR_ID, Name, Contact_Number, Address)
AUTHOR_ID is primary key.
• BOOK = (BOOK_ID, Title, Category, Price)
BOOK_ID is primary key.
FOREIGN KEY
A relation (B) may include among its attributes the primary key of another relation (A).
This attribute is called a foreign key from relation B referencing relation A.
The relation B is also called the referencing relation of the foreign key dependency, and
relation A is called the referenced relation of the foreign key.
Example:
Consider two relations, EMPLOYEE and PROJECT having relationship Works_on.
Keys -Example
EMPLOYEE relation
PROJECT relation
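Super key for relation EMPLOYEE can be
= (EMP_ID, FNAME, MNAME, LNAME, CONTACTNUMBER, ADDRESS)
Candidate keys for relation EMPLOYEE can be
= (EMP_ID)
= (FNAME, MNAME, LNAME, CONTACTNUMBER), assuming no two employees share both a full name and a contact number
Note that (EMP_ID, FNAME, MNAME, LNAME) is a super key but not a candidate key, because EMP_ID alone already identifies a tuple.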
Primary key
for relation EMPLOYEE is EMP_ID and
for relation PROJECT is PRJ_ID
Foreign key
EMP_ID in the PROJECT relation is a foreign key referencing the EMPLOYEE relation.
The relation PROJECT is called the referencing relation and relation EMPLOYEE is called
the referenced relation of the foreign key.
Aspects of keys in relational model
• A key (whether primary, candidate, or super) is a property of the entire relation, rather
than of the individual tuples.
• Any two individual tuples in the relation are prohibited from having the same value on the
key attributes at the same time.
• The key represents a constraint in the real-world enterprise being modelled.
• There can be more than one foreign key in a relation scheme.
• The primary key should be chosen such that its attribute values are never, or very rarely,
changed.
For example, Aadhaar (UID) numbers in India are guaranteed never to change.
• Unique identifiers generated by enterprises generally do not change, like EMP_ID.
• It is customary to list the primary key attributes of a relation schema before the other
attributes.
• Emp_ID attribute of Employee is listed first, since it is the primary key.
• Primary key attributes are also underlined.
• Attributes whose values are likely to change should not be part of the primary key.
• Example: Address of a person
• Attributes which are likely to have same value should not be part of the primary key.
• Example : Name of a person
Integrity constraints in relational model
Integrity constraints are necessary conditions to be satisfied by the data values in the
relational instances so that the set of data values constitute a meaningful database.
Types of integrity constraints:
1. Domain Constraints
• Actual values of an attribute in any tuple must belong to the declared domain, i.e., the values
must come from the set of permissible values.
2. Key Constraints
• Two tuples in any relation instance should not have identical values for all key attributes.
• Primary key attributes cannot have null values.
3. Foreign Key Constraints
• Every non-null foreign key value in a relation must appear as a primary key value in the referenced relation (a relation can have one or more foreign keys).
Referential integrity constraint
Referential integrity requires that a foreign key must have
• a matching primary key or
• a null value.
This constraint is specified between two tables (parent and child); it maintains the
correspondence between rows in these tables.
It means the reference from a row in one table to another table must be valid.
Conditions for referential integrity
• The related fields have the same data type.
• Both tables belong to the same database.
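The referential integrity rule (a foreign key value is either null or matches a primary key value in the parent table) can be checked with a few lines of Python. This is only a sketch: the function name `dangling_references` and the sample EMPLOYEE and PROJECT rows are assumptions made for illustration.

```python
def dangling_references(child_rows, fk, parent_rows, pk):
    """Return child rows whose foreign key is neither null nor a parent key."""
    parent_keys = {row[pk] for row in parent_rows}
    return [
        row for row in child_rows
        if row[fk] is not None and row[fk] not in parent_keys
    ]

# Hypothetical parent (EMPLOYEE) and child (PROJECT) rows.
employee = [{"EMP_ID": 101}, {"EMP_ID": 102}]
project = [
    {"PRJ_ID": "P1", "EMP_ID": 101},   # matches a parent row
    {"PRJ_ID": "P2", "EMP_ID": None},  # null is allowed
    {"PRJ_ID": "P3", "EMP_ID": 999},   # no such employee
]
bad_rows = dangling_references(project, "EMP_ID", employee, "EMP_ID")
```

Only the row for project P3 is reported, since 999 does not appear as an EMP_ID in the parent table.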
Referential integrity rules
Mapping the ER and EER model to the Relational model
• A database which conforms to an E-R diagram can be represented by a collection of
tables.
• For each entity set and relationship set there is a unique table which is assigned the name
of the corresponding entity set or relationship set.
• Each table has a number of columns (generally corresponding to attributes), which have
unique names.
• Primary keys allow entity sets and relationship sets to be expressed uniformly as tables
which represent the contents of the database.
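Example: an ER diagram with strong entity set Employee, weak entity set Dependent, and identifying relationship EmpDepn. The attributes shown are EMP_ID, EMP_NAME, DOB, DEPT_NAME, DEPN_NAME, DEPN_RELATION and GENDER.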
REPRESENTING STRONG ENTITY SETS AS TABLES
A strong entity set reduces to a table with the same attributes.
REPRESENTING WEAK ENTITY SETS
A weak entity set becomes a table that includes a column for the primary key of the identifying
strong entity set.
EMP_ID DEP_NAME DEP_RELATION GENDER
101 Arjun Son Male
101 Sara Daughter Female
102 Samit Son Male
Composite primary key = (EMP_ID, DEP_NAME)
EMP_ID is the primary key of the identifying strong entity set (Employee), while DEP_NAME is
the discriminator of the weak entity set.
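One way to try out the composite key of a weak entity set is with Python's built-in `sqlite3` module. The sketch below is an assumption-laden illustration, not the notes' own schema: the table and column names are lowercased, the employee names "E1" and "E2" and the dependent "Meera" are hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT)")
conn.execute("""
    CREATE TABLE dependent (
        emp_id       INTEGER NOT NULL REFERENCES employee(emp_id),
        dep_name     TEXT NOT NULL,
        dep_relation TEXT,
        gender       TEXT,
        PRIMARY KEY (emp_id, dep_name)  -- owner key plus discriminator
    )
""")
conn.executemany("INSERT INTO employee VALUES (?, ?)", [(101, "E1"), (102, "E2")])
conn.executemany(
    "INSERT INTO dependent VALUES (?, ?, ?, ?)",
    [(101, "Arjun", "Son", "Male"),
     (101, "Sara", "Daughter", "Female"),
     (102, "Samit", "Son", "Male")],
)

def try_insert(row):
    """Insert a dependent row; report False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO dependent VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

duplicate_ok = try_insert((101, "Arjun", "Son", "Male"))      # same composite key
orphan_ok = try_insert((999, "Meera", "Daughter", "Female"))  # no such employee
same_name_ok = try_insert((102, "Arjun", "Son", "Male"))      # same name, other owner
```

The first two inserts are rejected, while the third succeeds: DEP_NAME alone does not have to be unique, only the pair (EMP_ID, DEP_NAME).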
REPRESENTING RELATIONSHIP SETS AS TABLES
Representing one-to-many & many-to-many relationship
Representing Many-To-Many Relationship
Mapping EER Model to Relational Model
1. Representing specialization as tables
Method 1:
Method 2:
• Form a table for each entity set that is generalized.
EMPLOYEE
ID NAME ADDRESS CONTACT_NUMBER SALARY
CUSTOMER
Composite and Multivalued attributes to tables
Composite attributes
Composite attributes are flattened out by creating a separate attribute for each component
attribute.
Example: for entity set STUDENT with composite attribute Name with component attributes
first-name, middle name and last-name the table corresponding to the entity set can be
Multivalued attributes
• A multivalued attribute M of an entity E is represented by a separate table EM.
• Table EM has attributes corresponding to the primary key of E and an attribute
corresponding to multivalued attribute M.
• Each value of the multivalued attribute maps to a separate row of the table EM.
• Example: CONTACT_NUMBER of entity set STUDENT
STUDENT
STUDENT_ID CONTACT_NUMBER
201 1234567890
201 2345678901
301 3456789012
701 4567890123
701 5678901234
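The mapping of a multivalued attribute to its own table can be sketched in Python as follows. The dictionary layout and the variable names are assumptions for illustration; the values are those of the table above.

```python
# STUDENT entities keyed by STUDENT_ID, each with a multivalued CONTACT_NUMBER.
students = {
    201: ["1234567890", "2345678901"],
    301: ["3456789012"],
    701: ["4567890123", "5678901234"],
}

# One row of table EM per (primary key of E, single value of M).
student_contact = [
    (student_id, number)
    for student_id, numbers in students.items()
    for number in numbers
]
```

Each contact number becomes its own row, so `student_contact` holds five (STUDENT_ID, CONTACT_NUMBER) pairs.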
Relational Algebra (RA)
• The relational algebra is a procedural query language that means it gives a procedural
method of specifying a query for retrieval of data.
• It consists of a set of operators (unary and binary) that take (one or two) relation
instances as input (arguments) and return new relations as their result.
• SQL queries are internally translated into relational algebra expressions.
• It forms the core component of a relational query engine.
• It provides a framework for query optimization.
Operations in relational algebra
The six fundamental operations in the relational algebra are
• Select (σ) (sigma)
• Project (π) (pi)
• Union (∪)
• Set difference (–)
• Cartesian product (×)
• Rename (ρ) (rho)
In addition to the fundamental operations, there are several other operations—namely,
• set intersection
• Natural join
• assignment
UNARY and BINARY types of operations
Unary operations which operate on one relation are
• Select (σ)
• Project (π)
• Rename (ρ)
Binary operations which operate on pairs of relations are
• Union (∪)
• Set difference (–)
• Cartesian product (×)
Select operation (σ)
The select operation selects tuples that satisfy a given predicate.
Notation: σp (r)
• The lowercase Greek letter sigma (σ) denotes the selection.
• The selection predicate p appears as a subscript to σ.
• The argument relation r is in parentheses after the σ.
Example- Select Operation
AUTHOR
Example: select those tuples of the AUTHOR relation where the author name is Ruskin
Bond.
σ Name = “Ruskin Bond” (AUTHOR)
AUTHOR_ID NAME CONTACT_NUMBER ADDRESS
102 Ruskin Bond 3456789012 India
Select operation with comparison operators and connectives
In general, comparisons can be done using =, ≠, <, ≤, >, and ≥ operators in the selection
predicate.
Furthermore, several predicates can be also combined into a larger predicate by using the
connectives and (∧), or (∨), and not (¬).
Select operation is commutative:
σc1 (σc2( r)) = σc2 (σc1( r))
Examples:
1. To find the employees whose salary is more than 60000
σ SALARY > 60000 (EMPLOYEE)
2. To find the employees whose salary is more than 50000 and who work in the Testing
department
σ SALARY > 50000 ∧ DEPT = “Testing” (EMPLOYEE)
3. To obtain information about employees whose salary is at least 40000 and less than 50000
σ SALARY ≥ 40000 ∧ SALARY < 50000 (EMPLOYEE)
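The selection examples can be mirrored by a minimal Python sketch. The `select` helper and the list-of-dicts representation are assumptions for illustration; the rows are those of the EMPLOYEE table used in the project-operation example of these notes.

```python
EMPLOYEE = [
    {"EMP_ID": 101, "NAME": "Sachin Tendulkar", "SALARY": 45000, "DEPT": "HR"},
    {"EMP_ID": 102, "NAME": "Rahul Dravid", "SALARY": 75000, "DEPT": "Testing"},
    {"EMP_ID": 103, "NAME": "Anil Kumble", "SALARY": 55000, "DEPT": "Testing"},
    {"EMP_ID": 104, "NAME": "Virat Kohli", "SALARY": 70000, "DEPT": "Marketing"},
]

def select(predicate, relation):
    """Selection: keep the tuples for which the predicate is true."""
    return [t for t in relation if predicate(t)]

# 1. SALARY > 60000
high_paid = select(lambda t: t["SALARY"] > 60000, EMPLOYEE)
# 2. SALARY > 50000 AND DEPT = "Testing"
testers = select(lambda t: t["SALARY"] > 50000 and t["DEPT"] == "Testing", EMPLOYEE)
# 3. SALARY >= 40000 AND SALARY < 50000
mid_range = select(lambda t: 40000 <= t["SALARY"] < 50000, EMPLOYEE)

# Selections commute: the order of the two filters does not change the result.
c1 = lambda t: t["SALARY"] > 50000
c2 = lambda t: t["DEPT"] == "Testing"
commutes = select(c1, select(c2, EMPLOYEE)) == select(c2, select(c1, EMPLOYEE))
```

The three results contain employees 102 and 104, employees 102 and 103, and employee 101 respectively.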
Project operation (π)
Project Operation is a unary operation that returns its argument relation, with certain
attributes left out.
Projection is denoted by the uppercase Greek letter pi (π).
We list those attributes that we wish to appear in the result as a subscript to π.
Notation: π A1, A2, …, Ak (r)
where A1, A2, …, Ak are attribute names and r is a relation name.
The result is defined as the relation of k columns obtained by removing the columns that
are not listed.
Duplicate rows are removed from result, since relations are sets.
Example- Project Operation
EMPLOYEE
EMP_ID NAME SALARY DEPT
101 Sachin Tendulkar 45000 HR
102 Rahul Dravid 75000 Testing
103 Anil Kumble 55000 Testing
104 Virat Kohli 70000 Marketing
Example: Display ID and names of all employees.
π EMP_ID, NAME (EMPLOYEE)
EMP_ID NAME
101 Sachin Tendulkar
102 Rahul Dravid
103 Anil Kumble
104 Virat Kohli
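A projection that also removes duplicate rows can be sketched in Python as below. The `project` helper is a hypothetical name, and the rows are the EMPLOYEE rows from the example above.

```python
EMPLOYEE = [
    {"EMP_ID": 101, "NAME": "Sachin Tendulkar", "SALARY": 45000, "DEPT": "HR"},
    {"EMP_ID": 102, "NAME": "Rahul Dravid", "SALARY": 75000, "DEPT": "Testing"},
    {"EMP_ID": 103, "NAME": "Anil Kumble", "SALARY": 55000, "DEPT": "Testing"},
    {"EMP_ID": 104, "NAME": "Virat Kohli", "SALARY": 70000, "DEPT": "Marketing"},
]

def project(attrs, relation):
    """Projection: keep only the listed attributes and drop duplicate rows."""
    result = []
    for t in relation:
        row = tuple(t[a] for a in attrs)
        if row not in result:
            result.append(row)
    return result

id_names = project(["EMP_ID", "NAME"], EMPLOYEE)
depts = project(["DEPT"], EMPLOYEE)
```

Projecting on DEPT yields three rows rather than four, because the two "Testing" rows collapse into one.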
Composition of relational operations
The result of a relational-algebra operation is a relation, and therefore relational-algebra
operations can be composed together into a relational-algebra expression.
Example: Find the IDs and names of employees whose salary is more than 60000.
R1 ← σ SALARY > 60000 (EMPLOYEE)
R2 ← π EMP_ID, NAME (R1)
or it can be represented as
π EMP_ID, NAME (σ SALARY > 60000 (EMPLOYEE))
The output will be
EMP_ID NAME
102 Rahul Dravid
104 Virat Kohli
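The same composition can be sketched with Python set comprehensions, here treating the relation as a set of named tuples. The `Employee` class and its lowercase field names are assumptions introduced for this sketch.

```python
from typing import NamedTuple

class Employee(NamedTuple):
    emp_id: int
    name: str
    salary: int
    dept: str

EMPLOYEE = {
    Employee(101, "Sachin Tendulkar", 45000, "HR"),
    Employee(102, "Rahul Dravid", 75000, "Testing"),
    Employee(103, "Anil Kumble", 55000, "Testing"),
    Employee(104, "Virat Kohli", 70000, "Marketing"),
}

# Step 1: select the tuples with SALARY > 60000.
r1 = {t for t in EMPLOYEE if t.salary > 60000}
# Step 2: project the intermediate result on EMP_ID and NAME.
r2 = {(t.emp_id, t.name) for t in r1}

# The same query written as one nested expression.
nested = {(t.emp_id, t.name) for t in EMPLOYEE if t.salary > 60000}
```

Both forms give the same two tuples, which is what makes step-by-step and nested expressions interchangeable.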
Union operation (∪)
The union operation allows us to combine two relations. It includes all tuples that are in relation
R or in S, and it eliminates duplicate tuples.
Notation: R ∪ S
For R ∪ S to be valid,
1. relations R and S must have the same number of attributes.
2. the attribute domains must be compatible.
That is, the domains of the ith attribute of R and the ith attribute of S must be the same, for all i.
Example: the 2nd column of R deals with the same type of values as does the 2nd column of S.
Example: a training institute conducts different courses. The Batch-1 relation stores data on
regular courses, while the Batch-2 relation stores data on weekend courses.
Batch-1
Course_Id Course_Name
11 Foundation of C
21 C++
31 JAVA
Batch-2
Course_Id Course_Name
91 Python
21 C++
Batch-1 ∪ Batch-2
Course_Id Course_Name
11 Foundation of C
21 C++
31 JAVA
91 Python
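The Batch example maps directly onto Python sets. The `union` helper below is a hypothetical sketch that checks only the number of attributes; it does not check that the domains are compatible.

```python
def union(r, s):
    """Union of two relations stored as sets of tuples, with an arity check."""
    arities = {len(t) for t in r | s}
    if len(arities) > 1:
        raise ValueError("relations must have the same number of attributes")
    return r | s

batch_1 = {(11, "Foundation of C"), (21, "C++"), (31, "JAVA")}
batch_2 = {(91, "Python"), (21, "C++")}

all_courses = union(batch_1, batch_2)
```

The tuple (21, "C++") appears in both inputs but only once in `all_courses`, which therefore has four tuples.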
Set operators on relations
Set-intersection operation ( ∩ )
The set-intersection operation allows us to find tuples that are in both the input relations.
Notation: R ∩ S
For R ∩ S to be valid,
1. relations R and S must have the same number of attributes.
2. The attribute domains must be compatible.
Set difference operation (–)
The set-difference operation allows us to find tuples that are in one relation but are not in
another.
Notation: R – S
This expression produces a relation containing those tuples in R but not in S.
For R – S to be valid,
1. relations R and S must have the same number of attributes.
2. The attribute domains must be compatible.
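Set intersection and set difference can be illustrated with Python's set operators, reusing the Batch-1 and Batch-2 relations from the union example. The variable names are assumptions for this sketch.

```python
batch_1 = {(11, "Foundation of C"), (21, "C++"), (31, "JAVA")}
batch_2 = {(91, "Python"), (21, "C++")}

# Intersection: courses offered in both batches.
both = batch_1 & batch_2

# Difference: courses offered only as regular courses.
only_regular = batch_1 - batch_2

# Difference is not commutative: swapping the operands gives another result.
only_weekend = batch_2 - batch_1
```

Unlike union and intersection, the difference depends on the order of its operands.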
Assignment operation (←)
It is convenient at times to write a relational-algebra expression by assigning parts of it to
temporary relation variables.
The assignment operation is denoted by ← and works like assignment in a programming
language.
Example: Find all faculty in the “COMPUTER” and “EXTC” departments.
Computer ← σ dept_name = “COMPUTER” (FACULTY)
EXTC ← σ dept_name = “EXTC” (FACULTY)
Computer ∪ EXTC
Rename operation (ρ)
The results of relational-algebra expressions do not have a name that we can use to refer to
them. The rename operator, ρ, is provided for that purpose:
ρ (Relation2, Relation1)
Here Relation1 is the existing relation (or expression) and Relation2 is its new name.
For example, to rename the STUDENT_SE relation to STUDENT_TE:
ρ (STUDENT_TE, STUDENT_SE)
Cartesian-product operation (×) (or cross product)
The Cartesian-product operation allows us to combine values from any two relations.
Notation: R × S
The Cartesian product R × S of two relations R and S is computed by concatenating each tuple
t ∈ R with each tuple u ∈ S.
Example: Student
Std_ID Std_Name
101 Sachin
102 Saurav
103 Rahul
104 Virat
Sport
Sports_ID Sports_Name
1 Cricket
2 Football
Student × Sport
Std_ID Std_Name Sports_ID Sports_Name
101 Sachin 1 Cricket
101 Sachin 2 Football
102 Saurav 1 Cricket
102 Saurav 2 Football
103 Rahul 1 Cricket
103 Rahul 2 Football
104 Virat 1 Cricket
104 Virat 2 Football
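The Student and Sport example can be reproduced with `itertools.product` from the Python standard library. Storing each relation as a list of tuples is an assumption made for this sketch.

```python
from itertools import product

student = [(101, "Sachin"), (102, "Saurav"), (103, "Rahul"), (104, "Virat")]
sport = [(1, "Cricket"), (2, "Football")]

# Concatenate every Student tuple t with every Sport tuple u.
student_x_sport = [t + u for t, u in product(student, sport)]

row_count = len(student_x_sport)
```

The result has 4 × 2 = 8 tuples, each with the two Student attributes followed by the two Sport attributes.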
Join operations (⋈ )
The join operation is essentially a Cartesian product followed by a selection criterion.
The join operation is denoted by ⋈.
The JOIN operation allows related tuples from different relations to be joined.
Types of JOINS:
Inner Join:
• Theta join
• Equi join
• Natural join
Outer join:
• Left Outer Join
• Right Outer Join
• Full Outer Join
Inner join
In an inner join, only those tuples that satisfy the matching criteria are included, while the
rest are excluded.
1. Theta join:
The general case of JOIN operation is called a Theta join. It is denoted by symbol θ.
The theta join operation between relations R & S is defined as follows:
R ⋈θ S
where θ represents a condition.
Example:
2. Equi join:
When a theta join uses only equality conditions, it becomes an equi join.
R ⋈ R.column2 = S.column2 S
Example:
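A theta join and its equality-only special case, the equi join, can be sketched in Python as a Cartesian product filtered by a condition. The `theta_join` helper and all three sample relations (`employee`, `grade`, `dept`) are hypothetical; the sketch also assumes the two relations use distinct attribute names, since the matching tuples are merged into one dictionary.

```python
def theta_join(r, s, theta):
    """Pairs (t, u) from the Cartesian product that satisfy theta, merged."""
    return [{**t, **u} for t in r for u in s if theta(t, u)]

# Hypothetical sample relations.
employee = [{"EMP_ID": 101, "SALARY": 45000}, {"EMP_ID": 102, "SALARY": 75000}]
grade = [{"GRADE": "G1", "MIN_SALARY": 40000}, {"GRADE": "G2", "MIN_SALARY": 70000}]
dept = [{"DEPT_EMP_ID": 101, "DEPT": "HR"}, {"DEPT_EMP_ID": 102, "DEPT": "Testing"}]

# Theta join with a >= comparison.
eligible = theta_join(employee, grade, lambda t, u: t["SALARY"] >= u["MIN_SALARY"])

# Equi join: the condition uses equality only.
placed = theta_join(employee, dept, lambda t, u: t["EMP_ID"] == u["DEPT_EMP_ID"])
```

Employee 101 qualifies for grade G1 only, while employee 102 qualifies for both grades, so `eligible` has three rows and `placed` has two.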
3. Natural join (⋈)
Natural join does not use any comparison operator.
It does not concatenate the way a Cartesian product does.
We can perform a Natural Join only if there is at least one common attribute that exists
between two relations.
In addition, the attributes must have the same name and domain.
Natural join acts on those matching attributes where the values of attributes in both the
relations are same.
If R and S are relations without any attributes in common,
that is, the intersection of their attribute sets is empty (R ∩ S = ∅),
then R ⋈ S = R × S
Course
Course_ID Course_Name Department
CS01 Database Computer
EE01 BEE Electrical
ME01 Applied Mechanics Mechanical
Faculty
Department Faculty
Computer A
Electrical B
Mechanical C
Course ⋈ Faculty
Department Course_ID Course_Name Faculty
Computer CS01 Database A
Electrical EE01 BEE B
Mechanical ME01 Applied Mechanics C
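The Course and Faculty example can be reproduced with a short Python sketch of the natural join. The `natural_join` helper is a hypothetical name, and the sketch assumes every tuple in a relation has the same attributes as the first one.

```python
def natural_join(r, s):
    """Natural join of two relations stored as lists of dicts."""
    if not r or not s:
        return []
    common = set(r[0]) & set(s[0])  # attribute names shared by both relations
    return [
        {**t, **u}
        for t in r
        for u in s
        if all(t[a] == u[a] for a in common)
    ]

course = [
    {"Course_ID": "CS01", "Course_Name": "Database", "Department": "Computer"},
    {"Course_ID": "EE01", "Course_Name": "BEE", "Department": "Electrical"},
    {"Course_ID": "ME01", "Course_Name": "Applied Mechanics", "Department": "Mechanical"},
]
faculty = [
    {"Department": "Computer", "Faculty": "A"},
    {"Department": "Electrical", "Faculty": "B"},
    {"Department": "Mechanical", "Faculty": "C"},
]

course_faculty = natural_join(course, faculty)

# With no common attribute the join degenerates to the Cartesian product.
no_common = natural_join([{"A": 1}, {"A": 2}], [{"B": 3}])
```

The shared attribute Department appears once in each result row, and `no_common` has 2 × 1 = 2 rows.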
OUTER JOINS
The inner joins (theta join, equijoin, and natural join) include only those tuples with matching
attributes and the rest are discarded in the resulting relation.
In an outer join, along with tuples that satisfy the matching criteria, we also include some or
all tuples that do not match the criteria.
There are three kinds of outer joins −
1. left outer join (⟕)
2. right outer join (⟖)
3. full outer join (⟗)
1. Left outer join (A ⟕ B)
The left outer join operation keeps all tuples in the left relation. If no matching tuple
is found in the right relation, the attributes of the right relation in the join
result are filled with null values.
2. Right outer join (A ⟖ B)
The right outer join operation keeps all tuples in the right relation. If no matching
tuple is found in the left relation, the attributes of the left relation in
the join result are filled with null values.
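The left and right outer joins can be sketched in Python, with `None` standing in for the null value. The `left_outer_join` helper, the cut-down Course and Faculty rows, and the extra "Civil" course are all assumptions for illustration. Matching is done on the attribute names the two tuples share, as in a natural join.

```python
def left_outer_join(r, s, s_attrs):
    """Keep every tuple of r; pad with None when s has no matching tuple."""
    result = []
    for t in r:
        matches = [
            u for u in s
            if all(t[a] == u[a] for a in t.keys() & u.keys())
        ]
        if matches:
            result.extend({**t, **u} for u in matches)
        else:
            padding = {a: None for a in s_attrs if a not in t}
            result.append({**t, **padding})
    return result

course = [
    {"Course_ID": "CS01", "Department": "Computer"},
    {"Course_ID": "CE01", "Department": "Civil"},  # hypothetical: no faculty row
]
faculty = [
    {"Department": "Computer", "Faculty": "A"},
    {"Department": "Electrical", "Faculty": "B"},  # no course row
]

left = left_outer_join(course, faculty, ["Department", "Faculty"])
# A right outer join is a left outer join with the operands swapped.
right = left_outer_join(faculty, course, ["Course_ID", "Department"])
```

In `left` the Civil course is kept with Faculty set to `None`, and in `right` the Electrical faculty row is kept with Course_ID set to `None`.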
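3. Full outer join (A ⟗ B)
The full outer join operation keeps all tuples from both relations. A tuple with no match in
the other relation is padded with null values for the other relation's attributes.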