Chapter 2 The Relational
Model of Data
THE RELATIONAL MODEL OF DATA 1
Objectives
Understand the relational model and
database design based on the relational model.
Conceptualize data using the relational model.
Understand the basic relational algebra
operators under set semantics.
Express queries using relational algebra.
Contents
2.1 An Overview of Data Models
2.2 Basics of the Relational Model
2.3 An Algebraic Query Language
2.1 An Overview of Data Models
Data model: a collection of concepts for
describing data, including 3 parts:
Structure of the data
Ex: arrays or objects
Operations on the data
Queries and modifications on the data
Constraints on the data
Limitations on the data
2.1 An Overview of Data Models
The relational model, including object-relational
extensions
The semi-structured data model, including XML and
related standards
Semi-structured data resembles trees or graphs rather
than tables or arrays
XML, a way to represent data by hierarchically nested
tagged elements
Operations involve following paths in the tree from an element
to one or more of its nested subelements, and so on
Constraints involve the data type of values associated
with a nested tag
2.1 An Overview of Data Models
2.2 Basics of the Relational Model
2.2.1. Attributes
The relational model represents data as a two-dimensional
table (called a relation)
Each row represents a CUSTOMER
Each column represents a property of
CUSTOMER and is called an “attribute”
2.2.2. Relation Schema
Relation Schema: the name of the relation together with
the set of its attributes.
CUSTOMER(Customer_ID, Tax_ID, Name, Address)
Relation Schema
EMPLOYEE(FName, Minit, LName, Ssn, Bdate, Address, Sex, Salary,
Super_ssn, Dno)
KHOA(MaKhoa, TenKhoa, SoDT)
SINHVIEN(MaSV, HoTen, NgaySinh, QueQuan, MaLop)
2.2.3. Degree
Degree: The total number of attributes in a
relation is called the degree of the relation.
A relation r defined on the attribute set U = {A1,
A2, ..., An} is said to have degree n (an n-ary relation)
Example:
CUSTOMER(Customer_ID, Tax_ID, Name, Address)
STUDENTS(Name, Ssn, Home_phone, Address, Office_phone, Age, Gp
2.2.4. Tuples
A row of a relation is called a tuple (or record)
When we want to write a tuple in isolation, not as
part of a relation, we normally use commas to
separate components
Example: (1234567890, 555-5512222, Munmun, 323
Broadway)
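As a rough illustration of schema, degree, and tuple, the CUSTOMER relation can be sketched in Python. The class name `Customer` and the Python types chosen for each attribute are assumptions made for this sketch; only the attribute names and the sample tuple come from the slides.

```python
from typing import NamedTuple


class Customer(NamedTuple):
    # Attribute names follow the CUSTOMER schema above.
    # The Python types are an assumption for this sketch.
    Customer_ID: int
    Tax_ID: str
    Name: str
    Address: str


# One tuple of the relation, written in isolation.
t = Customer(1234567890, "555-5512222", "Munmun", "323 Broadway")

schema = Customer._fields   # attribute names of the relation schema
degree = len(schema)        # degree = number of attributes

print(schema)
print(degree)
print(t.Name)
```

Each component of `t` is an atomic value, and `degree` is simply the number of attributes in the schema.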
2.2.5. Domains
The relational model requires that each component of
each tuple must be atomic, that is, it must be of some
elementary type such as INTEGER or STRING
It is not permitted for a value to be a record structure,
set, list, array or any type that can have its values
broken into smaller components
A domain is the particular elementary type associated with an attribute
In summary, a domain is a set of acceptable values that a
column is allowed to contain. This is based on various
properties and the data type for the column
2.2.5. Domains (cont’)
Example:
•The domain of Marital Status has a set of
possibilities: Married, Single, Divorced.
•The domain of the MONTH property is the
collection of integers from 1 to 12
•The domain of Salary is the set of all floating-point
numbers greater than 0 and less than 200,000.
•The domain of First Name is the set of character
strings that represent names of people.
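One way to make these domains concrete is to model each one as a membership test. The sketch below is only an illustration: the function names, the attribute names in `domains`, and the choice of Python `int` and `float` as the underlying types are assumptions, not part of the relational model itself.

```python
# Each domain is modeled as a predicate: value -> bool.
MARITAL_STATUS = {"Married", "Single", "Divorced"}


def in_marital_status(v):
    return v in MARITAL_STATUS


def in_month(v):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(v, int) and not isinstance(v, bool) and 1 <= v <= 12


def in_salary(v):
    return isinstance(v, float) and 0 < v < 200_000


# Hypothetical attribute names mapped to their domains.
domains = {
    "Marital_Status": in_marital_status,
    "Month": in_month,
    "Salary": in_salary,
}


def conforms(row):
    """True if every component of the row lies in its attribute's domain."""
    return all(domains[attr](value) for attr, value in row.items())


print(conforms({"Marital_Status": "Single", "Month": 12, "Salary": 50000.0}))
print(conforms({"Marital_Status": "Single", "Month": 13, "Salary": 50000.0}))
```

The second row is rejected because 13 lies outside the MONTH domain.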
2.2.6. Equivalent representation of a relation
Relations are sets of tuples, not lists of tuples
So, the order in which the tuples of a relation are
presented is not important
2.2.7. Relation instances
A relation about CUSTOMER is not static but
changes over time:
◦ We want to insert tuples for new customers as
they appear
◦ We want to edit existing tuples if we get corrected
information about a CUSTOMER
◦ We want to delete a tuple from the database
A set of tuples for a given relation is called an
instance of that relation
2.2.7. Relation instances (cont’)
NOTE: Relation instances do not have duplicate tuples.
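A Python `set` of tuples behaves much like a relation instance under set semantics, so it can serve as a small sketch of the points above. The customer rows below are invented sample data, not taken from the slides.

```python
# A relation instance modeled as a set of tuples
# (Customer_ID, Tax_ID, Name, Address). Sample data is hypothetical.
ann = (1, "T-100", "Ann", "12 Elm St.")
bob = (2, "T-200", "Bob", "34 Oak Rd.")

customers = set()
customers.add(ann)          # insert
customers.add(bob)          # insert
customers.add(ann)          # inserting a duplicate has no effect
size_after_inserts = len(customers)

# Edit: replace the old tuple with the corrected one.
bob_corrected = (2, "T-200", "Bob", "99 Pine Ave.")
customers.remove(bob)
customers.add(bob_corrected)

# Delete.
customers.discard(ann)

# Order of presentation does not matter: both describe the same instance.
same_instance = {ann, bob} == {bob, ann}

print(size_after_inserts)
print(customers)
print(same_instance)
```

After each change, the current contents of `customers` is a new instance of the same relation.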
2.2.8. Keys of relations
A set of attributes forms a key for a relation if we don’t allow
two tuples in a relation instance to have the same values in all
the attributes of the key
Example: in the relation
CUSTOMER(Customer_ID, Tax_ID, Name, Address)
either Customer_ID or Tax_ID can serve as a key
Key (primary key)
Foreign key
Candidate key
Super key
2.2.8.1. Key (primary key)
A PRIMARY KEY is a column or group of columns in a table
that uniquely identifies every row in that table.
◦ Primary keys must contain unique values.
◦ A primary key column cannot have NULL values.
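Suppose R(A1, A2,…,An) and K ⊆ {A1, A2,…,An}. K is
called a key of the relation schema R when K satisfies
the following two conditions simultaneously:
1. K → A1, A2,…,An (tuples that agree on K agree on every attribute)
2. ∄ K’ ⊂ K such that K’ → A1, A2,…,An (no proper subset of K does the same)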
2.2.8.1. Key (primary key) (cont’)
Rules for defining Primary key:
•Two rows can't have the same primary key
value
•Every row must have a primary key
value.
•The primary key field cannot be null.
•The value in a primary key column can
never be modified or updated if any foreign
key refers to that primary key.
Key Attributes - Non Key Attributes
The attributes that participate in a key
are called key attributes.
The attributes that do not participate in any
key are called non-key attributes.
Example:
EMPLOYEE(FName, Minit, LName, Ssn, Bdate, Address, Sex, Salary,
Super_ssn, Dno)
Determine the key for the Movies relation.
There are three movies named “Gone with the
Wind”, each made in a different year, and there
are usually many movies made in the same
year.
2.2.8.2. Foreign key
A foreign key is a set of one or more attributes in
a table that refers to the primary key in another
table.
A FOREIGN KEY is a column (or group of columns) that
creates a relationship between two tables.
The purpose of foreign keys is to maintain data
integrity and to allow navigation between related
rows of two tables. A foreign key acts as a
cross-reference between the two tables.
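The integrity idea behind a foreign key can be sketched as a check that every referencing value exists as a primary key value in the referenced table. The ORDERS relation, the helper name `dangling_references`, and all sample rows below are hypothetical and only serve the illustration.

```python
def dangling_references(child_rows, fk_index, parent_rows, pk_index):
    """Return the child rows whose foreign key value matches no parent key."""
    parent_keys = {row[pk_index] for row in parent_rows}
    return {row for row in child_rows if row[fk_index] not in parent_keys}


# CUSTOMER(Customer_ID, Name): Customer_ID is the primary key.
customer = {(1, "Ann"), (2, "Bob")}

# ORDERS(Order_ID, Customer_ID): Customer_ID refers to CUSTOMER.
orders = {(100, 1), (101, 2), (102, 7)}

bad = dangling_references(orders, fk_index=1, parent_rows=customer, pk_index=0)
print(bad)
```

Order 102 refers to customer 7, which does not exist, so it is reported as a violation. A DBMS that enforces the foreign key would reject such a row.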
2.2.8.2. Foreign key (cont’)
2.2.8.3. Super key
A super key is a set of one or more attributes
that uniquely identifies rows in a table.
A super key may have additional attributes that
are not needed for unique identification.
Example: in
CUSTOMER(Customer_ID, Tax_ID, Name, Address)
{Customer_ID, Name} is a super key, because Customer_ID
alone already identifies each row.
Candidate Key
A candidate key is a minimal set of attributes that
uniquely identifies tuples in a table. In other words, a
candidate key is a super key with no redundant attributes.
The primary key should be selected from the
candidate keys.
Every table must have at least one candidate
key.
A table can have multiple candidate keys but only
a single primary key.
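The difference between super keys and candidate keys can be sketched by testing attribute sets against one instance. Note the limitation: an instance can only show that an attribute set is not a key; whether it really is a key is a statement about all legal instances. The function names and the sample rows are assumptions for this sketch.

```python
from itertools import combinations


def is_superkey(attrs, schema, rows):
    """True if no two rows agree on all attributes in attrs."""
    idx = [schema.index(a) for a in attrs]
    seen = set()
    for row in rows:
        key = tuple(row[i] for i in idx)
        if key in seen:
            return False
        seen.add(key)
    return True


def candidate_keys(schema, rows):
    """Minimal super keys of this instance, smallest attribute sets first."""
    found = []
    for size in range(1, len(schema) + 1):
        for attrs in combinations(schema, size):
            # A superset of an already found key is not minimal.
            if any(set(k) <= set(attrs) for k in found):
                continue
            if is_superkey(attrs, schema, rows):
                found.append(attrs)
    return found


schema = ("Customer_ID", "Tax_ID", "Name", "Address")
rows = [  # hypothetical sample data
    (1, "T-100", "Ann", "12 Elm St."),
    (2, "T-200", "Ann", "34 Oak Rd."),
    (3, "T-300", "Bob", "34 Oak Rd."),
    (4, "T-400", "Ann", "12 Elm St."),
]

print(is_superkey(("Customer_ID", "Name"), schema, rows))
print(candidate_keys(schema, rows))
```

Here {Customer_ID, Name} is a super key but not a candidate key, since Customer_ID alone is enough.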
2.2.9. Database schema
Database schema = collection of relation schemas
2.3 An Algebraic Query Language
Relational Algebra
An algebra consists of operators and atomic
operands
Relational algebra is an example of an algebra,
its atomic operands are
Variables that stand for relations
Constants, which are finite relations
Relational algebra is a set of operations on
relations
Operations operate on one or more relations to
create new relation
2.3 An Algebraic Query Language
Relational algebra operations fall into four classes
Set operations – union, intersection, difference
Selection and projection
Cartesian product and joins
Rename
2.3 An Algebraic Query Language
Set operations: R and S must be ‘type compatible’
◦ The same number of attributes
◦ The domains of corresponding attributes must be compatible
Union
R ∪ S = { t | t ∈ R ∨ t ∈ S }
Intersection
R ∩ S = { t | t ∈ R ∧ t ∈ S }
Difference
R \ S = { t | t ∈ R ∧ t ∉ S }
Intersection can be expressed
in terms of set difference
R ∩ S = R \ (R \ S)
Set operations- Example
Relation R
name            address                       gender  birthdate
Carrie Fisher   123 Maple St., Hollywood      F       9/9/99
Mark Hamill     456 Oak Rd., Brentwood        M       8/8/88

Relation S
name            address                       gender  birthdate
Carrie Fisher   123 Maple St., Hollywood      F       9/9/99
Harrison Ford   789 Palm Dr., Beverly Hills   M       8/8/88
Set operations- Example
R ∪ S
name            address                       gender  birthdate
Carrie Fisher   123 Maple St., Hollywood      F       9/9/99
Mark Hamill     456 Oak Rd., Brentwood        M       8/8/88
Harrison Ford   789 Palm Dr., Beverly Hills   M       8/8/88

R ∩ S
name            address                       gender  birthdate
Carrie Fisher   123 Maple St., Hollywood      F       9/9/99

R \ S
name            address                       gender  birthdate
Mark Hamill     456 Oak Rd., Brentwood        M       8/8/88
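The three set operations map directly onto Python's built-in `set` operators, which gives a quick way to check the example. This is only a sketch: representing each relation as a set of plain tuples (with the schema kept implicit) is an assumption of the illustration.

```python
# Each tuple is (name, address, gender, birthdate).
R = {
    ("Carrie Fisher", "123 Maple St., Hollywood", "F", "9/9/99"),
    ("Mark Hamill", "456 Oak Rd., Brentwood", "M", "8/8/88"),
}
S = {
    ("Carrie Fisher", "123 Maple St., Hollywood", "F", "9/9/99"),
    ("Harrison Ford", "789 Palm Dr., Beverly Hills", "M", "8/8/88"),
}

union = R | S          # tuples in R or in S
intersection = R & S   # tuples in both R and S
difference = R - S     # tuples in R but not in S

# Intersection expressed through difference: R ∩ S = R \ (R \ S)
via_difference = R - (R - S)

print(len(union))
print(intersection)
print(difference)
print(intersection == via_difference)
```

The Carrie Fisher tuple appears only once in the union because relations are sets.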
Selection and projection
Selection
- R1 := σC(R2), where C is a condition on the attributes of R2
- ex: σ<C1>(σ<C2>(R)) = σ<C2>(σ<C1>(R)) = σ<C1> AND <C2>(R)

Movies
title                year  length  genre
Gone With the Wind   1939  231     Drama
Star Wars            1977  124     Scifi
Wayne’s World        1992  95      Comedy

σlength≥100(Movies)
title                year  length  genre
Gone With the Wind   1939  231     Drama
Star Wars            1977  124     Scifi
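Selection can be sketched as a filter over the tuples of a relation. The helper `select` and the second condition (`year > 1950`) are assumptions introduced to illustrate that cascaded selections commute and can be merged with AND.

```python
def select(rows, condition):
    """σ_condition(rows): keep the tuples for which condition holds."""
    return {row for row in rows if condition(row)}


# Movies(title, year, length, genre)
movies = {
    ("Gone With the Wind", 1939, 231, "Drama"),
    ("Star Wars", 1977, 124, "Scifi"),
    ("Wayne's World", 1992, 95, "Comedy"),
}


def long_enough(m):
    return m[2] >= 100      # length >= 100


def recent(m):
    return m[1] > 1950      # hypothetical second condition


long_movies = select(movies, long_enough)

# The order of cascaded selections does not matter,
# and both equal a single selection with the conditions joined by AND.
a = select(select(movies, recent), long_enough)
b = select(select(movies, long_enough), recent)
c = select(movies, lambda m: long_enough(m) and recent(m))

print(long_movies)
print(a == b == c)
```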
Selection and projection
Projection S := πA1,A2,…,An(R)
A1,A2,…,An are attributes of R
S has the relation schema S(A1,A2,…,An)

Movies
title           year  length  genre
Star Wars       1977  124     Scifi
Galaxy Quest    1999  104     Comedy
Wayne’s World   1992  95      Comedy

πtitle,year,length(Movies)
title           year  length
Star Wars       1977  124
Galaxy Quest    1999  104
Wayne’s World   1992  95

πgenre(Movies)
genre
Scifi
Comedy
(the duplicate Comedy is eliminated, because the result is a set)
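Projection can be sketched as picking components by attribute name and collecting the results into a set, which removes duplicates automatically. The helper `project` and the explicit schema tuple are assumptions of this sketch.

```python
def project(schema, rows, attrs):
    """π_attrs(rows): keep only the listed attributes; duplicates collapse."""
    idx = [schema.index(a) for a in attrs]
    return tuple(attrs), {tuple(row[i] for i in idx) for row in rows}


movies_schema = ("title", "year", "length", "genre")
movies = {
    ("Star Wars", 1977, 124, "Scifi"),
    ("Galaxy Quest", 1999, 104, "Comedy"),
    ("Wayne's World", 1992, 95, "Comedy"),
}

s1_schema, s1 = project(movies_schema, movies, ["title", "year", "length"])
s2_schema, s2 = project(movies_schema, movies, ["genre"])

print(s1_schema, len(s1))
print(s2_schema, s2)
```

The genre projection has two tuples, not three, because the two Comedy values become a single tuple.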
Cartesian product and joins
Cartesian product R3 := R1 × R2

Relation R
A  B
1  2
3  4

Relation S
B  C   D
2  5   6
4  7   8
9  10  11

Cartesian Product R × S
A  R.B  S.B  C   D
1  2    2    5   6
1  2    4    7   8
1  2    9    10  11
3  4    2    5   6
3  4    4    7   8
3  4    9    10  11
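A minimal sketch of the Cartesian product pairs every tuple of R with every tuple of S and qualifies attribute names that occur in both schemas. The helper name `product` and its parameters are assumptions for this illustration.

```python
def product(r_schema, r_rows, s_schema, s_rows, r_name="R", s_name="S"):
    """R × S: concatenate every R tuple with every S tuple."""
    shared = set(r_schema) & set(s_schema)
    # Attributes present in both schemas are prefixed with the relation name.
    schema = tuple(
        [f"{r_name}.{a}" if a in shared else a for a in r_schema]
        + [f"{s_name}.{a}" if a in shared else a for a in s_schema]
    )
    rows = {r + s for r in r_rows for s in s_rows}
    return schema, rows


R_schema, R = ("A", "B"), {(1, 2), (3, 4)}
S_schema, S = ("B", "C", "D"), {(2, 5, 6), (4, 7, 8), (9, 10, 11)}

schema, rows = product(R_schema, R, S_schema, S)
print(schema)
print(len(rows))
```

With 2 tuples in R and 3 in S, the product has 2 × 3 = 6 tuples.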
Cartesian product and joins
Theta join R3 := R1 ⋈<join condition> R2

Relation U
A  B  C
1  2  3
6  7  8
9  7  8

Relation V
B  C  D
2  3  4
2  3  5
7  8  10

Figure 2.17: Result of U ⋈ A<D V
A  U.B  U.C  V.B  V.C  D
1  2    3    2    3    4
1  2    3    2    3    5
1  2    3    7    8    10
6  7    8    7    8    10
9  7    8    7    8    10

Result of U ⋈ A<D AND U.B≠V.B V
A  U.B  U.C  V.B  V.C  D
1  2    3    7    8    10
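A theta join can be read as a Cartesian product followed by a selection on the join condition. The sketch below follows that reading; the helper `theta_join` and the use of positional indexes instead of attribute names are simplifying assumptions.

```python
def theta_join(left, right, condition):
    """Pair every left tuple with every right tuple, keep pairs satisfying condition."""
    return {l + r for l in left for r in right if condition(l, r)}


# U(A, B, C) and V(B, C, D)
U = {(1, 2, 3), (6, 7, 8), (9, 7, 8)}
V = {(2, 3, 4), (2, 3, 5), (7, 8, 10)}

# U ⋈ A<D V  (A is u[0], D is v[2])
a_lt_d = theta_join(U, V, lambda u, v: u[0] < v[2])

# U ⋈ A<D AND U.B≠V.B V  (U.B is u[1], V.B is v[0])
a_lt_d_and_b_differs = theta_join(
    U, V, lambda u, v: u[0] < v[2] and u[1] != v[0]
)

print(sorted(a_lt_d))
print(a_lt_d_and_b_differs)
```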
Cartesian product and joins
Natural join R3 := R1 ⋈ R2

Relation R
A  B
1  2
3  4

Relation S
B  C   D
2  5   6
4  7   8
9  10  11

Natural Join R ⋈ S
A  B  C  D
1  2  5  6
3  4  7  8
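The natural join pairs tuples that agree on all attributes the two schemas share and keeps each shared attribute only once. A minimal sketch, assuming relations are given as a schema tuple plus a set of tuples (the helper name `natural_join` is hypothetical):

```python
def natural_join(r_schema, r_rows, s_schema, s_rows):
    """R ⋈ S: join on all attributes common to both schemas."""
    shared = [a for a in r_schema if a in s_schema]
    s_only = [a for a in s_schema if a not in shared]
    schema = tuple(r_schema) + tuple(s_only)
    rows = set()
    for r in r_rows:
        r_d = dict(zip(r_schema, r))
        for s in s_rows:
            s_d = dict(zip(s_schema, s))
            # Keep the pair only if it agrees on every shared attribute.
            if all(r_d[a] == s_d[a] for a in shared):
                rows.add(r + tuple(s_d[a] for a in s_only))
    return schema, rows


R_schema, R = ("A", "B"), {(1, 2), (3, 4)}
S_schema, S = ("B", "C", "D"), {(2, 5, 6), (4, 7, 8), (9, 10, 11)}

schema, rows = natural_join(R_schema, R, S_schema, S)
print(schema)
print(sorted(rows))
```

The S tuple (9, 10, 11) matches no tuple of R on B, so it does not appear in the result.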
Rename
The operation gives a new schema to a relation
ρS(A1,…,An)(R) makes S be a relation with attributes A1,
…,An and the same tuples as R
Simplified notation: S(A1,A2,…,An) := R

Relation R
A  B
1  2
3  4

Relation S
B  C   D
2  5   6
4  7   8
9  10  11

R × ρS(X,C,D)(S)
A  B  X  C   D
1  2  2  5   6
1  2  4  7   8
1  2  9  10  11
3  4  2  5   6
3  4  4  7   8
3  4  9  10  11
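Renaming changes only the schema, never the tuples. The sketch below renames S(B, C, D) to S(X, C, D) so that the product with R(A, B) needs no qualified names such as R.B and S.B. The helper `rename` is a hypothetical name for this illustration.

```python
def rename(rows, new_schema):
    """ρ: give a relation a new schema; the tuples stay the same."""
    rows = set(rows)
    if any(len(row) != len(new_schema) for row in rows):
        raise ValueError("new schema must have one name per component")
    return tuple(new_schema), rows


R_schema, R = ("A", "B"), {(1, 2), (3, 4)}
S_schema, S = ("B", "C", "D"), {(2, 5, 6), (4, 7, 8), (9, 10, 11)}

# ρS(X,C,D)(S)
S2_schema, S2 = rename(S, ("X", "C", "D"))

# R × ρS(X,C,D)(S): the schemas are now disjoint, so names need no prefix.
product_schema = R_schema + S2_schema
product_rows = {r + s for r in R for s in S2}

print(product_schema)
print(len(product_rows))
```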
Relational Expression
Why we need relational expressions
Relational algebra allows us to form expressions
A relational expression is constructed by applying
operations to the results of other operations
Expressions can be presented as expression
trees
The role of relational algebra in a DBMS
Relational Expression
Example: What are the titles and years of movies
made by Fox that are at least 100 minutes long?
◦ (1) Select those Movies tuples that have length
≥ 100
◦ (2) Select those Movies tuples that have
studioName=‘Fox’
◦ (3) Compute the intersection of (1) and (2)
◦ (4) Project the relation from (3) onto attributes
title and year
Relational Expression
              πtitle,year
                   |
                   ∩
                /     \
   σlength≥100         σstudioName=‘Fox’
        |                     |
     Movies                Movies
Figure 2.18: Expression tree for a relational algebra expression
The same query in linear notation, in two equivalent forms:
πtitle,year(σlength≥100(Movies) ∩ σstudioName=‘Fox’(Movies))
πtitle,year(σlength≥100 AND studioName=‘Fox’(Movies))
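Both forms of the expression can be evaluated on a small instance to see that they agree. The Movies schema used here (title, year, length, studioName), the helper functions, and all sample rows, including the invented "Short Fox Film", are assumptions made for this sketch.

```python
def select(rows, condition):
    return {row for row in rows if condition(row)}


def project(rows, positions):
    return {tuple(row[i] for i in positions) for row in rows}


# Movies(title, year, length, studioName): illustrative data only.
TITLE, YEAR, LENGTH, STUDIO = range(4)
movies = {
    ("Star Wars", 1977, 124, "Fox"),
    ("Gone With the Wind", 1939, 231, "MGM"),
    ("Wayne's World", 1992, 95, "Paramount"),
    ("Short Fox Film", 2000, 80, "Fox"),   # hypothetical movie
}

# Steps (1) to (4): two selections, intersection, projection.
long_enough = select(movies, lambda m: m[LENGTH] >= 100)
by_fox = select(movies, lambda m: m[STUDIO] == "Fox")
answer_tree = project(long_enough & by_fox, [TITLE, YEAR])

# Equivalent form: a single selection with both conditions joined by AND.
answer_and = project(
    select(movies, lambda m: m[LENGTH] >= 100 and m[STUDIO] == "Fox"),
    [TITLE, YEAR],
)

print(answer_tree)
print(answer_tree == answer_and)
```

Each intermediate result is itself a relation, which is what allows operations to be nested into an expression tree.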
Exercise