DATS310
Review of Relational Database
DR. RICHA SHARMA
COMMONWEALTH UNIVERSITY
Introduction
Introduction
Database
Relational Database
Entity Relationship Diagrams
Few Key Terms
Review of basic SQL
Introduction
Before databases, data was kept in file-based storage such
as text files or spreadsheets (Excel).
Challenges with file-based storage:
Data Redundancy leading to data inconsistency and
integrity issues, lack of security.
Data Anomalies such as update, insertion, deletion
anomalies
Dependence on structure of data
Difficult to get quick answers; limited data sharing
Database
Database: an organized collection of data or structured
information that allows computer-based systems to store,
manage, and retrieve data very quickly.
Such computer-based systems are known as database
management systems (DBMS).
Various database designs/architectures explored earlier
include: network, hierarchical and relational database.
Unlike hierarchical and network database models,
relational database models allow the database designers
to focus on the logical representation of the data and its
relationships, rather than on the physical storage details.
Benefits of Database
Better data integration within an organization.
Improved data sharing and security
Fewer data anomalies (a well-designed schema minimizes them).
Improved and better access to data
Reduced data inconsistency
Relational Database
Relational databases: based on mathematical theory of
Relations and Relational Algebra.
Although the model rests on a mathematical construct, it is
easier to think of a relation as a table: a 2D structure
organized in the form of rows and columns.
Data is actually stored in a set of relations (tables).
Each table contains a set of records or tuples (rows).
Each record contains a set of attributes or fields (columns).
Each record has an attribute or set of attributes that
uniquely identifies that row – known as a primary key.
Relational Database (ctd.)
A table may contain an attribute or set of attributes that
refers to another table’s primary key. This reference is
called a foreign key.
Relational models enable us to view data logically rather
than physically!
The logical view of a relational database is built by defining
data relationships on the logical construct of a relation; its
pictorial representation is the Entity-Relationship diagram.
Entity-Relationship (ER) Diagrams
Entity: anything about which data are to be collected and
stored. Example - Customers, Books, Students, etc.
Attribute: a characteristic of an entity. Example: Customer
Id, Student Id, Book ISBN, Book Author etc.
Relationship: describes an association among entities.
Relationships can be unary, binary, ternary, etc. Types:
One-to-one (1:1) relationship
One-to-many (1:M) relationship
Many-to-many (M:N) relationship
Constraint: a restriction placed on the data. Example:
Book ISBN must be a 13-digit number.
ER Diagram – an example
Review of Basic SQL
Structured Query Language – works with well-structured
relational databases!
SQL functions fit into two broad categories:
Data definition language (DDL)
Data manipulation language (DML)
DDL allows us to create and change the structure of a
table and to define keys. E.g. create table, alter table etc.
DML allows us to change the contents of a table, not its
structure. E.g. select, insert etc.
Review of DDL
Using DDL, we can:
Create and drop table
Alter table structure
Define keys
Define constraints
An example of creating the table:
CREATE TABLE EMPLOYEE (
empId INTEGER PRIMARY KEY,
name TEXT NOT NULL,
deptId TEXT NOT NULL,
salary NUMERIC(8,2)
);
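The statement above can be tried without installing a database server. This is a minimal sketch that assumes SQLite through Python's standard sqlite3 module (the slides do not name a DBMS); it creates the table and reads the column definitions back.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the course DBMS.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE EMPLOYEE (
        empId  INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        deptId TEXT NOT NULL,
        salary NUMERIC(8,2)
    )
""")

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, default_value, pk)
columns = conn.execute("PRAGMA table_info(EMPLOYEE)").fetchall()
column_names = [col[1] for col in columns]
pk_columns = [col[1] for col in columns if col[5] > 0]
not_null_columns = [col[1] for col in columns if col[3] == 1]
print(column_names, pk_columns, not_null_columns)
```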
Review of DDL – Data Types
The pattern for each field in create table is:
attribute_name datatype properties
SQL (99) Data Types:
char(n) and varchar(n)
text
Integer, numeric(p,d) and decimal(p,d),
binary, Boolean … and a few more!
VARCHAR is best for storing short to medium-length strings,
while TEXT is better suited for storing large amounts of textual
data.
Data Types
In CHAR(n), if the string is shorter than the fixed length n,
it is padded with trailing spaces to fill the full length.
In VARCHAR(n), if the string is shorter than the maximum
length n, it is stored as is, without padding.
NUMERIC data type is strict: it enforces exactly the precision
and scale specified. In the SQL standard, DECIMAL may
instead allow a precision greater than the one stated; many
DBMSs treat the two types as synonyms.
Oracle data types include most of SQL99 plus others:
varchar2(n) – variable-length string, up to 4000 bytes by default
BLOB – Binary Large Object (like an image) up to 4GB
Another example: Create table
This example defines two extra properties - a validation check
on the gender character and a foreign key referencing a table
called department with a PK attribute called dId:
CREATE TABLE Employees2
(
empId INTEGER PRIMARY KEY,
name TEXT NOT NULL,
gender CHAR(1) CONSTRAINT validate_gender
CHECK (gender IN ('M', 'F')),
salary NUMERIC(8,2),
deptId INTEGER,
FOREIGN KEY(deptId) REFERENCES
department(dId)
);
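To see the two extra properties at work, here is a hedged sketch using SQLite through Python's sqlite3 module (an assumption; other DBMSs behave similarly but report errors differently). The department table's dName column, the sample rows, and the try_insert helper are made up for illustration. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks FKs only when enabled
conn.execute("CREATE TABLE department (dId INTEGER PRIMARY KEY, dName TEXT)")
conn.execute("""
    CREATE TABLE Employees2 (
        empId  INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        gender CHAR(1) CONSTRAINT validate_gender CHECK (gender IN ('M', 'F')),
        salary NUMERIC(8,2),
        deptId INTEGER,
        FOREIGN KEY (deptId) REFERENCES department(dId)
    )
""")
conn.execute("INSERT INTO department VALUES (1, 'Sales')")

def try_insert(row):
    """Hypothetical helper: True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO Employees2 VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

valid_row = try_insert((1, "Ava", "F", 52000.00, 1))    # passes both checks
bad_gender = try_insert((2, "Ben", "X", 48000.00, 1))   # violates validate_gender
bad_dept = try_insert((3, "Cal", "M", 45000.00, 99))    # no department with dId 99
print(valid_row, bad_gender, bad_dept)
```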
Another example (ctd.)
Other foreign key options: Assume we are creating a
dependent table that references an employees table then we
might use:
FOREIGN KEY(emp_Id) REFERENCES
Employees(empId) ON DELETE
CASCADE
This would ensure that if we delete an employee from the
Employees table, the corresponding dependent rows are
deleted as well.
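The cascade can be observed with a small sketch, again assuming SQLite via Python's sqlite3 module. The Dependents table, its depId and name columns, and the sample rows are hypothetical; only the FOREIGN KEY clause comes from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to act on the FK
conn.execute("CREATE TABLE Employees (empId INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE Dependents (
        depId  INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        emp_Id INTEGER,
        FOREIGN KEY (emp_Id) REFERENCES Employees(empId) ON DELETE CASCADE
    )
""")
conn.execute("INSERT INTO Employees VALUES (1, 'Clark'), (2, 'Ava')")
conn.execute(
    "INSERT INTO Dependents VALUES (10, 'Kid A', 1), (11, 'Kid B', 1), (12, 'Kid C', 2)"
)

# Deleting employee 1 also removes dependents 10 and 11.
conn.execute("DELETE FROM Employees WHERE empId = 1")
remaining = conn.execute("SELECT depId FROM Dependents ORDER BY depId").fetchall()
print(remaining)
```

Without ON DELETE CASCADE, the same DELETE would be rejected while dependent rows still reference the employee.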
Keys
Each row in a table must be uniquely identifiable!
Key is one or more attributes that determine other attributes.
Key’s role is based on determination - if you know the value of
attribute A, you can determine the value of attribute B.
Functional dependence: attribute B is functionally dependent
on A if the value of A determines the value of B:
A→B
Example: SSN → Birthdate
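Functional dependence can be checked against sample data: A → B is violated as soon as one value of A appears with two different values of B. The sketch below uses a hypothetical fd_holds helper and made-up rows. A sample can only refute a dependency, never prove it for all future data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if, in these rows, each lhs value maps to exactly one rhs value."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[b] for b in rhs)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True

# Made-up sample rows for illustration only.
people = [
    {"ssn": "111", "name": "Ava", "birthdate": "1990-01-01"},
    {"ssn": "222", "name": "Ben", "birthdate": "1985-06-15"},
    {"ssn": "333", "name": "Ava", "birthdate": "2001-03-09"},
]

ssn_determines_birthdate = fd_holds(people, ["ssn"], ["birthdate"])
name_determines_birthdate = fd_holds(people, ["name"], ["birthdate"])
print(ssn_determines_birthdate, name_determines_birthdate)
```

Here ssn determines birthdate, while name does not, because two people named Ava have different birthdates.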
Keys (ctd.)
Composite key - Composed of more than one attribute
Key attribute - Any attribute that is part of a key
Superkey - Any key that uniquely identifies each row. Any
superset of PK is also a superkey.
Candidate key - A superkey without unnecessary attributes
(minimal superkey). An alternate key is one of the candidate
keys not chosen as primary.
Primary Key – is the selected or chosen candidate key to be
the primary key!
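The difference between a superkey and a candidate key can be illustrated on sample rows. This is a sketch only: the helper names, the email attribute, and the data are assumptions, and real keys are declared by the designer rather than inferred from a sample.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows share the same values for attrs."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)

def candidate_keys(rows, all_attrs):
    """Minimal superkeys: try small attribute sets first, skip supersets of found keys."""
    found = []
    for size in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, size):
            if any(set(key) <= set(attrs) for key in found):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(rows, attrs):
                found.append(attrs)
    return found

# Made-up sample rows; email is a hypothetical attribute.
employees = [
    {"empId": 1, "email": "a@example.org", "name": "Ava", "deptId": "Sales"},
    {"empId": 2, "email": "b@example.org", "name": "Ben", "deptId": "Sales"},
    {"empId": 3, "email": "c@example.org", "name": "Ava", "deptId": "IT"},
    {"empId": 4, "email": "d@example.org", "name": "Ava", "deptId": "Sales"},
]

keys = candidate_keys(employees, ["empId", "email", "name", "deptId"])
print(keys)
print(is_superkey(employees, ("empId", "name")))
```

In this sample (empId, name) is a superkey but not a candidate key, since empId alone already identifies each row. If empId is chosen as the primary key, email is an alternate key.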
Keys (ctd.)
Secondary key - Key used strictly for data retrieval purposes.
Foreign key (FK) - An attribute whose values match primary
key values in a related table.
Referential integrity constraint - FK contains a value that
refers to an existing valid tuple (row) in another relation.
Surrogate Key - Created to replace a natural key when the natural
key is a poor candidate (e.g. Student ID rather than SSN)
Nulls
Indicates an empty field - no data entry!
Nulls are not permitted in primary key.
Should be avoided in other attributes, but sometimes needed
as nulls can represent:
An unknown attribute value
A known, but missing, attribute value
Nulls may create problems with aggregate functions such as
COUNT, AVG, and SUM!
Nulls can create logical problems when relational tables are
linked.
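A short sketch of how nulls affect aggregates and comparisons, assuming SQLite through Python's sqlite3 module and made-up rows: aggregates skip NULL values, and a comparison with = NULL never matches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (empId INTEGER PRIMARY KEY, name TEXT NOT NULL, salary NUMERIC(8,2))"
)
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [(1, "Clark", 1000), (2, "Ava", 3000), (3, "Ben", None)],  # Ben's salary is NULL
)

# COUNT(*) counts rows; COUNT(salary), AVG and SUM ignore the NULL salary.
total_rows, salaried, avg_salary, sum_salary = conn.execute(
    "SELECT COUNT(*), COUNT(salary), AVG(salary), SUM(salary) FROM EMPLOYEE"
).fetchone()

# "= NULL" evaluates to unknown for every row; IS NULL is the correct test.
null_eq = conn.execute("SELECT COUNT(*) FROM EMPLOYEE WHERE salary = NULL").fetchone()[0]
null_is = conn.execute("SELECT COUNT(*) FROM EMPLOYEE WHERE salary IS NULL").fetchone()[0]
print(total_rows, salaried, avg_salary, sum_salary, null_eq, null_is)
```

The average is taken over the two known salaries, not over all three employees, which is easy to misread in a report.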
Other DDL commands
To remove (drop) a table:
DROP TABLE employees;
To modify a table's structure, we alter it as:
ALTER TABLE employees ADD join_date
DATE;
In this example, we have added a join_date column. We can
add or remove columns, change data types, add constraints,
etc. using the ALTER TABLE command.
Existing data in the table must meet the new criteria or the
system will report an error.
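The following sketch, assuming SQLite via Python's sqlite3 module and a made-up row, adds the join_date column and then attempts a change that existing data cannot satisfy. The badge column is hypothetical, and the exact error differs between DBMSs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (empId INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO employees VALUES (1, 'Clark')")

# Existing rows receive NULL in the newly added column.
conn.execute("ALTER TABLE employees ADD join_date DATE")
row = conn.execute("SELECT empId, name, join_date FROM employees").fetchone()

# A NOT NULL column with no default cannot be satisfied by existing rows,
# so SQLite rejects the change.
try:
    conn.execute("ALTER TABLE employees ADD badge TEXT NOT NULL")
    not_null_added = True
except sqlite3.Error:
    not_null_added = False
print(row, not_null_added)
```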
Other DDL commands (ctd.)
Few more examples:
Oracle:
ALTER TABLE EMPLOYEE MODIFY name
varchar(30);
MySQL:
ALTER TABLE EMPLOYEE MODIFY COLUMN name
varchar(30);
SQL Server:
ALTER TABLE EMPLOYEE ALTER COLUMN name
varchar(30);
Other DDL commands (ctd.)
Few more examples:
Oracle:
ALTER TABLE EMPLOYEE MODIFY salary DEFAULT
10000;
MySQL:
ALTER TABLE EMPLOYEE ALTER salary SET DEFAULT
10000;
SQL Server:
ALTER TABLE EMPLOYEE ADD constraint df_sal
DEFAULT 10000 for salary;
DDL commands - Index
An index is a database structure that is used to improve the
performance of database in terms of data retrieval.
A database table can have one or more indexes associated
with it.
An index speeds up lookups (SELECT, and the row search in
UPDATE or DELETE) when it is created on the fields frequently
used for accessing the data; inserts and other writes become
slightly slower because the index must be maintained. For example:
CREATE INDEX deptId_indx ON EMPLOYEE(deptId);
A SELECT statement that searches the EMPLOYEE table by
deptId can now be much faster, especially on large tables.
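Whether a query actually uses an index can be checked from the query plan. The sketch below assumes SQLite via Python's sqlite3 module and generated sample rows. EXPLAIN QUERY PLAN is SQLite-specific and its wording varies by version, so the code only looks for the index name in the plan text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (empId INTEGER PRIMARY KEY, name TEXT NOT NULL, "
    "deptId TEXT NOT NULL, salary NUMERIC(8,2))"
)
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)",
    [(i, f"emp{i}", "Sales" if i % 2 else "IT", 1000 + i) for i in range(1, 101)],
)

def plan(sql):
    """Hypothetical helper: join the human-readable detail column of the plan rows."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

query = "SELECT name FROM EMPLOYEE WHERE deptId = 'Sales'"
plan_before = plan(query)   # full table scan, no index available yet
conn.execute("CREATE INDEX deptId_indx ON EMPLOYEE(deptId)")
plan_after = plan(query)    # the plan now mentions deptId_indx
print(plan_before)
print(plan_after)
```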
DML commands
DML commands affect the contents of the table. These
commands include:
DELETE, INSERT, UPDATE, SELECT
Example of deleting data (the first statement deletes every
row; the second deletes only the rows matching the condition):
DELETE FROM employee;
DELETE FROM employee WHERE deptId = 4;
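The effect of the WHERE clause on DELETE can be seen in a small sketch, assuming SQLite via Python's sqlite3 module, a simplified employee table, and made-up rows. The filtered delete is run first so that there is something left for the unfiltered one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (empId INTEGER PRIMARY KEY, name TEXT, deptId INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Clark", 4), (2, "Ava", 4), (3, "Ben", 7)],
)

def row_count():
    return conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]

# With WHERE: only the rows in department 4 are removed.
deleted_filtered = conn.execute("DELETE FROM employee WHERE deptId = 4").rowcount
left_after_filter = row_count()

# Without WHERE: every remaining row is removed; the table itself still exists.
conn.execute("DELETE FROM employee")
left_after_all = row_count()
print(deleted_filtered, left_after_filter, left_after_all)
```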
DML commands (ctd.)
Inserting data into table (without a column list, a value is
needed for every column, so salary is given as NULL here):
INSERT INTO EMPLOYEE VALUES (0001, 'Clark',
'Sales', NULL);
Columns can be in any order if we specify the attribute
names:
INSERT INTO EMPLOYEE(name, empId, deptId)
VALUES ('Ava', 0003, 'Sales');
Attribute names must be specified if we want to omit the value
for a nullable or default column:
INSERT INTO EMPLOYEE(empId, name, deptId)
VALUES (0001, 'Clark', 'Sales');
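The three forms of INSERT can be compared in a sketch that assumes SQLite via Python's sqlite3 module. The DEFAULT 10000 on salary and the sample values are assumptions added so that the effect of omitting a column is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE EMPLOYEE (
        empId  INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        deptId TEXT NOT NULL,
        salary NUMERIC(8,2) DEFAULT 10000  -- assumed default, for illustration
    )
""")

# No column list: one value per column, in table order.
conn.execute("INSERT INTO EMPLOYEE VALUES (1, 'Clark', 'Sales', 52000)")

# Column list: any order, and salary is omitted so the default applies.
conn.execute("INSERT INTO EMPLOYEE(name, empId, deptId) VALUES ('Ava', 3, 'Sales')")

# No column list and too few values: rejected.
try:
    conn.execute("INSERT INTO EMPLOYEE VALUES (4, 'Ben', 'IT')")
    short_insert_ok = True
except sqlite3.Error:
    short_insert_ok = False

rows = conn.execute(
    "SELECT empId, name, deptId, salary FROM EMPLOYEE ORDER BY empId"
).fetchall()
print(rows, short_insert_ok)
```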
DML commands (ctd.)
Example of updating existing record data:
UPDATE employee SET salary = 1.05*salary;
UPDATE employee SET salary = 1.05*salary WHERE
deptId = 'Sales';
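A hedged sketch of both UPDATE forms, assuming SQLite via Python's sqlite3 module and made-up rows. The filtered raise is applied first, then the raise for everyone, so Clark in Sales is raised twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (empId INTEGER PRIMARY KEY, name TEXT, "
    "deptId TEXT, salary NUMERIC(8,2))"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Clark", "Sales", 1000), (2, "Ava", "IT", 2000)],
)

def salaries():
    """Map empId to current salary."""
    return dict(conn.execute("SELECT empId, salary FROM employee"))

# With WHERE: only Sales employees get the 5% raise.
conn.execute("UPDATE employee SET salary = 1.05*salary WHERE deptId = 'Sales'")
after_filtered = salaries()

# Without WHERE: every row is updated.
conn.execute("UPDATE employee SET salary = 1.05*salary")
after_all = salaries()
print(after_filtered, after_all)
```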
DML commands(ctd.) - Select
Most frequently used DML query – meant to retrieve
data from one or more tables (join) based on conditions
we request.
The SELECT command allows us to add restrictions to
search criteria to show a subset of the rows or columns.
Syntax:
SELECT columnlist
FROM tablelist
[ WHERE conditionlist ] ;
DML commands(ctd.) - Select
We can use with select query:
Wildcard character * to fetch all columns
Logical operators within Where clause
Column alias, an example:
Select P_PRICE as "Unit Price" from
PRODUCT;
Computed columns, such as the product of two attributes
Distinct clause to fetch unique values for non-primary
attributes
Order by clause to sort data on an attribute (ascending
order is by default).
Special operators and Aggregate operators
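Several of these SELECT features are combined in the sketch below, which assumes SQLite via Python's sqlite3 module. Only PRODUCT and P_PRICE appear in the slides; the other columns (P_CODE, P_DESCRIPT, P_QOH, V_CODE) and the rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_DESCRIPT TEXT, "
    "P_QOH INTEGER, P_PRICE NUMERIC(8,2), V_CODE INTEGER)"
)
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?, ?, ?, ?)",
    [("A1", "Hammer", 10, 15.0, 100), ("B2", "Saw", 5, 20.0, 100), ("C3", "Drill", 2, 99.5, 200)],
)

# Column alias: the result column is reported under the alias.
cur = conn.execute('SELECT P_PRICE AS "Unit Price" FROM PRODUCT')
alias_name = cur.description[0][0]

# Computed column (quantity on hand * price), sorted in descending order.
stock_value = conn.execute(
    "SELECT P_CODE, P_QOH * P_PRICE AS total_value FROM PRODUCT ORDER BY total_value DESC"
).fetchall()

# DISTINCT on a non-key attribute; ORDER BY is ascending by default.
vendor_codes = conn.execute("SELECT DISTINCT V_CODE FROM PRODUCT ORDER BY V_CODE").fetchall()

# Wildcard * fetches all columns.
all_columns = conn.execute("SELECT * FROM PRODUCT WHERE P_CODE = 'A1'").fetchone()
print(alias_name, stock_value, vendor_codes, len(all_columns))
```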
DML commands(ctd.) - Select
Special Operators:
BETWEEN - to check whether an attribute value is within
a range (bounds included). If BETWEEN is not supported, we
can use the comparison operators >= and <= instead!
LIKE - to check whether an attribute value matches a
given string pattern.
IS NULL - to check whether an attribute value is null.
IN - to check whether an attribute value matches any
value within a value list
EXISTS – to check if a subquery returns any rows as
result.
NOT IN , NOT BETWEEN – work opposite of IN &
BETWEEN.
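Each special operator is exercised once in the sketch below, assuming SQLite via Python's sqlite3 module. The PRODUCT and VENDOR columns, the rows, and the codes helper are hypothetical. One row has a NULL V_CODE to show that NOT IN does not return rows whose value is null.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_DESCRIPT TEXT, "
    "P_PRICE NUMERIC(8,2), V_CODE INTEGER)"
)
conn.execute("CREATE TABLE VENDOR (V_CODE INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?, ?, ?)",
    [("A1", "Hammer", 15.0, 100), ("B2", "Saw", 20.0, 100),
     ("C3", "Drill", 99.5, 200), ("D4", "Sander", 45.0, None)],
)
conn.execute("INSERT INTO VENDOR VALUES (100)")

def codes(condition):
    """Hypothetical helper: product codes matching a WHERE condition, sorted."""
    sql = "SELECT P_CODE FROM PRODUCT AS p WHERE " + condition + " ORDER BY P_CODE"
    return [row[0] for row in conn.execute(sql)]

between = codes("P_PRICE BETWEEN 15 AND 50")
comparison = codes("P_PRICE >= 15 AND P_PRICE <= 50")   # same result as BETWEEN
like = codes("P_DESCRIPT LIKE 'S%'")                    # descriptions starting with S
is_null = codes("V_CODE IS NULL")
in_list = codes("V_CODE IN (100)")
not_in = codes("V_CODE NOT IN (100)")                   # the NULL row is not returned
exists = codes("EXISTS (SELECT 1 FROM VENDOR AS v WHERE v.V_CODE = p.V_CODE)")
print(between, like, is_null, in_list, not_in, exists)
```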
DML commands(ctd.) - Select
Aggregate Operators:
COUNT – returns the number of non-null values of an
attribute.
MAX & MIN – return the largest and smallest value of an attribute.
SUM – computes total for a specified attribute.
AVG - computes average for a specified attribute.
An example: SELECT COUNT(*), AVG(P_PRICE) FROM
PRODUCT;
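The aggregate operators can be run together in one query. This sketch assumes SQLite via Python's sqlite3 module and three made-up PRODUCT rows priced 10, 20 and 30.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (P_CODE TEXT PRIMARY KEY, P_PRICE NUMERIC(8,2))")
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?)",
    [("A1", 10), ("B2", 20), ("C3", 30)],
)

# One result row: all aggregates are computed over the whole table.
count_all, avg_price, max_price, min_price, sum_price = conn.execute(
    "SELECT COUNT(*), AVG(P_PRICE), MAX(P_PRICE), MIN(P_PRICE), SUM(P_PRICE) FROM PRODUCT"
).fetchone()
print(count_all, avg_price, max_price, min_price, sum_price)
```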