▪ We start @ 14:15
▪ Remember to join MS Teams – code: 6esizxc
▪ You can download the lecture from MS Teams for your convenience
Instructor: Krystian Wojtkiewicz
School of Computer Science and Engineering
International University, VNU-HCMC
Lecture 5: SQL Basics
ACKNOWLEDGEMENT
The following slides have been
created adapting the materials of the
book authored by Dr. Ramakrishnan
and Dr. Gehrke.
The following slides are referenced
from Dr. Sudeepa Roy, Duke University.
RECAP
Why do we use a DBMS?
Structured data model: Relational data model
table, schema, instance, tuples, attributes
bag and set semantics
Data independence
Physical independence: Can change how data
is stored on disk without affecting
applications
Logical independence: can change schema
without affecting apps
▪ Relational data model is the standard for
database management – and is the main focus of this course
▪ Semi-structured model/XML is also
used in practice – you will use them in
homeworks or assignments
▪ Unstructured data (text/photo/video)
is unavoidable, but won’t be covered
in this class
▪ SQL basic:
▪ Create database
▪ Create tables, constraints, import data
▪ Reading material: [RG] Chapters 3 and 5
▪ Additional reading for practice: [GUW] Chapter 6
SQL is Structured Query Language, which is a computer language for
storing, manipulating and retrieving data stored in a relational database.
SQL is the standard language for Relational Database Systems.
https://www.tutorialspoint.com
1970 − Dr. Edgar F. "Ted" Codd of IBM is known as the father of relational
databases. He described a relational model for databases.
1974 − Structured Query Language appeared.
1978 − IBM worked to develop Codd's ideas and released a product named System/R.
1986 − IBM developed the first prototype of a relational database, and SQL was
standardized by ANSI. The first relational database was released by
Relational Software, which later came to be known as Oracle.
A BRIEF HISTORY OF SQL
▪ Standards:
▪ SQL-86
▪ SQL-89 (minor revision)
▪ SQL-92 (major revision)
▪ SQL-99 (major extensions; later revisions such as SQL:2003 and SQL:2016 have followed)
▪ More: MS SQL Server history on the internet
Data Manipulation Language (DML)
Querying: SELECT-FROM-WHERE
Modifying: INSERT/DELETE/UPDATE
Data Definition Language (DDL)
CREATE/ALTER/DROP
Each column is an attribute (field); each row is a tuple (record).
sid    name     login          age  gpa
53666  Jones    jones@cs       18   3.4
53688  Smith    smith@ee       18   3.2
53650  Smith    smith1@math    19   3.8
53831  Madayan  madayan@music  11   1.8
53832  Guldu    guldu@music    12   2.0
• mathematically, relation is a set of tuples
– each tuple appears 0 or 1 times in the table
– order of the rows is unspecified
Standard query language for relational data
used for databases in many different contexts
inspires query languages for non-relational data (e.g. SQL++)
Everything not in quotes (‘…’) is case insensitive
Provides standard types.
▪ Examples:
numbers: INT, FLOAT, DECIMAL(p,s)
DECIMAL(p,s): Exact numerical, precision p, scale s. Example: decimal(5,2) is
a number that has 3 digits before the decimal and 2 digits after the
decimal
strings: CHAR(n), VARCHAR(n)
CHAR(n): Fixed-length n
VARCHAR(n): Variable length. Maximum length n
SQL (“SEQUEL”) –
CONT.
▪ BOOLEAN
▪ DATE, TIME, TIMESTAMP
▪ DATE: Stores year, month, and day
values
▪ TIME: Stores hour, minute, and
second values
▪ TIMESTAMP: Stores year, month,
day, hour, minute, and second
values
▪ Additional types in here
E.g., type sqlite3
DEMO ON SQLITE
https://www.db-book.com/university-lab-dir/sqljs.html
SQL STATEMENTS
Create   create database …
Create   create table …
Drop     drop table ...
Alter    alter table ... add/remove ...
Insert   insert into ... values ...
Delete   delete from ... where ...
Update   update ... set ... where ...
SQL STATEMENTS
CREATE DATABASE DatabaseName;
• CREATE DATABASE testDB;
DROP DATABASE DatabaseName;
• DROP DATABASE testDB;
USE DatabaseName;
• USE testDB;
CREATING TABLE
The basic syntax of the CREATE TABLE statement is
as follows:
CREATE TABLE table_name(
column1 datatype,
column2 datatype,
column3 datatype,
.....
columnN datatype,
PRIMARY KEY (one or more columns)
);
▪ Creates the “Students” relation
– the type (domain) of each field is specified
– enforced by the DBMS whenever tuples are added or modified
CREATE TABLE Students
(sid CHAR(10),
name CHAR(15),
login CHAR(20),
age INTEGER,
gpa REAL)   -- or DECIMAL(2,1)
• As another example, the “Enrolled” table holds
information about courses that students take
CREATE TABLE Enrolled
(sid CHAR(20),   -- ?? (Students.sid is CHAR(10))
cid CHAR(20),
grade CHAR(2))
Students
sid    name   login       age  gpa
53666  Jones  jones@cs    18   3.4
53688  Smith  smith@eecs  18   3.2
53650  Smith  smith@math  19   3.8
Enrolled
sid    cid          grade
53831  Carnatic101  C
53831  Reggae203    B
53650  Topology112  A
53666  History105   B
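The two CREATE TABLE statements and the sample instances above can be tried out with a short script. This is a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory database; it uses gpa REAL and declares Enrolled.sid as CHAR(10) to match Students, which is a choice made here rather than something the slide fixes.

```python
import sqlite3

# Throwaway in-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Schema from the slide (Enrolled.sid set to CHAR(10) as an assumption).
conn.executescript("""
CREATE TABLE Students (
    sid   CHAR(10),
    name  CHAR(15),
    login CHAR(20),
    age   INTEGER,
    gpa   REAL
);
CREATE TABLE Enrolled (
    sid   CHAR(10),
    cid   CHAR(20),
    grade CHAR(2)
);
""")

# Sample instances from the slide.
students = [
    ("53666", "Jones", "jones@cs", 18, 3.4),
    ("53688", "Smith", "smith@eecs", 18, 3.2),
    ("53650", "Smith", "smith@math", 19, 3.8),
]
enrolled = [
    ("53831", "Carnatic101", "C"),
    ("53831", "Reggae203", "B"),
    ("53650", "Topology112", "A"),
    ("53666", "History105", "B"),
]
conn.executemany("INSERT INTO Students VALUES (?, ?, ?, ?, ?)", students)
conn.executemany("INSERT INTO Enrolled VALUES (?, ?, ?)", enrolled)

# Read the data back; ORDER BY is needed because row order is otherwise unspecified.
student_rows = conn.execute(
    "SELECT sid, name, gpa FROM Students ORDER BY sid"
).fetchall()
enrolled_count = conn.execute("SELECT COUNT(*) FROM Enrolled").fetchone()[0]

for row in student_rows:
    print(row)
print("enrolled rows:", enrolled_count)
```

Note that no constraints are declared yet, so the DBMS accepts the Enrolled rows for sid 53831 even though no such student exists.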
▪ Customers(ID: int, name: string(20), age: int,
address: string(25), salary: decimal(18,2))
CREATE TABLE Customers(
   -- all attributes
);
DESTROYING
RELATION/TABLE
Syntax:
DROP TABLE table_name;
DROP TABLE Customers;
Destroys the relation Customers
– The schema information and the tuples are
deleted.
ALTER TABLE Students
ADD COLUMN firstYear INTEGER;
The schema of Students is altered by adding a new field;
What’s the value in the new field?
every tuple in the current instance is extended with a null
value in the new field.
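The effect of ADD COLUMN on existing tuples can be checked directly. A minimal sketch, assuming Python's sqlite3 module and two of the sample Students rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students (sid CHAR(10), name CHAR(15), "
    "login CHAR(20), age INTEGER, gpa REAL)"
)
conn.execute("INSERT INTO Students VALUES ('53666', 'Jones', 'jones@cs', 18, 3.4)")
conn.execute("INSERT INTO Students VALUES ('53688', 'Smith', 'smith@ee', 18, 3.2)")

# Alter the schema after data has been loaded.
conn.execute("ALTER TABLE Students ADD COLUMN firstYear INTEGER")

# Every existing tuple is extended with NULL (None in Python) in the new field.
first_year_values = [row[0] for row in conn.execute("SELECT firstYear FROM Students")]
print(first_year_values)
```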
Syntax:
INSERT INTO TABLE_NAME (column1, column2, column3, ...,
columnN) VALUES (value1, value2, value3,..., valueN);
INSERT INTO TABLE_NAME
VALUES (value1, value2, value3, ..., valueN);
ADDING AND DELETING TUPLES
Can insert a single tuple using:
INSERT INTO Students (sid, name, login, age, gpa)
VALUES ('53688', 'Smith', 'smith@ee', 18, 3.2)
Can delete all tuples satisfying some condition
(e.g., name = Smith):
DELETE
FROM Students S
WHERE S.name = 'Smith'
INSERT INTO CUSTOMERS (ID, NAME, AGE, ADDRESS, SALARY)
VALUES (7, 'Muffy', 24, 'Indore', 10000.00);
INSERT INTO CUSTOMERS
VALUES (7, 'Muffy', 24, 'Indore', 10000.00);
UPDATE Students
SET age = age + 2
WHERE sid = '53680';
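The three modification statements can be run in sequence to see their effect. A minimal sketch, assuming Python's sqlite3 module; the UPDATE targets sid '53688' instead of '53680' so that it matches a row that actually exists in this small instance, and the DELETE is written without a table alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students (sid CHAR(10), name CHAR(15), "
    "login CHAR(20), age INTEGER, gpa REAL)"
)

# INSERT: one tuple per statement, with an explicit column list.
conn.execute(
    "INSERT INTO Students (sid, name, login, age, gpa) "
    "VALUES ('53688', 'Smith', 'smith@ee', 18, 3.2)"
)
conn.execute(
    "INSERT INTO Students (sid, name, login, age, gpa) "
    "VALUES ('53666', 'Jones', 'jones@cs', 18, 3.4)"
)

# UPDATE: only tuples matching the WHERE clause are changed.
cur = conn.execute("UPDATE Students SET age = age + 2 WHERE sid = '53688'")
updated = cur.rowcount
age_after = conn.execute(
    "SELECT age FROM Students WHERE sid = '53688'"
).fetchone()[0]

# DELETE: removes every tuple satisfying the condition.
cur = conn.execute("DELETE FROM Students WHERE name = 'Smith'")
deleted = cur.rowcount
remaining = conn.execute("SELECT name FROM Students").fetchall()

print(updated, age_after, deleted, remaining)
```

Leaving out the WHERE clause in UPDATE or DELETE would apply the change to every tuple in the table.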
Integrity constraint (IC): a condition that must be true for any instance of the
database
e.g., domain constraints
ICs are specified when schema is defined
ICs are checked when relations are modified
A legal instance of a relation is one that satisfies all specified
ICs
DBMS will not allow illegal instances
If the DBMS checks ICs, stored data is more faithful to real-
world meaning
Avoids data entry errors, too!
▪ NOT NULL Constraint − Ensures that a column cannot have NULL value.
▪ DEFAULT Constraint − Provides a default value for a column when
none is specified.
▪ UNIQUE Constraint − Ensures that all values in a column are different.
▪ PRIMARY Key − Uniquely identifies each row/record in a table.
▪ FOREIGN Key − Refers to a row in another table; its value must match a key
value in that table.
▪ CHECK Constraint − Ensures that all the values in a column satisfy
certain conditions.
▪ INDEX − Used to speed up retrieval of data from the database (a performance
structure rather than a condition on the data).
INTEGRITY CONSTRAINTS
NOT NULL
Example
CREATE TABLE CUSTOMERS(
ID INT NOT NULL,
NAME VARCHAR (20) NOT NULL,
AGE INT NOT NULL,
ADDRESS CHAR (25) ,
SALARY DECIMAL (18, 2),
PRIMARY KEY (ID)
);
If CUSTOMERS table has already been created
ALTER TABLE CUSTOMERS
ALTER COLUMN SALARY DECIMAL (18, 2) NOT NULL;
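What NOT NULL means in practice is easiest to see by trying to violate it. A minimal sketch, assuming Python's sqlite3 module and the CUSTOMERS schema above; the second insert deliberately leaves NAME as NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE CUSTOMERS(
    ID      INT NOT NULL,
    NAME    VARCHAR (20) NOT NULL,
    AGE     INT NOT NULL,
    ADDRESS CHAR (25),
    SALARY  DECIMAL (18, 2),
    PRIMARY KEY (ID)
)
""")

# ADDRESS and SALARY may be omitted: they are allowed to be NULL.
conn.execute("INSERT INTO CUSTOMERS (ID, NAME, AGE) VALUES (7, 'Muffy', 24)")

# NAME is declared NOT NULL, so this insert is rejected by the DBMS.
try:
    conn.execute("INSERT INTO CUSTOMERS (ID, NAME, AGE) VALUES (8, NULL, 25)")
    not_null_error = None
except sqlite3.IntegrityError as exc:
    not_null_error = str(exc)

customer_count = conn.execute("SELECT COUNT(*) FROM CUSTOMERS").fetchone()[0]
print(not_null_error, customer_count)
```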
Example
CREATE TABLE CUSTOMERS(
ID INT NOT NULL,
NAME VARCHAR (20) NOT NULL,
AGE INT NOT NULL,
ADDRESS CHAR (25) ,
SALARY DECIMAL (18, 2) DEFAULT 5000.00,
PRIMARY KEY (ID)
);
If the CUSTOMERS table has already been created
ALTER TABLE CUSTOMERS
DROP column SALARY;
ALTER TABLE CUSTOMERS
ADD SALARY DECIMAL (18, 2) DEFAULT 5000.00;
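The DEFAULT clause only applies when an insert supplies no value for the column. A minimal sketch, assuming Python's sqlite3 module; the second customer (ID 8, name 'Example') is a hypothetical row added for contrast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE CUSTOMERS(
    ID      INT NOT NULL,
    NAME    VARCHAR (20) NOT NULL,
    AGE     INT NOT NULL,
    ADDRESS CHAR (25),
    SALARY  DECIMAL (18, 2) DEFAULT 5000.00,
    PRIMARY KEY (ID)
)
""")

# No SALARY given: the DBMS fills in the default.
conn.execute(
    "INSERT INTO CUSTOMERS (ID, NAME, AGE, ADDRESS) "
    "VALUES (7, 'Muffy', 24, 'Indore')"
)
# SALARY given explicitly: the default is not used (hypothetical row).
conn.execute("INSERT INTO CUSTOMERS VALUES (8, 'Example', 30, 'Indore', 4500.00)")

# Map each customer ID to its stored salary.
salaries = dict(conn.execute("SELECT ID, SALARY FROM CUSTOMERS"))
print(salaries)
```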
Example
CREATE TABLE CUSTOMERS(
ID INT NOT NULL,
NAME VARCHAR (20) NOT NULL UNIQUE,
AGE INT NOT NULL,
ADDRESS CHAR (25) ,
SALARY DECIMAL (18, 2),
PRIMARY KEY (ID)
);
If the CUSTOMERS table has already been
created
ALTER TABLE CUSTOMERS
ADD CONSTRAINT UniqueConstraint UNIQUE(NAME);
DROP a UNIQUE Constraint
ALTER TABLE CUSTOMERS
DROP CONSTRAINT UniqueConstraint;
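A UNIQUE column rejects a second row with the same value even when the primary key differs. A minimal sketch, assuming Python's sqlite3 module; the second insert reuses the name 'Muffy' under a different ID on purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE CUSTOMERS(
    ID      INT NOT NULL,
    NAME    VARCHAR (20) NOT NULL UNIQUE,
    AGE     INT NOT NULL,
    ADDRESS CHAR (25),
    SALARY  DECIMAL (18, 2),
    PRIMARY KEY (ID)
)
""")

conn.execute("INSERT INTO CUSTOMERS VALUES (7, 'Muffy', 24, 'Indore', 10000.00)")

# Different ID, same NAME: the primary key is satisfied, UNIQUE(NAME) is not.
try:
    conn.execute("INSERT INTO CUSTOMERS VALUES (8, 'Muffy', 30, 'Indore', 8000.00)")
    unique_error = None
except sqlite3.IntegrityError as exc:
    unique_error = str(exc)

print(unique_error)
```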
INTEGRITY CONSTRAINTS
CHECK
Example
CREATE TABLE CUSTOMERS(
ID INT NOT NULL,
NAME VARCHAR (20) NOT NULL,
AGE INT NOT NULL CHECK (AGE >= 18),
ADDRESS CHAR (25) ,
SALARY DECIMAL (18, 2),
PRIMARY KEY (ID)
);
If the CUSTOMERS table has already been created
ALTER TABLE CUSTOMERS
ADD CONSTRAINT CheckConstraint CHECK(AGE >=18);
DROP a CHECK Constraint
ALTER TABLE CUSTOMERS
DROP CONSTRAINT CheckConstraint;
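The CHECK condition is evaluated for every inserted or updated row. A minimal sketch, assuming Python's sqlite3 module; the customer with AGE 17 is a hypothetical row chosen to violate the condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE CUSTOMERS(
    ID      INT NOT NULL,
    NAME    VARCHAR (20) NOT NULL,
    AGE     INT NOT NULL CHECK (AGE >= 18),
    ADDRESS CHAR (25),
    SALARY  DECIMAL (18, 2),
    PRIMARY KEY (ID)
)
""")

# AGE 24 satisfies the condition.
conn.execute("INSERT INTO CUSTOMERS VALUES (7, 'Muffy', 24, 'Indore', 10000.00)")

# AGE 17 violates CHECK (AGE >= 18) and is rejected (hypothetical row).
try:
    conn.execute("INSERT INTO CUSTOMERS VALUES (8, 'Example', 17, 'Indore', 0.00)")
    check_error = None
except sqlite3.IntegrityError as exc:
    check_error = str(exc)

ages = [row[0] for row in conn.execute("SELECT AGE FROM CUSTOMERS")]
print(check_error, ages)
```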
INTEGRITY CONSTRAINTS
INDEX
Syntax:
CREATE INDEX index_name
ON table_name ( column1, column2.....);
To create an INDEX on the AGE column, to optimize the
search on customers for a specific age:
CREATE INDEX idx_age
ON CUSTOMERS ( AGE );
DROP an INDEX Constraint (MySQL syntax; many other systems use DROP INDEX idx_age;)
ALTER TABLE CUSTOMERS
DROP INDEX idx_age;
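Whether the DBMS actually uses an index can be inspected through its query plan. A minimal sketch, assuming Python's sqlite3 module; EXPLAIN QUERY PLAN and the standalone DROP INDEX form are SQLite's syntax, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE CUSTOMERS(
    ID      INT NOT NULL,
    NAME    VARCHAR (20) NOT NULL,
    AGE     INT NOT NULL,
    ADDRESS CHAR (25),
    SALARY  DECIMAL (18, 2),
    PRIMARY KEY (ID)
)
""")
conn.execute("CREATE INDEX idx_age ON CUSTOMERS (AGE)")

# Ask SQLite how it would evaluate a search for a specific age.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM CUSTOMERS WHERE AGE = 24"
).fetchall()
# The human-readable description is the last column of each plan row.
plan_text = " ".join(str(row[-1]) for row in plan)
print(plan_text)

# SQLite drops an index with a standalone DROP INDEX statement.
conn.execute("DROP INDEX idx_age")
remaining_indexes = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'index' AND name = 'idx_age'"
).fetchall()
```

The index changes how the query is executed, not what it returns: the same SELECT gives the same rows with or without idx_age.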
Key / Candidate Key
Primary Key
– Primary key attributes are underlined in a schema
– Person(pid, address, name)
– Person2(address, name, age, job)
Super Key
Foreign Key
PRIMARY KEY
CONSTRAINTS
Key = subset of columns that uniquely identifies tuple
Another constraint on the table
no two tuples can have the same values for those
columns
Examples:
Movie(title, year, length, genre): key is (title, year)
what is a good key for Student?
Students(sid: string, name: string, login:
string, age: integer, gpa: real).
Can have multiple keys for a table
Only one of those keys may be “primary”
DBMS often makes searches by primary key fastest
other keys are called “secondary”
• Possibly many candidate keys
– specified using UNIQUE
– one of which is chosen as the primary key.
Example:
“For a given student and course, there is a single grade.”
What is the primary key of this table?
CREATE TABLE Enrolled
(sid CHAR(10),
cid CHAR(20),
grade CHAR(2),
PRIMARY KEY ???)
PRIMARY AND
CANDIDATE KEYS IN SQL
Possibly many candidate keys
specified using UNIQUE
one of which is chosen as the primary key.
“For a given student and course, there is a single
grade.”
CREATE TABLE Enrolled
(sid CHAR(10),
cid CHAR(20),
grade CHAR(2),
PRIMARY KEY (sid,cid) )
• “For a given student and course, there is a single grade.”
CREATE TABLE Enrolled
(sid CHAR(10),
cid CHAR(20),
grade CHAR(2),
PRIMARY KEY (sid,cid))
vs.
• “Students can take only one course, and receive a single grade for that course;
further, no two students in a course receive the same grade.”
CREATE TABLE Enrolled
(sid CHAR(10),
cid CHAR(20),
grade CHAR(2),
PRIMARY KEY ???, UNIQUE ??? )
• “For a given student and course, there is a single grade.”
CREATE TABLE Enrolled
( sid CHAR(10),
cid CHAR(20),
grade CHAR(2),
PRIMARY KEY (sid, cid))
vs.
• “Students can take only one course, and receive a single grade for that course;
further, no two students in a course receive the same grade.”
CREATE TABLE Enrolled
( sid CHAR(10),
cid CHAR(20),
grade CHAR(2),
PRIMARY KEY (sid),
UNIQUE (cid, grade))
• Used carelessly, an IC can prevent the storage
of database instances that arise in practice!
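The difference between the two key choices shows up as soon as the sample Enrolled instance is loaded into both designs. A minimal sketch, assuming Python's sqlite3 module; the table names EnrolledA and EnrolledB and the helper load are hypothetical, introduced only to hold the two variants side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Variant A: one grade per (student, course) pair.
CREATE TABLE EnrolledA (
    sid CHAR(10), cid CHAR(20), grade CHAR(2),
    PRIMARY KEY (sid, cid)
);
-- Variant B: one course per student, and no repeated grade within a course.
CREATE TABLE EnrolledB (
    sid CHAR(10), cid CHAR(20), grade CHAR(2),
    PRIMARY KEY (sid),
    UNIQUE (cid, grade)
);
""")

# Sample Enrolled instance from the earlier slide.
rows = [
    ("53831", "Carnatic101", "C"),
    ("53831", "Reggae203", "B"),
    ("53650", "Topology112", "A"),
    ("53666", "History105", "B"),
]

def load(table):
    """Insert the sample rows one by one; return the rows the DBMS rejects."""
    rejected = []
    for row in rows:
        try:
            conn.execute(f"INSERT INTO {table} VALUES (?, ?, ?)", row)
        except sqlite3.IntegrityError:
            rejected.append(row)
    return rejected

rejected_a = load("EnrolledA")
rejected_b = load("EnrolledB")
print("rejected by A:", rejected_a)
print("rejected by B:", rejected_b)
```

Variant B refuses the second course of student 53831, which is exactly the kind of real-world instance a carelessly chosen IC can rule out.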
Foreign key: Set of fields in one relation that is used to 'refer' to
a tuple in another relation
Must correspond to primary key of the second relation
Like a 'logical pointer'
E.g. sid is a foreign key referring to Students:
Enrolled(sid: string, cid: string, grade: string)
If all foreign key constraints are enforced,
referential integrity is achieved
i.e., no dangling references
• Only students listed in the Students relation should be allowed to enroll for courses
CREATE TABLE Enrolled
(sid CHAR(10), cid CHAR(20), grade CHAR(2),
PRIMARY KEY (sid,cid),
FOREIGN KEY (sid) REFERENCES Students )
▪ Enrolled
sid    cid          grade
53666  Carnatic101  C
53666  Reggae203    B
53650  Topology112  A
53666  History105   B
▪ Students
sid    name   login       age  gpa
53666  Jones  jones@cs    18   3.4
53688  Smith  smith@eecs  18   3.2
53650  Smith  smith@math  19   3.8
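The foreign key declaration can be exercised by enrolling one known and one unknown student. A minimal sketch, assuming Python's sqlite3 module; SQLite only enforces foreign keys after PRAGMA foreign_keys = ON, and Students.sid is declared as the primary key here so that REFERENCES Students has a key to point at.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite-specific: foreign key enforcement is off unless switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("""
CREATE TABLE Students (
    sid CHAR(10) PRIMARY KEY, name CHAR(15), login CHAR(20),
    age INTEGER, gpa REAL
)
""")
conn.execute("""
CREATE TABLE Enrolled (
    sid CHAR(10), cid CHAR(20), grade CHAR(2),
    PRIMARY KEY (sid, cid),
    FOREIGN KEY (sid) REFERENCES Students
)
""")
conn.execute("INSERT INTO Students VALUES ('53666', 'Jones', 'jones@cs', 18, 3.4)")

# 53666 is listed in Students, so this enrollment is accepted.
conn.execute("INSERT INTO Enrolled VALUES ('53666', 'Carnatic101', 'C')")

# 53831 is not in Students: accepting this row would create a dangling reference.
try:
    conn.execute("INSERT INTO Enrolled VALUES ('53831', 'Reggae203', 'B')")
    fk_error = None
except sqlite3.IntegrityError as exc:
    fk_error = str(exc)

enrolled_sids = [row[0] for row in conn.execute("SELECT sid FROM Enrolled")]
print(fk_error, enrolled_sids)
```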
ENFORCING FOREIGN
KEY CONSTRAINTS
If there is a foreign-key constraint from
relation R to relation S, two violations
are possible:
An insert or update to R introduces
values not found in S.
A deletion or update to S causes
some tuples of R to “dangle.”
Example:
suppose R = Enrolled, S = Students
An insert or update to Enrolled that
introduces an sid not found in Students
must be rejected.
A delete or update to Students that
removes a student value found in
some tuples of Enrolled can be
handled in four ways (next slide)
Default (No action): Reject the modification.
Cascade: Make the same changes in Enrolled.
– Deleted Students: delete Enrolled tuple.
– Updated Students: change value in Enrolled.
Set NULL: Change the sid in Enrolled to NULL.
Set DEFAULT: Change the sid in Enrolled to its default value.
Example:
Default is No action: (delete/update is rejected)
Cascade:
– Delete the 53666 tuple from Students: then delete all tuples from
Enrolled that have sid = '53666'.
– Update the 53666 tuple by changing '53666' to '53686': then change all
Enrolled tuples with sid = '53666' to sid = '53686'.
Set NULL:
– Delete the 53666 tuple from Students: change all tuples of Enrolled
that have sid = '53666' to have sid = NULL.
– Update the 53666 tuple by changing '53666' to '53686': same change as
for deletion.
SQL/92 and SQL:1999 support all 4 options on deletes and updates
CREATE TABLE Enrolled
( sid CHAR(10),
cid CHAR(20),
grade CHAR(2),
PRIMARY KEY (sid,cid),
FOREIGN KEY (sid) REFERENCES Students
ON DELETE CASCADE
ON UPDATE SET DEFAULT )
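The cascade behaviour described in the example can be reproduced step by step. A minimal sketch, assuming Python's sqlite3 module with foreign keys switched on; unlike the statement above it uses ON UPDATE CASCADE instead of ON UPDATE SET DEFAULT, so that both the update case and the delete case of Cascade are visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.executescript("""
CREATE TABLE Students (sid CHAR(10) PRIMARY KEY, name CHAR(15));
CREATE TABLE Enrolled (
    sid CHAR(10), cid CHAR(20), grade CHAR(2),
    PRIMARY KEY (sid, cid),
    FOREIGN KEY (sid) REFERENCES Students
        ON DELETE CASCADE
        ON UPDATE CASCADE   -- assumption: CASCADE here, not SET DEFAULT
);
INSERT INTO Students VALUES ('53666', 'Jones'), ('53650', 'Smith');
INSERT INTO Enrolled VALUES
    ('53666', 'Carnatic101', 'C'),
    ('53666', 'Reggae203',   'B'),
    ('53650', 'Topology112', 'A'),
    ('53666', 'History105',  'B');
""")

# Update the 53666 tuple: the new sid is propagated to Enrolled.
conn.execute("UPDATE Students SET sid = '53686' WHERE sid = '53666'")
sids_after_update = [
    row[0] for row in conn.execute("SELECT DISTINCT sid FROM Enrolled ORDER BY sid")
]

# Delete that student: the matching Enrolled tuples are deleted as well.
conn.execute("DELETE FROM Students WHERE sid = '53686'")
remaining = conn.execute("SELECT sid, cid, grade FROM Enrolled").fetchall()

print(sids_after_update, remaining)
```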
ICs are based upon the semantics of the real-world
enterprise that is being described in the database relations
Can we infer ICs from an instance?
We can check a database instance to see if an IC is
violated, but we can NEVER infer that an IC is true by
looking at an instance.
An IC is a statement about all possible instances!
From the example, we know name is not a key, but the
assertion that sid is a key is given to us.
Key and foreign key ICs are the most common; more
general ICs supported too
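Checking an instance against a proposed key comes down to looking for duplicate values. A minimal sketch, assuming Python's sqlite3 module and the sample Students instance; the helper key_violations is hypothetical. An empty result means only that this instance does not violate the key, not that the key holds for all instances.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (sid CHAR(10), name CHAR(15), login CHAR(20),
                       age INTEGER, gpa REAL);
INSERT INTO Students VALUES
    ('53666', 'Jones', 'jones@cs',   18, 3.4),
    ('53688', 'Smith', 'smith@eecs', 18, 3.2),
    ('53650', 'Smith', 'smith@math', 19, 3.8);
""")

def key_violations(table, columns):
    """Return value combinations of `columns` that occur in more than one tuple."""
    cols = ", ".join(columns)
    query = (
        f"SELECT {cols}, COUNT(*) FROM {table} "
        f"GROUP BY {cols} HAVING COUNT(*) > 1"
    )
    return conn.execute(query).fetchall()

# name is repeated, so this instance proves that name is NOT a key.
name_violations = key_violations("Students", ["name"])
# sid has no repeats here, which is consistent with sid being a key
# but does not prove it.
sid_violations = key_violations("Students", ["sid"])

print(name_violations, sid_violations)
```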
• We will use these instances of the Sailors
and Reserves relations in our examples
• If the key for the Reserves relation
contained only the attributes sid and bid,
how would the semantics differ?
Sailors
sid  sname   rating  age
22   dustin  7       45
31   lubber  8       55
58   rusty   10      35
Reserves
sid  bid  day
22   101  10/10/96
58   103  11/12/96
1. Is a NULL value the same as zero or a blank space? If not,
what is the difference?
Thank you for your attention!
▪ 1. Is a NULL value the same as zero or a blank space? If not,
what is the difference?
▪ A NULL value is not the same as zero or a blank space. A NULL
value marks a value that is ‘unavailable, unassigned, unknown or
not applicable’, whereas zero is a number and a blank space is
a character.
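The answer can be confirmed by asking the DBMS to compare the three values. A minimal sketch, assuming Python's sqlite3 module; typeof is an SQLite function, and SQL NULL arrives in Python as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Comparing NULL with anything using '=' yields NULL (unknown), not true or false.
null_eq_zero, null_eq_blank, null_is_null = conn.execute(
    "SELECT NULL = 0, NULL = ' ', NULL IS NULL"
).fetchone()

# The three values also have different storage types in SQLite.
types = conn.execute("SELECT typeof(NULL), typeof(0), typeof(' ')").fetchone()

print(null_eq_zero, null_eq_blank, null_is_null)
print(types)
```

This is why a condition such as WHERE salary = NULL never matches; the test for a missing value is written with IS NULL.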