Dbmsimpunit 3
UNIT –III
Structured Query Language (SQL) is a data sublanguage that has constructs for defining and
processing a database.
It can be
History of SQL-92
SQL3 incorporates some object-oriented concepts but has not gained acceptance in industry.
Create Table
CREATE TABLE PROJECT (ProjectID Integer Primary Key, Name Char(25) Unique Not Null,
Department VarChar (100) Null, MaxHours Numeric(6,1) Default 100);
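As a minimal sketch of how these column constraints behave, the statement above can be run against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in here (other DBMSs may treat the data types differently), and the project name 'Payroll' is made-up sample data.

```python
import sqlite3

# In-memory SQLite database, used only as a convenient stand-in DBMS.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE PROJECT (
        ProjectID  Integer Primary Key,
        Name       Char(25) Unique Not Null,
        Department VarChar(100) Null,
        MaxHours   Numeric(6,1) Default 100
    )
    """
)

# MaxHours is omitted, so the DEFAULT value is stored.
conn.execute("INSERT INTO PROJECT (ProjectID, Name) VALUES (1, 'Payroll')")
default_hours = conn.execute(
    "SELECT MaxHours FROM PROJECT WHERE ProjectID = 1"
).fetchone()[0]

# A second project with the same Name violates the UNIQUE constraint.
try:
    conn.execute("INSERT INTO PROJECT (ProjectID, Name) VALUES (2, 'Payroll')")
    unique_enforced = False
except sqlite3.IntegrityError:
    unique_enforced = True

print(default_hours, unique_enforced)
```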
Constraints
Constraints can be defined within the CREATE TABLE statement, or they can be added to the table
after it is created using the ALTER TABLE statement.
NULL/NOT NULL
FOREIGN KEY
CHECK
ALTER Statement
The ALTER statement changes a table's structure, properties, or constraints after the table has been created. Example:
ALTER TABLE ASSIGNMENT ADD CONSTRAINT EmployeeFK FOREIGN KEY
(EmployeeNum) REFERENCES EMPLOYEE (EmployeeNumber) ON UPDATE CASCADE ON
DELETE NO ACTION;
DROP Statements
The DROP TABLE statement removes tables and their data from the database. A table cannot be dropped
if it contains values needed by foreign keys in other tables. Use ALTER TABLE DROP
CONSTRAINT to remove the integrity constraints in the other tables first.
SELECT can be used to obtain values of specific columns, specific rows, or both.
The ORDER BY phrase can be used to sort the rows returned by a SELECT statement.
SELECT Name, Department FROM EMPLOYEE ORDER BY Department;
SELECT Name, Department FROM EMPLOYEE ORDER BY Department DESC, Name ASC;
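The effect of the two ORDER BY variants can be checked with a small sketch. It assumes a two-column EMPLOYEE table with three made-up rows and uses SQLite (via Python's sqlite3 module) as a stand-in DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (Name TEXT, Department TEXT)")
# Made-up sample rows, inserted in no particular order.
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?)",
    [("Cara", "Finance"), ("Bob", "Accounting"), ("Ann", "Finance")],
)

# Ascending sort on Department (ASC is the default direction).
by_department = conn.execute(
    "SELECT Name, Department FROM EMPLOYEE ORDER BY Department"
).fetchall()

# Department descending; rows with the same Department are sorted by Name.
by_department_desc = conn.execute(
    "SELECT Name, Department FROM EMPLOYEE ORDER BY Department DESC, Name ASC"
).fetchall()

print(by_department)
print(by_department_desc)
```

In the first query the relative order of the two Finance rows is not determined by the ORDER BY phrase; the second sort key in the second query removes that ambiguity.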
In an INSERT statement, the order of the column names must match the order of the values. Values for
all NOT NULL columns must be provided.
INSERT INTO PROJECT VALUES (1600, 'Q4 Tax Prep', 'Accounting', 100);
UPDATE Statement
A view contains rows and columns, just like a real table. The fields in a view are fields from one or
more real tables in the database.
You can add SQL functions, WHERE, and JOIN statements to a view and present the data as if the
data were coming from one single table.
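A minimal sketch of a view that hides a join behind a single table-like name. The view name EMPLOYEE_DEPARTMENT, the reduced column set and the sample rows are assumptions made for illustration; SQLite (via Python's sqlite3 module) is used as a stand-in DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE DEPARTMENT (DNAME TEXT, DNUMBER INTEGER PRIMARY KEY);
    CREATE TABLE EMPLOYEE (FNAME TEXT, LNAME TEXT, DNO INTEGER);

    -- Made-up sample rows.
    INSERT INTO DEPARTMENT VALUES ('Research', 5);
    INSERT INTO DEPARTMENT VALUES ('Administration', 4);
    INSERT INTO EMPLOYEE VALUES ('John', 'Smith', 5);
    INSERT INTO EMPLOYEE VALUES ('Alicia', 'Zelaya', 4);

    -- Hypothetical view: fields come from two real tables.
    CREATE VIEW EMPLOYEE_DEPARTMENT AS
        SELECT FNAME, LNAME, DNAME
        FROM EMPLOYEE JOIN DEPARTMENT ON DNO = DNUMBER;
    """
)

# The view is queried as if it were one single table.
research_staff = conn.execute(
    "SELECT FNAME, LNAME FROM EMPLOYEE_DEPARTMENT WHERE DNAME = 'Research'"
).fetchall()
print(research_staff)
```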
EMPLOYEE (FNAME, MINIT, LNAME, SSN, BDATE, ADDRESS, SEX, SALARY, #SUPERSSN, #DNO)
DEPARTMENT (DNAME, DNUMBER, #MGRSSN, MGRSTARTDATE)
SQL queries
Query 0
Retrieve the birthdate and address of the employee(s) whose name is 'John B Smith'.
SELECT BDATE, ADDRESS FROM EMPLOYEE WHERE FNAME = 'John' AND MINIT = 'B' AND LNAME =
'Smith';
Query 1
Retrieve the name and address of all employees who work for the 'Research' department.
SELECT FNAME, LNAME, ADDRESS FROM EMPLOYEE, DEPARTMENT WHERE DNAME = 'Research'
AND DNUMBER = DNO;
Query 2
Retrieve all the attributes of an EMPLOYEE and the attributes of the DEPARTMENT he or she works
in for every employee of the 'Research' department.
SELECT * FROM EMPLOYEE, DEPARTMENT WHERE DNAME = 'Research' AND DNO = DNUMBER;
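The three queries can be tried out with the following sketch. The column types and all row values are made-up assumptions (the notes give only the column names), and SQLite via Python's sqlite3 module stands in for the DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE DEPARTMENT (
        DNAME TEXT, DNUMBER INTEGER PRIMARY KEY, MGRSSN TEXT, MGRSTARTDATE TEXT
    );
    CREATE TABLE EMPLOYEE (
        FNAME TEXT, MINIT TEXT, LNAME TEXT, SSN TEXT PRIMARY KEY, BDATE TEXT,
        ADDRESS TEXT, SEX TEXT, SALARY NUMERIC, SUPERSSN TEXT, DNO INTEGER
    );
    -- Made-up sample rows.
    INSERT INTO DEPARTMENT VALUES ('Research', 5, '222', '2001-05-22');
    INSERT INTO DEPARTMENT VALUES ('Administration', 4, '333', '2005-01-01');
    INSERT INTO EMPLOYEE VALUES
        ('John', 'B', 'Smith', '111', '1980-01-09', '1 Main St', 'M', 30000, '222', 5);
    INSERT INTO EMPLOYEE VALUES
        ('Mary', 'K', 'Jones', '222', '1975-12-08', '2 Oak Ave', 'F', 40000, NULL, 5);
    INSERT INTO EMPLOYEE VALUES
        ('Alan', 'T', 'Brown', '333', '1970-06-20', '3 Elm Rd', 'M', 35000, '222', 4);
    """
)

# Query 0: selection on EMPLOYEE only.
query0 = conn.execute(
    "SELECT BDATE, ADDRESS FROM EMPLOYEE "
    "WHERE FNAME = 'John' AND MINIT = 'B' AND LNAME = 'Smith'"
).fetchall()

# Query 1: join condition DNUMBER = DNO plus a selection on DNAME.
query1 = conn.execute(
    "SELECT FNAME, LNAME, ADDRESS FROM EMPLOYEE, DEPARTMENT "
    "WHERE DNAME = 'Research' AND DNUMBER = DNO"
).fetchall()

# Query 2: same join, but every column of both tables is returned.
query2 = conn.execute(
    "SELECT * FROM EMPLOYEE, DEPARTMENT WHERE DNAME = 'Research' AND DNO = DNUMBER"
).fetchall()

print(query0, query1, len(query2))
```

Without the join condition DNUMBER = DNO, Query 1 and Query 2 would pair every employee with the 'Research' department row, whether or not the employee works there.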
Ans:
Basic Query Processing Steps ...
- transform the SQL query into a query plan represented by a relational algebra expression (for a
relational DBMS)
  - there are different possible relational algebra expressions for a single query
- transform the initial query plan into the best possible query plan based on the given data set
  - specify the execution of the individual query plan operations (evaluation primitives)
    • e.g. which algorithms and indices are to be used
  - the query execution plan is defined by a sequence of evaluation primitives
Ans: The query costs are defined by the time needed to answer a query (to process the query execution
plan). Different factors contribute to the query costs: disk access time, CPU time or even network
communication time. The costs are often dominated by the disk access time:
- seek time (tS) (~4 ms)
- transfer time (tT) (e.g. 0.1 ms per disk block)
- write operations are normally slower than read operations
For simplicity, we will use the number of block transfers and the number of seeks as the cost measure.
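The cost measure can be written down as a small helper. This is only a sketch: the function name query_cost_ms is hypothetical, and the constants are the example figures quoted above (4 ms per seek, 0.1 ms per block transfer), not measured values.

```python
T_S = 4.0  # seek time tS in ms (example figure from the text)
T_T = 0.1  # transfer time tT per disk block in ms (example figure from the text)


def query_cost_ms(block_transfers, seeks, t_s=T_S, t_t=T_T):
    """Estimated cost = number of seeks * tS + number of block transfers * tT."""
    return seeks * t_s + block_transfers * t_t


# Reading 1000 contiguous blocks after a single seek.
full_scan_ms = query_cost_ms(block_transfers=1000, seeks=1)

# Reading 10 blocks that each require their own seek.
random_reads_ms = query_cost_ms(block_transfers=10, seeks=10)

print(full_scan_ms, random_reads_ms)
```

With these figures, ten random block reads cost roughly 40% of a sequential scan over 1000 blocks, which is why the number of seeks matters as much as the number of blocks transferred.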
Selection Operation
The lowest-level query processing operator for accessing data is the file scan: search and retrieve
records for a given selection condition.
Linear search
Given a file with n blocks, we scan each block and check whether any records satisfy the condition.
- a selection on a candidate key attribute (unique) can be terminated after a record has been found
  - average costs: tS + n/2 * tT, worst case costs: tS + n * tT
- applicable to any file regardless of ordering, the availability of indices or the type of selection
operation
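A linear search over blocks might look like the sketch below. It assumes the file is modelled as a list of blocks (each block a list of records) and that the blocks are read one after another following a single seek; the function names and the sample file are made up for illustration.

```python
T_S = 4.0  # seek time tS in ms (example figure from the text)
T_T = 0.1  # transfer time tT per block in ms (example figure from the text)


def linear_search(blocks, condition, candidate_key=False):
    """Scan the blocks in order; return (matching records, number of blocks read)."""
    matches, blocks_read = [], 0
    for block in blocks:
        blocks_read += 1
        hits = [record for record in block if condition(record)]
        matches.extend(hits)
        if candidate_key and hits:
            break  # a unique key matches at most one record, so stop early
    return matches, blocks_read


def linear_cost_ms(blocks_read):
    """One seek to the start of the file, then one transfer per block read."""
    return T_S + blocks_read * T_T


# Hypothetical file: 5 blocks holding the keys 1..10, two records per block.
file_blocks = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]

key_hit, key_blocks = linear_search(file_blocks, lambda r: r == 5, candidate_key=True)
range_hits, range_blocks = linear_search(file_blocks, lambda r: r > 8)

print(key_hit, key_blocks, linear_cost_ms(key_blocks))
print(range_hits, range_blocks, linear_cost_ms(range_blocks))
```

The equality search on the unique key stops at the block that contains the match, while the range condition has to read all n blocks.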
Binary search
An equality selection condition on a file that is ordered on the selection attribute (n blocks) can be
realised via a binary search.
- note that this only works if we assume that the blocks of the file are stored contiguously!
- worst case costs: log2(n) * (tS + tT)
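A sketch of the binary search at block level, assuming the file is a list of non-empty blocks sorted on the key and that every probed block costs one seek plus one transfer. The record layout (key, payload) and the sample file are made up for illustration.

```python
def binary_search_blocks(blocks, key):
    """Binary search over sorted, non-empty blocks of (key, payload) records.

    Returns (record or None, number of blocks probed).
    """
    low, high, probes = 0, len(blocks) - 1, 0
    while low <= high:
        mid = (low + high) // 2
        block = blocks[mid]
        probes += 1  # each probe reads one block: one seek + one transfer
        if key < block[0][0]:
            high = mid - 1  # key can only be in an earlier block
        elif key > block[-1][0]:
            low = mid + 1  # key can only be in a later block
        else:
            # The key falls into this block's range: search inside the block.
            for record in block:
                if record[0] == key:
                    return record, probes
            return None, probes
    return None, probes


# Hypothetical ordered file: 8 blocks, two records per block, keys 1..16.
file_blocks = [
    [(k, f"row{k}"), (k + 1, f"row{k + 1}")] for k in range(1, 17, 2)
]

found, found_probes = binary_search_blocks(file_blocks, 11)
missing, missing_probes = binary_search_blocks(file_blocks, 100)

print(found, found_probes)
print(missing, missing_probes)
```

The number of probes grows logarithmically with the number of blocks, whereas the linear search reads up to n blocks. Each probe jumps to a different position in the file, which is why every probe is charged a seek.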
Sorting
Sorting in database systems is important for two reasons:
- a query may specify that the output should be sorted
- the processing of some relational query operations can be implemented more efficiently based on
sorted relations (e.g. the join operation)
For relations that fit into memory, techniques like quicksort can be used. For relations that do not
fit into memory, an external merge sort algorithm can be used.
External Merge Sort Example
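A minimal sketch of an external merge sort, under simplifying assumptions: the relation is a Python list, memory_capacity is the number of records that fit into memory at once, the sorted runs are kept as in-memory lists that stand in for temporary files on disk, and fan_in is the number of runs merged at a time. All names and the sample values are made up for illustration.

```python
import heapq


def external_merge_sort(records, memory_capacity, fan_in=2):
    """Return (sorted records, number of merge passes)."""
    if memory_capacity < 1 or fan_in < 2:
        raise ValueError("memory_capacity must be >= 1 and fan_in >= 2")

    # Phase 1: read memory-sized chunks, sort each in memory, write it as a run.
    runs = [
        sorted(records[start:start + memory_capacity])
        for start in range(0, len(records), memory_capacity)
    ]

    # Phase 2: repeatedly merge groups of fan_in runs until one run is left.
    passes = 0
    while len(runs) > 1:
        runs = [
            list(heapq.merge(*runs[i:i + fan_in]))
            for i in range(0, len(runs), fan_in)
        ]
        passes += 1

    return (runs[0] if runs else []), passes


# Hypothetical relation of 12 keys with room for only 3 records in memory.
relation = [24, 19, 31, 33, 14, 16, 21, 3, 2, 7, 16, 4]
sorted_relation, merge_passes = external_merge_sort(relation, memory_capacity=3)

print(sorted_relation, merge_passes)
```

With 12 records and room for 3 in memory, phase 1 produces 4 sorted runs; merging two runs at a time then needs two passes (4 runs to 2 runs to 1 run).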
Join Operation
nested-loop join
Nested-Loop Join
Merge join
Hash join
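The three join algorithms named above can be sketched as follows for an equi-join. These are simplified in-memory versions that ignore blocks and disk I/O; relations are lists of tuples, r_key and s_key are the positions of the join attributes, and the sample data is made up.

```python
from collections import defaultdict


def nested_loop_join(r, s, r_key, s_key):
    """Compare every tuple of r with every tuple of s."""
    return [(tr, ts) for tr in r for ts in s if tr[r_key] == ts[s_key]]


def merge_join(r, s, r_key, s_key):
    """Sort both inputs on the join attribute, then scan them in step."""
    r = sorted(r, key=lambda t: t[r_key])
    s = sorted(s, key=lambda t: t[s_key])
    result, i, j = [], 0, 0
    while i < len(r) and j < len(s):
        if r[i][r_key] < s[j][s_key]:
            i += 1
        elif r[i][r_key] > s[j][s_key]:
            j += 1
        else:
            # Pair r[i] with the whole group of s-tuples sharing its key.
            k = j
            while k < len(s) and s[k][s_key] == r[i][r_key]:
                result.append((r[i], s[k]))
                k += 1
            i += 1  # j stays at the group start for the next r-tuple
    return result


def hash_join(r, s, r_key, s_key):
    """Build a hash table on s, then probe it with each tuple of r."""
    buckets = defaultdict(list)
    for ts in s:
        buckets[ts[s_key]].append(ts)
    return [(tr, ts) for tr in r for ts in buckets.get(tr[r_key], [])]


# Made-up sample relations: (name, dno) and (dnumber, dname).
employees = [("Smith", 5), ("Wong", 5), ("Zelaya", 4), ("Borg", 1)]
departments = [(5, "Research"), (4, "Administration"), (7, "Sales")]

nl = nested_loop_join(employees, departments, 1, 0)
mj = merge_join(employees, departments, 1, 0)
hj = hash_join(employees, departments, 1, 0)

print(sorted(nl) == sorted(mj) == sorted(hj), len(nl))
```

All three produce the same set of tuple pairs; they differ in how many comparisons they need and in whether they require sorted input or extra memory for the hash table.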
There are two approaches to evaluating a query execution tree:
- materialisation: compute the result of an evaluation primitive and materialise (store) the new
relation on the disk
- pipelining: pass on tuples to parent operations even while an operation is still being executed
Materialisation
Evaluate one operation after another, starting at the leaf nodes of the query expression tree.
Materialise intermediate results in temporary relations and use those for evaluating the operations at
the next level.
Pipelining
Pipelining evaluates multiple operations simultaneously by passing the results of one operation to the
next one without storing the tuples on the disk.
This is much cheaper than materialisation since no I/O operations are needed for temporary relations.
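The difference between the two evaluation strategies can be illustrated with Python lists and generators. This is only an analogy: a list stands in for a temporary relation written to disk, a generator for a pipelined operator, and the relation and operator names are made up.

```python
# Made-up relation: (name, salary).
employees = [("Smith", 30000), ("Wong", 40000), ("Borg", 55000)]

# Materialisation: each operator finishes and stores its full result
# before the next operator starts.
temp_selected = [t for t in employees if t[1] > 35000]   # temporary relation 1
materialised = [(t[0],) for t in temp_selected]          # temporary relation 2

# Pipelining: each operator hands tuples to its parent one at a time.
scanned = []  # records which tuples the scan has produced so far


def scan(relation):
    for t in relation:
        scanned.append(t)
        yield t


def select(tuples, predicate):
    for t in tuples:
        if predicate(t):
            yield t


def project(tuples, columns):
    for t in tuples:
        yield tuple(t[c] for c in columns)


pipeline = project(select(scan(employees), lambda t: t[1] > 35000), [0])

first = next(pipeline)                    # first result tuple
scanned_for_first = len(scanned)          # tuples scanned up to that point
pipelined = [first] + list(pipeline)      # drain the rest of the pipeline

print(first, scanned_for_first, pipelined == materialised)
```

The pipelined plan delivers its first result after scanning only two of the three tuples, and no intermediate relation is ever stored.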
There are alternative ways of evaluating a given query:
- different equivalent expressions (query expression trees)
- different potential algorithms for each operation of the expression
Types of Query optimization
1. Cost based
2. Heuristic based
Cost based
(1) generate logically equivalent expressions by using a set of equivalence rules
(2) annotate the expressions to get alternative query evaluation plans (e.g. which algorithms to be used)
(3) select the cheapest plan based on the estimated costs
The cost estimation is based on statistical information from the catalogue manager in combination with
the expected performance of the algorithms.
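Step (3) can be sketched for a single equality selection by estimating the cost of each applicable algorithm and keeping the cheapest. The sketch reuses the worst case formulas and example timings given earlier in these notes; rounding log2(n) up to a whole number of probes is an assumption, as are the function names and the idea that the number of blocks and the sort order come from the catalogue.

```python
import math

T_S = 4.0  # seek time tS in ms (example figure from the notes)
T_T = 0.1  # transfer time tT per block in ms (example figure from the notes)


def linear_search_cost(n_blocks):
    """Worst case costs of a linear search: tS + n * tT."""
    return T_S + n_blocks * T_T


def binary_search_cost(n_blocks):
    """Worst case costs of a binary search: log2(n) * (tS + tT), rounded up."""
    return math.ceil(math.log2(n_blocks)) * (T_S + T_T)


def choose_plan(n_blocks, file_is_sorted):
    """Return (name, estimated cost in ms) of the cheapest applicable plan."""
    plans = {"linear search": linear_search_cost(n_blocks)}
    if file_is_sorted and n_blocks > 1:
        # Binary search is only applicable to a file ordered on the attribute.
        plans["binary search"] = binary_search_cost(n_blocks)
    best = min(plans, key=plans.get)
    return best, plans[best]


large_sorted = choose_plan(1000, file_is_sorted=True)
large_unsorted = choose_plan(1000, file_is_sorted=False)
small_sorted = choose_plan(4, file_is_sorted=True)

print(large_sorted, large_unsorted, small_sorted)
```

For a very small file the linear search wins even when the file is sorted, because every probe of the binary search pays for a seek. This is why the choice depends on statistics about the data and not only on the query.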
Equivalence Rules
A DBMS may use some heuristics to reduce the number of cost-based choices.
A heuristic optimisation transforms the query expression tree by using a set of rules that typically
improve the execution performance:
- perform the most restrictive selection and join operations (smallest result size) before other
operations
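The effect of performing a restrictive selection early can be sketched by counting the tuple comparisons of a simple nested-loop join. The relations, the join on positions 1 and 0, and the stats counter are made up for illustration.

```python
def nested_loop_join(r, s, stats):
    """Join r (name, dno) with s (dnumber, dname) and count the comparisons."""
    result = []
    for tr in r:
        for ts in s:
            stats["comparisons"] += 1
            if tr[1] == ts[0]:
                result.append(tr + ts)
    return result


# Made-up sample relations.
employees = [("Smith", 5), ("Wong", 5), ("Zelaya", 4), ("Borg", 1)]
departments = [(5, "Research"), (4, "Administration"), (1, "Headquarters")]

# Plan A: join first, then select the 'Research' tuples.
late_stats = {"comparisons": 0}
joined = nested_loop_join(employees, departments, late_stats)
late_result = [t for t in joined if t[3] == "Research"]

# Plan B: apply the selection first, then join the reduced relation.
early_stats = {"comparisons": 0}
research_only = [d for d in departments if d[1] == "Research"]
early_result = nested_loop_join(employees, research_only, early_stats)

print(late_result == early_result)
print(late_stats["comparisons"], early_stats["comparisons"])
```

Both plans return the same tuples, but the plan that selects first joins against a much smaller relation and produces no intermediate tuples that are thrown away afterwards.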