Module 4 DBMS
SQL stands for Structured Query Language. It is used for storing and managing data in a relational database management system (RDBMS) and is the standard language for relational database systems. It enables a user to create, read, update, and delete relational databases and tables. All RDBMSs, such as MySQL, Informix, Oracle, MS Access, and SQL Server, use SQL as their standard database language. SQL allows users to query the database in a number of ways, using English-like statements.
Rules:
Structured Query Language is not case sensitive, although SQL keywords are conventionally written in uppercase. SQL statements are not tied to text lines: a single SQL statement can be written on one text line or across several. Using SQL statements, you can perform most of the actions in a database. SQL is based on tuple relational calculus and relational algebra.
SQL process:
When an SQL command is executed on any RDBMS, the system figures out the best way to carry out the request, and the SQL engine determines how to interpret the task. Various components take part in this process, such as the optimization engine, the query engine, and the query dispatcher. All the non-SQL queries are handled by the classic query engine, but the SQL query engine won't handle logical files.
Characteristics of SQL
SQL is easy to learn. It is used to access data from relational database management systems and can execute queries against the database. SQL is used to describe the data, to define the data in the database and manipulate it when needed, and to create and drop databases and tables. It is also used to create views, stored procedures, and functions in a database, and it allows users to set permissions on tables, procedures, and views.
Advantages of SQL
1) High speed: Using SQL queries, the user can quickly and efficiently retrieve a large number of records from a database.
2) No coding needed: Standard SQL makes it very easy to manage a database system; it does not require a substantial amount of code.
3) Well-defined standards: SQL databases use long-established standards that have been adopted by ISO and ANSI.
4) Portability: SQL can be used on laptops, PCs, servers, and even some mobile phones.
5) Multiple data views: Using the SQL language, users can define different views of the database structure.
SQL Data Types
An SQL data type defines the values that a column can contain. Every column in a database table is required to have a name and a data type.
SQL Commands
SQL commands are instructions used to communicate with the database and to perform specific tasks, functions, and queries on data. SQL can perform various tasks such as creating a table, adding data to tables, dropping a table, modifying a table, and setting permissions for users.
There are five types of SQL commands: DDL, DML, DCL, TCL, and DQL.
1. Data Definition Language (DDL)
o DDL changes the structure of a table: creating a table, deleting a table, altering a table, and so on.
o All DDL commands are auto-committed, which means they permanently save all the changes in the database.
Here are some commands that come under DDL:
o CREATE
o ALTER
o DROP
o TRUNCATE
a. CREATE: It is used to create a new table in the database.
Syntax:
CREATE TABLE TABLE_NAME (COLUMN_NAME DATATYPE, ...);
Example:
CREATE TABLE EMPLOYEE (Name VARCHAR2(20), Email VARCHAR2(100), DOB DATE);
b. DROP: It is used to delete both the structure of a table and the records stored in it.
Syntax:
DROP TABLE table_name;
Example:
DROP TABLE EMPLOYEE;
c. ALTER: It is used to alter the structure of the database. This change could be either to
modify the characteristics of an existing attribute or probably to add a new attribute.
Syntax (to add a new column):
ALTER TABLE table_name ADD column_name COLUMN-definition;
Syntax (to modify an existing column):
ALTER TABLE table_name MODIFY (column_definitions ...);
Example:
ALTER TABLE STU_DETAILS ADD (ADDRESS VARCHAR2(20));
d. TRUNCATE: It is used to delete all the rows from a table while keeping the table structure intact.
Syntax:
TRUNCATE TABLE table_name;
Example:
TRUNCATE TABLE EMPLOYEE;
2. Data Manipulation Language (DML)
o DML commands are used to modify the database. They are responsible for all forms of changes in the database.
o DML commands are not auto-committed, which means they do not permanently save the changes to the database on their own; they can be rolled back.
Here are some commands that come under DML:
o INSERT
o UPDATE
o DELETE
a. INSERT: The INSERT statement is an SQL query used to insert data into a row of a table.
Syntax:
INSERT INTO TABLE_NAME (col1, col2, col3, ..., colN) VALUES (value1, value2, value3, ..., valueN);
Or
INSERT INTO TABLE_NAME VALUES (value1, value2, value3, ..., valueN);
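For instance, with a students table like the one used in the UPDATE example below (the column names are illustrative):
INSERT INTO students (Student_Id, User_Name) VALUES ('3', 'Jassy');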
b. UPDATE: This command is used to update or modify the value of a column in the
table.
Syntax:
UPDATE table_name SET column_name1 = value1 [, column_name2 = value2, ...] [WHERE condition];
Example:
UPDATE students SET User_Name = 'Scott' WHERE Student_Id = '3';
c. DELETE: It is used to remove one or more rows from a table.
Syntax:
DELETE FROM table_name [WHERE condition];
Example:
DELETE FROM students WHERE User_Name = 'Scott';
3. Data Control Language (DCL)
DCL commands are used to grant authority to, and take it back from, database users. Here are some commands that come under DCL:
o GRANT
o REVOKE
a. GRANT: It is used to give users access privileges to a database.
Example:
GRANT SELECT, UPDATE ON MY_TABLE TO SOME_USER, ANOTHER_USER;
b. REVOKE: It is used to take back permissions from a user.
Example:
REVOKE SELECT, UPDATE ON MY_TABLE FROM USER1, USER2;
4. Transaction Control Language (TCL)
TCL commands can only be used with DML commands such as INSERT, DELETE, and UPDATE. DDL operations are automatically committed in the database, which is why TCL commands cannot be used while creating tables or dropping them. Here are some commands that come under TCL:
o COMMIT
o ROLLBACK
o SAVEPOINT
a. Commit: Commit command is used to save all the transactions to the database.
Syntax:
COMMIT;
Example:
DELETE FROM CUSTOMERS WHERE AGE = 25;
COMMIT;
b. Rollback: Rollback command is used to undo transactions that have not already been
saved to the database.
Syntax:
ROLLBACK;
Example:
DELETE FROM CUSTOMERS WHERE AGE = 25;
ROLLBACK;
c. SAVEPOINT: It is used to roll the transaction back to a certain point without rolling
back the entire transaction.
Syntax:
SAVEPOINT SAVEPOINT_NAME;
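SAVEPOINT is normally used together with ROLLBACK TO. A short sketch of the three TCL commands working together (the CUSTOMERS table and its values are illustrative):
INSERT INTO CUSTOMERS VALUES (1, 'Ramesh');
SAVEPOINT sp1;
INSERT INTO CUSTOMERS VALUES (2, 'Suresh');
ROLLBACK TO sp1; -- undoes only the second insert
COMMIT;          -- makes the first insert permanent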
5. Data Query Language (DQL)
DQL is used to fetch data from the database. It uses only one command:
o SELECT
a. SELECT: This is the same as the projection operation of relational algebra. It is used to select attributes based on the condition described by the WHERE clause.
Syntax:
SELECT expressions FROM tables [WHERE conditions];
For example:
SELECT emp_name FROM employee WHERE age > 20;
Database Objects
A database object is a structure used to hold or reference data. Anything we make with the CREATE command is known as a database object, and it can be used to hold and manipulate data. Some examples of database objects are tables, views, indexes, sequences, and synonyms.
Example (creating a view):
CREATE VIEW salvu50 AS SELECT employee_id ID_NUMBER, last_name NAME,
salary*12 ANN_SALARY FROM employees WHERE department_id = 50;
Example (creating an index):
CREATE INDEX emp_last_name_idx ON employees(last_name);
Syntax:
CREATE [PUBLIC] SYNONYM synonym FOR object;
In the syntax:
PUBLIC : creates a synonym accessible to all users
synonym : is the name of the synonym to be created
object : identifies the object for which the synonym is created
Example (creating a synonym):
CREATE SYNONYM d_sum FOR dept_sum_vu;
Views
Views in SQL are considered a virtual table. A view also contains rows and columns. To create a view, we can select fields from one or more tables present in the database. A view can contain either specific rows, based on a certain condition, or all the rows of a table.
Student_Detail
STU_ID NAME ADDRESS
1 Stephan Delhi
2 Kathrin Noida
3 David Ghaziabad
4 Alina Gurugram
Student_Marks
ID NAME MARKS AGE
1 Stephan 97 19
2 Kathrin 86 21
3 David 74 18
4 Alina 90 20
5 John 96 18
1. Creating a View
A view can be created using the CREATE VIEW statement. A view can be created from a single table or from multiple tables.
Syntax:
CREATE VIEW view_name AS SELECT column1, column2, ... FROM table_name WHERE condition;
Example:
CREATE VIEW DetailsView AS SELECT NAME, ADDRESS FROM Student_Detail WHERE STU_ID < 4;
Just like a table, the view can be queried to display its data:
SELECT * FROM DetailsView;
Output:
NAME ADDRESS
Stephan Delhi
Kathrin Noida
David Ghaziabad
A view from multiple tables can be created by simply including multiple tables in the SELECT statement.
In the following example, a view named MarksView is created from the two tables Student_Detail and Student_Marks.
Query:
CREATE VIEW MarksView AS SELECT Student_Detail.NAME, Student_Detail.ADDRESS, Student_Marks.MARKS FROM Student_Detail, Student_Marks WHERE Student_Detail.NAME = Student_Marks.NAME;
To display the data of the view:
SELECT * FROM MarksView;
NAME ADDRESS MARKS
Stephan Delhi 97
Kathrin Noida 86
David Ghaziabad 74
Alina Gurugram 90
2. Deleting a View
A view can be deleted using the DROP VIEW statement.
Syntax:
DROP VIEW view_name;
SQL SEQUENCES
A sequence is a database object that automatically generates an ordered series of unique numeric values. Sequences are most often used to generate primary key values.
Example:
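The following is a typical Oracle-style sequence definition and use (the names and settings here are illustrative):
CREATE SEQUENCE dept_deptid_seq
START WITH 120
INCREMENT BY 10
MAXVALUE 9999
NOCACHE
NOCYCLE;
INSERT INTO departments (department_id, department_name) VALUES (dept_deptid_seq.NEXTVAL, 'Support');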
Indexes are special lookup tables used to retrieve data from the database very quickly. An index speeds up SELECT queries and WHERE clauses, but it slows down data input with INSERT and UPDATE statements. Indexes can be created or dropped without affecting the data. An index in a database is just like the index at the back of a book: to find all pages in a book that discuss a certain topic, you first refer to the index, which lists all the topics alphabetically, and are then referred to one or more specific page numbers.
SYNONYM
A SYNONYM provides another name for a database object, referred to as the original object, which may exist on the local server or on another server. A synonym belongs to a schema, and its name should be unique within that schema. A synonym cannot be the original object for another synonym, and a synonym cannot refer to a user-defined function.
The query below returns one row for each synonym in the database. It provides details about synonym metadata, such as the name of the synonym and the name of the base object.
SELECT * FROM sys.synonyms;
Assertions in DBMS
When a constraint involves two or more tables, the table constraint mechanism is sometimes awkward and the results may not come out as expected. To cover such situations, SQL supports the creation of assertions, which are constraints not associated with only one table. An assertion statement ensures that a certain condition will always hold in the database. The DBMS checks the assertion whenever modifications are made to the corresponding tables.
• Assertions are conditions that the database must always satisfy.
• An assertion can involve one or more tables and one or more attributes. Some constraints cannot be expressed by using only domain constraints or referential-integrity constraints; for example,
• "Every department must have at least five courses offered every semester" must be expressed as an assertion.
Syntax:
CREATE ASSERTION assertion_name CHECK (condition);
Eg: The total length of all movies by a given studio shall not exceed 10,000 minutes.
CREATE ASSERTION sumLength CHECK (10000 >= ALL (SELECT SUM(length) FROM Movies GROUP BY studioName));
Cursor
A cursor is a mechanism that provides a way to select multiple rows of data from the database and then process each row individually inside a PL/SQL program. The cursor first points at row 1; once that row is processed, it advances to row 2, and so on.
TYPES OF CURSORS
1. IMPLICIT
2. EXPLICIT
IMPLICIT
• Implicit cursors are created by default when DML statements such as INSERT, UPDATE, and DELETE are executed.
• The user is not aware of this happening and cannot control or process the information.
• When an implicit cursor is working, the DBMS performs the open, fetch, and close operations automatically.
EXPLICIT
• Explicit cursors are programmer-defined cursors for gaining more control over the context area.
• An explicit cursor should be defined in the declaration section of the PL/SQL block, as in the sketch below.
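A minimal PL/SQL sketch of an explicit cursor, assuming an employees table with employee_id and last_name columns (the names here are illustrative):
DECLARE
   CURSOR c_emp IS SELECT employee_id, last_name FROM employees;
   v_id   employees.employee_id%TYPE;
   v_name employees.last_name%TYPE;
BEGIN
   OPEN c_emp;                        -- open the cursor
   LOOP
      FETCH c_emp INTO v_id, v_name;  -- fetch one row at a time
      EXIT WHEN c_emp%NOTFOUND;       -- stop when no rows remain
      DBMS_OUTPUT.PUT_LINE(v_id || ' ' || v_name);
   END LOOP;
   CLOSE c_emp;                       -- release the context area
END;
/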
Triggers are stored programs that are automatically executed, or fired, when certain events occur. Triggers are written to be executed in response to any of the following events −
• A database manipulation (DML) statement (DELETE, INSERT, or UPDATE)
• A database definition (DDL) statement (CREATE, ALTER, or DROP).
• A database operation (SERVERERROR, LOGON, LOGOFF, STARTUP, or
SHUTDOWN).
Triggers can be defined on the table, view, schema, or database with which the event is
associated.
• User events (such as logon and logoff).
Benefits of Triggers
Triggers can be written for purposes such as automatically generating derived column values, enforcing referential integrity, event logging, auditing, and preventing invalid transactions.
Creating Triggers
The syntax for creating a trigger is −
CREATE [OR REPLACE] TRIGGER trigger_name
{BEFORE | AFTER | INSTEAD OF}
{INSERT [OR] | UPDATE [OR] | DELETE}
[OF col_name]
ON table_name
[REFERENCING OLD AS o NEW AS n]
[FOR EACH ROW]
WHEN (condition)
DECLARE
   Declaration-statements
BEGIN
   Executable-statements
EXCEPTION
   Exception-handling-statements
END;
Where,
• CREATE [OR REPLACE] TRIGGER trigger_name − Creates or replaces an existing
trigger with the trigger_name.
• {BEFORE | AFTER | INSTEAD OF} − This specifies when the trigger will be
executed. The INSTEAD OF clause is used for creating trigger on a view.
• {INSERT [OR] | UPDATE [OR] | DELETE} − This specifies the DML operation.
• [OF col_name] − This specifies the column name that will be updated.
• [ON table_name] − This specifies the name of the table associated with the
trigger.
• [REFERENCING OLD AS o NEW AS n] − This allows you to refer to new and old values for various DML statements, such as INSERT, UPDATE, and DELETE.
• [FOR EACH ROW] − This specifies a row-level trigger, i.e., the trigger will be
executed for each row being affected. Otherwise the trigger will execute just
once when the SQL statement is executed, which is called a table level trigger.
• WHEN (condition) − This provides a condition for rows for which the trigger
would fire. This clause is valid only for row-level triggers.
Example
The following program creates a row-level trigger for the customers table that fires for INSERT, UPDATE, or DELETE operations performed on the CUSTOMERS table. The trigger displays the salary difference between the old values and the new values −
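A sketch of such a trigger, assuming the CUSTOMERS table has ID and SALARY columns:
CREATE OR REPLACE TRIGGER display_salary_changes
BEFORE DELETE OR INSERT OR UPDATE ON customers
FOR EACH ROW
WHEN (NEW.ID > 0)
DECLARE
   sal_diff NUMBER;
BEGIN
   -- :OLD and :NEW hold the row values before and after the change
   sal_diff := :NEW.salary - :OLD.salary;
   DBMS_OUTPUT.PUT_LINE('Old salary: ' || :OLD.salary);
   DBMS_OUTPUT.PUT_LINE('New salary: ' || :NEW.salary);
   DBMS_OUTPUT.PUT_LINE('Salary difference: ' || sal_diff);
END;
/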
Types of Triggers
Row Triggers − A row trigger is fired once for each row affected by the triggering statement.
BEFORE triggers run the trigger action before the triggering statement is run. They are useful in situations such as eliminating unnecessary processing or deriving specific column values before the statement completes.
AFTER triggers run the trigger action after the triggering statement is run.
Triggers may also be fired by:
• System events
• User events
• DDL statements
• DML statements
Stored Procedures
A stored procedure is a named block of SQL and procedural statements that is stored in the database and can be invoked whenever needed. The most important part is the parameters, which are used to pass values to the procedure. There are three different types of parameters:
1. IN:
This is the default parameter mode for a procedure. It always receives values from the calling program.
2. OUT:
This parameter always sends values to the calling program.
3. IN OUT:
This parameter performs both operations: it receives values from the calling program as well as sends values to it.
Example:
Imagine a table named emp_table stored in the database. We write a procedure to increase the salary of an employee by 1000, as sketched below.
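A minimal sketch of such a procedure, assuming emp_table has columns emp_id and salary (the names here are illustrative):
CREATE OR REPLACE PROCEDURE update_salary (p_emp_id IN NUMBER)
IS
BEGIN
   -- raise the given employee's salary by 1000
   UPDATE emp_table SET salary = salary + 1000 WHERE emp_id = p_emp_id;
END;
/
-- Calling the procedure for employee 101:
BEGIN
   update_salary(101);
END;
/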
Stored Functions
A stored function is a set of SQL statements that performs some operation and returns a single value. Just like MySQL's built-in functions, it can be called from within a MySQL statement. By default, a stored function is associated with the default database. The CREATE FUNCTION statement requires the CREATE ROUTINE database privilege.
Syntax:
The syntax for the CREATE FUNCTION statement is:
CREATE FUNCTION function_name (func_parameter1 type, func_parameter2 type, ...)
RETURNS datatype
[characteristics]
func_body
Parameters used:
1. function_name:
It is the name by which the stored function is called. The name should not be the same as a native (built-in) function. To associate the routine explicitly with a specific database, the function name should be given as database_name.func_name.
2. func_parameter:
It is the argument whose value is used by the function inside its body. You cannot specify IN, OUT, or INOUT for these parameters. The parameter declaration inside the parentheses is given as func_parameter type, where type is a valid MySQL data type.
3. datatype:
It is the data type of the value returned by the function.
4. characteristics:
The CREATE FUNCTION statement is accepted only if at least one of the characteristics {DETERMINISTIC, NO SQL, or READS SQL DATA} is specified in its declaration.
Example:
The following function finds the number of years an employee has been with the company; a sketch of its body follows.
DELIMITER //
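-- A sketch of such a function; it assumes the employee's joining date is passed in as a DATE.
CREATE FUNCTION no_of_years (hire_date DATE) RETURNS INT
DETERMINISTIC
BEGIN
   -- TIMESTAMPDIFF returns the number of whole years between the two dates
   RETURN TIMESTAMPDIFF(YEAR, hire_date, CURDATE());
END //
DELIMITER ;
-- Usage (assuming an employee table with a hire_date column):
-- SELECT emp_name, no_of_years(hire_date) AS years_in_company FROM employee;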
Embedded SQL
• Embedded SQL allows SQL statements to be placed inside a host-language program (for example, C). Host variables carry data between the program and the database.
• To be set apart, host variables must be defined within a special section known as a declare section:
EXEC SQL BEGIN DECLARE SECTION;
char EmployeeID[7];
double Salary;
EXEC SQL END DECLARE SECTION;
• Each host variable must be assigned a unique name, even if the variables are declared in different declare sections.
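A short sketch of how a host variable is then used inside an embedded SQL statement (the Employee table is illustrative; the leading colon marks a host variable):
EXEC SQL SELECT Salary INTO :Salary
         FROM Employee
         WHERE EmployeeID = :EmployeeID;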
Dynamic SQL
• Objective: composing and executing new (not previously compiled) SQL statements at run-time. For example:
– a program accepts SQL statements from the keyboard at run-time;
– a point-and-click operation is translated into a certain SQL query.
• Dynamic update is relatively simple; a dynamic query can be complex, because the type and number of retrieved attributes are unknown at compile time.
Example
varchar sqlupdatestring[256];
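A sketch of how that statement string might be composed and executed at run-time (Pro*C-style embedded SQL; the table and values are illustrative):
strcpy((char *) sqlupdatestring.arr,
       "UPDATE employees SET salary = salary + 1000 WHERE employee_id = 101");
sqlupdatestring.len = strlen((char *) sqlupdatestring.arr);
EXEC SQL EXECUTE IMMEDIATE :sqlupdatestring;  /* parse and run the statement now */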
• Primary Storage − The memory storage that is directly accessible to the CPU
comes under this category. CPU's internal memory (registers), fast memory
(cache), and main memory (RAM) are directly accessible to the CPU, as they are
all placed on the motherboard or CPU chipset. This storage is typically very
small, ultra-fast, and volatile. Primary storage requires continuous power supply
in order to maintain its state. In case of a power failure, all its data is lost.
• Secondary Storage − Secondary storage devices are used to store data for
future use or as backup. Secondary storage includes memory devices that are
not a part of the CPU chipset or motherboard, for example, magnetic disks,
optical disks (DVD, CD, etc.), hard disks, flash drives, and magnetic tapes.
• Tertiary Storage − Tertiary storage is used to store huge volumes of data. Since
such storage devices are external to the computer system, they are the slowest
in speed. These storage devices are mostly used to take the back up of an entire
system. Optical disks and magnetic tapes are widely used as tertiary storage.
Memory Hierarchy
A computer system has a well-defined hierarchy of memory. A CPU has direct access to its main memory as well as to its inbuilt registers. The access time of main memory is obviously longer than one CPU cycle. To minimize this speed mismatch, cache memory is introduced. Cache memory provides the fastest access time, and it contains the data most frequently accessed by the CPU.
The memory with the fastest access is the costliest one. Larger storage devices offer
slow speed and they are less expensive, however they can store huge volumes of data
as compared to CPU registers or cache memory.
Storage Hierarchy
Besides the above, various other storage devices reside in the computer system. These
storage media are organized on the basis of data accessing speed, cost per unit of data
to buy the medium, and by medium's reliability. Thus, we can create a hierarchy of
storage media on the basis of its cost and speed.
In this hierarchy, the higher levels are expensive but fast. Moving down, the cost per bit decreases while the access time increases. The storage media from main memory upward are volatile, and everything below main memory is non-volatile.
Magnetic Disks
Hard disk drives are the most common secondary storage devices in present computer
systems. These are called magnetic disks because they use the concept of
magnetization to store information. Hard disks consist of metal disks coated with
magnetizable material. These disks are placed vertically on a spindle. A read/write head
moves in between the disks and is used to magnetize or de-magnetize the spot under
it. A magnetized spot can be recognized as 0 (zero) or 1 (one).
Hard disks are formatted in a well-defined order to store data efficiently. A hard disk
plate has many concentric circles on it, called tracks. Every track is further divided
into sectors. A sector on a hard disk typically stores 512 bytes of data.
RAID 0
In this level, a striped array of disks is implemented. The data is broken down into
blocks and the blocks are distributed among disks. Each disk receives a block of data to
write/read in parallel. It enhances the speed and performance of the storage device.
There is no parity and backup in Level 0.
RAID 1
RAID 1 uses mirroring techniques. When data is sent to a RAID controller, it sends a
copy of data to all the disks in the array. RAID level 1 is also called mirroring and
provides 100% redundancy in case of a failure.
RAID 2
RAID 2 records Error Correction Codes using Hamming distance for its data, striped on different disks. As in level 0, each data bit in a word is recorded on a separate disk, and the ECC codes of the data words are stored on a different set of disks. Due to its complex structure and high cost, RAID 2 is not commercially available.
RAID 3
RAID 3 stripes the data onto multiple disks. The parity bit generated for the data word is stored on a different disk. This technique makes it possible to recover from single disk failures.
RAID 4
In this level, an entire block of data is written onto data disks and then the parity is
generated and stored on a different disk. Note that level 3 uses byte-level striping,
whereas level 4 uses block-level striping. Both level 3 and level 4 require at least three
disks to implement RAID.
RAID 5
RAID 5 writes whole data blocks onto different disks, but the parity bits generated for the data block stripes are distributed among all the data disks rather than being stored on a dedicated parity disk.
RAID 6
RAID 6 is an extension of level 5. In this level, two independent parities are generated
and stored in distributed fashion among multiple disks. Two parities provide additional
fault tolerance. This level requires at least four disk drives to implement RAID.
File Structure
A file is a sequence of records stored in binary format. A disk drive is formatted into several
blocks that can store records. File records are mapped onto those disk blocks.
File Organization
File Organization defines how file records are mapped onto disk blocks. There are four types of file organization used to organize file records −
• Heap File Organization
• Sequential File Organization
• Hash File Organization
• Clustered File Organization
File Operations
Operations on database files can be broadly classified into two categories −
• Update Operations
• Retrieval Operations
Update operations change the data values by insertion, deletion, or update. Retrieval
operations, on the other hand, do not alter the data but retrieve them after optional
conditional filtering. In both types of operations, selection plays a significant role. Other
than creation and deletion of a file, there could be several operations, which can be
done on files.
• Open − A file can be opened in one of the two modes, read mode or write
mode. In read mode, the operating system does not allow anyone to alter data.
In other words, data is read only. Files opened in read mode can be shared
among several entities. Write mode allows data modification. Files opened in
write mode can be read but cannot be shared.
• Locate − Every file has a file pointer, which tells the current position where the
data is to be read or written. This pointer can be adjusted accordingly. Using find
(seek) operation, it can be moved forward or backward.
• Read − By default, when files are opened in read mode, the file pointer points to
the beginning of the file. There are options where the user can tell the operating
system where to locate the file pointer at the time of opening a file. The very
next data to the file pointer is read.
• Write − User can select to open a file in write mode, which enables them to edit
its contents. It can be deletion, insertion, or modification. The file pointer can be
located at the time of opening or can be dynamically changed if the operating
system allows doing so.
• Close − This is the most important operation from the operating system’s point
of view. When a request to close a file is generated, the operating system
o removes all the locks (if in shared mode),
o saves the data (if altered) to the secondary storage media, and
o releases all the buffers and file handlers associated with the file.
The organization of data inside a file plays a major role here. The process of locating the file pointer at a desired record inside a file varies based on whether the records are arranged sequentially or clustered.
Indexing
Data is stored in the form of records. Every record has a key field, which helps it to be
recognized uniquely.
Indexing is a data structure technique to efficiently retrieve records from the database
files based on some attributes on which the indexing has been done. Indexing in
database systems is similar to what we see in books.
Indexing is defined based on its indexing attributes. Indexing can be of the following
types −
• Primary Index − Primary index is defined on an ordered data file. The data file is
ordered on a key field. The key field is generally the primary key of the relation.
• Secondary Index − Secondary index may be generated from a field which is a
candidate key and has a unique value in every record, or a non-key with
duplicate values.
• Clustering Index − Clustering index is defined on an ordered data file. The data
file is ordered on a non-key field.
Ordered Indexing is of two types −
• Dense Index
• Sparse Index
Dense Index
In a dense index, there is an index record for every search-key value in the database. This makes searching faster but requires more space to store the index records themselves. Index records contain the search-key value and a pointer to the actual record on the disk.
Sparse Index
In a sparse index, index records are not created for every search key. An index record here contains a search key and an actual pointer to the data on the disk. To search a record, we first proceed by the index record to reach the actual location of the data. If the data we are looking for is not where we land by following the index, the system starts a sequential search until the desired data is found.
Multilevel Index
Index records comprise search-key values and data pointers. Multilevel index is stored
on the disk along with the actual database files. As the size of the database grows, so
does the size of the indices. There is an immense need to keep the index records in the
main memory so as to speed up the search operations. If single-level index is used,
then a large size index cannot be kept in memory which leads to multiple disk accesses.
Multi-level Index helps in breaking down the index into several smaller indices in order
to make the outermost level so small that it can be saved in a single disk block, which
can easily be accommodated anywhere in the main memory.
B+ Tree
A B+ tree is a balanced search tree that follows a multi-level index format. The leaf nodes of a B+ tree hold the actual data pointers. A B+ tree ensures that all leaf nodes remain at the same height, and thus the tree is balanced. Additionally, the leaf nodes are linked together in a linked list; therefore, a B+ tree supports both random access and sequential access.
Structure of B+ Tree
Every leaf node is at an equal distance from the root node. A B+ tree is of order n, where n is fixed for every B+ tree.
Internal nodes −
• Internal (non-leaf) nodes contain at least ⌈n/2⌉ pointers, except the root node.
• At most, an internal node can contain n pointers.
Leaf nodes −
• Leaf nodes contain at least ⌈n/2⌉ record pointers and ⌈n/2⌉ key values.
• At most, a leaf node can contain n record pointers and n key values.
• Every leaf node contains one block pointer P to point to the next leaf node, forming a linked list.
B+ Tree Insertion
• B+ trees are filled from the bottom, and each entry is made at a leaf node.
Hashing
For a huge database structure, it can be almost impossible to search all the index values through all the levels and then reach the destination data block to retrieve the desired data. Hashing is an effective technique for computing the direct location of a data record on the disk without using an index structure.
Hashing uses hash functions with search keys as parameters to generate the address of
a data record.
Hash Organization
• Bucket − A hash file stores data in bucket format. Bucket is considered a unit of
storage. A bucket typically stores one complete disk block, which in turn can
store one or more records.
• Hash Function − A hash function, h, is a mapping function that maps all the set
of search-keys K to the address where actual records are placed. It is a function
from search keys to bucket addresses.
Static Hashing
In static hashing, when a search-key value is provided, the hash function always computes the same address. For example, if a mod-4 hash function is used, it generates only four possible bucket addresses. The output address is always the same for a given key, and the number of buckets provided remains unchanged at all times.
Operation
• Insertion − When a record is required to be entered using static hash, the hash
function h computes the bucket address for search key K, where the record will
be stored.
Bucket address = h(K)
• Search − When a record needs to be retrieved, the same hash function can be
used to retrieve the address of the bucket where the data is stored.
• Delete − This is simply a search followed by a deletion operation.
Bucket Overflow
The condition of bucket overflow is known as a collision. This is a fatal state for any static hash function. In this case, overflow chaining can be used.
• Overflow Chaining − When buckets are full, a new bucket is allocated for the
same hash result and is linked after the previous one. This mechanism is
called Closed Hashing.
• Linear Probing − When a hash function generates an address at which data is
already stored, the next free bucket is allocated to it. This mechanism is
called Open Hashing.
Dynamic Hashing
The problem with static hashing is that it does not expand or shrink dynamically as the
size of the database grows or shrinks. Dynamic hashing provides a mechanism in which
data buckets are added and removed dynamically and on-demand. Dynamic hashing is
also known as extended hashing.
Hash function, in dynamic hashing, is made to produce a large number of values and
only a few are used initially.
Organization
The prefix of an entire hash value is taken as a hash index. Only a portion of the hash value is used for computing bucket addresses. Every hash index has a depth value to signify how many bits are used for computing a hash function. These bits can address 2^n buckets. When all these bits are consumed − that is, when all the buckets are full − the depth value is increased linearly and twice as many buckets are allocated.
Operation
• Querying − Look at the depth value of the hash index and use those bits to
compute the bucket address.
• Update − Perform a query as above and update the data.
• Deletion − Perform a query to locate the desired data and delete the same.
• Insertion − Compute the address of the bucket
o If the bucket is already full.
▪ Add more buckets.
▪ Add additional bits to the hash value.
▪ Re-compute the hash function.
o Else
▪ Add data to the bucket,
o If all the buckets are full, perform the remedies of static hashing.
Hashing is not favorable when the data is organized by some ordering and the queries require a range of data; it performs best when data is discrete and random. Hashing algorithms are more complex to implement than indexing, but all hash operations are done in constant time.
DBMS - Transaction
A transaction can be defined as a group of tasks. A single task is the minimum processing unit
which cannot be divided further.
Properties of transaction
Atomicity − This property states that a transaction must be treated as an atomic unit, that is,
either all of its operations are executed or none. There must be no state in a database where a
transaction is left partially completed. States should be defined either before the execution of
the transaction or after the execution/abortion/failure of the transaction.
Consistency − The database must remain in a consistent state after any transaction. No
transaction should have any adverse effect on the data residing in the database. If the database
was in a consistent state before the execution of a transaction, it must remain consistent after
the execution of the transaction as well.
Durability − The database should be durable enough to hold all its latest updates even if the
system fails or restarts. If a transaction updates a chunk of data in a database and commits, then
the database will hold the modified data. If a transaction commits but the system fails before the
data could be written on to the disk, then that data will be updated once the system springs
back into action.
Isolation − In a database system where more than one transaction is being executed simultaneously and in parallel, the property of isolation states that all the transactions will be carried out and executed as if each were the only transaction in the system. No transaction will affect the existence of any other transaction.
Serializability
When multiple transactions are being executed by the operating system in a multiprogramming
environment, there are possibilities that instructions of one transaction are interleaved with
some other transaction.
Serial Schedule − It is a schedule in which transactions are aligned in such a way that one
transaction is executed first. When the first transaction completes its cycle, then the next
transaction is executed. Transactions are ordered one after the other. This type of schedule is
called a serial schedule, as transactions are executed in a serial manner.
States of Transactions
During its lifetime, a transaction passes through several well-defined states: active, partially committed, committed, failed, and aborted.
Concurrent Transactions
A transaction is a unit of database processing which contains a set of operations. For example,
deposit of money, balance enquiry, reservation of tickets etc.
Every transaction starts with delimiters begin transaction and terminates with end transaction
delimiters. The set of operations within these two delimiters constitute one transaction.
There are three possible ways in which a transaction can be executed. These are as follows −
1. Serial execution.
2. Parallel execution.
3. Concurrent execution.
Advantages
1. It increases throughput, which is the number of transactions completed per unit time.
Disadvantage
The disadvantage is that the execution of concurrent transactions may result in inconsistency.
Lock-Based Protocol
In this type of protocol, a transaction cannot read or write data until it acquires an appropriate lock on it. There are two types of lock:
1. Shared lock:
It is also known as a read-only lock. With a shared lock, the data item can only be read by the transaction. The lock can be shared between transactions because, while holding it, a transaction cannot update the data item.
2. Exclusive lock:
With an exclusive lock, the data item can be both read and written by the transaction. The lock is exclusive: under it, multiple transactions cannot modify the same data simultaneously. An example of both modes is sketched below.
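Where the DBMS exposes locking directly (Oracle-style syntax shown purely for illustration), the two lock modes look like this:
LOCK TABLE employees IN SHARE MODE;     -- shared (read-only) lock on the table
LOCK TABLE employees IN EXCLUSIVE MODE; -- exclusive (read/write) lock on the table
SELECT salary FROM employees WHERE employee_id = 101 FOR UPDATE; -- row-level exclusive lock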
Schedule
1. Serial Schedule
A serial schedule is a type of schedule where one transaction is executed completely before another transaction starts. In a serial schedule, when the first transaction completes its cycle, the next transaction is executed.
For example, suppose there are two transactions T1 and T2, each with some operations. If there is no interleaving of operations, there are the following two possible outcomes:
1. Execute all the operations of T1, followed by all the operations of T2.
2. Execute all the operations of T2, followed by all the operations of T1.
In the given figure (a), Schedule A shows the serial schedule where T1 is followed by T2.
In the given figure (b), Schedule B shows the serial schedule where T2 is followed by T1.
2. Non-serial Schedule
It contains many possible orders in which the system can execute the individual operations of
the transactions.
In the given figure (c) and (d), Schedule C and Schedule D are the non-serial schedules. It has
interleaving of operations.
3. Serializable schedule
The serializability of schedules is used to find non-serial schedules that allow the transaction to
execute concurrently without interfering with one another.
It identifies which schedules are correct when executions of the transaction have interleaving of
their operations.
A non-serial schedule will be serializable if its result is equal to the result of its transactions
executed serially.
Locks
Simplistic lock protocols are the simplest way of locking data during a transaction. Simplistic lock-based protocols require every transaction to obtain a lock on the data before inserting, deleting, or updating it, and to unlock the data item after the transaction completes.
Pre-claiming lock protocols evaluate the transaction to list all the data items on which it needs locks. Before initiating execution, the transaction requests the DBMS for locks on all those data items. If all the locks are granted, this protocol allows the transaction to begin; when the transaction completes, it releases all the locks. If any lock is not granted, the transaction rolls back and waits until all the locks are granted.
The two-phase locking protocol (2PL) divides the execution of the transaction into three parts. In the first part, when execution starts, the transaction seeks permission for the locks it requires. In the second part, the transaction acquires all the locks. The third phase starts as soon as the transaction releases its first lock; in this phase, the transaction cannot demand any new locks and only releases the acquired locks.
The first phase of Strict-2PL is the same as in 2PL. After acquiring all the locks, the transaction continues to execute normally. The only difference between 2PL and Strict-2PL is that Strict-2PL does not release a lock immediately after using it; it waits until the whole transaction commits and then releases all the locks at one time. Strict-2PL therefore does not have a shrinking phase of lock release.
Growing phase: In the growing phase, a new lock on the data item may be acquired by the
transaction, but none can be released.
Shrinking phase: In the shrinking phase, existing lock held by the transaction may be released,
but no new locks can be acquired
Deadlock
In a multi-process system, deadlock is an unwanted situation that arises in a shared
resource environment, where a process indefinitely waits for a resource that is held by
another process.
For example, assume a set of transactions {T0, T1, T2, ..., Tn}. T0 needs a resource X to complete its task. Resource X is held by T1, and T1 is waiting for a resource Y, which is held by T2. T2 is waiting for resource Z, which is held by T0. Thus, all the processes wait for each other to release resources. In this situation, none of the processes can finish their task. This situation is known as a deadlock.
Deadlocks are not healthy for a system. In case a system is stuck in a deadlock, the
transactions involved in the deadlock are either rolled back or restarted.
Deadlock Prevention
To prevent any deadlock situation in the system, the DBMS aggressively inspects all the
operations, where transactions are about to execute. The DBMS inspects the operations
and analyzes if they can create a deadlock situation. If it finds that a deadlock situation
might occur, then that transaction is never allowed to be executed.
There are deadlock prevention schemes that use timestamp ordering mechanism of
transactions in order to predetermine a deadlock situation.
Wait-Die Scheme
In this scheme, if a transaction requests to lock a resource (data item) that is already held with a conflicting lock by another transaction, one of two possibilities may occur −
• If TS(Ti) < TS(Tj) − that is, Ti, which is requesting a conflicting lock, is older than Tj − then Ti is allowed to wait until the data item is available.
• If TS(Ti) > TS(Tj) − that is, Ti is younger than Tj − then Ti dies. Ti is restarted later with a random delay but with the same timestamp.
Wound-Wait Scheme
In this scheme, if a transaction requests to lock a resource (data item) that is already held with a conflicting lock by another transaction, one of two possibilities may occur −
• If TS(Ti) < TS(Tj), then Ti forces Tj to be rolled back − that is, Ti wounds Tj. Tj is restarted later with a random delay but with the same timestamp.
• If TS(Ti) > TS(Tj), then Ti is forced to wait until the resource is available.
This scheme allows the younger transaction to wait; but when an older transaction requests an item held by a younger one, the older transaction forces the younger one to abort and release the item. In both cases, the transaction that enters the system at a later stage is aborted.
Deadlock Avoidance
Aborting a transaction is not always a practical approach. Instead, deadlock avoidance mechanisms can be used to detect any deadlock situation in advance. Methods like the "wait-for graph" are available, but they are suitable only for systems where transactions are lightweight and hold fewer instances of resources. In a bulky system, deadlock prevention techniques may work well.
Wait-for Graph
This is a simple method to track whether any deadlock situation may arise. For each transaction entering the system, a node is created. When a transaction Ti requests a lock on an item, say X, which is held by some other transaction Tj, a directed edge is created from Ti to Tj. If Tj releases item X, the edge between them is dropped and Ti locks the data item.
The system maintains this wait-for graph for every transaction waiting for some data
items held by others. The system keeps checking if there's any cycle in the graph.
Here, we can use any of the two following approaches −
• First, do not allow any request for an item, which is already locked by another
transaction. This is not always feasible and may cause starvation, where a
transaction indefinitely waits for a data item and can never acquire it.
• The second option is to roll back one of the transactions. It is not always feasible to roll back the younger transaction, as it may be more important than the older one. With the help of some relative algorithm, a transaction is chosen to be aborted. This transaction is known as the victim, and the process is known as victim selection.
Data Backup
Loss of Volatile Storage
A volatile storage like RAM stores all the active logs, disk buffers, and related data. In
addition, it stores all the transactions that are being currently executed. What happens
if such a volatile storage crashes abruptly? It would obviously take away all the logs and
active copies of the database. It makes recovery almost impossible, as everything that is
required to recover the data is lost.
Following techniques may be adopted in case of loss of volatile storage −
• We can have checkpoints at multiple stages so as to save the contents of the
database periodically.
• A state of active database in the volatile memory can be
periodically dumped onto a stable storage, which may also contain logs and
active transactions and buffer blocks.
• <dump> can be marked on a log file whenever the database contents are dumped from the volatile memory to stable storage.
Recovery
• When the system recovers from a failure, it can restore the latest dump.
• It can maintain a redo-list and an undo-list as checkpoints.
• It can recover the system by consulting undo-redo lists to restore the state of all
transactions up to the last checkpoint.
Database Backup & Recovery from Catastrophic Failure
A catastrophic failure is one where a stable, secondary storage device gets corrupt.
With the storage device, all the valuable data that is stored inside is lost. We have two
different strategies to recover data from such a catastrophic failure −
• Remote backup − Here a backup copy of the database is stored at a remote location from where it can be restored in case of a catastrophe.
• Alternatively, database backups can be taken on magnetic tapes and stored at a
safer place. This backup can later be transferred onto a freshly installed database
to bring it to the point of backup.
Databases that have grown large are too bulky to be backed up frequently. In such cases, we have techniques to restore a database just by looking at its logs. So all we need to do here is take a backup of all the logs at frequent intervals. The database can be backed up once a week, while the logs, being very small, can be backed up every day or as frequently as possible.
Remote Backup
Remote backup provides a sense of security in case the primary location where the
database is located gets destroyed. Remote backup can be offline or real-time or
online. In case it is offline, it is maintained manually.
Online backup systems are more real-time and lifesavers for database administrators
and investors. An online backup system is a mechanism where every bit of the real-time
data is backed up simultaneously at two distant places. One of them is directly
connected to the system and the other one is kept at a remote place as backup.
As soon as the primary database storage fails, the backup system senses the failure and
switches the user system to the remote storage. Sometimes this is so instant that the
users can’t even realize a failure.
Data Recovery
Crash Recovery
DBMS is a highly complex system with hundreds of transactions being executed every
second. The durability and robustness of a DBMS depends on its complex architecture
and its underlying hardware and system software. If it fails or crashes amid transactions,
it is expected that the system would follow some sort of algorithm or techniques to
recover lost data.
Failure Classification
To see where the problem has occurred, a failure can be divided into various
categories, as follows −
Transaction failure
A transaction has to abort when it fails to execute or when it reaches a point from
where it can’t go any further. This is called transaction failure where only a few
transactions or processes are hurt.
Reasons for a transaction failure could be −
• Logical errors − Where a transaction cannot complete because it has some code
error or any internal error condition.
• System errors − Where the database system itself terminates an active
transaction because the DBMS is not able to execute it, or it has to stop because
of some system condition. For example, in case of deadlock or resource
unavailability, the system aborts an active transaction.
System Crash
There are problems − external to the system − that may cause the system to stop
abruptly and cause the system to crash. For example, interruptions in power supply
may cause the failure of underlying hardware or software failure.
Examples may include operating system errors.
Disk Failure
In early days of technology evolution, it was a common problem where hard-disk drives
or storage drives used to fail frequently.
Disk failures include formation of bad sectors, unreachability to the disk, disk head
crash or any other failure, which destroys all or a part of disk storage.
Storage Structure
We have already described the storage system. In brief, the storage structure can be
divided into two categories −
• Volatile storage − As the name suggests, a volatile storage cannot survive
system crashes. Volatile storage devices are placed very close to the CPU;
normally they are embedded onto the chipset itself. For example, main memory
and cache memory are examples of volatile storage. They are fast but can store
only a small amount of information.
• Non-volatile storage − These memories are made to survive system crashes.
They are huge in data storage capacity, but slower in accessibility. Examples may
include hard-disks, magnetic tapes, flash memory, and non-volatile (battery
backed up) RAM.
Log-based Recovery
Log is a sequence of records, which maintains the records of actions performed by a
transaction. It is important that the logs are written prior to the actual modification and
stored on a stable storage media, which is failsafe.
Log-based recovery works as follows −
• The log file is kept on a stable storage media.
• When a transaction enters the system and starts execution, it writes a log record about it:
<Tn, Start>
Checkpoint
Keeping and maintaining logs in real time and in real environment may fill out all the
memory space available in the system. As time passes, the log file may grow too big to
be handled at all. Checkpoint is a mechanism where all the previous logs are removed
from the system and stored permanently in a storage disk. Checkpoint declares a point
before which the DBMS was in consistent state, and all the transactions were
committed.
Recovery
When a system with concurrent transactions crashes and recovers, it behaves in the
following manner −
• The recovery system reads the logs backwards from the end to the last
checkpoint.
• It maintains two lists, an undo-list and a redo-list.
• If the recovery system sees a log with <Tn, Start> and <Tn, Commit>, or just <Tn, Commit>, it puts the transaction in the redo-list.
• If the recovery system sees a log with <Tn, Start> but no commit or abort record, it puts the transaction in the undo-list.
All the transactions in the undo-list are then undone and their logs are removed; all the transactions in the redo-list are redone and their previous logs are removed.
Database Recovery:
The database is prone to failures due to inconsistency, network failure, errors, or accidental damage. So database recovery techniques are highly important for bringing a database back into a working state after a failure. Four different recovery techniques are available for the database.
Immediate Backup
Immediate backups are kept on floppy disks, hard disks, or magnetic tapes. These come in handy when a technical fault occurs in the primary database, such as a system failure, disk crash, or network failure. Damage due to virus attacks can be repaired using the immediate backup.
Archival Backup
Archival backups are kept on mass storage devices such as magnetic tape, CD-ROMs, Internet servers, etc. They are very useful for recovering data after a disaster such as a fire, earthquake, or flood. An archival backup should be kept at a site different from the one where the system is functioning; kept at a separate place, it remains safe from theft and intentional destruction by user staff.
4. Shadow Paging:
This technique can be used for data recovery instead of transaction logs. In shadow paging, a database is divided into several fixed-size disk pages, say n; a current directory is then created, having n entries, with each entry pointing to a disk page in the database. The current directory is transferred to main memory.
When a transaction begins executing, the current directory is copied into a shadow directory, and the shadow directory is saved on disk. The transaction itself uses the current directory: during transaction execution, all modifications are made through the current directory, while the shadow directory is never modified.