
DBMS Tutorial

The DBMS Tutorial covers both basic and advanced concepts of Database Management Systems (DBMS), detailing its functions, characteristics, advantages, and disadvantages. It explains the evolution of databases from file-based systems to modern cloud and NoSQL databases, highlighting their unique features and use cases. Additionally, it discusses relational databases and their properties, emphasizing the importance of data management and security.

Uploaded by

Aims Yendada
Copyright
© © All Rights Reserved

DBMS Tutorial

This DBMS tutorial covers basic and advanced database concepts. It is designed for both beginners and professionals.

A database management system (DBMS) is software that is used to manage a database.

The tutorial covers DBMS topics such as the introduction, ER model, keys, relational model, join operation, SQL, functional dependency, transactions, and concurrency control.

What is Database
A database is a collection of interrelated data organized so that data can be retrieved, inserted, and deleted efficiently. It also organizes the data in the form of tables, schemas, views, reports, etc.

For example, a college database organizes data about the admin, staff, students, faculty, etc.

Using the database, you can easily retrieve, insert, and delete information.

Database Management System


o A database management system is software used to manage a database. For example, MySQL and Oracle are popular database systems used in many different applications.
o A DBMS provides an interface for operations such as creating a database, creating tables in it, storing data, updating data, and much more.
o It provides protection and security for the database. When there are multiple users, it also maintains data consistency.

A DBMS allows users to perform the following tasks:

o Data Definition: Creating, modifying, and removing the definitions that describe how data is organized in the database.
o Data Update: Inserting, modifying, and deleting the actual data in the database.
o Data Retrieval: Retrieving data from the database so that applications can use it for various purposes.
o User Administration: Registering and monitoring users, maintaining data integrity, enforcing data security, handling concurrency control, monitoring performance, and recovering information corrupted by unexpected failure.
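
The first three tasks can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the `student` table, its columns, and the sample rows are made up for the example, and SQLite stands in for whichever DBMS is actually used.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# Data definition: create the structure that will hold the data.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")

# Data update: insert, modify, and delete actual rows.
conn.executemany(
    "INSERT INTO student (id, name, age) VALUES (?, ?, ?)",
    [(1, "Ajeet", 24), (2, "Aryan", 20), (3, "Mahesh", 21)],
)
conn.execute("UPDATE student SET age = 25 WHERE id = 1")
conn.execute("DELETE FROM student WHERE id = 3")
conn.commit()

# Data retrieval: read the data back for use by an application.
rows = conn.execute("SELECT id, name, age FROM student ORDER BY id").fetchall()
print(rows)
```

User administration is not shown because SQLite has no user accounts; server-based systems handle that task with their own commands.
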
Characteristics of DBMS
o It uses a digital repository established on a server to store and manage information.
o It provides a clear and logical view of the processes that manipulate data.
o It includes automatic backup and recovery procedures.
o It supports ACID properties, which keep data in a healthy state in case of failure.
o It can reduce the complexity of relationships between data.
o It supports the manipulation and processing of data.
o It provides data security.
o It lets the database be viewed from different viewpoints according to the requirements of the user.

Advantages of DBMS
o Controls database redundancy: It can control data redundancy because it stores all the data in a single database, so each item is recorded in one place.
o Data sharing: Authorized users of an organization can share the data among multiple users.
o Easy maintenance: The database system is easy to maintain because of its centralized nature.
o Reduces time: It reduces development time and maintenance needs.
o Backup: It provides backup and recovery subsystems that automatically back up data to protect against hardware and software failures and restore the data if required.
o Multiple user interfaces: It provides different types of user interfaces, such as graphical user interfaces and application program interfaces.

Disadvantages of DBMS
o Cost of hardware and software: Running DBMS software requires a high-speed processor and a large amount of memory.
o Size: It occupies a large amount of disk space and memory to run efficiently.
o Complexity: A database system creates additional complexity and requirements.
o Higher impact of failure: A failure has a large impact because, in most organizations, all the data is stored in a single database. If that database is damaged by a power failure or database corruption, the data may be lost forever.

Database
What is Data?
Data is a collection of distinct small units of information. It can take a variety of forms, such as text, numbers, media, or bytes, and it can be stored on paper or in electronic memory.

The word 'data' originates from the word 'datum', which means 'a single piece of information'. 'Data' is the plural of 'datum'.

In computing, data is information that has been translated into a form that is efficient to move and process. Data is interchangeable.

What is Database?
A database is an organized collection of data, so that it can be easily accessed and
managed.

You can organize data into tables, rows, columns, and index it to make it easier to find
relevant information.
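
As a rough illustration of indexing, the sketch below builds a small table in SQLite through Python's sqlite3 module, adds an index on one column, and asks SQLite how it plans to run a lookup. The `student` table, the index name `idx_student_name`, and the sample rows are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Ajeet", 24), (2, "Aryan", 20), (3, "Mahesh", 21)],
)

# Index the column that lookups will filter on.
conn.execute("CREATE INDEX idx_student_name ON student (name)")

query = "SELECT id, name FROM student WHERE name = ?"
matches = conn.execute(query, ("Aryan",)).fetchall()

# Ask SQLite how it would run the query. The last column of each plan row is
# a human-readable description; its exact wording depends on the SQLite version.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, ("Aryan",)).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)
print(matches)
print(plan_text)
```

On a table this small the index makes no practical difference; the point is only that the plan names the index instead of scanning every row.
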

Database designers build a database so that a single set of software programs provides all users with access to the data.

The main purpose of a database is to handle a large amount of information by storing, retrieving, and managing data.

Many dynamic websites on the World Wide Web are backed by databases. For example, a website that checks the availability of rooms in a hotel is a dynamic website that uses a database.

There are many databases available like MySQL, Sybase, Oracle, MongoDB, Informix,
PostgreSQL, SQL Server, etc.

Modern databases are managed by a database management system (DBMS).

SQL, or Structured Query Language, is used to operate on the data stored in a database. SQL is based on relational algebra and tuple relational calculus.

In diagrams, a database is conventionally drawn as a cylinder.


Evolution of Databases
Databases have been evolving for more than 50 years, from flat-file systems to relational and object-relational systems, and have gone through several generations.

The Evolution
File-Based

File-based databases were introduced in 1968. In file-based databases, data was maintained in flat files. Files have many advantages, but they also have several limitations.

One of the major advantages is that the file system offers various access methods, e.g., sequential, indexed, and random.

One limitation is that it requires extensive programming in a third-generation language such as COBOL or BASIC.

Hierarchical Data Model

1968-1980 was the era of the hierarchical database. The most prominent hierarchical database was IBM's first DBMS, called IMS (Information Management System).

In this model, files are related in a parent/child manner.

[Figure: hierarchical data model. Small circles represent objects.]

Like the file system, this model had limitations, such as complex implementation, a lack of structural independence, and difficulty handling many-to-many relationships.

Network data model


Charles Bachman developed the first DBMS, called Integrated Data Store (IDS), at General Electric, whose computer business was later acquired by Honeywell. It was developed in the early 1960s and standardized in 1971 by the CODASYL group (Conference on Data Systems Languages).

In this model, files are related as owners and members, much like a common network model.

The network data model identified the following components:

o Network schema (database organization)

o Sub-schema (views of the database per user)

o Data management language (procedural)

This model also had limitations, such as system complexity and being difficult to design and maintain.

Relational Database

1970 - Present: This is the era of the relational database and database management. In 1970, the relational model was proposed by E.F. Codd.

The relational database model has two main terms: instance and schema. An instance is a table with rows and columns. A schema specifies the structure, such as the name of the relation and the name and type of each column. This model uses mathematical concepts such as set theory and predicate logic.
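
The difference between schema and instance can be seen in a short sketch using Python's built-in sqlite3 module. The `student` relation and its single row are assumptions chosen for illustration; `PRAGMA table_info` is SQLite's way of listing a table's columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER, name TEXT, age INTEGER, course TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'Ajeet', 24, 'B.Tech')")

# Schema: the name and declared type of each column of the relation.
# Each table_info row is (cid, name, type, notnull, dflt_value, pk).
schema = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]

# Instance: the rows stored in the relation at this moment.
instance = conn.execute("SELECT * FROM student").fetchall()

print(schema)
print(instance)
```

Inserting or deleting rows changes the instance, while the schema stays the same until the table definition itself is altered.
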

The first internet database application was created in 1995.

During the era of the relational database, many more models were introduced, such as the object-oriented model and the object-relational model.

Cloud database
A cloud database lets you store, manage, and retrieve structured and unstructured data via a cloud platform. The data is accessible over the Internet. Cloud databases are also called database as a service (DBaaS) because they are offered as a managed service.

Some of the best cloud options are:

o AWS (Amazon Web Services)

o Snowflake Computing

o Oracle Database Cloud Services

o Microsoft SQL Server

o Google Cloud Spanner

Advantages of cloud database

Lower costs

Generally, a company does not have to invest in its own database infrastructure, because the provider maintains and supports one or more data centers.

Automated

Cloud databases come with a variety of automated processes such as recovery, failover, and auto-scaling.

Increased accessibility

You can access a cloud-based database from any location at any time. All you need is an internet connection.

NoSQL Database
NoSQL is an approach to database design that can accommodate a wide variety of data models. NoSQL stands for "not only SQL." It is an alternative to traditional relational databases, in which data is placed in tables and the data schema is carefully designed before the database is built.

NoSQL databases are useful for large sets of distributed data.

Some examples of NoSQL database systems, with their categories, are:

o MongoDB, CouchDB, Cloudant (document-based)

o Memcached, Redis, Coherence (key-value store)

o HBase, Bigtable, Accumulo (tabular)

Advantages of NoSQL
High Scalability

NoSQL can handle a very large amount of data because of its scalability. As the data grows, a NoSQL database scales to handle it efficiently.

High Availability

NoSQL supports automatic replication. Replication makes the database highly available because, if a failure occurs, the data can be restored from a replica to the last consistent state.

Disadvantages of NoSQL
Open source

Many NoSQL databases are open-source projects, and there is no reliable standard for NoSQL yet.

Management challenge

Data management in NoSQL is much more complicated than in relational databases. It can be challenging to install and even more demanding to manage day to day.

GUI is not available

GUI tools for NoSQL databases are not easily available in the market.

Backup

Backup is a weak point for some NoSQL databases. Some databases, like MongoDB, are said to lack powerful approaches for data backup.

The Object-Oriented Databases


Object-oriented databases store data in the form of objects and classes. Objects represent real-world entities, and classes are collections of objects. An object-oriented database combines relational model features with object-oriented principles. It is an alternative implementation to that of the relational model.

Object-oriented databases follow the rules of object-oriented programming. An object-oriented database management system is a hybrid application.

The object-oriented database model contains the following properties.

Object-oriented programming properties

o Objects

o Classes
o Inheritance

o Polymorphism

o Encapsulation

Relational database properties

o Atomicity
o Consistency

o Integrity

o Durability

o Concurrency
o Query processing

Graph Databases
A graph database is a NoSQL database that represents data as a graph. It contains nodes and edges. A node represents an entity, and each edge represents a relationship between two nodes. Every node in a graph database has a unique identifier.

Graph databases are beneficial for searching the relationships between data because they highlight the relationships between relevant data.

Graph databases are very useful when the database contains complex relationships and a dynamic schema.
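
The node-and-edge structure can be sketched in plain Python. This is not how any particular graph database stores its data; the identifiers, labels, relationship names, and the helper `neighbors` are all hypothetical and exist only to show the idea.

```python
from collections import defaultdict

# Nodes: each entity is stored under a unique identifier (made-up sample data).
nodes = {
    "u1": {"label": "Person", "name": "Asha"},
    "u2": {"label": "Person", "name": "Ravi"},
    "u3": {"label": "Person", "name": "Meena"},
    "p1": {"label": "Post", "title": "Intro to DBMS"},
}

# Edges: each one is a (source node, relationship, target node) triple.
edges = [
    ("u1", "FOLLOWS", "u2"),
    ("u2", "FOLLOWS", "u3"),
    ("u1", "LIKES", "p1"),
]

# Adjacency list: for every node, the relationships that start at it.
adjacency = defaultdict(list)
for source, relation, target in edges:
    adjacency[source].append((relation, target))


def neighbors(node_id, relation):
    """Return the ids of nodes reachable from node_id over one edge of the given type."""
    return [target for rel, target in adjacency[node_id] if rel == relation]


print(neighbors("u1", "FOLLOWS"))
print([nodes[n]["name"] for n in neighbors("u2", "FOLLOWS")])
```

Following a relationship is a direct lookup from a node to its edges, which is why this layout suits relationship-heavy queries.
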

They are mostly used in areas such as supply chain management and identifying the source of IP telephony.

DBMS (Database Management System)


A database management system is software used to store and retrieve data in a database. Oracle and MySQL, for example, are popular DBMS tools.

o A DBMS provides an interface for various operations such as creation, deletion, and modification.
o A DBMS allows users to create their own databases as per their requirements.
o A DBMS accepts requests from the application and provides the specific data through the operating system.
o A DBMS consists of a group of programs that act according to the user's instructions.
o It provides security to the database.

Advantage of DBMS
Controls redundancy

It stores all the data in a single database, so it can control data redundancy.

Data sharing

An authorized user can share the data among multiple users.

Backup

It provides a backup and recovery subsystem. This subsystem automatically backs up data to protect against system failure and restores the data if required.

Multiple user interfaces

It provides different types of user interfaces, such as GUIs and application interfaces.

Disadvantage of DBMS
Size

It occupies large disk space and large memory to run efficiently.

Cost

DBMS requires a high-speed data processor and larger memory to run DBMS software, so
it is costly.

Complexity

DBMS creates additional complexity and requirements.

RDBMS (Relational Database Management System)


RDBMS stands for 'Relational Database Management System.' It represents data as tables that contain rows and columns.

An RDBMS is based on the relational model, which was introduced by E. F. Codd.

A relational database contains the following components:

o Table
o Record/ Tuple

o Field/Column name /Attribute

o Instance

o Schema
o Keys

An RDBMS is a tabular DBMS that maintains the security, integrity, accuracy, and consistency of the data.

Types of Databases
There are various types of databases used for storing different varieties of data:

1) Centralized Database
A centralized database stores data in one central database system. It lets users access the stored data from different locations through several applications. These applications include an authentication process so that users can access data securely. An example of a centralized database is a central library that holds a central database of each library in a college/university.

Advantages of Centralized Database


o It has decreased the risk of data management, i.e., manipulation of data will not affect
the core data.

o Data consistency is maintained as it manages data in a central repository.


o It provides better data quality, which enables organizations to establish data standards.

o It is less costly because fewer vendors are required to handle the data sets.

Disadvantages of Centralized Database


o The size of the centralized database is large, which increases the response time for
fetching the data.

o It is not easy to update such an extensive database system.

o If any server failure occurs, entire data will be lost, which could be a huge loss.

2) Distributed Database
Unlike a centralized database system, in distributed systems, data is distributed
among different database systems of an organization. These database systems are
connected via communication links. Such links help the end-users to access the data
easily. Examples of the Distributed database are Apache Cassandra, HBase, Ignite, etc.

We can further divide a distributed database system into:

o Homogeneous DDB: The database systems run on the same operating system, use the same application process, and run on the same hardware devices.

o Heterogeneous DDB: The database systems run on different operating systems, under different application procedures, and on different hardware devices.

Advantages of Distributed Database

o Modular development is possible in a distributed database, i.e., the system can be expanded by adding new computers and connecting them to the distributed system.
o One server failure will not affect the entire data set.

3) Relational Database
This database is based on the relational data model, which stores data in the form of rows (tuples) and columns (attributes) that together form a table (relation). A relational database uses SQL for storing, manipulating, and maintaining the data. E.F. Codd proposed the relational model in 1970. Each table in the database has a key that uniquely identifies each row. Examples of relational databases are MySQL, Microsoft SQL Server, Oracle, etc.

Properties of Relational Database


A relational database is commonly described by four properties of its transactions, known as the ACID properties:

A means Atomicity: A data operation either completes entirely or has no effect at all. It follows the 'all or nothing' strategy. For example, a transaction will either commit or abort.

C means Consistency: Any operation on the data must leave the database in a valid state. For example, when money is transferred between accounts, the total balance before and after the transaction should be the same, i.e., it should remain conserved.

I means Isolation: Several users may access data in the database at the same time, so their operations must be isolated from one another. For example, when multiple transactions run at the same time, the effects of one transaction should not be visible to the other transactions.

D means Durability: Once an operation completes and its data is committed, the changes should remain permanent.
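
Atomicity and consistency can be illustrated with a small money transfer in SQLite, using Python's built-in sqlite3 module. This is a minimal sketch under assumptions: the `account` table, the rule that a balance may not go negative, and the helpers `transfer` and `balances` are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move amount from source to target; apply both updates or neither."""
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if an exception is raised inside it.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance, so the credit
        # made by the first UPDATE was rolled back as well.
        return False


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account"))


failed = transfer(conn, 1, 2, 500)   # would overdraw account 1
after_failed = balances(conn)
ok = transfer(conn, 1, 2, 30)
after_ok = balances(conn)
print(failed, after_failed)
print(ok, after_ok)
```

The failed transfer leaves both balances untouched even though its first update had already run, and the successful one keeps the combined balance unchanged.
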

4) NoSQL Database
NoSQL (Non-SQL/Not Only SQL) is a type of database used for storing a wide range of data sets. It is not a relational database, as it stores data not only in tabular form but in several different ways. It came into existence when the demand for building modern applications increased, and NoSQL offered a wide variety of database technologies in response. We can further divide NoSQL databases into the following four types:

a. Key-value storage: It is the simplest type of database storage, in which every item is stored as a key (or attribute name) together with its value.
b. Document-oriented database: It stores data as JSON-like documents. It helps developers store data in the same document-model format that is used in the application code.
c. Graph databases: They store vast amounts of data in a graph-like structure. Most commonly, social networking websites use graph databases.
d. Wide-column stores: They are similar to the data representation in relational databases, but data is stored in large columns together instead of in rows.
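
To make the four shapes concrete, the sketch below writes the same made-up student record in each style using plain Python data structures. It does not use any real NoSQL product; the key names, the column families `profile` and `study`, and the helper `family_columns` are assumptions, and the wide-column layout shown is only one common arrangement.

```python
# a. Key-value: every item is a key with an opaque value.
key_value = {
    "student:1:name": "Ajeet",
    "student:1:course": "B.Tech",
}

# b. Document: one JSON-like document holds the whole record, including nesting.
document = {
    "_id": 1,
    "name": "Ajeet",
    "courses": [{"title": "B.Tech", "year": 1}],
}

# c. Graph: entities are nodes, relationships are edges between node ids.
graph_nodes = {"student:1": {"name": "Ajeet"}, "course:btech": {"title": "B.Tech"}}
graph_edges = [("student:1", "ENROLLED_IN", "course:btech")]

# d. Wide-column: a row key maps to column families, and each family groups
#    related columns. Rows do not need to have the same columns.
wide_column = {
    "student:1": {
        "profile": {"name": "Ajeet", "age": 24},
        "study": {"course": "B.Tech"},
    },
    "student:2": {
        "profile": {"name": "Aryan"},
    },
}


def family_columns(row_key, family):
    """Return the columns stored in one column family of one row."""
    return wide_column.get(row_key, {}).get(family, {})


print(key_value["student:1:name"])
print(document["courses"][0]["title"])
print(graph_edges[0])
print(family_columns("student:2", "study"))
```

The same information is reachable in every shape; what differs is the unit a lookup returns: a single value, a whole document, a relationship, or a group of columns.
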

Advantages of NoSQL Database


o It enables good productivity in the application development as it is not required to
store data in a
structured format.

o It is a better option for managing and handling large data sets.

o It provides high scalability.

o Users can quickly access data from the database through key-value.

5) Cloud Database
A cloud database is a database whose data is stored in a virtual environment and that runs on a cloud computing platform. It gives users access to the database through various cloud computing services (SaaS, PaaS, IaaS, etc.). There are numerous cloud platforms, but the best options are:

o Amazon Web Services(AWS)

o Microsoft Azure

o Kamatera
o phoenixNAP

o ScienceSoft

o Google Cloud SQL, etc.

6) Object-oriented Databases
An object-oriented database uses the object-based data model to store data in the database system. The data is represented and stored as objects, similar to the objects used in object-oriented programming languages.

7) Hierarchical Databases
A hierarchical database stores data as nodes in parent-child relationships, organized in a tree-like structure. Data is stored in the form of records that are connected via links. Each child record in the tree has only one parent. On the other hand, each parent record can have multiple child records.

8) Network Databases
A network database typically follows the network data model. Here, data is represented as nodes connected via links. Unlike the hierarchical database, it allows each record to have multiple children and multiple parent nodes, forming a generalized graph structure.
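
The contrast between the hierarchical and network structures can be sketched with plain Python dictionaries. The record names and the helpers `children` and `path_to_root` are hypothetical; real hierarchical and network systems use their own storage formats.

```python
# Hierarchical: every child record names exactly one parent, which forms a tree.
parent_of = {
    "Science": "College",
    "Arts": "College",
    "Physics": "Science",
}


def children(parent):
    """Return the child records of a parent, in alphabetical order."""
    return sorted(child for child, p in parent_of.items() if p == parent)


def path_to_root(record):
    """Follow the single parent link upward until the root is reached."""
    path = [record]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path


# Network: a record may have several parents (owners), which forms a graph.
owners_of = {
    "Asha": {"Physics", "Chess Club"},
    "Ravi": {"Physics"},
}

print(children("College"))
print(path_to_root("Physics"))
print(sorted(owners_of["Asha"]))
```

A single parent per record makes the upward path unique in the hierarchical case, while the set of owners in the network case allows one record to belong to several groups at once.
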

9) Personal Database
Collecting and storing data on the user's system defines a Personal Database. This
database is basically designed for a single user.

Advantage of Personal Database


o It is simple and easy to handle.

o It occupies less storage space as it is small in size.

10) Operational Database


An operational database creates and updates data in real time. It is designed for executing and handling the daily data operations of a business. For example, an organization uses an operational database to manage its day-to-day transactions.

11) Enterprise Database


Large organizations or enterprises use this database for managing massive amounts of data. It helps organizations increase and improve their efficiency. Such a database allows simultaneous access by multiple users.

Advantages of Enterprise Database:

o Multiple processes are supported on the enterprise database.
o It allows parallel queries to be executed on the system.

What is RDBMS (Relational Database Management System)


RDBMS stands for Relational Database Management System.

Many widely used database management systems, such as MS SQL Server, IBM DB2, Oracle, MySQL, and Microsoft Access, are relational database management systems.

It is called a Relational Database Management System (RDBMS) because it is based on the relational model introduced by E.F. Codd.

How it works
Data is represented in terms of tuples (rows) in an RDBMS.

A relational database is the most commonly used type of database. It contains several tables, and each table has its primary key.

Because the data is organized as a collection of tables, it can be accessed easily in an RDBMS.

Brief History of RDBMS

From 1970 to 1972, E.F. Codd published papers proposing the use of a relational database model. RDBMS is based on E.F. Codd's relational model.

Following are the various terminologies of RDBMS:

What is table/Relation?
Everything in a relational database is stored in the form of relations. An RDBMS uses tables to store data. A table is a collection of related data entries and contains rows and columns to store data. Each table represents a real-world object, such as a person, place, or event, about which information is collected. The organized collection of data in a relational table is known as the logical view of the database.

Properties of a Relation:

o Each relation has a unique name by which it is identified in the database.


o Relation does not contain duplicate tuples.

o The tuples of a relation have no specific order.

o All attributes in a relation are atomic, i.e., each cell of a relation contains exactly one value.

A table is the simplest example of data stored in an RDBMS.

Let's see the example of the student table.

ID   Name     AGE   COURSE
1    Ajeet    24    B.Tech
2    Aryan    20    C.A
3    Mahesh   21    BCA
4    Ratan    22    MCA
5    Vimal    26    BSC

What is a row or record?


A row of a table is also called a record or tuple. It contains the specific information of each entry in the table. It is a horizontal entity in the table. For example, the above table contains 5 records.

Properties of a row:

o No two tuples are identical to each other in all their entries.

o All tuples of the relation have the same format and the same number of entries.

o The order of the tuples is irrelevant. They are identified by their content, not by their position.

Let's see one record/row in the table.

ID   Name     AGE   COURSE
1    Ajeet    24    B.Tech

What is a column/attribute?
A column is a vertical entity in the table which contains all information associated
with a specific field in a table. For example, "name" is a column in the above table
which contains all information about a student's name.

Properties of an Attribute:

o Every attribute of a relation must have a name.


o Null values are permitted for the attributes.
o A default value can be specified for an attribute; it is inserted automatically if no other value is specified for the attribute.

o The attribute or set of attributes that uniquely identifies each tuple of a relation is the primary key.

Name
Ajeet
Aryan
Mahesh
Ratan
Vimal

What is data item/Cells?


The smallest unit of data in the table is the individual data item. It is stored at the
intersection of tuples and attributes.

Properties of data items:

o Data items are atomic.

o The data items for an attribute should be drawn from the same domain.

In the example below, the data items in the student table include Ajeet, 24, and B.Tech.

ID   Name     AGE   COURSE
1    Ajeet    24    B.Tech

Degree:
The total number of attributes that make up a relation is known as the degree of the table.

For example, the student table has 4 attributes, so its degree is 4.

ID   Name     AGE   COURSE
1    Ajeet    24    B.Tech
2    Aryan    20    C.A
3    Mahesh   21    BCA
4    Ratan    22    MCA
5    Vimal    26    BSC

Cardinality:
The total number of tuples in a relation at any one time is known as the table's cardinality. A relation whose cardinality is 0 is called an empty table.

For example, the student table has 5 rows, so its cardinality is 5.

ID   Name     AGE   COURSE
1    Ajeet    24    B.Tech
2    Aryan    20    C.A
3    Mahesh   21    BCA
4    Ratan    22    MCA
5    Vimal    26    BSC
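
Degree and cardinality can both be read off a table programmatically. The sketch below loads the student table into SQLite with Python's built-in sqlite3 module; the lowercase column names and the declared column types are assumptions, since the tutorial only gives the column headings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, course TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [
        (1, "Ajeet", 24, "B.Tech"),
        (2, "Aryan", 20, "C.A"),
        (3, "Mahesh", 21, "BCA"),
        (4, "Ratan", 22, "MCA"),
        (5, "Vimal", 26, "BSC"),
    ],
)

cursor = conn.execute("SELECT * FROM student")

# Degree: the number of attributes (columns) in the relation.
degree = len(cursor.description)

# Cardinality: the number of tuples (rows) currently in the relation.
cardinality = len(cursor.fetchall())

print(degree, cardinality)
```

Adding or deleting rows changes the cardinality, while the degree changes only if a column is added to or removed from the table.
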

Domain:
The domain refers to the possible values each attribute can contain. It can be specified using standard data types such as integers, floating-point numbers, etc. For example, an attribute named Marital_Status may be limited to the values married or unmarried.
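
One way to enforce such a domain is a CHECK constraint. The sketch below uses SQLite through Python's built-in sqlite3 module; the `person` table and its sample rows are assumptions made for the illustration, and other systems may offer different mechanisms for restricting values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE person (
        name TEXT NOT NULL,
        marital_status TEXT CHECK (marital_status IN ('married', 'unmarried'))
    )
    """
)

# A value inside the domain is accepted.
conn.execute("INSERT INTO person VALUES ('Ajeet', 'married')")

# A value outside the domain is rejected by the CHECK constraint.
try:
    conn.execute("INSERT INTO person VALUES ('Ratan', 'unknown')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
print(rejected, count)
```

Only the row whose value lies inside the domain is stored.
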

NULL Values
A NULL value indicates that a field was left blank when the record was created. It is different from a value of zero or a field that contains spaces.
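
The difference between NULL, zero, and an empty string shows up as soon as the values are queried. The sketch below uses SQLite through Python's built-in sqlite3 module, where Python's `None` is stored as NULL; the `student` table, the sample rows, and the helper `ids` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [
        (1, "Ajeet", None),  # age left blank: stored as NULL
        (2, "Aryan", 0),     # age is the number zero
        (3, "", 21),         # name is an empty string, not NULL
    ],
)


def ids(where):
    """Return the ids of rows matching a WHERE clause, in id order."""
    sql = "SELECT id FROM student WHERE " + where + " ORDER BY id"
    return [row[0] for row in conn.execute(sql)]


null_age = ids("age IS NULL")
zero_age = ids("age = 0")
empty_name = ids("name = ''")
# Comparing with = NULL never matches, because NULL is not equal to anything.
equals_null = ids("age = NULL")

print(null_age, zero_age, empty_name, equals_null)
```

Each query finds a different row, and the `= NULL` comparison finds none, which is why SQL provides `IS NULL` for this test.
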

Data Integrity
The following categories of data integrity exist in every RDBMS:

Entity integrity: It specifies that there should be no duplicate rows in a table.

Domain integrity: It enforces valid entries for a given column by restricting the type, the format, or the range of values.

Referential integrity: It specifies that rows that are referenced by other records cannot be deleted.

User-defined integrity: It enforces specific business rules defined by users. These rules are different from entity, domain, and referential integrity.
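
Entity and referential integrity can be sketched with a primary key and a foreign key in SQLite, using Python's built-in sqlite3 module. The `course` and `student` tables and the helper `violates` are assumptions for this illustration. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute("CREATE TABLE course (code TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE student ("
    "id INTEGER PRIMARY KEY, "
    "name TEXT NOT NULL, "
    "course TEXT REFERENCES course(code))"
)
conn.execute("INSERT INTO course VALUES ('BCA')")
conn.execute("INSERT INTO student VALUES (3, 'Mahesh', 'BCA')")


def violates(sql):
    """Run a statement and report whether an integrity constraint rejected it."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Entity integrity: a second row with the same primary key is refused.
duplicate_key = violates("INSERT INTO student VALUES (3, 'Ratan', 'BCA')")

# Referential integrity: a student cannot point at a course that does not exist,
dangling_reference = violates("INSERT INTO student VALUES (4, 'Ratan', 'MCA')")

# and a course that a student still refers to cannot be deleted.
delete_referenced = violates("DELETE FROM course WHERE code = 'BCA'")

print(duplicate_key, dangling_reference, delete_referenced)
```

All three statements are rejected, so the tables keep exactly one course and one student.
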

Difference between DBMS and RDBMS


Although DBMS and RDBMS are both used to store information in a physical database, there are some notable differences between them.

The main differences between DBMS and RDBMS are given below:

1) DBMS: DBMS applications store data as files.
   RDBMS: RDBMS applications store data in a tabular form.

2) DBMS: In DBMS, data is generally stored in either a hierarchical form or a navigational form.
   RDBMS: In RDBMS, the tables have an identifier called the primary key, and the data values are stored in the form of tables.

3) DBMS: Normalization is not present in DBMS.
   RDBMS: Normalization is present in RDBMS.

4) DBMS: DBMS does not apply any security with regard to data manipulation.
   RDBMS: RDBMS defines integrity constraints for the purpose of the ACID (Atomicity, Consistency, Isolation, and Durability) properties.

5) DBMS: DBMS uses the file system to store data, so there is no relation between the tables.
   RDBMS: In RDBMS, data values are stored in the form of tables, so a relationship between these data values is stored in the form of a table as well.

6) DBMS: DBMS has to provide some uniform methods to access the stored information.
   RDBMS: RDBMS supports a tabular structure of the data and relationships between the tables to access the stored information.

7) DBMS: DBMS does not support distributed databases.
   RDBMS: RDBMS supports distributed databases.

8) DBMS: DBMS is meant for small organizations and deals with small amounts of data. It supports a single user.
   RDBMS: RDBMS is designed to handle large amounts of data. It supports multiple users.

9) DBMS: Examples of DBMS are file systems, XML, etc.
   RDBMS: Examples of RDBMS are MySQL, PostgreSQL, SQL Server, Oracle, etc.

After observing these differences, you can say that an RDBMS is an extension of a DBMS. Many software products on the market today are compatible with both DBMS and RDBMS, which means that today an RDBMS application is a DBMS application and vice versa.

DBMS vs. File System


File System Approach
File-based systems were an early attempt to computerize manual record keeping. This is also called the traditional approach. It was decentralized: each department stored and controlled its own data with the help of a data processing specialist. The main role of a data processing specialist was to create the necessary computer file structures, manage the data within those structures, and design application programs that create reports based on the file data.

Consider the example of a student file system. The student file contains information about the student (i.e., roll no, student name, course, etc.). Similarly, there is a subject file that contains information about the subject and a result file that contains information about the result.

Some fields are duplicated in more than one file, which leads to data redundancy. To overcome this problem, we need a centralized system, i.e., the DBMS approach.

DBMS:
In the database approach, a well-organized collection of data that is related in a meaningful way can be accessed by different users but is stored only once in the system. The operations performed by a DBMS include insertion, deletion, selection, sorting, etc.

In the database approach, duplication of data is reduced due to the centralization of data.

The differences between the DBMS and file system approaches are as follows:

Basis: Meaning
   DBMS approach: DBMS is a collection of data. In DBMS, the user is not required to write the procedures.
   File system approach: The file system is a collection of data. In this system, the user has to write the procedures for managing the database.

Basis: Sharing of data
   DBMS approach: Due to the centralized approach, data sharing is easy.
   File system approach: Data is distributed in many files, and it may be of different formats, so it isn't easy to share data.

Basis: Data Abstraction
   DBMS approach: DBMS gives an abstract view of data that hides the details.
   File system approach: The file system exposes the details of the data representation and storage of data.

Basis: Security and Protection
   DBMS approach: DBMS provides a good protection mechanism.
   File system approach: It isn't easy to protect a file under the file system.

Basis: Recovery Mechanism
   DBMS approach: DBMS provides a crash recovery mechanism, i.e., DBMS protects the user from system failure.
   File system approach: The file system doesn't have a crash recovery mechanism, i.e., if the system crashes while entering some data, then the content of the file will be lost.

Basis: Manipulation Techniques
   DBMS approach: DBMS contains a wide variety of sophisticated techniques to store and retrieve the data.
   File system approach: The file system can't efficiently store and retrieve the data.

Basis: Concurrency Problems
   DBMS approach: DBMS takes care of concurrent access to data using some form of locking.
   File system approach: In the file system, concurrent access causes many problems, for example when one user accesses a file while another is deleting or updating some information in it.

Basis: Where to use
   DBMS approach: The database approach is used in large systems which interrelate many files.
   File system approach: The file system approach is used in smaller systems with few interrelated files.

Basis: Cost
   DBMS approach: The database system is expensive to design.
   File system approach: The file system approach is cheaper to design.

Basis: Data Redundancy and Inconsistency
   DBMS approach: Due to the centralization of the database, the problems of data redundancy and inconsistency are controlled.
   File system approach: The files and application programs are created by different programmers, so there is a lot of duplication of data, which may lead to inconsistency.

Basis: Structure
   DBMS approach: The database structure is complex to design.
   File system approach: The file system approach has a simple structure.

Basis: Data Independence
   DBMS approach: Data independence exists, and it can be of two types:
      o Logical Data Independence
      o Physical Data Independence
   File system approach: There is no data independence.

Basis: Integrity Constraints
   DBMS approach: Integrity constraints are easy to apply.
   File system approach: Integrity constraints are difficult to implement in a file system.

Basis: Data Models
   DBMS approach: Three types of data models exist:
      o Hierarchical data models
      o Network data models
      o Relational data models
   File system approach: There is no concept of data models.

Basis: Flexibility
   DBMS approach: Changes are often a necessity to the content of the data stored in any system, and these changes are
   File system approach: The flexibility of the system is less as compared to the DBMS approach.
more easily with a database
approach.

Examples Oracle, SQL Server, Sybase etc. Cobol, C++ etc.


DBMS Architecture
o The DBMS design depends upon its architecture. The basic client/server architecture is used to deal with a large number of PCs, web servers, database servers and other components that are connected by networks.
o The client/server architecture consists of many PCs and a workstation which are connected via the network.

o DBMS architecture depends upon how users are connected to the database to get their requests done.

Types of DBMS Architecture

Database architecture can be seen as single-tier or multi-tier. Logically, however, database architecture is of two types: 2-tier architecture and 3-tier architecture.

1-Tier Architecture
o In this architecture, the database is directly available to the user. It means the user can work directly on the DBMS and use it.

o Any changes done here are made directly on the database itself. It doesn't provide a handy tool for end users.

o The 1-Tier architecture is used for the development of local applications, where programmers can communicate directly with the database for a quick response.

2-Tier Architecture
o The 2-Tier architecture is the same as the basic client-server architecture. In the two-tier architecture, applications on the client end can communicate directly with the database on the server side. For this interaction, APIs such as ODBC and JDBC are used.

o The user interfaces and application programs run on the client side.
o The server side is responsible for providing functionalities such as query processing and transaction management.

o To communicate with the DBMS, the client-side application establishes a connection with the server side.
Fig: 2-tier Architecture

3-Tier Architecture
o The 3-Tier architecture contains another layer between the client and the server. In this architecture, the client can't communicate directly with the server.

o The application on the client end interacts with an application server, which in turn communicates with the database system.

o The end user has no idea about the existence of the database beyond the application server. The database also has no idea about any other user beyond the application.

o The 3-Tier architecture is used for large web applications.

Fig: 3-tier Architecture
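
The separation of tiers described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the database server, and the names `open_database`, `ApplicationServer`, and `client_request` are hypothetical, not part of any real product.

```python
import sqlite3

# Data tier: an in-memory SQLite database stands in for the database server.
def open_database():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO student VALUES (1, 'Asha'), (2, 'Ravi')")
    conn.commit()
    return conn

# Application tier: the only layer that holds a database connection.
class ApplicationServer:
    def __init__(self, conn):
        self._conn = conn

    def student_name(self, roll_no):
        row = self._conn.execute(
            "SELECT name FROM student WHERE roll_no = ?", (roll_no,)
        ).fetchone()
        return row[0] if row else None

# Client tier: sees only the application server's methods, never SQL.
def client_request(app_server, roll_no):
    return app_server.student_name(roll_no)

app = ApplicationServer(open_database())
found = client_request(app, 2)
missing = client_request(app, 99)
```

The client code never touches the connection, which mirrors the point that the end user is unaware of the database behind the application server.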

Three schema Architecture


o The three schema architecture is also called the ANSI/SPARC architecture or three-level architecture.

o This framework is used to describe the structure of a specific database system.

o The three schema architecture is also used to separate the user applications from the physical database.
o The three schema architecture contains three levels. It breaks the database down into three different categories.
The three-schema architecture is as follows:

In the above diagram:

o It shows the DBMS architecture.


o Mapping is used to transform requests and responses between the various database levels of the architecture.

o Mapping is not good for a small DBMS because it takes more time.
o In External / Conceptual mapping, the request must be transformed from the external level to the conceptual schema.

o In Conceptual / Internal mapping, the DBMS transforms the request from the conceptual level to the internal level.

Objectives of Three schema Architecture


The main objective of the three-level architecture is to enable multiple users to access the same data with a personalized view while storing the underlying data only once. Thus it separates the user's view from the physical structure of the database. This separation is desirable for the following reasons:
o Different users need different views of the same data.
o The way in which a particular user needs to see the data may change over time.
o The users of the database should not have to worry about the physical implementation and internal workings of the database, such as data compression and encryption techniques, hashing, optimization of the internal structures etc.

o All users should be able to access the same data according to their requirements.
o The DBA should be able to change the conceptual structure of the database without affecting the user's view.

o The internal structure of the database should be unaffected by changes to the physical aspects of the storage.

1. Internal Level

o The internal level has an internal schema which describes the physical storage structure of the database.

o The internal schema is also known as a physical schema.

o It uses the physical data model. It is used to define how the data will be stored in a block.

o The physical level is used to describe complex low-level data structures in detail.

The internal level is generally concerned with the following activities:

o Storage space allocations.
For Example: B-Trees, Hashing etc.
o Access paths.
For Example: Specification of primary and secondary keys, indexes, pointers and sequencing.

o Data compression and encryption techniques.

o Optimization of internal structures.

o Representation of stored fields.

2. Conceptual Level
o The conceptual schema describes the design of a database at the conceptual level. The conceptual level is also known as the logical level.

o The conceptual schema describes the structure of the whole database.

o The conceptual level describes what data are to be stored in the database and what relationships exist among those data.

o At the conceptual level, internal details such as the implementation of the data structures are hidden.

o Programmers and database administrators work at this level.

3. External Level

o At the external level, a database contains several schemas that are sometimes called subschemas. A subschema is used to describe a particular view of the database.

o An external schema is also known as a view schema.

o Each view schema describes the part of the database that a particular user group is interested in and hides the remaining database from that user group.

o The view schema describes the end user interaction with database systems.
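
An external schema can be illustrated with an SQL view. The sketch below is a minimal example using Python's built-in `sqlite3` module. The `student` table, its columns, and the `teacher_view` name are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Conceptual level: the full logical structure of a hypothetical student table.
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY, name TEXT, course TEXT, fee_due REAL)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "Asha", "DBMS", 0.0), (2, "Ravi", "Networks", 150.0)],
)
# External level: a view for one user group that hides the fee_due column.
conn.execute(
    "CREATE VIEW teacher_view AS SELECT roll_no, name, course FROM student"
)

cursor = conn.execute("SELECT * FROM teacher_view ORDER BY roll_no")
view_columns = [d[0] for d in cursor.description]
view_rows = cursor.fetchall()
```

Users of `teacher_view` see only three columns, while the underlying data is still stored once in `student`.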

Mapping between Views


The three levels of DBMS architecture don't exist independently of each other. There must be a correspondence between the three levels, i.e. how they actually correspond with each other. The DBMS is responsible for the correspondence between the three types of schema. This correspondence is called Mapping.

There are basically two types of mapping in the database architecture:

o Conceptual/ Internal Mapping

o External/ Conceptual Mapping

Conceptual/ Internal Mapping

The Conceptual/ Internal Mapping lies between the conceptual level and the internal level. Its role is to define the correspondence between the records and fields of the conceptual level and the files and data structures of the internal level.

External/ Conceptual Mapping

The External/ Conceptual Mapping lies between the external level and the conceptual level. Its role is to define the correspondence between a particular external view and the conceptual view.

Data Models
A Data Model is the modeling of the data description, data semantics, and consistency constraints of the data. It provides the conceptual tools for describing the design of a database at each level of data abstraction. The following four data models are used for understanding the structure of the database:

1) Relational Data Model: This type of model organizes the data in the form of rows and columns within a table. Thus, a relational model uses tables for representing data and the relationships among them. Tables are also called relations. This model was initially described by Edgar F. Codd in 1969. The relational data model is a widely used model, primarily used by commercial data processing applications.

2) Entity-Relationship Data Model: An ER model is the logical representation of data as objects and relationships among them. These objects are known as entities, and a relationship is an association among these entities. This model was designed by Peter Chen and published in a 1976 paper. It is widely used in database design. A set of attributes describes an entity. For example, student_name and student_id describe the 'student' entity. A set of entities of the same type is known as an 'entity set', and a set of relationships of the same type is known as a 'relationship set'.

3) Object-based Data Model: An extension of the ER model with notions of functions, encapsulation, and object identity. This model supports a rich type system that includes structured and collection types. In the 1980s, various database systems following the object-oriented approach were developed. Here, an object is the data together with its properties.

4) Semistructured Data Model: This type of data model is different from the other three data models (explained above). The semistructured data model allows data specifications in which individual data items of the same type may have different sets of attributes. The Extensible Markup Language, also known as XML, is widely used for representing semistructured data. Although XML was initially designed for adding markup information to text documents, it gained importance because of its application in the exchange of data.
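
As a small illustration of semistructured data, the sketch below parses an XML fragment in which two items of the same type carry different sets of attributes. The element names and values are made up for this example. Parsing uses Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

# Two <student> items of the same type carry different attribute sets:
# the first has a phone number, the second has an email address instead.
document = """
<students>
  <student><name>Asha</name><phone>555-0101</phone></student>
  <student><name>Ravi</name><email>ravi@example.com</email></student>
</students>
"""

root = ET.fromstring(document)
# Collect the field names present in each item.
fields_per_student = [
    sorted(child.tag for child in student) for student in root.findall("student")
]
```

A relational table would need a fixed set of columns for both rows, whereas here each item simply lists the fields it has.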

Data model Schema and Instance


o The data which is stored in the database at a particular moment of time is called an instance of the database.
o The overall design of a database is called the schema.
o A database schema is the skeleton structure of the database. It represents the logical view of the entire database.
o A schema contains schema objects like tables, foreign keys, primary keys, views, columns, data types, stored procedures, etc.
o A database schema can be represented by a visual diagram. That diagram shows the database objects and their relationships with each other.
o A database schema is designed by the database designers to help programmers whose software will interact with the database. The process of database creation is called data modeling.

A schema diagram can display only some aspects of a schema, like the names of record types, data types, and constraints. Other aspects can't be specified through the schema diagram. For example, the given figure shows neither the data type of each data item nor the relationships among the various files.

In the database, the actual data changes quite frequently. For example, in the given figure, the database changes whenever we add a new grade or add a student. The data at a particular moment of time is called the instance of the database.
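
The difference between schema and instance can be observed directly. The following is a minimal sketch using Python's `sqlite3` module. The `student` table and the helper names `schema_of` and `instance_of` are assumptions for illustration. SQLite keeps its schema definitions in the `sqlite_master` table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")

def schema_of(connection):
    # sqlite_master stores the CREATE statements, i.e. the schema.
    return [row[0] for row in connection.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'")]

def instance_of(connection):
    # The rows present right now form the current instance.
    return connection.execute(
        "SELECT * FROM student ORDER BY roll_no").fetchall()

schema_before = schema_of(conn)
instance_before = instance_of(conn)
conn.execute("INSERT INTO student VALUES (1, 'Asha')")
schema_after = schema_of(conn)
instance_after = instance_of(conn)
```

Adding a student changes the instance, while the schema stays exactly the same.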

Data Independence
o Data independence can be explained using the three-schema architecture.
o Data independence refers to the characteristic of being able to modify the schema at one level of the database system without altering the schema at the next higher level.

There are two types of data independence:

1. Logical Data Independence


o Logical data independence refers to the characteristic of being able to change the conceptual schema without having to change the external schema.
o Logical data independence is used to separate the external level from the conceptual view.
o If we make any changes in the conceptual view of the data, the user view of the data is not affected.
o Logical data independence occurs at the user interface level.

2. Physical Data Independence


o Physical data independence can be defined as the capacity to change the internal schema without having to change the conceptual schema.
o If we make any changes in the storage size of the database system server, the conceptual structure of the database is not affected.
o Physical data independence is used to separate the conceptual level from the internal level.
o Physical data independence occurs at the logical interface level.

Fig: Data Independence
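
Both kinds of data independence can be sketched with Python's `sqlite3` module. In this illustrative example (the table, view, and index names are assumptions), an index is added as a physical change and a column is added as a conceptual change. The same query against the view is run after each step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO student VALUES (?, ?)", [(1, "Asha"), (2, "Ravi")])
conn.execute("CREATE VIEW student_names AS SELECT roll_no, name FROM student")

query = "SELECT name FROM student_names ORDER BY roll_no"
baseline = conn.execute(query).fetchall()

# Physical change: add an index (an access path). The table definition is untouched.
conn.execute("CREATE INDEX idx_student_name ON student (name)")
after_physical_change = conn.execute(query).fetchall()

# Conceptual change: add a column. The view (external schema) is untouched.
conn.execute("ALTER TABLE student ADD COLUMN course TEXT")
after_conceptual_change = conn.execute(query).fetchall()
```

The query text and its results are the same in all three cases, which is the practical meaning of data independence for the application.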

Database Language
o A DBMS has appropriate languages and interfaces to express database queries and
updates.

o Database languages can be used to read, store and update the data in the database.

Types of Database Language


1. Data Definition Language
o DDL stands for Data Definition Language. It is used to define database structure or
pattern.

o It is used to create schema, tables, indexes, constraints, etc. in the database.

o Using the DDL statements, you can create the skeleton of the database.
o Data definition language is used to store the information of metadata like the number of
tables and
schemas, their names, indexes, columns in each table, constraints, etc.

Here are some tasks that come under DDL:

o Create: It is used to create objects in the database.

o Alter: It is used to alter the structure of the database.


o Drop: It is used to delete objects from the database.

o Truncate: It is used to remove all records from a table.

o Rename: It is used to rename an object.

o Comment: It is used to comment on the data dictionary.

These commands are used to update the database schema, which is why they come under Data Definition Language.
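
A few of these DDL tasks can be tried with Python's `sqlite3` module, as in the hedged sketch below. The table and column names are hypothetical. SQLite is used only because it ships with Python, and it has no TRUNCATE or COMMENT statement, so those two tasks are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def table_names(connection):
    return [r[0] for r in connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

def column_names(connection, table):
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    return [r[1] for r in connection.execute(f"PRAGMA table_info({table})")]

conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")  # Create
conn.execute("ALTER TABLE student ADD COLUMN course TEXT")                     # Alter
columns_after_alter = column_names(conn, "student")
conn.execute("ALTER TABLE student RENAME TO learner")                          # Rename
tables_after_rename = table_names(conn)
conn.execute("DROP TABLE learner")                                             # Drop
tables_after_drop = table_names(conn)
```

Each statement changes the structure of the database, not the rows stored in it.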

2. Data Manipulation Language


DML stands for Data Manipulation Language. It is used for accessing and manipulating
data in a database. It handles user requests.

Here are some tasks that come under DML:

o Select: It is used to retrieve data from a database.


o Insert: It is used to insert data into a table.

o Update: It is used to update existing data within a table.

o Delete: It is used to delete records from a table (all records, or only those that match a condition).

o Merge: It performs the UPSERT operation, i.e., insert or update operations.

o Call: It is used to call a PL/SQL or a Java subprogram.

o Explain Plan: It describes the access path to the data.

o Lock Table: It controls concurrency.
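
The four basic DML statements can be sketched with Python's `sqlite3` module. The `student` table and its rows are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, course TEXT)"
)

# Insert: add three rows.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", "DBMS"), (2, "Ravi", "Networks"), (3, "Meena", "DBMS")],
)
# Update: change existing data in one row.
conn.execute("UPDATE student SET course = 'DBMS' WHERE roll_no = 2")
# Delete: remove only the rows that match the condition.
conn.execute("DELETE FROM student WHERE roll_no = 3")
# Select: retrieve what is left.
remaining = conn.execute(
    "SELECT roll_no, name, course FROM student ORDER BY roll_no"
).fetchall()
```

Note that DELETE with a WHERE clause removes only the matching rows. Without a WHERE clause it would remove every row in the table.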

3. Data Control Language


o DCL stands for Data Control Language. It is used to control access to the data stored in the database by granting and revoking privileges.

o The DCL execution is transactional. It also has rollback parameters.

(But in Oracle database, the execution of data control language does not have the feature of rolling back.)

Here are some tasks that come under DCL:

o Grant: It is used to give user access privileges to a database.


o Revoke: It is used to take back permissions from the user.

The following operations can be authorized or withdrawn with Revoke: CONNECT, INSERT, USAGE, EXECUTE, DELETE, UPDATE and SELECT.

4. Transaction Control Language


TCL is used to manage the changes made by DML statements. DML statements can be grouped into a logical transaction.

Here are some tasks that come under TCL:

o Commit: It is used to save the transaction on the database.

o Rollback: It is used to restore the database to its state at the last Commit.
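
Commit and Rollback can be sketched with Python's `sqlite3` module, whose connection object exposes `commit()` and `rollback()`. The `account` table and its values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('A', 30)")
conn.commit()  # Commit: the inserted row is now saved.

conn.execute("UPDATE account SET balance = 0 WHERE name = 'A'")
conn.rollback()  # Rollback: undo everything since the last commit.

balance_after_rollback = conn.execute(
    "SELECT balance FROM account WHERE name = 'A'"
).fetchone()[0]
```

The update is discarded by the rollback, so the balance is back to the committed value of 30.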

ACID Properties in DBMS


A DBMS manages data that should keep its integrity whenever any changes are made to it. If the integrity of the data is affected, the whole data set will be disturbed and corrupted. Therefore, to maintain the integrity of the data, the database management system defines four properties, known as the ACID properties. The ACID properties apply to a transaction, which goes through a group of tasks, and that is where we see the role of the ACID properties.
In this section, we will learn about the ACID properties: what they stand for and what each property is used for. We will also look at the ACID properties with the help of some examples.

ACID Properties
The term ACID stands for:

1) Atomicity: The term atomicity means that the data remains atomic. If any operation is performed on the data, it should either be executed completely or not be executed at all. The operation should not break in between or execute partially. When operations are executed within a transaction, they should be executed completely and not partially.

Example: Remo has account A with $30 in it, from which he wishes to send $10 to Sheero's account, B. Account B already holds $100. When $10 is transferred to account B, its balance will become $110. Two operations take place: the $10 that Remo wants to transfer is debited from his account A, and the same amount is credited to account B, i.e., Sheero's account. Now suppose the debit operation executes successfully, but the credit operation fails. Then the value in Remo's account A becomes $20, while Sheero's account B remains at $100, as it was before.
In the above diagram, it can be seen that after debiting $10 from account A, the amount is still $100 in account B. So, it is not an atomic transaction.

The below image shows that both debit and credit operations are done successfully. Thus the transaction is atomic.
When a transfer loses atomicity, it becomes a huge issue in bank systems, so atomicity is a main focus in bank systems.
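
The transfer example can be sketched in Python with the built-in `sqlite3` module. This is a minimal illustration, not a banking implementation: the `account` table, the `transfer` helper, and the `fail_before_credit` switch that simulates the failed credit are all assumptions introduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 30), ("B", 100)])
conn.commit()

def transfer(connection, source, target, amount, fail_before_credit=False):
    """Debit and credit as one unit; on any error, undo both."""
    try:
        connection.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?",
            (amount, source))
        if fail_before_credit:
            raise RuntimeError("simulated failure between debit and credit")
        connection.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?",
            (amount, target))
        connection.commit()
        return True
    except Exception:
        connection.rollback()  # the debit is undone as well
        return False

def balances(connection):
    return dict(connection.execute("SELECT name, balance FROM account"))

failed = transfer(conn, "A", "B", 10, fail_before_credit=True)
after_failure = balances(conn)
succeeded = transfer(conn, "A", "B", 10)
after_success = balances(conn)
```

When the credit fails, the rollback restores account A to $30, so the database never shows the half-finished state of $20 and $100.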

2) Consistency: The word consistency means that the value should always remain preserved. In DBMS, the integrity of the data should be maintained, which means that if a change is made in the database, it should always remain preserved. In the case of transactions, the integrity of the data is essential so that the database remains consistent before and after the transaction. The data should always be correct.

Example:

In the above figure, there are three accounts, A, B, and C, where A makes a transaction T to B and then to C. Two operations take place, i.e., debit and credit. Account A first debits $50 to account B, and the amount in account A is read by B as $300 before the transaction. After the successful transaction T, the available amount in B becomes $150. Next, A debits $20 to account C, and this time the value read by C is $250 (which is correct, as the debit of $50 to B has been completed successfully). The debit and credit operations from account A to C are also completed successfully. The transactions succeed and the values are read correctly, so the data is consistent. If the value read by both B and C were $300, the data would be inconsistent, because the second read would not reflect the debit that had already been executed.

3) Isolation: The term 'isolation' means separation. In DBMS, isolation is the property that operations may occur concurrently without affecting one another. In short, the result should be as if one operation began only after the other was complete. If two operations are being performed on two different data items, they must not affect each other's values. When two or more transactions occur simultaneously, consistency should be maintained. Any changes made by a particular transaction will not be seen by other transactions until the change is committed.
Example: If two operations are running concurrently on two different accounts, the values of both accounts should not be affected by each other. The values should remain consistent. As you can see in the below diagram, account A is making transactions T1 and T2 to accounts B and C, but both execute independently without affecting each other. This is known as isolation.

4) Durability: Durability ensures the permanency of something. In DBMS, durability ensures that after the successful execution of an operation, the data becomes permanent in the database. The durability of the data should be such that even if the system fails or crashes, the database still survives. However, if data is lost, it becomes the responsibility of the recovery manager to ensure the durability of the database. To commit the values, the COMMIT command must be used every time we make changes.

Therefore, the ACID properties of DBMS play a vital role in maintaining the consistency and availability of data in the database.

Thus, this was a brief introduction to the ACID properties in DBMS. These properties are also discussed in the transaction section.

ER model
o ER model stands for Entity-Relationship model. It is a high-level data model. This model is used to define the data elements and relationships for a specified system.

o It develops a conceptual design for the database. It also provides a very simple and easy-to-design view of data.
o In ER modeling, the database structure is portrayed as a diagram called an entity-relationship diagram.

For example, suppose we design a school database. In this database, the student will be an entity with attributes like address, name, id, age, etc. The address can be another entity with attributes like city, street name, pin code, etc., and there will be a relationship between them.

Component of ER Diagram
1. Entity:
An entity may be any object, class, person or place. In the ER diagram, an entity is represented as a rectangle.

Consider an organization as an example: manager, product, employee, department etc. can be taken as entities.

a. Weak Entity

An entity that depends on another entity is called a weak entity. The weak entity doesn't contain any key attribute of its own. The weak entity is represented by a double rectangle.

2. Attribute
The attribute is used to describe the property of an entity. An ellipse is used to represent an attribute. For example, id, age, contact number, name, etc. can be attributes of a student.

a. Key Attribute

The key attribute is used to represent the main characteristics of an entity. It represents a primary key. The key attribute is represented by an ellipse with the text underlined.

b. Composite Attribute

An attribute that is composed of many other attributes is known as a composite attribute. The composite attribute is represented by an ellipse, and the ellipses of its component attributes are connected to that ellipse.

c. Multivalued Attribute

An attribute can have more than one value. Such attributes are known as multivalued attributes. A double oval is used to represent a multivalued attribute.

For example, a student can have more than one phone number.

d. Derived Attribute

An attribute that can be derived from another attribute is known as a derived attribute. It is represented by a dashed ellipse.

For example, a person's age changes over time and can be derived from another attribute like date of birth.
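
The age example can be sketched as a small Python function. The function name `age_on` and the sample dates are assumptions for illustration. The point is that only the date of birth is stored, and the age is computed when needed.

```python
from datetime import date

def age_on(date_of_birth, today):
    """Derive age in whole years from the stored date of birth."""
    years = today.year - date_of_birth.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years

age_before_birthday = age_on(date(2000, 6, 15), date(2024, 6, 14))
age_on_birthday = age_on(date(2000, 6, 15), date(2024, 6, 15))
```

Because the value is derived, it never goes stale the way a stored age column would.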

3. Relationship
A relationship is used to describe the relation between entities. A diamond or rhombus is used to represent the relationship.

Types of relationship are as follows:

a. One-to-One Relationship

When only one instance of each entity is associated with the relationship, it is known as a one-to-one relationship.

For example, a female can marry one male, and a male can marry one female.

b. One-to-many relationship

When only one instance of the entity on the left and more than one instance of the entity on the right are associated with the relationship, it is known as a one-to-many relationship.

For example, a scientist can make many inventions, but each invention is made by only one specific scientist.

c. Many-to-one relationship

When more than one instance of the entity on the left and only one instance of the entity on the right are associated with the relationship, it is known as a many-to-one relationship.

For example, a student enrolls in only one course, but a course can have many students.

d. Many-to-many relationship

When more than one instance of the entity on the left and more than one instance of the entity on the right are associated with the relationship, it is known as a many-to-many relationship.

For example, an employee can be assigned to many projects, and a project can have many employees.
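
In a relational database, a many-to-many relationship such as the employee and project example is commonly stored in a separate linking table. The sketch below uses Python's `sqlite3` module. The table names, column names, and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE project  (proj_id INTEGER PRIMARY KEY, title TEXT);
-- Linking table: one row per (employee, project) assignment.
CREATE TABLE works_on (
    emp_id  INTEGER REFERENCES employee(emp_id),
    proj_id INTEGER REFERENCES project(proj_id),
    PRIMARY KEY (emp_id, proj_id)
);
INSERT INTO employee VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO project  VALUES (10, 'Payroll'), (20, 'Inventory');
INSERT INTO works_on VALUES (1, 10), (1, 20), (2, 10);
""")

projects_of_asha = [r[0] for r in conn.execute(
    "SELECT p.title FROM works_on w JOIN project p ON p.proj_id = w.proj_id "
    "WHERE w.emp_id = 1 ORDER BY p.title")]
employees_on_payroll = [r[0] for r in conn.execute(
    "SELECT e.name FROM works_on w JOIN employee e ON e.emp_id = w.emp_id "
    "WHERE w.proj_id = 10 ORDER BY e.name")]
```

One employee appears on several projects and one project has several employees, which is exactly the many-to-many case.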

Notation of ER diagram
A database can be represented using notations. In an ER diagram, many notations are used to express the cardinality. These notations are as follows:
Fig: Notations of ER diagram

ER Design Issues
In the previous sections on data modeling, we learned to design an ER diagram. We also discussed different ways of defining entity sets and the relationships among them, and the shapes that represent a relationship, an entity, and its attributes. However, users often misunderstand the elements and the design process of the ER diagram. This leads to a complex ER diagram structure and to issues that do not match the characteristics of the real-world enterprise being modelled.

Here, we will discuss the basic design issues of an ER database schema in the following
points:

1) Use of Entity Set vs Attributes


The use of an entity set or an attribute depends on the structure of the real-world enterprise that is being modelled and the semantics associated with its attributes. It is a mistake to use the primary key of an entity set as an attribute of another entity set. Instead, a relationship should be used. Likewise, the primary key attributes of the related entity sets are already implicit in the relationship set, so they should not be designated again as attributes of the relationship set.

2) Use of Entity Set vs. Relationship Sets


It is difficult to decide whether an object is best expressed by an entity set or by a relationship set. As a guideline, the user should designate a relationship set to describe an action that occurs between entities. If an object needs to be represented as a relationship set, it is better not to mix it with an entity set.

3) Use of Binary vs n-ary Relationship Sets


Generally, the relationships described in databases are binary relationships. However, non-binary relationships can be represented by several binary relationships. For example, we can create a ternary relationship 'parent' that relates a child to his father and his mother. Such a relationship can also be represented by two binary relationships, i.e., mother and father, each of which relates to the child. Thus, it is possible to represent a non-binary relationship by a set of distinct binary relationships.

4) Placing Relationship Attributes


The cardinality ratios can be an effective measure in the placement of relationship attributes. It is better to associate the attributes of one-to-one or one-to-many relationship sets with one of the participating entity sets, instead of with the relationship set. The decision to place a specified attribute as a relationship attribute or an entity attribute should reflect the characteristics of the real-world enterprise that is being modelled.

For example, if an attribute is determined by the combination of participating entity sets rather than by a single entity, it must be associated with the many-to-many relationship set.

Thus, designing and modelling an ER diagram requires overall knowledge of each part that is involved. The basic requirement is to analyse the real-world enterprise and the connectivity of one entity or attribute with another.

Mapping Constraints
o A mapping constraint is a data constraint that expresses the number of entities to which another entity can be related via a relationship set.
o It is most useful in describing binary relationship sets, although it can also help describe relationship sets that involve more than two entity sets.
o For a binary relationship set R between entity sets E1 and E2, there are four possible mapping cardinalities. These are as follows:
1. One to one (1:1)
2. One to many (1:M)
3. Many to one (M:1)
4. Many to many (M:M)

One-to-one
In one-to-one mapping, an entity in E1 is associated with at most one entity in E2, and an entity in E2 is associated with at most one entity in E1.

One-to-many
In one-to-many mapping, an entity in E1 is associated with any number of entities in E2, and an entity in E2 is associated with at most one entity in E1.

Many-to-one
In many-to-one mapping, an entity in E1 is associated with at most one entity in E2, and an entity in E2 is associated with any number of entities in E1.

Many-to-many
In many-to-many mapping, an entity in E1 is associated with any number of entities in
E2, and an entity in E2 is associated with any number of entities in E1.
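
The four definitions above can be turned into a small check. The following Python sketch classifies a set of observed (E1, E2) pairs. The function name `mapping_cardinality` and the sample pairs are assumptions for illustration. It reports the tightest cardinality that fits the given pairs, which may be stricter than what the design actually allows.

```python
from collections import defaultdict

def mapping_cardinality(pairs):
    """Classify a binary relationship, given as (e1, e2) pairs."""
    partners_of_e1 = defaultdict(set)
    partners_of_e2 = defaultdict(set)
    for e1, e2 in pairs:
        partners_of_e1[e1].add(e2)
        partners_of_e2[e2].add(e1)
    # Does any E1 entity relate to several E2 entities, and vice versa?
    e1_has_many = any(len(s) > 1 for s in partners_of_e1.values())
    e2_has_many = any(len(s) > 1 for s in partners_of_e2.values())
    if e1_has_many and e2_has_many:
        return "M:M"
    if e1_has_many:
        return "1:M"
    if e2_has_many:
        return "M:1"
    return "1:1"

scientist_invention = mapping_cardinality([("s1", "i1"), ("s1", "i2")])
student_course = mapping_cardinality([("st1", "c1"), ("st2", "c1")])
employee_project = mapping_cardinality([("e1", "p1"), ("e1", "p2"), ("e2", "p1")])
husband_wife = mapping_cardinality([("a", "x"), ("b", "y")])
```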

Keys
o Keys play an important role in the relational database.
o They are used to uniquely identify any record or row of data in a table. They are also used to establish and identify relationships between tables.

For example, ID is used as a key in the Student table because it is unique for each student. In the PERSON table, passport_number, license_number, and SSN are keys since they are unique for each person.

Types of keys:

1. Primary key
o It is the key chosen to identify one and only one instance of an entity uniquely. An entity can contain multiple keys, as we saw in the PERSON table. The key which is most suitable from that list becomes the primary key.
o In the EMPLOYEE table, ID can be the primary key since it is unique for each employee. We could also select License_Number or Passport_Number as the primary key, since they are also unique.

o For each entity, the primary key selection is based on the requirements and the developers.

2. Candidate key
o A candidate key is an attribute or minimal set of attributes that can uniquely identify a tuple.
o The primary key is chosen from the candidate keys. The remaining candidate keys are as strong as the primary key.

For example: In the EMPLOYEE table, id is best suited for the primary key. The other unique attributes, like SSN, Passport_Number, License_Number, etc., are also candidate keys.

3. Super Key
Super key is an attribute set that can uniquely identify a tuple. A super key is a superset of a candidate key.
For example: In the above EMPLOYEE table, for (EMPLOYEE_ID, EMPLOYEE_NAME), the names of two employees can be the same, but their EMPLOYEE_ID can't be the same. Hence, this combination can also be a key.

The super keys would be EMPLOYEE_ID, (EMPLOYEE_ID, EMPLOYEE_NAME), etc.
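
The relation between super keys and candidate keys can be sketched in Python. The helper names and the two sample rows are assumptions for illustration. Note that a key is really a property of the table design. This sketch only checks uniqueness against the sample rows it is given.

```python
from itertools import combinations

def is_super_key(rows, attributes):
    """True if no two rows share the same values for the given attributes."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attributes)
        if key in seen:
            return False
        seen.add(key)
    return True

def candidate_keys(rows, all_attributes):
    """Minimal super keys: no proper subset is itself a super key."""
    found = []
    for size in range(1, len(all_attributes) + 1):
        for combo in combinations(all_attributes, size):
            already_covered = any(set(k) <= set(combo) for k in found)
            if is_super_key(rows, combo) and not already_covered:
                found.append(combo)
    return found

employees = [
    {"EMPLOYEE_ID": 1, "EMPLOYEE_NAME": "Asha", "SSN": "111"},
    {"EMPLOYEE_ID": 2, "EMPLOYEE_NAME": "Asha", "SSN": "222"},
]
name_only = is_super_key(employees, ["EMPLOYEE_NAME"])
id_and_name = is_super_key(employees, ["EMPLOYEE_ID", "EMPLOYEE_NAME"])
minimal_keys = candidate_keys(employees, ["EMPLOYEE_ID", "EMPLOYEE_NAME", "SSN"])
```

The pair (EMPLOYEE_ID, EMPLOYEE_NAME) is a super key but not a candidate key, because EMPLOYEE_ID alone is already unique.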

4. Foreign key
o Foreign keys are the columns of a table used to point to the primary key of another
table.
o Every employee works in a specific department in a company, and employee and
department are two different entities. So we can't store the department's information
in the employee table. That's why we link these two tables through the primary key of one table.
o We add the primary key of the DEPARTMENT table, Department_Id, as a new
attribute in the
EMPLOYEE table.
o In the EMPLOYEE table, Department_Id is the foreign key, and both the tables are related.
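As a rough sketch of the EMPLOYEE and DEPARTMENT link, the following uses Python's built-in sqlite3 module with an in-memory database. The sample rows and the helper try_insert_orphan are assumptions made for illustration. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when this is on
conn.execute("CREATE TABLE DEPARTMENT (Department_Id INTEGER PRIMARY KEY, Name TEXT)")
conn.execute(
    """CREATE TABLE EMPLOYEE (
           ID INTEGER PRIMARY KEY,
           Name TEXT,
           Department_Id INTEGER REFERENCES DEPARTMENT(Department_Id)
       )"""
)
conn.execute("INSERT INTO DEPARTMENT VALUES (10, 'Sales')")
conn.execute("INSERT INTO EMPLOYEE VALUES (1, 'Asha', 10)")    # department 10 exists
conn.execute("INSERT INTO EMPLOYEE VALUES (2, 'Ravi', NULL)")  # a NULL foreign key is allowed


def try_insert_orphan():
    """Try to add an employee whose department does not exist."""
    try:
        conn.execute("INSERT INTO EMPLOYEE VALUES (3, 'Meera', 99)")
        return True
    except sqlite3.IntegrityError:
        return False
```

Calling try_insert_orphan() returns False here, because department 99 is not present in DEPARTMENT.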
5. Alternate key
There may be one or more attributes or a combination of attributes that uniquely
identify each tuple in a relation. These attributes or combinations of the attributes are
called the candidate keys. One key is chosen as the primary key from these candidate
keys, and the remaining candidate keys, if any exist, are termed alternate keys. In other
words, the number of alternate keys is the number of candidate keys minus one (the
primary key). An alternate key may or may not exist. If there is only one
candidate key in a relation, it does not have an alternate key.

For example, employee relation has two attributes, Employee_Id and PAN_No, that act
as candidate keys. In this relation, Employee_Id is chosen as the primary key, so the
other candidate key, PAN_No, acts as the Alternate key.

6. Composite key
Whenever a primary key consists of more than one attribute, it is known as a
composite key. This key is also known as Concatenated Key.

For example, in the employee relation, we assume that an employee may be assigned
multiple roles, and an employee may work on multiple projects simultaneously. So the
primary key will be composed of all three attributes, namely Emp_ID, Emp_role, and
Proj_ID in combination. These attributes act as a composite key since the primary key
comprises more than one attribute.
7. Artificial key
Keys created using arbitrarily assigned data are known as artificial keys. These keys
are created when a primary key is large and complex and has no relationship with
many other relations. The data values of the artificial keys are usually numbered in a
serial order.

For example, the primary key, which is composed of Emp_ID, Emp_role, and Proj_ID,
is large in the employee relation. So it would be better to add a new virtual attribute to
identify each tuple in the relation uniquely.

Generalization
o Generalization is like a bottom-up approach in which two or more entities of
lower level
combine to form a higher level entity if they have some attributes in common.
o In generalization, an entity of a higher level can also combine with the entities
of the lower
level to form a further higher level entity.
o Generalization is more like subclass and superclass system, but the only
difference is the
approach. Generalization uses the bottom-up approach.
o In generalization, entities are combined to form a more generalized entity, i.e.,
subclasses are
combined to make a superclass.

For example, Faculty and Student entities can be generalized and create a higher level
entity Person.
Specialization
o Specialization is a top-down approach, and it is opposite to Generalization. In
specialization,
one higher level entity can be broken down into two lower level entities.
o Specialization is used to identify the subset of an entity set that shares some
distinguishing
characteristics.
o Normally, the superclass is defined first, the subclasses and their related attributes
are defined next, and the relationship sets are then added.

For example: In an Employee management system, the EMPLOYEE entity can be specialized
as TESTER or DEVELOPER based on what role they play in the company.

Aggregation
In aggregation, the relation between two entities is treated as a single entity: a
relationship together with its corresponding entities is aggregated into a higher
level entity.

For example: The Center entity offering the Course entity acts as a single entity, and that
entity is in a relationship with another entity, Visitor. In the real world, if a visitor visits a
coaching center, he will never enquire about the Course only or just about the Center;
instead he will enquire about both.
Reduction of ER diagram to Table
A database design can be represented using ER notation, and this notation can be
reduced to a collection of tables.

In the database, every entity set or relationship set can be represented in tabular form.

The ER diagram is given below:

There are some points for converting the ER diagram to the table:


o Entity type becomes a table.

In the given ER diagram, LECTURE, STUDENT, SUBJECT and COURSE form individual
tables.

o Every single-valued attribute becomes a column of the table.

In the STUDENT entity, STUDENT_NAME and STUDENT_ID form the columns of the
STUDENT table. Similarly, COURSE_NAME and COURSE_ID form the columns of the COURSE
table and so on.

o A key attribute of the entity type is represented by the primary key.

In the given ER diagram, COURSE_ID, STUDENT_ID, SUBJECT_ID, and LECTURE_ID
are the key attributes of the entities.

o The multivalued attribute is represented by a separate table.


In the student table, hobby is a multivalued attribute. So it is not possible to
represent multiple values in a single column of the STUDENT table. Hence we create a
table STUD_HOBBY with columns STUDENT_ID and HOBBY. Using both
columns, we create a composite key.

o A composite attribute is represented by its components.

In the given ER diagram, student address is a composite attribute. It contains CITY,
PIN, DOOR#, STREET, and STATE. In the STUDENT table, each of these components
becomes an individual column.

o Derived attributes are not considered in the table.

In the STUDENT table, Age is the derived attribute. It can be calculated at any point
of time by calculating the difference between current date and Date of Birth.

Using these rules, you can convert the ER diagram to tables and columns and assign
the mapping between the tables. Table structure for the given ER diagram is as below:
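The conversion rules for the STUDENT entity can be sketched with Python's sqlite3 module. This is a minimal illustration, not the table structure from the figure: the column types, the DOB column (standing in for Date of Birth), the column name DOOR (for DOOR#) and the sample rows are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE STUDENT (
        STUDENT_ID   INTEGER PRIMARY KEY,   -- key attribute becomes the primary key
        STUDENT_NAME TEXT,                  -- single-valued attribute becomes a column
        DOB          TEXT,                  -- Age is derived from this, so Age is not stored
        DOOR TEXT, STREET TEXT, CITY TEXT, STATE TEXT, PIN TEXT  -- composite address, one column per component
    );
    CREATE TABLE STUD_HOBBY (
        STUDENT_ID INTEGER REFERENCES STUDENT(STUDENT_ID),
        HOBBY      TEXT,
        PRIMARY KEY (STUDENT_ID, HOBBY)     -- composite key over both columns
    );
    """
)
conn.execute("INSERT INTO STUDENT (STUDENT_ID, STUDENT_NAME) VALUES (1, 'Asha')")
conn.executemany("INSERT INTO STUD_HOBBY VALUES (?, ?)", [(1, "chess"), (1, "music")])

# One student, several hobbies: each hobby is its own row in STUD_HOBBY.
hobbies = [
    row[0]
    for row in conn.execute(
        "SELECT HOBBY FROM STUD_HOBBY WHERE STUDENT_ID = 1 ORDER BY HOBBY"
    )
]
```

Because the key of STUD_HOBBY covers both columns, the same student can have many hobbies, but the same (STUDENT_ID, HOBBY) pair cannot be stored twice.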

Figure: Table structure

Relationship of higher degree


The degree of relationship can be defined as the number of occurrences in one
entity that is associated with the number of occurrences in another entity.

There are three degrees of relationship:

1. One-to-one (1:1)
2. One-to-many (1:M)

3. Many-to-many (M:N)
1. One-to-one
o In a one-to-one relationship, one occurrence of an entity relates to only one occurrence
in another
entity.
o A one-to-one relationship rarely exists in practice.
o For example: if an employee is allocated a company car then that car can only be
driven by that
employee.

o Therefore, employee and company car have a one-to-one relationship.

2. One-to-many
o In a one-to-many relationship, one occurrence in an entity relates to many
occurrences in another
entity.

o For example: An employee works in one department, but a department has many
employees.
o Therefore, department and employee have a one-to-many relationship.

3. Many-to-many
o In a many-to-many relationship, many occurrences in an entity relate to many
occurrences in another
entity.
o Same as a one-to-one relationship, the many-to-many relationship rarely exists in
practice.
o For example: At the same time, an employee can work on several projects, and a project
has a team
of many employees.

o Therefore, employee and project have a many-to-many relationship.

Relational Model concept


The relational model represents data as a table with columns and rows. Each row is known as
a tuple. Each column of the table has a name, called an attribute.

Domain: It contains a set of atomic values that an attribute can take.


Attribute: It contains the name of a column in a particular table. Each attribute Ai
must have a domain, dom(Ai)

Relational instance: In the relational database system, the relational instance is
represented by a finite set of tuples. Relation instances do not have duplicate tuples.

Relational schema: A relational schema contains the name of the relation and name of
all columns or attributes.

Relational key: A relational key is a set of one or more attributes that can identify
a row in the relation uniquely.

Example: STUDENT Relation

NAME ROLL_NO PHONE_NO ADDRESS AGE

Ram 14795 7305758992 Noida 2

Shyam 12839 9026288936 Delhi 3

Laxman 33289 8583287182 Gurugram 2

Mahesh 27857 7086819134 Ghaziabad 2

Ganesh 17282 9028 9i3988 Delhi 4

o In the given table, NAME, ROLL_NO, PHONE_NO, ADDRESS, and AGE are the
attributes.
o The instance of schema STUDENT has 5 tuples.
o t3 = <Laxman, 33289, 8583287182, Gurugram, 20>

Properties of Relations
o Name of the relation is distinct from all other relations.
o Each relation cell contains exactly one atomic (single) value.
o Each attribute has a distinct name.
o The order of attributes has no significance.
o A relation has no duplicate tuples.
o The order of tuples has no significance.

Relational Algebra
Relational algebra is a procedural query language. It gives a step by step process to
obtain the result of the query. It uses operators to perform queries.

Types of Relational operation


1. Select Operation:
o The select operation selects tuples that satisfy a given predicate.
o It is denoted by sigma (σ).

Notation: σ p(r)

Where:

σ is the selection operator
r is the relation
p is a propositional logic formula which may use connectives like AND, OR and
NOT. The formula may also use relational operators like =, ≠, ≥, <, >, ≤.

For example: LOAN Relation


BRANCH_NAME LOAN_NO AMOUNT

Downtown L-17 1000

Redwood L-23 2000

Perryride L-15 1500

Downtown L-14 1500

Mianus L-13 500


Roundhill L-11 900

Perryride L-16 1300

Input:

σ BRANCH_NAME="Perryride" (LOAN)

Output:

BRANCH_NAME LOAN_NO AMOUNT

Perryride L-15 1500

Perryride L-16 1300
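The select operation can be mimicked in Python by filtering a list of tuples. This is a small sketch: the function name select and the choice to represent each LOAN row as a (BRANCH_NAME, LOAN_NO, AMOUNT) tuple are assumptions made for illustration.

```python
# LOAN relation from the example: (BRANCH_NAME, LOAN_NO, AMOUNT)
LOAN = [
    ("Downtown", "L-17", 1000),
    ("Redwood", "L-23", 2000),
    ("Perryride", "L-15", 1500),
    ("Downtown", "L-14", 1500),
    ("Mianus", "L-13", 500),
    ("Roundhill", "L-11", 900),
    ("Perryride", "L-16", 1300),
]


def select(relation, predicate):
    """σ predicate (relation): keep only the tuples for which predicate is true."""
    return [t for t in relation if predicate(t)]


# σ BRANCH_NAME="Perryride" (LOAN)
perryride_loans = select(LOAN, lambda t: t[0] == "Perryride")
```

The predicate can combine conditions with and, or and not, for example lambda t: t[0] == "Downtown" and t[2] > 1200.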

2. Project Operation:
o This operation shows the list of those attributes that we wish to appear in the result.
Rest of the
attributes are eliminated from the table.

o It is denoted by ∏.

Notation: ∏ A1, A2, ..., An (r)

Where

A1, A2, ..., An are attribute names of relation r.

Example: CUSTOMER RELATION

NAME STREET CITY

Jones Main Harrison

Smith North Rye

Hays Main Harrison

Curry North Rye

Johnson Alma Brooklyn


Brooks Senator Brooklyn

Input:

∏ NAME, CITY (CUSTOMER)

Output:

NAME CITY

Jones Harrison

Smith Rye

Hays Harrison

Curry Rye

Johnson Brooklyn

Brooks Brooklyn
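A projection can be sketched in Python as picking the wanted attributes from each row and dropping duplicate results. The function name project and the use of dictionaries for rows are assumptions for illustration.

```python
CUSTOMER = [
    {"NAME": "Jones", "STREET": "Main", "CITY": "Harrison"},
    {"NAME": "Smith", "STREET": "North", "CITY": "Rye"},
    {"NAME": "Hays", "STREET": "Main", "CITY": "Harrison"},
    {"NAME": "Curry", "STREET": "North", "CITY": "Rye"},
    {"NAME": "Johnson", "STREET": "Alma", "CITY": "Brooklyn"},
    {"NAME": "Brooks", "STREET": "Senator", "CITY": "Brooklyn"},
]


def project(relation, attrs):
    """∏ attrs (relation): keep the named attributes and drop duplicate rows."""
    result = []
    for row in relation:
        reduced = tuple(row[a] for a in attrs)
        if reduced not in result:
            result.append(reduced)
    return result


# ∏ NAME, CITY (CUSTOMER)
name_city = project(CUSTOMER, ["NAME", "CITY"])
```

Projecting on CITY alone shows the duplicate elimination: six rows collapse to the three distinct cities.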

3. Union Operation:
o Suppose there are two relations R and S. The union operation contains all the tuples that
are either in R or S or both in R & S.
o It eliminates the duplicate tuples. It is denoted by ∪.

Notation: R ∪ S

A union operation must hold the following conditions:

o R and S must have the same number of attributes.
o Duplicate tuples are eliminated automatically.

Example:
DEPOSITOR RELATION

CUSTOMER_NAME ACCOUNT_NO

Johnson A-101
Smith A-121

Mayes A-321

Turner A-176

Johnson A-273

Jones A-472

Lindsay A-284

BORROW RELATION

CUSTOMER_NAME LOAN_NO

Jones L-17

Smith L-23

Hayes L-15

Jackson L-14

Curry L-93

Smith L-11

Williams L-17

Input:

∏ CUSTOMER_NAME (BORROW) ∪ ∏ CUSTOMER_NAME (DEPOSITOR)

Output:

CUSTOMER_NAME

Johnson
Smith

Hayes

Turner

Jones

Lindsay

Jackson

Curry

Williams

Mayes

4. Set Intersection:
o Suppose there are two relations R and S. The set intersection operation contains all tuples
that are in both R & S.
o It is denoted by ∩.

Notation: R ∩ S

Example: Using the above DEPOSITOR table and BORROW table

Input:

∏ CUSTOMER_NAME (BORROW) ∩ ∏ CUSTOMER_NAME (DEPOSITOR)

Output:

CUSTOMER_NAME

Smith

Jones

5. Set Difference:
o Suppose there are two relations R and S. The set difference operation contains all tuples
that are in R but not in S.
o It is denoted by minus (-).

Notation: R - S

Example: Using the above DEPOSITOR table and BORROW table

Input:

∏ CUSTOMER_NAME (BORROW) - ∏ CUSTOMER_NAME (DEPOSITOR)

Output:

CUSTOMER_NAME

Jackson

Hayes

Williams

Curry
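The union, intersection and set difference examples above map directly onto Python's built-in set type. The sketch below projects CUSTOMER_NAME from each relation into a set (which removes duplicates, like ∏) and then applies the three operators. The variable names are assumptions for illustration.

```python
# (CUSTOMER_NAME, ACCOUNT_NO)
DEPOSITOR = [
    ("Johnson", "A-101"), ("Smith", "A-121"), ("Mayes", "A-321"),
    ("Turner", "A-176"), ("Johnson", "A-273"), ("Jones", "A-472"),
    ("Lindsay", "A-284"),
]
# (CUSTOMER_NAME, LOAN_NO)
BORROW = [
    ("Jones", "L-17"), ("Smith", "L-23"), ("Hayes", "L-15"),
    ("Jackson", "L-14"), ("Curry", "L-93"), ("Smith", "L-11"),
    ("Williams", "L-17"),
]

# ∏ CUSTOMER_NAME of each relation; a set drops duplicates such as Johnson and Smith
depositor_names = {name for name, _ in DEPOSITOR}
borrower_names = {name for name, _ in BORROW}

union = borrower_names | depositor_names           # in either relation
intersection = borrower_names & depositor_names    # in both relations
difference = borrower_names - depositor_names      # borrowers who are not depositors
```

Sets are unordered, so the names may print in a different order than in the tables above; the contents are the same.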

6. Cartesian product
o The Cartesian product is used to combine each row in one table with each row in the
other table. It is
also known as a cross product.
o It is denoted by X.

Notation: E X D
Example:
EMPLOYEE

EMP_ID EMP_NAME EMP_DEPT

1 Smith A

2 Harry C

3 John B

DEPARTMENT
DEPT_NO DEPT_NAME

A Marketing

B Sales

C Legal

Input:

EMPLOYEE X DEPARTMENT

Output:

EMP_ID EMP_NAME EMP_DEPT DEPT_NO DEPT_NAME

1 Smith A A Marketing

1 Smith A B Sales

1 Smith A C Legal

2 Harry C A Marketing

2 Harry C B Sales

2 Harry C C Legal

3 John B A Marketing

3 John B B Sales

3 John B C Legal
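The Cartesian product above can be reproduced with itertools.product from the Python standard library. This is a minimal sketch in which each row is a tuple and the result rows are formed by concatenating one tuple from each relation.

```python
from itertools import product

# (EMP_ID, EMP_NAME, EMP_DEPT)
EMPLOYEE = [(1, "Smith", "A"), (2, "Harry", "C"), (3, "John", "B")]
# (DEPT_NO, DEPT_NAME)
DEPARTMENT = [("A", "Marketing"), ("B", "Sales"), ("C", "Legal")]

# EMPLOYEE X DEPARTMENT: every employee row paired with every department row
cross = [e + d for e, d in product(EMPLOYEE, DEPARTMENT)]
```

With 3 rows on each side the result has 3 × 3 = 9 rows, in the same order as the output table.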

7. Rename Operation:
The rename operation is used to rename the output relation. It is denoted by
rho (ρ).

Example: We can use the rename operator to rename STUDENT relation to
STUDENT1.

Join Operations:

A Join operation combines related tuples from different relations, if and only if a given
join condition is satisfied. It is denoted by ⋈.

Example:
EMPLOYEE

EMP_CODE EMP_NAME

101 Stephan

102 Jack

103 Harry

SALARY

EMP_CODE SALARY

101 50000

102 30000

103 25000

Operation: (EMPLOYEE ⋈ SALARY)

Result:


EMP_CODE EMP_NAME SALARY

101 Stephan 50000

102 Jack 30000


103 Harry 25000

Types of Join operations:

1. Natural Join:
o A natural join is the set of tuples of all combinations in R and S that are equal on their
common
attribute names.

o It is denoted by ⋈.

Example: Let's use the above EMPLOYEE table and SALARY table:

Input:

∏ EMP_NAME, SALARY (EMPLOYEE ⋈ SALARY)

Output:

EMP_NAME SALARY
Stephan 50000
Jack 30000

Harry 25000
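A natural join can be sketched in Python by pairing rows that agree on every attribute name the two relations share. The function name natural_join and the dictionary row format are assumptions for illustration, and the sketch assumes every row of a relation has the same attribute names.

```python
EMPLOYEE = [
    {"EMP_CODE": 101, "EMP_NAME": "Stephan"},
    {"EMP_CODE": 102, "EMP_NAME": "Jack"},
    {"EMP_CODE": 103, "EMP_NAME": "Harry"},
]
SALARY = [
    {"EMP_CODE": 101, "SALARY": 50000},
    {"EMP_CODE": 102, "SALARY": 30000},
    {"EMP_CODE": 103, "SALARY": 25000},
]


def natural_join(r, s):
    """Combine rows of r and s that agree on every attribute name they share."""
    if not r or not s:
        return []
    common = set(r[0]) & set(s[0])  # here: {"EMP_CODE"}
    return [
        {**a, **b}
        for a in r
        for b in s
        if all(a[c] == b[c] for c in common)
    ]


joined = natural_join(EMPLOYEE, SALARY)
# ∏ EMP_NAME, SALARY (EMPLOYEE ⋈ SALARY)
names_and_salaries = [(row["EMP_NAME"], row["SALARY"]) for row in joined]
```

The shared attribute EMP_CODE appears only once in each joined row, which is what distinguishes a natural join from a plain Cartesian product followed by a selection.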

2. Outer Join:
The outer join operation is an extension of the join operation. It is used to deal with
missing information.

Example:

EMPLOYEE

EMP_NAME STREET CITY

Ram Civil line Mumbai

Shyam Park street Kolkata

Ravi M.G. Street Delhi

Hari Nehru nagar Hyderabad

FACT_WORKERS

EMP_NAME BRANCH SALARY

Ram Infosys 10000

Shyam Wipro 20000

Kuber HCL 30000

Hari TCS 50000

Input:

(EMPLOYEE ⋈ FACT_WORKERS)

Output:
EMP_NAME STREET CITY BRANCH SALARY

Ram Civil line Mumbai Infosys 10000

Shyam Park street Kolkata Wipro 20000

Hari Nehru nagar Hyderabad TCS 50000

An outer join is basically of three types:

a. Left outer join


b. Right outer join

c. Full outer join

a. Left outer join:


o Left outer join contains the set of tuples of all combinations in R and S that are equal on
their common attribute names.

o In the left outer join, tuples in R that have no matching tuples in S are also kept, with
NULL in the attributes that come from S.

o It is denoted by ⟕.

Example: Using the above EMPLOYEE table and FACT_WORKERS table

Input:

EMPLOYEE ⟕ FACT_WORKERS

Output:

EMP_NAME STREET CITY BRANCH SALARY

Ram Civil line Mumbai Infosys 10000

Shyam Park street Kolkata Wipro 20000

Hari Nehru nagar Hyderabad TCS 50000

Ravi M.G. Street Delhi NULL NULL

b. Right outer join:


o Right outer join contains the set of tuples of all combinations in R and S that are
equal on their common attribute names.
o In the right outer join, tuples in S that have no matching tuples in R are also kept, with
NULL in the attributes that come from R.

o It is denoted by ⟖.
Example: Using the above EMPLOYEE table and FACT_WORKERS Relation

Input:

EMPLOYEE ⟖ FACT_WORKERS

Output:

EMP_NAME BRANCH SALARY STREET CITY

Ram Infosys 10000 Civil line Mumbai

Shyam Wipro 20000 Park street Kolkata

Hari TCS 50000 Nehru nagar Hyderabad

Kuber HCL 30000 NULL NULL

c. Full outer join:


o Full outer join is like a left or right join except that it contains all rows from both tables.
o In the full outer join, tuples in R that have no matching tuples in S and tuples in S that have
no matching tuples in R are both kept, with NULL in the missing attributes.

o It is denoted by ⟗.

Example: Using the above EMPLOYEE table and FACT_WORKERS table

Input:

EMPLOYEE ⟗ FACT_WORKERS

Output:

EMP_NAME STREET CITY BRANCH SALARY

Ram Civil line Mumbai Infosys 10000

Shyam Park street Kolkata Wipro 20000

Hari Nehru nagar Hyderabad TCS 50000

Ravi M.G. Street Delhi NULL NULL

Kuber NULL NULL HCL 30000
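The three outer joins differ only in which names are kept. The sketch below shows this in Python, using None in place of NULL. It assumes EMP_NAME is unique within each relation, which holds for the sample data but is not stated in the text, and the helper name outer_join is made up for illustration.

```python
EMPLOYEE = {  # EMP_NAME -> (STREET, CITY)
    "Ram": ("Civil line", "Mumbai"),
    "Shyam": ("Park street", "Kolkata"),
    "Ravi": ("M.G. Street", "Delhi"),
    "Hari": ("Nehru nagar", "Hyderabad"),
}
FACT_WORKERS = {  # EMP_NAME -> (BRANCH, SALARY)
    "Ram": ("Infosys", 10000),
    "Shyam": ("Wipro", 20000),
    "Kuber": ("HCL", 30000),
    "Hari": ("TCS", 50000),
}
NULLS = (None, None)  # padding for the side that has no matching tuple


def outer_join(names):
    """Build (EMP_NAME, STREET, CITY, BRANCH, SALARY) rows for the given names."""
    return [
        (n,) + EMPLOYEE.get(n, NULLS) + FACT_WORKERS.get(n, NULLS)
        for n in names
    ]


inner = outer_join([n for n in EMPLOYEE if n in FACT_WORKERS])  # names on both sides
left = outer_join(EMPLOYEE)                                      # every EMPLOYEE name
right = outer_join(FACT_WORKERS)                                 # every FACT_WORKERS name
full = outer_join(dict.fromkeys([*EMPLOYEE, *FACT_WORKERS]))     # every name from either side
```

Ravi appears only in left and full, Kuber only in right and full, and neither appears in inner.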

3. Equi join:
It is also known as an inner join. It is the most common join. It is based on matched
data as per the equality condition. The equi join uses the comparison operator(=).

Example:

CUSTOMER RELATION

CLASS_ID NAME

1 John

2 Harry

3 Jackson

PRODUCT

PRODUCT_ID CITY

1 Delhi

2 Mumbai

3 Noida

Input:

CUSTOMER ⋈ CUSTOMER.CLASS_ID = PRODUCT.PRODUCT_ID PRODUCT

Output:

CLASS_ID NAME PRODUCT_ID CITY

1 John 1 Delhi

2 Harry 2 Mumbai
3 Jackson 3 Noida
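An equi join keeps only the pairs of rows where the compared attributes are equal. A minimal Python sketch, assuming the join condition is CLASS_ID = PRODUCT_ID (the output table suggests this, but the input line does not state it):

```python
# (CLASS_ID, NAME)
CUSTOMER = [(1, "John"), (2, "Harry"), (3, "Jackson")]
# (PRODUCT_ID, CITY)
PRODUCT = [(1, "Delhi"), (2, "Mumbai"), (3, "Noida")]

# Keep a pair only when CLASS_ID equals PRODUCT_ID; both columns stay in the result.
equi = [c + p for c in CUSTOMER for p in PRODUCT if c[0] == p[0]]
```

Unlike a natural join, both compared columns remain in each result row, so the third row is (3, "Jackson", 3, "Noida").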

Integrity Constraints
o Integrity constraints are a set of rules. They are used to maintain the quality of information.
o Integrity constraints ensure that data insertion, updating, and other processes
are performed in such a way that data integrity is not affected.

o Thus, integrity constraints are used to guard against accidental damage to the database.

Types of Integrity Constraint

1. Domain constraints
o Domain constraints can be defined as the definition of a valid set of values for an attribute.
o The data type of domain includes string, character, integer, time, date, currency, etc.
The value of the
attribute must be available in the corresponding domain.

Example:
2. Entity integrity constraints
o The entity integrity constraint states that a primary key value can't be null.
o This is because the primary key value is used to identify individual rows in a relation, and if
the primary key has a null value, then we can't identify those rows.

o A table can contain null values in fields other than the primary key field.

Example:

3. Referential Integrity Constraints


o A referential integrity constraint is specified between two tables.
o In the Referential integrity constraints, if a foreign key in Table 1 refers to the Primary
Key of Table 2,
then every value of the Foreign Key in Table 1 must be null or be available in Table 2.

Example:
4. Key constraints
o A key is an attribute or set of attributes used to identify an entity within its entity set uniquely.
o An entity set can have multiple keys, out of which one key will be the primary key. A
primary key must contain unique, non-null values in the relational table.
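The domain, entity integrity and key constraints can be tried out with Python's sqlite3 module. This is a sketch under stated assumptions: the STUDENT columns, the CHECK rule on AGE, the sample row and the helper rejected are all invented for illustration. The primary key is declared NOT NULL explicitly because SQLite does not always reject NULL in a primary key column on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE STUDENT (
           ROLL_NO TEXT NOT NULL PRIMARY KEY,  -- entity integrity (not null) and key constraint (unique)
           NAME    TEXT,
           AGE     INTEGER CHECK (typeof(AGE) = 'integer' AND AGE > 0)  -- a simple domain constraint
       )"""
)
conn.execute("INSERT INTO STUDENT VALUES ('S1', 'Asha', 20)")  # hypothetical valid row


def rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True
```

For example, rejected("INSERT INTO STUDENT VALUES (NULL, 'X', 21)") is True (null primary key), as are a second row with ROLL_NO 'S1' (duplicate key) and a row whose AGE is the text 'A' (outside the domain).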

Example:

Relational Calculus
There is an alternate way of formulating queries known as Relational Calculus.
Relational calculus is a non-procedural query language. In a non-procedural query
language, the user is not concerned with the details of how to obtain the end results. The
relational calculus tells what to do but never explains how to do it. Most commercial
relational languages are based on aspects of relational calculus, including SQL, QBE
and QUEL.
Why is it called Relational Calculus?
It is based on Predicate calculus, a name derived from a branch of symbolic logic. A
predicate is a truth-valued function with arguments. On substituting values for the
arguments, the function results in an expression called a proposition. It can be either true
or false. Relational calculus is a tailored version of a subset of the Predicate Calculus,
used to communicate with the relational database.

Many of the calculus expressions involve the use of Quantifiers. There are two
types of quantifiers:

o Universal Quantifiers: The universal quantifier, denoted by ∀, is read as "for all", which
means that in a given set of tuples, all tuples satisfy a given condition.

o Existential Quantifiers: The existential quantifier, denoted by ∃, is read as "there exists",
which means that in a given set of tuples, there is at least one occurrence whose value
satisfies a given condition.
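The two quantifiers correspond closely to Python's built-in all() and any(). A tiny sketch with made-up ages:

```python
ages = [20, 30, 25]  # hypothetical attribute values from a set of tuples

all_adults = all(age >= 18 for age in ages)    # ∀: every tuple satisfies the condition
some_over_28 = any(age > 28 for age in ages)   # ∃: at least one tuple satisfies it
all_over_28 = all(age > 28 for age in ages)    # ∀ fails as soon as one tuple does not
```

Here all_adults and some_over_28 are True, while all_over_28 is False because 20 and 25 do not satisfy the condition.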

Before using the concept of quantifiers in formulas, we need to know the concept of Free
and Bound Variables.

A tuple variable t is bound if it is quantified, which means that it appears within the
scope of a quantifier (∀ or ∃). A variable that is not bound is said to be free.

Free and bound variables may be compared with the global and local variables of
programming languages.

Types of Relational calculus:

1. Tuple Relational Calculus (TRC)


It is a non-procedural query language based on finding the tuple variables (also known
as range variables) for which a predicate holds true. It describes the
desired information without giving a specific procedure for obtaining that information.
The tuple relational calculus is specified to select the tuples in a relation. In TRC, the
filtering variable ranges over the tuples of a relation. The result of the query can have one
or more tuples.
Notation:

A query in the tuple relational calculus is expressed in the following notation:

{T | P (T)} or {T | Condition (T)}

Where

T is the resulting tuples

P(T) is the condition used to fetch T.

For example:

{ T.name | Author(T) AND T.article = 'database' }

Output: This query selects the tuples from the AUTHOR relation. It returns a tuple with
'name' from Author who has written an article on 'database'.

TRC (tuple relation calculus) can be quantified. In TRC, we can use Existential (∃) and
Universal Quantifiers (∀).

For example:

{ R | ∃T ∈ Authors (T.article = 'database' AND R.name = T.name) }

Output: This query will yield the same result as the previous one.
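Both TRC queries read much like Python set comprehensions, where the loop variable plays the role of the tuple variable. The sketch below uses a made-up Author relation, so the names and rows are assumptions for illustration.

```python
from collections import namedtuple

Author = namedtuple("Author", ["name", "article"])
AUTHORS = [  # hypothetical rows
    Author("Asha", "database"),
    Author("Ravi", "networks"),
    Author("Meera", "database"),
]

# { T.name | Author(T) AND T.article = 'database' }
names_direct = {t.name for t in AUTHORS if t.article == "database"}

# { R | ∃T ∈ Authors (T.article = 'database' AND R.name = T.name) }
candidate_names = {t.name for t in AUTHORS}
names_exists = {
    r
    for r in candidate_names
    if any(t.article == "database" and t.name == r for t in AUTHORS)
}
```

Both comprehensions give the same set of names, matching the statement that the quantified query yields the same result as the first one.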

2. Domain Relational Calculus (DRC)


The second form of relational calculus is known as Domain relational calculus. In domain
relational calculus, the filtering variable uses the domain of attributes. Domain relational
calculus uses the same operators as tuple calculus. It uses logical connectives ∧ (and),
∨ (or) and ¬ (not). It uses Existential (∃) and Universal Quantifiers (∀) to bind the
variable. QBE, or Query by Example, is a query language related to domain
relational calculus.

Notation:

{ a1, a2, a3, ..., an | P (a1, a2, a3, ... ,an)}

Where

a1, a2, ..., an are attributes

P stands for a formula built from these attributes

For example:
{< article, page, subject > | < article, page, subject > ∈ javatpoint ∧ subject = 'database'}
Output: This query will yield the article, page, and subject from the relation
javatpoint, where the subject is 'database'.
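In domain relational calculus the variables range over attribute values rather than whole tuples. In Python this looks like unpacking each row into separate variables. The rows below are invented for illustration.

```python
# Hypothetical javatpoint relation: (article, page, subject)
JAVATPOINT = [
    ("Keys", 12, "database"),
    ("Threads", 40, "java"),
    ("Joins", 27, "database"),
]

# {< article, page, subject > | < article, page, subject > ∈ javatpoint ∧ subject = 'database'}
result = [
    (article, page, subject)
    for (article, page, subject) in JAVATPOINT
    if subject == "database"
]
```

Each of article, page and subject is a domain variable here, in contrast to the single tuple variable T used in TRC.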
