DBMS Module 1
A file system controls how data is read from and written to a storage medium. It is
installed on the computer as part of an operating system such as Windows or Linux.
What is DBMS?
A Database Management System (DBMS) is software for storing and retrieving users’ data while
applying appropriate security measures. It consists of a group of programs that manipulate
the database. The DBMS accepts a request for data from an application and instructs the
DBMS engine to provide the specific data. In large systems, a DBMS helps users and
third-party software to store and retrieve data.
KEY DIFFERENCES:
• A file system is software that manages and organizes the files on a storage medium,
whereas a DBMS is a software application used for creating, accessing, and
managing databases.
• The file system has no crash recovery mechanism; a DBMS, on the other hand,
provides one.
• Data inconsistency is higher in the file system, whereas it is low
in a database management system.
• The file system does not support complicated transactions, whereas in a DBMS
it is easy to implement complicated transactions using SQL.
• The file system does not offer concurrency, whereas a DBMS provides a concurrency facility.
Disadvantages of the file system:
• Each application has its own data file, so the same data may have to be recorded and stored
many times.
• Programs in a file processing system are data-dependent, so a change in file format makes
existing programs incompatible with the files.
• Data sharing is limited.
• Security is a problem.
• Processing is time-consuming.
• It can maintain the records of a big firm with a large number of items, but doing so
requires a lot of manual labor.
Disadvantages of DBMS:
• The cost of the hardware and software for a DBMS is quite high, which increases the budget of
your organization.
• Most database management systems are complex, so users need training
to use the DBMS.
• Simultaneous use of the same program by many users can sometimes lead to the loss of
some data.
• A DBMS is not designed to perform sophisticated calculations.
• Query response time can become less predictable as data sets grow large.
• It requires a processor with a high speed of data processing.
• The database can fail because of a power failure, or when the whole system stops.
• The cost of a DBMS depends on the environment, the functions required, and the recurrent annual
maintenance cost.
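The transaction and crash-recovery support listed among the key differences can be sketched in a few lines. This is only an illustration, using Python's built-in sqlite3 module as a stand-in DBMS; the account table, the names alice and bob, and the transfer function are hypothetical and do not come from these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit, so the credit was undone too.
        return False

def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account"))
```

A transfer that would overdraw the source account fails as a whole, leaving both balances as they were. A plain file system offers no such guarantee by itself.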
The external schema
The external schemas describe the database as it is seen by the user and by user applications.
The external schema maps onto the conceptual schema.
There may be many external schemas, each reflecting a simplified model of the world, as seen
by particular applications. External schemas may be modified, or new ones created, without the
need to make alterations to the physical storage of data. The interface between the external
schema and the conceptual schema can be amended to accommodate any such changes.
The external schema allows the application programs to see as much of the data as they
require, while excluding other items that are not relevant to that application. In this way, the
external schema provides a view of the data that corresponds to the nature of each task.
The external schema is more than a subset of the conceptual schema. While items in the
external schema must be derivable from the conceptual schema, this could be a complicated
process, involving computation and other activities.
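As a rough sketch of an external schema that is "more than a subset", a SQL view can hide some items and expose others that are computed from stored data. The example assumes SQLite through Python's sqlite3 module; the staff table, its columns, and the payroll_view name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual schema: the organisation-wide description of staff data.
conn.execute(
    "CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, "
    "monthly_salary INTEGER, home_address TEXT)"
)
conn.execute("INSERT INTO staff VALUES (1, 'John', 3000, '12 Hill Road')")

# External schema for a payroll application: the address is excluded and
# annual_salary is derived by computation rather than stored.
conn.execute(
    "CREATE VIEW payroll_view AS "
    "SELECT name, monthly_salary * 12 AS annual_salary FROM staff"
)

cursor = conn.execute("SELECT * FROM payroll_view")
columns = [d[0] for d in cursor.description]
rows = cursor.fetchall()
```

The payroll application sees only name and annual_salary, even though neither the view nor the derived column exists in physical storage.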
The conceptual schema
The conceptual schema describes the universe of interest to the users of the database system.
For a company, for example, it would provide a description of all of the data required to be
stored in a database system. From this organisation-wide description of the data, external
schemas can be derived to provide the data for specific users or to support particular tasks.
At the level of the conceptual schema, programmers are concerned with the data itself, rather
than storage or the way data is physically accessed on disk. The definition of storage and access
details is the preserve of the internal schema.
The internal schema
A database will have only one internal schema, which contains definitions of the way in which
data is physically stored. The interface between the internal schema and the conceptual
schema identifies how an element in the conceptual schema is stored, and how it may be
accessed.
If the internal schema is changed, this will need to be addressed in the interface between the
internal and the conceptual schemas, but the conceptual and external schemas will not need to
change. This means that changes in physical storage devices such as disks, and changes in the
way files are organised on storage devices, are transparent to users and application programs.
In distinguishing between 'logical' and 'physical' views of a system, it should be noted that the
difference could depend on the nature of the user. While 'logical' describes the user angle, and
'physical' relates to the computer view, database designers may regard relations (for staff
records) as logical and the database itself as physical. This may contrast with the perspective of
a systems programmer, who may consider data files as logical in concept, but their
implementation on magnetic disks in cylinders, tracks and sectors as physical.
Logical data independence
Any changes to the conceptual schema can be isolated from the external schema and the
internal schema; such changes will be reflected in the interface between the conceptual
schema and the other levels. This achieves logical data independence. What this means,
effectively, is that changes can be made at the conceptual level, where the overall model of an
organisation's data is specified, and these changes can be made independently of both the
physical storage level, and the external level seen by individual users. The changes are handled
by the interfaces between the conceptual, middle layer, and the physical and external layers.
Database Schema
A database schema is the skeleton structure that represents the logical view of the entire
database. It defines how the data is organized and how the relations among the data are
associated. It formulates all the constraints that are to be applied to the data.
A database schema defines its entities and the relationships among them. It contains a
descriptive detail of the database, which can be depicted by means of schema diagrams. It is
the database designers who design the schema to help programmers understand the database
and make it useful.
A database schema can be divided broadly into two categories −
• Physical Database Schema − This schema pertains to the actual storage of data and its
form of storage like files, indices, etc. It defines how the data will be stored in a
secondary storage.
• Logical Database Schema − This schema defines all the logical constraints that need to
be applied on the data stored. It defines tables, views, and integrity constraints.
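A logical schema of the kind described above can be written down as table, view, and constraint definitions. The following is a minimal sketch assuming SQLite through Python's sqlite3 module; the department and employee tables and the try_insert helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    salary  INTEGER CHECK (salary > 0),
    dept_id INTEGER NOT NULL REFERENCES department(dept_id)
);
CREATE VIEW employee_names AS SELECT emp_id, name FROM employee;
""")

def try_insert(sql, params):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False
```

Rows that break a CHECK or REFERENCES constraint are refused, which is how the logical schema's integrity constraints take effect on the stored data.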
Database Instance
A database schema is the skeleton of the database. It is designed before the database exists at
all. Once the database is operational, it is very difficult to make any changes to the schema. A database
schema does not contain any data or information.
A database instance is the state of an operational database, with its data, at a given time. It contains
a snapshot of the database. Database instances tend to change with time. A DBMS ensures
that every instance (state) is valid, by diligently following all the validations,
constraints, and conditions that the database designers have imposed.
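The difference between schema and instance can be seen by reading both before and after some inserts. This sketch assumes SQLite, whose catalog table sqlite_master holds the table definitions; the item table and the helper names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

def schema(conn):
    """The schema: table definitions only, no data."""
    query = "SELECT sql FROM sqlite_master WHERE type = 'table'"
    return [row[0] for row in conn.execute(query)]

def instance(conn):
    """The instance: the rows stored at this moment."""
    return conn.execute("SELECT id, name FROM item ORDER BY id").fetchall()

schema_before, instance_before = schema(conn), instance(conn)
conn.execute("INSERT INTO item (name) VALUES ('pen')")
conn.execute("INSERT INTO item (name) VALUES ('ink')")
schema_after, instance_after = schema(conn), instance(conn)
```

The schema is identical before and after, while the instance has changed from empty to two rows.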
DBMS Three Level Architecture Diagram
1. External level
It is also called the view level. This level is called “view” because several users can
view their desired data from this level, which is internally fetched from the database with the help
of conceptual and internal level mapping.
The user doesn’t need to know database schema details such as data structures, table
definitions, etc. The user is only concerned with the data that is returned to the view level
after it has been fetched from the database (present at the internal level).
The external level is the “top level” of the Three Level DBMS Architecture.
2. Conceptual level
It is also called the logical level. The whole design of the database, such as the relationships among data
and the schema of the data, is described at this level.
Database constraints and security are also implemented at this level of the architecture. This level is
maintained by the DBA (database administrator).
3. Internal level
This level is also known as physical level. This level describes how the data is actually stored in
the storage devices. This level is also responsible for allocating space to the data. This is the
lowest level of the architecture.
Data Abstraction is a process of hiding unwanted or irrelevant details from the end user. It
provides a different view and helps in achieving data independence which is used to enhance
the security of data.
The database systems consist of complicated data structures and relations. For users to access
the data easily, these complications are kept hidden, and only the relevant part of the database
is made accessible to the users through data abstraction.
Levels of abstraction for DBMS
Database systems include complex data structures. To make data retrieval efficient and to
reduce complexity for users, developers use levels of abstraction that hide irrelevant details
from the users. Levels of abstraction simplify database design.
There are three main levels of abstraction in a DBMS: the physical (internal) level, the logical
(conceptual) level, and the view (external) level.
The three levels of abstraction in the database do not exist independently of each other. There must be
some correspondence, or mapping, between the levels. There are two types of mappings: the
internal / conceptual mapping and the conceptual / external mapping.
The internal / conceptual mapping lies between the internal and conceptual levels, and defines the
correspondence between the records and fields of the conceptual view and the files and data
structures of the internal view. If the structure of the stored database is changed, then the
internal / conceptual mapping must also be changed accordingly so that the view from the
conceptual level remains unchanged. It is this mapping that gives physical data independence for
the database. For instance, programmers may modify the internal view of the student relation by
breaking the student file into two files, one containing enrolment, address and name and the other
containing enrolment and programme. The mapping will then make sure that the conceptual
view is restored to its original form. Such storage decisions are taken primarily for optimisation purposes.
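The student-file example can be approximated as follows. In a real DBMS the internal / conceptual mapping is maintained inside the system; here a SQL view stands in for it, which is an assumption made only for illustration, as are the table names and sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Internal level after the change: the student data is held in two stored tables.
CREATE TABLE student_personal (enrolment INTEGER PRIMARY KEY, name TEXT, address TEXT);
CREATE TABLE student_programme (enrolment INTEGER PRIMARY KEY, programme TEXT);

-- The mapping: rebuild the original conceptual-level student relation.
CREATE VIEW student AS
    SELECT p.enrolment, p.name, p.address, g.programme
    FROM student_personal AS p
    JOIN student_programme AS g ON g.enrolment = p.enrolment;
""")
conn.execute("INSERT INTO student_personal VALUES (101, 'Asha', '4 Lake View')")
conn.execute("INSERT INTO student_programme VALUES (101, 'BCA')")

# A query written against the conceptual view does not mention the split.
result = conn.execute(
    "SELECT name, programme FROM student WHERE enrolment = 101"
).fetchall()
```

The query against student keeps working although the stored data now lives in two places, which is the essence of physical data independence.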
The conceptual / external mapping lies between the external and conceptual levels, and defines the
correspondence between a particular external view and the conceptual view. Although these two
levels are similar, some elements found in a particular external view may differ from the
conceptual view. For example, several fields can be combined into a single (virtual) field, which
can also have a different name from the original fields. If the structure of the database at the
conceptual level is changed, then the conceptual / external mapping must change accordingly
so that the view from the external level remains constant. It is this mapping that gives logical
data independence for the database. For instance, we may change the student relation to have
more fields at the conceptual level, yet this need not change the existing user views at all.
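Logical data independence can be sketched in the same spirit: a field is added to the student relation, and an existing user view is checked before and after. The sketch assumes SQLite through Python's sqlite3 module; the student_list view, the email column, and the sample row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (enrolment INTEGER PRIMARY KEY, name TEXT, programme TEXT)"
)
conn.execute("INSERT INTO student VALUES (101, 'Asha', 'BCA')")

# External view used by one application; it names its columns explicitly.
conn.execute("CREATE VIEW student_list AS SELECT enrolment, name FROM student")

def view_columns_and_rows(conn):
    """Return what an application sees through the external view."""
    cursor = conn.execute("SELECT * FROM student_list")
    return [d[0] for d in cursor.description], cursor.fetchall()

before = view_columns_and_rows(conn)
# Conceptual-level change: the student relation gains a new field.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
after = view_columns_and_rows(conn)
```

The application reading student_list sees the same columns and rows before and after the change to the underlying relation.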
It is also possible to have another mapping, where one external view is expressed in terms of other
external views (this could be called an external / external mapping). This is useful if several
external views are closely related to one another, as it lets you avoid mapping each of
the similar external views directly to the conceptual level.
The physical architecture describes the software components used to enter and process data,
and how these software components are related and interconnected. Though it is not possible
to generalise the component structure of a DBMS, it is possible to identify a number of key
functions which are common to most database management systems. The components that
normally perform these functions are shown in Figure , which depicts the physical architecture
of a typical DBMS.
DML Precompiler
Every DBMS has two basic sets of languages: the Data Definition Language (DDL), which contains the
set of commands needed to define the format of the data being stored, and the Data
Manipulation Language (DML), which contains the set of commands that modify and process data to
produce user-definable output. DML statements can also be written in an application
program. The DML precompiler converts DML statements (such as SELECT...FROM in Structured
Query Language (SQL) covered in Block 2) embedded in an application program into normal
procedural calls in the host language. The precompiler interacts with the query processor in order
to produce the appropriate code.
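What a precompiler does can be shown with a toy version. This is not how any particular DBMS implements it: the "EXEC SQL ...;" marker syntax, the precompile function, and the db.execute call it emits are all assumptions chosen for illustration.

```python
import re

# Hypothetical marker syntax: "EXEC SQL <statement>;" embedded in host code.
EMBEDDED_SQL = re.compile(r"^(\s*)EXEC SQL (.+?);\s*$")

def precompile(source):
    """Replace each embedded SQL line with an ordinary host-language call."""
    output = []
    for line in source.splitlines():
        match = EMBEDDED_SQL.match(line)
        if match:
            indent, statement = match.groups()
            # Emit a procedural call on a (hypothetical) `db` object.
            output.append(f"{indent}db.execute({statement!r})")
        else:
            output.append(line)  # plain host-language code passes through
    return "\n".join(output)

host_program = "total = 0\nEXEC SQL SELECT name FROM staff;\nprint(total)"
compiled = precompile(host_program)
```

After precompilation the host program contains only host-language statements; the SQL text survives as an argument to a procedural call.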
Structure of Database Management System
A Database Management System (DBMS) is software that allows access to data stored in a
database and provides an easy and effective method of working with that data.
1. Query Processor : It contains the following components –
• DML Compiler – It translates DML statements into low-level instructions (machine
language) so that they can be executed.
• DDL Interpreter – It translates DDL statements into a set of tables containing metadata
(data about data).
2. Storage Manager : Storage Manager is a program that provides an interface between the
data stored in the database and the queries received. It is also known as Database Control
System. It maintains the consistency and integrity of the database by applying the constraints
and executes the DCL statements. It is responsible for updating, storing, deleting, and
retrieving data in the database.
It contains the following components –
• Authorization Manager –
It ensures role-based access control, i.e., it checks whether a particular person is
privileged to perform the requested operation.
• Integrity Manager –
It checks the integrity constraints when the database is modified.
• Transaction Manager –
It controls concurrent access by scheduling the operations of the transactions it
receives. Thus, it ensures that the database remains in a consistent
state before and after the execution of a transaction.
• File Manager –
It manages the file space and the data structures used to represent information in the
database.
• Buffer Manager –
It is responsible for cache memory and the transfer of data between secondary
storage and main memory.
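The buffer manager's role can be illustrated with a small in-memory page cache. The notes do not name a replacement policy, so the least-recently-used (LRU) eviction used here is an assumption, and the BufferManager class and the dictionary standing in for disk are hypothetical.

```python
from collections import OrderedDict

class BufferManager:
    """Keeps up to `capacity` pages in memory, evicting the least recently used."""

    def __init__(self, disk, capacity):
        self.disk = disk            # stand-in for secondary storage: page_id -> bytes
        self.capacity = capacity
        self.pool = OrderedDict()   # pages currently held in main memory
        self.disk_reads = 0

    def get_page(self, page_id):
        if page_id in self.pool:
            self.pool.move_to_end(page_id)   # hit: mark as most recently used
            return self.pool[page_id]
        self.disk_reads += 1                 # miss: fetch the page from "disk"
        page = self.disk[page_id]
        self.pool[page_id] = page
        if len(self.pool) > self.capacity:
            self.pool.popitem(last=False)    # evict the least recently used page
        return page
```

Repeated requests for a page already in the pool cost no disk read, which is why the buffer manager matters for performance.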
3. Disk Storage :
It contains the following components –
• Data Files –
They store the data.
• Data Dictionary –
It contains information about the structure of every database object. It is the repository
of information that governs the metadata.
• Indices –
They provide faster retrieval of data items.
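The effect of an index, and its appearance in the data dictionary, can be observed in SQLite. This sketch assumes Python's sqlite3 module; the item table and the idx_item_name index are hypothetical, and the exact wording of the query plan varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO item (name, price) VALUES (?, ?)",
    [(f"item{i}", i) for i in range(1000)],
)

def plan(conn, sql):
    """Return SQLite's description of how it would run the query."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT price FROM item WHERE name = 'item500'"
plan_without_index = plan(conn, query)   # a scan of the whole table
conn.execute("CREATE INDEX idx_item_name ON item (name)")
plan_with_index = plan(conn, query)      # a search that uses the index

# The data dictionary (SQLite's catalog table) now records the new index.
catalog = conn.execute("SELECT type, name FROM sqlite_master ORDER BY name").fetchall()
```

Without the index the plan scans every row; with it, the plan searches through idx_item_name instead.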
DBA
A Database Administrator (DBA) is the person responsible for controlling,
maintaining, coordinating, and operating the database management system. Managing,
securing, and taking care of the database system is the DBA’s prime responsibility.
DBAs are in charge of authorizing access to the database, coordinating and monitoring its use,
capacity planning, installation, and acquiring software and
hardware resources as and when needed. Their role also ranges over configuration, database
design, migration, security, troubleshooting, backup, and data recovery. Database
administration is a key function in any firm or organization that relies on one or
more databases. The DBA is the overall commander of the database system.
Types of Database Administrator (DBA) :
• Administrative DBA – Their job is to maintain the server and keep it functional. They are
concerned with data backups, security, troubleshooting, replication, migration, etc.
• Data Warehouse DBA – They perform the roles above, but are also held accountable for merging data from
various sources into the data warehouse. They also design the warehouse, and clean and
scrub data prior to loading.
• Development DBA – They build and develop queries, stored procedures, etc. that meet
the needs of the firm or organization. They are on a par with programmers.
• Application DBA – They manage all requirements of the application components
that interact with the database and carry out activities such as application installation and
coordination, application upgrades, database cloning, data load process management, etc.
• Architect – They are responsible for designing schemas, for example by building tables. They
work to build a structure that meets the organisation’s needs. The design is then used by
developers and development DBAs to design and implement the real application.
• OLAP DBA – They design and build multi-dimensional cubes for decision support or
OLAP systems.
Role and Duties of Database Administrator (DBA) :
• Decides hardware –
The DBA decides on economical hardware, based on the cost, performance, and efficiency of the
hardware, that best suits the organisation. Hardware is the interface between end users
and the database.
• Manages data integrity and security –
Data integrity needs to be checked and managed accurately, as it protects data
from unauthorized use. The DBA keeps an eye on relationships within the data to maintain data integrity.
• Database design –
The DBA is held responsible and accountable for logical and physical design, external model
design, and integrity and security control.
• Database implementation –
The DBA implements the DBMS and checks database loading at the time of its implementation.
• Query processing performance –
The DBA enhances query processing by improving its speed, performance, and accuracy.
• Tuning Database Performance –
If users are not able to get data quickly and accurately, the organization may lose
business. So by tuning SQL commands, the DBA can enhance the performance of the database.
4. Database Designers : Database Designers are the users who design the structure of the
database, which includes tables, indexes, views, constraints, triggers, and stored procedures.
They control what data must be stored and how the data items are to be related.
5. Application Programmers : Application Programmers are the back-end programmers who write the
code for the application programs. They are computer professionals. These programs
could be written in programming languages such as Visual Basic, Developer, C, FORTRAN,
COBOL, etc.
6. Casual Users / Temporary Users : Casual Users are the users who occasionally use/access
the database, but each time they access it they require new
information, for example, middle or higher level managers.
Data Model
A Data Model gives us an idea of how the final system will look after its complete
implementation. It defines the data elements and the relationships between the data
elements. Data Models are used to show how data is stored, connected, accessed and
updated in the database management system. Here, we use a set of symbols and text to
represent the information so that members of the organisation can communicate and
understand it. Though there are many data models in use nowadays, the Relational
model is the most widely used. Apart from the Relational model, there are many other
types of data models. Some of the Data
Models in DBMS are:
1. Hierarchical Model
2. Network Model
3. Entity-Relationship Model
4. Relational Model
5. Object-Oriented Data Model
6. Object-Relational Data Model
7. Flat Data Model
8. Semi-Structured Data Model
9. Associative Data Model
10. Context Data Model
Hierarchical Model
The Hierarchical Model was the first DBMS model. This model organises the data in a
hierarchical tree structure. The hierarchy starts from the root, which holds the root data, and then
expands in the form of a tree, adding child nodes to parent nodes. This model easily
represents some real-world relationships such as food recipes, the sitemap of a website,
etc. Example: We can represent the relationship between the shoes present on a shopping
website in the following way
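A hierarchy of this kind can be sketched as a nested structure in which every node has exactly one parent. The category names below are hypothetical and are not taken from the original figure; path_to is an illustrative helper.

```python
# Each node has exactly one parent; the category names are hypothetical.
shoes = {
    "name": "Shoes",
    "children": [
        {"name": "Women's Shoes", "children": [
            {"name": "Heels", "children": []},
            {"name": "Flats", "children": []},
        ]},
        {"name": "Men's Shoes", "children": [
            {"name": "Sneakers", "children": []},
        ]},
    ],
}

def path_to(node, target, trail=()):
    """Return the single root-to-node path for `target`, or None if absent."""
    trail = trail + (node["name"],)
    if node["name"] == target:
        return list(trail)
    for child in node["children"]:
        found = path_to(child, target, trail)
        if found:
            return found
    return None
```

Because each record has one parent, there is exactly one path from the root to any record.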
Network Model
This model is an extension of the hierarchical model. It was the most popular model before
the relational model. This model is the same as the hierarchical model; the only difference is
that a record can have more than one parent. It replaces the hierarchical tree with a
graph. Example: In the example below we can see that the node student has two parents, i.e. CSE
Department and Library. This was not possible in the hierarchical model.
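The student example can be sketched as a small graph in which a record may have several owners. The NetworkDB class and the "College" root record are hypothetical additions for illustration, and the sketch assumes the relationships contain no cycles.

```python
from collections import defaultdict

class NetworkDB:
    """Records linked by owner -> member relationships (a graph, not a tree)."""

    def __init__(self):
        self.members = defaultdict(list)  # owner record -> member records
        self.owners = defaultdict(list)   # member record -> owner records

    def link(self, owner, member):
        self.members[owner].append(member)
        self.owners[member].append(owner)

    def paths(self, start, target):
        """List every owner-to-member path from `start` down to `target`."""
        if start == target:
            return [[start]]
        return [[start] + rest
                for child in self.members[start]
                for rest in self.paths(child, target)]

db = NetworkDB()
db.link("College", "CSE Department")   # "College" is a hypothetical root
db.link("College", "Library")
db.link("CSE Department", "Student")
db.link("Library", "Student")
```

Student now has two parents, so it can be reached along more than one path, which a tree would not allow.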
Features of a Network Model
1. Ability to Merge more Relationships: In this model, there are more relationships, so
the data is more related. This model can manage one-to-one relationships as
well as many-to-many relationships.
2. Many paths: As there are more relationships, there can be more than one path to
the same record. This makes data access fast and simple.
3. Circular Linked List: The operations on the network model are done with the help of
a circular linked list. The current position is maintained with the help of a program,
and this position navigates through the records according to the relationship.
Advantages of Network Model
• The data can be accessed faster than in the hierarchical model. This is
because the data is more related in the network model and there can be more than
one path to reach a particular node. So the data can be accessed in many ways.
• As there is a parent-child relationship, data integrity is present. Any change in a
parent record is reflected in the child record.
Disadvantages of Network Model
• As more and more relationships need to be handled, the system might get complex. So,
a user must have detailed knowledge of the model to work with it.
• Any change such as an update, deletion, or insertion is very complex.
Entity-Relationship Model
Entity-Relationship Model or simply ER Model is a high-level data model diagram. In this
model, we represent the real-world problem in the pictorial form to make it easy for the
stakeholders to understand. It is also very easy for the developers to understand the system
by just looking at the ER diagram. We use the ER diagram as a visual tool to represent an ER
Model. ER diagram has the following three components:
Features of ER Model
Relational Model
The Relational Model is the most widely used model. In this model, the data is maintained in the
form of two-dimensional tables. All the information is stored in the form of rows and
columns. The basic structure of a relational model is the table, so tables are also
called relations in the relational model. Example: In this example, we have an Employee table.
• Tuples: Each row in the table is called a tuple. A row contains all the information about
one instance of the object. In the above example, each row has all the information
about one specific individual; for instance, the first row has information about John.
• Attribute or field: Attributes are the properties that define the table or relation. The
values of an attribute should be from the same domain. In the above example, we
have different attributes of the employee like Salary, Mobile_no, etc.
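An Employee relation along these lines might be set up as follows. The sketch assumes SQLite through Python's sqlite3 module; the Emp_id column and all sample values are made up, and only the names John, Salary, and Mobile_no come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The relation: one table whose columns are the attributes.
conn.execute(
    "CREATE TABLE Employee (Emp_id INTEGER PRIMARY KEY, Name TEXT, "
    "Salary INTEGER, Mobile_no TEXT)"
)
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?)", [
    (1, "John", 20000, "5550001"),   # sample values are made up
    (2, "Mary", 25000, "5550002"),
])

cursor = conn.execute("SELECT * FROM Employee ORDER BY Emp_id")
attributes = [d[0] for d in cursor.description]  # the column names
tuples = cursor.fetchall()                       # one Python tuple per row
```

Each element of tuples is one row of the relation, and attributes lists the fields that every row shares.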
Advantages of Relational Model
• Simple: This model is simpler than the network and hierarchical
models.
• Scalable: This model can be easily scaled, as we can add as many rows and columns as we
want.
• Structural Independence: We can make changes to the database structure without
changing the way the data is accessed. When we can make changes to the database
structure without affecting the capability of the DBMS to access the data, we can say that
structural independence has been achieved.
Disadvantages of Relational Model
• Hardware Overheads: To hide the complexities and make things easier for the
user, this model requires more powerful computers and data storage
devices.
• Bad Design: The relational model is very easy to design and use, so users don't
need to know how the data is stored in order to access it. This ease of design can lead
to the development of a poorly designed database, which slows down as the database grows.
Object-Oriented Data Model
In the above example, we have two objects, Employee and Department. All the data and
relationships of each object are contained in a single unit. The attributes of the employee, like Name and
Job_title, and the methods which will be performed by that object are
stored as a single object. The two objects are connected through a common attribute, i.e.
the Department_id, and the communication between these two is done with the help
of this common id.
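The two objects described above could look roughly like this in code. The attribute names follow the text; the promote and department methods and all sample values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Department:
    department_id: int
    name: str

@dataclass
class Employee:
    name: str
    job_title: str
    department_id: int  # the common attribute that connects the two objects

    def promote(self, new_title):
        """A method stored together with the data it operates on."""
        self.job_title = new_title

    def department(self, departments):
        """Follow Department_id to the matching Department object."""
        return next(d for d in departments if d.department_id == self.department_id)
```

Data and behaviour travel together in each object, and Department_id is the only thing the two objects need to share.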
Object-Relational Model
As the name suggests, it is a combination of the relational model and the object-oriented
model. This model was built to fill the gap between the object-oriented model and the relational
model. It offers many advanced features; for example, we can build complex data types according
to our requirements using the existing data types. The problem with this model is that it
can get complex and difficult to handle. So, a proper understanding of this model is required.
Semi-Structured Model
The semi-structured model is an evolved form of the relational model. We cannot differentiate
between data and schema in this model. Example: Web-based data sources, where we can't
differentiate between the schema and the data of the website. In this model, some entities may
have missing attributes while others may have an extra attribute. This model gives flexibility
in storing the data. It also gives flexibility to the attributes. Example: If we are storing a
value in an attribute, then that value can be either an atomic value or a collection of values.
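JSON is one common way to hold semi-structured data, and a short sketch shows the properties just described. The records and both helper functions are hypothetical.

```python
import json

# Two entities of the same kind with different attributes; a value may be
# atomic ("name") or a collection ("phones"). All values are made up.
people = json.loads("""
[
  {"name": "John", "phones": ["5550001", "5550002"]},
  {"name": "Mary", "email": "mary@example.com"}
]
""")

def attribute_names(records):
    """Discover the 'schema' by inspecting the data itself."""
    return sorted({key for record in records for key in record})

def values_of(record, key):
    """Return an attribute as a list, whether atomic, a collection, or missing."""
    value = record.get(key, [])
    return value if isinstance(value, list) else [value]
```

There is no separate schema here: the set of attributes can only be found by reading the records, and each record is free to omit or add attributes.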
Associative Data Model
The associative model divides the data into two parts, items and links:
• Item: Items contain the name and the identifier (some numeric value).
• Links: Links contain the identifier, source, verb and subject.