CH.1 Introduction To DBMS
Unit I
Introduction to DBMS
❖ Basics :
Data, databases and database management systems are an essential part of our daily life, and we rely on them in many everyday activities.
For example: depositing or withdrawing money at a bank, booking tickets, or buying something from a shop.
In a traditional database system, data is stored only in numeric and textual forms. With advances in technology, data can now also be organised as audio, images, video, graphics, etc.
❖ Introduction :
Data: Raw facts and figures which give information. Example: the student name "Ram", which represents the student Ram.
Base: The one that represents data. Example: "Ram", which is nothing but a name. Another example: the numbers 2, 3, 4, …
Management: How to organise this data in the form of a table, tree or schema, how to view it, how to report it, etc.
System: The techniques, applications and platforms used to manage this data.
The primary goal of DBMS is to provide a way to store and retrieve database
information that is both convenient and efficient.
Basic Concept:
Traditional database systems: 1. Manual 2. File Processing System
Limitations:
● It is less flexible.
● A file processing system is very difficult to maintain.
● Any change in one file affects all the files, which creates a burden on the programmer.
● Files in traditional file processing systems are called flat files.
Introduction to DBMS
A DBMS is a collection of interrelated data and a set of programs that access those data.
A database is a collection of related data, stored so that different users can use this data for different purposes.
A database is an organized collection of data that is straightforward to access, manage, manipulate and update.
Moreover, the data can be in any form, such as pictures, files, images, PDFs and so on.
Here are a few examples from daily life. An online telephone directory uses a database to hold all the data related to its clients, including name, contact number, address and so on. Similarly, electricity services use databases for handling bills, client issues, fault data and more.
Facebook is another example: it needs to store all the data related to its members, their friends, activities, messages and advertisements.
Databases existed before the invention of the computer, because libraries, governments, businesses and medical services all needed records. They had to store and maintain data for later retrieval, and they found manual ways of doing so. With the introduction of the computer, collecting and maintaining a database became much easier and faster, and it takes less space.
A Database Management System (DBMS) is a collection of programs for managing data. It supports different types of users at the same time and lets them create, manage, retrieve, update and store information.
For organisations of every size, from small startups to multinational companies and industries, managing a huge amount of data can become a mess. Software such as a DBMS therefore brought a revolution in efficient information management in many fields.
Database System Application:
A wide range of applications make use of database systems.
Some of these applications are -
1) Accounting: Database systems are used for maintaining information about employees, salaries, and payroll taxes.
3) Sales: Databases are used for maintaining customer, product and purchase information.
4) Banking: In the banking sector, a DBMS is used for customer information, accounts and loans, and for performing banking transactions.
5) Credit card transactions: Database systems are useful for purchases on credit cards and the generation of monthly statements.
6) Universities: Database systems are used in universities for maintaining student information, course registration, and accounting.
Advantages of DBMS
1) A DBMS reduces data redundancy, which means that unnecessary duplication of data in the database is avoided.
3) Data can be isolated in separate tables for convenient and efficient use.
5) Data integrity can be maintained. That means constraints can be applied to the data, for example that a value should lie in a specific range.
8) Security policies can be applied in a DBMS to allow each user to access only the part of the database system intended for that user.
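The range constraint described above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table name result and the 0 to 100 range for marks are assumptions, not part of these notes.

```python
import sqlite3

# In-memory database; the table and the 0..100 range are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE result ("
    " seat_no INTEGER PRIMARY KEY,"
    " marks INTEGER NOT NULL CHECK (marks BETWEEN 0 AND 100))"
)

def try_insert(seat_no, marks):
    """Return True if the row satisfies the constraint, False if the DBMS rejects it."""
    try:
        conn.execute("INSERT INTO result (seat_no, marks) VALUES (?, ?)", (seat_no, marks))
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert(1, 78)    # within the allowed range
rejected = try_insert(2, 150)   # outside the range, so the row is refused
print(accepted, rejected)
```

The check lives in the database itself, so every program that inserts data is held to the same rule.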
Disadvantages of DBMS
2) Hardware and software cost: A large investment is needed to set up the required hardware or to repair software failures.
3) Damaged part: If one part of the database is corrupted or damaged, then the entire database may be affected.
4) Conversion cost: If the current system is a conventional file system and needs to be converted to a database system, then a large cost is incurred in purchasing different tools and adopting different techniques as required.
5) Training: People need to be trained to design and maintain database systems.
Data Abstraction
Definition of data abstraction: Data abstraction means retrieving only the required
amount of information about the system and hiding background details.
There are several levels of abstraction that simplify user interactions with the system.
These are:
1) Physical level:
• This is the lowest level.
• This level describes how the data are stored.
• The database administrators decide how to store data at the physical level.
• This level describes complex low-level data structures.
2) Logical level:
• This is the next higher level, which describes what data are stored in the database.
• This level also describes the relationship between the data.
• The logical level thus describes the entire database in terms of a small number of
relatively simple structures.
• The database administrators use a logical level of abstraction for deciding what
information to keep in the database.
3) View level:
• This is the highest level of abstraction that describes only part of the entire
database.
• The view level can provide access to only part of the database.
• This level helps in simplifying the interaction with the system.
• It can provide multiple views of the same system.
• For example, a clerk at the reservation system can see only part of the database and access only the required passenger information.
• At the physical level, a customer, employee or department record can be described as a block of consecutive storage locations. Many database systems hide the lowest-level storage details from database programmers.
• The type definition of the records is decided at the logical level. Programmers work with records at this level, and database administrators also work at this level of abstraction.
• At the view level, only a specific view of the record is allowed. For instance, a customer can view the name or id of an employee but cannot access the employee's salary.
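The employee example can be sketched with a view in SQLite, using Python's built-in sqlite3 module. The table name employee, the view name employee_public and the sample row are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Logical level: the full record type, including the sensitive salary field.
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary REAL)")
conn.execute("INSERT INTO employee VALUES (1, 'Ram', 50000.0)")  # made-up sample row

# View level: expose only the id and the name, hiding the salary.
conn.execute("CREATE VIEW employee_public AS SELECT id, name FROM employee")

cursor = conn.execute("SELECT * FROM employee_public")
visible_columns = [col[0] for col in cursor.description]
rows = cursor.fetchall()
print(visible_columns, rows)
```

A user who is given only employee_public sees the id and name columns; how the rows are laid out on disk (the physical level) is hidden from both the view and the table definition.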
Database Languages
(1) DDL
Data Definition Language (DDL) is a specialized language used to specify a database schema by a set of definitions.
It is used for creating and modifying the structures of tables, views, indexes, etc.
DDL is also used to specify additional properties of data.
Some of the common DDL commands are CREATE, ALTER and DROP.
The primary use of the CREATE command is to build a new table. Using the ALTER command, users can add new columns and drop existing columns. Using the DROP command, the user can delete a table or view.
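The three DDL commands can be tried out with Python's built-in sqlite3 module. This is a minimal sketch: the student table and its columns are assumptions chosen for illustration, and the exact ALTER options differ between database products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def column_names(table):
    """List the column names of a table using SQLite's table_info pragma."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

# CREATE: build a new table.
conn.execute("CREATE TABLE student (seat_no INTEGER PRIMARY KEY, name TEXT)")
after_create = column_names("student")

# ALTER: add a column to the existing table.
conn.execute("ALTER TABLE student ADD COLUMN city TEXT")
after_alter = column_names("student")

# DROP: delete the table together with its definition.
conn.execute("DROP TABLE student")
after_drop = column_names("student")  # no columns are reported once the table is gone

print(after_create, after_alter, after_drop)
```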
(3) DCL
● The Data Control Language (DCL) is used to control access to data stored in
the database. This is also called authorization.
● The typical commands used in DCL are GRANT and REVOKE.
• GRANT: This command is used to give access rights or privileges to the
database.
• REVOKE: The revoke command removes user access rights or privileges to the
database objects.
Data Models
● Definition: It is a collection of conceptual tools for describing data,
relationships among data, semantics (meaning) of data and constraints.
● A data model is the structure underlying the database.
● A data model provides a way to describe the design of a database at the physical, logical and view levels.
There are various data models used in database systems and these are as follows -
(1) Relational model:
● The relational model consists of a collection of tables which store data and also represent the relationships among the data.
● A table is also known as a relation.
● A table contains one or more columns, and each column has a unique name.
● Each table contains records of a particular type, and each record type defines a fixed number of fields or attributes.
● For example - The following figure shows the relational model through the relationship between the Student and Result tables. Student Ram lives in the city Chennai and his marks are 78. The relationship between these two tables is maintained by the SeatNo column.
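The Student and Result example can be reproduced as a small sketch with Python's built-in sqlite3 module. The name, city and marks come from the example above; the seat number 101 and the exact column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (seat_no INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("CREATE TABLE result (seat_no INTEGER PRIMARY KEY, marks INTEGER)")

# Seat number 101 is a made-up value; name, city and marks follow the example in the text.
conn.execute("INSERT INTO student VALUES (101, 'Ram', 'Chennai')")
conn.execute("INSERT INTO result VALUES (101, 78)")

# The shared seat_no column relates a row of student to a row of result.
row = conn.execute(
    "SELECT s.name, s.city, r.marks"
    " FROM student AS s JOIN result AS r ON s.seat_no = r.seat_no"
    " WHERE s.name = ?",
    ("Ram",),
).fetchone()
print(row)
```

Neither table stores a pointer to the other. The relationship exists only because both tables carry the same seat number value.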
Advantages:
(i) Structural independence: Structural independence is an ability that allows us to
make changes in one database structure without affecting others. The relational model
has structural independence. Hence making required changes in the database is
convenient in relational database models.
(ii) Conceptual simplicity: The relational model allows the designer to simply focus
on logical design and not on physical design. Hence relational models are
conceptually simple to understand.
(iii) Query capability: Using a simple query language (such as SQL), users can get information from the database and designers can manipulate the database structure.
(iv) Easy design, maintenance and usage: The relational models can be designed
logically hence they are easy to maintain and use.
Disadvantages:
i) Relational models require powerful hardware and large data storage devices.
ii) May lead to slower processing time.
iii) Poorly designed systems lead to poor implementation of database systems.
Advantages:
ii) Easy to understand: The design of ER diagrams is very logical and hence they
are easy to design and understand.
iii) Effective: It is an effective communication tool.
iv) Integrated: The ER model can be easily integrated with the Relational model.
v) Easy conversion: ER models can be converted easily into other types of models.
Disadvantages:
● Object-oriented languages such as C++, Java and C# have become dominant in software development.
● This led to object based data models.
● The object based data model combines object oriented features with the relational data model.
Advantages:
i) Enriched modelling: The object based data model is capable of modelling real world objects.
ii) Reusability: Certain features of object oriented design, such as inheritance and polymorphism, help in reusability.
iii) Support for schema evolution: There is a tight coupling between data and applications, hence there is strong support for schema evolution.
iv) Improved performance: Using an object based data model, there can be a significant improvement in performance.
Disadvantages:
i) Lack of universal data model: There is no universally agreed data model for an
object based data model, and most models lack a theoretical foundation.
ii) Lack of experience: In comparison with relational database management the use
of object based data models is limited. This model is more dependent on the skilled
programmer.
iii) Complex: More functionalities present in object based data models make the
design complex.
● The data does not conform to a rigid data model but has some structure.
● The data cannot be stored in the form of rows and columns as in relational databases.
● Semi-structured data contains tags and elements (metadata) which are used to group data and describe how the data is stored.
● Similar entities are grouped together and organized in a hierarchy.
● Entities in the same group may or may not have the same attributes or properties.
● It does not contain sufficient metadata, which makes automation and management of the data difficult.
● The size and type of the same attribute may differ within a group.
● Due to the lack of a well-defined structure, it cannot be used easily by computer programs.
● E-mails
● XML and other markup languages
● Binary executables
● TCP/IP packets
● Zipped files
● Integration of data from different sources
● Web pages
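A small JSON document shows what "some structure but no fixed model" looks like in practice. This is only a sketch using Python's built-in json module; JSON is used here as one familiar semi-structured format, and the field names and values are made up for illustration.

```python
import json

# Two records from the same group; the second has an attribute the first lacks.
# The field names and values are made up for illustration.
document = """
[
  {"name": "Ram", "city": "Chennai"},
  {"name": "Sita", "city": "Pune", "email": "sita@example.com"}
]
"""

students = json.loads(document)

# The keys (tags) travel with each record, so the structure is discovered from the data itself.
attribute_sets = [sorted(record.keys()) for record in students]
shared = set(students[0]) & set(students[1])
print(attribute_sets, sorted(shared))
```

A program reading such data cannot assume that every record has every attribute, which is one reason semi-structured data is harder to process automatically than rows in a table.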
Advantages:-
ii) It is flexible.
iii) It is portable.
Disadvantage:-
Advantage:-
1. This model groups the data into tables and defines the relationship between the
tables.
Disadvantages:-
1. For searching any data, we have to start from the root and move downwards and
visit each child node. Thus traversing through each node is required.
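The root-first search described above can be sketched as a depth-first traversal of a tree. The Node class, the find function and the sample hierarchy are hypothetical names invented for this illustration.

```python
class Node:
    """One record in a tree-shaped (hierarchical) database."""

    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []


def find(root, target):
    """Search from the root downwards; return (path to target or None, nodes visited)."""
    visited = 0

    def walk(node, path):
        nonlocal visited
        visited += 1
        path = path + [node.name]
        if node.name == target:
            return path
        for child in node.children:
            found = walk(child, path)
            if found:
                return found
        return None

    return walk(root, []), visited


# Hypothetical hierarchy: a college with two departments and their students.
college = Node("College", [
    Node("Physics", [Node("Asha")]),
    Node("Computer", [Node("Ram"), Node("Sita")]),
])

path, visited = find(college, "Ram")
print(path, visited)
```

Even though Ram sits only two levels below the root, the search has to pass through the Physics branch first, which is the traversal cost the disadvantage refers to.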
1. Capability to handle more relationships: Since the network model allows many-to-many relationships, it helps in modelling real-life situations.
2. Ease of data access: Data access is easier and more flexible than in hierarchical models.
3. Data integrity: In the network model every member is associated with some other member in the model.
4. Conformance to standards: The network model structure can be designed as per the standards.
Disadvantages
1. Complex to implement: Pointers need to be maintained for all the records, hence the database structure becomes complex.
2. Complicated operations: Simple operations such as insertion, deletion and modification become complex because multiple pointers must be adjusted.
Data Independence :
● The ability to modify schema definition in one level without affecting the
schema definition in the next higher level is called data independence.
1. Hardware
Hardware refers to the physical, electronic devices such as computers and hard disks
that offer the interface between computers and real-world systems.
2. Software
Software is a set of programs used to manage and control the database and includes
the database software, operating system, network software used to share the data with
other users, and the applications used to access the data.
3. Data
Data are raw facts and information that need to be organized and processed to make them more meaningful. Database dictionaries are used to centralize, document, control, and coordinate the use of data within an organization. A database dictionary is a repository of information about a database (also called metadata).
4. Procedures
The middle two parts (query processor and storage manager) are important components of database architecture.
When a user issues a query, the parsed query is presented to a query optimizer, which uses information about how the data is stored to produce an efficient execution plan for evaluating the query. An execution plan is a blueprint for evaluating a query, and it is carried out by a query evaluation engine.
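SQLite can show the plan its optimizer chooses through the EXPLAIN QUERY PLAN statement. The sketch below uses Python's built-in sqlite3 module; the student table and the index name idx_student_city are assumptions made for illustration, and the wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (seat_no INTEGER PRIMARY KEY, name TEXT, city TEXT)")

def plan(sql, params=()):
    """Return the human-readable steps SQLite's optimizer chose for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description

query = "SELECT name FROM student WHERE city = ?"
plan_without_index = plan(query, ("Chennai",))

# Changing how the data is stored (adding an index) changes the chosen plan.
conn.execute("CREATE INDEX idx_student_city ON student (city)")
plan_with_index = plan(query, ("Chennai",))

print(plan_without_index)
print(plan_with_index)
```

The query text is the same in both runs. Only the information available about storage changed, and the optimizer switched from scanning the whole table to searching through the index.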
2. Storage manager:
● The storage manager is the component of a database system that provides the interface between the low-level data stored in the database and the application programs at the top level.
● The query processor processes the queries submitted to the system and passes requests to the next level, that is, the storage manager.
● The storage manager interacts with the file manager.
● The storage manager is responsible for storing, retrieving, and updating data in the database. The storage manager components include:
● Authorization and integrity manager: Validates the users who want to access the data and tests for integrity constraints. Example of an integrity constraint: the balance should not be zero.
● Transaction manager: Ensures that the database remains consistent despite system failures and that concurrent transaction executions proceed without conflicting.
● File manager: Manages the allocation of space on disk storage and the representation of the information on disk.
● Buffer manager: Manages the fetching of data from disk storage into main memory. The buffer manager also decides what data to cache in main memory. It is a crucial part of the database system, used for faster access to data.
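The way integrity checks and transactions work together can be sketched with Python's built-in sqlite3 module. The account table, the transfer function and the rule that a balance may not go negative are assumptions chosen for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed integrity rule for this sketch: a balance may never be negative.
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(src, dst, amount):
    """Move money as one transaction: both updates happen, or neither does."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    return [row[0] for row in conn.execute("SELECT balance FROM account ORDER BY id")]

failed = transfer(1, 2, 500)   # the debit breaks the rule, so the credit is undone too
after_failed = balances()
ok = transfer(1, 2, 30)
after_ok = balances()
print(failed, after_failed, ok, after_ok)
```

In the failed transfer the credit to the second account had already been applied when the debit was rejected. The rollback removes it again, so the database never shows money that came from nowhere.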
2. Indexes:
Details about indexes, including their names, types, and the tables/columns they are
associated with.
4. Schemas:
Definitions of schemas and the objects they contain.
5. Constraints:
Metadata about primary keys, foreign keys, unique constraints, and check constraints.
7. Storage Information:
Data about physical storage, such as tablespaces, files, and allocation details.
8. Statistics:
Metadata about table sizes, index usage, query optimization statistics, etc.
Uses of System Catalog
1. Query Optimization:
The catalog provides the optimizer with statistics to choose the best query execution
plan.
3. Access Control:
Manages permissions and access rights for users and roles.
4. Database Management:
Facilitates database maintenance tasks, such as backup and recovery.
5. System Administration:
Helps database administrators (DBAs) monitor and tune database performance.
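In SQLite the catalog can be read like any other table, which makes it easy to see the kinds of metadata listed above. This is a sketch using Python's built-in sqlite3 module; sqlite_master and PRAGMA table_info are specific to SQLite (other systems expose their catalogs under different names), and the student table and its index are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (seat_no INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("CREATE INDEX idx_student_city ON student (city)")

# sqlite_master is SQLite's catalog table: one row per table, index, view or trigger.
catalog = conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master ORDER BY type, name"
).fetchall()

# Column-level metadata for one table: (name, declared type, part of the primary key).
columns = [
    (row[1], row[2], bool(row[5]))
    for row in conn.execute("PRAGMA table_info(student)")
]
print(catalog)
print(columns)
```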
● Simplicity: Straightforward to purchase, use, and maintain, since only one user interacts with the system.
● Lower Costs: Because they do not demand powerful hardware or sophisticated software, single-user systems are often cost-effective.
● Less Complexity: With only one user of the application, there are no issues of conflicting users or simultaneous access to the data.
● Limited Scalability: These systems are strictly for a single user, which makes them unsuitable for large organizations.
● Lower Efficiency for Collaboration: Only one user can access the database at a time, so the database is unsuitable for environments where several users must use it simultaneously.
● Not Ideal for Large Data Handling: Most of these databases are used in small-scale systems, which may not manage huge amounts of data efficiently.
● Concurrency: No user needs exclusive rights to the database, so more than one user can work on it at the same time, which boosts productivity.
● Scalability: Such systems can support a large number of users as well as large amounts of information, which suits commercial and corporate use.
● Data Consistency and Integrity: Transaction control guarantees the consistency of the data, so the data is not compromised even when it is modified by different people.
● Higher Complexity: Such systems are much more challenging to administer and support, because conflicts between users, data integrity and concurrent access must all be managed.
● Cost: Multi-user systems may require costly, powerful hardware and software and reliable network connections.
● Performance Overhead: With many users accessing the system, response time may become slow, for instance if the back-end database is not well optimized or system resources are inadequately allocated.
❖ Data Modeling:
Basic Concepts
Entity
Attributes
Relationships
Constraints
Keys
● Data modeling is the process of creating a visual representation of a system's data. The basic concepts of data modeling include entities, attributes, relationships, constraints, and keys.
● Entities: Represent real-world objects or concepts. For example, an employee is an entity in an employee database.
● Attributes: Describe the characteristics of an entity. For example, age, roll number, or marks for a student.
● Relationships: Define the connections between entities. For example, in a one-to-one relationship, one instance of an entity is associated with only one instance of another entity.
● Constraints: Rules that help maintain data integrity and consistency. For example, primary key constraints, unique constraints, and foreign key constraints.
● Keys: Unique identifiers used to distinguish individual instances of an entity. For example, a roll number can uniquely identify a student within a set of students.
● Data models: Can be conceptual, logical, physical, relational, hierarchical, network, object-oriented, or graph data models.
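The key and constraint concepts can be tried out with Python's built-in sqlite3 module. This is a minimal sketch: the student and marks tables, their columns and the sample rows are assumptions for illustration, with roll_no playing the role of the key from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Entity "student" with attributes; roll_no is the primary key.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER)")
# Each marks row must refer to an existing student (foreign key), and a student
# may have only one row per subject (unique constraint).
conn.execute(
    "CREATE TABLE marks ("
    " roll_no INTEGER NOT NULL REFERENCES student (roll_no),"
    " subject TEXT NOT NULL,"
    " score INTEGER,"
    " UNIQUE (roll_no, subject))"
)

def violates(sql, params):
    """Return True if the statement is rejected by a key or constraint."""
    try:
        conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True

conn.execute("INSERT INTO student VALUES (1, 'Ram', 20)")
duplicate_key = violates("INSERT INTO student VALUES (?, ?, ?)", (1, "Sita", 21))
unknown_student = violates("INSERT INTO marks VALUES (?, ?, ?)", (99, "DBMS", 78))
valid_marks = violates("INSERT INTO marks VALUES (?, ?, ?)", (1, "DBMS", 78))
print(duplicate_key, unknown_student, valid_marks)
```

The first rejected insert reuses roll number 1, and the second refers to a student who does not exist. Only the row that respects both the key and the relationship is stored.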