Unit1 Dbms
Unit - I
Introduction
For example, a university database might contain information about the following:
Entities such as students, faculty and courses.
Relationships between entities, such as students' enrollment in courses,
faculty teaching courses.
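The university example can be sketched as a small relational schema. This is only an illustration using Python's built-in sqlite3 module; the table names, column names and sample rows are assumptions, and "faculty teaching courses" is simplified to one faculty member per course.

```python
import sqlite3

# Hypothetical schema for the university example; all names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Entities
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE faculty (faculty_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Relationship "faculty teaching courses" (assumed: one teacher per course)
    CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT NOT NULL,
                         faculty_id INTEGER REFERENCES faculty(faculty_id));
    -- Relationship "students' enrollment in courses"
    CREATE TABLE enrollment (
        student_id INTEGER REFERENCES student(student_id),
        course_id  INTEGER REFERENCES course(course_id),
        PRIMARY KEY (student_id, course_id)
    );
""")
conn.execute("INSERT INTO student VALUES (1, 'Asha')")
conn.execute("INSERT INTO faculty VALUES (10, 'Dr. Rao')")
conn.execute("INSERT INTO course VALUES (100, 'DBMS', 10)")
conn.execute("INSERT INTO enrollment VALUES (1, 100)")

# Follow both relationships: who is enrolled in what, and who teaches it.
rows = conn.execute("""
    SELECT s.name, c.title, f.name
    FROM enrollment e
    JOIN student s ON s.student_id = e.student_id
    JOIN course  c ON c.course_id  = e.course_id
    JOIN faculty f ON f.faculty_id = c.faculty_id
""").fetchall()
print(rows)
```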
• Information systems should allow interactive access to data to obtain new
information without writing fresh programs.
• The stored data should be made available for access by different users
simultaneously.
a) Those who actually use and control the database content, and those who
design, develop and maintain database applications (called "Actors on the
Scene").
b) Those who design and develop the DBMS software and related tools, and the
computer systems operators (called "Workers Behind the Scene").
stored and for choosing an appropriate way to organize it. They also define
views for different categories of users. The final design must be able to
support the requirements of all the user sub-groups.
3. End Users: These are persons who access the database for
querying, updating, and report generation. They are the main reason
for the database's existence.
Casual end users: use database occasionally, needing different information each
time; use query language to specify their requests; typically middle- or high-level
managers.
Naive/Parametric end users: Typically the biggest group of users; frequently
query/update the database using standard canned transactions that have been
carefully programmed and tested in advance. Examples:
parametric users, and develop specifications for canned transactions that meet
these needs.
Application Programmers: Implement, test, document, and maintain
programs that satisfy the specifications mentioned above.
c) Workers Behind the Scene
Advantages of a DBMS
Using a DBMS to manage data has many advantages:
Data administration: When several users share the data, centralizing the
administration of data can offer significant improvements. Experienced
professionals who understand the nature of the data being managed, and how
different groups of users use it, can be responsible for organizing the data
representation to minimize redundancy and for fine-tuning the storage of the
data to make retrieval efficient.
accessed by only one user at a time. Further, the DBMS protects users from
the effects of system failures.
Functions of DBMS
• Data Definition: The DBMS provides functions to define the structure of the
data in the application. These include defining and modifying the record structure,
the type and size of fields and the various constraints to be satisfied by the data
in each field.
• Data Security & Integrity: The DBMS contains modules which handle the
security and integrity of data in the application.
• Data Recovery and Concurrency: Recovery of the data after system failure
and concurrent access of records by multiple users is also handled by DBMS.
• Data Dictionary Maintenance: Maintaining the data dictionary which contains
the data definition of the application is also one of the functions of DBMS.
• Performance: Optimizing the performance of the queries is one of the
important functions of DBMS.
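Two of these functions, integrity checking and recovery, can be illustrated with a small sketch. It assumes Python's built-in sqlite3 module; the account table, the CHECK constraint and the transfer function are hypothetical and chosen only for illustration. A transfer that would violate the constraint is rolled back as a whole, so the data stays consistent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Integrity constraint (assumed for this sketch): a balance may never be negative.
conn.execute("CREATE TABLE account (accountnumber INTEGER PRIMARY KEY, "
             "balance INTEGER NOT NULL CHECK (balance >= 0))")
conn.execute("INSERT INTO account VALUES (1, 500), (2, 100)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates commit together or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE account SET balance = balance + ? "
                         "WHERE accountnumber = ?", (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? "
                         "WHERE accountnumber = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, 1, 2, 200)       # valid transfer
failed = transfer(conn, 1, 2, 1000)  # would overdraw account 1, so it is undone
balances = conn.execute(
    "SELECT balance FROM account ORDER BY accountnumber").fetchall()
print(ok, failed, balances)
```

In the second call the credit to account 2 has already been applied when the debit fails; the rollback removes it again, which is the behaviour a recovery mechanism is meant to guarantee.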
When not to use a DBMS
a) Main inhibitors (costs) of using a DBMS:
i) High initial investment and possible need for additional hardware.
1. The End User, who uses the application. Ultimately, this is the person who
actually puts the data in the system to use in the business. This user need not
know anything about the organization of data at the physical level.
2. The Application Programmer, who develops the application programs. He/she
has more knowledge about the data and its structure. He/she can manipulate the
data using his/her programs. He/she also need not have access to, or knowledge
of, the complete data in the system.
3. The Database Administrator (DBA), who is like the super-user of the system.
The role of DBA is very important and is defined by the following functions.
• Defining the schema: The DBA defines the schema which contains the
structure of the data in the application. The DBA determines what data
needs to be present in the system and how this data has to be
presented and organized.
• Liaising with users: The DBA needs to interact continuously with the
users to understand the data in the system and its use.
• Defining Security & Integrity checks: The DBA finds out about the access
restrictions to be defined and defines security checks accordingly.
Data integrity checks are also defined by the DBA.
• Defining Backup/Recovery Procedures: The DBA also defines procedures
for backup and recovery. Defining a backup procedure includes specifying
what data is to be backed up, how often backups are taken, and
the medium and storage location for the backup data.
Database Manager
Database manager is a program module which provides the interface
between the low level data stored in the database and the application
programs and queries submitted to the system:
– The database manager translates DML statements into low-level
file system commands for storing, retrieving, and updating data in the
database.
– Backup and recovery. Backup and recovery of the database are necessary to ensure
that the database remains consistent despite failures.
Database Users
Database users are the people who need information from the database to
carry out their business responsibilities. Database users can be broadly
classified into two categories: application programmers and end users.
Sophisticated end users interact with the system without writing programs.
They form requests by writing queries in a database query language. These
are submitted to query processor. Analysts who submit queries to explore
data in the database fall in this category.
Specialized End Users
Specialized end users write specialized database applications that do not
fit into the traditional data-processing framework. Such applications include
knowledge-base and expert systems, environment modeling systems, etc.
Naive End Users
Naive end users interact with the system by using permanent application programs.
Example: A query made by a student, such as the number of books borrowed,
in a library database.
System Analysts
System analysts determine the requirements of end users and develop
specifications for canned transactions that meet these requirements.
Canned Transaction
Ready-made programs through which naive end users interact with the database
are called canned transactions.
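A canned transaction can be pictured as a fixed, pre-tested query in which the user supplies only a parameter. The sketch below assumes Python's built-in sqlite3 module and a hypothetical loan table for the library example; the function name books_borrowed is also an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical library table: one row per book currently on loan.
conn.execute("CREATE TABLE loan (student_id INTEGER, book_title TEXT)")
conn.executemany("INSERT INTO loan VALUES (?, ?)",
                 [(1, "Database Systems"), (1, "Operating Systems"), (2, "Networks")])

def books_borrowed(conn, student_id):
    """Canned transaction: the query text is fixed; only the parameter varies."""
    row = conn.execute("SELECT COUNT(*) FROM loan WHERE student_id = ?",
                       (student_id,)).fetchone()
    return row[0]

print(books_borrowed(conn, 1))
```

The naive end user never sees the SQL; an application screen would simply ask for the student number and call the function.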
The structure of a database means the data types, relationships, and constraints
that hold for the data.
According to C.J. Date (one of the leading database experts), a data model is an
abstract, self-contained, logical definition of the objects, operators, and so forth,
that together constitute the abstract machine with which users interact. The
objects allow us to model the structure of data; the operators allow us to model its
behavior.
1. High-level (conceptual) data model: The user-level data model is the high-level
or conceptual model. It provides concepts that are close to the way that
many users perceive data.
2. Low-level (physical) data model: Provides concepts that describe the details of
how data is stored in the computer. The low-level data model is meant for
computer specialists, not for end users.
3. Representational data model: Lies between the high-level and low-level data
models. It provides concepts that may be understood by end users but that are not
too far removed from the way data is organized within the computer.
1. Relational Model
The Relational Model uses a collection of tables to represent both data and the
relationships among those data. Each table has multiple columns and each column
has a unique name.
Relational database comprising two tables.
Advantages
3. Representation of different types of relationship is possible with this model.
2. Network Model
The data in the network model are represented by a collection of records, and
relationships among data are represented by links, which can be viewed as
pointers.
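The idea of records connected by links can be sketched with plain Python objects, where an object reference plays the role of a pointer. The Record class, the link helper and the sample data are assumptions made for this illustration; it is not how any particular network DBMS stores data.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)
class Record:
    name: str
    links: list = field(default_factory=list)  # references to related records

def link(a, b):
    """Connect two records in both directions."""
    a.links.append(b)
    b.links.append(a)

alice = Record("student: Alice")
bob = Record("student: Bob")
dbms = Record("course: DBMS")
os_course = Record("course: OS")

# A record may be linked to many others, so the data forms a graph, not a tree.
link(alice, dbms)
link(alice, os_course)
link(bob, dbms)

dbms_students = [r.name for r in dbms.links]
alice_courses = [r.name for r in alice.links]
print(dbms_students, alice_courses)
```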
Advantages:
3. Hierarchical Model
A hierarchical data model is a data model in which the data is organized into a
tree-like structure. The structure allows repeating information using parent/child
relationships: each parent can have many children, but each child has only one
parent. All attributes of a specific record are listed under an entity type.
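The parent/child rule can be sketched with a small tree in Python. The Node class and the sample names are assumptions for illustration only; the point is that each node stores exactly one parent, while a parent keeps a list of children.

```python
class Node:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # at most one parent per node
        self.children = []        # a parent may have many children
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """Walk up through the single parent chain to the root."""
        node, parts = self, []
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))

university = Node("University")
cs = Node("Computer Science", university)
ee = Node("Electrical", university)
dbms = Node("DBMS", cs)

print(dbms.path())
```

Because there is only one parent per node, every record is reached by a single path from the root, which is what makes access simple but many-to-many relationships awkward in this model.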
Advantages:
The overall design of the database is called the database schema. A schema is
a collection of named objects. Schemas provide a logical classification of objects
in the database. A schema can contain tables, views, triggers, functions,
packages, and other objects.
DBMS Architecture
The three levels of the architecture are three different views of the data:
External Schema - individual user view
The External Schema is the view that the individual user of the database has. This
view is often a restricted view of the database and the same database may
provide a number of different views for different classes of users.
The Conceptual schema (sometimes called the logical schema) describes the
stored data in terms of the data model of the DBMS. In a relational DBMS, the
conceptual schema describes all relations that are stored in the database.
It hides physical storage details, concentrating upon describing entities, data
types, relationships, user operations, and constraints.
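In a relational system, an external schema is commonly realized as a view over the relations of the conceptual schema. The following sketch assumes Python's built-in sqlite3 module; the employee table and the employee_directory view are hypothetical names used only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual schema: the full relation stored in the database.
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT,
                           dept TEXT, salary INTEGER);
    INSERT INTO employee VALUES (1, 'Meena', 'Sales', 52000),
                                (2, 'Ravi', 'HR', 48000);
    -- External schema: a restricted view that hides the salary column.
    CREATE VIEW employee_directory AS
        SELECT emp_id, name, dept FROM employee;
""")

cur = conn.execute("SELECT * FROM employee_directory ORDER BY emp_id")
columns = [d[0] for d in cur.description]
rows = cur.fetchall()
print(columns, rows)
```

A user who is given access only to employee_directory works with a restricted view of the same stored data.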
Data Independence
Data independence can be defined as the capacity to change the schema at one
level without changing the schema at next higher level.
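A minimal sketch of this idea, assuming Python's built-in sqlite3 module and hypothetical table, view and index names: a program that reads through a view keeps working, unchanged, after an index is added (a change at the storage level) and after a column is added to the base table (a change at the conceptual level).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    INSERT INTO student VALUES (1, 'Asha', 'Pune'), (2, 'Vikram', 'Delhi');
    CREATE VIEW student_names AS SELECT student_id, name FROM student;
""")

def read_view(conn):
    """Stands in for an application program written against the view."""
    return conn.execute("SELECT student_id, name FROM student_names "
                        "ORDER BY student_id").fetchall()

before = read_view(conn)

# Storage-level change: add an access path (index).
conn.execute("CREATE INDEX idx_student_name ON student(name)")
# Conceptual-level change: add a new column to the base table.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")

after = read_view(conn)
print(before == after)
```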
Data Definition Language
Data Definition Language (DDL) statements are used to define the database
structure or schema. Some examples:
For instance, the following statement in the SQL language is used to create the
account table:
create table account
(accountnumber number(10),
balance number(8));
The storage definition language (SDL) is used to specify the internal schema. The
mappings between the two schemas may be specified in either one of these
languages.
The view definition language (VDL) is used to specify user views and their mappings
to the conceptual schema, but in most DBMSs the DDL is used to define both
conceptual and external schemas.
In addition, it updates a special set of tables called the data dictionary or data
directory.
A data dictionary contains metadata—that is, data about data. The schema of a
table is an example of metadata. A database system consults the data dictionary
before reading or modifying actual data.
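As an illustration of a data dictionary, the sketch below uses Python's built-in sqlite3 module, where the catalog is exposed as the sqlite_master table and column details are available through PRAGMA table_info. The column types are written as INTEGER here as an assumption, since number(10) is Oracle-style syntax; other DBMSs expose their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Executing a DDL statement also records the definition in the catalog.
conn.execute("CREATE TABLE account (accountnumber INTEGER, balance INTEGER)")

# Metadata: which objects exist in the database.
catalog = conn.execute("SELECT type, name FROM sqlite_master").fetchall()

# Metadata: column names and declared types of the account table.
# Each row of table_info is (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2])
           for row in conn.execute("PRAGMA table_info(account)")]
print(catalog, columns)
```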
Data Manipulation Language
Data manipulation is
• The retrieval of information stored in the database
• The insertion of new information into the database
• The deletion of information from the database
• The modification of information stored in the database
Data Manipulation Language (DML) statements are used for managing data within
schema objects. Some examples:
o SELECT - retrieve data from a database
o INSERT - insert data into a table
• Procedural DMLs require a user to specify what data are needed and how to
get those data. The DML component of the PL/SQL language is procedural.
select customername
from customer
where customerid = 192;
The query specifies that those rows from the table customer where the
customerid is 192 must be retrieved.
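The four kinds of data manipulation can be tried out together in a short sketch. It assumes Python's built-in sqlite3 module; the customer rows and names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customerid INTEGER PRIMARY KEY, "
             "customername TEXT)")

# Insertion of new information
conn.execute("INSERT INTO customer VALUES (192, 'Johnson')")
conn.execute("INSERT INTO customer VALUES (193, 'Smith')")
# Modification of stored information
conn.execute("UPDATE customer SET customername = 'Smythe' WHERE customerid = 193")
# Deletion of information
conn.execute("DELETE FROM customer WHERE customerid = 193")
# Retrieval: the query states what is wanted, not how to find it
result = conn.execute(
    "SELECT customername FROM customer WHERE customerid = 192").fetchall()
remaining = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(result, remaining)
```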
DBMS Interfaces
User-friendly interfaces provided by a DBMS may include the following.
Menu-Based Interfaces for Web Clients or Browsing. These interfaces present the
user with lists of options, called menus, that lead the user through the formulation
of a request. Menus do away with the need to memorize the specific commands
and syntax of a query language; rather, the query is composed step by step by
picking options from a menu that is displayed by the system. Pull-down menus are
a very popular technique in Web-based user interfaces. They are also often used
in browsing interfaces, which allow a user to look through the contents of a
database in an exploratory and unstructured manner.
minimizing the number of keystrokes required for each request.
Interfaces for the DBA. Most database systems contain privileged commands that
can be used only by the DBA's staff. These include commands for creating
accounts, setting system parameters, granting account authorization, changing a
schema, and reorganizing the storage structures of a database.
The top part of the figure shows interfaces for the DBA staff, casual users who
work with interactive interfaces to formulate queries, application programmers
who create programs using some host programming languages, and parametric
users who do data entry work by supplying parameters to predefined transactions.
The DBA staff works on defining the database and tuning it by making changes to
its definition using the DDL and other privileged commands.
The DDL compiler processes schema definitions, specified in the DDL, and stores
descriptions of the schemas (meta-data) in the DBMS catalog. The catalog
includes information such as the names and sizes of files, names and data types
of data items, storage details of each file, mapping information among schemas,
and constraints.
Casual users and persons with occasional need for information from the database
interact using some form of interface, which we call the interactive query
interface. These queries are parsed and validated for correctness of the query
syntax, the names of files, and so on, by a query compiler that compiles them into
an internal form. This internal query is subjected to query optimization. The query
optimizer is concerned with the rearrangement and possible reordering of
operations, elimination of redundancies, and use of correct algorithms and
indexes during execution.
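The effect of the optimizer's choice of access path can be observed in a small experiment. The sketch assumes Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN statement; the table contents and the index name idx_customer_id are hypothetical, and the exact wording of the plan text differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customerid INTEGER, customername TEXT)")
conn.executemany("INSERT INTO customer VALUES (?, ?)",
                 [(i, f"name{i}") for i in range(1000)])

query = "SELECT customername FROM customer WHERE customerid = 192"

def plan(conn, sql):
    # The last column of each EXPLAIN QUERY PLAN row describes one plan step.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

plan_without_index = plan(conn, query)   # expected to be a full table scan
conn.execute("CREATE INDEX idx_customer_id ON customer(customerid)")
plan_with_index = plan(conn, query)      # expected to use the new index

print(plan_without_index)
print(plan_with_index)
```

The query text is identical in both cases; only the plan chosen by the optimizer changes once an index is available.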
It is now common to have the client program that accesses the DBMS running
on a separate computer from the computer on which the database resides. The
former is called the client computer, which runs DBMS client software, and the
latter is called the database server. In some cases, the client accesses a middle
computer, called the application server, which in turn accesses the database
server.
Database System Utilities
In addition to possessing the software modules just described, most DBMSs have
database utilities that help the DBA manage the database system. Common
utilities have the following types of functions:
Loading. A loading utility is used to load existing data files—such as text
files or sequential files—into the database. Usually, the current (source) format of
the data file and the desired (target) database file structure are specified
to the utility, which then automatically reformats the data and stores it
in the database.
Backup. A backup utility creates a backup copy of the database, usually by
dumping the entire database onto tape or other mass storage medium. The
backup copy can be used to restore the database in case of disk failure.
Other utilities may be available for sorting files, handling data compression,
monitoring access by users, interfacing with the network, and performing other
functions.
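The loading and backup utilities can be imitated in a few lines. This is only a sketch, assuming Python's built-in csv and sqlite3 modules; the account data is invented, and io.StringIO stands in for a source text file on disk.

```python
import csv
import io
import sqlite3

# Source data in text (CSV) format; StringIO stands in for a file on disk.
source = io.StringIO("accountnumber,balance\n101,2500\n102,400\n")

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE account (accountnumber INTEGER, balance INTEGER)")

# Loading: convert each text row to the target column types and store it.
reader = csv.DictReader(source)
db.executemany("INSERT INTO account VALUES (?, ?)",
               [(int(r["accountnumber"]), int(r["balance"])) for r in reader])
db.commit()

# Backup: copy the entire database into a second database.
backup_db = sqlite3.connect(":memory:")
db.backup(backup_db)

# The backup copy can be queried (or restored from) independently.
restored = backup_db.execute(
    "SELECT * FROM account ORDER BY accountnumber").fetchall()
print(restored)
```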
Architectures for DBMSs have followed trends similar to those for general
computer system architectures. Earlier architectures used mainframe computers
to provide the main processing for all system functions, including user application
programs and user interface programs, as well as all the DBMS functionality. The
reason was that most users accessed such systems via computer terminals that
did not have processing power and only provided display capabilities. Therefore,
all processing was performed remotely on the computer system, and only display
information and controls were sent from the computer to the display terminals,
which were connected to the central computer via various types of
communications networks.
As prices of hardware declined, most users replaced their terminals with PCs
and workstations. At first, database systems used these computers similarly to
how they had used display terminals, so that the DBMS itself was still a
centralized DBMS in which all the DBMS functionality, application program
execution, and user interface processing were carried out on one machine.
In a two-tier client/server architecture, the user interface programs and
application programs run on the client side. An interface called ODBC (Open
Database Connectivity) provides an API that allows client-side programs to call the
DBMS. Most DBMS vendors provide ODBC drivers. A client program may connect to
several DBMSs. Some variations of this architecture are also possible: for
example, in some DBMSs more functionality is transferred to the client, including
the data dictionary, optimization, etc. In such an arrangement, the server is
called a data server.
Differentiate between centralized and distributed database

Centralized                                            | Distributed
Database is maintained at one site                     | Database is maintained at a number of different sites
If centralized system fails, entire system is halted.  | If one system fails, system continues work with other sites
Less reliable                                          | More reliable
Several criteria are normally used to classify DBMSs. The first is the data model
on which the DBMS is based. The main data model used in many current
commercial DBMSs is the relational data model. The object data model has
been implemented in some commercial systems but has not had widespread use.
Many legacy applications still run on database systems based on the hierarchical
and network data models.
The second criterion used to classify DBMSs is the number of users supported
by the system. Single-user systems support only one user at a time and are
mostly used with PCs. Multiuser systems, which include the majority of DBMSs,
support concurrent multiple users.
The third criterion is the number of sites over which the database is distributed.
A DBMS is centralized if the data is stored at a single computer site. A
centralized DBMS can support multiple users, but the DBMS and the database
reside totally at a single computer site. A distributed DBMS (DDBMS) can have
the actual database and DBMS software distributed over many sites, connected by
a computer network. Homogeneous DDBMSs use the same DBMS software at all
the sites, whereas heterogeneous DDBMSs can use different DBMS software at
each site.
The fourth criterion is cost. It is difficult to propose a classification of DBMSs based
on cost. Today we have open source (free) DBMS products like MySQL and
PostgreSQL that are supported by third-party vendors with additional services.
The main RDBMS products are available as free examination 30-day copy versions
as well as personal versions,
We can also classify a DBMS on the basis of the types of access path options for
storing files. One well-known family of DBMSs is based on inverted file structures.
Finally, a DBMS can be general purpose or special purpose. When performance
is a primary consideration, a special-purpose DBMS can be designed and built for
a specific application; such a system cannot be used for other applications without
major changes. Many airline reservations and telephone directory systems
developed in the past are special-purpose DBMSs. These fall into the category
of online transaction processing (OLTP) systems, which must support a
large number of concurrent transactions without imposing excessive delays.