Data Base-1

The document discusses databases and database management systems. It defines data and information, describes the components and characteristics of databases and DBMSs, and explains the differences between database and file processing approaches.

Uploaded by

habtamu desalegn

Mekelle Institute of Technology (MIT)
Mekelle, Tigray, Ethiopia

Lecture Notes for Database Management Systems (DBMSs)

Unit One: Databases and Database Users

1.1 Introduction to Databases; Difference between Data and Information

Definition: data can be defined in many ways. In database parlance, data
is a collection of unprocessed facts about things such as people, places,
events or concepts that have inherent meaning. Data is mainly derived from
accurate observation, experiment and/or research.

Data has three categories:

1. Raw data, e.g. October 12, 2010.
2. Related raw data: a group (data set) of organized raw data that can be
tied together. For example, the ID number, name, age, gender and
address of any one of you.
3. Cleaned raw data: raw data or related raw data after being validated or
processed through some kind of ‘sense’ gate. For example, validating that
the age of a student is not greater than 125.
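The ‘sense’ gate for cleaned raw data can be sketched in a few lines of Python. This is only an illustration: the record fields follow the example above, while the class name, the sample values and the lower bound of 0 are assumptions made for the sketch.

```python
from dataclasses import dataclass

MAX_AGE = 125  # upper bound taken from the example in the notes


@dataclass
class StudentRecord:
    """Related raw data: several raw facts tied to one student (hypothetical)."""
    id_number: str
    name: str
    age: int
    gender: str
    address: str


def is_clean(record: StudentRecord) -> bool:
    """Return True if the record passes the 'sense' gate (0 is an assumed lower bound)."""
    return 0 <= record.age <= MAX_AGE


records = [
    StudentRecord("MIT/001", "Abebe", 21, "M", "Mekelle"),
    StudentRecord("MIT/002", "Sara", 230, "F", "Mekelle"),  # implausible age
]

# Cleaned raw data: only the records that survive validation.
cleaned = [r for r in records if is_clean(r)]
print([r.name for r in cleaned])
```

Only the first record survives the gate; the second is rejected because its age exceeds the bound.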

The grammar of data: In American English, data is usually a plural noun. In
British English it is mostly used as an uncountable noun, but sometimes as a
plural in technical and formal contexts.

Information is processed data that has been organized and communicated in
a coherent and meaningful manner. Information is mostly presented in the form
of charts, summaries, averages and ranked lists so that it helps in making
‘informed’ decisions. Useful information is called knowledge.
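As a small illustration of turning data into information, the sketch below derives an average and a ranked list from raw exam scores. The names and scores are invented for the example.

```python
from statistics import mean

# Hypothetical raw data: (student name, exam score) pairs.
scores = [("Abebe", 72), ("Sara", 91), ("Hagos", 85), ("Meron", 64)]

# Processing step 1: a summary figure.
average = mean(score for _, score in scores)

# Processing step 2: a ranked list, highest score first.
ranked = sorted(scores, key=lambda pair: pair[1], reverse=True)

print(f"Average score: {average}")
for position, (name, score) in enumerate(ranked, start=1):
    print(f"{position}. {name} ({score})")
```

The raw pairs say little on their own; the average and the ranking are what a decision maker would actually read.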

Flow of Data:
Data → Information → Knowledge → Action

Database
Databases have been defined in various ways. Here we consider three of the
most commonly used definitions:

(1) Definition 1: a database is a collection of related data, e.g. an address
book. An address book mainly contains the names, telephone numbers and
addresses of the people that you know. What about the collection of words
that make up this page? Since it is a collection of related data with
inherent meaning, it can be considered a database.

(2) Definition 2: a database is a well-organized, logically related and shared
collection of data that is designed to meet the information needs of
various users in an organization.

(3) Definition 3: Contrary to the above definitions, the common use of the
term database is more restricted. The term database can be defined
accurately using its implicit properties as follows:

o A database represents some aspect of the real world, sometimes
called the miniworld or the Universe of Discourse (UoD). Changes in
the miniworld are reflected in the database.
o A database is a logically coherent (i.e. it is well planned, so that it
is clear and sensible and all its parts go well with each other)
collection of data with inherent meaning. A random assortment of
data can’t be correctly referred to as a database.
o A database is designed, built and populated with data for a
particular audience and for a specific purpose. It has an intended
group of users and some pre-conceived application.

Generally, a database has some source from which data is derived, some degree
of interaction with events in the real world and an audience that is actively
interested in the content of the database.

A database can be of any size and of varying complexity depending on its
application. For example, a database used by MIT’s registrar system may not
have the same size and complexity as a database used by Ethiopian Airlines.

A database may be generated and maintained manually or using computers. An
address book can be an example of a database created manually. A
computerized database may be created and maintained by a group of application
programs written specifically for that task or by a general-purpose database
management system.

1.2 DBMS and Its Components

A database management system (DBMS) is a collection of usually complex
general-purpose or specific-purpose software systems that enable you to create
and maintain databases. More precisely, a DBMS is a collection of complex
programs that are used to facilitate the process of database definition,
construction, manipulation, sharing and protection.

o Database definition: is the process of specifying the data types,
structures, and constraints for the data to be stored in the database.
o Database construction: is the process of storing the data itself on some
storage medium that is controlled by the DBMS.
o Database manipulation: includes such functions as querying (asking a
stylised question) the database to retrieve specific data, updating the
database if there is a change in the miniworld, and generating reports
from the database.
o Database sharing: involves allowing multiple users and programs to use
the database concurrently.
o Database protection: includes both system protection against hardware or
software malfunctions (or crashes) and security protection against
unauthorized or malicious access.
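The first three functions listed above (definition, construction and manipulation) can be sketched with Python's built-in sqlite3 module. SQLite is used here only because it needs no installation; the table, columns and sample rows are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Definition: declare data types, structure and constraints.
conn.execute(
    """
    CREATE TABLE student (
        id   INTEGER PRIMARY KEY,
        name TEXT    NOT NULL,
        age  INTEGER CHECK (age BETWEEN 0 AND 125)
    )
    """
)

# Construction: store the data itself under the control of the DBMS.
conn.executemany(
    "INSERT INTO student (id, name, age) VALUES (?, ?, ?)",
    [(1, "Abebe", 21), (2, "Sara", 23)],
)

# Manipulation: update after a change in the miniworld, then query.
conn.execute("UPDATE student SET age = age + 1 WHERE id = ?", (1,))
rows = conn.execute("SELECT name, age FROM student ORDER BY id").fetchall()
conn.commit()
print(rows)
```

Sharing and protection are not shown; they depend on the multiuser and security facilities of the particular DBMS.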

The DBMS must also be able to maintain the database system by allowing it to
evolve as requirements change over time.

A database system is a system that contains both the database and the
database management system (DBMS).

Components of a complete DBMS

A complete DBMS contains the following main components:


1. Data: the data is the most important part of the DBMS. The database
should store data that is relevant to the users. In fact there are two types
of data: real data and meta-data. The meta-data is the description of the
real data, such as when the real data was created, who created it and who
can access it.
2. Hardware: the actual hardware of the computer which is used for
keeping the database and accessing the data.
3. Software: the actual DBMS which acts as a mediator between the
database and the database users. Examples of DBMS software include
MySQL, Microsoft Access, Oracle, DB2, Sybase, SQL Server, Ingres,
Postgres (PostgreSQL), etc.
4. Users: are individuals who can insert, retrieve and update data in the
database.
5. Procedures: are the actual practices that users follow to obtain, enter and
maintain data. For example, in a payroll system: how the hours worked are
received by the clerk and entered into the system, exactly when monthly
reports are generated and to whom they are sent, etc. Procedures are
important when new employees need to use the DBMS.

1.3 Characteristics of the database approach

There are a number of characteristics that distinguish the database approach
from the traditional file processing systems. A file processing system is a
collection of individual files accessed by application programs. Such systems
were the first computer-based applications used to handle commercial or
organizational data as a replacement for the manual file system. File processing
systems have the following limitations:
• Application program dependencies: a change in a single file can cause
changes in many application programs.
• Data redundancy: storage of unnecessary duplicates of data. Redundancy
can waste space and lead to data integrity problems.
• Lack of data sharing: multiple user access is difficult to control.
Thus database management systems were developed to handle these limitations.
A DBMS attempts to resolve the following problems:
• Data redundancy and inconsistency, by keeping one copy of each data item
in the database.
• Difficulty in accessing data, by providing a query language.
• Integrity problems, by enforcing integrity constraints.
• Lack of data sharing, by providing controlled concurrent multiuser
access.
• Security problems, by using different security mechanisms.
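To make the point about integrity constraints concrete, here is a minimal sketch using Python's built-in sqlite3 module. The employee table and its rows are hypothetical; the point is only that the DBMS, not the application program, rejects rows that break the declared rules.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        id     INTEGER PRIMARY KEY,
        email  TEXT NOT NULL UNIQUE,
        salary REAL CHECK (salary >= 0)
    )
    """
)
conn.execute("INSERT INTO employee VALUES (1, 'a@example.com', 5000)")

rejected = []
bad_rows = [
    (2, "a@example.com", 4000),  # duplicate email violates UNIQUE
    (3, None, 4000),             # missing email violates NOT NULL
    (4, "b@example.com", -1),    # negative salary violates CHECK
]
for row in bad_rows:
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError as exc:
        rejected.append(row[0])
        print(f"row {row[0]} rejected: {exc}")

# Only the one valid row remains in the table.
count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

In a file processing system, each of these checks would have to be repeated in every program that writes to the file.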

The main characteristics of the database approach versus the file-processing
approach are the following:
1. Self-describing nature of a database system
2. Insulation between programs and data
3. Support of multiple views of the data
4. Sharing of data and multiuser transaction processing

1. Self-Describing Nature of a Database System

A fundamental characteristic of the database approach is that the database
system contains not only the database itself but also a complete definition or
description of the database structure and constraints, called metadata.

Metadata is simply data about data. It is a complete definition or description of
the database structure and constraints, together with details about database
users and their privileges. For example, in a relational database management
system the metadata may contain the names of tables, the relationships among
the tables, the names of the columns of the tables, the type of data to be stored
in each column and the constraints on the columns such as Not Null or Unique, etc.

In other words, metadata is data about one or more pieces of data which
records the means of creation of the data, the purpose of the data, the time and
date of creation, the creator or author of the data, the standards used while
creating the data, etc. Metadata is essential for understanding, and hence for
improving the usage of, a large collection of data.

Metadata is stored in a special storage area called the data dictionary or system
catalogue. The catalogue is mostly used by the DBMS and by users. The users
(mainly the administrators) are those who are interested in knowing the structure
of the database. A general-purpose DBMS must refer to the catalogue to know the
structure of specific tables in a particular database.

Most database management systems keep the data dictionary hidden from users
to prevent them from accidentally destroying its contents. Data dictionaries do
not contain any actual data. The actual data of interest is stored separately.
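As a hedged illustration of a system catalogue, the sketch below asks SQLite (through Python's sqlite3 module) to describe a table. SQLite happens to expose its catalogue for reading through the sqlite_master table and the table_info pragma; other DBMSs organize and protect their catalogues differently. The course table is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT NOT NULL, credits INTEGER)"
)

# sqlite_master is SQLite's built-in catalogue table.
tables = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [
    (name, col_type, bool(notnull))
    for _, name, col_type, notnull, _, _ in conn.execute("PRAGMA table_info(course)")
]

print(tables)
for column in columns:
    print(column)
```

Note that the catalogue holds only table and column descriptions; no course rows were inserted, and none are needed to answer these questions.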

2. Insulation between Programs and Data

In traditional file processing, the structure of data files is embedded (inserted as
an integral part) in the programs that access the data file, so any changes to the
structure of a file may require changing all programs that access this file. By
contrast, in DBMSs the actual data is stored separately from the programs that
use the data, so access programs do not require such changes in most cases. The
structure of data files is stored in the DBMS catalogue separately from the
access programs. We call this property program-data independence.
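Program-data independence can be demonstrated with a short sketch, again using Python's sqlite3 module with a hypothetical student table. The access function names the column it needs and nothing else, so a structural change to the table leaves it untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO student (id, name) VALUES (1, 'Abebe')")


def student_names(connection):
    """Access program: names its column, knows nothing about storage layout."""
    query = "SELECT name FROM student ORDER BY id"
    return [name for (name,) in connection.execute(query)]


before = student_names(conn)

# Structural change: the DBMS updates its catalogue; the program is not edited.
conn.execute("ALTER TABLE student ADD COLUMN birth_date TEXT")

after = student_names(conn)
print(before, after)
```

A file-processing program that read fixed-width records would have needed a code change to cope with the new field.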

3. Support of Multiple Views of the Data

A database typically has many users, each of whom may require a different
perspective or view of the database. A view may be a subset of the database or
it may contain virtual data that is derived from the database files but is not
explicitly stored. Some users may not need to be aware of whether the data they
refer to is stored or derived. A multiuser DBMS whose users have a variety of
applications must provide facilities for defining multiple views. For example, a
department head may only be interested in the departmental finances and
students’ enrolments but not the library information. The librarian is interested
in the information about books but he/she would not be expected to have any
interest in the information about the salary of academic staff.
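Both kinds of view mentioned above, a subset of the database and virtual data derived from it, can be sketched in SQL through Python's sqlite3 module. The staff table, the view names and the sample rows are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE staff (
        id INTEGER PRIMARY KEY, name TEXT, department TEXT, salary REAL
    );
    INSERT INTO staff VALUES (1, 'Abebe', 'Computing', 9000);
    INSERT INTO staff VALUES (2, 'Sara', 'Library', 7000);

    -- Subset view: hides the salary column.
    CREATE VIEW staff_directory AS
        SELECT name, department FROM staff;

    -- Derived (virtual) data: totals are computed on demand, not stored.
    CREATE VIEW department_payroll AS
        SELECT department, SUM(salary) AS total_salary
        FROM staff GROUP BY department;
    """
)

directory = conn.execute("SELECT * FROM staff_directory ORDER BY name").fetchall()
payroll = conn.execute(
    "SELECT * FROM department_payroll ORDER BY department"
).fetchall()
print(directory)
print(payroll)
```

A user who queries staff_directory never sees salaries, and a user who queries department_payroll need not know that the totals are computed rather than stored.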

4. Sharing of Data and Multiuser Transaction Processing

A multiuser DBMS, as its name implies, must allow multiple users to access the
database at the same time. This is essential if data for multiple applications is
to be integrated and maintained in a single database. The DBMS must include
concurrency control software to ensure that several users trying to update the
same data do so in a controlled manner so that the result of the updates is
correct. For example, when several reservation clerks try to assign a seat on an
airline flight, the DBMS should ensure that each seat can be accessed by only
one clerk at a time for assignment to a passenger. These types of applications
are generally called on-line transaction processing (OLTP) applications. A
fundamental role of multiuser DBMS software is to ensure that concurrent
transactions operate correctly.
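The seat-assignment example can be sketched as follows. This is a simplified, single-connection illustration in Python's sqlite3 module, not a model of how a production DBMS implements concurrency control: it only shows the idea that the check ("is the seat still free?") and the update happen as one atomic statement, so a second clerk cannot overwrite the first assignment. The table, the flight code and the function name are hypothetical.

```python
import sqlite3


def assign_seat(conn, flight, seat_no, passenger):
    """Try to assign a seat; return True only if it was still free."""
    with conn:  # commits on success, rolls back if an exception is raised
        cursor = conn.execute(
            "UPDATE seat SET passenger = ? "
            "WHERE flight = ? AND seat_no = ? AND passenger IS NULL",
            (passenger, flight, seat_no),
        )
        # rowcount is 1 if the seat was free and is now taken, 0 otherwise.
        return cursor.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE seat (flight TEXT, seat_no TEXT, passenger TEXT, "
    "PRIMARY KEY (flight, seat_no))"
)
conn.execute("INSERT INTO seat VALUES ('XY100', '12A', NULL)")
conn.commit()

first = assign_seat(conn, "XY100", "12A", "Abebe")   # first clerk
second = assign_seat(conn, "XY100", "12A", "Sara")   # second clerk, same seat
holder = conn.execute("SELECT passenger FROM seat").fetchone()[0]
print(first, second, holder)
```

With truly concurrent clients, the DBMS relies on locking or similar mechanisms to serialize such updates; the conditional update shown here is what keeps the outcome correct once they are serialized.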

1.4 Actors on the scene

In a large database with hundreds of users, many persons are involved in its
design, use and maintenance. Actors on the scene are people whose jobs involve
the day-to-day use of such large databases. The following professionals are
categorized as actors on the scene:

a. Database Administrators (DBAs)
In any organization where many persons use the same resources, there is a need
for a chief administrator to oversee and manage these resources. A resource is
anything which is valuable for an organization. There can be a number of
resources in any organization, for example, buildings, vehicles, technical staff,
managers, supporting staff and machinery. In a database environment, the
primary resource is the database itself and the secondary resource is the DBMS
and related software.
The main responsibilities of the DBAs are:

• Administering the resources in a database environment.
• Authorizing access to the database, coordinating and monitoring
its use.
• Acquiring software and hardware resources as needed.
• Setting security parameters for the database system and improving
system performance.

The DBA is accountable for problems such as breach of security or poor
system response time. In large organizations, the DBA is assisted by a
staff that helps carry out these functions.

b. Database Designers
Database designers are responsible for identifying the data to be stored in
the database (what type of data should the database contain) and for
choosing appropriate structures to represent and store this data.
It is the responsibility of database designers to communicate with all
prospective database users, in order to understand their requirements, and
to come up with a design that meets these requirements. In many cases,
the designers are on the staff of the DBA and may be assigned other staff
responsibilities after the database design is completed.

c. End Users
End users are the people whose jobs require access to the database for
querying, updating, and generating reports; the database primarily exists
for their use. There are several categories of end users:

• Casual end users: occasionally access the database, but they may
need different information each time. They use a sophisticated
database query language to specify their requests and are typically
middle-level or high-level managers or other occasional browsers.
• Naive or parametric end users: make up a sizeable (fairly large)
portion of database end users. Their main job function revolves around
constantly querying and updating the database, using standard types of
queries and updates, called canned transactions, that have been
carefully programmed and tested. Parametric users perform various
tasks:
   o Bank tellers check account balances and post withdrawals and
   deposits.
   o Reservation clerks for airlines, hotels, and car rental companies
   check availability for a given request and make reservations.
• Sophisticated end users: include engineers, scientists, business
analysts, and others who thoroughly familiarize themselves with the
facilities of the DBMS so as to implement their applications to meet
their complex requirements.
• Stand-alone users: maintain personal databases by using ready-made
program packages that provide easy-to-use menu-based or graphics-
based interfaces. An example is the user of a tax package that stores a
variety of personal financial data for tax purposes.

d. System Analysts and Application Programmers


System analysts determine the requirements of end users, especially naive
or parametric end users, and develop specifications for canned
transactions that meet these requirements. Application programmers
implement these specifications as programs; then they test, debug,
document, and maintain these canned transactions. Such analysts and
programmers (nowadays called software engineers) should be familiar
with the full range of capabilities provided by the DBMS to accomplish
their tasks.
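A canned transaction of the kind a bank teller would use might look like the sketch below, written with Python's sqlite3 module. The account table, the function names and the business rule (no overdrafts) are assumptions for the illustration; the point is that the end user supplies only parameters, while the query itself is fixed, programmed and tested in advance.

```python
import sqlite3


def post_withdrawal(conn, account_id, amount):
    """Canned transaction: withdraw `amount` if the balance allows it."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    with conn:  # commit on success, roll back on error
        cursor = conn.execute(
            "UPDATE account SET balance = balance - ? "
            "WHERE id = ? AND balance >= ?",
            (amount, account_id, amount),
        )
        return cursor.rowcount == 1  # False: unknown account or too little money


def check_balance(conn, account_id):
    """Canned query: return the balance, or None for an unknown account."""
    row = conn.execute(
        "SELECT balance FROM account WHERE id = ?", (account_id,)
    ).fetchone()
    return None if row is None else row[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute("INSERT INTO account VALUES (1, 500)")
conn.commit()

ok = post_withdrawal(conn, 1, 200)         # allowed
too_much = post_withdrawal(conn, 1, 1000)  # refused, balance unchanged
balance = check_balance(conn, 1)
print(ok, too_much, balance)
```

The teller never writes SQL; the parameters (account number and amount) are the only input, which is why such users are called parametric.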

1.5 Workers behind the Scene

Persons who are involved in the design, development, operation and maintenance
of the DBMS software and system environment are called workers behind the
scene. They typically do not use the database for their own purposes. The
following are categories of workers behind the scene:

• DBMS system designers and implementers: are persons who design and
implement the DBMS modules (i.e. the system catalogue module, data
access module, concurrency control module, recovery module, etc.) and
interfaces (i.e. interfaces with other OSs and compilers) as a software
package.
• Tool developers: include persons who design and implement tools, that is,
the software packages that facilitate database system design and use, and
help improve performance. Tools are optional packages that are often
developed by independent software developers and hence are purchased
separately.
• Operators and maintenance personnel: are the system administration
personnel who are responsible for the actual running and maintenance of
the hardware and software environment for the database system.

1.6 Advantages of using a DBMS

• Controlling redundancy in data storage and in development and
maintenance efforts
• Restricting unauthorized access (security and authorization)
• Providing persistent storage for program objects and data structures
• Permitting inference and actions using rules
• Providing multiple user interfaces
• Representing complex relationships among data
• Enforcing integrity constraints
• Providing backup and recovery

1.7 When not to use a DBMS

Main costs of using a DBMS:

• High initial investment in hardware and software, and a possible need for
additional hardware.
• Overhead for providing generality, security, recovery, integrity, and
concurrency control.
• Generality that a DBMS provides for defining and processing data.

When a DBMS may be unnecessary:

• If the database and applications are simple, well defined, and not
expected to change.
• If there are stringent real-time requirements that may not be met
because of DBMS overhead.
• If access to data by multiple users is not required.
