Department of Commerce (CA)
CORE PAPER-II-DATABASE SYSTEM
CONCEPTS
SEMESTER:I SUB CODE:18MCC12C
M.COM(CA)
UNIT1: Database system architecture-basic concepts-
data system-operational data, data independence
architecture for a database system-distributed
databases.
REFERENCE BOOK:
An Introduction to Database Systems - C.J. Date
An Introduction to Database Systems - Bipin
PREPARED BY: DR. E.N. KANJANA,
ASST PROFESSOR.
Data Processing Vs. Data Management Systems
Although Data Processing and Data Management Systems both refer to functions that
take raw data and transform it into usable information, the two terms are used very
differently. Data Processing generally describes what large mainframe computers did
from the late 1940s until the early 1980s (and what most large organizations still do
to a greater or lesser extent today): large volumes of raw transaction data are fed into
programs that update a master file, and fixed-format reports are written to paper.
The term Data Management Systems refers to an expansion of this concept. The raw data,
previously copied manually from paper to punched cards and later keyed into data-entry
terminals, is now fed into the system from a variety of sources, including ATMs, EFT,
and direct customer entry through the Internet. The master file concept has been
largely displaced by database management systems, and static reporting has been replaced
or augmented by ad-hoc reporting and direct inquiry, including downloading of data by
customers. The ubiquity of the Internet and the personal computer has been the driving
force in the transformation of Data Processing into the more global concept of Data
Management Systems.
Characteristics of Database
The database approach has several characteristic features, which are discussed in detail
below:
1.5.1 Concurrent Use
A database system allows several users to access the database concurrently. Answering
different questions from different users with the same (base) data is a central aspect of an
information system. Such concurrent use of data increases the economy of a system.
An example of concurrent use is the travel database of a large travel agency. The
employees of different branches can access the database concurrently and book journeys
for their clients. Each travel agent can see on screen whether seats are still available
for a specific journey or whether it is already fully booked.
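The travel agency example can be sketched in a few lines. This is only an illustration, assuming Python's built-in sqlite3 module; the journeys table, the file name travel.db and the two "agents" are hypothetical and not part of the original notes. Two connections to the same database file stand for two branch offices.

```python
import os
import sqlite3
import tempfile

# Hypothetical shared database file used by all branches.
db_path = os.path.join(tempfile.mkdtemp(), "travel.db")

# Each connection stands for one travel agent.
agent_a = sqlite3.connect(db_path)
agent_b = sqlite3.connect(db_path)

agent_a.execute("CREATE TABLE journeys (id INTEGER PRIMARY KEY, seats_left INTEGER)")
agent_a.execute("INSERT INTO journeys VALUES (1, 2)")  # journey 1 has two seats
agent_a.commit()


def book_seat(conn, journey_id):
    """Book one seat if any is left; return True on success."""
    cur = conn.execute(
        "UPDATE journeys SET seats_left = seats_left - 1 "
        "WHERE id = ? AND seats_left > 0",
        (journey_id,),
    )
    conn.commit()
    return cur.rowcount == 1


def seats_left(conn, journey_id):
    """Return the number of free seats as seen through this connection."""
    return conn.execute(
        "SELECT seats_left FROM journeys WHERE id = ?", (journey_id,)
    ).fetchone()[0]


# Both agents book against the same data; the third booking finds no seat.
results = [book_seat(agent_a, 1), book_seat(agent_b, 1), book_seat(agent_a, 1)]
remaining = seats_left(agent_b, 1)
print(results, remaining)

agent_a.close()
agent_b.close()
```

The check "seats_left > 0" is placed inside the UPDATE statement itself, so the test and the booking happen in one step and neither agent can book a seat that the other has just taken.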
1.5.2 Structured and Described Data
A fundamental feature of the database approach is that a database system contains not
only the data but also the complete definition and description of these data. These
descriptions are basically details about the extent, the structure, the type and the format of
all data and, additionally, the relationships between the data. This kind of stored data is
called metadata ("data about data").
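As an illustration of metadata being stored alongside the data, the sketch below asks a database to describe one of its own tables. It assumes Python's built-in sqlite3 module; the employee table is a hypothetical example. Other database systems expose their metadata through different catalog tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " salary REAL)"
)

# SQLite keeps every table definition in its own catalog table, sqlite_master.
stored_sql = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'employee'"
).fetchone()[0]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")]

print(stored_sql)
print(columns)
conn.close()
```

The application did not have to remember the structure of the table; it obtained the column names and types from the database itself.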
1.5.3 Separation of Data and Applications
As described under structured data above, the structure of a database is described through
metadata which is also stored in the database. Application software does not need any
knowledge about the physical data storage, such as encoding, format or storage place. It
only communicates with the database management system (DBMS) via a standardised
interface with the help of a standardised language like SQL. The access to the data and
the metadata is handled entirely by the DBMS. In this way all the applications can be fully
separated from the data. Therefore internal reorganisations of the database or improvements
of efficiency do not have any influence on the application software.
1.5.4 Data Integrity
Data integrity is a byword for the quality and the reliability of the data in a database
system. In a broader sense, data integrity also includes the protection of the database from
unauthorised access (confidentiality) and unauthorised changes. Data reflect facts of the
real world.
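One part of data integrity, the rejection of values that cannot reflect a real-world fact, can be sketched as follows. The example assumes Python's built-in sqlite3 module; the account table, the owner names and the rule "a balance may not be negative" are hypothetical. Protection against unauthorised access is not shown, because SQLite has no user accounts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " owner TEXT NOT NULL,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES (1, 'Asha', 100)")
conn.commit()

# The DBMS, not the application, refuses the invalid row.
rejected = False
try:
    conn.execute("INSERT INTO account VALUES (2, 'Ravi', -50)")
except sqlite3.IntegrityError:
    rejected = True
    conn.rollback()

row_count = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
print(rejected, row_count)
conn.close()
```

Because the rule is declared once in the table definition, every application that uses the database is subject to it.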
1.5.5 Transactions
A transaction is a bundle of actions which are done within a database to bring it from one
consistent state to a new consistent state. In between, the data are inevitably
inconsistent. A transaction is atomic, which means that it cannot be divided up any
further: either all of its actions are carried out or none of them. Doing only a part of the
actions would lead to an inconsistent database state. One example of a transaction is the
transfer of an amount of money from one bank account to another. The debit of the
money from one account and the credit of it to the other account together make up a
consistent transaction. This transaction is also atomic. The debit or the credit alone would
lead to an inconsistent state. After the transaction is finished (debit and credit), the
changes to both accounts become persistent: the one who gave the money now has
less money in his account, while the receiver now has a higher balance.
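The bank transfer can be sketched as a small program. This is a minimal illustration, assuming Python's built-in sqlite3 module; the account table, the transfer function and the amounts are hypothetical. The credit is deliberately executed before the debit so that a failing debit leaves a half-done transfer, which the rollback then undoes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 20)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move amount from source to target; either both updates apply or neither."""
    try:
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, target)
        )
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, source)
        )
        conn.commit()  # both changes become persistent together
        return True
    except sqlite3.IntegrityError:
        conn.rollback()  # undo the credit that was already made
        return False


def balances(conn):
    return conn.execute("SELECT balance FROM account ORDER BY id").fetchall()


ok1 = transfer(conn, 1, 2, 30)   # enough money: succeeds
ok2 = transfer(conn, 1, 2, 500)  # would overdraw account 1: rolled back
final = balances(conn)
print(ok1, ok2, final)
conn.close()
```

After the failed second transfer, account 2 does not keep the credit it briefly received, so the database returns to the last consistent state.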
1.5.6 Data Persistence
Data persistence means that in a DBMS all data is maintained as long as it is not deleted
explicitly. The life span of data needs to be determined directly or indirectly by the user
and must not depend on system features. Additionally, data once stored in a
database must not be lost. Changes to a database that are made by a transaction are
persistent. Once a transaction is finished, even a system crash cannot put the data in
danger.
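A small sketch of persistence, assuming Python's built-in sqlite3 module; the note table and the file name notes.db are hypothetical. Closing the connection without committing is used here as a simple stand-in for an interrupted session; it does not simulate a real system crash.

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "notes.db")

# First session: one change is committed, a second one is not.
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE note (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO note (body) VALUES ('committed')")
conn.commit()
conn.execute("INSERT INTO note (body) VALUES ('never committed')")
conn.close()  # the open transaction is discarded, not committed

# Second session: only the committed row is still there.
conn = sqlite3.connect(db_path)
bodies = [row[0] for row in conn.execute("SELECT body FROM note ORDER BY id")]
conn.close()
print(bodies)
```

The committed row outlives the program session that wrote it, while the unfinished change leaves no trace.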
1.6 Advantages and Disadvantages of a DBMS
Using a DBMS to manage data has many advantages:
Data independence: Application programs should be as independent as possible from
details of data representation and storage. The DBMS can provide an abstract view of the
data to insulate application code from such details.
Efficient data access: A DBMS utilizes a variety of sophisticated techniques to store and
retrieve data efficiently. This feature is especially important if the data is stored on
external storage devices.
Data integrity and security: If data is always accessed through the DBMS, the DBMS
can enforce integrity constraints on the data. For example, before inserting salary
information for an employee, the DBMS can check that the department budget is not
exceeded. Also, the DBMS can enforce access controls that govern what data is visible to
different classes of users.
Data administration: When several users share the data, centralizing the administration
of data can offer significant improvements. Experienced professionals who understand
the nature of the data being managed, and how different groups of users use it, can be
responsible for organizing the data representation to minimize redundancy and for fine-tuning
the storage of the data to make retrieval efficient.
Concurrent access and crash recovery: A DBMS schedules concurrent accesses to the
data in such a manner that users can think of the data as being accessed by only one user
at a time. Further, the DBMS protects users from the effects of system failures.
Reduced application development time: Clearly, the DBMS supports many important
functions that are common to many applications accessing data stored in the DBMS.
This, in conjunction with the high-level interface to the data, facilitates quick
development of applications. Such applications are also likely to be more robust than
applications developed from scratch because many important tasks are handled by the DBMS
instead of being implemented by the application.
Given all these advantages, is there ever a reason not to use a DBMS? A DBMS is a complex
piece of software, optimized for certain kinds of workloads (e.g., answering complex
queries or handling many concurrent requests), and its performance may not be adequate
for certain specialized applications. Examples include applications with tight real-time
constraints or applications with just a few well-designed critical operations for which
efficient custom code must be written. Another reason for not using a DBMS is that an
application may need to manipulate the data in ways not supported by the query
language. In such a situation, the abstract view of the data presented by the DBMS does
not match the application's needs, and actually gets in the way. As an example, relational
databases do not support flexible analysis of text data (although vendors are now
extending their products in this direction). If specialized performance or data
manipulation requirements are central to an application, the application may choose not
to use a DBMS, especially if the added benefits of a DBMS (e.g., flexible querying,
security, concurrent access, and crash recovery) are not required. In most situations
calling for large-scale data management, however, DBMSs have become an
indispensable tool.
Disadvantages of a DBMS
Danger of Overkill: For small and simple single-user applications, a database
system is often not advisable.
Complexity: A database system creates additional complexity and requirements. The
supply and operation of a database management system with several users and databases
is quite costly and demanding.
Qualified Personnel: The professional operation of a database system requires
appropriately trained staff. Without a qualified database administrator nothing will work
for long.
Costs: The use of a database system generates new costs, both for the system
itself and for additional hardware and the more complex handling of the system.
Lower Efficiency: A database system is general-purpose software, which is often less
efficient than specialised software that is produced and optimised for exactly one problem.
Data Dictionary
The data dictionary holds detailed information about the different structures and data
types, the details of the logical structures that are mapped onto the different structures,
details of the relationships between data items, details of all users' privileges and access
rights, and details of resource performance.
TYPES OF RELATIONSHIPS
There are three types of relationships between the tables. The type of relationship
that is created depends on how the related columns are defined:
One-to-one relationship (1:1)
A pair of tables bears a one-to-one relationship when a single record in the first table
is related to only one record in the second table, and a single record in the second table is
related to only one record in the first table.
One-to-Many Relationships (1:M)
A one-to-many relationship exists between a pair of tables when a single record in
the first table can be related to one or more records in the second table, but a single
record in the second table can be related to only one record in the first table.
Many-to-Many Relationships (M:M)
A pair of tables bears a many-to-many relationship when a single record in the first
table can be related to one or more records in the second table and a single record in the
second table can be related to one or more records in the first table.
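The three relationship types can be expressed through the way the related columns are defined. The following is a sketch only, assuming Python's built-in sqlite3 module; all table and column names (department, employee, desk, project, assignment) are hypothetical. Note that the UNIQUE foreign key shown for the one-to-one case allows an employee to have no desk at all, so strictly it models "at most one".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT);
-- 1:M: many employees may reference the same department
CREATE TABLE employee (
    id INTEGER PRIMARY KEY,
    name TEXT,
    dept_id INTEGER REFERENCES department(id)
);
-- 1:1: UNIQUE allows at most one desk per employee
CREATE TABLE desk (
    id INTEGER PRIMARY KEY,
    employee_id INTEGER UNIQUE REFERENCES employee(id)
);
CREATE TABLE project (id INTEGER PRIMARY KEY, title TEXT);
-- M:M: a junction table pairs employees with projects
CREATE TABLE assignment (
    employee_id INTEGER REFERENCES employee(id),
    project_id INTEGER REFERENCES project(id),
    PRIMARY KEY (employee_id, project_id)
);
INSERT INTO department VALUES (1, 'Sales');
INSERT INTO employee VALUES (1, 'Asha', 1), (2, 'Ravi', 1);
INSERT INTO desk VALUES (1, 1);
INSERT INTO project VALUES (1, 'Audit'), (2, 'Website');
INSERT INTO assignment VALUES (1, 1), (1, 2), (2, 1);
""")


def count(sql):
    return conn.execute(sql).fetchone()[0]


staff_in_sales = count("SELECT COUNT(*) FROM employee WHERE dept_id = 1")
projects_of_asha = count("SELECT COUNT(*) FROM assignment WHERE employee_id = 1")
staff_on_audit = count("SELECT COUNT(*) FROM assignment WHERE project_id = 1")

# A second desk for the same employee violates the one-to-one rule.
second_desk_rejected = False
try:
    conn.execute("INSERT INTO desk VALUES (2, 1)")
except sqlite3.IntegrityError:
    second_desk_rejected = True

print(staff_in_sales, projects_of_asha, staff_on_audit, second_desk_rejected)
conn.close()
```

A many-to-many relationship cannot be stored with a single foreign key column, which is why the sketch introduces the separate assignment table.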
Data Independence
The three-schema architecture can be used to explain the concept of data independence,
which can be defined as the capacity to change the schema at one level of a database
system without having to change the schema at the next higher level. We can define two
types of data independence:
1. Logical data independence is the capacity to change the conceptual schema
without having to change external schemas or application programs. We may
change the conceptual schema to expand the database (by adding a record type or
data item), or to reduce the database (by removing a record type or data item). In
the latter case, external schemas that refer only to the remaining data should not
be affected. Only the view definition and the mappings need be changed in a
DBMS that supports logical data independence. Application programs that
reference the external schema constructs must work as before, after the conceptual
schema undergoes a logical reorganization. Changes to constraints can also be applied
to the conceptual schema without affecting the external schemas or
application programs.
2. Physical data independence is the capacity to change the internal schema
without having to change the conceptual (or external) schemas. Changes to the
internal schema may be needed because some physical files had to be
reorganized—for example, by creating additional access structures—to improve
the performance of retrieval or update. If the same data as before remains in the
database, we should not have to change the conceptual schema.
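Both kinds of data independence can be illustrated with a small sketch, assuming Python's built-in sqlite3 module. The employee table, the staff_directory view and the query are hypothetical; the view plays the role of an external schema, and the application query is never edited while the lower levels change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
INSERT INTO employee VALUES (1, 'Asha', 'Sales'), (2, 'Ravi', 'Accounts');
-- External schema: the application only ever sees this view.
CREATE VIEW staff_directory AS SELECT name, dept FROM employee;
""")

# The application program: it refers to the view, never to the base table.
APP_QUERY = "SELECT name FROM staff_directory WHERE dept = 'Sales'"


def run_app():
    return conn.execute(APP_QUERY).fetchall()


before = run_app()

# Logical change: the conceptual schema is expanded by a new data item.
conn.execute("ALTER TABLE employee ADD COLUMN salary REAL")
after_logical = run_app()

# Physical change: an additional access structure is created.
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept)")
after_physical = run_app()

print(before, after_logical, after_physical)
conn.close()
```

The query text stays the same through both changes and returns the same answer each time; only the levels below the view were modified.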
Whenever we have a multiple-level DBMS, its catalog must be expanded to include
information on how to map requests and data among the various levels. The DBMS uses
additional software to accomplish these mappings by referring to the mapping
information in the catalog. Data independence is accomplished because, when the schema
is changed at some level, the schema at the next higher level remains unchanged; only the
mapping between the two levels is changed. Hence, application programs referring to the
higher-level schema need not be changed.
The three-schema architecture can make it easier to achieve true data independence, both
physical and logical. However, the two levels of mappings create an overhead during
compilation or execution of a query or program, leading to inefficiencies in the DBMS.
Because of this, few DBMSs have implemented the full three-schema architecture.