Handout: 2
Subject: Information Technology
Topic: Understand Relational Database Management System
Database Concepts
The key to organizational success is effective decision making, which requires timely, relevant and
accurate information. Hence information plays a critical role in today's competitive environment.
A Database Management System (DBMS) simplifies the task of managing data and extracting useful
information from it.
What is Data?
Data is a collection of raw facts that have not yet been processed to reveal useful information. Information is
produced by processing data.
Data Processing Example
In this example, the test marks of all the students in a class are the data. The average, maximum and
minimum marks extracted from them are the information.
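The same idea can be sketched in a few lines of Python. The student names and marks below are made up for illustration; they are not taken from the handout's figure.

```python
# Hypothetical test marks for a small class: this is the raw data.
marks = {"Asha": 78, "Ravi": 64, "Meena": 91, "John": 55}

# Processing the data produces information.
average_mark = sum(marks.values()) / len(marks)
maximum_mark = max(marks.values())
minimum_mark = min(marks.values())

print("Average:", average_mark)
print("Maximum:", maximum_mark)
print("Minimum:", minimum_mark)
```

The dictionary is the data; the three computed values are the information.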
What is a database?
A collection of related data that has been recorded, organized and made available for searching is called a
Database. For example, consider the name, class, roll number and marks in every subject of every student in a
school. To record this information, the school might maintain a register, or store it on a hard drive
using a computer system and software such as a spreadsheet or a DBMS package.
Properties of database
1) A database is a representation of some aspect of the real world, also called the miniworld. Whenever there
are changes in this miniworld, they are also reflected in the database.
2) It is designed, built and populated with data for a specific purpose.
3) It can be of any size and complexity.
4) It can be maintained manually or it may be computerized.
ISAS/XI/IT/2022-23 Page 1/6
Need for a Database
In traditional file processing, data is stored in the form of files. A number of application programs are
written by programmers to insert, delete, modify and retrieve data from these files. New application
programs will be added to the system as the need arises. For example, consider the Sales and Payroll
departments of a company. One user maintains information about all the salespersons in the Sales
department in a file, say File1, and another user maintains the payroll details of the same salespersons
in a separate file, say File2, in the Payroll department.
Figure: Traditional File Processing System
Although both departments need information about the salespersons, each stores it in its own file and
uses different application programs to access that file.
Drawbacks of Traditional File processing system
1. Data Redundancy: The same information is stored in more than one file, which wastes storage space.
2. Data Inconsistency: If a file is updated, every other file containing the same information must also be
updated; otherwise the data becomes inconsistent.
3. Lack of Data Integration: As data files are independent, accessing information out of multiple files
becomes very difficult.
The database approach overcomes these problems. In the database approach, a single repository of data is
maintained, which is accessed by different users as per their needs.
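The difference between the two approaches can be sketched with plain Python dictionaries standing in for files. The salesperson record, field names and values here are hypothetical.

```python
# Traditional approach: each department keeps its own copy of the record.
sales_file = {"S01": {"name": "Anil", "city": "Pune"}}
payroll_file = {"S01": {"name": "Anil", "city": "Pune", "salary": 30000}}

# Sales records a change of city, but Payroll's copy is not updated.
sales_file["S01"]["city"] = "Mumbai"
files_disagree = sales_file["S01"]["city"] != payroll_file["S01"]["city"]
print("Copies disagree:", files_disagree)

# Database approach: one shared record that both departments read and update.
repository = {"S01": {"name": "Anil", "city": "Pune", "salary": 30000}}
repository["S01"]["city"] = "Mumbai"
sales_view = repository["S01"]["city"]
payroll_view = repository["S01"]["city"]
print("Views agree:", sales_view == payroll_view)
```

With two copies, one update leaves the files inconsistent. With a single repository, there is only one value to change.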
Database Management System (DBMS)
A database management system is a collection of programs that enables users
to create, maintain and use a database.
This database can be accessed by different users as per their requirements.
The various operations that need to be performed on a database are as follows:
1. Defining the Database: It involves specifying the data types of the data that will be stored in the database
and also any constraints on that data.
2. Populating the Database: It involves storing the data on some storage medium that is controlled by DBMS.
3. Manipulating the Database: It involves modifying the database, retrieving data or querying the
database, generating reports from the database etc.
4. Sharing the Database: It allows multiple users to access the database at the same time.
5. Protecting the Database: It enables protection of the database from software/hardware failures and
unauthorized access.
6. Maintaining the Database: It involves adapting the database to changing requirements.
Some examples of DBMS are MySQL, Oracle, DB2, etc.
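The first three operations can be tried out with a small sketch. It assumes SQLite (through Python's built-in sqlite3 module) in place of the DBMS packages named above, and the student table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # temporary in-memory database

# 1. Defining: data types and constraints for each column.
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " marks INTEGER CHECK (marks BETWEEN 0 AND 100))"
)

# 2. Populating: storing rows in the database.
conn.executemany(
    "INSERT INTO student (roll_no, name, marks) VALUES (?, ?, ?)",
    [(1, "Asha", 78), (2, "Ravi", 64), (3, "Meena", 91)],
)

# 3. Manipulating: modifying a row, then querying the database.
conn.execute("UPDATE student SET marks = 70 WHERE roll_no = 2")
top_scorers = conn.execute(
    "SELECT name FROM student WHERE marks >= 75 ORDER BY name"
).fetchall()
print(top_scorers)
```

Sharing, protecting and maintaining are handled by the DBMS itself and are not shown here.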
The main characteristics of a DBMS are as follows:
Self-describing Nature of a Database System: A DBMS contains not only the database but also the
description of the data that it stores. This description of data is called metadata. Metadata is stored in a
database catalogue or data dictionary. It contains the structure of the data and also the constraints that
are imposed on the data.
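As an illustration, SQLite keeps its catalogue in a table named sqlite_master and reports column details through PRAGMA table_info. The sketch below assumes SQLite via Python's sqlite3 module; the employee table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

# sqlite_master is SQLite's catalogue: it describes the objects stored in the database.
catalogue = conn.execute("SELECT type, name FROM sqlite_master").fetchall()

# PRAGMA table_info returns one row per column:
# (position, name, declared type, not-null flag, default value, primary-key flag)
columns = conn.execute("PRAGMA table_info(employee)").fetchall()
column_names = [col[1] for col in columns]

print(catalogue)
print(column_names)
```

The program never stored these descriptions itself; the DBMS recorded them when the table was defined.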
Insulation Between Programs and Data: Since the definition of data is stored separately in a DBMS,
any change in the structure of data would be done in the catalogue and hence programs which access
this data need not be modified. This property is called Program-Data Independence.
Sharing of Data: A multiuser environment allows multiple users to access the database
simultaneously. Thus a DBMS must include concurrency control software to allow simultaneous access
of data in the database without any inconsistency problems.
Advantages of using DBMS Approach
Reduction in Redundancy: Data in a DBMS is more concise because of the central repository of data.
All the data is stored at one place. There is no repetition of the same data. This also reduces the cost of
storing data on hard disks or other memory devices.
Improved Consistency: The chances of data inconsistencies in a database are also reduced as there is a
single copy of data that is accessed or updated by all the users.
Improved Availability: The same information is made available to different users. This helps the sharing of
information by various users of the database.
Improved Security: Though there is improvement in the availability of information to users, it may
also be required to restrict the access to confidential information. By making use of passwords and
controlling users' database access rights, the DBA can provide security to the database.
User Friendly: Using a DBMS, it becomes very easy to access, modify and delete data. It reduces the
dependency of users on computer specialists to perform various data related operations in a DBMS
because of its user friendly interface.
The two main disadvantages of using a DBMS:
High Cost: The cost of implementing a DBMS is very high. It is also a very time-consuming
process.
Security and Recovery Overheads: Unauthorized access to a database can pose a threat to the
individual or organization, depending on the data stored. Also, the data must be regularly backed up to
prevent its loss due to fire, earthquakes, etc.
Hence the DBMS approach is usually not preferred when the database is small, well defined, less frequently
changed and used by few users.
Relational Database
The relational data model was proposed by E. F. Codd at IBM in 1970.
It organizes data as a collection of relations.
In this model, tables are called Relations, so each relation corresponds to a table of values.
A relation stores data in rows and columns.
Each Relation can have multiple columns where each column name should be unique.
Each column name is used to interpret the meaning of that data in each row.
Each row in the Relation represents a related set of values.
For example, each row in the EMPLOYEE table represents facts about a particular employee.
The column names (Name, Employee_ID, Gender, Salary and Date_of_Birth) specify how to interpret
the data in each row.
Commonly used terminologies in Relational Data Model
A table is called a Relation.
A row is called a Tuple.
A column is called an Attribute.
The data type of values in each column is called the Domain.
The number of attributes in a relation is called the Degree of a relation.
The number of rows in a relation is called the Cardinality of a relation.
Relation Schema R is denoted by R(A1, A2, A3, ..., An), where R is the relation name and
A1, A2, A3, ..., An is the list of attributes.
EMPLOYEE table is a relation.
There are three tuples in EMPLOYEE relation.
Name, Employee_ID, Gender, Salary and Date_of_Birth are attributes.
In order to specify a domain, we specify the data type of that attribute.
Following are the domains of the attributes of the EMPLOYEE relation:
(a) Name – Set of character strings representing names of persons.
(b) Employee_ID – Set of 4-digit numbers.
(c) Gender – Male or Female.
(d) Salary – Number.
(e) Date_of_Birth – Should have a valid date, month and year. The birth year of the employee must be
greater than 1985. Also the format should be dd-mm-yyyy.
The degree of the EMPLOYEE relation is 5 as there are five attributes in this relation.
The cardinality of the EMPLOYEE relation is 3 as there are three tuples in this relation.
Relation Schema – EMPLOYEE (Name, Employee_ID, Gender, Salary, Date_of_Birth)
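Degree and cardinality can be counted directly once a relation is written down. In the sketch below, the attribute names come from the schema above, but the three tuples are illustrative values chosen to fit the stated domains; the handout's own table is not reproduced here.

```python
# Attributes of the EMPLOYEE relation, as given in the schema.
attributes = ("Name", "Employee_ID", "Gender", "Salary", "Date_of_Birth")

# Three illustrative tuples (hypothetical values that respect the domains).
employee = [
    ("Neha", 1121, "Female", 20000, "04-03-1990"),
    ("Paras", 2134, "Male", 25000, "19-10-1993"),
    ("Himani", 3145, "Female", 20000, "23-11-1992"),
]

degree = len(attributes)     # number of attributes (columns)
cardinality = len(employee)  # number of tuples (rows)

print("Degree:", degree)
print("Cardinality:", cardinality)
```

Adding a tuple changes the cardinality but not the degree; adding an attribute changes the degree.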
Some characteristics of Relations:
Ordering of tuples is not important in a Relation.
The ordering of attributes is also unimportant.
No two tuples of a relation should be identical, i.e. given any pair of tuples, the value in at least one
column must be different.
The value in each tuple is an atomic value (indivisible).
If the value of an attribute in a tuple is not known, not applicable or not available, a special
value called null is used to represent it.
Constraints: The relational data model imposes restrictions, or constraints, on the values of
attributes and on how the contents of one relation may be referred to from another relation. Constraints
are restrictions on the values stored in a database, based on the requirements, and are specified at the
time of defining the database.
For example, in the relation EMPLOYEE, the Employee_ID must be a 4-digit number, and the
Date_of_Birth must be such that the birth year > 1985.
Constraints in Relational Model:
Domain Constraint: It specifies that the value of every attribute in each tuple must be from the domain of
that attribute. For example, the Employee_ID must be a 4-digit number. Hence a value such as “12321” or
“A234” violates the domain constraint, as the former is not four digits long and the latter contains a letter.
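A domain check can be sketched as a small validation function. The function names below are hypothetical; the rules are the ones stated for Employee_ID and Date_of_Birth in this handout.

```python
from datetime import datetime


def is_valid_employee_id(value):
    """Return True if the value consists of exactly four digits."""
    text = str(value)
    return len(text) == 4 and text.isascii() and text.isdigit()


def is_valid_date_of_birth(text):
    """Return True for a real calendar date in day-month-year order with year > 1985."""
    try:
        dob = datetime.strptime(text, "%d-%m-%Y")
    except ValueError:
        return False  # not a valid date, e.g. 31-02-1990
    return dob.year > 1985


print(is_valid_employee_id("1121"))
print(is_valid_employee_id("12321"))
print(is_valid_employee_id("A234"))
print(is_valid_date_of_birth("04-03-1990"))
```

A DBMS performs checks of this kind automatically whenever a value is inserted or updated.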
Key Constraint:
Keys are a very important part of the relational database model. They are used to establish and identify
relationships between tables and also to uniquely identify any record or row of data inside a table. A
key can be a single attribute or a group of attributes, where the combination acts as a key.
A superkey is a set of attributes in a relation for which no two tuples have the same combination of
values. Every relation has at least one superkey: the combination of all attributes in the
relation.
Thus for the EMPLOYEE relation, following are some of the superkeys:
{Name, Employee_ID, Gender, Salary, Date_of_Birth} - default superkey consisting of all attributes.
{Name, Employee_ID, Date_of_Birth}
{Employee_ID, Gender, Salary}
{Name, Employee_ID, Gender}
{Employee_ID}
However, {Gender, Salary} is not a superkey because both these attributes have identical values for
employees Neha and Himani.
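The superkey test can be sketched as a function that looks for repeated value combinations. The rows are illustrative values, chosen so that Neha and Himani share the same Gender and Salary as stated above. Note that this only checks the tuples currently present; a real superkey must hold for every valid state of the relation.

```python
# Illustrative EMPLOYEE tuples (hypothetical values).
employee = [
    {"Name": "Neha", "Employee_ID": 1121, "Gender": "Female", "Salary": 20000},
    {"Name": "Paras", "Employee_ID": 2134, "Gender": "Male", "Salary": 25000},
    {"Name": "Himani", "Employee_ID": 3145, "Gender": "Female", "Salary": 20000},
]


def is_superkey(rows, attrs):
    """Return True if no two rows share the same values for attrs."""
    seen = set()
    for row in rows:
        combination = tuple(row[a] for a in attrs)
        if combination in seen:
            return False  # two tuples agree on every attribute in attrs
        seen.add(combination)
    return True


print(is_superkey(employee, ["Employee_ID"]))
print(is_superkey(employee, ["Name", "Employee_ID", "Gender"]))
print(is_superkey(employee, ["Gender", "Salary"]))
```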
Candidate Key: A relation can have one or more attributes that take distinct values in every tuple. Any of
these attributes can be used to uniquely identify the tuples in the relation. Such attributes are called
candidate keys, as each of them is a candidate for the primary key. A relation may have more than one
candidate key. Consider the relation PERSON with the following schema: PERSON (Aadhar_no, PAN,
Voter_ID_no, Name, Date_of_birth, Address). This relation has three keys, namely {Aadhar_no},
{PAN} and {Voter_ID_no}, as each person's Aadhar, PAN and Voter ID numbers are unique.
So the PERSON relation has three candidate keys.
Primary Key: One of the candidate keys is designated as the primary key. The primary key is used to
identify tuples in a relation. If a relation has many candidate keys, it is preferable to choose the one
with the least number of attributes as the primary key. The primary key is usually underlined in the
schema of the relation. For example, in the relation schema PERSON (Aadhar_no, PAN, Voter_ID_no, Name,
Date_of_birth, Address), Aadhar_no is the primary key.
Alternate Key: The candidate keys which are not designated as the
primary key are called Alternate Keys. In the relation PERSON, if Aadhar_no is chosen as the primary
key, then PAN and Voter_ID_no will be called the alternate keys.
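One way to declare these keys is sketched below, assuming SQLite via Python's sqlite3 module. Aadhar_no is declared as the primary key and the two alternate keys are declared UNIQUE, so the DBMS rejects a repeated value in any of them. All values in the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person ("
    " aadhar_no TEXT PRIMARY KEY NOT NULL,"  # primary key
    " pan TEXT UNIQUE NOT NULL,"             # alternate key
    " voter_id_no TEXT UNIQUE NOT NULL,"     # alternate key
    " name TEXT, date_of_birth TEXT, address TEXT)"
)
conn.execute(
    "INSERT INTO person VALUES "
    "('111122223333', 'ABCDE1234F', 'XYZ1234567', 'Asha', '01-01-1990', 'Delhi')"
)

# A second person with a different Aadhar number but the same PAN is rejected.
try:
    conn.execute(
        "INSERT INTO person VALUES "
        "('444455556666', 'ABCDE1234F', 'PQR7654321', 'Ravi', '02-02-1991', 'Pune')"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print("Duplicate PAN rejected:", duplicate_rejected)
```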
Foreign Key: A foreign key is used to represent the relationship between two relations. A foreign key in one
table refers to the primary key in another table. The referencing relation is called the foreign relation. The
relation in which the referenced primary key is defined is called the primary relation or master relation. A
foreign key can have null values but a primary key cannot. A foreign key need not have unique values.
Null Value Constraint: Sometimes it is required that certain attributes cannot have null values. For example,
if every EMPLOYEE must have a valid name then the Name attribute is constrained to be NOT NULL.
Entity Integrity Constraint: This constraint specifies that the primary key of a relation cannot have a null
value. The reason is that the primary key is used to identify tuples and must contain no duplicates. If null
values were allowed for a primary key, there could be multiple tuples whose primary key is null.
Such tuples could not be told apart, which violates the definition of a primary key.
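The null value constraint and the entity integrity constraint can both be sketched as checks on a single tuple. The function name is hypothetical, and Python's None stands in for null.

```python
def check_null_constraints(row, primary_key, not_null):
    """Return a list of constraint violations for one tuple (a dict)."""
    problems = []
    for attr in primary_key:
        if row.get(attr) is None:
            problems.append("entity integrity: primary key " + attr + " is null")
    for attr in not_null:
        if row.get(attr) is None:
            problems.append("null value constraint: " + attr + " is null")
    return problems


good_row = {"Employee_ID": 1121, "Name": "Neha"}
no_key_row = {"Employee_ID": None, "Name": "Himani"}
no_name_row = {"Employee_ID": 2134, "Name": None}

for row in (good_row, no_key_row, no_name_row):
    print(check_null_constraints(row, ["Employee_ID"], ["Name"]))
```

An empty list means the tuple satisfies both constraints.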
Referential Integrity Constraint: This constraint is specified between two relations. The main purpose
of this constraint is to check that data entered in one relation is consistent with the data entered in
another relation. A foreign key value must either be null or match a primary key value in the
referenced relation.
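A foreign key and the referential integrity check can be sketched as follows, assuming SQLite via Python's sqlite3 module. The department and employee tables and their rows are hypothetical. SQLite enforces foreign keys only after the PRAGMA shown is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable foreign key enforcement in SQLite

# Master (primary) relation.
conn.execute(
    "CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL)"
)
# Foreign relation: dept_id refers to the primary key of department.
conn.execute(
    "CREATE TABLE employee ("
    " employee_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " dept_id INTEGER REFERENCES department(dept_id))"
)

conn.execute("INSERT INTO department VALUES (10, 'Sales')")
conn.execute("INSERT INTO employee VALUES (1121, 'Neha', 10)")     # matches department 10
conn.execute("INSERT INTO employee VALUES (2134, 'Paras', NULL)")  # null foreign key is allowed

# Department 99 does not exist, so this row breaks referential integrity.
try:
    conn.execute("INSERT INTO employee VALUES (3145, 'Himani', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print("Orphan row rejected:", orphan_rejected)
```

The row with a null dept_id is accepted, while the row pointing to a missing department is refused.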