Introduction to Databases
Dr. Enosha Hettiarachchi
[email protected]
Learning Outcomes:
Upon completion of this course, students will be able to do the following:
LO1. Describe the role of a database system and the functions of a database administrator.
LO2. Explain the ANSI-SPARC three-schema architecture for databases and differentiate
between conceptual, external, and internal schemas.
LO3. Define fundamental concepts of the relational model, including relations, attributes,
domains, keys, foreign keys, entity integrity, and referential integrity.
LO4. Analyze database requirements and evaluate the suitability of different database
designs.
LO5. Create Entity-Relationship diagrams for given scenarios and translate conceptual
models into relational schemas.
LO6. Demonstrate the process of normalization and apply normalization techniques to
relations.
LO7. Formulate SQL queries of varying complexity to retrieve and manipulate data.
LO8. Define the concept of views in database systems, create data views using SQL, and
evaluate the advantages and disadvantages of views.
LO9. Develop database applications using SQL and other relevant tools and technologies.
References
Fundamentals of Database Systems, by Ramez Elmasri and Shamkant B. Navathe
An Introduction to Database Systems, by C.J. Date
Database Systems: A Practical Approach to Design, Implementation, and Management, Third Edition, by Thomas Connolly and Carolyn Begg
Rubric:
70% Final Examination
30% Assignments
Structure
Why use a Database?
Components of Database System Environment
File-based Systems and Limitations
Introduction to DB approach
Data Hierarchy
Introduction to meta data
DB applications
DB approach advantages & disadvantages
Why use a Database?
Many people collect things
– How about you?
If you collect anything, you are probably familiar with
some of the problems of managing a collection
– e.g. stamps, photos, paper cuttings
One way to keep track of a collection is to create a
database
Why Database Technology?
The need to manipulate large collections of data for
frequently used queries and reports.
E.g. Collection of information on library books
Queries:
– List of books written by a particular author
– List of books about a particular subject
– Borrowing a book
– Reserving a book for borrowing
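As a rough sketch of how some of these library queries might look once the data is in a database, the following uses Python's built-in sqlite3 module. The `book` table, its columns, the sample rows and the helper function names are all assumptions made for illustration; they are not part of the course material.

```python
import sqlite3

# In-memory database with a hypothetical table of library books.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book ("
    "book_id INTEGER PRIMARY KEY, title TEXT, author TEXT, "
    "subject TEXT, borrowed_by TEXT)"
)
conn.executemany(
    "INSERT INTO book (title, author, subject, borrowed_by) VALUES (?, ?, ?, ?)",
    [
        ("Database Basics", "Perera", "Databases", None),
        ("Relational Design", "Perera", "Databases", None),
        ("Network Primer", "Silva", "Networks", None),
    ],
)


def books_by_author(author):
    """List of books written by a particular author."""
    rows = conn.execute(
        "SELECT title FROM book WHERE author = ? ORDER BY title", (author,)
    )
    return [title for (title,) in rows]


def books_about(subject):
    """List of books about a particular subject."""
    rows = conn.execute(
        "SELECT title FROM book WHERE subject = ? ORDER BY title", (subject,)
    )
    return [title for (title,) in rows]


def borrow(title, member):
    """Borrow a book; succeeds only if the book is not already on loan."""
    cursor = conn.execute(
        "UPDATE book SET borrowed_by = ? WHERE title = ? AND borrowed_by IS NULL",
        (member, title),
    )
    return cursor.rowcount == 1
```

Each function corresponds to one of the queries listed above; reserving a book could be handled in a similar way with an extra column or table.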
Examples of Database Applications
Purchases from the supermarket
Purchases using your credit card
Booking a holiday at the travel agent's (airline
reservation)
Using the Internet
Studying at university
Hospital admissions
Borrowing books from the library
Library
Membership
Reference
Borrow
Return
Order
Airline Reservations
Availability of seats on a flight, ticket booking, issuing,
reconfirmation
Hotel Reservations
Availability of Rooms,
reservation
Hospital System
Doctor’s information, speciality
Ward information, no. of beds
Theatre information, facilities
Patient information, admittance information
Consultation information, booking
Listing
Read each record and print it, in time sequence.
Components of Database System
Environment
Hardware
Set of physical devices on which a database
resides. Can range from a PC to a network of
computers.
Software
– Database Management System (DBMS)
– Operating System
– Application Programs
– User Interface
Components of Database System
Environment - Data
Data
– A representation of facts, concepts or
instructions in a formalised manner suitable
for communication, interpretation or
processing by human beings or by automatic
means.
– Text, colours, symbols, shapes, graphics,
images, temperatures, sound, video or other
facts and figures are data suitable for
processing.
E.g. Person or Employee or Customer
name, address, phone, date of birth,
designation, department, salary,
employee no, photograph
Information
Knowledge derived from data
Processed or organised or summarised data
E.g.
• Process date of birth -> age
• Process salary (all employees) -> highest-paid employee
• Process all records -> number of employees
• Process all records -> employees working for the Sales division
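The step from data to information can be sketched in a few lines of Python. The employee names, salaries and departments below follow the Employee example used elsewhere in these slides; the dates of birth and the fixed reference date are invented purely for illustration.

```python
from datetime import date

# Raw data: one dictionary per employee (dates of birth are hypothetical).
employees = [
    {"name": "De Silva", "date_of_birth": date(1975, 3, 14),
     "salary": 50000, "department": "Personnel"},
    {"name": "Perera", "date_of_birth": date(1990, 11, 2),
     "salary": 15000, "department": "Personnel"},
    {"name": "Dias", "date_of_birth": date(1985, 7, 30),
     "salary": 25000, "department": "Sales"},
]


def age_on(date_of_birth, today):
    """Age in whole years on the given day."""
    had_birthday = (today.month, today.day) >= (date_of_birth.month, date_of_birth.day)
    return today.year - date_of_birth.year - (0 if had_birthday else 1)


# A fixed reference date keeps the derived ages reproducible.
today = date(2024, 1, 1)

# Information derived by processing the data.
ages = {e["name"]: age_on(e["date_of_birth"], today) for e in employees}
highest_paid = max(employees, key=lambda e: e["salary"])["name"]
employee_count = len(employees)
sales_staff = [e["name"] for e in employees if e["department"] == "Sales"]
```

None of the four results is stored anywhere; each is computed from the stored data when it is needed.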
Components of Database System
Environment
Procedures
Instructions and rules that should be applied to the
design and use of the database.
People
Two different types of people (end-users and
practitioners) are concerned with the database.
End-Users
– are the ‘clients’ for the database, who need information from the
database to carry out their duties.
e.g. Executives, managers, staff, clerical personnel
Components of Database System
Environment - People
Practitioners
– People responsible for the database
system and its associated application
software.
e.g. Data and Database administrators, Database designers,
Application developers.
People - Job Definitions
Data Administration: A high-level function that is
responsible for the overall management of data
resources in an organization, including
maintaining corporate-wide definitions and
standards.
Database Administration: A technical function
that is responsible for physical database design
and for dealing with technical issues such as
security enforcement, database performance,
and backup and recovery.
Database Administration Functions
Often, some mixture of these duties
– Selection of hardware and software
– Installing/upgrading DBMS
– Tuning database performance
– Improving query processing performance
– Managing data security, privacy, and integrity
– Data backup and recovery
Manual Systems – Information on library books
Before and during most of the last century,
libraries used card catalogues stored in drawers
of special cabinets
– cards with typed book information
E.g. the title index has one card for every book in the library
File-Based Systems
Collection of application programs that perform
services for the end users (e.g. reports).
Each program defines and manages its own data.
Personnel Application -> Employee file: Name, Address, NID number, Designation
Payroll Application -> Payroll file: Name, Address, Hours worked, Pay rate
Data Redundancy
Personnel Application -> Employee file: Name, Address, NID number, Designation
Payroll Application -> Payroll file: NID, Name, Address, Hours worked, Pay rate
Benefit Application -> Benefit files: Name, NID, Address, Insurance, Pension plan
Limitations of File-Based Approach
Separation and isolation of data
– Each program maintains its own set of data.
– Users of one program may be unaware of
potentially useful data held by other programs.
Duplication of data
– Same data is held by different programs.
– Wasted space and potentially different values
and/or different formats for the same item.
Data dependence
– File structure is defined in the program code.
Incompatible file formats
– Programs are written in different languages, and so
cannot easily access each other’s files.
Fixed Queries/Proliferation of application programs
– Programs are written to satisfy particular functions.
– Any new requirement needs a new program.
Program-Data/Data Dependence
program Customer-Entry
....
type customer = record
    customer-id: string;
    customer-name: string;
    customer-street: string;
    customer-city: string;
end;
....

program Customer-Orders
....
type customer = record
    customer-id: string;
    customer-name: string;
    customer-street: string;
    customer-city: string;
end;
....

Customer records:
10 | Perera | Galle Road | Colombo 04
12 | Silva | Reid Drive | Colombo 07
Database Approach
Arose because:
– Definition of data was embedded in application
programs, rather than being stored separately and
independently.
– No control over access and manipulation of data beyond
that imposed by application programs.
Result:
– the database and Database Management System
(DBMS).
Database
Shared collection of logically related data (and a
description of this data), designed to meet the
information needs of an organization.
System catalog or data dictionary provides
description of data (metadata) to enable
program-data independence.
Logically related data comprises entities, attributes,
and relationships of an organization’s information.
Database Management System (DBMS)
A software system that enables users to
define, create, and maintain the database
and that provides controlled access to this
database.
E.g. MySQL, Oracle, Access, Microsoft SQL
Server, PostgreSQL, IBM DB2, etc.
Database Approach
Applications (Personnel, Payroll, Benefit) -> DBMS -> Files (File 1, File 2, File 3)
e.g. Integrated human resources database
– Personnel: Name, Address, NID number, Designation
– Payroll: Hours worked, Pay rate
– Benefit: Insurance, Pension plan
Data Hierarchy
Figure: Database -> Files -> Records -> Fields -> Bytes -> Bits, with entities/objects and attributes labelled on the diagram.
Data Hierarchy
Employee (Empno, Name, Designation, Salary, Depart)
Empno  Name      Designation  Salary  Depart
1      De Silva  Manager      50000   Personnel
2      Perera    Secretary    15000   Personnel
3      Dias      Salesman     25000   Sales

Department (Depart, Manager, Dept Addr, Dept Phone)
Depart     Manager   Dept Addr  Dept Phone
Personnel  De Silva  Colombo    589123
Sales      Alwis     Kandy      987275
….         ….        ….         ….
Data Hierarchy
Record
• E.g. (Empno, Name, Designation, Salary, Department):
  2 | Perera | Manager | 35,000 | Personnel
Field
• Field 1 | Field 2 | Field 3 | Field 4
Byte
• A single character (letter, number, symbol) is represented using a
group of bits, e.g. 01001010 is the letter J in ASCII
Bit
• The smallest unit of data, e.g. 0 or 1
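A quick way to check the bit pattern behind a character is sketched below in Python. The helper names `to_bits` and `from_bits` are made up for this illustration, and the example assumes plain ASCII characters stored one per byte.

```python
def to_bits(character):
    """Return the 8-bit pattern (one byte) for a single ASCII character."""
    return format(ord(character), "08b")


def from_bits(bits):
    """Return the character represented by a string of bits."""
    return chr(int(bits, 2))


# The letter J has ASCII code 74, which is 01001010 in binary.
j_code = ord("J")
j_bits = to_bits("J")
```

Reading the pattern back with `from_bits(j_bits)` gives the original character again, which shows that a byte is just a group of eight bits interpreted according to a character code.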
Data Dictionary/System Catalog
A subsystem that keeps track of the definitions of
data items in the database, which include:
• Elementary-level data items (fields/attributes).
• Relationships that exist between various data
structures.
• Files or relational tables.
• Indexes that are used to access data quickly.
Meta Data
Data that describe the properties or
characteristics of other data.
Some of these properties include the name of the
data item, data type, length, minimum and maximum
allowable values (where appropriate), rules or
constraints and a brief description of each data item.
Metadata allow database designers and users to
understand what data exist and what the data mean.
Data without clear meaning can be confusing,
misinterpreted or erroneous.
Meta Data
E.g. Employee
Name    Type       Length  Min   Max    Description
EmpNo   Number     9                    Employee No.
Name    Character  30                   Employee Name
Dept    Character  10                   Dept. No.
Salary  Number     8       5000  60000  Employee Salary
Note: Employee No. (ID) is unique.
Table Definition
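One possible table definition for the Employee metadata is sketched below, using Python's built-in sqlite3 module. The column names and the choice of SQLite are assumptions; SQLite keeps the declared lengths such as VARCHAR(30) in its catalog but does not enforce them, and other DBMSs use different type names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A hypothetical definition: names, types, the unique key and the salary range.
conn.execute(
    """
    CREATE TABLE employee (
        emp_no INTEGER PRIMARY KEY,   -- Employee No., unique
        name   VARCHAR(30) NOT NULL,  -- Employee Name
        dept   VARCHAR(10),           -- Dept. No.
        salary INTEGER CHECK (salary BETWEEN 5000 AND 60000)
    )
    """
)

# The DBMS stores this description itself (metadata) and can report it back.
# Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
columns = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")
]
```

Reading `columns` back shows that the definition is held by the database rather than by any application program.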
Database Approach
Data Definition Language (DDL).
– Permits specification of data types, structures and
any data constraints.
– All specifications are stored in the database.
Data Manipulation Language (DML).
– Is used for selecting, inserting, deleting and
updating data in a database.
– Provides a general enquiry facility (query language)
for the data.
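The difference between the two languages can be seen in a short sketch using Python's built-in sqlite3 module. The table and sample rows are assumptions chosen to match the Employee example in these slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure of the data.
conn.execute(
    "CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)

# DML: insert rows.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "De Silva", 50000), (2, "Perera", 15000), (3, "Dias", 25000)],
)

# DML: update and delete rows.
conn.execute("UPDATE employee SET salary = 16000 WHERE emp_no = 2")
conn.execute("DELETE FROM employee WHERE emp_no = 3")

# DML: select (query) the remaining rows.
remaining = conn.execute(
    "SELECT emp_no, name, salary FROM employee ORDER BY emp_no"
).fetchall()
```

Only the CREATE TABLE statement is DDL; the INSERT, UPDATE, DELETE and SELECT statements are all DML.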
Database Approach
Controlled access to database may
include:
– A security system.
– An integrity system.
– A concurrency control system.
– A recovery control system.
Data Security
The database is a valuable resource needing
protection.
The DBMS provides database security by limiting
access to the database to authorised personnel.
Authorised users will generally be restricted as to the
particular data they can access and whether they can
update it.
Access is often controlled by passwords and by data
views.
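A data view that hides sensitive columns might look like the sketch below, written with Python's built-in sqlite3 module. The view name `employee_directory` and the table layout are assumptions. SQLite has no user accounts or GRANT statement, so this only shows the view itself; in a multi-user DBMS the view would be combined with privileges granted to particular users.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "emp_no INTEGER PRIMARY KEY, name TEXT, salary INTEGER, depart TEXT)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "De Silva", 50000, "Personnel"), (3, "Dias", 25000, "Sales")],
)

# The view exposes names and departments but leaves out the salary column.
conn.execute(
    "CREATE VIEW employee_directory AS "
    "SELECT emp_no, name, depart FROM employee"
)

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY emp_no")
view_columns = [column[0] for column in cursor.description]
directory = cursor.fetchall()
```

A user who is given access only to `employee_directory` can look up colleagues without ever seeing salaries.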
Data Integrity
The integrity and consistency of the database are
protected via constraints on values that data items
can have.
Data constraint definitions are maintained in the data
dictionary.
Concurrency control
Concurrency control in a database management
system (DBMS) ensures that transactions can run
concurrently without violating the integrity of
the data.
Recovery Control
Backup and recovery are supported by software that
automatically logs changes to the database and
provides for a means of recovering the current state
of the database in case of system failure.
What happens after a power failure in a bank?
What happens to the day's transactions?
What if a bomb or flood destroys your computer system?
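A small-scale analogue of recovery is the rollback of an incomplete transaction. The sketch below uses Python's built-in sqlite3 module with a hypothetical `account` table and a simulated failure; recovery after a real crash relies on the DBMS's own log and backups, which this example does not show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acc_no INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 1000), (2, 500)])
conn.commit()


def transfer(amount, fail_midway=False):
    """Move money from account 1 to account 2 as a single transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE acc_no = 1",
                (amount,),
            )
            if fail_midway:
                raise RuntimeError("simulated failure between the two updates")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE acc_no = 2",
                (amount,),
            )
        return True
    except RuntimeError:
        return False


def balances():
    """Current balances, ordered by account number."""
    return conn.execute(
        "SELECT acc_no, balance FROM account ORDER BY acc_no"
    ).fetchall()
```

When `transfer(200, fail_midway=True)` is called, the first update is undone and the balances stay as they were, so the money is never lost halfway between the two accounts.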
Database Server Architecture
Client/server platform
– A local area network consisting of client computers
which receive services from a server computer.
Database server
– A program running on server hardware to provide
database services to client machines.
Client/Server Platform
Figure: clients interacting with a multi-user database server.
Database Applications
Databases range from those for a single
user with a desktop computer to those on
mainframe computers with thousands of
users.
Personal databases
Workgroup databases
Departmental databases
Enterprise databases
Personal databases
Designed to support one user with a stand-alone
PC.
E.g. a salesperson keeping track of his or her customer
information with contact details.
Workgroup databases
Designed to support a relatively small team of people
(fewer than 25) who collaborate on the same project or
application.
E.g. a team of engineering designers maintaining versions
of the artifacts that they design.
Departmental databases
A department is a functional unit of an
organisation. It is larger than a workgroup.
Departmental databases are designed to support
the various functions and activities of a
department.
E.g. a personnel database that is designed to track
data concerning employees, jobs, skills and job
assignments.
Enterprise databases
An enterprise database is one whose scope is the entire
organisation or enterprise.
Such databases are intended to support organisation-wide
operations and decision making.
E.g. a large health care organisation that operates a group
of medical centres including hospitals, clinics and nursing
homes.
An enterprise database supports the information needs
of many departments. The most important type of
enterprise database today is called a data warehouse.
Data warehouse
– An integrated decision support database whose content is
derived from the various operational databases.
Database Approach - Advantages
Improved maintenance through program-data
independence
Minimal data redundancy
Improved data consistency
Improved data sharing
Increased productivity
Enforcement of standards
Improved data integrity
Improved data accessibility and
responsiveness
Improved security
Increased concurrency
Improved maintenance through
Program-Data/Data Independence
The separation of data descriptions
(metadata) from the application programs
that use the data.
This simplifies database application
maintenance.
In the database approach, data descriptions
are stored in a central location called the
data dictionary.
This property allows an organisation’s data
to change and evolve (within limits) without
changing the application programs that
process the data.
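The contrast between data dependence and program-data independence can be sketched in Python. The file-based parser below hard-codes the record layout, while the database query names only the columns it needs; the added phone field, its value and all function, table and column names are assumptions for illustration.

```python
import sqlite3


def parse_customer(line):
    """File-based style: the four-field layout is fixed in the program code."""
    customer_id, name, street, city = [part.strip() for part in line.split("|")]
    return {"customer_id": customer_id, "name": name, "street": street, "city": city}


original_line = "10 | Perera | Galle Road | Colombo 04"
changed_line = "10 | Perera | 0110000000 | Galle Road | Colombo 04"  # phone added

parsed = parse_customer(original_line)
try:
    parse_customer(changed_line)
    file_program_survived = True
except ValueError:
    # Five fields no longer fit the hard-coded four-field layout.
    file_program_survived = False

# Database style: the layout lives in the data dictionary, not in the program.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id TEXT, customer_name TEXT, "
    "customer_street TEXT, customer_city TEXT)"
)
conn.execute(
    "INSERT INTO customer VALUES ('10', 'Perera', 'Galle Road', 'Colombo 04')"
)
query = "SELECT customer_name, customer_city FROM customer WHERE customer_id = '10'"

before = conn.execute(query).fetchone()
conn.execute("ALTER TABLE customer ADD COLUMN customer_phone TEXT")
after = conn.execute(query).fetchone()
```

The same query returns the same answer before and after the new column is added, whereas the file-based parser has to be rewritten.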
Minimal Data Redundancy
Data files are integrated into a single, logical
structure. Each primary fact is recorded (ideally)
in only one place in the database.
E.g. employee data is not duplicated in the payroll
and benefit files.
Note: Data redundancy is not eliminated entirely.
Some data items will appear in more than one
place (e.g. employee no.) to represent
relationships with other data.
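The sketch below, using Python's built-in sqlite3 module, shows how a single repeated item can link two tables so that the other facts are stored only once. It reuses the Employee and Department example data from these slides, with the department name as the shared item; the lower-case column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE department ("
    "depart TEXT PRIMARY KEY, manager TEXT, dept_addr TEXT, dept_phone TEXT)"
)
conn.execute(
    "CREATE TABLE employee ("
    "emp_no INTEGER PRIMARY KEY, name TEXT, designation TEXT, salary INTEGER, "
    "depart TEXT REFERENCES department(depart))"
)
conn.executemany(
    "INSERT INTO department VALUES (?, ?, ?, ?)",
    [
        ("Personnel", "De Silva", "Colombo", "589123"),
        ("Sales", "Alwis", "Kandy", "987275"),
    ],
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?, ?)",
    [
        (1, "De Silva", "Manager", 50000, "Personnel"),
        (2, "Perera", "Secretary", 15000, "Personnel"),
        (3, "Dias", "Salesman", 25000, "Sales"),
    ],
)

# The department address is stored once and joined in when it is needed.
rows = conn.execute(
    "SELECT e.name, d.dept_addr, d.dept_phone "
    "FROM employee AS e JOIN department AS d ON d.depart = e.depart "
    "ORDER BY e.emp_no"
).fetchall()
```

Only the `depart` value is repeated in both tables; changing a department's address means updating one row, and every employee in that department sees the new value.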
Improved Data Consistency
By eliminating (or controlling) data
redundancy, we greatly reduce the
opportunities for inconsistency.
E.g. employee address is stored only once
and hence we cannot have disagreement on
the stored values.
Updating data values is greatly simplified, and
wasted storage space is avoided.
Improved Data Sharing
A database is designed as a shared corporate
resource and can be shared by all authorised
users. In this way more users share more of the
data.
E.g. employee data common to the payroll and benefit
applications is shared among different users.
New applications can be built on the existing
data in the database.
Increased Productivity
A major advantage of the database approach is
that it greatly reduces the cost and time for
developing new business applications.
Programmers can concentrate on the specific
functions required for the new application, without
having to worry about design or low-level
implementation details, as the related data has
already been designed and implemented.
– The DBMS provides many of the standard
functions (e.g. form and report generation)
that the programmer would normally have to
write in a file-based application.
Enforcement of Standards
When the database approach is implemented
with full management support, the database
administration function should be granted single-
point authority and responsibility for establishing
and enforcing data standards.
Standards include naming conventions, data
quality standards and uniform procedures for
accessing, updating and protecting data.
Improved Data Integrity
Integrity can be expressed in terms of constraints, which
are consistency rules that the database is not permitted
to violate.
E.g. a member of staff’s salary cannot be greater than
60,000.
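A salary limit of this kind can be declared once in the database and is then enforced for every program that writes to it. The sketch below uses Python's built-in sqlite3 module; the `staff` table and sample values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The constraint is part of the table definition, not of any one application.
conn.execute(
    "CREATE TABLE staff ("
    "staff_no INTEGER PRIMARY KEY, name TEXT NOT NULL, "
    "salary INTEGER CHECK (salary <= 60000))"
)

conn.execute("INSERT INTO staff VALUES (1, 'Perera', 35000)")

try:
    conn.execute("INSERT INTO staff VALUES (2, 'Silva', 75000)")
    rejected = False
except sqlite3.IntegrityError:
    # The DBMS refuses the row because it violates the CHECK constraint.
    rejected = True

staff_count = conn.execute("SELECT COUNT(*) FROM staff").fetchone()[0]
```

The second insert is refused by the DBMS itself, so the invalid salary never reaches the table.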
Improved Data Accessibility and
Responsiveness
With a relational database, end users without
programming experience can often retrieve and display
data, even when it crosses traditional departmental
boundaries.
The English-like query language SQL and query tools such
as Query-By-Example provide such facilities.
Improved Security
A DBMS can be used to enforce database security. This
may take the form of user names and passwords to
identify people authorised to use the database.
The access that the authorised user is allowed on the
data can also be restricted by the operation type
(retrieval, delete, update, insert).
Increased concurrency
Many DBMSs allow users to undertake
simultaneous operations on the database. The
DBMS implements a concurrency control
mechanism that prevents database accesses
from interfering with one another.
Disadvantages of DBMS
Complexity
Size
Cost of DBMS
Additional hardware costs
Cost of conversion
Performance
Higher impact of a failure
-END-