Chapter 11: Databases
Limitations of file-based approach
Using text files compromises data integrity.
There are no built-in functions to detect errors or to validate input.
Data could be entered twice or fields could be left blank, and neither would be detected.
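The missing checks can be seen in a minimal sketch. The customer records are hypothetical, and an in-memory buffer stands in for a text file on disk; the point is only that the file stores whatever it is given.

```python
import csv
import io

# Hypothetical customer records: one duplicate entry and one blank name.
rows = [
    ["C001", "Ana Silva"],
    ["C001", "Ana Silva"],  # the same record entered twice
    ["C002", ""],           # name left blank
]

# An in-memory text buffer stands in for a text file on disk.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)

# Reading the file back returns every row, faulty or not:
# nothing rejected the duplicate or the blank field.
buffer.seek(0)
stored = list(csv.reader(buffer))
```

Any validation would have to be written by hand in every program that touches the file.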
The problem with a single file
You can’t restrict access to some parts of a file while leaving other parts open.
That means that if all data is stored in the same file, there’s a chance of it being seen by people it
wasn’t meant for.
For example, the finance department would be looking for banking details and the recruitment
department would be looking for contact details.
If they are all stored in a single file, all the data is visible to people who have no need to see it.
Data redundancy and inconsistency in multiple files
If you decide to store the data in different files, there’s a chance that the same data will be
duplicated.
You’ll have to store the same data over and over in different files.
This causes inconsistency when one copy is edited and the other copies are not.
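A small sketch of how duplicated data drifts apart. The two dictionaries are hypothetical stand-ins for two separate files that each keep their own copy of a customer address.

```python
# Hypothetical stand-ins for two files holding the same customer's address.
finance_file = {"C001": {"address": "12 Mill Road", "account": "12345678"}}
recruitment_file = {"C001": {"address": "12 Mill Road", "phone": "555-0100"}}

# The customer moves, but the edit is made in only one of the files.
finance_file["C001"]["address"] = "4 Park Lane"

# The two copies now disagree, and nothing in the files says which is correct.
consistent = (
    finance_file["C001"]["address"] == recruitment_file["C001"]["address"]
)
```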
The relational database
The DBMS will not allow a user to enter a primary key value that already exists, which
protects data integrity.
The primary key also provides a unique reference to any attribute value that a query selects.
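The primary key behaviour can be tried out with a minimal sketch, assuming SQLite (through Python’s built-in sqlite3 module) as the DBMS. The Customer table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Hypothetical table: CustomerID is declared as the primary key.
conn.execute(
    "CREATE TABLE Customer (CustomerID TEXT PRIMARY KEY, Name TEXT NOT NULL)"
)
conn.execute("INSERT INTO Customer VALUES ('C001', 'Ana Silva')")

# A second row with the same primary key value is refused by the DBMS.
try:
    conn.execute("INSERT INTO Customer VALUES ('C001', 'Ben Okoro')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The key identifies exactly one row, so a query on it is unambiguous.
matches = conn.execute(
    "SELECT Name FROM Customer WHERE CustomerID = 'C001'"
).fetchall()
```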
A database can hold several tables, which are usually related to one another; each relationship is
implemented by a foreign key that references the primary key of another table.
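A sketch of a foreign key linking two tables, again assuming SQLite as the DBMS. The table names Customer and CustOrder are hypothetical. Note that SQLite only enforces foreign keys after the PRAGMA shown; other DBMSs enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: switch enforcement on

# Hypothetical tables: each order refers to one customer.
conn.execute("CREATE TABLE Customer (CustomerID TEXT PRIMARY KEY, Name TEXT)")
conn.execute(
    """CREATE TABLE CustOrder (
           OrderID INTEGER PRIMARY KEY,
           CustomerID TEXT NOT NULL REFERENCES Customer(CustomerID)
       )"""
)
conn.execute("INSERT INTO Customer VALUES ('C001', 'Ana Silva')")
conn.execute("INSERT INTO CustOrder VALUES (1, 'C001')")

# An order for a customer that does not exist is refused.
try:
    conn.execute("INSERT INTO CustOrder VALUES (2, 'C999')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# The foreign key lets a query combine data from both tables.
joined = conn.execute(
    """SELECT CustOrder.OrderID, Customer.Name
       FROM CustOrder
       JOIN Customer ON CustOrder.CustomerID = Customer.CustomerID"""
).fetchall()
```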
Entity-relationship modelling
Normalization
Normalization is a technique used to design tables from a list of data items.
It can also be used to improve existing tables.
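As a rough illustration only, the sketch below takes a hypothetical flat list of data items in which the customer name is repeated for every order, and splits it into two tables. This shows a single step in the spirit of normalization, not the full procedure.

```python
# Hypothetical flat list: (OrderID, CustomerID, CustomerName, Item).
flat = [
    ("O1", "C001", "Ana Silva", "Pen"),
    ("O2", "C001", "Ana Silva", "Ink"),
    ("O3", "C002", "Ben Okoro", "Pad"),
]

# Customer table: each customer's name is stored once, keyed by CustomerID.
customers = {cust_id: name for _, cust_id, name, _ in flat}

# Order table: keeps only CustomerID as a foreign key back to the customer.
orders = [(order_id, cust_id, item) for order_id, cust_id, _, item in flat]
```

After the split, a change of name is made in one place instead of once per order.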
The Database Management System (DBMS)
The database approach
The three levels of a database are the external, the conceptual, and the internal.
The physical storage of the data is represented here as being on disk.
The details of storage are known only at the internal level which is the lowest in the architecture.
It is controlled by the DBMS software.
The programmers who wrote the DBMS software are the only ones who know the structure of data
storage on the disk.
At the next level up, the conceptual level, there is a single universal view of the database.
It is controlled by the database administrator (DBA), who also has access to the DBMS.
In the ANSI architecture, the conceptual level has a conceptual schema describing the
organization of the data as perceived by the user or programmer.
At the external level there are the individual user and programmer views.
Each view describes which parts of the database are accessible.
A view can support a number of user programs.
The provision of views is important as it allows the DBA to give users appropriate rights (some
may have read-only access, others may access only parts of the database).
This ensures data security.
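A minimal sketch of views, assuming SQLite as the DBMS and a hypothetical Staff table. SQLite has no user accounts or access rights, so this shows only how a view exposes part of the data; in a multi-user DBMS the DBA would combine such views with per-user rights.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table holding both contact and banking details.
conn.execute(
    "CREATE TABLE Staff (StaffID TEXT PRIMARY KEY, Name TEXT, "
    "Phone TEXT, BankAccount TEXT)"
)
conn.execute(
    "INSERT INTO Staff VALUES ('S01', 'Ana Silva', '555-0100', '12345678')"
)

# Each view exposes only the attributes one department needs.
conn.execute(
    "CREATE VIEW RecruitmentView AS SELECT StaffID, Name, Phone FROM Staff"
)
conn.execute(
    "CREATE VIEW FinanceView AS SELECT StaffID, Name, BankAccount FROM Staff"
)

# A program working through RecruitmentView never sees BankAccount.
cursor = conn.execute("SELECT * FROM RecruitmentView")
recruitment_columns = [col[0] for col in cursor.description]
recruitment_rows = cursor.fetchall()
```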
Facilities provided by a DBMS
Some facilities provided by the DBMS will only be relevant to large organizations where the DBA
will manage their use.
One option for the language used to create the database through the DBMS is SQL.
The DBMS provides software tools through a developer interface.
The DBMS provides facilities for the programmer to create a user interface.
It also provides a query processor.
A query allows the manipulation and retrieval of data.
It also provides an option to produce formatted output.
DBMS functions likely to be used by the DBA
The DBA is responsible for setting up the user and programmer views and for defining the
specific access rights.
The data dictionary is hidden from everyone apart from the DBA.
It contains metadata about the data.
This includes the definitions of all the tables and their attributes.
It also holds data about how the physical storage is organized.
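Most DBMSs keep this metadata in a catalog that can itself be queried. The sketch below assumes SQLite, whose sqlite_master table and table_info pragma play a similar role. It is only an analogue: in SQLite the catalog is readable by any user, not reserved for a DBA. The Customer table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer (CustomerID TEXT PRIMARY KEY, Name TEXT NOT NULL)"
)

# sqlite_master lists every table defined in the database.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# table_info returns one row per attribute: (cid, name, type, notnull, default, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Customer)")]
```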
Structured Query Language
SQL is the language provided by the DBMS to support all operations associated with a relational
database.
Data Definition Language (DDL)
This is a part of SQL that’s used for creating and altering tables.
It doesn’t input the data; it just creates the structure.
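A short sketch of DDL statements, assuming SQLite as the DBMS and a hypothetical Customer table. CREATE TABLE and ALTER TABLE define and change the structure; the table stays empty until DML is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: create the structure of a hypothetical table.
conn.execute("CREATE TABLE Customer (CustomerID TEXT PRIMARY KEY, Name TEXT)")

# DDL: alter the existing table by adding an attribute.
conn.execute("ALTER TABLE Customer ADD COLUMN Phone TEXT")

# The structure now has three attributes, but no data has been entered.
column_names = [row[1] for row in conn.execute("PRAGMA table_info(Customer)")]
row_count = conn.execute("SELECT COUNT(*) FROM Customer").fetchone()[0]
```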
Data Manipulation Language (DML)
There are three categories of use for the DML:
Insertion of data into the tables once they have been created.
Removal or modification of data.
Reading data from the table.
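The three categories can be sketched as follows, assuming SQLite as the DBMS. The Customer table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (CustomerID TEXT PRIMARY KEY, Name TEXT)")

# 1. Insertion: INSERT adds rows to the table.
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?)",
    [("C001", "Ana Silva"), ("C002", "Ben Okoro"), ("C003", "Chen Wei")],
)

# 2. Removal: DELETE removes the rows that match the condition.
conn.execute("DELETE FROM Customer WHERE CustomerID = 'C002'")

# 3. Reading: SELECT retrieves data from the table.
remaining = conn.execute("SELECT CustomerID, Name FROM Customer").fetchall()
```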
It is possible to include instructions that dictate how the output is formatted.
The ORDER BY clause tells the system to output the information sorted on a named attribute.
You can use a WHERE clause to further restrict the output.
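A sketch combining both clauses, assuming SQLite as the DBMS. The Customer table, the City attribute and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer (CustomerID TEXT PRIMARY KEY, Name TEXT, City TEXT)"
)
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?)",
    [
        ("C001", "Chen Wei", "Leeds"),
        ("C002", "Ana Silva", "Leeds"),
        ("C003", "Ben Okoro", "York"),
    ],
)

# WHERE keeps only the Leeds customers; ORDER BY sorts them by name.
leeds_by_name = conn.execute(
    "SELECT Name FROM Customer WHERE City = 'Leeds' ORDER BY Name"
).fetchall()
```

ORDER BY sorts in ascending order unless DESC is added after the attribute name.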
Another use of the DML is to modify data.
You can use the UPDATE command to modify data in the table.
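A sketch of UPDATE, assuming SQLite as the DBMS and a hypothetical Customer table. The WHERE clause matters here: without it, UPDATE changes every row in the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (CustomerID TEXT PRIMARY KEY, Phone TEXT)")
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?)",
    [("C001", "555-0100"), ("C002", "555-0101")],
)

# Modify one attribute value in the single row picked out by the primary key.
result = conn.execute(
    "UPDATE Customer SET Phone = '555-0199' WHERE CustomerID = 'C001'"
)
rows_changed = result.rowcount

phones = conn.execute(
    "SELECT CustomerID, Phone FROM Customer ORDER BY CustomerID"
).fetchall()
```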