Database Management System (DBMS) and File-Based Approach
How is a DBMS better than the file-based approach?
Data Redundancy:
File-based Approach: In a file-based system, data is often duplicated across
multiple files or applications, leading to redundancy. For example, if you
store customer information in separate files for sales and marketing, the
same customer details might be repeated in both files.
DBMS Solution: A DBMS eliminates redundancy by centralizing data in one
place, making it easier to update and maintain. In a relational database, you
can have a "Customers" table, and various departments can access it
without duplicating data.
Data Integrity:
File-based Approach: Maintaining data consistency and integrity is
challenging in file-based systems. Errors or inconsistencies can occur when
different files are updated independently. For instance, a customer's
address change might be recorded in one file but not in another.
DBMS Solution: A DBMS enforces data integrity by using constraints, such
as unique keys and foreign keys, to ensure that data remains accurate and
consistent across the database. Because each fact is stored once and
referenced by key, a change made in one place is seen everywhere the data
is used.
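A minimal sketch of how such constraints reject inconsistent data, using Python's built-in sqlite3 module. The table and column names (customers, orders, email) are assumptions made for illustration, and SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, email TEXT UNIQUE)"
)
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(customer_id))"
)
conn.execute("INSERT INTO customers VALUES (101, 'ahmed@example.com')")

rejected = []
bad_statements = [
    "INSERT INTO customers VALUES (102, 'ahmed@example.com')",  # duplicate unique key
    "INSERT INTO orders VALUES (1, 999)",  # customer 999 does not exist
]
for sql in bad_statements:
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError as exc:
        rejected.append(str(exc))  # the DBMS refused the inconsistent row

print(rejected)
```

Both bad statements are refused by the database itself, so no application has to remember to perform these checks.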
Data Security:
File-based Approach: File-based systems often lack robust security
features, making it easier for unauthorized access to data. Access control
and data protection are limited.
DBMS Solution: A DBMS offers access control mechanisms, user
authentication, and encryption to enhance data security. For example, in a
DBMS, you can define user roles and permissions, restricting access to
sensitive information.
Data Retrieval and Querying:
File-based Approach: Retrieving specific data from multiple files can be
time-consuming and inefficient. Querying is often limited or not supported,
making it difficult to extract valuable insights.
DBMS Solution: A DBMS provides powerful querying capabilities through
SQL or other query languages. You can easily retrieve, filter, and analyze
data from the database, making it suitable for generating reports and
performing data analysis.
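As a small illustration of querying, the sketch below filters, groups, and sorts data in a single SQL statement. It uses Python's built-in sqlite3 module; the sales table and its figures are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 120.0), ("South", 80.0), ("North", 50.0), ("East", 30.0)],
)

# Filter, group, and sort in one declarative statement.
report = conn.execute(
    "SELECT region, SUM(amount) AS total"
    " FROM sales"
    " WHERE amount >= ?"
    " GROUP BY region"
    " ORDER BY total DESC",
    (40,),
).fetchall()
print(report)
```

In a file-based system the same report would usually need a custom program that opens each file and loops over its lines.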
Data Scalability and Performance:
File-based Approach: As data grows, file-based systems can become slow
and inefficient. Adding or managing large amounts of data becomes a
challenging task.
DBMS Solution: A DBMS is designed to handle large volumes of
data efficiently. It provides mechanisms for optimizing performance, such
as indexing and caching, allowing you to scale your data without significant
performance degradation.
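The effect of indexing can be sketched with SQLite's `EXPLAIN QUERY PLAN`, which reports how a query will be executed. The customers table, its generated data, and the index name idx_customers_city are assumptions for illustration; the exact wording of the plan differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO customers (name, city) VALUES (?, ?)",
    [(f"customer{i}", f"city{i % 100}") for i in range(10_000)],
)

query = "SELECT name FROM customers WHERE city = 'city7'"

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

plan_before = plan(query)  # without an index, every row must be scanned
conn.execute("CREATE INDEX idx_customers_city ON customers(city)")
plan_after = plan(query)   # with the index, matching rows are looked up directly
print(plan_before)
print(plan_after)
```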
What is a Relational Database? Define the terms.
A relational database is like a digital filing system for organizing data. It stores
data about entities (objects or concepts) in tables (grids of rows and columns).
Each table has fields (columns) for specific pieces of information. A record is a
full row in a table, holding all details about one instance of an entity. Tuples and
attributes are just different words for rows and columns, like in a spreadsheet.
Below is a simple diagram with labels for clarity:
Entity: A general object or concept you want to store data about.
Table: A grid-like structure where you store data related to a specific entity.
Field: A column in a table, representing a specific piece of information.
Record: A row in a table containing all details about an entity.
Tuple: Another word for a row or record in the table.
Attribute: Another term for a column or field in the table.
Define the four keys and briefly describe their differences.
Candidate Key:
A candidate key is a unique identifier in a database table that can potentially
serve as a primary key. It's a candidate for being the main reference point for
accessing data, but multiple candidate keys may exist, and one is chosen as the
primary key.
Primary Key:
A primary key is the main unique identifier in a database table, ensuring each
record has a distinct value. It enforces data integrity, allowing efficient data
retrieval. The primary key serves as the primary reference point for accessing and
identifying records.
Secondary Key:
A secondary key is an alternative key in a database table, not used as the primary
identifier. It helps speed up data retrieval by providing additional indexes for
commonly queried columns, but it may not be unique.
Foreign Key:
A foreign key is a field in one table that links to the primary key in another table,
creating a relationship between them. It's used to maintain data consistency and
enforce referential integrity, ensuring that data in the linked tables stays in sync.
Differences between Key Types:
1. A Candidate Key can become a Primary Key, while a Secondary Key is an
alternative identifier.
2. A Primary Key is unique and enforces data integrity, while a Secondary Key
provides additional indexes for querying.
3. A Foreign Key establishes relationships between tables, referencing the
Primary Key of another table.
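The four kinds of key can be seen together in one small schema. This is a sketch using Python's built-in sqlite3 module; the students and classes tables, their columns, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE classes (
    class_id INTEGER PRIMARY KEY
);
CREATE TABLE students (
    student_id INTEGER PRIMARY KEY,      -- primary key: the chosen candidate key
    email      TEXT NOT NULL UNIQUE,     -- another candidate key: also unique
    last_name  TEXT NOT NULL,
    class_id   INTEGER REFERENCES classes(class_id)  -- foreign key
);
-- secondary key: a non-unique index on a commonly searched column
CREATE INDEX idx_students_last_name ON students(last_name);
""")
conn.execute("INSERT INTO classes VALUES (1)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?, ?)",
    [(1, "a.khan@example.com", "Khan", 1), (2, "b.khan@example.com", "Khan", 1)],
)

# Two students may share a secondary-key value such as last_name.
khans = conn.execute(
    "SELECT student_id FROM students WHERE last_name = 'Khan' ORDER BY student_id"
).fetchall()

# A candidate-key value such as email may not be repeated.
try:
    conn.execute("INSERT INTO students VALUES (3, 'a.khan@example.com', 'Ali', 1)")
    duplicate_email_allowed = True
except sqlite3.IntegrityError:
    duplicate_email_allowed = False
```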
What are Relationships in a DBMS? Role of Foreign Key
Relationship:- A database can have 2 or more tables in it which can form a
relationship for which it is necessary to have one table’s foreign key to be
another’s primary key like Class ID in the following diagram
There are three main relationships
1. One-to-One:-
In a one-to-one relationship, each record in one table is associated with
only one record in another table, and vice versa. This is less common but
useful for cases where two entities have a unique, direct relationship.
2. One-to-Many:-
In this relationship, one record in the first table can be associated with
multiple records in the second table, but each record in the second table is
associated with only one record in the first table.
The relationship between the Class table and the Student table is
one-to-many: the same ClassID can appear many times in the Student table,
but it appears only once in the Class table.
3. Many-to-Many:-
In a many-to-many relationship, multiple records in one table can be
associated with multiple records in another table.
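A sketch of how these relationships are usually built with foreign keys, using Python's built-in sqlite3 module. ClassID and StudentID follow the notes; the club and membership tables, and all sample rows, are assumptions added to show the common link-table approach to many-to-many relationships.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE class (ClassID INTEGER PRIMARY KEY, ClassName TEXT);
-- One-to-many: many students point at one class through the foreign key ClassID.
CREATE TABLE student (
    StudentID INTEGER PRIMARY KEY,
    Name      TEXT,
    ClassID   INTEGER REFERENCES class(ClassID)
);
-- Many-to-many: a link table holds one row per (student, club) pair.
CREATE TABLE club (ClubID INTEGER PRIMARY KEY, ClubName TEXT);
CREATE TABLE membership (
    StudentID INTEGER REFERENCES student(StudentID),
    ClubID    INTEGER REFERENCES club(ClubID),
    PRIMARY KEY (StudentID, ClubID)
);
INSERT INTO class VALUES (1, '10A'), (2, '10B');
INSERT INTO student VALUES (1, 'Ali', 1), (2, 'Ahmed', 1), (3, 'Ayaan', 2);
INSERT INTO club VALUES (1, 'Chess'), (2, 'Debate');
INSERT INTO membership VALUES (1, 1), (1, 2), (2, 1);
""")

# The same ClassID appears several times in student but once in class.
students_per_class = conn.execute(
    "SELECT ClassID, COUNT(*) FROM student GROUP BY ClassID ORDER BY ClassID"
).fetchall()

# One student can belong to many clubs, and one club can have many students.
clubs_of_ali = conn.execute(
    "SELECT ClubName FROM club JOIN membership USING (ClubID)"
    " WHERE StudentID = 1 ORDER BY ClubName"
).fetchall()
```

A one-to-one relationship can be built the same way as one-to-many by also declaring the foreign key column UNIQUE.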
What is an E-R diagram? How is it represented?
An E-R (entity-relationship) diagram is a visual representation of the entities in
a database and the relationships between them.
Relationships can be mandatory or optional. For example, a mother must have a
child, so that relationship is mandatory, whereas an employee may or may not
have a desk, so that relationship is optional.
The cardinality of a relationship shows how many instances of one entity can be
related to an instance of the other (one-to-one, one-to-many, or many-to-many).
What is the Normalization Process?
Normalization is a process in a Database Management System (DBMS) that helps
structure a relational database to reduce data redundancy and improve data
integrity. The goal is to minimize data duplication and ensure that data is
organized efficiently.
How is a database converted into First Normal Form? What are the conditions?
1. Eliminate Repeating Groups:
First, you identify repeating groups within a table, which are sets of related data
that should be separated into their own tables. For example, if you have a table
for "Orders" and each order can have multiple products, you might have a column
for "Product1," "Product2," and so on. In 1NF, you remove these repeating
groups.
2. Create Separate Rows:
For each set of related data (e.g., products in an order), you create separate rows
in the table, duplicating the other non-repeating data (e.g., customer information,
order date). This ensures that each row in the table represents a single entity.
3. Use Unique Identifiers:
You should have a unique identifier (primary key) for each row to distinguish it
from others. In the case of orders, once each product has its own row, the order
number alone no longer identifies a row, so the key becomes the order number
combined with the product.
Conditions of 1NF:
Each table should have a primary key to uniquely identify each row.
Data should be organized so that there are no repeating groups or arrays in
a single row.
Each column should contain atomic (indivisible) values. Avoid storing
multiple values in a single column.
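The steps above can be sketched in a few lines of Python. The wide_orders data and the helper name to_first_normal_form are hypothetical; the columns Product1 to Product3 stand for the repeating group described in step 1.

```python
# Hypothetical un-normalized rows: Product1..Product3 form a repeating group.
wide_orders = [
    {"OrderID": 1, "Customer": "Ahmed", "Product1": "Apple",
     "Product2": "Grapes", "Product3": None},
    {"OrderID": 2, "Customer": "Ali", "Product1": "Oranges",
     "Product2": None, "Product3": None},
]

def to_first_normal_form(rows):
    """Return one row per (OrderID, Product), each holding a single atomic product."""
    normalized = []
    for row in rows:
        for column in ("Product1", "Product2", "Product3"):
            product = row[column]
            if product is not None:
                normalized.append(
                    {"OrderID": row["OrderID"],
                     "Customer": row["Customer"],
                     "Product": product}
                )
    return normalized

order_lines = to_first_normal_form(wide_orders)

# After the split, OrderID alone no longer identifies a row; (OrderID, Product) does.
keys = [(line["OrderID"], line["Product"]) for line in order_lines]
```

Note that the customer value is now repeated on every row of the same order, which is the kind of redundancy the later normal forms remove.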
What is the Second Normal Form and how is a database converted to it?
For data to be in Second Normal Form, it must first be in First Normal Form, and
there must be no partial dependencies.
The Second Normal Form (2NF) is a property in database design that helps
eliminate redundancy and improve data integrity by ensuring that non-prime
attributes (attributes that are not part of a candidate key) are functionally
dependent on the entire candidate key, rather than on only part of it. To achieve
2NF, a table must first be in First Normal Form (1NF).
To convert a database to 2NF, you need to follow these steps:
Identify the candidate key(s) for each table.
Ensure that the table is in 1NF, which means that there are no repeating
groups, and each column contains atomic (indivisible) values.
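Verify that each non-prime attribute (attributes that are not part of the
candidate key) is fully functionally dependent on the entire candidate key,
and move any attribute that depends on only part of the key into a separate
table keyed by that part.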
Here's an example to illustrate 2NF with tables:
OrderID   Customer ID   Product   Order Date
1         101           Apple     11-11-23
2         102           Oranges   12-11-23
3         103           Mangoes   17-11-23
4         104           Grapes    12-12-23
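Whether this table is in 2NF depends on its candidate key. If every order holds
exactly one product, OrderID alone is the candidate key, and a single-attribute
key cannot have partial dependencies, so the table is already in 2NF. If an order
can hold several products, with one row per product as 1NF requires, the
candidate key becomes the composite (OrderID, Product). Customer ID and Order
Date then depend on OrderID alone, which is only part of the key, and this partial
dependency breaks 2NF.
To convert the table to 2NF, we move the attributes that depend on only part of
the key into their own table. Order details go into an Orders table keyed by
OrderID, the products of each order go into an Order Products table keyed by
(OrderID, Product), and customer details go into a Customers table keyed by
Customer ID, like this:
Order ID   Customer ID   Order Date
1          101           11-11-23
2          102           12-11-23
3          103           17-11-23
4          104           12-12-23
Order ID   Product
1          Apple
2          Oranges
3          Mangoes
4          Grapes
Customer ID   Customer Name
101           Ahmed
102           Ali
103           Ayaan
104           Muhammad
Now every table is in 2NF: Customer ID and Order Date are fully functionally
dependent on OrderID, the Order Products table has no non-prime attributes, and
Customer Name is fully dependent on Customer ID. Customer ID stays in the Orders
table as a foreign key, so each order can still be linked to its customer.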
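Partial dependencies can be spotted by testing functional dependencies on sample rows. The sketch below is one way to do that in Python. The helper name determines and the order-line data (including the Quantity column) are hypothetical. Sample data can only disprove a dependency, not prove it for all future rows.

```python
def determines(rows, lhs, rhs):
    """Return True if, in these rows, equal values of lhs always give equal values of rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        # setdefault stores the first value seen for a key and returns it afterwards.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical order lines whose candidate key is the composite (OrderID, Product).
order_lines = [
    {"OrderID": 1, "Product": "Apple", "CustomerID": 101, "Quantity": 3},
    {"OrderID": 1, "Product": "Grapes", "CustomerID": 101, "Quantity": 1},
    {"OrderID": 2, "Product": "Apple", "CustomerID": 102, "Quantity": 5},
]

# CustomerID is fixed by OrderID alone: a partial dependency, so not in 2NF.
customer_is_partial = determines(order_lines, ["OrderID"], ["CustomerID"])

# Quantity needs the whole key: neither part determines it on its own.
quantity_is_full = (
    determines(order_lines, ["OrderID", "Product"], ["Quantity"])
    and not determines(order_lines, ["OrderID"], ["Quantity"])
    and not determines(order_lines, ["Product"], ["Quantity"])
)
```

Here CustomerID would move to a table keyed by OrderID, while Quantity stays with the composite key.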
This organization reduces redundancy and ensures data integrity.
How is a table converted into Third Normal Form? What are the conditions?
The Third Normal Form (3NF) is a property in database design that builds upon
the Second Normal Form (2NF) and further eliminates redundancy and data
anomalies by ensuring that there are no transitive dependencies between non-
prime attributes and the candidate key. To achieve 3NF, a table must first be in
2NF.
A relation is in 3NF if it is in 2NF and, for every non-prime attribute A in the
relation, A is non-transitively dependent on the entire candidate key.
To convert a database to 3NF, you need to follow these steps:
Identify the candidate key(s) for each table.
Ensure that the table is in 2NF, which means that all non-prime attributes
are fully functionally dependent on the entire candidate key.
Eliminate any transitive dependencies by creating additional tables and
relationships.
Suppose we have a table called "Employees" with the following attributes:
EmployeeID (candidate key), EmployeeName, Department, and DepartmentHead.
The table might look like this:
Employee ID   Employee Name   Department   Department Head
1             Ayaan           HR           Ali
2             Musab           IT           Ahmed
3             Aazmir          IT           Ahmed
4             Khubaib         Finance      Ashal
In this initial state, the table is not in 3NF because there is a transitive
dependency: the Department Head depends on the Department, and the
Department depends on the candidate key (EmployeeID). To convert this table to
3NF, we can create two separate tables, one for Employees and one for
Departments, like this:
Employee ID   Employee Name   Department
1             Ayaan           HR
2             Musab           IT
3             Aazmir          IT
4             Khubaib         Finance
Department   Department Head
IT           Ahmed
Finance      Ashal
HR           Ali
Now, the Employees table is in 3NF because there are no transitive dependencies:
EmployeeName and Department are both directly dependent on the candidate key
(EmployeeID). The Department Head attribute is moved to the Departments table,
where it is directly dependent on that table's candidate key (Department).
Department stays in the Employees table as a foreign key, so each employee's
department head can still be found.
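A sketch of the 3NF design using Python's built-in sqlite3 module, with the employee data from the example. It assumes the Employees table keeps Department as a foreign key to the Departments table. The replacement head name 'Bilal' is made up to show an update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE departments (
    department      TEXT PRIMARY KEY,
    department_head TEXT NOT NULL
);
CREATE TABLE employees (
    employee_id   INTEGER PRIMARY KEY,
    employee_name TEXT NOT NULL,
    department    TEXT NOT NULL REFERENCES departments(department)
);
INSERT INTO departments VALUES ('HR', 'Ali'), ('IT', 'Ahmed'), ('Finance', 'Ashal');
INSERT INTO employees VALUES
    (1, 'Ayaan', 'HR'), (2, 'Musab', 'IT'), (3, 'Aazmir', 'IT'), (4, 'Khubaib', 'Finance');
""")

# The head of IT is stored once, so one UPDATE changes it for every IT employee.
# 'Bilal' is a hypothetical new head used only for this illustration.
conn.execute("UPDATE departments SET department_head = 'Bilal' WHERE department = 'IT'")

# A join rebuilds the original wide view whenever it is needed.
rows = conn.execute(
    "SELECT e.employee_name, d.department_head"
    " FROM employees AS e JOIN departments AS d ON d.department = e.department"
    " ORDER BY e.employee_id"
).fetchall()
print(rows)
```

In the original single table, the same change would have to be made on every IT employee's row, and missing one would leave the data inconsistent.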
How does a DBMS solve the limitations of a file-based approach?
Database Management Systems (DBMS) have addressed several issues associated
with the file-based approach to data management, including data redundancy,
data consistency, and data dependency. Here's how DBMSs have tackled these
problems:
Data Redundancy Issue:
Normalization: DBMSs employ the concept of data normalization, which
involves organizing data into well-structured tables to minimize
redundancy. This means that data is stored in a way that prevents the
duplication of information. In contrast, the file-based approach often led to
multiple copies of the same data, increasing the risk of inconsistency.
Centralized Data Storage: DBMSs provide a centralized repository for data,
eliminating the need to duplicate data across different files and
applications. Data is stored in a structured manner, reducing redundancy
and ensuring that changes to data occur in one place.
Data Integrity Constraints: DBMSs allow the enforcement of data integrity
constraints, such as unique keys and foreign key relationships. These
constraints prevent the insertion of duplicate data into tables, further
reducing redundancy.
Referential Integrity: DBMSs support referential integrity, ensuring that
relationships between data are maintained consistently. In the file-based
approach, it was challenging to maintain the integrity of data relationships
across different files and applications.
Data Consistency Issue:
ACID Properties: DBMSs ensure data consistency through the ACID
(Atomicity, Consistency, Isolation, Durability) properties. These properties
guarantee that database transactions are executed in a way that preserves
data consistency. If a transaction fails, the database is rolled back to a
consistent state.
Concurrency Control: DBMSs offer mechanisms for handling concurrent
access to data by multiple users or applications. Locking and isolation levels
are used to prevent data inconsistencies caused by concurrent updates.
Data Validation: DBMSs enable the implementation of data validation rules
and checks to ensure that data entered into the database is consistent and
adheres to predefined rules. In the file-based approach, this type of
consistency enforcement was often left to individual applications, leading
to potential discrepancies.
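Atomicity and validation rules can be sketched together with Python's built-in sqlite3 module. The accounts table, the account names, and the transfer function are assumptions for illustration; the CHECK constraint plays the role of a validation rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("ali", 100), ("ahmed", 20)])
conn.commit()

def transfer(amount):
    """Move amount from ali to ahmed; both updates succeed or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes the block
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = 'ahmed'", (amount,)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = 'ali'", (amount,)
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(30)       # valid transfer
failed = transfer(500)  # would make ali's balance negative, so the CHECK rejects it
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

The second transfer fails halfway through, yet the first update of that transfer is undone as well, so the total amount of money stays the same.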
Data Dependency Issue:
Data Abstraction: DBMSs provide data abstraction through a data model
(e.g., relational, object-oriented). This means that applications interact with
the database using a logical data model, rather than being dependent on
the physical structure of the data. This abstraction reduces data
dependency.
Query Language: DBMSs offer a query language (e.g., SQL) that allows
users to access and manipulate data without needing to understand how
data is physically stored. This further reduces data dependency on the
underlying file structure.
Data Dictionary: DBMSs maintain a data dictionary or metadata repository,
which documents the structure of the database. This documentation helps
users and developers understand the data without needing to delve into
the file-level details, reducing data dependency.
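As one concrete example of a data dictionary, SQLite keeps its schema in a catalogue table called sqlite_master and exposes column details through `PRAGMA table_info`. The customers table below is hypothetical, and other DBMS products expose their metadata in different ways.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# sqlite_master is SQLite's catalogue of tables, indexes, and other schema objects.
tables = [
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column: (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(customers)")]
print(tables, columns)
```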
Methods to Modify and Fetch Data from Databases
A DBMS uses a Data Definition Language (DDL) to create, modify, or remove
structures in a database.
DDL statements are written as a script with syntax similar to a computer program.
A DBMS uses a Data Manipulation Language (DML) to add, modify, delete, or
retrieve data in a database.
DML statements are also written as a script similar to a computer program.
The difference between the languages is that DDL is used for the structures
in a relational database, while DML is used for the data in the relational
database.
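The split between the two languages can be seen in a short sketch that runs SQL through Python's built-in sqlite3 module. The student table and its rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: statements that create or change structures.
conn.execute("CREATE TABLE student (StudentID INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("ALTER TABLE student ADD COLUMN ClassID INTEGER")

# DML: statements that add, modify, delete, or retrieve data in those structures.
conn.execute("INSERT INTO student VALUES (1, 'Ali', 10)")
conn.execute("INSERT INTO student VALUES (2, 'Ahmed', 10)")
conn.execute("UPDATE student SET ClassID = 11 WHERE StudentID = 2")
conn.execute("DELETE FROM student WHERE StudentID = 1")
remaining = conn.execute("SELECT StudentID, Name, ClassID FROM student").fetchall()

# DDL again: remove the structure itself, not just its rows.
conn.execute("DROP TABLE student")
tables_left = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
print(remaining, tables_left)
```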
What are the SQL DDL commands?
What are the SQL DML commands?