DBMS IMPORTANT QUES FOR END SEM
1. Concept of:
i) Super key
Ans. Super key is an attribute set that can uniquely identify a tuple. A super key
is a superset of a candidate key.
ii) Candidate key
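Ans. A candidate key is a minimal super key: an attribute or set of attributes
that uniquely identifies a tuple and from which no attribute can be removed
without losing uniqueness. A relation can have several candidate keys. One of
them is chosen as the primary key, and the remaining ones are called alternate
keys. The candidate keys are as strong as the primary key.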
iii) Primary key
Ans. It is the candidate key chosen to identify one and only one instance of an
entity uniquely. An entity can contain multiple keys, for example in a PERSON
table. The key which is most suitable from that list becomes the primary key.
iv) Foreign key
Ans. A foreign key is a column (or set of columns) of one table that refers to the
primary key of another table.
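The difference between a super key and a candidate key can be checked on sample data. The sketch below is only an illustration: the PERSON rows and column names are made up, and a set of rows can only show that some attribute set is not a key. The real keys are declared by the designer for all possible rows.

```python
from itertools import combinations

# Hypothetical PERSON rows; column names and values are made up for illustration.
rows = [
    {"id": 1, "email": "a@x.org", "name": "Asha", "city": "Pune"},
    {"id": 2, "email": "b@x.org", "name": "Ravi", "city": "Pune"},
    {"id": 3, "email": "c@x.org", "name": "Asha", "city": "Delhi"},
    {"id": 4, "email": "d@x.org", "name": "Asha", "city": "Pune"},
]

def is_super_key(attrs, rows):
    """True if no two rows agree on every attribute in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def candidate_keys(rows):
    """Minimal super keys of this particular set of rows."""
    columns = sorted(rows[0])
    keys = []
    for size in range(1, len(columns) + 1):
        for attrs in combinations(columns, size):
            # A superset of a key already found is a super key but not minimal.
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if is_super_key(attrs, rows):
                keys.append(attrs)
    return keys
```

Here ("id", "name") is a super key but not a candidate key, because "id" alone is already unique.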
2. What is DML and DDL? or Difference between DML and DDL.
Ans.
DDL | DML
It stands for Data Definition Language. | It stands for Data Manipulation Language.
It is used to create the database schema and can be used to define some constraints as well. | It is used to add, retrieve, or update the data.
It basically defines the columns (attributes) of the table. | It adds or updates the rows of the table. These rows are called tuples.
It doesn’t have any further classification. | It is further classified into Procedural and Non-Procedural DML.
Basic commands present in DDL are CREATE, DROP, RENAME, ALTER, etc. | Basic commands present in DML are UPDATE, INSERT, MERGE, etc.
DDL does not use a WHERE clause in its statements. | DML uses a WHERE clause in its statements.
DDL is used to define the structure of a database. | DML is used to manipulate the data within the database.
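A small sketch of the split between DDL and DML, using Python's built-in sqlite3 module with an in-memory database. The student table and its columns are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and then alter the structure (table and column names are hypothetical).
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("ALTER TABLE student ADD COLUMN branch TEXT")

# DML: add, change, and read rows inside that structure.
conn.execute("INSERT INTO student (roll_no, name, branch) VALUES (1, 'Asha', 'CSE')")
conn.execute("INSERT INTO student (roll_no, name, branch) VALUES (2, 'Ravi', 'ECE')")
conn.execute("UPDATE student SET branch = 'IT' WHERE roll_no = 2")
students = conn.execute(
    "SELECT roll_no, name, branch FROM student ORDER BY roll_no"
).fetchall()
```

The DDL statements describe columns and carry no WHERE clause, while the UPDATE picks its rows with WHERE.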
3. What is Data Abstraction?
Ans. Data abstraction is the process of hiding unwanted and irrelevant details
from the end user. It lets the end user access the data they need without seeing
how that data is stored or organized inside the database. Data abstraction helps
to keep data secure from unauthorized access and hides all the implementation
details.
4. Generalization, specialization and aggregation.
Ans. Generalization is the process of extracting common properties from a set of
entities and creating a generalized entity from it. It is a bottom-up approach in
which two or more entities can be generalized to a higher-level entity if they
have some attributes in common.
In specialization, an entity is divided into sub-entities based on its
characteristics. It is a top-down approach where the higher-level entity is
specialized into two or more lower-level entities.
An ER diagram is not capable of representing the relationship between an entity
and a relationship which may be required in some scenarios. In those cases, a
relationship with its corresponding entities is aggregated into a higher-level
entity. Aggregation is an abstraction through which we can represent
relationships as higher-level entity sets.
5. Strong and weak entity set.
Ans.
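1. Strong Entity Type: It is an entity that has its own existence and is
independent. It has a primary key of its own.
2. Weak Entity Type: It is an entity that does not have its own existence and
relies on a strong (owner) entity for its existence. It has no primary key of its
own and is identified by its partial key together with the key of the owner.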
6. Data Independence and its types.
Ans. Data independence can be explained using the three-schema architecture.
Data independence refers to the ability to modify the schema at one level of the
database system without altering the schema at the next higher level.
1. Logical Data Independence:
o Logical data independence refers to the ability to change the conceptual
schema without having to change the external schema.
o Logical data independence is used to separate the external level from the
conceptual view.
o If we make any changes in the conceptual view of the data, the user view of
the data is not affected.
o Logical data independence occurs at the user interface level.
2. Physical Data Independence:
o Physical data independence can be defined as the capacity to change the
internal schema without having to change the conceptual schema.
o If we make any changes in the storage size of the database system server, the
conceptual structure of the database is not affected.
o Physical data independence is used to separate the conceptual level from the
internal level.
o Physical data independence occurs at the logical interface level.
7. Relational algebra.
Ans. Relational Algebra is a procedural query language. Relational algebra
mainly provides a theoretical foundation for relational databases and SQL. The
main purpose of using Relational Algebra is to define operators that transform
one or more input relations into an output relation. Given that these operators
accept relations as input and produce relations as output, they can be combined
and used to express potentially complex queries that transform potentially
many input relations (whose data are stored in the database) into a single
output relation (the query results). As it is pure mathematics, there is no use of
English Keywords in Relational Algebra and operators are represented using
symbols.
8. SQL.
Ans. SQL stands for Structured Query Language. It is used for storing and
managing data in a relational database management system (RDBMS). In an RDBMS,
data is stored in the form of tables.
• It is a standard language for relational database systems. It enables a
user to create, read, update and delete relational databases and tables.
• All the RDBMS like MySQL, Informix, Oracle, MS Access and SQL Server
use SQL as their standard database language.
• SQL allows users to query the database in a number of ways, using
English-like statements.
9. Triggers.
Ans. Trigger in DBMS is a special type of stored procedure that is automatically
executed in response to certain database events such as an INSERT, UPDATE, or
DELETE operation. Triggers can be used to perform actions such as data
validation, enforcing business rules, or logging. They can be defined to execute
before or after the triggering event and can be defined to execute for every row
or once for each statement. Triggers are a powerful feature of a DBMS that allow
developers to define automatic actions based on database events.
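A logging trigger can be sketched with Python's built-in sqlite3 module. The account and audit_log tables and the trigger name are hypothetical. Note that SQLite supports only row-level (FOR EACH ROW) triggers, so the statement-level case is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);
CREATE TABLE audit_log (account_id INTEGER, old_balance INTEGER, new_balance INTEGER);

-- Row-level AFTER trigger: runs once for every updated row.
CREATE TRIGGER log_balance_change
AFTER UPDATE OF balance ON account
FOR EACH ROW
BEGIN
    INSERT INTO audit_log VALUES (OLD.id, OLD.balance, NEW.balance);
END;

INSERT INTO account VALUES (1, 500);
UPDATE account SET balance = 350 WHERE id = 1;
""")
audit_rows = conn.execute("SELECT * FROM audit_log").fetchall()
```

The application only issues the UPDATE; the row in audit_log is written by the trigger.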
10. Joins.
Ans. Join is an operation in a DBMS (Database Management System) that
combines the rows of two or more tables based on related columns between
them. The main purpose of a join is to retrieve data from multiple tables; in
other words, a join is used to perform multi-table queries. It is denoted by ⨝.
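As an illustration, the sketch below joins two hypothetical tables (employee and department) on their related column dept_id, using Python's built-in sqlite3 module. The left outer join is included only to show what happens to rows that have no match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT, dept_id INTEGER);
INSERT INTO department VALUES (10, 'Sales'), (20, 'HR');
INSERT INTO employee VALUES (1, 'Asha', 10), (2, 'Ravi', 20), (3, 'Meena', NULL);
""")

# Inner join: only employees whose dept_id matches a department row.
inner = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM employee AS e JOIN department AS d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()

# Left outer join: every employee, with NULL (None) where no department matches.
left = conn.execute("""
    SELECT e.emp_name, d.dept_name
    FROM employee AS e LEFT JOIN department AS d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()
```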
11. Embedded SQL.
Ans.
• The programming module in which the SQL statements are embedded is
called the embedded SQL module.
• It is possible to embed SQL statements inside a variety of programming
languages such as C, C++, Java, Fortran, and PL/1.
• A language in which SQL queries are embedded is referred to as a host
language.
• The EXEC SQL statement is used in the host language to identify an embedded
SQL request to the preprocessor.
12. Dynamic SQL.
Ans. • The dynamic SQL component of SQL allows programs to construct and
submit SQL queries at run time.
• In contrast, embedded SQL statements must be completely present at compile
time; they are compiled by the embedded SQL preprocessor.
• Using dynamic SQL, programs can create SQL queries as strings at run time and
can either have them executed immediately or have them prepared for
subsequent use.
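The idea of building a query as a string at run time can be sketched in Python with the built-in sqlite3 module. This is an analogue of dynamic SQL in a host language, not the EXEC SQL syntax itself. The product table and the find_products helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT, category TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [("pen", "stationery", 10), ("notebook", "stationery", 40), ("mouse", "hardware", 300)],
)

def find_products(conn, category=None, max_price=None):
    """Build the WHERE clause at run time from whichever filters were supplied."""
    clauses, params = [], []
    if category is not None:
        clauses.append("category = ?")
        params.append(category)
    if max_price is not None:
        clauses.append("price <= ?")
        params.append(max_price)
    sql = "SELECT name FROM product"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY name"
    # Values travel as bound parameters and are never pasted into the SQL string.
    return [row[0] for row in conn.execute(sql, params)]
```

The text of the query is not known until the function is called, which is the point of dynamic SQL. Keeping the values as bound parameters avoids SQL injection.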
13. Referential integrity.
Ans. The referential integrity rule in a DBMS is a constraint on the foreign key
of a child table that references the primary key of a parent table. It states that
every non-NULL foreign key value in the child table must match a primary key
value that exists in the parent table, so that every reference from the child
table to the parent table is valid.
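A minimal sketch of the rule in action, using Python's built-in sqlite3 module. The customer (parent) and orders (child) tables are hypothetical. SQLite checks foreign keys only after PRAGMA foreign_keys = ON, which is an SQLite detail and not part of the rule itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when this is on
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer(customer_id)
);
INSERT INTO customer VALUES (1, 'Asha');
INSERT INTO orders VALUES (100, 1);
""")

def try_sql(sql):
    """Run one statement and report whether the DBMS accepted it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

orphan_insert = try_sql("INSERT INTO orders VALUES (101, 99)")         # no customer 99
parent_delete = try_sql("DELETE FROM customer WHERE customer_id = 1")  # still referenced
null_reference = try_sql("INSERT INTO orders VALUES (102, NULL)")      # NULL references nothing
```

The child row pointing at a missing parent and the deletion of a parent that is still referenced are both refused, while a NULL foreign key is allowed.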
14. Constraints and its types.
Ans. Constraints in a Database Management System (DBMS) are rules enforced
on data in tables to ensure the accuracy and reliability of the data within the
database. Here are the various types of constraints commonly used in DBMS,
along with examples:
1. Primary Key Constraint:
• Ensures that a column or a combination of columns uniquely identifies
each row in a table.
• Example: In a Students table, the StudentID column can be set as a
primary key to uniquely identify each student.
2. Foreign Key Constraint:
• Ensures the referential integrity of the data by linking two tables. The
foreign key in one table points to the primary key in another table.
• Example: In an Orders table, the CustomerID column can be a foreign key
that references the CustomerID column in the Customers table.
3. Unique Constraint:
• Ensures that all values in a column or a set of columns are unique across
the table.
• Example: In an Employees table, the Email column can be set to have
unique values.
4. Not Null Constraint:
• Ensures that a column cannot have a NULL value.
• Example: In a Products table, the ProductName column can be set to not
accept NULL values.
5. Check Constraint:
• Ensures that all values in a column satisfy a specific condition.
• Example: In an Employees table, the Age column can be constrained to
accept only values greater than or equal to 18.
6. Default Constraint:
• Provides a default value for a column when no value is specified.
• Example: In an Orders table, the OrderDate column can have a default
value of the current date.
7. Index Constraint:
• Although not always considered a constraint, indexes are used to
improve the performance of data retrieval. Indexes can be unique or
non-unique.
• Example: Creating an index on the LastName column in an
Employees table to speed up searches.
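The constraint types listed above can be tried out together in one table. The sketch below uses Python's built-in sqlite3 module. The employee table, its columns, and the accepted helper are hypothetical names for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,          -- primary key
    email   TEXT UNIQUE,                  -- unique
    name    TEXT NOT NULL,                -- not null
    age     INTEGER CHECK (age >= 18),    -- check
    status  TEXT DEFAULT 'active'         -- default
)
""")
conn.execute("CREATE INDEX idx_employee_name ON employee (name)")  # non-unique index

def accepted(sql):
    """Return True if the statement satisfies every constraint."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

results = {
    "valid row": accepted(
        "INSERT INTO employee (emp_id, email, name, age) VALUES (1, 'a@x.org', 'Asha', 30)"),
    "duplicate email": accepted(
        "INSERT INTO employee (emp_id, email, name, age) VALUES (2, 'a@x.org', 'Ravi', 25)"),
    "missing name": accepted(
        "INSERT INTO employee (emp_id, email, age) VALUES (3, 'c@x.org', 22)"),
    "age under 18": accepted(
        "INSERT INTO employee (emp_id, email, name, age) VALUES (4, 'd@x.org', 'Meena', 16)"),
}
default_status = conn.execute("SELECT status FROM employee WHERE emp_id = 1").fetchone()[0]
```

Only the first insert passes. The other three break the unique, not null, and check constraints in turn, and the stored row picks up the default status.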
15. Union compatibility.
Ans. In order to be used in a UNION, the tables must have the same attribute
characteristics, that is, the attributes and their domains must be compatible.
When two or more tables share the same number of columns and when their
corresponding columns share the same or compatible domains, they are said
to be union-compatible.
16. Functional dependency and multivalued dependency.
Ans. A functional dependency (FD) is a relationship that exists between two
sets of attributes of the same relation, typically the primary key and the
non-key attributes: the value of one determines the value of the other.
A dependency is denoted by an arrow "→".
If C determines D functionally, then C → D. This means that any two tuples
that agree on C must also agree on D.
A multivalued dependency (MVD), written C →→ D, holds when each value of C is
associated with a set of values of D that is independent of the remaining
attributes. As a result, the presence of certain rows in a table implies the
presence of certain other rows in the same table. A relation with a non-trivial
multivalued dependency whose left side is not a super key is not in 4NF. Any
such multivalued dependency involves at least three attributes.
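Whether a functional dependency holds in a given set of rows can be checked mechanically: no two rows may agree on the left side and differ on the right side. A minimal sketch, with made-up enrolment rows and a hypothetical helper named holds:

```python
def holds(rows, lhs, rhs):
    """Check whether the functional dependency lhs -> rhs holds in these rows."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # setdefault stores the first right-hand value seen for this left-hand value.
        if seen.setdefault(left, right) != right:
            return False  # same lhs value, different rhs value
    return True

# Hypothetical enrolment rows, made up for illustration.
rows = [
    {"roll_no": 1, "name": "Asha", "course": "DBMS"},
    {"roll_no": 1, "name": "Asha", "course": "OS"},
    {"roll_no": 2, "name": "Ravi", "course": "DBMS"},
]
```

Here roll_no → name holds, but roll_no → course does not, because roll number 1 appears with two different courses.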
17. Normalization.
Ans. Normalization is the process of organizing the data in the database.
o Normalization is used to minimize the redundancy from a relation or set
of relations. It is also used to eliminate undesirable characteristics like
Insertion, Update, and Deletion Anomalies.
o Normalization divides a larger table into smaller tables and links them using
relationships.
o The normal forms are used to reduce redundancy in the database tables.
18. Normal Forms.
• Ans. First Normal Form (1NF): This is the most basic level of normalization.
In 1NF, each table cell should contain only a single (atomic) value, and each
column should have a unique name. The first normal form helps to eliminate
repeating groups and simplify queries.
• Second Normal Form (2NF): A table is in 2NF if it is in 1NF and every
non-key attribute depends on the whole primary key. This means that no
non-key column may depend on only a part of a composite key (no partial
dependency).
• Third Normal Form (3NF): 3NF builds on 2NF by requiring that no non-key
attribute depends on another non-key attribute. This means that each column
should depend directly on the key, and not through any other column in the
same table (no transitive dependency).
• Boyce-Codd Normal Form (BCNF): BCNF is a stricter form of 3NF that
ensures that every determinant in a table is a candidate key. In other
words, for every non-trivial functional dependency X → Y, X must be a
super key.
19. What is a database?
Ans. A database is a collection of data that is stored and organized
electronically. Databases can store any type of data, including words,
numbers, images, videos, and files. They can be used to store and manage
large amounts of data, and can support a wide range of activities, including
data storage, data analysis, and data management.
20. Explain the difference between a Database System and a File System.
Ans.
DBMS | File System
A DBMS is a collection of data and the software to store and retrieve the user’s data. | A file system is used to manage and organize the files stored on the computer’s hard disk.
A DBMS gives an abstract view of data that hides the details. | The file system specifies the details of data representation and storage.
A DBMS provides backup and recovery, so lost data can be recovered. | In the file system, lost data cannot be recovered.
The database system is expensive and complex to design. | The file system is cheaper and simple to design.
A DBMS provides multiple user interfaces. | Data is isolated in the file system.
A DBMS provides a good protection mechanism. | In a file system, it is difficult to protect a file.
21. Describe the three-level schema architecture in DBMS.
Ans. The three-level schema architecture in a database management system
(DBMS) is a database design approach that separates data views into three
layers:
• Internal level: Defines how data is physically stored
• Conceptual level: Describes the overall database structure and hides
internal details
• External level: Describes the part of the database that a particular user or
application sees; applications are written in terms of an external schema
(a view), which is computed when accessed
22. Describe the overall architecture of a Database System.
Ans. The term "database architecture" refers to the structural design and
methodology of a database system, which forms the core of a Database
Management System (DBMS). This architecture dictates how data is stored,
organized, and retrieved, playing a crucial role in the efficiency and
effectiveness of data management.
23. What are the different types of Database Languages?
• Ans. Data definition language (DDL): Used to create, modify, and delete
tables
• Data manipulation language (DML): Used to query and update data stored
in tables
• Data control language (DCL): Used to control access privileges
• Query languages: Often high-level and declarative, but not computationally
complete
• Procedural languages: Often low-level and imperative
• GraphQL: An open-source query language for APIs that lets clients request
the data they need
24. Discuss the concept of Aggregation in the ER Model with an example.
Ans. Aggregation is a concept in the Entity-Relationship (ER) model that
allows a relationship to take part in another relationship. It treats a
relationship together with its participating entities as a higher-level entity,
and is an abstraction that allows for relationships between relationships.
Here's an example of aggregation in the ER model:
• Entities: A, B, and C are entities in the ER model
• Relationships: R1 is the relationship formed when A and B are connected
• Aggregation: A, B, and R1 together are treated as a single higher-level
entity, which then takes part in a new relationship (R2) with C
• Higher-level entity set: The relationship set work and the entity sets
employee and project are treated as a higher-level entity set called work
25. Describe the characteristics and advantages of SQL.
Ans. SQL is easy to learn.
• SQL is used to access data from relational database management
systems.
• SQL can execute queries against the database.
• SQL is used to describe the data.
• SQL is used to define the data in the database and manipulate it when
needed.
• SQL is used to create and drop the database and table.
• SQL is used to create a view, stored procedure, function in a database.
• SQL allows users to set permissions on tables, procedures, and views.
26. Discuss the various types of data models used in DBMS with examples.
Ans. 1) Relational Data Model: This type of model organizes the data in the
form of rows and columns within a table. Thus, a relational model uses tables
for representing both data and the relationships among the data. Tables are also
called relations.
2) Entity-Relationship Data Model: An ER model is the logical representation
of data as objects and relationships among them. These objects are known as
entities, and relationship is an association among these entities.
3) Object-based Data Model: An extension of the ER model with notions of
functions, encapsulation, and object identity, as well. This model supports a
rich type system that includes structured and collection types.
4) Semistructured Data Model: This type of data model is different from the
other three data models (explained above). The semistructured data model
allows data specifications in which individual data items of the same type may
have different sets of attributes.
27. Explain the concept of Data Independence.
Ans. Data independence is the feature that allows the schema at one level of
the database system to be changed without any impact on the schema at the
next higher level of the database system.
28. Differentiate between Specialization and Generalization.
Ans.
GENERALIZATION | SPECIALIZATION
Generalization works in a bottom-up approach. | Specialization works in a top-down approach.
In generalization, the size of the schema gets reduced. | In specialization, the size of the schema gets increased.
Generalization is normally applied to a group of entities. | We can apply specialization to a single entity.
Generalization can be defined as a process of creating groupings from various entity sets. | Specialization can be defined as a process of creating subgroupings within an entity set.
30. Describe the different types of relational algebra operations.
Ans. Relational algebra is a procedural query language. It gives a step-by-step
process to obtain the result of the query. It uses operators to perform
queries.
1. Select Operation:
o The select operation selects tuples that satisfy a given predicate.
o It is denoted by sigma (σ).
2. Project Operation:
o This operation shows the list of those attributes that we wish to appear in
the result. The rest of the attributes are eliminated from the table.
o It is denoted by ∏.
3. Union Operation:
o Suppose there are two relations R and S. The union operation contains all
the tuples that are in R or in S or in both.
o It eliminates the duplicate tuples. It is denoted by ∪.
4. Set Intersection:
o Suppose there are two relations R and S. The set intersection operation
contains all tuples that are in both R and S.
o It is denoted by ∩.
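The four operations can be sketched with Python sets, since a relation is a set of tuples. The representation used here (each tuple stored as sorted attribute/value pairs, built by a hypothetical row helper) is just one convenient choice for illustration.

```python
def row(**values):
    """Build one hashable tuple of (attribute, value) pairs."""
    return tuple(sorted(values.items()))

def select(relation, predicate):
    """sigma: keep the tuples that satisfy the predicate."""
    return {t for t in relation if predicate(dict(t))}

def project(relation, attrs):
    """pi: keep only the listed attributes; the set removes duplicates."""
    return {tuple((a, dict(t)[a]) for a in attrs) for t in relation}

# Two union-compatible relations with made-up data.
R = {row(id=1, city="Pune"), row(id=2, city="Delhi")}
S = {row(id=2, city="Delhi"), row(id=3, city="Pune")}

union = R | S      # R ∪ S, duplicates eliminated
common = R & S     # R ∩ S
pune = select(union, lambda t: t["city"] == "Pune")
cities = project(union, ["city"])
```

Every operation takes relations and returns a relation, which is why the operators can be nested, as in the projection applied to the union.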
31. Differentiate between Primary Key and Foreign Key.
Ans.
Primary Key | Foreign Key
It is used to uniquely identify data in the table. | It is used to maintain the relationship between tables.
It can't be NULL. | It can accept NULL values.
Two or more rows can't have the same primary key. | It can carry duplicate values for a foreign key attribute.
By default, a primary key has a clustered index (system-specific, e.g. SQL Server). | By default, it does not have a clustered index.
A primary key constraint can be defined on a temporary table. | It can't be defined on temporary tables (system-specific, e.g. SQL Server).
32. Differentiate between Sub Queries and Joins in SQL.
Ans.
Subquery | JOIN
A subquery is a query nested inside another query and is used to return data that will be used in the main query as a condition to further restrict the data to be retrieved. | A JOIN is a means for combining fields from two tables by using values common to each.
Subqueries can be slower than JOINs, especially if the subquery returns a large number of rows. | JOINs are generally faster than subqueries.
Subqueries can be more complex and harder to read, especially when there are multiple levels of nesting. | JOINs can be easier to read and understand, especially for simple queries.
Subqueries can be used in SELECT, WHERE, and FROM clauses, offering more flexibility. | JOINs are used to combine rows from two or more tables based on a related column.
Subqueries are often used when the result of the query is not known or dynamic. | JOINs are used when the relationships between the tables are known and fixed.
33. Define a Cursor in SQL.
Ans. In SQL, a cursor is a temporary work area that is allocated by the
database server during the execution of a statement. It is a database object that
allows us to access the rows of a result set one at a time.
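Row-at-a-time access can be sketched with the cursor object of Python's built-in sqlite3 module. This is a client-side analogue of an SQL cursor, not the DECLARE/OPEN/FETCH/CLOSE syntax itself. The marks table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE marks (roll_no INTEGER, score INTEGER)")
conn.executemany("INSERT INTO marks VALUES (?, ?)", [(1, 72), (2, 85), (3, 64)])

cursor = conn.cursor()
cursor.execute("SELECT roll_no, score FROM marks ORDER BY roll_no")

first = cursor.fetchone()        # the cursor now sits after row 1
second = cursor.fetchone()       # advances to row 2
remaining = cursor.fetchall()    # everything not yet fetched
exhausted = cursor.fetchone()    # None once the result set is used up
cursor.close()
```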
34. What is SQL?
Ans. Structured Query Language (SQL) is a specialized programming language for
managing relational database data. It allows users to store, manipulate, and
retrieve data efficiently in databases like MySQL, SQL Server, Oracle, and more.
35. What is a Primary Key?
Ans. A primary key is the column or columns that contain values that uniquely
identify each row in a table. With a primary key, a single row of the table can be
located unambiguously when data is inserted, updated, restored, or deleted.
36. What is a Tuple in a table?
Ans. A tuple is one row of a table. A table has rows and columns, where rows
represent records and columns represent the attributes. A single row of a table,
which contains a single record for that relation, is called a tuple.
37. Define a Relation
Ans. A relation is a table in the relational model: a set of tuples (rows) that all
have the same attributes (columns). A relationship, by contrast, is a logical
association between two or more database tables. It helps improve table
structures and reduce redundant data.
38. What is an entity?
Ans. An entity is a real-world object about which data is stored in the database.
An entity can be a person, place, thing, or even an event.
39. Multi valued and join dependency.
Ans. When one attribute in a relation determines a set of values of another
attribute, and that set is independent of the other attributes, the relation is
said to have a multivalued dependency (MVD). Identifying such dependencies
helps to maintain data accuracy.
Join Dependency means re-creating the original table by joining multiple
sub-tables of the given table. It is a further generalization of MVD
(multivalued dependency).
• When a relation R can be obtained by joining R1, R2, R3, ..., Rn, where
R1, R2, R3, ..., Rn are sub-relations (projections) of R, it is called a Join
Dependency.
40. ACID properties.
Ans.
• Atomicity: This principle states that database transactions must be all or
nothing. If the transaction fails, the entire transaction is rolled back.
Atomicity prevents partial and incomplete transactions.
• Consistency: According to this property, only valid data is written to the
database. Consistency enforces integrity constraints to maintain the
accuracy and correctness of data.
• Isolation: Running transactions independently is the core of isolation.
Changes in one transaction will not impact others until committed.
Isolation maintains data integrity across concurrent transactions.
• Durability: Durability guarantees that all committed transactions are
permanently recorded in the database. They persist even after system
failure. Durability provides recoverability.
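Atomicity and consistency can be seen in a small money-transfer sketch using Python's built-in sqlite3 module. The account table and the transfer helper are hypothetical. The CHECK constraint plays the role of the integrity rule, and the rollback undoes the half-finished transfer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money between two accounts as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    return conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()

ok = transfer(conn, 1, 2, 30)
after_ok = balances(conn)
failed = transfer(conn, 1, 2, 500)   # the debit would make account 1 negative
after_failed = balances(conn)
```

In the failed transfer the credit to account 2 has already run when the debit is rejected, yet the balances afterwards are unchanged because the whole transaction is rolled back.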
41. Deadlock handling in dbms.
Ans. Deadlock is a state of a database system having two or more transactions,
when each transaction is waiting for a data item that is being locked by some
other transaction.
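One common way to detect this situation is a wait-for graph, in which an edge T1 → T2 means T1 is waiting for a lock held by T2; a cycle in the graph is a deadlock. The notes do not describe this technique, so the sketch below is an illustrative assumption, and find_deadlock is a hypothetical helper name.

```python
def find_deadlock(waits_for):
    """Return one cycle of transactions in the wait-for graph, or None."""
    WHITE, GREY, BLACK = 0, 1, 2      # unvisited, on current path, finished
    colour = {}
    path = []

    def visit(t):
        colour[t] = GREY
        path.append(t)
        for u in waits_for.get(t, ()):
            state = colour.get(u, WHITE)
            if state == GREY:                     # back edge: u is on the current path
                return path[path.index(u):]
            if state == WHITE:
                cycle = visit(u)
                if cycle:
                    return cycle
        colour[t] = BLACK
        path.pop()
        return None

    for t in waits_for:
        if colour.get(t, WHITE) == WHITE:
            cycle = visit(t)
            if cycle:
                return cycle
    return None
```

For example, if T1 waits for T2, T2 waits for T3, and T3 waits for T1, the function returns the three transactions that form the cycle. One of them would then be chosen as the victim and rolled back.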
42. Transaction system state diagram.
Ans.
43. Reasons for transaction failure in dbms.
• Ans. System errors: These are related to issues with the hardware or
software.
• System crashes: These can be caused by bugs or hardware malfunctions in
the database software or operating system.
• Local errors: These are errors that occur within a single transaction, such
as when executing SQL queries.
• Media failures: These can be caused by unreadable media or a head
crash.
• Application software errors: These can occur when the resource limit is
exceeded, bad input is entered, or logical or internal errors occur.
• Logical errors: These can occur when the transaction can no longer
continue due to an internal condition, such as data not found, overflow, or
bad input
44. Timestamp based concurrency control.
Ans. Timestamp-based concurrency control is a method used in database
systems to ensure that transactions are executed safely and consistently without
conflicts, even when multiple transactions are being processed simultaneously.
This approach assigns each transaction a unique timestamp when it starts and
uses these timestamps to decide the execution order of conflicting operations.
The timestamp of a transaction T is written as TS(T).
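The notes do not list the rules, so the sketch below assumes the textbook basic timestamp-ordering protocol. Each data item remembers the largest timestamp that has read it and the largest that has written it, and an operation that arrives too late is rejected so that its transaction is rolled back. The class and method names are hypothetical.

```python
class TimestampOrdering:
    """Basic timestamp-ordering checks for reads and writes on data items."""

    def __init__(self):
        self.read_ts = {}    # item -> largest TS(T) that has read it
        self.write_ts = {}   # item -> largest TS(T) that has written it

    def read(self, ts, item):
        # A younger transaction already overwrote the value T should have seen.
        if ts < self.write_ts.get(item, 0):
            return "rollback"
        self.read_ts[item] = max(self.read_ts.get(item, 0), ts)
        return "ok"

    def write(self, ts, item):
        # A younger transaction has already read or written the item.
        if ts < self.read_ts.get(item, 0) or ts < self.write_ts.get(item, 0):
            return "rollback"
        self.write_ts[item] = ts
        return "ok"
```

For example, after a transaction with TS = 2 reads X, a write to X by a transaction with TS = 1 is rejected, because the older transaction is trying to change a value that a younger one has already read.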
45. Two phase locking in dbms.
Ans. Two Phase Locking:
A transaction is said to follow the Two-Phase Locking protocol if all of its
locking and unlocking is done in two phases.
• Growing Phase: New locks on data items may be acquired but none can
be released.
• Shrinking Phase: Existing locks may be released but no new locks can be
acquired.
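The two phases give a simple test: once a transaction has released any lock, it must not acquire another. A minimal sketch, assuming the lock and unlock requests of one transaction are given as a list in the order they were issued (the function name is hypothetical):

```python
def follows_two_phase_locking(operations):
    """operations: list of ("lock" | "unlock", item) issued by ONE transaction."""
    shrinking = False
    for action, _item in operations:
        if action == "unlock":
            shrinking = True          # the first unlock ends the growing phase
        elif action == "lock" and shrinking:
            return False              # acquiring after releasing breaks 2PL
    return True
```

A transaction that locks A and B and then unlocks both passes the test. One that unlocks A before locking B does not.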
46. Phantom phenomenon.
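Ans. The phantom phenomenon, also known as the phantom problem, is a
database transaction issue that occurs when the same query, run twice within one
transaction, returns different sets of rows because another transaction has
inserted or deleted rows that match the query condition in between.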
47. Lock based protocol.
Ans. In this type of protocol, a transaction cannot read or write data until it
acquires an appropriate lock on it. There are two types of lock:
1. Shared lock:
o It is also known as a read-only lock. In a shared lock, the data item can
only be read by the transaction.
o It can be shared between transactions because a transaction that holds a
shared lock can't update the data item.
2. Exclusive lock:
o In the exclusive lock, the data item can be both read and written by
the transaction.
o This lock is exclusive: only one transaction can hold it at a time, so multiple
transactions cannot modify the same data simultaneously.
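The compatibility rule behind the two lock types (many shared locks may coexist, an exclusive lock excludes everything else) can be sketched as a tiny lock table. LockTable is a hypothetical class for illustration. A real lock manager would make a refused transaction wait instead of just returning False.

```python
class LockTable:
    """Grant shared (S) and exclusive (X) locks according to the compatibility rule."""

    def __init__(self):
        self.holders = {}   # item -> {transaction: mode}

    def request(self, txn, item, mode):
        held = self.holders.setdefault(item, {})
        others = {t: m for t, m in held.items() if t != txn}
        if mode == "S":
            compatible = all(m == "S" for m in others.values())
        else:
            # "X" is compatible with nothing held by another transaction.
            compatible = not others
        if not compatible:
            return False
        if held.get(txn) != "X":    # keep an X lock already held, never downgrade it
            held[txn] = mode
        return True

    def release(self, txn, item):
        self.holders.get(item, {}).pop(txn, None)
```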
48. What is meant by serializability?
Ans. Serializability is a property of a schedule of concurrent transactions: a
schedule is serializable if its effect on the database is the same as that of some
serial execution of the same transactions, one after another. It's a vital
concept in multi-user environments where concurrent transactions can lead to
inconsistencies and conflicts.
49. What is a transaction in DBMS?
Ans. A transaction in a database management system (DBMS) is a series of
operations that are performed to complete a logical task.
50. What is concurrency control?
Ans. Concurrency control is a set of techniques that manage simultaneous
operations on a database or program to prevent conflicts and maintain data
integrity.
51. What is a conflict in transaction scheduling?
Ans. A conflict in transaction scheduling occurs when two operations in a
schedule belong to different transactions, involve the same data item, and at
least one of them is a write operation.
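The definition translates directly into a check on pairs of operations. In this sketch each operation is written as a (transaction, action, item) triple with action "R" or "W". The representation and the sample schedule are assumptions made for the example.

```python
def conflicts(op1, op2):
    """Each op is (transaction, action, item) with action "R" or "W"."""
    t1, a1, x1 = op1
    t2, a2, x2 = op2
    return t1 != t2 and x1 == x2 and "W" in (a1, a2)

def conflicting_pairs(schedule):
    """All (earlier, later) pairs of conflicting operations in a schedule."""
    return [
        (schedule[i], schedule[j])
        for i in range(len(schedule))
        for j in range(i + 1, len(schedule))
        if conflicts(schedule[i], schedule[j])
    ]

# A made-up schedule: T1 reads A, T2 writes A, T1 writes A, T2 reads B.
schedule = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A"), ("T2", "R", "B")]
```

Two reads of the same item never conflict, and operations of the same transaction are never counted as a conflict.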
52. Define a lock in DBMS
Ans. A lock is a mechanism that controls concurrent access to a data item, such
as a row or a table, so that transactions cannot use it in incompatible ways at the
same time. Locks are used to ensure data integrity and consistency.
53. What is a checkpoint?
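Ans. A checkpoint is a point in time at which the database is brought to a known
consistent state on disk: the log records and modified data held in memory are
written out, so that after a failure recovery can start from the checkpoint
instead of from the beginning of the log.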
54. Explain any two methods for deadlock handling.
• Ans. Deadlock prevention
This method ensures that the system never enters a deadlock. It involves
granting locks to a transaction only in a way that cannot lead to a deadlock, for
example by requiring each transaction to acquire all of its locks before it starts.
• Lock timeouts
This method prevents transactions from waiting indefinitely for locks: a
transaction that waits longer than a set time is rolled back. However, it's
important to handle the errors and rollbacks that occur when a transaction
times out.
55. Differentiate between shared and exclusive locks.
Ans.
Shared Lock | Exclusive Lock
Lock mode is read-only operation. | Lock mode is read as well as write operation.
A shared lock can be placed on objects that do not have an exclusive lock already placed on them. | An exclusive lock can only be placed on objects that do not have any other kind of lock.
Prevents others from updating the data. | Prevents others from reading or updating the data.
Issued when a transaction wants to read an item that does not have an exclusive lock. | Issued when a transaction wants to update an unlocked item.
Any number of transactions can hold a shared lock on an item. | An exclusive lock can be held by only one transaction.
An S-lock is requested using the lock-S instruction. | An X-lock is requested using the lock-X instruction.
Example: Multiple transactions reading the same data | Example: Transaction updating a table row
56. Discuss the difference between lossless and lossy decomposition with
examples.
Ans.
Lossless Decomposition | Lossy Decomposition
Joining the decomposed relations gives back exactly the original relation. | Joining the decomposed relations gives extra (spurious) tuples, so the original relation cannot be recovered.
For a decomposition of R into R1 and R2, R1 ⋈ R2 = R. | For a decomposition of R into R1 and R2, R1 ⋈ R2 contains tuples that are not in R.
It is guaranteed when the common attributes of R1 and R2 form a key of at least one of them. | It can happen when the common attributes of R1 and R2 are not a key of either relation.
Example: R(emp, dept, city) with dept → city, decomposed into R1(emp, dept) and R2(dept, city). | Example: the same R decomposed into R1(emp, city) and R2(dept, city).
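In the database sense asked by the question, a decomposition of a relation is lossless when the natural join of the decomposed relations gives back exactly the original relation, and lossy when the join produces spurious tuples. This can be checked on sample data by projecting and re-joining. The relation R(emp, dept, city), its rows, and the helper names below are made up for illustration.

```python
def row(**values):
    """Build one hashable tuple of (attribute, value) pairs."""
    return tuple(sorted(values.items()))

def project(rows, attrs):
    """Projection with duplicate elimination."""
    return {tuple((a, dict(r)[a]) for a in attrs) for r in rows}

def natural_join(left, right):
    """Natural join of two sets of rows on their shared attribute names."""
    joined = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            if all(ld[a] == rd[a] for a in ld.keys() & rd.keys()):
                joined.add(tuple(sorted({**ld, **rd}.items())))
    return joined

def is_lossless(rows, attrs1, attrs2):
    """True if joining the two projections gives back exactly the original rows."""
    original = {tuple(sorted(dict(r).items())) for r in rows}
    return natural_join(project(rows, attrs1), project(rows, attrs2)) == original

# Hypothetical relation R(emp, dept, city), where dept determines city.
R = {
    row(emp="Asha", dept="Sales", city="Pune"),
    row(emp="Ravi", dept="HR", city="Pune"),
}
```

Splitting R into (emp, dept) and (dept, city) is lossless, because the shared attribute dept is a key of the second part. Splitting it into (emp, city) and (dept, city) is lossy: joining on city pairs every employee with every department and yields four tuples instead of two.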