Database Systems
▪ A DBMS contains information about a particular enterprise:
• Collection of interrelated data
• Set of programs to access the data
• An environment that is both convenient and efficient to use
▪ Database systems are used to manage collections of data that are:
• Highly valuable
• Relatively large
• Accessed by multiple users and applications, often at the same time.
▪ A modern database system is a complex software system whose task is to manage a
large, complex collection of data.
▪ Databases touch all aspects of our lives.
Database Applications Examples
▪ Enterprise Information
• Sales: customers, products, purchases
• Accounting: payments, receipts, assets
• Human Resources: information about employees, salaries, payroll taxes
▪ Manufacturing: management of production, inventory, orders, supply chain.
▪ Banking and finance
• customer information, accounts, loans, and banking transactions.
• Credit card transactions
• Finance: sales and purchases of financial instruments (e.g., stocks and bonds);
storing real-time market data
▪ Universities: registration, grades
▪ Airlines: reservations, schedules
▪ Telecommunication: records of calls, texts, and data usage, generating monthly bills,
maintaining balances on prepaid calling cards
▪ Web-based services
• Online retailers: order tracking, customized recommendations
• Online advertisements
▪ Document databases
▪ Navigation systems: maintaining the locations of various places of interest along
with the exact routes of roads, train systems, buses, etc.
Purpose of Database Systems
In the early days, database applications were built directly on top of file systems, which led
to:
▪ Data redundancy and inconsistency: data is stored in multiple file formats, resulting
in duplication of information in different files
▪ Difficulty in accessing data
• Need to write a new program to carry out each new task
▪ Data isolation
• Multiple files and formats
▪ Integrity problems
• Integrity constraints (e.g., account balance > 0) become “buried” in program
code rather than being stated explicitly
• Hard to add new constraints or change existing ones
▪ Atomicity of updates
• Failures may leave database in an inconsistent state with partial updates
carried out
• Example: Transfer of funds from one account to another should either
complete or not happen at all
▪ Concurrent access by multiple users
• Concurrent access needed for performance
• Uncontrolled concurrent accesses can lead to inconsistencies
▪ Ex: Two people reading a balance (say 100) and updating it by
withdrawing money (say 50 each) at the same time; both may write
back 50, although the correct final balance is 0
▪ Security problems
• Hard to provide user access to some, but not all, data
Database systems offer solutions to all of the above problems.
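As a minimal sketch of how a database system addresses the integrity and atomicity problems listed above, the following uses Python's built-in sqlite3 module. The account table, its two rows, and the helper names make_bank, transfer, and balances are hypothetical. The constraint is written as balance >= 0, an assumed variant of the rule mentioned above, so that an account may be emptied.

```python
import sqlite3

def make_bank():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    # The integrity constraint is stated once, in the schema, instead of
    # being buried in application code.
    conn.execute(
        "CREATE TABLE account ("
        " id TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)",
                     [("A", 100), ("B", 20)])
    conn.commit()
    return conn

def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit; the credit is undone too.
        return False

def balances(conn):
    """Return {account id: balance} for all accounts."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))
```

In this sketch the credit is applied before the debit on purpose: when the debit violates the constraint, the rollback also undoes the credit that had already been applied, so no partial update survives.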
University Database Example
▪ In this text we will be using a university database to illustrate all the concepts
▪ Data consists of information about:
• Students
• Instructors
• Classes
▪ Application program examples:
• Add new students, instructors, and courses
• Register students for courses, and generate class rosters
• Assign grades to students, compute grade point averages (GPA) and generate
transcripts
View of Data
▪ A database system is a collection of interrelated data and a set of programs that allow
users to access and modify these data.
▪ A major purpose of a database system is to provide users with an abstract view of the
data.
• Data models
▪ A collection of conceptual tools for describing data, data relationships,
data semantics, and consistency constraints.
• Data abstraction
▪ Hide the complexity of the data structures used to represent data in the
database from users through several levels of data abstraction.
Data Models
▪ A collection of tools for describing
• Data
• Data relationships
• Data semantics
• Data constraints
▪ Relational model
▪ Entity-Relationship data model (mainly for database design)
▪ Object-based data models (Object-oriented and Object-relational)
▪ Semi-structured data model (XML)
▪ Other older models:
• Network model
• Hierarchical model
Relational Model
▪ All the data is stored in various tables.
▪ Example of tabular data in the relational model
A Sample Relational Database
Levels of Abstraction
▪ Physical level: describes how a record (e.g., instructor) is stored.
▪ Logical level: describes data stored in database, and the relationships among the data.
    type instructor = record
        ID : string;
        name : string;
        dept_name : string;
        salary : integer;
    end;
▪ View level: application programs hide details of data types. Views can also hide
information (such as an employee’s salary) for security purposes.
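The view level can be illustrated with a small sketch, again assuming Python's built-in sqlite3 module. The instructor columns follow the record above; the view name instructor_public, the second row, and the salary figures are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the instructor relation with all of its attributes.
conn.execute(
    "CREATE TABLE instructor ("
    " ID TEXT PRIMARY KEY, name TEXT, dept_name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?, ?)",
    [("22222", "Einstein", "Physics", 95000),      # illustrative values
     ("10101", "Srinivasan", "Comp. Sci.", 65000)],
)

# View level: expose every attribute except salary.
conn.execute(
    "CREATE VIEW instructor_public AS "
    "SELECT ID, name, dept_name FROM instructor"
)

cur = conn.execute("SELECT * FROM instructor_public ORDER BY ID")
view_columns = [d[0] for d in cur.description]
view_rows = cur.fetchall()
```

An application that is given access only to instructor_public can list instructors without ever seeing the salary attribute.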
[Figures: An architecture for a database system (query processor); query processing]
Database Architecture
▪ Centralized databases
• One to a few cores, shared memory
▪ Client-server
• One server machine executes work on behalf of multiple client machines.
▪ Parallel databases
• Many-core shared memory
• Shared disk
• Shared nothing
▪ Distributed databases
• Geographical distribution
• Schema/data heterogeneity
Database Architecture (Centralized/Shared-Memory)
Database Applications
Database applications are usually partitioned into two or three parts:
▪ Two-tier architecture -- the application resides at the client machine, where it invokes
database system functionality at the server machine
▪ Three-tier architecture -- the client machine acts as a front end and does not contain
any direct database calls.
• The client end communicates with an application server, usually through a
forms interface.
• The application server in turn communicates with a database system to access
data.
Two-tier and three-tier architectures
Database Users
Database Administrator
A person who has central control over the system is called a database administrator (DBA).
Functions of a DBA include:
▪ Schema definition
▪ Storage structure and access-method definition
▪ Schema and physical-organization modification
▪ Granting of authorization for data access
▪ Routine maintenance
▪ Periodically backing up the database
▪ Ensuring that enough free disk space is available for normal operations, and
upgrading disk space as required
▪ Monitoring jobs running on the database
History of Database Systems
▪ 1950s and early 1960s:
• Data processing using magnetic tapes for storage
▪ Tapes provided only sequential access
• Punched cards for input
▪ Late 1960s and 1970s:
• Hard disks allowed direct access to data
• Network and hierarchical data models in widespread use
• Ted Codd defines the relational data model
▪ Would win the ACM Turing Award for this work
▪ IBM Research begins System R prototype
▪ UC Berkeley (Michael Stonebraker) begins Ingres prototype
▪ Oracle releases first commercial relational database
• High-performance (for the era) transaction processing
▪ 1980s:
• Research relational prototypes evolve into commercial systems
▪ SQL becomes an industry standard
• Parallel and distributed database systems
▪ Wisconsin, IBM, Teradata
• Object-oriented database systems
▪ 1990s:
• Large decision support and data-mining applications
• Large multi-terabyte data warehouses
• Emergence of Web commerce
▪ 2000s:
• Big data storage systems
▪ Google Bigtable, Yahoo PNUTS, Amazon
▪ “NoSQL” systems
• Big data analysis: beyond SQL
▪ MapReduce and friends
▪ 2010s:
• SQL reloaded
▪ SQL front end to MapReduce systems
▪ Massively parallel database systems
▪ Multi-core main-memory databases
Domain Types in SQL
▪ char(n). Fixed length character string, with user-specified length n.
▪ varchar(n). Variable length character strings, with user-specified maximum length
n.
▪ int. Integer (a finite subset of the integers that is machine-dependent).
▪ smallint. Small integer (a machine-dependent subset of the integer domain type).
▪ numeric(p,d). Fixed point number, with user-specified precision of p digits, with d
digits to the right of the decimal point. (E.g., numeric(3,1) allows 44.5 to be stored
exactly, but not 444.5 or 0.32.)
▪ real, double precision. Floating point and double-precision floating point numbers,
with machine-dependent precision.
▪ float(n). Floating point number, with user-specified precision of at least n digits.
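The numeric(p,d) rule can be checked mechanically. The helper below is a hypothetical sketch (fits_numeric is not part of SQL or of any library) that uses Python's decimal module to test whether a decimal literal fits in numeric(p,d) without rounding.

```python
from decimal import Decimal

def fits_numeric(value: str, p: int, d: int) -> bool:
    """Return True if the decimal literal `value` can be stored exactly
    as numeric(p, d): at most d digits after the decimal point and at
    most p - d digits before it."""
    dec = Decimal(value)
    if not dec.is_finite():
        return False
    dec = dec.normalize()          # drop trailing zeros, e.g. 44.50 -> 44.5
    if dec.is_zero():
        return True
    _sign, digits, exponent = dec.as_tuple()
    frac_digits = max(0, -exponent)
    int_digits = max(0, len(digits) + exponent)
    return frac_digits <= d and int_digits <= p - d

examples = {v: fits_numeric(v, 3, 1) for v in ("44.5", "444.5", "0.32")}
```

With p = 3 and d = 1, the value 44.5 fits, 444.5 has too many digits before the decimal point, and 0.32 has too many after it, matching the example in the text.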
▪ Find the names of all instructors who have taught some course and the course_id
• select name, course_id
  from instructor, teaches
  where instructor.ID = teaches.ID;
▪ Find the names of all instructors in the Art department who have taught some course
and the course_id
• select name, course_id
  from instructor, teaches
  where instructor.ID = teaches.ID
  and instructor.dept_name = 'Art';
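The two join queries above can be tried on a toy instance. This is only a sketch using Python's built-in sqlite3 module; the three instructors and two teaches rows are invented, and the tables carry only the columns the queries need.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table instructor (ID text primary key, name text, dept_name text);
    create table teaches (ID text, course_id text);
    insert into instructor values
        ('1', 'Ada', 'Art'), ('2', 'Bo', 'Physics'), ('3', 'Cy', 'Art');
    insert into teaches values ('1', 'ART-101'), ('2', 'PHY-101');
""")

# Instructors who have taught some course, with the course_id.
all_taught = conn.execute("""
    select name, course_id
    from instructor, teaches
    where instructor.ID = teaches.ID
    order by name
""").fetchall()

# The same, restricted to the Art department.
art_taught = conn.execute("""
    select name, course_id
    from instructor, teaches
    where instructor.ID = teaches.ID
      and instructor.dept_name = 'Art'
    order by name
""").fetchall()
```

Instructor Cy belongs to the Art department but has no matching teaches row, so Cy appears in neither result: the join condition keeps only instructors who have taught some course.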
Aggregate Functions
▪ These functions operate on the multiset of values of a column of a relation, and return
a value
• avg: average value
• min: minimum value
• max: maximum value
• sum: sum of values
• count: number of values
▪ Find the average salary of instructors in the Computer Science department
• select avg (salary)
  from instructor
  where dept_name = 'Comp. Sci.';
▪ Find the total number of instructors who teach a course in the Spring 2018 semester
• select count (distinct ID)
  from teaches
  where semester = 'Spring' and year = 2018;
▪ Find the number of tuples in the course relation
• select count (*)
  from course;
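A short sketch of the three aggregate queries above, run with Python's built-in sqlite3 module on invented data (the rows and the reduced column lists are assumptions made for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table instructor (ID text, name text, dept_name text, salary integer);
    create table teaches (ID text, course_id text, semester text, year integer);
    create table course (course_id text, title text);
    insert into instructor values
        ('1', 'Ada', 'Comp. Sci.', 60000),
        ('2', 'Bo',  'Comp. Sci.', 80000),
        ('3', 'Cy',  'Physics',    90000);
    insert into teaches values
        ('1', 'CS-101', 'Spring', 2018),
        ('1', 'CS-102', 'Spring', 2018),
        ('2', 'CS-101', 'Fall',   2017);
    insert into course values ('CS-101', 'Intro'), ('CS-102', 'Systems');
""")

avg_salary = conn.execute(
    "select avg(salary) from instructor where dept_name = 'Comp. Sci.'"
).fetchone()[0]

# distinct matters: instructor '1' teaches two Spring 2018 courses
# but must be counted once.
spring_2018_instructors = conn.execute(
    "select count(distinct ID) from teaches "
    "where semester = 'Spring' and year = 2018"
).fetchone()[0]

course_count = conn.execute("select count(*) from course").fetchone()[0]
```

Without distinct, the second query would count teaches rows rather than instructors and return 2 on this data.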
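▪ Find the names of instructors whose salary is greater than that of some (at least one)
instructor in the Biology department
select name
from instructor
where salary > some (select salary
                     from instructor
                     where dept_name = 'Biology');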
▪ F <comp> some r ⇔ ∃ t ∈ r such that (F <comp> t)
Where <comp> can be: <, ≤, >, ≥, =, ≠
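The definition of some translates directly into an existential test. The following is a sketch in plain Python, not SQL; the function name compare_some is hypothetical, and SQL null values are not modeled.

```python
import operator

def compare_some(f, comp, r):
    """True iff there exists t in r such that comp(f, t) holds.
    comp is a binary comparison such as operator.lt or operator.eq."""
    return any(comp(f, t) for t in r)

results = [
    compare_some(5, operator.lt, [0, 5, 6]),  # 5 < 6 holds for one t
    compare_some(5, operator.lt, [0, 5]),     # no t with 5 < t
    compare_some(5, operator.eq, [0, 5]),     # 5 = 5
    compare_some(5, operator.ne, [0, 5]),     # 5 ≠ 0
]
```

Note that (≠ some) is not the same as "not in": in the last case 5 is in the collection, yet the test succeeds because some other element differs from 5.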
▪ Yet another way of specifying the query “Find all courses taught in both the Fall 2017
semester and in the Spring 2018 semester”
select course_id
from section as S
where semester = 'Fall' and year = 2017 and
      exists (select *
              from section as T
              where semester = 'Spring' and year = 2018
                    and S.course_id = T.course_id);
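The correlated exists query above can be run as written on a small instance. This sketch assumes Python's built-in sqlite3 module; the section rows are invented and the table has only the three columns the query uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table section (course_id text, semester text, year integer);
    insert into section values
        ('CS-101', 'Fall',   2017),
        ('CS-101', 'Spring', 2018),
        ('CS-315', 'Spring', 2018),
        ('CS-347', 'Fall',   2017);
""")

# For each Fall 2017 section S, the subquery looks for a Spring 2018
# section T of the same course; S is kept only if such a T exists.
both_semesters = conn.execute("""
    select course_id
    from section as S
    where semester = 'Fall' and year = 2017 and
          exists (select *
                  from section as T
                  where semester = 'Spring' and year = 2018
                        and S.course_id = T.course_id)
""").fetchall()
```

Only CS-101 is offered in both semesters in this data, so it is the only course returned.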
Entity Sets
▪ An entity is an object that exists and is distinguishable from other objects.
• Example: specific person, company, event, plant
▪ An entity set is a set of entities of the same type that share the same properties.
• Example: set of all persons, companies, trees, holidays
▪ An entity is represented by a set of attributes; i.e., descriptive properties possessed by
all members of an entity set.
• Example:
instructor = (ID, name, salary )
course= (course_id, title, credits)
▪ A subset of the attributes forms a primary key of the entity set; i.e., it uniquely
identifies each member of the set.
Relationship Sets
▪ A relationship is an association among several entities
• Example:
    44553 (Peltier)     advisor             22222 (Einstein)
    student entity      relationship set    instructor entity
▪ A relationship set is a mathematical relation among n ≥ 2 entities, each taken from an
entity set:
{(e1, e2, …, en) | e1 ∈ E1, e2 ∈ E2, …, en ∈ En}
where (e1, e2, …, en) is a relationship.
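Read as set notation, a relationship set is simply a set of tuples whose i-th component comes from the i-th entity set. A minimal Python sketch follows, with entities identified by their IDs; the second student, the second instructor, and the helper name is_relationship_set are hypothetical.

```python
# Entity sets, keyed by primary key (ID -> name).
student = {"44553": "Peltier", "12345": "Shankar"}
instructor = {"22222": "Einstein", "45565": "Katz"}

# The advisor relationship set: (student ID, instructor ID) pairs.
advisor = {("44553", "22222")}

def is_relationship_set(rel, *entity_sets):
    """Check that every tuple in rel has exactly one component from each
    entity set, in order, i.e. rel is a subset of E1 x E2 x ... x En."""
    return all(
        len(t) == len(entity_sets)
        and all(e in E for e, E in zip(t, entity_sets))
        for t in rel
    )

valid = is_relationship_set(advisor, student, instructor)
swapped = is_relationship_set({("22222", "44553")}, student, instructor)
```

Here valid is True, while swapped is False because the components appear in the wrong order: 22222 is not a student and 44553 is not an instructor.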
One-to-Many Relationship
▪ One-to-many relationship between an instructor and a student
• An instructor is associated with several (including 0) students via advisor
• A student is associated with at most one instructor via advisor
Notation for Expressing More Complex Constraints
▪ A line may have an associated minimum and maximum cardinality, shown in the form
l..h, where l is the minimum and h the maximum cardinality
• A minimum value of 1 indicates total participation.
• A maximum value of 1 indicates that the entity participates in at most one
relationship
• A maximum value of * indicates no limit.
▪ Example
• Instructor can advise 0 or more students. A student must have 1 advisor;
cannot have multiple advisors
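The l..h constraints in this example can be checked on a concrete relationship set. The sketch below is plain Python with invented entity names; satisfies_cardinality is a hypothetical helper, and high=None stands for *.

```python
from collections import Counter

def satisfies_cardinality(entities, participants, low, high=None):
    """Check that each entity occurs between low and high times in
    participants (high=None means no upper limit)."""
    counts = Counter(participants)
    return all(
        counts[e] >= low and (high is None or counts[e] <= high)
        for e in entities
    )

students = {"s1", "s2"}
instructors = {"i1", "i2"}
advisor = [("s1", "i1"), ("s2", "i1")]   # (student, instructor) pairs

# Each student must have exactly one advisor (1..1).
students_ok = satisfies_cardinality(students, [s for s, _ in advisor], 1, 1)
# Each instructor may advise any number of students (0..*).
instructors_ok = satisfies_cardinality(instructors, [i for _, i in advisor], 0)

# Giving s1 a second advisor violates the maximum of 1.
too_many = advisor + [("s1", "i2")]
students_violated = not satisfies_cardinality(
    students, [s for s, _ in too_many], 1, 1)
```

Instructor i2 advises nobody, which is allowed because the minimum for instructors is 0; a student with no advisor would fail the check because the minimum of 1 expresses total participation.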
Reduction to Relation Schemas
▪ Entity sets and relationship sets can be expressed uniformly as relation schemas that
represent the contents of the database.
▪ A database which conforms to an E-R diagram can be represented by a collection of
schemas.
▪ For each entity set and relationship set there is a unique schema that is assigned the
name of the corresponding entity set or relationship set.
▪ Each schema has a number of columns (generally corresponding to attributes), which
have unique names.
Redundancy of Schemas
▪ Many-to-one and one-to-many relationship sets that are total on the many-side can be
represented by adding an extra attribute to the “many” side, containing the primary
key of the “one” side
▪ Example: Instead of creating a schema for relationship set inst_dept, add an attribute
dept_name to the schema arising from entity set instructor
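The inst_dept case can be sketched as a schema in which instructor carries dept_name as a foreign key and no separate inst_dept table exists. The sketch uses Python's built-in sqlite3 module; the sample rows and the helper try_insert are hypothetical, and the not null constraint is how this sketch expresses total participation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    create table department (dept_name text primary key);
    create table instructor (
        ID        text primary key,
        name      text,
        -- replaces a separate inst_dept(ID, dept_name) table
        dept_name text not null references department(dept_name)
    );
    insert into department values ('Physics');
    insert into instructor values ('22222', 'Einstein', 'Physics');
""")

def try_insert(row):
    """Insert an instructor row; return False if a constraint rejects it."""
    try:
        with conn:
            conn.execute("insert into instructor values (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

ok = try_insert(("10101", "Srinivasan", "Physics"))
unknown_dept = try_insert(("33456", "Gold", "History"))  # no such department
missing_dept = try_insert(("76543", "Singh", None))      # violates not null
```

Because each instructor belongs to exactly one department, the single dept_name column holds everything the relationship set would have stored, which is why the separate schema is redundant.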
Specialization
▪ Top-down design process; we designate sub-groupings within an entity set that are
distinctive from other entities in the set.
▪ These sub-groupings become lower-level entity sets that have attributes or participate
in relationships that do not apply to the higher-level entity set.
▪ Depicted by a triangle component labeled ISA (e.g., instructor “is a” person).
▪ Attribute inheritance – a lower-level entity set inherits all the attributes and
relationship participation of the higher-level entity set to which it is linked.
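Attribute inheritance has a loose analogue in class inheritance. The sketch below uses Python dataclasses purely as an analogy; the split of attributes between Person and Instructor is an assumption made for illustration.

```python
from dataclasses import dataclass, fields

@dataclass
class Person:               # higher-level entity set
    ID: str
    name: str

@dataclass
class Instructor(Person):   # lower-level entity set: instructor "is a" person
    salary: int             # attribute that applies only to instructors

# The lower-level entity set has the inherited attributes plus its own.
attribute_names = [f.name for f in fields(Instructor)]

einstein = Instructor(ID="22222", name="Einstein", salary=95000)  # salary is illustrative
is_person = isinstance(einstein, Person)
```

The attribute list of Instructor begins with the attributes inherited from Person, followed by salary, mirroring how a lower-level entity set inherits from the higher-level one.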
Conclusion
In conclusion, databases play a critical role in the efficient storage, management, and retrieval
of data across various domains. By enabling structured data organization, ensuring data
integrity, and supporting concurrent access, databases serve as the backbone of modern
information systems. With the evolution of technology, database systems have become more
robust, scalable, and adaptable, meeting the growing demands of big data, real-time
processing, and cloud integration. A solid understanding of database concepts and
architecture is essential for designing efficient systems that can support informed
decision-making and ensure long-term data sustainability.