08pdf Physical Optim

The document discusses physical data structures and query optimization in databases. It covers topics like main memory vs secondary memory, database management systems, access methods, sequential structures like entry-sequenced and array structures, hash-based structures, and tree-based indexes. The goal is to understand how databases are implemented and organized at a low level to provide efficient data access and querying.

Advanced Databases

8 Physical data structures and query optimization

Physical data structures and query optimization

Why study the internals of DB technology?

DBMSs provide transparent services:
- So transparent that, so far, we could ignore many implementation details!
- So far, DBMSs have always been treated as a black box
So why should we open the box?
- Knowing how it works may help us to use it better
- Some services are provided separately

DataBase Management System (DBMS)

A system (software product) capable of managing data collections which are:
- large: (much) larger than the central memory available on the computers that run the software
- persistent: with a lifetime which is independent of single executions of the programs that access them
- shared: in use by several applications at a time
while guaranteeing reliability (i.e. tolerance to hardware and software failures) and privacy (by disciplining and controlling all accesses).

Access and query manager

[Figure: layered architecture, from top to bottom: SQL, Query manager, Access methods manager, Buffer manager, Secondary memory manager, Secondary memory]

Technology of DBMSs - topics

- Query management ("optimization")
- Physical data structures and access structures
- Buffer and secondary memory management
- Reliability control
- Concurrency control
- Distributed architectures

Main and Secondary memory (1)

- Programs can only refer to data stored in main memory
- Databases must be stored (mainly) in secondary memory for two reasons:
  - size
  - persistence
- Data stored in secondary memory can only be used if first transferred to main memory (which explains the terms "main" and "secondary")

Main and Secondary memory (2)

- Secondary memory devices are organized in blocks of (usually) fixed length (order of magnitude: a few KB)
- The only available operations for such devices are reading and writing one page, i.e. the byte stream corresponding to a block
- For convenience and simplicity, we will use block and page as synonyms

Main and Secondary memory (3)

Secondary memory access:
- seek time (8-12 ms): head positioning
- latency time (2-8 ms): disk rotation
- transfer time (~1 ms): data transfer
On average, hardly less than 10 ms overall
- The cost of an access to secondary memory is 4 orders of magnitude higher than that of an access to main memory
- In "I/O bound" applications the cost depends exclusively on the number of accesses to secondary memory
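As a rough illustration of the access-time figures, the sketch below adds up the three components of one access and multiplies by the number of block accesses. The function names and the 10,000-block example are assumptions made for illustration; only the millisecond ranges come from the slide.

```python
# Rough cost model for an "I/O bound" operation.
# Assumption: every block access pays seek + latency + transfer,
# and CPU time is ignored.

def access_time_ms(seek_ms, latency_ms, transfer_ms):
    """Time for one block access: the three components add up."""
    return seek_ms + latency_ms + transfer_ms


def io_bound_cost_s(num_block_accesses, per_access_ms):
    """Total cost in seconds, counting only secondary memory accesses."""
    return num_block_accesses * per_access_ms / 1000


best_case = access_time_ms(8, 2, 1)     # lower ends of the ranges
worst_case = access_time_ms(12, 8, 1)   # upper ends of the ranges
scan_cost = io_bound_cost_s(10_000, best_case)  # hypothetical 10,000 accesses

print(best_case, worst_case, scan_cost)
```

Even with the most favourable figures a single access stays above 10 ms, so reading 10,000 blocks one at a time takes on the order of minutes.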

Main and Secondary memory (4)

New storage technologies:
- SSD
  - Only transfer time
  - Different model, (surprisingly, apparently) similar performance (for DB-like loads)
  - Continuous writes stress the erasing process (that is the weak link)
- Other approaches
  - Main memory databases: again, different cost models
  - Efficient when read operations significantly exceed writes

DBMS and file system (1)

- The File System (FS) is the component of the operating system which manages access to secondary memory
- DBMSs make limited use of FS functionalities: to create and delete files, and to read and write single blocks or sequences of consecutive blocks
- The DBMS directly manages the file organization, both in terms of the distribution of records within blocks and with respect to the internal structure of each block

DBMS and file system (2)

- The DBMS manages the blocks of allocated files as if they were a single large space in secondary memory
- It builds in such space the physical structures with which tables are implemented
- A file is typically dedicated to a single table, but it may happen that a file contains data belonging to more than one table, and that the tuples of one table are split across more than one file

Blocks and records

Blocks (the "physical" components of a file) and records (the "logical" components) generally have different sizes:
- The size of a block depends on the file system
- The size of a record depends on the needs of applications and is normally variable within a file

Block Factor

The number of records within a block
- SR: size of a record (assumed constant in the file for simplicity: "fixed length record")
- SB: size of a block
- If SB > SR, there may be many records in each block:

  block factor = ⌊SB / SR⌋

The rest of the space can be:
- used ("spanned" records, or "hung-up" records)
- not used ("unspanned" records)
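A small worked example may help here. The sketch below computes the block factor and the number of blocks a file needs with unspanned and with spanned records. The sizes (4096-byte blocks, 300-byte records, 1000 records) and the function names are assumptions chosen for illustration.

```python
import math


def block_factor(block_size, record_size):
    """Whole fixed-length records that fit in one block: floor(SB / SR)."""
    return block_size // record_size


def blocks_needed(num_records, block_size, record_size, spanned=False):
    """Blocks required to store the file.

    Unspanned: the leftover bytes of each block stay unused.
    Spanned: records may cross block boundaries, so no space is lost.
    """
    if spanned:
        return math.ceil(num_records * record_size / block_size)
    return math.ceil(num_records / block_factor(block_size, record_size))


SB, SR = 4096, 300                 # hypothetical sizes in bytes
bf = block_factor(SB, SR)
unused_per_block = SB - bf * SR    # space left over in an unspanned block
unspanned_blocks = blocks_needed(1000, SB, SR)
spanned_blocks = blocks_needed(1000, SB, SR, spanned=True)

print(bf, unused_per_block, unspanned_blocks, spanned_blocks)
```

With these numbers each unspanned block leaves 196 bytes unused, which is why the spanned layout needs a few blocks fewer.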

Physical access structures

- Used for the efficient storage and manipulation of data within the DBMS
- Encoded as access methods, that is, software modules providing data access and manipulation primitives for each physical access structure
- Each DBMS has a distinctive and limited set of access methods
- We will consider three types of data access structures:
  - Sequential
  - Hash-based
  - Tree-based (or index-based)

Organization of tuples within pages

- Each access method has its own page organization
- In the case of sequential and hash-based methods each page has:
  - An initial part (block header) and a final part (block trailer) containing control information used by the file system
  - An initial part (page header) and a final part (page trailer) containing control information about the access method
  - A page dictionary, which contains pointers to each item of useful elementary data contained in the page
  - A useful part, which contains the data. In general, the page dictionary and the useful data grow as opposing stacks
  - A checksum, to detect corrupted data
- Tree structures have a different page organization

Organization of tuples within pages

[Figure: layout of a page. Block header and block trailer: control information used by the file system. Page header and page trailer: control information about the access method. Page dictionary with the pointers *t1, *t2, *t3; useful part of the page with tuple t1, tuple t2, tuple t3; checksum. The page dictionary and the tuples are drawn as two stacks growing towards each other.]

Page manager primitives

- Insertion and update of a tuple
  - may require a reorganization of the page (if there is enough space to store the extra bytes)
  - or even the use of a new page (if there is not enough space)
- Deletion of a tuple
  - often carried out by marking the tuple as invalid
- Access to a field of a particular tuple
  - after identifying the tuple by means of its key or its offset, the field is identified according to the offset and the length of the field itself
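The primitives above can be sketched with a toy page in which the dictionary and the tuple data grow as opposing stacks. This is only an illustration under stated assumptions: the class name `Page`, the 4-byte cost of a dictionary entry and the 64-byte page size are hypothetical, and headers, trailers and checksum are left out.

```python
class Page:
    """Toy page: dictionary entries grow from the front, tuples from the back."""

    def __init__(self, size=64, slot_bytes=4):
        self.data = bytearray(size)
        self.slot_bytes = slot_bytes   # assumed size of one dictionary entry
        self.slots = []                # page dictionary: [offset, length, valid]
        self.free_end = size           # tuple stack grows downwards from here

    def free_space(self):
        return self.free_end - len(self.slots) * self.slot_bytes

    def insert(self, payload):
        """Return the slot number, or None when a new page would be needed."""
        if len(payload) + self.slot_bytes > self.free_space():
            return None
        self.free_end -= len(payload)
        self.data[self.free_end:self.free_end + len(payload)] = payload
        self.slots.append([self.free_end, len(payload), True])
        return len(self.slots) - 1

    def delete(self, slot):
        """Deletion only marks the tuple as invalid; space is not reclaimed."""
        self.slots[slot][2] = False

    def read(self, slot):
        offset, length, valid = self.slots[slot]
        return bytes(self.data[offset:offset + length]) if valid else None

    def read_field(self, slot, field_offset, field_length):
        """A field is located by its offset and length inside the tuple."""
        tuple_bytes = self.read(slot)
        if tuple_bytes is None:
            return None
        return tuple_bytes[field_offset:field_offset + field_length]


page = Page()
s0 = page.insert(b"0001paolo")
s1 = page.insert(b"0002mauro")
name = page.read_field(s1, 4, 5)      # bytes 4..8 of the second tuple
page.delete(s0)
deleted = page.read(s0)
too_big = page.insert(bytes(60))      # does not fit: a new page is needed
print(name, deleted, too_big)
```

A real page manager would also compact the page when deleted tuples leave holes; that reorganization step is omitted here.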

Sequential structures

- Characterized by a sequential arrangement of tuples in the secondary memory
- Three cases: entry-sequenced, array, sequentially-ordered
  - In an entry-sequenced organization, the sequence of the tuples is dictated by their order of entry
  - In an array organization, the tuples (all of the same size) are arranged as in an array, and their positions depend on the values of an index (or indexes)
  - In a sequentially-ordered organization, the position of each tuple in the sequence depends on the value of a key field, which induces the ordering

Entry-sequenced sequential structure

- Optimal for carrying out sequential reading and writing operations
- Optimal for space occupancy, as it uses all the blocks available for files and all the space within the blocks
- Not optimal with respect to:
  - searching for specific data units
  - updates that increase the size of a tuple

Array sequential structure

- Possible only when the tuples are of fixed length
- Made of n adjacent blocks, each block with m slots available to store m tuples
- Each tuple has a numeric index i and is placed in the i-th position of the array
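Because tuples have a fixed length, the position of tuple i can be computed directly. The sketch below assumes a 0-based index and hypothetical sizes (m = 10 slots per block, 4096-byte blocks, 400-byte tuples); the function names are illustrative.

```python
def locate(i, slots_per_block):
    """Map tuple index i (0-based, an assumption) to (block number, slot)."""
    return divmod(i, slots_per_block)


def byte_offset(i, slots_per_block, block_size, tuple_size):
    """Byte position of tuple i from the start of the n adjacent blocks."""
    block, slot = locate(i, slots_per_block)
    return block * block_size + slot * tuple_size


position = locate(23, 10)
offset = byte_offset(23, 10, 4096, 400)
print(position, offset)
```

One block read is therefore enough to reach any tuple once its index is known.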

Sequentially-ordered sequential structure

- Each tuple has a position based on the value of the key field
- Historically, such structures were used on sequential devices (tapes). This has fallen out of use, except for data streams and system logs
- The main problems are insertions or updates which increase the physical space: they require reordering techniques for the tuples already present
- Options to avoid global reorderings:
  - Differential files (example: yellow pages)
  - Leaving a certain number of slots free at the time of first loading, followed by local reordering operations
  - Integrating the sequentially ordered files with an overflow file, where new tuples are inserted into blocks linked to form an overflow chain

Hash-based access structures

- Ensure efficient associative access to data, based on the value of a key field
- A hash-based structure has B blocks (often adjacent)
- A hash algorithm is applied to the key field and returns a value between zero and B-1. This value is interpreted as the position of the block in the file, and used both for reading and writing the block
- This is the most efficient technique for queries with equality predicates, but it is rather inefficient for queries with interval predicates

Features of hash-based structures

- Primitive interface: hash(fileId, Key): BlockId
- The implementation consists of two parts:
  - folding transforms the key values so that they become positive integer values, uniformly distributed over a large range
  - hashing transforms the positive binary number into a number between zero and B - 1
- Optimal performance if the file is larger than necessary. Let:
  - T be the number of tuples expected for the file
  - F be the average number of tuples stored in each page
  then a good choice for B is T / (0.8 x F), using only 80% of the available space
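The two-step scheme and the sizing rule can be sketched as follows. The polynomial fold over the key bytes is an illustrative choice, not the function of any particular DBMS; the names `fold`, `hash_block` and `choose_num_blocks` and the sample figures (T = 40,000, F = 10) are assumptions.

```python
import math


def fold(key):
    """Folding: turn a key into a non-negative integer over a large range.
    The polynomial fold used here is only an example."""
    value = 0
    for byte in key.encode("utf-8"):
        value = (value * 31 + byte) % (2 ** 32)
    return value


def hash_block(key, num_blocks):
    """Hashing: reduce the folded value to a block number in 0 .. B-1."""
    return fold(key) % num_blocks


def choose_num_blocks(expected_tuples, tuples_per_page, fill=0.8):
    """B = T / (fill x F), rounded up; fill = 0.8 leaves 20% of the space free."""
    return math.ceil(expected_tuples / (fill * tuples_per_page))


B = choose_num_blocks(40_000, 10)
block_id = hash_block("paolo", B)
print(B, block_id)
```

The same key always maps to the same block, which is what makes equality lookups cheap; nothing in the mapping preserves key order, which is why interval predicates gain nothing from it.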

Collisions

- Collisions occur when the same block number is associated with too many tuples. They are critical when the maximum number of tuples per block is exceeded
- Collisions are solved by adding an overflow chain
- This gives the additional cost of scanning the chain
- The average length of the overflow chain is a function of the ratio T/(F x B) and of the average number F of tuples per page:

  F \ T/(FxB)    .5       .6       .7       .8       .9
   1             0.5      0.75     1.167    2.0      4.495
   2             0.177    0.293    0.494    0.903    2.146
   3             0.087    0.158    0.286    0.554    1.377
   5             0.031    0.066    0.136    0.289    0.777
  10             0.005    0.015    0.042    0.110    0.345

An example

- 40 records
- hash table with 50 positions:
  - 1 collision of 4 values
  - 2 collisions of 3 values
  - 5 collisions of 2 values

  M        M mod 50      M        M mod 50
  60600     0            200268   18
  66301     1            205619   19
  205751    1            210522   22
  205802    2            205724   24
  200902    2            205977   27
  116202    2            205478   28
  200604    4            200430   30
  66005     5            210533   33
  116455    5            205887   37
  200205    5            200138   38
  201159    9            102338   38
  205610   10            102690   40
  201260   10            115541   41
  102360   10            206092   42
  205460   10            205693   43
  205912   12            205845   45
  205762   12            200296   46
  200464   14            205796   46
  205617   17            200498   48
  205667   17            206049   49
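The collision counts of this example can be checked mechanically. The short script below takes the 40 values of M from the slide, applies M mod 50 and counts how many positions receive 2, 3 or 4 values; only the variable names are assumptions.

```python
from collections import Counter

# The 40 key values M listed on the slide.
M = [
    60600, 66301, 205751, 205802, 200902, 116202, 200604, 66005, 116455,
    200205, 201159, 205610, 201260, 102360, 205460, 205912, 205762, 200464,
    205617, 205667, 200268, 205619, 210522, 205724, 205977, 205478, 200430,
    210533, 205887, 200138, 102338, 102690, 115541, 206092, 205693, 205845,
    200296, 205796, 200498, 206049,
]

# How many records land on each of the 50 positions.
per_position = Counter(m % 50 for m in M)

# How many positions hold 2, 3, 4, ... records (positions with 1 are no collision).
collision_sizes = Counter(n for n in per_position.values() if n > 1)

for size in sorted(collision_sizes, reverse=True):
    print(f"{collision_sizes[size]} collision(s) of {size} values")
```

Position 10 is the one that receives four values, positions 2 and 5 receive three each.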

About hashing

- Performs best for direct access based on equality for values of the key
- Collisions (overflow) are typically managed with blocks linked into an area called the overflow file
- Inefficient for access based on interval predicates or on the value of non-key attributes
- Hash files "degenerate" if the extra space is too small (the file should be at least 120% of the minimum required space) and if the file size changes a lot over time

Tree structures

- The most frequently used in relational DBMSs
  - SQL indexes are implemented in this way
- Give associative access based on the value of a key
  - no constraints on the physical location of the tuples
- Note: the primary key of the relational model and the keys for hash-based and tree structures are different concepts

Index file

- Index: an auxiliary structure for efficient access to the records of a file, based upon the values of a given field or set of fields called the index key
- The index concept: the analytic index of a book, seen as a list of pairs (term, page list), alphabetically ordered, at the end of the book
- The index key is not a primary key!

Tree structures

Each tree has:
- one root node
- several intermediate nodes
- several leaf nodes

Each node corresponds to a block

[Figure: a tree with a root node, a first level and a second level. Key values shown in the root node and first level: paolo, mauro, renzo. Key values shown in the second level: bice, dino, mauro, paolo, renzo, teresa, with pointers to tuples (arbitrarily organized).]

- The links between the nodes are established by pointers to mass memory
- In general, each node has a large number of descendants (fan-out), and therefore the majority of pages are leaf nodes
- In a balanced tree, the lengths of the paths from the root node to the leaf nodes are all equal. Balanced trees give optimal performance.

Structure of the tree nodes

Node layout:  P0 | K1 | P1 | ..... | Ki | Pi | ..... | KF | PF

- P0 points to the sub-tree with keys K < K1
- Pi points to the sub-tree with keys Ki ≤ K < Ki+1
- PF points to the sub-tree with keys K ≥ KF

Primary vs. secondary indexes

- Indexes can be used as a primary access structure:
  - Tuples are stored in the index nodes or in a file ordered according to the index key (also: clustered index)
  - Possibly a sparse index: with fewer index entries than the number of tuples of the file, as tuples are ordered in the file
- Indexes are (more) often used as secondary access structures:
  - Tuples are stored according to another structure (hashed, entry-sequenced, another index with a different key)
  - The index nodes only contain key values and pointers
  - Necessarily a dense index: one index entry pointing to every tuple in the file is required (or the tuple is lost)

B and B+ trees

- B+ trees
  - The leaf nodes are linked in a chain ordered by the key
  - Support interval queries efficiently
  - The most used by relational DBMSs
- B trees
  - No sequential connection for leaf nodes
  - Intermediate nodes use two pointers for each key value Ki:
    - one points directly to the block that contains the tuple corresponding to Ki
    - the other points to a sub-tree with keys greater than Ki and less than Ki+1

An example of B+ tree

[Figure: B+ tree with a root node and a first level (key values shown: mauro, paolo, renzo) holding pointers to index nodes, and a second level (key values shown: bice, dino, mauro, paolo, renzo, teresa) holding pointers to blocks of tuples (arbitrarily organized).]

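An example of B tree

[Figure: B tree whose top node holds the keys k1, k6 and k10, with the tuples t(k1), t(k6) and t(k10) reached directly from it; the tuples t(k2), t(k3), t(k4), t(k5) and t(k7), t(k8), t(k9) are reached through the sub-trees.]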

Search technique

Looking for a tuple with key value V, at each intermediate node:
- if V < K1 follow P0
- if V ≥ KF follow PF
- otherwise, follow Pj such that Kj ≤ V < Kj+1

Node layout:  P0 | K1 | P1 | ..... | Ki | Pi | ..... | KF | PF
- P0: sub-tree with keys K < K1
- Pi: sub-tree with keys Ki ≤ K < Ki+1
- PF: sub-tree with keys K ≥ KF

The leaf nodes can be organized in two ways:
- In key-sequenced trees, tuples are contained in the leaves
- In indirect trees, leaf nodes contain pointers to the tuples, allocated with any other primary mechanism (entry-sequenced, hash, key-sequenced, ...)
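The descent rule can be sketched in a few lines. In the sketch, a node is an in-memory object rather than a disk block, and the two-level tree built from the names used in the slides is a hypothetical arrangement; `Node` and `search` are illustrative names. `bisect_right` from the standard library returns exactly the pointer position the rule asks for: 0 when V < K1, F when V ≥ KF, and j when Kj ≤ V < Kj+1.

```python
from bisect import bisect_right


class Node:
    def __init__(self, keys, children=None, values=None):
        self.keys = keys          # K1 .. KF, sorted
        self.children = children  # P0 .. PF for intermediate nodes, else None
        self.values = values      # leaf payloads (tuples, or pointers to tuples)


def search(node, v):
    """Follow one pointer per level, then look for v inside the leaf."""
    while node.children is not None:
        node = node.children[bisect_right(node.keys, v)]
    for key, value in zip(node.keys, node.values):
        if key == v:
            return value
    return None


# Hypothetical two-level tree over the names used in the slides.
leaves = [
    Node(["bice", "dino"], values=["t(bice)", "t(dino)"]),
    Node(["mauro"], values=["t(mauro)"]),
    Node(["paolo"], values=["t(paolo)"]),
    Node(["renzo", "teresa"], values=["t(renzo)", "t(teresa)"]),
]
root = Node(["mauro", "paolo", "renzo"], children=leaves)

print(search(root, "paolo"), search(root, "dino"), search(root, "zeno"))
```

Each iteration of the loop corresponds to one block read, so the cost of a lookup is the depth of the tree.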

Split and Merge operations

- SPLIT: required when the insertion of a new tuple cannot be done locally to a node
  - Causes an increase of pointers in the superior node and thus could recursively cause another split
- MERGE: required when two close nodes have entries that could be condensed into a single node. Done in order to keep a high node filling and minimal paths from the root to the leaves
  - Causes a decrease of pointers in the superior node and thus could recursively cause another merge
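A minimal sketch of the two operations on leaf nodes, under simplifying assumptions: leaves are plain sorted lists, the capacity is 4 entries, keys are integers standing for k1 ... k6, and the update of the superior node is reduced to returning the separator key. The function names are hypothetical.

```python
from bisect import insort


def insert_into_leaf(leaf, key, capacity):
    """Insert key into a sorted leaf.

    Returns (left, right, separator). When the leaf still fits,
    right and separator are None. Otherwise the leaf is split in two
    and the separator must be added to the superior node, which may
    in turn overflow and split.
    """
    leaf = list(leaf)
    insort(leaf, key)
    if len(leaf) <= capacity:
        return leaf, None, None
    mid = len(leaf) // 2
    left, right = leaf[:mid], leaf[mid:]
    return left, right, right[0]


def merge_leaves(left, right, capacity):
    """Condense two adjacent leaves into one when their entries fit."""
    if len(left) + len(right) <= capacity:
        return left + right
    return None


# Insert k3 into a full leaf (k1, k2, k4, k5): the leaf is split.
left, right, separator = insert_into_leaf([1, 2, 4, 5], 3, capacity=4)

# Delete k2 from the left leaf: the two leaves fit in one node again.
left.remove(2)
merged = merge_leaves(left, right, capacity=4)
print(left, right, separator, merged)
```

A full implementation would also remove the separator from the superior node after the merge and propagate the change upwards when that node becomes too empty.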

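Split and merge

[Figure: split and merge on a leaf node. Initial situation: superior node with keys k1, k6; leaf with k1, k2, k4, k5. a. insert k3: split. The superior node becomes k1, k3, k6 and the leaf is divided into (k1, k2) and (k3, k4, k5). b. delete k2: merge. The superior node goes back to k1, k6 and the leaves are condensed into (k1, k3, k4, k5).]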

Index usage

- Syntax in SQL:
  - create [unique] index IndexName on TableName(AttributeList)
  - drop index IndexName
- Every table should have:
  - A primary index, with key-sequenced structure, normally unique, on the primary key
  - Several secondary indexes, both unique and not unique, on the attributes most used for selections and joins
- They are progressively added, checking that the system actually uses them, and without excess
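The statements can be tried out with SQLite through Python's standard `sqlite3` module. This is only one possible system, chosen because it needs no setup; the table `Employee`, its columns and the index names are hypothetical. `explain query plan` is SQLite-specific and is used here as a way of "checking that the system actually uses" an index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table Employee (id integer primary key, name text, dept text)")

# A non-unique secondary index and a unique one.
conn.execute("create index EmployeeDept on Employee(dept)")
conn.execute("create unique index EmployeeName on Employee(name)")

conn.executemany(
    "insert into Employee(name, dept) values (?, ?)",
    [("paolo", "sales"), ("mauro", "it"), ("renzo", "it")],
)

# Ask the system how it would evaluate a selection on dept.
plan = conn.execute(
    "explain query plan select name from Employee where dept = 'it'"
).fetchall()
for row in plan:
    print(row[-1])  # textual description of the chosen access path

# Remove an index that turns out not to be useful.
conn.execute("drop index EmployeeDept")
remaining = [
    row[0]
    for row in conn.execute("select name from sqlite_master where type = 'index'")
]
print(remaining)
```

Other DBMSs expose the chosen plan through their own commands, so the inspection step has to be adapted to the system in use.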

Query optimization

- Optimizer: an important module in the architecture of a DBMS
- It receives a query written in SQL and produces an access program in object or internal format, which uses the data access methods
- Steps:
  - Lexical, syntactic and semantic analysis
  - Translation into an internal representation
  - Algebraic optimization
  - Cost-based optimization
  - Code generation

Internal representation of queries

A tree representation, similar to that of relational algebra:
- Leaf nodes correspond to the physical data structures (tables, indexes, files)
- Intermediate nodes represent physical data access operations that are supported by the access methods
- Typical operations include sequential scans, orderings, indexed accesses and various methods for evaluating joins and aggregate queries, as well as materialization choices for intermediate results

Query optimization input-output

Input: query in SQL

  SELECT R.a FROM R, S, T WHERE R.a = S.a AND R.b = T.b

Output: execution plan

[Figure: two trees. Algebraic form: project, select(R.a=S.a, R.b=T.b), and two cartProd nodes combining R, S and T. Execution plan: dupElim&project, build3, probe2, build2, probe1, build1, over scan R, scan S and scan T.]

Approaches to query execution

- Compile and store: the query is compiled once and executed many times
  - The internal code is stored in the DBMS, together with an indication of the dependencies of the code on the particular versions of the catalog used at compile time
  - On relevant changes of the catalog, the compilation of the query is invalidated and repeated
- Compile and go: immediate execution, no storage
  - Even if not stored, the code may live for a while in the DBMS and be available for other executions

Relation profiles

- Profiles contain quantitative information about tables and are stored in the data dictionary:
  - the cardinality (number of tuples) of each table T
  - the dimension in bytes of each attribute Aj in T
  - the number of distinct values of each attribute Aj in T
  - the minimum and maximum values of each attribute Aj in T
- Periodically calculated by activating appropriate system primitives (for example, the update statistics command)
- Used in cost-based optimization for estimating the size of the intermediate results produced by the query execution plan
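One common way to turn a profile into a size estimate is to assume that values are uniformly distributed. The slides do not give the formulas, so the two estimators below are a sketch under that assumption; the class `AttributeProfile`, the function names and the sample figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AttributeProfile:
    size_bytes: int        # dimension in bytes of the attribute
    distinct_values: int   # number of distinct values
    min_value: float
    max_value: float


def estimate_equality(cardinality, attr):
    """Expected tuples for Aj = v, assuming every value is equally frequent."""
    return cardinality / attr.distinct_values


def estimate_interval(cardinality, attr, low, high):
    """Expected tuples for low <= Aj <= high, assuming a uniform spread
    of the values between the minimum and the maximum."""
    low = max(low, attr.min_value)
    high = min(high, attr.max_value)
    if high < low:
        return 0.0
    width = attr.max_value - attr.min_value
    if width == 0:
        return float(cardinality)
    return cardinality * (high - low) / width


# Hypothetical profile: 10,000 tuples, attribute "age" with 50 distinct values.
age = AttributeProfile(size_bytes=4, distinct_values=50, min_value=18, max_value=68)
equal_30 = estimate_equality(10_000, age)
between_18_28 = estimate_interval(10_000, age, 18, 28)
print(equal_30, between_18_28)
```

Skewed data breaks the uniformity assumption, which is one reason why the estimates are only approximate.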

Sequential scan

- Performs a sequential access to all the tuples of a table or of an intermediate result, at the same time executing various operations, such as:
  - Projection onto a set of attributes
  - Selection on a simple predicate (of type: Ai = v)
  - Sort (ordering)
  - Insertions, deletions, and modifications of the tuples currently accessed during the scan
- Primitives: open, next, read, modify, insert, delete, close

Sort

This operation is used for ordering the data according to the value of one or more attributes. We distinguish:
- Sort in main memory, typically performed by means of ad-hoc algorithms
- Sort of large files, which cannot be transferred to main memory, performed by merging smaller parts that have already been sorted
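The second case can be sketched as a two-phase sort: cut the input into runs small enough for main memory, sort each run, then merge the sorted runs. In this sketch a Python list stands in for the file and `run_size` for the available memory, both assumptions; a real implementation would write each run to secondary memory. `heapq.merge` is the standard-library k-way merge of already sorted inputs.

```python
import heapq


def external_sort(records, run_size):
    """Sort `records` by sorting runs of at most `run_size` items
    and merging the sorted runs. Returns (sorted list, number of runs)."""
    runs = []
    for start in range(0, len(records), run_size):
        # Phase 1: each run fits in "main memory" and is sorted there.
        runs.append(sorted(records[start:start + run_size]))
    # Phase 2: merge the already sorted runs in a single pass.
    return list(heapq.merge(*runs)), len(runs)


data = [42, 7, 19, 3, 88, 61, 5]
ordered, num_runs = external_sort(data, run_size=3)
print(ordered, num_runs)
```

When there are more runs than can be merged at once, the merge phase is repeated over several passes; that refinement is left out here.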

Indexed access

- Indexes are used when queries include:
  - simple predicates (of the type Ai = v)
  - interval predicates (of the type v1 ≤ Ai ≤ v2)
- These predicates are said to be supported by indexes built on Ai
- With conjunctions of supported predicates, the DBMS chooses the most selective supported predicate for the primary access, and evaluates the other predicates in main memory
- With disjunctions of predicates:
  - if any of them is not supported, a scan is needed
  - if all are supported, indexes can be used (on all of them) and then duplicate elimination is normally required

Join Methods

- Joins are the most frequent (and costly) operations in DBMSs
- There are several methods for join evaluation, among which: nested-loop, merge-scan and hashed
- These three methods are based on scanning, hashing, and ordering

Nested-loop join

[Figure: the external table is read with an external scan; for each value a of its join attribute A, the internal table is searched for tuples with the same value of A, using an internal scan or an indexed access.]
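A minimal sketch of the nested-loop join in its two variants: inner scan and inner indexed access. Tables are lists of tuples whose first component is the join attribute A; a Python dictionary stands in for an index on the internal table. All names and sample rows are assumptions made for illustration.

```python
def nested_loop_join(external, internal, key):
    """For every tuple of the external table, scan the whole internal table."""
    result = []
    for ext in external:                 # external scan
        for inn in internal:             # internal scan
            if key(ext) == key(inn):
                result.append((ext, inn))
    return result


def indexed_nested_loop_join(external, internal, key):
    """Same result, but the internal table is reached through an index
    (a dictionary here) instead of being scanned each time."""
    index = {}
    for inn in internal:
        index.setdefault(key(inn), []).append(inn)
    return [(ext, inn) for ext in external for inn in index.get(key(ext), [])]


external_table = [("a", "x1"), ("b", "x2"), ("d", "x3")]
internal_table = [("a", "y1"), ("a", "y2"), ("b", "y3"), ("c", "y4")]
by_a = lambda row: row[0]

pairs = nested_loop_join(external_table, internal_table, by_a)
pairs_indexed = indexed_nested_loop_join(external_table, internal_table, by_a)
print(pairs)
```

With the scan variant the internal table is read once per external tuple, which is why the choice of the external table matters for the cost.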

Merge-scan join

[Figure: the two tables are ordered on the join attribute A and read in parallel by a left scan and a right scan. Values of A in the left table: a, b, b, c, c, e, f. Values of A in the right table: a, a, b, c, e, e, g.]
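A sketch of the merge-scan join on the values of A listed for the two tables. Both inputs are ordered on the join attribute and then read in parallel by two cursors; groups of equal values are combined. The function name, the row payloads (L1, R1, ...) and the in-memory lists are assumptions.

```python
def merge_scan_join(left, right, key):
    """Join two tables by scanning them in parallel in key order."""
    left = sorted(left, key=key)     # ordering step (skipped if already ordered)
    right = sorted(right, key=key)
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        kl, kr = key(left[i]), key(right[j])
        if kl < kr:
            i += 1                   # advance the left scan
        elif kl > kr:
            j += 1                   # advance the right scan
        else:
            # Find the group of tuples with this key value on both sides.
            i_end = i
            while i_end < len(left) and key(left[i_end]) == kl:
                i_end += 1
            j_end = j
            while j_end < len(right) and key(right[j_end]) == kl:
                j_end += 1
            for l_row in left[i:i_end]:
                for r_row in right[j:j_end]:
                    result.append((l_row, r_row))
            i, j = i_end, j_end
    return result


left_table = [("a", "L1"), ("b", "L2"), ("b", "L3"), ("c", "L4"),
              ("c", "L5"), ("e", "L6"), ("f", "L7")]
right_table = [("a", "R1"), ("a", "R2"), ("b", "R3"), ("c", "R4"),
               ("e", "R5"), ("e", "R6"), ("g", "R7")]

joined = merge_scan_join(left_table, right_table, key=lambda row: row[0])
print(len(joined))
```

Apart from the ordering step, each table is read only once, which makes the method attractive when the inputs are already ordered on A.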

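Hashed join

[Figure: the same hash function hash(a) is applied to the join attribute A of the left table and of the right table, so that tuples with equal values of A end up in corresponding partitions. Sample values of A shown for the left table: d, e, a, c, j, j; for the right table: e, m, a, a, j, z.]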
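A sketch of the hashed join on the sample values of A. Both tables are partitioned with the same hash function, so matching tuples can only meet inside corresponding partitions; each pair of partitions is then joined with a small in-memory table. The number of partitions, the use of Python's built-in `hash` and the row payloads are assumptions.

```python
def hashed_join(left, right, key, num_partitions=4):
    """Partition both tables with the same hash function, then join
    each pair of corresponding partitions."""

    def partition(table):
        buckets = [[] for _ in range(num_partitions)]
        for row in table:
            buckets[hash(key(row)) % num_partitions].append(row)
        return buckets

    result = []
    for left_bucket, right_bucket in zip(partition(left), partition(right)):
        # Build an in-memory table on the left partition ...
        build = {}
        for l_row in left_bucket:
            build.setdefault(key(l_row), []).append(l_row)
        # ... and probe it with the tuples of the right partition.
        for r_row in right_bucket:
            for l_row in build.get(key(r_row), []):
                result.append((l_row, r_row))
    return result


left_table = [("d", "L1"), ("e", "L2"), ("a", "L3"), ("c", "L4"),
              ("j", "L5"), ("j", "L6")]
right_table = [("e", "R1"), ("m", "R2"), ("a", "R3"), ("a", "R4"),
               ("j", "R5"), ("z", "R6")]

joined = hashed_join(left_table, right_table, key=lambda row: row[0])
print(sorted(joined))
```

The order of the result depends on how the hash function distributes the values, so the example sorts it before printing.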

Cost-based optimization

- An optimization problem, whose decisions are:
  - The data access operations to execute (e.g., scan vs index access)
  - The order of operations (e.g., the join order)
  - The option to allocate to each operation (e.g., choosing the join method)
  - Parallelism and pipelining can improve performance
- Further options appear in selecting a plan within a distributed context

Approach to query optimization

Optimization approach:
- Make use of profiles and of approximate cost formulas
- Construct a decision tree, in which each node corresponds to a choice; each leaf node corresponds to a specific execution plan
- Assign to each plan a cost:

  C_total = C_I/O x n_I/O + C_cpu x n_cpu

- Choose the plan with the lowest cost, based on operations research (branch and bound)
- Optimizers should obtain good solutions in a very short time
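The cost formula can be applied to a handful of candidate plans as follows. For brevity the sketch enumerates all leaves and takes the minimum, instead of pruning with branch and bound; the plan names, the operation counts and the unit costs (in microseconds) are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    n_io: int    # estimated number of I/O operations
    n_cpu: int   # estimated number of CPU operations


def total_cost(plan, c_io, c_cpu):
    """C_total = C_I/O x n_I/O + C_cpu x n_cpu"""
    return c_io * plan.n_io + c_cpu * plan.n_cpu


def choose_plan(plans, c_io, c_cpu):
    """Exhaustive choice of the cheapest leaf of the decision tree."""
    return min(plans, key=lambda plan: total_cost(plan, c_io, c_cpu))


# Hypothetical estimates for three leaves of the decision tree.
candidates = [
    Plan("nested-loop", n_io=50_000, n_cpu=1_000_000),
    Plan("merge-scan", n_io=3_000, n_cpu=2_000_000),
    Plan("hash-join", n_io=2_500, n_cpu=3_000_000),
]
C_IO, C_CPU = 10_000, 1   # assumed unit costs in microseconds

best = choose_plan(candidates, C_IO, C_CPU)
best_cost = total_cost(best, C_IO, C_CPU)
print(best.name, best_cost)
```

Because the I/O unit cost dominates, the plan with the fewest I/O operations wins here even though it uses the most CPU operations.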

An example of decision tree

[Figure: decision tree for the query R join(1) S join(2) T.
First choice, method for join 1: nested-loop with R internal; nested-loop with R external; merge-scan; hash-join with hash on R; hash-join with hash on S.
Second choice, method for join 2: nested-loop with T internal (strategy 1); nested-loop with T external (strategy 2); merge-scan (strategy 3); hash-join with hash on T (strategy 4); hash-join with hash on (R ...) (strategy 5); and so on for the other branches.]

Query processing components

[Figure: the User submits a Query and receives a Result. Components: Catalog, Statistic profiles, Analyzer (producing the internal representation), Optimizer (producing the execution plan), Execution layer. Mass memory holds data and indexes.]

Centralized architecture (DBMS)

[Figure: the User submits a Query to the DBMS and receives a Result. Inside the DBMS: Catalog, Statistic profiles, Analyzer (producing the internal representation), Optimizer (producing the execution plan), Execution layer. Mass memory holds data and indexes.]

Distributed database with master-slave optimization

[Figure: the User connects to Site S1 (master), which holds the distributed catalogs and distributed profiles and runs the Analyzer and the Optimizer, producing the distributed execution plan. Site S2 (slave) and Site S3 (slave) each have an Execution layer and a Database.]

Distributed optimization with negotiation

[Figure: the User connects to Site S1, which holds the distributed catalogs and distributed profiles and runs the Analyzer, the Optimizer (producing the distributed execution plan) and an Execution layer over its Database. Site S2 and Site S3 each have their own Optimizer, Execution layer and Database. The sites are connected through distributed negotiation.]

Distributed system with mediator and wrappers

[Figure: the User connects to the Mediator, which runs the Analyzer, the Optimizer (producing the distributed execution plan) and the Execution layer. The sources are reached through wrappers, each with its own execution environment: Wrapper W1 (Web source), Wrapper W2 (file system), Wrapper W3 (program). A Database also appears among the sources.]

Overall view: components of a DBMS

[Figure: components shown: USER, QUERY, ANALYZER, OPTIMIZER, [TRANSACTION] ACCESS MANAGER, FILE MANAGER, CONCURRENCY MANAGER with LOCK TABLES, RELIABILITY MANAGER with LOG and DUMP, BUFFER MANAGER with MAIN MEMORY; stored data: STATISTICS, USER DATA, INDEXES, SYSTEM DATA.]
