L4 Indexing


Indexing

Dr. K. M. Azharul Hasan

Dept. of CSE, KUET

Indexing

• Indexes are used to quickly locate data without having to search every row in a database table every time the table is accessed.
• Indexes can be created using one or more columns of a database table, providing the basis for both rapid random lookups and efficient access of ordered records.

Storage and Indexing?

• DB design uses logical models (ER/relational):
  - An appropriate level for designers to begin with
  - Provide independence from implementation details
• Performance is another major factor in user satisfaction. It depends on:
  - Efficient data structures for data representation
  - Efficiency of system operations on those structures
• Disks contain data files and system files, including dictionary and index files.
• Disk access is one of the most critical factors in performance.

Storage Hierarchy

• A DBMS stores information on some storage medium.
• Primary storage: can be operated on directly by the CPU.
• Secondary storage:
  - Larger capacity, lower cost, slower access
  - Cannot be operated on directly by the CPU; data must first be copied to primary storage
• Secondary storage has major implications for DBMS design:
  - READ: transfer data from disk to main memory
  - WRITE: transfer data from main memory to disk
  - Both transfers are high-cost operations relative to in-memory operations, so they must be planned carefully

Why Not Store Everything in Main Memory?

• Cost and size
• Main memory is volatile. What's the problem? Its contents are lost when power is lost or the system crashes.
• Typical storage hierarchy:
  - Factors: access speed, cost per unit, reliability
  - Cache and main memory (RAM) for currently used data: fast but costly
  - Flash memory: limited number of writes (and slow), non-volatile, a disk substitute in embedded systems
  - Disk for the main database (secondary storage)
  - Tapes for archiving older versions of the data (tertiary storage)

Disks

• Secondary storage device of choice.
• Data is stored and retrieved in units called disk blocks or pages.
• Unlike RAM, the time to retrieve a disk page varies depending on its location on disk.
  - Therefore, the relative placement of pages on disk has a major impact on DBMS performance!

Components of a Disk

• The platters spin.
• The arm assembly is moved in or out to position a head on a desired track. The tracks under the heads make a cylinder (imaginary!).
• Only one head reads/writes at any one time.
• Block size is a multiple of sector size (which is fixed).

[Figure: disk components, labeled spindle, platters, tracks, sector, disk head, arm assembly, arm movement]

Accessing a Disk Page

• Time to access (read/write) a disk block:
  - Seek time (moving the arms to position the disk head on the track)
  - Rotational delay (waiting for the block to rotate under the head)
  - Transfer time (actually moving data to/from the disk surface)
• Seek time and rotational delay dominate.
• Key to lower I/O cost: reduce seek/rotation delays.
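
As a rough illustration of why seek time and rotational delay dominate, the sketch below adds up the three components for a single block. All drive figures (9 ms average seek, 7200 rpm, 100 MB/s transfer rate, 4 KB block) are assumed values chosen for illustration, not numbers from the lecture.

```python
# Illustrative only: every drive parameter used below is an assumption.
def access_time_ms(seek_ms, rpm, transfer_mb_per_s, block_bytes):
    """Return (seek, rotational delay, transfer) in milliseconds for one block."""
    # On average the wanted block is half a revolution away from the head.
    rotational_ms = 0.5 * 60_000 / rpm
    # Time to move the block's bytes once the head is over it.
    transfer_ms = block_bytes / (transfer_mb_per_s * 1_000_000) * 1000
    return seek_ms, rotational_ms, transfer_ms

seek, rot, xfer = access_time_ms(seek_ms=9.0, rpm=7200,
                                 transfer_mb_per_s=100, block_bytes=4096)
print(f"seek={seek:.2f} ms  rotation={rot:.2f} ms  transfer={xfer:.3f} ms")
```

Under these assumptions the transfer itself is a small fraction of the total, which is why placing related pages close together on disk pays off.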

Basic Concepts

• Indexing mechanisms are used to speed up access to desired data.
  - E.g., the author catalog in a library
• Search key: an attribute or set of attributes used to look up records in a file.
• An index file consists of records (called index entries) of the form (search-key, pointer).
• Index files are typically much smaller than the original file.
• Two basic kinds of indices:
  - Ordered indices: search keys are stored in sorted order
  - Hash indices: search keys are distributed uniformly across "buckets" using a "hash function"
06/05/2025
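
The two kinds of index can be sketched in a few lines. This is a minimal in-memory illustration, not a disk-based implementation; the author names, the record numbers used as pointers, and the toy hash function are all assumptions made for the example.

```python
import bisect

# Hypothetical (search-key, pointer) index entries; pointers are record numbers.
entries = [("Austen", 3), ("Borges", 0), ("Eco", 4), ("Kafka", 1), ("Woolf", 2)]

# Ordered index: entries kept sorted on the search key.
ordered_keys = [k for k, _ in entries]
ordered_ptrs = [p for _, p in entries]

def ordered_lookup(key):
    i = bisect.bisect_left(ordered_keys, key)
    if i < len(ordered_keys) and ordered_keys[i] == key:
        return ordered_ptrs[i]
    return None

def ordered_range(lo, hi):
    """Pointers for all keys with lo <= key <= hi: two binary searches, one slice."""
    i = bisect.bisect_left(ordered_keys, lo)
    j = bisect.bisect_right(ordered_keys, hi)
    return ordered_ptrs[i:j]

# Hash index: a hash function spreads the keys across buckets.
NUM_BUCKETS = 4
buckets = [[] for _ in range(NUM_BUCKETS)]

def h(key):
    # Toy hash function (an assumption): sum of character codes mod bucket count.
    return sum(map(ord, key)) % NUM_BUCKETS

for k, p in entries:
    buckets[h(k)].append((k, p))

def hash_lookup(key):
    for k, p in buckets[h(key)]:
        if k == key:
            return p
    return None
```

Both structures answer `ordered_lookup("Kafka")` and `hash_lookup("Kafka")`, but only the ordered index can answer a range request such as `ordered_range("B", "F")` without looking at every bucket.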

Index Evaluation Metrics

• Access types supported efficiently, e.g.:
  - Records with a specified value in the attribute
  - Records with an attribute value falling in a specified range of values
• Access time
• Insertion time
• Deletion time
• Space overhead

Ordered Indices

• In an ordered index, index entries are stored sorted on the search-key value. E.g., the author catalog in a library.
• Primary index: in a sequentially ordered file, the index whose search key specifies the sequential order of the file.
  - Also called a clustering index
  - The search key of a primary index is usually, but not necessarily, the primary key.
• Secondary index: an index whose search key specifies an order different from the sequential order of the file. Also called a non-clustering index.
• Index-sequential file: an ordered sequential file with a primary index.

Dense Index Files

• Dense index: an index record appears for every search-key value in the file.

Sparse Index Files

• Sparse index: contains index records for only some search-key values.
  - Applicable when records are sequentially ordered on the search key
• To locate a record with search-key value K we:
  - Find the index record with the largest search-key value ≤ K
  - Search the file sequentially, starting at the record to which the index record points
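
The two lookup steps for a sparse index can be sketched as follows. The file contents and the choice of indexing every third record are assumptions made for the example; `bisect` stands in for the search of the index.

```python
import bisect

# The file: records sorted on the search key, as (key, payload). Hypothetical data.
records = [(10, "a"), (20, "b"), (30, "c"), (40, "d"), (50, "e"), (60, "f"), (70, "g")]

# Sparse index: an entry (key, position in file) for only some records,
# here every third one.
sparse_keys = [10, 40, 70]
sparse_pos = [0, 3, 6]

def sparse_lookup(k):
    # Step 1: index entry with the largest search-key value <= k.
    i = bisect.bisect_right(sparse_keys, k) - 1
    if i < 0:
        return None  # k is smaller than every key in the file
    # Step 2: scan the file sequentially from the record that entry points to.
    for key, payload in records[sparse_pos[i]:]:
        if key == k:
            return payload
        if key > k:
            break  # passed the place where k would have been
    return None
```

For example, `sparse_lookup(50)` starts from the index entry for 40 and reads two records before it finds the match.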

Sparse Index Files (Cont.)

• Compared to dense indices:
  - Less space and less maintenance overhead for insertions and deletions
  - Generally slower than a dense index for locating records
• Good tradeoff: a sparse index with an index entry for every block in the file, corresponding to the least search-key value in the block.

Multilevel Index

• If the primary index does not fit in memory, access becomes expensive.
• Solution: treat the primary index kept on disk as a sequential file and construct a sparse index on it.
  - Outer index: a sparse index of the primary index
  - Inner index: the primary index file
• If even the outer index is too large to fit in main memory, yet another level of index can be created, and so on.
• Indices at all levels must be updated on insertion into or deletion from the file.
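
A two-level lookup can be sketched as below. The block size of four entries and the key values are assumptions chosen to keep the example small; in practice a block holds far more entries.

```python
import bisect

BLOCK = 4  # index entries per block (assumed, tiny for illustration)

# Inner index: the primary index, stored as sorted blocks of (key, record pointer).
inner = [(k, f"rec{k}") for k in range(0, 160, 10)]  # keys 0, 10, ..., 150
inner_blocks = [inner[i:i + BLOCK] for i in range(0, len(inner), BLOCK)]

# Outer index: sparse, one entry per inner block, holding that block's least key.
outer_keys = [blk[0][0] for blk in inner_blocks]

def multilevel_lookup(k):
    # Level 1: the outer index picks the single inner block that could hold k.
    b = bisect.bisect_right(outer_keys, k) - 1
    if b < 0:
        return None
    # Level 2: search only that inner-index block.
    for key, ptr in inner_blocks[b]:
        if key == k:
            return ptr
    return None
```

Only the small outer index needs to stay in memory; each lookup then touches one inner block instead of scanning the whole primary index.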

Index Classification

Summary
• Primary vs. secondary: whether or not the search key specifies the same order as the sequential order of the file.
• Clustered vs. unclustered: whether or not the order of data records is the same as the order of data entries.
• Dense vs. sparse: whether or not there is an entry in the index for each key value.
• Single-level vs. multilevel: whether there is one index, or further (sparse) indices built on top of it.

Hash-Based Indexes

• Good for equality selections; cannot support range searches.
• The index is a collection of buckets. Bucket = primary page plus zero or more overflow pages.
• Hashing function h: h(r) = bucket in which record r belongs. h looks at the search-key fields of r.
• Buckets may contain the data records or just the rids (record ids).
• So what is the difference between hashing and indexing?
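
The bucket structure can be sketched as follows. The page capacity, the number of buckets, and the modulo hash function are assumptions chosen to force an overflow page in a tiny example; the buckets here store (key, rid) pairs rather than full data records.

```python
PAGE_CAPACITY = 2   # entries per page (assumed, tiny for illustration)
NUM_BUCKETS = 3

class Bucket:
    """A bucket: one primary page plus zero or more overflow pages."""
    def __init__(self):
        self.pages = [[]]  # pages[0] is the primary page

    def insert(self, key, rid):
        if len(self.pages[-1]) == PAGE_CAPACITY:
            self.pages.append([])  # chain a new overflow page
        self.pages[-1].append((key, rid))

    def search(self, key):
        return [rid for page in self.pages for k, rid in page if k == key]

buckets = [Bucket() for _ in range(NUM_BUCKETS)]

def h(key):
    # Toy hash function on an integer search key (an assumption).
    return key % NUM_BUCKETS

def insert(key, rid):
    buckets[h(key)].insert(key, rid)

def equality_search(key):
    # Only the one bucket chosen by h is examined.
    return buckets[h(key)].search(key)

for rid, key in enumerate([3, 6, 9, 12, 4, 7]):
    insert(key, rid)
```

Keys 3, 6, 9 and 12 all hash to bucket 0, so that bucket ends up with one overflow page. A range condition such as 4 <= key <= 9 gets no help from `h`, because neighbouring keys land in unrelated buckets.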

Index Update: Deletion

• If the deleted record was the only record in the file with its particular search-key value, the search key is deleted from the index also.
• Single-level index deletion:
  - Dense indices: deletion of the search key is similar to file record deletion.
  - Sparse indices:
    - If an entry for the search key exists in the index, it is deleted by replacing the entry in the index with the next search-key value in the file (in search-key order).
    - If the next search-key value already has an index entry, the entry is deleted instead of being replaced.

Index Update: Insertion

• Single-level index insertion:
  - Perform a lookup using the search-key value appearing in the record to be inserted.
  - Dense indices: if the search-key value does not appear in the index, insert it.
  - Sparse indices: if the index stores an entry for each block of the file, no change needs to be made to the index unless a new block is created.
    - If a new block is created, the first search-key value appearing in the new block is inserted into the index.
• Multilevel insertion (as well as deletion) algorithms are simple extensions of the single-level algorithms.

Secondary Indices

• Frequently, one wants to find all the records whose values in a certain field (which is not the search key of the primary index) satisfy some condition.
  - Example 1: in the account relation stored sequentially by account number, we may want to find all accounts in a particular branch.
  - Example 2: as above, but where we want to find all accounts with a specified balance or range of balances.
• We can have a secondary index with an index record for each search-key value.

Secondary Indices Example

[Figure: secondary index on the balance field of account]

• An index record points to a bucket that contains pointers to all the actual records with that particular search-key value.
• Secondary indices have to be dense.

Primary and Secondary Indices

• Indices offer substantial benefits when searching for records.
• Updating indices imposes overhead on database modification: when a file is modified, every index on the file must be updated.
• A sequential scan using the primary index is efficient, but a sequential scan using a secondary index is expensive.
  - Each record access may fetch a new block from disk.
  - A block fetch requires about 5 to 10 milliseconds, versus about 100 nanoseconds for a memory access.
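
A dense secondary index whose entries point to buckets of record pointers can be sketched as follows. The account rows are hypothetical sample data, and list positions stand in for record pointers.

```python
from collections import defaultdict

# Hypothetical account file, stored sequentially by account number (the primary order).
accounts = [
    ("A-101", "Downtown", 500),
    ("A-102", "Perryridge", 400),
    ("A-110", "Downtown", 600),
    ("A-201", "Perryridge", 900),
    ("A-215", "Mianus", 700),
    ("A-217", "Brighton", 500),
]

# Dense secondary index on balance: every search-key value maps to a bucket of
# pointers (here: positions in the file) to all records with that value.
balance_index = defaultdict(list)
for pos, (_, _, balance) in enumerate(accounts):
    balance_index[balance].append(pos)

def accounts_with_balance(balance):
    return [accounts[pos][0] for pos in balance_index.get(balance, [])]

def accounts_in_balance_range(lo, hi):
    # Visit the index in search-key order; the records it points to are
    # scattered through the file, so each access may touch a different block.
    return [accounts[pos][0]
            for bal in sorted(balance_index) if lo <= bal <= hi
            for pos in balance_index[bal]]
```

In the range query the results come back in balance order (A-110, A-215, A-201), which is not the order of the file. That jumping around is what makes a sequential scan through a secondary index expensive.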

B+-Tree Index Files

• Disadvantages of indexed-sequential files:
  - Performance degrades as the file grows, since many overflow blocks get created.
  - Periodic reorganization of the entire file is required.
• Advantages of B+-tree index files:
  - The tree automatically reorganizes itself with small, local changes in the face of insertions and deletions.
  - Reorganization of the entire file is not required to maintain performance.
• (Minor) disadvantage of B+-trees:
  - Extra insertion and deletion overhead, space overhead.
• The advantages of B+-trees outweigh the disadvantages.
  - B+-trees are used extensively.

B+-Tree Index Files

• B+-tree indices are an alternative to indexed-sequential files.

B+-Tree Index Files (Cont.)

A B+-tree is a rooted tree satisfying the following properties:

• All paths from root to leaf are of the same length.
• Each node that is not a root or a leaf has between ⌈n/2⌉ and n children.
• A leaf node has between ⌈(n-1)/2⌉ and n-1 values.
• Special cases:
  - If the root is not a leaf, it has at least 2 children.
  - If the root is a leaf (that is, there are no other nodes in the tree), it can have between 0 and n-1 values.
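
To make these occupancy limits concrete, the helper below evaluates them for a given n. It assumes the convention of this slide, where n is the maximum number of pointers in a node and the lower bounds are rounded up; the function name and dictionary keys are made up for the example.

```python
from math import ceil

def bplus_bounds(n):
    """Occupancy limits (min, max) for a B+-tree whose nodes hold at most n pointers."""
    return {
        "internal_children": (ceil(n / 2), n),
        "leaf_values": (ceil((n - 1) / 2), n - 1),
        "root_children_if_not_leaf": (2, n),
        "root_values_if_leaf": (0, n - 1),
    }

for n in (4, 5, 100):
    print(n, bplus_bounds(n))
```

For n = 4, for instance, an internal node has 2 to 4 children and a leaf holds 2 to 3 values.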

B+ Tree Example

[Figure: example B+-tree; the leaf-level pointers lead to the records]

B+-Tree Node Structure

• Typical node: P1, K1, P2, ..., Pn-1, Kn-1, Pn
  - Ki are the search-key values.
  - Pi are pointers to children (for non-leaf nodes) or pointers to records or buckets of records (for leaf nodes).
• The search keys in a node are ordered:
  K1 < K2 < K3 < ... < Kn-1

Leaf Nodes in B+-Trees

Properties of a leaf node:
• For i = 1, 2, ..., n-1, pointer Pi either points to a file record with search-key value Ki, or to a bucket of pointers to file records, each record having search-key value Ki.
• If Li and Lj are leaf nodes and i < j, Li's search-key values are less than Lj's search-key values.
• Pn points to the next leaf node in search-key order.

Non-Leaf Nodes in B+-Trees

• Non-leaf nodes form a multilevel sparse index on the leaf nodes. For a non-leaf node with m pointers:
  - All the search keys in the subtree to which P1 points are less than K1.
  - For 2 ≤ i ≤ m-1, all the search keys in the subtree to which Pi points have values greater than or equal to Ki-1 and less than Ki.
  - All the search keys in the subtree to which Pm points have values greater than or equal to Km-1.

Sample non-leaf

[Figure: non-leaf node with keys 120, 150, 180. Its four pointers lead, from left to right, to keys k < 120, 120 ≤ k < 150, 150 ≤ k < 180, and k ≥ 180.]
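
Choosing which pointer to follow in a non-leaf node is a single binary search over the node's keys. The sketch below uses the keys of the sample non-leaf node; the function name is made up for the example, and the result is a 0-based index, so 0 stands for P1.

```python
import bisect

def child_index(keys, k):
    """Return the 0-based index of the pointer to follow for search key k.

    keys is the node's sorted key list K1 < K2 < ...; index 0 means P1.
    bisect_right sends a key equal to Ki to the pointer on its right,
    matching the rule "greater than or equal to Ki-1 and less than Ki".
    """
    return bisect.bisect_right(keys, k)

node_keys = [120, 150, 180]  # keys of the sample non-leaf node

for k in (100, 120, 149, 150, 180, 500):
    print(k, "-> P%d" % (child_index(node_keys, k) + 1))
```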
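
Sample leaf node

[Figure: leaf node with keys 120 and 130, reached from a non-leaf node. One pointer leads to the record with key 120, one to the record with key 130, and the last pointer leads to the next leaf in sequence.]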
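
B+ Tree Example

[Figure: B+-tree with root (100), non-leaf nodes (30) and (120, 150, 180), and leaves, from left to right, (3, 5, 11), (30, 35), (100, 101, 110), (120, 130), (150, 156, 179), (180, 200). The leaf-level pointers lead to the records.]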

B+ Tree

Suppose a key value is 9 bytes, the page size is 512 bytes, and a pointer (both page pointer and record pointer) is 7 bytes. How many key values can you enter in a leaf node and in a non-leaf node of a B+ tree?

(Home task)
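
One way to approach this kind of question is sketched below. It assumes the node layout P1, K1, P2, ..., Kn-1, Pn with no header or padding, so a node with n pointers stores n-1 keys, and it assumes a leaf uses the same layout (n-1 record pointers plus one next-leaf pointer). Other layouts give different answers.

```python
def max_fanout(page_size, key_size, pointer_size):
    """Largest n such that n pointers and n - 1 keys fit in one page.

    Assumes the layout P1 K1 P2 ... K(n-1) Pn with no header or padding.
    """
    # n * pointer_size + (n - 1) * key_size <= page_size
    # => n <= (page_size + key_size) / (pointer_size + key_size)
    return (page_size + key_size) // (pointer_size + key_size)

n = max_fanout(page_size=512, key_size=9, pointer_size=7)
keys_per_node = n - 1
print("pointers per node:", n, "keys per node:", keys_per_node)
```

Under these assumptions the inequality is 7n + 9(n - 1) ≤ 512, which holds for n up to 32, so a node holds at most 31 key values.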

Insert into B+ tree

First look up the proper leaf, then:

(a) Simple case
  - Leaf not full: just insert (key, pointer-to-record)
(b) Leaf overflow
(c) Non-leaf overflow
(d) New root
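
Cases (a) and (b) can be sketched at the level of a single leaf. This is a simplified illustration rather than a full B+-tree: record pointers are omitted, a leaf is just a sorted list of keys, and n is taken to be the maximum number of keys per node, which is the convention the insert examples appear to use (n = 3 with up to three keys in a leaf). The function name is made up for the example.

```python
import bisect

def insert_into_leaf(leaf, key, n):
    """Insert key into a sorted leaf (a list of keys) holding at most n keys.

    Returns (left, right, separator). right and separator are None when the
    leaf had room (case a). Otherwise the leaf is split (case b) and
    separator is the key that must be inserted into the parent.
    """
    keys = leaf[:]
    bisect.insort(keys, key)
    if len(keys) <= n:
        return keys, None, None
    mid = (len(keys) + 1) // 2        # the left half keeps the extra key, if any
    left, right = keys[:mid], keys[mid:]
    return left, right, right[0]      # smallest key of the new leaf goes up

print(insert_into_leaf([30, 31], 32, n=3))    # case (a): room in the leaf
print(insert_into_leaf([3, 5, 11], 7, n=3))   # case (b): leaf overflow
```

Cases (c) and (d) follow the same pattern one level up: if the parent has no room for the separator, the parent is split in turn, and splitting the root creates a new root.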
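
(a) Insert key = 32

[Figure: n = 3. Part of a tree with keys 100 and 30 in the non-leaf levels and leaves (3, 5, 11) and (30, 31). The leaf (30, 31) has room, so 32 is simply added: (30, 31, 32).]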
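
(b) Insert key = 7

[Figure: n = 3. Part of a tree with keys 100 and 30 in the non-leaf levels and leaves (3, 5, 11) and (30, 31). The leaf (3, 5, 11) is full, so inserting 7 splits it into (3, 5) and (7, 11), and key 7 is added to the parent next to 30.]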
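
(c) Insert key = 160

[Figure: n = 3. Root key 100, non-leaf node (120, 150, 180), leaves (150, 156, 179) and (180, 200). The leaf (150, 156, 179) is full, so inserting 160 splits it into (150, 156) and (160, 179). The parent (120, 150, 180) is also full, so it splits into (120, 150) and (180), and key 160 moves up next to 100.]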
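
(d) New root, insert 45

Height grows at the root => balance maintained

[Figure: n = 3. Root (10, 20, 30) with leaves (1, 2, 3), (10, 12), (20, 25), (30, 32, 40). The leaf (30, 32, 40) is full, so inserting 45 splits it into (30, 32) and (40, 45), and key 40 must go into the root. The root is also full, so it splits into (10, 20) and (40), and 30 moves up into a new root.]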

Deletion from B+ tree

Again, first look up the proper leaf, then:

(a) Simple case: no underflow
(b) Borrow keys from an adjacent sibling (if it doesn't become too empty)
(c) Underflow
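
(b) Delete 50

n = 4 => min # of keys in a leaf = ⌊(n+1)/2⌋ = ⌊5/2⌋ = 2

[Figure: leaves (10, 20, 30, 35) and (40, 50) under a parent with keys 10, 40, 100. Deleting 50 leaves (40) with too few keys, so it borrows 35 from its left sibling: the leaves become (10, 20, 30) and (35, 40), and the parent key 40 is replaced by 35.]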
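
(c) Leaf Underflow Delete 50

[Figure: n = 4. Deleting 50 from the leaf (40, 50) causes an underflow. Its sibling (20, 30) is already at the minimum and cannot lend a key, so the two leaves merge into (20, 30, 40) and key 40 is removed from the parent.]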
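
(d) Non-leaf underflow Delete 37

n = 4 => min # of keys in a non-leaf = ⌈(n+1)/2⌉ - 1 = 3 - 1 = 2

[Figure: root (25) with non-leaf children (10, 20) and (30, 40); leaves (1, 3), (10, 14), (20, 22), (25, 26), (30, 37), (40, 45). Deleting 37 leaves the leaf (30) with too few keys, so it merges with a sibling leaf. The parent (30, 40) then drops to one key and underflows, so it merges with its sibling (10, 20), pulling 25 down from the old root. The merged node becomes the new root.]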

Home task

• Construct a B+ tree with n = 4 or 5, up to level 3, by inserting random keys so that the insertion cases (a) to (d) all occur.
• How can you perform a range query on keys in a B+ tree?

Queries on B+-Trees (Cont.)

• If there are K search-key values in the file, the height of the tree is no more than ⌈log_⌈n/2⌉(K)⌉.
• A node is generally the same size as a disk block, typically 4 kilobytes,
  - and n is typically around 100 (40 bytes per index entry).
• With 1 million search-key values and n = 100,
  - at most ⌈log_50(1,000,000)⌉ = 4 nodes are accessed in a lookup.
• Contrast this with a balanced binary tree with 1 million search-key values: around 20 nodes are accessed in a lookup.
  - The difference is significant, since every node access may need a disk I/O, costing around 20 milliseconds.
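
The arithmetic behind these figures can be checked directly. The sketch below evaluates the height bound for the slide's numbers (1 million keys, n = 100); the function name is made up for the example.

```python
from math import ceil, log

def max_lookup_nodes(num_keys, n):
    """Upper bound on nodes visited in a lookup: ceil(log base ceil(n/2) of num_keys)."""
    return ceil(log(num_keys, ceil(n / 2)))

bplus = max_lookup_nodes(1_000_000, 100)   # every node has fan-out of at least 50
binary = ceil(log(1_000_000, 2))           # balanced binary tree, fan-out 2

print("B+-tree:", bplus, "nodes; binary tree:", binary, "nodes")
```

At around 20 milliseconds per disk access, that is roughly 80 ms against 400 ms in the worst case where every node access goes to disk.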

B-Tree Index Files

• Similar to a B+-tree, but a B-tree allows each search-key value to appear only once; this eliminates redundant storage of search keys.
• Search keys in nonleaf nodes appear nowhere else in the B-tree, so an additional pointer field for each search key in a nonleaf node must be included.
• [Figure: generalized B-tree leaf node vs. B+-tree]
• Nonleaf node: pointers Bi are the bucket or file-record pointers.

B-Tree Index File Example

[Figure: B-tree (above) and B+-tree (below) on the same data]

B-Tree Index Files (Cont.)

• Advantages of B-tree indices:
  - May use fewer tree nodes than a corresponding B+-tree.
  - Sometimes possible to find the search-key value before reaching a leaf node.
• Disadvantages of B-tree indices:
  - Only a small fraction of all search-key values are found early.
  - Non-leaf nodes are larger, so fan-out is reduced. Thus, B-trees typically have greater depth than the corresponding B+-tree.
  - Insertion and deletion are more complicated than in B+-trees.
  - Implementation is harder than for B+-trees.
  - Range key search is difficult.
• Typically, the advantages of B-trees do not outweigh the disadvantages.

Index Definition in SQL

• Create an index:
    create index <index-name> on <relation-name> (<attribute-list>)
  E.g.: create index b_index on branch(branch_name)
• Use create unique index to indirectly specify and enforce the condition that the search key is a candidate key.
  - Not really required if the SQL unique integrity constraint is supported.
• To drop an index:
    drop index <index-name>
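
These statements can be tried out with Python's built-in sqlite3 module, as sketched below. The branch table's columns and rows are hypothetical, and the index is named b_index (with an underscore, so that it is a valid unquoted identifier). Other database systems may differ in details such as the exact drop index syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Hypothetical schema and rows for the branch relation.
cur.execute("CREATE TABLE branch (branch_name TEXT, branch_city TEXT, assets INTEGER)")
cur.executemany(
    "INSERT INTO branch VALUES (?, ?, ?)",
    [("Downtown", "Brooklyn", 9_000_000), ("Perryridge", "Horseneck", 1_700_000)],
)

# A unique index makes branch_name behave as a candidate key.
cur.execute("CREATE UNIQUE INDEX b_index ON branch (branch_name)")

try:
    cur.execute("INSERT INTO branch VALUES ('Downtown', 'Queens', 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True  # the unique index refused the duplicate name

# List the indexes currently defined, then drop ours.
index_names = [row[0] for row in cur.execute(
    "SELECT name FROM sqlite_master WHERE type = 'index'")]
cur.execute("DROP INDEX b_index")
remaining = cur.execute(
    "SELECT count(*) FROM sqlite_master WHERE type = 'index'").fetchone()[0]
conn.close()
```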

Index Selection Guidelines

• Attributes in the WHERE clause are candidates for index keys.
  - An exact-match condition suggests a cluster/sparse/hash index.
  - A range query suggests a tree index. Clustering is especially useful for range queries; it can also help on equality queries if there are many duplicates.
• Multi-attribute search keys should be considered when a WHERE clause contains several conditions.
• Try to choose indexes that benefit as many queries as possible.
• Since only one index can be clustered per relation, choose it based on the important queries that would benefit the most from clustering.

Index Selection Guidelines (Cont.)

SELECT E.dno
FROM Emp E
WHERE E.age > 40

• A B+-tree index on E.age can be used to get qualifying tuples.
• Things to consider:
  - How selective is the condition?
  - If 99% are over 40, the index is less useful.
  - If 10%, an index is useful.

Index Selection Guidelines (Cont.)

SELECT E.dno, COUNT(*)
FROM Emp E
WHERE E.age > 20
GROUP BY E.dno

• Consider the GROUP BY query: is using an index on age effective?
  - If many tuples have E.age > 20, using the E.age index and sorting the retrieved tuples may be costly.
  - Especially bad if this index is not clustered.
  - A clustered E.dno index may be better!

Indexes with Composite Search Keys

• Composite search keys: search on a combination of fields.
  - Equality query: every field value is equal to a constant value. E.g., with respect to a <sal,age> index: age=12 and sal=75
  - Range query: some field value is not a constant. E.g.: age=12; or age=12 and sal > 10
• Data entries in the index are sorted by search key to support range queries.

Examples of composite key indexes using lexicographic order:

Data records sorted by name:
name  age  sal
bob   12   10
cal   11   80
joe   12   20
sue   13   75

Data entries sorted by <age,sal>: 11,80  12,10  12,20  13,75
Data entries sorted by <age>:     11  12  12  13
Data entries sorted by <sal,age>: 10,12  20,12  75,13  80,11
Data entries sorted by <sal>:     10  20  75  80

Composite Search Keys

• To retrieve Emp records with age=30 AND sal=4000, an index on <age,sal> would be better than an index on age or an index on sal.
• If the condition is 20<age<30 AND 3000<sal<5000:
  - A clustered index on <age,sal> or <sal,age> is best.
• If the condition is age=30 AND 3000<sal<5000:
  - A clustered <age,sal> index is much better than a <sal,age> index!
• Composite indexes are larger and are updated more often.
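
The effect of field order can be seen on the four sample records (bob, cal, joe, sue) from the composite-key example. The sketch below answers age=12 AND sal>10 through each index and counts how many data entries have to be examined; sorted Python lists stand in for the indexes, and the function names are made up for the example.

```python
import bisect

# (name, age, sal) records from the composite-key example.
records = [("bob", 12, 10), ("cal", 11, 80), ("joe", 12, 20), ("sue", 13, 75)]

# Data entries of the two composite indexes, in lexicographic order.
age_sal = sorted((age, sal, name) for name, age, sal in records)
sal_age = sorted((sal, age, name) for name, age, sal in records)
age_sal_keys = [(a, s) for a, s, _ in age_sal]
age_sal_names = [n for _, _, n in age_sal]
sal_age_keys = [(s, a) for s, a, _ in sal_age]
sal_age_names = [n for _, _, n in sal_age]

def via_age_sal(age, sal):
    """age = const AND sal > const: one contiguous run of <age,sal> entries."""
    lo = bisect.bisect_right(age_sal_keys, (age, sal))
    hi = bisect.bisect_right(age_sal_keys, (age, float("inf")))
    return age_sal_names[lo:hi], hi - lo

def via_sal_age(age, sal):
    """Same query on <sal,age>: every entry with sal > const must be examined."""
    lo = bisect.bisect_right(sal_age_keys, (sal, float("inf")))
    scanned = sal_age_keys[lo:]
    names = [n for (_, a), n in zip(scanned, sal_age_names[lo:]) if a == age]
    return names, len(scanned)

print(via_age_sal(12, 10))
print(via_sal_age(12, 10))
```

Both calls find joe, but the <age,sal> index examines only the matching entry, while the <sal,age> index has to walk through all three entries with sal > 10 and filter on age.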

Exercise to solve

• Emp (eid: int, salary: int, age: real, did: int)
• eid is the key, and there is a clustered index on eid and an unclustered index on age.
1. Give an example of a query that can be sped up because of the available indexes.
2. Give an example of a query that is neither sped up nor slowed down by the indexes.
3. Can there be an update that is slowed down because of the indexes?

Thank You
