Session02 FileProcessing

The document discusses file processing in computer science, focusing on indexes used to speed up record retrieval. It details various types of indexes, including primary, clustering, secondary, and multilevel indexes, along with their characteristics and access efficiencies. Additionally, it addresses the challenges of maintaining indexes during record insertion and deletion, proposing solutions such as unordered overflow files and linked lists for overflow records.

Uploaded by

aws1010374
Copyright
© © All Rights Reserved

Computer Science

Second Year

File Processing

Computer Science Department


Q.1. What are indexes? State their types.
• Indexes are used to speed up record retrieval in response to certain search conditions.

• Index types:
o Single-level ordered indexes:
1. Primary indexes.
2. Clustering indexes.
3. Secondary indexes.
o Multilevel indexes.
o Dynamic multilevel indexes using B-trees and B+-trees.
o Indexes on multiple keys.

• Indexes may be dense or sparse:
o A dense index has an index entry for every search key value.
o A sparse index has entries for only some search key values.
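The dense/sparse distinction can be sketched in a few lines of Python. The block layout and key values below are hypothetical, chosen only for illustration; a real index would store disk block addresses rather than list positions.

```python
# Hypothetical sorted data file: each inner list is one block of search-key values.
blocks = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]

# Dense index: one entry (key -> block number) for every search-key value.
dense_index = {key: block_no for block_no, block in enumerate(blocks) for key in block}

# Sparse index: one entry per block, keyed by the first key stored in that block.
sparse_index = [(block[0], block_no) for block_no, block in enumerate(blocks)]
```

Here the dense index holds nine entries and the sparse index only three, which is why a sparse index occupies fewer blocks.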

Q.2. Describe the primary index. Illustrate the saving in block accesses when a primary index is used to search for a record.
• The primary index is an ordered file whose entries have two fields:
<Field value (primary key of the anchor record), Pointer to the block>
• There is one index entry in the index file for each block in the data file, so the primary index is a sparse (non-dense) index.
• The first record in each block is called the anchor record of the block.
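As a rough illustration of how a lookup through a primary index proceeds, the sketch below keeps one anchor key per block and binary-searches those anchors. The names `data_blocks`, `anchor_keys`, and `lookup` are hypothetical, and in-memory lists stand in for disk blocks.

```python
from bisect import bisect_right

# Hypothetical ordered data file: blocks of (primary_key, payload) records.
data_blocks = [
    [(1, "a"), (4, "b"), (7, "c")],
    [(10, "d"), (13, "e"), (16, "f")],
    [(19, "g"), (22, "h"), (25, "i")],
]

# Primary index: one entry per block, holding the anchor record's key.
anchor_keys = [block[0][0] for block in data_blocks]


def lookup(key):
    """Return the payload for key, or None if the key is not in the file."""
    # Binary search the index for the last anchor key <= key.
    block_no = bisect_right(anchor_keys, key) - 1
    if block_no < 0:
        return None  # key is smaller than every anchor
    # One additional access: read that data block and scan it.
    for record_key, payload in data_blocks[block_no]:
        if record_key == key:
            return payload
    return None
```

The saving comes from binary-searching the small index instead of the data file; only one data block is read at the end.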

Q.3. Suppose that we have an ordered file with r = 300,000 records stored on a disk with block size B = 4,096 bytes. File records are of fixed size and are unspanned, with record length R = 100 bytes.
❖ Before using the index
• The blocking factor (number of records per block): bfr = ⌊B/R⌋ = ⌊4,096/100⌋ = 40 records per block.
• The number of blocks needed for the file: b = ⌈r/bfr⌉ = ⌈300,000/40⌉ = 7,500 blocks.
• A binary search on the data file needs ⌈log2 b⌉ = ⌈log2 7,500⌉ = 13 block accesses.
❖ After using the index
• With a key field V = 9 bytes long and a block pointer P = 6 bytes long, the size of each index entry is Ri = (9 + 6) = 15 bytes.
• The blocking factor for the index: bfri = ⌊B/Ri⌋ = ⌊4,096/15⌋ = 273 entries per block.
• The total number of index entries ri = the number of data blocks = 7,500.
• The number of index blocks: bi = ⌈ri/bfri⌉ = ⌈7,500/273⌉ = 28 blocks.
• A binary search on the index needs ⌈log2 bi⌉ = ⌈log2 28⌉ = 5 block accesses, plus 1 additional block access to the data file, for a total of 5 + 1 = 6 block accesses (compared with 13 without the index).
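The arithmetic in Q.3 can be checked with a short Python sketch. The function name `primary_index_cost` is hypothetical; the parameters follow the symbols used in the question (r, B, R, V, P).

```python
from math import ceil, log2


def primary_index_cost(r, B, R, V, P):
    """Block accesses for a binary search without and with a primary index."""
    bfr = B // R                      # records per data block (unspanned)
    b = ceil(r / bfr)                 # data blocks
    without_index = ceil(log2(b))     # binary search on the data file

    entry_size = V + P                # Ri
    bfr_i = B // entry_size           # index entries per block
    b_i = ceil(b / bfr_i)             # one index entry per data block
    with_index = ceil(log2(b_i)) + 1  # binary search on index + 1 data block
    return without_index, with_index


print(primary_index_cost(r=300_000, B=4096, R=100, V=9, P=6))  # (13, 6)
```

Changing the arguments lets the same steps be applied to other file sizes and block sizes.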

Q.4. Suppose that we have an ordered file with r = 30,000 records stored on a disk with block size B = 1,024 bytes. File records are of fixed size and are unspanned, with record length R = 100 bytes.

Q.5. What is the clustering index?
• Defined on an ordered data file.
• The data file is ordered on a non-key field (the clustering field), so several records may share the same value.
• Includes one index entry for each distinct value of the field.
• The index entry points to the first data block that contains records with that field value.
• It is another example of a non-dense index.
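A minimal sketch of building a clustering index follows, assuming a hypothetical file ordered on a department-number field. The values and the name `clustering_index` are illustrative only.

```python
# Hypothetical data file ordered on a non-key field (department number).
# Each inner list is one block; values repeat and may span blocks.
blocks = [[1, 1, 2], [2, 2, 3], [3, 3, 3], [5, 5, 6]]

# Clustering index: one entry per distinct value, pointing to the first
# block that contains a record with that value.
clustering_index = {}
for block_no, block in enumerate(blocks):
    for value in block:
        clustering_index.setdefault(value, block_no)
```

The twelve records produce only five index entries, one per distinct value, so the index is non-dense.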

Q.6. What is the secondary index?

1. Includes one entry for each record in the data file; hence, it is a dense index.
2. The index is an ordered file with two fields:
o The first field is of the same data type as some non-ordering field of the data file that is an indexing field.
o The second field is either a block pointer or a record pointer.
3. Provides a secondary means of accessing a file for which some primary access already exists.
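The structure described above can be sketched as follows, assuming block pointers in the second field and a hypothetical file that is not ordered on the indexed key. The names `secondary_index` and `find` are illustrative.

```python
from bisect import bisect_left

# Hypothetical data file that is NOT ordered on the indexed field.
# Each record is (secondary_key, payload); each inner list is one block.
data_blocks = [
    [(42, "a"), (7, "b")],
    [(19, "c"), (3, "d")],
    [(88, "e"), (25, "f")],
]

# Dense secondary index: one (key, block pointer) entry per record, sorted by key.
secondary_index = sorted(
    (key, block_no)
    for block_no, block in enumerate(data_blocks)
    for key, _ in block
)
index_keys = [key for key, _ in secondary_index]


def find(key):
    """Return the payload for key via the secondary index, or None."""
    pos = bisect_left(index_keys, key)  # binary search on the index
    if pos == len(index_keys) or index_keys[pos] != key:
        return None
    block_no = secondary_index[pos][1]
    # One data-block access, then scan the block for the record.
    return next(p for k, p in data_blocks[block_no] if k == key)
```

Because the data file itself is unordered, the index must hold an entry for every record; the ordering lives only in the index.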

Q.7. Consider the file with r = 300,000 fixed-length records of size R = 100 bytes stored on a disk with block size B = 4,096 bytes. Suppose we want to search for a record with a specific value for the secondary key, a non-ordering key field of the file that is V = 9 bytes long, and a block pointer is P = 6 bytes long.

The file has b = ⌈r/bfr⌉ = ⌈300,000/40⌉ = 7,500 blocks, where bfr = ⌊B/R⌋ = ⌊4,096/100⌋ = 40 records per block.

❖ Without the secondary index
• A linear search on the file would require b/2 = 7,500/2 = 3,750 block accesses on average.
❖ With a secondary index on that non-ordering key field of the file
• Each index entry is Ri = (9 + 6) = 15 bytes.
• Blocking factor = ⌊4,096/15⌋ = 273 index entries per block.
• The total number of index entries ri equals the total number of records in the data file, which is 300,000.
• The number of blocks needed for the index = ⌈300,000/273⌉ = 1,099 blocks.
• A binary search on the index needs ⌈log2 1,099⌉ = 11 block accesses, plus 1 additional block access to the data file, for a total of 11 + 1 = 12 block accesses.
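The following Python sketch reproduces the figures above and reports the resulting search cost through the index. The function name `secondary_index_cost` is hypothetical; it assumes a dense index with one entry per record and a binary search over the index blocks.

```python
from math import ceil, log2


def secondary_index_cost(r, B, R, V, P):
    """Average block accesses: linear search vs. dense secondary index."""
    b = ceil(r / (B // R))             # data blocks
    linear = b / 2                     # average for a linear search
    bfr_i = B // (V + P)               # index entries per block
    b_i = ceil(r / bfr_i)              # dense: one index entry per record
    indexed = ceil(log2(b_i)) + 1      # binary search on index + 1 data block
    return linear, b_i, indexed


print(secondary_index_cost(r=300_000, B=4096, R=100, V=9, P=6))
# (3750.0, 1099, 12)
```

The three values are the average linear-search cost, the number of index blocks, and the block accesses needed when searching through the index.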

Figure: Example of a multilevel index (a two-level primary index).

Q.8. What is the major problem with these indexes?
• Insertion and deletion of records move records around and change index values.
• Solutions:
o Use an unordered overflow file.
o Use a linked list of overflow records.
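The first solution can be sketched as below, under the assumption that new records are simply appended to a separate unordered overflow area while the ordered file and its index stay untouched. The names `main_file`, `overflow`, `insert`, and `contains` are hypothetical, and only keys are stored for brevity.

```python
from bisect import bisect_left

main_file = [10, 20, 30, 40]   # ordered keys that the index was built on
overflow = []                  # unordered overflow file for new records


def insert(key):
    # Append to the overflow file so the ordered file and its index stay unchanged.
    overflow.append(key)


def contains(key):
    # Binary search the ordered file first, then scan the overflow file.
    pos = bisect_left(main_file, key)
    if pos < len(main_file) and main_file[pos] == key:
        return True
    return key in overflow


insert(25)
insert(5)
```

The trade-off is that every unsuccessful search of the ordered file must also scan the overflow area, so the overflow is typically merged back into the main file during a periodic reorganization.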
