09 File

The document discusses file structures in data management, detailing the definition of files, types of file organizations, and their advantages and disadvantages. It covers sequential files, hashing, and indexing methods, including primary, clustering, and secondary indexes, highlighting their roles in efficient data retrieval. The content emphasizes the importance of choosing appropriate file structures for various data operations to optimize performance.

Uploaded by

pro gaming

Data Structures (DS)

GTU # 3130702

Unit-4
Hashing & File Structure
(File Structure)
What is a File?
A file is a collection of records, where a record consists of one or more fields. Each record contains the
same sequence of fields.
Each field is normally of fixed length.
A sample file with four records is shown below:

Name      Roll No.  Year  Marks
AMIT      1000      1     82
KALPESH   1005      2     54
JITENDRA  1009      1     75
RAVI      1010      1     79

• There are four records.
• There are four fields (Name, Roll No., Year, Marks).
• Records can be uniquely identified on the field 'Roll No.' Therefore, Roll No. is the key field.
• A database is a collection of files.
File Organizations
1. Sequential files
2. Relative files
3. Direct files
4. Indexed Sequential files
5. Index files

Primitive Operations on a File
1. Creation
2. Reading
3. Insertion
4. Deletion
5. Updation
6. Searching
Sequential Files
It is the most common type of file.
A fixed format is used for records.
All records are of the same length.
The position of each field in a record and the length of each field are fixed.
Records are physically ordered on the value of one of the fields, called the ordering field.

Block 1
Name      Roll No.  Year  Marks
AMIT      1000      1     82
KALPESH   1005      1     54
JITENDRA  1009      1     75
RAVI      1010      1     79

Block 2
Name      Roll No.  Year  Marks
RAMESH    1015      1     75
ROHIT     1025      1     65
JANAK     1026      1     75
AMAR      1029      1     79
Advantages of Sequential Files
Reading records in order of the ordering key is extremely efficient.
Finding the next record in order of the ordering key usually does not require an additional block
access. The next record may be found in the same block.
Searching on the ordering key is much faster. Binary search can be utilized. A binary
search requires about log2(b) block accesses, where b is the total number of blocks in the file.
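The binary search over blocks can be sketched as follows. This is a minimal illustration, not the slides' own code: it assumes each block is held in memory as a sorted list of (roll_no, name) pairs, and the function name find_in_blocks and the access counter are made up for the example.

```python
def find_in_blocks(blocks, key):
    """Binary-search a file stored as ordered blocks.

    Each block is a list of (roll_no, name) pairs sorted on roll_no,
    and the blocks themselves are in key order (assumed layout).
    Returns (name or None, number of blocks read).
    """
    lo, hi = 0, len(blocks) - 1
    accesses = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        block = blocks[mid]      # simulates reading one block from disk
        accesses += 1
        if key < block[0][0]:    # key lies before this block
            hi = mid - 1
        elif key > block[-1][0]:  # key lies after this block
            lo = mid + 1
        else:                    # key can only be inside this block
            for roll_no, name in block:
                if roll_no == key:
                    return name, accesses
            return None, accesses
    return None, accesses


# The two blocks from the Sequential Files example.
blocks = [
    [(1000, "AMIT"), (1005, "KALPESH"), (1009, "JITENDRA"), (1010, "RAVI")],
    [(1015, "RAMESH"), (1025, "ROHIT"), (1026, "JANAK"), (1029, "AMAR")],
]
print(find_in_blocks(blocks, 1025))
```

With only two blocks the search reads at most two of them; for b blocks the loop halves the candidate range on every read.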
Disadvantages of Sequential Files
A sequential file does not give any advantage when the search is carried out on a
non-ordering field.
Inserting a record is an expensive operation. Insertion of a new record requires finding the place
of insertion, and then all records after that position must be moved to create space for the record to be
inserted. This could be very expensive for large files.
Deleting a record is an expensive operation. Deletion too requires movement of records.
Modifying the value of the ordering key can be time consuming. Modifying the ordering
field means the record can change its position. This requires deletion of the old record followed
by insertion of the modified record.
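The cost of insertion can be illustrated with a small sketch. It assumes the file is modelled as an in-memory Python list kept sorted on Roll No.; the helper name insert_sorted and the inserted record (1003, "NEHA") are hypothetical and not part of the original example.

```python
import bisect


def insert_sorted(records, new_record):
    """Insert new_record into a list kept sorted on roll_no.

    Returns how many existing records had to shift one position
    to make room, which is the cost the text describes.
    """
    keys = [roll_no for roll_no, _ in records]
    pos = bisect.bisect_left(keys, new_record[0])  # place of insertion
    moved = len(records) - pos                     # everything after pos shifts
    records.insert(pos, new_record)
    return moved


records = [(1000, "AMIT"), (1005, "KALPESH"), (1009, "JITENDRA"), (1010, "RAVI")]
moved = insert_sorted(records, (1003, "NEHA"))  # hypothetical new record
print(moved, records)
```

On disk the same shift means rewriting every block from the insertion point onward, which is why the operation becomes expensive for large files.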
Hashing (Direct file organization)
[Figure: Hashing with buckets of chained blocks. A bucket directory with entries 0 to B-1 points to Bucket 0, Bucket 1, Bucket 2, and so on.]
Hashing (Direct file organization)
It is a common technique used for fast access to records on secondary storage.
Records of a file are divided among buckets.
A bucket is either one disk block or a cluster of contiguous blocks.
A hashing function maps a key into a bucket number. The buckets are numbered 0, 1, 2, ..., b-1.
A hash function f maps each key value into one of the integers 0 through b-1.
If x is a key, f(x) is the number of the bucket that contains the record with key x.
The blocks making up each bucket can either be contiguous or be chained
together in a linked list.
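A minimal sketch of this scheme is given below. The slides do not specify the hash function, so f(x) = x mod b is an assumption here, and the keys, the bucket count b = 10, and the helper names are illustrative only. Each bucket is modelled as a Python list rather than as chained disk blocks.

```python
def f(key, b):
    """Assumed hash function: maps a key to a bucket number in 0..b-1."""
    return key % b


def build_buckets(keys, b):
    """Distribute keys among b buckets (each bucket is a plain list here)."""
    buckets = [[] for _ in range(b)]
    for key in keys:
        buckets[f(key, b)].append(key)
    return buckets


def lookup(buckets, key):
    """Only the one bucket selected by f(key) has to be searched."""
    return key in buckets[f(key, len(buckets))]


keys = [230, 460, 480, 790, 321, 531, 232, 242]  # illustrative keys
buckets = build_buckets(keys, 10)
print(buckets[0], buckets[1], buckets[2])
print(lookup(buckets, 531), lookup(buckets, 999))
```

The bucket directory corresponds to the outer list: entry i leads directly to bucket i, so a lookup never touches the other buckets.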
Indexing
Indexing is used to speed up retrieval of records.
It is done with the help of a separate sequential file.
Each record in the index file consists of two fields: a key field and a pointer into the main file.
To find the record for a given key value, the index is searched for that key value.
Binary search can be used to search the index file. After getting the address of the record from the index
file, the record in the main file can easily be retrieved.
Indexing

Index File    Main File
Key           Name      Roll No.  Year  Marks
1000          AMIT      1010      1     82
1009          KALPESH   1016      1     54
1010          JITENDRA  1000      1     75
1012          RAVI      1012      1     79
1016          NILESH    1089      1     85
1089          NITIN     1100      1     98
1100          JAYESH    1200      1     99
1200          UMESH     1009      1     74

The index file is ordered on the ordering key Roll No. Each record of the index file points to
the corresponding record in the main file. The main file is not sorted.
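The lookup through the index can be sketched as follows, using the main file from the table above. This is an illustration under assumptions: the "pointer" is modelled as a list position in an in-memory main file, and the names main_file, index, and find are invented for the example.

```python
import bisect

# Main file in its stored (unsorted) order: (name, roll_no, year, marks).
main_file = [
    ("AMIT", 1010, 1, 82),
    ("KALPESH", 1016, 1, 54),
    ("JITENDRA", 1000, 1, 75),
    ("RAVI", 1012, 1, 79),
    ("NILESH", 1089, 1, 85),
    ("NITIN", 1100, 1, 98),
    ("JAYESH", 1200, 1, 99),
    ("UMESH", 1009, 1, 74),
]

# Index file: (key, pointer) pairs ordered on the key.
# The pointer is simply the record's position in main_file (assumption).
index = sorted((rec[1], pos) for pos, rec in enumerate(main_file))
index_keys = [key for key, _ in index]


def find(roll_no):
    """Binary-search the index, then follow the pointer into the main file."""
    i = bisect.bisect_left(index_keys, roll_no)
    if i < len(index) and index_keys[i] == roll_no:
        return main_file[index[i][1]]
    return None


print(index_keys)
print(find(1009))
```

New records could be appended to main_file in any order; only the small index has to stay sorted.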
Advantages of Indexing
A sequential file can be searched effectively on the ordering key. When it is necessary to search for a
record on the basis of an attribute other than the ordering key field, the sequential file
representation is inadequate.
Multiple indexes can be maintained, one for each field used for searching. Thus, indexing
provides much better flexibility.
An index file usually requires less storage space than the main file.
A binary search on the sequential file itself requires more block accesses than a binary search on the index file.
This can be explained with the help of the following example.
Consider a sequential file with r = 1024 records of fixed length, with record size
R = 128 bytes, stored on a disk with block size B = 2048 bytes.
Advantages of Indexing
Size of Sequential File
Number of blocks required to store the file
▪ (1024 x 128) / 2048 = 64
Number of block accesses for searching a record
▪ log2(64) = 6

Size of Index File
Suppose we want to construct an index on a key field that is V = 4 bytes long, and the block pointer is P = 4
bytes long.
A record of the index file needs V + P = 8 bytes per entry.
Total number of index entries = 1024
Number of blocks required to store the index file
▪ (1024 x 8) / 2048 = 4
Number of block accesses for searching a record
▪ log2(4) = 2
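The arithmetic above can be checked with a short script. It is only a sketch: the helper names are made up, and it assumes that entries never span block boundaries and that a binary search costs ceil(log2(number of blocks)) block accesses.

```python
import math


def blocks_needed(num_entries, entry_size, block_size):
    """Blocks needed when whole entries are packed into fixed-size blocks."""
    per_block = block_size // entry_size
    return math.ceil(num_entries / per_block)


def binary_search_accesses(num_blocks):
    """Block accesses for a binary search over num_blocks ordered blocks."""
    return math.ceil(math.log2(num_blocks))


r, R, B = 1024, 128, 2048  # records, record size, block size (from the example)
V, P = 4, 4                # key size and block pointer size in bytes

data_blocks = blocks_needed(r, R, B)
index_blocks = blocks_needed(r, V + P, B)

print(data_blocks, binary_search_accesses(data_blocks))
print(index_blocks, binary_search_accesses(index_blocks))
```

After the index search, one further block access is needed to read the record itself from the main file, so the comparison is roughly 2 + 1 accesses against 6.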
Types of Indexes
With indexing, new records can be added at the end of the main file. This does not require
movement of records, as it does in a sequential file.
Updating the index file requires fewer block accesses than updating a sequential file.
Types of Indexes:
1. Primary indexes
2. Clustering indexes
3. Secondary indexes
Primary Indexes (Indexed Sequential File)
[Figure: Primary index on the ordering key field Roll Number. Each entry of the index file (101, 201, 351, ..., 805) points to a block of the sequential data file.]
Primary Indexes (Indexed Sequential File)
An indexed sequential file is characterized by
Sequential organization (ordered on primary key)
Indexed on primary key

An indexed sequential file is both ordered and indexed.

Records are organized in sequence based on a key field, known as the primary key.
An index to the file is added to support random access. Each record in the index file consists of
two fields: a key field, which is the same as the key field in the main file, and a pointer to a block of the main file.
The number of records in the index file is equal to the number of blocks in the main file (data file),
not to the number of records in the main file (data file).
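A primary index with one entry per block can be sketched as follows. The slides do not say which key of a block goes into the index; using the first key of each block is an assumption made here, and the names blocks, primary_index, and find are illustrative.

```python
import bisect

# Data file: blocks of (roll_no, name) records, ordered on roll_no.
blocks = [
    [(1000, "AMIT"), (1005, "KALPESH"), (1009, "JITENDRA"), (1010, "RAVI")],
    [(1015, "RAMESH"), (1025, "ROHIT"), (1026, "JANAK"), (1029, "AMAR")],
]

# One index entry per block: (first key in the block, block number).
# Taking the first key as the entry's key is an assumption.
primary_index = [(block[0][0], number) for number, block in enumerate(blocks)]
first_keys = [key for key, _ in primary_index]


def find(roll_no):
    """Pick the last block whose first key is <= roll_no, then scan that block."""
    i = bisect.bisect_right(first_keys, roll_no) - 1
    if i < 0:
        return None  # smaller than every key in the file
    for key, name in blocks[primary_index[i][1]]:
        if key == roll_no:
            return name
    return None


print(len(primary_index), find(1026))
```

The index has two entries for eight records, which matches the point that its size follows the number of blocks rather than the number of records.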
Clustering Indexes
[Figure: Clustering index. Entries of the index file on the clustering field (100, 105, 106, 108, ...) point into the data file, whose records are ordered on that field.]
Clustering Indexes
If the records of a file are ordered on a non-key field, we can create a different type of index known
as a clustering index.
A non-key field does not have a distinct value for each record.
A clustering index is also an ordered file with two fields.
Secondary Indexes (Simple Index File)
[Figure: A secondary index on a non-ordering key field. The index file is ordered on the indexed field, and its entries point to records that are not stored in that order in the data file.]
Secondary Indexes (Simple Index File)
The hashed, sequential, and indexed sequential files are suitable for operations based on
the ordering key or the hashed key. These file organizations are not suitable for operations
involving a search on a field other than the ordering or hashed key.
If searching is required on various keys, secondary indexes on these fields must be maintained.
A secondary index is an ordered file with two fields:
Some non-ordering field of the data file
A block pointer

There could be several secondary indexes for the same file.

One can use binary search on the index file, as its entries are ordered on the secondary
key field.
Records of the data file are not ordered on the secondary key field.
Secondary Indexes (Simple Index File)
A secondary index requires more storage space and longer search time than does a primary
index.
A secondary index file has an entry for every record whereas primary index file has an entry for
every block in data file.
There is a single primary index file but the number of secondary indexes could be quite a few.
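A secondary index can be sketched in the same style. The example below indexes the non-ordering field Marks of the sequential file shown earlier; the choice of field and the names secondary_index and find_by_marks are assumptions for illustration only.

```python
import bisect

# Data file ordered on roll_no: (name, roll_no, year, marks), two blocks.
blocks = [
    [("AMIT", 1000, 1, 82), ("KALPESH", 1005, 1, 54),
     ("JITENDRA", 1009, 1, 75), ("RAVI", 1010, 1, 79)],
    [("RAMESH", 1015, 1, 75), ("ROHIT", 1025, 1, 65),
     ("JANAK", 1026, 1, 75), ("AMAR", 1029, 1, 79)],
]

# Secondary index: one (marks, block number) entry per record, ordered on marks.
secondary_index = sorted(
    (rec[3], number) for number, block in enumerate(blocks) for rec in block
)
index_keys = [marks for marks, _ in secondary_index]


def find_by_marks(marks):
    """Binary-search the index for all matching entries, then read those blocks."""
    lo = bisect.bisect_left(index_keys, marks)
    hi = bisect.bisect_right(index_keys, marks)
    block_numbers = sorted({secondary_index[i][1] for i in range(lo, hi)})
    return [rec[0] for n in block_numbers for rec in blocks[n] if rec[3] == marks]


print(len(secondary_index), find_by_marks(75))
```

The index holds eight entries, one per record, compared with one entry per block for a primary index on the same file.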