DBMS Unit 4 Notes
• Overview of Physical Storage Media
• Magnetic Disks
• RAID
• Tertiary Storage
• Storage Access
• File Organization
• Organization of Records in Files
• Data-Dictionary Storage
• Storage Structures for Object-Oriented Databases
– volatile storage: loses contents when power is switched off
– non-volatile storage: contents persist even when power is
switched off. Includes secondary and tertiary storage, as
well as battery-backed main memory.
– volatile — contents of main memory are usually lost if a
power failure or system crash occurs
– usually survives power failures and system crashes; disk
failure can destroy data, but is much less frequent than
system crashes
cheaper than disk.
[Figure: storage hierarchy, top to bottom: main memory, flash memory, magnetic disk, optical disk, magnetic tapes]
Database Systems Concepts 10.7 Silberschatz, Korth and Sudarshan © 1997
Magnetic Disks Mechanism
[Figure: disk mechanism, labeling the spindle, platters, track t, sector s, cylinder c, read-write head, arm, arm assembly, and direction of rotation]
• Head–disk assemblies – multiple disk platters on a single
spindle, with multiple heads (one per platter) mounted on a
common arm.
• Cylinder i consists of the i-th track of all the platters.
[Figure: disk subsystem, showing a disk controller attached to several disks]
• Disk controller – interfaces between the computer system and
the disk drive hardware.
– accepts high-level commands to read or write a sector
– initiates actions such as moving the disk arm to the right
track and actually reading or writing the data.
• Data-transfer rate – the rate at which data can be retrieved
from or stored to the disk.
• Mean time to failure (MTTF) – the average time the disk is
expected to run continuously without any failure.
• Nonvolatile write buffers speed up disk writes by writing
blocks to a non-volatile RAM buffer immediately; controller then
writes to disk whenever the disk has no other requests.
• Log disk – a disk devoted to writing a sequential log of block
updates; this eliminates seek time. Used like nonvolatile RAM.
• Today RAIDs are used for their higher reliability and bandwidth,
rather than for economic reasons. Hence the “I” is interpreted
as independent, instead of inexpensive.
Improvement of Reliability via Redundancy
• The chance that some disk out of a set of N disks will fail is
much higher than the chance that a specific single disk will fail.
E.g., a system with 100 disks, each with MTTF of 100,000
hours (approx. 11 years), will have a system MTTF of 1000
hours (approx. 41 days).
• Redundancy – store extra information that can be used to
rebuild information lost in a disk failure
• E.g. Mirroring (or shadowing)
– duplicate every disk. Logical disk consists of two physical
disks.
– every write is carried out on both disks
– if one disk in a pair fails, the data is still available on the other
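The 100-disk MTTF figure quoted above follows from dividing the per-disk MTTF by the number of disks, which assumes that disk failures are independent. A minimal Python sketch of that arithmetic (the function name is illustrative and not part of the notes):

```python
HOURS_PER_DAY = 24


def system_mttf_hours(disk_mttf_hours: float, num_disks: int) -> float:
    """Expected time until the first failure among num_disks disks.

    Assumes independent failures, so the failure rates simply add up.
    """
    if num_disks <= 0:
        raise ValueError("num_disks must be positive")
    return disk_mttf_hours / num_disks


mttf = system_mttf_hours(100_000, 100)   # the example from the notes
print(mttf, "hours")
print(mttf / HOURS_PER_DAY, "days")
```

Mirroring changes this picture because data is lost only if the second disk of a pair fails before the first one is repaired; that case is not modeled here.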
Popular for applications such as storing log files in a database
system.
[Figure: RAID levels, with parity blocks marked P; the last panel is (g) RAID 6: P + Q Redundancy]
– Subsumes Level 2 (provides all its benefits, at lower cost).
independent block writes since every block write also writes
to parity disk
• Level 6: P+Q Redundancy scheme; similar to Level 5, but
stores extra redundant information to guard against multiple
disk failures. Better reliability than Level 5 at a higher cost; not
used as widely.
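To illustrate how redundant information can rebuild the contents of a failed disk, here is a minimal sketch that uses a single XOR parity block (the P block). The block contents and function name are made up for illustration, and the second (Q) block of Level 6, which relies on a different code, is not shown.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("blocks must have equal size")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Three hypothetical data blocks, one per disk, plus their parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing disk 1 and rebuild it from the survivors and the parity.
lost = 1
survivors = [block for i, block in enumerate(data) if i != lost]
rebuilt = xor_blocks(survivors + [parity])
```

A single parity block recovers from one failed disk only, which is why Level 6 stores additional redundant information.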
– data can only be written once, and cannot be erased.
– high capacity and long lifetime; used for archival storage
– WORM jukeboxes
• Tape jukeboxes used for very large capacity (terabyte (10^12 bytes) to
petabyte (10^15 bytes)) storage
• Buffer manager – subsystem responsible for allocating buffer
space in main memory.
was modified since the most recent time that it was written
to/fetched from the disk.
– Once space is allocated in the buffer, the buffer manager
reads in the block from the disk to the buffer, and passes
the address of the block in main memory to the requester.
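The request path described above can be sketched as follows. This is a toy model under stated assumptions: a dict stands in for the disk, replacement is least recently used (LRU), and the class and method names are hypothetical.

```python
from collections import OrderedDict


class BufferManager:
    """Toy buffer manager; the LRU policy is an assumption of this sketch."""

    def __init__(self, disk: dict, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.disk = disk              # block id -> bytes, stand-in for a real disk
        self.capacity = capacity
        self.frames = OrderedDict()   # block id -> bytearray, least recently used first
        self.dirty = set()            # blocks modified since they were read or written

    def get_block(self, block_id):
        if block_id in self.frames:                  # already buffered
            self.frames.move_to_end(block_id)
            return self.frames[block_id]
        if len(self.frames) >= self.capacity:        # no free frame: evict one
            victim, contents = self.frames.popitem(last=False)
            if victim in self.dirty:                 # write back only if modified
                self.disk[victim] = bytes(contents)
                self.dirty.discard(victim)
        self.frames[block_id] = bytearray(self.disk[block_id])
        return self.frames[block_id]

    def mark_dirty(self, block_id):
        self.dirty.add(block_id)


disk = {1: b"aa", 2: b"bb", 3: b"cc"}
bm = BufferManager(disk, capacity=2)
buf = bm.get_block(1)
buf[0:2] = b"zz"          # modify block 1 in the buffer
bm.mark_dirty(1)
bm.get_block(2)
bm.get_block(3)           # buffer is full: block 1 is evicted and written back
```

Returning the bytearray plays the role of passing the block's main-memory address to the requester.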
repeated scans of data
• Mixed strategy with hints on replacement strategy provided by
the query optimizer is preferable
• Buffer manager can use statistical information regarding the
probability that a request will reference a particular relation
– E.g., the data dictionary is frequently accessed. Heuristic:
keep data-dictionary blocks in main memory buffer
records later.
– Link all free records on a free list
header
record 0 Perryridge A-102 400
record 1
record 2 Mianus A-215 700
record 3 Downtown A-101 500
record 4
record 5 Perryridge A-201 900
record 6
record 7 Downtown A-110 600
record 8 Perryridge A-218 700
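The free-list layout shown above, with records 1, 4 and 6 free, can be sketched as below. This is only an illustration: the class name and the placeholder record contents are hypothetical, and each free slot stores the index of the next free slot in place of a disk pointer.

```python
class FixedLengthFile:
    """Toy file of fixed-length record slots with a free list."""

    def __init__(self):
        self.slots = []          # each slot: ("used", record) or ("free", next_free_index)
        self.free_head = None    # file header: index of the first free slot

    def insert(self, record):
        if self.free_head is None:             # no free slot: extend the file
            self.slots.append(("used", record))
            return len(self.slots) - 1
        slot = self.free_head                  # reuse the first free slot
        self.free_head = self.slots[slot][1]
        self.slots[slot] = ("used", record)
        return slot

    def delete(self, slot):
        if self.slots[slot][0] != "used":
            raise ValueError(f"slot {slot} is already free")
        self.slots[slot] = ("free", self.free_head)   # link slot into the free list
        self.free_head = slot


f = FixedLengthFile()
for i in range(9):
    f.insert(("placeholder", i))     # made-up contents for records 0..8
for slot in (6, 4, 1):               # free 6, 4, 1 so the list reads header -> 1 -> 4 -> 6
    f.delete(slot)
reused = f.insert(("Brighton", "A-217", 750))   # lands in record 1
```

Insertion and deletion both touch only the header and one slot, so neither requires moving other records.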
records; such records are pinned.
– Attach an end-of-record (⊥) control character to the end of
each record
– Difficulty with deletion
– Difficulty with growth
[Figure: slotted-page structure, showing the block header (# Entries, then Size and Location per record) and the Free Space]
• Header contains:
– number of record entries
– end of free space in the block
– location and size of each record
• Records can be moved around within a page to keep them
contiguous with no empty space between them; entry in the
header must then be updated.
• Pointers should not point directly to the record; instead they
should point to the entry for the record in the header.
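A minimal sketch of the slotted-page idea described above, assuming records are packed at the end of the block and that the header is kept as a Python list and not serialized into the block. The class and method names are hypothetical.

```python
class SlottedPage:
    """Toy slotted page: header entries hold (location, size) per record."""

    def __init__(self, size: int = 64):
        self.data = bytearray(size)
        self.slots = []          # header entries; None marks a deleted record
        self.free_end = size     # end of free space; records live in data[free_end:]

    def insert(self, record: bytes) -> int:
        if len(record) > self.free_end:      # header space is ignored in this toy
            raise ValueError("page full")
        self.free_end -= len(record)
        self.data[self.free_end:self.free_end + len(record)] = record
        self.slots.append((self.free_end, len(record)))
        return len(self.slots) - 1           # callers keep the slot number

    def read(self, slot: int) -> bytes:
        entry = self.slots[slot]
        if entry is None:
            raise KeyError(slot)
        location, size = entry
        return bytes(self.data[location:location + size])

    def delete(self, slot: int) -> None:
        if self.slots[slot] is None:
            raise KeyError(slot)
        self.slots[slot] = None
        # Copy out the survivors, then repack them contiguously at the end.
        live = [(i, self.read(i)) for i, e in enumerate(self.slots) if e is not None]
        self.free_end = len(self.data)
        for i, record in live:
            self.free_end -= len(record)
            self.data[self.free_end:self.free_end + len(record)] = record
            self.slots[i] = (self.free_end, len(record))   # update header entry


page = SlottedPage()
a = page.insert(b"Perryridge A-102 400")
b = page.insert(b"Mianus A-215 700")
c = page.insert(b"Downtown A-101 500")
old_location = page.slots[c][0]
page.delete(b)            # record c moves, but its slot number stays valid
```

Because outside pointers refer to the slot number, moving record c during compaction only requires updating its header entry.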
2 Mianus A-215 700
3 Downtown A-101 500 A-110 600
4 Redwood A-222 700
5 Brighton A-217 750
• Pointers – the maximum record length is not known; a
variable-length record is represented by a list of fixed-length
records, chained together via pointers.
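The pointer method can be sketched as a chain of fixed-length records, using the Downtown accounts as data. The field names and helper functions are hypothetical, and a list index stands in for a disk pointer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FixedRecord:
    branch_name: str
    account_number: str
    balance: int
    next: Optional[int] = None    # index of the next record in the chain, if any


def append_account(file: list, start: int, account_number: str, balance: int) -> None:
    """Add one more account to the chain that begins at index start."""
    i = start
    while file[i].next is not None:       # walk to the last record in the chain
        i = file[i].next
    file.append(FixedRecord(file[start].branch_name, account_number, balance))
    file[i].next = len(file) - 1


def read_chain(file: list, start: int) -> list:
    """Collect every (account_number, balance) pair reachable from start."""
    result = []
    i: Optional[int] = start
    while i is not None:
        result.append((file[i].account_number, file[i].balance))
        i = file[i].next
    return result


file = [FixedRecord("Downtown", "A-101", 500)]
append_account(file, 0, "A-110", 600)
accounts = read_chain(file, 0)
```

Note that the branch name is repeated in every chained record here, which wastes space; the sketch keeps it only so that all records share one fixed layout.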
Redwood A-222 700
Brighton A-217 750
stored in the same file; related records are stored on the same
block
Perryridge A-201 900
Perryridge A-218 700
Redwood A-222 700
Round Hill A-305 350
sequential order
– good for queries involving depositor ⋈ customer, and for
queries involving one single customer and his accounts
– bad for queries involving only customer
– results in variable size records
implemented as B-trees, or as separate relations in the
database.
– Set fields can also be eliminated at the storage level by
normalization.
1. a volume or file identifier
2. a page identifier within the volume or file
3. an offset within the page
Translation Table
PageID  FullPageID
5001    679.34.28000
4867    519.56.84000
• Page with short page identifier 2395 was allocated address
5001. Observe change in pointers and translation table.
• Page with short page identifier 4867 has been allocated
address 4867. No change in pointers or translation table.
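The two cases above can be sketched as a single update step: rekey the translation-table entry and rewrite the pointers that use the old short page identifier. The function name and the pointer offsets are hypothetical; the page identifiers and full page IDs are the ones from the notes.

```python
def allocate_page(translation: dict, pointers: list, short_id: int, address: int) -> list:
    """Record that the page known as short_id now lives at address.

    translation maps page identifier -> full page ID; pointers is a list of
    (page identifier, offset) pairs stored in the page being processed.
    """
    if short_id == address:
        return pointers                       # nothing changes
    if address in translation:
        raise ValueError(f"address {address} is already in use")
    translation[address] = translation.pop(short_id)
    return [(address if page == short_id else page, offset)
            for page, offset in pointers]


translation = {2395: "679.34.28000", 4867: "519.56.84000"}
pointers = [(2395, 255), (4867, 20), (2395, 170)]    # offsets are made up

pointers = allocate_page(translation, pointers, 2395, 5001)   # pointers and table change
pointers = allocate_page(translation, pointers, 4867, 4867)   # no change
```

After both calls the table matches the one shown above, with entries for 5001 and 4867.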
back directly to disk
• A process should not access more pages than size of virtual
memory — reuse of virtual memory addresses for other pages
is expensive
• Can transparently convert from disk representation to form
required on the specific machine, language, and compiler,
when the object (or page) is brought into memory.