Dr. E. Papanasam Soc

This document discusses RAID (Redundant Array of Independent Disks), a way to organize multiple physical disks into an array that provides better performance and reliability than a single disk. The motivation for RAID was the need to improve I/O performance as processor speeds increased faster than disk speeds. The key aspects of RAID are distributing data across multiple disks and using redundant disk capacity to store parity information, which enables data recovery if a disk fails. The document then explains the RAID levels (0-6) and their characteristics in terms of performance, reliability and redundancy.
RAID

Dr. E. Papanasam
SoC
Impact of I/O on System Performance
• Suppose we have a benchmark that executes in 100 seconds of
elapsed time, of which 90 seconds is CPU time and the rest is I/O
time
• Suppose the number of processors doubles every two years, but
the processors remain the same speed, and I/O time doesn’t
improve
• How much faster will our program run at the end of six years?
• Elapsed time = CPU time + I/O time; I/O time = 10 seconds
Impact of I/O on System Performance
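• After six years the number of processors has doubled three times
(8 times as many), so CPU time falls to 90/8 ≈ 11 seconds
• The improvement in CPU performance after six years is 90/11 ≈ 8
• However, the improvement in elapsed time is only 100/21 ≈ 4.7
• I/O time has increased from 10% to 47% of the elapsed time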
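
The arithmetic in this example can be checked with a short sketch. The function name `speedup_after` and its parameters are illustrative assumptions, not part of the slides; it assumes CPU time shrinks in direct proportion to the number of processors and that I/O time stays constant.

```python
def speedup_after(years, cpu_time=90.0, io_time=10.0, doubling_period=2):
    """Return (cpu_speedup, elapsed_speedup, io_fraction) after `years`.

    Assumes perfect scaling of CPU time with processor count and
    unchanged I/O time.
    """
    processors = 2 ** (years // doubling_period)   # 8 after six years
    new_cpu = cpu_time / processors                # 90 / 8 = 11.25 s
    new_elapsed = new_cpu + io_time                # 21.25 s
    old_elapsed = cpu_time + io_time               # 100 s
    return cpu_time / new_cpu, old_elapsed / new_elapsed, io_time / new_elapsed


cpu_x, elapsed_x, io_frac = speedup_after(6)
print(f"CPU speedup: {cpu_x:.1f}x, elapsed speedup: {elapsed_x:.1f}x, "
      f"I/O share: {io_frac:.0%}")
```

The slides round the new CPU time to 11 seconds; the unrounded figures give the same 4.7 and 47%.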

• The parallel revolution needs to come to I/O as well as to
computation
• Accelerating I/O performance was the original motivation of disk
arrays
• In the late 1980s, the high performance storage of choice was
large, expensive disks
• By replacing a few large disks with many small disks,
performance would improve because there would be more read
heads
• Good match for multiple processors as well
• Many read/write heads mean the storage system could support
many more independent accesses as well as large transfers
spread across many disks
• Both high I/Os per second and high data transfer rates

• Disk arrays could make reliability much worse


• These smaller, inexpensive drives had lower MTTF ratings than
the large drives
• More importantly, by replacing a single drive with, say, 50 small
drives, the failure rate would go up by at least a factor of 50!
• The solution was to add redundancy so that the system could
cope with disk failures without losing information
• By having many small disks, the cost of extra redundancy to
improve dependability is small, relative to a few large disks
• Dependability was more affordable if you constructed a
redundant array of inexpensive disks
• Led to its name: redundant arrays of inexpensive disks (RAID)
• RAID: an organization of disks that uses an array of small,
inexpensive disks to increase both performance and
reliability
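
The factor-of-50 claim can be illustrated with a rough estimate. This is a minimal sketch that assumes independent disk failures at a constant rate and an array with no redundancy, so the array fails as soon as any one disk fails; `array_mttf` and the MTTF value are hypothetical.

```python
def array_mttf(disk_mttf_hours, num_disks):
    """Approximate MTTF of a non-redundant array.

    Assumes independent failures at a constant rate, so the array's
    failure rate is num_disks times that of a single disk.
    """
    return disk_mttf_hours / num_disks


single_disk_mttf = 1_000_000                 # hypothetical MTTF in hours
print(array_mttf(single_disk_mttf, 50))      # 20000.0
```

The slides say "at least" a factor of 50 because the small drives also had lower individual MTTF ratings than the large drive they replaced.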

• How much redundancy do you need?


• Do you need extra information to find the faults?
• Does it matter how you organize the data and the extra check
information on these disks?
RAID
• The rate of improvement in secondary storage performance has
been considerably lower than that of processors and main memory
• This mismatch has made the disk storage system the main focus
of concern in improving overall computer system performance
• Additional gains in performance can be achieved by using multiple
parallel components
• This led to the development of arrays of disks that operate
independently and in parallel
• With multiple disks, separate I/O requests can be handled in
parallel, as long as the data reside on separate disks
• A single I/O request can be executed in parallel if the block of
data to be accessed is distributed across multiple disks
RAID (2)
• With the use of multiple disks, there is a wide variety of ways in
which the data can be organized and in which redundancy can be
added to improve reliability
• RAID (Redundant Array of Independent Disks) is a standardized
scheme for multiple-disk database design
• Seven levels - zero through six
• Levels
– Do not imply a hierarchical relationship
– Designate different design architectures
Common characteristics of RAID
• Three common characteristics
– 1. RAID is a set of physical disk drives viewed as a single
logical drive by the operating system
– 2. Data are distributed across the physical drives of an array
– 3. Redundant disk capacity is used to store parity
information, which guarantees data recoverability in case
of a disk failure
• Details of the second and third characteristics differ for the
different RAID levels
• RAID 0 and RAID 1 do not support the third characteristic
RAID (3)
• The RAID strategy
– employs multiple disk drives
– distributes data in a way that enables simultaneous
access to data from multiple drives
– improves I/O performance
– allows easier incremental increases in capacity
– improves access time and reliability
• RAID - Redundant Array of Inexpensive Disks
• Inexpensive - the small, relatively inexpensive disks in the
RAID array are an alternative to a single large, expensive disk
RAID 0 (Non Redundant)
• Not a true member of the RAID family - It does not include
redundancy
• Data on the logical disk are divided into strips
• Striping - allocation of logically sequential blocks to separate disks
to allow higher performance than a single disk can deliver
• In the illustrated example, one logical disk is mapped onto four
physical disks
• Strips are assigned to the physical disks round robin
• Array management software carries out the distribution of data from
the logical disk to the physical disks
• High I/O request rate - requests can be handled in parallel (as long
as the data reside on separate disks), reducing the I/O queuing time
• High data transfer capacity - a single request can be executed in
parallel (if the block of data to be accessed is distributed across
multiple disks), reducing the I/O transfer time
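
The round-robin mapping of strips to disks can be sketched as follows. The function name `locate_strip` is a hypothetical helper, and the four-disk default simply mirrors the example in the slides.

```python
def locate_strip(logical_strip, num_disks=4):
    """Map a logical strip number to (disk index, strip position on that disk).

    Round robin: consecutive logical strips go to consecutive disks.
    """
    return logical_strip % num_disks, logical_strip // num_disks


for strip in range(8):
    disk, position = locate_strip(strip)
    print(f"strip {strip} -> disk {disk}, position {position}")
```

Strips 0 to 3 land on four different disks, so a request spanning them can be served by all four disks at once.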
RAID 0
RAID 1 (Mirroring or Shadowing)

• Traditional scheme for tolerating disk failure


• Uses twice as many disks as does RAID 0
• Data striping as in RAID 0
• Read from either - a read request can be serviced by either of the
two disks
RAID 1
• Write to both
– A write request requires both corresponding strips be
updated - can be done in parallel
• Recovery from a failure is simple
– If a disk fails, the system just goes to the “mirror” and reads its
contents to get the desired information
• Expensive
– Requires twice the disk space of the logical disk that it
supports
– A RAID 1 configuration is therefore usually limited to drives that
store system software and data and other highly critical files
• Achieve high I/O request rates if the bulk of the requests are
reads
• No significant performance gain over RAID 0 if a substantial
fraction of the I/O requests are write requests
RAID 2 (Redundancy through Hamming code)
• Use of a parallel access technique
• All disks participate in the execution of every I/O request
• Data striping – Bit level striping
• An error-correcting code is calculated across corresponding bits on
each data disk
• Bits of the code are stored in the corresponding bit positions on
multiple parity disks
• Hamming code is used, which is able to correct single-bit errors and
detect double-bit errors
RAID 2
• Requires fewer disks than RAID 1
• The number of redundant disks is proportional to the logarithm of
the number of data disks
• If there is a single-bit error, the controller can recognize and
correct the error instantly, so that the read access time is not
slowed
• On a single write, all data disks and parity disks must be
accessed for the write operation
RAID 3 (Bit-Interleaved Parity)
• Bit level striping
• Parity rather than Hamming code
• Simple parity bit for each set of corresponding bits
• Only one redundant disk, no matter how large the array
• Data on failed drive can be reconstructed from surviving data and
parity info
• Read - all data disks are accessed
• Write - all data disks and the redundant parity disk must be updated
• Only one request serviced at a time
RAID 3
• In the event of a drive failure, the parity drive is accessed and data
is reconstructed from the remaining devices
• Once the failed drive is replaced, the missing data can be restored
on the new drive and operation resumed
• Consider an array of five drives in which X0 through X3 contain
data and X4 is the parity disk
• The parity for the ith bit is
X4(i) = X3(i) ⊕ X2(i) ⊕ X1(i) ⊕ X0(i)
• Suppose that drive X1 has failed
• If we add X4(i) ⊕ X1(i) to both sides of the preceding equation,
we get
X1(i) = X4(i) ⊕ X3(i) ⊕ X2(i) ⊕ X0(i)
• Contents of each strip of data on X1 can be regenerated from the
contents of the corresponding strips on the remaining disks in the
array
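
Parity generation and reconstruction of a failed drive can be sketched as follows, with XOR applied bytewise rather than bit by bit. The helper `parity` and the strip contents are hypothetical; the drive names follow the five-drive example (X0 to X3 hold data, X4 holds parity).

```python
from functools import reduce
from operator import xor


def parity(strips):
    """Bytewise XOR of equally sized strips."""
    return bytes(reduce(xor, column) for column in zip(*strips))


# Hypothetical strip contents for data drives X0..X3.
x0, x1, x2, x3 = b"\x0f\x10", b"\xa5\x3c", b"\xff\x00", b"\x12\x34"
x4 = parity([x0, x1, x2, x3])            # contents of the parity drive

# Drive X1 fails: regenerate its strip from the surviving drives.
rebuilt_x1 = parity([x0, x2, x3, x4])
print(rebuilt_x1 == x1)                  # True
```

The same function serves both purposes because XOR is its own inverse.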
RAID 4 (Block level Parity)

• Block level striping
• Block – one or more words
• Each disk operates independently
• Separate I/O requests can be satisfied in parallel, provided each
read request is smaller than the striping unit (block)
RAID 4
• Write request – Requested data block as well as parity block to
be updated
• More suitable for applications that require high I/O request rates
• A small write request requires four disk I/Os (write penalty)
– Two reads
• One to read the old data and another to read the old parity,
both needed to compute the new parity
– One to write the new data
– One to write the new parity
– Each strip write therefore involves two reads and two writes
(read-modify-write procedure)
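
The read-modify-write procedure can be sketched as follows. The new parity is computed from the old parity, the old data and the new data alone, so the other data disks are never touched. The helper names and strip contents are hypothetical.

```python
def xor_strips(*strips):
    """Bytewise XOR of equally sized strips."""
    out = bytearray(len(strips[0]))
    for strip in strips:
        for i, byte in enumerate(strip):
            out[i] ^= byte
    return bytes(out)


def small_write_parity(old_data, old_parity, new_data):
    """New parity = old parity XOR old data XOR new data."""
    return xor_strips(old_parity, old_data, new_data)


# Hypothetical one-byte strips on data disks X0..X3.
x0, x1, x2, x3 = b"\x0f", b"\xa5", b"\xff", b"\x12"
old_parity = xor_strips(x0, x1, x2, x3)

new_x1 = b"\x5a"                                   # small write to X1 only
new_parity = small_write_parity(x1, old_parity, new_x1)
print(new_parity == xor_strips(x0, new_x1, x2, x3))  # True
```

The final comparison shows that the shortcut gives the same result as recomputing parity over the whole stripe.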
RAID 4
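• Suppose that a write is performed that only involves a strip on
disk X1.
• Initially, for each bit i, we have the following relationship
X4(i) = X3(i) ⊕ X2(i) ⊕ X1(i) ⊕ X0(i)
• After the update, with changed bits marked by a prime, the new
parity is
X4'(i) = X4(i) ⊕ X1(i) ⊕ X1'(i)
Figure: Small write update on RAID 3 versus RAID 4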
RAID 5 (Block level Distributed Parity)
• One drawback of RAID 4 is that the parity disk must be updated on
every write, so the parity disk is the bottleneck for back-to-back
writes
• RAID 5 is organized like RAID 4, except that parity strips are
distributed across all disks
• Parity strip allocation – Round Robin scheme
• Avoids potential I/O bottleneck found in RAID 4
RAID 4 Vs RAID 5
• RAID 5 allows multiple writes to occur simultaneously as long
as the parity blocks are not located on the same disk
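
A round-robin parity placement, and the resulting condition for simultaneous writes, can be sketched as follows. The rotation used here (parity starts on the last disk and moves one disk to the left per stripe) is an assumption; real implementations use various layouts. The check is also slightly stricter than the bullet above: it requires that the two data disks differ as well as the two parity disks.

```python
def parity_disk(stripe, num_disks):
    """Disk holding the parity strip of a stripe (assumed rotation)."""
    return (num_disks - 1 - stripe) % num_disks


def can_write_in_parallel(write_a, write_b, num_disks):
    """Each write is a (stripe, data_disk) pair.

    Two small writes can proceed together only if their data disks and
    parity disks are four distinct disks.
    """
    disks = {
        write_a[1], parity_disk(write_a[0], num_disks),
        write_b[1], parity_disk(write_b[0], num_disks),
    }
    return len(disks) == 4


print([parity_disk(s, 5) for s in range(6)])         # [4, 3, 2, 1, 0, 4]
print(can_write_in_parallel((0, 0), (1, 1), 5))      # True
print(can_write_in_parallel((0, 0), (5, 1), 5))      # False
```

In the last call both stripes keep their parity on disk 4, so the writes must be serialized, which is the situation RAID 4 is in for every pair of writes.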
RAID 6 (Dual Redundancy)
• Two different parity calculations
• P - odd-even parity and Q - Reed-Solomon code
• Stored in separate blocks on different disks
• Number of disks required = N + 2 (N = number of disks required for data)
• Possible to regenerate data even if two disks containing user data
fail
• Substantial write penalty, because each write affects two parity
blocks
• Hot-swapping - replacing a hardware component while the system is
running
• Standby spares - reserve hardware resources that can immediately
take the place of a failed component
Summary of RAID Levels
• Data transfer capacity (ability to move data)
• I/O request rate (ability to satisfy I/O requests)
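
The disk requirements stated for each level can be collected in one sketch. The function names are hypothetical. For RAID 2 the count uses the single-error-correcting Hamming bound as an assumption; the slides only say the number of redundant disks grows with the logarithm of the number of data disks.

```python
def hamming_check_disks(data_disks):
    """Smallest r with 2**r >= data_disks + r + 1.

    Assumption: single-error-correcting Hamming code; r grows roughly
    with log2 of the number of data disks.
    """
    r = 1
    while 2 ** r < data_disks + r + 1:
        r += 1
    return r


def disks_required(level, data_disks):
    """Total disks needed to hold `data_disks` disks' worth of data."""
    redundant = {
        0: 0,                                  # no redundancy
        1: data_disks,                         # full mirror
        2: hamming_check_disks(data_disks),    # Hamming code disks
        3: 1, 4: 1, 5: 1,                      # one disk's worth of parity
        6: 2,                                  # P and Q
    }[level]
    return data_disks + redundant


for level in range(7):
    print(f"RAID {level}: {disks_required(level, 4)} disks for 4 disks of data")
```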
