COMP2304 Computer Architecture and Organization
Redundant Arrays of Inexpensive Disks
By:
Dwi Putri Wahyuningsih
2019390004
Assignment Execution Date: March 5, 2019
Submission Date: March 27, 2019
Faculty of Engineering & Technology
Sampoerna University
I. Introduction
RAID (Redundant Array of Independent Disks) is a data storage technology that provides
fault tolerance on storage media such as hard disks by keeping redundant data. It can be
implemented either in software or in a separate hardware RAID unit. The acronym is also
expanded as Redundant Array of Inexpensive Disks, Redundant Array of Independent Drives,
and Redundant Array of Inexpensive Drives. The technology divides or replicates data across
separate hard disks. RAID is designed to increase data reliability and improve the I/O
performance of disk storage. It organizes multiple disks so that they can be accessed in
parallel, with redundancy added to increase reliability. Because the disks work in
parallel, the combined throughput is higher than that of a single disk.
In its implementation, RAID is a set of hard disks arranged so that the array offers
more total capacity than even the largest single hard disk available. The need for RAID
arises because the demand for data storage grows faster than hard disk capacity. In a
data center environment, the capacity required can be thousands of times that of one hard
disk. In 2007, IDC estimated that information was growing more rapidly than the
available storage capacity.
Figure 1. Information growth vs. available storage capacity (source: IDC)
RAID serves three purposes: speed (striping), data security (mirroring), or both.
Initially, RAID was used only in servers, where data security and speed are essential, and
building a RAID configuration required a very expensive dedicated RAID card. RAID is not as
complicated as it may seem, because it rests on only two basic principles: striping and
mirroring.
Striping divides the work among 2 or more hard drives so that a single piece of data is
processed by all of them at the same time. The disadvantage of striping is that if one drive in
the array fails, the fragments stored on the remaining drives are no longer usable, because
every file is missing the part that was on the failed drive. Mirroring writes the same data to
another drive in real time, so it is intended for data security. Its disadvantage is the loss of
capacity.
II. RAID levels
There are many RAID levels, for example RAID 0, RAID 1, and RAID 2. The level
determines how data and redundancy are arranged across the disks. The levels most
commonly used by home users are RAID 0 and RAID 1.
i. RAID Level 0
RAID level 0, also known as striping mode, requires a minimum of 2 hard
drives. It combines the capacity of several hard disks so that, logically, only one
large drive is visible, with the total capacity of all the member drives. Initially,
RAID 0 was used to form one very large partition from several hard disks in a
cost-efficient manner.
RAID level 0 uses a collection of disks with striping at the block level and
no redundancy. Data written to the array is divided into fragments (blocks), and
the fragments are spread across all the hard disks. If one of the hard disks is
physically damaged, the data cannot be read at all. Data can be accessed faster
with RAID 0 because while the computer reads a fragment from one hard disk, it
can read other fragments from another hard disk at the same time.
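The block-level striping described above can be sketched in a few lines of Python. This is a minimal illustration, not how a real controller works: the function names `stripe` and `unstripe`, the in-memory lists that stand in for disks, and the 2-byte block size are all assumptions made for the example.

```python
def stripe(data: bytes, num_disks: int, block_size: int) -> list[list[bytes]]:
    """Split data into fixed-size blocks and deal them round-robin to the disks."""
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), block_size):
        block_index = offset // block_size
        disks[block_index % num_disks].append(data[offset:offset + block_size])
    return disks


def unstripe(disks: list[list[bytes]]) -> bytes:
    """Reassemble the data by reading one block from each disk in turn."""
    blocks = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                blocks.append(disk[row])
    return b"".join(blocks)


disks = stripe(b"ABCDEFGHIJ", num_disks=2, block_size=2)
print(disks)            # [[b'AB', b'EF', b'IJ'], [b'CD', b'GH']]
print(unstripe(disks))  # b'ABCDEFGHIJ'
```

Each disk holds only every second block, so losing either list makes the original byte string unrecoverable.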
Drive number requirements:
• At least two drives are required for RAID Level 0 striping; a single drive gives
no parallelism.
• RAID 0 volume groups can have more than 30 drives.
• We can create a volume group that includes all drives in the storage array.
Advantages:
• RAID 0 uses the full hard disk space because no capacity is set aside for
redundancy.
• RAID 0 is faster than a single drive because reads and writes are spread
across drives that work in parallel.
Disadvantages:
• There is no protection. If we lose a single hard disk, all our data is lost.
• Because we use two hard drives without redundancy, the risk of data loss is
roughly doubled, so storing the data on a single hard disk would be safer.
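The doubled risk can be made concrete with a short calculation. The sketch below assumes that drives fail independently and with the same probability over some period; the 3% figure is a made-up value for illustration, not a measured failure rate.

```python
def array_failure_probability(p_single: float, num_disks: int) -> float:
    """Probability that at least one of num_disks independent drives fails.

    A RAID 0 array loses its data as soon as any one member drive fails.
    """
    return 1.0 - (1.0 - p_single) ** num_disks


p = 0.03  # hypothetical per-drive failure probability
print(round(array_failure_probability(p, 1), 4))  # 0.03
print(round(array_failure_probability(p, 2), 4))  # 0.0591
```

With two drives the result is just under twice the single-drive value, which is where the "risk is doubled" rule of thumb comes from.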
ii. RAID Level 1
RAID 1 is a mirroring method used to protect data through redundancy. In
RAID 1, data written to the first hard disk is also copied to the second hard disk.
This method can improve read performance, but the number of disks needed is
doubled, so the cost is high. If one of the hard disks is damaged, the data remains
safe because a copy exists on the second hard disk. When the damaged hard disk
is replaced, RAID 1 automatically copies the data to the new hard disk. It must be
remembered that the usable capacity of RAID 1 is only half of the total capacity of
the hard disks in the array: half the capacity stores data and the other half stores
the duplicate copy. This arrangement is also known as mirror mode or data
protection mode.
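The mirroring behaviour described above (duplicate writes, reads that survive a failure, and a rebuild after replacement) can be modelled with a toy class. The class name `MirroredPair` and its methods are hypothetical and exist only for this sketch; the "disks" are plain Python lists.

```python
class MirroredPair:
    """Toy RAID 1 volume: two in-memory 'disks' that hold identical blocks."""

    def __init__(self, num_blocks: int):
        self.disks = [[b""] * num_blocks, [b""] * num_blocks]

    def write(self, block_no: int, data: bytes) -> None:
        # Every write goes to each disk that is still working.
        for disk in self.disks:
            if disk is not None:
                disk[block_no] = data

    def read(self, block_no: int) -> bytes:
        # Any working disk can serve the read.
        for disk in self.disks:
            if disk is not None:
                return disk[block_no]
        raise IOError("both disks have failed")

    def fail(self, disk_no: int) -> None:
        self.disks[disk_no] = None

    def replace(self, disk_no: int) -> None:
        # Rebuild the new disk by copying every block from the survivor.
        survivor = self.disks[1 - disk_no]
        if survivor is None:
            raise IOError("no surviving copy to rebuild from")
        self.disks[disk_no] = list(survivor)


volume = MirroredPair(num_blocks=4)
volume.write(0, b"payroll")
volume.fail(0)
print(volume.read(0))  # b'payroll'
volume.replace(0)
```

The volume exposes four blocks even though eight block slots exist in total, which is the capacity cost of mirroring.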
Drive number requirements:
• A minimum of two drives is required for RAID 1: one drive for the user
data, and one drive for the mirrored data.
• We must have an even number of drives in the volume group. If we do not
have an even number of drives and we have some remaining unassigned
drives, select Storage > Pools & Volume Groups to add additional drives to
the volume group and retry the operation.
• Volume groups can have more than 30 drives. A volume group can be
created that includes all the drives in the storage array.
Advantages:
• Redundancy
• Read speed, because either copy can serve a read.
Disadvantage:
Hard disk space is not used efficiently. Because the two hard disks are copies
of each other, only half the combined size is usable.
iii. RAID Level 1+0 or 10
RAID 1+0 combines the RAID 1 (mirroring) and RAID 0 (striping) schemes
by running a RAID 0 stripe on top of RAID 1 mirrors. The RAID 1+0 scheme is
intended to obtain the read/write speed of RAID 0 together with the protection of
RAID 1. In RAID 1+0, the disks are mirrored in pairs, and data is striped across
the pairs.
RAID 10 provides very high I/O rates by striping across mirrored RAID 1
segments. This RAID mode is well suited to business-critical database systems
that require maximum performance and high fault tolerance. A system set to
RAID 10 provides half the total capacity of all drives in the array.
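One way to picture "mirrored in pairs, then striped" is to compute where a logical block lands. The sketch below assumes that disks 2k and 2k+1 form mirror pair k and that blocks are dealt round-robin across the pairs; real arrays may number and order their disks differently, and `raid10_locate` is a name invented for this example.

```python
def raid10_locate(block_no: int, num_disks: int) -> tuple[int, int, int]:
    """Return (primary disk, mirror disk, row) for a logical block."""
    if num_disks < 4 or num_disks % 2:
        raise ValueError("this sketch expects an even number of disks, at least 4")
    num_pairs = num_disks // 2
    pair = block_no % num_pairs   # RAID 0 step: choose the mirror pair
    row = block_no // num_pairs   # position of the block within that pair
    return 2 * pair, 2 * pair + 1, row  # RAID 1 step: both disks get a copy


for block in range(4):
    print(block, raid10_locate(block, num_disks=4))
# 0 (0, 1, 0)
# 1 (2, 3, 0)
# 2 (0, 1, 1)
# 3 (2, 3, 1)
```

Consecutive blocks go to different pairs, so they can be transferred in parallel, while each block still exists on two disks.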
Drive number requirements:
• If you select four or more drives, RAID 10 is automatically configured
across the volume group: two drives for user data, and two drives for the
mirrored data.
• RAID 10 volume groups can have more than 30 drives. A volume group
can be created that includes all the drives in the storage array.
Advantages:
• Offers the same protection as RAID 1, with better read and write
performance than RAID 1.
• If a disk is damaged, its mirror partner can still be accessed.
Disadvantage:
The cost is high, because half of the drives hold mirror copies.
iv. RAID Level 2
RAID level 2 is organized around error-correcting codes (ECC), much like
memory, where errors are detected using parity bits. Each byte of data has a
corresponding parity bit that records whether the number of bits set to 1 in the
byte is even (parity = 0) or odd (parity = 1). If one of the bits in the byte changes,
the recomputed parity no longer matches the stored parity bit. An error-correcting
code extends this idea with several check bits, so that if one of the disks fails, the
data can be reconstructed by reading the error-correction bits on the other disks.
RAID 2 also uses striping, but adds three more hard disks for Hamming-code
parity so that the data becomes more reliable. Therefore, the number of hard
disks needed is at least 5 (n + 3, n > 1). The last three hard disks store the
Hamming code calculated from the bits on the other hard disks.
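To show how three check bits can locate and repair a single wrong bit, here is a sketch of the classic Hamming(7,4) code. Treating each of the seven bit positions as a separate disk (four data disks plus three check disks) is an assumption made for illustration; the function names are hypothetical.

```python
def hamming74_encode(data_bits: list[int]) -> list[int]:
    """Encode 4 data bits into 7 bits; positions 1, 2 and 4 hold the check bits."""
    c = [0] * 8                      # index 0 is unused so indices match positions
    c[3], c[5], c[6], c[7] = data_bits
    c[1] = c[3] ^ c[5] ^ c[7]        # covers positions whose number has bit 1 set
    c[2] = c[3] ^ c[6] ^ c[7]        # covers positions whose number has bit 2 set
    c[4] = c[5] ^ c[6] ^ c[7]        # covers positions whose number has bit 4 set
    return c[1:]


def hamming74_correct(word: list[int]) -> tuple[list[int], int]:
    """Return (data bits, syndrome); a nonzero syndrome is the repaired position."""
    c = [0] + list(word)
    syndrome = 0
    for position in range(1, 8):
        if c[position]:
            syndrome ^= position
    if syndrome:
        c[syndrome] ^= 1             # flip the bit that the syndrome points to
    return [c[3], c[5], c[6], c[7]], syndrome


codeword = hamming74_encode([1, 0, 1, 1])
print(codeword)                      # [0, 1, 1, 0, 0, 1, 1]
damaged = list(codeword)
damaged[4] ^= 1                      # corrupt position 5 (list index 4)
print(hamming74_correct(damaged))    # ([1, 0, 1, 1], 5)
```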
Advantages:
Good reliability, because damaged data can be reconstructed with the ECC, and
fewer redundant bits are needed than with level 1 (mirroring).
Disadvantages:
Parity bits must be calculated, so writing or changing data takes longer than
without parity. This level also requires special disks for its implementation,
which are quite expensive.
v. RAID Level 3
RAID 3 also uses striping and an additional hard disk for reliability, but it
adds only one hard drive for parity. Therefore, the minimum number of hard disks
is 3 (n + 1; n > 1). The last hard disk stores the parity calculated from the bits on
the other hard disks. RAID level 3 is organized with bit-interleaved parity. This
organization is almost the same as RAID level 2; the difference is that RAID level
3 requires only one redundant disk, regardless of the number of data disks. It does
not use ECC, only a single parity bit for the set of bits that occupy the same
position on each data disk. Because the failed disk is known, that one parity bit is
enough to reconstruct its contents. RAID 3 also accesses the disks in parallel.
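A single parity disk is enough to rebuild one known-failed disk because XOR is its own inverse. The following sketch uses short byte strings as stand-ins for disk contents; `xor_blocks` and `rebuild` are names chosen for this example only.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving_blocks: list[bytes], parity: bytes) -> bytes:
    """Recover the one missing block from the survivors and the parity."""
    return xor_blocks(surviving_blocks + [parity])


data_disks = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"]
parity_disk = xor_blocks(data_disks)
print(parity_disk.hex())  # 96a4

# Pretend that the middle disk has failed and rebuild it from the rest.
recovered = rebuild([data_disks[0], data_disks[2]], parity_disk)
print(recovered == data_disks[1])  # True
```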
Advantages:
Good reliability, and faster data access because the bits are read from several
disks in parallel. Only 1 redundant disk is needed, which makes it more
economical than levels 1 and 2.
Disadvantages:
Parity bits must be calculated and written, resulting in lower write performance
than levels that do not use parity.
vi. RAID Level 4
RAID level 4 is organized with block-interleaved parity: it stripes data at
the block level and stores, on a separate disk, a parity block for each set of
corresponding data blocks on the other disks. If a disk fails, the parity blocks can
be used to reconstruct the data blocks of the failed disk. It is the same as RAID 3,
except that parity is computed per block rather than per bit. The minimum hard
drive requirement is also the same, 3 (n + 1; n > 1).
Advantages:
• High data transfer speed for reads.
• Good reliability because of the block parity.
Disadvantage:
• Reading a block uses only 1 disk, but writing 1 block requires 4 disk
accesses: 2 to read the old data block and the old parity block, and 2 to
write the new data block and the new parity block (read-modify-write).
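The four accesses of a small write can be traced in code. The sketch relies on the identity new parity = old parity XOR old data XOR new data, which avoids reading the other data disks. The list-based "disks" and the function name `small_write` are assumptions made for the example.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(data_disks, parity_disk, disk_no, row, new_data) -> int:
    """Update one block and its parity; return the number of disk accesses."""
    old_data = data_disks[disk_no][row]       # access 1: read old data
    old_parity = parity_disk[row]             # access 2: read old parity
    new_parity = xor_bytes(xor_bytes(old_parity, old_data), new_data)
    data_disks[disk_no][row] = new_data       # access 3: write new data
    parity_disk[row] = new_parity             # access 4: write new parity
    return 4


data_disks = [[b"\x01"], [b"\x02"], [b"\x04"]]
parity_disk = [b"\x07"]                       # 0x01 ^ 0x02 ^ 0x04
accesses = small_write(data_disks, parity_disk, disk_no=1, row=0, new_data=b"\x0a")
print(accesses, parity_disk[0].hex())         # 4 0f
```

Every small write touches the one parity disk, which is why that disk limits write throughput in RAID 4.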
vii. RAID Level 5
RAID 5, also called disk striping with distributed parity, stripes data like
RAID 0. The difference between the two is parity, which is used to reconstruct
lost data. RAID 5 uses block-level striping with the parity blocks distributed
across all hard disks, so that no single disk becomes a bottleneck for parity
updates. If the parity is stored on only one hard disk, the scheme is RAID 3 or
RAID 4 (disk striping with dedicated parity). With parity, a RAID 5 system
continues to function if one of the hard disks in the array is damaged.
RAID 5 tolerates the failure of only one disk. For example, if the
configuration consists of 4 hard drives of 1 TB each, the usable storage capacity
is 3 TB, because the equivalent of one drive (1 TB) holds parity. For systems with
three or four drives, RAID 5 is the recommended setting.
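The idea of distributed parity is easiest to see as a table of which disk holds the parity block in each stripe. The rotation used below (parity starts on the last disk and moves one disk to the left per stripe) is only one possible convention and is assumed here for illustration; `raid5_layout` is a hypothetical helper.

```python
def raid5_layout(num_disks: int, num_stripes: int) -> list[list[str]]:
    """Return one row per stripe; 'P' marks parity and 'Dk' marks data block k."""
    layout = []
    for stripe in range(num_stripes):
        parity_disk = (num_disks - 1 - stripe) % num_disks
        block = stripe * (num_disks - 1)      # first data block of this stripe
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append("P")
            else:
                row.append(f"D{block}")
                block += 1
        layout.append(row)
    return layout


for row in raid5_layout(num_disks=4, num_stripes=4):
    print(row)
# ['D0', 'D1', 'D2', 'P']
# ['D3', 'D4', 'P', 'D5']
# ['D6', 'P', 'D7', 'D8']
# ['P', 'D9', 'D10', 'D11']
```

Each disk holds exactly one parity block per four stripes, so parity updates are shared evenly and a quarter of every disk is spent on redundancy.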
Drive number requirements:
• We must have a minimum of three drives in the volume group.
• Typically, we are limited to a maximum of 30 drives in the volume group.
Advantages:
• Offers the benefits of level 4, while distributing the parity avoids
overloading a single parity disk as in RAID level 4.
• Fast performance by spreading data across all drives.
• Data protection: in a four-drive system, the equivalent of a quarter of each
drive is dedicated to fault tolerance, leaving three-quarters of the system's
capacity for data storage.
Disadvantage:
An additional mechanism is needed to calculate the location of the parity, which
affects the speed of reading and writing blocks.
viii. RAID Level 6
RAID 6 is, in general, an improvement on RAID 5 that increases the number
of parity blocks per stripe to 2 (P + Q). The minimum number of hard disks is
therefore 4 (n + 2; n > 1). With this second parity, the simultaneous failure of
two hard disks can still be tolerated. RAID level 6 stores the additional redundant
information to guard against several disks failing at once. It performs two
different parity calculations and stores the results in separate blocks on different
disks. If the data occupies n disks, the number of disks needed for RAID level 6
is n + 2.
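The text does not say how the two parities differ, so the following is a sketch of one widely used construction rather than a description of any particular product: P is the plain XOR of the data, and Q is a weighted sum computed in the finite field GF(2^8) with the polynomial 0x11D and generator 2. One byte per disk is used to keep the example short, and all function names are hypothetical.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8), reducing by the polynomial 0x11D."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(a: int, n: int) -> int:
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def gf_inv(a: int) -> int:
    """Inverse of a nonzero byte: a^254, because a^255 == 1 in this field."""
    return gf_pow(a, 254)


def compute_pq(data: list[int]) -> tuple[int, int]:
    """P is the XOR of all bytes; Q weights the byte of disk i by 2^i first."""
    p = q = 0
    for i, d in enumerate(data):
        p ^= d
        q ^= gf_mul(gf_pow(2, i), d)
    return p, q


def recover_two(data: list, x: int, y: int, p: int, q: int) -> tuple[int, int]:
    """Rebuild the bytes of failed data disks x and y (their entries are None)."""
    survivors = [0 if d is None else d for d in data]
    p_partial, q_partial = compute_pq(survivors)
    a = p ^ p_partial                 # equals Dx ^ Dy
    b = q ^ q_partial                 # equals 2^x * Dx ^ 2^y * Dy
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    dx = gf_mul(b ^ gf_mul(gy, a), gf_inv(gx ^ gy))
    return dx, a ^ dx


data = [0x12, 0xAB, 0x7F, 0x00]
p, q = compute_pq(data)
damaged = [None, 0xAB, None, 0x00]    # disks 0 and 2 have failed
print(recover_two(damaged, 0, 2, p, q) == (0x12, 0x7F))  # True
```

Two independent equations (P and Q) allow two unknowns to be solved, which is why two simultaneous failures are survivable and a third is not.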
Drive number requirements:
• In a volume group, we must have a minimum of five drives, one more than
the minimum of four that the scheme itself needs.
• Typically, we are limited to a maximum of 30 drives in the volume group.
Advantage:
Data reliability is very high, because data is lost only if three disks fail within
the time needed to recover the data (Mean Time to Repair).
Disadvantage:
There is a time penalty when writing data, because every write must update two
parity blocks.
III. Summary
RAID is useful for obtaining a large storage capacity by combining multiple hard
disks (RAID 0). RAID is also useful for obtaining a system that tolerates a damaged
hard disk (all levels other than RAID 0). RAID support can be obtained through RAID
controller hardware or through software RAID. The Windows 7 operating system
supports software implementations of RAID 0 and RAID 1. The mdadm application in a
Linux environment supports implementations of RAID 0, RAID 1, RAID 4, RAID 5,
RAID 6 and RAID 1+0.
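The capacity figures quoted for the individual levels can be collected into one small helper for comparison. This is a sketch under the simplifying assumption that all drives have the same size; RAID 2 is left out because its number of check disks depends on the code used, and `usable_capacity` is a name invented for this example.

```python
def usable_capacity(level: str, num_disks: int, disk_size: float) -> float:
    """Usable capacity of an array of equal-sized drives for a given RAID level."""
    if level == "0":
        data_disks = num_disks            # no redundancy
    elif level in ("1", "10"):
        data_disks = num_disks / 2        # half of the drives hold mirror copies
    elif level in ("3", "4", "5"):
        data_disks = num_disks - 1        # one drive's worth of parity
    elif level == "6":
        data_disks = num_disks - 2        # two drives' worth of parity
    else:
        raise ValueError(f"unsupported RAID level: {level}")
    return data_disks * disk_size


for level in ("0", "10", "5", "6"):
    print(level, usable_capacity(level, num_disks=4, disk_size=1.0))
# 0 4.0
# 10 2.0
# 5 3.0
# 6 2.0
```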
IV. Resources
1. Help,
mysupport.netapp.com/NOW/public/eseries/sam_archive1150/index.html#page/GUID-8538272A-B802-49D9-9EA2-96C82DAD26A2/GUID-1BF9A33B-C3A1-487C-B8D8-5F2C14E3ED2E.html.
2. Kuncara, Purba, et al. “Mengenal Teknologi RAID Pada HDD.” KlikHost, 5 Oct.
2012, klikhost.com/mengenal-teknologi-raid-pada-hdd/.
3. “RAID Levels Explained.” RAID Levels and Types,
www.enterprisestorageforum.com/storage-management/raid-levels.html.
4. Das, Anirban. “RAID Levels 0, 1, 4, 5, 6, 10 Explained.” Boolean World, Boolean
World, 15 Mar. 2018, www.booleanworld.com/raid-levels-explained/.