
RAID

RAID (Redundant Array of Independent Disks) is a technology that combines multiple small, independent disk drives into an array that appears to the system as a single, large disk drive. Simply putting n disk drives together (as in a JBOD, "just a bunch of disks") results in a system whose failure rate is n times that of a single disk. This high failure rate makes the plain disk array impractical for the high-reliability, large-capacity needs of enterprise storage.

System reliability is measured in MTTF (Mean Time To Failure) or FITS (Failures In Time). MTTF is the mean of the life distribution for the population of devices under operation, or equivalently the expected lifetime of an individual device; it is usually measured in hours. FITS is the failure rate expressed in failures per 10^9 device-hours. Let MTTF_Disk and FITS_Disk be the values for a single disk device, and MTTF_JBOD and FITS_JBOD the values for a JBOD system of n disks. The failure-rate parameters then follow the equations below:

MTTF_Disk = 10^9 / FITS_Disk

FITS_JBOD = n × FITS_Disk = n × 10^9 / MTTF_Disk

MTTF_JBOD = 10^9 / FITS_JBOD = MTTF_Disk / n
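To make the conversion concrete, here is a minimal Python sketch of the MTTF/FITS relationship for a single disk and an n-disk JBOD; the function names are illustrative, and the example values are the ones used in the text.

```python
# Sketch: MTTF/FITS relationship for a single disk vs. an n-disk JBOD.
# Assumes independent failures at a uniform rate.

FIT_HOURS = 1e9  # FITS is expressed in failures per 10^9 device-hours


def fits_from_mttf(mttf_hours: float) -> float:
    """Convert an MTTF in hours to a FITS failure rate."""
    return FIT_HOURS / mttf_hours


def jbod_mttf(mttf_disk_hours: float, n_disks: int) -> float:
    """MTTF of an n-disk JBOD: the failure rate scales by n,
    so the MTTF shrinks by a factor of n."""
    fits_jbod = n_disks * fits_from_mttf(mttf_disk_hours)
    return FIT_HOURS / fits_jbod  # equivalently mttf_disk_hours / n_disks


if __name__ == "__main__":
    # Example from the text: 1.5M-hour disks, 100-disk JBOD.
    mttf = jbod_mttf(1.5e6, 100)
    print(f"JBOD MTTF: {mttf:.0f} hours (~{mttf / 8760:.1f} years)")
    # -> JBOD MTTF: 15000 hours (~1.7 years)
```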

With the current typical Fibre Channel disk drive MTTF of 1.5M hours, a JBOD system with 100 disk drives has an MTTF of 15,000 hours, or about 1.7 years. Without RAID technology, this raw MTTF is clearly too low for applications that demand high reliability, and it looks even worse once the multi-shelf scalability requirements of enterprise storage are considered. The analysis later in this section will show that RAID technology can dramatically increase the MTTF of disk array systems.

The basic concepts of RAID were described in [9] as five types of array architectures, called RAID levels, each providing disk fault tolerance and each offering different feature sets and performance trade-offs. The industry later introduced proprietary RAID implementations that combine and vary the basic RAID levels. The basic RAID levels and some of the most popular extensions are discussed in the following paragraphs.

RAID 0 Data Striping

After the initial RAID concept was introduced, RAID 0 was adopted to describe non-redundant disk arrays, in which data striping is used only to increase the capacity and data throughput of the disk array.

RAID 1 Mirroring

RAID level 1 describes a data mirroring system. Mirroring provides data redundancy by writing the same data to both sides of the mirror. Level 1 is simple to implement, provides good data reliability, and doubles the read performance of the array, all at the cost of doubling the required storage capacity.

RAID 2 Striping with ECC

RAID level 2 uses Hamming codes to generate ECC (Error Correction Code) checksums. The data and the ECC checksums are striped across multiple disk drives. The basic idea is for the ECC to correct single or multiple bit failures across the disk drives. In practice, although ECC codes are often used inside a disk drive to correct bit errors from the physical storage media, they are unsuitable for protecting data across multiple disks. Typical disk drive failures are catastrophic mechanical failures: the drive either works properly or it does not work at all. To protect the disk array against a single disk drive failure, a simple XOR (Exclusive OR) parity is as good as the more complicated ECC code.
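To illustrate why simple parity suffices for whole-disk failures, the sketch below (hypothetical helper name) generates an XOR parity block over a stripe of data blocks and reconstructs a single lost block from the survivors.

```python
# Sketch: XOR parity over a stripe of equal-sized blocks, and
# reconstruction of a single failed block from the survivors.
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


# Three data blocks in a stripe (toy 8-byte blocks for illustration).
d0, d1, d2 = b"AAAAAAAA", b"BBBBBBBB", b"CCCCCCCC"
parity = xor_blocks(d0, d1, d2)

# Suppose the drive holding d1 fails: XOR of the surviving data blocks
# with the parity block recovers the lost data.
recovered = xor_blocks(d0, d2, parity)
assert recovered == d1
```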

For this reason, XOR parity is the primary checksum mechanism for RAID architectures including RAID 3, 4 and 5. RAID 2 is conceptually intuitive, but it has rarely, if ever, been implemented in a real storage system.

RAID 3 Byte Striping with Parity

RAID level 3 stripes data bytes across a group of disks in a similar way to RAID 2, and generates parity information over the data on a dedicated parity disk. If a disk drive fails, the data can be restored on the fly by calculating the exclusive OR (XOR) of the data from the remaining drives. RAID 3 provides high reliability at the cost of one additional parity drive per RAID group, so its storage capacity requirement is much lower than that of mirroring (RAID 1). The major drawback of level 3 is that every read or write operation must access all drives in a group, so only one request can be pending at a time (i.e., access is sequential). Because disk access (seek) time dominates the overhead of random disk accesses, the sequential access pattern of RAID 3 makes it impossible to hide access latency by overlapping multiple I/O transactions over time. The byte striping method of RAID 3 also imposes certain restrictions on the number of disks in a RAID group and on the size of the logical block: the most efficient block size depends on the number of disks (Group_size x Sector_size), and some RAID configurations can result in unusual block sizes that are difficult for operating systems to deal with.

RAID 4 Block Striping with Parity

RAID 4 uses the same XOR parity checksum technique as RAID 3 to provide data redundancy, except that the checksum is calculated over disk blocks. A dedicated drive is assigned to store the checksum blocks. Because the data blocks are striped across the data drives in the RAID group, RAID 4 can perform multiple asynchronous read transactions on different disk drives simultaneously, giving a very good read transaction rate. The write performance, however, is still limited to the transaction rate of a single disk, as any block write requires the corresponding block on the parity disk to be updated. The parity disk becomes the system performance bottleneck for write accesses.
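As an illustration (not taken from the text), the sketch below shows one simple way a RAID 4-style array can map a logical block onto a data disk and stripe, with the last drive dedicated to parity: independent reads spread across the data disks, while every write must also touch the single parity drive.

```python
# Sketch: logical-block to (data disk, parity disk, stripe) mapping for a
# RAID 4-style layout with one dedicated parity drive. Illustrative only.

def raid4_map(logical_block: int, n_disks: int) -> tuple[int, int, int]:
    """Map a logical block onto an array of n_disks drives.

    Data is striped block-by-block across the first n_disks - 1 drives;
    the last drive always holds the parity block for each stripe.
    """
    data_disks = n_disks - 1
    stripe = logical_block // data_disks
    data_disk = logical_block % data_disks
    parity_disk = n_disks - 1  # fixed parity drive -> write bottleneck
    return data_disk, parity_disk, stripe


# Reads of consecutive blocks land on different data disks...
print([raid4_map(b, n_disks=5)[0] for b in range(8)])  # [0, 1, 2, 3, 0, 1, 2, 3]
# ...but every write must also update a block on the same parity disk (index 4).
```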

RAID 5 Block Striping with Distributed Parity

RAID 5 addresses the write bottleneck of RAID 4 by distributing the parity data across all member drives of the group in a round-robin fashion. RAID 5 still requires every write transaction to update both the parity block and the data block. To calculate the new checksum, RAID 5 applies the XOR operation to the old data block (first block read), the old checksum (second block read) and the new data block, producing the new checksum block. The new data block and the new checksum block are then written back (two block writes) to their respective drives, so each write transaction requires two block reads and two block writes. Since the parity blocks are evenly distributed across all the member drives, the probability of an access conflict on the parity drive from simultaneous write transactions is much reduced. As a result, RAID 5 keeps the full benefit of RAID 4 for data redundancy and a high read transaction rate, while the write performance is also high. RAID 5 is commonly regarded as the best compromise among read performance, write performance, data availability and cost of capacity of all the RAID levels, and it is the most commonly used RAID level today.

Other RAID Architectures

Over the years, the storage industry has implemented various proprietary storage array architectures, often by combining multiple techniques outlined in the basic RAID levels. RAID 0+1 is a combination of level 0 (striping) and level 1 (mirroring); it benefits from the performance gain of striping and the data redundancy provided by mirroring, and because no parity needs to be calculated, write operations are very fast. In some proprietary implementations, extensions are made to RAID 5 and the result is called RAID 6. In addition to the parity, RAID 6 stores an additional checksum, generated over all the data and parity using Reed-Solomon coding, on another drive. The RAID 6 design can tolerate the failure of any two drives at the same time, dramatically increasing the survival capability beyond what RAID 5 can offer. RAID 6 requires additional storage space for the extra checksum, and its write performance is slower because two checksums must be generated and written, but it is well suited to protecting mission-critical data that requires very high fault tolerance.
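Returning to the RAID 5 write path described above, the sketch below (hypothetical helper names) shows the read-modify-write parity update and one simple round-robin choice of the parity drive per stripe.

```python
# Sketch: RAID 5 small-write (read-modify-write) parity update and a simple
# round-robin parity placement. Helper names and layout are illustrative.

def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def raid5_parity_disk(stripe: int, n_disks: int) -> int:
    """Round-robin placement: the parity block rotates across the drives."""
    return stripe % n_disks


def raid5_new_parity(old_data: bytes, old_parity: bytes,
                     new_data: bytes) -> bytes:
    """New parity for a single-block update.

    As described in the text, each write costs two block reads (old data,
    old parity) and two block writes (new data, new parity):
        new_parity = old_data XOR old_parity XOR new_data
    """
    return xor(xor(old_data, old_parity), new_data)
```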

RAID Reliability Calculation

To illustrate the reliability enhancement provided by RAID techniques, it is necessary to revisit the reliability calculation for a RAID system. First, the following terms must be defined:

n = total number of data disks
g = number of data disks in a group
c = number of checksum disks in a group
m = n/g = number of groups
MTTR_Disk = mean time to repair a failed disk
MTTF_Disk = mean time to failure of a single disk

Next, assuming that disk failures are independent and occur at a uniform rate, the mean time to failure of a single group and of the whole array are given by:

MTTF_Group = (MTTF_Disk / (g + c)) × (MTTF_Disk / ((g + c - 1) × MTTR_Disk)) = MTTF_Disk^2 / ((g + c) × (g + c - 1) × MTTR_Disk)

MTTF_RAID = MTTF_Group / m = (MTTF_Disk^2 × g) / (n × (g + c) × (g + c - 1) × MTTR_Disk)
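A minimal Python sketch of these equations follows; the function names are illustrative, and the example run uses the values of the worked example below.

```python
# Sketch: MTTF of a RAID group and of the whole array, following the
# equations above. Assumes independent disk failures at a uniform rate.

def mttf_group(mttf_disk: float, mttr_disk: float, g: int, c: int) -> float:
    """Mean time to failure of one group of g data + c checksum disks."""
    return mttf_disk ** 2 / ((g + c) * (g + c - 1) * mttr_disk)


def mttf_raid(mttf_disk: float, mttr_disk: float,
              n: int, g: int, c: int) -> float:
    """Mean time to failure of the whole array of m = n/g groups."""
    m = n / g
    return mttf_group(mttf_disk, mttr_disk, g, c) / m


if __name__ == "__main__":
    # Values from the worked example below: n = 100, g = 10, c = 1,
    # MTTR_Disk = 10 hours, MTTF_Disk = 1.5M hours.
    hours = mttf_raid(1.5e6, 10, n=100, g=10, c=1)
    print(f"{hours:.3e} hours (~{hours / 8760:.0f} years)")
    # -> about 2.045e+08 hours, roughly 23,000 years
```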

Now, calculate the reliability improvement using the example of a RAID 5 system with 100 data disks divided into groups of 10 disks. Ten additional disks are required for the checksums (total number of disks = 110; n = 100; g = 10; c = 1). Assume, as before, a normal disk mean time to failure of 1.5M hours (MTTF_Disk = 1.5M hr), and assume it takes 10 hours to repair a failed disk, i.e., to replace the faulty disk and repopulate its data from the checksums (MTTR_Disk = 10 hr). Plugging these numbers into the equations yields:

MTTF_RAID = 2.045 × 10^8 hours = 23,349.9 years

Contrasting this result with the MTTF of a 100-disk JBOD system (only about 1.7 years), it is clear that RAID 5 technology dramatically increases the reliability of the storage array, to over 23 thousand years. Because of this, RAID technology has established itself as the cornerstone of highly reliable storage array systems.
