HP A5856A RAID 4Si PCI 4-Channel Ultra2 SCSI Controller: Installation and Administration Guide, Chapter 1: Overview
HP RAID 4Si Concepts
|
An understanding of SCSI basics is the foundation on which the principles of RAID technology are built. RAID begins as a modification to the SCSI host controller which, in turn, paves the way for hard disk manipulation and the creation of the logical drive.
In 1987, David Patterson, Garth Gibson, and Randy Katz at the University of California, Berkeley, published a paper entitled "A Case for Redundant Arrays of Inexpensive Disks (RAID)." This paper described various types of disk arrays, referred to by the acronym RAID. The basic idea of RAID was to combine multiple small, inexpensive disk drives into an array that would outperform a Single Large Expensive Drive (SLED). Additionally, the array of small drives would appear to the computer as a single logical storage unit or drive.
The small disk drives used with personal computers and microcomputers are lower in performance and capacity than the large disk drives used on mainframes and supercomputers. Small drives have lower storage density than large drives, but they are equal to or better than large drives in four areas:
Placing these small inexpensive disk drives into an array provides for the following enhancements:
The paper pointed out that as the number of drives in an array (also referred to as a stripe set) increases, the overall mean time between failures (MTBF) of the array decreases. This was a significant concern, because until then, if a hard drive crashed, you depended on some form of backup to restore data, usually from a tape drive, and the system stayed down while repairs were made. The RAID paper proposed a conceptual method for handling problems associated with MTBF and data availability through five possible RAID configurations, which were defined as RAID levels 1 through 5. Of the original five levels, RAID 1, 3, and 5 have become the most widely used in the computing industry.
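The effect of array size on MTBF can be made concrete with a short calculation. The sketch below assumes that drives fail independently at the same constant rate, in which case the MTBF of an array without redundancy is the single-drive MTBF divided by the number of drives. The function name `array_mtbf_hours` and the 200,000-hour figure are illustrative assumptions, not values from this guide.

```python
def array_mtbf_hours(disk_mtbf_hours: float, num_disks: int) -> float:
    """Estimate the MTBF of a non-redundant array.

    Assumes independent drives with identical, constant failure rates,
    so the array fails as soon as any one drive fails.
    """
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    return disk_mtbf_hours / num_disks


if __name__ == "__main__":
    # Hypothetical single-drive MTBF, chosen only for illustration.
    single = 200_000.0
    for n in (1, 2, 5, 10):
        print(f"{n:2d} drive(s): {array_mtbf_hours(single, n):,.0f} hours")
```

Under these assumptions, a five-drive array built from such drives would be expected to fail five times as often as a single drive, which is why the redundancy described by the RAID levels matters.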
Each RAID level offers a different form of data redundancy, and each higher level can seem to be an improvement over the last. However, the right choice depends on the particular I/O requirements of the application and the cost of implementation, so no single level is best in every case. Overall, a RAID has three main attributes that all RAID levels explore in some way. They are:
In Figure 1-1, the system interprets the three-disk array of small drives and presents it to the user as one large drive. Disk array technology is based on improving I/O performance and capability through the use of multiple disks. In the past, large computers used upscale controllers and multiple disks, but each disk was independent of the others. The theory was that if files could be "properly placed" on the disks, the overall system would respond with consistent I/O. In such configurations, however, the load on the storage devices is never exactly balanced. Some storage areas become hot spots, where I/O requests back up and form queues, slowing data transfer.
RAID introduced the idea of disk striping, a technique that is fundamental to RAID. Striping is a method of chaining multiple drives into one logical storage unit, which is also referred to as a stripe set. Striping involves partitioning each drive's storage space into stripes, or data chunks, which can be as small as one sector (512 bytes) or as large as several megabytes. These stripes are then interleaved in a "round robin" fashion, so that the combined space is composed alternately of stripes from each drive. In effect, the storage space of the drives can be thought of as a shuffled deck of cards. The result is that the load that would otherwise form hot spots is distributed evenly across the set of drives, which uses their full I/O capability and improves overall disk performance. The type of operating environment determines whether large or small stripes are used.
Another feature of the RAID architecture is disk mirroring. This is a method in which each write command to a disk is also written (mirrored) on another disk, thereby creating two disks with the same data. If a disk drive fails, its mirror drive continues the operation. Mirroring requires additional system software and additional processing power.
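The round-robin interleaving of stripes described above can be sketched as a mapping from a logical block number to a drive and a position on that drive. This is a minimal illustration that assumes one block per stripe and zero-based numbering; the name `locate_block` is hypothetical and does not describe the controller's actual firmware.

```python
def locate_block(logical_block: int, num_drives: int) -> tuple[int, int]:
    """Map a logical block to (drive index, stripe index on that drive).

    Stripes are dealt to the drives in round-robin order, one block per
    stripe, with all indices starting at zero.
    """
    if num_drives < 1:
        raise ValueError("num_drives must be at least 1")
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    drive = logical_block % num_drives
    stripe = logical_block // num_drives
    return drive, stripe


if __name__ == "__main__":
    # Three drives, matching the three-disk array mentioned for Figure 1-1.
    for block in range(7):
        drive, stripe = locate_block(block, 3)
        print(f"logical block {block} -> drive {drive}, stripe {stripe}")
```

Because consecutive logical blocks land on different drives, a long sequential transfer keeps every drive in the set busy instead of queuing on one of them.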
Mirroring also increases data availability but doubles media requirements and costs.
Hardware fails. This is an unfortunate fact of life, but RAID can lessen the impact of a failure. A failure is repaired by replacing a physical component. If a disk subsystem is considered fault tolerant, any one component in the subsystem can fail and the subsystem will remain operational. In a much broader sense, this also applies to other components of a system, such as power supplies, adapters, controllers, and cabling. RAID levels 3 through 5 provide for a failed disk by using the redundant information on an extra disk (a redundant or parity disk) to reconstruct the data that the failed disk contained. If the failed disk is the redundant disk, the data from all the good disks is used to reconstruct the data on the redundant disk while the subsystem continues normal operations.
Redundant arrays of disks were originally specified at five levels (plus 0) in the Berkeley paper. Today, the most commonly used levels are the following:
In RAID 0, data is divided into blocks and distributed sequentially among the disks. This level is also referred to as pure striping. At least one disk is required to create a RAID 0. This type of array is used when high data transfer rates are required, but fault tolerance is not needed. For example, consider five physical drives configured as one RAID 0 logical drive. The data blocks are then written as shown in Table 1-1 “RAID 0 Striping” below.
Table 1-1 RAID 0 Striping
RAID 0 allows data to be accessed on multiple disks simultaneously. Read/write performance on a multi-disk RAID 0 system is significantly higher than on a single-drive system. The advantages of using RAID level 0 are as follows:
The disadvantages of using RAID level 0 are as follows:
This section describes RAID levels 1, 3, and 5, which use non-spanned arrays.
RAID 1 is the first level that provides data redundancy. Data written to one disk is simultaneously written to another disk. If one disk fails, the other disk can be used to run the system and reconstruct the failed disk. Because the disks are mirrored, it does not matter which one fails: both disks contain the same data at all times, and either disk can act as the operational disk. This level provides 100% redundancy but is expensive because each drive in the system is duplicated. This type of array is used for read-intensive, highly fault-tolerant configurations. Two disk drives are required. For example, consider two physical disks configured as one logical drive. The data blocks are then written as shown in Table 1-2 “RAID 1 Striping” below.
Table 1-2 RAID 1 Striping
With this setup, if either disk fails, all of the data is available from the other disk. The advantages of using RAID level 1 are as follows:
The disadvantages of using RAID level 1 are as follows:
RAID 3 uses parity to generate redundancy data from two or more parent data sets. A set of disks is used to stripe data (as in RAID 0), and another disk is used to collect parity information from the striped set of disks. This disk is referred to as a dedicated parity disk. Parity data does not fully duplicate the parent disk sets, but if a single disk in the set fails, its contents can be rebuilt from the parity and the corresponding data on the remaining disks. RAID 3 configurations are usually reserved for non-interactive applications that process large files sequentially and require fault tolerance. At least three drives are required: two striped disks and one parity disk. For example, consider a five-disk array, with four data disks and one parity disk, as shown in Table 1-3 “RAID 3 Striping—Five-Disk Array” below.
Table 1-3 RAID 3 Striping—Five-Disk Array
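The parity mechanism can be illustrated with bytewise XOR, the operation commonly used for RAID parity. This guide does not name the operation, so treat XOR here as an assumption. The sketch computes a parity block from four data blocks, as in the five-disk example, and then rebuilds one lost block from the survivors. The function names `xor_blocks` and `rebuild_block` are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Return the bytewise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def rebuild_block(surviving: list[bytes], parity: bytes) -> bytes:
    """Recover the single missing data block from the survivors and parity."""
    return xor_blocks(surviving + [parity])


if __name__ == "__main__":
    data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]  # four data disks
    parity = xor_blocks(data)                     # dedicated parity disk
    lost = 2                                      # pretend the third disk failed
    survivors = data[:lost] + data[lost + 1:]
    print("rebuilt block:", rebuild_block(survivors, parity))
```

The same routine serves both directions: XOR of the data blocks gives the parity, and XOR of the parity with the surviving data blocks gives back the missing block, which is why a single failed disk can be tolerated but two cannot.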
In RAID 3, data reads are faster than writes because parity must be calculated for each write. RAID 3 therefore performs better for long writes than for short ones, and it works well for long data transfers, such as CAD or graphic files and data logging. The advantages of using RAID level 3 are as follows:
The disadvantage of using RAID level 3 is that performance is slower than RAID 0 or RAID 1.
RAID 5 is similar to RAID 3 in that it also uses striped data and parity to generate redundancy. However, instead of dedicating a disk entirely to parity storage, RAID 5 rotates, or distributes, the parity among the stripes of the disk array. This arrangement suits applications that require high read-request rates with low write-request rates, such as transaction processing, office automation, and online customer service, because parity generation can slow write operations considerably. At least three disks are required to configure this RAID level. For example, consider a five-disk array, as shown in Table 1-4 “RAID 5 Striping—Five-Disk Array” below.
Table 1-4 RAID 5 Striping—Five-Disk Array
RAID 5 outperforms RAID 1 for read operations. The write performance, however, might be slower than RAID 1, especially if most writes are small and random. For example, to change Block 1 in the table above, the controller must first read Blocks 2, 3, and 4 before it can calculate Parity Block 1-4. Once it has calculated the new Parity Block 1-4, it must write Block 1 and then Parity Block 1-4. The advantages of using RAID level 5 are as follows:
The disadvantage of using RAID level 5 is that write performance is slower than RAID 0 or 1.
With the HP RAID 4Si controller, array spanning allows combining two, three, or four arrays into a single storage space. A spanned array must have the same number of disk drives in each array: each array can have two disks, three disks, four disks, and so on.
A RAID 1+0 (formerly called RAID 10) configuration uses two, three, or four pairs of mirrored disks, spanning two, three, or four arrays, respectively. (RAID 1+0 is a RAID 1 configuration with array spanning.) If your RAID 1+0 logical drive spans two arrays with two physical drives each, the data blocks are written as shown in Table 1-5 “RAID 1+0—Four-Disk Array” below.
Table 1-5 RAID 1+0—Four-Disk Array
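One way to picture a spanned mirror is to stripe blocks across the arrays and mirror each block within its array. The sketch below does this for two arrays of two drives each. The placement rule (consecutive blocks alternate between arrays) and the name `raid10_layout` are assumptions made for illustration, and the controller's actual placement may differ.

```python
def raid10_layout(num_blocks: int, num_arrays: int,
                  drives_per_array: int = 2) -> list[list[int]]:
    """Return, for each drive, the list of block numbers it stores.

    Blocks are numbered from 1. Block b goes to array (b - 1) % num_arrays
    and is written to every drive in that array (mirroring).
    """
    if num_arrays < 1 or drives_per_array < 1:
        raise ValueError("array and drive counts must be at least 1")
    drives: list[list[int]] = [[] for _ in range(num_arrays * drives_per_array)]
    for block in range(1, num_blocks + 1):
        array = (block - 1) % num_arrays
        first = array * drives_per_array
        for drive in range(first, first + drives_per_array):
            drives[drive].append(block)
    return drives


if __name__ == "__main__":
    # Two arrays of two drives each, as in the four-disk example.
    for index, blocks in enumerate(raid10_layout(6, num_arrays=2), start=1):
        print(f"Disk {index}: {blocks}")
```

In this layout every block is stored on two drives, so half of the raw capacity is spent on redundancy, while reads and writes for neighboring blocks are spread across both arrays.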
The advantages of using RAID level 1+0 are as follows:
The disadvantage of using RAID level 1+0 is that costs are high, because 50% of all disk space is allocated for redundancy.
In a RAID 3+0 (formerly called RAID 30) configuration, parity blocks provide redundancy to a logical drive that spans two, three, four, or five arrays. (RAID 3+0 is a RAID 3 configuration with array spanning.) If your RAID 3+0 logical drive has two arrays with four physical drives each, the data blocks are written as shown in Table 1-6 “RAID 3+0 with Two Four-Disk Arrays” below.
Table 1-6 RAID 3+0 with Two Four-Disk Arrays
The advantages of using RAID level 3+0 are as follows:
The disadvantages of using RAID level 3+0 are as follows:
In a RAID 5+0 (formerly called RAID 50) configuration, parity blocks are distributed throughout the logical drive, spanning two, three, four, or five arrays. (RAID 5+0 is a RAID 5 configuration with array spanning.) If your RAID 5+0 logical drive has two arrays with four physical drives each, the data blocks are written as shown in Table 1-7 “RAID 5+0 with Two Four-Disk Arrays” below.
Table 1-7 RAID 5+0 with Two Four-Disk Arrays
The advantages of using RAID level 5+0 are as follows:
The disadvantages of using RAID level 5+0 are as follows: