VERITAS Volume Manager 3.1 Reference Guide: for HP-UX 11i and HP-UX 11i Version 1.5 > Chapter 3 Disk Arrays

Disk Array Overview

This section provides an overview of traditional disk arrays.

Performing I/O to disks is a slow process because disks are physical devices that require time to move the heads to the correct position on the disk before reading or writing. If all read or write operations are done to individual disks, one at a time, the read-write time can become unmanageable. Spreading these operations across multiple disks can help reduce this problem.

A disk array is a collection of disks that appears to the system as one or more virtual disks (also referred to as volumes). The virtual disks created by the software controlling the disk array look and act (to the system) like physical disks. Applications that interact with physical disks should work exactly the same with the virtual disks created by the array.

Data is spread across several disks within an array, which allows the disks to share I/O operations. Using multiple disks for I/O improves performance by increasing the data transfer speed and the overall throughput of the array. Figure 3-1 “Standard Disk Array” shows a standard disk array.

A Redundant Array of Independent Disks (RAID) is a disk array set up so that part of the combined storage capacity is used for storing duplicate information about the data stored in the array. The duplicate information allows you to regenerate the data in case of a disk failure. Several levels of RAID exist. These are introduced in the following sections.
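
As a rough illustration of how data spread across several disks can still look like one virtual disk, the following Python sketch maps a virtual-disk block number to a physical disk and a block on that disk. The round-robin layout, the function name `locate`, and the values of `STRIPE_UNIT_BLOCKS` and `NUM_DISKS` are assumptions made for this example; they are not taken from the guide or from any particular array.

```python
# Hypothetical round-robin layout. The guide does not say how an array
# places data, so the mapping rule and both constants are assumptions.
STRIPE_UNIT_BLOCKS = 4   # blocks per stripe unit (assumed)
NUM_DISKS = 3            # physical disks in the array (assumed)


def locate(virtual_block, num_disks=NUM_DISKS, unit=STRIPE_UNIT_BLOCKS):
    """Map a virtual-disk block number to (disk index, block on that disk)."""
    # Which stripe unit the block falls in, and its offset inside that unit.
    unit_index, offset = divmod(virtual_block, unit)
    # Stripe units are dealt out to the disks in turn, one stripe at a time.
    stripe, disk = divmod(unit_index, num_disks)
    return disk, stripe * unit + offset


if __name__ == "__main__":
    for vb in (0, 3, 4, 11, 12):
        print(vb, locate(vb))
```

With these assumed values, consecutive stripe units land on different disks, so a large sequential transfer is shared among all of them while the application still addresses a single block range.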
Volume Manager's implementations of RAID are described in the "Introduction to Volume Manager" chapter of the VERITAS Volume Manager Administrator's Guide.

Although it does not provide redundancy, striping is often referred to as a form of RAID, known as RAID-0. The Volume Manager's implementation of striping is described in the "Introduction to Volume Manager" chapter of the VERITAS Volume Manager Administrator's Guide. RAID-0 offers a high data transfer rate and high I/O throughput, but suffers lower reliability and availability than a single disk.

Mirroring is a form of RAID known as RAID-1. The Volume Manager's implementation of mirroring is described in the "Introduction to Volume Manager" chapter of the VERITAS Volume Manager Administrator's Guide. Mirroring uses equal amounts of disk capacity to store the original plex and its mirror. Everything written to the original plex is also written to any mirrors. RAID-1 provides redundancy of data and offers protection against data loss in the event of physical disk failure.

RAID-2 uses bitwise striping across disks and uses additional disks to hold Hamming code check bits. RAID-2 is described in a University of California at Berkeley research paper entitled A Case for Redundant Arrays of Inexpensive Disks (RAID), by David A. Patterson, Garth Gibson, and Randy H. Katz (1987). The Hamming code check bits allow the array to detect errors and to correct single-bit errors. RAID-2 requires large system block sizes, which limits its use.

RAID-3 uses a parity disk to provide redundancy. RAID-3 distributes the data in stripes across all but one of the disks in the array. It then writes the parity in the corresponding stripe on the remaining disk, which is the parity disk. Figure 3-2 “RAID-3 Disk Array” shows a RAID-3 disk array. The user data is striped across the data disks. Each stripe on the parity disk contains the result of an exclusive OR (XOR) procedure done on the data in the data disks.
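
The XOR procedure can be sketched in a few lines of Python. This is a minimal illustration, not the Volume Manager's code: the helper name `xor_units` and the two-byte sample units are invented for the example. It computes the parity unit for one stripe, and it also shows the property that makes parity useful for redundancy: XORing the surviving data units with the parity unit yields the missing unit.

```python
from functools import reduce


def xor_units(units):
    """Return the byte-wise XOR of equal-length stripe units."""
    # zip(*units) walks the units column by column; each column is a tuple
    # of integer byte values, which reduce() folds together with XOR.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*units))


# One stripe unit per data disk (made-up sample bytes).
data_units = [b"\x0f\xa0", b"\x33\x0c", b"\x55\xff"]

# The parity unit that would be written to the parity disk.
parity = xor_units(data_units)

# Pretend data disk 1 is lost: XOR the surviving units with the parity.
survivors = [data_units[0], data_units[2], parity]
rebuilt = xor_units(survivors)

if __name__ == "__main__":
    print(parity.hex())
    print(rebuilt == data_units[1])
```

Because XOR is its own inverse, the same function serves both for generating parity and for regenerating a lost unit.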
If the data on one of the disks is inaccessible due to hardware or software failure, the data can be restored by XORing the contents of the remaining data disks with the parity disk. The data on the failed disk is rebuilt from the output of the XOR process.

RAID-3 typically uses a very small stripe unit size (also historically known as a stripe width), sometimes as small as one byte per disk (which requires special hardware) or one sector (block) per disk. Figure 3-3 “Data Writes to RAID-3” shows a data write to a RAID-3 array.

The parity disk model uses less disk space than mirroring, which uses equal amounts of storage capacity for the original data and the copy.

The RAID-3 model is often used with synchronized spindles in the disk devices. This synchronizes the disk rotation, providing constant rotational delay, which is useful in large parallel writes. RAID-3-style performance can be emulated by configuring RAID-5 (described later) with very small stripe units.

RAID-4 introduces the use of independent-access arrays (also used by RAID-5). With this model, the system does not typically access all disks in the array when executing a single I/O operation. This is achieved by making the stripe unit size large enough that the majority of I/Os to the array affect only a single disk (for reads).

An array attempts to provide the highest rate of data transfer by spreading the I/O load as evenly as possible across all the disks in the array. In RAID-3, the I/O load is spread across the data disks, as shown in Figure 3-3 “Data Writes to RAID-3”, and each write is executed on all the disks in the array. The data on the data disks is XORed and the parity is written to the parity disk.

RAID-4 maps data and uses parity in the same manner as RAID-3, by striping the data across all the data disks and XORing the data to produce the information on the parity disk.
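
The effect of stripe unit size on independent access can be sketched as follows. The function `data_disks_touched`, the round-robin placement of stripe units, and the request and unit sizes are all assumptions chosen for illustration; parity disks are ignored. The sketch reports which data disks a single read touches.

```python
def data_disks_touched(start_byte, length, unit_bytes, num_data_disks):
    """Return the set of data-disk indexes that one I/O request touches.

    Assumes stripe units are placed on the data disks in round-robin order.
    """
    first_unit = start_byte // unit_bytes
    last_unit = (start_byte + length - 1) // unit_bytes
    return {u % num_data_disks for u in range(first_unit, last_unit + 1)}


# A 4 KB read at a 16 KB offset on an array with four data disks.
small_units = data_disks_touched(16384, 4096, unit_bytes=512, num_data_disks=4)
large_units = data_disks_touched(16384, 4096, unit_bytes=65536, num_data_disks=4)

if __name__ == "__main__":
    print(sorted(small_units))   # sector-sized units: every data disk is busy
    print(sorted(large_units))   # large units: one disk serves the request
```

With sector-sized stripe units the request occupies every data disk, as in the RAID-3 model. With a stripe unit much larger than the request, one disk serves it and the others remain free for other requests.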
The difference between RAID-3 and RAID-4 is that RAID-3 accesses all the disks at one time and RAID-4 accesses each disk independently. This allows the RAID-4 array to execute multiple I/O requests simultaneously (provided they map to different member disks), while RAID-3 can only execute one I/O request at a time.

RAID-4 read performance is much higher than its write performance. It performs well with applications requiring high read I/O rates, but not as well in small, write-intensive applications.

The parity disk can cause a bottleneck in the performance of RAID-4. This is because each of the writes taking place simultaneously on the data disks must wait its turn to write to the parity disk. The transfer rate of the entire RAID-4 array in a write-intensive application is limited to the transfer rate of the parity disk. Since RAID-4 is limited to parity on one disk only, it is less useful than RAID-5.

RAID-5 is similar to RAID-4, using striping to spread the data over all the disks in the array and using independent access. However, RAID-5 differs from RAID-4 in that the parity is striped across all the disks in the array, rather than being concentrated on a single parity disk. This breaks the write bottleneck caused by the single parity disk in the RAID-4 model. Figure 3-4 “Parity Locations in a RAID-5 Model” shows parity locations in a RAID-5 array configuration.

Every stripe has a column containing a parity stripe unit and columns containing data. The parity is spread over all of the disks in the array, reducing the write time for large independent writes because the writes do not have to wait until a single parity disk can accept the data.

RAID-5 and how it is implemented by the Volume Manager are described in the "Introduction to Volume Manager" chapter of the VERITAS Volume Manager Administrator's Guide.
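
The difference in parity placement between RAID-4 and RAID-5 can be sketched as below. The rotation rule used for RAID-5 is only one plausible layout, assumed for this example. It is not taken from Figure 3-4 or from the Volume Manager, and the names `parity_disk` and `parity_writes` are hypothetical. The sketch counts how many parity updates each disk receives when every stripe is written once.

```python
from collections import Counter


def parity_disk(stripe, num_disks, level):
    """Index of the disk holding parity for a stripe (assumed layouts)."""
    if level == 4:
        # RAID-4: one dedicated parity disk, here the last disk.
        return num_disks - 1
    if level == 5:
        # RAID-5: parity moves to a different disk on each stripe
        # (assumed rotation order; real layouts may differ).
        return (num_disks - 1 - stripe) % num_disks
    raise ValueError("only RAID-4 and RAID-5 are modeled")


def parity_writes(num_stripes, num_disks, level):
    """Count parity updates per disk when each stripe is written once."""
    return Counter(parity_disk(s, num_disks, level) for s in range(num_stripes))


if __name__ == "__main__":
    print(dict(parity_writes(8, 4, level=4)))
    print(dict(parity_writes(8, 4, level=5)))
```

In the RAID-4 case all eight parity updates queue on one disk, which is the bottleneck described above. In the RAID-5 case they are shared evenly among the four disks.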