VERITAS Volume Manager 3.1 Administrator's Guide: for HP-UX 11i and HP-UX 11i Version 1.5 > Chapter 8 Recovery

Miscellaneous RAID-5 Operations

NOTE: You may need an additional license to use this feature.

Many operations exist for manipulating RAID-5 volumes and associated objects. These operations are usually performed by other commands, such as the vxassist command and the vxrecover command, as part of larger operations, such as evacuating disks. These command line operations are not necessary for light usage of the Volume Manager.

Manipulating RAID-5 Logs

RAID-5 logs are represented as plexes of RAID-5 volumes and are manipulated using the vxplex command. To add a RAID-5 log, use the following command:

# vxplex att r5vol r5log

The attach (att) operation can only proceed if the size of the new log is large enough to hold all of the data on the stripe. If the RAID-5 volume already contains logs, the new log length is the minimum of each individual log length. This is because the new log is a mirror of the old logs.

If the RAID-5 volume is not enabled, the new log is marked as BADLOG and is enabled when the volume is started. However, the contents of the log are ignored.

If the RAID-5 volume is enabled and has other enabled RAID-5 logs, the new log has its contents synchronized with the other logs through ATOMIC_COPY ioctls.

If the RAID-5 volume currently has no enabled logs, the new log is zeroed before it is enabled.
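
The three attach cases above can be summarized as a small decision function. This is only an illustrative Python sketch of the behavior described in this section, not Volume Manager code; the names LogAttachAction and log_attach_action are hypothetical.

```python
from enum import Enum


class LogAttachAction(Enum):
    """Hypothetical labels for the three outcomes described in the text."""
    MARK_BADLOG = "mark as BADLOG; enable at volume start, contents ignored"
    SYNC_FROM_OTHER_LOGS = "synchronize contents with the other enabled logs"
    ZERO_THEN_ENABLE = "zero the new log, then enable it"


def log_attach_action(volume_enabled: bool, enabled_log_count: int) -> LogAttachAction:
    """Pick what happens to a newly attached RAID-5 log."""
    if not volume_enabled:
        # Volume not enabled: the log waits for the next volume start.
        return LogAttachAction.MARK_BADLOG
    if enabled_log_count > 0:
        # Other enabled logs exist: the new log becomes a mirror of them.
        return LogAttachAction.SYNC_FROM_OTHER_LOGS
    # Enabled volume with no enabled logs: start from a zeroed log.
    return LogAttachAction.ZERO_THEN_ENABLE
```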

Log plexes can be removed from a volume using the following command:

# vxplex dis r5log3

If removing the log leaves the volume with fewer than two valid logs, a warning is printed and the operation is not allowed to continue. In that case, the operation must be forced by using the -o force option.
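
The removal rule can be sketched as a simple check. The Python below is a hypothetical model, not part of vxplex, and it assumes that the log being removed is itself currently counted as valid.

```python
def log_removal_allowed(valid_logs_before: int, force: bool = False) -> bool:
    """Return True if dissociating one valid log plex may proceed.

    Assumption: the plex being removed is one of the valid logs.
    """
    remaining = valid_logs_before - 1
    # Fewer than two valid logs afterwards requires the force option.
    return remaining >= 2 or force
```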

Manipulating RAID-5 Subdisks

As with other subdisks, subdisks of the RAID-5 plex of a RAID-5 volume are manipulated using the vxsd command. Association is done by using the assoc keyword in the same manner as for striped plexes. For example, to add subdisks at the end of each column in the vxprint output for a RAID-5 volume shown in “Disk Failures”, use the following command:

# vxsd assoc r5vol-01 disk10-01:0 disk11-01:1 disk12-01:2

If a subdisk is filling a "hole" in the plex (that is, some portion of the volume logical address space is mapped by the subdisk), the subdisk is considered stale. If the RAID-5 volume is enabled, the association operation regenerates the data that belongs on the subdisk by using VOL_R5_RECOVER ioctls. Otherwise, it is marked as stale and is recovered when the volume is started.

To remove subdisks from the RAID-5 plex, use the following command:

# vxsd dis disk10-01

CAUTION: If the subdisk maps a portion of the RAID-5 volume address space, this places the volume in DEGRADED mode. In this case, the dis operation prints a warning and must be forced using the -o force option. Also, if removing the subdisk makes the RAID-5 volume unusable, because another subdisk in the same stripe is unusable or missing and the volume is not DISABLED and empty, this operation is not allowed.

Subdisks can be moved to change the disks that a RAID-5 volume occupies by using the vxsd mv utility. For example, if disk03 is to be evacuated and disk22 has enough room in two portions of its space, use the following command:

# vxsd mv disk03-01 disk22-01 disk22-02

While this command is similar to that for striped plexes, the actual mechanics of the operation are not similar.

RAID-5 Subdisk Moves

To do RAID-5 subdisk moves, the current subdisk is removed from the RAID-5 plex and replaced by the new subdisks. The new subdisks are marked as STALE and then recovered using VOL_R5_RECOVER operations. Recovery is done either by the vxsd utility or (if the volume is not active) when the volume is started. This means that the RAID-5 volume is degraded for the duration of the operation.

Another failure in the stripes involved in the move makes the volume unusable. The RAID-5 volume can also become invalid if the parity of the volume becomes stale.

To avoid these situations, the vxsd utility does not allow a subdisk move if:

  • a stale subdisk occupies any of the same stripes as the subdisk being moved

  • the RAID-5 volume is stopped but was not shut down cleanly (parity is considered stale)

  • the RAID-5 volume is active and has no valid log areas

Only the third case can be overridden by using the -o force option.
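
The three refusal conditions, and the fact that only the third can be forced, can be sketched as follows. This Python is illustrative only; the function name and parameters are hypothetical and do not correspond to any vxsd interface.

```python
from typing import Optional


def subdisk_move_refusal(stale_subdisk_in_same_stripes: bool,
                         volume_active: bool,
                         shut_down_cleanly: bool,
                         valid_log_count: int,
                         force: bool = False) -> Optional[str]:
    """Return the reason a RAID-5 subdisk move is refused, or None if allowed."""
    if stale_subdisk_in_same_stripes:
        # Cannot be overridden.
        return "a stale subdisk occupies the same stripes"
    if not volume_active and not shut_down_cleanly:
        # Parity is considered stale; cannot be overridden.
        return "volume is stopped but was not shut down cleanly"
    if volume_active and valid_log_count == 0 and not force:
        # The only case that the force option overrides.
        return "volume is active and has no valid log areas"
    return None
```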

Subdisks of RAID-5 volumes can also be split and joined by using the vxsd split command and the vxsd join command. These operations work the same way as those for mirrored volumes.

NOTE: RAID-5 subdisk moves are performed the same as other subdisk moves without the penalty of degraded redundancy.

Starting RAID-5 Volumes

When a RAID-5 volume is started, it can be in one of many states. After a normal system shutdown, the volume should be clean and require no recovery. However, if the volume was not closed, or was not unmounted before a crash, it can require recovery when it is started, before it can be made available. This section describes actions that can be taken under certain conditions.

Under normal conditions, volumes are started automatically after a reboot and any recovery takes place automatically or is done through the vxrecover command.

Unstartable RAID-5 Volumes

A RAID-5 volume is unusable if some part of the RAID-5 plex does not map the volume length. This happens in either of the following cases:

  • the RAID-5 plex is sparse in relation to the RAID-5 volume length

  • the RAID-5 plex does not map a region because two subdisks have failed within a stripe, either because they are stale or because they are built on a failed disk

When this occurs, the vxvol start command returns the following error message:

	vxvm:vxvol: ERROR: Volume r5vol is not startable; RAID-5 plex does
not map entire volume length.

At this point, the contents of the RAID-5 volume are unusable.

Another possible way that a RAID-5 volume can become unstartable is if the parity is stale and a subdisk becomes detached or stale. This occurs because within the stripes that contain the failed subdisk, the parity stripe unit is invalid (because the parity is stale) and the stripe unit on the bad subdisk is also invalid. This situation is shown in Figure 8-1 “Invalid RAID-5 Volume” which shows a RAID-5 volume that has become invalid due to stale parity and a failed subdisk.

Figure 8-1 Invalid RAID-5 Volume


This example shows four stripes in the RAID-5 array. All parity is stale and subdisk disk05-00 has failed. This makes stripes X and Y unusable because two failures have occurred within those stripes.

This qualifies as two failures within a stripe and prevents the use of the volume. In this case, the output display from the vxvol start command is as follows:

	vxvm:vxvol: ERROR: Volume r5vol is not startable; some subdisks are
unusable and the parity is stale.

This situation can be avoided by always using two or more RAID-5 log plexes in RAID-5 volumes. RAID-5 log plexes prevent the parity within the volume from becoming stale, which prevents this situation (see “System Failures” for details).
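
The "two failures within a stripe" rule can be modeled with a short Python sketch. It is a simplification made for illustration: stale parity is counted as one failure in every stripe, and the sketch does not model which column holds the parity unit in each stripe. The layout passed in is a hypothetical mapping from stripe name to the subdisks that map it.

```python
from typing import Dict, List, Set


def unusable_stripes(stripe_subdisks: Dict[str, Set[str]],
                     failed_subdisks: Set[str],
                     parity_stale: bool) -> List[str]:
    """Return the stripes that have two or more failures."""
    bad = []
    for stripe, subdisks in stripe_subdisks.items():
        # One failure per failed subdisk in the stripe, plus one if parity is stale.
        failures = len(subdisks & failed_subdisks) + (1 if parity_stale else 0)
        if failures >= 2:
            bad.append(stripe)
    return bad
```

With a layout in which disk05-00 maps only stripes X and Y, stale parity plus the failure of disk05-00 leaves exactly those two stripes unusable, while the same subdisk failure with valid parity leaves every stripe usable.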

Forcibly Starting RAID-5 Volumes

You can start a volume even if subdisks are marked as stale. For example, this can be necessary if a stopped volume has stale parity and no RAID-5 logs, and a disk becomes detached and then reattached.

The subdisk is considered stale even though the data is not out of date (because the volume was not in use when the subdisk was unavailable) and the RAID-5 volume is considered invalid. To prevent this case, always have multiple valid RAID-5 logs associated with the array. However, this is not always possible.

To start a RAID-5 volume with stale subdisks, you can use the -f option with the vxvol start command. This causes all stale subdisks to be marked as nonstale. Marking takes place before the start operation evaluates the validity of the RAID-5 volume and what is needed to start it. Also, you can mark individual subdisks as nonstale by using the command vxmend fix unstale subdisk.

Recovery When Starting RAID-5 Volumes

Several operations can be necessary to fully restore the contents of a RAID-5 volume and make it usable. Whenever a volume is started, any RAID-5 log plexes are zeroed before the volume is started. This is done to prevent random data from being interpreted as a log entry and corrupting the volume contents. Also, some subdisks may need to be recovered, or the parity may need to be resynchronized (if RAID-5 logs have failed).

The following steps are taken when a RAID-5 volume is started:

  1. If the RAID-5 volume was not cleanly shut down, it is checked for valid RAID-5 log plexes.

    • If valid log plexes exist, they are replayed. This is done by placing the volume in the DETACHED kernel state and setting the volume state to REPLAY, and enabling the RAID-5 log plexes. If the logs can be successfully read and the replay is successful, move on to Step 2.

    • If no valid logs exist, the parity must be resynchronized. Resynchronization is done by placing the volume in the DETACHED kernel state and setting the volume state to SYNC. Any log plexes are left DISABLED.

      The volume is not made available while the parity is resynchronized because any subdisk failure during this period makes the volume unusable. This can be overridden by using the -o unsafe start option with the vxvol command. If any stale subdisks exist, the RAID-5 volume is unusable.

      CAUTION: The -o unsafe start option is considered dangerous, as it can make the contents of the volume unusable. It is therefore not recommended.

  2. Any existing logging plexes are zeroed and enabled. If all logs fail during this process, the start process is aborted.

  3. If no stale subdisks exist or those that exist are recoverable, the volume is put in the ENABLED kernel state and the volume state is set to ACTIVE. The volume is now started.

  4. If some subdisks are stale and need recovery, and if valid logs exist, the volume is enabled by placing it in the ENABLED kernel state and the volume is available for use during the subdisk recovery. Otherwise, the volume kernel state is set to DETACHED and it is not available during subdisk recovery.

    This is done because if the system were to crash or the volume was ungracefully stopped while it was active, the parity becomes stale, making the volume unusable. If this is undesirable, the volume can be started with the -o unsafe start option.

    CAUTION: The -o unsafe start option is considered dangerous, as it can make the contents of the volume unusable. It is therefore not recommended.

  5. The volume state is set to RECOVER and stale subdisks are restored. As the data on each subdisk becomes valid, the subdisk is marked as no longer stale.

    If any subdisk recovery fails and there are no valid logs, the volume start is aborted because the subdisk remains stale and a system crash makes the RAID-5 volume unusable. This can also be overridden by using the -o unsafe start option.

    CAUTION: The -o unsafe start option is considered dangerous, as it can make the contents of the volume unusable. It is therefore not recommended.

    If the volume has valid logs, subdisk recovery failures are noted but do not stop the start procedure.

  6. When all subdisks have been recovered, the volume is placed in the ENABLED kernel state and marked as ACTIVE. It is now started.
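
The state transitions in the steps above can be traced with a small Python sketch. It is a simplified model under stated assumptions: every replay, resynchronization, and subdisk recovery succeeds, log zeroing (Step 2) is not shown, and the -o unsafe start option is not modeled. The function name start_states is hypothetical.

```python
from typing import List, Tuple


def start_states(clean_shutdown: bool,
                 valid_logs: int,
                 stale_subdisks: int) -> List[Tuple[str, str]]:
    """Return the (kernel state, volume state) sequence for a volume start."""
    states = []
    if not clean_shutdown:
        if valid_logs > 0:
            # Step 1, first case: replay the logs.
            states.append(("DETACHED", "REPLAY"))
        else:
            # Step 1, second case: parity must be resynchronized.
            if stale_subdisks > 0:
                raise RuntimeError("stale parity and stale subdisks: volume is unusable")
            states.append(("DETACHED", "SYNC"))
    if stale_subdisks > 0:
        # Steps 4 and 5: available during recovery only if valid logs exist.
        kernel_state = "ENABLED" if valid_logs > 0 else "DETACHED"
        states.append((kernel_state, "RECOVER"))
    # Step 3 or Step 6: the volume is started.
    states.append(("ENABLED", "ACTIVE"))
    return states
```

For example, a volume that was not shut down cleanly but has a valid log and one stale subdisk passes through REPLAY and RECOVER before it becomes ACTIVE, and it stays in the ENABLED kernel state during the subdisk recovery.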

Changing RAID-5 Volume Attributes

You can change several attributes of RAID-5 volumes. For RAID-5 volumes, the volume length and RAID-5 log length can be changed by using the vxvol set command. To change the length of a RAID-5 volume, use the following command:

# vxvol set len=10240 r5vol

The length of a volume cannot exceed the mapped region (called the contiguous length, or contiglen) of the RAID-5 plex. The length cannot be extended so as to make the volume unusable. If the RAID-5 volume is active and the length is being shortened, the operation must be forced by using the -o force usage type option. This is done to prevent removal of space from applications using the volume.
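
The two length rules stated above can be expressed as a simple check. This Python sketch is illustrative only; the function and parameter names are hypothetical, all lengths are assumed to be in the same units, and the rule about extensions that would make the volume unusable is not modeled.

```python
def set_len_allowed(new_len: int,
                    current_len: int,
                    contiglen: int,
                    volume_active: bool,
                    force: bool = False) -> bool:
    """Return True if changing the volume length to new_len may proceed."""
    if new_len > contiglen:
        # The volume cannot exceed the mapped region of the RAID-5 plex.
        return False
    if volume_active and new_len < current_len and not force:
        # Shrinking an active volume must be forced.
        return False
    return True
```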

The length of the RAID-5 logs can also be changed using the following command:

# vxvol set loglen=2M r5vol

Remember that RAID-5 log plexes are only valid if they map the entire RAID-5 volume log length. If increasing the log length makes any of the RAID-5 logs invalid, the operation is not allowed. Also, if the volume is not active and is dirty (not shut down cleanly), the log length cannot be changed. This avoids the loss of any of the log contents (if the log length is decreased) or the introduction of random data into the logs (if the log length is being increased).
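
The log length rules can be sketched the same way. The Python below is a hypothetical model rather than vxvol behavior verified against the product; it assumes that each log plex is described only by the length it maps, in the same units as the requested log length.

```python
from typing import Sequence


def set_loglen_allowed(new_loglen: int,
                       log_plex_lengths: Sequence[int],
                       volume_active: bool,
                       shut_down_cleanly: bool) -> bool:
    """Return True if changing the RAID-5 log length may proceed."""
    if not volume_active and not shut_down_cleanly:
        # A stopped, dirty volume still needs its current log contents.
        return False
    # Every log plex must still map the entire new log length.
    return all(length >= new_loglen for length in log_plex_lengths)
```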

Writing to RAID-5 Arrays

This section describes the write process for RAID-5 arrays.

Read-Modify-Write

When you write to a RAID-5 array, the following procedure is used for each stripe involved in the I/O:

  1. The data stripe units to be updated with new write data are accessed and read into internal buffers. The parity stripe unit is read into internal buffers.

  2. Parity is updated to reflect the contents of the new data region. First, the contents of the old data undergo an exclusive OR (XOR) with the parity (logically removing the old data). The new data is then XORed into the parity (logically adding the new data). The new data and new parity are written to a log.

  3. The new parity is written to the parity stripe unit. The new data is written to the data stripe units. All stripe units are written in a single write.

This process is known as a read-modify-write cycle, which is the default type of write for RAID-5. If a disk fails, both data and parity stripe units on that disk become unavailable. The disk array is then said to be operating in a degraded mode. See Figure 8-2 “Read-Modify-Write”.

Figure 8-2 Read-Modify-Write

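
The parity update in Step 2 of the read-modify-write procedure can be illustrated with byte strings standing in for stripe units. This Python sketch shows only the XOR arithmetic; logging and the disk I/O are omitted, and the function names are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length stripe units byte by byte."""
    if len(a) != len(b):
        raise ValueError("stripe units must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def read_modify_write_parity(old_data: bytes,
                             old_parity: bytes,
                             new_data: bytes) -> bytes:
    """Compute the new parity for one updated data stripe unit."""
    # XOR the old data with the parity (logically removing the old data).
    parity_without_old = xor_bytes(old_parity, old_data)
    # XOR the new data into the parity (logically adding the new data).
    return xor_bytes(parity_without_old, new_data)
```

The result equals the XOR of all data stripe units after the update, even though the unchanged data stripe units were never read.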

Full-Stripe Writes

When large writes (writes that cover an entire data stripe) are issued, the read-modify-write procedure can be bypassed in favor of a full-stripe write. A full-stripe write is faster than a read-modify-write because it does not require the read process to take place. Eliminating the read cycle reduces the I/O time necessary to write to the disk.

A full-stripe write procedure consists of the following steps:

  1. All the new data stripe units are XORed together, generating a new parity value. The new data and new parity are written to a log.

  2. The new parity is written to the parity stripe unit. The new data is written to the data stripe units. The entire stripe is written in a single write. See Figure 8-3 “Full-Stripe Write”.

    Figure 8-3 Full-Stripe Write

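
Step 1 of the full-stripe write can be sketched in Python as a single XOR reduction over the new data stripe units. As before, this shows only the parity arithmetic under the assumption that stripe units are equal-length byte strings; the function names are hypothetical.

```python
from functools import reduce
from typing import Sequence


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length stripe units byte by byte."""
    if len(a) != len(b):
        raise ValueError("stripe units must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def full_stripe_parity(new_data_units: Sequence[bytes]) -> bytes:
    """XOR all new data stripe units together to produce the new parity."""
    # No old data or old parity is read: every input is new write data.
    return reduce(xor_bytes, new_data_units)
```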

Reconstruct-Writes

When 50 percent or more of the data disks are undergoing writes in a single I/O, a reconstruct-write can be used. A reconstruct-write saves I/O time by computing the new parity directly from the new data and the unaffected data. It does not require a read of the parity region and only requires a read of the unaffected data. Unaffected data amounts to less than 50 percent of the stripe units in the stripe.

A reconstruct-write procedure consists of the following steps:

  1. Unaffected data is read from the unchanged data stripe unit(s).

  2. The new data is XORed with the old, unaffected data to generate a new parity stripe unit. The new data and resulting parity are logged.

  3. The new parity is written to the parity stripe unit. The new data is written to the data stripe units. All stripe units are written in a single write.

    See Figure 8-4 “Reconstruct-Write”. A reconstruct-write is preferable to a read-modify-write in this case because it reads only the necessary data disks, rather than reading the disks being written and the parity disk.

    Figure 8-4 Reconstruct-Write

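
The reconstruct-write parity calculation, together with the choice among the three write types described in this section, can be sketched as follows. This Python is an illustration under simplifying assumptions: writes cover whole stripe units, stripe units are equal-length byte strings, and the thresholds are taken directly from the text. The function names are hypothetical.

```python
from functools import reduce
from typing import Sequence


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length stripe units byte by byte."""
    if len(a) != len(b):
        raise ValueError("stripe units must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def reconstruct_write_parity(new_units: Sequence[bytes],
                             unaffected_units: Sequence[bytes]) -> bytes:
    """XOR the new data with the unaffected data; old parity is not read."""
    return reduce(xor_bytes, list(new_units) + list(unaffected_units))


def choose_write_type(units_written: int, data_columns: int) -> str:
    """Pick a write type for one stripe from the number of data units written."""
    if not 1 <= units_written <= data_columns:
        raise ValueError("units_written must be between 1 and data_columns")
    if units_written == data_columns:
        return "full-stripe write"
    if 2 * units_written >= data_columns:
        # 50 percent or more of the data disks are being written.
        return "reconstruct-write"
    return "read-modify-write"
```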
© 1983-2000 Hewlett-Packard Development Company, L.P.