When storing data redundantly, using mirrored or RAID-5 volumes,
the Volume Manager ensures that all copies of the data match exactly. However,
under certain conditions (usually due to complete system failures),
some redundant data on a volume can become inconsistent or unsynchronized; that is,
the mirrored data is no longer exactly the same as the original
data. Except for normal configuration changes (such as detaching
and reattaching a plex), this can only occur when a system crashes
while data is being written to a volume.
Data is written to the mirrors of a volume in parallel, as
are the data and parity in a RAID-5 volume. If a system crash occurs
before all the individual writes complete, it is possible for some
writes to complete while others do not. This can result in the data
becoming unsynchronized. For mirrored volumes, it can cause two
reads from the same region of the volume to return different results,
if different mirrors are used to satisfy the read requests. In the
case of RAID-5 volumes, it can lead to parity corruption and incorrect
data reconstruction.
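The mirrored case can be illustrated with a small model. This is only a
sketch under stated assumptions, not Volume Manager code: each mirror is
a Python list, the writes are issued one after another rather than truly
in parallel, and the names write_mirrored, read_from, and crash_after
are hypothetical.

```python
# Hypothetical model: each mirror (plex) is a list of blocks.
def write_mirrored(mirrors, block, value, crash_after=None):
    """Write value to the same block on every mirror.

    crash_after simulates a system crash: only the first `crash_after`
    mirror writes complete; the remaining writes are lost.
    """
    for index, mirror in enumerate(mirrors):
        if crash_after is not None and index >= crash_after:
            return False  # crash: remaining mirrors keep the old data
        mirror[block] = value
    return True


def read_from(mirrors, block, mirror_index):
    """Satisfy a read from one specific mirror."""
    return mirrors[mirror_index][block]


mirrors = [["old"] * 4, ["old"] * 4]

# The crash happens after the first mirror has been updated.
completed = write_mirrored(mirrors, block=2, value="new", crash_after=1)

first = read_from(mirrors, 2, 0)
second = read_from(mirrors, 2, 1)
print(completed, first, second)  # False new old
```

Two reads of the same block now disagree depending on which mirror
services them, which is the condition resynchronization must repair.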
The Volume Manager needs to ensure that all mirrors contain
exactly the same data and that the data and parity in RAID-5 volumes
agree. This process is called volume resynchronization.
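For RAID-5, the meaning of data and parity that "agree" can be shown
with a short sketch. It assumes the usual RAID-5 scheme in which the
parity block is the byte-wise XOR of the data blocks in a stripe; the
helper names xor_blocks and reconstruct and the sample values are
hypothetical and chosen only for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(stripe_data, parity, missing):
    """Rebuild the data block at index `missing` from the survivors and parity."""
    survivors = [blk for i, blk in enumerate(stripe_data) if i != missing]
    return xor_blocks(survivors + [parity])


data = [b"\x01\x02", b"\x10\x20", b"\x0f\x0f"]
parity = xor_blocks(data)

# Simulated crash: the data write completed but the parity write did not.
data[0] = b"\xaa\xbb"
stale_parity = parity

# Reconstructing block 1 with stale parity yields incorrect data.
wrong = reconstruct(data, stale_parity, missing=1)

# Regenerating parity from the current data restores agreement.
fresh_parity = xor_blocks(data)
right = reconstruct(data, fresh_parity, missing=1)
```

With stale parity the reconstructed block differs from the block that
was actually stored; once parity is regenerated, reconstruction returns
the stored block again.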
For volumes that are part of disk groups that are automatically
imported at boot time (such as rootdg), the resynchronization process takes place when
the system reboots.
Not all volumes require resynchronization after a system failure. Volumes
that were never written or that were quiescent (that is, had no active
I/O) when the system failure occurred could not have had outstanding
writes and do not require resynchronization.
The Volume Manager records when a volume is first written
to and marks it as dirty. When a volume is
closed by all processes or stopped cleanly by the administrator,
all writes have been completed and the Volume Manager removes the
dirty flag for the volume. Only volumes that are marked dirty when
the system reboots require resynchronization.
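The dirty-flag bookkeeping described above can be sketched as follows.
This is an illustrative model only; the Volume class, its open_count
field, the needs_resync helper, and the volume names are assumptions
made for the example and do not reflect how the Volume Manager stores
this state.

```python
class Volume:
    """Hypothetical in-memory model of a volume's dirty flag."""

    def __init__(self, name):
        self.name = name
        self.dirty = False
        self.open_count = 0

    def open(self):
        self.open_count += 1

    def write(self):
        # The first write marks the volume dirty; later writes leave it dirty.
        self.dirty = True

    def close(self):
        self.open_count -= 1
        if self.open_count == 0:
            # Closed by all processes: all writes are complete.
            self.dirty = False


def needs_resync(volumes):
    """Return the names of volumes still marked dirty at reboot."""
    return [v.name for v in volumes if v.dirty]


never_written = Volume("vol01")

written_and_closed = Volume("vol02")
written_and_closed.open()
written_and_closed.write()
written_and_closed.close()

open_at_crash = Volume("vol03")
open_at_crash.open()
open_at_crash.write()  # the system fails before this volume is closed

to_resync = needs_resync([never_written, written_and_closed, open_at_crash])
print(to_resync)  # ['vol03']
```

Only the volume that was still open with writes in flight is selected
for resynchronization.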
The process of resynchronization depends on the type of volume.
RAID-5 volumes that contain RAID-5 logs can "replay" those
logs. If no logs are available, the volume is placed in reconstruct-recovery
mode and all parity is regenerated. For mirrored volumes, resynchronization
is done by placing the volume in recovery mode (also called read-writeback recovery
mode). Resynchronization of data in the volume is done
in the background. This allows the volume to be available for use
while recovery is taking place.
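The mirrored-volume recovery above can be pictured with a minimal
sketch. It assumes that read-writeback means a read is satisfied from
one mirror and the result is written back to the other mirrors, and that
the background pass simply reads every block once. The function names
and the list-based mirrors are hypothetical.

```python
def recovery_read(mirrors, block, source=0):
    """Read a block from one mirror and write it back to all the others."""
    value = mirrors[source][block]
    for index, mirror in enumerate(mirrors):
        if index != source:
            mirror[block] = value
    return value


def background_resync(mirrors):
    """Walk the whole volume so that every block is made consistent."""
    for block in range(len(mirrors[0])):
        recovery_read(mirrors, block)


# Two mirrors that diverged at block 2 after a crash.
mirrors = [["a", "b", "new", "d"], ["a", "b", "old", "d"]]

# An application read during recovery already returns a stable answer.
value = recovery_read(mirrors, 2)

background_resync(mirrors)
```

In this model an application read and the background pass use the same
path, so any block becomes consistent the first time it is read.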
The process of resynchronization can be expensive and can
impact system performance. The recovery process reduces some of
this impact by spreading the recoveries to avoid stressing a specific
disk or controller.
For large volumes or for a large number of volumes, the resynchronization
process can take a long time. The time and performance impact can be reduced by using Dirty
Region Logging for mirrored volumes, or by ensuring that RAID-5
volumes have valid RAID-5 logs. For volumes used by database applications,
the VxSmartSync™ Recovery Accelerator can be used (see “VxSmartSync Recovery Accelerator”).