| United States-English |
|
|
|
![]() |
VERITAS Volume Manager 3.2 Administrator's Guide: for HP-UX 11i and HP-UX 11i Version 1.5 > Chapter 12 Performance Monitoring and TuningTuning VxVM |
|
This section describes how to adjust the tunable parameters that control the system resources used by VxVM. Depending on the system resources that are available, adjustments may be required to the values of some tunable parameters to optimize performance. VxVM is optimally tuned for most configurations ranging from small systems to larger servers. In cases where tuning can be used to increase performance on larger systems at the expense of a valuable resource (such as memory), VxVM is generally tuned to run on the smallest supported configuration. Any tuning changes must be performed with care, as they may adversely affect overall system performance or may even leave VxVM unusable. Various mechanisms exist for tuning VxVM. On some systems, many parameters can be tuned using the global tunable file /stand/system. Other values can only be tuned using the command line interface to VxVM. On smaller systems (with less than a hundred disk drives), tuning is unnecessary and VxVM is capable of adopting reasonable defaults for all configuration parameters. On larger systems, configurations can require additional control over the tuning of these parameters, both for capacity and performance reasons. Generally, only a few significant decisions must be made when setting up VxVM on a large system. One is to decide on the size of the disk groups and the number of configuration copies to maintain for each disk group. Another is to choose the size of the private region for all the disks in a disk group. Larger disk groups have the advantage of providing a larger free-space pool for the vxassist(1M) command to select from, and also allow for the creation of larger arrays. Smaller disk groups do not require as large a configuration database and so can exist with smaller private regions. Very large disk groups can eventually exhaust the private region size in the disk group with the result that no more configuration objects can be added to that disk group. At that point, the configuration either has to be split into multiple disk groups, or the private regions have to be enlarged. This involves re-initializing each disk in the disk group (and can involve reconfiguring everything and restoring from backup). A general recommendation for users of disk array subsystems is to create a single disk group for each array so the disk group can be physically moved as a unit between systems. Selection of the number of configuration copies for a disk group is based on a trade-off between redundancy and performance. As a general rule, reducing the number configuration copies in a disk group speeds up initial access of the disk group, initial startup of the vxconfigd daemon, and transactions performed within the disk group. However, reducing the number of configuration copies also increases the risk of complete loss of the configuration database, which results in the loss of all objects in the database and of all data in the disk group. The default policy for configuration copies in the disk group is to allocate a configuration copy for each controller identified in the disk group, or for each target that contains multiple addressable disks. This provides a sufficient degree of redundancy, but can lead to a large number of configuration copies under some circumstances. If this is the case, we recommended that you limit the number of configuration copies to a minimum of 4. Distribute the copies across separate controllers or targets to enhance the effectiveness of this redundancy. To set the number of configuration copies for a new disk group, use the nconfig operand with the vxdg init command (see the vxdg(1M) manual page for details). You can also change the number of copies for an existing group by using the vxedit set command (see the vxedit(1M) manual page). For example, to configure five configuration copies for the disk group, bigdg, use the following command:
Tunables are modified by adding a line to /stand/system file. Changed tunables take effect only after relinking the kernel and booting the system from the new kernel. Modify tunables in /stand/system using the following format:
The following sections describe specific tunable parameters. The interval at which utilities performing recoveries or resynchronization operations load the current offset into the kernel as a checkpoint. A system failure during such operations does not require a full recovery, but can continue from the last reached checkpoint. The default value of the checkpoint is 10240 sectors (10MB). Increasing this size reduces the overhead of checkpointing on recovery operations at the expense of additional recovery following a system failure during a recovery. The count in clock ticks for which utilities pause if they have been directed to reduce the frequency of issuing I/O requests, but have not been given a specific delay time. This tunable is used by utilities performing operations such as resynchronizing mirrors or rebuilding RAID-5 columns. The default for this tunable is 50 ticks. Increasing this value results in slower recovery operations and consequently lower system impact while recoveries are being performed. The maximum size in kilobytes of the bitmap that Non-Persistent FastResync uses to track changed blocks in a volume. The number of blocks in a volume that are mapped to each bit in the bitmap depends on the size of the volume, and this value changes if the size of the volume is changed. For example, if the volume size is 1 gigabyte and the system block size is 1024 bytes, a vol_fmr_logsz value of 4 yields a map contains 32,768 bits, each bit representing one region of 32 blocks. The larger is the bitmap size, the fewer the number of blocks that are mapped to each bit. This can reduce the amount of reading and writing required on resynchronization, at the expense of requiring more non-pagable kernel memory for the bitmap. Additionally, on clustered systems, a larger bitmap size increases the latency in I/O performance, and it also increases the load on the private network between the cluster members. This is because every other member of the cluster must be informed each time a bit in the map is marked. Since the region size must be the same on all nodes in a cluster for a shared volume, the value of the vol_fmr_logsz tunable on the master node overrides the tunable values on the slave nodes, if these values are different. Because the value of a shared volume can change, the value of vol_fmr_logsz is retained for the life of the volume or until FastResync is turned on for the volume. In configurations which have thousands of mirrors with attached snapshot plexes, the total memory overhead can represent a significantly higher overhead in memory consumption than is usual for VxVM. The default value of this tunable is 4 KB. The maximum and minimum permitted values are 1 and 8 KB.
The maximum number of volumes that can be created on the system. This value can be set to between 1 and the maximum number of minor numbers representable in the system. The default value for this tunable is 16777215. The maximum size of logical I/O operations that can be performed without breaking up the request. I/O requests to VxVM that are larger than this value are broken up and performed synchronously. Physical I/O requests are broken up based on the capabilities of the disk device and are unaffected by changes to this maximum logical request limit. The default value for this tunable is 256 sectors (256KB).
The maximum size of data that can be passed into VxVM via an ioctl call. Increasing this limit allows larger operations to be performed. Decreasing the limit is not generally recommended, because some utilities depend upon performing operations of a certain size and can fail unexpectedly if they issue oversized ioctl requests. The default value for this tunable is 32768 bytes (32KB). The maximum number of I/O operations that can be performed by VxVM in parallel. Additional I/O requests that attempt to use a volume device are queued until the current activity count drops below this value. The default value for this tunable is 2048. Because most process threads can only issue a single I/O request at a time, reaching the limit of active I/O requests in the kernel requires 2048 I/O operations to be performed in parallel. Raising this limit is unlikely to provide much benefit except on the largest of systems. The number of I/O operations that the vxconfigd(1M) daemon is permitted to request from the kernel in a single VOL_VOLDIO_READ per VOL_VOLDIO_WRITE ioctl call. The default value for this tunable is 256. It is not desirable to change this value. The maximum size of an I/O request that can be issued by an ioctl call. Although the ioctl request itself can be small, it can request a large I/O request be performed. This tunable limits the size of these I/O requests. If necessary, a request that exceeds this value can be failed, or the request can be broken up and performed synchronously. The default value for this tunable is 256 sectors (256KB). Raising this limit can cause difficulties if the size of an I/O request causes the process to take more memory or kernel virtual mapping space than exists and thus deadlock. The maximum limit for vol_maxio is 20% of the smaller of physical memory or kernel virtual memory. It is inadvisable to go over this limit, because deadlock is likely to occur. If stripes are larger than vol_maxio, full stripe I/O requests are broken up, which prevents full-stripe read/writes. This throttles the volume I/O throughput for sequential I/O or larger I/O requests. This tunable limits the size of an I/O request at a higher level in VxVM than the level of an individual disk. For example, for an 8 by 64KB stripe, a value of 256KB only allows I/O requests that use half the disks in the stripe; thus, it cuts potential throughput in half. If you have more columns or you have used a larger interleave factor, then your relative performance is worse. This tunable must be set, as a minimum, to the size of your largest stripe (RAID-0 or RAID-5). The granularity of the round-robin policy for reading from mirrors. A read is serviced by the same mirror as the last read if its offset is within the number of sectors described by this tunable of the last read. The default for this value is 256 sectors (256KB). Increasing this value causes fewer switches to alternate mirrors for reading. This is desirable if the I/O being performed is largely sequential with a few small seeks between I/O operations. Large numbers of randomly distributed volume reads are generally best served by reading from alternate mirrors. The maximum number of subdisks that can be attached to a single plex. There is no theoretical limit to this number, but it has been limited to a default value of 4096. This default can be changed, if required. If set to 0, volcvm_smartsync disables SmartSync on shared disk groups. This is the default behavior when VxVM 3.2 is installed. If set to 1, this parameter enables the use of SmartSync with shared disk groups. See “SmartSync Recovery Accelerator” for more information. The maximum number of dirty regions that can exist for non-sequential DRL on a volume. A larger value may result in improved system performance at the expense of recovery time. This tunable can be used to regulate the worse-case recovery time for the system following a failure. The default value for this tunable is 2048. The maximum number of dirty regions allowed for sequential DRL. This is useful for volumes that are usually written to sequentially, such as database logs. Limiting the number of dirty regions allows for faster recovery if a crash occurs. The default value for this tunable is 3. The minimum number of sectors for a dirty region logging (DRL) volume region. With DRL, VxVM logically divides a volume into a set of consecutive regions. Larger region sizes tend to cause the cache hit-ratio for regions to improve. This improves the write performance, but it also prolongs the recovery time. The VxVM kernel currently sets the default value for this tunable to 512 sectors. The granularity of memory chunks used by VxVM when allocating or releasing system memory. A larger granularity reduces CPU overhead due to memory allocation by allowing VxVM to retain hold of a larger amount of memory. The default size for this tunable is 64KB. The maximum memory requested from the system by VxVM for internal purposes. This tunable has a direct impact on the performance of VxVM as it prevents one I/O operation from using all the memory in the system. VxVM allocates two pools of voliomem_maxpool_sz, one for RAID-5 and one for mirrored volumes. A write request to a RAID-5 volume that is greater than volio_maxpool_sz/10 is broken up and performed in chunks of size volio_maxpool_sz/10. A write request to a mirrored volume that is greater than volio_maxpool_sz/2 is broken up and performed in chunks of size volio_maxpool_sz/2. The default value for this tunable is 4M.
The default size of the buffer maintained for error tracing events. This buffer is allocated at driver load time and is not adjustable for size while VxVM is running. The default size for this buffer is 16384 bytes (16KB). Increasing this buffer can provide storage for more error events at the expense of system memory. Decreasing the size of the buffer can result in an error not being detected via the tracing device. Applications that depend on error tracing to perform some responsive action are dependent on this buffer. The default size for the creation of a tracing buffer in the absence of any other specification of desired kernel buffer size as part of the trace ioctl. The default size of this tunable is 8192 bytes (8KB). If trace data is often being lost due to this buffer size being too small, then this value can be tuned to a more generous amount. The upper limit to the size of memory that can be used for storing tracing buffers in the kernel. Tracing buffers are used by the VxVM kernel to store the tracing event records. As trace buffers are requested to be stored in the kernel, the memory for them is drawn from this pool. Increasing this size can allow additional tracing to be performed at the expense of system memory usage. Setting this value to a size greater than can readily be accommodated on the system is inadvisable. The default value for this tunable is 131072 bytes (128KB). The maximum buffer size that can be used for a single trace buffer. Requests of a buffer larger than this size are silently truncated to this size. A request for a maximal buffer size from the tracing interface results (subject to limits of usage) in a buffer of this size. The default size for this buffer is 65536 bytes (64KB). Increasing this buffer can provide for larger traces to be taken without loss for very heavily used volumes. Care should be taken not to increase this value above the value for the voliot_iobuf_limit tunable value. The maximum number of tracing channels that can be open simultaneously. Tracing channels are clone entry points into the tracing device driver. Each vxtrace process running on a system consumes a single trace channel. The default number of channels is 32. The allocation of each channel takes up approximately 20 bytes even when not in use. The maximum number of transient reconstruct operations that can be performed in parallel for RAID-5. A transient reconstruct operation is one that occurs on a non-degraded RAID-5 volume that has not been predicted. Limiting the number of these operations that can occur simultaneously removes the possibility of flooding the system with many reconstruct operations, and so reduces the risk of causing memory starvation. The default number of transient reconstruct operations that can be performed in parallel is 1. Increasing this size improves the initial performance on the system when a failure first occurs and before a detach of a failing object is performed, but can lead to memory starvation. |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|||||||||||||||