HA Disk Monitor Reference

The HA Disk Monitor reports information about the physical and logical volumes configured by LVM (Logical Volume Manager). Anything not configured through LVM cannot be monitored from the HA Disk Monitor. Monitored disk resources are:

  • /vg/vgName/pv_summary, a summary status of all physical volumes in a volume group

  • /vg/vgName/pv_pvlink/status/deviceName, the status of a given physical volume or PV link in a volume group

  • /vg/vgName/lv_summary, a summary status of all logical volumes in a volume group

  • /vg/vgName/lv/status/lvName, the status of a given logical volume in a volume group

  • /vg/vgName/lv/copies/lvName, the number of copies of data available for a given logical volume in a volume group
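As an illustration of how these resource names are composed, the following Python sketch builds each name from its parts. The helper functions are hypothetical (they are not part of the HA Disk Monitor), and the values vg00, c0t6d0, and lvol1 are placeholders chosen for the example.

```python
# Hypothetical helpers: they only format the resource names listed above.

def pv_summary(vg: str) -> str:
    """Summary status of all physical volumes in a volume group."""
    return f"/vg/{vg}/pv_summary"


def pv_pvlink_status(vg: str, device: str) -> str:
    """Status of one physical volume or PV link."""
    return f"/vg/{vg}/pv_pvlink/status/{device}"


def lv_summary(vg: str) -> str:
    """Summary status of all logical volumes in a volume group."""
    return f"/vg/{vg}/lv_summary"


def lv_status(vg: str, lv: str) -> str:
    """Status of one logical volume."""
    return f"/vg/{vg}/lv/status/{lv}"


def lv_copies(vg: str, lv: str) -> str:
    """Number of copies of data available for one logical volume."""
    return f"/vg/{vg}/lv/copies/{lv}"


# Placeholder names, assumed for illustration only.
for name in (
    pv_summary("vg00"),
    pv_pvlink_status("vg00", "c0t6d0"),
    lv_summary("vg00"),
    lv_status("vg00", "lvol1"),
    lv_copies("vg00", "lvol1"),
):
    print(name)
```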

Monitoring both the physical and logical volumes allows you to detect failures in volume groups (both active and inactive) and in logical volumes. With these warnings, you can correct hardware problems that put node, application, or data availability at risk.

Figure 2-1 “Disk Monitor Resource Class Hierarchy” shows the class hierarchy for the HA Disk Monitor.

Figure 2-1 Disk Monitor Resource Class Hierarchy


Bold items are resource instances that can be monitored. Bold italic variables represent specific instances of volume groups, devices, and logical volumes on the system.

Physical Volume Summary

The pv_summary is the summary status of all physical volumes in a volume group. This status is based on the compiled results of SCSI inquiries to all physical volumes in a volume group.

If you have configured physical volumes as package dependencies in ServiceGuard, this resource is used to trigger package failover. Refer to the manual Using the Event Monitoring Service (HP Part Number B7612-90015) for information on configuring ServiceGuard package dependencies. If you are using the disk monitor with ServiceGuard, it is important that you configure physical volume groups (PVGs) to give you the most accurate pv_summary for ServiceGuard package failover.

Table 2-1 “Interpreting Physical Volume Summary” lists how conditions compare in logical operations. Specify the logical operation in the monitor request parameters portion of the monitor request. For example, to create a request that alerts you when the condition is SUSPECT or DOWN, specify greater than or equal to 3 (>=3).

Table 2-1 Interpreting Physical Volume Summary

Resource Name: /vg/vgName/pv_summary

| Condition | Value | Interpretation |
| --- | --- | --- |
| UP | 1 | All physical volumes containing data are accessible. |
| PVG_UP | 2 | At least 1 PV has failed. All data is accessible. If more than 1 PV is down and the failed PVs are from the same PVG, all data is still accessible. This condition can only occur in a mirrored set or with PV links in PVGs. |
| SUSPECT | 3 | Two or more physical volumes from different PVGs are not available. The disk monitor cannot conclude that all data is available. For example, on a 2-way mirrored system, if a physical volume fails on each side of the mirror, data may be available if the failed volumes are holding different data. But data may be unavailable if the failed volumes hold the same data. Because the disk monitor only knows that disks have failed, and not what data is on the disks, it marks the volume group SUSPECT. |
| DOWN | 4 | Some data is missing, or no data is accessible. |
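
The way a logical operation such as >=3 selects conditions can be pictured with a short Python sketch. The enumeration copies the values from Table 2-1; the matches helper and its operator table are hypothetical stand-ins for the monitor request parameters, not an actual monitor interface.

```python
import operator
from enum import IntEnum


class PvSummary(IntEnum):
    """pv_summary conditions and values, copied from Table 2-1."""
    UP = 1
    PVG_UP = 2
    SUSPECT = 3
    DOWN = 4


# Hypothetical operator table standing in for the request's logical operation.
OPERATORS = {
    ">=": operator.ge,
    ">": operator.gt,
    "==": operator.eq,
    "<": operator.lt,
    "<=": operator.le,
}


def matches(condition: PvSummary, op: str, threshold: int) -> bool:
    """Return True if the condition's value satisfies `op threshold`."""
    return OPERATORS[op](int(condition), threshold)


# Conditions that would satisfy a ">=3" request.
alerting = [c.name for c in PvSummary if matches(c, ">=", 3)]
print(alerting)  # ['SUSPECT', 'DOWN']
```

Because the conditions are ordered by severity, a single threshold selects a condition together with every condition worse than it.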

 

The pv_summary resource may not be available for a given volume group in the following cases:

  • Devices are on an unsupported bus (such as HP-IB or HP-FL) or an unrecognized bus (such as a new bus technology). The /var/adm/syslog/syslog.log entry would say:

    diskmond[5699]: pv_summary will be unavailable for
    /dev/vg00 because there are physical volumes in this volume
    group which are on an unrecognized bus. (DRM-502).
  • PVGs (physical volume groups) exist in a volume group, but not all physical volumes are assigned to a PVG. The /var/adm/syslog/syslog.log entry would say:

    diskmond[18323]: pv_summary will be unavailable for
    /dev/vgtest because the physical volume groups (PVGs) in
    this volume group do not have an equal number of PVs or there are PVs not in a PVG. (DRM-503)
  • Unequal numbers of physical volumes exist in each PVG in the volume group. The /var/adm/syslog/syslog.log entry would say:

    diskmond[18323]: pv_summary will be unavailable for
    /dev/vgtest because the physical volume groups (PVGs) in
    this volume group do not have an equal number of PVs or there are PVs not in a PVG. (DRM-503)

    Two cases where this would occur are:

    • There is both 2-way and 3-way mirroring in the same volume group.

    • The mirrored PVGs contain different numbers of physical disks that add up to the same amount of disk space. For example, one 4GB drive in one PVG mirrored with two 2GB drives in the redundant PVG.
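
The two PVG-related cases above can be checked ahead of time with a sketch like the following. It is a minimal illustration under stated assumptions, not the disk monitor's own logic: the function name, the input data structures, and the device names are all hypothetical, and the bus check from the first case is not modeled.

```python
from typing import Dict, Iterable, Optional, Set


def pvg_layout_problem(
    pvs: Iterable[str], pvgs: Dict[str, Set[str]]
) -> Optional[str]:
    """Return a description of the first PVG layout problem, or None.

    pvs  -- device names of all physical volumes in the volume group
    pvgs -- PVG name mapped to the device names assigned to that PVG
    """
    if not pvgs:
        return None  # No PVGs defined, so neither PVG rule applies.
    assigned = set().union(*pvgs.values())
    if set(pvs) - assigned:
        return "there are PVs not in a PVG"
    if len({len(members) for members in pvgs.values()}) > 1:
        return "the PVGs do not have an equal number of PVs"
    return None


# One 4GB drive mirrored with two 2GB drives (placeholder device names).
print(pvg_layout_problem(
    ["c0t1d0", "c1t1d0", "c1t2d0"],
    {"PVG0": {"c0t1d0"}, "PVG1": {"c1t1d0", "c1t2d0"}},
))
```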

Checks for the validity of pv_summary are logged with the name of the local node and the identifier diskmond to both /var/adm/syslog/syslog.log and /etc/opt/resmon/log/api.log.
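
To find these entries in a log, a small Python sketch such as the one below could be used. It assumes that each diskmond entry occupies a single line in the log file (the messages above are wrapped for display) and that any timestamp or node name precedes the diskmond identifier; the function name is hypothetical.

```python
import re
from typing import Iterable, List, Tuple

# Matches the diskmond messages quoted above, e.g. "... (DRM-502)".
UNAVAILABLE = re.compile(
    r"diskmond\[\d+\]: pv_summary will be unavailable for\s+"
    r"(?P<vg>\S+).*?\((?P<code>DRM-\d+)\)"
)


def unavailable_pv_summaries(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (volume group, message code) pairs found in the log lines."""
    hits = []
    for line in lines:
        match = UNAVAILABLE.search(line)
        if match:
            hits.append((match.group("vg"), match.group("code")))
    return hits


sample = [
    "diskmond[5699]: pv_summary will be unavailable for /dev/vg00 because "
    "there are physical volumes in this volume group which are on an "
    "unrecognized bus. (DRM-502).",
]
print(unavailable_pv_summaries(sample))  # [('/dev/vg00', 'DRM-502')]
```

An open file object for /var/adm/syslog/syslog.log can be passed in place of the sample list, since iterating over a file yields its lines.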

Physical Volume and Physical Volume Link Status

Requests to monitor physical volumes and physical volume links give you status of the individual physical volumes and PV links in a volume group. In the case of most RAID arrays, this means the HA Disk Monitor can talk to a physical link LUN (logical unit number) in the array. In the case of stand-alone disks, it means the HA Disk Monitor can talk to the disk itself.

The pv_pvlink status is used to calculate pv_summary. Although it is somewhat redundant to use both, you might want to have more specific status sent by pv_summary, and only have status sent on pv_pvlinks if a device is DOWN.

pv_pvlinks and pv_summary supplement lv_summary by giving status on the accessibility of volume groups (both active and inactive) and logical volumes.

To pinpoint a failure of a particular disk, bus, or I/O card, you need to use the HA Disk Monitor alerts in conjunction with standard troubleshooting methods: reading log files and inspecting the actual devices. The HA Disk Monitor uses the data in /etc/lvmtab to see what is available for monitoring, and /etc/lvmtab does not distinguish between physical volumes and physical volume links, so you need to investigate to detect whether a disk, bus, or I/O card has failed.

Table 2-2 “Interpreting Physical Volume and Physical Volume Link Status” lists how conditions compare in logical operations. You specify the logical operation in the monitor request parameters portion of the monitor request. For example, to create a request that alerts you when the condition is BUSY or DOWN, you would specify greater than or equal to 2 (>=2).

Table 2-2 Interpreting Physical Volume and Physical Volume Link Status

Resource Name: /vg/vgName/pv_pvlink/status/deviceName

| Condition | Value | Interpretation |
| --- | --- | --- |
| UP | 1 | SCSI inquiry was successful. |
| BUSY | 2 | SCSI inquiry returned with DEVICE BUSY. The HA Disk Monitor will try 3 times to see if it gets either an UP or DOWN result before marking a device BUSY. |
| DOWN | 3 | SCSI inquiry failed. The bus and/or the disk is not accessible. |
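
The retry behavior described for BUSY can be sketched as follows. This is an illustration only: the inquire callable is a hypothetical stand-in for the SCSI inquiry, and the sketch assumes that "3 times" means three attempts in total.

```python
from typing import Callable

# Values from Table 2-2.
UP, BUSY, DOWN = 1, 2, 3


def pv_status(inquire: Callable[[], int], attempts: int = 3) -> int:
    """Return the first UP or DOWN result, or BUSY if every attempt is BUSY."""
    for _ in range(attempts):
        result = inquire()
        if result != BUSY:
            return result
    return BUSY


# Canned responses: the device is busy twice, then answers.
responses = iter([BUSY, BUSY, UP])
print(pv_status(lambda: next(responses)))  # 1
```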

 

When configuring requests from the SAM interface, use a wildcard (*) in place of deviceName to monitor all physical volumes and physical volume links in a volume group.

Logical Volume Summary

The logical volume summary describes how accessible the data is in all logical volumes in an active volume group. Sometimes the physical connection may be working, but the application cannot read or write data on the disk. The HA Disk Monitor determines I/O activity by querying LVM, and marks a logical volume as DOWN if a portion of its data is unavailable.

NOTE: If the logical volume is in an inactive volume group, the HA Disk Monitor cannot determine whether the data is accessible.

Table 2-3 “Interpreting Logical Volume Summary” lists how conditions compare in logical operations. You specify the logical operation in the monitor request parameters portion of the monitor request. For example, to create a request that alerts you when the condition is INACTIVE_DOWN or DOWN, you would specify greater than or equal to 3 (>=3).

Table 2-3 Interpreting Logical Volume Summary

Resource Name: /vg/vgName/lv_summary

| Condition | Value | Interpretation |
| --- | --- | --- |
| UP | 1 | All logical volumes are accessible and all data is accessible. |
| INACTIVE | 2 | The volume group is inactive. This could be because: (1) The volume group is active in exclusive mode on another node in a ServiceGuard cluster. (This is not valid for clusters running ServiceGuard Extension for RAC, because it can support a volume group being active on more than one node.) Note that ServiceGuard does allow a volume group to be active in read-only mode if it is already active on another node. (2) The volume group was made inactive using vgchange -a n for maintenance or other reasons. (3) There was no quorum of active physical volumes at system boot. For example, not enough disks in the volume group were working. |
| INACTIVE_DOWN | 3 | The last time the inactive volume group was activated, it was DOWN. At least one logical volume in the volume group was inaccessible. |
| DOWN | 4 | At least one logical volume in the volume group reports a status of either INACTIVE or DOWN. Note that an inactive logical volume in an active volume group is rare, but possible. See “Logical Volume Status”. |
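
The rules in Table 2-3 can be restated as a small Python sketch. It encodes only what the table says and is not the monitor's implementation; the function name, its parameters, and the idea of passing in the last summary recorded while the volume group was active are assumptions made for the example. The per-volume status values are those listed in Table 2-4.

```python
from enum import IntEnum
from typing import Iterable, Optional


class LvStatus(IntEnum):
    """Per-volume status values (Table 2-4)."""
    UP = 1
    INACTIVE = 2
    DOWN = 3


class LvSummary(IntEnum):
    """Volume group summary values (Table 2-3)."""
    UP = 1
    INACTIVE = 2
    INACTIVE_DOWN = 3
    DOWN = 4


def summarize(
    vg_active: bool,
    lv_statuses: Iterable[LvStatus],
    last_active_summary: Optional[LvSummary] = None,
) -> LvSummary:
    """Derive an lv_summary value from the rules in Table 2-3."""
    if not vg_active:
        # Inactive group: report whether it was DOWN when last activated.
        if last_active_summary == LvSummary.DOWN:
            return LvSummary.INACTIVE_DOWN
        return LvSummary.INACTIVE
    # Active group: any INACTIVE or DOWN logical volume makes it DOWN.
    if any(status != LvStatus.UP for status in lv_statuses):
        return LvSummary.DOWN
    return LvSummary.UP


print(summarize(True, [LvStatus.UP, LvStatus.INACTIVE]).name)  # DOWN
```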

 

Logical Volume Status

Logical volume status gives you status on each logical volume in a volume group. While the lv_summary gives you information on whether data in a volume group is available, the lv/status/lvName gives you information on whether specific logical volumes have failed.

Table 2-4 “Interpreting Logical Volume Status” lists how conditions compare in logical operations. You specify the logical operation in the monitor request parameters portion of the monitor request. For example, to create a request that alerts you when the condition is INACTIVE or DOWN, you would specify greater than or equal to 2 (>=2).

Table 2-4 Interpreting Logical Volume Status

Resource Name: /vg/vgName/lv/status/lvName

| Condition | Value | Interpretation |
| --- | --- | --- |
| UP | 1 | The logical volume is accessible and all of its data is accessible. |
| INACTIVE | 2 | The logical volume is inactive. |
| DOWN | 3 | The logical volume is DOWN. A complete copy of the data is not available for this logical volume. |

 

When configuring requests from the SAM interface, use a wildcard (*) in place of lvName to monitor all logical volumes in a volume group.

If you split off mirrors from your mirrored configuration, you will see new logical volume resource instances when the split mirror is created.

Logical Volume Number of Copies

The logical volume number of copies is most useful to monitor in a mirrored disk configuration. It tells you how many copies of the data are available. The HA Disk Monitor monitors all copies of data, and therefore counts the "original" as part of the total number of copies.

MirrorDisk/UX supports up to 3-way mirroring, so the range can be from 0 to 3 copies (see Table 2-5 “Interpreting Logical Volume Copies”). In a RAID configuration that is not mirrored using LVM, the only possible values are 0 and 1; either the data is accessible or it is not.

When you first configure mirroring in LVM, it lists 0 mirrors, meaning you have only the original copy of the data. Likewise, 2 mirrors mean you have one original plus 2 mirrored copies.
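
The relationship between the LVM mirror count and the number of copies the HA Disk Monitor reports can be written as a short worked example. The function names are hypothetical; the upper limit of 2 mirrors follows from the 3-way mirroring limit stated above.

```python
def total_copies(mirrors: int) -> int:
    """Copies expected when all disks are healthy: the original plus mirrors."""
    if not 0 <= mirrors <= 2:
        raise ValueError("3-way mirroring allows at most 2 mirrors")
    return mirrors + 1


def is_degraded(available_copies: int, mirrors: int) -> bool:
    """True if fewer complete copies are available than were configured."""
    return available_copies < total_copies(mirrors)


print(total_copies(0))    # 1: only the original copy
print(total_copies(2))    # 3: one original plus 2 mirrored copies
print(is_degraded(2, 2))  # True: a 3-way mirror with only 2 copies available
```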

Table 2-5 Interpreting Logical Volume Copies

Resource Name: /vg/vgName/lv/copies/lvName

| Condition | Interpretation |
| --- | --- |
| 0 | No additional copies (only the original copy). Either physical parts of the disk array have problems, the logical volume is inactive, or a physical extent is stale or unavailable. |
| 1 | One complete copy of data is available. If the data is not mirrored, then all physical extents are fine. If the data is mirrored, all other copies have problems. |
| 2 | Two complete copies of data are available. If the data is 2-way mirrored, then all physical disks are up and data is available. If the data is 3-way mirrored, at least one logical extent has a missing or stale physical extent. |
| 3 | All copies of a 3-way mirror are available. |

 

When configuring requests from the SAM interface, use a wildcard (*) in place of lvName to request status for all logical volumes in a volume group.

If you split off mirrors from your mirrored configuration, you will see the number of copies reduced by 1 when the split mirror is created.

© 1997, 2003 Hewlett-Packard Development Company, L.P.