Using EMS HA Monitors > Chapter 2 Monitoring Disk Resources

Disk Monitor Reference


The EMS disk monitor reports information on the physical and logical volumes configured by LVM (Logical Volume Manager). Anything not configured through LVM is not monitored by the disk monitor. Monitored disk resources are:

  • Physical volume summary (/vg/vgName/pv_summary), a summary status of all physical volumes in a volume group.

  • Physical volume and physical volume link status (/vg/vgName/pv_pvlink/status/deviceName), the status of a given physical volume or PV link in a volume group.

  • Logical volume summary (/vg/vgName/lv_summary), a summary status of all logical volumes in a volume group.

  • Logical volume status (/vg/vgName/lv/status/lvName), the status of a given logical volume in a volume group.

  • Logical volume copies (/vg/vgName/lv/copies/lvName), the number of copies of data available for a given logical volume in a volume group.
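
The resource names above follow a fixed pattern, so they can be assembled from the volume group, device, and logical volume names. The following is a minimal Python sketch of that pattern only. The helper names and the example names vg01, c0t5d0, and lvol1 are illustrative assumptions and are not part of the disk monitor.

```python
# Hypothetical helpers that assemble the five resource names listed above.
# The arguments are plain names; the exact form of deviceName on a real
# system is an assumption here.

def pv_summary(vg: str) -> str:
    return f"/vg/{vg}/pv_summary"

def pv_pvlink_status(vg: str, device: str) -> str:
    return f"/vg/{vg}/pv_pvlink/status/{device}"

def lv_summary(vg: str) -> str:
    return f"/vg/{vg}/lv_summary"

def lv_status(vg: str, lv: str) -> str:
    return f"/vg/{vg}/lv/status/{lv}"

def lv_copies(vg: str, lv: str) -> str:
    return f"/vg/{vg}/lv/copies/{lv}"

if __name__ == "__main__":
    for name in (
        pv_summary("vg01"),
        pv_pvlink_status("vg01", "c0t5d0"),
        lv_summary("vg01"),
        lv_status("vg01", "lvol1"),
        lv_copies("vg01", "lvol1"),
    ):
        print(name)
```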

Monitoring both the physical and logical volumes lets you detect failures in both active and inactive volume groups and logical volumes, and correct hardware problems that put node, application, or data availability at risk.

Figure 2-1 “Disk Monitor Resource Class Hierarchy” shows the class hierarchy for the disk monitor.

Items in boxes are resource instances that can be monitored.
Items in italics change depending on the names of volume groups, devices, and logical volumes on the system.

Figure 2-1 Disk Monitor Resource Class Hierarchy

Physical Volume Summary

The pv_summary is a summary status of all physical volumes in a volume group. This status is based on the compiled results of SCSI inquiries to all physical volumes in a volume group; see “Physical Volume and Physical Volume Link Status”.

If you have configured package dependencies in MC/ServiceGuard, this resource is used to determine package failover based on access to physical disks. (See Chapter 1 for information on configuring MC/ServiceGuard package dependencies.) If you are using the disk monitor with MC/ServiceGuard, it is important that you configure physical volume groups (PVGs) to give you the most accurate pv_summary for MC/ServiceGuard package failover. See “Rules for Using the EMS Disk Monitor with MC/ServiceGuard”.

The values in Table 2-1 “Interpreting Physical Volume Summary” are used by the disk monitor to determine how conditions compare in logical operations. For example, you may create a request that alerts you when the condition is greater than or equal to SUSPECT. The numeric values tell you which conditions qualify (here, SUSPECT and DOWN).

Table 2-1 Interpreting Physical Volume Summary

Resource name: /vg/vgName/pv_summary

Condition   Value   Interpretation
UP          1       All physical volumes containing data are accessible.
PVG_UP      2       At least one PV has failed, but all data is accessible. If more than one PV is down and the failed PVs are all from the same PVG, all data is still accessible. This condition can occur only in a mirrored set or if PV links are in PVGs.
SUSPECT     3       Two or more physical volumes from different PVGs are unavailable; the disk monitor cannot conclude that all data is available. For example, on a 2-way mirrored system, if a physical volume fails on each side of the mirror, data may still be available if the failed volumes hold different data, but unavailable if they hold the same data. Because the disk monitor knows only that disks have failed, and not what data is on them, it marks the volume group SUSPECT.
DOWN        4       Some data is missing, or no data is accessible.
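
The comparison described for Table 2-1 can be illustrated with a short Python sketch. It assumes only the numeric values listed in the table; the class and function names are hypothetical and do not reflect how the disk monitor is implemented.

```python
from enum import IntEnum

class PvSummary(IntEnum):
    """Numeric values of the pv_summary conditions from Table 2-1."""
    UP = 1
    PVG_UP = 2
    SUSPECT = 3
    DOWN = 4

def qualifies(condition: PvSummary, threshold: PvSummary) -> bool:
    """True when the condition is greater than or equal to the threshold."""
    return condition >= threshold

# Conditions that satisfy a ">= SUSPECT" request.
alerting = [c.name for c in PvSummary if qualifies(c, PvSummary.SUSPECT)]
print(alerting)  # ['SUSPECT', 'DOWN']
```

Because the values increase with severity, a single threshold selects a condition together with every worse one.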

 

The pv_summary resource may not be available for a given volume group in the following cases:

  • Devices are on an unsupported bus (such as HP-IB or HP-FL) or an unrecognized bus, in the case of a new bus technology. The /etc/syslog entry would say:

    diskmond[5699]: pv_summary will be unavailable for /dev/vg00 because there are physical volumes in this volume group which are on an unrecognized bus. (DRM-502)

  • PVGs (physical volume groups) exist in a volume group, but not all physical volumes are assigned to a PVG. The /etc/syslog entry would say:

    diskmond[18323]: pv_summary will be unavailable for /dev/vgtest because the physical volume groups (PVGs) in this volume group do not have an equal number of PVs or there are PVs not in a PVG. (DRM-503)

  • Unequal numbers of physical volumes exist in each PVG in the volume group. The /etc/syslog entry would say:

    diskmond[18323]: pv_summary will be unavailable for /dev/vgtest because the physical volume groups (PVGs) in this volume group do not have an equal number of PVs or there are PVs not in a PVG. (DRM-503)

    Two cases where this would occur are:

    • Both 2-way and 3-way mirroring are used in the same volume group.

    • The mirrored disks consist of a different number of physical disks with the same total capacity, for example, one 4 GB drive in one PVG and two 2 GB drives in the redundant PVG.
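
The two PVG-related rules above (every physical volume assigned to a PVG, and an equal number of physical volumes in each PVG) can be checked with a small Python sketch. The function name, its inputs, and the device names are hypothetical; the sketch takes the layout as plain Python data rather than reading any LVM file, and it does not cover the unsupported-bus case.

```python
from typing import Dict, Iterable, List, Optional

def pv_summary_unavailable_reason(
    all_pvs: Iterable[str], pvgs: Dict[str, List[str]]
) -> Optional[str]:
    """Return why pv_summary would be unavailable, or None if both rules pass.

    all_pvs: every physical volume in the volume group (hypothetical input)
    pvgs:    PVG name -> physical volumes assigned to that PVG
    """
    if not pvgs:
        return None  # no PVGs defined, so the two PVG rules do not apply
    assigned = {pv for members in pvgs.values() for pv in members}
    if set(all_pvs) - assigned:
        return "PVs not in a PVG"
    if len({len(members) for members in pvgs.values()}) > 1:
        return "PVGs do not have an equal number of PVs"
    return None

# One drive in PVG0 mirrored by two smaller drives in PVG1.
layout = {"PVG0": ["c0t1d0"], "PVG1": ["c1t1d0", "c1t2d0"]}
print(pv_summary_unavailable_reason(["c0t1d0", "c1t1d0", "c1t2d0"], layout))
```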

All checks for the validity of pv_summary are logged to both /etc/syslog and /etc/opt/resmon/log/api.log with the name of the local node and the identifier diskmond.
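
When scanning the logs for these entries, a regular expression can pull out the volume group and the message code. The Python sketch below is based only on the two example entries shown above; the pattern and function name are assumptions, and real log lines may carry additional prefixes, which is why the sketch searches within the line instead of matching from its start.

```python
import re
from typing import Optional, Tuple

# Pattern derived from the example diskmond entries in this section.
UNAVAILABLE = re.compile(
    r"diskmond\[(?P<pid>\d+)\]: pv_summary will be unavailable for "
    r"(?P<vg>\S+) because .*\((?P<code>DRM-\d+)\)"
)

def parse_unavailable(line: str) -> Optional[Tuple[str, str]]:
    """Return (volume group, message code), or None if the line does not match."""
    match = UNAVAILABLE.search(line)
    if match is None:
        return None
    return match.group("vg"), match.group("code")

sample = (
    "diskmond[5699]: pv_summary will be unavailable for /dev/vg00 because "
    "there are physical volumes in this volume group which are on an "
    "unrecognized bus. (DRM-502)"
)
print(parse_unavailable(sample))  # ('/dev/vg00', 'DRM-502')
```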

Physical Volume and Physical Volume Link Status

Requests to monitor physical volumes and physical volume links give you status on the individual physical volumes and PV links in a volume group. In the case of most RAID arrays, this means the monitor can talk to the physical link to a logical unit number (LUN) in the array. In the case of stand-alone disks, it means the monitor can talk to the disk itself.

The pv_pvlink status is used to calculate pv_summary. Although using both is somewhat redundant, you might want pv_summary to send the more specific status, and have status sent on pv_pvlink only if a device is DOWN.

The pv_pvlink and pv_summary resources supplement lv_summary by giving status on the accessibility of both active and inactive volume groups and logical volumes.

To pinpoint a failure to a particular disk, bus, or I/O card, you need to use the disk monitor alerts in conjunction with standard troubleshooting methods, such as reading log files and inspecting the actual devices. The disk monitor uses the data in /etc/lvmtab to see what is available for monitoring. Because /etc/lvmtab does not distinguish between physical volumes and physical volume links, you need to investigate further to determine whether a disk, bus, or I/O card has failed.

The values in Table 2-2 “Interpreting Physical Volume and Physical Volume Link Status” are used by the disk monitor to determine how conditions compare in logical operations. For example, you may create a request that alerts you when the condition is greater than or equal to BUSY. The numeric values tell you which conditions qualify (here, BUSY and DOWN).

Table 2-2 Interpreting Physical Volume and Physical Volume Link Status

Resource name: /vg/vgName/pv_pvlink/status/deviceName

Condition   Value   Interpretation
UP          1       SCSI inquiry was successful.
BUSY        2       SCSI inquiry returned with DEVICE BUSY; the disk monitor tries 3 times to get either an UP or DOWN result before marking a device BUSY.
DOWN        3       SCSI inquiry failed; either the bus or the disk is not accessible.
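
The BUSY handling in Table 2-2 can be pictured as a bounded retry loop. The Python sketch below is an illustration under stated assumptions: the inquire callable is a hypothetical stand-in for the SCSI inquiry, "3 times" is read as up to three inquiries in total, and no delay between attempts is modeled because none is described here.

```python
from typing import Callable

UP, BUSY, DOWN = 1, 2, 3  # numeric values from Table 2-2

def probe_status(inquire: Callable[[], int], attempts: int = 3) -> int:
    """Return the first UP or DOWN result from up to `attempts` inquiries,
    or BUSY if every inquiry reports BUSY."""
    for _ in range(attempts):
        result = inquire()
        if result in (UP, DOWN):
            return result
    return BUSY

# Simulated device that is busy twice and then answers.
answers = iter([BUSY, BUSY, UP])
print(probe_status(lambda: next(answers)))  # 1 (UP)
```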

 

When configuring requests from the SAM interface, a wildcard (*) may be used in place of deviceName to monitor all physical volumes and physical volume links in a volume group.
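
To show which resource instances a wildcard request covers, the following Python sketch filters a list of instance names against a pattern with the standard fnmatch module. How SAM or EMS expands the wildcard internally is not described here, so this is only an illustration; the instance and device names are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical resource instances on a system with two volume groups.
instances = [
    "/vg/vg01/pv_pvlink/status/c0t5d0",
    "/vg/vg01/pv_pvlink/status/c1t5d0",
    "/vg/vg02/pv_pvlink/status/c2t0d0",
]

# Wildcard in place of deviceName: all PVs and PV links in vg01.
request = "/vg/vg01/pv_pvlink/status/*"

matched = [name for name in instances if fnmatchcase(name, request)]
print(matched)
```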

Logical Volume Summary

The logical volume summary tells you how accessible the data is in all logical volumes in an active volume group. Sometimes the physical connection may be working, but the application cannot read or write data on the disk. The disk monitor determines I/O activity by querying LVM, and marks a logical volume as DOWN if a portion of its data is unavailable.

NOTE: The disk monitor cannot determine data accessibility to logical volumes in an inactive volume group.

The values in Table 2-3 “Interpreting Logical Volume Summary” are used by the disk monitor to determine how conditions compare in logical operations. For example, you may create a request that alerts you when the condition is greater than or equal to INACTIVE_DOWN.

Table 2-3 Interpreting Logical Volume Summary

Resource name: /vg/vgName/lv_summary

Condition       Value   Interpretation
UP              1       All logical volumes are accessible; all data is accessible.
INACTIVE        2       The volume group is inactive. This could be because:

                          • The volume group is active in exclusive mode on another node in an MC/ServiceGuard cluster. (This is not valid for clusters running MC/LockManager, because it can support a volume group being active on more than one node.) Note that MC/ServiceGuard does allow a volume group to be active in read-only mode if it is already active on another node.

                          • The volume group was made inactive using vgchange -a n for maintenance or other reasons.

                          • There was not a quorum of active physical volumes at system boot; that is, not enough disks in the volume group were working.

INACTIVE_DOWN   3       The last time the inactive volume group was activated, it was DOWN; at least one logical volume in the volume group was inaccessible.
DOWN            4       At least one logical volume in the volume group reports a status of either INACTIVE or DOWN. Note that an inactive logical volume in an active volume group is rare, but possible. See “Logical Volume Status”.
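
The interpretations in Table 2-3 can be read as a roll-up of the individual logical volume statuses. The Python sketch below is one way to express that reading; the function name and the was_down_when_last_active parameter are hypothetical, and the disk monitor's actual logic may differ.

```python
from enum import IntEnum
from typing import Iterable

class LvStatus(IntEnum):
    UP = 1
    INACTIVE = 2
    DOWN = 3

class LvSummary(IntEnum):
    UP = 1
    INACTIVE = 2
    INACTIVE_DOWN = 3
    DOWN = 4

def summarize(
    vg_active: bool,
    lv_statuses: Iterable[LvStatus],
    was_down_when_last_active: bool = False,
) -> LvSummary:
    """Roll per-volume statuses up into an lv_summary condition."""
    if not vg_active:
        # Inactive volume group: report whether it was DOWN when last active.
        if was_down_when_last_active:
            return LvSummary.INACTIVE_DOWN
        return LvSummary.INACTIVE
    # Active volume group: any INACTIVE or DOWN logical volume means DOWN.
    if any(status != LvStatus.UP for status in lv_statuses):
        return LvSummary.DOWN
    return LvSummary.UP

print(summarize(True, [LvStatus.UP, LvStatus.DOWN]).name)  # DOWN
```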

 

Logical Volume Status

Logical volume status gives you status on each logical volume in a volume group. While lv_summary tells you whether data in a volume group is available, the lv/status/lvName resource tells you whether specific logical volumes have failed.

The values in Table 2-4 “Interpreting Logical Volume Status” are used by the disk monitor to determine how conditions compare in logical operations. For example, you may create a request that alerts you when the condition is greater than or equal to INACTIVE. The numeric values tell you which conditions qualify (here, INACTIVE and DOWN).

Table 2-4 Interpreting Logical Volume Status

Resource name: /vg/vgName/lv/status/lvName

Condition   Value   Interpretation
UP          1       The logical volume is accessible; all of its data is accessible.
INACTIVE    2       The logical volume is inactive.
DOWN        3       The logical volume is DOWN; a complete copy of the data is not available for this logical volume.

 

When configuring requests from the SAM interface, a wildcard (*) may be used in place of lvName to monitor all logical volumes in a volume group.

If you split off mirrors from your mirrored configuration, you will see new logical volume resource instances when the split mirror is created.

Logical Volume Number of Copies

The logical volume number of copies is most useful to monitor in a mirrored disk configuration. It tells you how many copies of the data are available.

MirrorDisk/UX supports up to 3-way mirroring, so there can be from 0 to 3 copies (see Table 2-5 “Interpreting Logical Volume Copies”). In a RAID configuration that is not mirrored using LVM, the only possible values are 0 and 1; either the data is accessible or it isn't.

Note that when you configure mirroring in LVM, it lists 0 mirrors to mean you have one copy of the data. Likewise, 2 mirrors mean you have 3 copies of the data (one original plus 2 mirrors). The disk monitor is monitoring all copies of data, and therefore counts the "original" as part of the total number of copies.

Table 2-5 Interpreting Logical Volume Copies

Resource name: /vg/vgName/lv/copies/lvName

Condition   Interpretation
0           No copies are available: either physical parts of the disk array have problems, the logical volume is inactive, or a physical extent is stale or unavailable.
1           One complete copy of the data is available. If the data is not mirrored, all physical extents are fine; if the data is mirrored, all other copies have problems.
2           Two complete copies of the data are available. If the data is 2-way mirrored, all physical disks are up and data is available; if it is 3-way mirrored, at least one logical extent has a missing or stale physical extent.
3           All copies of a 3-way mirror are available.
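
The relationship between the LVM mirror count and the number of copies reported by the disk monitor is simple arithmetic: copies = mirrors + 1. The Python sketch below shows it and compares the reported value with the configured one; the function names are hypothetical.

```python
def expected_copies(mirrors: int) -> int:
    """LVM counts mirrors; the disk monitor counts the original plus mirrors."""
    if not 0 <= mirrors <= 2:
        raise ValueError("up to 3-way mirroring means 0 to 2 mirrors")
    return mirrors + 1

def copies_missing(available: int, mirrors: int) -> int:
    """Number of configured copies that are currently not available."""
    return expected_copies(mirrors) - available

# 3-way mirroring (2 mirrors) with the disk monitor reporting 2 copies.
print(expected_copies(2))    # 3
print(copies_missing(2, 2))  # 1
```

A nonzero result from copies_missing corresponds to the degraded cases in Table 2-5, for example a value of 2 on a 3-way mirror.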

 

When configuring requests from the SAM interface, a wildcard (*) may be used in place of lvName to request status for all logical volumes in a volume group.

If you split off mirrors from your mirrored configuration, you will see the number of copies reduced by 1 when the split mirror is created.

© 1997 Hewlett-Packard Development Company, L.P.