These release notes cover the September 2005 release of Support Plus for HP-UX 11i.
- Overview
- Configuring Hardware Monitoring
- Monitors Provided
- Monitor Dependencies
- Changes
- Known Problems
- Reporting Defects
- Product Structure
- Product Documentation
Note: No tape drives are supported by Online Diagnostics on HP-UX. Although some of the Support Tools Manager (STM) tools may function with tape drives, they are not supported. The diagnostic tools and utilities that support these devices are HP StorageWorks Library and Tape Tools (L&TT). These tools are available at the following site:
http://www.hp.com/support/tapetools
EMS Hardware Monitors are used to maintain system availability. They enable you to monitor the operation of a wide variety of hardware. They also alert you immediately if any failure or other unusual event occurs. These monitors generate events if something unusual is detected on the hardware they monitor. These events are classified as major, warning, serious and critical on the basis of their severity. Action may be taken based on the severity. Peripheral Status Monitor (PSM) converts hardware events to changes in the device status. This way the monitors provide a high level of protection against system hardware failure. By using hardware event monitoring, you can virtually eliminate undetected hardware failures that could interrupt system operation or cause data loss.
Hardware event monitoring is available on HP-UX 11i and HP-UX 11.0 (IPR 9902 and later).
By default, events will be tracked in the following ways:
- Written to /var/adm/syslog/syslog.log
- Sent by email to the address root
All event information will be stored in /var/opt/resmon/log/event.log.
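As a minimal sketch of how recent entries in that log might be reviewed, the following Python function returns the last few lines of a text file. It assumes that event.log is plain text and that Python is available on the system; the name tail_lines and the default line count are illustrative and are not part of EMS.

```python
from collections import deque

# Location of the EMS event log, as stated in these release notes.
EVENT_LOG = "/var/opt/resmon/log/event.log"


def tail_lines(path, count=20):
    """Return the last `count` lines of a text file, without trailing newlines."""
    # errors="replace" keeps the sketch from failing on unexpected bytes.
    with open(path, "r", errors="replace") as handle:
        # A bounded deque retains only the final `count` lines while reading.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]
```

For example, tail_lines(EVENT_LOG, 50) would return up to the 50 most recent lines of the log.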
Configuring Hardware Monitoring
EMS Hardware Monitors are installed along with STM. Once the OnlineDiag bundle is installed, the monitoring action starts automatically.
To configure, enable, or disable hardware event monitoring, run the following command: /etc/opt/resmon/lbin/monconfig.
PSM is configured separately, using the EMS GUI. For more information, see: http://docs.hp.com/hpux/onlinedocs/diag/ems/ems_gui.htm
The following monitors are provided:
- AutoRAID Disk Array (armmon)
- Chassis Code Monitor (dm_chassis)
- Core Hardware (dm_core_hw)
- Chassis Event Monitor (ia64_corehw)
- CPU (lpmc_em)
- Disk (disk_em)
- Disk Array FC60 (fc60mon)
- Fast Wide SCSI Disk Array (fw_disk_array)
- Fibre Channel Adapters (dm_FCMS_adapter)
- Fibre Channel Adapter Model A5158 (dm_TL_adapter)
- Fibre Channel Adapter Model A6826A Monitor (dm_QL_adapter)
- Fibre Channel Arbitrated Loop Hub (dm_fc_hub)
- Fibre Channel SCSI Multiplexer (dm_fc_scsi_mux)
- Fibre Channel Switch (dm_fc_sw)
- Forward Progress Log Monitor (fpl_em)
- High Availability Disk Array (ha_disk_array)
- High Availability Storage System (dm_ses_enclosure)
- iSCSI Device Adapter (dm_iscsi_adapter)
- LPMC (lpmc_em), renamed CPU Monitor as of the June 2002 release
- Memory (dm_memory)
- MSA 1000 Storage Disk Array (msamon)
- Remote (RemoteMonitor)
- SCSI Card (scsi123_em)
- SCSI Cascade (scsi_cascade)
- SCSI Disk (scsi_disk)
- System Status (sysstat_em)
- UPS (dm_ups)
In addition, PSM monitors the current status of the products that are supported by the monitors listed above.
For more information on the products that are supported by monitors, see the "Diagnostics" section of Hewlett-Packard's online documentation Website: http://docs.hp.com/en/diag/ .
Several of the monitors have special requirements, such as patches or certain versions of the firmware. They are as follows:
- The Fibre Channel Arbitrated Loop Hub monitor and the Fibre Channel Switch monitor have special requirements that are described in their data sheets in the "EMS Hardware Monitors User's Guide" (chapter 6) at http://docs.hp.com/en/2512/B6191-90029.pdf.
- If your system includes an HP SureStore E Disk Array FC60, a patch is required to run the fc60mon monitor or the STM tools.
For more information on the patches required, see the DIAGNOSTIC.readme file for this release.
The current monitor requirements are described on the Requirements and Supported Products page under "EMS Hardware Monitors" at http://docs.hp.com/en/diag/ems/ems_prod.htm. Requirements are also listed in chapter 2 of the EMS Hardware Monitors User's Guide at http://docs.hp.com/en/2512/B6191-90029.pdf.
The changes that apply to EMS Hardware Monitors for the September 2005 release are grouped as follows:
- Changes to Multiple Monitors
- Changes to Individual Monitors
- Changes to Platform and Interface
- Customer-viewable Interfaces
- N/A
Changes to Individual Monitors
The following changes apply to individual monitors:
- AutoRAID Disk Array (armmon)
- JAGae87110
The armmon monitor exited, causing registrar.log messages. This problem has been fixed. The monitor now logs the correct signals to registrar.log, so no spurious log entries are created.
- JAGaf69055
The armmon monitor performed a hardware ioscan instead of a kernel (-k) ioscan. The monitor has been modified to include the -k option during ioscan. This problem has been fixed.
- Chassis Code Monitor (dm_chassis)
- JAGaf69452
The Probable Cause/Recommended Action text for event 1839 has been modified.
- Core Hardware Monitor (dm_core_hw)
- JAGaf63587
Events 37 and 38 were not generated on c8000 systems that run on PA8900 processors. This problem has been fixed.
- Chassis Event Monitor (ia64_corehw)
- JAGaf61202
Whenever a redundant power supply was removed, event 104011 was generated and logged in the event.log file. The event record included an incorrect generic message under Event Details that read "Redundancy regained". The correct message is "Redundancy lost". This problem has been fixed.
- JAGaf46420
The severities of event 115002 and event 115003 have been changed to "Major" and "Minor", respectively.
- JAGaf57777; JAGaf55382
The ia64_corehw monitor is enhanced to improve Forward Progress Log (FPL) processing performance.
- JAGaf55932
The ia64_corehw monitor experienced a memory leak when the monitor configuration values were retrieved. This problem has been fixed.
- JAGaf25773
The ia64_corehw monitor generated events for errors that were already resolved. This problem has been fixed.
- JAGaf39091
The Summary text and the Probable Cause/Recommended Action text of event 101011 were not consistent. This problem has been fixed.
- JAGaf40433
The ia64_corehw monitor experienced a memory leak while processing entries from the os_decode_xref file. This problem has been fixed.
- JAGaf42886
The ia64_corehw monitor generated events with an incorrectly high severity for system temperature. These events were generated when the system temperature changed from a more severe state to a less severe state. For example, when the system changed from a non-critical to a normal state, an event for a non-critical temperature was generated. This problem has been fixed.
- The ia64_corehw monitor AplLog content has been cleaned up and enhanced. The level of detail of the content can now be controlled.
- JAGaf51587
Event 102000 was generated although the event does not exist. This problem has been fixed.
- JAGaf51588
Event 103011 was suppressed incorrectly. This problem has been fixed.
- JAGaf54242
The ia64_corehw monitor experienced a memory leak on non-cellular systems. This problem has been fixed.
- The ia64_corehw monitor is enhanced to generate an FPL event to indicate that FPL entries were missed.
- CPU Monitor (lpmc_em)
Note: Starting with the June 2002 release, LPMC (lpmc_em) was renamed CPU. The binary name is still lpmc_em. The name was changed to reflect the monitor's enhancement to check the floating-point functionality of the CPU.
- JAGaf56907
The Dynamic Processor Resilience (DPR) action threshold for events 100906-100910 was incorrectly reported as 31. The correct value is 1. This problem has been fixed.
- Disk Array FC60 Monitor (fc60mon)
- JAGae87110
The fc60mon monitor exited, causing registrar.log messages. This problem has been fixed. The monitor now logs the correct signals to registrar.log, so no spurious log entries are created.
- JAGaf52161
Event 6 was generated by the monitor even when the FC60 was working correctly. This problem has been fixed. The entry for event 6 in the default_fc60mon.clcfg file has been modified to report controller failure events as and when they occur.
- JAGaf52955
The fc60mon monitor did not generate an event when a disk failed in a Fibre Channel (FC60) array that is enabled with Global Hotspace (GHS). This problem occurred in spite of the fixes for the following defects:
- JAGae03024
- JAGae62769
This problem has been fixed.
- JAGaf01421
The /etc/opt/resmon/log directory was deleted 24 hours after the fc60mon monitor started. This problem has been fixed.
- Fast Wide SCSI Disk Array (fw_disk_array)
- Not applicable
- Disk Monitor (disk_em)
- JAGaf48688
The disk_em monitor incorrectly reported a firmware mismatch as a hardware failure. This problem has been fixed.
- Fibre Channel Adapters (dm_FCMS_adapter)
- Not applicable
- Fibre Channel Adapter Model A5158 Monitor (dm_TL_adapter)
- Not applicable
- Fibre Channel Adapter Model A6826A Monitor (dm_QL_adapter)
- Not applicable
- Fibre Channel SCSI Multiplexer (dm_fc_scsi_mux)
- Not applicable
- Fibre Channel Switch (dm_fc_sw)
- Not applicable
- Forward Progress Log Monitor (fpl_em)
- The fpl_em monitor is enhanced to filter events that have a severity level less than 3, which corresponds to Warning.
- High Availability Disk Array Monitor (ha_disk_array)
- Not applicable
- High Availability Storage System (dm_ses_enclosure)
- Not applicable
- iSCSI Device Adapter (dm_iscsi_adapter)
- Not applicable
- Memory Monitor (dm_memory)
- JAGaf48614
In the events generated on the rp4440, rp3440, and c8000 classes of machines, the MC/EXT field was not logged. This problem has been fixed.
- In the default_dm_memory.clcfg file, the severity level of events 3100, 3200, and 3300 has been changed to Information, and these events are now disabled by default. These events report correctable memory errors that occur at the same address and on the same DIMM. The severity level was changed to Information in order to disable these events.
- Peripheral Status Monitor (PSM/psmmon)
- Not applicable
- MSA 1000 Storage Disk Array (msamon)
- JAGaf48231
The msamon monitor has been modified to monitor MSA1500 storage devices with ACTIVE-ACTIVE firmware.
- Remote Monitor (RemoteMonitor)
- Not applicable
- SCSI Card Monitor (scsi123_em)
- Not applicable
- SCSI Cascade Monitor (scsi_cascade)
- Not applicable
- SCSI Disk (scsi_disk)
- Not applicable
- System Status Monitor (sysstat_em)
- Not applicable
- UPS Monitor (dm_ups)
- Not applicable
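The fpl_em change listed above filters events by severity level. As a rough illustration of that kind of threshold filter, and not the monitor's actual code, the Python sketch below discards records whose numeric severity is below 3. It assumes that "filter" means such events are dropped and that higher numbers are more severe; the Event type and the filter_events function are hypothetical names.

```python
from collections import namedtuple

# Threshold taken from the release note: level 3 corresponds to Warning.
WARNING_LEVEL = 3

# Hypothetical event record; real FPL entries carry many more fields.
Event = namedtuple("Event", ["event_id", "severity"])


def filter_events(events, threshold=WARNING_LEVEL):
    """Keep only the events whose severity is at or above the threshold."""
    return [event for event in events if event.severity >= threshold]
```

With this sketch, an event of severity 2 would be discarded, while events of severity 3 or higher would be kept.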
Changes to Platform and Interface
- Not applicable
- Not applicable
CAUTION: UPS monitor may need a patch
In some cases, the UPS monitor (dm_ups) will not function and will instead generate event 45 (formerly event 42) with the following text:
Probable Cause / Recommended Action: The monitor was unable to locate the fifo pipe that should have been created by ups_mond. Therefore, information about the ups cannot be sent to the monitor. You need version (80.1.2.3) of ups_mond or greater. To update your system with the correct version of ups_mond, install the following patch: HPUX 11.11 : PHCO_23832
This problem will affect most systems with a UPS when the September 2001 diagnostics are installed. The only systems that are not affected will be those being updated from certain versions of the diagnostics (September 2000 through March 2001) and those which do not have patch PHCO_19040 (HP-UX 11.00) installed.
To fix the problem, install the indicated patch or the HWE patch bundle which contains this patch. For HP-UX 11i, the ups_mond patch PHCO_23832 is also distributed on the Sept 01 OE.
CAUTION: Changes to the monitoring method for disc30, sdisk and disk array devices
Starting with IPR 9902 (Feb 99 release), disc30, sdisk and the HA Disk Array Models 10, 20, and 30FC are monitored differently.
Formerly, the "diaglogd exec" programs (pdisc30_exec and psdisk_exec) handled driver error entries for these devices. These programs have been deleted, and the EMS Hardware Monitors now handle driver error entries for these devices.
If you customized the configuration files for the diaglogd exec programs (disk30_exec.cfg and sdisk_exec.cfg), you can reconfigure the EMS Hardware Monitors to achieve the same results.
CAUTION: EMS monitors incompatible with ServiceGuard and High Availability (HA) Monitors
If you install the OnlineDiag bundle (IPR 9912 or later) on a system running older versions of EMS-related products, these products may experience compatibility problems. Affected products include MC/ServiceGuard, ServiceGuard OPS Edition and High Availability Monitors. Critical problems occur with the following versions:
- MC/ServiceGuard A.10.10, A.11.01, A.11.03
- ServiceGuard OPS Edition A.11.02, A.11.03
Support Tools and the EMS Hardware Monitors are not affected. For complete information, refer to EMS Incompatibility Problem.
If the maxssiz_64bit kernel parameter is set below the default value of 0x800000, it can cause the lpmc_em monitor to abort.
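A simple pre-check for this condition can be sketched in Python. The sketch only compares a value you supply against the default of 0x800000 quoted above; how the current value is read from the kernel is deliberately left out, and both function names are hypothetical.

```python
# Default value of maxssiz_64bit quoted in the known problem above
# (0x800000 is 8388608 in decimal).
DEFAULT_MAXSSIZ_64BIT = 0x800000


def parse_tunable(text):
    """Parse a tunable given as decimal or 0x-prefixed hexadecimal text."""
    # Base 0 lets int() accept both "8388608" and "0x800000".
    return int(text.strip(), 0)


def maxssiz_64bit_is_safe(current_value):
    """Return True if the value is at or above the documented default."""
    return current_value >= DEFAULT_MAXSSIZ_64BIT
```

For example, maxssiz_64bit_is_safe(parse_tunable("0x400000")) evaluates to False, which would indicate a setting below the default.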
You can report defects related to EMS Hardware Monitors by filing a request on CHART. The name of the project is diag. If you do not have access to CHART, contact your local HP representative to file a defect on your behalf.
EMS Hardware Monitors are installed as part of the OnlineDiag bundle (product number B4708AA). In addition, they utilize the EMS framework, product number B7609BA.
Note: EMS Hardware Monitors are installed as part of the STM-UUT-RUN Fileset. However, EMS Hardware Monitors are dependent on the EMS-Core and EMS-Config products and additional filesets in the STM Product.
For information on the STM product, refer to the STM release notes available at /usr/sbin/stm/Rel_NOTES.STM.
SD Bundle: OnlineDiag    Description: On-line Diagnostic System (Series 800/700)
SD PRODUCT: Sup-Tool-Mgr    Description: Support Tools Manager for HP-UX Systems
SD SUB-PRODUCT: Manuals    Description: Support Tools Manager Manual Pages
FILESET: RELEASE_NOTES    Description: HPUX STM Release Notes
FILESET: STM-MAN    Description: HPUX STM Manual Pages
SD SUB-PRODUCT: Runtime    Description: STM Manual Runtime
FILESET: STM-CATALOGS    Description: HPUX STM Shared Libraries
FILESET: STM-SHLIBS    Description: HPUX STM Shared Libraries
FILESET: STM-UI-RUN    Description: HPUX STM User Interface
FILESET: STM-UUT-RUN    Description: HPUX STM Unit Under Test Runtime
SD PRODUCT: EMS-Config    Description: EMS Config
FILESET: EMS-GUI    Description: Event Monitoring Service Graphical User Interface
SD PRODUCT: EMS-Core    Description: EMS Core Product
FILESET: EMS-CORE    Description: Event Monitoring Service Core Files
The following documents related to EMS Hardware Monitors are available at http://docs.hp.com/en/diag/:
- Data Sheets
- EMS Hardware Monitors Quick Reference Guide
- EMS Hardware Monitors User's Guide
- EMS HW Monitors for Hitachi Systems Running HP-UX
- Event Descriptions
- Frequently Asked Questions (FAQs)
- Multiple-View (Predictive-Enabled) Monitors
- Overview
- Quick Start: Anatomy of a Monitor (Controlling and Learning About Monitors)
- Requirements and Supported Products
- Release Notes
For information on installing and using EMS Hardware Monitors, and the list of supported hardware, refer to the "EMS Hardware Monitors User's Guide". An electronic copy of this book is also included on the Support Plus CD-ROM in the <mount_point>/DIAGNOSTICS directory.