These release notes cover the March 2003 release of Support Plus for HP-UX 11i/11.00 running on S800/S700 systems.
- Overview
- Configuring Hardware Monitoring
- Documentation
- Changes
- Known Problems
- Monitors Provided
- Monitor Dependencies
- Defect Reporting
- SD Product Structure
NOTE: As of the September 1999 release, the name of the Diagnostic/IPR Media has been changed to Support Plus. In addition, the format has changed so that there is a separate CD-ROM for each version of the operating system (HP-UX 11i, 11.00, and 10.20).
NOTE: The HP StorageWorks SDLT 160/320 GB Tape Drive and HP Ultrium 460 External Tape Drive are NOT SUPPORTED by the OnlineDiag product. Even though some STM tools may function with these devices, they are still NOT SUPPORTED. The diagnostic tools and utilities that DO SUPPORT these devices are the HP StorageWorks Library and Tape Tools (L&TT). These tools can be downloaded, free of charge, at:
http://www.hp.com/support/tapetools
Included on the Support Plus CD-ROM are the EMS Hardware Monitors, an important tool for maintaining system availability. The EMS hardware monitors allow you to monitor the operation of a wide variety of hardware products and be alerted immediately if any failure or other unusual event occurs. Hardware event monitoring is available to users running HP-UX 11i, 11.00, and 10.20 (IPR 9902 and later).
Hardware event monitoring provides a high level of protection against system hardware failure. By using hardware event monitoring, you can virtually eliminate undetected hardware failures that could interrupt system operation or cause data loss.
Configuring Hardware Monitoring
The EMS Hardware Monitors are installed at the same time as the Support Tools Manager. Once the monitoring software is installed, monitoring is automatically enabled.
By default, messages regarding major warning, serious, and critical events that occur on monitored hardware will be:
- Written to /var/adm/syslog/syslog.log
- Sent to the email address root
All events will be stored in /var/opt/resmon/log/event.log.
To configure, enable, or disable hardware event monitoring, run the monitoring request manager: /etc/opt/resmon/lbin/monconfig
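As a quick sanity check after installation, the paths named above can be probed for existence. The following is a minimal sketch, not part of the product: the function name and the `root` parameter are hypothetical and exist only for illustration and testing.

```python
import os

# Paths quoted in these release notes.
EVENT_LOG = "/var/opt/resmon/log/event.log"
SYSLOG = "/var/adm/syslog/syslog.log"
MONCONFIG = "/etc/opt/resmon/lbin/monconfig"


def monitoring_paths_present(root="/"):
    """Map each documented path to True/False depending on whether it exists.

    `root` is a hypothetical prefix so the check can be pointed at a test tree.
    """
    result = {}
    for path in (EVENT_LOG, SYSLOG, MONCONFIG):
        candidate = os.path.join(root, path.lstrip("/"))
        result[path] = os.path.exists(candidate)
    return result


if __name__ == "__main__":
    for path, present in monitoring_paths_present().items():
        status = "found" if present else "missing"
        print(status, path)
```

A missing monconfig or event.log only suggests that the monitoring software may not be installed; it does not say whether monitoring is enabled.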
The Peripheral Status Monitor (PSM) and the Kernel Resource Monitor (krmond) are configured differently: they use the EMS GUI. See: http://docs.hp.com/hpux/onlinedocs/diag/ems/ems_gui.htm
For the latest and most complete information on EMS Hardware Monitors and the Support Tools Manager (STM), see the Web page "Diagnostics":
http://docs.hp.com/hpux/diag/
At this site, you will find Overviews, Tutorials, Quick Reference Cards, Frequently Asked Questions (FAQs), and much other material.
For complete information on installing and using EMS hardware monitors, as well as a list of supported hardware, refer to the "EMS Hardware Monitors User's Guide" available at the above site. An electronic copy of this book is also included on the Support Plus CD-ROM in the <mount_point>/DIAGNOSTICS directory.
Changes in the EMS Hardware Monitors for the March 2003 release include:
- Changes to Multiple Monitors
- Changes to Individual Monitors
- Changes to Platform and Interface
- Customer-Visible Interface Changes
Changes to Individual Monitors
Changes to each monitor are described below. (Monitors are listed in alphabetical order.)
- AutoRAID Disk Array (armmon).
- JAGae50172; JAGae53064; JAGae53083
The following problems were fixed:
- JAGae50172 -- fc60mon and armmon do ioscans when started.
- JAGae53064 -- There are two similarly named config files for fc60mon and armmon; there should only be one.
- JAGae53083 -- EMS 12H disk array armmon monitor monitoring requests are missing.
The armmon executable has been modified to fix JAGae50172 and JAGae53064.
To fix the problem associated with JAGae53083, do the following:
If armmon has stopped monitoring the device, first check the ioscan output to confirm that the device state is CLAIMED; then kill and restart monitoring.
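The CLAIMED check in the armmon procedure above can be sketched as a small parser over captured ioscan text. This is an illustrative sketch only: the function name, the sample listing, and its hardware paths are made up, and the column layout is assumed to resemble a full ioscan listing. Fields are matched as whole words so that UNCLAIMED is not mistaken for CLAIMED.

```python
def is_claimed(ioscan_output, hw_path):
    """Return True if the listing line for hw_path has the field CLAIMED.

    Fields are compared as whole whitespace-separated tokens, so a state of
    UNCLAIMED does not match. Returns False if hw_path is not listed.
    """
    for line in ioscan_output.splitlines():
        fields = line.split()
        if hw_path in fields:
            return "CLAIMED" in fields
    return False


# Hypothetical sample text; real output will differ.
SAMPLE = """\
Class  I  H/W Path  Driver  S/W State  H/W Type  Description
disk   4  8/0.0.0   sdisk   CLAIMED    DEVICE    EXAMPLE ARRAY
disk   5  8/4.1.0   sdisk   UNCLAIMED  DEVICE    EXAMPLE ARRAY
"""

if __name__ == "__main__":
    for path in ("8/0.0.0", "8/4.1.0"):
        print(path, "claimed" if is_claimed(SAMPLE, path) else "not claimed")
```

Only when the device shows as CLAIMED does it make sense to proceed to killing and restarting monitoring.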
- Chassis Code Monitor (dm_chassis).
- The changes to the chassis code database have been incorporated, so that dm_chassis monitor recognizes the new events which have been added.
- CMC Monitor (cmc_em).
- N/A
- Core Hardware Monitor (dm_core_hw)
- Core Hardware for Itanium (ia64_corehw).
- N/A
- CPU Monitor (lpmc_em).
Note: As of the June 2002 release, the LPMC Monitor (lpmc_em) was renamed to "CPU Monitor". The binary name is still lpmc_em. The name was changed to reflect the monitor's enhancement to check floating-point functionality in the CPU.
- JAGae46865
A fix was made to correct the Problem-Description text for LPMC events 100611-614. The current, incorrect message is:
5 Minute(s) parity errors have been detected in the Data portion of the Instruction Cache (I-Cache Data) in !. The operating system has recovered from the errors, but this is an abnormally high failure rate. The monitor will try to deactivate the processor and/or mark it for deconfiguration. Another event will be generated to inform the result of the operation(s).
Similar text is currently used for events 100611-614. This submittal changes the message to:
NNN parity errors have been detected in the XXXX portion of the YYYY Cache in TIME, (and on....)
where:
- NNN is 2
- XXXX is Data or Tag (depending on the event number)
- YYYY is Instruction or Data (depending on the event number)
- TIME is days, hours, or minutes, whichever applies to the event.
- Disk Array FC60 Monitor (fc60mon).
- JAGae50172; JAGae53064; JAGae50182; JAGae50615
The following defect fixes were made for this monitor:
- JAGae50172 -- fc60mon does ioscan at startup.
- JAGae53064 -- There are two similarly named config files for fc60mon and armmon; there should only be one.
- JAGae50182 -- Event 3 is missing from fc60mon event descriptions list.
- JAGae50615 -- fc60mon on A.34.00 (HWE0209) can loop using 100% CPU.
- Fast Wide SCSI Disk Array (fw_disk_array)
- Disk Monitor (disk_em).
- JAGae54859
When logtool was used to view the details of an sdisk I/O error logged by the driver, with rawfiles given as input to the tool, the tool displayed only header information and not the other details of the error (e.g., error description, cause/action). This happened because the disk_em monitor did not log sdisk messages when running as a decoder. This problem has been fixed.
- JAGae47068; JAGae47608; JAGae54206
JAGae47068: moncheck reported duplicate disk resources, because a variable that defines the number of hardware paths monitored by disk_em was not being updated to the actual number of hardware paths to be monitored. This has been corrected.
JAGae47608: moncheck reported DVD and MO drives as being monitored by the disk_em monitor. These drives are not supported by disk_em, so they are not polled and their asynchronous events are not handled. moncheck nevertheless listed them as monitored, because a variable that defines the number of hardware paths monitored by disk_em was not being updated to the actual number of hardware paths to be monitored. This has been corrected.
JAGae54206: The disk_em monitor generated information event 6 unnecessarily for 73.3GB SEAGATE FC drives. The cause was spurious response data from these drives for the READ DEFECT command (unexpected behaviour for disk drives). The disk_em monitor has been modified so that it does not generate event 6 even if a drive returns spurious response data for the READ DEFECT command.
- Fibre Channel Adapters (dm_FCMS_adapter).
- N/A
- Fibre Channel Adapter Model A5158 Monitor (dm_TL_adapter).
- N/A
- Fibre Channel SCSI Multiplexer (dm_fc_scsi_mux).
- N/A
- Fibre Channel Switch (dm_fc_sw).
- N/A
- High Availability Disk Array Monitor (ha_disk_array) .
- High Availability Storage System (dm_ses_enclosure)
- JAGae43072
dm_ses_enclosure restarts repeatedly when there are 4 fc10s connected.
- JAGae51571
Fixed the following problem: event 306 was being reported for DS2300 with one controller installed.
- iSCSI Device Adapter (dm_iscsi_adapter)
- JAGae52908
Added a new decoder and monitor (dm_iscsi_adapter) to support the Hewlett-Packard iSCSI driver subsystem, which is available as a technology release for target vendor and early adopter customer testing only.
- This monitor is now available for HP-UX 11i.
- Kernel Resource Monitor (krmond)
- N/A
- LPMC Monitor (lpmc_em).
As of the June 2002 release, the LPMC Monitor (lpmc_em) was renamed to "CPU Monitor". The binary name is still lpmc_em. The name was changed to reflect the monitor's enhancement to check floating-point functionality in the CPU. For more information, see CPU Monitor (lpmc_em).
- Memory Monitor (dm_memory).
- JAGae40024
Memlogd now sends the "memory.debug file is present" message to the memory monitor in the case of T-Class and N-Class machines.
- Peripheral Status Monitor (PSM/psmmon).
- Remote Monitor (RemoteMonitor).
- SCSI Card Monitor (scsi123_em).
- N/A
- SCSI Cascade Monitor (scsi_cascade).
- SCSI Disk (scsi_disk).
- SCSI Tape Monitor (dm_stape).
- System Status Monitor (sysstat_em)
- UPS Monitor (dm_ups).
- JAGae46611
The dm_ups monitor was causing the persistence files to grow in size when ups_mond was killed and the entry for ups_mond in the /etc/inittab file was removed to prevent re-spawning of the daemon. Code changes have been made to resolve this problem.
Changes to Platform and Interface
Customer-Visible Interface Changes
CAUTION: UPS Monitor May Need a Patch
In some cases, the UPS monitor (dm_ups) will not function and will instead generate event 45 (formerly event 42) with the text:
Probable Cause / Recommended Action: The monitor was unable to locate the fifo pipe that should have been created by ups_mond. Therefore, information about the ups cannot be sent to the monitor. You need version (80.1.2.3) of ups_mond or greater. To update your system with the correct version of ups_mond, install one of the following patches:
HPUX 10.20/s800 : PHCO_24153 (supersedes PHCO_23830)
HPUX 11.00 : PHCO_24172 (supersedes PHCO_23831)
HPUX 11.11 : PHCO_23832
To fix the problem, load the indicated patch or load the HWE patch bundle which contains this patch. For HP-UX 11i, the ups_mond patch PHCO_23832 is also distributed on the Sept 01 OE.
This problem will affect most systems with a UPS when the September 2001 diagnostics are installed. The only systems not affected will be those which are being updated from certain versions of the diagnostics (September 2000 through March 2001) and which do not have patch PHCO_19031 (HP-UX 10.20) or PHCO_19040 (HP-UX 11.00) installed.
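The patch selection described in the UPS caution above can be expressed as a small lookup. This is a sketch under stated assumptions: the function names are hypothetical, the ups_mond version is assumed to be available as a dotted string, and the patch table is copied from the caution text (the 10.20 entry applies to s800).

```python
# Patch table copied from the UPS caution in these release notes.
UPS_MOND_PATCHES = {
    "10.20": "PHCO_24153",  # listed for HPUX 10.20/s800
    "11.00": "PHCO_24172",
    "11.11": "PHCO_23832",
}

# Minimum ups_mond version named in the event text.
MIN_UPS_MOND_VERSION = (80, 1, 2, 3)


def parse_version(text):
    """Turn a dotted version string such as '80.1.2.3' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def ups_mond_patch_needed(os_release, ups_mond_version):
    """Return the patch to install, or None if ups_mond is already new enough.

    Raises KeyError for an OS release that the caution does not list.
    """
    if parse_version(ups_mond_version) >= MIN_UPS_MOND_VERSION:
        return None
    return UPS_MOND_PATCHES[os_release]


if __name__ == "__main__":
    print(ups_mond_patch_needed("11.00", "80.1.2.2"))
```

Comparing versions as integer tuples avoids the string-comparison pitfall where "80.1.10.0" would sort before "80.1.2.3".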
CAUTION: Monitoring Changes for disc30, sdisk and disk array devices
As of IPR 9902 (Feb 99 release), there has been a change to the way that monitoring is done for disc30, sdisk and the HA Disk Array Models 10, 20, and 30FC.
Formerly, the "diaglogd exec" programs (pdisc30_exec and psdisk_exec) handled driver error entries for these devices.
As of IPR 9902, these programs have been deleted and their functionality is now provided by the EMS Hardware Monitors.
If you had customized the configuration files for the diaglogd exec programs (disk30_exec.cfg and sdisk_exec.cfg) you may wish to re-configure the EMS Hardware Monitors to achieve the same results.
CAUTION: Compatibility Problem with EMS-Related Products (ServiceGuard, HA Monitors, etc.)
If you install the OnlineDiag bundle (Dec 99 or later) onto a computer running older revisions of EMS-related products, these products may experience compatibility problems. Affected products include MC/ServiceGuard, ServiceGuard OPS Edition and High Availability Monitors. The only critical problems occur with the following versions:
MC/ServiceGuard A.10.10, A.11.01, A.11.03
ServiceGuard OPS Edition A.11.02, A.11.03
Support Tools and the EMS hardware monitors are not affected. For complete information, see EMS Incompatibility Problem.
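The list of critically affected versions above lends itself to a simple membership check before installing OnlineDiag. The sketch below is illustrative only: the function name is hypothetical, and the product names and version strings are assumed to match exactly as written in the caution.

```python
# Versions listed in the compatibility caution as having critical problems.
CRITICAL_VERSIONS = {
    "MC/ServiceGuard": {"A.10.10", "A.11.01", "A.11.03"},
    "ServiceGuard OPS Edition": {"A.11.02", "A.11.03"},
}


def has_critical_incompatibility(product, version):
    """Return True if the product/version pair is on the critical list."""
    return version in CRITICAL_VERSIONS.get(product, set())


if __name__ == "__main__":
    print(has_critical_incompatibility("MC/ServiceGuard", "A.11.01"))
```

A False result only means the pair is not on the critical list; other older revisions may still experience the non-critical compatibility problems mentioned in the caution.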
Monitors are provided to support the following:
- AutoRAID Disk Array (armmon)
- Chassis Code Monitor (dm_chassis)
- CMC (cmc_em)
- Core Hardware (dm_core_hw)
- Core Hardware for Itanium (ia64_corehw)
- CPU (lpmc_em)
- Disk (disk_em)
- Disk Array FC60 (fc60mon)
- Fast Wide SCSI Disk Array (fw_disk_array)
- Fibre Channel Adapters (dm_FCMS_adapter)
- Fibre Channel Adapter Model A5158 (dm_TL_adapter)
- Fibre Channel Arbitrated Loop Hub (dm_fc_hub)
- Fibre Channel SCSI Multiplexer (dm_fc_scsi_mux)
- Fibre Channel Switch (dm_fc_sw)
- High Availability Disk Array (ha_disk_array)
- High Availability Storage System (dm_ses_enclosure)
- iSCSI Device Adapter (dm_iscsi_adapter)
- Kernel Resource (krmond)
- LPMC (lpmc_em) renamed to "CPU Monitor" as of the June 02 release
- Memory (dm_memory)
- Remote (RemoteMonitor)
- SCSI Card (scsi123_em)
- SCSI Cascade (scsi_cascade)
- SCSI Disk (scsi_disk)
- SCSI Tape Devices (dm_stape)
- System Status (sysstat_em)
- UPS (dm_ups)
In addition, the Peripheral Status Monitor (PSM) is provided to monitor the current status of the products supported by the above list.
For detailed information concerning which products are supported by which monitors and additional dependencies, check the "Diagnostics" section of Hewlett-Packard's online documentation web site: http://docs.hp.com/hpux/diag/ .
Several of the monitors have special requirements, such as patches or certain versions of firmware. In particular:
- The Fibre Channel Arbitrated Loop Hub Monitor and the Fibre Channel Switch Monitor require special configuration, which is described in their data sheets in the "EMS Hardware Monitors User's Guide" (chapter 6). A patch is also required.
- A patch is required if your system includes an HP SureStore E Disk Array FC60. This patch is required to run the EMS hardware monitor (fc60mon) or STM tools for this device.
For a list of the current required patches, see the DIAGNOSTIC.readme file for this release.
Current monitor requirements are described in the "Supported Products" page under "EMS Hardware Monitors" at http://docs.hp.com/hpux/diag . Requirements are also listed in chapter 2 of the manual "EMS Hardware Monitors User's Guide".
Use CHART to report defects in the EMS Hardware monitors. The project name is diag.hw_mon.hpux. If you don't have access to CHART, contact an HP representative to enter a defect for you.
The EMS hardware monitors are installed as part of the OnlineDiag bundle (product number B4708AA). In addition, they utilize the EMS framework, product number B7609BA.
Note: EMS Hardware Monitors are installed as part of the STM-UUT-RUN Fileset. However, the EMS Hardware Monitors are dependent on the EMS-Core and EMS-Config products and additional filesets in the Sup-Tool-Mgr Product.
For information on the STM product, refer to the STM release notes file /usr/sbin/stm/Rel_NOTES.STM.
SD Bundle: OnlineDiag    Description: On-line Diagnostic System (Series 800/700)
  SD PRODUCT: Sup-Tool-Mgr    Description: Support Tools Manager for HP-UX Systems
    SD SUB-PRODUCT: Manuals    Description: Support Tools Manager Manual Pages
      FILESET: RELEASE_NOTES    Description: HPUX STM Release Notes
      FILESET: STM-MAN    Description: HPUX STM Manual Pages
    SD SUB-PRODUCT: Runtime    Description: STM Manual Runtime
      FILESET: STM-CATALOGS    Description: HPUX STM Shared Libraries
      FILESET: STM-SHLIBS    Description: HPUX STM Shared Libraries
      FILESET: STM-UI-RUN    Description: HPUX STM User Interface
      FILESET: STM-UUT-RUN    Description: HPUX STM Unit Under Test Runtime
  SD PRODUCT: EMS-Config    Description: EMS Config
      FILESET: EMS-GUI    Description: Event Monitoring Service Graphical User Interface
  SD PRODUCT: EMS-Core    Description: EMS Core Product
      FILESET: EMS-CORE    Description: Event Monitoring Service Core Files