Using EMS HA Monitors > Chapter 1 Installing and Using EMS

Using EMS HA Monitors


There are two ways to use EMS HA Monitors:

  • Configure monitoring requests from the EMS interface in the Resource Management area of SAM.

  • Configure package dependencies in MC/ServiceGuard by using the Package Configuration interface in the High Availability Clusters subarea of SAM or by editing the package ASCII configuration file.

The following are prerequisites to using EMS:

  • Disks need to be configured using the LVM (Logical Volume Manager).

  • Network cards need to be configured.

  • Filesystems need to have been created and mounted.

Resource classes are structured hierarchically, similar to a filesystem structure, although they are not actually files and directories. The classes supplied with this version of EMS are listed in Figure 1-2 “Event Monitoring Service Resource Class Hierarchy”. Resource instances are listed in bold, and instances that are replaced with an actual name are in bold italics.

Figure 1-2 Event Monitoring Service Resource Class Hierarchy

The full path of a resource includes the class, subclasses, and instance. For example, the full resource path for the physical volume status of the device /dev/dsk/c0t1d2, which belongs to volume group vgDataBase, is /vg/vgDataBase/pv_pvlink/status/c0t1d2.
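
The path layout described above can be sketched in a few lines of Python. This is only an illustration of how the class, subclasses, and instance combine; the function names are hypothetical and are not part of EMS.

```python
from __future__ import annotations


def resource_path(volume_group: str, device: str) -> str:
    """Build the resource path for a physical volume's status (illustrative only)."""
    # Only the last component of the device file name is used as the instance.
    instance = device.rsplit("/", 1)[-1]
    return f"/vg/{volume_group}/pv_pvlink/status/{instance}"


def split_resource_path(path: str) -> tuple[list[str], str]:
    """Split a full resource path into (class and subclasses, instance)."""
    parts = path.strip("/").split("/")
    return parts[:-1], parts[-1]


print(resource_path("vgDataBase", "/dev/dsk/c0t1d2"))
# /vg/vgDataBase/pv_pvlink/status/c0t1d2
```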

Configuring EMS Monitoring Requests Outside of MC/ServiceGuard

This section describes how to use the SAM interface to EMS to create monitoring requests that notify non-MC/ServiceGuard management applications such as IT/Operations. This information for creating requests is also valid for monitors sold with other products (ATM or OTS, for example) and for user-written monitors written according to developer specifications in Writing Monitors for the Event Monitoring Service (EMS).

To start the EMS configuration, double-click on the Event Monitoring Service icon in the Resource Management area in SAM. The main screen, shown in Figure 1-3 “Event Monitoring Service Screen”, shows all requests configured on that system; if you haven't created requests, the screen will be empty.

Figure 1-3 Event Monitoring Service Screen

Selecting a Resource to Monitor

All resources are divided into classes. When you double-click on Add Monitoring Request in the Actions menu, the top-level classes for all installed monitors are dynamically discovered and then listed.

Figure 1-4 The Top Level of the Resource Hierarchy in the Add a Monitoring Request Screen

Some Hewlett-Packard products, such as ATM or HP OTS 9000, provide EMS monitors. If those products are installed on the system, then their top-level classes will also appear here. Similarly, top-level classes belonging to user-written monitors, created using the EMS Developer's Kit, will be discovered and displayed here.

Traverse the hierarchy in the upper part of the screen in Figure 1-4 “The Top Level of the Resource Hierarchy in the Add a Monitoring Request Screen” and select a resource instance to monitor in the lower part of the screen as in Figure 1-5 “Choosing a Resource Instance in the Add a Monitoring Request Screen”.

Figure 1-5 Choosing a Resource Instance in the Add a Monitoring Request Screen

Using Wildcards

The * wildcard is a convenient way to create many requests at once. Most systems have more than one disk or network card, and many have several disks. To avoid having to create a monitor request for each disk, select * (All Instances) in the Resource Instance box. See Figure 1-5 “Choosing a Resource Instance in the Add a Monitoring Request Screen”.

Wildcards are available only when all instances of a subclass are the same resource type.

Wildcards are not available for resource classes. So, for example, a wildcard is available for the status instances in the /vg/vgName/pv_pvlink/status subclass, but no wildcard appears for the volume group subclasses under the /vg resource class.
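
Conceptually, a wildcard request stands for one request per instance in the subclass. The sketch below illustrates that idea only; the function name and the instance names are hypothetical, and it does not show how EMS itself stores or expands wildcard requests.

```python
def expand_wildcard(request_path, known_instances):
    """Return one resource path per instance when the path ends in '*'."""
    subclass, _, instance = request_path.rpartition("/")
    if instance != "*":
        # Not a wildcard request: it refers to exactly one instance.
        return [request_path]
    return [f"{subclass}/{name}" for name in known_instances]


# Hypothetical instance names for two disks in volume group vg01.
for path in expand_wildcard("/vg/vg01/pv_pvlink/status/*", ["c0t1d2", "c1t1d2"]):
    print(path)
```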

Creating a Monitoring Request

The screen in Figure 1-6 “Monitoring Request Parameters” shows where you specify when and how to send events. The following sections describe the monitoring parameters and some common applications of them.

Figure 1-6 Monitoring Request Parameters

How Do I Tell EMS When to Send Events?

While the monitor may be polling disks every 5 minutes, for example, you may only want to be alerted when something happens that requires your attention. When you create a request, you specify the conditions under which you receive an alert. Here are the terms under which you can be notified:

When value is...

You define the conditions under which you wish to be notified for a particular resource using an operator (e.g. =, not equal, >, >=, <, <=) and a value returned by the monitor (e.g. UP, DOWN, INACTIVE). Text values are mapped to numerical values. Specific values are in the chapters describing the individual monitors.

When value changes

This notification might be used for a resource that does not change frequently, but you need to know each time it does. For example, you would want notification each time the number of mirrored copies of data changes from 2 to 1 and back to 2.

At each interval

This sends notification at each polling interval. It would most commonly be used for reminders or gathering data for system analysis. Use this for only a small number of resources at a time, and with long polling intervals of several minutes or hours; there is a risk of affecting system performance.

If you select conditional notification, you may select one or more of these options:

Initial

Use this option as a baseline when monitoring resources such as available filesystem space or system load. It can also be used to test that events are being sent for a new request.

Repeat

Use this option for urgent alerts. The Repeat option sends an alert at each polling interval as long as the notify condition is met. Use this option with caution; there is a risk of high CPU use or filling log files and alert windows.

Return

Use this option to track when emergency situations return to normal.
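
The following sketch simulates how the Initial, Repeat, and Return options might combine over a series of polls. It rests on assumptions that are not stated in this chapter: that Initial produces one event at the first poll, and that without Repeat an alert is sent only when the condition first becomes true. The function and event names are hypothetical.

```python
def events_for(values, condition, initial=False, repeat=False, return_=False):
    """Simulate the events produced for one value per polling interval."""
    events = []
    was_met = False
    for poll, value in enumerate(values):
        met = condition(value)
        if poll == 0 and initial:
            # Assumed: Initial reports the first value seen, as a baseline.
            events.append((poll, "initial", value))
        elif met and (repeat or not was_met):
            # Repeat alerts at every poll while the condition holds;
            # otherwise (assumed) only when the condition first becomes true.
            events.append((poll, "alert", value))
        elif was_met and not met and return_:
            # Return reports that the condition is no longer met.
            events.append((poll, "return", value))
        was_met = met
    return events


print(events_for([1, 5, 5, 2], lambda v: v > 3, initial=True, return_=True))
# [(0, 'initial', 1), (1, 'alert', 5), (3, 'return', 2)]
```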

NOTE: Updated monitors may have new status values that change the meaning of your monitoring requests, or generate new alerts.

For example, assume you have a request for notification if status > 3 for a resource with a value range of 1-7. You would get an alert each time the value equaled 4, 5, 6, or 7. If the updated version of the monitor has a new status value of 8, you would also see alerts when the resource equaled 8.
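
The effect of a new status value on an existing request can be checked with a short sketch. The operator table is an illustration only; in particular, "!=" is used here as a stand-in spelling for the "not equal" operator.

```python
import operator

# Operators a request condition may use ("!=" stands in for "not equal").
OPERATORS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def condition_met(value, op, threshold):
    """Return True if the numeric resource value satisfies the condition."""
    return OPERATORS[op](value, threshold)


# Values that trigger "status > 3" before and after the monitor update.
alerts_before = [v for v in range(1, 8) if condition_met(v, ">", 3)]
alerts_after = [v for v in range(1, 9) if condition_met(v, ">", 3)]
print(alerts_before)  # [4, 5, 6, 7]
print(alerts_after)   # [4, 5, 6, 7, 8]
```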

What is a Polling Interval?

The polling interval determines the maximum amount of elapsed time before a monitor knows about a change in status for a particular resource. The shorter the polling interval, the more likely you are to have recent data. However, depending on the monitor, a short polling interval may use more CPU and system resources. You need to weigh the ability to respond quickly to events against maintaining good system performance.

The minimum polling interval depends on the monitor's ability to process quickly. For most resource monitors the minimum is 30 seconds. Disk monitor requests can be as short as 1 second.

MC/ServiceGuard monitors resources every few seconds. You may want to use a short polling interval (30 seconds or less) when it is critical that you make a quick failover decision.

You may want a polling interval of 5 minutes or so for monitoring less critical resources.

You may want to set a very long polling interval (4 hours) to monitor failed disks that are not essential to the system, but which should be replaced in the next few days.

Which Protocols Can I Use to Send Events?

You specify the protocol the EMS framework uses to send events in the Notify via: section of the screen in Figure 1-6 “Monitoring Request Parameters”. The options are:

  • opcmsg ITO
    This sends messages to ITO applications via the opcmsg daemon. EMS defines normal and abnormal differently for each notification type:

    • Conditional notification defines all events that meet the condition as abnormal, and all others as normal.

    • Change notification defines all events as abnormal.

    • Notification at each polling interval defines all events as normal.

    You may specify the ITO message severity for both normal and abnormal events:

    • Normal

    • Warning

    • Critical

    • Minor

    • Major

    The ITO application group is EMS(HP), the message group is HA, and the object is the full path of the resource being monitored.

    See HP OpenView IT/Operations Administrators Task Guide (P/N B4249-90003) for more information on configuring notification severity.

  • SNMP traps
    This sends messages to applications using SNMP traps, such as Network Node Manager. See HP OpenView Using Network Node Manager (P/N J1169-90002) for more information on configuring SNMP traps. The following traps are used by EMS:

    EMS_NORMAL_OID   "1.3.6.1.4.1.11.2.3.1.7.0.1" - Normal notification
    EMS_ABNORMAL_OID "1.3.6.1.4.1.11.2.3.1.7.0.2" - Abnormal notification
    EMS_RESTART_OID  "1.3.6.1.4.1.11.2.3.1.7.0.4" - Restart notification
  • TCP and UDP
    This sends TCP or UDP encoded events to the target host name and port indicated for that request. Thus the message can be directed to a user-written socket program.
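
As a starting point for a user-written socket program, the sketch below opens a UDP socket and collects raw datagrams. The encoding of EMS events is not described in this chapter, so the payload is kept as uninterpreted bytes; the function names, the loopback address, and the timeout are assumptions for illustration.

```python
import socket


def open_event_socket(host="127.0.0.1", port=0):
    """Bind a UDP socket; port 0 lets the operating system pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_events(sock, count, timeout=5.0):
    """Collect `count` datagrams as (sender address, raw payload) pairs."""
    sock.settimeout(timeout)  # raises a timeout error if nothing arrives
    events = []
    for _ in range(count):
        # The payload format is not covered here, so it is not decoded.
        payload, sender = sock.recvfrom(65535)
        events.append((sender, payload))
    return events
```

In practice you would bind to the host name and port entered in the monitoring request, then call receive_events in a loop and pass each payload to your own decoding routine.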

Templates for configuring IT/Operations and Network Node Manager to display EMS events can be found on the Hewlett-Packard High Availability public web page at http://www.hp.com/go/ha.
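
The normal and abnormal rules and the trap OIDs listed above can be summarized in a small lookup. This is a sketch for a program that receives or filters EMS events; the function name and the notification type labels are hypothetical, and pairing each classification with the trap OID of the same name is an assumption.

```python
# Trap OIDs as listed in this section.
TRAP_OIDS = {
    "normal": "1.3.6.1.4.1.11.2.3.1.7.0.1",
    "abnormal": "1.3.6.1.4.1.11.2.3.1.7.0.2",
    "restart": "1.3.6.1.4.1.11.2.3.1.7.0.4",
}


def classify_event(notification_type, condition_met=False):
    """Return 'normal' or 'abnormal' for an event of the given type."""
    if notification_type == "conditional":
        # Events that meet the condition are abnormal, all others normal.
        return "abnormal" if condition_met else "normal"
    if notification_type == "change":
        return "abnormal"
    if notification_type == "interval":
        return "normal"
    raise ValueError(f"unknown notification type: {notification_type!r}")


kind = classify_event("conditional", condition_met=True)
print(kind, TRAP_OIDS[kind])
# abnormal 1.3.6.1.4.1.11.2.3.1.7.0.2
```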

What is a Notification Comment?

The notification comment is useful for sending task reminders to the recipients of an event. For example, if a disk monitor request reports that an entire mirror has failed, you may want the event that appears in IT/Operations to include the name of the person to contact when disks fail. If you have configured MC/ServiceGuard package dependencies, you may want to enter the package name as a comment in the corresponding pv_summary request.

Copying Monitoring Requests

There are two ways to use the copy function:

  • To create requests for many resources using the same monitoring parameters, select the monitoring request in the main screen and choose Actions: Copy Monitoring Request. You need to have configured at least one similar request for a similar instance. Choose a different resource instance in the Add a Monitoring Request screen, and click <OK> in the Monitoring Request Parameters screen.

  • To create many different requests for the same resource, select the monitoring request in the main screen and choose Actions: Copy Monitoring Request. You need to have configured at least one request for that resource. Click <OK> in the Add a Monitoring Request screen, and modify the parameters in the Monitoring Request Parameters screen. You may want to do this to create requests that send events using multiple protocols.

Modifying Monitoring Requests

To change the monitoring parameters of a request, select the monitoring request from the main screen and select Actions: Modify Monitoring Request.

Removing Monitoring Requests

Select one or more monitoring requests from the main screen and choose Actions: Remove Monitoring Request. To start monitoring the resource again you must recreate the request, either by copying a similar request for a similar resource or by re-entering the data.

Configuring MC/ServiceGuard Package Dependencies

This section describes how to use SAM to create package dependencies on EMS resources. This creates an EMS request to monitor that resource and to notify MC/ServiceGuard when that resource reaches a critical user-defined level. MC/ServiceGuard will then fail over the package. Here are some examples of how EMS might be used:

  • In a cluster where one copy of data is shared between all nodes, you may want to fail over a package if the host adapter has failed on the node running the package. Because busses, controllers, and disks are shared, failing the package over to another node after a bus, controller, or disk failure would not let the package run successfully. To make sure you have proper failover in a shared data environment, you must create identical package dependencies on all nodes in the cluster. MC/ServiceGuard can then compare the resource "UP" values on all nodes and fail over to the node that has the correct resources available.

  • In a cluster where each node has its own copy of data, you may want to fail over a package to another node for any number of reasons:

    • host adapter, bus, controller, or disk failure

    • unprotected data (the number of copies is reduced to one)

    • performance has degraded because one of the PV links has failed

    In this sort of cluster, for example a cluster of web servers where each node has a copy of the data and users are distributed for load balancing, you can fail over a package to another node with the correct resources available. Again, the package resource dependencies should be configured the same on all nodes.

This information for creating requests is also valid for EMS monitors sold with other products (ATM or OTS, for example) and for user-written monitors written according to developer specifications in Writing Monitors for the Event Monitoring Service (EMS).

NOTE: You should create the same requests on all nodes in an MC/ServiceGuard cluster.

A package can depend on any resource monitored by an EMS monitor. To create package dependencies, choose to create or modify a package from the Package Configuration interface under the High Availability Clusters subarea of SAM (Figure 1-7 “Package Configuration Screen”). You see a new option called "Specify Package Resource Dependencies."

Figure 1-7 Package Configuration Screen

Click on "Specify Package Resource Dependencies..." to add EMS resources as package dependencies; you see a screen similar to Figure 1-8 “Package Resource Dependencies Screen”. If you click "Add Resource", you get a screen similar to Figure 1-7 “Package Configuration Screen”.

Figure 1-8 Package Resource Dependencies Screen

When you select a resource, either from the "Add a Resource" screen, or from the "Package Resource Dependencies" screen by selecting a resource and clicking "Modify Resource Dependencies..." you get a screen similar to Figure 1-9 “Resource Parameters Screen”.

To make a package dependent on an EMS resource, select a Resource Up Value from the list of Available Resource Values, then click "Add." The example in Figure 1-9 “Resource Parameters Screen” shows the possible values for pv_summary. Different resources show different available "Up" values.

NOTE: Make sure you always select UP as one of the UP values. MC/ServiceGuard creates an EMS request that sends an event if the resource value is not equal to the UP value.

If you select UP, the package fails over if the value is anything but UP. If you select UP and PVG_UP, the package fails over if the pv_summary value is neither UP nor PVG_UP; in other words, if pv_summary is SUSPECT or DOWN.
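
The failover rule described above amounts to a membership test on the selected UP values. The sketch below illustrates that logic only; the function name is hypothetical and this is not how MC/ServiceGuard is implemented.

```python
def package_fails_over(resource_value, up_values):
    """Return True if the value is not one of the configured UP values."""
    return resource_value not in up_values


# Only UP selected: any other value, including PVG_UP, causes a failover.
print(package_fails_over("PVG_UP", {"UP"}))             # True
# UP and PVG_UP selected: only the remaining values cause a failover.
print(package_fails_over("PVG_UP", {"UP", "PVG_UP"}))   # False
print(package_fails_over("SUSPECT", {"UP", "PVG_UP"}))  # True
```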

The polling interval determines the maximum amount of elapsed time before the monitor knows about a change in resource status. For critical resources, you may want to set a short polling interval, for example 30 seconds, which could adversely affect system performance. With longer polling intervals you gain system performance, but risk not detecting problems soon enough.

Figure 1-9 Resource Parameters Screen

You can also add resources as package dependencies by modifying the package configuration file in /etc/cmcluster/pkg.ascii. See Managing MC/ServiceGuard for details on how to modify this file. An example of the syntax is:

RESOURCE_NAME              /vg/vg01/pv_summary
RESOURCE_POLLING_INTERVAL  60
RESOURCE_UP_VALUE          = UP
RESOURCE_UP_VALUE          = PVG_UP
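
To review the dependencies in an existing configuration file, a small script can collect the resource entries. The sketch below handles only the three keywords shown in the example and assumes that "#" begins a comment; the function name is hypothetical, and the full file syntax is described in Managing MC/ServiceGuard.

```python
def parse_resources(text):
    """Collect RESOURCE_* entries from package configuration text."""
    resources = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # assumed: '#' starts a comment
        if not line:
            continue
        parts = line.split(None, 1)
        keyword = parts[0]
        rest = parts[1].strip() if len(parts) > 1 else ""
        if keyword == "RESOURCE_NAME":
            # Each RESOURCE_NAME starts a new dependency entry.
            resources.append({"name": rest, "polling_interval": None, "up_values": []})
        elif keyword == "RESOURCE_POLLING_INTERVAL" and resources:
            resources[-1]["polling_interval"] = int(rest)
        elif keyword == "RESOURCE_UP_VALUE" and resources:
            # The value is written as an operator followed by a value.
            op, value = rest.split(None, 1)
            resources[-1]["up_values"].append((op, value))
    return resources


sample = """\
RESOURCE_NAME              /vg/vg01/pv_summary
RESOURCE_POLLING_INTERVAL  60
RESOURCE_UP_VALUE          = UP
RESOURCE_UP_VALUE          = PVG_UP
"""
print(parse_resources(sample))
```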
© 1997 Hewlett-Packard Development Company, L.P.