Configuring OPS Clusters with MC/LockManager: Chapter 4, Planning and Documenting an OPS Cluster

Package Configuration Planning

Planning for packages involves assembling information about each group of highly available services. Some of this information is used in creating the package configuration file, and some is used for editing the package control script.

NOTE: Volume groups that are to be activated by packages must also be defined as cluster aware in the cluster configuration file. See the previous section, "Cluster Configuration Planning."

Logical Volume and Filesystem Planning

You may need to use logical volumes in volume groups as part of the infrastructure for package operations on a cluster. When the package moves from one node to another, it must be able to access data residing on the same disk as on the previous node. This is accomplished by activating the volume group and mounting the file system that resides on it.

In MC/LockManager, high availability applications, services, and data are located in volume groups that are on a shared bus. When a node fails, the volume groups containing the applications, services, and data of the failed node are deactivated on the failed node and activated on the adoptive node. In order to do this, you have to configure the volume groups so that they can be transferred from the failed node to the adoptive node.

As part of planning, you need to decide the following:

  • What volume groups are needed?

  • How much disk space is required, and how should this be allocated in logical volumes?

  • What file systems need to be mounted for each package?

  • Which nodes need to import which logical volume configurations?

  • If a package moves to an adoptive node, what effect will its presence have on performance?

Create a list by package of volume groups, logical volumes, and file systems. Indicate which nodes need to have access to common filesystems at different times.

It is recommended that you use customized logical volume names that are different from the default logical volume names (lvol1, lvol2, etc.). Choosing logical volume names that represent the high availability applications that they are associated with (for example, lvoldatabase) will simplify cluster administration.

To further document your package-related volume groups, logical volumes, and file systems on each node, you can add commented lines to the /etc/fstab file. The following is an example for a database application:

# /dev/vg01/lvoldb1 /applic1 vxfs defaults 0 1   # These six entries are
# /dev/vg01/lvoldb2 /applic2 vxfs defaults 0 1   # for information purposes
# /dev/vg01/lvoldb3 raw_tables ignore ignore 0 0 # only. They record the
# /dev/vg01/lvoldb4 /general vxfs defaults 0 2   # logical volumes that
# /dev/vg01/lvoldb5 raw_free ignore ignore 0 0   # exist for MC/LockManager's
# /dev/vg01/lvoldb6 raw_free ignore ignore 0 0   # HA package. Do not uncomment.

Create an entry for each logical volume, indicating its use for a file system or for a raw device.
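
If you keep documentation entries like the ones above, a small script can turn them into the per-package list of logical volumes. The following is a minimal sketch only. The helper name parse_documented_volumes is hypothetical, and treating a file system type of "ignore" as a raw device is an assumption taken from the sample lines, not a rule of MC/LockManager.

```python
# Sketch: summarize commented /etc/fstab documentation lines.
# SAMPLE is copied from the example above; the helper is hypothetical.

SAMPLE = """\
# /dev/vg01/lvoldb1 /applic1 vxfs defaults 0 1   # These six entries are
# /dev/vg01/lvoldb2 /applic2 vxfs defaults 0 1   # for information purposes
# /dev/vg01/lvoldb3 raw_tables ignore ignore 0 0 # only. They record the
# /dev/vg01/lvoldb4 /general vxfs defaults 0 2   # logical volumes that
# /dev/vg01/lvoldb5 raw_free ignore ignore 0 0   # exist for MC/LockManager's
# /dev/vg01/lvoldb6 raw_free ignore ignore 0 0   # HA package. Do not uncomment.
"""


def parse_documented_volumes(text):
    """Return (device, use, kind) tuples from commented fstab-style lines."""
    entries = []
    for line in text.splitlines():
        # Drop the leading comment marker, then any trailing remark.
        body = line.lstrip("# ").split("#", 1)[0]
        fields = body.split()
        if len(fields) < 3 or not fields[0].startswith("/dev/"):
            continue
        device, use, fstype = fields[0], fields[1], fields[2]
        # Assumption: "ignore" in the type field marks a raw device.
        kind = "raw" if fstype == "ignore" else "filesystem"
        entries.append((device, use, kind))
    return entries


volumes = parse_documented_volumes(SAMPLE)
for device, use, kind in volumes:
    print(f"{device:20} {kind:10} {use}")
```

The script only reads text. It does not mount anything, which is consistent with keeping these entries commented out.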

CAUTION: Do not use /etc/fstab to mount file systems that are used by MC/LockManager packages.

Details about creating, exporting, and importing volume groups in MC/LockManager are given in Chapter 5, "Building an OPS Cluster Configuration."

Monitoring Registered Package Resources

MC/LockManager has access to a registry of resources that can be monitored as package dependencies. The registry is the core of the Event Monitoring Service (EMS). Once an EMS registered resource is configured as a package dependency, MC/LockManager can fail a package to another node based on messages the resource's monitor returns. Monitors for individual resources may be provided by hardware or software vendors from time to time. A specific group of HA EMS monitors for disk, LAN, and system status information is available from HP as a separate product. Refer to the manual Using High Availability Monitors (B5736-90012) for additional information.

You can specify a registered resource for a package by selecting it from the list of available resources displayed in the SAM package configuration area. The size of the list displayed by SAM depends on which resource monitors have been registered on your system. Alternatively, you can obtain information about registered resources on your system by using the command /opt/resmon/bin/resls. For additional information, refer to the man page for resls(1m).

Choosing Switching and Failover Behavior

Switching IP addresses from a failed LAN card to a standby LAN card on the same physical subnet may take place if Automatic Switching is set to Enabled in SAM (NET_SWITCHING_ENABLED set to YES in the ASCII package configuration file). Automatic Switching Enabled is the default.

To determine failover behavior, you can define a failover policy that governs which nodes will automatically start up a package that is not running. In addition, you can define a failback policy that determines whether a package will be automatically returned to its primary node when that is possible.

The following table describes different types of failover behavior and the settings in SAM or in the ASCII package configuration file that determine each behavior.

Table 4-2 Package Failover Behavior

Switching Behavior: Package IP address switches to standby LAN card transparently on LAN card failure.

Options in SAM:

  • Automatic Switching set to Enabled for the package. (Default)

Parameters in ASCII File:

  • NET_SWITCHING_ENABLED set to YES for the package. (Default)

Switching Behavior: Package switches normally after detection of failure or report of an EMS monitor event showing that a resource on which the package depends is down. Halt script runs before switch takes place. (Default behavior)

Options in SAM:

  • Package Failfast set to Disabled. (Default)

  • Service Failfast set to Disabled for all services. (Default)

  • Automatic Switching set to Enabled for the package. (Default)

Parameters in ASCII File:

  • NODE_FAIL_FAST_ENABLED set to NO. (Default)

  • SERVICE_FAIL_FAST_ENABLED set to NO for all services. (Default)

  • PKG_SWITCHING_ENABLED set to YES for the package. (Default)

Switching Behavior: Package fails over to the node with the fewest active packages.

Options in SAM:

  • Failover policy set to Minimum Package Node.

Parameters in ASCII File:

  • FAILOVER_POLICY set to MIN_PACKAGE_NODE.

Switching Behavior: Package fails over to the node that is next on the list of nodes. (Default behavior)

Options in SAM:

  • Failover policy set to Configured Node.

Parameters in ASCII File:

  • FAILOVER_POLICY set to CONFIGURED_NODE.

Switching Behavior: Package is automatically halted and restarted on its primary node if the primary node is available and the package is running on a non-primary node.

Options in SAM:

  • Failback policy set to Automatic.

Parameters in ASCII File:

  • FAILBACK_POLICY set to AUTOMATIC.

Switching Behavior: If desired, package must be manually returned to its primary node if it is running on a non-primary node.

Options in SAM:

  • Failback policy set to Manual.

Parameters in ASCII File:

  • FAILBACK_POLICY set to MANUAL.

Switching Behavior: All packages switch following a TOC (Transfer of Control, an immediate halt without a graceful shutdown) on the node when a specific service fails. Halt scripts are not run.

Options in SAM:

  • Package Failfast set to Disabled.

  • Service Failfast set to Enabled for a specific service.

  • Automatic Switching set to Enabled for all packages.

Parameters in ASCII File:

  • NODE_FAIL_FAST_ENABLED set to NO.

  • SERVICE_FAIL_FAST_ENABLED set to YES for a specific service.

  • PKG_SWITCHING_ENABLED set to YES for all packages.

Switching Behavior: All packages switch following a TOC on the node when any service fails.

Options in SAM:

  • Package Failfast set to Disabled.

  • Service Failfast set to Enabled for all services.

  • Automatic Switching set to Enabled for all packages.

Parameters in ASCII File:

  • NODE_FAIL_FAST_ENABLED set to NO.

  • SERVICE_FAIL_FAST_ENABLED set to YES for all services.

  • PKG_SWITCHING_ENABLED set to YES for all packages.

Switching Behavior: All packages switch following a TOC on the node when the run or halt script exits with an error other than 0 or 1. This may be caused by an EMS monitor event showing that a resource is down.

Options in SAM:

  • Package Failfast set to Enabled.

  • Automatic Switching set to Enabled for all packages.

Parameters in ASCII File:

  • NODE_FAIL_FAST_ENABLED set to YES.

  • PKG_SWITCHING_ENABLED set to YES for all packages.

Package Configuration File Parameters

Before you generate the package configuration file, assemble the following package configuration data. The parameter names given below are the names that appear in SAM. The names coded in the ASCII package configuration file appear at the end of each entry. The following parameters must be identified and entered on the worksheet for each package:

Package Name

The name of the package. The package name must be unique in the cluster. It is used to start, stop, modify, and view the package.

The package name must not contain any of the following illegal characters: '/', '\', and '*'. All other characters are legal.

In the ASCII package configuration file, this parameter is known as PACKAGE_NAME.
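
When filling out worksheets for many packages, it can help to check names before they are entered. The following is a minimal sketch of the character rule stated above. The function name is hypothetical, and rejecting an empty name is an added assumption.

```python
# Sketch: check a name against the three illegal characters listed above.
ILLEGAL_CHARS = set("/\\*")


def is_legal_name(name):
    """True if the name is non-empty and has none of '/', '\\', '*'."""
    # Assumption: an empty name is treated as invalid.
    return bool(name) and not (set(name) & ILLEGAL_CHARS)


for candidate in ["pkg1", "pkg/1", "pkg*"]:
    print(candidate, is_legal_name(candidate))
```

The check covers only the character rule. Uniqueness within the cluster still has to be confirmed against the other package worksheets.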

Package Failover Policy

The policy used by the package manager to select the node on which to run the package whenever the package is automatically started. The default is CONFIGURED_NODE, which means the next available node in the list of node names for the package. The order of node name entries dictates the order of preference when selecting the node. The alternate policy is MIN_PACKAGE_NODE, which means the node from the list that is running the fewest other packages at the time this package is to be started.

In the ASCII package configuration file, this parameter is known as FAILOVER_POLICY.
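
The difference between the two policies can be illustrated with a short planning sketch. This is not how the package manager is implemented. The function name, the node ftsys11, and the package counts are hypothetical, and breaking ties by list order under MIN_PACKAGE_NODE is an assumption.

```python
# Sketch: which node each failover policy would pick, for planning purposes.

def choose_node(policy, node_list, available, package_counts):
    """Return the node on which to start the package, or None.

    node_list      -- node names in configured order (primary first)
    available      -- set of nodes currently able to run the package
    package_counts -- dict of node name -> number of running packages
    """
    candidates = [n for n in node_list if n in available]
    if not candidates:
        return None
    if policy == "CONFIGURED_NODE":
        # Next available node in the configured order.
        return candidates[0]
    if policy == "MIN_PACKAGE_NODE":
        # Fewest running packages; min() keeps list order on ties (assumption).
        return min(candidates, key=lambda n: package_counts.get(n, 0))
    raise ValueError(f"unknown failover policy: {policy}")


nodes = ["ftsys9", "ftsys10", "ftsys11"]       # ftsys11 is hypothetical
up = {"ftsys10", "ftsys11"}                    # ftsys9 has failed
counts = {"ftsys10": 3, "ftsys11": 1}          # illustrative values

print(choose_node("CONFIGURED_NODE", nodes, up, counts))
print(choose_node("MIN_PACKAGE_NODE", nodes, up, counts))
```

With these sample values, CONFIGURED_NODE selects ftsys10 because it comes first in the list, while MIN_PACKAGE_NODE selects ftsys11 because it is running fewer packages.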

Failback Policy

The policy used to determine what action the package manager should take if the package is not running on its primary node and its primary node is capable of running the package. The default is MANUAL, which means no attempt will be made to move the package back to its primary node when it is running on an alternate node. The alternate policy is AUTOMATIC, which means that the package will be halted and restarted on its primary node as soon as the primary node is capable of running the package and, if MIN_PACKAGE_NODE is the Package Failover Policy, is running fewer packages than the current node.

In the ASCII package configuration file, this parameter is known as FAILBACK_POLICY.
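
The conditions for automatic failback can be written out as a short decision function. This is a sketch of the rules described above for planning purposes. The function and argument names are hypothetical, and whether the current node's count includes the package being moved is not stated in this section, so the sketch simply compares the two counts as given.

```python
# Sketch: would the package be returned to its primary node automatically?

def should_fail_back(failback_policy, failover_policy, current_node,
                     primary_node, primary_available, package_counts):
    """Apply the failback rules described above to one package."""
    if failback_policy != "AUTOMATIC":
        return False                      # MANUAL: never moved automatically
    if current_node == primary_node or not primary_available:
        return False                      # nothing to do, or nowhere to go
    if failover_policy == "MIN_PACKAGE_NODE":
        # Primary must be running fewer packages than the current node.
        return (package_counts.get(primary_node, 0)
                < package_counts.get(current_node, 0))
    return True


counts = {"ftsys9": 1, "ftsys10": 3}      # illustrative values
print(should_fail_back("AUTOMATIC", "MIN_PACKAGE_NODE",
                       "ftsys10", "ftsys9", True, counts))
print(should_fail_back("MANUAL", "CONFIGURED_NODE",
                       "ftsys10", "ftsys9", True, counts))
```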

Node Name

The names of primary and alternate nodes for the package, e.g., ftsys9 and ftsys10. The order in which you specify the node names is important. First list the primary node name, then the first adoptive node name, then the second adoptive node name, followed, in order, by additional node names. Ownership of a package may be transferred to the next adoptive node name listed in the package configuration file.

In the ASCII package configuration file, this parameter is known as NODE_NAME.

Control Script Pathname

Enter the full pathname of the package control script. (The script must reside in a directory that contains the string "cmcluster".) It is recommended that you use the same script as both the run and halt script. This script will contain both your package run instructions and your package halt instructions. When the package starts, its run script is executed and passed the parameter 'start'; similarly, at package halt time, the halt script is executed and passed the parameter 'stop'.

In the ASCII package configuration file, this parameter maps to the two separate parameters named RUN_SCRIPT and HALT_SCRIPT. Use the name of the single control script as the name of the RUN_SCRIPT and the HALT_SCRIPT in the ASCII file.

If you wish to separate the package run instructions and package halt instructions into separate scripts, the package configuration file allows you to do this by naming two separate scripts. However, under most conditions, it is simpler to combine your run and halt instructions into a single package control script and repeat its name for both the RUN_SCRIPT and the HALT_SCRIPT. Ensure that the script is executable.

NOTE: If you choose to write separate package run and halt scripts, be sure to include identical configuration information (such as node names, IP addresses, etc.) in both scripts.

Run Script Timeout and Halt Script Timeout

Enter a number of seconds. If the script has not completed by the specified timeout value, MC/LockManager will terminate the script. The default is 0, or no timeout.

If the timeout is exceeded:

  • Control of the package will not be transferred.

  • The run or halt instructions will not be run.

  • Global switching will be disabled.

  • The current node will be disabled from running the package.

  • The control script will exit with status 1.

In the ASCII package configuration file, these parameters are called RUN_SCRIPT_TIMEOUT and HALT_SCRIPT_TIMEOUT. The default for both is 0 or NO_TIMEOUT. In the ASCII file, these parameters are entered in microseconds.

If the halt script timeout occurs, you may need to perform manual cleanup. See the section "Reviewing the Package Control Script" in Chapter 8, "Troubleshooting Your Cluster."

Service Name

Enter a unique name for each service. You can configure a maximum of 30 services per package.

In the ASCII package configuration file, this parameter is called SERVICE_NAME. Define one SERVICE_NAME entry for each service.

Service Fail Fast

Enter Enabled or Disabled for each service. This parameter indicates whether or not the failure of a service results in the failure of a node. If the parameter is set to Enabled, in the event of a service failure, MC/LockManager will halt the node on which the service is running with a TOC. The default is Disabled.

In the ASCII package configuration file, this parameter is SERVICE_FAIL_FAST_ENABLED, and possible values are YES and NO. The default is NO. Define one SERVICE_FAIL_FAST_ENABLED entry for each service.

The service name must not contain any of the following illegal characters: '/', '\', and '*'. All other characters are legal.

Service Halt Timeout

In the event of a service halt, MC/LockManager will first send out a SIGTERM signal to terminate the service. If the process is not terminated, MC/LockManager will wait for the specified timeout before sending out the SIGKILL signal to force process termination. Default is 300 seconds (5 minutes).

In the ASCII package configuration file, this parameter is SERVICE_HALT_TIMEOUT. Define one SERVICE_HALT_TIMEOUT entry for each service.
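
The SIGTERM, wait, SIGKILL sequence described above is a common pattern for stopping a process. The following sketch shows the same pattern using Python's subprocess module on a POSIX system. It illustrates the sequence only and is not MC/LockManager code. The function name halt_service and the sleeping child process are hypothetical.

```python
# Sketch: terminate a process politely, then forcibly after a timeout.
import subprocess
import sys


def halt_service(proc, halt_timeout):
    """Send SIGTERM, wait up to halt_timeout seconds, then send SIGKILL."""
    proc.terminate()                      # SIGTERM on POSIX
    try:
        proc.wait(timeout=halt_timeout)
    except subprocess.TimeoutExpired:
        proc.kill()                       # SIGKILL on POSIX
        proc.wait()
    return proc.returncode


# Stand-in for a service: a child process that just sleeps.
child = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(60)"])
status = halt_service(child, halt_timeout=5)
print(status)
```

A short halt timeout forces termination sooner, while a long one gives a service more time to shut down cleanly. When planning the value, consider how long each service normally needs to stop.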

Subnet

Enter the IP subnets that are to be monitored for the package.

In the ASCII package configuration file, this parameter is called SUBNET.

Resource Name

The name of a resource that is to be monitored by MC/LockManager as a package dependency. A resource name is the name of an important attribute of a particular system resource. The resource name includes the entire hierarchy of resource class and subclass within which the resource exists on a system.

In the ASCII package configuration file, this parameter is called RESOURCE_NAME. Obtain the resource name from the list provided in SAM, or obtain it from the documentation supplied with the resource monitor.

A maximum of 60 resources may be defined per cluster. Note also the limit on Resource Up Values described below.

Resource Polling Interval

The frequency of monitoring an additional package resource. The default is 60 seconds. In the ASCII package configuration file, this parameter is called RESOURCE_POLLING_INTERVAL. The Resource Polling Interval appears on the list provided in SAM, or you can obtain it from the documentation supplied with the resource monitor.

Resource Up Value

The criteria for judging whether an additional package resource has failed or not. In the ASCII package configuration file, this parameter is called RESOURCE_UP_VALUE. The Resource Up Value appears on the list provided in SAM, or you can obtain it from the documentation supplied with the resource monitor.

You can configure a total of 15 Resource Up Values per package. For example, if there is only one resource in the package, then a maximum of 15 Resource Up Values can be defined. If there are two Resource Names defined and one of them has 10 Resource Up Values, then the other Resource Name can have only 5 Resource Up Values.
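
The arithmetic in the example above can be checked with a few lines of code while filling out the worksheet. This is a sketch only. The function name, the resource names, and the placeholder up values are hypothetical.

```python
# Sketch: count Resource Up Values against the per-package limit of 15.
MAX_UP_VALUES_PER_PACKAGE = 15


def remaining_up_values(resources):
    """resources maps each resource name to its list of up values."""
    used = sum(len(values) for values in resources.values())
    if used > MAX_UP_VALUES_PER_PACKAGE:
        raise ValueError(
            f"{used} Resource Up Values exceed the limit of "
            f"{MAX_UP_VALUES_PER_PACKAGE} per package")
    return MAX_UP_VALUES_PER_PACKAGE - used


# Placeholder names and values, for illustration only.
package_resources = {
    "resource_a": [f"value_{i}" for i in range(10)],
    "resource_b": [],
}
print(remaining_up_values(package_resources))
```

With ten values on the first resource, the sketch reports that five remain for the second, which matches the example in the text.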

Automatic Switching

Enter Enabled or Disabled. The default is Enabled, which allows a package to start up normally on a cluster node. In the event of a failure, a value of Enabled permits MC/LockManager to transfer the package to an adoptive node. If this parameter is set to Disabled, the package will not start up automatically when the cluster starts running.

In the ASCII package configuration file, this parameter is called PKG_SWITCHING_ENABLED, and possible values are YES and NO. The default is YES. If this parameter is set to NO, the package will not start up automatically when the cluster starts running.

Local Switching

Enter Enabled or Disabled. In the event of a failure, this permits MC/LockManager to switch LANs locally, that is, transfer to a standby LAN card. The default is Enabled.

In the ASCII package configuration file, this parameter is called NET_SWITCHING_ENABLED, and possible values are YES and NO. The default is YES.

Package Fail Fast Enabled

If this parameter is set to Enabled, MC/LockManager will issue a TOC on the node where the control script fails whenever the control script itself fails, a subnet fails, or an EMS monitor event reports that a resource is down. The default is Disabled.

In the ASCII package configuration file, this parameter is called NODE_FAIL_FAST_ENABLED, and possible values are YES and NO. The default is NO.

Package Control Script Variables

The control script that accompanies each package must also be edited to assign values to a set of variables. The following variables must be set:

Volume Groups, Logical Volumes, File Systems and Mount Options

Determine the filesystems and corresponding logical volumes within the volume groups required. Example:

pkg1 requires /dev/vg01/lvol1 mounted on /vg01

Indicate the names of volume groups that are to be activated and deactivated, together with the logical volumes and file systems that are to be mounted. You can also specify options that are to be used with the HP-UX mount command. On starting the package, the script activates a volume group, and it may mount logical volumes onto file systems. At halt time, the script unmounts the file systems and deactivates each volume group. All volume groups must be accessible on each target node.

In the ASCII package control script, these variables are arrays, as follows: VG, LV, FS and LV_MOUNT_COUNT. For each file system (FS), you must identify a logical volume (LV). Include as many volume groups (VGs) as needed. If you are using raw files, the LV, FS, and LV_MOUNT_COUNT entries are not needed.

Only cluster aware volume groups should be specified in package control scripts. To make a volume group cluster aware, enter it as part of the cluster configuration. See the section above, "Cluster Configuration Planning."
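
Before the values are entered in the control script, the planned VG, LV, and FS entries can be checked for consistency. The following sketch uses the sample values from the worksheet in this chapter. The function name is hypothetical, and the rule that each logical volume path must fall under one of the listed volume groups is an assumption added for illustration.

```python
# Sketch: consistency check for planned VG, LV and FS worksheet entries.

def check_storage_arrays(vg, lv, fs):
    """Return a list of problems found; an empty list means none."""
    problems = []
    if len(lv) != len(fs):
        problems.append(
            f"{len(fs)} file systems but {len(lv)} logical volumes")
    for i, volume in enumerate(lv):
        # Assumption: an LV path starts with the path of its volume group.
        if not any(volume.startswith(g.rstrip("/") + "/") for g in vg):
            problems.append(
                f"LV[{i}] {volume} is not in a listed volume group")
    return problems


vg = ["/dev/vg01"]
lv = ["/dev/vg01/lvol1"]
fs = ["/mnt1"]
print(check_storage_arrays(vg, lv, fs))
```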

IP Addresses and SUBNETs

These are the IP addresses by which a package is mapped to a LAN card. Indicate the IP addresses and subnets for each IP address you want to add to an interface card. The Subnet is the IP address logically ANDed with the subnet mask.

In the ASCII package control script, these variables are entered in pairs. Example: IP[0]=192.10.25.12 and SUBNET[0]=192.10.25.0. (In this case the subnet mask is 255.255.255.0.)
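
The subnet value for the worksheet can be computed from the IP address and subnet mask. The following is a minimal sketch using Python's standard ipaddress module. The function name subnet_of is hypothetical.

```python
# Sketch: derive the subnet by ANDing the IP address with the subnet mask.
import ipaddress


def subnet_of(ip, mask):
    """Return the subnet as a dotted-decimal string."""
    ip_bits = int(ipaddress.IPv4Address(ip))
    mask_bits = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(ip_bits & mask_bits))


print(subnet_of("192.10.25.12", "255.255.255.0"))  # prints 192.10.25.0
```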

Service Name

Enter a unique name for each specific service within the package. All services are monitored by MC/LockManager. The service name, service command, and service restart parameters are entered in the package control script in groups of three. You may specify as many service names as you need. Each name must be unique within the cluster. The service name is the name used by cmrunserv and cmhaltserv inside the package control script.

In the ASCII package control script, enter values into an array known as SERVICE_NAME. Enter one service name for each service.

Service Command

For each named service, enter a service command. This command will be executed through the control script by means of the cmrunserv command.

In the ASCII package control script, enter values into an array known as SERVICE_CMD. Enter one service command string for each service.

Service Restart Parameter

Enter a number of restarts. One valid form of the parameter is -r n where n is a number of retries. A value of -r 0 indicates no retries. A value of -R indicates an infinite number of retries. The default is 0, or no restarts.

In the ASCII package control script, enter values into an array known as SERVICE_RESTART. Enter one restart value for each service.
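
The forms of the restart parameter described above can be summarized in a small parser, which may be useful for checking worksheet entries. This is a sketch, not part of the product. The function name is hypothetical, and representing the default as an empty string is an assumption.

```python
# Sketch: interpret a service restart value from the worksheet.
import math


def parse_restart(value):
    """Return the number of restarts: an int, or math.inf for -R."""
    text = value.strip()
    if text == "":
        return 0                          # assumed form of the default
    if text == "-R":
        return math.inf                   # infinite number of retries
    parts = text.split()
    if len(parts) == 2 and parts[0] == "-r" and parts[1].isdigit():
        return int(parts[1])              # -r n: n retries
    raise ValueError(f"unrecognized restart value: {value!r}")


for entry in ["-r 2", "-r 0", "-R", ""]:
    print(repr(entry), parse_restart(entry))
```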

For information on using a DTC with MC/LockManager, see the chapter "Configuring DTC Manager for Operation with MC/ServiceGuard" in the manual Using the HP DTC Manager/UX.

The package control script will clean up the environment and undo the operations in the event of an error.

Package Configuration Worksheet

Assemble your package configuration and control script data in a separate worksheet for each package. Figure 4-9, "Package Configuration Worksheet," is a sample of a completed worksheet. Refer to Appendix E, "Blank Planning Worksheets," for blank worksheets. Make as many copies as you need. Fill out the worksheet and keep it for future reference.

Figure 4-9 Package Configuration Worksheet

===============================================================================
Package Configuration File Data:
===============================================================================
Package Name: ______pkg11_______________
Failover Policy: _________________     Failback Policy: __AUTOMATIC____
Primary Node: ______ftsys9_______________ 
First Failover Node:____ftsys10_______________
Second Failover Node:_________________________________
Package Run Script: __/etc/cmcluster/pkg1/control.sh__Timeout: _NO_TIMEOUT_
Package Halt Script: __/etc/cmcluster/pkg1/control.sh_Timeout: _NO_TIMEOUT_
Package Switching Enabled?  __YES___ Local Switching Enabled?  ___YES__
Node Failfast Enabled?      ____NO____
Additional Package Resource:
Resource Name:________ Polling Interval_______ Resource UP Value___________
===============================================================================
Package Control Script Data:
===============================================================================
VG[0]___/dev/vg01 __LV[0]__/dev/vg01/lvol1__FS[0]____/mnt1___FS_MOUNT_OPT[0]____
VG[1]_______________LV[1]___________________FS[1]____________FS_MOUNT_OPT[1]____
VG[2]_____________LV[2]__________________FS[2]__________FS_MOUNT_OPT[2]____
    IP[0] ___15.13.171.14 ______________ SUBNET[0]_______15.13.168________
    IP[1] ______________________________ SUBNET[1]________________________
    X.25 Resource Name _________________
Service Name: __Svc1____ Run Command: __/usr/bin/MySvc -f_____Retries: _-r 2__
    Service Fail Fast Enabled? ___NO___Service Halt Timeout __NO_TIMEOUT_______
Service Name: __________ Run Command: _______________________ Retries: ________
    Service Fail Fast Enabled? _________Service Halt Timeout __________
© 1999 Hewlett-Packard Development Company, L.P.