Managing Serviceguard Version A.11.16, Eleventh Edition, Second Printing > Chapter 5  Building an HA Cluster Configuration

Managing the Running Cluster

This section describes some approaches to routine management of the cluster. Additional tools and suggestions are found in Chapter 7, “Cluster and Package Maintenance.”

Checking Cluster Operation with Serviceguard Manager

Serviceguard Manager lets you see all the nodes and packages within a cluster and displays their current status. Refer to the section on “Using Serviceguard Manager” in Chapter 7. You can check configuration and status information using Serviceguard Manager:

  • You can see if all configured nodes are running.

  • You can check that all configured packages are running, and see what nodes they are running on.

  • You can get more information from the property sheets for cluster, nodes, and packages.

When you create or modify a package or cluster configuration, you can start the cluster running and archive its configuration in a Serviceguard Manager (.sgm) file. The data in this file can be compared with later versions of the cluster to understand the changes that are made over time. It will be particularly useful in troubleshooting to compare this file to a problem cluster.

There are several administrative commands you can use through Serviceguard Manager, if the Session Server node and the target node both have Serviceguard version A.11.12 or later installed.

Checking Cluster Operation with Serviceguard Commands

Serviceguard also provides several commands for control of the cluster:

  • cmviewcl checks status of the cluster and many of its components. A non-root user with the role of Monitor can run this command from a cluster node or see status information in Serviceguard Manager.

  • cmrunnode is used to start a node. A non-root user with the role of Full Admin can run this command from a cluster node or through Serviceguard Manager.

  • cmhaltnode is used to manually stop a running node. (This command is also used by shutdown(1m).) A non-root user with the role of Full Admin can run this command from a cluster node or through Serviceguard Manager.

  • cmruncl is used to manually start a stopped cluster. A non-root user with the role of Full Admin can run this command from a cluster node or through Serviceguard Manager.

  • cmhaltcl is used to manually stop a cluster. A non-root user with the role of Full Admin can run this command from a cluster node or through Serviceguard Manager.

You can use these commands to test cluster operation, as in the following:

  1. If the cluster is not already online, start it. From the Serviceguard Manager menu, choose Run Cluster. From the command line, use cmruncl -v.

    By default, cmruncl checks the networks: Serviceguard compares the actual network configuration with the network information in the cluster configuration. If you do not need this validation, use cmruncl -v -w none instead to turn off validation and save time.

  2. When the cluster has started, make sure that cluster components are operating correctly. In Serviceguard Manager, open the cluster on the map or tree, and perhaps check its Properties. On the command line, use the cmviewcl -v command.

    Make sure that all nodes and networks are functioning as expected. For more information, refer to the chapter on “Cluster and Package Maintenance.”

  3. Verify that nodes leave and enter the cluster as expected using the following steps:

    • Halt the node. In the Serviceguard Manager menu, use Halt Node. On the command line, use the cmhaltnode command.

    • Check the cluster membership on the map or tree to verify that the node has left the cluster. In Serviceguard Manager, open the map or tree or Cluster Properties. On the command line, use the cmviewcl command.

    • Start the node. In Serviceguard Manager use the Run Node command. On the command line, use the cmrunnode command.

    • To verify that the node has returned to operation, check the Serviceguard Manager map or tree, or use the cmviewcl command again.

  4. Bring down the cluster. In Serviceguard Manager, use the Halt Cluster command. On the command line, use the cmhaltcl -v -f command.
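
The command-line side of steps 1 through 4 can be sketched as a small POSIX shell function. This is only an illustration, not a supported procedure: the function names, the DRY_RUN switch, and the node name passed as an argument are assumptions introduced here. The sketch also assumes that the cluster is down when it starts and that no packages are running on the node being halted.

```shell
# Sketch of the cluster test sequence described above.
# run and test_cluster are hypothetical helper names, not Serviceguard commands.
# Set DRY_RUN=1 to print each command instead of executing it.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# test_cluster NODE
# NODE is the node to halt and restart in step 3.
test_cluster() {
    node=$1
    run cmruncl -v          || return 1   # step 1: start the cluster
    run cmviewcl -v         || return 1   # step 2: check cluster components
    run cmhaltnode "$node"  || return 1   # step 3: halt one node
    run cmviewcl            || return 1   #         confirm the node has left
    run cmrunnode "$node"   || return 1   #         start the node again
    run cmviewcl            || return 1   #         confirm the node has returned
    run cmhaltcl -v -f                    # step 4: bring down the cluster
}
```

For example, running `DRY_RUN=1 test_cluster node1` prints the seven commands in order without contacting the cluster, which is a convenient way to review the sequence before running it for real. The output of each cmviewcl call still has to be read by the administrator; the sketch does not interpret it.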

Additional cluster testing is described in the “Troubleshooting” chapter. Refer to Appendix A for a complete list of Serviceguard commands. Refer to the Serviceguard Manager Help for a list of Serviceguard Administrative commands.

Preventing Automatic Activation of Volume Groups

It is important to prevent LVM volume groups that are to be used in packages from being activated at system boot time by the /etc/lvmrc file. To ensure that this does not happen, edit the /etc/lvmrc file on all nodes. Set AUTO_VG_ACTIVATE to 0, then include all the volume groups that are not cluster bound in the custom_vg_activation function. Volume groups that will be used by packages should not be included anywhere in the file, since they will be activated and deactivated by control scripts.

NOTE: The root volume group does not need to be included in the custom_vg_activation function, since it is automatically activated before the /etc/lvmrc file is used at boot time.
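
As a rough aid, the two conditions above can be checked with a short shell function. This is a sketch under stated assumptions: check_lvmrc is a hypothetical helper name, the volume group names are supplied by the caller, and the match is a simple fixed-string search that ignores comment lines.

```shell
# check_lvmrc FILE [PACKAGE_VG ...]
# Hypothetical helper: returns 0 if FILE sets AUTO_VG_ACTIVATE to 0 and
# mentions none of the listed package volume groups outside of comments.
check_lvmrc() {
    lvmrc=$1
    shift
    rc=0
    # AUTO_VG_ACTIVATE must be set to 0 on an uncommented line.
    if ! grep -q '^[[:space:]]*AUTO_VG_ACTIVATE=0' "$lvmrc"; then
        echo "AUTO_VG_ACTIVATE is not set to 0 in $lvmrc"
        rc=1
    fi
    for vg in "$@"; do
        # Drop comment lines, then search for the volume group name.
        # This is a substring match, so /dev/vg01 also matches /dev/vg010.
        if grep -v '^[[:space:]]*#' "$lvmrc" | grep -F -q -- "$vg"; then
            echo "package volume group $vg is referenced in $lvmrc"
            rc=1
        fi
    done
    return $rc
}
```

A call such as `check_lvmrc /etc/lvmrc /dev/vgpkg1 /dev/vgpkg2` (the volume group names here are placeholders) would be run on every node, since the file must be edited on all of them.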

Setting up Autostart Features

Automatic startup is the process in which each node individually joins a cluster; Serviceguard provides a startup script to control the startup process. Automatic cluster start is the preferred way to start a cluster. No action is required by the system administrator.

There are three cases:

  • The cluster is not running on any node, all cluster nodes are reachable, and all are attempting to start up. In this case, the node attempts to form a cluster consisting of all configured nodes.

  • The cluster is already running on at least one node. In this case, the node attempts to join that cluster.

  • Neither is true: the cluster is not running on any node, and not all the nodes are reachable and trying to start. In this case, the node keeps trying to start for the AUTO_START_TIMEOUT period. If neither of the first two cases becomes true in that time, startup fails.
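
The three cases amount to a simple decision, which the following shell sketch models. It is purely illustrative: autostart_action and its yes/no arguments are invented for this example, and the function does not query a real cluster or reproduce the logic of the Serviceguard startup script.

```shell
# autostart_action CLUSTER_RUNNING ALL_NODES_READY
# Hypothetical model of the three cases. Each argument is "yes" or "no":
#   CLUSTER_RUNNING  the cluster is already running on at least one node
#   ALL_NODES_READY  all configured nodes are reachable and trying to start
autostart_action() {
    cluster_running=$1
    all_nodes_ready=$2
    if [ "$cluster_running" = "yes" ]; then
        echo "join"     # join the cluster that is already running
    elif [ "$all_nodes_ready" = "yes" ]; then
        echo "form"     # form a cluster of all configured nodes
    else
        echo "retry"    # keep trying until AUTO_START_TIMEOUT expires
    fi
}
```

In the "retry" case the node re-evaluates the same two conditions until one of them holds or the AUTO_START_TIMEOUT period ends.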

To enable automatic cluster start, set the flag AUTOSTART_CMCLD to 1 in the /etc/rc.config.d/cmcluster file on each node in the cluster; the nodes will then join the cluster at boot time.

Here is an example of the /etc/rc.config.d/cmcluster file:

#************************  CMCLUSTER  ************************
# Highly Available Cluster configuration
#
# @(#) $Revision: 72.2 $
#
# AUTOSTART_CMCLD:    If set to 1, the node will attempt to
#                      join it's CM cluster automatically when
#                      the system boots.
#                      If set to 0, the node will not attempt
#                      to join it's CM cluster.
#
AUTOSTART_CMCLD=1
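
Because the flag must be set on each node, it can be handy to read the value back from the file. The following is a minimal sketch; autostart_setting is a hypothetical helper name, and it assumes the file uses the plain NAME=value form shown above.

```shell
# autostart_setting FILE
# Hypothetical helper: prints the value of the last uncommented
# AUTOSTART_CMCLD assignment in FILE, or nothing if there is none.
autostart_setting() {
    sed -n 's/^[[:space:]]*AUTOSTART_CMCLD=\([0-9]*\).*/\1/p' "$1" | tail -n 1
}
```

Running `autostart_setting /etc/rc.config.d/cmcluster` on a node prints 1 when automatic cluster start is enabled there. The comment lines in the file are skipped because they begin with #.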

Changing the System Message

You may find it useful to modify the system's login message to include a statement such as the following:

This system is a node in a high availability cluster.
Halting this system may cause applications and services to
start up on another node in the cluster.

You might wish to include a list of all cluster nodes in this message, together with additional cluster-specific information.

The /etc/issue and /etc/motd files may be customized to include cluster-related information.

Managing a Single-Node Cluster

The number of nodes you will need for your Serviceguard cluster depends on the processing requirements of the applications you want to protect. You may want to configure a single-node cluster to take advantage of Serviceguard’s network failure protection.

In a single-node cluster, a cluster lock is not required, since there is no other node in the cluster. The output from the cmquerycl command omits the cluster lock information area if there is only one node.

You still need to have redundant networks, but you do not need to specify any heartbeat LANs, since there is no other node to send heartbeats to. In the cluster configuration ASCII file, specify all LANs that you want Serviceguard to monitor. For LANs that already have IP addresses, specify them with the STATIONARY_IP keyword, rather than the HEARTBEAT_IP keyword. For standby LANs, all that is required is the NETWORK_INTERFACE keyword with the LAN device name.
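
The network portion of a single-node cluster configuration ASCII file might look like the following fragment. The cluster name, node name, LAN device names, and IP address are placeholders chosen for illustration, and a complete file (for example, one generated by cmquerycl) contains other parameters that are omitted here.

```
# Fragment only; all names and addresses below are hypothetical.
CLUSTER_NAME            cluster1

NODE_NAME               node1
# Monitored LAN that already has an IP address: STATIONARY_IP, not HEARTBEAT_IP.
  NETWORK_INTERFACE     lan0
    STATIONARY_IP       192.0.2.10
# Standby LAN: only the device name is required.
  NETWORK_INTERFACE     lan1
```

Note that no HEARTBEAT_IP entries appear, since there is no other node to exchange heartbeats with.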

Single-Node Operation

Single-node operation occurs in a single-node cluster, or in a multi-node cluster in which all but one node has failed or been shut down. The remaining node will probably have applications running. As long as the Serviceguard daemon cmcld is active, other nodes can rejoin the cluster at a later time.

If the Serviceguard daemon fails when in single-node operation, it will leave the single node up and your applications running. This is different from the loss of the Serviceguard daemon in a multi-node cluster, which halts the node with a TOC (Transfer of Control) and causes packages to be switched to adoptive nodes.

It is not necessary to halt the single node in this scenario, since the application is still running, and no other node is currently available for package switching.

However, you should not try to restart Serviceguard, since data corruption might occur if the node were to attempt to start up a new instance of the application that is still running on the node. Instead of restarting the cluster, choose an appropriate time to shut down and reboot the node, which will allow the applications to shut down and then permit Serviceguard to restart the cluster after rebooting.

Deleting the Cluster Configuration

With root login, you can delete a cluster configuration from all cluster nodes by using Serviceguard Manager or, on the command line, the cmdeleteconf command. The cmdeleteconf command prompts for verification before deleting the files unless you use the -f option. You can delete the configuration only when the cluster is down. The action removes the binary configuration file from all the nodes in the cluster and resets all cluster-aware volume groups so that they are no longer cluster-aware.

NOTE: The cmdeleteconf command removes only the cluster binary file /etc/cmcluster/cmclconfig. It does not remove any other files from the /etc/cmcluster directory.

Although the cluster must be halted, all nodes in the cluster should be powered up and accessible before you use the cmdeleteconf command. If a node is powered down, power it up and boot. If a node is inaccessible, you will see a list of inaccessible nodes together with the following message:

It is recommended that you do not proceed with the configuration operation unless you are sure these nodes are permanently unavailable. Do you want to continue?

Reply Yes to remove the configuration. Later, if the inaccessible node becomes available, you should run the cmdeleteconf command on that node to remove the configuration file.
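
The follow-up on a node that was inaccessible during the deletion can be sketched as below. This is an illustration only: remove_stale_config is a hypothetical helper name, and the check simply looks for the cluster binary file named in the NOTE above. The -f option is deliberately left out so that cmdeleteconf still prompts for verification.

```shell
# remove_stale_config [CONFIG_FILE]
# Hypothetical helper, to be run as root on a node that was inaccessible
# when the cluster configuration was deleted elsewhere.
# CONFIG_FILE defaults to the cluster binary file.
remove_stale_config() {
    config=${1:-/etc/cmcluster/cmclconfig}
    if [ -f "$config" ]; then
        # A leftover binary file is present; cmdeleteconf prompts before deleting.
        cmdeleteconf
    else
        echo "no cluster binary file at $config; nothing to remove"
    fi
}
```

Called with no argument, the function checks /etc/cmcluster/cmclconfig and runs cmdeleteconf only if that file still exists on the node.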

© Hewlett-Packard Development Company, L.P.