Configuring OPS Clusters with ServiceGuard OPS Edition > Chapter 5 Building an OPS Cluster Configuration

Managing the Running Cluster


This section describes some approaches to routine management of the cluster. Additional tools and suggestions are found in Chapter 7, "Cluster and Package Maintenance."

Checking Cluster Operation with ServiceGuard Manager

ServiceGuard Manager lets you see all the nodes and packages within a cluster and displays their current status. Refer to the section on "Using ServiceGuard Manager" in Chapter 7. The following checks are suggested when using ServiceGuard Manager:

  • Ensure that all configured nodes are running.

  • Check that all configured packages are running, and running on the correct nodes.

  • Ensure that the settings on the property sheets for cluster, nodes, and packages are correct.

When you are sure the cluster is correctly configured, save a copy of the configuration data in a file for archival purposes. The data in this file can be compared with later versions of the cluster to understand the changes that are made over time.

Checking Cluster Operation with ServiceGuard Commands

ServiceGuard also provides several commands for control of the cluster:

  • cmrunnode is used to start a node.

  • cmhaltnode is used to manually stop a running node. (This command is also used by shutdown(1m).)

  • cmruncl is used to manually start a stopped cluster.

  • cmhaltcl is used to manually stop a cluster.

You can use these commands to test cluster operation, as in the following:

  1. If the cluster is not already online, run the cluster, as follows:

    # cmruncl -v

  2. When the cluster has started, use the following command to ensure that cluster components are operating correctly:

    # cmviewcl -v  

    Make sure that all nodes and networks are functioning as expected. For information about using cmviewcl, refer to the chapter on "Cluster and Package Maintenance."

  3. Use the following sequence of commands to verify that nodes leave and enter the cluster as expected:

    • On a cluster node, issue the cmhaltnode command.

    • Use the cmviewcl command to verify that the node has left the cluster.

    • Issue the cmrunnode command.

    • Use the cmviewcl command again to verify that the node has returned to operation.

  4. Use the following command to bring down the cluster:

    # cmhaltcl -v -f  
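
The four steps above can be collected into a small script. The following is a minimal sketch, not an HP-supplied tool: the function name cluster_smoke_test is hypothetical, and only the command options shown in this section are used. It assumes that the cluster is down when the script starts and that it is run as root on a cluster node, as in step 3.

```shell
# Sketch of the test sequence above (hypothetical helper, not a
# ServiceGuard command). Assumes the cluster is down at the start
# and that this runs as root on one of the cluster nodes.
cluster_smoke_test() {
    cmruncl -v  || return 1   # step 1: start the cluster
    cmviewcl -v || return 1   # step 2: inspect nodes and networks

    cmhaltnode  || return 1   # step 3: the node leaves the cluster
    cmviewcl -v               # check by eye that the node has left
    cmrunnode   || return 1   # the node rejoins the cluster
    cmviewcl -v               # check by eye that the node is back

    cmhaltcl -v -f            # step 4: bring down the cluster
}
```

The sketch does not parse the cmviewcl output; read the status display after each step before relying on the result.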

Additional cluster testing is described in the "Troubleshooting" chapter. Refer to Appendix A for a complete list of MC/ServiceGuard commands.

Preventing Automatic Activation of Volume Groups

It is important to prevent OPS and package volume groups from being activated at system boot time by the /etc/lvmrc file. To ensure that this does not happen, edit the /etc/lvmrc file on all nodes. Set AUTO_VG_ACTIVATE to 0, then list in the custom_vg_activation function all the volume groups that are not cluster-bound. Volume groups that will be used by packages should not be included anywhere in the file, since they are activated and deactivated by the package control scripts.

NOTE: The root volume group does not need to be included in the custom_vg_activation function, since it is automatically activated before the /etc/lvmrc file is used at boot time.
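
As an illustration, the relevant part of an edited /etc/lvmrc file might look like the following sketch. The volume group name /dev/vg01 is hypothetical and stands for a local volume group that is not cluster-bound; the rest of the file as shipped on your system is assumed to be unchanged.

```shell
# Excerpt of /etc/lvmrc on a cluster node (sketch only).
# /dev/vg01 is a hypothetical local volume group that is not cluster-bound.
AUTO_VG_ACTIVATE=0      # do not activate all volume groups at boot

custom_vg_activation()
{
    # Activate only local, non-cluster volume groups here.
    # Do not list OPS or package volume groups; the root volume
    # group does not need to be listed either.
    /sbin/vgchange -a y /dev/vg01
    return 0
}
```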

Setting up Autostart Features

Automatic startup is the process in which each node individually joins a cluster. To control this process, ServiceGuard provides a startup script. If a cluster already exists, the node attempts to join it; if no cluster is running, the node attempts to form a cluster consisting of all configured nodes. Automatic cluster start is the preferred way to start a cluster. No action is required by the system administrator.

To enable automatic cluster start, set the flag AUTOSTART_CMCLD to 1 in the /etc/rc.config.d/cmcluster file on each node in the cluster; the nodes will then join the cluster at boot time.

Here is an example of the /etc/rc.config.d/cmcluster file:

#***************************  CMCLUSTER  *************************
# Highly Available Cluster configuration
#
# @(#) $Revision: 72.2 $
#
# AUTOSTART_CMCLD: If set to 1, the node will attempt to
#                  join its CM cluster automatically when
#                  the system boots.
#                  If set to 0, the node will not attempt
#                  to join its CM cluster.
#
AUTOSTART_CMCLD=1
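
Because the flag must be set on every node, it can be convenient to check it with a short helper. The following is a sketch only; the function name autostart_enabled is hypothetical, and the check simply looks for an uncommented AUTOSTART_CMCLD=1 line in the file.

```shell
# autostart_enabled [FILE]: succeed if FILE contains an uncommented
# AUTOSTART_CMCLD=1 line. Hypothetical helper; FILE defaults to
# /etc/rc.config.d/cmcluster.
autostart_enabled() {
    grep -Eq '^[[:space:]]*AUTOSTART_CMCLD=1([^0-9]|$)' \
        "${1:-/etc/rc.config.d/cmcluster}"
}
```

For example, running `autostart_enabled && echo "autostart is enabled"` on each node reports the nodes on which the flag is set.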

Changing the System Message

You may find it useful to modify the system's login message to include a statement such as the following:

This system is a node in a high availability cluster.
Halting this system may cause applications and services to
start up on another node in the cluster.

You might wish to include a list of all cluster nodes in this message, together with additional cluster-specific information.

The /etc/issue and /etc/motd files may be customized to include cluster-related information.
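
One way to add the statement is to append it to the message file from a script. The following sketch uses a hypothetical function name, add_cluster_notice, and skips the append if the notice is already present, so that it can be run more than once on the same node.

```shell
# add_cluster_notice FILE: append the cluster notice to FILE unless
# it is already there. Hypothetical helper, shown as a sketch.
add_cluster_notice() {
    grep -q 'node in a high availability cluster' "$1" 2>/dev/null && return 0
    cat >> "$1" <<'EOF'
This system is a node in a high availability cluster.
Halting this system may cause applications and services to
start up on another node in the cluster.
EOF
}
```

A typical call would be `add_cluster_notice /etc/motd`, run as root on each node.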

Using Packages to Configure Startup and Shutdown of OPS Instances

To automate the startup and shutdown of OPS instances on the nodes of the cluster, you can create packages which activate the appropriate volume groups and then run OPS. Refer to the section "Creating Packages to Launch OPS Instances" in the chapter "Configuring Packages and Their Services."

NOTE: The maximum number of OPS instances is 127 per cluster.

Starting Oracle Instances

Once the Oracle installation is complete, ensure that all package control scripts are in place on each node and that the /etc/rc.config.d/cmcluster file on each node contains the entry AUTOSTART_CMCLD=1. Then reboot each node. Within a couple of minutes of the reboot, the cluster will re-form, and the package control scripts will bring up the database instances and application programs.

When Oracle has been started, you can use the SAM process management area or the ps -ef command on both nodes to verify that all OPS daemons and Oracle processes are running.
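
The ps -ef check can be narrowed to a single instance with a small filter. This sketch assumes that Oracle background processes appear in the listing with names of the form ora_pmon_SID; the function name instance_running and the instance name ops1 used below are hypothetical.

```shell
# instance_running SID: read a `ps -ef` listing on standard input and
# succeed if it shows the process monitor for the given instance.
# Assumes the process is listed as ora_pmon_SID.
instance_running() {
    grep -Eq "ora_pmon_$1([[:space:]]|\$)"
}
```

For example, `ps -ef | instance_running ops1 && echo "ops1 is up"` can be run on each node in turn.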

Starting Up and Shutting Down Manually

To start up and shut down OPS instances without using packages, you can perform the following steps.

  • Starting up involves the following sequence:

    1. Start up the cluster (cmrunnode or cmruncl).

    2. Activate the database volume groups or disk groups in shared mode.

    3. Bring up Oracle in shared mode.

    4. Bring up the Oracle applications, if any.

  • Shutting down involves the following sequence:

    1. Shut down the Oracle applications, if any.

    2. Shut down Oracle.

    3. Deactivate the database volume groups or disk groups.

    4. Shut down the cluster (cmhaltnode or cmhaltcl).

If the shutdown sequence described above is not followed, cmhaltcl or cmhaltnode may fail with a message that GMS clients (OPS 8i) are active or that shared volume groups are active.
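
The two sequences can be sketched as a pair of shell functions for one node. Several names here are assumptions rather than part of the product: VG is a hypothetical shared LVM volume group, and start_oracle, stop_oracle, start_apps, and stop_apps stand for site-specific commands that you must supply yourself. The sketch also assumes LVM volume groups that are activated in shared mode with vgchange -a s and deactivated with vgchange -a n; disk groups need different commands.

```shell
# Sketch of the manual sequences above for one node.
# Hypothetical names: VG, start_oracle, stop_oracle, start_apps, stop_apps.
VG=${VG:-/dev/vg_ops}

ops_manual_start() {
    cmrunnode           || return 1   # 1. join the cluster (or use cmruncl)
    vgchange -a s "$VG" || return 1   # 2. activate the volume group in shared mode
    start_oracle        || return 1   # 3. bring up Oracle in shared mode
    start_apps                        # 4. bring up the Oracle applications, if any
}

ops_manual_stop() {
    stop_apps           || return 1   # 1. shut down the Oracle applications
    stop_oracle         || return 1   # 2. shut down Oracle
    vgchange -a n "$VG" || return 1   # 3. deactivate the volume group
    cmhaltnode                        # 4. leave the cluster (or use cmhaltcl)
}
```

Each function stops at the first failing step, so the node is never halted while Oracle or the shared volume group is still active.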

Managing a Single-Node Cluster

The number of nodes you will need for your MC/ServiceGuard cluster depends on the processing requirements of the applications you want to protect. You may want to configure a single-node cluster to take advantage of MC/ServiceGuard's network failure protection.

In a single-node cluster, a cluster lock is not required, since there is no other node in the cluster. The output from the cmquerycl command omits the cluster lock information area if there is only one node.

You still need to have redundant networks, but you do not need to specify any heartbeat LANs, since there is no other node to send heartbeats to. In the cluster configuration ASCII file, specify all LANs that you want ServiceGuard to monitor. For LANs that already have IP addresses, specify them with the STATIONARY_IP keyword, rather than the HEARTBEAT_IP keyword. For standby LANs, all that is required is the NETWORK_INTERFACE keyword with the LAN device name.
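
To illustrate the keywords, the following sketch prints a node entry of the kind described above. The function name single_node_stanza and the sample values (node1, lan0, lan1, 192.0.2.10) are hypothetical, and the indentation is only for readability. A complete cluster configuration ASCII file contains other parameters that are not shown here.

```shell
# single_node_stanza NODE LAN IP STANDBY_LAN: print a node entry for
# the cluster configuration ASCII file of a single-node cluster.
# No HEARTBEAT_IP entry is produced. Hypothetical helper (sketch).
single_node_stanza() {
    printf 'NODE_NAME %s\n' "$1"
    printf '  NETWORK_INTERFACE %s\n' "$2"
    printf '    STATIONARY_IP %s\n' "$3"
    printf '  NETWORK_INTERFACE %s\n' "$4"    # standby LAN: device name only
}
```

Calling `single_node_stanza node1 lan0 192.0.2.10 lan1` prints a NODE_NAME line, a monitored LAN with its STATIONARY_IP address, and a standby LAN that has only a NETWORK_INTERFACE line.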

Single-Node Operation

Single-node operation occurs in a single-node cluster, or in a multi-node cluster after all but one node have failed or after you have shut down all but one node. The remaining node will probably have applications running. As long as the MC/ServiceGuard daemon cmcld is active, other nodes can rejoin the cluster at a later time.

If the MC/ServiceGuard daemon fails during single-node operation, it leaves the single node up and your applications running. This differs from the loss of the MC/ServiceGuard daemon in a multi-node cluster, which halts the node with a TOC and causes packages to be switched to adoptive nodes. It is not necessary to halt the single node in this scenario, since the application is still running and no other node is currently available for package switching. However, you should not try to restart MC/ServiceGuard, since data corruption might occur if another node were to attempt to start up a new instance of the application that is still running on the single node. Instead of restarting the cluster, choose an appropriate time to shut down and reboot the node; this allows the applications to shut down and then permits MC/ServiceGuard to restart the cluster after the reboot.

Deleting the Cluster Configuration

You can delete a cluster configuration from all cluster nodes by using SAM or by issuing the cmdeleteconf command. The command prompts for verification before deleting the files unless you use the -f option. You can delete the configuration only when the cluster is down. The action removes the binary configuration file from all the nodes in the cluster and resets all cluster-aware volume groups so that they are no longer cluster-aware.

NOTE: The cmdeleteconf command removes only the cluster binary file /etc/cmcluster/cmclconfig. It does not remove any other files from the /etc/cmcluster directory.

Although the cluster must be halted, all nodes in the cluster should be powered up and accessible before you use the cmdeleteconf command. If a node is powered down, power it up and boot it. If a node is inaccessible, you will see a list of inaccessible nodes together with the following message:

It is recommended that you do not proceed with the configuration operation unless you are sure these nodes are permanently unavailable. Do you want to continue?

Reply Yes to remove the configuration. Later, if the inaccessible node becomes available, you should run the cmdeleteconf command on that node to remove the configuration file.
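
If you script the deletion with the -f option, the verification prompt no longer protects against an accidental run. The following sketch puts an explicit argument check in front of the command. The function name delete_cluster_config is hypothetical, and the sketch assumes that the cluster has already been halted.

```shell
# delete_cluster_config yes: run cmdeleteconf -f only when the caller
# passes the word "yes". Hypothetical wrapper; assumes the cluster is
# already down.
delete_cluster_config() {
    if [ "$1" != "yes" ]; then
        echo "usage: delete_cluster_config yes" >&2
        return 2
    fi
    cmdeleteconf -f
}
```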

© Hewlett-Packard Development Company, L.P.