Designing Disaster Tolerant HA Clusters Using Metrocluster and Continentalclusters: > Chapter 2 Designing a Continental Cluster

Maintaining a Continental Cluster

The following common maintenance tasks are described in this section:

  • Adding a Node to a Cluster or Removing a Node from a Cluster

  • Adding a Package to the Continental Cluster

  • Removing a Package from the Continental Cluster

  • Changing Monitoring Definitions

  • Checking the Status of Clusters, Nodes, and Packages

  • Reviewing Messages and Log Files

  • Deleting a Continental Cluster Configuration

  • Renaming a Continental Cluster

  • Checking Java File Versions

CAUTION: Never issue the cmrunpkg command for a recovery package when Continentalclusters is enabled. When a package is started with this command, there is no guaranteed way to prevent a package that is running on one cluster from also running on the other cluster. The potential for data corruption is significant.

Adding a Node to a Cluster or Removing a Node from a Cluster

To add a node to or remove a node from the continental cluster, use the following procedure:

  1. Halt any monitor packages that are running on both clusters.

    # cmhaltpkg ccmonpkg

  2. Add or remove the node in a cluster by editing the Serviceguard cluster configuration file and applying the configuration.

    # cmapplyconf -C cluster.config

  3. Edit the Continentalclusters configuration ASCII file to add or remove the node in the cluster.

  4. For added nodes, ensure that the /etc/cmcluster/cmclnodelist and /etc/opt/cmom/cmomhosts files are set up correctly on the new node. Refer to “Preparing Security Files”. Ensure that the cmclnodelist and cmomhosts files on all nodes (including the new node) contain an entry allowing write access by the host on which you are running the configuration commands.

  5. Check and apply the configuration using the cmcheckconcl and cmapplyconcl commands.

  6. Restart the monitor packages on both clusters.

  7. View the status of the continental cluster.

    # cmviewconcl
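The sequence above can be sketched as a dry-run script. This is a minimal sketch, not a supported tool: it only prints the commands unless DRY_RUN=0 is set. The file name cmconcl.config, the run and node_change_sequence helpers, and the use of -C with cmcheckconcl and cmapplyconcl are assumptions made for illustration; check the manual pages before relying on them. The restart in step 6 is left as a manual reminder because this section does not name the command.

```shell
#!/bin/sh
# Dry-run sketch of the node add/remove sequence. By default it only prints
# the commands; set DRY_RUN=0 to execute them on the local node.

DRY_RUN="${DRY_RUN:-1}"
SG_CONF="${SG_CONF:-cluster.config}"    # Serviceguard file, as in step 2
CC_CONF="${CC_CONF:-cmconcl.config}"    # hypothetical Continentalclusters file name

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

node_change_sequence() {
    run cmhaltpkg ccmonpkg          || return 1   # step 1 (repeat on the other cluster)
    run cmapplyconf -C "$SG_CONF"   || return 1   # step 2
    echo "manual: edit $CC_CONF and verify cmclnodelist and cmomhosts"   # steps 3-4
    run cmcheckconcl -C "$CC_CONF"  || return 1   # step 5; -C usage is an assumption
    run cmapplyconcl -C "$CC_CONF"  || return 1
    echo "manual: restart ccmonpkg on both clusters"                     # step 6
    run cmviewconcl                               # step 7
}

node_change_sequence
```

Stopping at the first failed command (the || return 1 on each step) keeps a failed check from being followed by an apply.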

Adding a Package to the Continental Cluster

To add a new package for possible recovery to the Continentalclusters configuration, first configure a new primary package and a new recovery package, then add a new recovery group to the Continentalclusters configuration file. You must also ensure that data replication is provided for the new package, through either hardware or software.

Adding a new package does not require bringing down either cluster. However, to implement the new configuration, the following steps are required:

  1. Configure the new primary and recovery packages by editing the new package configuration files and control scripts.

  2. Use the Serviceguard cmapplyconf command to add the primary package to one cluster, and the recovery package to the other cluster.

  3. Provide the appropriate data replication for the new package.

  4. Create the new recovery group in the Continentalclusters configuration file.

  5. Ensure that the cmclnodelist and cmomhosts files on all nodes contain an entry allowing write access by the host on which you are running the configuration commands.

  6. Halt the monitor packages on both clusters.

  7. Use the cmapplyconcl command to apply the new Continentalclusters configuration.

  8. Restart the monitor packages on both clusters.

  9. View the status of the continental cluster.

    # cmviewconcl
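The security file check in step 5 can be partly scripted. The following is a minimal sketch, assuming that cmclnodelist holds one "host user" entry per line and that lines starting with # are comments; that layout is an assumption here, so confirm it against “Preparing Security Files”. The function name has_host_entry and the sample host names are hypothetical. The cmomhosts file is not covered, because its syntax may differ.

```shell
#!/bin/sh
# Sketch: report whether a host has an entry in a cmclnodelist-style file.
# Assumes one "host user" entry per line and "#" comment lines.

has_host_entry() {
    # $1 = file to inspect, $2 = host name to look for
    awk -v host="$2" '
        $1 ~ /^#/ { next }              # ignore comment lines
        $1 == host { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Demonstration on a throwaway sample file (host names are made up).
sample="${TMPDIR:-/tmp}/cmclnodelist.sample.$$"
printf '%s\n' '# host    user' 'nynode1 root' 'nynode2 root' > "$sample"
for h in nynode1 adminhost; do
    if has_host_entry "$sample" "$h"; then
        echo "$h: entry found"
    else
        echo "$h: no entry"
    fi
done
rm -f "$sample"
```

On a cluster node, the same function could be pointed at /etc/cmcluster/cmclnodelist with the name of the host from which the configuration commands are run.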

Removing a Package from the Continental Cluster

To remove a package from the Continentalclusters configuration, you must first remove the recovery group from the Continentalclusters configuration file.

Removing the package does not require you to bring down either cluster. However, in order to implement the new configuration, the following steps are required:

  1. Edit the Continentalclusters configuration file, deleting the recovery group.

  2. Halt the monitor packages that are running on both clusters.

  3. Use the cmapplyconcl command to apply the new Continentalclusters configuration.

  4. Restart the monitor packages on both clusters.

  5. Use the Serviceguard cmdeleteconf command to remove each package in the recovery group.

  6. View the status of the continental cluster.

    # cmviewconcl

Changing Monitoring Definitions

You can change the monitoring definitions in the configuration without bringing down either cluster. This includes adding, removing, or changing cluster events; changing the timings; and adding, removing, or changing notification messages.

Use the following steps to change the monitoring definitions:

  1. Edit the Continentalclusters configuration file to incorporate the new or changed monitoring definitions.

  2. Halt the monitor packages on both clusters.

  3. Use the cmapplyconcl command to apply the new configuration.

  4. Restart the monitor packages on both clusters.

  5. View the status of the continental cluster.

    # cmviewconcl

Checking the Status of Clusters, Nodes, and Packages

To check on the status of the continental clusters and associated packages, use the cmviewconcl command, which lists the status of the clusters, associated package status, and configured events status.

The following is an example of cmviewconcl output in a situation where there is a single recovery group for which the primary cluster is cjc838 and the recovery cluster is cjc1234.

# cmviewconcl

WARNING: Primary cluster cjc838 is in an alarm state
(cmrecovercl is enabled on recovery cluster cjc1234)

CONTINENTAL CLUSTER cjccc1
RECOVERY CLUSTER cjc1234

PRIMARY CLUSTER    STATUS    EVENT LEVEL    POLLING INTERVAL
cjc838             down      ALARM          20

PACKAGE RECOVERY GROUP prg1
PACKAGE            ROLE        STATUS
cjc838/primary     primary     down
cjc1234/recovery   recovery    up
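Output like this can be screened with a small filter. The sketch below lists primary clusters whose EVENT LEVEL column reads ALARM. The column positions are inferred from the sample output above rather than from a documented format, and the function name alarm_clusters is hypothetical.

```shell
#!/bin/sh
# Sketch: list primary clusters whose EVENT LEVEL column reads ALARM.
# Reads cmviewconcl text on stdin; column positions are taken from the
# sample output, not from a documented format.

alarm_clusters() {
    awk '
        $1 == "PRIMARY" && $2 == "CLUSTER" { in_table = 1; next }  # header row
        NF == 0 { in_table = 0; next }                             # blank line ends the table
        in_table && toupper($3) == "ALARM" { print $1 }
    '
}

# Demonstration with the rows from the sample output.
alarm_clusters <<'EOF'
PRIMARY CLUSTER STATUS EVENT LEVEL POLLING INTERVAL
cjc838 down ALARM 20

PACKAGE RECOVERY GROUP prg1
EOF
```

On a live system the filter would be fed directly, for example with cmviewconcl | alarm_clusters.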

The following is an example of cmviewconcl output from a primary cluster that is down.

persian (root 2131): cmviewconcl -v 
WARNING: Primary cluster cjc838 is in an alarm state
(cmrecovercl is enabled on recovery cluster cjc1234)

Primary cluster cjc838 is not configured to monitor recovery
cluster cjc1234

CONTINENTAL CLUSTER cjccc1
RECOVERY CLUSTER cjc1234

PRIMARY CLUSTER    STATUS    EVENT LEVEL    POLLING INTERVAL
cjc838             down      ALARM          20

CONFIGURED EVENT   STATUS        DURATION   LAST NOTIFICATION SENT
alert              unreachable   15 sec     --
alarm              unreachable   30 sec     --
alarm              down          0 sec      Fri May 12 12:13:06 PDT 2000
alert              error         0 sec      --
alert              up            20 sec     --
alert              up            40 sec     --

PACKAGE RECOVERY GROUP prg1

PACKAGE            ROLE        STATUS
cjc838/primary     primary     down
cjc1234/recovery   recovery    up

The following is the output of a cmviewconcl command that displays data for a mutual recovery configuration in which each cluster has both the primary and the recovery roles—the primary role for one recovery group and the recovery role for the other recovery group:


CONTINENTAL CLUSTER ccluster1

RECOVERY CLUSTER PTST_dts1

PRIMARY CLUSTER    STATUS        EVENT LEVEL    POLLING INTERVAL
PTST_sanfran       Unmonitored   unmonitored    1 min

CONFIGURED EVENT   STATUS        DURATION   LAST NOTIFICATION SENT
alert              unreachable   1 min      --
alert              unreachable   2 min      --
alarm              unreachable   3 min      --
alert              down          1 min      --
alert              down          2 min      --
alarm              down          3 min      --
alert              error         0 sec      --
alert              up            1 min      --

RECOVERY CLUSTER PTST_sanfran

PRIMARY CLUSTER    STATUS        EVENT LEVEL    POLLING INTERVAL
PTST_dts1          Unmonitored   unmonitored    1 min

CONFIGURED EVENT   STATUS        DURATION   LAST NOTIFICATION SENT
alert              unreachable   1 min      --
alert              unreachable   2 min      --
alarm              unreachable   3 min      --
alert              down          1 min      --
alert              down          2 min      --
alarm              down          3 min      --
alert              error         0 sec      --
alert              up            1 min      --

PACKAGE RECOVERY GROUP hpgroup10

PACKAGE                     ROLE        STATUS
PTST_sanfran/PACKAGE1       primary     down
PTST_dts1/PACKAGE1          recovery    down

PACKAGE RECOVERY GROUP hpgroup20

PACKAGE                     ROLE        STATUS
PTST_dts1/PACKAGE1x_ld      primary     down
PTST_sanfran/PACKAGE1x_ld   recovery    down
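With several recovery groups, the package rows are easier to scan as one flat list. The following sketch prints one line per package in the form "group cluster/package role status". It relies on the layout of the sample output above, which is an inference rather than a documented format, and the function name recovery_group_packages is hypothetical.

```shell
#!/bin/sh
# Sketch: flatten the PACKAGE RECOVERY GROUP sections of cmviewconcl output
# into "group cluster/package role status" lines. Reads text on stdin.

recovery_group_packages() {
    awk '
        $1 == "PACKAGE" && $2 == "RECOVERY" && $3 == "GROUP" { group = $4; next }
        $1 == "PACKAGE" && $2 == "ROLE" { next }                 # column header
        group != "" && NF == 3 && index($1, "/") > 0 { print group, $1, $2, $3 }
    '
}

# Demonstration with one group from the sample output.
recovery_group_packages <<'EOF'
PACKAGE RECOVERY GROUP hpgroup10

PACKAGE ROLE STATUS
PTST_sanfran/PACKAGE1 primary down
PTST_dts1/PACKAGE1 recovery down
EOF
```

The flat form can then be passed to grep or sort, for example to show only the packages whose role is recovery.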

For a more comprehensive status of component clusters, nodes, and packages, use the cmviewcl command on both clusters. On each cluster, make note of which nodes the primary packages are running on, as well as data sender and data receiver packages, if they are being used for logical data replication. Verify that the monitor is running on each cluster on which it is configured.

The following is an example of cmviewcl output for a cluster (nycluster) running a monitor package. Note that the recovery package salespkg_bak is not running, and is shown as an unowned package. This is the expected display while the other cluster is running salespkg.

CLUSTER      STATUS
nycluster    up

  NODE           STATUS       STATE
  nynode1        up           running

    Network_Parameters:
    INTERFACE    STATUS       PATH         NAME
    PRIMARY      up           12.1         lan0
    PRIMARY      up           56.1         lan2

  NODE           STATUS       STATE
  nynode2        up           running

    Network_Parameters:
    INTERFACE    STATUS       PATH         NAME
    PRIMARY      up           4.1          lan0
    PRIMARY      up           56.1         lan1

    PACKAGE      STATUS       STATE        PKG_SWITCH   NODE
    ccmonpkg     up           running      enabled      nynode2

    Script_Parameters:
    ITEM         NAME           STATUS   MAX_RESTARTS   RESTARTS
    Service      ccmonpkg.srv   up       20             0

    Node_Switching_Parameters:
    NODE_TYPE    STATUS       SWITCHING    NAME
    Primary      up           enabled      nynode2  (current)
    Alternate    up           enabled      nynode1

UNOWNED PACKAGES

    PACKAGE        STATUS       STATE        PKG_SWITCH   NODE
    salespkg_bak   down         unowned

      Policy_Parameters:
      POLICY_NAME     CONFIGURED_VALUE
      Failover        unknown
      Failback        unknown

      Script_Parameters:
      ITEM       STATUS      NODE_NAME      NAME
      Subnet     unknown     nynode1        195.14.171.0
      Subnet     unknown     nynode2        195.14.171.0

      Node_Switching_Parameters:
      NODE_TYPE    STATUS       SWITCHING    NAME
      Primary      down                      nynode1
      Alternate    down                      nynode2

Use the ps command to check the status of the Continentalclusters monitor daemons, cmclrmond and cmclsentryd, which should be running on the cluster node where the monitor package is running.
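The daemon check can be wrapped in a small filter. This is a minimal sketch that reads process listing text on stdin and prints the name of each expected daemon it does not find. It assumes the command name is the last field on each line, as in typical ps -e output. The function name missing_monitor_daemons and the sample process rows are made up for illustration.

```shell
#!/bin/sh
# Sketch: print each Continentalclusters monitor daemon that is absent from
# a process listing read on stdin. Assumes the command name is the last field.

missing_monitor_daemons() {
    awk '
        { n = split($NF, parts, "/"); seen[parts[n]] = 1 }   # basename of last field
        END {
            if (!("cmclrmond" in seen))   print "cmclrmond"
            if (!("cmclsentryd" in seen)) print "cmclsentryd"
        }
    '
}

# Demonstration with made-up listing rows: only cmclrmond is present.
missing_monitor_daemons <<'EOF'
  PID TTY   TIME COMMAND
 2001 ?     0:03 cmclrmond
EOF
```

On the node that runs the monitor package, ps -e | missing_monitor_daemons should print nothing when both daemons are running.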

Reviewing Messages and Log Files

The Continentalclusters commands—cmquerycl, cmcheckconcl, cmapplyconcl, and cmrecovercl—all display messages on the standard output, which is the first place to look for error messages.

All notification messages associated with cluster events are reported in /var/opt/resmon/log/cc/eventlog on the cluster where monitoring is taking place. An example of output from this file follows:

>------------ Event Monitoring Service Event Notification ------------<

Notification Time: Wed Nov 10 21:00:39 1999

system1 sent Event Monitor notification information:

/cluster/concl/ccluster1/clusters/LAclust/status/unreachable is = 15

User Comments:

Cluster "LAclust" has status "unreachable" for 15 sec

>---------- End Event Monitoring Service Event Notification ----------<

In addition, if you have defined a TEXTLOG destination, notification messages are sent to the file that you specified. (See “Editing Section 3—Monitoring Definitions” for more information.)
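Entries in the eventlog file can be condensed to one line each for quick review. The sketch below pulls the notification time, continental cluster name, cluster name, status, and value out of each entry. The resource path layout (/cluster/concl/NAME/clusters/CLUSTER/status/STATUS) is inferred from the single sample entry above and may not hold for every event type. The function name summarize_eventlog is hypothetical.

```shell
#!/bin/sh
# Sketch: summarize eventlog entries as
#   time|continental cluster|cluster|status|value
# Field positions are inferred from the sample entry, not from a specification.

summarize_eventlog() {
    awk '
        /^Notification Time:/ {
            sub(/^Notification Time: */, "")    # keep only the timestamp
            when = $0
            next
        }
        /^\/cluster\/concl\// {
            split($1, p, "/")    # p[4]=continental cluster, p[6]=cluster, p[8]=status
            print when "|" p[4] "|" p[6] "|" p[8] "|" $NF
        }
    '
}

# Demonstration with the two relevant lines of the sample entry.
summarize_eventlog <<'EOF'
Notification Time: Wed Nov 10 21:00:39 1999
/cluster/concl/ccluster1/clusters/LAclust/status/unreachable is = 15
EOF
```

To process the real file, redirect it into the function: summarize_eventlog < /var/opt/resmon/log/cc/eventlog.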

Also review the monitor startup and shutdown log file /etc/cmcluster/ccmonpkg/ccmonpkg.cntl.log on any node where a Continentalclusters monitor has been running. Information about the primary or recovery packages may be found in their respective startup and shutdown log files.

Messages from the Continentalclusters daemon are reported in log file /var/adm/cmconcl/sentryd.log, and Object Manager messages appear in /var/opt/cmom/cmomd.log. These messages may be helpful in troubleshooting. Use the cmreadlog command to view the entries in these files. Examples:

# /opt/cmom/tools/bin/cmreadlog -f /var/adm/cmconcl/sentryd.log slog.txt

# /opt/cmom/tools/bin/cmreadlog -f /var/opt/cmom/cmomd.log omlog.txt

The following is sample output from the cmreadlog command for the sentryd.log file:

Oct 20 18:28:22:[[main,5,main]]:FATAL:dr.sentryd:No continental cluster found on this node
Oct 22 13:38:45:[[Thread-309,5,main]]:ERROR:dr.sentryd:Error connecting to axe28
Oct 22 13:38:45:[[Thread-309,5,main]]:ERROR:dr.sentryd:Connection refused
Oct 22 13:38:45:[[Thread-309,5,main]]:INFO:dr.sentryd:Connection failed to axe28
Oct 22 13:38:45:[[Thread-311,5,main]]:ERROR:dr.sentryd:Cannot find cluster KC-cluster at location axe29
Oct 22 13:38:45:[[Thread-311,5,main]]:ERROR:dr.sentryd:null result from query
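When the log is long, it helps to keep only the serious entries. The following sketch filters cmreadlog text for FATAL and ERROR lines and prints the severity and message. It assumes the colon-separated layout of the sample above (time, thread, severity, component, message); the function name sentryd_problems is hypothetical.

```shell
#!/bin/sh
# Sketch: keep only FATAL and ERROR entries from cmreadlog text on stdin and
# print "SEVERITY<TAB>message". The time stamp itself contains two colons,
# so the severity is the fifth colon-separated field.

sentryd_problems() {
    awk -F: '$5 == "FATAL" || $5 == "ERROR" {
        msg = $7
        for (i = 8; i <= NF; i++) msg = msg ":" $i   # rejoin colons inside the message
        print $5 "\t" msg
    }'
}

# Demonstration with three lines from the sample output.
sentryd_problems <<'EOF'
Oct 20 18:28:22:[[main,5,main]]:FATAL:dr.sentryd:No continental cluster found on this node
Oct 22 13:38:45:[[Thread-309,5,main]]:ERROR:dr.sentryd:Error connecting to axe28
Oct 22 13:38:45:[[Thread-309,5,main]]:INFO:dr.sentryd:Connection failed to axe28
EOF
```

Assuming slog.txt is the text file written by the first cmreadlog example, the filter would be run as sentryd_problems < slog.txt.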

General information about Serviceguard operation is found in /var/adm/syslog/syslog.log.

Deleting a Continental Cluster Configuration

The cmdeleteconcl command deletes the configuration on all nodes in the continental cluster configuration. To delete a continental cluster and the Continentalclusters configuration, issue the following command:

# cmdeleteconcl

NOTE: If you are only modifying the configuration, re-issue the cmapplyconcl command. There is no need to delete the previous configuration first.

Renaming a Continental Cluster

To rename an existing continental cluster, perform the following steps:

  1. Remove the Continentalclusters configuration.

    # cmdeleteconcl

  2. Edit the CONTINENTAL_CLUSTER_NAME field in the configuration ASCII file, and run the cmapplyconcl command to configure the continental cluster with a new name.
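The edit in step 2 can be scripted so that the original file is left untouched. This is a minimal sketch, assuming the configuration file holds the keyword and its value on one line separated by white space; that layout is an assumption, so inspect your file first. The function name set_concl_name, the file names, and the cluster names are hypothetical.

```shell
#!/bin/sh
# Sketch: write a copy of a configuration file to stdout with the
# CONTINENTAL_CLUSTER_NAME value replaced. Assumes "KEYWORD value" lines.

set_concl_name() {
    # $1 = configuration file, $2 = new continental cluster name
    awk -v name="$2" '
        $1 == "CONTINENTAL_CLUSTER_NAME" { print $1 "\t" name; next }
        { print }                        # every other line passes through unchanged
    ' "$1"
}

# Demonstration on a throwaway sample file (not a real configuration).
sample="${TMPDIR:-/tmp}/cmconcl.sample.$$"
printf '%s\n' '# sample file, not a real configuration' \
              'CONTINENTAL_CLUSTER_NAME    ccluster1' > "$sample"
set_concl_name "$sample" ccluster2
rm -f "$sample"
```

Redirecting the output to a new file and reviewing it before running cmapplyconcl keeps the old configuration available for comparison.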

Checking Java File Versions

Some components of Continentalclusters are executed from Java .jar files. To obtain version information about these files, use the what.sh script provided in the /opt/cmconcl/jar directory. Example:

# /opt/cmconcl/jar/what.sh configcl.jar
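To check every .jar file in one pass, the script can be called in a loop. The sketch below assumes that what.sh accepts a bare file name when run from its own directory, as the example above suggests; the function name jar_versions is hypothetical.

```shell
#!/bin/sh
# Sketch: run what.sh once for each .jar file in a directory.
# Assumes what.sh takes a file name relative to its own directory.

jar_versions() {
    # $1 = directory holding what.sh and the .jar files (default from the text)
    dir="${1:-/opt/cmconcl/jar}"
    (
        cd "$dir" || exit 1
        for jar in *.jar; do
            [ -f "$jar" ] || continue    # skip the literal pattern when nothing matches
            echo "== $jar"
            ./what.sh "$jar"
        done
    )
}
```

Calling jar_versions with no argument uses /opt/cmconcl/jar; the subshell keeps the caller's working directory unchanged.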
