After a failover to a cluster occurs, restoring disaster tolerance presents several challenges, the most significant of which are:

Restoring the failed cluster. Depending on the nature of the disaster, it may be necessary either to create a new cluster or to restore the failed cluster. Before starting up the new or the failed cluster, make sure the AUTO_RUN flag is disabled for all of the Continentalclusters application packages. This prevents the packages from starting unexpectedly with the cluster.

Resynchronizing the data. To resynchronize the data, either restore the data to the cluster and continue with the same data replication procedure, or set up data replication to function in the other direction.
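
One way to check the AUTO_RUN setting before starting a cluster is to scan the package configuration files. The following is a minimal sketch, assuming that each package's ASCII configuration file is available on the node and contains a line of the form AUTO_RUN YES or AUTO_RUN NO. The function name and the path shown in the usage comment are hypothetical and are not part of Continentalclusters.

```shell
#!/bin/sh
# Print the name of every package configuration file given as an argument
# whose AUTO_RUN parameter is set to YES (matched case-insensitively).
# Lines that start with '#' are comments and never match.
autorun_enabled_files() {
    for f in "$@"; do
        if grep -Eiq '^[[:space:]]*AUTO_RUN[[:space:]]+YES([[:space:]]|$)' "$f"; then
            echo "$f"
        fi
    done
}

# Usage (hypothetical path; adjust to where the package files are kept):
#   autorun_enabled_files /etc/cmcluster/pkgX/pkgX.config
```

Any file the function reports still has AUTO_RUN enabled and should be corrected before the cluster is started.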

The following sections briefly outline some scenarios for restoring disaster tolerance.

Restore Clusters to their Original Roles

If the disaster did not destroy the cluster, both clusters in a recovery pair can be returned to their original roles. To do this:

1. Make sure that both clusters are up and running, with the recovery packages continuing to run on the surviving cluster.

2. On each cluster, stop the Continentalclusters monitor package if it is still running.

   # cmhaltpkg ccmonpkg

3. Compare the clusters to make sure their configurations are consistent. Correct any inconsistencies.

4. For each recovery group where the repaired cluster will run the primary package:

   a. Synchronize the data from the disks on the surviving cluster to the disks on the repaired cluster. This may be time-consuming.

   b. Halt the recovered application on the surviving cluster if necessary, and start it on the repaired cluster.

   To keep application downtime to a minimum, start the primary package on the repaired cluster before resynchronizing the data of the next recovery group.

5. Restart the monitor using the following command on each cluster:

   # cmrunpkg ccmonpkg

   Alternatively, if the monitoring package configuration has been modified, use the following sequence on each cluster to apply the new configuration and start the monitor:

   # cmapplyconf -P ccmonpkg.config
   # cmmodpkg -e ccmonpkg

6. View the status of the continental cluster.

   # cmviewconcl
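
The two ways of restarting the monitor described above can be wrapped in a small helper. This is a sketch only: the run and restart_monitor function names and the DRY_RUN variable are assumptions introduced for illustration, and the helper acts on the local cluster only, so it has to be run on each cluster. With DRY_RUN left at its default of 1, the commands are printed rather than executed.

```shell
#!/bin/sh
# Print the command when DRY_RUN=1 (the default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Restart the monitor package on the local cluster, then show the status.
# $1 = "modified" if ccmonpkg.config was changed and must be re-applied.
restart_monitor() {
    if [ "$1" = "modified" ]; then
        run cmapplyconf -P ccmonpkg.config || return 1
        run cmmodpkg -e ccmonpkg || return 1
    else
        run cmrunpkg ccmonpkg || return 1
    fi
    run cmviewconcl
}

# Usage: restart_monitor            (configuration unchanged)
#        restart_monitor modified   (configuration was edited)
```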

Primary Packages Remaining on the Surviving Cluster

Configure the failed cluster in a recovery pair as a recovery-only cluster and the surviving cluster as a primary-only cluster. This minimizes the downtime involved with moving the applications back to the restored cluster. It also assumes that the surviving cluster has sufficient resources to run all critical applications indefinitely.

NOTE: In a multiple recovery pair scenario, where more than one primary cluster is configured to share the same recovery cluster, do not use the following procedure to switch the roles of the failed cluster and the surviving cluster.

Use the following procedure:

1. Halt the monitor packages. Issue the following command on each cluster:

   # cmhaltpkg ccmonpkg

2. Edit the Continentalclusters ASCII configuration file. It is necessary to change the definitions of monitoring clusters, and to switch the names of primary and recovery packages in the definitions of recovery groups. It may also be necessary to re-create data sender and data receiver packages.

3. Check and apply the Continentalclusters configuration.

   # cmcheckconcl -v -C cmconcl.config
   # cmapplyconcl -v -C cmconcl.config

4. Restart the monitor packages on each cluster.

   # cmmodpkg -e ccmonpkg

5. View the status of the continental cluster.

   # cmviewconcl

Before applying the edited configuration, prepare the data storage associated with each cluster to match its new role. In addition, change the data replication direction so that data is mirrored from the new primary cluster to the new recovery cluster.

Primary Packages Remaining on the Surviving Cluster using cmswitchconcl

Continentalclusters provides the command cmswitchconcl to facilitate steps two and three described in the section “Primary Packages Remaining on the Surviving Cluster”. The cmswitchconcl command switches the roles of the primary and recovery packages of the Continentalclusters recovery groups for which the specified cluster is defined as the primary cluster.

Do not use the cmswitchconcl command in a multiple recovery pair configuration where more than one primary cluster shares the same recovery cluster. In that configuration the command fails.

To restore disaster tolerance with cmswitchconcl while continuing to run the packages on the surviving cluster, use the following procedure:

1. Halt the monitor package on each cluster.

   # cmhaltpkg ccmonpkg

2. Run the following command.

   # cmswitchconcl \
     -C CurrentContinentalclustersConfigFileName \
     -c OldPrimaryClusterName \
     [-a] [-F NewContinentalclustersConfigFileName]

   This command switches the roles of the primary and recovery packages of the Continentalclusters recovery groups for which OldPrimaryClusterName is defined as the primary cluster. Default values for the monitoring package name (ccmonpkg), the monitoring interval (60 seconds), and the notification scheme (SYSLOG) with a notification delay of 0 seconds are added for cluster OldPrimaryClusterName, which will serve as the recovery-only cluster. To change the default values, edit the file NewContinentalclustersConfigFileName if -F is specified, or the file CurrentContinentalclustersConfigFileName if -F is not specified. If the new configuration file needs editing, do not use the -a option: when -a is specified, the new configuration is applied automatically.

3. If option -a was specified with cmswitchconcl in step 2, skip this step. Otherwise, manually apply the new Continentalclusters configuration.

   If -F was specified in step 2:

   # cmapplyconcl -v -C NewContinentalclustersConfigFileName

   If -F was not specified in step 2:

   # cmapplyconcl -v -C CurrentContinentalclustersConfigFileName

4. Restart the monitor packages on each cluster.

   # cmmodpkg -e ccmonpkg

5. View the status of the continental cluster.

   # cmviewconcl

NOTE: The cluster shared storage configuration file /etc/cmconcl/ccrac/ccrac.config is not updated by cmswitchconcl. The CCRAC_CLUSTER and CCRAC_INSTANCE_PKGS variables in this file must be updated manually on all nodes in the clusters to reflect the new primary cluster and package names.

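
The cmswitchconcl procedure can be sketched as a script. This is an illustration under stated assumptions, not a supported tool: the run and switch_roles names and the DRY_RUN variable are hypothetical, the -a option is deliberately left out so that the generated file can be reviewed before it is applied, and the monitor halt and restart shown here affect only the local cluster, so those two commands still have to be issued on the other cluster as well. With DRY_RUN at its default of 1, the commands are only printed.

```shell
#!/bin/sh
# Print the command when DRY_RUN=1 (the default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Switch roles for every recovery group whose primary cluster is $2.
# $1 = current Continentalclusters configuration file
# $2 = name of the old primary cluster
# $3 = optional new configuration file (passed to cmswitchconcl with -F)
switch_roles() {
    current=$1
    old_primary=$2
    new=${3:-}
    run cmhaltpkg ccmonpkg || return 1
    if [ -n "$new" ]; then
        run cmswitchconcl -C "$current" -c "$old_primary" -F "$new" || return 1
        apply=$new        # -F given: the new file holds the switched roles
    else
        run cmswitchconcl -C "$current" -c "$old_primary" || return 1
        apply=$current    # no -F: the current file holds the switched roles
    fi
    # Applied with -C, as in the other cmapplyconcl examples in this document.
    run cmapplyconcl -v -C "$apply" || return 1
    run cmmodpkg -e ccmonpkg || return 1
    run cmviewconcl
}

# Usage: switch_roles cmconcl.config ClusterA cmconcl.new
```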

The cmswitchconcl command can also switch the package roles of a single recovery group. If only a subset of the primary packages will remain running on the surviving (recovery) cluster, use the -g option of cmswitchconcl. This option reconfigures the roles of the packages of a recovery group and helps retain recovery protection after a failover. The -g option (recovery-group-based role switch) is used in the same way as the -c option (cluster-based role switch). The -c and -g options of the cmswitchconcl command are mutually exclusive.

   # cmswitchconcl \
     -C CurrentContinentalclustersConfigFileName \
     -g RecoveryGroupName \
     [-a] [-F NewContinentalclustersConfigFileName]

The following is a sample of the input and output files for running cmswitchconcl -C sample.input -c ClusterA -F sample.output

sample.input

### Section 1. Cluster Information
CONTINENTAL_CLUSTER_NAME Sample_CC_Cluster

CLUSTER_NAME ClusterA
CLUSTER_DOMAIN cup.hp.com
NODE_NAME node1
NODE_NAME node2
MONITOR_PACKAGE_NAME ccmonpkg

CLUSTER_NAME ClusterB
CLUSTER_DOMAIN cup.hp.com
NODE_NAME node3
NODE_NAME node4
MONITOR_PACKAGE_NAME ccmonpkg
MONITOR_INTERVAL 60 SECONDS

### Section 2. Recovery Groups
RECOVERY_GROUP_NAME RG1
PRIMARY_PACKAGE ClusterA/pkgX
RECOVERY_PACKAGE ClusterB/pkgX'

RECOVERY_GROUP_NAME RG2
PRIMARY_PACKAGE ClusterA/pkgY
RECOVERY_PACKAGE ClusterB/pkgY'
DATA_RECEIVER_PACKAGE ClusterB/pkgR1

RECOVERY_GROUP_NAME RG3
PRIMARY_PACKAGE ClusterB/pkgZ
RECOVERY_PACKAGE ClusterA/pkgZ'

RECOVERY_GROUP_NAME RG4
PRIMARY_PACKAGE ClusterB/pkgW
RECOVERY_PACKAGE ClusterA/pkgW'
DATA_RECEIVER_PACKAGE ClusterA/pkgR2

### Section 3. Monitoring Definitions
CLUSTER_EVENT ClusterA/DOWN
MONITORING_CLUSTER ClusterB
CLUSTER_ALERT 60 SECONDS
NOTIFICATION TEXTLOG /var/opt/resmon/log/data/events.log "CC alert: DOWN"
NOTIFICATION SYSLOG "CC alert: DOWN"
CLUSTER_ALARM 90 SECONDS
NOTIFICATION TEXTLOG /var/opt/resmon/log/data/events.log "CC alarm: DOWN"
NOTIFICATION SYSLOG "CC alarm: DOWN"

sample.output

### Section 1. Cluster Information
CONTINENTAL_CLUSTER_NAME Sample_CC_Cluster

CLUSTER_NAME ClusterA
CLUSTER_DOMAIN cup.hp.com
NODE_NAME node1
NODE_NAME node2
MONITOR_PACKAGE_NAME ccmonpkg
MONITOR_INTERVAL 60 SECONDS

CLUSTER_NAME ClusterB
CLUSTER_DOMAIN cup.hp.com
NODE_NAME node3
NODE_NAME node4

### Section 2. Recovery Groups
RECOVERY_GROUP_NAME RG1
PRIMARY_PACKAGE ClusterB/pkgX'
RECOVERY_PACKAGE ClusterA/pkgX

RECOVERY_GROUP_NAME RG2
PRIMARY_PACKAGE ClusterB/pkgY'
RECOVERY_PACKAGE ClusterA/pkgY
DATA_RECEIVER_PACKAGE ClusterA/pkgR1

RECOVERY_GROUP_NAME RG3
PRIMARY_PACKAGE ClusterB/pkgZ
RECOVERY_PACKAGE ClusterA/pkgZ'

RECOVERY_GROUP_NAME RG4
PRIMARY_PACKAGE ClusterB/pkgW
RECOVERY_PACKAGE ClusterA/pkgW'
DATA_RECEIVER_PACKAGE ClusterA/pkgR2

### Section 3. Monitoring Definitions
CLUSTER_EVENT ClusterB/DOWN
MONITORING_CLUSTER ClusterA
CLUSTER_ALERT 0 MINUTES
NOTIFICATION SYSLOG "CC alert: DOWN"
CLUSTER_ALARM 0 MINUTES
NOTIFICATION SYSLOG "CC alarm: DOWN"

CLUSTER_EVENT ClusterB/UNREACHABLE
MONITORING_CLUSTER ClusterA
CLUSTER_ALERT 0 MINUTES
NOTIFICATION SYSLOG "CC alert: UNREACHABLE"
CLUSTER_ALARM 0 MINUTES
NOTIFICATION SYSLOG "CC alarm: UNREACHABLE"

CLUSTER_EVENT ClusterB/ERROR
MONITORING_CLUSTER ClusterA
CLUSTER_ALERT 0 MINUTES
NOTIFICATION SYSLOG "CC alert: ERROR"

CLUSTER_EVENT ClusterB/UP
MONITORING_CLUSTER ClusterA
CLUSTER_ALERT 0 MINUTES
NOTIFICATION SYSLOG "CC alert: UP"

Newly Created Cluster Will Run Primary Packages

After creating a new cluster to replace the damaged cluster, restore the critical applications to the new cluster and restore the other cluster to its role as a backup for the recovered packages.

1. Configure the new cluster as a Serviceguard cluster. Use the cmviewcl command on the surviving cluster and compare the results to the new cluster configuration. Correct any inconsistencies on the new cluster.

2. Halt the monitor package on the surviving recovery cluster.

   # cmhaltpkg ccmonpkg

3. Edit the Continentalclusters configuration file to replace the data from the old failed cluster with data from the new cluster.

4. Check and apply the Continentalclusters configuration.

   # cmcheckconcl -v -C cmconcl.config
   # cmapplyconcl -v -C cmconcl.config

5. Do the following for each recovery group where the new cluster will run the primary package:

   a. Synchronize the data from the disks on the surviving recovery cluster to the disks on the new cluster. This may be time-consuming.

   b. Halt the application on the surviving recovery cluster if necessary, and start it on the new cluster.

   To keep application downtime to a minimum, start the primary package on the new cluster before resynchronizing the data of the next recovery group.

6. If the new cluster acts as a recovery cluster for any recovery group, create a monitor package for the new cluster. Apply the configuration of the new monitor package.

   # cmapplyconf -P ccmonpkg.config

7. Restart the monitor package on the surviving cluster.

   # cmrunpkg ccmonpkg

8. View the status of the continental cluster.

   # cmviewconcl
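
The ordering of the per-recovery-group step, in which each group is resynchronized and its primary package started before the next group is touched, can be made explicit by printing a plan. The following sketch only prints text and runs no Serviceguard commands. The function name, the recovery:primary argument format, and the package names in the usage comment are hypothetical, and the resynchronization line stands in for whatever data replication procedure the site uses.

```shell
#!/bin/sh
# Print an ordered, per-recovery-group plan for moving primary packages
# from the surviving cluster to the new cluster.
# $1 = surviving cluster name
# $2 = new cluster name
# remaining arguments = one "recovery_package:primary_package" pair per group
print_failback_plan() {
    surviving=$1
    new=$2
    shift 2
    for pair in "$@"; do
        recovery_pkg=${pair%%:*}    # text before the first ':'
        primary_pkg=${pair#*:}      # text after the first ':'
        echo "[$surviving -> $new] resynchronize data for $primary_pkg"
        echo "[$surviving] cmhaltpkg $recovery_pkg"
        echo "[$new] cmrunpkg $primary_pkg"
    done
}

# Usage (hypothetical names):
#   print_failback_plan ClusterB ClusterA pkgX_rec:pkgX pkgY_rec:pkgY
```

Each line is prefixed with the cluster on which the action takes place, because the halt and the start are issued on different clusters.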