Configuring Packages for Automatic Disaster Recovery

After completing the following steps, packages will be able
to automatically fail over to an alternate node in another data
center and still have access to the data that they need in order
to operate. This procedure must be repeated on all the cluster nodes for
each Serviceguard package so the application can fail over to any
of the nodes in the cluster. Customizations include editing an environment
file to set environment variables, and customizing the package control
script to include customer-defined run and halt commands, as appropriate.
The package control script must also be customized for the particular application
software that it will control. Consult the Managing Serviceguard user’s
guide for more detailed instructions on how to start, halt, and
move packages and their services between nodes in a cluster. For
ease of troubleshooting, configure and test one package at a time.

Create a directory /etc/cmcluster/pkgname for each package:

    # mkdir /etc/cmcluster/pkgname

Create a package configuration file:

    # cd /etc/cmcluster/pkgname
    # cmmakepkg -p pkgname.config

Customize the package configuration file as appropriate to your application. Be sure to include the pathname of the control script (/etc/cmcluster/pkgname/pkgname.cntl) for the RUN_SCRIPT and HALT_SCRIPT parameters.

In the <pkgname>.config file, list the node names in the order in which you want the package to fail over. For performance reasons, it is recommended that the package fail over locally first, then to the remote data center.

Set the value of RUN_SCRIPT_TIMEOUT in the package configuration file to NO_TIMEOUT or to a value large enough to allow for the extra startup time required to obtain status from the EVA.

NOTE: If using the EMS disk monitor as a package resource, do not use NO_TIMEOUT. Otherwise, package shutdown will hang if there is no access from the host to the package disks.

This toolkit may increase package startup time by 5 minutes or more. Packages with many disk devices take longer to start up than those with fewer devices because of the time needed to get device status from the EVA. In clusters with multiple packages that use devices on the EVA, startup time increases for all of those packages when more than one package is starting at the same time.

Create a package control script:

    # cmmakepkg -s pkgname.cntl

Customize the control script as appropriate to your application using the guidelines in the Managing Serviceguard user’s guide. Standard Serviceguard package customizations include modifying the VG, LV, FS, IP, SUBNET, SERVICE_NAME, SERVICE_CMD and SERVICE_RESTART parameters.
Be sure to set FS_UMOUNT_COUNT to 1.

Add customer-defined run and halt commands in the appropriate places according to the needs of the application. Refer to the Managing Serviceguard user’s guide for more detailed information on these functions.

Copy the environment file template /opt/cmcluster/toolkit/SGCAEVA/caeva.env to the package directory, naming it pkgname_caeva.env:

    # cp /opt/cmcluster/toolkit/SGCAEVA/caeva.env \
      /etc/cmcluster/pkgdir/pkgname_caeva.env

NOTE: If the file name of the package control script is not based on the package name, the environment file name must still follow the naming convention: the file name of the package control script without its file extension, an underscore, and the type of data replication technology used (caeva). The extension of the file must be env. The following examples demonstrate how the environment file name should be chosen.

Example 1: If the file name of the control script is pkg.cntl, the environment file name would be pkg_caeva.env.

Example 2: If the file name of the control script is control_script.sh, the environment file name would be control_script_caeva.env.

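The naming convention in the NOTE above can be expressed as a small shell function. This is only an illustrative sketch: env_file_name is a hypothetical helper, not something shipped with the toolkit, and it assumes that "the file extension" means everything after the last dot in the control script file name.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Metrocluster toolkit): derive the
# environment file name from a control script file name.
env_file_name() {
    script=$(basename "$1")    # drop any leading directory
    base=${script%.*}          # drop the last extension, if there is one
    printf '%s_caeva.env\n' "$base"
}

env_file_name /etc/cmcluster/pkg/pkg.cntl    # prints pkg_caeva.env
env_file_name control_script.sh              # prints control_script_caeva.env
```

The two calls reproduce Example 1 and Example 2 from the NOTE.
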
Edit the environment file <pkgname>_caeva.env as follows:

- Set the CLUSTER_TYPE variable to METRO if this is a Metrocluster.
- Set the PKGDIR variable to the full path name of the directory where the control script has been placed. This directory, which is used for status data files, must be unique for each package. For example, set PKGDIR to /etc/cmcluster/package_name, removing any quotes around the file names. The operator may create the FORCEFLAG file in this directory. See Appendix B for an explanation of these variables.
- Set the DT_APPLICATION_STARTUP_POLICY variable to one of two policies: Availability_Preferred or Data_Currency_Preferred.
- Set the WAIT_TIME variable to the timeout, in minutes, to wait for completion of the data merge from the source to the destination volume before starting up the package on the destination volume. If the wait time expires and merging is still in progress, the package will fail to start with an error that prevents restarting on any node in the cluster.
- Set the DR_GROUP_NAME variable to the name of the DR Group used by this package. This DR Group name is defined when the DR Group is created.
- Set the DC1_STORAGE_WORLD_WIDE_NAME variable to the world wide name (WWN) of the EVA storage system that resides in Data Center 1. This WWN can be found on the front panel of the EVA controller, or from the Command View EVA UI.
- Set the DC1_SMIS_LIST variable to the list of Management Servers that reside in Data Center 1. Separate multiple names with commas.
- Set the DC1_HOST_LIST variable to the list of clustered nodes that reside in Data Center 1. Separate multiple names with commas.
- Set the DC2_STORAGE_WORLD_WIDE_NAME variable to the world wide name (WWN) of the EVA storage system that resides in Data Center 2. This WWN can be found on the front panel of the EVA controller, or from the Command View EVA UI.
- Set the DC2_SMIS_LIST variable to the list of Management Servers that reside in Data Center 2. Separate multiple names with commas.
- Set the DC2_HOST_LIST variable to the list of clustered nodes that reside in Data Center 2. Separate multiple names with commas.
- Set the QUERY_TIME_OUT variable to the number of seconds to wait for a response from the SMI-S CIMOM in the Management Server. The default timeout is 300 seconds. The recommended minimum value is 20 seconds.

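Some of the settings above can be sanity-checked before the package is started. The following is a minimal sketch, not a toolkit utility: check_caeva_env is a hypothetical function, and it assumes the environment file consists of plain NAME=value shell assignments that can be read with the shell's "." command. It checks only what the text above states (the two allowed policies, the 20-second recommended minimum and 300-second default for QUERY_TIME_OUT, and that PKGDIR is a directory) plus that the remaining variables are not empty.

```shell
#!/bin/sh
# Hypothetical sanity check for a <pkgname>_caeva.env file.
# Assumption: the file contains plain NAME=value shell assignments.
# Pass a path that contains a slash so "." does not search PATH.
check_caeva_env() (
    . "$1" || exit 1
    rc=0

    case "$DT_APPLICATION_STARTUP_POLICY" in
        Availability_Preferred|Data_Currency_Preferred) ;;
        *) echo "DT_APPLICATION_STARTUP_POLICY: unexpected value" >&2; rc=1 ;;
    esac

    # QUERY_TIME_OUT defaults to 300 seconds when it is not set.
    q=${QUERY_TIME_OUT:-300}
    case "$q" in
        *[!0-9]*) echo "QUERY_TIME_OUT: not a number" >&2; rc=1 ;;
        *) [ "$q" -ge 20 ] ||
               { echo "QUERY_TIME_OUT: below the 20 second minimum" >&2; rc=1; } ;;
    esac

    [ -d "$PKGDIR" ] || { echo "PKGDIR: not a directory" >&2; rc=1; }

    # The remaining variables only need to be non-empty for this check.
    for v in DR_GROUP_NAME \
             DC1_STORAGE_WORLD_WIDE_NAME DC1_SMIS_LIST DC1_HOST_LIST \
             DC2_STORAGE_WORLD_WIDE_NAME DC2_SMIS_LIST DC2_HOST_LIST; do
        eval "val=\${$v}"
        [ -n "$val" ] || { echo "$v: not set" >&2; rc=1; }
    done

    exit "$rc"
)
```

A call such as check_caeva_env /etc/cmcluster/pkgname/pkgname_caeva.env returns zero when every check passes and prints one line per problem otherwise. The function body runs in a subshell so that the variables it reads do not leak into the calling shell.
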
After customizing the control script file and creating the environment file, and before starting up the package, do a syntax check on the control script using the following command (be sure to include the -n option to perform syntax checking only):

    # sh -n <pkgname.cntl>

If any messages are returned, correct the syntax errors.

Distribute the Metrocluster CA EVA configuration, environment and control script files to the other nodes in the cluster by using ftp or rcp:

    # rcp -p /etc/cmcluster/pkgname/* \
      other_node:/etc/cmcluster/pkgname

See the example script Samples/ftpit to see how to semi-automate the copy using ftp. This script assumes the package directories already exist on all nodes. Using ftp may be preferable at your organization, since it does not require the use of a .rhosts file for root. Root access via .rhosts may create a security issue.

Verify that each node in the Serviceguard cluster has the following files in the directory /etc/cmcluster/pkgname:

- pkgname.cntl         Serviceguard package control script
- pkgname_caeva.env    Metrocluster CA EVA environment file
- pkgname.config       Serviceguard package ASCII configuration file
- pkgname.sh           Package monitor shell script, if applicable
- other files          Any other scripts used to manage Serviceguard packages

Check the configuration using the cmcheckconf -P pkgname.config command, then apply the Serviceguard configuration using the cmapplyconf -P pkgname.config command or SAM.

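The per-node file verification described above could be scripted along the following lines. This is a sketch only: check_pkg_files is a hypothetical helper, it checks only the three files that every package is expected to have (the monitor script and other files are optional, so they are not checked), and it must be run on each node in turn.

```shell
#!/bin/sh
# Hypothetical helper: report which of the expected package files are
# missing from a package directory on the local node.
# Usage: check_pkg_files <package directory> <package name>
check_pkg_files() {
    pkgdir=$1
    pkgname=$2
    missing=0
    for f in "$pkgname.cntl" "${pkgname}_caeva.env" "$pkgname.config"; do
        if [ ! -f "$pkgdir/$f" ]; then
            echo "missing: $pkgdir/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, check_pkg_files /etc/cmcluster/pkgname pkgname returns zero only when all three files are present, so it can be used as a guard before running cmcheckconf.
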
The Serviceguard cluster is ready to automatically switch
packages to nodes in remote data centers using Metrocluster CA EVA.

Maintaining a Cluster that Uses Metrocluster CA EVA

While the package is running, a manual storage failover on CA EVA outside of the Metrocluster CA EVA software can cause the package to halt due to an unexpected condition of the CA EVA volumes. It is recommended that no manual storage failover be performed while the package is running.

A manual change of the CA EVA link state from suspend to resume is allowed to re-establish data replication while the package is running.

CA EVA Link Suspend and Resume Modes

Upon CA links recovery, CA EVA automatically normalizes (the CA EVA term for “synchronizes”) the source Vdisk and destination Vdisk data.

If the log disk is not full when a CA connection is re-established, the contents of the log are written to the destination Vdisk to synchronize it with the source Vdisk. This process of writing the log contents, in the order that the writes occurred, is called merging.
Since write ordering is maintained, the data on the destination Vdisk is consistent while merging is in progress.

If the log disk is full when a CA connection is re-established, a full copy from the source Vdisk to the destination Vdisk is done. Since a full copy is done at the block level, the data on the destination Vdisk is not consistent until the copy completes.

If all CA links fail and failsafe mode is disabled, the application package continues to run and writes new I/O to the source Vdisk. The virtual log in the EVA controller collects host write commands and data, and the DR group's log state changes from normal to logging. When a DR group is in a logging state, the log grows in proportion to the amount of write I/O being sent to the source Vdisks. If the links are down for a long time, the log disk may become full, and a full copy will happen automatically upon link recovery. If the primary site fails while the copy is in progress, the data on the destination Vdisk is not consistent and is not usable. To prevent this, after all CA links fail, it is recommended to manually put the CA link state into suspend mode by using the Command View EVA UI. When the CA link is in suspend state, CA EVA will not try to normalize the source and destination Vdisks upon links recovery until you manually change the link state to resume mode.

There might be situations when the package has to be taken down for maintenance without having the package move to another node. The following procedure is recommended for normal maintenance of the Metrocluster CA EVA:

Stop the package with the appropriate Serviceguard command.

    # cmhaltpkg pkgname

Distribute the Metrocluster CA EVA configuration changes.

    # cmapplyconf -P pkgname.config

Start the package with the appropriate Serviceguard command.

    # cmmodpkg -e pkgname

Planned maintenance is treated the same as a failure by the cluster. If you take a node down for maintenance, package failover and quorum calculation are based on the remaining nodes. Make sure that the nodes are taken down evenly at each site, and that enough nodes remain on-line to form a quorum if a failure occurs. See “Example Failover Scenarios with Two Arbitrators”.

After resynchronization is complete, halt the package on the failover site, and restart it on the primary site. Metrocluster will then do a failover of the storage, which will trigger CA EVA to swap the personalities between the source and the destination Vdisks, returning source status to the primary site.

There might be situations when the cluster has to be re-configured for maintenance purposes. The following procedure is recommended for re-configuration of the Metrocluster CA EVA:

Before running the cmapplyconf -C command, it is necessary to remove the cluster awareness from the Metrocluster volume groups. This is done by halting all Metrocluster packages and then running the appropriate command on the source side.

    # vgchange -a n <vg>

Halt the entire cluster and apply your changes with the Serviceguard command.

    # cmapplyconf -C

Re-start the cluster and mark the cluster ID on all Metrocluster volume groups. Run the following command on the source side.

    # vgchange -c y <vg>