The following section describes how to configure a continental
cluster solution using Continuous Access EVA, which requires the
HP Metrocluster with Continuous Access EVA product.
Setting up a Primary Package on the Primary Cluster
Use the procedures in this section to configure a primary
package on the primary cluster. Consult the Serviceguard documentation
for more detailed instructions on setting up Serviceguard with packages,
and for instructions on how to start, halt, and move packages and
their services between nodes in a cluster.
Install Continentalclusters on all the cluster nodes in the primary cluster. (Skip this step if the software has been preinstalled.) Run swinstall(1m) to install HP Continentalclusters from an SD depot.
When swinstall(1m) has completed, create a directory for the new package in the primary cluster:
# mkdir /etc/cmcluster/<package_name>
Create a Serviceguard package configuration file in the primary cluster with the following commands:
# cd /etc/cmcluster/<package_name>
# cmmakepkg -p <package_name>.ascii
Customize the Serviceguard package configuration file as appropriate to your application. Be sure to include the pathname of the control script, /etc/cmcluster/<package_name>/<package_name>.cntl, for the RUN_SCRIPT and HALT_SCRIPT parameters.
Set the AUTO_RUN flag to NO so that the package does not start when the cluster starts. Use cmmodpkg to enable package switching on the primary packages only after they have started. Enabling package switching in the package configuration would start the primary package automatically when the cluster starts. That is unsafe after a primary cluster disaster: if the recovery package is running on the recovery cluster, the primary package must not be started until the recovery package has been stopped.
Create a package control script with the command:
# cmmakepkg -s pkgname.cntl
Customize the control script as appropriate to your application using the guidelines in Managing Serviceguard. Standard Serviceguard package customizations include modifying the VG, LV, FS, IP, SUBNET, SERVICE_NAME, SERVICE_CMD, and SERVICE_RESTART parameters. Be sure to set LV_UMOUNT_COUNT to 1 or greater.
Add customer-defined run and halt commands in the appropriate places according to the needs of the application. See Managing Serviceguard for more information on these functions.
Copy the environment file template /opt/cmcluster/toolkit/SGCA/caeva.env to the package directory, naming it pkgname_caeva.env:
# cp /opt/cmcluster/toolkit/SGCA/caeva.env \
/etc/cmcluster/pkgname/pkgname_caeva.env
NOTE: If the file name of the package control script is not based on the package name, the environment file name must still follow the naming convention. The name is the file name of the package control script without its extension, followed by an underscore and the type of data replication technology used (caeva). The extension of the file must be env. The following examples show how to choose the environment file name.
Example 1: If the file name of the control script is pkg.cntl, the environment file name is pkg_caeva.env.
Example 2: If the file name of the control script is control_script.sh, the environment file name is control_script_caeva.env.
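The naming convention in the note can be sketched as a small shell function. This is only an illustration: env_file_name is a hypothetical helper, not a command shipped with Metrocluster or Continentalclusters, and it assumes the control script name has at most one extension to strip.

```shell
# Derive the environment file name from a control script file name.
# env_file_name is a hypothetical helper used only for illustration.
env_file_name() {
    script=$(basename "$1")    # drop any directory part
    base=${script%.*}          # strip the last file extension, if any
    printf '%s_caeva.env\n' "$base"
}

env_file_name pkg.cntl             # prints pkg_caeva.env
env_file_name control_script.sh    # prints control_script_caeva.env
```

The two calls reproduce Example 1 and Example 2 from the note.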
Edit the environment file <pkgname>_caeva.env as follows:
- Set the CLUSTER_TYPE variable to CONTINENTAL.
- Set the PKGDIR variable to the full path name of the directory where the control script has been placed. This directory, which is used for status data files, must be unique for each package. For example, set PKGDIR to /etc/cmcluster/package_name, removing any quotes around the file names. The operator may create the FORCEFLAG file in this directory. See Appendix B for a description of these variables.
- Set the DT_APPLICATION_STARTUP_POLICY variable to one of two policies: Availability_Preferred or Data_Currency_Preferred.
- Set the WAIT_TIME variable to the timeout, in minutes, to wait for completion of the data merge from the source to the destination volume before starting up the package on the destination volume. If the wait time expires and merging is still in progress, the package fails to start with an error that prevents restarting on any node in the cluster.
- Set the DR_GROUP_NAME variable to the name of the DR Group used by this package. This DR Group name is defined when the DR Group is created.
- Set the DC1_STORAGE_WORLD_WIDE_NAME variable to the world wide name (WWN) of the EVA storage system that resides in Data Center 1. This WWN can be found on the front panel of the EVA controller or in the Command View EVA user interface.
- Set the DC1_SMIS_LIST variable to the list of Management Servers that reside in Data Center 1. Separate multiple names with commas.
- Set the DC1_HOST_LIST variable to the list of clustered nodes that reside in Data Center 1. Separate multiple names with commas.
- Set the DC2_STORAGE_WORLD_WIDE_NAME variable to the world wide name (WWN) of the EVA storage system that resides in Data Center 2. This WWN can be found on the front panel of the EVA controller or in the Command View EVA user interface.
- Set the DC2_SMIS_LIST variable to the list of Management Servers that reside in Data Center 2. Separate multiple names with commas.
- Set the DC2_HOST_LIST variable to the list of clustered nodes that reside in Data Center 2. Separate multiple names with commas.
- Set the QUERY_TIME_OUT variable to the number of seconds to wait for a response from the SMI-S CIMOM in the Management Server. The default timeout is 300 seconds. The recommended minimum value is 20 seconds.
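Before editing the environment file, it can help to sanity-check two of the values described above. The following is a minimal sketch, assuming only the rules stated in this section (two allowed startup policies, and a recommended minimum of 20 seconds for QUERY_TIME_OUT). check_caeva_settings is a hypothetical helper, not part of the product, and it does not read or write the environment file itself.

```shell
# Check two values intended for <pkgname>_caeva.env.
# check_caeva_settings is a hypothetical helper; it encodes only the
# rules stated in this section and does not parse the file itself.
check_caeva_settings() {
    policy=$1     # value intended for DT_APPLICATION_STARTUP_POLICY
    timeout=$2    # value intended for QUERY_TIME_OUT, in seconds

    case "$policy" in
        Availability_Preferred|Data_Currency_Preferred) ;;
        *) echo "invalid DT_APPLICATION_STARTUP_POLICY: $policy"
           return 1 ;;
    esac

    case "$timeout" in
        ''|*[!0-9]*) echo "QUERY_TIME_OUT must be a whole number of seconds"
                     return 1 ;;
    esac

    if [ "$timeout" -lt 20 ]; then
        echo "QUERY_TIME_OUT is below the recommended minimum of 20 seconds"
        return 1
    fi
    echo "ok"
}

check_caeva_settings Availability_Preferred 300
```

Treating a value below 20 seconds as a failure is a design choice of this sketch; the text above describes 20 seconds as a recommendation rather than a hard limit.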
Distribute the Metrocluster CA EVA configuration, environment, and control script files to the other nodes in the cluster by using ftp or rcp:
# rcp -p /etc/cmcluster/pkgname/* \
other_node:/etc/cmcluster/pkgname
Apply the Serviceguard configuration using the cmapplyconf command or SAM.
Verify that each node in the Serviceguard cluster has the following files in the directory /etc/cmcluster/pkgname:
- pkgname.cntl: Serviceguard package control script
- pkgname_caeva.env: Metrocluster CA EVA environment file
- pkgname.ascii: Serviceguard package ASCII configuration file
- pkgname.sh: Package monitor shell script, if applicable
- other files: Any other scripts you use to manage Serviceguard packages
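The file check above can be sketched as a shell function to run on each node. This is an illustration under stated assumptions: missing_pkg_files is a hypothetical helper, the package files are assumed to be named after the package, and the optional monitor script (pkgname.sh) is deliberately not checked because it exists only where applicable.

```shell
# Print the required package files that are missing from a directory.
# missing_pkg_files is a hypothetical helper used only for illustration.
missing_pkg_files() {
    dir=$1    # package directory, for example /etc/cmcluster/pkgname
    pkg=$2    # package name used as the file name prefix
    for f in "$pkg.cntl" "${pkg}_caeva.env" "$pkg.ascii"; do
        [ -f "$dir/$f" ] || echo "$f"
    done
}

# Example call; prints nothing when all three files are present.
missing_pkg_files /etc/cmcluster/pkgname pkgname
```

An empty result means the control script, environment file, and ASCII configuration file are all in place on that node.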
The Serviceguard cluster is ready to automatically switch packages to nodes in remote data centers using Metrocluster CA EVA.
Using standard Serviceguard commands (cmruncl, cmhaltcl, cmrunpkg, cmhaltpkg), test the primary cluster for cluster and package startup and package failover.
Any running package on the primary cluster that will have a counterpart on the recovery cluster must be halted at this time.
Setting up a Recovery Package on the Recovery Cluster
Use the procedures in this section to configure a recovery
package on the recovery cluster. Consult the Serviceguard documentation
for more detailed instructions on setting up Serviceguard with packages,
and for instructions on how to start, halt, and move packages and
their services between nodes in a cluster. Use the following steps to set up the recovery package:
Install Continentalclusters on all the cluster nodes in the recovery cluster. (Skip this step if the software has been preinstalled.)
NOTE: Serviceguard should already be installed on all the cluster nodes.
Run swinstall(1m) to install Continentalclusters from an SD depot.
When swinstall(1m) has completed, create a directory as follows for the new package in the recovery cluster:
# mkdir /etc/cmcluster/<package_name>
Create a Serviceguard package configuration file in the recovery cluster with the commands:
# cd /etc/cmcluster/<package_name>
# cmmakepkg -p <package_name>.ascii
Customize it as appropriate to your application. Be sure to include the pathname of the control script (/etc/cmcluster/<package_name>/<package_name>.cntl) for the RUN_SCRIPT and HALT_SCRIPT parameters.
Set the AUTO_RUN flag to NO so that the package does not start when the cluster starts. Do not use cmmodpkg to enable package switching on any recovery package, because enabling package switching starts the recovery package automatically. The cmrecovercl command sets package switching on a recovery package automatically when it successfully starts the recovery package on the recovery cluster.
Create a package control script with the command:
# cmmakepkg -s pkgname.cntl
Customize the control script as appropriate to your application using the guidelines in Managing Serviceguard. Standard Serviceguard package customizations include modifying the VG, LV, FS, IP, SUBNET, SERVICE_NAME, SERVICE_CMD, and SERVICE_RESTART parameters. Be sure to set LV_UMOUNT_COUNT to 1 or greater.
NOTE: Some control script variables, such as VG and LV, must be the same on the recovery cluster as on the primary cluster. Others, such as FS, SERVICE_NAME, SERVICE_CMD, and SERVICE_RESTART, are probably the same as on the primary cluster. Others, such as IP and SUBNET, are probably different from those on the primary cluster. Review all the variables accordingly.
Add customer-defined run and halt commands in the appropriate places according to the needs of the application. See Managing Serviceguard for more information on these functions.
Copy the environment file template /opt/cmcluster/toolkit/SGCA/caeva.env to the package directory, naming it pkgname_caeva.env:
# cp /opt/cmcluster/toolkit/SGCA/caeva.env \
/etc/cmcluster/pkgname/pkgname_caeva.env
Edit the environment file <pkgname>_caeva.env as follows:
- Set the CLUSTER_TYPE variable to CONTINENTAL.
- Set the PKGDIR variable to the full path name of the directory where the control script has been placed. This directory, which is used for status data files, must be unique for each package. For example, set PKGDIR to /etc/cmcluster/package_name, removing any quotes around the file names. The operator may create the FORCEFLAG file in this directory. See Appendix B for an explanation of these variables.
- Set the DT_APPLICATION_STARTUP_POLICY variable to one of two policies: Availability_Preferred or Data_Currency_Preferred.
- Set the WAIT_TIME variable to the timeout, in minutes, to wait for completion of the data merge from the source to the destination volume before starting up the package on the destination volume. If the wait time expires and merging is still in progress, the package fails to start with an error that prevents restarting on any node in the cluster.
- Set the DR_GROUP_NAME variable to the name of the DR Group used by this package. This DR Group name is defined when the DR Group is created.
- Set the DC1_STORAGE_WORLD_WIDE_NAME variable to the world wide name (WWN) of the EVA storage system that resides in Data Center 1. This WWN can be found on the front panel of the EVA controller or in the Command View EVA user interface.
- Set the DC1_SMIS_LIST variable to the list of Management Servers that reside in Data Center 1. Separate multiple names with commas.
- Set the DC1_HOST_LIST variable to the list of clustered nodes that reside in Data Center 1. Separate multiple names with commas.
- Set the DC2_STORAGE_WORLD_WIDE_NAME variable to the world wide name (WWN) of the EVA storage system that resides in Data Center 2. This WWN can be found on the front panel of the EVA controller or in the Command View EVA user interface.
- Set the DC2_SMIS_LIST variable to the list of Management Servers that reside in Data Center 2. Separate multiple names with commas.
- Set the DC2_HOST_LIST variable to the list of clustered nodes that reside in Data Center 2. Separate multiple names with commas.
- Set the QUERY_TIME_OUT variable to the number of seconds to wait for a response from the SMI-S CIMOM in the Management Server. The default timeout is 300 seconds. The recommended minimum value is 20 seconds.
Distribute the Metrocluster CA EVA configuration, environment, and control script files to the other nodes in the cluster by using ftp or rcp:
# rcp -p /etc/cmcluster/pkgname/* \
other_node:/etc/cmcluster/pkgname
See the example script Samples/ftpit for a way to semi-automate the copy using ftp. This script assumes the package directories already exist on all nodes. Using ftp may be preferable at your organization, since it does not require a .rhosts file for root. Root access via .rhosts may create a security issue.
Apply the Serviceguard configuration using the cmapplyconf command or SAM.
Verify that each node in the Serviceguard cluster has the following files in the directory /etc/cmcluster/pkgname:
- bkpkgname.cntl: Serviceguard package control script
- bkpkgname_caeva.env: Metrocluster CA EVA environment file
- bkpkgname.ascii: Serviceguard package ASCII configuration file
- bkpkgname.sh: Package monitor shell script, if applicable
- other files: Any other scripts you use to manage Serviceguard packages
Make sure the packages on the primary cluster are not running. Using standard Serviceguard commands (cmruncl, cmhaltcl, cmrunpkg, cmhaltpkg), test the recovery cluster for cluster and package startup and package failover.
Any running package on the recovery cluster that has a counterpart on the primary cluster should be halted at this time.
Setting up the Continental Cluster Configuration
The steps below are the basic procedure for setting up the Continentalclusters
configuration file and the monitoring packages on the two clusters.
For complete details on creating and editing the configuration file,
refer to Chapter 4 “Designing a Continental Cluster”.
Generate the Continentalclusters configuration using the following command:
# cmqueryconcl -C cmconcl.config
Edit the configuration file cmconcl.config with the names of the two clusters, the nodes in each cluster, the recovery groups, and the monitoring definitions. The recovery groups define the primary and recovery packages. When data replication is done using Continuous Access EVA, there are no data sender and receiver packages. Define the monitoring parameters, the notification mechanism (ITO, email, console, SNMP, syslog, or tcp), and the notification type (alert or alarm) based on the cluster status (unknown, down, up, or error). Descriptions of these can be found in the configuration file generated in the previous step.
Edit the continental cluster security file /etc/opt/cmom/cmomhosts to allow or deny hosts read access by the monitor software.
On all nodes in both clusters, copy the monitor package files from /opt/cmconcl/scripts to /etc/cmcluster/ccmonpkg. Edit the monitor package configuration as needed in the file /etc/cmcluster/ccmonpkg/ccmonpkg.config. Set the AUTO_RUN flag to YES. This is in contrast to the flag setting for the application packages: the monitor package should start automatically when the cluster is formed.
Apply the monitor package to both cluster configurations using the following command:
# cmapplyconf -P /etc/cmcluster/ccmonpkg/ccmonpkg.config
Apply the continental cluster configuration file using cmapplyconcl. Files are placed in /etc/cmconcl/instances. There is no change to /etc/cmcluster/cmclconfig, nor is there an equivalent file for Continentalclusters. Example:
# cmapplyconcl -C cmconcl.config
Start the monitor package on both clusters.
NOTE: The monitor package for a cluster checks the status of the other cluster and issues alerts and alarms, as defined in the Continentalclusters configuration file, based on the other cluster’s status.
Check /var/adm/syslog/syslog.log for messages. Also check the ccmonpkg package log file.
Start the primary packages on the primary cluster using cmrunpkg. Test local failover within the primary cluster.
View the status of the Continentalclusters primary and recovery clusters, including configured event data:
The continental cluster is now ready for testing. See “Testing the Continental Cluster”.
Switching to the Recovery Cluster in Case of Disaster
It is vital that the administrator verify that recovery is needed after receiving a cluster alert or alarm, because network failures may produce false alarms. After validating a failure, start the recovery process using the cmrecovercl [-f] command. Note the following:
- During an alert, cmrecovercl will not start the recovery packages unless the -f option is used.
- During an alarm, cmrecovercl will start the recovery packages without the -f option.
- When there is neither an alert nor an alarm condition, cmrecovercl cannot start the recovery packages on the recovery cluster. This applies not only when no alert or alarm was issued, but also when an alert or alarm was issued and the primary cluster has since recovered, so that its current status is Up.
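The three rules above amount to a small decision table, sketched here as a shell function. recovery_would_start is a hypothetical helper used only to restate the rules; it does not call cmrecovercl or query the cluster, and the condition names passed to it (alert, alarm, none) are labels chosen for this sketch.

```shell
# Restate the cmrecovercl rules as a decision table.
# recovery_would_start is a hypothetical helper; it runs no cluster command.
recovery_would_start() {
    condition=$1    # alert, alarm, or none (labels chosen for this sketch)
    force=$2        # yes if the -f option would be given, no otherwise
    case "$condition" in
        alarm) echo yes ;;
        alert) if [ "$force" = yes ]; then echo yes; else echo no; fi ;;
        *)     echo no ;;
    esac
}

recovery_would_start alert no     # prints no
recovery_would_start alert yes    # prints yes
recovery_would_start alarm no     # prints yes
recovery_would_start none yes     # prints no
```

The last call reflects the rule that, with neither an alert nor an alarm in effect, the recovery packages cannot be started.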
Failover to Recovery Site
After receiving the Continentalclusters alerts and alarm, the administrators at the recovery site follow the prescribed processes and recovery procedures to start the protected applications on the recovery cluster.
The recovery package control script evaluates the status of the DR group used by the package and fails the DR group over to the EVA at the recovery site. After a successful failover, the DR group in the recovery site's EVA becomes the source and is accessible in read/write mode.
After the recovery package is up and running, the EVA at the recovery site has more current data than the EVA at the primary site.
Failover Scenarios
The goal of HP Continentalclusters is to maximize system and application availability. However, even systems configured with Continentalclusters can experience hardware failures at the primary site or the recovery site, as well as failures of the hardware or networking that connects the two sites. The following scenarios address some of those failures and suggest recovery approaches applicable to environments using data replication provided by HP StorageWorks EVA series disk arrays and Continuous Access (CA).
The primary site has lost power for a prolonged time, including backup power (UPS), to both the systems and the disk arrays that make up the Serviceguard cluster at the primary site. There is no loss of data on either the EVA disk array or the operating systems of the systems at the primary site.
Failback to the Primary Site
In this scenario, the EVA at the primary site is down only because of the loss of power, so the storage configuration information and the application data from before the power failure remain intact in the EVA. When the primary site’s power is restored, the EVA is up and running, and the CA links are up, the CA EVA software automatically resynchronizes the data from the recovery site's EVA back to the primary site’s EVA. If the resynchronization is a full copy operation, the data in the primary site's EVA is not consistent and is not usable until the full copy (resynchronization) completes.
It is recommended to wait until the resynchronization is complete before failing back the packages to the primary site. The state of the DR group in the primary site’s EVA can be checked with either Command View (CV) EVA or the SSSU command. If the state of each Vdisk in the DR group is shown as “Normal”, the resynchronization is complete and the user can move the packages back to the primary site.
The primary site HP StorageWorks EVA disk array experienced a catastrophic hardware failure and all data was lost on the array.
Failback to the Primary Site
In this scenario, the disk array is repaired or a new EVA array is commissioned at the primary site. Before the application can fail back to the primary site, the EVA at the recovery site (now the source storage) needs to establish the replication relationship with the new EVA at the primary site (now the destination storage). Refer to the procedure named “Return Operations to Replaced New Storage Hardware” in the “Continuous Access EVA Operation Guide” to rebuild the DR groups configured in the EVA. Once the DR groups are rebuilt and the destination storage is synchronized with the source storage, the packages can be failed back to the primary site.
The primary site has lost power, which impacts only the systems in the primary cluster. The primary cluster is down, but the EVA disk array and the CA links to the recovery site are up and running.
In this scenario, the EVA disk arrays at both sites are up and running, and the CA links are functioning. When the recovery packages are up and running on the recovery site, CA EVA automatically switches the replication direction; the new data written to the recovery site's EVA is replicated to the primary site's EVA.
After the primary cluster is back online, the packages can be failed back to the primary site.
Reconfiguring Recovery Group Site Identities in Continentalclusters after a Recovery
Consider a disaster scenario in which the primary site goes out of operation and there is no loss of data on the disk array or the servers. After the recovery is completed, the recovered application can continue to run at the recovery site without having to fail back when the primary cluster becomes available at a later point in time. This avoids further downtime for the recovered application. However, it is also desirable for the applications to have the same level of recovery capability at their new site as they had at their original primary site.
In this scenario, Continentalclusters can be reconfigured to provide monitoring and recovery for the application now running on its recovery cluster by switching the identities of the sites in the application's context (that is, the original primary site becomes the recovery site and the original recovery site becomes the primary site). This type of reconfiguration is possible only in a configuration with two clusters and two sites.
Continentalclusters solutions using HP StorageWorks EVA disk arrays need no tasks related to disk array replication during the reconfiguration. Once the primary site EVA disk array comes back online, HP StorageWorks EVA CA automatically resynchronizes the data, making the recovery site the “source” and the old primary site the “destination”.
Use the cmswitchconcl command (only in a two-cluster configuration) to swap the site identities for all applications or for selected applications (recovery groups), so that the applications can be monitored and recovered from the cluster that was once their primary cluster.