Designing Disaster Tolerant HA Clusters Using Metrocluster and Continentalclusters: > Chapter 3 Building Disaster Tolerant Serviceguard Solutions Using Metrocluster with Continuous Access XP

Completing and Running a Metrocluster Solution with Continuous Access XP


No additional steps are required after cluster and package configuration to complete the setup of the metropolitan cluster. In normal operation, the metropolitan cluster with Continuous Access XP starts like any other cluster, and runs and halts packages in the same way as a standard cluster. However, startup time for packages may be considerably slower because of the need to check disk status on both disk arrays.

Maintaining a Cluster that uses Metrocluster with Continuous Access XP

While the cluster is running, manually changing the state of devices on the XP Series disk array can create unexpected conditions that cause the package to halt or prevent it from starting after a failover. In general, do not perform manual “changes of state” while the package and the cluster are running.

NOTE: Manual changes can be made when they are required to bring the device group into a “protected” state. For example, if a package starts up with data replication suspended, a user can perform a pairresync command to re-establish data replication while the package is still running.

Viewing the Progress of Copy Operations

While a copy is in progress between XP systems (that is, the volumes are in a COPY state), the progress of the copy can be viewed by monitoring the % column in the output of the pairdisplay command:

# pairdisplay -g pkgB -fc -CLI

Group PairVol    L/R Port# TID LU Seq#  LDEV# P/S   Status Fence %  P-LDEV# M
pkgB  pkgD-disk0 L   CL1-C 0   3  35422 463   P-VOL COPY   NEVER 79 460     -
pkgB  pkgD-disk0 R   CL1-F 0   3  35663 3     S-VOL COPY   NEVER -  0       -

This display shows that 79% of a current copy operation has completed. Synchronous fence levels (NEVER and DATA) show 100% in this column when the volumes are in a PAIR state.
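
The progress check can also be scripted. The following is a minimal sketch, assuming the -CLI column layout shown in the example output (P/S in field 9, Status in field 10, % in field 12). The function name copy_progress is hypothetical, and the function reads pairdisplay output from standard input rather than running the command itself.

```shell
#!/bin/sh
# copy_progress: read "pairdisplay -fc -CLI" output on stdin and print the
# group, pair volume, and % value of each P-VOL line that is in COPY state.
# Assumption: fields are laid out as in the sample output
# (field 9 = P/S, field 10 = Status, field 12 = %).
copy_progress() {
    awk 'NR > 1 && $9 == "P-VOL" && $10 == "COPY" { print $1, $2, $12 }'
}

# Sample data copied from the example output in this section.
sample='Group   PairVol L/R   Port# TID LU  Seq# LDEV# P/S Status Fence    %  P-LDEV# M
pkgB pkgD-disk0 L CL1-C 0 3 35422 463 P-VOL COPY NEVER    79    460  -
pkgB pkgD-disk0 R CL1-F 0 3 35663 3 S-VOL COPY NEVER     -      0  -'

printf '%s\n' "$sample" | copy_progress
```

On a live system the same function could be fed directly, for example with pairdisplay -g pkgB -fc -CLI | copy_progress.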

Viewing Side File Size

If you are using asynchronous data replication, you can see the current size of the side file when the volumes are in a PAIR state by using the pairdisplay command. The following output, obtained during normal cluster operation, shows the percentage of the side file that is full:

# pairdisplay -g pkgB -fc -CLI

Group PairVol    L/R Port# TID LU Seq#  LDEV# P/S   Status Fence %  P-LDEV# M
pkgB  pkgD-disk0 L   CL1-C 0   3  35422 463   P-VOL PAIR   ASYNC 35 3       -
pkgB  pkgD-disk0 R   CL1-F 0   3  35663 3     S-VOL PAIR   ASYNC 0  463     -

This output shows that 35% of the side file is full.

When volumes are in a COPY state, the % column shows the progress of the copying between the XP frames, until it reaches 100%, at which point the display reverts to showing the side file usage in the PAIR state.
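
Because the meaning of the % column depends on the pair status, a script that reads it has to check the Status and Fence columns first. The following sketch labels the value for the local (L) line of each pair. It assumes the -CLI field positions shown in the examples, and the function name describe_percent is hypothetical.

```shell
#!/bin/sh
# describe_percent: read "pairdisplay -fc -CLI" output on stdin and label the
# % value on each local (L) line according to the pair status.
# Assumption: field 3 = L/R, field 10 = Status, field 11 = Fence, field 12 = %.
describe_percent() {
    awk 'NR > 1 && $3 == "L" {
        if ($10 == "COPY")
            what = "copy progress"
        else if ($10 == "PAIR" && $11 == "ASYNC")
            what = "side file usage"
        else
            what = "value"
        print $2 ": " what " " $12 "%"
    }'
}

# Sample data copied from the side file example in this section.
sample='Group   PairVol L/R   Port# TID LU  Seq# LDEV# P/S Status Fence    %  P-LDEV# M
pkgB pkgD-disk0 L CL1-C 0 3 35422 463 P-VOL PAIR ASYNC    35       3 -
pkgB pkgD-disk0 R CL1-F 0 3 35663 3 S-VOL PAIR ASYNC     0     463 -'

printf '%s\n' "$sample" | describe_percent
```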

Viewing the Continuous Access Journal Status

The following two sections describe using the pairdisplay and raidvchkscan commands for viewing the Continuous Access Journal Status.

Viewing the Pair and Journal Group Information - Raid Manager using the “pairdisplay” Command

The Raid Manager pairdisplay command supports the “-fe” option, which displays the Journal Group ID (and other data) of a device group pair. The Journal Group ID shows ‘-’ if the device pair is not in Continuous Access Journal mode; otherwise, it shows a number.

The pairdisplay -fe command is primarily used to view the following: the Continuous Access Journal device group consistency set, the Journal group ID (JID), and the Continuous Access link status (AP).

The following is an example of the pairdisplay command with the “-fe” option:

# pairdisplay -g oradb -fe
Group Seq#, LDEV# P/S,Status, Fence, %, P-LDEV# M CTG JID AP EM E-Seq# E-LDEV#
oradb 30053 64 P-VOL PAIR Never, 75 C8 - 1 0 2
oradb 30054 C8 S-VOL PAIR Never, 64 - - 1 0

Viewing the Journal Volumes Information - Raid Manager using the “raidvchkscan” Command

The raidvchkscan command supports the option (-v jnl [unit#]) in order to find the journal volume lists, and displays information for the journal volumes.

raidvchkscan { -h | -q | -z | -v jnl [unit#] | [ -s Seq# ] | [ -f[x] ] }

An example of the raidvchkscan command is as follows:

# raidvchkscan -v jnl 0
JID MU CTG JNLS AP U(%) Q-Marker Q-CNT D-SZ(BLK) Seq#  Num LDEV#
001 0  1   PJNN 4  21   43216fde 30    512345    62500 2   265
002 1  2   PJNF 4  95   3459fd43 52000 512345    62500 3   270
003 0  3   PJSN 4  0    -        -     512345    62500 1   275
004 0  4   PJSF 4  45   1234f432 78    512345    62500 1   276
005 0  5   PJSE 0  0    -        -     512345    62500 1   277
006 -  -   SMPL -  -    -        -     512345    62500 1   278
007 0  6   SMPL 4  5    345678ef 66    512345    62500 1   278

Figure 3-5 “Q-Marker and Q-CNT” illustrates the Q-Marker and Q-CNT values. The following terms describe the columns in the command output, including those shown in the figure.

  • JID: Displays the journal group ID.

  • MU: Displays the mirror descriptions on the journal group.

  • CTG: Displays the consistency group ID.

  • JNLS: Displays one of the following statuses for the journal group:

    • SMPL: The journal volume is not in pair mode or is being deleted.

    • P(S)JNN: “P(S)vol Journal Normal”

    • P(S)JSN: “P(S)vol Journal Suspend Normal”

    • P(S)JNF: “P(S)vol Journal Normal Full”

    • P(S)JSF: “P(S)vol Journal Suspend Full”

    • P(S)JSE: “P(S)vol Journal Suspend Error”, including link failure

  • AP: Displays the number of active paths on the initiator port in the Continuous Access links.

  • Q-Marker: Displays the sequence number in the journal group.

    • For the P-JNL, Q-Marker shows the latest sequence number on the P-JNL volume.

    • For the S-JNL, Q-Marker shows the latest sequence number written to the cache.

  • Q-CNT: Displays the number of Q-Markers remaining in the journal group.

Figure 3-5 Q-Marker and Q-CNT

  • U(%): Displays the usage rate of the journal data.

  • D-SZ: Displays the capacity for the journal data on the journal group.

  • Seq#: Displays the serial number of the XP12000.

  • Num: Displays the number of LDEV (journal volumes) configured for the journal group.

  • LDEV#: Displays the first LDEV number of journal volumes.
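
The JNLS and U(%) columns described above lend themselves to a simple health check. The following is a minimal sketch, assuming the column order shown in the example output (JID in field 1, JNLS in field 4, U(%) in field 6). The function name journal_alerts and the default threshold of 80 percent are hypothetical choices for illustration, not values taken from the product.

```shell
#!/bin/sh
# journal_alerts [LIMIT]: read "raidvchkscan -v jnl" output on stdin and report
# journal groups that are full, suspended on error, or at/above LIMIT percent.
# Assumption: field 1 = JID, field 4 = JNLS, field 6 = U(%).
journal_alerts() {
    awk -v limit="${1:-80}" 'NR > 1 {
        jid = $1; jnls = $4; used = $6
        if (jnls ~ /^[PS]J[NS]F$/)
            print "JID " jid ": journal full (" jnls ")"
        else if (jnls ~ /^[PS]JSE$/)
            print "JID " jid ": suspended on error (" jnls ")"
        else if (used != "-" && used + 0 >= limit)
            print "JID " jid ": usage " used "% at or above " limit "%"
    }'
}

# Rows copied from the example output in this section.
sample='JID MU CTG JNLS AP U(%) Q-Marker Q-CNT D-SZ(BLK) Seq# Num LDEV#
001 0 1 PJNN 4 21 43216fde 30 512345 62500 2 265
002 1 2 PJNF 4 95 3459fd43 52000 512345 62500 3 270
003 0 3 PJSN 4 0 - - 512345 62500 1 275
004 0 4 PJSF 4 45 1234f432 78 512345 62500 1 276
005 0 5 PJSE 0 0 - - 512345 62500 1 277
006 - - SMPL - - - - 512345 62500 1 278
007 0 6 SMPL 4 5 345678ef 66 512345 62500 1 278'

printf '%s\n' "$sample" | journal_alerts
```

With the default threshold, the sample rows produce alerts for journal groups 002, 004, and 005 only.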

Normal Maintenance

There might be situations when the package has to be taken down for maintenance purposes without having the package move to another node. The following procedure is recommended for normal maintenance of the Metrocluster/Continuous Access:

  1. Stop the package with the appropriate Serviceguard command.

    # cmhaltpkg pkgname

  2. Split links for the package.

    # pairsplit -g <package device group name> -rw

  3. Distribute the Metrocluster with Continuous Access XP configuration changes.

    # cmapplyconf -P pkgname.config

  4. Start the package with the appropriate Serviceguard command:

    # cmmodpkg -e pkgname
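
The four steps can be collected in a small wrapper so that they always run in the same order and stop at the first failure. The following is a sketch only: the function names run_step and maintain_package and the DRY_RUN switch are hypothetical, and the commands and options are exactly those listed in the procedure. By default the script only prints what it would run.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) only prints each command; DRY_RUN=0 executes it.
DRY_RUN=${DRY_RUN:-1}

# run_step COMMAND [ARGS...]: print or execute one step.
run_step() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# maintain_package PKGNAME DEVICE_GROUP
# Runs the maintenance steps in order and stops at the first failing step.
maintain_package() {
    pkg=$1
    dg=$2
    run_step cmhaltpkg "$pkg" &&
    run_step pairsplit -g "$dg" -rw &&
    run_step cmapplyconf -P "$pkg.config" &&
    run_step cmmodpkg -e "$pkg"
}

maintain_package pkgB pkgB
```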

Planned maintenance is treated the same as a failure by the cluster. If you take a node down for maintenance, package failover and quorum calculation are based on the remaining nodes. Make sure that nodes are taken down evenly at each site, and that enough nodes remain online to form a quorum if a failure occurs. See “Example Failover Scenarios with Two Arbitrators”.

Resynchronizing

After certain failures, data is no longer remotely protected. In order to restore disaster tolerant data protection after repairing or recovering from the failure, you must manually run the command pairresync. This command must successfully complete for disaster-tolerant data protection to be restored.

Following is a partial list of failures that require running pairresync to restore disaster-tolerant data protection:

  • Failure of all Continuous Access links without restart of the application

  • Failure of all Continuous Access links with Fence Level “DATA” with restart of the application on a primary host

  • Failure of the entire secondary Data Center for a given application package

  • Failure of the secondary XP Series disk array for a given application package while the application is running on a primary host

Following is a partial list of failures that require full resynchronization to restore disaster-tolerant data protection. Full resynchronization is automatically initiated for these failures by moving the application package back to its primary host after repairing the failure:

  • Failure of the entire primary data center for a given application package

  • Failure of all of the primary hosts for a given application package

  • Failure of the primary XP Series disk array for a given application package

  • Failure of all Continuous Access links with restart of the application on a secondary host

Pairs must be manually recreated if both the primary and secondary XP Series disk arrays are in SMPL (simplex) state. Periodically review the files syslog.log and /etc/cmcluster/pkgname/pkgname.log for messages, warnings, and recommended actions, and always review them after system, data center, or application failures.

Full resynchronization must be manually initiated after repairing the following failures:

  • Failure of the secondary XP Series disk array for a given application package followed by application startup on a primary host

  • Failure of all Continuous Access links with Fence Level NEVER or ASYNC with restart of the application on a primary host

Using the pairresync Command

The pairresync command can be used with special options after a failover in which the recovery site has started the application and has processed transaction data on the disk at the recovery site, while the disks on the primary site remain intact. After the Continuous Access link is fixed, use the pairresync command in one of the following two ways, depending on which site you are on:

  • pairresync -swapp: from the primary site.

  • pairresync -swaps: from the failover site.

These options take advantage of the fact that the recovery site maintains a bitmap of the modified data sectors on the recovery array. Either version of the command swaps the personalities of the volumes, with the PVOL becoming the SVOL and the SVOL becoming the PVOL. With the personalities swapped, any data that has been written to the volume on the failover site (now the PVOL) is copied back to the volume on the primary site (now the SVOL). During this time the package continues running on the failover site. After resynchronization is complete, you can halt the package on the failover site and restart it on the primary site. Metrocluster will then swap the personalities between the PVOL and the SVOL, returning PVOL status to the primary site.

NOTE: The preceding steps are automated provided the default value of 1 is being used for the auto variable AUTO_PSUEPSUS. Once the Continuous Access link failure has been fixed, the user only needs to halt the package on the recovery cluster and restart on the primary cluster. However, if you want to reduce the amount of application downtime, you should manually invoke pairresync before failback.
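
The choice between the two swap options described in this section depends only on the site where the command is issued. The following sketch makes that choice explicit. The function name resync_command and the site labels "primary" and "failover" are hypothetical, and the function prints the command line rather than running it.

```shell
#!/bin/sh
# resync_command SITE DEVICE_GROUP
# SITE is "primary" or "failover" (labels chosen for this sketch).
# Prints the pairresync invocation appropriate for that site.
resync_command() {
    case $1 in
        primary)  echo "pairresync -g $2 -swapp" ;;
        failover) echo "pairresync -g $2 -swaps" ;;
        *)        echo "unknown site: $1" >&2; return 1 ;;
    esac
}

resync_command failover pkgB
```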

Failback

After resynchronization is complete, you can halt the package on the failover site, and restart it on the primary site. Metrocluster will then swap the personalities between the PVOL and the SVOL, returning PVOL status to the primary site.

Timing Considerations

In a journal group, many journal volumes can be configured to hold a significant amount of journal data (host-write data). As a result, the package startup time may increase significantly when a Metrocluster Continuous Access package fails over. Package startup is delayed in these situations:

  1. When recovering from broken pair affinity. On failover, the SVOL pulls all the journal data from the PVOL site. The time needed to complete the data transfer to the SVOL depends on the amount of outstanding journal data in the PVOL and the bandwidth of the Continuous Access links.

  2. When host I/O is faster than Continuous Access data replication. The outstanding data that has not been replicated to the SVOL accumulates in the journal volumes. When the package fails over to the SVOL site, the SVOL pulls all the journal data from the PVOL site. The time needed to complete the data transfer to the SVOL depends on the bandwidth of the Continuous Access links and the amount of outstanding data in the PVOL journal volume.
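
Both cases reduce to the same arithmetic: the delay is roughly the outstanding journal data divided by the link throughput. The following is a rough sketch of that estimate. It assumes that the outstanding data can be approximated as U(%) multiplied by D-SZ(BLK) from the raidvchkscan output, that a block is 512 bytes, and that the link sustains a fixed throughput; all three are assumptions made for illustration, and the function name journal_drain_seconds is hypothetical.

```shell
#!/bin/sh
# journal_drain_seconds USED_PCT DSZ_BLOCKS BLOCK_BYTES LINK_BYTES_PER_SEC
# Prints the estimated whole seconds (rounded up) needed to transfer the
# outstanding journal data over the Continuous Access links.
journal_drain_seconds() {
    awk -v pct="$1" -v blocks="$2" -v blk="$3" -v rate="$4" 'BEGIN {
        bytes = blocks * blk * pct / 100   # outstanding journal data in bytes
        secs = bytes / rate
        whole = int(secs)
        if (whole < secs) whole++          # round up to a whole second
        print whole
    }'
}

# Hypothetical inputs: 95% of a 512345-block journal, 512-byte blocks,
# and a sustained link throughput of 10,000,000 bytes per second.
journal_drain_seconds 95 512345 512 10000000
```

The estimate ignores protocol overhead and any new host writes that arrive during the transfer, so it is best read as a lower bound.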

Data maintenance with the failure of a Metrocluster Continuous Access XP Failover

The following sections, “Swap Takeover Failure (Asynchronous/Journal mode)” and “Takeover Timeout (for Continuous Access Journal mode)”, describe data maintenance upon failure of a Metrocluster Continuous Access XP failover.

Swap Takeover Failure (Asynchronous/Journal mode)

When a device group pair state is SVOL-PAIR at the local site and PVOL-PAIR at the remote site, Metrocluster Continuous Access performs a swap takeover. The swap takeover fails if there is an internal (unseen) error, such as a cache or shared memory failure, in the device group pair. In this case, if AUTO_NONCURDATA is set to 0, the package is not started and the takeover command changes the SVOL state to SVOL-PSUE (SSWS). The PVOL site either remains in PVOL-PAIR or is changed to PVOL-PSUE.

The SVOL-PSUE (SSWS) state means that the SVOL is read/write enabled and the data is usable, but not as current as the PVOL.

In this case, either use FORCEFLAG to start up the package on the SVOL site, or fix the problem and resume data replication with the following procedure:

  1. Split the device group pair completely (pairsplit -g <dg> -S).

  2. Re-create a pair from the original PVOL as source (use the paircreate command).

  3. Start up the package on either the PVOL site or the SVOL site.

Takeover Timeout (for Continuous Access Journal mode)

A takeover timeout occurs when a package fails over to the secondary site (SVOL) and Metrocluster Continuous Access issues a takeover command (either swap or SVOL takeover) on the SVOL. If the journal group pair is flushing journal data from the PVOL to the SVOL when the takeover timeout occurs, the package does not start and the following conditions apply:

  1. The device group pair state remains in PVOL-PAIR/SVOL-PAIR.

  2. The journal data is continuously transferring to the SVOL.

In this case, wait for the journal data flush to complete and for each site to reach one of the following states:

  • Primary site: PVOL-PAIR or PVOL-PSUS(E)

  • Secondary site: SVOL-PSUS(SSWS) or SVOL-PSUE(SSWS)

At this point, either (1) use FORCEFLAG to start up the package on the SVOL site, or (2) fix the problem (if any of the Continuous Access links failed) and resume data replication with the following procedure:

  1. Split the device group pair completely (pairsplit -g <dg> -S).

  2. Re-create a pair from the original PVOL as source (use the paircreate command).

  3. Start up the package on the PVOL site (or the SVOL site).

PVOL-PAIR with SVOL-PSUS(SSWS) State (for Continuous Access Journal Mode)

PVOL-PAIR with SVOL-PSUS(SSWS) is an intermediate state. The following is one scenario that leads to this state:

  • At T1, the device pair is in PVOL-PAIR/SVOL-PAIR and the AP value is 0 at the SVOL site.

  • At T2, a failover occurs; the package fails over from the PVOL site to the SVOL site. Metrocluster Continuous Access issues an SVOL takeover and the state becomes SVOL-PSUS (SSWS) and PVOL-PAIR.

  • At T3, all Continuous Access links have been recovered. The state stays in SVOL-PSUS (SSWS) and PVOL-PAIR. The PVOL remains in PAIR state for a relatively short time.

PVOL-PAIR/SVOL-PSUS (SSWS) is an invalid state for XP asynchronous replication (both Continuous Access/Asynchronous and Continuous Access Journal). A pairresync or takeover command issued in this state fails. You must wait for the PVOL to change to PSUE.

XP/Continuous Access Device Group Monitor

In the Metrocluster/Continuous Access environment, where the device group state is not actively monitored, it may not be apparent when the application data is not remotely protected for an extended period of time. Under these circumstances, the XP/Continuous Access device group monitor provides the capability to monitor the status of the XP/Continuous Access device group used in a package. The XP/Continuous Access device group monitor, based on a pre-configured environment variable, also provides the ability to perform automatic resynchronization of the XP/Continuous Access device group upon link recovery.

NOTE: If the monitor is configured to automatically resynchronize the data from PVOL to SVOL upon link recovery, a Business Copy (BC) volume of the SVOL should be configured as another mirror.

If a rolling disaster occurs and the data in the SVOL becomes corrupt because of an incomplete resynchronization, the data in the BC volume can be restored to the SVOL. This yields non-current but usable data from the BC volume.

The monitor, as a package service, periodically checks the status of the XP/Continuous Access device group that is configured for the package, and sends notification to the user via email, syslog, and console if there is a change in the status of the package’s device group.

XP/Continuous Access Device Group Monitor Operation Overview

The XP/Continuous Access device group monitor runs as a package service. The user can configure the monitor's setting through the package's environment file. Once the package has started the XP/Continuous Access device group monitor, the monitor will periodically check the status of the XP/Continuous Access device group. If there is a change in the status or the monitor is configured to notify after an interval of no status change, the monitor will send a notification that states the reason for the notification, a timestamp, and the status of the XP/Continuous Access device group.

Configuring the Monitor

Use the following steps to configure a monitor for a package’s device group:

  • Configure the monitor’s variables in the package environment file.

  • Configure the monitor as a service of the package.

Configure the Monitor’s Variables in the Package Environment File.

Edit the following variables of the monitor’s section in the environment file <pkgname>_xpca.env as follows:

NOTE: See Appendix A for an explanation of these variables.
  • Uncomment the MON_POLL_INTERVAL variable and set it to the desired value in minutes. If this variable is not set, it will default to a value of 10 minutes.

  • Uncomment the MON_NOTIFICATION_FREQUENCY variable and set it to the desired value. This value is used to control the frequency of notification message when the state of the device group remains the same after the first check of the device group's state. If the value is zero, the monitor will only send notification when the state of the device group has changed. If the variable is not set, the default will be 0.

  • If you want to receive notification messages over email, uncomment the MON_NOTIFICATION_EMAIL variable and set it to a fully qualified email address. Multiple email addresses can be configured using comma as separator between the addresses.

  • If you want notification messages to be logged in the syslog file, uncomment the MON_NOTIFICATION_SYSLOG variable and set it to 1.

  • If you want notification messages to be logged on the system's console, uncomment the MON_NOTIFICATION_CONSOLE variable and set it to 1.

  • If you want an automatic resynchronization upon link recovery, uncomment the AUTO_RESYNC variable and set it to either 0, 1 or 2.

    If AUTO_RESYNC is set to 0 (DEFAULT), the monitor will not try to do the resynchronization from PVOL to SVOL. This setting will only send notifications.

    If AUTO_RESYNC is set to 1, the monitor will split the remote BC if one is configured from the mirror group before trying to do the resynchronization from PVOL to SVOL.

    If AUTO_RESYNC is set to 2, the monitor will only do the resynchronization from PVOL to SVOL when it finds the MON_RESYNC file in the package directory on the node that the package is running. The monitor will not manage the remote BC prior to and after the resynchronization. This setting is used if the user wants to manage the BC themselves.

    To enable the Continuous Access resynchronization for AUTO_RESYNC=2, it is necessary to create a file using the HP-UX command touch. For example:

    # touch /etc/cmcluster/packageA/MON_RESYNC

    (where /etc/cmcluster/packageA is the package directory)

    After the monitor detects the MON_RESYNC file, it is automatically removed.

    The following is an example of the XP/Continuous Access device group monitor definition section in the environment file (<packagename>_xpca.env), where the monitor will perform the following:

  • poll every 15 minutes.

  • send a notification on every third polling, if the state of the device group remains the same.

  • send the notifications to sysadmin1@hp.com and sysadmin2@hp.com.

  • log notifications to system log file, syslog.

  • display notifications to system console.

  • perform automatic resynchronization with BC management when detecting the device group local state change to PVOL-PSUE or PVOL-PDUB.

    MON_POLL_INTERVAL=15
    MON_NOTIFICATION_FREQUENCY=3
    MON_NOTIFICATION_EMAIL=sysadmin1@hp.com,sysadmin2@hp.com
    MON_NOTIFICATION_SYSLOG=1
    MON_NOTIFICATION_CONSOLE=1
    AUTO_RESYNC=1
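
The interaction between state changes and MON_NOTIFICATION_FREQUENCY can be illustrated with a short simulation. This sketch only models the behavior described in this section; it is not the monitor's actual implementation. The function name should_notify and the counter of unchanged polls are hypothetical.

```shell
#!/bin/sh
# should_notify PREV_STATE CUR_STATE UNCHANGED_POLLS FREQUENCY
# Returns 0 (notify) when the state changed, or when FREQUENCY is non-zero
# and the state has stayed the same for a multiple of FREQUENCY polls.
# UNCHANGED_POLLS is a hypothetical counter kept by the caller, starting at 1
# on the first poll that sees the same state again.
should_notify() {
    if [ "$1" != "$2" ]; then
        return 0
    fi
    [ "$4" -gt 0 ] && [ $(( $3 % $4 )) -eq 0 ]
}

# Walk through six polls with MON_NOTIFICATION_FREQUENCY=3.
freq=3
prev=""
same=0
for state in PVOL-PAIR PVOL-PAIR PVOL-PAIR PVOL-PAIR PVOL-PSUE PVOL-PSUE; do
    if [ "$state" = "$prev" ]; then same=$((same + 1)); else same=0; fi
    if should_notify "$prev" "$state" "$same" "$freq"; then
        echo "notify: $state"
    else
        echo "quiet:  $state"
    fi
    prev=$state
done
```

With a frequency of 0, the function returns success only when the state differs from the previous poll, which matches the default described for MON_NOTIFICATION_FREQUENCY.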

Configure XP/Continuous Access Device Group Monitor as a Service of the Package

Add the monitor as a service in the package's configuration file and control script file as follows:

  • In the package's configuration file, add the following lines:

    SERVICE_NAME pkgXdevgrpmon.srv
    SERVICE_FAIL_FAST_ENABLED NO
    SERVICE_HALT_TIMEOUT 5
NOTE: The SERVICE_HALT_TIMEOUT value of 5 is a recommended value. If the value is set to lower than 5 seconds as the service halt timeout, then it may not allow enough time for the monitor to properly clean itself up.
  • In the package's control script file, add the following lines on the SERVICE NAMES AND COMMANDS section:

    SERVICE_NAME[0]="pkgXdevgrpmon.srv"
    SERVICE_CMD[0]="/usr/sbin/DRMonitorXPCADevGrp <full path name of the package environment file>"
    SERVICE_RESTART[0]="-r 10"
CAUTION: If the Continuous Access links are still down while the monitor is trying to do the resynchronization and another failure occurs that causes a remote failover to the secondary site, the SVOL’s BC volumes will remain split from its mirror group.

This will only occur if the monitor is configured to perform automatic resynchronization using AUTO_RESYNC=1.

Troubleshooting the XP/Continuous Access Device Group Monitor

The following is a guideline to help identify the cause of potential problems with the XP/Continuous Access device group monitor.

  • Problems with email notifications:

    XP/Continuous Access device group monitor uses SMTP to send out email notifications. All email notification problems are logged in the package log file.

    If a warning message in the package log file indicates that the monitor is unable to determine the SMTP port, the SMTP port is not defined in the /etc/services file. In that case the monitor assumes that the SMTP port is 25. If a different port number is defined, the monitor must be restarted for it to connect to the correct port.

    If an error message in the package control log file states that the SMTP server cannot be found, no mail server (such as sendmail) is configured on the local node. A mail server must be configured and running on the local node for email notification. Once the mail server is running on the local node, the monitor starts sending email notifications.

  • Problems with Unknown Continuous Access Device Status:

    The XP/Continuous Access device group monitor relies on the Raid Manager instance to get the Continuous Access device group state. If the local Raid Manager instance fails, the monitor cannot determine the state of the Continuous Access device group, and it sends a notification to all configured destinations stating that the state has changed to an UNKNOWN status. Because the monitor does not try to restart the Raid Manager instance, the user must restart it before the monitor can determine the status of the Continuous Access device group again. Make sure to start the Raid Manager instance with the same instance number that is defined in the package’s environment file.
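
For the email notification problem described in this section, it can help to check which SMTP port the services file actually defines. The following is a minimal sketch; the function name smtp_port is hypothetical, and the fallback to 25 mirrors the value the text says the monitor assumes.

```shell
#!/bin/sh
# smtp_port [SERVICES_FILE]
# Prints the TCP port registered for "smtp" in the services file, or 25
# (the value the text says the monitor assumes) when no entry is found.
smtp_port() {
    port=$(awk '$1 == "smtp" && $2 ~ /\/tcp$/ { sub(/\/tcp$/, "", $2); print $2; exit }' \
        "${1:-/etc/services}" 2>/dev/null || true)
    echo "${port:-25}"
}

smtp_port
```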

© Hewlett-Packard Development Company, L.P.