Designing Disaster Tolerant High Availability Clusters: Chapter 5, Building a Continental Cluster

Designing a Disaster Tolerant Architecture for use with ContinentalClusters


The ContinentalClusters product operates on a configuration of two MC/ServiceGuard clusters: a cluster that normally runs a package, and a Recovery Cluster that can take the package over. The key elements providing disaster tolerance in a continental cluster are:

  • Mutual Recovery

  • MC/ServiceGuard clusters

  • Data replication

  • Highly available WAN networking

  • Data center processes and procedures coordinated between the two cluster sites

You have a great deal of latitude in selecting these elements for your configuration. It is recommended that you record your choices on worksheets which can be reviewed and updated periodically.

Mutual Recovery

For mutual recovery, any cluster in a continental cluster may contain both primary and recovery packages for any recovery group. For example, recovery groups may be defined so that both cluster A and cluster B contain recovery packages. In this case, cmrecovercl could be run on cluster B to recover packages from cluster A, or on cluster A to recover packages from cluster B.
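The mutual recovery arrangement can be modeled as a small data structure. The following is a minimal sketch, not the ContinentalClusters configuration format: the RecoveryGroup class, the recovery_plan function, and all cluster and package names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryGroup:
    """One recovery group: a primary package and the package that recovers it."""
    name: str
    primary_cluster: str
    primary_package: str
    recovery_cluster: str
    recovery_package: str


# Hypothetical mutual recovery layout: each cluster is the primary site for
# one group and the recovery site for the other.
GROUPS = [
    RecoveryGroup("sales_rg", "clusterA", "sales_pkg", "clusterB", "sales_bak"),
    RecoveryGroup("hr_rg", "clusterB", "hr_pkg", "clusterA", "hr_bak"),
]


def recovery_plan(failed_cluster, groups):
    """Map each cluster where cmrecovercl would be run to the recovery
    packages it would start after failed_cluster is lost."""
    plan = {}
    for group in groups:
        if group.primary_cluster == failed_cluster:
            plan.setdefault(group.recovery_cluster, []).append(group.recovery_package)
    return plan


print(recovery_plan("clusterA", GROUPS))
print(recovery_plan("clusterB", GROUPS))
```

With this layout, losing cluster A means recovery is run on cluster B, and losing cluster B means recovery is run on cluster A, which is the symmetry that mutual recovery describes.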

MC/ServiceGuard Clusters

Each MC/ServiceGuard cluster in a continental cluster provides high availability for an application at the local level at that particular site. For optimal performance and to assure adequate capacity on the recovery cluster, it is best to have similar hardware on both clusters. For example, if one cluster contains two V class HP 9000 systems with 1 GB of memory each, it is not a good idea to have a low-end K series HP 9000 with 128 MB of memory in the other cluster. Each cluster may have as many nodes as are permitted in an ordinary MC/ServiceGuard cluster, and each may be running packages that are not configured to fail over between clusters.

NOTE: Remember that when cluster A takes over for cluster B, it must run cluster B's packages as well as any packages that it was already running on its own, unless you choose to stop those packages.
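The capacity concern in this note can be checked with simple arithmetic while filling in the planning worksheets. The sketch below is illustrative only: it uses memory as the sole measure of capacity, sums memory across the whole cluster rather than placing packages on individual nodes, and every package name and memory figure is a hypothetical assumption.

```python
def takeover_headroom_mb(node_memory_mb, local_packages_mb, incoming_packages_mb):
    """Return spare memory (MB) on a cluster after it also starts the other
    cluster's packages. A negative result means the cluster is too small."""
    capacity = sum(node_memory_mb)
    demand = sum(local_packages_mb.values()) + sum(incoming_packages_mb.values())
    return capacity - demand


# Hypothetical figures: cluster A has two nodes with 1 GB (1024 MB) each.
cluster_a_nodes = [1024, 1024]
cluster_a_packages = {"hr_pkg": 512}
cluster_b_packages = {"sales_pkg": 768, "orders_pkg": 512}

# Cluster A keeps its own packages and takes over cluster B's packages.
print(takeover_headroom_mb(cluster_a_nodes, cluster_a_packages, cluster_b_packages))

# A single 128 MB node could not absorb the same incoming load.
print(takeover_headroom_mb([128], {}, cluster_b_packages))
```

If the result is negative, the options are the ones the note implies: add capacity to the recovery cluster, or plan to stop some of its local packages before takeover.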

Data Replication

Data replication between the MC/ServiceGuard clusters extends the scope of high availability to the level of the continental cluster. You must select a technology for data replication between the two clusters. There are many possible choices, including:

  • Logical replication of databases

  • Logical replication of filesystems

  • Physical replication of data volumes via software

  • Physical replication of disk units via hardware

Table 5-3 “Data Replication and ContinentalClusters” briefly describes how each data replication method affects a continental cluster environment. A detailed description of data replication can be found in Chapter 1, in the section titled "Disaster Tolerant Architecture Guidelines." Specific guidelines for configuring the HP SureStore E Disk Array XP Series and the EMC Symmetrix Disk Array for physical data replication in a continental cluster are provided in Chapters 6 and 7. White papers describing specific implementations are also available from http://docs.hp.com/hpux/ha.

Table 5-3 Data Replication and ContinentalClusters

Replication Type: Logical Database Replication
  How it Works: Transactions from the primary application are applied from logs to a copy of the application running on the recovery site. (This is an example only; there are other methods.)
  ContinentalClusters Implication: Requirements on CPU and I/O may limit or prevent the Recovery Cluster from running additional applications.

Replication Type: Logical Filesystem Replication
  How it Works: Writes to the filesystem on the primary cluster are duplicated periodically on the recovery cluster.
  ContinentalClusters Implication: CPU issues are the same as for Logical Database Replication. The software may have to be managed as a separate MC/ServiceGuard package.

Replication Type: Physical Replication of Data Volumes via Software
  How it Works: Disk mirroring via LVM software. Only limited distances are possible (up to 10 km), since mirroring is done on disk links (SCSI or FibreChannel).
  ContinentalClusters Implication: Requirements on CPU are less than for logical replication, but there is still some CPU use. Distance limits may make this type of replication inappropriate for ContinentalClusters.

Replication Type: Physical Replication of Disk Units via Hardware
  How it Works: Replication of the LUNs within a disk array through dedicated hardware links such as EMC SRDF or Continuous Access XP.
  ContinentalClusters Implication: Limited CPU requirements, but the requirement of synchronous data replication slows replication, and may impair application performance. Increased network speed and bandwidth can remedy this.

Logical data replication may require the use of packages to handle software processes that copy data from one cluster to another or that apply transactions from logs that are copied from one cluster to another. Some methods of logical data replication may use a logical replication data sender package; others may use a logical replication data receiver package; some may use both. Logical replication data sender and receiver packages are configured as part of the data recovery group, as shown below under "Creating the ContinentalClusters Configuration."

Physical Data Replication Using Special Environment Files

Physical data replication generally does not require the use of separate sender or receiver packages, but it does require specialized logic in the package control scripts to handle the transfer of control from the storage units of one cluster to the storage units at the other cluster. Packages that use physical data replication with the HP SureStore E Disk Array XP Series and Continuous Access XP should have an environment file created from the template /opt/cmcluster/toolkit/SGCA/xpca.env. Packages that use physical data replication with EMC Symmetrix and the SRDF facility should have an environment file created from the template /opt/cmcluster/toolkit/SGSRDF/srdf.env. Both of these templates are included with the ContinentalClusters product.

Details on configuring the special ContinentalClusters control scripts are in Chapters 6 and 7. Some additional notes are provided below.

Highly Available Wide Area Networking

Disaster tolerant networking for ContinentalClusters is directly tied to the data replication method. In addition to the reliability of the redundant lines connecting the remote nodes, you also need to consider what bandwidth you need to support the data replication method you have chosen. A continental cluster that handles a high number of write transactions per minute will not only require a highly available network, but also one with a large amount of bandwidth. Details on highly available networking can be found in Chapter 1, in the section titled "Disaster Tolerant Architecture Guidelines." White papers describing specific implementations are also available from http://docs.hp.com.
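The relationship between write rate and bandwidth can be estimated before choosing a WAN link. The following sketch is a rough planning aid under stated assumptions: the function names, the 20 percent protocol overhead factor, and the example workload are hypothetical, and a real estimate should be based on measured peak write rates for the chosen replication method.

```python
def required_mbps(writes_per_minute, avg_write_bytes, overhead=1.2):
    """Estimate sustained replication bandwidth in megabits per second.

    overhead is an assumed multiplier for protocol and replication framing;
    1.2 (20 percent) is an illustrative guess, not a measured value.
    """
    bits_per_second = writes_per_minute * avg_write_bytes * 8 / 60
    return bits_per_second * overhead / 1_000_000


def link_is_sufficient(link_mbps, writes_per_minute, avg_write_bytes, overhead=1.2):
    """Return True if the link can carry the estimated replication traffic."""
    return required_mbps(writes_per_minute, avg_write_bytes, overhead) <= link_mbps


# Hypothetical workload: 12,000 write transactions per minute, 8 KB each.
need = required_mbps(12_000, 8192)
print(f"Estimated replication bandwidth: {need:.2f} Mbit/s")
print("10 Mbit/s link sufficient:", link_is_sufficient(10, 12_000, 8192))
print("45 Mbit/s link sufficient:", link_is_sufficient(45, 12_000, 8192))
```

The estimate covers sustained throughput only. It says nothing about latency, which matters separately when replication is synchronous, and nothing about the redundancy of the lines themselves.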

Data Center Processes

ContinentalClusters provides the cmrecovercl command, which fails over all applications on the primary cluster that are protected by ContinentalClusters. However, application failover also requires well-defined processes for the two sites. These processes and procedures should be written down and made available at both sites.

Some considerations for site management are as follows:

  • Who notifies whom for the various events: configuration changes, alerts, alarms?

  • What communication methods should be used? Email? Phone? Beeper? Multiple methods?

  • Who has authority to perform what sort of configuration modifications? Can the administrator at one site log in to the nodes on the remote site? If so, what permissions should be set?

  • How often is a practice failover done?

  • Is there a documented test plan?

  • What is the process for tracking changes made to the primary cluster?

ContinentalClusters Worksheets

Planning is essential to creating a robust continental cluster environment. It is recommended that you record the details of your configuration on planning worksheets. These worksheets can be filled in partially before configuration begins, and then completed as you build the continental cluster. Both the primary cluster site and the Recovery Cluster site should have a copy of these worksheets to help coordinate initial configuration and subsequent changes. Complete the worksheets in the following sections for each pair of clusters that will be monitored by the ContinentalClusters monitor.

Data Center Worksheet

The following worksheet will help you describe your specific data center configuration. Fill out the worksheet and keep it for future reference.

=======================================================================

Continental Cluster Name: _____________________________________________

=======================================================================

Primary Data Center Information:

Primary Cluster Name: ____________________________________________

Data Center Name and Location: ___________________________________

Main Contact: ____________________________________________________

Phone Number: ____________________________________________________

Beeper: __________________________________________________________

Email Address: ___________________________________________________

Node Names: ______________________________________________________

Monitor Package Name: __ccmonpkg__________________________________

Monitor Interval: __60 seconds____________________________________

=======================================================================

Recovery Data Center Information:

Recovery Cluster Name: ___________________________________________

Data Center Name and Location: ___________________________________

Main Contact: ____________________________________________________

Phone Number: ____________________________________________________

Beeper: __________________________________________________________

Email Address: ___________________________________________________

Node Names: ______________________________________________________

Monitor Package Name: __ccmonpkg__________________________________

Monitor Interval: __60 seconds____________________________________

Recovery Group Worksheet

The following worksheet will help you organize and record your specific recovery groups. Fill out the worksheet and keep it for future reference.

=======================================================================

Continental Cluster Name: _____________________________________________

=======================================================================

Recovery Group Data:

Recovery Group Name: _____________________________________________

Primary Cluster/Package Name:_____________________________________

Data Sender Cluster/Package Name:_________________________________

Recovery Cluster/Package Name:____________________________________

Data Receiver Cluster/Package Name:_______________________________

Recovery Group Data:

Recovery Group Name: _____________________________________________

Primary Cluster/Package Name:_____________________________________

Data Sender Cluster/Package Name:_________________________________

Recovery Cluster/Package Name:____________________________________

Data Receiver Cluster/Package Name:_______________________________

Recovery Group Data:

Recovery Group Name: _____________________________________________

Primary Cluster/Package Name:_____________________________________

Data Sender Cluster/Package Name:_________________________________

Recovery Cluster/Package Name:____________________________________

Data Receiver Cluster/Package Name:_______________________________





Cluster Event Worksheet

The following worksheet will help you organize and record the cluster events you wish to track. Fill out a worksheet for each primary or recovery cluster that you wish to monitor. You must monitor each cluster containing a recovery package.


=======================================================================

Continental Cluster Name: _____________________________________________

=======================================================================

Cluster Event Information:

Cluster Name: ____________________________________________________

Monitoring Cluster: ______________________________________________

UNREACHABLE:

Alert Interval:___________________________________________________

Alarm Interval:___________________________________________________

Notification:_____________________________________________________

Notification:_____________________________________________________

Notification:_____________________________________________________


DOWN:

Alert Interval:___________________________________________________

Notification:_____________________________________________________

Notification:_____________________________________________________

UP:

Alert Interval:___________________________________________________

Notification:_____________________________________________________

Notification:_____________________________________________________

ERROR:

Alert Interval:___________________________________________________

Notification:_____________________________________________________

Notification:_____________________________________________________
© Hewlett-Packard Development Company, L.P.