Designing Disaster Tolerant High Availability Clusters > Chapter 2: Building an Extended Distance Cluster Using Serviceguard

Two Data Center and Third Location Architectures


A two data center and third location architecture has the following configuration requirements:

NOTE: There is no hard requirement on how far the third location has to be from the two main data centers. The third location can be as close as the room next door with its own power source, or as far away as another site across town. The distance between all three locations dictates the level of disaster tolerance a cluster can provide.
  • In these solutions, there must be an equal number of nodes in each primary data center: 1 to 7, or 1 to 8 if a Quorum Server is used. The third location (known as the arbitrator data center) can contain 1 or 2 arbitrator nodes, or a Quorum Server. Cluster lock disks must not be configured.

  • The arbitrator nodes are standard Serviceguard nodes configured in the cluster; however, they are not allowed to be connected to the shared disks in either of the primary data centers. Arbitrator nodes are used as tie-breakers to maintain cluster quorum when all communication between the two primary data centers is lost. The third location containing the arbitrator nodes must be located separately from the primary data centers.

    It is possible to use a single Serviceguard Quorum Server node in place of Arbitrator node(s); however, the quorum server system must still be located in a third location separate from the primary data centers. For more information about quorum server, refer to the Managing Serviceguard user’s guide and the Serviceguard Quorum Server Release Notes.

  • If Serviceguard OPS Edition or Serviceguard Extension for RAC is used, then there can only be two or four nodes configured to share OPS/RAC data, as MirrorDisk/UX only supports concurrent volume group activation for up to two nodes. CVM allows for clusters containing 2, 4, 6 or 8 nodes.

  • There can be separate networking and Fibre Channel links between the data centers, or both networking and Fibre Channel can go over DWDM links between the data centers.

  • Fibre Channel Direct Fabric Attach (DFA) is recommended over Fibre Channel Arbitrated loop configurations, due to the superior performance of DFA, especially as the distance increases. Therefore Fibre Channel switches are preferred over Fibre Channel hubs.

  • Any combination of the following Fibre Channel capable disk arrays may be used: HP StorageWorks FC10, HP StorageWorks FC60, HP StorageWorks Virtual Arrays, HP StorageWorks Disk Array XP or EMC Symmetrix Disk Arrays.

  • Application data must be mirrored between the primary data centers. If MirrorDisk/UX is used, Mirror Write Cache (MWC) must be the Consistency Recovery policy defined for all mirrored logical volumes. This allows only the stale extents to be resynchronized after a node crash, rather than requiring a full resynchronization. For SLVM (concurrently activated) volume groups, Mirror Write Cache must not be defined as the Consistency Recovery policy for mirrored logical volumes (that is, NOMWC must be used). This means that a full resynchronization may be required for shared volume group mirrors after a node crash, which can have a significant impact on recovery time. You must ensure that the mirror copies reside in different data centers, so it is recommended to configure physical volume groups for the disk devices in each data center, and to use Group Allocation Policy for all mirrored logical volumes.

  • Due to the maximum of 3 images (1 original image plus two mirror copies) allowed in MirrorDisk/UX, if JBODs are used for application data, only one data center can contain JBODs while the other data center must contain disk arrays with hardware mirroring. Note that having three mirror copies will affect performance on disk writes. VxVM and CVM mirroring do not have a limit on the number of mirror copies, so it is possible to have JBODs in both data centers; however, increasing the number of mirror copies may adversely affect performance on disk writes.

  • No routing is allowed for the networks between data centers. Routing is allowed to the third data center if a Quorum Server is used in that data center.

  • VERITAS Volume Manager (VxVM) mirroring is supported for distances of up to 100 kilometers for clusters of 16 nodes. However, on HP-UX 11i v2, VxVM supports up to 10 kilometers for clusters of 16 nodes. You must ensure that the mirror copies reside in different data centers, and the DRL (Dirty Region Logging) feature must be used. RAID 5 mirrors are not supported. It is important to note that VxVM can only perform a full resynchronization (that is, it cannot perform an incremental synchronization) across the data replication links between the data centers when recovering from the failure of a mirror copy or loss of connectivity to a data center. This can have a significant impact on performance and availability of the cluster if the disk groups are large.

  • VERITAS Cluster Volume Manager (CVM) mirroring is supported for Serviceguard, Serviceguard OPS Edition, or Serviceguard Extension for RAC clusters for distances up to 10 kilometers for 2, 4, 6, or 8 node clusters, and up to 100 kilometers for 2 node clusters.

    Since CVM does not support multiple heartbeats and allows only one heartbeat network to be defined for the cluster, you must make the heartbeat network highly available, using a standby LAN to provide redundancy for the heartbeat network. The heartbeat subnet should be a dedicated network, to ensure that other network traffic will not saturate the heartbeat network. The CVM Mirror Detachment Policy must be set to “Global”.

  • For clusters using VERITAS CVM, only a single heartbeat subnet is supported, so both Primary and Standby LANs must be configured for the heartbeat subnet on all nodes. For SGeRAC clusters, it is recommended to have an additional network for Oracle RAC cache fusion traffic. It is acceptable to use a single Standby network to provide backup for both the heartbeat network and the RAC cache fusion network; however, it can only provide failover capability for one of these networks at a time.

  • If Serviceguard Extension for Faster Failover (SGeFF) is used in a two data center and third location architecture, a two node cluster with multiple heartbeats and a quorum server in the third location are required. For more detailed information on Serviceguard Extension for Faster Failover, refer to the Serviceguard Extension for Faster Failover Release Notes and white paper, “Optimizing Failover Time in a Serviceguard Environment”.
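
The mirroring rules in the list above differ by volume manager, so it can help to see them side by side. The following Python table is only a summary sketch of those rules: the dictionary keys and field names are hypothetical labels chosen for this illustration, not LVM, VxVM, or Serviceguard configuration syntax.

```python
# Summary of the mirroring requirements listed above, keyed by volume manager.
# All key and field names are illustrative labels, not product settings.
MIRROR_RULES = {
    # MirrorDisk/UX with exclusive activation: MWC avoids a full resync.
    "LVM": {"consistency_recovery": "MWC", "full_resync_may_be_needed": False},
    # SLVM (concurrently activated volume groups): MWC must not be used.
    "SLVM": {"consistency_recovery": "NOMWC", "full_resync_may_be_needed": True},
    # VxVM: DRL is required and RAID 5 mirrors are not supported.
    "VxVM": {"dirty_region_logging": True, "raid5_mirrors_supported": False},
    # CVM: one heartbeat subnet only, detachment policy must be "Global".
    "CVM": {"mirror_detachment_policy": "Global", "heartbeat_subnets": 1},
}


def mirror_rules(volume_manager):
    """Return the rule summary for a volume manager named in this section."""
    try:
        return MIRROR_RULES[volume_manager]
    except KeyError:
        raise ValueError(f"unknown volume manager: {volume_manager!r}") from None


if __name__ == "__main__":
    for name in MIRROR_RULES:
        print(name, mirror_rules(name))
```

In every case the mirror copies must still reside in different data centers; that rule is the same for all four entries and is therefore not repeated in the table.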

The following table shows the possible configurations using a three data center architecture.

Table 2-2 Supported System and Data Center Combinations

Data Center A   Data Center B   Data Center C           Serviceguard Version
1               1               1 Arbitrator Node       A.11.13 or later
1               1               Quorum Server System    A.11.13 or later
1               1               Quorum Server System    A.11.16 or later (including SGeFF)
2               1               2 Arbitrator Nodes      A.11.13 or later
1               2               2 Arbitrator Nodes      A.11.13 or later
2               2               1 Arbitrator Node       A.11.13 or later
2               2               2* Arbitrator Nodes     A.11.13 or later
2               2               Quorum Server System    A.11.13 or later
3               3               1 Arbitrator Node       A.11.13 or later
3               3               2* Arbitrator Nodes     A.11.13 or later
3               3               Quorum Server System    A.11.13 or later
4               4               1 Arbitrator Node       A.11.13 or later
4               4               2* Arbitrator Nodes     A.11.13 or later
4               4               Quorum Server System    A.11.13 or later
5               5               1 Arbitrator Node       A.11.13 or later
5               5               2* Arbitrator Nodes     A.11.13 or later
5               5               Quorum Server System    A.11.13 or later
6               6               1 Arbitrator Node       A.11.13 or later
6               6               2* Arbitrator Nodes     A.11.13 or later
6               6               Quorum Server System    A.11.13 or later
7               7               1 Arbitrator Node       A.11.13 or later
7               7               2* Arbitrator Nodes     A.11.13 or later
7               7               Quorum Server System    A.11.13 or later
8               8               Quorum Server System    A.11.13 or later

* Configurations with two arbitrators are preferred because they provide a greater degree of availability, especially in cases when a node is down due to a failure or planned maintenance. It is highly recommended that two arbitrators be configured in Data Center C to allow for planned downtime in Data Centers A and B.

NOTE: Serviceguard Extension for RAC clusters are limited to 2, 4, 6, or 8 nodes.
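
As a rough aid for checking a planned layout against Table 2-2, the rows can be reduced to a few rules. The sketch below is an illustration only: the function name `is_supported` and its parameters are hypothetical, it ignores the Serviceguard version column, and it does not apply the additional SGeRAC node-count limit from the note above.

```python
def is_supported(nodes_a, nodes_b, arbitrators=0, quorum_server=False):
    """Check a node layout against the rows of Table 2-2.

    nodes_a, nodes_b: node counts in Data Centers A and B.
    arbitrators: number of arbitrator nodes in Data Center C (0, 1 or 2).
    quorum_server: True if Data Center C holds a Quorum Server instead.
    """
    # Data Center C holds either arbitrator nodes or a Quorum Server.
    if quorum_server == (arbitrators > 0):
        return False
    if quorum_server:
        return nodes_a == nodes_b and 1 <= nodes_a <= 8
    if arbitrators == 1:
        return nodes_a == nodes_b and 1 <= nodes_a <= 7
    if arbitrators == 2:
        # The table lists the asymmetric 2+1 and 1+2 layouts, plus 2..7 per site.
        if {nodes_a, nodes_b} == {1, 2}:
            return True
        return nodes_a == nodes_b and 2 <= nodes_a <= 7
    return False


if __name__ == "__main__":
    print(is_supported(4, 4, arbitrators=2))        # listed in the table
    print(is_supported(8, 8, quorum_server=True))   # listed in the table
    print(is_supported(8, 8, arbitrators=1))        # not listed
```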

The following is a list of recommended arbitration methods for Metrocluster solutions in order of preference:

  • 2 arbitrator nodes, where supported

  • 1 arbitrator node, where supported

  • Quorum Server running in a Serviceguard cluster

  • Quorum Server with APA

  • Quorum Server

For more information on Quorum Server, refer to the Serviceguard Quorum Server Version A.01.00 Release Notes for HP-UX.
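
The preference for two arbitrator nodes can be illustrated with simple quorum arithmetic. The sketch below assumes the usual Serviceguard rule that a cluster re-forms without a tie-breaker only if strictly more than half of the previously running nodes survive, and that no cluster lock disk is available (as required in this architecture). The function name and the site labels are hypothetical.

```python
def survives_site_loss(running, failed_site):
    """Return True if the remaining nodes are a strict majority.

    running: mapping of site name to the number of nodes currently running.
    failed_site: the site that is lost completely.
    Assumption: strictly more than 50% of the running nodes must survive,
    because no cluster lock is configured to break an exact 50% tie.
    """
    total = sum(running.values())
    survivors = total - running[failed_site]
    return 2 * survivors > total


if __name__ == "__main__":
    # 2 + 2 nodes with one arbitrator, everything running: losing A leaves 3 of 5.
    print(survives_site_loss({"A": 2, "B": 2, "C": 1}, "A"))
    # Same cluster with one node in B halted for maintenance: 2 of 4 is only a tie.
    print(survives_site_loss({"A": 2, "B": 1, "C": 1}, "A"))
    # With two arbitrators the same maintenance case leaves 3 of 5.
    print(survives_site_loss({"A": 2, "B": 1, "C": 2}, "A"))
```

Under these assumptions, a single arbitrator protects against the loss of a data center only while every node is up, whereas two arbitrators still leave a majority when one primary node is down for maintenance.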

Figure 2-4 “Two Data Centers and Third Location with DWDM and Arbitrators” is an example of a two data center and third location configuration using DWDM, with arbitrator nodes in the third location.

Figure 2-4 Two Data Centers and Third Location with DWDM and Arbitrators


Figure 2-5 Two Data Centers and Third Location with DWDM and Quorum Server


Figure 2-5 “Two Data Centers and Third Location with DWDM and Quorum Server” is an example of a two data center and third location configuration using DWDM, with a quorum server node at the third site; it is specific to an SGeRAC cluster. The DWDM boxes connected between the two Primary Data Centers are configured with redundant dark fibre links, and the standby fibre feature has been enabled.

Note that there is a separate network (indicated by the lines to switches #3 and #4) being used for the RAC Cache Fusion traffic to ensure good RAC performance. Switches #2 and #5 are used for the Standby network, which can provide local LAN failover for both the Primary Heartbeat network and the Primary RAC Cache Fusion network. However it must be noted that the Standby network can only provide local failover capability for one of the Primary networks at a time. For that reason, it is preferable to have a separate Standby network for the Heartbeat network and for the RAC Cache Fusion network.

There are no requirements for the distance between the Quorum Server data center and the Primary Data Centers; however, it is necessary to ensure that the Quorum Server can be contacted within a reasonable amount of time (contact should complete within the NODE_TIMEOUT period). Cluster lock disks are not allowed in this configuration. There can be 2, 4, 6, or 8 nodes in this cluster if CVM is used and the distance is 10 kilometers or less. However, there can be only 2 nodes in this cluster if CVM is used and the distance is up to 100 kilometers, or if shared LVM is used.

Since there are 4 nodes shown in this example cluster, it can only use CVM as the volume manager, and the distance between the Primary data centers cannot exceed 10 kilometers.
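
The reasoning in the last two paragraphs can be written as a small decision function. This is a sketch under the limits stated in this section only: the function name is hypothetical, and because this section gives no distance limit for shared LVM, none is checked for it.

```python
def rac_volume_managers(nodes, distance_km):
    """List the volume managers this section allows for an SGeRAC cluster.

    nodes: total number of nodes sharing RAC data in the primary data centers.
    distance_km: distance between the two primary data centers.
    """
    options = []
    # CVM: 2, 4, 6 or 8 nodes up to 10 km; 2 nodes only up to 100 km.
    if nodes in (2, 4, 6, 8) and distance_km <= 10:
        options.append("CVM")
    elif nodes == 2 and distance_km <= 100:
        options.append("CVM")
    # Shared LVM: 2 nodes only (no distance limit is stated in this section).
    if nodes == 2:
        options.append("SLVM")
    return options


if __name__ == "__main__":
    # The 4-node example of Figure 2-5 at 10 km.
    print(rac_volume_managers(4, 10))
    # A 2-node cluster at 100 km.
    print(rac_volume_managers(2, 100))
```

For the 4-node cluster of Figure 2-5 the function returns only CVM, and returns nothing at all once the distance exceeds 10 kilometers, which matches the conclusion above.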

© Hewlett-Packard Development Company, L.P.