Configuring OPS Clusters with MC/LockManager: Chapter 4 Planning and Documenting an OPS Cluster

Cluster Configuration Planning


A cluster should be designed to provide the quickest possible recovery from failures. The actual time required to recover from a failure depends on several factors:

  • The length of the cluster heartbeat interval and node timeout. Set each as short as practical, but no shorter than 1000000 microseconds (one second) for the heartbeat interval and 2000000 microseconds (two seconds) for the node timeout.

  • The design of the run and halt instructions in the package control script. They should be written for fast execution.

  • The availability of raw disk access. Applications that use raw disk access should be designed with crash recovery services.

  • The application and database recovery time. They should be designed for the shortest recovery time.

In addition, you must provide consistency across the cluster so that:

  • User names are the same on all nodes.

  • UIDs are the same on all nodes.

  • GIDs are the same on all nodes.

  • Applications in the system area are the same on all nodes.

  • System time is consistent across the cluster.

  • Files that could be used by more than one node, such as /usr files, are the same on all nodes.
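
The user name, UID, and GID requirements above can be checked mechanically. The following is a minimal Python sketch, assuming you have already collected each node's account entries into a dictionary; the function name, the sample accounts, and the sample IDs are hypothetical and are not part of MC/LockManager.

```python
def find_inconsistencies(accounts_by_node):
    """Return {user: {node: (uid, gid) or None}} for users that differ across nodes.

    accounts_by_node maps a node name to a dict of user name -> (uid, gid).
    A value of None means the user is missing on that node.
    """
    all_users = set()
    for accounts in accounts_by_node.values():
        all_users.update(accounts)

    problems = {}
    for user in sorted(all_users):
        per_node = {
            node: accounts.get(user)
            for node, accounts in accounts_by_node.items()
        }
        # More than one distinct value means a mismatch or a missing entry.
        if len(set(per_node.values())) > 1:
            problems[user] = per_node
    return problems


# Hypothetical sample data for two nodes.
accounts = {
    "ftsys9": {"oracle": (201, 101), "appuser": (305, 120)},
    "ftsys10": {"oracle": (201, 101), "appuser": (306, 120)},
}
mismatches = find_inconsistencies(accounts)
for user, per_node in mismatches.items():
    print(user, per_node)
```

Any user reported by the sketch has a different UID or GID on some node, or is missing from a node, and should be corrected before the cluster is configured.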

Choosing the Cluster Lock Volume Group

A specific disk is identified in the cluster configuration file as holding the cluster lock. This disk must be accessible from all nodes in the cluster. The purpose of the cluster lock is to ensure that only one new cluster is formed in the event that exactly half of the previously clustered nodes try to form a new cluster. It is critical that only one new cluster is formed and that it alone has access to the disks specified in its packages.
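
The tie-breaking role of the cluster lock can be illustrated with a short Python sketch. Only the exactly-half case is described in this section; the treatment of a strict majority (re-forms without the lock) and of a minority (cannot re-form) is an assumption made for illustration, and the function name is hypothetical.

```python
def can_form_cluster(responding_nodes, previous_nodes, holds_cluster_lock):
    """Decide whether a group of nodes may form the new cluster.

    Assumption (not stated in this section): a strict majority of the
    previous cluster may re-form without the lock, and a minority may not
    re-form at all. The exactly-half case is the one the lock resolves.
    """
    if 2 * responding_nodes > previous_nodes:
        return True
    if 2 * responding_nodes == previous_nodes:
        # Tie: only the group that acquired the cluster lock proceeds.
        return holds_cluster_lock
    return False


# Two-node cluster split into two single-node groups: only the lock holder wins.
print(can_form_cluster(1, 2, holds_cluster_lock=True))
print(can_form_cluster(1, 2, holds_cluster_lock=False))
```

Because only one group can hold the lock, at most one of the two halves proceeds, which is why the lock disk must be accessible from all nodes.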

Cluster Lock and Re-formation Time

The acquisition of the cluster lock takes different amounts of time depending on the disk I/O interface that is used. After all the disk hardware is configured, use the cmquerycl command specifying all the nodes in the cluster to display a list of available disks and the re-formation time associated with each. Example:

# cmquerycl -v -n ftsys9 -n ftsys10 

Alternatively, you can use SAM to display a list of cluster lock physical volumes, including the re-formation time.

The following list shows disk interface types in descending order of cluster lock disk acquisition/re-formation time:

  • Any combination of HP-HSC and HP-PB disks on F/W SCSI

  • Fiber Link disks

  • HP-PB disks on single-ended SCSI

By default, MC/LockManager selects the disk with the fastest re-formation time. However, you may need to choose a different disk because of power considerations. Remember that the cluster lock disk should be separately powered, if possible.

Heartbeat Subnet and Re-formation Time

The speed of cluster re-formation is partially dependent on the type of heartbeat network that is used. Ethernet results in a slower failover time than the other types. If two or more heartbeat subnets are used, the one with the fastest failover time is used.

Cluster Manager Parameters

For the operation of the cluster manager, you need to define a set of cluster parameters. These are stored in the binary cluster configuration file, which is located on all nodes in the cluster. These parameters can be entered by using SAM or by editing the cluster configuration template file created by issuing the cmquerycl command, as described in the chapter "Building an HA Cluster Configuration." The parameter names given below are the names that appear in SAM. The names coded in the ASCII cluster configuration file appear at the end of each entry.

The following parameters must be identified:


Cluster Name

The name of the cluster as it will appear in the output of cmviewcl and other commands, and as it appears in the cluster configuration file.

In the ASCII cluster configuration file, this parameter is CLUSTER_NAME.

The cluster name must not contain any of the following illegal characters: '/', '\', and '*'. All other characters are legal.
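
The character rule above is easy to check before the name is entered. This is a minimal Python sketch; the function names are hypothetical, and no length or empty-name rule is enforced because this section does not state one.

```python
# The three characters listed as illegal in a cluster name.
ILLEGAL_CLUSTER_NAME_CHARS = frozenset("/\\*")


def illegal_chars_in(name):
    """Return a sorted list of the illegal characters found in name."""
    return sorted(set(name) & ILLEGAL_CLUSTER_NAME_CHARS)


def is_valid_cluster_name(name):
    """True when name contains none of '/', '\\', or '*'."""
    return not illegal_chars_in(name)


print(is_valid_cluster_name("opscluster"))
print(illegal_chars_in("ops/cluster*"))
```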

Cluster Nodes

The hostname of each system that will be a node in the cluster.

In the ASCII cluster configuration file, this parameter is NODE_NAME.

Cluster Aware Volume Group

The name of a volume group whose disks are intended for access by at least two nodes in the cluster and are physically connected to more than one cluster node. Such disks are considered cluster aware.

In the ASCII cluster configuration file, this parameter is VOLUME_GROUP.

OPS Volume Group

The name of a volume group whose disks are attached to at least two nodes in the cluster; the disks will be accessed by more than one node at a time with concurrency control provided by the Distributed Lock Manager. Such disks are considered cluster aware.

In the ASCII cluster configuration file, this parameter is OPS_VOLUME_GROUP. Volume groups listed under this parameter are marked for activation in shared mode.

Heartbeat Subnet

IP notation in SAM indicating the subnet that will carry the cluster heartbeat. Note that heartbeat IP addresses must be on the same subnet on each node.

In the ASCII cluster configuration file, this parameter is HEARTBEAT_IP.
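
The requirement that heartbeat IP addresses be on the same subnet on each node can be verified with Python's standard ipaddress module. In this sketch the node addresses and the 255.255.255.0 netmask are assumptions chosen to match the 15.13.168.0 subnet used in the worksheet; substitute the values from your own network.

```python
import ipaddress


def heartbeat_subnets(addresses_by_node, netmask):
    """Map each node name to the network containing its heartbeat address."""
    return {
        node: ipaddress.ip_interface(f"{addr}/{netmask}").network
        for node, addr in addresses_by_node.items()
    }


def same_subnet(addresses_by_node, netmask):
    """True when every node's heartbeat address falls in one subnet."""
    return len(set(heartbeat_subnets(addresses_by_node, netmask).values())) == 1


# Hypothetical heartbeat addresses; the netmask is an assumption.
heartbeat_ips = {"node1": "15.13.168.91", "node2": "15.13.168.92"}
print(same_subnet(heartbeat_ips, "255.255.255.0"))
```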

RS232 Heartbeat Network

The name of the device file that corresponds to the serial (RS232) port that you have chosen on each node. Specify this parameter when you are using RS232 as a heartbeat line.

In the ASCII cluster configuration file, this parameter is SERIAL_DEVICE_FILE.

Monitored Non-Heartbeat Subnet

The IP address of each monitored subnet that does not carry the cluster heartbeat. You can identify any number of subnets to be monitored. If you have an application network that does not carry heartbeat, define it as a monitored non-heartbeat subnet.

In the ASCII cluster configuration file, this parameter is STATIONARY_IP.

Lock Volume Group

The volume group containing the physical disk volume on which a cluster lock is written. Identifying a cluster lock volume group is essential in a two-node cluster. If you are using dual cluster locks, enter a lock volume group name for each lock.

In the ASCII cluster configuration file, this parameter is FIRST_CLUSTER_LOCK_VG for the first lock volume group. If there is a second lock volume group, the parameter SECOND_CLUSTER_LOCK_VG is included in the file on a separate line.

NOTE: Lock volume groups must also be defined as Cluster Aware or OPS Volume groups in SAM or in VOLUME_GROUP or OPS_VOLUME_GROUP parameters in the ASCII configuration file.

Physical Volumes

The name of the physical volume within the Lock Volume Group that will have the cluster lock written on it. Enter the physical volume name as it appears on each node in the cluster (the same physical volume may have a different name on each node). If you are creating two cluster locks, enter the physical volume names for both locks.

In the ASCII cluster configuration file, this parameter is FIRST_CLUSTER_LOCK_PV for the first physical lock volume. If there is a second physical lock volume, the parameter SECOND_CLUSTER_LOCK_PV is included in the file on a separate line.

Disk Unit

This information is for a label to be attached to each disk drive. Enter the number of the disk drive unit on which the physical volume is located. (This parameter is not entered in the ASCII cluster configuration file.)

Power Supply

This information is for a label to be attached to each UPS. Enter the number of the power supply to which the physical volume is connected. (This parameter is not entered in the ASCII cluster configuration file.)

Heartbeat Interval

The normal interval between the transmission of heartbeat messages from one node to the other in the cluster. Enter a number of seconds. Default: 1 second. The interval should not be set smaller than this.

In the ASCII cluster configuration file, this parameter is HEARTBEAT_INTERVAL, and its value is entered in microseconds.

Node Timeout

The time after which a node may decide that the other node has become unavailable and initiate reconfiguration. Enter a number of seconds. Default: 2 seconds. Minimum is 2 * (Heartbeat Interval).

In the ASCII cluster configuration file, this parameter is NODE_TIMEOUT, and its value is entered in microseconds.
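
Because SAM takes these values in seconds while the ASCII file takes microseconds, a small conversion and sanity check can help avoid unit errors. The Python sketch below applies the two rules given above (heartbeat interval of at least one second, node timeout of at least twice the heartbeat interval); the function names are hypothetical.

```python
US_PER_SECOND = 1_000_000
MIN_HEARTBEAT_INTERVAL_US = 1 * US_PER_SECOND  # one second, per the text above


def seconds_to_us(seconds):
    """Convert seconds to whole microseconds for the ASCII file."""
    return int(round(seconds * US_PER_SECOND))


def check_timing(heartbeat_s, node_timeout_s):
    """Return (HEARTBEAT_INTERVAL, NODE_TIMEOUT, errors), values in microseconds."""
    heartbeat_us = seconds_to_us(heartbeat_s)
    timeout_us = seconds_to_us(node_timeout_s)
    errors = []
    if heartbeat_us < MIN_HEARTBEAT_INTERVAL_US:
        errors.append("heartbeat interval is shorter than one second")
    if timeout_us < 2 * heartbeat_us:
        errors.append("node timeout is less than 2 * heartbeat interval")
    return heartbeat_us, timeout_us, errors


# The default values: 1 second and 2 seconds.
print(check_timing(1, 2))
```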

Maximum Configured Packages

This parameter sets the maximum number of packages that can be configured in the cluster. The default is 0, which means that you must set this parameter if you want to use packages. (When upgrading from an earlier version, the parameter is set to the number of packages already configured.)

The greatest possible value is 30.

Set this parameter to a value that is high enough to accommodate the expected number of packages. However, be sure not to set the parameter so high that memory is wasted. Each configured package requires about 600 K of lockable memory.

In the ASCII cluster configuration file, this parameter is known as MAX_CONFIGURED_PACKAGES.
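
The memory cost of a given setting follows directly from the figures above. This Python sketch is only a rough estimate, since the text gives "about 600 K" per package; the function name is hypothetical.

```python
MAX_PACKAGES_LIMIT = 30        # greatest possible value, per the text above
LOCKABLE_KB_PER_PACKAGE = 600  # approximate figure, per the text above


def lockable_memory_kb(max_configured_packages):
    """Estimate lockable memory (in K) reserved for the given setting."""
    if not 0 <= max_configured_packages <= MAX_PACKAGES_LIMIT:
        raise ValueError("value must be between 0 and 30")
    return max_configured_packages * LOCKABLE_KB_PER_PACKAGE


# Reserving room for 4 packages versus the maximum of 30.
print(lockable_memory_kb(4))
print(lockable_memory_kb(MAX_PACKAGES_LIMIT))
```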

Network Polling Interval

The frequency at which the networks configured for MC/LockManager are checked. The current default is 2 seconds. Thus every 2 seconds, the cluster manager polls each network interface to make sure it can still send and receive information. Changing this value can affect how quickly a network failure is detected.

In the ASCII cluster configuration file, this parameter is NETWORK_POLLING_INTERVAL, and its value is entered in microseconds.

Autostart Delay

The amount of time a node waits before it stops trying to join a cluster during automatic cluster startup. All nodes wait this amount of time for other nodes to begin startup before the cluster completes the operation. The time should be selected based on the slowest boot time in the cluster. Enter a number of seconds equal to the boot time of the slowest booting node minus the boot time of the fastest booting node plus 600 seconds (ten minutes). Default: 600 seconds.

In the ASCII cluster configuration file, this parameter is AUTO_START_TIMEOUT, and its value is entered in microseconds.
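
The Autostart Delay rule above (slowest boot time minus fastest boot time, plus 600 seconds) can be worked through as follows. The boot times in this Python sketch are hypothetical measurements, and the function names are illustrative only.

```python
US_PER_SECOND = 1_000_000


def autostart_delay_seconds(boot_times_s, margin_s=600):
    """Slowest boot time minus fastest boot time, plus the ten-minute margin."""
    return max(boot_times_s) - min(boot_times_s) + margin_s


def to_microseconds(seconds):
    """Convert seconds to the microsecond value used in the ASCII file."""
    return seconds * US_PER_SECOND


# Hypothetical boot times: one node boots in 420 s, the other in 300 s.
delay_s = autostart_delay_seconds([420, 300])
print(delay_s, to_microseconds(delay_s))
```

With these sample boot times the delay is 720 seconds in SAM, or 720000000 for AUTO_START_TIMEOUT in the ASCII file.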

Cluster Configuration Worksheet

===============================================================================
Name and Nodes:
===============================================================================
Cluster Name: ___opscluster_____________ OPS Version: __7.3.3______

Node Names: ____node1_________________ ____node2_________________

OPS Volume Groups: _____/dev/vg_ops______/dev/vg_lock_______________

Volume Groups (for packages):_______________________________________
===============================================================================
Subnets:
===============================================================================
Heartbeat Subnet: ___15.13.168.0______

Monitored Non-heartbeat Subnet: _____15.12.172.0___

Monitored Non-heartbeat Subnet: ___________________
===============================================================================
Cluster Lock Volume Groups and Volumes:
===============================================================================
First Lock Volume Group: | Physical Volume:
|
__/dev/vg_lock__ | Name on Node 1: _/dev/dsk/c15t2d0__
|
| Name on Node 2: __/dev/dsk/c15t2d0_
|
| Disk Unit No: ___1_____
|
| Power Supply No: ___1_____
|
===============================================================================
Timing Parameters:
===============================================================================
Heartbeat Interval: _1 sec_
===============================================================================
Node Timeout: _2 sec_
===============================================================================
Network Polling Interval: _15 sec_
===============================================================================
Autostart Delay: _600 sec_
© 1998 Hewlett-Packard Development Company, L.P.