Configuring OPS Clusters with ServiceGuard OPS Edition > Chapter 5 Building an OPS Cluster Configuration

Preparing Your Systems


Before configuring your cluster, ensure that all cluster nodes possess the appropriate security files, kernel configuration and NTP (network time protocol) configuration.

Understanding Where Files Are Located

ServiceGuard uses a special file, /etc/cmcluster.conf, to define the locations for configuration and log files within the HP-UX filesystem. The following locations are defined in the file:

###################### cmcluster.conf########################
#
# Highly Available Cluster file locations
#
# This file must not be edited
#############################################################

SGCONF=/etc/cmcluster
SGSBIN=/usr/sbin
SGLBIN=/usr/lbin
SGLIB=/usr/lib
SGCMOM=/opt/cmom
SGRUN=/var/adm/cmcluster
SGAUTOSTART=/etc/rc.config.d/cmcluster
SGCMOMLOG=/var/adm/syslog/cmom

NOTE: If these variables are not defined on your system, source the file /etc/cmcluster.conf from the login profile of user root.

Throughout this book, system filenames are usually given with one of these location prefixes. To resolve a reference such as $SGCONF/<FileName>, substitute the definition of the prefix found in this file. For example, with SGCONF defined as /etc/cmcluster, as in the listing above, the complete pathname for $SGCONF/cmclconfig is /etc/cmcluster/cmclconfig.
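The prefix substitution described above can be sketched as a small shell function. This is only an illustration: resolve_sg_path is a hypothetical helper name, not a ServiceGuard command, and it assumes the definitions file contains plain NAME=value lines like those in the listing.

```shell
# Hypothetical helper (not part of ServiceGuard): expand a location prefix
# by reading the definitions file.
# Usage: resolve_sg_path <definitions file> <prefix name> <file name>
# The prefix name must be a plain variable name such as SGCONF.
resolve_sg_path() {
    conf_file=$1
    prefix=$2
    name=$3
    # Source the file in a subshell so the caller's environment is untouched.
    (
        . "$conf_file" || exit 1
        eval "dir=\${$prefix}"
        # Fail if the prefix is not defined in the file.
        [ -n "$dir" ] || exit 1
        printf '%s/%s\n' "$dir" "$name"
    )
}

# Example call, on a system where the file exists:
#   resolve_sg_path /etc/cmcluster.conf SGCONF cmclconfig
```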

NOTE: Do not edit the /etc/cmcluster.conf file.

Editing Security Files

ServiceGuard makes use of ARPA services to ensure secure communication among cluster nodes. Before installing ServiceGuard OPS Edition, you must identify the nodes in the cluster that permit access by the root user on other nodes. If you do not do this, ServiceGuard will not be able to copy files among nodes during configuration.

Instead of using the /.rhosts file to enforce security in communication among all nodes within a cluster, ServiceGuard allows you to specify an alternate file, /etc/cmcluster/cmclnodelist, for validating inter-node access within the cluster. The use of /etc/cmcluster/cmclnodelist is strongly recommended if NIS or NIS+ is configured in your cluster environment for network security services. ServiceGuard checks for /etc/cmcluster/cmclnodelist first. If that file exists, ServiceGuard uses only that file to verify access within the cluster; otherwise, it uses the /.rhosts file.

The format for entries in the /etc/cmcluster/cmclnodelist file is as follows:

	[hostname or IP address]     [rootuser]       [#comment]

The following examples show how to format the file.

In the following, NodeA and NodeB.sys.dom.com are configured on the same subnet:

NodeA                 root            # cluster1
NodeB.sys.dom.com root # cluster1

In the following example, NodeA and NodeB share two public subnetworks, 192.6.1.0 and 192.6.5.0. NodeA is configured in subnet 192.6.1.0 and its official IP address known by clients is 192.6.1.10. It also has another IP address, 192.6.5.10, which is available for communication on subnet 192.6.5.0. The cmclnodelist file must contain all these IP addresses in order to grant permission for ServiceGuard messages to be sent from any node.

NodeA                 root            # cluster1
192.6.5.10            root            # cluster1
NodeB.sys.dom.com     root            # cluster1
192.6.1.20            root            # cluster1

The /.rhosts or /etc/cmcluster/cmclnodelist file should be copied to all cluster nodes. ServiceGuard supports full domain names in both the /etc/cmcluster/cmclnodelist and /.rhosts files.
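Before copying the file to the other nodes, it can help to check that every entry follows the documented layout. The following is a minimal sketch, not a ServiceGuard tool: check_nodelist is a hypothetical name, and the rule it applies (two fields per entry, an optional trailing comment, or a single "+" line) is an assumption drawn from the format shown in this section.

```shell
# Hypothetical helper: report entries that do not look like
#   <hostname or IP address> <user> [# comment]
# Prints "<line number>: <line>" for each suspect entry and returns non-zero
# if any are found.
check_nodelist() {
    awk '
        BEGIN { bad = 0 }
        {
            line = $0
            sub(/#.*/, "", line)                 # drop a trailing comment
            n = split(line, f, " ")              # split on runs of whitespace
            if (n == 0) next                     # blank or comment-only line
            if (n == 1 && f[1] == "+") next      # lone "+" wildcard line
            if (n != 2) { printf "%d: %s\n", NR, $0; bad = 1 }
        }
        END { exit bad }
    ' "$1"
}

# Example call:
#   check_nodelist /etc/cmcluster/cmclnodelist
```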

NOTE: The .rhosts file must not allow write access by group or other. If these permissions are enabled, MC/ServiceGuard commands may fail, logging a "Permission Denied for User" message.
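One way to check for the problem permissions is with the standard find command. This is a sketch only; writable_by_group_or_other is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds (exit status 0) if group or other has write
# permission on the named file.
writable_by_group_or_other() {
    # -perm -020 matches group write; -perm -002 matches other write.
    [ -n "$(find "$1" -prune \( -perm -020 -o -perm -002 \) -print 2>/dev/null)" ]
}

# Example: remove the offending permissions if they are present.
#   writable_by_group_or_other /.rhosts && chmod go-w /.rhosts
```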

You can also use the /etc/hosts.equiv and /var/adm/inetd.sec files to provide other levels of cluster security. For more information, refer to the HP-UX guide, Managing Systems and Workgroups.

Creating Mirrors of Root Logical Volumes

It is highly recommended that you use mirrored root volumes on all cluster nodes. The following procedure assumes that you are using separate boot and root volumes; you create a mirror of the boot volume (/dev/vg00/lvol1), primary swap (/dev/vg00/lvol2), and root volume (/dev/vg00/lvol3). The procedure cannot be carried out with SAM. In this example and in the following commands, /dev/dsk/c4t5d0 is the primary disk and /dev/dsk/c4t6d0 is the mirror; be sure to use the correct device file names for the root disks on your system.

  1. Create a bootable LVM disk to be used for the mirror.

    # pvcreate -B /dev/rdsk/c4t6d0 
  2. Add this disk to the current root volume group.

    # vgextend /dev/vg00 /dev/dsk/c4t6d0 
  3. Make the new disk a boot disk.

    # mkboot -l /dev/rdsk/c4t6d0  
  4. Mirror the boot, primary swap, and root logical volumes to the new bootable disk. Ensure that all devices in vg00, such as /usr, /swap, etc., are mirrored.

    NOTE: The boot, swap, and root logical volumes must be mirrored in exactly the following order to ensure that the boot volume occupies the first contiguous set of extents on the new disk, followed by the swap and the root.

    The following is an example of mirroring the boot logical volume:

    # lvextend -m 1 /dev/vg00/lvol1 /dev/dsk/c4t6d0 

    The following is an example of mirroring the primary swap logical volume:

    # lvextend -m 1 /dev/vg00/lvol2 /dev/dsk/c4t6d0 

    The following is an example of mirroring the root logical volume:

    # lvextend -m 1 /dev/vg00/lvol3 /dev/dsk/c4t6d0 
  5. Update the boot information contained in the Boot Data Reserved Area (BDRA) for the mirror copies of boot, root, and primary swap.

    # /usr/sbin/lvlnboot -b /dev/vg00/lvol1
    # /usr/sbin/lvlnboot -s /dev/vg00/lvol2
    # /usr/sbin/lvlnboot -r /dev/vg00/lvol3

  6. Verify that the mirrors were properly created.

    # lvlnboot -v

    The output of this command is shown in a display like the following:

    Boot Definitions for Volume Group /dev/vg00:
    Physical Volumes belonging in Root Volume Group:
            /dev/dsk/c4t5d0 (10/0.5.0) -- Boot Disk
            /dev/dsk/c4t6d0 (10/0.6.0) -- Boot Disk
    Boot: lvol1     on:     /dev/dsk/c4t5d0
                            /dev/dsk/c4t6d0
    Root: lvol3     on:     /dev/dsk/c4t5d0
                            /dev/dsk/c4t6d0
    Swap: lvol2     on:     /dev/dsk/c4t5d0
                            /dev/dsk/c4t6d0
    Dump: lvol2     on:     /dev/dsk/c4t6d0, 0
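
Because the order of the steps matters and the device names differ from system to system, it can be useful to print the full command sequence for review before running anything. The sketch below only echoes the commands from the procedure above; it changes nothing on disk. print_root_mirror_plan is a hypothetical helper name, and the volume group and logical volume names are the ones assumed in this example (vg00 with lvol1, lvol2, and lvol3).

```shell
# Hypothetical helper: print, in the required order, the commands from the
# mirroring procedure for a given mirror disk. Nothing is executed.
# Argument: the disk name without its directory, for example c4t6d0.
print_root_mirror_plan() {
    disk=$1
    vg=/dev/vg00
    echo "pvcreate -B /dev/rdsk/$disk"
    echo "vgextend $vg /dev/dsk/$disk"
    echo "mkboot -l /dev/rdsk/$disk"
    # Boot first, then primary swap, then root.
    for lv in lvol1 lvol2 lvol3; do
        echo "lvextend -m 1 $vg/$lv /dev/dsk/$disk"
    done
    echo "/usr/sbin/lvlnboot -b $vg/lvol1"
    echo "/usr/sbin/lvlnboot -s $vg/lvol2"
    echo "/usr/sbin/lvlnboot -r $vg/lvol3"
    echo "lvlnboot -v"
}

# Example call:
#   print_root_mirror_plan c4t6d0
```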

Choosing Cluster Lock Disks

The following guidelines apply if you are using a lock disk. The cluster lock disk is configured on a volume group that is physically connected to all cluster nodes. This volume group may also contain data that is used by packages.

When you are using dual cluster lock disks, you must use the default IO timeout values for the cluster lock physical volumes. Changing the IO timeout values for these physical volumes can prevent the nodes in the cluster from detecting a failed lock disk within the allotted time period, which can prevent cluster re-formations from succeeding. To view the existing IO timeout value, run the following command:

# pvdisplay <lock device file name>

The IO Timeout value should be displayed as "default." To set the IO Timeout back to the default value, run the command:

# pvchange -t 0 <lock device file name>
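
The check can be scripted by filtering the pvdisplay output. The following sketch makes one assumption that you should verify on your system: that the timeout appears on a line containing "IO Timeout" whose last field is the value. io_timeout_is_default is a hypothetical helper name.

```shell
# Hypothetical helper: read pvdisplay output on standard input and succeed
# only if the IO Timeout field reads "default".
# Assumption: the field appears on a line such as
#   IO Timeout (Seconds)        default
io_timeout_is_default() {
    awk '
        BEGIN { ok = 0 }
        /IO Timeout/ { if ($NF == "default") ok = 1 }
        END { exit (ok ? 0 : 1) }
    '
}

# Example call, with the lock device file name left as a placeholder:
#   pvdisplay <lock device file name> | io_timeout_is_default \
#       || echo "IO timeout is not set to default"
```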

The use of a dual cluster lock is only allowed with certain specific configurations of hardware. Refer to the discussion in Chapter 3 on "Dual Cluster Lock."

Backing Up Cluster Lock Disk Information

After you configure the cluster and create the cluster lock volume group and physical volume, you should create a backup of the volume group configuration data on each lock volume group. Use the vgcfgbackup command for each lock volume group you have configured, and save the backup file in case the lock configuration must be restored to a new disk with the vgcfgrestore command following a disk failure.

NOTE: You must use the vgcfgbackup and vgcfgrestore commands to back up and restore the lock volume group configuration data regardless of whether you use SAM or HP-UX commands to create the lock volume group.

Allowing Non-Root Users to Run cmviewcl

The ServiceGuard cmviewcl command normally requires root access to the system. However, you can easily modify the cmclnodelist file to allow non-root users to run the cmviewcl command.

If you want a specific non-root user to run the cmviewcl command, then add a hostname-user name pair in the /etc/cmcluster/cmclnodelist file. If you want to allow every user to run the cmviewcl command, then add "+" to the end of the /etc/cmcluster/cmclnodelist file. As an example, the following entries for a two-node cluster allow user1 and user2 to run cmviewcl on system1 and allow user3 to run cmviewcl on system2:

system1     root
system1 user1
system1 user2
system2 root
system2 user3

The following example allows any non-root user to run the cmviewcl command:

system1     root
system2 root
+
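
To see what a given set of entries permits, the rules above can be modeled in a few lines of shell. This is an illustrative sketch, not ServiceGuard's own matching logic: may_run_cmviewcl is a hypothetical helper, and it assumes exact string matches on the host and user fields.

```shell
# Hypothetical helper: succeed if the node list contains either a matching
# "<host> <user>" entry or a lone "+" line.
# Usage: may_run_cmviewcl <node list file> <host> <user>
may_run_cmviewcl() {
    awk -v host="$2" -v user="$3" '
        BEGIN { ok = 0 }
        {
            line = $0
            sub(/#.*/, "", line)            # drop a trailing comment
            n = split(line, f, " ")         # split on runs of whitespace
            if (n == 1 && f[1] == "+") ok = 1
            if (n == 2 && f[1] == host && f[2] == user) ok = 1
        }
        END { exit (ok ? 0 : 1) }
    ' "$1"
}

# Example call:
#   may_run_cmviewcl /etc/cmcluster/cmclnodelist system1 user1 && echo allowed
```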

Adding an Oracle User

The oracle user must be given cluster-level access by creating an entry in the /etc/cmcluster/cmclnodelist file, as in the following example:

node1.sys.dom.com            oracle
node2.sys.dom.com oracle

Include a similar entry for each node in the OPS cluster.

Defining Name Resolution Services

It is important to understand how ServiceGuard uses name resolution services. When you employ any user-level ServiceGuard command (including cmviewcl), the command uses name lookup (as configured in /etc/resolv.conf) to obtain the addresses of all the cluster nodes. If name services are not available, the command could hang or return an unexpected networking error message. In SAM, cluster or package operations also will return an error if name services are not available.

NOTE: If such a hang or error occurs, ServiceGuard and all protected applications will continue working even though the command you issued does not. That is, only the ServiceGuard configuration commands and SAM functions are impacted, not the cluster daemon or package services.

To avoid this problem, you can use the /etc/hosts file on all cluster nodes instead of DNS or NIS. You should also make DNS highly available, either by using multiple DNS servers or by configuring DNS into a ServiceGuard package.

A workaround for the problem that still retains the ability to use conventional name lookup is to configure the /etc/nsswitch.conf file to search the /etc/hosts file when other lookup strategies are not working. In case name services are not available, ServiceGuard commands and SAM functions will then use the /etc/hosts file on the local system to do name resolution. Of course, the names and IP addresses of all the nodes in the cluster must be in the /etc/hosts file.

Name Resolution Following Primary LAN Failure or Loss of DNS

Some special configuration steps are required to allow cluster configuration commands such as cmrunnode and cmruncl to continue to work properly after a LAN failure, even when a standby LAN has been configured for the failed primary. These steps also protect against the loss of DNS services, allowing cluster nodes to continue communicating with one another.

  1. Edit the /etc/cmcluster/cmclnodelist file on all nodes in the cluster and add all heartbeat IP addresses, as well as other IP addresses on the nodes and all cluster node names. Example:

    192.2.1.1       root
    192.2.1.2 root
    192.2.1.3 root
    15.13.172.231 root
    15.13.172.232 root
    15.13.172.233 root
    hasupt01 root
    hasupt02 root
    hasupt03 root

    This will ensure that permission is granted to all IP addresses configured on the heartbeat networks or on any other network in addition to the public network.

  2. Edit or create the /etc/nsswitch.conf file on all nodes and add the following line if it does not already exist:

    hosts:        files [NOTFOUND=continue] dns 

    If a line beginning with the string "hosts:" already exists, then make sure that the text immediately to the right of this string is:

    files [NOTFOUND=continue] 

    This step is critical so that the nodes in the cluster can still resolve hostnames to IP addresses while DNS is down or if the primary LAN is down.

  3. Edit or create the /etc/hosts file on all nodes and add all cluster node primary IP addresses and node names:

    15.13.172.231     hasupt01 
    15.13.172.232 hasupt02
    15.13.172.233 hasupt03
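
Steps 2 and 3 can be verified on each node with two small checks. The sketch below is only an illustration; hosts_lookup_ok and missing_from_hosts are hypothetical helper names, and the node names in the example call are the ones used in this section.

```shell
# Hypothetical helper: succeed if the "hosts:" line of the given
# nsswitch.conf file starts with: files [NOTFOUND=continue]
hosts_lookup_ok() {
    awk '
        BEGIN { ok = 0 }
        $1 == "hosts:" && $2 == "files" && $3 == "[NOTFOUND=continue]" { ok = 1 }
        END { exit (ok ? 0 : 1) }
    ' "$1"
}

# Hypothetical helper: print each node name that has no entry in the given
# hosts file.
# Usage: missing_from_hosts <hosts file> <node name>...
missing_from_hosts() {
    hosts_file=$1
    shift
    for node in "$@"; do
        awk -v node="$node" '
            BEGIN { found = 0 }
            {
                line = $0
                sub(/#.*/, "", line)        # drop a trailing comment
                n = split(line, f, " ")
                # Field 1 is the address; the names start at field 2.
                for (i = 2; i <= n; i++) if (f[i] == node) found = 1
            }
            END { exit (found ? 0 : 1) }
        ' "$hosts_file" || echo "$node"
    done
}

# Example calls:
#   hosts_lookup_ok /etc/nsswitch.conf || echo "check the hosts: line"
#   missing_from_hosts /etc/hosts hasupt01 hasupt02 hasupt03
```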

Ensuring Consistency of Kernel Configuration

Make sure that the kernel configurations of all cluster nodes are consistent with the expected behavior of the cluster during failover. In particular, if you change any kernel parameters on one cluster node, they may also need to be changed on other cluster nodes that can run the same packages.

Enabling the Network Time Protocol

It is strongly recommended that you enable network time protocol (NTP) services on each node in the cluster. The use of NTP, which runs as a daemon process on each system, ensures that the system time on all nodes is consistent, resulting in consistent timestamps in log files and consistent behavior of message services. This ensures that applications running in the cluster are correctly synchronized. The NTP services daemon, xntpd, should be running on all nodes before you begin cluster configuration. The NTP configuration file is /etc/ntp.conf.

For information about configuring NTP services, refer to the chapter "Configuring NTP," in the HP-UX manual, Installation and Administration of Internet Services.

Preparing for Changes in Cluster Size

If you intend to add nodes to the cluster online, while it is running, ensure that they are connected to the same heartbeat subnets and to the same lock disks as the other cluster nodes. In selecting a cluster lock configuration, be careful to anticipate any potential need for additional cluster nodes. Remember that a cluster of more than four nodes may not use a lock disk, but a two-node cluster must use a cluster lock. Thus, if you will eventually need five nodes, you should build an initial configuration that uses a quorum server.

If you intend to remove a node from the cluster configuration while the cluster is running, ensure that the resulting cluster configuration will still conform to the rules for cluster locks described above.
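
The two lock rules stated in this section can be written down as a quick planning check to run against the node count you expect after a change. This is a sketch of only those two rules: lock_disk_rule is a hypothetical helper name, and for other node counts it reports that this section states no constraint.

```shell
# Hypothetical helper: report which cluster lock rule from this section
# applies to a planned node count.
lock_disk_rule() {
    nodes=$1
    if [ "$nodes" -eq 2 ]; then
        echo "cluster lock required"
    elif [ "$nodes" -gt 4 ]; then
        echo "lock disk not allowed"
    else
        echo "no constraint stated here"
    fi
}

# Example: check the size the cluster will have after adding a fifth node.
#   lock_disk_rule 5
```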

To facilitate moving nodes in and out of the cluster configuration, you can use SCSI cables with inline terminators, which allow a node to be removed from the bus without breaking SCSI termination. See the section "Online Hardware Maintenance with In-line SCSI Terminator" in the "Troubleshooting" chapter for more information on inline SCSI terminators.

If you are planning to add a node online, and a package will run on the new node, ensure that any existing cluster-bound volume groups for the package have been imported to the new node. Also, ensure that the MAX_CONFIGURED_PACKAGES parameter is set high enough to accommodate the total number of packages you will be using.

© Hewlett-Packard Development Company, L.P.