Planning for packages involves assembling information about
each group of highly available services. Some of this information
is used in creating the package configuration file, and some is
used for editing the package control script.

NOTE: Volume groups that are to be activated by packages must
also be defined as cluster aware in the cluster configuration file.
See the previous section, "Cluster Configuration Planning."

Logical Volume and Filesystem Planning

You may need to use logical volumes in volume groups as part
of the infrastructure for package operations on a cluster. When
the package moves from one node to another, it must be able to access
data residing on the same disk as on the previous node. This is
accomplished by activating the volume group and mounting the file
system that resides on it.

In MC/LockManager, high availability applications, services, and
data are located in volume groups that are on a shared bus. When
a node fails, the volume groups containing the applications, services,
and data of the failed node are deactivated on the failed node and
activated on the adoptive node. In order to do this, you have to
configure the volume groups so that they can be transferred from
the failed node to the adoptive node.

As part of planning, you need to decide the following:

  - What volume groups are needed?
  - How much disk space is required, and how should this be allocated in logical volumes?
  - What file systems need to be mounted for each package?
  - Which nodes need to import which logical volume configurations?
  - If a package moves to an adoptive node, what effect will its presence have on performance?

Create a list by package of volume groups, logical volumes,
and file systems. Indicate which nodes need to have access to common filesystems
at different times.

It is recommended that you use customized logical volume names
that are different from the default logical volume names (lvol1,
lvol2, etc.). Choosing logical volume names that represent the high
availability applications that they are associated with (for example, lvoldatabase) will simplify cluster administration.

To further document your package-related volume groups, logical volumes,
and file systems on each node, you can add commented lines to the /etc/fstab file. The following is an
example for a database application:

    # /dev/vg01/lvoldb1 /applic1 vxfs defaults 0 1     # These six entries are
    # /dev/vg01/lvoldb2 /applic2 vxfs defaults 0 1     # for information purposes
    # /dev/vg01/lvoldb3 raw_tables ignore ignore 0 0   # only. They record the
    # /dev/vg01/lvoldb4 /general vxfs defaults 0 2     # logical volumes that
    # /dev/vg01/lvoldb5 raw_free ignore ignore 0 0     # exist for MC/LockManager's
    # /dev/vg01/lvoldb6 raw_free ignore ignore 0 0     # HA package. Do not uncomment.

Create an entry for each logical volume, indicating its use
for a file system or for a raw device.

CAUTION: Do not use /etc/fstab to mount
file systems that are used by MC/LockManager packages.

Details about creating, exporting, and importing volume groups
in MC/LockManager are given in Chapter 5, "Building an OPS Cluster Configuration."

Monitoring Registered Package Resources

MC/LockManager has access to a registry of resources that can
be monitored as package dependencies. The registry is the core of
the Event Monitoring Service (EMS). Once an EMS registered resource
is configured as a package dependency, MC/LockManager can fail the package
over to another node based on the messages that the resource's monitor returns.
Monitors for individual resources may be provided by hardware or
software vendors from time to time. A specific group of HA EMS monitors
for disk, LAN, and system status information is available from HP
as a separate product. Refer to the manual Using High Availability Monitors (B5736-90012) for
additional information.

You can specify a registered resource for a package by selecting
it from the list of available resources displayed in the SAM package configuration
area. The size of the list displayed by SAM depends on which resource
monitors have been registered on your system. Alternatively, you
can obtain information about registered resources on your system
by using the command /opt/resmon/bin/resls. For additional information, refer to the man
page for resls(1m).

Choosing Switching and Failover Behavior

Switching IP addresses from a failed LAN card to a standby
LAN card on the same physical subnet may take place if Automatic
Switching is set to Enabled in SAM (NET_SWITCHING_ENABLED set to YES in the ASCII package configuration
file). Automatic Switching Enabled is the default.

To determine failover behavior, you can define a package failover
policy that governs which nodes will automatically start up a package
that is not running. In addition, you can define a failback policy
that determines whether a package will be automatically returned
to its primary node when that is possible.

The following table describes different types of failover
behavior and the settings in SAM or in the ASCII package configuration
file that determine each behavior.

Table 4-2 Package Failover Behavior

| Switching Behavior | Options in SAM | Parameters in ASCII File |
|---|---|---|
| Package IP address switches to standby LAN card transparently on LAN card failure | Automatic Switching set to Enabled for the package (Default) | NET_SWITCHING_ENABLED set to YES for the package (Default) |
| Package switches normally after detection of failure or report of an EMS monitor event showing that a resource on which the package depends is down. Halt script runs before switch takes place (default behavior) | Package Failfast set to Disabled (Default). Service Failfast set to Disabled for all services (Default). Automatic Switching set to Enabled for the package (Default). | NODE_FAIL_FAST_ENABLED set to NO (Default). SERVICE_FAIL_FAST_ENABLED set to NO for all services (Default). PKG_SWITCHING_ENABLED set to YES for the package (Default). |
| Package fails over to the node with the fewest active packages | Failover policy set to Minimum Package Node | FAILOVER_POLICY set to MIN_PACKAGE_NODE |
| Package fails over to the node that is next on the list of nodes (default behavior) | Failover policy set to Configured Node | FAILOVER_POLICY set to CONFIGURED_NODE |
| Package is automatically halted and restarted on its primary node if the primary node is available and the package is running on a non-primary node | Failback policy set to Automatic | FAILBACK_POLICY set to AUTOMATIC |
| If desired, package must be manually returned to its primary node if it is running on a non-primary node | Failback policy set to Manual | FAILBACK_POLICY set to MANUAL |
| All packages switch following a TOC (Transfer of Control, an immediate halt without a graceful shutdown) on the node when a specific service fails. Halt scripts are not run. | Package Failfast set to Disabled. Service Failfast set to Enabled for a specific service. Automatic Switching set to Enabled for all packages. | NODE_FAIL_FAST_ENABLED set to NO. SERVICE_FAIL_FAST_ENABLED set to YES for a specific service. PKG_SWITCHING_ENABLED set to YES for all packages. |
| All packages switch following a TOC on the node when any service fails. | Package Failfast set to Disabled. Service Failfast set to Enabled for all services. Automatic Switching set to Enabled for all packages. | NODE_FAIL_FAST_ENABLED set to NO. SERVICE_FAIL_FAST_ENABLED set to YES for all services. PKG_SWITCHING_ENABLED set to YES for all packages. |
| All packages switch following a TOC on the node when the run or halt script exits with an error other than 0 or 1. This may be caused by an EMS monitor event showing that a resource is down. | Package Failfast set to Enabled. Automatic Switching set to Enabled for all packages. | NODE_FAIL_FAST_ENABLED set to YES. PKG_SWITCHING_ENABLED set to YES for all packages. |

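The two failover policies in the table can be illustrated with a short shell sketch. The function pick_node below is hypothetical and is not part of MC/LockManager; it only models the selection rule as stated here. It takes the policy name followed by the available nodes in configured order, each written as node:count, where count is the number of packages that node is currently running. This section does not say how ties are broken under MIN_PACKAGE_NODE, so the sketch assumes the earlier node in the list wins. The node name ftsys11 is an invented example.

```shell
# Hypothetical model of the FAILOVER_POLICY selection rule (not a product command).
# Usage: pick_node POLICY NODE:COUNT [NODE:COUNT ...]
pick_node() {
  policy="$1"; shift
  best=""; best_count=""
  for entry in "$@"; do
    node="${entry%%:*}"     # text before the first colon
    count="${entry##*:}"    # text after the last colon
    case "$policy" in
      CONFIGURED_NODE)
        # The first available node in configured order is chosen.
        echo "$node"; return 0 ;;
      MIN_PACKAGE_NODE)
        # Keep the node running the fewest packages; ties keep the earlier node.
        if [ -z "$best" ] || [ "$count" -lt "$best_count" ]; then
          best="$node"; best_count="$count"
        fi ;;
      *)
        echo "unknown policy: $policy" >&2; return 2 ;;
    esac
  done
  [ -n "$best" ] && echo "$best"
}

pick_node CONFIGURED_NODE ftsys10:3 ftsys11:0     # prints ftsys10
pick_node MIN_PACKAGE_NODE ftsys10:3 ftsys11:0    # prints ftsys11
```

With the same two candidate nodes, CONFIGURED_NODE honors list order regardless of load, while MIN_PACKAGE_NODE ignores list order unless the counts are equal.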
Package Configuration File Parameters

Prior to generation of the package configuration file, assemble
the following package configuration data. The parameter names given
below are the names that appear in SAM. The names coded in the ASCII package
configuration file appear at the end of each entry. The following parameters
must be identified and entered on the worksheet for each package:

- Package Name
The name of the package. The package name must be unique in
the cluster. It is used to start, stop, modify, and view the package.

The package name must not contain any of the following illegal
characters: '/', '\', and '*'. All other characters are legal.

In the ASCII package configuration file, this parameter is
known as PACKAGE_NAME.

- Package Failover Policy
The policy used by the package manager to select the
node on which to run the package whenever the package is automatically started.
The default is CONFIGURED_NODE, which means the next available node
in the list of node names for the package. The order of node name
entries dictates the order of preference when selecting the node.
The alternate policy is MIN_PACKAGE_NODE, which means the node from
the list that is running the fewest other packages at the time this
package is to be started.

In the ASCII package configuration file, this parameter is
known as FAILOVER_POLICY.

- Failback Policy
The policy used to determine what action the package manager
should take if the package is not running on its primary node and
its primary node is capable of running the package. The default
is MANUAL, which means no attempt will be made to move the package
back to its primary node when it is running on an alternate node.
The alternate policy is AUTOMATIC, which means that the package
will be halted and restarted on its primary node as soon as the primary
node is capable of running the package and, if MIN_PACKAGE_NODE
is the Package Failover Policy, is running fewer packages than the
current node.

In the ASCII package configuration file, this parameter is
known as FAILBACK_POLICY.

- Node Name
The names of primary and alternate nodes for the package,
e.g., ftsys9 and ftsys10. The order in which you specify the node
names is important. First list the primary node name, then the first
adoptive node name, then the second adoptive node name, followed,
in order, by additional node names. Ownership of a package may be
transferred to the next adoptive node name listed in the package
configuration file.

In the ASCII package configuration file, this parameter is
known as NODE_NAME.

- Control Script Pathname
Enter the full pathname of the package control script. (The script must reside
in a directory that contains the string "cmcluster".) It is recommended that you use the
same script as both the run and halt script. This script will contain
both your package run instructions and your package halt instructions.
When the package starts, its run script is executed and passed the
parameter 'start'; similarly, at package halt time, the halt script
is executed and passed the parameter 'stop'.

In the ASCII package configuration file, this parameter maps
to the two separate parameters named RUN_SCRIPT and HALT_SCRIPT. Use the name of the single control script as
the name of the RUN_SCRIPT and the HALT_SCRIPT in the ASCII file.

If you wish to separate the package run instructions and package
halt instructions into separate scripts, the package configuration
file allows you to do this by naming two separate scripts. However,
under most conditions, it is simpler to combine your run and halt instructions
into a single package control script and repeat its name for both
the RUN_SCRIPT and the HALT_SCRIPT. Ensure that the script is executable.

NOTE: If you choose to write separate package run and halt
scripts, be sure to include identical configuration information
(such as node names, IP addresses, etc.) in both scripts.

- Run Script Timeout and Halt Script Timeout
Enter a number of seconds. If the script has not completed
by the specified timeout value, MC/LockManager will terminate the script.
The default is 0, or no timeout. If the timeout is exceeded:

  - Control of the package will not be transferred.
  - The run or halt instructions will not be run.
  - Global switching will be disabled.
  - The current node will be disabled from running the package.
  - The control script will exit with status 1.

In the ASCII package configuration file, these parameters are
called RUN_SCRIPT_TIMEOUT and HALT_SCRIPT_TIMEOUT. The default for both is 0 or NO_TIMEOUT. In the ASCII file, these parameters are entered
in microseconds.

If the halt script timeout occurs, you may need to perform
manual cleanup. See the section "Reviewing the Package Control Script" in
Chapter 8, "Troubleshooting Your Cluster."

- Service Name
Enter a unique name for each service. You can configure a maximum of 30 services
per package.

In the ASCII package configuration file, this parameter is
called SERVICE_NAME. Define one SERVICE_NAME entry for each service.

- Service Fail Fast
Enter Enabled or Disabled for each service. This parameter indicates whether or
not the failure of a service results in the failure of a node. If
the parameter is set to Enabled, in the event of a service failure, MC/LockManager will
halt the node on which the service is running with a TOC. The default
is Disabled.

In the ASCII package configuration file, this parameter is SERVICE_FAIL_FAST_ENABLED, and possible values are YES and NO. The default
is NO. Define one SERVICE_FAIL_FAST_ENABLED entry for each service.

The service name must not contain any of the following illegal
characters: '/', '\', and '*'. All other characters are legal.

- Service Halt Timeout
In the event of a service halt, MC/LockManager will first send out a SIGTERM signal
to terminate the service. If the process is not terminated, MC/LockManager will
wait for the specified timeout before sending out the SIGKILL signal
to force process termination. Default is 300 seconds (5 minutes).

In the ASCII package configuration file, this parameter is SERVICE_HALT_TIMEOUT. Define one SERVICE_HALT_TIMEOUT entry for each service.

- Subnet
Enter the IP subnets that are to be monitored for the package.

In the ASCII package configuration file, this parameter is
called SUBNET.

- Resource Name
The name of a resource that is to be monitored by MC/LockManager as
a package dependency. A resource name is the name of an important
attribute of a particular system resource. The resource name includes
the entire hierarchy of resource class and subclass within which
the resource exists on a system.

In the ASCII package configuration file, this parameter is
called RESOURCE_NAME. Obtain the resource name from the list provided
in SAM, or obtain it from the documentation supplied with the resource
monitor.

A maximum of 60 resources may be defined per cluster. Note
also the limit on Resource Up Values described below.

- Resource Polling Interval
The interval at which an additional package resource is monitored. The default is 60
seconds.

In the ASCII package configuration file, this parameter is
called RESOURCE_POLLING_INTERVAL. The Resource Polling Interval appears on the
list provided in SAM, or you can obtain it from the documentation
supplied with the resource monitor.

- Resource Up Value
The criteria for judging whether an additional package resource has failed or
not.

In the ASCII package configuration file, this parameter is
called RESOURCE_UP_VALUE. The Resource Up Value appears on the list provided
in SAM, or you can obtain it from the documentation supplied with
the resource monitor.

You can configure a total of 15 Resource Up Values per package.
For example, if there is only one resource in the package, then
a maximum of 15 Resource Up Values can be defined. If there are
two Resource Names defined and one of them has 10 Resource Up Values, then
the other Resource Name can have only 5 Resource Up Values.

- Automatic Switching
Enter Enabled or Disabled. The default is Enabled, which allows
a package to start up normally on a cluster node. In the event of
a failure, a value of Enabled permits MC/LockManager to transfer the package
to an adoptive node. If this parameter is set to Disabled, the package
will not start up automatically when the cluster starts running.

In the ASCII package configuration file, this parameter is
called PKG_SWITCHING_ENABLED, and possible values are YES and NO. The default
is YES. If this parameter is set to NO, the package will not start
up automatically when the cluster starts running.

- Local Switching
Enter Enabled or Disabled. In the event of a failure, this permits MC/LockManager to
switch LANs locally, that is, transfer to a standby LAN card. The
default is Enabled.

In the ASCII package configuration file, this parameter is
called NET_SWITCHING_ENABLED, and possible values are YES and NO. The default
is YES.

- Package Fail Fast Enabled
In the event of the failure of the control script itself or the failure of a
subnet or the report of an EMS monitor event showing that a resource
is down, if this parameter is set to Enabled, MC/LockManager will issue
a TOC on the node where the control script fails. The default is
Disabled.

In the ASCII package configuration file, this parameter is
called NODE_FAIL_FAST_ENABLED, and possible values are YES and NO. The default
is NO.

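To show how the worksheet values above might come together, the following shell sketch prints a package configuration fragment for a hypothetical package. Several things here are assumptions rather than facts taken from this section: the function names valid_name and print_pkg_config, the whitespace-separated "PARAMETER value" layout, the script path under /etc/cmcluster, the service name, and the subnet address. The parameter names and default values are the ones listed in the entries above, and resource entries are left out because their names depend on the monitors registered on your system.

```shell
# Sketch only: all names, paths, and addresses are illustrative.

# Succeeds only if the name avoids the illegal characters '/', '\' and '*'.
valid_name() {
  case "$1" in
    */* | *\\* | *\**) return 1 ;;
    *) return 0 ;;
  esac
}

# Print a configuration fragment for the package named in $1,
# using the default values described in this section.
print_pkg_config() {
  pkg="$1"
  valid_name "$pkg" || { echo "illegal package name: $pkg" >&2; return 1; }
  cat <<EOF
PACKAGE_NAME               $pkg
FAILOVER_POLICY            CONFIGURED_NODE
FAILBACK_POLICY            MANUAL
NODE_NAME                  ftsys9
NODE_NAME                  ftsys10
RUN_SCRIPT                 /etc/cmcluster/$pkg/control.sh
RUN_SCRIPT_TIMEOUT         NO_TIMEOUT
HALT_SCRIPT                /etc/cmcluster/$pkg/control.sh
HALT_SCRIPT_TIMEOUT        NO_TIMEOUT
SERVICE_NAME               ${pkg}_service
SERVICE_FAIL_FAST_ENABLED  NO
SERVICE_HALT_TIMEOUT       300
SUBNET                     192.10.25.0
PKG_SWITCHING_ENABLED      YES
NET_SWITCHING_ENABLED      YES
NODE_FAIL_FAST_ENABLED     NO
EOF
}

print_pkg_config pkg1
```

The primary node is listed first under NODE_NAME, and the same control script is named for both RUN_SCRIPT and HALT_SCRIPT, as recommended above.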
Package Control Script Variables

The control script that accompanies each package must also
be edited to assign values to a set of variables. The following
variables must be set:

- Volume Groups, Logical Volumes, File Systems and Mount Options
Determine the filesystems and corresponding logical volumes
within the volume groups required.

Example: pkg1 requires /dev/vg01/lvol1 mounted on /vg01

Indicate the names of volume groups that are to be activated
and deactivated, together with the logical volumes and file systems
that are to be mounted. You can also specify options that are to
be used with the HP-UX mount command.

On starting the package, the script activates
a volume group, and it may mount logical volumes onto file systems.
At halt time, the script unmounts the file systems and deactivates
each volume group. All volume groups must be accessible on each
target node.

In the ASCII package control script, these variables are arrays,
as follows: VG, LV, FS and LV_MOUNT_COUNT. For each file system (FS), you must identify a logical volume (LV). Include as many volume groups (VGs) as needed. If you are using raw files, the LV, FS, and LV_MOUNT_COUNT entries are not needed.

Only cluster aware volume groups should be specified in package
control scripts. To make a volume group cluster aware, enter it
as part of the cluster configuration. See the section above, "Cluster Configuration Planning."

- IP Addresses and SUBNETs
These are the IP addresses by which a package is mapped to a LAN card.
Indicate the IP addresses and subnets for each IP address you want
to add to an interface card. The Subnet is the IP address logically
ANDed with the subnet mask.

In the ASCII package control script, these variables are entered
in pairs. Example: IP[0]=192.10.25.12 and SUBNET[0]=192.10.25.0. (In this case the subnet mask is 255.255.255.0.)

- Service Name
Enter a unique name for each specific service within the package. All
services are monitored by MC/LockManager. The service name, service command,
and service restart parameters are entered in the package control
script in groups of three. You may specify as many service names
as you need. Each name must be unique within the cluster. The service name
is the name used by cmrunserv and cmhaltserv inside the package control script.

In the ASCII package control script, enter values into an
array known as SERVICE_NAME. Enter one service name for each service.

- Service Command
For each named service, enter a service command. This command will be
executed through the control script by means of the cmrunserv command.

In the ASCII package control script, enter values into an
array known as SERVICE_CMD. Enter one service command string for each service.

- Service Restart Parameter
Enter a number of restarts. One valid form of the parameter is -r n where n is a number of retries. A value of -r 0 indicates no retries. A value of -R indicates an infinite number of retries. The default is
0, or no restarts.

In the ASCII package control script, enter values into an
array known as SERVICE_RESTART. Enter one restart value for each service.

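The rule given under IP Addresses and SUBNETs, that the subnet is the IP address ANDed with the subnet mask, can be checked with a few lines of shell before you fill in the control script. The helper name subnet_of is hypothetical; the address and mask are the example values from the text. This is a planning aid sketched under those assumptions, not something the control script itself requires.

```shell
# Sketch only: derive a SUBNET value by ANDing an IP address with its
# subnet mask, one octet at a time.
subnet_of() {
  (
    IFS=.
    # Split both dotted-quad arguments into eight positional parameters.
    set -- $1 $2
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
  )
}

ip="192.10.25.12"       # example address from the text
mask="255.255.255.0"    # example subnet mask from the text

# Print the pair in the form in which it is entered in the control script.
echo "IP[0]=$ip"
echo "SUBNET[0]=$(subnet_of "$ip" "$mask")"
```

For the example values this prints IP[0]=192.10.25.12 followed by SUBNET[0]=192.10.25.0, matching the pair shown above.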
For information on using a DTC with MC/LockManager, see the chapter "Configuring
DTC Manager for Operation with MC/ServiceGuard" in the manual Using
the HP DTC Manager/UX.

The package control script will clean up the environment and
undo the operations in the event of an error.
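As described under Control Script Pathname, one script can serve as both RUN_SCRIPT and HALT_SCRIPT because it is passed 'start' when the package starts and 'stop' when it halts. A minimal sketch of that dispatch follows. The function names run_package, halt_package, and control are hypothetical, and the echo lines stand in for your own run and halt instructions; the layout of an actual MC/LockManager control script may differ.

```shell
# Sketch only: a combined run/halt script dispatching on its first argument.
run_package() {
  # Placeholder for the run instructions (activate volume groups,
  # mount file systems, add IP addresses, start services).
  echo "starting package"
}

halt_package() {
  # Placeholder for the halt instructions, in reverse order.
  echo "halting package"
}

control() {
  case "$1" in
    start) run_package ;;
    stop)  halt_package ;;
    *)     echo "usage: control {start|stop}" >&2; return 1 ;;
  esac
}

# In a real script the last line would pass the script's own argument on:
#   control "$1"
```

Keeping both paths in one file means the node names, IP addresses, and other settings are defined once, which avoids the duplication that the NOTE about separate scripts warns against.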