Managing Serviceguard Twelfth Edition > Chapter 3 Understanding Serviceguard Software Components

How the Package Manager Works
Packages are the means by which Serviceguard starts and halts configured applications. A package is a collection of services, disk volumes and IP addresses that are managed by Serviceguard to ensure they are available. Each node in the cluster runs an instance of the package manager; the package manager residing on the cluster coordinator is known as the package coordinator. The package coordinator does the following:
The package manager on all nodes does the following:
Three different types of packages can run in the cluster. The most common is the failover package. There are also two types of special-purpose packages that run on more than one node at a time and so do not fail over: the system multi-node package, which runs on all nodes in the cluster, and the multi-node package, which can be configured to run on all or some of the nodes in the cluster. They are typically used to manage resources of certain failover packages. These packages are not for general use, and are only supported by Hewlett-Packard for specific applications.

One common system multi-node package is shipped with the Serviceguard product. It is used on systems that employ VERITAS Cluster Volume Manager (CVM) as a storage manager. This package is named VxVM-CVM-pkg for VERITAS CVM Version 3.5 and SG-CFS-pkg for VERITAS CVM Version 4.1. It runs on all nodes that are active in the cluster and provides cluster membership information to the volume manager software. This type of package is configured and used only when you employ CVM for storage management. The process of creating the system multi-node package for CVM without CFS is described in “Preparing the Cluster for Use with CVM”. The process of creating the system multi-node package for CVM with CFS is described in “Creating a Storage Infrastructure with VERITAS Cluster File System (CFS)”.

The multi-node packages are used in clusters that use the VERITAS Cluster File System (CFS) and other HP-specified applications. They can run on several nodes at a time, but need not run on all. These packages are used when creating cluster file system dependencies.

The rest of this section describes the standard failover packages. A failover package starts up on an appropriate node when the cluster starts.
A package failover takes place when the package coordinator initiates the start of a package on a new node. A package failover involves both halting the existing package (in the case of a service, network, or resource failure), and starting the new instance of the package. Failover is shown in the following figure:

Each package is separately configured. You create a failover package by using Serviceguard Manager or by editing a package ASCII configuration file template. (Detailed instructions are given in Chapter 6 “Configuring Packages and Their Services”.) Then you use the cmapplyconf command to check and apply the package to the cluster configuration database. You also create the package control script, which manages the execution of the package’s services. See “Creating the Package Control Script” for detailed information. Then the package is ready to run.

The package configuration file assigns a name to the package and includes a list of the nodes on which the package can run. Failover packages list the nodes in order of priority (that is, the first node in the list is the highest-priority node). In addition, the configuration files for failover packages contain three parameters that determine failover behavior: AUTO_RUN, FAILOVER_POLICY, and FAILBACK_POLICY.

The AUTO_RUN parameter (known in earlier versions of Serviceguard as the PKG_SWITCHING_ENABLED parameter) defines the default global switching attribute for a failover package at cluster startup: that is, whether Serviceguard can automatically start the package when the cluster is started, and whether Serviceguard should automatically restart the package on a new node in response to a failure. Once the cluster is running, the package switching attribute of each package can be temporarily set with the cmmodpkg command; at reboot, the configured value will be restored. The parameter is coded in the package ASCII configuration file:
A package switch involves moving failover packages and their associated IP addresses to a new system. The new system must already have the same subnet configured and working properly; otherwise the packages will not be started. With package failovers, TCP connections are lost. TCP applications must reconnect to regain connectivity; this is not handled automatically. Note that if the package is dependent on multiple subnets, all of them must be available on the target node before the package will be started. If the package has a dependency, the dependency must be met on the target node before the package can start.

The switching of relocatable IP addresses is shown in Figure 3-5 “Before Package Switching” and Figure 3-6 “After Package Switching”. Figure 3-5 “Before Package Switching” shows a two-node cluster in its original state with Package 1 running on Node 1 and Package 2 running on Node 2. Users connect to the node with the IP address of the package they wish to use. Each node has a stationary IP address associated with it, and each package has an IP address associated with it.

Figure 3-6 “After Package Switching” shows the condition where Node 1 has failed and Package 1 has been transferred to Node 2. Package 1’s IP address was transferred to Node 2 along with the package. Package 1 continues to be available and is now running on Node 2. Also note that Node 2 can now access both Package 1’s disk and Package 2’s disk.

The Package Manager selects a node for a failover package to run on based on the priority list included in the package configuration file together with the FAILOVER_POLICY parameter, also in the configuration file. The failover policy governs how the package manager selects which node to run a package on when a specific node has not been identified and the package needs to be started. This applies not only to failovers but also to startup for the package, including the initial startup.
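The node selection described here can be modeled in a few lines of Python. This is only an illustrative sketch of the two FAILOVER_POLICY values this section covers (CONFIGURED_NODE and MIN_PACKAGE_NODE), not Serviceguard's actual implementation. The function name `select_node`, the node and package names, and the rule that ties under MIN_PACKAGE_NODE are broken by node-list order are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Set


def select_node(
    node_list: List[str],
    policy: str,
    up_nodes: Set[str],
    running_count: Dict[str, int],
) -> Optional[str]:
    """Pick a node for a failover package, or None if no listed node is up.

    node_list     -- the package's nodes in priority order (first = primary)
    policy        -- "CONFIGURED_NODE" or "MIN_PACKAGE_NODE"
    up_nodes      -- nodes currently running as cluster members
    running_count -- number of packages currently running on each node
    """
    # Only nodes that are both listed for the package and up are eligible.
    candidates = [n for n in node_list if n in up_nodes]
    if not candidates:
        return None
    if policy == "CONFIGURED_NODE":
        # Highest-priority available node in the list.
        return candidates[0]
    if policy == "MIN_PACKAGE_NODE":
        # Node running the fewest packages. min() keeps the first minimum,
        # so ties fall back to node-list order (an assumption of this sketch).
        return min(candidates, key=lambda n: running_count.get(n, 0))
    raise ValueError(f"unknown FAILOVER_POLICY: {policy}")


# Hypothetical four-node cluster: node2 has failed, node4 is an idle standby.
pkg_b_nodes = ["node2", "node3", "node4", "node1"]
up = {"node1", "node3", "node4"}
counts = {"node1": 1, "node3": 1, "node4": 0}

print(select_node(pkg_b_nodes, "CONFIGURED_NODE", up, counts))   # node3
print(select_node(pkg_b_nodes, "MIN_PACKAGE_NODE", up, counts))  # node4
```

Note that the sketch counts packages only; like the real policy, it does not consider how heavily loaded a node is.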
The two failover policies are CONFIGURED_NODE (the default) and MIN_PACKAGE_NODE. The parameter is coded in the package ASCII configuration file:
If you use CONFIGURED_NODE as the value for the failover policy, the package will start up on the highest-priority node available in the node list. When a failover occurs, the package will move to the next-highest-priority node in the list that is available.

If you use MIN_PACKAGE_NODE as the value for the failover policy, the package will start up on the node that is currently running the fewest other packages. (Note that this does not mean the lightest load; the only thing that is checked is the number of packages currently running on the node.)

Using the MIN_PACKAGE_NODE failover policy, it is possible to configure a cluster that lets you use one node as an automatic rotating standby node for the cluster. Consider the following package configuration for a four-node cluster. Note that all packages can run on all nodes and have the same NODE_NAME lists. Although the example shows the node names in a different order for each package, this is not required.

Table 3-1 Package Configuration Data
When the cluster starts, each package starts as shown in Figure 3-7 “Rotating Standby Configuration before Failover”. If a failure occurs, any package would fail over to the node running the fewest packages, as in Figure 3-8 “Rotating Standby Configuration after Failover”, which shows a failure on node 2:
If these packages had been set up using the CONFIGURED_NODE failover policy, they would start initially as in Figure 3-7 “Rotating Standby Configuration before Failover”, but the failure of node 2 would cause the package to start on node 3, as in Figure 3-9 “CONFIGURED_NODE Policy Packages after Failover”:

If you use CONFIGURED_NODE as the value for the failover policy, the package will start up on the highest-priority node in the node list, assuming that the node is running as a member of the cluster. When a failover occurs, the package will move to the next-highest-priority node in the list that is available.

The use of the FAILBACK_POLICY parameter allows you to decide whether a package will return to its primary node if the primary node becomes available and the package is not currently running on the primary node. The configured primary node is the first node listed in the package’s node list. The two possible values for this policy are AUTOMATIC and MANUAL. The parameter is coded in the package ASCII configuration file:

# Enter the failback policy for this package. This policy will be used

As an example, consider the following four-node configuration, in which FAILOVER_POLICY is set to CONFIGURED_NODE and FAILBACK_POLICY is AUTOMATIC:

Table 3-2 Node Lists in Sample Cluster
Node 1 panics, and after the cluster reforms, pkgA starts running on node 4.

After rebooting, node 1 rejoins the cluster. At that point, pkgA will be automatically stopped on node 4 and restarted on node 1.
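The failback decision in this example can be sketched as a small Python function. This is a simplified model for illustration, not Serviceguard code. The name `failback_target` is hypothetical, and pkgA's node list (`node1`, then `node4`) is an assumption chosen to match the narrative above.

```python
from typing import List, Optional, Set


def failback_target(
    node_list: List[str],
    current_node: str,
    up_nodes: Set[str],
    failback_policy: str,
) -> Optional[str]:
    """Return the node to move the package back to, or None to leave it alone.

    node_list must be non-empty; its first entry is the primary node.
    """
    if failback_policy not in ("AUTOMATIC", "MANUAL"):
        raise ValueError(f"unknown FAILBACK_POLICY: {failback_policy}")
    primary = node_list[0]
    # Fail back only when the policy is AUTOMATIC, the package is running
    # somewhere other than its primary node, and the primary is up again.
    if (
        failback_policy == "AUTOMATIC"
        and current_node != primary
        and primary in up_nodes
    ):
        return primary
    return None


pkg_a_nodes = ["node1", "node4"]  # hypothetical node list for pkgA

# node1 is still down: pkgA stays on node4.
print(failback_target(pkg_a_nodes, "node4", {"node2", "node3", "node4"}, "AUTOMATIC"))
# prints: None

# node1 has rejoined the cluster: pkgA is moved back to node1.
print(failback_target(pkg_a_nodes, "node4", {"node1", "node2", "node3", "node4"}, "AUTOMATIC"))
# prints: node1
```

With FAILBACK_POLICY set to MANUAL, the function always returns None. In this model, that means the package stays where it is until an operator moves it.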
Combining a FAILOVER_POLICY of MIN_PACKAGE_NODE with a FAILBACK_POLICY of AUTOMATIC can result in a package running on a node where you did not expect it to run, since the node running the fewest packages will probably not be the same host every time a failover occurs.

If you are using package configuration files that were generated using a previous version of Serviceguard, we recommend you use the cmmakepkg command to open a new template, and then copy the parameter values into it. In the new template, read the descriptions and defaults of the choices that did not exist when the original configuration was made. For example, the default for FAILOVER_POLICY is now CONFIGURED_NODE and the default for FAILBACK_POLICY is now MANUAL.

In Serviceguard A.11.17 and later, you specify a package type parameter; the PACKAGE_TYPE for a traditional package is the default value, FAILOVER.

Starting with the A.11.12 version of Serviceguard, the PKG_SWITCHING_ENABLED parameter was renamed AUTO_RUN. The NET_SWITCHING_ENABLED parameter was renamed LOCAL_LAN_FAILOVER_ALLOWED.

Basic package resources include cluster nodes, LAN interfaces, and services, which are the individual processes within an application. All of these are monitored by Serviceguard directly. In addition, you can use the Event Monitoring Service registry, through which add-on monitors can be configured. This registry allows other software components to supply monitoring of their resources for Serviceguard. Monitors currently supplied with other software products include EMS (Event Monitoring Service) High Availability Monitors, and an ATM monitor.

If a registered resource is configured in a package, the package manager calls the resource registrar to launch an external monitor for the resource. Resources can be configured to start up either at the time the node enters the cluster or at the end of package startup.
The monitor then sends messages back to Serviceguard, which checks to see whether the resource is available before starting the package. In addition, the package manager can fail the package to another node or take other action if the resource becomes unavailable after the package starts.

You can specify a registered resource for a package by selecting it from the list of available resources displayed in the Serviceguard Manager Configuring Packages. The size of the list displayed by Serviceguard Manager depends on which resource monitors have been registered on your system. Alternatively, you can obtain information about registered resources on your system by using the command /opt/resmon/bin/resls. For additional information, refer to the man page for resls(1m).

The EMS (Event Monitoring Service) HA Monitors, available as a separate product (B5736DA), can be used to set up monitoring of disks and other resources as package dependencies. Examples of resource attributes that can be monitored using EMS include the following:
Once a monitor is configured as a package dependency, the monitor will notify the package manager if an event occurs showing that a resource is down. The package may then be failed over to an adoptive node.

The EMS HA Monitors can also be used to report monitored events to a target application such as OpenView IT/Operations for graphical display or for operator notification. Refer to the manual Using High Availability Monitors (B5736-90022) for additional information.

To determine failover behavior, you can define a package failover policy that governs which nodes will automatically start up a package that is not running. In addition, you can define a failback policy that determines whether a package will be automatically returned to its primary node when that is possible.

The following table describes different types of failover behavior and the settings in Serviceguard Manager or in the package configuration file that determine each behavior.

Table 3-3 Package Failover Behavior