Configuring OPS Clusters with MC/LockManager: Chapter 3, Understanding MC/LockManager Software Components

How the Package Manager Works
Each node in the cluster runs an instance of the package manager; the package manager residing on the cluster coordinator node is known as the package coordinator. The package coordinator does the following:
The package manager on all nodes does the following:
A package starts up on an appropriate node when the cluster starts. A package failover takes place when the package coordinator starts a package on a new node, as in the following figure:

Each package is separately configured by means of a package configuration file, which can be edited manually or through SAM. This file assigns a name to the package and identifies the nodes on which the package can run, in order of priority. It also indicates whether or not switching is enabled for the package, that is, whether the package should switch to another node or not in the case of a failure. In addition, the package failover and failback policies allow the package manager to decide dynamically where to start up a package. OPS instances are configured as packages with a single node in their node list. For conventional non-OPS instance packages, there may be one or many applications in a package. Package configuration is described in detail in the chapter "Configuring Packages and Their Services."

The Package Manager selects a node for a package to run on based on the priority list included in the package configuration file together with the FAILOVER_POLICY parameter, also coded in the file or set with SAM. Failover policy governs not only failover behavior but also startup behavior for the package, including the initial startup. The two failover policies are CONFIGURED_NODE and MIN_PACKAGE_NODE.

If you use CONFIGURED_NODE as the value for the failover policy, the package will start up on the highest priority node in the node list, assuming that the node is running as a member of the cluster. When a failover occurs, the package will move to the next highest priority node in the list that is available.

If you use MIN_PACKAGE_NODE as the value for the failover policy, the package will start up on the node that is currently running the fewest other packages.
(Note that this does not mean the node with the lightest load; the only thing checked is the number of packages currently running on the node.)

Using the MIN_PACKAGE_NODE failover policy, it is possible to configure a cluster that lets you use one node as an automatic rotating standby node for the package. Consider the following package configuration for a four-node cluster. Note that all packages can run on all nodes and have the same NODE_NAME lists, though the node names appear in a different order in the package configuration files.
When the cluster starts, each package starts as shown in Figure 3-3 “Rotating Standby Configuration before Failover”. If a failure occurs, any package would fail over to the node containing the fewest running packages, as in the following, which shows a failure on node 2:

If these packages had been set up using the CONFIGURED_NODE failover policy, they would start initially as in Figure 3-3 “Rotating Standby Configuration before Failover”, but the failure of node 2 would cause the package to start on node 3:
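The difference between the two failover policies can be sketched in a few lines of Python. This is only an illustration of the selection rules described above, not MC/LockManager code: the function `select_node`, the node names, the package counts, and the tie-breaking rule (node-list order) are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Set


def select_node(
    policy: str,
    node_list: List[str],
    running_nodes: Set[str],
    package_counts: Dict[str, int],
) -> Optional[str]:
    """Return the node a package should start on, or None if none is available."""
    # Only nodes that are currently cluster members are eligible.
    candidates = [node for node in node_list if node in running_nodes]
    if not candidates:
        return None
    if policy == "CONFIGURED_NODE":
        # Highest-priority available node in the package's node list.
        return candidates[0]
    if policy == "MIN_PACKAGE_NODE":
        # Node running the fewest packages. min() keeps the first minimum,
        # so ties fall back to node-list order (an assumption of this sketch).
        return min(candidates, key=lambda node: package_counts.get(node, 0))
    raise ValueError(f"unknown failover policy: {policy}")


# Hypothetical scenario: node2 has failed and node4 was the idle standby.
pkg_b_nodes = ["node2", "node3", "node4", "node1"]  # hypothetical node list
running = {"node1", "node3", "node4"}
counts = {"node1": 1, "node3": 1, "node4": 0}

print(select_node("MIN_PACKAGE_NODE", pkg_b_nodes, running, counts))  # node4
print(select_node("CONFIGURED_NODE", pkg_b_nodes, running, counts))   # node3
```

Under these assumptions, MIN_PACKAGE_NODE sends the displaced package to the idle standby, while CONFIGURED_NODE sends it to the next node in its list even though that node already runs a package.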
The FAILBACK_POLICY parameter allows a package to return to its primary node when that node becomes available again and the package is currently running on another node. The configured primary node is the first node listed in the package's node list. Here are some examples of what can happen when the FAILBACK_POLICY is set to AUTOMATIC:
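As a rough sketch, the failback rule stated above can be written as a single predicate. The function name `should_fail_back` and its arguments are illustrative assumptions, not part of the product; any policy value other than AUTOMATIC is simply treated here as "no automatic failback".

```python
from typing import List, Set


def should_fail_back(
    failback_policy: str,
    node_list: List[str],
    current_node: str,
    running_nodes: Set[str],
) -> bool:
    """Return True if the package should move back to its primary node."""
    primary = node_list[0]  # the primary node is the first entry in the node list
    return (
        failback_policy == "AUTOMATIC"
        and current_node != primary      # package is running somewhere else
        and primary in running_nodes     # primary has rejoined the cluster
    )


# Hypothetical example: the package ran on node2 while node1 was down.
nodes = ["node1", "node2"]
print(should_fail_back("AUTOMATIC", nodes, "node2", {"node2"}))           # False
print(should_fail_back("AUTOMATIC", nodes, "node2", {"node1", "node2"}))  # True
```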
After a cluster has formed, the package manager on each node starts up packages on that node. Starting a package means running individual application services on the node where the package is running. To start a package, the package manager runs the package control script with the 'start' parameter. This script performs the following tasks:
While the package is running, services are continuously monitored. If any part of a package fails, the package halt instructions are executed as part of a recovery process. Failure may result in simple loss of the service, a restart of the service, transfer of the package to an adoptive node, or transfer of all packages to adoptive nodes, depending on the package configuration. In package transfers, MC/LockManager sends a TCP/IP packet across the heartbeat subnet to the package's adoptive node telling it to start up the package.
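To make the monitoring loop more concrete, the following Python sketch shows one way a monitor might check whether a service process is still alive and then choose a response. It is a simplified model under stated assumptions: `pid_alive`, `next_action`, and the `max_restarts` parameter are hypothetical names for this example, and the check relies on the POSIX convention that signal 0 tests for a process without delivering a signal.

```python
import os


def pid_alive(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX systems only)."""
    try:
        os.kill(pid, 0)  # signal 0: existence check only, nothing is delivered
    except ProcessLookupError:
        return False     # no such process
    except PermissionError:
        return True      # process exists but belongs to another user
    return True


def next_action(restarts_done: int, max_restarts: int, switching_enabled: bool) -> str:
    """Choose a response to a failed service (hypothetical policy model)."""
    if restarts_done < max_restarts:
        return "restart"   # try the service again on the same node
    if switching_enabled:
        return "failover"  # hand the package to an adoptive node
    return "halt"          # service is simply lost


print(pid_alive(os.getpid()))        # True: the current process exists
print(next_action(0, 2, True))       # restart
print(next_action(2, 2, True))       # failover
print(next_action(2, 2, False))      # halt
```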
The package configuration file and control script are described in detail in the chapter "Configuring Packages and Their Services."

The Service Monitor checks the PIDs of services started by the package control script. If it detects a PID failure, the package is halted. Depending upon the parameters set in the package configuration file, MC/LockManager will attempt to restart the service on the primary node, or the package will fail over to a specified adoptive node.

Basic package resources include cluster nodes, LAN interfaces, and services, which are the individual processes within an application. All of these are monitored by MC/LockManager directly. In addition, you can use the Event Monitoring Service registry through which add-on monitors can be configured. This registry allows other software components to supply monitoring of their resources for MC/LockManager. Monitors currently supplied with other software products include an OTS/9000 monitor and an ATM monitor.

If a registered resource is configured in a package, the package manager calls the resource registrar to launch an external monitor for the resource. The monitor then sends messages back to MC/LockManager, which checks to see whether the resource is available before starting the package. In addition, the package manager can fail the package to another node or take other action if the resource becomes unavailable after the package starts.

The EMS HA Monitors, available as a separate product (A5735AA), can be used to set up monitoring of disks and other resources as package dependencies. Examples of resource attributes that can be monitored using EMS include the following:
Once a monitor is configured as a package dependency, the monitor will notify the package manager if an event occurs showing that a resource is down. The package may then be failed over to an adoptive node. The EMS HA Monitors can also be used to report monitored events to a target application such as ClusterView for graphical display or for operator notification. Refer to the manual Using EMS HA Monitors (HP part number B5735-90001) for additional information.

The package manager is notified when a command is issued to shut down a package. In this case, the package control script is run with the 'stop' parameter. For example, if the system administrator chooses "Halt Package" from the "Package Administration" menu in SAM, the package manager will stop the package. Similarly, when a command is issued to halt a cluster node, the package manager will shut down all the packages running on the node, executing each package control script with the 'stop' parameter. When run with the 'stop' parameter, the control script: