Configuring OPS Clusters with MC/LockManager: Chapter 8 Troubleshooting Your Cluster

Solving Package Problems
Problems with packages fall into three categories:

- System administration errors
- Package control script errors
- Node and network failures
The first two categories of problems result from incorrect configuration of MC/LockManager. The last category consists of "normal" failures, to which MC/LockManager is designed to react in order to keep the packages containing your applications available.

Some errors that you can make when configuring MC/LockManager do not show up when you start the cluster. The cluster can be running, and everything can appear to be fine, until a hardware or software failure occurs and control of your packages is not transferred to another node as you expected. These errors are caused specifically by mistakes in the cluster configuration file and package configuration scripts. Examples of these errors include:
You can use the following commands to check the status of your disks:
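As a sketch of such a check, the function below runs a few standard HP-UX disk and LVM status commands. The choice of `bdf`, `mount`, and `vgdisplay -v` is an assumption made for illustration, and `run_check` and `check_disks` are hypothetical helper names. Each command runs only if it is installed on the node.

```shell
#!/bin/sh
# Run a status command only if it exists on this system.
# (run_check is a hypothetical helper, not part of MC/LockManager.)
run_check() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "== $*"
        "$@"
    else
        echo "== $1: not installed, skipped"
    fi
}

# Assumed set of disk checks; adjust to the commands you actually rely on.
check_disks() {
    run_check bdf            # HP-UX: mounted file systems and free space
    run_check mount          # currently mounted file systems
    run_check vgdisplay -v   # LVM volume groups with their logical and physical volumes
}
```

Run `check_disks` on each node and compare the results.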
Package control script errors are similar to system administration errors, except that they are caused specifically by errors in the package control script. The best way to prevent them is to test your package control script before putting your high availability application on line. Running the script with the -x shell option prints each command as it executes, which helps you locate where the script is failing.

Node and network failures cause MC/LockManager to transfer control of a package to another node. This is the normal action of MC/LockManager, but you need to recognize when a transfer has taken place and decide whether to leave the cluster in its current condition or restore it to its original condition. Possible node failures can be caused by the following conditions:
In the event of a TOC (Transfer of Control), a system dump is performed on the failed node and numerous messages are displayed on the console.

You can use the following commands to check the status of your network and subnets:
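One possible way to run such checks is to record the output of network status commands in a file, so that it can be reviewed later alongside the log files. This is only a sketch: the use of `netstat -in` and the HP-UX `lanscan` command is an assumption made for illustration, and `snapshot_network` is a hypothetical function name.

```shell
#!/bin/sh
# Write the output of network status commands to the file named in $1.
# (snapshot_network is a hypothetical helper; the command list is assumed.)
snapshot_network() {
    out=$1
    : > "$out"                          # create or truncate the output file
    for tool in netstat lanscan; do
        if command -v "$tool" >/dev/null 2>&1; then
            case $tool in
                netstat) args="-in" ;;  # interfaces, addresses, and error counts
                lanscan) args="" ;;     # HP-UX: LAN hardware and interface state
            esac
            echo "== $tool $args" >> "$out"
            # Record a failing command instead of aborting the snapshot.
            "$tool" $args >> "$out" 2>&1 ||
                echo "== $tool exited with status $?" >> "$out"
        else
            echo "== $tool: not installed, skipped" >> "$out"
        fi
    done
}
```

For example, `snapshot_network /tmp/net.before` could be run while the cluster is healthy and again after a package has moved.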
Because every cluster is unique, there are no cookbook solutions to possible problems. However, if you apply these checks and commands and work your way through the log files, you should be able to identify and solve problems.
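The -x shell option mentioned earlier for testing a package control script can be tried on any shell script. The sketch below assumes a POSIX shell and uses a small stand-in script; the file name and its contents (including `vg01`) are hypothetical, and you would substitute your own package control script and arguments.

```shell
#!/bin/sh
# Create a small stand-in script (hypothetical contents).
script=/tmp/xtrace_demo.$$.sh
cat > "$script" <<'EOF'
vg=vg01
echo "activating $vg"
false
echo "exit status of previous command: $?"
EOF

# With -x the shell prints each command, after expansion, before running it.
# The trace is written to standard error, so capture that stream and
# discard the script's normal output.
trace=$(sh -x "$script" 2>&1 >/dev/null)
printf '%s\n' "$trace"

rm -f "$script"
```

Each trace line starts with `+`, so the last traced command before a failure shows where the script stopped behaving as expected.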