HP XC System Software: Administration Guide > Chapter 8 Monitoring the System with Nagios
Adjusting the Nagios Configuration
You can adjust Nagios by stopping the nagios service, updating a configuration file, and restarting the nagios service. This section describes the procedures for adjusting a Nagios configuration.
On large systems, Nagios can record a multitude of alerts when many nodes undergo known maintenance operations, such as restarting or shutting down the HP XC system. To avoid these alerts, shut down Nagios on the head node immediately before these maintenance operations with the following command:
To restart Nagios after a maintenance operation, use the following command:
To restart Nagios after changing its configuration, use the following command:
Improved Availability Is in Effect
If improved availability is in effect, you must restart the nagios service (that is, the Nagios master) using the system's availability tool. The following example shows how to restart the nagios service using HP Serviceguard. This example restarts the Nagios master, which is running on node n128.
Most of the following sections provide you with information on which template files to update to accomplish a given task. The nagios_vars.ini file contains most of the parameters that define the Nagios configuration. Editing this file is key for most of the configuration updates you want to perform. The HP XC System Software also features Nagios template files that define configurable parameters. As shown in Figure 8-8, the template files, the nagios_vars.ini file, and data from the configuration and management database (CMDB) are processed by a Nagios configurator to generate include files that form the basis for the configured Nagios application.
When you change the Nagios configuration, you must perform the following tasks:
By default, Nagios sends e-mail to the nagios user. The simplest way to forward e-mail alerts is to log in as the nagios user and create a .forward file in that user's home directory (usually /home/nagios) to redirect alert messages from Nagios to another e-mail account. Creating the file as the nagios user ensures that the .forward file's ownership and permissions are correct.
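As a minimal sketch of the same result done from a script, the following Python writes a .forward file and restricts write access to the owner. The function name write_forward_file, the address admin@example.com, and the temporary directory standing in for /home/nagios are illustrative assumptions, and the 0644 mode is a common convention rather than a value taken from this guide. The script is assumed to run as the nagios user so that file ownership is correct.

```python
import os
import stat
import tempfile


def write_forward_file(home_dir, address):
    """Create home_dir/.forward containing a single forwarding address."""
    path = os.path.join(home_dir, ".forward")
    with open(path, "w") as handle:
        handle.write(address + "\n")
    # Assumption: owner read/write, others read-only (0644), so the file
    # is not group- or world-writable.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IROTH)
    return path


if __name__ == "__main__":
    # A temporary directory stands in for the nagios home directory.
    with tempfile.TemporaryDirectory() as demo_home:
        created = write_forward_file(demo_home, "admin@example.com")
        with open(created) as handle:
            print(handle.read().strip())
```

On a real system you would pass the nagios user's home directory and the destination address used at your site.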
You can customize the Nagios configuration to specify whom to contact by editing the /opt/hptc/nagios/etc/contacts.cfg file. The main portion of this file is shown here:
Changing the values for email and pager to reflect your system's name enables Nagios to send notifications through the sendmail utility. For example, change nagios@localhost.localdomain to nagios@example.com.
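The address change described above can also be scripted. The sketch below rewrites the email and pager directives in contact-definition text. The sample definition is illustrative only and is not the content of the shipped contacts.cfg file, and the helper name set_contact_address is hypothetical.

```python
import re

# Illustrative contact definition (an assumption, not the shipped file).
SAMPLE = """\
define contact{
        contact_name    nagios
        email           nagios@localhost.localdomain
        pager           nagios@localhost.localdomain
        }
"""

# Matches the directive name and its padding; the value follows as \S+.
DIRECTIVE = re.compile(r"^([ \t]*(?:email|pager)[ \t]+)\S+", re.MULTILINE)


def set_contact_address(text, new_address):
    """Replace the value of every email and pager directive in text."""
    return DIRECTIVE.sub(lambda match: match.group(1) + new_address, text)


if __name__ == "__main__":
    print(set_contact_address(SAMPLE, "nagios@example.com"))
```

Other directives, such as contact_name, are left untouched because the pattern matches only lines whose directive is exactly email or pager.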
Job loads, usage patterns, process types, counts, memory, cache, disk subsystems, and so on all contribute input to Nagios. Nagios monitors sensor values against thresholds to determine whether to send an alert and, if so, whether that alert is a warning or critical.
The platform-dependent default thresholds provided in the HP XC system serve as a baseline, but depending on your site configuration and use, some of them might not be appropriate for your system. As system administrator, you need to determine the threshold values appropriate for your site and customize the Nagios configuration accordingly.
The /opt/hptc/nagios/etc/nagios_vars.ini file defines various constants and variables used throughout the HP XC system's plug-ins and the Nagios configurations. You can edit this file to customize the Nagios thresholds; changing these values changes when Nagios alerts you that a subsystem has reached a threshold.
The nagios_vars.ini file also contains variables that are commented out. Examine the content of the file to determine if those variables are appropriate for your system. If so, remove the comment characters accordingly. This portion of the nagios_vars.ini file is an example:
If you change the nagios_vars.ini file, be sure to propagate the file to the appropriate nodes, usually the management hubs, on your system; see Chapter 10 for more information. “Updating the Nagios Configuration” describes the overall procedure for updating the Nagios configuration.
Table 8-1 displays the default collection intervals for the Supermon Metrics Monitor service. The Supermon Metrics Monitor schedules and collects individual metrics at a specified interval. You can change an interval, but the new interval must be a multiple of the time specified by the value of the normal_check_interval parameter defined in the /opt/hptc/nagios/etc/templates/nagios_template.cfg or /opt/hptc/nagios/etc/templates/nagios_monitor.cfg template file.
Table 8-1 Supermon Metrics Collection Intervals
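As a quick check of the rule that a collection interval must be a multiple of normal_check_interval, the following sketch validates a proposed interval and suggests the next valid value. The function names and the sample numbers are illustrative assumptions, not defaults from Table 8-1, and both arguments are assumed to be expressed in the same time unit.

```python
def is_valid_interval(interval, normal_check_interval):
    """Return True if interval is a positive whole multiple of normal_check_interval."""
    if interval <= 0 or normal_check_interval <= 0:
        return False
    return interval % normal_check_interval == 0


def round_up_to_multiple(interval, normal_check_interval):
    """Return the smallest valid interval that is at least the requested one."""
    if normal_check_interval <= 0:
        raise ValueError("normal_check_interval must be positive")
    # Ceiling division on integers; always use at least one multiple.
    multiples = max(1, -(-interval // normal_check_interval))
    return multiples * normal_check_interval


if __name__ == "__main__":
    # Hypothetical values: a requested interval of 12 with a base of 5.
    print(is_valid_interval(12, 5))
    print(round_up_to_multiple(12, 5))
```

Rounding up rather than down is a design choice in this sketch: it never collects a metric more often than requested.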
The master Nagios configuration file, nagios.cfg, has a number of global settings that control overall behavior. One of these is the service_check_timeout interval. Nagios limits the execution time of plug-ins to this interval. If a plug-in is still running when the interval expires, Nagios terminates the plug-in and shows the result as a Service check timeout error. For systems with fewer than 256 nodes, the default value of 180 seconds should be adequate. However, warning or critical messages can occur if the service_check_timeout interval ends before the metrics gathering is complete. If your system has more nodes, consider increasing the value for the service_check_timeout parameter.
Often the Nagios user name and user ID are established during the initial system configuration, that is, when the cluster_config utility is run. If a Nagios user name is found at that time, the HP XC system uses that user name and user ID instead of creating the default user name and user ID. However, you can configure the HP XC system to use an alternate nagios user and group account. Use the following procedure to change the default Nagios user name.
All the Nagios plug-ins developed for the HP XC system are enabled by default. However, you can modify the /opt/hptc/nagios/etc/templates/*_template.cfg files to customize the service checks as needed.
Use the following procedure to disable a specific Nagios plug-in:
Update the golden image with the Nagios template file to ensure a permanent change. See Chapter 10 for more information.