HP XC System Software: Installation Guide > Appendix N Troubleshooting
Troubleshoot the Discovery Process
Figure N-1 provides a high-level flowchart that illustrates the processing performed by the discover command.
The remainder of this section provides troubleshooting hints to help you solve some common problems that may occur during the discovery process. The conditions are described in the paragraphs that follow. After performing the suggested corrective action for a condition, rerun the discover command.
This information applies only to HP XC systems with nodes that use Integrated Lights Out (iLO) as the console management device. If the discovery process hangs when it tries to discover the console ports (that is, those named cp-n), the iLO console management devices may not have telnet enabled. See Appendix C, which describes how to enable telnet on iLO devices.
The discovery process requires the correct MAC address of the Root Administration Switch to discover the hardware components in the system. The Root Administration Switch in an HP XC system is typically a ProCurve 2848 or a ProCurve 2824 switch. To determine which switch is the Root Administration Switch, look for the switch that has the head node's NIC plugged into port 42 (for the ProCurve 2848) or port 22 (for the ProCurve 2824). During the discovery process, do not provide the MAC address of the Root Console switch, typically a ProCurve 2650 or ProCurve 2626 switch, which may have the console port for the head node plugged into port 42.
If the correct MAC address has been provided to the discover command and the switch has not obtained its IP address, you may need to reset the ProCurve switch before running the discover command. At power-up, the ProCurve switch queries the network for an IP address using DHCP. If it receives no response, the switch continues to query for an address, but with decreasing frequency. The discover command waits a maximum of 10 minutes for the ProCurve switch to obtain an address before terminating the process. To reset the switch, press the front panel reset button, or log in to the switch through the serial cable and issue the reset command.
By default, ProCurve switches are configured to obtain their IP address using DHCP/Bootp. If a switch does not receive an address, it continues to send DHCP requests periodically, with decreasing frequency.
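The 10-minute wait described above amounts to polling with a deadline. The following Python sketch illustrates that pattern only. It is not the discover command's implementation, and the function name, the `has_address` callback, and the 5-second poll interval are hypothetical choices made for illustration.
```python
import time


def wait_for_address(has_address, timeout_s=600.0, poll_s=5.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll has_address() until it returns True or timeout_s elapses.

    has_address is a caller-supplied callable (hypothetical here) that
    reports whether the switch has obtained its IP address.
    Returns True if an address was obtained in time, otherwise False.
    """
    deadline = clock() + timeout_s  # 600 s mirrors the 10-minute limit
    while True:
        if has_address():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)
```
Because the switch sends DHCP requests less and less often, a switch that has been powered on for a long time may not send a request inside this window at all. This is one way to see why resetting the switch before rerunning the discover command helps.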
The discovery process waits up to 10 minutes for any one switch to obtain its IP address. If the discovery process times out because it cannot communicate with a switch, reset all the switches in the cluster before running the discover command again. This forces the switches to start sending DHCP requests again immediately.
The discovery process queries the ProCurve switches to obtain the MAC addresses of all console ports. The switch logs each MAC address when the console port device issues a DHCP request to get an IP address.
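For the iLO telnet condition described earlier in this section, a quick manual check is whether a console port accepts TCP connections on the telnet port (23) once it has an IP address. The following Python sketch is an illustrative helper and not part of the HP XC software. The function name is hypothetical. A successful connection shows only that something is listening on the port, not that the telnet service is fully functional.
```python
import socket


def telnet_port_open(host, port=23, timeout_s=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout_s.

    This checks reachability only. It does not log in or speak the
    telnet protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```
For example, calling `telnet_port_open("cp-n1")` would test one console port, where `cp-n1` is an assumed host name following the cp-n naming mentioned above.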
Console ports that are not configured to use DHCP are not discovered. Therefore, first verify whether the undiscovered console ports are configured to use DHCP. If all console ports are properly configured to use DHCP, verify the wiring against the descriptions in the HP XC Hardware Preparation Guide. Because of hardware limitations in the ProCurve switch, empty ports are included in the port layout.
Use the following procedure to determine why all console ports have been discovered, but some console ports have not obtained their IP addresses after a reasonable time.
In much the same way as it queries the console ports, the discovery process queries the ProCurve switches to obtain the MAC address of each node. For this to work, the nodes must be configured to boot from the network interface that is connected to the Root Administration Switch. A node that is not configured to network boot is unlikely to be discovered properly.
The discovery process uses the number of nodes as a checkpoint to determine whether it has discovered all the nodes in the system. To do so, it checks the ports on a switch for an open port. An open port does not necessarily indicate a problem; it marks where the nodes on that particular switch end. When the discovery process finds this nonresponding port, it proceeds to the next switch in the list. It continues in this way until it discovers all nodes or runs out of ports. For this reason, it is important that all nodes in the cluster be available during the discovery process.
To determine which node failed in the discovery process, examine the output of the discover command at the point where it parses the switch output to gather the node information. For example, assume that the following was displayed during the discovery process:
In this case, a node is plugged into port 6 of the Branch Root switch at address 172.20.65.3. To resolve the discovery problem, examine this node to see what actions it is taking during power-on. Is it booting from the network? Is the proper network interface plugged into the switch? After these issues are resolved, run the discover command again.
If discover encounters a port where a node is expected to be plugged in but is not found, a message similar to the following is displayed:
The most common reasons for this are:
After the discovery process is complete, a list of the nodes that were not found is displayed. Determine why the node or nodes were not found and rerun the discover command with the --replacenode= option to properly discover the node.
The discover command attempts to account for all the nodes that it expects to find. If the discover command cannot account for all nodes, the missing node or nodes are assumed to be at the end of the administration or branch switches. However, the discover command cannot determine this situation without user intervention. Therefore, you are prompted to specify which port is the last one used on the switch. As a result, the following messages are displayed:
In this example, all nodes were found except the node at port 44 on switch 172.20.65.6. Entering n creates a disabled node entry, which completes the expected number of nodes.
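The port-walking behavior described in this section can be sketched as follows. This is a simplified Python model under stated assumptions, not the discover command's actual logic. The function name and data layout are hypothetical, the MAC strings are placeholders, and the model ignores details such as the empty ports that are included in the real port layout.
```python
def walk_switches(switches, expected_nodes):
    """Walk switch ports in order and collect the nodes that respond.

    switches: ordered list of (switch_address, ports), where ports is a
    list holding a MAC string for a responding port or None for an open
    port, with port 1 first.
    Returns (found, missing_count), where found is a list of
    (switch_address, port_number, mac) tuples.
    """
    found = []
    for address, ports in switches:
        for port_number, mac in enumerate(ports, start=1):
            if len(found) == expected_nodes:
                return found, 0  # checkpoint reached: all nodes accounted for
            if mac is None:
                break  # first open port marks where nodes end on this switch
            found.append((address, port_number, mac))
    return found, expected_nodes - len(found)


# Placeholder data: the node on port 4 of the first switch sits behind
# an open port, so this simplified walk never reaches it.
example = [
    ("172.20.65.3", ["aa", "bb", None, "cc"]),
    ("172.20.65.6", ["dd", None]),
]
found, missing = walk_switches(example, expected_nodes=4)
```
In this model, a node that is powered off or not network booting looks like an open port, so every node behind it on the same switch is skipped. The `missing` count corresponds to the nodes for which disabled node entries would be created to complete the expected number of nodes.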