HP XC System Software: Installation Guide > Chapter 12 Troubleshooting

Troubleshooting the Cluster Configuration Process


The following list provides hints to troubleshoot problems you might encounter during the initial configuration of the system with the cluster_config utility:

  • Use the following command to view the node role assignments:

    # shownode config | more

  • The results of successful and unsuccessful services configuration are logged in the /var/log/nconfig.log file on each node.

  • If you see the following message, either DNS is not functioning or the IP address of the external connection on the head node is not available from DNS:

    gethostbyaddr failure

    To resolve this problem, edit the /etc/resolv.conf file and fix incorrect DNS entries.

  • If an essential service fails during the configuration phase, the node is put into single-user mode and marked as disabled in the database.
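The DNS check described in the list above can be partly scripted. The following is a minimal sketch and is not part of the HP XC tooling: the function names list_nameservers and check_reverse_lookup are hypothetical. Note that getent consults every source listed in /etc/nsswitch.conf, so a successful lookup may come from /etc/hosts rather than from DNS.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting name resolution on the head node.

# Print the nameserver addresses configured in a resolv.conf-style file.
# The file argument defaults to /etc/resolv.conf.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Report whether an IP address resolves back to a host name.
# getent uses the sources in /etc/nsswitch.conf, so a match can come
# from /etc/hosts as well as from DNS.
check_reverse_lookup() {
    if getent hosts "$1" >/dev/null 2>&1; then
        echo "ok: $1 resolves"
    else
        echo "FAILED: $1 does not resolve"
        return 1
    fi
}
```

Run list_nameservers with no argument to print the configured servers, then pass the external IP address of the head node to check_reverse_lookup. If the lookup fails, review the entries that list_nameservers printed.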

lsadmin limrestart Command Fails

“Task 18: Finalize the Configuration of Compute Resources” describes LSF postconfiguration tasks. The lsadmin limrestart command can fail if the LSF control node was assigned to the wrong node name. If the command fails, messages similar to the following are displayed:

[root@blc2n1 ~]# lsadmin limrestart 
Checking configuration files ... 
There are fatal errors.
Do you want to see the detailed messages? [y/n] y 
Checking configuration files ... 
Platform LSF 6.2 for SLURM, May 15 2006 
Copyright 1992-2005 Platform Computing Corporation 
Reading configuration from /opt/hptc/lsf/top/conf/lsf.conf 
Dec 20 21:00:38 2006 11220 5 6.2 /opt/hptc/lsf/top/6.2/linux2.6-glibc2.3-x86_64-slurm/etc/lim -C
Dec 20 21:00:38 2006 11220 7 6.2 setMyClusterName: searching cluster files ...
Dec 20 21:00:38 2006 11220 7 6.2 setMyClusterName: Local host blc2n1 not defined in cluster file /opt/hptc/lsf/top/conf/lsf.cluster.hptclsf
Dec 20 21:00:38 2006 11220 3 6.2 setMyClusterName(): unable to find the cluster file containing local host blc2n1
Dec 20 21:00:38 2006 11220 3 6.2 setMyClusterName: Above fatal error(s) found.
--------------------------------------------------------- 
There are fatal errors. 

To correct this problem, enter the following commands on the head node, where control_nodename is the name of the node that is to be the LSF control node:

# controllsf stop
# controllsf set primary control_nodename
# controllsf start
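The three commands above can be wrapped so that the sequence stops at the first failure. This is only a sketch: the function name reset_lsf_control_node and the DRY_RUN variable are hypothetical conveniences, and the wrapper assumes that each controllsf command returns a nonzero exit status when it fails.

```shell
#!/bin/sh
# Hypothetical wrapper around the controllsf sequence shown above.
# Set DRY_RUN=1 to print the commands instead of running them.
reset_lsf_control_node() {
    node=$1
    if [ -z "$node" ]; then
        echo "usage: reset_lsf_control_node control_nodename" >&2
        return 2
    fi

    # When DRY_RUN=1, prefix each command with echo so nothing is executed.
    run=""
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run="echo"
    fi

    # Each step runs only if the previous one succeeded.
    $run controllsf stop &&
    $run controllsf set primary "$node" &&
    $run controllsf start
}
```

Running the function with DRY_RUN=1 first shows exactly which commands would be issued for the chosen node name. Afterward, rerun lsadmin limrestart to confirm that the fatal errors are gone.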

Cannot Connect to Database During Configuration

At times, especially during the initial configuration or reconfiguration of the system, you might see the following message:

Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock'

If you see that message, perform the following steps to restart the database and resolve the problem:

  1. As root on the head node, restart the database:

    # service mysqld restart

    This command may report that it fails to either stop or restart the MySQL processes. If so, continue with the remainder of this procedure.

  2. Enter the following command to find MySQL processes:

    # ps -eaf | grep mysql

    Three processes should be listed: grep, mysqld_safe, and mysqld. If you do not see mysqld_safe and mysqld, proceed to step 4.

  3. Use the process ID (PID) of /usr/libexec/mysqld (the number just after the process owner name) to kill mysqld manually. If the mysqld process is not listed, but there is a mysqld_safe process, use that PID instead.

    # kill mysqld_PID

    This command should stop both mysqld and mysqld_safe.

  4. Restart the mysqld service:

    # service mysqld restart

    The command you were trying to initiate should now be able to connect to the database.
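The procedure above can be sketched as a small script. The function names pick_mysql_pid and restart_mysql are hypothetical, and the sketch rests on several assumptions: that ps -eaf prints the command starting in column 8, that the service command returns a nonzero exit status when a restart fails, and that a two-second pause is long enough for the killed processes to exit.

```shell
#!/bin/sh
# Hypothetical helpers that follow the manual procedure above.

# Read "ps -eaf" output on standard input and print the PID to kill:
# the mysqld PID if that process is listed, otherwise the mysqld_safe PID.
# Returns nonzero if neither process is listed.
pick_mysql_pid() {
    awk '
        $8 == "mysqld" || $8 ~ /\/mysqld$/          { mysqld = $2 }
        $8 ~ /mysqld_safe$/ || $9 ~ /mysqld_safe$/  { safe = $2 }
        END {
            if (mysqld != "")    print mysqld
            else if (safe != "") print safe
            else                 exit 1
        }
    '
}

# Run as root on the head node.
restart_mysql() {
    # Step 1: try a normal restart; stop here if it succeeds.
    if service mysqld restart; then
        return 0
    fi
    # Steps 2 and 3: find a leftover MySQL process and kill it.
    if pid=$(ps -eaf | pick_mysql_pid); then
        kill "$pid"
        sleep 2   # arbitrary pause (an assumption) to let the processes exit
    fi
    # Step 4: restart the service.
    service mysqld restart
}
```

Only restart_mysql changes the system. pick_mysql_pid can be tried safely on its own by piping ps -eaf into it and comparing the result with the PID you would have chosen by hand.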

© 2003 Hewlett-Packard Development Company, L.P.