HP XC System Software : Release Notes > Chapter 9 Load Sharing Facility and Job Management Notes

SLURM and Job Management


The notes in this section apply to the Simple Linux Utility for Resource Management (SLURM). SLURM provides commands for launching, monitoring, and controlling jobs.

Refer to the HP XC System Software User's Guide for more information about using SLURM.

Error in slurm.epilog.clean Script

SLURM provides a slurm.epilog.clean script in the /opt/hptc/slurm/etc/ directory. This script is not used by default in normal operation. It is provided in case you want to configure SLURM on your XC system so that all processes belonging to a user's job on the compute nodes are terminated after the job completes.

The script contains an error: the SLURM_BIN variable is not set. If the script is enabled, this error causes it to terminate all of the user's processes on the node, even if that user has a separate job running on the same node.
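The effect of an unset variable can be illustrated with a minimal shell sketch. It assumes, hypothetically, that the script builds the path to a SLURM command from SLURM_BIN; the actual contents of slurm.epilog.clean are not reproduced here, and the helper name resolve_tool is invented for illustration.

```shell
#!/bin/sh
# Illustration only: how an unset variable changes a command path.
# resolve_tool is a hypothetical helper, not part of slurm.epilog.clean.
resolve_tool() {
    # $1 = directory prefix (may be empty), $2 = command name
    printf '%s/%s\n' "$1" "$2"
}

unset SLURM_BIN
# With SLURM_BIN unset, the prefix expands to nothing and the path starts at "/".
resolve_tool "${SLURM_BIN:-}" squeue    # prints /squeue

SLURM_BIN=/opt/hptc/slurm/bin
# With the variable set, the path points into the SLURM bin directory.
resolve_tool "${SLURM_BIN:-}" squeue    # prints /opt/hptc/slurm/bin/squeue
```

A command invoked through the first path would most likely fail. A script that treats the resulting empty output as "this user has no other jobs on the node" could then go on to terminate all of the user's processes, which matches the behavior described above.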

Follow this procedure to correct the problem:

  1. On the head node, use the text editor of your choice to edit the following file:

    /opt/hptc/slurm/etc/slurm.epilog.clean

  2. Add the following line to the file:

    SLURM_BIN=/opt/hptc/slurm/bin
    
  3. Save your changes and exit the file.

  4. Follow the procedures in the HP XC System Software Administration Guide to update the golden image and propagate the new image to all nodes.
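Before updating the golden image, it may be worth confirming that the edit took effect. The following is a minimal sketch, not an HP-supplied tool: the helper name has_slurm_bin is hypothetical, and the check only looks for an uncommented SLURM_BIN= assignment at the start of a line, so it does not validate the value or its position in the script.

```shell
#!/bin/sh
# has_slurm_bin is a hypothetical helper: it succeeds if the given file
# contains an uncommented SLURM_BIN= assignment at the start of a line.
has_slurm_bin() {
    grep -Eq '^[[:space:]]*SLURM_BIN=' "$1"
}

# Path taken from the procedure above; pass another file as $1 to override.
EPILOG="${1:-/opt/hptc/slurm/etc/slurm.epilog.clean}"

# Report only when the file exists, so the sketch is harmless elsewhere.
if [ -f "$EPILOG" ]; then
    if has_slurm_bin "$EPILOG"; then
        echo "SLURM_BIN is set in $EPILOG"
    else
        echo "SLURM_BIN is NOT set in $EPILOG"
    fi
fi
```

Running the check on the head node after step 3 gives a quick indication of whether the assignment is present before the image is propagated.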

How to Remove SLURM

The HP XC system installation process offers a choice of two types of LSF. The default choice, LSF-HPC with SLURM, requires that SLURM also be installed and configured. The other choice is standard LSF, which neither requires nor interacts with SLURM. If standard LSF is selected, SLURM should not be configured.

If SLURM has been installed and configured but is not required, use the following procedure to deactivate it:

  1. As root on the head node, shut down SLURM:

    # scontrol shutdown
    
  2. Unconfigure SLURM on the head node:

    # /opt/hptc/slurm/etc/gconfig.d/slurm_gconfig.pl gunconfigure
    # /opt/hptc/slurm/etc/nconfig.d/slurm_nconfig.pl nunconfigure

  3. Update the golden image.

  4. Propagate the new golden image to all nodes.

© 2003–2007 Hewlett-Packard Development Company, L.P.