HP XC System Software : Installation Guide > Chapter 4 Configuring and Imaging the System

Task 7: Respond to Configuration Questions


Output from the cluster_config utility is shown in this section. Depending upon how you want your system configured, respond to the following questions as services are configured on the head node:

  1. When you are prompted to enter the number of NFS daemons required on your system, accept the default value.

    Configuring system wide functions / policies / behaviors
    Executing C02ssh_config sconfigure
    Executing C10cluster_fstab sconfigure
    Executing C20sysparams sconfigure
    NFS daemon tuning:
    Given that there are 6 nodes in this cluster, enter the number of
    NFS daemons that shall be configured to support them [8] : Enter
    

    The default scales with the number of nodes in the system. It represents the number of NFS daemons that the head node needs to adequately serve the /hptc_cluster file system. Table 4-5 lists the default values.

    Table 4-5 Number of NFS Daemons Based on System Size

    Number of Nodes    Number of NFS Daemons
    8                  8
    128                16
    256                32
    512                64
    768                96
    1024 or more       128
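
The lookup in Table 4-5 can be sketched as a small function. This is only an illustration: the table does not say how sizes between two rows are handled, so the sketch assumes a size takes the value of the next larger row (which agrees with the 6-node sample output above offering a default of 8). The name default_nfs_daemons is hypothetical and is not part of cluster_config.

```python
# Rows of Table 4-5: (number of nodes, default number of NFS daemons).
NFS_DAEMON_TABLE = [(8, 8), (128, 16), (256, 32), (512, 64), (768, 96), (1024, 128)]


def default_nfs_daemons(node_count: int) -> int:
    """Return the assumed default NFS daemon count for a cluster size.

    Assumption (not stated in the guide): a size that falls between two
    table rows takes the value of the next larger row.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    for nodes, daemons in NFS_DAEMON_TABLE:
        if node_count <= nodes:
            return daemons
    # "1024 or more" nodes: the last row applies.
    return NFS_DAEMON_TABLE[-1][1]


if __name__ == "__main__":
    for size in (6, 128, 2048):
        print(size, default_nfs_daemons(size))
```

If the utility proposes a different default on your system, trust the value shown at the prompt rather than this sketch.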

     

  2. Specify the Network Time Protocol (NTP) server. The head node is automatically configured as the system's NTP server if another server is not specified, but you have the option to provide up to four external NTP servers instead.

    If your HP XC system will be integrated with HP StorageWorks Scalable File Share (HP SFS), the XC and HP SFS systems must be synchronized to a common time server. Therefore, do not take the default response; instead, enter the same external time server that will be used for the HP SFS system.

    Executing C75mpiic sconfigure
    Configuring service specific functions
    Executing C05pdsh gconfigure
    Executing C08ntp gconfigure
    Configuring the following nodes as ntp servers for the cluster:        
             n16
    
    You must now specify the clock source for the server nodes.  
    If the nodes have external connections, you may specify up 
    to 4 external NTP servers.  Otherwise, you must use the node's 
    system clock.
    Enter the IP address or host name of the first external NTP server
    or leave blank to use the system clock on the NTP server node: Enter
    Renaming previous /etc/ntp.conf to /etc/ntp.conf.bak 
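
The choice made at this prompt can be pictured as follows. This is a minimal sketch, not the logic of the C08ntp script, and the guide does not show the /etc/ntp.conf that cluster_config actually writes. It assumes standard ntpd syntax: one server line per external server, or the local-clock pseudo-address 127.127.1.0 when no external server is given. The function name and the example host names are hypothetical.

```python
LOCAL_CLOCK = "127.127.1.0"  # ntpd pseudo-address for the local system clock
MAX_EXTERNAL_SERVERS = 4     # limit stated in the cluster_config prompt


def ntp_server_lines(external_servers):
    """Build ntp.conf-style server lines for the NTP server node.

    Blank entries are ignored, mirroring "leave blank to use the
    system clock" in the prompt.
    """
    servers = [s.strip() for s in external_servers if s.strip()]
    if len(servers) > MAX_EXTERNAL_SERVERS:
        raise ValueError("at most 4 external NTP servers may be specified")
    if not servers:
        # No external source: fall back to the node's own clock.
        return [f"server {LOCAL_CLOCK}", f"fudge {LOCAL_CLOCK} stratum 10"]
    return [f"server {s}" for s in servers]


if __name__ == "__main__":
    # Hypothetical time server shared with an HP SFS system.
    print("\n".join(ntp_server_lines(["ntp1.example.com"])))
    print("\n".join(ntp_server_lines([])))
```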

  3. Supply the network type if your system has a QsNetII interconnect; the possible choices for your system are displayed, and a default is provided:

    Enter the network type of your system.
    Valid choices are QMS16 or QMS32: [QMS32]:  Enter

    The network type reflects the maximum number of ports the fabric topology can support. See Appendix G for information about how to determine the QsNetII network type for your system.

  4. Supply the name of the LVS alias if you assigned a login role to one or more nodes. This example uses the alias penguin. This is the name by which users will log in to the system. If you did not assign a login role to any node, you are not asked to supply an LVS alias.

    Executing C10hptc_cluster_fs gconfigure
    Executing C20gmmon gconfigure
    Executing C30swmlogger gconfigure
    Executing C30syslogng_forward gconfigure
    Executing C35dhcp gconfigure
    Executing C50cmf gconfigure
    Executing C50lvs gconfigure
    
    Enter the name of the cluster alias: penguin

  5. Enable Web access to the Nagios monitoring application and create a password for the nagiosadmin user. This password does not have to match any other password on your system.

    Executing C50nagios gconfigure
    Would you like to enable web based monitoring? ([y]/n) y
    Enter the password for the 'nagiosadmin' web user:
    New password: your_nagios_password
    Re-type new password: your_nagios_password
    Adding password for user nagiosadmin
    Executing C50nat gconfigure
    Executing C50supermond gconfigure
    Executing C51nagios_monitor gconfigure
    Executing C60nis gconfigure 

  6. If you assigned the nis_server role to one or more nodes, supply the name or IP address of your external NIS master server and your NIS domain name so that those nodes can be configured as NIS slave servers. If you did not assign the nis_server role to any node, you are not asked to supply this information.

        Network Information Service (NIS) Configuration
    
    This step sets up one or more NIS servers within the XC system
    that are "slaves" to an external NIS "master".  The master NIS
    server provides the slaves with copies of its NIS maps.
    
    In order to successfully complete this configuration step, the NIS
    master must have been previously set to allow slaves to communicate
    with it.  On Linux systems, this is typically accomplished by adding
    the NIS slave hostname(s) to the /var/yp/ypservers file on the NIS
    master, and then running 'make'.
    
    In addition, to complete this configuration, you will need to provide
    
    1) the name or IP address of the NIS master, and
    2) the NIS domain name hosted by the NIS master
    
    Enter the name or IP address of the external NIS master: [] NIS_IP_address
    Enter the NIS domain hosted by the NIS master: [] your_NIS_domain
    Executing C90munge gconfigure
    Executing C90slurm gconfigure 
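
The prerequisite mentioned in the output above (adding the slave host names to /var/yp/ypservers on the NIS master) amounts to a simple list update. The following sketch works on the file contents as a string and does not touch any file; the function name and host names are hypothetical, and on a real NIS master you would still run make afterward, as the output describes.

```python
def add_nis_slaves(ypservers_text, slave_hostnames):
    """Return ypservers contents with the slave host names appended.

    Existing entries keep their order, and host names that are
    already listed are not added a second time.
    """
    entries = [line.strip() for line in ypservers_text.splitlines() if line.strip()]
    for host in slave_hostnames:
        if host not in entries:
            entries.append(host)
    return "\n".join(entries) + "\n"


if __name__ == "__main__":
    # Hypothetical master and slave names.
    print(add_nis_slaves("nismaster\n", ["n15", "n16"]), end="")
```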

  7. Configure SLURM:

    Do you want to configure SLURM now? (y/n) [y]:y

    Do one of the following:

    • If you intend to install LSF-HPC with SLURM, enter y.

    • If you intend to install standard LSF, do not install SLURM; enter n.

    If you are installing SLURM, define a SLURM user name and accept all default responses. Output looks different if you assigned the resource_management role to one or more additional nodes because you will be prompted to assign the master and backup controller nodes.

    This SLURM configuration needs a special SLURM user. The SLURM
    controller daemons will be run by this user, and certain SLURM
    runtime files will be owned by this user.
    Enter the SLURM username [slurm]: Enter
    
    User 'slurm' does not exist.
    If this user account is created here, it will not have login
    access. Do you want to create this user? (y/n) [y]: Enter
    
    n16 is the only node with the Resource Management
    role. Therefore the SLURM Master Controller daemon will be set up
    on this node, and there will be no SLURM Backup Controller.
    The current Compute Node configuration is:
        NodeName=xc6n[11-16] Procs=2
    
    NOTE: The only Partition created by default is the lsf
    partition. If you want additional partitions, configure
    them manually in the /hptc_cluster/slurm/etc/slurm.conf file.
    
    The current Node Partition configuration is:
        PartitionName=lsf RootOnly=YES Shared=FORCE Nodes=xc6n[11-16]
    
    Do you want to enable SLURM-controlled user-access to the
    compute nodes? (y/n) [n]: n
    
    SLURM configuration complete. Press 'Enter' to continue: Enter
    Executing C95lsf gconfigure

    By default, all compute nodes in the HP XC system are accessible by any user after their user accounts have been set up. The prompt about SLURM-controlled user access enables you to restrict access to each compute node to the user who currently has that node reserved within SLURM.

    It is important that you assign a login role to each node on which you expect users to be able to log in and use the system. If you answer yes here and configure all nodes with the compute role (the default), but you do not configure any nodes with the login role, non-root users will not be allowed to log in to the system.
    Note:

    After cluster_config processing is complete, you have the option to modify default SLURM compute node and partition information. This information is described in “Task 8: Modify SLURM Characteristics (Optional)”.
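
The node and partition lines in the SLURM output use key=value fields and a bracketed node range such as xc6n[11-16]. The following sketch shows one way to read such a line and expand the range. It is an illustration only: it handles a single bracket group of numbers and ranges, not the full SLURM host list syntax, and the function names are hypothetical.

```python
import re

# A prefix followed by one bracket group such as [11-16] or [1,3,5-7].
HOSTLIST_RE = re.compile(r"([^\[\]]+)\[(\d+(?:-\d+)?(?:,\d+(?:-\d+)?)*)\]")


def expand_hostlist(expr):
    """Expand 'xc6n[11-16]' into ['xc6n11', ..., 'xc6n16']."""
    match = HOSTLIST_RE.fullmatch(expr)
    if match is None:
        return [expr]  # plain host name, nothing to expand
    prefix, body = match.groups()
    hosts = []
    for part in body.split(","):
        if "-" in part:
            low, high = part.split("-")
            width = len(low)  # keep any zero padding used in the range
            for number in range(int(low), int(high) + 1):
                hosts.append(f"{prefix}{number:0{width}d}")
        else:
            hosts.append(prefix + part)
    return hosts


def parse_slurm_line(line):
    """Split a 'Key=Value Key=Value' line into a dictionary."""
    return dict(field.split("=", 1) for field in line.split())


if __name__ == "__main__":
    partition = parse_slurm_line(
        "PartitionName=lsf RootOnly=YES Shared=FORCE Nodes=xc6n[11-16]"
    )
    print(partition["PartitionName"], expand_hostlist(partition["Nodes"]))
```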

  8. Decide whether or not you want to install LSF as your job management system:

    Do you want to install LSF locally now? (y|n) [y]:

    Do one of the following:

    • To install LSF-HPC with SLURM or standard LSF, enter y or press the Enter key. Proceed to step 9.

    • If you intend to install another job management system, such as the Maui Scheduler (which is documented in Appendix J), enter n. Proceed to step 12.

      If at a future time you want to install LSF, rerun the cluster_config utility, and answer y to this question.

      The remainder of this procedure does not describe how to install any job management system other than LSF-HPC with SLURM or standard LSF.

  9. Decide the type of LSF to install:

    There are two types of LSF available to install:
    
        1. Standard LSF: the standard Load Sharing Facility product.
    
        2. LSF-HPC integrated with SLURM: the LSF High Performance
           Computing solution integrated with SLURM for XC.
    
    Which LSF product would you like to install (1/2)? [2]:  

    Table 4-6 describes characteristics of LSF-HPC with SLURM and standard LSF to help you decide which type of LSF to install.

    Table 4-6 Characteristics of LSF-HPC with SLURM and Standard LSF

    LSF-HPC with SLURM:

    • Parallel support for:

      • Accounting

      • Signal propagation

      • I/O

      • Job launching

    • Designed to ensure that parallel jobs (MPI jobs) achieve the best performance by dedicating whole nodes to parallel jobs. This works well for systems with 2-processor and 4-processor nodes where jobs are expected to span across nodes.

    • Exclusive node allocation with exclusive user access control.

    • Presents the entire system as a single, large SMP host rather than a large system of many hosts. This simplifies system status commands because information is shown for one host, which makes it desirable for large-scale systems.

    Standard LSF:

    • A load-based scheduler, which is ideal for serial jobs.

    • Finds the free resource that is the least loaded and dispatches the job to that node.

    • Sufficient for sites that do not need the type of parallel job support provided by LSF-HPC with SLURM.

     

  10. If you are installing LSF-HPC with SLURM, your first decision is where to assign the primary LSF node; this decision is not required for standard LSF.

    • If more than one node is assigned the resource_management role, you are prompted to identify the primary LSF node, as follows:

      Here is the set of nodes from which to select the Primary
      XC LSF-HPC node: n[15-16]
      Enter the Primary XC LSF-HPC node [n16] : n16

    • If only one node is assigned the resource_management role, the following is displayed:

      n16 is the only node with the Resource Management role,
      and it is the Primary LSF-HPC node. 

  11. Provide responses to install and configure LSF. You must supply information about the primary LSF administrator and the administrator's password.

    The user name lsfadmin is the default user name for the primary LSF administrator. If you accept the default user name and a NIS account exists with the same name, LSF-HPC with SLURM will be configured with the existing NIS account. You will not be prompted to supply a password for the lsfadmin account. Otherwise, accept all default answers.

    Output is similar to the following:

    What name shall LSF use to uniquely identify this system?
    No existing host names are allowed, and the name must be
    less than 39 characters with no whitespace:
    LSF System Name [hptclsf]:  Enter
    
    Enter the name of the Primary LSF Administrator. You can
    configure additional administrators later, but this user
    must exist now in order to be given ownership of the files
    to be installed. If this user does not exist, it will be
    created locally [lsfadmin]: Enter
    
    The lsfadmin user does not exist. Do you want to
    create this user now? (y/n) [y] Enter
    Changing password for user lsfadmin. 
    New UNIX password:  your_lsfadmin_password
    Retype new UNIX password: your_lsfadmin_password
    passwd: all authentication tokens updated successfully.
    
    Executing the Platform LSF-HPC installation script (hpcinstall)...
    Logging installation sequence in 
    /opt/hptc/lsf/files/lsfhpc/install-20051216023643/hpc6.1_hpcinstall/Install.log
      1) linux2.6-glibc2.3-ia32e-slurm
    
    Press 1 or Enter to install this host type:  Enter

    The sample command output was obtained from an Opteron-based system. Thus, the tar file name is linux2.6-glibc2.3-amd64-slurm (the string amd64 signifies an Opteron- or Xeon-based architecture). When an Itanium-based system is configured, the string ia64 is included in the file name.
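
The LSF System Name prompt earlier in this step states three constraints: the name must not be an existing host name, must be less than 39 characters, and must contain no whitespace. A minimal sketch of checking a candidate name against those rules before you run cluster_config follows. The function name is hypothetical, the host names are examples, and the installer may apply checks that are not listed in the prompt.

```python
def lsf_system_name_problems(name, existing_hosts):
    """Return a list of rule violations for a proposed LSF system name.

    An empty list means the name satisfies the three rules quoted
    in the prompt.
    """
    problems = []
    if len(name) >= 39:
        problems.append("name must be less than 39 characters")
    if any(ch.isspace() for ch in name):
        problems.append("name must not contain whitespace")
    if name in existing_hosts:
        problems.append("name must not be an existing host name")
    return problems


if __name__ == "__main__":
    hosts = {"n15", "n16"}  # example host names
    print(lsf_system_name_problems("hptclsf", hosts))
    print(lsf_system_name_problems("n16", hosts))
```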

  12. Follow along with the remainder of the system configuration process. This sample output is provided for your information only; there is nothing else you have to do in this step. Despite some of the messages shown in the command output, everything you need to install, configure, and verify LSF and SLURM is described in this document.

    Pre-installation check report saved as text file: 
    /opt/hptc/lsf/files/lsfhpc/install-20051216023643/hpc6.1_hpcinstall/prechk.rpt.
    
    ... Done LSF pre-installation check.
    
    ... Done installing hpc binary files "linux2.6-glibc2.3-ia32e-slurm".
    
    ... LSF configuration is done.
    
    hpcinstall is done.
    
    To complete your hpc installation and get your 
    cluster "hptclsf" up and running, follow the steps in 
    "/opt/hptc/lsf/files/lsfhpc/install-20051216023643/hpc6.1_hpcinstall/  \
        hpc_getting_started.html".
    
    After setting up your LSF server hosts and verifying 
    your cluster "hptclsf" is running correctly, 
    see "/opt/hptc/lsf/top/6.1/hpc_quick_admin.html" 
    to learn more about your new LSF cluster.
    
    ***Begin LSF-HPC Post-Processing***
    
    Created '/hptc_cluster/lsf/tmp'...
    
    Editing /opt/hptc/lsf/top/conf/lsf.cluster.hptclsf...
    Moving /opt/hptc/lsf/top/conf/lsf.cluster.hptclsf
     to /opt/hptc/lsf/top/conf/lsf.cluster.hptclsf.old.6490...
    
    Editing /opt/hptc/lsf/top/conf/lsf.conf...
    Moving /opt/hptc/lsf/top/conf/lsf.conf
     to /opt/hptc/lsf/top/conf/lsf.conf.old.6490...
    
    Editing /opt/hptc/lsf/top/conf/lsbatch/hptclsf/configdir/lsb.params...
    Moving /opt/hptc/lsf/top/conf/lsbatch/hptclsf/configdir/lsb.params
     to /opt/hptc/lsf/top/conf/lsbatch/hptclsf/configdir/lsb.params.old.6490...
    
    Replaced default lsb.queues with a preconfigured lsb.queues.
    
    C95lsf finished

  13. After LSF is installed and configured, the golden image is created, and all other system services are configured and started. Output looks similar to the following:

    Configuring the image replication environment
        Initializing 172.20.0.16 as golden client
        Creating the golden image (takes approximately 10 minutes)
    
    **Do not interrupt this process or else the golden image will be incomplete**
    
        Setting up the bootserver
        Linking client nodes to their autoinstall script
        Initializing service persistence
        Sanitizing services in the golden image
        Creating golden image 'tar' file (takes approximately 10-15 minutes)
        Verifying integrity of golden image 'tar' file
    Image replication environment configuration complete.
    info: nconfig started
    info: Executing on head node
    
    info: Executing C02network nconfigure
    info: Executing C04iptables nconfigure
    info: Executing C06nfs_server nconfigure
    info: Executing C08ntp nconfigure
    info: Executing C10hptc_cluster_fs nconfigure
    info: Executing C10hptc_cluster_fs_client nconfigure
    info: Executing C20gmmon nconfigure
    info: Executing C30swmlogger nconfigure
    info: Executing C30syslogng_forward nconfigure
    info: Executing C40hpasm nconfigure
    info: Executing C50cmf nconfigure
    info: Executing C50collectl nconfigure
    info: Executing C50gather_data nconfigure
    info: Executing C50hptc-lm nconfigure
    info: Executing C50nagios nconfigure
    info: Executing C50nat nconfigure
    info: Executing C50supermond nconfigure
    info: Executing C51nagios_monitor nconfigure
    info: Executing C51nrpe nconfigure
    info: Executing C90munge nconfigure
    info: Executing C90slurm nconfigure
    info: Executing C95lsf nconfigure
    info: Executing C30syslogng_forward cconfigure
    info: Executing C35dhcp cconfigure
    info: Executing C50supermond cconfigure
    info: Executing C90munge cconfigure
    info: Executing C90slurm cconfigure
    info: Executing C95lsf cconfigure
    info: nconfig shut down
    info: nconfig started
    info: Executing on head node
    
    info: Executing C02network nrestart
    info: Executing C04iptables nrestart
    info: Executing C06nfs_server nrestart
    info: Executing C08ntp nrestart
    info: Executing C10hptc_cluster_fs nrestart
    info: Executing C10hptc_cluster_fs_client nrestart
    info: Executing C20gmmon nrestart
    info: Executing C30swmlogger nrestart
    info: Executing C30syslogng_forward nrestart
    info: Executing C40hpasm nrestart
    info: Executing C50cmf nrestart
    info: Executing C50collectl nrestart
    info: Executing C50gather_data nrestart
    info: Executing C50hptc-lm nrestart
    info: Executing C50nagios nrestart
    info: Executing C50nat nrestart
    info: Executing C50supermond nrestart
    info: Executing C51nagios_monitor nrestart
    info: Executing C51nrpe nrestart
    info: Executing C90munge nrestart
    info: Executing C90slurm nrestart
    info: Executing C95lsf nrestart
    info: Executing C30syslogng_forward crestart
    info: Executing C35dhcp crestart
    info: Executing C50supermond crestart
    info: Executing C90munge crestart
    info: Executing C90slurm crestart
    info: Executing C95lsf crestart
    info: nconfig shut down 

    Note:

    If necessary, see “Troubleshooting the Imaging Process” for information about using the imaging log files to troubleshoot the imaging process.
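
When reviewing saved cluster_config output, it can help to group the "Executing" lines by action (for example gconfigure, nconfigure, or nrestart). The following sketch assumes the line format shown in the sample output in this task, with an optional "info:" prefix; the function name is hypothetical, and the format of the imaging log files themselves is not covered here.

```python
import re
from collections import defaultdict

# Matches lines such as "info: Executing C02network nconfigure"
# and "Executing C50nagios gconfigure".
EXECUTING_RE = re.compile(r"^\s*(?:info:\s+)?Executing\s+(C\d+\S+)\s+(\w+)\s*$")


def scripts_by_action(output):
    """Map each action to the scripts executed with it, in order."""
    actions = defaultdict(list)
    for line in output.splitlines():
        match = EXECUTING_RE.match(line)
        if match:
            script, action = match.groups()
            actions[action].append(script)
    return dict(actions)


if __name__ == "__main__":
    sample = """info: nconfig started
info: Executing on head node
info: Executing C02network nconfigure
info: Executing C90slurm cconfigure
info: Executing C02network nrestart
info: nconfig shut down"""
    for action, scripts in scripts_by_action(sample).items():
        print(action, scripts)
```

Lines that do not name a script, such as "Executing on head node", are skipped.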

Proceed to “Task 8: Modify SLURM Characteristics (Optional)” to modify the SLURM configuration file. This task is optional.

© 2003 Hewlett-Packard Development Company, L.P.