HP XC System Software: Installation Guide > Chapter 3 Configuring and Imaging the System

Task 11: Run the startsys Utility to Start the System and Propagate the Golden Image

The first time the entire system is started with the startsys command, power to each node is turned on, each node boots from its network adapter, and the SystemImager automatic installation environment is downloaded. This environment automatically installs and configures each node from the golden image. On large-scale systems, the startsys command may take several minutes just to power on the nodes.

The number of nodes to be installed influences the amount of time it takes to complete the process. After all nodes are installed, they automatically reboot to the login prompt. This process can take two to three hours on a system with 1024 compute nodes.

This release uses multicast file transfer technology to download software to client nodes during their image installation. Multicast file transfer technology provides a fast and scalable method of installing systems. Multicast imaging sends data simultaneously to many nodes that have previously been set up to listen for a multicast from the designated image server. Compared with other file transfer technologies, multicast imaging places very little load on the image server and therefore allows systems of all sizes to be installed relatively quickly.

Multicast imaging uses the udpcast open source package and the flamethrower functionality of SystemImager. A series of udp-sender daemons runs on the image server, and each client node runs a series of udp-receiver daemons during the imaging operation. The udp-sender daemons are managed by the startsys command. The startsys command starts these daemons when the --image_only or --image_and_boot option is entered on the command line and shuts them down after the imaging operation is complete. Therefore, you must use startsys when performing a full installation through the imaging operation.
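
If you want to confirm that the sender daemons are present while imaging is in progress, a process count is one way to do it. The following is a minimal sketch, not part of the HP XC software: the helper name count_named is hypothetical, and it assumes the daemons appear in the process table under the command name udp-sender (udp-receiver on a client node), which this guide does not state explicitly.

```shell
# Hypothetical helper: count the lines on standard input that exactly
# match the command name given as $1. grep -c prints 0 and returns a
# nonzero status when nothing matches, so "|| true" keeps the function
# usable in scripts that run with "set -e".
count_named() {
    grep -c -x -F -- "$1" || true
}

# Example use on the image server during imaging (assumed process name):
#   ps -e -o comm= | count_named udp-sender
```

A count of 0 outside an imaging operation is expected, because startsys shuts the daemons down when imaging completes.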

Follow this procedure to start the system and propagate the golden image to all nodes; the command-line options depend on the number of nodes in the system:

  1. Make sure the XC.lic license key file is located in the following directory:

    # ls /opt/hptc/etc/license
    CAUTION: You cannot continue if the license key file is not present in this directory. See “Task 7: Have the License Key File Ready” and “Put the License Key File in the Correct Location (Required)” for more information about obtaining and positioning the license key file if you have not already done so.
  2. Use the startsys command to turn on power to all nodes, image the nodes, and boot the nodes. As shown in Table 3-9, the command-line options for the initial system image and boot depend upon the size of the system. See startsys(8) for the complete list of command options.

    Table 3-9 The startsys Command-Line Options for Initial System Image and Boot

    Option: --image_and_boot
      Description: Images and then reboots all nodes so they complete their per-node configuration phase, thus completing the installation on the nodes. This option applies only for nodes that have previously been set up to network boot.
      Use on systems with fewer than 300 nodes? Yes
      Use on systems with more than 300 nodes? No

    Option: --image_only
      Description: Completes the imaging phase and then halts the nodes before their per-node configuration phase. For large-scale systems, booting while imaging is not recommended.
      Use on systems with fewer than 300 nodes? No
      Use on systems with more than 300 nodes? Yes

  3. Use this startsys command on systems with fewer than 300 nodes to image and boot all nodes. For larger hardware configurations, use the command shown in step 4.

    # startsys --image_and_boot
  4. Use these startsys commands on systems with more than 300 nodes. The image and boot process is completed in two phases:

    1. Image the nodes:

      # startsys --image_only
    2. Boot the nodes:

      # startsys --boot_group_delay=240
      NOTE: Use the --boot_group_delay=240 option only the first time nodes are booted after being imaged. The value 240 specifies the number of seconds to wait between groups of nodes as they boot. For more information about this value, see startsys(8).
  5. If you want to watch as the startsys command images and powers on nodes, open a second terminal window and issue a tail command to view the following log files:

    • /hptc_cluster/adm/logs/imaging.log

    • /hptc_cluster/adm/logs/startsys.log

    Command output on a small, 16-node configuration is similar to the following:

    # startsys --image_and_boot
    Thu Sep 28 08:49:10 2006 Enabled nodes: 16 nodes -> n[1-16]
        Thu Sep 28 08:49:12 2006 Removing the execution node: n16
        Thu Sep 28 08:49:12 2006 Boot hierarchy of specified nodes is: n15 n[1-14]
        Thu Sep 28 08:49:15 2006 Initial power test - please wait.
        Thu Sep 28 08:49:24 2006 Nodes that will image: 15 nodes -> n[1-15]
    You must manually power on the following nodes:
    n1
    Press enter after applying power to these nodes.
    
    continuing ........
        Thu Sep 28 08:49:29 2006 Powering on for image: 14 nodes -> n[2-15]
        Thu Sep 28 08:50:34 2006 Retrying power --on command: 3 nodes -> n[2-3,15]
    
    *** Thu Sep 28 08:52:19 2006 Current statistics:
      Imaging: 15 nodes -> n[1-15]
    
      Progress:
    Flamethrower started: nodes waiting: 15 nodes -> n[1-15]
    
    *** Thu Sep 28 08:55:19 2006 Current statistics:
      Imaging: 15 nodes -> n[1-15]
    
      Progress:
    
    *** Thu Sep 28 08:58:19 2006 Current statistics:
      Imaging: 15 nodes -> n[1-15]
    
      Progress:
        Thu Sep 28 08:58:34 2006 Imaging completed; will be powered off: 2 nodes -> n[1-2]
    You must manually power off the following nodes:
    n1
    Press enter after removing power from these nodes.
    
    
    continuing ........
        Thu Sep 28 08:59:02 2006 Powering off: 1 node -> n2
        Thu Sep 28 08:59:48 2006 Imaging completed; will be powered off: 9 nodes -> n[4-10,12,14]
        Thu Sep 28 08:59:48 2006 Powering off: 9 nodes -> n[4-10,12,14]
        Thu Sep 28 09:00:04 2006 Imaging completed; will be powered off: 3 nodes -> n[11,13,15]
        Thu Sep 28 09:00:04 2006 Powering off: 3 nodes -> n[11,13,15]
        Thu Sep 28 09:00:52 2006 Imaging completed; will be powered off: 1 node -> n3
        Thu Sep 28 09:00:52 2006 Powering off: 1 node -> n3
        Thu Sep 28 09:01:07 2006 Retrying power --off command: 1 node -> n15
    
    *** Thu Sep 28 09:01:22 2006 Current statistics:
      Waiting for hierarchy to boot: 15 nodes -> n[1-15]
      Progress:
        Thu Sep 28 09:01:22 2006 Powering on for boot: 1 node -> n15
        Thu Sep 28 09:02:33 2006 Retrying power --on command: 1 node -> n15
        Thu Sep 28 09:04:18 2006 Processing completed for: 1 node -> n15
    
    *** Thu Sep 28 09:04:33 2006 Current statistics:
      Booted and available: 1 node -> n15
      Waiting for hierarchy to boot: 14 nodes -> n[1-14]
    
      Progress:
    You must manually power on the following nodes:
    n1
    Press enter after applying power to these nodes.
    
    continuing ........
        Thu Sep 28 09:04:37 2006 Powering on for boot: 13 nodes -> n[2-14]
        Thu Sep 28 09:05:33 2006 Retrying power --on command: 12 nodes -> n[2-6,8-14]
        Thu Sep 28 09:06:48 2006 Processing completed for: 1 node -> n1
        Thu Sep 28 09:07:03 2006 Processing completed for: 1 node -> n7
        Thu Sep 28 09:07:18 2006 Processing completed for: 9 nodes -> n[4-5,8-14]
    
    *** Thu Sep 28 09:07:33 2006 Current statistics:
      Booted and available: 15 nodes -> n[1-15]
    
      Progress:
        Thu Sep 28 09:07:33 2006 Processing completed for: 3 nodes -> n[2-3,6]
    
    *** Thu Sep 28 09:07:33 2006 Current statistics:
      Booted and available: 15 nodes -> n[1-15]
    
      Progress:
        Thu Sep 28 09:07:34 2006 startsys process exiting with code 0
  6. See “Troubleshoot the Imaging Process” if you encounter problems imaging nodes.
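
The procedure above can be summarized as a small shell sketch. It is an illustration only, not an HP-supplied script: the function names have_license, startsys_plan, and watch_logs are hypothetical, and the sketch prints the startsys commands rather than running them. The guide distinguishes systems with fewer than 300 nodes from systems with more than 300 nodes and does not say what to do at exactly 300; this sketch assumes such a system follows the two-phase path.

```shell
#!/bin/sh
# Sketch of steps 1 through 5. Function names are hypothetical.

# Directory named in step 1; XC.lic is the license key file.
LICENSE_DIR="${LICENSE_DIR:-/opt/hptc/etc/license}"

# Step 1: succeed only when the license key file is present.
have_license() {
    [ -f "$LICENSE_DIR/XC.lic" ]
}

# Steps 3 and 4: print the startsys command(s) for a system of $1 nodes.
# Assumption: exactly 300 nodes is treated as a large system.
startsys_plan() {
    nodes=$1
    if [ "$nodes" -lt 300 ]; then
        echo "startsys --image_and_boot"
    else
        echo "startsys --image_only"
        echo "startsys --boot_group_delay=240"
    fi
}

# Step 5: follow both log files from a second terminal window.
# Passing several files to tail -f works with GNU tail.
watch_logs() {
    tail -f /hptc_cluster/adm/logs/imaging.log \
            /hptc_cluster/adm/logs/startsys.log
}

# Example use:
#   have_license || echo "XC.lic not found in $LICENSE_DIR" >&2
#   startsys_plan 16
```

Printing the commands instead of executing them keeps the sketch safe to try on any machine; on a real system you would review the output and then enter the commands yourself as root.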

Proceed to “Task 12: Perform Postconfiguration Tasks for the InfiniBand Interconnect”.

© 2003 Hewlett-Packard Development Company, L.P.