Follow this procedure to configure the upgraded system and propagate the new golden image to all client nodes:

Back up the existing configuration and management database and migrate existing data to the new release format by running the upgradesys utility.

Command output is similar to the following:

The upgradesys utility performs all the necessary steps
to upgrade your cluster. This script should be run immediately
after you have upgraded the head node with the latest XC software
and any third party vendor rpms.
Do you wish to continue? [y/n] y
Backing up database to
/opt/hptc/etc/sysconfig/upgrade/upgradesys.dbbackup-20050103145027.sql ...
Executing C02database gupdate
Starting MySQL: [ OK ]
Executing C20server_type gupdate
Executing C30device_names gupdate
Executing C33etc_hosts gupdate
Executing C35region gupdate
Executing C40role_migration gupdate
Executing C90systemimager gupdate
Removing XC MLIB RPMs
upgradesys output logged to /var/log/upgradesys/upgradesys.log

CAUTION: Do not proceed to the next step in the upgrade process if the output from the upgradesys script indicates failures. If you cannot determine how to resolve these errors, contact your local HP support center.
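As a rough aid for the check described in the caution, the log file named in the output can be scanned for lines that look like failures. This is only a sketch: the function name `scan_upgrade_log` is hypothetical, and the `fail|error` patterns are an assumption, because the exact wording that upgradesys uses for failures is not shown here. Reading the full log remains the reliable check.

```shell
#!/bin/sh
# Hypothetical helper: print numbered lines of a log that look like failures.
# The patterns 'fail' and 'error' are assumptions, not documented upgradesys wording.
scan_upgrade_log() {
    log_file="$1"
    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 2
    fi
    # grep exits 0 when at least one line matches; treat that as "problems found".
    if grep -inE 'fail|error' "$log_file"; then
        return 1
    fi
    return 0
}
```

For example, `scan_upgrade_log /var/log/upgradesys/upgradesys.log` returns 0 when no suspicious lines are found and 1 when it prints any.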
Review the /opt/hptc/systemimager/etc/base_exclude_file file to determine whether you want to exclude files from the golden image beyond those already excluded. The base_exclude_file file is read when the golden image is re-created as part of the upgrade process. The HP XC System Software Administration Guide describes how to add exclusions to this file.

If you are using the InfiniBand interconnect, verify that the appropriate configuration files were successfully changed so that the XC system is using the same type of InfiniBand software stack as the HP SFS system. The sfsconfig command should have changed the following files:

/etc/modprobe.conf
/etc/modprobe.conf.lustre
/etc/modprobe.conf.lustre.*
If you are using the OFED InfiniBand software stack, verify that the lnet option has been added to /etc/modprobe.conf.lustre and that it looks similar to the following line:

options lnet networks=o2ib0
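The lnet check above can be scripted with grep. The sketch below assumes the option appears on a single line beginning with `options lnet`; the function name `check_lnet_option` is hypothetical, and the pattern is deliberately loose because the text only says the line should look similar to the example.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file has an "options lnet ... networks=o2ib0" line.
check_lnet_option() {
    conf_file="$1"
    if grep -Eq '^[[:space:]]*options[[:space:]]+lnet[[:space:]].*networks=o2ib0' "$conf_file"; then
        echo "ok: lnet uses o2ib0 in $conf_file"
        return 0
    fi
    echo "check $conf_file: no 'options lnet networks=o2ib0' line found" >&2
    return 1
}
```

A typical call would be `check_lnet_option /etc/modprobe.conf.lustre`.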
Verify the /etc/sfstab.proto file to ensure that the appropriate lnet entries were created and that the InfiniBand interconnect interface used is o2ib0 (and not vib0). If this is not the case, contact HP SFS Support for more information on how to manually change the /etc/sfstab.proto file.

When all of the files mentioned above are correct (and you are able to automatically mount the HP SFS file system on the head node), verify that the corresponding files under /var/lib/systemimager/images/base_image/etc are also correct, and replace them if necessary.
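One way to carry out the comparison described above is to check each head-node file against its copy in the image tree and to search sfstab.proto for the unwanted vib0 interface. This is a minimal sketch: `compare_with_image` and `uses_vib0` are hypothetical helper names, and the exact layout of sfstab.proto entries is not assumed beyond the interface names mentioned in the text.

```shell
#!/bin/sh
# Hypothetical helper: report files that differ between /etc and the image's etc directory.
# Usage: compare_with_image ETC_DIR IMAGE_ETC_DIR FILE...
compare_with_image() {
    etc_dir="$1"
    image_etc="$2"
    shift 2
    result=0
    for name in "$@"; do
        if [ ! -e "$image_etc/$name" ]; then
            echo "missing in image: $name"
            result=1
        elif ! cmp -s "$etc_dir/$name" "$image_etc/$name"; then
            echo "differs: $name"
            result=1
        fi
    done
    return "$result"
}

# Hypothetical helper: succeed if the file still mentions the vib0 interface.
uses_vib0() {
    grep -q 'vib0' "$1"
}
```

For example, `compare_with_image /etc /var/lib/systemimager/images/base_image/etc modprobe.conf modprobe.conf.lustre sfstab.proto` lists the files that would need to be replaced in the image, and `uses_vib0 /etc/sfstab.proto` flags a file that still refers to vib0.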
Verify that the newly created HP XC golden image is a reasonable size (typically 2 to 3 GB); otherwise, the imaging process might fail.

# du -sk /var/lib/systemimager/images/base_image

Decide on the cluster_config option to use to configure the upgraded system. Table 5-6 describes
two options to the cluster_config utility that
you can use to reconfigure a system after a software upgrade. The
option you choose depends upon how you want the upgrade to proceed.

Table 5-6 Upgrade Options for the cluster_config Utility

| --migrate Option | --init Option |
|---|---|
| Brings existing, recognized roles in the configuration and management database into alignment with the new roles introduced in this release. This option retains information about role assignments and preserves role-to-node assignments. Using this option does not guarantee the correct migration process for unrecognized (user-created) roles and services in the configuration and management database. Before you decide to use this option, view the /opt/hptc/etc/sysconfig/upgrade/role_migration.ini file to see how the previous role assignments compare to the roles provided in the new release. | Initializes (resets) your existing node role assignments and configures the system with the default node role assignments in this release. This option does not preserve role-to-node assignments from the previous release. The default roles and assignments have been optimized for performance, and you might decide that this configuration is better suited for your environment. See Appendix F for a description of roles and the services provided by them, as well as the default node role assignments. |
Change directory to the configuration directory:

# cd /opt/hptc/config/sbin

Specify one of the following cluster_config options:

To migrate the existing system configuration:

# ./cluster_config --migrate

To apply new default role assignments to the existing system configuration:

# ./cluster_config --init
If you followed the instructions in “Task 9: Plan a Service Availability Strategy” to install an availability tool and position related scripts to set up improved availability of services, you are prompted to configure availability sets now. “Task 8: Configure Availability Sets” describes how to configure availability sets. Return here when you are done.

View the role assignments when the cluster_config utility displays the command-line options menu:

[L]ist Nodes, [M]odify Nodes, [A]nalyze, [H]elp, [P]roceed, [Q]uit: l
HP recommends that you use the [L]ist Nodes option to see the roles assigned to each node and make adjustments if required.

If you ran the cluster_config command with the --init option, use the [M]odify Nodes option to reassign any role assignments you customized in the previous release. For example, if the system configuration had login roles on one or more nodes, you must assign a login role on any node on which you want users to be able to log in. In the default configuration, a login role is not assigned to any node.

If you need more information about using the cluster_config command-line options menu to modify role assignments, see Appendix G.
When you have finished making role assignments, enter the letter p to proceed with the system configuration process:

[L]ist Nodes, [M]odify Nodes, [A]nalyze, [H]elp, [P]roceed, [Q]uit: p
Do you want to apply your changes to the cluster configuration? [y/n] y
[S]ervices Config, [P]roceed, [Q]uit: p
Do you want to apply your changes to the service configuration? [y/n] y
The cluster_config utility prompts you to supply system configuration information. When prompted, provide the answers listed in Table 5-7.

Table 5-7 Responding to cluster_config Prompts During an Upgrade

| Prompt | Answer |
|---|---|
| Regenerate ssh keys? | yes |
| Re-create the qsnet database? (Seen only on systems that are configured with a QsNetII interconnect.) | yes |
| Reconfigure SLURM? | yes |
| Create a new slurm.conf file? | yes |
| Install LSF? | yes |
| Upgrade to new version of LSF? | u (upgrade) |
| All other prompts | Accept the default response for all prompts except prompts for improved availability (if it has been configured) |
Follow along on the screen while the cluster_config utility configures the system.

NOTE: To avoid duplicating command output here, cluster_config output is shown in Section .
Continue to the next step in this procedure when the cluster_config processing is complete.

Look at the backup copy of the slurm.conf file, which is located in the /hptc_cluster/slurm/etc/slurm.conf.bak file. If you previously customized this file, you must merge those customizations into the new version of the /hptc_cluster/slurm/etc/slurm.conf file. Otherwise, omit this step.

Compare the final LSF configuration with the saved version to ensure that only the changes relevant to the upgrade have occurred. Also ensure that the elim scripts are in place, and restore them if required. HP recommends that you use another terminal window to install the elim scripts at the point when the cluster_config utility displays the following prompt:

All user specified configuration is complete.
The Golden Image will be created next.
[P]roceed, [Q]uit:
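The slurm.conf and LSF comparisons described above can be done in the second terminal window with diff. The following is a minimal sketch; the function name `show_config_changes` is hypothetical, and for LSF you would supply the path of your own saved copy, which is not specified here.

```shell
#!/bin/sh
# Hypothetical helper: show a unified diff between a saved config file and the new one.
# Returns 0 when the files are identical, 1 when they differ or cannot be compared.
show_config_changes() {
    saved="$1"
    current="$2"
    if diff -u "$saved" "$current"; then
        echo "no differences: $current"
        return 0
    fi
    return 1
}
```

For example, `show_config_changes /hptc_cluster/slurm/etc/slurm.conf.bak /hptc_cluster/slurm/etc/slurm.conf` prints the lines that changed, which helps identify customizations that still need to be merged by hand.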
Because the golden image is created after the [P]roceed, [Q]uit prompt shown above, changes made at that point to the /opt/hptc/lsf/top/6.2/ directory, where the elim scripts reside, are included in the golden image. Any changes to the LSF configuration files in the /hptc_cluster/lsf/conf/ directory can occur after cluster_config has finished because this directory is not part of the golden image. If the LSF version has changed, binary paths (including the $LSF_SERVERDIR directory, where elim scripts reside) have also changed.

Re-enter the monitoring line card entries in the /etc/dhcpd.conf file if the system is using a Myrinet interconnect. See “Configure Myrinet Switch Monitoring Line Cards” for more information about adding entries to this file. If the system is using an InfiniBand, QsNetII, or Gigabit Ethernet interconnect, omit this step.

Proceed to “Task 9: Image and Boot the System and Start Compute Resources.”