HP XC System Software: XC Installation Guide, Appendix F: Description of Node Roles, Services, and the Default Configuration

Role Definitions

A node role is defined by the services provided to the node. The role is an abstraction that combines one or more services into a group. Roles provide a convenient way of installing services on a node. The node roles are described below in alphabetical order.

You can define multiple roles on any node. The head node, in particular, can have all of these roles if you are setting up a small cluster. If you need more information about services and node roles, see the HP XC System Software Administration Guide.

Availability Role

The availability role is automatically assigned to all nodes that are members of availability sets. You cannot assign this role to any node yourself. The configuration and management database names of the services provided by the availability role are avail and translate.

The avail entry points make configuration-specific changes for availability, including configuring the node to run the configured availability tools. The translate entry point calls the availability tool scripts, which traverse the configuration and management database to obtain information about services configured for availability. The availability tool scripts create the configuration files necessary to enable the availability tool on the nodes on which it was configured.

Avail_node_management Role

Assign the avail_node_management role to fail over services on the head node that are usually supplied by the node_management role. In this release, the database server service is the only service supplied by the avail_node_management role. Because you cannot assign the node_management role to any node other than the head node, the avail_node_management role was developed to accomplish that task.

Never assign the avail_node_management role to the head node. Assign the avail_node_management role only to the second node in an availability set to fail over the database server (dbserver) service.

Cisco_hsm Role

This role defines where the optional Cisco InfiniBand Host Based Subnet Manager runs.
The service associated with this role is called ib_sm. For more information about when this role is required, see Section .

HP recommends that you assign this role to the head node and another service node. Assigning this role to another service node enables the Subnet Manager, and therefore the InfiniBand network, to continue to function even if the head node is down.

Either the cisco_hsm role or the voltaire_hsm role can be present in a system, but not both.

Common Role

The common role is automatically assigned to all nodes, and it cannot be removed. This role runs services that must be present on every node. The configuration and management database names of the services provided by this role are as follows:
These services provide functionality that is required on all nodes and are fundamental to the proper functioning of the cluster.

Compute Role

Jobs are distributed to and run on nodes with the compute role. This role provides the services required for the node to be an allocated resource of the SLURM central control service (slurmcd).

On systems with fewer than 63 total nodes, this role is assigned to all nodes; on large-scale systems with more than 64 nodes, this role is assigned exclusively to nodes with no other roles assigned. Any node with this role can be called upon to execute jobs scheduled by users. Nodes with this role are often called compute nodes. It is your responsibility to remove this role from a node if it is not wanted or required.

To enable monitoring of the nodes, run the Nagios remote plug-in execution agent on nodes with the compute role.

Console_network Role

The console_network role enables you to map HP XC services with physical console requirements to the appropriate nodes. By default, this role is assigned to the same set of nodes as the management_hub role. In system installations in which only specially designated nodes can access the console network, you can assign this role to those nodes instead, and the management and monitoring tools use only those nodes.

You must assign the console_network role to nodes that have console access. Otherwise, Nagios plug-ins that require console access will not collect data.

The configuration and management database names of the services provided by this role are as follows:
Disk_io Role

Nodes with the disk_io role provide access to storage and file systems mounted locally on the node. This role can be located on any node that provides local file system access to all nodes using NFS. Assign this role to any node that is exporting SAN storage. The configuration and management database name of the NFS server service supplied by this role is nfs_server.

Nodes with this role normally reside in the utility cabinet of the cluster and have the most direct access to storage. You can assign other roles to a node with this role. However, you must be careful not to overload the node, so that it can continue to provide adequate NFS service.

External Role

The external role supplies the NAT server service, which performs network address translation within the cluster. This enables applications to access nodes that do not have an external network connection. The configuration and management database name of the service supplied by this role is nat.

Assign this role only to nodes on which you are configuring an external network connection. If you installed remote graphics software, you must configure an external Ethernet device (the external NICs) on the SVA nodes.

The system can have multiple nodes defined with the external role, supplying multiple NAT servers to ease network traffic congestion. To achieve improved availability of NAT, assign the external role to both nodes in the availability set. If you assign the external role to another node in the system, it will be ignored.

Applications have no need to be aware of the actual internal IP address of a compute node because a NAT server node handles all network requests.

Login Role

Nodes with the login role accept user login sessions. A user can submit jobs from the command line on a node with the login role. The jobs are distributed among compute nodes for processing. Nodes assigned the login role must have a configured external network connection.
The login role supplies a node with the LVS director service, which handles the placement of user login sessions on login nodes when a user logs in to the cluster alias. To achieve improved availability of the LVS director service, you must assign the login role to three nodes. See the role assignment guidelines listed in Table 1-2 for more information. The configuration and management database name of the service supplied by this role is lvs.

Management_hub Role

A node with the management_hub role is an aggregation point for management activities. The configuration and management database names of the services provided by this role are as follows:
These services are used to support scaling of the cluster. Nodes with this role provide local storage for aggregation of system logs and performance information. Management hub services typically report up to the node with the management_server role.

You can assign the management_hub role to several nodes, and HP recommends that you consider using one management hub for every 64 to 128 nodes.

If you add the management_hub role to nodes in the system, HP recommends that you also add the console_network role. If the console_network role is not on a management_hub node, the hub picks a node that is assigned the console_network role. See “Console_network Role” for more information.

Management_server Role

The management_server role contains services that manage the overall management infrastructure. Only one node in the system can have the management_server role; in this release, the head node has this role.

To achieve improved availability of the Nagios master service, you must assign the management_server role to two nodes (the head node and one additional node). See the role assignment guidelines listed in Table 1-2 for more information.

The configuration and management database names of the services provided by this role are as follows:
Nis_server Role

The nis_server role is not enabled by default. Assigning this role to a node configures the node as a NIS slave server. If you assign this role to a node, you are prompted to enter the name of the NIS master server and the NIS domain name during cluster_config processing.

Any node assigned the nis_server role must also have an external Ethernet network connection defined. The configuration and management database name of the service provided by this role is nis.

Node_management Role

Nodes with the node_management role run a number of services that help manage other nodes in the system. This role is restricted to the head node; it cannot be removed. The configuration and management database names of the services provided by this role are as follows:
The configuration and management database, in which all management configuration information is stored, runs on a node with this role. The power manager service manages the powering on and off of nodes in the system. The MPI interconnect settings are managed by nodes running this role. The image server service provides images from the SystemImager tool to all client nodes. The NTP server synchronizes the time on all nodes. The license manager service enables some software components in the system when a valid license is present.

Resource_management Role

Nodes with the resource_management role provide the services necessary to support SLURM and LSF.

On systems with fewer than 63 total nodes, this role is assigned by default to the head node. On large-scale systems with more than 64 nodes, this role is assigned by default to the node whose internal node name is one less than that of the head node. For example, if the head node is n256, node n255 is assigned as the resource manager. On large-scale systems, the resource_management role is exclusive; the node with this role has no other roles assigned to it.

The configuration and management database names of the services provided by this role are as follows:
You can assign this role to multiple nodes in the cluster to provide support for failover. Multiple nodes can have this role defined, but only one is active at a time. If the SLURM central control service fails on the active node, the daemon is started on another node with the resource_management role.

Voltaire_hsm Role

This role defines where the optional Voltaire InfiniBand Host Based Subnet Manager runs. The service associated with this role is called gvfm. For more information about when this role is required, see Section .

HP recommends that you assign this role to the head node and another service node. Assigning this role to another service node enables the Subnet Manager, and therefore the InfiniBand network, to continue to function even if the head node is down.

Either the cisco_hsm role or the voltaire_hsm role can be present in a system, but not both.
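
Several of the assignment rules stated in this section can be expressed as simple checks on a planned role layout. The following Python sketch is illustrative only and is not part of the HP XC software: the Node class, the check_roles function, and the has_external_network flag are hypothetical names introduced here, and the sketch covers only the rules that the text above states explicitly. It does not replace the validation that cluster_config performs.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical description of one node in a planned role layout."""
    name: str
    roles: set[str] = field(default_factory=set)
    is_head: bool = False
    has_external_network: bool = False


# Roles that this section says need an external network connection.
NEEDS_EXTERNAL = {"external", "login", "nis_server"}


def check_roles(nodes: list[Node]) -> list[str]:
    """Return a list of rule violations found in the planned layout."""
    problems: list[str] = []

    # cisco_hsm and voltaire_hsm are mutually exclusive system-wide.
    all_roles = set().union(*(n.roles for n in nodes))
    if {"cisco_hsm", "voltaire_hsm"} <= all_roles:
        problems.append("cisco_hsm and voltaire_hsm cannot both be present")

    for n in nodes:
        # The common role is on every node and cannot be removed.
        if "common" not in n.roles:
            problems.append(f"{n.name}: common role is missing")
        # node_management is restricted to the head node.
        if "node_management" in n.roles and not n.is_head:
            problems.append(f"{n.name}: node_management is head-node only")
        if n.is_head and "node_management" not in n.roles:
            problems.append(f"{n.name}: head node must keep node_management")
        # avail_node_management must never be on the head node.
        if n.is_head and "avail_node_management" in n.roles:
            problems.append(f"{n.name}: avail_node_management on head node")
        # external, login, and nis_server need an external connection.
        if not n.has_external_network:
            for role in sorted(NEEDS_EXTERNAL & n.roles):
                problems.append(
                    f"{n.name}: {role} requires an external network connection"
                )
    return problems


if __name__ == "__main__":
    layout = [
        Node("n16", {"common", "node_management", "cisco_hsm"}, is_head=True),
        Node("n15", {"common", "login", "voltaire_hsm"}),
    ]
    for problem in check_roles(layout):
        print(problem)
```

A check of this kind is most useful before role assignment, as a review of a planned layout. Rules that depend on system size or on availability set membership are deliberately left out because the text gives guidelines, not exact conditions, for them.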