Managing Serviceguard Version A.11.16, Eleventh Edition, Second Printing > Chapter 3 Understanding Serviceguard Software Components
How the Network Manager Works
The purpose of the network manager is to detect and recover from network card and cable failures so that network services remain highly available to clients. In practice, this means assigning IP addresses for each package to the primary LAN interface card on the node where the package is running and monitoring the health of all interfaces, switching them when necessary. Each node (host system) should have at least one IP address for each active network interface. This address, known as a stationary IP address, is configured in the node's /etc/rc.config.d/netconf file or in the node’s /etc/rc.config.d/netconf-ipv6 file. A stationary IP address is not transferable to another node, but may be transferable to a standby LAN interface card. The stationary IP address is not associated with packages. Stationary IP addresses are used to transmit heartbeat messages (described earlier in the section “How the Cluster Manager Works”) and other data. In addition to the stationary IP address, you normally assign one or more unique IP addresses to each package. The package IP address is assigned to the primary LAN interface card by the cmmodnet command in the package control script when the package starts up. The IP addresses associated with a package are called relocatable IP addresses (also known as package IP addresses or floating IP addresses) because the addresses can actually move from one cluster node to another. You can use up to 200 relocatable IP addresses in a cluster, spread over as many as 150 packages. This can be a combination of IPv4 and IPv6 addresses. A relocatable IP address is like a virtual host IP address that is assigned to a package. It is recommended that you configure names for each package through DNS (Domain Name System). A program then can use the package's name like a host name as the input to gethostbyname(), which will return the package's relocatable IP address. 
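As a rough illustration of the lookup described above, the sketch below resolves a package's DNS name to its relocatable IP address using Python's `socket.gethostbyname()`, the counterpart of the C `gethostbyname()` call. The helper name `resolve_package_address` and the package name `pkg1.example.com` are hypothetical and not part of Serviceguard; the resolver is passed in as a parameter so the logic can be tried without a live DNS entry. Python's `gethostbyname()` returns IPv4 addresses only.

```python
import socket
from typing import Callable


def resolve_package_address(
    package_name: str,
    resolver: Callable[[str], str] = socket.gethostbyname,
) -> str:
    """Return the IPv4 address that the resolver maps to a package name.

    A client connects to the returned relocatable address and does not
    need to know which cluster node is currently running the package.
    """
    return resolver(package_name)


# Example call against real DNS (hypothetical package name):
#   address = resolve_package_address("pkg1.example.com")
# A failed lookup raises socket.gaierror.
```

Because the name is resolved at connection time, a client that reconnects after a package has moved to another node reaches it through the same name and address.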
Both stationary and relocatable IP addresses will switch to a standby LAN interface in the event of a LAN card failure. In addition, relocatable addresses (but not stationary addresses) can be taken over by an adoptive node if control of the package is transferred. This means that applications can access the package via its relocatable address without knowing which node the package currently resides on.

Both IPv4 and IPv6 address types are supported in Serviceguard. IPv4 addresses are the traditional addresses of the form “n.n.n.n” where ‘n’ is a decimal number between 0 and 255. IPv6 addresses have the form “x:x:x:x:x:x:x:x” where ‘x’ is the hexadecimal value of each of eight 16-bit pieces of the 128-bit address. Only IPv4 addresses are supported as heartbeat addresses, but both IPv4 and IPv6 addresses (including various combinations) can be defined as stationary IPs in a cluster. Both IPv4 and IPv6 addresses can also be used as relocatable (package) IP addresses.

When a package is started, a relocatable IP address can be added to a specified IP subnet. When the package is stopped, the relocatable IP address is deleted from the specified subnet. Adding and removing relocatable IP addresses is handled through the cmmodnet command in the package control script, which is described in detail in Chapter 6 “Configuring Packages and Their Services”.

IP addresses are configured only on each primary network interface card; standby cards are not configured with an IP address. Multiple IPv4 addresses on the same network card must belong to the same IP subnet.

In one package, it is possible to have multiple services that are associated with the same IP address. If one service is moved to a new system, then the other services using the IP address will also be migrated. Load sharing can be achieved by making each service its own package and giving it a unique IP address. This gives the administrator the ability to move selected services to less loaded systems.
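The two address forms and the same-subnet rule for IPv4 addresses on one card can be checked with Python's standard `ipaddress` module. This is a minimal sketch for planning purposes only; the function names and the documentation-range addresses are assumptions, and nothing here is a Serviceguard interface.

```python
import ipaddress
from typing import Iterable


def address_family(addr: str) -> str:
    """Classify a textual address as 'IPv4' or 'IPv6'."""
    return f"IPv{ipaddress.ip_address(addr).version}"


def same_subnet(addresses: Iterable[str], subnet: str) -> bool:
    """Return True if every address belongs to the given subnet.

    An address of the other IP version is never a member, so mixing
    IPv4 and IPv6 in one call yields False.
    """
    network = ipaddress.ip_network(subnet)
    return all(ipaddress.ip_address(a) in network for a in addresses)
```

For example, `same_subnet(["192.0.2.10", "192.0.2.20"], "192.0.2.0/24")` holds, while adding `198.51.100.5` to the list would violate the rule for a single card.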
At regular intervals, Serviceguard polls all the network interface cards specified in the cluster configuration file. Network failures are detected within each single node in the following manner. One interface on the node is assigned to be the poller. The poller will poll the other primary and standby interfaces in the same bridged net on that node to see whether they are still healthy. Normally, the poller is a standby interface; if there are no standby interfaces in a bridged net, the primary interface is assigned the polling task. (Bridged nets are explained in the section on “Redundant Network Components” in Chapter 2.) The polling interface sends LAN packets to all other interfaces in the node that are on the same bridged net and receives packets back from them.

Whenever a LAN driver reports an error, Serviceguard immediately declares that the card is bad and performs a local switch, if applicable. For example, when the card fails to send, Serviceguard will immediately receive an error notification and it will mark the card as down.

Serviceguard Network Manager also looks at the numerical counts of packets sent and received on an interface to determine if a card is having a problem. There are two ways Serviceguard can handle the counts of packets sent and received. In the cluster configuration file, choose one of these two values for the NETWORK_FAILURE_DETECTION parameter:
A local network switch involves the detection of a local network interface failure and a failover to the local backup LAN card. The backup LAN card must not have any IP addresses configured. In the case of a local network switch, TCP/IP connections are not lost for Ethernet, but IEEE 802.3 connections will be lost. For IPv4, Ethernet, Token Ring and FDDI use ARP, and HP-UX sends out an unsolicited ARP to notify remote systems of the address mapping between MAC (link level) addresses and IP level addresses. IEEE 802.3 does not have the rearp function. IPv6 uses the Neighbor Discovery Protocol (NDP) instead of ARP. NDP is used by hosts and routers to do the following:
Within the Ethernet family, local switching configuration is supported:
On HP-UX 11i, however, Jumbo Frames must not be used since the 100Base-T cards do not support Jumbo Frames. Network interface cards running 1000Base-T or 1000Base-SX cannot do local failover to 10BaseT. During the transfer, IP packets will be lost, but TCP (Transmission Control Protocol) will retransmit the packets. In the case of UDP (User Datagram Protocol), the packets will not be retransmitted automatically by the protocol. However, since UDP is an unreliable service, UDP applications should be prepared to handle the case of lost network packets and recover appropriately. Note that a local switchover is supported only between two LANs of the same type. For example, a local switchover between Ethernet and FDDI interfaces is not supported, but a local switchover between 10BT Ethernet and 100BT Ethernet is supported. Figure 3-16 “Cluster Before Local Network Switching ” shows two nodes connected in one bridged net. LAN segments 1 and 2 are connected by a hub. Node 1 and Node 2 are communicating over LAN segment 2. LAN segment 1 is a standby. In Figure 3-17 “Cluster After Local Network Switching ”, we see what would happen if the LAN segment 2 network interface card on Node 1 were to fail. As the standby interface takes over, IP addresses will be switched to the hardware path associated with the standby interface. The switch is transparent at the TCP/IP level. All applications continue to run on their original nodes. During this time, IP traffic on Node 1 will be delayed as the transfer occurs. However, the TCP/IP connections will continue to be maintained and applications will continue to run. Control of the packages on Node 1 is not affected.
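The local switch described above can be pictured with a small model: when the primary card fails, its IP addresses move to the standby card on the same node, and the standby must not already carry addresses. The sketch below is an illustrative model only, not how Serviceguard is implemented; the `LanInterface` class, the `local_switch` function, and the interface names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LanInterface:
    name: str
    healthy: bool = True
    addresses: List[str] = field(default_factory=list)


def local_switch(primary: LanInterface, standby: LanInterface) -> bool:
    """Move all IP addresses from a failed primary to a healthy standby.

    Returns True if the addresses were moved, and False if no switch
    was needed or the standby could not take over.
    """
    if primary.healthy or not standby.healthy:
        return False
    if standby.addresses:
        # The backup card must not have any IP addresses configured.
        return False
    standby.addresses, primary.addresses = primary.addresses, []
    return True
```

In this model the addresses stay on the same node, which is why packages keep running where they are and established TCP/IP connections can survive the switch.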
Another example of local switching is shown in Figure 3-18 “Local Switching After Cable Failure”. In this case a failure affecting segment 2 causes both nodes to switch to the LAN cards attached to segment 1.

Local network switching will work with a cluster containing one or more nodes. You may wish to design a single-node cluster in order to take advantage of this local network switching feature in situations where you need only one node and do not wish to set up a more complex cluster.

Whenever a node is halted, the cluster daemon (cmcld) will always attempt to switch any Serviceguard-configured subnets running on standby interfaces back to the primary interfaces. This is done regardless of the link state of the primary interfaces. The intent of this switchback is to preserve the original network configuration as it was before the cluster started. Switching back occurs on the specified node if a cmhaltnode command is issued or on all nodes in the cluster if a cmhaltcl command is issued.

A remote switch (that is, a package switch) involves moving packages and their associated IP addresses to a new system. The new system must already have the same subnetwork configured and working properly, otherwise the packages will not be started. With remote switching, TCP connections are lost. TCP applications must reconnect to regain connectivity; this is not handled automatically. Note that if the package is dependent on multiple subnetworks, all subnetworks must be available on the target node before the package will be started.

Note that remote switching is supported only between LANs of the same type. For example, a remote switchover between Ethernet on one machine and FDDI interfaces on the failover machine is not supported. The remote switching of relocatable IP addresses was shown previously in Figure 3-5 “Before Package Switching” and Figure 3-6 “After Package Switching”.
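The subnet precondition for a remote switch amounts to a subset test: every subnetwork the package depends on must be available on the target node. The following sketch shows that check under the assumption that subnets are represented as plain strings; the function names are hypothetical and do not correspond to any Serviceguard command.

```python
from typing import Iterable, Set


def missing_subnets(package_subnets: Iterable[str],
                    node_subnets: Iterable[str]) -> Set[str]:
    """Return the subnets the package needs that the node lacks."""
    return set(package_subnets) - set(node_subnets)


def can_start_on(package_subnets: Iterable[str],
                 node_subnets: Iterable[str]) -> bool:
    """A package may start only if no required subnet is missing."""
    return not missing_subnets(package_subnets, node_subnets)
```

A package that depends on two subnets cannot start on a node that offers only one of them, even if that one subnet is working properly.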
When a floating IPv4 address is moved to a new interface, either locally or remotely, an ARP message is broadcast to indicate the new mapping between IP address and link layer address. An ARP message is sent for each IPv4 address that has been moved. All systems receiving the broadcast should update the associated ARP cache entry to reflect the change. Currently, the ARP messages are sent at the time the IP address is added to the new system. An ARP message is sent in the form of an ARP request. The sender and receiver protocol address fields of the ARP request message are both set to the same floating IP address. This ensures that nodes receiving the message will not send replies.

Unlike IPv4, IPv6 uses NDP messages to determine the link-layer addresses of its neighbors.

Serviceguard supports the use of automatic port aggregation through HP-APA (Auto-Port Aggregation, HP product J4240AA). HP-APA is a networking technology that aggregates multiple physical Fast Ethernet or multiple physical Gigabit Ethernet ports into a logical link aggregate. HP-APA allows a flexible, scalable bandwidth based on multiple 100 Mbps Fast Ethernet links or multiple 1 Gbps Ethernet links (or 200 Mbps and 2 Gbps full duplex respectively). Its other benefits include load balancing between physical links, automatic fault detection, and recovery for environments which require high availability. Port aggregation capability is sometimes referred to as link aggregation or trunking. APA is also supported on a dual-stack kernel.

Once enabled, each link aggregate can be viewed as a single logical link of multiple physical ports with only one IP and MAC address. HP-APA can aggregate up to four physical ports into one link aggregate; the number of link aggregates allowed per system is 50. Empty link aggregates will have zero MAC addresses. You can aggregate the ports within a multi-ported networking card (cards with up to four ports are currently available).
Alternatively, you can aggregate ports from different cards. Figure 3-19 “Aggregated Networking Ports” shows two examples. Both the Single and Dual ported LANs in the non-aggregated configuration have four LAN cards, each associated with a separate non-aggregated IP address and MAC address, and each with its own LAN name (lan0, lan1, lan2, lan3). When these ports are aggregated, all four ports are associated with a single IP address and MAC address. In this example, the aggregated ports are collectively known as lan900, the name by which the aggregate is known on HP-UX 11i (on HP-UX 11.0, the aggregates would begin with lan100).

Various combinations of Ethernet card types (single or dual-ported) and aggregation groups are possible, but it is vitally important to remember that at least two physical cards must be used in any combination of APAs to avoid a single point of failure for heartbeat connections. HP-APA currently supports both automatic and manual configuration of link aggregates. For information about implementing APA with Serviceguard, see HP Auto Port Aggregation (APA) Support Guide and other APA documents posted at http://docs.hp.com in the Networking and Communications collection.

Virtual LAN configuration using HP-UX VLAN software is now supported in Serviceguard clusters. VLAN is also supported on a dual-stack kernel. Virtual LAN (or VLAN) is a networking technology that allows grouping of network nodes based on an association rule regardless of their physical locations. Specifically, VLAN can be used to divide a physical LAN into multiple logical LAN segments or broadcast domains. Each interface in a logical LAN will be assigned a tag id at the time it is configured. VLAN interfaces that share the same tag id can communicate with each other as if they were on the same physical network. The advantages of creating virtual LANs are to reduce broadcast traffic, increase network performance and security, and improve manageability.
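The tag id rule above (interfaces sharing a tag id behave as if on one physical network) can be illustrated by grouping interfaces by tag. This is a conceptual sketch only; the interface names and tag values are made up for the example, and the function is not part of HP-UX VLAN or Serviceguard.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_vlan_tag(
    interfaces: Iterable[Tuple[str, int]],
) -> Dict[int, List[str]]:
    """Group (interface name, tag id) pairs into logical LAN segments.

    Interfaces listed under the same tag id form one broadcast domain.
    """
    segments: Dict[int, List[str]] = defaultdict(list)
    for name, tag_id in interfaces:
        segments[tag_id].append(name)
    return dict(segments)
```

Given hypothetical interfaces `("node1-a", 10)`, `("node2-a", 10)` and `("node1-b", 20)`, the first two land in the same segment and can exchange traffic directly, while the third is isolated in its own segment.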
On HP-UX, initial VLAN association rules are port-based, IP-based, and protocol-based. Multiple VLAN interfaces can be configured from a physical LAN interface and then appear to applications as regular network interfaces. IP addresses can then be assigned on these VLAN interfaces to form their own subnets. Please refer to the document Using HP-UX VLAN (T1453-90001) for more details on how to configure VLAN interfaces. VLAN is supported with Serviceguard starting from A.11.14 on HP-UX 11i. The support of VLAN is similar to other link technologies. VLAN interfaces can be used as heartbeat as well as data networks in the cluster. The Network Manager will monitor the health of VLAN interfaces configured in the cluster, and perform local and remote failover of VLAN interfaces when failure is detected. The failure of VLAN interfaces typically occurs when the physical NIC port, upon which they are created, has failed. HP-UX allows up to 1024 VLANs to be created from a physical NIC port. Obviously, a large pool of system resources is required to accommodate such a configuration. With the availability of VLAN technology, Serviceguard may face potential performance degradation, high CPU utilization and memory shortage issues if there is a large number of network interfaces configured in each cluster node. To provide enough flexibility in VLAN networking, Serviceguard solutions should adhere to the following VLAN and general network configuration requirements:
VLAN technology allows greater flexibility in network configurations in the enterprise. To allow Serviceguard to run successfully in such dynamic environments while maintaining its reliability and availability features, the existing heartbeat rules must continue to be applied, with additional restrictions when VLAN interfaces are present in the cluster: