Managing Serviceguard Version A.11.16, Eleventh Edition, Second Printing > Chapter 1 Serviceguard at a Glance

What is Serviceguard?

Serviceguard allows you to create high availability clusters of HP 9000 or HP Integrity servers. A high availability computer system allows application services to continue in spite of a hardware or software failure. Highly available systems protect users from software failures as well as from failure of a system processing unit (SPU), disk, or local area network (LAN) component. In the event that one component fails, the redundant component takes over. Serviceguard and other high availability subsystems coordinate the transfer between components.

A Serviceguard cluster is a networked grouping of HP 9000 or HP Integrity servers (host systems known as nodes) having sufficient redundancy of software and hardware that a single point of failure will not significantly disrupt service. Application services (individual HP-UX processes) are grouped together in packages; in the event of a single service, node, network, or other resource failure, Serviceguard can automatically transfer control of the package to another node within the cluster, allowing services to remain available with minimal interruption.

In Figure 1-1 “Typical Cluster Configuration”, node 1 (one of two SPUs) is running package A, and node 2 is running package B. Each package has a separate group of disks associated with it, containing data needed by the package's applications, and a mirror copy of the data. Note that both nodes are physically connected to both groups of mirrored disks. However, only one node at a time may access the data for a given group of disks. In the figure, node 1 is shown with exclusive access to the top two disks (solid line), and node 2 is shown as connected without access to the top disks (dotted line). Similarly, node 2 is shown with exclusive access to the bottom two disks (solid line), and node 1 is shown as connected without access to the bottom disks (dotted line).

Mirror copies of data provide redundancy in case of disk failures. In addition, a total of four data buses are shown for the disks that are connected to node 1 and node 2. This configuration provides the maximum redundancy and also gives optimal I/O performance, since each package is using different buses.

Note that the network hardware is cabled to provide redundant LAN interfaces on each node. Serviceguard uses TCP/IP network services for reliable communication among nodes in the cluster, including the transmission of heartbeat messages: signals from each functioning node that are central to the operation of the cluster. TCP/IP services are also used for other types of inter-node communication. (The heartbeat is explained in more detail in the chapter “Understanding Serviceguard Software.”)
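
As a rough illustration of how heartbeat messages can be used to notice that a node has stopped functioning, the following Python sketch records the time of the last heartbeat received from each node and reports any node that has been silent for longer than a timeout. This is not Serviceguard's implementation; the class name, method names, and the five-second timeout are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HeartbeatMonitor:
    """Illustrative heartbeat tracker (hypothetical, not Serviceguard code)."""

    timeout_seconds: float  # assumed value; chosen by the example, not by the manual
    last_seen: Dict[str, float] = field(default_factory=dict)

    def record_heartbeat(self, node: str, now: float) -> None:
        # Remember when the most recent heartbeat from this node arrived.
        self.last_seen[node] = now

    def failed_nodes(self, now: float) -> List[str]:
        # A node is suspected to have failed if its last heartbeat is
        # older than the timeout.
        return sorted(
            node
            for node, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


monitor = HeartbeatMonitor(timeout_seconds=5.0)
monitor.record_heartbeat("node1", now=0.0)
monitor.record_heartbeat("node2", now=0.0)
monitor.record_heartbeat("node1", now=4.0)  # node2 sends nothing further
suspected = monitor.failed_nodes(now=6.0)
```

In this run, only node2 has been silent for longer than the timeout at time 6.0, so it is the only node reported.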

Failover

Under normal conditions, a fully operating Serviceguard cluster simply monitors the health of the cluster's components while the packages are running on individual nodes. Any host system running in the Serviceguard cluster is called an active node. When you create the package, you specify a primary node and one or more adoptive nodes. When a node or its network communications fail, Serviceguard can transfer control of the package to the next available adoptive node. This situation is shown in Figure 1-2 “Typical Cluster After Failover”.

Figure 1-2 Typical Cluster After Failover
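
The selection of "the next available adoptive node" can be pictured with a small Python sketch. It assumes, purely for illustration, that a package lists its adoptive nodes in order of preference and that the set of currently available nodes is known; the `Package` class and `choose_node` function are hypothetical names, not part of Serviceguard.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Package:
    """Hypothetical package description used only in this example."""

    name: str
    primary_node: str
    adoptive_nodes: List[str]  # assumed to be in order of preference


def choose_node(package: Package, available: Set[str]) -> Optional[str]:
    # Prefer the primary node; otherwise take the first adoptive node
    # that is still available. Return None if no candidate is available.
    for node in [package.primary_node, *package.adoptive_nodes]:
        if node in available:
            return node
    return None


pkg_a = Package(name="pkgA", primary_node="node1", adoptive_nodes=["node2"])
normal = choose_node(pkg_a, {"node1", "node2"})   # both nodes healthy
after_failure = choose_node(pkg_a, {"node2"})     # node1 has failed
```

With both nodes available, the package stays on node1; once node1 is unavailable, the same call selects node2, which matches the situation in Figure 1-2.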

After this transfer, the package typically remains on the adoptive node as long as the adoptive node continues running. If you wish, however, you can configure the package to return to its primary node as soon as the primary node comes back online. Alternatively, you may manually transfer control of the package back to the primary node at the appropriate time.
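
The two behaviors described above (stay on the adoptive node, or return to the primary node when it comes back) can be sketched as a simple policy decision. The enum and function below are illustrative names chosen for this example and should not be read as Serviceguard configuration parameters.

```python
from enum import Enum


class FailbackPolicy(Enum):
    """Hypothetical policy values for this sketch."""

    MANUAL = "manual"        # package stays put until an administrator moves it
    AUTOMATIC = "automatic"  # package returns when the primary node is back


def node_after_primary_returns(
    current_node: str, primary_node: str, policy: FailbackPolicy
) -> str:
    # Called when the primary node rejoins the cluster: decide where the
    # package should run from now on.
    if policy is FailbackPolicy.AUTOMATIC:
        return primary_node
    return current_node


stays = node_after_primary_returns("node2", "node1", FailbackPolicy.MANUAL)
returns = node_after_primary_returns("node2", "node1", FailbackPolicy.AUTOMATIC)
```

Under the manual policy the package remains on node2 until someone moves it; under the automatic policy it goes back to node1.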

Figure 1-2 “Typical Cluster After Failover” does not show the power connections to the cluster, but these are important as well. To remove all single points of failure from the cluster, you should provide as many separate power circuits as needed to prevent a single point of failure of your nodes, disks, and disk mirrors. Each power circuit should be protected by an uninterruptible power source. For more details, refer to the section on “Power Supply Planning” in Chapter 4, “Planning and Documenting an HA Cluster.”

Serviceguard is designed to work in conjunction with other high availability products, such as MirrorDisk/UX or VERITAS Volume Manager, which provide disk redundancy to eliminate single points of failure in the disk subsystem; Event Monitoring Service (EMS), which lets you monitor and detect failures that are not directly handled by Serviceguard; disk arrays, which use various RAID levels for data protection; and HP-supported uninterruptible power supplies (UPS), such as HP PowerTrust, which eliminate failures related to power outages. These products are highly recommended along with Serviceguard to provide the greatest degree of availability.
