Managing Serviceguard Version A.11.16, Eleventh Edition, Second Printing > Chapter 8 Troubleshooting Your Cluster

Replacing Disks

The procedure for replacing a faulty disk mechanism depends on the type of disk configuration you are using. Separate descriptions are provided for replacing an array mechanism and a disk in a high availability enclosure.

Replacing a Faulty Array Mechanism

With any HA disk array configured in RAID 1 or RAID 5, refer to the array’s documentation for instructions on how to replace a faulty mechanism. After the replacement, the array itself automatically rebuilds the missing data on the new disk. No LVM activity is needed. This process is known as hot swapping the disk.

Replacing a Faulty Mechanism in an HA Enclosure

If you are using software mirroring with MirrorDisk/UX and the mirrored disks are mounted in a high availability disk enclosure, you can use the following steps to hot plug a disk mechanism:

  1. Identify the physical volume name of the failed disk and the name of the volume group in which it was configured. In the following examples, the volume group name is shown as /dev/vg_sg01 and the physical volume name is shown as /dev/dsk/c2t3d0. Substitute the volume group and physical volume names that are correct for your system.

  2. Identify the names of any logical volumes that have extents defined on the failed physical volume.

  3. On the node on which the volume group is currently activated, use the following command for each logical volume that has extents on the failed physical volume:

    lvreduce -m 0 /dev/vg_sg01/lvolname /dev/dsk/c2t3d0

  4. At this point, remove the failed disk and insert a new one. The new disk will have the same HP-UX device name as the old one.

  5. On the node from which you issued the lvreduce command, issue the following command to restore the volume group configuration data to the newly inserted disk:

    vgcfgrestore -n /dev/vg_sg01 /dev/dsk/c2t3d0

  6. Issue the following command for each logical volume that had extents on the failed physical volume, to extend its mirror to the newly inserted disk:

    lvextend -m 1 /dev/vg_sg01/lvolname /dev/dsk/c2t3d0

  7. Finally, use the lvsync command for each logical volume that had extents on the failed physical volume. This synchronizes the extents of the new disk with the extents of the other mirror.

    lvsync /dev/vg_sg01/lvolname  
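
The command sequence in steps 3 through 7 can be sketched as a small dry-run script. This is a minimal sketch, not an HP-supplied tool: the function name print_disk_replace_cmds and the logical volume names lvol1 and lvol2 are hypothetical, and the script only prints the commands so that they can be reviewed before being run by hand. It assumes the names of the affected logical volumes are already known from step 2, and it passes the full logical volume path to lvextend, since lvextend operates on a logical volume.

```shell
#!/bin/sh
# Dry-run sketch: prints the LVM commands from steps 3-7; it executes nothing.
# print_disk_replace_cmds is a hypothetical helper, not an HP-UX utility.
print_disk_replace_cmds() {
    vg=$1    # volume group, for example /dev/vg_sg01
    pv=$2    # failed physical volume, for example /dev/dsk/c2t3d0
    shift 2  # remaining arguments: logical volume names with extents on $pv

    # Step 3: remove the mirror copy that resides on the failed disk.
    for lv in "$@"; do
        echo "lvreduce -m 0 $vg/$lv $pv"
    done

    # Step 4 is a manual action: swap the disk mechanism.
    echo "# replace the disk at $pv now"

    # Step 5: restore the volume group configuration to the new disk.
    echo "vgcfgrestore -n $vg $pv"

    # Step 6: re-create each mirror on the new disk.
    for lv in "$@"; do
        echo "lvextend -m 1 $vg/$lv $pv"
    done

    # Step 7: synchronize each mirrored logical volume.
    for lv in "$@"; do
        echo "lvsync $vg/$lv"
    done
}

# Example with two hypothetical logical volumes.
print_disk_replace_cmds /dev/vg_sg01 /dev/dsk/c2t3d0 lvol1 lvol2
```

Printing rather than executing keeps the manual disk swap in step 4 as an explicit pause between the lvreduce and vgcfgrestore commands.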

Replacing a Lock Disk

Replacing a failed lock disk mechanism is the same as replacing a data disk. If you are using a dedicated lock disk (one with no user data on it), then you need to issue only one LVM command, as in the following example:

# vgcfgrestore -n /dev/vg_lock /dev/dsk/c2t3d0

Serviceguard checks the lock disk once an hour. After running the vgcfgrestore command, monitor the syslog file of an active cluster node. Within one hour, a message should appear showing that the lock disk is healthy again.
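
The syslog review can be sketched as a simple filter. This is a minimal sketch under stated assumptions: the function name lock_disk_messages is hypothetical, /var/adm/syslog/syslog.log is the usual HP-UX syslog location, and the search pattern (lines from the cmcld daemon that mention a lock) is a guess, because the exact wording of the message is not given here.

```shell
#!/bin/sh
# Sketch: list syslog lines that mention both cmcld and a lock.
# The pattern is an assumption; adjust it to the messages seen on your system.
SYSLOG=${SYSLOG:-/var/adm/syslog/syslog.log}   # usual HP-UX syslog location

# lock_disk_messages is a hypothetical helper.
lock_disk_messages() {
    # $1: syslog file to search; prints matching lines in file order.
    grep -i 'cmcld.*lock' "$1"
}

# Example: show the five most recent matches if the file is readable.
if [ -r "$SYSLOG" ]; then
    lock_disk_messages "$SYSLOG" | tail -n 5
fi
```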

On-line Hardware Maintenance with In-line SCSI Terminator

In some shared SCSI bus configurations, on-line SCSI disk controller hardware repairs can be made if HP in-line terminator (ILT) cables are used. In-line terminator cables are supported with most SCSI-2 Fast-Wide configurations.

In-line terminator cables are supported with Ultra2 SCSI host bus adapters only when used with the SC10 disk enclosure. This is because the SC10 operates at slower SCSI bus speeds, which are safe for the use of ILT cables. In-line terminator cables are not supported for use in any Ultra160 or Ultra3 SCSI configuration, since the higher SCSI bus speeds can cause silent data corruption when the ILT cables are used.

The in-line terminator cable is available in a number of form factors; in each case, the connector at one end contains SCSI termination, which is used to terminate the end of the SCSI bus instead of the termination on the SCSI host bus adapter (HBA) in the node. If the terminated end of an ILT cable is connected to an HBA, then termination must be disabled on that HBA. Depending on the HBA, termination is disabled by removing the termination resistor packs, by setting the appropriate DIP switches, or programmatically. (Consult the documentation for the HBA to see which method applies.)

ILT cables can be used in combination with Y-cables to allow additional nodes to be connected to the shared SCSI bus. Some SCSI cables available from HP have combined ILT and Y-cable functionality. Any node connected to the middle connector of a Y-cable must also have SCSI termination disabled on its HBA.

When an ILT cable is used, it is possible to physically disconnect a host from the end of the shared SCSI bus without breaking the bus’s termination, allowing the remaining nodes in the cluster to continue to access the shared SCSI bus while the repairs are being made.

Similarly, it is possible to physically disconnect a host from the middle connector of a Y-cable on a shared SCSI bus without breaking the bus’s termination.

Whether you are using ILT cables or Y-cables, however, it is strongly recommended that you do not try to physically reconnect an HBA to the shared SCSI bus without first halting all nodes connected to that bus. It is very easy to inadvertently short a connector pin against the chassis ground, which would damage the other HBAs connected to the shared SCSI bus and bring the entire bus down.

NOTE: You cannot use in-line terminators with internal FW/SCSI buses on D and K series systems, and you cannot use an in-line terminator with single-ended SCSI buses. You must not use an in-line terminator to connect a node to a Y cable.

Figure 8-1 “F/W SCSI-2 Buses with In-line Terminators” shows a three-node cluster with two F/W SCSI-2 buses. The solid line and the dotted line represent different buses, both of which have in-line terminators attached to nodes 1 and 3. Y cables are also shown attached to node 2.

Figure 8-1 F/W SCSI-2 Buses with In-line Terminators

The use of in-line SCSI terminators allows you to do hardware maintenance on a given node by temporarily moving its packages to another node and then halting the original node while its hardware is serviced. Following the replacement, the packages can be moved back to the original node.

Use the following procedure to disconnect a node that is attached to the bus with an in-line SCSI terminator or with a Y cable:

  1. Move any packages on the node that requires maintenance to a different node.

  2. Halt the node that requires maintenance. The cluster will re-form, and activity will continue on other nodes. Packages on the halted node will switch to other available nodes if they are configured to switch.

  3. Disconnect the power to the node.

  4. Disconnect the node from the in-line terminator cable or Y cable if necessary. The other nodes accessing the bus will encounter no problems as long as the in-line terminator or Y cable remains connected to the bus.

  5. Replace or upgrade hardware on the node, as needed.

  6. Halt all nodes connected to the shared SCSI bus, and power them down.

  7. Reconnect the node to the in-line terminator cable or Y cable if necessary.

  8. Power on the nodes in the cluster. If AUTOSTART_CMCLD is set to 1 in the /etc/rc.config.d/cmcluster file, the nodes will automatically start the cluster and the packages.

  9. If necessary, move packages back to the node from their alternate locations and restart them.
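
The Serviceguard commands behind steps 1, 2, 8, and 9 can be sketched as follows. This is a minimal dry-run sketch, not part of the product: the function names, the package name pkg1, and the node names node1 and node2 are hypothetical. The script only prints the cmhaltpkg, cmrunpkg, cmmodpkg, and cmhaltnode commands for review, and the autostart check looks for an unquoted AUTOSTART_CMCLD=1 line in the file it is given.

```shell
#!/bin/sh
# Dry-run sketch: prints Serviceguard commands; it executes nothing.
# Function, package, and node names are hypothetical.

# Steps 1-2: move a package off the node, then halt the node.
print_halt_for_maintenance() {
    pkg=$1; from=$2; to=$3
    echo "cmhaltpkg $pkg"
    echo "cmrunpkg -n $to $pkg"
    echo "cmmodpkg -e $pkg"      # re-enable package switching
    echo "cmhaltnode $from"
}

# Step 8: succeed if the given file sets AUTOSTART_CMCLD to 1.
autostart_enabled() {
    grep -q '^AUTOSTART_CMCLD=1' "$1"
}

# Step 9: move the package back to the serviced node.
print_move_back() {
    pkg=$1; node=$2
    echo "cmhaltpkg $pkg"
    echo "cmrunpkg -n $node $pkg"
    echo "cmmodpkg -e $pkg"
}

# Example with hypothetical names.
print_halt_for_maintenance pkg1 node1 node2
print_move_back pkg1 node1
```

On a cluster node, autostart_enabled /etc/rc.config.d/cmcluster indicates whether the nodes can be expected to rejoin the cluster automatically after power-on in step 8.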

© Hewlett-Packard Development Company, L.P.