When the following procedures are completed, an adoptive node
will be able to access the data belonging to a package after it
fails over.

Setting up the Hardware

Ensure that the XP Series disk arrays are correctly cabled using PV links
to each node in the cluster that will run packages accessing data
on the array.

Configure the XP disk array for synchronous or asynchronous operation.
If you are using a fence level of ASYNC, you must configure a side file using
the Service Processor (SVP) attached to the XP system. Synchronous
operation does not require side file configuration on the SVP.

Use the ioscan command to determine what devices on the XP disk array
have been configured as command devices. The device-specific information
in the rightmost column of the ioscan output will have the suffix -CM for
these devices, for example, OPEN-3-CM.

If there are no configured command devices on
the disk array, you must create two before proceeding. Each command
device must have alternate links (PV links). The first command device
is the primary command device. The second command device is a redundant command
device and is used only upon failure of the primary command device.
The command devices must be mapped to the various host interfaces
by using the SVP (disk array console) or a remote console.

Primary (PVOL) and secondary (SVOL) volumes must
be correctly defined and assigned to the appropriate nodes in the
XP hardware configuration.

Primary devices (PVOLs) must be locally protected
(RAID 1 or RAID 5).

Secondary devices (SVOLs) must be locally protected (RAID
1 or RAID 5).
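
The command device check described above can be scripted. The following is a minimal sketch, assuming that the description (for example, OPEN-3-CM) is the last field on each hardware line of the ioscan output and that ioscan is invoked as ioscan -fnC disk. The function name list_command_devices is made up for illustration and is not part of HP-UX or Raid Manager.

```shell
# Print only the ioscan lines whose description ends in -CM
# (command devices), reading ioscan-style text from standard input.
# list_command_devices is a hypothetical helper name.
list_command_devices() {
    awk '$NF ~ /-CM$/'
}

# Assumed usage on a cluster node:
#   ioscan -fnC disk | list_command_devices
```

If the function prints nothing, no command devices are configured and two must be created before proceeding.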
Setting Fence Levels

All devices defined in a given device
group must be configured with the same fence level. A fence level
of DATA or NEVER results in synchronous data replication; a fence
level of ASYNC is used to enable asynchronous data replication.

Fence level = NEVER should only be used when the availability
of the application is more important than the data currency on the
remote XP disk array. If all CA links fail, the application
will continue to modify the data on the PVOL side; however, the new data
is not replicated to the SVOL side. The SVOL only contains a copy
of the data up to the point of the CA link failure. If an additional
failure, such as a system failure before the CA link is fixed, causes
the application to fail over to the SVOL side, the application will
have to deal with non-current data.

If Fence level = NEVER is used, the data may be inconsistent
in the case of a rolling disaster, that is, additional failures taking place
before the system has completely recovered from a previous failure.
See an example of a rolling disaster in the following section "Fence
Level of DATA".

Fence level = DATA is recommended to ensure a current and consistent copy
of the data on all sides. If Fence level = DATA is not enabled, the data
may be inconsistent in the case of a rolling disaster (additional failures
taking place before the system has completely recovered from a previous
failure).

Fence level = DATA is recommended to ensure that there is no possibility of
inconsistent data at the SVOL side in case of CA link failure.
Since only dedicated CA links are supported, the probability of
intermittent link failure is extremely low. Therefore, the probability
of inconsistent data at the remote (SVOL) side is extremely low.
However, inconsistent and therefore unusable data will result from
the following sequence of circumstances:

Fence level = DATA is not enabled.
The application continues to modify data.
Resynchronization from PVOL to SVOL starts, but does not finish.

Although the risk of this sequence of events taking place is extremely low,
if your business cannot afford even this quite small risk, then
you must enable Fence level = DATA to ensure that the data at the SVOL side are always
consistent.

The disadvantage of enabling Fence level = DATA is that when the CA link fails, or if the entire
remote (SVOL) data center fails, all I/Os will be refused (to those
devices) until the CA link is restored, or manual intervention is
undertaken to split the PVOL from the SVOL. Applications may fail
or may continuously retry the I/Os (depending on the application)
if Fence level = DATA is enabled and the CA link fails.

Fence level = ASYNC is recommended to improve performance in data replication
between the primary and the remote site.

NOTE: If data currency is required on all sides, Fence level
= DATA should be used.

The XP disk array supports asynchronous mode with guaranteed ordering.
When the host does a write I/O to the XP disk array, as soon as the
data is written to cache, the array sends a reply to the host. A
copy of the data with a sequence number is saved in an internal
buffer, known as the side file, for later
transmission to the remote XP disk array. When synchronous replication
is used, the primary system cannot complete a transaction until
a message is received acknowledging that data has been written to
the remote site. With asynchronous replication, the transaction
is completed once the data is written to the side file on the primary
system, which allows I/O activity to continue even if the CA link
is temporarily unavailable.

The side file is a portion of cache, from 30% to 70% (default 50%), that is
assigned through the XP system's Service Processor (SVP).
The high water mark (HWM) is 30% of the cache;
if the quantity of data in the side file exceeds this value, write
I/O to the side file is delayed, starting at 0.5 seconds
and increasing in 500 ms increments with every 5% increase over
the HWM, up to a maximum of 4 seconds.

If the side file continues to grow, it will eventually hit the side
file threshold (30% to 70% of cache). When this limit
has been reached, the XP on the primary site cannot write to the
XP on the secondary site until there is enough room in the side
file. Before continuing to write, the primary XP will wait until
there is enough room in the side file, and will keep trying until
it reaches its side file timeout value, which
is configured through the SVP. If the timeout has been reached, then
the primary XP disk array will begin tracking data on its bitmap,
which will be copied over to the secondary volume during resync.

The side file operation is shown in Figure 3-9 “XP
Series Disk Array Side File”.
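
The delay rule for writes above the high water mark can be worked through with a small calculation. The following is a rough sketch under one reading of the rule, which is an assumption: no delay up to the HWM (30% of cache), 500 ms as soon as the HWM is exceeded, a further 500 ms for each full 5% above the HWM, and a cap of 4000 ms. The function name side_file_delay_ms is hypothetical and is not an XP or Raid Manager command; the array firmware may apply the steps differently.

```shell
# Estimate the write delay (in ms) for a given side file fill level,
# given as a whole-number percentage of cache.
# Assumed reading of the rule: 0 ms up to the HWM (30%), 500 ms once
# the HWM is exceeded, +500 ms per further full 5%, capped at 4000 ms.
# side_file_delay_ms is a hypothetical helper name.
side_file_delay_ms() {
    pct=$1
    hwm=30
    if [ "$pct" -le "$hwm" ]; then
        echo 0
        return
    fi
    delay=$(( 500 * (1 + (pct - hwm) / 5) ))
    if [ "$delay" -gt 4000 ]; then
        delay=4000
    fi
    echo "$delay"
}
```

Under these assumptions, a side file at 42% of cache would see a delay of 1500 ms, and the 4000 ms cap is reached at 65% of cache.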
In asynchronous mode, when there is a CA link failure, both
the PVOL and SVOL sides change to a PSUE state. When the SVOL side
detects missing data blocks from the PVOL side, it will wait for
those data blocks from the PVOL side until it has reached the configured
CA link timeout value (set in the SVP). Once this timeout value
has been reached, the SVOL side will change to a PSUE state.
The default CA link timeout value is 5 minutes (300 seconds).

Limitations of Asynchronous Mode

The following are restrictions for an asynchronous CT group
in a Raid Manager configuration file:

Asynchronous device groups cannot
be defined to extend across multiple XP Series disk arrays.

When making paired volumes, the Raid Manager registers
a CTGID to the XP Series disk array automatically at paircreate time,
and the device group in the configuration file
is mapped to a CTGID. Efforts to create a CTGID with a number higher
than the XP model supports will be terminated with a return value of EX_ENOCTG.

MetroCluster/CA will support only one consistency
group per package. This means that in one metropolitan cluster,
the number of packages that can be configured to use a consistency group
is limited either by the maximum number of consistency groups
supported by the XP model in the configuration or by the maximum
number of packages in the cluster, whichever is smaller.

Other Considerations on Asynchronous Mode

The following are some additional considerations when using asynchronous
mode:

When adding a new volume to
an existing device group, the new volume state is SMPL. The XP disk
array controller (DKC) is smart enough to do the paircreate only on the new volume.
If the device group has mixed
volume states like PAIR and SMPL, the pairvolchk returns EX_ENQVOL, and horctakeover will fail.

If you change the LDEV number associated with a
given target/LUN, you must restart all the Raid Manager instances
even though the Raid Manager configuration file is not modified.

Any firmware update, cache expansion, or board change
requires a restart of all Raid Manager instances.

pairsplit for asynchronous mode may take a long time depending on
how long the synchronization takes. There is a potential for the
CA link to fail while pairsplit is in progress. If this happens, pairsplit will fail
with a return code of EX_EWSUSE.

In most cases, MetroCluster/CA in asynchronous mode
will behave the same as when the fence level is set to NEVER in synchronous mode.

Installing the Necessary Software

Before any configuration can begin, you need to perform the
following installation tasks on all nodes:

Install Raid Manager XP, which allows you to manage the XP series disk arrays
from the node. Refer to the installation instructions in the Raid
Manager XP User's Guide.

Edit the /etc/services file, adding an entry for
the Raid Manager XP instance to be used with MetroCluster/CA in
the format horcm<instance-number> <port-number>/udp. For example:

horcm0 11000/udp #Raid Manager instance 0

See the file /opt/cmcluster/toolkit/SGCA/Samples/services.example.

Install MetroCluster with Continuous Access XP on
all nodes according to the instructions in the MetroCluster
with Continuous Access XP Release Notes.

Install MC/ServiceGuard if it is not already present,
and configure the MC/ServiceGuard cluster according to the procedures
outlined in Managing MC/ServiceGuard. The MAX_CONFIGURED_PACKAGES parameter
in the cluster configuration file should be set to 1 or more, depending
on the number of packages that will run on the cluster.
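
The /etc/services edit described above has to be repeated on every node, so it may be worth scripting. The following is a minimal sketch, assuming the entry format shown in the example; the function name add_horcm_service is hypothetical. It appends the entry only if no entry for that Raid Manager instance is already present.

```shell
# Append a Raid Manager entry to a services file unless an entry for
# that instance already exists. add_horcm_service is a hypothetical
# helper; the entry format follows the horcm0 example in the text.
add_horcm_service() {
    file=$1
    inst=$2
    port=$3
    if grep -q "^horcm${inst}[[:space:]]" "$file"; then
        return 0
    fi
    printf 'horcm%s\t%s/udp\t# Raid Manager instance %s\n' \
        "$inst" "$port" "$inst" >> "$file"
}

# Assumed usage, as root on each node:
#   add_horcm_service /etc/services 0 11000
```

Running the function a second time with the same instance number leaves the file unchanged.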

Creating the Raid Manager Configuration

The Raid Manager configuration file must be edited and customized
on each node that is attached to one of the XP Series disk arrays.
The file is named using the following convention:

horcm<instance number>.conf

All MetroCluster packages must use the same Raid Manager instance, and
must be configured in the same configuration file. In the examples
in this chapter, instance zero is assumed, which is configured in
file horcm0.conf.

Here are the steps to follow for creating the configuration:

Copy the default Raid Manager
configuration file to an instance-specific name:

# cp /etc/horcm.conf /etc/horcm0.conf

Create a minimum Raid Manager configuration file
by editing the following sections of the file created in the previous
step:

- HORCM_MON
Enter the host name of the
system on which you are editing and the TCP/IP port number specified for
this Raid Manager instance in the /etc/services file.

- HORCM_CMD
Enter the primary and alternate
link device file names for both primary and redundant command devices
(for a total of four raw device file names).

WARNING! Make sure that the redundant command device is not on
the same physical device as the primary command
device. Also, make sure that the two command devices are on different buses
inside the XP Series disk array.

If the Raid Manager protection facility is enabled,
set the HORCMPERM environment variable to the HORCM permission
file and export it:

# export HORCMPERM=/etc/horcmperm0.conf

If the Raid Manager protection facility is not used
or is disabled, export the HORCMPERM environment variable
as follows:

# export HORCMPERM=MGRNOINST

Start the Raid Manager instance by using the command
horcmstart.sh <instance-#> as in the following example:

# horcmstart.sh 0

Export the environment variable that specifies the
Raid Manager instance to be used by the Raid Manager commands.
For example, with the POSIX shell, type:

# export HORCMINST=0

Now, you can use Raid Manager commands to get further information
from the disk arrays. To verify the software revision of the Raid Manager
and the firmware revision of the XP Series disk array, use the command raidqry -l.

NOTE: Ensure that you have the minimum requirement level for
the XP and the Raid Manager software and firmware listed in the MetroCluster/CA
release notes.

Obtain a list of the available devices on the disk
arrays using the raidscan command. This command must be invoked separately for each
host interface connection to the disk array. For example, if there
are two Fibre Channel host adapters, you might use the following
commands:

# raidscan -p CL1-A
# raidscan -p CL1-B

NOTE: There must also be alternate links for each device,
and these alternate links must be on different buses inside the
XP Series disk array. These alternate links, for example, may be
CL2-E and CL2-F.

Unless the devices have been previously paired either on this
or another host, the devices will show up as SMPL (simplex). Paired devices
will show up as PVOL (primary volume) or SVOL (secondary volume).

Determine which devices will be used by the application
package. Define a device group that contains all of these devices.
The device group name (dev_group) is user-defined and must be the
same on each host in the MetroCluster that accesses the XP Series
disk array. It is recommended that you use a name that is easily
associated with the package. For example, a device group name of "db-payroll" is easily
associated with the database for the payroll application. A device
group name of "group1" would be more difficult
to relate to an application. The device group name MUST be
unique within the cluster.

The device name (dev_name) is also user-defined and must be
the same on each host in the MetroCluster that accesses the XP Series disk
array. The device name (dev_name) must be unique
among all devices in the cluster. However, the TargetID and LU#
fields for each device name may be different on different hosts
in the cluster, to allow for different hardware I/O paths on different
hosts.

Edit the following sections of the Raid Manager
configuration file that was created in a previous step:

- HORCM_DEV
Include the devices and device
group used by the application package. Only one device group may
be specified for all of the devices that belong
to a single application package.

- HORCM_INST
Supply the names of only
those hosts that are attached to the XP Series disk array
that is remote from the disk array directly
attached to this host. For example, with a MetroCluster of 6 nodes,
2 of which are Arbitrators, you would specify only hosts 3 and 4
in the HORCM_INST section. Host 1 would have previously been specified
in the HORCM_MON section.

See the file horcm0.conf.<sys-name> in /opt/cmcluster/toolkit/SGCA/Samples/
for an example.

Restart the Raid Manager instance so that the new
information in the configuration file is read. Use the following
commands:

# horcmshutdown.sh <instance-#>
# horcmstart.sh <instance-#>

Repeat these steps on each
host that will run this particular application package. If a host
may run more than one application package, you must incorporate
device group and host information for each of these packages. Note
that the Raid Manager configuration file must be different for each
host, especially for the HORCM_MON and HORCM_INST fields.

The HORCM_MON section of the file is unique for each node in
all clusters that are attached to an XP Series disk array. Enter
the host name or IP address followed by the name of the Raid Manager instance
that is monitoring the MetroCluster packages on that node (horcm0
in the current example).

If you have not already done so, use the paircreate command to create
the device groups that are listed in
the Raid Manager configuration files. See the Raid Manager
User's Guide or view the man page for paircreate for more information. Example:

# paircreate -g db_payroll -f data -vl -c15

WARNING! Paired devices must be of compatible sizes and types.

Sample Raid Manager Configuration File

The following is an example of a Raid Manager configuration
file for one node (ftsys1).

#
# horcm0.conf.ftsys1
# - This is an example Raid Manager configuration file for node ftsys1.
# Note that this configuration file is for Raid Manager instance 0,
# which can be determined by the "0" in the filename "horcm0.conf".
#
# Whenever this configuration file is changed, you must stop and restart the
# instance of Raid Manager before the changes will be recognized. This can
# be done using the following commands:
#
#     horcmshutdown.sh <instance>
#     horcmstart.sh <instance>
#
# After restarting the Raid Manager instance, you should confirm that there
# are no configuration errors reported by running the pairdisplay command
# with the "-c" option.
#
# NOTE: The Raid Manager command device (HORCM_CMD) cannot be used for
# data storage (it is reserved for private Raid Manager usage).
#/************************ HORCM_MON *************************************/
#
# The HORCM_MON parameter is used for monitoring and control of device groups
# by the Raid Manager.
# It is used to define the IP address, port number, and paired volume error
# monitoring interval for the local host.
# <ip_address>
#     Defines a network address used by the local host. This can be a host name
#     or an IP address.
# <service>
#     Specifies the port name assigned to the Raid Manager communication path,
#     which must also be defined in /etc/services. If a port number, rather
#     than a port name is specified, the port number will be used.
# <poll_interval>
#     Specifies the interval used for monitoring the paired volumes. By
#     increasing this interval, the Raid Manager daemon load is reduced.
#     If this interval is set to -1, the paired volumes are not monitored.
# <timeout>
#     Specifies the time-out period for communication with the Raid Manager
#     server.
HORCM_MON
#ip_address  service  poll_interval(10ms)  timeout(10ms)
ftsys1       horcm0   1000                 3000
#/************************* HORCM_CMD *************************************/
#
# The HORCM_CMD parameter is used to define the special files (raw device
# file names) of the Raid Manager command devices used for the monitoring
# and control of Raid Manager device groups.
# Define the special device files corresponding to two or more command devices
# in order to use the Raid Manager alternate command device feature. An
# alternate command device must be configured, otherwise a failure of a
# single command device could prevent access to the device group.
# Each command device must have alternate links (PVLinks). The first command
# device is the primary command device. The second command device is a
# redundant command device and is used only upon failure of the primary
# command device. The command devices must be mapped to the various host
# interfaces by using the SVP (disk array console) or a remote console.
HORCM_CMD
#Primary          Primary Alt-Link  Secondary         Secondary Alt-link
#dev_name         dev_name          dev_name          dev_name
/dev/rdsk/c4t1d0  /dev/rdsk/c5t1d0  /dev/rdsk/c4t0d1  /dev/rdsk/c5t0d1
#/************************* HORCM_DEV *************************************/
#
# The HORCM_DEV parameter is used to define the addresses of the physical
# volumes corresponding to the paired logical volume names. Each group
# name is a unique name used by the hosts which will access the volumes.
#
# The group and paired logical volume names defined here must be the same for
# all other (remote) hosts that will access this device group.
# The hardware SCSI bus, SCSI-ID, and LUNs for the device groups do not need
# to be the same on remote hosts.
#
# <dev_group>
#     This parameter is used to define the device group name for paired logical
#     volumes. The device group name is used by all Raid Manager commands for
#     accessing these paired logical volumes.
# <dev_name>
#     This parameter is used to define the names of the paired logical volumes
#     in the device group.
# <port#>
#     This parameter is used to define the XP256 port number used to access the
#     physical volumes in the XP256 connected to the "dev_name". Consult your
#     XP256 for valid Port numbers to specify here.
# <TargetID>
#     This parameter is used to define the SCSI target ID of the physical
#     volume on the port specified in "port#".
# <LUN#>
#     This parameter is used to define the SCSI logical unit number (LUN) of
#     the physical volume specified in "targetID".
HORCM_DEV
#dev_group  dev_name     port#  TargetID  LUN#
pkgA        pkgA_index   CL1-E  0         1
pkgA        pkgA_tables  CL1-E  0         2
pkgA        pkgA_logs    CL1-E  0         3
pkgB        pkgB_d1      CL1-E  0         4
pkgC        pkgC_d1      CL1-E  0         5
pkgD        pkgD_d1      CL1-E  0         2
#/************************* HORCM_INST ************************************/
#
# This parameter is used to define the network address (IP address or host
# name) of the remote hosts which can provide the remote Raid Manager access
# for each of the device group secondary volumes.
# The remote Raid Manager instances are required to get status or provide
# control of the remote devices in the device group. All remote hosts
# must be defined here, so that the failure of one remote host will not
# prevent obtaining status.
#
# <dev_group>
#     This is the same device group name as defined in dev_group of HORCM_DEV.
# <ip_address>
#     This parameter is used to define the network address of the remote hosts
#     with Raid Manager access to the device group. This can be either an
#     IP address or a host name.
# <service>
#     This parameter is used to specify the port name assigned to the Raid
#     Manager instance, which must be registered in /etc/services. If this is
#     a port number rather than a port name, then the port number will be used.
HORCM_INST
#dev_group  ip_address  service
pkgA        ftsys1a     horcm0
pkgA        ftsys2a     horcm0
pkgB        ftsys1a     horcm0
pkgB        ftsys2a     horcm0
pkgC        ftsys1a     horcm0
pkgC        ftsys2a     horcm0
pkgD        ftsys1a     horcm0
pkgD        ftsys2a     horcm0

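
Because each dev_name must be unique, a quick consistency check on a Raid Manager configuration file can catch copy-and-paste mistakes before the instance is restarted. The following is a minimal sketch, assuming the section layout shown in the sample file (a section keyword on its own line, followed by whitespace-separated fields, with # starting a comment). The function name find_duplicate_dev_names is hypothetical and is not a Raid Manager command, and the check covers only one file, not the whole cluster.

```shell
# Print every dev_name that appears more than once in the HORCM_DEV
# section of the configuration file given as the first argument.
# find_duplicate_dev_names is a hypothetical helper name.
find_duplicate_dev_names() {
    awk '
        /^[ \t]*#/ || NF == 0 { next }      # skip comments and blank lines
        /^HORCM_/ { section = $1; next }    # remember the current section
        section == "HORCM_DEV" { count[$2]++ }
        END { for (n in count) if (count[n] > 1) print n }
    ' "$1"
}

# Assumed usage:
#   find_duplicate_dev_names /etc/horcm0.conf
```

No output means that no dev_name is repeated within that file.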
Configuring Automatic Raid Manager Startup

After editing the Raid Manager configuration files and installing
them on the nodes that are attached to the XP Series disk arrays,
you should configure automatic Raid Manager startup on the same
nodes. You do this by editing the rc script /etc/rc.config.d/raidmgr.
Set the START_RAIDMGR parameter to 1, and define RAIDMGR_INSTANCE as the
number of the Raid Manager instance you
are using with MetroCluster. By default, this is zero (0).

An example of the edited startup file is shown below:

#*************************** RAIDMANAGER *************************
# MetroCluster with Continuous Access Toolkit script for configuring the
# startup parameters for a HP StorageWorks E Disk Array XP Raid Manager
# instance. The Raid Manager instance must be running before any
# MetroCluster package can start up successfully.
#
# @(#) $Revision: 1.8 $
#
# START_RAIDMGR:    If set to 1, this host will attempt to start up
#                   an instance of the Disk Array XP Raid Manager,
#                   which must be running before a MetroCluster package
#                   can be successfully started. If set to 0, this host
#                   will not attempt to start the Raid Manager.
#
# RAIDMGR_INSTANCE  This is the instance number of the Raid Manager
#                   instance to be started by this script. The instance
#                   number specified here must be the same as the
#                   instance number specified in the MetroCluster
#                   package control script.
#                   Consult your Raid Manager documentation for more
#                   information on Raid Manager instances.
#
# See the MetroCluster and Raid Manager documentation for more information
# on configuring this script.
#
START_RAIDMGR=1
RAIDMGR_INSTANCE=0

Verifying the XP Series Disk Array Configuration

Use the following checklist to verify the configuration.

Creating and Exporting LVM Volume Groups

Use the following procedure to create volume groups and export
them for access by other nodes. The sample script mk1VGs in the Samples
directory can be modified to automate these steps.

Define the appropriate
volume groups on each node that might run the application package.
Use the commands:

# mkdir /dev/vgxx
# mknod /dev/vgxx/group c 64 0xnn0000

where the name /dev/vgxx and the number nn are unique within the cluster.

Create volume groups only on the primary system.
Use the vgcreate and the vgextend commands, specifying the appropriate HP-UX device file
names.

Use the vgexport command with the -p option to export the VGs on the
primary system without
removing the HP-UX device files:

# vgchange -a n vgname
# vgexport -v -s -p -m mapfilename vgname

Make sure that you copy the map files to all of the nodes.
The sample script Samples/ftpit shows a semi-automated way (using ftp) to
copy the files. You need only enter the password
interactively.
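
The minor number in the mknod command above follows the pattern 0xnn0000, where nn is a two-digit hexadecimal number. The following is a small sketch for building that value from a decimal volume group number. The function name vg_minor, the volume group name vg01, and the number 1 are illustrative assumptions, not names from the toolkit.

```shell
# Build the minor number for the LVM group file from a volume group
# number (decimal in, 0xnn0000 with nn in hexadecimal out).
# vg_minor is a hypothetical helper name.
vg_minor() {
    printf '0x%02x0000\n' "$1"
}

# Assumed usage on each node (vg01 and 1 are example values):
#   mkdir /dev/vg01
#   mknod /dev/vg01/group c 64 "$(vg_minor 1)"
```

Keeping the number in one place makes it easier to ensure that the same value is not reused for another volume group.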

Importing Volume Groups on Other Nodes

Use the following procedure to import volume groups. The sample
script mk2imports can be modified to automate these steps.

Split the Raid Manager device groups
to enable read/write on the SVOL before doing the import by using
the following command:

# pairsplit -g devicegroupname -rw

Import the VGs on all of the other systems that
will run the MC/ServiceGuard package.
Use the following command:

# vgimport -v -s -m mapfilename vgname

Back up the configuration. Use the following commands:

# vgchange -a y vgname
# vgcfgbackup vgname
# vgchange -a n vgname

See the sample script Samples/mk2imports.

Resynchronize the CA pair device. Use the following
command:

# pairresync -g devicegroupname -c 15

NOTE: Exclusive activation must be used
for all volume groups associated with packages that use the XP Series
disk array. The design of MetroCluster/CA assumes that only one
system in the cluster will have a VG activated at a time.

Configuring PV Links

The examples in the previous sections show the use of the
vgimport and vgexport commands with the -s option. Also, the mk1VGs script
uses a -s in the vgexport command, and the mk2imports script uses a -s in
the vgimport command. You may wish to remove this option from both commands
if you are using PV links.

The -s option to the vgexport command saves the volume group id (VGID) in the map file,
but it does not preserve the order of PV links. To specify the exact
order of PV links, do not use the -s option with vgexport, and in the
vgimport command, enter the individual links in the desired order,
as in the following example:

# vgimport -v -m mapfilename vgname linkname1 linkname2

Creating VxVM Disk Groups for Use with MetroCluster/CA

If you are using VERITAS storage, use the following procedure
to create disk groups. It is assumed that you have already created
a VERITAS root disk (rootdg) on the system where you are configuring
the storage. The following section shows how to set up VERITAS disk
groups. On one node do the following:

Create the device pair to be used by the package:

# paircreate -g devgrpA -f never -vl -c 15

Check to make sure the devices are in the PAIR state:

# pairdisplay -g devgrpA

Initialize disks to be used with VxVM by running
the vxdisksetup command. Run the following command only on the primary
system:

# vxdisksetup -i c5t0d0

Create the disk group by using the vxdg command. Run the following
command only on the primary system:

# vxdg init logdata /dev/dsk/c5t0d0

Verify the configuration with the following command:

# vxdg list

Use the vxassist command to create the logical volume:

# vxassist -g logdata make logfile 2048m

Verify the configuration with the following command:

# vxprint -g logdata

Make the file system with the following command:

# newfs -F vxfs /dev/vx/rdsk/logdata/logfile

Create a directory on which to mount the volume:

# mkdir /logs

Mount the volume:

# mount /dev/vx/dsk/logdata/logfile /logs

Check that the file system exists, then unmount the file system:

# umount /logs

Validating VxVM Disk Groups for Use with MetroCluster/CA

The following section shows how to validate VERITAS disk groups.
On one node do the following:

Deport the disk group:

# vxdg deport logdata

Enable other cluster nodes to have access to the disk group:

# vxdctl enable

Suspend the CA link and give the SVOL read/write permission:

# pairsplit -g devgrpA -rw

Import the disk group:

# vxdg -tfC import logdata

Start the logical volume in the disk group:

# vxvol -g logdata startall

Create a directory on which to mount the volume:

# mkdir /logs

Mount the volume:

# mount /dev/vx/dsk/logdata/logfile /logs

Check to make sure the file system is present, then
unmount the file system:

# umount /logs

Resynchronize the CA pair device:

# pairresync -g devgrpA -c 15