A
|
|---|
| application restart | | Starting an application, usually on another node, after
a failure. Applications can be restarted manually, which may be necessary
if data must be restored before the application can run (for example,
Business Recovery Services work like this). Applications can be
restarted by an operator using a script, which can reduce human error.
Or applications can be started on the local or remote site automatically
after the failure of the primary site is detected.
|
|---|
| arbitrator | | Nodes in a disaster tolerant architecture that act
as tie-breakers in case all of the nodes in a data center go down
at the same time. These nodes are full members of the MC/ServiceGuard
cluster and must conform to the minimum requirements. The arbitrator
must be located in a third data center to ensure that the failure
of an entire data center does not bring the entire cluster down.
See also quorum server.
|
|---|
| asymmetrical cluster | | A cluster that has more nodes at one site than at another.
For example, an asymmetrical metropolitan cluster may have two nodes
in one building, and three nodes in another building. Asymmetrical
clusters are not supported in all disaster tolerant architectures.
|
|---|
| asynchronous data replication | | Local I/O will complete without waiting for the
replicated I/O to complete; however, it is expected that asynchronous
data replication will process the I/Os in the original order.
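
As an illustration only, the ordering behavior described above can be modeled with a first-in, first-out queue: local writes return immediately, and the replica applies them later in the original order. The class and method names below are hypothetical and do not correspond to any disk array or product API.

```python
from collections import deque


class AsyncReplicator:
    """Toy model: local writes complete at once; the replica catches up later."""

    def __init__(self):
        self.primary = []        # writes applied locally
        self.replica = []        # writes applied at the remote site
        self._pending = deque()  # writes queued for replication, oldest first

    def write(self, block):
        """Complete the local I/O without waiting for the replica."""
        self.primary.append(block)
        self._pending.append(block)

    def drain(self, max_writes=None):
        """Apply queued writes to the replica in their original order."""
        available = len(self._pending)
        count = available if max_writes is None else min(max_writes, available)
        for _ in range(count):
            self.replica.append(self._pending.popleft())
        return count

    def lag(self):
        """Number of writes by which the replica trails the primary."""
        return len(self._pending)
```

In this sketch, a non-zero `lag()` corresponds to a replica that is consistent (writes arrive in order) but not current.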
|
|---|
| automatic failover | | Failover directed by automation scripts or software (such
as MC/ServiceGuard) and requiring no human intervention. In a ContinentalClusters environment,
the start-up of package recovery groups on the Recovery Cluster without
intervention. See also application restart.
|
|---|
B
|
|---|
| BC | | (Business Copy) A PVOL or SVOL in an HP SureStore
XP series disk array that can be split from or merged into a normal
PVOL or SVOL. It is often used to create a snapshot of the data
taken at a known point in time. Although this copy, when split,
is often consistent, it is not usually current.
|
|---|
| BCV | | (Business Continuity Volume) An EMC Symmetrix term
that refers to a logical device on the EMC Symmetrix that may be
merged into or split from a regular R1 or R2 logical device. It
is often used to create a snapshot of the data taken at a known
point in time. Although this copy, when split, is often consistent,
it is not usually current.
|
|---|
| bi-directional configuration | | A continental cluster configuration in which each
cluster serves the roles of primary and recovery cluster for different
recovery groups. Also known as a mutual recovery configuration.
|
|---|
| Business Recovery Service | | Service provided by a vendor to host the backup
systems needed to run mission critical applications following a disaster.
|
|---|
C
|
|---|
| campus cluster | | A single cluster that is geographically dispersed within
the confines of an area owned or leased by the organization such that
it has the right to run cables above or below ground between buildings
in the campus. Campus clusters are usually spread out in different
rooms in a single building, or in different adjacent or nearby buildings.
See also extended distance cluster.
|
|---|
| cascading failover | | Cascading failover is the ability of an application to
fail over from a primary to a secondary location, and then to fail over to
a recovery location on a different site. The primary location contains
a metropolitan cluster built with MetroCluster EMC SRDF, and the recovery
location has a standard MC/ServiceGuard cluster.
|
|---|
| client reconnect | | User access to the backup site after failover.
Client reconnect can be transparent, where the user is automatically
connected to the application running on the remote site, or manual,
where the user selects a site to connect to.
|
|---|
| cluster | | An MC/ServiceGuard cluster is a networked grouping
of HP 9000 series 800 servers (host systems known as nodes) having
sufficient redundancy of software and hardware that a single failure
will not significantly disrupt service. MC/ServiceGuard software
monitors the health of nodes, networks, application services, and EMS
resources, and makes failover decisions based on where the application
is able to run successfully.
|
|---|
| cluster alarm | | Time at which a message is sent indicating that
the cluster is probably in need of recovery. The cmrecovercl command is enabled at this time.
|
|---|
| cluster alert | | Time at which a message is sent indicating a problem
with the cluster.
|
|---|
| cluster event | | A cluster condition that occurs when the cluster
goes down or enters an UNKNOWN state, or
when the monitor software returns an error. This event may cause
an alert message to be sent out, or it may cause an alarm condition
to be set, which allows the administrator on the Recovery Cluster to issue
the cmrecovercl command. The return of the cluster to the UP state
results in a cancellation of the event, which may be accompanied
by a cancel event notice. In addition, the cancellation disables
the use of the cmrecovercl command.
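
The life cycle described above can be sketched as a small state model. This is a simplified illustration, not the ContinentalClusters monitor: the class, the `ERROR` state name, and the `raise_alarm` flag are assumptions made for the example, and the `recovery_enabled` attribute merely stands in for "the cmrecovercl command is enabled".

```python
class ClusterEventModel:
    """Toy model of the alert/alarm/cancel life cycle of a cluster event."""

    # DOWN and UNKNOWN come from the text; ERROR is a hypothetical name
    # for "the monitor software returns an error".
    EVENT_STATES = {"DOWN", "UNKNOWN", "ERROR"}

    def __init__(self):
        self.event_active = False
        self.recovery_enabled = False  # stands in for "cmrecovercl is enabled"
        self.messages = []             # notifications that would be sent

    def observe(self, state, raise_alarm=False):
        """Record a monitored cluster state and update the alarm condition."""
        if state in self.EVENT_STATES:
            self.event_active = True
            if raise_alarm:
                # An alarm condition allows recovery to be started.
                self.recovery_enabled = True
                self.messages.append(("alarm", state))
            else:
                self.messages.append(("alert", state))
        elif state == "UP":
            # Returning to UP cancels the event and disables recovery.
            if self.event_active:
                self.messages.append(("cancel", state))
            self.event_active = False
            self.recovery_enabled = False
        else:
            raise ValueError(f"unrecognized state: {state!r}")
```

How an alert is distinguished from an alarm is a matter of configuration in a real deployment; the flag here only makes the distinction visible.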
|
|---|
| cluster quorum | | A dynamically calculated majority used to determine whether
any grouping of nodes is sufficient to start or run the cluster. Cluster
quorums prevent split-brain syndrome, which can lead to data corruption
or inconsistency. Currently at least 50% of the nodes plus a tie-breaker
are required for a quorum. If no tie-breaker is configured, then
more than 50% of the nodes are required to start and run a cluster.
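
The quorum rule stated above can be written as a short check. This is a minimal sketch of the arithmetic only, assuming the node counts are already known; the function and parameter names are hypothetical and are not part of MC/ServiceGuard.

```python
def has_quorum(running_nodes: int, total_nodes: int,
               holds_tie_breaker: bool = False) -> bool:
    """Return True if a grouping of nodes may start or run the cluster.

    More than 50% of the nodes always suffices. Exactly 50% suffices
    only when the grouping also holds the tie-breaker.
    """
    if total_nodes <= 0 or not 0 <= running_nodes <= total_nodes:
        raise ValueError("node counts out of range")
    if 2 * running_nodes > total_nodes:
        return True
    if 2 * running_nodes == total_nodes:
        return holds_tie_breaker
    return False
```

For example, two surviving nodes of a four-node cluster form a quorum only if they also obtain the tie-breaker, which is why at most one half of an evenly split cluster can continue.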
|
|---|
| command device | | A disk area in the HP SureStore XP series disk array used
for internal system communication. You create two command devices
on each array, each with alternate links (PV links).
|
|---|
| consistency group | | A set of Symmetrix RDF devices that are configured to
act in unison to maintain the integrity of a database. Consistency groups
allow you to configure R1/R2 devices on multiple Symmetrix frames
in MetroCluster/SRDF.
|
|---|
| continental cluster | | A group of clusters that use routed networks and/or
common carrier networks for data replication and cluster communication
to support package failover between separate clusters in different
data centers. Continental clusters are often located in different cities
or different countries and can span hundreds or thousands of kilometers.
|
|---|
| Continuous Access | | A facility provided by the Continuous Access software
option available with the HP SureStore E Disk Array XP series. This
facility enables physical data replication between XP series disk arrays.
|
|---|
D
|
|---|
| data center | | A physically proximate collection of nodes and
disks, usually all in one room.
|
|---|
| data consistency | | Whether data are logically correct and immediately usable;
the validity of the data after the last write. Inconsistent data,
if not recoverable to a consistent state, is corrupt.
|
|---|
| data currency | | Whether the data contain the most recent transactions, and/or
whether the replica database has all of the committed transactions
that the primary database contains; speed of data replication may
cause the replica to lag behind the primary copy, and compromise
data currency.
|
|---|
| data loss | | The inability to take action to recover data. Data
loss can be the result of transactions being copied that were lost
when a failure occurred, non-committed transactions that were rolled
back as part of a recovery process, data in the process of being
replicated that never made it to the replica because of a failure, or
transactions that were committed after the last tape backup when
a failure occurred that required a reload from the last tape backup. Transaction
processing monitors (TPM), message queuing software,
and synchronous data replication are measures that can protect against
data loss.
|
|---|
| data mirroring | | See mirroring.
|
|---|
| data recoverability | | The ability to take action that results in data consistency,
for example database rollback/roll forward recovery.
|
|---|
| data replication | | The scheme by which data is copied from one site
to another for disaster tolerance. Data replication can be either
physical (see physical data replication)
or logical (see logical data replication).
In a ContinentalClusters environment, the process by which data that is used
by the cluster packages is transferred to the Recovery Cluster and
made available for use on the Recovery Cluster in the event of a recovery.
|
|---|
| database replication | | A software-based logical data replication scheme that
is offered by most database vendors.
|
|---|
| disaster | | An event causing the failure of multiple components
or entire data centers that renders all services at a single location
unavailable. Disasters include natural disasters such as earthquake,
fire, or flood; acts of terrorism or sabotage; and large-scale power
outages.
|
|---|
| disaster protection | | Processes, tools,
hardware, and software that provide protection in the event of an
extreme occurrence that causes application downtime such that the
application can be restarted at a different location within a fixed
period of time.
|
|---|
| disaster recovery | | The process of restoring access to applications
and data after a disaster. Disaster recovery can be manual, meaning
human intervention is required, or it can be automated, requiring
little or no human intervention.
|
|---|
| disaster recovery services | | Services and products offered by companies that
provide the hardware, software, processes, and people necessary
to recover from a disaster.
|
|---|
| disaster tolerant | | The characteristic of being able to recover quickly from
a disaster. Components of disaster tolerance include redundant hardware,
data replication, geographic dispersion, partial or complete recovery
automation, and well-defined recovery procedures.
|
|---|
| disaster tolerant architecture | | A cluster architecture that protects against multiple
points of failure or a single catastrophic failure that affects
many components by locating parts of the cluster at a remote site and
by providing data replication to the remote site. Other components of
disaster tolerant architecture include redundant links, either for networking
or data replication, that are installed along different routes, and
automation of most or all of the recovery process.
|
|---|
E, F
|
|---|
| ESCON | | Enterprise Systems Connection. A type of fiber-optic
channel used for inter-frame communication between EMC Symmetrix
frames using EMC SRDF or between HP SureStore E XP series disk array
units using Continuous Access XP.
|
|---|
| event log | | The default location (/var/adm/cmconcl/eventlog) where events are logged on the monitoring ContinentalClusters system.
All events are written to this log, as well as all notifications
that are sent elsewhere.
|
|---|
| extended distance cluster | | A cluster with alternate nodes located in different
data centers separated by some distance. Formerly known as campus
cluster.
|
|---|
| failback | | Failing back from a backup node, which may or may
not be remote, to the primary node that the application normally
runs on.
|
|---|
| failover | | The transfer of control of an application or service
from one node to another node after a failure. Failover can be manual,
requiring human intervention, or automated, requiring little or
no human intervention.
|
|---|
| filesystem replication | | The process of replicating filesystem changes from
one node to another.
|
|---|
G
|
|---|
| gatekeeper | | A small EMC Symmetrix device configured to function
as a lock during certain state change operations.
|
|---|
H, I
|
|---|
| heartbeat network | | A network that provides reliable communication among
nodes in a cluster, including the transmission of heartbeat messages,
signals from each functioning node, which are central to the operation
of the cluster, and which determine the health of the nodes in the
cluster.
|
|---|
| high availability | | A combination of technology, processes, and support partnerships
that provide greater application or system availability.
|
|---|
J, K, L
|
|---|
| local cluster | | A cluster located in a single data center. This
type of cluster is not disaster tolerant.
|
|---|
| local failover | | Failover on the same node; this is most often applied
to hardware failover. For example, local LAN failover is switching
to the secondary LAN card on the same node after the primary LAN
card has failed.
|
|---|
| logical data replication | | A type of on-line data replication that replicates
logical transactions that change either the filesystem or the database.
Complex transactions may result in the modification of many diverse
physical blocks on the disk.
|
|---|
| LUN | | (Logical Unit Number) A SCSI term that refers to
a logical disk device composed of one or more physical disk mechanisms,
typically configured into a RAID level.
|
|---|
M
|
|---|
| M by N | | A type of Symmetrix grouping in which up to two
Symmetrix frames may be configured on either side of a data replication
link in a MetroCluster/SRDF configuration. M by N configurations
include 1 by 2, 2 by 1, and 2 by 2.
|
|---|
| manual failover | | Failover requiring human intervention to start an application
or service on another node.
|
|---|
| MetroCluster | | A Hewlett-Packard product that allows a customer
to configure an MC/ServiceGuard cluster as a disaster tolerant metropolitan
cluster.
|
|---|
| metropolitan cluster | | A cluster that is geographically dispersed within the
confines of a metropolitan area requiring right-of-way to lay cable
for redundant network and data replication components.
|
|---|
| mirrored data | | Data that is copied using mirroring.
|
|---|
| mirroring | | Disk mirroring hardware or software, such as MirrorDisk/UX.
Some mirroring methods may allow splitting and merging.
|
|---|
| mission critical application | | Hardware, software, processes and support services
that must meet the uptime requirements of an organization. Examples
of mission critical applications that must be able to survive regional
disasters include financial trading services, e-business operations,
911 phone service, and patient record databases.
|
|---|
| mission critical solution | | The architecture and processes that provide the
required uptime for mission critical applications.
|
|---|
| multiple points of failure (MPOF) | | More than one point of failure that can bring down
an MC/ServiceGuard cluster.
|
|---|
| multiple system high availability | | Cluster technology and architecture that increases
the level of availability by grouping systems into a cooperative
failover design.
|
|---|
| mutual recovery configuration | | A continental cluster configuration in which each
cluster serves the roles of primary and recovery cluster for different
recovery groups. Also known as a bi-directional configuration.
|
|---|
N
|
|---|
| network failover | | The ability to restore a network connection after
a failure in network hardware when there are redundant network links
to the same IP subnet.
|
|---|
| notification | | A message that is sent following a cluster or package
event.
|
|---|
O
|
|---|
| off-line data replication | | Data replication by storing data off-line, usually
a backup tape or disk stored in a safe location; this method is best
for applications that can accept a 24-hour recovery time.
|
|---|
| on-line data replication | | Data replication by copying to another location
that is immediately accessible. On-line data replication is usually
done by transmitting data over a link in real time or with a slight delay
to a remote site; this method is best for applications requiring quick
recovery (within a few hours or minutes).
|
|---|
P
|
|---|
| Primary Cluster | | A cluster in production that has packages protected by
the HP ContinentalClusters product.
|
|---|
| package alert | | Time at which a message is sent indicating a problem with
a package.
|
|---|
| package event | | A package condition such as a failure that causes
a notification message to be sent. Package events can be accompanied
by alerts, but not alarms. Messages are for information only; the cmrecovercl command is not enabled for a package event.
|
|---|
| package recovery group | | A set of one or more packages with a mapping between
their instances on the Primary Cluster and their instances on the Recovery Cluster.
|
|---|
| physical data replication | | An on-line data replication method that duplicates
I/O writes to another disk on a physical block basis. Physical replication
can be hardware-based where data is replicated between disks over
a dedicated link (e.g. EMC's Symmetrix Remote Data Facility or
the HP SureStore E Disk Array XP Series Continuous Access), or software-based
where data is replicated on multiple disks using dedicated software
on the primary node (e.g. MirrorDisk/UX).
|
|---|
| planned downtime | | An anticipated period of time when nodes are taken
down for hardware maintenance, software maintenance (OS and application),
backup, reorganization, upgrades (software or hardware), etc.
|
|---|
| PowerPath | | A host-based software product from EMC that delivers
intelligent I/O path management. PowerPath is required for M by
N Symmetrix configurations using MetroCluster/SRDF.
|
|---|
| primary package | | The package that normally runs on the Primary Cluster in
a production environment.
|
|---|
| pushbutton failover | | Use of the cmrecovercl command to allow all package recovery groups to start
up on the Recovery Cluster following a significant cluster event on the Primary Cluster.
|
|---|
| PV links | | A method of LVM configuration that allows you to
provide redundant disk interfaces and buses to disk arrays, thereby
protecting against single points of failure in disk cards and cables.
|
|---|
| PVOL | | A primary volume configured in an XP series disk
array that uses Continuous Access. PVOLs are the primary copies
in physical data replication with Continuous Access on the XP.
|
|---|
Q
|
|---|
| quorum | | See cluster quorum.
|
|---|
| quorum server | | A cluster node that acts as a tie-breaker in a
disaster tolerant architecture in case all of the nodes in a data
center go down at the same time. See also arbitrator.
|
|---|
R
|
|---|
| R1 | | The Symmetrix term indicating the data copy that
is the primary copy.
|
|---|
| R2 | | The Symmetrix term indicating the remote data copy
that is the secondary copy. It is normally read-only by the nodes
at the remote site.
|
|---|
| Recovery Cluster | | A cluster on which recovery of a package takes place following
a failure on the Primary Cluster.
|
|---|
| recovery group failover | | A failover of a package recovery group from one
cluster to another.
|
|---|
| recovery package | | The package that takes over on the Recovery Cluster in
the event of a failure on the Primary Cluster.
|
|---|
| regional disaster | | A disaster, such as an earthquake or hurricane,
that affects a large region. Local, campus, and proximate metropolitan clusters
are less likely to protect from regional disasters.
|
|---|
| remote failover | | Failover to a node at another data center or remote location.
|
|---|
| resynchronization | | The process of making the data between two sites consistent
and current once systems are restored following a failure. Also called
data resynchronization.
|
|---|
| rolling disaster | | A second disaster that occurs before recovering
from a previous disaster, e.g. while data is being synchronized
between two data centers after a disaster, one of the data centers
fails, interrupting the data synchronization process. Rolling disasters
may result in data corruption that requires a reload from tape backups.
|
|---|
S
|
|---|
| single point of failure (SPOF) | | A component of a cluster or node that, if it fails,
affects access to applications or services. See also multiple points
of failure.
|
|---|
| single system high availability | | Hardware design that results in a single system
that has availability higher than normal. Hardware design examples
are: on-line addition or replacement of I/O cards, memory,
etc.
|
|---|
| special device file | | The device file name that the HP-UX operating system
gives to a single connection to a node, in the format /dev/devtype/filename.
|
|---|
| split-brain syndrome | | A condition in which a cluster re-forms with equal numbers of nodes
at each site, and each half of the cluster thinks it is the authority,
starts up the same set of applications, and tries to modify the
same data, resulting in data corruption. MC/ServiceGuard architecture prevents
split-brain syndrome in all cases unless dual cluster locks are used.
|
|---|
| SRDF | | (Symmetrix Remote Data Facility) A level 1-3 protocol
used for physical data replication between EMC Symmetrix disk arrays.
|
|---|
| SVOL | | A secondary volume configured in an XP series disk
array that uses Continuous Access. SVOLs are the secondary copies
in physical data replication with Continuous Access on the XP.
|
|---|
| SymCLI | | The Symmetrix command line interface used to configure
and manage EMC Symmetrix disk arrays.
|
|---|
| Symmetrix device number | | The unique device number that identifies an EMC
logical volume.
|
|---|
| synchronous data replication | | Each data replication I/O waits for the preceding
I/O to complete before beginning another replication. Minimizes
the chance of inconsistent or corrupt data in the event of a rolling
disaster.
|
|---|
T
|
|---|
| transaction processing monitor (TPM) | | Software that allows you to modify an application
to store in-flight transactions in an external location until that
transaction has been committed to all possible copies of the database
or filesystem, thus ensuring completion of all copied transactions.
A TPM protects against data loss at the expense of the CPU overhead
involved in applying the transaction in each database replica. A TPM also provides a reliable mechanism to ensure that
all transactions are successfully committed, and may provide
load balancing among nodes.
|
|---|
| transparent failover | | A client application that automatically reconnects
to a new server without the user taking any action.
|
|---|
| transparent IP failover | | Moving the IP address from one network interface
card (NIC), in the same node or another node, to another NIC that
is attached to the same IP subnet so that users or applications
may always specify the same IP name/address whenever they connect,
even after a failure.
|
|---|
U-Z
|
|---|
| volume group | | In LVM, a set of physical volumes such that logical volumes
can be defined within the volume group for user access. A volume
group can be activated by only one node at a time unless you are using
ServiceGuard OPS Edition. MC/ServiceGuard can activate a volume
group when it starts a package. A given disk can belong to only one
volume group. A logical volume can belong to only one volume group.
|
|---|
| WAN data replication solutions | | Data replication that functions over leased or switched
lines. See also continental cluster.
|
|---|