HP Servers and Workstations: Managing Systems and Workgroups > Chapter 5 Administering a System: Booting and Shutdown

Abnormal System Shutdowns

When your system crashes, it is important to know why, so that you can take actions to prevent it from happening again. Sometimes, it is easy to determine why: for example, if somebody trips over the cable connecting your computer to the disk containing your root file system (disconnecting the disk).

At other times, the cause of the crash might not be so obvious. In extreme cases, you might want or need to analyze a snapshot of the computer’s memory at the time of the crash, or have Hewlett-Packard do it for you, in order to determine the cause of the crash.

Overview of the Dump / Save Cycle

Figure 5-1 Overview of the Dump/Save Cycle

When the system crashes, HP-UX tries to save the image of physical memory, or certain portions of it, to predefined locations called dump devices. Then, when you next reboot the system, a special utility copies the memory image from the dump devices to the HP-UX file system area. When the memory image is in the HP-UX file system, you can analyze it with a debugger or save it to removable media for shipment to someone else for analysis.

Prior to HP-UX Release 11.0, devices to be used as dump devices had to be defined in the kernel configuration, and they still can be. However, beginning with Release 11.0, a new, more flexible method for defining dump devices is available.

There are now multiple ways that dump devices can be configured. Here are three commonly used ways to define dump devices:

  • In the kernel (as with releases prior to Release 11.0)

  • During system initialization when the initialization script for crashconf runs (and reads entries from the /etc/fstab file)

  • During run time, by an operator or administrator manually running the /sbin/crashconf command.

Preparing for a System Crash

The dump process exists so that you have a way of capturing what your system was doing at the time of a crash. This is not for recovery purposes; processes cannot resume where they left off, following a system crash. Rather this is for analysis purposes, to help you determine why the system crashed in order to prevent it from happening again.

If you want to be able to capture the memory image of your system when a crash occurs (for later analysis), you need to define in advance the location(s) where HP-UX puts that image at the time of the crash. This location can be on local disk devices, or logical volumes.

Wherever you decide that HP-UX should put the dump, it is important to have enough space at that location (see “How Much Dump Space Do I Need?”). If you do not have enough space, not every page will be saved, and you might not capture the part of memory that contains the instruction or data that caused the crash. If necessary, you can define more than one dump device so that if the first one fills up, the next one is used to continue the dumping process until the dump is complete or no more defined space is available.

To guarantee that you have enough dump space, define a dump area that is at least as big as your computer’s physical memory, plus one megabyte. Full dumps require dump space equal to the size of your computer’s memory plus a little extra for header information. If you are doing a selective dump (which is the default dump mode in most cases), much less dump space is actually needed.
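
The sizing rule above (physical memory plus one megabyte) can be sketched as a small shell function. The name full_dump_space_kb is hypothetical and is not an HP-UX command; the memory size in megabytes is supplied by the caller.

```shell
# full_dump_space_kb MEM_MB
# Hypothetical helper: prints the dump space, in kB, that guarantees room
# for a full dump of MEM_MB megabytes of physical memory.
# Rule from the text: physical memory plus one megabyte.
full_dump_space_kb() {
    mem_mb=$1
    echo $(( (mem_mb + 1) * 1024 ))
}
```

For a system with 1024 MB of memory, full_dump_space_kb 1024 prints the size in kB of a 1025 MB dump area.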

In HP-UX Release 11i, compressed dumps are enabled by default. However, dump compression occurs only if conditions in the crash environment are favorable. Do not plan your dump storage space based on potential compression; allow enough space for an uncompressed full or selective dump. See “Compressed Dump (HP-UX 11i version 1 (B.11.11) or later)”.

Systems Running HP-UX Releases Prior to Release 11.0

Prior to HP-UX Release 11.0, you have limited control over the dump process. You can control:

  • Whether or not a dump occurs (you can define the dump devices in the kernel file to be dump none to prevent dumps from occurring)

  • Which devices will be used as dump devices

  • Whether or not the savecore command runs at reboot time to copy the dumped memory image to the HP-UX file system area

NOTE: You must define the dump devices for your system when you build its kernel. See “Kernel Dump Device Definitions” for details on how to do this. And, if you want to change the dump devices, you need to build a new kernel file and boot to it for the changes to take effect.

Dump Configuration Decisions

As computers continue to grow in speed and processing power, they also tend to grow in physical memory size. Where once a system with 16MB of memory was considered to be a huge system, today it is barely adequate for most tasks. Some of today’s HP-UX systems can have terabytes of memory. This is important to mention here because the larger the size of your computer’s physical memory, the longer it will take to dump its contents to disk following a system crash (and the more disk space it will consume).

Usually, when your system crashes it is important to get it back up and running as fast as possible. If your computer has a very large amount of memory, the time it takes to dump that memory to disk might be unacceptably long when you are trying to get the system back up quickly. And, if you happen to already know why the computer crashed (for example, if somebody accidentally disconnected the wrong cable), there’s little or no need for a dump anyway.

Prior to HP-UX Release 11.0, you have little control over the process. You must decide in advance whether or not you want a dump to occur when the system crashes, and you must build that decision into the kernel itself. However, beginning with HP-UX Release 11.0, a new runtime dump subsystem is available to you that will give you a lot more control over the dump process. An operator at the system console can even override the runtime configuration as the system is crashing.

In addition to any previous options you had, you now have control over the following crash dump features:

  • How much memory gets dumped.

  • Run-time crash dump configuration. It is no longer necessary to build your dump configuration into the kernel file or to reboot the system to change the crash dump configuration.

  • Whether or not a dump is compressed.

These new capabilities give you a lot more flexibility, but you need to make some important decisions regarding how you will configure your system dumps. There are three main criteria to consider. Select which of these is most important to you and read the corresponding section. The criteria are:

  • System Recovery Time

  • Crash Information Integrity

  • Disk Space Needs

System Recovery Time

Use this section if the most important criterion to you is to get your system back up and running as soon as possible. The factors you have to consider here are:

Dump Level: Full Dump, Selective Dump, or No Dump

In addition to being able to choose “dump everything” or “dump nothing,” as of HP-UX Release 11.0 you have the ability to determine which classes of memory pages get dumped.

You are reading this section because system recovery time is critical to you. Obviously, the fewer pages your system needs to dump to disk (and on reboot, copy to the HP-UX file system area), the faster your system can be back up and running. Therefore, if possible, avoid using the full dump option.

When you define your dump devices, whether in a kernel build or at run time, you can list which classes of memory must always get dumped, and which classes of memory should not be dumped. If you leave both of these lists empty, HP-UX will decide for you which parts of memory should be dumped based on what type of error occurred. In nearly all cases, this is the best thing to do.

NOTE: Even if you have defined (in the kernel or at run time) that you do not want a full dump to be performed, an operator at the system console at the time of a crash can override those definitions and request a full dump.

Likewise, if at the time of a crash you know what caused it (and therefore do not need the system dump), but have previously defined a full or selective dump, an operator at the system console at the time of a crash can override those definitions and request no dump be performed.

Compressed Dump (HP-UX 11i version 1 (B.11.11) or later)

Compressed dump is a feature available on systems running HP-UX 11i version 1 (or subsequent releases). Following a system crash, the HP-UX operating system can use the feature to compress data from memory before it writes the data to the dump device. Compression decreases the volume of crash data, making the writes to disk faster.

By reducing the time required to store the entire dump, the recovery period is shorter and your system will be up and running much more quickly. Dump compression provides a greater time saving on systems that have large amounts of memory.

The following features are available in this version of dump compression:

  • Dump compression is not forced; it is only a kernel hint. At the time of a system crash, the dump subsystem examines the state of the system and its resources to determine whether it is possible to use compression. Depending on the resources available, the system decides dynamically whether to dump compressed or uncompressed.

    (For example, if the processor that is processing the crash fails to assign a sufficient number of processors to do the compression, the dump will not be compressed. A recursive crash, such as a panic in the dump path, also causes the system to dump uncompressed.)

  • For selective dumps that exclude UNUSED pages, you can expect the dump to take about one-third the time of uncompressed dumps on the same system. This interval includes the time required to run the savecrash program and write the dump to its final storage location on the HP-UX file system. (A dump that previously took 3 hours to complete should now take only 1 hour.)

  • You can use the crashconf(1M) command to disable or enable compressed dumps. (Compression is configured into the kernel by default.) During a crash event, you can also choose to override dump compression.

    Normally, there is no benefit in disabling compression unless the initial (compressed) dump is corrupt and you want to attempt an uncompressed dump on a subsequent crash event.

  • You can convert the compressed dump file to any of the previous dump formats for storage and analysis.

  • The compressed dump file requires less disk storage space and creates a smaller tar file that takes less time to copy to tape or to transmit for analysis by using the ftp program.

  • In HP-UX 11i v2, the progress of memory dumps is updated at least once every 15 seconds. This change reduces the dump time.

Restrictions on Compressed Dumps

The following restrictions apply to this release:

  • If your system uses virtual partitions (vPars), the dump might not be compressed but the dump process will continue.

  • If more than one crash occurs in close succession, it might not be possible for the system to compress the dump.

If either of the preceding conditions applies, the following messages are displayed at the console:

*** Recursive crash.
*** Dump defaulting to sequential without compression

Configuring Compressed Dumps

In HP-UX Release 11i version 1.0, compression is enabled in the kernel by default. You can disable (and enable) the feature by using the crashconf dump configuration utility. Use the -c option with the on argument to enable compressed dumps, as follows:

crashconf -c on

Use the -v option to examine the status of compressed dumps, as follows:

crashconf -v
{Lines omitted from display}
Dump compressed:     ON

You can disable compression by using the crashconf -c command with the off argument, as follows:

crashconf -v -c off
{Lines omitted from display}
Dump compressed:     OFF

Any changes that you make to the dump configuration take effect immediately but will persist only until the next reboot or the next invocation of the crashconf command. To make changes persist across reboots, use the -t option. To make changes persist across kernel rebuilds, use SAM or the kctune command.

In HP-UX 11i v2, you can use the persistent dynamic tunable dump_compress_on to set compression on or off as required. Set this tunable by using the crashconf command with the -t option.

Prior to HP-UX 11i v2, you can also edit the /etc/rc.config.d/crashconf initialization script to set the compression option for every subsequent reboot. Open this file with a text editor and modify the value of the CRASHCONF_COMPRESS variable to 1 (enable, on) or 0 (disable, off). If the CRASHCONF_COMPRESS variable does not exist in /etc/rc.config.d/crashconf, the default behavior is to compress the dump. Beginning with HP-UX 11i v2, this configuration is handled by the dynamic tunable dump_compress_on, and is not controlled by crashconf.
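
As a sketch of the rule just described for releases prior to HP-UX 11i v2, the following shell function reports the compression setting that a crashconf configuration file would produce, treating a missing CRASHCONF_COMPRESS variable as "on". The function name compress_setting is hypothetical, and the parsing assumes a plain VARIABLE=0 or VARIABLE=1 assignment with no quoting.

```shell
# compress_setting FILE
# Hypothetical helper: prints "on" or "off" according to the last
# CRASHCONF_COMPRESS assignment in FILE (normally
# /etc/rc.config.d/crashconf). Prints "on" when the variable is absent,
# which is the default behavior described in the text.
compress_setting() {
    value=$(sed -n 's/^[[:space:]]*CRASHCONF_COMPRESS=\([01]\).*/\1/p' "$1" | tail -n 1)
    case "$value" in
        0) echo off ;;
        *) echo on ;;
    esac
}
```

On a live system the call would be compress_setting /etc/rc.config.d/crashconf; the function only reads the file and changes nothing.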

Programs can use the pstat_getcrashinfo() function to query the current crash dump configuration. See pstat(2) for more information. The psc_flags data field shows whether dump compression is enabled or not. For example:

/*
 * This structure describes the system crash dump configuration.
 * It is only available as 64-bit data (_PSTAT64 defined).
 */
struct pst_crashinfo {
  int64_t  psc_flags;                   /* Dump config. flags, see below */
  struct   __psdev psc_headerdev;       /* Device containing dump header */
  int64_t  psc_headeroffset;            /* Byte offset of dump hdr on device */
  int64_t  psc_ncrashdevs;              /* Number of dump devices */
  int64_t  psc_totalsize;               /* Total amount of dump space (kB) */
  int64_t  psc_included;                /* Page classes to be included */
  int64_t  psc_excluded;                /* Page classes to be excluded */
  int64_t  psc_default;                 /* Defaults for unspec'd classes */
  int64_t  psc_nclasses;                /* Number of classes */
  int64_t  psc_pgcount[PST_MAXCLASSES]; /* Number of pages in each class */
  int64_t  psc_reserved;                /* Reserved for future use */
};
/* Flag values for psc_flags: */
#define PS_EARLY_DUMP   0x1 /* An early dump was taken */
#define PS_CONF_CHANGED 0x2 /* Config. changed since boot */
#define PS_HEADER_VALID 0x4 /* headerdev and headeroffset valid */
#define PS_COMPRESS     0x8 /* Compress dump state */
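
To illustrate how the psc_flags bits combine, here is a small shell sketch that decodes a numeric flag value into the names defined above. It assumes the value has already been obtained by a program that calls pstat_getcrashinfo(); decode_psc_flags is a hypothetical helper and not part of HP-UX.

```shell
# decode_psc_flags VALUE
# Hypothetical helper: prints the names of the flags set in a psc_flags
# value, using the bit values from the #define lines in the text.
# Prints "none" when no known flag is set.
decode_psc_flags() {
    flags=$1
    names=""
    if [ $(( flags & 1 )) -ne 0 ]; then names="$names PS_EARLY_DUMP"; fi
    if [ $(( flags & 2 )) -ne 0 ]; then names="$names PS_CONF_CHANGED"; fi
    if [ $(( flags & 4 )) -ne 0 ]; then names="$names PS_HEADER_VALID"; fi
    if [ $(( flags & 8 )) -ne 0 ]; then names="$names PS_COMPRESS"; fi
    if [ -z "$names" ]; then
        echo none
    else
        echo "${names# }"   # drop the leading separator space
    fi
}
```

A value of 12 (0x4 + 0x8), for example, means the header fields are valid and dump compression is enabled.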

Compressed Save versus Noncompressed Save

System dumps can be very large, so large that your ability to store them in your HP-UX file system area can be taxed.

The boot time utility called savecrash can be configured (by editing the file /etc/rc.config.d/savecrash) to compress or not compress the data as it copies the memory image from the dump devices to the HP-UX file system area during the reboot process. This has system recovery time implications in that compressing the data takes longer. So, if you have the disk space and require that your system be back up and running as quickly as possible, configure savecrash to not compress the data.

Using a Device for Both Paging and Dumping

It is possible to use a specific device for both paging (swapping) and as a dump device. If system recovery time is critical to you, do not configure the primary paging device as a dump device. From the savecrash(1M) manpage:

By default, when the primary paging device is not used as one of the dump devices or after the crash image on the primary paging device has been saved, savecrash runs in the background. This reduces system boot-up time by allowing the system to be run with only the primary paging device.

Another advantage to keeping your paging and dump devices separate is that paging will not overwrite information stored on a dump device, no matter how long the system has been up or how much activity has taken place. Therefore, you can prevent savecrash processing at boot time (by editing the file /etc/rc.config.d/savecrash). This can save you a lot of time when you are trying to get your system back up in a hurry. After the system is up and running, you can run savecrash manually to copy the memory image from the dump area to the HP-UX file system area.

You Can Do a Partial Save . . .

If a memory dump resides partially on dedicated dump devices and partially on devices that are also used for paging, you can choose to save (to the HP-UX file system) only those pages that are endangered by paging activity. Pages residing on the dedicated dump devices can remain there. If you know how to analyze memory dumps, it is even possible to analyze them directly from the dedicated dump devices using a debugger that supports this feature.

Before sending your memory dump to someone else for analysis, you must move the dumped pages from the dedicated dump devices to the HP-UX file system. You can then use a utility such as tar to bundle them up for shipment. To move the dumped pages, use the command /usr/sbin/crashutil to complete the copy instead of savecrash.

Crash Information Integrity

Use this section if the most important criterion to you is to make sure you capture the part of memory that contains the instruction or piece of data that caused the crash. The factors you have to consider here are:

Full Dump vs. Selective Dump

You have chosen this section because it is most important to you to capture the specific instruction or piece of data that caused your system crash. The only way to guarantee that you have it is to capture everything. This means selecting to do a full dump of memory.

Be aware, however, that this can be a costly procedure from both a time and a disk space perspective. From the time perspective, it can take quite a while to dump the entire contents of memory in a system with very large amounts of memory. It can take an additional large amount of time to copy that memory image to the HP-UX file system area during the reboot process.

From the disk space perspective, if you have large amounts of memory (some HP-UX systems can now have terabytes of memory), you will need an amount of dump area at least equal to the amount of memory in your system; and, depending on a number of factors, you will need additional disk space in your HP-UX file system area equaling the amount of physical memory in your system, in the worst case.

Dump Definitions Built into the Kernel

There are now a number of places where you can define which devices are to be used as dump devices:

  • During kernel configuration

  • At boot time (entries defined in the /etc/fstab file)

  • At run time (using the /sbin/crashconf command)

Definitions at each of these places add to or replace any previous definitions from the other sources. However, consider the following situation:

Example 5-34 Example

In the network named MSW, the system called appserver has one gigabyte (1 GB) of physical memory. If you were to define system dump devices with a total of 256 MB of space in the kernel file, and then define an additional 768 MB of disk space in the /etc/fstab file, you would have enough dump space to hold the entire memory image (a full dump) by the time the system was fully up and running.

But, what if the crash occurs before /etc/fstab is processed? Only the amount of dump space already configured will be available at the time of the crash; in this example, 256 MB of space.

If it is critical to you to capture every byte of memory in all instances, including the early stages of the boot process, define enough dump space in the kernel configuration to account for this.
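
The arithmetic in the appserver example can be sketched as follows. The helper name dump_shortfall_mb is hypothetical; it simply compares physical memory with the dump space configured at a given point in the boot process.

```shell
# dump_shortfall_mb MEM_MB CONFIGURED_MB
# Hypothetical helper: prints how many MB of a full dump would not fit
# in the dump space configured so far. 0 means there is enough space.
dump_shortfall_mb() {
    short=$(( $1 - $2 ))
    if [ "$short" -lt 0 ]; then
        short=0
    fi
    echo "$short"
}
```

With the figures from the example, dump_shortfall_mb 1024 256 shows the gap while only the kernel-defined devices are active, and dump_shortfall_mb 1024 1024 shows the situation after the /etc/fstab entries (256 MB + 768 MB) have been added.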

NOTE: The preceding example is presented for completeness. The actual amount of time between the point where kernel dump devices are activated, and the point where runtime dump devices are activated is very small (a few seconds), so the window of vulnerability for this situation is practically nonexistent.

Using a Device for Both Paging and as a Dump Device

It is possible to use a specific device for both paging purposes and as a dump device. But, if crash dump integrity is critical to you, this is not recommended. From the savecrash(1M) manpage:

If savecrash determines that a dump device is already enabled for paging, and that paging activity has already taken place on that device, a warning message will indicate that the dump may be invalid. If a dump device has not already been enabled for paging, savecrash prevents paging from being enabled to the device by creating the file /etc/savecore.LCK. swapon does not enable the device for paging if the device is locked in /etc/savecore.LCK...

So, if possible, avoid using a given device for both paging and dumping, particularly the primary paging device!

Systems configured with small amounts of memory and using only the primary swap device as a dump device are in danger of not being able to preserve the dump (copy it to the HP-UX file system area) before paging activity destroys the data in the dump area. Larger memory systems are less likely to need paging (swap) space during startup, and are therefore less likely to destroy a memory dump on the primary paging device before it can be copied.

Disk Space Needs

Use this section if you have very limited disk resources on your system for the post-crash dump and/or the post-reboot save of the memory image to the HP-UX file system area. The factors you have to consider here are:

Dump Level

You are reading this section because disk space is a limited resource on your system. Obviously, the fewer pages that you have to dump, the less space is required to hold them. Therefore, a full dump is not recommended. If disk space is very limited, you can always choose no dump at all.

However, there is a happy medium, and it happens to be the default dump behavior; it is called a selective dump. HP-UX can do a pretty good job of determining which pages of memory are the most critical for a given type of crash, and save only those. By choosing this option, you can save a lot of disk space on your dump devices, and again later, in your HP-UX file system area. For instructions on how to do this, see “Defining Dump Devices”.

Compressed Save versus Noncompressed Save

Regardless of whether you choose to do a full or selective save, whatever is saved on the dump devices needs to be copied to your HP-UX file system area before you can use it.

NOTE: Beginning with HP-UX Release 11.0, it is possible to analyze a crash dump directly from dump devices using a debugger that supports this feature (see the caution in the section called “Analyzing Crash Dumps”). But if you need to save it to removable media, or send it to someone, you will first need to copy the memory image to the HP-UX file system area.

If the disk space shortage on your system is in the HP-UX file system area (not in the dump devices), you can choose to have savecrash (the boot time utility that does the copy) compress your data as it makes the copy.

Partial Save (savecrash -p)

If you have plenty of dump device space but are limited on space in your HP-UX file system, you can use the -p option to the savecrash command. This command copies only those pages on dump devices that are endangered by paging activity (the pages residing on devices that are being used for both paging and dumping). Pages that are on dedicated dump devices are not copied.

To configure this option into the boot process, edit the file /etc/rc.config.d/savecrash and uncomment the line that sets the environment variable SAVE_PART=1.

Defining Dump Devices

This section describes procedures for defining the dump devices that your system can use when a crash occurs.

NOTE: For HP-UX releases prior to Release 11.0, dump device definitions must be built into the kernel.

How Much Dump Space Do I Need?

Before you define dump devices, it is important to determine how much dump space you need, so that you can define enough dump space to hold the dump, but will not define too much dump space, which would be a waste of disk space.

Systems Running HP-UX Releases Prior to Release 11.0

The decision for systems running HP-UX Releases prior to Release 11.0 is pretty simple: How much physical memory is in your system? The concept of a “selective dump” was introduced at Release 11.0. Prior to that time, dumps are full memory dumps (if dump space permits).

So, define enough dump space to total the amount of physical memory in your system.

Systems Running HP-UX Release 11.0 or Later

For HP-UX Releases 11.0 and later, the amount of dump space you need to define is also equal to the size of your system’s physical memory if you want to have a full dump saved.

For selective dumps, the size of your dump space varies, depending on which classes of memory you are saving. There is an easy way to estimate your needs:

  1. When the system is up and running, with a fairly typical work load, run the following command:

    /sbin/crashconf -v

    You will get output that looks similar to the following:

    CLASS          PAGES  INCLUDED IN DUMP  DESCRIPTION
    --------  ----------  ----------------  -------------------------------------
    UNUSED          2036  no,  by default   unused pages
    USERPG          6984  no,  by default   user process pages
    BCACHE         15884  no,  by default   buffer cache pages
    KCODE           1656  no,  by default   kernel code pages
    USTACK           153  yes, by default   user process stacks
    FSDATA           133  yes, by default   file system metadata
    KDDATA          2860  yes, by default   kernel dynamic data
    KSDATA          3062  yes, by default   kernel static data

    Total pages on system:             32768
    Total pages included in dump:       6208

    DEVICE        OFFSET(kB)   SIZE (kB)  LOGICAL VOL.  NAME
    ------------  ----------  ----------  ------------  -------------------------
     31:0x00d000       52064      262144   64:0x000002  /dev/vg00/lvol2
                              ----------
                                  262144

  2. Multiply the number of pages listed in Total pages included in dump by the page size (4 KB), and add 25 percent for a margin of safety to give you an estimate of how much dump space to provide. For the preceding example, the calculation is:

    (6208 x 4 KB) x 1.25 = approximately 30 MB
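
The two steps above can be combined into one pipeline. The following is a minimal sketch that assumes the crashconf -v output contains a "Total pages included in dump:" line as in the sample, and that the page size is 4 KB as stated; estimate_dump_kb is a hypothetical helper name.

```shell
# estimate_dump_kb
# Hypothetical helper: reads crashconf -v output on standard input and
# prints the estimated selective dump space in kB:
#   pages included in dump x 4 kB page size x 1.25 safety margin
estimate_dump_kb() {
    awk '/Total pages included in dump:/ { pages = $NF }
         END { printf "%d\n", pages * 4 * 1.25 }'
}

# Usage on a running system (shown as a comment, not executed here):
#   /sbin/crashconf -v | estimate_dump_kb
```

Run the command while the system carries a fairly typical work load, because the page counts in each class change with activity.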

Kernel Dump Device Definitions

If you are running an HP-UX release prior to Release 11.0, and/or you are concerned about capturing dumps for crashes that occur during the early stages of the boot process, you need to define sufficient dump space in your kernel configuration.

Using SAM to Configure Dump Devices into the Kernel

The easiest way to configure into the kernel which devices can be used as dump devices is to use SAM. The dump device definition screen is located in SAM’s Kernel Configuration area. After changing the dump device definitions, you must build a new kernel and reboot the system using the new kernel file to make the changes take effect.

  1. Run SAM and select the Kernel Configuration area.

  2. From the Kernel Configuration area, select the Dump Devices area.

    A list of dump devices that will be configured into the next kernel built by SAM is displayed. This is the list of pending dump devices.

  3. Use SAM’s Action menu to add, remove or modify devices or logical volumes until the list of pending dump devices is as you would like it to be in the new kernel.

    NOTE: The order of the devices in the list is important. Devices are used in reverse order from the way they appear in the list. The last device in the list is used as the first dump device.

  4. Follow the SAM procedure for building a new kernel.

  5. When the time is appropriate, boot your system from the new kernel file to activate your new dump device definitions. For details on how to do that, see “Reconfiguring the Kernel (Prior to HP-UX 11i Version 2)”.

Using HP-UX Commands to Configure Dump Devices into the Kernel

You can also edit your system file and use the config program to build your new kernel.

  1. Edit your system file (the file that config will use to build your new kernel). This file is usually the file /stand/system, but can be another file if you prefer.

    Dump to Hardware Device

    For each hardware dump device you want to configure into the kernel, add a dump statement in the area of the file designated * Kernel Device info (immediately prior to any tunable parameter definitions). For example:

    dump 2/0/1.5.0

    dump 56/52.3.0

    NOTE: For systems that boot with LVM, either dump lvol or dump none must be present! Without one of these, any dump hardware_path statements are ignored.

    Dump to Logical Volume

    In the case of logical volumes, it is not necessary to define each volume that you want to use as a dump device. If you want to dump to logical volumes, the logical volumes must meet all of the following requirements:

    • Each logical volume to be used as a dump device must be part of the root volume group (vg00). For details on configuring logical volumes as kernel dump devices, see the lvlnboot(1M) manpage.

    • The logical volumes to be used as dump devices must be contiguous (no disk striping, or bad-block reallocation is permitted for dump logical volumes).

    • The logical volume cannot be used for file system storage, because the whole logical volume will be used.

    To use logical volumes for dump devices (regardless of how many logical volumes you want to use), include the following dump statement in the system file:

    dump lvol

    Configuring No Dump Devices

    To configure a kernel with no dump device, use the following dump statement in the system file:

    dump none

    NOTE: If you truly want no dump device to be configured into the kernel, you must use the dump none statement. Omitting dump statements altogether from the system file causes the kernel to use only the primary paging device (swap device) as the dump device.

  2. When you have edited the system file, build a new kernel file using the config command (see “Reconfiguring the Kernel (Prior to HP-UX 11i Version 2)” for details on how to do this).

  3. Save the existing kernel file (probably /stand/vmunix) to a safe place (such as /stand/vmunix.safe) in case the new kernel file cannot be booted and you need to boot again from the old one.

  4. When the time is appropriate, boot your system from the new kernel file to activate your new dump device definitions.
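
Before building the kernel, it can help to review which dump statements the system file from step 1 actually contains. The sketch below assumes the simple "dump target" line format shown in the examples above; kernel_dump_summary is a hypothetical helper name, not an HP-UX command.

```shell
# kernel_dump_summary FILE
# Hypothetical helper: lists the dump statements found in a kernel
# system file (normally /stand/system). When the file has no dump
# statement, it reports the default described in the text: only the
# primary paging device is used as the dump device.
kernel_dump_summary() {
    awk '$1 == "dump" { found = 1; print "dump target: " $2 }
         END { if (!found) print "no dump statement: primary paging device is used" }' "$1"
}
```

A file that contains dump none prints "dump target: none", which makes it easy to tell apart from a file that has no dump statement at all.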

Run Time Dump Device Definitions

As of HP-UX Release 11.0, unless you are concerned about capturing a dump of your system that occurs during the earliest stages of the boot process, you now have the ability to replace or supplement any dump device definitions that are built into your kernel while the system is booting or running. There are two ways to do this:

  • Using crashconf to read dump entries in the /etc/fstab file (using crashconf’s -a option)

  • Using arguments to the crashconf command, directly specifying the devices to be configured

The /etc/fstab File

You can define entries in the fstab file to activate dump devices during the HP-UX initialization (boot) process, or when crashconf reads the file. The format of a dump entry for /etc/fstab looks like this:

devicefile_name / dump defaults 0 0

For example:

/dev/dsk/c0t3d0 / dump defaults 0 0
/dev/vg00/lvol2 / dump defaults 0 0
/dev/vg01/lvol1 / dump defaults 0 0

Define one entry for each device or logical volume you want to use as a dump device.
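
To check which dump entries a given fstab file defines, a short filter such as the following can be used. It is a sketch that relies only on the entry format shown above (device file in the first field, the word dump in the third); fstab_dump_devices is a hypothetical helper name.

```shell
# fstab_dump_devices FILE
# Hypothetical helper: prints the device file of every dump entry in
# FILE (normally /etc/fstab), skipping comment lines.
fstab_dump_devices() {
    awk '$1 !~ /^#/ && $3 == "dump" { print $1 }' "$1"
}
```

The devices are printed in the order in which they appear in the file.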

NOTE: Unlike dump device definitions built into the kernel, with run time dump definitions you can use logical volumes from volume groups other than the root volume group.

The crashconf Command

You can also use the /sbin/crashconf command to add to, remove, or redefine dump devices. There are two ways to do this:

  • Re-read the /etc/fstab file using crashconf’s -a option

  • Use device arguments with crashconf to configure the devices

With either of the preceding uses of crashconf, you can use the -r option to specify that you want the new definitions to replace, rather than add to, any previous dump device definitions.

Here are some crashconf examples.

Example 5-35 Add fstab Entries to Active Dump List

To have crashconf read the /etc/fstab file, adding any listed dump devices to the currently active list of dump devices:

/sbin/crashconf -a

Example 5-36 Replace Active Dump List with fstab Entries

To have crashconf read the /etc/fstab file, replacing the currently active list of dump devices with those defined in fstab:

/sbin/crashconf -ar

Example 5-37 Add Specific Devices to Active Dump List

To have crashconf add the devices represented by the block device files /dev/dsk/c0t1d0 and /dev/dsk/c1t4d0 to the dump device list:

/sbin/crashconf /dev/dsk/c0t1d0 /dev/dsk/c1t4d0

Example 5-38 Replace Active Dump List with Specific Devices

To have crashconf replace any existing dump device definitions with the logical volume /dev/vg00/lvol3 and the device represented by block device file /dev/dsk/c0t1d0:

/sbin/crashconf -r /dev/vg00/lvol3 /dev/dsk/c0t1d0

Dump Order

In some circumstances, such as when you are using the primary paging device along with other devices as dump devices, the order in which the devices are dumped to following a system crash matters. By controlling that order, you can minimize the chances that important dump information will be overwritten by paging activity during the subsequent reboot of your computer.

The rule is simple to remember:

No matter how the list of currently active dump devices is built (from a kernel build, from the /etc/fstab file, from use of the crashconf command, or any combination of these), dump devices are used (dumped to) in the reverse order from which they were defined. In other words, the last dump device in the list is the first one used, and the first device in the list is the last one used.

Therefore, if you have to use a device for both paging and dumping, it is best to put it early in the list of dump devices so that other dump devices are used first.
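
The reverse-order rule can be illustrated with a short shell sketch. The function name dump_order and the device list are hypothetical; the function only reverses the list it is given and does not query the system.

```shell
# dump_order DEVICE...
# Given dump devices in the order they were defined, print them one per
# line in the order they would be dumped to (last defined is used first).
dump_order() {
    reversed=""
    for dev in "$@"; do
        reversed="$dev
$reversed"
    done
    printf '%s' "$reversed"
}

# Hypothetical definition order: the paging device first, followed by
# two dedicated dump devices.
dump_order /dev/vg00/lvol2 /dev/dsk/c0t1d0 /dev/dsk/c1t4d0
```

With the paging device defined first, it appears last in the printed order, so the dedicated dump devices are filled before it.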

What Happens When the System Crashes

An HP-UX system crash is an unusual event. When a system panic occurs, it means that HP-UX encountered a condition that it did not know how to handle (or could not handle). Sometimes you know right away what caused the crash (for example, a power failure, or a forklift backing into the disk array). Other times the cause is not readily apparent. It is for this reason that HP-UX is equipped with a dump procedure to capture the contents of memory at the time of the crash for later analysis.

Systems Running HP-UX Releases Prior to Release 11.0

For systems running HP-UX releases prior to Release 11.0, if you have dump devices defined in your kernel configuration (the default is to use the primary paging, or swap, device), HP-UX dumps as much of your computer’s physical memory contents to the dump devices as dump space permits. A panic message is also written to the system console and logged in the file /var/adm/shutdownlog (or /etc/shutdownlog), if shutdownlog exists.

Operator Override Options

Dump control options are displayed at the system console during a crash. If you are running HP-UX Release 11i, compressed dumps are an option. Prior releases of HP-UX provide a subset of the dump control options, depending on the release.

You have the option to control the dump as follows:

  • C or S option - Select a compressed or uncompressed dump

  • N option - Choose to abort the dump

If you opt to continue with the dump, you can also control the dump type, as follows:

  • S option - Perform a selective dump

  • P option - Perform a partial dump

  • F option - Perform a full dump

The following example simulates a dump by using the TC option from the guardian service processor (GSP) console, on a system running HP-UX Release 11i Version 2:

*** The dump will be COMPRESSED.
*** To change this dump type, press any key within 10 seconds.
*** Select one of the following dump types, by pressing the corresponding key:
C) The dump will be compressed.
S) The dump will be without compression.
N) There will be NO DUMP performed
*** Enter your selection now.
[ A Key is Pressed ]
*** Proceeding with compressed dump.

*** The dump will be a SELECTIVE dump:  1240 of 16352 megabytes.
*** To change this dump type, press any key within 10 seconds.
[ A Key is Pressed ]

*** Select one of the following dump types, by pressing the corresponding key:
S) The dump will be a SELECTIVE dump:  1240 of 16352 megabytes.
P) The dump will be a PARTIAL dump:  6138 of 16352 megabytes.
F) The dump will be a FULL dump of 16352 megabytes.

*** Enter your selection now.
[ A Key is Pressed ]
*** Proceeding with selective dump.

*** The dump may be aborted at any time by pressing ESC.

If the reason for the system crash is known, and a dump is not needed, the operator can override any dump device definitions by entering N (for no dump) at the system console within the 10-second override period.

If disk space is limited, but the operator feels that a dump is important, the operator can enter S (for selective dump) regardless of the currently defined dump level.

The Dump

After the operator is given a chance to override the current dump level, or the 10-second override period expires, HP-UX writes the physical memory contents to the dump devices until one of the following conditions is true:

  • The entire contents of memory are dumped (if a full dump was configured or requested by the operator)

  • The entire contents of selected memory pages are dumped (if a selective dump was configured or requested by the operator)

  • Configured dump device space is exhausted

Depending on the amount of memory being dumped, this process can take from a few seconds to hours.

NOTE: While the dump is occurring, status messages on the system console indicate the dump’s progress.

You can interrupt the dump at any time by pressing the ESC (escape) key. It can take as much as 15 seconds to abort. However, if you interrupt a dump, it will be as though a dump never occurred; that is, you will not get a partial dump.

Following the dump, the system attempts to reboot.

The Reboot

After the dumping of physical memory pages is complete, the system attempts to reboot (if the AUTOBOOT flag is set). For information on the AUTOBOOT flag, see “Enabling / Disabling Autoboot”.

The savecrash Processing Option

You can define whether or not you want a process called savecrash to run as your system boots (on HP-UX systems prior to Release 11.0 the process is called savecore). This process copies (and optionally compresses) the memory image stored on the dump devices to the HP-UX file system area.

Dual-Mode Devices (dump / swap)

By default, savecrash is enabled and performs its copy during the boot process. You can disable this operation by editing the /etc/rc.config.d/savecrash file, setting the SAVECRASH environment variable to a value of 0. This is generally safe to do if your dump devices are not also being used as paging devices.

CAUTION: If you are using your devices for both paging and dumping, do not disable the savecrash boot processing or you will lose the dumped memory image to subsequent system paging activity.
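
Before setting SAVECRASH to 0, it helps to confirm that no device serves as both a dump device and a paging device. The sketch below compares two lists that you supply yourself; the function name shared_devices and the device names are hypothetical, and the sketch does not read the system’s actual configuration.

```shell
# shared_devices "DUMP_LIST" "PAGING_LIST"
# Print every device that appears in both space-separated lists.
# Both lists are supplied by the caller; nothing is queried from the system.
shared_devices() {
    for d in $1; do
        for p in $2; do
            [ "$d" = "$p" ] && printf '%s\n' "$d"
        done
    done
    return 0
}

# Hypothetical device lists for illustration.
dump_devs="/dev/vg00/lvol2 /dev/dsk/c0t1d0"
paging_devs="/dev/vg00/lvol2"

overlap=$(shared_devices "$dump_devs" "$paging_devs")
if [ -n "$overlap" ]; then
    echo "Keep SAVECRASH enabled; used for both paging and dumping: $overlap"
else
    echo "No shared devices; SAVECRASH=0 would not expose the dump to paging."
fi
```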

What to Do After the System Has Rebooted

After your system is rebooted, one of the first things you need to do is to be sure that the physical memory image that was dumped to the dump devices is copied to the HP-UX file system area so that you can either package it up and send it to an expert for analysis, or analyze it yourself using a debugger.

NOTE: As of HP-UX Release 11.0, it is possible to analyze a crash dump directly from dump devices using a debugger that supports this feature. But if you need to save it to removable media, or send it to someone, you first need to copy the memory image to the HP-UX file system area. See also the information on converting compressed dumps in “Converting the Format of Compressed Dumps”.

Unless you specifically disable savecrash processing during reboot (see “The savecrash Processing Option”), the savecrash utility will copy the memory image for you during the reboot process. The default HP-UX directory that it will put the memory image in is /var/adm/crash. You can specify a different location by editing the file /etc/rc.config.d/savecrash and setting the environment variable called SAVECRASH_DIR to the name of the directory where you would like the dumps to be located.
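
As an illustration, the relevant lines of /etc/rc.config.d/savecrash might look like the fragment below. The variable names come from the text above; the alternate directory is a hypothetical example, and the real file on your release may contain other settings.

```shell
# Fragment of /etc/rc.config.d/savecrash (illustrative values only).
# SAVECRASH=1 leaves boot-time savecrash processing enabled.
SAVECRASH=1
# Directory that receives the copied memory image. /var/adm/crash is the
# default; the alternate path below is a hypothetical example.
SAVECRASH_DIR=/bigdisk/crash
```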

Using crashutil to Complete the Saving of a Dump

If you are using devices for both paging (swapping) and dumping, it is very important not to disable savecrash processing at boot time. If you do, there is a chance that the memory image in your dump area will be overwritten by normal paging activity. If, however, you have separate dump and paging devices (no single device is used for both purposes), you can delay the copying of the memory image to the HP-UX file system area in order to speed up the boot process and get your system up and running as soon as possible. You do this by editing the file /etc/rc.config.d/savecrash and setting the SAVECRASH environment variable to 0.

If you have delayed the copying of the physical memory image from the dump devices to the HP-UX file system area in this way, run savecrash manually to do the copy when your system is running and when you have made enough room to hold the copy in your HP-UX file system area.
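
One way to check for room before running savecrash by hand is sketched below. It rests on several assumptions: that df accepts the POSIX -P option on your release, that the function names free_kb and has_room (both hypothetical) suit your environment, and that the required size is one you estimate yourself. The sketch prints the savecrash command instead of running it; see savecrash(1M) for the exact arguments.

```shell
# free_kb DIR
# Print the free space, in kilobytes, of the file system holding DIR.
# Relies on the POSIX "df -P" layout: one line per file system, with
# the available space in the fourth column.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# has_room DIR NEEDED_KB
# Succeed if DIR has at least NEEDED_KB kilobytes free.
has_room() {
    [ "$(free_kb "$1")" -ge "$2" ]
}

crash_dir=/var/adm/crash      # default savecrash directory
needed_kb=1269760             # hypothetical estimate: 1240 MB * 1024

if has_room "$crash_dir" "$needed_kb" 2>/dev/null; then
    echo "Enough room; run: /sbin/savecrash $crash_dir"
else
    echo "Not enough room (or $crash_dir is missing); free space first."
fi
```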

If you chose to do a partial save by leaving the SAVECRASH environment variable set to 1 and setting the SAVE_PART environment variable to 1 (in the file /etc/rc.config.d/savecrash), the only pages that were copied to your HP-UX file system area during the boot process are those that were on paging devices. Pages residing on dedicated dump devices are still there. To copy the remaining pages to the HP-UX file system area when your system is running again, use the crashutil command. See crashutil(1M) for details.

Example 5-39 Example

/usr/sbin/crashutil /var/adm/crash/crash.0

Savecrash Options for Compressed Dumps

The savecrash command runs during boot to copy the dump from the dump devices to its storage location in the HP-UX file system. Compressed dump configuration has the following impact on savecrash operations:

  • The savecrash command takes less time to copy the compressed dump. The compressed dump requires less disk storage space.

  • You can still use the savecrash command with the -p option to avoid saving portions of the dump from dedicated dump devices.

  • Although you can specify the -z option with the savecrash command, the option is ignored because the dump is already compressed.

  • Use the savecrash -s chunksize option with care. If you specify a chunk size that is less than the memory size corresponding to one compression unit of a compressed dump, the -s option is also ignored. See savecrash(1M) for details.

Converting the Format of Uncompressed Dumps

Over the course of many recent HP-UX releases, the format of the saved memory image (as saved in the HP-UX file system area) has changed. If you are analyzing a crash dump on a computer running a different version of HP-UX than the computer that crashed, or if you are using a debugging tool that does not understand the specific format of the saved file, you might not be able to analyze the crash dump in its current format. You can use crashutil to convert from one file type to another.

The syntax of the crashutil command to do a conversion is:

/usr/sbin/crashutil [-v version] source [destination]

version, in this command, is the format that you want to convert to. source is the HP-UX file system file/directory containing the dump you want to convert. And, if you do not want to convert the source in place, you can specify an alternate destination for the converted output.

Converting the Format of Compressed Dumps

Compressed dumps are saved in the PARDIR format. The only debug tool that supports PARDIR is adb, as specified in Table 5-5 “Versions of adb That Support Compressed Dumps”.

Table 5-5 Versions of adb That Support Compressed Dumps

HP-UX Release    adb Version
11i Version 1    adb and patch PHCO_28744
11i Version 2    A new version of adb that supports PARDIR

To analyze compressed dumps with older debugging tools or debuggers other than adb, use the crashutil command to convert the compressed dump to one of the previous dump formats. For example, the following command converts a compressed dump to the CRASHDIR format:

/usr/sbin/crashutil -v CRASHDIR /var/adm/crash/crash.0 /var/adm/crash/crash.1

You can then use the crash.1 file for debugging purposes.

Analyzing Crash Dumps

CAUTION: Analyzing crash dumps is not a trivial task. It requires intimate knowledge of HP-UX internals and the use of debuggers. It is beyond the scope of this document to cover the actual analysis process. If you need help analyzing a crash dump, contact your Hewlett-Packard representative.
© 1997-2006 Hewlett-Packard Development Company, L.P.