HP XC System Software: User's Guide > Chapter 6 Debugging Applications

Debugging Parallel Applications


TotalView and DDT are the parallel debuggers recommended for use in the HP XC environment.

TotalView

TotalView is a full-featured GUI debugger from Etnus, Inc. for debugging parallel applications. It is specifically designed to meet the requirements of parallel applications running on many cores. The use of TotalView in the HP XC environment is described in “Debugging with TotalView”. You can obtain additional information about TotalView from the TotalView documentation and the TotalView Web site at:

http://www.etnus.com

Note:

TotalView is not included with the HP XC software and is not supported. If you have any problems installing or using TotalView, contact Etnus, Inc.

DDT

DDT (Distributed Debugging Tool) is a parallel debugger from Streamline Computing. DDT is a comprehensive graphical debugger designed for debugging parallel code. It gives users a common interface for most compilers, languages and MPI distributions. For information about using DDT, see the Streamline Computing documentation and the Streamline Computing Web site:

http://www.streamline-computing.com/softwaredivision_1.shtml

Debugging with TotalView

TotalView™ is a full-featured, GUI-based debugger specifically designed to meet the requirements of parallel applications running on many cores.

You can purchase the TotalView debugger from Etnus, Inc. for use on the HP XC cluster.

TotalView is not included with the HP XC software and technical support is not provided by HP. Contact Etnus, Inc. for any issues with TotalView.

This section provides only minimum instructions to get you started using TotalView. Instructions for installing TotalView are included in the HP XC System Software Installation Guide. Read the TotalView documentation for full information about using TotalView; the TotalView documentation set is available directly from Etnus, Inc. at the following URL:

http://www.etnus.com

SSH and TotalView

As discussed in “Using the Secure Shell to Log In” and “Enabling Remote Execution with OpenSSH”, HP XC systems use the OpenSSH package in place of traditional commands like rsh to provide more secure communication between nodes in the cluster. When run in a parallel environment, TotalView expects to be able to use the rsh command to communicate with other nodes, but the default HP XC configuration disallows this.

Set the TVDSVRLAUNCHCMD environment variable to specify an alternate command for TotalView to use in place of rsh. When using the TotalView Modulefile, as described in “Setting Up TotalView”, this variable is automatically set to /usr/bin/ssh -o BatchMode=yes. If you manage your environment independently of the provided modulefiles, set this variable manually.
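If you manage your environment without the modulefile, the variable can be set by hand. The following is a minimal sketch for a Bourne-compatible shell; it reuses the value that the text above says the modulefile sets, and it leaves any existing value untouched.

```shell
# Keep any value already provided (for example, by the TotalView modulefile);
# otherwise fall back to the ssh command that the modulefile would have set.
: "${TVDSVRLAUNCHCMD:=/usr/bin/ssh -o BatchMode=yes}"
export TVDSVRLAUNCHCMD

# Show the value that TotalView will see.
echo "TVDSVRLAUNCHCMD=$TVDSVRLAUNCHCMD"
```

Placing these lines in your shell initialization file makes the setting available in every login session.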

Setting Up TotalView

TotalView must be set up as described here:

  1. Determine whether TotalView is installed. Use the which or whereis command to do so.

  2. Determine whether environment variables have been defined for TotalView. You can use the echo command for this.

  3. Set the DISPLAY environment variable of the system that hosts TotalView to display on your local system.

    Also, run the xhost command locally to accept data from the system that hosts TotalView; see the X(7X) manpage for more information.

  4. Edit the PATH environment variable to include the location of the TotalView executable.

    Also add the location of the TotalView manpages to the MANPATH environment variable.

    The following list summarizes some suggestions:

    • Edit your login file or profile file to include the following commands:

      module load mpi
      module load totalview

    • Set the PATH and MANPATH environment variables in your shell initialization file or login file.

    • Have your system administrator set up your environment so that the TotalView modulefile loads automatically when you log in to the system.

    • Adjust your environment manually before invoking TotalView.

    See “Overview of Modules” for information on modulefiles.

    Your administrator may have already installed TotalView and set up the environment for you. In this case, skip the steps in this section and proceed to “Setting TotalView Preferences”, which describes using TotalView for the first time.
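The checks in steps 1, 2, and 4 can be sketched as a short Bourne-shell fragment. This is an illustration only: the TV_HOME variable and the /opt/totalview location are hypothetical placeholders, so substitute the directory where TotalView is actually installed on your system.

```shell
# Step 1: is a totalview executable reachable through PATH?
if command -v totalview >/dev/null 2>&1; then
    tv_status="found at $(command -v totalview)"
else
    tv_status="not found on PATH"
fi
echo "totalview: $tv_status"

# Step 2: report the TotalView-related variables mentioned in this chapter.
echo "TVDSVRLAUNCHCMD=${TVDSVRLAUNCHCMD:-<unset>}"
echo "DISPLAY=${DISPLAY:-<unset>}"

# Step 4: add the TotalView directories to PATH and MANPATH.
# TV_HOME and /opt/totalview are hypothetical; use your real install location.
TV_HOME="${TV_HOME:-/opt/totalview}"
case ":$PATH:" in
    *":$TV_HOME/bin:"*) ;;                 # already present, do nothing
    *) PATH="$PATH:$TV_HOME/bin" ;;        # append once
esac
MANPATH="${MANPATH:+$MANPATH:}$TV_HOME/man"
export PATH MANPATH
```

If the modulefile is available, loading it remains the simpler choice, because it sets these variables for you.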

Using TotalView with SLURM

Before you debug an application with SLURM, allocate the nodes you need and then start TotalView, as shown here:

$ srun -Nx -A
$ mpirun -tv -srun application

These commands allocate x nodes and run TotalView to debug the program named application.

Be sure to exit from the SLURM allocation created with the srun command when you are done.

Using TotalView with LSF-HPC

HP recommends the use of xterm when debugging an application with LSF-HPC. You also need to allocate the nodes you will need.

You may need to verify the full path name of the xterm and mpirun commands.

First run a bsub command to allocate the nodes you will need and to launch an xterm window:

$ bsub -nx -ext "SLURM[nodes=x]" \
-Is /usr/bin/xterm

Enter an mpirun -tv command in the xterm window to start TotalView on the application you want to debug:

$ mpirun -tv -srun application
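As a sketch of the preparation described above, the following fragment reports where xterm and mpirun are found and prints the bsub command for a given value of x without submitting anything. The helper name tv_bsub_cmd is hypothetical, and the same value is used for -n and nodes= only because the example above uses x in both places.

```shell
# Report the full path of each command, or note that it is missing.
for cmd in xterm mpirun; do
    command -v "$cmd" || echo "$cmd: not found on PATH"
done

# Hypothetical helper: print (do not run) the bsub command line for
# a node count and an optional xterm path.
tv_bsub_cmd() {
    nodes="$1"
    xterm_path="${2:-/usr/bin/xterm}"
    printf 'bsub -n%s -ext "SLURM[nodes=%s]" -Is %s\n' \
        "$nodes" "$nodes" "$xterm_path"
}

tv_bsub_cmd 4
```

Review the printed command, then enter it at the prompt to open the xterm window in which you run mpirun -tv.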

Setting TotalView Preferences

You should set TotalView preferences the first time you invoke it. For example, you need to tell TotalView how to launch TotalView processes on all the cores.

The TotalView preferences are maintained in a preferences file named .totalview/preferences in your home directory, so you only need to set these preferences once.

Follow these steps to set the TotalView preferences:

  1. Invoke the TotalView debugger:

    $ totalview

    TotalView's main control window (called the TotalView Root Window) appears.

  2. Select Preferences from the File pull-down menu.

    A Preferences window opens.

  3. Select (that is, click on) the Launch Strings tab in the Preferences window.

  4. Ensure that the Enable single debug server launch button is selected in the Launch Strings tab.

  5. In the Launch Strings table, in the area immediately to the right of Command:, verify that the default command launch string shown is as follows:

    %C %R -n "%B/tvdsvr -working_directory %D -callback %L -set_pw %P 
    -verbosity %V %F"

    You may be able to obtain this setting by selecting the Defaults button; otherwise, you need to enter this command launch string.

  6. Select the Bulk Launch tab in the Preferences window.

    Make sure that Enable debug server bulk launch is not selected.

  7. Select the OK button at the bottom-left of the Preferences window to save these changes.

  8. Exit TotalView by selecting Exit from the File pull-down menu.

The TotalView launch preferences are configured and saved. You can change this configuration at any time.
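To confirm from the command line that the preferences were saved, you can check for the file named earlier in this section. This sketch tests only that the file exists; it makes no assumption about the file's contents or format.

```shell
# Location given in this section: .totalview/preferences in the home directory.
prefs="$HOME/.totalview/preferences"

if [ -f "$prefs" ]; then
    echo "TotalView preferences found: $prefs"
else
    echo "No preferences file at $prefs; set the preferences as described above"
fi
```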

Debugging an Application

This section describes how to use TotalView to debug an application.

  1. Compile the application to be debugged. For example:

    $ mpicc -g -o Psimple simple.c -lm

    Use the -g option to enable debugging information.

  2. Run the application in TotalView:

    $ mpirun -tv -srun -n2 ./Psimple

  3. The TotalView main control window, called the TotalView root window, opens. It displays the following message in the window header:

    Etnus TotalView Version#

  4. The TotalView process window opens.

    This window contains multiple panes that provide various debugging functions and debugging information. The name of the application launcher that is being used (either srun or mpirun) is displayed in the title bar.

  5. Set the search path if you are invoking TotalView from a directory that does not contain the executable file and the source code. If TotalView is invoked from the same directory, you can skip to Step 6.

    Set the search path as follows:

    1. Select the File pull-down menu of the TotalView process window.

    2. Select Search Path from the list that appears.

    TotalView, by default, will now search for source and binaries (including symbol files) in the following places and in the following order:

    • Current working directory

    • Directories in the File > Search Path list

    • Directories specified in your PATH environment variable

  6. Select the Go button in the TotalView process window. A pop-up window appears, asking if you want to stop the job:

    Process srun is a parallel job.
    Do you want to stop the job now?

  7. Select Yes in this pop-up window. The TotalView root window appears and displays a line for each process being debugged.

    If you are running Fortran code, another pop-up window may appear with the following warning:

    Sourcefile initfdte.f was not found, using assembler mode.

    Select OK to close this pop-up window. You can safely ignore this warning.

  8. You can now set a breakpoint somewhere in your code. The method to do this may vary slightly between versions of TotalView. For TotalView Version 6.0, the basic process is as follows:

    1. Select At Location in the Action Point pull-down menu of the TotalView process window.

    2. Enter the name of the location where you want to set a breakpoint.

    3. Select OK.

  9. Select the Go button to run the application and go to the breakpoint.

Continue debugging as you would on any system. If you are not familiar with TotalView, you can select Help in the right-hand corner of the process window for additional information.
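Steps 1 and 2 of this procedure can be combined in a small wrapper. The sketch below uses the file names from the example (simple.c and Psimple) and two processes; it runs the commands only if mpicc and the source file are present, and otherwise prints what it would run.

```shell
# Names taken from the example in this section; change them for your program.
SRC=simple.c
BIN=Psimple
NPROCS=2

# -g adds the debugging information that TotalView needs.
compile_cmd="mpicc -g -o $BIN $SRC -lm"
launch_cmd="mpirun -tv -srun -n$NPROCS ./$BIN"

if command -v mpicc >/dev/null 2>&1 && [ -f "$SRC" ]; then
    # Unquoted on purpose so the strings split into words;
    # this is safe only because the names contain no spaces.
    $compile_cmd && $launch_cmd
else
    echo "Would run: $compile_cmd"
    echo "Then:      $launch_cmd"
fi
```

The remaining steps are carried out in the TotalView windows and cannot be scripted this way.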

Debugging Running Applications

As an alternative to the method described in “Debugging an Application”, you can attach an instance of TotalView to an application that is already running.

  1. Compile a long-running application as in “Debugging an Application”:

    $ mpicc -g -o Psimple simple.c -lm

  2. Run the application:

    $ mpirun -srun -n2 Psimple

  3. Start TotalView:

    $ totalview

  4. Select Unattached in the TotalView Root Window to display a list of running processes.

    Double-click on the srun process to attach to it.

  5. The TotalView Process Window appears, displaying information on the srun process.

    Select Attached in the TotalView Root Window.

  6. Double-click one of the remote srun processes to display it in the TotalView Process Window.

  7. You should now be able to set breakpoints to debug the application.

Exiting TotalView

Make sure your job has completed before exiting TotalView. This may require that you wait a few seconds from the time your job has completed until srun has completely exited.

If you exit TotalView before your job is completed, use the squeue command to ensure that your job is not still on the system.

$ squeue

If it is still there, use the following command to remove all of your jobs:

$ scancel --user username

To cancel individual jobs, see the scancel manpage for information about selective job cancellation.
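The check described above can be sketched as follows. The -h (no header) and -u (user filter) options are assumptions about the squeue version installed on your system; if they are not accepted, run plain squeue as shown earlier. The sketch only prints the scancel command instead of running it, so no job is removed by accident.

```shell
# Determine the current user name.
me="${USER:-$(id -un)}"

if ! command -v squeue >/dev/null 2>&1; then
    job_state="unknown"
    echo "squeue not found; run this where the SLURM commands are available"
elif [ -n "$(squeue -h -u "$me" 2>/dev/null)" ]; then
    job_state="present"
    squeue -u "$me"
    # Printed only; run it yourself after reviewing the list above.
    echo "To remove all of your jobs: scancel --user $me"
else
    job_state="none"
    echo "No jobs for $me remain in the queue"
fi
```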

© 2003 Hewlett-Packard Development Company, L.P.