Debugging Parallel Applications
TotalView and DDT are the parallel debuggers recommended for use in the HP XC environment.

TotalView

TotalView is a full-featured GUI debugger from Etnus, Inc. for debugging parallel applications. It is specifically designed to meet the requirements of parallel applications running on many cores. The use of TotalView in the HP XC environment is described in “Debugging with TotalView”. You can obtain additional information about TotalView from the TotalView documentation and the TotalView Web site.
DDT

DDT (Distributed Debugging Tool) is a comprehensive graphical parallel debugger from Streamline Computing. It gives users a common interface for most compilers, languages, and MPI distributions. For information about using DDT, refer to the Streamline Computing documentation and the Streamline Computing Web site:

http://www.streamline-computing.com/softwaredivision_1.shtml

Debugging with TotalView

TotalView™ is a full-featured, GUI-based debugger specifically designed to meet the requirements of parallel applications running on many cores. You can purchase the TotalView debugger from Etnus, Inc. for use on the HP XC cluster. TotalView is not included with the HP XC software, and HP does not provide technical support for it. Contact Etnus, Inc. about any issues with TotalView.

This section provides only minimum instructions to get you started using TotalView. Instructions for installing TotalView are included in the HP XC System Software Installation Guide. Read the TotalView documentation for full information about using TotalView; the TotalView documentation set is available directly from Etnus, Inc.

As discussed in “Using the Secure Shell to Log In” and “Enabling Remote Execution with OpenSSH”, HP XC systems use the OpenSSH package in place of traditional commands like rsh to provide more secure communication between nodes in the cluster. When run in a parallel environment, TotalView expects to be able to use the rsh command to communicate with other nodes, but the default HP XC configuration disallows this.

Set the TVDSVRLAUNCHCMD environment variable to specify an alternate command for TotalView to use in place of rsh. When you use the TotalView modulefile, as described in “Setting Up TotalView”, this variable is automatically set to /usr/bin/ssh -o BatchMode=yes. If you manage your environment independently of the provided modulefiles, set this variable manually.

Setting Up TotalView

TotalView must be set up as described here:
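For users who manage their environment without the TotalView modulefile, the TVDSVRLAUNCHCMD setting described earlier in this section can be made by hand. The following is a minimal sketch, assuming a Bourne-style shell such as sh or bash; the value is the one this section states the modulefile sets. Users of csh-style shells would use setenv instead.

```shell
# Assumption: a Bourne-style shell (sh, bash, ksh).
# The value below is the one the TotalView modulefile is documented to set.
TVDSVRLAUNCHCMD="/usr/bin/ssh -o BatchMode=yes"
export TVDSVRLAUNCHCMD

# Confirm the value that child processes, including TotalView, will inherit.
echo "TVDSVRLAUNCHCMD=$TVDSVRLAUNCHCMD"
```

Placing these lines in a shell startup file would make the setting persist across logins; whether that suits a given site is a local decision.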
Your administrator may have already installed TotalView and set up the environment for you. In this case, skip the steps in this section and proceed to “Setting TotalView Preferences”, which describes using TotalView for the first time.

Use the following commands to allocate the nodes you need before you debug an application with SLURM:
These commands allocate x nodes and run TotalView to debug the program named application. Be sure to exit from the SLURM allocation created with the srun command when you are done.

HP recommends the use of xterm when debugging an application with LSF-HPC. You must also allocate the nodes that the job requires. You may need to verify the full path name of the xterm and mpirun commands:

First run a bsub command to allocate the nodes you will need and to launch an xterm window:
Enter an mpirun -tv command in the xterm window to start TotalView on the application you want to debug:
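For the LSF-HPC procedure above, one way to verify the full path names of xterm and mpirun is the shell's command -v builtin. The following sketch is illustrative only: check_paths is a hypothetical helper name, not part of the HP XC software.

```shell
# Hypothetical helper: print the full path of each named command,
# or report it as missing. Returns non-zero if any command is not found.
# Note: for shell builtins or functions, command -v prints the name
# itself rather than a path.
check_paths() {
    status=0
    for cmd in "$@"; do
        if path=$(command -v "$cmd"); then
            echo "$cmd: $path"
        else
            echo "$cmd: not found on PATH" >&2
            status=1
        fi
    done
    return "$status"
}

# Example: check_paths xterm mpirun
```

If a command is reported as missing, its directory is probably not on PATH, and the full path name has to be given explicitly in the bsub and mpirun commands.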
Setting TotalView Preferences

You should set TotalView preferences the first time you invoke TotalView. For example, you need to tell TotalView how to launch TotalView processes on all the cores. The TotalView preferences are maintained in a preferences file named .totalview/preferences in your home directory, so you only need to set these preferences once.

Follow these steps to set the TotalView preferences:
The TotalView launch preferences are configured and saved. You can change this configuration at any time.

Debugging an Application

This section describes how to use TotalView to debug an application.
Continue debugging as you would on any system. If you are not familiar with TotalView, you can click on Help in the right-hand corner of the process window for additional information.

As an alternative to the method described in “Debugging an Application”, you can also attach an instance of TotalView to an application that is already running.
Make sure your job has completed before you exit TotalView. You may need to wait a few seconds after the job completes for srun to exit completely. If you exit TotalView before your job has completed, use the squeue command to check whether the job is still on the system.
If it is still there, use the following command to remove all of your jobs:
To cancel only certain jobs, refer to the scancel manpage for information about selective job cancellation.
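The check-and-cancel sequence described above can be wrapped in a small shell function. This is a sketch under stated assumptions rather than an HP XC utility: cleanup_jobs is a hypothetical name, and it relies on the SLURM options squeue -h (no header line) and -u (select by user), and scancel -u (select by user). It cancels jobs only when passed --cancel, because doing so removes all of that user's jobs.

```shell
# Hypothetical helper: list a user's jobs that SLURM still knows about,
# and cancel all of them only when --cancel is given.
# Usage: cleanup_jobs [user] [--cancel]
cleanup_jobs() {
    user=${1:-$(id -un)}                # default to the current user
    leftover=$(squeue -h -u "$user")    # -h suppresses the header line
    if [ -z "$leftover" ]; then
        echo "no jobs left for $user"
        return 0
    fi
    echo "$leftover"
    if [ "${2:-}" = "--cancel" ]; then
        scancel -u "$user"              # removes ALL jobs owned by $user
    fi
}
```

For example, cleanup_jobs "$USER" lists any remaining jobs, and cleanup_jobs "$USER" --cancel also removes them.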