Similar results can be achieved using "mpsched", but this approach
has the advantage of distributing processes based on load, and it works
well in psets and across multiple machines.
Binding ranks to ldoms (-cpu_bind)
On SMP systems, processes sometimes move to a different ldom
shortly after startup or during execution. This increases memory
latency and can slow the application, because it is now
accessing memory across cells.
Applications that are very sensitive to memory latency can show
large performance degradation when memory access is mostly off-cell.
To solve this problem, ranks need to remain in the ldom in which
they were originally created. For this purpose, HP-MPI provides
the -cpu_bind flag, which locks a rank to
a specific ldom and prevents it from moving during execution. To
do so, the -cpu_bind flag preloads a shared library
at startup for each process, which does the following:
Spins for a short time in
a tight loop to let the operating system distribute processes to
CPUs evenly.
Determines the current CPU
and ldom of the process and, if no oversubscription occurs on the
current CPU, locks the process to the ldom of that CPU.
This distributes the ranks evenly across CPUs and prevents
them from moving to a different ldom after the MPI application
starts, which avoids cross-cell memory access.
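The binding decision made by the preloaded library can be illustrated with a small simulation. This is only a sketch of the logic described above, not HP-MPI code: the function name bind_ranks, the dictionary inputs, and the choice to leave a rank unbound when its CPU is oversubscribed are all assumptions made for illustration. The text does not say what the library does in the oversubscribed case.

```python
from collections import Counter


def bind_ranks(rank_cpus, cpu_to_ldom):
    """Simulate the per-rank binding decision (hypothetical helper).

    rank_cpus:   {rank: CPU the OS placed the rank on after the spin phase}
    cpu_to_ldom: {cpu: ldom that owns the CPU}
    Returns {rank: ldom the rank is locked to, or None if left unbound}.
    """
    # Count how many ranks the OS placed on each CPU.
    ranks_per_cpu = Counter(rank_cpus.values())

    binding = {}
    for rank, cpu in rank_cpus.items():
        if ranks_per_cpu[cpu] > 1:
            # Assumption: an oversubscribed CPU means the rank is not locked.
            binding[rank] = None
        else:
            # No oversubscription: lock the rank to the ldom of its CPU.
            binding[rank] = cpu_to_ldom[cpu]
    return binding


# Example: four CPUs in two ldoms; ranks 2 and 3 landed on the same CPU.
cpu_to_ldom = {0: 0, 1: 0, 2: 1, 3: 1}
rank_cpus = {0: 0, 1: 2, 2: 3, 3: 3}
print(bind_ranks(rank_cpus, cpu_to_ldom))
```

In this example, ranks 0 and 1 are locked to ldoms 0 and 1 respectively, while ranks 2 and 3 share CPU 3 and are therefore left unbound under the stated assumption.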
See -cpu_bind under “mpirun
options” for more information.