High performance programming methods address systems of one
or more processors that can be distributed within an SMP system
or over the nodes of a cluster. These methods include standard
serial optimizations and library calls, auto-parallelization offered
by some compilers, OpenMP directives, calls to the POSIX threads
(Pthreads) library, and calls to a Message Passing Interface (MPI)
library.
A parallel high performance program that takes advantage of
multiple processors still runs as a collection of single-processor
programs executing concurrently. Standard serial optimizations
aimed at uniprocessor performance therefore remain important. These
include loop unrolling, cache blocking, and other coding techniques
that help the compiler optimize and pipeline the program.
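
As an illustration of cache blocking, the following is a minimal
sketch of a tiled matrix multiply in C. The function name
matmul_blocked and the tile size BLOCK are hypothetical and chosen
only for this example. A suitable tile size depends on the cache
sizes of the target processor and would normally be found by
measurement.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed tile edge length; not taken from the source text.
 * Tune it so that three BLOCK x BLOCK tiles fit in cache. */
#define BLOCK 32

static size_t min_size(size_t a, size_t b)
{
    return a < b ? a : b;
}

/* Computes C += A * B for n x n row-major matrices.
 * The three outer loops walk over tiles; the three inner loops
 * work inside one tile, so data loaded into cache is reused
 * before it is evicted. */
void matmul_blocked(size_t n, const double *a, const double *b, double *c)
{
    assert(a != NULL && b != NULL && c != NULL);

    for (size_t ii = 0; ii < n; ii += BLOCK) {
        for (size_t kk = 0; kk < n; kk += BLOCK) {
            for (size_t jj = 0; jj < n; jj += BLOCK) {
                size_t i_end = min_size(ii + BLOCK, n);
                size_t k_end = min_size(kk + BLOCK, n);
                size_t j_end = min_size(jj + BLOCK, n);

                for (size_t i = ii; i < i_end; i++) {
                    for (size_t k = kk; k < k_end; k++) {
                        /* Hoist the invariant load out of the j loop. */
                        double aik = a[i * n + k];
                        for (size_t j = jj; j < j_end; j++) {
                            c[i * n + j] += aik * b[k * n + j];
                        }
                    }
                }
            }
        }
    }
}
```

The caller is expected to zero C first if a plain product is
wanted. The innermost loop runs over contiguous elements of B and
C, which also gives the compiler a straightforward candidate for
unrolling and pipelining.
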
With multi-threading (OpenMP directives or Pthreads calls) and
message passing (calls to an MPI library), the developer then
achieves concurrency: multiple parts of the program run
simultaneously, which can speed the program up by a significant
factor. Compiler auto-parallelization can generate multi-threaded
code automatically.
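
As an illustration of directive-based multi-threading, the
following is a minimal sketch of an OpenMP parallel loop in C. The
function name dot_product is hypothetical. The sketch assumes a
compiler with OpenMP support enabled (for example through an
option such as -fopenmp in GCC). Without that support the pragma
is ignored and the loop simply runs serially with the same result.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the dot product of x and y, each of length n.
 * The directive asks the compiler to split the iterations among
 * threads. The reduction clause gives each thread a private
 * partial sum and combines the partial sums when the loop ends,
 * which avoids a data race on sum. */
double dot_product(const double *x, const double *y, long n)
{
    assert(n >= 0);

    double sum = 0.0;

    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; i++) {
        sum += x[i] * y[i];
    }

    return sum;
}
```

Because floating-point addition is not associative, the threaded
result can differ from the serial result in the last few bits for
general inputs. The same loop could be written with explicit
Pthreads calls, at the cost of managing thread creation, work
partitioning, and the final combination by hand.
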
HP provides an MPI library, HP MPI, that is optimized for HP
clusters on four platforms: HP-UX on PA-RISC, HP-UX on Itanium,
Linux on IA-32, and Linux on Itanium. HP MPI is reliable,
thread-safe, and fully compliant with the MPI-2 standard. It
supports a variety of interconnects, including TCP/IP, Quadrics
Elan, and InfiniBand, as well as intranode communication.
HP Visual MPI, a companion product to HP MPI, is an analysis tool
that provides error detection and statistical analysis to highlight
issues that limit performance.