Fortran 90, Fortran 77, C, aC++: Exemplar Programming Guide > Chapter 6 Advanced shared-memory programming

Parallel information functions


Several intrinsics provide information about the parallelism, or potential parallelism, of your program. All are integer functions, available in both 4- and 8-byte lengths, and they can appear in executable statements anywhere an integer expression is legal. The 8-byte versions, which are suffixed with _8, expect 8-byte input arguments and return 8-byte values; they are typically used only in Fortran programs in which the default data lengths have been changed using the -I8 or similar compiler options. When default integer lengths are modified via compiler options in Fortran, the correct intrinsic is chosen automatically regardless of which one is specified.

NOTE: All C/C++ code examples presented in this chapter assume that the line
   
#include <spp_prog_model.h>

appears above the C/C++ code presented. This header file contains the necessary type and function definitions.

The subsections that follow describe these functions.

Number of processors

These functions return the total number of processors on which the process has initiated threads. These threads are not necessarily active.

In Fortran, these functions have the forms:

INTEGER NUM_PROCS()
INTEGER*8 NUM_PROCS_8()

In C, they have the forms:

int num_procs(void);
long long num_procs_8(void);

num_procs can be used to dimension automatic and adjustable arrays in Fortran, and may be used in Fortran, C, and C++ to dynamically specify array dimensions and allocate storage.
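As a minimal sketch of the dynamic-allocation use described above, the following C fragment sizes a scratch array by the value num_procs returns. The build flag HAVE_SPP_PROG_MODEL and the helper alloc_per_proc_slots are hypothetical names introduced only for this illustration; when the flag is not defined, a stand-in num_procs that always returns 4 is used so the fragment compiles on other systems.

```c
#include <assert.h>
#include <stdlib.h>

#ifdef HAVE_SPP_PROG_MODEL          /* hypothetical build flag, not an HP macro */
#include <spp_prog_model.h>
#else
/* Stand-in so the sketch compiles off-platform; always reports 4 processors. */
static int num_procs(void) { return 4; }
#endif

/* Allocate one zero-initialized partial-sum slot per processor.
   The slot count is stored in *count; returns NULL if allocation fails. */
static double *alloc_per_proc_slots(int *count)
{
    assert(count != NULL);
    int n = num_procs();
    if (n < 1)
        n = 1;                      /* defensive: keep the array non-empty */
    *count = n;
    return calloc((size_t)n, sizeof(double));
}
```

The caller owns the returned array and releases it with free when the per-processor results have been combined.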

Number of threads

These functions return the total number of threads the process creates at initiation, regardless of how many hypernodes the threads occupy, and regardless of how many are idle or active. They are typically used to manually define thread-parallel loops which may span hypernodes.

In Fortran, these functions have the forms:

INTEGER NUM_THREADS()
INTEGER*8 NUM_THREADS_8()

In C, they have the forms:

int num_threads(void);
long long num_threads_8(void);

The return value differs from that of num_procs only if threads are oversubscribed.
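The relationship between the two counts can be turned into a simple runtime check. The sketch below is illustrative only: HAVE_SPP_PROG_MODEL is a hypothetical build flag, the helpers counts_oversubscribed and process_is_oversubscribed are not part of the library, and the stand-in functions report 4 processors and 4 threads when the real header is unavailable.

```c
#include <assert.h>

#ifdef HAVE_SPP_PROG_MODEL          /* hypothetical build flag, not an HP macro */
#include <spp_prog_model.h>
#else
/* Stand-ins so the sketch compiles off-platform. */
static int num_procs(void)   { return 4; }
static int num_threads(void) { return 4; }
#endif

/* Pure helper: per the text, the counts differ only under oversubscription. */
static int counts_oversubscribed(int threads, int procs)
{
    assert(threads >= 1 && procs >= 1);
    return threads != procs;
}

/* Nonzero if the calling process has oversubscribed its processors. */
static int process_is_oversubscribed(void)
{
    return counts_oversubscribed(num_threads(), num_procs());
}
```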

Number of hypernodes

These functions return the number of hypernodes on which the process is running. They can be used to dimension automatic and adjustable arrays in Fortran and can be used in Fortran, C, and C++ to dynamically specify array dimensions and allocate storage.

In Fortran, these functions have the forms:

INTEGER NUM_NODES()
INTEGER*8 NUM_NODES_8()

In C, they have the forms:

int num_nodes(void);
long long num_nodes_8(void);

Number of threads on current hypernode

These functions return the number of the calling process's threads running on the hypernode from which the function is called. This number can vary from one hypernode to another depending on system configurations, usage of manual parallelization directives, and the number of processors installed on each hypernode.

In Fortran, these functions have the forms:

INTEGER NUM_NODE_THREADS()
INTEGER*8 NUM_NODE_THREADS_8()

In C, they have the forms:

int num_node_threads(void);
long long num_node_threads_8(void);

Thread ID

When called from parallel code, these functions return the spawn thread ID of the calling thread, in the range 0..nst-1, where nst is the number of threads in the current spawn context (the number of threads spawned by the last parallel construct). Use them when you wish to direct specific tasks to specific threads inside parallel constructs.

In Fortran, these functions have the forms:

INTEGER MY_THREAD()
INTEGER*8 MY_THREAD_8()

In C, they have the forms:

int my_thread(void);
long long my_thread_8(void);

When called from serial code, these functions return 0.
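One common way to direct work to specific threads is to give each spawn thread ID a contiguous block of loop iterations. The following is a sketch under stated assumptions, not the guide's own method: the caller must supply nst (the number of threads in the current spawn context), the helper names block_range and my_block_range are hypothetical, and HAVE_SPP_PROG_MODEL is a hypothetical build flag. The stand-in my_thread returns 0, which matches the documented serial-code behavior.

```c
#include <assert.h>

#ifdef HAVE_SPP_PROG_MODEL          /* hypothetical build flag, not an HP macro */
#include <spp_prog_model.h>
#else
/* Stand-in so the sketch compiles off-platform; mimics a call from serial code. */
static int my_thread(void) { return 0; }
#endif

/* Compute the half-open iteration range [*lo, *hi) that thread tid owns
   when n iterations are split into nst nearly equal contiguous blocks. */
static void block_range(long n, int nst, int tid, long *lo, long *hi)
{
    assert(n >= 0 && nst >= 1 && tid >= 0 && tid < nst);
    long chunk = (n + nst - 1) / nst;   /* ceiling division */
    long begin = (long)tid * chunk;
    long end = begin + chunk;
    if (begin > n) begin = n;           /* trailing threads may get no work */
    if (end > n) end = n;
    *lo = begin;
    *hi = end;
}

/* Range owned by the calling thread; nst is supplied by the caller. */
static void my_block_range(long n, int nst, long *lo, long *hi)
{
    block_range(n, nst, my_thread(), lo, hi);
}
```

Inside a parallel construct, each thread would call my_block_range and then loop from lo up to, but not including, hi.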

Hypernode ID

These functions return the logical hypernode ID of the hypernode on which the calling thread is running, in the range 0..num_nodes()-1. Use them when you wish to direct specific tasks to specific hypernodes inside parallel constructs.

In Fortran, these functions have the forms:

INTEGER MY_NODE()
INTEGER*8 MY_NODE_8()

In C, they have the forms:

int my_node(void);
long long my_node_8(void);

Logical hypernode IDs range from 0 to n-1, where n is the number of available hypernodes in the system. Logical IDs are assigned in the order in which your program occupies the system. The hypernode that your program's thread 0 runs on is considered logical hypernode 0; any hypernodes it expands to later are assigned increasing logical ID numbers. Because the operating system starts a program on the least-loaded hypernode, the mapping of logical hypernode IDs to physical hypernodes can differ between programs. Two programs running on the same system are therefore unlikely to address identical hypernodes with identical logical IDs.

Logical hypernode IDs are mapped to physical hypernode IDs, which are unique for each hypernode at the machine level.
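As an illustration of directing tasks to specific hypernodes, the sketch below assigns task numbers to logical hypernode IDs in round-robin order and lets a thread ask whether a task belongs to its own hypernode. The round-robin policy, the helper names owner_node and task_is_local, and the build flag HAVE_SPP_PROG_MODEL are all assumptions made for this example; the stand-ins report 2 hypernodes and logical hypernode 0.

```c
#include <assert.h>

#ifdef HAVE_SPP_PROG_MODEL          /* hypothetical build flag, not an HP macro */
#include <spp_prog_model.h>
#else
/* Stand-ins so the sketch compiles off-platform. */
static int num_nodes(void) { return 2; }
static int my_node(void)   { return 0; }
#endif

/* Pure helper: logical hypernode ID that owns a task under round-robin. */
static int owner_node(long task, int nodes)
{
    assert(task >= 0 && nodes >= 1);
    return (int)(task % nodes);
}

/* Nonzero if the task is assigned to the calling thread's hypernode. */
static int task_is_local(long task)
{
    return owner_node(task, num_nodes()) == my_node();
}
```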

Level of parallelism

These functions return a value representing the level of parallelism of the calling process.

In Fortran, these functions have the forms:

INTEGER LEVEL_OF_PARALLELISM()
INTEGER*8 LEVEL_OF_PARALLELISM_8()

In C and C++, they have the forms:

int level_of_parallelism(void);
long long level_of_parallelism_8(void);

The return value is either one of the values shown in Table 6-1 “Levels of parallelism” or a sum (bitwise OR) of several of them. In C and C++, these values are #defined as symbolic constants in spp_prog_model.h.

Table 6-1 Levels of parallelism

Function return value   C/C++ symbolic constant name   Meaning
0                       CPS_PL_NONE                    Not parallel
1                       CPS_PL_PARALLEL                Asymmetric thread active
2                       CPS_PL_NODE                    Node-parallelism
4                       CPS_PL_NTHREAD                 Thread-parallelism within a hypernode
8                       CPS_PL_THREAD                  Single-dimensional thread-parallelism

 

As an example of how these can be summed, assume the return value is 6. This means the process is two-dimensionally parallel; it first went parallel across hypernodes, and, within the current hypernode, it went parallel again on the threads of the hypernode. This differs from a return value of 8, which means the process went one-dimensionally thread-parallel and occupies all available threads on all available hypernodes with no nested parallelism.

The valid sum values are 3, 5, 6, 7, and 9.
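Because the return value is a bitwise OR, it can be decoded by testing individual bits. The following sketch shows one way to do so; the helper names and the build flag HAVE_SPP_PROG_MODEL are hypothetical, and the fallback constants simply restate the values listed in Table 6-1 for systems where spp_prog_model.h is unavailable.

```c
#include <assert.h>

#ifdef HAVE_SPP_PROG_MODEL          /* hypothetical build flag, not an HP macro */
#include <spp_prog_model.h>
#else
/* Fallback definitions copied from Table 6-1. */
#define CPS_PL_NONE     0
#define CPS_PL_PARALLEL 1
#define CPS_PL_NODE     2
#define CPS_PL_NTHREAD  4
#define CPS_PL_THREAD   8
#endif

/* Nonzero if an asymmetric thread is active (bit value 1 is set). */
static int has_asymmetric_thread(int level)
{
    assert(level >= 0);
    return (level & CPS_PL_PARALLEL) != 0;
}

/* Nonzero if the process is parallel across hypernodes and, within the
   current hypernode, parallel again across threads (values 6 and 7). */
static int is_two_dimensional(int level)
{
    const int both = CPS_PL_NODE | CPS_PL_NTHREAD;
    assert(level >= 0);
    return (level & both) == both;
}

/* Nonzero if the process is one-dimensionally thread-parallel. */
static int is_one_dimensional_thread(int level)
{
    assert(level >= 0);
    return (level & CPS_PL_THREAD) != 0;
}
```

In a program these helpers would typically be applied to the result of level_of_parallelism().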

A return value of 1, or a sum including 1, means an asymmetric thread is active in the calling program. Asymmetric parallelism is currently only supported by the Compiler Parallel Support Library. Refer to Appendix F, “Compiler Parallel Support Library,” for more information.

Stack memory type

These functions return a value representing the memory class from which the current thread stack is allocated. The thread stack holds all the procedure-local arrays and variables not manually assigned a class. The thread stack is created in near_shared memory by default. (For nonscalable SMP systems, near_shared memory is automatically mapped to node_private memory.)

In Fortran, these functions have the forms:

INTEGER MEMORY_TYPE_OF_STACK()
INTEGER*8 MEMORY_TYPE_OF_STACK_8()

In C and C++, they have the forms:

int memory_type_of_stack(void);
long long memory_type_of_stack_8(void);

These functions return one of the values described in Table 6-2 “Stack type return values”.

Table 6-2 Stack type return values

Function return value   C/C++ symbolic constant name   Stack memory type
4                       FAR_SHARED_MEM                 far_shared
3                       NEAR_SHARED_MEM                near_shared
2                       NODE_PRIVATE_MEM               node_private
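For diagnostics, the return value can be translated back into the memory class name. This is a minimal sketch: the helpers stack_class_name and stack_is_shared and the build flag HAVE_SPP_PROG_MODEL are hypothetical, and the fallback constants restate the values from Table 6-2 for systems where spp_prog_model.h is unavailable.

```c
#include <assert.h>

#ifdef HAVE_SPP_PROG_MODEL          /* hypothetical build flag, not an HP macro */
#include <spp_prog_model.h>
#else
/* Fallback definitions copied from Table 6-2. */
#define NODE_PRIVATE_MEM 2
#define NEAR_SHARED_MEM  3
#define FAR_SHARED_MEM   4
#endif

/* Map a stack type value to its memory class name, for log messages. */
static const char *stack_class_name(int type)
{
    switch (type) {
    case FAR_SHARED_MEM:   return "far_shared";
    case NEAR_SHARED_MEM:  return "near_shared";
    case NODE_PRIVATE_MEM: return "node_private";
    default:               return "unknown";   /* not listed in Table 6-2 */
    }
}

/* Nonzero if the stack lives in one of the two shared classes.
   Only the three values from Table 6-2 are accepted. */
static int stack_is_shared(int type)
{
    assert(type == FAR_SHARED_MEM || type == NEAR_SHARED_MEM ||
           type == NODE_PRIVATE_MEM);
    return type != NODE_PRIVATE_MEM;
}
```

In a program both helpers would typically be applied to the result of memory_type_of_stack().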

 

© Hewlett-Packard Development Company, L.P.