Fortran 90, Fortran 77, C, aC++: Exemplar Programming Guide > Chapter 6 Advanced shared-memory programming > Parallel information functions
Several intrinsics provide information about the parallelism, or potential parallelism, of your program. All are integer functions, available in both 4-byte and 8-byte lengths, and they can appear in executable statements anywhere an integer expression is legal. The 8-byte versions, which are suffixed with _8, expect 8-byte input arguments and return 8-byte values. They are typically used only in Fortran programs in which the default data lengths have been changed using the -I8 or similar compiler options. When default integer lengths are modified via compiler options in Fortran, the correct intrinsic is chosen automatically regardless of which is specified.
The subsections that follow describe these functions.
These functions return the total number of processors on which the process has initiated threads. These threads are not necessarily active. In Fortran, these functions have the forms:
In C, they have the forms:
num_procs can be used to dimension automatic and adjustable arrays in Fortran, and may be used in Fortran, C, and C++ to dynamically specify array dimensions and allocate storage.
These functions return the total number of threads the process creates at initiation, regardless of how many hypernodes the threads occupy, and regardless of how many are idle or active. They are typically used to manually define thread-parallel loops which may span hypernodes. In Fortran, these functions have the forms:
In C, they have the forms:
The return value differs from num_procs only if threads are oversubscribed.
These functions return the number of hypernodes on which the process is running. They can be used to dimension automatic and adjustable arrays in Fortran and can be used in Fortran, C, and C++ to dynamically specify array dimensions and allocate storage. In Fortran, these functions have the forms:
In C, they have the forms:
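As a rough illustration of sizing storage from the hypernode count, the sketch below allocates one accumulator per hypernode. The helper name alloc_per_node is hypothetical, and the count is passed in as a parameter so the code compiles on any system; on the target system it is assumed to come from num_nodes().

```c
#include <stdlib.h>

/* Hypothetical helper (not part of the guide): allocate one
   zero-initialized accumulator per hypernode.
   n_nodes is assumed to be the value returned by num_nodes()
   on the target system; it is a plain parameter here.
   Returns NULL if n_nodes is not positive or allocation fails.
   The caller releases the storage with free(). */
double *alloc_per_node(int n_nodes)
{
    if (n_nodes <= 0)
        return NULL;
    return calloc((size_t)n_nodes, sizeof(double));
}
```

A caller would typically index the returned array by logical hypernode ID, so that each hypernode updates only its own slot.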
These functions return the number of the calling process's threads running on the hypernode from which the function is called. This number can vary from one hypernode to another depending on system configurations, usage of manual parallelization directives, and the number of processors installed on each hypernode. In Fortran, these functions have the forms:
In C, they have the forms:
When called from parallel code these functions return the spawn thread ID of the calling thread, in the range 0..nst-1, where nst is the number of threads in the current spawn context (the number of threads spawned by the last parallel construct). Use them when you wish to direct specific tasks to specific threads inside parallel constructs. In Fortran, these functions have the forms:
In C, they have the forms:
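One common way to direct specific work to specific threads is to give each spawn thread ID a contiguous block of loop iterations. The sketch below is only an illustration of that idea, not the guide's own method: the struct and the helper chunk_for_thread are hypothetical, and tid and nst are passed as parameters standing in for the spawn thread ID and the number of threads in the current spawn context.

```c
/* Half-open iteration range [begin, end) assigned to one thread. */
struct chunk {
    long begin;
    long end;
};

/* Hypothetical helper (not part of the guide): split n iterations into
   nst nearly equal contiguous chunks and return the chunk for thread tid.
   Assumes nst >= 1 and 0 <= tid < nst, matching the 0..nst-1 ID range.
   The first n % nst threads each receive one extra iteration. */
struct chunk chunk_for_thread(long n, int tid, int nst)
{
    long base = n / nst;    /* iterations every thread receives */
    long extra = n % nst;   /* leftover iterations to distribute */
    struct chunk c;

    c.begin = tid * base + (tid < extra ? tid : extra);
    c.end = c.begin + base + (tid < extra ? 1 : 0);
    return c;
}
```

With tid equal to 0 and nst equal to 1, the helper returns the whole range [0, n), so the same code path also covers a single-threaded run.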
When called from serial code, these functions return 0.
These functions return the logical hypernode ID of the hypernode on which the calling thread is running, in the range 0..num_nodes()-1. Use them when you wish to direct specific tasks to specific hypernodes inside parallel constructs. In Fortran, these functions have the forms:
In C, they have the forms:
Logical hypernode IDs are in the range 0..n-1, where n is the number of available hypernodes in the system. Logical IDs are assigned in the order in which your program occupies the system. The hypernode that your program's thread 0 runs on is considered logical hypernode 0; any hypernodes it expands to later are assigned increasing logical ID numbers. Because the operating system starts a program on the least-loaded hypernode, the mapping of logical hypernode IDs to physical hypernodes can differ between programs due to load balancing; thus two programs running on the same system are unlikely to address identical hypernodes with identical logical IDs. Logical hypernode IDs are mapped to physical hypernode IDs, which are unique for each hypernode at the machine level.
These functions return a value representing the level of parallelism of the calling process. In Fortran, these functions have the forms:
In C and C++, they have the forms:
The return value is one of, or a sum (bit-wise OR) of, the values shown in Table 6-1 “Levels of parallelism”. In C and C++, these values are #defined as symbolic constants in spp_prog_model.h.
Table 6-1 Levels of parallelism
As an example of how these can be summed, assume the return value is 6. This means the process is two-dimensionally parallel: it first went parallel across hypernodes and then, within the current hypernode, went parallel again on the threads of that hypernode. This differs from a return value of 8, which means the process went one-dimensionally thread-parallel and occupies all available threads on all available hypernodes with no nested parallelism. The valid sum values are 3, 5, 6, 7, and 9. A return value of 1, or a sum including 1, means an asymmetric thread is active in the calling program. Asymmetric parallelism is currently supported only by the Compiler Parallel Support Library.
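The checks described above can be sketched in C as follows. This is a minimal illustration that uses only the numeric values stated in the text; the macro and function names are hypothetical, and real code would use the symbolic constants from spp_prog_model.h, which are not reproduced here.

```c
#include <stdbool.h>

/* Bit that the text associates with an active asymmetric thread.
   Hypothetical name; the real symbolic constant is defined in
   spp_prog_model.h. */
#define LEVEL_ASYMMETRIC_BIT 1

/* True if the level value is 1 or a sum that includes 1. */
bool has_asymmetric_thread(int level)
{
    return (level & LEVEL_ASYMMETRIC_BIT) != 0;
}

/* True if level is one of the summed values the text lists as valid:
   3, 5, 6, 7, or 9. Single (unsummed) values such as 8 return false. */
bool is_valid_level_sum(int level)
{
    switch (level) {
    case 3: case 5: case 6: case 7: case 9:
        return true;
    default:
        return false;
    }
}
```

For example, a return value of 9 is a valid sum that includes the asymmetric bit, while 6 is a valid sum that does not.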
Refer to Appendix F.
These functions return a value representing the memory class from which the current thread stack is allocated. The thread stack holds all the procedure-local arrays and variables not manually assigned a class. The thread stack is created in near_shared memory by default. (For nonscalable SMP systems, near_shared memory is automatically mapped to node_private memory.) In Fortran, these functions have the forms:
In C and C++, they have the forms:
These functions return one of the values described in Table 6-2 “Stack type return values”.
Table 6-2 Stack type return values