Exemplar Fortran 77, Exemplar C: Exemplar C and Fortran 77 Programmer's Guide > Chapter 2 Exemplar extensions

Exemplar compiler directives and pragmas

This section presents an alphabetical list of the Fortran directives and C pragmas that make up the Exemplar programming model. The Exemplar compilers accept the directives and pragmas listed below in addition to those supported by the standard HP compilers.

This section is intended to provide a brief overview of the available directives and pragmas. More specific information and examples can be found in the Exemplar Programming Guide. The Fortran directives not supported as C pragmas are expressed in C as either storage class extensions (thread_private, etc.) or as typedefs (gate_t, barrier_t, etc.) in the spp_prog_model.h file and are described in the "Memory classes" and the "Advanced shared-memory programming" chapters of the Exemplar Programming Guide.

The form of an Exemplar Fortran compiler directive is:

C$DIR directive-list

The form of an Exemplar C pragma is:

#pragma _CNX directive-list

where

directive-list

Is a comma-separated list of the directives/pragmas described in this chapter.

For information on how to properly use these directives or pragmas, see the Exemplar Programming Guide.

Directive names are presented here in lowercase; they may be specified in either case in both languages, but #pragma must always appear in lowercase in C.

In the sections that follow, namelist represents a comma-separated list of names. These names can be variables, arrays, or COMMON blocks; a COMMON block name must be enclosed in slashes. A lowercase n or m indicates an integer constant. gate_var stands for a variable that has been, or is being, defined as a gate. Parameters that appear within square brackets ([ ]) are optional.

align_cti(namelist)

This directive or pragma aligns the variables and arrays listed in namelist on CTIcache boundaries. This allows for more efficient data reuse.

A CTIcache is a partition of physical memory that exists on each hypernode and is used to store copies of global data fetched from other hypernodes. (A hypernode is a set of processors and physical memory organized as a symmetric multiprocessor (SMP) running a single image of the operating system microkernel.)

Single-hypernode systems do not have CTIcaches; however, this directive can be useful if you are porting code to a multi-hypernode system. See the Exemplar Programming Guide for more information.

barrier(namelist)

This Fortran directive denotes a list of variables, as given in namelist, that are to be used as the synchronization variables for the barrier routines. This does not imply any synchronization in itself; it is simply defining the barrier variables. In C, barrier is a typedef (barrier_t), rather than a pragma. For more information, refer to the Exemplar Programming Guide.

begin_tasks[(attribute_list)]

This directive or pragma defines the beginning of a section (or sections; see next_task) of code that is to be executed as an independent, parallel task. Each task is executed by a separate thread. begin_tasks must have an accompanying end_tasks in the same program unit.

The optional attribute_list can be any of the following legal combinations (m is an integer constant):

  • threads (default)

  • nodes

  • dist

  • ordered

  • max_threads=m

  • threads, ordered

  • nodes, ordered

  • dist, ordered

  • threads, max_threads=m

  • nodes, max_threads=m

  • dist, max_threads=m

  • ordered, max_threads=m

  • threads, ordered, max_threads=m

  • nodes, ordered, max_threads=m

  • dist, ordered, max_threads=m

Attributes may be listed in any order. The compilers flag any attribute combinations other than those listed above with a warning and ignore the directive.

Refer to the Exemplar Programming Guide for a complete discussion of parallel tasking.
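As a rough illustration of how begin_tasks, next_task, and end_tasks bracket independent code sections, consider the sketch below. The routine init_tables and its arguments are hypothetical names chosen for this example. A compiler that does not recognize the _CNX pragmas ignores them and runs the two sections one after the other.

```c
#include <stddef.h>

/* Hypothetical example: two independent array initializations,
   each marked as a separate parallel task. */
void init_tables(double *squares, double *cubes, size_t n)
{
    size_t i, j;

#pragma _CNX begin_tasks
    /* First task. */
    for (i = 0; i < n; i++)
        squares[i] = (double)i * (double)i;

#pragma _CNX next_task
    /* Second task: touches different data and uses its own index j. */
    for (j = 0; j < n; j++)
        cubes[j] = (double)j * (double)j * (double)j;

#pragma _CNX end_tasks
}
```

The two tasks deliberately share no variables, so no privatization or synchronization is needed in this sketch.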

block_loop[(block_factor=n)]

This directive or pragma indicates a specific loop to block, and optionally, the block factor n (n must be an integer constant greater than or equal to 2) that is to be used in the compiler's internal computation of loop nest based data reuse. If no block_factor is specified, the compiler uses a heuristic to determine the block_factor. Refer to the Exemplar Programming Guide for more information on blocking.
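The following sketch shows one plausible placement of block_loop on a loop nest. The routine matmul, the size N, and the block factor of 4 are assumptions made for illustration only; placing the pragma on the outermost loop is likewise a guess, since the loop worth blocking depends on the access pattern.

```c
#define N 16

/* Hypothetical routine: c = a * b for N-by-N matrices. */
void matmul(double a[N][N], double b[N][N], double c[N][N])
{
    int i, j, k;

    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++)
            c[i][j] = 0.0;

    /* Ask the compiler to block the following loop with a factor of 4.
       The factor is an integer constant >= 2, as the text requires. */
#pragma _CNX block_loop(block_factor=4)
    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++)
            for (k = 0; k < N; k++)
                c[i][j] = c[i][j] + a[i][k] * b[k][j];
}
```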

block_shared(allocatable_array_namelist)

This Fortran directive is used to declare arrays as being of type block_shared. Block-shared arrays are sized to be an integral multiple of the page size. The pages of the array are distributed in same-size blocks across the hypernodes on which the process is executing in the system. If the user-specified size is not an integral multiple of page size * num_nodes(), then the size is automatically rounded up to meet this criterion. Refer to Chapter 5, "Memory classes," in the Exemplar Programming Guide for more information.
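The rounding rule can be made concrete with a little arithmetic. The helper below is purely illustrative and is not part of any Exemplar library; its name and parameters are hypothetical, and the page size used in the usage note is an assumed value.

```c
#include <stddef.h>

/* Hypothetical helper: round a requested size (in bytes) up to the next
   integral multiple of page_size * num_nodes, mirroring the rule above. */
size_t block_shared_size(size_t requested, size_t page_size, size_t num_nodes)
{
    size_t unit = page_size * num_nodes;            /* allocation granule */
    size_t granules = (requested + unit - 1) / unit; /* ceiling division */
    return granules * unit;
}
```

For instance, assuming a 4096-byte page and 4 hypernodes, the granule is 16384 bytes, so a request for 40000 bytes would be rounded up to 3 granules, or 49152 bytes.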

critical_section[(gate_var)]

This directive or pragma defines the beginning of a code block in which only one thread may be executing at a time. The end of the code block must be indicated by an end_critical_section directive or pragma, which must appear in the same flow of control within the same program unit. The optional gate_var can be used to differentiate between parallel tasks. Refer to the Exemplar Programming Guide for more information.
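A minimal sketch of a critical section protecting a shared counter is shown below. The routine count_negative is a hypothetical name. The optional gate_var is omitted because, in C, gates are declared through the gate_t typedef from spp_prog_model.h, which this sketch does not assume is available.

```c
/* Hypothetical example: count the negative entries of x. */
int count_negative(const double *x, int n)
{
    int i;
    int count = 0;

#pragma _CNX loop_parallel(ivar=i)
    for (i = 0; i < n; i++) {
        if (x[i] < 0.0) {
            /* Only one thread at a time may update the shared counter. */
#pragma _CNX critical_section
            count = count + 1;
#pragma _CNX end_critical_section
        }
    }
    return count;
}
```

Both pragmas sit in the same branch of the if statement, which keeps them in the same flow of control as required.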

dynsel[(trip_count=n)]

This directive or pragma enables workload-based dynamic selection for the immediately following loop. trip_count represents either the thread_trip_count or node_trip_count attribute, and n is an integer constant.

When thread_trip_count=n is specified, the serial version of the loop is run if the iteration count is less than n; otherwise, the thread-parallel version is run. When node_trip_count=n is specified, the serial version of the loop is run if the iteration count is less than n; otherwise, the node-parallel version is run, assuming +Onodepar is specified.
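As an illustration, the sketch below requests dynamic selection with a thread trip count of 1000. The routine scale and the threshold value are assumptions for this example; whether further directives or options are needed for the loop to be parallelized at all is not covered here.

```c
/* Hypothetical example: scale every element of x by factor. */
void scale(double *x, int n, double factor)
{
    int i;

    /* With fewer than 1000 iterations the serial version of the loop
       is run; otherwise the thread-parallel version is run. */
#pragma _CNX dynsel(thread_trip_count=1000)
    for (i = 0; i < n; i++)
        x[i] = x[i] * factor;
}
```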

end_critical_section

This directive or pragma defines the end of the critical section that was begun with the critical_section directive or pragma. critical_section and end_critical_section must appear as a pair. Refer to the Exemplar Programming Guide for more information.

end_ordered_section

This directive or pragma defines the end of the ordered section that was begun with the ordered_section directive or pragma. ordered_section and end_ordered_section must appear as a pair. Refer to the Exemplar Programming Guide for more information on ordered sections.

end_parallel

This directive or pragma signifies the end of a parallel region. The parallel directive signifies the beginning of a parallel region. Refer to Chapter 4, "Basic shared-memory programming," in the Exemplar Programming Guide for more information.

end_tasks

This directive or pragma terminates the specification of parallel tasks indicated by begin_tasks and next_task. It must appear at the end of the last section of parallel code defined by these directives or pragmas. All of these must appear in the same program unit. Refer to the Exemplar Programming Guide for more information.

far_shared(namelist)

This Fortran directive causes the compiler to place the data objects in namelist (variables, arrays, or COMMON blocks) into far_shared memory. far_shared memory is the most general form of shared memory: it is distributed on a page basis across the memories of all hypernodes in a system. The far_shared data objects of a process are addressable by all threads of that process. In C, far_shared is a storage class specifier. Refer to the Exemplar Programming Guide for more information on memory classes.

far_shared_pointer(namelist)

This Fortran directive causes the compiler to place the (compiler-generated, hidden) pointers to the allocated objects (specified in namelist) in far_shared memory, regardless of the memory classes to which the respective objects are allocated.

This directive applies only to Fortran 90-style allocatable data objects used in HP Fortran 77 programs. Refer to Chapter 5, "Memory classes," in the Exemplar Programming Guide for more information on memory classes.

gate(namelist)

This Fortran directive defines a gate variable that is to be used subsequently in a critical section, ordered section, or passed as an argument to the synchronization intrinsics. In C, gate is a typedef (gate_t), rather than a pragma. Refer to the Exemplar Programming Guide for more information.

loop_parallel[(attribute_list)]

This directive or pragma is an explicit instruction to the compiler to parallelize the immediately following loop. The loop iterations are run in an indeterminate order unless the optional ordered attribute appears. You are responsible for any required data privatization and loop synchronization, as described in Chapter 4, "Basic shared-memory programming," and Chapter 6, "Advanced shared-memory programming," of the Exemplar Programming Guide. The optional attribute_list can be any of the following combinations (n and m are integer constants):

  • threads (default)

  • nodes

  • dist

  • ordered

  • max_threads=m

  • chunk_size=n

  • threads, ordered

  • nodes, ordered

  • dist, ordered

  • threads, max_threads=m

  • nodes, max_threads=m

  • dist, max_threads=m

  • ordered, max_threads=m

  • threads, chunk_size=n

  • nodes, chunk_size=n

  • dist, chunk_size=n

  • threads, ordered, max_threads=m

  • nodes, ordered, max_threads=m

  • dist, ordered, max_threads=m

  • chunk_size=n, max_threads=m

  • threads, chunk_size=n, max_threads=m

  • nodes, chunk_size=n, max_threads=m

  • dist, chunk_size=n, max_threads=m

  • ivar= indvar

The ivar= indvar attribute is:

  • Required for all loops in C and for DO WHILE and hand-rolled loops in Fortran

  • Optional for Fortran DO loops

  • Compatible with any other attribute

Attributes may be listed in any order. The compilers flag any attribute combinations other than those listed above with a warning and ignore the directive.

Refer to the Exemplar Programming Guide for more information.
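A small sketch of loop_parallel in C follows. The routine vector_add and the max_threads value of 4 are hypothetical; the ivar attribute names the induction variable, which the list above says is required for C loops.

```c
/* Hypothetical example: c[i] = a[i] + b[i] for i in [0, n). */
void vector_add(const double *a, const double *b, double *c, int n)
{
    int i;

    /* "threads, max_threads=m" is one of the listed combinations;
       ivar=i is compatible with any other attribute. */
#pragma _CNX loop_parallel(threads, ivar=i, max_threads=4)
    for (i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}
```

Each iteration writes a distinct element of c and reads nothing written by another iteration, so the loop needs no privatization or synchronization.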

loop_private(namelist)

This directive or pragma declares a list of variables and/or arrays private to the immediately following loop. No values may be carried into the loop by loop_private variables. To be loop private, the variables and/or arrays must be assigned before they are used on each iteration of the immediately following loop. These private data items are distinct from the shared items of the same name that exist outside the loop. Values assigned to loop_private variables on the final iteration (that is, the nth iteration of a loop with n iterations) may be saved into the shared variables of the same name if the save_last directive or pragma also appears on this loop. If save_last is not used, then the value of any shared variable declared to be loop_private is undefined at loop termination. Refer to the Exemplar Programming Guide for more information.
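The interaction between loop_private and save_last can be sketched as follows. The routine halve_and_shift and its variables are hypothetical. Combining the directives in one comma-separated list follows the general pragma form given earlier; whether separate pragma lines behave identically is not assumed here.

```c
/* Hypothetical example: out[i] = 0.5 * in[i] + 1.0, returning the last
   value of the temporary. */
double halve_and_shift(const double *in, double *out, int n)
{
    int i;
    double tmp = 0.0;

#pragma _CNX loop_parallel(ivar=i), loop_private(tmp), save_last(tmp)
    for (i = 0; i < n; i++) {
        tmp = 0.5 * in[i];   /* assigned before it is used, every iteration */
        out[i] = tmp + 1.0;
    }

    /* Because of save_last, tmp holds the value from the final iteration.
       Without save_last its value here would be undefined. */
    return tmp;
}
```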

near_shared(namelist)

When applied to static variables at compile-time, this Fortran directive causes all pages of the data objects in namelist to be mapped to physical pages on logical hypernode 0 (the hypernode where the program starts). If applied to allocatable arrays, then the pages of such arrays will be mapped to physical pages on the hypernode of the allocating thread. near_shared data can be addressed by any thread of a process on any hypernode in the system but it is "closer" (in terms of access latency) to the threads on the hypernode that allocates the data. In C, near_shared is a storage class specifier. Refer to the Exemplar Programming Guide for more information on memory classes.

near_shared_pointer(namelist)

This Fortran directive causes the compiler to place the (compiler-generated, hidden) pointers to the allocated objects (specified in namelist) in near_shared memory, regardless of the memory classes to which the respective objects are allocated.

This directive applies only to Fortran 90-style allocatable data objects used in HP Fortran 77 programs. Refer to Chapter 5, "Memory classes," in the Exemplar Programming Guide for more information on memory classes.

next_task

This directive or pragma starts a block of code following a begin_tasks block that will be executed as a parallel task. The end of the code block is marked by another next_task or by an end_tasks directive or pragma.

This directive must appear within a begin_tasks and end_tasks pair. There is no limit on the number of next_task directives that can appear. Refer to the Exemplar Programming Guide for more information.

no_block_loop

This directive or pragma disables loop blocking on the immediately following loop. Refer to the Exemplar Programming Guide for more information on loop blocking.

no_distribute

This directive or pragma disables loop distribution for the immediately following loop. Refer to the Exemplar Programming Guide for more information on loop distribution.

no_dynsel

This directive or pragma disables workload-based dynamic selection for the immediately following loop. Refer to the Exemplar Programming Guide for more information on dynamic selection.

no_loop_dependence(namelist)

This directive or pragma informs the compiler that the arrays in namelist do not have any dependences for iterations of the immediately following loop. Use no_loop_dependence for arrays only; use loop_private to indicate dependence-free scalar variables.

This directive or pragma causes the compiler to ignore any dependences that it perceives to exist. This can enhance the compiler's ability to optimize the loop, including the possibility of parallelization.

Refer to the Exemplar Programming Guide for more information.
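A typical candidate for no_loop_dependence is an indirectly indexed update that the compiler cannot analyze. In the sketch below, the array table, the routine scatter_add, and the guarantee that idx holds distinct values are all assumptions of the example; if idx contained duplicates, the assertion made by the pragma would be false.

```c
#define TABLE_SIZE 8

double table[TABLE_SIZE];

/* Hypothetical example: add v[i] into table[idx[i]].
   The caller guarantees that all idx[i] are distinct and in range. */
void scatter_add(const int *idx, const double *v, int n)
{
    int i;

    /* Tell the compiler that no iteration touches an element of table
       that another iteration reads or writes. */
#pragma _CNX no_loop_dependence(table)
    for (i = 0; i < n; i++)
        table[idx[i]] = table[idx[i]] + v[i];
}
```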

no_loop_transform

This directive or pragma prevents the compiler from performing reordering transformations on the following loop. The compiler does not distribute, fuse, block, interchange, unroll, unroll and jam, or parallelize a loop on which this directive or pragma appears. Refer to the Exemplar Programming Guide for more information.

no_parallel

This directive or pragma prevents the compiler from generating parallel code for the immediately following loop. Refer to the Exemplar Programming Guide for more information.

no_side_effects(funclist)

This directive or pragma informs the compiler that the functions appearing in funclist have no side effects wherever they appear lexically following the directive. Side effects include modifying a function argument, modifying a Fortran COMMON variable, performing I/O, or calling another routine that does any of the above. The compiler can sometimes eliminate calls to procedures that have no side effects; also, the compiler may be able to parallelize loops with calls when informed that the called routines do not have side effects.
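As a sketch, the function square below qualifies as side-effect free under the definition above: it modifies no argument or global data and performs no I/O. Both function names are hypothetical.

```c
/* Hypothetical pure function: no argument or global is modified, no I/O. */
static double square(double x)
{
    return x * x;
}

/* Applies to uses of square that lexically follow this line. */
#pragma _CNX no_side_effects(square)

void square_all(const double *in, double *out, int n)
{
    int i;

    /* The call no longer has to be treated as a possible
       side effect when the compiler analyzes this loop. */
    for (i = 0; i < n; i++)
        out[i] = square(in[i]);
}
```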

no_unroll_and_jam

This directive or pragma disables loop unroll and jam for the immediately following loop. Refer to the Exemplar Programming Guide for more information.

node_private(namelist)

This Fortran directive causes the variables and arrays specified in namelist to be replicated in the physical memory of each hypernode on which the process is executing. Thus, while each data object has a single image in virtual memory, it maps to a different physical location on each hypernode. The threads of a process within a hypernode all share access to the copy on their hypernode and cannot access the copies on other hypernodes. In C, node_private is a storage class specifier. Refer to Chapter 5, "Memory classes," in the Exemplar Programming Guide for more information.

node_private_pointer(namelist)

This Fortran directive causes the compiler to place the (compiler-generated, hidden) pointers to the allocated objects (specified in namelist) in node_private memory, regardless of the memory classes to which the respective objects are allocated.

This directive applies only to Fortran 90-style allocatable data objects used in HP Fortran 77 programs. Refer to Chapter 5, "Memory classes," in the Exemplar Programming Guide for more information.

ordered_section(gate_var)

This directive or pragma defines the beginning of an ordered section. An ordered section is the same as a critical section (a code block in which only one thread may be executing at a time) with the additional restriction that the threads must pass through the ordered section in iteration order. The end of the code block must be indicated by an end_ordered_section directive or pragma. Ordered sections must appear within the control flow of a loop_parallel(ordered) directive. Refer to the Exemplar Programming Guide for more information.

parallel[(attribute_list)]

This directive or pragma signifies the beginning of a parallel region of code. All code up to the following end_parallel directive or pragma will be run on all available threads. No loop transformations, data privatization, or parallelization analysis will be performed by the compiler on the code in the region.

The optional attribute_list can be any of the following legal combinations (m is an integer constant):

  • threads (default)

  • nodes

  • max_threads=m

  • threads,max_threads=m

  • nodes,max_threads=m

Attributes may be listed in any order. The compilers flag any attribute combinations other than those listed above with a warning and ignore the directive.

Refer to Chapter 4, "Basic shared-memory programming," in the Exemplar Programming Guide for more information.

parallel_private(namelist)

This directive or pragma declares a list of variables or arrays private to the immediately following parallel region. It serves the same purpose for parallel regions that task_private serves for tasks. The privatized variables and arrays will not carry their values beyond the end_parallel directive or pragma. Refer to Chapter 4, "Basic shared-memory programming," in the Exemplar Programming Guide for more information.

prefer_parallel[(attribute_list)]

This directive or pragma instructs the compiler to parallelize the following loop, but only if it is safe to do so. A loop is safe to parallelize if it has an iteration count that can be determined at runtime before loop invocation and contains no loop-carried dependences, procedure calls, or I/O operations. (A loop-carried dependence exists when one iteration of a loop assigns a value to an address that is referenced or assigned on another iteration.) Refer to the Exemplar Programming Guide for more information.

The optional attribute_list can be any of the following combinations (n and m are integer constants):

  • threads (default)

  • nodes

  • dist

  • max_threads=m

  • chunk_size=n

  • threads, max_threads=m

  • nodes, max_threads=m

  • dist, max_threads=m

  • threads, chunk_size=n

  • nodes, chunk_size=n

  • dist, chunk_size=n

  • chunk_size=n, max_threads=m

  • threads, chunk_size=n, max_threads=m

  • nodes, chunk_size=n, max_threads=m

  • dist, chunk_size=n, max_threads=m

Attributes may be listed in any order. The compilers flag any attribute combinations other than those listed above with a warning and ignore the directive.
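A loop that meets the safety conditions described above might look like the sketch below. The routine axpy and the chunk size of 64 are hypothetical choices. Unlike loop_parallel, this directive leaves the final decision to the compiler.

```c
/* Hypothetical example: y[i] = alpha * x[i] + y[i]. */
void axpy(double alpha, const double *x, double *y, int n)
{
    int i;

    /* The iteration count is known before the loop starts, and the body
       has no loop-carried dependences, procedure calls, or I/O. */
#pragma _CNX prefer_parallel(threads, chunk_size=64)
    for (i = 0; i < n; i++)
        y[i] = alpha * x[i] + y[i];
}
```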

reduction(namelist)

This pragma (available in CV2.0) is to be used only with loop_parallel. It specifies that the scalar variables in the comma-separated namelist are involved in reductions. Once informed of the reductions, the compiler generates code to perform them while parallelizing the loop, assuming no other parallelization inhibitors occur in the loop. Refer to the Exemplar Programming Guide for more information.
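A sum reduction is the usual case. In the sketch below, the routine dot is a hypothetical name, and writing the two pragmas on separate lines ahead of the loop is an assumption about placement rather than documented syntax.

```c
/* Hypothetical example: dot product of x and y. */
double dot(const double *x, const double *y, int n)
{
    int i;
    double sum = 0.0;

#pragma _CNX loop_parallel(ivar=i)
#pragma _CNX reduction(sum)
    for (i = 0; i < n; i++)
        sum = sum + x[i] * y[i];   /* sum is the reduction variable */

    return sum;
}
```

Without the reduction pragma, the update of sum would be a loop-carried dependence that you would have to synchronize yourself under loop_parallel.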

save_last[(list)]

This directive or pragma specifies that the variables in the comma-separated list that are also named in an associated loop_private(namelist) directive or pragma must have their last values saved into the "shared" variable of the same name at loop termination. (A variable's last value in a loop of n iterations is the value it is assigned in the nth iteration.)

If the optional list is not used, save_last specifies that all variables named in an associated loop_private(namelist) directive or pragma must have their last values saved into the "shared" variable of the same name at loop termination.

If save_last is not specified, the values in any privatized variables or arrays are indeterminate at loop termination. Refer to the Exemplar Programming Guide for more information.

scalar

This directive or pragma prevents the compiler from performing reordering transformations on the following loop. The compiler does not distribute, fuse, block, interchange, unroll, unroll and jam, or parallelize a loop on which this directive or pragma appears.

The no_loop_transform directive or pragma provides the same functionality as the scalar directive or pragma and is recommended in place of the scalar directive or pragma.

sync_routine(routinelist)

This directive or pragma indicates to the compiler that the routines listed in routinelist are user-defined synchronization routines, so that the compiler does not attempt to move code across these routine calls. Use sync_routine anytime you hide a call to a compiler synchronization function inside another routine call, or anytime you use CPSlib functions for synchronization. (CPSlib is a library of low-level parallelization and synchronization routines. See the Exemplar Programming Guide for more information.)

sync_routine is effective only for the listed routines in the file in which it appears.

task_private(namelist)

This directive or pragma privatizes the variables and arrays specified in namelist for each task specified in the immediately following begin_tasks/end_tasks block. If a task_private data object is referenced within a task, it must have been assigned a value previously in that task. The privatized variables and arrays do not carry their values beyond the end_tasks directive or pragma. Refer to the Exemplar Programming Guide for more information.
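The sketch below shows task_private placed immediately before a begin_tasks/end_tasks block. The routine min_max and its variables are hypothetical, and the caller is assumed to pass n >= 1. Each task assigns i and best before reading them, as the description above requires.

```c
/* Hypothetical example: find the minimum and maximum of x[0..n-1], n >= 1. */
void min_max(const double *x, int n, double *lo, double *hi)
{
    int i;
    double best;

    /* Each task gets its own copy of i and best. */
#pragma _CNX task_private(i, best)
#pragma _CNX begin_tasks
    best = x[0];
    for (i = 1; i < n; i++)
        if (x[i] < best)
            best = x[i];
    *lo = best;

#pragma _CNX next_task
    best = x[0];
    for (i = 1; i < n; i++)
        if (x[i] > best)
            best = x[i];
    *hi = best;

#pragma _CNX end_tasks
    /* i and best do not carry defined values past this point;
       the results are returned through lo and hi instead. */
}
```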

thread_private(namelist)

This Fortran directive causes the variables and arrays specified in namelist to be treated as being thread_private. thread_private data objects map to unique node_private addresses for each thread of a process. In C, thread_private is a storage class specifier. Refer to the Exemplar Programming Guide for more information.

thread_private_pointer(namelist)

This Fortran directive causes the compiler to place the (compiler-generated, hidden) pointers to the allocated objects (specified in namelist) in thread_private memory, regardless of the memory classes to which the respective objects are allocated.

This directive applies only to Fortran 90-style allocatable data objects used in HP Fortran 77 programs. Refer to Chapter 5, "Memory classes," in the Exemplar Programming Guide for more information.

unroll_and_jam[(unroll_factor=n)]

This directive or pragma causes one or more noninnermost loops in the immediately following nest to be partially unrolled (to a depth of n if unroll_factor is specified) and the resulting loops to be fused back together. It must be placed on a loop that ends up being noninnermost after any compiler-initiated interchanges. Refer to the Exemplar Programming Guide for more information.
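To make the transformation concrete, the sketch below pairs a loop nest carrying the pragma with a hand-written version of what unroll and jam with a factor of 2 conceptually produces. Both routines and the sizes ROWS and COLS are hypothetical; the code the compiler actually generates may differ.

```c
#define ROWS 4
#define COLS 3

/* Hypothetical example: y[i] += sum over j of a[i][j] * x[j]. */
void matvec(double a[ROWS][COLS], const double *x, double *y)
{
    int i, j;

    /* The pragma is placed on the outer (noninnermost) loop. */
#pragma _CNX unroll_and_jam(unroll_factor=2)
    for (i = 0; i < ROWS; i++)
        for (j = 0; j < COLS; j++)
            y[i] = y[i] + a[i][j] * x[j];
}

/* Hand-written equivalent: the outer loop is unrolled by 2 and the two
   copies of the inner loop are fused ("jammed") into one. ROWS is even,
   so no cleanup iteration is needed in this sketch. */
void matvec_unrolled(double a[ROWS][COLS], const double *x, double *y)
{
    int i, j;

    for (i = 0; i < ROWS; i += 2)
        for (j = 0; j < COLS; j++) {
            y[i]     = y[i]     + a[i][j]     * x[j];
            y[i + 1] = y[i + 1] + a[i + 1][j] * x[j];
        }
}
```

In the jammed form each x[j] is loaded once and used for two rows, which is the data reuse the transformation aims for.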

© Hewlett-Packard Development Company, L.P.