Among the most basic optimizations
performed by an Exemplar compiler is code motion, which is described
in Chapter 3, “Compiler optimizations.” This optimization can move
some code across routine calls. If the routine call is to a synchronization
or parallelization function and the code moved must execute on a
certain side of it, this movement can cause wrong answers. Whenever
you use CPSlib functions in Fortran or C, rather than the directives
or functions described in Chapter 4, “Basic shared-memory
programming,” and Chapter 6, “Advanced shared-memory programming,” to synchronize
or parallelize code, you must identify those functions with a sync_routine
directive or pragma. Use sync_routine
to identify all CPSlib functions, as well as any
user-written routines that perform synchronization or parallelization
or that hide calls to synchronization or parallelization routines.
In Fortran, sync_routine has the following form:
C$DIR SYNC_ROUTINE (routinelist)
In C, it has the following form:
#pragma _CNX sync_routine (routinelist)
where
- routinelist is a comma-separated list of synchronization procedures.
sync_routine is only effective
for the listed routines that lexically follow it in the file in
which it appears.
Consider the following Fortran example:
      SUBROUTINE WORK(ARG1, ARG2, MUTX)
      INTEGER ARG1, ARG2, MUTX, CPS_MUTEX_LOCK, CPS_MUTEX_UNLOCK
C$DIR SYNC_ROUTINE(CPS_MUTEX_LOCK, CPS_MUTEX_UNLOCK)
      . . .
      DO I = 1, N
        . . .
        LCK = CPS_MUTEX_LOCK(MUTX)
        . . .
        LCK = CPS_MUTEX_UNLOCK(MUTX)
      ENDDO
      . . .
      END
Here, the subroutine WORK
is called in parallel and contains a loop with a critical
section protected by calls to CPSlib functions. Listing these CPSlib
functions in a SYNC_ROUTINE directive
at the beginning of the subroutine prevents the compiler from moving
code out of the critical section.
An analogous C example follows:
#include <spp_prog_model.h>
work(int arg1, int arg2, int mutx)
{
    int i, lck;
#pragma _CNX sync_routine(cps_mutex_lock, cps_mutex_unlock)
    . . .
#pragma _CNX loop_parallel(ivar=i)
    for(i=0; i<n; i++) {
        . . .
        lck = cps_mutex_lock(&mutx);
        . . .
        lck = cps_mutex_unlock(&mutx);
    }
}
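The requirement that user-written routines hiding synchronization calls must also be listed can be sketched as follows. This is an illustrative sketch only, not taken from the CPSlib documentation: the wrapper names enter_critical and leave_critical, the shared counter, and the helper read_total are hypothetical, and a POSIX mutex stands in for the CPSlib mutex so that the example compiles outside an Exemplar system. Compilers that do not recognize the _CNX pragma ignore it.

```c
#include <pthread.h>

/* Stand-in lock (assumption): a POSIX mutex replaces the CPSlib mutex
   used in the examples above, purely so this sketch builds anywhere. */
static pthread_mutex_t total_lock = PTHREAD_MUTEX_INITIALIZER;
static long total = 0;

/* Hypothetical user-written wrappers that hide the synchronization calls. */
int enter_critical(void) { return pthread_mutex_lock(&total_lock); }
int leave_critical(void) { return pthread_mutex_unlock(&total_lock); }

/* Adds n values to the shared total, one protected update per element. */
void add_to_total(const int *values, int n)
{
    int i;
/* Name the wrappers themselves, since the caller never sees the
   underlying lock calls. Placement mirrors the examples above:
   after the declarations and before the calls it governs. */
#pragma _CNX sync_routine(enter_critical, leave_critical)
    for (i = 0; i < n; i++) {
        enter_critical();
        total += values[i];   /* must stay between the two calls */
        leave_critical();
    }
}

/* Reads the shared total under the same lock. */
long read_total(void)
{
    long snapshot;
#pragma _CNX sync_routine(enter_critical, leave_critical)
    enter_critical();
    snapshot = total;
    leave_critical();
    return snapshot;
}
```

Because the pragma is effective only for the file in which it appears, each source file that calls the wrappers would presumably need its own sync_routine listing.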