Parallel Programming Guide for HP-UX Systems: K-Class and V-Class Servers > Chapter 7 Controlling optimization

Optimization directives and pragmas

This section lists the directives and pragmas available for use in optimization. Table 7-5 “Directive-based optimization options” describes the options and the optimization levels at which they are valid. The pragmas are not supported by the aC++ compiler.

The loop_parallel, parallel, prefer_parallel, and end_parallel options are described in Chapter 9 “Parallel programming techniques”.

Table 7-5 Directive-based optimization options

Directives and Pragmas               Valid Optimization Levels
block_loop[(block_factor=n)]         +O3, +O4
dynsel[(trip_count=n)]               +O3, +O4
no_block_loop                        +O3, +O4
no_distribute                        +O3, +O4
no_dynsel                            +O3, +O4
no_loop_dependence(namelist)         +O3, +O4
no_loop_transform                    +O3, +O4
no_parallel                          +O3, +O4
no_side_effects(funclist)            +O3, +O4
no_unroll_and_jam                    +O3, +O4
reduction(namelist)                  +O3, +O4
scalar                               +O3, +O4
sync_routine(routinelist)            +O3, +O4
unroll_and_jam[(unroll_factor=n)]    +O3, +O4

 

Rules for usage

The form of the optimization directives and pragmas is shown in Table 7-6 “Form of optimization directives and pragmas”.

NOTE: The HP aC++ compiler does not support the optimization pragmas described in this section.

Table 7-6 Form of optimization directives and pragmas

Language    Form
Fortran     C$DIR directive-list
C           #pragma _CNX directive-list

 

where directive-list is a comma-separated list of one or more of the directives or pragmas described in this chapter.

  • Directive names are shown here in lowercase, but they may be written in either case in both languages. In C, however, the #pragma keyword itself must always appear in lowercase.

  • In the sections that follow, namelist represents a comma-separated list of names. These names can be variables, arrays, or COMMON blocks. The name of a COMMON block must be enclosed within slashes. A lowercase n or m denotes an integer constant.

  • Occurrences of gate_var are for variables that have been or are being defined as gates. Any parameters that appear within square brackets ([ ]) are optional.
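As an illustration of the C form in Table 7-6, the sketch below places a pragma with a two-item directive-list ahead of one loop and an uppercase directive name ahead of another. The function names scale and sum are hypothetical, and pairing no_block_loop with no_distribute is only an example of list syntax, not a recommendation. Compilers that do not recognize _CNX pragmas are expected to ignore them, possibly with a warning.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: multiply each element of x by factor.
   Returns the number of elements processed. */
size_t scale(double *x, size_t n, double factor)
{
    size_t i;

    assert(x != NULL || n == 0);

    /* Two directives in one comma-separated directive-list. */
#pragma _CNX no_block_loop, no_distribute
    for (i = 0; i < n; i++)
        x[i] *= factor;

    return n;
}

/* Hypothetical helper: sum the elements of x. */
double sum(const double *x, size_t n)
{
    double s = 0.0;
    size_t i;

    /* The directive name may be uppercase; "#pragma" itself may not. */
#pragma _CNX NO_PARALLEL
    for (i = 0; i < n; i++)
        s += x[i];

    return s;
}
```

Each pragma applies to the loop that immediately follows it, so the two loops above are controlled independently.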

block_loop[(block_factor=n)]

block_loop[(block_factor=n)] indicates a specific loop to block and, optionally, the block factor n. The compiler uses this block factor in its internal computation of loop nest-based data reuse; it represents the number of times data is reused as a result of loop nesting. The block factor must be an integer constant greater than or equal to 2. If no block_factor is specified, the compiler uses a heuristic to determine one. For more information on loop blocking, refer to Chapter 3 “Optimization levels”.
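A minimal sketch of how block_loop might be applied in C follows. The routine add_transpose, the array size N, the choice to annotate the outer loop, and the block factor of 4 are all assumptions made for illustration. The transposed access to b is the kind of pattern where blocking can improve data reuse.

```c
#include <assert.h>

#define N 8   /* illustrative size, not taken from the guide */

/* Hypothetical routine: a = transpose(b) + c.
   b is walked column-wise while a and c are walked row-wise. */
void add_transpose(double a[N][N], double b[N][N], double c[N][N])
{
    int i, j;

    assert(a != b && a != c);   /* the output must not alias the inputs */

    /* Request blocking of the following loop with a block factor of 4
       (the factor must be an integer constant >= 2). */
#pragma _CNX block_loop(block_factor=4)
    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++)
            a[i][j] = b[j][i] + c[i][j];
}
```

Omitting (block_factor=4) would leave the choice of factor to the compiler's heuristic.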

dynsel[(trip_count=n)]

dynsel[(trip_count=n)] enables workload-based dynamic selection for the immediately following loop. trip_count represents the thread_trip_count attribute, and n is an integer constant.

  • When thread_trip_count = n is specified, the serial version of the loop is run if the iteration count is less than n. Otherwise, the thread-parallel version is run.

  • For more information on dynamic selection, refer to the description of the optimization option +O[no]dynsel.
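The selection rule above can be sketched in C as follows. The function runs_parallel is a hypothetical model of the rule only and is not part of any HP interface; daxpy and the threshold of 1000 are likewise assumptions for illustration. Whether a parallel version is actually generated also depends on the compile options in effect.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the rule: returns 1 if the thread-parallel
   version would be chosen, 0 if the serial version would be chosen. */
int runs_parallel(size_t iterations, size_t thread_trip_count)
{
    return iterations >= thread_trip_count;
}

/* Hypothetical routine: y = y + a * x. */
void daxpy(double *y, const double *x, double a, size_t n)
{
    size_t i;

    assert(n == 0 || (x != NULL && y != NULL));

    /* Enable dynamic selection for the next loop: run it serially
       when it has fewer than 1000 iterations. */
#pragma _CNX dynsel(trip_count=1000)
    for (i = 0; i < n; i++)
        y[i] += a * x[i];
}
```

With this threshold, a call with n = 999 would take the serial path and a call with n = 1000 would take the thread-parallel path.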

no_block_loop

no_block_loop disables loop blocking on the immediately following loop. For more information on loop blocking, see the description of block_loop[(block_factor=n)] in this section, or refer to the description of the optimization option +O[no]loop_block.

no_distribute

no_distribute disables loop distribution for the immediately following loop. For more information on loop distribution, refer to the description of the optimization option +O[no]loop_transform.

no_dynsel

no_dynsel disables workload-based dynamic selection for the immediately following loop. For more information on dynamic selection, refer to the description of the optimization option +O[no]dynsel.

no_loop_dependence(namelist)

no_loop_dependence(namelist) informs the compiler that the arrays in namelist do not have any dependences for iterations of the immediately following loop. Use no_loop_dependence for arrays only. Use loop_private to indicate dependence-free scalar variables.

This directive or pragma causes the compiler to ignore any dependences that it perceives to exist. This can enhance the compiler's ability to optimize the loop, including parallelization.

For more information on loop dependence, refer to “Loop-carried dependences”.
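A typical case is indirect indexing, where the compiler cannot prove that two iterations never touch the same element. The sketch below assumes a hypothetical routine scatter_add whose caller guarantees that idx contains no repeated values; the pragma is only correct under that guarantee, because the compiler no longer checks it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical routine: a[idx[i]] += v[i] for each i.
   The caller must guarantee that idx holds no duplicate values. */
void scatter_add(double *a, const int *idx, const double *v, size_t n)
{
    size_t i;

    assert(n == 0 || (a != NULL && idx != NULL && v != NULL));

    /* Assert to the compiler that array a carries no dependences
       across iterations of the following loop. */
#pragma _CNX no_loop_dependence(a)
    for (i = 0; i < n; i++)
        a[idx[i]] += v[i];
}
```

If idx did contain duplicates, the assertion would be false and an optimized or parallelized loop could produce wrong results.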

no_loop_transform

no_loop_transform prevents the compiler from performing reordering transformations on the following loop. The compiler does not distribute, fuse, block, interchange, unroll, unroll and jam, or parallelize a loop on which this directive appears. For more information on no_loop_transform, refer to the optimization option +O[no]loop_transform.

no_parallel

no_parallel prevents the compiler from generating parallel code for the immediately following loop. For more information on no_parallel, refer to the optimization option +O[no]parallel.

no_side_effects(funclist)

no_side_effects(funclist) informs the compiler that the functions appearing in funclist have no side effects wherever they appear lexically following the directive. Side effects include modifying a function argument, modifying a Fortran COMMON variable, performing I/O, or calling another routine that does any of the above. The compiler can sometimes eliminate calls to procedures that have no side effects. The compiler may also be able to parallelize loops with calls when informed that the called routines do not have side effects.
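The following sketch shows one way this might look in C. The functions square and square_all are hypothetical, and placing the pragma at file scope ahead of the calling routine is an assumption based on the "lexically following" rule above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function with no side effects: it changes no
   arguments or global data and performs no I/O. */
double square(double x)
{
    return x * x;
}

/* Declare square free of side effects for all later uses in this file. */
#pragma _CNX no_side_effects(square)

/* Hypothetical routine: out[i] = in[i] squared. The call inside the
   loop no longer has to be treated as a barrier to optimization. */
void square_all(double *out, const double *in, size_t n)
{
    size_t i;

    assert(n == 0 || (out != NULL && in != NULL));

    for (i = 0; i < n; i++)
        out[i] = square(in[i]);
}
```

A function that wrote to a global variable or printed a message would not qualify and should not be listed.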

unroll_and_jam[(unroll_factor=n)]

unroll_and_jam[(unroll_factor=n)] causes one or more noninnermost loops in the immediately following nest to be partially unrolled (to a depth of n if unroll_factor is specified) and the resulting loops to be fused back together. The directive must be placed on a loop that is still noninnermost after any compiler-initiated interchanges. For more information on unroll and jam, refer to the description of +O[no]loop_unroll_jam.
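To make the transformation concrete, the sketch below pairs a loop nest carrying the pragma with a hand-written version of roughly what unroll and jam with a factor of 2 produces. Both routines, the size N, and the factor are assumptions for illustration; the code a compiler actually generates may differ.

```c
#include <assert.h>

#define N 4   /* illustrative size; must be even for the manual version */

/* Hypothetical routine: y += a * x, with the directive on the outer loop. */
void matvec_pragma(double y[N], double a[N][N], double x[N])
{
    int i, j;

#pragma _CNX unroll_and_jam(unroll_factor=2)
    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++)
            y[i] += a[i][j] * x[j];
}

/* Hand-written equivalent: the outer loop is unrolled by 2 and the two
   copies of the inner loop are fused, so each x[j] is loaded once and
   used for two rows. */
void matvec_manual(double y[N], double a[N][N], double x[N])
{
    int i, j;

    assert(N % 2 == 0);

    for (i = 0; i < N; i += 2)
        for (j = 0; j < N; j++) {
            y[i]     += a[i][j]     * x[j];
            y[i + 1] += a[i + 1][j] * x[j];
        }
}
```

Both routines accumulate into each y[i] in the same order, so they produce identical results when y starts from the same values.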

© Hewlett-Packard Development Company, L.P.