Exemplar Fortran 77, Exemplar C: Exemplar C and Fortran 77 Programmer's Guide > Chapter 2 Exemplar extensions

Exemplar compiler options


The options below are either recognized in addition to those supported by the standard HP compilers, or are available in the standard HP compilers but behave differently in the Exemplar compilers.

-g

This option requests that the compiler generate debugging information in the executable file that can be used by the CXdb debugger (an optional product). See Chapter 5 “Debugging and profiling” for more information on CXdb.

NOTE: Debugging with the dde and xdb debuggers is not supported with code compiled using the Exemplar compilers.

If -g is specified when compiling, the Exemplar C and Fortran 77 compilers restrict optimizations to the +O0 level.

-I8

This option specifies that INTEGER and LOGICAL variable declarations with unspecified lengths are to occupy 8 bytes of storage.

Also, this option transforms intrinsic function references that return default integer or logical values to return 8-byte values of the specified type.

+O[no]autopar

When used with the +Oparallel option, +Oautopar (the default) causes the compiler to automatically parallelize loops that are safe to parallelize.

A loop is safe to parallelize if it has an iteration count that can be determined at runtime before loop invocation, and contains no loop-carried dependences, procedure calls, or I/O operations. A loop-carried dependence exists when one iteration of a loop assigns a value to an address that is referenced or assigned on another iteration.
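The distinction can be illustrated with two small loops. This is a minimal sketch in portable C, not output from or input specific to the Exemplar compilers; the function names vector_add and prefix_sum are hypothetical. The first loop has no loop-carried dependence, the second has one.

```c
#include <assert.h>
#include <stddef.h>

/* No loop-carried dependence: iteration i reads b[i] and c[i] and writes
   only a[i], so the iterations can run in any order. This assumes that
   a does not overlap b or c. */
void vector_add(double *a, const double *b, const double *c, size_t n)
{
    assert(a != b && a != c);  /* cheap check of the no-overlap assumption */
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}

/* Loop-carried dependence: iteration i reads a[i - 1], which iteration
   i - 1 assigned, so the iterations must run in order. */
void prefix_sum(double *a, size_t n)
{
    for (size_t i = 1; i < n; i++)
        a[i] = a[i] + a[i - 1];
}
```

In both loops the iteration count n is known before the loop starts, and neither contains procedure calls or I/O, so only the dependence on a[i - 1] separates the second loop from the first.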

You can use Fortran directives and C pragmas to improve on the automatic optimizations and to assist the compiler in locating additional opportunities for parallelization.

When used with +Oparallel, the +Onoautopar option causes the compiler to parallelize only those loops marked by the loop_parallel or prefer_parallel directives or pragmas. Because the compiler does not automatically find parallel tasks or regions, user-specified task and region parallelization is not affected by this option.

Because parallelization takes place only at +O3 and above, +O[no]autopar is useful only at +O3 and above.

+O[no]dataprefetch

The +O[no]dataprefetch option enables [disables] optimizations to generate data prefetch instructions for data referenced within innermost loops. The effect is that the memory system will retrieve the data for future iterations while the processor is executing current iterations. For cache lines containing data that will be written, +Odataprefetch prefetches the cache lines so that they are valid for both read and write access.

This option provides no benefit to loops whose data fits in the cache; in fact, it can slow them down because of the prefetch instructions. For loops whose data does not fit in the cache, the speedup can be substantial.
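To show what a prefetch for future iterations looks like, the following sketch writes one by hand. It uses the __builtin_prefetch builtin of GCC and Clang as a stand-in, not anything generated by the Exemplar compilers; the function name sum_with_prefetch and the distance PREFETCH_AHEAD are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical prefetch distance in elements. A compiler would choose
   this from the memory latency and cache line size of the target. */
#define PREFETCH_AHEAD 16

double sum_with_prefetch(const double *x, size_t n)
{
    double sum = 0.0;
    assert(x != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
#if defined(__GNUC__)
        /* Request the element needed PREFETCH_AHEAD iterations from now
           (0 = read access, 3 = keep in all cache levels). */
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&x[i + PREFETCH_AHEAD], 0, 3);
#endif
        sum += x[i];
    }
    return sum;
}
```

The extra instruction in every iteration is the cost mentioned above: when x already fits in the cache, the prefetch does no useful work.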

The +O[no]dataprefetch option is valid at +O2 and above. The default is +Onodataprefetch.

+O[no]dynsel

When specified with +Oparallel, +Odynsel (the default) enables workload-based dynamic selection. For parallelizable loops whose iteration counts are known at compile time, +Odynsel causes the compiler to generate either a parallel or a serial version of the loop—depending on which is more profitable.

This optimization also causes the compiler to generate both parallel and serial versions of parallelizable loops whose iteration counts are unknown at compile time. At runtime, the loop's workload is compared to parallelization overhead, and the parallel version is run only if it is profitable to do so.

The +Onodynsel option disables dynamic selection and tells the compiler that it is profitable to parallelize all parallelizable loops. The dynsel directive and pragma can be used to enable dynamic selection for specific loops when +Onodynsel is in effect.
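The runtime comparison of workload against parallelization overhead can be sketched as follows. This is an illustration only: the cost model, the constants, and the names parallel_is_profitable, scale_range, and scale are hypothetical, and the "parallel" branch simply walks the per-thread chunks one after another.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cost model: all values share one arbitrary time unit and
   are assumptions, not figures used by the compiler. */
int parallel_is_profitable(size_t iterations, double cost_per_iteration,
                           double parallel_overhead, unsigned threads)
{
    assert(threads > 0);
    double serial_time = (double)iterations * cost_per_iteration;
    double parallel_time = serial_time / threads + parallel_overhead;
    return parallel_time < serial_time;
}

static void scale_range(double *a, size_t lo, size_t hi, double k)
{
    for (size_t i = lo; i < hi; i++)
        a[i] *= k;
}

/* Returns 1 if the "parallel" version was selected, 0 for the serial one. */
int scale(double *a, size_t n, double k, unsigned threads)
{
    /* Assumed costs: 1 unit per iteration, 1000 units to start threads. */
    if (!parallel_is_profitable(n, 1.0, 1000.0, threads)) {
        scale_range(a, 0, n, k);
        return 0;
    }
    /* Stand-in for the parallel version: each chunk would go to a thread. */
    for (unsigned t = 0; t < threads; t++)
        scale_range(a, n * t / threads, n * (t + 1) / threads, k);
    return 1;
}
```

Under these assumed costs, a 4-element array runs serially while a 4000-element array on 4 threads takes the parallel path.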

+O[no]exemplar_model

+Oexemplar_model (the default) causes the compiler to recognize the Exemplar programming model. This option allows you to use the directives, pragmas, and associated command-line options that make up the programming model. At lower optimization levels (+O0, +O1, +O2), this option enables only the following components of the programming model:

  • Synchronization directives (Fortran)

  • Synchronization pragmas and synchronization typedefs (C)

  • Memory class directives (Fortran)

  • Memory storage class specifiers (C)

At +O3 and +O4, using +Oexemplar_model enables all directives, pragmas, storage class specifiers, and typedefs. See the section “Exemplar compiler directives and pragmas” for additional information.

The +Onoexemplar_model option turns off support for the Exemplar programming model. If you use this option, directives and pragmas from the Exemplar programming model are ignored.

This option is available only in C V1.2.3 and Fortran 77 V1.2.3. In C V2.0, +Oexemplar_model is on by default, and +Onoexemplar_model is not available.

+O[no]loop_block

Optimization level(s): +O3, +O4

Default: +Onoloop_block

The +O[no]loop_block option (available only in C V2.0) enables [disables] blocking of eligible loops for improved cache performance. The +Onoloop_block option disables both automatic and directive-specified loop blocking. For more information on loop blocking, see the Exemplar Programming Guide.
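As a rough picture of what blocking does, the sketch below blocks a matrix transpose by hand. It is not the transformation the compiler emits; the function name transpose_blocked and the tile size BLOCK are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tile edge. A good value depends on the cache size. */
#define BLOCK 32

/* dst = transpose of src; both are n x n, row-major, non-overlapping. */
void transpose_blocked(double *dst, const double *src, size_t n)
{
    assert(dst != src);
    for (size_t ii = 0; ii < n; ii += BLOCK) {
        for (size_t jj = 0; jj < n; jj += BLOCK) {
            /* Clamp the tile at the matrix edge. */
            size_t i_end = ii + BLOCK < n ? ii + BLOCK : n;
            size_t j_end = jj + BLOCK < n ? jj + BLOCK : n;
            /* Work on one BLOCK x BLOCK tile so that the lines touched
               in src and dst stay in the cache while they are reused. */
            for (size_t i = ii; i < i_end; i++)
                for (size_t j = jj; j < j_end; j++)
                    dst[j * n + i] = src[i * n + j];
        }
    }
}
```

The result is identical to the unblocked double loop; only the order in which elements are visited changes.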

+O[no]loop_unroll_jam

Optimization level(s): +O3, +O4

Default: +Oloop_unroll_jam

The +O[no]loop_unroll_jam option (available only in C V2.0) enables [disables] loop unrolling and jamming. The +Onoloop_unroll_jam option disables both automatic and directive-specified unroll and jam. Loop unrolling and jamming increases register exploitation. For more information on the unroll and jam optimization, see the Exemplar Programming Guide.
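The effect on register use can be seen in a hand-written version of the transformation. The sketch below unrolls the outer loop of a matrix-vector product by two and jams the two inner loops into one; it is an illustration, not compiler output, and the function name matvec_unroll_jam is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* y = A * x, where A is n x m in row-major order. */
void matvec_unroll_jam(double *y, const double *a, const double *x,
                       size_t n, size_t m)
{
    size_t i = 0;
    assert(y != x);
    /* Outer loop unrolled by 2; the two inner loops are jammed into one. */
    for (; i + 1 < n; i += 2) {
        double y0 = 0.0, y1 = 0.0;
        for (size_t j = 0; j < m; j++) {
            double xj = x[j];            /* loaded once, used for two rows */
            y0 += a[i * m + j] * xj;
            y1 += a[(i + 1) * m + j] * xj;
        }
        y[i] = y0;
        y[i + 1] = y1;
    }
    /* Remainder row when n is odd. */
    if (i < n) {
        double y0 = 0.0;
        for (size_t j = 0; j < m; j++)
            y0 += a[i * m + j] * x[j];
        y[i] = y0;
    }
}
```

Each value of x[j] is held in a register and reused for two rows, which halves the loads of x compared with the original loop nest.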

+O[no]parallel

The +Oparallel option causes the compiler to:

  • Honor the directives and pragmas of the Exemplar programming model that involve parallelism, such as begin_tasks, loop_parallel, prefer_parallel, and parallel. These directives and pragmas are not recognized if +Onoexemplar_model is specified.

  • Look for opportunities for parallel execution in loops.

The following methods can be used to specify the number of processors used in your parallel program:

  • loop_parallel(max_threads=m) directive and pragma

  • prefer_parallel(max_threads=m) directive and pragma

    For more information on these directives and pragmas see the section “Exemplar compiler directives and pragmas”.

  • The environment variable MP_NUMBER_OF_THREADS, which is read at runtime by your program. If this variable is set to some positive integer n, your program executes on n processors; n must be less than or equal to the number of processors in the system where the program is executing. If MP_NUMBER_OF_THREADS is not set, your program runs on the number of processors in the system where it is executing.
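The rule for MP_NUMBER_OF_THREADS can be sketched as a small function. This is not the runtime library's code; the name resolve_thread_count is hypothetical, and the fallback for values that are not a positive integer within the processor count is an assumption, since the text above does not define that case.

```c
#include <assert.h>
#include <stdlib.h>

/* env_value is the text of MP_NUMBER_OF_THREADS, or NULL if it is unset.
   system_processors is the number of processors in the system. */
long resolve_thread_count(const char *env_value, long system_processors)
{
    assert(system_processors > 0);
    if (env_value == NULL)
        return system_processors;   /* unset: use every processor */

    char *end;
    long n = strtol(env_value, &end, 10);
    /* ASSUMPTION: anything other than a positive integer no larger than
       the processor count falls back to the processor count. */
    if (end == env_value || *end != '\0' || n <= 0 || n > system_processors)
        return system_processors;
    return n;
}
```

A caller would pass getenv("MP_NUMBER_OF_THREADS") as the first argument and obtain the processor count from the platform.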

The +Oparallel option is valid only at optimization level +O3 and above. Using the +Oparallel option disables +Ofail_safe, which is on by default. See the section +O[no]fail_safe for more information.

The +Onoparallel option is the default for all optimization levels. This option disables automatic and directive-specified parallelization.

NOTE: If you compile one file in an application using +Oparallel, then you must link the application (using the compiler driver) with the +Oparallel option to link in the proper start-up files and runtime support.

+O[no]report[=report_type]

This option causes the compiler to display various optimization reports. +Onoreport is the default. The value of report_type determines which report is displayed, as described below.

+Oreport=loop produces the Loop Report. This report gives information on optimizations performed on loops and calls. Using +Oreport (without =report_type) also produces the Loop Report.

+Oreport=private produces the Loop Report and the Privatization Table, which provides information on loop variables that are privatized by the compiler.

+Oreport=all produces all reports.

The +Oreport[=report_type] option is active only at +O3 and above. The +Onoreport option does not accept any of the report_type values. See the Exemplar Programming Guide for more information on the optimization reports.

The option +Oinfo also displays information on the various optimizations being performed by the compilers. +Oinfo can be used at any optimization level but is most useful at +O3 and above. The default, at all optimization levels, is +Onoinfo.

+O[no]sharedgra

The +Onosharedgra option disables global register allocation for shared-memory variables that are visible to multiple threads. This option can help if a variable shared among parallel threads is causing wrong answers. See the Exemplar Programming Guide for more information.

Global register allocation (+Osharedgra) is enabled by default at optimization level +O2 and higher.

+pa

The +pa option requests that the application be compiled for routine-level profiling with CXperf. The +pa option is not valid with the +O4 or +Oall optimization levels. Also, +pa is not compatible with the -p or -G options. See Chapter 5 “Debugging and profiling” for more information on CXperf.

+pal

At +O2 and +O3, the +pal option requests that the application be compiled for routine-level and loop-level profiling with CXperf. The +pal option is not valid with the +O4 or +Oall optimization levels. Also, +pal is not compatible with the -p or -G options. See Chapter 5 “Debugging and profiling” for more information on CXperf.

+tm target

This option specifies the target machine architecture for which compilation is to be performed. Using this option causes the compiler to perform architecture-specific optimizations. target takes one of the following values:

  • spp1200 to specify SPP1200 Series machines

  • spp1600 to specify SPP1600 Series machines

  • S2000 to specify S2000 servers

  • X2000 to specify X2000 servers

In addition to the values above, the Exemplar C V2.0 compiler accepts the following values for target:

  • K7200 to specify K-Class servers using PA-7200 processors

  • K8000 to specify K-Class servers using PA-8000 processors

  • V2200 to specify V2200 servers

Although Fortran 77 is available on the K-Class and V-Class servers, the Fortran 77 compiler does not accept these target values.

This option is valid at all optimization levels. The default target value corresponds to the machine on which you invoke the compiler. The +tm target option is automatically specified when you use one of the Exemplar compiler drivers.

Using the +tm target option implies +DA and +DS settings as described in Table 2-1 “+tm target and +DA/+DS”. +DAarchitecture causes the compiler to generate code for the architecture specified by architecture. +DSmodel causes the compiler to use the instruction scheduler tuned to model. See the cc(1) man page or the f77(1) man page for more information on the +DA and +DS options.

Table 2-1 +tm target and +DA/+DS

target value specified    +DAarchitecture implied    +DSmodel implied
spp1200                   1.1                        1.1
spp1600                   1.1                        1.1
S2000                     2.0                        2.0
X2000                     2.0                        2.0
K7200                     1.1                        1.1
K8000                     2.0                        2.0
V2200                     2.0                        2.0

 

If you specify +DA or +DS on the compiler command line, your setting takes precedence over the setting implied by +tm target.
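The implied settings and the precedence rule can be written as a lookup. This is a sketch for illustration, not the compiler driver's logic; the values come from Table 2-1, and the names target_defaults, lookup_target, and effective_setting are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct target_defaults {
    const char *target;  /* value given to +tm */
    const char *da;      /* implied +DA architecture */
    const char *ds;      /* implied +DS model */
};

/* Values copied from Table 2-1. */
static const struct target_defaults TABLE_2_1[] = {
    {"spp1200", "1.1", "1.1"}, {"spp1600", "1.1", "1.1"},
    {"S2000",   "2.0", "2.0"}, {"X2000",   "2.0", "2.0"},
    {"K7200",   "1.1", "1.1"}, {"K8000",   "2.0", "2.0"},
    {"V2200",   "2.0", "2.0"},
};

/* Returns the table row for target, or NULL if the value is not listed. */
const struct target_defaults *lookup_target(const char *target)
{
    assert(target != NULL);
    for (size_t i = 0; i < sizeof TABLE_2_1 / sizeof TABLE_2_1[0]; i++)
        if (strcmp(TABLE_2_1[i].target, target) == 0)
            return &TABLE_2_1[i];
    return NULL;
}

/* An explicit +DA or +DS value (non-NULL) overrides the implied one. */
const char *effective_setting(const char *explicit_value, const char *implied)
{
    return explicit_value != NULL ? explicit_value : implied;
}
```

For example, with +tm spp1600 and an explicit +DA2.0, effective_setting yields 2.0 for +DA, while +DS keeps the implied 1.1.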

© Hewlett-Packard Development Company, L.P.