HP Fortran 90 Programmer's Reference: HP Series 700/800 Computers > Chapter 13 Compile-line options

Optimization options

The options described in this section allow you to control the different optimizations that the compiler can apply to your program. These options fall into two categories:

  • Options that control classes of optimization (for example, optimizations that affect code size)

  • Options that control specific optimizations (for example, inlining)

The following subsections describe the options in both categories. For information about the options that control levels of optimization, see the description of the +On option in the “List of compile-line options”. The +O[no]info option, which provides compile-time information about the optimization process, is described in the same section.

NOTE: You can insert (or remove) underscore characters in the names of any of the optimization options to improve their readability. The compiler will recognize the option name with or without underscores.
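The underscore rule in the note above can be modelled as a comparison that ignores underscores in the option name. The following is only an illustrative sketch in Python, not the compiler's actual parsing logic; the helper name canonical_option is hypothetical, and leaving any value after "=" untouched is an assumption.

```python
def canonical_option(option: str) -> str:
    """Return the option with underscores removed from its name part.

    Hypothetical helper: models the rule that an option name is recognized
    with or without underscores. Any "=value" suffix is left unchanged.
    """
    name, separator, value = option.partition("=")
    return name.replace("_", "") + separator + value


# Both spellings reduce to the same canonical form.
same = canonical_option("+Ocache_pad_common") == canonical_option("+Ocachepadcommon")
```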

General optimization options

The following options allow you to control how optimization affects code size, compilation time, runtime performance, and other user-visible effects. The syntax for using these options is:

+O[no]optimization

where optimization is a parameter that specifies the class of optimization to apply to your program. The different parameters are described below. The prefix no negates the effect of optimization.

Except for +Oall, the options do not override a specified level of optimization, nor do they imply a particular level. (The +Oall option automatically invokes the highest level of optimization.) To use any of these options you must also include the +On option on the same command line, where n specifies the level at which the type of optimization is effective. Thus, if you wish to apply all optimizations available at level 3 except those that might significantly increase code size, you would use the command line:

f90 +O3 +Osize my_prog.f90

If an option is mistakenly used at a level at which the corresponding optimization is not performed, the compiler will issue a warning message.

The defaults specified in the following descriptions are in effect only at the specified optimization levels, unless stated otherwise.

+O[no]aggressive

The +Oaggressive option enables optimizations that can result in significant performance improvement but can also change a program's behavior. This option is only effective at optimization level 2 or higher.

The +Oaggressive option performs optimizations invoked by the following options:

  • +Oentrysched

  • +Olibcalls

  • +Onofltacc

  • +Onoinitcheck

  • +Oregionsched

  • +Ovectorize

The +Oaggressive option is incompatible with +Oconservative.

The default is +Onoaggressive.

+O[no]all

The +Oall option performs maximum optimization, including aggressive optimizations and optimizations that can significantly increase compile time and memory usage. The +Oall option automatically invokes the highest level of optimization.

The default is +Onoall.

+O[no]conservative

The +Oconservative option causes the optimizer to make conservative assumptions about the code when optimizing it. This option is only effective at optimization level 2 or higher.

The +Oconservative option sets the following options:

  • +Ofltacc

  • +Onomoveflops

  • +Oparmsoverlap

Use +Oconservative when conservative assumptions are necessary due to the coding style, as with nonstandard-conforming programs. Note that it is incompatible with +Oaggressive.

The +Onoconservative option relaxes the optimizer's assumptions about the target program.

The default is +Onoconservative.

+O[no]limit

The +Olimit option suppresses optimizations that significantly increase compilation time or that can consume large amounts of memory at compile time. This option is only effective at optimization level 2 or higher.

The +Onolimit option allows optimizations to be performed regardless of their effect on compilation time or memory usage.

The default is +Olimit.

+O[no]size

The +Osize option suppresses optimizations that significantly increase code size. This option is only effective at optimization level 2 or higher.

The +Onosize option permits optimizations that can increase code size.

The default is +Onosize.

Fine-tuning optimization options

The following options allow you to fine-tune the optimization process by providing control over the specific techniques that the optimizer applies to your program. The syntax for using these options is:

+O[no]optimization

where optimization is a parameter that specifies an optimization technique to apply to your program. The different parameters are described below. The prefix no negates the effect of optimization.

The options do not override a specified level of optimization, nor do they imply a particular level. To use any of these options you must also include the +On option on the same command line, where n specifies the level at which the type of optimization can be performed.

For example, if you find that the optimizer is causing your program to produce different floating-point results from those produced by the unoptimized program, you could use the following command line to suppress optimizations that affect floating-point calculations:

f90 +O3 +Onomoveflops +Ofltacc my_prog.f90

If an option is mistakenly used at a level for which the corresponding optimization is not performed, the compiler will issue a warning message.

The defaults given in the following descriptions are in effect only at the specified optimization levels, unless stated otherwise.

+O[no]cache_pad_common

The +Ocache_pad_common option can improve program performance by padding common blocks to avoid cache collisions. Cache collisions occur when the addresses of two data items differ by a multiple of the cache size. By inserting empty space between large variables (for example, arrays), the optimizer ensures that they do not start at addresses where the possibility of a cache collision is greater. This option is only effective at optimization level 3 or higher.

Note the following precautions when using this option:

  • All program modules that reference the common block must be compiled with the +Ocache_pad_common option.

  • Each common block in the program should have the same layout in all program units within which it is declared. If the layouts are different, they must be fully independent—that is, they must not pass values between them.

The default, +Onocache_pad_common, disables padding.
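The collision condition described for +Ocache_pad_common can be illustrated with a small model. This is a sketch under stated assumptions: the cache is taken to be direct-mapped, and the cache size, line size, and base addresses below are hypothetical values chosen for illustration, not figures from any particular PA-RISC system.

```python
CACHE_SIZE = 256 * 1024  # bytes; hypothetical direct-mapped data cache
LINE_SIZE = 32           # bytes per cache line; hypothetical


def cache_line_index(address: int) -> int:
    """Cache line that an address maps to in the assumed direct-mapped cache."""
    return (address % CACHE_SIZE) // LINE_SIZE


def collides(address_a: int, address_b: int) -> bool:
    """True if both addresses compete for the same cache line."""
    return cache_line_index(address_a) == cache_line_index(address_b)


# Two arrays laid out back to back in a common block. The first array is
# exactly CACHE_SIZE bytes long, so corresponding elements of the two arrays
# are a multiple of the cache size apart.
base_a = 0x10000                             # hypothetical start address
base_b_unpadded = base_a + CACHE_SIZE        # no padding between the arrays
base_b_padded = base_b_unpadded + LINE_SIZE  # one cache line of padding

unpadded_collides = collides(base_a, base_b_unpadded)
padded_collides = collides(base_a, base_b_padded)
```

In this model, a loop that reads element i of both arrays evicts one array's line each time it touches the other; shifting the second array by a single line removes the conflict.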

+O[no]dataprefetch

The +Odataprefetch option causes the optimizer to insert instructions within innermost loops to explicitly prefetch data from memory into the data cache. Data prefetch instructions are inserted only for data structures that are referenced within innermost loops using simple loop-varying addresses (that is, addresses in a simple arithmetic progression). This option is only effective at optimization level 2 or higher. It is only available for PA-RISC 2.0 targets.

Use this option for applications that have high data cache miss overhead.

The default is +Onodataprefetch.

+O[no]entrysched

The +Oentrysched option allows the optimizer to perform instruction scheduling on a subprogram's entry and exit code sequences. This option is only effective at optimization level 1 or higher.

The option can change the behavior of programs that perform exception-handling or that handle asynchronous interrupts.

The default is +Onoentrysched.

+O[no]fastaccess

The +Ofastaccess option improves execution time by speeding up access to global data items. You can use this option at any level of optimization.

Note that the +Ofastaccess option may increase link time.

The default is +Onofastaccess at optimization levels 1, 2, and 3; and +Ofastaccess at optimization level 4.

+O[no]fltacc

The +Onofltacc option allows optimizations that follow the rules of algebra but change the order of expression evaluation. For example, if a, b, and c are floating-point variables, the expressions (a + b) + c and a + (b + c) may give slightly different results due to roundoff. This option is only effective at optimization level 2 or higher.

The +Ofltacc option disallows any optimization that changes the order of expression evaluation and could therefore affect the accuracy of the result.
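The roundoff effect of reordering can be reproduced in any IEEE double-precision arithmetic. The short Python sketch below is independent of the HP compiler; it only shows that the two groupings mentioned above need not agree.

```python
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # evaluation order as written
right = a + (b + c)  # algebraically equivalent reordering

# The two results differ in the last bit because each addition rounds.
reordering_changes_result = left != right
difference = abs(left - right)
```

Here left is 0.6000000000000001 and right is 0.6. The difference is tiny, but a program that compares results bit for bit against an unoptimized run will notice it.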

The +Onofltacc option also allows the optimizer to fuse together adjacent multiply and add operations, known as Fused Multiply-Add (FMA). FMA is implemented by the FMPYFADD and FMPYNFADD instructions and is only available on PA-RISC 2.0 systems. FMA improves performance but occasionally produces results that may differ in accuracy from results produced by code where fusing has not occurred. In general, the difference is slight.

The +Ofltacc option disables fusing. At optimization level 2 or higher, FMA code generation occurs by default.
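Why fusing can change a result: a separate multiply and add round twice, whereas a fused operation rounds once. The sketch below models this in Python by computing the fused result exactly with rational arithmetic and rounding a single time. It does not use the FMPYFADD instruction itself, the function names are hypothetical, and it handles only finite inputs.

```python
from fractions import Fraction


def unfused_multiply_add(a: float, b: float, c: float) -> float:
    """a*b + c with two roundings: one after the multiply, one after the add."""
    return a * b + c


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Model of a fused multiply-add: exact a*b + c, rounded once at the end."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# Inputs chosen so that the product 1 - 2**-54 is not representable and
# rounds to 1.0 when the multiply is performed on its own.
a = 1 + 2**-27
b = 1 - 2**-27
c = -1.0

unfused = unfused_multiply_add(a, b, c)  # the small term is lost to rounding
fused = fused_multiply_add(a, b, c)      # the small term survives
```

With these inputs the unfused result is exactly zero while the fused result is -2**-54, which shows the kind of slight difference described above.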

Table 13-9 “Optimizations performed by +O[no]fltacc” identifies the different actions taken by the optimizer, according to whether you specify +Ofltacc, +Onofltacc, or neither option.

Table 13-9  Optimizations performed by +O[no]fltacc

Optimization options    Expression reordering?    FMA?
+O2                     No                        Yes
+O2 +Ofltacc            No                        No
+O2 +Onofltacc          Yes                       Yes

+O[no]initcheck

The initialization checking feature of the optimizer has three possible states: on, off, or unspecified. When this option is specified in the on state (+Oinitcheck), the optimizer initializes to zero any local, scalar, nonstatic variables that are uninitialized with respect to at least one path leading to a use of the variable.

When +Onoinitcheck is specified, the optimizer issues warning messages when it discovers definitely uninitialized variables, but does not initialize them.

When this option is unspecified, the optimizer initializes to zero any local, scalar, nonstatic variables that are definitely uninitialized with respect to all paths leading to a use of the variable.

This option is only effective at optimization level 2 or higher.

+O[no]inline

The +Oinline option makes all subprograms eligible for inlining. This option is only effective at optimization level 3 or higher.

The +Onoinline option disables inlining for all subprograms in your program.

The default is +Oinline at optimization level 3 and +Onoinline at the lower levels.

+Oinline_budget=n

The +Oinline_budget option enables the optimizer to perform more aggressive inlining.

This option has the following syntax:

+Oinline_budget=n

where n is an integer in the range 1 - 1000000 that specifies the level of aggressiveness, as listed in Table 13-10 “Values for the +Oinline_budget option”.

The +Onolimit and +Osize options also affect inlining. Specifying the +Onolimit option has the same effect as specifying +Oinline_budget=200. The +Osize option has the same effect as +Oinline_budget=1.

Note, however, that the +Oinline_budget option takes precedence over both of these options. This means that you can override the effect of the +Onolimit or +Osize option on inlining by specifying the +Oinline_budget option on the same compile line.

This option is only effective at optimization level 3 or higher.
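The precedence rules above can be summarized as a small decision function. This is a sketch of the documented rules only, not the compiler's implementation; the function name and parameters are hypothetical. The text does not say what happens when +Onolimit and +Osize are both given without +Oinline_budget, so the sketch declines to answer in that case.

```python
from typing import Optional

DEFAULT_BUDGET = 100   # default level of inlining
NOLIMIT_BUDGET = 200   # documented equivalent of +Onolimit
SIZE_BUDGET = 1        # documented equivalent of +Osize


def effective_inline_budget(budget: Optional[int] = None,
                            nolimit: bool = False,
                            size: bool = False) -> int:
    """Return the inlining budget implied by the given options.

    budget  -- value of +Oinline_budget=n, or None if not specified
    nolimit -- True if +Onolimit was specified
    size    -- True if +Osize was specified
    """
    if budget is not None:
        # An explicit +Oinline_budget takes precedence over the other two.
        if not 1 <= budget <= 1_000_000:
            raise ValueError("n must be in the range 1 - 1000000")
        return budget
    if nolimit and size:
        # Not covered by the documentation; this sketch makes no guess.
        raise ValueError("combination not described in the source text")
    if nolimit:
        return NOLIMIT_BUDGET
    if size:
        return SIZE_BUDGET
    return DEFAULT_BUDGET
```

For example, effective_inline_budget(budget=50, size=True) yields 50, because the explicit budget overrides the effect of +Osize on inlining.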

Table 13-10  Values for the +Oinline_budget option

Values for n    Meaning
= 100           Default level of inlining.
> 100           More aggressive inlining. The optimizer is less restricted by compilation time and code size when searching for eligible routines to inline.
2 - 99          Less aggressive inlining. The optimizer gives more weight to compilation time and code size when determining whether to inline.
= 1             Inline only if doing so reduces code size.

+O[no]libcalls

The +Olibcalls option invokes millicode versions of a number of frequently called intrinsic functions. Millicode routines have very low call overhead and provide no error-handling. Use this option to improve the performance of selected library routines only when your program does not depend upon exception-handling.

The default is +Onolibcalls.

+O[no]loop_unroll[=factor]

The +Oloop_unroll option turns on loop unrolling. factor is the unroll factor that controls the code expansion. The default unroll factor is 4; that is, four copies of the loop body. By experimenting with different factors, you may improve the performance of your program. This option is only effective at optimization level 2 or higher.

The default is +Oloop_unroll.
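To make the unroll factor concrete, the sketch below writes out by hand what unrolling by the default factor of 4 looks like: four copies of the loop body per iteration, followed by a cleanup loop for the leftover elements. It is a source-level illustration in Python with hypothetical function names; the optimizer performs the transformation on generated code, and the cleanup strategy shown here is an assumption.

```python
def sum_rolled(values):
    """Original loop: one copy of the body per iteration."""
    total = 0.0
    for value in values:
        total += value
    return total


def sum_unrolled_by_4(values):
    """The same loop unrolled by a factor of 4."""
    total = 0.0
    count = len(values)
    main_end = count - count % 4  # last index handled by the unrolled loop

    # Main loop: four copies of the body, one loop test per four elements.
    for i in range(0, main_end, 4):
        total += values[i]
        total += values[i + 1]
        total += values[i + 2]
        total += values[i + 3]

    # Cleanup loop: the remaining 0 to 3 elements.
    for i in range(main_end, count):
        total += values[i]
    return total
```

The additions happen in the same order in both versions, so the results are identical; the unrolled form trades larger code for fewer loop-control operations.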

+O[no]moveflops

The +Omoveflops option allows the optimizer to move conditional floating-point instructions, enabling other optimizations to occur. This option is only effective at optimization level 2 or higher.

Note that the behavior of floating-point exception handling may be altered by this option.

Use +Onomoveflops if floating-point traps are enabled and you do not want the behavior of floating-point exceptions to be altered by the relocation of floating-point instructions, as when your program uses the ON statement (see Appendix D).

The default is +Omoveflops.
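The following Python sketch illustrates why moving a conditional floating-point instruction can alter exception behavior. Python's ZeroDivisionError stands in for an enabled floating-point trap, and the function names are hypothetical. Whether the optimizer would move this exact pattern is an assumption; the point is only what happens once a guarded operation runs unconditionally.

```python
def reciprocal_guarded(x: float) -> float:
    """Original code: the division runs only when the guard is true."""
    if x != 0.0:
        return 1.0 / x
    return 0.0


def reciprocal_hoisted(x: float) -> float:
    """The same computation after the division is moved above the test."""
    speculative = 1.0 / x  # now executes on every call, including x == 0.0
    if x != 0.0:
        return speculative
    return 0.0
```

For nonzero x the two functions agree. For x equal to zero, the guarded version returns normally, whereas the hoisted version raises before the test is reached, which corresponds to a trap that the original program would never have taken.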

+O[no]parmsoverlap

The +Oparmsoverlap option causes the optimizer to assume that the actual arguments of function calls overlap in memory, thus preventing any optimizations that violate this assumption. This option is only effective at optimization level 2 or higher.

Use the +Onoparmsoverlap option with programs that conform to the standard requirement that parameters must not overlap.

The default is +Onoparmsoverlap.
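The sketch below shows, in Python, how an optimization that is valid for non-overlapping arguments can change results when the arguments do overlap. The two functions are hypothetical stand-ins: the first performs each load and store in source order, and the second performs all loads before any store, as a reordering optimizer might if it assumed no overlap.

```python
def shift_in_order(dst, src, n):
    """dst[i + 1] = src[i], one element at a time, in source order."""
    for i in range(n):
        dst[i + 1] = src[i]


def shift_loads_first(dst, src, n):
    """Same assignment, but every load is performed before any store.

    Equivalent to shift_in_order only when dst and src do not overlap.
    """
    loaded = [src[i] for i in range(n)]
    for i in range(n):
        dst[i + 1] = loaded[i]


# Non-overlapping arguments: both versions agree.
separate_a = [0, 0, 0, 0]
separate_b = [0, 0, 0, 0]
shift_in_order(separate_a, [1, 2, 3], 3)
shift_loads_first(separate_b, [1, 2, 3], 3)

# Overlapping arguments (the same list passed twice): the results differ.
overlap_a = [1, 2, 3, 4]
overlap_b = [1, 2, 3, 4]
shift_in_order(overlap_a, overlap_a, 3)
shift_loads_first(overlap_b, overlap_b, 3)
```

With separate arguments both calls produce [0, 1, 2, 3]. With the same list passed twice, the in-order version produces [1, 1, 1, 1] and the loads-first version produces [1, 1, 2, 3]. A program that passes overlapping arguments therefore needs +Oparmsoverlap to keep its unoptimized behavior.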

+O[no]pipeline

The +Opipeline option enables software pipelining. This option is only effective at optimization level 2 or higher.

Use +Onopipeline (disable software pipelining) to conserve code space.

The default is +Opipeline.

+O[no]procelim

When +Oprocelim is specified, procedures that are not referenced by the application are eliminated from the output executable file. When +Onoprocelim is specified, such procedures are retained. You can use this option at any level of optimization.

Use +Oprocelim to reduce the size of the executable file, especially when optimizing at levels 3 and 4, when inlining can remove all calls to some routines.

The default is +Onoprocelim at levels 0-3, and +Oprocelim at level 4.

+O[no]regionsched

The +Oregionsched option improves run-time performance by applying aggressive scheduling techniques to move instructions across branches. This option is only effective at optimization level 2 or higher.

The +Oregionsched option is incompatible with the +check option.

The default is +Onoregionsched.

+O[no]regreassoc

The +Onoregreassoc option disables register reassociation. This option is only effective at optimization level 2 or higher.

Use +Onoregreassoc to disable register reassociation in the rare case that this optimization degrades performance.

The default is +Oregreassoc.

+O[no]vectorize

The +Ovectorize option causes the compiler to replace certain loops with calls to the math library. This option is only effective at optimization level 3 or higher.

If you link separately from the compile line and you compiled with the +Ovectorize option, you must ensure that the link line causes the math library to be searched.

The +Onovectorize option is the default.

© 1996 Hewlett-Packard Development Company, L.P.