The options described in this
section allow you to control the different optimizations that the
compiler can apply to your program. These options fall into two
categories:

- Options that control classes of optimization (for example,
  optimizations that affect code size)
- Options that control specific optimizations (for example, inlining)

The following subsections describe the options in both categories.
For information about the options that control levels of optimization,
see the description of the +On
option in the “List of compile-line
options”.
The +O[no]info
option, which provides compile-time information about the optimization
process, is described in the same section.

NOTE: You can insert (or remove) underscore
characters in the names of any of the optimization options to improve
their readability. The compiler recognizes the option name with
or without underscores.

General optimization options
The following options allow you
to control how optimization affects code size, compilation time,
runtime performance, and other user-visible effects. The syntax
for using these options is:

    +O[no]optimization

where optimization is
a parameter that specifies the class of optimization to apply to
your program. The different parameters are described below. The
prefix no negates
the effect of optimization.
Except for +Oall,
the options do not override a specified level of optimization, nor
do they imply a particular level. (The +Oall
option automatically invokes the highest level of optimization.)
To use any of these options you must also include the +On
option on the same command line, where n
specifies the level at which the type of optimization is effective.
Thus, if you wish to apply all optimizations available at level
3 except those that might significantly increase code size, you
would use the command line:

    f90 +O3 +Osize my_prog.f90

If an option is mistakenly used at a level at which the corresponding
optimization is not performed, the compiler issues a warning
message.

The defaults specified in the following descriptions are in
effect only at the specified optimization levels, unless stated
otherwise.

- +O[no]aggressive
The +Oaggressive option enables optimizations that can result in
significant performance improvement but can also change a program's
behavior. This option is only effective at optimization level 2 or
higher.

The +Oaggressive option performs optimizations invoked by the
following options:

The +Oaggressive option is incompatible with +Oconservative.

The default is +Onoaggressive.

- +O[no]all
The +Oall option performs maximum optimization, including aggressive
optimizations and optimizations that can significantly increase
compile time and memory usage. The +Oall option automatically invokes
the highest level of optimization.

The default is +Onoall.

- +O[no]conservative
The +Oconservative option causes the optimizer to make conservative
assumptions about the code when optimizing it. This option is only
effective at optimization level 2 or higher.

The +Oconservative option sets the following options:

Use +Oconservative when conservative assumptions are necessary due to
the coding style, as with nonstandard-conforming programs. Note that
it is incompatible with +Oaggressive.

The +Onoconservative option relaxes the optimizer's assumptions about
the target program.

The default is +Onoconservative.

- +O[no]limit
The +Olimit option suppresses optimizations that significantly
increase compilation time or that can consume large amounts of memory
at compile time. This option is only effective at optimization level
2 or higher.

The +Onolimit option allows optimizations to be performed regardless
of their effect on compilation time or memory usage.

The default is +Olimit.

- +O[no]size
The +Osize option suppresses optimizations that significantly increase
code size. This option is only effective at optimization level 2 or
higher.

The +Onosize option permits optimizations that can increase code size.

The default is +Onosize.

Fine-tuning optimization options
The following options allow you to fine-tune the optimization process
by providing control over the specific techniques that the optimizer
applies to your program. The syntax for using these options is:

    +O[no]optimization

where optimization is
a parameter that specifies an optimization technique to apply to
your program. The different parameters are described below. The
prefix no negates
the effect of optimization.
The options do not override a specified level of optimization,
nor do they imply a particular level. To use any of these options
you must also include the +On
option on the same command line, where n
specifies the level at which the type of optimization can be performed.

For example, if you find that the optimizer is causing your
program to produce different floating-point results from those produced
by the unoptimized program, you could use the following command
line to suppress optimizations that affect floating-point calculations:

    f90 +O3 +Onomoveflops +Ofltacc my_prog.f90

If an option is mistakenly used at a level for which the corresponding
optimization is not performed, the compiler issues a warning
message.

The defaults given in the following descriptions are in effect
only at the specified optimization levels, unless stated otherwise.

- +O[no]cache_pad_common
The +Ocache_pad_common option can improve program performance by
padding common blocks to avoid cache collisions. Cache-line collisions
occur when the difference between the addresses of two data items is a
multiple of the cache size. By inserting empty space between large
variables (for example, arrays), the optimizer ensures that they do
not start at addresses where the possibility of a cache collision is
greater. This option is only effective at optimization level 3 or
higher.

Note the following precautions when using this option:

- All program modules that reference the common block must be compiled
  with the +Ocache_pad_common option.
- Each common block in the program should have the same layout in all
  program units within which it is declared. If the layouts are
  different, they must be fully independent; that is, they must not
  pass values between them.

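The collision condition described above can be illustrated with a small model. This is a sketch in Python, not output from the compiler: it assumes a direct-mapped cache, and the cache size, line size, array sizes, and the one-line pad are hypothetical values chosen for illustration. The amount of padding the optimizer actually inserts is not stated in this section.

```python
CACHE_SIZE = 32 * 1024   # bytes; hypothetical direct-mapped data cache
LINE_SIZE = 32           # bytes per cache line; hypothetical


def cache_line(address):
    """Index of the cache line an address maps to in a direct-mapped cache."""
    return (address % CACHE_SIZE) // LINE_SIZE


def colliding_elements(offset_a, offset_b, count, elem_size=4):
    """Count indices i for which a(i) and b(i) map to the same cache line."""
    return sum(
        cache_line(offset_a + i * elem_size) == cache_line(offset_b + i * elem_size)
        for i in range(count)
    )


N = 8192  # two 4-byte-element arrays of 8192 elements, 32 KiB each

# Arrays laid out back to back: their start addresses differ by CACHE_SIZE.
unpadded = colliding_elements(0, N * 4, N)

# The same arrays with one cache line of empty space between them.
padded = colliding_elements(0, N * 4 + LINE_SIZE, N)

print(unpadded, padded)
```

In this model every pair a(i), b(i) competes for the same cache line when the arrays are adjacent, and no pair does once the second array is shifted by one line.
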
The default, +Onocache_pad_common, disables padding.

- +O[no]dataprefetch
The +Odataprefetch option causes the optimizer to insert instructions
within innermost loops to explicitly prefetch data from memory into
the data cache. Data prefetch instructions are inserted only for data
structures referenced within innermost loops using simple loop-varying
addresses (that is, in a simple arithmetic progression). This option
is only effective at optimization level 2 or higher. It is only
available for PA-RISC 2.0 targets.

Use this option for applications that have high data cache miss
overhead.

The default is +Onodataprefetch.

- +O[no]entrysched
The +Oentrysched option allows the optimizer to perform instruction
scheduling on a subprogram's entry and exit code sequences. This
option is only effective at optimization level 1 or higher.

The option can change the behavior of programs that perform exception
handling or that handle asynchronous interrupts.

The default is +Onoentrysched.

- +O[no]fastaccess
The +Ofastaccess option improves execution time by speeding up access
to global data items. You can use this option at any level of
optimization.

Note that the +Ofastaccess option may increase link time.

The default is +Onofastaccess at optimization levels 1, 2, and 3; and
+Ofastaccess at optimization level 4.

- +O[no]fltacc
The +Onofltacc option allows optimizations that follow the rules of
algebra but change the order of expression evaluation. For example, if
a, b, and c are floating-point variables, the expressions (a + b) + c
and a + (b + c) may give slightly different results due to roundoff.
This option is only effective at optimization level 2 or higher.

The +Ofltacc option disallows any optimization that changes the order
of expression evaluation and could therefore affect the accuracy of
the result.

The +Onofltacc option also allows the optimizer to fuse together
adjacent multiply and add operations, known as Fused Multiply-Add
(FMA). FMA is implemented by the FMPYFADD and FMPYNFADD instructions
and is only available on PA-RISC 2.0 systems. FMA improves performance
but occasionally produces results that may differ in accuracy from
results produced by code where fusing has not occurred. In general,
the difference is slight. The +Ofltacc option disables fusing. At
optimization level 2 or higher, FMA code generation occurs by default.

Table 13-9 “Optimizations performed by +O[no]fltacc” identifies the
different actions taken by the optimizer, according to whether you
specify +Ofltacc, +Onofltacc, or neither option.

Table 13-9 Optimizations performed by +O[no]fltacc

| Optimization options | Expression reordering? | FMA? |
|---|---|---|
| +O2 | No | Yes |
| +O2 +Ofltacc | No | No |
| +O2 +Onofltacc | Yes | Yes |
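
Both effects in the table can be reproduced in ordinary double-precision arithmetic. The following is a minimal sketch in Python rather than Fortran. It does not invoke the compiler or the PA-RISC FMA instructions; the fused result is modeled, as an assumption, by computing x*y + z exactly with rational arithmetic and rounding once. The input values are chosen only to make the differences visible.

```python
from fractions import Fraction

# Reordering: the same three values summed in two different orders.
a, b, c = 0.1, 0.2, 0.3
left_to_right = (a + b) + c
right_to_left = a + (b + c)

# Fusing: x*y + z rounded twice (separate multiply and add)
# versus rounded once (modeled fused multiply-add).
x, y, z = 1.0 + 2.0**-30, 1.0 - 2.0**-30, -1.0
unfused = x * y + z  # the product is rounded before the add
fused = float(Fraction(x) * Fraction(y) + Fraction(z))  # single rounding

print(left_to_right == right_to_left)
print(unfused, fused)
```

The two sums differ in their last bit, and the unfused expression loses the small term that the single-rounding computation keeps. Differences of this kind are what +Ofltacc is intended to prevent.
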
- +O[no]initcheck
The initialization checking feature of the optimizer has three
possible states: on, off, or unspecified.

When this option is specified in the on state (+Oinitcheck), the
optimizer initializes to zero any local, scalar, nonstatic variables
that are uninitialized with respect to at least one path leading to a
use of the variable.

When +Onoinitcheck is specified, the optimizer issues warning messages
when it discovers definitely uninitialized variables, but does not
initialize them.

When this option is unspecified, the optimizer initializes to zero any
local, scalar, nonstatic variables that are definitely uninitialized
with respect to all paths leading to a use of the variable.

This option is only effective at optimization level 2 or higher.

- +O[no]inline
The +Oinline option makes all subprograms eligible for inlining. This
option is only effective at optimization level 3 or higher.

The +Onoinline option disables inlining for all subprograms in your
program.

The default is +Oinline at optimization level 3 and +Onoinline at the
lower levels.

- +Oinline_budget=n
The +Oinline_budget option enables the optimizer to perform more
aggressive inlining. This option has the following syntax:

    +Oinline_budget=n

where n is an integer in the range 1 - 1000000 that specifies the
level of aggressiveness, as listed in Table 13-10 “Values for the
+Oinline_budget option”.

The +Onolimit and +Osize options also affect inlining. Specifying the
+Onolimit option has the same effect as specifying
+Oinline_budget=200. The +Osize option has the same effect as
+Oinline_budget=1.

Note, however, that the +Oinline_budget option takes precedence over
both of these options. This means that you can override the effect of
the +Onolimit or +Osize option on inlining by specifying the
+Oinline_budget option on the same compile line.

This option is only effective at optimization level 3 or higher.

Table 13-10 Values for the +Oinline_budget option

| Values for n | Meaning |
|---|---|
| = 100 | Default level of inlining. |
| > 100 | More aggressive inlining. The optimizer is less restricted by compilation time and code size when searching for eligible routines to inline. |
| 2 - 99 | Less aggressive inlining. The optimizer gives more weight to compilation time and code size when determining whether to inline. |
| = 1 | Only inline if it reduces code size. |
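
The precedence rule and the table rows can be restated as a short sketch. This is Python written for illustration only: `effective_budget` and `inlining_mode` are hypothetical helper names, not part of the compiler. The case where +Onolimit and +Osize are both given without +Oinline_budget is not described in this section, so the sketch refuses to guess.

```python
def effective_budget(explicit=None, nolimit=False, size=False):
    """Inlining budget implied by the options, per the rules in this section."""
    if explicit is not None:
        # +Oinline_budget=n takes precedence over +Onolimit and +Osize.
        return explicit
    if nolimit and size:
        raise ValueError("combination not described in this section")
    if nolimit:
        return 200  # +Onolimit has the same effect as +Oinline_budget=200
    if size:
        return 1    # +Osize has the same effect as +Oinline_budget=1
    return 100      # default level of inlining


def inlining_mode(n):
    """Classify a budget value using the rows of Table 13-10."""
    if not 1 <= n <= 1000000:
        raise ValueError("n must be in the range 1 - 1000000")
    if n == 1:
        return "only inline if it reduces code size"
    if n < 100:
        return "less aggressive"
    if n == 100:
        return "default"
    return "more aggressive"


# +Osize alone versus +Osize overridden by +Oinline_budget=150.
print(inlining_mode(effective_budget(size=True)))
print(inlining_mode(effective_budget(explicit=150, size=True)))
```
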
- +O[no]libcalls
The +Olibcalls option invokes millicode versions of a number of
frequently called intrinsic functions. Millicode routines have very
low call overhead and provide no error handling. Use this option to
improve the performance of selected library routines only when your
program does not depend upon exception handling.

The default is +Onolibcalls.

- +O[no]loop_unroll[=factor]
The +Oloop_unroll option turns on loop unrolling. factor is the unroll
factor that controls the code expansion. The default unroll factor is
4; that is, four copies of the loop body. By experimenting with
different factors, you may improve the performance of your program.
This option is only effective at optimization level 2 or higher.

The default is +Oloop_unroll.

- +O[no]moveflops
The +Omoveflops option allows the optimizer to move conditional
floating-point instructions, enabling other optimizations to occur.
This option is only effective at optimization level 2 or higher.

Note that the behavior of floating-point exception handling may be
altered by this option. Use +Onomoveflops if floating-point traps are
enabled and you do not want the behavior of floating-point exceptions
to be altered by the relocation of floating-point instructions, as
when your program uses the ON statement (see Appendix D).

The default is +Omoveflops.

- +O[no]parmsoverlap
The +Oparmsoverlap option causes the optimizer to assume that the
actual arguments of function calls overlap in memory, thus preventing
any optimizations that violate this assumption. This option is only
effective at optimization level 2 or higher.

Use the +Onoparmsoverlap option with programs that conform to the
standard requirement that parameters must not overlap.

The default is +Onoparmsoverlap.

- +O[no]pipeline
The +Opipeline option enables software pipelining. This option is only
effective at optimization level 2 or higher.

Use +Onopipeline (disable software pipelining) to conserve code space.

The default is +Opipeline.

- +O[no]procelim
When +Oprocelim is specified, procedures that are not referenced by
the application are eliminated from the output executable file. When
+Onoprocelim is specified, such procedures are kept in the output
executable file. You can use this option at any level of optimization.

Use +Oprocelim to reduce the size of the executable file, especially
when optimizing at levels 3 and 4, when inlining can remove all calls
to some routines.

The default is +Onoprocelim at levels 0-3, and +Oprocelim at level 4.

- +O[no]regionsched
The +Oregionsched option improves run-time performance by applying
aggressive scheduling techniques to move instructions across branches.
This option is only effective at optimization level 2 or higher.

The +Oregionsched option is incompatible with the +check option.

The default is +Onoregionsched.

- +O[no]regreassoc
The +Onoregreassoc option disables register reassociation. This option
is only effective at optimization level 2 or higher.

Use +Onoregreassoc to disable register reassociation in the rare case
that this optimization degrades performance.

The default is +Oregreassoc.

- +O[no]vectorize
The +Ovectorize option causes the compiler to replace certain loops
with calls to the math library. This option is only effective at
optimization level 3 or higher.

If you link separately from the compile line and you compiled with the
+Ovectorize option, you must ensure that the link line causes the math
library to be searched.

The +Onovectorize option is the default.
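
The level requirements stated in the option descriptions in this section can be collected into a small pre-check that flags options given at too low a level. This is a sketch in Python, not a description of how the compiler itself checks options. `MIN_LEVEL`, `option_name`, and `ineffective_flags` are hypothetical names, and the table is transcribed from the "only effective at optimization level N or higher" statements above. +Oall and +Olibcalls are not modeled, because this section does not give a minimum level for them.

```python
import re

# Minimum level at which each option takes effect, as stated in this
# section. Options usable at any level are listed with 0. Underscores
# are removed because the compiler accepts names with or without them.
MIN_LEVEL = {
    "aggressive": 2, "conservative": 2, "limit": 2, "size": 2,
    "cachepadcommon": 3, "dataprefetch": 2, "entrysched": 1,
    "fastaccess": 0, "fltacc": 2, "initcheck": 2, "inline": 3,
    "inlinebudget": 3, "loopunroll": 2, "moveflops": 2,
    "parmsoverlap": 2, "pipeline": 2, "procelim": 0,
    "regionsched": 2, "regreassoc": 2, "vectorize": 3,
}


def option_name(flag):
    """Reduce a flag such as '+Onoloop_unroll' to 'loopunroll'."""
    name = flag[2:].split("=")[0].replace("_", "")
    return name[2:] if name.startswith("no") else name


def ineffective_flags(args):
    """Return the +O flags in args that have no effect at the given +On level."""
    levels = [int(a[2:]) for a in args if re.fullmatch(r"\+O[0-4]", a)]
    if len(levels) != 1:
        raise ValueError("expected exactly one +On level option")
    level = levels[0]
    return [
        a for a in args
        if a.startswith("+O") and MIN_LEVEL.get(option_name(a), 0) > level
    ]


print(ineffective_flags("f90 +O2 +Oinline +Onomoveflops my_prog.f90".split()))
print(ineffective_flags("f90 +O3 +Onomoveflops +Ofltacc my_prog.f90".split()))
```

The first command line reports +Oinline, which requires level 3. The second, taken from the example earlier in this section, reports nothing.
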