Exemplar C and Fortran 77 Programmer's Guide > Chapter 1 Introduction

Standard HP compiler information


This section discusses some of the standard HP compiler options that are referenced later in this book. Because this book only supplements the standard HP compiler documentation, see the cc(1) and f77(1) man pages for:

  • Command-line options that are used most often

  • Optimization options

  • Input files information

  • Diagnostics information

  • Environment variables

See the section “Associated documents” for a list of additional documentation.

NOTE: The Exemplar compilers perform optimizations beyond those found in the standard HP compilers. See the Exemplar Programming Guide for more information.

+O0 (default)

Optimization level +O0 is the default optimization level. Your code compiles fastest at this level, but with little optimization. Code development and debugging should be done at this level.

At optimization level +O0, the compiler performs the optimizations listed in Table 1-1 “Optimizations performed at +O0”.

Table 1-1 Optimizations performed at +O0

Optimization | Description
------------ | -----------
Constant folding | Replaces an operation on constant operands with the result of the operation
Partial evaluation of test conditions | Determines, where possible, the truth value of a logical expression without evaluating all the operands (also known as short-circuiting)
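
The two optimizations in Table 1-1 can be pictured at the source level. The following is a minimal sketch, not compiler output: the function names are hypothetical, and the "folded" function is a hand-written equivalent of what constant folding is described as doing. Note that in C the && and || operators short-circuit by language definition, so the second function shows the idea rather than something specific to +O0.

```c
#include <assert.h>
#include <stddef.h>

/* Constant folding: both multiplications have constant operands, so the
 * whole expression can be replaced by its result at compile time. */
int seconds_per_day(void)
{
    return 24 * 60 * 60;
}

/* Hand-folded equivalent of seconds_per_day() (hypothetical name). */
int seconds_per_day_folded(void)
{
    return 86400;
}

/* Partial evaluation (short-circuiting): when p is NULL the left operand
 * of && is false, so the truth value is already known and the right
 * operand, which dereferences p, is never evaluated. */
int is_positive(const int *p)
{
    return p != NULL && *p > 0;
}

/* Checks that the folded form returns the same value as the original. */
void check_folding(void)
{
    assert(seconds_per_day() == seconds_per_day_folded());
}
```

Calling is_positive(NULL) is safe precisely because evaluation stops after the first operand.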

 

+O1

The transformations performed at +O1 are local to small subsections of code. They are therefore performed quickly and require little memory in the compiler. Use +O1 when you want some optimization but compile time matters more than runtime performance.

At optimization level +O1, the compiler performs the optimizations listed in Table 1-2 “Optimizations performed at +O1”.

Table 1-2 Optimizations performed at +O1

Optimization | Description
------------ | -----------
+O0 optimizations | See Table 1-1 “Optimizations performed at +O0”
Branch optimizations | Changes branch instructions into more efficient sequences
Dead code elimination | Removes code that is unreachable or is otherwise never executed
Instruction scheduling | Schedules instructions to take advantage of pipelining
More efficient use of registers |
Peephole optimizations | Replaces assembly language instruction sequences with faster sequences and removes redundant register loads and stores
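
Most of the +O1 optimizations in Table 1-2 work on instructions rather than on source code, but dead code elimination can be sketched in C. The example below is an illustration under assumptions: TRACE_ENABLED and both function names are hypothetical, and the second function is a hand-written version of what remains once the never-executed branch is removed.

```c
#include <assert.h>

#define TRACE_ENABLED 0   /* hypothetical compile-time switch */

/* As written: the first branch tests a constant that is always 0,
 * so its body can never execute. */
int clamp_to_byte(int x)
{
    if (TRACE_ENABLED) {
        x = -1;           /* dead code: never executed */
    }
    if (x < 0)
        return 0;
    if (x > 255)
        return 255;
    return x;
}

/* Hand-written equivalent with the dead branch removed. */
int clamp_to_byte_cleaned(int x)
{
    if (x < 0)
        return 0;
    if (x > 255)
        return 255;
    return x;
}

/* Checks that removing the dead branch does not change any result. */
void check_dead_code_removal(void)
{
    int x;
    for (x = -10; x <= 300; x++)
        assert(clamp_to_byte(x) == clamp_to_byte_cleaned(x));
}
```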

 

+O2, -O

You can use either -O or +O2 to enable the +O2 optimizations.

Transformations at +O2 are performed over the scope of each procedure. If you use this optimization level, the compiler uses more memory than at +O1 and takes longer to process your program. Optimizing procedures of more than 1,000 lines at this level takes considerably longer than at +O1.

At optimization level +O2, the compiler performs the optimizations listed in Table 1-3 “Optimizations performed at +O2”.

Table 1-3 Optimizations performed at +O2

Optimization | Description
------------ | -----------
+O0 and +O1 optimizations | See Table 1-1 “Optimizations performed at +O0” and Table 1-2 “Optimizations performed at +O1”
Global register allocation | Determines when and how long commonly used variables and expressions occupy a register
Strength reduction of induction variables | Removes linear functions of a loop counter and replaces each function with a variable that contains the value of that function
Strength reduction of constants | Replaces some multiplication instructions with addition instructions
Common subexpression elimination | Replaces subsequent instances of an expression with its result
Advanced constant folding and propagation (simple constant folding is done at +O0) | Replaces an operation on constant operands with the result of the operation (constant folding) and replaces variable references with a constant value previously assigned to that variable (constant propagation)
Loop-invariant code motion | Recognizes instructions inside a loop where the results never change and moves those instructions outside the loop
Store/copy optimization | Substitutes registers for memory locations
Unused definition elimination | Removes unused references to memory locations and register definitions
Software pipelining | Rearranges the order in which instructions execute in a loop to prevent processor stalls
Register reassociation | Reduces the cost of computing address expressions for array references by dedicating a register to track the value of the address expression
Loop unrolling (innermost loops) | Increases a loop's step value and replicates the loop body, with each replication appropriately offset from the induction variable so that all iterations are performed given the new step
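
Two of the entries in Table 1-3, strength reduction of induction variables and loop-invariant code motion, can be sketched by transforming a loop by hand. This is an illustration only: the function names are hypothetical, and the second function shows one plausible source-level result, not what the optimizer actually emits.

```c
#include <assert.h>
#include <stddef.h>

/* As written: i * stride is a linear function of the loop counter, and
 * scale * offset produces the same value on every iteration. */
long sum_strided(const int *data, size_t n, size_t stride,
                 int scale, int offset)
{
    long total = 0;
    size_t i;
    for (i = 0; i < n; i++)
        total += data[i * stride] + scale * offset;
    return total;
}

/* Hand-transformed equivalent (hypothetical name). */
long sum_strided_transformed(const int *data, size_t n, size_t stride,
                             int scale, int offset)
{
    long total = 0;
    const int bias = scale * offset; /* loop-invariant code motion */
    size_t index = 0;                /* holds the value of i * stride */
    size_t i;
    for (i = 0; i < n; i++) {
        total += data[index] + bias;
        index += stride;             /* an addition replaces the multiply */
    }
    return total;
}

/* Checks that both forms agree on a small input. */
void check_strided(void)
{
    int d[6] = {1, 2, 3, 4, 5, 6};
    assert(sum_strided(d, 3, 2, 2, 5) ==
           sum_strided_transformed(d, 3, 2, 2, 5));
}
```

With the six-element array used in check_strided(), both functions visit elements 0, 2, and 4 and add 10 to each.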

 

+O3

At optimization level +O3, the following optimizations are made:

+O4

At this level, optimization occurs at link time, allowing the optimizer to analyze all files compiled with the +O4 option at once. Because analysis is done when linking, the compile time is generally shorter than at lower optimization levels, but linking takes more time.

At optimization level +O4, the following optimizations are made:

+O[no]aggressive

The +O[no]aggressive option enables optimizations that can result in significant performance improvement, but that can change a program's behavior. These optimizations include those invoked by the following advanced options (which are described in the cc(1) and f77(1) man pages):

  • +Osignedpointers (available only in C)

  • +Oentrysched

  • +Onofltacc

  • +Olibcalls

  • +Onoinitcheck

  • +Ovectorize

The default is +Onoaggressive. The +O[no]aggressive option can be used at +O2 and above.
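
The man pages remain the authority on what each of the listed options does. As a general illustration of how an optimization can change a program's behavior, the sketch below shows that regrouping a floating-point sum changes its result. It assumes, without confirmation from this section, that regrouping of this kind is among the transformations that relaxed floating-point accuracy permits. The function names are hypothetical.

```c
#include <assert.h>

/* Sum in the order written: (a + b) + c. */
double sum_as_written(double a, double b, double c)
{
    return (a + b) + c;
}

/* Algebraically equivalent regrouping: a + (b + c). */
double sum_regrouped(double a, double b, double c)
{
    return a + (b + c);
}

/* With a = 1e20, b = -1e20, c = 1.0 the two groupings disagree:
 * as written, a + b is exactly 0, so the sum is 1.0; regrouped,
 * b + c rounds back to -1e20, so the sum is 0.0. */
void check_regrouping_differs(void)
{
    assert(sum_as_written(1e20, -1e20, 1.0) == 1.0);
    assert(sum_regrouped(1e20, -1e20, 1.0) == 0.0);
}
```

A program that depends on the exact result of such a sum is the kind of code for which the default, +Onoaggressive, is the safer choice.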

+O[no]all

The +Oall option applies maximum optimization to achieve the best runtime performance. This option is equivalent to specifying +Oaggressive and +Onolimit on the same command line. The +Oall option implies +O4. The default is +Onoall.

+O[no]fail_safe

The +Ofail_safe option allows a compilation with internal optimization errors to continue, rather than abort. If internal optimization errors are found, the compiler issues a warning message, then restarts the compilation at +O0. When using +Onofail_safe, compilation aborts if internal optimization errors occur.

This option can be used at +O1 or higher. The default is +Ofail_safe.

+O[no]info

The +O[no]info option displays [does not display] feedback information about the optimization process (for example, cloning and inlining). Currently, this option is useful only at +O3 and above. The default is +Onoinfo. For information on a related option, see the section +O[no]report[=report_type].

+Oinline_budget=n

In +Oinline_budget=n, n is an integer in the range 1 to 1000000 that specifies the level of aggressiveness, as follows:

n = 100

Default level of inlining.

n > 100

More aggressive inlining.

The optimizer is less restricted by compilation time and code size when searching for eligible routines to inline.

n = 1

Inline only if doing so reduces code size.

The default is +Oinline_budget=100.

The +Onolimit and +Osize options also affect inlining. Specifying the +Onolimit option implies specifying +Oinline_budget=200. The +Osize option implies +Oinline_budget=1.

Note, however, that the +Oinline_budget option takes precedence over both of these options. This means that you can override the effects on inlining of the +Onolimit and +Osize options by specifying the +Oinline_budget option on the same command line.

The +Oinline_budget=n option is valid at +O3 and above.
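
The precedence rules in this section can be summarized as a small decision function. This is only a model of the text above, not part of any HP interface: the enum, the BUDGET_UNSET marker, and the function name are all hypothetical. The case where +Onolimit and +Osize appear together is deliberately not modeled, because this section does not describe it.

```c
#include <assert.h>

/* Hypothetical model of which other option, if any, is on the command line. */
enum limit_option { OPT_NONE, OPT_NOLIMIT, OPT_SIZE };

/* Valid budgets are 1 to 1000000, so 0 can mark "option not given". */
#define BUDGET_UNSET 0

/* Returns the inline budget in effect according to the rules above. */
int effective_inline_budget(int explicit_budget, enum limit_option other)
{
    if (explicit_budget != BUDGET_UNSET)
        return explicit_budget;   /* +Oinline_budget=n takes precedence */

    switch (other) {
    case OPT_NOLIMIT:
        return 200;               /* +Onolimit implies +Oinline_budget=200 */
    case OPT_SIZE:
        return 1;                 /* +Osize implies +Oinline_budget=1 */
    default:
        return 100;               /* documented default */
    }
}

/* Checks the four cases described in the text. */
void check_inline_budget(void)
{
    assert(effective_inline_budget(BUDGET_UNSET, OPT_NONE) == 100);
    assert(effective_inline_budget(BUDGET_UNSET, OPT_NOLIMIT) == 200);
    assert(effective_inline_budget(BUDGET_UNSET, OPT_SIZE) == 1);
    assert(effective_inline_budget(500, OPT_SIZE) == 500);
}
```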

+O[no]limit

The +O[no]limit option suppresses [does not suppress] optimizations that significantly increase compile time or consume large amounts of memory. Specifying +Onolimit implies specifying +Oinline_budget=200. (See the section "+Oinline_budget=n" above for additional information.) This option can be used at +O2 and above. The default is +Olimit.

+O[no]loop_transform

The +O[no]loop_transform option transforms [does not transform] eligible loops for improved cache performance. The transforms include loop distribution, loop interchange, loop blocking, loop unroll, loop unroll and jam (in C V2.0), and loop fusion. This option can be used at +O3 and above. The default is +Oloop_transform.
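
Loop interchange, one of the transforms named above, can be sketched by hand. The example is illustrative only: the array dimensions and function names are hypothetical, and the second function is a hand-written version of what interchange would produce. C stores arrays in row-major order, so walking along a row touches adjacent memory. Fortran arrays are column-major, so the favorable loop order there is the reverse.

```c
#include <assert.h>

#define ROWS 4
#define COLS 3

/* As written: the inner loop walks down a column, so consecutive
 * accesses are COLS elements apart in memory. */
long sum_column_order(int m[ROWS][COLS])
{
    long total = 0;
    int i, j;
    for (j = 0; j < COLS; j++)
        for (i = 0; i < ROWS; i++)
            total += m[i][j];
    return total;
}

/* After interchanging the loops: the inner loop walks along a row,
 * so consecutive accesses are adjacent in memory. */
long sum_row_order(int m[ROWS][COLS])
{
    long total = 0;
    int i, j;
    for (i = 0; i < ROWS; i++)
        for (j = 0; j < COLS; j++)
            total += m[i][j];
    return total;
}

/* Checks that interchange leaves the result unchanged. */
void check_interchange(void)
{
    int m[ROWS][COLS] = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}, {10, 11, 12}};
    assert(sum_column_order(m) == sum_row_order(m));
}
```

Interchange is safe here because every iteration is independent. A loop whose iterations depend on each other may not be eligible.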

+O[no]loop_unroll[=n]

This option unrolls [does not unroll] program loops by a factor of n. For example, specifying +Oloop_unroll=4 requests the optimizer to replicate the loop body four times. This option can be used at +O2 and above. The default is +Oloop_unroll=4.
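
As a rough picture of what unrolling by a factor of 4 means, the sketch below unrolls a summation loop by hand. It is an illustration, not compiler output: the function names are hypothetical, and the separate remainder loop for trip counts that are not a multiple of 4 is an assumption about one common way to handle them.

```c
#include <assert.h>
#include <stddef.h>

/* Original loop: one element per iteration. */
long sum_rolled(const int *a, size_t n)
{
    long total = 0;
    size_t i;
    for (i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Hand-unrolled by 4: the step becomes 4 and the body is replicated
 * four times, each copy offset from the induction variable i. */
long sum_unrolled_by_4(const int *a, size_t n)
{
    long total = 0;
    size_t i = 0;
    for (; i + 3 < n; i += 4) {
        total += a[i];
        total += a[i + 1];
        total += a[i + 2];
        total += a[i + 3];
    }
    /* Remainder loop: handles the last n % 4 elements. */
    for (; i < n; i++)
        total += a[i];
    return total;
}

/* Checks both forms on a length that is not a multiple of 4. */
void check_unroll(void)
{
    int a[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    assert(sum_rolled(a, 10) == sum_unrolled_by_4(a, 10));
}
```

The unrolled loop executes fewer branch and counter-update instructions per element, at the cost of a larger loop body.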

+O[no]size

The +Osize option suppresses optimizations that significantly increase code size. Specifying +Osize implies specifying +Oinline_budget=1. See the section +Oinline_budget=n for additional information.

The +Onosize option does not prevent optimizations that can increase code size.

The +O[no]size option can be used at +O2, +O3, or +O4. The default is +Onosize.

© Hewlett-Packard Development Company, L.P.