Fortran 90, Fortran 77, C, aC++: Exemplar Programming Guide > Chapter 3 Compiler optimizations

Optimization levels


Five optimization levels are available for use with the Exemplar compilers. These options have identical names and perform identical optimizations, regardless of which compiler you are using. They are specified on the compiler command line along with any other options you wish to use. Exemplar compiler optimization levels are summarized in Table 3-1 “Compiler optimization levels”.

Table 3-1 Compiler optimization levels

Option           Description
+O0 (default)    Machine instruction-level optimizations: constant folding and simple register assignment
+O1              Block-level optimizations: +O0 optimizations, plus instruction scheduling and optimizations on basic blocks (a basic block is a linear sequence of machine instructions with a single entry and a single exit)
+O2              Routine-level optimizations: +O1 optimizations, plus optimizations within subprograms in a single file; loop optimizations to reduce pipeline stalls; analysis of data flow, memory usage, loops, and expressions
+O3              File-level optimizations: +O2 optimizations, plus full optimizations across all subprograms (including inlining) within a single file; use of parallelism-related directives and pragmas from the Exemplar programming model when +Oparallel is also specified
+O4 [1]          Cross-module optimizations: +O3 optimizations, plus full optimizations across the entire application; optimizations include inlining across the application; the +O4 optimizations are performed at link time

[1] The +O4 option is not available in Fortran 90.

These options are cumulative; each option retains the optimizations of the previous option. For example, entering the following command line compiles the Fortran program foo.f with all +O2, +O1, and +O0 optimizations shown in Table 3-1 “Compiler optimization levels”.

% f90 +O2 foo.f
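
As a rough illustration of the kinds of source that the levels in Table 3-1 act on, the C fragment below contains a constant expression of the sort that constant folding (+O0) can evaluate at compile time, and a small helper that is a natural candidate for inlining within a single file (+O3). The function names are hypothetical and do not come from this guide. What a given compiler release actually generates for this code is not specified here, so treat it as a sketch only.

```c
/* Hypothetical example file; all names are illustrative. */

/* A constant expression: constant folding can replace 60 * 60 * 24
   with 86400 at compile time instead of multiplying at run time. */
int seconds_per_day(void)
{
    return 60 * 60 * 24;
}

/* A small helper defined in the same file as its caller. */
static int square(int x)
{
    return x * x;
}

/* Calls square() in a loop; a file-level optimizer may choose to
   inline the call, removing the call overhead from each iteration. */
int sum_of_squares(int n)
{
    int i, total = 0;

    for (i = 1; i <= n; i++)
        total += square(i);
    return total;
}
```

The optimization level for a C file like this is chosen on the compiler command line in the same way as in the f90 example above.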

In addition to these options, the +Oparallel option is available for use at +O3 and above. (+Onoparallel is the default.) When the +Oparallel option is specified, the compiler:

  • Looks for opportunities for parallel execution in loops.

  • Honors the parallelism-related directives and pragmas of the Exemplar programming model. When using Exemplar Fortran 77 Version 1.2.3 or Exemplar C Version 1.2.3, +Oexemplar_model (the default) must also be in effect for these directives and pragmas to be enabled.
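
To give a sense of what an opportunity for parallel execution looks like, the following C sketch contrasts a loop whose iterations are independent with one that carries a dependence from each iteration to the next. The function names are hypothetical, and whether the compiler parallelizes a particular loop depends on its own analysis. In C that analysis also has to account for whether the two pointer arguments might overlap.

```c
/* Independent iterations: element i of y depends only on element i of
   x and y. Assuming x and y do not overlap, the iterations could run
   in any order, which makes this loop a candidate for parallel execution. */
void scale_add(int n, double a, const double *x, double *y)
{
    int i;

    for (i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* Loop-carried dependence: iteration i reads the value that iteration
   i - 1 wrote, so the iterations cannot simply be run concurrently. */
void prefix_sum(int n, double *v)
{
    int i;

    for (i = 1; i < n; i++)
        v[i] = v[i] + v[i - 1];
}
```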

The +Onoautopar (no automatic parallelization) option is available for use with +Oparallel at +O3 and above; +Oautopar is the default. +Onoautopar causes the compiler to parallelize only those loops that are immediately preceded by loop_parallel or prefer_parallel directives or pragmas; for more information, refer to Chapter 4, “Basic shared-memory programming.”
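
A directive-marked loop might look like the C sketch below. The pragma spelling used here (`#pragma _CNX prefer_parallel`) is an assumption and is not taken from this section; check it against Chapter 4 before relying on it. The function name is hypothetical. A compiler that does not recognize the pragma will typically ignore it, possibly with a warning, and run the loop serially with the same result.

```c
/* Hypothetical example: fill an array with the squares of its indices. */
void fill_squares(int n, long *out)
{
    int i;

    /* ASSUMPTION: the pragma spelling is not given in this section;
       see Chapter 4. The pragma immediately precedes the loop, which
       is what marks the loop for parallelization under +Onoautopar. */
#pragma _CNX prefer_parallel
    for (i = 0; i < n; i++)
        out[i] = (long)i * i;
}
```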

The +Onodepar (node-parallelism) option is also available for use with +Oparallel at +O3 and above. This option causes the compiler to generate node-parallel code (indicated by directives and pragmas that use the nodes attribute) for a multinode, scalable SMP. (See Chapter 4, “Basic shared-memory programming,” for information on attributes.)

The +Ononodepar option (the default) causes the compiler to generate code for a single-node machine. When this option is used, serial code is generated for node-parallel constructs; thus, node-parallelism is not implemented. Thread-parallelism—both automatic and directive-specified—is still implemented.

© Hewlett-Packard Development Company, L.P.