HP C/HP-UX Programmer's Guide: HP-UX Systems > Chapter 4 Optimizing HP C Programs

Profile-Based Optimization

Profile-based optimization (PBO) is a set of performance-improving code transformations based on the run-time characteristics of your application.

There are three steps involved in performing this optimization:

  1. Instrumentation - Insert data collection code into the object program.

  2. Data Collection - Run the program with representative data to collect execution profile statistics.

  3. Optimization - Generate optimized code based on the profile data.

Invoke profile-based optimization through HP C by using any level of optimization and the +I and +P options on the cc command line.

When you use PBO, compile times are faster and link times are slower because code generation happens at link time.

Instrumenting the Code

To instrument your program, use the +I option as follows:

cc -Aa +I -O -c sample.c

cc -o sample.exe +I -O sample.o

The first command line uses the -O option to perform level 2 optimization and the +I option to instrument the code. The -c option suppresses linking and creates an intermediate object file called sample.o. The .o file can be used later in the optimization phase, avoiding a second compile.

The second command line uses the -o option to link sample.o into sample.exe. The +I option instruments sample.exe with data collection code. Note that instrumented programs run slower than non-instrumented programs. Only use instrumented code to collect statistics for profile-based optimization.

Collecting Data for Profiling

To collect execution profile statistics, run your instrumented program with representative data as follows:

sample.exe < input.file1

sample.exe < input.file2

This step creates and logs the profile statistics to a file, by default called flow.data. You can use this data collection file to store the statistics from multiple test runs of different programs that you may have instrumented.

Performing Profile-Based Optimization

To optimize the program based on the previously collected run-time profile statistics, relink the program as follows:

cc -o sample.exe +P -O sample.o

An alternative to this procedure is to recompile the source file in the optimization step:

cc -o sample.exe +I -O sample.c

sample.exe < input.file1            

cc -o sample.exe +P -O sample.c    
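
The three steps described above can be gathered into one small script. The following is a minimal sketch, not part of HP C: pbo_plan is a hypothetical helper that only prints the command sequence from this section, so it can be inspected on any system. It assumes file names without spaces or shell metacharacters.

```shell
#!/bin/sh
# Sketch only: prints the PBO command sequence instead of running it.
# pbo_plan is a hypothetical helper, not an HP-UX tool.

# pbo_plan NAME INPUT...
pbo_plan() {
    name=$1
    shift
    # Step 1, instrumentation: compile once to NAME.o, then link with +I.
    echo "cc -Aa +I -O -c $name.c"
    echo "cc -o $name.exe +I -O $name.o"
    # Step 2, data collection: one run per representative input file.
    for input in "$@"; do
        echo "$name.exe < $input"
    done
    # Step 3, optimization: relink the same object file with +P.
    echo "cc -o $name.exe +P -O $name.o"
}

pbo_plan sample input.file1 input.file2
```

On a system where HP C is installed, the printed commands could be piped to sh to run them. Because the object file from step 1 is reused in step 3, the source file is compiled only once.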

Maintaining Profile Data Files

Profile-based optimization stores execution profile data in a disk file. By default, this file is called flow.data and is located in your current working directory.

You can override the default name of the profile data file. This is useful when working on large programs or on projects with many different program files.

You can use the FLOW_DATA environment variable to specify the name of the profile data file with either the +I or +P options. You can use the +df command-line option to specify the name of the profile data file with the +P option.

The +df option takes precedence over the FLOW_DATA environment variable.

In the following example, the FLOW_DATA environment variable is set to override the flow.data file name. The profile data is stored instead in /users/profiles/prog.data.

% setenv FLOW_DATA /users/profiles/prog.data
% cc -Aa -c +I +O3 sample.c
% cc -o sample.exe +I +O3 sample.o
% sample.exe < input.file1
% cc -o sample.exe +P +O3 sample.o

In the next example, the +df option uses /users/profiles/prog.data to override the flow.data file name.

% cc -Aa -c +I +O3 sample.c
% cc -o sample.exe +I +O3 sample.o
% sample.exe < input.file1
% mv flow.data /users/profiles/prog.data
% cc -o sample.exe +df /users/profiles/prog.data +P +O3 sample.o

Maintaining Instrumented and Optimized Program Files

You can maintain both instrumented and optimized versions of a program. You might keep an instrumented version of the program on hand for development use, and several optimized versions on hand for performance testing and program distribution.

Care must be taken when maintaining different versions of the executable file because the instrumented program file name is used as the key identifier when storing execution profile data in the data file.

The optimizer must know what this key identifier name is in order to find the execution profile data. By default, the key identifier name used to retrieve the profile data is the instrumented program file name used to run the program for data collection.

When you optimize a program file and the optimized program file name is different from the instrumented program file name, you must use the +pgm option. Specify the instrumented program file name with this option. The optimizer uses this value as the key identifier to retrieve execution profile data.

In the following example, the instrumented program file name is sample.inst. The optimized program file name is sample.opt. The +pgm name option is used to pass the instrumented program name to the optimizer:

% cc -Aa -c +I +O3 sample.c
% cc -o sample.inst +I +O3 sample.o
% sample.inst < input.file1
% cc -o sample.opt +P +O3 +pgm sample.inst sample.o
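
The rule for +pgm can be expressed as a small decision: pass it only when the optimized file name differs from the instrumented one. The sketch below prints the resulting link command rather than running it. optimize_cmd is a hypothetical helper, and the +O3 level is simply carried over from the example above.

```shell
#!/bin/sh
# Sketch only: prints the +P link command, it does not call cc.
# optimize_cmd is a hypothetical helper name.

# optimize_cmd INSTRUMENTED OPTIMIZED OBJECT...
optimize_cmd() {
    inst=$1
    opt=$2
    shift 2
    if [ "$inst" = "$opt" ]; then
        # Same name: the default key identifier already matches.
        echo "cc -o $opt +P +O3 $*"
    else
        # Different name: tell the optimizer which key to look up.
        echo "cc -o $opt +P +O3 +pgm $inst $*"
    fi
}

optimize_cmd sample.exe sample.exe sample.o
optimize_cmd sample.inst sample.opt sample.o
```

The first call prints a command without +pgm. The second reproduces the last command of the example above.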

Profile-Based Optimization Notes

When using profile-based optimization, please note the following:

  • Because the linker performs code generation for profile-based optimization, linking object files compiled with +I and +P takes more time than linking ordinary object files. Compile times, however, are relatively fast, because the compiler generates only the intermediate code.

  • Profile-based optimization has a greater impact on application performance at each higher level of optimization.

  • Profile-based optimization should be enabled during the final stages of application development. To obtain the best performance, re-profile and re-optimize your application after making source code changes.

  • If you use level-4 or profile-based optimization and do not use +DA to generate code for a specific version of PA-RISC, note that code generation occurs at link time. Therefore, the system on which you link, rather than compile, determines the object code generated.

  • If you use level-4 or profile-based optimization and do not use +DS to specify instruction scheduling, note that instruction scheduling occurs at link time. Therefore, the system on which you link, rather than compile, determines the implementation of instruction scheduling.

For more information on profile-based optimization, see the HP-UX Linker and Libraries Online User Guide.

The +Oprofile Option for Profile-Based Optimization

The HP C compiler can generate PA-RISC machine code (SOMs) directly, instead of the compiler's intermediate code (ISOMs), during the compilation phase itself.

By default, the compiler generates intermediate code when the PBO options (+I, +P) are used, and final code generation happens during the link phase, unless +Oreusedir= is used. At that stage, the linker calls ucomp. An obvious disadvantage is that even when only a single file has changed, code for all the other files is generated again during the link phase. This makes the overall compile-and-link time significantly longer.

As an enhancement to this behavior, the compiler generates PA-RISC machine code (SOM) whenever the newly introduced PBO options are used. Code generation then does not need to happen during the link phase, because the compiler itself has already converted the intermediate code (ISOM) into machine code (SOM) by calling ucomp.

The following lists the newly introduced PBO options:

  • +Oprofile=use

    Use the profile database to optimize. This is a synonym for the +P option.

  • +Oprofile=use[:filename]

    Specify filename as the name of the profile database file. This is a synonym for the +P and +df options.

  • +Oprofile=collect

    Instrument the application for profile based optimization. This is a synonym for +I.

  • +Oprofile=prediction:static

    Select static branch prediction for this executable. This is a synonym for +Ostaticprediction.

The new options correspond to the existing options as follows, except that they build SOMs instead of ISOMs:

  • +Oprofile=collect corresponds to +I.

  • +Oprofile=use corresponds to +P.

  • +Oprofile=use:filename corresponds to "+P +df filename".

  • +Oprofile=prediction:static corresponds to +Ostaticprediction.

As seen above, the behavior of the new +Oprofile options is equivalent to that of the existing PBO options, except that whenever +Oprofile is used, the compiler calls ucomp to convert intermediate code into machine code.

Performing PBO with the existing options is unchanged: there is no change in behavior when +I, +P, or any other old options are used on the command line. The cc driver calls ld to generate ISOMs. The +pgm and -tu options also work with the new options.

NOTE: The new options can be used only with -c (compile only). Without -c, the optimization is performed as in previous releases.

The new options are available only at optimization levels below +O4. At +O4, +I or +P is used instead.

Mixing old and new options on the same command line is not allowed. For example, +Oprofile and +I, +P, or +df on the same command line are incompatible.
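
The correspondence between old and new options, and the rule against mixing them, can be sketched as two shell functions. Both to_oprofile and mixes_old_and_new are hypothetical helpers written for this illustration. They only inspect option strings and do not invoke the compiler.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers that model the rules stated above.

# to_oprofile OLD_OPTION [FILENAME]
# Prints the +Oprofile form listed for an old PBO option.
to_oprofile() {
    case "$1" in
        +I) echo "+Oprofile=collect" ;;
        +P)
            if [ -n "${2:-}" ]; then
                echo "+Oprofile=use:$2"    # +P together with +df FILENAME
            else
                echo "+Oprofile=use"
            fi
            ;;
        +Ostaticprediction) echo "+Oprofile=prediction:static" ;;
        *)
            echo "no +Oprofile equivalent listed for $1" >&2
            return 1
            ;;
    esac
}

# mixes_old_and_new OPTION...
# Exit status 0 when an +Oprofile option appears together with +I, +P, or +df.
mixes_old_and_new() {
    old=0
    new=0
    for opt in "$@"; do
        case "$opt" in
            +Oprofile=*) new=1 ;;
            +I|+P|+df)   old=1 ;;
        esac
    done
    [ "$old" = 1 ] && [ "$new" = 1 ]
}

to_oprofile +I
to_oprofile +P /users/profiles/prog.data
if mixes_old_and_new -c +Oprofile=collect +I sample.c; then
    echo "incompatible: +Oprofile mixed with +I, +P, or +df"
fi
```

Under the rules above, a compile line that uses the new form would look something like cc -Aa -c +O3 +Oprofile=collect sample.c, that is, with -c and an optimization level below +O4. That line is assembled from the stated constraints and has not been verified against a compiler.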

© 2003 Hewlett-Packard Development Company, L.P.