HP-UX Linker and Libraries User's Guide

HP 9000 Computers

Part Number: B2355-90730
Publication Date: December 2001
Copyright © 2001 Hewlett-Packard Company. All rights reserved.



What's New in Recent Releases

This section contains information about recent releases of the HP-UX linker toolset:
For This Release

Intermediate Update:

The HP-UX March 2001 linker toolset contains new features:

In the previous HP-UX 11i Release

The HP-UX 11i linker toolset contains new features:

In Previous HP-UX 11.x Release

The HP-UX 10.20 and 11.x releases contain performance enhancements:

For the HP-UX 11.00 Release

The HP-UX 11.00 linker toolset contains new features:

If you use the 32-bit mode linker toolset, see the following items:

If you use the 64-bit mode linker toolset, see the following items:

For Previous Releases

The following items were added in the HP-UX 10.30 release:

The following items were added in the HP-UX 10.20 release:

PA-RISC Changes in Hardware Compatibility

The HP-UX 10.20 release introduced HP 9000 systems based on the PA-RISC 2.0 architecture. Also, beginning with that release, HP compilers by default generate executable code for the PA-RISC architecture of the machine on which you are compiling.

In previous releases, the compilers generated PA-RISC 1.0 code on all HP 9000 Series 800 servers and PA-RISC 1.1 code on Series 700 workstations. HP compilers now by default generate PA-RISC 1.1 code on 1.1 systems and 2.0 code on 2.0 systems.

Using the +DAportable compiler option provides compatibility of code between PA-RISC 1.1 and 2.0 systems. Note that the HP-UX 10.10 release is the last supported release for PA-RISC 1.0 systems, so 1.1 and 2.0 code generated by the HP-UX 10.20 release (or later) of HP compilers is not supported on PA-RISC 1.0 systems.


NOTE  The +DA1.0 option will be obsolete in a future release. You cannot build PA1.0 executables on 11.x and run them on 10.x systems. You can achieve better performance on PA-RISC 1.1 and 2.0 systems by not using this option.

PA-RISC 2.0 Compatibility

The instruction set on PA-RISC 2.0 is a superset of the instruction set on PA-RISC 1.1. As a result, code generated for PA-RISC 1.1 systems will run on PA-RISC 2.0 systems. However, code generated for PA-RISC 2.0 systems will not run on PA-RISC 1.1 or 1.0. When invoked with the -v option, the linker issues a hardware compatibility warning whenever it links in any PA-RISC 2.0 object files:
/usr/ccs/bin/ld: (Warning) At least one PA 2.0 object file 
(sum.o) was detected. The linked output may not run on PA 1.x 
system.
If you try to run a PA-RISC 2.0 program on a 1.1 system, you'll see a message like:
$ a.out
ksh: ./a.out: Executable file incompatible with hardware
In this example, the +DAportable compiler option can be used to create code that is compatible with both PA-RISC 1.1 and 2.0 systems.

PA-RISC Architectures and Their System Models

The HP 9000 PA-RISC (Precision Architecture Reduced Instruction Set Computing) Series 700/800 family of workstations and servers has evolved through three versions of PA-RISC:
PA-RISC 1.0

The original version of PA-RISC first introduced on Series 800 servers. The following Series are included: 840, 825, 835/SE, 845/SE, 850, 855, 860, 865, 870/x00, 822, 832, 842, 852, 890, 808, 815, 635, 645.
PA-RISC 1.1

The second version of PA-RISC first introduced on Series 700 workstations. Newer Series 800 systems also use this version of the architecture. The following Series are included: 700, 705, 710, 715, 720, 725, 730, 735, 750, 755, B132L, B160L, B132L+, B180L, C100, C110, J200, J210, J210XC, 742i, 742rt, 743i, 743rt, 745i, 747i, 748i, 8x7, D (except Dx70, Dx80), E, F, G, H, I, K (except Kx50, Kx60, Kx70), T500, T520.
PA-RISC 2.0

The newest version of PA-RISC. The following Series are included: C160, C180, C180XP, C200, C240, J280, J282, J2240, Dx70, Dx80, Kx50, Kx60, Kx70, T600, V2200.
For More Information

64-bit Mode Linker Toolset Compatibility with De Facto Industry Standards

The 64-bit mode linker and dynamic loader provide linking and loading behaviors found widely across the UNIX industry. These behaviors, together with the SVR4 standards, are considered to define the de facto industry standards.

The following 64-bit linker behavior is compliant with the de facto industry standards:

The HP-UX 11.00 release maintains certain behaviors to make the transition from 32-bit to 64-bit mode easier:

64-bit Mode ELF Object File Format

Starting with HP-UX release 11.00, the 64-bit linker toolset supports the ELF (executable and linking format) object file format. The 64-bit linker toolset provides new tools to display and manipulate ELF files. The libelf(3E) library routines provide access to ELF files. The command elfdump(1) displays contents of an ELF file.

The following options instruct the compiler to generate 64-bit ELF object code.
 
Option      Compiler
+DA2.0W     C and aC++
+DD64       C

See the HP-UX Software Transition Toolkit (STK) at http://www.software.hp.com/STK/ for more information on the structure of ELF object files.

New Features for 64-bit Mode Linking

This section introduces new features of the 64-bit linker for HP-UX release 11.00.

64-bit Mode Linker Options

The ld(1) command supports the following new options in 64-bit mode:
 
Option              Action
-dynamic            Forces the linker to create a shared executable. The linker looks for shared libraries first and then archived libraries. This option is on by default when you compile in 64-bit mode.
-noshared           Forces the linker to create a fully bound archive program.
-k filename         Allows you to control the mapping of input sections in the object file to segments in the output file.
+[no]allowunsats    Instructs the linker how to report errors for output files with unsatisfied symbols.
+compat             Instructs the linker to use 32-bit mode linking and dynamic loading behaviors.
+[no]forceload      Enables/disables forced loading of all the object files from archive libraries.

The linker accepts but ignores this option in 32-bit mode. It creates an executable (a.out).

+hideallsymbols     Hides all symbols from being exported.
+nodefaultmap       Instructs the linker not to load the default mapfile. See the -k option.
+noenvvar           Instructs the dynamic loader not to look at the LD_LIBRARY_PATH and SHLIB_PATH environment variables at runtime.
+std                Instructs the linker to use SVR4 compatible linking and loading behaviors. Default for 64-bit mode.
+stripunwind        Instructs the linker not to output the unwind table.
+vtype type         Produces verbose output about selected link operations.

64-bit Mode Linker-defined Symbols

The 64-bit linker reserves the following symbol names:
 
Symbol              Definition
__SYSTEM_ID         Largest architecture revision level used by any compilation unit
_FPU_STATUS         Initial value of FPU status register
_end or end         Address of first byte following the end of the main program's data segment; identifies the beginning of the heap segment
__TLS_SIZE          Size of the Thread Local Storage segment required by the program
__text_start        Beginning of the text segment
__text_start_f      Beginning of text segment, declared as a function
_etext or etext     End of the text segment
_etext_f            End of text segment, declared as a function
__data_start        Beginning of the data segment
_edata or edata     End of initialized data
__gp                Global pointer value
__init_start        Beginning of the .init section
__init_end          End of the .init section
__preinit_start     Beginning of the .preinit section
__preinit_end       End of the .preinit section
__fini_start        Beginning of the .fini section
__fini_end          End of the .fini section
__unwind_start      Beginning of the unwind table
__unwind_end        End of the unwind table

NOTE  The linker generates an error if a user application also defines these symbols.

64-bit Mode Link-time Differences

The 64-bit mode linker toolset does not support the following 32-bit mode features.
Option or Behavior                            Description
-A name                                       Specifies incremental loading. 64-bit applications must use shared libraries instead.
-C n                                          Does parameter type checking. This option is unsupported.
-S                                            Generates an initial program loader header file. This option is unsupported.
-T                                            Saves data and relocation information in temporary files to reduce virtual memory requirements during linking. This option is unsupported.
-q, -Q, -n                                    Generates an executable with file type DEMAND_MAGIC, EXEC_MAGIC, and SHARE_MAGIC respectively. These options have no effect and are ignored in 64-bit mode.
-N                                            Causes the data segment to be placed immediately after the text segment. This option is accepted but ignored in 64-bit mode. If this option is used because your application data segment is large, then the option is no longer needed in 64-bit mode. If this option is used because your program is used in an embedded system or other specialized application, consider using mapfile support with the -k option.
+cg pathname                                  Specifies pathname for compiling I-SOMs to SOMs. This option is unsupported.
+dpv                                          Displays verbose messages regarding procedures which have been removed due to dead procedure elimination. Use the -v linker option instead.
Intra-library versioning                      Specified by using the HP_SHLIB_VERSION pragma (C and aC++) or SHLIB_VERSION directive (Fortran90). In 32-bit mode, the linker lets you version your library by object files. 64-bit applications must use SVR4 library-level versioning instead.
Duplicate code and data symbols               Code and data cannot share the same namespace in 64-bit mode. You should rename the conflicting symbols.
All internal and undocumented linker options  These options are unsupported.

For more information, see the HP-UX Linker and Libraries Online User Guide (ld +help).

64-bit Mode Run Time Differences

Applications compiled and linked in 64-bit mode use a run-time dynamic loading model similar to other SVR4 systems. There are two main areas where program startup changes in 64-bit mode:

It is recommended that you use the standard SVR4 linking option (+std), which is on by default when linking 64-bit applications. During the transition, there may be circumstances in which you need 32-bit compatible linking behavior. For these cases, the 64-bit linker provides the +compat option, which forces the linker to use 32-bit linking and dynamic loading behavior.

The following table summarizes the dynamic loader differences between 32-bit and 64-bit mode:
 
+s and +b path_list ordering
    32-bit mode: Ordering is significant.
    64-bit mode: Ordering is insignificant by default. Use +compat to enforce ordering.

Symbol searching in dependent libraries
    32-bit mode: Depth-first search order.
    64-bit mode: Breadth-first search order. Use +compat to enforce depth-first ordering.

Run time path environment variables
    32-bit mode: No run time environment variables by default. If +s is specified, then SHLIB_PATH is available.
    64-bit mode: LD_LIBRARY_PATH and SHLIB_PATH are available. Use +noenvvar or +compat to turn off run-time path environment variables.

+b path_list and -L directories interaction
    32-bit mode: -L directories recorded as absolute paths in executables.
    64-bit mode: -L directories are not recorded in executables, if -L and +b are both used. To record all the directories in executables, add all directories specified in -L to +b path_list.

For more information on transition issues, see HP-UX 64-bit Porting and Transition Guide.

Changes in Future Releases

The following changes are planned in future releases.

Online Help for Linker and Libraries

The Linker and Libraries Online User Guide is available for HP 9000 Series 700 and 800 systems. The online help comes with HP C, HP C++, HP aC++, HP Fortran, HP Pascal, and HP Micro Focus COBOL/UX. Online help can be accessed from any browser which can read HTML files.

To access the Linker and Libraries Online User Guide from the ld command line:

ld +help

What Happens When You Compile and Link a Program

This chapter describes the process of compiling and linking a program.

Compiling Programs on HP-UX: An Example

To create an executable program, you compile a source file containing a main program. For example, to compile an ANSI C program named sumnum.c, shown below, use this command (-Aa says to compile in ANSI mode):
$ cc -Aa sumnum.c
The compiler displays status, warning, and error messages to standard error output (stderr). If no errors occur, the compiler creates an executable file named a.out in the current working directory. If your PATH environment variable includes the current working directory, you can run a.out as follows:
$ a.out
Enter a number: 4
Sum 1 to 4: 10
The process is essentially the same for all HP-UX compilers. For instance, to compile and run a similar FORTRAN program named sumnum.f:
$ f77 sumnum.f        Compile and link sumnum.f.
    ...               The compiler displays any messages here.
$ a.out               Run the program.
    ...               Output from the program is displayed here.
Program source can also be divided among separate files. For example, sumnum.c could be divided into two files: main.c, containing the main program, and func.c, containing the function sum_n. The command for compiling the two together is:
$ cc -Aa main.c func.c
main.c:
func.c:
Notice that cc displays the name of each source file it compiles. This way, if errors occur, you know where they occur.
#include <stdio.h>              /* contains standard I/O defs */
int     sum_n( int n )           /* sum numbers from n to 1    */
{
  int   sum = 0;                 /* running total; initially 0 */
  for (; n >= 1; n--)            /* sum from n to 1            */
    sum += n;                   /* add n to sum               */
  return sum;                   /* return the value of sum    */
}
 
main()                          /* begin main program         */
{
  int   n;                      /* number to input from user  */
  printf("Enter a number: ");   /* prompt for number          */
  scanf("%d", &n);              /* read the number into n     */
  printf("Sum 1 to %d: %d\n", n, sum_n(n)); /* display the sum */
}
Generally speaking, the compiler reads one or more source files, one of which contains a main program, and outputs an executable a.out file, as shown in Figure 1: High-Level View of the Compiler.

Figure 1: High-Level View of the Compiler

Looking "inside" a Compiler

On the surface, it appears as though an HP-UX compiler generates an a.out file by itself. Actually, an HP-UX compiler is a driver that calls other commands to create the a.out file. The driver performs different tasks (or phases) for different languages, but two phases are common to all languages:
    For each source file, the driver calls the language compiler to create an object file. (See also What is an Object File?)

    Then, the driver calls the HP-UX linker (ld), which builds an a.out file from the object files. This is known as the link-edit phase of compilation. (See also Compiler-Linker Interaction.)

Figure 2: Looking "inside" a Compiler summarizes how a compiler driver works.

Figure 2: Looking "inside" a Compiler

The C, C++, FORTRAN, and Pascal compilers provide the -v (verbose) option to display the phases a compiler is performing. Compiling main.c and func.c with the -v option produced this output on a Series 700 workstation (\ at the end of a line indicates the line is continued to the next line):

$ cc -Aa -v main.c func.c -lm
cc: CCOPTS is not set.
main.c:
/opt/langtools/lbin/cpp.ansi main.c /var/tmp/ctmAAAa10102 \
  -D__hp9000s700 -D__hp9000s800 -D__hppa -D__hpux \
  -D__unix -D_PA_RISC1_1
cc: Entering Preprocessor.
/opt/ansic/lbin/ccom /var/tmp/ctmAAAa10102 main.o -O0 -Aa
func.c:
/opt/langtools/lbin/cpp.ansi func.c /var/tmp/ctmAAAa10102 \
  -D__hp9000s700 -D__hp9000s800 -D__hppa -D__hpux \
  -D__unix -D_PA_RISC1_1
cc: Entering Preprocessor.
/opt/ansic/lbin/ccom /var/tmp/ctmAAAa10102 func.o -O0 -Aa
cc: LPATH is /usr/lib/pa1.1:/usr/lib:/opt/langtools/lib:
/usr/ccs/bin/ld /opt/langtools/lib/crt0.o -u main main.o func.o \
  -lm -lc
cc: Entering Link editor.
This example shows that the cc driver calls the C preprocessor (/opt/langtools/lbin/cpp.ansi) for each source file, then calls the actual C compiler (/opt/ansic/lbin/ccom) to create the object files. Finally, the driver calls the linker (/usr/ccs/bin/ld) on the object files created by the compiler (main.o and func.o).

What is an Object File?

An object file is basically a file containing machine language instructions and data in a form that the linker can use to create an executable program. Each routine or data item defined in an object file has a corresponding symbol name by which it is referenced. A symbol generated for a routine or data definition can be either a local definition or global definition. Any reference to a symbol outside the object file is known as an external reference.

To keep track of where all the symbols and external references occur, an object file has a symbol table. The linker uses the symbol tables of all input object files to match up external references to global definitions.

Local Definitions

A local definition is a definition of a routine or data that is accessible only within the object file in which it is defined. Such a definition cannot be accessed from another object file. Local definitions are used primarily by debuggers, such as adb. More important for this discussion are global definitions and external references.

Global Definitions

A global definition is a definition of a procedure, function, or data item that can be accessed by code in another object file. For example, the C compiler generates global definitions for all variable and function definitions that are not static. The FORTRAN compiler generates global definitions for subroutines and common blocks. In Pascal, global definitions are generated for external procedures, external variables, and global data areas for each module.

External References

An external reference is an attempt by code in one object file to access a global definition in another object file. A compiler cannot resolve external references because it works on only one source file at a time. Therefore, the compiler simply places external references in an object file's symbol table; the matching of external references to global definitions is left to the linker or loader.

Compiler-Linker Interaction

As described in Looking "inside" a Compiler, the compilers automatically call ld to create an executable file. To see how the compilers call ld, run the compiler with the -v (verbose) option. For example, compiling a C program in 32-bit mode produces the output below:
$ cc -Aa -v main.c func.c -lm
cc: CCOPTS is not set.
main.c:
/opt/langtools/lbin/cpp.ansi main.c /var/tmp/ctmAAAa10102 \
  -D__hp9000s700 -D__hp9000s800 -D__hppa -D__hpux \
  -D__unix -D_PA_RISC1_1
cc: Entering Preprocessor.
/opt/ansic/lbin/ccom /var/tmp/ctmAAAa10102 main.o -O0 -Aa
func.c:
/opt/langtools/lbin/cpp.ansi func.c /var/tmp/ctmAAAa10102 \
  -D__hp9000s700 -D__hp9000s800 -D__hppa -D__hpux \
  -D__unix -D_PA_RISC1_1
cc: Entering Preprocessor.
/opt/ansic/lbin/ccom /var/tmp/ctmAAAa10102 func.o -O0 -Aa
cc: LPATH is /usr/lib/pa1.1:/usr/lib:/opt/langtools/lib:
/usr/ccs/bin/ld /opt/langtools/lib/crt0.o -u main main.o \
func.o -lm -lc
cc: Entering Link editor.
The next-to-last line in the above example is the command line the compiler used to invoke the 32-bit mode linker, /usr/ccs/bin/ld. In this command, ld combines a startup file (crt0.o) and the two object files created by the compiler (main.o and func.o). Also, ld searches the libm and libc libraries.

In 64-bit mode, the startup functions are handled by the dynamic loader, dld.sl. In most cases, the ld command line does not include crt0.o.


NOTE  If you are linking any C++ object files to create an executable or a shared library, you must use the CC command to link. This ensures that c++patch executes and chains together your nonlocal static constructors and destructors. If you use ld, the library or executable may not work correctly and you may not get any error messages. For more information see the HP C++ Programmer's Guide.

Linking Programs on HP-UX

The HP-UX linker, ld, produces a single executable file from one or more input object files and libraries. In doing so, it matches external references to global definitions contained in other object files or libraries. It revises code and data to reflect new addresses, a process known as relocation. If the input files contain debugger information, ld updates this information appropriately. The linker places the resulting executable code in a file named, by default, a.out.

In the C program example (see Compiling Programs on HP-UX: An Example), main.o contains an external reference to sum_n, which has a global definition in func.o. ld matches the external reference to the global definition, allowing the main program code in a.out to access sum_n (see Figure 3: Matching the External Reference to sum_n).


Figure 3: Matching the External Reference to sum_n

If ld cannot match an external reference to a global definition, it displays a message to standard error output. If, for instance, you compile main.c without func.c, ld cannot match the external reference to sum_n and displays this output:

$ cc -Aa main.c
/usr/ccs/bin/ld: Unsatisfied symbols:
   sum_n (code)

The crt0.o Startup File

Notice in the example in Compiler-Linker Interaction that the first object file on the linker command line is /opt/langtools/lib/crt0.o, even though this file was not specified on the compiler command line. This file, known as a startup file, contains the program's entry point, that is, the location at which the program starts running after HP-UX loads it into memory to begin execution. The startup code does such things as making command line arguments available to the program at run time and activating the dynamic loader (dld.sl(5)) to load any required shared libraries. In the C language, it also calls the routine _start in libc which, in turn, calls the main program as a function.

The 64-bit linker uses the startup file, /opt/langtools/lib/pa_64/crt0.o, when:

If the -p profiling option is specified on the 32-bit mode compile line, the compilers link with mcrt0.o instead of crt0.o. If the -G profiling option is specified, the compilers link with gcrt0.o. In 64-bit mode with the -p option, the linker adds -lprof before the -lc option. With the -G option, the linker adds -lgprof.

If the linker option -I is specified to create an executable file with profile-based optimization, in 32-bit mode icrt0.o is used, and in 64-bit mode the linker inserts /usr/ccs/lib/pa20_64/fdp_init.o. If the linker options -I and -b are specified to create a shared library with profile-based optimization, in 32-bit mode scrt0.o is used, and in 64-bit mode the linker inserts /usr/ccs/lib/pa20_64/fdp_init_sl.o. In 64-bit mode, the linker uses the single 64-bit crt0.o to support these options.

For details on startup files, see crt0(3).

The Program's Entry Point

In 32-bit mode and in 64-bit statically-bound (-noshared) executables, the entry point is the location at which execution begins in the a.out file. The entry point is defined by the symbol $START$ in crt0.o.

In 64-bit mode for dynamically bound executables, the entry point is defined by the symbol $START$ in the dynamic loader (dld.sl).

The a.out File

The information contained in the resulting a.out file depends on which architecture the file was created on and what options were used to link the program. In any case, an executable a.out file contains information that HP-UX needs when loading and running the file, for example: Is it a shared executable? Does it reference shared libraries? Is it demand-loadable? Where do the code (text), data, and bss (uninitialized data) segments reside in the file? For details on the format of this file, see a.out(4).

Magic Numbers

In 32-bit mode, the linker records a magic number with each executable program that determines how the program should be loaded. There are three possible values for an executable file's magic number:
SHARE_MAGIC

The program's text (code) can be shared by processes; its data cannot be shared. The first process to run the program loads the entire program into virtual memory. If the program is already loaded by another process, then a process shares the program text with the other process.
DEMAND_MAGIC

As with SHARE_MAGIC, the program's text is shareable but its data is not. However, the program's text is loaded only as needed - that is, only as the pages are accessed. This can improve process startup time since the entire program does not need to be loaded; however, it can degrade performance throughout execution.
EXEC_MAGIC

Neither the program's text nor data is shareable. In other words, the program is an unshared executable. Usually, it is not desirable to create such unshared executables because they place greater demands on memory resources.
By default, the linker creates executables whose magic number is SHARE_MAGIC. The following table shows which linker option to use to set the magic number explicitly.

Table 1: 32-bit Mode Magic Number Linker Options
To set the magic number to:    Use this option:
SHARE_MAGIC                    -n
DEMAND_MAGIC                   -q
EXEC_MAGIC                     -N

An executable file's magic number can also be changed using the chatr command (see Changing a Program's Attributes with chatr(1)). However, chatr can only toggle between SHARE_MAGIC and DEMAND_MAGIC; it cannot be used to change from or to EXEC_MAGIC. This is because the file format of SHARE_MAGIC and DEMAND_MAGIC is exactly the same, while EXEC_MAGIC files have a different format. For details on magic numbers, refer to magic(4).

In 64-bit mode, the linker sets the magic number to the predefined type for ELF object files (\177ELF). The value of the E_TYPE in the ELF object file specifies how the file should be loaded.

File Permissions

If no linker errors occur, the linker gives the a.out file read/write/execute permissions to all users (owner, group, and other). If errors occurred, the linker gives read/write permissions to all users. Permissions are further modified if the umask is set (see umask(1)). For example, on a system with umask set to 022, a successful link produces an a.out file with read/write/execute permissions for the owner, and read/execute permissions for group and other:
$ umask
022
$ ls -l a.out
-rwxr-xr-x   1 michael  users      74440 Apr  4 14:38 a.out

Linking with Libraries

In addition to matching external references to global definitions in object files, ld matches external references to global definitions in libraries. A library is a file containing object code for subroutines and data that can be used by other programs. For example, the standard C library, libc, contains object code for functions that can be used by C, C++, FORTRAN, and Pascal programs to do input, output, and other standard operations.

Library Naming Conventions

By convention, library names have the form:

libname.suffix

name is a string of one or more characters that identifies the library.
suffix is .a if the library is an archive library or .sl if the library is a shared library. If library-level versioning is being used, suffix is a number, for example .0, .1, and so forth.
Typically, library names are referred to without the suffix. For instance, the standard C library is referred to as libc.

Default Libraries

A compiler driver automatically specifies certain default libraries when it invokes ld. For example, cc automatically links in the standard library libc, as shown by the -lc option to ld in this example:
$ cc -Aa -v main.c func.c
    ...
/usr/ccs/bin/ld /opt/langtools/lib/crt0.o -u main main.o \
func.o -lc
cc: Entering Link editor.
Similarly, the Series 700/800 FORTRAN compiler automatically links with the libcl (C interface), libisamstub (ISAM file I/O), and libc libraries:
$ f77 -v sumnum.f
   ...
/usr/ccs/bin/ld -x /opt/langtools/lib/crt0.o \
 sumnum.o -lcl -lisamstub -lc

The Default Library Search Path

By default, ld searches for libraries in the directory /usr/lib. (If the -p or -G compiler profiling option is specified on the command line, the compiler directs the linker to also search /usr/lib/libp.) The default order can be overridden with the LPATH environment variable or the -L linker option. LPATH and -L are described in Changing the Default Library Search Path with -L and LPATH.

Link Order

The linker searches libraries in the order in which they are specified on the command line - the link order. Link order is important in that a library containing an external reference to another library must precede the library containing the definition. This is why libc is typically the last library specified on the linker command line: because the other libraries preceding it in the link order often contain references to libc routines and so must precede it.

NOTE  If multiple definitions of a symbol occur in the specified libraries, ld does not necessarily choose the first definition. It depends on whether the program is linked with archive libraries, shared libraries, or a combination of both. Depending on link order to resolve such library definition conflicts is risky because it relies on undocumented linker behavior that may change in future releases. (See also Caution When Mixing Shared and Archive Libraries.)

Running the Program

An executable file is created after the program has been compiled and linked. The next step is to run or load the program.

Loading Programs: exec

When you run an executable file created by ld, the program is loaded into memory by the HP-UX program loader, exec. This routine is actually a system call and can be called by other programs to load a new program into the current process space. The exec function performs many tasks; some of the more important ones are:

For details on exec, see the exec(2) page in the HP-UX Reference.

Binding Routines to a Program

Since shared library routines and data are not actually contained in the a.out file, the dynamic loader must attach the routines and data to the program at run time. Attaching a shared library entails mapping the shared library code and data into the process's address space, relocating any pointers in the shared library data that depend on actual virtual addresses, allocating the bss segment, and binding routines and data in the shared library to the program.

The dynamic loader binds only those symbols that are reachable during the execution of the program. This is similar to how archive libraries are treated by the linker; namely, ld pulls in an object file from an archive library only if the object file is needed for program execution.

Deferred Binding is the Default

To accelerate program startup time, routines in a shared library are not bound until referenced. (Data items are always bound at program startup.) This deferred binding of shared library routines distributes the overhead of binding across the execution time of the program and is especially expedient for programs that contain many references that are not likely to be executed. In essence, deferred binding is similar to demand-loading.

Linker Thread-Safe Features

Beginning with the HP-UX 10.30 release, the dynamic loader (dld.sl) and its application interface library (libdld.sl) are thread-safe.

Also, beginning with the HP-UX 10.30 release, the linker toolset provides thread local storage support in:

Thread local storage (also called thread-specific data) is data specific to a thread. Each thread has its own copy of the data item.

NOTE  A program with thread local storage is only supported on systems running HP-UX 10.30 or later versions of the operating system.


NOTE  Use of the __thread keyword in a shared library prevents that shared library from being dynamically loaded, that is, loaded by an explicit call to shl_load().

For More Information:

Pthread stubs in HP C library

Problem

On HP-UX, if a non-threaded application links to a thread-safe library, calls to thread-safe routines from the application fail at run time because of unresolved symbols such as pthread_*. Linking the non-threaded application with the threads library, libpthread, resolves the symbols. However, linking with libpthread can cost the application considerable performance even if it creates no threads.

Resolution

Because of the problems stated above, HP has implemented POSIX 1c thread stubs. Providing POSIX 1c thread stubs in the HP-UX C language library has two direct effects for non-threaded applications: a) POSIX 1c thread symbols are resolved if a non-threaded application links to a thread-safe library. b) The overhead of a real thread library, especially the overhead associated with mutexes, is avoided when the non-threaded application uses the thread stubs rather than the real thread library procedures.

As of the libc cumulative patches PHCO_22923 (11.00) and PHCO_23772 (11.11), the libc shared library contains stubs for the pthread_* functions in libpthread and libcma. The stubs allow non-threaded applications to dynamically load thread-safe libraries successfully, so that the pthread symbols are resolved. Applications that resolve pthread/cma calls to the stubs must be built without -lpthread or -lcma on the link line.

The stubs provided in libc have no functionality; they are dummy functions that return zero. The exception is the pthread_getspecific() family of APIs, which is fully implemented in the stubs.

List of pthread calls for which the stubs are provided in the C library is given below.

A call to any of the above stubs returns zero.

Exceptions

The stubs for the following pthread calls have full functionality. More information about their functionality is in the pthread(3T) man pages.

Calls to the stub,

Link Order problems:

An application may inadvertently pick up the stubs in libc when it is intended to use the real pthread or cma APIs. These are link order problems. An application that needs cma behavior must link to libcma and must do so in the supported link order; that is, the link line must use only shared libraries and must not contain -lc before -lcma.

So long as this condition is met, the correct cma functions will be referenced. Similarly, a multithreaded application that needs pthread library behavior must link to libpthread and must do so in a supported link order, and only use shared libc and libpthread.

Examples of Potential link order problems:

Example 1

Applications, and any libraries linked with them, that are meant to resolve pthread/cma calls to the stubs must be built without -lpthread or -lcma on the link line. If you specify -lc before -lpthread, your application will use the pthread stubs in libc, but other problems may occur, as shown in the examples below:

$ cat thread.c
#include <pthread.h>
#include <stdio.h>

void *thread_nothing(void *p)
{
    printf("Success\n");
    return NULL;
}

int main()
{
    int err;
    pthread_t thrid;

    err = pthread_create(&thrid, (pthread_attr_t *) NULL, thread_nothing,
                         (void *) NULL);
    sleep(1);
    if (err) {
      printf("Error\n");
      return err;
    }
}

$ cc thread.c -lc -lpthread
$ a.out
Error

$ chatr a.out
a.out:
         shared executable
         shared library dynamic path search:
             SHLIB_PATH     disabled  second
             embedded path  disabled  first  Not Defined
         shared library list:
             dynamic   /usr/lib/libc.2  <- libc before libpthread
             dynamic   /usr/lib/libpthread.1
         shared library binding:
             deferred
         global hash table disabled

$ cc thread.c -lpthread
$ a.out
Success
$ chatr a.out

a.out:
         shared executable
         shared library dynamic path search:
             SHLIB_PATH     disabled  second
             embedded path  disabled  first  Not Defined
         shared library list:
             dynamic   /usr/lib/libpthread.1
             dynamic   /usr/lib/libc.2
         shared library binding:
             deferred
         global hash table disabled ...

Example 2

Specifying -lc before -lpthread in threaded applications can cause run-time problems, as in the following example, because the pthread/cma stubs are resolved instead of the real pthread/cma functions:

$ cat a.c
     #include <stdio.h>
     #include <dl.h>

     extern int errno;

     main()
     {
             shl_load("/usr/lib/librt.2", BIND_DEFERRED, 0);
             printf("Error %d, %s\n", errno, strerror(errno));
     }
$ cc a.c -lc -lpthread
$ a.out
Error 22, Invalid argument
$ LD_PRELOAD=/usr/lib/libpthread.1 ./a.out
Error 0, Error 0
$ cat  b.c
     #include <stdio.h>
     #include <dlfcn.h>

     void* handle;
     extern int errno;

     main()
     {
             handle = dlopen("lib_not_found", RTLD_LAZY);
             printf("Error %d, %s\n", errno, strerror(errno));
             if (handle == NULL) {
               printf("Error: %s\n",dlerror());
             }
     }
$ cc b.c -lc -lpthread
$ a.out
Error 22, Invalid argument
Error:
$ LD_PRELOAD=/usr/lib/libpthread.1 ./a.out
Error 0, Error 0
Error: Can't open shared library: lib_not_found

Therefore, -lc should never be specified in the build command of an executable or shared library.

By default, the compiler drivers (cc, aCC, f90) automatically pass -lc to the linker at the end of the link line of an executable. It is not necessary to specify -lc when building a shared library, because the library's references to libc are resolved by the -lc that the compiler driver passes to the linker when the executable is built.

To see whether a shared library was built with -lc, look at the shared library list in the chatr(1) output, or list the dependent libraries with ldd(1):

$ cc +z -c lib1.c
$ ld -b -o lib1.sl lib1.o -lc
$ ldd lib1.sl
        /usr/lib/libc.2 =>      /usr/lib/libc.2
        /usr/lib/libdld.2 =>    /usr/lib/libdld.2
        /usr/lib/libc.2 =>      /usr/lib/libc.2

$ cc +DA2.0W +z -c lib1.c
$ ld -b -o lib1.sl lib1.o -lc
$ ldd lib1.sl
        libc.2 =>       /lib/pa20_64/libc.2
        libdl.1 =>      /usr/lib/pa20_64/libdl.1

To see the order in which dependent shared libraries will be loaded at run time, use ldd(1) on the executable. The displayed order is valid only in 64-bit mode; in 32-bit mode, ldd displays the libraries in the reverse of the order in which they are loaded:

$ cc +DA2.0W thread.c -lpthread
$ ldd a.out
        libpthread.1 => /usr/lib/pa20_64/libpthread.1
        libc.2 =>       /usr/lib/pa20_64/libc.2
        libdl.1 =>      /usr/lib/pa20_64/libdl.1
$ cc +DA2.0W thread.c -lc -lpthread
$ ldd a.out
        libc.2 =>       /usr/lib/pa20_64/libc.2
        libpthread.1 => /usr/lib/pa20_64/libpthread.1
        libdl.1 =>      /usr/lib/pa20_64/libdl.1
$ cc +DA2.0W thread.c -lpthread -lc
$ ldd a.out
        libpthread.1 => /usr/lib/pa20_64/libpthread.1
        libc.2 =>       /usr/lib/pa20_64/libc.2
        libdl.1 =>      /usr/lib/pa20_64/libdl.1
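
The ordering rule shown by the ldd output above can also be checked mechanically. The following C sketch takes library names in load order (for example, copied from 64-bit ldd output) and reports whether libc comes before libpthread. The function names base_name and libc_before_libpthread are hypothetical helpers, not part of any HP-UX library.

```c
#include <string.h>

/* Return the part of path after the last '/', or path itself. */
static const char *base_name(const char *path)
{
    const char *slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}

/*
 * Return 1 if a libc.* entry appears before the first libpthread.*
 * entry in libs[] (given in load order), 0 otherwise. A list with
 * no libpthread entry also yields 0.
 */
int libc_before_libpthread(const char *const libs[], int count)
{
    int seen_libc = 0;
    int i;

    for (i = 0; i < count; i++) {
        const char *name = base_name(libs[i]);

        if (strncmp(name, "libpthread.", 11) == 0)
            return seen_libc;
        if (strncmp(name, "libc.", 5) == 0)
            seen_libc = 1;
    }
    return 0;
}
```

A result of 1 corresponds to the problem case, in which a threaded application would resolve pthread calls to the stubs in libc.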

Recommendations:

Example 1 (64-bit):

If a 64-bit shared library is built with -lpthread but the executable is not, libc is loaded before libpthread (due to breadth-first searching), and the pthread calls are resolved to the pthread stubs in libc. At run time, after a.out is loaded, the dependencies of a.out are loaded in breadth-first order: libc is loaded as a dependent of a.out before libpthread is loaded as a dependent of lib2.sl. The dependency list of the first case is:

                     a.out
                  /     /   \
                lib1  lib2  libc
                  |     |
                libc  libpthread
Therefore the load graph is constructed as:
  lib1.sl --> lib2.sl -->libc.2 --> libpthread.1

This is the desired behavior for non-threaded applications, but causes threaded applications (that use either libpthread or libcma) to fail.
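
The breadth-first ordering described above can be reproduced with a small graph walk. The sketch below is a simplified model, not the loader's actual algorithm; struct module, MAX_LIBS, and load_order are hypothetical names, and a library that is reachable more than once is listed only the first time it is seen.

```c
#include <string.h>

#define MAX_LIBS 16

/* Hypothetical in-memory description of one load module. */
struct module {
    const char *name;
    int deps[MAX_LIBS];     /* indices into the module table */
    int ndeps;
};

/*
 * Breadth-first walk starting at mods[root]. Writes the indices of
 * the dependent libraries into order[] in first-seen order and
 * returns how many were written (the root is not included).
 * Returns -1 if the arguments are out of range.
 */
int load_order(const struct module *mods, int nmods, int root, int *order)
{
    int queue[MAX_LIBS];
    int seen[MAX_LIBS];
    int head = 0, tail = 0, count = 0, i;

    if (nmods > MAX_LIBS || root < 0 || root >= nmods)
        return -1;
    memset(seen, 0, sizeof seen);
    seen[root] = 1;
    queue[tail++] = root;

    while (head < tail) {
        const struct module *m = &mods[queue[head++]];

        for (i = 0; i < m->ndeps; i++) {
            int d = m->deps[i];

            if (!seen[d]) {
                seen[d] = 1;
                order[count++] = d;
                queue[tail++] = d;
            }
        }
    }
    return count;
}
```

With a.out depending on lib1, lib2, and libc, lib1 depending on libc, and lib2 depending on libpthread, the walk visits lib1, lib2, libc, and then libpthread, matching the load graph above.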

# lib1.sl specifies -lc; lib2.sl specifies -lpthread; no -lpthread on a.out

$ cc -c +z +DA2.0W lib1.c lib2.c
lib1.c:
lib2.c:
$ ld -b -o lib1.sl -lc lib1.o
$ ld -b -o lib2.sl -lpthread lib2.o
$ cc +DA2.0W thread.c -L. -l1 -l2
$ a.out
Error
$ ldd a.out
        lib1.sl =>      ./lib1.sl
        lib2.sl =>      ./lib2.sl
        libc.2 =>       /usr/lib/pa20_64/libc.2
        libc.2 =>       /lib/pa20_64/libc.2
        libpthread.1 => /lib/pa20_64/libpthread.1
        libdl.1 =>      /usr/lib/pa20_64/libdl.1

# lib2.sl specifies -lpthread; no -lpthread on a.out
$ ld -b -o lib1.sl lib1.o
$ ld -b -o lib2.sl -lpthread lib2.o
$ cc +DA2.0W thread.c -L. -l1 -l2
$ a.out
Error
$ ldd a.out
        lib1.sl =>      ./lib1.sl
        lib2.sl =>      ./lib2.sl
        libc.2 =>       /usr/lib/pa20_64/libc.2
        libpthread.1 => /lib/pa20_64/libpthread.1
        libdl.1 =>      /usr/lib/pa20_64/libdl.1

The same problem occurs if -lcma is listed as a dependent library of a shared library; in that case, you would need to link the executable with -lcma.

Solution:

For threaded applications, run the executable with LD_PRELOAD set to the libpthread library, or link the executable with -lpthread:

# use LD_PRELOAD to load libpthread first
$ ld -b -o lib1.sl lib1.o
$ ld -b -o lib2.sl -lpthread lib2.o
$ cc +DA2.0W thread.c -L. -l1 -l2
$ a.out
Error
$ ldd a.out
        lib1.sl =>      ./lib1.sl
        lib2.sl =>      ./lib2.sl
        libc.2 =>       /usr/lib/pa20_64/libc.2
        libpthread.1 => /lib/pa20_64/libpthread.1
        libdl.1 =>      /usr/lib/pa20_64/libdl.1
$ LD_PRELOAD="/lib/pa20_64/libpthread.1" a.out
Success

# a.out correctly lists -lpthread for a threaded application
$ ld -b -o lib1.sl lib1.o
$ ld -b -o lib2.sl -lpthread lib2.o
$ cc +DA2.0W thread.c -L. -l1 -l2 -lpthread
$ a.out
Success
$ ldd a.out
        lib1.sl =>      ./lib1.sl
        lib2.sl =>      ./lib2.sl
        libpthread.1 => /usr/lib/pa20_64/libpthread.1
        libc.2 =>       /usr/lib/pa20_64/libc.2
        libpthread.1 => /lib/pa20_64/libpthread.1
        libdl.1 =>      /usr/lib/pa20_64/libdl.1

Example 2 (archived libc):

If the link line of your shared library contains -lc to explicitly link in libc, remove -lc. Otherwise, shared libraries may reference libc.2 while a.out references some older (archived) libc version. The application would then be using two different versions of libc and possibly mixing code, which may cause compatibility problems. Basically, an application or library should never link directly against libc. All programs must be linked against libc (which the compiler does automatically), so a shared library will always have the interfaces it needs to execute properly without specifying -lc on the link line.


Linker Tasks

You have a great deal of control over how the linker links your program or library by using ld command-line options.

Using the Compiler to Link

In many cases, you use your compiler command to compile and link programs. Your compiler uses options that directly affect the linker.

Changing the Default Library Search Path with -Wl, -L

By default, the linker searches the directories /usr/lib and /usr/ccs/lib for libraries specified with the -l compiler option. (If the -p or -G compiler option is specified, then the linker also searches the profiling library directory /usr/lib/libp.)

The -Llibpath option to ld augments the default search path; that is, it causes ld to search the specified libpath before the default places. The C compiler (cc), the C++ compiler (CC), the POSIX FORTRAN compiler (fort77), and the HP Fortran 90 compiler (f90) recognize the -L option and pass it directly to ld. However, the HP FORTRAN compiler (f77) and Pascal compiler (pc) do not recognize -L; it must be passed to ld with the -Wl option.

Example Using -Wl, -L

For example, to make the f77 compiler search /usr/local/lib to find a locally developed library named liblocal, use this command line:

$ f77 prog.f -Wl,-L,/usr/local/lib -llocal

(The f77 compiler searches /opt/fortran/lib and /usr/lib as default directories.)

To make the f90 compiler search /usr/local/lib to find a locally developed library named liblocal, use this command line:

$ f90 prog.f90 -L/usr/local/lib -llocal

(The f90 compiler searches /opt/fortran90/lib and /usr/lib as default directories.)

For the C compiler, use this command line:

$ cc -Aa prog.c -L /usr/local/lib -llocal

The LPATH environment variable provides another way to override the default search path. For details, see Changing the Default Library Search Path with -L and LPATH.

Getting Verbose Output with -v

The -v option makes a compiler display verbose information. This is useful for seeing how the compiler calls ld. For example, using the -v option with the Pascal compiler shows that it automatically links with libcl, libm, and libc.
$ pc -v prog.p
/opt/pascal/lbin/pascomp prog.p prog.o -O0
LPATH = /usr/lib/pa1.1:/usr/lib:/opt/langtools/lib
/usr/ccs/bin/ld /opt/langtools/lib/crt0.o -z prog.o -lcl -lm -lc
unlink prog.o

Passing Linker Options from the Compiler Command with -Wl

The -Wl option passes options and arguments to ld directly, without the compiler interpreting the options. Its syntax is:

-Wl,arg1 [,arg2]...

where each argn is an option or argument passed to the linker. For example, to make ld use the archive version of a library instead of the shared, you must specify -a archive on the ld command line before the library.

Example Using -Wl

The command for telling the linker to use an archive version of libm from the C command line is:
$ cc -Aa mathprog.c -Wl,-a,archive,-lm,-a,default
The command for telling the linker to use an archive version of libm from the ld command line is:
$ ld /opt/langtools/lib/crt0.o mathprog.o -a archive -lm \
  -a default -lc
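
To make the -Wl syntax concrete, the following C sketch splits a -Wl,arg1,arg2,... string into the separate arguments the linker would receive. It only illustrates the comma-separated form described above and is not the compiler driver's code; split_wl and WL_PREFIX are hypothetical names.

```c
#include <string.h>

#define WL_PREFIX "-Wl,"

/*
 * Split "-Wl,arg1,arg2,..." in place into separate linker arguments.
 * Commas in opt are overwritten with '\0' and argv[] receives
 * pointers into opt. Returns the number of arguments stored, or -1
 * if opt does not start with "-Wl,".
 */
int split_wl(char *opt, char *argv[], int max_args)
{
    size_t plen = strlen(WL_PREFIX);
    int argc = 0;
    char *p;

    if (strncmp(opt, WL_PREFIX, plen) != 0)
        return -1;

    p = opt + plen;
    while (*p != '\0' && argc < max_args) {
        char *comma = strchr(p, ',');

        argv[argc++] = p;
        if (comma == NULL)
            break;              /* last argument */
        *comma = '\0';
        p = comma + 1;
    }
    return argc;
}
```

Applied to the option in the cc example above, the function yields the five arguments -a, archive, -lm, -a, and default, in the same order as on the ld command line.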

Renaming the Output File with -o

The -oname option causes ld to name the output file name instead of a.out. For example, to compile a C program prog.c and name the resulting file sum_num:
$ cc -Aa -o sum_num prog.c           Compile using -o option.  
$ sum_num                            Run the program.                    
Enter a number to sum: 5
The sum of 1 to 5: 15

Specifying Libraries with -l

Sometimes programs call routines not contained in the default libraries. In such cases you must explicitly specify the necessary libraries on the compile line with the -l option. The compilers pass -l options directly to the linker before the default libraries.

For example, if a C program calls library routines in the curses library (libcurses), you must specify -lcurses on the cc command line:

$ cc -Aa -v cursesprog.c -lcurses
    ...
/usr/ccs/bin/ld /opt/langtools/lib/crt0.o -u main \
 cursesprog.o -lcurses -lc
cc: Entering Link editor.

Linking with the crt0.o Startup File in 32-bit mode

Notice also, in the above example, that the compiler linked cursesprog.o with the file /opt/langtools/lib/crt0.o. This file contains object code that performs tasks which must be executed when a program starts running - for example, retrieving any arguments specified on the command line when the program is invoked. For details on this file, see crt0(3) and The crt0.o Startup File.

Suppressing the Link-Edit Phase with -c

The -c compiler option suppresses the link-edit phase. That is, the compiler generates only the .o files and not the a.out file. This is useful when compiling source files that contain only subprograms and data. These may be linked later with other object files, or placed in an archive or shared library. The resulting object files can then be specified on the compiler command line, just like source files. For example:
$ f77 -c func.f             Produce .o for func.f.
$ ls func.o                 Verify that func.o was created.
func.o
$ f77 main.f func.o         Compile main.f with func.o.
$ a.out                     Run it to verify it worked.

Using Linker commands

This section describes linker commands for the 32-bit and 64-bit linker.

NOTE  Unless otherwise noted, all examples show 32-bit behavior.

Linking with the 32-bit crt0.o Startup File

In 32-bit mode, you must always include crt0.o on the link line.

In 64-bit mode, you must include crt0.o on the link line for all fully archive links (ld -noshared) and in compatibility mode (+compat). You do not need to include the crt0.o startup file on the ld command line for shared bound links. In 64-bit mode, the dynamic loader, dld.sl, does some of the startup duties previously done by crt0.o.

See The crt0.o Startup File, and the crt0(3) man page for more information.

Changing the Default Library Search Path with -L and LPATH

You can change or override the default linker search path by using the LPATH environment variable or the -L linker option.

Overriding the Default Linker Search Path with LPATH

The LPATH environment variable allows you to specify which directories ld should search. If LPATH is not set, ld searches the default directory /usr/lib. If LPATH is set, ld searches only the directories specified in LPATH; the default directories are not searched unless they are specified in LPATH.

If set, LPATH should contain a list of colon-separated directory path names ld should search. For example, to include /usr/local/lib in the search path after the default directories, set LPATH as follows:

$ LPATH=/usr/lib:/usr/local/lib     Korn and Bourne shell syntax.
$ export LPATH

Augmenting the Default Linker Search Path with -L

The -L option to ld also allows you to add additional directories to the search path. If -L libpath is specified, ld searches the libpath directory before the default places.

For example, suppose you have a locally developed version of libc, which resides in the directory /usr/local/lib. To make ld find this version of libc before the default libc, use the -L option as follows:

$ ld /opt/langtools/lib/crt0.o prog.o -L /usr/local/lib -lc
Multiple -L options can be specified. For example, to search /usr/contrib/lib and /usr/local/lib before the default places:
$ ld /opt/langtools/lib/crt0.o prog.o -L /usr/contrib/lib  \
-L /usr/local/lib -lc
If LPATH is set, then the -L option specifies the directories to search before the directories specified in LPATH.
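
The interaction of -L and LPATH can be summarized as a small model in C: directories given with -L come first, followed by the LPATH directories if LPATH is set, or by the default directories otherwise. This is a sketch of the rules stated in this section, not the linker's implementation. build_search_path is a hypothetical name, DEFAULT_DIRS is an assumption taken from the defaults named earlier in this chapter, and a NULL lpath argument stands for "LPATH is not set".

```c
#include <stdio.h>

/* Assumed defaults, taken from the directories named in the text. */
#define DEFAULT_DIRS "/usr/lib:/usr/ccs/lib"

/*
 * Build the colon-separated list of directories to search:
 * the -L directories first, then LPATH if it is set (non-NULL),
 * otherwise the default directories.
 * Returns 0 on success, -1 if out is too small.
 */
int build_search_path(const char *const ldirs[], int nldirs,
                      const char *lpath, char *out, size_t outsz)
{
    const char *tail = (lpath != NULL) ? lpath : DEFAULT_DIRS;
    size_t used = 0;
    int i, n;

    if (outsz == 0)
        return -1;
    out[0] = '\0';

    for (i = 0; i < nldirs; i++) {
        n = snprintf(out + used, outsz - used, "%s:", ldirs[i]);
        if (n < 0 || (size_t)n >= outsz - used)
            return -1;
        used += (size_t)n;
    }
    n = snprintf(out + used, outsz - used, "%s", tail);
    if (n < 0 || (size_t)n >= outsz - used)
        return -1;
    return 0;
}
```

Note that when LPATH is set, the default directories drop out of the result unless LPATH itself names them.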

Using $ORIGIN

You can use the $ORIGIN string in LD_LIBRARY_PATH, SHLIB_PATH, RUNPATH (the embedded path or RPATH), or in the path of a shared library in the shared library list. The loader determines the path of the current load module when the loader first encounters $ORIGIN, whether it is in LD_LIBRARY_PATH, SHLIB_PATH, RUNPATH, or the shared library name in the shared library list.

To add $ORIGIN to the environment variables LD_LIBRARY_PATH or SHLIB_PATH, just place $ORIGIN in the value of these environment variables. To add $ORIGIN to the RUNPATH, use the linker options +b or -L. To add $ORIGIN to the path of a shared library in the shared library list, use the linker option +origin.

+origin -lx
or
+origin shared_library_name
(You can only use the +origin option before the -l option or the name of a shared library.) The option causes the linker to add $ORIGIN before the shared library name in the shared library list and set the DF_ORIGIN flag for the output module. At run time, the dynamic loader determines the directory of the parent module (object module, shared library, or executable) and replaces $ORIGIN with that directory name. For example,
$ ld main.o +origin libx.sl -L -lc

NOTE  While the +origin option is available, the recommended way to specify $ORIGIN is in the embedded path with the +b option. For example,
$ ld main.o -lc +b \$ORIGIN
If you use +b \$ORIGIN, $ORIGIN affects only libraries that are subject to dynamic path lookup; that is, libraries specified with -l or with no embedded / character. If you use +origin shared_library_name, the library is located using $ORIGIN, which is recorded in the full library name.
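
The run-time substitution can be pictured with a short C sketch that replaces a leading $ORIGIN with the directory of the parent module. It models only the substitution described above, handles only a $ORIGIN at the start of the path, and is not the dynamic loader's code; expand_origin is a hypothetical name.

```c
#include <stdio.h>
#include <string.h>

/*
 * Expand a leading "$ORIGIN" in path, using the directory part of
 * parent_path (the module that references the library). A path
 * without a leading "$ORIGIN" is copied unchanged.
 * Returns 0 on success, -1 if out is too small or parent_path has
 * no directory part.
 */
int expand_origin(const char *path, const char *parent_path,
                  char *out, size_t outsz)
{
    static const char token[] = "$ORIGIN";
    size_t tlen = sizeof token - 1;
    int n;

    if (strncmp(path, token, tlen) != 0) {
        n = snprintf(out, outsz, "%s", path);
    } else {
        const char *slash = strrchr(parent_path, '/');

        if (slash == NULL)
            return -1;
        /* directory of the parent, then the rest of the path */
        n = snprintf(out, outsz, "%.*s%s",
                     (int)(slash - parent_path), parent_path,
                     path + tlen);
    }
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

For a parent module /opt/app/bin/main, the entry $ORIGIN/libx.sl expands to /opt/app/bin/libx.sl, so the library is found wherever the application directory is installed.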

Changing the Default Shared Library Binding with -B

You might want to force immediate binding - that is, force all routines and data to be bound at startup time. With immediate binding, the overhead of binding occurs only at program startup, rather than across the program's execution. One possibly useful characteristic of immediate binding is that it causes any possible unresolved symbols to be detected at startup time, rather than during program execution. Another use of immediate binding is to get better interactive performance, if you don't mind program startup taking a little longer.

Example Using -B immediate

To force immediate binding, link an application with the -B immediate linker option. For example, to force immediate binding of all symbols in the main program and in all shared libraries linked with it, you could use this ld command:
$ ld -B immediate /opt/langtools/lib/crt0.o prog.o -lc -lm

Nonfatal Shared Library Binding with -B nonfatal

The linker also supports nonfatal binding, which is useful with the -B immediate option. Like immediate binding, nonfatal immediate binding causes all required symbols to be bound at program startup. The main difference from immediate binding is that program execution continues even if the dynamic loader cannot resolve symbols. Compare this with immediate binding, where unresolved symbols cause the program to abort.

To use nonfatal binding, specify the -B nonfatal option along with the -B immediate option on the linker command line. The order of the options is not important, nor is the placement of the options on the line. For example, the following ld command uses nonfatal immediate binding:

$ ld /opt/langtools/lib/crt0.o prog.o -B nonfatal \
  -B immediate -lm -lc
Note that the -B nonfatal modifier does not work with deferred binding because a symbol must have been bound by the time a program actually references or calls it. A program attempting to call or access a nonexistent symbol is a fatal error.

Restricted Shared Library Binding with -B restricted

The linker also supports restricted binding, which is useful with the -B deferred and -B nonfatal options. The -B restricted option causes the dynamic loader to restrict the search for symbols to those that were visible when the library was loaded. If the dynamic loader cannot find a symbol within the restricted set, a run-time symbol binding error occurs and the program aborts.

The -B nonfatal modifier alters this behavior slightly: If the dynamic loader cannot find a symbol in the restricted set, it looks in the global symbol set (the symbols defined in all libraries) to resolve the symbol. If it still cannot find the symbol, then a run-time symbol-binding error occurs and the program aborts.

When is -B restricted most useful? Consider a program that creates duplicate symbol definitions by either of these methods:

If such a program is linked with -B immediate, references to symbols will be bound at program startup, regardless of whether duplicate symbols are created later by shl_load or shl_definesym.

But what happens when, to take advantage of the performance benefits of deferred binding, the same program is linked with -B deferred? If a duplicate, more visible symbol definition is created prior to referencing the symbol, it binds to the more visible definition, and the program might run incorrectly. In such cases, -B restricted is useful, because symbols bind the same way as they do with -B immediate, but actual binding is still deferred.

Improving Shared Library Performance with -B symbolic

The linker supports the -B symbolic option which optimizes call paths between procedures when building shared libraries. It does this by building direct internal call paths inside a shared library. In linker terms, import and export stubs are bypassed for calls within the library.

A benefit of -B symbolic is that it can help improve application performance and the resulting shared library will be slightly smaller. The -B symbolic option is useful for applications that make a lot of calls between procedures inside a shared library and when these same procedures are called by programs outside of the shared library.


NOTE  The -B symbolic option applies only to function, but not variable, references in a shared library.

Example Using -B symbolic

For example, to optimize the call path between procedures when building a shared library called lib1.sl, use -B symbolic as follows:
$ ld -B symbolic -b func1.o func2.o -o lib1.sl

NOTE  The +e option overrides the -B symbolic option. For example, if you use +e symbol, only symbol is exported and all other symbols are hidden. Similarly, if you use +ee symbol, symbol is exported, but other symbols exported by default remain visible.

Since all internal calls inside the shared library are resolved inside the shared library, user-supplied modules with the same name are not seen by routines inside the library. For example, you could not replace internal libc.sl malloc() calls with your own version of malloc() if libc.sl was linked with -B symbolic.


Comparing -B symbolic with -h and +e

Similar to the -h (hide symbol) and +e (export symbol) linker options, -B symbolic optimizes call paths in a shared library. However, unlike -h and +e, all functions in a shared library linked with -B symbolic are also visible outside of the shared library.

Case 1: Building a Shared Library with -B symbolic

Suppose you have two functions to place in a shared library, and convert_rtn() calls gal_to_liter().
    Build the shared library with -b. Optimize the call path inside the shared library with -B symbolic.
    $ ld -B symbolic -b convert.o volume.o -o libunits.sl
    Two main programs link to the shared library. main1 calls convert_rtn() and main2 calls gal_to_liter().
    $ cc -Aa main1.c libunits.sl -o main1
    $ cc -Aa main2.c libunits.sl -o main2
Figure 4: Symbols inside a Shared Library Visible with -B symbolic shows that a direct call path is established between convert_rtn() and gal_to_liter() inside the shared library. Both symbols are visible to outside callers.

Figure 4: Symbols inside a Shared Library Visible with -B symbolic

Case 2: Building a Shared Library with -h or +e

The -h (hide symbol) and +e (export symbol) options can also optimize the call path in a shared library for symbols that are explicitly hidden. However, only the exported symbols are visible outside of the shared library.

For example, you could hide the gal_to_liter symbol as shown:

$ ld -b convert.o -h gal_to_liter volume.o -o libunits.sl
or export the convert_rtn symbol:
$ ld -b +e convert_rtn convert.o volume.o -o libunits.sl
In both cases, main2 will not be able to resolve its reference to gal_to_liter() because only the convert_rtn() symbol is exported, as shown below in Figure 5: Symbol hidden in a Shared Library:

Figure 5: Symbol hidden in a Shared Library

Choosing Archive or Shared Libraries with -a

If both an archive and shared version of a particular library reside in the same directory, ld links with the shared version. Occasionally, you might want to override this behavior.

As an example, suppose you write an application that will run on a system on which shared libraries may not be present. Since the program could not run without the shared library, it would be best to link with the archive library, resulting in executable code that contains the required library routines. See also Caution When Mixing Shared and Archive Libraries.

Option Settings to -a

The -a option tells the linker what kind of library to link with. It applies to all libraries (-l options) until the end of the command line or until the next -a option. Its syntax is:
-a {archive | shared | default | archive_shared | shared_archive}
The different option settings are:
-a archive

Select archive libraries. If the archive library does not exist, ld generates an error message and does not generate the output file.
-a shared

Select shared libraries. If the shared library does not exist, ld generates an error message and does not generate the output file.
-a default

This is the same as -a shared_archive.
-a archive_shared

Select the archive library if it exists; otherwise, select the shared library. If the library cannot be found in either version, ld generates an error message and does not generate the output file.
-a shared_archive

Select the shared library if it exists; otherwise, select the archive library. If the library cannot be found in either version, ld generates an error message and does not generate the output file.
The -a shared and -a archive options specify only one type of library to use. An error results if that type is not found. The other three options specify a preferred type of library and an alternate type of library if the preferred type is not found.
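
The five settings reduce to a small decision table, sketched here in C. The enumerations and the choose_library function are hypothetical names used only to restate the rules above; LIB_NONE stands for the case in which ld reports an error and produces no output file.

```c
/* The five -a settings described above. */
enum lib_mode {
    MODE_ARCHIVE,
    MODE_SHARED,
    MODE_DEFAULT,
    MODE_ARCHIVE_SHARED,
    MODE_SHARED_ARCHIVE
};

/* Which version of the library gets linked, if any. */
enum lib_kind { LIB_NONE, LIB_ARCHIVE, LIB_SHARED };

/*
 * Pick the library version for one -l option, given the -a setting
 * in effect and which versions exist. LIB_NONE models the error case.
 */
enum lib_kind choose_library(enum lib_mode mode,
                             int have_archive, int have_shared)
{
    switch (mode) {
    case MODE_ARCHIVE:
        return have_archive ? LIB_ARCHIVE : LIB_NONE;
    case MODE_SHARED:
        return have_shared ? LIB_SHARED : LIB_NONE;
    case MODE_ARCHIVE_SHARED:
        if (have_archive)
            return LIB_ARCHIVE;
        return have_shared ? LIB_SHARED : LIB_NONE;
    case MODE_DEFAULT:          /* same as shared_archive */
    case MODE_SHARED_ARCHIVE:
        if (have_shared)
            return LIB_SHARED;
        return have_archive ? LIB_ARCHIVE : LIB_NONE;
    }
    return LIB_NONE;
}
```

Only MODE_ARCHIVE and MODE_SHARED can fail while a version of the library exists; the other three settings fall back to the alternate type.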

CAUTION  You should avoid mixing shared libraries and archive libraries in the same application. For more information see Caution When Mixing Shared and Archive Libraries.

Example Using -a

The following command links with the archive versions of libcurses, libm and libc:
$ ld /opt/langtools/lib/crt0.o prog.o -a archive -lcurses -lm -lc

Dynamic Linking with -A and -R

This section describes how to do dynamic linking - that is, how to add an object module to a running program. Conceptually, it is very similar to loading a shared library and accessing its symbols (routines and data). In fact, if you require such functionality, you should probably use shared library management routines (see Shared Library Management Routines).

However, be aware that dynamic linking is incompatible with shared libraries. That is, a running program cannot be linked to shared libraries and also use ld -A to dynamically load object modules.


NOTE  Another reason to use shared library management routines instead of dynamic linking is that dynamic linking may not be supported in a future release. See Linker Compatibility Warnings and Changes in Future Releases for additional future changes.

Topics in this section include:

Overview of Dynamic Linking

The implementation details of dynamic linking vary across platforms. To load an object module into the address space of a running program, and to be able to access its procedures and data, follow these steps on all HP 9000 computers:
    1. Determine how much space is required to load the module.

    2. Allocate the required memory and obtain its starting address.

    3. Link the module from the running application.

    4. Get information about the module's text, data, and bss segments from the module's header.

    5. Read the text and data into the allocated space.

    6. Clear (fill with zeros) the bss segment.

    7. Flush the text from the data cache before executing code from the loaded module.

    8. Get the addresses of routines and data that are referenced in the module.

Step 1: Determine how much space is required to load the module

There must be enough contiguous memory to hold the module's text, data, and bss segments. You can make a liberal guess as to how much memory is needed, and hope that you've guessed correctly. Or you can be more precise by pre-linking the module and getting size information from its header.

Step 2: Allocate the required memory and obtain its starting address

Typically, you use malloc(3C) to allocate the required memory. You must modify the starting address returned by malloc to ensure that it starts on a memory page boundary (address MOD 4096 == 0).
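
One way to obtain a page-aligned starting address is to allocate slightly more than needed and round the returned address up, as in the following sketch. The names MODULE_PAGE, page_align_up, and alloc_page_aligned are hypothetical; the 4096-byte page size is the value given above.

```c
#include <stdint.h>
#include <stdlib.h>

#define MODULE_PAGE 4096u       /* page size stated in the text */

/* Round addr up to the next multiple of the page size. */
uintptr_t page_align_up(uintptr_t addr)
{
    return (addr + MODULE_PAGE - 1) & ~(uintptr_t)(MODULE_PAGE - 1);
}

/*
 * Allocate room for size bytes that start on a page boundary.
 * *raw receives the pointer to pass to free() later; the return
 * value is the aligned address, or NULL if malloc fails.
 */
void *alloc_page_aligned(size_t size, void **raw)
{
    /* extra bytes cover the worst-case adjustment */
    *raw = malloc(size + MODULE_PAGE - 1);
    if (*raw == NULL)
        return NULL;
    return (void *)page_align_up((uintptr_t)*raw);
}
```

The aligned address, not the raw malloc result, is the one to pass to the linker in the next step; keep the raw pointer only for freeing the memory.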

Step 3: Link the module from the running application

Use the following options when invoking the linker from the program:
-omod_name

Name of the output module that will be loaded by the running program.
-Abase_prog

Tells the linker to prepare the output file for incremental loading. Also causes the linker to include symbol table information from base_prog in the output file.
-Rhex_addr

Specifies the hexadecimal address at which the module will be loaded. This is the address calculated in Step 2.
-N

Causes the data segment to be placed immediately after the text segment.
-eentry_pt

If specified (it is optional), causes the symbol named entry_pt to be the entry point into the module. The location of the entry point is stored in the module's header.
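
A running program typically assembles the ld command line as a string before invoking the linker. The sketch below shows one way to do that with the options listed above. build_ld_command is a hypothetical helper; the obj_file parameter (the object file to be loaded), the bare ld command name, and printing the -R address as hexadecimal digits without a 0x prefix are assumptions that should be checked against ld(1).

```c
#include <stdio.h>

/*
 * Format an ld command line for incremental loading.
 *   base_prog  running program whose symbols the module may use (-A)
 *   load_addr  page-aligned address from Step 2 (-R, in hexadecimal)
 *   entry_pt   optional entry point symbol (-e), or NULL
 *   obj_file   object file to load (assumed input)
 *   mod_name   output module name (-o)
 * Returns 0 on success, -1 if out is too small.
 */
int build_ld_command(char *out, size_t outsz,
                     const char *base_prog, unsigned long load_addr,
                     const char *entry_pt, const char *obj_file,
                     const char *mod_name)
{
    int n;

    if (entry_pt != NULL)
        n = snprintf(out, outsz, "ld -A %s -R %lx -N -e %s %s -o %s",
                     base_prog, load_addr, entry_pt, obj_file, mod_name);
    else
        n = snprintf(out, outsz, "ld -A %s -R %lx -N %s -o %s",
                     base_prog, load_addr, obj_file, mod_name);

    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

The resulting string could then be passed to system(3S) or a similar routine; that call is left out here because it depends on the program's environment.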

Step 4: Get information about the module's text, data, and bss segments from the module's header

There are two header structures stored at the start of the file: struct header (defined in <filehdr.h>) and struct som_exec_auxhdr (defined in <aouthdr.h>). The required information is stored in the second header, so to get it, a program must seek past the first header before reading the second one.

The useful members of the som_exec_auxhdr structure are:

.exec_tsize

Size of text (code) segment.
.exec_tmem

Address at which to load the text (already adjusted for offset specified by the -R linker option).
.exec_tfile

Offset into file (location) where text segment starts.
.exec_dsize

Size of data segment.
.exec_dmem

Address at which to load the data (already adjusted).
.exec_dfile

Offset into file (location) where data segment starts.
.exec_bsize

Size of bss segment. It is assumed to start immediately after the data segment.
.exec_entry

Address of entry point (if one was specified by the -e linker option).

Step 5: Read the text and data into the allocated space

Once you know the location of the required segments in the file, you can read them into the area allocated in Step 2.

The location of the text and data segments in the file is defined by the .exec_tfile and .exec_dfile members of the som_exec_auxhdr structure. The address at which to place the segments in the allocated memory is defined by the .exec_tmem and .exec_dmem members. The size of the segments to read in is defined by the .exec_tsize and .exec_dsize members.
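Reading one segment amounts to a seek followed by a read of a known size. The following is a minimal sketch with a hypothetical helper name, load_segment; it uses only standard C I/O and treats a short read as an error.

```c
#include <stdio.h>

/*
 * Seek to file_off in fp and read exactly size bytes into dest.
 * Returns 0 on success, or -1 on a seek failure or a short read.
 * (Hypothetical helper, not part of the example program.)
 */
int load_segment(FILE *fp, long file_off, void *dest, size_t size)
{
    if (size == 0)
        return 0;                          /* nothing to read */
    if (fseek(fp, file_off, SEEK_SET) != 0)
        return -1;
    return (fread(dest, 1, size, fp) == size) ? 0 : -1;
}
```

For the text segment, such a helper would be called with the .exec_tfile offset, the .exec_tmem address converted to a pointer, and the .exec_tsize length; the data segment uses the corresponding .exec_d* members.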

Step 6: Clear (zero out) the bss segment

The bss segment starts immediately after the data segment. To zero out the bss, find the end of the data segment and use memset (see memory(3C)) to write zeros over as many bytes as the bss occupies.

The end of the data segment can be determined by adding the .exec_dmem and .exec_dsize members of the som_exec_auxhdr structure. The bss's size is defined by the .exec_bsize member.
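The calculation can be sketched as follows. The helper name clear_bss is hypothetical; its three parameters stand for the .exec_dmem, .exec_dsize, and .exec_bsize values, with the data address already converted to a pointer.

```c
#include <string.h>

/*
 * Zero the bss, which begins where the data segment ends.
 *   data_mem  - load address of the data segment (.exec_dmem)
 *   data_size - size of the data segment         (.exec_dsize)
 *   bss_size  - size of the bss                  (.exec_bsize)
 * Returns the start address of the bss.
 * (Hypothetical helper, not part of the example program.)
 */
void *clear_bss(void *data_mem, size_t data_size, size_t bss_size)
{
    char *bss_start = (char *) data_mem + data_size;

    memset(bss_start, 0, bss_size);
    return bss_start;
}
```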

Step 7: Flush the text from the data cache before executing code from the loaded module

Before executing code in the allocated space, a program should flush the instruction and data caches. Although this is strictly necessary only on systems that have separate instruction and data caches, it is simplest to do it on all systems, which also keeps the program portable.

Use an assembly language routine named flush_cache (see The flush_cache Function in this chapter). You must assemble this routine separately (with the as command) and link it with the main program.

Step 8: Get the addresses of routines and data that are referenced in the module

If the -e linker option was used, the module's header will contain the address of the entry point. The entry point's address is stored in the .exec_entry member of the som_exec_auxhdr structure.

If the module contains multiple routines and data that must be accessed from the main program, the main program can use the nlist(3C) function to get their addresses.

Another approach that can be used is to have the entry point routine return the addresses of required routines and data.
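The last approach might look like the sketch below. Everything here is hypothetical: the structure module_exports, the entry point module_entry, and the routine and variable it exports are illustrative names, and the entry point has a different signature from the one used in the example program that follows.

```c
#include <stddef.h>

/* Hypothetical table of addresses that a module hands back to its loader. */
struct module_exports {
    void (*bar)(void);   /* address of a routine in the module  */
    int  *counter;       /* address of a variable in the module */
};

/* --- Code that would live in the loaded module --- */
static int counter = 0;

static void bar(void)
{
    counter++;
}

/*
 * Entry point named with the -e linker option.  It returns the
 * addresses that the main program needs, so no symbol lookup is
 * required after loading.
 */
const struct module_exports *module_entry(void)
{
    static const struct module_exports exports = { bar, &counter };

    return &exports;
}
```

The main program would convert the .exec_entry address to a pointer to a function of this type, call it once, and then reach the module's routines and data through the returned table.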

An Example Program

To illustrate dynamic linking concepts, this section presents an example program, dynprog. This program loads an object module named dynobj.o, which is created by dynamically linking two object files, file1.o and file2.o (see file1.o and file2.o).

The program allocates space for dynobj.o by calling a function named alloc_load_space (see The alloc_load_space Function later in this chapter). The program then calls a function named dyn_load to dynamically link and load dynobj.o (see The dyn_load Function later in this chapter). Both functions are defined in a file called dynload.c (see dynload.c).

As a return value, dyn_load provides the address of the entry point in dynobj.o - in this case, the function foo. To get the addresses of the function bar and the variable counter, the program uses the nlist(3C) function.

The Build Environment

Before seeing the program's source code, it may help to see how the program and the various object files were built. The following shows the makefile used to generate the various files.
Makefile Used to Create Dynamic Link Files
CFLAGS = -Aa -D_POSIX_SOURCE
dynprog:        dynprog.o dynload.o flush_cache.o
# Compile line:
	cc -o dynprog dynprog.o dynload.o flush_cache.o -Wl,-a,archive

file1.o:        file1.c dynprog.c
file2.o:        file2.c

# Create flush_cache.o:
flush_cache.o:
	as flush_cache.s
This makefile assumes that the following files are found in the current directory:
dynload.c

The file containing the alloc_load_space and dyn_load functions.
dynprog.c

The main program that calls functions from dynload.c and dynamically links and loads file1.o and file2.o. Also contains the function glorp, which is called by foo and bar.
file1.c

Contains the functions foo and bar.
file2.c

Contains the variable counter, which is incremented by foo, bar, and main.
flush_cache.s

Assembly language source for function flush_cache, which is called by the dyn_load function.
To create the executable program dynprog from this makefile, you would simply run:
$ make dynprog file1.o file2.o
        cc -Aa -D_POSIX_SOURCE -c dynprog.c
        cc -Aa -D_POSIX_SOURCE -c dynload.c
        as flush_cache.s
        cc -o dynprog dynprog.o dynload.o flush_cache.o -Wl,-a,archive
        cc -Aa -D_POSIX_SOURCE -c file1.c
        cc -Aa -D_POSIX_SOURCE -c file2.c
Note that the line CFLAGS = causes any C files to be compiled in ANSI mode (-Aa) and defines the _POSIX_SOURCE macro (-D_POSIX_SOURCE), which makes the declarations defined by the POSIX standard visible in the system header files.

For details on using make refer to make(1).

Source for dynprog

Here is the source file for the dynprog program.
dynprog.c - Example Dynamic Link and Load Program
#include <stdio.h>
#include <stdlib.h>                        /* for exit()            */
#include <nlist.h>
 
extern void * alloc_load_space(const char * base_prog,
                               const char * obj_files,
                               const char * dest_file);
 
 
extern void * dyn_load(const char * base_prog,
                       unsigned int addr,
                       const char * obj_files,
                       const char * dest_file,
                       const char * entry_pt);
 
const char * base_prog = "dynprog";       /* this executable's name */
const char * obj_files = "file1.o file2.o"; /* .o files to combine  */
const char * dest_file = "dynobj.o";        /* .o file to load      */
const char * entry_pt  = "foo";             /* define entry pt name */
 
void glorp (const char *); /* prototype for local function          */
void (* foo_ptr) ();       /* pointer to entry point foo            */
void (* bar_ptr) ();       /* pointer to function bar               */
int * counter_ptr;         /* pointer to variable counter [file2.c] */
int main(void)
{
  unsigned int addr;       /* address at which to load dynobj.o  */
  struct nlist nl[3];      /* nlist struct to retrieve addresses */
 
/*
STEP 1:  Allocate space for module:
*/
 addr = (unsigned int) alloc_load_space(base_prog,
                                        obj_files, dest_file);
 
/*
STEP 2:  Load the file at the address, and get address of entry point:
*/
  foo_ptr = (void (*)()) dyn_load(base_prog, addr, obj_files,
                                  dest_file, entry_pt);
 
 
/*
STEP 3:  Get the addresses of all desired routines using nlist(3C):
*/
 
  nl[0].n_name = "bar";         
  nl[1].n_name = "counter";
  nl[2].n_name = NULL;
  if (nlist(dest_file, nl)) {
    fprintf(stderr, "error obtaining namelist for %s\n", dest_file);
    exit(1);
  }
  /*
   * Assign the addresses to meaningful variable names:
   */
  bar_ptr = (void (*)()) nl[0].n_value;
  counter_ptr = (int *) nl[1].n_value;
 
  /*
   * Now you can call the routines and modify the variables:
   */
  glorp("main");
  (*foo_ptr) ();
  (*bar_ptr) ();
  (*counter_ptr) ++;
  printf("counter = %d\n", *counter_ptr);
  return 0;
}
 
void    glorp(const char * from)
{
  printf("glorp called from %s\n", from);
}

file1.o and file2.o

Source for file1.c and file2.c shows the source for file1.o and file2.o. Notice that foo and bar call glorp in dynprog.c. Also, both functions update the variable counter in file2.o; however, foo updates counter through the pointer (counter_ptr) defined in dynprog.c.
Source for file1.c and file2.c
/****************************************************************
 * file1.c - Contains routines foo() and bar().
 ****************************************************************/
 
extern int * counter_ptr;             /* defined in dynprog.c */
extern int counter;                   /* defined in file2.c   */
extern void glorp(const char * from); /* defined in dynprog.c */
 
void foo()
{
 glorp("foo");
 (*counter_ptr) ++; /* update counter indirectly with global pointer */
}
 
void bar()
{
 glorp("bar");
 counter ++;        /* update counter directly */
}
 
/****************************************************************
 * file2.c - Global counter variable referenced by dynprog.c 
 * and file1.c.
 ****************************************************************/
int counter = 0;

Output of dynprog

Now that you see how the main program and the module it loads are organized, here is the output produced when dynprog runs:
glorp called from main
glorp called from foo
glorp called from bar
counter = 3

dynload.c

The dynload.c file contains the definitions of the functions alloc_load_space and dyn_load. Include Directives for dynload.c shows the #include directives that must appear at the start of this file.
Include Directives for dynload.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>              /* for memset()      */
#include <nlist.h>
#include <filehdr.h>
#include <aouthdr.h>
#define PAGE_SIZE 4096           /* memory page size  */

The alloc_load_space Function

The alloc_load_space function returns a pointer to space (allocated by malloc) into which dynprog will load the object module dynobj.o. Its syntax is:
void * alloc_load_space(const char * base_prog,
                        const char * obj_files,
                        const char * dest_file)
base_prog

The name of the program that is calling the routine. In other words, the name of the program that will dynamically link and load dest_file.
obj_files

The name of the object file or files that will be linked together to create dest_file.
dest_file

The name of the resulting object module that will be dynamically linked and loaded by base_prog.
As described in Step 1 in Overview of Dynamic Linking at the start of this section, you can either guess at how much space will be required to load a module, or you can try to be more accurate. The advantage of the former approach is that it is much easier and probably adequate in most cases; the advantage of the latter is that it results in less memory fragmentation and could be a better approach if you have multiple modules to load throughout the course of program execution.

The alloc_load_space function allocates only the required amount of space. To determine how much memory is required, alloc_load_space performs these steps:

    Pre-link the specified obj_files to create dest_file.

    Get text, data, and bss segment location and size information to determine how much space to allocate.

    Return a pointer to the space. (The address of the space is adjusted to begin on a memory page boundary - that is, a 4096-byte boundary.)

C Source for alloc_load_space Function shows the source for this function.
C Source for alloc_load_space Function
void * alloc_load_space(const char * base_prog,
                        const char * obj_files,
                        const char * dest_file)
{
  char cmd_buf[256];    /* linker command line                   */
  int ret_val;          /* value returned by various lib calls   */
  size_t space;         /* size of space to allocate for module  */
  size_t addr;          /* address of allocated space            */
  FILE * destfp;        /* file pointer for dest_file            */
  
  struct som_exec_auxhdr aux_hdr;  /* SOM auxiliary header       */
 
/* ---------------------------------------------------------------
 * STEP 1:  Pre-link the destination module so we can get its size:
 */
  sprintf(cmd_buf, "/bin/ld -a archive -R80000 -A %s -N %s -o %s -lc",
          base_prog, obj_files, dest_file);
  if ((ret_val = system(cmd_buf)) != 0) {
    fprintf(stderr, "link failed: %s\n", cmd_buf);
    exit(1);    /* system() returns a wait status, not an exit code */
  }
/* ---------------------------------------------------------------
 * STEP 2:  Get the size of the module's text, data, and bss segments 
 * from the auxiliary header for dest_file; add them together to 
 * determine size:
*/
  if ((destfp = fopen(dest_file, "r")) == NULL) {
    fprintf(stderr, "error opening %s\n", dest_file);
    exit(1);
  }
 
  /*
   * Must seek past SOM "header" to get to the desired 
   * "som_exec_auxhdr":
   */
  if (fseek(destfp, sizeof(struct header), SEEK_SET)) {
    fprintf(stderr, "error seeking past header in %s\n", dest_file);
    exit(1);
  }
  if (fread(&aux_hdr, sizeof(aux_hdr), 1, destfp) != 1) {
    fprintf(stderr, "error reading som aux header from %s\n", dest_file);
    exit(1);
  }
  
  /* allow for page-alignment of data segment */
  
  space = aux_hdr.exec_tsize + aux_hdr.exec_dsize 
    + aux_hdr.exec_bsize + 2 * PAGE_SIZE;    
 
  fclose(destfp);               /* done reading from module file */
/* ---------------------------------------------------------------
 * STEP 3:  Call malloc(3C) to allocate the required memory and get
 * its address; then return a pointer to the space:
 */
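  addr = (size_t) malloc(space);
  if (addr == 0) {
    fprintf(stderr, "unable to allocate %lu bytes\n", (unsigned long) space);
    exit(1);
  }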
  /*
   * Make sure allocated area is on page-aligned address:
   */
  if (addr % PAGE_SIZE != 0) addr += PAGE_SIZE - (addr % PAGE_SIZE);
 
  return((void *) addr);
}
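One consequence of returning only the adjusted address is that the pointer originally returned by malloc is lost, so the space cannot later be passed to free. If modules are loaded and discarded during execution, a variant along the following lines keeps both values. This is a sketch under stated assumptions: the structure load_space and the function alloc_aligned are hypothetical names, and an unsigned long is assumed to be wide enough to hold an address.

```c
#include <stdlib.h>

#define PAGE_SIZE 4096

/* Hypothetical pair: the pointer to pass to free(), and the load address. */
struct load_space {
    void *base;               /* value returned by malloc          */
    unsigned long load_addr;  /* page-aligned address inside base  */
};

/*
 * Allocate size bytes plus one page of slack, then record both the
 * original pointer and the page-aligned address within it.
 * Returns 0 on success, or -1 if malloc fails.
 */
int alloc_aligned(size_t size, struct load_space *out)
{
    unsigned long addr;

    out->base = malloc(size + PAGE_SIZE);
    if (out->base == NULL)
        return -1;

    addr = (unsigned long) out->base;
    if (addr % PAGE_SIZE != 0)
        addr += PAGE_SIZE - (addr % PAGE_SIZE);
    out->load_addr = addr;
    return 0;
}
```

The load_addr member is the value to pass to the linker with -R; the base member is the value to pass to free when the module is no longer needed.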

The dyn_load Function

The dyn_load function dynamically links and loads an object module into the space allocated by the alloc_load_space function. In addition, it returns the address of the entry point in the loaded module. Its syntax is:
void * dyn_load(const char * base_prog,
                unsigned int addr,
                const char * obj_files,
                const char * dest_file,
                const char * entry_pt)
The base_prog, obj_files, and dest_file parameters are the same parameters supplied to alloc_load_space. The addr parameter is the address returned by alloc_load_space, and the entry_pt parameter specifies a symbol name that you want to act as the entry point in the module.

To dynamically link and load dest_file into base_prog, the dyn_load function performs these steps:

    Dynamically link base_prog with obj_files, producing dest_file. The address at which dest_file will be loaded into memory is specified with the -Raddr option. The name of the entry point for the file is specified with -eentry_pt.

    Open dest_file and get its header information on the text, data, and bss segments. Read this information into a som_exec_auxhdr structure, which starts immediately after a header structure.

    Read the text and data segments into the area allocated by alloc_load_space. (The text and data segments are read separately.)

    Initialize (fill with zeros) the bss, which starts immediately after the data segment.

    Flush text from the data cache before execution, using the flush_cache routine. (See The flush_cache Function later in this chapter.)

    Return a pointer to the entry point, specified by the -e option in Step 1.

C Source for dyn_load Function
void * dyn_load(const char * base_prog,
                unsigned int addr,
                const char * obj_files,
                const char * dest_file,
                const char * entry_pt)
{
  char  cmd_buf[256];       /* buffer holding linker command       */
  int   ret_val;            /* holds return value of library calls */
  FILE  * destfp;           /* file pointer for destination file   */
  unsigned int bss_start;   /* start address of bss in VM          */
  unsigned int bss_size;    /* size of bss                         */
  unsigned int entry_pt_addr; /* address of entry point            */
 
  struct som_exec_auxhdr aux_hdr;  /* som file auxiliary header    */
  unsigned int tdb_size;    /* size of text, data, and bss combined*/
 
/* -----------------------------------------------------------------
 * STEP 1: Dynamically link the module to be loaded:
 */
  sprintf(cmd_buf, 
          "/bin/ld -a archive -A %s -R %x -N %s -o %s -lc -e %s",
          base_prog, addr, obj_files, dest_file, entry_pt);
          
  if ((ret_val = system(cmd_buf)) != 0) 
  {
    fprintf(stderr, "link command failed: %s\n", cmd_buf);
    exit(1);    /* system() returns a wait status, not an exit code */
  }
 
/* -----------------------------------------------------------------
 * STEP 2: Open dest_file. Read its auxiliary header for text, data, 
 *         and bss info:
 */
  if ((destfp = fopen(dest_file, "r")) == NULL) 
  {
    fprintf(stderr, "error opening %s for loading\n", dest_file);
    exit(1);
  }
 
  /*
   * Get auxiliary header information from "som_exec_auxhdr" struct, 
   * which is after SOM header.
   */
 
  if (fseek(destfp, sizeof(struct header), SEEK_SET)) 
  {
    fprintf(stderr, "error seeking past header in %s\n", dest_file);
    exit(1);
  }
  
  if (fread(&aux_hdr, sizeof(aux_hdr), 1, destfp) != 1) 
  {
    fprintf(stderr, "error reading som aux header from %s\n", dest_file);
    exit(1);
  }
/* -----------------------------------------------------------------
 * STEP 3:  Read the text and data segments into the buffer area:
 */
 
  /*
   * Read text and data separately.  First load the text:
   */
   
  if (fseek(destfp, aux_hdr.exec_tfile, SEEK_SET)) 
  {
    fprintf(stderr, "error seeking start of text in %s\n", dest_file);
    exit(1);
  }
  
  if (fread((void *) aux_hdr.exec_tmem, aux_hdr.exec_tsize, 1, destfp) != 1)
  {
    fprintf(stderr, "error reading text from %s\n", dest_file);
    exit(1);
  }
  /*
   * Now load the data, if any:
   */
  if (aux_hdr.exec_dsize) {
    if (fseek(destfp, aux_hdr.exec_dfile, SEEK_SET))
    {
      fprintf(stderr, "error seeking start of data in %s\n", dest_file);
      exit(1);
    }

    if (fread((void *) aux_hdr.exec_dmem, aux_hdr.exec_dsize, 1, destfp) != 1)
    {
      fprintf(stderr, "error reading data from %s\n", dest_file);
      exit(1);
    }
  }
 
  fclose(destfp);               /* done reading from module file */
/* -----------------------------------------------------------------
 * STEP 4:  Zero out the bss (uninitialized data segment):
 */
 
  bss_start = aux_hdr.exec_dmem + aux_hdr.exec_dsize;
  bss_size  = aux_hdr.exec_bsize;
 
  memset((void *) bss_start, 0, bss_size);
 
/* -----------------------------------------------------------------
 * STEP 5:  Flush the text from the data cache before execution:
 */
 
  /*
   * The flush_cache routine must know the exact size of the
   * text, data, and bss, computed as follows:
   *   Size = (Data Addr - Text Addr) + Data Size + BSS Size
   * where (Data Addr - Text Addr) = Text Size + alignment between
   *   Text and Data.
   */
  tdb_size = (aux_hdr.exec_dmem - aux_hdr.exec_tmem) +
    aux_hdr.exec_dsize + aux_hdr.exec_bsize;
  flush_cache(addr, tdb_size);
 
/* -----------------------------------------------------------------
 * STEP 6:  Return a pointer to the entry point specified by -e:
 */
 
  entry_pt_addr = (unsigned int) aux_hdr.exec_entry;
  return ((void *) entry_pt_addr);
}
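The size passed to flush_cache in Step 5 is measured from the start of the text to the end of the bss, so it includes any alignment gap between the text and data segments. The sketch below isolates that calculation; the function name flush_region_size and the numbers in the usage note are hypothetical.

```c
/*
 * Size of the region to flush:
 *   (data address - text address) + data size + bss size
 * The first term equals the text size plus any alignment padding
 * between the text and data segments.
 * (Hypothetical helper, not part of the example program.)
 */
unsigned long flush_region_size(unsigned long tmem, unsigned long dmem,
                                unsigned long dsize, unsigned long bsize)
{
    return (dmem - tmem) + dsize + bsize;
}
```

For instance, with text loaded at 0x80000, data at 0x81240, a data size of 0x100, and a bss size of 0x40, the result is 0x1240 + 0x100 + 0x40 = 0x1380. If the text size in that case were 0x1234, simply adding the three segment sizes would give 0x1374 and miss the 0xC bytes of padding.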

The flush_cache Function

Since there is no existing routine to flush text from the data cache before execution, you must create one. Below is the assembly language source for such a function.
Assembly Language Source for flush_cache Function
; flush_cache.s
;
; Routine to flush and synchronize data and instruction caches
; for dynamic loading
;
; Copyright Hewlett-Packard Co. 1985,1991, 1995
;
; All HP VARs and HP customers have a non-exclusive royalty-free 
; license to copy and use this flush_cache() routine in source 
; code and/or object code. 
 
        .code
 
; flush_cache(addr, len) - executes FDC and FIC instructions for 
; every cache line in the text region given by starting addr and 
; len. When done, it executes a SYNC instruction and then enough 
; NOPs to assure the cache has been flushed.
;
; Assumption: Cache line size is at least 16 bytes.  Seven NOPs 
; is enough to assure cache has been flushed.  This routine is 
; called to flush the cache for just-loaded dynamically linked 
; code which will be executed from SR5 (data) space.
 
; %arg0=GR26, %arg1=GR25, %arg2=GR24, %arg3=GR23, %sr0=SR0.
; loop1 flushes data cache.  arg0 holds address.  arg1 holds offset.
; SR=0 means that SID of data area is used for fdc.
; loop2 flushes inst cache.  arg2 holds address.  arg3 holds offset.
; SR=sr0 means that SID of data area is used for fic.
; fdc x(0,y) -> 0 means use SID of data area.
; fic x(%sr0,y) -> SR0 means use SR0 SID (which is set to data area).
 
  .proc
  .callinfo
  .export flush_cache,entry
flush_cache
  .enter
  ldsid   (0,%arg0),%r1           ; Extract SID (SR5) from address
  mtsp    %r1,%sr0                ; SID -> SR0
  ldo     -1(%arg1),%arg1         ; offset = length -1
  copy    %arg0,%arg2             ; Copy address from GR26 to GR24
  copy    %arg1,%arg3             ; Copy offset from GR25 to GR23
 
  fdc     %arg1(0,%arg0)          ; Flush data cache @SID.address+offset
l