HP-UX Linker and Libraries User's Guide: HP 9000 Computers, Chapter 3 Linker Tasks: Using Linker Commands
This section describes linker commands for the 32-bit and 64-bit linker.
In 32-bit mode, you must always include crt0.o on the link line. In 64-bit mode, you must include crt0.o on the link line for all fully archive links (ld -noshared) and in compatibility mode (+compat). You do not need to include the crt0.o startup file on the ld command line for shared bound links. In 64-bit mode, the dynamic loader, dld.sl, does some of the startup duties previously done by crt0.o. See “The crt0.o Startup File” and the crt0(3) man page for more information.
You can change or override the default linker search path by using the LPATH environment variable or the -L linker option.
The LPATH environment variable allows you to specify which directories ld should search. If LPATH is not set, ld searches the default directory /usr/lib. If LPATH is set, ld searches only the directories specified in LPATH; the default directories are not searched unless they are specified in LPATH. If set, LPATH should contain a colon-separated list of the directory path names that ld should search. For example, to include /usr/local/lib in the search path after the default directory, set LPATH to /usr/lib:/usr/local/lib.
The -L option to ld also lets you add directories to the search path. If -L libpath is specified, ld searches the libpath directory before the default places. For example, suppose you have a locally developed version of libc, which resides in the directory /usr/local/lib. To make ld find this version of libc before the default libc, specify -L /usr/local/lib on the ld command line.
Multiple -L options can be specified. For example, to search /usr/contrib/lib and /usr/local/lib before the default places, specify -L /usr/contrib/lib -L /usr/local/lib.
If LPATH is set, then the -L option specifies the directories to search before the directories specified in LPATH.
You might want to force immediate binding, that is, force all routines and data to be bound at startup time. With immediate binding, the overhead of binding occurs only at program startup, rather than across the program's execution. One useful characteristic of immediate binding is that any unresolved symbols are detected at startup time, rather than during program execution. Another use of immediate binding is to get better interactive performance, if you don't mind program startup taking a little longer.
To force immediate binding, link an application with the -B immediate linker option. For example, to force immediate binding of all symbols in the main program and in all shared libraries linked with it, you could use this ld command:
The linker also supports nonfatal binding, which is useful with the -B immediate option. Like immediate binding, nonfatal immediate binding causes all required symbols to be bound at program startup. The main difference from immediate binding is that program execution continues even if the dynamic loader cannot resolve symbols. Compare this with immediate binding, where unresolved symbols cause the program to abort. To use nonfatal binding, specify the -B nonfatal option along with the -B immediate option on the linker command line. The order of the options is not important, nor is the placement of the options on the line. For example, the following ld command uses nonfatal immediate binding:
Note that the -B nonfatal modifier does not work with deferred binding because a symbol must have been bound by the time a program actually references or calls it. A program attempting to call or access a nonexistent symbol is a fatal error.
The linker also supports restricted binding, which is useful with the -B deferred and -B nonfatal options. The -B restricted option causes the dynamic loader to restrict the search for symbols to those that were visible when the library was loaded. If the dynamic loader cannot find a symbol within the restricted set, a run-time symbol binding error occurs and the program aborts.
The -B nonfatal modifier alters this behavior slightly: if the dynamic loader cannot find a symbol in the restricted set, it looks in the global symbol set (the symbols defined in all libraries) to resolve the symbol. If it still cannot find the symbol, then a run-time symbol binding error occurs and the program aborts.
When is -B restricted most useful? Consider a program that creates duplicate symbol definitions at run time by calling shl_load or shl_definesym.
If such a program is linked with -B immediate, references to symbols will be bound at program startup, regardless of whether duplicate symbols are created later by shl_load or shl_definesym. But what happens when, to take advantage of the performance benefits of deferred binding, the same program is linked with -B deferred? If a duplicate, more visible symbol definition is created prior to referencing the symbol, it binds to the more visible definition, and the program might run incorrectly. In such cases, -B restricted is useful, because symbols bind the same way as they do with -B immediate, but actual binding is still deferred.
The linker supports the -B symbolic option, which optimizes call paths between procedures when building shared libraries. It does this by building direct internal call paths inside a shared library. In linker terms, import and export stubs are bypassed for calls within the library. A benefit of -B symbolic is that it can help improve application performance, and the resulting shared library will be slightly smaller. The -B symbolic option is useful for applications that make a lot of calls between procedures inside a shared library and when these same procedures are called by programs outside of the shared library.
For example, to optimize the call path between procedures when building a shared library called lib1.sl, use -B symbolic as follows:
Similar to the -h (hide symbol) and +e (export symbol) linker options, -B symbolic optimizes call paths in a shared library. However, unlike -h and +e, all functions in a shared library linked with -B symbolic are also visible outside of the shared library. Suppose you have two functions to place in a shared library. The convert_rtn() calls gal_to_liter().
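As a minimal sketch of the two functions just described, convert_rtn() might simply delegate to gal_to_liter(). The signatures and the conversion factor below are assumptions made for illustration and are not taken from the guide.

```c
/* Hypothetical sketch: signatures and bodies are assumptions. */

/* Convert US gallons to liters (1 US gallon = 3.785411784 liters). */
double gal_to_liter(double gallons)
{
    return gallons * 3.785411784;
}

/* Routine intended for outside callers. Its call to gal_to_liter() is the
 * kind of intra-library call that -B symbolic binds directly, bypassing
 * the import and export stubs. */
double convert_rtn(double gallons)
{
    return gal_to_liter(gallons);
}
```

Both functions are ordinary global definitions, so with -B symbolic either one could still be called from outside the library.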
Figure 3-1 shows that a direct call path is established between convert_rtn() and gal_to_liter() inside the shared library. Both symbols are visible to outside callers. The -h (hide symbol) and +e (export symbol) options can also optimize the call path in a shared library for symbols that are explicitly hidden. However, only the exported symbols are visible outside of the shared library. For example, you could hide the gal_to_liter symbol as shown:
or export the convert_rtn symbol:
In both cases, main2 will not be able to resolve its reference to gal_to_liter() because only the convert_rtn() symbol is exported.
If both an archive and shared version of a particular library reside in the same directory, ld links with the shared version. Occasionally, you might want to override this behavior. As an example, suppose you write an application that will run on a system on which shared libraries may not be present. Since the program could not run without the shared library, it would be best to link with the archive library, resulting in executable code that contains the required library routines. See also “Caution When Mixing Shared and Archive Libraries”.
The -a option tells the linker what kind of library to link with. It applies to all libraries (-l options) until the end of the command line or until the next -a option. Its syntax is:
The different option settings are:
The -a shared and -a archive options specify only one type of library to use. An error results if that type is not found. The other three options specify a preferred type of library and an alternate type of library if the preferred type is not found.
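The difference between the strict settings and the preferred/alternate settings can be modeled with a small C function. This is only a sketch of the rule stated above, not the linker's implementation; the enum and the function name choose_library are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the kinds of library the linker can pick. */
enum lib_kind { LIB_NONE, LIB_ARCHIVE, LIB_SHARED };

/* Pick the library kind to link with.
 * preferred: the kind tried first.
 * alternate: the kind tried next, or LIB_NONE for a strict setting
 *            such as -a shared or -a archive.
 * Returns LIB_NONE when no acceptable library exists (a link error). */
enum lib_kind choose_library(enum lib_kind preferred, enum lib_kind alternate,
                             bool have_archive, bool have_shared)
{
    enum lib_kind order[2] = { preferred, alternate };

    for (size_t i = 0; i < 2; i++) {
        if (order[i] == LIB_ARCHIVE && have_archive)
            return LIB_ARCHIVE;
        if (order[i] == LIB_SHARED && have_shared)
            return LIB_SHARED;
    }
    return LIB_NONE;
}
```

For example, a strict shared setting with only an archive library present yields LIB_NONE, while a setting that prefers shared and falls back to archive yields LIB_ARCHIVE.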
This section describes how to do dynamic linking — that is, how to add an object module to a running program. Conceptually, it is very similar to loading a shared library and accessing its symbols (routines and data). In fact, if you require such functionality, you should probably use shared library management routines (see Chapter 6 “Shared Library Management Routines ”). However, be aware that dynamic linking is incompatible with shared libraries. That is, a running program cannot be linked to shared libraries and also use ld -A to dynamically load object modules.
Topics in this section include:
The implementation details of dynamic linking vary across platforms. To load an object module into the address space of a running program, and to be able to access its procedures and data, follow these steps on all HP9000 computers:
There must be enough contiguous memory to hold the module's text, data, and bss segments. You can make a liberal guess as to how much memory is needed, and hope that you've guessed correctly. Or you can be more precise by pre-linking the module and getting size information from its header. Typically, you use malloc(3C) to allocate the required memory. You must modify the starting address returned by malloc to ensure that it starts on a memory page boundary (address MOD 4096 == 0). Use the following options when invoking the linker from the program:
There are two header structures stored at the start of the file: struct header (defined in <filehdr.h>) and struct som_exec_auxhdr (defined in <aouthdr.h>). The required information is stored in the second header, so to get it, a program must seek past the first header before reading the second one. The useful members of the som_exec_auxhdr structure are:
Once you know the location of the required segments in the file, you can read them into the area allocated in Step 2. The location of the text and data segments in the file is defined by the .exec_tfile and .exec_dfile members of the som_exec_auxhdr structure. The address at which to place the segments in the allocated memory is defined by the .exec_tmem and .exec_dmem members. The size of the segments to read in is defined by the .exec_tsize and .exec_dsize members.
The bss segment starts immediately after the data segment. To zero out the bss, find the end of the data segment and use memset (see memory(3C)) to zero out the size of the bss. The end of the data segment can be determined by adding the .exec_dmem and .exec_dsize members of the som_exec_auxhdr structure. The bss's size is defined by the .exec_bsize member.
Before executing code in the allocated space, a program should flush the instruction and data caches. Although this is necessary only on systems that have instruction and data caches, it is easiest to do it on all systems for portability. Use an assembly language routine named flush_cache (see “The flush_cache Function” in this chapter). You must assemble this routine separately (with the as command) and link it with the main program.
If the -e linker option was used, the module's header will contain the address of the entry point. The entry point's address is stored in the .exec_entry member of the som_exec_auxhdr structure. If the module contains multiple routines and data that must be accessed from the main program, the main program can use the nlist(3C) function to get their addresses. Another approach is to have the entry point routine return the addresses of required routines and data.
To illustrate dynamic linking concepts, this section presents an example program, dynprog.
This program loads an object module named dynobj.o, which is created by dynamically linking two object files file1.o and file2.o (see “file1.o and file2.o ”). The program allocates space for dynobj.o by calling a function named alloc_load_space (see “The alloc_load_space Function ” later in this chapter). The program then calls a function named dyn_load to dynamically link and load dynobj.o (see “The dyn_load Function ” later in this chapter). Both functions are defined in a file called dynload.c (see “dynload.c ”). As a return value, dyn_load provides the address of the entry point in dynobj.o — in this case, the function foo. To get the addresses of the function bar and the variable counter, the program uses the nlist(3C) function.
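Two of the memory-related steps from the overview, page-aligning the address returned by malloc and zeroing the bss after the data segment, can be sketched in C as follows. This is a minimal sketch under stated assumptions: the function names and the LOAD_PAGE_SIZE constant are hypothetical, and the data address and sizes are passed as plain parameters standing in for the .exec_dmem, .exec_dsize, and .exec_bsize header members.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Page size assumed by the text (address MOD 4096 == 0). */
#define LOAD_PAGE_SIZE 4096u

/* Round an address up to the next page boundary. */
uintptr_t page_align(uintptr_t addr)
{
    return (addr + LOAD_PAGE_SIZE - 1) & ~(uintptr_t)(LOAD_PAGE_SIZE - 1);
}

/* Allocate size bytes plus one page of slack so that a page-aligned start
 * address always fits. *raw receives the pointer to pass to free() later.
 * Returns the aligned start address, or NULL if malloc fails. */
unsigned char *alloc_page_aligned(size_t size, void **raw)
{
    *raw = malloc(size + LOAD_PAGE_SIZE);
    if (*raw == NULL)
        return NULL;
    return (unsigned char *)page_align((uintptr_t)*raw);
}

/* Zero the bss, which starts immediately after the data segment.
 * dmem:  address where the data segment was placed (stands in for .exec_dmem)
 * dsize: size of the data segment (stands in for .exec_dsize)
 * bsize: size of the bss (stands in for .exec_bsize) */
void zero_bss(unsigned char *dmem, size_t dsize, size_t bsize)
{
    memset(dmem + dsize, 0, bsize);
}
```

The extra page of slack is what makes it safe to move the start address forward; the original pointer must be kept so that the block can be freed.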
Before seeing the program's source code, it may help to see how the program and the various object files were built. The following shows the makefile used to generate the various files. Example 3-1 Makefile Used to Create Dynamic Link Files
This makefile assumes that the following files are found in the current directory:
To create the executable program dynprog from this makefile, you would simply run:
Note that the CFLAGS line in the makefile causes any C files to be compiled in ANSI mode (-Aa) and causes the compiler to search for routines that are defined in the POSIX standard (-D_POSIX_SOURCE). For details on using make, refer to make(1).
Here is the source file for the dynprog program. Example 3-2 dynprog.c: Example Dynamic Link and Load Program
Example 3-3 “Source for file1.c and file2.c” shows the source for file1.o and file2.o. Notice that foo and bar call glorp in dynprog.c. Also, both functions update the variable counter in file2.o; however, foo updates counter through the pointer (counter_ptr) defined in dynprog.c. Example 3-3 Source for file1.c and file2.c
Now that you see how the main program and the module it loads are organized, here is the output produced when dynprog runs:
The dynload.c file contains the definitions of the functions alloc_load_space and dyn_load. Example 3-4 “Include Directives for dynload.c” shows the #include directives that must appear at the start of this file.
The alloc_load_space function returns a pointer to space (allocated by malloc) into which dynprog will load the object module dynobj.o. Its syntax is:
As described in Step 1 in “Overview of Dynamic Linking ” at the start of this section, you can either guess at how much space will be required to load a module, or you can try to be more accurate. The advantage of the former approach is that it is much easier and probably adequate in most cases; the advantage of the latter is that it results in less memory fragmentation and could be a better approach if you have multiple modules to load throughout the course of program execution. The alloc_load_space function allocates only the required amount of space. To determine how much memory is required, alloc_load_space performs these steps:
Example 3-5 “C Source for alloc_load_space Function ” shows the source for this function. Example 3-5 C Source for alloc_load_space Function
The dyn_load function dynamically links and loads an object module into the space allocated by the alloc_load_space function. In addition, it returns the address of the entry point in the loaded module. Its syntax is:
The base_prog, obj_files, and dest_file parameters are the same parameters supplied to alloc_load_space. The addr parameter is the address returned by alloc_load_space, and the entry_pt parameter specifies a symbol name that you want to act as the entry point in the module. To dynamically link and load dest_file into base_prog, the dyn_load function performs these steps:
Example 3-6 C Source for dyn_load Function
Since there is no existing routine to flush text from the data cache before execution, you must create one. Below is the assembly language source for such a function. Example 3-7 Assembly Language Source for flush_cache Function
The +e option allows you to hide and export symbols. Exporting a symbol makes the symbol a global definition, which can be accessed by any other object modules or libraries. The +e option exports symbol and hides from export all other global symbols not specified with +e. In essence, -h and +e provide two different ways to do the same thing.
The syntax of the +e option is:
+e symbol
Suppose you want to build a shared library from an object file that contains the following symbol definitions as displayed by the nm command:
In this example, check_sem_val, foo, bar, and sem are all global definitions. To create a shared library where check_sem_val is a hidden, local definition, you could use either of the following commands:
In contrast, suppose you want to export only the check_sem_val symbol. Either of the following commands would work:
How do you decide whether to use -h or +e? In general, use -h if you simply want to hide a few symbols, and use +e if you want to export a few symbols and hide a large number of symbols.
You should not combine -h and +e options on the same command line. For instance, suppose you specify +e sem. This would export the symbol sem and hide all other symbols. Any additional -h options would be unnecessary. If both -h and +e are used on the same symbol, the -h overrides the +e option.
The linker command line could get quite lengthy and difficult to read if several such options were specified. In fact, you could exceed the maximum HP-UX command line length if you specify too many options. To get around this, use ld linker option files, described under “Passing Linker Options in a file with -c”. You can specify any number of -h or +e options in this file.
You can use -h or +e options when building a shared library (with -b) and when linking to create an a.out file. When combining .o files with -r, you can use only the -h option.
Like the +e option, the +ee option allows you to export symbols. Unlike the +e option, the +ee option does not alter the visibility of any other symbols in the file. It exports the specified symbol, and does not hide any of the symbols exported by default.
By default, the linker exports from a program only those symbols that were imported by a shared library. For example, if a shared executable's libraries do not reference the program's main routine, the linker does not include the main symbol in the a.out file's export list. Normally, this is a problem only when a program calls shared library management routines (described in Chapter 6 “Shared Library Management Routines”). To make the linker export all symbols from a program, invoke ld with the -E option. The +e option allows you to be more selective about which symbols are exported, resulting in better performance. For details on +e, see “Exporting Symbols with +e”.
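The interaction between -h and +e when building a shared library can be modeled with a short C function. This is a sketch of the rules stated above (a hidden symbol stays hidden, and any +e option hides everything not named with +e), not the linker's own logic; the names in_list and is_exported are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if name appears among the first n entries of list. */
static bool in_list(const char *name, const char *const *list, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(name, list[i]) == 0)
            return true;
    return false;
}

/* Hypothetical model of whether a global symbol remains exported:
 *  - a symbol named with -h is hidden, even if it is also named with +e;
 *  - if any +e options are given, only symbols named with +e are exported;
 *  - with no +e options, every global symbol not named with -h is exported. */
bool is_exported(const char *name,
                 const char *const *hidden, size_t n_hidden,     /* -h symbols */
                 const char *const *exported, size_t n_exported) /* +e symbols */
{
    if (in_list(name, hidden, n_hidden))
        return false;
    if (n_exported > 0)
        return in_list(name, exported, n_exported);
    return true;
}
```

With the symbols from the earlier example, hiding check_sem_val with -h leaves foo, bar, and sem exported, while exporting only check_sem_val with +e hides the other three.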
The -h option allows you to hide symbols. Hiding a symbol makes the symbol a local definition, accessible only from the object module or library in which it is defined. Use -h if you simply want to hide a few symbols. You can use the -h option when building a shared library (with -b) and when linking to create an a.out file. When combining .o files with -r, you can use the -h option.
The syntax of the -h option is:
-h symbol
The -h option hides symbol. Any other global symbols remain exported unless hidden with -h. Suppose you want to build a shared library from an object file that contains the following symbol definitions as displayed by the nm command:
In this example, check_sem_val, foo, bar, and sem are all global definitions. To create a shared library where check_sem_val is a hidden, local definition, you could do the following:
You should not combine -h and +e options on the same command line. For instance, suppose you specify +e sem. This would export the symbol sem and hide all other symbols. Any additional -h options would be unnecessary. If both -h and +e are used on the same symbol, the -h overrides the +e option. The linker command line could get quite lengthy and difficult to read if several such options were specified. And in fact, you could exceed the maximum HP-UX command line length if you specify too many options. To get around this, use ld linker option files, described under “Passing Linker Options in a file with -c ”. You can specify any number of -h or +e options in this file. When building a shared library, you might want to hide a symbol in the library for several reasons:
Exporting a symbol is necessary if the symbol must be accessible outside the shared library. But remember that, by default, most symbols are global definitions anyway, so it is seldom necessary to explicitly export symbols. In C, all functions and global variables that are not explicitly declared as static have global definitions, while static functions and variables have local definitions. In FORTRAN, global definitions are generated for all subroutines, functions, and initialized common blocks. When using +e, be sure to export any data symbols defined in the shared library that will be used by another shared library or the program, even if these other files have definitions of the data symbols. Otherwise, your shared library will use its own private copy of the global data, and another library or the program file will not see any change. One example of a data symbol that should almost always be exported from a shared library is errno. errno is defined in every shared library and program; if this definition is hidden, the value of errno will not be shared outside of the library. The -r option combines multiple .o files, creating a single .o file. The reasons for hiding symbols in a .o file are the same as the reasons listed above for shared libraries. However, a performance improvement will occur only if the resulting .o file is later linked into a shared library. By default, the linker exports all of a program's global definitions that are imported by shared libraries specified on the linker command line. For example, given the following linker command, all global symbols in crt0.o and prog.o that are referenced by libm or libc are automatically exported:
With libraries that are explicitly loaded with shl_load, this behavior may not always be sufficient because the linker does not search explicitly loaded libraries (they aren't even present on the command line). You can work around this using the -E or +e linker option. As mentioned previously in the section “Exporting Symbols from main with -E ”, the -E option forces the export of all symbols from the program, regardless of whether they are referenced by shared libraries on the linker command line. The +e option allows you to be more selective in what symbols are exported. You can use +e to limit the exported symbols to only those symbols you want to be visible. For example, the following ld command exports the symbols main and foo. The symbol main is referenced by libc. The symbol foo is referenced at run time by an explicitly loaded library not specified at link time:
When using +e, be sure to export any data symbols defined in the program that may also be defined in explicitly loaded libraries. If a data symbol that a shared library imports is not exported from the program file, the program uses its own copy while the shared library uses a different copy if a definition exists outside the program file. In such cases, a shared library might update a global variable needed by the program, but the program would never see the change because it would be referencing its own copy. One example of a data symbol that should almost always be exported from a program is errno. errno is defined in every shared library and program; if this definition is hidden, the value of errno will not be shared outside of the program in which it is hidden. A library can be moved even after an application has been linked with it. This is done by providing the executable with a list of directories to search at run time for any required libraries. One way you can store a directory path list in the program is by using the +b path_list linker option. Note that dynamic path list search works only for libraries specified with -l on the linker command line (for example, -lfoo). It won't work for libraries whose full path name is specified (for example, /usr/contrib/lib/libfoo.sl). However, it can be enabled for such libraries with the -l option to the chatr command (see “Changing a Program's Attributes with chatr(1) ”). The syntax of the +b option is
where path_list is the list of directories you want the dynamic loader to search at run time. For example, the following linker command causes the path .:/app/lib:: to be stored in the executable. At run time, the dynamic loader would search for libfoo.sl, libm.sl, and libc.sl in the current working directory (.), the directory /app/lib, and lastly in the location in which the libraries were found at link time (::):
If path_list is only a single colon, the linker constructs a path list consisting of all the directories specified by -L, followed by all the directories specified by the LPATH environment variable. For instance, the following linker command records the path list as /app/lib:/tmp:
Whether specified as a parameter to +b or set as the value of the SHLIB_PATH environment variable, the path list is simply one or more path names separated by colons (:), just like the syntax of the PATH environment variable. An optional colon can appear at the start and end of the list.
Absolute and relative path names are allowed. Relative paths are searched relative to the program's current working directory at run time.
Remember that a shared library's full path name is stored in the executable. When searching for a library in an absolute or relative path at run time, the dynamic loader uses only the basename of the library path name stored in the executable. For instance, if a program is linked with /usr/local/lib/libfoo.sl, and the directory path list contains /apps/lib:xyz, the dynamic loader searches for /apps/lib/libfoo.sl, then ./xyz/libfoo.sl.
The full library path name stored in the executable is referred to as the default library path. To cause the dynamic loader to search for the library in the default location, use a null directory path (::). When the loader comes to a null directory path, it uses the default shared library path stored in the executable. For instance, if the directory path list in the previous example were /apps/lib::xyz, the dynamic loader would search for /apps/lib/libfoo.sl, /usr/local/lib/libfoo.sl, then ./xyz/libfoo.sl.
If the dynamic loader cannot find a required library in any of the directories specified in the path list, it searches for the library in the default location recorded by the linker.
A library can be moved even after an application has been linked with it. When a program is linked with +s, the dynamic loader gets the library path list from the SHLIB_PATH environment variable at run time.
This is especially useful for application developers who don't know where the libraries will reside at run time. In such cases, they can have the user or an install script set SHLIB_PATH to the correct value.
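The path-list lookup rules described above for +b and SHLIB_PATH can be sketched as a C function that lists the candidate file names in search order. This is a model of the documented behavior, not the dynamic loader's code. The name candidate_paths and the MAX_PATH_LEN limit are hypothetical, and the handling of the optional leading and trailing colon is an assumption about how that rule combines with null (::) entries.

```c
#include <stdio.h>
#include <string.h>

#define MAX_PATH_LEN 256

/* Return the basename of a library path (the part after the last '/'). */
static const char *base_name(const char *path)
{
    const char *slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}

/* Fill out[] with the files to try, in order, and return how many.
 * path_list:    colon-separated directory list (+b or SHLIB_PATH value)
 * default_path: full library path name recorded at link time */
size_t candidate_paths(const char *path_list, const char *default_path,
                       char out[][MAX_PATH_LEN], size_t max_out)
{
    size_t count = 0;
    int used_default = 0;
    const char *p = path_list;
    size_t len = strlen(path_list);

    /* Assumption: one optional leading and one optional trailing colon
     * are ignored, so an empty entry only arises from "::". */
    if (len > 0 && p[0] == ':') { p++; len--; }
    if (len > 0 && p[len - 1] == ':') len--;

    const char *end = p + len;
    while (len > 0 && p <= end && count < max_out) {
        const char *colon = memchr(p, ':', (size_t)(end - p));
        size_t dir_len = colon ? (size_t)(colon - p) : (size_t)(end - p);

        if (dir_len == 0) {
            /* Null directory path: use the default library path as is. */
            snprintf(out[count++], MAX_PATH_LEN, "%s", default_path);
            used_default = 1;
        } else {
            /* Only the basename of the recorded path is used. A relative
             * directory such as xyz is relative to the working directory. */
            snprintf(out[count++], MAX_PATH_LEN, "%.*s/%s",
                     (int)dir_len, p, base_name(default_path));
        }
        if (colon == NULL)
            break;
        p = colon + 1;
    }

    /* Fallback: the default location is tried last if not already listed. */
    if (!used_default && count < max_out)
        snprintf(out[count++], MAX_PATH_LEN, "%s", default_path);
    return count;
}
```

For the list /apps/lib::xyz and a library linked as /usr/local/lib/libfoo.sl, the sketch yields /apps/lib/libfoo.sl, /usr/local/lib/libfoo.sl, and xyz/libfoo.sl, matching the order given in the text.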
The -c file option causes the linker to read command line options from the specified file. This is useful if you have many -h or +e options to include on the ld command line, or if you have to link with numerous object files. For example, suppose you have over a hundred +e options that you need when building a shared library. You could place them in a file named eopts and force the linker to read options from the file as follows:
Note that the linker ignores lines in the option file that begin with a pound sign (#). You can use such lines as comment lines or to temporarily disable certain linker options in the file. For instance, the following linker option file for an application contains a disabled -O option:
If you use certain linker options all the time, you may find it useful to specify them in the LDOPTS environment variable. The linker inserts the value of this variable before all other arguments on the linker command line. For instance, if you always want the linker to display verbose information (-v) and a trace of each input file (-t), set LDOPTS as follows:
Thereafter, the following commands would be equivalent:
To direct the linker to search a particular library, use the -lname option. For example, to specify libc, use -lc; to specify libm, use -lm; to specify libXm, use -lXm. When writing programs that call routines not found in the default libraries linked at compile time, you must specify the libraries on the compiler command line with the -lx option. For example, if you write a C program that calls POSIX math functions, you must link with libm. The x argument corresponds to the identifying portion of the library path name — the part following lib and preceding the suffix .a or .sl. For example, for the libm.sl or libm.a library, x is the letter m:
The linker searches libraries in the order in which they are specified on the command line (that is, the link order). In addition, libraries specified with -l are searched before the libraries that the compiler links by default.
The -l: option works just like the -l option with one major difference: -l: allows you to specify the full basename of the library to link with. For instance, -l:libm.a causes the linker to link with the archive library /usr/lib/libm.a, regardless of whether -a shared was specified previously on the linker command line. The advantage of using this option is that it allows you to specify an archive or shared library explicitly without having to change the state of the -a option. (See also “Caution When Mixing Shared and Archive Libraries”.)
For instance, suppose you use the LDOPTS environment variable (see “Passing Linker Options with LDOPTS”) to set the -a option that you want to use by default when linking. Depending on what environment you are building an application for, you might set LDOPTS to -a archive or -a shared. You can use -l: to ensure that the linker will always link with a particular library regardless of the setting of the -a option in the LDOPTS variable.
The a.out file created by the linker contains symbol table, relocation, and (if debug options were specified) information used by the debugger. Such information can be used by other commands that work on a.out files, but is not actually necessary to make the file run. ld provides two command line options for removing such information and, thus, reducing the size of executables:
These options can reduce the size of executables dramatically. They can also be used when generating shared libraries without affecting shareability.