HP aC++ Version A.03.55 Release Notes

MPN: 5990-6778
March 2004

Copyright (c) 2004 Hewlett-Packard Company

________________________________________________________________________________

Legal Notices

Reproduction, adaptation, or translation without prior written permission is prohibited, except as allowed under the copyright laws.

Hewlett-Packard makes no warranty of any kind with regard to this document, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. Hewlett-Packard shall not be liable for errors contained herein nor direct, indirect, special, incidental or consequential damages in connection with the furnishing, performance, or use of this material.

Information in this publication is subject to change without notice.

Corporate Offices:

    Hewlett-Packard Development Company L.P.
    20555 S.H. 249
    Houston, Texas 77070
    U.S.A

Use, duplication or disclosure by the U.S. Government Department of Defense is subject to restrictions as set forth in paragraph (b)(3)(ii) of the Rights in Technical Data and Software clause in FAR 52.227-7013.

Rights for non-DOD U.S. Government Departments and Agencies are as set forth in FAR 52.227-19(c)(1,2).

Use of this document and flexible disc(s), compact disc(s), or tape cartridge(s) supplied for this pack is restricted to this product only. Additional copies of the programs may be made for security and back-up purposes only. Resale of the programs in their present form or with alterations, is expressly prohibited.

A copy of the specific warranty terms applicable to your Hewlett-Packard product and replacement parts can be obtained from your local Sales and Service Office.

(c) Copyright 1980, 1984, 1986 AT&T Technologies, Inc.

UNIX and System V are registered trademarks of AT&T in the USA and other countries.

UNIX is a registered trademark in the United States and other countries, licensed exclusively through X/Open Company Limited.
PostScript is a trademark of Adobe Systems, Inc.

(c) Copyright 1985-1986, 1988 Massachusetts Institute of Technology.

X Window System is a trademark of the Massachusetts Institute of Technology.

________________________________________________________________________________

Preface

This document discusses the following HP aC++ compiler topics:

    * Supported features
    * Known Issues
    * Related documentation

Features introduced in prior release versions are also listed and grouped by the compiler version number.

Note: The software code in the title of this document indicates the software product version at the time of release. Some product and operating system changes do not require changes to documentation; therefore, do not expect a one-to-one correspondence between these changes and release notes updates.

Latest printing: March, 2004

Reporting Problems: If you have any problems with the software or documentation, please contact your local Hewlett-Packard Sales Office or Customer Service Center.

________________________________________________________________________________

Chapter 1 Features

________________________________________________________________________________

This chapter summarizes the features included in this version of the HP aC++ compiler. The compiler supports much of the ISO/IEC 14882 Standard for the C++ Programming Language (the international standard for C++).

HP aC++ provides a variety of performance related options, in addition to the options described in these release notes. See the "Performance" section of the "HP aC++ Online Programmer's Guide" for full documentation. Chapter 3 of these release notes explains how to access the guide.

1.1 New Features in the A.03.55 Release
=======================================

HP aC++ version A.03.55 supports the following new features:

    o -notrigraph Option
    o NO_SIDE_EFFECTS Pragma

1.1.1 -notrigraph Option
------------------------

This option inhibits the processing of trigraphs.
In previous versions, the -notrigraph option caused the legacy preprocessor to be invoked, which ignored trigraphs. The compiler, however, still interpreted the trigraphs in the preprocessed source. In this version of aCC, the -notrigraph option does not invoke the legacy preprocessor and also prevents the trigraphs from being interpreted.

This option is not recommended. The proper portable solution is to quote the "?" as "\?".

1.1.2 NO_SIDE_EFFECTS Pragma
----------------------------

This pragma states that each named function, and all the functions that it calls, will not modify any of a program's local or global variables. This pragma provides additional information to the optimizer, which results in more efficient code.

Syntax:

    #pragma NO_SIDE_EFFECTS functionname1, ..., functionnameN

Example:

    #pragma NO_SIDE_EFFECTS foo   // where foo is name of a function.

1.2 New Features in the A.03.50 Release
=======================================

HP aC++ version A.03.50 supports the following new features:

    - Precompiled Header (PCH) feature fully supported under -AA
    - More robust support for Debugging Optimized Code (DOC)
    - +O[no]clone Option
    - +O[no]memory[=malloc]
    - Cache line length change for PA8800
    - Improved optimization of exception handling code sequences at +O2
    - 'restrict' keyword
    - Increased +O3/+O4 robustness with aCC
    - Support for gdb steplast feature in aCC
    - +Olit=[all|none] Option
    - Dynamic unloading of C++ runtime shared library libCsup
    - Pragma INIT and pragma FINI in 32-bit mode

1.2.1 Precompiled Header (PCH) feature fully supported under -AA
----------------------------------------------------------------

Precompiled Header (PCH) is now fully supported with the -AA option. This means that the PCH feature can be used with the STL (Standard Template Library).

1.2.2 Debugging Optimized Code (DOC)
------------------------------------

Debugging of optimized code (at optimization level +O2) is now more robust. Debugging of template functions is much improved.

1.2.3 +O[no]clone
-----------------

This option lets the user turn the cloning feature of the optimizer on [off]. It is primarily for users who find that extensive cloning adversely affects the performance of their code and who want more control over cloning. Cloning is on by default, and is valid at optimization levels +O3 and +O4. When inlining is turned off, cloning is turned off too.

1.2.4 +O[no]memory[=malloc]
---------------------------

Enable [disable] memory optimizations. Specifying 'malloc' in the list will enable [disable] optimizations which consolidate memory allocation procedure calls. This option is disabled by default. This option is incompatible with +Oopenmp and +Oparallel, and is ignored when either of them is specified.

1.2.5 Improved prefetching and data locality for PA8800
-------------------------------------------------------

Taking advantage of the increased cache line length of the PA8800 processor (128 bytes), the compiler generates better code with improved data prefetching and data locality. This may help improve the performance of loop intensive applications.

1.2.6 Improved optimization of exception handling code sequences at
optimization level +O2 with +Oexception Option
-----------------------------------------------------------------------

The compiler now performs much more robust optimization in and around the code regions containing try/catch constructs. This is expected to provide a performance boost to C++ applications with a large amount of exception handling. This can be turned on with the option +Oexception.

1.2.7 "restrict" keyword
------------------------

This is a C99 feature. This keyword tells the optimizer that variables declared as restrict cannot have aliases (using pointers). Thus the optimizer can do better alias analysis. As of the current release, only the keyword is supported, without any accompanying optimizations.
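
The no-alias promise described in 1.2.7 can be illustrated with a minimal sketch. The names RESTRICT and scale are hypothetical and do not come from these release notes. The sketch uses the __restrict spelling (listed as recognized in section 1.3.3, and accepted by several other compilers) so that it stays compilable as C++ elsewhere; whether your compiler accepts the plain restrict spelling in C++ mode is an assumption to verify.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical portability macro: expands to a restrict-style qualifier
// where one is known to exist, and to nothing otherwise.
#if defined(__HP_aCC) || defined(__GNUC__) || defined(_MSC_VER)
#  define RESTRICT __restrict
#else
#  define RESTRICT
#endif

// The caller promises that dst and src never refer to overlapping storage.
// An optimizer that acts on the qualifier may keep src values in registers
// across the stores to dst.
void scale(double* RESTRICT dst, const double* RESTRICT src,
           std::size_t n, double k)
{
    assert(dst != src);  // cheap check of the no-alias precondition
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = k * src[i];
}
```

Calling scale(out, in, 3, 2.0) on two distinct three-element arrays stores each element of in, doubled, into out. Passing the same array for both arguments breaks the promise made by the qualifier.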

1.2.8 Increased +O3/+O4 robustness with aCC
-------------------------------------------

Robustness and usability of optimization levels +O3/+O4 have been improved for C++ applications. This is expected to provide performance benefits to user applications written in C++.

1.2.9 Support for gdb steplast
------------------------------

In order to use the new 'steplast' command of 'gdb', C++ programs must be built with the -g0 option only.

NOTE: Because of the extra debug information emitted to support this feature, minor compatibility issues are expected when using DDE. Specifically, if you receive the following message from within DDE when you have built using -g0:

    ?(dde/ui_line) File ".../test.c" has only NNN lines.
    Stopped at: \\test\main\134217746 (00002404)

you can turn off the extra debug information by setting the environment variable aCC_ENABLE_STEPLAST to OFF.

    $ export aCC_ENABLE_STEPLAST=OFF

1.2.10 +Olit=[all|none] Option
------------------------------

The +Olit option specifies the type of data items placed in the read-only data section. +Olit can take the values 'all' and 'none'.

+Olit=all places all string variables and all const-qualified variables that do not require load-time or run-time initialization in the read-only data section.

If +Olit=none is specified, no constants are placed in the read-only data section.

1.2.11 Dynamic unloading of C++ runtime shared library libCsup
--------------------------------------------------------------

It is safe to dynamically load and unload C++ shared libraries that directly or indirectly depend on shared library libCsup. It is no longer necessary to specify -lCsup on the link line while building a non-C++ main executable.

1.2.12 Pragma INIT and Pragma FINI in 32-bit mode
-------------------------------------------------

Pragmas INIT and FINI now work in 32-bit mode too. The functionality of both pragmas is similar to their functionality in 64-bit mode.
See the aCC online help (aCC +help) for more information.

Patches Required
================

The following patches must be installed in order to enable all these new features:

For HP-UX B.11.00:

    PHSS_28879 (aC++ runtime)
    PHSS_28869 (linker)

For HP-UX B.11.11:

    PHSS_28880 (aC++ runtime)
    PHSS_28871 (linker)

1.3 New and Changed Features in version A.03.37
===============================================

New features in HP aC++ version A.03.37 are listed below. They apply to HP-UX 11.x operating systems.

    - New Rogue Wave Tools.h++ library, Version 7.1.1, compatible with -AA
    - UTF-16 character transformation format support
    - __restrict keyword support
    - +ub and +sb options to control the signedness of bitfields
    - ANSI C++ covariant Return Type
    - Improved support for using PCH (Precompiled Headers) with -AA option
    - Improved support for #pragma pack and #pragma align
    - Improved DOC (Debug Optimized Code) support
    - Performance Improvements to -AA iostream
    - Thread mutex contention fix on null strings with -AP

1.3.1 Rogue Wave Tools.h++ Version 7.1.1 compatible with -AA
-------------------------------------------------------------

Rogue Wave Tools.h++ library version 7.1.1 can now be used with the -AA option, that is, it can be used with the Standard C++ Library 2.1.1. Note that the earlier Tools.h++ library version 7.0.6 could not be used with -AA.

1.3.2 UTF-16 character transformation format support
-----------------------------------------------------

The current compiler supports only ASCII strings or characters (8 bit chars with no transliteration) as UTF-16. UTF-16 is described in the Unicode Standard, version 3.0 [UNICODE]. The definitive reference is Annex Q of ISO/IEC 10646-1 [ISO-10646].

Any string or character which is preceded by 'u' is recognized as a UTF-16 literal or character and is stored as an unsigned short type.

Example:

    #define _UTF16(x) u##x
    #define UTF16(y) _UTF16(#y)
    typedef unsigned short utf16_t;
    utf16_t *utf16_str = UTF16(y);   // u"y"
    int size = sizeof(u't');         // size of 2 bytes

1.3.3 __restrict keyword support
---------------------------------

The __restrict keyword is now recognized by the compiler. Refer to the description of the C99 restrict type-qualifier keyword in ISO/IEC 9899:1999 (6.7.3).

1.3.4 +ub and +sb options - to control the signedness of bitfields
------------------------------------------------------------------

The +ub option treats unqualified bit fields as unsigned. The +sb option treats unqualified bit fields as signed. The +uc option overrides the +sb option for char bit fields. Note that in 64-bit mode, the +sb option is set by default, to match HP C.

1.3.5 ANSI C++ covariant return Type
-------------------------------------

With this release, the covariant return type feature is fully supported. The return type of an overriding function can be a pointer or reference to a class derived from the class named in the return type of the overridden base class function.

Example 1:

    class BaseClass {
    public:
        virtual BaseClass* foo();
    };

    class DerivedClass : public BaseClass {
    public:
        DerivedClass* foo();
    };

Example 2:

    class BaseClass_1 {
    public:
        virtual BaseClass_1* foo();
    };

    class BaseClass_2 {
    public:
        virtual BaseClass_2* goo();
    };

    class DerivedClass : public BaseClass_1, BaseClass_2 {
    public:
        DerivedClass* goo();
    };

Note: No debugger support for covariant return types. HP WDB3.1 doesn't support covariant return types, so gdb can't "step into" a covariant function. However, setting a breakpoint at a covariant function and running to it works fine. The debugger shows the internal compiler-generated function when a user does a "backtrace", "finish", or "return" in gdb at a covariant function.
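
The declarations in 1.3.5 can be complemented by a minimal sketch of how covariant overrides are typically used. The class and function names below (Shape, Square, clone, sides_of_copy, edge_of_copy) are hypothetical and are not part of these release notes. The sketch assumes only standard C++ behavior: a call through the base class still dispatches to the derived override, while a call through the derived type yields the derived pointer without a cast.

```cpp
#include <cassert>

class Shape {
public:
    virtual ~Shape() {}
    virtual Shape* clone() const { return new Shape(*this); }
    virtual int sides() const { return 0; }
};

class Square : public Shape {
public:
    // Covariant override: returns Square* where the base declares Shape*.
    virtual Square* clone() const { return new Square(*this); }
    virtual int sides() const { return 4; }
    int edge() const { return 1; }  // member that exists only in Square
};

// Copies through the base interface; the derived override is still called.
int sides_of_copy(const Shape& s)
{
    Shape* copy = s.clone();
    int n = copy->sides();
    delete copy;
    return n;
}

// With a static type of Square, clone() yields Square* directly,
// so Square-only members can be used without a downcast.
int edge_of_copy(const Square& sq)
{
    Square* copy = sq.clone();
    int e = copy->edge();
    delete copy;
    return e;
}
```

Without covariant return types, Square::clone would have to return Shape*, and edge_of_copy would need an explicit cast to reach edge().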

1.3.6 Improved support for PCH (Precompiled Headers) with -AA
--------------------------------------------------------------

Support for using the precompiled header (PCH) feature with the -AA option has been improved. A significant number of problems have been addressed since the previous release. Note that this feature is not fully supported in -AA mode; unexpected compile-time problems may occur.

1.3.7 Improved support for #pragma pack and #pragma align
---------------------------------------------------------

Please see the HP aC++ Online Programmer's Guide at http://docs.hp.com for more details.

1.3.8 Improved DOC (Debug Optimized Code) support
-------------------------------------------------

The ability to debug optimized C++ code (DOC) has been improved significantly in this release. To use these improvements, set the environment variable aCC_DOC_MODE to ON.

Example:

    $ cat sample.C
    #include <stdio.h>
    int x = 1;
    int main()
    {
        int j = 4;
        printf("we are here:%d:\n", j);
    }
    $ aCC_DOC_MODE=ON aCC -g -O sample.C

Now, with the improved DOC, while debugging the above sample program you can display the correct value of the local variable 'j'.

[Note that in further releases, the above environment variable will be automatically set by the compiler].

1.3.9 Performance Improvements to -AA iostream
----------------------------------------------

Standard C++ Iostreams have been further tuned to improve the performance of I/O. In some cases, the resulting performance may be comparable to that of the old iostream.h library (i.e., -AP).

1.3.10 Thread mutex contention fix on null strings with -AP
-----------------------------------------------------------

Using the string template (with -AP) in a threaded environment may result in excessive contention on a single null string mutex. This is because a single null string object is used for default initialization and string modifications.
This fix is enabled with:

    -D__HPACC_THREAD_NULL_STRING

Note: There is a very small chance that mixing objects or libraries compiled with and without -D__HPACC_THREAD_NULL_STRING will lead to incompatibilities. This is because the new implementation sets the null string reference count to INT_MAX/2, whereas the old implementation would increment or decrement the reference count. As a result, the reference count may incorrectly go to 0 and the null string object may get deleted.

Required Patches
================

The following patches must be installed after installing A.03.37 in order to enable all the new features.

For HP-UX 11.00:

    PHCO_24723 (libc)
    PHCO_23792 (libpthread)
    PHSS_24303 (linker)
    PHSS_26945 (aC++ runtime)
    PHSS_25028 (libomp)

For HP-UX 11.11:

    PHCO_24400 (libc)
    PHCO_23846 (libpthread)
    PHSS_24304 (linker)
    PHSS_26946 (aC++ runtime)
    PHSS_25029 (libomp)

1.4 Version A.03.33 Features
============================

New features in HP aC++ version A.03.33 are listed below. They apply to HP-UX 11.x operating systems.

    - OpenMP Standard supported
    - Changes to Small Block Allocator (SBA) for malloc
    - Gather/Scatter Prefetching pragma
    - Support for SDK/XDK (cross-compilation)
    - Support for _declspec
    - aCC_MAXERR to control maximum number of compiler errors
    - +Oprofile option for Profile Based Optimization
    - Improved optimization for HP_LONG_RETURN and +DA1.1
    - Initialized Thread Local Storage
    - +O[no]inline=list
    - -I- option enhanced to perform prefixinclude search

Note: The new features OpenMP, _declspec, prefetch pragma, -I-, and initialized TLS are currently not available on the IPF (Itanium Processor Family) compilers.

1.4.1 OpenMP Standard Supported
-------------------------------

This release introduces full support for version 1.0 of the "OpenMP C and C++ Application Program Interface". This specification is available at http://www.openmp.org/specs.

To enable recognition of OpenMP pragmas, use the "+Oopenmp" command line option when invoking aCC. This option is effective at any optimization level.

Note: Currently +Onoparallel does not affect the OpenMP pragmas in the source but still disables +Oautopar.

Because multithreading is involved, -mt must also be used with +Oopenmp. (Otherwise runtime aborts may occur, especially with -AA.)

OpenMP programs require the libomp and libcps runtime support libraries to be present on both the compilation and runtime system(s). The compiler driver will automatically include them when linking. These libraries are installed by applying the appropriate patches:

    PHSS_25028 - for 11.x prior to 11.11
    PHSS_25029 - for 11.11 and greater

It is recommended that you use the -N option when linking OpenMP programs to avoid exhausting memory when running with large numbers of threads.

For this first release of aCC containing OpenMP, some debugging position information for OpenMP constructs may not be accurate. In addition, symbols marked with the "threadprivate" pragma may not be visible to the debugger. To work around this limitation, use the "__thread" storage class specifier in the symbol declaration instead.

    #if defined(__HP_aCC) && !defined(__THREAD)
    #define __THREAD __thread
    #else
    #define __THREAD
    #endif
    __THREAD int tprvt;
    #pragma omp threadprivate(tprvt)

OpenMP is also supported in aC++'s ANSI C mode (-Ae).

OpenMP Known Problems:

Initialization of firstprivate variables is erroneously done after calculation of the loop iteration count. As a result, loops with iteration counts that depend on the value of firstprivate variables will execute incorrectly.

Example:

    int n = 100;
    #pragma omp for firstprivate(n)
    for (int i = 0; i < n; i++) {
        // Loop executes an indeterminate number of times because
        // private copy of n is not initialized prior to calculation
        // of loop iteration count.
    }

1.4.2 aCC_MAXERR to control maximum number of compiler errors
-------------------------------------------------------------

The aCC_MAXERR environment variable allows you to set the maximum number of errors you want the compiler to report before it terminates compilation. The current default is 12, but you can set it to any number greater than zero.

The compiler may not be able to recover from all errors, in which case it may still display:

    445 Cannot recover from earlier errors

instead of

    699 Error limit reached: halting compilation

Example:

The following increases the maximum to 100 errors.

    $ export aCC_MAXERR=100
    $ aCC -c buggy.c

o Small Block Allocator for malloc

The aC++ runtime now automatically enables malloc's Small Block Allocator (SBA) after the aCC runtime patch and libc patch appropriate for your system are installed. (See the Required Patches section above.) This improves heap performance. See malloc(3) and mallopt(3).

The default values are:

    M_MXFAST = 512 bytes
    M_NLBLKS = 100
    M_GRAIN  = 8 bytes

If you want to change the defaults, the environment variable _M_SBA_OPTS can be used. The format is:

    export _M_SBA_OPTS=::

If your existing application is already calling mallopt, then mallopt will likely return an error because libCsup will have already called mallopt and allocated a small block by the time the application calls mallopt. If the above defaults are acceptable or you are already using _M_SBA_OPTS, then the error should just be ignored. If the defaults would degrade performance, then either set _M_SBA_OPTS with the values used by the application or disable this new feature by using the following:

    export _M_SBA_OPTS=0:0:0

Applications with latent memory overruns may fail. If the application allocates a block that is too small while SBA is disabled, the block may be padded such that an overrun of the end of the allocated block might not cause a failure.
But with SBA enabled, the next contiguous bytes may have been used for control information and an overrun would corrupt the heap and cause various aborts.

1.4.3 Gather/Scatter Prefetch pragma
------------------------------------

A pragma is now supported to prefetch specified cache lines. The behavior of this pragma is similar to +Odataprefetch but the prefetch pragma can access specific elements in indexed arrays that are stored in cache. In addition, any valid lvalue can be used as an argument, but the intent of the pragma is to support array processing.

Syntax:

    #pragma prefetch

There can be only one argument per pragma. The compiler generates instructions to prefetch the cache lines starting from the address given in the argument. The array element values prefetched must be valid. Reading outside the boundaries of an array results in undefined behavior at runtime.

Example:

The function below will prefetch ia and b, but not a[ia[i]] when compiled with +O2 +Odataprefetch +DA2.0 (or +DA2.0W).

    void testprefc2(int n, double *a, int *ia, double *b)
    {
        for (int i=0; i