HP-UX 11.0 - 11i Internationalization Features White Paper > Chapter 2 Encoding Characters

Unicode 2.1 Support [11.0 patch, 11i v1]
HP-UX provides system-level support for the Unicode 2.1/ISO 10646 character set. Hewlett-Packard’s support for Unicode provides the basis for heterogeneous interoperability across all locales. ISO 10646 is an international standard that defines a single encoding for all the world’s characters; Unicode 2.1 is its companion specification. Unicode support conforms to existing X/Open (The Open Group), POSIX, ISO C, and other relevant UNIX-based standards.

HP-UX 11.0 supports Unicode/ISO 10646 by using the UTF-8 (UCS Transformation Format, 8-bit form) representation for persistent storage. UTF-8 is an industry-recognized 8-bit multibyte representation of Unicode. It allows data to be transmitted over 8-bit networking protocols and to be stored and retrieved safely within a historically byte-oriented operating system such as HP-UX.

For internal processing, HP-UX uses the four-octet (32-bit) canonical form specified in ISO 10646. This maintains parity with the current HP-UX wchar_t implementation, which is based on a 32-bit representation.

Full system-level support is available for all locales provided in the release. For more information on the Unicode features of the Asian System Environment, refer to the /usr/share/doc/ASX-UTF8 directory.

The following tables display a select subset of the locale binaries that are provided for 32-bit application processing:

Table 2-13 Base utf8 Locales for 32-bit Application Processing
Table 2-14 European utf8 Locales for 32-bit Application Processing
Table 2-15 Asian utf8 Locales for 32-bit Application Processing
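To make the two representations described above concrete, the following is a minimal sketch in Python (not HP-UX code) that shows the same characters in the UTF-8 storage form and in a four-octet form. The helper name `show_forms` is hypothetical, and Python's `utf-32-be` codec is used here only as a stand-in for the four-octet canonical form; the byte order of wchar_t values in memory on a given system is not implied.

```python
def show_forms(text):
    """Return (character, UTF-8 hex, four-octet hex) for each character."""
    rows = []
    for ch in text:
        utf8 = ch.encode("utf-8")       # variable-length multibyte storage form
        ucs4 = ch.encode("utf-32-be")   # fixed four-octet form, big-endian
        rows.append((ch, utf8.hex(), ucs4.hex()))
    return rows


# LATIN CAPITAL LETTER A, LATIN SMALL LETTER E WITH ACUTE, EURO SIGN
for ch, utf8_hex, ucs4_hex in show_forms("A\u00e9\u20ac"):
    print(f"U+{ord(ch):04X}  UTF-8={utf8_hex}  four-octet={ucs4_hex}")
```

The UTF-8 column grows from one to three bytes across these characters, while the four-octet column stays at a fixed width. That fixed width is what makes the 32-bit form convenient for internal processing.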
To enable Unicode support in applications, set the locale environment variable (for example, LANG) to the desired utf8 locale. Locales are installed based on the language filesets already installed on the target system. For example, if the system uses International German, the German Unicode locale (de_DE.utf8) is installed.

Source files for all supported locales (34 total) are also supplied for 64- or 32-bit applications. To build Unicode locales, use the localedef command; refer to the localedef(1M) man page. Systems must have the kernel parameters MAXDSIZ, MAXTSIZ, and SHMMAX set to at least 100 MB to ensure adequate swap space for a successful localedef compilation of these locales.

This release provides expanded Unicode support to align the character repertoire with the ISO 8859-15 locales that are being provided for euro support. This ensures full interoperability with the newly added support for the ISO 8859-15 codeset. Specific enhancements allow euro display and input through Xlib and new fonts.

Unicode support requires additional disk space depending on the language used. The base Unicode offering installed on all systems is approximately 10 MB. The following tables provide the size requirements for specific languages.

Table 2-16 Unicode European Locales and Localized Files
Table 2-17 Unicode Asian Locales and Localized Files
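Because a given utf8 locale is installed only when the matching language fileset is present, an application may want to check that a locale is usable before relying on it. The following is a minimal sketch in Python, not an HP-supplied interface: the helper name `select_locale` and the fallback to the "C" locale are assumptions made for illustration, and de_DE.utf8 is simply the example locale named earlier.

```python
import locale


def select_locale(candidates):
    """Try each locale name in order and return the first one the system accepts.

    Returns None if no candidate can be activated.
    """
    for name in candidates:
        try:
            locale.setlocale(locale.LC_ALL, name)
            return name
        except locale.Error:
            # Locale not installed (or not recognized) on this system.
            continue
    return None


# Prefer the German Unicode locale; fall back to the portable "C" locale.
chosen = select_locale(["de_DE.utf8", "C"])
print("active locale:", chosen)
```

Passing an empty string to `locale.setlocale` instead of an explicit name makes the program take its locale from the environment, which is the usual way to honor a user's LANG setting.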
Applications using Unicode support should see performance comparable to that of other multibyte codesets. Applications moving from a single-byte codeset to Unicode will see some performance impact for some types of character-based operations.

UTF-8 is supported on the Streams PTY driver line discipline (ldterm) module. The user does not interact with the Streams PTY driver directly; it runs underneath the dtterm window. The Streams PTY driver is responsible for providing a UTF-8 communication channel, while dtterm is responsible for processing the UTF-8 code and displaying the characters on the screen. Refer to the eucset(1) and ldterm(7) man pages and the lp(1) model script for details.
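The source does not describe how ldterm handles UTF-8 internally. As a general illustration of why byte-oriented terminal processing needs to be aware of multibyte characters, the sketch below removes the last complete character from a UTF-8 byte buffer, as an erase operation would. The function name `erase_last_char` is hypothetical, and the code assumes the buffer holds well-formed UTF-8.

```python
def erase_last_char(buf: bytes) -> bytes:
    """Return buf with its final UTF-8 character removed.

    Assumes buf is well-formed UTF-8. Continuation bytes have the bit
    pattern 10xxxxxx, so we step back over them to reach the lead byte.
    """
    if not buf:
        return buf
    i = len(buf) - 1
    while i > 0 and (buf[i] & 0xC0) == 0x80:
        i -= 1
    return buf[:i]


line = "A\u20ac".encode("utf-8")   # 'A' followed by the three-byte EURO SIGN
print(erase_last_char(line))       # only the byte for 'A' remains
```

Dropping a single byte here would leave a truncated sequence in the buffer, which is the kind of error that character-aware processing avoids.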