HP-UX 11.0 - 11i Internationalization Features White Paper > Chapter 2 Encoding Characters

Converting Between Encodings
This release contains defect fixes for incorrect character mappings. The corrections concern the Simplified Chinese, Traditional Chinese, Japanese, and Korean characters of HP-UX. The corrected converter mappings improve interoperability when converted character data is sent to or received from Unicode-aware systems.

A patch corrects an incorrect character mapping that occurs when converting between hp15CN and Unicode (UCS2)/UTF-8 for Simplified Chinese. Specifically, the Simplified Chinese character Double Vertical Line was being mapped to the Parallel To character, which is a different character. Table 2-19 “Changes in iconv Tables for Simplified Chinese” summarizes the change applied to the iconv tables:

Table 2-19 Changes in iconv Tables for Simplified Chinese
The hp15CN=ucs2 and ucs2=hp15CN iconv converter tables are affected. These tables are shared by both UCS2 and UTF-8 conversions.

No compatibility problems are anticipated. However, if compatibility concerns arise for persistent data stored in either Unicode (UCS2) or UTF-8 on an HP-UX system, a simple conversion script can search for each occurrence of an incorrect value in either UCS2 or UTF-8 and convert it to the correct value, based on the mapping in Table 2-20 “Mapping Between Old and New Unicode Characters for Simplified Chinese”.

Table 2-20 Mapping Between Old and New Unicode Characters for Simplified Chinese
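As a rough illustration of the kind of conversion script described above, the following Python sketch replaces old code points with corrected ones in stored text. It is not an HP-supplied tool. The single mapping entry comes from the standard Unicode code points for the two characters named in the text (U+2225 PARALLEL TO, U+2016 DOUBLE VERTICAL LINE) and should be checked against Table 2-20 before use. The function names are hypothetical, and Python's `utf-16-be` codec is used as an assumed stand-in for big-endian UCS2.

```python
# Illustrative mapping: incorrect (old) code point -> corrected (new) code point.
# U+2225 PARALLEL TO was produced where U+2016 DOUBLE VERTICAL LINE was intended.
# Confirm these values against Table 2-20 before running this on real data.
OLD_TO_NEW = {0x2225: 0x2016}


def remap_text(text, mapping=OLD_TO_NEW):
    """Return text with each old code point replaced by its corrected value."""
    return text.translate(mapping)


def remap_bytes(data, encoding, mapping=OLD_TO_NEW):
    """Decode stored bytes, remap the code points, and re-encode them.

    encoding is a Python codec name: "utf-8" for UTF-8 data, or "utf-16-be"
    as an assumed stand-in for big-endian UCS2. Check the byte order and any
    byte-order mark in the real data first.
    """
    return remap_text(data.decode(encoding), mapping).encode(encoding)
```

Because the converter tables are shared by UCS2 and UTF-8, the same code-point mapping serves both storage forms; only the codec name changes. A blanket replacement like this also rewrites any legitimate occurrences of the old code point, so it is safest to run it on a copy of the data and review the result. The same functions could be used with mappings taken from the Traditional Chinese and Japanese tables.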
A patch corrects several incorrect character mappings that occur when converting between Big-5/EUC and Unicode (UCS2)/UTF-8 for Traditional Chinese. For conversions between big5 and UCS2/UTF-8, the Ideographic Space character was missing from the Unicode conversion table:

Table 2-21 Changes in iconv Tables for big5/Unicode
The following table summarizes the changes applied for conversions between eucTW and UCS2.

Table 2-22 Changes in iconv Tables for eucTW/Unicode
The iconv conversions between eucTW and UCS2 or UTF-8 may be affected. Big-5 conversions with UCS2/UTF-8 are not directly impacted, since only a missing table entry has been added. The eucTW=ucs2, ucs2=eucTW, big5=ucs2, and ucs2=big5 iconv converter tables are affected. These tables are shared by both UCS2 and UTF-8 conversions.

No compatibility problems are anticipated. However, if compatibility concerns arise for persistent data stored in either Unicode (UCS2) or UTF-8 on an HP-UX system, a simple conversion script can search for each occurrence of an incorrect value in either UCS2 or UTF-8 and convert it to the correct value, based on the mappings in Table 2-23 “Mapping Between Old and New Unicode Characters for Traditional Chinese”.

Table 2-23 Mapping Between Old and New Unicode Characters for Traditional Chinese
A patch corrects four incorrect Japanese character mappings that occur when converting between Shift-JIS/EUC and Unicode (UCS2)/UTF-8. The following table summarizes the changes applied.

Table 2-24 Changes in iconv Tables for Japanese
The affected iconv conversions are those between sjis and UCS2 or UTF-8 and those between eucJP and UCS2 or UTF-8. The sjis=ucs2, ucs2=sjis, eucJP=ucs2, and ucs2=eucJP iconv conversion tables are affected. These tables are shared by both UCS2 and UTF-8 conversions.

No compatibility problems are anticipated. However, if compatibility concerns arise for persistent data stored in either Unicode (UCS2) or UTF-8 on an HP-UX system, a simple conversion script can search for each occurrence of an incorrect value in either UCS2 or UTF-8 and convert it to the correct value, based on the mappings in Table 2-25 “Mapping Between Old and New Unicode Characters for Japanese”.

Table 2-25 Mapping Between Old and New Unicode Characters for Japanese
A patch provides a defect fix to address standards nonconformance for Korean Unicode (UCS2)/UTF-8 character mappings. The previously supplied Korean iconv converter tables do not conform to the Unicode 2.1 and ISO 10646 (with 1997 amendments) standards, or to the Korean national standard, KSC-5700. The previous mappings are considered obsolete by all of these standards organizations.

The fix provides a set of standards-conformant iconv converter tables for converting between eucKR and Unicode/UTF-8. Specifically, the obsolete region of 0x3d2e-0x4dff has been remapped to the 0xac00-0xd7ff region specified in Unicode 2.1 for Hangul. Without this modification, it is impossible to share data with any other system that conforms to the Unicode 2.1/ISO 10646/KSC-5700 standards.

The affected iconv conversions are any conversions between eucKR and UCS2 or UTF-8. The iconv conversion tables affected by this modification are eucKR=ucs2 and ucs2=eucKR. These tables are shared by both UCS2 and UTF-8 conversions.

No compatibility problems are anticipated. However, if compatibility concerns arise for persistent data stored in either Unicode (UCS2) or UTF-8 on an HP-UX system, it is recommended that the previously installed ucs2=eucKR table be saved and renamed prior to installation of this fix. Persistent data can then be converted back to eucKR using the old table and then reconverted to the correct Unicode/UTF-8 representation using the new table.

New iconv converters have been introduced to allow for greater interoperability of data sharing within Japanese computing environments. The following items are related to this change:
See “Greek Euro Support [11i v1.6]” for detailed information on the iconv enhancements for Greek Euro Support.

Mainframe iconv converters between ShiftJIS/eucJP/UCS2 and NEC-JIPS/Hitachi-KEIS/Fujitsu-JEF were introduced at HP-UX 11i v1.0. This release of the mainframe iconv conversion tables includes numerous fixes for mapping errors for JIS standard characters in the basic part of those mainframe codesets. The detailed changes are described in MFConvChanges.jips, MFConvChanges.keis, and MFConvChanges.jef in the /usr/share/doc directory. In addition, this release of the mainframe iconv conversion methods includes a fix to handle an incomplete shift sequence at the end of an input buffer.

If you have already used the HP-UX 11i v1 version of the mainframe iconv converters and then use this version, the results will differ because of the fixes in the mappings for JIS standard characters. It is recommended that the previously installed tables be saved and renamed prior to installation of this release. Persistent data can then be converted back using the old tables and then reconverted to the correct representation using the new tables.

If the last character in the input buffer could be either a valid character or an incomplete shift sequence, iconv(3C) fails with errno set to EINVAL. If that character is the final one in the input file, iconv(3C) cannot return successfully unless other dummy data, such as NULL, is appended after that character. The character in question, which could be either a control character or an incomplete shift sequence, is 0x1a for jipsj, 0x3f for jipsec/jipsek, and 0xa for keis7c/keis7k/keis8c/keis8k.

No compatibility problems are anticipated. However, if compatibility concerns arise with regard to persistent data stored on an HP-UX system, it is recommended that the previously installed tables be saved and renamed prior to installation of this release. Persistent data can then be converted back using the old tables and then reconverted to the correct representation using the new tables.
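The dummy-data workaround for an ambiguous final byte, described earlier in this section, could be prepared along the following lines before the last buffer of a file is handed to the converter. This is a minimal sketch under stated assumptions, not part of HP-UX. The byte values and codeset names are the ones listed in the text, while the function names and the choice of a single NUL byte as the dummy are hypothetical. The check looks only at the last byte and does not track shift state, so it may pad in cases where padding is not strictly required.

```python
# Final bytes that the text identifies as ambiguous: each could be a control
# character or the start of a shift sequence, depending on the codeset.
AMBIGUOUS_FINAL_BYTE = {
    "jipsj": 0x1A,
    "jipsec": 0x3F,
    "jipsek": 0x3F,
    "keis7c": 0x0A,
    "keis7k": 0x0A,
    "keis8c": 0x0A,
    "keis8k": 0x0A,
}


def needs_padding(data, codeset):
    """Return True if the last input byte is the ambiguous one for codeset."""
    ambiguous = AMBIGUOUS_FINAL_BYTE.get(codeset)
    return ambiguous is not None and len(data) > 0 and data[-1] == ambiguous


def pad_final_buffer(data, codeset, dummy=b"\x00"):
    """Append a dummy byte to the last buffer of a file when it is needed.

    Returns (padded_data, was_padded). The caller is responsible for dropping
    the converted dummy character from the converter's output afterwards.
    """
    if needs_padding(data, codeset):
        return data + dummy, True
    return data, False
```

Returning the `was_padded` flag lets the caller know whether one trailing converted character has to be removed from the output, which keeps the padding from leaking into the stored result.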