Internationalization- and Localization-related Changes
In past releases, the multibyte routines (mblen(), mbtowc(), mbstowcs(), wctomb(), and wcstombs()) incorrectly identified certain
invalid characters as valid for the ko_KR.eucKR, zh_CN.hp15CN, zh_TW.big5, and zh_TW.ccdc locales.
The locale definitions and method libraries have been modified so that
these locales recognize only 7-bit ASCII single-byte values and
the following two-byte values as valid characters.
Table 5-1 Two-byte Values Recognized by Asian Locales
| Locale | First Byte | Second Byte |
|---|---|---|
| ko_KR.eucKR | 0xa1-0xfe | 0xa1-0xfe |
| zh_CN.hp15CN[1] | 0xa1-0xfe | 0xa1-0xfe |
| | 0xfb | 0x3f-0x7e |
| | 0xfc-0xfe | 0x21-0x7e |
| zh_TW.ccdc | 0xa1-0xfe | 0x21-0x7e, 0xa1-0xfe |
| zh_TW.big5 | 0x81-0xfe | 0x40-0x7e, 0xa1-0xfe |
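The ranges in Table 5-1 can be read as a simple lookup. The following is a minimal sketch in C, not the system implementation: the names byte_range, pair_rule, rules, and two_byte_value_is_valid are hypothetical, and the zh_CN.hp15CN entry is split into three rules on the assumption that each first-byte range pairs with the second-byte range listed in the same position.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Inclusive range of byte values. */
struct byte_range { unsigned char lo, hi; };

/* One row of Table 5-1: a first-byte range and up to two second-byte ranges. */
struct pair_rule {
    const char *locale;
    struct byte_range first;
    struct byte_range second[2];
    size_t nsecond;
};

/* Hypothetical encoding of Table 5-1 (illustration only). */
static const struct pair_rule rules[] = {
    { "ko_KR.eucKR",  {0xa1, 0xfe}, { {0xa1, 0xfe} }, 1 },
    { "zh_CN.hp15CN", {0xa1, 0xfe}, { {0xa1, 0xfe} }, 1 },
    { "zh_CN.hp15CN", {0xfb, 0xfb}, { {0x3f, 0x7e} }, 1 },
    { "zh_CN.hp15CN", {0xfc, 0xfe}, { {0x21, 0x7e} }, 1 },
    { "zh_TW.ccdc",   {0xa1, 0xfe}, { {0x21, 0x7e}, {0xa1, 0xfe} }, 2 },
    { "zh_TW.big5",   {0x81, 0xfe}, { {0x40, 0x7e}, {0xa1, 0xfe} }, 2 },
};

/* Returns true if (first, second) falls inside any rule listed for locale. */
bool two_byte_value_is_valid(const char *locale,
                             unsigned char first, unsigned char second)
{
    assert(locale != NULL);
    for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        const struct pair_rule *r = &rules[i];
        if (strcmp(r->locale, locale) != 0 ||
            first < r->first.lo || first > r->first.hi)
            continue;
        for (size_t j = 0; j < r->nsecond; j++)
            if (second >= r->second[j].lo && second <= r->second[j].hi)
                return true;
    }
    return false;
}
```

For example, under these assumptions two_byte_value_is_valid("zh_TW.big5", 0x81, 0x40) is true, while a second byte of 0x7f is rejected for the same locale.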
The multibyte(3C) routines now exhibit the behavior that is documented
in the man pages. Because the input methods for these languages
do not support the invalid characters, this should not impact customer applications.
Users are now able to use the multibyte routines to check for
character validity.
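As an illustration of such a validity check, the sketch below walks a buffer with mblen() and stops at the first byte sequence that the current locale rejects. The function name count_valid_chars is hypothetical, and the result depends entirely on the locale in effect when it is called.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Counts the characters in the first len bytes of s according to the
 * current locale. Returns -1 if mblen() reports an invalid or
 * truncated multibyte sequence.
 */
long count_valid_chars(const char *s, size_t len)
{
    assert(s != NULL);

    long count = 0;
    size_t i = 0;

    (void)mblen(NULL, 0);            /* reset any shift state */
    while (i < len) {
        int n = mblen(s + i, len - i);
        if (n < 0)
            return -1;               /* invalid or incomplete sequence */
        if (n == 0)
            n = 1;                   /* a null character occupies one byte */
        i += (size_t)n;
        count++;
    }
    return count;
}
```

An application would typically select the locale first, for example with setlocale(LC_CTYPE, "ko_KR.eucKR"), assuming that locale is installed on the system, and then treat a return value of -1 as a sign that the input contains bytes outside the ranges in Table 5-1.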
Limited impact on performance is expected in most cases. The
simplified Chinese locale (zh_CN.hp15CN) user-defined character (UDC) range ([0xfb,0x3f-0x7e] and
[0xfc-0xfe,0x21-0x7e]) requires additional checks that cause the
routines to be slower for UDCs than for the other character ranges.
No applications conforming to the documented behavior are
affected. HP has made every attempt to verify that no applications
depend on the prior behavior.
You can also create and use user-defined locales
and method libraries instead of the system-provided locales.