Type: Bug
Resolution: Fixed
Priority: P4
Affects Version/s: 1.1.4, 1.2.0, 1.2.2, 1.3.0
Fix Version/s: kestrel
CPU: generic, x86, sparc
OS: generic, solaris_2.6, windows_95, windows_nt
Name: bb33257 Date: 12/23/98
The converters for Cp930 and Cp939 (Japanese EBCDIC) have a number of problems
dealing with the "ITAIJI" characters. (At least that's what I think they're
called, but I'm not a Japanese speaker...)
The broken conversions, and the corrections, are as follows:
ByteToCharCp939, ByteToCharCp930 ITAIJI mapping
Host EBCDIC to Unicode
Current mapping     Correct mapping
x52EC -> u4FE0      x52EC -> u4FA0
x5481 -> u525D      x5481 -> u5265
x54D4 -> u555E      x54D4 -> u5516
x547D -> u5699      x547D -> u565B
x5190 -> u56CA      x5190 -> u56A2
x4F5E -> u5861      x4F5E -> u586B
x5443 -> u5C5B      x5443 -> u5C4F
x55C0 -> u5C62      x55C0 -> u5C61
x54CD -> u6451      x54CD -> u63B4
x54A3 -> u6414      x54A3 -> u63BB
x5B72 -> u6522      x5B72 -> u6505
x5BFE -> u688E      x5BFE -> u688D
x5550 -> u7006      x5550 -> u6D9C
x54FA -> u6F51      x54FA -> u6E8C
x53EE -> u7130      x53EE -> u7114
x54A4 -> u7626      x54A4 -> u75E9
x5553 -> u79B1      x5553 -> u7977
x54CA -> u7C1E      x54CA -> u7BAA
x60F1 -> u7E48      x60F1 -> u7E66
x5373 -> u7E6B      x5373 -> u7E4B
x52DA -> u7E61      x52DA -> u7E4D
x61B0 -> u8141      x61B0 -> u80FC
x52C9 -> u840A      x52C9 -> u83B1
x53F8 -> u8523      x53F8 -> u848B
x53E8 -> u87EC      x53E8 -> u8749
x52A1 -> u881F      x52A1 -> u874B
x5353 -> u8EC0      x5353 -> u8EAF
x51FA -> u91B1      x51FA -> u9197
x507F -> u91AC      x507F -> u91A4
x4EB3 -> u9830      x4EB3 -> u982C
x66C8 -> u9839      x66C8 -> u983D
x55C1 -> u985A      x55C1 -> u985B
x53DA -> u9A52      x53DA -> u9A28
x5464 -> u9DD7      x5464 -> u9D0E
x4C7D -> u9E7C      x4C7D -> u9E78
x5261 -> u9EB4      x5261 -> u9EB9
x555F -> u9EB5      x555F -> u9EBA
x446E -> uF86F      x446E -> u2116
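One way to check the corrected Host EBCDIC to Unicode rows on a given runtime is to decode a few host codes and compare the result with the "Correct mapping" column. The following is only a minimal sketch under stated assumptions: that the runtime exposes the charset under the name Cp939, and that each two-byte code has to be bracketed by the EBCDIC shift-out (0x0E) and shift-in (0x0F) bytes so the stateful decoder treats it as double-byte. The class and method names are made up for illustration, and only three rows are copied in.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper; the class and method names are not part of the JDK.
final class Cp939DecodeCheck {
    // Shift-out / shift-in bytes that bracket double-byte runs in stateful EBCDIC.
    static final byte SO = 0x0E;
    static final byte SI = 0x0F;

    // Wrap one two-byte host code as SO, high byte, low byte, SI.
    static byte[] wrap(int hostCode) {
        return new byte[] { SO, (byte) (hostCode >> 8), (byte) hostCode, SI };
    }

    // A few rows from the "Correct mapping" column: host code -> Unicode code unit.
    static Map<Integer, Integer> expected() {
        Map<Integer, Integer> m = new LinkedHashMap<>();
        m.put(0x52EC, 0x4FA0);
        m.put(0x5481, 0x5265);
        m.put(0x446E, 0x2116);
        return m;
    }

    public static void main(String[] args) {
        String name = "Cp939"; // assumed charset name; "Cp930" could be substituted
        if (!Charset.isSupported(name)) {
            System.out.println(name + " is not available in this runtime; skipping.");
            return;
        }
        Charset cs = Charset.forName(name);
        for (Map.Entry<Integer, Integer> row : expected().entrySet()) {
            int host = row.getKey();
            int want = row.getValue();
            String decoded = new String(wrap(host), cs);
            // Render whatever came back as uXXXX units for easy comparison.
            StringBuilder got = new StringBuilder();
            for (int i = 0; i < decoded.length(); i++) {
                got.append(String.format("u%04X", (int) decoded.charAt(i)));
            }
            boolean ok = decoded.length() == 1 && decoded.charAt(0) == want;
            System.out.printf("x%04X -> %s, expected u%04X: %s%n",
                    host, got, want, ok ? "match" : "MISMATCH");
        }
    }
}
```

Extending expected() with the remaining rows of the table would cover the full ITAIJI set; what the program prints depends on the converter tables shipped with the runtime it is run on.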
CharToByteCp939, CharToByteCp930 ITAIJI mapping
Unicode to Host EBCDIC
Current mapping     Correct mapping
u4FA0 -> x6F        u4FA0 -> x52EC
u5265 -> x6F        u5265 -> x5481
u5516 -> x6F        u5516 -> x54D4
u565B -> x6F        u565B -> x547D
u56A2 -> x6F        u56A2 -> x5190
u586B -> x6F        u586B -> x4F5E
u5C4F -> x6F        u5C4F -> x5443
u5C61 -> x6F        u5C61 -> x55C0
u63B4 -> x6F        u63B4 -> x54CD
u63BB -> x6F        u63BB -> x54A3
u6505 -> x6F        u6505 -> x5B72
u6805 -> x6F        u6805 -> x51F1
u688D -> x6F        u688D -> x5BFE
u6D9C -> x6F        u6D9C -> x5550
u6E8C -> x6F        u6E8C -> x54FA
u7114 -> x6F        u7114 -> x53EE
u75E9 -> x6F        u75E9 -> x54A4
u7977 -> x6F        u7977 -> x5553
u7BAA -> x6F        u7BAA -> x54CA
u7E66 -> x6F        u7E66 -> x60F1
u7E4B -> x6F        u7E4B -> x5373
u7E4D -> x6F        u7E4D -> x52DA
u80FC -> x6F        u80FC -> x61B0
u8346 -> x6F        u8346 -> x53B3
u83B1 -> x6F        u83B1 -> x52C9
u848B -> x6F        u848B -> x53F8
u8749 -> x6F        u8749 -> x53E8
u874B -> x6F        u874B -> x52A1
u8EAF -> x6F        u8EAF -> x5353
u9197 -> x6F        u9197 -> x51FA
u91A4 -> x6F        u91A4 -> x507F
u982C -> x6F        u982C -> x4EB3
u983D -> x6F        u983D -> x66C8
u985B -> x6F        u985B -> x55C1
u9A28 -> x6F        u9A28 -> x5464
u9D0E -> x6F        u9D0E -> x5464
u9E78 -> x6F        u9E78 -> x4C7D
u9EB9 -> x6F        u9EB9 -> x5261
u9EBA -> x6F        u9EBA -> x555F
u2116 -> x6F        u2116 -> x446E
Additional chars for MS Cp930,939 compatibility:
u2015 -> x444A
uFF5E -> x43A1
u2225 -> x447C
uFF0D -> x4260
uFFE4 -> x426A
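For the Unicode to Host EBCDIC direction, the x6F in the "Current mapping" column looks like the encoder's substitution byte, so a check is easier to read if the encoder is told to report unmappable characters instead of replacing them. The sketch below assumes that interpretation, assumes the charset is reachable under the name Cp939, and assumes the expected output for a single double-byte character is shift-out (0x0E), the two host bytes, then shift-in (0x0F). The class and method names are hypothetical, and only three rows are copied in (two from the table, one from the compatibility list).

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper; the class and method names are not part of the JDK.
final class Cp939EncodeCheck {
    // A few rows from the report: Unicode code unit -> two-byte host code.
    static Map<Integer, Integer> expected() {
        Map<Integer, Integer> m = new LinkedHashMap<>();
        m.put(0x4FA0, 0x52EC);
        m.put(0x2116, 0x446E);
        m.put(0x2015, 0x444A); // from the MS compatibility list
        return m;
    }

    // Assumed wire form of one double-byte character: SO, high, low, SI.
    static String expectedHex(int hostCode) {
        return String.format("0E%04X0F", hostCode);
    }

    // Render the remaining bytes of a buffer as upper-case hex.
    static String hex(ByteBuffer buf) {
        StringBuilder sb = new StringBuilder();
        while (buf.hasRemaining()) {
            sb.append(String.format("%02X", buf.get() & 0xFF));
        }
        return sb.toString();
    }

    // Encode one UTF-16 code unit, reporting unmappable input instead of substituting.
    static String encodeStrict(Charset cs, int codeUnit) {
        try {
            ByteBuffer out = cs.newEncoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .encode(CharBuffer.wrap(String.valueOf((char) codeUnit)));
            return hex(out);
        } catch (CharacterCodingException e) {
            return "unmappable";
        }
    }

    public static void main(String[] args) {
        String name = "Cp939"; // assumed charset name; "Cp930" could be substituted
        if (!Charset.isSupported(name)) {
            System.out.println(name + " is not available in this runtime; skipping.");
            return;
        }
        Charset cs = Charset.forName(name);
        for (Map.Entry<Integer, Integer> row : expected().entrySet()) {
            String got = encodeStrict(cs, row.getKey());
            String want = expectedHex(row.getValue());
            System.out.printf("u%04X -> %s, expected %s: %s%n",
                    row.getKey(), got, want, got.equals(want) ? "match" : "MISMATCH");
        }
    }
}
```

Reporting rather than replacing makes a missing table entry show up as "unmappable" instead of a silent substitution byte, which is the symptom the table above describes.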
I'm filing this bug on behalf of IBM Japan's DBCS group. Since
I don't really know that much about these issues myself, feel free
to contact Masayuki Fuse at <###@###.###> for more info.
======================================================================
duplicates
JDK-4199570 CharToByteCp939 throws InternalError: Converter malfunction (Closed)
JDK-4250728 Character converters for Cp930, Cp933, Cp935, Cp937, Cp939 can not work. (Closed)
JDK-4095349 java.lang.String.getBytes("Cp939") fails to add SI when string is single DBCS ch (Closed)