Type: Bug
Resolution: Fixed
Priority: P3
Fix Version/s: 7
Affects Version/s: 1.2.0, 1.4.2, 1.4.2_19-rev
Component/s: core-libs
Labels:
- ianl
- licbug
- webbug

Subcomponent: java.nio.charsets
Resolved In Build: b70
CPU: generic, x86
OS: generic, windows_95, windows_nt, windows_xp

Issue	Fix Version	Assignee	Priority	Status	Resolution	Resolved In Build
JDK-2192011	6u21	Robert Mckenna	P3	Resolved	Fixed	b03
JDK-2191884	6u20-rev	Robert Mckenna	P3	Closed	Fixed	b03
JDK-2177540	6u19-rev	Robert Mckenna	P3	Closed	Fixed	b07
JDK-2192895	5.0u25	Robert Mckenna	P3	Closed	Fixed	b01
JDK-2190167	5.0u24-rev	Robert Mckenna	P3	Resolved	Fixed	b04
JDK-2177539	5.0u23-rev	Robert Mckenna	P3	Closed	Fixed	b05
JDK-2192856	1.4.2_27	Robert Mckenna	P3	Closed	Fixed	b02
JDK-2177393	1.4.2_26-rev	Robert Mckenna	P3	Closed	Fixed	b06

Name: bb33257 Date: 02/27/98

This bug is an addendum to bug #4113734. You can close that bug as a duplicate of this one.

I've gotten back the following additional information from the Arabic people at IBM
in response to a note from me regarding the errors in the code-conversion tables for
Arabic:

* * *

>I looked over the code-conversion tables you sent me and tried to compare them
>with Javasoft's, and then I filed a bug report with them. I'll let you know
>what happens with this.
>
>I didn't see many discrepancies. In the ByteToChar conversion tables, there
>were four to six mistakes per code page. They were as follows:
>
>In ByteToCharCp420.java:
> 0x45 should map to \u200c
> 0x77 should map to \ufeb1
> 0x80 should map to \uf8f5

correction, 0x80 should map to \ufeb5

> 0x8b should map to \ufeb9
> 0x8d should map to \ufebd
>In ByteToCharCp1256.java:
> 0x80 should map to \u0080
> 0x8a should map to \u008a
> 0x8f should map to \u008f
> 0x98 should map to \u0098
> 0x9a should map to \u009a
> 0x9f should map to \u009f
>In ByteToCharCp864.java
> 0x9f should map to \u200c
> 0xd7 should map to \ufec3
> 0xd8 should map to \ufec7
> 0xf1 should map to \ufe7c

in 864 there are some more differences:
0x1a should map to \u001c
0x1c should map to \u007f
0x7f should map to \u001a
0x25 should map to \u0025
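
The Cp864 corrections listed above can be turned into a quick check against whatever decoder a given JDK ships. The following is a minimal sketch, not the fix that was integrated: the class name `ByteToCharCheck` and its helpers are hypothetical, the expected values are copied from the lists above, and it assumes the runtime exposes the code page under the name `Cp864` (it skips the check otherwise).

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
class ByteToCharCheck {
    // {byte value, expected Unicode code point}, taken from the Cp864 lists above.
    static final int[][] CP864_EXPECTED = {
        {0x9f, 0x200c}, {0xd7, 0xfec3}, {0xd8, 0xfec7}, {0xf1, 0xfe7c},
        {0x1a, 0x001c}, {0x1c, 0x007f}, {0x7f, 0x001a}, {0x25, 0x0025},
    };

    // Decodes one byte with the given charset; returns -1 if nothing was produced.
    // Charset.decode substitutes U+FFFD for unmappable or malformed input.
    static int decodeByte(Charset cs, int b) {
        String s = cs.decode(ByteBuffer.wrap(new byte[] {(byte) b})).toString();
        return s.isEmpty() ? -1 : s.charAt(0);
    }

    // Returns one message per table entry whose decoded value differs from the expected one.
    static List<String> mismatches(Charset cs, int[][] expected) {
        List<String> out = new ArrayList<>();
        for (int[] e : expected) {
            int actual = decodeByte(cs, e[0]);
            if (actual != e[1]) {
                out.add(String.format("0x%02x: expected U+%04X, got U+%04X", e[0], e[1], actual));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String name = "Cp864"; // assumed charset name; may be absent on some runtimes
        if (!Charset.isSupported(name)) {
            System.out.println(name + " is not available on this runtime");
            return;
        }
        for (String line : mismatches(Charset.forName(name), CP864_EXPECTED)) {
            System.out.println(line);
        }
    }
}
```

The same `mismatches` helper could be fed the Cp420 or Cp1256 lists by building another table in the same `{byte, code point}` shape.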

>And code page 1089 doesn't exist at all.

code page 1089 is the IBM number for ISO8859-6. I have checked the
differences and they are as follows:
0x30 should map to \u0030
0x31 should map to \u0031
0x32 should map to \u0032
0x33 should map to \u0033
0x34 should map to \u0034
0x35 should map to \u0035
0x36 should map to \u0036
0x37 should map to \u0037
0x38 should map to \u0038
0x39 should map to \u0039

>I suspect there are other things which also need to change in the CharToByte
>conversion tables, but I haven't looked into this.

I will try to look into them; however, the format of those tables is a bit
complex.

>
>If you know of anything I've missed or gotten wrong here, I'd like to know
>about it, and I'd appreciate more background as to why we're suggesting the
>changes we're suggesting (are the old tables just plain wrong in these cases,
>or is this more of a judgment call?), but basically I just wanted to let you
>know I'd filed the bug and I'll let you know what happens with it.

There are two reasons for the differences in the mappings:
1- Code pages 420 and 864 are basically font pages that come from old
   implementations. The characters in question (except for the control
   characters in 864) are displayed in two cells, versus one cell in other
   code pages (ISO8859-6, 1256, Unicode). The two cells are divided into
   what we call the 3/4 of the character, which identifies that character,
   and the remaining 1/4, the tail, which is common to all of them.
   So in the IBM tables we map the 3/4 characters to the full shape in Unicode.

2- In other cases, such as codes 0x25 and 0x30-0x39, the IBM tables map those
   characters to the US-ASCII part of Unicode for proper processing
   by BIDI-unaware applications.
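
Point 2 amounts to saying that bytes 0x25 and 0x30-0x39 should decode to the code points with the same numeric value. A minimal sketch of that check follows; the class `AsciiPassThrough` and its method names are hypothetical, and the charset names tried in `main` are assumptions that are skipped when the runtime does not provide them.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;

// Hypothetical helper, for illustration only.
class AsciiPassThrough {
    // True if the single byte b decodes to exactly one char whose value equals b.
    static boolean decodesToSelf(Charset cs, int b) {
        String s = cs.decode(ByteBuffer.wrap(new byte[] {(byte) b})).toString();
        return s.length() == 1 && s.charAt(0) == b;
    }

    // True if the percent sign (0x25) and the digits (0x30-0x39) all land in US-ASCII.
    static boolean digitsAndPercentAreAscii(Charset cs) {
        if (!decodesToSelf(cs, 0x25)) {
            return false;
        }
        for (int b = 0x30; b <= 0x39; b++) {
            if (!decodesToSelf(cs, b)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Assumed names for the code pages discussed above.
        for (String name : new String[] {"ISO-8859-6", "Cp864"}) {
            if (Charset.isSupported(name)) {
                System.out.println(name + ": " + digitsAndPercentAreAscii(Charset.forName(name)));
            } else {
                System.out.println(name + ": not available on this runtime");
            }
        }
    }
}
```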

* * *

If I get back additional information regarding the inverse conversions from
Unicode back to SBCS, I'll pass it along in the form of another bug report.
======================================================================

backported by

JDK-2190167 Errors in Arabic code-conversion tables, part II (Resolved)
JDK-2192011 Errors in Arabic code-conversion tables, part II (Resolved)
JDK-2177393 Errors in Arabic code-conversion tables, part II (Closed)
JDK-2177539 Errors in Arabic code-conversion tables, part II (Closed)
JDK-2177540 Errors in Arabic code-conversion tables, part II (Closed)
JDK-2191884 Errors in Arabic code-conversion tables, part II (Closed)
JDK-2192856 Errors in Arabic code-conversion tables, part II (Closed)
JDK-2192895 Errors in Arabic code-conversion tables, part II (Closed)

duplicates

JDK-4113734 Errors in Arabic code conversion (Closed)
JDK-6418187 Invalid translations for arabic codepage 420 (Closed)

relates to

JDK-6929767 three sun/io/Converter tests failed after integration of # 4116222 (Resolved)
JDK-6924100 2 Charset tests failed with NPE with 1.4.2_25 nightly build (Closed)
JDK-6942162 TEST_BUG: sun/io/Converter tests should be modified in 1.4.2u26b06 workspace (Closed)


Assignee: Xueming Shen
Reporter: Brian Beck (Inactive)
Votes: 0
Watchers: 1

Created: 1998-02-27 14:16
Updated: 2009-08-14 16:37
Resolved: 2009-08-14 16:37
Imported: 15/Sep/12 1:19 PM
Indexed: 17/Jul/12 10:52 AM
