Type: Bug
Resolution: Fixed
Priority: P4
Fix Version/s: 26
Affects Version/s: 8, 26
Component/s: core-libs
Labels:

Subcomponent: java.nio.charsets
Resolved In Build: b19
CPU: generic
OS: generic

A DESCRIPTION OF THE PROBLEM :
IBM930 uses the wrong character when decoding the hex sequence 0x4260.
The correct character would be U+FF0D (FULLWIDTH HYPHEN-MINUS).
The character currently produced is U+2212 (MINUS SIGN).

This is especially important when converting from IBM930 to windows-31j, which has a full-width hyphen (0x817C) but no minus sign.
In particular, the equivalence between IBM930's 0x4260 and windows-31j's 0x817C is established in this document from IBM: https://public.dhe.ibm.com/software/globalization/gcoc/attachments/CP00300.pdf, page 411. I believe the corrections in 12.1.2 might not have been incorporated into this character set.

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Decode the byte sequence 0x4260 into a string using the x-IBM930 charset, and then encode that string to bytes using the windows-31j charset.

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
The expected output bytes are 0x817C.
ACTUAL -
The decoded character is not in the windows-31j charset, so the result is a replacement character, an error, or nothing, depending on how the encoder is configured.
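
The reproduction steps above can be sketched roughly as follows. This is not the attached Main.java; the class and method names (`Ibm930Repro`, `decode4260`, `encodeToHex`) are hypothetical. It assumes that the x-IBM930 and windows-31j charsets are available in the runtime (they may live in the jdk.charsets module), and that, because IBM930 is a stateful EBCDIC encoding, the double-byte code 0x4260 has to be wrapped in shift-out (0x0E) and shift-in (0x0F) bytes to be decoded as a single character. The encoder is set to report unmappable characters so that the failure is visible rather than silently replaced.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

class Ibm930Repro {
    // Decode the double-byte code 0x4260 with x-IBM930.
    // Assumption: the pair must be bracketed by shift-out (0x0E) and
    // shift-in (0x0F) because the charset is stateful.
    static String decode4260() {
        byte[] input = {0x0E, 0x42, 0x60, 0x0F};
        return new String(input, Charset.forName("x-IBM930"));
    }

    // Encode with windows-31j, reporting unmappable characters instead of
    // substituting them. Returns hex such as "0x817C", or "unmappable"
    // when the encoder rejects the input.
    static String encodeToHex(String s) {
        try {
            ByteBuffer out = Charset.forName("windows-31j").newEncoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .encode(CharBuffer.wrap(s));
            StringBuilder hex = new StringBuilder("0x");
            while (out.hasRemaining()) {
                hex.append(String.format("%02X", out.get() & 0xFF));
            }
            return hex.toString();
        } catch (CharacterCodingException e) {
            return "unmappable";
        }
    }

    public static void main(String[] args) {
        String decoded = decode4260();
        // Print the decoded code point, then the windows-31j result.
        System.out.printf("decoded: U+%04X%n", (int) decoded.charAt(0));
        System.out.println("windows-31j: " + encodeToHex(decoded));
    }
}
```

The output depends on the JDK build. Going by this report, an affected build should fail to encode the decoded character, while a build containing the fix should print 0x817C.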


Attachment: Main.java (0.9 kB, 2025-09-29 00:57)

caused by

JDK-6843578 Re-implement IBM doublebyte charsets

Resolved

links to

Commit(master) openjdk/jdk/6b316262

Review(master) openjdk/jdk/27594

Assignee: Naoto Sato
Reporter: Webbug Group
Votes: 0
Watchers: 5

Created: 2025-09-25 09:21
Updated: 2025-10-13 15:00
Resolved: 2025-10-07 10:25
