Type: Bug
Resolution: Not an Issue
Priority: P3
Fix Version/s: None
Affects Version/s: 1.4.1
Component/s: core-libs
Labels: webbug
Subcomponent: java.nio.charsets
CPU: x86
OS: windows_xp

Name: nt126004 Date: 03/26/2003

FULL PRODUCT VERSION :
java version "1.4.1"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1-b21)
Java HotSpot(TM) Client VM (build 1.4.1-b21, mixed mode)

FULL OS VERSION :
Microsoft Windows XP [Version 5.1.2600]

A DESCRIPTION OF THE PROBLEM :
The methods String.getBytes(String charsetName) and new String(byte[] bytes, String charsetName) should be complementary (unless characters in the string are not defined in the specified charset). For any String, you should be able to create a byte array with getBytes and then create a new String from that byte array such that the new String is equal to the original String.
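The round-trip property described above, together with its caveat about unmappable characters, can be sketched as two small checks. This is only an illustration: the class name RoundTripCheck and both helper methods are hypothetical, and the sketch assumes Java 6 or later, where String.getBytes(Charset) and new String(byte[], Charset) are available (the report itself targets 1.4.1, which only has the String-named overloads).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original report.
final class RoundTripCheck {

    // True if every character of s has a mapping in cs (the caveat in the report).
    static boolean isEncodable(String s, Charset cs) {
        return cs.newEncoder().canEncode(s);
    }

    // True if encoding s with cs and decoding the bytes with cs yields an equal String.
    static boolean roundTrips(String s, Charset cs) {
        byte[] encoded = s.getBytes(cs);
        return s.equals(new String(encoded, cs));
    }

    public static void main(String[] args) {
        String sample = "\u3072\u00a5"; // HIRAGANA LETTER HI, YEN SIGN
        Charset[] charsets = { StandardCharsets.UTF_8, StandardCharsets.US_ASCII };
        for (Charset cs : charsets) {
            System.out.println(cs.name()
                    + ": encodable=" + isEncodable(sample, cs)
                    + ", roundTrips=" + roundTrips(sample, cs));
        }
    }
}
```

A string can pass isEncodable and still fail roundTrips when the charset's encoder and decoder are not exact inverses for some character, which is the situation this report describes for Shift_JIS.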

When the charsetName is "Shift_JIS" and the String contains a Yen character, the method String.getBytes returns a value of 0x5C. This is correct behavior. However, the method new String(bytes, charsetName) converts the byte back to a String containing the Reverse Solidus character instead of the Yen character. This is not correct behavior.

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
See the example program RoundTrip.java

EXPECTED VERSUS ACTUAL BEHAVIOR :
Expected: When the charset is Shift_JIS, a byte with value 0x5C should be decoded to the character with Unicode value 0xA5 (YEN SIGN).
Actual: The byte value 0x5C is decoded to the character with Unicode value 0x5C (REVERSE SOLIDUS).
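Because the comparison here is stated in hexadecimal, it may be easier to inspect the values in hex rather than as signed decimal numbers. The following is a minimal sketch under that assumption; the class HexRoundTrip and its helpers toHex and charsHex are hypothetical names, and the Shift_JIS charset is only used if the running JRE reports it as supported.

```java
import java.nio.charset.Charset;

// Hypothetical helper class, not part of the original report.
final class HexRoundTrip {

    // Formats each byte as two unsigned hex digits, separated by spaces.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Formats each UTF-16 code unit of s as U+XXXX, separated by spaces.
    static String charsHex(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String csName = "Shift_JIS";
        if (!Charset.isSupported(csName)) {
            System.out.println(csName + " is not supported by this JRE");
            return;
        }
        Charset cs = Charset.forName(csName);
        String original = "\u3072\u00a5"; // HIRAGANA LETTER HI, YEN SIGN
        byte[] encoded = original.getBytes(cs);
        String decoded = new String(encoded, cs);
        System.out.println("Original chars: " + charsHex(original));
        System.out.println("Encoded bytes:  " + toHex(encoded));
        System.out.println("Decoded chars:  " + charsHex(decoded));
    }
}
```

Comparing the first and last printed lines shows directly whether the yen sign survived the round trip on a given JRE.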

REPRODUCIBILITY :
This bug can always be reproduced.

---------- BEGIN SOURCE ----------
public class RoundTrip {
    public static void main(String[] args) {
        String csName = "Shift_JIS";
        String testString = "\u3072\u00a5"; // HIRAGANA LETTER HI, YEN SIGN
        roundTrip(csName, testString);
    }

    // Do a round-trip conversion from String to byte[] and back to String.
    private static void roundTrip(String csName, String testString) {
        try {
            /* Depending on your configuration, the Unicode characters may not
               display correctly. This is not relevant to the issue at hand,
               though. */

            // Display the arguments passed in.
            System.out.println("encode and decode '" + testString + "' using " + csName);
            System.out.print("Unicode values: ");
            int len = testString.length();

            // Display the numeric value of each character before encoding.
            for (int n = 0; n < len; n++) {
                int val = testString.charAt(n);
                System.out.print(val);
                System.out.print(" ");
            }
            System.out.println();
            System.out.println();

            // Encode to bytes using the specified charsetName.
            byte[] b = testString.getBytes(csName);
            System.out.print("Encoded bytes: ");

            // Display the encoded values (as signed decimal byte values).
            for (int n = 0; n < b.length; n++) {
                System.out.print(b[n]);
                System.out.print(" ");
            }
            System.out.println();
            System.out.println();

            // Convert the bytes back to a String.
            String decode = new String(b, csName);
            System.out.println("Decoded String " + decode);
            System.out.print("Unicode values: ");
            len = decode.length();

            // Display the numeric value of each character again.
            for (int n = 0; n < len; n++) {
                int val = decode.charAt(n);
                System.out.print(val);
                System.out.print(" ");
            }
            System.out.println();
        } catch (Throwable t) {
            System.out.println(t);
        }
    }
}
---------- END SOURCE ----------
(Review ID: 183058)
======================================================================

relates to

JDK-4486307 (spec) Need to document deviation from standards in Japanese charsets

Closed

Assignee: Ian Little (Inactive)

Reporter: Nathanael Thompson (Inactive)

Votes: 0

Watchers: 0

Created: 2003-03-26 11:35

Updated: 2003-03-27 06:26

Resolved: 2003-03-27 06:26

Imported: 15/Sep/12 1:22 PM

Indexed: 17/Jul/12 10:54 AM
