JDK-4838072

Yen sign is not converted properly when using String.getBytes("Shift_JIS")


    • Type: Bug
    • Resolution: Not an Issue
    • Priority: P3
    • None
    • 1.4.1
    • core-libs

      Name: nt126004 Date: 03/26/2003


      FULL PRODUCT VERSION :
      java version "1.4.1"
      Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1-b21)
      Java HotSpot(TM) Client VM (build 1.4.1-b21, mixed mode)

      FULL OS VERSION :
      Microsoft Windows XP [Version 5.1.2600]

      A DESCRIPTION OF THE PROBLEM :
      The methods String.getBytes(String charsetName) and new String(byte[] bytes, String charsetName) should be complementary (unless the string contains characters that are not defined in the specified charset). For any such String, it should be possible to create a byte array with getBytes and then create a new String from that byte array that is equal to the original String.
       
      When the charsetName is "Shift_JIS" and the String contains a Yen character (U+00A5), the method String.getBytes encodes it as the byte 0x5C. This is correct behavior. However, new String(bytes, charsetName) converts that byte back to a String containing the Reverse Solidus character (U+005C) instead of the Yen character. This is not correct behavior.
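
      The round-trip property described above can be checked with a small helper. This is only a sketch: the class name RoundTripCheck and the method roundTrips are hypothetical, and it assumes Java 7 or later for StandardCharsets (the report itself targets 1.4.1). It prints whatever the running JDK produces for Shift_JIS and does not assume a particular result.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original report.
class RoundTripCheck {
    // Returns true if encoding s with cs and decoding the bytes with the
    // same charset yields a String equal to s.
    public static boolean roundTrips(String s, Charset cs) {
        byte[] encoded = s.getBytes(cs);
        String decoded = new String(encoded, cs);
        return s.equals(decoded);
    }

    public static void main(String[] args) {
        String yen = "\u00a5"; // YEN SIGN
        System.out.println("UTF-8:     " + roundTrips(yen, StandardCharsets.UTF_8));
        // Shift_JIS is an extended charset and may be absent from a minimal runtime.
        if (Charset.isSupported("Shift_JIS")) {
            System.out.println("Shift_JIS: " + roundTrips(yen, Charset.forName("Shift_JIS")));
        }
    }
}
```

      A result of false for Shift_JIS would correspond to the asymmetry reported here.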

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      See the example program RoundTrip.java

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      Expected: when the charset is Shift_JIS, a byte with value 0x5C is converted to the character with Unicode value 0x00A5.
      Actual: the byte value 0x5C is converted to the character with Unicode value 0x005C.
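
      To look at the decoding direction in isolation, the single byte 0x5C can be decoded and the resulting character printed in hexadecimal. The sketch below assumes Java 7 or later; DecodeProbe and codePoints are hypothetical names introduced for illustration, and no particular output for Shift_JIS is assumed.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original report.
class DecodeProbe {
    // Decodes bytes with cs and lists each resulting char as U+XXXX,
    // separated by single spaces.
    public static String codePoints(byte[] bytes, Charset cs) {
        String decoded = new String(bytes, cs);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < decoded.length(); i++) {
            if (i > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", (int) decoded.charAt(i)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        byte[] single = { 0x5C };
        System.out.println("US-ASCII:  " + codePoints(single, StandardCharsets.US_ASCII));
        // Shift_JIS is an extended charset and may be absent from a minimal runtime.
        if (Charset.isSupported("Shift_JIS")) {
            System.out.println("Shift_JIS: " + codePoints(single, Charset.forName("Shift_JIS")));
        }
    }
}
```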

      REPRODUCIBILITY :
      This bug can be reproduced always.

      ---------- BEGIN SOURCE ----------
      public class RoundTrip {
          public static void main(String[] args) {
              String csName = "Shift_JIS";
              String testString = "\u3072\u00a5"; // HIRAGANA LETTER HI, YEN SIGN
              roundTrip(csName, testString);
          }

          // do a round-trip conversion from String to byte[] and back to String
          private static void roundTrip(String csName, String testString) {
              try {
                  /* Depending on your configuration, the Unicode characters may not
                     display correctly. This is not relevant to the issue at hand
                     though. */

                  // display the arguments passed in
                  System.out.println("encode and decode '" + testString + "' using " + csName);
                  System.out.print("Unicode values: ");
                  int len = testString.length();

                  // display the numeric value of each character before encoding
                  for (int n = 0; n < len; n++) {
                      int val = testString.charAt(n);
                      System.out.print(val);
                      System.out.print(" ");
                  }
                  System.out.println();
                  System.out.println();

                  // encode to bytes using the specified charsetName
                  byte[] b = testString.getBytes(csName);
                  System.out.print("Encoded bytes: ");

                  // display the encoded values (signed decimal byte values)
                  for (int n = 0; n < b.length; n++) {
                      System.out.print(b[n]);
                      System.out.print(" ");
                  }
                  System.out.println();
                  System.out.println();

                  // convert the bytes back to a String
                  String decode = new String(b, csName);
                  System.out.println("Decoded String " + decode);
                  System.out.print("Unicode values: ");
                  len = decode.length();

                  // display the numeric value of each character again
                  for (int n = 0; n < len; n++) {
                      int val = decode.charAt(n);
                      System.out.print(val);
                      System.out.print(" ");
                  }
                  System.out.println();
              } catch (Throwable t) {
                  System.out.println(t);
              }
          }
      }
      ---------- END SOURCE ----------
      (Review ID: 183058)
      ======================================================================

            ilittlesunw Ian Little (Inactive)
            nthompsosunw Nathanael Thompson (Inactive)