Type: Bug
Resolution: Not an Issue
Priority: P4
Fix Version/s: None
Affects Version/s: 8-pool, 9
Component/s: core-libs
Labels: webbug
Subcomponent: java.nio.charsets
CPU: x86_64
OS: linux

FULL PRODUCT VERSION :
java version "1.8.0_51"
Java(TM) SE Runtime Environment (build 1.8.0_51-b16)
Java HotSpot(TM) 64-Bit Server VM (build 25.51-b03, mixed mode)

ADDITIONAL OS VERSION INFORMATION :
Fedora 22
4.0.4-301.fc22.x86_64

A DESCRIPTION OF THE PROBLEM :
I'm on a UTF-8 platform, testing the String constructors' ability to handle different charsets. Converting a byte array in UTF-16BE or UTF-16LE encoding to a String works fine if I use a constructor that converts the whole array. However, the constructors that extract a substring break.

String( byte[], int, int, String csn )
and
String( byte[], int, int, Charset )
both break. This is just one bug report, not two, since I figure the second one calls the first one, or vice versa, and therefore fixing one will fix the other.

ADDITIONAL REGRESSION INFORMATION:
I have no idea if this worked right in previous versions.

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Each encoding is being used to do the same thing: isolate the word "cow" and print it to the screen. Only UTF-8 is successful.

Each encoding can print the entire phrase "Moo cow". Slicing a substring is where the problem lies.
ACTUAL -
UTF8_substring is cow
UTF16BE_substring is o�
UTF16LE_substring is o�

REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------

import java.io.*;
import java.nio.charset.*;

public class StringBreaker
{
        public static void main( String [] args )
        {
                PrintStream p = System.out;
                try
                {
                        // UTF-8 works, and this is how the others SHOULD work too.
                        byte utf8_bytes[] = "Moo cow".getBytes( StandardCharsets.UTF_8 );
                        String UTF8_substring = new String( utf8_bytes, 4, 3, StandardCharsets.UTF_8 );
                        p.println( "UTF8_substring is " + UTF8_substring ); // prints "cow"

                        // UTF-16BE fails
                        byte utf16be_bytes[] = "Moo cow".getBytes( StandardCharsets.UTF_16BE );
                        String UTF16BE_substring = new String( utf16be_bytes, 4, 3, StandardCharsets.UTF_16BE );
                        // substring now holds the letter 'o' with visible garbage after it.
                        p.println( "UTF16BE_substring is " + UTF16BE_substring );

                        // UTF-16LE fails
                        byte utf16le_bytes[] = "Moo cow".getBytes( StandardCharsets.UTF_16LE );
                        String UTF16LE_substring = new String( utf16le_bytes, 4, 3, StandardCharsets.UTF_16LE );
                        // substring now holds the letter 'o' with visible garbage after it.
                        p.println( "UTF16LE_substring is " + UTF16LE_substring );

                        p.println( "The constructors that convert the whole byte array work fine. Only the substring constructors are broken." );
                        p.println( "UTF-16BE bytes printable: " +
                                new String( utf16be_bytes, StandardCharsets.UTF_16BE ) );
                        p.println( "UTF-16LE bytes printable: " +
                                new String( utf16le_bytes, StandardCharsets.UTF_16LE ) );
                }
                catch( Exception e )
                {
                        p.println( "ERROR: Bad charset string." );
                        System.exit(1);
                }
        }
}

---------- END SOURCE ----------
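One possible reading of the output above, offered as a sketch and not as the assignee's analysis: the two int arguments of String( byte[], int, int, Charset ) are a byte offset and a byte length, and UTF-16BE/UTF-16LE store each Java char in two bytes. Under that reading, offset 4 with length 3 covers the second 'o' plus half of the space, which is consistent with the "o" followed by a replacement character shown in ACTUAL. The class name Utf16SliceSketch and the helper decodeUnits are hypothetical names introduced only for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Utf16SliceSketch {
    // Hypothetical helper: unitOffset and unitCount are counted in UTF-16
    // code units (Java chars), so both are doubled to get byte positions.
    // Assumes utf16 is UTF_16BE or UTF_16LE (two bytes per code unit, no BOM).
    static String decodeUnits(byte[] bytes, int unitOffset, int unitCount, Charset utf16) {
        return new String(bytes, unitOffset * 2, unitCount * 2, utf16);
    }

    public static void main(String[] args) {
        byte[] be = "Moo cow".getBytes(StandardCharsets.UTF_16BE);

        // 7 characters at 2 bytes each.
        System.out.println("byte length: " + be.length); // 14

        // Byte offset 4, byte length 3: the second 'o' plus one byte of the space.
        System.out.println(new String(be, 4, 3, StandardCharsets.UTF_16BE));

        // Byte offset 8, byte length 6: the three characters of "cow".
        System.out.println(new String(be, 8, 6, StandardCharsets.UTF_16BE)); // cow

        // Same slice expressed in character units through the helper.
        System.out.println(decodeUnits(be, 4, 3, StandardCharsets.UTF_16BE)); // cow
    }
}
```

The UTF-8 case in the report works with offset 4 and length 3 because every character in "Moo cow" is ASCII and occupies one byte in UTF-8, so byte positions and character positions coincide there.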

CUSTOMER SUBMITTED WORKAROUND :
Decode the entire UTF-16BE or UTF-16LE byte array into a String instead of decoding a sub-range.

Then take the substring from that String.

Lastly, convert the substring back into UTF-16BE or UTF-16LE bytes.
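The workaround steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not code from the report: the class name WorkaroundSketch and the method sliceBytes are hypothetical, and beginIndex/endIndex are assumed to follow the String.substring convention (char indices, end exclusive).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class WorkaroundSketch {
    // Hypothetical helper following the three workaround steps.
    static byte[] sliceBytes(byte[] bytes, int beginIndex, int endIndex, Charset cs) {
        // Step 1: decode the whole byte array.
        String whole = new String(bytes, cs);
        // Step 2: take the substring by char index (end index is exclusive).
        String part = whole.substring(beginIndex, endIndex);
        // Step 3: encode the substring back into the same charset.
        return part.getBytes(cs);
    }

    public static void main(String[] args) {
        byte[] le = "Moo cow".getBytes(StandardCharsets.UTF_16LE);
        byte[] cow = sliceBytes(le, 4, 7, StandardCharsets.UTF_16LE);
        System.out.println(new String(cow, StandardCharsets.UTF_16LE)); // cow
        System.out.println(Arrays.equals(cow, "cow".getBytes(StandardCharsets.UTF_16LE))); // true
    }
}
```

This trades an extra full decode for not having to compute byte offsets by hand, which may matter for large arrays.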

StringBreaker.java
2 kB
2015-08-02 22:27

Assignee: Xueming Shen
Reporter: Webbug Group
Votes: 0
Watchers: 3
Created: 2015-07-31 10:11
Updated: 2015-08-06 21:34
Resolved: 2015-08-03 09:34
