Type: Bug
Resolution: Not an Issue
Priority: P4
Fix Version/s: None
Affects Version/s: 9
Component/s: core-libs
Labels:
- dcs-pso
- webbug

Subcomponent: java.lang
Introduced In Build: b93
CPU: x86
OS: windows_8

FULL PRODUCT VERSION :
java version "1.9.0-ea"
Java(TM) SE Runtime Environment (build 1.9.0-ea-b93)
Java HotSpot(TM) 64-Bit Server VM (build 1.9.0-ea-b93, mixed mode)

A DESCRIPTION OF THE PROBLEM :
The Java source code suggests that the bytes of each UTF-16 character are stored big endian in the "value" field of java.lang.String:

// from String.java:
    public char charAt(int index) {
        if (isLatin1()) {
            return StringLatin1.charAt(value, index);
        } else {
            return StringUTF16.charAt(value, index);
        }
    }

// from StringUTF16.java:

    @HotSpotIntrinsicCandidate
    public static char getChar(byte[] val, int index) {
        index <<= 1;
        return (char)(((val[index++] & 0xff) << HI_BYTE_SHIFT) |
                      ((val[index] & 0xff) << LO_BYTE_SHIFT));
    }

... however, the JVM actually applies an intrinsic that uses the target platform's natural endianness. For example, it is little endian on Windows (x86/x64) but big endian on Solaris SPARC.
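
The quoted getChar would be consistent with the observed behavior if HI_BYTE_SHIFT and LO_BYTE_SHIFT were chosen according to the platform byte order rather than fixed at 8 and 0. The following is only a minimal sketch of that idea, not the JDK's code: the class EndianAwareUtf16 and the helpers hiByteShift and loByteShift are hypothetical names introduced for illustration, and the byte order is passed in explicitly so that both cases can be tried on one machine.

```java
import java.nio.ByteOrder;

// Hypothetical illustration class; this is NOT the JDK's StringUTF16.
class EndianAwareUtf16 {
    // Assumption for this sketch: the shift applied to the first byte of a
    // pair is 8 on big endian platforms and 0 on little endian ones.
    static int hiByteShift(ByteOrder order) {
        return order == ByteOrder.BIG_ENDIAN ? 8 : 0;
    }

    // The shift applied to the second byte of a pair is the opposite.
    static int loByteShift(ByteOrder order) {
        return order == ByteOrder.BIG_ENDIAN ? 0 : 8;
    }

    // Same expression shape as the quoted getChar, with the shifts derived
    // from the given byte order instead of from static constants.
    static char getChar(byte[] val, int index, ByteOrder order) {
        index <<= 1;
        return (char) (((val[index++] & 0xff) << hiByteShift(order)) |
                       ((val[index] & 0xff) << loByteShift(order)));
    }

    public static void main(String[] args) {
        byte[] bigEndianBytes = {0x12, 0x34};
        byte[] littleEndianBytes = {0x34, 0x12};

        // Both layouts decode to the same char once the matching order is used.
        System.out.println(Integer.toHexString(
                getChar(bigEndianBytes, 0, ByteOrder.BIG_ENDIAN)));
        System.out.println(Integer.toHexString(
                getChar(littleEndianBytes, 0, ByteOrder.LITTLE_ENDIAN)));

        // The byte order of the platform this JVM is running on.
        System.out.println("native order: " + ByteOrder.nativeOrder());
    }
}
```

With shifts selected this way, the same Java expression reads either layout correctly, which is why the expression alone does not reveal which byte order a given JVM uses.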

I suppose this is done for performance reasons, which is fine. However, if that is the case, the source code should state it explicitly.

Also, tools that open HPROF dumps and debuggers that display Strings should be given an explicit way to find out which endianness a particular JVM uses.
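
As a rough sketch of what such a tool has to do once it knows the byte order: the raw bytes of a UTF16-coded "value" array can be decoded with java.nio.ByteBuffer and an explicit ByteOrder. The class Utf16Decoder and its decode method are hypothetical names used only for this illustration. Note that ByteOrder.nativeOrder() describes the JVM running the tool, so it gives the right answer only when the bytes came from a JVM on a platform with the same byte order; for a dump taken elsewhere, the order would have to be obtained some other way. The sketch also assumes the array holds two bytes per character (the non-Latin1 case).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration; not part of any JDK or HPROF API.
class Utf16Decoder {
    // Interprets each pair of bytes as one UTF-16 code unit in the given order.
    // A trailing odd byte, if any, is ignored by asCharBuffer().
    static String decode(byte[] value, ByteOrder order) {
        return ByteBuffer.wrap(value).order(order).asCharBuffer().toString();
    }

    public static void main(String[] args) {
        // The bytes a little endian JVM was observed to hold for "\u1234".
        byte[] raw = {0x34, 0x12};

        String decoded = decode(raw, ByteOrder.LITTLE_ENDIAN);
        System.out.println(Integer.toHexString(decoded.charAt(0)));

        // Only meaningful for bytes produced on a platform with this order.
        System.out.println("this JVM's native order: " + ByteOrder.nativeOrder());
    }
}
```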

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Running the TestCompactStrings program below prints "little endian" on a little endian machine (e.g. Windows) and "big endian" on a big endian machine (e.g. Solaris SPARC).

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
According to the current Java source code, it should always print "big endian".

REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
import java.lang.reflect.Field;

class TestCompactStrings {
  public static void main(String[] args) throws Exception {
    // U+1234 is outside Latin-1, so this String uses the UTF16 coder.
    String s = "\u1234";
    Class<? extends String> aClass = s.getClass();

    // Read the private backing byte[] reflectively.
    Field valueField = aClass.getDeclaredField("value");
    valueField.setAccessible(true);

    byte[] bytes = (byte[])valueField.get(s);

    // High byte first is big endian; low byte first is little endian.
    if (bytes[0] == 0x12 && bytes[1] == 0x34) {
      System.out.println("big endian");
    }
    else if (bytes[0] == 0x34 && bytes[1] == 0x12) {
      System.out.println("little endian");
    }
    else {
      System.out.println("unexpected");
    }
  }
}

---------- END SOURCE ----------

TestCompactStrings.java
2015-12-04 01:22
0.7 kB
Pallavi Sonal

relates to

JDK-8054307 JEP 254: Compact Strings

Closed

Assignee: Xueming Shen
Reporter: Webbug Group
Votes: 0
Watchers: 3

Created: 2015-11-27 03:28
Updated: 2024-10-09 13:44
Resolved: 2015-12-04 15:13
