Type: Enhancement
Resolution: Unresolved
Priority: P4
Fix Version/s: None
Affects Version/s: 7
Component/s: core-libs
Labels:
- charset
- webbug

Subcomponent: java.lang
CPU: x86
OS: windows_xp

A DESCRIPTION OF THE REQUEST :
String#getBytes(..) and new String(bytes..) internally use slow Charset encoders and decoders that are newly instantiated on each call.

Additionally:
At first glance, a user might think that String#getBytes(Charset cs) is faster than String#getBytes(String csn), on the assumption that the Charset is created internally from csn on every call.
As this is only true for the first call, there should be a *note* in the JavaDoc comparing the cost of these methods. Don't forget the JavaDoc of the (byte[] ...) constructors too.
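For reference, the two overloads compared above can be exercised side by side. This is only a minimal sketch: the class name GetBytesOverloads and its helper methods are made up for illustration, and it shows that both calls yield the same bytes, not which one is faster (that would need a proper benchmark).

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the JDK.
public class GetBytesOverloads {

    // Encodes via the charset-name overload, String#getBytes(String).
    static byte[] byName(String s, String csn) {
        try {
            return s.getBytes(csn);
        } catch (UnsupportedEncodingException e) {
            // The name-based overload forces callers to handle this checked exception.
            throw new IllegalArgumentException("Unknown charset: " + csn, e);
        }
    }

    // Encodes via the Charset overload, String#getBytes(Charset).
    static byte[] byCharset(String s, Charset cs) {
        return s.getBytes(cs);
    }

    public static void main(String[] args) {
        String s = "caf\u00e9";
        byte[] a = byName(s, "ISO-8859-1");
        byte[] b = byCharset(s, StandardCharsets.ISO_8859_1);
        // Both overloads are expected to produce identical output.
        System.out.println(Arrays.equals(a, b));
    }
}
```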

JUSTIFICATION :
Assuming that ASCII and ISO-8859-1 account for a high percentage of the calls to these methods, especially in CORBA applications, class String should have a fast shortcut for them.

  See also:
http://cr.openjdk.java.net/~sherman/6636323_6636319/webrev
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6636319
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6636323

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
A fast path for ASCII and ISO-8859-1 in methods and constructors like:
String#getBytes(..) and new String(bytes..)
Alternatives:
String#getASCIIBytes(..)
String#getISO8859_1Bytes(..)

ACTUAL -
byte[] getBytes(Charset charset)
internally instantiates a CharsetEncoder, which is much slower, especially on short strings.
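To illustrate the kind of work meant here, the following sketch encodes a string through an explicitly created CharsetEncoder. It is an approximation for discussion, not the actual JDK code path: the class name EncoderPath and the method encodeWithEncoder are hypothetical, and the REPLACE actions are chosen to mimic the documented replacement behaviour of getBytes(Charset).

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the JDK.
public class EncoderPath {

    // Creates a fresh encoder per call, which is the cost this report is about.
    static byte[] encodeWithEncoder(String s, Charset cs) {
        CharsetEncoder enc = cs.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        try {
            ByteBuffer bb = enc.encode(CharBuffer.wrap(s));
            byte[] out = new byte[bb.remaining()];
            bb.get(out);
            return out;
        } catch (CharacterCodingException e) {
            // Not expected with REPLACE actions, but the API declares it.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String s = "caf\u00e9";
        byte[] viaEncoder = encodeWithEncoder(s, StandardCharsets.US_ASCII);
        byte[] viaString = s.getBytes(StandardCharsets.US_ASCII);
        System.out.println(Arrays.equals(viaEncoder, viaString));
    }
}
```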

---------- BEGIN SOURCE ----------
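A simple example:

public class String {
    ...
    // Encodes this string into buf; chars outside the mask become '?'.
    // mask is 0x7F for ASCII and 0xFF for ISO-8859-1 (int, so 0xFF is not sign-extended).
    // Returns the number of bytes written.
    int getBytes(byte[] buf, int mask) {
        int j = 0;
        for (int i = 0; i < values.length; i++, j++) {
            char c = values[i];
            if ((c | mask) == mask) {
                buf[j] = (byte) c;
                continue;
            }
            // A surrogate pair is one unmappable code point: skip its second half.
            if (Character.isHighSurrogate(c) && i + 1 < values.length
                    && Character.isLowSurrogate(values[i + 1])) {
                i++;
            }
            buf[j] = '?'; // or default replacement
        }
        return j;
    }
    ...
}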

---------- END SOURCE ----------
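A standalone variant of the idea above that can be compiled and run outside of class String. It is a sketch only: FastPathSketch and encode are hypothetical names, the helper reads chars through the public String API instead of the internal array, and '?' is assumed as the replacement byte.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the JDK.
public class FastPathSketch {

    // mask is 0x7F for US-ASCII and 0xFF for ISO-8859-1.
    static byte[] encode(String s, int mask) {
        byte[] buf = new byte[s.length()];
        int j = 0;
        for (int i = 0; i < s.length(); i++, j++) {
            char c = s.charAt(i);
            if ((c | mask) == mask) {
                // The char fits into the target charset: copy it directly.
                buf[j] = (byte) c;
                continue;
            }
            // A surrogate pair counts as one unmappable code point.
            if (Character.isHighSurrogate(c) && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                i++;
            }
            buf[j] = '?';
        }
        // Trim if surrogate pairs made the output shorter than the input.
        return j == buf.length ? buf : Arrays.copyOf(buf, j);
    }

    public static void main(String[] args) {
        String s = "caf\u00e9 \uD83D\uDE00";
        // Compare the sketch against the standard encoders.
        System.out.println(Arrays.equals(encode(s, 0xFF), s.getBytes(StandardCharsets.ISO_8859_1)));
        System.out.println(Arrays.equals(encode(s, 0x7F), s.getBytes(StandardCharsets.US_ASCII)));
    }
}
```

The comparison against getBytes(..) in main is there as a sanity check of the replacement handling; timing the two paths is left to a proper benchmark.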

Relates to: JDK-8054307 JEP 254: Compact Strings (Closed)

Assignee: Xueming Shen
Reporter: Roger Yeung (Inactive)
Votes: 0
Watchers: 1

Created: 2009-04-03 15:28
Updated: 2021-06-26 12:56
Imported: 15/Sep/12 11:32 PM
Indexed: 17/Jul/12 7:48 PM
