Type: Enhancement
Resolution: Unresolved
Priority: P4
Fix Version/s: None
Affects Version/s: None
Component/s: core-libs
Labels:
- 19ea
- dcsaw
- reproducer-other
- webbug

CPU: generic
OS: generic

A DESCRIPTION OF THE PROBLEM :
When the String(byte[] bytes, int offset, int length, Charset charset) constructor is called with UTF-8, it first checks whether the byte array contains only ASCII. If it does not, the constructor attempts to decode the bytes as a Latin-1 string.
That attempt starts by allocating a new array.
Instead, we can speculatively check whether the first byte can be the leading byte of a UTF-8 sequence that encodes a Latin-1 code point, and if it cannot, skip the array allocation and go straight to the decodeUTF8_UTF16 call.
This causes a tiny slowdown for strings that start with a Latin-1 code point, but gives a relatively large performance boost when the string starts with a non-Latin-1 code point.
This seems like a good tradeoff, considering that there is a fair chance that a non-ASCII string starts with a non-Latin-1 character.
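
A minimal sketch of the proposed probe, not the actual java.lang.String code. It relies on one fact about UTF-8: code points U+0080 to U+00FF are encoded as two bytes whose leading byte is 0xC2 or 0xC3. The class name Latin1Probe and its methods are hypothetical, and the sketch assumes the probe is applied to the first non-ASCII byte found after the ASCII check.

```java
import java.nio.charset.StandardCharsets;

public class Latin1Probe {
    // True if b can be the leading byte of a two-byte UTF-8 sequence
    // that encodes a code point in U+0080..U+00FF (lead byte 0xC2 or 0xC3).
    static boolean canStartLatin1(byte b) {
        int v = b & 0xFF;
        return v == 0xC2 || v == 0xC3;
    }

    // Index of the first non-ASCII byte in [offset, offset + length),
    // or offset + length if every byte is ASCII.
    static int firstNonAscii(byte[] bytes, int offset, int length) {
        int end = offset + length;
        for (int i = offset; i < end; i++) {
            if (bytes[i] < 0) {   // high bit set means non-ASCII
                return i;
            }
        }
        return end;
    }

    // Hypothetical decision helper: true only when a non-ASCII byte exists and
    // it could begin a Latin-1 code point, so allocating the Latin-1 array is
    // worth attempting. False means the Latin-1 attempt can be skipped
    // (pure ASCII input is handled by the earlier ASCII check).
    static boolean worthTryingLatin1(byte[] bytes, int offset, int length) {
        int i = firstNonAscii(bytes, offset, length);
        return i < offset + length && canStartLatin1(bytes[i]);
    }

    public static void main(String[] args) {
        byte[] latin = "caf\u00e9".getBytes(StandardCharsets.UTF_8);       // U+00E9 -> C3 A9
        byte[] greek = "\u03b1\u03b2\u03b3".getBytes(StandardCharsets.UTF_8); // U+03B1 -> CE B1
        System.out.println(worthTryingLatin1(latin, 0, latin.length)); // true
        System.out.println(worthTryingLatin1(greek, 0, greek.length)); // false
    }
}
```

The probe costs one comparison on the Latin-1 path and avoids the array allocation on the other path, which is the tradeoff the description argues for.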

Assignee: Unassigned

Reporter: Webbug Group

Votes: 0

Watchers: 2

Created: 2022-01-19 13:02

Updated: 2023-08-29 10:44
