Type: Enhancement
Resolution: Fixed
Priority: P4
Fix Version/s: 19
Affects Version/s: None
Component/s: core-libs
Labels:

Subcomponent: java.lang
Resolved In Build: b06

Issue	Fix Version	Assignee	Priority	Status	Resolution	Resolved In Build
JDK-8293706	17.0.6	Boris Ulasevich	P4	Resolved	Fixed	b01

The String(byte[], int, int, Charset) constructor has this check in its latin-1 fast path:

if ((b1 == (byte)0xc2 || b1 == (byte)0xc3) && ...

Since the two constant bytes differ only in the lowest bit, this can be transformed to:

if ((b1 & 0xfe) == 0xc2 &&

This makes the code less branchy and produces a small speed-up on a targeted microbenchmark:

Benchmark (charsetName) Mode Cnt Score Error Units
StringDecode.decodeLatin1LongStart UTF-8 avgt 50 2283.591 ± 12.332 ns/op

StringDecode.decodeLatin1LongStart UTF-8 avgt 50 2165.984 ± 13.136 ns/op

(While this minor inefficiency appears to have been introduced with JEP 254 in JDK 9, the performance of decoding latin-1 strings was much improved thanks to the compactness of latin-1 encoded Strings, so I've not seen a regression caused by this.)
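
The equivalence of the two checks can be verified exhaustively, since a byte has only 256 possible values. The following is a minimal standalone sketch, not the JDK source: the class and method names (Latin1CheckEquivalence, twoCompares, maskedCompare) are hypothetical and exist only for illustration. It relies on the fact that 0xC2 and 0xC3 are the only UTF-8 lead bytes that encode code points in the range U+0080 to U+00FF.

```java
// Hypothetical demo class; not part of java.lang.String.
class Latin1CheckEquivalence {

    // Original form: two comparisons against the lead bytes 0xC2 and 0xC3.
    static boolean twoCompares(byte b1) {
        return b1 == (byte) 0xc2 || b1 == (byte) 0xc3;
    }

    // Rewritten form: b1 is promoted to int (with sign extension), the mask
    // 0xfe keeps bits 1..7 and clears bit 0, then a single compare follows.
    static boolean maskedCompare(byte b1) {
        return (b1 & 0xfe) == 0xc2;
    }

    public static void main(String[] args) {
        // Check every possible byte value.
        for (int i = Byte.MIN_VALUE; i <= Byte.MAX_VALUE; i++) {
            byte b = (byte) i;
            if (twoCompares(b) != maskedCompare(b)) {
                throw new AssertionError("mismatch for byte " + i);
            }
        }
        System.out.println("both checks agree for all 256 byte values");
    }
}
```

Note that the right-hand side of the masked compare is the int constant 0xc2, not (byte)0xc2: after masking with 0xfe the left-hand side is always in the range 0 to 254, so comparing against the sign-extended byte value would never match.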

backported by

JDK-8293706 Reduce branches decoding latin-1 chars from UTF-8 encoded bytes

Resolved

links to

Commit openjdk/jdk17u-dev/061ddcab

Commit openjdk/jdk/e314a4cf

Review openjdk/jdk17u-dev/658

Review openjdk/jdk/7122

Assignee: Claes Redestad

Reporter: Claes Redestad

Votes: 0

Watchers: 5

Created: 2022-01-18 01:56

Updated: 2022-09-19 03:54

Resolved: 2022-01-18 11:30