-
Enhancement
-
Resolution: Fixed
-
P4
-
17-pool, 18, 19
-
b15
The current hasNegatives intrinsic returns false if a byte[] contains only non-negative bytes and true otherwise. Because it is aggressively vectorized, it is a good foundation for a very quick test of whether a String or byte[] is ASCII-only.
However, such a test doesn't help, and might even be costly, for strings that have a lot of leading ASCII before the first non-ASCII byte, since we'll have to scan the same data twice (and the second time around in a slow, non-vectorized loop).
If we instead calculate the number of leading positive bytes, we can do a quick arraycopy of the bytes known to be ASCII-only. This speeds up a number of operations, and in the case of encoding latin1 strings might even reduce allocation pressure. I've prototyped this on x64, with promising results:
https://jmh.morethan.io/?gists=428b487e92e3e47ccb7f169501600a88,3c585de7435506d3a3bdb32160fe8904
Draft PR: https://github.com/openjdk/jdk/pull/7231
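To illustrate the idea, here is a minimal scalar sketch in plain Java. It is not the code from the draft PR: the class name CountPositivesSketch and the method latin1ToUtf8 are hypothetical, and countPositives here is an ordinary loop standing in for what would be a vectorized intrinsic. It shows how knowing the length of the ASCII prefix allows a bulk arraycopy and a smaller worst-case buffer when encoding latin1 bytes as UTF-8.

```java
import java.util.Arrays;

final class CountPositivesSketch {

    // Scalar stand-in for the proposed intrinsic (assumption: the real one
    // would be vectorized). Returns the number of leading bytes that are >= 0.
    static int countPositives(byte[] ba, int off, int len) {
        for (int i = 0; i < len; i++) {
            if (ba[off + i] < 0) {
                return i;
            }
        }
        return len;
    }

    // Hypothetical helper: encode latin1 bytes as UTF-8.
    static byte[] latin1ToUtf8(byte[] latin1) {
        int ascii = countPositives(latin1, 0, latin1.length);
        if (ascii == latin1.length) {
            // Pure ASCII: a single exact-size copy, no second scan.
            return Arrays.copyOf(latin1, latin1.length);
        }
        // Only the bytes after the ASCII prefix can expand to two bytes,
        // so the temporary buffer is smaller than 2 * length.
        byte[] dst = new byte[ascii + (latin1.length - ascii) * 2];
        System.arraycopy(latin1, 0, dst, 0, ascii);
        int dp = ascii;
        for (int sp = ascii; sp < latin1.length; sp++) {
            byte b = latin1[sp];
            if (b >= 0) {
                dst[dp++] = b;
            } else {
                // Latin1 0x80..0xFF maps to a two-byte UTF-8 sequence.
                dst[dp++] = (byte) (0xC0 | ((b & 0xFF) >> 6));
                dst[dp++] = (byte) (0x80 | (b & 0x3F));
            }
        }
        return dp == dst.length ? dst : Arrays.copyOf(dst, dp);
    }
}
```

With a boolean hasNegatives check, the same method would have to rescan the prefix in the slow loop and size the buffer for the worst case over the whole input.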
- relates to
-
JDK-8318509 x86 count_positives intrinsic broken for -XX:AVX3Threshold=0
- Resolved
-
JDK-8283325 US_ASCII decoder relies on String.decodeASCII being exhaustive
- Closed