I did some experiments comparing the performance of memory segment bulk operations against plain Java loops. Here are some results (unscientific benchmark attached):
FILL
Benchmark Mode Cnt Score Error Units
BulkOps.segment_fill avgt 10 119323.358 ± 3484.991 ns/op
BulkOps.segment_fill_int_loop avgt 10 2055700.828 ± 101325.298 ns/op
BulkOps.segment_fill_long_loop avgt 10 47875.953 ± 1727.711 ns/op
COPY
Benchmark Mode Cnt Score Error Units
BulkOps.segment_copy_static avgt 10 86283.631 ± 4562.169 ns/op
BulkOps.segment_copy_static_int_loop avgt 10 82480.038 ± 3476.123 ns/op
BulkOps.segment_copy_static_long_loop avgt 10 78929.262 ± 2100.533 ns/op
BulkOps.segment_copy_static_small avgt 10 4.346 ± 0.037 ns/op
BulkOps.segment_copy_static_small_int_loop avgt 10 5.110 ± 0.055 ns/op
BulkOps.segment_copy_static_small_long_loop avgt 10 4.208 ± 0.026 ns/op
MISMATCH
Benchmark Mode Cnt Score Error Units
BulkOps.mismatch_large_segment avgt 10 38011.887 ± 2219.403 ns/op
BulkOps.mismatch_large_segment_int_loop avgt 10 778412.959 ± 11380.481 ns/op
BulkOps.mismatch_large_segment_long_loop avgt 10 283515.423 ± 7737.791 ns/op
BulkOps.mismatch_small_segment avgt 10 2.719 ± 0.097 ns/op
BulkOps.mismatch_small_segment_int_loop avgt 10 2.963 ± 0.030 ns/op
BulkOps.mismatch_small_segment_long_loop avgt 10 2.892 ± 0.011 ns/op
Overall, great progress. I think we're close to being able to just use plain loops for these routines in the memory segment implementation classes (and maybe even in ByteBuffer).
One notable hiccup is that loops using int induction variables are still significantly slower than those using long induction variables.
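The benchmark source is attached rather than inlined above, but for context, the two induction-variable shapes being compared presumably look something like the following sketch (class and method names are mine, not the benchmark's):

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

public class FillLoops {

    // int induction variable: currently optimizes noticeably worse in C2
    // than the long variant (note the implicit int-to-long widening on
    // every comparison and offset computation)
    static void fillIntLoop(MemorySegment seg, byte value) {
        for (int i = 0; i < seg.byteSize(); i++) {
            seg.set(ValueLayout.JAVA_BYTE, i, value);
        }
    }

    // long induction variable: same logical loop, much better codegen today
    static void fillLongLoop(MemorySegment seg, byte value) {
        for (long i = 0; i < seg.byteSize(); i++) {
            seg.set(ValueLayout.JAVA_BYTE, i, value);
        }
    }

    public static void main(String[] args) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment seg = arena.allocate(1024);
            fillLongLoop(seg, (byte) 42);
            System.out.println(seg.get(ValueLayout.JAVA_BYTE, 1023)); // 42
        }
    }
}
```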
Another issue (but this one is known) is that the intrinsic for mismatch is still faster than a loop -- this is due to limitations with autovectorization and control flow (as mismatch needs to branch out of the loop if a mismatch is detected).
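To make the control-flow issue concrete, here is a minimal sketch of what a plain-loop equivalent of MemorySegment.mismatch looks like (my own code, not the benchmark's): the early exit in the loop body is exactly the branch that autovectorization currently can't handle.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

public class MismatchLoop {

    // Plain-loop equivalent of MemorySegment.mismatch: returns the offset of
    // the first differing byte, or -1 if the segments are identical.
    static long mismatch(MemorySegment a, MemorySegment b) {
        long len = Math.min(a.byteSize(), b.byteSize());
        for (long i = 0; i < len; i++) {
            // This early exit introduces control flow into the loop body,
            // which currently prevents C2 autovectorization; the intrinsic
            // sidesteps the limitation with hand-written vector code.
            if (a.get(ValueLayout.JAVA_BYTE, i) != b.get(ValueLayout.JAVA_BYTE, i)) {
                return i;
            }
        }
        return a.byteSize() == b.byteSize() ? -1 : len;
    }

    public static void main(String[] args) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment a = arena.allocate(8); // Arena allocations are zeroed
            MemorySegment b = arena.allocate(8);
            b.set(ValueLayout.JAVA_BYTE, 5, (byte) 1);
            System.out.println(mismatch(a, b)); // first difference at offset 5
        }
    }
}
```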
- relates to JDK-8331659: C2 SuperWord: investigate failed vectorization in compiler/loopopts/superword/TestMemorySegment.java (Closed)