Type: Enhancement
Resolution: Unresolved
Priority: P4
Fix Version/s: tbd
Affects Version/s: None
Component/s: hotspot
Labels:

Subcomponent: compiler

I noticed this while experimenting with JDK-8149758: C2_MacroAssembler::genmask calls mov64(temp, -1L), which is encoded using 10 bytes.

For immediate values that fit in 32 bits, mov64 could use the shorter movl and movq encodings instead of movabs, similar to what MacroAssembler::movptr does.
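The selection rule described above can be sketched as follows. This is only an illustration, not the HotSpot implementation: the names MovEncoding, pick_mov_encoding and encoded_size are hypothetical, and the byte counts assume the standard x86-64 register-destination forms (movl r32, imm32; movq r64, sign-extended imm32; movabs r64, imm64).

```cpp
#include <cstdint>

// Hypothetical names for illustration only; this is not HotSpot's API.
enum class MovEncoding { Movl, MovqImm32, Movabs };

// Pick the shortest encoding that still leaves `imm` in a 64-bit register.
constexpr MovEncoding pick_mov_encoding(std::uint64_t imm) {
    // Upper 32 bits are zero: a 32-bit mov zero-extends into the full register.
    if (imm <= 0xffffffffULL) return MovEncoding::Movl;
    // Top 33 bits are all ones: the value is the sign extension of a
    // negative 32-bit immediate, so movq with imm32 reproduces it.
    if (imm >= 0xffffffff80000000ULL) return MovEncoding::MovqImm32;
    // Anything else needs the full 64-bit immediate.
    return MovEncoding::Movabs;
}

// Encoded length in bytes. `extended_reg` is true for r8..r15, which need
// a REX prefix even for the 32-bit form.
constexpr int encoded_size(MovEncoding enc, bool extended_reg) {
    switch (enc) {
        case MovEncoding::Movl:      return extended_reg ? 6 : 5; // [REX] B8+r imm32
        case MovEncoding::MovqImm32: return 7;                    // REX.W C7 /0 imm32
        case MovEncoding::Movabs:    return 10;                   // REX.W B8+r imm64
    }
    return 10;
}

// The genmask case from this issue: -1L drops from 10 bytes to 7.
static_assert(pick_mov_encoding(0xffffffffffffffffULL) == MovEncoding::MovqImm32,
              "-1 fits a sign-extended imm32");
static_assert(encoded_size(MovEncoding::MovqImm32, false) == 7, "movq imm32 is 7 bytes");
```

A call site that relies on a fixed 10-byte instruction would have to bypass such a selector and keep emitting movabs unconditionally.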

I found only one location (in MacroAssembler::ic_call) that depends on mov64 being exactly 10 bytes, and it can be excluded from the change.

In the Renaissance dotty benchmark, most of the savings come from the immediate values 0x3ffffff, 0xfffffffffffffffc and 0xffffffff, for a total of about 460 bytes saved:

java -jar ~/.cache/stress/renaissance-gpl-0.16.1.jar -r 1 dotty

During a Java build with make, I saw ~1K of savings in one instance.

While the numbers may seem small, the change may improve the instruction cache hit rate and reduce decode pressure on the frontend. It would not change semantics either: whichever of movabs, movl or movq is selected, the destination register ends up holding the same 64-bit value.
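The equivalence claim can be checked with a small model of what each instruction leaves in a 64-bit register. This is a sketch under the usual x86-64 rules (a 32-bit mov zero-extends, movq with imm32 sign-extends). The function names are hypothetical and nothing here comes from the HotSpot sources.

```cpp
#include <cstdint>

// Register contents after `movl r32, imm32`: the upper 32 bits are cleared.
constexpr std::uint64_t after_movl(std::uint32_t imm32) {
    return static_cast<std::uint64_t>(imm32);
}

// Register contents after `movq r64, imm32`: the immediate is sign-extended.
constexpr std::uint64_t after_movq_imm32(std::int32_t imm32) {
    return static_cast<std::uint64_t>(static_cast<std::int64_t>(imm32));
}

// Register contents after `movabs r64, imm64`: the immediate is used as is.
constexpr std::uint64_t after_movabs(std::uint64_t imm64) {
    return imm64;
}

// The three immediates named in this issue produce identical register values.
static_assert(after_movl(0x3ffffffu) == after_movabs(0x3ffffffULL), "movl matches movabs");
static_assert(after_movl(0xffffffffu) == after_movabs(0xffffffffULL), "movl matches movabs");
static_assert(after_movq_imm32(-4) == after_movabs(0xfffffffffffffffcULL), "movq matches movabs");

// Why the choice depends on the value: movl would truncate -4.
static_assert(after_movl(0xfffffffcu) != after_movabs(0xfffffffffffffffcULL),
              "movl cannot represent a negative 64-bit value");
```

The last assertion shows why the shorter form has to be picked per value: the result is only the same when the immediate survives the zero or sign extension of the chosen encoding.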

relates to

JDK-8149758 Small arraycopy of non-constant length is slower than individual load/stores

Open

links to

Review(master) openjdk/jdk/30073

Assignee: Kerem Kat
Reporter: Kerem Kat
Votes: 0
Watchers: 3

Created: 2 days ago 12:23
Updated: Yesterday 03:54
