-
Enhancement
-
Resolution: Unresolved
-
P4
-
24
Consider this function:
int sum(MemorySegment ms) {
    int sum = 0;
    for (long offset = 0; offset < ms.byteSize() - 3; offset += 4) {
        sum += ms.get(JAVA_INT_UNALIGNED, offset);
        notinlinedCall(); // Or a memory fence such as VarHandle.fullFence()
    }
    return sum;
}
At each iteration, we have to perform several checks and load the corresponding fields:
* ms.length - 3 >= 0
* offset u< ms.length - 3
* ms.scope.owner != null
* ms.scope.owner == Thread.currentThread()
* ms.scope.state >= 0
However, ms.length, ms.scope, and ms.scope.owner are trusted final fields. If we hoist their loads out of the loop, we may eliminate 2 checks inside the loop. Hoisting ms.length may also turn the loop into a counted loop, opening more chances for other optimizations. Trusted final fields are already allowed to be constant folded, so I think the VM allows the assumption that they are truly immutable, which should allow the proposed hoisting.
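The effect of the proposed hoisting can be sketched by hand in plain Java. This is only an illustration: Segment, Scope, getIntUnchecked and opaqueCall are hypothetical stand-ins for the real classes, and the choice of which two checks leave the loop (the ones that depend only on trusted final loads) is my reading of the list above.

```java
// Hypothetical stand-ins; the field names follow the checks listed above.
final class Scope {
    final Thread owner;   // trusted final in this model; null means shared
    volatile int state;   // mutable; negative means closed (assumption)
    Scope(Thread owner) { this.owner = owner; }
}

final class Segment {
    final long length;    // trusted final in this model, in bytes
    final Scope scope;    // trusted final in this model
    private final int[] data;
    Segment(int[] data, Scope scope) {
        this.data = data.clone();
        this.length = 4L * data.length;
        this.scope = scope;
    }
    int getIntUnchecked(long offset) { return data[(int) (offset >>> 2)]; }
}

final class HoistDemo {
    static void opaqueCall() { } // stands in for the non-inlined call or fence

    // Shape of the loop today: fields reloaded and checks repeated per iteration.
    static int sumNaive(Segment ms) {
        int sum = 0;
        for (long offset = 0; offset < ms.length - 3; offset += 4) {
            if (ms.length - 3 < 0) throw new IndexOutOfBoundsException();
            if (Long.compareUnsigned(offset, ms.length - 3) >= 0) throw new IndexOutOfBoundsException();
            Thread owner = ms.scope.owner;
            if (owner != null && owner != Thread.currentThread()) throw new IllegalStateException("wrong thread");
            if (ms.scope.state < 0) throw new IllegalStateException("closed");
            sum += ms.getIntUnchecked(offset);
            opaqueCall();
        }
        return sum;
    }

    // Shape after hoisting: trusted final fields are loaded once before the loop.
    static int sumHoisted(Segment ms) {
        final long limit = ms.length - 3;        // loop bound is now invariant
        final Scope scope = ms.scope;
        final Thread owner = scope.owner;
        final boolean confined = owner != null;  // null check evaluated once
        int sum = 0;
        for (long offset = 0; offset < limit; offset += 4) {
            // limit >= 0 is implied by 0 <= offset < limit, so that check is gone.
            if (Long.compareUnsigned(offset, limit) >= 0) throw new IndexOutOfBoundsException();
            if (confined && owner != Thread.currentThread()) throw new IllegalStateException("wrong thread");
            if (scope.state < 0) throw new IllegalStateException("closed"); // state is mutable, stays in the loop
            sum += ms.getIntUnchecked(offset);
            opaqueCall();
        }
        return sum;
    }

    public static void main(String[] args) {
        Segment ms = new Segment(new int[] {1, 2, 3, 4}, new Scope(Thread.currentThread()));
        System.out.println(sumNaive(ms) + " " + sumHoisted(ms));
    }
}
```

Both methods compute the same result; the second only makes explicit what the compiler could do on its own if it could prove that the final field loads are loop invariant across the call.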
One way to do so may be to idealize a LoadNode that loads from a trusted final field by changing its memory input to C->immutable_memory(). Another way is to decorate the load with C2_IMMUTABLE_MEMORY during parsing; the drawback is that this does not cover Unsafe accesses such as those made through VarHandle.
Note that while it is hard to make sure that an object from which a final field is loaded is fully initialized, it seems easier to make sure that a final field is not changed during the execution of the compiled method. A well-behaved final field is only written in the constructor of the object that holds it. As a result, a final field may be changed during the execution of the method if any of the following holds:
- The method being compiled is the constructor of the object from which the final field is loaded
- The method being compiled calls the constructor of the object from which the final field is loaded
- The method being compiled runs concurrently with the constructor of the object from which the final field is loaded
- The final field is not really well-behaved (the final field may be written into after the object is initialized, maybe using reflection or Unsafe)
The first condition is trivial to check. The second condition means that the object from which the final field is loaded must not be a newly allocated object, as constructors are only allowed to run on uninitialized objects. The third condition means that the object must not leak itself to another thread before its constructor finishes. Although this is generally the case, it cannot be guaranteed for arbitrary classes, so the optimization does not seem applicable to general usages. The fourth condition is ruled out when trust_final_non_static_fields(ciInstanceKlass*) in ciField.cpp returns true for the holder class.
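To make the constructor-related conditions concrete, here is a small sketch of an object that leaks itself before its final field is assigned. Holder, EscapeDemo and read are hypothetical names. For determinism the leak is observed on the same thread, but the same reference could just as well have been handed to another thread.

```java
final class Holder {
    static Holder leaked;   // hypothetical escape route for 'this'
    final int value;

    Holder(int value) {
        leaked = this;      // 'this' escapes before the final field is assigned
        EscapeDemo.observedBefore = EscapeDemo.read(leaked);
        this.value = value; // the only store to the final field
        EscapeDemo.observedAfter = EscapeDemo.read(leaked);
    }
}

final class EscapeDemo {
    static int observedBefore;
    static int observedAfter;

    // Stands in for a compiled method that loads the final field.
    static int read(Holder h) {
        return h.value;
    }

    public static void main(String[] args) {
        new Holder(42);
        // The same final field yields two different values.
        System.out.println(observedBefore + " -> " + observedAfter); // prints "0 -> 42"
    }
}
```

If the two loads in read's callers were merged on the assumption that the field is immutable, the second observation would be wrong, which is why the constructor cases have to be excluded.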
Combining all these conditions, it seems we can apply this optimization to final field loads from classes that are explicitly listed in trust_final_non_static_fields(ciInstanceKlass*). That is, we can apply it to fields of classes that belong to the listed packages, such as jdk.internal.foreign.AbstractMemorySegmentImpl, but not to fields of a general hidden or record class, as such a class may leak itself to another thread before its constructor finishes.
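The combined rule can be written down as a small decision function. This is a model of the reasoning only, not HotSpot code: ImmutableLoadPolicy, its parameters and the prefix list are all hypothetical, and the prefixes are an illustrative subset rather than the actual contents of trust_final_non_static_fields.

```java
final class ImmutableLoadPolicy {
    // Assumption: illustrative package prefixes, not the real list in ciField.cpp.
    private static final String[] TRUSTED_PREFIXES = {
        "java/lang/invoke/",
        "jdk/internal/foreign/",
    };

    static boolean isExplicitlyTrusted(String holderClass) {
        for (String prefix : TRUSTED_PREFIXES) {
            if (holderClass.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    // Returns true if a final field load may use immutable memory in this model.
    static boolean canUseImmutableMemory(String holderClass,
                                         boolean compilingHolderConstructor,
                                         boolean receiverIsNewAllocation) {
        if (compilingHolderConstructor) {
            return false; // condition 1: we are inside the constructor
        }
        if (receiverIsNewAllocation) {
            return false; // condition 2: the constructor may be called from here
        }
        // Conditions 3 and 4: only explicitly listed classes are assumed
        // not to leak 'this' early and not to be written after construction.
        return isExplicitlyTrusted(holderClass);
    }

    public static void main(String[] args) {
        System.out.println(canUseImmutableMemory(
            "jdk/internal/foreign/AbstractMemorySegmentImpl", false, false));
        System.out.println(canUseImmutableMemory("com/example/Point", false, false));
    }
}
```

Being a record or a hidden class is deliberately not enough in this model, which matches the conclusion above.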
- relates to
-
JDK-8334754 C2: Optimize accesses to provably final instance fields
-
- Open
-