-
Enhancement
-
Resolution: Unresolved
-
P4
-
24
[~asmehra] noticed the following issue:
There are two compilations of the method.
In the OptoAssembly for compile id 386 (the second compile), the basic block #5 is:
059 B5: # out( B10 B6 ) <- in( B4 ) Freq: 0.31918
059 decode_heap_oop_not_null R10,RBP
05d leaq RDI, [R12 + RBP << 3 + #16] (compressed oop addressing) # ptr compressedoopoff32
062 cmpl RDX, #2
065 jl B10 P=0.000625 C=8001.000000
Corresponding assembly code:
;; B5: # out( B10 B6 ) <- in( B4 ) Freq: 0.31918
0x00007f90807b0819: lea (%r12,%rbp,8),%r10 ;*getfield value {reexecute=0 rethrow=0 return_oop=0}
; - java.lang.String::hashCode@24 (line 2458)
0x00007f90807b081d: lea 0x10(%r12,%rbp,8),%rdi ;*invokestatic vectorizedHashCode {reexecute=0 rethrow=0 return_oop=0}
; - jdk.internal.util.ArraysSupport::hashCodeOfUnsigned@51 (line 276)
; - java.lang.StringLatin1::hashCode@5 (line 306)
; - java.lang.String::hashCode@27 (line 2458)
0x00007f90807b0822: cmp $0x2,%edx
0x00007f90807b0825: jl 0x00007f90807b0a1d ;*lookupswitch {reexecute=0 rethrow=0 return_oop=0}
; - jdk.internal.util.ArraysSupport::hashCodeOfUnsigned@1 (line 273)
; - java.lang.StringLatin1::hashCode@5 (line 306)
; - java.lang.String::hashCode@27 (line 2458)
Turns out the result of decoding the narrow oop at instruction 0x00007f90807b0819 is never used. It seems that operation is just redundant here.
What I think is happening is:
https://github.com/openjdk/jdk/blob/b6700095c018a67a55b746cd4eee763c68f538e0/src/hotspot/share/opto/matcher.cpp#L1870
explicitly adds the DecodeN as a dependency on the
leaPCompressedOopOffset node. I think this is due to derived
pointers.
I think, at some point (after register allocation), the edge between
leaPCompressedOopOffset and the DecodeN is no longer needed and could be removed and the DecodeN be removed as well if it has no use. I don't see code that does that (but I could be missing it). Also, I suppose the oopmap could record:
(compressed oop, decompressed oop + off)
and that would be sufficient to update the derived pointer. That would
require change to the derived pointer logic on the runtime side and c2
side. Also, I suppose we would need 2 variants of the oop map entries:
(decompressed oop, decompressed oop + off)
and
(compressed oop, decompressed oop + off)
because there must be cases where all we have is a decompressed
oop. So we would need a way for the runtime to tell which kind it's
working on.
I suppose it also happens that debug info keeps a DecodeN live while it would be trivial to compute the oop from the compressed oop at runtime. So if we were to fix this, it could apply to more than just derived pointers.
There are two compilations of the method.
In the OptoAssembly for compile id 386 (the second compile), the basic block #5 is:
059 B5: # out( B10 B6 ) <- in( B4 ) Freq: 0.31918
059 decode_heap_oop_not_null R10,RBP
05d leaq RDI, [R12 + RBP << 3 + #16] (compressed oop addressing) # ptr compressedoopoff32
062 cmpl RDX, #2
065 jl B10 P=0.000625 C=8001.000000
Corresponding assembly code:
;; B5: # out( B10 B6 ) <- in( B4 ) Freq: 0.31918
0x00007f90807b0819: lea (%r12,%rbp,8),%r10 ;*getfield value {reexecute=0 rethrow=0 return_oop=0}
; - java.lang.String::hashCode@24 (line 2458)
0x00007f90807b081d: lea 0x10(%r12,%rbp,8),%rdi ;*invokestatic vectorizedHashCode {reexecute=0 rethrow=0 return_oop=0}
; - jdk.internal.util.ArraysSupport::hashCodeOfUnsigned@51 (line 276)
; - java.lang.StringLatin1::hashCode@5 (line 306)
; - java.lang.String::hashCode@27 (line 2458)
0x00007f90807b0822: cmp $0x2,%edx
0x00007f90807b0825: jl 0x00007f90807b0a1d ;*lookupswitch {reexecute=0 rethrow=0 return_oop=0}
; - jdk.internal.util.ArraysSupport::hashCodeOfUnsigned@1 (line 273)
; - java.lang.StringLatin1::hashCode@5 (line 306)
; - java.lang.String::hashCode@27 (line 2458)
Turns out the result of decoding the narrow oop at instruction 0x00007f90807b0819 is never used. It seems that operation is just redundant here.
What I think is happening is:
https://github.com/openjdk/jdk/blob/b6700095c018a67a55b746cd4eee763c68f538e0/src/hotspot/share/opto/matcher.cpp#L1870
explicitly adds the DecodeN as a dependency on the
leaPCompressedOopOffset node. I think this is due to derived
pointers.
I think, at some point (after register allocation), the edge between
leaPCompressedOopOffset and the DecodeN is no longer needed and could be removed and the DecodeN be removed as well if it has no use. I don't see code that does that (but I could be missing it). Also, I suppose the oopmap could record:
(compressed oop, decompressed oop + off)
and that would be sufficient to update the derived pointer. That would
require change to the derived pointer logic on the runtime side and c2
side. Also, I suppose we would need 2 variants of the oop map entries:
(decompressed oop, decompressed oop + off)
and
(compressed oop, decompressed oop + off)
because there must be cases where all we have is a decompressed
oop. So we would need a way for the runtime to tell which kind it's
working on.
I suppose it also happens that debug info keeps a DecodeN live while it would be trivial to compute the oop from the compressed oop at runtime. So if we were to fix this, it could apply to more than just derived pointers.