-
Enhancement
-
Resolution: Fixed
-
P4
-
17, 21, 22
-
b06
Similarly to JDK-8284578, we would like to handle ICStub alignment. Currently, the small stub that takes only 24 bytes of code is covered by 128 bytes on AArch64. This is due to the same thing fixed by JDK-8284578 for interpreter codelets: aligning twice the CodeEntryAlignment.
These stubs are needed for IC transitions, so we don't need to over-align them.
However, there is a tangential correctness consideration: we want to have ICStubs taking the separate instruction cache lines, to piggyback on instruction coherence. Erik O says:
--- 8< ------
"One thing that sometimes keeps me up at night is that when ICStubs are small enough that a newly allocated one shares s cache line with a previously allocated one, then we start to rely on instruction cache coherency working properly for correctness. In particular a CPU might have both the destination of an inline cache and the contents of the ICStub cache line in its instruction caches. Yet when we first update the ICStub and then subsequently update the destination to point at the stub, we are relying on instruction fetching being ordered.
We have a similar situation with static call stubs. But there we have an isb instruction in the static call stub which is there from when an nmethod is compiled, which deals with this exact problem. Now thatstub is used to transfer control to the interpreter and we can do inefficient things. As for ICStubs, it would probably not be very pretty.
So instead, as I see it, we have kind of relied on ICStubs being spaced out in memory enough that the instruction fetcher would just not have it cached since the ICStub finalization which happens in a safepoint and is followed by cross modifying code fences on all threads.
Because of this, I think I would sleep a bit worse at night, especially regarding AArch64, if these stubs start to get too cozy with each other in memory. I sleep better with the social distancing policy of today, where the stubs are aware there is a dangerous disease to be worried about from their friends and not to catch it."
--- 8< ------
Since on most interesting architectures the cache line size is smaller than 2*CodeEntryAlignment, this would also optimize ICBuffer footprint.
These stubs are needed for IC transitions, so we don't need to over-align them.
However, there is a tangential correctness consideration: we want to have ICStubs taking the separate instruction cache lines, to piggyback on instruction coherence. Erik O says:
--- 8< ------
"One thing that sometimes keeps me up at night is that when ICStubs are small enough that a newly allocated one shares s cache line with a previously allocated one, then we start to rely on instruction cache coherency working properly for correctness. In particular a CPU might have both the destination of an inline cache and the contents of the ICStub cache line in its instruction caches. Yet when we first update the ICStub and then subsequently update the destination to point at the stub, we are relying on instruction fetching being ordered.
We have a similar situation with static call stubs. But there we have an isb instruction in the static call stub which is there from when an nmethod is compiled, which deals with this exact problem. Now thatstub is used to transfer control to the interpreter and we can do inefficient things. As for ICStubs, it would probably not be very pretty.
So instead, as I see it, we have kind of relied on ICStubs being spaced out in memory enough that the instruction fetcher would just not have it cached since the ICStub finalization which happens in a safepoint and is followed by cross modifying code fences on all threads.
Because of this, I think I would sleep a bit worse at night, especially regarding AArch64, if these stubs start to get too cozy with each other in memory. I sleep better with the social distancing policy of today, where the stubs are aware there is a dangerous disease to be worried about from their friends and not to catch it."
--- 8< ------
Since on most interesting architectures the cache line size is smaller than 2*CodeEntryAlignment, this would also optimize ICBuffer footprint.
- is blocked by
-
JDK-8237842 Separate definitions for default cache line and padding sizes
- Resolved
- relates to
-
JDK-8284578 Relax InterpreterCodelet stub alignment
- Resolved