JDK-8353558

x86: Use better instructions for ICache sync when available

      Since the beginning of time, we have been using CLFLUSH + 2xMFENCE to perform ICache flushes. Both CLFLUSH and MFENCE are slow, but this has not been a practical problem so far, as most flushes happen in the shadow of a much larger operation, such as whole-method JIT compilation, or method patching following a class load or GC.

      But Leyden wants to load a lot of code as fast as it can. There, code loads are fast, and code cache flush costs are now a significant part of the picture. There are single-digit percent startup time opportunities in better ICache flushes. Luckily, modern x86 provides CLFLUSHOPT and CLWB, which allow for hardware-side optimization.

      CLFLUSHOPT promises more throughput because it avoids ordering dependencies against other CLFLUSHOPTs. It can also use more relaxed fences to order with the rest of the code.

      CLWB provides the same thing as CLFLUSHOPT, _plus_ it keeps the cache line in the local cache for later reuse. For our code cache flushes, where we often make a few adjacent relocation modifications, this turns out to be an additional benefit.
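      To make the three options concrete, here is a rough user-space sketch using compiler intrinsics. It is not the HotSpot stub or the code from the PR. All names (`FlushKind`, `choose_flush`, `lines_covering`, `flush_range`) are hypothetical. It assumes GCC or Clang on x86-64, and it assumes that a single trailing SFENCE is the kind of "more relaxed fence" meant above. On other platforms it compiles to a no-op.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <cpuid.h>
#include <immintrin.h>
#define SKETCH_HAVE_X86 1
#else
#define SKETCH_HAVE_X86 0
#endif

// Hypothetical names throughout; this is an illustration, not HotSpot code.
enum class FlushKind { None, Clflush, Clflushopt, Clwb };

// Preference order suggested by the description: CLWB, then CLFLUSHOPT, then CLFLUSH.
constexpr FlushKind choose_flush(bool has_clflush, bool has_clflushopt, bool has_clwb) {
  return has_clwb       ? FlushKind::Clwb
       : has_clflushopt ? FlushKind::Clflushopt
       : has_clflush    ? FlushKind::Clflush
                        : FlushKind::None;
}

// Number of cache lines of size `line` (a power of two) touched by [start, start + len).
inline std::size_t lines_covering(std::uintptr_t start, std::size_t len, std::size_t line) {
  assert(line != 0 && (line & (line - 1)) == 0);
  if (len == 0) return 0;
  const std::uintptr_t mask = ~static_cast<std::uintptr_t>(line - 1);
  return ((((start + len - 1) & mask) - (start & mask)) / line) + 1;
}

#if SKETCH_HAVE_X86
// Baseline from the description: CLFLUSH bracketed by two MFENCEs.
static void flush_clflush(char* p, std::size_t n, std::size_t line) {
  _mm_mfence();
  for (std::size_t i = 0; i < n; i++) _mm_clflush(p + i * line);
  _mm_mfence();
}

// CLFLUSHOPT per line; the trailing SFENCE is an assumption of this sketch.
__attribute__((target("clflushopt")))
static void flush_clflushopt(char* p, std::size_t n, std::size_t line) {
  for (std::size_t i = 0; i < n; i++) _mm_clflushopt(p + i * line);
  _mm_sfence();
}

// CLWB per line; same fencing assumption as above.
__attribute__((target("clwb")))
static void flush_clwb(char* p, std::size_t n, std::size_t line) {
  for (std::size_t i = 0; i < n; i++) _mm_clwb(p + i * line);
  _mm_sfence();
}
#endif

// Flushes every cache line overlapping [start, start + len) with the best
// instruction the CPU reports, and returns which one was used.
FlushKind flush_range(const void* start, std::size_t len) {
#if SKETCH_HAVE_X86
  unsigned a = 0, b = 0, c = 0, d = 0;
  bool has_clflush = false, has_opt = false, has_clwb = false;
  std::size_t line = 64;  // fallback guess if CPUID does not report a size
  if (__get_cpuid(1, &a, &b, &c, &d)) {
    has_clflush = ((d >> 19) & 1u) != 0;              // CPUID.1:EDX[19]
    if (has_clflush) line = ((b >> 8) & 0xffu) * 8;   // CPUID.1:EBX[15:8] * 8
  }
  if (__get_cpuid_count(7, 0, &a, &b, &c, &d)) {
    has_opt  = ((b >> 23) & 1u) != 0;                 // CPUID.7.0:EBX[23]
    has_clwb = ((b >> 24) & 1u) != 0;                 // CPUID.7.0:EBX[24]
  }
  const FlushKind kind = choose_flush(has_clflush, has_opt, has_clwb);
  const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(start);
  char* base = reinterpret_cast<char*>(addr & ~static_cast<std::uintptr_t>(line - 1));
  const std::size_t n = lines_covering(addr, len, line);
  switch (kind) {
    case FlushKind::Clwb:       flush_clwb(base, n, line);       break;
    case FlushKind::Clflushopt: flush_clflushopt(base, n, line); break;
    case FlushKind::Clflush:    flush_clflush(base, n, line);    break;
    case FlushKind::None:       break;
  }
  return kind;
#else
  (void)start; (void)len;
  return FlushKind::None;
#endif
}
```

      A caller would invoke `flush_range(code_start, code_size)` after patching. Feature detection is done via CPUID at run time because executing CLFLUSHOPT or CLWB on a CPU that lacks them faults. A real implementation would presumably detect once and cache the result.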

      Draft PR: https://github.com/openjdk/jdk/pull/24389

            shade Aleksey Shipilev