JDK-8229895

Avoid GC lock when no array parameters are passed to critical native

Details

    • Enhancement
    • Resolution: Won't Fix
    • P3
    • tbd
    • 14
    • hotspot
    • None

    Description

      This enhancement was proposed by Ioannis Tsakpinis <iotsakp@gmail.com> in a discussion on the shenandoah-dev mailing list (https://mail.openjdk.java.net/pipermail/shenandoah-dev/2019-August/010422.html) as a way to improve JNI performance. The proposal reads:

      It's true that CriticalJNINatives were added as an efficient way to
      access Java arrays from JNI code. However, the overhead of JNI calls
      affects all methods, especially methods that accept and return only
      primitive values and whose JNI code does nothing but pass the arguments
      to another native function.

      There are thousands of JNI functions in LWJGL and almost all are like
      that: they simply cast arguments to the appropriate type and pass them
      to a target native function. The code produced by libraries like JNR and
      by other JNI binding generators looks the same.

      The major benefit of using CriticalJNINatives for such functions is the
      removal of the first two standard JNI parameters: JNIEnv* and jclass.
      Normally that would only mean less register pressure, which may help in
      some cases. In practice though, native compilers are able to optimize
      away any argument shuffling and convert everything to a simple
      tail-call, i.e. a single jump instruction.

      We go from this for standard JNI:

      Java -> shuffle arguments -> JNI -> shuffle arguments -> native call

      to this for critical JNI:

      Java -> shuffle arguments -> JNI -> native call

      Example code and assembly output: https://godbolt.org/z/qZRIi1
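
      As a rough illustration of the two call paths above, here is a minimal C sketch of a primitive-only binding in both forms. The names `target_add` and `demo.Bindings.add` are hypothetical, and the typedefs are stand-ins so the sketch compiles without the JDK's `<jni.h>`; real code would include that header and use `JNIEXPORT`/`JNICALL`. The `JavaCritical_` prefix is the naming convention HotSpot uses to look up the critical variant of a native method.

```c
#include <stdint.h>

/* Stand-in typedefs (assumption): real bindings get these from <jni.h>. */
typedef struct JNIEnv_ JNIEnv;
typedef void *jclass;
typedef int32_t jint;

/* Hypothetical target native function that the binding forwards to. */
int32_t target_add(int32_t a, int32_t b) {
    return a + b;
}

/* Standard JNI entry point: env and clazz occupy the first two argument
   slots, so a and b have to be moved into place before calling the target. */
jint Java_demo_Bindings_add(JNIEnv *env, jclass clazz, jint a, jint b) {
    (void)env;   /* unused by a pure pass-through binding */
    (void)clazz;
    return (jint)target_add(a, b);
}

/* Critical variant: no JNIEnv* or jclass, so the arguments already sit where
   the target expects them and a compiler can emit a plain tail call. */
jint JavaCritical_demo_Bindings_add(jint a, jint b) {
    return (jint)target_add(a, b);
}
```

      Both entry points return the same value; the difference is only in how much argument shuffling the native compiler has to emit for the wrapper.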

      This has a measurable effect on JNI call overhead and becomes more
      important the simpler the target native function is. With Project Panama
      there is no JNI function and it should be possible to optimize the first
      argument shuffling too. Until then, this is the best we can do, unless
      there are opportunities to slim down the JNI wrapper even further for
      critical native methods (e.g. remove the safepoint polling if it's safe
      to do so).

      To sum up, the motivation is reduced JNI overhead. My argument is that
      primitive-only functions could benefit from significant overhead
      reduction with CriticalJNINatives. However, the GC locking effect is a
      major and unnecessary disadvantage. Shenandoah does a perfect job here
      because it supports region pinning and there's no actual locking
      happening in primitive-only functions. Every other GC though will choke
      hard with applications that make heavy use of critical natives (such as
      typical LWJGL applications). So, two requests:

      - PRIMARY: Skip check_needs_gc_for_critical_native() in primitive-only
      functions, regardless of GC algorithm and object-pinning support.

      - BONUS: JNI call overhead is significantly higher (by 3-4 ns) on Java 10+
      compared to Java 8 (with or without critical natives). I went through
      the timeline of sharedRuntime_x86_64.cpp but couldn't spot anything that
      would justify such a difference (thread-local handshakes maybe?). I was
      wondering if this is a performance regression that needs to be looked
      into.
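
      The primary request amounts to deciding, per method, whether any array parameter is present at all. The following C sketch shows one way that decision could be expressed over a JVM method descriptor string. Both helper names are hypothetical and this is not how HotSpot's wrapper generator is written; it only illustrates the condition under which the GC lock check could be skipped.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true if the parameter list of a JVM method
   descriptor such as "(I[BJ)V" contains an array type. Only the part between
   '(' and ')' is scanned, so the return type is ignored. */
bool descriptor_has_array_param(const char *desc) {
    if (desc == 0 || *desc != '(') {
        return false; /* not a method descriptor */
    }
    for (const char *p = desc + 1; *p != '\0' && *p != ')'; ++p) {
        if (*p == '[') {
            return true;
        }
    }
    return false;
}

/* Hypothetical policy: a critical native needs the GC lock check only when
   it can actually receive a Java array. */
bool needs_gc_lock_check(const char *desc) {
    return descriptor_has_array_param(desc);
}
```

      Under this sketch a primitive-only descriptor such as `(IJ)I` would skip the check, while `(I[B)V` would still go through it.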

            People

              coleenp Coleen Phillimore
              zgu Zhengyu Gu
              Votes: 0
              Watchers: 4
