Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-8308762

Metaspace leak with Instrumentation.retransform

    XMLWordPrintable

Details

    • b17
    • generic
    • generic

    Description

      We have just identified a Metaspace leak in a very specific case when
      a class has a method using a try-with-resources construct (or similar
      with try-catch) and re-transforming this class in a loop. It is
      reproducible from jdk8 to jdk20.
      Here the steps to reproduce:

       1. create a java file with following content:

      public class RetransformLeak {
          public static void main(String[] args) throws Exception {
              new MyClass();
              while (true) {
                  Thread.sleep(1000);
              }
          }
      }

      class MyClass {
          private static void writeFile() {
              TWR var0 = new TWR();
              try {
                  var0.process();
              } catch (Throwable var4) {
                  try {
                      var0.close();
                  } catch (Throwable var3) {
                      var4.addSuppressed(var3);
                  }
                  throw var4;
              }
              var0.close();
      // try (TWR twr = new TWR()) {
      // twr.process();
      // }
          }

          static class TWR implements AutoCloseable {
              public void process() {}
              @Override
              public void close() {}
          }
      }

      2. compile it: javac RetransformLeak.java
      3. create a java file Agent.java with the following content that will
      be our java agent performing re-transformation:

      public class Agent {
          public static void premain(String arg, Instrumentation inst) {
              new Thread(() -> retransformLoop(inst, arg)).start();
          }

          private static void retransformLoop(Instrumentation
      instrumentation, String className) {
              Class<?> classToRetransform = null;
              while (true) {
                  if (classToRetransform == null) {
                      for (Class<?> clazz : instrumentation.getAllLoadedClasses()) {
                          if (clazz.getName().equals(className)) {
                              System.out.println("found class: " + className);
                              classToRetransform = clazz;
                              break;
                          }
                      }
                  }
                  if (classToRetransform != null) {
                      try {
                          instrumentation.retransformClasses(classToRetransform);
                          //Thread.sleep(1);
                      } catch (Exception e) {
                          throw new RuntimeException(e);
                      }
                  }
              }
          }
      }

       4. Compile it: javac Agent.java
       5. create a Manifest.txt file for the java agent:

      Premain-Class: Agent
      Can-Retransform-Classes: true

       6. create java agent jar: jar cfm agent.jar Manifest.txt Agent.class
       7. execute the RetransformLeak class with a max metaspace size:
      java -javaagent:agent.jar=MyClass -XX:MaxMetaspaceSize=128M -cp .
      RetransformLeak

      output:
      found class: MyClass
      Exception in thread "Thread-0" java.lang.OutOfMemoryError
        at java.instrument/sun.instrument.InstrumentationImpl.retransformClasses0(Native
      Method)
        at java.instrument/sun.instrument.InstrumentationImpl.retransformClasses(InstrumentationImpl.java:169)
        at Agent.retransformLoop(Agent.java:22)
        at Agent.lambda$premain$0(Agent.java:5)
        at java.base/java.lang.Thread.run(Thread.java:1623)

      If you comment the line:
                      var4.addSuppressed(var3);
      in MyClass#writeFile method, no OOME will be thrown and Metaspace will
      remain stable.
      You can also directly use a try-with-resources construct to reproduce
      the leak but I have decomposed it with try catch to be able to
      pinpoint more precisely which bytecode may generate the leak.

      Here the output of VM.metaspace after the OOME:

      $ jcmd 6940 VM.metaspace
      6940:

      Total Usage - 6 loaders, 809 classes (776 shared):
        Non-Class: 2605 chunks, 131.55 MB capacity, 127.87 MB ( 97%)
      committed, 127.83 MB ( 97%) used, 37.94 KB ( <1%) free, 8
      bytes ( <1%) waste , deallocated: 11783 blocks with 127.36 MB
            Class: 8 chunks, 266.00 KB capacity, 74.00 KB ( 28%)
      committed, 17.48 KB ( 7%) used, 56.52 KB ( 21%) free, 0
      bytes ( 0%) waste , deallocated: 2 blocks with 512 bytes
             Both: 2613 chunks, 131.81 MB capacity, 127.94 MB ( 97%)
      committed, 127.85 MB ( 97%) used, 94.45 KB ( <1%) free, 8
      bytes ( <1%) waste , deallocated: 11785 blocks with 127.36 MB


      Virtual space:
        Non-class space: 192.00 MB reserved, 127.88 MB ( 67%)
      committed, 3 nodes.
            Class space: 104.00 MB reserved, 128.00 KB ( <1%)
      committed, 1 nodes.
                   Both: 296.00 MB reserved, 128.00 MB ( 43%) committed.


      Chunk freelists:
         Non-Class:

        4m: (none)
        2m: (none)
        1m: (none)
      512k: (none)
      256k: 2, capacity=512.00 KB, committed=0 bytes ( 0%)
      128k: 2, capacity=256.00 KB, committed=0 bytes ( 0%)
       64k: 2, capacity=128.00 KB, committed=0 bytes ( 0%)
       32k: (none)
       16k: (none)
        8k: (none)
        4k: 2, capacity=8.00 KB, committed=8.00 KB (100%)
        2k: 6, capacity=12.00 KB, committed=12.00 KB (100%)
        1k: (none)
      Total word size: 916.00 KB, committed: 20.00 KB ( 2%)

             Class:

        4m: (none)
        2m: 1, capacity=2.00 MB, committed=0 bytes ( 0%)
        1m: 1, capacity=1.00 MB, committed=0 bytes ( 0%)
      512k: 1, capacity=512.00 KB, committed=0 bytes ( 0%)
      256k: (none)
      128k: 1, capacity=128.00 KB, committed=0 bytes ( 0%)
       64k: 1, capacity=64.00 KB, committed=0 bytes ( 0%)
       32k: 1, capacity=32.00 KB, committed=0 bytes ( 0%)
       16k: 1, capacity=16.00 KB, committed=0 bytes ( 0%)
        8k: (none)
        4k: (none)
        2k: 1, capacity=2.00 KB, committed=2.00 KB (100%)
        1k: 3, capacity=3.00 KB, committed=3.00 KB (100%)
      Total word size: 3.74 MB, committed: 5.00 KB ( <1%)

              Both:

        4m: (none)
        2m: 1, capacity=2.00 MB, committed=0 bytes ( 0%)
        1m: 1, capacity=1.00 MB, committed=0 bytes ( 0%)
      512k: 1, capacity=512.00 KB, committed=0 bytes ( 0%)
      256k: 2, capacity=512.00 KB, committed=0 bytes ( 0%)
      128k: 3, capacity=384.00 KB, committed=0 bytes ( 0%)
       64k: 3, capacity=192.00 KB, committed=0 bytes ( 0%)
       32k: 1, capacity=32.00 KB, committed=0 bytes ( 0%)
       16k: 1, capacity=16.00 KB, committed=0 bytes ( 0%)
        8k: (none)
        4k: 2, capacity=8.00 KB, committed=8.00 KB (100%)
        2k: 7, capacity=14.00 KB, committed=14.00 KB (100%)
        1k: 3, capacity=3.00 KB, committed=3.00 KB (100%)
      Total word size: 4.63 MB, committed: 25.00 KB ( <1%)



      Waste (unused committed space):(percentages refer to total committed
      size 128.00 MB):
              Waste in chunks in use: 8 bytes ( <1%)
              Free in chunks in use: 94.45 KB ( <1%)
                      In free chunks: 25.00 KB ( <1%)
      Deallocated from chunks in use: 127.36 MB (>99%) (11785 blocks)
                             -total-: 127.48 MB (>99%)

      chunk header pool: 2632 items, 189.33 KB.

      Internal statistics:

      num_allocs_failed_limit: 23.
      num_arena_births: 22.
      num_arena_deaths: 10.
      num_vsnodes_births: 4.
      num_vsnodes_deaths: 0.
      num_space_committed: 2048.
      num_space_uncommitted: 0.
      num_chunks_returned_to_freelist: 33.
      num_chunks_taken_from_freelist: 2624.
      num_chunk_merges: 11.
      num_chunk_splits: 1456.
      num_chunks_enlarged: 276.
      num_inconsistent_stats: 0.


      Settings:
      MaxMetaspaceSize: 128.00 MB
      CompressedClassSpaceSize: 104.00 MB
      Initial GC threshold: 21.00 MB
      Current GC threshold: 128.00 MB
      CDS: on
      MetaspaceReclaimPolicy: balanced
       - commit_granule_bytes: 65536.
       - commit_granule_words: 8192.
       - virtual_space_node_default_size: 8388608.
       - enlarge_chunks_in_place: 1.
       - new_chunks_are_fully_committed: 0.
       - uncommit_free_chunks: 1.
       - use_allocation_guard: 0.

      Also I have run it with -XX:+CrashOnOutOfMemoryError to have stacktrace:
      Stack: [0x00007f8d9c400000,0x00007f8d9c500000],
      sp=0x00007f8d9c4fdb80, free space=1014k
      Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
      V [libjvm.so+0xf9f4e7] VMError::report_and_die(int, char const*,
      char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*,
      char const*, int, unsigned long)+0x197 (debug.cpp:368)
      V [libjvm.so+0x672eab] report_fatal(VMErrorType, char const*, int,
      char const*, ...)+0x12b
      V [libjvm.so+0x67315d] report_java_out_of_memory(char const*)+0xed
      V [libjvm.so+0xc238d0]
      Metaspace::report_metadata_oome(ClassLoaderData*, unsigned long,
      MetaspaceObj::Type, Metaspace::MetadataType, JavaThread*)+0x260
      V [libjvm.so+0xc23b33] Metaspace::allocate(ClassLoaderData*,
      unsigned long, MetaspaceObj::Type, JavaThread*)+0x133
      V [libjvm.so+0x650e8b] ConstantPool::allocate(ClassLoaderData*, int,
      JavaThread*)+0x7b
      V [libjvm.so+0xae518c]
      VM_RedefineClasses::merge_cp_and_rewrite(InstanceKlass*,
      InstanceKlass*, JavaThread*)+0x4c
      V [libjvm.so+0xae6e6d] VM_RedefineClasses::load_new_class_versions()
      [clone .part.0]+0x34d
      V [libjvm.so+0xae794f] VM_RedefineClasses::doit_prologue()+0x17f
      V [libjvm.so+0xfa8ee0] VMThread::execute(VM_Operation*)+0x40
      V [libjvm.so+0xab416f] JvmtiEnv::RetransformClasses(int, _jclass*
      const*)+0x2bf
      V [libjvm.so+0xa5f04c] jvmti_RetransformClasses+0xfc
      C [libinstrument.so+0x4e86] retransformClasses+0x1b6
      J 182 sun.instrument.InstrumentationImpl.retransformClasses0(J[Ljava/lang/Class;)V
      java.instrument@20.0.1 (0 bytes) @ 0x00007f8d886d8077
      [0x00007f8d886d7fa0+0x00000000000000d7]
      J 181 c1 sun.instrument.InstrumentationImpl.retransformClasses([Ljava/lang/Class;)V
      java.instrument@20.0.1 (33 bytes) @ 0x00007f8d80c29a5c
      [0x00007f8d80c29860+0x00000000000001fc]
      j Agent.retransformLoop(Ljava/lang/instrument/Instrumentation;Ljava/lang/String;)V+82
      j Agent.lambda$premain$0(Ljava/lang/instrument/Instrumentation;Ljava/lang/String;)V+2
      j Agent$$Lambda$14+0x0000000801002c00.run()V+8
      j java.lang.Thread.runWith(Ljava/lang/Object;Ljava/lang/Runnable;)V+5
      java.base@20.0.1
      j java.lang.Thread.run()V+19 java.base@20.0.1
      v ~StubRoutines::call_stub 0x00007f8d88138cc6
      V [libjvm.so+0x8c9c85] JavaCalls::call_helper(JavaValue*,
      methodHandle const&, JavaCallArguments*, JavaThread*)+0x315
      V [libjvm.so+0x8cb5f2] JavaCalls::call_virtual(JavaValue*, Handle,
      Klass*, Symbol*, Symbol*, JavaThread*)+0x1d2
      V [libjvm.so+0x99debe] thread_entry(JavaThread*, JavaThread*)+0x8e
      V [libjvm.so+0x8e1578] JavaThread::thread_main_inner() [clone .part.0]+0xb8
      V [libjvm.so+0xf18736] Thread::call_run()+0xa6
      V [libjvm.so+0xcae8d8] thread_native_entry(Thread*)+0xd8
      Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
      J 182 sun.instrument.InstrumentationImpl.retransformClasses0(J[Ljava/lang/Class;)V
      java.instrument@20.0.1 (0 bytes) @ 0x00007f8d886d8004
      [0x00007f8d886d7fa0+0x0000000000000064]
      J 181 c1 sun.instrument.InstrumentationImpl.retransformClasses([Ljava/lang/Class;)V
      java.instrument@20.0.1 (33 bytes) @ 0x00007f8d80c29a5c
      [0x00007f8d80c29860+0x00000000000001fc]
      j Agent.retransformLoop(Ljava/lang/instrument/Instrumentation;Ljava/lang/String;)V+82
      j Agent.lambda$premain$0(Ljava/lang/instrument/Instrumentation;Ljava/lang/String;)V+2
      j Agent$$Lambda$14+0x0000000801002c00.run()V+8
      j java.lang.Thread.runWith(Ljava/lang/Object;Ljava/lang/Runnable;)V+5
      java.base@20.0.1
      j java.lang.Thread.run()V+19 java.base@20.0.1
      v ~StubRoutines::call_stub 0x00007f8d88138cc6

      I also have run with "-Xlog:metaspace=trace" and the log is attached as metaspace.log.gz

      Running with "-Xlog:redefine+class+iklass+purge=trace" gives:

      [0.705s][trace][redefine,class,iklass,purge] Class unloading: has_previous_versions = false
      [0.963s][trace][redefine,class,iklass,purge] Class unloading: has_previous_versions = false
      [1.425s][trace][redefine,class,iklass,purge] Class unloading: has_previous_versions = false
      [1.914s][trace][redefine,class,iklass,purge] Class unloading: has_previous_versions = false
      [2.318s][trace][redefine,class,iklass,purge] Class unloading: has_previous_versions = false
      [2.323s][trace][redefine,class,iklass,purge] Class unloading: has_previous_versions = false


      Attachments

        Issue Links

          Activity

            People

              jpbempel Jean-Philippe Bempel
              jpbempel Jean-Philippe Bempel
              Votes:
              0 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: