Reference leak in Generational Shenandoah

    • gc
    • generic
    • generic

      ADDITIONAL SYSTEM INFORMATION :
      Occurs on Linux and OSX.

      I've only tested with Corretto:
      openjdk 25.0.1 2025-10-21 LTS
      OpenJDK Runtime Environment Corretto-25.0.1.8.1 (build 25.0.1+8-LTS)
      OpenJDK 64-Bit Server VM Corretto-25.0.1.8.1 (build 25.0.1+8-LTS, mixed mode, sharing)


      A DESCRIPTION OF THE PROBLEM :
      We just upgraded to JDK 25 and are trying out Generational Shenandoah, coming from ZGC. We noticed that native memory attributed to direct byte buffers (the "Other" category in NMT) increases steadily and is not freed, even though the ByteBuffer objects become unreachable. One of our services reached 2 GB of native memory after 24 hours and was ultimately OOMKilled. Manually triggering a GC by taking a live heap histogram releases the native memory, so this looks like a failure of the GC to find and clean up certain objects rather than a true "leak".

      We tracked this down to Undertow's DefaultByteBufferPool, which uses finalizers and WeakHashMaps. Both rely on Reference types (e.g. WeakReference) that need at least one additional GC cycle before the GC can remove them.

      I plan to submit a change to Undertow's code to reduce its reliance on these, but it's possible this issue impacts other code, so I produced a minimal repro of it.
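
To illustrate what "reducing reliance" on finalizers and WeakHashMaps could look like, here is a minimal sketch of a pool whose buffers are returned explicitly on close(). This is not Undertow's code or the planned change; ExplicitReleasePool, Lease, acquire() and freeCount() are hypothetical names, and the sketch assumes callers can reliably use try-with-resources.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentLinkedDeque;

/** Hypothetical pool: buffers go back on close(), with no finalizer or WeakHashMap involved. */
final class ExplicitReleasePool {
    private final ConcurrentLinkedDeque<ByteBuffer> free = new ConcurrentLinkedDeque<>();
    private final int bufferSize;

    ExplicitReleasePool(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    /** Hands out a pooled buffer, allocating a new direct buffer only if none is free. */
    Lease acquire() {
        ByteBuffer b = free.pollFirst();
        if (b == null) {
            b = ByteBuffer.allocateDirect(bufferSize);
        }
        b.clear();
        return new Lease(b);
    }

    /** Number of buffers currently sitting in the pool. */
    int freeCount() {
        return free.size();
    }

    /** A borrowed buffer. Not thread-safe: one lease is meant to be used by one thread. */
    final class Lease implements AutoCloseable {
        private ByteBuffer buffer;

        private Lease(ByteBuffer buffer) {
            this.buffer = buffer;
        }

        ByteBuffer buffer() {
            if (buffer == null) {
                throw new IllegalStateException("lease already closed");
            }
            return buffer;
        }

        /** Returns the buffer to the pool; closing twice is a no-op. */
        @Override
        public void close() {
            ByteBuffer b = buffer;
            buffer = null;
            if (b != null) {
                free.addFirst(b);
            }
        }
    }

    public static void main(String[] args) {
        ExplicitReleasePool pool = new ExplicitReleasePool(1000);
        try (Lease lease = pool.acquire()) {
            lease.buffer().put((byte) 1);
        }
        System.out.println("free buffers after close: " + pool.freeCount());
    }
}
```

With this shape, reclaiming a buffer does not depend on the GC discovering any Reference; the trade-off is that a caller who forgets close() loses the buffer from the pool until it is garbage collected.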

      I believe the root issue is that a Reference in the old generation is never discovered by the GC. A Reference in the old gen is not encountered by any young gen collection, and when it is encountered during an old gen collection, should_discover() returns false, so there is no way for it to ever be enqueued. I think this is because these references are wrongly considered strongly live:

      [383.643s][trace][gc,ref ] GC(213) Encountered Reference: 0x00000007ffce3000 (Phantom, OLD)
      [383.643s][trace][gc,ref ] GC(213) Reference strongly live: 0x00000007ffce3000

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      Run:
      java -XX:+UnlockDiagnosticVMOptions -XX:+UnlockExperimentalVMOptions -XX:+UseShenandoahGC -XX:ShenandoahGCMode=generational -XX:ShenandoahIgnoreGarbageThreshold=0 -XX:ShenandoahOldGarbageThreshold=0 -XX:ShenandoahGarbageThreshold=0 -XX:ShenandoahGuaranteedOldGCInterval=10000 -XX:+AlwaysPreTouch -XX:+ExplicitGCInvokesConcurrent -Xmx8g -Xms8g -XX:NativeMemoryTracking=detail GenShenNativeLeakRepro

      These flags ensure that the references are encountered during every old gen GC cycle (otherwise a region with very little garbage might be skipped over):
      -XX:ShenandoahIgnoreGarbageThreshold=0 -XX:ShenandoahOldGarbageThreshold=0 -XX:ShenandoahGarbageThreshold=0

      This flag guarantees that references in old gen regions get processed every 10 seconds (each iteration takes about 20 seconds on my M1 MacBook):
      -XX:ShenandoahGuaranteedOldGCInterval=10000





      I have another, simpler version of the test that doesn't allocate native memory, to show this issue isn't specific to native memory:
      java -XX:+UnlockDiagnosticVMOptions -XX:+UnlockExperimentalVMOptions -XX:+UseShenandoahGC -XX:ShenandoahGCMode=generational -XX:ShenandoahIgnoreGarbageThreshold=0 -XX:ShenandoahOldGarbageThreshold=0 -XX:ShenandoahGarbageThreshold=0 -XX:ShenandoahGuaranteedOldGCInterval=10000 -XX:+AlwaysPreTouch -XX:+ExplicitGCInvokesConcurrent -Xmx8g -Xms8g GenShenWeakRefLeakRepro


      I have tested the repro against non-generational Shenandoah (satb mode), G1 and ZGC, and the issue does not happen. It only happens with Generational Shenandoah.

      I've played around with the heap size (currently set to 8 GB) and the allocation rates, and I've found 8 GB to give the most predictable repro.

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      Each iteration allocates a 1 KB direct byte buffer, which then becomes unreachable and should be cleaned up by the GC. I would expect the native memory usage after each iteration to stay at 1 KB (instead of growing by 1 KB per iteration).

      ACTUAL -
      java -XX:+UnlockDiagnosticVMOptions -XX:+UnlockExperimentalVMOptions -XX:+UseShenandoahGC -XX:ShenandoahGCMode=generational -XX:ShenandoahIgnoreGarbageThreshold=0 -XX:ShenandoahOldGarbageThreshold=0 -XX:ShenandoahGarbageThreshold=0 -XX:ShenandoahGuaranteedOldGCInterval=10000 -XX:+AlwaysPreTouch -XX:+ExplicitGCInvokesConcurrent -Xmx8g -Xms8g -XX:NativeMemoryTracking=detail GenShenNativeLeakRepro

      Iteration 1: Native Memory = 1 KB
      Iteration 2: Native Memory = 2 KB
      Iteration 3: Native Memory = 3 KB
      Iteration 4: Native Memory = 4 KB
      Iteration 5: Native Memory = 5 KB
      Iteration 6: Native Memory = 6 KB
      Iteration 7: Native Memory = 7 KB
      Iteration 8: Native Memory = 8 KB
      Iteration 9: Native Memory = 9 KB
      Iteration 10: Native Memory = 10 KB
      Iteration 11: Native Memory = 11 KB
      Iteration 12: Native Memory = 12 KB
      Iteration 13: Native Memory = 13 KB
      Iteration 14: Native Memory = 14 KB
      Iteration 15: Native Memory = 15 KB
      Iteration 16: Native Memory = 16 KB
      Iteration 17: Native Memory = 17 KB
      Iteration 18: Native Memory = 18 KB
      Iteration 19: Native Memory = 19 KB
      Iteration 20: Running GC...
      Iteration 20: Native Memory = 1 KB
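
The figures above come from NMT's "Other" category via jcmd. As a cross-check that does not require NMT or a subprocess, the JDK's BufferPoolMXBean for the "direct" pool can be read in-process. The sketch below is an assumption about how one might do that; DirectBufferUsage and directBytesUsed() are hypothetical names, and the pool's accounting is related to, but not identical to, what NMT reports.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.ref.Reference;
import java.nio.ByteBuffer;

/** Hypothetical helper: reads direct buffer usage from the platform MXBean. */
final class DirectBufferUsage {

    /** Bytes currently held by direct ByteBuffers, or -1 if no "direct" pool is exposed. */
    static long directBytesUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long before = directBytesUsed();
        ByteBuffer b = ByteBuffer.allocateDirect(1000);
        long after = directBytesUsed();
        System.out.println("direct bytes: before=" + before + ", after=" + after);
        // Keep the buffer strongly reachable until after the second reading
        Reference.reachabilityFence(b);
    }
}
```

Printing directBytesUsed() once per iteration in the reproducer would be expected to show the same upward trend if the buffers' cleanup is not running.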


      With GC logs enabled (-Xlog:gc*=info,gc+ref=trace), filtered down to reference processing in old gens:

      Iteration 1: Native Memory = 1 KB
      [20.423s][info ][gc,ref ] GC(46) Encountered references: Soft: 66, Weak: 183, Final: 0, Phantom: 3
      [20.423s][info ][gc,ref ] GC(46) Discovered references: Soft: 0, Weak: 0, Final: 0, Phantom: 0
      [20.423s][info ][gc,ref ] GC(46) Enqueued references: Soft: 0, Weak: 0, Final: 0, Phantom: 0
      Iteration 2: Native Memory = 2 KB
      [30.687s][info ][gc,ref ] GC(52) Encountered references: Soft: 66, Weak: 187, Final: 0, Phantom: 4
      [30.688s][info ][gc,ref ] GC(52) Discovered references: Soft: 0, Weak: 0, Final: 0, Phantom: 0
      [30.688s][info ][gc,ref ] GC(52) Enqueued references: Soft: 0, Weak: 0, Final: 0, Phantom: 0
      Iteration 3: Native Memory = 3 KB
      [54.496s][info ][gc,ref ] GC(70) Encountered references: Soft: 66, Weak: 187, Final: 0, Phantom: 5
      [54.496s][info ][gc,ref ] GC(70) Discovered references: Soft: 0, Weak: 0, Final: 0, Phantom: 1
      [54.496s][info ][gc,ref ] GC(70) Enqueued references: Soft: 0, Weak: 0, Final: 0, Phantom: 0
      Iteration 4: Native Memory = 4 KB
      [93.706s][info ][gc,ref ] GC(91) Encountered references: Soft: 66, Weak: 187, Final: 0, Phantom: 6
      [93.706s][info ][gc,ref ] GC(91) Discovered references: Soft: 0, Weak: 0, Final: 0, Phantom: 0
      [93.706s][info ][gc,ref ] GC(91) Enqueued references: Soft: 0, Weak: 0, Final: 0, Phantom: 0

      Note that there were other old gen GCs between the iterations, but I only included the latest one for each (to show that the number of encountered phantom references grows by 1 per iteration).
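
When comparing many GC cycles, it can help to extract the counts from these lines programmatically. The following is a small sketch that assumes exactly the line layout shown in the excerpt above; RefLogSummary, Counts and parseLine() are hypothetical names and not part of the reproducers.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for summarizing gc+ref log lines. */
final class RefLogSummary {

    /** One "Encountered/Discovered/Enqueued references" line. */
    record Counts(int gcId, String kind, long soft, long weak, long finalRefs, long phantom) {}

    // Assumes the layout from the excerpt, e.g.
    // "GC(46) Encountered references: Soft: 66, Weak: 183, Final: 0, Phantom: 3"
    private static final Pattern LINE = Pattern.compile(
            "GC\\((\\d+)\\) (Encountered|Discovered|Enqueued) references: "
                    + "Soft: (\\d+), Weak: (\\d+), Final: (\\d+), Phantom: (\\d+)");

    /** Parses one log line; returns empty for lines that are not reference summaries. */
    static Optional<Counts> parseLine(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new Counts(
                Integer.parseInt(m.group(1)), m.group(2),
                Long.parseLong(m.group(3)), Long.parseLong(m.group(4)),
                Long.parseLong(m.group(5)), Long.parseLong(m.group(6))));
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
                "[54.496s][info ][gc,ref ] GC(70) Encountered references: Soft: 66, Weak: 187, Final: 0, Phantom: 5",
                "[54.496s][info ][gc,ref ] GC(70) Discovered references: Soft: 0, Weak: 0, Final: 0, Phantom: 1",
                "[54.496s][info ][gc,ref ] GC(70) Enqueued references: Soft: 0, Weak: 0, Final: 0, Phantom: 0");
        for (String line : sample) {
            parseLine(line).ifPresent(c -> System.out.println(
                    "GC(" + c.gcId() + ") " + c.kind() + ": weak=" + c.weak() + ", phantom=" + c.phantom()));
        }
    }
}
```

Feeding the full log through parseLine() and tracking the phantom count of the "Encountered" lines over time would make the steady growth easy to spot.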



      Without native memory:
      java -XX:+UnlockDiagnosticVMOptions -XX:+UnlockExperimentalVMOptions -XX:+UseShenandoahGC -XX:ShenandoahGCMode=generational -XX:ShenandoahIgnoreGarbageThreshold=0 -XX:ShenandoahOldGarbageThreshold=0 -XX:ShenandoahGarbageThreshold=0 -XX:ShenandoahGuaranteedOldGCInterval=10000 -XX:+AlwaysPreTouch -XX:+ExplicitGCInvokesConcurrent -XX:NativeMemoryTracking=detail -Xmx8g -Xms8g GenShenWeakRefLeakRepro
      Iteration 1: MyLeakedObject=1, WeakReference=5, WeakRefs with live referent=1
      Iteration 2: MyLeakedObject=2, WeakReference=6, WeakRefs with live referent=2
      Iteration 3: MyLeakedObject=3, WeakReference=7, WeakRefs with live referent=3
      Iteration 4: MyLeakedObject=4, WeakReference=8, WeakRefs with live referent=4
      Iteration 5: MyLeakedObject=5, WeakReference=9, WeakRefs with live referent=5
      Iteration 6: MyLeakedObject=6, WeakReference=10, WeakRefs with live referent=6
      Iteration 7: MyLeakedObject=7, WeakReference=11, WeakRefs with live referent=7
      Iteration 8: MyLeakedObject=8, WeakReference=12, WeakRefs with live referent=8
      Iteration 9: MyLeakedObject=9, WeakReference=13, WeakRefs with live referent=9
      Iteration 10: MyLeakedObject=10, WeakReference=14, WeakRefs with live referent=10
      Iteration 11: MyLeakedObject=11, WeakReference=15, WeakRefs with live referent=11
      Iteration 12: MyLeakedObject=12, WeakReference=16, WeakRefs with live referent=12
      Iteration 13: MyLeakedObject=13, WeakReference=17, WeakRefs with live referent=13
      Iteration 14: MyLeakedObject=14, WeakReference=18, WeakRefs with live referent=14
      Iteration 15: MyLeakedObject=15, WeakReference=19, WeakRefs with live referent=15
      Iteration 16: MyLeakedObject=16, WeakReference=20, WeakRefs with live referent=16
      Iteration 17: MyLeakedObject=17, WeakReference=21, WeakRefs with live referent=17
      Iteration 18: MyLeakedObject=18, WeakReference=22, WeakRefs with live referent=18
      Iteration 19: MyLeakedObject=19, WeakReference=23, WeakRefs with live referent=19
      Iteration 20: MyLeakedObject=20, WeakReference=24, WeakRefs with live referent=20
      Forcing GCs...
      Iteration 21: MyLeakedObject=2, WeakReference=6, WeakRefs with live referent=2


      ---------- BEGIN SOURCE ----------
      GenShenNativeLeakRepro.java

      import java.io.BufferedReader;
      import java.io.InputStreamReader;
      import java.lang.management.ManagementFactory;
      import java.lang.ref.WeakReference;
      import java.nio.ByteBuffer;
      import java.nio.charset.StandardCharsets;
      import java.util.regex.Matcher;
      import java.util.regex.Pattern;

      /**
       * Reproduces native memory leak with Generational Shenandoah GC.
       * Run with -XX:NativeMemoryTracking=detail JVM flag.
       */
      public class GenShenNativeLeakRepro {

          public static void main(String[] args) throws Exception {
              if (ManagementFactory.getRuntimeMXBean().getInputArguments().stream()
                      .noneMatch(arg -> arg.contains("NativeMemoryTracking"))) {
                  System.out.println("Run with -XX:NativeMemoryTracking=detail");
                  return;
              }

              for (int iteration = 0; iteration < 100; iteration++) {
                  ByteBuffer b = ByteBuffer.allocateDirect(1000);
                  WeakReference<ByteBuffer> wr = new WeakReference<>(b);

                  // Allocate a lot of garbage to force some young gen collections
                  // Should also promote the ByteBuffer to the old gen
                  for (int i = 0; i < 800; i++) {
                      byte[] garbage = new byte[100 * 1024 * 1024];
                      garbage[i] = (byte) i;
                  }

                  // Every 20 iterations, force several GCs. These should bring native memory back down to its baseline.
                  // This shows it's not a true leak, but rather an issue with the GC.
                  // System.gc() triggers a global GC, which bypasses some logic in standard old gen GCs.
                  if ((iteration + 1) % 20 == 0) {
                      System.out.println("Iteration " + (iteration + 1) + ": Running GC...");
                      for (int i = 0; i < 4; i++) {
                          System.gc();
                          Thread.sleep(4000);
                      }
                  }

                  System.out.println("Iteration " + (iteration + 1) + ": Native Memory = " + getNativeMemoryKb() + " KB");
              }
          }

          private static long getNativeMemoryKb() {
              try {
                  Process p = new ProcessBuilder(
                                  "jcmd", String.valueOf(ProcessHandle.current().pid()), "VM.native_memory", "summary")
                          .start();
                  try (BufferedReader r =
                          new BufferedReader(new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
                      Pattern pat = Pattern.compile("^-\\s+Other\\s+\\(reserved=\\d+KB,\\s+committed=(\\d+)KB.*");
                      String line;
                      while ((line = r.readLine()) != null) {
                          Matcher m = pat.matcher(line);
                          if (m.matches()) return Long.parseLong(m.group(1));
                      }
                  }
              } catch (Exception e) {
                  System.err.println("NMT failed: " + e.getMessage());
              }
              return -1;
          }
      }





      GenShenWeakRefLeakRepro.java

      import java.io.BufferedReader;
      import java.io.InputStreamReader;
      import java.lang.ref.WeakReference;
      import java.nio.charset.StandardCharsets;
      import java.util.ArrayList;
      import java.util.List;

      /**
       * Tests if WeakReferences with old-gen referents leak in Generational Shenandoah.
       */
      public class GenShenWeakRefLeakRepro {

          // Keep WeakReferences alive in a static list (will be in old gen)
          private static final List<WeakReference<MyLeakedObject>> WEAK_REFS = new ArrayList<>();
          private static final long[] COUNTS = new long[2];

          static class MyLeakedObject {
              private final int value;

              MyLeakedObject(int value) {
                  this.value = value;
              }
          }

          public static void main(String[] args) throws Exception {
              //allocate garbage to promote WEAK_REFS to old gen
              for (int i = 0; i < 400; i++) {
                  byte[] garbage = new byte[100 * 1024 * 1024];
                  garbage[i % garbage.length] = (byte) i;
              }

              for (int iteration = 0; iteration < 100; iteration++) {
                  // Create object and weak reference
                  MyLeakedObject obj = new MyLeakedObject(iteration);
                  WeakReference<MyLeakedObject> wr = new WeakReference<>(obj);

                  // Store in static list (so WeakRef survives and gets promoted)
                  WEAK_REFS.add(wr);

                  // Allocate garbage to promote both WeakRef and referent to old gen
                  for (int i = 0; i < 400; i++) {
                      byte[] garbage = new byte[100 * 1024 * 1024];
                      garbage[i % garbage.length] = (byte) i;
                  }
                  // Remove cleared WeakRefs (referent was collected)
                  WEAK_REFS.removeIf(w -> w.get() == null);

                  // Count objects
                  getObjectCounts();

                  // What remains are WeakRefs with live referents
                  long aliveCount = WEAK_REFS.size();

                  System.out.println("Iteration " + (iteration + 1) +
                      ": MyLeakedObject=" + COUNTS[0] +
                      ", WeakReference=" + COUNTS[1] +
                      ", WeakRefs with live referent=" + aliveCount);

                  // Periodically force GCs
                  if ((iteration + 1) % 20 == 0) {
                      System.out.println("Forcing GCs...");
                      for (int i = 0; i < 4; i++) {
                          System.gc();
                          Thread.sleep(3000);
                      }
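                      getObjectCounts();
                      // Recount after the forced GCs; aliveCount above was taken before them
                      long aliveAfterGc = WEAK_REFS.stream().filter(w -> w.get() != null).count();
                      System.out.println("After GC: MyLeakedObject=" + COUNTS[0] +
                          ", WeakRefs with live referent=" + aliveAfterGc);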
                  }
              }
          }

          private static void getObjectCounts() {
              COUNTS[0] = 0;
              COUNTS[1] = 0;
              try {
                  Process p = new ProcessBuilder(
                      "jcmd", String.valueOf(ProcessHandle.current().pid()),
                      "GC.class_histogram", "-all")
                      .start();
                  try (BufferedReader r = new BufferedReader(
                          new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
                      String line;
                      while ((line = r.readLine()) != null) {
                          String[] parts = line.trim().split("\\s+");
                          if (parts.length >= 4) {
                              if (line.contains("GenShenWeakRefLeakRepro$MyLeakedObject")) {
                                  COUNTS[0] = Long.parseLong(parts[1]);
                              } else if (line.contains("java.lang.ref.WeakReference ")) {
                                  COUNTS[1] = Long.parseLong(parts[1]);
                              }
                          }
                      }
                  }
              } catch (Exception e) {
                  System.err.println("Histogram failed: " + e.getMessage());
              }
          }
      }

      ---------- END SOURCE ----------

            Assignee: Unassigned
            Reporter: Webbug Group