Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-8337994

[REDO] Native memory leak when not recording any events

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Fixed
    • Icon: P3 P3
    • 24
    • 17, 21, 23, 24
    • hotspot
    • jfr
    • b11
    • generic
    • generic

      ADDITIONAL SYSTEM INFORMATION :
      OS (running in docker on a ubuntu 22.04) :
      Alpine Linux source887b8b5876c2-5784d75ffb-l5fgv 5.15.0-1058-aws #64~20.04.1-Ubuntu SMP Tue Apr 9 11:12:27 UTC 2024 x86_64 Linux

      $ java --version
      NOTE: Picked up JDK_JAVA_OPTIONS: -XX:NativeMemoryTracking=detail
      NOTE: Picked up JDK_JAVA_OPTIONS: -XX:NativeMemoryTracking=detail
      openjdk 21.0.3 2024-04-16 LTS
      OpenJDK Runtime Environment Zulu21.34+19-CA (build 21.0.3+9-LTS)
      OpenJDK 64-Bit Server VM Zulu21.34+19-CA (build 21.0.3+9-LTS, mixed mode, sharing)



      A DESCRIPTION OF THE PROBLEM :
      We have a service running into a native memory leak which we believe is related to JFR. This service is running on Java 21 and it's currently not using virtual threads at all. It is however pulling on a shared library which provide basic monitoring around thread pinning event within a class named PinnedThreadTracker. The source code of this class is included at the end of this description.

      When the service runs with this component enabled, it experiences a native memory leak. We do not get any leak when disabling this component. The heap/off-heap spaces are all within the limits defined and do not grow indefinitely. However, the process memory is growing. Using -XX:NativeMemoryTracking=detail, we are able to see the 'Tracing' memory region growing and growing. Here is the result of 2 invocation of 'jcmd 1 VM.native_memory detail.diff scale=MB' spaced about 2 hours apart :

      Native Memory Tracking:

      (Omitting categories weighting less than 1MB)

      Total: reserved=4543MB +77MB, committed=2084MB +141MB

      - Java Heap (reserved=2252MB, committed=1074MB +60MB)
                                  (mmap: reserved=2252MB, committed=1074MB +60MB)

      - Class (reserved=1029MB, committed=32MB)
                                  (classes #40298 +76)
                                  ( instance classes #37933 +74, array classes #2365 +2)
                                  (malloc=5MB #132908 +1918)
                                  (mmap: reserved=1024MB, committed=27MB)
                                 : ( Metadata)
                                  ( reserved=192MB, committed=186MB +1MB)
                                  ( used=185MB +1MB)
                                  ( waste=1MB =0.49%)
                                 : ( Class space)
                                  ( reserved=1024MB, committed=27MB)
                                  ( used=25MB)
                                  ( waste=2MB =5.75%)

      - Thread (reserved=129MB -3MB, committed=13MB)
                                  (thread #129 -3)
                                  (stack: reserved=128MB -3MB, committed=13MB)

      - Code (reserved=250MB, committed=114MB)
                                  (malloc=8MB #21506 +1389)
                                  (mmap: reserved=242MB, committed=106MB)

      - GC (reserved=89MB, committed=66MB +1MB)
                                  (malloc=13MB #39253 +2286)
                                  (mmap: reserved=76MB, committed=53MB +1MB)

      - GCCardSet (reserved=3MB +2MB, committed=3MB +2MB)
                                  (malloc=3MB +2MB #3771 +1098)

      - Compiler (reserved=2MB, committed=2MB)
                                  (malloc=1MB #4047 +71)

      - Internal (reserved=7MB, committed=7MB)
                                  (malloc=7MB #93677 +482)

      - Other (reserved=3MB, committed=3MB)
                                  (malloc=3MB #1136 -5)

      - Symbol (reserved=62MB, committed=62MB)
                                  (malloc=58MB #539325 +515)
                                  (arena=5MB #1)

      - Native Memory Tracking (reserved=38MB +4MB, committed=38MB +4MB)
                                  (malloc=1MB #12411 +2862)
                                  (tracking overhead=37MB +4MB)

      - Shared class space (reserved=16MB, committed=13MB)
                                  (mmap: reserved=16MB, committed=13MB)

      - Tracing (reserved=464MB +73MB, committed=464MB +73MB)
                                  (malloc=464MB +73MB #1548059 +250121)

      - Synchronization (reserved=4MB, committed=4MB)
                                  (malloc=4MB #43969 +128)

      - Metaspace (reserved=194MB, committed=188MB +1MB)
                                  (malloc=2MB #2448 +22)
                                  (mmap: reserved=192MB, committed=186MB +1MB)

      [0x00007f72792c8828] G1MonotonicArena::allocate()+0x1d8
      [0x00007f727927ff99] G1CardSet::create_coarsened_array_of_cards(unsigned int, bool)+0x39
      [0x00007f7279280323] G1CardSet::coarsen_container(void* volatile*, void*, unsigned int, bool)+0x123
      [0x00007f72792815a0] G1CardSet::add_to_howl(void*, unsigned int, unsigned int, bool)+0x150
                                   (malloc=1MB type=GCCardSet +1MB #284 +269)

      [0x00007f72793ddd96] JfrCheckpointManager::lease_thread_local(Thread*, unsigned long)+0x126
      [0x00007f72793e3c46] JfrCheckpointWriter::JfrCheckpointWriter(Thread*, bool, JfrCheckpointType, JfrCheckpointBufferKind)+0x26
      [0x00007f727944b164] JfrTypeManager::write_checkpoint(Thread*, unsigned long, oopDesc*)+0x174
      [0x00007f727943dc0c] JfrThreadLocal::on_start(Thread*)+0x9c
                                   (malloc=224MB type=Tracing +36MB #773886 +125058)

      [0x00007f72793ddd96] JfrCheckpointManager::lease_thread_local(Thread*, unsigned long)+0x126
      [0x00007f72793e3c46] JfrCheckpointWriter::JfrCheckpointWriter(Thread*, bool, JfrCheckpointType, JfrCheckpointBufferKind)+0x26
      [0x00007f727944b164] JfrTypeManager::write_checkpoint(Thread*, unsigned long, oopDesc*)+0x174
      [0x00007f727943df84] JfrThreadLocal::on_exit(Thread*)+0x94
                                   (malloc=224MB type=Tracing +36MB #773858 +125062)

      [0x00007f72797ac35b] os::reserve_memory_aligned(unsigned long, unsigned long, bool)+0x2b
      [0x00007f7279aaefbf] ReservedSpace::reserve(unsigned long, unsigned long, unsigned long, char*, bool)+0x51f
      [0x00007f7279aaf373] ReservedSpace::ReservedSpace(unsigned long, unsigned long, unsigned long, char*)+0x53
      [0x00007f7279aadc4b] metaspace::VirtualSpaceNode::create_node(unsigned long, metaspace::CommitLimiter*, metaspace::AbstractCounter<unsigned long>*, metaspace::AbstractCounter<unsigned long>*)+0x4b
                                   (mmap: reserved=128MB, committed=122MB +1MB Type=Metaspace)

      [0x00007f7279aaef90] ReservedSpace::reserve(unsigned long, unsigned long, unsigned long, char*, bool)+0x4f0
      [0x00007f7279286259] G1CollectedHeap::create_aux_memory_mapper(char const*, unsigned long, unsigned long)+0x39
      [0x00007f727928ac09] G1CollectedHeap::initialize()+0x2d9
      [0x00007f7279a59a36] universe_init()+0xd6
                                   (mmap: reserved=35MB, committed=17MB +1MB Type=GC)

      [0x00007f7279aafaa7] ReservedHeapSpace::try_reserve_range(char*, char*, unsigned long, char*, char*, unsigned long, unsigned long, unsigned long)+0x4c7
      [0x00007f7279ab0434] ReservedHeapSpace::initialize_compressed_heap(unsigned long, unsigned long, unsigned long)+0x614
      [0x00007f7279ab0699] ReservedHeapSpace::ReservedHeapSpace(unsigned long, unsigned long, unsigned long, char const*)+0x179
      [0x00007f7279a59c80] Universe::reserve_heap(unsigned long, unsigned long)+0x70
                                   (mmap: reserved=2252MB, committed=1074MB +60MB Type=Java Heap)

      [0x00007f72797a20aa] thread_native_entry(Thread*)+0x12a
                                   (mmap: reserved=127MB -3MB, committed=13MB Type=Thread Stack)

      Native Memory Tracking:

      (Omitting categories weighting less than 1MB)

      Total: reserved=4743MB +276MB, committed=2229MB +286MB

      - Java Heap (reserved=2252MB, committed=1020MB +6MB)
                                  (mmap: reserved=2252MB, committed=1020MB +6MB)

      - Class (reserved=1029MB, committed=32MB)
                                  (classes #40392 +170)
                                  ( instance classes #38027 +168, array classes #2365 +2)
                                  (malloc=5MB #133963 +2973)
                                  (mmap: reserved=1024MB, committed=27MB)
                                 : ( Metadata)
                                  ( reserved=192MB, committed=187MB +1MB)
                                  ( used=186MB +1MB)
                                  ( waste=1MB =0.49%)
                                 : ( Class space)
                                  ( reserved=1024MB, committed=27MB)
                                  ( used=25MB)
                                  ( waste=2MB =6.01%)

      - Thread (reserved=131MB -1MB, committed=13MB)
                                  (thread #131 -1)
                                  (stack: reserved=130MB -1MB, committed=13MB)

      - Code (reserved=250MB +1MB, committed=116MB +2MB)
                                  (malloc=8MB +1MB #21142 +1025)
                                  (mmap: reserved=242MB, committed=107MB +1MB)

      - GC (reserved=93MB +4MB, committed=69MB +4MB)
                                  (malloc=17MB +4MB #39987 +3020)
                                  (mmap: reserved=76MB, committed=52MB)

      - GCCardSet (reserved=3MB +2MB, committed=3MB +2MB)
                                  (malloc=3MB +2MB #3860 +1187)

      - Compiler (reserved=2MB, committed=2MB)
                                  (malloc=1MB #4028 +52)

      - Internal (reserved=7MB, committed=7MB)
                                  (malloc=7MB #94712 +1517)

      - Other (reserved=3MB, committed=3MB)
                                  (malloc=3MB #1137 -4)

      - Symbol (reserved=62MB, committed=62MB)
                                  (malloc=58MB #539867 +1057)
                                  (arena=5MB #1)

      - Native Memory Tracking (reserved=48MB +14MB, committed=48MB +14MB)
                                  (malloc=1MB #12618 +3069)
                                  (tracking overhead=47MB +14MB)

      - Shared class space (reserved=16MB, committed=13MB)
                                  (mmap: reserved=16MB, committed=13MB)

      - Tracing (reserved=647MB +256MB, committed=647MB +256MB)
                                  (malloc=647MB +256MB #2180371 +882433)

      - Synchronization (reserved=4MB, committed=4MB)
                                  (malloc=4MB #44142 +301)

      - Metaspace (reserved=194MB, committed=189MB +1MB)
                                  (malloc=2MB #2600 +174)
                                  (mmap: reserved=192MB, committed=187MB +1MB)

      [0x00007f72792c8828] G1MonotonicArena::allocate()+0x1d8
      [0x00007f727927ff99] G1CardSet::create_coarsened_array_of_cards(unsigned int, bool)+0x39
      [0x00007f7279280323] G1CardSet::coarsen_container(void* volatile*, void*, unsigned int, bool)+0x123
      [0x00007f72792815a0] G1CardSet::add_to_howl(void*, unsigned int, unsigned int, bool)+0x150
                                   (malloc=1MB type=GCCardSet +1MB #287 +272)

      [0x00007f727978e67b] ImmutableOopMapSet::build_from(OopMapSet const*)+0x9b
      [0x00007f72790f64be] CodeBlob::CodeBlob(char const*, CompilerType, CodeBlobLayout const&, CodeBuffer*, int, int, OopMapSet*, bool, bool)+0x7e
      [0x00007f727913c709] CompiledMethod::CompiledMethod(Method*, char const*, CompilerType, int, int, CodeBuffer*, int, int, OopMapSet*, bool, bool)+0x109
      [0x00007f727975ff9a] nmethod::nmethod(Method*, CompilerType, int, int, int, CodeOffsets*, int, DebugInformationRecorder*, Dependencies*, CodeBuffer*, int, OopMapSet*, ExceptionHandlerTable*, ImplicitExceptionTable*, AbstractCompiler*, CompLevel, char*, int, JVMCINMethodData*)+0xaa
                                   (malloc=8MB type=Code +1MB #17611 +1002)

      [0x00007f7279825389] PtrQueueSet::exchange_buffer_with_new(PtrQueue&)+0x39
      [0x00007f727985aa0a] SATBMarkQueueSet::handle_zero_index(SATBMarkQueue&)+0x5a
      [0x00007f727985ab0b] SATBMarkQueueSet::enqueue_known_active(SATBMarkQueue&, oopDesc*)+0x3b
                                   (malloc=7MB type=GC +4MB #943 +493)

      [0x00007f72793ddd96] JfrCheckpointManager::lease_thread_local(Thread*, unsigned long)+0x126
      [0x00007f72793e3c46] JfrCheckpointWriter::JfrCheckpointWriter(Thread*, bool, JfrCheckpointType, JfrCheckpointBufferKind)+0x26
      [0x00007f727944b164] JfrTypeManager::write_checkpoint(Thread*, unsigned long, oopDesc*)+0x174
      [0x00007f727943dc0c] JfrThreadLocal::on_start(Thread*)+0x9c
                                   (malloc=316MB type=Tracing +128MB #1090043 +441215)

      [0x00007f72793ddd96] JfrCheckpointManager::lease_thread_local(Thread*, unsigned long)+0x126
      [0x00007f72793e3c46] JfrCheckpointWriter::JfrCheckpointWriter(Thread*, bool, JfrCheckpointType, JfrCheckpointBufferKind)+0x26
      [0x00007f727944b164] JfrTypeManager::write_checkpoint(Thread*, unsigned long, oopDesc*)+0x174
      [0x00007f727943df84] JfrThreadLocal::on_exit(Thread*)+0x94
                                   (malloc=316MB type=Tracing +128MB #1090013 +441217)

      [0x00007f72797ac35b] os::reserve_memory_aligned(unsigned long, unsigned long, bool)+0x2b
      [0x00007f7279aaefbf] ReservedSpace::reserve(unsigned long, unsigned long, unsigned long, char*, bool)+0x51f
      [0x00007f7279aaf373] ReservedSpace::ReservedSpace(unsigned long, unsigned long, unsigned long, char*)+0x53
      [0x00007f7279aadc4b] metaspace::VirtualSpaceNode::create_node(unsigned long, metaspace::CommitLimiter*, metaspace::AbstractCounter<unsigned long>*, metaspace::AbstractCounter<unsigned long>*)+0x4b
                                   (mmap: reserved=128MB, committed=123MB +1MB Type=Metaspace)

      [0x00007f7279aaef90] ReservedSpace::reserve(unsigned long, unsigned long, unsigned long, char*, bool)+0x4f0
      [0x00007f7279aaf16c] ReservedCodeSpace::ReservedCodeSpace(unsigned long, unsigned long, unsigned long)+0x5c
      [0x00007f727910191e] CodeCache::initialize_heaps()+0x25e
      [0x00007f7279101fd5] CodeCache::initialize()+0x55
                                   (mmap: reserved=240MB, committed=107MB +1MB Type=Code)

      [0x00007f7279aafaa7] ReservedHeapSpace::try_reserve_range(char*, char*, unsigned long, char*, char*, unsigned long, unsigned long, unsigned long)+0x4c7
      [0x00007f7279ab0434] ReservedHeapSpace::initialize_compressed_heap(unsigned long, unsigned long, unsigned long)+0x614
      [0x00007f7279ab0699] ReservedHeapSpace::ReservedHeapSpace(unsigned long, unsigned long, unsigned long, char const*)+0x179
      [0x00007f7279a59c80] Universe::reserve_heap(unsigned long, unsigned long)+0x70
                                   (mmap: reserved=2252MB, committed=1020MB +6MB Type=Java Heap)

      [0x00007f72797a20aa] thread_native_entry(Thread*)+0x12a
                                   (mmap: reserved=129MB -1MB, committed=13MB Type=Thread Stack)

      public class PinnedThreadTracker implements AutoCloseable
      {
          private static final Logger logger = CoveoLogger.getLogger(PinnedThreadTracker.class);

          private static final String VIRTUAL_THREAD_PINNED_JFR_EVENT_NAME = "jdk.VirtualThreadPinned";

          private final RecordingStream recordingStream;
          private final int maximumStackDepth;
          private final boolean isCloudEnvironment;

          public PinnedThreadTracker(Duration recordingThreshold, int maximumStackTraceDepth, Environment environment)
          {
              maximumStackDepth = maximumStackTraceDepth;
              isCloudEnvironment = environment.acceptsProfiles(Profiles.CLOUD_PROFILES);

              recordingStream = new RecordingStream();
              recordingStream.enable(VIRTUAL_THREAD_PINNED_JFR_EVENT_NAME)
                             .withStackTrace()
                             .withThreshold(Objects.requireNonNull(recordingThreshold));
              recordingStream.onEvent(VIRTUAL_THREAD_PINNED_JFR_EVENT_NAME, this::recordEvent);
              recordingStream.setReuse(true);
              recordingStream.setMaxAge(Duration.ofSeconds(5));
              recordingStream.startAsync();
          }

          private void recordEvent(RecordedEvent event)
          {
              if (isCloudEnvironment) {
                  LogstashMarker markers = Markers.appendEntries(Map.of("event_stack_trace",
                                                                        getStackTrace(event.getStackTrace(),
                                                                                      maximumStackDepth),
                                                                        "pinned_duration_ms",
                                                                        event.getDuration().toMillis()));
                  logger.warn(markers,
                              "Pinned thread event of duration : '{}'. See event_stack_trace for more details.",
                              event.getDuration());
              } else {
                  logger.warn("Pinned thread event of duration : '{}'. Stack trace:{}{}",
                              event.getDuration(),
                              System.lineSeparator(),
                              getStackTrace(event.getStackTrace(), maximumStackDepth));
              }
          }

          @Override
          public void close()
          {
              recordingStream.close();
          }

          /**
           * Get a pretty printed stack trace from a {@link RecordedStackTrace}.
           * The toString implementation of {@link RecordedStackTrace} has a max depth of 5 which is not enough to get a proper picture.
           * This method is strongly inspired from the original classes, minus some details that are not relevant for our use case.
           */
          private static String getStackTrace(RecordedStackTrace stackTrace, int maximumStackDepth)
          {
              StringBuilder builder = new StringBuilder();
              List<RecordedFrame> frames = stackTrace.getFrames();
              int i = 0;
              int depth = 0;
              while (i < frames.size() && depth < maximumStackDepth) {
                  RecordedFrame frame = frames.get(i);
                  if (frame.isJavaFrame() && !frame.getMethod().isHidden()) {
                      formatMethod(frame, builder);
                      builder.append(System.lineSeparator());
                      depth++;
                  }
                  i++;
              }
              if (stackTrace.isTruncated() || i == maximumStackDepth) {
                  builder.append("... (truncated)");
              }
              return builder.toString();
          }

          private static void formatMethod(RecordedFrame recordedFrame, StringBuilder builder)
          {
              RecordedMethod method = recordedFrame.getMethod();
              builder.append(method.getType().getName());
              builder.append(".");
              builder.append(method.getName());
              builder.append("()");
              int line = recordedFrame.getLineNumber();
              if (line >= 0) {
                  builder.append(" line: ").append(line);
              }
          }
      }



      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      We're currently seeing this issue on only one service with no special setup. Therefore, I cannot provide an easy way to reproduce but hopefully from the details provided it will be enough to see the issue. If not, please say so and I'll try to find a way to reproduce it.

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      I'd expect this relatively simple JFR setup to not leak memory
      ACTUAL -
      memory leak

      FREQUENCY : always


            mgronlun Markus Grönlund
            mgronlun Markus Grönlund
            Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

              Created:
              Updated:
              Resolved: