JDK-8293167: Memory leak in JfrThreadSampler if stackdepth is larger than default (64)

    • Type: Bug
    • Resolution: Fixed
    • Priority: P3
    • Fix Version/s: 20
    • Affects Version/s: 18, 19, 20
    • Component/s: hotspot
    • Subcomponent: jfr
    • Resolved In Build: b14

      Datadog is using JFR as the backbone of its continuous Java profiler.

      Recently we have received the following bug report:
      ```
      We use the latest Datadog Java Tracer (fetched from https://dtdg.co/latest-java-tracer) included in our Java/SpringBoot application using the following JVM parameters

      -javaagent:/opt/dd-java-agent.jar
      -Ddd.profiling.enabled=true
      -XX:FlightRecorderOptions=stackdepth=256
      -Ddd.logs.injection=true
      -Ddd.trace.sample.rate=1

      java -version
      openjdk version "18.0.2" 2022-07-19
      OpenJDK Runtime Environment (build 18.0.2+9-61)
      OpenJDK 64-Bit Server VM (build 18.0.2+9-61, mixed mode, sharing)

      The application is running within a Docker container in AWS Elastic Beanstalk.

      We observed, that docker.mem.rss is continuously increasing over time, whereas jvm.heap_memory and jvm.non_heap_memory stay constant (after ~1d of 'warm-up' period). After ~10-15 days, the container RSS reaches a configured memory limit and the container is killed and restarted.

      Further investigation (using java native memory tracking) revealed, that it is the off-heap memory area called 'Tracing' that gets bigger and bigger over time. We observed up to ~130MB of allocated memory in that area after ~10 days.

      With -Ddd.profiling.enabled=false the problem does not occur ('Tracing' memory stays constant at 32KB).

      In the Datadog Agents (v7.38.2, Docker) logs we see no obvious problems (except lots of CPU threshold exceeded warnings).

      What can we do to prevent this 'Tracing' memory leak with activated profiling?
      ```
      (https://github.com/DataDog/dd-trace-java/issues/3778)

      From the described symptoms, it looks like something is leaking in the mtTracing arena. A brief scan of the sources shows that JFR uses this arena in several places, and since the issue goes away when JFR is not enabled (more precisely, when no recording is active), it is a fairly safe guess that the leak is somewhere in JFR.
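To confirm the growth described in the report, the 'Tracing' category can be sampled periodically from a native memory tracking summary. The following is only a minimal sketch, not part of the original report. It assumes the JVM was started with `-XX:NativeMemoryTracking=summary`, that the summary was captured with `jcmd <pid> VM.native_memory summary` at the default KB scale, and that the output contains a line of the form `Tracing (reserved=...KB, committed=...KB)`. The class name `NmtTracing` and the method `committedKb` are hypothetical helpers introduced for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NmtTracing {
    // Assumed line shape (KB scale), e.g.:
    // "-                   Tracing (reserved=32KB, committed=32KB)"
    private static final Pattern TRACING = Pattern.compile(
            "Tracing \\(reserved=(\\d+)KB, committed=(\\d+)KB\\)");

    /** Returns the committed size in KB of the 'Tracing' category, if present. */
    public static OptionalLong committedKb(String nmtSummary) {
        Matcher m = TRACING.matcher(nmtSummary);
        return m.find()
                ? OptionalLong.of(Long.parseLong(m.group(2)))
                : OptionalLong.empty();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: java NmtTracing.java <nmt-summary-file>");
            return;
        }
        // Read a previously saved NMT summary and report the Tracing category.
        String summary = Files.readString(Path.of(args[0]));
        OptionalLong kb = committedKb(summary);
        if (kb.isPresent()) {
            System.out.println("Tracing committed: " + kb.getAsLong() + " KB");
        } else {
            System.out.println("No 'Tracing' line found (is NMT enabled, scale=KB?)");
        }
    }
}
```

One way to use it is to save a summary at intervals (`jcmd <pid> VM.native_memory summary > nmt.txt`) and run `java NmtTracing.java nmt.txt` on each file. A value that keeps rising while a recording is active, and stays flat otherwise, would be consistent with the symptoms above.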

            Assignee: mgronlun Markus Grönlund
            Reporter: jbachorik Jaroslav Bachorík