Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-8288663

JFR: Disabling the JfrThreadSampler commits only a partially disabled state

    XMLWordPrintable

Details

    • jfr
    • b03

    Backports

      Description

        void JfrThreadSampling::set_sampling_interval(bool java_interval, size_t period) {
          size_t interval_java = 0;
          size_t interval_native = 0;
          if (_sampler != NULL) {
            interval_java = _sampler->get_java_interval();
            interval_native = _sampler->get_native_interval();
          }
          if (java_interval) {
            interval_java = period;
          } else {
            interval_native = period;
          }
          if (interval_java > 0 || interval_native > 0) {
            if (_sampler == NULL) {
              log_trace(jfr)("Creating thread sampler for java:%zu ms, native %zu ms", interval_java, interval_native);
              start_sampler(interval_java, interval_native);
            } else {
              _sampler->set_java_interval(interval_java);
              _sampler->set_native_interval(interval_native);
              _sampler->enroll();
            }
            assert(_sampler != NULL, "invariant");
            log(interval_java, interval_native);
          } else if (_sampler != NULL) {
            _sampler->disenroll();
          }
        }

        The _sampler run state is controlled by the java interval and the native interval parameters (milliseconds).

        If either is > 0, the sampler is to run. If both are 0, the sampler is disabled and should not run.

        Let us say the sampler instance is currently running, with a java interval of 20 ms and a native interval of 20 ms.

        The last recording is stopping, so the sampler is to be disabled. This is accomplished by setting first the java interval to 0 ms and then subsequently the native interval to 0 ms.

        First, the java_interval comes in as 0. The code reads the current values of the _sampler instance, both for the java interval and the native interval. Again, if either value is > 0, the sampler is running or should run. The updated java interval value is committed as 0 onto the _sampler instance via _sampler->set_java_interval(interval_java).

        Current state: java_interval = 0ms, native_interval = 20 ms. The sampler is still running.

        Then comes the update to the native interval, for 0 ms.

        The code reads the current field values of the sampler, 0 ms for the java interval, and 20 ms for the native interval. After reading 20 ms as the current value of the native interval, the code overwrites that local variable with the incoming value, so it is now 0 ms.

        In the test, neither value is > 0, so the sampler state is not updated with the incoming 0 ms for the native interval, as it was for the java interval. Instead, the code branches to the _sampler->disenroll(), which stops the sampler.

        At this point, the sampler is stopped, just as intended, because the last recording stopped.

        But the _sampler instance state looks like this:

        _java_interval = 0 ms
        _native_interval = 20 ms

        Let us now say another recording starts, but one that does not enable java execution sampling nor native method sampling. After the recording is started, the setting system will pass in disabled values for the java and the native intervals, both set to 0 ms.

        First, the java interval: The code reads the current value of the _sampler instance, 0 ms for the Java interval, and 20 ms for the native interval. In the evaluation, this looks like the sampler is running because of the 20 ms native interval. The java interval is set, and the sampler is enrolled (i.e. started) when it should not.

        Then comes the disabled native interval value of 0 ms. The system reads the current values from the sampler instance, 0 ms for the Java interval and 20 ms for the native interval. Again, it overwrites the native interval value with 0 ms, so it looks like the sampler is not running (although it is). The code then calls disenroll, which stops the sampler.

        Still the _sampler instance state looks like this:
        _java_interval = 0 ms
        _native_interval = 20 ms

        The sampler is eventually stopped (as intended when disabling). However, it was running for a brief period, resulting in native method sampling events generated in a situation where they should not.

        Attachments

          Issue Links

            Activity

              People

                mgronlun Markus Grönlund
                mgronlun Markus Grönlund
                Votes:
                0 Vote for this issue
                Watchers:
                6 Start watching this issue

                Dates

                  Created:
                  Updated:
                  Resolved: