JDK-8307766

Linux: Provide the option to override the timer slack

    • b06
    • linux

        I have been investigating the accuracy of Thread.sleep timers, to identify why they overshoot the requested duration by about 50us on Linux. Tangentially, I have been looking into the reasons why time-to-safepoint (TTSP) in some cases overshoots by about the same amount.
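A minimal sketch of how such an overshoot could be observed at the OS level, outside the JVM. It times short relative sleeps with clock_nanosleep on CLOCK_MONOTONIC and averages the difference between elapsed and requested time. The helper names (now_ns, mean_sleep_overshoot_ns) are made up for this illustration; this is not the measurement harness used for the numbers above.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes clock_nanosleep under strict -std modes */
#endif
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: monotonic clock reading in nanoseconds. */
int64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Hypothetical helper: sleeps for request_ns, `iterations` times, and
 * returns the mean overshoot in nanoseconds (elapsed minus requested).
 * Returns -1 if either argument is not positive.
 */
int64_t mean_sleep_overshoot_ns(int64_t request_ns, int iterations) {
    if (request_ns <= 0 || iterations <= 0) {
        return -1;
    }
    int64_t total = 0;
    for (int i = 0; i < iterations; i++) {
        struct timespec left = {
            .tv_sec = request_ns / 1000000000LL,
            .tv_nsec = request_ns % 1000000000LL
        };
        struct timespec rem;
        int64_t start = now_ns();
        /* Relative sleep; if a signal interrupts it, sleep the remainder. */
        while (clock_nanosleep(CLOCK_MONOTONIC, 0, &left, &rem) == EINTR) {
            left = rem;
        }
        total += (now_ns() - start) - request_ns;
    }
    return total / iterations;
}
```

Calling mean_sleep_overshoot_ns(100000, 1000) would report the average overshoot for 100us sleeps. The result also includes scheduling and syscall overhead, so it is an upper bound on the timer slack contribution rather than a direct reading of it.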

        It looks like Linux has a "timer slack" feature that coalesces nearby high-resolution timer events in order to reduce CPU wakeups. The default seems to be 50us, and there is a prctl(PR_SET_TIMERSLACK, ...) call to adjust it. With my quick prototype that drops the timer slack to 1ns, both the Thread.sleep accuracy and TTSP durations improve by tens of microseconds.
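For illustration, the prctl call mentioned above can be wrapped as follows. This is a sketch under stated assumptions, not the code from the prototype: the wrapper names are hypothetical, and only the PR_SET_TIMERSLACK and PR_GET_TIMERSLACK operations themselves come from the Linux API. Timer slack is a per-thread attribute that new threads inherit from their creator, so the call would need to happen before the threads of interest are started.

```c
#include <sys/prctl.h>

/*
 * Hypothetical wrapper: returns the calling thread's current timer
 * slack in nanoseconds, or -1 on error.
 */
long get_timer_slack_ns(void) {
    return (long)prctl(PR_GET_TIMERSLACK, 0, 0, 0, 0);
}

/*
 * Hypothetical wrapper: sets the calling thread's timer slack in
 * nanoseconds. A value of 0 resets it to the thread's default.
 * Returns 0 on success, -1 on error.
 */
int set_timer_slack_ns(unsigned long slack_ns) {
    return prctl(PR_SET_TIMERSLACK, slack_ns, 0, 0, 0);
}
```

Calling set_timer_slack_ns(1) early in a process requests the 1ns setting described above; get_timer_slack_ns() can then confirm what the kernel accepted.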

        There is a /proc/$pid/timerslack_ns interface, but it is inconvenient for two reasons: one needs to find the JVM pid and have permissions to write to /proc. I think there is a system-wide option to change timer slack, but that would affect every process in the system, which might not be a good idea.
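The /proc route could look roughly like the sketch below, which also shows where the inconvenience comes from: the caller has to supply the target pid, and the write fails without sufficient permissions. The function names are hypothetical; only the /proc/$pid/timerslack_ns path comes from the text above.

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper: formats "/proc/<pid>/timerslack_ns" into buf.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int timerslack_path(pid_t pid, char *buf, size_t len) {
    int n = snprintf(buf, len, "/proc/%ld/timerslack_ns", (long)pid);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/*
 * Hypothetical helper: writes slack_ns to the target's timerslack_ns
 * file. Returns 0 on success, -1 on failure (no such pid, no permission
 * to write, or a kernel without this file).
 */
int write_timerslack_ns(pid_t pid, unsigned long slack_ns) {
    char path[64];
    if (timerslack_path(pid, path, sizeof path) != 0) {
        return -1;
    }
    FILE *f = fopen(path, "w");
    if (f == NULL) {
        return -1;
    }
    int ok = fprintf(f, "%lu\n", slack_ns) > 0;
    /* Buffered data reaches the kernel on close, so check that as well. */
    if (fclose(f) != 0) {
        ok = 0;
    }
    return ok ? 0 : -1;
}
```

An external tool would call write_timerslack_ns(jvm_pid, 1) after locating the JVM, which is exactly the extra step a built-in JVM option would avoid.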

        It makes sense to provide an experimental option that tunes the per-JVM timer slack.

        An alternative is to look through every use of nanosleep and timed waits in the JDK and make sure we do something else for delays that are shorter than the timer slack. That, however, would eventually turn into fighting the jitter introduced by this OS-side event coalescing. We need this option to turn the timer slack off for those experiments too.

        Additional bonus: if one cranks the timer slack _up_, it emulates the sleep/wait stalls and exposes the pieces of code that over-rely on sleep/wait to return quickly.

        Draft PR: https://github.com/openjdk/jdk/pull/13889

              shade Aleksey Shipilev