JDK-8282469

Allow considered use of C++ thread_local in Hotspot


    • Type: Enhancement
    • Resolution: Fixed
    • Priority: P4
    • 19
    • 19
    • hotspot
    • b15

      In JDK 9 we looked at replacing library-based thread local storage (TLS) with use of C++ thread_local (JDK-8132510), but there were some issues/concerns around the use of that, and so we opted to use the compiler-specific TLS mechanisms provided by gcc/clang/VS.

      A significant limitation to the gcc TLS extension is that if an initializer is present for a thread-local variable, it must be a constant-expression. [1] That means that we can't declare a thread-local variable that is a class instance with non-trivial construction and destruction.
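
      As a minimal sketch of that restriction, assuming gcc or clang: the names tls_counter, Tracker and bump_counter_in_new_thread below are illustrative only and do not come from HotSpot.

```cpp
#include <cassert>
#include <thread>

// gcc/clang extension: the initializer must be a constant expression.
static __thread int tls_counter = 0;

// Hypothetical class with non-trivial construction and destruction.
struct Tracker {
  int id;
  Tracker() : id(42) {}
  ~Tracker() {}
};

// Rejected by gcc and clang, because Tracker needs dynamic
// initialization and destruction:
//   static __thread Tracker tls_tracker;

// Each thread starts with its own copy of tls_counter, initialized to 0.
// Returns the value the new thread sees after incrementing it `times` times.
int bump_counter_in_new_thread(int times) {
  int seen = -1;
  std::thread t([&] {
    for (int i = 0; i < times; i++) {
      tls_counter++;
    }
    seen = tls_counter;
  });
  t.join();
  return seen;
}
```

      Increments made in one thread are never visible in another, so the calling thread's copy keeps its initial value.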

      Project Panama has a use case for TLS that requires a non-trivial destructor for a C++ class, such that threads that attach to the JVM to process Java "upcalls" will be automatically detached when the thread terminates (if it didn't detach explicitly).
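
      The shape of that use case can be sketched as follows. This is not the Panama code: attach_current_thread, detach_current_thread, AttachContext and on_upcall are hypothetical stand-ins, and an atomic counter takes the place of the real attach/detach operations.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Stand-ins for the real attach/detach operations (hypothetical).
static std::atomic<int> attached_threads{0};

static void attach_current_thread() { attached_threads++; }
static void detach_current_thread() { attached_threads--; }

// Per-thread context whose destructor runs when the owning thread exits.
struct AttachContext {
  bool attached = false;
  ~AttachContext() {
    if (attached) {
      detach_current_thread();
    }
  }
};

// Needs C++ thread_local: the type has a non-trivial destructor.
static thread_local AttachContext attach_context;

// Called on entry to an upcall; attaches at most once per thread.
void on_upcall() {
  if (!attach_context.attached) {
    attach_current_thread();
    attach_context.attached = true;
  }
}

// Runs a thread that makes `upcalls` upcalls and never detaches explicitly.
// Returns the attach count observed inside the thread, before it exits.
int run_native_thread(int upcalls) {
  int seen = 0;
  std::thread t([&] {
    for (int i = 0; i < upcalls; i++) {
      on_upcall();
    }
    seen = attached_threads.load();
  });
  t.join();
  return seen;
}
```

      Once the thread has been joined, the destructor of its attach_context has run and the count is back to zero, without the thread ever detaching explicitly.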

      A discussion of the pros and cons of using C++ thread_local as the mechanism for TLS in the JVM shows that there are still a number of concerns that argue against its wholesale adoption. Some relevant extracts from that discussion:

      "[A] reminder that the difference between C++11 thread_local and the gcc's __thread came up in the discussion of JDK-8230877.

      https://mail.openjdk.java.net/pipermail/hotspot-dev/2019-September/039487.html

      That's what led us to the current restriction against using thread_local**. We could revisit that. thread_local usually requires an extra prologue before an access to ensure the variable has been initialized, while __thread requires the initializer be a constant expression. Also JDK-8230877 was before C++11/14 support and use was in place."

      ---

      ** The issue here is a potential performance hit. As the gcc documentation describes it [2]:

      "Unfortunately, this [C++ thread_local] support requires a run-time penalty for references to non-function-local thread_local variables defined in a different translation unit even if they don't need dynamic initialization, so users may want to continue to use __thread for TLS variables with static initialization semantics."

      Some preliminary benchmarking with gcc __thread converted to C++ thread_local did show some significant regressions on a couple of benchmarks on AArch64.

      ---

      "thread_local has all the same initialization order issues as globals. There's a nicely worked out analysis here:

      https://stackoverflow.com/questions/60813372/initialization-order-of-thread-local-vs-global-variables

      So I think I'd like us to stick with the limited version that requires a constexpr initialization expression, at least for the most part."
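
      One way to stay within the constant-initializer restriction, sketched here assuming gcc or clang, is to keep only a pointer in TLS and construct the object at an explicit point in the thread's lifetime, so that nothing depends on dynamic initialization order. WorkerState, initialize_current_thread, release_current_thread and run_worker are hypothetical names used for illustration.

```cpp
#include <cassert>
#include <thread>

// Hypothetical per-thread state with a non-trivial constructor.
struct WorkerState {
  int id;
  explicit WorkerState(int i) : id(i) {}
};

// Constant-initialized pointer: no dynamic initializer is involved.
static __thread WorkerState* current_state = nullptr;

// The state is created at an explicit, well-defined point.
void initialize_current_thread(int id) {
  assert(current_state == nullptr);
  current_state = new WorkerState(id);
}

// The state is also released explicitly, since __thread runs no destructor.
void release_current_thread() {
  delete current_state;
  current_state = nullptr;
}

// Runs a thread that sets up, reads and tears down its own state.
int run_worker(int id) {
  int seen = -1;
  std::thread t([&] {
    initialize_current_thread(id);
    seen = current_state->id;
    release_current_thread();
  });
  t.join();
  return seen;
}
```

      The cost of this pattern is that cleanup must be explicit: a thread that exits without calling release_current_thread leaks its state, which is the gap the destructor-based use case is meant to close.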

      ---

      "We could relax the prohibition to allow thread_local where really required. I might want a noisy looking macro for that use-case, with bare thread_local remaining forbidden. That makes it clear that someone thought about the question at least a little bit.

      I looked for a way to warn about uses of thread_local that could be locally disabled where we intentionally use it, but didn't find such a thing. Clang (some version) has -Wglobal-constructors, and a patch exists for adding it to gcc, but it's not in gcc11.2 (the latest release).
      https://gcc.gnu.org/legacy-ml/gcc-patches/2019-05/msg01860.html
      https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71482

      But I did stumble over this. Might this be a problem for you?
      https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61991"

      ---

      For the record, gcc bug 61991 is not an issue for the proposed use case because the TLS variable does get used.

      ---

      "There are two possible maintenance issues: 1. If we don't document our decisions about our choices we won't be able to re-evaluate them later on without full re-analysis, so let's put the info into the JBS entry. 2. Even if the choices are fully documented, there's some cost and risk in applying the documented reasoning correctly in each case, compared to a "one size fits all" design. But it seems like we have a plan to deal with those possible maintenance issues."

      ---



      So the proposal here is to allow "well considered" uses of C++ thread_local, by providing a suitably "noisy" macro, and adjusting the Hotspot Style Guide [3] section on allowed C++ features to accommodate this.
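
      A rough sketch of what such a macro could look like is shown below. The name CONSIDERED_CPP_THREAD_LOCAL is a placeholder invented for this example, not the name adopted in HotSpot, and thread_label is likewise illustrative.

```cpp
#include <cassert>
#include <string>
#include <thread>

// Hypothetical "noisy" spelling. Bare thread_local would stay forbidden, so
// each use of the macro signals that the trade-offs were considered.
#define CONSIDERED_CPP_THREAD_LOCAL thread_local

// Illustrative use: std::string has a non-trivial constructor and destructor,
// so the compiler-specific __thread mechanism could not be used here.
static CONSIDERED_CPP_THREAD_LOCAL std::string thread_label;

// Sets and reads the label inside a new thread and returns what it saw.
std::string label_in_new_thread(const std::string& name) {
  std::string seen;
  std::thread t([&] {
    thread_label = "worker-" + name;
    seen = thread_label;
  });
  t.join();
  return seen;
}
```

      Because the macro expands to plain thread_local, it changes nothing for the compiler; its value is that uses are easy to find in a search and stand out in review.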

      [1] https://gcc.gnu.org/onlinedocs/gcc/Thread-Local.html
      [2] https://gcc.gnu.org/gcc-4.8/changes.html
      [3] https://github.com/openjdk/jdk/blob/master/doc/hotspot-style.md

            dholmes David Holmes