JDK-8234192

undefined behavior: C++ volatile keyword



    • Type: Bug
    • Resolution: Unresolved
    • Priority: P3
    • Fix Version: tbd
    • Affects Version: 15
    • Component: hotspot


      This bug is specifically about undefined behavior from use of the C++ volatile keyword in the HotSpot source base. (Similar bugs may well be on file for other C or C++ source bases, or over other classes of undefined code.)

      There is a traditional meaning to volatile in C and C++ which boils down to "optimize loads and stores less vigorously, and use more expensive hardware operations if necessary". The expensive hardware operations have sometimes conferred extra benefits, such as atomicity, in an ad hoc manner. The good news is that, since C++11, this traditional meaning has been giving way to a proper memory model; the bad news for HotSpot is that this memory model is focused clearly on ordering of side effects and handles atomicity via other means.

      Reference: https://en.cppreference.com/w/cpp/language/cv

      The Java Memory Model, which is similar to but predates the modern C++ memory model, differs in many details, including its treatment of the volatile keyword. In the JMM volatile integrates/conflates atomicity and sequencing. There is a special risk to volatile in HotSpot that a reader of C++ code may mistakenly assign a Java meaning to an occurrence of volatile in C++ code.

      Another special risk to HotSpot from volatile is that, as the C++ community evolves the meaning of volatile, we will be tasked to support various platforms and toolchains that are at varying levels of adoption. We've seen tools which are bleeding-edge, conservative, and stuck-in-the-past. It's a challenge to pick a combination of platform settings and code styles to keep our source base up to date yet broadly portable.

      The result is that, as the meaning of volatile is sharpened (and/or deprecated) in C++ and as it diverges more clearly from both tradition and the JMM, volatile risks losing behaviors that we rely on in our code base. We need to re-evaluate our uses of the keyword and in some (or all?) uses replace them with recommended alternatives, such as explicit C++ atomics.
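
As an illustration of what such a replacement might look like, here is a minimal sketch that swaps a racy `volatile int` counter for a `std::atomic<int>`. The names `hits`, `record_hits`, and `count_hits` are hypothetical and are not taken from the HotSpot sources; relaxed ordering is an assumption that fits a pure counter and would not fit a flag that publishes other data.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical counter, not taken from HotSpot.
// Before: `volatile int hits;` with `hits++` from several threads is a
// non-atomic read-modify-write, and a data race under the C++11 model.
std::atomic<int> hits{0};

// Each call adds n to the shared counter, one atomic increment at a time.
void record_hits(int n) {
    for (int i = 0; i < n; ++i) {
        hits.fetch_add(1, std::memory_order_relaxed);
    }
}

// Runs `threads` workers and returns the final count.
int count_hits(int threads, int per_thread) {
    hits.store(0, std::memory_order_relaxed);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back(record_hits, per_thread);
    }
    for (std::thread& th : pool) {
        th.join();  // join orders each worker's increments before the final load
    }
    return hits.load(std::memory_order_relaxed);
}
```

With the atomic counter, `count_hits(4, 10000)` yields exactly 40000; the `volatile` version carries no such guarantee.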

      Broadly speaking, HotSpot tends to build small abstractions and utility functions to encapsulate portability risks. It may be that we need to organize our code base so that all remaining uses of volatile (and atomics) are encapsulated by us in clearly documented, carefully tested, highly portable abstractions, and "naked volatile" is discouraged in all other parts of our code base.
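
One possible shape for such an abstraction is sketched below. `RacyField` and its member names are hypothetical, chosen only for illustration; this is not HotSpot's existing API, and a real version would need per-platform review and testing.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical wrapper: every access to a racy field names its ordering,
// and no plain `x = field` or `field = x` notation is available.
template <typename T>
class RacyField {
    std::atomic<T> _value;

public:
    explicit RacyField(T initial) : _value(initial) {}
    RacyField(const RacyField&) = delete;
    RacyField& operator=(const RacyField&) = delete;

    T load_relaxed() const { return _value.load(std::memory_order_relaxed); }
    T load_acquire() const { return _value.load(std::memory_order_acquire); }
    void release_store(T v) { _value.store(v, std::memory_order_release); }

    // Returns true if the field held `expected` and now holds `desired`.
    bool compare_exchange(T expected, T desired) {
        return _value.compare_exchange_strong(expected, desired,
                                              std::memory_order_acq_rel,
                                              std::memory_order_acquire);
    }
};

// Single-threaded usage example: 0 -> 1 by store, then 1 -> 2 by CAS.
int demo_racy_field() {
    RacyField<int> state(0);
    state.release_store(1);
    bool swapped = state.compare_exchange(1, 2);
    assert(swapped);
    return state.load_acquire();
}
```

Deleting the copy operations and omitting implicit conversions is a deliberate choice in this sketch: it forces call sites to spell out the access, which is the opposite trade-off from "naked volatile".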

      # more data

      ## C++ memory model

      In some versions of C++, use of a volatile variable to perform lock-free inter-thread synchronization is defined to be a data race. Data races are undefined behavior, which is not something the JVM easily tolerates.

      — See passage beginning "When an evaluation of an expression writes to a memory location and another evaluation reads or modifies the same memory location, the expressions are said to conflict." and ending "If a data race occurs, the behavior of the program is undefined."

      — See passage including "...this order is not guaranteed to be observed by another thread, since volatile access does not establish inter-thread synchronization..."
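
To make the concern concrete, the following sketch shows a flag-based handoff between two threads. All names (`payload`, `ready`, `writer`, `reader`, `run_handoff`) are hypothetical. If `ready` were declared `volatile bool` instead, the unsynchronized accesses to `ready` and `payload` would conflict and the program would have a data race; with `std::atomic<bool>` and release/acquire ordering, the write to `payload` happens-before the read.

```cpp
#include <atomic>
#include <thread>

// Hypothetical names; not HotSpot code.
int payload = 0;                  // ordinary data published by the writer
std::atomic<bool> ready{false};   // a plain `volatile bool` here would be a data race

void writer() {
    payload = 42;                                   // ordinary store
    ready.store(true, std::memory_order_release);   // publishes payload
}

int reader() {
    // The acquire load pairs with the release store in writer().
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;  // guaranteed to observe 42
}

// Runs one writer and one reader and returns what the reader saw.
int run_handoff() {
    payload = 0;
    ready.store(false);
    int seen = -1;
    std::thread r([&seen] { seen = reader(); });
    std::thread w(writer);
    w.join();
    r.join();
    return seen;
}
```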

      ## is volatile on the way out?

      C++20 deprecates some uses (the stupid ones, such as ++, --, and compound assignment on volatile operands). What trend is this part of?
      CppCon 2019: JF Bastien “Deprecating volatile”
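
For reference, here is a small sketch of the kind of code the C++20 deprecation affects. `device_reg` and `bump_register` are hypothetical names. The deprecated forms appear only in comments so that the block compiles cleanly under warnings-as-errors; the replacement spells out the separate read and write that the shorthand was hiding.

```cpp
// Hypothetical memory-mapped-style variable; not HotSpot code.
volatile int device_reg = 0;

int bump_register() {
    // Deprecated in C++20 on volatile operands:
    //   ++device_reg;
    //   device_reg += 1;
    // Both look like one operation but are a read followed by a write.
    int old_value = device_reg;   // exactly one volatile read
    device_reg = old_value + 1;   // exactly one volatile write
    return device_reg;            // one more volatile read
}
```

Note that the rewritten form is no more atomic than the deprecated one; it only makes the two accesses visible.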

      # in defense of volatile

      The case for volatile is something like the following:

      1. It’s a natural and familiar notation; folks know what it means. (Counter: Maybe they don't anymore.)

      2. If you don’t write an explicit “load” or “store” call (just x=*vp or some such), the compiler fixes it for you (instead of silently creating a race condition). (Counter: It's too magic. When the behavior is type-driven, it is hard to tell what "x=*vp" means by inspection.)

      3. Explicit code in this case is too explicit and verbose, hurting readability. (Counter: Encapsulate it.)
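
The trade-off in points 2 and 3 can be seen side by side in a small sketch. The function names are hypothetical, and relaxed ordering in the explicit version is an assumption made only to keep the two reads comparable.

```cpp
#include <atomic>

// Implicit notation: the statement reads like any other load; the fact
// that it is a volatile access is visible only in the pointer's type.
int read_implicit(const volatile int* vp) {
    int x = *vp;
    return x;
}

// Explicit notation: the call site names the operation and its ordering,
// at the cost of more text.
int read_explicit(const std::atomic<int>* ap) {
    return ap->load(std::memory_order_relaxed);
}

// Single-threaded check that both notations return the stored value.
bool notations_agree(int value) {
    volatile int v = value;
    std::atomic<int> a{value};
    return read_implicit(&v) == value && read_explicit(&a) == value;
}
```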

      Perhaps it boils down to this: How do we want to formulate those parts of our code which are subject to races? Can we isolate them inside appropriate abstractions? Or will our workarounds be so ugly, clunky, noisy, and verbose that they make the problem of races worse than the volatile keyword, since they obscure the meaning of our code?


      Assignee: Unassigned
      Reporter: John Rose (jrose)