JDK-8206305

[lworld] allow value classes with no instance fields as "unit types"

      Value types "code like a class, work like an int".
      That includes the option, as with a class, to have no instance fields.

      The question remains, what is the "int-like" behavior for such a class?
      This question is relatively straightforward to answer, but requires us
      to engineer value types which do not require *any* bits of backing store.

      A class can have zero instance fields, which means it has only
      static state and its identity to work with.

      A value type should also be able to do this trick minus the identity.
      Without identity it no longer requires an address or offset within a container.

      A type with zero bits and no identity requires no machine resources
      to represent, *except* the single value, in all of time and space, which
      that type possesses. Recall that an n-bit type without identity has
      2**n possible values. A 0-bit type therefore has *one* possible
      value, not *zero* (2**0 == 1, not 0). The science-y name for such
      a type is a "unit type".
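
The counting argument can be checked mechanically. The sketch below is only an illustration of the arithmetic; the names `ValueCount` and `distinctValues` are made up for this example and are not part of any JDK API.

```java
// Illustrative only: ValueCount and distinctValues are hypothetical names.
final class ValueCount {
    /** Number of distinct values of an identity-free type with the given number of bits. */
    static long distinctValues(int bits) {
        if (bits < 0 || bits > 62) {
            throw new IllegalArgumentException("bits must be in [0, 62]: " + bits);
        }
        return 1L << bits; // 2**bits
    }

    public static void main(String[] args) {
        for (int bits : new int[] {0, 1, 8, 32}) {
            System.out.println(bits + " bits -> " + distinctValues(bits) + " value(s)");
        }
    }
}
```

The zero-bit case yields one value, which is the single inhabitant of the unit type.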

      The familiar type-like entity in Java which behaves like a unit type
      is void. In a sense, an empty value type is a named alternative
      to void.

      ## Implementation details

      A correct implementation in HotSpot for empty (unit) value types
      would be to produce a single heap-buffered instance of the type
      at preparation time, and connect it to the place where VT.default
      is cached. (This is done anyway for all value types: The default
      value of a value type is cached in a public place for use by the
      interpreter and JIT.) Once VT.default is created for an empty
      type, there is no need ever to create any other value. Since
      VT.default acts like a global static constant (unnamed),
      computations with unit types can be treated very much like
      computations with global static constants.

      All bytecodes that need to materialize a value of the unit type
      can simply load the VT.default value from its cached global
      location. When linking a field load (from getfield or getstatic)
      the cached linkage information (CPCE) must note that the value
      is effectively a static constant, and produce VT.default always.
      A putfield or putstatic is a nop, since there is no way the field
      value can possibly be changed (there is only one value).
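
As a rough source-level model of the field-access behavior described above (not the actual HotSpot linkage code), a container with a unit-typed field could behave as follows. All names here (`UnitFieldModel`, `Unit`, `Holder`) are hypothetical, and `Unit` is an ordinary class standing in for an empty value type, so unlike a real unit type it still has identity.

```java
// Source-level model only; a real implementation would live in the VM's field linkage.
final class UnitFieldModel {
    /** Stand-in for an empty value type (hypothetical; has identity in today's Java). */
    static final class Unit {
        static final Unit DEFAULT = new Unit(); // models the cached VT.default
        private Unit() {}
    }

    /** A container with one logical field of type Unit and no storage reserved for it. */
    static final class Holder {
        // Models getfield: always produce the cached default value.
        Unit getUnit() {
            return Unit.DEFAULT;
        }

        // Models putfield: a no-op, since the only possible value is already "stored".
        void setUnit(Unit ignored) {
        }
    }

    public static void main(String[] args) {
        Holder h = new Holder();
        h.setUnit(Unit.DEFAULT);
        System.out.println(h.getUnit() == Unit.DEFAULT);
    }
}
```

Note that `Holder` declares no instance fields at all, which mirrors the layout point: the unit-typed field costs no bits.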

      When laying out fields, there is no need to allocate storage
      for an empty type, although for technical reasons it may be
      desirable for static fields to be references that copy the
      global cached reference to VT.default. (We use an extra
      level of indirection with statics to solve circular bootstrap
      problems.)

      A flat array of zero-length value types has only a length.
      Its body is of zero length. As an object it *does* have object
      identity. Since Java does not support internal addressing
      within array bodies, there are no paradoxes that arise from
      having a zero element length, and the data structure is safe,
      although of limited usefulness.
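
A flat array of units can be pictured as an object that stores nothing but its length. The following is a sketch under that assumption; `UnitArrayModel`, `Unit`, and `UnitArray` are hypothetical names, and the class only imitates what the VM would do natively.

```java
import java.util.Objects;

// Model of a flat array whose element type occupies zero bits.
final class UnitArrayModel {
    /** Stand-in for an empty value type (hypothetical). */
    static final class Unit {
        static final Unit DEFAULT = new Unit(); // models the cached VT.default
        private Unit() {}
    }

    /** The array object has identity and a length, but a zero-length body. */
    static final class UnitArray {
        private final int length; // the only state

        UnitArray(int length) {
            if (length < 0) {
                throw new NegativeArraySizeException(String.valueOf(length));
            }
            this.length = length;
        }

        int length() {
            return length;
        }

        // Element load: bounds check, then materialize the one possible value.
        Unit get(int index) {
            Objects.checkIndex(index, length);
            return Unit.DEFAULT;
        }

        // Element store: bounds check only; there is nothing to write.
        void set(int index, Unit ignored) {
            Objects.checkIndex(index, length);
        }
    }

    public static void main(String[] args) {
        UnitArray a = new UnitArray(3);
        a.set(2, Unit.DEFAULT);
        System.out.println(a.length() + " " + (a.get(2) == Unit.DEFAULT));
    }
}
```

Bounds checks still apply, so the length remains observable even though no element occupies storage.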

      A method that takes an argument of unit type simply ignores
      that argument. A method that returns a unit type behaves
      exactly like a void method. The caller of a method that returns
      a unit type is perfectly capable of mentioning the unique
      VT.default value to itself, without receiving any bits as a return
      from the callee.
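
The calling convention can be imitated in ordinary Java by wrapping a void body so that the unit value is supplied on return without the body ever computing it. This is only an analogy for what the JIT or interpreter would do; `UnitReturnModel`, `Unit`, and `asUnitReturning` are hypothetical names.

```java
import java.util.function.Supplier;

// Analogy: a unit-returning call is a void call plus a constant the caller already knows.
final class UnitReturnModel {
    /** Stand-in for an empty value type (hypothetical). */
    static final class Unit {
        static final Unit DEFAULT = new Unit(); // models the cached VT.default
        private Unit() {}
    }

    /** Adapts a void computation into one that "returns" the unit value. */
    static Supplier<Unit> asUnitReturning(Runnable body) {
        return () -> {
            body.run();          // the callee transfers no bits back
            return Unit.DEFAULT; // the result is filled in from the global constant
        };
    }

    public static void main(String[] args) {
        Supplier<Unit> call = asUnitReturning(() -> System.out.println("side effect only"));
        System.out.println(call.get() == Unit.DEFAULT);
    }
}
```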

      ## More background

      An overview of unit types in various languages is here:
        https://en.wikipedia.org/wiki/Unit_type

      By contrast, a type with no values at all is called a "bottom type".
      Such types are less useful, although we can probably build them on
      top of value types by adding dynamic constraints. For Java
      they can represent computations which cannot produce a value
      but must throw. Neal Gafter proposed "Unreachable" as one such.
      See:
        https://en.wikipedia.org/wiki/Bottom_type
        https://gafter.blogspot.com/2006/11/closures-esoterica-completion.html

      C++ handles this edge case explicitly by adding in a padding byte,
      so the struct will have a nonzero sizeof and (crucially) an object
      identity. (This may be taken as an indication that C++ lacks proper
      value types, although the newer std::monostate type may be such
      a type.) Haskell and Rust on the other hand supply built-in unit
      types.

      Because this RFE mixes named classes with units, the resulting
      named unit types may be said to support the slogan "codes like
      a class, works like a unit type". This is consistent with Valhalla's
      goals when you remember that "works like an int" is shorthand for
      "works like a primitive", and Java's unit type void is a primitive
      (when viewed as a type).

      C++ discussion:
      https://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Empty_Base_Optimization

      Any non-instantiable reference type could also be implemented as
      a unit type. For example, java.lang.Void has only one value,
      null, so there is no need to actually store a null pointer for a field
      of type Void. This is an interesting possible optimization to pursue
      as an add-on to the support for value types that are units.
      Instead of VT.default, a Void value can be materialized as null.

      Fulfilling this RFE will allow us to remove the limitation temporarily
      imposed on javac by JDK-8205933.

      This RFE would bring the JVM closer to allowing void to be a proper
      type in the Java language, since it would naturally support making
      fields of type void.

      Unit-type fields are (perhaps surprisingly) useful when they arise
      from specializing a template type parameter to a unit type.
      For example, Set<K> can be derived mechanically from Map<K,U>
      where U is a unit type. When this RFE is implemented in HotSpot,
      and when specialized generics are implemented also, the data
      structure for Set<K> will be as efficient as a hand-specialized version
      of Map<K,Object> which drops all the unused/null object references
      in the internal data structure. Note that in some data structures arrays
      of units may naturally appear here; this is why arrays are called out
      above.
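
The derivation of a set from a map can be sketched in today's Java, without specialization, as follows. The names `UnitSetModel`, `Unit`, and `MapBackedSet` are hypothetical; because `Unit` here is an ordinary reference type, each map entry still carries a reference to it, which is exactly the overhead a true unit type with specialized generics would be expected to remove.

```java
import java.util.HashMap;
import java.util.Map;

// Set<K> expressed as Map<K, Unit>; a sketch, not a proposed library class.
final class UnitSetModel {
    /** Stand-in for an empty value type (hypothetical). */
    static final class Unit {
        static final Unit DEFAULT = new Unit();
        private Unit() {}
    }

    static final class MapBackedSet<K> {
        private final Map<K, Unit> map = new HashMap<>();

        /** Returns true if the key was not already present. */
        boolean add(K key) {
            return map.put(key, Unit.DEFAULT) == null;
        }

        boolean contains(K key) {
            return map.containsKey(key);
        }

        /** Returns true if the key was present. */
        boolean remove(K key) {
            return map.remove(key) != null;
        }

        int size() {
            return map.size();
        }
    }

    public static void main(String[] args) {
        MapBackedSet<String> set = new MapBackedSet<>();
        set.add("a");
        set.add("a");
        System.out.println(set.size());
    }
}
```

java.util.HashSet is built in a similar way over a HashMap with a shared dummy value, so the pattern itself is not new; the point of this RFE is that the dummy value would no longer cost any storage.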

      See email discussion here, including contrary positions:
        http://mail.openjdk.java.net/pipermail/valhalla-dev/2018-June/004497.html

      Assignee: Unassigned
      Reporter: John Rose (jrose)