JDK-8294992

64 bit object headers

Details

    • JEP
    • Status: Draft
    • P4
    • Resolution: Unresolved
    • None
    • hotspot
    • None
    • Roman Kennke
    • Feature
    • Open
    • Implementation
    • hotspot-dev
    • L
    • L

    Description

      Summary

      Reduce the size of Java object headers in 64 bit HotSpot JVMs from the current 96 or 128 bits to 64 bits.

      Goals

      Java object headers in 64 bit HotSpot JVMs are 96 bits (with compressed class pointers) or 128 bits (without compressed class pointers). The goal of this JEP is to reduce the size of 64 bit HotSpot JVM object headers to 64 bits.

      Non-Goals

      It is not a goal of this JEP to change the layout or reduce the size of the body of Java objects (i.e., their fields and array elements).

      Success Metrics

      Object headers are consistently 64 bits and the freed-up space is utilized by the object body.

      Motivation

      Java object headers in 64 bit HotSpot JVMs are currently 96 bits with compressed class pointers (CCP) or 128 bits without them. Objects in Java programs tend to be small: research has shown that many workloads exhibit average object sizes of 4 to 8 words, that is, 256 to 512 bits. With 96 bit headers, which is currently the default, headers therefore account for roughly 19% to 38% of heap occupancy. With 64 bit headers, they would account for only about 12.5% to 25%. In other words, overall heap memory usage would drop by roughly 6% to 12.5%, with correspondingly reduced GC pressure and CPU usage. Tighter packing of objects also means fewer cache misses, with corresponding performance benefits.
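
      The percentages above follow from simple ratios of header size to average object size. The sketch below reproduces that arithmetic. The class and method names are hypothetical and not part of HotSpot, and the calculation ignores object alignment, which can change the savings for individual objects.

```java
// Illustrative arithmetic only; HeaderOverhead is a hypothetical helper, not HotSpot code.
final class HeaderOverhead {
    /** Fraction of an object's footprint that is taken up by its header. */
    static double headerShare(int headerBits, int objectBits) {
        return (double) headerBits / objectBits;
    }

    /** Fraction of the original footprint saved by shrinking the header. */
    static double saving(int currentHeaderBits, int proposedHeaderBits, int objectBits) {
        return (double) (currentHeaderBits - proposedHeaderBits) / objectBits;
    }

    public static void main(String[] args) {
        int[] averageSizesInWords = {4, 8};  // average object sizes cited in the text
        for (int words : averageSizesInWords) {
            int objectBits = words * 64;     // one word is 64 bits
            System.out.println(words + " words:"
                + " current share = " + 100 * headerShare(96, objectBits) + "%,"
                + " proposed share = " + 100 * headerShare(64, objectBits) + "%,"
                + " saving = " + 100 * saving(96, 64, objectBits) + "%");
        }
    }
}
```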

      Description

      The current header layout is roughly:

      • Bits 0..7: Lock bits and GC age (contains two unused bits)
      • Bits 8..38: Identity hash-code (31 bits)
      • Bits 39..63: Unused
      • Bits 64..95: Class pointer (without CCP, the class pointer would take bits 64..127)

      This JEP intends to change the header layout to:

      • Bits 0..5: Lock bits and GC age
      • Bits 6..31: Identity hash-code (26 bits)
      • Bits 32..63: Class pointer
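
      To make the proposed layout concrete, here is a minimal sketch that packs and unpacks the three fields in a Java long. The class CompactHeader and its method names are hypothetical; only the bit ranges come from the list above. The low six bits are treated as one opaque field because the text does not say how they are split between lock bits and GC age.

```java
// Sketch of the proposed 64 bit header layout; CompactHeader is a hypothetical name.
final class CompactHeader {
    static final int  LOW_BITS    = 6;   // bits 0..5: lock bits and GC age
    static final int  HASH_BITS   = 26;  // bits 6..31: identity hash-code
    static final int  HASH_SHIFT  = LOW_BITS;
    static final int  CLASS_SHIFT = 32;  // bits 32..63: compressed class pointer

    static final long LOW_MASK   = (1L << LOW_BITS) - 1;
    static final long HASH_MASK  = (1L << HASH_BITS) - 1;
    static final long CLASS_MASK = 0xFFFF_FFFFL;

    /** Combines the three fields into one header word; excess bits are masked off. */
    static long pack(int lowBits, int hash, int compressedClass) {
        return (lowBits & LOW_MASK)
             | ((hash & HASH_MASK) << HASH_SHIFT)
             | ((compressedClass & CLASS_MASK) << CLASS_SHIFT);
    }

    static int lowBits(long header)      { return (int) (header & LOW_MASK); }
    static int hash(long header)         { return (int) ((header >>> HASH_SHIFT) & HASH_MASK); }
    static int classPointer(long header) { return (int) (header >>> CLASS_SHIFT); }

    public static void main(String[] args) {
        long header = pack(0b000001, 0x2ABCDEF, 0xCAFEBABE);
        System.out.println(Long.toHexString(header));
        System.out.println(Integer.toHexString(hash(header)));
        System.out.println(Integer.toHexString(classPointer(header)));
    }
}
```

      Trading bits between the hash-code and the class pointer, as discussed next, would only change the shift and mask constants in such a scheme.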

      The number of bits for the hash-code is not set in stone. We could either trade some bits with the class-pointer, or we could implement on-demand hash-code allocation [0], and only require two bits for the hash-code in the header. In the latter scenario, the header layout may look like this:

      • Bits 0..5: Lock bits and GC age
      • Bits 6..7: Identity hash-code (2 bits)
      • Bits 8..63: Class pointer

      Class pointers would always be compressed; the option to run without compressed class pointers would be disabled or removed entirely. It may be possible to compress class pointers to fewer than 32 bits, at the expense of a smaller addressable class space. The upside would be more bits available for the identity hash-code.

      Notice that this layout mirrors the header layout of 32 bit HotSpot JVMs.

      When the object is locked by a stack-lock or monitor, or when it has been forwarded by the GC (each of which is indicated by the lowest two bits of the header), the upper 62 bits of the first word of the header are interpreted as a pointer to the stack-lock, the object monitor, or the forwarded location of the object. In that case the actual header is ‘displaced’, that is, stored in the stack-lock, in the monitor, or in a GC side-table.
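
      The following sketch shows how such a tagged header word can be decoded. The text only states that the lowest two bits indicate the state; the concrete tag values used here (00 stack-locked, 01 unlocked, 10 monitor, 11 forwarded) are an assumption based on the existing HotSpot mark-word encoding, and the class and enum names are hypothetical.

```java
// Decoding the two tag bits of a header word; MarkWordTag and State are hypothetical names.
final class MarkWordTag {
    enum State { STACK_LOCKED, UNLOCKED, MONITOR_LOCKED, FORWARDED }

    static final long TAG_MASK = 0b11L;

    /** Maps the lowest two bits to a state (tag values are an assumption, see above). */
    static State stateOf(long header) {
        switch ((int) (header & TAG_MASK)) {
            case 0b00: return State.STACK_LOCKED;
            case 0b01: return State.UNLOCKED;
            case 0b10: return State.MONITOR_LOCKED;
            case 0b11: return State.FORWARDED;
            default:   throw new AssertionError("unreachable: tag has two bits");
        }
    }

    /** True if the real header lives elsewhere and the word holds a pointer instead. */
    static boolean isDisplaced(long header) {
        return stateOf(header) != State.UNLOCKED;
    }

    /** The upper 62 bits, interpreted as a pointer; only meaningful when displaced. */
    static long pointerOf(long header) {
        return header & ~TAG_MASK;
    }

    public static void main(String[] args) {
        long header = 0x7f00_1000L | 0b10;  // made-up monitor address plus tag
        System.out.println(stateOf(header));
        System.out.println(Long.toHexString(pointerOf(header)));
    }
}
```

      With the class pointer moved into this word, any state for which isDisplaced returns true also hides the class pointer, which is the problem the adjustments described next address.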

      Since the contents of the header word, which now include the class pointer, are currently displaced and sometimes irreversibly lost when locking or forwarding objects, we propose to make a few adjustments:

      • Stack-locking will be replaced by a similar locking scheme that doesn’t require pointers in the header.
      • In the case of monitor-locking, we can safely fetch the real header from the object monitor. A few minor modifications will be necessary to avoid races with concurrent monitor deflation.
      • GC forwarding can often be handled by fetching the real header from the forwarded copy. For sliding GCs, we use a scheme that compresses forwarding pointers to 32 bits, so that the class pointer and the forwarding pointer can both be accommodated in the header. For details on sliding forwarding, see [1].

      A number of subsystems will be affected and will need to be adapted to the new header layout, in particular JVMTI, the serviceability agent, and JVMCI.

      Alternatives

      A number of different implementation approaches have been discussed and tried in Project Lilliput, for example removing stack-locking altogether, using external tables to map objects to their monitors or forwarded copies, and allocating space for the identity hash-code on demand. None of these seemed necessary for the purpose of this JEP. However, some of them could be explored in future work to make object headers even smaller.

      Testing

      This JEP will be tested by the usual jtreg tests. Objects and their headers are so central that practically every test will exercise the important code paths. Existing tests for locking and identity hash-code will also be used to verify the changes. A few new jtreg tests will be added to verify that object layout correctly utilizes the newly freed space.

      Risks and Assumptions

      There is a general risk that changes in some of the most fundamental subsystems like locking and object header layout introduce bugs or performance regressions. This will be addressed by extensive correctness and performance testing of the changes before they are integrated.

      There is a specific risk that workloads may suffer from some of the changes that are necessary to reduce the object header size, while not benefiting from that header reduction to offset that penalty. For example, a workload could show performance regressions because of the extra complexity of fetching the Klass* from the object header and dealing with monitor-locked objects, but at the same time not benefit much from smaller objects, because it doesn't allocate much, or allocates mostly large objects.

      In order to mitigate and investigate potential performance or even correctness problems, a diagnostic flag will be added, which switches the JVM back to the original object header layout and behaviour. Note that while this mitigates some risks, it also introduces the new risk of mistakes when guarding code paths by that flag. It also greatly increases the testing load, because both old and new paths should be equally well tested.

      Dependencies

      Project Valhalla may need some header bits. The layout of the header should be flexible enough to make room for more header bits. For example, it may be useful to trade some bits of the identity-hash-code or the compressed class pointer for other uses.

      JVMCI and users of that interface (currently Graal VM) would have to adopt the new header layout.

      [0] https://github.com/openjdk/lilliput/pull/14
      [1] https://github.com/openjdk/lilliput/pull/8
