JDK-8298129

Let checkpoint event sizes grow beyond u4 limit

Details

    • jfr
    • b27

      Description

        By tradition, the size of an event is stored as a u4, i.e. a 32-bit wide type. This comes from the original implementation, which used big-endian encoding. Subsequent developments introduced a LEB128 variable-length integer encoding. This effectively reduced the maximum size of an event to 1 << 28 bytes, because the size must be written as a padded variable-length integer: each of the four reserved bytes carries only 7 payload bits. This range is more than large enough to accommodate regular, user-defined events, which are normally small, often only a few bytes; for example, the jdk.ClassLoad event is only 28 bytes.
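
        The 1 << 28 ceiling can be illustrated with a minimal sketch of a padded four-byte LEB128 write. The class and method names below are hypothetical, and the padding style (continuation bits kept set on the leading bytes so the field always occupies four bytes) is an assumption made for illustration, not a copy of the JFR writer.

```java
final class PaddedVarint {
    // Largest value a 4-byte padded LEB128 field can carry: 4 bytes x 7 payload bits.
    public static final long MAX_PADDED_U4 = (1L << 28) - 1;

    // Encode value into exactly 4 bytes. The first three bytes always have the
    // continuation bit (0x80) set, even when the remaining groups are zero,
    // so the field width is fixed and can be reserved before the size is known.
    public static byte[] encodePadded4(long value) {
        if (value < 0 || value > MAX_PADDED_U4) {
            throw new IllegalArgumentException("does not fit in 28 bits: " + value);
        }
        byte[] out = new byte[4];
        for (int i = 0; i < 4; i++) {
            int group = (int) ((value >>> (7 * i)) & 0x7F);
            out[i] = (byte) (i < 3 ? (group | 0x80) : group);
        }
        return out;
    }
}
```

        Under these assumptions a 28-byte event size still occupies four bytes, and any size of 1 << 28 or more is rejected because it needs a 29th payload bit.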

        There is a catch, however: everything in JFR is an event. This means that the metadata representation (event id 0) and the checkpoints/constant pools (event id 1) are also encoded as events. These are normally much larger than user-defined events. Sometimes there is so much metadata that 1 << 28 bytes becomes a practical limit on the amount of data that can be encoded. This can happen, for example, with very many stack traces or constants, such as symbols and strings.

        The current implementation stores checkpoint events in memory, using an intermediary representation, before serializing them to disk. This makes it possible to determine the exact size of the checkpoint event before serializing it. An implication is that there is no need to reserve a size field upfront, to be filled in later with the actual size. The event size will take only as many bytes as necessary, but will also seamlessly expand beyond the 1 << 28 limit. The parser reads varints as Java longs, but it has traditionally downcast the event size to an int, under the assumption that it fits in 32 bits. Simply removing the downcast in the parser when reading checkpoint events (event id 1) makes the change transparent. The JMC side will need an update as well.
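
        Once the size is known before serialization, the size field can be written unpadded. The following is a minimal sketch of that idea using plain unsigned LEB128; the class and method names are hypothetical, and the exact byte layout used by JFR may differ in detail.

```java
import java.io.ByteArrayOutputStream;

final class CompactVarint {
    // Number of bytes an unpadded LEB128 encoding of a non-negative value needs.
    public static int encodedLength(long value) {
        int n = 1;
        while ((value >>>= 7) != 0) {
            n++;
        }
        return n;
    }

    // Write the value using only as many bytes as necessary:
    // 7 payload bits per byte, continuation bit set on all but the last byte.
    public static void write(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }
}
```

        With this sketch, a size of 28 takes one byte instead of four, while a size of 1L << 28 simply grows to five bytes rather than overflowing a fixed-width field.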

        So with the suggested scheme, which applies only to checkpoint events, the exact size of the event will be written, with no padding. This lets us represent smaller events compactly in the common case and, more importantly, larger checkpoint / constant pool events as well. The parser side will continue to read checkpoint event sizes as varints, but will expose the event size as a long instead of an int.
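
        Why the downcast matters on the reading side can be shown with a small sketch. The reader below is a hypothetical stand-in for the parser, assuming plain unsigned LEB128 input in a ByteBuffer; it is not the actual JFR or JMC parser code.

```java
import java.nio.ByteBuffer;

final class EventSizeReader {
    // Read an unsigned LEB128 value as a long, 7 payload bits per byte.
    public static long readVarLong(ByteBuffer in) {
        long result = 0;
        for (int shift = 0; shift < 64; shift += 7) {
            byte b = in.get();
            result |= (long) (b & 0x7F) << shift;
            if (b >= 0) { // continuation bit clear: last byte of the value
                return result;
            }
        }
        throw new IllegalStateException("varint too long");
    }

    // Narrowing to int keeps only the low 32 bits of the size.
    public static int downcast(long size) {
        return (int) size;
    }
}
```

        For example, the five bytes 0x80 0x80 0x80 0x80 0x10 decode to 1L << 32; keeping the value as a long preserves it, whereas narrowing it to an int yields 0.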

              People

                mgronlun Markus Grönlund