Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-8279011

JFR: JfrChunkWriter incorrectly handles int64_t chunk size as size_t

XMLWordPrintable

    • jfr
    • b29

        Followed up on JFR failures on x86_32, and spotted this failure:

        $ CONF=linux-x86-server-fastdebug make run-test TEST=jdk/jfr/jmx/streaming/TestClose.java

        STDOUT:
        test-close-1640016512165/2021_12_20_17_08_33.jfr
        STDERR:
        java.lang.Exception: Expected repository to be empty
        at jdk.jfr.jmx.streaming.TestClose.main(TestClose.java:69)
        at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
        at java.base/java.lang.reflect.Method.invoke(Method.java:577)
        at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
        at java.base/java.lang.Thread.run(Thread.java:833)

        Intrusive debugging shows the same picture: chunkSize is recorded as huge value, so the JFR code internally receives the EOF when reading, event handling gets borked, and the test fails, never observing the chunk flushes.

        [3.141s][info][jfr,system,parser ] Chunk: startPosition=0
        [3.141s][info][jfr,system,parser ] Chunk: major=2
        [3.141s][info][jfr,system,parser ] Chunk: minor=1
        [3.141s][info][jfr,system,parser ] Chunk: chunkSize=9896023423722239
        [3.141s][info][jfr,system,parser ] Chunk: constantPoolPosition=0
        [3.141s][info][jfr,system,parser ] Chunk: metadataPosition=0
        [3.141s][info][jfr,system,parser ] Chunk: startNanos=1640015039383595228
        [3.141s][info][jfr,system,parser ] Chunk: durationNanos=945990689
        [3.141s][info][jfr,system,parser ] Chunk: startTicks=1203376481
        [3.141s][info][jfr,system,parser ] Chunk: ticksPerSecond=1000000000
        [3.141s][info][jfr,system,parser ] Setting input size to 9896023423722239
        [3.141s][info][jfr,system,parser ] Chunk: chunkSize=9896023423722239
        [3.141s][info][jfr,system,parser ] Chunk: constantPoolPosition=2304002
        [3.141s][info][jfr,system,parser ] Chunk: metadataPosition=2103375
        [3.141s][info][jfr,system,parser ] Chunk: durationNanos =945990689
        [3.141s][info][jfr,system,parser ] Chunk: generation=2
        [3.141s][info][jfr,system,parser ] Chunk: finished=false
        [3.141s][info][jfr,system,parser ] Chunk: fileSize=9896023423722239
        [3.141s][info][jfr,system,parser ] Chunk: finalChunk=false

        The problem seems to be here:

        int64_t JfrChunkWriter::write_chunk_header_checkpoint(bool flushpoint) {
           const u4 checkpoint_size = current_offset() - event_size_offset;
           write_padded_at_offset<u4>(checkpoint_size, event_size_offset);
           set_last_checkpoint_offset(event_size_offset);
           const size_t sz_written = size_written(); // <-- returns int64_t
           write_be_at_offset(sz_written, chunk_size_offset); // <--- template instantiation with type=size_t
           return sz_written;
         }

        This would have been nearly fine -- small size_t -> int64_t conversion is okay value-wise. But write_be_at_offset calculates the position for the writeout using sizeof(T), which silently borks the whole thing on at least 32-bit platforms, where sizeof(size_t) != sizeof(int64_t).

        This might also explain timeouts in these tests, as the fix apparently resolves the timeouts there too:

        # Timeouts
        jdk/jfr/jmx/streaming/TestRemoteDump.java
        jdk/jfr/jmx/streaming/TestSetSettings.java
        jdk/jfr/jmx/streaming/TestDelegated.java
        jdk/jfr/jmx/streaming/TestRotate.java
        jdk/jfr/jmx/streaming/TestEnableDisable.java

              shade Aleksey Shipilev
              shade Aleksey Shipilev
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

                Created:
                Updated:
                Resolved: