Unfortunately, the failure is not reproducible on development machines; it happens only in CI, and only rarely.
java.lang.RuntimeException: bad RestoreTime: 15422: expected that 15422 < 10000
at jdk.test.lib.Asserts.fail(Asserts.java:634)
at jdk.test.lib.Asserts.assertLessThan(Asserts.java:99)
at jdk.test.lib.Asserts.assertLT(Asserts.java:70)
at MXBean.test(MXBean.java:80)
at jdk.test.lib.crac.CracTest.run(CracTest.java:155)
at jdk.test.lib.crac.CracTest.main(CracTest.java:89)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:333)
at java.base/java.lang.Thread.run(Thread.java:1583)
Timofei Pushkin:
But looking at the test, I would expect it to be fragile: it measures the time from the start of the checkpointed process to the start of its restore and wants it to be 0. Since that is not a reasonable thing to expect, it sets a huge tolerance of 10 seconds.
I think there should really be two tests:
1. For platforms that have engines with restore time passing (only Linux currently; I think pauseengine is enough here): this will be almost the same test as it is now, but it will do an actual restore call and measure the restore start time from that call.
2. For platforms that only have simengine: the test will just assert that "checkpointed process start time" <= "restore start time" and that "uptime since restore" is >= 0 and close to 0.
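The checks proposed for the simengine-only case could be sketched roughly as follows. This is only an illustration, not the actual test: the class name, the `holds` helper, and the tolerance parameter are hypothetical, and the three inputs are assumed to be millisecond values that the real test would read from the runtime and CRaC management beans.

```java
import java.lang.management.ManagementFactory;

public class RestoreTimeInvariants {
    /**
     * Hypothetical helper: returns true when the three invariants from item 2 hold.
     * All arguments are in milliseconds.
     */
    public static boolean holds(long processStartMs,
                                long restoreStartMs,
                                long uptimeSinceRestoreMs,
                                long toleranceMs) {
        // The checkpointed process cannot have started after its restore began.
        boolean ordered = processStartMs <= restoreStartMs;
        // Uptime since restore must be non-negative...
        boolean nonNegative = uptimeSinceRestoreMs >= 0;
        // ...and "close to 0", here meaning within an assumed tolerance.
        boolean closeToZero = uptimeSinceRestoreMs <= toleranceMs;
        return ordered && nonNegative && closeToZero;
    }

    public static void main(String[] args) {
        // Stand-in values: the JVM start time plays the role of the
        // "checkpointed process start time"; the current time plays the
        // role of the "restore start time". A real test would take the
        // latter two values from the CRaC management bean instead.
        long processStart = ManagementFactory.getRuntimeMXBean().getStartTime();
        long restoreStart = System.currentTimeMillis();
        long uptimeSinceRestore = 0L;
        System.out.println(holds(processStart, restoreStart, uptimeSinceRestore, 10_000L));
    }
}
```

Unlike the current assertion, the ordering check does not depend on how long the checkpointed process ran before the restore, so a slow CI machine would only affect the "close to 0" part.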
links to
- Commit(crac) openjdk/crac/9a1ab26a
- Review(crac) openjdk/crac/246