- Bug
- Resolution: Fixed
- P4
- 18
- b11
Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-8357858 | 17.0.16 | Goetz Lindenmaier | P4 | Resolved | Fixed | b05 |
The failure_handler configuration for linux[1] and macos[2] uses
kill -ABRT %p
to dump the core of a timed out jtreg test. This command returns immediately and the coredump is carried out in the background by the OS, making it impossible for the failure_handler to properly track the timeout of this action. Let's change to a coredump method that waits until the coredump has actually finished before returning:
On Linux:
bash -c "kill -ABRT %p && tail --pid=%p -f /dev/null"
On Mac:
bash -c "kill -ABRT %p && lsof -p %p +r 1 &>/dev/null"
(credit: https://stackoverflow.com/a/41613532)
Dumping a core can also take longer than the default action timeout of 20 seconds. Some personal testing showed coredumps for heaps of size 10-20G to take roughly 1-2 minutes. Let's set a safe default of 10 minutes for this action.
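As a rough illustration of the "signal, then wait with a generous timeout" idea described above, here is a minimal sketch. The function name `dump_core_and_wait` is hypothetical and is not part of failure_handler. It also swaps in a portable `kill -0` polling loop in place of `tail --pid` and `lsof +r`, so it should be read as an approximation of the approach rather than the actual change.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption for illustration, not failure_handler code):
# send SIGABRT to a process and block until it has exited or a timeout elapses.
dump_core_and_wait() {
  local pid=$1
  local timeout=${2:-600}   # seconds; 600 mirrors the proposed 10-minute limit
  local waited=0

  # Request the abort, which triggers the core dump if core dumps are enabled.
  kill -ABRT "$pid" || return 1

  # kill -0 delivers no signal; it only checks whether the process still exists.
  # The process stays visible until the OS has finished tearing it down.
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      return 124            # gave up waiting
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

Called as `dump_core_and_wait 12345`, this returns only once process 12345 is gone or ten minutes have passed, so a caller can account for the full duration of the dump. Polling once per second is coarser than the commands quoted above, but it behaves the same way on Linux and macOS.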
[1]:https://github.com/openjdk/jdk/blob/master/test/failure_handler/src/share/conf/linux.properties
[2]:https://github.com/openjdk/jdk/blob/master/test/failure_handler/src/share/conf/mac.properties
- backported by
  - JDK-8357858 failure_handler native.core should wait for coredump to finish (Resolved)
- links to
  - Commit openjdk/jdk/6120319a
  - Commit(master) openjdk/jdk17u-dev/5cce7702
  - Review openjdk/jdk/12515
  - Review(master) openjdk/jdk17u-dev/3604