As JDK-8144013 showed, there is a risk of deadlock if a suspended thread holds a resource needed by the sampling code. This risk has always been known with the use of signal-based suspension in JFR, but I am concerned that, as time has passed, the once well-contained and constrained sampling code now has a transitive closure that might include deadlock-prone executions. JDK-8144013 showed it with the dladdr lock, but as discussed in related email, it might just as easily be a malloc lock if the NMT code reachable from the sampling code has to perform a malloc. NMT, JFR events, and even Unified Logging all potentially enlarge the code closure reachable from the core sampling code and thus increase the chances of suspension-induced deadlock.
If we can't avoid such usage we should at least have a clear understanding of where problems may arise, by generating some form of code reachability analysis.
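As a rough illustration of what such a reachability analysis could report, here is a minimal sketch in Python. It assumes a call graph has already been extracted by some other tool; the graph contents, the function names in it, and the helper `lock_paths` are all hypothetical and do not correspond to actual HotSpot symbols. It walks the graph breadth-first from the sampling entry point and reports the shortest call path to each function assumed to take a lock that a suspended thread might hold.

```python
from collections import deque

# Hypothetical call graph (caller -> callees). Names are illustrative only
# and are not real HotSpot functions.
CALL_GRAPH = {
    "sample_thread": ["walk_stack", "record_event"],
    "walk_stack": ["lookup_symbol"],
    "lookup_symbol": ["dladdr"],
    "record_event": ["nmt_track"],
    "nmt_track": ["malloc"],
    "log_message": ["malloc"],  # not reachable from sample_thread in this graph
}

# Assumption: these functions acquire a lock that a suspended thread may hold.
LOCK_TAKING = {"dladdr", "malloc"}


def lock_paths(graph, root, risky):
    """Return {risky function: shortest call path from root}, found by BFS."""
    paths = {}
    seen = {root}
    queue = deque([[root]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node in risky:
            # Record the first (shortest) path and do not descend further.
            paths.setdefault(node, path)
            continue
        for callee in graph.get(node, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(path + [callee])
    return paths


if __name__ == "__main__":
    found = lock_paths(CALL_GRAPH, "sample_thread", LOCK_TAKING)
    for fn, path in sorted(found.items()):
        print(f"{fn}: {' -> '.join(path)}")
```

A real analysis would have to build the graph from the compiled code or sources and account for indirect and virtual calls, which this sketch ignores; the point is only that each reported path names a concrete chain from the sampler to a lock-taking call that can then be reviewed.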
- duplicates JDK-8352251 Implement JEP 518: JFR Cooperative Sampling (Resolved)