- Enhancement
- Resolution: Won't Fix
- P3
- None
- 7, 8, 9
- generic
- generic
FULL PRODUCT VERSION :
$ java -version
java version "1.8.0_40"
Java(TM) SE Runtime Environment (build 1.8.0_40-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.40-b25, mixed mode)
And other versions
FULL OS VERSION :
Linux: 2.6.32-358.23.2.el6.x86_64, Mac OS: 10.9.5. This problem is unlikely to be OS-specific.
EXTRA RELEVANT SYSTEM CONFIGURATION :
Observed on multiple versions of Java 8.
A DESCRIPTION OF THE PROBLEM :
After a close examination of the OpenJDK implementation of ReentrantReadWriteLock, I have succeeded in creating a scenario similar to (our) customer's: a deadlock that the JVM deadlock detector cannot detect.
The trick is that ReentrantReadWriteLock$NonfairSync (the default, and the type of ReentrantReadWriteLock that the customer's class loader is using) contains a heuristic intended to give writers some priority over readers, so that even in a busy reading environment a lone writer will eventually get a chance to proceed. The heuristic works like this: when a new thread arrives, the code gives the head of the queue an unsynchronized "glance". If the thread at the head of the queue is a writer, then the arriving thread is queued behind it, even if the lock is held only by a reader and the arriving thread is also a reader.
As a result, it is possible for readers to queue behind readers when a writer intervenes (and I have a repro case to prove it).
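The queuing behaviour described above can be observed in isolation with a small program. This is a minimal sketch, not part of the original repro: the class name ReaderQueuesBehindWriter and the method secondReaderAcquired are hypothetical, and the sketch uses the timed tryLock(long, TimeUnit) rather than lock() so that the second reader gives up instead of hanging. It assumes the timed tryLock honours the same queue check as lock().

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReaderQueuesBehindWriter {

    /** Returns true if a second reader got the read lock while a writer was queued. */
    static boolean secondReaderAcquired() throws InterruptedException {
        ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock(); // nonfair by default

        // The calling thread plays the role of T1: it holds the read lock throughout.
        rwLock.readLock().lock();

        // The writer queues for the write lock and stays queued until interrupted.
        Thread writer = new Thread(() -> {
            try {
                rwLock.writeLock().lockInterruptibly();
                rwLock.writeLock().unlock();
            } catch (InterruptedException expected) {
                // Interrupted during cleanup below; nothing to release.
            }
        }, "writer");
        writer.setDaemon(true);
        writer.start();

        // Wait until the writer is actually in the lock's wait queue.
        while (!rwLock.hasQueuedThread(writer)) {
            Thread.sleep(10);
        }

        // A second reader now tries for the read lock with a timeout.
        boolean[] acquired = new boolean[1];
        Thread reader = new Thread(() -> {
            try {
                acquired[0] = rwLock.readLock().tryLock(500, TimeUnit.MILLISECONDS);
                if (acquired[0]) {
                    rwLock.readLock().unlock();
                }
            } catch (InterruptedException ignored) {
                // Not expected in this sketch.
            }
        }, "reader");
        reader.start();
        reader.join(); // join() makes the write to acquired[0] visible here

        // Clean up: cancel the writer, then release the first read lock.
        writer.interrupt();
        writer.join();
        rwLock.readLock().unlock();
        return acquired[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("second reader acquired read lock: " + secondReaderAcquired());
    }
}
```

If the heuristic behaves as described, the second reader times out even though the lock is held only for reading.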
Given this, the following sequence of actions results in a deadlock which is not reported by the JVM in a thread dump:
T1 takes read lock, then gets slow or is descheduled.
T2 queues for write lock
T3 takes object monitor, queues for read lock. But it queues behind T2 as explained above.
T1 gets busy again, queues for monitor
Now T1 is holding the read lock trying to get the monitor, and T3 is holding the monitor trying to get the read lock. It would be perfectly legal for T3 to take the read lock in parallel with T1, since they are both readers. This would allow the monitor to be released, solving the whole problem. But the implementation doesn't allow it, as explained above, so all three threads are deadlocked. If the JVM is now sent a kill -3, it does not report this as a deadlock, presumably because T2 is not holding any resources and T3 looks like it could run (even though, in reality, it can't).
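The same check can be made programmatically instead of reading a thread dump. The following is a sketch under stated assumptions, not the submitted repro: the class name UndetectedDeadlockCheck and the method reproduceAndDetect are hypothetical, the sleeps are replaced by latches and polling, and it assumes that ThreadMXBean.findDeadlockedThreads() reflects the same detection logic as the kill -3 thread dump. The three threads are daemons because they never terminate.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class UndetectedDeadlockCheck {

    /** Builds the T1/T2/T3 scenario, then returns what the JVM detector reports. */
    static long[] reproduceAndDetect() throws InterruptedException {
        ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock(); // nonfair by default
        Object monitorLock = new Object();
        CountDownLatch t1HasReadLock = new CountDownLatch(1);
        CountDownLatch t1MayContinue = new CountDownLatch(1);

        // T1: takes the read lock, pauses, then queues for the monitor.
        Thread t1 = new Thread(() -> {
            rwLock.readLock().lock();
            t1HasReadLock.countDown();
            try {
                t1MayContinue.await();
            } catch (InterruptedException e) {
                return;
            }
            synchronized (monitorLock) {
                // Never reached while T3 holds the monitor.
            }
        }, "T1");

        // T2: queues for the write lock behind T1's read lock.
        Thread t2 = new Thread(() -> rwLock.writeLock().lock(), "T2");

        // T3: takes the monitor, then queues for the read lock behind T2.
        Thread t3 = new Thread(() -> {
            synchronized (monitorLock) {
                rwLock.readLock().lock();
            }
        }, "T3");

        for (Thread t : new Thread[] {t1, t2, t3}) {
            t.setDaemon(true); // the threads stay stuck; do not keep the JVM alive
        }

        t1.start();
        t1HasReadLock.await();
        t2.start();
        while (!rwLock.hasQueuedThread(t2)) {
            Thread.sleep(10);
        }
        t3.start();
        while (!rwLock.hasQueuedThread(t3)) {
            Thread.sleep(10);
        }
        t1MayContinue.countDown();
        while (t1.getState() != Thread.State.BLOCKED) {
            Thread.sleep(10);
        }

        // Returns null when no deadlock cycle is found.
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        return mx.findDeadlockedThreads();
    }

    public static void main(String[] args) throws InterruptedException {
        long[] ids = reproduceAndDetect();
        System.out.println(ids == null
                ? "no deadlock reported"
                : ids.length + " deadlocked threads reported");
    }
}
```

A null result here would match the behaviour reported in this issue: all three threads are stuck, yet the detector finds no cycle, presumably because a read lock has no single owning thread for it to follow.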
Here's the output from the repro:
Thread T1: start, taking read lock...
Thread T2: start, sleeping for 1 second...
Thread T1: have read lock, sleeping for 3 seconds...
Thread T3: start: waiting 2 seconds for T1 to get the read lock and T2 to queue for the write lock...
Thread T2: queuing for write lock...
Thread T3: awake after 2 seconds, taking the monitor...
Thread T3: have monitor, queuing for read lock...
Thread T1: continuing after 3 seconds, queuing for the monitor...
(thread dump here - exceeds your length limit)
THE PROBLEM WAS REPRODUCIBLE WITH -Xint FLAG: Did not try
THE PROBLEM WAS REPRODUCIBLE WITH -server FLAG: Did not try
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
See source code below.
EXPECTED VERSUS ACTUAL BEHAVIOR :
The JVM should report a deadlock. It doesn't.
ERROR MESSAGES/STACK TRACES THAT OCCUR :
Not required.
REPRODUCIBILITY :
This bug can be reproduced always.
---------- BEGIN SOURCE ----------
/*
T1 takes read lock, gets slow
T2 queues for write lock
T3 takes monitor, queues for readlock, stuck behind T2 (SEE NOTE)
T1 queues for monitor
Deadlock - not detected by JVM in thread dump in response to kill -3.
NOTE: the issue is that the rwlock is only held for read, so in theory T3 can proceed.
But in practice, the implementation of the ReentrantReadWriteLock$NonfairSync cheats
by "glancing" at the head of the queue on every arrival. If the thread at the head
of the queue is trying to get the write lock, the new arrival is queued, even if it's
a reader that could be allowed to proceed, like T3. As a result, readers can queue
behind readers, leading to the possibility of deadlocks that don't look like deadlocks
to the JVM deadlock detector.
*/
import java.util.concurrent.locks.ReentrantReadWriteLock;
import org.junit.Test;

// Enclosing test class; the name is illustrative.
public class ReadWriteLockDeadlockTest {

    // Shared locks referenced by all three threads (declarations assumed;
    // a default ReentrantReadWriteLock is nonfair).
    static final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();
    static final Object monitorLock = new Object();

    static void sleepMillis(int millis) {
        java.util.concurrent.locks.LockSupport.parkNanos(1000L * 1000L * millis);
    }

    static void pr(String msg) {
        System.err.println("Thread " + Thread.currentThread().getName() + ": " + msg);
    }

    // T1: takes the read lock, sleeps, then queues for the monitor held by T3.
    static class T1 extends Thread {
        @Override
        public void run() {
            Thread.currentThread().setName("T1");
            pr("start, taking read lock...");
            rwLock.readLock().lock();
            pr("have read lock, sleeping for 3 seconds...");
            sleepMillis(3000);
            pr("continuing after 3 seconds, queuing for the monitor...");
            synchronized (monitorLock) {
                pr("uh oh: got the monitor (not expected).");
            }
        }
    }

    // T2: queues for the write lock while T1 holds the read lock.
    static class T2 extends Thread {
        @Override
        public void run() {
            Thread.currentThread().setName("T2");
            pr("start, sleeping for 1 second...");
            sleepMillis(1000);
            pr("queuing for write lock...");
            rwLock.writeLock().lock();
            pr("uh oh: got the write lock (not expected).");
        }
    }

    // T3: takes the monitor, then queues for the read lock behind T2.
    static class T3 extends Thread {
        @Override
        public void run() {
            Thread.currentThread().setName("T3");
            pr("start: waiting 2 seconds for T1 to get the read lock and T2 to queue for the write lock...");
            sleepMillis(2000);
            pr("awake after 2 seconds, taking the monitor...");
            synchronized (monitorLock) {
                pr("have monitor, queuing for read lock...");
                rwLock.readLock().lock();
            }
            pr("uh oh: got the read lock and released the monitor (not expected).");
        }
    }

    // Starts all three threads; the joins never return once the deadlock forms.
    @Test
    public void testDeadlockWithBothReadersAndWriters() throws Exception {
        Thread first = new T1();
        Thread second = new T2();
        Thread third = new T3();
        first.start();
        second.start();
        third.start();
        first.join();
        second.join();
        third.join();
    }
}
---------- END SOURCE ----------
- relates to: JDK-8176204 [DOC] ThreadMXBean Fails to Detect ReentrantReadWriteLock Deadlock (Resolved)