Devoxx France 2012 – Deadlock Victim
Taking the dining philosophers problem as an example, Heinz and Olivier show that an inappropriate locking strategy can lead to a DeadLock. In a database, the engine chooses a deadlock victim and returns an SQL exception to one of the clients involved. In a Java program, on the other hand, when at least 2 threads each hold one resource and try to lock the one held by the other, no exception tells a thread that it is involved in a DeadLock (in that case, it would be good for the thread to throw an Error).
The solution to this problem (named Left-Right DeadLock) is to have all the philosophers apply the following strategy:
- take (lock) the flatware on his right side
- take (lock) the flatware on his left side
- put down (unlock) the flatware on his left side
- put down (unlock) the flatware on his right side
- It’s certainly possible to take the opposite strategy: the left side flatware first, then the right side one. The essential point is that all philosophers apply the same strategy.
- It’s important that a philosopher puts down his flatware in the opposite order to the one in which he took it.
Instead of relying on the right-left notion (which is relative to each philosopher), it’s possible to give each piece of flatware a unique number. In this case, there are 2 possible strategies: start by taking the flatware with the smallest number (of the 2 located next to the philosopher), or start by taking the one with the biggest number. Again, it’s important that all philosophers apply the same strategy. This gives a generic solution to DeadLock problems: define a strict order among resources (for example by associating each of them with a unique numerical identifier) and apply the “smallest first” strategy.
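The “smallest first” strategy could be sketched as follows. This is only an illustration and not the code shown during the talk: the names Flatware, OrderedLocking and eat are hypothetical, and ReentrantLock is assumed as the locking mechanism.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical resource: the id defines the strict global order.
final class Flatware {
    final int id;
    final Lock lock = new ReentrantLock();

    Flatware(int id) {
        this.id = id;
    }
}

// Hypothetical helper applying the "smallest first" strategy.
final class OrderedLocking {
    /** Locks the item with the smallest id first, runs the action, then unlocks in reverse order. */
    static void eat(Flatware a, Flatware b, Runnable action) {
        Flatware first = a.id < b.id ? a : b;
        Flatware second = a.id < b.id ? b : a;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                action.run();
            } finally {
                second.lock.unlock();   // taken last, released first
            }
        } finally {
            first.lock.unlock();        // taken first, released last
        }
    }
}
```

Because every caller sorts the two items the same way, it does not matter in which order the arguments are passed: eat(left, right, ...) and eat(right, left, ...) acquire the locks in the same sequence.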
A concrete case is the private writeObject(ObjectOutputStream) method of the Vector class, which is synchronized on the instance. A DeadLock can occur when 2 threads serialize 2 vectors that each contain a reference to the other.
On the unit test side, how can we check that a DeadLock is fixed? How many times should the test be run to prove that the fix works? A DeadLock may only happen from time to time rather than systematically, whereas a good test must pass or fail independently of external conditions (processor load, etc.). Heinz and Olivier propose adding an empty method named sleep to the production code and calling it between the 2 attempts to lock the 2 different resources. By overriding the sleep method in the test code so that it calls Thread.sleep(long), it becomes possible to reveal the presence or absence of a DeadLock in a deterministic way.
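A minimal sketch of this testing technique is given below. Only the idea of an empty sleep method comes from the talk; the class names Philosopher, SlowPhilosopher and DeadlockCheck, the 100 ms delay and the timeout-based check are assumptions made for the illustration.

```java
import java.util.concurrent.locks.Lock;

// Production code (hypothetical): locks two resources in the order it was given.
class Philosopher {
    private final Lock first;
    private final Lock second;

    Philosopher(Lock first, Lock second) {
        this.first = first;
        this.second = second;
    }

    /** Hook for tests: does nothing in production. */
    protected void sleep() {
    }

    void eat() {
        first.lock();
        try {
            sleep();            // called between the two lock attempts
            second.lock();
            try {
                // eating would happen here
            } finally {
                second.unlock();
            }
        } finally {
            first.unlock();
        }
    }
}

// Test code (hypothetical): pauses between the two lock attempts.
class SlowPhilosopher extends Philosopher {
    SlowPhilosopher(Lock first, Lock second) {
        super(first, second);
    }

    @Override
    protected void sleep() {
        try {
            Thread.sleep(100);  // assumed delay, long enough for the other thread to take its first lock
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

// Test helper (hypothetical): true if both philosophers finish eating before the timeout.
final class DeadlockCheck {
    static boolean finishesInTime(Philosopher a, Philosopher b, long timeoutMillis) {
        Thread t1 = new Thread(a::eat);
        Thread t2 = new Thread(b::eat);
        t1.setDaemon(true);     // daemon threads do not keep the JVM alive if they stay blocked
        t2.setDaemon(true);
        t1.start();
        t2.start();
        try {
            t1.join(timeoutMillis);
            t2.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !t1.isAlive() && !t2.isAlive();
    }
}
```

With two SlowPhilosopher instances that take the same two locks in opposite orders, finishesInTime should return false on every run, because the pause gives each thread time to take its first lock. Once both take the locks in the same order, it should return true.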
Since Java 6, there are new features that make it easier to detect and fix a DeadLock:
- Visualize the state of the threads of a running JVM:
  - Under Windows: the JVM must have been launched in a console; type Ctrl-Break in that console.
  - Under Linux: if the JVM has been launched in a console, type Ctrl-\. Otherwise, send the QUIT signal to the process with the command kill -QUIT process_id, where process_id is the process identifier (you can find it with the command ps axf | grep java).
  - Alternatively, on both systems, use the command jstack process_id.
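- List the threads involved in a DeadLock by using JMX (ThreadMXBean). Its findDeadlockedThreads() method, added in Java 6, covers both monitors (keyword synchronized) and ownable synchronizers such as ReentrantLock, whereas the older findMonitorDeadlockedThreads() method only covers monitors.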
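The JMX-based listing can be sketched as follows, assuming the standard ThreadMXBean from java.lang.management is the entry point being referred to. The class name DeadlockReporter and its two methods are hypothetical and only serve the illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper around the platform ThreadMXBean.
final class DeadlockReporter {
    /** Returns the ids of the deadlocked threads, or an empty array if none is found. */
    static long[] deadlockedThreadIds() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        // May throw UnsupportedOperationException on a JVM that does not
        // support monitoring of ownable synchronizers.
        long[] ids = bean.findDeadlockedThreads();  // null when no deadlock is found
        return ids == null ? new long[0] : ids;
    }

    /** Prints, for each deadlocked thread, the lock it waits for and the thread owning that lock. */
    static void report() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        for (ThreadInfo info : bean.getThreadInfo(deadlockedThreadIds())) {
            if (info != null) {                     // null if the thread has died in the meantime
                System.out.println(info.getThreadName()
                        + " waits for " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
    }
}
```

Calling DeadlockReporter.report() periodically from a monitoring thread is one possible way to use it; the polling interval is left to the application.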
To illustrate this, Heinz and Olivier created a class named DeadlockArbitrator that kills a random thread among those involved in a DeadLock (the list is obtained through JMX). They then demonstrated a Swing application with one button to create a DeadLock and another to kill a thread involved in it (by using the DeadlockArbitrator class).
Remark: to kill a thread, the DeadlockArbitrator class uses the deprecated method Thread.stop(Throwable) (useful in our case). This technique must only be used for code that you can’t fix (third-party libraries). In other cases, you must obviously fix the DeadLock itself.
For more information on technical subjects like this one, Heinz invites us to look at his site Javaspecialists.eu