We’ve been using the Quartz work scheduler in a service that I’m maintaining at work. This is a .NET port of a Java application, and while looking over the source code I noticed a pattern that I hadn’t seen used before – the catching of ThreadInterruptedException and the subsequent use of the Interrupt method on a thread, as for example in this file.
Threads have a load of interesting state information associated with them. Previously I’d researched ThreadAbortException, which is associated with the Thread.Abort method. Abort causes an asynchronous exception to be thrown on the target thread. The exception is queued if the thread isn’t in a suitable state to throw it, and the runtime keeps a pending-abort flag so that the abort is re-thrown when the code exits any catch or finally blocks. This flag can be reset using the Thread.ResetAbort method. Asynchronous exceptions seem to have fallen out of favour in most .NET code, as it is very hard to maintain invariants over your objects if an exception can be thrown at any point in the code. It is impossible to guarantee atomic regions in the presence of asynchronous exceptions unless you start using features like constrained execution regions.
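I wasn’t aware before that a thread can have an associated interrupt-pending flag, which the CLR checks before it goes into a blocking wait. So if Interrupt is called on a thread while it is still running, nothing happens immediately; the exception is thrown later, on the line where the thread next blocks, such as a call to Sleep.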
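To make the pending flag concrete, here is a minimal sketch of that behaviour. It is my own illustration, not code from Quartz; the names PendingInterruptDemo and Run are made up for the example. It uses a dedicated worker thread so that no interrupt request is left pending on the caller's thread.

```csharp
using System;
using System.Threading;

// Hypothetical demo class; not taken from Quartz or the SSCLI.
static class PendingInterruptDemo
{
    // Returns a short trace of what the worker thread did, in order.
    public static string Run()
    {
        string trace = "";
        var worker = new Thread(() =>
        {
            // Request an interrupt while the thread is running, not blocked.
            Thread.CurrentThread.Interrupt();

            // Nothing is thrown here: the request is only recorded as pending.
            trace += "running;";

            try
            {
                // The first blocking call sees the pending request and throws.
                Thread.Sleep(1000);
                trace += "slept;";
            }
            catch (ThreadInterruptedException)
            {
                trace += "interrupted;";
            }
        });
        worker.Start();
        worker.Join(); // after Join, the worker's writes to trace are visible
        return trace;
    }
}
```

Calling PendingInterruptDemo.Run() should give the trace "running;interrupted;": the code between Interrupt and Sleep runs normally, and the Sleep never gets as far as sleeping.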
So then, how does this all work under the covers? Time to get out the SSCLI source code and have a quick look.
The C++ code in the file sscli20/clr/src/vm/threads.cpp handles the native control of threads. Each thread object has a field called m_UserInterrupt that records whether a user interrupt is pending. The method HandleThreadInterrupt checks this flag and, if we are in a suitable state, throws the ThreadInterruptedException. This method is called from three other methods: UserSleep, DoAppropriateWaitWorker and DoAppropriateWaitWorkerAlertableHelper (where this last method is called from the non-helper method). UserSleep is called when the user has asked the thread to sleep, and DoAppropriateWaitWorker contains the clever logic for when the current thread needs to wait for a HANDLE at the OS level. Waiting is quite a complicated business in Windows. If we want an alertable wait, then queued APCs can hijack our thread and release the wait, and if we are waiting with a timeout we may then need to wait again for a smaller time interval. These APCs could come from IO completions, for example.
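The idea of waiting again for a smaller interval can be sketched in managed code. This is only an analogy using Monitor.Wait, not the SSCLI implementation, and the names TimedWait and WaitUntil are hypothetical. Each time the wait is released without the condition being true, the loop works out how much of the original timeout is left and waits for just that long.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper; a managed analogy, not the CLR's native wait logic.
static class TimedWait
{
    // Waits until condition() is true or the timeout elapses.
    // Returns true if the condition was met, false on timeout.
    public static bool WaitUntil(object gate, Func<bool> condition, TimeSpan timeout)
    {
        var clock = Stopwatch.StartNew();
        lock (gate)
        {
            while (!condition())
            {
                // Only the unused part of the timeout is waited for again.
                TimeSpan remaining = timeout - clock.Elapsed;
                if (remaining <= TimeSpan.Zero)
                    return false;

                // May return early (for example after a Pulse) with the
                // condition still false; the loop then re-checks and re-waits.
                Monitor.Wait(gate, remaining);
            }
            return true;
        }
    }
}
```

A caller would set its condition under the same lock and call Monitor.PulseAll(gate) to release the waiter. Without the recomputation, every early wake-up would restart the full timeout and the overall wait could run far longer than requested.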
Also in this file is the interesting ReadyForAsyncException, which contains the code to see if an asynchronous abort can be delivered to the thread. This involves walking the stack to see if we are inside catch or finally blocks. There is also the code in UserAbort, which implements the logic of a thread abort and is very complicated: we need to do different things depending on whether the target thread is in managed or unmanaged code. We may also need to wait until the thread hits a safe point, which will cause it to call CommonTripThread.
The real CLR has the ability to hijack code by changing the return address on a stack frame to regain control. Some of the code to handle this can be seen in excep.cpp around the method IsThreadHijackedForThreadStop.