Exception Injection: Throwing an Exception in Other Thread

valdok

8 Apr 2010 | CPOL | 6 min read

How to abort a non-cooperating thread by an exception

Introduction

This article shows a way to "abort" a non-cooperating thread. More precisely, it can be used to abort a non-cooperating function that is running in another thread, and return execution to some "friendly" point within that thread. The method described in this article causes an exception to be raised in the target thread (similar to Thread.Abort in .NET).

The Method

First let's agree on the exception type that we may want to throw. Let's call it ThreadAbort and define it:

C++

class ThreadAbort
{
	__declspec (noreturn) static void Throw();
public:
	static bool RaiseInThread(HANDLE hThread);
	static void DontOptimize() throw (...);
};

As you can see, ThreadAbort has no member variables, which means we don't pass any parameters with our exception. Parameters could be added as well, but we won't discuss that here. The static member functions have the following purposes:

RaiseInThread causes the specified thread to raise the ThreadAbort exception.
DontOptimize does nothing. However, it should be called inside the appropriate try/catch block of the target thread. At compile time the compiler sees no possibility of our exception being raised there, so during optimization it may omit the try/catch block, and our exception would then go unhandled. Another way to solve this is to select the asynchronous exception handling model (more about this later).

The non-cooperating function that we may call in that thread should be wrapped by the appropriate try/catch block:

C++

try {
	ThreadAbort::DontOptimize();
	
	// pass control to the func
	SomeNonResponsiveFunc();
	
} catch (ThreadAbort&) {
	// process abortion
}
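To make the idea of a "friendly" point more concrete, here is a minimal, portable sketch of a worker loop built around the same try/catch pattern. It is only an illustration under stated assumptions: the ThreadAbort struct and the free function DontOptimize are simplified stand-ins for the class above, RunJobs is a hypothetical helper, and the exception is thrown synchronously by a job rather than injected from another thread.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Simplified stand-in for the article's exception type (assumption: no members needed).
struct ThreadAbort {};

// Mirrors the article's DontOptimize: the volatile read keeps a possible
// throw visible to the compiler, although it never happens at run time.
inline void DontOptimize()
{
	volatile int i = 0;
	if (i)
		throw ThreadAbort();
}

// Hypothetical worker loop: each job runs inside its own try/catch block.
// An aborted job is counted, and the loop simply continues with the next one.
int RunJobs(const std::vector<std::function<void()>>& jobs, int& aborted)
{
	int completed = 0;
	aborted = 0;
	for (const auto& job : jobs)
	{
		try {
			DontOptimize();
			job();          // the potentially non-responsive function
			++completed;
		} catch (const ThreadAbort&) {
			++aborted;      // the "friendly" point: process the abort, then go on
		}
	}
	return completed;
}
```

With three jobs of which the second throws ThreadAbort, RunJobs reports two completed jobs and one aborted job; the thread itself keeps running.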

Now let's dig into the implementation of ThreadAbort:

C++

__declspec (noreturn) void ThreadAbort::Throw()
{
	// just throw
	throw ThreadAbort();
}

void ThreadAbort::DontOptimize() throw (...)
{
	// This awkward trick convinces the compiler that the exception
	// *may* be thrown here at run time.
	// In practice the branch below is never taken.
	volatile int i=0;
	if (i)
		Throw();
}

bool ThreadAbort::RaiseInThread(HANDLE hThread)
{
	bool ok = false;

	// Suspend the thread, so that we won't have surprises.
	// SuspendThread returns (DWORD) -1 on failure.
	DWORD dwVal = SuspendThread(hThread);
	if ((DWORD) -1 != dwVal)
	{
		// Get its context (processor registers)
		CONTEXT ctx;
		ctx.ContextFlags = CONTEXT_CONTROL;
		if (GetThreadContext(hThread, &ctx))
		{
			// Jump into our Throw() function.
			// Note: Eip exists only in the 32-bit x86 CONTEXT.
			ctx.Eip = (DWORD) (DWORD_PTR) Throw;

			if (SetThreadContext(hThread, &ctx))
				ok = true;
		}

		// go ahead; ResumeThread also returns (DWORD) -1 on failure
		VERIFY(ResumeThread(hThread) != (DWORD) -1);
	}
	return ok;
}

As can be seen from the above code, in order to cause a thread to raise an exception we suspend it, modify its EIP register (the x86 instruction pointer) to point at ThreadAbort::Throw, and then resume it, to go happily into the abyss. This is a brute-force method: we know nothing about what that thread is actually doing, so we may interrupt it at any moment, in the middle of anything.

Compared to Other Methods

There are other methods to interrupt a thread in the middle of what it's doing; however, they are less flexible than the exception-based one:

TerminateThread can be called to terminate a thread immediately. However, this doesn't allow passing control to "friendly" code in that thread. That is, we may not want to terminate the thread at all; we just want to pass control to other code inside it. In addition, when a thread is terminated by TerminateThread, its stack memory is not released by the OS (Microsoft documents this for Windows XP and Windows Server 2003). So we have a memory leak, in addition to whatever the aborted code had allocated and never freed.
We may modify the EIP to go directly into the "friendly" code, rather than involving the exception handling mechanism, which will (hopefully) finish at our friendly code.
The problem here is that the code being aborted gets no chance to execute its cleanup. Hence all the resources it allocated are lost, and we'll probably have resource/memory leaks. With an exception, on the other hand, code written in an exception-aware way may clean up gracefully.
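The difference between jumping straight to the friendly code and unwinding through an exception can be sketched in portable C++. This is a simulation under stated assumptions: ScopedResource, ExceptionAwareWork, AbortAndCollectLog and the g_log vector are hypothetical names, ThreadAbort is a simplified stand-in, and the throw statement stands in for the injected exception.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for the article's exception type.
struct ThreadAbort {};

// Hypothetical log of cleanup actions, used only to make the effect observable.
static std::vector<std::string> g_log;

// RAII guard: its destructor runs whenever the enclosing scope is left,
// including when an exception unwinds the stack.
class ScopedResource
{
public:
	explicit ScopedResource(std::string name) : m_name(std::move(name)) {}
	~ScopedResource() { g_log.push_back("released " + m_name); }
	ScopedResource(const ScopedResource&) = delete;
	ScopedResource& operator=(const ScopedResource&) = delete;
private:
	std::string m_name;
};

// Exception-aware work: every resource is owned by an automatic object.
// The throw below stands in for the exception injected by RaiseInThread.
void ExceptionAwareWork()
{
	ScopedResource a("A");
	ScopedResource b("B");
	throw ThreadAbort();
}

// Aborts the work, lands at the "friendly" point and returns the cleanup log.
std::vector<std::string> AbortAndCollectLog()
{
	g_log.clear();
	try {
		ExceptionAwareWork();
	} catch (const ThreadAbort&) {
		g_log.push_back("abort handled");
	}
	return g_log;
}
```

The log ends up as "released B", "released A", "abort handled": both guards are destroyed in reverse order of construction before the handler body runs. A direct jump to the friendly code would skip both destructors.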

That is, the exception-based method allows a graceful abort. Unfortunately, it can't guarantee that we won't have any leaks at all. There are several reasons for this:

Not every piece of code is written in an exception-aware way (meaning that no allocated resource is left "in the air": everything is guarded either by destructors of automatic variables or by __try/__finally SEH blocks). Some code blocks are written with the assumption that an exception cannot occur within them.
Even if everything is written in an exception-aware way, the compiler is free to optimize the code. It may omit the needed exception handling records if it doesn't see a possibility for an exception to occur. Luckily this can be prevented by selecting the so-called asynchronous exception handling model (the /EHa option in Visual C++; for more information, please read this article).
Even if everything is written in an exception-aware way and we select the asynchronous exception handling model, we may still have a problem.
According to the C++ rules, the lifetime of an automatic object officially begins once its constructor has finished. For instance, if an exception is thrown out of the object's constructor, its destructor won't be called.

Now, since we blindly raise our exception in the middle of whatever the thread is doing, we may raise it right at the end of an object's constructor: after it has allocated its resources, but just before it's officially born. In such a scenario its destructor won't be called, and we'll have leaks.

Luckily this also has a workaround: we can avoid doing allocations in constructors. That is, the constructor allocates nothing and just "zero-initializes" the member variables. The actual allocations are then done in some other method, which should be called right after the constructor. This is called two-stage object construction.
And finally, even if we do everything in the most careful way, we may still have a problem, this time with the destructor. During normal program flow, when the lifetime of an object ends, the code generated by the compiler removes the exception handling information for this object and then immediately calls its destructor. But what if we raise our exception right after the exception handling information is removed but before the destructor is called? Or during the destructor, when it hasn't finished its cleanup yet?
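The constructor problem and the two-stage workaround from the list above can be sketched in portable C++. The names OneStage, TwoStage, g_live and the two helper functions are hypothetical, and an ordinary std::runtime_error thrown synchronously stands in for the injected exception, so this only approximates the asynchronous case.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical counter of currently "allocated" resources.
static int g_live = 0;

// Allocates in the constructor. If the constructor does not complete,
// the object was never born, so its destructor is never called.
struct OneStage
{
	OneStage()
	{
		++g_live;                                 // resource acquired
		throw std::runtime_error("interrupted");  // stands in for an abort inside the constructor
	}
	~OneStage() { --g_live; }
};

// Two-stage construction: the constructor only zero-initializes,
// and the actual allocation happens later in Init().
struct TwoStage
{
	bool owns;
	TwoStage() : owns(false) {}
	void Init() { ++g_live; owns = true; }
	~TwoStage() { if (owns) --g_live; }
};

// Returns the number of leaked resources when the constructor is interrupted.
int LeaksWithOneStage()
{
	g_live = 0;
	try { OneStage obj; } catch (const std::runtime_error&) {}
	return g_live;
}

// Returns the number of leaked resources when the abort arrives after Init().
int LeaksWithTwoStage()
{
	g_live = 0;
	try {
		TwoStage obj;   // fully constructed, so its destructor will run
		obj.Init();
		throw std::runtime_error("interrupted");
	} catch (const std::runtime_error&) {}
	return g_live;
}
```

LeaksWithOneStage returns 1 and LeaksWithTwoStage returns 0. With a truly asynchronous exception there is still a small window inside Init() itself, between acquiring the resource and recording ownership, so the worst case shrinks without disappearing.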

Unfortunately there's no way to guarantee correct cleanup of non-cooperating code in all possible scenarios. This is why our method is still a brute-force one. However, compared to the other methods, it's more graceful. You actually have a good chance of a correct cleanup, and in the worst case (provided everything is written carefully) you'll have no more than one leak.

Kernel-mode Calls

Our method raises an exception in the target thread while that thread runs in user mode. However, if the thread is currently executing a system (kernel-mode) call, it won't be aborted immediately. The abort is deferred until the thread returns from the system call. For a short-duration system call (such as SetEvent or CreateMutex) this is no problem. But wait functions (such as Sleep or WaitForSingleObject) may take a long time to complete, or may never complete at all.

AFAIK it's impossible to abort a system call from user-mode code. It could only be achieved by going deep into the OS internals. Short of that, the only way to abort such a call is to use TerminateThread.

Conclusion

The most important conclusion is that you should not use non-cooperating code. Every piece of code that is potentially time-consuming must provide a conventional way to abort it.
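As an illustration of what such a conventional way might look like, here is a minimal sketch of cooperative cancellation in portable C++. CancelFlag, WorkResult and DoCooperativeWork are hypothetical names, not part of the code above; the assumption is that the long-running work can be split into short steps with a check between them.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical cancellation flag: any thread may call Cancel(),
// and the worker polls IsCancelled() between steps.
class CancelFlag
{
public:
	CancelFlag() : m_flag(false) {}
	void Cancel() { m_flag.store(true); }
	bool IsCancelled() const { return m_flag.load(); }
private:
	std::atomic<bool> m_flag;
};

struct WorkResult
{
	std::size_t stepsDone;
	bool cancelled;
};

// Cooperative work: the cancel flag is checked before every step, so the
// function returns normally (with full cleanup) soon after Cancel() is called.
template <typename Step>
WorkResult DoCooperativeWork(const CancelFlag& cancel, std::size_t totalSteps, Step step)
{
	WorkResult result = { 0, false };
	for (std::size_t i = 0; i < totalSteps; ++i)
	{
		if (cancel.IsCancelled())
		{
			result.cancelled = true;
			break;
		}
		step(i);              // one short unit of work
		++result.stepsDone;
	}
	return result;
}
```

Code written this way never needs an injected exception: the caller sets the flag, and the worker leaves through its normal return path.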

Next, it's impossible to abort unknown code and be sure that everything is cleaned up. However, the injected-exception method gives the best chance of a graceful cleanup.

It's impossible to abort a system call from user-mode code. In some situations, there's no choice but to call the TerminateThread function.

I'll appreciate comments. Criticisms and new ideas are welcome.

History

8th April, 2010: Initial post

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)