Introduction
This is the second part of a series on building a thread deadlock detector. Please read the first article to understand the background: A (working) implementation of API hooking (Part I). The next part (with a working deadlock detector) is also available: Thread Deadlock detector.
The API hooked functions for a thread deadlock detector
To detect a deadlock, the following calls must be intercepted:
- Thread functions (Create[Remote]Thread, [Suspend/Resume]Thread, ExitThread, TerminateThread, OpenThread).
- Synchronization functions (WaitFor[Single/Multiple]Object[Ex], SignalObjectAndWait, [Set/Reset/Pulse]Event).
- Synchronization objects creation (Create/Open[Mutex/Semaphore/Event], DuplicateHandle).
- Synchronization objects deletion (CloseHandle).
To intercept these calls, I simply added the required functions to the code from the previous article. Please refer to that article for how to add functions to be hooked. The main idea is to have a hook structure declared like:
typedef struct
{
    char   szDLLName[MAX_PATH];
    char   szFuncName[MAX_PATH];
    void * pNewFunc;
    void * pPrevFunc;
    Flags  flags;
} HookStruct;
When hooking the function in all modules, the previous (and true) function pointer is saved in pPrevFunc, while the new function pointer (set to our hooking function) replaces the corresponding entry in the module's IAT.
Then, in our function, we can simply call the previous function by casting the pPrevFunc pointer to the correct function pointer signature. I defined a useful macro for that (the source code will be in part III).
#define CallFunction(X) \
    ((Signature_##X)GetPreviousFunctionAddress(Index_##X))
#define Signature_CreateThread HANDLE (FAR PASCAL *)\
(LPSECURITY_ATTRIBUTES lpThreadAttributes, DWORD dwStackSize, \
LPTHREAD_START_ROUTINE lpStartAddress, LPVOID lpParameter, DWORD \
dwCreationFlags, LPDWORD lpThreadId)
#define Index_CreateThread 3
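To make the mechanism concrete, here is a minimal, portable sketch of the same idea. It is not the article's source: the IAT is modelled by a single function pointer (g_iatSlot_Add), the hooked function is a toy Add, and HookEntry, InstallHook and CallThroughIat are hypothetical names introduced only for this illustration. Only the CallFunction / Signature_##X / Index_##X pattern is taken from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Generic function pointer type used to store any hooked function. */
typedef void (*GenericFunc)(void);

/* Hypothetical, reduced stand-in for the article's HookStruct. */
typedef struct {
    const char *szFuncName; /* name of the hooked function           */
    GenericFunc pNewFunc;   /* our replacement                       */
    GenericFunc pPrevFunc;  /* original, saved when the hook is set  */
} HookEntry;

/* Signature and table index, following the article's naming pattern. */
typedef int (*Signature_Add)(int, int);
#define Index_Add 0

int RealAdd(int a, int b) { return a + b; }
int MyAdd(int a, int b);

HookEntry g_hooks[] = {
    { "Add", (GenericFunc)MyAdd, NULL },
};

/* Stand-in for one IAT slot: callers reach "Add" through this pointer. */
GenericFunc g_iatSlot_Add = (GenericFunc)RealAdd;

GenericFunc GetPreviousFunctionAddress(size_t index)
{
    assert(index < sizeof(g_hooks) / sizeof(g_hooks[0]));
    return g_hooks[index].pPrevFunc;
}

#define CallFunction(X) \
    ((Signature_##X)GetPreviousFunctionAddress(Index_##X))

int g_callCount = 0;

/* The hooking function: records the call, then forwards to the original. */
int MyAdd(int a, int b)
{
    ++g_callCount;
    return CallFunction(Add)(a, b);
}

/* Saves the current slot content in pPrevFunc and patches the slot. */
void InstallHook(GenericFunc *slot, HookEntry *hook)
{
    assert(slot != NULL && hook != NULL);
    hook->pPrevFunc = *slot;
    *slot = hook->pNewFunc;
}

/* Calls "Add" the way client code would: through the (patched) slot. */
int CallThroughIat(int a, int b)
{
    return ((Signature_Add)g_iatSlot_Add)(a, b);
}
```

After InstallHook(&g_iatSlot_Add, &g_hooks[Index_Add]), a call through the slot reaches MyAdd first and still returns the result of RealAdd, which is the behaviour the real hooks rely on.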
Okay, we can now call the original function, so let's start the real work.
The communication process
So, how do we inform the server that the "hooked" application is currently performing a monitored action? The simple answer is to create a structure holding the needed information and send it to the server through a WM_COPYDATA message. The needed information is application-dependent, and in our case, it consists of:
- The current thread handle (who is calling this function).
- The current thread ID (the handle is not enough, see below).
- The manipulated object ID/handle (what are we touching).
- An additional argument (if any).
- The current command (what function are we calling).
- The object name, if any (useful for debugging only).
- The call stack (required to find where the call came from).
- The current timestamp (needed to check for deadlocks).
The structure is then defined as:
typedef struct
{
    void *       lAddress;
    unsigned int lFlags;
} StackPointer;

typedef struct
{
    HANDLE        hObjectID;
    HANDLE        hThread;
    DWORD         dwThreadID;
    unsigned int  lState;
    HANDLE        hFlag;
    Commands      Command;
    char          sName[256];
    LARGE_INTEGER llTimestamp;
    StackPointer  xPointers[10];
} CommunicationObject;
A hooking function will then look like:
THREADSPY_API HANDLE WINAPI MyCreateThread(LPSECURITY_ATTRIBUTES lpThreadAttributes,
    DWORD dwStackSize, LPTHREAD_START_ROUTINE lpStartAddress,
    LPVOID lpParameter, DWORD dwCreationFlags, LPDWORD lpThreadId)
{
    DWORD dwID;
    // Call the original CreateThread, always capturing the new thread ID
    HANDLE hHandle = CallFunction(CreateThread)(lpThreadAttributes,
        dwStackSize, lpStartAddress, lpParameter, dwCreationFlags, &dwID);
    if (hHandle != NULL)
    {
        CommunicationObject xObj;
        memset(&xObj, 0, sizeof(xObj));
        xObj.hThread = GetTrueCurrentThread();   // real handle of the calling thread
        xObj.hObjectID = hHandle;                // the thread just created
        xObj.lState = dwID;                      // ID of the new thread
        xObj.dwThreadID = GetCurrentThreadId();  // ID of the calling thread
        if (lpThreadId != NULL)
        {
            // Give the caller the ID it asked for
            *lpThreadId = dwID;
        }
        xObj.Command = CmdCreateThread;
        Communicate(&xObj, sizeof(xObj));
        if (dwCreationFlags & CREATE_SUSPENDED)
        {
            // A thread created suspended is reported as an immediate suspend
            xObj.lState = 1;
            xObj.Command = CmdSuspendThread;
            CommunicateWithoutTime(&xObj, sizeof(xObj));
        }
        // Release the real handle obtained above
        if (xObj.hThread != NULL) CallFunction(CloseHandle)(xObj.hThread);
    }
    return hHandle;
}
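The Communicate and CommunicateWithoutTime helpers used above are not shown in this part. The following is only a guess at their shape, written as portable C so it can be compiled anywhere. Everything here is an assumption: the structure is a reduced version of CommunicationObject, CopyData merely mirrors the three fields of the Win32 COPYDATASTRUCT (dwData, cbData, lpData), the Transport callback stands in for SendMessage(hServerWnd, WM_COPYDATA, ...), clock() stands in for a high-resolution timer, and the extra parameters and the COMM_TAG value are hypothetical. The assumption that CommunicateWithoutTime simply skips the timestamp is inferred from its name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Reduced, hypothetical version of the article's CommunicationObject. */
typedef struct {
    unsigned long dwThreadID;
    int           Command;
    long long     llTimestamp;
    char          sName[256];
} CommunicationObject;

/* Mirrors the fields of Win32's COPYDATASTRUCT. */
typedef struct {
    unsigned long dwData; /* application-defined tag   */
    unsigned long cbData; /* payload size in bytes     */
    void         *lpData; /* pointer to the payload    */
} CopyData;

/* Stand-in for sending WM_COPYDATA; returns nonzero if the message
   was processed by the receiver. */
typedef int (*Transport)(const CopyData *msg, void *context);

#define COMM_TAG 0x54535059UL /* hypothetical tag identifying our messages */

/* Sends the object as-is, leaving llTimestamp untouched. */
int CommunicateWithoutTime(CommunicationObject *obj, size_t size,
                           Transport send, void *context)
{
    CopyData msg;
    assert(obj != NULL && send != NULL);
    msg.dwData = COMM_TAG;
    msg.cbData = (unsigned long)size;
    msg.lpData = obj;
    return send(&msg, context);
}

/* Stamps the object with the current time, then sends it. */
int Communicate(CommunicationObject *obj, size_t size,
                Transport send, void *context)
{
    assert(obj != NULL);
    obj->llTimestamp = (long long)clock();
    return CommunicateWithoutTime(obj, size, send, context);
}

/* Example receiver: validates the message and copies the payload out,
   since WM_COPYDATA data is only valid while the message is handled. */
int CaptureTransport(const CopyData *msg, void *context)
{
    if (msg->dwData != COMM_TAG || msg->cbData != sizeof(CommunicationObject))
        return 0;
    memcpy(context, msg->lpData, msg->cbData);
    return 1;
}
```

With this split, the CREATE_SUSPENDED case in the hook above would report the suspend with the same timestamp as the creation, so the server sees both events as simultaneous.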
The tricky part
Okay, now we simply have to fill in the structure. It looks easy, but it is not, because of a few shortcomings in the Win32 API. For example, the GetCurrentThread function returns a special pseudo-handle (CURRENT_THREAD_HANDLE) that only means "the calling thread". This information is, of course, not useful here. This comes from how Windows treats a HANDLE: the same kernel object can have several handles referring to it, in different memory spaces. So a HANDLE cannot uniquely identify a thread; we need its ID.
While it is easy to store the thread handle and ID together in a structure in any program, there is no guarantee that the debuggee program keeps such a mapping. That's why we need a way to get the thread ID from its HANDLE. For example, when a thread calls TerminateThread to kill another thread, it only passes the killed thread's handle, not its ID. Without the ID, the server would never be able to tell which thread was killed (or it would need some kind of matching algorithm). Windows Server 2003 provides a function called GetThreadId to get the thread ID, but because it is missing from older systems such as Windows 2000 and XP, it is not useful here.
The solution to this issue is to use the undocumented NtQueryInformationThread function from NTDLL.DLL (NTDLL.DLL is mapped into every process's memory), as the Visual Studio debugger does. This function can return a CLIENT_ID structure with the thread ID in it. For more information, please see the Undocumented NT Internal.
Now that we can identify the object being manipulated, we need to get the stack trace. This can be done with the StackWalk function from the Win32 API. Usually, this function is used by suspending the debuggee thread, retrieving its context, and then resuming the thread. As we don't want to suspend the thread (because doing so would change the scheduling order while being debugged), we need to fill in a context structure ourselves. The trick is to read the EIP register before calling StackWalk, which can be done easily with a few ASM instructions like:
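CONTEXT c;
_EnterCriticalSection(&mhSection);
__asm
{
    call GetEIP         ; pushes the address of the next instruction
GetEIP:
    pop eax             ; eax now holds the current EIP
    mov c.Eip, eax
    mov c.Ebp, ebp      ; current frame pointer
};
_LeaveCriticalSection(&mhSection);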
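To see why an instruction pointer and a frame pointer are enough to seed a stack walk, the sketch below walks a chain of frames by hand. This is not StackWalk and not the article's code: it is a simplified model that assumes every function keeps a standard frame (saved caller frame pointer followed by the return address), and it runs on a synthetic Frame chain rather than on a real stack. Frame and WalkFrames are hypothetical names; StackPointer and the limit of 10 entries come from the structure shown earlier.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FRAMES 10 /* matches StackPointer xPointers[10] */

typedef struct {
    void        *lAddress;
    unsigned int lFlags;
} StackPointer;

/* Layout of a standard x86 frame: the saved caller frame pointer sits at
   [EBP], and the return address immediately after it. */
typedef struct Frame {
    struct Frame *pCallerFrame;
    void         *pReturnAddress;
} Frame;

/* Fills 'out' with the current code address followed by the return
   addresses found along the frame chain. Returns the number of entries. */
size_t WalkFrames(void *pc, const Frame *frame, StackPointer out[MAX_FRAMES])
{
    size_t count = 0;
    assert(out != NULL);

    /* First entry: where we are right now (the EIP value). */
    out[count].lAddress = pc;
    out[count].lFlags = 0;
    ++count;

    while (frame != NULL && count < MAX_FRAMES) {
        if (frame->pReturnAddress == NULL)
            break; /* end of the chain */
        out[count].lAddress = frame->pReturnAddress;
        out[count].lFlags = 0;
        ++count;

        /* The stack grows downwards, so caller frames live at higher
           addresses. Anything else means a corrupt or looping chain. */
        if (frame->pCallerFrame != NULL && frame->pCallerFrame <= frame)
            break;
        frame = frame->pCallerFrame;
    }
    return count;
}
```

The real StackWalk function handles much more than this (frames without a frame pointer, for instance), which is why the article relies on it and only fills in the starting registers by hand.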
The last trick needed is to detect when a thread has stopped. Because the injected DLL cannot create a thread without disturbing the program, we supply a function called CheckRunningThread that has the same signature as a thread start routine: it takes a thread handle valid in the client's memory space and returns a DWORD for the thread state. Using the same method as for injecting the DLL (thanks to CreateRemoteThread), the server can stop the process being debugged, create a remote thread starting at the CheckRunningThread function, and read the thread status before resuming the debuggee. This way, the server can check when a thread finishes and detect never-released objects.
To conclude
This article gives answers to a few things that are supposedly impossible (according to MSDN), such as:
- Get a thread ID from its thread handle (Google around, and you'll see it is a real issue).
- Get the stack trace of a running thread (again it is not a usual practice).
- Get a real thread handle instead of the default generic value (using DuplicateHandle).
- Detect when a thread in a remote process has finished.
- Communicate with a server about any action.
- Hook any Win32 API function.
The drawback is that it requires an NT-based kernel (such as XP, 2K and 2K3), but I'm sure this is not an issue nowadays.
That's it for the client part. I will provide the source code for both the server and the client in the next part (Part III). We will then see how to get the entire log of synchronization function calls in a process, and how to map the call stack values to function names.