Introduction
There are numerous articles on the Web that explain how to hook API functions inside a remote process. So why write another one?
This article is the first part (I) of a series about a tool I'm working on. The tool won't be too complex, so there is no need for a huge class hierarchy like in almost all other implementations, nor for hard-coded assembly. It is just a technical preview of what can be achieved, with QUICK and DIRTY "put your functions here" blocks.
Before you read
Okay, to be clear: when you open the project and look at the code, you'll see it is not very polished. It doesn't compile without warnings, and there are only a few comments (compared to production code). However, it compiles and runs cleanly on WinXP and 2003 platforms (and should also run under Win2K, but I didn't check that).
Background
I used to develop software using DirectShow. Sadly, I ran into a problem with a simple synchronization object that deadlocked the application. I still think there are two things a developer should fear.
- The first one is "memory leaks". Memory leaks are most of the time due to an oversight. They are very well covered because they can be corrected easily (even automatically).
- The other one is "error handling". Error handling seems very simple at first, but most of the time there is a "BAD CASE" that causes the "BAD BUG".
Handling errors badly in software leads to crashes or inconsistent states. To compare with physical states: it is easy to predict and create a system in its stable state. However, how the system reaches or leaves this stable state is hard to predict, because errors, thread scheduling order, memory usage and other dark phenomena might change the parameters. Recovering from an unstable state is very hard.
I can't give a global solution to this (if anyone could, please mail me). So let's see the usual solutions:
- Error return: Never assume that a function will always succeed. There are numerous articles about these issues.
- Thread: This is the subject of the whole project.
- Memory usage: Monitoring the allocated memory by overriding the malloc/free/new/delete functions/operators is sufficient most of the time.
- IO states: You'll need to establish a complete graph of states to handle all cases.
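The memory-monitoring idea in the list above can be sketched in a few lines. This is only an illustration, not code from this project: the counter g_liveAllocations and the helper liveAllocations() are names made up for the example, and only the plain (non-aligned) new/delete operators are replaced.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counter (not from the project): blocks obtained through
// operator new that have not been released yet.
static std::atomic<long> g_liveAllocations{0};

// Replacement for the global operator new: forward to malloc, then count.
void* operator new(std::size_t size) {
    void* p = std::malloc(size ? size : 1);
    if (!p) throw std::bad_alloc();
    ++g_liveAllocations;
    return p;
}

// Replacement for the global operator delete: uncount, then forward to free.
void operator delete(void* p) noexcept {
    if (p) {
        --g_liveAllocations;
        std::free(p);
    }
}

// The sized form (C++14) is replaced too, so both paths are counted.
void operator delete(void* p, std::size_t) noexcept {
    operator delete(p);
}

// Number of blocks currently allocated and not yet freed.
long liveAllocations() { return g_liveAllocations.load(); }
```

Reading liveAllocations() before and after a piece of code gives a rough idea of whether that code leaks. The default new[] and delete[] forward to the operators above, so arrays are counted as well.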
Threads are one of the "strangest" things. While it is possible to simulate almost all memory conditions, IO states and error returns, it is not possible to simulate the execution order of threads.
(uh, that's long!)
So to sum up, this project focuses on only one problem: data synchronisation in a multithreaded environment. This article explains a technique to monitor calls to system functions. The aim of the overall project is to build a deadlock detector. A deadlock occurs when a thread, while holding the protection of one block of data (A), tries to access another protected block of data (B), and that other block (B) is held by another thread which is itself trying to access the first block (A).
To achieve this goal, let's split the project into small steps:
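To make the deadlock definition above concrete, here is a minimal sketch of the "who waits for whom" chain. It is not the detector this series builds: the WaitForGraph type and its members are hypothetical names, and the example assumes each thread waits for at most one lock and each lock has at most one owner.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical bookkeeping, for illustration only.
struct WaitForGraph {
    std::map<std::string, int> ownerOfLock; // lock name -> id of the owning thread
    std::map<int, std::string> waitingFor;  // thread id -> lock it is blocked on

    // Follows the chain "thread -> lock it waits for -> owner of that lock".
    // If the chain comes back to a thread already visited, the start thread is
    // part of a cycle (or blocked behind one) and can never make progress.
    bool isDeadlocked(int startThread) const {
        std::set<int> visited;
        int current = startThread;
        while (visited.insert(current).second) {
            auto wait = waitingFor.find(current);
            if (wait == waitingFor.end()) return false;   // this thread is not blocked
            auto owner = ownerOfLock.find(wait->second);
            if (owner == ownerOfLock.end()) return false; // the awaited lock is free
            current = owner->second;
        }
        return true;
    }
};
```

With thread 1 owning A and waiting for B, and thread 2 owning B and waiting for A, isDeadlocked(1) and isDeadlocked(2) both report a deadlock. As long as only one of the two threads is waiting, neither call does.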
- Part 1. Hooking system functions.
- Part 2. Intercepting all calls to synchronization functions.
- Part 3. Building the logic for deadlock detection.
- Part 4. [Cosmetic] Using the map file + stack address to fetch the source code[/Cosmetic].
Part 1. Hooking system functions
I highly recommend reading the articles "Process-wide API spying - an ultimate hack" by Anton Bassov and "API Hooking system" by Ivo Ivanov.
I won't dig into an n-th explanation of how to inject your code into a process. I will, however, explain what I chose and how I made it work for me. The idea is to inject a DLL into a remote process so that the DLL performs the function hijacking. In this project the DLL is called ThreadSpy.DLL. When the DLL is loaded, its DllMain function is called with the DLL_PROCESS_ATTACH reason. In this case, it parses the HookStruct array (defined in ThreadSpy.cpp) and replaces each given Win32 API function with a DLL-supplied one.
The hook functions are defined in the Hooked.cpp source file. The import table of each already loaded module (like Kernel32.DLL, USER32.DLL, GDI32.DLL, etc.) is then updated to replace the original function address with the address of the supplied function.
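The table-patching step described above can be sketched with a portable simulation. This is not the code from ThreadSpy.cpp: HookEntry, ImportSlot and patchImports are hypothetical names, the real HookStruct layout may differ, and a plain vector stands in for a module's import table so that the example runs anywhere.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

using FuncPtr = void (*)();

// Hypothetical hook description; the real HookStruct may look different.
struct HookEntry {
    const char* functionName; // imported function to replace, e.g. "TextOutA"
    FuncPtr     replacement;  // DLL-supplied function
    FuncPtr     original;     // filled in when the hook is installed
};

// Simplified stand-in for one slot of a module's import table.
struct ImportSlot {
    const char* functionName;
    FuncPtr     address;
};

// Walks one (simulated) import table and redirects every slot named in the
// hook array. Returns the number of slots that were patched.
std::size_t patchImports(std::vector<ImportSlot>& imports,
                         HookEntry* hooks, std::size_t hookCount) {
    std::size_t patched = 0;
    for (ImportSlot& slot : imports) {
        for (std::size_t i = 0; i < hookCount; ++i) {
            if (std::strcmp(slot.functionName, hooks[i].functionName) == 0 &&
                slot.address != hooks[i].replacement) {
                hooks[i].original = slot.address;    // kept so the hook can call through
                slot.address = hooks[i].replacement; // redirect the import
                ++patched;
            }
        }
    }
    return patched;
}
```

In a real PE image, the slots live in the module's import address table, and the memory page usually has to be made writable (for example with VirtualProtect) before a slot is overwritten. Keeping the original address is what lets a hook function forward the call once it has done its own work.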
A server can then inject this DLL into a running process or spawn its own. The server lives in the ThreadDLD project and is simply a WTL application that injects the DLL into the hooked process and receives messages from it. I've chosen to create the monitored process with the CreateProcess function and the CREATE_SUSPENDED flag. This way, I can easily inject my DLL before any other DLL is loaded (in fact, after the vital ones like NTDLL and Kernel32). The server then resumes the thread and is ready to collect the messages from the DLL.
In order to catch any DLL newly loaded into the hooked process, it is necessary to intercept the LoadLibrary and GetProcAddress functions. That's why those functions are always hooked.
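Why GetProcAddress has to be hooked is easier to see in code: a module that resolves a function by name at run time never goes through its import table, so the resolver itself must hand out the replacement. The sketch below is a portable simulation under that assumption. NamedHook, Resolver and resolveWithHooks are hypothetical names, and the Resolver type only stands in for the real GetProcAddress.

```cpp
#include <cstddef>
#include <cstring>

using FuncPtr = void (*)();

// Hypothetical hook table entry: a function name and its replacement.
struct NamedHook {
    const char* functionName;
    FuncPtr     replacement;
};

// Shape of the resolver being wrapped; a stand-in for the real GetProcAddress.
using Resolver = FuncPtr (*)(const char* functionName);

// Replacement resolver: returns our hook for hooked names and defers to the
// original resolver for everything else.
FuncPtr resolveWithHooks(const char* functionName, Resolver originalResolver,
                         const NamedHook* hooks, std::size_t hookCount) {
    for (std::size_t i = 0; i < hookCount; ++i) {
        if (std::strcmp(functionName, hooks[i].functionName) == 0) {
            return hooks[i].replacement;
        }
    }
    return originalResolver(functionName);
}
```

The real GetProcAddress also accepts an ordinal in place of a name, so an actual hook has to check for that case before comparing strings. A hooked LoadLibrary plays the complementary role: after the new module is loaded, its import table can be patched like the tables of the modules that were already present.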
I won't discuss how to send a message to the server in this article, which is why I removed that code from the demo and source. With this demo, you can easily implement your own API hooking by changing the functions in ThreadSpy.{h,cpp} and in Hooked.cpp. For example, you can monitor who is trying to create a file (and reject the creation if the credentials are not high enough) by adding a MyCreateFile function.
In this example, I've chosen to hook TextOutA and TextOutW so that the effect is visually obvious.
Credits
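As I said earlier, please read:
- "Process-wide API spying - an ultimate hack" by Anton Bassov
- "API Hooking system" by Ivo Ivanov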
Future...
In the second part of the project, I will try to explain how to intercept all the functions required to monitor thread creation/destruction and synchronisation objects, and how to communicate this information to the server. This article is more of a prologue to the real article. The second part is here: Thread deadlock detection (part II)