Introduction
After publishing my last article ([2]) explaining how to emulate some missing Windows functions
used for remote code execution, the next logical step was to use these functions as a framework for implementing a library that allows easy remote code injection.
Remote code injection is the method that permits executing code within the address space of a process other than the current one. Because the architecture of Windows
isolates each process to protect them against memory overwrites and other bugs in applications, injecting code into a remote process is not straightforward.
This library implements functions that allow direct remote code injection, DLL remote injection and remote subclassing for Win32 processes (GUI and CUI)
and NT native processes. Don't expect to find any innovative code as this library is mainly based on the techniques described by Robert Kuster in his article
"Three ways to inject your code into another process" ([1]). Nevertheless I hope that you'll find the library
useful and use it in your projects.
Remote SEH (Structured Exception Handling)
All remote code execution is protected by SEH to prevent any exception from crashing the remote process. The SEH code you normally find in a C/C++ application looks like the following:
__try
{
}
__except(filter-expression)
{
}
You cannot use this code in remote code because this is the compiler implementation of SEH and internally it calls standard library
functions (__except_handler3) that reside in the current process. You need to use system-level SEH ([6]).
System-level SEH is implemented as a per-thread linked list of callback exception handler functions. A pointer to the beginning of this list can be retrieved
from the first DWORD of the TIB (Thread Information Block). The FS segment register always points to the current TIB.
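As a rough mental model only, the handler chain can be pictured in portable C. Everything in this sketch is a stand-in: Registration, chain_head (playing the role of fs:[0]), install(), uninstall() and dispatch() are hypothetical names, and the real dispatcher lives inside the operating system.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of an EXCEPTION_REGISTRATION record: two pointer-sized
   fields, the previous record and the handler callback. */
typedef enum { CONTINUE_EXECUTION, CONTINUE_SEARCH } Disposition;

typedef struct Registration {
    struct Registration *prev;              /* next-older record in the chain */
    Disposition (*handler)(unsigned code);  /* callback for this frame */
} Registration;

/* Stand-in for the first DWORD of the TIB (fs:[0]). */
static Registration *chain_head = NULL;

/* Models: push handler / push fs:[0] / mov fs:[0], esp */
static void install(Registration *r, Disposition (*h)(unsigned)) {
    assert(r != NULL && h != NULL);
    r->handler = h;
    r->prev = chain_head;
    chain_head = r;
}

/* Models: pop fs:[0] / add esp, 4 */
static void uninstall(void) {
    assert(chain_head != NULL);
    chain_head = chain_head->prev;
}

/* Walks from the newest record to the oldest until a handler accepts the
   exception. Returns 1 if it was handled, 0 if no handler took it. */
static int dispatch(unsigned code) {
    for (Registration *r = chain_head; r != NULL; r = r->prev)
        if (r->handler(code) == CONTINUE_EXECUTION)
            return 1;
    return 0;
}

/* Two trivial handlers used to exercise the chain. */
static Disposition decline(unsigned code) { (void)code; return CONTINUE_SEARCH; }
static Disposition accept_all(unsigned code) { (void)code; return CONTINUE_EXECUTION; }
```

The point of the model is the ordering: the most recently installed record is consulted first, and a handler that declines simply passes the exception on to the record installed before it.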
To implement SEH all that is needed is to add an exception handler to the linked list. In the simplest form this can be accomplished with the following code:
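push addr _exception_handler   ; handler field of the new record
push dword ptr fs:[0]          ; prev field: current head of the chain
mov fs:[0], esp                ; the new record becomes the head
; ... code protected by the handler goes here ...
pop dword ptr fs:[0]           ; restore the previous head
add esp, 4                     ; discard the handler field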
Every time an exception occurs in the try code block, the operating system calls the _exception_handler routine.
In the simplest form, only two DWORDs (which make up an EXCEPTION_REGISTRATION structure) must be pushed on the stack.
Of course nothing prevents us from adding additional data fields to this structure (VC, for example, pushes an extended EXCEPTION_REGISTRATION structure
containing five fields). In my implementation, I'm adding two fields to the standard SEH frame: the value of the EBP register
and the address where the execution should resume after the exception occurs. The final code will look like this (you'll notice that the code is written in assembly.
I used assembly for two reasons: assembly permits greater control of the generated code and only in assembly is it possible to access the FS register):
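push ebp                        ; SafeEBP: EBP to restore after an exception
push addr _resume_at_safe_place ; SafeExit: where execution resumes
push addr _exception_handler    ; handler field
push dword ptr fs:[0]           ; prev field
mov fs:[0], esp                 ; install the extended frame
; ... protected code goes here ...
_resume_at_safe_place:
pop dword ptr fs:[0]            ; unlink the frame
add esp, 3*4                    ; discard handler, SafeExit and SafeEBP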
EXCEPTION_DISPOSITION __cdecl _exception_handler(
struct _EXCEPTION_RECORD *ExceptionRecord,
struct _EXTENDED_EXCEPTION_REGISTRATION *EstablisherFrame,
struct _CONTEXT *ContextRecord,
void *DispatcherContext)
{
ContextRecord->cx_Eax = ExceptionRecord->ExceptionCode;
ContextRecord->cx_Eip = EstablisherFrame->SafeExit;
ContextRecord->cx_Ebp = EstablisherFrame->SafeEBP;
ContextRecord->cx_Esp = EstablisherFrame;
return ExceptionContinueExecution;
}
The _exception_handler routine restores the EBP register, sets the EAX register to the exception code, and resumes execution
at _resume_at_safe_place. The complete source code can be found in the file "Stub.asm".
GetProcessInfo()
The GetProcessInfo() function returns valuable information about a process, needed to decide what type of injection can be performed in this process.
The following information is returned:
- OS family: Windows 9x (95, 98, Me) or Windows NT (3, 4, 2000, XP, Vista, 7)
- Process is invalid: DOS, 16-bit, system, other
- Process is being debugged
- Process has not yet finished its initialization
- Protected process
OS family
This information is necessary because the injection algorithms are different for the Windows 9x (95, 98, Me) and NT (3, 4, 2000, XP, Vista, 7) families.
The information is returned directly by a call to GetVersionEx():
OSVERSIONINFO osvi;
osvi.dwOSVersionInfoSize = sizeof(OSVERSIONINFO);
GetVersionEx(&osvi);
fWin9x = (osvi.dwPlatformId == VER_PLATFORM_WIN32_WINDOWS);
fWinNT = (osvi.dwPlatformId == VER_PLATFORM_WIN32_NT);
Invalid process
NT
An NT process is considered invalid if its exit code is not equal to 259 (STILL_ACTIVE, hex 0x103) or if it doesn't have a PEB
(Process Environment Block) (i.e., a system process).
PROCESS_BASIC_INFORMATION pbi;
NtQueryInformationProcess(hProcess,
ProcessBasicInformation,
&pbi,
sizeof(pbi),
NULL);
fINVALID = ((pbi.ExitStatus != 0x103) ||
(pbi.PebBaseAddress == NULL));
9x
A Win9x process is invalid if its exit code is not 259 (hex 0x103), or if it is a DOS process, a 16-bit process, or in a termination state.
#define fINVALIDPROCFLAGS (fTerminated | fTerminating | \
fNearlyTerminating | fDosProcess | fWin16Process)
PDB *pPDB = GetPDB(dwPID);
fINVALID = ((pPDB->TerminationStatus != 0x103) ||
(pPDB->Flags & fINVALIDPROCFLAGS));
Process is being debugged
NT
If either the ProcessDebugPort or the PEB BeingDebugged field is non-zero then the NT process is being debugged.
PROCESS_BASIC_INFORMATION pbi;
BOOL DebugPort;
PEB_NT PEB, *pPEB;
NtQueryInformationProcess(hProcess,
ProcessDebugPort,
&DebugPort,
sizeof(DebugPort),
NULL);
NtQueryInformationProcess(hProcess,
ProcessBasicInformation,
&pbi,
sizeof(pbi),
NULL);
pPEB = pbi.PebBaseAddress;
ReadProcessMemory(hProcess, pPEB, &PEB, sizeof(PEB), NULL);
fDEBUGGED = DebugPort || PEB.BeingDebugged;
9x
If the PDB (Process Database) Debug Context pointer is non-NULL or the fDebugSingle bit of the PDB flags is set then the Win9x process is being debugged.
PDB *pPDB = GetPDB(dwPID);
fDEBUGGED = ((pPDB->DebuggeeCB != NULL) ||
(pPDB->Flags & fDebugSingle));
Process is not initialized
NT
If the LdrData or LoaderLock fields of the PEB are NULL then the NT process is not initialized.
Both fields are set by the NT loader user-mode APC routine LdrpInitialize() while initializing the process.
fNOTINITIALIZED = (PEB.LdrData == NULL || PEB.LoaderLock == NULL);
9x
A Win9x process is initialized only if the last DWORD of the main thread stack is below 2 GB (0x80000000) ([3]).
PDB *pPDB = GetPDB(dwPID);
DWORD *pThreadHead = pPDB->ThreadList;
THREADLIST *pThreadNode = *pThreadHead;
TDB *pTDB = pThreadNode->pTDB;
void *pvStackUserTop = pTDB->tib.pvStackUserTop;
pvStackUserTop = (DWORD *)((DWORD)pvStackUserTop - sizeof(DWORD));
DWORD StackUserTopContents;
ReadProcessMemory(hProcess, pvStackUserTop, &StackUserTopContents,
sizeof(StackUserTopContents), NULL);
fNOTINITIALIZED = ((int)StackUserTopContents < 0);
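The final comparison in the listing above relies on a cast: reinterpreted as a signed 32-bit integer, the value is negative exactly when it is 0x80000000 or higher. A minimal portable sketch of the same test, written as an explicit unsigned comparison (the function name is hypothetical and not part of the library):

```c
#include <assert.h>
#include <stdint.h>

/* Returns nonzero when the value read from the top of the main thread's
   stack is at or above the 2 GB boundary, i.e. when the process is treated
   as not yet initialized. The unsigned comparison avoids depending on how
   a signed cast behaves on a given compiler. */
static int stack_top_marks_uninitialized(uint32_t stack_top_contents) {
    return stack_top_contents >= UINT32_C(0x80000000);
}
```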
Protected process
Starting with Windows Vista, a new type of process, called a protected process, was introduced.
In a protected process the following operations cannot be performed: inject a thread, access the virtual memory,
debug the process, duplicate a handle or change the quota or working set. Therefore remote injection is not
possible in protected processes. Use the following code to detect a protected process:
HANDLE hProcess;
PROCESS_EXTENDED_BASIC_INFORMATION ExtendedBasicInformation;
hProcess = OpenProcess(PROCESS_QUERY_LIMITED_INFORMATION, FALSE, dwPID);
ExtendedBasicInformation.Size = sizeof(PROCESS_EXTENDED_BASIC_INFORMATION);
NtQueryInformationProcess(hProcess,
ProcessBasicInformation,
&ExtendedBasicInformation,
sizeof(ExtendedBasicInformation),
NULL);
fPROTECTED = ExtendedBasicInformation.IsProtectedProcess;
Subsystem
This is the type of subsystem the process uses for its user interface. It's the same as the Subsystem
field found in the PE Header of the file on disk (and of the Module Header in memory).
NT
In NT the subsystem type can be directly retrieved from the PEB ImageSubsystem field:
Subsystem = PEB.ImageSubsystem;
9x
The subsystem type can be retrieved from the module's header Subsystem field. To locate the module's header in memory we can use the Kernel32
GetModuleHandle() function or the MTEModTable. The pointer to the NT header is obtained from the pNTHdr field
of the IMTE (Internal Module Table Entry). The IMTE address is obtained from the MTEModTable using the PDB MTEIndex
field as an index ([4] chapter 3 details all these structures and explains the hack needed
to obtain the address of the MTEModTable from the Kernel32 GDIReallyCares() function).
#define GDIREALLYCARES_ORDINAL 23 // 0x17
HMODULE hKernel32 = GetModuleHandle("Kernel32.dll");
void *pGDIReallyCares = _GetProcAddress(hKernel32, GDIREALLYCARES_ORDINAL);
int GDIReallyCaresLength = GetProcLength(hKernel32, GDIREALLYCARES_ORDINAL);
BYTE *p = MemSearch(pGDIReallyCares, GDIReallyCaresLength, "\x8B\x0D", 2);
IMTE **pMTEModTable = (IMTE **)*(DWORD *)*(DWORD *)(p+2);
PDB *pPDB = GetPDB(dwPID);
IMTE *pIMTE = pMTEModTable[pPDB->MTEIndex];
PIMAGE_NT_HEADERS32 pNTHeader = pIMTE->pNTHdr;
Subsystem = pNTHeader->OptionalHeader.Subsystem;
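The listing above scans the body of GDIReallyCares() for the bytes 8B 0D, which on x86 encode "mov ecx, dword ptr [address]", and then reads the 32-bit address that follows the opcode. The sketch below shows that search-and-extract step on an ordinary byte buffer. The implementation of MemSearch() is not shown in this article, so mem_search() and operand_after_opcode() are hypothetical stand-ins written for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for MemSearch(): returns a pointer to the first
   occurrence of needle inside buf, or NULL if it is absent. */
static const uint8_t *mem_search(const uint8_t *buf, size_t len,
                                 const uint8_t *needle, size_t nlen) {
    assert(buf != NULL && needle != NULL);
    if (nlen == 0 || nlen > len)
        return NULL;
    for (size_t i = 0; i + nlen <= len; i++)
        if (memcmp(buf + i, needle, nlen) == 0)
            return buf + i;
    return NULL;
}

/* Reads the 32-bit little-endian operand that follows a two-byte opcode.
   The caller must make sure at least six bytes are readable at opcode. */
static uint32_t operand_after_opcode(const uint8_t *opcode) {
    return (uint32_t)opcode[2] | ((uint32_t)opcode[3] << 8) |
           ((uint32_t)opcode[4] << 16) | ((uint32_t)opcode[5] << 24);
}
```

A plain byte search like this can also match inside the operand of another instruction, so the result is only as trustworthy as the assumption that the pattern is unique within the scanned function.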
RemoteExecute()
The RemoteExecute() function executes code in the context of a remote process. It accepts 7 parameters:
- hProcess: Handle of the remote process.
- ProcessFlags: Returned by GetProcessInfo(). Can be zero.
- Function: Thread function that will be executed within the remote process context. The thread function is protected against exceptions by SEH.
- pData: Memory block that will be copied to the remote process address space. Can be NULL.
- Size: Size of the pData block. If zero is specified, pData is treated as a DWORD.
- dwTimeout: Timeout in milliseconds used in wait functions. Can be INFINITE.
- ExitCode: Pointer to a DWORD that will receive the remote code exit status.
The following steps are executed by RemoteExecute() (see [1]):
- If ProcessFlags is zero then call GetProcessInfo().
- Check if the function code is safe to be relocated (no calls or absolute addressing) and calculate its length. Note that this check is not 100% safe! You should write relocatable code and analyze the generated code.
- Allocate a remote memory block and copy the function code to it.
- If a data block is specified, allocate a remote memory block and copy the data to it.
- Allocate a remote memory block and copy the stub code to it (see file "Stub.asm"). The stub code will set an SEH frame and call the user thread function. The special native process exit is also handled by this code.
- According to the ProcessFlags, run the remote code using one of the available methods: CreateRemoteThread(), RtlCreateUserThread() or NtQueueApcThread().
- Wait for the remote code to finish using WaitForSingleObject(hThread) or by checking the Finished flag set by the stub code.
- If a data block was specified, read back the data from the remote memory block.
- Clean up and return the error code.
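The copy-in, run, copy-back part of these steps can be modelled inside a single process. This is only an illustration under loud assumptions: malloc() and memcpy() stand in for the remote allocation and the cross-process copies, the function code and the stub are not copied at all, and execute_on_copy(), Block and double_it() are hypothetical names rather than library functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned long (*ThreadFn)(void *data);

/* Local stand-in for the data-block handling in RemoteExecute().
   Returns 0 on success, -1 if the "remote" block cannot be allocated. */
static int execute_on_copy(ThreadFn fn, void *data, size_t size,
                           unsigned long *exit_code) {
    assert(fn != NULL && exit_code != NULL);
    void *remote = NULL;
    if (data != NULL && size > 0) {
        remote = malloc(size);        /* "allocate a remote memory block" */
        if (remote == NULL)
            return -1;
        memcpy(remote, data, size);   /* "copy the data to it" */
    }
    *exit_code = fn(remote);          /* "run the remote code" */
    if (remote != NULL) {
        memcpy(data, remote, size);   /* "read back the data" */
        free(remote);                 /* "clean up" */
    }
    return 0;
}

/* Example thread function: it only touches its own data block. */
typedef struct { int input; int output; } Block;

static unsigned long double_it(void *data) {
    Block *b = data;
    b->output = b->input * 2;
    return 7;
}
```

The model makes one property visible: the thread function works on a copy, so any result it produces reaches the caller only through the read-back step or the exit code.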
Depending on the ProcessFlags, a different remote code execution method must be used:
Win32 initialized process
Use the CreateRemoteThread() function to execute the remote code (because this function doesn't exist in Win9x it must
be emulated (see [2])). Starting with Windows Vista, CreateRemoteThread() will fail
if the target process is in a different session than the calling process. The solution to this limitation is to use the undocumented NtCreateThreadEx() function
on Windows Vista and 7 ([8]). Wait for the remote code to finish by calling WaitForSingleObject()
on the returned thread handle, and get the remote exit code by calling GetExitCodeThread().
Win32 non-initialized process
What you can do in a non-initialized process is very limited (because you cannot assume that the system internal structures are initialized, the DLLs are loaded, ...);
therefore you should be extremely careful when injecting code into this type of process. It's advisable to wait until the process finishes its initialization.
For GUI processes, this can be accomplished by using the WaitForInputIdle() function, but unfortunately there's no equivalent function for the other
types of processes. Another possible technique involves setting a breakpoint at the process entry point (this makes it possible to detect when the system part of the process
initialization has finished).
9x
Just set a bit in the CreateRemoteThread() dwCreationFlags parameter that makes this function internally prevent the THREAD_ATTACH
message from being sent before PROCESS_ATTACH (see [3]).
NT
The NtQueueApcThread() function is used to queue an APC routine (our remote code) on an existing remote thread. The APC routine will run as soon
as the thread enters an alertable state. We cannot use wait functions on a thread for which the APC was queued, and therefore to learn when the remote code has finished we poll
the Finished flag set by the remote stub code. We also cannot use GetExitCodeThread() to get the remote exit code (this would return
the "hijacked" thread exit status), so we always set the exit code to zero (of course we could save the exit status in a variable and read it later as we do with
the Finished flag).
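The polling described above could look roughly like the loop below. It is a sketch under assumptions: the read of the remote Finished flag is abstracted behind a callback (in the library this would presumably be a ReadProcessMemory() call), and wait_finished(), ReadFlagFn and flag_after_n_polls() are hypothetical names. INFINITE is not given special treatment here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback that reads the remote Finished flag. */
typedef int (*ReadFlagFn)(void *ctx);

/* Polls until the flag is set or the time budget is used up.
   Returns 1 when the flag was seen, 0 on timeout. sleep_ms may be NULL,
   in which case the loop only counts time without actually sleeping. */
static int wait_finished(ReadFlagFn read_flag, void *ctx,
                         unsigned timeout_ms, unsigned step_ms,
                         void (*sleep_ms)(unsigned)) {
    assert(read_flag != NULL && step_ms > 0);
    unsigned waited = 0;
    for (;;) {
        if (read_flag(ctx))
            return 1;
        if (waited >= timeout_ms)
            return 0;
        if (sleep_ms != NULL)
            sleep_ms(step_ms);
        waited += step_ms;
    }
}

/* Test double: reports "finished" once the counter behind ctx reaches 0. */
static int flag_after_n_polls(void *ctx) {
    int *remaining = ctx;
    if (*remaining == 0)
        return 1;
    (*remaining)--;
    return 0;
}
```

The flag is checked before the timeout on every pass, so a flag that becomes set during the last sleep interval is still reported as success.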
NT native process
To run the remote code in an NT native process the RtlCreateUserThread() function is used. WaitForSingleObject() and
GetExitCodeThread() can be used on the returned thread handle. Note that native remote code requires a different thread exit sequence.
This is handled by the remote stub code. The code used for the native exit is the equivalent of the Kernel32 ExitThread() function, but for native processes:
- Call LdrShutdownThread() to notify all DLLs on thread exit.
- Release the thread stack by calling NtFreeVirtualMemory(). Note that before releasing the stack we must switch to a temporary stack. The UserReserved area within the TEB is used for this purpose.
- Terminate the thread by calling NtTerminateThread().
InjectDll()
The InjectDll() function loads a DLL into the address space of a remote process. It accepts 5 parameters:
- hProcess: Handle of the remote process.
- ProcessFlags: Returned by GetProcessInfo(). Can be zero.
- szDllPath: Path of the DLL to load. ANSI/Unicode strings can be passed to InjectDllA()/InjectDllW().
- dwTimeout: Timeout in milliseconds used in wait functions. Can be INFINITE.
- hRemoteDll: Pointer to an HINSTANCE variable that will receive the loaded DLL handle.
InjectDll() just initializes the data block needed by the remote code and uses RemoteExecute() to remote execute
the function RemoteInjectDll().
DWORD WINAPI RemoteInjectDll(RDATADLL *pData)
{
return (pData->hRemoteDll = pData->LoadLibrary(pData->szDll));
}
RemoteInjectDll() runs in the address space of the remote process and calls LoadLibrary() to load the specified
DLL within the address space of the remote process. The handle of the loaded DLL is returned.
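Note that RemoteInjectDll() reaches LoadLibrary() through a pointer stored in the data block instead of calling it directly, which keeps the copied code free of direct calls and absolute addresses. The portable sketch below shows that pattern in isolation. DllBlock is a hypothetical stand-in for RDATADLL (whose real layout is in the library source and is not reproduced here), and fake_load() replaces LoadLibrary() so the example can run anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void *(*LoadFn)(const char *path);

/* Hypothetical portable stand-in for RDATADLL. */
typedef struct {
    LoadFn load;        /* filled in by the injector before the copy */
    void  *module;      /* result written by the remote function */
    char   path[260];   /* DLL path carried inside the block */
} DllBlock;

/* Same shape as RemoteInjectDll(): every call and every piece of data
   goes through the block pointer. */
static unsigned long inject_via_block(DllBlock *b) {
    b->module = b->load(b->path);
    return b->module != NULL;
}

/* Fills the block; returns 0 if the path does not fit, 1 otherwise. */
static int init_block(DllBlock *b, LoadFn load, const char *path) {
    assert(b != NULL && load != NULL && path != NULL);
    if (strlen(path) >= sizeof b->path)
        return 0;
    b->load = load;
    b->module = NULL;
    strcpy(b->path, path);
    return 1;
}

/* Fake loader used only to exercise the sketch. */
static int fake_module;
static void *fake_load(const char *path) {
    return path[0] != '\0' ? (void *)&fake_module : NULL;
}
```

Carrying the path inside the block, rather than as a pointer to a string in the injector, matters for the same reason: a pointer into the calling process would be meaningless in the remote one.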
EjectDll()
The EjectDll() function unloads a DLL from the address space of a remote process. It accepts 5 parameters:
- hProcess: Handle of the remote process.
- ProcessFlags: Returned by GetProcessInfo(). Can be zero.
- szDllPath: Path of the DLL to unload. ANSI/Unicode strings can be passed to EjectDllA()/EjectDllW(). Can be NULL.
- hRemoteDll: If szDllPath is NULL, the hRemoteDll parameter is used as the DLL handle.
- dwTimeout: Timeout in milliseconds used in wait functions. Can be INFINITE.
EjectDll() initializes the data block needed by the remote function and uses RemoteExecute() to remote execute
the function RemoteEjectDll().
DWORD WINAPI RemoteEjectDll(RDATADLL *pData)
{
if (pData->szDll[0] != '\0')
pData->hRemoteDll = pData->GetModuleHandle(pData->szDll);
do {
pData->Result = pData->FreeLibrary(pData->hRemoteDll);
} while (pData->Result);
return 0;
}
RemoteEjectDll() runs in the address space of the remote process and calls FreeLibrary() to unload the specified DLL.
FreeLibrary() is called as many times as necessary to decrease the reference count to zero. If the DLL name is specified,
GetModuleHandle() is used to retrieve the handle of the DLL needed by FreeLibrary().
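The effect of the do/while loop can be seen with a small model of a reference-counted module. Module and fake_free_library() are hypothetical stand-ins that only imitate the behaviour relied on above: the call succeeds while references remain and fails once the module is gone.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a loaded module with a loader reference count. */
typedef struct { int ref_count; } Module;

/* Imitates FreeLibrary(): returns 0 once nothing is left to release,
   otherwise drops one reference and returns 1. */
static int fake_free_library(Module *m) {
    if (m == NULL || m->ref_count == 0)
        return 0;
    m->ref_count--;
    return 1;
}

/* Same loop shape as RemoteEjectDll(); returns how many calls succeeded. */
static int eject(Module *m) {
    assert(m == NULL || m->ref_count >= 0);
    int freed = 0;
    int result;
    do {
        result = fake_free_library(m);
        freed += result;
    } while (result);
    return freed;
}
```

One consequence of this loop shape is that the last call always fails, so a failure on its own does not indicate an error; it is simply how the loop learns that the module is gone.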
StartRemoteSubclass()
The StartRemoteSubclass() function subclasses a remote window (i.e., changes a remote process window procedure). It accepts 2 parameters:
- rd: Pointer to an RDATA structure defined as:
typedef struct _RDATA {
    int Size;
    HANDLE hProcess;
    DWORD ProcessFlags;
    DWORD dwTimeout;
    HWND hWnd;
    struct _RDATA *pRDATA;
    WNDPROC pfnStubWndProc;
    USERWNDPROC pfnUserWndProc;
    WNDPROC pfnOldWndProc;
    LRESULT Result;
    SETWINDOWLONG pfnSetWindowLong;
    CALLWINDOWPROC pfnCallWindowProc;
} RDATA;
If you need to pass extra data to the new window procedure handler, it must be appended to the existing RDATA. Before calling StartRemoteSubclass(),
the following fields of the RDATA structure must be initialized: Size must contain the size of the RDATA structure plus any appended data,
hProcess must contain the handle of the remote process, and hWnd must contain the handle of the window to be subclassed. The extra fields
of the appended data should also be initialized at this point. All the remaining fields should be considered private and not used.
- WndProc: User window procedure that will handle the subclassed window messages. It's defined as:
typedef LRESULT (WINAPI* USERWNDPROC)(RDATA *, HWND, UINT, WPARAM, LPARAM);
Except for the first parameter (a pointer to the RDATA structure), the remaining parameters are the normal window handle, message type,
wParam and lParam found in any window procedure handler. The new window procedure handler will be called by Windows every time
a message to the window must be processed, therefore the function should be coded as a "normal" window procedure handler (with the switch(Msg) statement).
Please note that because this function will be executed in a remote process, it must follow the same rules as any remote code. Any message you do not handle
should be processed by the original window procedure handler; for this, the function must return FALSE. If you want to process a message yourself,
store the return value in the Result field of the RDATA structure and return TRUE from the function. This function
is protected from exceptions by a remote SEH frame.
StartRemoteSubclass() initializes the remaining RDATA fields and uses RemoteExecute() to remote execute the function
RemoteStartSubclass():
DWORD WINAPI RemoteStartSubclass(RDATA *pData)
{
return (pData->pfnOldWndProc =
pData->pfnSetWindowLong(pData->hWnd,
GWL_WNDPROC,
pData->pfnStubWndProc));
}
RemoteStartSubclass() runs in the address space of the remote process and calls SetWindowLong() with the parameter
GWL_WNDPROC to change the window procedure handler to a new window handler. This handler will be called by Windows every time a message
to the window must be processed. The new window procedure handler (StubWndProc() in the file "Stub.asm") sets an SEH frame and calls
UserWndProc(). If UserWndProc() returns FALSE, a call to CallWindowProc() allows the original window procedure to handle the message.
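The dispatch rule followed by the stub can be summarised in a few lines of portable C. This is a model rather than the code in "Stub.asm": the SEH frame is left out, the Windows types are replaced by plain integers, and SubclassData, stub_proc(), user_proc() and original_proc() are hypothetical names. The two message constants merely mirror the numeric values of WM_CLOSE and WM_PAINT.

```c
#include <assert.h>
#include <stddef.h>

/* Portable stand-ins for LRESULT and the window procedure types. */
typedef long Result;
typedef struct SubclassData SubclassData;
typedef int    (*UserProc)(SubclassData *, unsigned msg, long wparam, long lparam);
typedef Result (*OldProc)(unsigned msg, long wparam, long lparam);

struct SubclassData {
    UserProc user_proc;  /* plays the role of pfnUserWndProc */
    OldProc  old_proc;   /* plays the role of pfnOldWndProc */
    Result   result;     /* plays the role of the Result field */
};

/* A nonzero return from the user procedure means "handled, use result";
   zero means "forward the message to the original procedure". */
static Result stub_proc(SubclassData *d, unsigned msg, long wparam, long lparam) {
    assert(d != NULL && d->user_proc != NULL && d->old_proc != NULL);
    if (d->user_proc(d, msg, wparam, lparam))
        return d->result;
    return d->old_proc(msg, wparam, lparam);
}

enum { MSG_PAINT = 0x0F, MSG_CLOSE = 0x10 };

/* Example user procedure: swallows MSG_CLOSE, forwards everything else. */
static int user_proc(SubclassData *d, unsigned msg, long wparam, long lparam) {
    (void)wparam; (void)lparam;
    switch (msg) {
    case MSG_CLOSE:
        d->result = 0;
        return 1;
    default:
        return 0;
    }
}

/* Stand-in for the original window procedure. */
static Result original_proc(unsigned msg, long wparam, long lparam) {
    (void)wparam; (void)lparam;
    return 1000 + (Result)msg;
}
```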
StopRemoteSubclass()
The StopRemoteSubclass() function restores the remote process original window handler. It accepts one parameter:
- rd: This is the same RDATA structure passed to StartRemoteSubclass() and contains the needed data initialized by this function.
StopRemoteSubclass() releases the allocated memory and uses RemoteExecute() to remote execute the function RemoteStopSubclass():
DWORD WINAPI RemoteStopSubclass(RDATA *pData)
{
return (pData->pfnSetWindowLong(pData->hWnd,
GWL_WNDPROC, pData->pfnOldWndProc));
}
RemoteStopSubclass() runs in the address space of the remote process and calls SetWindowLong() with the parameter GWL_WNDPROC
to restore the original window procedure handler.
Demo
Finally, to demonstrate how to use the Injection Library exported functions, I wrote an application that lets you use all the injection methods on any
running process (if applicable!). The application just fills a listview control with all running processes, and according to the user choices, injects code,
a DLL, or subclasses a process window. From my tests, only the following processes couldn't be injected:
- Windows 9x: 16-bit processes (they are considered invalid processes).
- Windows NT: idle process (PID = 0), system process (PID = 4) and protected processes.
History
- September 27, 2005: version 1.0 - Windows 95 to Windows XP.
- November 1, 2011: version 2.0 - Updated for Vista, Windows 7.
References
[1] "Three ways to inject your code into another process" by Robert Kuster.
[2] "Remote Library" by António Feijão.
[3] "PrcHelp" by Radim Picha.
[4] "Windows 95 System Programming Secrets" by Matt Pietrek.
[5] "Windows NT/2000 Native API Reference" by Gary Nebbett.
[6] "A Crash Course on the Depths of Win32 Structured Exception Handling" by Matt Pietrek.
[7] "Enumerating Windows Processes" by Alex Fedotov.
[8] "Remote Thread Execution in System Process using NtCreateThreadEx for Vista & Windows 7" by SecurityXploded.
[9] "Process Hacker" by wj32.