Introduction
Remote code injection has always been a popular topic with dozens of articles written about the subject. One of the preferred techniques involves the following steps [4]:
- Allocate memory in the remote process using
VirtualAllocEx()
.
- Copy the code to the allocated remote memory using
WriteProcessMemory()
.
- Execute the remote code using
CreateRemoteThread()
.
The problem with this technique (as several remote injection tutorials point out) is that the required Windows functions don't exist on all Windows versions. The purpose of this library is to emulate the missing functions so that the same code can be used across all Windows versions.
Remote memory handling
The VirtualAllocEx()
function is used to allocate memory within the virtual address space of a specified process, and VirtualFreeEx()
is used to release the allocated memory. The difference between VirtualAllocEx()
/VirtualFreeEx()
and VirtualAlloc()
/VirtualFree()
is the first parameter (hProcess
) that allows memory to be allocated/released in any process, not only within the address space of the calling process. The Kernel32 functions VirtualAllocEx()
and VirtualFreeEx()
are only implemented on Windows NT 4.0 or higher. To emulate these functions in NT 3.51, we can use the undocumented NTDLL functions NtAllocateVirtualMemory()
and NtFreeVirtualMemory()
:
LPVOID VirtualAllocExNT3(HANDLE hProcess,
LPVOID lpAddress,
DWORD dwSize,
DWORD flAllocationType,
DWORD flProtect)
{
NTSTATUS Status = NtAllocateVirtualMemory(hProcess,
&lpAddress,
0,
&dwSize,
flAllocationType,
flProtect);
if (!NT_SUCCESS(Status))
{
SetLastError(RtlNtStatusToDosError(Status));
return NULL;
}
return lpAddress;
}
BOOL VirtualFreeExNT3(HANDLE hProcess,
LPVOID lpAddress,
DWORD dwSize,
DWORD dwFreeType)
{
if ((dwFreeType & MEM_RELEASE) && (dwSize != 0))
{
SetLastError(ERROR_INVALID_PARAMETER);
return FALSE;
}
NTSTATUS Status = NtFreeVirtualMemory(hProcess,
&lpAddress,
&dwSize,
dwFreeType);
if (!NT_SUCCESS(Status))
{
SetLastError(RtlNtStatusToDosError(Status));
return FALSE;
}
return TRUE;
}
In Windows 9x, the trick is to use VirtualAlloc()
to allocate memory in the shared area (between address 0x80000000 and 0xC0000000). Unlike Windows NT, in Win9x, the virtual memory area between 2GB and 3GB is always mapped into every process without the need to do explicit mapping. This means that if one process allocates memory within this region, other processes can access it directly using the pointer address. To allocate shared memory in Win9x using VirtualAlloc()
, the undocumented flag VA_SHARED
must be ORed with the flAllocationType
parameter ([1] Chapter 5). VirtualFree()
can be used to release the shared memory.
The undocumented flag VA_SHARED
is defined as:
#define VA_SHARED 0x8000000
LPVOID VirtualAllocEx9x(HANDLE hProcess,
LPVOID lpAddress,
DWORD dwSize,
DWORD flAllocationType,
DWORD flProtect)
{
#define VA_SHARED 0x8000000
return VirtualAlloc(lpAddress,
dwSize,
flAllocationType |
VA_SHARED,
flProtect);
}
BOOL VirtualFreeEx9x(HANDLE hProcess,
LPVOID lpAddress,
DWORD dwSize,
DWORD dwFreeType)
{
return VirtualFree(lpAddress,
dwSize,
dwFreeType);
}
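As a small sanity check on the idea above, the following sketch tests whether a 32-bit linear address falls inside the Win9x shared area, using only the bounds quoted in the text. The function name and the macro names are illustrative and are not part of the library.

```c
#include <stdint.h>

/* Bounds of the Win9x shared area as quoted in the text (end is exclusive). */
#define SHARED_AREA_START 0x80000000u
#define SHARED_AREA_END   0xC0000000u

/* Returns 1 if addr lies in the shared area, 0 otherwise.
   Hypothetical helper, useful for checking a pointer returned by
   VirtualAllocEx9x() before handing it to another process. */
int is_shared_area_address(uint32_t addr)
{
    return addr >= SHARED_AREA_START && addr < SHARED_AREA_END;
}
```

A pointer outside this range would only be valid in the process that allocated it.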
OpenThread
OpenThread()
returns a handle to a thread object from its thread identifier (TID
). It's a convenient way of converting a TID
to a thread handle. The Kernel32 function OpenThread()
is only implemented on Win 2000 or higher and Win ME. On Windows NT, the undocumented NTDLL function NtOpenThread()
can be used to emulate OpenThread()
:
HANDLE OpenThreadNT(DWORD dwDesiredAccess,
BOOL bInheritHandle,
DWORD dwThreadId)
{
OBJECT_ATTRIBUTES ObjectAttributes;
CLIENT_ID ClientId;
HANDLE hThread;
NTSTATUS Status;
InitializeObjectAttributes(&ObjectAttributes, NULL, 0, NULL, NULL);
if (bInheritHandle)
ObjectAttributes.Attributes = OBJ_INHERIT;
ClientId.UniqueProcess = NULL;
ClientId.UniqueThread = (HANDLE)dwThreadId;
Status = NtOpenThread(&hThread,
dwDesiredAccess,
&ObjectAttributes,
&ClientId);
if (!NT_SUCCESS(Status))
{
SetLastError(RtlNtStatusToDosError(Status));
return NULL;
}
return hThread;
}
In Win9x, Kernel32 OpenProcess()
can also be used to open a thread: instead of passing it a PID
(Process ID), a TID
(Thread ID) is passed to it. Inside OpenProcess()
, there's a check to verify if the passed Id
corresponds to a PID
and if not returns ERROR_INVALID_PARAMETER
, therefore a method to bypass this check is necessary to be able to use OpenProcess()
to open a thread object. There are two ways of doing this: skip the test inside OpenProcess()
(ATM method by Enrico Del Fante), or temporarily change the thread object type to process. (Each Kernel object -- process, thread, file, mutex, ... -- has a corresponding data structure that holds information about the object. These structures are undocumented, but some of them are described in [1]. The first field of each Kernel object struct
is always the object type: K32OBJ_PROCESS
, K32OBJ_THREAD
, K32OBJ_FILE
, K32OBJ_MUTEX
, ...). After the initial checks, OpenProcess()
just calls the Kernel32 internal function OpenHandle()
that returns a handle to an object -- that's why we can use OpenProcess()
to open both process/thread objects.
Method 1:
HANDLE OpenThread9x(DWORD dwDesiredAccess,
BOOL bInheritHandle,
DWORD dwThreadId)
{
HANDLE hThread;
HANDLE hKernel32 = GetModuleHandle("Kernel32.dll");
DWORD OpenProcessOrdinal = NameToOrdinal(hKernel32, "OpenProcess");
PVOID pOpenProcess = _GetProcAddress(hKernel32, OpenProcessOrdinal);
int OpenProcessLength = GetProcLength(hKernel32, OpenProcessOrdinal);
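/* Entry point just past the PID check: the "mov ecx, 0" inside OpenProcess() */
INTERNALOPENTHREAD InternalOpenThread = (INTERNALOPENTHREAD)MemSearch(pOpenProcess,
OpenProcessLength,
"\xB9\x00\x00\x00\x00",
5);
/* The TID is the TDB pointer XORed with the Obfuscator */
PTDB pTDB = (PTDB)(dwThreadId ^ dwObsfucator);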
__asm
{
mov eax, pTDB
push dwThreadId
push bInheritHandle
push dwDesiredAccess
call InternalOpenThread
mov hThread, eax
}
return hThread;
}
The internal Kernel32 OpenThread()
function must be called from assembly language because it expects register EAX
to contain the TDB
(Thread Database) address. For a detailed description of functions NameToOrdinal()
, _GetProcAddress()
and GetProcLength()
, see the file "GetProcAddress.c".
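The source does not show MemSearch(), so the following is only a guess at its behaviour based on how it is called above: it scans a byte range for a fixed pattern and returns the address of the first match. The name mem_search and its exact signature are assumptions. The lengths are passed explicitly because the patterns contain zero bytes, which rules out string functions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the library's MemSearch(): returns a pointer to
   the first occurrence of pattern[0..pattern_len) inside buf[0..buf_len),
   or NULL if the pattern does not occur. */
const unsigned char *mem_search(const unsigned char *buf, size_t buf_len,
                                const unsigned char *pattern, size_t pattern_len)
{
    size_t i;

    if (pattern_len == 0 || pattern_len > buf_len)
        return NULL;

    /* Stop early enough that the comparison never reads past the buffer. */
    for (i = 0; i + pattern_len <= buf_len; i++) {
        if (memcmp(buf + i, pattern, pattern_len) == 0)
            return buf + i;
    }
    return NULL;
}
```

Bounding the scan with the function length (as GetProcLength() does in the article) keeps the search from running into the next function.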
Method 2:
HANDLE OpenThread9x(DWORD dwDesiredAccess,
BOOL bInheritHandle,
DWORD dwThreadId)
{
PTDB pTDB = (PTDB)(dwThreadId ^ dwObsfucator);
pTDB->Type = K32OBJ_PROCESS;
HANDLE hThread = OpenProcess(dwDesiredAccess, bInheritHandle, dwThreadId);
pTDB->Type = K32OBJ_THREAD;
return hThread;
}
After obtaining a pointer to the thread object (pTDB
) [for a detailed explanation of the thread object and how to obtain a pointer to it, see the GetTDB()
function description below], its type is changed to process to bypass the check inside OpenProcess()
. After the call to OpenProcess()
, the object type is changed back to thread. In a multitasking OS like Windows, another thread may look at the object while its type field is temporarily wrong, so directly manipulating internal structures is risky; method 1 should be safer.
GetProcessId
GetProcessId()
returns the process identifier (PID
) of the specified process. It's a convenient way of converting a process handle to the corresponding PID
. The Kernel32 function GetProcessId()
is only implemented on Windows XP SP1 or higher. On Windows NT, the semi-documented NTDLL function NtQueryInformationProcess()
can be used to emulate GetProcessId()
:
DWORD WINAPI GetProcessIdNT(HANDLE hProcess)
{
NTSTATUS Status;
PROCESS_BASIC_INFORMATION pbi;
HANDLE hDupHandle;
HANDLE hCurrentProcess;
hCurrentProcess = GetCurrentProcess();
if (!DuplicateHandle(hCurrentProcess,
hProcess,
hCurrentProcess,
&hDupHandle,
PROCESS_QUERY_INFORMATION,
FALSE,
0))
{
SetLastError(ERROR_ACCESS_DENIED);
return 0;
}
Status = NtQueryInformationProcess(hDupHandle,
ProcessBasicInformation,
&pbi,
sizeof(pbi),
NULL);
CloseHandle(hDupHandle);
if (!NT_SUCCESS(Status))
{
SetLastError(RtlNtStatusToDosError(Status));
return 0;
}
return (DWORD)pbi.UniqueProcessId;
}
It's advisable to use the DuplicateHandle()
function to obtain the PROCESS_QUERY_INFORMATION
access right before calling NtQueryInformationProcess()
(because the passed hProcess
might not have this access right).
In Win9x, a PID
is in reality a pointer to a process object (PDB
) (actually XORed with a DWORD
called "Obfuscator") [for a detailed explanation of the process object and how to obtain a pointer to it, see the GetPDB()
function description below; and for an explanation of the meaning of "Obfuscator", see the GetObsfucator()
function below]. Using the process handle as an index to the handle table, we retrieve a pointer to the process object. This pointer XORed with the Obfuscator provides the PID
.
DWORD GetProcessId9x(HANDLE hProcess)
{
PPDB pPDB, pObject;
PHANDLE_TABLE pHandleTable;
int index;
pPDB = GetPDB();
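/* Win95 uses the handle value as the table index; Win98 uses handle / 4 */
if (OSWin95)
index = (int)hProcess;
else if (OSWin98)
index = (int)hProcess / 4;
pHandleTable = pPDB->pHandleTable;
pObject = pHandleTable->array[index].pObject;
/* PID = PDB pointer XOR Obfuscator */
return (DWORD)pObject ^ dwObsfucator;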
}
An alternative method that works for all Windows versions is to call GetCurrentProcessId()
in the context of the specified process:
DWORD _GetProcessId(HANDLE hProcess)
{
HANDLE hThread;
DWORD Pid = 0;
/* Assumes Kernel32 is mapped at the same address in the remote process */
FARPROC pfnGetCurrentProcessId = GetProcAddress(GetModuleHandle("Kernel32.dll"),
"GetCurrentProcessId");
hThread = _CreateRemoteThread(hProcess,
NULL,
0,
(LPTHREAD_START_ROUTINE)pfnGetCurrentProcessId,
NULL,
0,
NULL);
if (!hThread)
return 0;
else
{
WaitForSingleObject(hThread, INFINITE);
GetExitCodeThread(hThread, &Pid);
CloseHandle(hThread);
return Pid;
}
}
GetThreadId
GetThreadId()
returns the thread identifier (TID
) of the specified thread. It's a convenient way of converting a thread handle to the corresponding TID
. The Kernel32 function GetThreadId()
is only implemented on Windows Server 2003 or higher (Windows Vista on the client side). On Windows NT, the semi-documented NTDLL function NtQueryInformationThread()
can be used to emulate GetThreadId()
:
DWORD WINAPI GetThreadIdNT(HANDLE hThread)
{
NTSTATUS Status;
THREAD_BASIC_INFORMATION tbi;
HANDLE hDupHandle;
HANDLE hCurrentProcess;
hCurrentProcess = GetCurrentProcess();
if (!DuplicateHandle(hCurrentProcess,
hThread,
hCurrentProcess,
&hDupHandle,
THREAD_QUERY_INFORMATION,
FALSE,
0))
{
SetLastError(ERROR_ACCESS_DENIED);
return 0;
}
Status = NtQueryInformationThread(hDupHandle,
ThreadBasicInformation,
&tbi,
sizeof(tbi),
NULL);
CloseHandle(hDupHandle);
if (!NT_SUCCESS(Status))
{
SetLastError(RtlNtStatusToDosError(Status));
return 0;
}
return (DWORD)tbi.ClientId.UniqueThread;
}
It's advisable to use the DuplicateHandle()
function to obtain the THREAD_QUERY_INFORMATION
access right before calling NtQueryInformationThread()
(because the passed hThread
might not have this access right).
In Win9x, similar to GetProcessId()
, using the thread handle as an index to the handle table, we retrieve a pointer to the thread object. This pointer XORed with the Obfuscator provides the TID
.
DWORD GetThreadId9x(HANDLE hThread)
{
PPDB pPDB;
PTDB pObject;
PHANDLE_TABLE pHandleTable;
int index;
pPDB = GetPDB();
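/* Win95 uses the handle value as the table index; Win98 uses handle / 4 */
if (OSWin95)
index = (int)hThread;
else if (OSWin98)
index = (int)hThread / 4;
pHandleTable = pPDB->pHandleTable;
pObject = pHandleTable->array[index].pObject;
/* TID = TDB pointer XOR Obfuscator */
return (DWORD)pObject ^ dwObsfucator;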
}
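GetProcessId9x() and GetThreadId9x() share the same handle-to-index step, which can be isolated as in the sketch below. The enum and function names are illustrative; the source uses the flags OSWin95 and OSWin98 and does not show how other Win9x versions are handled, so the sketch covers only those two cases.

```c
#include <stdint.h>

/* Illustrative version tags standing in for the OSWin95 / OSWin98 flags. */
enum win9x_version { WIN9X_95, WIN9X_98 };

/* Maps a handle value to its slot in the process handle table, following the
   two branches in the listings above: Windows 95 uses the handle value
   unchanged, Windows 98 divides it by 4. */
uint32_t handle_to_index(uint32_t handle, enum win9x_version ver)
{
    return (ver == WIN9X_98) ? handle / 4 : handle;
}
```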
CreateRemoteThread
CreateRemoteThread()
creates a thread that runs in the virtual address space of another process. The Kernel32 function CreateRemoteThread()
is not implemented on the Win9x platform. There are two completely different ways of emulating the CreateRemoteThread()
function in Win9x. The first is to locate the Kernel32 internal function that creates a remote thread (and that ultimately calls the VxD VWIN32_CreateThread
-- VxDCall 0x002A,0x0008
). This internal function is called by exported functions like CreateThread()
, CreateKernelThread()
, CreateProcess()
and DebugActiveProcess()
among others. The other method involves "hijacking" an existing remote thread and manipulating its context to run the new thread code.
1. InternalCreateRemoteThread()
Inside Kernel32, there's a function (which I am calling "InternalCreateRemoteThread
") with the following declaration:
PTDB InternalCreateRemoteThread(PVOID pPDB,
DWORD dwStackSize,
LPTHREAD_START_ROUTINE lpStartAddress,
LPVOID lpParameter,
DWORD Flags);
pPDB
is a pointer to the Process Database (PDB
), dwStackSize
is the initial stack size in bytes, lpStartAddress
is the thread function address, lpParameter
is a pointer to the data to be passed to the thread function, and Flags
are internal flags for the function (they differ from dwCreationFlags
!). The function returns a pointer to the newly created thread object (TDB
). Because this function isn't exported, we must find a way of locating it inside Kernel32. Fortunately, it is called by several exported functions, so by searching inside one of them we can locate it. Contrary to popular belief, CreateThread()
requires several search levels to locate the internal function and therefore it shouldn't be our primary choice. DebugActiveProcess()
calls InternalCreateRemoteThread()
directly with the following parameters: pPDB
in EDI, dwStackSize
= 0xFFFFF000, lpStartAddress
= SomeInternalKernel32Routine, lpParameter
in EDI, Flags
= 8. With this information, it's easy to retrieve the address of the InternalCreateRemoteThread()
as follows:
HANDLE hKernel32 = GetModuleHandle("Kernel32.dll");
DWORD DebugActiveProcessOrdinal = NameToOrdinal(hKernel32, "DebugActiveProcess");
PVOID pDebugActiveProcess = _GetProcAddress(hKernel32, DebugActiveProcessOrdinal);
int DebugActiveProcessLength = GetProcLength(hKernel32, DebugActiveProcessOrdinal);
PBYTE p = MemSearch(pDebugActiveProcess, DebugActiveProcessLength,
"\x68\x00\xF0\xFF\xFF", 5);
p += 7;
InternalCreateRemoteThread = (INTERNALCREATEREMOTETHREAD)(p + *(DWORD *)p + 4);
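The last two lines of the listing above decode a relative CALL. The search finds the 5-byte "push 0xFFFFF000", and adding 7 lands on the 4-byte operand of the CALL that follows. The two skipped bytes are presumably a one-byte "push edi" and the E8 call opcode; that is inferred from the parameter list and not stated in the source. The operand is relative to the address of the next instruction, which is why 4 is added. The sketch below performs the same arithmetic on offsets inside a buffer so that it can run anywhere; call_target_offset is a made-up name.

```c
#include <stddef.h>
#include <stdint.h>

/* Given the offset of a CALL's rel32 operand inside code[], returns the
   offset of the call target. The operand is a signed little-endian value
   relative to the first byte after the operand. */
size_t call_target_offset(const uint8_t *code, size_t operand_off)
{
    const uint8_t *p = code + operand_off;
    uint32_t raw = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                   ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    int32_t rel32 = (int32_t)raw;

    /* Same arithmetic as p + *(DWORD *)p + 4 in the listing above. */
    return (size_t)((ptrdiff_t)operand_off + 4 + rel32);
}
```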
The complete CreateRemoteThread()
emulation in Win9x then translates to:
HANDLE CreateRemoteThread9x(HANDLE hProcess,
LPSECURITY_ATTRIBUTES lpThreadAttributes,
DWORD dwStackSize,
LPTHREAD_START_ROUTINE lpStartAddress,
LPVOID lpParameter,
DWORD dwCreationFlags,
LPDWORD lpThreadId)
{
#define INVALID_FLAGS (fTerminated | fTerminating | fNearlyTerminating)
PPDB pPDB, pPDBLocal;
PTDB pTDB;
DWORD dwThreadId, dwProcessId;
HANDLE hThread;
BOOL bInheritHandle = FALSE;
DWORD fLocal;
DWORD fFlags = 8;
DWORD StackSize;
SetLastError(ERROR_INVALID_PARAMETER);
if (lpThreadAttributes != NULL)
{
if (lpThreadAttributes->nLength != sizeof(SECURITY_ATTRIBUTES))
return NULL;
bInheritHandle = lpThreadAttributes->bInheritHandle;
}
if (!lpStartAddress)
return NULL;
if (!(dwProcessId = _GetProcessId(hProcess)))
return NULL;
if (!(pPDB = GetPDB(dwProcessId)))
return NULL;
if (((PPDB95)pPDB)->Flags & INVALID_FLAGS)
{
SetLastError(ERROR_INVALID_PARAMETER);
return NULL;
}
if (!(pPDBLocal = GetPDB(-1)))
return NULL;
fLocal = pPDB == pPDBLocal;
if (!fLocal)
{
if (dwStackSize == 0)
StackSize = -(int)0x3000;
else
StackSize = -(int)dwStackSize;
}
else
StackSize = dwStackSize;
fFlags |= 0x40;
if (dwCreationFlags & CREATE_SILENT)
fFlags |= 0x10;
EnterSysLevel(Win16Mutex);
if (Krn32Mutex)
EnterSysLevel(Krn32Mutex);
pTDB = InternalCreateRemoteThread(pPDB,
StackSize,
lpStartAddress,
lpParameter,
fFlags);
if (Krn32Mutex)
LeaveSysLevel(Krn32Mutex);
LeaveSysLevel(Win16Mutex);
dwThreadId = (DWORD)pTDB ^ dwObsfucator;
if (lpThreadId != NULL)
*lpThreadId = dwThreadId;
hThread = OpenThread9x(THREAD_ALL_ACCESS, bInheritHandle, dwThreadId);
if (!(dwCreationFlags & CREATE_SUSPENDED))
ResumeThread(hThread);
return hThread;
}
2. SetThreadContext()
This method involves the following steps:
- Get a thread ID from the remote process using one of the available enumeration methods (see [6]).
- Allocate memory within the remote process and copy code and data to it.
- Save the suspend count of the thread to be restored later.
SuspendThread()
returns this value. On Win9x, this value can be retrieved directly from the TDB
/TDBX ApiSuspendCount
field.
- Suspend the thread using
SuspendThread()
.
- Save the thread context (using
GetThreadContext()
) to be restored later.
- Set the
Eip
(Instruction Pointer) field of the thread context data structure to point to our code. To ease the detection of the thread termination, the Eip
is not set to the thread function directly; instead it points to stub code that calls the remote thread function. This code is defined as:
push pParams
call ThreadFunction
jmp $
After pushing the needed parameters and calling the thread function, the code enters an infinite loop.
- Resume (i.e. run) the thread by using
ResumeThread()
.
- Compare the current instruction pointer (
Eip
) with the address of the jmp $
instruction. If it matches, we know that the thread function has terminated. The thread exit code is returned in the Eax
register.
- Restore the saved context and suspend count.
- Release the allocated memory.
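The three-instruction stub used in the steps above can be assembled by hand. The sketch below encodes it for 32-bit x86 using the standard opcodes (68 for push imm32, E8 for call rel32, EB FE for a jump to itself) and returns the address of the jmp $ instruction, which is the value to compare Eip against. Whether the library emits exactly these encodings is an assumption, and build_stub is a hypothetical name.

```c
#include <stdint.h>

#define STUB_SIZE        12  /* push imm32 (5) + call rel32 (5) + jmp short (2) */
#define STUB_LOOP_OFFSET 10  /* offset of the jmp $ instruction inside the stub */

/* Stores a 32-bit value in little-endian byte order. */
static void put_u32le(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t)v;
    dst[1] = (uint8_t)(v >> 8);
    dst[2] = (uint8_t)(v >> 16);
    dst[3] = (uint8_t)(v >> 24);
}

/* Encodes the stub into out[]. stub_addr is the remote address where the
   stub will be written, params_addr the remote copy of the parameters and
   func_addr the remote copy of the thread function. Returns the remote
   address of jmp $. */
uint32_t build_stub(uint8_t out[STUB_SIZE], uint32_t stub_addr,
                    uint32_t params_addr, uint32_t func_addr)
{
    out[0] = 0x68;                                  /* push pParams */
    put_u32le(out + 1, params_addr);
    out[5] = 0xE8;                                  /* call ThreadFunction */
    put_u32le(out + 6, func_addr - (stub_addr + STUB_LOOP_OFFSET));
    out[10] = 0xEB;                                 /* jmp $ */
    out[11] = 0xFE;                                 /* displacement -2 */
    return stub_addr + STUB_LOOP_OFFSET;
}
```

The call displacement is relative to the instruction after the call, which is jmp $ itself, so the same constant serves both purposes.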
With this method, it's not really possible to fully emulate the CreateRemoteThread()
function, so instead I present a clone of the RemoteExecute()
function (which uses _CreateRemoteThread()
and is described below) called ContextRemoteExecute()
. Note that suspending a running thread that belongs to another process can lead to deadlock, which makes this method very unsafe.
Other functions
The following functions do not exist in Windows and are only implemented to ease the manipulation of the undocumented internal structures and remote code injection.
GetTIB
This function returns a pointer to the Thread Information Block (TIB
). The return value should be cast depending on the Windows version. The format of this data structure is defined in the file "struct.h". This structure is documented for Windows NT (defined in winnt.h). It contains stack and exception information used for SEH - Structured Exception Handling (used in C/C++ try
/catch
) - among other fields. DWORD
at offset 0x18 contains the linear address of the TIB
structure, therefore to retrieve a pointer to the TIB
, use the following code:
__asm mov eax, fs:[0x18]
__asm mov pTIB, eax
It's necessary to use assembly because this data structure is pointed to by the FS
segment register (data references normally use the DS
segment register). Note that in Windows NT because the TIB
structure is contained by the TEB
(Thread Environment Block), pTIB
is also a pointer to the TEB
. In Windows 9x, the TIB
is part of the TDB
(Thread Database) -- see the description of the TDB
below. For a detailed explanation of the TIB
internals, see [2].
GetTDB
This function returns a pointer to the Thread Database (TDB
) for the specified thread ID (TID
). The return value should be cast depending on the Windows version. Because the TDB
exists only on the Windows 9x platform, the function returns NULL
for Windows NT or higher. The format of this data structure is defined in the file "struct.h". The TDB
is a system data structure that represents a thread object and contains information such as: PDB
pointer, TIB
, TLS
array pointer, thread exit code, etc. To obtain a pointer to the TDB
, just XOR the TID
with the Obfuscator value:
pTDB = TID ^ dwObsfucator;
This means that the TID
is just an encrypted pointer to the TDB
. For the current thread, the TDB
can be retrieved from the TIB
by subtracting the offset of the TIB
inside the TDB
structure:
__asm mov eax,fs:[0x18]
__asm mov pTDB,eax
/* Offsets are in bytes, so pTDB must be an integer or a byte pointer here */
if (OSWin95)
pTDB -= 0x10;
else if (OSWin98)
pTDB -= 8;
GetPDB
This function returns a pointer to the Process Database (PDB
) for the specified process ID (PID
). The return value should be cast depending on the Windows version. Because the PDB
exists only on the Windows 9x platform, the function returns NULL
for Windows NT or higher. The format of this data structure is defined in the file "struct.h". The PDB
is a system data structure that represents a process object and contains information about a process including: list of threads and modules, the handle table, the environment database, etc. To obtain a pointer to the PDB
, just XOR the PID
with the Obfuscator value:
pPDB = PID ^ dwObsfucator;
This means that the PID
is just an encrypted pointer to the PDB
. For the current process, the PDB
can be retrieved from the pProcess
field (offset 0x30) of the TIB
:
__asm mov eax, fs:[0x30]
__asm mov pPDB, eax
GetObsfucator
In early versions of Windows 95, GetCurrentProcessId()
and GetCurrentThreadId()
returned direct pointers to the PDB
and TDB
. Later Microsoft changed this behaviour and encrypted the return value (XOR with a random value). In the debug build, this value is called "Obsfucator
" ("Obfuscator" misspelled!). Because this is a random value (computed every time the system boots), it must be calculated at run time. From the properties of the XOR operator, we know that if A = B XOR C (PID
= PDB
XOR Obsfucator
) then C = A XOR B (Obsfucator
= PID
XOR PDB
). For the current process, we obtain the PID
from GetCurrentProcessId()
and the PDB
from FS:[0x30]
, and therefore the Obsfucator
can be obtained using the following code:
DWORD GetObsfucator()
{
DWORD PID, PDB;
PID = GetCurrentProcessId();
__asm mov eax,fs:[0x30];
__asm mov PDB,eax
return PDB ^ PID;
}
Because the TDB
pointer is also encrypted, it's also possible to obtain the Obsfucator
from TDB
XOR TID
. For the current thread, the TID
is obtained from GetCurrentThreadId()
, and the TDB
pointer from FS:[0x18]
minus an offset value. (FS:[0x18]
is the linear address of the TIB
. Each TDB
contains a TIB
structure, therefore by subtracting the offset of the starting TIB
inside the TDB
, we get the starting address of the TDB
.)
DWORD GetObsfucator()
{
DWORD TID, TDB;
TID = GetCurrentThreadId();
__asm mov eax,fs:[0x18];
__asm mov TDB,eax
if (OSWin95)
TDB -= 0x10;
else if (OSWin98)
TDB -= 8;
return TDB ^ TID;
}
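The XOR relationship used throughout this section can be checked in isolation. The sketch below works on plain 32-bit integers so that it runs on any platform; the function names are illustrative and the values in the usage note are made up.

```c
#include <stdint.h>

/* Obfuscator = ID XOR object pointer (PID with PDB, or TID with TDB). */
uint32_t recover_obfuscator(uint32_t id, uint32_t object_ptr)
{
    return id ^ object_ptr;
}

/* Object pointer = ID XOR Obfuscator, the step used by GetPDB() and GetTDB(). */
uint32_t id_to_object(uint32_t id, uint32_t obfuscator)
{
    return id ^ obfuscator;
}
```

For example, if a PDB lives at 0x81560000 and the boot-time value is 0x5A5AC3C3, the PID is their XOR; feeding that PID and the PDB address to recover_obfuscator() gives back 0x5A5AC3C3, and the same value then decodes any TID into its TDB address.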
RemoteExecute
The RemoteExecute()
function is used to execute code in the context of a remote process. It has the following declaration:
BOOL RemoteExecute(HANDLE hProcess,
LPTHREAD_START_ROUTINE Function,
PVOID pParams,
DWORD Size);
hProcess
is the handle of the remote process, Function
is the address of the thread function, and pParams
and Size
are the pointer and size of the optional data structure to be passed to the thread function. The function uses _VirtualAllocEx()
, WriteProcessMemory()
and _CreateRemoteThread()
to allocate memory and copy code and data to the remote process and execute the remote thread. It returns TRUE
if everything went OK. To successfully run the thread within the remote process, the thread function must be coded following certain rules: it cannot use any absolute addressing (calls, jumps, data references, ...) at all! Why? Because these are hard-coded addresses that reference data/code in the running process and become invalid when the code is copied to the remote process address space. This means that it's not possible to directly call any external functions (including Windows DLLs and standard library -- like strlen()
or printf()
) or reference any static data like strings or structures. If needed, any function reference or data must be passed in a structure pointed to by the pParams
parameter. Inside RemoteExecute()
, there's a function (IsCodeSafe()
) that disassembles the thread function code and checks for absolute addressing, but this check is not 100% reliable, so as a final piece of advice, always inspect the generated code (using a disassembler or debugger) to catch any invalid references. If you are using Visual C++ (v 6.0), you can follow these tips:
- Turn on file listing and check the generated code: Project\Settings\C/C++\Listing Files\Listing file type=Assembly, Machine Code, and Source.
- Turn off stack probes. Check for
__chkstk()
references in the listing files.
- Use
#pragma check_stack(off)
.
- Use less than 4K of local variables.
- Augment the stack size: /Gs size (Project\Settings\C/C++\ProjectOptions).
- Remove the /GZ switch in the debug build. Check for
__chkesp()
references in the listing files.
- Project\Settings\C/C++\Project Options
- Disable incremental compilation (/Gi).
- Use
#pragma comment(linker, "/INCREMENTAL:NO")
- Remove the /Gi switch (Project\Settings\C/C++\Customize\Enable incremental compilation=Off)
- Declare the functions as static.
- Don't call any functions besides those in KERNEL32.DLL. Use
LoadLibrary()
/GetProcAddress()
if you need functions from other libraries.
- Don't use any static strings. Pass them in a structure pointed by
pParams
.
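The rules above amount to this: the thread function may touch only its argument and its own locals. A portable sketch of that shape follows. The structure and function names are hypothetical, and the standard strlen() is used here only so the example can run in a single process. In a real injection the pointer would have to be valid in the remote process, for example a KERNEL32 export as the tips suggest.

```c
#include <stddef.h>
#include <string.h>

/* Pointer type for the one external function the worker needs. */
typedef size_t (*STRLEN_FN)(const char *);

/* Everything the worker uses travels inside this block, which the injector
   would copy into the remote process alongside the code. */
typedef struct {
    STRLEN_FN pfnStrlen;   /* resolved by the injector before the copy */
    char      text[32];    /* string data stored in the block, not in .data */
} WORKER_PARAMS;

/* Thread-function style routine: no globals, no string literals, and no
   direct calls to external functions. */
unsigned long worker(void *pParams)
{
    WORKER_PARAMS *p = (WORKER_PARAMS *)pParams;
    return (unsigned long)p->pfnStrlen(p->text);
}
```

Calling strlen(p->text) directly inside worker() would compile to a reference into the injecting process, which is exactly what the listing check is meant to catch.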
StartRemoteSubclass/StopRemoteSubclass
These two functions are used to remotely subclass the window procedure of a window that belongs to a remote process. They are declared as:
BOOL StartRemoteSubclass(PRDATA rd, USERWNDPROC WndProc);
BOOL StopRemoteSubclass(PRDATA rd);
rd
is the address of a RDATA
structure that contains important information to be passed to these functions:
typedef struct _RDATA {
int Size;
HANDLE hProcess;
HWND hWnd;
struct _RDATA *pRDATA;
WNDPROC pfnNewWndProc;
WNDPROC pfnOldWndProc;
USERWNDPROC pfnUserWndProc;
LRESULT Result;
SETWINDOWLONG pfnSetWindowLong;
CALLWINDOWPROC pfnCallWindowProc;
} RDATA, *PRDATA;
If you need to pass extra data to the new window procedure handler, it must be appended to the existing RDATA
. Before calling StartRemoteSubclass()
, the following fields of the RDATA
structure must be initialized: Size
must contain the size of the RDATA
structure plus any appended data, hProcess
must contain the handle of the remote process, and hWnd
must contain the handle of the window to be subclassed. The extra fields of the appended data should also be initialized at this point. All the remaining fields should be considered private and not used. WndProc
is the address of the new window handler procedure and is declared as:
LRESULT WINAPI WndProc(PRDATA pData,
HWND hWnd,
UINT Msg,
WPARAM wParam,
LPARAM lParam)
Except for the first parameter (a pointer to the RDATA
structure), the remaining parameters are the usual window handle, message type, and wParam
and lParam
found in any window procedure. The new window procedure will be called by Windows every time a message to the window must be processed, therefore the function should be coded as a "normal" window procedure (with the switch(Msg)
statement). Please note that because this function will be executed in a remote process, it must follow the same rules as any remote code execution (see description of the RemoteExecute()
function above). Any unhandled message should be processed by the default window procedure handler. For this, the function must return FALSE
. If you process a message yourself, store the message result in the Result
field of the RDATA
structure and return TRUE
from the function.
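The two conventions described above, appending extra data after RDATA and returning TRUE or FALSE with the result stored in Result, can be sketched without any Windows headers. The types below are cut-down stand-ins that keep only the fields the sketch touches; they are not the library's definitions. MSG_CHAR has the numeric value of WM_CHAR.

```c
/* Stand-in for RDATA with only the fields used here. */
typedef struct {
    int  Size;     /* size of the header plus any appended data */
    long Result;   /* message result when the handler returns 1 (TRUE) */
} RDATA_SKETCH;

/* Caller-defined block: the header comes first, so a pointer to the header
   can be cast back to the full block inside the handler. */
typedef struct {
    RDATA_SKETCH rd;
    unsigned     key;    /* appended data: character to react to */
    int          hits;   /* appended data: how often it was seen */
} MYDATA_SKETCH;

#define MSG_CHAR 0x0102u  /* numeric value of WM_CHAR */

/* Returns 1 when the message was handled (rd.Result is valid) and 0 when the
   default window procedure should process it. */
int user_wndproc(RDATA_SKETCH *pData, unsigned msg, unsigned wParam)
{
    MYDATA_SKETCH *my = (MYDATA_SKETCH *)pData;

    switch (msg) {
    case MSG_CHAR:
        if (wParam == my->key) {
            my->hits++;
            pData->Result = 0;
            return 1;
        }
        break;
    }
    return 0;
}
```

Before the block is handed over, Size would be set to sizeof(MYDATA_SKETCH) so that the appended fields are copied along with the header.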
Demo Code
Finally to glue everything together, I wrote an application that demonstrates how to use the functions exported by the Remote library. Here's a brief description of the code:
- Display Windows version.
- Launch "Notepad.exe" using
CreateProcess()
.
- Allocate a 0x1000-byte buffer using
_VirtualAllocEx()
.
- Write to the allocated memory using
WriteProcessMemory()
or memset()
(Win9x only).
- Free the allocated memory using
_VirtualFreeEx()
.
- Convert the process handle to the PID by using
_GetProcessId()
.
- Convert the thread handle to the TID by using
_GetThreadId()
.
- Use
_CreateRemoteThread()
to run a thread in the current process. The thread just sleeps for 3 seconds and returns the lpParameter
value.
- Use
_OpenThread()
to return a new handle to the created thread. This handle is used by WaitForSingleObject()
and GetExitCodeThread()
.
- Call
ContextRemoteExecute()
to run a thread in the remote process. The thread just copies a message string to the "Notepad" edit window.
- Call
RemoteExecute()
to run a thread in the remote process. The thread just copies a message string to the "Notepad" edit window.
- Subclass the "Notepad" edit window using
StartRemoteSubclass()
. If you press the key sequence "remote" in the "Notepad" edit window, a message box is displayed.
- After 30 seconds, restore the original "Notepad" window handler by using
StopRemoteSubclass()
.
- Wait for the "Notepad" process to finish.
Note
This library was tested with the following Windows versions:
- Windows 95 (4.00.1111B)
- Windows 98 (4.10.1998/4.10.2222A)
- Windows ME (4.90.3000)
- Windows NT 4.0 (4.0.1381 SP1/SP3/SP6)
- Windows 2000 (5.0.2195 SP2/SP4)
- Windows XP (5.1.2600)
Because the code relies heavily on undocumented functions and structures, it can malfunction (or even crash) if run in an untested version. In this case, you should modify the code to include the idiosyncrasies of the new versions.
N.B. Windows 2000 SP3 is known for having several bugs. Because of this, some of the presented code doesn't work correctly on this version (in particular, the function ContextRemoteExecute()
).
References
- [1] "Windows 95 System Programming Secrets" by Matt Pietrek
- [2] MSJ May 1996 - Under the Hood by Matt Pietrek
- [3] "DLL Injection on Win32 platforms" by Yoda
- [4] "Three ways to inject your code into another process" by Robert Kuster
- [5] The Undocumented Functions by NTinternals
- [6] "Enumerating Windows Processes" by Alex Fedotov