Remote Library

António Feijão

30 Sep 2005

A library that implements a common interface for remote memory handling and remote execution for all Windows versions.

Introduction

Remote code injection has always been a popular topic with dozens of articles written about the subject. One of the preferred techniques involves the following steps [4]:

Allocate memory in the remote process using VirtualAllocEx().
Copy the code to the allocated remote memory using WriteProcessMemory().
Execute the remote code using CreateRemoteThread().

The problem with this technique (as noted in several remote injection tutorials) is that the required Windows functions don't exist in every Windows version. The purpose of this library is to emulate the missing functions so that the same code can be used across all Windows versions.

Remote memory handling

The VirtualAllocEx() function allocates memory within the virtual address space of a specified process, and VirtualFreeEx() releases it. The difference between VirtualAllocEx()/VirtualFreeEx() and VirtualAlloc()/VirtualFree() is the first parameter (hProcess), which makes it possible to allocate and release memory in any process and not only within the address space of the calling process. The Kernel32 functions VirtualAllocEx() and VirtualFreeEx() are only implemented on Windows NT 4.0 or higher. To emulate these functions in NT 3.51, we can use the undocumented NTDLL functions NtAllocateVirtualMemory() and NtFreeVirtualMemory():

LPVOID VirtualAllocExNT3(HANDLE hProcess,
                         LPVOID lpAddress,
                         DWORD dwSize,
                         DWORD flAllocationType,
                         DWORD flProtect)
{
    NTSTATUS Status = NtAllocateVirtualMemory(hProcess,          // Process handle
                                              &lpAddress,        // Memory address
                                              0,                 // Zero bits
                                              &dwSize,           // Region size
                                              flAllocationType,  // Allocation type
                                              flProtect);        // Protection attr.

    if (!NT_SUCCESS(Status))
    {
        SetLastError(RtlNtStatusToDosError(Status));
        return NULL;
    }

    return lpAddress;
}

BOOL VirtualFreeExNT3(HANDLE hProcess,
                      LPVOID lpAddress,
                      DWORD dwSize,
                      DWORD dwFreeType)
{
    // Param 'dwSize' must be zero for MEM_RELEASE

    if ((dwFreeType & MEM_RELEASE) && (dwSize != 0))
    {
        SetLastError(ERROR_INVALID_PARAMETER);
        return FALSE;
    }

    NTSTATUS Status = NtFreeVirtualMemory(hProcess,      // Process handle
                                          &lpAddress,    // Base address
                                          &dwSize,       // Region size
                                          dwFreeType);   // Free type

    if (!NT_SUCCESS(Status))
    {
        SetLastError(RtlNtStatusToDosError(Status));
        return FALSE;
    }

    return TRUE;
}

In Windows 9x, the trick is to use VirtualAlloc() to allocate memory in the shared area (between address 0x80000000 and 0xC0000000). Unlike Windows NT, in Win9x, the virtual memory area between 2GB and 3GB is always mapped into every process without the need to do explicit mapping. This means that if one process allocates memory within this region, other processes can access it directly using the pointer address. To allocate shared memory in Win9x using VirtualAlloc(), the undocumented flag VA_SHARED must be ORed with the flAllocationType parameter ([1] Chapter 5). VirtualFree() can be used to release the shared memory.

The undocumented flag VA_SHARED is defined as:

#define VA_SHARED 0x8000000

LPVOID VirtualAllocEx9x(HANDLE hProcess,
                        LPVOID lpAddress,
                        DWORD dwSize,
                        DWORD flAllocationType,
                        DWORD flProtect)
{
    #define VA_SHARED 0x8000000              // Undoc. shared memory flag

    // hProcess is unused: the shared area is visible to every process
    return VirtualAlloc(lpAddress,            // Addr. of memory block
                        dwSize,               // Size of memory block
                        flAllocationType |    // Allocation type OR
                        VA_SHARED,            // Shared memory flag
                        flProtect);           // Access protection
}

BOOL VirtualFreeEx9x(HANDLE hProcess,
                     LPVOID lpAddress,
                     DWORD dwSize,
                     DWORD dwFreeType)
{
    return VirtualFree(lpAddress,               // Address of memory block
                       dwSize,                  // Size of memory block
                       dwFreeType);             // Type of free operation
}

OpenThread

OpenThread() returns a handle to a thread object from its thread identifier (TID). It's a convenient way of converting a TID to a thread handle. The Kernel32 function OpenThread() is only implemented on Win 2000 or higher and Win ME. On Windows NT, the undocumented NTDLL function NtOpenThread() can be used to emulate OpenThread():

HANDLE OpenThreadNT(DWORD dwDesiredAccess,
                    BOOL  bInheritHandle,
                    DWORD dwThreadId)
{
    OBJECT_ATTRIBUTES   ObjectAttributes;
    CLIENT_ID           ClientId;
    HANDLE              hThread;
    NTSTATUS            Status;

    InitializeObjectAttributes(&ObjectAttributes, NULL, 0, NULL, NULL);

    if (bInheritHandle)
        ObjectAttributes.Attributes = OBJ_INHERIT;

    ClientId.UniqueProcess = NULL;
    ClientId.UniqueThread = (HANDLE)dwThreadId;

    Status = NtOpenThread(&hThread,             // Thread handle
                          dwDesiredAccess,      // Access to thread object
                          &ObjectAttributes,    // Object attributes
                          &ClientId);           // Client Id

    if (!NT_SUCCESS(Status))
    {
        SetLastError(RtlNtStatusToDosError(Status));
        return NULL;
    }

    return hThread;
}

In Win9x, Kernel32 OpenProcess() can also be used to open a thread: instead of passing it a PID (Process ID), a TID (Thread ID) is passed to it. Inside OpenProcess(), a check verifies that the passed ID corresponds to a process and, if it does not, the function fails with ERROR_INVALID_PARAMETER. A way to bypass this check is therefore needed before OpenProcess() can be used to open a thread object. There are two ways of doing this: skip the test inside OpenProcess() (ATM method by Enrico Del Fante), or temporarily change the thread object type to process.

Each Kernel object (process, thread, file, mutex, ...) has a corresponding data structure that holds information about the object. These structures are undocumented, but some of them are described in [1]. The first field of each Kernel object structure is always the object type: K32OBJ_PROCESS, K32OBJ_THREAD, K32OBJ_FILE, K32OBJ_MUTEX, ... After the initial checks, OpenProcess() just calls the Kernel32 internal function OpenHandle(), which returns a handle to an object; that is why OpenProcess() can be used to open both process and thread objects.

Method 1:

HANDLE OpenThread9x(DWORD dwDesiredAccess,
                    BOOL  bInheritHandle,
                    DWORD dwThreadId)
{
    HANDLE  hThread;

    // Get OpenProcess() address and length

    HANDLE hKernel32 = GetModuleHandle("Kernel32.dll");
    DWORD OpenProcessOrdinal = NameToOrdinal(hKernel32, "OpenProcess");
    PVOID pOpenProcess = _GetProcAddress(hKernel32, OpenProcessOrdinal);
    int OpenProcessLength = GetProcLength(hKernel32, OpenProcessOrdinal);

    // Search for MOV ECX,0 (B9,00,00,00,00) inside OpenProcess() function

    INTERNALOPENTHREAD InternalOpenThread =
        (INTERNALOPENTHREAD)MemSearch(pOpenProcess,
                                      OpenProcessLength,
                                      "\xB9\x00\x00\x00\x00",
                                      5);

    // Thread Database pointer

    PTDB pTDB = (PTDB)(dwThreadId ^ dwObsfucator);

    // InternalOpenThread()

    __asm
    {
        mov   eax, pTDB
        push  dwThreadId
        push  bInheritHandle
        push  dwDesiredAccess
        call  InternalOpenThread
        mov   hThread, eax
    }

    return hThread;
}

The internal Kernel32 OpenThread() function must be called from assembly language because it expects register EAX to contain the TDB (Thread Database) address. For a detailed description of functions NameToOrdinal(), _GetProcAddress() and GetProcLength(), see the file "GetProcAddress.c".
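The byte-pattern search that Method 1 relies on can be sketched in portable C. This is only an illustration of the idea: the name mem_search() and its signature are assumptions modeled on the MemSearch() calls shown above, not the library's actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for MemSearch(): returns a pointer to the first
   occurrence of pattern[0..pattern_len) inside base[0..length), or NULL
   when the pattern does not occur (or is empty or longer than the range). */
const unsigned char *mem_search(const unsigned char *base, size_t length,
                                const unsigned char *pattern, size_t pattern_len)
{
    size_t i;

    assert(base != NULL && pattern != NULL);

    if (pattern_len == 0 || pattern_len > length)
        return NULL;

    /* Slide a window of pattern_len bytes over the searched range */
    for (i = 0; i + pattern_len <= length; i++)
    {
        if (memcmp(base + i, pattern, pattern_len) == 0)
            return base + i;
    }

    return NULL;
}
```

Bounding the scan by the function length (as GetProcLength() does in the listing above) keeps the search from running into unrelated code.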

Method 2:

HANDLE OpenThread9x(DWORD dwDesiredAccess,
                    BOOL  bInheritHandle,
                    DWORD dwThreadId)
{
    // Thread Database pointer

    PTDB pTDB = (PTDB)(dwThreadId ^ dwObsfucator);

    pTDB->Type = K32OBJ_PROCESS;
    HANDLE hThread = OpenProcess(dwDesiredAccess, bInheritHandle, dwThreadId);
    pTDB->Type = K32OBJ_THREAD;

    return hThread;
}

After obtaining a pointer to the thread object (pTDB) [for a detailed explanation of the thread object and how to obtain a pointer to it, see the GetTDB() function description below], its type is changed to process to bypass the check inside OpenProcess(). After the call to OpenProcess(), the object type is changed back to thread. In a preemptive multitasking OS like Windows, other code can run while the object temporarily carries the wrong type, so directly manipulating internal structures is risky and method 1 should be safer.

GetProcessId

GetProcessId() returns the process identifier (PID) of the specified process. It's a convenient way of converting a process handle to the corresponding PID. The Kernel32 function GetProcessId() is only implemented on Win XP or higher. On Windows NT, the semi-documented NTDLL function NtQueryInformationProcess() can be used to emulate GetProcessId():

DWORD WINAPI GetProcessIdNT(HANDLE hProcess)
{
    NTSTATUS                  Status;
    PROCESS_BASIC_INFORMATION pbi;
    HANDLE                    hDupHandle;
    HANDLE                    hCurrentProcess;

    hCurrentProcess = GetCurrentProcess();

    // Use DuplicateHandle() to get PROCESS_QUERY_INFORMATION access right

    if (!DuplicateHandle(hCurrentProcess,
                         hProcess,
                         hCurrentProcess,
                         &hDupHandle,
                         PROCESS_QUERY_INFORMATION,
                         FALSE,
                         0))
    {
        SetLastError(ERROR_ACCESS_DENIED);
        return 0;
    }

    Status = NtQueryInformationProcess(hDupHandle,
                                       ProcessBasicInformation,
                                       &pbi,
                                       sizeof(pbi),
                                       NULL);

    CloseHandle(hDupHandle);

    if (!NT_SUCCESS(Status))
    {
        SetLastError(RtlNtStatusToDosError(Status));
        return 0;
    }

    // Return PID

    return (DWORD)pbi.UniqueProcessId;
}

It's advisable to use the DuplicateHandle() function to obtain the PROCESS_QUERY_INFORMATION access right before calling NtQueryInformationProcess(), because the hProcess handle passed in might not have this access right.

In Win9x, a PID is in reality a pointer to a process object (PDB) (actually XORed with a DWORD called "Obfuscator") [for a detailed explanation of the process object and how to obtain a pointer to it, see the GetPDB() function description below; and for an explanation of the meaning of "Obfuscator", see the GetObsfucator() function below]. Using the process handle as an index to the handle table, we retrieve a pointer to the process object. This pointer XORed with the Obfuscator provides the PID.

DWORD GetProcessId9x(HANDLE hProcess)
{
    PPDB           pPDB, pObject;
    PHANDLE_TABLE  pHandleTable;
    int            index;

    // Current PDB

    pPDB = GetPDB();

    // Handle table index

    if (OSWin95)
        index = (int)hProcess;
    else if (OSWin98)
        index = (int)hProcess / 4;

    // Handle table pointer

    pHandleTable = pPDB->pHandleTable;

    // Pointer to process database

    pObject = pHandleTable->array[index].pObject;

    // Return PID

    return (DWORD)pObject ^ dwObsfucator;
}
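The handle-to-object lookup used by GetProcessId9x() can be modeled in portable C. This is a simplified sketch: the handle_entry and handle_table layouts are hypothetical (the real Win9x structures are undocumented and defined in "struct.h"); only the index rule (handle on Win95, handle / 4 on Win98) comes from the listing above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified handle table (not the real Win9x layout) */
typedef struct {
    uint32_t access;   /* access flags for the handle */
    void    *object;   /* pointer to the kernel object (PDB, TDB, ...) */
} handle_entry;

typedef struct {
    size_t       count;        /* number of valid entries */
    handle_entry entries[8];
} handle_table;

/* Win95 uses the handle value directly; Win98 handles are multiples of 4 */
size_t handle_to_index(uint32_t handle, int is_win95)
{
    return is_win95 ? (size_t)handle : (size_t)(handle / 4);
}

/* Returns the object a handle refers to, or NULL if the index is out of range */
void *object_from_handle(const handle_table *table, uint32_t handle, int is_win95)
{
    size_t index;

    assert(table != NULL);

    index = handle_to_index(handle, is_win95);
    if (index >= table->count)
        return NULL;

    return table->entries[index].object;
}
```

XORing the returned pointer with the Obfuscator would then give the PID, as in the last line of GetProcessId9x().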

An alternative method that works for all Windows versions is to call GetCurrentProcessId() in the context of the specified process:

DWORD _GetProcessId(HANDLE hProcess)
{
    HANDLE  hThread;
    DWORD   Pid = 0;
    FARPROC pfnGetCurrentProcessId;

    // Address of GetCurrentProcessId(), used as the remote thread function
    pfnGetCurrentProcessId = GetProcAddress(GetModuleHandle("Kernel32.dll"),
                                            "GetCurrentProcessId");

    hThread = _CreateRemoteThread(hProcess,
                                  NULL,
                                  0,
                                  (LPTHREAD_START_ROUTINE)pfnGetCurrentProcessId,
                                  NULL,
                                  0,
                                  NULL);
    if (!hThread)
        return 0;

    WaitForSingleObject(hThread, INFINITE);

    // Exit code of remote thread is the PID
    GetExitCodeThread(hThread, &Pid);
    CloseHandle(hThread);
    return Pid;
}

GetThreadId

GetThreadId() returns the thread identifier (TID) of the specified thread. It's a convenient way of converting a thread handle to the corresponding TID. The Kernel32 function GetThreadId() is only implemented on Win XP or higher. On Windows NT, the semi-documented NTDLL function NtQueryInformationThread() can be used to emulate GetThreadId():

DWORD WINAPI GetThreadIdNT(HANDLE hThread)
{
    NTSTATUS                 Status;
    THREAD_BASIC_INFORMATION tbi;
    HANDLE                   hDupHandle;
    HANDLE                   hCurrentProcess;

    hCurrentProcess = GetCurrentProcess();

    // Use DuplicateHandle() to get THREAD_QUERY_INFORMATION access right

    if (!DuplicateHandle(hCurrentProcess,
                         hThread,
                         hCurrentProcess,
                         &hDupHandle,
                         THREAD_QUERY_INFORMATION,
                         FALSE,
                         0))
    {
        SetLastError(ERROR_ACCESS_DENIED);
        return 0;
    }

    Status = NtQueryInformationThread(hDupHandle,
                                      ThreadBasicInformation,
                                      &tbi,
                                      sizeof(tbi),
                                      NULL);

    CloseHandle(hDupHandle);

    if (!NT_SUCCESS(Status))
    {
        SetLastError(RtlNtStatusToDosError(Status));
        return 0;
    }

    // Return TID

    return (DWORD)tbi.ClientId.UniqueThread;
}

It's advisable to use the DuplicateHandle() function to obtain the THREAD_QUERY_INFORMATION access right before calling NtQueryInformationThread(), because the hThread handle passed in might not have this access right.

In Win9x, similar to GetProcessId(), using the thread handle as an index to the handle table, we retrieve a pointer to the thread object. This pointer XORed with the Obfuscator provides the TID.

DWORD GetThreadId9x(HANDLE hThread)
{
    PPDB           pPDB;
    PTDB           pObject;
    PHANDLE_TABLE  pHandleTable;
    int            index;

    // Current PDB

    pPDB = GetPDB();

    // Handle table index

    if (OSWin95)
        index = (int)hThread;
    else if (OSWin98)
        index = (int)hThread / 4;

    // Handle table pointer

    pHandleTable = pPDB->pHandleTable;

    // Pointer to thread database

    pObject = pHandleTable->array[index].pObject;

    // Return TID

    return (DWORD)pObject ^ dwObsfucator;
}

CreateRemoteThread

CreateRemoteThread() creates a thread that runs in the virtual address space of another process. The Kernel32 function CreateRemoteThread() is not implemented on the Win9x platform. There are two completely different ways of emulating the CreateRemoteThread() function in Win9x. The first is to locate the Kernel32 internal function that creates a remote thread (and that ultimately calls the VxD service VWIN32_CreateThread, VxDCall 0x002A,0x0008). This internal function is called by exported functions like CreateThread(), CreateKernelThread(), CreateProcess() and DebugActiveProcess(), among others. The second method involves "hijacking" an existing remote thread and manipulating its context to run the new thread code.

1. InternalCreateRemoteThread()

Inside Kernel32, there's a function (which I am calling "InternalCreateRemoteThread") with the following declaration:

PTDB InternalCreateRemoteThread(PVOID pPDB,
                                DWORD dwStackSize,
                                LPTHREAD_START_ROUTINE lpStartAddress,
                                LPVOID lpParameter,
                                DWORD Flags);

pPDB is a pointer to the Process Database (PDB), dwStackSize is the initial stack size in bytes, lpStartAddress is the thread function address, lpParameter is a pointer to the data to be passed to the thread function, and Flags are internal flags for the function (they differ from dwCreationFlags!). The function returns a pointer to the newly created thread object (TDB).

Because this function isn't exported, we must find a way of locating it inside Kernel32. Fortunately, it is called by several exported functions, so by searching inside one of them we can locate it. Contrary to popular belief, CreateThread() requires several search levels to reach the internal function and therefore shouldn't be our primary choice. DebugActiveProcess() calls InternalCreateRemoteThread() directly with the following parameters: pPDB in EDI, dwStackSize = 0xFFFFF000, lpStartAddress = SomeInternalKernel32Routine, lpParameter in EDI, Flags = 8. With this information, it's easy to retrieve the address of InternalCreateRemoteThread() as follows:

// Get DebugActiveProcess() address and length

HANDLE hKernel32 = GetModuleHandle("Kernel32.dll");
DWORD DebugActiveProcessOrdinal = NameToOrdinal(hKernel32, "DebugActiveProcess");
PVOID pDebugActiveProcess = _GetProcAddress(hKernel32, DebugActiveProcessOrdinal);
int DebugActiveProcessLength = GetProcLength(hKernel32, DebugActiveProcessOrdinal);

// Search for PUSH 0xFFFFF000 (68,00,F0,FF,FF) inside DebugActiveProcess() function

PBYTE p = (PBYTE)MemSearch(pDebugActiveProcess, DebugActiveProcessLength,
                           "\x68\x00\xF0\xFF\xFF", 5);
p += 7; // Skip to the rel32 operand of CALL InternalCreateRemoteThread

// Address of InternalCreateRemoteThread() = end of the CALL instruction + rel32

InternalCreateRemoteThread = (INTERNALCREATEREMOTETHREAD)(p + *(DWORD *)p + 4);
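The last line decodes an x86 CALL rel32 instruction: the target is the address of the next instruction (the operand address plus 4) plus the signed 32-bit displacement. A minimal, portable sketch of that arithmetic follows, working on offsets inside a byte buffer rather than on live Kernel32 code. The function names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reads a little-endian signed 32-bit displacement, byte by byte so the
   result does not depend on host endianness or alignment. */
int32_t read_rel32(const unsigned char *p)
{
    uint32_t u = (uint32_t)p[0]
               | ((uint32_t)p[1] << 8)
               | ((uint32_t)p[2] << 16)
               | ((uint32_t)p[3] << 24);
    return (int32_t)u;
}

/* Given the offset of the rel32 operand that follows an E8 opcode, returns
   the offset of the call target: next instruction + displacement. */
ptrdiff_t resolve_call_target(const unsigned char *code, size_t operand_offset)
{
    assert(code != NULL);
    return (ptrdiff_t)operand_offset + 4 + read_rel32(code + operand_offset);
}
```

For example, with the PUSH 0xFFFFF000 pattern at offset 2 of a buffer, the operand sits at offset 2 + 7 = 9, mirroring the p += 7 step in the listing above.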

The complete CreateRemoteThread() emulation in Win9x then translates to:

HANDLE CreateRemoteThread9x(HANDLE                 hProcess,
                            LPSECURITY_ATTRIBUTES  lpThreadAttributes,
                            DWORD                  dwStackSize,
                            LPTHREAD_START_ROUTINE lpStartAddress,
                            LPVOID                 lpParameter,
                            DWORD                  dwCreationFlags,
                            LPDWORD                lpThreadId)
{
    #define INVALID_FLAGS (fTerminated | fTerminating | fNearlyTerminating)

    PPDB    pPDB, pPDBLocal;
    PTDB    pTDB;
    DWORD   dwThreadId, dwProcessId;
    HANDLE  hThread;
    BOOL    bInheritHandle = FALSE;
    DWORD   fLocal;
    DWORD   fFlags = 8; // Initial flag for InternalCreateRemoteThread()

    DWORD   StackSize;

    SetLastError(ERROR_INVALID_PARAMETER);

    if (lpThreadAttributes != NULL)
    {
        if (lpThreadAttributes->nLength != sizeof(SECURITY_ATTRIBUTES))
            return NULL;

        bInheritHandle = lpThreadAttributes->bInheritHandle;
    }

    if (!lpStartAddress)
        return NULL;

    // Get PID

    if (!(dwProcessId = _GetProcessId(hProcess)))
        return NULL;

    //Get PDB

    if (!(pPDB = GetPDB(dwProcessId)))
        return NULL;

    // Check process flags

    if (((PPDB95)pPDB)->Flags & INVALID_FLAGS)
    {
        SetLastError(ERROR_INVALID_PARAMETER);
        return NULL;
    }

    // Get local PDB

    if (!(pPDBLocal = GetPDB(-1)))
        return NULL;

    // Is the process local or remote?

    fLocal = pPDB == pPDBLocal;

    // Remote process (negate stack size)

    if (!fLocal)
    {
        if (dwStackSize == 0)
            StackSize = -(int)0x3000;   // Default stack size
        else
            StackSize = -(int)dwStackSize;
    }
    // Current process
    else
        StackSize = dwStackSize;

    // Always create thread suspended

    fFlags |= 0x40;

    // If process not initialized, suppress DLL_THREAD_ATTACH notification

    if (dwCreationFlags & CREATE_SILENT)
        fFlags |= 0x10;

    // Set correct system level

    EnterSysLevel(Win16Mutex);
    if (Krn32Mutex)
        EnterSysLevel(Krn32Mutex);

    // This function creates a new THREAD object and returns a pointer to it
    // (run thread)

    pTDB = InternalCreateRemoteThread(pPDB,             // PDB
                                      StackSize,        // Stack size
                                      lpStartAddress,   // Thread address
                                      lpParameter,      // Parameter passed to thread
                                      fFlags);          // Flags

    // Leave system level

    if (Krn32Mutex)
        LeaveSysLevel(Krn32Mutex);
    LeaveSysLevel(Win16Mutex);

    dwThreadId = (DWORD)pTDB ^ dwObsfucator;

    if (lpThreadId != NULL)
        *lpThreadId = dwThreadId;

    // Get thread handle

    hThread = OpenThread9x(THREAD_ALL_ACCESS, bInheritHandle, dwThreadId);

    // If thread created not suspended let it run

    if (!(dwCreationFlags & CREATE_SUSPENDED))
        ResumeThread(hThread);

    return hThread;
}

2. SetThreadContext()

This method involves the following steps:

Get a thread ID from the remote process using one of the available enumeration methods (see [6]).
Allocate memory within the remote process and copy code and data to it.
Save the suspend count of the thread to be restored later. SuspendThread() returns this value. On Win9x, this value can be retrieved directly from the TDB/TDBX ApiSuspendCount field.
Suspend the thread using SuspendThread().
Save the thread context (using GetThreadContext()) to be restored later.
Set the Eip (Instruction Pointer) field of the thread context data structure to point to our code. To ease the detection of the thread termination, the Eip is not set to the thread function directly, but instead it points to a stub code that will call the remote thread function. This code is defined as:
```
    push pParams
    call ThreadFunction
    jmp  $
```
After pushing the needed parameters and calling the thread function, the code enters an infinite loop.
Resume (i.e. run) the thread by using ResumeThread().
Compare the current instruction pointer (Eip) with the address of the jmp $ instruction. If it matches, we know that the thread function has terminated. The thread exit code is returned in the Eax register.
Restore the saved context and suspend count.
Release the allocated memory.
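The stub from the steps above has to exist as machine code before it can be copied to the remote process. The following is a minimal sketch of how its bytes could be assembled for 32-bit x86, assuming the standard encodings 68 (PUSH imm32), E8 (CALL rel32) and EB FE (JMP $). The names build_stub(), STUB_SIZE and STUB_LOOP_OFFSET are hypothetical and are not taken from the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STUB_SIZE        12   /* push imm32 (5) + call rel32 (5) + jmp $ (2) */
#define STUB_LOOP_OFFSET 10   /* offset of the "jmp $" instruction */

/* Stores a 32-bit value in little-endian byte order */
static void put_le32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
    p[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Builds the stub as it must look when placed at remote address stub_addr.
   Returns the remote address of "jmp $", i.e. the value Eip holds once the
   thread function has returned. */
uint32_t build_stub(unsigned char out[STUB_SIZE], uint32_t stub_addr,
                    uint32_t params_addr, uint32_t func_addr)
{
    assert(out != NULL);

    out[0] = 0x68;                                  /* push pParams */
    put_le32(out + 1, params_addr);

    out[5] = 0xE8;                                  /* call ThreadFunction */
    /* rel32 is relative to the instruction after the CALL (stub_addr + 10) */
    put_le32(out + 6, func_addr - (stub_addr + 10));

    out[10] = 0xEB;                                 /* jmp $ */
    out[11] = 0xFE;

    return stub_addr + STUB_LOOP_OFFSET;
}
```

The returned address is the value to compare against the Eip field when polling for termination of the thread function.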

With this method it's not really possible to fully emulate the CreateRemoteThread() function, so I present instead a clone of the RemoteExecute() function (which uses CreateRemoteThread() and is described below) called ContextRemoteExecute(). Note that suspending a running thread that belongs to another process can lead to deadlock, which makes this method very unsafe.

Other functions

The following functions do not exist in Windows and are only implemented to ease the manipulation of the undocumented internal structures and remote code injection.

GetTIB

This function returns a pointer to the Thread Information Block (TIB). The return value should be cast depending on the Windows version. The format of this data structure is defined in the file "struct.h". This structure is documented for Windows NT (defined in winnt.h). It contains stack and exception information used for SEH, Structured Exception Handling (used in C/C++ try/catch), among other fields. The DWORD at offset 0x18 contains the linear address of the TIB structure, so to retrieve a pointer to the TIB, use the following code:

    __asm mov eax, fs:[0x18]
    __asm mov pTIB, eax

It's necessary to use assembly because this data structure is addressed through the FS segment register (data references normally use the DS segment register). Note that in Windows NT, because the TIB structure is the first member of the TEB (Thread Environment Block), pTIB is also a pointer to the TEB. In Windows 9x, the TIB is part of the TDB (Thread Database); see the description of the TDB below. For a detailed explanation of the TIB internals, see [2].

GetTDB

This function returns a pointer to the Thread Database (TDB) for the specified thread ID (TID). The return value should be cast depending on the Windows version. Because the TDB exists only on the Windows 9x platform, the function returns NULL for Windows NT or higher. The format of this data structure is defined in the file "struct.h". The TDB is a system data structure that represents a thread object and contains information such as: PDB pointer, TIB, TLS array pointer, thread exit code, etc. To obtain a pointer to the TDB, just XOR the TID with the Obfuscator value:

    pTDB = TID ^ dwObsfucator;

This means that the TID is just an encrypted pointer to the TDB. For the current thread, the TDB can be retrieved from the TIB by subtracting the offset of the TIB inside the TDB structure:

    __asm mov  eax,fs:[0x18]
    __asm mov  pTDB,eax

    if (OSWin95)
       pTDB -= 0x10;

    else if (OSWin98)
       pTDB -= 8;

GetPDB

This function returns a pointer to the Process Database (PDB) for the specified process ID (PID). The return value should be cast depending on the Windows version. Because the PDB exists only on the Windows 9x platform, the function returns NULL for Windows NT or higher. The format of this data structure is defined in the file "struct.h". The PDB is a system data structure that represents a process object and contains information about a process including: list of threads and modules, the handle table, the environment database, etc. To obtain a pointer to the PDB, just XOR the PID with the Obfuscator value:

    pPDB = PID ^ dwObsfucator;

This means that the PID is just an encrypted pointer to the PDB. For the current process, the PDB can be retrieved from the pProcess field (offset 0x30) of the TIB:

    __asm mov eax, fs:[0x30]
    __asm mov pPDB, eax

GetObsfucator

In early versions of Windows 95, GetCurrentProcessId() and GetCurrentThreadId() returned direct pointers to the PDB and TDB. Later Microsoft changed this behaviour and encrypted the return value (XOR with a random value). In the debug build, this value is called "Obsfucator" ("Obfuscator" misspelled!). Because this is a random value (computed every time the system boots), it must be calculated at run time. From the properties of the XOR operator, we know that if A = B XOR C (PID = PDB XOR Obsfucator) then C = A XOR B (Obsfucator = PID XOR PDB). For the current process, we obtain the PID from GetCurrentProcessId() and the PDB from FS:[0x30], and therefore the Obsfucator can be obtained using the following code:

DWORD GetObsfucator()
{
    DWORD PID, PDB;

    PID = GetCurrentProcessId();

    __asm mov eax,fs:[0x30];  // PDB
    __asm mov PDB,eax

    return PDB ^ PID;
}
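The XOR relationship can be checked with a small portable model that treats pointers and IDs as plain 32-bit values. It is only an illustration: the function names are hypothetical, and on a real Win9x system the pointer comes from FS:[0x30] as shown above.

```c
#include <assert.h>
#include <stdint.h>

/* ID = pointer XOR obfuscator (PID from PDB, or TID from TDB) */
uint32_t id_from_pointer(uint32_t pointer, uint32_t obfuscator)
{
    assert(pointer != 0);   /* a NULL PDB/TDB never identifies an object */
    return pointer ^ obfuscator;
}

/* pointer = ID XOR obfuscator, because XOR is its own inverse */
uint32_t pointer_from_id(uint32_t id, uint32_t obfuscator)
{
    return id ^ obfuscator;
}

/* obfuscator = ID XOR pointer, computed from one known (ID, pointer) pair */
uint32_t recover_obfuscator(uint32_t id, uint32_t pointer)
{
    return id ^ pointer;
}
```

Once the value has been recovered from the current process, the same value decodes any other PID or TID until the next reboot, which is what GetPDB() and GetTDB() rely on.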

Because the TDB pointer is also encrypted, it's possible to obtain the Obsfucator from TDB XOR TID. For the current thread, the TID is obtained from GetCurrentThreadId(), and the TDB pointer from FS:[0x18] minus an offset value. (FS:[0x18] is the linear address of the TIB. Each TDB contains a TIB structure, so subtracting the offset of the TIB inside the TDB gives the starting address of the TDB.)

DWORD GetObsfucator()
{
    DWORD TID, TDB;

    TID = GetCurrentThreadId();

    __asm mov  eax,fs:[0x18];  // TIB
    __asm mov  TDB,eax

    if (OSWin95)
        TDB -= 0x10;
    else if (OSWin98)
        TDB -= 8;

    return TDB ^ TID;
}

RemoteExecute

The RemoteExecute() function is used to execute code in the context of a remote process. It has the following declaration:

    BOOL RemoteExecute(HANDLE hProcess,
                       LPTHREAD_START_ROUTINE Function,
                       PVOID pParams,
                       DWORD Size);

hProcess is the handle of the remote process, Function is the address of the thread function, and pParams and Size are the pointer to and size of the optional data structure to be passed to the thread function. The function uses _VirtualAllocEx(), WriteProcessMemory() and _CreateRemoteThread() to allocate memory, copy code and data to the remote process, and execute the remote thread. It returns TRUE if everything went OK.

To run successfully within the remote process, the thread function must be coded following certain rules: it cannot use any absolute addressing (calls, jumps, data references, ...) at all! Why? Because these are hard-coded addresses that reference data or code in the running process and become invalid when the code is moved to the remote process address space. This means that it's not possible to call external functions (including Windows DLLs and the standard library, like strlen() or printf()) or to reference static data like strings or structures. If needed, any function reference or data must be passed in a structure pointed to by the pParams parameter. Inside RemoteExecute(), there's a function (IsCodeSafe()) that disassembles the thread function code and checks for absolute addressing, but this check is not 100% reliable, so as a final piece of advice, always check the generated code (using a disassembler or debugger) to catch any invalid references. If you are using Visual C++ (v 6.0), you can follow these tips:

Turn on file listing and check the generated code: Project\Settings\C/C++\Listing Files\Listing file type=Assembly, Machine Code, and Source.
Turn off stack probes. Check for __chkstk() references in the listing files.
  1. Use #pragma check_stack(off).
  2. Use less than 4K of local variables.
  3. Raise the stack probe threshold: /Gs size (Project\Settings\C/C++\Project Options).
Remove the /GZ switch in the debug build. Check for __chkesp() references in the listing files.
  1. Project\Settings\C/C++\Project Options
Disable incremental compilation (/Gi).
  1. Use #pragma comment(linker, "/INCREMENTAL:NO").
  2. Remove the /Gi switch (Project\Settings\C/C++\Customize\Enable incremental compilation=Off).
  3. Declare the functions as static.
Don't call any functions besides those in KERNEL32.DLL. Use LoadLibrary()/GetProcAddress() if you need functions from other libraries.
Don't use any static strings. Pass them in a structure pointed to by pParams.
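The rule of passing every function reference and every piece of data through pParams can be illustrated with a portable sketch that runs the thread function locally. All names here (PARAMS, thread_function(), run_locally()) are hypothetical. In a real injection, the function pointer would have to hold an address that is valid in the target process, and the structure would be copied there by RemoteExecute().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef size_t (*LENGTH_FN)(const char *);

/* Everything the thread function needs travels inside this block */
typedef struct {
    LENGTH_FN pfnLength;   /* function address resolved by the caller */
    char      text[32];    /* string data copied into the block */
    size_t    result;      /* filled in by the thread function */
} PARAMS;

/* Touches nothing but its parameter block: no direct calls to external
   functions and no static data, as required for relocated code. */
unsigned long thread_function(void *p)
{
    PARAMS *params = (PARAMS *)p;
    params->result = params->pfnLength(params->text);
    return 0;
}

/* Local driver standing in for RemoteExecute(): fills the block, runs the
   function in the current process and returns the computed result. */
size_t run_locally(const char *text)
{
    PARAMS        params;
    unsigned long exit_code;

    memset(&params, 0, sizeof params);
    params.pfnLength = strlen;
    strncpy(params.text, text, sizeof params.text - 1);   /* stays terminated */

    exit_code = thread_function(&params);
    assert(exit_code == 0);

    return params.result;
}
```

Keeping the string inside the structure, rather than behind a pointer, means a single copy of Size bytes carries everything the remote code needs.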

StartRemoteSubclass/StopRemoteSubclass

These two functions are used to subclass a window procedure in a remote process. They are declared as:

   BOOL StartRemoteSubclass(PRDATA rd, USERWNDPROC WndProc);
   BOOL StopRemoteSubclass(PRDATA rd);

rd is the address of a RDATA structure that contains important information to be passed to these functions:

   typedef struct _RDATA {
          int             Size;                 // Size of structure
          HANDLE          hProcess;             // Process handle
          HWND            hWnd;                 // Window handle
          struct _RDATA   *pRDATA;              // Pointer to RDATA structure
          WNDPROC         pfnNewWndProc;        // Addr. of new window handler
          WNDPROC         pfnOldWndProc;        // Addr. of old window handler
          USERWNDPROC     pfnUserWndProc;       // Addr. of user's proc. handler
          LRESULT         Result;               // Result from user's proc. handler
          SETWINDOWLONG   pfnSetWindowLong;     // Address of SetWindowLong()
          CALLWINDOWPROC  pfnCallWindowProc;    // Address of CallWindowProc()
   } RDATA, *PRDATA;

To pass extra data to the new window procedure, append it to the RDATA structure. Before calling StartRemoteSubclass(), initialize the following RDATA fields: Size must hold the size of the RDATA structure plus any appended data, hProcess the handle of the remote process, and hWnd the handle of the window to be subclassed. Initialize the fields of the appended data at this point as well. Treat all remaining fields as private and do not modify them. WndProc is the address of the new window procedure and is declared as:

    LRESULT WINAPI WndProc(PRDATA pData,
                           HWND   hWnd,
                           UINT   Msg,
                           WPARAM wParam,
                           LPARAM lParam);

Except for the first parameter (a pointer to the RDATA structure), the parameters are the usual window handle, message identifier, wParam and lParam of any window procedure. Windows calls the new window procedure every time a message for the window must be processed, so the function should be coded like a "normal" window procedure (with a switch(Msg) statement). Because this function executes in the remote process, it must follow the same rules as any remotely executed code (see the description of the RemoteExecute() function above). To pass a message on to the default window procedure, return FALSE. To process a message yourself, store the return value in the Result field of the RDATA structure and return TRUE.
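
The following minimal sketch shows the TRUE/FALSE contract and the appended-data layout described above. It is an illustration under stated assumptions, not code from the library: MYDATA, InitMyData() and UserWndProc() are hypothetical names, RDATA is trimmed to the fields the sketch touches, and the typedefs are stand-ins so that the fragment builds without <windows.h>. A real handler would also carry the WINAPI calling convention.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in types so the sketch builds without <windows.h>
   or the library header. */
typedef void        *HANDLE;
typedef void        *HWND;
typedef unsigned int UINT;
typedef uintptr_t    WPARAM;
typedef intptr_t     LPARAM;
typedef intptr_t     LRESULT;
#define TRUE    1
#define FALSE   0
#define WM_CHAR 0x0102

/* Trimmed RDATA: only the fields this sketch touches. */
typedef struct _RDATA {
    int     Size;
    HANDLE  hProcess;
    HWND    hWnd;
    LRESULT Result;
} RDATA, *PRDATA;

/* Hypothetical appended data: RDATA first, extra fields after it. */
typedef struct MYDATA {
    RDATA rd;
    int   CharCount;
} MYDATA;

/* Local side: fill in the public fields before StartRemoteSubclass(). */
static void InitMyData(MYDATA *d, HANDLE hProcess, HWND hWnd)
{
    d->rd.Size     = (int)sizeof(MYDATA);  /* RDATA plus appended data */
    d->rd.hProcess = hProcess;
    d->rd.hWnd     = hWnd;
    d->CharCount   = 0;                    /* appended field */
}

/* Remote side: counts characters and swallows the letter 'x'. */
static LRESULT UserWndProc(PRDATA pData, HWND hWnd, UINT Msg,
                           WPARAM wParam, LPARAM lParam)
{
    MYDATA *d = (MYDATA *)pData;   /* valid because rd is the first member */
    (void)hWnd; (void)lParam;

    switch (Msg) {
    case WM_CHAR:
        d->CharCount++;
        if (wParam == 'x') {
            pData->Result = 0;     /* value handed back to Windows    */
            return TRUE;           /* handled: default handler skipped */
        }
        break;
    }
    return FALSE;                  /* unhandled: default handler runs */
}
```

With the real header, the address of d.rd and UserWndProc would be the two arguments passed to StartRemoteSubclass(). Placing RDATA first in MYDATA is what lets the handler recover the appended fields from the pData pointer it receives.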

Demo Code

Finally, to tie everything together, I wrote an application that demonstrates how to use the functions exported by the Remote library. Here is a brief description of what the code does:

Display Windows version.
Launch "Notepad.exe" using CreateProcess().
Allocate a 0x1000-byte buffer using _VirtualAllocEx().
Write to the allocated memory using WriteProcessMemory() or memset() (Win9x only).
Free the allocated memory using _VirtualFreeEx().
Convert the process handle to the PID by using _GetProcessId().
Convert the thread handle to the TID by using _GetThreadId().
Use _CreateRemoteThread() to run a thread in the current process. The thread just sleeps for 3 seconds and returns the lpParameter value.
Use _OpenThread() to return a new handle to the created thread. This handle is used by WaitForSingleObject() and GetExitCodeThread().
Call ContextRemoteExecute() to run a thread in the remote process. The thread just copies a message string to the "Notepad" edit window.
Call RemoteExecute() to run a thread in the remote process. The thread just copies a message string to the "Notepad" edit window.
Subclass the "Notepad" edit window using StartRemoteSubclass(). If you press the key sequence "remote" in the "Notepad" edit window, a message box is displayed.
After 30 seconds, restore the original "Notepad" window handler by using StopRemoteSubclass().
Wait for the "Notepad" process to finish.
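
The subclassing step in the list above reacts to the key sequence "remote". One way such a detector could be written under the remote-code rules is sketched below; the demo's actual implementation may differ. KEYSEQ, KeySeqInit() and KeySeqFeed() are hypothetical names. The pattern is copied into the data block on the local side, so the remote-side function needs no static string and calls no library function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical matcher state, meant to be appended to RDATA. */
typedef struct KEYSEQ {
    char Pattern[16];  /* sequence to detect, copied in by the caller */
    int  Length;       /* number of characters in Pattern             */
    int  Matched;      /* characters matched so far                   */
} KEYSEQ;

/* Local side: may use string literals and the C runtime freely. */
static void KeySeqInit(KEYSEQ *ks, const char *pattern)
{
    size_t n = strlen(pattern);
    if (n >= sizeof ks->Pattern)
        n = sizeof ks->Pattern - 1;   /* truncate overlong patterns */
    memcpy(ks->Pattern, pattern, n);
    ks->Pattern[n] = '\0';
    ks->Length  = (int)n;
    ks->Matched = 0;
}

/* Remote side: feed one typed character, returns 1 when the whole
   sequence has just been completed. The simple restart rule is only
   correct for patterns in which no prefix reappears later, which
   holds for "remote". */
static int KeySeqFeed(KEYSEQ *ks, char ch)
{
    if (ks->Length <= 0)
        return 0;
    if (ch == ks->Pattern[ks->Matched])
        ks->Matched++;
    else
        ks->Matched = (ch == ks->Pattern[0]) ? 1 : 0;

    if (ks->Matched == ks->Length) {
        ks->Matched = 0;              /* re-arm for the next occurrence */
        return 1;
    }
    return 0;
}
```

A window procedure handling WM_CHAR could pass each character to KeySeqFeed() and display its message box when the function returns 1.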

Note

This library was tested with the following Windows versions:

Windows 95 (4.00.1111B)
Windows 98 (4.10.1998/4.10.2222A)
Windows ME (4.90.3000)
Windows NT 4.0 (4.0.1381 SP1/SP3/SP6)
Windows 2000 (5.0.2195 SP2/SP4)
Windows XP (5.1.2600)

Because the code relies heavily on undocumented functions and structures, it can malfunction (or even crash) if run on an untested version. In that case, you should modify the code to account for the idiosyncrasies of the new version.

N.B. Windows 2000 SP3 is known for having several bugs. Because of this, some of the presented code doesn't work correctly on this version (in particular, the function ContextRemoteExecute()).

References

[1] "Windows 95 System Programming Secrets" by Matt Pietrek
[2] "Under the Hood" by Matt Pietrek, MSJ, May 1996
[3] "DLL Injection on Win32 platforms" by Yoda
[4] "Three ways to inject your code into another process" by Robert Kuster
[5] "The Undocumented Functions" by NTinternals
[6] "Enumerating Windows Processes" by Alex Fedotov

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
