How to Use C Compiler for Reverse Engineering

27 Jul 2020 | CPOL | 18 min read
This article shows how to bring the power of a real C compiler to reverse engineering.
Make your reverse engineering effort successful: switch to real C and say no to pseudocode.

Introduction

We can use IDA to examine assembly code and translate the disassembly to pseudo C code by hand. We can even use the Hex-Rays decompiler. However, no matter how good the pseudocode is, it is still pseudocode: it cannot be compiled and tested. The chance of errors and ambiguities grows with the amount of pseudocode, and eventually we can get lost.

I thought about another method of reverse engineering. We will still use IDA to inspect the disassembly, but we will translate it to real C code function by function, and force the inspected program to use our code instead. We will accumulate our code in a separate DLL, which plays the same role as IDA's database does for the inspected executable. This way, we stand on the solid ground of C (type system, clear names, function prototypes). We can write something that actually builds, and we can run the program to see if we got it right. If we made errors, the program will probably crash.

As we advance through the reverse engineering process, we can translate more functions and rename struct fields, functions, and variables in our C code as their purpose becomes clear. We can also update IDA's database to keep the two in sync. I think this increases our chances of achieving our reverse engineering goals, especially for large programs that contain thousands of functions.

Let's See the Idea

I think it's preferable to run the program as usual, without tricks like CreateProcess with the suspended flag, DLL injection, etc. The idea is to create a dlc DLL for each image (EXE, DLL) we are interested in. This dlc DLL will be loaded together with the image (right after it), and will replace original functions with jumps to our own functions (inside the dlc DLL). How can we force the dlc DLL to load? That's simple: we modify the image's import descriptor (in real-world programs, an image always imports functions from at least one module). We also need to make sure that the image's code section has write permission.

Let's demonstrate with a simple 32-bit example. It will be an empty console application (it just returns a number).

main.c:

C++
int main(int argc, char *argv[])
{
    return 0x10203040;                // to find function in IDA disassembly
}

Compile the EXE image and look at its imports:

dumpbin /imports main.exe

0

1

We can see that our EXE imports functions from kernel32.dll. So we need to build a dlc DLL that exports all these functions and forwards them to kernel32.dll. Create an empty DLL project and add a module definition file with all required functions. It's easier to redirect the dumpbin output to a file:

dumpbin /imports main.exe > main.txt

so you can easily copy the functions to the module definition file. In my case, the def file looks like this:

LIBRARY DLC
EXPORTS
                  GetModuleHandleW                 = kernel32.GetModuleHandleW
                  GetModuleFileNameW               = kernel32.GetModuleFileNameW
                  FreeLibrary                      = kernel32.FreeLibrary
                  VirtualQuery                     = kernel32.VirtualQuery
                  GetProcessHeap                   = kernel32.GetProcessHeap
                  HeapFree                         = kernel32.HeapFree
                  HeapAlloc                        = kernel32.HeapAlloc
                  WideCharToMultiByte              = kernel32.WideCharToMultiByte
                  MultiByteToWideChar              = kernel32.MultiByteToWideChar
                  LoadLibraryExW                   = kernel32.LoadLibraryExW
                  GetProcAddress                   = kernel32.GetProcAddress
                  GetLastError                     = kernel32.GetLastError
                  RaiseException                   = kernel32.RaiseException
                  IsDebuggerPresent                = kernel32.IsDebuggerPresent
                  DecodePointer                    = kernel32.DecodePointer
                  GetSystemTimeAsFileTime          = kernel32.GetSystemTimeAsFileTime
                  GetCurrentThreadId               = kernel32.GetCurrentThreadId
                  GetCurrentProcessId              = kernel32.GetCurrentProcessId
                  QueryPerformanceCounter          = kernel32.QueryPerformanceCounter
                  EncodePointer                    = kernel32.EncodePointer
                  IsProcessorFeaturePresent        = kernel32.IsProcessorFeaturePresent

Add our own main function:

C++
int main(int argc, char *argv[])
{
    printf("You forgot Hello World!\n");
    getchar();
    return 0x10203040;
}

Now open main.exe with IDA. We need the address of the original main function, so we know where to insert the jump to our own main.

2

3

So we can see that the address of the original main function is:

0x411380

If the function has another name (without an address), we can always rename it (delete its name), or open the function info window to see its address. Note that IDA assumes the default image base. For 32-bit images, it is:

0x400000

So during initial autoanalysis, IDA generates function and global variable names with this base in mind. However, IDA may rebase the program later, right before debugging starts, and then all autogenerated names change. In that case, you need to rebase the program back to the default base address:

Edit -> Segments -> Rebase program...

Here is an example of a function after a debugging session:

4

Now let's rebase to the default base address:

5

6

7

And all function names are OK again.

Back to the point. To get the runtime address of a function, we add its offset from the image base (luckily, the offset doesn't change) to the runtime image base:

runtime function address = runtime image base + (IDA's function address - default image base)

runtime function address = GetModuleHandle(NULL) + (0x411380 - 0x400000)
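The formula above can be written as a small helper and checked without running the target. This is a minimal sketch in portable C, not the code used by the dlc DLL; the function name dlc_runtime_address and its parameter names are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Translate an address shown by IDA (relative to the default image base)
   into the address the same item has in the running process. */
uintptr_t dlc_runtime_address(uintptr_t runtime_base,
                              uintptr_t default_base,
                              uintptr_t ida_address)
{
    /* The item must lie at or above the default base, otherwise the
       offset below would wrap around. */
    assert(ida_address >= default_base);
    return runtime_base + (ida_address - default_base);
}
```

If the image happens to load at its default base, the result equals the IDA address; if it loads elsewhere, only the base part changes and the offset 0x11380 stays the same.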

In our dlc DLL, we will have:

C++
const unsigned int DefExeBase = 0x400000;
BYTE *g_ExeBase;

void DLCInit()
{
    g_ExeBase = (BYTE*)GetModuleHandleA(NULL);
    DLCReplaceFunction(g_ExeBase + 
    (0x411380 - DefExeBase), (BYTE*)main);   // 411380 is address of function in IDA
}

We will call DLCInit inside DllMain (on the DLL_PROCESS_ATTACH event). What is this DLCReplaceFunction? Let's see how to jump from the original function to our own. To achieve this, we write the following structure at the beginning of the original function:

C++
#pragma pack(push, 1)
typedef struct _Sorry       // 6 bytes of space needed
{
    struct
    {
        BYTE Opcode;
        INT Value;
    } Push;                 // push func
    struct
    {
        BYTE Opcode;
    } Ret;                  // retn
} Sorry;
#pragma pack(pop)

We push the address of our own function onto the stack, and retn pops it and jumps there. This is the so-called "backward function call". It takes only 6 bytes to encode, all registers are preserved, and it uses a direct (absolute) address. So why not? Now the DLCReplaceFunction:

C++
void DLCReplaceFunction(BYTE *OldFunc, BYTE *NewFunc)
{
    Sorry *s = (Sorry*)OldFunc;
    s->Push.Opcode = 0x68;
    s->Push.Value = (INT)NewFunc;
    s->Ret.Opcode = 0xC3;
}

We overwrite the start of the original function with this structure. You can see some x86 opcode values here. How do we know them in the first place? The best way to find the opcode bytes is to use NASM. Write down the instructions you are interested in:

op.asm:

[BITS 32]

push 0x10203040     ; to clearly see the value in assembled code
retn

Compile raw binary:

nasm -f bin op.asm

8

And examine it in hex editor:

9

It's more convenient to use NASM: there is no need to consult opcode tables, and mistakes are less likely.
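The byte sequence for push/retn can also be cross-checked in plain C, independent of structure packing. The sketch below is an illustration only and is not part of the dlc DLL; encode_push_ret and STUB32_SIZE are hypothetical names. It writes the immediate byte by byte in little-endian order, which is how x86 stores it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { STUB32_SIZE = 6 };   /* push imm32 (5 bytes) + retn (1 byte) */

/* Write "push imm32 / retn" into out[0..5]. */
void encode_push_ret(uint8_t *out, uint32_t target)
{
    assert(out != NULL);
    out[0] = 0x68;                              /* push imm32 */
    out[1] = (uint8_t)(target & 0xFF);          /* imm32, lowest byte first */
    out[2] = (uint8_t)((target >> 8) & 0xFF);
    out[3] = (uint8_t)((target >> 16) & 0xFF);
    out[4] = (uint8_t)((target >> 24) & 0xFF);
    out[5] = 0xC3;                              /* retn */
}
```

Encoding the marker value 0x10203040 gives the bytes 68 40 30 20 10 C3, the same opcodes (0x68 and 0xC3) that DLCReplaceFunction writes.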

Let's see the full code of the dlc DLL:

C++
#include <Windows.h>
#include <stdio.h>

#pragma pack(push, 1)
typedef struct _Sorry       // 6 bytes of space needed
{
    struct
    {
        BYTE Opcode;
        INT Value;
    } Push;                 // push func
    struct
    {
        BYTE Opcode;
    } Ret;                  // retn
} Sorry;
#pragma pack(pop)

const unsigned int DefExeBase = 0x400000;
BYTE *g_ExeBase;

void DLCReplaceFunction(BYTE *OldFunc, BYTE *NewFunc)
{
    Sorry *s = (Sorry*)OldFunc;
    s->Push.Opcode = 0x68;
    s->Push.Value = (INT)NewFunc;
    s->Ret.Opcode = 0xC3;
}

int main(int argc, char *argv[])
{
    printf("You forgot Hello World!\n");
    getchar();
    return 0x10203040;
}

void DLCInit()
{
    g_ExeBase = (BYTE*)GetModuleHandleA(NULL);
    DLCReplaceFunction(g_ExeBase + 
    (0x411380 - DefExeBase), (BYTE*)main);   // 411380 is address of function in IDA
}

BOOL APIENTRY DllMain( HMODULE hModule,
                       DWORD  ul_reason_for_call,
                       LPVOID lpReserved
                     )
{
    switch (ul_reason_for_call)
    {
    case DLL_PROCESS_ATTACH:
        DLCInit();
        break;
    case DLL_THREAD_ATTACH:
    case DLL_THREAD_DETACH:
    case DLL_PROCESS_DETACH:
        break;
    }
    return TRUE;
}

Now build the dlc DLL and copy it to main.exe's folder. The next step is to modify main.exe's import descriptor (we decided to abuse the import descriptor for kernel32.dll) and the code section permissions (by default, the code section has only execute/read permissions). We will use a small utility from this Stack Overflow question:

We will make small modifications to get the information we need:

C++
#include <Windows.h>
#include <stdio.h>      // printf, getchar
#include <tchar.h>      // _tmain, _TCHAR

DWORD Rva2Offset(DWORD rva, PIMAGE_SECTION_HEADER psh, PIMAGE_NT_HEADERS pnt);
int _tmain(int argc, _TCHAR* argv[])
{
//>>>>> Change file name
    LPCWSTR fileName = L"E:\\Reverse\\HULK2\\Main\\Debug\\Main.exe";
//<<<<<
    // The path is a wide string, so call the wide-character variant explicitly
    HANDLE handle = CreateFileW(fileName, 
                    GENERIC_READ, 0, 0, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0);
    DWORD byteread, size = GetFileSize(handle, NULL);
    PVOID virtualpointer = VirtualAlloc(NULL, size, MEM_COMMIT, PAGE_READWRITE);
    ReadFile(handle, virtualpointer, size, &byteread, NULL);
    CloseHandle(handle);
    // Get pointer to NT header
    PIMAGE_NT_HEADERS           ntheaders = (PIMAGE_NT_HEADERS)(PCHAR(virtualpointer) + 
                                             PIMAGE_DOS_HEADER(virtualpointer)->e_lfanew);
    PIMAGE_SECTION_HEADER       pSech = IMAGE_FIRST_SECTION(ntheaders);//Pointer to 
                                                                // first section header
    PIMAGE_IMPORT_DESCRIPTOR    pImportDescriptor; //Pointer to import descriptor 

//>>>>> Add this block (note: executable may have multiple code sections, 
//we don't consider this case here)
    for (int i = 0; i < ntheaders->FileHeader.NumberOfSections; ++i)
    {
        printf("Section: %.8s\n", pSech->Name);  // Name is 8 bytes, not always NUL-terminated
        if (!_stricmp((char*)pSech->Name, ".text"))  // case insensitive compare
        {
            UINT d = (PCHAR)(&pSech->Characteristics) - (PCHAR)virtualpointer;
            printf("Code Section Permissions FileOffset: %X\n", d);
            getchar();
        }
        ++pSech;
    }
    pSech = IMAGE_FIRST_SECTION(ntheaders);
//<<<<<
    __try
    {
        if (ntheaders->OptionalHeader.DataDirectory
            [IMAGE_DIRECTORY_ENTRY_IMPORT].Size != 0)/*if size of the table 
                                                       is 0 - Import Table does not exist */
        {
            pImportDescriptor = (PIMAGE_IMPORT_DESCRIPTOR)((DWORD_PTR)virtualpointer + \
                Rva2Offset(ntheaders->OptionalHeader.DataDirectory
                [IMAGE_DIRECTORY_ENTRY_IMPORT].VirtualAddress, pSech, ntheaders));
            LPSTR libname[256];
            size_t i = 0;
            // Walk until you reached an empty IMAGE_IMPORT_DESCRIPTOR
            while (pImportDescriptor->Name != NULL)
            {
                printf("Library Name   :");
                //Get the name of each DLL
                libname[i] = (PCHAR)((DWORD_PTR)virtualpointer + 
                              Rva2Offset(pImportDescriptor->Name, pSech, ntheaders));
                printf("%s\n", libname[i]);

//>>>>>> Add this block
                if (!_stricmp(libname[i], "kernel32.dll"))    // case insensitive compare
                {
                    UINT d = libname[i] - (PCHAR)virtualpointer;
                    printf("DLL Name FileOffset: %X\n", d);
                    getchar();
                }

                PIMAGE_THUNK_DATA ThunkData = (PIMAGE_THUNK_DATA)((DWORD_PTR)virtualpointer + 
                    Rva2Offset(pImportDescriptor->OriginalFirstThunk, pSech, ntheaders));

                while (ThunkData->u1.AddressOfData)
                {
                    PIMAGE_IMPORT_BY_NAME ImportByName = (PIMAGE_IMPORT_BY_NAME)
                      ((DWORD_PTR)virtualpointer + 
                      Rva2Offset(ThunkData->u1.AddressOfData, pSech, ntheaders));
                    printf("\t%s\n", ImportByName->Name);
                    ++ThunkData;
                }
//<<<<<<
                pImportDescriptor++; //advance to next IMAGE_IMPORT_DESCRIPTOR
                i++;

            }

        }
        else
        {
            printf("No Import Table!\n");
            return 1;
        }
    }
    __except (EXCEPTION_EXECUTE_HANDLER)
    {
        if (EXCEPTION_ACCESS_VIOLATION == GetExceptionCode())
        {
            printf("Exception: EXCEPTION_ACCESS_VIOLATION\n");
            return 1;
        }

    }
    if (virtualpointer)
        VirtualFree(virtualpointer, size, MEM_DECOMMIT);

//>>>>>> Add getchar
    getchar();
//<<<<<<
    return 0;
}
/*Convert Virtual Address to File Offset */
DWORD Rva2Offset(DWORD rva, PIMAGE_SECTION_HEADER psh, PIMAGE_NT_HEADERS pnt)
{
    size_t i = 0;
    PIMAGE_SECTION_HEADER pSeh;
    if (rva == 0)
    {
        return (rva);
    }
    pSeh = psh;
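    for (i = 0; i < pnt->FileHeader.NumberOfSections; i++)
    {
        if (rva >= pSeh->VirtualAddress && rva < pSeh->VirtualAddress +
            pSeh->Misc.VirtualSize)
        {
            // RVA lies inside this section: translate it to a file offset
            return (rva - pSeh->VirtualAddress + pSeh->PointerToRawData);
        }
        pSeh++;
    }
    return 0;   // RVA is not inside any section; don't read past the last header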
}

Since we are digging into a 32-bit executable, we should build this tool as 32-bit too. Otherwise, the 64-bit PE structure definitions will be used, and the tool won't work correctly. Let's run it and see the file offsets we need to patch:

10

11

So the file offset of the code section permissions is:

0x22C

and the file offset of the kernel32 DLL name we are going to replace is:

0x68A4
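As noted above, the tool has to match the bitness of the inspected image. One way to tell which kind of image a file is, without relying on the Windows headers, is to read the Magic field at the start of the optional header: 0x10B marks PE32 (32-bit) and 0x20B marks PE32+ (64-bit). The following is a minimal sketch that works on a raw file buffer; pe_bitness and the small read helpers are hypothetical names, not part of the tool above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Little-endian reads from a byte buffer. */
static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 32 for PE32, 64 for PE32+, 0 if the buffer does not look like a PE file. */
int pe_bitness(const uint8_t *file, size_t size)
{
    if (file == NULL || size < 0x40) return 0;          /* too small for a DOS header */
    if (file[0] != 'M' || file[1] != 'Z') return 0;     /* DOS signature */

    uint32_t e_lfanew = rd32(file + 0x3C);              /* offset of the NT headers */
    /* Need: signature (4) + IMAGE_FILE_HEADER (20) + Magic (2). */
    if (e_lfanew > size || size - e_lfanew < 4 + 20 + 2) return 0;

    const uint8_t *nt = file + e_lfanew;
    if (nt[0] != 'P' || nt[1] != 'E' || nt[2] != 0 || nt[3] != 0) return 0;

    uint16_t magic = rd16(nt + 4 + 20);                 /* OptionalHeader.Magic */
    if (magic == 0x10B) return 32;
    if (magic == 0x20B) return 64;
    return 0;
}
```

A tool could call such a check right after reading the file and refuse to continue when the result does not match its own build.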

Create a copy of main.exe and open it in a hex editor (I use Hex Editor Neo):

12

13

You can press Ctrl + G, choose absolute offset, and enter the required offset value to land in the right place. We need to set the write flag in the code section characteristics. See the IMAGE_SECTION_HEADER struct and the Characteristics flags here:

Characteristics is a 4-byte DWORD field. We see that it contains the following bytes:

20 00 00 60

The byte order is little-endian, so the value is:

0x60000020

In human-readable form:

0x40000000 | 0x20000000 | 0x00000020

IMAGE_SCN_MEM_READ | IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_CNT_CODE

We will add IMAGE_SCN_MEM_WRITE:

0x80000000 | 0x40000000 | 0x20000000 | 0x00000020

IMAGE_SCN_MEM_WRITE | IMAGE_SCN_MEM_READ | IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_CNT_CODE

So we need this value:

0xE0000020

And these bytes:

20 00 00 E0

Here it is:

14
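The flag arithmetic above can be reproduced in a few lines of C. This is a sketch for checking the numbers, not a patching tool; the SCN_* macros repeat the values of the corresponding IMAGE_SCN_* constants so that the code builds without Windows headers, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values of IMAGE_SCN_CNT_CODE, IMAGE_SCN_MEM_EXECUTE, IMAGE_SCN_MEM_READ
   and IMAGE_SCN_MEM_WRITE, repeated here for portability. */
#define SCN_CNT_CODE    0x00000020u
#define SCN_MEM_EXECUTE 0x20000000u
#define SCN_MEM_READ    0x40000000u
#define SCN_MEM_WRITE   0x80000000u

/* Read a little-endian DWORD from a file buffer. */
static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Write a little-endian DWORD into a file buffer. */
static void store_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)((v >> 8) & 0xFF);
    p[2] = (uint8_t)((v >> 16) & 0xFF);
    p[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* Set the write flag in a 4-byte Characteristics field; returns the new value. */
uint32_t make_section_writable(uint8_t *field)
{
    assert(field != NULL);
    uint32_t flags = load_le32(field) | SCN_MEM_WRITE;
    store_le32(field, flags);
    return flags;
}
```

Starting from the bytes 20 00 00 60, the function returns 0xE0000020 and leaves 20 00 00 E0 in the buffer, so only the last byte of the field changes.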

Now let's change kernel32.dll to dlc.dll:

15
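When the DLL name is replaced in place, the new name must not be longer than the old one, and the leftover bytes should be zeroed so the string stays NUL-terminated ("dlc.dll" is shorter than "kernel32.dll", so it fits). The sketch below shows this on a memory buffer holding the file contents; patch_import_name is a hypothetical helper, and reading and writing the file itself is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Overwrite the NUL-terminated name at image+offset with new_name.
   Returns 0 on success, -1 if the offset is out of range, the old name is
   not terminated inside the buffer, or the new name is longer than the old. */
int patch_import_name(unsigned char *image, size_t image_size,
                      size_t offset, const char *new_name)
{
    assert(image != NULL && new_name != NULL);
    if (offset >= image_size) return -1;

    /* Length of the existing name, bounded by the end of the buffer. */
    const unsigned char *end = memchr(image + offset, 0, image_size - offset);
    if (end == NULL) return -1;
    size_t old_len = (size_t)(end - (image + offset));

    size_t new_len = strlen(new_name);
    if (new_len > old_len) return -1;       /* would overwrite neighbouring data */

    memset(image + offset, 0, old_len);     /* clear the old name */
    memcpy(image + offset, new_name, new_len);
    return 0;
}
```

With the offset from the example (0x68A4) and the new name "dlc.dll", the twelve bytes of the old name become "dlc.dll" followed by zeros, and nothing after the original terminator is touched.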

And finally, let's run our patched main.exe:

16

We can see that our main function gets called, not the old one (it is overwritten with our jump).

Now let's repeat the same example for 64-bit. Compile main as 64-bit and dump the imports. For me, the list of imported functions didn't change (well, it shouldn't), so the module definition file of our dlc DLL remains the same.

Now back to IDA. I want to mention that later versions of IDA are 64-bit, so we can debug 64-bit applications natively. I know for certain that IDA 6.7 is still 32-bit and IDA 7.2 is already 64-bit; I don't know exactly in which version this significant change occurred.

Open main.exe with IDA. We need the address of the original main function, so we know where to insert the jump to our own main.

17

18

So we can see that the address of the original main function is:

0x140001010

If the function has another name (without an address), we can always rename it (delete its name), or open the function info window to see its address. Note that IDA assumes the default image base. For 64-bit images, it is:

0x140000000

In our dlc DLL, we will have:

C++
const unsigned long long DefExeBase = 0x140000000;
BYTE *g_ExeBase;

void DLCInit()
{
    g_ExeBase = (BYTE*)GetModuleHandleA(NULL);
    DLCReplaceFunction(g_ExeBase + 
    (0x140001010 - DefExeBase), (BYTE*)main);  // 140001010 is address of function in IDA
}

We will call DLCInit inside DllMain (on the DLL_PROCESS_ATTACH event). What is this DLCReplaceFunction? Let's see how to jump from the original function to our own. To achieve this, we write the following structure at the beginning of the original function:

C++
#pragma pack(push, 1)
typedef struct _Sorry            // 12 bytes of space needed, rax value is lost
{
    struct
    {
        BYTE Force64bit;
        BYTE Opcode;
        INT64 Value;
    } MovToRax;                  // mov rax, func
    struct
    {
        BYTE Opcode;
        BYTE Reg;
    } JmpRax;                    // jmp rax
} Sorry;
#pragma pack(pop)

We move the address of our own function into the rax register and jump through it. It takes 12 bytes to encode, all registers except rax are preserved, and it uses a direct (absolute) address. So why not? Now the DLCReplaceFunction:

C++
void DLCReplaceFunction(BYTE *OldFunc, BYTE *NewFunc)
{
    Sorry *s = (Sorry*)OldFunc;
    s->MovToRax.Force64bit = 0x48;
    s->MovToRax.Opcode = 0xb8;
    s->MovToRax.Value = (INT64)NewFunc;
    s->JmpRax.Opcode = 0xff;
    s->JmpRax.Reg = 0xe0;
}

We overwrite the start of the original function with this structure. You can see some x64 opcode values here. As before, write down the instructions you are interested in:

op.asm:

[BITS 64]

mov rax, 0x1020304050607080             ; to clearly see the value in assembled code
jmp rax

Compile raw binary:

nasm -f bin op.asm

19

And examine it in hex editor:

20
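As with the 32-bit stub, the 64-bit byte sequence can be cross-checked in plain C. This is an illustrative sketch, not part of the dlc DLL; encode_mov_rax_jmp and STUB64_SIZE are hypothetical names. The bytes are 48 (REX.W prefix), B8 (mov rax, imm64), the 8-byte immediate in little-endian order, then FF E0 (jmp rax).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { STUB64_SIZE = 12 };  /* mov rax, imm64 (10 bytes) + jmp rax (2 bytes) */

/* Write "mov rax, imm64 / jmp rax" into out[0..11]. */
void encode_mov_rax_jmp(uint8_t *out, uint64_t target)
{
    assert(out != NULL);
    out[0] = 0x48;                      /* REX.W: 64-bit operand size */
    out[1] = 0xB8;                      /* mov rax, imm64 */
    for (int i = 0; i < 8; ++i)         /* imm64, lowest byte first */
        out[2 + i] = (uint8_t)((target >> (8 * i)) & 0xFF);
    out[10] = 0xFF;                     /* jmp r/m64 opcode */
    out[11] = 0xE0;                     /* ModRM byte selecting rax */
}
```

For the marker value 0x1020304050607080, the result is 48 B8 80 70 60 50 40 30 20 10 FF E0.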

Let's see the full code of the dlc DLL:

C++
#include <Windows.h>
#include <stdio.h>

const unsigned long long DefExeBase = 0x140000000;
BYTE *g_ExeBase;

#pragma pack(push, 1)
typedef struct _Sorry            // 12 bytes of space needed, rax value is lost
{
    struct
    {
        BYTE Force64bit;
        BYTE Opcode;
        INT64 Value;
    } MovToRax;                  // mov rax, func
    struct
    {
        BYTE Opcode;
        BYTE Reg;
    } JmpRax;                    // jmp rax
} Sorry;
#pragma pack(pop)

void DLCReplaceFunction(BYTE *OldFunc, BYTE *NewFunc)
{
    Sorry *s = (Sorry*)OldFunc;
    s->MovToRax.Force64bit = 0x48;
    s->MovToRax.Opcode = 0xb8;
    s->MovToRax.Value = (INT64)NewFunc;
    s->JmpRax.Opcode = 0xff;
    s->JmpRax.Reg = 0xe0;
}

int main(int argc, char *argv[])
{
    printf("You forgot Hello World!\n");
    getchar();
    return 0x10203040;
}

void DLCInit()
{
    g_ExeBase = (BYTE*)GetModuleHandleA(NULL);
    DLCReplaceFunction(g_ExeBase + (0x140001010 - DefExeBase), 
       (BYTE*)main);          // 140001010 is address of function in IDA
}

BOOL APIENTRY DllMain( HMODULE hModule,
                       DWORD  ul_reason_for_call,
                       LPVOID lpReserved
                     )
{
    switch (ul_reason_for_call)
    {
    case DLL_PROCESS_ATTACH:
        DLCInit();
        break;
    case DLL_THREAD_ATTACH:
    case DLL_THREAD_DETACH:
    case DLL_PROCESS_DETACH:
        break;
    }
    return TRUE;
}

Now build the dlc DLL as 64-bit and copy it to main.exe's folder. The next step is to modify main.exe's import descriptor and code section permissions. In our tool, replace the file name if needed. Since we are digging into a 64-bit executable now, we need to build the tool as 64-bit too, otherwise it will not work correctly. Let's see the file offsets:

21

22

Make a copy of main.exe and open it in a hex editor:

23

24

And after patching:

25

26

Run our patched main.exe.

27

We can see that our main function gets called, not the old one (it is overwritten with our jump).

More Realistic Examples

Let's see a more complex example. It will be hulk.exe from the 2003 game The Hulk (32-bit). Dump the imports:

dumpbin /imports hulk.exe

28

29

We can see that our EXE imports functions from binkw32.dll. It's more interesting than the standard kernel32.dll, so we will stick to it. So we need to build a dlc DLL that exports all these functions and forwards them to binkw32.dll. Create an empty DLL project and add the following module definition file:

LIBRARY DLC

EXPORTS

                   _BinkSetPan@12                 = binkw32._BinkSetPan@12
                   _BinkSetVolume@12              = binkw32._BinkSetVolume@12
                   _BinkGetError@0                = binkw32._BinkGetError@0
                   _BinkPause@8                   = binkw32._BinkPause@8
                   _BinkOpen@8                    = binkw32._BinkOpen@8
                   _BinkSetIO@4                   = binkw32._BinkSetIO@4
                   _BinkSetSoundTrack@8           = binkw32._BinkSetSoundTrack@8
                   _BinkSetSoundSystem@8          = binkw32._BinkSetSoundSystem@8
                   _BinkOpenDirectSound@4         = binkw32._BinkOpenDirectSound@4
                   _RADSetMemory@8                = binkw32._RADSetMemory@8
                   _BinkClose@4                   = binkw32._BinkClose@4
                   _BinkNextFrame@4               = binkw32._BinkNextFrame@4
                   _BinkCopyToBufferRect@44       = binkw32._BinkCopyToBufferRect@44
                   _BinkDoFrame@4                 = binkw32._BinkDoFrame@4
                   _BinkWait@4                    = binkw32._BinkWait@4
                   _RADTimerRead@0                = binkw32._RADTimerRead@0

Note that although binkw32.dll is not a "standard" DLL (provided by Windows, with an import lib file to link against), we don't need to create a lib file for it to make this work. This is probably due to the decorated exported function names (__declspec(dllexport) __stdcall, found only in 32-bit images). If binkw32.dll exported function names without decoration, like this:

BinkSetPan

we would have to create a lib file for it, otherwise the dlc DLL build would fail. If this is your case, wait until the 64-bit example (it immediately follows the 32-bit one). There, you will see how to create an import lib file for a non-"standard" DLL (the process is the same for 32-bit and 64-bit).

So back to the point. Open hulk.exe in IDA. For our purposes, we will pick a straightforward function that creates the game's main window.

30

31

We need to collect the list of all global variables and functions used in this function. We compute each runtime address the same way as for function replacement, and then bind a pointer to it. Let's bind all required items:

C++
DWORD *dword_69E600;
HINSTANCE *hInstance;
char *IconName;
char *lpClassName;
WNDPROC sub_46DF00;
char *WindowName;
HWND *hWnd;

void DLCBindGlobals()
{
    dword_69E600 = (DWORD*)(g_ExeBase + (0x69E600 - DefExeBase));
    hInstance = (HINSTANCE*)(g_ExeBase + (0x6B2910 - DefExeBase));
    IconName = (char*)(g_ExeBase + (0x61D744 - DefExeBase));
    lpClassName = (char*)(g_ExeBase + (0x69E608 - DefExeBase));
    sub_46DF00 = (WNDPROC)(g_ExeBase + (0x46DF00 - DefExeBase));
    WindowName = (char*)(g_ExeBase + (0x61D6C4 - DefExeBase));
    hWnd = (HWND*)(g_ExeBase + (0x6B2908 - DefExeBase));
}

And replace the function itself:

C++
void DLCBindFunctions()
{
    DLCReplaceFunction(g_ExeBase + (0x46E200 - DefExeBase), (BYTE*)sub_46E200);
}

Initialization:

C++
void DLCInit()
{
    g_ExeBase = (BYTE*)GetModuleHandleA(NULL);
    DLCBindGlobals();
    DLCBindFunctions();
}

Now let's write our own version of the function. To see the effect, we will make a small modification:

C++
void sub_46E200()
{
    DWORD Style;

    if (*dword_69E600 == 1) Style = 0xCF0000;
    else Style = 0xC80000;

    WNDCLASSA WndClass;
    RECT Rect;

    WndClass.cbClsExtra = 4;
    WndClass.cbWndExtra = 4;
    WndClass.style = 0x203;
    WndClass.lpfnWndProc = sub_46DF00;
    WndClass.hInstance = *hInstance;
    WndClass.hIcon = LoadIconA(*hInstance, IconName);
    WndClass.hCursor = NULL;
    WndClass.hbrBackground = NULL;
    WndClass.lpszMenuName = NULL;
    WndClass.lpszClassName = lpClassName;

    RegisterClassA(&WndClass);

    Rect.left = 0;
    Rect.top = 0;
    Rect.right = 0x280;
    Rect.bottom = 0x1E0;

    if (*dword_69E600 == 1) ExitProcess(0);

    // DLC extra
    //AdjustWindowRect(&Rect, Style, FALSE);

    // DLC extra
    std::string WindowsNameDLC = std::string(WindowName) + std::string(" --> with DLC!");

    *hWnd = CreateWindowExA(0, lpClassName, WindowsNameDLC.c_str(), 
            Style, Rect.left, Rect.top, Rect.right - Rect.left, Rect.bottom - Rect.top,
        NULL, NULL, *hInstance, NULL);

    if (!*hWnd) ExitProcess(0);

    int CmdShow;

    if (*dword_69E600 == 1) CmdShow = 5;
    else CmdShow = 3;

    ShowWindow(*hWnd, CmdShow);
    ShowCursor(FALSE);

    // DLC extra
    Sleep(5000);
}

We changed the window's text, added a sleep so we can actually see it, and removed the nasty AdjustWindowRect call. AdjustWindowRect messed something up, and the window was not truly full screen. Now everything is as it should be.

As you can see, we can't access global variables directly, only through pointers. Also note that if we are too lazy to translate code branches that we know never get called (we can check this while debugging in IDA), we can get away with an ExitProcess call. This way, the process will terminate if an unimplemented code branch somehow gets called.
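Going through a pointer on every access can be made less noisy with a pair of macros. This is only a sketch of one possible convention, not something the code in this article uses: DLC_PTR, g_image_base, dlc_set_base and g_dword_69E600 are hypothetical names, and DEFAULT_BASE repeats the 32-bit default base from the text. The sketch is portable C so that it can be tried on a fake image buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEFAULT_BASE 0x400000u          /* default base assumed by IDA (32-bit) */

unsigned char *g_image_base;            /* stands in for g_ExeBase */

/* Remember where the image is mapped; call once before using the macros. */
void dlc_set_base(unsigned char *base)
{
    assert(base != NULL);
    g_image_base = base;
}

/* Pointer to an item of the given type located at an IDA address. */
#define DLC_PTR(type, ida_addr) \
    ((type *)(g_image_base + ((ida_addr) - DEFAULT_BASE)))

/* Reads and writes through this name look like plain variable access,
   but the address is computed from the current image base each time. */
#define g_dword_69E600 (*DLC_PTR(uint32_t, 0x69E600u))
```

With this in place, a line such as `if (g_dword_69E600 == 1)` reads like the disassembly, and no separate bind step is needed for that item. The cost is that the address is recomputed on every access.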

Full code of the dlc DLL:

C++
#include <Windows.h>
#include <string>

#pragma pack(push, 1)
typedef struct _Sorry
{
    struct
    {
        BYTE Opcode;
        INT Value;
    } Push;
    struct
    {
        BYTE Opcode;
    } Ret;
} Sorry;
#pragma pack(pop)

BYTE *g_ExeBase;
const unsigned int DefExeBase = 0x400000;

DWORD *dword_69E600;
HINSTANCE *hInstance;
char *IconName;
char *lpClassName;
WNDPROC sub_46DF00;
char *WindowName;
HWND *hWnd;

void sub_46E200();

void DLCBindGlobals()
{
    dword_69E600 = (DWORD*)(g_ExeBase + (0x69E600 - DefExeBase));
    hInstance = (HINSTANCE*)(g_ExeBase + (0x6B2910 - DefExeBase));
    IconName = (char*)(g_ExeBase + (0x61D744 - DefExeBase));
    lpClassName = (char*)(g_ExeBase + (0x69E608 - DefExeBase));
    sub_46DF00 = (WNDPROC)(g_ExeBase + (0x46DF00 - DefExeBase));
    WindowName = (char*)(g_ExeBase + (0x61D6C4 - DefExeBase));
    hWnd = (HWND*)(g_ExeBase + (0x6B2908 - DefExeBase));
}

void DLCReplaceFunction(BYTE *OldFunc, BYTE *NewFunc)
{
    Sorry *s = (Sorry*)OldFunc;
    s->Push.Opcode = 0x68;
    s->Push.Value = (INT)NewFunc;
    s->Ret.Opcode = 0xC3;
}

void DLCBindFunctions()
{
    DLCReplaceFunction(g_ExeBase + (0x46E200 - DefExeBase), (BYTE*)sub_46E200);
}

void DLCInit()
{
    g_ExeBase = (BYTE*)GetModuleHandleA(NULL);
    DLCBindGlobals();
    DLCBindFunctions();
}

void sub_46E200()
{
    DWORD Style;

    if (*dword_69E600 == 1) Style = 0xCF0000;
    else Style = 0xC80000;

    WNDCLASSA WndClass;
    RECT Rect;

    WndClass.cbClsExtra = 4;
    WndClass.cbWndExtra = 4;
    WndClass.style = 0x203;
    WndClass.lpfnWndProc = sub_46DF00;
    WndClass.hInstance = *hInstance;
    WndClass.hIcon = LoadIconA(*hInstance, IconName);
    WndClass.hCursor = NULL;
    WndClass.hbrBackground = NULL;
    WndClass.lpszMenuName = NULL;
    WndClass.lpszClassName = lpClassName;

    RegisterClassA(&WndClass);

    Rect.left = 0;
    Rect.top = 0;
    Rect.right = 0x280;
    Rect.bottom = 0x1E0;

    if (*dword_69E600 == 1) ExitProcess(0);

    // DLC extra
    //AdjustWindowRect(&Rect, Style, FALSE);

    // DLC extra
    std::string WindowsNameDLC = std::string(WindowName) + std::string(" --> with DLC!");

    *hWnd = CreateWindowExA(0, lpClassName, WindowsNameDLC.c_str(), 
            Style, Rect.left, Rect.top, Rect.right - Rect.left, Rect.bottom - Rect.top,
        NULL, NULL, *hInstance, NULL);

    if (!*hWnd) ExitProcess(0);

    int CmdShow;

    if (*dword_69E600 == 1) CmdShow = 5;
    else CmdShow = 3;

    ShowWindow(*hWnd, CmdShow);
    ShowCursor(FALSE);

    // DLC extra
    Sleep(5000);
}

BOOL APIENTRY DllMain( HMODULE hModule,
                       DWORD  ul_reason_for_call,
                       LPVOID lpReserved
                     )
{
    switch (ul_reason_for_call)
    {
    case DLL_PROCESS_ATTACH:
        DLCInit();
        break;
    case DLL_THREAD_ATTACH:
    case DLL_THREAD_DETACH:
    case DLL_PROCESS_DETACH:
        break;
    }
    return TRUE;
}

Now build the dlc DLL and copy it to the game's folder. The next step is to modify hulk.exe's import descriptor and code section permissions. Let's see the file offsets:

32

33

Make a copy of hulk.exe and open it in a hex editor:

34

35

36

37

Now run the patched program. We can see our changes (sorry, I couldn't capture the screen as it is, even with Fraps). Cool, we can not only reverse, but also modify the game's behavior without touching the original EXE (save for the import descriptor and code section permissions). It's pretty handy (moddy).

There is one more question left. Games run full screen, and to debug, we need to at least see the debugger. We want to debug the game with IDA, and our own dlc code with Visual Studio. Not all games provide a windowed mode. So what can we do about it?

To make the game windowed, use DxWnd. We can tell DxWnd to pass an additional command line argument to the program, so that in the dlc DLL we know whether to wait for a debugger to attach or to proceed as usual.

DxWnd main window.

38

We will add two profiles. The first one is normal, and the second one is debug. Here is the normal one:

39

40

For the debug profile, we pass an additional command line parameter: -debug-wait

41

42

If you exit DxWnd without saving, you will be asked whether you want to save:

43

Now let's see how the -debug-wait parameter is actually implemented. First, we check whether we were started with the -debug-wait parameter. If so, we call DLCWaitForDebugger.

C++
void DLCInit()
{
    char *_cmd = GetCommandLineA();
    std::string cmd(_cmd);
    std::string suffix("-debug-wait");
    std::size_t found = cmd.find(suffix);
    bool debug_wait = false;

    while (found != std::string::npos)
    {
        if (cmd.length() == (found + suffix.length()))
        {
            _cmd[found] = 0;  // probably we should hide this parameter from program
            debug_wait = true;
            break;
        }
        found = cmd.find(suffix, found + 1);
    }

    g_ExeBase = (BYTE*)GetModuleHandleA(NULL);
    DLCBindGlobals();
    DLCBindFunctions();
    if (debug_wait) DLCWaitForDebugger();
}

Here, we check whether the command line string ends with -debug-wait. We also cut it off by inserting a null character into the buffer, in case the inspected program has something to say about unknown parameters.

Now, we are interested in the start function (the program's entry point):
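The ends-with check can also be written in plain C without std::string. This is a minimal sketch, assuming the switch is always the last thing on the command line; strip_trailing_switch is a hypothetical helper name, and in the DLL the buffer would be the one returned by GetCommandLineA.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* If cmd ends with suffix, cut the suffix off in place and return 1;
   otherwise leave cmd untouched and return 0. */
int strip_trailing_switch(char *cmd, const char *suffix)
{
    assert(cmd != NULL && suffix != NULL);

    size_t cmd_len = strlen(cmd);
    size_t suf_len = strlen(suffix);
    if (suf_len == 0 || suf_len > cmd_len) return 0;

    char *tail = cmd + (cmd_len - suf_len);     /* where the suffix would start */
    if (strcmp(tail, suffix) != 0) return 0;

    *tail = '\0';                               /* hide the switch from the program */
    return 1;
}
```

Unlike the loop with std::string::find, this compares only the tail of the string, so an occurrence of -debug-wait in the middle of the command line is ignored.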

44

45

We will save the bytes at the beginning of the start function and overwrite them with a jump to our custom function. The custom function will restore the original bytes and wait for a debugger in a loop. After a debugger attaches, we issue a debug break and call start ourselves. Let's see the code:

C++
typedef int(__stdcall *Start)();

int __stdcall sub_5D095E();

bool debug_wait;
Start start_ptr;
Sorry start_sorry;

void DLCWaitForDebugger()
{
    start_ptr = (Start)(g_ExeBase + (0x5D095E - DefExeBase));
    start_sorry = *((Sorry*)start_ptr);
    DLCReplaceFunction((BYTE*)start_ptr, (BYTE*)sub_5D095E);
}

int __stdcall sub_5D095E()
{
    *((Sorry*)start_ptr) = start_sorry;
    while (!IsDebuggerPresent()) Sleep(500);
    DebugBreak();
    return start_ptr();
}
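
The save, patch and restore steps can be tried on an ordinary byte buffer before touching a real code section. The sketch below mirrors the 32-bit Sorry layout (push imm32; ret) with fixed-width types; Sorry32, PatchOnce and Restore are hypothetical names, and the code assumes a little-endian host, as on x86.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

#pragma pack(push, 1)
struct Sorry32                    // push imm32; ret  (6 bytes in total)
{
    std::uint8_t  PushOpcode;     // 0x68
    std::uint32_t PushValue;      // absolute address of the replacement
    std::uint8_t  RetOpcode;      // 0xC3
};
#pragma pack(pop)
static_assert(sizeof(Sorry32) == 6, "the patch must be exactly 6 bytes");

// Saves the first 6 bytes of code into saved, then overwrites them
// with "push target; ret".
void PatchOnce(std::uint8_t *code, std::uint32_t target, Sorry32 *saved)
{
    assert(code != nullptr && saved != nullptr);
    std::memcpy(saved, code, sizeof(Sorry32));
    const Sorry32 patch = { 0x68, target, 0xC3 };
    std::memcpy(code, &patch, sizeof(patch));
}

// Puts the saved bytes back, which is what makes the trampoline one-shot.
void Restore(std::uint8_t *code, const Sorry32 *saved)
{
    assert(code != nullptr && saved != nullptr);
    std::memcpy(code, saved, sizeof(Sorry32));
}
```

Using memcpy instead of assigning through a cast pointer keeps the sketch free of alignment and aliasing concerns; the article's code does the same job with direct struct assignment.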

This gives us a kind of "one-shot" trampoline. Now let's see it all together:

C++
#include <Windows.h>
#include <string>

#pragma pack(push, 1)
typedef struct _Sorry
{
    struct
    {
        BYTE Opcode;
        INT Value;
    } Push;
    struct
    {
        BYTE Opcode;
    } Ret;
} Sorry;
#pragma pack(pop)

BYTE *g_ExeBase;
const unsigned int DefExeBase = 0x400000;

typedef int(__stdcall *Start)();

int __stdcall sub_5D095E();

Start start_ptr;
Sorry start_sorry;

DWORD *dword_69E600;
HINSTANCE *hInstance;
char *IconName;
char *lpClassName;
WNDPROC sub_46DF00;
char *WindowName;
HWND *hWnd;

void sub_46E200();

void DLCBindGlobals()
{
    dword_69E600 = (DWORD*)(g_ExeBase + (0x69E600 - DefExeBase));
    hInstance = (HINSTANCE*)(g_ExeBase + (0x6B2910 - DefExeBase));
    IconName = (char*)(g_ExeBase + (0x61D744 - DefExeBase));
    lpClassName = (char*)(g_ExeBase + (0x69E608 - DefExeBase));
    sub_46DF00 = (WNDPROC)(g_ExeBase + (0x46DF00 - DefExeBase));
    WindowName = (char*)(g_ExeBase + (0x61D6C4 - DefExeBase));
    hWnd = (HWND*)(g_ExeBase + (0x6B2908 - DefExeBase));
}

void DLCReplaceFunction(BYTE *OldFunc, BYTE *NewFunc)
{
    Sorry *s = (Sorry*)OldFunc;
    s->Push.Opcode = 0x68;
    s->Push.Value = (INT)NewFunc;
    s->Ret.Opcode = 0xC3;
}

void DLCBindFunctions()
{
    DLCReplaceFunction(g_ExeBase + (0x46E200 - DefExeBase), (BYTE*)sub_46E200);
}

int __stdcall sub_5D095E()
{
    *((Sorry*)start_ptr) = start_sorry;
    while (!IsDebuggerPresent()) Sleep(500);
    DebugBreak();
    return start_ptr();
}

void DLCWaitForDebugger()
{
    start_ptr = (Start)(g_ExeBase + (0x5D095E - DefExeBase));
    start_sorry = *((Sorry*)start_ptr);
    DLCReplaceFunction((BYTE*)start_ptr, (BYTE*)sub_5D095E);
}

void DLCInit()
{
    char *_cmd = GetCommandLineA();
    std::string cmd(_cmd);
    std::string suffix("-debug-wait");
    std::size_t found = cmd.find(suffix);
    bool debug_wait = false;

    while (found != std::string::npos)
    {
        if (cmd.length() == (found + suffix.length()))
        {
            _cmd[found] = 0;  // probably we should hide this parameter from program
            debug_wait = true;
            break;
        }
        found = cmd.find(suffix, found + 1);
    }

    g_ExeBase = (BYTE*)GetModuleHandleA(NULL);
    DLCBindGlobals();
    DLCBindFunctions();
    if (debug_wait) DLCWaitForDebugger();
}

void sub_46E200()
{
    DWORD Style;

    if (*dword_69E600 == 1) Style = 0xCF0000;
    else Style = 0xC80000;

    WNDCLASSA WndClass;
    RECT Rect;

    WndClass.cbClsExtra = 4;
    WndClass.cbWndExtra = 4;
    WndClass.style = 0x203;
    WndClass.lpfnWndProc = sub_46DF00;
    WndClass.hInstance = *hInstance;
    WndClass.hIcon = LoadIconA(*hInstance, IconName);
    WndClass.hCursor = NULL;
    WndClass.hbrBackground = NULL;
    WndClass.lpszMenuName = NULL;
    WndClass.lpszClassName = lpClassName;

    RegisterClassA(&WndClass);

    Rect.left = 0;
    Rect.top = 0;
    Rect.right = 0x280;
    Rect.bottom = 0x1E0;

    if (*dword_69E600 == 1) ExitProcess(0);

    // DLC extra
    //AdjustWindowRect(&Rect, Style, FALSE);

    // DLC extra
    std::string WindowsNameDLC = std::string(WindowName) + std::string(" --> with DLC!");

    *hWnd = CreateWindowExA(0, lpClassName, WindowsNameDLC.c_str(), 
            Style, Rect.left, Rect.top, Rect.right - Rect.left, Rect.bottom - Rect.top,
        NULL, NULL, *hInstance, NULL);

    if (!*hWnd) ExitProcess(0);

    int CmdShow;

    if (*dword_69E600 == 1) CmdShow = 5;
    else CmdShow = 3;

    ShowWindow(*hWnd, CmdShow);
    ShowCursor(FALSE);

    // DLC extra
    Sleep(5000);
}

BOOL APIENTRY DllMain( HMODULE hModule,
                       DWORD  ul_reason_for_call,
                       LPVOID lpReserved
                     )
{
    switch (ul_reason_for_call)
    {
    case DLL_PROCESS_ATTACH:
        DLCInit();
        break;
    case DLL_THREAD_ATTACH:
    case DLL_THREAD_DETACH:
    case DLL_PROCESS_DETACH:
        break;
    }
    return TRUE;
}
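
The listing rebases every IDA address with the same expression, g_ExeBase + (address - DefExeBase). As a minimal sketch, that arithmetic can live in one function (Rebase is a hypothetical name), which also makes it easy to check against a made-up load address:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper. Translates an address as shown by IDA (image at
// default_base) to the address in the running process (image at actual_base).
std::uintptr_t Rebase(std::uintptr_t ida_address,
                      std::uintptr_t default_base,
                      std::uintptr_t actual_base)
{
    assert(ida_address >= default_base);   // the address must belong to the image
    return actual_base + (ida_address - default_base);
}
```

When the image loads at its default base, the function returns the IDA address unchanged; when the loader relocates the image, only the offset from the base is preserved.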

Let's try to start the game in DxWnd with -debug-wait and attach with IDA (it worked with IDA 6.7; with IDA 7.2, it crashed due to some internal error):

46

It can take a while to load symbols for the various DLLs. The program will hit the debug break and suspend. Put a breakpoint at the beginning of the function we have replaced, and continue execution. Here it is:

47

48

Now we are inside our dlc:

49

50

Sometimes I got strange, buggy behavior, even with IDA 6.7. When I started the game from DxWnd with -debug-wait, the game exited on the CreateWindowExA call for no obvious reason. Also, when I set a breakpoint at the beginning of the original start function, IDA ignored it. I deleted the debug profile in DxWnd and created it again, and things started to work. So you should experiment a little here, or maybe even improve the debugger attachment experience.

Now let's try Visual Studio debugging. We should not change the dlc DLL source after building it; in other words, the source and the binary must match, otherwise we won't be able to debug. Set a breakpoint:

51

Attach to process:

52

We will hit the debug break. Continue the process, and our breakpoint will be hit:

53

Now let's say we are interested in the interface between hulk.exe and binkw32.dll. Let's take the _RADSetMemory@8 API. Once we have worked out its prototype, we no longer need to forward this function; instead, we provide our own that calls the original one. From IDA's disassembly, we can make out that this function takes two function pointers and that its return value is ignored.

54

Let's comment out this function in the module definition file:

LIBRARY DLC

EXPORTS

                   _BinkSetPan@12                 = binkw32._BinkSetPan@12
                   _BinkSetVolume@12              = binkw32._BinkSetVolume@12
                   _BinkGetError@0                = binkw32._BinkGetError@0
                   _BinkPause@8                   = binkw32._BinkPause@8
                   _BinkOpen@8                    = binkw32._BinkOpen@8
                   _BinkSetIO@4                   = binkw32._BinkSetIO@4
                   _BinkSetSoundTrack@8           = binkw32._BinkSetSoundTrack@8
                   _BinkSetSoundSystem@8          = binkw32._BinkSetSoundSystem@8
                   _BinkOpenDirectSound@4         = binkw32._BinkOpenDirectSound@4
                   
;                   _RADSetMemory@8                = binkw32._RADSetMemory@8

                   _BinkClose@4                   = binkw32._BinkClose@4
                   _BinkNextFrame@4               = binkw32._BinkNextFrame@4
                   _BinkCopyToBufferRect@44       = binkw32._BinkCopyToBufferRect@44
                   _BinkDoFrame@4                 = binkw32._BinkDoFrame@4
                   _BinkWait@4                    = binkw32._BinkWait@4
                   _RADTimerRead@0                = binkw32._RADTimerRead@0

And provide our own implementation:

C++
typedef void *type_SomeFunctionPointer;
typedef void(__stdcall *type_binkw32_RADSetMemory)
            (type_SomeFunctionPointer, type_SomeFunctionPointer);
type_binkw32_RADSetMemory binkw32_RADSetMemory;

void DLCInit()
{
    ...
    
    HMODULE binkw32 = LoadLibraryA("binkw32.dll");
    binkw32_RADSetMemory = 
            (type_binkw32_RADSetMemory)GetProcAddress(binkw32, "_RADSetMemory@8");
}

extern "C"
{
    __declspec(dllexport) void __stdcall RADSetMemory
              (type_SomeFunctionPointer f1, type_SomeFunctionPointer f2)
    {
        binkw32_RADSetMemory(f1, f2);
    }
}

Let's build our dlc DLL with this small change, and verify its exports:

dumpbin /exports dlc.dll

55

We can debug it like any other code:

56

Why do we bother to provide such stubs? This way, we have the reversed API interface expressed as C code, and we will immediately know if we didn't get it right (the program will probably crash), because this function actually gets called. We can patch multiple import descriptors with our dlc DLL name, and either forward each API to the appropriate DLL or provide our own stub as we advance through the reversing process.

Now let's show a 64-bit example. It will be dhewm3, a port of the Doom 3 source code:

We can build it as 64-bit, which is something that can't be done with the official Doom 3 source code:

There is no prebuilt 64-bit release for Windows, so we have to build our own. Although MinGW is recommended for the 64-bit build, I managed to build a 64-bit version with Visual Studio (with some project properties tuned). I made it to the game's main menu, but the game still crashed in the middle of level loading (probably I should have used MinGW). However, it will do fine for our purpose.

Let's see doom 3 imports:

dumpbin /imports dhewm3.exe

57

58

Let's stick to sdl2.dll. Redirect the imports to a file, so we can easily copy them to the module definition file. Here is the SDL2 part of the dumpbin output:

SDL2.dll
         1439BC3B0 Import Address Table
         1439BD818 Import Name Table
                 0 time date stamp
                 0 Index of first forwarder reference

                     14C SDL_Quit
                      F1 SDL_HasAltiVec
                      F6 SDL_HasMMX
                      EF SDL_Has3DNow
                      F8 SDL_HasSSE
                      F9 SDL_HasSSE2
                      23 SDL_CreateMutex
                     124 SDL_LockMutex
                     1B9 SDL_UnlockMutex
                      37 SDL_DestroyMutex
                      21 SDL_CreateCond
                      36 SDL_DestroyCond
                      19 SDL_CondSignal
                      1A SDL_CondWait
                      2D SDL_CreateThread
                      FF SDL_Init
                      BC SDL_GetThreadID
                     1C7 SDL_WaitThread
                      35 SDL_Delay
                      93 SDL_GetModState
                     18A SDL_SetModState
                     149 SDL_PumpEvents
                     148 SDL_PollEvent
                     14A SDL_PushEvent
                     1EB SDL_malloc
                     1E0 SDL_getenv
                     200 SDL_strlen
                     1FF SDL_strlcpy
                     1FE SDL_strlcat
                     204 SDL_strrchr
                     1D7 SDL_atoi
                      25 SDL_CreateRGBSurfaceFrom
                      4C SDL_FreeSurface
                      2E SDL_CreateWindow
                      C7 SDL_GetWindowFlags
                     1A2 SDL_SetWindowIcon
                      D0 SDL_GetWindowSize
                     1A1 SDL_SetWindowGrab
                     1A0 SDL_SetWindowGammaRamp
                      3B SDL_DestroyWindow
                      56 SDL_GL_GetProcAddress
                      5B SDL_GL_SetAttribute
                      4F SDL_GL_CreateContext
                      5C SDL_GL_SetSwapInterval
                      5D SDL_GL_SwapWindow
                      50 SDL_GL_DeleteContext
                     18D SDL_SetRelativeMouseMode
                     1A9 SDL_ShowCursor
                     1C9 SDL_WasInit
                      8B SDL_GetError
                      BE SDL_GetTicks
                       2 SDL_AddTimer
                      C1 SDL_GetVersion
                     15D SDL_RemoveTimer
                     1B3 SDL_ThreadID

If we have too many functions, we can automate the process of def file creation. For example, we can remove headers and replace this text (regular expression):

.*SDL_                            # any characters before "SDL_" and "SDL_" itself

with this:

SDL_
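
If you prefer to script this step instead of using an editor's search and replace, the same idea can be sketched in C++ with std::regex. ExtractSdlImports is a hypothetical helper; it assumes each import line ends with a single SDL_ symbol, as in the dumpbin output above, and silently skips header and blank lines.

```cpp
#include <cassert>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper. Collects the SDL_* names from "dumpbin /imports"
// text, one name per input line. Lines without such a name are skipped.
std::vector<std::string> ExtractSdlImports(const std::string &dumpbin_text)
{
    // An SDL_ identifier at the end of the line, optional trailing whitespace.
    static const std::regex pattern(R"((SDL_\w+)\s*$)");

    std::vector<std::string> names;
    std::istringstream lines(dumpbin_text);
    std::string line;
    while (std::getline(lines, line))
    {
        std::smatch m;
        if (std::regex_search(line, m, pattern))
            names.push_back(m[1].str());
    }
    return names;
}
```

The header line "SDL2.dll" does not match because the pattern requires the underscore after SDL.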

Next, we create a module definition file for sdl2.dll (this time, we will need it) and put all the functions inside it:

sdl2.def:

LIBRARY SDL2

EXPORTS

SDL_Quit
SDL_HasAltiVec
SDL_HasMMX
SDL_Has3DNow
SDL_HasSSE
SDL_HasSSE2
SDL_CreateMutex
SDL_LockMutex
SDL_UnlockMutex
SDL_DestroyMutex
SDL_CreateCond
SDL_DestroyCond
SDL_CondSignal
SDL_CondWait
SDL_CreateThread
SDL_Init
SDL_GetThreadID
SDL_WaitThread
SDL_Delay
SDL_GetModState
SDL_SetModState
SDL_PumpEvents
SDL_PollEvent
SDL_PushEvent
SDL_malloc
SDL_getenv
SDL_strlen
SDL_strlcpy
SDL_strlcat
SDL_strrchr
SDL_atoi
SDL_CreateRGBSurfaceFrom
SDL_FreeSurface
SDL_CreateWindow
SDL_GetWindowFlags
SDL_SetWindowIcon
SDL_GetWindowSize
SDL_SetWindowGrab
SDL_SetWindowGammaRamp
SDL_DestroyWindow
SDL_GL_GetProcAddress
SDL_GL_SetAttribute
SDL_GL_CreateContext
SDL_GL_SetSwapInterval
SDL_GL_SwapWindow
SDL_GL_DeleteContext
SDL_SetRelativeMouseMode
SDL_ShowCursor
SDL_WasInit
SDL_GetError
SDL_GetTicks
SDL_AddTimer
SDL_GetVersion
SDL_RemoveTimer
SDL_ThreadID

Invoke the following command to produce import lib file:

lib /def:sdl2.def /out:sdl2.lib /machine:x64

You will get an sdl2.lib file that we will need to link against in our dlc DLL. Now let's create a module definition file for the dlc DLL. Make a copy of the sdl2.def file, delete the "LIBRARY SDL2" and "EXPORTS" lines, and name the result sdl2.txt. We can then use the following Python script:

gen.py:

Python
# Generate dlc.def: forward every name listed in sdl2.txt to sdl2.dll.
with open("sdl2.txt", "r") as src, open("dlc.def", "w") as dst:
    dst.write("LIBRARY DLC\n")
    dst.write("EXPORTS\n")
    for line in src:
        name = line.strip()
        if not name:
            continue  # skip blank lines instead of emitting " = sdl2."
        dst.write("%s = sdl2.%s\n" % (name, name))

Run the script:

python gen.py

We will get the following dlc def file:

LIBRARY DLC
EXPORTS
SDL_Quit = sdl2.SDL_Quit
SDL_HasAltiVec = sdl2.SDL_HasAltiVec
SDL_HasMMX = sdl2.SDL_HasMMX
SDL_Has3DNow = sdl2.SDL_Has3DNow
SDL_HasSSE = sdl2.SDL_HasSSE
SDL_HasSSE2 = sdl2.SDL_HasSSE2
SDL_CreateMutex = sdl2.SDL_CreateMutex
SDL_LockMutex = sdl2.SDL_LockMutex
SDL_UnlockMutex = sdl2.SDL_UnlockMutex
SDL_DestroyMutex = sdl2.SDL_DestroyMutex
SDL_CreateCond = sdl2.SDL_CreateCond
SDL_DestroyCond = sdl2.SDL_DestroyCond
SDL_CondSignal = sdl2.SDL_CondSignal
SDL_CondWait = sdl2.SDL_CondWait
SDL_CreateThread = sdl2.SDL_CreateThread
SDL_Init = sdl2.SDL_Init
SDL_GetThreadID = sdl2.SDL_GetThreadID
SDL_WaitThread = sdl2.SDL_WaitThread
SDL_Delay = sdl2.SDL_Delay
SDL_GetModState = sdl2.SDL_GetModState
SDL_SetModState = sdl2.SDL_SetModState
SDL_PumpEvents = sdl2.SDL_PumpEvents
SDL_PollEvent = sdl2.SDL_PollEvent
SDL_PushEvent = sdl2.SDL_PushEvent
SDL_malloc = sdl2.SDL_malloc
SDL_getenv = sdl2.SDL_getenv
SDL_strlen = sdl2.SDL_strlen
SDL_strlcpy = sdl2.SDL_strlcpy
SDL_strlcat = sdl2.SDL_strlcat
SDL_strrchr = sdl2.SDL_strrchr
SDL_atoi = sdl2.SDL_atoi
SDL_CreateRGBSurfaceFrom = sdl2.SDL_CreateRGBSurfaceFrom
SDL_FreeSurface = sdl2.SDL_FreeSurface
SDL_CreateWindow = sdl2.SDL_CreateWindow
SDL_GetWindowFlags = sdl2.SDL_GetWindowFlags
SDL_SetWindowIcon = sdl2.SDL_SetWindowIcon
SDL_GetWindowSize = sdl2.SDL_GetWindowSize
SDL_SetWindowGrab = sdl2.SDL_SetWindowGrab
SDL_SetWindowGammaRamp = sdl2.SDL_SetWindowGammaRamp
SDL_DestroyWindow = sdl2.SDL_DestroyWindow
SDL_GL_GetProcAddress = sdl2.SDL_GL_GetProcAddress
SDL_GL_SetAttribute = sdl2.SDL_GL_SetAttribute
SDL_GL_CreateContext = sdl2.SDL_GL_CreateContext
SDL_GL_SetSwapInterval = sdl2.SDL_GL_SetSwapInterval
SDL_GL_SwapWindow = sdl2.SDL_GL_SwapWindow
SDL_GL_DeleteContext = sdl2.SDL_GL_DeleteContext
SDL_SetRelativeMouseMode = sdl2.SDL_SetRelativeMouseMode
SDL_ShowCursor = sdl2.SDL_ShowCursor
SDL_WasInit = sdl2.SDL_WasInit
SDL_GetError = sdl2.SDL_GetError
SDL_GetTicks = sdl2.SDL_GetTicks
SDL_AddTimer = sdl2.SDL_AddTimer
SDL_GetVersion = sdl2.SDL_GetVersion
SDL_RemoveTimer = sdl2.SDL_RemoveTimer
SDL_ThreadID = sdl2.SDL_ThreadID

Comment out the following lines:

SDL_HasMMX = sdl2.SDL_HasMMX
SDL_Has3DNow = sdl2.SDL_Has3DNow
SDL_HasSSE = sdl2.SDL_HasSSE
SDL_HasSSE2 = sdl2.SDL_HasSSE2
SDL_HasAltiVec = sdl2.SDL_HasAltiVec

just like this:

; SDL_HasMMX = sdl2.SDL_HasMMX
; SDL_Has3DNow = sdl2.SDL_Has3DNow
; SDL_HasSSE = sdl2.SDL_HasSSE
; SDL_HasSSE2 = sdl2.SDL_HasSSE2
; SDL_HasAltiVec = sdl2.SDL_HasAltiVec
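
Generating the forwarders and commenting out the stubbed ones can also be done in one pass. The following is a sketch only; MakeDlcDef is a hypothetical function that builds the def file text in memory, with the names to be stubbed passed in as a set.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper. Builds the text of dlc.def. Names listed in stubs
// are emitted as ';' comments, because the dlc DLL exports its own
// implementation for them instead of forwarding.
std::string MakeDlcDef(const std::vector<std::string> &names,
                       const std::set<std::string> &stubs,
                       const std::string &target)
{
    std::string def = "LIBRARY DLC\nEXPORTS\n";
    for (const std::string &name : names)
    {
        if (stubs.count(name) != 0) def += "; ";
        def += name + " = " + target + "." + name + "\n";
    }
    return def;
}
```

Keeping the commented lines in the file, rather than dropping them, documents which functions have already moved from forwarding to real C code.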

We will provide our own stubs for these, since we will have to call this API ourselves. Now let's turn our attention to the dlc DLL code. We will pick the following function:

59

Let's open EXE in IDA and find this function:

60

61

62

63

64

65

Let's see the code:

C++
#include <Windows.h>

#pragma pack(push, 1)
typedef struct _Sorry            // 12 bytes of space needed
{
    struct
    {
        BYTE Force64bit;
        BYTE Opcode;
        INT64 Value;
    } MovToRax;                  // mov rax, func
    struct
    {
        BYTE Opcode;
        BYTE Reg;
    } JmpRax;                    // jmp rax
} Sorry;
#pragma pack(pop)

const unsigned long long DefExeBase = 0x140000000;
BYTE *g_ExeBase;

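// Forward declarations: these functions are used before they are defined below.
void DLCBindGlobals();
void DLCBindFunctions();
int sub_1402E3F50();

typedef int(*type_sdl2_api)();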
type_sdl2_api sdl2_HasMMX;
type_sdl2_api sdl2_Has3DNow;
type_sdl2_api sdl2_HasSSE;
type_sdl2_api sdl2_HasSSE2;
type_sdl2_api sdl2_HasAltiVec;

void DLCInit()
{
    g_ExeBase = (BYTE*)GetModuleHandleA(NULL);
    DLCBindGlobals();
    DLCBindFunctions();

    HMODULE sdl2 = LoadLibraryA("SDL2.dll");
    sdl2_HasMMX = (type_sdl2_api)GetProcAddress(sdl2, "SDL_HasMMX");
    sdl2_Has3DNow = (type_sdl2_api)GetProcAddress(sdl2, "SDL_Has3DNow");
    sdl2_HasSSE = (type_sdl2_api)GetProcAddress(sdl2, "SDL_HasSSE");
    sdl2_HasSSE2 = (type_sdl2_api)GetProcAddress(sdl2, "SDL_HasSSE2");
    sdl2_HasAltiVec = (type_sdl2_api)GetProcAddress(sdl2, "SDL_HasAltiVec");
}

void DLCBindGlobals()
{
    // we don't use globals in our function, so nothing here
}

void DLCReplaceFunction(BYTE *OldFunc, BYTE *NewFunc)
{
    Sorry *s = (Sorry*)OldFunc;
    s->MovToRax.Force64bit = 0x48;
    s->MovToRax.Opcode = 0xb8;
    s->MovToRax.Value = (INT64)NewFunc;
    s->JmpRax.Opcode = 0xff;
    s->JmpRax.Reg = 0xe0;
}

void DLCBindFunctions()
{
    DLCReplaceFunction(g_ExeBase + (0x1402E3F50 - DefExeBase), (BYTE*)sub_1402E3F50);
}

enum    // anonymous enum; the original "typedef" declared no type name
{
    CPUID_DEFAULT = 0x00002,
    CPUID_MMX = 0x00010,
    CPUID_3DNOW = 0x00020,
    CPUID_SSE = 0x00040,
    CPUID_SSE2 = 0x00080,
    CPUID_ALTIVEC = 0x00200,
};

extern "C"
{
    __declspec(dllexport) int SDL_HasMMX()
    {
        return sdl2_HasMMX();
    }

    __declspec(dllexport) int SDL_Has3DNow()
    {
        return sdl2_Has3DNow();
    }

    __declspec(dllexport) int SDL_HasSSE()
    {
        return sdl2_HasSSE();
    }

    __declspec(dllexport) int SDL_HasSSE2()
    {
        return sdl2_HasSSE2();
    }

    __declspec(dllexport) int SDL_HasAltiVec()
    {
        return sdl2_HasAltiVec();
    }
}

int sub_1402E3F50()
{
    int flags = CPUID_DEFAULT;

    if (SDL_HasMMX())
        flags |= CPUID_MMX;

    // DLC extra
    //if (SDL_Has3DNow())
        flags |= CPUID_3DNOW;

    if (SDL_HasSSE())
        flags |= CPUID_SSE;

    if (SDL_HasSSE2())
        flags |= CPUID_SSE2;

    // DLC extra
    //if (SDL_HasAltiVec())
        flags |= CPUID_ALTIVEC;

    return flags;
}

BOOL APIENTRY DllMain( HMODULE hModule,
                       DWORD  ul_reason_for_call,
                       LPVOID lpReserved
                     )
{
    switch (ul_reason_for_call)
    {
    case DLL_PROCESS_ATTACH:
        DLCInit();
        break;
    case DLL_THREAD_ATTACH:
    case DLL_THREAD_DETACH:
    case DLL_PROCESS_DETACH:
        break;
    }
    return TRUE;
}

I don't have 3DNow or AltiVec support on my CPU, so I modified the function a little to set those flags unconditionally. This mod produced no visible side effect (sadly, the game didn't even crash); however, it will do for our example. Also, we have to call some sdl2 API here, which is why we immediately follow our idea of expressing the module-to-module interface as C code.
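
One benefit of having the function in real C is that the translated logic can be checked outside the game. The sketch below assumes the unmodified function gates every flag on its probe, as the commented-out conditions in the listing suggest. CpuProbes, ComputeCpuFlags, Yes and No are hypothetical names; the flag values are the ones from the listing.

```cpp
#include <cassert>

enum
{
    CPUID_DEFAULT = 0x00002,
    CPUID_MMX     = 0x00010,
    CPUID_3DNOW   = 0x00020,
    CPUID_SSE     = 0x00040,
    CPUID_SSE2    = 0x00080,
    CPUID_ALTIVEC = 0x00200
};

typedef int (*Probe)();          // same shape as type_sdl2_api

struct CpuProbes                 // hypothetical: the five SDL_Has* calls
{
    Probe HasMMX, Has3DNow, HasSSE, HasSSE2, HasAltiVec;
};

// Same logic as the translated function, with the probes passed in
// so that they can be faked.
int ComputeCpuFlags(const CpuProbes &p)
{
    assert(p.HasMMX && p.Has3DNow && p.HasSSE && p.HasSSE2 && p.HasAltiVec);

    int flags = CPUID_DEFAULT;
    if (p.HasMMX())     flags |= CPUID_MMX;
    if (p.Has3DNow())   flags |= CPUID_3DNOW;
    if (p.HasSSE())     flags |= CPUID_SSE;
    if (p.HasSSE2())    flags |= CPUID_SSE2;
    if (p.HasAltiVec()) flags |= CPUID_ALTIVEC;
    return flags;
}

// Fake probes for testing.
int Yes() { return 1; }
int No()  { return 0; }
```

Inside the dlc DLL, the structure would simply be filled with the sdl2_Has* pointers obtained in DLCInit.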

One note here: if you have 32-bit program and will follow the same scheme:

Comment function in def file:

; Api = some_dll.Api

and put it in code:

C++
extern "C"
{
    __declspec(dllexport) int Api(int param) { ... }
}

You will get an Api function with the __cdecl calling convention. However, it is possible to export __stdcall functions from a 32-bit DLL without any name decoration (with a def file, not with __declspec(dllexport)). You should inspect the program's disassembly to see whether it expects __cdecl or __stdcall. If it expects __stdcall and your function is __cdecl, a crash is possible, because your function doesn't clean its parameters off the stack and the inspected program expects that it does. To solve this problem, do the following:

def file:

Api = _Api@4

The code is as follows:

C++
extern "C"
{
    int __stdcall Api(int param) { ... }
}

In this case, the exported name is not decorated, and the function is __stdcall, as expected.
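
The decorated name on the right-hand side of the def line follows a fixed rule for 32-bit __stdcall C functions: a leading underscore, the name, then @ and the total size of the parameters in bytes (each parameter rounded up to 4 bytes). As a small sketch with hypothetical helper names, the def line can be produced like this:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper. 32-bit __stdcall decoration for a C function:
// _Name@N, where N is the number of parameter bytes in decimal.
std::string StdcallDecorated(const std::string &name, unsigned param_bytes)
{
    return "_" + name + "@" + std::to_string(param_bytes);
}

// Hypothetical helper. The def-file line that exports the decorated
// symbol under its plain, undecorated name.
std::string UndecoratedExportLine(const std::string &name, unsigned param_bytes)
{
    return name + " = " + StdcallDecorated(name, param_bytes);
}
```

The same rule explains the binkw32 names seen earlier: two pointer parameters on 32-bit give 8 bytes, hence _RADSetMemory@8.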

Now back to the 64-bit example. Let's add -debug-wait. Here's the start_0 function (start is just a jump to start_0, so it doesn't have enough space for our Sorry structure):
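
To make the space problem concrete: the 64-bit patch (mov rax, imm64; jmp rax) takes 12 bytes, while a function that consists of a single near jump is typically only 5 bytes long. The sketch below encodes the patch byte by byte and checks whether a function is large enough; EncodeMovJmp, FitsPatch and kPatchSize64 are hypothetical names, and the 5-byte figure is an assumption about a typical jmp rel32, not something read from this executable.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// 48 B8 imm64 (10 bytes) followed by FF E0 (2 bytes).
constexpr std::size_t kPatchSize64 = 12;

// Hypothetical helper. Encodes "mov rax, target; jmp rax".
std::array<std::uint8_t, kPatchSize64> EncodeMovJmp(std::uint64_t target)
{
    std::array<std::uint8_t, kPatchSize64> bytes = {};
    bytes[0] = 0x48;                        // REX.W prefix: 64-bit operand
    bytes[1] = 0xB8;                        // mov rax, imm64
    for (int i = 0; i < 8; ++i)             // immediate, least significant byte first
        bytes[2 + i] = static_cast<std::uint8_t>(target >> (8 * i));
    bytes[10] = 0xFF;                       // jmp r/m64 ...
    bytes[11] = 0xE0;                       // ... with rax as the operand
    return bytes;
}

// Hypothetical helper. True when a function of this size can hold the patch
// without spilling into whatever follows it.
bool FitsPatch(std::size_t function_size)
{
    return function_size >= kPatchSize64;
}
```

Writing the immediate with shifts keeps the encoder independent of the host's byte order, unlike a packed struct assignment.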

66

Let's see its address:

67

-debug-wait implementation is no different from 32-bit:

C++
typedef int(*Start)();    // __stdcall no longer needed

Start start_ptr;
Sorry start_sorry;

void DLCInit()
{
    char *_cmd = GetCommandLineA();
    std::string cmd(_cmd);
    std::string suffix("-debug-wait");
    std::size_t found = cmd.find(suffix);
    bool debug_wait = false;

    while (found != std::string::npos)
    {
        if (cmd.length() == (found + suffix.length()))
        {
            _cmd[found] = 0;  // probably we should hide this parameter from program
            debug_wait = true;
            break;
        }
        found = cmd.find(suffix, found + 1);
    }

    g_ExeBase = (BYTE*)GetModuleHandleA(NULL);
    DLCBindGlobals();
    DLCBindFunctions();
    if (debug_wait) DLCWaitForDebugger();

    ...
}

void DLCWaitForDebugger()
{
    start_ptr = (Start)(g_ExeBase + (0x1403F59A0 - DefExeBase));
    start_sorry = *((Sorry*)start_ptr);
    DLCReplaceFunction((BYTE*)start_ptr, (BYTE*)sub_1403F59A0);
}

int sub_1403F59A0()
{
    *((Sorry*)start_ptr) = start_sorry;
    while (!IsDebuggerPresent()) Sleep(500);
    DebugBreak();
    return start_ptr();
}

Now let's see the full code:

C++
#include <Windows.h>
#include <string>

#pragma pack(push, 1)
typedef struct _Sorry            // 12 bytes of space needed
{
    struct
    {
        BYTE Force64bit;
        BYTE Opcode;
        INT64 Value;
    } MovToRax;                  // mov rax, func
    struct
    {
        BYTE Opcode;
        BYTE Reg;
    } JmpRax;                    // jmp rax
} Sorry;
#pragma pack(pop)

const unsigned long long DefExeBase = 0x140000000;
BYTE *g_ExeBase;

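typedef int(*Start)(); // __stdcall no longer needed

// Forward declarations: these functions are used before they are defined below.
void DLCBindGlobals();
void DLCBindFunctions();
void DLCWaitForDebugger();
int sub_1402E3F50();
int sub_1403F59A0();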

Start start_ptr;
Sorry start_sorry;

typedef int(*type_sdl2_api)();
type_sdl2_api sdl2_HasMMX;
type_sdl2_api sdl2_Has3DNow;
type_sdl2_api sdl2_HasSSE;
type_sdl2_api sdl2_HasSSE2;
type_sdl2_api sdl2_HasAltiVec;

void DLCInit()
{
    char *_cmd = GetCommandLineA();
    std::string cmd(_cmd);
    std::string suffix("-debug-wait");
    std::size_t found = cmd.find(suffix);
    bool debug_wait = false;

    while (found != std::string::npos)
    {
        if (cmd.length() == (found + suffix.length()))
        {
            _cmd[found] = 0;  // probably we should hide this parameter from program
            debug_wait = true;
            break;
        }
        found = cmd.find(suffix, found + 1);
    }

    g_ExeBase = (BYTE*)GetModuleHandleA(NULL);
    DLCBindGlobals();
    DLCBindFunctions();
    if (debug_wait) DLCWaitForDebugger();

    HMODULE sdl2 = LoadLibraryA("SDL2.dll");
    sdl2_HasMMX = (type_sdl2_api)GetProcAddress(sdl2, "SDL_HasMMX");
    sdl2_Has3DNow = (type_sdl2_api)GetProcAddress(sdl2, "SDL_Has3DNow");
    sdl2_HasSSE = (type_sdl2_api)GetProcAddress(sdl2, "SDL_HasSSE");
    sdl2_HasSSE2 = (type_sdl2_api)GetProcAddress(sdl2, "SDL_HasSSE2");
    sdl2_HasAltiVec = (type_sdl2_api)GetProcAddress(sdl2, "SDL_HasAltiVec");
}

void DLCBindGlobals()
{
    // we don't use globals in our function, so nothing here
}

void DLCReplaceFunction(BYTE *OldFunc, BYTE *NewFunc)
{
    Sorry *s = (Sorry*)OldFunc;
    s->MovToRax.Force64bit = 0x48;
    s->MovToRax.Opcode = 0xb8;
    s->MovToRax.Value = (INT64)NewFunc;
    s->JmpRax.Opcode = 0xff;
    s->JmpRax.Reg = 0xe0;
}

void DLCBindFunctions()
{
    DLCReplaceFunction(g_ExeBase + (0x1402E3F50 - DefExeBase), (BYTE*)sub_1402E3F50);
}

void DLCWaitForDebugger()
{
    start_ptr = (Start)(g_ExeBase + (0x1403F59A0 - DefExeBase));
    start_sorry = *((Sorry*)start_ptr);
    DLCReplaceFunction((BYTE*)start_ptr, (BYTE*)sub_1403F59A0);
}

int sub_1403F59A0()
{
    *((Sorry*)start_ptr) = start_sorry;
    while (!IsDebuggerPresent()) Sleep(500);
    DebugBreak();
    return start_ptr();
}

enum    // anonymous enum; the original "typedef" declared no type name
{
    CPUID_DEFAULT = 0x00002,
    CPUID_MMX = 0x00010,
    CPUID_3DNOW = 0x00020,
    CPUID_SSE = 0x00040,
    CPUID_SSE2 = 0x00080,
    CPUID_ALTIVEC = 0x00200,
};

extern "C"
{
    __declspec(dllexport) int SDL_HasMMX()
    {
        return sdl2_HasMMX();
    }

    __declspec(dllexport) int SDL_Has3DNow()
    {
        return sdl2_Has3DNow();
    }

    __declspec(dllexport) int SDL_HasSSE()
    {
        return sdl2_HasSSE();
    }

    __declspec(dllexport) int SDL_HasSSE2()
    {
        return sdl2_HasSSE2();
    }

    __declspec(dllexport) int SDL_HasAltiVec()
    {
        return sdl2_HasAltiVec();
    }
}

int sub_1402E3F50()
{
    int flags = CPUID_DEFAULT;

    if (SDL_HasMMX())
        flags |= CPUID_MMX;

    // DLC extra
    //if (SDL_Has3DNow())
        flags |= CPUID_3DNOW;

    if (SDL_HasSSE())
        flags |= CPUID_SSE;

    if (SDL_HasSSE2())
        flags |= CPUID_SSE2;

    // DLC extra
    //if (SDL_HasAltiVec())
        flags |= CPUID_ALTIVEC;

    return flags;
}

BOOL APIENTRY DllMain( HMODULE hModule,
                       DWORD  ul_reason_for_call,
                       LPVOID lpReserved
                     )
{
    switch (ul_reason_for_call)
    {
    case DLL_PROCESS_ATTACH:
        DLCInit();
        break;
    case DLL_THREAD_ATTACH:
    case DLL_THREAD_DETACH:
    case DLL_PROCESS_DETACH:
        break;
    }
    return TRUE;
}

Build the dlc DLL as 64-bit and copy it to the game's folder. Now it's time to look at the file offsets:
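
If your tool shows virtual addresses rather than file offsets, the conversion uses three fields of the section header that contains the address. The sketch below is a minimal version of that arithmetic; Section and VaToFileOffset are hypothetical names, and the field names follow the PE section header (VirtualAddress, VirtualSize, PointerToRawData).

```cpp
#include <cassert>
#include <cstdint>

// The fields of a PE section header that matter for the conversion.
struct Section
{
    std::uint32_t VirtualAddress;     // RVA of the section in memory
    std::uint32_t VirtualSize;        // size of the section in memory
    std::uint32_t PointerToRawData;   // offset of the section in the file
};

// Hypothetical helper. Converts a virtual address (as shown by IDA) to a
// file offset. Returns false when the address is not inside the section.
bool VaToFileOffset(std::uint64_t va, std::uint64_t image_base,
                    const Section &s, std::uint64_t *file_offset)
{
    assert(file_offset != nullptr);
    if (va < image_base) return false;

    const std::uint64_t rva = va - image_base;
    const std::uint64_t section_end =
        static_cast<std::uint64_t>(s.VirtualAddress) + s.VirtualSize;
    if (rva < s.VirtualAddress || rva >= section_end) return false;

    *file_offset = rva - s.VirtualAddress + s.PointerToRawData;
    return true;
}
```

The sketch ignores the case where the address lies in the part of a section that has no raw data in the file; for patching import names and section flags that case should not come up.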

68

69

Make a copy of dhewm3.exe and open it in hex editor:

70

71

And patch it:

72

73

Now we can run our patched doom 3 and enjoy the game.

Conclusion

We demonstrated the concept of using a C compiler in the reverse engineering process. It's handy because we can write real code and test it, with all the advantages of a real programming language. We can have one dlc DLL for each image of the program we are reversing (just like an IDA database).

We should mention what to do with small functions: so small that we can't write our Sorry structure at the beginning without overwriting another function or data. Well, if the function is small, it is easy to understand. We can still write our own version of it; it just won't get called.

Of course, things will likely fail if the image contains self-modifying code, checks whether it has been patched, or uses other anti-reversing tricks. However, we are discussing a reverse engineering approach here, not ways around anti-debugging tricks.

Some things were not covered, for example, forwarding of data imports and C++-specific things. We can't cover everything, and C++ makes things more complex. I would stick to C and compensate for the lack of C++ expressiveness with good program design (for example, separate "classes" in separate .c files) and good comments. Later, when we have enough information and clearly see that we can use C++, we can take advantage of its capabilities.

Another important thing: this approach requires writing code, and it helps you feel like an actual programmer, not only an intruder. We can translate functions to C one by one, and slowly change our role from "intruder in a hostile executable written by other dudes" to "creator, writer of the code". Something like: of course I know how it works, I have written it myself!

We need to store the particular version of the program we are reversing together with our dlc DLLs, because each dlc DLL is made to work with a particular image. If an update changes the executable (think of all those hardcoded function offsets), our dlc will likely stop working. So we need to keep the original version of the program, and when an update happens, we can update our dlc DLL to work with the new executable. This way, we can slowly cut into the program and keep things up to date.

I am going to try this approach myself. I want to make a mod for this old Hulk game (for example, add a new Hulk attack). I really liked it when I was a kid; it's cool to smash everything.

I wish you good luck in all your reverse engineering endeavors. May the force of the C compiler be with you. Thank you for reading.

History

  • 27th July, 2020: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)