Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / desktop / Win32

Writing a basic Windows debugger

4.92/5 (81 votes)
24 Jan 2011CPOL14 min read 217.7K   12.7K  
Learn how you can write your own Windows debugger.

DebuggerUI.jpg

Preamble

All of us have used some kind of debugger while programming in some language. The debugger you used may be in C++, C#, Java or another language. It might be standalone like WinDbg, or inside an IDE like Visual Studio. But have you been inquisitive over how debuggers work?

Well, this article presents the hidden glory on how debuggers work. This article only covers writing debugger on Windows. Please note that here I am concerned only about the debugger and not about compilers, linkers, or debugging extensions. Thus, we'll only debug executables (like WinDbg). This article assumes a basic understanding of multithreading from the reader (read my article on multithreading).

1. How to Debug a Program?

Two steps:

  1. Starting the process with DEBUG_ONLY_THIS_PROCESS or DEBUG_PROCESS flags.
  2. Setting up the debugger's loop that will handle debugging events.

Before we move further, please remember:

  1. Debugger is the process/program which debugs the other process (target-process).
  2. Debuggee is the process being debugged, by the debugger.
  3. Only one debugger can be attached to a debuggee. However, a debugger can debug multiple processes (in separate threads).
  4. Only the thread that created/spawned the debuggee can debug the target-process. Thus, CreateProcess and the debugger-loop must be in the same thread.
  5. When the debugger thread terminates, the debuggee terminates as well. The debugger process may keep running, however.
  6. When the debugger's debugging thread is busy processing a debug event, all threads in the debuggee (target-process) stand suspended. More on this later.

A. Starting the process with the debugging flag

Use CreateProcess to start the process, specifying DEBUG_ONLY_THIS_PROCESS as the sixth parameter (dwCreationFlags). With this flag, we are asking the Windows OS to communicate this thread for all debugging events, including process creation/termination, thread creation/termination, runtime exceptions, and so on. A detailed explanation is given below. Please note that we'll be using DEBUG_ONLY_THIS_PROCESS in this article. It essentially means we want only to debug the process we are creating, and not any child process(es) that may be created by the process we create.

C++
STARTUPINFO si; 
PROCESS_INFORMATION pi; 
ZeroMemory( &si, sizeof(si) ); 
si.cb = sizeof(si); 
ZeroMemory( &pi, sizeof(pi) );

CreateProcess ( ProcessNameToDebug, NULL, NULL, NULL, FALSE, 
                DEBUG_ONLY_THIS_PROCESS, NULL,NULL, &si, &pi );

After this statement, you would see the process in the Task Manager, but the process hasn't started yet. The newly created process is suspended. No, we don't have to call ResumeThread, but write a debugger-loop.

B. The debugger loop

The debugger-loop is the central area for debuggers! The loop runs around the WaitForDebugEvent API. This API takes two parameters: a pointer to the DEBUG_EVENT structure and the DWORD timeout parameter. For timeout, we would simply specify INFINITE. This API exists in kernel32.dll, thus we need not link to any library.

C++
BOOL WaitForDebugEvent(DEBUG_EVENT* lpDebugEvent, DWORD dwMilliseconds);

The DEBUG_EVENT structure contains the debugging event information. It has four members: Debug event code, process-ID, thread-ID, and the event information. As soon as WaitForDebugEvent returns, we process the received debugging event, and then eventually call ContinueDebugEvent. Here is a minimal debugger-loop:

C++
DEBUG_EVENT debug_event = {0};
for(;;)
{
    if (!WaitForDebugEvent(&debug_event, INFINITE))
        return;
    ProcessDebugEvent(&debug_event);  // User-defined function, not API
    ContinueDebugEvent(debug_event.dwProcessId,
                      debug_event.dwThreadId,
                      DBG_CONTINUE);
}

Using the ContinueDebugEvent API, we are asking the OS to continue executing the debuggee. The dwProcessId and dwThreadId specify the process and thread. These values are the same that we received form WaitForDebugEvent. The last parameter specifies if the execution should continue or not. This parameter is relevant only if the exception-event is received. We will cover this later. Until then, we'll utilize only DBG_CONTINUE (another possible value is DBG_EXCEPTION_NOT_HANDLED).

2. Handling debugging events

There are nine different major debugging events, and 20 different sub-events under the exception-event category. I will discuss them, starting from the simplest. Here is the DEBUG_EVENT structure:

C++
struct DEBUG_EVENT
{
    DWORD dwDebugEventCode;
    DWORD dwProcessId;
    DWORD dwThreadId;
    union {
        EXCEPTION_DEBUG_INFO Exception;
        CREATE_THREAD_DEBUG_INFO CreateThread;
        CREATE_PROCESS_DEBUG_INFO CreateProcessInfo;
        EXIT_THREAD_DEBUG_INFO ExitThread;
        EXIT_PROCESS_DEBUG_INFO ExitProcess;
        LOAD_DLL_DEBUG_INFO LoadDll;
        UNLOAD_DLL_DEBUG_INFO UnloadDll;
        OUTPUT_DEBUG_STRING_INFO DebugString;
        RIP_INFO RipInfo;
    } u;
};

WaitForDebugEvent, on successful return, fills-in the values in this structure. dwDebugEventCode specifies which debugging-event has occurred. Depending on the event-code received, one of the members of the union u contains the event information, and we should only use the respective union-member. For example, if the debug event code is OUTPUT_DEBUG_STRING_EVENT, the member OUTPUT_DEBUG_STRING_INFO would be valid.

A. Processing OUTPUT_DEBUG_STRING_EVENT

Programmers generally use OutputDebugString to generate debugging-text that would be displayed on the debugger's 'Output' window. Depending on the language/framework you use, you might be familiar with the TRACE, ATLTRACE macros. A .NET programmers may use the System.Diagnostics.Debug.Print/<code>System.Diagnostics.Trace.WriteLine methods (or other methods). But with all these methods, the <code>OutputDebugString API is called, and the debugger would receive this event (unless it is buried with the DEBUG symbol undefined!).

When this event is received, we work on the DebugString member variable. The structure OUTPUT_DEBUG_STRING_INFO is defined as:

C++
struct OUTPUT_DEBUG_STRING_INFO
{
   LPSTR lpDebugStringData;  // char*
   WORD fUnicode;
   WORD nDebugStringLength;
};

The member-variable 'nDebugStringLength' specifies the length of the string, including the terminating null, in characters (not bytes). The variable 'fUnicode' specifies if the string is Unicode (non-zero), or ANSI (zero). That means, we read 'nDebugStringLength' bytes from 'lpDebugStringData' if the string is ANSI; otherwise, we read (nDebugStringLength x 2) bytes. But remember, the address pointed by 'lpDebugStringData' is not from the address-space of the debugger's memory. The address is relevant to the debuggee memory. Thus, we need to read the contents from the debuggee's process memory.

To read data from another process' memory, we use the ReadProcessMemory function. It requires that the calling process should have the appropriate permission. Since the debugger only created the process, we do have the rights. Here is the code to process this debugging event:

C++
case OUTPUT_DEBUG_STRING_EVENT:
{
   CStringW strEventMessage;  // Force Unicode
   OUTPUT_DEBUG_STRING_INFO & DebugString = debug_event.u.DebugString;

   WCHAR *msg=new WCHAR[DebugString.nDebugStringLength];
   // Don't care if string is ANSI, and we allocate double...

   ReadProcessMemory(pi.hProcess,       // HANDLE to Debuggee
         DebugString.lpDebugStringData, // Target process' valid pointer
         msg,                           // Copy to this address space
         DebugString.nDebugStringLength, NULL);

   if ( DebugString.fUnicode )
      strEventMessage = msg;
   else
      strEventMessage = (char*)msg; // char* to CStringW (Unicode) conversion.

   delete []msg;
   // Utilize strEventMessage
}

What if the debuggee terminates before the debugger copies the memory contents?

Well... In that case, I would like to remind you: when the debugger is processing a debugging event, all threads in the debuggee are suspended. The process cannot kill itself in anyway at this moment. Also, no other method can terminate the process (Task Manager, Process Explorer, Kill utility...). Attempting to kill the process from these utilities will, however, schedule the terminating process. Thus, the debugger would receive EXIT_PROCESS_DEBUG_EVENT as the next event!

B. Processing CREATE_PROCESS_DEBUG_EVENT

This event is raised when the process (debuggee) is being spawned. This would be the first event the sebugger receives. For this event, the relevant member of DEBUG_EVENT would be CreateProcessInfo. Here is the structure definition of CREATE_PROCESS_DEBUG_INFO:

C++
struct CREATE_PROCESS_DEBUG_INFO
{
    HANDLE hFile;   // The handle to the physical file (.EXE)
    HANDLE hProcess; // Handle to the process
    HANDLE hThread;  // Handle to the main/initial thread of process
    LPVOID lpBaseOfImage; // base address of the executable image
    DWORD dwDebugInfoFileOffset;
    DWORD nDebugInfoSize;
    LPVOID lpThreadLocalBase;
    LPTHREAD_START_ROUTINE lpStartAddress;
    LPVOID lpImageName;  // Pointer to first byte of image name (in Debuggee)
    WORD fUnicode; // If image name is Unicode.
};

Please note that hProcess and hThread may not have the same handle values we have received in pi (PROCESS_INFORMATION). The process-ID and the thread-ID would, however, be the same. Each handle you get by Windows (for the same resource) is different from other handles, and has a different purpose. So, the debugger may choose to display either the handles or the IDs.

The hFile as well as lpImageName can both be used to get the file-name of the process being debugged. Although, we already know what the name of the process is, since we only created the debuggee. But locating the module name of the EXE or DLL is important, since we would anyway need to find the name of the DLL while processing the LOAD_DLL_DEBUG_EVENT message.

As you can read in MSDN, lpImageName will never return the filename directly, and the name would be in the target-process. Furthermore, it may not have a filename in the target-process too (i.e., via ReadProcessMemory). Also, the filename may not be fully qualified (as I've tested). Thus, we will not use this method. We'll retrieve the filename from the hFile member.

How to get the name of the file by HANDLE

Unfortunately, we need to use the method described in MSDN that uses around 10 API calls to get the filename from the handle. I have slightly modified the function GetFileNameFromHandle. The code is not shown here for brevity, it is available in the source file attached with this article. Anyway, here is the basic code to process this event:

C++
case CREATE_PROCESS_DEBUG_EVENT:
{
   CString strEventMessage = 
     GetFileNameFromHandle(debug_event.u.CreateProcessInfo.hFile);
   // Use strEventMessage, and other members
   // of CreateProcessInfo to intimate the user of this event.
}

You may have noticed that I did not cover a few members of this structure. I would probably cover all of them in the next part of this article.

C. Processing LOAD_DLL_DEBUG_EVENT

This event is similar to CREATE_PROCESS_DEBUG_EVENT, and as you might have guessed, it is raised when a DLL is loaded by the OS. This event is raised whenever a DLL is loaded, either implicitly or explicitly (when the debuggee calls LoadLibrary). This debugging event only occurs the first time the system attaches a DLL to the virtual address space of a process. For this event processing, we use the 'LoadDll' member of the union. It is of type LOAD_DLL_DEBUG_INFO:

C++
struct LOAD_DLL_DEBUG_INFO
{
   HANDLE hFile;         // Handle to the DLL physical file.
   LPVOID lpBaseOfDll;   // The DLL Actual load address in process.
   DWORD dwDebugInfoFileOffset;
   DWORD nDebugInfoSize;
   LPVOID lpImageName;   // These two member are same as CREATE_PROCESS_DEBUG_INFO
   WORD fUnicode;
};

For retrieving the filename, we would use the same function, GetFileNameFromHandle, as we have used in CREATE_PROCESS_DEBUG_EVENT. I will list out the code for processing this event when I would describe UNLOAD_DLL_DEBUG_EVENT, since the UNLOAD_DLL_DEBUG_EVENT does not have any direct information available to find out the name of the DLL file.

D. Processing CREATE_THREAD_DEBUG_EVENT

This debug event is generated whenever a new thread is created in the debuggee. Like CREATE_PROCESS_DEBUG_EVENT, this event is raised before the thread actually gets to run. To get information about this event, we use the 'CreateThread' union member. This variable is of type CREATE_THREAD_DEBUG_INFO:

C++
struct CREATE_THREAD_DEBUG_INFO
{
  // Handle to the newly created thread in debuggee
  HANDLE hThread;
  LPVOID lpThreadLocalBase;
  // pointer to the starting address of the thread
  LPTHREAD_START_ROUTINE lpStartAddress;
};

The thread-ID for a newly arrived thread is available in DEBUG_EVENT::dwThreadId. Using this member to intimate the user is straightforward:

C++
case CREATE_THREAD_DEBUG_EVENT:
{
   CString strEventMessage;
   strEventMessage.Format(L"Thread 0x%x (Id: %d) created at: 0x%x",
            debug_event.u.CreateThread.hThread,
            debug_event.dwThreadId,
            debug_event.u.CreateThread.lpStartAddress);
            // Thread 0xc (Id: 7920) created at: 0x77b15e58
}

The 'lpStartAddress' is relevant to the debuggee and not the debugger; we are just displaying it for completeness. Remember this event is not received for the primary/initial thread of the process. It is received only for the subsequent thread creations in the debuggee.

E. Processing EXIT_THREAD_DEBUG_EVENT

This event is raised as soon as the thread returns, and the return code is available to the system. The 'dwThreadId' member of DEBUG_EVENT specifies which thread exited. To get the thread handle and other information that we received in CREATE_THREAD_DEBUG_EVENT, we need to store the information in some map. This event has a relevant member named 'ExitThread', which is of type EXIT_THREAD_DEBUG_INFO:

C++
struct EXIT_THREAD_DEBUG_INFO
{
   DWORD dwExitCode; // The thread exit code of DEBUG_EVENT::dwThreadId
};

Here is the event handler code:

C++
case EXIT_THREAD_DEBUG_EVENT:
{
   CString strEventMessage;
   strEventMessage.Format( _T("The thread %d exited with code: %d"),
      debug_event.dwThreadId,
      debug_event.u.ExitThread.dwExitCode);    // The thread 2760 exited with code: 0
}

F. Processing UNLOAD_DLL_DEBUG_EVENT

Of course, this event occurs when a DLL is unloaded from the debuggee's memory. But wait! It is only generated against FreeLibrary calls, and not when the system unloads DLLs. The debuggee may call LoadLibrary multiple times, and thus only the last call to FreeLibrary would raise this event. It means, the implicitly loaded DLLs will not receive this event when they are unloaded, when the process exits. (You can verify this assertion in your favorite debugger!)

For this event, you use the 'UnloadDll' member of the union, which is of type UNLOAD_DLL_DEBUG_INFO:

C++
struct UNLOAD_DLL_DEBUG_INFO
{
    LPVOID lpBaseOfDll;
};

As you can see, only the base-address of the DLL (a simple pointer) is available for us to process this event. This is the reason I had delayed giving the code for LOAD_DLL_DEBUG_EVENT. In the DLL loading event, we get the 'lpBaseOfDll' also. We can use the map (or another data structure you like) to store the name of the DLL against the base-address of the DLL. The same base-address would arrive while processing UNLOAD_DLL_DEBUG_EVENT.

It should be noted that not all DLL-load events would get the DLL-unload event; still, we have to store all DLL names into the map, since LOAD_DLL_DEBUG_EVENT doesn't provide us info on how the DLL was loaded.

Here is the code to process these two events:

C++
std::map < LPVOID, CString > DllNameMap;
...
case LOAD_DLL_DEBUG_EVENT:
{
   strEventMessage = GetFileNameFromHandle(debug_event.u.LoadDll.hFile);


   // Storing the DLL name into map. Map's key is the Base-address
   DllNameMap.insert(
      std::make_pair( debug_event.u.LoadDll.lpBaseOfDll, strEventMessage) );

   strEventMessage.AppendFormat(L" - Loaded at %x", debug_event.u.LoadDll.lpBaseOfDll);
}
break;
...
case UNLOAD_DLL_DEBUG_EVENT:
{
   strEventMessage.Format(L"DLL '%s' unloaded.",
      DllNameMap[debug_event.u.UnloadDll.lpBaseOfDll] ); // Get DLL name from map.
}
break;

G. Processing EXIT_PROCESS_DEBUG_EVENT

This is one of the simplest debugging event, and as you can assess, would arrive when the process exists. This event would arrive irrespective of how the process exits - normally, terminated externally (Task Manager etc.), or the application's (debuggee) fault leading it to crash.

We use the 'ExitProcess' member, which is of type EXIT_PROCESS_DEBUG_INFO:

C++
struct EXIT_PROCESS_DEBUG_INFO
{
    DWORD dwExitCode;
};

As soon as this event occurs, we also end the debugger-loop and terminate the debugging thread. For this, we can use a variable to control the loop (the 'for' loop shown in the first page), and set its value to indicate loop termination. Please download the attached files to see the entire code.

C++
bool bContinueDebugging=true;
...
case EXIT_PROCESS_DEBUG_EVENT:
{
   strEventMessage.Format(L"Process exited with code:  0x%x", 
                          debug_event.u.ExitProcess.dwExitCode);
   bContinueDebugging=false;
}
break;

H. Processing EXCEPTION_DEBUG_EVENT

This is the prodigious event amongst all the debugging events! From MSDN:

This is generated whenever an exception occurs in the process being debugged. Possible exceptions include attempting to access inaccessible memory, executing breakpoint instructions, attempting to divide by zero, or any other exception noted in Structured Exception Handling. The DEBUG_EVENT structure contains an EXCEPTION_DEBUG_INFO structure. This structure describes the exception that caused the debugging event.

This debugging event needs a separate article to complete it fully (or partially!). Thus, I would discuss only one type of exception event, along with an introduction to this event itself.

The member variable 'Exception' holds the information regarding the exception just occurred. It is of type EXCEPTION_DEBUG_INFO:

C++
struct EXCEPTION_DEBUG_INFO
{
    EXCEPTION_RECORD ExceptionRecord;
    DWORD dwFirstChance;
};

The 'ExceptionRecord' member of this structure contains detailed information regarding the exception. It is of type EXCEPTION_RECORD:

C++
struct EXCEPTION_RECORD
{
    DWORD     ExceptionCode;
    DWORD     ExceptionFlags;
    struct _EXCEPTION_RECORD *ExceptionRecord;
    PVOID     ExceptionAddress;
    DWORD     NumberParameters;
    ULONG_PTR ExceptionInformation[EXCEPTION_MAXIMUM_PARAMETERS];  // 15
};

The detailed information is put into this sub-structure, because exceptions may appear nested, and would be linked to each other in a linked-list manner. It is out of topic for now to discuss nested exceptions.

Before we delve into EXCEPTION_RECORD, it is important to discuss EXCEPTION_DEBUG_INFO::dwFirstChance.

Are exceptions giving chances?

Not exactly! When a process is being debugged, the debugger always receives the exception before the debuggee gets it. You must have seen "First-chance exception at 0x00412882 in SomeModule:..." while debugging your Visual C++ module. This is referred to as First Chance Exception. The same exception may or may not follow with a second chance exception.

When the debuggee gets the exception, it is termed as Second Chance Exception. The debuggee may handle the exception, or may simply crash down. These types of exceptions are not C++ exceptions, but Windows' SEH (structure exception handling) mechanism. I would cover more about it in the next part of this article.

The debugger gets exceptions first (First-chance exception), so that it can handle it before giving it to the debuggee. The break-point exception is a kind of exception, which is relevant to the debugger, not the debuggee. Some libraries also generate First chance exceptions to aid the debugger and the debugging process.

A word for ContinueDebugEvent

The third parameter (dwContinueStatus) of this function is relevant only after an exception event is received. For non-exception events that we discussed, the system ignores the value passed to this function.

After the exception event processing, ContinueDebugEvent should be called with:

  • DBG_CONTINUE if the exception event was successfully handled by the debugger. No action is required by the debuggee, and the debuggee can run normally.
  • DBG_EXCEPTION_NOT_HANDLED if this event is not handled/resolved by the debuggee. The debugger might just record this event, notify the debugger-user, or do something else.

Please note that returning DBG_CONTINUE for the improper debugging event would raise the same event in the debugger, and the same event would arrive indefinitely. Since we are in the early stage of writing debuggers, we should play safe, and return EXCEPTION_NOT_HANDLED (give up flag!). The exclusion, for this article, is the Break-point event, which I am discussing next.

Exceptions codes

The EXCEPTION_RECORD::ExceptionCode variable holds the arrived exception code, and can have one of these codes (ignore nested exceptions!):

  1. EXCEPTION_ACCESS_VIOLATION
  2. EXCEPTION_ARRAY_BOUNDS_EXCEEDED
  3. EXCEPTION_BREAKPOINT
  4. EXCEPTION_DATATYPE_MISALIGNMENT
  5. EXCEPTION_FLT_DENORMAL_OPERAND
  6. EXCEPTION_FLT_DIVIDE_BY_ZERO
  7. EXCEPTION_FLT_INEXACT_RESULT
  8. EXCEPTION_FLT_INVALID_OPERATION
  9. EXCEPTION_FLT_OVERFLOW
  10. EXCEPTION_FLT_STACK_CHECK
  11. EXCEPTION_FLT_UNDERFLOW
  12. EXCEPTION_ILLEGAL_INSTRUCTION
  13. EXCEPTION_IN_PAGE_ERROR
  14. EXCEPTION_INT_DIVIDE_BY_ZERO
  15. EXCEPTION_INT_OVERFLOW
  16. EXCEPTION_INVALID_DISPOSITION
  17. EXCEPTION_NONCONTINUABLE_EXCEPTION
  18. EXCEPTION_PRIV_INSTRUCTION
  19. EXCEPTION_SINGLE_STEP
  20. EXCEPTION_STACK_OVERFLOW

Relax! I am not discussing all of them, but one: EXCEPTION_BREAKPOINT. Okay, here is the code:

C++
case EXCEPTION_DEBUG_EVENT:
{
   EXCEPTION_DEBUG_INFO& exception = debug_event.u.Exception;
   switch( exception.ExceptionRecord.ExceptionCode)
   {
      case STATUS_BREAKPOINT:  // Same value as EXCEPTION_BREAKPOINT

         strEventMessage= "Break point";
         break;

      default:
         if(exception.dwFirstChance == 1)
         {
            strEventMessage.Format(L"First chance exception at %x, exception-code: 0x%08x",
                        exception.ExceptionRecord.ExceptionAddress,
                        exception.ExceptionRecord.ExceptionCode);
         }
         // else
         // { Let the OS handle }

         // There are cases where OS ignores the dwContinueStatus,
         // and executes the process in its own way.
         // For first chance exceptions, this parameter is not-important
         // but still we are saying that we have NOT handled this event.

         // Changing this to DBG_CONTINUE (for 1st chance exception also),
         // may cause same debugging event to occur continously.
         // In short, this Debugger does not handle debug exception events
         // efficiently, and let's keep it simple for a while!

         dwContinueStatus = DBG_EXCEPTION_NOT_HANDLED;
         }

         break;
}

You might be aware of what a breakpoint is. Out of the standard debugger perspective, the break-pointing can happen with the DebugBreak API, or the {int 3} assembly instruction, or System.Diagnostics.Debugger.Break in the .NET Framework. The debugger would receive the Debug-exception code STATUS_BREAKPOINT (same as EXCEPTION_BREAKPOINT) when any of these occur in the running process. The debuggers generally use this event to break the running process, and may display the source code where the event occurred. But in our basic debugger, we would just display this event to the user. No source code or the instruction location is shown. We'll cover displaying the source code in the next part of this article.

Raising breakpoint from a process which is not being debugged would crash the application, or may display the JIT dialog box. The is the reason I used:

C++
if ( !IsDebuggerPresent() )
   AfxMessageBox(L"No debugger is attached currently.");
else
   DebugBreak();

As a final note to this simplest debug-exception event: EXCEPTION_DEBUG_EVENT would be raised first time by the kernel itself, and would always arrive. Debuggers like Visual Studio ignore this very first breakpoint exception, but debuggers like WinDbg would always show you this event too.

Winding up...

Use any process to debug or use the attached debuggee named DebugMe:

DebuggeeUI.jpg

The binaries (EXEs) attached here are compiled with Visual Studio 2005 Service Pack 1. You may not have the VC++ runtime libraries for the same version. You can download them from Microsoft.com or rebuild the projects from your IDE.

Follow up:

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)