Writing a basic Windows Debugger – Part 1


Preamble

Most of us have used a Debugger of some kind while programming. The Debugger you used may have been for C++, C#, Java or another language. It might be standalone like WinDbg, or built into an IDE like Visual Studio. But have you ever been curious about how Debuggers work?

Well, this article reveals the hidden glory of how Debuggers work. It only covers writing a Debugger on Windows. Please note that I am concerned here only with the Debugger itself and not with Compilers, Linkers or Debugging Extensions. Thus, we’ll only debug executables (as WinDbg does).
This article assumes the reader has a basic understanding of multithreading (read my article on multithreading).

How to Debug a Program?

Two steps:

  1. Starting the process with DEBUG_ONLY_THIS_PROCESS or DEBUG_PROCESS flags
  2. Setting up a Debugger’s loop, that will handle debugging events.

Before we move further, please remember:

  1. The Debugger is the process/program which is debugging the other process (the target-process).
  2. The Debuggee is the process being debugged by the Debugger.
  3. Only one Debugger can be attached to a Debuggee. However, a Debugger can debug multiple processes (in separate threads).
  4. Only the thread that created/spawned the Debuggee can debug the target-process. Thus CreateProcess and the Debugger-loop must be in the same thread.
  5. When the Debugger thread terminates, the Debuggee (by default) terminates as well. The Debugger process may keep running, however.
  6. While the Debugger’s debugging thread is busy processing a debug event, ALL threads in the Debuggee (target-process) stay suspended. More on this later.

Starting the process with debugging flag

Use CreateProcess to start the process, specifying DEBUG_ONLY_THIS_PROCESS in its 6th parameter (dwCreationFlags). With this flag, we are asking Windows to report all debugging events to this thread, including process creation/termination, thread creation/termination, runtime exceptions and so on. A detailed explanation is given below. Please note that we’ll be using DEBUG_ONLY_THIS_PROCESS in this article. It essentially means we want to debug only the process we are creating, and not any child processes that it may create.

STARTUPINFO si;
PROCESS_INFORMATION pi;

ZeroMemory( &si, sizeof(si) );
si.cb = sizeof(si);
ZeroMemory( &pi, sizeof(pi) );

CreateProcess ( ProcessNameToDebug, NULL, NULL, NULL, FALSE,
	DEBUG_ONLY_THIS_PROCESS,
	NULL,NULL, &si, &pi );

After this call, you will see the process in Task Manager, but it has not started running yet. The newly created process is suspended. No, we don’t have to call ResumeThread; instead, we write a Debugger-loop.

The Debugger Loop

The Debugger-loop is the heart of any Debugger! The loop runs around the WaitForDebugEvent API. This API takes 2 parameters: a pointer to a DEBUG_EVENT structure and a DWORD timeout. For the timeout, we simply specify INFINITE. This API is exported by kernel32.dll, so we need not link to any additional library.

BOOL WaitForDebugEvent(DEBUG_EVENT* lpDebugEvent, DWORD dwMilliseconds);

The DEBUG_EVENT structure contains the debugging event information. It has 4 members: the debug event code, the process-id, the thread-id and the event information.
As soon as WaitForDebugEvent returns, we process the received debugging event, and then call ContinueDebugEvent. Here is a minimal Debugger-loop:

DEBUG_EVENT debug_event = {0};
for(;;)
{
	if (!WaitForDebugEvent(&debug_event, INFINITE))
		return;
	ProcessDebugEvent(&debug_event);  // User-defined function, not API
	ContinueDebugEvent(debug_event.dwProcessId,
                      debug_event.dwThreadId,
                      DBG_CONTINUE);
}

By calling the ContinueDebugEvent API, we ask the OS to continue executing the Debuggee. The dwProcessId and dwThreadId parameters specify the process and thread. These are the same values that we received from WaitForDebugEvent.
The last parameter specifies how execution should continue. This parameter is relevant only if an exception-event was received. We will cover it later. Until then we’ll use only DBG_CONTINUE (the other possible value is DBG_EXCEPTION_NOT_HANDLED).
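
The minimal loop above never leaves the for(;;) unless WaitForDebugEvent fails. As a rough sketch only (the helper name DebugUntilExit and its signature are my own assumptions, not part of the article's source), the two steps can be combined into one function that stops after EXIT_PROCESS_DEBUG_EVENT. The non-Windows branch exists only so the file compiles elsewhere.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: starts exePath as a Debuggee and pumps debug events
// until the Debuggee exits. Returns false if the process could not be
// started or the wait failed; otherwise stores the exit code in *exitCode.
bool DebugUntilExit(const wchar_t* exePath, unsigned long* exitCode)
{
#ifdef _WIN32
    STARTUPINFOW si = {};
    si.cb = sizeof(si);
    PROCESS_INFORMATION pi = {};

    if (!CreateProcessW(exePath, NULL, NULL, NULL, FALSE,
                        DEBUG_ONLY_THIS_PROCESS, NULL, NULL, &si, &pi))
        return false;                    // GetLastError() holds the reason

    bool exited = false;
    while (!exited)
    {
        DEBUG_EVENT ev = {};
        if (!WaitForDebugEvent(&ev, INFINITE))
            break;

        switch (ev.dwDebugEventCode)
        {
        case CREATE_PROCESS_DEBUG_EVENT: // the Debugger should close hFile
            if (ev.u.CreateProcessInfo.hFile)
                CloseHandle(ev.u.CreateProcessInfo.hFile);
            break;
        case LOAD_DLL_DEBUG_EVENT:       // same for each loaded DLL
            if (ev.u.LoadDll.hFile)
                CloseHandle(ev.u.LoadDll.hFile);
            break;
        case EXIT_PROCESS_DEBUG_EVENT:   // last event for this Debuggee
            *exitCode = ev.u.ExitProcess.dwExitCode;
            exited = true;
            break;
        }

        // As in the article, every event is continued with DBG_CONTINUE.
        ContinueDebugEvent(ev.dwProcessId, ev.dwThreadId, DBG_CONTINUE);
    }

    CloseHandle(pi.hThread);
    CloseHandle(pi.hProcess);
    return exited;
#else
    (void)exePath;
    (void)exitCode;
    return false;                        // debugging API is Windows-only
#endif
}
```

Remember that this function must run on the same thread from start to finish, since only the thread that called CreateProcess receives the debugging events.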

Handling Debugging Events

There are 9 different major debugging events, and 20 different sub-events under the exception-event category. I will discuss them starting from the simplest. Here is the DEBUG_EVENT structure:

struct DEBUG_EVENT
{
	DWORD dwDebugEventCode;
	DWORD dwProcessId;
	DWORD dwThreadId;
	union {
		EXCEPTION_DEBUG_INFO Exception;
		CREATE_THREAD_DEBUG_INFO CreateThread;
		CREATE_PROCESS_DEBUG_INFO CreateProcessInfo;
		EXIT_THREAD_DEBUG_INFO ExitThread;
		EXIT_PROCESS_DEBUG_INFO ExitProcess;
		LOAD_DLL_DEBUG_INFO LoadDll;
		UNLOAD_DLL_DEBUG_INFO UnloadDll;
		OUTPUT_DEBUG_STRING_INFO DebugString;
		RIP_INFO RipInfo;
	} u;
};

On successful return, WaitForDebugEvent fills in the values of this structure. dwDebugEventCode specifies which debugging-event has occurred. Depending on the event-code received, one member of the union u contains the event information, and we should use only that member. For example, if the debug event code is OUTPUT_DEBUG_STRING_EVENT, the member DebugString (of type OUTPUT_DEBUG_STRING_INFO) would be valid.

Processing OUTPUT_DEBUG_STRING_EVENT

Programmers generally use OutputDebugString to generate debugging-text that is displayed in the Debugger’s ‘Output’ window. Depending on the language/framework you use, you might be familiar with the TRACE and ATLTRACE macros. .NET programmers may use the System.Diagnostics.Debug.Print/System.Diagnostics.Trace.WriteLine methods (or other methods). But all these methods end up calling the OutputDebugString API, and the Debugger receives this event (unless the call is compiled out because the DEBUG symbol is undefined!).

When this event is received, we work on the DebugString member variable. The structure OUTPUT_DEBUG_STRING_INFO is defined as:

struct OUTPUT_DEBUG_STRING_INFO
{
   LPSTR lpDebugStringData;  // char*
   WORD fUnicode;
   WORD nDebugStringLength;
};

The member-variable ‘nDebugStringLength’ specifies the length of the string in characters (not bytes), including the terminating null. Variable ‘fUnicode’ specifies whether the string is Unicode (non-zero) or ANSI (zero). That means we read ‘nDebugStringLength’ bytes from ‘lpDebugStringData’ if the string is ANSI; otherwise we read (nDebugStringLength x 2) bytes. But remember that the address in ‘lpDebugStringData’ does not belong to the Debugger’s address space. It is valid only in the Debuggee’s address space, so we need to read the contents from the Debuggee’s process memory.

To read data from another process’ memory we use the ReadProcessMemory function. It requires that the calling process has the appropriate permission. Since the Debugger itself created the process, we do have the rights. Here is the code to process this debugging event:

case OUTPUT_DEBUG_STRING_EVENT:
{
   CStringW strEventMessage;  // Force Unicode
   OUTPUT_DEBUG_STRING_INFO & DebugString = debug_event.u.DebugString;

   // Allocate in WCHARs, so the buffer is large enough even if the string is ANSI.
   // The trailing () zero-fills it, so the text stays null-terminated if the read falls short.
   WCHAR *msg = new WCHAR[DebugString.nDebugStringLength + 1]();

   // ReadProcessMemory counts bytes: a Unicode string needs twice the character count.
   SIZE_T nBytes = DebugString.nDebugStringLength;
   if ( DebugString.fUnicode )
      nBytes *= sizeof(WCHAR);

   ReadProcessMemory(pi.hProcess,       // HANDLE to Debuggee
         DebugString.lpDebugStringData, // Target process' valid pointer
         msg,                           // Copy to this address space
         nBytes, NULL);

   if ( DebugString.fUnicode )
      strEventMessage = msg;
   else
      strEventMessage = (char*)msg; // char* to CStringW (Unicode) conversion.

   delete []msg;
   // Utilize strEventMessage
}
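
The byte-count arithmetic described earlier is easy to get wrong, so here is a portable sketch of just that part, separated from the Windows calls. The names DebugStringByteCount and DecodeDebugString are hypothetical; the sketch assumes the raw bytes have already been copied out of the Debuggee, that Unicode text is little-endian UTF-16, and (as a simplification) that ANSI text is plain ASCII/Latin-1.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Number of bytes to request from ReadProcessMemory for a debug string of
// lengthInChars characters (terminating null included).
std::size_t DebugStringByteCount(std::uint16_t lengthInChars, bool isUnicode)
{
    return static_cast<std::size_t>(lengthInChars) * (isUnicode ? 2u : 1u);
}

// Converts the raw bytes copied from the Debuggee into a UTF-16 string,
// stopping at the first terminating null.
std::u16string DecodeDebugString(const std::vector<unsigned char>& raw,
                                 bool isUnicode)
{
    std::u16string text;
    if (isUnicode)
    {
        // Two bytes per character, low byte first.
        for (std::size_t i = 0; i + 1 < raw.size(); i += 2)
            text.push_back(static_cast<char16_t>(raw[i] | (raw[i + 1] << 8)));
    }
    else
    {
        // One byte per character; widened as-is (simplifying assumption).
        for (unsigned char c : raw)
            text.push_back(static_cast<char16_t>(c));
    }

    const std::size_t nul = text.find(u'\0');
    if (nul != std::u16string::npos)
        text.resize(nul);
    return text;
}
```

For instance, a 3-character Unicode string ("Hi" plus the null) needs DebugStringByteCount(3, true), that is 6 bytes, whereas the same string in ANSI needs only 3.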

What if the Debuggee terminates before the Debugger copies the memory contents?

Well… In that case I would like to remind you: while the Debugger is processing a debugging event, ALL threads in the Debuggee are suspended. The process cannot kill itself in any way at this moment. Also, no other method can terminate the process (Task Manager, Process Explorer, kill utility…). Attempting to kill the process from these utilities will, however, schedule its termination. Thus, the Debugger would receive EXIT_PROCESS_DEBUG_EVENT as the next event!

Processing CREATE_PROCESS_DEBUG_EVENT

This event is raised when the process (Debuggee) is being spawned. It is the first event the Debugger receives. For this event, the relevant member of DEBUG_EVENT is CreateProcessInfo. Here is the structure definition of CREATE_PROCESS_DEBUG_INFO:

struct CREATE_PROCESS_DEBUG_INFO
{
    HANDLE hFile;   // The handle to the physical file (.EXE)
    HANDLE hProcess; // Handle to the process
    HANDLE hThread;  // Handle to the main/initial thread of process
    LPVOID lpBaseOfImage; // base address of the executable image
    DWORD dwDebugInfoFileOffset;
    DWORD nDebugInfoSize;
    LPVOID lpThreadLocalBase;
    LPTHREAD_START_ROUTINE lpStartAddress;
    LPVOID lpImageName;  // Pointer to first byte of image name (in Debuggee)
    WORD fUnicode; // If image name is Unicode.
};

Please note that hProcess and hThread may not have the same handle values that we received in pi (PROCESS_INFORMATION). The process-id and thread-id would, however, be the same. Each handle Windows gives you for the same resource is distinct from the others, and may serve a different purpose. So, the Debugger may choose to display either the handles or the Ids.

Both hFile and lpImageName can be used to get the file-name of the process being debugged. We already know the name of the process, of course, since we created the Debuggee ourselves. But locating the module name of an EXE or DLL is important, since we will need to find the name of the DLL anyway while processing the LOAD_DLL_DEBUG_EVENT message.

As you can read in MSDN, lpImageName never holds the filename directly; the name, if any, lives in the target-process. Furthermore, there may be no filename in the target-process either (i.e. even when read via ReadProcessMemory). Also, the filename may not be fully qualified (as I’ve tested). Thus we will not use this method. We’ll retrieve the filename from the hFile member.

How to get the name of a file from its HANDLE?

Unfortunately, we need to use the method described in MSDN, which takes around 10 API calls to get the filename from a handle. I have slightly modified the function GetFileNameFromHandle. The code is not shown here for brevity; it is available in the source file attached to this article.
Anyway, here is the basic code to process this event:

case CREATE_PROCESS_DEBUG_EVENT:
{
   CString strEventMessage = GetFileNameFromHandle(debug_event.u.CreateProcessInfo.hFile);
   // Use strEventMessage, and other members of CreateProcessInfo, to inform the user of this event.
}
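
As an aside, on Windows Vista and later a shorter route from handle to path exists: the GetFinalPathNameByHandleW API. The sketch below is not the GetFileNameFromHandle function from the attached source; PathFromHandle is a hypothetical name, it takes the handle as void* (which is what HANDLE is) so that the non-Windows stub compiles, and it deliberately gives up on paths that do not fit in MAX_PATH characters.

```cpp
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: returns the full path of an open file handle, or an
// empty string on failure. Assumes Windows Vista or later.
std::wstring PathFromHandle(void* hFile)
{
#ifdef _WIN32
    wchar_t buffer[MAX_PATH];

    // On success the return value is the length without the terminating
    // null; a value >= MAX_PATH means the buffer was too small.
    DWORD len = GetFinalPathNameByHandleW(hFile, buffer, MAX_PATH,
                                          FILE_NAME_NORMALIZED);
    if (len == 0 || len >= MAX_PATH)
        return std::wstring();

    return std::wstring(buffer, len);    // typically prefixed with \\?\ 
#else
    (void)hFile;
    return std::wstring();               // API is Windows-only
#endif
}
```

It could be called as PathFromHandle(debug_event.u.CreateProcessInfo.hFile), or with the hFile of a LOAD_DLL_DEBUG_EVENT.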

You may have noticed that I did not cover a few members of this structure. I will probably cover all of them in the next part of this article.

Processing LOAD_DLL_DEBUG_EVENT

This event is similar to CREATE_PROCESS_DEBUG_EVENT and, as you might have guessed, it is raised when a DLL is loaded by the OS. It is raised whenever a DLL is loaded, either implicitly or explicitly (when the debuggee calls LoadLibrary). This debugging event occurs only the first time the system attaches a DLL to the virtual address space of a process. To process this event we use the ‘LoadDll’ member of the union. It is of type LOAD_DLL_DEBUG_INFO:

struct LOAD_DLL_DEBUG_INFO
{
   HANDLE hFile;         // Handle to the DLL physical file.
   LPVOID lpBaseOfDll;   // The DLL Actual load address in process.
   DWORD dwDebugInfoFileOffset;
   DWORD nDebugInfoSize;
   LPVOID lpImageName;   // These two members are the same as in CREATE_PROCESS_DEBUG_INFO
   WORD fUnicode;
};

To retrieve the filename we use the same function, GetFileNameFromHandle, that we used for CREATE_PROCESS_DEBUG_EVENT.
I will list the code for processing this event when I describe UNLOAD_DLL_DEBUG_EVENT, since UNLOAD_DLL_DEBUG_EVENT carries no direct information from which to find the name of the DLL file.
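
Until then, here is one possible way to bridge that gap, as a sketch under my own assumptions rather than the code from the attached source. UNLOAD_DLL_DEBUG_INFO carries only the base address (lpBaseOfDll), so the Debugger can remember the name of each DLL at load time, keyed by its base address, and look it up again at unload time. The class name LoadedDllTable is hypothetical.

```cpp
#include <map>
#include <string>

// Hypothetical bookkeeping class: remembers which DLL was loaded at which
// base address in the Debuggee.
class LoadedDllTable
{
public:
    // Call while handling LOAD_DLL_DEBUG_EVENT, with LoadDll.lpBaseOfDll
    // and the name obtained from LoadDll.hFile.
    void OnLoad(const void* baseOfDll, const std::wstring& fileName)
    {
        names_[baseOfDll] = fileName;
    }

    // Call while handling UNLOAD_DLL_DEBUG_EVENT, with UnloadDll.lpBaseOfDll.
    // Returns the remembered name and forgets the entry.
    std::wstring OnUnload(const void* baseOfDll)
    {
        std::map<const void*, std::wstring>::iterator it = names_.find(baseOfDll);
        if (it == names_.end())
            return L"<unknown>";

        std::wstring fileName = it->second;
        names_.erase(it);
        return fileName;
    }

private:
    std::map<const void*, std::wstring> names_;
};
```

The base address is used only as a key and is never dereferenced, which matters because it is an address in the Debuggee and not in the Debugger.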

Processing CREATE_THREAD_DEBUG_EVENT

This debug event is generated whenever a new thread is created in the debuggee. Like CREATE_PROCESS_DEBUG_EVENT, this event is raised before the thread actually gets to run. To get information about this event, we use the ‘CreateThread’ union member. This variable is of type CREATE_THREAD_DEBUG_INFO:

struct CREATE_THREAD_DEBUG_INFO
{
  HANDLE hThread;      // Handle to the newly created thread in debuggee
  LPVOID lpThreadLocalBase;
  LPTHREAD_START_ROUTINE lpStartAddress; // pointer to the starting address of the thread
};

The thread-id of the newly created thread is available in DEBUG_EVENT::dwThreadId.
Using these members to inform the user is straightforward:

case CREATE_THREAD_DEBUG_EVENT:
{
   CString strEventMessage;
   // %p formats handles and addresses correctly on both 32-bit and 64-bit builds.
   strEventMessage.Format(L"Thread 0x%p (Id: %u) created at: 0x%p",
			debug_event.u.CreateThread.hThread,
			debug_event.dwThreadId,
			debug_event.u.CreateThread.lpStartAddress); // Sample values: handle 0xc, Id 7920, start address 0x77b15e58
}

The ‘lpStartAddress’ is meaningful in the Debuggee and not in the Debugger; we are just displaying it for completeness. Remember that this event is not received for the primary/initial thread of the process. It is received only for subsequent thread creations in the debuggee.
