In this part, I will show you how to load module debugging symbols (PDB files) into the debugger and how to bind them to source files. This can’t be achieved without diving into process, thread and module internals, so we will examine these structures as well.
Our small debugger mindbg after the last part (part 2) is attached to the appdomains and receives events from the debuggee. Before we start dealing with symbols and sources, I will quickly explain what changes were made to the already implemented logic.
I created a new class that will be a parent for all debuggee events:
public class CorEventArgs
{
    private readonly CorController controller;
    public CorEventArgs(CorController controller)
    {
        this.controller = controller;
    }
    public CorController Controller { get { return this.controller; } }
    public bool Continue { get; set; }
}
All events are now dispatched to the process that they belong to. As an example, take a look at the Breakpoint event handler in CorDebugger:
void ICorDebugManagedCallback.Breakpoint(
    ICorDebugAppDomain pAppDomain, ICorDebugThread pThread, ICorDebugBreakpoint pBreakpoint)
{
    var ev = new CorBreakpointEventArgs(new CorAppDomain(pAppDomain, p_options),
                                        new CorThread(pThread),
                                        new CorFunctionBreakpoint(
                                            (ICorDebugFunctionBreakpoint)pBreakpoint));
    GetOwner(ev.Controller).DispatchEvent(ev);
    FinishEvent(ev);
}
The DispatchEvent method is implemented in the CorProcess class. For each type of event that we are interested in, we have an overloaded version of this method. Example:
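public delegate void CorBreakpointEventHandler(CorBreakpointEventArgs ev);
public event CorBreakpointEventHandler OnBreakpoint;
internal void DispatchEvent(CorBreakpointEventArgs ev)
{
    ev.Continue = false;
    // Copy the delegate first: invoking an event with no subscribers throws.
    var handler = OnBreakpoint;
    if (handler != null)
        handler(ev);
}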
We also want to stop the debugger on the Main method of the executable module, so we will create a function breakpoint in the ModuleLoad event handler (more about breakpoints in the next part of the series):
internal void DispatchEvent(CorModuleLoadEventArgs ev)
{
    if (!p_options.IsAttaching)
    {
        var symreader = ev.Module.GetSymbolReader();
        if (symreader != null)
        {
            Int32 token = symreader.UserEntryPoint.GetToken();
            if (token != 0)
            {
                CorFunction func = ev.Module.GetFunctionFromToken(token);
                CorBreakpoint breakpoint = func.CreateBreakpoint();
                breakpoint.Activate(true);
            }
        }
    }
    ev.Continue = true;
}
That’s all about events. I also made some minor changes in other parts of the application, but I don’t think they are important enough to be mentioned in this post. So let’s focus on the main topic.
I want to display the source code for the location where the breakpoint was hit. So first, let’s subscribe to the breakpoint event on the newly created process:
var debugger = DebuggingFacility.CreateDebuggerForExecutable(args[0]);
var process = debugger.CreateProcess(args[0]);
process.OnBreakpoint += new MinDbg.CorDebug.CorProcess.CorBreakpointEventHandler(process_OnBreakpoint);
The handler code is as follows:
static void process_OnBreakpoint(MinDbg.CorDebug.CorBreakpointEventArgs ev)
{
    Console.WriteLine("Breakpoint hit.");
    var source = ev.Thread.GetCurrentSourcePosition();
    DisplayCurrentSourceCode(source);
}
There are two methods that are mysterious here: CorThread.GetCurrentSourcePosition and DisplayCurrentSourceCode. Let’s start with the GetCurrentSourcePosition method. When a thread executes application code, it uses a stack to store each function’s local variables, arguments and return address. So each stack frame is associated with the function that is currently using it. The most recent frame is the active frame, and we can retrieve it using the ICorDebugThread.GetActiveFrame method:
public CorFrame GetActiveFrame()
{
    ICorDebugFrame coframe;
    p_cothread.GetActiveFrame(out coframe);
    return new CorFrame(coframe, s_options);
}
and use it to get the current source position:
public CorSourcePosition GetCurrentSourcePosition()
{
    return GetActiveFrame().GetSourcePosition();
}
Inside the active CorFrame, we have access to the function associated with it:
public CorFunction GetFunction()
{
    ICorDebugFunction cofunc;
    p_coframe.GetFunction(out cofunc);
    return cofunc == null ? null : new CorFunction(cofunc, s_options);
}
public CorSourcePosition GetSourcePosition()
{
    UInt32 ip;
    CorDebugMappingResult mappingResult;
    // GetIP is defined on ICorDebugILFrame; 'frame' is presumably p_coframe
    // viewed through that interface (its declaration is not shown here).
    frame.GetIP(out ip, out mappingResult);
    if (mappingResult == CorDebugMappingResult.MAPPING_NO_INFO ||
        mappingResult == CorDebugMappingResult.MAPPING_UNMAPPED_ADDRESS)
        return null;
    return GetFunction().GetSourcePositionFromIP((Int32)ip);
}
The ip variable represents the instruction pointer which (according to MSDN) is the stack frame’s offset into the function’s Microsoft intermediate language (MSIL) code. That basically means that the ip variable points to the currently executed code. The question now is how to bind this instruction pointer to the real source code line stored in a physical file. Here, symbol files come into play. Symbol files (PDB files) may be considered translators of the binary code into human-readable source code. Unfortunately, the whole logic behind symbol files is quite complex and explaining it thoroughly would take a lot of space (which might actually be a good subject for a few further posts). For now, let’s assume that symbol files will provide us with the source file path and line coordinates corresponding to our instruction pointer value. I tried to implement the symbol readers and binders on my own, but this subject overwhelmed me and I finally imported all symbol classes and interfaces from the MDbg source code. So I will just show you how to use these classes; anyone who wants more detail can look through the content of the mindbg\Symbols folder.
Each module (CorModule instance) has its own instance of the SymReader class (created with the help of the SymbolBinder):
public ISymbolReader GetSymbolReader()
{
    if (!p_isSymbolReaderInitialized)
    {
        p_isSymbolReaderInitialized = true;
        p_symbolReader = (GetSymbolBinder() as ISymbolBinder2).GetReaderForFile(
            GetMetadataInterface<IMetadataImport>(),
            GetName(),
            s_options.SymbolPath);
    }
    return p_symbolReader;
}
Moving back to the CorFrame.GetSourcePosition code snippet, you might have noticed that in the end, it calls the GetSourcePositionFromIP method on the CorFunction instance associated with this frame. Let’s now load source information from symbol files for this function:
private void SetupSymbolInformation()
{
    if (p_symbolsInitialized)
        return;
    p_symbolsInitialized = true;
    CorModule module = GetModule();
    ISymbolReader symreader = module.GetSymbolReader();
    p_hasSymbols = symreader != null;
    if (p_hasSymbols)
    {
        ISymbolMethod sm = null;
        sm = symreader.GetMethod(new SymbolToken((Int32)GetToken()));
        if (sm == null)
        {
            p_hasSymbols = false;
            return;
        }
        p_symMethod = sm;
        p_SPcount = p_symMethod.SequencePointCount;
        p_SPoffsets = new Int32[p_SPcount];
        p_SPdocuments = new ISymbolDocument[p_SPcount];
        p_SPstartLines = new Int32[p_SPcount];
        p_SPendLines = new Int32[p_SPcount];
        p_SPstartColumns = new Int32[p_SPcount];
        p_SPendColumns = new Int32[p_SPcount];
        p_symMethod.GetSequencePoints(p_SPoffsets, p_SPdocuments, p_SPstartLines,
                                      p_SPstartColumns, p_SPendLines, p_SPendColumns);
    }
}
You may see that our function is represented in the Symbol API as SymMethod, which contains a collection of sequence points. Each sequence point is defined by the IL offset, source file path, start line number, end line number, start column index and end column index. The IL offset is actually the value that interests us most because it is directly connected to the ip variable (which holds the instruction pointer value). So finally, we are ready to implement the CorFunction.GetSourcePositionFromIP method:
public CorSourcePosition GetSourcePositionFromIP(Int32 ip)
{
    SetupSymbolInformation();
    if (!p_hasSymbols)
        return null;
    if (p_SPcount > 0 && p_SPoffsets[0] <= ip)
    {
        Int32 i;
        for (i = 0; i < p_SPcount; i++)
        {
            if (p_SPoffsets[i] >= ip)
                break;
        }
        if (i == p_SPcount || p_SPoffsets[i] != ip)
            i--;
        CorSourcePosition sp = null;
        if (p_SPstartLines[i] == SpecialSequencePoint)
        {
            Int32 noSpecialSequencePointInd = i;
            while (--noSpecialSequencePointInd >= 0)
                if (p_SPstartLines[noSpecialSequencePointInd] != SpecialSequencePoint)
                    break;
            if (noSpecialSequencePointInd < 0)
            {
                noSpecialSequencePointInd = i;
                while (++noSpecialSequencePointInd < p_SPcount)
                    if (p_SPstartLines[noSpecialSequencePointInd] != SpecialSequencePoint)
                        break;
            }
            Debug.Assert(noSpecialSequencePointInd >= 0);
            if (noSpecialSequencePointInd < p_SPcount)
            {
                sp = new CorSourcePosition(true,
                                           p_SPdocuments[noSpecialSequencePointInd].URL,
                                           p_SPstartLines[noSpecialSequencePointInd],
                                           p_SPendLines[noSpecialSequencePointInd],
                                           p_SPstartColumns[noSpecialSequencePointInd],
                                           p_SPendColumns[noSpecialSequencePointInd]);
            }
        }
        else
        {
            sp = new CorSourcePosition(false, p_SPdocuments[i].URL, p_SPstartLines[i], p_SPendLines[i],
                                       p_SPstartColumns[i], p_SPendColumns[i]);
        }
        return sp;
    }
    return null;
}
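To make the core of this lookup easier to see, here is a minimal, self-contained sketch that isolates it from the CorFunction fields. It is not mindbg code: the SequencePointLookup class, the FindIndex helper and the sample offsets are made up for illustration, and HiddenLine assumes that SpecialSequencePoint equals 0xFEEFEE, the start-line value that compilers use to mark hidden sequence points.

```csharp
using System;

static class SequencePointLookup
{
    // Assumption: start-line value that marks a hidden sequence point.
    public const int HiddenLine = 0xFEEFEE;

    // Returns the index of the last sequence point whose IL offset is <= ip,
    // or -1 if ip lies before every sequence point.
    // The offsets array must be sorted in ascending order.
    public static int FindIndex(int[] offsets, int ip)
    {
        for (int i = offsets.Length - 1; i >= 0; i--)
        {
            if (offsets[i] <= ip)
                return i;
        }
        return -1;
    }

    static void Main()
    {
        // Made-up sequence points: IL offsets and the start lines they map to.
        int[] offsets = { 0, 1, 7, 12 };
        int[] startLines = { 10, 11, HiddenLine, 13 };

        // ip 9 falls inside the sequence point that starts at offset 7.
        int index = FindIndex(offsets, 9);
        bool hidden = startLines[index] == HiddenLine;
        Console.WriteLine("index={0}, hidden={1}", index, hidden); // index=2, hidden=True
    }
}
```

When the matching sequence point is hidden, as in this sample, the method above falls back to the nearest preceding non-hidden sequence point and, if there is none, to the nearest following one.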
And the second mysterious function, DisplayCurrentSourceCode, from the beginning of the post is as follows:
static void DisplayCurrentSourceCode(CorSourcePosition source)
{
    // GetCurrentSourcePosition returns null when no symbols are available.
    if (source == null)
        return;
    SourceFileReader sourceReader = new SourceFileReader(source.Path);
    Debug.Assert(source.StartLine < sourceReader.LineCount
                 && source.EndLine < sourceReader.LineCount);
    if (source.StartLine >= sourceReader.LineCount ||
        source.EndLine >= sourceReader.LineCount)
        return;
    for (Int32 i = source.StartLine; i <= source.EndLine; i++)
    {
        String line = sourceReader[i];
        bool highlightning = false;
        for (Int32 col = 0; col < line.Length; col++)
        {
            if (source.EndColumn == 0 || col >= source.StartColumn - 1
                && col <= source.EndColumn)
            {
                if (!highlightning)
                {
                    Console.ForegroundColor = ConsoleColor.Yellow;
                    highlightning = true;
                }
                Console.Write(line[col]);
            }
            else
            {
                if (highlightning)
                {
                    Console.ForegroundColor = ConsoleColor.Gray;
                    highlightning = false;
                }
                Console.Write(line[col]);
            }
        }
        // Assumes the reader stores lines without their terminators.
        Console.WriteLine();
    }
    // Do not leave the console in the highlight color.
    Console.ResetColor();
}
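The reader that feeds lines to this method is not listed in this post. As a rough sketch only, assuming that line numbers are 1-based (as sequence points report them) and that LineCount includes an unused placeholder at index 0 so that the bounds check above also accepts the last line of the file, it could look like the class below. The members are inferred from how DisplayCurrentSourceCode uses them; the real mindbg class may differ.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical sketch, not the actual mindbg implementation.
public sealed class SourceFileReader
{
    private readonly List<string> lines = new List<string>();

    public SourceFileReader(string path)
    {
        // Index 0 is a placeholder so that this[n] returns source line n.
        lines.Add(string.Empty);
        // File.ReadAllLines reads the whole file and strips line terminators.
        lines.AddRange(File.ReadAllLines(path));
    }

    // Number of stored entries, including the placeholder at index 0.
    public int LineCount { get { return lines.Count; } }

    // 1-based access to a source line.
    public string this[int lineNumber] { get { return lines[lineNumber]; } }
}
```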
The SourceFileReader class is just a simple text file reader which reads the whole file at once and stores all lines in a collection of strings. What’s the final result? Have a look:
There is a lot more to say about symbols and source files. I hope that in further posts, I will show you how to download symbols from a symbol store and source files from repositories. As usual, the source code for this post may be found at mindbg.codeplex.com (revision 55200).