When we work with binary data, we often use the dt command to group the bytes into meaningful fields, e.g.:
0:000> dt ntdll!_PEB @$peb
+0x000 InheritedAddressSpace : 0 ''
+0x001 ReadImageFileExecOptions : 0 ''
+0x002 BeingDebugged : 0x1 ''
+0x003 BitField : 0x8 ''
+0x003 ImageUsesLargePages : 0y0
+0x003 IsProtectedProcess : 0y0
+0x003 IsLegacyProcess : 0y0
+0x003 IsImageDynamicallyRelocated : 0y1
+0x003 SkipPatchingUser32Forwarders : 0y0
...
The problem arises when the library owner does not provide type information in the symbol files. We are then usually left with manually decomposing the bytes in a binary editor (010 Editor has a nice template system). Wouldn’t it be great if we had a template system available in the debugger as well? I have some good news for you: with the latest release of WinDbg, we received a very powerful feature: .natvis files. There were even two Defrag Tools episodes dedicated to this functionality: Defrag Tools #138 and Defrag Tools #139. Let’s first analyze how .natvis files are built, so that we can later use them in our binary data analysis.
.natvis Files
.natvis files have been used for some time in Visual Studio to customize the way variables are displayed in watch windows. You may find the .natvis files used by Visual Studio in %VSINSTALLDIR%\Common7\Packages\Debugger\Visualizers. They are XML files, constructed in accordance with the schema defined in the %VSINSTALLDIR%\XML\Schemas\1033\natvis.xsd file. You may define your own .natvis files in a project and Visual Studio will embed them in .pdb files (more information here). A sample .natvis file may look as follows:
<?xml version="1.0" encoding="utf-8"?>
<AutoVisualizer
xmlns="http://schemas.microsoft.com/vstudio/debugger/natvis/2010">
<Type Name="tagRECT">
<AlternativeType Name="CRect"></AlternativeType>
<DisplayString>{{LT({left}, {top}) RB({right}, {bottom}) [{right-left} x {bottom-top}]}}</DisplayString>
<Expand>
<Item Name="[top]">top,x</Item>
<Item Name="[right]">right,x</Item>
<Item Name="[width]">right - left</Item>
<Item Name="[bottom]">bottom</Item>
<Item Name="[left]">left</Item>
</Expand>
</Type>
</AutoVisualizer>
The Name attribute of the Type tag is very important: it is the type identifier and specifies that this template will be used only for objects whose type matches this string. The DisplayString tag value is used to show a one-line view of the object, and the Item tags represent fields. Each field is a C++ expression which is evaluated in the context of the current object. In the DisplayString tag, expressions are placed inside curly braces, e.g. {left}. Additionally, we may control how the expression value is displayed with the help of format specifiers. A list of available format specifiers can be found on MSDN. In our example, we use the hex specifier (x) for the top and right fields.
To load our .natvis file into WinDbg, we may use the .nvload <path-to-the-file> command. To unload it, use the .nvunload <file-name> command, or the .nvunloadall command to unload all the .natvis files. If you want your .natvis file to be loaded automatically by WinDbg, place it in the Visualizers folder in the Debugging Tools installation directory. The dx command allows you to use the .natvis type definition to dump the contents of the object instance. There is no help for this command in the official WinDbg .chm file, so you are left with the -? switch. To finish this paragraph, let’s have a look at a sample WinDbg session in which we will load the above .natvis file:
0:000> .nvload c:\temp\windbg-dx\test.natvis
Successfully loaded visualizers in "c:\temp\windbg-dx\test.natvis"
0:000> .nvlist
Loaded NatVis Files:
c:\temp\windbg-dx\test.natvis
0:000> dx rect
rect : {LT(1, 2) RB(3, 4) [2 x 2]} [Type: tagRECT]
[<Raw View>]
[top] : 0x2
[right] : 0x3
[width] : 2
[bottom] : 4
[left] : 1
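To check what dx should print for a given rectangle, the DisplayString template can be mirrored in plain C++. This is only a sketch under assumptions: tagRECT is declared locally as a stand-in for the Windows structure (so that the code compiles without windows.h), and display_string is a hypothetical helper that does not exist in WinDbg or Visual Studio.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Local stand-in for the Windows RECT layout (assumption: four 32-bit
// fields in the order left, top, right, bottom).
struct tagRECT {
    std::int32_t left;
    std::int32_t top;
    std::int32_t right;
    std::int32_t bottom;
};

// Hypothetical helper that evaluates the same expressions as the template
// {{LT({left}, {top}) RB({right}, {bottom}) [{right-left} x {bottom-top}]}}.
// The doubled braces in the template are escapes for literal { and }.
std::string display_string(const tagRECT& r) {
    std::ostringstream out;
    out << "{LT(" << r.left << ", " << r.top << ") "
        << "RB(" << r.right << ", " << r.bottom << ") "
        << "[" << (r.right - r.left) << " x " << (r.bottom - r.top) << "]}";
    return out.str();
}
```

Calling display_string on a tagRECT initialized with 1, 2, 3, 4 builds the same one-line summary that the session shows for rect. A variable like this in a small test program is also enough to try the session yourself.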
Type Templates in WinDbg
In the previous paragraph, we have examined the usual way of using .natvis files. But what about raw binary data when we have no private symbols available? The good news is that it’s still possible to use the dx command. In the next example, we will work with the following C# type (a struct, despite its name):
public struct TestClass
{
public Guid Id { get; set; }
public int Count { get; set; }
public String Name { get; set; }
}
and a very simple program:
public static void Main() {
var t = new TestClass() {
Id = Guid.NewGuid(),
Count = 2,
Name = "test class"
};
Console.ReadLine();
}
Let’s break the execution when the application is waiting for the user input and dump the TestClass instance using the !wdo command from the netext extension:
0:000> !Name2EE Test TestClass
Module: 01263fbc
Assembly: Test.exe
Token: 02000002
MethodTable: 01264db0
EEClass: 012617b8
Name: TestClass
0:000> !wdo -mt 01264db0 0x010ff1d0
...
629ae918 System.String +0000 _Name_k__BackingField 032f2754 test class
629b07a0 System.Int32 +0004 _Count_k__BackingField 2 (0n2)
629aba00 System.Guid +0008 _Id_k__BackingField -mt 629ABA00 00000000 {c41e14c4-95fc-402b-8e54-9f2ec1f4865e}
We will now try to mimic the above output using a .natvis file and the dx command. Our .natvis file will look as follows:
<AutoVisualizer xmlns="http://schemas.microsoft.com/vstudio/debugger/natvis/2010">
<Type Name="T1">
<DisplayString>CLR string</DisplayString>
<Expand>
<ArrayItems>
<Size>*((int *)(this + 4))</Size>
<ValuePointer>(NvWchar *)(this + 8)</ValuePointer>
</ArrayItems>
</Expand>
</Type>
<Type Name="T0">
<Expand>
<Item Name="Id">*((NvGuid *)(this + 8))</Item>
<Item Name="Count">*((int *)(this + 4))</Item>
<Item Name="Name">*((T1 *)(*(int *)this))</Item>
</Expand>
</Type>
</AutoVisualizer>
Don’t be scared by the number of * characters in the field definitions. As we don’t have any symbols to rely on, we need to deal with pointers. Our base pointer is always this. To make the evaluator work, we always need to specify which type we expect on the output. For example, the Count field is at offset 4 of the TestClass instance. Thus, we first add 4 bytes to the this address, cast the result to int * and then dereference it, as we are interested in its value; hence the expression *((int *)(this + 4)). The CLR String is slightly more complicated, but the rules are the same.
The last thing I need to explain are the type names. You’ve probably noticed the strange T0 and T1 type names used in the templates, as well as NvWchar and NvGuid. The dx command is able to operate only on types for which it has symbols. So if we create a completely imaginary type in the .natvis file and try to cast a memory address to it, the dx command won’t work. This is where the NatvisTypes library helps: I have defined some mock types in it for you (T0, T1, T2, …, T9) and, additionally, types like NvGuid and NvWchar (I have plans to add other types in the future). The source code is committed to the same repo as the lld extension: https://github.com/lowleveldesign/lldext and binaries can be found on the release page. There is one problem though: we need to have NatvisTypes.dll loaded into the process. For this we can use the !injectdll command, which I have released with the lld WinDbg extension. The visualizers for the Nv* types are defined in the project and automatically added to the NatvisTypes.pdb file. WinDbg is kind enough to load the visualizers with the .pdb file. Let’s have a look at what the debugger output looks like for our example TestClass instance:
0:000> .load lld
0:000> !injectdll d:\dev\src\lldext\Win32\Debug\NatvisTypes.dll
0:000> .nvload c:\temp\TestClass.natvis
Successfully loaded visualizers in "c:\temp\TestClass.natvis"
0:000> ld NatvisTypes
*** WARNING: Unable to verify checksum for d:\dev\src\lldext\Win32\Debug\NatvisTypes.dll
Symbols loaded for NatvisTypes
0:000> dx *((T0 *)0x010ff1d0)
*((T0 *)0x010ff1d0) : [Type: T0]
[<Raw View>]
Id : 0xc41e14c4-0x95fc-0x402b-0x8e0x54-0x9f0x2e0xc10xf40x860x5e [Type: NvGuid]
Count : 2
Name : CLR string [Type: T1]
0:000> dx -r1 (*((NatvisTypes!T1 *)0x32f2754))
(*((NatvisTypes!T1 *)0x32f2754)) : CLR string [Type: T1]
[<Raw View>]
[0] : 116 't' [Type: NvWchar]
[1] : 101 'e' [Type: NvWchar]
[2] : 115 's' [Type: NvWchar]
[3] : 116 't' [Type: NvWchar]
[4] : 32 ' ' [Type: NvWchar]
[5] : 99 'c' [Type: NvWchar]
[6] : 108 'l' [Type: NvWchar]
[7] : 97 'a' [Type: NvWchar]
[8] : 115 's' [Type: NvWchar]
[9] : 115 's' [Type: NvWchar]
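The offset arithmetic in the T0 and T1 templates can also be written down in plain C++, which helps when checking a template against a hex dump. This is a sketch, not part of NatvisTypes or lldext: read_i32 and read_clr_string are hypothetical helper names, and the layout (character count at offset 4, UTF-16 code units from offset 8) is the 32-bit layout assumed by the T1 template above.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Reads a 32-bit value located `offset` bytes past `base`. This is the
// byte-level read that the template writes as *((int *)(this + offset)).
// memcpy is used so that the read is valid for any alignment.
std::int32_t read_i32(const std::uint8_t* base, std::size_t offset) {
    std::int32_t value;
    std::memcpy(&value, base + offset, sizeof value);
    return value;
}

// Decodes a CLR string the way the T1 template does (assumed 32-bit
// layout): character count at offset 4, UTF-16 code units from offset 8.
std::u16string read_clr_string(const std::uint8_t* str) {
    const std::int32_t length = read_i32(str, 4);
    if (length <= 0) {
        return std::u16string();  // empty or corrupted length: nothing to copy
    }
    std::u16string text(static_cast<std::size_t>(length), u'\0');
    std::memcpy(&text[0], str + 8, text.size() * sizeof(char16_t));
    return text;
}
```

With a buffer copied from the debuggee, read_i32(buffer, 4) corresponds to the Count item of T0, and read_clr_string applied to the bytes behind the Name pointer corresponds to the T1 expansion.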
I know the example is not the best one, but notice that we have transformed raw binary data into something meaningful. I haven’t yet touched the subject of postmortem debugging. It’s not possible to inject a DLL into a dump. When you need to analyze a dump, you will have to reuse types for which you do have symbols but which you would not normally need, for example:
0:000> dt ntdll!*
...
ntdll!_ALTERNATIVE_ARCHITECTURE_TYPE
ntdll!_KUSER_SHARED_DATA
ntdll!_TP_POOL
ntdll!_TP_CLEANUP_GROUP
ntdll!_ACTIVATION_CONTEXT
ntdll!_TP_CALLBACK_INSTANCE
...
You may override them in the .natvis file (by defining a Type entry with the same name) and later cast the memory to them. It’s quite tedious, but I haven’t found a better way here. Finally, if you are not yet impressed by the dx command, have a look at the output of the dx Debugger call in your WinDbg session.