PE Format Illustrated – Part 2

10 Mar 2023 | CPOL | 4 min read
Beginner’s tutorial on PE format applied to .NET assemblies
This is a beginner’s tutorial on the PE format as applied to .NET assemblies. We aim for a light, illustrated text that focuses on the big picture.

1. Continuation of Another Article

This article is a continuation of the previous article [11].

1.1. Tools Used – Hex Editor ImHex

I am using the hex editor ImHex (freeware), which lets me color sections of files [1].

1.2. Tools Used – PE Viewer “PE-bear”

PE-bear (freeware) [2] is a very useful tool for analyzing PE files visually. According to its documentation, it does not cover all variants and flavors of PE files, but for simpler ones it is a great viewer/parser/analyzer. I find it easier to study topics using visual tools.

1.3. Tools Used – Assembly Viewer

An excellent freeware tool, “A .NET assembly viewer”, is provided in the project [3].

1.4. Tools Used – CFF Explorer

Another PE file explorer, “CFF Explorer” (freeware), can be found at [4].

2. .NET Header (aka “COR20 Header”, “CLI Header”)

As we saw in the previous article [11], the “NT Headers - Optional Header – Data Directory” view presented by the PE-bear tool looks like this:

Image 1

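If you look at file offset 0x168, the entry for the .NET Header gives the address 0x2008 (RVA). As before, we convert it to a raw file offset by subtracting the virtual address of section .text (0x2000) and adding the section's raw offset (0x200): 0x2008-0x2000+0x200 = 0x208 (raw file). The size is 0x48.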
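The same RVA-to-raw-offset arithmetic is used throughout this article, so it may help to see it as a small helper. This is only a sketch: the function name is made up for illustration, and the default values 0x2000 and 0x200 are the ones used in the calculation above for our sample file, not general constants.

```python
def rva_to_file_offset(rva, section_rva=0x2000, section_raw_offset=0x200):
    """Translate an RVA inside one section into a raw file offset.

    The defaults are the values used in this article for our sample file;
    other files will generally have different section addresses.
    """
    if rva < section_rva:
        raise ValueError("RVA lies before the start of the section")
    return rva - section_rva + section_raw_offset


# The .NET Header address taken from the Data Directory entry.
print(hex(rva_to_file_offset(0x2008)))  # 0x208
```

The same call reproduces the other conversions in this article, for example `rva_to_file_offset(0x211C)` for the Metadata Header.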

Here is how it looks in Hex editor:

Image 2

And here is an interpretation by PE-bear:

Image 3

And here is an interpretation by Assembly Viewer:

Image 4
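For readers who prefer code to screenshots, here is a minimal sketch of how the 0x48 bytes of the .NET Header could be unpacked. The field order follows the CLI header layout described in ECMA-335; the Python names are my own, and the sample bytes are built by hand: the MetaData and Resources addresses come from this article, while the runtime version, flags, entry point token and MetaData size are placeholder values, not read from our file.

```python
import struct
from collections import namedtuple

# Field names are illustrative; the order follows the CLI header layout.
Cor20Header = namedtuple("Cor20Header", [
    "cb", "major_runtime_version", "minor_runtime_version",
    "metadata_rva", "metadata_size", "flags", "entry_point_token",
    "resources_rva", "resources_size",
    "strong_name_rva", "strong_name_size",
    "code_manager_rva", "code_manager_size",
    "vtable_fixups_rva", "vtable_fixups_size",
    "export_jumps_rva", "export_jumps_size",
    "managed_native_rva", "managed_native_size",
])

# Little-endian: size, two version words, MetaData directory, flags,
# entry point token, then six more (RVA, size) directory pairs.
COR20_FORMAT = "<IHH" + "II" + "II" + "II" * 6


def parse_cor20(data, offset=0):
    """Unpack a .NET (COR20) header found at `offset` in `data`."""
    return Cor20Header._make(struct.unpack_from(COR20_FORMAT, data, offset))


# Hand-made sample. Only metadata_rva, resources_rva and resources_size
# are values from this article; the rest are placeholders.
sample = struct.pack(
    COR20_FORMAT,
    0x48, 2, 5,          # cb, runtime version (placeholder)
    0x211C, 0x9E0,       # MetaData RVA (article), size (placeholder)
    1, 0x06000001,       # flags, entry point token (placeholders)
    0x2AFC, 0xD8,        # Resources RVA and size (article)
    *([0] * 10),         # remaining directories left empty
)

header = parse_cor20(sample)
print(hex(header.cb), hex(header.metadata_rva), hex(header.resources_rva))
```

Note that the MetaData directory starts 8 bytes into the header, which is why, with the header at 0x208, the MetaData entry sits at file offset 0x210.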

3. Resources (aka “Managed Resources”)

The .NET Header lists ResourceDir at 0x2AFC (RVA), with size 0xD8. Again, 0x2AFC-0x2000+0x200 = 0xCFC (raw file). The end is 0xCFC+0xD8 = 0xDD4 (raw file).

Here it is in the Hex editor:

Image 5

As you can see, we mapped the string “HW” to the “Hello World!” string in our C# code resources.

4. Metadata

4.1. Short Theoretical Background

Metadata inside the .NET assembly is organized in 5 streams. The name of each stream starts with #. The streams are:

  1. #~ stream – This is the “metadata stream” and contains info on the types, methods, fields, properties and events in the assembly.
  2. #Strings stream – Contains names of namespaces, types, and members
  3. #US stream – This is the “user string heap” and contains all the strings used in the code
  4. #GUID stream – Stores GUIDs used in the assembly
  5. #Blob stream – Contains pure binary data

4.2. File Content

As seen above, the MetaData entry at file offset 0x210 gives the address 0x211C (RVA). Again, 0x211C-0x2000+0x200 = 0x31C (raw file). That is the location of the Metadata Header, as can be seen in the Hex editor.

Image 6

And here is an interpretation by Assembly Viewer:

Image 7

So, we see that we have metadata in 5 streams. Offsets here are given “relative to start of metadata header (0x31C)”. We need to do some calculations:

  • Stream #Strings starts at 0x31C+0x3D0= 0x6EC(raw file). It ends at 0x6EC+0x410= 0xAFC(raw file).
  • Stream #Blob starts at 0x31C+0x828= 0xB44(raw file). It ends at 0xB44+0x1B8= 0xCFC(raw file).
  • Stream #GUID starts at 0x31C+0x818= 0xB34(raw file). It ends at 0xB34+0x10= 0xB44(raw file).
  • Stream #US starts at 0x31C+0x7E0= 0xAFC(raw file). It ends at 0xAFC+0x38= 0xB34(raw file).
  • Stream #~ starts at 0x31C+0x6C= 0x388(raw file). It ends at 0x388+0x364= 0x6EC(raw file).
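The calculations above can also be done by a short script that reads the stream headers itself. The sketch below follows the metadata root layout from ECMA-335 (signature “BSJB”, version string, then one header per stream with offset, size and a zero-terminated name padded to 4 bytes). The function names are hypothetical, and instead of our real file it parses a hand-built sample: the stream offsets and sizes are the ones from this article, while the version string "v4.0.30319" and the order of the stream headers are assumptions.

```python
import struct

METADATA_SIGNATURE = 0x424A5342  # the bytes "BSJB" read as little-endian


def parse_metadata_root(data, base):
    """Return {stream name: (start, end)} as raw file offsets."""
    signature, _major, _minor, _reserved, version_len = struct.unpack_from(
        "<IHHII", data, base)
    if signature != METADATA_SIGNATURE:
        raise ValueError("metadata signature not found")
    pos = base + 16 + version_len            # skip the version string
    _flags, stream_count = struct.unpack_from("<HH", data, pos)
    pos += 4
    streams = {}
    for _ in range(stream_count):
        offset, size = struct.unpack_from("<II", data, pos)
        pos += 8
        name_end = data.index(b"\0", pos)    # name is zero-terminated
        name = data[pos:name_end].decode("ascii")
        name_len = name_end - pos + 1        # include the terminator
        pos += (name_len + 3) & ~3           # padded to a multiple of 4
        # Offsets are relative to the start of the metadata header.
        streams[name] = (base + offset, base + offset + size)
    return streams


def build_sample_root():
    """Hand-made metadata root using the offsets and sizes from the text."""
    version = b"v4.0.30319\0\0"              # assumed version, padded to 12
    out = struct.pack("<IHHII", METADATA_SIGNATURE, 1, 1, 0, len(version))
    out += version + struct.pack("<HH", 0, 5)
    for name, offset, size in [("#~", 0x6C, 0x364), ("#Strings", 0x3D0, 0x410),
                               ("#US", 0x7E0, 0x38), ("#GUID", 0x818, 0x10),
                               ("#Blob", 0x828, 0x1B8)]:
        raw = name.encode("ascii") + b"\0"
        raw += b"\0" * (-len(raw) % 4)
        out += struct.pack("<II", offset, size) + raw
    return out


# Place the sample root at raw offset 0x31C, as in our file.
image = bytes(0x31C) + build_sample_root()
for name, (start, end) in parse_metadata_root(image, 0x31C).items():
    print(f"{name}: 0x{start:X} - 0x{end:X}")
```

With these assumptions the sample root is exactly 0x6C bytes long, which is consistent with stream #~ starting at offset 0x6C, right after the stream headers.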

We will not go into what each stream contains; let us just look at the interpretation by Assembly Viewer.

Image 8

Stream #~ is special, so it is presented differently:

Image 9

5. Method Bodies

So, that was Metadata. Where are our method bodies, that is IL?

For that, we need to interpret Metadata. We will not be doing it manually, but instead using tools to see that interpretation.

Let us see the interpretation by CFF Explorer.

Image 10

So, we can see that Metadata stream #~ contains a folder with Methods, and for method Main we see the offset 0x2069 (RVA), which is the starting point of our method code. Again, 0x2069-0x2000+0x200 = 0x269 (raw file), a location in section .text.

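So, in our file, method bodies are stored between the .NET Header and the Metadata Header. The .NET Header ends at 0x208+0x48 = 0x250 (raw file) and the Metadata Header starts at 0x31C (raw file), so that is the area 0x250 (raw file) - 0x31B (raw file).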
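As a quick sanity check, the boundaries of that area can be recomputed from the numbers we already have. This is just a sketch with made-up names, using the offsets from this article; it describes the layout of our sample file and is not a rule for every assembly.

```python
# Raw file offsets taken from the sections above.
NET_HEADER_OFFSET = 0x208
NET_HEADER_SIZE = 0x48
METADATA_HEADER_OFFSET = 0x31C
MAIN_BODY_OFFSET = 0x269


def method_body_area(net_header_offset, net_header_size, metadata_offset):
    """Return the (first, last) raw offsets between the two headers."""
    first = net_header_offset + net_header_size  # first byte after .NET Header
    last = metadata_offset - 1                   # last byte before Metadata
    return first, last


first, last = method_body_area(
    NET_HEADER_OFFSET, NET_HEADER_SIZE, METADATA_HEADER_OFFSET)
print(hex(first), hex(last))                 # 0x250 0x31b
print(first <= MAIN_BODY_OFFSET <= last)     # True
```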

6. Section .text Overview

So, here is the big picture, an overview of our .NET assembly packaged into the section .text of our file.

Image 11

Image 12

Image 13

7. Conclusion

This article is a continuation of the article [11]. Here we outlined how a .NET assembly is packaged into the PE file format. We tried to give a light, illustrated text and did not go into all the fine details of how bytes are packaged, but relied on tools to parse and interpret the data.
An interested reader can find more details in articles [6]-[9].

8. References

History

  • 10th March, 2023: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)