Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / Languages / C

Understanding LIST_ENTRY Lists and Its Importance in Operating Systems

4.86/5 (23 votes)
24 Jul 2014CPOL16 min read 65.5K   1.3K  
Understanding LIST_ENTRY lists and CONTAINING_RECORD macro which are used in Windows Kernel

Table of Contents

  • Introduction
  • Background
  • Required software tools
  • Download links for software tools
  • Quick look at the code
  • LIST_ENTRY memory layout and working
  • CONTAINING_RECORD working
  • Operations on LIST_ENTRY
  • Using WinDbg dt command to dump out MYSTRUCT structure using LIST_ENTRY
  • Using WinDbg dt command to dump out all process in a live kernel
  • Using WinDbg dt command to dump out all threads in a process in a live kernel

Introduction

When we write software at the lowest possible level of an operating system, more than often we do not have the extravagance of using fancy, well tested, world renowned libraries like C++ STL, C++ Boost or something similar to that. An outstanding example would be the Windows kernel, where the use of C++ itself is considered not the most optimal way of doing things, in favor of plain old C to make things purely "bare metal" and have complete control and track of what exactly we are doing, to the very last bit of instruction generated by the compiler. This brings additional challenges, such as difficulty to maintain group of items, where we could have used an STL vector, map or list in user mode.

To manage this, traditionally, OS developers use simple arrays, lists and trees in OS kernel and among those, I cannot overemphasize the importance of a typical doubly linked list which is ubiquitous in Windows kernel. In this article, I am going to discuss LIST_ENTRY data structure which is used in Windows kernel and other native parts of the Windows user mode, specifically ntdll.dll to maintain a doubly linked list. While this article shows its examples in the context of Windows OS, please be informed that the exact same techniques are used in Linux, BSD as well, wherever applicable with different variable and macro names. As a matter of fact, the concepts which are discussed here, are running in your system, at the very moment, even when you are reading this article, irrespective of which OS you are using (Mac, Linux, Windows) or the version of that OS.

The code, which is available for download, demonstrates the usage of LIST_ENTRY and CONTAINING_RECORD in a simple scenario. This code is used for debugging with WinDbg to demonstrate dt (dump type) command to dump out the structure. Latter part of the article describes how to use the techniques specified in the former part for live Windows kernel debugging to dump out the list of all processes and then all the threads inside a specific process.

I suggest you take a look at the code first and keep it open while reading the article because I have tried explaining the concepts with reference to the source code. Also, readers of this article are expected to know C programming language.

Background

To understand LIST_ENTRY structure and CONTAINING_RECORD macro and their usage, a thorough understanding of pointers and structures is essential. The documentation for LIST_ENTRY and CONTAINING_RECORD are available at ListEntry and ContainingRecordMacro respectively. A headstart on WinDbg could also help with the debugging.

This article is based on the following 4 presentations, so feel free to refer to them to brush up your prerequisites:

Required Software Tools

The tools used are:

  • WinDbg, preferably latest - for debugging
  • VmPlayer - For running the target Virtual Machine which is being debugged
  • An OS - to be installed in the VmPlayer
WinDbg and VmPlayer are available for free download. Now I have used Windows 7 for VmPlayer, however you could use whatever you have got provided it's something over Windows 2000. Also, you don't have to stick with vmplayer. You can go with any virtualization infrastructure which supports kernel debugging in Windows.

Download Links for Software Tools

Make sure you download the latest version of VMPlayer.

Let's Have a Quick Look at the Code

As mentioned before, please open the attached code in your favorite editor. The code contains a sample structure, MYSTRUCT that used LIST_ENTRY as a part of it. The definition is similar to that in the header file wdm.h.

The user defined structure in the code is the following:

C++
//
// MYSTRUCT - structure
//
typedef struct newstruct
{
    int num1;
    int num2;
    char alpha;
    LIST_ENTRY list_entry;
    float exp;

}MYSTRUCT,*PMYSTRUCT;
...

The structure of LIST_ENTRY is as follows (explained later):

C++
//
// LIST_ENTRY structure 
//
typedef struct _LIST_ENTRY
{
    struct _LIST_ENTRY *Flink;
    struct _LIST_ENTRY *Blink;

}LIST_ENTRY,*PLIST_ENTRY;
... 

The CONTAINING_RECORD macro is defined as follows in the beginning of the code (explained later):

C++
//
// CONTAINING_RECORD macro
//Gets the value of structure member (field),given the type(MYSTRUCT, in this code) and the List_Entry head(temp, in this code)
//
#define CONTAINING_RECORD(address, type, field) (\
    (type *)((char*)(address) -(unsigned long)(&((type *)0)->field)))
    
...

We have just seen all the important players in the game. Now, there are just two operations defined in the code, namely

<a href="http://msdn.microsoft.com/en-us/library/windows/hardware/ff547799(v=vs.85).aspx">InsertHeadList</a>
and
<a href="http://msdn.microsoft.com/en-us/library/windows/hardware/ff547799(v=vs.85).aspx">InitializeListHead</a>
whose definition is similar to that in wdm.h. In the main function, I have just added two entries to the list. And later accessed a member(num1) of the structure MYSTRUCT by traversing LIST_ENTRY and calculating the starting address of MYSTRUCT using CONTAINING_RECORD. So the code is as simple as adding 2 nodes to a doubly linked list and reading them back.

LIST_ENTRY Memory Layout and Working (With Respect to the Source Code)

Now, let us try to understand the memory layout of the list and how it works. LIST_ENTRY normally appears in the middle of a structure. It contains a Flink and Blink pointing to forward and previous LIST_ENTRY structure which is a member of another structure (in the source code, it's MYSTRUCT).

LIST_ENTRY in MYSTRUCT

In the above figure, you can see list_entry (refer to MYSTRUCT declaration in the previous section) is of type "LIST_ENTRY" and it appears in the middle of the structure MYSTRUCT. Also, note how each object of the structure is linked. Not using a pointer to MYSTRUCT, but via a pointer to LIST_ENTRY. Now this also means, you cannot directly access the member of the structure because your object is pointing to the start of LIST_ENTRY structure and not MYSTRUCT. So you have to calculate the start address of MYSTRUCT from the address of list_entry (which is the one you have at hand). For this purpose, the macro CONTAINING_RECORD is used.

For example, if you wish to access num1 of MYSTRUCT, you cannot access it directly, but rather calculate the offset first, get the start address of MYSTRUCT and then access the element. And that is exactly what the source code does.

CONTAINING_RECORD Working (With Respect to the Source Code)

Now let's have a look at CONTAINING_RECORD. CONTAINING_RECORD declaration might seem complicated at first look. However the working is pretty simple. Let's take a look at the arguments it takes.

  1. address: This is nothing but a pointer to a field (here,list_entry) in an instance of a structure of type Type (here, MYSTRUCT).
  2. type: The name of the type of the structure (in the source code, it is MYSTRUCT whose base address is to be returned, from which direct access to structure members can be made.
  3. field: The name of the field (in this source code, it is list_entry) pointed to by address (first argument of the macro) and which is contained in a structure of type Type (MYSTRUCT)

Still complicated? Let's dig deeper. Let us take a look at the definition again.

C++
//
// CONTAINING_RECORD macro
//Gets the value of structure member (field - num1),given the type(MYSTRUCT, in this code) and the List_Entry head(temp, in this code)
//
#define CONTAINING_RECORD(address, type, field) (\
    (type *)((char*)(address) -(unsigned long)(&((type *)0)->field)))
    
...

Let's take a closer look at this part of the macro definition

(&((type *)0)->field)  

It seems, we are trying to access a field from null pointer. That is, the above statement itself will cause a problem. In fact, the whole expression is not completely inline with the C standard although it works in almost all known compliers. Let us try to have a logical understanding of what is going on. Now, as you can see, the null pointer is typecasted to a pointer to type (in this case, MYSTRUCT). Trying to refer to a field wouldn't generate any access violation error and in fact the '& 'in front of it is doing that trick, because we aren't exactly accessing that memory location, but just address of it. So it would return the offset of the field that we are referring to (in the source code, it is list_entry). So in another words, it gives address of field element in the type when the type's object address is null. Now that we know the offset of LIST_ENTRY structure inside MYSTRUCT, we just need to subtract this offset from the address (first argument of the CONTAINING_RECORD), to get the starting address of the containing structure (in the source code, containing structure is MYSTRUCT).

Thus, CONTAINING_RECORD(temp,MYSTRUCT,list_entry)->num1 in the source code would refer to num1 of structure MYSTRUCT.

Explanation of while loop in the source code that uses CONTAINING_RECORD

C++
//
//
PLIST_ENTRY temp=0;
temp=MyListHead;
while(MyListHead!=temp->Flink)
  {
    temp=temp->Flink;
    int num1=CONTAINING_RECORD(temp,MYSTRUCT,list_entry)->num1;
    printf("num1=%d\n",num1);
  }

The function of the loop above, as it appears in the source code, is to print each value of the member num1 in the structure MYSTRUCT. The traversal of the list is performed using list_entry of MYSTRUCT. temp is an element of type LIST_ENTRY which is initialized to MyListHead (which points to the first list_entry within the structure MYSTRUCT, in a list of elements of type MYSTRUCT). The condition in the loop checks whether the end of list has been reached. temp=temp->Flink is used to traverse to the next element in the list. However, note that Flink will be pointing to list_entry within the MYSTRUCT and not the start of the structure. But we have to display the value of num1. For the aforementioned purpose, the next line in the source code (int num1=CONTAINING_RECORD(temp,MYSTRUCT,list_entry)->num1; ) is used. The working of COTNAINING_RECORD was explained earlier. Please do refer to it to see how the macro works.

Operations on LIST_ENTRY

Though there are a number of operations on LIST_ENTRY, the source code covers just two functions, the rest of the functions are available in wdm.h for your reference. In this section, let us try to understand how LIST_ENTRY operations works and its polymorphic significance,

  • Initializing the List Head ( InitializeListHead(PLIST_ENTRY Listhead))

    The definition of InitializeListHead(PLIST_ENTRY Listhead) is as follows:

    C++
    //
    //
    void InitializeListHead( PLIST_ENTRY ListHead)
     {
       ListHead->Flink = ListHead->Blink = ListHead;
     }

    In the above function, since it's a doubly linked list, we're initializing the Flink and Blink. Also Both Flink and Blink points to ListHead. That is, this is a circular doubly linked list.

  • Inserting New Node ( InsertHeadList( PLIST_ENTRY ListHead, PLIST_ENTRY Entry))

    The definition of InsertHeadList( PLIST_ENTRY ListHead, PLIST_ENTRY Entry) is as follows:

    C++
    //
    //
    void InsertHeadList( PLIST_ENTRY ListHead, PLIST_ENTRY Entry)
    {
        PLIST_ENTRY Flink;
    
        Flink = ListHead->Flink;
        Entry->Flink = Flink;
        Entry->Blink = ListHead;
        Flink->Blink = Entry;
        ListHead->Flink = Entry;
    }

    The function takes two arguments, one of which is the listhead (MyListhead in the source code) and the other is the list_entry (which is a member of the structure MYSTRUCT). The function adds the new element to the front of the head list, pushing behind the existing element behind, otherwise the code is self explanatory.

The Polymorphic Nature of LIST_ENTRY and Related Functions

An interesting point to note is that the above mentioned functions take argument of type pointer to LIST_ENTRY and NOT a pointer to MYSTRUCT. This means that these functions are independent of the type of structure with which it is used, because the definition is independent of that structure type (In the source code, it is MYSTRUCT). This polymorphic nature of these constructs allows us to use this with any TYPE, and not worry about their layout at runtime. From a programming standpoint, this is as good as C++ templates if not better, without any code duplication like in the case of C++ templates.

Using WinDbg dt Command to Dump Out MYSTRUCT Structure using LIST_ENTRY

Now let us have a look into the debugger and do some practice with the code. As you can see below, Debugger commands are equipped to deal with LIST_ENTRY. Please follow the below instructions to do this exercise. This is needed for the coming section of this article.

Open the given source file (Convert.cpp) in WinDbg. Also open the executable (Convert.exe) from the project folder in WinDbg. Set a breakpoint in line 89, or where the while loop begins in the source code. Now you can dump out the structure with the following commands:

  1. Type the command x Convert!MyListHead in the debugger.
  2. The address of MyListHead will be dumped out on the left.
  3. Type the command dc address ( any d* of your choice should work) (The address from step 2) in the debugger. Do another dc address (The first address on the left of previous dc output).
  4. The first address is the address of the LIST_ENTRY structure in MYSTRUCT.
  5. The offset of LIST_ENTRY structure (list_entry field) in MYSTRUCT have to be subtracted from the address in the previous step to get the starting address of MYSTRUCT.
  6. The offset can be obtained by dumping out MYSTRUCT with the command dt Convert!MYSTRUCT
  7. Subtract the offset of list_entry from the address in step 4 with the command ?addr-offset
  8. Now use the obtained address with the following command to recursively list the entries of MYSTRUCT
    • Syntax: dt type -l field.Flink address
    • Command: dt Convert!MYSTRUCT -l list_entry.Flink addr from previous step

If you're finding difficulties in the steps, please refer to the images below. It is the sequence of commands and equivalent output from WinDbg for the attached demo project's executable.

LIST_ENTRY in MYSTRUCT

LIST_ENTRY in MYSTRUCT

Now we are ready to look at some real production OS scenarios, in which LIST_ENTRY is used.

Using WinDbg dt Command to Dump Out All Processes in a Live Kernel (like !process output)

Follow the steps to create a named pipe and use it to attach the virtual machine to debugger for live kernel debugging of the OS within the virtual machine:

  1. Download and install VmPlayer (See to "Download Links for Software Tools" for download link)
  2. Install a target OS in VMPlayer. I have installed Windows 7.
  3. Add a virtual serial port in VM player.
    • Open VmPlayer
    • Select the OS installed on the left pane. It's properties are displayed on the right pane.
    • Click Edit virtual machine settings
    • In the opened dialog box, under Hardwaretab, click "Add"
    • Select the hardware type as "Serial Port" and click Next
    • Select "Output to Named Pipe" in the next window and click Next
    • Enter a name for the pipe. (I have given "\\.\pipe\NamedPipe" as the name.) Make sure you have it copied somewhere. You need to remember that. Click Finish.
    • Select "Serial Port" under the "Hardware" pane. Its properties appear on the right pane.
    • Enable Use named pipe radio button. Enter the name as the one you have given earlier. Click Ok.
  4. Now start your virtual OS. (Click on "Play Virtual Machine".)
  5. Open the "Run" command prompt. Type msconfig
  6. "System Configuration" dialog box opens.
  7. Under the "Boot" tab, Click on "Advanced Options" and enable Debug checkbox. Click Ok.
  8. Open the WinDbg in your host machine.
  9. Select File->Kernel Debug
  10. In "Kernel Debugging" window that opens, under "COM" tab, enable Pipe and Reconnect checkbox. Enter the pipename you have created in the virtual machine (for me, it was \\.\pipe\NamedPipe ). Click ok.
  11. Now the Kernel Debugger waits for the pipe to reconnect. Restart your virtual machine. And the target OS (the one in the virtual machine, the debugee) will be connected to the debugger.
  12. Once connected, and when you see the message "KDTARGET : Refreshing KD connection " on command window, hit break after which commands can be executed.

Live debugging to list all processes

Now that the target OS is connected to the debugger, we can dump out all the current running processes in the target OS (inside the virtual machine) using dt command. The steps are similar to how we dumped out MYSTRUCT earlier. Instead of Convert!MyListHead, we're going to use nt!PsActiveProcessHead. PsActiveProcessHead is a pointer to the start of the kernel's list of _EPROCESS structures. _EPROCESS structure has a member named ActiveProcessLinks which is of type LIST_ENTRY. Using this member (ActiveProcessLinks), we can dump out a list of all active processes in the kernel using dt command.

  1. Type the command x nt!PsActiveProcessHead in the debugger.
  2. The address of ActiveProcessHead will be dumped out on the left.
  3. Type the command dc address (The address from step 2) in the debugger to see the contents of ActiveProcessHead. The first address is pointing into a structure which is of type _EPROCESS
  4. The offset of ActiveProcessLinks in _EPROCESS have to be subtracted from the address in the previous step to get the starting address of _EPROCESS.
  5. The offset can be obtained by dumping out _EPROCESS with the command dt nt!_EPROCESS
  6. Subtract the offset of ActiveProcessLinks from the address in step 3 with the command ?addr-offset
  7. Now use the obtained address with the following command to recursively list the entries of _EPROCESS
    • Syntax: dt type -l field.Flink address
    • Command:
      dt nt!_EPROCESS -l ActiveProcessLinks.Flink addr from previous step 
  8. To dump out specific fields inside the _EPROCESS structure, say for example the ones starting with "Thread", use this command..
    dt nt!_EPROCESS -l ActiveProcessLinks.Flink  -y Thread addr 
    .

The following snapshots would make it easier to follow.

PROCESS

Now that we got all the processes running in a Windows OS. Now let's try dump all the threads inside one of these processes.

Using WinDbg dt Command to Dump Out All Threads in a Process in a Live Kernel

To dump out threads from a particular process is pretty straight forward and similar to dumping out all the active processes. _EPROCESS has a member named ThreadListHead which will be pointing into a structure list called _ETHREAD. ThreadListHead is of type LIST_ENTRY whose Flink points to ThreadListEntry inside the _ETHREAD structure. Once the explained idea is clear, the process should be easy.

  1. For demo purposes, assume you have one of the addresses of active process (with handle count>0) copied.
  2. Dump out the _EPROCESS structure at that address using dt command to find the offset of ThreadListHead
  3. Copy the first address within the parenthesis of ThreadListHead entry in the dumped out structure. This will be pointing into _ETHREAD
  4. The offset of ThreadListEntry in _ETHREAD have to be subtracted from the address in the previous step to get the starting address of _ETHREAD
  5. The offset can be obtained by dumping out _ETHREAD with the command dt nt!_ETHREAD
  6. Subtract the offset of ThreadListEntry from the address in step 3 with the command ?addr-offset
  7. Now use the obtained address with the following command to recursively dump out all active threads
    • Syntax : dt type -l field.Flink address
    • Command : dt nt!_ETHREAD -l ThreadListEntry.Flink addr from previous step

The following snapshots would make it easier to follow.

THREAD

THREAD

Key Points of Interest

There are few interesting points that need to be noted as these would definitely make things go wrong if not understood properly.

  • CONTAINING_RECORD takes the address of a structure member and returns the starting address of the structure.
  • dt command takes the start address of the structure to dump out its contents not the address of a structure member as in the case of CONTAINING_RECORD.
  • The dump of process can be done with !process command and thread with
    
    !thread 
    command both commands internally use the same technique which we have just discussed. You can use those commands to verify your output.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)