Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles
(untagged)

Debug Tutorial Part 3: The Heap

0.00/5 (No votes)
20 Mar 2004 1  
Introduction to the heap.

Introduction

In the previous installment of this debug series, we learned about the stack. The stack is a temporary storage location for local variables, parameters, return addresses, and just about anything the compiler wants to use it for. In this installment of the debug series, we will learn about the heap in usermode.

What's the heap?

The heap is just a block of memory within the process space that can be used to provide memory to an application when requested. The system APIs generally allocate a block of memory and serve it to the application as requested. The APIs add a header to each requested memory location that helps the system mark it as in-use and its size. This helps the system when it frees the memory. (Global variables are also on the heap).

The methods I am talking about above are completely in usermode. This block of memory is already allocated into your process space for your application to use. This chunk of memory is allocated when the specific library starts, for example in the DllMain() of MSVCRT.DLL. The code that handles malloc() generally works of this already allocated chunk of memory.

Memory heaps can be allocated into your process using functions such as HeapCreate. This function allocates a heap segment and returns you the segment number. Your application can then pass this segment into HeapAlloc to allocate memory from this heap. This works the same way as I described malloc as working, the Heap* functions perform all management of memory on the segment you allocated. This is actually what malloc() uses internally. A segment is allocated specifically for malloc() function calls.

There is another function termed VirtualAlloc. This function allocates memory on a larger scale for application use and commits the pages to memory. This function does not use a pre-allocated heap such as HeapAlloc does (the one you supply from HeapCreate) and you can even specify the location of the memory, but it's not required. This is a more advanced method of allocation and generally is not used nor needed in most application programs.

This is one of the main reasons you cannot free memory with a different function pair than you allocated (i.e., allocate with LocalAlloc() and free with C's free()). This is because they could have been allocated using different methods or different underlying heaps. The underlying code is generally stupid and usually attempts to free it anyway causing corruption in memory. This is also why any module that allocates memory should always free it. What if you allocate memory in one DLL and free in another? It may work some of the time, but maybe, one day you throw a debug DLL to replace one of them. I guarantee you will trap, as the debug heap allocations will contain different information than the release allocations! That's a bad habit! Allocate and free in the same module using the matching function pair!

Allocated Memory is not zero?

I mentioned this in the last tutorial that you may see values on the stack that don't make sense such as return values to function calls that currently haven't happened and etc. The stack is generally initialized to zero, however, during its use it becomes dirty. You know that local variables aren't always initialized, so if you make a stack call, those values aren't reset to zero when the stack moves up. If you pop a value off the stack, the stack may decrement but the values stay unless they are physically cleaned up. Sometimes the stack optimizes things and doesn't clean up variables as well. So, seeing "ghost" values on the stack is very common. The same thing can occur with the heap.

Freeing a location on the heap does not clear its value to zero unless you have done so before you freed the memory. So, it's very common that after allocating memory, your pointer will point to "garbage". This is generally why you usually initialize memory to zero after you've allocated it.

It may seem funny to "zero memory" before freeing it, however, if your application contains sensitive information such as passwords, or the like, it's probably best to zero out even stack variables before returning. This helps erase the sensitive information after their use. You wouldn't want your program to trap, and right there on the stack or on the heap is the user's password!

Heap Problems

In this article, I will go over the two most common problems with the heap. These are "Memory Leaks" and "Heap Corruption".

Memory Leaks

If you've ever noticed that over time you eventually start to see your program's memory foot print grow, using Task Manager, you may realize you have a memory leak. A memory leak basically means that you are allocating memory and not freeing it. The "leak" really occurs when the program "forgets" or drops the allocated memory into space never to reference it or even remember it's there all together. This is when we start the actual "leak". The memory footprint of a process can grow without leaking, the program may just require a lot of memory. So, where do we begin?

The first thing I would do is check task manager. If it's a fast leak, the memory may go up very fast. If it's a slow leak, the memory may go up gradually over time. First, you need to know how to read task manager.

  1. Virtual Memory - This field is how much memory is unique to this process. If the process was to be paged out to disk, this is how much it would take up in the paging file.
  2. Memory Usage - This is the "working set" or how much memory is in physical memory. Some people get confused when they see this number bigger than the Virtual Memory size. This is because this number does not just include memory "unique" to this process but also includes memory that is shared by other processes. For example, a lot of processes use kernel32.dll and to duplicate the same code for every process would be wasteful.

So, once you have determined the leak, you need to find the problem. Heap problems are some of the hardest problems to track down. One thing that can help you is that, if it is a leak, most likely they came from the same source. Even if there are multiple sources, they all will leak a lot of data. This means that usually the case is that the stored data will all be similar.

If the stored data is similar, you just need to find any memory that has been allocated in large amounts and examine it. To find the data, here are some tips:

  1. Generally, memory allocated from the same place are "usually" the same size. This is true for fixed structures, so you need to find allocations that are many and of similar size. This does not always have to be the case, for example, dynamically allocated strings based on a varying size.
  2. The memory allocated from the same place generally will contain the same or similar information or type of information. This means once you find memory, examine it. Compare different memory locations and see if they are similar. If you know data structures, etc., you could verify the size and match the data in the memory to attempt to come up with an educated "guess" as to what the memory actually is. This can then help narrow down where to look for such memory allocations in your code.

I have written a small program that causes a memory leak. This program is very simple and we will go through a simple scenario. This will help to get you acquainted with the heap.

Now, when we see a memory leak in Windows in your program, it doesn't always mean it's from your application. If you use a third party library or DLL, perhaps they're leaking the memory. Perhaps if you see a memory leak, but you're not directly "allocating" memory, then something you are doing could be indirectly allocating memory. For example, say you want to open a registry key and you use RegOpenKey. Do you know how that's implemented? No. So, if you leaked the "handle", could there be some memory associated with it? There could be. "Handle" leaks are another problem I will deal with in a later tutorial.

Now, I'm not saying that leaking a registry key would cause memory to grow and I'm not saying that leaking memory in another module would be shown by a handle leak. I'm just stating that manipulating and interacting with other modules could indirectly cause leaks that you are still responsible for. If the library you are using specifies you must free or call some destroy function, you should follow the instructions!

So, we have ran the program and we noticed the memory has gone up. Let's see if we can find out where it is. First, we find the PID of the process and then we can use "CDB -P <PID>". You could also use WinDbg's GUI and select the process from a list. The first thing we do once we are broken into the program is "!heap" to display the heaps of the process.

0:000> !heap
NtGlobalFlag enables following debugging aids for new heaps:    tail checking
    disable coalescing of free blocks
Index   Address  Name      Debugging options enabled
  1:   00140000                 tail checking free checking validate parameters
  2:   00240000                 tail checking free checking validate parameters
  3:   00250000                 tail checking free checking validate parameters
  4:   00320000                 tail checking free checking validate parameters
0:000>

The next thing we will do is examine all the heaps and find out which ones are hording the most memory.

0:000> !heap 00140000
Index   Address  Name      Debugging options enabled
  1:   00140000
    Segment at 00140000 to 00240000 (00100000 bytes committed)
    Segment at 00510000 to 00610000 (00100000 bytes committed)
    Segment at 00610000 to 00810000 (00051000 bytes committed)
  2:   00240000
  3:   00250000
  4:   00320000
0:000> !heap 00240000
Index   Address  Name      Debugging options enabled
  1:   00140000
  2:   00240000
    Segment at 00240000 to 00250000 (00006000 bytes committed)
  3:   00250000
  4:   00320000
0:000> !heap 00250000
Index   Address  Name      Debugging options enabled
  1:   00140000
  2:   00240000
  3:   00250000
    Segment at 00250000 to 00260000 (00001000 bytes committed)
  4:   00320000
0:000> !heap 00320000
Index   Address  Name      Debugging options enabled
  1:   00140000
  2:   00240000
  3:   00250000
  4:   00320000
    Segment at 00320000 to 00330000 (00010000 bytes committed)
    Segment at 00410000 to 00510000 (000ee000 bytes committed)
0:000>

In bold, we have the segments that have the most memory committed. We will start with the first heap.

0:000> !heap 00140000 -a
Index   Address  Name      Debugging options enabled
  1:   00140000
    Segment at 00140000 to 00240000 (00100000 bytes committed)
    Segment at 00510000 to 00610000 (00100000 bytes committed)
    Segment at 00610000 to 00810000 (00051000 bytes committed)
    Flags:                50000062
    ForceFlags:           40000060
    Granularity:          8 bytes
    Segment Reserve:      00400000
    Segment Commit:       00002000
    DeCommit Block Thres: 00000200
    DeCommit Total Thres: 00002000
    Total Free Size:      00000226
    Max. Allocation Size: 7ffdefff
    Lock Variable at:     00140608
    Next TagIndex:        0000
    Maximum TagIndex:     0000
    Tag Entries:          00000000
    PsuedoTag Entries:    00000000
    Virtual Alloc List:   00140050
    UCR FreeList:        00140598
    FreeList Usage:      00040000 00400000 00000000 00000000
    FreeList[ 00 ] at 00140178: 00660118 . 00660118
    Unable to read nt!_HEAP_FREE_ENTRY structure at 00660118
    FreeList[ 12 ] at 00140208: 0023ff78 . 0023ff78
    Unable to read nt!_HEAP_FREE_ENTRY structure at 0023ff78
    FreeList[ 36 ] at 00140328: 0060fe58 . 0060fe58
    Unable to read nt!_HEAP_FREE_ENTRY structure at 0060fe58
    Segment00 at 00140640:
        Flags:           00000000
        Base:            00140000
        First Entry:     00140680
        Last Entry:      00240000
        Total Pages:     00000100
        Total UnCommit:  00000000
        Largest UnCommit:00000000
        UnCommitted Ranges: (0)

    Heap entries for Segment00 in Heap 00140000
        00140000: 00000 . 00640 [01] - busy (640)
        00140640: 00640 . 00040 [01] - busy (40)
        00140680: 00040 . 01818 [07] - busy (1800), 
            tail fill - unable to read heap entry extra at 00141e90
        00141e98: 01818 . 00040 [07] - busy (22), 
            tail fill - unable to read heap entry extra at 00141ed0
        00141ed8: 00040 . 00020 [07] - busy (5), 
            tail fill - unable to read heap entry extra at 00141ef0
        00141ef8: 00020 . 002f0 [07] - busy (2d8), 
            tail fill - unable to read heap entry extra at 001421e0
        001421e8: 002f0 . 00330 [07] - busy (314), 
            tail fill - unable to read heap entry extra at 00142510
        00142518: 00330 . 00330 [07] - busy (314), 
            tail fill - unable to read heap entry extra at 00142840
        00142848: 00330 . 00040 [07] - busy (24), 
            tail fill - unable to read heap entry extra at 00142880
        00142888: 00040 . 00040 [07] - busy (24), 
            tail fill - unable to read heap entry extra at 001428c0
        001428c8: 00040 . 00028 [07] - busy (10), 
            tail fill - unable to read heap entry extra at 001428e8
        001428f0: 00028 . 00058 [07] - busy (40), 
            tail fill - unable to read heap entry extra at 00142940
        00142948: 00058 . 00058 [07] - busy (40), 
            tail fill - unable to read heap entry extra at 00142998
        001429a0: 00058 . 00060 [07] - busy (44), 
            tail fill - unable to read heap entry extra at 001429f8
        00142a00: 00060 . 00020 [07] - busy (1), 
            tail fill - unable to read heap entry extra at 00142a18
        00142a20: 00020 . 00028 [07] - busy (10), 
            tail fill - unable to read heap entry extra at 00142a40
        00142a48: 00028 . 00050 [07] - busy (36), 
            tail fill - unable to read heap entry extra at 00142a90
        00142a98: 00050 . 00210 [07] - busy (1f4), 
            tail fill - unable to read heap entry extra at 00142ca0
        00142ca8: 00210 . 00210 [07] - busy (1f4), 
            tail fill - unable to read heap entry extra at 00142eb0
        00142eb8: 00210 . 00210 [07] - busy (1f4), 
            tail fill - unable to read heap entry extra at 001430c0
        001430c8: 00210 . 00210 [07] - busy (1f4), 
            tail fill - unable to read heap entry extra at 001432d0
        001432d8: 00210 . 00210 [07] - busy (1f4), 
            tail fill - unable to read heap entry extra at 001434e0
        001434e8: 00210 . 00210 [07] - busy (1f4), 
            tail fill - unable to read heap entry extra at 001436f0
        001436f8: 00210 . 00210 [07] - busy (1f4), 
            tail fill - unable to read heap entry extra at 00143900
        00143908: 00210 . 00210 [07] - busy (1f4), 
            tail fill - unable to read heap entry extra at 00143b10
        00143b18: 00210 . 00210 [07] - busy (1f4), 
            tail fill - unable to read heap entry extra at 00143d20
        00143d28: 00210 . 00210 [07] - busy (1f4), 
            tail fill - unable to read heap entry extra at 00143f30
        00143f38: 00210 . 00210 [07] - busy (1f4), 
            tail fill - unable to read heap entry extra at 00144140
        00144148: 00210 . 00210 [07] - busy (1f4), 
            tail fill - unable to read heap entry extra at 00144350
        00144358: 00210 . 00210 [07] - busy (1f4), 
            tail fill - unable to read heap entry extra at 00144560
        00144568: 00210 . 00210 [07] - busy (1f4), 
            tail fill - unable to read heap entry extra at 00144770
        00144778: 00210 . 00210 [07] - busy (1f4), 
            tail fill - unable to read heap entry extra at 00144980
        00144988: 00210 . 00210 [07] - busy (1f4), 
            tail fill - unable to read heap entry extra at 00144b90
        00144b98: 00210 . 00210 [07] - busy (1f4), 
            tail fill - unable to read heap entry extra at 00144da0
        00144da8: 00210 . 00210 [07] - busy (1f4), 
            tail fill - unable to read heap entry extra at 00144fb0
        00144fb8: 00210 . 00210 [07] - busy (1f4), 
            tail fill - unable to read heap entry extra at 001451c0
        001451c8: 00210 . 00210 [07] - busy (1f4), 
            tail fill - unable to read heap entry extra at 001453d0
        001453d8: 00210 . 00210 [07] - busy (1f4), 
            tail fill - unable to read heap entry extra at 001455e0
        001455e8: 00210 . 00210 [07] - busy (1f4), 
            tail fill - unable to read heap entry extra at 001457f0
        001457f8: 00210 . 00210 [07] - busy (1f4), 
            tail fill - unable to read heap entry extra at 00145a00

In order to shorten the list, I hit "Control Break" a few times to get out of the listing. I notice that after a while, the memory blocks are the same size and they're larger sizes. The listing goes like this:

<ADDRESS>: <Current Size> . <PREVIOUS Size>

If I dump one of the addresses, this is what I see:

0:000> dd 001457f8
001457f8  00420042 001c0700 00006968 00000000
00145808  00000000 00000000 00000000 00000000
00145818  00000000 00000000 00000000 00000000
00145828  00000000 00000000 00000000 00000000
00145838  00000000 00000000 00000000 00000000
00145848  00000000 00000000 00000000 00000000
00145858  00000000 00000000 00000000 00000000
00145868  00000000 00000000 00000000 00000000

The first DWORD is the size, but it's split into two sections. The low word is the current size and the high word is the previous size. In order to get the true size, you must shift the value left by 3. All memory is being allocated in a granularity of 8. 42<<3 = 210 or 528. The second DWORD is the flags and the rest is the memory allocated to the program.

0:000> dc 001457f8
001457f8  00420042 001c0700 00006968 00000000  B.B.....hi......
00145808  00000000 00000000 00000000 00000000  ................
00145818  00000000 00000000 00000000 00000000  ................
00145828  00000000 00000000 00000000 00000000  ................
00145838  00000000 00000000 00000000 00000000  ................
00145848  00000000 00000000 00000000 00000000  ................
00145858  00000000 00000000 00000000 00000000  ................
00145868  00000000 00000000 00000000 00000000  ................
0:000>

If I do "DC" to dump the characters, I notice the allocated memory contains the word "hi". If I continue dumping the memory, I continue to see "hi" in the memory allocated at 528 bytes. This is now starting to become a pattern. Let's move onto the next heap.

0:000> !heap
NtGlobalFlag enables following debugging aids for new heaps:    tail checking
    disable coalescing of free blocks
Index   Address  Name      Debugging options enabled
  1:   00140000                 tail checking free checking validate parameters
  2:   00240000                 tail checking free checking validate parameters
  3:   00250000                 tail checking free checking validate parameters
  4:   00320000                 tail checking free checking validate parameters
0:000> !heap 00320000 -a
Index   Address  Name      Debugging options enabled
  1:   00140000
  2:   00240000
  3:   00250000
  4:   00320000
    Segment at 00320000 to 00330000 (00010000 bytes committed)
    Segment at 00410000 to 00510000 (000ee000 bytes committed)
    Flags:                50001062
    ForceFlags:           40000060
    Granularity:          8 bytes
    Segment Reserve:      00200000
    Segment Commit:       00002000
    DeCommit Block Thres: 00000200
    DeCommit Total Thres: 00002000
    Total Free Size:      000000b3
    Max. Allocation Size: 7ffdefff
    Lock Variable at:     00320608
    Next TagIndex:        0000
    Maximum TagIndex:     0000
    Tag Entries:          00000000
    PsuedoTag Entries:    00000000
    Virtual Alloc List:   00320050
    UCR FreeList:        00320598
    FreeList Usage:      00000800 00000000 00000000 00000000
    FreeList[ 00 ] at 00320178: 004fdac8 . 004fdac8
    Unable to read nt!_HEAP_FREE_ENTRY structure at 004fdac8
    FreeList[ 0b ] at 003201d0: 0032ffb0 . 0032ffb0
    Unable to read nt!_HEAP_FREE_ENTRY structure at 0032ffb0
    Segment00 at 00320640:
        Flags:           00000000
        Base:            00320000
        First Entry:     00320680
        Last Entry:      00330000
        Total Pages:     00000010
        Total UnCommit:  00000000
        Largest UnCommit:00000000
        UnCommitted Ranges: (0)

    Heap entries for Segment00 in Heap 00320000
        00320000: 00000 . 00640 [01] - busy (640)
        00320640: 00640 . 00040 [01] - busy (40)
        00320680: 00040 . 01818 [07] - busy (1800), 
            tail fill - unable to read heap entry extra at 00321e90
        00321e98: 01818 . 000a0 [07] - busy (88), 
            tail fill - unable to read heap entry extra at 00321f30
        00321f38: 000a0 . 00498 [07] - busy (480), 
            tail fill - unable to read heap entry extra at 003223c8
        003223d0: 00498 . 00098 [07] - busy (80), 
            tail fill - unable to read heap entry extra at 00322460
        00322468: 00098 . 00028 [07] - busy (d), 
            tail fill - unable to read heap entry extra at 00322488
        00322490: 00028 . 000e0 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 00322568
        00322570: 000e0 . 000e0 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 00322648
        00322650: 000e0 . 000e0 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 00322728
        00322730: 000e0 . 000e0 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 00322808
        00322810: 000e0 . 000e0 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 003228e8
        003228f0: 000e0 . 000e0 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 003229c8
        003229d0: 000e0 . 000e8 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 00322ab0
        00322ab8: 000e8 . 00238 [07] - busy (220), 
            tail fill - unable to read heap entry extra at 00322ce8
        00322cf0: 00238 . 000a8 [07] - busy (90), 
            tail fill - unable to read heap entry extra at 00322d90
        00322d98: 000a8 . 00058 [07] - busy (3e), 
            tail fill - unable to read heap entry extra at 00322de8
        00322df0: 00058 . 00060 [07] - busy (41), 
            tail fill - unable to read heap entry extra at 00322e48
        00322e50: 00060 . 00050 [07] - busy (31), 
            tail fill - unable to read heap entry extra at 00322e98
        00322ea0: 00050 . 00038 [07] - busy (1b), 
            tail fill - unable to read heap entry extra at 00322ed0
        00322ed8: 00038 . 00040 [07] - busy (26), 
            tail fill - unable to read heap entry extra at 00322f10
        00322f18: 00040 . 00030 [07] - busy (11), 
            tail fill - unable to read heap entry extra at 00322f40
        00322f48: 00030 . 00030 [07] - busy (17), 
            tail fill - unable to read heap entry extra at 00322f70
        00322f78: 00030 . 00028 [07] - busy (d), 
            tail fill - unable to read heap entry extra at 00322f98
        00322fa0: 00028 . 00048 [07] - busy (2f), 
            tail fill - unable to read heap entry extra at 00322fe0
        00322fe8: 00048 . 000d0 [07] - busy (b1), 
            tail fill - unable to read heap entry extra at 003230b0
        003230b8: 000d0 . 00080 [07] - busy (61), 
            tail fill - unable to read heap entry extra at 00323130
        00323138: 00080 . 00038 [07] - busy (1c), 
            tail fill - unable to read heap entry extra at 00323168
        00323170: 00038 . 00048 [07] - busy (2d), 
            tail fill - unable to read heap entry extra at 003231b0
        003231b8: 00048 . 00040 [07] - busy (22), 
            tail fill - unable to read heap entry extra at 003231f0
        003231f8: 00040 . 00030 [07] - busy (17), 
            tail fill - unable to read heap entry extra at 00323220
        00323228: 00030 . 00028 [07] - busy (e), 
            tail fill - unable to read heap entry extra at 00323248       
        00323250: 00028 . 00168 [07] - busy (149), 
            tail fill - unable to read heap entry extra at 003233b0
        003233b8: 00168 . 00058 [07] - busy (39), 
            tail fill - unable to read heap entry extra at 00323408
        00323410: 00058 . 00038 [07] - busy (1b), 
            tail fill - unable to read heap entry extra at 00323440
        00323448: 00038 . 00060 [07] - busy (43), 
            tail fill - unable to read heap entry extra at 003234a0
        003234a8: 00060 . 00030 [07] - busy (12), 
            tail fill - unable to read heap entry extra at 003234d0
        003234d8: 00030 . 00030 [07] - busy (18), 
            tail fill - unable to read heap entry extra at 00323500
        00323508: 00030 . 00038 [07] - busy (1e), 
            tail fill - unable to read heap entry extra at 00323538
        00323540: 00038 . 00028 [07] - busy (c), 
            tail fill - unable to read heap entry extra at 00323560
        00323568: 00028 . 00030 [07] - busy (14), 
            tail fill - unable to read heap entry extra at 00323590
        00323598: 00030 . 00028 [07] - busy (f), 
            tail fill - unable to read heap entry extra at 003235b8
        003235c0: 00028 . 00030 [07] - busy (18), 
            tail fill - unable to read heap entry extra at 003235e8
        003235f0: 00030 . 00040 [07] - busy (28), 
            tail fill - unable to read heap entry extra at 00323628
        00323630: 00040 . 00040 [07] - busy (27), 
            tail fill - unable to read heap entry extra at 00323668
        00323670: 00040 . 00038 [07] - busy (19), 
            tail fill - unable to read heap entry extra at 003236a0
        003236a8: 00038 . 00030 [07] - busy (17), 
            tail fill - unable to read heap entry extra at 003236d0
        003236d8: 00030 . 00050 [07] - busy (34), 
            tail fill - unable to read heap entry extra at 00323720
        00323728: 00050 . 00030 [07] - busy (11), 
            tail fill - unable to read heap entry extra at 00323750
        00323758: 00030 . 00030 [07] - busy (14), 
            tail fill - unable to read heap entry extra at 00323780
        00323788: 00030 . 00068 [07] - busy (4a), 
            tail fill - unable to read heap entry extra at 003237e8
        003237f0: 00068 . 00818 [07] - busy (800), 
            tail fill - unable to read heap entry extra at 00324000
        00324008: 00818 . 000e0 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 003240e0
        003240e8: 000e0 . 000e0 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 003241c0
        003241c8: 000e0 . 000e0 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 003242a0
        003242a8: 000e0 . 000e0 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 00324380
        00324388: 000e0 . 000e0 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 00324460
        00324468: 000e0 . 000e0 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 00324540
        00324548: 000e0 . 000e0 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 00324620
        00324628: 000e0 . 000e0 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 00324700
        00324708: 000e0 . 000e0 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 003247e0
        003247e8: 000e0 . 000e0 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 003248c0
        003248c8: 000e0 . 000e0 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 003249a0
        003249a8: 000e0 . 000e0 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 00324a80
        00324a88: 000e0 . 000e0 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 00324b60
        00324b68: 000e0 . 000e0 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 00324c40
        00324c48: 000e0 . 000e0 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 00324d20
        00324d28: 000e0 . 000e0 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 00324e00
        00324e08: 000e0 . 000e0 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 00324ee0
        00324ee8: 000e0 . 000e0 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 00324fc0
        00324fc8: 000e0 . 000e0 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 003250a0
        003250a8: 000e0 . 000e0 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 00325180
        00325188: 000e0 . 000e0 [07] - busy (c8), 
            tail fill - unable to read heap entry extra at 00325260

In this heap, I see a lot of <address>: e0 . e0 allocations. This could be my leak. If I take a look at them, I notice they are the same:

0:000> dc 00325188
00325188  001c001c 00180700 66647361 61667361  ........asdfasfa
00325198  73666473 73666461 73666461 61666164  sdfsadfsadfsdafa
003251a8  61736673 73616664 61736664 73616664  sfsadfasdfsadfas
003251b8  66736166 73666473 66736661 66647361  fasfsdfsafsfasdf
003251c8  00617364 baadf00d baadf00d baadf00d  dsa.............
003251d8  baadf00d baadf00d baadf00d baadf00d  ................
003251e8  baadf00d baadf00d baadf00d baadf00d  ................
003251f8  baadf00d baadf00d baadf00d baadf00d  ................

Another thing you could do besides !heap is to simply take the starting address of the heap allocations and continuously "DC" on the heap looking for strings or other patterns.

01c << 3 = e0 or 224. So, let's look at the source code.

  char *p, *x;
  while(1)
  {
     p = malloc(200);
     strcpy(p, "asdfasfasdfsadfsadfsdafasfsadfasdfsadfasfasfsdfsafsfasdfdsa");

     x = LocalAlloc(LMEM_ZEROINIT, 500);
     strcpy(x, "hi");

     Sleep(1);
  }

Very simple, and obviously would cause a fast leak. Notice that we allocated "224" bytes instead of "200". This is because the allocation size includes the 2 DWORD headers as well as must be on an 8 byte boundary. If you take the memory that you were given in your program and subtract 8, you have the header information. If you know, take the allocated size, shift left by 3, then add this to that address, you will get the next allocated block of memory. Take our last allocation as an example:

0:000> dc 0325188 + e0
00325268  001c001c 00180700 66647361 61667361  ........asdfasfa
00325278  73666473 73666461 73666461 61666164  sdfsadfsadfsdafa
00325288  61736673 73616664 61736664 73616664  sfsadfasdfsadfas
00325298  66736166 73666473 66736661 66647361  fasfsdfsafsfasdf
003252a8  00617364 baadf00d baadf00d baadf00d  dsa.............
003252b8  baadf00d baadf00d baadf00d baadf00d  ................
003252c8  baadf00d baadf00d baadf00d baadf00d  ................
003252d8  baadf00d baadf00d baadf00d baadf00d  ................
0:000>

I will not go over the heap flags as I really never bother with them myself unless really necessary. The most important flag would basically be to tell you if the memory is currently allocated or not. This is told to you when you do !heap <heap> -a as "busy" generally means not free and "free" means the memory is free. The "00180700" when allocated generally turns into "00180400" when free. You can experiment by writing a program to allocate and free memory while watching the flags. Just be careful that there's no other thread or the memory was allocated in the time after you freed it but before you checked its flags!

Private Heaps Verse Global Heaps

I have introduced you to the !heap function, but there's something I should mention to you. It does not show the global heap. If you have created a global variable, it will not show up in any of the heaps listed in !heap and will not follow the same rules. The global heap cannot be destroyed while these private heaps can.

The global heap you will have problems with corruption though, not leaking; which will be the next topic for discussion. I will not go over the global heap in this tutorial though.

Keeping Track Of Allocations

There is another method that you could use in your program to track down memory leaks. This would be the creation of your own memory allocation routines. These routines would simply behave as the following:

PVOID MyAllocationRoutine(DWORD dwSize)
{
 PVOID pMem = malloc(dwSize + sizeof(_DEBUG_STRUCTURE));

 if(pMem)
 {
    _DEBUG_STRUCTURE *pDebugStruc = (_DEBUG_STRUCTURE *)pMem;

    /* Fill In Your Debug Information Here */

    /* Make Sure You Give the Application the memory AFTER your debug structure */
    pMem = pDebugStruc + 1; 
  }

 return pMem;
}

You simply append your own header to all allocations. This information could be as complex as returning the return address of who allocated the memory, thread identifier, etc. This is how programs like boundschecker work. They can simply replace your allocations and find out who's allocating the memory and keep track of it. Your program could do this itself. You could even create a global variable and add all the memory into a linked list and create a debug helper extension to walk through it dumping all the memory and information. Creating a debug helper extension would be in a later tutorial.

As you can see, you can simply create your own memory allocation thunks to add useful information to the allocated block to help you keep track of memory and find leaks or even memory corruption. You could even #define this function to only be your code when in a certain mode, such as debug. The function you used could be redefined to LocalAlloc or malloc on a normal build. You could even always use it and enable it with a special registry key or flag. There're a lot of possibilities.

Also, remember that you now need to create a "free" routine. This routine should simply subtract the size of your structure before calling the free API.

Heap Corruption

Also known as "Memory Corruption", is basically when variables start overwriting their boundaries. The most common cause of simple heap corruption is caused when you use the wrong method to free the memory. For example, if you allocate with malloc() but free with LocalFree(). They use different heaps and could even use different underlying methods to allocate or keep track of memory all together. The problem is that these functions may still try to free the memory using their algorithms without doing validation on the heap being freed. This is when the corruption occurs. Look at the above section on creating your own malloc() thunk function. Other APIs could be performing their own thunks so it's always best to use the matching APIs to free the memory!

The other common causes are simply over running your boundary, such as writing to more memory than you've allocated and even simply just randomly writing to memory that the program does not own. Heap corruption can be harder to track down than memory leaks. The strategy of tracking allocations does not help, especially if you want to compile that in after the fact. This is because now you change the allocation sizes, so it's possible you will not corrupt the heap the same or perhaps there's enough padding it won't become corrupted.

I have written a program with a few heap problems. Now, heap problems do not surface immediately, they usually surface over time. They can surface when another part of the program goes to use the corrupted memory and trap, or when you free or allocate another variable.

I ran this program and got a few dialog boxes about invalid memory referencing. So, I run it in the debugger to see what happens.

C:\programs\DirectX\Games\src\Games\temp\bin>cdb temp 

Microsoft (R) Windows Debugger  Version 6.3.0005.1
Copyright (c) Microsoft Corporation. All rights reserved. 

CommandLine: temp
Symbol search path is: SRV*c:\symbols*http://msdl.microsoft.com/download/symbols 

Executable search path is:
ModLoad: 00400000 00404000   temp.exe
ModLoad: 77f50000 77ff7000   ntdll.dll
ModLoad: 77e60000 77f46000   C:\WINDOWS.0\system32\kernel32.dll
ModLoad: 77c10000 77c63000   C:\WINDOWS.0\system32\MSVCRT.dll
(a20.710): Break instruction exception - code 80000003 (first chance)
eax=00241eb4 ebx=7ffdf000 ecx=00000004 edx=77f51310 esi=00241eb4 edi=00241f48
eip=77f75a58 esp=0012fb38 ebp=0012fc2c iopl=0         nv up ei pl nz na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000202
ntdll!DbgBreakPoint:
77f75a58 cc               int     3
0:000> g
(a20.710): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=61736664 ebx=00000004 ecx=73616664 edx=00142ab8 esi=00142ab8 edi=00140000
eip=77f8452d esp=0012f7e4 ebp=0012f9fc iopl=0         nv up ei pl zr na po nc
cs=001b  ss=0023  ds=0023  es=0023  fs=0038  gs=0000             efl=00010246
ntdll!RtlAllocateHeapSlowly+0x6bd:
77f8452d 8901             mov     [ecx],eax         ds:0023:73616664=????????
0:000>

We receive a first chance exception. This generally means if I hit "g", the program will continue to run. If I get a "second chance" exception, the program is then trapped. However, this still does not look good. It's a good sign of memory corruption. Let's take a look. The first thing we want to do is remember what we were taught in the first tutorials. We need to find out why this memory is being referenced, who's referencing it and where it came from. The first way to do this? Our old friend, the stack trace!

0:000> kb
ChildEBP RetAddr  Args to Child
0012f9fc 77f9d959 00140000 50140169 00000006 ntdll!RtlAllocateHeapSlowly+0x6bd
0012fa80 77f83eb1 00140000 50140169 00000006 ntdll!RtlDebugAllocateHeap+0xaf
0012fcac 77f589f2 00140000 40140068 00000006 ntdll!RtlAllocateHeapSlowly+0x41
0012fee4 77e7a6d4 00140000 40140068 00000006 ntdll!RtlAllocateHeap+0xe44
0012ff30 00401024 00000040 00000006 00000000 kernel32!LocalAlloc+0x58
0012ff4c 0040113b 00000001 00322470 00322cf8 temp!main+0x24
0012ffc0 77e814c7 00000000 00000000 7ffdf000 temp!mainCRTStartup+0xe3
0012fff0 00000000 00401058 00000000 78746341 kernel32!BaseProcessStart+0x23
0:000>

So, we are allocating memory and basically NTDLL is trapping! This is a good sign of memory corruption. Let's look further, what is the memory attempted to be referenced.

0:000> u eip - 20
ntdll!RtlAllocateHeapSlowly+0x69d:
77f8450d 058845b356       add     eax,0x56b34588
77f84512 8b7de4           mov     edi,[ebp-0x1c]
77f84515 57               push    edi
77f84516 e85eeaffff       call    ntdll!RtlpUpdateIndexRemoveBlock (77f82f79)
77f8451b 8b4608           mov     eax,[esi+0x8]
77f8451e 89855cffffff     mov     [ebp-0xa4],eax
77f84524 8b4e0c           mov     ecx,[esi+0xc]
77f84527 898d58ffffff     mov     [ebp-0xa8],ecx
0:000> u
ntdll!RtlAllocateHeapSlowly+0x6bd:
77f8452d 8901             mov     [ecx],eax

From the looks of this, it appears to be some type of linked list or similar data structure. I have put in bold the assembly of interest. We first see that ECX is being de-referenced as a pointer. [ecx], as stated in another tutorial, is the same as DWORD *pECX; *pECX = EAX; in C. So, we need to find where the ECX value came from. We see "ECX, [ESI + 0Ch]", which would basically be:

DWORD *pECX, *pESI; pECX = pESI[12/4]; 

Remember, in assembly there's no types so the arrays are always indexed as byte sizes rather than data type sizes. (I assume if you're reading this you know about pointers!) So, let's dump [ESI + C] and see what's there.

0:000> dc esi + c
00142ac4  73616664 66736166 73666473 66736661  dfasfasfsdfsafsf
00142ad4  66647361 00617364 feeefeee feeefeee  asdfdsa.........
00142ae4  feeefeee feeefeee feeefeee feeefeee  ................
00142af4  feeefeee feeefeee feeefeee feeefeee  ................
00142b04  feeefeee feeefeee feeefeee feeefeee  ................
00142b14  feeefeee feeefeee feeefeee feeefeee  ................
00142b24  feeefeee feeefeee feeefeee feeefeee  ................
00142b34  feeefeee feeefeee feeefeee feeefeee  ................

The trap:

77f8452d 8901  mov  [ecx],eax  ds:0023:73616664=????????

This is very simple now! This is a string, so we just need to look through our program to find out where we allocated this string.

x = LocalAlloc(LMEM_ZEROINIT, 5);
strcpy(x, "asdfasfasdfsadfsadfsdafasfsadfasdfsadfasfasfsdfsafsfasdfdsa");
p = LocalAlloc(LMEM_ZEROINIT, 6);
strcpy(p, "hi");
LocalFree(x);
free(p);

As we can see, we have overwritten a lot of memory beyond our "5" bytes we wanted to allocate. This was a very simple example. Sometimes, it could just be one byte overwritten and you may need to search backwards through memory to find out if it belongs to a string or the "null" of a string. A very common mistake is to forget to leave 1 character for the NULL. A lot of times, this is OK as we see we allocate on granularities of 8, so it's unnoticed. Other times though, it's a nightmare to track down because once the memory has been overwritten, the culprit is long gone once its effect actually takes hold!

Another good thing to do is step through the program. If the place being overwritten does not change, it's simple. Step through and keep checking if it's been overwritten. You can walk functions to narrow down the location. This is also assuming a simple problem, not a complex scenario. You may also want to check other memory locations and etc. to track down.

Another good tool is the "breakpoint". "ba r1 xxxx" means "break if anyone attempts to read or write to this address". This is very useful as well if the address is constant. It will break as soon as someone attempts to write to it. There's other methods you could attempt as well, such as narrowing down the functionality that causes the problems, enabling "global flags", etc.

I know that I mentioned in the above first paragraph that the code used to check for memory leaks could not be used to check for heap corruption. This is true and false. I basically meant, in its current state, you could not expect to use it to help find memory corruption. With a little tweak, you can pad the end of the allocation and the beginning with values that you would not expect the program to use. Then, at the end, when you free the data, check the signature to validate it was unchanged. If it did change, it's a possible memory overwrite. This assumes your corruption is continuous or will hit the boundary and not skip it. This also assumes the problem lies in the allocated buffer and not somewhere else in your code!

Other Tools

There are other tools that can help you spot and track down memory corruption and leaks.

Performance Monitor

This tool comes with Windows, "perfmon". This allows you to monitor and record system performance using the options you choose. This can help you study a process' memory footprint over time. This is a better alternative than "Task Manger", for spotting slow leaks.

Bounds Checker

This is a tool which I described above. It will help you to track down memory that is not freed at the close of your application. It will also help to find other types of leaks and even help find memory corruption. This is a very easy tool to use and will display the locations in your code which caused the allocation that was never freed. It will even tell you how much memory was not freed.

Global Flags

There are options in the registry to enable heap validation and heap checking. There is a utility that comes with the debug tools called "gflags" that can help set these flags in the registry for you. I may go over these in a future tutorial, however the debug tools do come with documentation.

QuickView: System Explorer

This is a utility that I have written and it is a general Windows 2000 and higher debug helper. It does not do real time data at this time and is just for exploring different aspects of the system to help you track down problems. If you want to use it, there is an .RTF file installed with it that helps explain all the features.

Conclusion

This tutorial should hopefully have introduced you to memory leaks and heap corruption from usermode. This tutorial should also have helped educate you in how to approach such problems with your own applications.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here