A Simple Profiler using the Visual Studio C/C++ Compiler and DIA SDK

15 Dec 2009 | CPOL
Easy to use profiler for time and impact analysis of C/C++ code which uses the Visual Studio C/C++ compiler (/Gh and /GH flags) and the DIA SDK to gather profiling data.

Image 1

Introduction

As a C++ programmer, I usually end up writing code for products that are either add-on applications or third-party DLLs loaded by a main application. As a developer, it is important for me to ensure that my code does not degrade the performance of that main application or product. Generally, I use commercial products like IBM Rational Product Suite or BoundsChecker. But at times, these commercial tools are not useful, because the code (DLL) I write depends on a main application that cannot be modified for profiling. What I need is the time taken by each function in my code and the number of times each function is called. In short, I want to do an impact analysis of my code so that I get input for performance improvements and can ensure that the overall performance of the main application or product is maintained.

Background

I happened to discuss this problem with a friend who is also an experienced C/C++ programmer. He pointed out that the Visual Studio C++ compiler has two flags that can be used to hook every function: /Gh and /GH. The /Gh flag causes a call to the _penter function at the start of every method or function, and the /GH flag causes a call to the _pexit function at the end of every method or function. So, if I write code in these hooks to find out the calling function, I can gather stack trace information. And if I start a timer in _penter and stop the corresponding timer in _pexit, I can roughly measure the time a method or function takes to execute. It is important to understand the _penter and _pexit functions before writing any code, though. MSDN states that they are not part of any library, and that it is up to the developer to provide their definitions. So, I decided to write my own DLL which defines _penter and _pexit and exports both functions. The prototypes are as follows:

C++
void __declspec(naked) _cdecl _penter( void);
void __declspec(naked) _cdecl _pexit( void);

These functions are declared as __declspec(naked) and _cdecl. For a naked function, the compiler generates no prolog or epilog code, so the implementation itself must push the content of all the registers on entry and pop the unchanged content on exit. Also, objects cannot be instantiated inside the function body; only global or static variables can be used there, and only global or static methods can be called from it. Keeping all this in mind, I created a singleton Profiler class which has the necessary methods to collect the time profiling data, and I used the C++ inline assembler feature to implement _penter and _pexit. The sample implementation is as follows:

C++
extern "C" void __declspec(naked) _cdecl _penter( void ) 
{
    _asm 
    {
        // Prolog: save all general-purpose registers
        pushad
        // calculate the pointer to the return address by adding 4*8 bytes
        // (pushad pushed 8 register values onto the stack, which must be skipped)
        mov  eax, esp
        add  eax, 32
        // retrieve the return address from the stack
        mov  eax, dword ptr[eax]
        // subtract 5 bytes, as the call _penter instruction
        // is 5 bytes long on 32-bit machines, e.g. E8 <00 00 00 00>
        sub  eax, 5
        // pass the resulting address to enterFunc
        push eax
        call enterFunc
        // remove the argument from the stack (_cdecl: the caller cleans up)
        pop eax

        // Epilog: restore the registers and return to the instrumented function
        popad
        ret
    }
}
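
The address arithmetic performed by the assembly above can be written out in plain C++ to make the intent easier to follow. This is only a sketch: the helper name callSiteFromReturnAddress and the constant kNearCallLength are made up for illustration and are not part of the profiler's source.

```cpp
#include <cstdint>

// Length in bytes of a 32-bit x86 near call: opcode E8 followed by a
// 4-byte relative offset. This is the form the article describes for
// the compiler-inserted "call _penter".
constexpr std::uintptr_t kNearCallLength = 5;

// Hypothetical helper (not from the article's source): given the return
// address pushed by the call instruction, return the address of the call
// instruction itself, which lies inside the instrumented function's body.
constexpr std::uintptr_t callSiteFromReturnAddress(std::uintptr_t returnAddress)
{
    return returnAddress - kNearCallLength;
}
```

For example, a return address of 0x00401005 maps back to a call instruction at 0x00401000.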

The interesting part of the implementation is how a non-naked global function, enterFunc, is called with an address inside the calling function as its argument. After the prolog instruction (pushad), the code adds 32 bytes to the current stack pointer to skip the eight saved registers and reach the slot holding the return address, then reads the return address from that slot. Subtracting 5 bytes from the return address gives the address of the call _penter instruction, which lies inside the body of the function being profiled. This virtual address is passed as an argument to the global function, which does the further processing. The time keeping part is solved with this arrangement. But what about the function name? How do I get the name of a function from a virtual address inside its body?

I solved this name problem by using the DIA (Debug Interface Access) SDK. The DIA SDK provides a unified model for querying any symbol and its properties from PDB files. So, to use this profiler, it is mandatory that the DLL or EXE has debugging information (a PDB file).

I implemented getFunc in the Profiler class which finds out the function name from the given virtual address. The process is as follows:

  • Get the current process handle using the GetCurrentProcess function
  • Get all the loaded modules of the current process using the EnumProcessModules function
  • Check whether the given virtual address belongs to a module, using the module's load address and address space size
  • Get the module file path from the module handle if the given virtual address belongs to that module, using GetModuleFileNameEx
  • Load the PDB file from the module file path using the loadDataForExe method of IDiaDataSource
  • Use the openSession method of IDiaDataSource to get IDiaSession, and use the put_loadAddress method of IDiaSession to set up the symbol database for the query
  • Now, query the IDiaSession object using the findSymbolByVA method which would return IDiaSymbol, i.e., the function having the given virtual address in its body
  • Get the function name from the IDiaSymbol object using the get_name method
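
The module-matching step in the list above (deciding which loaded module owns a given virtual address) can be sketched in portable C++. The ModuleRange struct and the findOwningModule function are hypothetical names introduced for this illustration. In the real profiler the load address and size would come from the Win32 module enumeration, and the matching module's path would then be handed to the DIA SDK.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record for one loaded module (illustrative, not from the
// article's source). In the profiler these values would be filled in
// from the process's module list.
struct ModuleRange {
    std::string    path;         // module file path
    std::uintptr_t loadAddress;  // base address at which the module is loaded
    std::uintptr_t imageSize;    // size of the module's address space in bytes
};

// Return the module whose range [loadAddress, loadAddress + imageSize)
// contains the address, or nullptr if no module in the list owns it.
const ModuleRange* findOwningModule(const std::vector<ModuleRange>& modules,
                                    std::uintptr_t address)
{
    for (const ModuleRange& m : modules) {
        // Comparing the offset against the size avoids overflow in
        // loadAddress + imageSize for modules near the top of memory.
        if (address >= m.loadAddress && address - m.loadAddress < m.imageSize) {
            return &m;
        }
    }
    return nullptr;
}
```

With a list holding a module loaded at 0x10000000 with size 0x20000, a lookup for 0x10001234 returns that module, while an address outside every range returns nullptr.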

Using the Code

The Profiler described in this article uses the DIA SDK, so if you want to use this profiler, then it's mandatory that the project (LIB/DLL/EXE) to be profiled should generate debugging information.

To generate debugging information for a project, open the project's General property page under the C/C++ tab and set Debug Information Format to Program Database (/Zi).

Image 2

Also, on the Debugging property page under the Linker tab, set Generate Debug Info to Yes (/DEBUG). These two settings ensure that a PDB file is created for the project to be profiled.

Image 3

After setting the project to generate a PDB file, set the /Gh and /GH flags in Additional options of the Command Line property page under the C/C++ tab, as shown below.

Image 4

Add profiler.lib (export library from the profiler project) in Additional Dependencies of the Input property page under the Linker tab. This is an important setting for the profiler to work as profiler.lib/dll has the _penter and _pexit implementations.

Image 5

Provide the path to profiler.lib in Additional Library Directories of the Linker property page. The actual path depends on where the developer has placed the library.

Image 6

To view the profiling result as a CSV file, set the PROFILER_LOG environment variable to the CSV file path. After a complete run of the main application, the profiling data is saved to the specified CSV file.

Image 7

As shown above, the profiler CSV file contains, for each executed function or method: its name, the number of times it was called, the total time taken by the function together with its child functions (in milliseconds), the time taken by the function alone (self time), and the time taken by its child functions.
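
The relationship between these columns (self time = total time minus child time) can be illustrated with a small call-stack sketch. This is not the article's Profiler class. The CallTimer and Totals names are hypothetical, timestamps are passed in explicitly rather than read from a clock, and the sketch assumes a single thread and makes no special provision for recursive calls.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Accumulated figures for one function (illustrative names).
struct Totals {
    unsigned long calls = 0;
    double totalMs = 0.0;  // time in the function including its children
    double childMs = 0.0;  // time spent in functions it called
    double selfMs() const { return totalMs - childMs; }
};

// Hypothetical bookkeeping that entry and exit hooks could drive.
class CallTimer {
public:
    // Called on function entry: push a frame recording the start time.
    void enter(std::uintptr_t function, double nowMs)
    {
        stack_.push_back(Frame{function, nowMs, 0.0});
    }

    // Called on function exit: pop the frame, add its elapsed time to the
    // function's totals, and charge that time to the parent as child time.
    void leave(double nowMs)
    {
        if (stack_.empty()) {
            return;  // unmatched exit; ignore it
        }
        const Frame frame = stack_.back();
        stack_.pop_back();

        const double elapsed = nowMs - frame.startMs;
        Totals& t = totals_[frame.function];
        ++t.calls;
        t.totalMs += elapsed;
        t.childMs += frame.childMs;

        if (!stack_.empty()) {
            stack_.back().childMs += elapsed;
        }
    }

    const std::map<std::uintptr_t, Totals>& totals() const { return totals_; }

private:
    struct Frame {
        std::uintptr_t function;  // address identifying the function
        double startMs;           // timestamp at entry
        double childMs;           // time consumed by callees so far
    };
    std::vector<Frame> stack_;
    std::map<std::uintptr_t, Totals> totals_;
};
```

If an outer function runs from 0 ms to 10 ms and calls an inner function that runs from 2 ms to 5 ms, the outer function ends up with a total of 10 ms, a child time of 3 ms, and a self time of 7 ms.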

Limitation

The _penter and _pexit implementations shown here rely on __declspec(naked) and inline assembly, which the Visual C++ compiler supports only for 32-bit x86 targets, so the current profiler will not be useful for native 64-bit applications.

Scope for Improvement

The Simple Profiler discussed in this article is complete in itself, but it can be further improved. I can think of the following ways to improve it or make it more developer friendly:

  • A multimedia timer can be used instead of the default clock to get more precise timings
  • A memory profiler can be added
  • A Visual Studio add-in or macro can be created to quickly enable profiling for a complete solution

History

  • 15 Dec. 2009 - Article first posted to The Code Project.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)