Introduction
As a C++ programmer, I often write code for products that are either add-on applications or third-party DLLs for a main application. It is therefore important for me to ensure that my code does not degrade the performance of the main application or product. Generally, I use commercial products such as the IBM Rational product suite or BoundsChecker. At times, however, these commercial tools are not useful, because the code (DLL) I write depends on the main application, and the main application cannot be modified for profiling. What I want is the time taken by each function in my code and the number of times each function is called. In short, I want to do an impact analysis of my code so that I have input for performance improvements, and can thereby ensure that the overall performance of the main application or product is maintained.
Background
I happened to discuss this problem with a friend who is also an experienced C/C++ programmer. He pointed out that the Visual Studio C++ compiler has flags that can be used to hook every function: /Gh and /GH. The /Gh flag causes a call to the _penter function at the start of every method or function, and the /GH flag causes a call to the _pexit function at the end of every method or function. So, if I write code in these hooks to find out the caller function, I can gather the stack trace information. Also, if I start a timer in _penter and stop the corresponding timer in _pexit, I can roughly measure the time the method or function takes to execute. It is important, however, to understand the _penter and _pexit functions before writing any code. MSDN states that they are not part of any library, and that it is up to the developer to provide their definitions. So, I decided to write my own DLL that defines _penter and _pexit and also exports them. The prototypes are as follows:
void __declspec(naked) _cdecl _penter( void);
void __declspec(naked) _cdecl _pexit( void);
These functions are declared __declspec(naked) and _cdecl. Because a naked function gets no compiler-generated prolog or epilog, the implementation must itself push the content of all the registers on entry and pop the unchanged content on exit. Also, objects cannot be instantiated inside the function body, and only global or static variables can be used there. Likewise, only global or static methods can be called from inside the function body. Keeping all this in mind, I created a singleton Profiler class that has the methods needed to collect the time profiling data. I used the C++ inline assembler to implement _penter and _pexit. The sample implementation is as follows:
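extern "C" void __declspec(naked) _cdecl _penter( void )
{
    _asm
    {
        pushad                      // save all eight general-purpose registers (32 bytes)
        mov  eax, esp
        add  eax, 32                // skip over the saved registers
        mov  eax, dword ptr[eax]    // load the return address into the hooked function
        sub  eax, 5                 // step back over the 5-byte call to _penter
        push eax                    // argument for enterFunc
        call enterFunc
        pop  eax                    // remove the argument (caller cleans up under cdecl)
        popad                       // restore the registers unchanged
        ret
    }
}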
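The two constants in the listing above (32 and 5) can be checked with a little arithmetic. The following is a minimal sketch, assuming the compiler emits the hook as a near call (opcode 0xE8 followed by a 32-bit displacement, 5 bytes in total) and that the target is 32-bit x86. The helper name callSiteFromReturnAddress and both constant names are hypothetical and are not part of the profiler.

```cpp
#include <cassert>
#include <cstdint>

// Size of an x86 near call: one opcode byte (0xE8) plus a 32-bit displacement.
constexpr std::uint32_t kNearCallSize = 5;

// pushad saves eight 32-bit registers, so the return address that was on top
// of the stack when _penter started now sits this many bytes above ESP.
constexpr std::uint32_t kReturnAddressOffset = 8 * sizeof(std::uint32_t);

// Hypothetical helper: given the return address read from the stack, compute
// the address of the call instruction that invoked the hook. That address
// lies inside the body of the function being profiled.
constexpr std::uint32_t callSiteFromReturnAddress(std::uint32_t returnAddress)
{
    return returnAddress - kNearCallSize;
}
```

For example, if the hook call sits at 0x00401000, the return address pushed by the call is 0x00401005, and subtracting 5 recovers 0x00401000.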
The interesting part of the implementation is how a non-naked global function, enterFunc, is called with the virtual address of the caller function as an argument. After the prolog instruction (pushad) has saved the registers, the code walks up the stack by adding 32 bytes to the current stack pointer, which is the size of the eight registers just pushed. It then reads the return address stored at that location. Subtracting 5 bytes from the return address gives a virtual address inside the body of the caller function. This virtual address is passed as an argument to the global function, which does the further processing. This arrangement solves the timekeeping part. But what about the function name? How do I get a function name from a virtual address inside a function body?
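To make the timekeeping concrete, here is a minimal sketch of the bookkeeping that a pair of global hook targets could forward to. It is not the article's Profiler class: FuncStats, the enter and leave methods, the exitFunc counterpart, the nowMs helper, and the enterFunc signature are all assumptions made for illustration, and the sketch assumes a single thread and non-recursive calls. Timestamps are passed in explicitly so the logic can be exercised without the hooks.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <map>
#include <vector>

// Per-function totals, keyed by the address handed to enterFunc.
struct FuncStats {
    std::uint64_t calls = 0;
    double totalMs = 0.0;   // time in the function, including callees
    double childMs = 0.0;   // time spent in instrumented callees
    double selfMs() const { return totalMs - childMs; }
};

class Profiler {
public:
    // Single shared instance, reachable from plain global functions.
    static Profiler& instance() {
        static Profiler p;
        return p;
    }

    // Record that the function at addr started at nowMs.
    void enter(std::uintptr_t addr, double nowMs) {
        stack_.push_back(Frame{addr, nowMs});
    }

    // Close the most recently entered function at nowMs.
    void leave(double nowMs) {
        if (stack_.empty()) return;            // unmatched exit: ignore
        Frame f = stack_.back();
        stack_.pop_back();
        const double elapsed = nowMs - f.startMs;
        FuncStats& s = stats_[f.addr];
        ++s.calls;
        s.totalMs += elapsed;
        // The callee's elapsed time counts as child time of its caller.
        if (!stack_.empty()) stats_[stack_.back().addr].childMs += elapsed;
    }

    const std::map<std::uintptr_t, FuncStats>& stats() const { return stats_; }

private:
    struct Frame { std::uintptr_t addr; double startMs; };
    Profiler() = default;
    std::vector<Frame> stack_;                  // currently open calls
    std::map<std::uintptr_t, FuncStats> stats_;
};

// Hypothetical clock helper based on the standard monotonic clock.
static double nowMs() {
    using namespace std::chrono;
    return duration<double, std::milli>(steady_clock::now().time_since_epoch()).count();
}

// Global functions that naked hooks could call (signatures assumed).
extern "C" void enterFunc(std::uintptr_t addr) { Profiler::instance().enter(addr, nowMs()); }
extern "C" void exitFunc() { Profiler::instance().leave(nowMs()); }
```

Keeping a stack of open calls is what allows the time of a callee to be charged to its caller as child time, so that self time falls out as total time minus child time.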
I solved this name problem by using the DIA (Debug Interface Access) SDK. The DIA SDK has a unified model to access or query any symbol and its properties from PDB files. So, to use this profiler, it is mandatory that the DLL or EXE has debugging information (a PDB file).
I implemented getFunc in the Profiler class, which finds the function name for a given virtual address. The process is as follows:
- Get the current process handle using the GetCurrentProcess function
- Get all the loaded modules of the current process using the EnumProcessModules function
- Check whether the given virtual address belongs to any module, using the module's address space size and load address
- If the given virtual address belongs to a module, get the module file path from the module handle using GetModuleFileNameEx
- Load the PDB file from the module file path using the loadDataForExe method of IDiaDataSource
- Use the openSession method of IDiaDataSource to get IDiaSession, and use the put_loadAddress method of IDiaSession to set up the symbol database for the query
- Query the IDiaSession object using the findSymbolByVA method, which returns IDiaSymbol, i.e., the function having the given virtual address in its body
- Get the function name from the IDiaSymbol object using the get_name method
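The module check in the third step is a simple range test. The sketch below shows it in portable C++ with the Windows calls left out: ModuleInfo and findModuleForAddress are hypothetical names, and the load address and image size are assumed to have been collected already for each module (for instance, values of the kind GetModuleInformation reports, although the article does not say which call it uses).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Stand-in for the per-module data gathered while enumerating modules.
struct ModuleInfo {
    std::uintptr_t loadAddress;   // where the module is mapped
    std::size_t    sizeOfImage;   // size of its address range in bytes
    std::string    path;          // module file path, used to locate the PDB
};

// Returns the module whose range [loadAddress, loadAddress + sizeOfImage)
// contains va, or nullptr if the address belongs to no listed module.
const ModuleInfo* findModuleForAddress(const std::vector<ModuleInfo>& modules,
                                       std::uintptr_t va)
{
    for (const ModuleInfo& m : modules) {
        // Comparing the offset avoids overflow in loadAddress + sizeOfImage.
        if (va >= m.loadAddress && va - m.loadAddress < m.sizeOfImage)
            return &m;
    }
    return nullptr;
}
```

The path of the matching module is what would then be handed to loadDataForExe, and its load address to put_loadAddress.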
Using the Code
The Profiler described in this article uses the DIA SDK, so if you want to use this profiler, then it's mandatory that the project (LIB/DLL/EXE) to be profiled should generate debugging information.
To generate debugging information for a project, open the project's General property page under the C/C++ tab and set Debug Information Format to Program Database (/Zi).
Then, on the Debugging property page under the Linker tab, set Generate Debug Info to Yes (/DEBUG). These two settings ensure that a PDB file is created for the project to be profiled.
After setting the project to generate a PDB file, add the /Gh and /GH flags to Additional Options on the Command Line property page under the C/C++ tab.
Add profiler.lib (the export library from the profiler project) to Additional Dependencies on the Input property page under the Linker tab. This setting is essential for the profiler to work, because profiler.lib/dll contains the _penter and _pexit implementations.
Provide the path to profiler.lib in Additional Library Directories of the Linker property page. This setting would be developer dependent.
To view the profiling result as a CSV file, set the PROFILER_LOG environment variable to the CSV file path. After a complete run of the main application, the profiling data is saved to the specified CSV file.
The profiler CSV file contains the name of each executed function or method, how many times it was called, the total time taken by the function and its child functions in milliseconds, the time taken by the function itself (self time), and the time taken by its child functions.
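As an illustration of this output step, the following sketch formats rows with the five fields just listed and writes them to the path named by PROFILER_LOG. The column headings, the ReportRow struct, and the function names formatCsv and writeReportIfRequested are assumptions. The actual file layout produced by the profiler may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// One line of the report (hypothetical layout).
struct ReportRow {
    std::string   name;      // function or method name
    std::uint64_t calls;     // number of calls
    double        totalMs;   // function plus child functions
    double        childMs;   // child functions only
};

// Builds the CSV text. Names are quoted because C++ names can contain
// commas, for example in template argument lists.
std::string formatCsv(const std::vector<ReportRow>& rows)
{
    std::ostringstream out;
    out << "Function,Calls,Total (ms),Self (ms),Children (ms)\n";
    for (const ReportRow& r : rows) {
        out << '"' << r.name << "\"," << r.calls << ',' << r.totalMs << ','
            << (r.totalMs - r.childMs) << ',' << r.childMs << '\n';
    }
    return out.str();
}

// Writes the report only if PROFILER_LOG is set to a non-empty path.
// Returns true when the file was written successfully.
bool writeReportIfRequested(const std::vector<ReportRow>& rows)
{
    const char* path = std::getenv("PROFILER_LOG");
    if (path == nullptr || *path == '\0') return false;
    std::ofstream file(path);
    if (!file) return false;
    file << formatCsv(rows);
    return static_cast<bool>(file);
}
```

Self time is derived here as total time minus child time, which matches the relationship between the three time columns described above.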
Limitation
The _penter and _pexit implementations shown here rely on __declspec(naked) and inline assembly, which the Visual C++ compiler supports only for 32-bit (x86) targets, so the current profiler will not be useful for native 64-bit applications.
Scope for Improvement
The Simple Profiler discussed in this article is complete in itself, but it can be further improved. I can think of the following ways to improve it or make it more developer friendly:
- A multimedia timer can be used instead of the default clock to get more precise timings
- A memory profiler can be added
- A Visual Studio add-in or macro can be created for quick and all time profiling for a complete solution
History
- 15 Dec. 2009 - Article first posted to The Code Project.