Contents
Have you ever wondered how profiling tools hook into .NET applications? This article shows how you can create your own custom .NET profiler for any managed application. This profiler is about as basic as it can get, but it shows how to create one and provides a skeleton that can be used to develop something further.
The profiler will perform the following functions:
- Maintain an internal map of called functions
- Maintain the call count for each function
- Maintain the depth of the call stack
The result of using this profiler is an output file that reports what occurred during the course of an application's run. After seeing the depth of the contents of this file, I guarantee that you will never look at your .NET code the same way again.
NOTE: I want to mention up front that this profiler implementation is for .NET 2.0 applications and the sample projects are VS 2005 projects. I'd love to post a .NET 1.1 / VS 2003 solution, but would have to revert my machine to do so. There are references at the end of this article which point to sample SDK projects for .NET 1.1.
Profiling is the ability to time areas of an application, typically the entire application, so that bottlenecks can be discovered. There are several profilers available that provide full profiling of an application. In .NET, profilers are written as COM DLLs. The minimum requirement for a .NET profiler is that it implement an interface called ICorProfilerCallback
. An extended interface called ICorProfilerCallback2
was introduced in .NET 2.0, and is used in the included sample. These interfaces provide callbacks for almost everything you'd want to know about your program's execution: call enter/exits, thread switches, assembly loads/unloads, class loads/unloads, JIT compilation, managed/unmanaged code transitions, garbage collection, exception handling... everything!
The CLR is the engine that drives .NET applications. It also provides a logical place to insert hooks to see exactly what it's up to. Whereas the Win32® debug API provided only eight notifications, the ICorProfilerCallback2
profiling interface provides almost 80.
When the CLR begins a process, it looks for two environment variables:
COR_ENABLE_PROFILING
: This environment variable is set to either 1 or 0. 1 indicates that the CLR should use a profiler. 0 (or the non-existence of this environment variable) indicates that it should not use a profiler. COR_PROFILER
: Now that we've told the CLR that we want to profile, we have to tell it which profiler to use. Because profilers are implemented as COM objects, this environment variable will be set to the GUID of the coclass that implements the ICorProfilerCallback2
interface. You should be able to find this in the IDL of the profiler. For this project, the coclass guid is {9E2B38F2-7355-4C61-A54F-434B7AC266C0}
.
There are two ways to set up these environment variables. You can set them globally, in which case all .NET applications you run will be profiled. Obviously, this is a bad idea. The other way is to manually create a process using the Process
object and set those environment variables only for that process. If you've ever added a Profiling Session to your solution in Visual Studio 2005 and run it with the toolbar button, this is likely what's happening behind the scenes.
Once the process has started and the CLR has been instructed to profile and which profiler to use, it will create the profiler object and query for it's ICorProfilerCallback2
interface. Once it has that, it begins calling it to notify the profiler about everything it's doing. I've attempted to diagram this in the following picture:
The sample includes two solutions:
- DotNetProfiler.sln: This solution contains a single C++ project called DotNetProfiler which implements the actual profiler using ATL.
- ProfilerTest.sln: This solution contains two projects. ProfilerLauncher is a C# application with a single button that creates the process, sets the environment variables, and runs another application in that process. HelloWorld is our C# test application that will be profiled. It also contains a single button that, when clicked, shows a single "Hello world" message box.
As stated previously, the profiler in this article produces an output file. This file may be specified using the LOG_FILENAME
environment variable. The contents of the log file consists of two parts: a call stack and a function summary.
- The Call Stack: It is simply a list of functions in the order in which they were called, using padding to indicate the depth of the call stack. Here's a sample of the call stack section:
System.IO.StringWriter.Write, id=70671928, call count = 622
System.Text.StringBuilder.Append, id=21629200, call count = 635
System.IntPtr.op_Inequality, id=21665400, call count = 717
System.String.AppendInPlace, id=9670768, call count = 635
System.Configuration.XmlUtilWriter.AppendAttributeValue, id=73427152, call count = 14
System.Xml.XmlTextReader.get_QuoteChar, id=70046432, call count = 14
System.Xml.XmlTextReaderImpl.get_QuoteChar, id=70049120, call count = 14
This sample shows two main calls, StringWriter.Write
and XmlUtilWriter.AppendAttributeValue
. If a function is indented from the previous one, that means that it was called by that function.
- The Function Summary: It is a flat list of all of the functions that were called during the course of the run. When I say all, I mean ALL... every function in every assembly. Each function in this list also shows the total number of calls made to that function during the run. If some function names show up multiple times, it's because they are overloaded. Here's a sample of the function list section:
System.String.Join : call count = 2
System.String.SmallCharToUpper : call count = 1
System.String.EqualsHelper : call count = 1106
By default, the output file will be named "ICorProfilerCallback Log.log" and will be written to the executing assembly's directory (which in this case is the \bin\Debug directory of HelloWorld).
This project builds a COM DLL that implements the ICorProfilerCallback2
interface. This section goes over some of the classes used.
CCorProfilerCallbackImpl
: At first glance, this class appears to do nothing, and you are right. Because all profilers MUST implement the ICorProfilerCallback
interface, we must have an implementation for every function that interface defines. There are so many functions on this interface that it would bloat our main implementation to create stubs for the ones we had no use for. CCorProfilerCallbackImpl
implements the ICorProfilerCallback2
interface with nothing but stubs that may be overridden. Therefore, our main implementation will derive from this class and we only need to override the functions which we're interested in. CFunctionInfo
: This class is designed to represent a single function prototype. The CLR gives unique IDs to each function it calls. Even overloaded versions of the same function get their own ID. Our profiler implementation maintains an STL <code><code>map
(hashtable) of these objects, keyed by ID. When we encounter a function ID that we don't have in the map, we create a CFunctionInfo
object, populate it and add it to the map using the function ID assigned by the CLR. Objects of this class also maintain a call count, so when we are notified that a function has been called, we can increment its call count. CProfiler
: This class is the main implementation of the profiler. It performs the initialization of the profiler and maintains the map of CFunctionInfo
objects. It also maintains the output file, which it writes to continuously during its operation.
- The first thing the CLR does when it obtains a reference to our profiler is to call the
Initialize
function, passing a pointer to an object that implements the ICorProfilerInfo
interface. If .NET 2.0 is being used, the this object will also implement the ICorProfilerInfo2
interface. Our CProfiler
class maintains a CComQIPtr
smart pointer to each. However, in this implementation, we only use the pointer to the ICorProfilerInfo
interface.
HRESULT hr = pICorProfilerInfoUnk->QueryInterface(IID_ICorProfilerInfo,
(LPVOID*)&m_pICorProfilerInfo);
if (FAILED(hr))
return E_FAIL;
hr = pICorProfilerInfoUnk->QueryInterface(IID_ICorProfilerInfo2,
(LPVOID*)&m_pICorProfilerInfo2);
if (FAILED(hr))
{
m_pICorProfilerInfo2.p = NULL;
}
This object is used for getting contextual information from the CLR, such as retrieving the metadata for a function given a CLR-assigned ID.
- Once we have a reference to the
ICorProfilerInfo
interface, we need to tell the CLR which types of notifications we're interested in. We do this by calling the SetEventMask
function with a bit-masked value. The SetEventMask
function in CProfiler
only registers for enter/leave notifications, but the rest have been enumerated and commented out so that you can see what's available.
DWORD eventMask = (DWORD)(COR_PRF_MONITOR_ENTERLEAVE);
m_pICorProfilerInfo->SetEventMask(eventMask);
- Our next step is to register for "function enter and leave" hooks. Doing this registers three callback functions on our
CProfiler
object.
hr = m_pICorProfilerInfo->SetEnterLeaveFunctionHooks
((FunctionEnter*)&FunctionEnterNaked,
(FunctionLeave*)&FunctionLeaveNaked,
(FunctionTailcall*)&FunctionTailcallNaked);
FunctionEnterNaked
: Called when a function enters. FunctionLeaveNaked
: Called when a function exits. FunctionTailcallNaked
: Called when the last action of a function is a call to another method.
In C++ the callback functions must be declared as __declspec(naked)
. Additionally the routines must preserve any CPU registers they use and restore them before returning. Each callback has three parts (illustrated with the "Enter
" callback):
FunctionEnterNaked
: The assembly implementation which preserves the registers while calling the FunctionEnterGlobal
function. FunctionEnterGlobal
: The global function which forwards the call to the profiler object. Enter
: The profiler object's implementation of the FunctionEnter
notification.
NOTE: I have to admit that this had me scratching my head as well, however, it's the way that Microsoft implements these functions. This was probably the most difficult part of the profiler to get working properly. I'm not sure why they didn't just add Enter
, Leave
and Tailcall
as functions to be implemented on the ICorProfilerCallback
interface.
- Our final step is to register for function mapping callbacks. I've seen multiple definitions of what this function is designed to do, but one thing that it does well is to callback whenever it reaches a function it hasn't executed before, passing back the ID that it has mapped to that function. This is the ideal place to create our
CFunctionInfo
objects to go into the STL map
. This way, we're guaranteed to have an object for that function in the map whenever we get an "enter
" or "leave
" callback about it.
hr = m_pICorProfilerInfo->SetFunctionIDMapper((FunctionIDMapper*)&FunctionMapper);
Once initialization has occurred, our CProfiler
object is ready to profile. Because we registered our enter
/leave
hooks and specified COR_PRF_MONITOR_ENTERLEAVE
in our notification flags, we will be called whenever a function is entered or returns. This interaction looks like this:
- A new function is encountered for execution by the CLR.
- The CLR maps it to a numeric ID and calls
CProfiler
's FunctionMapper
function. CProfiler
uses the ICorProfilerInfo
reference to get the function name, given the ID. The profiler creates and populate a CFunctionInfo
object with this information and adds it to the STL map
, using the function ID as the key. - The CLR then calls
CProfiler
's FunctionEnterNaked
callback function, passing the function ID. This is forwarded to CProfiler
's Enter
function. - Inside the
Enter
function, CProfiler
looks up the function ID in the map. If we found the CFunctionInfo
object in the map, then we log it in the output file's call stack and increment it's call count. The profiler then increments the call stack depth. - The CLR executes the function.
- The CLR then calls
CProfiler
's FunctionLeaveNaked
callback function, passing the function ID. This is forwarded to CProfiler
's Leave
function. - Inside the
Leave
and Tailcall
functions, CProfiler
decrements the call stack depth.
Once the program finishes, the CLR calls CProfiler
's Shutdown
function. Inside this function, the profiler traverses the CFunctionInfo
map and outputs each function name and it's call count to the log file. We then free the CFunctionInfo
objects in the map.
This project is simply an application that sets up the environment for profiling a managed application. All of the work is done inside the button click handler, where it creates a ProcessStartInfo
object and sets up the environment variables that were discussed in the previous section. A process is then started using this information, which should invoke the sample profiler.
One of the most interesting things about this application, as simple as it is, is that it can be used to profile any application with any profiler. Two constants are declared at the top of Form1.cs:
private const string PROFILER_GUID = "{9E2B38F2-7355-4C61-A54F-434B7AC266C0}";
private const string EXECUTABLE_TO_RUN = "HelloWorld.exe";
To use another profiler, simply change PROFILER_GUID
to the profiler you want to use. To profile another application, change EXECUTABLE_TO_RUN
to the application you want to profile. It would probably take just a few minutes to add a file open dialog and a text box to make these changeable at runtime. (I wanted to keep the sample as simple as possible).
- Unzip the dotnetprofiler_demo.zip file into a directory.
- Register the DLL with
RegSvr32
from the command line:
regsvr32.exe DotNetProfiler.dll
You should get a confirmation that the DLL registered successfully.
- Run ProfilerLauncher.exe. Click the "Launch and Profile!" button to launch HelloWorld.exe.
- HelloWorld will run (very slowly). Click it's "Say Hello World" button and it will say "Hello world".
- Close HelloWorld, then close ProfilerLauncher.
- A file called "ICorProfilerCallback Log.log" should have been written in that directory. This file can be opened in Notepad.
- Load the DotNetProfiler.sln solution in Visual Studio 2005. The project links with the corguids.lib library, which should be in your C:\Program Files\Microsoft Visual Studio 8\SDK\v2.0\Lib folder.
- Build the solution.
- Load the ProfilerTest.sln solution in Visual Studio 2005.
- Build the solution. Building it isn't dependent on DotNetProfiler.dll.
- Copy DotNetProfiler.dll, ProfilerLauncher.exe and HelloWorld.exe into their own directory.
- Follow the steps from Running the Pre-built Sample (including registering the DLL).
- First and foremost... you will be astounded at the amount of calls being made just to execute HelloWorld.exe. If you don't want to run the sample, you can download the sample output of profiling HelloWorld.exe. Now, consider the complexity of your own applications versus the complexity of HelloWorld.exe. Amazing...
- This profiler implementation is purely for show. Because it writes continuously to a file, it's horribly slow. It's not meant to be used in production... it's simply a model for building your own profiler. There are many modifications that could be made to make it better.
- I have still not found any way to debug a custom profiler. All debugging I've done has been through output file messages. If you can find a way to debug one of these, I'd love to know how.
- Because the profiler is attached to the executing process, it is extremely difficult to profile ASP.NET applications and services using this technique.
There are some good references out there about using ICorProfilerCallback
and ICorProfilerCallback2
. Here are a few:
- Profiling.doc: This document is the most comprehensive collection of information about
ICorProfilerCallback
available. It comes as part of the .NET 1.1 SDK, but is sadly missing from the .NET 2.0 SDK, so I've posted a link to it. - MSDN ICorProfilerCallback Reference and MSDN ICorProfilerCallback2 Reference
- There is an excellent example of implementing
ICorProfilerCallback
in the C:\Program Files\Microsoft Visual Studio .NET\FrameworkSDK\Tool Developers Guide\Samples\profiler\hst_profiler folder. HST stands for Hot Spot Tracker and this is a much more full featured profiler than my sample. My sample is designed to demonstrate the basics, without going into the depth that hst_profiler does. This article does a very good job of covering some of the details of that profiler. - Another excellent example of implementing
ICorProfilerCallback
is in the C:\Program Files\Microsoft Visual Studio .NET\FrameworkSDK\Tool Developers Guide\Samples\profiler\gcp_profiler folder. GCP stands for General Code Profiler and this is a much more full featured profiler than my sample. This article does a very good job of covering some of the details of that profiler. - This article does a very good job describing all of the different notifications you can subscribe to using
ICorProfilerCallback
. It also comes with it's own implementation of a profiler.
- Aug 30, 2006: Initial release
- Sep 4, 2006: Added a link to Profiling.doc under the Other References section.