MethodLogger - Hook into method calls in .NET binaries

S. Senthil Kumar

4.78/5 (24 votes)

20 Nov 20066 min read

2.2K

Modifies .NET MSIL binaries to call configured methods at the beginning and end of methods in the binaries.

Introduction

I'm sure all of us here would agree that there's no such thing as too much logging - the more the better. However, peppering source code with log statements clutters code, and the repetitive nature of the work results in a lot of copy/paste errors. This article presents a simple tool that works on .NET assemblies to execute arbitrary code at the start and end of each method in the assembly. The most obvious use of such a tool is to modify the binaries to log the starting and ending of execution of the methods, but any code, including control flow altering code, could be run. Essentially, it allows you to hook into the execution of methods in an assembly.

Background

Having dealt with a series of deadlock problems recently, I wished the log files had information about the start and end of execution of each method. However, it was not practical to modify the source code to add such logs, the sheer size of the codebase was deterrent enough. That's when I thought it would be nice if we had some utility that does that after code compilation. I did not find any, so I decided to write one.

I looked around for libraries to read IL code. AbsIL was the first toolkit I came across. Unfortunately, it's written in F#, a research language based on OCaml, and while it's still .NET code and therefore usable from any .NET language, the API felt very awkward when used from C#. I then found PERWAPI, which I found to be pretty good.

Using the tool

MethodLogger is a commandline app. Before delving into the full syntax, let's see an example of how it could be used. Let's assume you have an assembly, TestApp, whose methods need to be modified. To be used with MethodLogger, you need to have a class that exposes two public static methods, which contain the code to be executed at the start and end of the modified methods.

public class MethodExecutionLogger
{
    public static void MethodStarted(string typeName, 
                       string methodName, string args)
    {
        Console.WriteLine("Start of " + typeName + "." + 
                          methodName + "(" + args + ")");
    }

    public static void MethodCompleted(string typeName, 
                       string methodName, string args)
    {
        Console.WriteLine("End of " + typeName + "." + 
                          methodName + "(" + args + ")");
    }
}

MethodLogger can then be used as follows:

MethodLogger -input TestApp TestApp MethodExecutionLogger

The initial TestApp after -input is the assembly that is to be modified. The following TestApp and MethodExecutionLogger are the assembly and class containing the two methods(MethodStarted and MethodCompleted) shown above. After running the tool, all methods in the TestApp assembly would be modified to call MethodExecutionLogger.MethodStarted and MethodexecutionLogger.MethodCompleted, passing the name of the type they belong to, their name, and arguments as parameters. It's also possible to restrict the list of classes that need to be modified. The -include switch is used to indicate that the specified classes are to be included, the -exclude switch is used to exclude classes. Both the switches take a list of classes or namespaces (indicated by a trailing .*). For example:

MethodLogger -include TestNameSpace.TestClass exclude 
             TestNameSpace.* -input TestApp 
             TestApp MethodExecutionLogger

This excludes all classes in TestNameSpace, except for TestClass. The full syntax specification is as follows:

MethodLogger -include <namespace/list of classes> 
             -exclude <namespace/list of classes>l 
             -input <dir/list of files> <assemblyname> 

<classname>

-include: a list of classes/namespaces that should be included.
-exclude: a list of classes/namespaces that should be excluded.
-input: the directory name/file name to be modified, followed by the assembly name and class name containing the methods to be called.

How it works

MethodLogger uses the PERWAPI internally to read IL, metadata, and modify code in methods. At the C# level, the transformation looks like this:-

// Before transformation
private static void Main(string[] args)
{
      Application.EnableVisualStyles();
      Application.SetCompatibleTextRenderingDefault(false);
      Application.Run(new Form1());
}

// After transformation
private static void Main(string[] args)
{
      MethodExecutionLogger.MethodStarted("TestApp.Program", 
                           "Main", "System.String[] args");
      try
      {
            Application.EnableVisualStyles();
            Application.SetCompatibleTextRenderingDefault(false);
            Application.Run(new Form1());
      }
      finally
      {
            MethodExecutionLogger.MethodCompleted("TestApp.Program", 
                                 "Main", "System.String[] args");
      }
}

With the above implementation of MethodExecutionLogger, this would result in "Start of TestApp.Program.Main(System.String[] args)" being written to the console when Main starts executing and "End of TestApp.Program.Main(System.String[] args)" when it ends.

The code in itself is pretty easy to understand. The PERWAPI provides a ReadFile method to load assemblies. Calling GetClasses on the resulting PEFile instance will return the list of classes (ClassDefs) defined in the assembly. GetMethods on each ClassDef returns the list of methods in that class. The most interesting part is the actual patching of the IL code to emit the method calls.

static void ModifyCode(ClassDef classDef, MethodDef methodDef, 
            Method startLogMethod, Method endLogMethod)
{
    string classNameString = MethodLoggerUtil.GetQualifiedClassName(classDef);
    string methodNameString = methodDef.Name();
    string paramsString = 
           MethodLoggerUtil.GetParamsAsString(methodDef.GetParams());

    Param[] parms = methodDef.GetParams();

     // We'll be pushing typeName, methodName and parameters
     // as string parameters, so set max stack size to 3.
     if (methodDef.GetMaxStack() < 3)
     {
          methodDef.SetMaxStack(3);
     }

     CILInstructions instructions = methodDef.GetCodeBuffer();
     instructions.StartInsert();
     instructions.ldstr(classNameString);
     instructions.ldstr(methodNameString);
     instructions.ldstr(paramsString);
     instructions.MethInst(MethodOp.call, startLogMethod);
     // Emit call to MethodStarted

     instructions.StartBlock();
            
     instructions.EndInsert();


     while (instructions.GetNextInstruction().GetPos() < 
            instructions.NumInstructions() - 2) ;
     // Go to last but one instruction

     instructions.StartInsert();

     TryBlock tryBlock = instructions.EndTryBlock();
     instructions.StartBlock();
     instructions.ldstr(classNameString);
     instructions.ldstr(methodNameString);
     instructions.ldstr(paramsString);
     instructions.MethInst(MethodOp.call, endLogMethod);
     // Emit call to MethodCompleted

     instructions.EndFinallyBlock(tryBlock);

     instructions.EndInsert();
}

As you can see, the PERWAPI makes it incredibly simple and intuitive to do that. We first set the max stack to 3 (if it's less than 3), as we'll be pushing three arguments to the stack, corresponding to the type name, method name, and argument list. We then get the code buffer associated with the method, which is just a list of IL instructions. We move to the beginning of the list and emit the necessary IL code to push the three string variables onto the stack using the ldstr method, followed by the MethInst method to emit the call instructions to our designated methods.

We create a try block by using the StartBlock method call to indicate the beginning of the block, followed by a EndTryBlock method call to indicate the "closing brace". Again, for the finally block, we use the StartBlock, EndFinallyBlock methods.

Getting a reference to the methods that we need to call at the start and end is a bit tricky. MethodLogger allows those methods to be in:

the assembly to be modified
one of the assemblies referenced by the assembly to be modified
an assembly not referenced by the assembly to be modified

If the class is not in the assembly to be modified, then that assembly's manifest and metadata need to be modified to add the AssemblyRef, TypeRef, and MemberRef for the class and its methods. For example, if MethodExecutionLogger is in a different assembly, say SomeAssembly, then

SomeAssembly needs to be added to the AssemblyRef table
MethodExecutionLogger needs to be added to the TypeRef table
MethodStarted and MethodExecuted need to be added to the MemberRef tables

If SomeAssembly is already referenced by TestApp, then the AssemblyRef table would already contain an entry for it, so we need to update only the TypeRef and MethodRef tables.

The LocateLoggerMethods method in the MethodLoggerUtil class executes the above logic. It first checks if that class is available in the current assembly itself.

// Check if it is in this assembly itself
if (file.GetThisAssembly().Name() == assemblyName)
{
    ClassDef methodLogger = file.GetClass(className);
    if (methodLogger != null)
    {
        return GetLoggerMethodsFromClass(methodLogger, 
               out startLogMethod, out endLogMethod);
    }
}

If not, it loops through all referenced assemblies, looking for the specified class in the specified assembly.

// Check referenced assemblies
foreach (AssemblyRef assemblyRef in file.GetImportedAssemblies())
{
    if (assemblyRef.Name() == assemblyName)
    {
        ClassRef methodLoggerRef = 
          TryGetMethodLoggerFromAssembly(assemblyRef, className);
        if (methodLoggerRef != null)
        {
             if (GetLoggerMethodsFromClass(methodLoggerRef, 
                      out startLogMethod, out endLogMethod))
                 return true;
        }
    }
}

The TryGetMethodLoggerFromAssembly method checks if the methods in that class are already in the MethodRef table for this assembly, and if not, adds it using the AddMethod method on the AssemblyRef class. In order to do this, we have to actually load the referenced assembly using the same ReadFile method, get the ClassDef for the class, and then use the MakeRefOf method to convert it to a ClassRef (methodLoggerRef).

PEFile refFile = PEFile.ReadPEFile(fileName);
ClassDef methodLogger = refFile.GetClass(className);
// convert def to ref
ClassRef methodLoggerRef = methodLogger.MakeRefOf();
            
System.Array.ForEach(methodLogger.GetMethods(), 
                     delegate(MethodDef methodDef)
{
    if (methodLoggerRef.GetMethod(methodDef.Name()) == null)
    {
        methodLoggerRef.AddMethod(methodDef.Name(), 
                                  methodDef.GetRetType(), 
                                  methodDef.GetParTypes());
    }
});

If not available in one of the referenced assemblies, MethodLogger then adds the specified assembly in the referenced assemblies list using the MakeExternAssembly method. It then adds the necessary class and method references using the same TryGetMethodLoggerFromAssembly method.

Caveats

PERWAPI doesn't retain the .rsrc section when it writes out the PE file, so any data in that section would be missing from the output of MethodLogger. Resources added via VS.NET 2005 in a WinForms app were preserved though. Also, it does not retain debug information, which I must admit, reduces the scope of MethodLogger quite a bit. It performs some minor optimization on the IL code also, which means the IL won't be identical to the original version. PERWAPI also doesn't seem to respect the subsystem type (console/windows etc..) Apart from the above, any strong naming on the assembly should be done on MethodLogger's output, as it obviously changes the assembly's contents (and therefore its hashcode).

Also, there would obviously be some performance impact because of the additional code running at the beginning and end of all methods.

History

Initial version - 12:21 AM 11/13/2006.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here