A High Performance Binary Serializer using Microsoft Common Intermediate Language

1 Feb 2011

Basic Concepts

  • Serialization: A mechanism to transform the state of an object into a persistable format.
  • Deserialization: Restore the state of an object from a persistable format.
  • Binary serialization: Serialization technique to transform the state of an object into a binary stream.

Problem

Microsoft .NET provides a binary serializer in the System.Runtime.Serialization.Formatters.Binary namespace. Here is a simple example of how that works:

using System;
using System.Text;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;
namespace ConsoleApplication1
{
    [Serializable]
    class TestClass
    {
        public String Name;
    }
    class Program
    {
        static void Main(string[] args)
        {
            TestClass oT = new TestClass();
            oT.Name = "Hello Bin Serializer";
            MemoryStream ms = new MemoryStream();
            BinaryFormatter bf = new BinaryFormatter();
            bf.Serialize(ms, oT);
        }
    }
}

The binary stream generated by this formatter is 173 bytes long, and when converted to characters looks like:

"\0/\0\0\0????/\0\0\0\0\0\0\0\f<\0\0\0JConsoleApplication1, 
Version=1.0.0.0, Culture=neutral, 
PublicKeyToken=null/\0\0\0ConsoleApplication1.TestClass/\0\0\
0Name/<\0\0\0>\0\0\0Hello Bin Serializer\v"

That is 173 bytes just to serialize a simple class with a single string field!

That may not be suitable for high performance applications with a low memory budget.

Solution

The good news is that you can write a custom surrogate class that can serialize and deserialize your class into/from a binary stream.

The bad news is that you need to write this surrogate for every class you need to serialize.
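
To make the trade-off concrete, here is a minimal sketch of what a hand-written surrogate for TestClass might look like. The TestClassSurrogate name and its static methods are assumptions made for illustration, not part of the reference implementation. It relies only on BinaryWriter.Write(string) and BinaryReader.ReadString, and it does not handle a null Name.

```csharp
using System.IO;

public class TestClass
{
    public string Name;
}

// Hypothetical hand-written surrogate: it knows the layout of TestClass,
// so it writes only the field data and no type metadata.
public static class TestClassSurrogate
{
    public static void Serialize(BinaryWriter writer, TestClass value)
    {
        // BinaryWriter writes a length prefix followed by the UTF-8 bytes.
        writer.Write(value.Name);
    }

    public static TestClass Deserialize(BinaryReader reader)
    {
        // Fields must be read back in the order they were written.
        return new TestClass { Name = reader.ReadString() };
    }
}
```

Every new field means another line in both methods, and every new class means another surrogate, which is exactly the maintenance cost described above.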

And this is where .NET IL and the System.Reflection.Emit.ILGenerator class come to the rescue. Using this class and a good working knowledge of MSIL, you can generate serialization surrogates automatically at runtime. Here are the basic steps:

  1. Define a custom attribute that you can use to tag the class you want to generate serialization surrogates for.
  2. During the startup of your assembly, walk through all the types that have this custom attribute and generate a serialization surrogate for them.
  3. Create a class with an interface similar to the binary formatter that internally delegates the call to the IL serialization surrogate.
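
Steps 1 and 2 can be sketched as follows. This is an illustration under assumptions: the attribute definition, the two sample classes, and the SurrogateDiscovery helper are hypothetical stand-ins, and the reference implementation may organise this differently. Only standard reflection calls (Assembly.GetTypes and Type.IsDefined) are used.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Step 1: a marker attribute used to tag serializable classes (assumed definition).
[AttributeUsage(AttributeTargets.Class, Inherited = false)]
public sealed class HiPerfSerializableAttribute : Attribute { }

[HiPerfSerializable]
public class TaggedSample { public string Name; }   // hypothetical tagged type

public class UntaggedSample { public string Name; } // hypothetical untagged type

public static class SurrogateDiscovery
{
    // Step 2: walk the types of an assembly and keep those carrying the attribute.
    public static List<Type> FindTaggedTypes(Assembly assembly)
    {
        var tagged = new List<Type>();
        foreach (Type t in assembly.GetTypes())
        {
            if (t.IsDefined(typeof(HiPerfSerializableAttribute), false))
                tagged.Add(t);
        }
        return tagged;
    }
}
```

At startup, a call such as SurrogateDiscovery.FindTaggedTypes(Assembly.GetExecutingAssembly()) would give the list of types for which a surrogate needs to be generated. The formatter class from step 3 can then look up the surrogate by type and delegate to it.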

This sounds pretty easy, but unfortunately the hardest part is coding the serialization surrogate in IL. Don’t lose heart yet, for I will show you how to write one and also give you a reference implementation for free. It is a good deal, right?

OK then, let's get started. First, let me give a quick tutorial on IL.

Quick IL Tutorial

Write a simple hello world app in C#:

class HelloIL
{
 public static void Main()
 {
  System.Console.WriteLine("Hello IL");
 }
}

Compile it and then open the application in ILDasm (in Visual Studio, go to Tools/ILDasm). Double-click the Main node to see the IL:

[Screenshot: the HelloIL application opened in ILDasm]

This is what the IL looks like:

.method public hidebysig static void Main() cil managed
{
.entrypoint
// Code size 13 (0xd)
.maxstack 8
IL_0000: nop
IL_0001: ldstr "Hello IL"
IL_0006: call void [mscorlib]System.Console::WriteLine(string)
IL_000b: nop
IL_000c: ret
} // end of method HelloIL::Main

Here is where the fun begins. Using classes in the System.Reflection.Emit namespace, such as AssemblyBuilder, ModuleBuilder, TypeBuilder, MethodBuilder, and ILGenerator, you can generate IL at runtime in any .NET app.
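
As a small illustration, the three meaningful instructions of the listing above (ldstr, call, ret) can be emitted at runtime. This sketch uses DynamicMethod, which is a lighter-weight alternative to the AssemblyBuilder route used in the rest of this article. The HelloEmit class and BuildHello method are names chosen for this example.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class HelloEmit
{
    // Builds, at run time, a method whose body is:
    //   ldstr "Hello IL" / call Console.WriteLine(string) / ret
    public static Action BuildHello()
    {
        var method = new DynamicMethod("Hello", typeof(void), Type.EmptyTypes);
        ILGenerator il = method.GetILGenerator();
        MethodInfo writeLine =
            typeof(Console).GetMethod("WriteLine", new[] { typeof(string) });

        il.Emit(OpCodes.Ldstr, "Hello IL"); // push the string literal
        il.Emit(OpCodes.Call, writeLine);   // static call, consumes the string
        il.Emit(OpCodes.Ret);               // every method body must end with ret

        return (Action)method.CreateDelegate(typeof(Action));
    }
}
```

Calling HelloEmit.BuildHello()() runs the generated method. The nop instructions from the ILDasm listing are left out because they only exist to support debugging.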

Now let's come back to creating our serialization surrogates with this namespace. Here are the steps.

Steps to Write an IL Binary Serializer

  1. Define an interface that our dynamic serialization surrogate will implement:
    public interface IHiPerfSerializationSurrogate
    {
     void Serialize(BinaryWriter writer, object graph);
     object DeSerialize(BinaryReader reader);
    }
  2. Using the AssemblyBuilder class, create a dynamic assembly within the current app domain:
    AssemblyBuilder myAsmBuilder = Thread.GetDomain().DefineDynamicAssembly(
             new AssemblyName("SomeName"),
             AssemblyBuilderAccess.Run);
  3. Within this assembly, now define a module:
    ModuleBuilder surrogateModule = 
    	myAsmBuilder.DefineDynamicModule("SurrogateModule");
  4. Within the module, now define your custom serialization surrogate:
    TypeBuilder surrogateTypeBuilder = surrogateModule.DefineType( 
                                "MyClass_EventSurrogate", TypeAttributes.Public);
  5. Make this type an implementation of IHiPerfSerializationSurrogate:
    surrogateTypeBuilder.AddInterfaceImplementation
    		(typeof(IHiPerfSerializationSurrogate));
  6. Now define the Serialize method within the surrogate:
    Type[] dpParams = new Type[] { typeof(BinaryWriter), typeof(object) };
    MethodBuilder serializeMethod = surrogateTypeBuilder.DefineMethod(
               "Serialize",
               MethodAttributes.Public | MethodAttributes.Virtual,
               typeof(void),dpParams);
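  7. Then, for each public property (pi), emit IL that calls the property getter and writes the value:
    ILGenerator serializeIL = serializeMethod.GetILGenerator();
    // tpmEvent is a local variable holding the object being serialized;
    // GetBinaryWriterMethod (not shown) returns the BinaryWriter.Write overload to call
    MethodInfo mi = EventType.GetMethod("get_" + pi.Name);
    MethodInfo brWrite = GetBinaryWriterMethod(pi.PropertyType);
    serializeIL.Emit(OpCodes.Ldarg_1); // push the BinaryWriter
    serializeIL.Emit(OpCodes.Ldloc, tpmEvent); // push the object being serialized
    serializeIL.EmitCall(OpCodes.Callvirt, mi, null); // call the getter, leaving the value on the stack
    serializeIL.EmitCall(OpCodes.Callvirt, brWrite, null); // write the value
    // after the last property, emit OpCodes.Ret to end the method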
  8. Define the DeSerialize method within the surrogate:
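    // DeSerialize takes a single BinaryReader, as declared in the interface
    Type[] deParams = new Type[] { typeof(BinaryReader) };
    MethodBuilder deserializeMthd = surrogateTypeBuilder.DefineMethod(
                        "DeSerialize",
                        MethodAttributes.Public | MethodAttributes.Virtual |
                        MethodAttributes.HideBySig | MethodAttributes.Final |
                        MethodAttributes.NewSlot,
                        typeof(object),
                        deParams);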
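  9. Then, for each property, emit IL that reads a value from the stream and passes it to the property setter:
    ILGenerator deserializeIL = deserializeMthd.GetILGenerator();
    // tpmRetEvent is a local variable holding the newly created object;
    // brRead is the BinaryReader read method for the property type (lookup not shown)
    MethodInfo setProp = EventType.GetMethod("set_" + pi.Name);
    deserializeIL.Emit(OpCodes.Ldloc, tpmRetEvent); // push the new object
    deserializeIL.Emit(OpCodes.Ldarg_1); // push the BinaryReader
    deserializeIL.EmitCall(OpCodes.Callvirt, brRead, null); // read the value from the stream
    deserializeIL.EmitCall(OpCodes.Callvirt, setProp, null); // call the setter with that value
    // after the last property, push tpmRetEvent and emit OpCodes.Ret to return it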
  10. Create the serialization surrogate type:
    Type HiPerfSurrogate = surrogateTypeBuilder.CreateType();
  11. Now that we have a high performance serialization surrogate, it is time to use it. Here is how:
    IHiPerfSerializationSurrogate surrogate =
        (IHiPerfSerializationSurrogate)Activator.CreateInstance(HiPerfSurrogate);
    BinaryWriter binaryWriter = new BinaryWriter(serializationStream);
    binaryWriter.Write(eventType.FullName); // write the type name ahead of the object data
    surrogate.Serialize(binaryWriter, obj);
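
The steps above can be put together into one compact, runnable sketch. It is an illustration under stated assumptions, not the reference implementation: the Person class and SurrogateFactory are hypothetical, and the static AssemblyBuilder.DefineDynamicAssembly method is used instead of Thread.GetDomain() so that the code also runs on newer .NET runtimes. It supports only property types for which BinaryWriter has a Write overload and BinaryReader has a matching ReadXxx method (for example string and int), and it does no null handling.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Reflection.Emit;

public interface IHiPerfSerializationSurrogate
{
    void Serialize(BinaryWriter writer, object graph);
    object DeSerialize(BinaryReader reader);
}

public class Person // hypothetical sample type
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class SurrogateFactory // hypothetical helper
{
    public static IHiPerfSerializationSurrogate Create(Type target)
    {
        Type iface = typeof(IHiPerfSerializationSurrogate);
        PropertyInfo[] props = target.GetProperties(); // same order for both methods
        AssemblyBuilder asm = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("SurrogateAssembly"), AssemblyBuilderAccess.Run);
        ModuleBuilder module = asm.DefineDynamicModule("SurrogateModule");
        TypeBuilder tb = module.DefineType(target.Name + "_Surrogate", TypeAttributes.Public);
        tb.AddInterfaceImplementation(iface);

        // void Serialize(BinaryWriter writer, object graph)
        MethodBuilder ser = tb.DefineMethod("Serialize",
            MethodAttributes.Public | MethodAttributes.Virtual,
            typeof(void), new[] { typeof(BinaryWriter), typeof(object) });
        ILGenerator il = ser.GetILGenerator();
        LocalBuilder obj = il.DeclareLocal(target);
        il.Emit(OpCodes.Ldarg_2);              // graph (arg 0 is 'this')
        il.Emit(OpCodes.Castclass, target);
        il.Emit(OpCodes.Stloc, obj);
        foreach (PropertyInfo pi in props)
        {
            il.Emit(OpCodes.Ldarg_1);          // writer
            il.Emit(OpCodes.Ldloc, obj);
            il.Emit(OpCodes.Callvirt, pi.GetGetMethod());
            il.Emit(OpCodes.Callvirt,
                typeof(BinaryWriter).GetMethod("Write", new[] { pi.PropertyType }));
        }
        il.Emit(OpCodes.Ret);
        tb.DefineMethodOverride(ser, iface.GetMethod("Serialize"));

        // object DeSerialize(BinaryReader reader)
        MethodBuilder de = tb.DefineMethod("DeSerialize",
            MethodAttributes.Public | MethodAttributes.Virtual,
            typeof(object), new[] { typeof(BinaryReader) });
        il = de.GetILGenerator();
        LocalBuilder ret = il.DeclareLocal(target);
        il.Emit(OpCodes.Newobj, target.GetConstructor(Type.EmptyTypes));
        il.Emit(OpCodes.Stloc, ret);
        foreach (PropertyInfo pi in props)
        {
            il.Emit(OpCodes.Ldloc, ret);       // object to populate
            il.Emit(OpCodes.Ldarg_1);          // reader
            il.Emit(OpCodes.Callvirt, typeof(BinaryReader).GetMethod(
                "Read" + pi.PropertyType.Name, Type.EmptyTypes)); // e.g. ReadString, ReadInt32
            il.Emit(OpCodes.Callvirt, pi.GetSetMethod());
        }
        il.Emit(OpCodes.Ldloc, ret);
        il.Emit(OpCodes.Ret);
        tb.DefineMethodOverride(de, iface.GetMethod("DeSerialize"));

        return (IHiPerfSerializationSurrogate)Activator.CreateInstance(tb.CreateType());
    }
}
```

A caller would create the surrogate once, for example with SurrogateFactory.Create(typeof(Person)), and reuse it for every object of that type. Generating the type is relatively expensive, so a real formatter would presumably cache surrogates per type.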

Results

The following program times both serializers on the same object. The number of iterations is read from the first command-line argument:

using System;
using System.Text;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;
namespace ConsoleApplication1
{
    [Serializable]
    [ILSerialization.HiPerfSerializable]
    public class TestClass
    {
        public String Name;
    }
    class Program
    {
        static void Main(string[] args)
        {
            int len = int.Parse(args[0]);
            TestClass oT = new TestClass();
            oT.Name = "Hello Bin Serializer";
            // StartNew() returns a stopwatch that is already running
            System.Diagnostics.Stopwatch w = System.Diagnostics.Stopwatch.StartNew();
            for (int i = 0; i < len; i++)
            {
                MemoryStream ms = new MemoryStream();
                BinaryFormatter bf = new BinaryFormatter();
                bf.Serialize(ms, oT);
                ms.Close();
            }
            w.Stop();
            Console.WriteLine("Time elapsed .net binary serializer= "
                + w.ElapsedMilliseconds);

            //now let us see how our high performance serializer performs
            w = System.Diagnostics.Stopwatch.StartNew();
            for (int i = 0; i < len; i++)
            {
                MemoryStream ms = new MemoryStream();
                ILSerialization.Formatters.HiPerfBinaryFormatter hpSer = 
                    new ILSerialization.Formatters.HiPerfBinaryFormatter();
                hpSer.Serialize(ms, oT);
                ms.Close();
            }
            w.Stop();
            Console.WriteLine("Time elapsed IL hi perf serializer= "
                + w.ElapsedMilliseconds);
        }
    }
}

Serializing the TestClass defined in the problem section gives the following results:

  • Byte stream size: 51 bytes, about one third of the 173 bytes produced by the .NET binary serializer
  • Performance: about 5 times faster (for 1,000,000 runs, the .NET serializer took 6602 ms and our high performance serializer took 1261 ms)
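
The 51-byte figure is consistent with a stream that holds only two length-prefixed strings: the full type name followed by the value of Name. The sketch below checks that arithmetic with a plain BinaryWriter. It assumes this layout rather than inspecting the reference implementation, and the StreamSize helper is a name made up for this example.

```csharp
using System.IO;

public static class StreamSize
{
    // Returns the number of bytes BinaryWriter produces for a type name
    // followed by one string value (assumed layout, see the text above).
    public static long Measure(string typeName, string value)
    {
        using (var ms = new MemoryStream())
        using (var writer = new BinaryWriter(ms))
        {
            writer.Write(typeName); // 1-byte length prefix + UTF-8 bytes for short strings
            writer.Write(value);
            writer.Flush();
            return ms.Length;
        }
    }
}
```

For "ConsoleApplication1.TestClass" (29 characters) and "Hello Bin Serializer" (20 characters), this gives (1 + 29) + (1 + 20) = 51 bytes.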

Reference Implementation

"HiPerf_IL_CustomSerializer" is a reference implementation of the high performance binary serializer. In the test above, it was about 5 times faster than the .NET binary serializer and produced a serialized stream about one third the size.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
