
Compressed DataTable serialization using GZip

17 Sep 2008
Using FastDataTable, you can serialize your data quickly and with good compression.

Introduction

While working on our project, a three-tier business solution with lots of data flowing between servers and clients, we (our development division) faced the problem of compressing the data transmitted from the server to the client application. The main difficulty is the serialization and deserialization process, because it is hidden deep in the core of ADO.NET and is very difficult to change from your own code. For some time we used the FastDataSet implementation, in which the serialization process was completely rewritten. For a long time it was the best solution to our problem, but then we noticed that in some scenarios, when passing data over the local network to the report generator, the data became corrupted. The only way forward was to understand how to make this serialization simple, fast and harmless to our data. It seems that we have managed it, so I want to share this piece of knowledge with you.

Background

The main point of this code is the use of internal DataTable methods such as DeserializeTableSchema, DeserializeTableData and ResetIndexes. These three methods are used by the DataTable class during deserialization. We use them too, but just before calling them we need to decompress the data held in the SerializationInfo object. When serializing, we first use the default mechanism and then compress the result; that's all. Due to some limits...
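Calling a non-public method through reflection, which is the technique this article relies on, can be shown in isolation. The sketch below uses a hypothetical Engine class and a hypothetical private method Rev purely for illustration; neither is part of ADO.NET or of FastDataTable.

```csharp
using System;
using System.Reflection;

// Hypothetical class used only to demonstrate the technique.
class Engine
{
    // Non-public member that callers outside the class cannot invoke directly.
    private int Rev(int rpm) { return rpm * 2; }
}

static class ReflectionDemo
{
    public static int CallHidden(Engine target, int rpm)
    {
        // NonPublic | Instance is the same flag combination the article uses
        // to look up the DataTable methods.
        MethodInfo mi = typeof(Engine).GetMethod(
            "Rev", BindingFlags.NonPublic | BindingFlags.Instance);
        if (mi == null)
            throw new MissingMethodException("Engine", "Rev");

        // Arguments are passed as an object array; the result comes back boxed.
        return (int)mi.Invoke(target, new object[] { rpm });
    }
}
```

The null check matters in practice: non-public members are not part of any documented contract, so a lookup by name can start returning null after a framework update.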

Using the Code

The main parts of the code are as follows. The GetObjectData method is the one the formatter calls to serialize object data into a SerializationInfo object. This object simply holds all the fields of the original, together with their types and names. We use a trick here: we let the base class write the real data into a temporary SerializationInfo, reach into its private fields using reflection, compress them, and then place the result into the real serialization info.

public override void GetObjectData(SerializationInfo info, StreamingContext context) {
    SerializationInfo zipInfo = new SerializationInfo(typeof(FastDataTable),
        new FormatterConverter());
    base.GetObjectData(zipInfo, context);

    FieldInfo fiData = typeof(SerializationInfo).GetField("m_data",
        BindingFlags.NonPublic | BindingFlags.Instance);
    FieldInfo fiMembers = typeof(SerializationInfo).GetField("m_members",
        BindingFlags.NonPublic | BindingFlags.Instance);
    FieldInfo fiTypes = typeof(SerializationInfo).GetField("m_types",
        BindingFlags.NonPublic | BindingFlags.Instance);
    object[] data = (object[])fiData.GetValue(zipInfo);
    string[] members = (string[])fiMembers.GetValue(zipInfo);
    Type[] types = (Type[])fiTypes.GetValue(zipInfo);

    IFormatter formatter = new BinaryFormatter();
    using(MemoryStream stream = new MemoryStream()) {
        formatter.Serialize(stream, data);
        formatter.Serialize(stream, members);
        formatter.Serialize(stream, types);
        formatter.Serialize(stream, zipInfo.MemberCount);

        using(MemoryStream streamZip = new MemoryStream()) {
            stream.Position = 0;
            // Compress only when enabled and the payload exceeds the threshold.
            bool compress = useCompression && stream.Length > compressThreshold;
            byte[] arr;
            if(compress) {
                Compress(stream, streamZip);
                arr = streamZip.ToArray();
            } else {
                arr = stream.ToArray();
            }

            // "chunk" carries the payload; "compressed" tells the reader whether to unzip it.
            info.AddValue("chunk", arr);
            info.AddValue("compressed", compress);
        }
    }
}
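The Compress and Decompress helpers called by this code are not listed in the article. A minimal sketch of what they might look like, assuming they are thin wrappers around System.IO.Compression.GZipStream, is given here; the GZipHelper class name and the CopyTo helper are hypothetical.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical stand-ins for the article's Compress/Decompress methods.
static class GZipHelper
{
    public static void Compress(Stream source, Stream destination)
    {
        // leaveOpen = true keeps 'destination' usable afterwards. Disposing the
        // GZipStream is what flushes the final compressed block, so it must
        // happen before the caller reads the destination.
        using (GZipStream zip = new GZipStream(destination, CompressionMode.Compress, true))
        {
            CopyTo(source, zip);
        }
    }

    public static void Decompress(Stream source, Stream destination)
    {
        using (GZipStream unzip = new GZipStream(source, CompressionMode.Decompress, true))
        {
            CopyTo(unzip, destination);
        }
    }

    // Manual copy loop, so the sketch does not depend on Stream.CopyTo.
    private static void CopyTo(Stream from, Stream to)
    {
        byte[] buffer = new byte[8192];
        int read;
        while ((read = from.Read(buffer, 0, buffer.Length)) > 0)
        {
            to.Write(buffer, 0, read);
        }
    }
}
```

Both methods read from the current position of the source stream, which matches the way the calling code resets stream.Position to 0 before compressing.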

The second important method is the deserialization constructor. There we decompress our original serialization info, put the real data back into the info object and then, again using reflection, call the methods that the DataTable class uses to deserialize tables.

protected FastDataTable(SerializationInfo info, StreamingContext context) {
    MethodInfo miDeTS = typeof(FastDataTable).GetMethod(
        "DeserializeTableSchema", 
        BindingFlags.NonPublic | BindingFlags.Instance);
    MethodInfo miDeTD = typeof(FastDataTable).GetMethod("DeserializeTableData",
        BindingFlags.NonPublic | BindingFlags.Instance);
    MethodInfo miResIn = typeof(FastDataTable).GetMethod("ResetIndexes",
        BindingFlags.NonPublic | BindingFlags.Instance);

    using(MemoryStream stream = new MemoryStream()) {
        byte[] bytes = (byte[])info.GetValue("chunk", typeof(byte[]));
        useCompression = (bool)info.GetValue("compressed", typeof(bool));

        compressedSize = bytes.Length;

        if(useCompression) {
            using(MemoryStream streamUnzip = new MemoryStream(bytes)) {
                Decompress(streamUnzip, stream);
            }
        } else {
            stream.Write(bytes, 0, bytes.Length);
        }

        stream.Position = 0;
        originalSize = (int)stream.Length;
        IFormatter formatter = new BinaryFormatter();

        FieldInfo fiData = typeof(SerializationInfo).GetField(
            "m_data", BindingFlags.NonPublic | BindingFlags.Instance);
        FieldInfo fiMembers = typeof(SerializationInfo).GetField(
            "m_members", BindingFlags.NonPublic | BindingFlags.Instance);
        FieldInfo fiTypes = typeof(SerializationInfo).GetField("m_types",
            BindingFlags.NonPublic | BindingFlags.Instance);
        FieldInfo fiCurrMember = typeof(SerializationInfo).GetField(
            "m_currMember", BindingFlags.NonPublic | BindingFlags.Instance);

        object[] data = (object[])formatter.Deserialize(stream);
        string[] members = (string[])formatter.Deserialize(stream);
        Type[] types = (Type[])formatter.Deserialize(stream);
        int curMember = (int)formatter.Deserialize(stream);

        fiData.SetValue(info, data);
        fiMembers.SetValue(info, members);
        fiTypes.SetValue(info, types);
        fiCurrMember.SetValue(info, curMember);
    }


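    // Let DataTable's own internal routines rebuild the schema, rows and
    // indexes from the restored SerializationInfo.
    miDeTS.Invoke(this, new object[] { info, context, true });
    miDeTD.Invoke(this, new object[] { info, context, 0 });
    miResIn.Invoke(this, new object[] { });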
}

Those are the main features we used. Also, don't forget to set RemotingFormat to SerializationFormat.Binary.
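Setting the format is a one-line change on the table instance. The sketch below uses a plain DataTable so that it stays self-contained; a class derived from DataTable, such as FastDataTable, inherits the same property. The BinaryTableFactory class, the table name and the column are hypothetical.

```csharp
using System;
using System.Data;

static class BinaryTableFactory
{
    // Hypothetical helper: builds a small table and switches it to binary
    // remoting format before it is handed to a formatter.
    public static DataTable Create()
    {
        DataTable table = new DataTable("Orders");
        table.Columns.Add("Id", typeof(int));
        table.Rows.Add(1);

        // The default is SerializationFormat.Xml, where the table is written
        // as XML text; Binary asks DataTable for a binary representation.
        table.RemotingFormat = SerializationFormat.Binary;
        return table;
    }
}
```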

Points of Interest

The biggest problem we faced while writing this code is the fact that in C# you can't call the base constructor at an arbitrary point in your constructor, only at the beginning. Because of this we were forced to use all this reflection. .NET Reflector was a great help in understanding the structure of the framework code; it's a very cool tool!
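The constructor-ordering rule can be seen in a small example. The Base and Derived classes below are hypothetical and unrelated to DataTable. The sketch also shows a general C# idiom, a static helper evaluated inside the base initializer, which runs before the base constructor does. This is not what the article's code does, and whether it could replace the reflection approach in FastDataTable is untested.

```csharp
using System;

// Hypothetical base class standing in for a type whose constructor we cannot change.
class Base
{
    public readonly string Payload;
    protected Base(string payload) { Payload = payload; }
}

class Derived : Base
{
    // The base call must sit in the initializer; it cannot be moved into the body.
    // A static method may still transform the argument before Base receives it.
    public Derived(string raw) : base(Prepare(raw)) { }

    // Must be static: instance members are not available until Base has run.
    private static string Prepare(string raw)
    {
        return raw.Trim().ToUpperInvariant();
    }
}
```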

History

Nothing yet :)

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
