Abstract
Serialization in C# .NET plays a key role in various functions, such as remoting. Developers often need custom serialization to gain complete control over the serialization and deserialization processes. The .NET binary serialization format may not be suitable when output serialized on one platform must be deserialized on another. In the second article of this series, Anupam Banerji explains XML serialization and schemas, and provides an example of XML serialization.
Introduction
XML serialization is a method of storing object states in a portable, human-readable format. An XML serialized object can be deserialized on any platform, whereas an object serialized in the .NET binary format requires the .NET platform. XML serialization can also be standardized using an XML schema. This supports business processes that span multiple operating systems, platforms, or development parties.
XML serialization has a few disadvantages when compared to binary serialization. The serializer does not preserve the full object graph: object identity is lost and circular references are not supported. Another disadvantage is that only public fields and public read/write properties may be serialized. Object design therefore requires careful consideration of XML serialization. The advantages of XML serialization should be a decisive factor in the integration of XML serialization in object design and implementation.
(De)Serializing an Object
Serialization is performed by instantiating an XmlSerializer object for the type to be serialized. The serialized output is written to an underlying stream object, much as in binary serialization. Deserialize() returns a value of type object, so the result should be cast to the original data type.
The code below serializes a double:
using System.Xml.Serialization;
using System.IO;
XmlSerializer xs = new XmlSerializer(typeof(double));
FileStream fs = new FileStream(<file name>, FileMode.Create);
xs.Serialize(fs, 10.0);
fs.Close();
The file contains XML output with the object state in a node:
<?xml version="1.0"?>
<double>10</double>
To deserialize the object back into its original data type, reopen the file for reading and cast the result:
fs = new FileStream(<file name>, FileMode.Open);
double d = (double)xs.Deserialize(fs);
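The two snippets above can be combined into one round trip. The following is a minimal sketch, not a definitive implementation; the class name DoubleRoundTrip and the path parameter are assumptions made for illustration. It uses using blocks so that each stream is closed even if the serializer throws.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical helper, named for illustration only.
public static class DoubleRoundTrip
{
    // Serializes a double to the file at 'path' and reads it back.
    public static double RoundTrip(double value, string path)
    {
        XmlSerializer xs = new XmlSerializer(typeof(double));

        // The using block closes the stream even if Serialize() throws.
        using (FileStream output = new FileStream(path, FileMode.Create))
        {
            xs.Serialize(output, value);
        }

        // Reopen the file for reading. Deserialize() returns object, hence the cast.
        using (FileStream input = new FileStream(path, FileMode.Open))
        {
            return (double)xs.Deserialize(input);
        }
    }
}
```

A call such as DoubleRoundTrip.RoundTrip(10.0, Path.GetTempFileName()) should return the value that was passed in.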
One drawback of the XML serializer is that each call to Serialize() writes a single root object to the output, and an XML document can hold only one root. This is similar to the binary serializer. I don’t necessarily agree with Microsoft® on the implementation of the Serialize() method; unless the objects are wrapped in a collection, a large number of files is needed to serialize a large number of objects. The other issue is that an object cannot be deserialized unless the underlying stream contains exactly that object's XML, even if multiple object states were written to a single file.
A class may also be serialized. The class and its serialized members must be public. The class must also have a parameterless constructor, which may sit alongside overloaded constructors. Unlike binary serialization, the [Serializable] attribute does not have to be applied to the class. Private and protected members are ignored; public members inherited from a base class are serialized along with the class's own.
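One way to avoid a file per object is to place the objects in a collection and serialize the collection as the single root. The sketch below assumes a List<double> and writes to a string rather than a file; the class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical helper, named for illustration only.
public static class ListSerialization
{
    // Writes every value in the list into one XML document.
    public static string ToXml(List<double> values)
    {
        XmlSerializer xs = new XmlSerializer(typeof(List<double>));
        using (StringWriter sw = new StringWriter())
        {
            xs.Serialize(sw, values);
            return sw.ToString();
        }
    }

    // Reads the whole list back from the XML text.
    public static List<double> FromXml(string xml)
    {
        XmlSerializer xs = new XmlSerializer(typeof(List<double>));
        using (StringReader sr = new StringReader(xml))
        {
            return (List<double>)xs.Deserialize(sr);
        }
    }
}
```

The trade-off is that the whole collection is read and written as a unit; individual items cannot be deserialized on their own without parsing the document.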
(De)Serializing a DataSet
A DataSet object may also be (de)serialized. There are two ways to accomplish this.
The Developer can instantiate and populate a DataSet object and then pass it to an XmlSerializer object, as with any other object. The second way is to use the DataSet's own methods.
The code below shows how to serialize a DataSet instance using DataSet methods:
using System.Data;
FileStream fs = new FileStream(<file name>, FileMode.Create);
DataSet ds = new DataSet();
ds.WriteXml(fs);
fs.Close();
If I implement the XmlSerializer method, the serialized file looks like this:
<?xml version="1.0" encoding="utf-8"?>
<DataSet>
<xs:schema id="NewDataSet"
xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
...
<diffgr:diffgram
xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1" />
</DataSet>
If I implement the instanced DataSet methods, I get this output:
<NewDataSet />
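The second output file contains neither schema nor data: the DataSet in this example is empty, and WriteXml() omits the schema unless XmlWriteMode.WriteSchema is passed as a second argument. I would therefore recommend using the XmlSerializer object when the schema and change-tracking information must travel with the data.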
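For comparison, the DataSet methods do produce content once the DataSet holds rows, and WriteXml() accepts an XmlWriteMode argument that controls whether the schema is written. The sketch below is illustrative only; the table name, column names and helper class are made up for the example.

```csharp
using System.Data;
using System.IO;

// Hypothetical helper, named for illustration only.
public static class DataSetXml
{
    // Builds a small DataSet; the table and column names are invented.
    public static DataSet BuildSample()
    {
        DataSet ds = new DataSet("Orders");
        DataTable table = ds.Tables.Add("Order");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Amount", typeof(double));
        table.Rows.Add(1, 10.0);
        return ds;
    }

    // Writes the rows together with an inline schema.
    public static string Write(DataSet ds)
    {
        using (StringWriter sw = new StringWriter())
        {
            ds.WriteXml(sw, XmlWriteMode.WriteSchema);
            return sw.ToString();
        }
    }

    // Rebuilds a DataSet from the XML, using the inline schema for column types.
    public static DataSet Read(string xml)
    {
        DataSet ds = new DataSet();
        using (StringReader sr = new StringReader(xml))
        {
            ds.ReadXml(sr, XmlReadMode.ReadSchema);
        }
        return ds;
    }
}
```

Calling DataSetXml.Read(DataSetXml.Write(DataSetXml.BuildSample())) should yield a DataSet whose "Order" table again contains one row.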
Controlling the Serialization Process
Common standards may be required for processes implemented between different platforms in a large organization. An example is a *NIX executable that uses serialized output from a .NET executable to perform a task. The *NIX executable may require the XML output to conform to a specific format, known as a schema. We’ll look at XML schemas a little later.
If no custom control is implemented, the XML serializer outputs all public members of the serialized object as XML elements. Developers can control the XML output by marking fields with serialization attributes. For example, marking a field with the [XmlAttribute] attribute stores the field as an XML attribute rather than an element. Other important attributes include [XmlIgnore], which excludes the attributed public field; [XmlElement], which stores the field as an element (the default behavior); and [XmlArray], which stores a field as an array of complex objects, e.g. a field holding an array of a user-defined data type.
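The attributes described above can be sketched on a small class. The Invoice and InvoiceLine types, their members and the XML names are all hypothetical and exist only to show where each attribute goes.

```csharp
using System.IO;
using System.Xml.Serialization;

// Hypothetical type used only to illustrate the attributes.
public class Invoice
{
    [XmlAttribute("id")]        // stored as an attribute on <Invoice>
    public int Id;

    [XmlElement("total")]       // stored as a child element (the default), renamed
    public double Total;

    [XmlIgnore]                 // never written to the output
    public string InternalNote;

    [XmlArray("lines")]         // wrapper element for the array
    [XmlArrayItem("line")]      // element name for each item
    public InvoiceLine[] Lines;
}

public class InvoiceLine
{
    [XmlAttribute("sku")]
    public string Sku;

    public int Quantity;        // no attribute: written as <Quantity>
}

public static class AttributeDemo
{
    // Serializes an invoice to a string so the effect of the attributes can be inspected.
    public static string ToXml(Invoice invoice)
    {
        XmlSerializer xs = new XmlSerializer(typeof(Invoice));
        using (StringWriter sw = new StringWriter())
        {
            xs.Serialize(sw, invoice);
            return sw.ToString();
        }
    }
}
```

In the resulting XML, Id appears as an attribute of the root element, InternalNote is absent, and each InvoiceLine is nested inside a lines element.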
The Developer may also take control over the XML (de)serialization process by implementing the IXmlSerializable interface. The interface contains the GetSchema(), ReadXml() and WriteXml() methods. The Developer then reads and writes the XML output from scratch, and may implement an approved schema to ensure that serialization is performed to a standard specification.
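A minimal sketch of an IXmlSerializable implementation follows. The Temperature class and its kelvin attribute are assumptions made for the example, not part of the original article. GetSchema() returns null here, which is what Microsoft's documentation for the interface recommends.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

// Hypothetical type: stores private state that the default serializer would skip.
public class Temperature : IXmlSerializable
{
    private double kelvin;

    public Temperature() { }                       // required by XmlSerializer
    public Temperature(double kelvin) { this.kelvin = kelvin; }

    public double Kelvin { get { return kelvin; } }

    // Returning null is the documented recommendation for this method.
    public XmlSchema GetSchema() { return null; }

    public void WriteXml(XmlWriter writer)
    {
        // The serializer has already written the <Temperature> start tag.
        writer.WriteAttributeString("kelvin", XmlConvert.ToString(kelvin));
    }

    public void ReadXml(XmlReader reader)
    {
        // The reader is positioned on the <Temperature> start tag.
        kelvin = XmlConvert.ToDouble(reader.GetAttribute("kelvin"));
        // Consume the whole element so the reader ends up past it.
        reader.Skip();
    }
}

public static class TemperatureDemo
{
    // Serializes to a string and deserializes it again.
    public static Temperature RoundTrip(Temperature t)
    {
        XmlSerializer xs = new XmlSerializer(typeof(Temperature));
        using (StringWriter sw = new StringWriter())
        {
            xs.Serialize(sw, t);
            using (StringReader sr = new StringReader(sw.ToString()))
            {
                return (Temperature)xs.Deserialize(sr);
            }
        }
    }
}
```

Because WriteXml() and ReadXml() are written by hand, the private kelvin field survives the round trip even though it is not public.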
Using XML Schemas
XML schemas are essentially template files that describe a particular storage format. They allow a .NET application to (de)serialize XML files in a standard, agreed format. XML schemas may contain all format information, or may include a reference to a Document Type Definition (DTD) file. Creating XML Schema Definition (XSD) files is beyond the scope of this article.
Schema files are integrated into a Visual Studio® project using the XML Schema Definition Tool (Xsd.exe). Open a Visual Studio Command Prompt and type in:
xsd <file name> /classes /language:CS
Xsd.exe generates a C# class that conforms to the schema when (de)serialized. The class fields and properties should be designed in the XML schema prior to writing any methods. The diagram below shows the development sequence of a class that conforms to an XSD.
Figure 1: Class development flow diagram with XSD specification.
The Developer may use open source tools to create and modify XSDs. The GetSchema() method in the IXmlSerializable interface was intended to let the Developer create a schema dynamically, although Microsoft's documentation now recommends returning null from it and applying the [XmlSchemaProvider] attribute instead. Inheriting a base class with this schema implementation is best practice when multiple classes require the same schema implementation.
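Whether serialized output actually conforms to an XSD can be checked at run time with the validating XmlReader. The sketch below assumes the schema and the document are both available as strings; the SchemaCheck class is a hypothetical helper.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper, named for illustration only.
public static class SchemaCheck
{
    // Returns the validation messages; an empty list means the document conforms.
    public static List<string> Validate(string xml, string xsd)
    {
        List<string> problems = new List<string>();

        XmlSchemaSet schemas = new XmlSchemaSet();
        using (XmlReader schemaReader = XmlReader.Create(new StringReader(xsd)))
        {
            // Passing null uses the targetNamespace declared in the schema itself.
            schemas.Add(null, schemaReader);
        }

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas = schemas;
        // Also report warnings, e.g. a root element the schema does not declare.
        settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
        settings.ValidationEventHandler += (sender, e) => problems.Add(e.Message);

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            // Reading through the document triggers validation of each node.
            while (reader.Read()) { }
        }
        return problems;
    }
}
```

Collecting messages in a list rather than throwing on the first one lets the caller see every violation in a single pass.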
A Quick Example: XML Serializable Class
We write a non-conforming XML class that is (de)serialized by the XmlSerializer object:
public class XmlClass
{
    public double field
    {
        get;
        set;
    }
    private XmlClass()
    {
    }
    public XmlClass(double initialValue)
    {
        field = initialValue;
    }
}
XML serialization does not require attributes for serialization. All we need to implement is a parameterless constructor. Private members are not serialized, and would need to be separately stored through the implementation of the IXmlSerializable interface.
To serialize an instance of the class, we use a TextWriter object:
XmlSerializer xs = new XmlSerializer(typeof(XmlClass));
TextWriter tw = File.CreateText(<file name>);
XmlClass xc = new XmlClass(10);
xs.Serialize(tw, xc);
tw.Close();
We open the text file containing the serialized output, and see that the serialized public field has a stored value:
<?xml version="1.0" encoding="utf-8"?>
<XmlClass xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<field>10</field>
</XmlClass>
To deserialize, create a TextReader object and cast the deserialized output to an XmlClass object:
TextReader tr = File.OpenText(<file name>);
XmlClass xcd = (XmlClass)xs.Deserialize(tr);
tr.Close();
Console.WriteLine(xcd.field.ToString());
Conclusion
XML serialization provides a simple and efficient set of techniques to transfer object states between multiple software platforms. The Developer can gain complete control over the (de)serialization processes by implementing the IXmlSerializable interface. Serialization can also be customized through attributes or by conforming to an XSD format. Therefore, the benefits of implementing an XML serialization framework should be considered as an alternative to binary serialization.
This is the second article of the series.
To download this technical article in PDF format, visit Coactum Solutions at http://www.coactumsolutions.com/Articles.aspx.