Content Controls and Open XML 2.0 SDK

23 Jul 2010
For the last 2 months I've been working on Microsoft Word automation with Open XML, Microsoft.Office.Interop.Word and the Open XML 2.0 SDK. In this blog, I'll focus on Content Controls and the Open XML 2.0 SDK, based on that experience.

I'll cover the following points:

  • Add Custom XML part to WordprocessingDocument
  • Get Custom XML part from WordprocessingDocument
  • Each content control contains a unique ID that is assigned by Word upon creation of the content control (Issues this may cause and how it can be handled)
  • Convert in-memory Document to Bytes without saving to a File

Add Custom XML Part to WordprocessingDocument

  1. Get the MainDocumentPart:
    C#
    MainDocumentPart mainPart = doc.MainDocumentPart;
  2. Define a root element for the Custom XML part:
    C#
    string customXmlPartNamespace = "http://schemas.microsoft.com/Test.Sample";
    string rootNodeName = "TestCoverageRoot";
    XName rootName = XName.Get(rootNodeName, customXmlPartNamespace);
    XElement rootElement = new XElement(rootName);
  3. The method displayed in the code snippet below does the rest:
    C#
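    // Requires: System.IO, System.Xml.Linq, DocumentFormat.OpenXml.Packaging
    public static CustomXmlPart AddCustomXmlPart
        (MainDocumentPart mainPart, XElement rootElement)
    {
        // Create a new, empty Custom XML part under the main document part.
        CustomXmlPart customXmlPart =
            mainPart.AddCustomXmlPart(CustomXmlPartType.CustomXml);

        // Write the root element into the part. Disposing the writer
        // flushes and closes the underlying stream.
        using (StreamWriter sw = new StreamWriter(customXmlPart.GetStream()))
        {
            sw.Write(rootElement.ToString());
        }
        return customXmlPart;
    }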
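To make it concrete what ends up in the part, here is a minimal sketch of the root element from step 2, using only LINQ to XML. The `RootElementSketch` class and the `Item` child element are hypothetical names introduced for illustration; only the namespace and the root name come from the steps above.

```csharp
using System.Xml.Linq;

public static class RootElementSketch
{
    // Namespace taken from step 2 above.
    public const string Namespace = "http://schemas.microsoft.com/Test.Sample";

    // Builds the root element; the optional child is purely illustrative.
    public static XElement BuildRoot(bool withSampleChild)
    {
        XNamespace ns = Namespace;
        XElement root = new XElement(ns + "TestCoverageRoot");

        if (withSampleChild)
        {
            // "Item" is a hypothetical payload element, not part of the article.
            root.Add(new XElement(ns + "Item", "sample value"));
        }
        return root;
    }
}
```

`RootElementSketch.BuildRoot(false).ToString()` produces `<TestCoverageRoot xmlns="http://schemas.microsoft.com/Test.Sample" />`, which is the text that `sw.Write(rootElement.ToString())` puts into the new part. Children created with the same `XNamespace` inherit the default namespace declaration instead of repeating it.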

Get Custom XML Part from a WordprocessingDocument

The snippet below assumes that each Custom XML part has a unique root namespace. Under that assumption, checking the namespace of the root node is enough to identify the part:

C#
// Requires: System.IO, System.Xml, DocumentFormat.OpenXml.Packaging

// Namespace of the root element of the part to look up.
string namespaceUri = "http://schemas.microsoft.com/Test.Sample";

public static CustomXmlPart GetCustomXmlPart
    (MainDocumentPart mainPart, string namespaceUri)
{
    foreach (CustomXmlPart part in mainPart.CustomXmlParts)
    {
        using (XmlTextReader reader =
            new XmlTextReader(part.GetStream(FileMode.Open, FileAccess.Read)))
        {
            // Skip the XML declaration, comments etc. and stop at the root element.
            reader.MoveToContent();

            if (reader.NamespaceURI.Equals(namespaceUri))
            {
                return part;
            }
        }
    }

    // No part has a root element in the requested namespace.
    return null;
}
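The root namespace check can be tried out without a Word document. The sketch below is an assumption-laden variant rather than the method above: it uses `XmlReader.Create` instead of `XmlTextReader`, and the `RootNamespaceSketch` class and its method names are hypothetical.

```csharp
using System.IO;
using System.Text;
using System.Xml;

public static class RootNamespaceSketch
{
    // Returns the namespace URI of the root element found in the stream.
    public static string GetRootNamespace(Stream xmlStream)
    {
        using (XmlReader reader = XmlReader.Create(xmlStream))
        {
            // MoveToContent skips the XML declaration, comments and whitespace
            // and positions the reader on the root element.
            reader.MoveToContent();
            return reader.NamespaceURI;
        }
    }

    // Convenience overload for trying the check on an in-memory string.
    public static string GetRootNamespace(string xml)
    {
        using (MemoryStream ms = new MemoryStream(Encoding.UTF8.GetBytes(xml)))
        {
            return GetRootNamespace(ms);
        }
    }
}
```

Only the root element is inspected, so a nested element in a different namespace does not affect the result. A root element without a namespace yields an empty string.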

Each Content Control Contains a Unique ID that is Assigned by Word upon Creation of the Content Control

Every content control has a unique ID, which lets you associate the control with data in a Custom XML part; without such an identifier, that association would not be possible through Custom XML. This has a downside that will not matter in perhaps 95% of cases but can cause issues in the remaining 5%. I'll describe one issue I ran into and then an approach that worked.

I was implementing a number of Word automation tasks, such as copying and pasting content controls and merging documents that contain content controls, when one of the merge test cases failed. Digging further, I found that the two documents contained different content controls with the same IDs. The merge (the library was using Microsoft.Office.Interop.Word 12.0) was done as shown in the snippet below:

C#
// Using Microsoft.Office.Interop.Word 12.0,
// where range can be Selection.Range, Document.Range etc.
// m_Missing is assumed to be an object field holding System.Type.Missing.
string fileName = "testFileToInsert.docx";
range.InsertFile(fileName, ref m_Missing, ref m_Missing, ref m_Missing, ref m_Missing);

In this scenario, for any controls in the file being inserted whose IDs already exist in the target document, Word automatically assigns new IDs so that control IDs stay unique across the document. Since both documents had Custom XML parts associated with their content controls, I could no longer map the data in the Custom XML part to the controls. For example, if the duplicate control IDs are 10 and 20, and Word now assigns 23356 and 45556, there is no way to tell whether 10 corresponds to 23356 or to 45556. Without a mapping from the previous ID to the new ID, I could not extract the information stored in the Custom XML part.

Since I could not find a way to recover that mapping, I decided to use the Tag property of the content control instead. Rather than relying on the control ID, I assign a unique GUID to every content control and store it in the Tag property. The only drawback is that the tags become visible if you call “ActiveDocument.ToggleFormsDesign” or activate “Design Mode” on the Developer tab in Microsoft Word.

This was not a functional limitation in my case (Developer mode was disabled), so I went ahead with this solution.

In brief, the solution was:

  1. Get the Range in the Document where you want to insert the .docx file
  2. Read the Custom XML part associated with the file to be inserted
  3. Call the Range.InsertFile method
  4. Following your business logic, add the data from the Custom XML part read in step 2 to the Custom XML part associated with the Document (Range.Document) into which the file was inserted
  5. Because the Tags are unique (GUIDs), any automatic reassignment of duplicate control IDs no longer affects the functionality

The same issue may appear during copy/paste operations, and the approach listed above may work there as well.
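The idea of keying on the Tag instead of the ID can be sketched against raw WordprocessingML with LINQ to XML. This is an illustration only, not the Interop code used in the project: the `TagLookupSketch` class and its method are hypothetical, and the input is assumed to be markup where each content control (`w:sdt`) carries `w:tag` and `w:id` elements inside `w:sdtPr`.

```csharp
using System.Collections.Generic;
using System.Xml.Linq;

public static class TagLookupSketch
{
    // WordprocessingML main namespace (the usual "w" prefix).
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Maps each content control's Tag value to its current ID.
    public static Dictionary<string, string> MapTagsToIds(XElement body)
    {
        Dictionary<string, string> map = new Dictionary<string, string>();

        foreach (XElement sdtPr in body.Descendants(W + "sdtPr"))
        {
            XElement tag = sdtPr.Element(W + "tag");
            XElement id = sdtPr.Element(W + "id");

            // Controls without a Tag or an ID are skipped.
            if (tag == null || id == null)
            {
                continue;
            }
            map[(string)tag.Attribute(W + "val")] = (string)id.Attribute(W + "val");
        }
        return map;
    }
}
```

Building this map before and after the insert gives the same keys (the GUID tags) even when Word has changed the IDs, so data stored in the Custom XML part under a tag can still be matched to its control.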

Convert in-memory Document to Bytes without Saving to a File

Here is one approach that worked for me when I had to convert an in-memory document to bytes without saving it to a file. The document was loaded in another module (process) using Microsoft.Office.Interop.Word 12.0, and from there we had to pass a byte stream without saving the document to a file.

The snippet below is implemented with Open XML 2.0. I pass the outer XML of the MainDocumentPart as a string, and the method returns the byte array. As the code shows, the XML is expected to contain pkg:part elements in the http://schemas.microsoft.com/office/2006/xmlPackage namespace.

C#
// Requires: System, System.IO, System.IO.Packaging, System.Text,
// System.Xml, System.Xml.Linq, System.Xml.XPath
public static byte[] GetDocumentStream(string mainDocumentPartOuterXml)
{
    if (string.IsNullOrEmpty(mainDocumentPartOuterXml))
    {
        return null;
    }

    string packageNodeName = "pkg";
    string packageUri = "http://schemas.microsoft.com/office/2006/xmlPackage";
    string partNameSpaceUri = "http://schemas.microsoft.com/office/2006/xmlPackage";

    // Select every pkg:part element in the input XML.
    XmlNamespaceManager namespaceManager = new XmlNamespaceManager(new NameTable());
    namespaceManager.AddNamespace(packageNodeName, packageUri);
    XPathDocument xpathDocument =
        new XPathDocument(new StringReader(mainDocumentPartOuterXml));
    XPathNavigator navigator = xpathDocument.CreateNavigator();
    XPathNodeIterator iterator = navigator.Select("//pkg:part", namespaceManager);

    using (MemoryStream ms = new MemoryStream())
    {
        // Build the package in memory, one package part per pkg:part element.
        using (Package pkg = Package.Open(ms, FileMode.Create))
        {
            while (iterator.MoveNext())
            {
                Uri partUri = new Uri(iterator.Current.GetAttribute
                    ("name", partNameSpaceUri), UriKind.Relative);

                if (pkg.PartExists(partUri))
                {
                    pkg.DeletePart(partUri);
                }

                PackagePart part = pkg.CreatePart(
                    partUri,
                    iterator.Current.GetAttribute("contentType", partNameSpaceUri));

                // The child of pkg:part wraps the actual part content.
                XElement elem = XElement.Parse(iterator.Current.InnerXml);
                string elementToWrite = elem.FirstNode.ToString();

                byte[] buffer;

                // Handled for Content Type = binaryData e.g. images
                // May need to handle for other content types
                if (elem.Name.LocalName.Equals("binaryData", StringComparison.OrdinalIgnoreCase))
                {
                    buffer = Convert.FromBase64String(elementToWrite);
                }
                else
                {
                    buffer = Encoding.UTF8.GetBytes(elementToWrite);
                }

                using (Stream partStream = part.GetStream())
                {
                    partStream.Write(buffer, 0, buffer.Length);
                }
            }
        }

        // The package has been closed here, so the stream holds the complete file.
        return ms.ToArray();
    }
}
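The branch on binaryData versus XML content is the part of this method most likely to need adjusting, so it may help to look at it in isolation. The following is a minimal sketch, not the method above: it uses LINQ to XML only, skips the Package handling entirely, and the `FlatPackageSketch` class and its method are hypothetical. It assumes each pkg:part has either a pkg:binaryData or a pkg:xmlData child.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml.Linq;

public static class FlatPackageSketch
{
    private static readonly XNamespace Pkg =
        "http://schemas.microsoft.com/office/2006/xmlPackage";

    // Returns the bytes each pkg:part would contribute, keyed by part name.
    public static Dictionary<string, byte[]> DecodeParts(string flatPackageXml)
    {
        Dictionary<string, byte[]> parts = new Dictionary<string, byte[]>();
        XElement package = XElement.Parse(flatPackageXml);

        foreach (XElement part in package.Elements(Pkg + "part"))
        {
            string name = (string)part.Attribute(Pkg + "name");
            XElement binary = part.Element(Pkg + "binaryData");
            XElement xmlData = part.Element(Pkg + "xmlData");

            if (binary != null)
            {
                // Binary parts (e.g. images) are stored as base64 text.
                parts[name] = Convert.FromBase64String(binary.Value);
            }
            else if (xmlData != null)
            {
                // XML parts wrap their real root element in pkg:xmlData.
                XElement root = xmlData.Elements().First();
                parts[name] = Encoding.UTF8.GetBytes(
                    root.ToString(SaveOptions.DisableFormatting));
            }
        }
        return parts;
    }
}
```

The dictionary makes it easy to inspect what would be written for each part before a Package is involved, for example to check that an image part decodes to the expected number of bytes.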

Summary

The solutions listed here worked in my case; they may or may not fit other functional requirements. There may also be better ways to implement the same things that I did not find, given limited time and my limited experience with Microsoft Word automation: I worked with OpenXml 2.0 and Microsoft.Office.Interop.Word for only 2 months, while migrating an application from Custom XML to Content Controls. I'm providing the references that helped me a lot.

References

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)