Introduction
I recently had to write a module to create an invoice in Word (.docx) format. Although I found a number of solutions that do this, none were as simple as I wanted them to be. All I wanted was a kind of search-and-replace that I could run from C# code. So this library does a simple text-based search-and-replace on a docx file without any external library (OpenXML or Office Interop). It doesn't require Office to be installed on the target machine either.
Background
The .docx format is basically a ZIP archive containing several other files, with the main content residing in the XML file word/document.xml. Details on the format can be found here.
So, what this library effectively does is:
- Extract the docx file to a temporary location.
- Do a search-and-replace on the XML file containing the main content (word/document.xml).
- Repackage the files to re-create the docx file.
- Copy/rename the new file to the desired location.
To archive/unarchive the docx file without any third-party library, I've used the code from my earlier article.
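The four steps above can be sketched roughly as follows. This is a minimal illustration, not the library's actual code: it uses System.IO.Compression.ZipFile (which needs .NET Framework 4.5 or later, or modern .NET) instead of the archive code from the earlier article, and the names DocxReplaceSketch and ReplaceInDocx are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper; not part of the DocXEditor library.
public static class DocxReplaceSketch
{
    public static void ReplaceInDocx(string sourcePath, string destinationPath,
        IDictionary<string, string> replacements, bool overwrite)
    {
        string tempDir = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N"));
        string tempDocx = tempDir + ".docx";
        try
        {
            // 1. Extract the docx (a ZIP archive) to a temporary folder.
            ZipFile.ExtractToDirectory(sourcePath, tempDir);

            // 2. Plain text search-and-replace on word/document.xml.
            string documentXml = Path.Combine(tempDir, "word", "document.xml");
            string xml = File.ReadAllText(documentXml, Encoding.UTF8);
            foreach (KeyValuePair<string, string> pair in replacements)
            {
                xml = xml.Replace(pair.Key, pair.Value);
            }
            File.WriteAllText(documentXml, xml, new UTF8Encoding(false));

            // 3. Repackage the folder into a new archive.
            ZipFile.CreateFromDirectory(tempDir, tempDocx);

            // 4. Copy the result to the requested location.
            File.Copy(tempDocx, destinationPath, overwrite);
        }
        finally
        {
            // Clean up the temporary folder and archive.
            if (Directory.Exists(tempDir)) Directory.Delete(tempDir, true);
            if (File.Exists(tempDocx)) File.Delete(tempDocx);
        }
    }
}
```

One caveat with this substitute approach: older .NET Framework versions write archive entry names with backslashes when using CreateFromDirectory, which some consumers of docx files may reject, so it is worth testing the output on the framework version you actually target.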
Using the Code
Using the DocX Editor is extremely simple. The main class that exposes all the functionality is the DocXEditor class.
Before the DocXEditor class can edit the file, we need to give it three pieces of information:
- The source file name
- A list of text to replace in the source file
- The destination file name
Here's how we specify each of the three:
1. Specify the source file
This is as simple as passing the file name to the constructor.
DocXEditor dox = new DocXEditor("Template.docx");
This will set the source (or template) file of our choosing.
2. List of text to replace in the template file
This is where we specify the search-and-replace pairs. The class goes through this list and replaces each search string in turn. Here's how it's done:
ReplacementList replaceList = new ReplacementList();
replaceList.Add(new ReplaceItem("[PAT_NAME]", "John Doe"));
replaceList.Add(new ReplaceItem("[PAT_AGE]", "24"));
dox.ReplaceList = replaceList;
Here, we are creating a new ReplacementList and then adding find-replace pairs of strings. So, effectively, we are asking the DocXEditor to find the string "[PAT_NAME]" in the docx file and replace it with the string "John Doe". Similarly, in the third line, we want the string "[PAT_AGE]" to be replaced with the string "24".
In the last line above, we assign the newly created replacement list to the DocXEditor object created in step 1.
Note: The ReplacementList and ReplaceItem classes are just miniature helper classes to make our task of specifying the search-replace list easier.
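To give an idea of how small such helper classes can be, here is a rough sketch of what they might look like. The real classes ship with the attached sample and may differ: the property names SearchText and ReplaceText and the ApplyTo method are assumptions made for this illustration only.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical shape of a find-replace pair; member names are assumed.
public class ReplaceItem
{
    public string SearchText { get; private set; }
    public string ReplaceText { get; private set; }

    public ReplaceItem(string searchText, string replaceText)
    {
        // string.Replace throws on an empty search string, so reject it early.
        if (string.IsNullOrEmpty(searchText))
            throw new ArgumentException("Search text must not be empty.", "searchText");
        SearchText = searchText;
        ReplaceText = replaceText ?? string.Empty;
    }
}

// Hypothetical list type: a plain List<ReplaceItem> with one convenience method.
public class ReplacementList : List<ReplaceItem>
{
    // Applies every pair to the given text, in the order the pairs were added.
    public string ApplyTo(string text)
    {
        foreach (ReplaceItem item in this)
        {
            text = text.Replace(item.SearchText, item.ReplaceText);
        }
        return text;
    }
}
```

Because the pairs are applied in order, a later pair can also match text inserted by an earlier one, which is one reason to pick placeholder strings such as "[PAT_NAME]" that are unlikely to appear in real values.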
3. Specify the destination file
This is the final step, where we specify the destination file and ask DocXEditor to actually do the work.
dox.ReplaceContent("Test_Destination.docx", true);
Here, we set the destination file name. The second argument is optional and specifies whether the destination file should be overwritten if it already exists.
That's it!
Points of Interest
- The code snippets shown above have been slightly modified from the sample application attached. This has been done for the sake of making the snippet simpler and easier to understand.
- Though this is not the most flexible and/or effective way to modify a docx file, in my opinion this is an extremely simple approach and might be useful in many scenarios. Additionally, this also has an advantage of not needing any external libraries.
- I've written this code in VS 2012 and the target framework is 3.5. However, since this code doesn't use anything fancy, it should work with older versions of the .NET Framework just fine.
- This library might not work in cases where the string to be searched is similar to the XML tags in the XML document.
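A related caution when replacing text in raw XML: if a replacement value contains characters such as & or <, inserting it verbatim would leave word/document.xml malformed. The article does not say whether the library escapes values, so assuming it does not, a small wrapper like the one below could be applied before adding each pair to the list. XmlSafeText is a made-up name; SecurityElement.Escape is a standard .NET method.

```csharp
using System;
using System.Security;

// Hypothetical helper; not part of the DocXEditor library.
public static class XmlSafeText
{
    // Escapes <, >, &, " and ' so the value can sit safely inside an XML text node.
    public static string Escape(string value)
    {
        return SecurityElement.Escape(value ?? string.Empty);
    }
}

public static class XmlSafeTextDemo
{
    public static void Main()
    {
        // Prints: Smith &amp; Sons &lt;Ltd&gt;
        Console.WriteLine(XmlSafeText.Escape("Smith & Sons <Ltd>"));
    }
}
```

With the classes from the article, usage would then look like new ReplaceItem("[PAT_NAME]", XmlSafeText.Escape(patientName)), where patientName is whatever value comes from your own data.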
History
- Original post: 28 September, 2013