
Merging Word Documents with C#

30 Jun 2004
This article describes how to take an original file and a set of modified versions of it and generate a single summary document containing all the changes.

Introduction

In my current project, the customer wants a set of features that will allow the administrator to publish a document on a website, let users work on that document (either changing it or adding comments) and submit the modified version back to the site. The administrator would then be able to see a summarized document with the changes introduced by all the users, and approve or reject them on a change-by-change basis.

Although it looked quite difficult at first glance, digging deeper into the documentation I found that, besides the "Track Changes" feature, Word can also merge several documents into a final one, so that all the changes are available in a single document. This feature is exposed through Word's automation object model (the same one VBA uses); hence, it's also available in the .NET Framework via COM interop.

The solution

In order to implement a solution, I built a C# ASP.NET project in Visual Studio .NET 2003 with the following structure:

In a nutshell, the Default.aspx page is the front end, the DocMerger class performs the actual merging, and the documents live in the "files" folder: "OriginalDoc" contains the original version of the document, "Copies" is where the files uploaded by the users are stored, and "Output" is the folder in which the summarized document is generated.

The web form is fairly simple and allows downloading the original document, uploading a modified version and creating the output document.

The initial version is published on the server with the "Track Changes" option turned on, so every change a user makes will be easily recognized.

Let's suppose this is the document:

The upload logic in the web page is quite straightforward: it creates a unique name for the changed document (using Guid.NewGuid()) and then stores the file in the "Copies" folder.

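private void btnGo_Click(object sender, System.EventArgs e)
{
 // Nothing to store if the user did not select a file.
 if (upload.PostedFile == null || upload.PostedFile.ContentLength == 0) return;

 string strBaseDir = Server.MapPath("files/copies");
 // Guid.ToString() uses the "D" format, which has no braces, so no Replace calls are needed.
 string strFileName = Guid.NewGuid().ToString();
 upload.PostedFile.SaveAs(Path.Combine(strBaseDir, strFileName + ".doc"));
}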
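
The naming step can also be pulled out of the page so that it is easy to check on its own. The following is only a minimal sketch, not the code shipped with the article: CopyNaming and BuildCopyPath are hypothetical names, and the "N" format specifier is an assumption made here to get a compact name without hyphens.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of the article's download): builds a unique
// path for an uploaded copy inside the given folder.
public static class CopyNaming
{
    public static string BuildCopyPath(string baseDir)
    {
        // "N" yields 32 hexadecimal digits without hyphens or braces.
        string fileName = Guid.NewGuid().ToString("N") + ".doc";
        return Path.Combine(baseDir, fileName);
    }
}
```

In the click handler, the result of CopyNaming.BuildCopyPath(Server.MapPath("files/copies")) could be passed straight to SaveAs.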

Following our previous example, let's say three users changed the document separately.

User 1

User 2

User 3

The "Copies" folder now holds all the copies submitted by the different users:

The DocMerger class has a Merge method that combines the documents. I also wrote an overload that takes a folder name instead of a file list.

    
        
    /// <summary>
    /// Merge a document with a set of copies
    /// </summary>
    /// <param name="strOrgDoc">
    /// Original file name
    /// </param>
    /// <param name="arrCopies">
    /// File names of the modified files
    /// </param>
    /// <param name="strOutDoc">
    /// The result filename
    /// </param>
    public void Merge(string strOrgDoc, string[] arrCopies, string strOutDoc)
    {
      ApplicationClass objApp = null;

      //boxing of default values for COM interop purposes
      object objMissing = Missing.Value;
      object objFalse = false;
      object objTarget = WdMergeTarget.wdMergeTargetSelected;
      object objUseFormatFrom = WdUseFormattingFrom.wdFormattingFromSelected;

      try
      {
        objApp = new ApplicationClass();
        object objOrgDoc = strOrgDoc;
        
        Document objDocLast = null;
        Document objDocBeforeLast = null;

        objDocLast = objApp.Documents.Open(
          ref objOrgDoc,    //FileName
          ref objMissing,   //ConfirmConversions
          ref objMissing,   //ReadOnly
          ref objMissing,   //AddToRecentFiles
          ref objMissing,   //PasswordDocument
          ref objMissing,   //PasswordTemplate
          ref objMissing,   //Revert
          ref objMissing,   //WritePasswordDocument
          ref objMissing,   //WritePasswordTemplate
          ref objMissing,   //Format
          ref objMissing,   //Encoding
          ref objMissing,   //Visible
          ref objMissing,   //OpenAndRepair
          ref objMissing,   //DocumentDirection
          ref objMissing,   //NoEncodingDialog
          ref objMissing    //XMLTransform
          );

        foreach(string strCopy in arrCopies)
        {
          Debug.WriteLine("Merging file " + strCopy);
          objDocLast.Merge(
            strCopy,                //FileName
            ref objTarget,          //MergeTarget
            ref objMissing,         //DetectFormatChanges
            ref objUseFormatFrom,   //UseFormattingFrom
            ref objMissing          //AddToRecentFiles
            );
          objDocBeforeLast = objDocLast;
          objDocLast = objApp.ActiveDocument;
          Debug.WriteLine("The active document is " + objDocLast.Name);

          if (objDocBeforeLast != null)
          {
            Debug.WriteLine("Closing " + objDocBeforeLast.Name);
            objDocBeforeLast.Close(
              ref objFalse,     //SaveChanges
              ref objMissing,   //OriginalFormat
              ref objMissing    //RouteDocument
              );
          }
        }

        object objOutDoc = strOutDoc;
      
        objDocLast.SaveAs(
          ref objOutDoc,      //FileName
          ref objMissing,     //FileFormat
          ref objMissing,     //LockComments
          ref objMissing,     //Password
          ref objMissing,     //AddToRecentFiles
          ref objMissing,     //WritePassword
          ref objMissing,     //ReadOnlyRecommended
          ref objMissing,     //EmbedTrueTypeFonts
          ref objMissing,     //SaveNativePictureFormat
          ref objMissing,     //SaveFormsData
          ref objMissing,     //SaveAsAOCELetter
          ref objMissing,     //Encoding
          ref objMissing,     //InsertLineBreaks
          ref objMissing,     //AllowSubstitutions
          ref objMissing,     //LineEnding
          ref objMissing      //AddBiDiMarks
          );

        //Close every open document in one call instead of closing them
        //one by one while enumerating the collection being modified
        objApp.Documents.Close(
          ref objFalse,     //SaveChanges
          ref objMissing,   //OriginalFormat
          ref objMissing    //RouteDocument
          );
        
      }
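      finally
      {
        //objApp is still null if Word could not be started
        if (objApp != null)
        {
          objApp.Quit(
            ref objFalse,       //SaveChanges (avoid a save prompt on an unattended server)
            ref objMissing,     //OriginalFormat
            ref objMissing      //RouteDocument
            );
          objApp = null;
        }
      }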
    }
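
A commonly reported issue with Office interop is that the Word process can stay alive after Quit while managed references to its COM objects are still held. Whether that happens depends on the runtime and the Office version, so treat the following as a hedged sketch rather than a required step. ComCleanup and Release are hypothetical names; Marshal.IsComObject and Marshal.ReleaseComObject are the framework calls being used.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper (not part of the article's download).
public static class ComCleanup
{
    // Returns true if a COM reference was released, false if the argument
    // was null or not a COM object (so it is safe to call on anything).
    public static bool Release(object obj)
    {
        if (obj == null || !Marshal.IsComObject(obj)) return false;
        Marshal.ReleaseComObject(obj);
        return true;
    }
}
```

In Merge, this could be called on the document and application references in the finally block, after Quit has returned.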

    /// <summary>
    /// Merge a document with a set of copies
    /// </summary>
    /// <param name="strOrgDoc">
    /// Original file name
    /// </param>
    /// <param name="strCopyFolder">
    /// Folder in which the copies are located
    /// </param>
    /// <param name="strOutDoc">
    /// The result filename
    /// </param>

    public void Merge(string strOrgDoc, string strCopyFolder, string strOutDoc)
    {
      //only pick up Word documents; other files in the folder cannot be merged
      string[] arrFiles = Directory.GetFiles(strCopyFolder, "*.doc");
      Merge(strOrgDoc, arrFiles, strOutDoc);
    }
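
Because the folder overload hands every file it finds to Word, it can help to filter the list first. Word creates owner files whose names start with "~$" next to any document that is open, and those cannot be merged. The sketch below is one possible filter, under the assumption that only ".doc" files are valid copies; CopyFilter and SelectCopies are hypothetical names, and the generic List requires .NET 2.0 or later.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper (not part of the article's download).
public static class CopyFilter
{
    // Keeps only .doc files and skips Word's "~$" owner files.
    public static string[] SelectCopies(string[] paths)
    {
        List<string> result = new List<string>();
        foreach (string path in paths)
        {
            string name = Path.GetFileName(path);
            bool isDoc = string.Equals(Path.GetExtension(name), ".doc",
                                       StringComparison.OrdinalIgnoreCase);
            bool isOwnerFile = name.StartsWith("~$", StringComparison.Ordinal);
            if (isDoc && !isOwnerFile) result.Add(path);
        }
        // Sort so the merge order does not depend on the file system.
        result.Sort(StringComparer.OrdinalIgnoreCase);
        return result.ToArray();
    }
}
```

The overload could then call Merge(strOrgDoc, CopyFilter.SelectCopies(Directory.GetFiles(strCopyFolder)), strOutDoc).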
    

The WebForm invokes this method:

    private void btnMerge_Click(object sender, System.EventArgs e)
    {
      string strOrigFile = Server.MapPath("files/originaldoc/thedocument.doc");
      string strCopiesDir = Server.MapPath("files/copies");
      string strOutputDir = Server.MapPath("files/output/output.doc");
      DocMerger objMerger = new DocMerger();
      objMerger.Merge(strOrigFile, strCopiesDir, strOutputDir);
      lnkResult.NavigateUrl = "files/output/output.doc";
      lnkResult.Visible = true;
    }

    

The final outcome is a document with the changes proposed by all the users:

This way, the administrator will be able to see all the changes in one place and either approve or reject each one of them.

The code of this simple application can be found with this article, but there are a couple of things you must keep in mind:

  • I generated the interop assemblies for Office 2003. If you use a different version, you have to re-import Office's type libraries (I think it will work in Office XP, 2000 and even in 97) and recompile the project.
  • Microsoft discourages the use of Office Automation on a web server. Nonetheless, all the tests that I did were fine. If needed, the DocMerger class can be moved to another kind of project and the web front end replaced with a console or WinForms application.
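
As an illustration of the second point, a console front end could look roughly like the sketch below. It is not part of the article's download: MergeConsole, Run, MergeAction and the exit codes are all hypothetical, and the delegate only stands in for DocMerger.Merge(string, string, string) so that the sketch compiles without Word installed.

```csharp
using System;

// Hypothetical console front end (not part of the article's download).
public static class MergeConsole
{
    // Stand-in for DocMerger.Merge(string, string, string).
    public delegate void MergeAction(string originalDoc, string copiesFolder, string outputDoc);

    // Returns a process exit code: 0 on success, 1 on bad usage, 2 on failure.
    public static int Run(string[] args, MergeAction merge)
    {
        if (args == null || args.Length != 3)
        {
            Console.Error.WriteLine("Usage: docmerge <original.doc> <copies-folder> <output.doc>");
            return 1;
        }
        try
        {
            merge(args[0], args[1], args[2]);
            return 0;
        }
        catch (Exception ex)
        {
            Console.Error.WriteLine("Merge failed: " + ex.Message);
            return 2;
        }
    }
}
```

A Main method could then simply return MergeConsole.Run(args, new DocMerger().Merge), which keeps Word out of the web server process entirely.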

Conclusion

By using automation in the .NET framework, taking advantage of Office's features is simple, and it allows the delivery of quick-to-implement and cool-featured solutions.

History

  • 2004/06/29 - Initial upload

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
