Getting MS Word Document Properties Using Visual Studio .NET

25 Jan 2006
This code demonstrates how to automate Word and read the document properties of an MS Word document.

Introduction

This code demonstrates how to extract the document properties from an MS Word document file. Refer to my other articles for background on MS Office Automation and on how to set up your development environment, if needed. I created this as a utility for a much larger document management project at our company. In the same week I was asked to create it, I also received a lot of requests from CodeProject readers to show exactly this feature, so I hope it demonstrates yet another way to automate Word for your needs.

Background

No special background is necessary. Some hands-on experience with C# is enough.

Using the code

The following listing retrieves the document properties. Refer to "Automating MS Word Using Visual Studio .NET" to get started with a new project. The source code I have provided demonstrates the task in a very simple fashion, and the listing below is the main section of the program, the part that does exactly what we are looking for.

Note: The following code was written for MS Word 2003.

...

/// <summary>
/// Get the source documents. Opens a file dialog so the user can
/// select one or more files for parsing.
/// </summary>
private void butSourceDocument_Click(object sender, System.EventArgs e)
{
    openFileDialog.Multiselect = true;

    if( openFileDialog.ShowDialog() == DialogResult.OK )
    {
        
        object vk_read_only = false;
        object vk_visible   = true;
        object vk_false     = false;
        object vk_true      = true;
        object vk_dynamic   = 2;

        object vk_missing   = System.Reflection.Missing.Value;
        
        string [] properties = { "Title", "Subject", "Author", 
                  "Keywords", "Revision Number", 
                  "Creation Date", "Last Save Time" };

        using (StreamWriter sw = new StreamWriter("FileProperties.txt")) 
        {
            // Write the header row: the property names separated by ", "
            sw.WriteLine(string.Join(", ", properties));

            foreach( string file in openFileDialog.FileNames )
            {
                object fileName = file;

                // Keep the Word application hidden while the files are processed
                vk_word_app.Visible = false;

                // Open the document
                Word.Document vk_my_doc = 
                    vk_word_app.Documents.Open( ref fileName,
                    ref vk_missing, ref vk_read_only, 
                    ref vk_missing, ref vk_missing,
                    ref vk_missing, ref vk_missing, 
                    ref vk_missing, ref vk_missing,
                    ref vk_missing, ref vk_missing, 
                    ref vk_visible );

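                // BuiltInDocumentProperties is returned as a plain object,
                // so its members are reached through reflection (late binding)
                object vk_document_prop = vk_my_doc.BuiltInDocumentProperties;

                Type propertyType = vk_document_prop.GetType(  );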
        
                string strProValues = null;
                foreach( string prop in properties )
                {
                    object property = propertyType.InvokeMember( "Item", 
                        System.Reflection.BindingFlags.Default | 
                        System.Reflection.BindingFlags.GetProperty, 
                        null, 
                        vk_document_prop, 
                        new object[  ] { prop } );

                    Type validatedType = property.GetType(  );

                    string propValue = validatedType.InvokeMember( "Value", 
                        System.Reflection.BindingFlags.Default | 
                        System.Reflection.BindingFlags.GetProperty, 
                        null, 
                        property, 
                        new object[] {} ).ToString(  );

                    strProValues = strProValues + propValue + ", ";
                }
                sw.WriteLine(strProValues.Substring(0,strProValues.Length-2));
        
                // Close the document without saving changes
                vk_my_doc.Close( ref vk_false, ref vk_missing, ref vk_missing );
            }
        }

        // Quit the Word application without saving changes
        vk_word_app.Quit( ref vk_false, ref vk_missing, ref vk_missing );

        MessageBox.Show( "Done!" );
    }
}

...
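
The two reflection calls in the listing above can be folded into one small helper. The following is only a sketch, not part of the original download: the names LateBoundProperties and GetValue are hypothetical, and it assumes that a property whose value cannot be read surfaces as a TargetInvocationException from InvokeMember, in which case an empty string is returned instead of aborting the whole run.

```csharp
using System;
using System.Reflection;

// Hypothetical helper (not in the original article): reads one named entry
// from a late-bound collection that exposes an "Item" indexer whose entries
// expose a "Value" property, as the listing does for BuiltInDocumentProperties.
public static class LateBoundProperties
{
    private const BindingFlags GetFlags =
        BindingFlags.Public | BindingFlags.Instance | BindingFlags.GetProperty;

    public static string GetValue(object properties, string name)
    {
        try
        {
            // Equivalent of properties.Item[name]
            object property = properties.GetType().InvokeMember(
                "Item", GetFlags, null, properties, new object[] { name });
            if (property == null)
            {
                return string.Empty;
            }

            // Equivalent of property.Value
            object value = property.GetType().InvokeMember(
                "Value", GetFlags, null, property, new object[0]);

            // Guard against a null value before calling ToString()
            return value == null ? string.Empty : value.ToString();
        }
        catch (TargetInvocationException)
        {
            // Assumption: a getter that fails (for example, a property that
            // has no value yet) is reported this way; treat it as empty.
            return string.Empty;
        }
    }
}
```

Inside the inner loop of the listing, the call would then look something like string propValue = LateBoundProperties.GetValue( vk_document_prop, prop );.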

In short, the program defines the set of properties it wants to extract: string [] properties = { "Title", "Subject", "Author", "Keywords", "Revision Number", "Creation Date", "Last Save Time" };. It then loops through the selected file(s), extracts those properties, and writes them to a text file (FileProperties.txt) for further processing. Note the two nested foreach loops in the code above: the outer one iterates over the list of files to process, and the inner one iterates over the list of properties to extract from each file.
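
One thing to watch when the text file is processed further: a value such as a Keywords entry may itself contain a comma, which would shift the columns of a comma-separated line. The sketch below shows one possible way to guard against that, assuming the consumer expects conventional CSV quoting. The class and method names (CsvRow, Escape, Join) are hypothetical and not part of the original code, and the separator is a bare comma rather than the ", " used in the listing, because a space before an opening quote confuses many CSV readers.

```csharp
using System;

// Hypothetical helper (not in the original article): builds one CSV line,
// quoting only the fields that need it.
public static class CsvRow
{
    // Wrap a field in double quotes when it contains a comma, a quote,
    // or a line break; embedded quotes are doubled.
    public static string Escape(string field)
    {
        if (field == null)
        {
            return string.Empty;
        }

        bool needsQuotes =
            field.IndexOfAny(new char[] { ',', '"', '\r', '\n' }) >= 0;
        if (!needsQuotes)
        {
            return field;
        }

        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // Escape every field and join the results with commas.
    public static string Join(string[] fields)
    {
        string[] escaped = new string[fields.Length];
        for (int i = 0; i < fields.Length; i++)
        {
            escaped[i] = Escape(fields[i]);
        }
        return string.Join(",", escaped);
    }
}
```

With this in place, the property values for a file could be collected into a string array and written with sw.WriteLine( CsvRow.Join( values ) ); instead of being concatenated by hand.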

Points of Interest

Office 2003, the new version of Office, should make things a little easier for Office developers. If you are an Office developer, it is worth looking into the features that Office 2003 has to offer. One feature I particularly like is the ability to export documents to XML format.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
