Introduction
Converting a file from one format to another is a common need. Most of the tools available cost money; not a lot, but they aren't free. However, Microsoft Word can already read and write several of the file types you may want to convert between. This article shows how to use C# with the Word Automation objects to read a file in one format and write it out in another.
Background
I needed to convert an RTF file to TXT. As I looked for solutions, I had a hard time finding one that was free, so I decided to use Word Automation to accomplish my goal. Note that this example uses Office 2003. Without Office 2003 it will probably still work, except for the XML output, which is new in Word 2003. The sample program included in the download with the source code lets you set an input file, which is loaded into the Word object. A ComboBox then lists the formats that you can convert to.
The code
To use Word Automation, add a reference to the Word object library to your project. In the Add Reference dialog, click the COM tab; toward the bottom of the list you will find Microsoft Word 11.0 Object Library. Select it and add it to the references. You can then access the Word functionality in code.
private void ConvertFile()
{
    String inFileName = txtBoxInputFile.Text;
    if (!File.Exists(inFileName))
    {
        MessageBox.Show(inFileName + " does not exist. " +
            "Please select an existing file to convert.");
        return;
    }

    // Nothing (or something unexpected) selected in the format list.
    myItem tmpItem = cmbBoxOutput.SelectedItem as myItem;
    if (tmpItem == null)
    {
        MessageBox.Show("Please select an output format.");
        return;
    }

    // The Word interop methods take every argument as "ref object".
    object fileName = inFileName;
    // Swap the input extension for the one matching the chosen format.
    // Path.ChangeExtension also copes with file names that have no extension.
    object fileSaveName = Path.ChangeExtension(inFileName,
        tmpItem.ItemExtension);
    object vk_read_only = false;
    object vk_visible = true;
    object vk_false = false;
    object missing = System.Reflection.Missing.Value;

    Microsoft.Office.Interop.Word.ApplicationClass vk_word_app =
        new Microsoft.Office.Interop.Word.ApplicationClass();
    Microsoft.Office.Interop.Word.Document aDoc = null;
    try
    {
        try
        {
            aDoc = vk_word_app.Documents.Open(
                ref fileName, ref missing,
                ref vk_read_only, ref missing,
                ref missing, ref missing,
                ref missing, ref missing,
                ref missing, ref missing,
                ref missing, ref vk_visible,
                ref missing, ref missing,
                ref missing, ref missing);
        }
        catch (System.Exception ex)
        {
            MessageBox.Show("There was a problem opening " +
                fileName + " error: " + ex.ToString());
            // Nothing to save; the finally block below still quits Word.
            return;
        }

        // Save the open document in the format chosen in the ComboBox.
        object vk_saveformat = tmpItem.ItemWord;
        aDoc.SaveAs(ref fileSaveName, ref vk_saveformat,
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing);
    }
    catch (System.Exception ex)
    {
        MessageBox.Show("Error : " + ex.ToString());
    }
    finally
    {
        // Always close the document and shut Word down without saving changes.
        if (aDoc != null)
        {
            aDoc.Close(ref vk_false, ref missing, ref missing);
        }
        vk_word_app.Quit(ref vk_false, ref missing, ref missing);
    }
}
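The output file name in ConvertFile is simply the input name with its extension swapped. Here is a minimal sketch of that one step on its own, assuming Path.ChangeExtension from System.IO is acceptable for the job. The class and method names (OutputNameSketch, BuildOutputName) are made up for illustration and are not part of the sample download.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the article's download.
public static class OutputNameSketch
{
    // Returns inputPath with its extension replaced by newExtension.
    // If inputPath has no extension, newExtension is appended instead.
    public static string BuildOutputName(string inputPath, string newExtension)
    {
        return Path.ChangeExtension(inputPath, newExtension);
    }

    public static void Main()
    {
        Console.WriteLine(BuildOutputName("report.rtf", ".txt"));       // report.txt
        Console.WriteLine(BuildOutputName("archive.2004.rtf", ".txt")); // archive.2004.txt
        Console.WriteLine(BuildOutputName("notes", ".txt"));            // notes.txt
    }
}
```

The last case is the one worth noting: a name without any "." has no extension to cut off, so anything based on the position of the last "." needs a special case for it.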
The main thing that does the work is the Word save format, for example:
Microsoft.Office.Interop.Word.WdSaveFormat.wdFormatRTF
Under WdSaveFormat you have all the available formats that you would see in Word when you do a Save As. I store the chosen value in the ComboBox using the myItem class.
public class myItem
{
    private String _itemName;
    private String _itemExtension;
    private Object _itemWord;
    ...
    public Object ItemWord
    {
        get { return _itemWord; }
        set { _itemWord = value; }
    }
    public override String ToString()
    {
        return ItemName.ToString();
    }
    public myItem() {}
    public myItem(String inName,
        String inExtension, Object inWordType)
    {
        ItemName = inName;
        ItemExtension = inExtension;
        ItemWord = inWordType;
    }
}
Loading the ComboBox looks like this:
cmbBoxOutput.Items.Add(new myItem("XML Doc (*.xml)", ".xml",
    Microsoft.Office.Interop.Word.WdSaveFormat.wdFormatXML));
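To offer more than one format, the same pattern repeats once per entry. The following is a rough sketch of such a list that runs without Word installed. Everything in it is a stand-in: SaveFormatStandIn takes the place of WdSaveFormat, FormatEntry takes the place of myItem, and the display names are invented. In the real program each entry would be a myItem added to cmbBoxOutput with the matching WdSaveFormat member (wdFormatText, wdFormatRTF, wdFormatHTML, wdFormatXML).

```csharp
using System;
using System.Collections.Generic;

// Stand-in for Microsoft.Office.Interop.Word.WdSaveFormat, so the sketch
// compiles without the Word reference. Members are named after the real ones.
public enum SaveFormatStandIn
{
    wdFormatText,
    wdFormatRTF,
    wdFormatHTML,
    wdFormatXML
}

// Stand-in for the article's myItem class: display name, extension, format.
public class FormatEntry
{
    public readonly string Name;
    public readonly string Extension;
    public readonly object WordFormat;

    public FormatEntry(string name, string extension, object wordFormat)
    {
        Name = name;
        Extension = extension;
        WordFormat = wordFormat;
    }

    // A ComboBox displays whatever ToString() returns for each item.
    public override string ToString()
    {
        return Name;
    }
}

public static class FormatTableSketch
{
    // Builds the entries that would be added to the ComboBox one by one.
    public static List<FormatEntry> BuildFormats()
    {
        List<FormatEntry> formats = new List<FormatEntry>();
        formats.Add(new FormatEntry("Plain Text (*.txt)", ".txt", SaveFormatStandIn.wdFormatText));
        formats.Add(new FormatEntry("Rich Text (*.rtf)", ".rtf", SaveFormatStandIn.wdFormatRTF));
        formats.Add(new FormatEntry("Web Page (*.htm)", ".htm", SaveFormatStandIn.wdFormatHTML));
        formats.Add(new FormatEntry("XML Doc (*.xml)", ".xml", SaveFormatStandIn.wdFormatXML));
        return formats;
    }

    public static void Main()
    {
        foreach (FormatEntry entry in BuildFormats())
        {
            Console.WriteLine(entry + " -> " + entry.Extension);
        }
    }
}
```

Keeping the extension next to the format value in each entry is what lets the conversion code pick both the save format and the output file name from a single selected item.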
Conclusion
So this is a pretty simple solution. It is important to note that when you save to XML, Word's own formatting markup is included in the output, and some of that may not be what you want in your XML document. Still, the solution works well for converting RTF to TXT.
My thanks to all the other CodeProject articles on Word automation that helped me put this solution together.