Introduction
In our company, we have an automatic product scanner that reads each device's serial number (XXXX; XXXXS; XXXX, where X is a digit) and saves it in a tabular text file.
After about an hour, more than 1000 serial numbers (SNs) have been saved in the text file in tabular format. At the end of the day, a production employee must insert some of the scanned SNs into a Microsoft Word template and send a PDF copy to the customer.
This tool helps the production employee insert tabular data into a Microsoft Word document and convert the merged document to PDF.
This project presents a tabular text merger that copies data from a given tabular text file, inserts the copied text on the fly into a table in a given Microsoft Word document, and finally exports the merged document to PDF.
The Main Form
The purpose of the main form is to control the application's feedback and to let the user run it any way they choose. When the program runs, as shown in Fig 3, you will see the browse buttons and text boxes.
The first one is for the given tabular text file.
The second one is for the given Microsoft Word document.
The third one is optional; it sets the output path.
Remarks
- The original Microsoft Word document is used as a template. The tool does not change the original .doc file; if you want it to, then set:
- In the Config file, you can define the start index of the table in the given Microsoft Word document. For example:
- The tool opens the second table in the given Microsoft Word document, jumps to the second row, and begins writing the copied tabular text there.
- If the table is smaller than the copied tabular text, the user receives a TextInjector format error.
- If the copied tabular text does not match the given regular expression pattern, the user receives a format error.
- You can define how many columns are copied from the given text file.
- Data is copied cell by cell; the start cell is defined in the Config file.
- Microsoft Word 2003 cannot save to PDF format; I solved this problem by using an external tool.
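The pattern check and the column limit from the remarks above can be sketched as one small helper. This is only an illustration, not the tool's actual code: the class SerialLineSketch and the method TakeColumns are hypothetical names, while the pattern and the ";" separator are copied from the App.config shown later in this article.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of TextInjector itself.
public static class SerialLineSketch
{
    // Pattern and separator as they appear in the article's App.config.
    public const string Pattern = @"^[0-9]{5}[- ;.][0-9]{5}S[- ;.][0-9]{4}$";
    public const char Separator = ';';

    // Returns the first 'columns' cells of a line, or null when the line does
    // not match the pattern or contains fewer cells than requested.
    public static string[] TakeColumns(string line, int columns)
    {
        if (!Regex.IsMatch(line, Pattern))
            return null;

        string[] cells = line.Split(Separator);
        if (columns < 0 || cells.Length < columns)
            return null;

        string[] result = new string[columns];
        Array.Copy(cells, result, columns);
        return result;
    }
}
```

Calling SerialLineSketch.TakeColumns("12345;12345S;1234", 2) would keep only the first two cells, which corresponds to copying two columns from the text file.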
How to use TextInjector?
- Run TextInjector.
- Navigate to the tabular text file, as shown in Fig 3.
Figure 1 Example for a tabular text file
- Navigate to the Microsoft Word document that contains the tables, as shown in Fig 3.
Figure 2 Example for Microsoft-Word Document
- Browse to the desired output path, as shown in Fig 3.
Figure 3 TextInjector
- Finally, click the Create File button. The PDF file is created; see Figs 4 and 5.
Figure 4 The created PDF file
Figure 5 The test directory
How Does the Code Work?
The first method called in the constructor is RetrieveAppSettings().
private void RetrieveAppSettings()
{
    var configurationAppSettings = new AppSettingsReader();
    _inputTextFileTextBox.Text = (string)configurationAppSettings.GetValue(
        "TextFilePath", typeof(string));
    _inputWordFileTextBox.Text = (string)configurationAppSettings.GetValue(
        "WordFilePath", typeof(string));
…
App.config File
<configuration>
  <appSettings>
    <add key="TextFilePath" value="C:\" />
    <add key="WordFilePath" value="C:\" />
    <add key="WordOutputPath" value="C:\" />
    <add key="TextFilesFilter" value="txt files (*.txt)|*.txt" />
    <add key="WordFilesFilter" value="doc files (*.doc)|*.doc" />
    <add key="RegularExpressionFilter" value="^[0-9]{5}[- ;.][0-9]{5}S[- ;.][0-9]{4}$" />
    <add key="NumberOfCloumnsInTextFile" value="3" />
    <add key="RowSplitter" value=";" />
    <add key="TableIndex" value="1" />
    <add key="RowIndex" value="1" />
  </appSettings>
</configuration>
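All appSettings values arrive as strings, so the numeric and single-character entries have to be converted before use. The sketch below shows one way to do that; it is an assumption about a reasonable approach, not the article's own code. InjectorSettings and its members are hypothetical names, and a plain dictionary stands in for the appSettings section so the example runs without a configuration file. Values below 1 are rejected because Word's table and row collections are 1-based.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical typed view of the settings; not part of TextInjector itself.
public sealed class InjectorSettings
{
    public int ColumnCount;
    public char RowSplitter;
    public int TableIndex;
    public int RowIndex;

    // 'raw' stands in for the key/value pairs of the appSettings section.
    public static InjectorSettings Parse(IDictionary<string, string> raw)
    {
        var settings = new InjectorSettings();
        // The key is spelled exactly as in the App.config above.
        settings.ColumnCount = ReadPositiveInt(raw, "NumberOfCloumnsInTextFile");
        settings.TableIndex = ReadPositiveInt(raw, "TableIndex");
        settings.RowIndex = ReadPositiveInt(raw, "RowIndex");

        string splitter;
        if (!raw.TryGetValue("RowSplitter", out splitter) || splitter.Length != 1)
            throw new FormatException("RowSplitter must be a single character.");
        settings.RowSplitter = splitter[0];
        return settings;
    }

    // Reads a setting and insists on an integer of at least 1.
    private static int ReadPositiveInt(IDictionary<string, string> raw, string key)
    {
        string text;
        int value;
        if (!raw.TryGetValue(key, out text) || !int.TryParse(text, out value) || value < 1)
            throw new FormatException(key + " must be an integer of at least 1.");
        return value;
    }
}
```

Failing early on a bad value gives the user a clear message at startup instead of an error from Word in the middle of the merge.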
RetrieveAppSettings() reads the application's App.config file, which contains the configuration data. Each time you navigate to a new path, the new path is written to the App.config file as follows:
private static void SaveConfig(string propertyName, string filePath)
{
    var config = ConfigurationManager.OpenExeConfiguration(ConfigurationUserLevel.None);
    config.AppSettings.Settings[propertyName].Value = filePath;
    // Causes only modified properties to be written to the configuration file,
    // even when the value is the same as the inherited value.
    config.Save(ConfigurationSaveMode.Modified);
    // … Path.DirectorySeparatorChar + ConfigFileName;
    ConfigurationManager.RefreshSection("appSettings");
}
The application is now ready to use.
CreateDocumentsButtonClick is the central method of the application; it is called when the user clicks the Create File button:
First, it checks that the given paths exist; if one is missing, it shows a message and returns:
if (!File.Exists(_inputTextFileTextBox.Text))
{
    MessageBox.Show("Input Text file doesn't exist!!");
    return;
}
if (!File.Exists(_inputWordFileTextBox.Text))
{
    MessageBox.Show("Input Word file doesn't exist!!");
    return;
}
if (!Directory.Exists(_outputDocsPathTextBox.Text))
{
    MessageBox.Show("Output Word directory doesn't exist!!");
    return;
}
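The three checks above can also be kept separate from the message boxes, which makes them easier to exercise without the form. The following is a minimal sketch under that assumption; PathCheckSketch and FirstError are hypothetical names, and the messages are the ones used in the listing above.

```csharp
using System.IO;

// Hypothetical helper, not part of TextInjector itself.
public static class PathCheckSketch
{
    // Returns null when all three paths exist; otherwise returns the message
    // for the first path that is missing, in the same order as the listing.
    public static string FirstError(string textFile, string wordFile, string outputDirectory)
    {
        if (!File.Exists(textFile))
            return "Input Text file doesn't exist!!";
        if (!File.Exists(wordFile))
            return "Input Word file doesn't exist!!";
        if (!Directory.Exists(outputDirectory))
            return "Output Word directory doesn't exist!!";
        return null;
    }
}
```

The click handler would then show the returned message, if any, and return.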
Next, it retrieves the data from the given tabular text file; if a format error is found and the user cancels the process, it returns:
if (!RetrieveRows())
    return;
Below, I will explain the most important code section:
Open the Microsoft Word document and insert the copied table.
Represents the Microsoft Office Word application.
_Application wordApp = new Word.Application();
The Microsoft Word document.
object wordFile = @_inputWordFileTextBox.Text;
Dummy object.
object missing = Missing.Value;
The document is opened in read-only mode.
object readOnly = true;
PDF format.
object fileFormat = WdSaveFormat.wdFormatPDF;
Here I get the Microsoft Word document file name without the extension.
var fileNameWithoutExt = Path.GetFileNameWithoutExtension(_inputWordFileTextBox.Text);
The PDF file has the same name as the Microsoft Word document.
object wordPdfFile = _outputDocsPathTextBox.Text + Path.DirectorySeparatorChar +
fileNameWithoutExt + pdfExt;
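As an alternative to concatenating the separator and extension by hand, the same output path can be built with Path.Combine and Path.ChangeExtension. This is a sketch of that alternative, not the article's code; PdfPathSketch and Build are hypothetical names.

```csharp
using System.IO;

// Hypothetical helper, not part of TextInjector itself.
public static class PdfPathSketch
{
    // Builds "<outputDirectory>/<word file name>.pdf" from the Word file path.
    public static string Build(string wordFilePath, string outputDirectory)
    {
        // Keep only the file name and swap its extension for ".pdf".
        string pdfName = Path.ChangeExtension(Path.GetFileName(wordFilePath), ".pdf");
        // Path.Combine inserts the directory separator only when it is needed.
        return Path.Combine(outputDirectory, pdfName);
    }
}
```

Path.Combine avoids a doubled separator when the output path already ends with one, as the default value C:\ in App.config does.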
wordApp.Documents is a collection of all the Microsoft.Office.Interop.Word.Document objects that are currently open in Word.
wordApp.Documents.Open opens the given Microsoft Word document in the Microsoft Word application.
_Document wordDoc = wordApp.Documents.Open(ref wordFile,
    ref missing, ref readOnly, ref missing, ref missing, ref missing,
    ref missing, ref missing, ref missing, ref missing, ref missing,
    ref missing, ref missing, ref missing, ref missing, ref missing);
Checks whether the given Microsoft Word document contains the configured table and start row.
if (wordDoc.Tables.Count >= _tableIndex &&
    wordDoc.Tables[_tableIndex].Rows.Count >= _rowIndex)
{
    var destinationTable = wordDoc.Tables[_tableIndex];
    object readOnlyRecommended = false;
    for (var rowNrInList = 0; rowNrInList < _extractedRow.Count; rowNrInList++)
    {
        var row = _extractedRow[rowNrInList];
        // Inserts the tabular text on the fly into the destination table.
        for (var columnNrInList = 0; columnNrInList < row.Count; columnNrInList++)
        {
            destinationTable.Cell(rowNrInList + 1, columnNrInList + 1).Range.Text =
                row[columnNrInList];
        }
    }
    // Exports the modified document to PDF.
    wordDoc.SaveAs(ref wordPdfFile, ref fileFormat,
        ref missing, ref missing, ref missing,
        ref missing, ref readOnlyRecommended,
        ref missing, ref missing, ref missing,
        ref missing, ref missing, ref missing,
        ref missing, ref missing, ref missing);
}
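The listing above writes list row 0 to table row 1, which matches a RowIndex of 1. The Remarks describe a configurable start row, so the sketch below shows how the index arithmetic might look if the start row were taken into account. This is an assumption about the intended behavior, not the article's code, and CellMapSketch with its methods is a hypothetical name. Word's Cell(row, column) coordinates are 1-based, while the list indices are 0-based.

```csharp
// Hypothetical index helper, not part of TextInjector itself.
public static class CellMapSketch
{
    // True when the table has enough rows, counted from the 1-based start
    // row, to hold all data rows.
    public static bool Fits(int tableRowCount, int startRow, int dataRowCount)
    {
        return startRow >= 1 && tableRowCount - startRow + 1 >= dataRowCount;
    }

    // Maps a 0-based list row to a 1-based table row, offset by the start row.
    public static int TargetRow(int startRow, int rowNrInList)
    {
        return startRow + rowNrInList;
    }

    // Maps a 0-based list column to a 1-based table column.
    public static int TargetColumn(int columnNrInList)
    {
        return columnNrInList + 1;
    }
}
```

With a start row of 2, a table of 10 rows can hold at most 9 data rows, so Fits(10, 2, 10) is false. Running such a check before writing any cell lets the tool report the size mismatch without touching the table.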
RetrieveRows() is called from CreateDocumentsButtonClick() to read the text data.
- Extracts the data from the tabular text file.
- Checks each extracted row to see whether it matches the given regular expression pattern.
- Adds each matching row, split into cells, to _extractedRow.
- Returns true if the process completes, or false if the user cancels after a format error.
private bool RetrieveRows()
{
    _extractedRow.Clear();
    var readSuccessfullyCompleted = true;
    var formatRegex = new Regex(_regularExpressionText);
    var fileText = File.ReadAllText(_inputTextFileTextBox.Text);
    _rowTextLine = new List<string>(fileText.Split(new[] { Environment.NewLine },
        StringSplitOptions.RemoveEmptyEntries));
    foreach (var row in _rowTextLine)
    {
        if (formatRegex.IsMatch(row))
        {
            // Split the valid line into cells and keep it for insertion.
            var cells = row.Split(_rowSplitter);
            _extractedRow.Add(new List<string>(cells));
        }
        else
        {
            // Ask whether to continue after a line that fails the pattern.
            var dlgResult = MessageBox.Show("Format error found in the tabular text file!",
                "Continue?", MessageBoxButtons.YesNo,
                MessageBoxIcon.Question);
            if (dlgResult == DialogResult.No)
            {
                readSuccessfullyCompleted = false;
                break;
            }
        }
    }
    return readSuccessfullyCompleted;
}
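The parsing step of RetrieveRows() can be sketched without the dialog, so that the caller decides what to do with invalid lines. This is an illustrative variant, not the article's code: ParseResult and RowParserSketch are hypothetical names, and unlike the listing above the sketch splits on both "\r\n" and "\n" rather than only on Environment.NewLine.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical result type, not part of TextInjector itself.
public sealed class ParseResult
{
    public readonly List<string[]> Rows = new List<string[]>();
    // 1-based positions, counted among the non-empty lines only.
    public readonly List<int> InvalidLineNumbers = new List<int>();
}

public static class RowParserSketch
{
    public static ParseResult Parse(string fileText, string pattern, char splitter)
    {
        var result = new ParseResult();
        var formatRegex = new Regex(pattern);
        // Empty lines are dropped, as in RetrieveRows().
        string[] lines = fileText.Split(new[] { "\r\n", "\n" },
            StringSplitOptions.RemoveEmptyEntries);

        for (int i = 0; i < lines.Length; i++)
        {
            if (formatRegex.IsMatch(lines[i]))
                result.Rows.Add(lines[i].Split(splitter));   // valid line: keep its cells
            else
                result.InvalidLineNumbers.Add(i + 1);         // invalid line: remember where
        }
        return result;
    }
}
```

A caller could show a single message listing InvalidLineNumbers instead of asking the user once per bad line.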
History
- 26th August, 2010: Initial post