Insert Spaces Between CrammedTogetherWords in .docx files using C# and the DocX Library

7 Jan 2014 · CPOL · 1 min read
Yet another docx tip - insert spaces between lowercase and uppercase chars

Theory

Sometimes you may end up with a .docx file that has words jammed and crammed together like this:

CuriousGeorgeWashingtonIrvingStone theCrowsNest ofRobberBaron vonMunchabaloneyB. WhiterShade ofPaleRiders

You want it, of course, to be like this instead:

Curious George Washington Irving Stone the Crows Nest of Robber Baron von Munchabaloney B. Whiter Shade of Pale Riders
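
For a plain string (outside of any .docx file), the before-and-after shown above can be sketched with a regular expression. This is only an illustration of the idea, not the DocX-based code this tip builds; the names `SpacingSketch` and `InsertSpaces` are made up for the example, and only the ASCII letters a-z and A-Z are considered.

```csharp
using System.Text.RegularExpressions;

public static class SpacingSketch
{
    // Hypothetical helper: puts a space between an ASCII lowercase letter
    // and the ASCII uppercase letter that immediately follows it.
    public static string InsertSpaces(string input)
    {
        return Regex.Replace(input, "([a-z])([A-Z])", "$1 $2");
    }
}
```

Calling `SpacingSketch.InsertSpaces("theCrowsNest")` returns "the Crows Nest".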

Automatically replacing all the lowercase-followed-by-uppercase-with-no-space-between-them instances is easy; I'll show you how. First, though:

Regression

Follow these steps to prepare for the code to follow:

  1. Download the DocX library (docx.dll).
  2. In your Visual Studio project, right-click References, select "Add Reference..." and add docx.dll to the project from wherever you saved it.
  3. Add this to your using section:
    C#
    using Novacode;

Practice

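Add these constants to the top of your class. They are the ASCII codes that bound the uppercase letters (A = 65, Z = 90) and the lowercase letters (a = 97, z = 122):

C#
const int firstCapPos = 65;     // 'A'
const int lastCapPos = 90;      // 'Z'
const int firstLowerPos = 97;   // 'a'
const int lastLowerPos = 122;   // 'z'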
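
If you would rather not look up character codes, the same ranges can be written with char literals. This is a minimal sketch under that assumption; `CharRangeSketch` and `LettersBetween` are hypothetical names and are not used by the rest of this tip.

```csharp
using System.Collections.Generic;

public static class CharRangeSketch
{
    // Same values as the numeric constants, written as char literals.
    public const int FirstCapPos = 'A';    // 65
    public const int LastCapPos = 'Z';     // 90
    public const int FirstLowerPos = 'a';  // 97
    public const int LastLowerPos = 'z';   // 122

    // Hypothetical helper: lists the characters whose codes run
    // from first to last, inclusive.
    public static List<char> LettersBetween(int first, int last)
    {
        var letters = new List<char>();
        for (int code = first; code <= last; code++)
        {
            letters.Add((char)code);
        }
        return letters;
    }
}
```

`CharRangeSketch.LettersBetween(65, 90)` yields the 26 uppercase letters A through Z, which is the range the inner loop of the helper method walks.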

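Add code like this (for example, in a button's click handler on a form that has an OpenFileDialog named openFileDialog1) to call the helper method that will insert spaces between lowercase chars and uppercase chars: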

C#
string filename = string.Empty;
DialogResult result = openFileDialog1.ShowDialog();
if (result == DialogResult.OK)
{
    filename = openFileDialog1.FileName;
}
else
{
    MessageBox.Show("No file selected - exiting");
    return;
}
InsertSpaceBetweenLowercaseAndUppercaseChars(filename);

Now add the helper method:

C#
private void InsertSpaceBetweenLowercaseAndUppercaseChars(string filename)
{
    using (DocX document = DocX.Load(filename))
    {
        // Try every lowercase letter (a-z) ...
        for (int i = firstLowerPos; i <= lastLowerPos; i++)
        {
            char lowerChar = (char)i;
            // ... followed by every uppercase letter (A-Z)
            for (int j = firstCapPos; j <= lastCapPos; j++)
            {
                char upperChar = (char)j;
                // e.g. "eC" becomes "e C"
                string originalStr = string.Format("{0}{1}", lowerChar, upperChar);
                string newStr = string.Format("{0} {1}", lowerChar, upperChar);
                document.ReplaceText(originalStr, newStr);
            }
        }
        // Write the changes back to the file that was loaded
        document.Save();
    }
}
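
To see what the nested loops do without opening a .docx file, the same pair-by-pair replacement can be run against an ordinary string. In this sketch `string.Replace` stands in for `document.ReplaceText`, and `PairLoopSketch` and `InsertSpaces` are hypothetical names. The loops make 26 x 26 = 676 replacement passes, one per lowercase/uppercase pair.

```csharp
public static class PairLoopSketch
{
    // Hypothetical stand-in for the DocX helper: works on a string
    // instead of a document.
    public static string InsertSpaces(string text)
    {
        for (char lower = 'a'; lower <= 'z'; lower++)
        {
            for (char upper = 'A'; upper <= 'Z'; upper++)
            {
                // e.g. "eC" becomes "e C"
                string originalStr = string.Format("{0}{1}", lower, upper);
                string newStr = string.Format("{0} {1}", lower, upper);
                text = text.Replace(originalStr, newStr);
            }
        }
        return text;
    }
}
```

For example, `PairLoopSketch.InsertSpaces("ofRobberBaron")` returns "of Robber Baron". Inserting a space never creates a new lowercase-then-uppercase pair, so the order in which the pairs are tried does not affect the result.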

Of course, you would only want to use this on "normal" English (prose, etc.). If you used it on code, you would end up changing "InsertSpaceBetweenLowercaseAndUppercaseChars" to "Insert Space Between Lowercase And Uppercase Chars", etc. - probably not what you want.
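
One way to guard against unwanted changes is to preview the matches before replacing anything. The sketch below works on a plain string and is not something DocX provides; `PairPreviewSketch` and `FindPairs` are hypothetical names, and only ASCII letters are considered.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PairPreviewSketch
{
    // Hypothetical helper: returns every lowercase+uppercase pair found
    // in text, in order of appearance, without modifying the text.
    public static List<string> FindPairs(string text)
    {
        var pairs = new List<string>();
        foreach (Match match in Regex.Matches(text, "[a-z][A-Z]"))
        {
            pairs.Add(match.Value);
        }
        return pairs;
    }
}
```

`PairPreviewSketch.FindPairs("an iPhone for Mr. McDonald")` returns "iP" and "cD", a reminder that even ordinary prose can contain names the replacement would split into "i Phone" and "Mc Donald".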

Everything Works in Theory

If this tip benefits you, be sure to try this one the next time you have the hiccups (or, even worse, the hiccoughs), as it is a surefire* cure for that malady: Run around the house three times without thinking of the word tiger.

* It has never been proven to be ineffective!

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)