Implementing Word Wrap in C#

Jonathan Wood

4 Jul 2012

The .NET platform makes it easy to send emails from your code. However, it was bothering me the other day that my emails had no word wrap.

In most cases, modern email readers will word wrap when email lines are too long. But there are still some email readers around that won't. Common practice is to wrap email lines, limiting their length to about 65-75 characters (RFC 5322 recommends that lines be no longer than 78 characters, excluding the line break). So I decided it was worth implementing word wrap in my code.

As rich as it is, the .NET platform does not appear to have any routines for implementing word wrap. I found some sample code online but, while the code was fairly simple (which is good), I didn't think it was very efficient.

The .NET platform provides many routines for parsing text and extracting substrings, but these generally involve allocating and moving lots of memory. So my approach was to write simple C# code that would word wrap the text without unnecessarily allocating additional objects.

Of course, I will need a new string in order to save my results. And since I'll be building that string line-by-line, I used the StringBuilder class for this. The StringBuilder class allows you to more efficiently build a string without allocating new strings each time you make a change. Listing 1 is the code I came up with.

// Requires: using System; and using System.Text; (for StringBuilder)

/// <summary>
/// Word wraps the given text to fit within the specified width.
/// </summary>
/// <param name="text">Text to be word wrapped</param>
/// <param name="width">Width, in characters, to which the text
/// should be word wrapped</param>
/// <returns>The modified text</returns>
public static string WordWrap(string text, int width)
{
    int pos, next;
    StringBuilder sb = new StringBuilder();

    // Lucidity check
    if (width < 1)
        return text;

    // Parse each line of text
    for (pos = 0; pos < text.Length; pos = next)
    {
        // Find end of line
        int eol = text.IndexOf(Environment.NewLine, pos);
        if (eol == -1)
            next = eol = text.Length;
        else
            next = eol + Environment.NewLine.Length;

        // Copy this line of text, breaking into smaller lines as needed
        if (eol > pos)
        {
            do
            {
                int len = eol - pos;
                if (len > width)
                    len = BreakLine(text, pos, width);
                sb.Append(text, pos, len);
                sb.Append(Environment.NewLine);

                // Trim whitespace following break
                pos += len;
                while (pos < eol && Char.IsWhiteSpace(text[pos]))
                    pos++;
            } while (eol > pos);
        }
        else sb.Append(Environment.NewLine); // Empty line
    }
    return sb.ToString();
}

/// <summary>
/// Locates position to break the given line so as to avoid
/// breaking words.
/// </summary>
/// <param name="text">String that contains line of text</param>
/// <param name="pos">Index where line of text starts</param>
/// <param name="max">Maximum line length</param>
/// <returns>The modified line length</returns>
private static int BreakLine(string text, int pos, int max)
{
    // Find last whitespace in line
    int i = max;
    while (i >= 0 && !Char.IsWhiteSpace(text[pos + i]))
        i--;

    // If no whitespace found, break at maximum length
    if (i < 0)
        return max;

    // Find start of whitespace
    while (i >= 0 && Char.IsWhiteSpace(text[pos + i]))
        i--;

    // Return length of text before whitespace
    return i + 1;
}

Listing 1: Word Wrap Code

The code starts by extracting each line from the original text. It does this by locating the hard line breaks already present in the text. Note that my code searches for Environment.NewLine, which on Windows systems is a carriage return, line feed pair (“\r\n”). Some platforms may only use “\n” or other variations for new lines, but the carriage return, line feed pair works in most cases on Windows systems. If you want the code to look for something else, replace Environment.NewLine in Listing 1 with the sequence you need.
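If the incoming text might contain a mix of line break styles, one option is to normalize it before calling WordWrap. The following is only a minimal sketch: NormalizeNewLines and the NewLineDemo class are hypothetical names that are not part of Listing 1, and the approach allocates intermediate strings, so it trades away some of the efficiency the rest of the code aims for.

```csharp
using System;

public static class NewLineDemo
{
    // Hypothetical helper (not part of Listing 1): converts "\r\n", lone "\r",
    // and lone "\n" line breaks to Environment.NewLine.
    public static string NormalizeNewLines(string text)
    {
        return text.Replace("\r\n", "\n")             // collapse CR LF pairs first
                   .Replace("\r", "\n")               // then any remaining lone CR
                   .Replace("\n", Environment.NewLine);
    }

    public static void Main()
    {
        string mixed = "line one\nline two\r\nline three\rline four";
        string normalized = NormalizeNewLines(mixed);

        // Every break is now Environment.NewLine, so splitting on it yields 4 lines.
        string[] lines = normalized.Split(new[] { Environment.NewLine }, StringSplitOptions.None);
        Console.WriteLine(lines.Length); // 4
    }
}
```

The order of the Replace calls matters: handling “\r\n” first keeps a pair from being turned into two line breaks.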

The code then copies each line to the result string. If a line is too long to fit within the specified width, then it is further broken into smaller lines. Each time through the loop, if the line needs to be broken, the BreakLine method is called to locate the last white space that fits within the maximum line length. This is done to try and break the line between words instead of in the middle of them.
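To make the behavior of BreakLine concrete, here is a small sketch that copies the method from Listing 1 and calls it on a few sample strings. The BreakLineDemo class and the sample strings are assumptions for illustration, and the method is made public only so the demo can call it.

```csharp
using System;

public static class BreakLineDemo
{
    // Copied from Listing 1; callers must ensure the line is longer than max.
    public static int BreakLine(string text, int pos, int max)
    {
        // Find last whitespace in line (index max is checked too, so a
        // line of exactly max characters followed by whitespace still fits)
        int i = max;
        while (i >= 0 && !Char.IsWhiteSpace(text[pos + i]))
            i--;

        // If no whitespace found, break at maximum length
        if (i < 0)
            return max;

        // Find start of whitespace
        while (i >= 0 && Char.IsWhiteSpace(text[pos + i]))
            i--;

        // Return length of text before whitespace
        return i + 1;
    }

    public static void Main()
    {
        string text = "The quick brown fox";

        // "brown" would cross column 10, so the line ends after "The quick".
        Console.WriteLine(BreakLine(text, 0, 10)); // 9

        // "The quick" is exactly 9 characters and is followed by a space.
        Console.WriteLine(BreakLine(text, 0, 9));  // 9

        // No whitespace within the width: break mid-word at the maximum.
        Console.WriteLine(BreakLine("Supercalifragilistic", 0, 10)); // 10
    }
}
```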

While the string object provides the LastIndexOf() method, which could be used to locate the last space character, I manually coded the loop myself so that I could use Char.IsWhiteSpace() to support every character that .NET classifies as white space (tabs, for example), not just the space character. If no whitespace is found, the line is simply broken at the maximum line length.
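The difference between the two approaches shows up as soon as words are separated by something other than a space. The sketch below is only an illustration: LastWhiteSpaceIndex and the WhiteSpaceDemo class are hypothetical names, not part of Listing 1.

```csharp
using System;

public static class WhiteSpaceDemo
{
    // Hypothetical helper: index of the last whitespace character in text,
    // or -1 if there is none.
    public static int LastWhiteSpaceIndex(string text)
    {
        for (int i = text.Length - 1; i >= 0; i--)
        {
            if (Char.IsWhiteSpace(text[i]))
                return i;
        }
        return -1;
    }

    public static void Main()
    {
        string text = "alpha\tbeta"; // the only separator is a tab

        Console.WriteLine(text.LastIndexOf(' '));     // -1: no space character
        Console.WriteLine(LastWhiteSpaceIndex(text)); // 5: the tab is found
    }
}
```

One consequence worth knowing: Char.IsWhiteSpace() also returns true for the no-break space (U+00A0), so code built on it will treat that character as a place where a line may be broken.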

As each line is broken, the code removes any white space at the break. This avoids trailing spaces on the current line or leading spaces on the next line. Although there is normally only one space between each word, the code tries to correctly handle cases where there might be more.
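The following sketch isolates that trimming step for a single break. SkipWhiteSpace and the TrimAtBreakDemo class are hypothetical names; the loop body mirrors the inner while loop in Listing 1, and the break length of 5 is what BreakLine returns for this sample text with a width of 7.

```csharp
using System;

public static class TrimAtBreakDemo
{
    // Hypothetical helper: returns the index of the first non-whitespace
    // character at or after pos, without going past eol.
    public static int SkipWhiteSpace(string text, int pos, int eol)
    {
        while (pos < eol && Char.IsWhiteSpace(text[pos]))
            pos++;
        return pos;
    }

    public static void Main()
    {
        string text = "first   second"; // three spaces between the words
        int len = 5;                    // length of "first"

        // The current line ends before the run of spaces...
        Console.WriteLine("[" + text.Substring(0, len) + "]"); // [first]

        // ...and the next line starts after it.
        int next = SkipWhiteSpace(text, len, text.Length);
        Console.WriteLine("[" + text.Substring(next) + "]");   // [second]
    }
}
```

Substring is used here only to display the result; Listing 1 avoids it by appending ranges of the original string directly to the StringBuilder.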

As each new line is created, a line break (Environment.NewLine, which is a carriage return, line feed pair on Windows) is also added to separate each line. This includes the final line, so the result always ends with a line break. Note the special case for handling when the line is empty, in which case we just write the line break.

There’s nothing complex about this code, but I took a little extra time to make it efficient. Note that the word wrap is based on the number of characters and not the display width. If you were, for example, word wrapping text output to the screen or printer, the code should probably test different line lengths measured on a device context in order to determine the display length.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
