Background
For my GenOmatic, I wanted the ability to allow the user to provide a format to use when outputting data. For this, I want to use value.ToString(format), and thereby allow the user to provide any format that's suitable for the datatype without requiring the utility to know anything about what that datatype is. This works fine for the simple formats that are supported by ToString, but I wanted to support more complex formats.
Example: Print an integer value in hexadecimal with the leading "0x". Ideally I'd use a format of "'0x'Xn" (where n represents a width). That can't be done with the provided int.ToString(format) method; you can use either a "Standard Numeric Format String", which includes "Xn" but not literal text, or a "Custom Numeric Format String", which includes literal text but not hex-digits.
One way to support more complex formats is to use string.Format instead: string.Format(format, value), in which case the format can be "0x{0:Xn}". This solution works, but it requires the user to provide the braces and such; things the user shouldn't need to know about.
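The three options described above can be compared side by side. This is only a small sketch with a width of 4; the class and method names are made up for illustration, and the invariant culture is used so that the results don't depend on the machine's settings.

```csharp
using System;
using System.Globalization;

// Hypothetical helper class, not part of the article's code.
public static class HexFormatSketch
{
    // Standard numeric format: hex digits, but no way to embed literal text.
    public static string StandardHex(int value)
    {
        return value.ToString("X4", CultureInfo.InvariantCulture);
    }

    // Custom numeric format: literal text is allowed, but the digits are decimal.
    public static string CustomWithLiteral(int value)
    {
        return value.ToString("'0x'0000", CultureInfo.InvariantCulture);
    }

    // Composite format: gives the desired result, but needs the {0:...} syntax.
    public static string Composite(int value)
    {
        return string.Format(CultureInfo.InvariantCulture, "0x{0:X4}", value);
    }
}
```

For the value 255, StandardHex yields "00FF", CustomWithLiteral yields "0x0255" (decimal digits, which is not what was wanted), and Composite yields "0x00FF".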
What I wanted was a way to allow the user to specify the "'0x'Xn" format and split the literal text apart from the format text and output them separately. Splitting the provided format string also allows the value to be formatted into the output more than once, which ToString doesn't support either. What follows is an Extension Method that provides this functionality.
After I published the first version of this article, a question came up in the C# forum asking about left-padding a value, but having the sign (+ or -) precede the padding. This can't be done with a single format with string.Format or WriteLine, so I set about adding this functionality to this method.
LibExt.ApplyFormat.cs
The Regular Expression
The method uses the following Regular Expression to split the provided format string into sub-strings of literal text and non-literal text which is assumed to be padding/justification information and valid formatting strings.
private static readonly System.Text.RegularExpressions.Regex regex =
new System.Text.RegularExpressions.Regex
(
    "'(?'Text'[^']*)'|" +
    "\"(?'Text'[^\"]*)\"|" +
    "(?'IsFormat'((?'Justify'[+\\-/]?)(?'MinWidth'\\d{0,4})" +
    "(,(?'MaxWidth'\\d{0,4})(,(?'PadChar'.?))?)?:)?(?'Format'[^'\"]*))"
,
    System.Text.RegularExpressions.RegexOptions.CultureInvariant
) ;
(If anyone has comments or suggestions for improvements to this regex, please let me know.)
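To see what the expression produces, the following sketch runs the same pattern (rewritten as verbatim strings) over a format string and lists the non-empty pieces. The class name, the Describe method, and the "text:" / "format:" labels are invented for this illustration; they are not part of LibExt.ApplyFormat.cs.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical demonstration class; only the pattern comes from the article.
public static class FormatSplitSketch
{
    private static readonly Regex Splitter = new Regex(
        @"'(?'Text'[^']*)'|" +
        @"""(?'Text'[^""]*)""|" +
        @"(?'IsFormat'((?'Justify'[+\-/]?)(?'MinWidth'\d{0,4})" +
        @"(,(?'MaxWidth'\d{0,4})(,(?'PadChar'.?))?)?:)?(?'Format'[^'""]*))",
        RegexOptions.CultureInvariant);

    // Returns one entry per literal-text piece or format piece, in order.
    public static List<string> Describe(string format)
    {
        List<string> pieces = new List<string>();
        foreach (Match m in Splitter.Matches(format))
        {
            if (m.Groups["IsFormat"].Value.Length != 0)
            {
                // Justify|MinWidth|MaxWidth|PadChar|Format
                pieces.Add("format:" + string.Join("|", new string[]
                {
                    m.Groups["Justify"].Value,
                    m.Groups["MinWidth"].Value,
                    m.Groups["MaxWidth"].Value,
                    m.Groups["PadChar"].Value,
                    m.Groups["Format"].Value
                }));
            }
            else if (m.Groups["Text"].Value.Length != 0)
            {
                pieces.Add("text:" + m.Groups["Text"].Value);
            }
            // Empty matches (for example at the end of the input) are skipped.
        }
        return pieces;
    }
}
```

Describe("'0x'X4") gives "text:0x" followed by "format:||||X4"; the padding groups are all empty because the optional padding prefix requires a colon.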
A Note on Padding/Justification
A format may now be prefixed with padding information in the following format:
[ Justify ] [ MinWidth ] [ , [ MaxWidth ] [ , [ PadChar ] ] ] :
Justify
A character to specify where to add padding characters
- A plus-sign (+) to right-justify (left-pad); this is the default
- A minus-sign (-) to left-justify (right-pad)
- A slash (/) to center; if an odd number of padding characters is required, the extra one will be appended to the end.
MinWidth
Zero to four decimal digits to specify the minimum width for the formatted value; the formatted value will be padded to this length as necessary. The default is 0.
MaxWidth
Zero to four decimal digits to specify the maximum width for the formatted value; must be preceded by a comma (,). The formatted value will be truncated to this length as necessary. The default is int.MaxValue.
PadChar
The character to use for padding; must be preceded by a comma (,). The default is a SPACE.
:
The padding information is terminated with a colon (:)
For Example:
"/10,,=:0" -- Center the value within a string of at least ten characters, padded with equal-signs (=); applying this to 123 yields "===123===="
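The centering rule (extra padding character at the end) comes down to a little integer arithmetic. The sketch below isolates it; the class and method names are hypothetical, but the PadRight/PadLeft combination is the same one used in the Formatter class shown later.

```csharp
// Hypothetical stand-alone helper illustrating the centering arithmetic.
public static class CenterSketch
{
    public static string Center(string text, int minWidth, char padChar)
    {
        if (text.Length >= minWidth)
        {
            return text;    // nothing to pad
        }

        // Integer division rounds down, so the left side gets the smaller half
        // and the right side receives the extra character when the gap is odd.
        int widthAfterRightPad = minWidth - (minWidth - text.Length) / 2;

        return text.PadRight(widthAfterRightPad, padChar)
                   .PadLeft(minWidth, padChar);
    }
}
```

Center("123", 10, '=') needs seven padding characters: three go in front and four at the end, giving "===123====".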
Using XML
Once I had all that working, I decided to add support for using XML to specify the format and padding/justification information... just because.
The element's name must be Formatter, the Format goes in the element's value, and the following Attributes are supported:
Justify
Where to add padding characters
- The word "Right" or a plus-sign (+) to right-justify (left-pad); this is the default
- The word "Left" or a minus-sign (-) to left-justify (right-pad)
- The word "Center" or a slash (/) to center; if an odd number of padding characters is required, the extra one will be appended to the end.
MinWidth
Decimal digits to specify the minimum width for the formatted value; the formatted value will be padded to this length as necessary. The default is 0.
MaxWidth
Decimal digits to specify the maximum width for the formatted value; the formatted value will be truncated to this length as necessary. The default is int.MaxValue.
PadChar
The character to use for padding; the default is a SPACE.
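The Formatter element for the above example could be written in XML as:
<Formatter Justify="Center" MinWidth="10" PadChar="=">0</Formatter>
Note that InterpretFormat processes the children of the element it is given, so a Formatter element has to be wrapped in a root element (such as the Format element used in the example under Using the Code).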
Interpretation Helpers
The following definitions help with the caching of the interpretations:
private interface IFormat
{
    string Format ( object Value , System.IFormatProvider FormatProvider ) ;
}

private sealed class FormatList : System.Collections.Generic.List<IFormat> {}

private sealed class FormatDictionary : System.Collections.Generic.Dictionary<string,FormatList> {}

private static readonly FormatDictionary knownformats = new FormatDictionary() ;
The Extension Method
The first version of this method was very simple, but now it has to interpret the padding/justification information. I chose to perform that interpretation once and cache the results. It also needs to detect XML. I'll get to the details of all that in a bit.
The string version of this method does the following:
- If the Value is null, throw an ArgumentNullException
- If no Format is provided, return a simple ToString of the Value
- If we don't have the Format in the cache, interpret it and add it to the cache
- For each piece of the interpreted format, append the result of Format to the result
public static string
ApplyFormat
(
    this object            Value
,
    string                 Format
,
    System.IFormatProvider FormatProvider
)
{
    if ( Value == null )
    {
        throw ( new System.ArgumentNullException
        (
            "Value"
        ,
            "Value must not be null"
        ) ) ;
    }

    if ( string.IsNullOrEmpty ( Format ) )
    {
        return ( Value.ToString() ) ;
    }

    /* Interpret each distinct format only once */
    if ( !knownformats.ContainsKey ( Format ) )
    {
        knownformats [ Format ] = InterpretFormat ( Format ) ;
    }

    System.Text.StringBuilder result = new System.Text.StringBuilder() ;

    foreach ( IFormat format in knownformats [ Format ] )
    {
        result.Append ( format.Format
        (
            Value
        ,
            FormatProvider
        ) ) ;
    }

    return ( result.ToString() ) ;
}
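The cache used above is a plain static Dictionary, which is not safe to modify from several threads at once. If that matters for a given application, one possible variation (an assumption of this sketch, not something the article's code does) is to keep the interpret-once idea but store the results in a ConcurrentDictionary. The Interpret method here is only a stand-in for InterpretFormat, and all names are hypothetical.

```csharp
using System.Collections.Concurrent;

// Hypothetical sketch of an interpret-once cache that tolerates concurrent callers.
public static class InterpretOnceCacheSketch
{
    private static readonly ConcurrentDictionary<string, string[]> Cache =
        new ConcurrentDictionary<string, string[]>();

    // Stand-in for InterpretFormat: any potentially expensive parsing step.
    private static string[] Interpret(string format)
    {
        return format.Split(';');
    }

    public static string[] GetPieces(string format)
    {
        // GetOrAdd returns the cached value when the key is present.
        // Under contention the factory may run more than once,
        // but only one result is stored and returned to all callers.
        return Cache.GetOrAdd(format, Interpret);
    }
}
```

Calling GetPieces twice with the same format returns the same array instance, which shows that the second call was served from the cache.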
The XML-handling overload of this method is very similar; it treats an element with no child nodes as "no Format" and uses the element's OuterXml as the cache key:
- If the Value is null, throw an ArgumentNullException
- If no Format is provided (the element is null or has no child nodes), return a simple ToString of the Value
- If we don't have the Format's OuterXml in the cache, interpret the element and add the result to the cache
- For each piece of the interpreted format, append the result of Format to the result
public static string
ApplyFormat
(
    this object            Value
,
    System.Xml.XmlElement  Format
,
    System.IFormatProvider FormatProvider
)
{
    if ( Value == null )
    {
        throw ( new System.ArgumentNullException
        (
            "Value"
        ,
            "Value must not be null"
        ) ) ;
    }

    if ( ( Format == null ) || ( Format.ChildNodes.Count == 0 ) )
    {
        return ( Value.ToString() ) ;
    }

    /* The element's text serves as the cache key */
    string temp = Format.OuterXml ;

    if ( !knownformats.ContainsKey ( temp ) )
    {
        knownformats [ temp ] = InterpretFormat ( Format ) ;
    }

    System.Text.StringBuilder result = new System.Text.StringBuilder() ;

    foreach ( IFormat format in knownformats [ temp ] )
    {
        result.Append ( format.Format
        (
            Value
        ,
            FormatProvider
        ) ) ;
    }

    return ( result.ToString() ) ;
}
Yes, you can believe your eyes; I have two return statements in there. I felt this was a reasonable situation for it.
There are also overloads of these methods that don't require an IFormatProvider.
InterpretFormat
There are two overloads of InterpretFormat; one for string, and one for XML.
The string version has to check for XML; if the string contains well-formed XML, the resultant XmlElement is passed along to the XML version. Otherwise, the usual matching to the Regular Expression is performed and appropriate instances of IFormat are added to the FormatList:
private static FormatList
InterpretFormat
(
    string Format
)
{
    FormatList result = null ;

    try
    {
        /* If the string loads as XML, hand it to the XML version */
        System.Xml.XmlDocument doc = new System.Xml.XmlDocument() ;

        doc.LoadXml ( Format ) ;

        result = InterpretFormat ( doc.DocumentElement ) ;
    }
    catch ( System.Xml.XmlException )
    {
        /* Not XML; split it with the Regular Expression */
        result = new FormatList() ;

        foreach
        (
            System.Text.RegularExpressions.Match mat
        in
            regex.Matches ( Format )
        )
        {
            if ( mat.Groups [ "IsFormat" ].Value.Length != 0 )
            {
                result.Add ( new Formatter ( mat ) ) ;
            }
            else
            {
                if ( mat.Groups [ "Text" ].Value.Length != 0 )
                {
                    result.Add ( new Text ( mat ) ) ;
                }
            }
        }
    }

    return ( result ) ;
}
The XML version interprets the children of the provided Format element:
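private static FormatList
InterpretFormat
(
    System.Xml.XmlElement Format
)
{
    FormatList result = new FormatList() ;

    /* ChildNodes may also hold text or comment nodes, so iterate as XmlNode; */
    /* casting those straight to XmlElement would throw InvalidCastException  */
    foreach
    (
        System.Xml.XmlNode node
    in
        Format.ChildNodes
    )
    {
        System.Xml.XmlElement ele = node as System.Xml.XmlElement ;

        if ( ele == null )
        {
            if ( node is System.Xml.XmlComment )
            {
                continue ;
            }

            throw ( new System.InvalidOperationException
                ( "Unrecognized format type: " + node.OuterXml ) ) ;
        }

        switch ( ele.Name )
        {
            case "Formatter" :
            {
                result.Add ( new Formatter ( ele ) ) ;

                break ;
            }

            case "Text" :
            {
                if ( ele.InnerText.Length != 0 )
                {
                    result.Add ( new Text ( ele ) ) ;
                }

                break ;
            }

            default :
            {
                throw ( new System.InvalidOperationException
                    ( "Unrecognized format type: " + ele.OuterXml ) ) ;
            }
        }
    }

    return ( result ) ;
}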
Formatter
The Formatter class stores an interpreted format and padding/justification and applies those settings to a Value. It's a little long, so I'll present it in pieces.
The constructor that takes a RegularExpressions.Match can assume that all the Groups exist, but may be empty:
private sealed class Formatter : IFormat
{
    private enum Justify
    {
        Right
    ,
        Left
    ,
        Center
    }

    private readonly Justify justify ;
    private readonly int     min     ;
    private readonly int     max     ;
    private readonly char    padchar ;
    private readonly string  format  ;

    public Formatter
    (
        System.Text.RegularExpressions.Match Match
    )
    {
        switch ( Match.Groups [ "Justify" ].Value )
        {
            case "-" :
            {
                this.justify = Justify.Left ;

                break ;
            }

            case "/" :
            {
                this.justify = Justify.Center ;

                break ;
            }

            default :
            {
                this.justify = Justify.Right ;

                break ;
            }
        }

        /* An empty MinWidth fails to parse and leaves min at 0 */
        int.TryParse ( Match.Groups [ "MinWidth" ].Value , out this.min ) ;

        if ( !int.TryParse ( Match.Groups [ "MaxWidth" ].Value , out this.max ) )
        {
            this.max = int.MaxValue ;
        }

        this.padchar = Match.Groups [ "PadChar" ].Value.Length == 0
            ? ' '
            : Match.Groups [ "PadChar" ].Value [ 0 ] ;

        this.format = Match.Groups [ "Format" ].Value ;

        return ;
    }
The constructor that takes an XmlElement needs to check for the Attributes before interpreting the Values:
    public Formatter
    (
        System.Xml.XmlElement Element
    )
    {
        if ( Element.Attributes [ "Justify" ] != null )
        {
            switch ( Element.Attributes [ "Justify" ].Value )
            {
                case "-" :
                case "Left" :
                {
                    this.justify = Justify.Left ;

                    break ;
                }

                case "/" :
                case "Center" :
                {
                    this.justify = Justify.Center ;

                    break ;
                }

                default :
                {
                    this.justify = Justify.Right ;

                    break ;
                }
            }
        }
        else
        {
            this.justify = Justify.Right ;
        }

        if ( Element.Attributes [ "MinWidth" ] != null )
        {
            int.TryParse ( Element.Attributes [ "MinWidth" ].Value , out this.min ) ;
        }
        else
        {
            this.min = 0 ;
        }

        if ( Element.Attributes [ "MaxWidth" ] != null )
        {
            if ( !int.TryParse
                ( Element.Attributes [ "MaxWidth" ].Value , out this.max ) )
            {
                this.max = int.MaxValue ;
            }
        }
        else
        {
            this.max = int.MaxValue ;
        }

        if ( Element.Attributes [ "PadChar" ] != null )
        {
            this.padchar = Element.Attributes [ "PadChar" ].Value.Length == 0
                ? ' '
                : Element.Attributes [ "PadChar" ].Value [ 0 ] ;
        }
        else
        {
            this.padchar = ' ' ;
        }

        this.format = Element.InnerText ;

        return ;
    }
The Format method uses the stored values to apply the requested formatting:
- If the Value is IFormattable and a format was specified, call ToString(format, FormatProvider) on the Value; otherwise call the parameterless ToString().
- If the resultant string is shorter than the specified MinWidth, then perform the requested padding/justification.
- If the resultant string is longer than the specified MaxWidth, then truncate it.
    public string
    Format
    (
        object                 Value
    ,
        System.IFormatProvider FormatProvider
    )
    {
        string result ;

        if
        (
            ( Value is System.IFormattable )
        &&
            ( this.format.Length != 0 )
        )
        {
            result = ( (System.IFormattable) Value).ToString
            (
                this.format
            ,
                FormatProvider
            ) ;
        }
        else
        {
            result = Value.ToString() ;
        }

        if ( result.Length < this.min )
        {
            switch ( this.justify )
            {
                case Justify.Left :
                {
                    result = result.PadRight
                    (
                        this.min
                    ,
                        this.padchar
                    ) ;

                    break ;
                }

                case Justify.Center :
                {
                    /* Pad the right side first; it gets the extra character */
                    result = result.PadRight
                    (
                        this.min - ( this.min - result.Length ) / 2
                    ,
                        this.padchar
                    ).PadLeft
                    (
                        this.min
                    ,
                        this.padchar
                    ) ;

                    break ;
                }

                default :
                {
                    result = result.PadLeft
                    (
                        this.min
                    ,
                        this.padchar
                    ) ;

                    break ;
                }
            }
        }

        if ( result.Length > this.max )
        {
            result = result.Substring
            (
                0
            ,
                this.max
            ) ;
        }

        return ( result ) ;
    }
}
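The first step of the Format method, choosing between the two ToString overloads, can be tried on its own. The following is a minimal sketch with hypothetical names; it only covers the IFormattable check, not the padding or truncation.

```csharp
using System;

// Hypothetical helper showing only the IFormattable dispatch.
public static class FormattableSketch
{
    public static string Apply(object value, string format, IFormatProvider provider)
    {
        IFormattable formattable = value as IFormattable;

        // Only types that implement IFormattable understand a format string.
        if (formattable != null && !string.IsNullOrEmpty(format))
        {
            return formattable.ToString(format, provider);
        }

        // Everything else (string, for instance) falls back to plain ToString().
        return value.ToString();
    }
}
```

With the invariant culture, Apply(255, "X4", ...) gives "00FF", while Apply("abc", "X4", ...) simply gives "abc" because System.String does not implement IFormattable.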
Text
Text is a much simpler class; it only has to store the text from a Match or an XmlElement and echo it back:
private sealed class Text : IFormat
{
    private readonly string text ;

    public Text
    (
        System.Text.RegularExpressions.Match Match
    )
    {
        this.text = Match.Groups [ "Text" ].Value ;

        return ;
    }

    public Text
    (
        System.Xml.XmlElement Element
    )
    {
        this.text = Element.InnerText ;

        return ;
    }

    /* Literal text ignores the Value and the FormatProvider */
    public string
    Format
    (
        object                 Value
    ,
        System.IFormatProvider FormatProvider
    )
    {
        return ( this.text ) ;
    }
}
Something to bear in mind about XML is that if the element's value contains only whitespace, you'll need to give it the xml:space="preserve" attribute.
For example:
<Text xml:space="preserve"> </Text>
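The effect of the attribute can be checked with a few lines of code. This sketch assumes an XmlDocument with its default settings (PreserveWhitespace left at false), which is how the string overload of InterpretFormat loads XML; the class and method names are hypothetical.

```csharp
using System.Xml;

// Hypothetical helper: load a one-element document and return the root's text.
public static class WhitespaceTextSketch
{
    public static string InnerTextOf(string xml)
    {
        XmlDocument doc = new XmlDocument();   // PreserveWhitespace defaults to false
        doc.LoadXml(xml);
        return doc.DocumentElement.InnerText;
    }
}
```

InnerTextOf("<Text> </Text>") should come back empty, because the whitespace-only content is dropped while loading, whereas InnerTextOf("<Text xml:space='preserve'> </Text>") should return the single space.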
FormatProvider.cs
Another way to apply custom formats to values is to define a class that implements ICustomFormatter and IFormatProvider. See the .NET documentation for ICustomFormatter for more information.
The following is such a class; it simply calls the above method:
public class
FormatProvider : System.IFormatProvider , System.ICustomFormatter
{
    public object
    GetFormat
    (
        System.Type Service
    )
    {
        return ( ( Service == typeof(System.ICustomFormatter) ) ? this : null ) ;
    }

    public string
    Format
    (
        string                 Format
    ,
        object                 Value
    ,
        System.IFormatProvider Provider
    )
    {
        return ( Value.ApplyFormat ( Format , Provider ) ) ;
    }
}
However, I don't expect to use this class very much; it's really only useful from string.Format and if you're using that, you can use other formats.
Using the Code
The zip file contains the two files described above and ApplyFormatTest.cs, which takes an integer value and a format string from the command-line and attempts to perform a format with the above classes:
int val ;

if ( int.TryParse ( args [ 0 ] , out val ) )
{
    System.Console.WriteLine ( val.ApplyFormat ( args [ 1 ] ) ) ;

    System.Console.WriteLine ( string.Format
        ( new PIEBALD.Types.FormatProvider() , "{0:" + args [ 1 ] + "}" , val ) ) ;
}
I've been building this with:
csc ApplyFormatTest.cs LibExt.ApplyFormat.cs FormatProvider.cs
Here are two examples of using ApplyFormatTest that result in the same output:
C:\Projects\CodeProject\Temp>ApplyFormatTest 123 "'<'/10,,=:0'>'"
<===123====>
<===123====>
C:\Projects\CodeProject\Temp>ApplyFormatTest 123 "<Format><Text>&lt;</Text><Formatter Justify='Center' MinWidth='10' PadChar='='>0</Formatter><Text>&gt;</Text></Format>"
<===123====>
<===123====>
Obviously, the XML version is more verbose, but if you happen to be including the format in an XML file anyway, it may be a reasonable way to go about it.
Conclusion
The methods presented achieve the desired results along with providing more flexibility than required. ApplyFormat may be used in many situations where a format string is not hard-coded into the application. A format string may be stored in a configuration file or in a database, the user may select one from a DropDown, etc. It can be more complex than ToString allows, yet simpler than its string.Format equivalent.
History
- 2009-02-01 First submitted
- 2009-03-06 Added support for padding/justification and XML