Background
In the never-ending debate over whether to use SPACEs or TABs for indenting code, I prefer SPACEs. Having mentioned that I had written an Untabify utility, I was asked for an article. I have now rewritten the code in C#, and added a companion Tabify utility for readers who are in that camp. I implemented the methods as Extension Methods. I'm not a fan of Extension Methods, but this seemed like a reasonable use for them.
Another implementation choice was to keep the general layout of the two methods similar. At first, I wrote Untabify with a switch, but the switch was problematic in Tabify, so I changed both to use nested ifs.
Both methods use a StringBuilder, and stop processing the string as soon as a character other than SPACE or TAB, or the end of the line, is reached; the rest of the line is appended unchanged.
Untabify method
The Untabify method iterates the string, appending any SPACEs it encounters, and replacing any TAB characters with an appropriate number of SPACEs to reach the next tab stop. If the caller specifies four SPACEs per indent and a string begins with [ SPACE SPACE TAB ], then the TAB will be replaced by two SPACEs.
The caller may also specify zero SPACEs per indent to simply remove TABs from the leading whitespace.
public static string
Untabify
(
    this string String
,
    byte        SpacesPerIndent
)
{
    System.Text.StringBuilder result =
        new System.Text.StringBuilder ( String.Length ) ;

    int offset = 0 ;

    // Only the leading SPACEs and TABs are processed
    while ( offset < String.Length )
    {
        if ( String [ offset ] == ' ' )
        {
            result.Append ( ' ' ) ;
            offset++ ;
        }
        else
        {
            if ( String [ offset ] == '\t' )
            {
                // Pad to the next tab stop; zero simply drops the TAB
                if ( SpacesPerIndent > 0 )
                {
                    for ( int i = result.Length % SpacesPerIndent ;
                          i < SpacesPerIndent ; i++ )
                    {
                        result.Append ( ' ' ) ;
                    }
                }

                offset++ ;
            }
            else
            {
                break ;
            }
        }
    }

    // Copy the remainder of the line unchanged
    result.Append ( String.Substring ( offset ) ) ;

    return ( result.ToString() ) ;
}
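The tab-stop arithmetic in the inner for loop can be looked at in isolation. The following is a minimal sketch and not part of the library; the names TabStopDemo and SpacesToNextStop are made up for illustration. It computes how many SPACEs a TAB expands to, given how many characters have already been written to the result.

```csharp
namespace TabStopDemo
{
    public static class TabStopDemo
    {
        // Hypothetical helper (not in the article's library): the number of
        // SPACEs needed to get from Column to the next tab stop. It gives the
        // same count as the for loop in Untabify, which runs from
        // ( Column % SpacesPerIndent ) up to SpacesPerIndent.
        public static int
        SpacesToNextStop
        (
            int  Column
        ,
            byte SpacesPerIndent
        )
        {
            if ( SpacesPerIndent == 0 )
            {
                return ( 0 ) ; // zero means "just drop the TAB"
            }

            return ( SpacesPerIndent - ( Column % SpacesPerIndent ) ) ;
        }

        public static void
        Main
        (
        )
        {
            // [ SPACE SPACE TAB ] at four SPACEs per indent: the TAB starts at column 2
            System.Console.WriteLine ( SpacesToNextStop ( 2 , 4 ) ) ; // prints 2

            // A TAB that starts exactly on a tab stop expands to a full indent
            System.Console.WriteLine ( SpacesToNextStop ( 4 , 4 ) ) ; // prints 4

            // Zero SPACEs per indent: the TAB contributes nothing
            System.Console.WriteLine ( SpacesToNextStop ( 3 , 0 ) ) ; // prints 0
        }
    }
}
```

Note that a TAB sitting exactly on a tab stop still produces a full indent rather than nothing, which is what keeps [ TAB TAB ] two levels deep after conversion.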
Tabify method
The Tabify method iterates the string, appending any TABs it encounters and removing any SPACE characters. If the caller specifies four SPACEs per indent and a string begins with [ SPACE SPACE TAB ], then the two SPACEs will be removed.
A contiguous sequence of SPACEs of length equal to SpacesPerIndent will cause a TAB to be appended. If the caller specifies four SPACEs per indent and a string begins with [ SPACE SPACE SPACE SPACE ], then the four SPACEs will be replaced by a TAB.
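The caller may also specify zero SPACEs per indent to simply remove SPACEs from the leading whitespace. Note that any SPACEs after the last TAB are still subject to the mode, so Retain will write them back; use Truncate to drop them.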
(See below for information on TabifyMode.)
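public static string
Tabify
(
    this string String
,
    byte        SpacesPerIndent
,
    TabifyMode  Mode
)
{
    System.Text.StringBuilder result =
        new System.Text.StringBuilder ( String.Length ) ;

    int offset = 0 ;
    int spaces = 0 ; // SPACEs seen since the last TAB was appended

    // Only the leading SPACEs and TABs are processed
    while ( offset < String.Length )
    {
        if ( String [ offset ] == ' ' )
        {
            // A full indent's worth of SPACEs becomes one TAB
            if ( ++spaces == SpacesPerIndent )
            {
                result.Append ( '\t' ) ;
                spaces = 0 ;
            }

            offset++ ;
        }
        else
        {
            if ( String [ offset ] == '\t' )
            {
                // A TAB swallows any SPACEs counted so far
                result.Append ( '\t' ) ;
                spaces = 0 ;
                offset++ ;
            }
            else
            {
                break ;
            }
        }
    }

    // Decide what to do with the "extra" SPACEs; Truncate drops them
    switch ( Mode )
    {
        case TabifyMode.Retain :
        {
            while ( spaces-- > 0 )
            {
                result.Append ( ' ' ) ;
            }

            break ;
        }

        case TabifyMode.Extend :
        {
            // Only add a TAB when there are extra SPACEs to stand in for;
            // otherwise every line would gain an indent level
            if ( spaces > 0 )
            {
                result.Append ( '\t' ) ;
            }

            break ;
        }
    }

    // Copy the remainder of the line unchanged
    result.Append ( String.Substring ( offset ) ) ;

    return ( result.ToString() ) ;
}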
TabifyMode enumeration
The TabifyMode enumeration controls what Tabify does with "extra" SPACEs.
If the caller specifies four SPACEs per indent and a string begins with [ SPACE SPACE SPACE SPACE SPACE SPACE NON-WHITESPACE ], the first four SPACEs cause one TAB to be appended, but what does the caller want to do with the other two?
Visual Studio's tabify feature will leave the two SPACEs in place, so that's the default behavior (Retain). The caller may specify Truncate to remove them, or Extend to append a TAB in their place.
public enum TabifyMode
{
    Retain
,
    Truncate
,
    Extend
}
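To see the three modes side by side without the scanning loop, here is a minimal sketch. It is not the Tabify method itself: it handles only a run of leading SPACEs (no TABs), and the names ModeDemo, DemoMode, Indent and Show are invented for the example. In this sketch Extend adds a TAB only when there are leftover SPACEs, which is an assumption based on the description above.

```csharp
namespace ModeDemo
{
    // Stand-in for TabifyMode so that the sketch compiles on its own
    public enum DemoMode { Retain , Truncate , Extend }

    public static class ModeDemo
    {
        // Hypothetical helper: builds the leading whitespace for a line
        // that started with SpaceCount SPACEs and no TABs.
        public static string
        Indent
        (
            int      SpaceCount
        ,
            byte     SpacesPerIndent
        ,
            DemoMode Mode
        )
        {
            int tabs  = 0 ;
            int extra = SpaceCount ;

            if ( SpacesPerIndent > 0 )
            {
                tabs  = SpaceCount / SpacesPerIndent ; // complete indents
                extra = SpaceCount % SpacesPerIndent ; // the "extra" SPACEs
            }

            string result = new string ( '\t' , tabs ) ;

            switch ( Mode )
            {
                case DemoMode.Retain :
                {
                    result += new string ( ' ' , extra ) ;
                    break ;
                }

                case DemoMode.Extend :
                {
                    if ( extra > 0 )
                    {
                        result += "\t" ;
                    }

                    break ;
                }

                // DemoMode.Truncate: nothing to add, the extras are dropped
            }

            return ( result ) ;
        }

        // Makes the whitespace visible: TAB as \t, SPACE as a dot
        public static string
        Show
        (
            string Text
        )
        {
            return ( "[" + Text.Replace ( "\t" , "\\t" ).Replace ( ' ' , '.' ) + "]" ) ;
        }

        public static void
        Main
        (
        )
        {
            // Six leading SPACEs at four per indent: one TAB plus two extras
            System.Console.WriteLine ( Show ( Indent ( 6 , 4 , DemoMode.Retain   ) ) ) ; // [\t..]
            System.Console.WriteLine ( Show ( Indent ( 6 , 4 , DemoMode.Truncate ) ) ) ; // [\t]
            System.Console.WriteLine ( Show ( Indent ( 6 , 4 , DemoMode.Extend   ) ) ) ; // [\t\t]
        }
    }
}
```

Retain preserves alignment that was deliberately made with SPACEs, Truncate rounds the indent down, and Extend rounds it up.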
Using the code
Using these methods is quite simple; they're Extension Methods, so add an appropriate using directive and use them as if they belong to the string class. I prefer to put each Extension Method (or group of overloaded ones) I write in its own namespace, so my using directives specify exactly which methods I'm importing.
using PIEBALD.Lib.LibExt.Untabify ;
using PIEBALD.Lib.LibExt.Tabify ;
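string s = " \t" ;
// Strings are immutable, so each method returns a new string
string untabbed = s.Untabify ( 4 ) ;
string tabbed   = s.Tabify ( 4 ) ; // Retain is the default mode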
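The effect of giving each Extension Method its own namespace can be shown with a small sketch. Everything in it is hypothetical (the Demo.* namespaces, the ShoutExt and WhisperExt classes and their methods are invented for illustration and are not part of the library); the point is only that a using directive brings in one method and nothing else.

```csharp
namespace Demo.Ext.Shout
{
    public static class ShoutExt
    {
        // Hypothetical Extension Method living alone in its namespace
        public static string
        Shout
        (
            this string Text
        )
        {
            return ( Text.ToUpperInvariant() + "!" ) ;
        }
    }
}

namespace Demo.Ext.Whisper
{
    public static class WhisperExt
    {
        // A second hypothetical Extension Method in a separate namespace
        public static string
        Whisper
        (
            this string Text
        )
        {
            return ( Text.ToLowerInvariant() ) ;
        }
    }
}

namespace Demo.App
{
    using Demo.Ext.Shout ; // imports Shout only

    public static class Program
    {
        public static void
        Main
        (
        )
        {
            System.Console.WriteLine ( "tabs".Shout() ) ; // prints TABS!

            // "TABS".Whisper() would not compile here, because there is
            // no using directive for Demo.Ext.Whisper in this namespace.
        }
    }
}
```

The cost of this layout is one using directive per method; the benefit is that IntelliSense and overload resolution only ever see the extensions that were asked for.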
Untabify utility
This is a very simple console application that reads the lines in a file, right-trims and untabifies each, then writes each out to another file.
It can be built with csc Untabify.cs LibExt.Untabify.cs.
namespace Untabify
{
    using PIEBALD.Lib.LibExt.Untabify ;

    public static class Untabify
    {
        [System.STAThreadAttribute()]
        public static int
        Main
        (
            string[] args
        )
        {
            int result = 0 ;

            try
            {
                if ( args.Length == 3 )
                {
                    using ( System.IO.TextReader tr =
                        new System.IO.StreamReader ( args [ 0 ] ) )
                    {
                        using ( System.IO.TextWriter tw =
                            new System.IO.StreamWriter ( args [ 1 ] ) )
                        {
                            byte spt = byte.Parse ( args [ 2 ] ) ;

                            string line ;

                            while ( ( line = tr.ReadLine() ) != null )
                            {
                                tw.WriteLine ( line.TrimEnd().Untabify ( spt ) ) ;
                            }
                        }
                    }
                }
                else
                {
                    System.Console.WriteLine ( "Syntax: Untabify" +
                        " infile outfile spacesperindent" ) ;
                }
            }
            catch ( System.Exception err )
            {
                while ( err != null )
                {
                    System.Console.WriteLine ( err ) ;

                    err = err.InnerException ;
                }
            }

            return ( result ) ;
        }
    }
}
Tabify utility
This is a very simple console application that reads the lines in a file, right-trims and tabifies each, then writes each out to another file.
It can be built with csc Tabify.cs LibExt.Tabify.cs.
namespace Tabify
{
    using PIEBALD.Lib.LibExt.Tabify ;

    public static class Tabify
    {
        [System.STAThreadAttribute()]
        public static int
        Main
        (
            string[] args
        )
        {
            int result = 0 ;

            try
            {
                if ( args.Length == 3 )
                {
                    using ( System.IO.TextReader tr =
                        new System.IO.StreamReader ( args [ 0 ] ) )
                    {
                        using ( System.IO.TextWriter tw =
                            new System.IO.StreamWriter ( args [ 1 ] ) )
                        {
                            byte spt = byte.Parse ( args [ 2 ] ) ;

                            string line ;

                            while ( ( line = tr.ReadLine() ) != null )
                            {
                                tw.WriteLine ( line.TrimEnd().Tabify ( spt ) ) ;
                            }
                        }
                    }
                }
                else
                {
                    System.Console.WriteLine ( "Syntax: Tabify" +
                        " infile outfile spacesperindent" ) ;
                }
            }
            catch ( System.Exception err )
            {
                while ( err != null )
                {
                    System.Console.WriteLine ( err ) ;

                    err = err.InnerException ;
                }
            }

            return ( result ) ;
        }
    }
}
History
- 2009-01-14 - First submitted.