
A C#/.NET Attributes Based Command Line Argument Parser

10 Jan 2012
This article introduces an easy to use attribute/reflection based library for seamlessly parsing command line arguments for applications.

Introduction

Any non-trivial program will usually offer the user some options. For a Console application, these are conventionally specified by command line arguments, which are passed into the Main method of a C# application as an array of strings, e.g.:

static void Main(string[] args) { }

These arguments need parsing. This can be done by hand-coding a custom implementation, but that is error prone and becomes tedious after doing it for a few applications. Many programmers have faced this issue, which has led to various libraries, e.g. getopt for C and UNIX, with similar ports to Java and C#. The problem with getopt and similar approaches is that they only automate part of the process. They still usually require the developer to implement a processing loop, typically containing a large switch statement, to generate the usage string in the event of an error, and to perform various other setup procedures.

This article shows a different approach for C# using the NAttrArgs library. This removes the need for the developer to implement any parsing code. It also changes the programming model for handling command line arguments from an imperative to a declarative one. This means that all the developer needs to do is to specify which member variables, member properties and/or member functions should be set in response to a command line argument being present.

The change from imperative to declarative style is similar to that introduced by LINQ where queries moved from being about the implementation (imperative) to just declaring what the query should do (declarative). Whereas LINQ uses additional C# compiler syntactical sugar to make the process very easy for the developer, the NAttrArgs library as its name suggests relies upon .NET Attributes.

All the source code for the NAttrArgs library, the examples and binaries for the library are provided as accompanying zips. The source code is Open Source and is available on GitHub at https://github.com/petebarber/NAttrArgs.

Using the Code

The easiest way to see how NAttrArgs works is to start with a simple example.

class Program
{
    [NArg]
    private string _testArg;

    [NArg(IsOptional = true, AltName = "foo")]
    private bool _optionalFlag;

    static void Main(string[] args)
    {
        try
        {
            new Program().Run(args);
        }
        catch (NArgException e)
        {
            Console.Error.WriteLine(e.Message);
        }
    }

    void Run(string[] args)
    {
        new ArgParser<Program>("Example2").Parse(this, args);

        Console.WriteLine("_TestArg:{0}", _testArg);
        Console.WriteLine("_optionalFlag:{0}", _optionalFlag);
    }
}

This shows a simple program that has one required argument and an optional flag argument. Rather than having to add any special parsing code, the end result should be that if these command line arguments are present, then the variables decorated with the NArg attribute should be set appropriately.

Running this without any arguments generates the following output:

Usage: Example2 [-foo] <_testArg>

This is because the program expected a value to be supplied for _testArg. This was declared using an empty NArgAttribute. By default, an argument is considered mandatory unless the attribute parameter IsOptional is set to true. This is seen in the declaration for the second variable _optionalFlag.

There are two types of optional parameters. The first is a Boolean flag argument, which represents a setting that can be on or off, e.g. 'somecmd -recurse'. The second is an optional argument that takes a parameter, e.g. 'somecmd -bgcolour red'. In both cases, if the option isn't specified then the variable's value remains unchanged. In the example above, if '-foo' is specified, then _optionalFlag is set to true.

As no command line arguments were specified, the call to Parse() threw an exception as it expected to at least find one argument that could be used to set the value of _testArg. The usage message is constructed by NAttrArgs and is stored in the Message property of the thrown NArgException.

The format of the usage message is the name of the program as specified to the constructor of ArgParser, followed by the optional parameters, if any, then the mandatory parameters, if any, and finally an ellipsis if a member (of type string[]) has been specified to receive any unconsumed arguments; one hasn't in this example.

The name for each argument in the usage message is by default the name of the associated variable. As can be seen, the name _testArg is not very friendly. A different name can be associated with the argument through the use of the AltName option to the NArg attribute. This is used to display "foo" in the case of _optionalFlag.
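
The usage format described above can be sketched in a few lines. This is only an illustration of the layout, not the library's own code: the UsageSketch class and its Build method are hypothetical names, and the names passed in are assumed to be the display names (AltName or member name) already chosen.

```csharp
using System.Collections.Generic;

// Hypothetical helper; NAttrArgs builds its usage message internally.
public static class UsageSketch
{
    public static string Build(string program,
                               IEnumerable<string> optionalNames,
                               IEnumerable<string> mandatoryNames,
                               bool hasRemaining)
    {
        var parts = new List<string> { "Usage:", program };

        // Optional flags are shown in square brackets with a leading '-'.
        foreach (string name in optionalNames)
            parts.Add("[-" + name + "]");

        // Mandatory arguments are shown in angle brackets.
        foreach (string name in mandatoryNames)
            parts.Add("<" + name + ">");

        // The ellipsis only appears when a member consumes the remaining arguments.
        if (hasRemaining)
            parts.Add("[...]");

        return string.Join(" ", parts);
    }
}
```

Calling UsageSketch.Build("Example2", new[] { "foo" }, new[] { "_testArg" }, false) produces the same text as the usage message shown earlier for Example2.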

If instead, the example program is run with the command line arguments '-foo' and 'bar', it displays:

_TestArg:bar
_optionalFlag:True

What's happened is that the _testArg variable has been set to the value 'bar'. As the variable is of type string, the mapping from a command line argument string to a string variable is simple. In the case of _optionalFlag, there is no actual command line argument value to set. Instead, the presence of the option causes the NAttrArgs parsing code to set the value of this variable to true.

The types of the receiving variables aren't limited to the obvious cases of string and bool. They can be anything that the .NET Convert.ChangeType method can handle. In addition, conversions from bool to char are permitted, but that's only used internally when the variable type of an optional flag is char and the internal code attempts to set it to true. Also, if the target type (of the variable) is assignable from the source type, as determined by Type.IsAssignableFrom, then the value is assigned directly without conversion. This too was primarily intended for internal use when handling the special case where the remaining arguments are stored as a string[] and, rather than being assigned to a variable of this type, they are instead assigned to a property of type IEnumerable<string>.
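
A minimal sketch of this conversion rule, using only the .NET base class library. ConversionSketch and ConvertArg are hypothetical names and this is not the library's implementation; passing CultureInfo.InvariantCulture is an assumption made here so that '2.37' parses identically on every machine.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper illustrating the two conversion paths described above.
public static class ConversionSketch
{
    public static object ConvertArg(object value, Type targetType)
    {
        // Path 1: the target can hold the value as-is,
        // e.g. a string[] assigned to an IEnumerable<string>.
        if (targetType.IsAssignableFrom(value.GetType()))
            return value;

        // Path 2: fall back to Convert.ChangeType, e.g. string -> double.
        // The invariant culture is an assumption of this sketch.
        return Convert.ChangeType(value, targetType, CultureInfo.InvariantCulture);
    }
}
```

For instance, ConvertArg("2.37", typeof(double)) yields a boxed double, while ConvertArg(new[] { "a", "b" }, typeof(IEnumerable<string>)) returns the very same array instance.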

If a conversion is not possible, e.g. the type of the variable is double and the command line argument (which will always be a string) is the value 'bar' rather than, say, '2.37', then this will cause Convert.ChangeType to raise an exception. This will be caught within the ArgParser.Parse method, which will then throw an NArgException. As before, the usage message will be contained in the Message property, but in addition the original exception will be available from the InnerException property.
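
The catch-and-wrap pattern described here might look roughly like the following. UsageException is a hypothetical stand-in for the library's own exception type, ConvertOrThrowUsage is an invented name, and the set of caught exception types is an assumption about what Convert.ChangeType commonly throws for bad input.

```csharp
using System;
using System.Globalization;

// Hypothetical exception standing in for the library's exception type.
public class UsageException : Exception
{
    public UsageException(string usage, Exception inner) : base(usage, inner) { }
}

public static class SafeConvertSketch
{
    public static object ConvertOrThrowUsage(string arg, Type targetType, string usage)
    {
        try
        {
            return Convert.ChangeType(arg, targetType, CultureInfo.InvariantCulture);
        }
        catch (Exception e) when (e is FormatException
                                  || e is InvalidCastException
                                  || e is OverflowException)
        {
            // Message carries the usage text; the original failure stays
            // available to the caller via InnerException.
            throw new UsageException(usage, e);
        }
    }
}
```

With this sketch, converting 'bar' to double surfaces as a UsageException whose InnerException is the original FormatException.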

This may seem like an inelegant and incomplete way to handle this situation. In fact, it removes the burden of handling it from the developer. By specifying the type of the variable to receive the value, there is an implicit contract that the command line argument string value must be convertible. If it's not, then that's a usage error. The usage message does not specify what the acceptable values are. For any number, this would be hard given the infinite range. It may be possible to specify an acceptable format. However, by design and convention, usage messages do not usually convey this information. Instead it's left to the application documentation.

NAttrArgs does however have an option to capture incorrect values prior to conversion and display the acceptable values in the usage messages. This is achieved by using the AllowedValues argument to the NArg attribute. This takes a string array of the acceptable values. These values are checked prior to any type conversion. The example below is a modified version of the previous one.

class Program
{
    [NArg(AltName = "Test", AllowedValues = new string[] { "one", "two", "three" })]
    public string _testArg;

    [NArg(IsOptional = true, AltName = "foo", 
    AllowedValues = new string[] { "true", "false" })]
    public bool _optionalFlag;

    static void Main(string[] args)
    {
        try
        {
            new Program().Run(args);
        }
        catch (NArgException e)
        {
            Console.Error.WriteLine(e.Message);
        }
    }

    void Run(string[] args)
    {
        new ArgParser<Program>("Example3").Parse(this, args);

        Console.WriteLine("_TestArg:{0}", _testArg);
        Console.WriteLine("_optionalFlag:{0}", _optionalFlag);
    }
}

The mandatory argument is now restricted to accepting either 'one', 'two' or 'three'. The slightly interesting case is that the Boolean variable _optionalFlag is now restricted to 'true' or 'false', which makes sense as it's Boolean. However, specifying the AllowedValues argument changes this from an optional flag to an option that takes an argument. The usage message shows this:

Usage: Example3 [-foo <true|false>] <one|two|three>

The latter modification is pointless as specifying '-foo true' is the same as previously just using '-foo'. Therefore the benefit of constraining the values depends upon the type of the target variable. The usage message is also changed when the values are constrained. Rather than using the name of the variable or the alternate name, the list of permissible values is displayed, each separated by the pipe symbol ('|') meaning 'or' as originating from UNIX regular expressions.
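
The check-before-convert step and the pipe-separated display can be sketched as follows. AllowedValuesSketch and its methods are hypothetical names; the case-sensitive, ordinal comparison is an assumption of this sketch and may not match how NAttrArgs compares values.

```csharp
using System;

// Hypothetical helper; not part of NAttrArgs.
public static class AllowedValuesSketch
{
    // Checked against the raw string, before any type conversion.
    // Assumption: case-sensitive ordinal comparison.
    public static bool IsAllowed(string arg, string[] allowedValues)
    {
        return Array.IndexOf(allowedValues, arg) >= 0;
    }

    // Renders the permitted values as they appear in the usage message.
    public static string Describe(string[] allowedValues)
    {
        return "<" + string.Join("|", allowedValues) + ">";
    }
}
```

Describe(new[] { "one", "two", "three" }) gives the '<one|two|three>' fragment seen in the Example3 usage message.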

The examples so far have just involved a single optional and a single mandatory argument. There is an implicit ordering of optional arguments followed by mandatory arguments. If there is more than one argument of either kind and ordering is important, then the Rank argument to the NArg attribute should be used. This is shown below where there are three mandatory and three optional arguments.

class Program
{
    [NArg(Rank = 3)]private string _mandatoryArg;
    [NArg(Rank = 2)]private string _mandatoryArg1;
    [NArg(Rank = 1)]private string _mandatoryArg2;

    [NArg(IsOptional = true, AltName = "alt1", Rank = 2)] private int _optionalArg;
    [NArg(IsOptional = true, AltName = "alt2")] private int _optionalArg1;
    [NArg(IsOptional = true, AltName = "alt3", Rank = 1)] private int _optionalArg2;

    static void Main(string[] args)
    {
        try
        {
            new Program().Run(args);
        }
        catch (NArgException e)
        {
            Console.Error.WriteLine(e.Message);
        }
    }

    void Run(string[] args)
    {
        new ArgParser<Program>("Example4").Parse(this, args);

        Console.WriteLine("Arg0:{0}", _mandatoryArg);
        Console.WriteLine("Arg1:{0}", _mandatoryArg1);
        Console.WriteLine("Arg2:{0}", _mandatoryArg2);
        Console.WriteLine("Opt0:{0}", _optionalArg);
        Console.WriteLine("Opt1:{0}", _optionalArg1);
        Console.WriteLine("opt2:{0}", _optionalArg2);
    }
}

If no rank is specified but multiple mandatory or multiple optional arguments are present, then the parsing order will correspond to the order in which the members decorated with the NArg attribute are defined. By default, the value of Rank is 0, so all the arguments rank equally and definition order prevails.

The corresponding usage message is:

Usage: Example4 [-alt2] [-alt3] [-alt1] <_mandatoryArg2> <_mandatoryArg1> <_mandatoryArg>

The definition order has been superseded by the Rank parameter. Notice how no rank was given to _optionalArg1, which meant it retained a rank of 0, which is the highest priority. To change this variable's position, either it needs a larger Rank value or another decorated variable needs defining before it.
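
The ordering rule (ascending Rank, with definition order breaking ties) corresponds to a stable sort. The sketch below uses LINQ's OrderBy, which is documented as stable; RankSketch and Order are hypothetical names and the input list is assumed to already be in definition order.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; illustrates the ordering rule only.
public static class RankSketch
{
    // Each pair is (display name, rank), supplied in definition order.
    public static List<string> Order(IEnumerable<KeyValuePair<string, uint>> members)
    {
        // OrderBy is a stable sort, so members sharing a rank
        // keep their relative definition order.
        return members.OrderBy(m => m.Value)
                      .Select(m => m.Key)
                      .ToList();
    }
}
```

Feeding in the optional arguments of Example4 as (alt1, 2), (alt2, 0), (alt3, 1) gives alt2, alt3, alt1, which matches the usage message above.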

It's Not Just Variables That Can Be Decorated

So far, only member variables have been decorated with an attribute. It's not only explicit member variables that can have their value set. A property can be specified as long as it has a setter, as can a method. For this to work, the method must have either no arguments or one argument of a type that the command line argument can be converted to. Having no arguments may seem a little odd, but for an optional flag parameter it's a perfect mapping, as invocation implies the argument was present. Even though there's not much point to it, there is nothing that prevents the use of a void method with either an optional or a mandatory argument.

The ability to invoke setters and in particular methods enables more complex mappings between command line arguments and the eventual setting. The example below demonstrates setting a variable to a value obtained from the command line and multiplying by an optional command line value.

class Program
{
    private uint _multiple;
    private double _multipleOfArgument;

    [NArg]
    public void Multiply(double value)
    {
        _multipleOfArgument = value * Multiple;
    }

    [NArg(IsOptional = true, OptionalArgName = "Multiple")]
    public uint Multiple
    {
        get { return Math.Max(1, _multiple); }
        private set { _multiple = value; }
    }

    static void Main(string[] args)
    {
        try
        {
            new Program().Run(args);
        }
        catch (NArgException e)
        {
            Console.Error.WriteLine(e.Message);
        }
    }

    void Run(string[] args)
    {
        new ArgParser<Program>("Example5").Parse(this, args);

        Console.WriteLine("_multipleOfArgument:{0}", _multipleOfArgument);
    }
}

This example also shows the use of the NArg attribute's OptionalArgName parameter. The Multiple property is not a Boolean and needs to be set with a uint, so the optional argument requires a parameter. The name of this argument (which will be used when constructing a usage message) is specified by the OptionalArgName parameter. The 'Optional' in the name means it's the argument to an optional parameter, not that it is optional itself.

The same method cannot be used for different command line arguments. However, if used for an optional argument parameter, there is nothing preventing repeated invocations. If used with different (command line) values, an accumulated value could be created, e.g. 'somecmd -add 1 -add 2 -add 3' which could accumulate each value resulting in a variable which holds '6'.
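
The accumulation idea can be illustrated without the library. In the sketch below the Add method plays the role of the decorated method, and the small loop in Run merely simulates repeated invocation for each '-add' occurrence; Accumulator, Add and Run are hypothetical names, and the loop is not how NAttrArgs parses.

```csharp
using System.Globalization;

// Hypothetical class; in the article's model Add would carry the NArg attribute.
public class Accumulator
{
    public int Total { get; private set; }

    // Each invocation adds to the running total rather than overwriting it.
    public void Add(int value)
    {
        Total += value;
    }

    // Stand-in for the parser: walks "-add <n>" pairs and invokes Add for each.
    public static int Run(string[] args)
    {
        var acc = new Accumulator();
        for (int i = 0; i + 1 < args.Length; i += 2)
        {
            if (args[i] == "-add")
                acc.Add(int.Parse(args[i + 1], CultureInfo.InvariantCulture));
        }
        return acc.Total;
    }
}
```

Run(new[] { "-add", "1", "-add", "2", "-add", "3" }) returns 6, the accumulated value mentioned above.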

Being able to specify a method via the attribute avoids having to create a temporary member variable to store the acquired value before doing something with it in a method. Conceivably it would be possible to place the entire program logic in one of these methods or initiate operations from there. If the method is a function (has a return value rather than void), the return value is ignored.

What Happens to Any Remaining Command Line Arguments?

By default, any unrecognized optional parameters will generate an error and any left-over non-optional arguments will do the same though the InnerException will be different. This state of affairs is fine unless the application requires an unbounded list of final arguments to process, e.g. a set of files to list information about or dump their contents. This is where the final NArg attribute argument of IsConsumeRemaining is used.

class Program
{
    [NArg(IsOptional = true, AltName = "Count")] 
    public bool IsShowCount { get; private set; }
    [NArg(IsConsumeRemaining = true)] private string[] _theRest;

    private static void Main(string[] args)
    {
        try
        {
            new Program().Run(args);
        }
        catch (NArgException e)
        {
            Console.Error.WriteLine(e.Message);
        }
    }

    private void Run(string[] args)
    {
        new ArgParser<Program>("Example6").Parse(this, args);

        if (IsShowCount == true)
            Console.WriteLine("There are {0} remaining arguments", _theRest.Length);

        foreach (string s in _theRest)
            Console.WriteLine(s);
    }
}

Any remaining arguments are now turned into a string array and as per any other attributed member can be passed to a variable, property or method. The target type must be convertible from a string[]. In particular if the strings are all integers and target type is int[] then even though the conversion from string to int is legal in this case, the conversion between the array types is not, i.e. the conversion is not performed on a per-element basis. Explicit support is provided for assigning the remaining string values to IEnumerable<string>. An alternative implementation of Example 6 is provided that demonstrates this.
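
Since the array is not converted element by element, an application that wants an int[] has to do that step itself after parsing. A minimal sketch follows; RemainingArgsSketch and its methods are hypothetical names, and the invariant culture is an assumption of this example.

```csharp
using System;
using System.Globalization;

// Hypothetical helper; shows why string[] -> int[] needs manual handling.
public static class RemainingArgsSketch
{
    // Convert.ChangeType treats the array as a single object, and arrays
    // do not implement IConvertible, so the whole-array conversion fails.
    public static bool CanChangeWholeArray(string[] remaining)
    {
        try
        {
            Convert.ChangeType(remaining, typeof(int[]));
            return true;
        }
        catch (InvalidCastException)
        {
            return false;
        }
    }

    // Per-element conversion performed by the application instead.
    public static int[] ToInts(string[] remaining)
    {
        return Array.ConvertAll(remaining,
            s => int.Parse(s, CultureInfo.InvariantCulture));
    }
}
```

A typical use would be to receive the remaining arguments as string[] and call ToInts on them inside the program's own Run method.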

If multiple members are decorated with NArg(IsConsumeRemaining = true), then the first is used and the rest are left untouched. A member decorated in this way is essentially an optional argument: if there are no remaining arguments, no error is raised. Rank is irrelevant as, by definition, the remaining values are only consumed after all other arguments. If Rank or any other NArg property is set, it is ignored, as is AllowedValues. The remaining values are passed verbatim to the member decorated to receive them. The name of the member is also irrelevant as, in the usage message, the remaining arguments are indicated as '[...]'.

NArgAttribute Usage

Below is a summary of the different properties to the NArg attribute. Not all can be used together though if this occurs there will be no error. Instead the parsing mechanism will deduce the most suitable use. If properties are repeated within the attribute usage, then the last use takes precedence.

IsOptional

Syntax

NArg(IsOptional = true | false) 

Remarks

The default value is false meaning that any member decorated with the NArg attribute is considered mandatory.

Rank

Syntax

NArg(Rank = <uint>) 

Remarks

The default value is 0. Optional and mandatory arguments form two separate sets, each with its own ordering, so the same rank can be used for both a mandatory and an optional argument without clashing. Optional arguments are considered before mandatory arguments during parsing.

Optional argument parsing ends when the first command line argument not prefixed with a '-' is encountered. If later command line arguments do start with a '-', they are not treated as options.
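
The boundary between the optional and mandatory portions of the command line can be sketched as a simple scan. OptionBoundarySketch and FirstPositionalIndex are hypothetical names, and passing in the set of options that take a value is an assumption about how such a scan would know to skip those values.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; illustrates where option parsing stops.
public static class OptionBoundarySketch
{
    // Returns the index of the first argument that is not part of the
    // leading run of options, or args.Length if there is none.
    public static int FirstPositionalIndex(string[] args, ISet<string> optionsWithValue)
    {
        int i = 0;
        while (i < args.Length && args[i].StartsWith("-", StringComparison.Ordinal))
        {
            // Skip the option itself, plus its value if it takes one.
            i += optionsWithValue.Contains(args[i]) ? 2 : 1;
        }
        return Math.Min(i, args.Length);
    }
}
```

Given the arguments -foo -count 8 file -late, with -count marked as taking a value, the scan stops at index 3 ('file'); the trailing '-late' is then just another non-option argument.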

AltName

Syntax

NArg(AltName = <string>) 

Remarks

By default, the name used for generating the usage message is the name of the decorated member. This property allows the display name to be changed to the specified string.

OptionalArgName

Syntax

NArg(OptionalArgName = <string>) 

Remarks

This property only applies to optional parameters, i.e., where IsOptional = true. Setting this value changes the optional type from an optional flag to an optional argument. This now requires that the command line specify the option and an additional argument, e.g. foo -count 8. The value of the argument will be used to set the decorated member rather than a true or false.

AllowedValues

Syntax

NArg(AllowedValues = <string[]>) 

Remarks

This property allows the set of values to the decorated member to be limited. These are specified as strings rather than the target type of the member; if different.

The usage message is also affected: Rather than displaying a mandatory argument's name or the name of an optional argument, the list of allowed values separated by a '|' symbol is displayed.

If this property is set in addition to IsOptional = true but no OptionalArgName is specified, then the optional flag is implicitly promoted to an optional parameter that takes an argument restricted to the specified values. This is true even if the allowed values are just 'True' and 'False'.

IsConsumeRemaining

Syntax

NArg(IsConsumeRemaining = true | false) 

Remarks

By default, if any remaining command line arguments are outstanding once all the optional and mandatory arguments have been matched, then this is considered an error.

Setting this property on a member decorated with the NArg attribute will no longer mean that outstanding command line arguments constitute an error. Instead, they will be assigned to the decorated member. This must be of type string[] or IEnumerable<string>. If there are no outstanding arguments, then a zero length string array will be assigned, i.e. the target member will not be null. No other type conversion takes place.

It makes sense that only a single member be decorated with the NArg attribute that has this property set. If there is more than one occurrence, then the first is used. No other NArg attribute property is applicable in conjunction with this property. If specified, they will be ignored. This includes AllowedValues.

You Aren't Limited to Console Applications

Whilst the primary application of the NAttrArgs library is the processing of command line arguments for Console applications, this is not the only use. It is quite normal for GUI applications to take command line arguments, though this functionality is infrequently used. The following code is an excerpt from Example 7, which shows NAttrArgs being used to process command line arguments passed to a WPF application and the decorated member being used within the XAML.

NOTE: When obtaining the command line arguments via Environment.GetCommandLineArgs(), the first element identifies the executable, whereas the arguments passed into the Main method of a C# Console application do not include it.

App.xaml.cs

public partial class App : Application
{
    [NArg(IsOptional = true, AltName = "Show", OptionalArgName = "ShowMe")]
    public string ShowMe { get; private set; }

    App()
    {
        ShowMe = "Default boring message";
    }

    protected override void OnStartup(StartupEventArgs e)
    {
        base.OnStartup(e);

        try
        {
            string[] cmdLineArgsIncAppName = Environment.GetCommandLineArgs();
            string[] cmdLineArgsOnly = cmdLineArgsIncAppName.Skip(1).ToArray();

            new ArgParser<App>("App").Parse(this, cmdLineArgsOnly);
        }
        catch (Exception)
        {
            Shutdown();
        }
    }
}

MainWindow.xaml

<Window x:Class="Example7_WPF.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"

        Title="MainWindow" Height="350" Width="525">
    <Grid>
        <TextBox Text="{Binding Path=ShowMe, 
        Source={x:Static Application.Current}, Mode=OneWay}" />
    </Grid>
</Window>

Test Driven Development (TDD)

This library was my first attempt at using Test Driven Development. All the tests are contained in the NAttrArgs.Test project. The tests are implemented on a per-class basis, with a *Test.cs file corresponding to each of the classes within the actual NAttrArgs project, kind of! They are implemented using NUnit.

The code has evolved to be very granular, with lots of small classes that are easily testable in isolation. The code did not start out this way. Originally it was a single class, then lots of smaller classes, and then back to one or two large classes before finally ending up in its current state. As such, quite a few of the tests, in particular those testing the argument parsing and usage messages, are at far too high a level and should be considered either Integration or Acceptance tests. The current, reasonably fine-grained implementation of the parsing code would now allow these to be implemented as a proper set of unit tests on a per-class basis.

Performing TDD has been an interesting experience. At first it was like programming inside-out, but the more accustomed to it I became, the easier it was. Having a set of unit tests (and acceptance tests) allowed the code to be continuously refactored, as previously described. The refactorings that occurred were far reaching in terms of extracting out classes and the methods within classes. Just being able to run the tests frequently (there are 152 and they run in a second or two) meant that checking that a refactoring had worked was trivial.

In addition to using TDD, a test program was written to use NAttrArgs. This is a simplified version of the UNIX ls command. This was also written using TDD, which turned out to be a bit of a mistake for a second TDD project. The reason is that it required the Windows file system to be mocked, though I prefer the generic term Test Double. In fact, in Test Double terminology, fakes were used. To do this, Microsoft Moles was used. This is an isolation framework which has the useful property that test doubles can be created for sealed classes, static classes and classes with static and non-virtual methods. In the case of the .NET file-system classes, i.e. File, FileSystemInfo, FileInfo and DirectoryInfo, this is certainly needed.

To enable this, Moles uses detours, which appears to be the interception of method calls and the rewriting of those methods on the fly using the .NET profiling API. The downside of this extra functionality is that the unit tests take many orders of magnitude longer to run than the NUnit based tests. In order to have integration with Visual Studio, the tests for the ls sample were written using MSTest rather than NUnit. This was needed because, when using Moles, the Pex test runner which is associated with Moles needs to be used (to set up the profiling API). MSTest is able to launch this within Visual Studio as long as each test is decorated with the Moles custom attribute: [HostType("Moles")].

One part of the TDD red-green-refactor process that's been taken further than usual is the refactor stage. Having recently watched Uncle Bob's Clean Coder Video Series, the 'extract till you drop' idiom has been used. This leads to methods about four lines long and also a lot more methods and classes. Having evolved classes via extraction to the point that they now contain lots of methods, the amount of cohesion often decreases. However, it is often the case that subsets of methods are very cohesive, which means a separate class can be extracted. This is desirable and leads to highly cohesive but loosely coupled code.

What's in the Package?

There are three main areas to the NAttrArgs solution. Firstly, there is the NAttrArgs project itself. This is the source for the NAttrArgs library which, when built, results in the NAttrArgs.dll assembly. Secondly, there is the NAttrArgs.Test project, which contains the NUnit based tests for NAttrArgs. Finally, there is the Samples solution directory, which contains the example ls utility in the ls project and the accompanying MSTest/Moles unit tests in the ls.TestMS project. As this doesn't demonstrate that many of the NAttrArgs features, there's also a final project called Test. This project is a simple program that demonstrates each of NAttrArgs' capabilities.

NOTE: Unlike the NUnit binaries, the Moles binaries are not provided. This is due to the Microsoft licence which only allows free use for non-commercial and academic users. If the tests from ls.TestMS project need to be run, then please obtain a copy appropriately from Microsoft. This solution is not a key part of the package and is only left in as others may find it interesting to see how test-doubles can be created for the .NET file-system classes.

The Code

Given the use of TDD (including the presence of the tests) plus the use of small but hopefully well-named methods, the code should be self-describing. That said, the two main places to look are ArgParser.cs and NArgAttribute.cs. The latter is the definition of the custom attribute that is used to decorate class members. The constructor for ArgParser initiates the use of these by using reflection to obtain all the class members so adorned. Three lists (well, LINQ queries) are created, representing the required, optional and remaining decorated members, if there are any. These queries don't return instances of NArgAttribute but rather the derived class MemberAttribute. The main thing this adds is a reference to the class member that was decorated with the NArgAttribute.
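
The reflection step can be approximated with a self-contained sketch. DemoArgAttribute, DemoOptions and DiscoverySketch are hypothetical types invented for illustration, not the library's NArgAttribute or ArgParser; the sketch only shows how decorated fields, properties and methods can be found and split into required and optional groups with a LINQ query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical attribute standing in for NArgAttribute.
[AttributeUsage(AttributeTargets.Field | AttributeTargets.Property | AttributeTargets.Method)]
public sealed class DemoArgAttribute : Attribute
{
    public bool IsOptional { get; set; }
}

// Hypothetical class with a decorated field, property and method.
public class DemoOptions
{
    [DemoArg] private string _input = "";
    [DemoArg(IsOptional = true)] public bool Verbose { get; set; }
    [DemoArg(IsOptional = true)] public void Reset() { }

    public string Input { get { return _input; } }  // not decorated
}

public static class DiscoverySketch
{
    // Returns the names of decorated members, filtered by optional/required.
    public static List<string> DecoratedNames(Type type, bool optional)
    {
        const BindingFlags flags =
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;

        return type.GetMembers(flags)
            .Select(m => new
            {
                Member = m,
                Attr = (DemoArgAttribute)Attribute.GetCustomAttribute(m, typeof(DemoArgAttribute))
            })
            .Where(x => x.Attr != null && x.Attr.IsOptional == optional)
            .Select(x => x.Member.Name)
            .ToList();
    }
}
```

Note that the order in which GetMembers returns members is not something this sketch relies on; any definition-order guarantee would need to be handled separately.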

The work of parsing the command line arguments is handed out in sequence to the optional, required and remaining parsers contained in the similarly named files. These make use of an instance of the ArgIterator class. This implements IEnumerator<string> and provides enumerated access to the command line arguments. The reason for the custom implementation, rather than just obtaining an enumerator from the string[], is the additional behaviour that allows the current item to be pushed back. This is used by the optional argument parser when it encounters the first required argument and so needs to push that argument back for consumption by the RequiredArgumentsParser.
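
An enumerator with push-back can be sketched as below. PushBackIterator is a hypothetical class written for illustration; it is not the library's ArgIterator and its exact semantics (for example, ignoring a push-back when there is no current item) are assumptions.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical enumerator over the arguments that can step back one item.
public sealed class PushBackIterator : IEnumerator<string>
{
    private readonly string[] _items;
    private int _index = -1;   // -1 means "before the first item"

    public PushBackIterator(string[] items) { _items = items; }

    public string Current
    {
        get
        {
            if (_index < 0 || _index >= _items.Length)
                throw new InvalidOperationException("No current item.");
            return _items[_index];
        }
    }

    object IEnumerator.Current { get { return Current; } }

    public bool MoveNext()
    {
        if (_index < _items.Length) _index++;
        return _index < _items.Length;
    }

    // Steps back so the next MoveNext yields the current item again.
    // Ignored when there is no current item (an assumption of this sketch).
    public void PushBack()
    {
        if (_index >= 0 && _index < _items.Length) _index--;
    }

    public void Reset() { _index = -1; }

    public void Dispose() { }
}
```

A parser that reads an argument it cannot handle calls PushBack, and the next parser in the sequence then sees that same argument on its first MoveNext.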

The other main classes of interest are MemberSetter and CustomConvert. When the parsers have found a matching member for their current argument, MemberSetter is used to set the value. Depending on the type of the member, this will either be setting a value on a variable or a property, or invoking a method. By combining the NArgAttribute and the MemberInfo information together in the MemberAttribute, this can be passed to the MemberSetter, which handles the member generically regardless of whether it's an optional or required argument. This class has entry points for optional, required and remaining arguments, but these are handled by a common implementation that uses the MemberInfo from MemberAttribute to figure out how to access the member variable, property or method.

When setting a value, the CustomConvert class handles any necessary conversion between the string type of the command line argument and the target type. Mainly this is performed using the .NET method Convert.ChangeType, but this class also handles the special case of converting from a bool to a char, which is only needed when the target type of an optional flag is a char. The other, not so special, case is when the target type is assignable from the source type. This was added so that the string[] created for handling the remaining arguments could be applied to a target type of IEnumerable<string>, as this is not convertible. However, for other types which are directly assignable, a conversion may also be avoided.
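
Handling a field, property or method generically through MemberInfo can be sketched like this. SetterSketch and SetterDemo are hypothetical types, not the library's MemberSetter, and the conversion is reduced to a plain Convert.ChangeType call with the invariant culture (an assumption of this sketch).

```csharp
using System;
using System.Globalization;
using System.Reflection;

// Hypothetical target with one member of each kind.
public class SetterDemo
{
    public double Scale;
    public int Count { get; set; }
    public int Calls;
    public void Touch() { Calls++; }
}

public static class SetterSketch
{
    private static object To(string arg, Type type)
    {
        return Convert.ChangeType(arg, type, CultureInfo.InvariantCulture);
    }

    // Applies the argument to whichever kind of member was decorated.
    public static void Set(object target, MemberInfo member, string arg)
    {
        var field = member as FieldInfo;
        if (field != null)
        {
            field.SetValue(target, To(arg, field.FieldType));
            return;
        }

        var property = member as PropertyInfo;
        if (property != null)
        {
            property.SetValue(target, To(arg, property.PropertyType), null);
            return;
        }

        var method = member as MethodInfo;
        if (method != null)
        {
            // Zero parameters: just invoke. One parameter: convert and pass it.
            ParameterInfo[] ps = method.GetParameters();
            object[] callArgs = ps.Length == 0
                ? new object[0]
                : new[] { To(arg, ps[0].ParameterType) };
            method.Invoke(target, callArgs);
            return;
        }

        throw new ArgumentException("Unsupported member kind.", "member");
    }
}
```

The caller does not need to know which kind of member it has; it simply passes the MemberInfo obtained via reflection along with the raw argument string.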

What Next?

This is V1 of NAttrArgs. Some additional features that immediately spring to mind are:

  • Extended usage message

    In addition to displaying the generated usage message, this would allow an optional longer description of an argument to be supplied which would be displayed under the usage message.

  • Allow single letter optional flags to be combined into a single string

    E.g. -abc rather than -a -b -c

  • Allow values to optional arguments to be specified immediately without a space following the option or using the = sign

    E.g. -foobar or -foo=bar rather than -foo bar

  • Re-factor the unit tests to be fully per-class
  • Create a NuGet package for the binaries

If anything appeals to you or if there's something else that would be of use, please let me know. Alternatively, fork the source from GitHub and contribute any changes back if you like.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
