MatchKit Library

21 Apr 2014 | Ms-PL | 2 min read
MatchKit is a .NET library that provides a set of classes for building patterns that match simple and complex strings.

Introduction

MatchKit is a .NET library that helps match simple and complex strings. It provides:

  • a flexible and extensible infrastructure to customize the matching process and create custom patterns
  • a set of built-in patterns to match common sets of characters
  • a way to match strings with built-in and custom patterns through a deserialization-like mechanism

The .NET Framework already provides regular expressions, but a regular expression pattern can become so complex that it is sometimes better to write custom code to parse the string.

Learn by Examples

The best way to learn how to use MatchKit is to explore the samples library in the documentation. The following selection of examples gives an overview.

Find digits

This example uses the built-in DigitPattern and the Find extension method to search for single digits inside a string.

C#
string content = "0, 1, 2, 99";
IPattern pattern = new DigitPattern();
 
IEnumerable<Match> matches = content.Find(pattern);
 
// matches results (in enumeration order; call ToArray() to index them): 
//    matches[0].MatchedString = "0" 
//    matches[1].MatchedString = "1" 
//    matches[2].MatchedString = "2" 
//    matches[3].MatchedString = "9" 
//    matches[4].MatchedString = "9"

Match a number

This example uses the built-in NumberPattern to match a number exactly.

C#
string content = "+101.56";
IPattern pattern = new NumberPattern();
 
Match match = content.Match(pattern);
 
// match results: 
//    match.MatchedString = "+101.56" 
//    match.Value = 101.56 (decimal)

Extract protocol and port

This example builds a complex pattern, composed of simpler patterns, to extract the protocol and the port number from a URL.

C#
string url = "http://www.domain.com:8080/readme.html";

IPattern pattern = new SequencePattern(
    new WordPattern().CaptureAs("proto"),
    new StringPattern("://", false, false),
    new TillPattern(':', '/'),
    new SequencePattern(
        Symbols.Colon,
        new NumberPattern().CaptureAs("port")
    ).AsOptional(),
    Symbols.Slash
    );  

Match m = url.Match(pattern);

// m["proto"].Value = "http" 
// m["port"].Value = 8080
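For comparison, here is a minimal sketch of the same extraction written with the regular expressions that ship with .NET. It is not part of MatchKit: the class name RegexUrlSample and the method Extract are hypothetical, the Match type used here is System.Text.RegularExpressions.Match, and the sketch assumes that the protocol consists of letters only.

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexUrlSample
{
    // Returns the protocol and, when present, the port of a URL.
    public static (string Proto, int? Port) Extract(string url)
    {
        // proto: letters before "://" (assumption: letters only)
        // host:  anything up to the first ':' or '/'
        // port:  optional digits after ':'
        // then the first '/' of the path
        Match m = Regex.Match(url,
            @"^(?<proto>[A-Za-z]+)://[^:/]+(?::(?<port>\d+))?/");

        if (!m.Success)
            throw new FormatException("Unrecognized URL: " + url);

        Group port = m.Groups["port"];
        return (m.Groups["proto"].Value,
                port.Success ? int.Parse(port.Value) : (int?)null);
    }

    static void Main()
    {
        var result = Extract("http://www.domain.com:8080/readme.html");
        Console.WriteLine(result.Proto + " " + result.Port); // prints: http 8080
    }
}
```

The regular expression is compact, but each part has to be decoded by the reader, whereas the composed pattern names every step.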

Scan for HTML HRefs

This example builds a complex pattern, composed of simpler patterns, to extract HREFs from an HTML string.

C#
public void RunSample()
{
    string inputString = "My favorite web sites include:</P>" +
                "<A HREF=\"http://msdn2.microsoft.com\">" +
                "MSDN Home Page</A></P>" +
                "<A HREF=\"http://www.microsoft.com\">" +
                "Microsoft Corporation Home Page</A></P>" +
                "<A HREF=\"http://blogs.msdn.com/bclteam\">" +
                ".NET Base Class Library blog</A></P>";
 
    var matches = DumpHRefs(inputString);
 
    // var capture0 = matches[0].GetCapture("url"); 
    // capture0.Value = "http://msdn2.microsoft.com" 
    // capture0.Location.Index = 42 
    // var capture1 = matches[1].GetCapture("url"); 
    // capture1.Value = "http://www.microsoft.com" 
    // capture1.Location.Index = 101 
    // var capture2 = matches[2].GetCapture("url"); 
    // capture2.Value = "http://blogs.msdn.com/bclteam" 
    // capture2.Location.Index = 175
}
 
private Match[] DumpHRefs(string inputString)
{
    var pattern = new SequencePattern(
        new StringPattern("href", true, false),
        Spaces.ZeroOrMore(),
        Symbols.Equal,
        Spaces.ZeroOrMore(),
        new ExclusivePattern(
            new LiteralPattern('"'),
            new LiteralPattern('\'')
            ).CaptureAs("url")
        );
    return inputString.Find(pattern).ToArray();
}
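The indexes listed in the comments above can be cross-checked without MatchKit. The following sketch uses only System.Text.RegularExpressions on the same input string; the class name HrefIndexCheck and the method Scan are hypothetical. It reports the position of the opening quote of each HREF value, which is where the values 42, 101 and 175 fall in this input.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class HrefIndexCheck
{
    // Returns (index of the opening quote, url) for every HREF found.
    public static List<KeyValuePair<int, string>> Scan(string input)
    {
        var found = new List<KeyValuePair<int, string>>();
        // q: opening quote (single or double); url: text up to the same quote
        var regex = new Regex("href\\s*=\\s*(?<q>[\"'])(?<url>.*?)\\k<q>",
                              RegexOptions.IgnoreCase);
        foreach (Match m in regex.Matches(input))
            found.Add(new KeyValuePair<int, string>(
                m.Groups["q"].Index, m.Groups["url"].Value));
        return found;
    }

    static void Main()
    {
        string inputString = "My favorite web sites include:</P>" +
                    "<A HREF=\"http://msdn2.microsoft.com\">" +
                    "MSDN Home Page</A></P>" +
                    "<A HREF=\"http://www.microsoft.com\">" +
                    "Microsoft Corporation Home Page</A></P>" +
                    "<A HREF=\"http://blogs.msdn.com/bclteam\">" +
                    ".NET Base Class Library blog</A></P>";

        foreach (var pair in Scan(inputString))
            Console.WriteLine(pair.Key + " " + pair.Value);
        // prints:
        //   42 http://msdn2.microsoft.com
        //   101 http://www.microsoft.com
        //   175 http://blogs.msdn.com/bclteam
    }
}
```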

Match a hexadecimal string with a custom pattern

This example creates a custom pattern to match a hexadecimal string.

C#
public void RunSample()
{
    string content = "FFAA";
 
    MatchContext context = new MatchContext(content);
 
    IPattern pattern = new RepeaterPattern(new HexPattern());
 
    Match match = pattern.Match(context);
 
    // match.MatchedString = "FFAA" 
    // match.Value = new byte[] { 0xFF, 0xAA };
}
 
// custom pattern to match a single hex byte, from 00 to FF 
class HexPattern : BasePattern
{
    protected override Match OnMatch(MatchContext context)
    {
        var str = context.Current.ToString();
        str += context.NextCharacter();
 
        if (IsHexChar(str[0]) && IsHexChar(str[1]))
        {
            context.NextCharacter();
            return Success(context, Convert.ToByte(str, 16));
        }
        else
        {
            return Error(context);
        }
    }
 
    private bool IsHexChar(char ch)
    {
        return (ch >= '0' && ch <= '9')
            || (ch >= 'A' && ch <= 'F')
            || (ch >= 'a' && ch <= 'f');
    }
 
    public override string HelpString
    {
        get { return "Hex byte"; }
    }
}
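To make the expected result concrete, here is a minimal sketch in plain C#, without MatchKit, of the decoding that the repeated HexPattern is meant to perform: pairs of hexadecimal digits are converted to bytes with Convert.ToByte(string, 16). The class name HexSketch and the method Decode are hypothetical, and returning null for invalid input is an assumption made for this sketch only.

```csharp
using System;

static class HexSketch
{
    static bool IsHexChar(char ch)
    {
        return (ch >= '0' && ch <= '9')
            || (ch >= 'A' && ch <= 'F')
            || (ch >= 'a' && ch <= 'f');
    }

    // Decodes a string made of hex digit pairs into bytes.
    // Returns null when the length is odd or a character is not a hex digit.
    public static byte[] Decode(string content)
    {
        if (content.Length % 2 != 0)
            return null;

        var bytes = new byte[content.Length / 2];
        for (int i = 0; i < content.Length; i += 2)
        {
            if (!IsHexChar(content[i]) || !IsHexChar(content[i + 1]))
                return null;
            // base 16 conversion of the two-character pair
            bytes[i / 2] = Convert.ToByte(content.Substring(i, 2), 16);
        }
        return bytes;
    }

    static void Main()
    {
        Console.WriteLine(BitConverter.ToString(Decode("FFAA"))); // prints: FF-AA
    }
}
```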

Cancelling an async matching process

This example shows how to cancel a long-running asynchronous matching process.

C#
string content = "very long string";
IPattern pattern = new StringPattern("string");
CancellationFlag cancellation = new CancellationFlag();
 
IAsyncResult async = content.BeginFind(pattern
    , new MatchOptions { CancellationFlag = cancellation });
 
cancellation.Cancel();

try
{
    var matches = content.EndFind(async);
}
catch (MatchCanceledException)
{
    // some code
}

Clean an input string

This example shows how to strip unwanted characters from an input string.

C#
string content = "a simple text. a bit -dirty";

IPattern pattern = new ExclusivePattern(new WhiteSpacePattern(), Symbols.Dot, Symbols.Minus);
 
string replaced = content.Replace(pattern, "");
 
// replaced = "asimpletextabitdirty"
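The expected result can be reproduced with plain .NET as a sanity check. This sketch does not use MatchKit; the class name CleanSketch and the method Clean are hypothetical. It removes the same three kinds of characters (white space, dot, minus) with LINQ.

```csharp
using System;
using System.Linq;

static class CleanSketch
{
    // Removes white space, dots and minus signs from the input.
    public static string Clean(string content)
    {
        return new string(content
            .Where(ch => !char.IsWhiteSpace(ch) && ch != '.' && ch != '-')
            .ToArray());
    }

    static void Main()
    {
        Console.WriteLine(Clean("a simple text. a bit -dirty"));
        // prints: asimpletextabitdirty
    }
}
```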

Map a fixed series of numbers

This example shows how to map a fixed series of numbers onto a matchable custom class.

C#
public void NumberPatternPattern()
{
    string content = "101 -35.95 15";
 
    NumberPatternClass mapped = ObjectMapper.Map<NumberPatternClass>(content);
 
    // mapped results: 
    //    mapped.LongValue = 101 
    //    mapped.DecimalValue = -35.95 
    //    mapped.ByteValue = 15
}
 
[MatchableClass]
class NumberPatternClass
{
    [MatchableMember(0)]
    public long LongValue;
 
    [MatchableMember(1)]
    private const char _space = ' ';
 
    [MatchableMember(2)]
    public decimal DecimalValue;
 
    [MatchableMember(3)]
    private const char _space2 = ' ';
 
    [MatchableMember(4)]
    public byte ByteValue;
}

Map a variable series of numbers

This example shows how to map a variable-length series of numbers onto a matchable custom class.

C#
public void RepeaterPattern()
{
    string content = "10,20,45,102";
 
    RepeaterPatternClass mapped = ObjectMapper.Map<RepeaterPatternClass>(content);
 
    // mapped results: 
    //    mapped.Numbers[0].Number = 10 
    //    mapped.Numbers[1].Number = 20 
    //    mapped.Numbers[2].Number = 45 
    //    mapped.Numbers[3].Number = 102
}
 
[MatchableClass]
class RepeaterPatternClass
{
    [MatchableMember(0)]
    public RepeatableClass[] Numbers;
}
 
[MatchableClass]
class RepeatableClass
{
    [MatchableMember(0)]
    public int Number;
 
    [MatchableMember(1, IsSeparatorElement = true)]
    private const char _comma = ',';
}

Replace a matchable class

This example shows how to replace text in the source string through a mapped custom class.

C#
public void RunSample()
{
    string code = "public class Sample { }";
    IMatchBag bag = new MatchBag();
 
    ObjectMapperOptions options = new ObjectMapperOptions
    {
        EndsWithEos = true,
        IgnoreBlanks = IgnoreBlank.All,
        MatchBag = bag
    };
 
    CSharpClass mapped = ObjectMapper.Map<CSharpClass>(code, options);
 
    var replacer = new ObjectReplacer(code, bag);
    replacer.Replace(mapped, o => o.Name, "RenamedClass");
 
    var replaced = replacer.Apply();

    // replaced = "public class RenamedClass { }"
}
 
[MatchableClass]
class CSharpClass
{
    [MatchableMember(0)]
    private const string @public = "public";
 
    [MatchableMember(1)]
    private const string @class = "class";
 
    [MatchableMember(2)]
    public string Name;
 
    [MatchableMember(3)]
    private const char open = '{';
 
    [MatchableMember(4)]
    private const char close = '}';
}

A complex example - Map a math expression

This example creates a set of custom matchable classes to map a math expression.

The test code

C#
ObjectMapperOptions settings = new ObjectMapperOptions
{
    IgnoreBlanks = IgnoreBlank.All,
    EndsWithEos = true
};
 
string expr = "(150 != (var / 32)) - _k2 + 98 * (90 / 123 + (_j5 * (15.5 - 0.32)))";
 
Expr mapped = ObjectMapper.Map<Expr>(expr, settings);

The matchable classes

C#
[MatchableClass]
class Expr
{
    // an expression is an enclosed expression or a single value

    [MatchableMember(0)]
    [ObjectPattern(0, typeof(EnclosedExpr))]
    [ObjectPattern(1, typeof(Value))]
    public object Value;
 
    // followed by a list of operator/expression

    [MatchableMember(1)]
    public Item[] Values;
}
 
[MatchableClass]
class Item
{
    [MatchableMember(0)]
    [StringPattern(0, "+")]
    [StringPattern(1, "-")]
    [StringPattern(2, "*")]
    [StringPattern(3, "/")]
    [StringPattern(4, "==")]
    [StringPattern(5, "!=")]
    [StringPattern(6, "<=")]
    [StringPattern(7, "<")]
    [StringPattern(8, ">=")]
    [StringPattern(9, ">")]
    [StringPattern(10, "&&")]
    [StringPattern(11, "||")]
    public string Operator;
 
    [MatchableMember(1)]
    [ObjectPattern(0, typeof(EnclosedExpr))]
    [ObjectPattern(1, typeof(Value))]
    public object Value;
}
 

[MatchableClass]
class Value
{
    // a value is a number or an identifier

    [MatchableMember(0)]
    [NumberPattern(0)]
    [Pattern(1, typeof(IdentifierPattern))]
    public object Val;
}
 
[MatchableClass]
class EnclosedExpr
{
    [MatchableMember(0, IsPointOfNoReturn = true)]
    private const char _open = '(';
 
    [MatchableMember(1)]
    public Expr Value;
 
    [MatchableMember(2)]
    private const char _close = ')';
}
 
// custom pattern to match an identifier 
sealed class IdentifierPattern : BasePattern
{
    protected override Match OnMatch(MatchContext context)
    {
        if (!context.IsLetter && !context.IsUnderscore)
            return Error(context);
        context.NextTo(ch => char.IsLetter(ch) || char.IsDigit(ch) || ch == '_');
        return Success(context, MatchedString.Value);
    }
 
    public override string HelpString
    {
        get { return "Identifier"; }
    }
}

Other features not shown in this selection

MatchKit exposes other features, such as:

  • Using a tracer to dump the matching process
  • Handling member assignment in matchable classes to change the matched value or cancel the entire process
  • Using a match bag (only with matchable custom classes) to retrieve the Match information (line, column, index, ...) of a mapped member of a specific instance
  • Customizing the default pattern for a .NET data type

Other complex examples, available in the samples library of the documentation, show how to match a command line, an INI file, XML, a JSON string, a SQL query, and C#-style source code.

License

This article, along with any associated source code and files, is licensed under The Microsoft Public License (Ms-PL).