A C# LIKE implementation that mimics SQL LIKE

18 Jun 2013 · CPOL · 2 min read
Code for a C# string.Like extension method.

Introduction

I've always missed having a SQL LIKE operator in C#. Sure, regular expressions can do the job, but the syntax is not nearly as nice as that of LIKE. Since I couldn't find an actual LIKE implementation by searching, I took the time to write one. Hopefully this makes life easier for a few other people as well.

Using the code

The code provided is an extension method for the string class. It is called in the same way as Contains, StartsWith, and EndsWith.

Here is a very basic example of its usage:

C#
if ("abc".Like("a%")) // Evaluates to true
{
    // Do something
}

The following notes describe the syntax of its mask parameter:

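  • Searches are case-insensitive by default. Pass false as the second argument for a case-sensitive comparison.
  • % or * act as multi-character wildcards and match zero or more characters (for example, "john" matches %john%).
  • _ (underscore) acts as a single-character wildcard and matches exactly one character.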
  • \ (backslash) acts as an escape character and can be used to escape %, * or _. If you want to use a backslash in your search string you can use two (\\).
  • [] allows multiple characters in a search. [abc] will check for any of a, b or c.
  • [a-c] will check for a, b or c.
  • [^a-c] will check for anything that is not a, b or c.
  • [\^] will check for ^. [^\^] will check for anything that is not ^. [a\-b] will check for a, - or b.
  • \[ or [[] will check for [.
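
To make the rules above concrete, here is a minimal sketch that translates a mask into a regular expression. It is only an illustration of the syntax, not the method presented in this article: the names LikeRegexSketch, LikeToRegex and LikeViaRegex are hypothetical, and the sketch assumes that [] sets can be copied into the regex unchanged because both notations use ^, - and \ in the same way. Malformed sets and a ] inside a set are not handled.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class LikeRegexSketch
{
    // Hypothetical helper: turns a LIKE-style mask into an anchored regex pattern.
    public static string LikeToRegex(string mask)
    {
        var sb = new StringBuilder("^");
        for (int i = 0; i < mask.Length; i++)
        {
            char c = mask[i];
            if (c == '\\' && i + 1 < mask.Length)
            {
                // Escaped character: the next character is matched literally.
                sb.Append(Regex.Escape(mask[++i].ToString()));
            }
            else if (c == '%' || c == '*')
            {
                sb.Append(".*"); // multi-character wildcard: zero or more characters
            }
            else if (c == '_')
            {
                sb.Append('.');  // single-character wildcard: exactly one character
            }
            else if (c == '[')
            {
                int end = mask.IndexOf(']', i + 1);
                if (end <= i + 1)
                {
                    // Unterminated or empty set: treat the bracket as a literal.
                    sb.Append("\\[");
                    continue;
                }
                // Assumption: the set syntax is regex-compatible, so copy it verbatim.
                sb.Append(mask, i, end - i + 1);
                i = end;
            }
            else
            {
                sb.Append(Regex.Escape(c.ToString()));
            }
        }
        // \z anchors at the very end of the input.
        return sb.Append("\\z").ToString();
    }

    // Hypothetical wrapper that mirrors the defaults described in the notes above.
    public static bool LikeViaRegex(string s, string mask, bool caseInsensitive = true)
    {
        if (s == null || mask == null)
            return false;
        var options = RegexOptions.Singleline | RegexOptions.CultureInvariant;
        if (caseInsensitive)
            options |= RegexOptions.IgnoreCase;
        return Regex.IsMatch(s, LikeToRegex(mask), options);
    }
}
```

For example, LikeRegexSketch.LikeToRegex("a%") produces the pattern ^a.*\z, and LikeRegexSketch.LikeViaRegex("dan", "d_n") evaluates to true.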

The code

A few things to note before copying the code:

  • Paste this in any static class and it'll work out of the box.
  • I've commented everything to make it clear what's happening with every step. If something is unclear please post and I'll improve on the comments. The remarks XML contains usage instructions.
  • This isn't really all that performance friendly and does some things in clumsy ways. Feel free to improve! If you send me the new & improved code I'll update this article ;)
  • I have not tested it for every use case imaginable - it works fine in every case I could think of and write a test for. A few tests are included at the end. If you do happen to find any cases where it fails or produces unexpected results please let me know so I can fix it up.
C#
/// <summary>
/// Performs equality checking using behaviour similar to that of SQL's LIKE.
/// </summary>
/// <param name="s">The string to check for equality.</param>
/// <param name="match">The mask to check the string against.</param>
/// <param name="CaseInsensitive">True if the check should be case insensitive.</param>
/// <returns>Returns true if the string matches the mask.</returns>
/// <remarks>
/// Matches are case-insensitive by default, using invariant-culture uppercasing.
/// % acts as a multi-character wildcard.
/// * acts as a multi-character wildcard.
/// _ acts as a single-character wildcard.
/// Backslash acts as an escape character.  It needs to be doubled if you wish to
/// check for an actual backslash.
/// [abc] searches for multiple characters.
/// [^abc] matches any character that is not a,b or c
/// [a-c] matches a, b or c
/// Published on CodeProject: http://www.codeproject.com/Articles/
///         608266/A-Csharp-LIKE-implementation-that-mimics-SQL-LIKE
/// </remarks>
public static bool Like(this string s, string match, bool CaseInsensitive = true)
{
    //Nothing matches a null mask or null input string
    if (match == null || s == null)
        return false;
    //If checking is case-insensitive we convert both strings to uppercase to facilitate this.
    if (CaseInsensitive)
    {
        s = s.ToUpperInvariant();
        match = match.ToUpperInvariant();
    }
    //Keeps track of our position in the primary string - s.
    int j = 0;
    //Used to keep track of multi-character wildcards.
    bool matchanymulti = false;
    //Used to keep track of multiple possibility character masks.
    string multicharmask = null;
    bool inversemulticharmask = false;
    for (int i = 0; i < match.Length; i++)
    {
        //If this is the last character of the mask and it's a % or * we are done
        if (i == match.Length - 1 && (match[i] == '%' || match[i] == '*'))
            return true;
        //A direct character match allows us to proceed.
        var charcheck = true;
        //Backslash acts as an escape character.  If we encounter it, proceed
        //to the next character.
        if (match[i] == '\\')
        {
            i++;
            if (i == match.Length)
                i--;
        }
        else
        {
            //If this is a wildcard mask we flag it and proceed with the next character
            //in the mask.
            if (match[i] == '%' || match[i] == '*')
            {
                matchanymulti = true;
                continue;
            }
            //If this is a single character wildcard advance one character.
            if (match[i] == '_')
            {
                //If there is no character to advance we did not find a match.
                if (j == s.Length)
                    return false;
                j++;
                continue;
            }
            if (match[i] == '[')
            {
                var endbracketidx = match.IndexOf(']', i);
                //If the mask is invalid (no closing bracket) we cannot find a match for it.
                //This must be checked before Substring, which would otherwise throw
                //on the negative length.
                if (endbracketidx == -1)
                    return false;
                //Get the characters to check for.
                multicharmask = match.Substring(i + 1, endbracketidx - i - 1);
                //Check for inversed masks
                inversemulticharmask = multicharmask.StartsWith("^");
                //Remove the inversed mask character
                if (inversemulticharmask)
                    multicharmask = multicharmask.Remove(0, 1);
                //Unescape \^ to ^
                multicharmask = multicharmask.Replace("\\^", "^");
                
                //Prevent direct character checking of the next mask character
                //and advance to the next mask character.
                charcheck = false;
                i = endbracketidx;
                //Detect and expand character ranges
                if (multicharmask.Length == 3 && multicharmask[1] == '-')
                {
                    var newmask = "";
                    var first = multicharmask[0];
                    var last = multicharmask[2];
                    //Swap the bounds if the range was written backwards.
                    if (last < first)
                    {
                        first = last;
                        last = multicharmask[0];
                    }
                    var c = first;
                    while (c <= last)
                    {
                        newmask += c;
                        c++;
                    }
                    multicharmask = newmask;
                }
            }
        }
        //Keep track of match finding for this character of the mask.
        var matched = false;
        while (j < s.Length)
        {
            //This character matches, move on.
            if (charcheck && s[j] == match[i])
            {
                j++;
                matched = true;
                break;
            }
            //If there is a set of characters to check against, do so.
            if (multicharmask != null)
            {
                //IndexOf is used so the method does not depend on System.Linq.
                var ismatch = multicharmask.IndexOf(s[j]) >= 0;
                //If this was an inverted mask and we match fail the check for this string.
                //If this was not an inverted mask check and we did not match fail for this string.
                if (inversemulticharmask && ismatch ||
                    !inversemulticharmask && !ismatch)
                {
                    //If we have a wildcard preceding us we ignore this failure
                    //and continue checking.
                    if (matchanymulti)
                    {
                        j++;
                        continue;
                    }
                    return false;
                }
                j++;
                matched = true;
                //Consume our mask.
                multicharmask = null;
                break;
            }
            //We are in a multi-character wildcard, proceed to the next character.
            if (matchanymulti)
            {
                j++;
                continue;
            }
            break;
        }
        //We've found a match - proceed.
        if (matched)
        {
            matchanymulti = false;
            continue;
        }

        //If no match our mask fails
        return false;
    }
    //Some characters are left - our mask check fails.
    if (j < s.Length)
        return false;
    //We've processed everything - this is a match.
    return true;
} 
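
As one possible direction for the improvements invited in the notes above, here is a minimal sketch of the classic two-pointer wildcard match, which remembers the most recent % or * and retries from there after a mismatch. It allocates nothing and compares character by character. It is not a drop-in replacement for Like: it only handles %, * and _ (no escapes, no [] sets), it is always case-insensitive, and the names WildcardSketch and WildcardMatch are hypothetical.

```csharp
using System;

public static class WildcardSketch
{
    // Hypothetical helper: matches s against a mask containing only %, * and _ wildcards.
    public static bool WildcardMatch(string s, string mask)
    {
        if (s == null || mask == null)
            return false;

        int i = 0;          // position in the input string
        int j = 0;          // position in the mask
        int starMask = -1;  // mask position of the most recent % or *, -1 if none yet
        int starInput = 0;  // input position where that wildcard currently stops consuming

        while (i < s.Length)
        {
            if (j < mask.Length && (mask[j] == '%' || mask[j] == '*'))
            {
                // Remember the wildcard and first try letting it match nothing.
                starMask = j++;
                starInput = i;
            }
            else if (j < mask.Length &&
                     (mask[j] == '_' ||
                      char.ToUpperInvariant(mask[j]) == char.ToUpperInvariant(s[i])))
            {
                // Single-character wildcard or direct character match.
                i++;
                j++;
            }
            else if (starMask >= 0)
            {
                // Mismatch: let the last wildcard absorb one more character and retry.
                j = starMask + 1;
                i = ++starInput;
            }
            else
            {
                return false;
            }
        }

        // The input is consumed; only trailing wildcards may remain in the mask.
        while (j < mask.Length && (mask[j] == '%' || mask[j] == '*'))
            j++;
        return j == mask.Length;
    }
}
```

Because the sketch goes back to the last wildcard on a mismatch, a mask such as %ab matches "aab" even though the first a in the input is not the one that precedes the b.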

Here are some test cases:

C#
Action<string, bool> check = (s, b) =>
{
    if (!b)
        throw new ArgumentException("Like failed with the string " + s);
};
check("a%", "abc".Like("a%"));
check("%c", "abc".Like("%c"));
check("%d%", !"abc".Like("%d%"));
check("%john%", "john".Like("%john%"));
check("%john%", !"johb".Like("%john%"));
check("%joh[nb]", "johb".Like("%joh[nb]"));
check("[^z]ack", "Sack".Like("[^z]ack"));
check("[\\^-b]", "^".Like("[\\^-b]"));
check("[^\\^]", !"^".Like("[^\\^]"));
check("d_n", "dan".Like("d_n"));   

Conclusion

The code provided is intended to work similarly to a SQL LIKE expression for C# strings. It also adds a few capabilities of its own, such as accepting * in place of % and escaping with \. It aims to provide nicer syntax than regular expressions (currently at the cost of some performance).

History

  • 2013/06/18 - Initial publication
  • 2013/06/18 #2 - Now supports case-sensitive comparison via an optional parameter and always returns false for null input strings.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)