A Beginner's Tutorial on String Comparison in C#

24 Aug 2012 | CPOL | 5 min read
This small article shows you the right way of comparing strings in a C# application.

Introduction

This short article discusses the right way to compare strings in a C# application. We will look at the various ways in which strings can be compared and which of them should or should not be used.

Background

Usually, when we want to compare two strings in our applications, we use the equality operator. In most scenarios this works properly, but we should still know the other ways of comparing strings, which may give better performance and more correct results. So let's say I have a variable str and I want to check whether its value is equal to "Yes".

C#
if(str == "Yes")
{
	// equal
}
else
{
	// not equal
}

The == operator performs a case-sensitive comparison and does not consider the current culture. When a case-insensitive comparison is required, I have seen most developers take one of the two approaches below.

Either we do this:

C#
if (str.ToLower() == "yes")
{
	//equal
}
else
{
	//not equal
}

or we do something like:

C#
if (str.ToUpper() == "YES")
{
	//equal
}
else
{
	//not equal
}

This works in most cases, and because strings are immutable the original string is not modified. However, it involves an extra method call and creates an extra temporary string (the result of ToLower() or ToUpper()). It can also give wrong results when the code runs under a culture whose casing rules differ from English and the str variable contains characters affected by those rules.
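To make the culture problem concrete, here is a minimal sketch using the Turkish culture, where an uppercase I lowercases to a dotless ı rather than to i. The class and method names are hypothetical, and the sketch assumes that culture data for tr-TR is available on the machine; in an environment without it, the Turkish line may behave differently or fail.

```csharp
using System;
using System.Globalization;

public static class CasingPitfallDemo
{
    // Hypothetical helper: lowercases with an explicit culture and compares with ==,
    // mimicking str.ToLower() == "title" on a machine set to that culture.
    public static bool EqualsViaToLower(string value, string expected, CultureInfo culture)
    {
        return value.ToLower(culture) == expected;
    }

    public static void Main()
    {
        // Assumption: culture data for tr-TR is installed.
        CultureInfo turkish = CultureInfo.GetCultureInfo("tr-TR");

        // Invariant casing rules: "TITLE" becomes "title", so this matches.
        Console.WriteLine(EqualsViaToLower("TITLE", "title", CultureInfo.InvariantCulture));

        // Turkish casing rules turn 'I' into a dotless 'ı', so this is expected not to match.
        Console.WriteLine(EqualsViaToLower("TITLE", "title", turkish));

        // No temporary string and no dependence on the current culture.
        Console.WriteLine(String.Equals("TITLE", "title", StringComparison.OrdinalIgnoreCase));
    }
}
```

The last call is the kind of comparison the rest of this article builds towards.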

So how do we compare strings in a way that avoids all these problems? The .NET Framework String class already takes care of these scenarios and provides methods that let us perform correct and efficient string comparison in all of them. We will now look at these methods.

Note: We will talk about equality comparison, but all these points are valid for other comparisons too, e.g., finding the order of strings.

Using the Code

The very first thing to understand before jumping into the methods is the type of comparison I might need. First, I might need a culture sensitive comparison or a non culture sensitive comparison (ordinal comparison). Second, I might want a case sensitive or a case insensitive comparison.

Now let us look at what .NET provides us. .NET provides us 3 modes:

  1. InvariantCulture
  2. CurrentCulture
  3. Ordinal

InvariantCulture

The InvariantCulture mode uses the invariant culture, which is associated with the English language but not with any country or region, so its results do not depend on the culture settings of the machine or user. This mode interprets characters linguistically, with reference to an alphabet, rather than by their numeric value. It can be visualized as ordering strings against a reference string such as "aAbBcC...", where the letter decides the order first and case only breaks ties. So in this mode, the strings "CAT" and "bat" will be ordered as: "bat", "CAT".

CurrentCulture

The second mode, CurrentCulture, also orders strings linguistically, in the same way as the invariant culture, except that it uses the rules of the current thread's culture, so the resulting order is culture specific.

Also in this mode, characters are compared by their linguistic weight rather than their numeric value, e.g., under the en-US or de-DE culture the German Ä is treated as an accented variant of A and sorts next to it instead of after Z.

Ordinal

The third mode, Ordinal, simply compares the strings character by character, using the numeric Unicode value of each character to find the order. It can be visualized as using the following reference string, which is simply all the letters ordered by their Unicode/ASCII values: "ABC...abc...". So in this mode, the strings "CAT" and "bat" will be ordered as: "CAT", "bat".
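The "CAT" and "bat" example can be checked with String.Compare. This is a small sketch with hypothetical names; the culture-sensitive result assumes the runtime has its normal globalization data available.

```csharp
using System;

public static class OrderingDemo
{
    // Hypothetical helper: returns -1, 0 or 1 depending on whether a sorts
    // before, together with, or after b in the given comparison mode.
    public static int Order(string a, string b, StringComparison mode)
    {
        return Math.Sign(String.Compare(a, b, mode));
    }

    public static void Main()
    {
        // Ordinal: 'C' (67) has a lower value than 'b' (98), so "CAT" sorts first.
        Console.WriteLine(Order("CAT", "bat", StringComparison.Ordinal)); // -1

        // Linguistic: the letter b comes before c whatever the case, so "bat"
        // is expected to sort first and the result to be positive.
        Console.WriteLine(Order("CAT", "bat", StringComparison.InvariantCulture));
    }
}
```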

Now with this information at hand, let us see what .NET provides us. The String.Equals and String.Compare methods have overloads that take a StringComparison enum value as an argument. This argument specifies the mode we want to use for the comparison.

C#
public static bool Equals (string a, string b, StringComparison comparisonType);

This enum could have these possible values:

  • CurrentCulture
  • CurrentCultureIgnoreCase
  • InvariantCulture
  • InvariantCultureIgnoreCase
  • Ordinal
  • OrdinalIgnoreCase

Looking at each enum value, it is self explanatory which mode is for which scenario. Still, let us draw a small matrix for the same.

                                          Case sensitive      Case insensitive
Culture sensitive                         CurrentCulture      CurrentCultureIgnoreCase
Culture independent (invariant culture)   InvariantCulture    InvariantCultureIgnoreCase
Ordinal                                   Ordinal             OrdinalIgnoreCase

And now, I do the same comparison which we saw above using these modes.

Comparing the string character to character in a case sensitive manner:

C#
if (String.Equals(str, "Yes", StringComparison.Ordinal))
{
	//equal
}
else
{
	//not equal
}

Comparing the string in a non case sensitive manner:

C#
if (String.Equals(str, "Yes", StringComparison.OrdinalIgnoreCase))
{
	//equal
}
else
{
	//not equal
}

These code snippets will also give us the desired results and, since no temporary string is created, probably a little more efficiently than the earlier approaches.

Note: The == operator is equivalent to StringComparison.Ordinal. So if this is the mode we need, we can simply use the == operator.
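The following sketch puts the == operator and the StringComparison overloads side by side. The IsYes helper is a hypothetical name introduced only for illustration; it also shows that the static String.Equals tolerates a null argument.

```csharp
using System;

public static class EqualityDemo
{
    // Hypothetical helper: a case-insensitive "Yes" check that tolerates null input.
    public static bool IsYes(string str)
    {
        // The static String.Equals accepts null arguments, whereas the instance
        // form str.Equals(...) would throw NullReferenceException when str is null.
        return String.Equals(str, "Yes", StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        string str = "yes";

        // == and StringComparison.Ordinal agree: both are case sensitive.
        Console.WriteLine(str == "Yes");                                        // False
        Console.WriteLine(String.Equals(str, "Yes", StringComparison.Ordinal)); // False

        Console.WriteLine(IsYes(str));  // True
        Console.WriteLine(IsYes(null)); // False
    }
}
```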

Now let us summarize and see which one should be used when:

  • CurrentCulture - Culture specific, case sensitive comparison
  • CurrentCultureIgnoreCase - Culture specific, case insensitive comparison
  • InvariantCulture - Culture independent (invariant culture), case sensitive comparison
  • InvariantCultureIgnoreCase - Culture independent (invariant culture), case insensitive comparison
  • Ordinal - ASCII/Unicode value based, case sensitive comparison
  • OrdinalIgnoreCase - ASCII/Unicode value based, case insensitive comparison

A Note on StringComparer and StringComparison

A common point of confusion is that the StringComparer class can also be used for similar string comparisons. This class offers the same six ways of comparing strings. The important thing to note here is that this class implements the comparison interfaces, i.e., IComparer, IEqualityComparer, IComparer<String> and IEqualityComparer<String>.

The StringComparison that we have discussed so far in this article is an enum that we should use while comparing two strings. So when should we set this approach aside and go for the StringComparer class?

The rule of thumb is that if only a string comparison is needed, we should use the String class's methods, such as String.Equals, which use the StringComparison enum to determine the mode for the actual comparison. We should use the StringComparer class only when a method takes one of IComparer, IEqualityComparer or IComparer<String> as a parameter and we need it to work with our strings.
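As an illustration of that second case, here is a short sketch in which a collection and a sort call each accept a comparer object. The class and method names are hypothetical; Dictionary, List.Sort and the StringComparer properties are standard .NET members.

```csharp
using System;
using System.Collections.Generic;

public static class ComparerDemo
{
    // Hypothetical helper: builds a lookup whose keys match regardless of case.
    public static Dictionary<string, int> BuildCounts()
    {
        // The constructor takes an IEqualityComparer<string>, so a StringComparer fits here.
        var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        counts["Yes"] = 1;
        counts["YES"] = counts["yes"] + 1; // same entry as "Yes"
        return counts;
    }

    public static void Main()
    {
        Dictionary<string, int> counts = BuildCounts();
        Console.WriteLine(counts.Count);  // 1
        Console.WriteLine(counts["yes"]); // 2

        // List<T>.Sort takes an IComparer<string>; Ordinal puts uppercase letters first.
        var words = new List<string> { "bat", "CAT", "ant" };
        words.Sort(StringComparer.Ordinal);
        Console.WriteLine(String.Join(", ", words)); // CAT, ant, bat
    }
}
```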

Perhaps, internally the String class's methods are still using StringComparer class for actual comparison but from a developer's perspective, following the above guideline should suffice.

Point of Interest

This small article is written for developers who are at the start of their careers and who manipulate strings in various ways just to achieve the desired comparison results. We have discussed only the equality operation, but ordering comparisons follow the same rules.

History

  • 24th August, 2012: First version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)