Introduction
In this post, we will look at the String type and how equality works for it. You might be aware that for strings, the equality operator compares values, not references, as we saw in the first post of this series. This is because String overloads the == operator and also overrides the Equals method to behave in this manner.
We will investigate how the == operator and the Object.Equals method behave for equality checking.
Background
This article continues the series on how equality works in .NET. The purpose is to give developers a clearer understanding of how .NET handles equality for different types.
What We Learned So Far
Following are the key points that we learned from the previous parts until now:
- C# does not syntactically distinguish between value and reference equality which means it can sometimes be difficult to predict what the equality operator will do in particular situations.
- There are often multiple legitimate ways of comparing values. .NET addresses this by allowing types to specify their preferred natural way to compare for equality, and by providing a mechanism to write equality comparers that let you replace the default equality for each type.
- It is not recommended to test floating point values for equality because rounding errors can make this unreliable.
- There is an inherent conflict between implementing equality, type-safety and good Object Oriented practices.
- .NET provides equality implementations for types out of the box; a few methods are defined by the .NET Framework on the Object class and are available for all types.
- By default, the virtual Object.Equals method does reference equality for reference types and value equality for value types. For value types it uses reflection, which is a performance overhead. Any type can override the Object.Equals method to change how it checks for equality; e.g., String, Delegate and Tuple do this to provide value equality, even though they are reference types.
- The Object class also provides a static Equals method, which can be used when there is a chance that one or both of the parameters can be null; other than that, it behaves identically to the virtual Object.Equals method.
- There is also a static ReferenceEquals method, which provides a guaranteed way to check for reference equality.
- The IEquatable<T> interface can be implemented on a type to provide a strongly typed Equals method, which also avoids boxing for value types. It is implemented for primitive numeric types, but unfortunately Microsoft has not been very proactive in implementing it for other value types in the FCL (Framework Class Library).
- For value types, using the == operator gives us the same result as calling Object.Equals, but the underlying mechanism of the == operator in IL (Intermediate Language) is different. The Object.Equals implementation provided for that primitive type is not called; instead, the IL instruction ceq (https://msdn.microsoft.com/en-us/library/system.reflection.emit.opcodes.ceq(v=vs.110).aspx) is emitted, which compares the two values currently loaded on the stack using CPU registers.
- For reference types, the == operator and the Object.Equals method call work differently behind the scenes, which can be verified by inspecting the generated IL code. The == operator also uses the ceq instruction, which here compares memory addresses.
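A few of the points above can be checked with a short console program. This is only a minimal sketch: the class name RecapDemo and the tolerance 1e-9 are arbitrary choices made for illustration and do not come from the earlier parts of the series.

```csharp
using System;

class RecapDemo
{
    static void Main()
    {
        // Rounding error: 0.1 + 0.2 is not exactly 0.3 in binary floating point.
        double sum = 0.1 + 0.2;
        Console.WriteLine(sum == 0.3);                  // False
        Console.WriteLine(Math.Abs(sum - 0.3) < 1e-9);  // True (tolerance is an assumption)

        // The static Object.Equals tolerates null on either side.
        object a = null;
        object b = "text";
        Console.WriteLine(Object.Equals(a, b));         // False
        Console.WriteLine(Object.Equals(a, null));      // True

        // Boxing the same int twice yields two objects with equal values.
        object x = 42;
        object y = 42;
        Console.WriteLine(Object.ReferenceEquals(x, y)); // False
        Console.WriteLine(x.Equals(y));                  // True
    }
}
```

The last two lines show the difference between reference equality and the value equality that the Int32 override of Object.Equals provides.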
Equality Operator and String
Consider the following piece of code:
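using System;

class Program
{
    static void Main(String[] args)
    {
        string s1 = "Ehsan Sajjad";
        // String.Copy creates a new instance with the same characters
        string s2 = String.Copy(s1);

        Console.WriteLine(ReferenceEquals(s1, s2)); // reference equality
        Console.WriteLine(s1 == s2);                // equality operator
        Console.WriteLine(s1.Equals(s2));           // Equals method
        Console.ReadKey();
    }
}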
The above code is very similar to what we have looked at before, but this time we have String variables in place. We create a string and hold its reference in the s1 variable, and on the next line we create a copy of that string and hold its reference in another variable named s2.
Then we check the two variables for reference equality, that is, whether both point to the same memory location. In the next two lines, we check the output of the equality operator and the Equals method.
Now we will build the project and run it to see what it outputs on the console. The following is the output printed on the console:
False
True
True
You can see that ReferenceEquals has returned false, which means that the two strings are different instances, but the == operator and the Equals method have returned true. So it is clear that for Strings, the equality operator does indeed test the value for equality, not the reference, exactly as Object.Equals does.
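Recent .NET versions flag String.Copy as obsolete, so here is a variation of the same experiment that avoids it. It is a sketch under assumptions: the class name StringInstancesDemo and the variable names are made up for illustration, and building the second string from a char array is just one way to get a separate instance. It also shows why a copy is needed at all: two identical literals in the same program refer to the same instance.

```csharp
using System;

class StringInstancesDemo
{
    static void Main()
    {
        string literal1 = "Ehsan Sajjad";
        string literal2 = "Ehsan Sajjad";
        // Building a string from characters at run time yields a separate instance.
        string built = new string(literal1.ToCharArray());

        Console.WriteLine(ReferenceEquals(literal1, literal2)); // True  (same literal instance)
        Console.WriteLine(ReferenceEquals(literal1, built));    // False (different instances)
        Console.WriteLine(literal1 == built);                   // True  (value comparison)
        Console.WriteLine(literal1.Equals(built));              // True  (value comparison)
    }
}
```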
Behind the Scenes of Equality Operator for String
Let's see how the equality operator does that by examining the IL code generated for this example. To do that, open the Visual Studio Developer Command Prompt: go to Start Menu >> All Programs >> Microsoft Visual Studio >> Visual Studio Tools >> Developer Command Prompt.
Type ildasm on the command prompt. This will launch the IL disassembler, which is used to look at the IL code contained in an assembly. It is installed automatically when you install Visual Studio, so you don't need to do anything to install it.
Click the File menu to open it, and click the Open menu item, which will bring up a window to browse for the executable that we want to disassemble.
Now navigate to the location where your application executable file is located and open it.
This will bring up the contents of the assembly in hierarchical form. As the assembly contains multiple classes, all of them are listed.
The code we want to explore is in the Main method of the Program class, so navigate to the Main method and double-click it to bring up its IL code.
The IL code for Main looks like this:
From the IL, we can see that it calls the Equals method implementation that takes a parameter of type String. If we dig into the source code of that Equals overload in the String.cs file, we can see this implementation:
public bool Equals(String value)
{
    if (this == null)
        throw new NullReferenceException();
    if (value == null)
        return false;
    if (Object.ReferenceEquals(this, value))
        return true;
    if (this.Length != value.Length)
        return false;
    return EqualsHelper(this, value);
}
There is also an override of the Equals method in the String class, and its implementation looks like the following:
public override bool Equals(Object obj)
{
    if (this == null)
        throw new NullReferenceException();
    String str = obj as String;
    if (str == null)
        return false;
    if (Object.ReferenceEquals(this, obj))
        return true;
    if (this.Length != str.Length)
        return false;
    return EqualsHelper(this, str);
}
So it is pretty clear from the above that both methods contain essentially the same implementation and apply the same logic to determine whether two objects of String type are equal.
IL for String Override of Equals Method
First, let's look at the IL generated for s1.Equals(s2). There are no surprises, as it calls the Equals method, but this time it calls the IEquatable<string> implementation, which takes a string as its argument. The Object.Equals override is not called, because the compiler found a better match for the string argument that was supplied. See the picture below:
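The same overload resolution can be observed without a disassembler. The following is only an illustrative sketch: the Token type and its LastCalled field are hypothetical and exist purely to record which Equals overload the compiler picked. They are not part of String or of the earlier examples.

```csharp
using System;

// Hypothetical type used only to show which overload gets chosen.
class Token : IEquatable<Token>
{
    public string Text { get; set; }

    // Records the name of the overload that ran most recently.
    public static string LastCalled = "";

    public bool Equals(Token other)
    {
        LastCalled = "Equals(Token)";
        return other != null && Text == other.Text;
    }

    public override bool Equals(object obj)
    {
        LastCalled = "Equals(object)";
        return obj is Token t && Text == t.Text;
    }

    public override int GetHashCode()
    {
        return Text == null ? 0 : Text.GetHashCode();
    }
}

class OverloadDemo
{
    static void Main()
    {
        var a = new Token { Text = "x" };
        var b = new Token { Text = "x" };

        a.Equals(b);                          // argument is statically a Token
        Console.WriteLine(Token.LastCalled);  // Equals(Token)

        a.Equals((object)b);                  // argument is statically an object
        Console.WriteLine(Token.LastCalled);  // Equals(object)
    }
}
```

The choice is made at compile time from the static type of the argument, which is why s1.Equals(s2) with two string variables binds to the strongly typed overload.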
IL for == Operator for String
Now let's examine the IL generated for the string equality check done with the equality operator. There is no ceq instruction this time, which in previous posts we saw executed for value types and reference types when checking equality with the == operator. For String, we instead have a call to a new method named op_Equality(string, string), which takes two string arguments. We have not seen this kind of method before, so what is it actually?
The answer is that it is the overload of the C# equality operator (==) provided by the String class. In C#, when we define a type, we have the option to overload the equality operator for that type. For example, the Person class that we have been using in previous examples would look like the following if we overload the == operator for it:
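public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }

    public static bool operator ==(Person p1, Person p2)
    {
        bool areEqual = false;
        if (Object.Equals(p1, null) && Object.Equals(p2, null))
            areEqual = true;   // both references are null
        else if (Object.Equals(p1, null) || Object.Equals(p2, null))
            areEqual = false;  // exactly one reference is null
        else if (p1.Id == p2.Id)
            areEqual = true;
        else
            areEqual = false;
        return areEqual;
    }

    // C# requires != to be overloaded whenever == is overloaded.
    public static bool operator !=(Person p1, Person p2)
    {
        return !(p1 == p2);
    }
}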
The above code is pretty simple. We have declared an operator overload, which looks like a static method, but the thing to notice here is that the name of the method is operator ==. The similarity between an operator overload declaration and a static method is not a coincidence: it is actually compiled as a static method by the compiler. As discussed before, IL (Intermediate Language) has no concept of operators, events, etc.; it only understands fields and methods, so an operator overload can only exist as a method, which is what we observed in the IL code above. The compiler turns the operator overload into a special static method called op_Equality().
First, it checks whether both of the passed instances are null, in which case the references are obviously equal and it returns true. Then it checks whether only one of them is null, in which case they are not equal. Next, it checks whether the Id property of both references is equal; if so, they are equal, otherwise they are not.
This way, we can define our own implementation for our custom types according to the business requirements. As we discussed earlier, the equality of two objects depends entirely on the business flow of the application, so two objects might look equal to one consumer but not to another, according to their business logic.
This makes one thing clear: Microsoft has provided an == operator overload for the String class. We can even see it if we peek into the String class in Visual Studio by using Go to Definition, which looks like this:
In the above snapshot, we can see that there are two operator overloads: one for the equality operator and the other for the inequality operator, which works exactly the same way but returns the negation of the equality operator's output. One thing to note is that if you overload the == operator for a type, you also need to overload the != operator to get your code to compile.
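Another way to confirm that these operators exist as ordinary static methods is to look them up by their compiled names through reflection. This is a small sketch rather than anything taken from the String source: the class name OperatorAsMethodDemo and the sample strings are made up for illustration.

```csharp
using System;
using System.Reflection;

class OperatorAsMethodDemo
{
    static void Main()
    {
        Type[] twoStrings = { typeof(string), typeof(string) };

        // Operator overloads are compiled to static methods with special names.
        MethodInfo equality = typeof(string).GetMethod("op_Equality", twoStrings);
        MethodInfo inequality = typeof(string).GetMethod("op_Inequality", twoStrings);

        Console.WriteLine(equality != null);        // True
        Console.WriteLine(inequality != null);      // True
        Console.WriteLine(equality.IsStatic);       // True
        Console.WriteLine(equality.IsSpecialName);  // True

        // Invoking the method directly does the same work as writing s1 == s2.
        string s1 = "abc";
        string s2 = new string(new[] { 'a', 'b', 'c' });
        object result = equality.Invoke(null, new object[] { s1, s2 });
        Console.WriteLine(result);                  // True
    }
}
```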
Summary
- We now have enough understanding of what the C# equality operator does in the case of reference types. The following are the things that we need to keep in mind:
  - If there is an overload of the equality operator for the type being compared, it uses that overload, which is compiled as a static method.
  - If there is no overload of the operator for a reference type, the equality operator compares the memory addresses using the ceq instruction.
- One thing to note is that Microsoft made sure that the == operator overload and the Object.Equals override always give the same result for String, even though they are in fact different methods. That is an important thing to keep in mind when we start implementing our own Equals override: we should take care of the equality operator as well, otherwise our type will give different results from the Equals override and the equality operator, which would be problematic for consumers of the type. We will see in another post how to override the Equals method in a proper way.
- If we are changing how equality works for a type, we need to make sure we provide an implementation for both the Equals override and the == operator overload so that they give the same result; otherwise it would be confusing for other developers who use our type.
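The points above can be illustrated with a rough sketch. It is not the full treatment of overriding Equals, and the Account type, its Id-based notion of equality, and the class name ConsistencyDemo are all hypothetical choices made for this example. The operators delegate to Equals so that both paths agree, and the last two lines show that the operator is chosen from the static type of the operands, so object-typed variables fall back to reference comparison.

```csharp
using System;

// Hypothetical type: two accounts are considered equal when their Ids match.
class Account : IEquatable<Account>
{
    public int Id { get; set; }

    public bool Equals(Account other)
    {
        return !ReferenceEquals(other, null) && Id == other.Id;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Account);
    }

    public override int GetHashCode()
    {
        return Id.GetHashCode();
    }

    // Both operators delegate to Equals so the results always agree.
    public static bool operator ==(Account a, Account b)
    {
        return ReferenceEquals(a, null) ? ReferenceEquals(b, null) : a.Equals(b);
    }

    public static bool operator !=(Account a, Account b)
    {
        return !(a == b);
    }
}

class ConsistencyDemo
{
    static void Main()
    {
        var a = new Account { Id = 1 };
        var b = new Account { Id = 1 };

        Console.WriteLine(a == b);         // True
        Console.WriteLine(a.Equals(b));    // True
        Console.WriteLine(a != b);         // False

        // With object-typed variables there is no overload to bind to,
        // so == compares references while the virtual Equals still compares Ids.
        object oa = a, ob = b;
        Console.WriteLine(oa == ob);       // False
        Console.WriteLine(oa.Equals(ob));  // True
    }
}
```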