Story of Equality in .NET - Part 6

25 Mar 2018, CPOL
This article explains how the Equals method and the == operator behave for the String class, and how that differs from other reference types.

Introduction

In this post we look at the String type and how equality works for it. You might be aware that for strings the equality operator compares values, not references, as we saw in the first post of this series. This is because String overloads the == operator and also overrides the Equals method so that both behave in this manner.

We will investigate how the == operator and the Object.Equals method behave when checking strings for equality.

Background

This article continues the series on how equality works in .NET. The purpose is to give developers a clearer understanding of how .NET handles equality for different types.

What We Learned So Far

These are the key points we have learned in the previous parts:

  • C# does not syntactically distinguish between value and reference equality which means it can sometimes be difficult to predict what the equality operator will do in particular situations.
  • There are often multiple legitimate ways of comparing values. .NET addresses this by allowing types to specify their preferred natural way to compare for equality, and by providing a mechanism to write equality comparers that let you replace the default equality for a type.
  • It is not recommended to test floating point values for equality because rounding errors can make this unreliable.
  • There is an inherent conflict between implementing equality, type-safety and good Object Oriented practices.
  • .NET provides equality implementations for types out of the box. A few methods are defined by the .NET Framework on the Object class and are therefore available for all types.
  • By default, the virtual Object.Equals method does reference equality for reference types and value equality for value types. For value types it uses reflection, which is a performance overhead. Any type can override Object.Equals to change how it checks for equality, e.g., String, Delegate and Tuple do this to provide value equality even though they are reference types.
  • The Object class also provides a static Equals method, which can be used when there is a chance that one or both of the parameters are null. Other than that, it behaves identically to the virtual Object.Equals method.
  • There is also a static ReferenceEquals method, which provides a guaranteed way to check for reference equality.
  • The IEquatable<T> interface can be implemented on a type to provide a strongly typed Equals method, which also avoids boxing for value types. It is implemented for the primitive numeric types, but unfortunately Microsoft has not been very proactive in implementing it for other value types in the FCL (Framework Class Library).
  • For primitive value types, using the == operator gives the same result as calling Object.Equals, but the underlying mechanism is different in IL (Intermediate Language). The Equals implementation provided for the primitive type is not called; instead the IL instruction <a href="https://msdn.microsoft.com/en-us/library/system.reflection.emit.opcodes.ceq(v=vs.110).aspx">ceq</a> is emitted, which compares the two values currently loaded on the evaluation stack, and the comparison is carried out using CPU registers.
  • For reference types, the == operator and the Object.Equals method call also work differently behind the scenes, which can be verified by inspecting the generated IL. Here too the == operator uses the ceq instruction, which in this case compares memory addresses.
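A few of the points above can be sketched in a short program. This is only an illustration, not code from the earlier parts of the series: the class name EqualityRecap and the helper NullSafeEquals are made up for this example.

```csharp
using System;

// Hypothetical demo class, used only to illustrate the recap points above.
public static class EqualityRecap
{
    // The static Object.Equals tolerates null on either side.
    public static bool NullSafeEquals(object a, object b)
    {
        return Object.Equals(a, b);
    }

    public static void Main()
    {
        // Two separately boxed copies of the same int value.
        object boxed1 = 5;
        object boxed2 = 5;

        // The virtual Equals is overridden by Int32 and compares the values.
        Console.WriteLine(boxed1.Equals(boxed2));                  // True
        // ReferenceEquals sees two different boxes on the heap.
        Console.WriteLine(Object.ReferenceEquals(boxed1, boxed2)); // False
        // == on variables typed as object compares references.
        Console.WriteLine(boxed1 == boxed2);                       // False

        // The static Equals does not throw when arguments are null.
        Console.WriteLine(NullSafeEquals(null, null));             // True
        Console.WriteLine(NullSafeEquals(null, "text"));           // False
    }
}
```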


Equality Operator and String

Consider the following piece of code:

C#
class Program
{ 
    static void Main(String[] args)
    { 
        string s1 =  "Ehsan Sajjad";
        string s2 = String.Copy(s1);
 
        Console.WriteLine(ReferenceEquals(s1, s2));
        Console.WriteLine(s1 == s2);
        Console.WriteLine(s1.Equals(s2));
            
        Console.ReadKey(); 
    } 
}

The above code is very similar to what we have looked at before, but this time the variables are of type String. We create a string and hold its reference in the variable s1, and on the next line we create a copy of that string and hold its reference in another variable named s2.

Then we check the two variables for reference equality, i.e., whether both point to the same memory location. In the next two lines we check the result of the equality operator and of the Equals method.

Now we build the project and run it to see what it prints. The following is the output printed on the console:

Image 1

You can see that ReferenceEquals has returned false, which means that the two strings are different instances, but the == operator and the Equals method have both returned true. So it is clear that for strings the equality operator does indeed test the value for equality, not the reference, exactly as String's Equals method does.
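The same experiment can be sketched without String.Copy, which newer .NET versions report as obsolete. The helper MakeDistinctCopy and the class name StringCopyDemo are assumptions made for this illustration. The last two lines additionally show that the value comparison depends on the compile-time type being string: once the same instances are held in object variables, == falls back to comparing references.

```csharp
using System;

// Hypothetical demo class; MakeDistinctCopy stands in for String.Copy.
public static class StringCopyDemo
{
    // Builds a new string instance with the same characters as the source
    // (for a non-empty source this is always a separate object).
    public static string MakeDistinctCopy(string source)
    {
        return new string(source.ToCharArray());
    }

    public static void Main()
    {
        string s1 = "Ehsan Sajjad";
        string s2 = MakeDistinctCopy(s1);

        Console.WriteLine(Object.ReferenceEquals(s1, s2)); // False: two instances
        Console.WriteLine(s1 == s2);                       // True: String's == overload
        Console.WriteLine(s1.Equals(s2));                  // True: String.Equals(string)

        // Same two instances, but now typed as object.
        object o1 = s1;
        object o2 = s2;
        Console.WriteLine(o1 == o2);      // False: reference comparison
        Console.WriteLine(o1.Equals(o2)); // True: virtual call reaches String.Equals(object)
    }
}
```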

Behind the Scenes of Equality Operator for String

Let's see how the equality operator does that by examining the IL code generated for this example. To do so, open the Visual Studio Developer Command Prompt: go to Start Menu >> All Programs >> Microsoft Visual Studio >> Visual Studio Tools >> Developer Command Prompt.

Image 2

Type ildasm at the command prompt. This launches the IL Disassembler, which is used to look at the IL code contained in an assembly. It is installed automatically with Visual Studio, so you don't need to install anything separately.

Image 3

Click the File menu and then the Open menu item, which brings up a window to browse for the executable that we want to disassemble.

Image 4

Now navigate to the location where your application executable file is located and open it.

Image 5

This brings up the contents of the assembly in hierarchical form. As the assembly contains multiple classes, all of them are listed.

The code we want to explore is in the Main method of the Program class, so navigate to the Main method and double-click it to bring up its IL code.

Image 6

The IL code for Main looks like this:

Image 7

From the IL, we can see that it calls the Equals overload that takes a parameter of type String. If we dig into the source code of that overload in String.cs, we can see this implementation:

C#
public bool Equals(String value)
{
      if (this == null)                        
          throw new NullReferenceException();  

      if (value == null)
          return false;

      if (Object.ReferenceEquals(this, value))
          return true;

      if (this.Length != value.Length)
          return false;

      return EqualsHelper(this, value);
}

The String class also overrides the Equals(Object) method, and that implementation looks like the following:

C#
public override bool Equals(Object obj) 
{
      if (this == null)                        
          throw new NullReferenceException();  
 
      String str = obj as String;
      if (str == null)
          return false;
 
      if (Object.ReferenceEquals(this, obj))
          return true;
 
      if (this.Length != str.Length)
          return false;
 
      return EqualsHelper(this, str);
}

So it is pretty clear from the above that both methods use the same logic to determine whether two objects of type String are equal. The only difference is that the override first has to cast its Object argument to String.
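A minimal sketch can confirm that the two methods agree for the same inputs. The wrapper names ViaStringOverload and ViaObjectOverride are hypothetical and exist only to force one overload or the other through the parameter type.

```csharp
using System;

// Hypothetical demo class comparing String.Equals(string) and String.Equals(object).
public static class StringEqualsOverloads
{
    // The argument is typed as string, so String.Equals(string) is chosen.
    public static bool ViaStringOverload(string a, string b)
    {
        return a.Equals(b);
    }

    // The argument is typed as object, so the Equals(object) override is chosen.
    public static bool ViaObjectOverride(string a, object b)
    {
        return a.Equals(b);
    }

    public static void Main()
    {
        string s1 = "Ehsan Sajjad";
        string s2 = new string(s1.ToCharArray()); // distinct instance, same characters

        Console.WriteLine(ViaStringOverload(s1, s2));      // True
        Console.WriteLine(ViaObjectOverride(s1, s2));      // True

        // A null argument yields false from both methods.
        Console.WriteLine(ViaStringOverload(s1, null));    // False
        Console.WriteLine(ViaObjectOverride(s1, null));    // False

        // A non-string object fails the "obj as String" cast in the override.
        Console.WriteLine(ViaObjectOverride(s1, 42));      // False

        // Different lengths are rejected before any character comparison.
        Console.WriteLine(ViaStringOverload(s1, "Ehsan")); // False
    }
}
```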

IL for String Override of Equals Method

First, let's look at the IL generated for s1.Equals(s2). There are no surprises, as it calls the Equals method, but note that it calls the IEquatable<string> implementation, which takes a string argument, and not the Object.Equals override, because the compiler found a better match for the string argument that was supplied. See the picture below:
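The overload choice described above can also be observed without a disassembler. The following sketch uses a made-up type, Probe, which has the same pair of Equals methods as String and records which one was called. It is an illustration of C# overload resolution in general, not of String's internals.

```csharp
using System;

// Hypothetical type with the same two Equals methods that String has.
public sealed class Probe : IEquatable<Probe>
{
    // Remembers which Equals method ran last.
    public string LastCalled { get; private set; }

    public bool Equals(Probe other)
    {
        LastCalled = "Equals(Probe)";
        return ReferenceEquals(this, other);
    }

    public override bool Equals(object obj)
    {
        LastCalled = "Equals(object)";
        return ReferenceEquals(this, obj);
    }

    // Equality here is reference based, so the default hash code is consistent.
    public override int GetHashCode()
    {
        return base.GetHashCode();
    }
}

public static class OverloadResolutionDemo
{
    // Calls Equals with the argument typed either as Probe or as object
    // and reports which overload the compiler bound the call to.
    public static string WhichOverload(bool passAsObject)
    {
        Probe a = new Probe();
        Probe b = new Probe();

        if (passAsObject)
            a.Equals((object)b);
        else
            a.Equals(b);

        return a.LastCalled;
    }

    public static void Main()
    {
        Console.WriteLine(WhichOverload(false)); // Equals(Probe)
        Console.WriteLine(WhichOverload(true));  // Equals(object)
    }
}
```

The compile-time type of the argument decides the overload, which is why s1.Equals(s2) with a string argument never reaches the Equals(Object) override.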

Image 8

IL for == Operator for String

Now let's examine the IL generated for the string comparison done with the equality operator. This time there is no ceq instruction, which in previous posts we saw being emitted when value types and reference types are compared with the == operator. Instead, for String we have a call to a new method named op_Equality(string, string), which takes two string arguments. We have not seen this kind of method before, so what is it actually?

The answer is that it is the overload of the C# equality operator (==) provided by the String class. In C#, when we define a type, we have the option to overload the equality operator for that type. For example, the Person class that we have been using in previous examples would look like the following if we overloaded the == operator for it:

C#
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }

    public static bool operator ==(Person p1, Person p2)
    {
        // Note: using == for the null checks here would call this operator
        // again and cause a StackOverflowException due to infinite recursion.
        if (ReferenceEquals(p1, null) && ReferenceEquals(p2, null))
            return true;   // both are null

        if (ReferenceEquals(p1, null) || ReferenceEquals(p2, null))
            return false;  // exactly one is null

        return p1.Id == p2.Id;
    }

    // C# requires != to be overloaded whenever == is overloaded.
    public static bool operator !=(Person p1, Person p2)
    {
        return !(p1 == p2);
    }
}

The above code is pretty simple. We have declared an operator overload, which looks like a static method except that its name is operator ==. The similarity to a static method declaration is not a coincidence: the compiler actually compiles it as a static method. As discussed before, IL (Intermediate Language) has no concept of operators, events, etc.; it only understands fields and methods, so an operator overload can only exist as a method. That is what we observed in the IL code above: the compiler turns the operator overload into a special static method called op_Equality().
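Besides reading the IL, the existence of this method can be checked with reflection. The sketch below looks up op_Equality on System.String and invokes it directly. The class and helper names (OperatorMethodDemo, FindStringEqualityOperator, InvokeStringEquality) are made up for this example.

```csharp
using System;
using System.Reflection;

// Hypothetical demo class that inspects String's == overload through reflection.
public static class OperatorMethodDemo
{
    // Looks up the public static method the compiler generated for String's == operator.
    public static MethodInfo FindStringEqualityOperator()
    {
        return typeof(string).GetMethod(
            "op_Equality", new[] { typeof(string), typeof(string) });
    }

    // Invokes op_Equality directly, as the IL "call" instruction would.
    public static bool InvokeStringEquality(string a, string b)
    {
        MethodInfo method = FindStringEqualityOperator();
        // The method is static, so the target instance is null.
        return (bool)method.Invoke(null, new object[] { a, b });
    }

    public static void Main()
    {
        MethodInfo method = FindStringEqualityOperator();

        Console.WriteLine(method.Name);          // op_Equality
        Console.WriteLine(method.IsStatic);      // True
        Console.WriteLine(method.IsSpecialName); // True

        string s1 = "Ehsan Sajjad";
        string s2 = new string(s1.ToCharArray());
        Console.WriteLine(InvokeStringEquality(s1, s2)); // True
    }
}
```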

The overload first checks whether both of the passed instances are null, in which case they are equal and it returns true. Next, if only one of them is null, they are not equal. Otherwise it compares the Id property of both instances: if the Ids are equal the instances are equal, otherwise they are not.

This way, we can define our own implementation for our custom types according to the business requirements. As we discussed earlier, equality of two objects depends entirely on the business logic of the application, so two objects that look equal in one application might not be equal in another.

This makes it clear that Microsoft has provided a == operator overload for the String class. We can see it if we peek into the String class in Visual Studio using Go to Definition, which looks like this:

Image 9

In the above snapshot, we can see that there are two operator overloads: one for the equality operator and the other for the inequality operator, which works exactly the same way but returns the negation of the equality operator's result. One thing to note is that if you overload the == operator for a type, you must overload the != operator as well, otherwise your code will not compile.
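To see the pair of operators in use, here is a self-contained sketch of a Person type with both == and != defined, following the Id-based rule used earlier in this article. The demo class PersonOperatorDemo and the sample values are made up. The pragma only silences the compiler warnings about Equals and GetHashCode not being overridden, which this sketch deliberately leaves out.

```csharp
using System;

// This sketch omits Equals/GetHashCode overrides on purpose,
// so the related compiler warnings are switched off here.
#pragma warning disable 660, 661

public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }

    public static bool operator ==(Person p1, Person p2)
    {
        // ReferenceEquals avoids re-entering this operator for the null checks.
        if (ReferenceEquals(p1, null) && ReferenceEquals(p2, null))
            return true;
        if (ReferenceEquals(p1, null) || ReferenceEquals(p2, null))
            return false;

        return p1.Id == p2.Id;
    }

    // Defined as the negation of ==, so the two can never disagree.
    public static bool operator !=(Person p1, Person p2)
    {
        return !(p1 == p2);
    }
}

public static class PersonOperatorDemo
{
    public static void Main()
    {
        Person a = new Person { Id = 1, Name = "Ehsan" };
        Person b = new Person { Id = 1, Name = "Sajjad" };
        Person c = new Person { Id = 2, Name = "Ehsan" };
        Person nobody = null;

        Console.WriteLine(a == b);         // True: same Id, Name is ignored
        Console.WriteLine(a != b);         // False
        Console.WriteLine(a == c);         // False: different Id
        Console.WriteLine(a != c);         // True
        Console.WriteLine(nobody == null); // True: both operands are null
        Console.WriteLine(a == null);      // False: only one operand is null
    }
}
```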

Summary

  • We now have enough understanding of what the C# equality operator does in the case of reference types. These are the things that we need to keep in mind:
    • If there is an overload of the equality operator for the type being compared, that overload is called as a static method.
    • If there is no overload of the operator for the reference type, the equality operator compares the memory addresses using the ceq instruction.
  • Microsoft made sure that the == operator overload and the Equals override of String always give the same result, even though they are in fact different methods. That is an important thing to keep in mind when we implement our own Equals override: we should take care of the equality operator as well, otherwise our type will give different results from Equals and from the equality operator, which would be problematic for consumers of the type. We will see in another post how to override the Equals method properly.
  • If we change how equality works for a type, we need to provide both an Equals override and a == operator overload that give the same result, otherwise it would be confusing for other developers who use our type.
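As a rough sketch of the last point, one way to keep the two in sync is to route the == operator through the type's own Equals method. The Employee type below is hypothetical and is not the approach of the follow-up post, which is not shown here. It only illustrates the idea under the assumption that equality is decided by Id.

```csharp
using System;

// Hypothetical type whose Equals methods and operators share one comparison.
public sealed class Employee : IEquatable<Employee>
{
    public Employee(int id)
    {
        Id = id;
    }

    public int Id { get; private set; }

    // The single place where the equality rule lives.
    public bool Equals(Employee other)
    {
        if (ReferenceEquals(other, null))
            return false;
        return Id == other.Id;
    }

    public override bool Equals(object obj)
    {
        // "as" yields null for non-Employee arguments, which Equals(Employee) rejects.
        return Equals(obj as Employee);
    }

    // Equal objects must produce equal hash codes, so hash the same field.
    public override int GetHashCode()
    {
        return Id.GetHashCode();
    }

    public static bool operator ==(Employee left, Employee right)
    {
        if (ReferenceEquals(left, null))
            return ReferenceEquals(right, null);
        return left.Equals(right);
    }

    public static bool operator !=(Employee left, Employee right)
    {
        return !(left == right);
    }
}

public static class EmployeeDemo
{
    public static void Main()
    {
        Employee a = new Employee(7);
        Employee b = new Employee(7);

        Console.WriteLine(Object.ReferenceEquals(a, b)); // False: two instances
        Console.WriteLine(a == b);                       // True
        Console.WriteLine(a.Equals(b));                  // True
        Console.WriteLine(a.Equals((object)b));          // True
        Console.WriteLine(Object.Equals(a, b));          // True
    }
}
```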

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)