When a record declaration has a property referencing itself, a reference cycle is generated which causes the compiler to write a ToString() method which will cause a stack overflow. The C# language designers say this is by design and refuse to issue a compiler warning, while most merely mortals would consider this a bug. Even worse, the debugging session might stop abruptly and the developer has no way to figure out the problem, because the stack trace is no longer available. Be warned when you design your own records.
Introduction
The new record feature of C# is supposed to make the life of developers easy. Just define the properties of a record and the compiler will autogenerate a class with some useful methods like ToString()
. However, if the developer is not careful, his design might cause the compiler to write code that will fail. This article gives a quick introduction into C# 9 records, explains the problem, what you have to do to rectify it and makes some suggestions how the C# compiler should be improved.
C# 9 Record
The new record
keyword in C# 9 allows to define a class
type like this:
record Person(string Name);
This is a kind of a class
declaration with the name Person
and one string
property: Name
. Based on this simple line, the compiler creates IL code which could look in C# like this:
class Person: IEquatable<Person> {
public string Name { get; init; }
protected virtual Type EqualityContract => typeof(Person);
public Person(string name) {Name = name;}
public override bool Equals(object? obj) => Equals(obj as Person);
public virtual bool Equals(Person? other) {
return !(other is null) &&
EqualityContract == other.EqualityContract &&
EqualityComparer<string>.Default.Equals(Name, other.Name);
}
public static bool operator ==(Person? left, Person? right)
=> (object)left! == right || (left?.Equals(right) ?? false);
public static bool operator !=(Person? left, Person? right)
=> !(left == right);
public override int GetHashCode() {
return HashCode.Combine(EqualityComparer<Type>.Default.GetHashCode(EqualityContract),
EqualityComparer<string>.Default.GetHashCode(Name));
}
protected virtual bool PrintMembers(StringBuilder builder) {
builder.Append(nameof(Name));
builder.Append(" = ");
builder.Append(this.Name);
return true;
}
public override string ToString() {
var builder = new StringBuilder();
builder.Append(nameof(Person));
builder.Append(" { ");
if (PrintMembers(builder)) builder.Append(' ');
builder.Append('}');
return builder.ToString();
}
}
This is a lot of boilerplate code a developer no longer needs to write.
A record
is a class
, which uses the values of the properties to decide if two records are equal:
var smith1 = new Person("Smith");
var smith2 = new Person("Smith");
Console.WriteLine($"smith1==smith2: {smith1==smith2}");
Output:
smith1==smith2: True
If person
would be a class without overriden Equals()
, then the comparison would return false
.
Cyclical Reference
When a type (class
, record
, struct
) references itself, a cyclical reference gets created:
class Twin {
string Name;
Twin? OtherTwin;
}
This Twin
class is nothing special, it just references itself.
The same can be defined using record
instead of class
:
record Twin(string Name, Twin OtherTwin);
Looks innocent enough and compiles easily.
I then wrote a unit test:
#nullable enable
using Microsoft.VisualStudio.TestTools.UnitTesting;
namespace CSharpRecordTest {
record Twin(string Name) {
public Twin? OtherTwin { get; set; }
}
[TestClass]
public class UnitTest1 {
[TestMethod]
public void TestMethod1() {
var twinA = new Twin("A");
var twinB = new Twin("B");
twinA.OtherTwin = twinB;
twinB.OtherTwin = twinA;
Assert.AreEqual(twinA, twinA.OtherTwin.OtherTwin);
}
}
}
Note that I had to write OtherTwin
as normal property
, which record
allows to do and treats the same way as if it was defined in record Twin(...)
. Properties written within the parenthesis are readonly
. If OtherTwin
would be readonly
, it would not be possible to create first TwinA
, because the constructor would require TwinB
to exist already. Therefore, twinA
needs to get created first and twinB
assigned as OtherTwin
only later on.
The test runs just fine. But if you try to debug it, set a breakpoint on Assert.AreEqual()
and move with the mouse over twinA
, you get a strange error message and the debugging session stops abruptly:
This was my real life experience when using C# records for the first time. Then I spent over a day trying to figure out what happened. The error message does not give the slightest clue and nothing can be examined, since the debugging session was forced to end. So I decided to report this to Microsoft. To make it easy reproducible, I wrote a console application using ToString()
, which I figured caused the problem. There I got the clue what was wrong:
static void Main(string[] args) {
var twinA = new Twin("A");
var twinB = new Twin("B");
twinA.OtherTwin = twinB;
twinB.OtherTwin = twinA;
Console.WriteLine(twinA);
}
Output on Screen
Stack overflow.
Repeat 4585 times:
--------------------------------
at RecordToStringProblem.Program+Twin.PrintMembers(System.Text.StringBuilder)
at RecordToStringProblem.Program+Twin.ToString()
at System.Text.StringBuilder.Append(System.Object)
--------------------------------
at RecordToStringProblem.Program+Twin.PrintMembers(System.Text.StringBuilder)
at RecordToStringProblem.Program+Twin.ToString()
at System.IO.TextWriter.WriteLine(System.Object)
at System.IO.TextWriter+SyncTextWriter.WriteLine(System.Object)
at System.Console.WriteLine(System.Object)
at RecordToStringProblem.Program.Main(System.String[])
What is happening?
TwinA.ToString()
calls TwinA.OtherTwin.ToString()
which calls TwinA.OtherTwin.OtherTwin.ToString()
and so on until the stack overflows.
Note: A cyclical reference also gets created when a record references another record and the second record references the first record. A cyclical reference might actually involve many records and might be difficult for the developer to notice.
Response from the C# Language Designers
I thought I found a major bug and reported it on Github:
To my surprise, the response was:
Thanks for reporting @PeterHuberSg. Yes this is the correct repository, however, this behavior is "By Design". See #49396.
Unbelievable, but true. For them, this is not a bug, but a feature. I always thought calling a bug a feature is a joke, but here we are.
Even worse, there is no hint at all in the C# record specification that it is the developers' job to recognise that there is a cyclical reference and that in this case, the developer has to override the Equals()
method in his record statement.
Of course, this design means the least work for the compiler, but I feel this situation needs to be improved. I see three possibilities:
- The compiler detects that a property causes cyclical reference and does not call
ToString()
on that property. - The compiler detects that a property causes cyclical reference and creates a warning in the source code and provides a hint to the developer that he needs to overwrite
Equals()
. - The compiler detects that a property causes cyclical reference and creates a warning in the source code and provides a hint to the developer that he needs to overwrite
Equals()
. The developer can mark the property not to be included in ToSring()
.
They really should have done 1) right from the beginning. 2) is the bare minimum. However, the idea in 3) is that properties can be marked not to be included in ToString()
(and Equals()
and GetHashCode()
). This could come in handy, because equality should depend only on readonly
properties, otherwise classes like Dictionary
will not work.
If we (=all developers) learned one lesson over the years, then it is this: The compiler should catch as many errors as possible. However, I gave up hope that I can convince the C# language designers of anything.
If you have read until here, you seem genuinely interested in coding. In this case, I would recommend you my article Debugging Multithreaded Code in Real Time. Some of the hardest bugs to find are those created by race conditions in multithreading. I managed to write some code which achieves the impossible: Providing debugging information without slowing down noticeably the running code. The article got 4.96 stars out of 5.
Update
From the discussion (see wkempf and others) below, I understand now a bit better, why it might be difficult for the compiler to detect that a cyclical reference will truly occur at run time. If you look at my code above, creating TwinA
and TwinB
and then linking TwinA
to TwinB
does not create a cyclical reference yet. If my program would stop there, ToString()
would work just fine.
In a single linked list, exactly that happens, one member of the list linking to the next member. Of course, this will still lead to a stack overflow when the linked list is thousands of links long. And even without a stack overflow, the ToString()
will look bad even with just 10+ members, because each member will be shown, which might take a lot of space.
In reality, many linked lists are double linked lists, meaning each member links to the previous and the next member. This allows to remove any member quickly from the list. In a long single linked list, this is extremely slow. Of course, the double linked list will always create a cyclical reference, only this is difficult for the compiler to detect without complicated analysis of the code.
So the point being raised in the discussion is that the compiler cannot easily detect at compile time if a cyclical reference will be truly generated at run time and therefore the compiler should not raise a warning at all.
I see that differently. Like with the nullable feature, a warning is just a warning, even if it might not be correct, which happens with the nullable warnings all the time. But the nullable feature still raises the warning and the developer decides if it is indeed a real problem or not. The same approach should be used with records and possible cyclical references. Even if there is never a cyclical reference in the data during runtime, the ToString()
might still not be usable.
History
- 26th April, 2021: Initial version
- 28th April, 2021: Update