Introduction
Many C++ programmers were rather unhappy with the non-deterministic
finalization imposed on them by the .NET garbage collector. C++ programmers
were so accustomed to the RAII (Resource Acquisition Is Initialization) idiom,
where a destructor gets called when an object goes out of scope or when delete
is explicitly called on it, that a non-deterministic destructor simply didn't
fit their expectations or requirements. Microsoft's alternative was the
Dispose pattern, where classes had to implement IDisposable and the programmer
had to call Dispose on objects when they went out of scope. The basic issue
was that this required the programmer to call Dispose manually and
consistently whenever an object needed to be finalized, and it got worse when
the object had managed member objects that themselves needed Dispose called on
them, which meant they too had to implement IDisposable.
Tiresome sounding, isn't it?
Guess what? In C++/CLI, the Microsoft VC++ team gives us a destructor that
internally gets compiled into the Dispose method, while the old finalizer gets
its own alternate syntax. So we now have finalizers and destructors as two
separate entities that behave differently, as they should have in the previous
version of the language. The designers of C# made the unfortunate initial
mistake of calling their finalizer a destructor, and I presume there are tens
of thousands of C# coders out there who have no inkling that they have a basic
concept of object-lifetime management confused with the wrong thing.
Note
It's tempting to refer to automatic objects in C++/CLI as stack objects, but
it should be remembered that these seemingly stack-based objects actually
reside on the CLR heap, as they are still normal garbage-collected ref
objects. It's a C++ compiler trick that allows us to treat these variables
just as we treated stack-based objects in unmanaged C++ in the good old days.
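The native C++ behaviour that this trick imitates can be shown in a few lines of standard C++. This is a minimal sketch (the class name and the logging string are made up for illustration); it records the exact moment the destructor runs, at the closing brace of the enclosing scope:

```cpp
#include <string>

// Hypothetical trace string so the order of events can be observed.
static std::string g_log;

struct Resource {
    Resource()  { g_log += "ctor;"; }
    ~Resource() { g_log += "dtor;"; }  // runs the instant the object leaves scope
};

std::string run() {
    {
        Resource r;          // automatic (stack) object
        g_log += "body;";
    }                        // ~Resource() runs right here, deterministically
    g_log += "after;";
    return g_log;
}
```

The destructor fires between "body" and "after", not at some later, unpredictable point; that is the determinism C++/CLI restores for ref classes.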
The new syntax
In C++/CLI, destructors follow the same syntax used in pre-managed times,
where ~classname is the method name for the destructor. It also introduces a
new piece of syntax, !classname, which is the method name for the finalizer.
Here is what a typical class would look like :-
ref class R1
{
public:
R1()
{
Show("R1::ctor");
}
~R1()
{
Show("R1::dtor");
}
protected:
!R1()
{
Show("R1::fnzr");
}
};
The destructor (~R1
) gets compiled into a Dispose
method in the generated IL.
.method public newslot virtual
final instance void
Dispose() cil managed
{
.override [mscorlib]System.IDisposable::Dispose
.maxstack 1
IL_0000: ldstr "R1::dtor"
IL_0005: call void [mscorlib]
System.Console::WriteLine(string)
IL_000a: ldarg.0
IL_000b: call void [mscorlib]
System.GC::SuppressFinalize(object)
IL_0010: ret
}
The C# equivalent of the above would be :-
public void Dispose()
{
Console.WriteLine("R1::dtor");
GC.SuppressFinalize(this);
}
There is a call made to GC::SuppressFinalize in the generated Dispose method.
This ensures that the finalizer does not get called during the garbage
collection cycle that reclaims this object's memory. If that sounds confusing,
remember that we are still restricted by the environment we are targeting,
which happens to be the CLR. In the CLR, reference objects are allocated on
the CLR heap and their memory is reclaimed by the Garbage Collector once they
are out of use; there is no way the programmer can free the memory on his/her
own. So even if our destructor gets called, the memory will be released only
during the next GC cycle, and at that point we don't want the GC calling
Finalize on our object. GC::SuppressFinalize basically removes the object from
the finalization queue.
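The intent behind suppressing the finalizer can be sketched in plain standard C++ (this is an analogy, not CLR code; the class and counter are invented for illustration). A "disposed" flag plays the role that GC::SuppressFinalize plays in the generated Dispose: it guarantees the cleanup logic cannot run a second time later:

```cpp
// Sketch of the dispose-then-suppress idea, outside the CLR.
class Connection {
    bool disposed_ = false;
    int  cleanups_ = 0;           // counts how many times cleanup actually ran
public:
    void dispose() {
        if (disposed_) return;    // analogous to the finalizer being suppressed
        disposed_ = true;
        ++cleanups_;              // release the real resource exactly once
    }
    ~Connection() { dispose(); }  // "finalizer" path: only acts if not yet disposed
    int cleanups() const { return cleanups_; }
};

int demo() {
    Connection c;
    c.dispose();                  // deterministic cleanup
    c.dispose();                  // second call is a no-op
    return c.cleanups();          // cleanup ran exactly once
}
```

Whichever path runs first wins, and the other becomes a no-op, exactly the double-cleanup hazard that SuppressFinalize exists to avoid.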
How it's implemented
void _tmain()
{
R1 r;
}
I've declared r
as an automatic variable. Now let's see the IL
that gets generated for this :-
.method public static int32
main() cil managed
{
.vtentry 1 : 1
.maxstack 1
.locals (class R1 V_0)
IL_0000: ldnull
IL_0001: stloc.0
IL_0002: newobj instance void R1::.ctor()
IL_0007: stloc.0
IL_0008: ldloc.0
IL_0009: call instance void R1::Dispose()
IL_000e: ldc.i4.0
IL_000f: ret
}
The C# equivalent for that would be :-
public static int main()
{
R1 r = null;
r = new R1();
r.Dispose();
return 0;
}
Pretty straightforward stuff, as you can see, with Dispose being called when
the object goes out of scope. You might be a little surprised that there is no
try-catch block in there, but that's because our code fragment was too simple.
try-catch blocks are emitted only if they are required, and in the above case
they are not. Let's look at the following code snippet :-
void _tmain()
{
R1 r;
int y=100;
}
The IL generated :-
.method public static int32
main() cil managed
{
.vtentry 1 : 1
.maxstack 1
.locals (class R1 V_0,
int32 V_1)
IL_0000: ldnull
IL_0001: stloc.0
IL_0002: newobj instance void R1::.ctor()
IL_0007: stloc.0
.try
{
IL_0008: ldc.i4.s 100
IL_000a: stloc.1
IL_000b: leave.s IL_0014
}
fault
{
IL_000d: ldloc.0
IL_000e: call instance void R1::Dispose()
IL_0013: endfinally
}
IL_0014: ldloc.0
IL_0015: call instance void R1::Dispose()
IL_001a: ldc.i4.0
IL_001b: ret
}
The moment the compiler realizes that control might not reach the line that
calls Dispose, it wraps the body in a try block and calls Dispose from the
fault handler if an exception is thrown. The C# equivalent would be :-
public static int main()
{
R1 r = null;
int y;
r = new R1();
try
{
y = 100;
}
catch
{
r.Dispose();
throw; // a fault handler re-raises the exception after it runs
}
r.Dispose();
return 0;
}
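This try/fault plumbing is the compiler hand-generating what native C++ gives you through stack unwinding. A small standard C++ sketch (the Guard class and trace string are invented for illustration) shows that a destructor runs on both the normal path and the exceptional path, with no explicit try block in the user's code:

```cpp
#include <stdexcept>
#include <string>

static std::string g_trace;  // hypothetical trace of destruction order

struct Guard {
    ~Guard() { g_trace += "dtor;"; }  // runs even while an exception unwinds the stack
};

std::string run(bool shouldThrow) {
    g_trace.clear();
    try {
        Guard g;
        if (shouldThrow) throw std::runtime_error("boom");
        g_trace += "normal;";
    } catch (const std::runtime_error&) {
        g_trace += "caught;";          // by now ~Guard has already fired
    }
    return g_trace;
}
```

On the throwing path the destructor fires during unwinding, before the handler; on the normal path it fires at the end of the try block. The C++/CLI compiler emits the fault handler to get the same guarantee for ref classes.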
You could also declare the object as a handle and then manually call delete
on it, which equates to calling Dispose on your object.
void _tmain()
{
R1^ r = gcnew R1();
delete r;
}
The generated IL is a little more complex in this case (I am not entirely sure
why the seemingly unnecessary int variable is introduced, for instance) :-
.method public static int32
main() cil managed
{
.vtentry 1 : 1
.maxstack 1
.locals (class [mscorlib]System.IDisposable V_0,
class R1 V_1,
int32 V_2)
IL_0000: ldnull
IL_0001: stloc.1
IL_0002: newobj instance void R1::.ctor()
IL_0007: stloc.1
IL_0008: ldloc.1
IL_0009: stloc.0
IL_000a: ldloc.0
IL_000b: brfalse.s IL_0017
IL_000d: ldloc.0
IL_000e: callvirt
instance void [mscorlib]System.IDisposable::Dispose()
IL_0013: ldnull
IL_0014: stloc.2
IL_0015: br.s IL_0019
IL_0017: ldnull
IL_0018: stloc.2
IL_0019: ldc.i4.0
IL_001a: ret
}
As I mentioned, I am truly puzzled by the V_2 int32
variable. Here is the C# equivalent for those of you who don't like looking at
IL.
public static int main()
{
int v2;
R1 r = null;
r = new R1();
IDisposable d = r;
if (d != null)
{
d.Dispose();
v2 = 0;
}
else
{
v2 = 0;
}
return 0;
}
My best guess is that this helps the CLR Execution Engine do run-time
optimizations; in the above case, the entire if block can be skipped when r
is null.
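Interestingly, native C++ doesn't need that explicit branch at all, because the null check is built into delete itself: deleting a null pointer is defined to do nothing. A quick sketch (the Widget class and counter are made up for illustration):

```cpp
// In native C++, delete on a null pointer is a well-defined no-op,
// so the destructor is never invoked through a null pointer.
struct Widget {
    static int destroyed;
    ~Widget() { ++destroyed; }
};
int Widget::destroyed = 0;

int demo() {
    Widget* w = nullptr;
    delete w;                 // no-op: destructor not called on null
    w = new Widget();
    delete w;                 // destructor runs exactly once
    return Widget::destroyed;
}
```

The brfalse.s check in the IL above is the managed-world equivalent of that guarantee, inserted by the compiler because IDisposable::Dispose would otherwise be called through a null handle.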
How member objects are handled
See the following code snippet :-
#define Show(x) Console::WriteLine(x)
ref class R1
{
public:
R1()
{
Show("R1::ctor");
}
~R1()
{
Show("R1::dtor");
}
protected:
!R1()
{
Show("R1::fnzr");
}
};
ref class R
{
public:
R()
{
Show("R::ctor");
}
~R()
{
Show("R::dtor");
}
R1 r;
protected:
!R()
{
Show("R::fnzr");
}
};
Let's take a look at R
's constructor in the generated IL :-
.method public specialname rtspecialname
instance void .ctor() cil managed
{
.maxstack 2
IL_0000: ldarg.0
IL_0001: call instance void [mscorlib]System.Object::.ctor()
IL_0006: ldarg.0
IL_0007: newobj instance void R1::.ctor()
IL_000c: stfld class R1 modopt(
[Microsoft.VisualC]Microsoft.VisualC.IsByValueModifier) R::r
IL_0011: ldstr "R::ctor"
IL_0016: call void [mscorlib]System.Console::WriteLine(string)
IL_001b: ret
}
Equivalent C# code would be :-
public R()
{
this.r = ((R1 modopt(Microsoft.VisualC.IsByValueModifier)) new R1());
Console.WriteLine("R::ctor");
}
The compiler inserts a custom modopt modifier into the instantiation of the R1
member, which gives the JIT compiler a hint on how to treat it. In this case
it has been marked with Microsoft.VisualC.IsByValueModifier, which presumably
means that the object is to be treated as a pass-by-value object. Anyway,
that's beyond the scope of this article; the point I wanted to make here is
that the R object's constructor also instantiates and constructs the R1 member
object.
Now let's see the R
class destructor :-
.method public newslot virtual final instance void
Dispose() cil managed
{
.override [mscorlib]System.IDisposable::Dispose
.maxstack 1
.try
{
IL_0000: ldstr "R::dtor"
IL_0005: call void [mscorlib]System.Console::WriteLine(string)
IL_000a: leave.s IL_0018
}
fault
{
IL_000c: ldarg.0
IL_000d: ldfld class R1 modopt(
[Microsoft.VisualC]Microsoft.VisualC.IsByValueModifier) R::r
IL_0012: call instance void R1::Dispose()
IL_0017: endfinally
}
IL_0018: ldarg.0
IL_0019: ldfld class R1 modopt(
[Microsoft.VisualC]Microsoft.VisualC.IsByValueModifier) R::r
IL_001e: call instance void R1::Dispose()
IL_0023: ldarg.0
IL_0024: call void [mscorlib]System.GC::SuppressFinalize(object)
IL_0029: ret
}
Equivalent C# code is :-
public void Dispose()
{
try
{
Console.WriteLine("R::dtor");
}
catch
{
this.r.Dispose();
throw; // fault handlers re-raise; the member is disposed either way
}
this.r.Dispose();
GC.SuppressFinalize(this);
}
As you can see, Dispose
is called on the member object as well.
The compiler sure does generate a lot of code for us, eh?
In the case discussed above, the member object was also an automatic variable.
But what if we had a handle variable as a member? In that case we should
manually delete the member variable in our destructor; otherwise there isn't
much benefit to deterministic destruction if the member objects have to wait
for an unpredictable GC cycle before they get disposed. So this is what we
need to do in such cases :-
ref class R
{
public:
R()
{
r = gcnew R1();
Show("R::ctor");
}
~R()
{
delete r;
Show("R::dtor");
}
R1^ r;
protected:
!R()
{
Show("R::fnzr");
}
};
Warning
Do not delete
member objects manually from your finalizer, because there is
every chance that by the time the finalizer is called on your object, its member
objects might already have been finalized.
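The destructor-side rule maps directly onto familiar native C++ ownership. Here is a standard C++ analogue of the handle-member case above (class names mirror the article's R and R1; the trace string is invented so the ordering is visible): a raw owning pointer member that the outer destructor must delete, just as the C++/CLI destructor must delete r.

```cpp
#include <string>

static std::string g_order;  // hypothetical trace of destructor order

struct Inner {
    ~Inner() { g_order += "Inner::dtor;"; }
};

struct Outer {
    Inner* inner;
    Outer() : inner(new Inner()) {}
    ~Outer() {
        delete inner;              // explicit cleanup of the owned member,
        g_order += "Outer::dtor;"; // mirroring 'delete r;' in ~R()
    }
};

std::string demo() {
    g_order.clear();
    { Outer o; }                   // both destructors run at scope exit
    return g_order;
}
```

The member is torn down first because the delete appears at the top of the outer destructor body, matching the ~R() shown above. (In modern native C++ you would reach for a smart pointer instead of a raw owning pointer, but the raw form keeps the parallel with the handle member obvious.)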
Performance boost
By using destructors instead of finalizers whenever possible, you should see a
small-to-medium performance boost in your code. The problem with finalizers is
that the GC promotes objects that need to be finalized to at least Generation
2; the finalizer thread then has to run the Finalize method on those objects,
and the GC can only reclaim the memory in a future collection cycle.
Points to remember when using destructors
- You cannot have a method named Dispose in your class, for obvious reasons
- ~classname is the destructor and !classname is the finalizer
- Destructors get called when the object goes out of scope, but the memory won't be freed up until the next GC cycle
- The destructor and finalizer won't get called for the same object
- For automatic member variables, you don't need to do anything special
- For handle member variables, make sure to delete them manually in the destructor
Conclusion
Essentially, the C++/CLI deterministic-destructor implementation is internally
a syntactically pleasant form of the Dispose pattern, with the compiler
generating just about all the code we require. C# has a somewhat inferior
counterpart in the using keyword. The big plus of the C++/CLI destructor
syntax is that it fits naturally with what a native C++ programmer expects
his/her destructor to do, and he/she needn't even be aware of the Dispose
pattern being used internally. Thanks to Herb Sutter and his team :-)