I usually write C# articles but lately I am dealing with C++ more often, so this post is about C++ (yet I keep some C# comparisons). It is all about a common but hard-to-find problem in C++: Smart-pointers, methods that keep a reference to the "this" instance and calling anything that can clear the smart-pointer and kill the object (usually callbacks) before we finish using any information of that this
variable.
Simply put, the this
variable is not a smart pointer. If an instance method (that is, one that has a this
) nullifies/clears the smart pointer that holds the actual instance, then the current object will be freed but the this
pointer will still be available, pointing to garbage. So, any access to other members of the "this" instance will deal with garbage, and many bad things can happen.
Worse than that, in Garbage Collected languages this would never happen (at least not if there's no interaction with unmanaged objects or bugs in the language itself) as even the implicit this
variable is a garbage collected reference (or if you prefer, a kind of smart pointer). This makes porting code from Garbage Collected languages to C++ much harder than it might look at a first glance.
So, look at the most basic example:
-
In C#
public class MyClass
{
private readonly string Name;
public static MyClass GlobalInstance;
public MyClass()
{
Name = "I only put a name here so we have a field to access";
}
public void MakeUnavailable()
{
if (GlobalInstance == this)
{
GlobalInstance = null;
Console.WriteLine("The instance \"" + this.Name + "\" is no more the GlobalInstance.");
}
}
}
At some moment, we can do this:
MyClass.GlobalInstance = new MyClass();
MyClass.GlobalInstance.MakeUnavailable();
-
In C++
class MyClass
{
private:
string Name;
public:
static shared_ptr<MyClass> GlobalInstance;
MyClass()
{
Name = "I only put a name here so we have a field to access";
}
void MakeUnavailable()
{
if (GlobalInstance.get() == this)
{
GlobalInstance.reset();
printf("The instance %s is no more the GlobalInstance.", this->Name);
}
}
};
Again, at some other place we can do this:
MyClass::GlobalInstance = make_shared<MyClass>();
MyClass::GlobalInstance->MakeUnavailable();
Note: I know that static variables and public fields aren't a good design, but I only want to show the most basic example of what can go wrong. This code is also not intended to be thread-safe.
You can see that the C++ and C# code are very similar. Yet, the C# code will work fine. After setting the GlobalInstance
to null, the Name
field can be read without problems because the this
variable guarantees that the current object is kept alive. The C++ version is not safe. As soon as the GlobalInstance
is reset, the instance dies, then the this->Name
on the printf()
will read garbage. Interestingly, though, many calls like this will work because the reclaimed memory isn't clear or used immediately to anything else (in particular if you use an int
variable instead of a string
), but then it becomes harder to tell what is wrong when it actually fails.
More realistic situations
I know that the basic example is naive. Nobody writing "good" code put static variables to keep an instance alive. Yet, that example was only to show what can go wrong using a very small amount of code.
In most real-world cases, the error happens during callbacks and when using ref-counted objects like COM ones. The code can be as simple as this:
void InvokeCallback()
{
if (m_InvokingCallback)
return;
m_InvokingCallback = true;
InvokeCallback(this);
m_InvokingCallback = false;
}
Code similar to this is used in many notifications to protect the object from generating the notification recursively. The problem is that the object may die during a call to InvokeCallback()
, and the bold line that does m_InvokingCallback = false
will acces garbage.
Imagine that InvokeCallback()
is actually a MouseEntered
event. The Button
is ref-counted and will have a count of at least one when on the screen. Yet, the callback may be removing the button from the screen, causing its count to by reduced to zero if no other smart pointer or AddRef()
was done, deleting it before the callback returns. Again, remember that the this
variable is a basic pointer, not a smart pointer. It will not do an AddRef()/Release()
pair on that button instance, so we may have a crash here.
Solving the Problem
Knowing that the problem exists is great. Better yet is to know how to solve it.
But that may not be that easy. Considering that we don't have the source-code of a control/component that has the issue, we may need to look for work-arounds.
- If facing a situation of a button that can't be removed from the screen during a
MoveMoved
, Click
or similar event, the work-around may be to post a message to the parent control, windows or through a Dispatcher->BeginInvoke()
so we execute that code later, when the control is not executing a callback anymore;
-
If our code is invoking a method of an object that has the issue, which is actually crashing in the middle of that call, we may try to protect that call. That is, instead of writing something like:
button->Click();
We could do this:
ComPtr<Button> keepAlive(button);
button->Click();
Or, if we don't care about exceptions, we can do direct calls to AddRef/Release:
button->AddRef();
button->Click();
button->Release();
In those two alternatives, we will keep the button alive in memory even if it is removed from the screen during its callback, avoiding it from dying during the callback;
-
And, of course, if we are the owners of the component we can try to protect our component from dying at the wrong time by doing an AddRef/Release
(with or without a smart pointer) before invoking any callback:
void InvokeCallback()
{
if (m_InvokingCallback)
return;
m_InvokingCallback = true;
ComPtr<OurComponentType> keepAlive(this);
InvokeCallback(this);
m_InvokingCallback = false;
}
It's important to note that the object doesn't need to be a COM object to have the AddRef
and Release
methods. Also, it is possible to create many variants of it, even using booleans to mark an object as being in a problematic situation and either forbidding deletes from happening in the middle of a callback (probably throwing an exception if the user tries to do it) or by registering a delete requested flag and then deleting the object when the problematic situation is finished.
Why are those errors happening?
To me it is even funny to see that .NET, Java and many languages/frameworks solved the issue many time ago and now, it is appearing again. Maybe because of performance concerns, reference counted objects/variables are being used again. Smart pointers give the false impression that they are better than garbage collection. They are more predictable. They free an object as soon as it is not used anymore. Yet ref-counted variables have problems with circular references (cases when A holds a reference to B and B has a reference to A... but nobody else ever uses them, so they get lost in memory) and, in particular with C++, invoking any method by using a ->
on a smart pointer means that such method received a non-smart pointer to that object, the this
, and clearing the smart pointer in the middle of that action, directly or indirectly, is to ask for trouble.