This article presents an implementation of an efficient reference counting system which supports user-provided allocators, weak references, and is thread-safe.
Disclaimer: This article uses material published on
this page.
Background
This article uses some advanced C++ concepts such as variadic templates, perfect forwarding, inplace new.
Introduction
Reference counting is a well-known method that automatically handles lifetimes of objects with complex dependencies. C++ standard library provides std::shared_ptr class that is a smart pointer retaining shared ownership of a dynamically allocated object. The class maintains internal reference counters, but does not allow accessing them directly. The counters can only be changed by creating new smart pointers, which limits their application to C++ language. There are, however, certain situations when custom reference counting system is needed. Diligent Engine is a cross-platform rendering engine. It is packed into dynamic libraries, which can be used in any language or system, and thus needs to provide simple and efficient method for managing object lifetimes. That system could be implemented with C++ smart pointers, however that would require introducing additional level of abstraction: instead of exposing native interfaces, the engine would have to expose smart pointers, which is undesirable. Besides that, implementing own resource counting system gives more flexibility and provides better possibilities for optimizations. Diligent Engine for performance reasons implements its own atomic operations. On Win32 platform, both Diligent Engine and standard C++ atomics use InterlockedIncrement() Windows function, but because DE introduces less overhead, its implementation is more than two times faster than one provided by C++ standard library.
Basic Reference Counting System
A very basic resource counting system can be implemented using the base class that may look like this:
template<typename Base>
class RefCountedObject : public Base
{
public:
void AddRef()
{
++m_NumReferences;
}
void Release()
{
if(--m_NumReferences == 0)
{
delete this;
}
}
private:
virtual ~RefCountedObject() = 0;
int m_NumReferences = 0;
};
The class defines virtual destructor that allows deleting any instance of derived class through the base class pointer. The class counts the number of outstanding references and destroys itself when the last reference is released. That kind of implementation provides very limited functionality and only works in a single-threaded scenario. Real resource counting system should support many more features:
- Allow different types of memory allocation and deallocation methods
- Provide weak pointers that implement non-owning references
- Object ownership flexibility (an object may be part of another reference counted object)
- Provide exception and multi-threaded safety
- Efficiency
Designing a system that meets all the requirements above is an interesting and challenging task. This article presents a solution implemented in Diligent Engine.
Overview
Diligent Engine implements thread-safe reference counting system that supports weak pointers. It includes the following classes and interfaces:
IObject
is the base interface to the reference counted object IReferenceCounters
is the interface to the helper object that implements reference counting and controls the lifetime of an object
RefCountedObject
is the template base class for all reference counted objects RefCountersImpl
is the template class that implements actual reference counting functionality defined by the IReferenceCounters
interface MakeNewRCObj
is the template class that is responsible for creating object + reference counters pair RefCntAutoPtr
is the template class that implements strong pointer RefCntWeakPtr
is the template class that implements weak pointer
The following diagram illustrates different components of the system:
IReferenceCounters Interface
The declaration of IReferenceCounters
interface is given in the listing below:
class IReferenceCounters
{
public:
virtual Atomics::Long AddStrongRef() = 0;
virtual Atomics::Long ReleaseStrongRef() = 0;
virtual Atomics::Long AddWeakRef() = 0;
virtual Atomics::Long ReleaseWeakRef() = 0;
virtual void GetObject(class IObject **ppObject) = 0;
virtual Atomics::Long GetNumStrongRefs()const = 0;
virtual Atomics::Long GetNumWeakRefs()const = 0;
};
The interface defines the following functions:
AddStrongRef()
- Increments the number of strong references by 1
. The method returns the number of strong references after incrementing the counter. It is thread-safe and does not require explicit synchronization. ReleaseStrongRef()
- Decrements the number of strong references by 1
and destroys the referenced object when the counter reaches zero. If there are no more weak references, destroys the reference counters object itself. The method is thread-safe and does not require explicit synchronization. AddWeakRef()
- Increments the number of weak references by 1
and returns the number of weak references after incrementing the counter. The method is thread-safe and does not require explicit synchronization. ReleaseWeakRef()
- Decrements the number of weak references by 1
. If there are no more strong and weak references, destroys the reference counters object itself. GetObject()
- Gets the pointer to the IUnknown
interface of the referenced object. If the object was destroyed, nullptr
will be returned. If the object is alive, the pointer to the object's IUnknown
interface will be returned. In this case, the number of strong references to the object will be incremented by 1
. GetNumStrongRefs()
- Returns the number of outstanding strong references. This method is intended for debug purposes only. In a multithreaded environment, the returned number may not be reliable as other threads may simultaneously change the actual value of the counter. The only reliable value is 0
as the object is destroyed when the last strong reference is released. GetNumWeakRefs()
- Returns the number of outstanding weak references. This method is intended for debug purposes only. In a multithreaded environment, the returned number may not be reliable as other threads may simultaneously change the actual value of the counter.
IObject Interface
The declaration of IObject
interface is given in the listing below:
class IObject
{
public:
virtual void QueryInterface
( const Diligent::INTERFACE_ID &IID, IObject **ppInterface ) = 0;
virtual Atomics::Long AddRef() = 0;
virtual Atomics::Long Release() = 0;
virtual IReferenceCounters* GetReferenceCounters()const = 0;
};
The interface defines the following functions:
QueryInterface()
- Queries the specific interface. The method increments the number of strong references by 1
. The interface must be released by a call to Release()
method when it is no longer needed. AddRef()
- Increments the number of strong references by 1
. This method is equivalent to GetReferenceCounters()->AddStrongRef()
. It is thread-safe and does not require explicit synchronization. Release()
- Decrements the number of strong references by 1
and destroys the object when the counter reaches zero. This method is equivalent to GetReferenceCounters()->ReleaseStrongRef()
. It is thread-safe and does not require explicit synchronization. GetReferenceCounters()
- Returns the pointer to IReferenceCounters
interface of the associated reference counters object. The method does not increment the number of strong references to the returned object.
RefCountedObject Class
RefCountedObject
is the base template class that implements common functionality of all reference counted objects. Its source code is given in the listing below:
template<typename Base>
class RefCountedObject : public Base
{
public:
RefCountedObject(IReferenceCounters *pRefCounters)noexcept :
m_pRefCounters( ValidatedCast<RefCountersImpl>(pRefCounters) )
{
}
virtual ~RefCountedObject()
{
}
inline virtual IReferenceCounters* GetReferenceCounters()const override final
{
return m_pRefCounters;
}
inline virtual Atomics::Long AddRef()override final
{
return m_pRefCounters->AddStrongRef();
}
inline virtual Atomics::Long Release()override final
{
return m_pRefCounters->ReleaseStrongRef();
}
private:
template<typename AllocatorType, typename ObjectType>
friend class MakeNewRCObj;
friend class RefCountersImpl;
void* operator new(size_t Size)
{
return new Uint8[Size];
}
void operator delete(void *ptr)
{
delete[] reinterpret_cast<Uint8*>(ptr);
}
template<typename ObjectAllocatorType>
void* operator new(size_t Size, ObjectAllocatorType &Allocator,
const Char* dbgDescription,
const char* dbgFileName, const Int32 dbgLineNumber)
{
return Allocator.Allocate(Size, dbgDescription, dbgFileName, dbgLineNumber);
}
template<typename ObjectAllocatorType>
void operator delete(void *ptr, ObjectAllocatorType &Allocator,
const Char* dbgDescription,
const char* dbgFileName, const Int32 dbgLineNumber)
{
return Allocator.Free(ptr);
}
RefCountersImpl *const m_pRefCounters;
};
Few comments regarding the implementation:
AddRef()
, Release()
, and GetReferenceCounters()
methods are declared as final
and inline
. The first keyword tells the compiler that the methods will never be overridden in derived classes, so that the compiler can avoid virtual calls. inline
is another hint for the compiler to inline the methods. - Calls to
AddRef()
and Release()
are delegated to the reference counters object, pointer to which is kept in m_pRefCounters
member. Since the type of this pointer is RefCountersImpl*
rather than IReferenceCounters*
, a good compiler will be able to avoid virtual calls and potentially inline them. - Valid pointer to the reference counters object must be provided to the constructor. Note that
m_pRefCounters
does not necessarily have to track the object itself. It may be another object that owns this one, directly or indirectly. - The class defines two versions of operator
new
and two operators delete
. These operators are private and are only accessible by MakeNewRCObj
class that is responsible for constructing new objects. The first pair uses standard C++ operator new to allocate required amount of memory. The second pair uses custom allocator. These versions take reference to the allocator as an argument. They also use debug information such as allocation description, file name and line number where allocation was performed. Also note that either operator delete
is only called if an exception is thrown during the object construction.
RefCountersImpl Class
RefCountersImpl
class is responsible for the bulk of the actual reference counting functionality and implements IReferenceCounters
interface.
Destroying Controlled Object
The most important operation that RefCountersImpl
class must support is destroying the object it controls. This operation needs to be flexible enough to support different types of objects as well as different allocators. One way to achieve this is to make the class a template class parameterized by the type of the controlled object as well as by the allocator type:
template<typename ControlledObjectType, typename ObjectAllocatorType>
class RefCountersImpl : public IReferenceCounters
Such template class can hold pointers to the object and to the allocator:
ControlledObjectType* m_pObject;
ObjectAllocatorType * const m_pObjectAllocator;
so that the object can be destroyed as follows:
auto *pObj = m_pObject;
auto *pAllocator = m_pObjectAllocator;
if (pAllocator)
{
pObj->~ControlledObjectType();
pAllocator->Free(pObj);
}
else
{
delete pObj;
}
As long as ControlledObjectType
is a type derived from RefCountedObject
, it will have a virtual destructor and will be successfully destroyed by pObj->~ControlledObjectType()
. This approach also handles any allocator and is thus a legitimate method to handle different types of objects and different types of allocators. In fact, this is how it was originally implemented in Diligent Engine. It however has one drawback: the type of RefCountedObject::m_pRefCounters
member needs to be IReferenceCounter
. The reason is that when an object is part of another reference counting object (for example, default view of a texture is part of the texture object), it keeps pointer to the reference counters of the parent object. Since the types of the two objects are unrelated, the only alternative left is using IReferenceCounter
class which is common ancestor for all reference counters. The biggest disadvantage of this method is that AddRef()
and Release()
methods of RefCountedObject
class perform virtual calls to AddStrongRef()
and ReleaseStrongRef()
of a specific RefCountersImpl
instantiation, which is suboptimal. To the contrary, in the implementation of RefCountedObject
class shown above, the type of RefCountedObject::m_pRefCounters
member is RefCountersImpl*
(non template). All methods of IReferenceCounter
interface implemented by RefCountersImpl
class are labeled as final
, and inline
, so a good compiler will not only be able to eliminate virtual call, but to eliminate the function call altogether by inlining the methods.
The question now is: if RefCountersImpl
class does not depend on the type of the object and allocator, how can we provide the required flexibility? The answer is using object wrapper template class that knows how to destroy the object. First of all, we will define non-template abstract base class as shown in the listing below:
class ObjectWrapperBase
{
public:
virtual void DestroyObject() = 0;
virtual void QueryInterface
( const Diligent::INTERFACE_ID &iid, IObject **ppInterface )=0;
};
This class only provides two pure virtual
methods: one to destroy the object, and the second to query the specified interface (which we will use to implement GetObject()
method later). So as long as RefCountersImpl
class keeps pointer to the object wrapper of type ObjectWrapperBase
, it can destroy controlled object as easy as pWrapper->DestroyObject()
. Specific details about destroying particular object with particular allocator are defined by the inherited template class given in the listing below:
template<typename ObjectType, typename AllocatorType>
class ObjectWrapper : public ObjectWrapperBase
{
public:
ObjectWrapper(ObjectType *pObject, AllocatorType *pAllocator) :
m_pObject(pObject),
m_pAllocator(pAllocator)
{}
virtual void DestroyObject()
{
if (m_pAllocator)
{
m_pObject->~ObjectType();
m_pAllocator->Free(m_pObject);
}
else
{
delete m_pObject;
}
}
virtual void QueryInterface
( const Diligent::INTERFACE_ID &iid, IObject **ppInterface )override final
{
return m_pObject->QueryInterface(iid, ppInterface);
}
private:
ObjectType* const m_pObject;
AllocatorType * const m_pAllocator;
};
You can see that all the object destruction logic moved to DestroyObject()
virtual
method. Since the class is parameterized by the type of object and allocator (the same parameters that RefCountersImpl
class had originally), it can handle all object types and all allocator types. The only question we need to answer is how to initialize specific instance of ObjectWrapper
class. For that, RefCountersImpl
class provides template method Attach()
that has the same template parameters and initializes specific instance of ObjectWrapper
class. However, instead of creating the class on the heap, the method initializes it in place in the raw memory buffer provided by RefCountersImpl
class:
static const size_t ObjectWrapperBufferSize =
sizeof(ObjectWrapper<IObject, IMemoryAllocator>) / sizeof(size_t);
size_t m_ObjectWrapperBuffer[ObjectWrapperBufferSize];
template<typename ObjectType, typename AllocatorType>
void Attach(ObjectType *pObject, AllocatorType *pAllocator)
{
new(m_ObjectWrapperBuffer) ObjectWrapper<ObjectType, AllocatorType>
(pObject, pAllocator);
m_ObjectState = ObjectState::Alive;
}
Typical size of m_ObjectWrapperBuffer
buffer will be the size of three pointers (vtbl
pointer plus m_pObject
and m_pAlloctor
members). In-place new
operation
new(m_ObjectWrapperBuffer) ObjectWrapper<ObjectType, AllocatorType>(pObject, pAllocator);
initializes raw memory with an instance of ObjectWrapper<ObjectType, AllocatorType>
class, whose m_pObject
and m_pAlloctor
members will store pointers provided to the constructor from the calling function. Also, the compiler will generate specific DestroyObject()
method instantiation for the ObjectType
and AllocatorType
types, which will be accessible though the virtual table of the ObjectWrapper<ObjectType, AllocatorType>
class, pointer to which will in turn be stored in the first element of m_ObjectWrapperBuffer
buffer. So putting everything together, RefCountersImpl::ReleaseStrongRef()
will execute the following code to destroy the controlled object:
size_t ObjectWrapperBufferCopy[ObjectWrapperBufferSize];
for(size_t i=0; i < ObjectWrapperBufferSize; ++i)
ObjectWrapperBufferCopy[i] = m_ObjectWrapperBuffer[i];
auto *pWrapper = reinterpret_cast<ObjectWrapperBase*>(ObjectWrapperBufferCopy);
pWrapper->DestroyObject();
Class Members and Methods
Now when we described how RefCountersImpl
class manages various types of objects allocated with different allocators, we can describe the implementation of class methods. But first, let's take a look at other members that the class defines:
Atomics::AtomicLong m_lNumStrongReferences;
Atomics::AtomicLong m_lNumWeakReferences;
ThreadingTools::LockFlag m_LockFlag;
enum class ObjectState : Int32
{
NotInitialized,
Alive,
Destroyed
}m_ObjectState = ObjectState::NotInitialized;
m_lNumStrongReferences
is the strong reference counter m_lNumWeakReferences
is the weak reference counter m_LockFlag
is the lock flag used to gain exclusive access to the object members m_ObjectState
is the state of the object (not initialized, alive or destroyed)
Let's now look at the implementation of the class methods. AddStrongRef()
, AddWeakRef()
, GetNumStrongRefs()
, and GetNumWeakRefs()
are straightforward:
inline virtual Atomics::Long AddStrongRef()override final
{
return Atomics::AtomicIncrement(m_lNumStrongReferences);
}
inline virtual Atomics::Long AddWeakRef()override final
{
return Atomics::AtomicIncrement(m_lNumWeakReferences);
}
inline virtual Atomics::Long GetNumStrongRefs()const override final
{
return m_lNumStrongReferences;
}
inline virtual Atomics::Long GetNumWeakRefs()const override final
{
return m_lNumWeakReferences;
}
Note that inline
and final
keywords are very strong hints to a good compiler that these methods need to be inlined when they are called using the pointer of RefCountersImpl*
type.
The real work is done by ReleaseStrongRef()
, ReleaseWeakRef()
and GetObject()
methods. At first glance, the implementation may seem to be straightforward, but in a multithreaded environment, there may be a lot of tricky situations. Consider few examples:
- There is one strong reference and one weak reference remaining. One thread releases the last strong reference, while the second one releases weak reference. The first thread destroys controlled object, but which one should destroy the reference counter itself? If the threads simply check that both counters have reached
0
, then the object will be released twice, which will cause undefined behavior. - There is one strong reference and one weak reference remaining. One thread releases the last strong reference, while the second one attempts to get the object by calling
GetObject()
. Who wins in this situation? How to avoid returning reference to the destroyed object?
These were just few examples, but there exists many more complex cases. We will now give implementations of all three functions for reference, and then give detailed explanation for every function:
inline virtual Atomics::Long ReleaseStrongRef()override final
{
auto RefCount = Atomics::AtomicDecrement(m_lNumStrongReferences);
if( RefCount == 0 )
{
ThreadingTools::LockHelper Lock(m_LockFlag);
size_t ObjectWrapperBufferCopy[ObjectWrapperBufferSize];
for(size_t i=0; i < ObjectWrapperBufferSize; ++i)
ObjectWrapperBufferCopy[i] = m_ObjectWrapperBuffer[i];
auto *pWrapper = reinterpret_cast<ObjectWrapperBase*>(ObjectWrapperBufferCopy);
m_ObjectState = ObjectState::Destroyed;
bool bDestroyThis = m_lNumWeakReferences == 0;
Lock.Unlock();
pWrapper->DestroyObject();
if( bDestroyThis )
SelfDestroy();
}
return RefCount;
}
inline virtual Atomics::Long ReleaseWeakRef()override final
{
ThreadingTools::LockHelper Lock(m_LockFlag);
auto NumWeakReferences = Atomics::AtomicDecrement(m_lNumWeakReferences);
if( NumWeakReferences == 0 && m_ObjectState == ObjectState::Destroyed )
{
Lock.Unlock();
SelfDestroy();
}
return NumWeakReferences;
}
inline virtual void GetObject( class IObject **ppObject )override final
{
if( m_ObjectState != ObjectState::Alive)
return;
ThreadingTools::LockHelper Lock(m_LockFlag);
auto StrongRefCnt = Atomics::AtomicIncrement(m_lNumStrongReferences);
if( m_ObjectState == ObjectState::Alive && StrongRefCnt > 1 )
{
auto *pWrapper = reinterpret_cast<ObjectWrapperBase*>(m_ObjectWrapperBuffer);
pWrapper->QueryInterface(Diligent::IID_Unknown, ppObject);
}
Atomics::AtomicDecrement(m_lNumStrongReferences);
}
Let's now take a look at how these functions work. Let's start with ReleaseStrongRef()
. The function first atomically decrements the strong reference counter without acquiring the lock. This is important as locking is an expensive operation and you want to pay this cost only when it is absolutely necessary. If the decremented value is 0
, which means that the function released the last reference, object should be destroyed. Note that Atomics::AtomicDecrement()
decrements the counter atomically. This means that if several threads reach the instruction at the same time, access to the counter will be serialized. The function returns decremented value, and only one thread will read 0
. Note, that it is crucially important to use the value returned by the function, because if we used m_lNumStrongReferences
in comparison, an object could be destroyed multiple times as more than one thread could read 0
. The thread that reads 0
, starts destroying the object:
auto RefCount = Atomics::AtomicDecrement(m_lNumStrongReferences);
if( RefCount == 0 )
{
Interference between ReleaseStrongRef() and GetObject() methods
This is the first situation where we need to carefully design interference with other functions. What happens if another thread has a weak reference and starts getting strong reference to an object through GetObject()
? Without special care, the method can potentially return reference to an object that will very soon be destroyed by ReleaseStrongRef()
from another thread. After ReleaseStrongRef()
starts destroying the object, it acquires exclusive access to the class using the lock. GetObject()
also acquires the lock, so only one method in one of the threads can now be running. Since ReleaseStrongRef()
cannot be stopped and the object will be destroyed anyway, it is the responsibility of GetObject()
to detect this situation and avoid returning reference to the soon-to-be destroyed object. After obtaining the lock, GetObject()
first increments the number of strong references (we will discuss later why it is important to increment the counter while having exclusive access to the counters) and examines the returned value. There are two cases now:
- Returned value is
1
. This means that there are no more alive strong reference to the object, and the object is either already destroyed or will be destroyed very soon. In this case, we should not return reference to the object. - Returned value is greater than
1
. This means that there exists at least one another strong reference to the object. Since we already incremented the counter, no ReleaseStrongRef()
in other threads will be able to decrement it to 0
(assuming the calls to AddRef()
and Release()
are correctly balanced), so it is safe to return the reference to the object.
Note that QueryInterface()
also increments the strong reference counter, but before calling the method we need to be sure that the object is alive.
The following two figures illustrates two possible scenarios that show what happens if one thread is releasing the last strong reference to the object, while the second one is attempting to get strong reference to the object through a weak reference using GetObject()
. In the first scenario, ReleaseStrongRef()
decrements strong reference counter first, and the object will be released:
In the second scenario, GetObject()
increments strong reference counter first, and the object is kept alive:
Note that it is crucial that GetObject()
increments the strong reference counter only after acquiring the lock. Consider what may have happened if GetObject()
did not acquire the lock and there were several threads running GetObject()
while another thread was releasing the last reference to the object:
As you can see, what happens in the scenario above is that two threads running GetObject()
increment strong reference counter. As a result, the thread that increments the counter second sees two strong reference and starts returning the object. The object, however will have been destroyed by that time or will be destroyed soon after by the thread running ReleaseStrongRef()
. This scenario is not possible if GetObject()
acquires the lock:
The crucial difference is that if the lock is acquired, the second thread (thread 3) running GetObject()
will only be able to increment reference counter after the first thread running GetObject()
(thread 2) has decremented it. As a result, thread 3 will also see only one strong reference and will not return the object. In the alternative scenario where one of the threads running GetObject()
increments the counter before ReleaseStrongRef()
decrements it, the object will be kept alive and both threads 2 and 3 will obtain valid object reference.
There may be two questions that a careful reader may ask at this point. First, if GetObject()
acquires the lock, why do we use atomic operations to increment and decrement the counter? This is because other methods (AddStrongRef()
and ReleaseStrongRef()
) access the counter without acquiring the lock. The second question is: if AddStrongRef()
increments reference counter without acquiring the lock, is it possible that the same faulty scenario happens when one thread runs ReleaseStrongRef()
, second runs GetObject()
while the third one runs AddStrongRef()
? The answer is no because since the third thread runs AddStrongRef()
, there exist at least one another strong reference to the object. So the first thread running ReleaseStrongRef()
is not releasing the last reference.
Interference between ReleaseStrongRef() and ReleaseWeakRef() Methods
Without special care, there may be a problem if two threads are simultaneously releasing the last strong and weak reference through ReleaseStrongRef()
and ReleaseWeakRef()
methods. If both methods see there are no more strong and weak references, two steps are taken to assure that only one method destroys the reference counting object itself. First, ReleaseStrongRef()
checks the number of strong references while still holding the lock and sets the flag indicating if reference counting object needs to be released:
bool bDestroyThis = m_lNumWeakReferences == 0;
Second, ReleaseWeakRef()
decrements weak reference counter only after acquiring the lock. If two threads are running ReleaseStrongRef()
and ReleaseWeakRef()
, there are two scenarios possible depending on which method first acquires the lock. In the first scenario, ReleaseStrongRef()
acquires the lock first. Since number of weak references it will see will not be equal zero, the method will not destroy the reference counting object:
In the second scenario, ReleaseWeakRef()
acquires the lock first. However, ReleaseStrongRef()
must first destroy the controlled object, so ReleaseWeakRef()
must not self destroy reference counting object. To detect this situation, the methods use m_ObjectState
flag. The state is atomically set to ObjectState::Destroyed
by ReleaseStrongRef()
method. If ReleaseWeakRef()
sees that the state is not ObjectState::Destroyed, it means there are alive strong references or ReleaseStrongRef()
method is not completed as in the scenario below:
Note that it is crucial that m_ObjectState
is accessed when lock is acquired and that weak reference counter is only decremented while lock is acquired as well. For example, consider the following scenario where weak reference counter is decremented without acquiring the lock, which leads to self-destroying reference counters object twice:
Acquiring the lock has two safety effects:
- If
ReleaseStrongRef()
sets bDestroyThis
flag to true
, this means there are no other threads that may run code related to weak references, because ReleaseWeakRef()
decrements weak reference counter after acquiring the lock. So reference counters object can be safely destroyed - If
ReleaseWeakRef()
sees that m_ObjectState
is set to ObjectState::Destroyed
, then all strong reference-related code must be completed by this time because the object state is updated while keeping the lock. In this case, the reference counters can be safely destroyed as well
Interference between ReleaseWeakRef() and GetObject() methods
There is in fact no interference possible between these two methods if they are used correctly. If GetObject()
is called in one thread, while another thread is running ReleaseWeakRef()
, this means there are at least two weak references. Therefore, ReleaseWeakRef()
will only decrement the counter, but will not destroy the reference counters object as there will be at least one another outstanding weak reference.
MakeNewRCObj Class
MakeNewRCObj
is a class that is responsible for creating object + reference counters pair. The class is a template class parameterized by the type of the object and type of the memory allocator:
template<typename ObjectType, typename AllocatorType = IMemoryAllocator>
class MakeNewRCObj
The class defines the following private
members:
AllocatorType* const m_pAllocator;
IObject* const m_pOwner;
const Char* const m_dbgDescription;
const char* const m_dbgFileName;
const Int32 m_dbgLineNumber;
where:
m_pAllocator
is the pointer to the allocator that will be used to allocate memory for the object. If nullptr
, default system allocator is used. m_pOwner
is the pointer to the object that owns the one to be created. If nullptr
, the object has no owner. m_dbgDescription
, m_dbgFileName
, and m_dbgLineNumber
are debug members provided to the allocator to describe the allocation for debug purposes.
The class provides two constructors to initialize its members:
MakeNewRCObj(AllocatorType &Allocator,
const Char* dbgDescription,
const char* dbgFileName,
const Int32 dbgLineNumber,
IObject* pOwner = nullptr)noexcept :
m_pAllocator(&Allocator),
m_pOwner(pOwner),
m_dbgDescription(dbgDescription),
m_dbgFileName(dbgFileName),
m_dbgLineNumber(dbgLineNumber)
{
}
MakeNewRCObj(IObject* pOwner = nullptr)noexcept :
m_pAllocator(nullptr),
m_pOwner(pOwner),
m_dbgDescription(nullptr),
m_dbgFileName(nullptr),
m_dbgLineNumber(0)
{}
The class defines template operator ()
that performs allocation of the object:
template<typename ... CtorArgTypes>
ObjectType* operator() (CtorArgTypes&& ... CtorArgs)
{
RefCountersImpl *pNewRefCounters = nullptr;
IReferenceCounters *pRefCounters = nullptr;
if(m_pOwner != nullptr)
pRefCounters = m_pOwner->GetReferenceCounters();
else
{
pNewRefCounters = new RefCountersImpl();
pRefCounters = pNewRefCounters;
}
ObjectType *pObj = nullptr;
try
{
if(m_pAllocator)
pObj = new(*m_pAllocator, m_dbgDescription, m_dbgFileName, m_dbgLineNumber)
ObjectType(pRefCounters, std::forward<CtorArgTypes>(CtorArgs)... );
else
pObj = new ObjectType
( pRefCounters, std::forward<CtorArgTypes>(CtorArgs)... );
if(pNewRefCounters != nullptr)
pNewRefCounters->Attach<ObjectType, AllocatorType>(pObj, m_pAllocator);
}
catch (...)
{
if(pNewRefCounters != nullptr)
pNewRefCounters->SelfDestroy();
throw;
}
return pObj;
}
There are few interesting things about the operator. First, it is a variadic template function. It can take any number of arguments of any types. The function passes all its arguments to the object constructor using perfect forwarding mechanism:
ObjectType(pRefCounters, std::forward<CtorArgTypes>(CtorArgs)... )
Perfect forwarding assures that every parameter is passed as either lvalue
, or rvalue
, depending on its original type. Second, the method uses in-place new
to create object using custom allocator in case it is provided:
pObj = new(*m_pAllocator, m_dbgDescription, m_dbgFileName, m_dbgLineNumber)
ObjectType(pRefCounters, std::forward<CtorArgTypes>(CtorArgs)... );
If allocator is not provided, default allocator is used:
pObj = new ObjectType( pRefCounters, std::forward<CtorArgTypes>(CtorArgs)... );
Recall that RefCountedObject
defines two private
versions of operator new
, and that MakeNewRCObj
is a friend class of RefCountersImpl
. MakeNewRCObj
is thus the only class that has access to these operators, and is the only place where instances of classes derived from RefCountersImpl
can be created on the heap.
After the object is created, the method attaches this object to the reference counters using template Attach()
method that we discussed earlier.
If an exception is thrown by the object constructor, the method catches it, destroys reference counters object, and re-throws the exception. Note that in our original implementation, reference counters object was created by the object constructor. This resulted in a memory leak if constructor threw an exception, because destructor would never be called and memory never freed.
There is one tricky situation related to correct exception handling. Consider a scenario, where object A
owns object B
that keeps weak reference WP to object A
. If constructor of A
throws an exception, then destructor of weak reference may try to destroy the reference counting object. As we know, this operation is performed by the MakeNewRCObj
class, and we must avoid destroying the same object twice. What helps here is that ReleaseWeakRef()
checks if the state of the object is ObjectState::Destroyed
. Since the object has never been constructed, its state will be ObjectState::NotInitialized
, and ReleaseWeakRef()
will not destroy the reference counting object.
The following macros is defined to facilitate usage of MakeNewRCObj
class:
#define NEW_RC_OBJ(Allocator, Desc, Type, ...)\
MakeNewRCObj<Type, typename std::remove_reference<decltype(Allocator)>::type>\
(Allocator, Desc, __FILE__, __LINE__, ##__VA_ARGS__)
With the help of this macros, typical allocation looks like this:
BufferD3D11Impl *pBufferD3D11 =
NEW_RC_OBJ(m_BufObjAllocator, "BufferD3D11Impl instance", BufferD3D11Impl)
(m_BuffViewObjAllocator, this, BuffDesc, BuffData );
Smart Pointer Classes
The reference counting system provides two smart pointer classes, RefCntAutoPtr
and RefCntWeakPtr
that implement strong reference and weak reference functionality. The class implementation is relatively straightforward as they are thin wrappers over RefCountersImpl
and RefCountedObject
classes.
Conclusion
This article presented an implementation of an efficient reference counting system. The system supports user-provided allocators, weak references, and is thread-safe. The source code for the classes described in the article can be found in the attached archive. It is also available on GitHub.
History
- 17th June, 2017: Initial version