Introduction
For built-in types that cannot take part in inheritance, such as char, short, int, and float, we can optimize further so that sizeof(smart pointer) == sizeof(raw pointer).
Examples in Win32:
sizeof(int*) == 4
sizeof(std::shared_ptr<int>) == 8
sizeof(boost::shared_array<int>) == 8
sizeof(smart pointer in this section) == 4
It is more economical and quicker than std::shared_ptr or boost::shared_array.
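To reproduce the first two figures on your own toolchain, a minimal sketch such as the following can be used. The names raw_size, shared_size, and print_sizes are illustrative and not part of the article's library, and the exact numbers depend on the platform and the standard library implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <memory>

// Size of a plain pointer to int: one machine word.
constexpr std::size_t raw_size = sizeof(int*);
// Size of the standard reference-counted handle. Typical implementations
// store an object pointer plus a control-block pointer.
constexpr std::size_t shared_size = sizeof(std::shared_ptr<int>);

// Print both sizes so they can be compared with the figures above.
void print_sizes()
{
    assert(shared_size >= raw_size); // the handle holds at least one pointer
    std::printf("sizeof(int*)                 = %u\n",
                static_cast<unsigned>(raw_size));
    std::printf("sizeof(std::shared_ptr<int>) = %u\n",
                static_cast<unsigned>(shared_size));
}
```

Calling print_sizes() from main shows how much room the second pointer takes on the platform at hand.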
Background
In C++, converting a pointer between a derived class and its base class may change the pointer value. Try the following simple code:
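#include <cstdint>
#include <cstdio>

struct base
{
};

struct der : base
{
    virtual void fff()
    {
    }
};

int main()
{
    der obj;
    base* p = &obj; // implicit derived-to-base conversion
    if (p == &obj)
    {
        // &obj is converted to base* before the comparison
        printf("pointer is equal\n");
    }
    // compare the raw address values instead
    if (reinterpret_cast<std::uintptr_t>(p) != reinterpret_cast<std::uintptr_t>(&obj))
    {
        printf("value is not equal\n");
    }
}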
On Win32, the program prints both "pointer is equal" and "value is not equal".
That is why sizeof(std::shared_ptr<>) == 8 on Win32. Because inheritance can change the pointer value, the smart pointer cannot derive the location of the reference counter from the object pointer alone, so it has to store an additional pointer.
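Whether the base/derived example above shows a difference depends on the compiler's object layout. As a more portable illustration of the same effect, here is a sketch that uses multiple inheritance instead. The types First, Second, and Derived and the function second_offset are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

struct First  { int a; };
struct Second { int b; };
// Derived contains a First subobject and a Second subobject.
struct Derived : First, Second { };

// Byte distance between the Second subobject and the start of the object.
std::ptrdiff_t second_offset(Derived& d)
{
    Second* s = &d; // the implicit conversion adjusts the address
    // Pointer comparison still succeeds: &d is converted to Second* first.
    assert(s == &d);
    return reinterpret_cast<char*>(s) - reinterpret_cast<char*>(&d);
}

// Print the adjustment applied by the derived-to-base conversion.
void show_offset()
{
    Derived d;
    std::printf("Second subobject offset = %ld bytes\n",
                static_cast<long>(second_offset(d)));
}
```

A non-zero offset means that a smart pointer holding a Second* could not locate a counter stored next to the Derived object without extra information.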
The Solution
There is one exception. C++ built-in data types cannot take part in inheritance, so a pointer to one never needs adjusting. This limitation gives us an opportunity to minimize the smart pointer.
Memory layout for a single built-in object:
ref-counter | padding to ensure the built-in object's alignment | built-in object |
Memory layout for an array of built-in objects:
ref-counter | padding to ensure the built-in objects' alignment | built-in object1 | built-in object2 | ... |
We use an overloaded operator new to embed the reference counter in the same heap block as the object during allocation. Because the reference counter and the object(s) share one allocation, this saves both time and memory. boost::make_shared applies the same idea.
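The layout described above can be sketched with plain malloc and free. This is not the article's implementation: the names kHeader, counted_alloc, counter_of, add_ref, and release are hypothetical, the counter is not thread-safe, and the header is simply padded to alignof(std::max_align_t) rather than to a per-type alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <type_traits>

// Assumption: the header is padded to the strictest fundamental alignment,
// so any built-in element type stays correctly aligned after it.
constexpr std::size_t kHeader = alignof(std::max_align_t);
static_assert(kHeader >= sizeof(long), "header must be able to hold the counter");

// Allocate one block laid out as [ counter | padding | count elements of T ].
template <typename T>
T* counted_alloc(std::size_t count)
{
    static_assert(std::is_arithmetic<T>::value, "built-in types only");
    char* block = static_cast<char*>(std::malloc(kHeader + count * sizeof(T)));
    if (!block) return nullptr;
    ::new (static_cast<void*>(block)) long(1);    // reference count starts at 1
    T* data = reinterpret_cast<T*>(block + kHeader);
    for (std::size_t i = 0; i < count; ++i)
        ::new (static_cast<void*>(data + i)) T(); // value-initialize to zero
    return data;
}

// Recover the counter from the element pointer alone:
// it always sits kHeader bytes before the first element.
template <typename T>
long& counter_of(T* data)
{
    assert(data != nullptr);
    return *reinterpret_cast<long*>(reinterpret_cast<char*>(data) - kHeader);
}

template <typename T>
void add_ref(T* data) { ++counter_of(data); }

// Drop one reference. Returns true when the whole block was freed.
template <typename T>
bool release(T* data)
{
    if (--counter_of(data) != 0) return false;
    std::free(reinterpret_cast<char*>(data) - kHeader);
    return true;
}
```

Because the counter's position is a fixed offset from the element pointer, a handle built on these functions only needs to store the element pointer itself.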
Some key code:
template<typename T, size_t align>
struct NativeCore
{
    // The constructor parameter is named a so that it does not redeclare the template parameter align.
    NativeCore(T* p, size_t a): m_ptr(p), m_align(a) { }
    T* m_ptr;       // first element, located after the counter (and any padding)
    size_t m_align; // alignment requested for the elements
};

template<typename T, void*(*allocator)(size_t), void(*cleaner)(void*), size_t align>
inline NativeCore<T, align> NativeAllocMemOnly(size_t total)
{
    size_t remain = sizeof(long)%align;
    if(0 == remain)
    {
        // sizeof(long) is a multiple of align: the elements follow the counter directly
        long* p = reinterpret_cast<long*>(allocator(sizeof(T)*total + sizeof(long)));
        *p = 1; // initial reference count
        ::new(cleaner, p+1) T[total]; // placement form of the overloaded operator new (defined elsewhere in the library)
        return NativeCore<T, align>(reinterpret_cast<T*>(p+1), align);
    }
    else
    {
        // pad after the counter so that the elements start at a multiple of align
        char* p = reinterpret_cast<char*>(allocator(sizeof(T)*total + sizeof(long) + align - remain));
        *reinterpret_cast<long*>(p) = 1; // initial reference count
        ::new(cleaner, p+sizeof(long) + align - remain) T[total];
        return NativeCore<T, align>(reinterpret_cast<T*>(p+sizeof(long) + align - remain), align);
    }
}

template<typename T, size_t alignment>
inline NativeCore<T, alignment> NativeAlloc(size_t total)
{
    StaticAssert<!IsClassType<T>::m_value>(); // built-in types only
    return NativeAllocMemOnly<T, operator new, operator delete, alignment>(total);
}
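The padding arithmetic in NativeAllocMemOnly can be checked in isolation. The sketch below restates the same rule as a constexpr function; payload_offset and check_offsets are illustrative names, not part of the article's code. Note that the elements end up aligned only if the allocator itself returns a block aligned to align.

```cpp
#include <cassert>
#include <cstddef>

// Offset from the start of the block to the first element, following the
// same rule as NativeAllocMemOnly: the counter occupies sizeof(long) bytes,
// then padding is added until the offset is a multiple of align.
constexpr std::size_t payload_offset(std::size_t align)
{
    return (sizeof(long) % align == 0)
        ? sizeof(long)
        : sizeof(long) + align - sizeof(long) % align;
}

// With align == 1 no padding is needed.
static_assert(payload_offset(1) == sizeof(long), "counter only");
// With align == 64 the counter plus padding fill exactly one 64-byte slot.
static_assert(payload_offset(64) == 64, "counter plus padding");

// Runtime check over a range of power-of-two alignments.
void check_offsets()
{
    for (std::size_t align = 1; align <= 4096; align *= 2) {
        assert(payload_offset(align) % align == 0);
        assert(payload_offset(align) >= sizeof(long));
    }
}
```

Since the offset depends only on the compile-time alignment, the smart pointer can subtract it from the element pointer to reach the counter without storing anything extra.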
Test code
Reference-counted management for built-in types: explicit free or delete calls are no longer needed.
#include "stdafx.h"

int main()
{
    using namespace std;
    // report memory leaks at exit (MSVC debug CRT)
    _CrtSetDbgFlag(_CRTDBG_ALLOC_MEM_DF|_CRTDBG_LEAK_CHECK_DF);

    RefNative<int>::Normal n = NativeAlloc<int>(2);
    n[0] = 1;
    n[1] = 2;
    int* pn = n;
    printf("Current ref-count before push_back = %d\n", n.GetRefCount());

    std::vector<RefNative<int>::Normal> vec;
    vec.push_back(n); // the copy stored in the vector shares the same counter
    printf("Current ref-count after push_back = %d\n", n.GetRefCount());
    printf("sizeof(RefNative<int>::Normal) = %u\n", (unsigned)sizeof(RefNative<int>::Normal));

    RefNative<double, 64>::Aligned na = NativeAllocAligned<double, 64>(1);
    double* pna = na;
    pna[0] = 3.5;
    printf("sizeof(RefNative<int>::Aligned) = %u\n", (unsigned)sizeof(RefNative<int>::Aligned));
    if(reinterpret_cast<uintptr_t>(pna)%64 == 0)
    {
        puts("it is properly 64 bytes aligned");
    }
    return 0;
}
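The RefNative class itself is not listed in this section. To show why the handle can stay pointer-sized and why the count rises after push_back, here is a self-contained sketch of a comparable handle for int arrays. TinyIntArray, kHeaderBytes, and count_after_push_back are hypothetical names, the counter is not thread-safe, and the real RefNative may work differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <utility>
#include <vector>

// Assumption: counter slot padded to the strictest fundamental alignment.
constexpr std::size_t kHeaderBytes = alignof(std::max_align_t);
static_assert(kHeaderBytes >= sizeof(long), "header must hold the counter");

// Hypothetical pointer-sized handle; the counter lives in front of the data.
class TinyIntArray
{
    int* m_data; // the only data member

    long& count() const
    {
        return *reinterpret_cast<long*>(
            reinterpret_cast<char*>(m_data) - kHeaderBytes);
    }

public:
    explicit TinyIntArray(std::size_t n)
    {
        char* block = static_cast<char*>(
            std::malloc(kHeaderBytes + n * sizeof(int)));
        if (!block) throw std::bad_alloc();
        ::new (static_cast<void*>(block)) long(1); // first owner
        m_data = reinterpret_cast<int*>(block + kHeaderBytes);
        for (std::size_t i = 0; i < n; ++i)
            ::new (static_cast<void*>(m_data + i)) int(0);
    }
    // Copying shares the block and bumps the embedded counter.
    TinyIntArray(const TinyIntArray& o) : m_data(o.m_data) { ++count(); }
    // Copy-and-swap: the by-value parameter releases the old block on exit.
    TinyIntArray& operator=(TinyIntArray o)
    {
        std::swap(m_data, o.m_data);
        return *this;
    }
    // The last owner frees counter and elements with a single call.
    ~TinyIntArray()
    {
        if (--count() == 0)
            std::free(reinterpret_cast<char*>(m_data) - kHeaderBytes);
    }
    int& operator[](std::size_t i) const { return m_data[i]; }
    long GetRefCount() const { return count(); }
};

static_assert(sizeof(TinyIntArray) == sizeof(int*), "handle is one pointer wide");

// Mirrors the test above: copying into a vector bumps the shared counter.
long count_after_push_back()
{
    TinyIntArray a(2);
    assert(a.GetRefCount() == 1);
    std::vector<TinyIntArray> vec;
    vec.push_back(a);
    return a.GetRefCount();
}
```

The static_assert documents the design goal: because the counter is reachable at a fixed offset from the element pointer, the handle needs no second member.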
Points of Interest
If arrays of objects (not just built-in types) and the ref-counter can share the same heap block, we can produce a smart pointer that is more economical than shared_ptr<vector> or boost::shared_array.
We will do the corresponding test in Part 3.
History
- 2006: Created
- 3rd June, 2012: Article posted