Introduction
Not long ago, I posted an article describing a technique for optimizing frequent dynamic allocations of a constant size.
A few days ago, I was talking to a friend of mine, and this subject happened to come up. He told me: "In such cases, instead of deleting the unneeded blocks, I just store them in a linked list and then reuse them."
Impressive, isn't it? This method is even simpler than the one I invented, and its advantage is that the caching pool can grow dynamically without special effort or performance penalties.
Overhead: In theory, we need no overhead at all: while an allocated block is in use, we hold no references to it, and once it is unneeded, we simply reuse its body as the linked-list node (assuming the minimal allocation size is at least the size of a pointer, 4 bytes on Win32). On the other hand, all blocks are allocated from a heap, which probably adds a header of its own; this can be significant when the allocation size is small. In such a case, it would be a good idea to use my allocator (the one described in the earlier article) instead of the heap.
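To make the zero-overhead point concrete, here is a minimal sketch of the trick (the names are illustrative only, not part of the class shown later): the first pointer-sized bytes of a freed block are reinterpreted as the link field of a list node.

struct FreeNode { FreeNode* pNext; };   // overlays the freed block's body

FreeNode* g_pFreeHead = 0;              // head of the free list, initially empty

void CacheBlock(void* pBlock)           // instead of delete: push onto the list
{
    FreeNode* pNode = (FreeNode*) pBlock;
    pNode->pNext = g_pFreeHead;
    g_pFreeHead = pNode;
}

void* ReuseBlock()                      // instead of new: pop a cached block
{
    FreeNode* pNode = g_pFreeHead;
    if (pNode)
        g_pFreeHead = pNode->pNext;
    return pNode;                       // 0 means the cache is empty
}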
Implementation
Now, let's take a closer look. We need a singly linked list (whether it acts as a stack or a queue doesn't matter) which is initially empty. At runtime, we allocate blocks in the regular way. When they are no longer needed, we push them onto our linked list instead of deleting them. Upon the next allocation request, we simply pop a block from the list.
OK, let's focus on some pitfalls:
- Suppose there's a spike at an early stage of the run. In other words, you suddenly need a great many blocks, while the caching list hasn't collected many yet. You'll have to allocate them rapidly, and that is exactly when it is most critical for your application to respond quickly to such a load! So, our first conclusion is that some number of blocks should be PRE-cached at initialization.
- After a huge spike subsides, our list will probably contain a lot of blocks, whereas most of the time we'll need only a small fraction of them. So, our second conclusion is that once our caching list grows large enough, we delete blocks in the regular way instead of collecting them.
Well, not too complex at all. Below is one possible implementation of this idea. It caches allocations up to some maximum count, which never changes; however, a clever application could collect statistics and adjust this limit dynamically (see the sketch after the class). Well, that's up to you. Good luck!
P.S.: Sorry if there are bugs; I haven't tested it.
#include <windows.h>    // ULONG, BYTE, PBYTE, PVOID, ULONG_PTR, NULL
#include <new>          // std::nothrow

#ifndef ASSERT          // use the MFC/ATL ASSERT if present, else the CRT assert
#include <assert.h>
#define ASSERT assert
#endif

class CAllocaterEx {
    ULONG m_nAllocationSize;    // size of every block, in bytes
    ULONG m_nPoolMax;           // cache cap: freed blocks beyond it are deleted
    ULONG m_nCount;             // blocks currently sitting in the cache

    // A freed block's own body serves as the list node, so caching
    // costs no per-block memory overhead.
    struct CNode {
        CNode* m_pNext;
    };
    CNode* m_pHead;             // head of the singly linked free list

    CNode* Pop()
    {
        ASSERT(m_pHead && m_nCount);
        CNode* pHead = m_pHead;
        m_pHead = pHead->m_pNext;
        m_nCount--;
        return pHead;
    }
    void Push(CNode* pNode)
    {
        ASSERT(pNode);
        pNode->m_pNext = m_pHead;
        m_pHead = pNode;
        m_nCount++;
    }

public:
    CAllocaterEx(ULONG nAllocationSize = 0) :
        m_nAllocationSize(nAllocationSize),
        m_nPoolMax(0),
        m_nCount(0),
        m_pHead(NULL)
    {
    }
    ~CAllocaterEx()
    {
        Clear();
    }
    bool Init(ULONG nPoolSize)
    {
        Clear();
        return EnsurePoolMin(m_nPoolMax = nPoolSize);
    }
    bool Init(ULONG nPoolSize, ULONG nAllocationSize)
    {
        m_nAllocationSize = nAllocationSize;
        return Init(nPoolSize);
    }
    // Grow the cache to at least nPoolSize blocks (pre-caching).
    bool EnsurePoolMin(ULONG nPoolSize)
    {
        // Every block must be able to hold a CNode once it is freed;
        // enforce this here so Alloc() is safe even before Init().
        if (m_nAllocationSize < sizeof(ULONG_PTR))
            m_nAllocationSize = sizeof(ULONG_PTR);
        while (m_nCount < nPoolSize)
        {
            // Plain new[] throws on failure; the non-throwing form makes
            // the NULL check below meaningful.
            CNode* pNode = (CNode*) new (std::nothrow) BYTE[m_nAllocationSize];
            if (!pNode)
                return false;
            Push(pNode);
        }
        return true;
    }
    // Shrink the cache to at most nPoolSize blocks.
    void EnsurePoolMax(ULONG nPoolSize)
    {
        while (m_nCount > nPoolSize)
            delete[] (PBYTE) Pop();
    }
    void Clear()
    {
        EnsurePoolMax(0);
    }
    PVOID Alloc()
    {
        // Serve from the cache; hit the heap only when it is empty.
        return EnsurePoolMin(1) ? (PVOID) Pop() : NULL;
    }
    void Free(PVOID pPtr)
    {
        if (!pPtr)      // tolerate NULL, like free() and delete do
            return;
        Push((CNode*) pPtr);
        EnsurePoolMax(m_nPoolMax);  // over the cap? delete instead of caching
    }
};
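For completeness, here is a minimal usage sketch; the block size and pool counts are made up for illustration:

CAllocaterEx pool;
if (pool.Init(32, 64))               // pre-cache 32 blocks of 64 bytes each
{
    PVOID pBlock = pool.Alloc();     // served from the cache, no heap call
    // ... use the block ...
    pool.Free(pBlock);               // back onto the list (or deleted if the cache is full)

    pool.EnsurePoolMin(128);         // expecting a spike: pre-cache more blocks
    // ... the spike comes and goes ...
    pool.EnsurePoolMax(32);          // spike is over: trim the cache back down
}

As for adjusting the limit dynamically, one hypothetical approach (purely a sketch of my own; CStatsPool and its members are not part of the class above) is to track the peak number of blocks in use and periodically trim the cache to that peak:

class CStatsPool {
    CAllocaterEx m_Pool;
    ULONG m_nInUse;     // blocks currently handed out to the application
    ULONG m_nPeak;      // high-water mark since the last Trim()
public:
    CStatsPool() : m_nInUse(0), m_nPeak(0) {}
    bool Init(ULONG nPoolSize, ULONG nAllocationSize)
    {
        return m_Pool.Init(nPoolSize, nAllocationSize);
    }
    PVOID Alloc()
    {
        PVOID pPtr = m_Pool.Alloc();
        if (pPtr && ++m_nInUse > m_nPeak)
            m_nPeak = m_nInUse;     // record the new peak usage
        return pPtr;
    }
    void Free(PVOID pPtr)
    {
        if (!pPtr)
            return;
        m_Pool.Free(pPtr);
        m_nInUse--;
    }
    void Trim()         // call periodically, e.g. from an idle handler
    {
        m_Pool.EnsurePoolMax(m_nPeak);  // cap the cache at the recent peak
        m_nPeak = m_nInUse;             // start a new measurement window
    }
};

Note that the inner pool's own cap (set by Init) still applies on every Free; Trim only shrinks the cache further when recent usage has been low.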