Introduction
Anyone who works with applications that allocate and free large amounts of dynamic memory during their run knows that this can decrease performance, because these operations need to be guarded by some kind of global lock, so different modules can affect one another's performance. Furthermore, over time the memory becomes fragmented by allocations of all sizes, so if an application runs as a resident process on a machine, its performance deteriorates over time.
In this article I will try to give a partial solution to this problem. Why partial? Because I am not trying to provide a better implementation of the memory manager above the OS level. What I can optimize are modules that allocate and free many buffers of the same size, which in my experience is the most common case.
The solution I'm suggesting is a buffer pool that allocates and frees buffers of a predefined size. The complexity of the allocation and free operations is O(1), and each buffer pool object is self-synchronized, so it won't affect other modules using another buffer pool. Furthermore, the dynamic allocations of the buffer pool are done in segments of memory the size of N buffers, so memory fragmentation over time is reduced significantly. Moreover, because each segment is allocated as one block of memory, paging is reduced as well.
Using the code
First declare a CBufferPool object:
CBufferPool BuffPoolObj;
Next create the buffer pool by calling the Create method:
BuffPoolObj.Create(nBufferSize,
                   nBuffersPerSegment,
                   nInitialNumberOfSegments,
                   nMinNumberOfSegments,
                   nMaxNumberOfSegments,
                   nFreeBuffersPercentForSegmentDeletion);
The last parameter, nFreeBuffersPercentForSegmentDeletion, is not so intuitive, so I'll try to explain its purpose. In this implementation, buffers are handed out from pre-allocated segments. There is a 'gray zone': if a buffer is the first buffer allocated in a segment, a new segment is allocated for it, and if the very next operation frees that same buffer, the whole segment would be freed again, and so on. This would make performance worse than allocating the buffers in the traditional way. To prevent this, a segment is freed only once the pool holds a safe surplus of free buffers, enough to make up whole segments with room to spare. That safety margin is determined by this parameter.
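To make the threshold concrete, here is a minimal sketch of how the decision could be evaluated when a segment becomes completely free. This is my own illustration, assuming the parameter is interpreted as a percentage of one segment's capacity, and not the article's actual code:

// Hypothetical illustration of the deletion threshold, assuming the
// parameter is a percentage of one segment's capacity.
bool ShouldDeleteSegment(size_t nTotalFreeBuffers,
                         size_t nBuffersPerSegment,
                         size_t nFreeBuffersPercentForSegmentDeletion)
{
    // e.g. with nBuffersPerSegment = 256 and a threshold of 150 (percent),
    // a fully free segment is released only once more than 384 buffers
    // are free in the whole pool.
    return nTotalFreeBuffers * 100 >
           nBuffersPerSegment * nFreeBuffersPercentForSegmentDeletion;
}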
Then you can call the Allocate method to get the required buffer, and the Free method to return it to the pool:
void* pBuffer = BuffPoolObj.Allocate();
BuffPoolObj.Free(pBuffer);
The CBufferPool has a synchronization scheme, so these methods can be called concurrently from different threads. To destroy the buffer pool object, call the Destroy method. It frees all segments; in debug builds it also verifies that all buffers were freed.
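Putting it together, here is a minimal end-to-end sketch. The numeric values are arbitrary examples of mine, not recommendations from the article:

CBufferPool BuffPoolObj;

// Create a pool of 512-byte buffers, 256 buffers per segment,
// starting with 4 segments, keeping at least 1 and at most 64,
// and freeing a segment only once free buffers exceed 150% of a segment.
BuffPoolObj.Create(512, 256, 4, 1, 64, 150);

void* pBuffer = BuffPoolObj.Allocate();   // O(1) allocation from the pool
// ... use the 512-byte buffer ...
BuffPoolObj.Free(pBuffer);                // O(1) return to the pool

BuffPoolObj.Destroy();                    // frees all segments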
Working with the demo
The demo project demonstrates how the buffer pool can improve the performance of a program compared to the plain new and delete operators. It can also be used to check whether your own system can be improved, and to estimate the improvement that a certain configuration can achieve.
There are three tests you can perform with the demo project: a buffer pool test and a global allocation (new and delete) test, which together show the improvement the buffer pool achieves over global allocation, and an interaction test that illustrates the severe performance penalty global allocation suffers from, in comparison to the buffer pool.
The parameters in the demo are divided into three sections: session parameters that apply to both the buffer pool and the global allocation tests, and one section for each of the two. For the parameters that are not self-explanatory, the code will help: the buffer pool code is well documented, and I tried to keep the demo application code simple.
For those who compile the demo in debug mode, note that in debug builds the buffer pool performs extra checks to ensure that freed buffers or segments are not used again and that buffers being allocated are not already in use. These checks (mostly memsets that trash the memory to make it unusable) degrade performance beyond any meaningful measurement, so run the tests in release.
Implementation
I will explain the implementation only in general terms. The buffer pool is built from a doubly linked list of segments, and each segment has a singly linked list of free buffers; the links between free buffers are stored in the buffers themselves. Apart from its required memory, each buffer has a header that keeps its index in the segment, so the free operation is direct, with no searching, and its complexity is O(1). All the memory for a segment and its buffers is allocated as one block. The segment list is reordered during the run: a fully used segment is moved to the end of the list, and a segment that has one or more of its buffers freed is moved to the head. The allocation operation just pops one free buffer off the list. When a buffer is freed, a calculation is made to extract its segment. A new segment is allocated whenever there is no free buffer left, and a segment is freed when the pool holds more than nFreeBuffersPercentForSegmentDeletion multiplied by nBuffersPerSegment free buffers altogether. That is the basic idea; for the specifics, see the code, which is thoroughly documented.
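To make the description more tangible, here is one plausible layout matching it. The struct and member names are my own illustration, not the article's declarations:

// Each buffer carries a small header in front of its usable memory,
// so Free() can find its place without searching.
struct SBufferHeader
{
    size_t nIndexInSegment;    // position of this buffer inside its segment
    SBufferHeader* pNextFree;  // link of the segment's free-buffer list
    // ... nBufferSize bytes of user data follow this header
};

// Each segment is allocated as one block: this header followed by
// nBuffersPerSegment (header + buffer) slots.
struct SSegment
{
    SSegment* pPrev;           // doubly linked list of segments
    SSegment* pNext;
    size_t    nFreeCount;      // free buffers remaining in this segment
    SBufferHeader* pFirstFree; // head of this segment's free-buffer list
};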
Object pool
The object pool is a pool of objects, implemented as a direct descendant of the buffer pool. All that is needed in order to use it is to declare an object pool with the desired type:
CObjectPool<SomeType> SomeTypeObjectPool;
Next create the object pool by calling the Create method:
SomeTypeObjectPool.Create(nObjectsPerSegment,
                          nInitialNumberOfSegments,
                          nMinNumberOfSegments,
                          nMaxNumberOfSegments,
                          nFreeObjectsPercentForSegmentDeletion);
And the use of the pool is as follows:
SomeType* pObject = SomeTypeObjectPool.Allocate();
SomeTypeObjectPool.Free(pObject);
The object pool implementation has very little code. It uses placement new and delete (as the documentation calls this technique): the placement forms receive a pointer to a buffer pool, which is initialized in the Create method with the object size as the buffer size. I think the implementation is interesting precisely because of this rarely used placement new/delete technique.
A small example of the object pool is shown in the OnInitDialog function of CDemoDlg.
You can use CBufferPool / CObjectPool on any platform. All you need to do is replace the critical section object with the target platform's mutual exclusion lock.
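For example, a minimal replacement could wrap std::mutex; the class and method names below are hypothetical, chosen to mirror a typical critical section wrapper:

#include <mutex>

// Hypothetical portable lock with the same Lock/Unlock shape as a
// Win32 critical section wrapper, backed by std::mutex.
class CPortableLock
{
public:
    void Lock()   { m_Mutex.lock(); }
    void Unlock() { m_Mutex.unlock(); }

private:
    std::mutex m_Mutex;
};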
Thanks to George Anescu for the CPreciseTimer class used in the demo project.