When calling into the Win32 API, you often have to deal with structures that have a variable size. This can require a lot of manual programming and memory management. Using C++ and template programming, it is possible to make this process much less painful and safer.
Introduction
When dealing with the Win32 API, there are many cases where the interface uses a structure of variable size. In general, there are two reasons for this.
- There is an array in the struct, and its length is variable.
- There are other pointer members in the struct, and because of API restrictions, whatever they are pointing to needs to be contained in a contiguous block of memory together with the struct itself. There are two reasons for that:
  - If all the data is managed in one single block, whoever is managing the memory only has to free one single block, instead of having to keep track of multiple buffers. This is especially significant in cases of output structs where the API allocates all the memory and the caller is responsible for freeing it, or input structs where the API tells the caller in advance how much memory they need in total for returning everything.
  - If the callee needs to pass that data to another actor, it can memcpy the entire block, and at the same time, do a full sanity check on the data to verify that all the pointers point to locations inside the block, and all internal buffer sizes are guaranteed to stay within the block boundaries.
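To make the first case concrete, here is a minimal sketch of how the total size of such a structure is typically computed. Item and ItemList are hypothetical stand-ins that only mimic the shape of a struct like TOKEN_PRIVILEGES (a count followed by an array declared with one element); they are not Win32 types.

```cpp
#include <cassert>
#include <cstddef>   // offsetof, size_t
#include <limits>

// Hypothetical element type, standing in for something like LUID_AND_ATTRIBUTES.
struct Item {
    unsigned long id;
    unsigned long attributes;
};

// Hypothetical variable-size struct: the array is declared with one element,
// but the block that holds it is allocated large enough for 'count' elements.
struct ItemList {
    unsigned long count;
    Item items[1];
};

// Number of bytes needed for an ItemList that holds 'count' items.
inline std::size_t BytesNeeded(std::size_t count) {
    const std::size_t header = offsetof(ItemList, items);
    // Guard against overflow in the multiplication below.
    assert(count <= (std::numeric_limits<std::size_t>::max() - header) / sizeof(Item));
    return header + count * sizeof(Item);
}
```

Using offsetof rather than sizeof(ItemList) avoids counting the one declared array element twice.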
In itself, this is a sensible approach. The problem is that various steps along the way can return errors. Combined with manual memory management, it becomes tedious and error-prone to make sure that there are no memory leaks.
Additionally, in most of my code, I use std::string or std::wstring for my string handling. This means that all my code needs to be exception safe, which precludes manually managing memory or resource handles. Exceptions are like glitter (parents of little kids will understand): if even the smallest part of your code uses exceptions, every bit of code in your program needs to deal with them.
Based on this information, I came up with the following requirements:
- A convenient way to manage the memory for variable-sized structures
- A convenient way to access the struct members
- Exception safety
Background
The reason for putting in the time to design something solid and convenient for dealing with variable-size structures is that much of the API for the security subsystem uses them. After the third time of manually writing the code to deal with a specific structure, I decided that there had to be a better way.
In this case, I was using the GetTokenInformation API, which can return a great number of different structure types through a void* parameter.
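APIs of this kind are usually called twice: once to learn the required size, and once more with a buffer of that size. The following is only a portable sketch of that pattern. QueryInfo is a hypothetical stand-in for an API such as GetTokenInformation, and std::vector is used purely to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the real API: it always reports the required
// size through 'needed', and copies the data only if the buffer is large
// enough. Returns false if the buffer is missing or too small.
inline bool QueryInfo(void* buffer, std::size_t bufferSize, std::size_t* needed) {
    static const char payload[] = "variable-size data";
    *needed = sizeof(payload);
    if (buffer == nullptr || bufferSize < sizeof(payload))
        return false;
    std::memcpy(buffer, payload, sizeof(payload));
    return true;
}

// The two-call pattern: ask for the size first, then call again with a
// buffer of exactly that size.
inline std::vector<unsigned char> QueryAll() {
    std::size_t needed = 0;
    bool ok = QueryInfo(nullptr, 0, &needed);   // expected to fail, fills 'needed'
    assert(!ok && needed > 0);
    std::vector<unsigned char> buffer(needed);
    ok = QueryInfo(buffer.data(), buffer.size(), &needed);
    assert(ok);
    return buffer;
}
```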
Designing a Buffer
The very base of my approach is the CBuffer class. I looked at the standard containers as base classes, but none of them fit. Only array and vector come close, but array is fixed size, and vector allows things like removing elements. While the standard containers are awesome at what they do, they also do things that aren't appropriate here, so I am designing a lightweight buffer from scratch.
The main requirements we have are:
- dynamically resize upon request
- be exception safe
- be able to act both as a C style array, and a general purpose memory buffer
- support both 'length' as number of items, and 'size' as number of bytes
The declaration is simple:
template<typename T>
class CBuffer
{
protected:
    T* m_buffer = NULL;
    size_t m_size = 0;      // size in bytes
    virtual void Allocate(size_t size);
    virtual void DeAllocate();
public:
    CBuffer();
    CBuffer(size_t size);
    CBuffer(CBuffer&& rval);
    CBuffer(CBuffer& other);
    virtual ~CBuffer();
    virtual void Resize(size_t newSize);
    size_t Size();          // number of bytes
    size_t Length();        // number of T items
    T& operator [](size_t index);
    operator T* ();
    operator void* ();
};
The internal variables are protected in order to give derived classes full access to implement additional memory management methods. The memory management methods themselves are virtual and protected for the same reason. They're sensible defaults for most common scenarios.
There are the usual constructors and a virtual destructor. The Resize method makes it possible to resize the buffer as needed. We can return the buffer size as Length (treating the buffer as an array of T) or Size (treating the buffer as a block of x bytes).
And finally, we can index into the buffer like we would with a normal array, as well as convert it implicitly to a T* or a void*.
Memory Management
virtual void Allocate(size_t size) {
    DeAllocate();
    if (size == 0)
        return;
    m_buffer = static_cast<T*>
        (HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, size));
    if (!m_buffer) {
        throw bad_alloc();
    }
    m_size = size;
}
virtual void DeAllocate() {
    if (m_buffer)
        HeapFree(GetProcessHeap(), 0, m_buffer);
    m_buffer = NULL;
    m_size = 0;
}
void Resize(size_t newSize) {
    if (newSize == 0) {
        DeAllocate();
        return;
    }
    if (!m_buffer) {
        Allocate(newSize);
    }
    else {
        T* newBuffer = static_cast<T*> (
            HeapReAlloc(GetProcessHeap(),
                HEAP_ZERO_MEMORY, m_buffer, newSize));
        if (!newBuffer) {
            throw bad_alloc();
        }
        m_buffer = newBuffer;
        m_size = newSize;
    }
}
Allocate creates a new block of size bytes on the process heap and zeroes it. DeAllocate frees that memory again. The behaviour of the Resize method depends on the state of the buffer. If no buffer currently exists, one is allocated using the Allocate function. If one exists already, it is reallocated using HeapReAlloc, which preserves the existing contents and can avoid a copy when the block can be resized in place. The result is stored in a temporary first, so m_buffer stays valid if the reallocation fails. Thanks @Randor for suggesting this.
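For readers who want to experiment with the resize logic outside of Windows, the sketch below approximates it with realloc. This is an assumption-laden stand-in, not the article's code: ResizeBlock is a hypothetical helper, and the zeroing that HEAP_ZERO_MEMORY performs for newly added bytes is done by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Portable approximation of the Resize logic, with realloc in place of
// HeapReAlloc. Existing bytes are preserved; bytes added by growing the
// block are zeroed manually.
inline void* ResizeBlock(void* block, std::size_t oldSize, std::size_t newSize) {
    assert(newSize > 0);                  // a size of 0 is handled by freeing instead
    void* resized = std::realloc(block, newSize);
    if (!resized)
        throw std::bad_alloc();           // 'block' is still valid at this point
    if (newSize > oldSize)
        std::memset(static_cast<unsigned char*>(resized) + oldSize, 0,
                    newSize - oldSize);
    return resized;
}
```

As in the Resize method, the old pointer is only replaced once the reallocation has succeeded, so a failure does not lose the original block.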
The other members are pretty self-explanatory:
size_t Size() {
    return m_size;
}
size_t Length() {
    return m_size / sizeof(T);
}
T& operator [](size_t index) {
    if (index < m_size / sizeof(T))
        return m_buffer[index];
    else {
        throw range_error("index out of bounds");
    }
}
operator T* () {
    return static_cast<T*>(m_buffer);
}
operator void* () {
    return static_cast<void*>(m_buffer);
}
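One detail that is easy to miss is that Length uses integer division, so a trailing partial element is not counted. The sketch below isolates just that arithmetic. SizeView is a hypothetical stand-in for the two accessors, not part of CBuffer.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in for the two size queries of CBuffer<T>; 'bytes' plays the role of m_size.
template<typename T>
struct SizeView {
    std::size_t bytes;
    constexpr std::size_t Size() const { return bytes; }
    constexpr std::size_t Length() const { return bytes / sizeof(T); }
};

// 10 bytes hold two whole 32-bit values; the remaining 2 bytes are not counted.
static_assert(SizeView<std::uint32_t>{10}.Length() == 2, "Length rounds down");
static_assert(SizeView<std::uint32_t>{10}.Size() == 10, "Size is in bytes");
```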
The final thing to note is the destructor:
virtual ~CBuffer() {
    DeAllocate();
}
The DeAllocate call always happens, so our buffer class is exception safe. When the CBuffer object goes out of scope, including during stack unwinding after an exception, any allocated memory is freed.
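The effect is easy to verify with a small experiment. The following portable sketch uses a hypothetical ScopedBlock class with malloc and free instead of the heap functions, plus a counter that exists only for the demonstration, to show that the destructor runs even when an exception leaves the function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

// Counts how many blocks are currently allocated (demonstration only).
inline int& LiveBlocks() { static int count = 0; return count; }

// Minimal owner of one heap block; the destructor frees it.
class ScopedBlock {
public:
    explicit ScopedBlock(std::size_t size) : m_ptr(std::malloc(size)) {
        if (!m_ptr) throw std::bad_alloc();
        ++LiveBlocks();
    }
    ~ScopedBlock() {
        assert(LiveBlocks() > 0);
        std::free(m_ptr);
        --LiveBlocks();
    }
    ScopedBlock(const ScopedBlock&) = delete;
    ScopedBlock& operator=(const ScopedBlock&) = delete;
private:
    void* m_ptr;
};

// Throws after allocating; the destructor still runs during stack unwinding.
inline void AllocateThenFail() {
    ScopedBlock block(128);
    throw std::runtime_error("simulated API failure");
}

// Returns true if no block is left allocated after the exception was caught.
inline bool NoLeakAfterThrow() {
    try { AllocateThenFail(); }
    catch (const std::runtime_error&) { }
    return LiveBlocks() == 0;
}
```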
Making a Variable Sized Structure
Now that we have a buffer to work with, we can use it as a base for our variable size structure. The structure itself is very simple because it doesn't need to do a whole lot.
template<typename T>
class CVarSizeStruct : public CBuffer<BYTE>
{
public:
    // Start with room for the fixed part of T only.
    CVarSizeStruct() : CBuffer(sizeof(T)) {
    }
    // Returns a pointer 'offset' bytes into the variable part behind T.
    void* GetPtrAtOffset(size_t offset) {
        void* maxptr = m_buffer + m_size;
        void* ptrAtOffset = m_buffer + sizeof(T) + offset;
        if (ptrAtOffset >= maxptr)
            throw range_error("GetPtrAtOffset");
        return ptrAtOffset;
    }
    // Resizes the buffer to hold T plus 'oversize' extra bytes.
    void Oversize(size_t oversize) {
        Resize(sizeof(T) + oversize);
    }
    T* operator -> () {
        return reinterpret_cast<T*>(m_buffer);
    }
};
The constructor initializes a buffer that's big enough for the base definition of the structure without its variable part. The reason is that sometimes you don't need the variable part, in which case this default saves you one or two lines of code.
There are two ways of sizing it. There is the Resize method in the base class, which is used primarily when an API tells us 'when you call me next time, I need a structure with a total size of X'. And then there is the Oversize method, which is convenient if you are creating a structure to pass into an API call and you have a dynamic means of calculating how big the additional part needs to be. You could use the Resize method for that, but this way is a little bit cleaner.
The GetPtrAtOffset method performs pointer arithmetic to return a pointer to a memory location inside the variable block of the struct. If the API fills in the data during a call, we don't need it. But if we have to prepare that data for passing into an API call, something like this is needed to manually put data in specific locations.
And finally, the -> operator allows you to access the members of the T struct as if CVarSizeStruct<T> were a T*, which is precisely what we set out to do. And because we use CBuffer<BYTE> as a base, any allocated memory is guaranteed to be freed when the object goes out of scope.
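To illustrate how Oversize and GetPtrAtOffset work together when preparing input for an API, here is a simplified, portable sketch. Header, VarSizeSketch and MakeNamed are hypothetical names introduced for this example, and std::vector replaces CBuffer<BYTE> only so that the code builds without the Windows headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical fixed part of a variable-size struct: the name is stored
// directly behind the struct and is 'nameLength' bytes long.
struct Header {
    unsigned long nameLength;
};

// Simplified stand-in for CVarSizeStruct<T>.
template<typename T>
class VarSizeSketch {
public:
    VarSizeSketch() : m_bytes(sizeof(T)) {}
    void Oversize(std::size_t extra) { m_bytes.resize(sizeof(T) + extra); }
    void* GetPtrAtOffset(std::size_t offset) {
        // The offset must fall inside the variable part behind T.
        if (offset >= m_bytes.size() - sizeof(T))
            throw std::range_error("GetPtrAtOffset");
        return m_bytes.data() + sizeof(T) + offset;
    }
    T* operator->() { return reinterpret_cast<T*>(m_bytes.data()); }
    std::size_t Size() const { return m_bytes.size(); }
private:
    std::vector<unsigned char> m_bytes;   // zero-initialized, like the heap block
};

// Builds a Header followed by the characters of 'name' in one single block.
inline VarSizeSketch<Header> MakeNamed(const char* name) {
    assert(name != nullptr);
    const std::size_t length = std::strlen(name) + 1;   // include the terminator
    VarSizeSketch<Header> result;
    result.Oversize(length);
    result->nameLength = static_cast<unsigned long>(length);
    std::memcpy(result.GetPtrAtOffset(0), name, length);
    return result;
}
```

The caller only ever sees one object, and the fixed and variable parts live and die together.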
Using the Variable Size Struct
We can easily show the benefits of this approach by refactoring the code in a previous article. Let's look at the original code:
DWORD w32_GetTokenInformation(
    HANDLE hToken,
    TOKEN_INFORMATION_CLASS InfoType,
    PVOID &info,
    DWORD &bufSize)
{
    DWORD retVal = NO_ERROR;
    if (!GetTokenInformation(
        hToken, InfoType, NULL, 0, &bufSize)) {
        retVal = GetLastError();
    }
    if (retVal == ERROR_INSUFFICIENT_BUFFER) {
        retVal = NO_ERROR;
        info = HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, bufSize);
        if (!info)
            retVal = GetLastError();
    }
    if (retVal == NO_ERROR && !GetTokenInformation(
        hToken, InfoType, info, bufSize, &bufSize)) {
        retVal = GetLastError();
    }
    if (retVal != NO_ERROR) {
        if (info)
            HeapFree(GetProcessHeap(), 0, info);
        info = NULL;
    }
    return retVal;
}
ULONG w32_GetTokenPrivilege(HANDLE hToken, vector<w32_CPrivilege>& privilegeList) {
    DWORD bufLength = 0;
    DWORD retVal = NO_ERROR;
    PTOKEN_PRIVILEGES pPrivileges = NULL;
    retVal = w32_GetTokenInformation(hToken, TokenPrivileges,
        (PVOID&)pPrivileges, bufLength);
    if (retVal == NO_ERROR) {
        TCHAR privilegeName[w32_MAX_PRIVILEGENAME_LENGTH];
        DWORD bufSize = sizeof(privilegeName);
        DWORD receivedSize = 0;
        privilegeList.resize(pPrivileges->PrivilegeCount);
        for (DWORD i = 0; i < pPrivileges->PrivilegeCount; i++) {
            LUID luid = pPrivileges->Privileges[i].Luid;
            bufSize = sizeof(privilegeName);
            if (!LookupPrivilegeName(NULL, &luid, privilegeName, &bufSize)) {
                retVal = GetLastError();
                break;
            }
            privilegeList[i].Luid = luid;
            privilegeList[i].Flags = pPrivileges->Privileges[i].Attributes;
            privilegeList[i].Name = privilegeName;
        }
    }
    if(pPrivileges)
        HeapFree(GetProcessHeap(), 0, pPrivileges);
    if (retVal != NO_ERROR)
        privilegeList.clear();
    return retVal;
}
There are several issues here. The first thing that stands out is that I perform wstring operations for string manipulation, which can throw, in code that is not exception safe. Granted, the chance of getting an out-of-memory exception is vanishingly small, but it's still not proper to do this.
More annoying is that I perform memory allocation in the w32_GetTokenInformation function. Deallocation is done in the w32_GetTokenPrivilege function when things go well, but also in w32_GetTokenInformation if things go wrong. All of this is contingent on manually checking error variables and making sure there's no return without cleanup.
With the new classes, the same functionality looks like this:
void w32_GetTokenInformation(
    HANDLE hToken,
    TOKEN_INFORMATION_CLASS InfoType,
    CBuffer<BYTE>& buffer)
{
    DWORD sizeNeeded = 0;
    DWORD retVal = NO_ERROR;
    // First call: no buffer, so the API fails and reports the size it needs.
    if (!GetTokenInformation(hToken, InfoType, NULL, 0, &sizeNeeded)) {
        retVal = GetLastError();
    }
    if (retVal != ERROR_INSUFFICIENT_BUFFER) {
        throw ExWin32Error(retVal);
    }
    // Second call: the buffer now has the requested size.
    buffer.Resize(sizeNeeded);
    if (!GetTokenInformation(hToken, InfoType, buffer,
        (DWORD)buffer.Size(), &sizeNeeded)) {
        throw ExWin32Error();
    }
}
Right off the bat, you can see that the w32_GetTokenInformation function has gotten a whole lot simpler, and is guaranteed to not leak memory.
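The ExWin32Error class itself is not shown here. As a rough idea of what such a type might look like, the sketch below defines a hypothetical ErrorCodeException that carries a numeric error code. The names and layout are assumptions for illustration. On Windows, a default constructor would presumably capture GetLastError(), which is left out here to keep the sketch portable.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical sketch of an exception type that carries an API error code.
class ErrorCodeException : public std::runtime_error {
public:
    explicit ErrorCodeException(unsigned long code)
        : std::runtime_error("API call failed with error " + std::to_string(code)),
          m_code(code) {
        assert(code != 0);   // throwing for a success code would be a logic error
    }
    unsigned long Code() const { return m_code; }
private:
    unsigned long m_code;
};

// Converts a C-style "returns false on failure" result into an exception.
inline void ThrowIfFailed(bool succeeded, unsigned long errorCode) {
    if (!succeeded)
        throw ErrorCodeException(errorCode);
}
```

Turning error codes into exceptions at the API boundary is what lets the calling code drop its retVal bookkeeping.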
void w32_GetTokenPrivilege(HANDLE hToken, vector<CPrivilege>& privilegeList) {
    CVarSizeStruct<TOKEN_PRIVILEGES> privileges;
    w32_GetTokenInformation(hToken, TokenPrivileges, privileges);
    TCHAR privilegeName[MAX_PRIVILEGENAME_LENGTH];
    TCHAR privilegeDispName[MAX_PRIVILEGEDISPNAME_LENGTH];
    DWORD receivedSize = 0;
    privilegeList.resize(privileges->PrivilegeCount);
    for (DWORD i = 0; i < privileges->PrivilegeCount; i++) {
        LUID luid = privileges->Privileges[i].Luid;
        DWORD privNameBufSize = sizeof(privilegeName) / sizeof(TCHAR);
        if (!LookupPrivilegeName(NULL, &luid, privilegeName, &privNameBufSize)) {
            throw ExWin32Error();
        }
        privilegeList[i].Luid = luid;
        privilegeList[i].Flags = privileges->Privileges[i].Attributes;
        privilegeList[i].Name = privilegeName;
    }
}
The w32_GetTokenPrivilege function has gotten simpler as well. And it, too, has become safe against memory leaks.
Points of Interest
The decrease in total number of lines of code isn't dramatic, but the main improvement is developer convenience.
Given the vast number of API calls which work with variable size structures, it's very convenient that you no longer have to free memory by hand or thread error codes through every call.
History
- 25th November, 2022: Initial version