Introduction
When an application encounters an error, the last thing you want is for the reporting routine to fail, or to render an incomplete report, because it ran out of string space or couldn't get more from the heap because the system is starving for memory. My solution to this problem has quietly and privately evolved over the last ten years or so. I am finally confident enough of its value, and have a little time, to move the routines into a DLL and write an article about them.
Background
The design of the library takes into account a number of factors.
- According to "STRINGTABLE Resource," in the Microsoft Windows Platform SDK, a string resource "must be no longer than 4097 characters." Accordingly, any valid string will fit into a buffer of that many TCHARs, plus one for the terminal null character. Hence, buffers designated for use as destinations for LoadString are allocated as arrays of 4098 unsigned short elements.
- Since error messages usually involve limited amounts of text being substituted into a string, I still use the old sprintf and swprintf functions, which aren't supposed to use more than 1024 bytes of output buffer. (See wsprintf function in the MSDN library.) Hence, the sprintf buffers are allocated as arrays of 1024 char elements.
- Over the past two years of usage, I have found that few programs need more than 3 buffers each for use by LoadString and sprintf. Being a conservative sort of person, though, I built my DLL with 5 buffers of each kind.
- There are numerous applications, including those related to error reporting, in which substitution tokens are more useful than the cryptic printf tokens. Moreover, a replace function can operate on strings of arbitrary length, and the replacement string is specified once only, rather than once for each replacement, as you must if you use printf or sprintf to perform the replacement.
- For many other reasons, which could easily fill another article, I studiously avoided using templates, frameworks, and external libraries, other than my own, to eliminate, as much as possible, the risk of a heap allocation sneaking in by the back door, and to otherwise keep the code "lean and mean," in the interest of low overhead, robust error reporting.
- The public functions are declared with the __declspec(dllimport) storage-class attribute, but there is no .DEF file. Though I have many libraries that use .DEF files, I have discovered an annoying side effect, which is that functions imported from a library that has one display only their ordinals in dumpbin import reports. While ordinals are efficient for the loader, they are inefficient for us carbon units, for the same reason that most people don't use IP addresses directly when they have a choice.
- Functions that return strings have ANSI (narrow character) and Unicode (wide character) versions, for which I used the generic text mappings defined in TCHAR.H, so that the source of both is virtually identical. Rather than maintain two identical copies of the function bodies, I put them into .INL files, which are #included into source files that supply only the appropriate character encoding directive, header inclusions, function prototype, and closing brace.
- This project dispenses with precompiled headers, which cause more trouble than they are worth when some, but not all, modules define the UNICODE and _UNICODE preprocessor symbols.
- Along the same lines, there is one test program, for which two configurations are defined, one with UNICODE defined, and the other without.
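As a rough illustration of the fixed-buffer idea behind the list above, the following portable C fragment reserves a small pool of static buffers and hands out their addresses by index. This is only a sketch: the names (`fb_get_resource_buffer`, `fb_get_sprintf_buffer`, and the `FB_SKETCH_*` constants) are hypothetical and are not the library's actual exports; the sizes simply mirror the figures quoted above.

```c
#include <stddef.h>

/* Hypothetical sizes that mirror the figures quoted in the article. */
#define FB_SKETCH_BUFFER_COUNT   5      /* buffers of each kind          */
#define FB_SKETCH_RESOURCE_CHARS 4098   /* 4097 characters plus the null */
#define FB_SKETCH_SPRINTF_BYTES  1024   /* sprintf output buffer size    */

/* Static storage is reserved when the module loads, so no heap call is
   needed at the moment an error must be reported. */
static unsigned short s_resource_buffers[FB_SKETCH_BUFFER_COUNT][FB_SKETCH_RESOURCE_CHARS];
static char           s_sprintf_buffers[FB_SKETCH_BUFFER_COUNT][FB_SKETCH_SPRINTF_BYTES];

/* Return the address of resource buffer `index`, or NULL if out of range. */
unsigned short *fb_get_resource_buffer(unsigned int index)
{
    return index < FB_SKETCH_BUFFER_COUNT ? s_resource_buffers[index] : NULL;
}

/* Return the address of sprintf buffer `index`, or NULL if out of range. */
char *fb_get_sprintf_buffer(unsigned int index)
{
    return index < FB_SKETCH_BUFFER_COUNT ? s_sprintf_buffers[index] : NULL;
}
```

Because the storage lives in the module's data section, a caller can still obtain a usable buffer when the heap is exhausted, which is the situation the design is meant to survive.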
Package Inventory
This package contains a good bit of material. This section offers guidance, in the form of inventories of the directories that comprise the package and the DLLs and link libraries included in it.
The following table lists and describes the directories in the package.
Directory Name | Abstract |
FixedStringBuffersTestStand | Library Exerciser and Demonstration program |
INCLUDE | Headers for libraries, including the article subject library |
LIB | Link libraries required to build the project |
NOTES | Notes and reference documents |
QueryRoutineNames | Satellite DLL of string resources |
Release | DLL binaries and listings |
scripts | Scripts used in the post-build step, and the |
FixedStringBuffersTestStand\FixedStringBuffersTestStand___Win32_Unicode_Release | Release build of test stand program configured to use all Unicode strings, and, therefore, to test the Unicode routines |
FixedStringBuffersTestStand\Release | Release build of test stand program configured to use all ANSI strings, and, therefore, to test the ANSI routines |
QueryRoutineNames\Release | Release build of satellite DLL of string resources |
QueryRoutineNames\scripts | Scripts called in the post-build step to update the test stand program directories |
The following table describes the dynamic link libraries, some of which are required to use the library, and all of which are used by the demonstration program.
Library Name | Abstract |
P6CStringLib1.dll | The string manipulation routines in this library do things that I wish the frameworks did well or at all. Although I eventually found equivalents for some in MFC, that puts them out of reach for programs that support only __stdcall, at best, which includes VBA and even robust scripting languages, such as WinBatch. |
P6VersionInfo.dll | This library exports a small set of convenience routines for constructing logo banners from version resources. This is one of my oldest libraries of Windows API wrapper routines, all of which are implemented in straight C, and export via __stdcall. |
ProcessInfo.dll | This fairly new library provides ready access to information about the current process and modules that are loaded into it, such as, for example, the fully qualified name of the file from which they were loaded, the directory name from which a specified module loaded, and the fully qualified NetBIOS name of the user that owns the current process. |
SafeMemCpy.dll | This library exports a single function (well, two, because it has ANSI and Unicode implementations) that uses memcpy to provide efficient, safe appending and copying of strings. The safety refers to the fact that the routines use HeapReAlloc when necessary to expand the destination buffer to ensure that it can accommodate the requested copy or append operation. |
WWDateLib.dll | This library, which is about the same age as P6VersionInfo.dll, exports routines that employ substitution tokens to format the members of a SYSTEMTIME structure into a human readable date. The banner routines defined in P6VersionInfo.dll use them to display the current date. |
WWKernelLibWrapper.dll | The most important routines in this library are the three that wrap HeapAlloc, HeapReAlloc, and HeapFree in Structured Exception Handling blocks, so that they can return system status codes if they encounter a problem. HeapFree is also protected by a preceding call to HeapSize, whose return value is used to detect that the specified pointer didn’t come from the specified heap. |
In addition to a handful of Windows NT command scripts and the executable programs upon which they depend, the scripts directory contains a couple of Microsoft .NET assemblies upon which Date2FN.exe
depends. Due to the way that .NET assemblies load, they must stay with Date2FN.exe
.
Using the code
Subdirectory INCLUDE
of both packages contains a standard C/C++ header file, FixedStringBuffers.H
, which declares the required constants and exported routines. The library include file pulls two additional headers, both found in the same directory, into the compilation. After carefully weighing the risks, I decided that the safest way to deliver the package was to identify the header file dependencies, leave them all in place, and recommend that you install everything in the INCLUDE
directory into a directory of your own choosing, so long as it meets a single requirement: it must belong to the list of directories named in your INCLUDE
environment variable. This is how they are installed on my development machines, which enables the preprocessor to find them, since those are the directories that are searched for include files whose names appear in angle brackets. The other headers are required to build the library and the demonstration program.
Name | Abstract |
FixedStringBuffers.H Dependencies |
Const_Typedefs_WW.H | Defines const typedefs that I haven't found anywhere in the Platform SDK headers, but use to secure arguments against accidental changes that might adversely affect the calling application if they were to be reflected back into it. |
WWStandardErrorMessages.H | Defines string resource IDs and associated application status codes for conditions that occur frequently in most applications. The resource strings live in WWStandardErrorMessages.dll, which the main DLL expects to find in the directory from which it loads. Hence, I deposited a copy in the Debug and Release directories of the main DLL. |
Subdirectory FixedStringBuffersTestStand
of both packages contains a like-named program, whose main routine is defined in FixedStringBuffersTestStand.C
, with additional routines defined in FB_LoadStringFromNamedDLLA.CPP
. Between them, these two source files demonstrate the full capabilities of the library.
Navigation Aids
To help you navigate the library, the following table summarizes the main worker routines, all of which have ANSI and Unicode (wide character) implementations.
Name | Return | Abstract | Buffer |
FB_ReportErrorViaStaticBuffer | int | Use this routine to report errors via message box (for any program) or console (for a character mode program), returning the specified status code, unless a further error, such as a missing resource string, prevents the original error being reported. | FB_ERROR_MESSAGE_BUFFER |
FB_FormatMessage | LPTSTR | Use this routine to directly format the message for a system status code. The return value is a pointer to the string, ready to use as you see fit. | FB_ERROR_MESSAGE_BUFFER |
FB_LoadString | LPTSTR | Use this routine to load a string from a module for which you have a valid HMODULE, or from the first module that was loaded into the process address space. A NULL module handle signifies the process module. Use this routine to load strings from modules that are already mapped into the process, either as executable or data-only DLLs. | 0-4 |
FB_LoadStringFromDLL | LPTSTR | Use this routine to load a string from a module for which you have a file name. The specified module is mapped into the address space of the calling process, the requested string is read into the buffer, and the module is unloaded. | 0-4 |
FB_Replace | LPTSTR | Use this routine to format a string of up to 4097 characters (the maximum supported length of a resource string). The input string, text to find, and replacement text may come from anywhere, but the new string always comes from a single dedicated buffer that belongs to the DLL. | Not applicable |
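To make the FB_Replace row above more concrete, here is a minimal sketch of token replacement into a single static buffer. It is not the library's implementation: the name `sketch_replace`, the narrow-character-only signature, and the overflow policy (return NULL) are all assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_REPLACE_CHARS 4098   /* 4097 characters plus the null */

/* One dedicated output buffer, owned by this module. */
static char s_replace_buffer[SKETCH_REPLACE_CHARS];

/* Replace every occurrence of `find` in `input` with `repl`, writing the
   result into the static buffer. `input` must not point into that buffer.
   Returns NULL on bad input or if the result would not fit. */
char *sketch_replace(const char *input, const char *find,
                     const char *repl, unsigned int *new_length)
{
    size_t find_len, repl_len, tail, used = 0;
    const char *cursor, *hit;

    if (input == NULL || find == NULL || repl == NULL || find[0] == '\0')
        return NULL;

    find_len = strlen(find);
    repl_len = strlen(repl);
    cursor = input;

    while ((hit = strstr(cursor, find)) != NULL) {
        size_t prefix = (size_t)(hit - cursor);
        if (used + prefix + repl_len >= SKETCH_REPLACE_CHARS)
            return NULL;                      /* would overflow */
        memcpy(s_replace_buffer + used, cursor, prefix);
        used += prefix;
        memcpy(s_replace_buffer + used, repl, repl_len);
        used += repl_len;
        cursor = hit + find_len;              /* resume after the token */
    }

    tail = strlen(cursor);
    if (used + tail >= SKETCH_REPLACE_CHARS)
        return NULL;
    memcpy(s_replace_buffer + used, cursor, tail + 1);  /* includes null */
    used += tail;

    if (new_length != NULL)                   /* length is optional */
        *new_length = (unsigned int)used;
    return s_replace_buffer;
}
```

A call such as `sketch_replace("File $$FN$$ not found", "$$FN$$", "a.txt", &n)` names the replacement text once, however many times the token occurs, which is the advantage over printf-style tokens that the Background section describes.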
In addition to the main worker routines, a number of service routines return useful information from the DLL, including the number of each type of buffer that it supports, the sizes of the various types of buffers, and their machine addresses. The following table summarizes these routines.
Name | Return | Abstract | Buffer |
FB_GetlpErrMsgResStr | LPSFSBUF | Get the address of the buffer into which the resource string specified as input to FB_ReportErrorViaStaticBuffer was loaded. | Emergency Message Resource String buffer |
FB_GetlpErrMsgSprintf | LPSFSBUF | Get the address of the buffer used by FB_ReportErrorViaStaticBuffer when it must use sprintf to construct the finished message. | Emergency Message sprintf output buffer |
FB_GetSprintFBuffer | LPSFSBUF | Get the address of one of the output buffers designated for use as sprintf output buffers. | 0-4 |
FB_XlateFBStatusCode | LPSFSBUF | In the unlikely event that one of the worker routines returns NULL to indicate that an error occurred, pass the status code returned by GetLastError into this routine. The returned string translates the status code into English, and provides as much information as it can about why the error happened. | Emergency Message sprintf output buffer |
FB_GetSprintFBufferCount | int | Get the number of sprintf buffers. The index that you pass into FB_GetSprintFBuffer must be less than the returned value. | Not applicable |
FB_GetResourceBufferCount | int | Get the number of resource string buffers. Your index (puintBufferID) in any call to FB_LoadStringFromDLL or FB_LoadString must be less than the returned value. | Not applicable |
FB_GetSprintFBufferBytes | int | Get the size, in bytes, of each sprintf buffer. This is mostly FYI, since the sprintf family of routines don't ask how much room they have, and won't use more than 1024 bytes, which happens to be how big these buffers are. | Not applicable |
FB_GetResourceBufferTChars | int | Get the size, in TCHARs, of each resource string buffer. This is mostly FYI, since these routines supply the information to LoadString, and the buffers accommodate the maximum supported length of a string resource, 4097 characters. | Not applicable |
Copying Strings from the Buffers
The fourth argument to FB_LoadString
and FB_LoadStringFromDLL
is plpuintLength
, a pointer to the location of an unsigned integer which, unless NULL
, receives the character count returned by the underlying LoadString
system routine.
- If you intend to use the strings in situ, you can save 4 bytes of storage in your program and a few machine cycles in the DLL by passing
NULL
. However, the argument must always be tested for null, and it takes only two machine instructions to return the value through the supplied pointer.
- For the same reason,
FB_Replace
has a fourth argument, named puintNewLength
, to emphasize that it reports the length of the new string.
The fastest way to copy a string from a fixed buffer into one of your own is to call memcpy
or CopyMemory
(which calls memcpy
under the hood), passing the address of your own buffer as the first argument, the address returned by FB_LoadString
, FB_LoadStringFromDLL
, or FB_Replace
as the second argument, and the character count times sizeof ( TCHAR )
as the third argument. Failure to multiply the character count as I just described will get only half of your buffer copied out if it is composed of Unicode characters.
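The following sketch shows both points from this section in portable C: an optional length pointer that may be NULL, and a copy whose byte count is the character count times the character size. The names `sketch_load_string` and `sketch_copy_out` are hypothetical, and `wchar_t` stands in for a wide TCHAR (2 bytes on Windows, often 4 elsewhere, which is exactly why the code multiplies by `sizeof` rather than a literal).

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-in for a library buffer that holds a wide string. */
static wchar_t s_library_buffer[64] = L"Disk full";

/* Return the buffer; report the character count only if the caller asks. */
const wchar_t *sketch_load_string(unsigned int *length)
{
    if (length != NULL)
        *length = (unsigned int)wcslen(s_library_buffer);
    return s_library_buffer;
}

/* Copy `char_count` characters plus the terminating null out of a library
   buffer. The byte count is the character count times the character size;
   passing the bare character count would copy only part of a wide string. */
void sketch_copy_out(wchar_t *dest, const wchar_t *src, unsigned int char_count)
{
    memcpy(dest, src, ((size_t)char_count + 1) * sizeof(wchar_t));
}
```

The caller is assumed to own a destination at least `char_count + 1` characters long; the sketch does no bounds checking of its own.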
The following snippet, taken from FB_ReportErrorViaStaticBuffer
(the routine that motivated me to gather these routines into a library) illustrates use of memcpy
to copy out the new string generated by FB_Replace
, which is called several times in a loop, to replace the tokens embedded in the error message template.
for ( intSrchIndx = ARRAY_FIRST_ELEMENT_P6C ;
      intSrchIndx < sizeof ( m_aszTokens ) / sizeof ( m_aszTokens [ ARRAY_FIRST_ELEMENT_P6C ] ) ;
      intSrchIndx++ )
{
    /* Replace the current token throughout the message template. */
    lpChanged = FB_Replace ( lpErrMsgResStr ,
                             m_aszTokens [ intSrchIndx ] ,
                             alpReplacements [ intSrchIndx ] ,
                             &uintNewLen ) ;

    if ( StringIsNullOrEmptyWW ( lpChanged ) )
    {   /* FB_Replace failed; report its status code instead. */
        _stprintf ( m_lpFBReplaceBuff ,
                    FB_XlateFBStatusCode ( GetLastError ( ) ) ) ;
        lpFinalMessage = m_lpFBReplaceBuff ;
    }
    else
    {
        if ( IsLastLoopLT_WW ( intSrchIndx , ( sizeof ( m_aszTokens ) / sizeof ( m_aszTokens [ ARRAY_FIRST_ELEMENT_P6C ] ) ) ) )
        {   /* Last pass: use the finished string where it sits. */
            lpFinalMessage = lpChanged ;
        }
        else
        {   /* Copy the new string, null included, back for the next pass. */
            memcpy ( lpErrMsgResStr ,
                     lpChanged ,
                     TcharsMinBufSizeP6C ( uintNewLen ) ) ;
        }
    }
}
Since I have "broken the ice" by displaying a code snippet, I shall shift gears, and call attention to a few aspects of the above example, and the ones to follow, that are significant, but not immediately obvious.
Points of Interest
The loop shown above illustrates quite a few things that you will see throughout my code.
- The initialization clause of the for statement uses
ARRAY_FIRST_ELEMENT_P6C
, a macro that expands to a numeric value of zero. I use such macros to document magic numbers, which is what I perceive the lower bound of an array to be.
- Although the limit clause is an expression in the source code, even a debug build of the code converts the expression to an immediate (hard coded) constant, which can be seen in the following snippet from the disassembly of the
for
statement shown above. This feat is possible because the required values are all known at compile time, so the compiler computes the value, and bakes it into the code.
81: for ( intSrchIndx = ARRAY_FIRST_ELEMENT_P6C
10002DBF mov dword ptr [ebp-24h],0
10002DC6 jmp FB_ReportErrorViaStaticBufferW+1D1h (10002dd1)
10002DC8 mov ecx,dword ptr [ebp-24h]
10002DCB add ecx,1
10002DCE mov dword ptr [ebp-24h],ecx
10002DD1 cmp dword ptr [ebp-24h],3
10002DD5 jae FB_ReportErrorViaStaticBufferW+286h (10002e86)
The limit test evaluation is the cmp
instruction at machine address 10002DD1; the MASM style comment is lifted verbatim from my work notes, from which I lifted the above snippet.
- For the same reason, I didn't waste space in the executable file to evaluate and store the expression for use in the last iteration test that begins "
if ( IsLastLoopLT_WW
" that suppresses the memory copy on the last iteration, since the final string can be used where it sits.
- The new length is captured on each iteration into
uintNewLen
, which is allocated at the top of the routine, and fed to memcpy
to copy out the string between iterations, so that the output buffer can be reused. (As I write this, I realize that the copying could be eliminated by allocating a second buffer, and alternating between them on each iteration. I leave that as an exercise for ambitious readers, or for the next version of the library.)
- Though it looks like a function call,
TcharsMinBufSizeP6C
is a parameterized macro that hides the multiplication by sizeof ( TCHAR )
that I described above, and accounts for the trailing null.
- Copying the trailing null every time makes it safe to reuse buffers without initializing them.
- The wide character calculation requires just one machine instruction, since
mov
and push
don't count, because both are required to get the number into the argument list.
10002E66 mov edx,dword ptr [ebp-20h]
10002E69 lea eax,[edx+edx+2]
10002E6D push eax
The middle instruction at machine address 10002E69
accounts for both sizeof ( TCHAR )
for a wide character and the trailing null (+2
). When UNICODE
is undefined, that instruction becomes add edx, 1
, and edx
goes onto the stack.
- The last iteration test is another parameterized macro; this macro generates an expression that evaluates to true only on the last iteration of the loop.
#define IsLastLoopLT_WW(pintLoopIndex,pintLoopLimit) ( ( pintLoopIndex + 1 ) == pintLoopLimit )
Since the limit test of this loop is that the index is less than the upper limit, the last iteration runs when the index is one short of the limit. Why this expression works is left as an exercise for the interested reader, as is the disassembly.
- The final function style macro in this block,
StringIsNullOrEmptyWW
, is inspired by the static string.IsNullOrEmpty
method in the Microsoft .NET Framework, and it behaves in exactly the same way. The macro is straightforward, as is the generated machine code.
#define StringIsNullOrEmptyWW(plpString) ( ( BOOL ) ( plpString == NULL || StringIsEmptyWW ( plpString ) ) )
The machine code generated to implement the macro in the snippet shown above is as follows.
85: if ( StringIsNullOrEmptyWW ( lpChanged ) )
003B37EE cmp dword ptr [ebp-8],0
003B37F2 je FB_ReportErrorViaStaticBufferA+20Eh (003b37fe)
003B37F4 mov edx,dword ptr [ebp-8]
003B37F7 movsx eax,byte ptr [edx]
003B37FA test eax,eax
003B37FC jne FB_ReportErrorViaStaticBufferA+24Bh (003b383b)
The example above, also taken from my working notes, shows the register values from a test for a string that is neither null, nor empty.
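The three function-style macros discussed in this section can be approximated in portable C as follows. This is a sketch under stated assumptions, not the library's headers: the `SKETCH_*` names are hypothetical, the buffer-size macro is inferred from the `lea eax,[edx+edx+2]` instruction shown above (count plus one, times the character size), and the empty-string test is assumed to examine only the first character, as the `movsx`/`test` pair in the disassembly suggests.

```c
#include <stddef.h>

/* Bytes needed for `n` characters of type T plus the trailing null.
   Hypothetical equivalent of what TcharsMinBufSizeP6C appears to compute. */
#define SKETCH_MIN_BUF_BYTES(n, T)   (((size_t)(n) + 1) * sizeof(T))

/* True only on the last pass of a loop whose test is `index < limit`.
   The arguments are parenthesized here as a precaution. */
#define SKETCH_IS_LAST_LOOP_LT(index, limit)   (((index) + 1) == (limit))

/* Assumed shape of the empty-string test: the first character is the null. */
#define SKETCH_STRING_IS_EMPTY(s)   ((s)[0] == '\0')
#define SKETCH_STRING_IS_NULL_OR_EMPTY(s) \
    ((s) == NULL || SKETCH_STRING_IS_EMPTY(s))

/* Count how many iterations of a 0..limit-1 loop are flagged as the last. */
int sketch_count_last_iterations(int limit)
{
    int index, hits = 0;

    for (index = 0; index < limit; index++)
        if (SKETCH_IS_LAST_LOOP_LT(index, limit))
            hits++;
    return hits;
}
```

Because `||` short-circuits, the null test guards the dereference in the empty test, so the combined macro is safe to apply to a NULL pointer.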
Using Two or More Buffers at Once
The last major point that I think deserves some attention is a demonstration of a case in which it helps to have access to more than one (three, to be exact) static buffers. The example is the StagingOrbits
routine, defined in FixedStringBuffersTestStand.C
, most of which is reproduced below.
#define FB_BUFFER_INDEX_TOSEARCH    ( FB_GUARANTEED_BUFFER + ARRAY_NEXT_ELEMENT_P6C )
#define FB_BUFFER_INDEX_TOFIND      ( FB_BUFFER_INDEX_TOSEARCH + ARRAY_NEXT_ELEMENT_P6C )
#define FB_BUFFER_INDEX_REPLACEMENT ( FB_BUFFER_INDEX_TOFIND + ARRAY_NEXT_ELEMENT_P6C )

for ( uintStrData = STRDATA_FIRST ;
      uintStrData <= STRDATA_LAST ;
      uintStrData++ )
{
    /* Outer loop: the string to search occupies the first buffer. */
    lpStrData = FB_LoadTestString ( uintStrData ,
                                    FB_BUFFER_INDEX_TOSEARCH ) ;

    for ( uintStrFind = TOFIND_FIRST ;
          uintStrFind <= TOFIND_LAST ;
          uintStrFind ++ )
    {
        /* Middle loop: the text to find occupies the second buffer. */
        lpStrFind = FB_LoadTestString ( uintStrFind ,
                                        FB_BUFFER_INDEX_TOFIND ) ;

        for ( uintStrRepl = TOREPLACE_FIRST ;
              uintStrRepl <= TOREPLACE_LAST ;
              uintStrRepl++ )
        {
            * plpOrbit += 1 ;

            /* Inner loop: the replacement text occupies the third buffer. */
            lpStrRepl = FB_LoadTestString ( uintStrRepl ,
                                            FB_BUFFER_INDEX_REPLACEMENT ) ;
            lpReplaced = FB_Replace ( lpStrData ,
                                      lpStrFind ,
                                      lpStrRepl ,
                                      &uintLength ) ;

            /* Make newlines visible in the log; these copies live on the heap. */
            lpReplaced4lOG = StrReplace_P6C ( ( lpReplaced
                                                ? lpReplaced
                                                : FB_XlateFBStatusCode ( GetLastError ( ) ) ) ,
                                              _T ( "\n" ) ,
                                              _T ( "[NEWLINE]" ) ) ;
            lpstrData4Log   = StrReplace_P6C ( lpStrData , _T ( "\n" ) , _T ( "[NEWLINE]" ) ) ;
            lplpStrFind4Log = StrReplace_P6C ( lpStrFind , _T ( "\n" ) , _T ( "[NEWLINE]" ) ) ;
            lpStrRepl4Log   = StrReplace_P6C ( lpStrRepl , _T ( "\n" ) , _T ( "[NEWLINE]" ) ) ;

            _tprintf ( lpMsgTpl ,
                       * plpOrbit ,
                       uintStrData ,
                       lpstrData4Log ,
                       uintStrFind ,
                       lplpStrFind4Log ,
                       uintStrRepl ,
                       lpStrRepl4Log ,
                       lpReplaced4lOG ,
                       uintLength ) ;

            FreeBuffer_WW ( lpstrData4Log ) ;
            FreeBuffer_WW ( lplpStrFind4Log ) ;
            FreeBuffer_WW ( lpStrRepl4Log ) ;
            FreeBuffer_WW ( lpReplaced4lOG ) ;
        }
    }
}
The objective of this routine is to thoroughly exercise the FB_Replace
library routine, which takes three string arguments, all of which are inputs, and a fourth argument, which is a pointer to a UINT
variable that receives the length of the new string.
Test routine StagingOrbits
is implemented as a nested for
loop, with a loop corresponding to each of the three inputs. Since all three inputs must be present when the innermost loop calls the FB_Replace
routine, it uses three of the five resource string buffers, designated by the three symbolic constants defined at the top of the listing.
Since they aren't really part of the test, but are used to format the output so that it can be read into Microsoft Excel for analysis, strings lpstrData4Log
, lplpStrFind4Log
, and lpStrRepl4Log
are constructed in dynamically allocated buffers, using StrReplace_P6C
, the predecessor of FB_Replace
that allocates memory as needed from the heap, and can, therefore, handle strings of arbitrary length. Unlike its successor, StrReplace_P6C
has no provision for returning the length of its finished string, although it can be derived by dividing the value returned from HeapSize
by sizeof ( TCHAR )
and subtracting one from the quotient. Why this is much faster than passing the string to _tcslen
is left as an exercise for you mental gymnasts. The only additional hint I shall offer is that this method works because the returned buffer is exactly big enough to hold the returned string.
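The arithmetic described above can be sketched in portable C as follows. The function name is hypothetical; on Windows, the `buffer_bytes` argument would be the value returned by HeapSize, and the result is meaningful only under the condition the article states, namely that the buffer is exactly big enough for the string and its terminating null.

```c
#include <stddef.h>

/* Given the byte size of a buffer that is exactly large enough for a string
   and its terminating null, derive the string length in characters.
   `char_size` plays the role of sizeof ( TCHAR ). */
size_t sketch_length_from_buffer_bytes(size_t buffer_bytes, size_t char_size)
{
    if (char_size == 0 || buffer_bytes < char_size)
        return 0;                       /* no room even for the null */
    return buffer_bytes / char_size - 1;
}
```

The calculation is one division and one subtraction regardless of how long the string is, whereas a length routine must examine every character until it reaches the null.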
There are a number of differences between the algorithms used by StrReplace_P6C
and FB_Replace
. I'll just say that I think the algorithm implemented in FB_Replace
is more robust in several respects, and there is a good chance of it being adapted to work with dynamic memory, to become an improved version of StrReplace_P6C
.
Lessons Learned or Reinforced
- Reinforced: Compute offsets into character strings in characters, and let the compiler convert them to bytes. You may as well, since you can't make it do otherwise without more work than it's worth.
- Reinforced: Test the Unicode version first, and the ANSI version will probably take care of itself.
- Learned: The CRT string routines fail badly when fed a null reference. My solution to this issue is
TcsLenEvenIfNull
, a function style macro that wraps my StringIsNullOrEmptyWW
macro, discussed above, in a ternary expression that calls _tcslen
only if the string pointer is not null and the string has a length greater than zero. This saves the function call for when you really need it, and avoids the badly handled null reference exception. TcsLenEvenIfNull
is defined in FixedStringBuffers_Pvt.H
, which is part of the main DLL project; its dissection is left as a lab exercise.
- Learned: The easiest and best way to avoid string ID number collisions is to group strings into satellite DLLs. This lesson culminated in the creation of library function
FB_LoadStringFromDLL
and the VBA macro that makes FB_Replace_Test_Strings.XLSM
magic. Collision Proof Shared String Resources is all about the Excel workbook and its magic, and includes an improved version of the workbook, along with Visual Studio template projects from which to create your own string DLLs. Meanwhile, I left a copy in the NOTES
directory of the project for you to explore. [New Version] As of 2 June 2015, the download package contains a copy of the improved workbook that I published with the article, and exports of the embedded VBA source code. Along with some bug fixes, the new version sports a hot key that starts the macro, Ctrl-Shift-G. The macro project is locked, to prevent accidental changes, but unsigned. The critical formulas in the worksheets are protected against accidental changes, as are the lookup worksheets from which the resource script and its header are generated. If you downloaded the archive for this article last month, you may want to download it again to get the updated NOTES
directory. Better yet, use the hyperlink above to pop the other article open in a new browser window, read it, and get its demonstration package.
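The null-tolerant length macro described in the third lesson above might look something like the following. This is a guess at its general shape, not a copy of the definition in FixedStringBuffers_Pvt.H: the `SKETCH_LEN_EVEN_IF_NULL` name is hypothetical, and `strlen` stands in for `_tcslen` so that the fragment compiles without the Windows headers.

```c
#include <stddef.h>
#include <string.h>

/* Yield zero for a NULL pointer or an empty string; otherwise call strlen.
   The ternary keeps the CRT routine from ever seeing a NULL pointer. */
#define SKETCH_LEN_EVEN_IF_NULL(s) \
    (((s) == NULL || (s)[0] == '\0') ? (size_t)0 : strlen(s))

/* Thin function wrapper, convenient for callers that want a real symbol. */
size_t sketch_safe_length(const char *s)
{
    return SKETCH_LEN_EVEN_IF_NULL(s);
}
```

Note that the macro evaluates its argument more than once, so it should be given a plain pointer variable rather than an expression with side effects.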
History
08 April 2015 - Article published.
02 June 2015 - Added new version of FB_Replace_Test_Strings.XLSM to both download packages, revised the article to cover the new package and include a link to the article about the workbook and the associated C/C++ code, reworded a sentence here and there, and made a few cosmetic changes.