Introduction
Updated 2007-03-25: See end of article for details.
Ever designed a simple utility program, such as a hex-dump program, only to find that it weighs in at a full 64K, even when optimized for size, when all it does is read a file and print to stdout? Ever wonder what happened to the good ol' DOS days, when programs had to be small? When a COM file was limited to 64K? Or when you could write a bare-bones DOS-style protected mode operating system kernel in about 64K?
Well, look no further. Here I will examine what causes this code bloat, and what can be done to fix it.
Background
Matt Pietrek wrote an excellent article in the January 2001 MSDN Magazine titled Under the Hood: Reduce EXE and DLL Size with LIBCTINY.LIB. While most of this information remains valid today, I have updated some of his code to work better with Visual Studio 2005. I have also added support for functions that were not included in his article.
Intended Audience
This article is aimed at programmers who like to have control over every little detail. It is also geared towards small portable utility-like programs, where a DLL CRT is undesirable because of the need for a second file and installation program, and where the overhead of a statically linked CRT is much greater than the core program code.
Of course, once the CRT is replaced, programs that rely on specifics of the Microsoft CRT will fail. For instance, if you go digging into the FILE structure, or expect a certain header on your memory allocations, or rely on the buffering features of stdio, or use locales, runtime checks, or C++ exception handling, you can't use this library. This library is aimed at small, simple programs, such as a hex-dump command line program or the many UNIX-style tools like cat or grep.
Many C/C++ purists will take offence at my suggestions, because the C runtime is, to them, something that shouldn't be tampered with. But bear with me, because although you might never use any of this article's information, it should at least give you an insight into how your program works.
Where's Bloat-o?
(really bad pun, I know...)
The source of this 'code bloat' is very easy to find by looking at a linker-generated map file. Here is a snippet from the demo program's map file:
0001:00000000 ?DumpFile@@YAXPAD@Z 00401000 f hd.obj
0001:00000152 _main 00401152 f hd.obj
0001:0000021b _feof 0040121b f LIBCMT:feoferr.obj
0001:0000024a _fgetc 0040124a f LIBCMT:fgetc.obj
0001:00000381 _printf 00401381 f LIBCMT:printf.obj
0001:00000430 __get_printf_count_output 00401430 f LIBCMT:printf.obj
0001:00000446 __fsopen 00401446 f LIBCMT:fopen.obj
0001:0000050a _fopen 0040150a f LIBCMT:fopen.obj
0001:00000520 _memset 00401520 f LIBCMT:memset.obj
0001:0000059a __fclose_nolock 0040159a f LIBCMT:fclose.obj
0001:0000060d _fclose 0040160d f LIBCMT:fclose.obj
0001:00000689 __amsg_exit 00401689 f LIBCMT:crt0dat.obj
0001:000006ad ___crtCorExitProcess 004016ad f LIBCMT:crt0dat.obj
0001:000006d3 ___crtExitProcess 004016d3 f LIBCMT:crt0dat.obj
...
0001:0000a590 __allmul 0040b590 f LIBCMT:llmul.obj
0001:0000a5e0 _strchr 0040b5e0 f LIBCMT:strchr.obj
0001:0000a5e6 ___from_strstr_to_strchr 0040b5e6 LIBCMT:strchr.obj
As you can see, it includes "two" functions from my program, and over "two hundred" functions in the C Runtime (CRT).
Notice that one of the functions is even ___crtCorExitProcess, a function that is used by a C++/CLI program! Other gems include multithreading support, locales, and exception handling - none of which are used by my program!
And this is with Eliminate Unreferenced Data and COMDAT Folding on!
Where do I begin?
I will first highlight the various tasks performed by the C Runtime to give the reader a better understanding of the 'magic' that happens in C and C++.
Let's start by configuring the linker to Ignore Default Libraries and compiling. I was greeted with this:
hd.obj : error LNK2001: unresolved external symbol _feof
hd.obj : error LNK2001: unresolved external symbol _fgetc
hd.obj : error LNK2001: unresolved external symbol _printf
hd.obj : error LNK2001: unresolved external symbol _fopen
hd.obj : error LNK2001: unresolved external symbol _memset
hd.obj : error LNK2001: unresolved external symbol _stricmp
hd.obj : error LNK2001: unresolved external symbol _fclose
hd.obj : error LNK2001: unresolved external symbol _exit
LINK : error LNK2001: unresolved external symbol _mainCRTStartup
Not good. Not good at all.
mainCRTStartup
Where does your console program start? Did I hear you say main? If you did, you said what I would have said before journeying into the inner workings of the C Runtime.
Windows isn't nice enough to provide your app with a ready-made argc and argv. All it does is call a void function() specified in the EXE header. And by default, that function is called mainCRTStartup. Here is a simple example:
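extern "C" void __cdecl mainCRTStartup()
{
    int argc = _init_args();         // build argc and _argv from the command line
    _init_atexit();                  // create the atexit table
    _initterm(__xc_a, __xc_z);       // call constructors of static C++ objects
    int ret = main(argc, _argv, 0);
    _doexit();                       // call the registered atexit callbacks
    ExitProcess(ret);
}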
We start by creating argc and argv, which we later pass to main. But before we do that we have to take care of some things, like calling the constructors for static C++ objects.
The same thing happens in GUI programs, except the function is called WinMainCRTStartup. And for DLLs, the true entry point is _DllMainCRTStartup. Unicode programs look for wmainCRTStartup and wWinMainCRTStartup respectively. DllMain appears to stay the same.
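To give a feel for what a helper like _init_args has to do, here is a minimal sketch that splits a raw command-line string into argv-style tokens. It is plain portable C++, not the library's actual code: split_command_line and its simplified quoting rule (double quotes group words, no backslash escapes) are assumptions made for this example. A real startup routine would get the string from GetCommandLineA and would avoid the C++ standard library.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: split a command line into arguments.
// Simplified rules (an assumption, not the CRT's exact behavior):
//   - spaces and tabs separate arguments
//   - double quotes group text containing whitespace; the quotes are dropped
//   - backslash escapes are not interpreted
std::vector<std::string> split_command_line(const char *cmd)
{
    assert(cmd != nullptr);
    std::vector<std::string> args;
    std::string current;
    bool in_quotes = false;
    bool have_token = false;   // true once the current argument has started

    for (const char *p = cmd; *p; ++p)
    {
        char c = *p;
        if (c == '"')
        {
            in_quotes = !in_quotes;
            have_token = true;             // "" yields an empty argument
        }
        else if (!in_quotes && (c == ' ' || c == '\t'))
        {
            if (have_token)                // whitespace ends the argument
            {
                args.push_back(current);
                current.clear();
                have_token = false;
            }
        }
        else
        {
            current += c;
            have_token = true;
        }
    }
    if (have_token)
        args.push_back(current);
    return args;
}
```

The number of tokens plays the role of argc, and the tokens themselves the role of argv. The first token is the program name, just as in a normal main.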
C++ Magic
The constructors of static objects don't just call themselves. And Windows is certainly not going to call them for us. So we have to do it ourselves. What do I mean?
#include <stdio.h>

class StaticClass
{
public:
    StaticClass() { printf("StaticClass constructor\n"); }
    ~StaticClass() { printf("StaticClass destructor\n"); }
};

StaticClass staticClass;

int main()
{
    printf("main\n");
    return 0;
}
C++ programmers should automatically expect the output of this program to be:
StaticClass constructor
main
StaticClass destructor
Matt Pietrek has a great explanation in his article, mentioned earlier, under the heading "The Dark Underbelly of Constructors", so I will not bother going into that level of detail here. Suffice it to say that the compiler emits pointers to the constructor functions (actually thunks to constructor functions) in a special ".CRT" section in the object file, which is later merged with the ".data" section. By declaring a pointer to the start and the end of this section, the _initterm function is able to iterate over these pointers, calling each constructor in turn.
The constructor thunk function also registers an atexit callback to call the destructor of the object. Thus the mainCRTStartup function above goes to the trouble of creating an atexit table. The _doexit function is responsible for calling these functions.
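The mechanism can be sketched in portable C++. This is only an illustration under stated assumptions: tiny_initterm, tiny_atexit, tiny_doexit, and the 32-entry table size are hypothetical names and values chosen for this example, and an ordinary array stands in for the compiler-generated ".CRT" section.

```cpp
#include <cassert>
#include <cstddef>

typedef void (*init_fn)();   // signature of a constructor thunk or exit callback

// Hypothetical stand-in for _initterm: call every non-null pointer in
// the half-open range [first, last).
void tiny_initterm(init_fn *first, init_fn *last)
{
    for (; first < last; ++first)
        if (*first)
            (*first)();
}

// Hypothetical fixed-size atexit table (the capacity is an assumption).
const std::size_t kMaxExitFns = 32;
static init_fn g_exit_fns[kMaxExitFns];
static std::size_t g_exit_count = 0;

// Register a callback; returns 0 on success, nonzero if the table is full.
int tiny_atexit(init_fn fn)
{
    assert(fn != nullptr);
    if (g_exit_count >= kMaxExitFns)
        return 1;
    g_exit_fns[g_exit_count++] = fn;
    return 0;
}

// Hypothetical stand-in for _doexit: run callbacks in reverse order of
// registration, so objects are destroyed in reverse order of construction.
void tiny_doexit()
{
    while (g_exit_count > 0)
        g_exit_fns[--g_exit_count]();
}
```

In this sketch, each constructor thunk would construct its object and then call tiny_atexit with a function that destroys it. The entry point calls tiny_initterm before main and tiny_doexit after it.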
Standard Functions
So now we have taken care of the program's entry point. What about the other functions?
printf and Family
One of the more complex tasks performed by the C Runtime is parsing the printf format string. (I'll admit it's not terribly complex; it's just non-trivial compared to strcmp.) To save space, we can offload this processing to the Windows function wvsprintf. No, that's not a wide-character version. The w probably stands for Windows.
extern "C" int __cdecl printf(const char *fmt, ...)
{
    va_list args;
    va_start(args, fmt);
    int ret = vprintf(fmt, args);
    va_end(args);
    return ret;
}

extern "C" int __cdecl vprintf(const char *fmt, va_list args)
{
    char bfr[2048];
    int ret = wvsprintf(bfr, fmt, args);
    fwrite(bfr, ret, 1, stdout);
    return ret;
}
File I/O
Originally I had planned to eschew the FILE structure altogether - and instead just use a HANDLE cast to a FILE*. But this would have only given me two bits of information. As I added functionality to the library this ideal solution became less ideal when I needed to store an end-of-file flag, text-mode flag, and possibly other data. And besides, not using the FILE structure means that the stdin, stdout, and stderr identifiers don't work! So now I (ab)use the FILE structure.
Because I cannot change the FILE structure itself (it is defined in stdio.h) I have to use its fields to work with my data. A very ugly solution. But this library isn't intended to be pretty. NOTE however, that this means code that relies on internal fields in the FILE structure will crash. But then again, you shouldn't be messing with internal data structures anyways, right?
Thus, for illustration, here is fopen:
extern "C" FILE *fopen(const char *path, const char *attrs)
{
    // Map the mode string onto CreateFile access and creation flags.
    DWORD access, disp;
    if (strchr(attrs, 'w'))
    {
        access = GENERIC_WRITE;
        disp = CREATE_ALWAYS;      // create the file, or truncate an existing one
    }
    else
    {
        access = GENERIC_READ;
        disp = OPEN_EXISTING;      // fail if the file does not exist
    }

    HANDLE hFile = CreateFileA(path, access, 0, 0, disp, 0, 0);
    if (hFile == INVALID_HANDLE_VALUE)
        return 0;

    // Store the handle and the text-mode flag in the (ab)used FILE fields.
    _FILE *file = new _FILE;
    memset(file, 0, sizeof(_FILE));
    file->set_handle(hFile);
    if (strchr(attrs, 't'))
        file->_flag |= _FILE_TEXT;
    return file;
}
fread and fwrite are substantially more complicated than this, because in text mode they must translate between '\r\n' and '\n': fread collapses each '\r\n' combination to '\n' only, and fwrite expands '\n' back to '\r\n'. For brevity, I will not discuss the algorithm - see the source code if you are interested.
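For readers who want the gist without opening the source, here is a minimal sketch of the read-side translation. It is not the library's algorithm. crlf_to_lf is a hypothetical helper, and it assumes the whole input sits in one buffer, which sidesteps the awkward case of a '\r' landing on the last byte of one read and its '\n' arriving in the next.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: collapse every "\r\n" pair in buf[0..len) to "\n",
// in place, and return the new length. A lone '\r' is left untouched.
// Assumption: the whole input is in one buffer; a real fread must also
// handle a '\r' that falls on the last byte of one read.
std::size_t crlf_to_lf(char *buf, std::size_t len)
{
    assert(buf != nullptr || len == 0);
    std::size_t out = 0;
    for (std::size_t in = 0; in < len; ++in)
    {
        if (buf[in] == '\r' && in + 1 < len && buf[in + 1] == '\n')
            continue;              // drop the '\r'; the '\n' is copied next
        buf[out++] = buf[in];
    }
    return out;
}
```

Because the output never grows, the translation can be done in place. The write side is the mirror image but needs a second buffer, since expanding '\n' to '\r\n' makes the data longer.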
String functions
Replacing the CRT means no more strlen, strcmp, or even memset. These must be implemented from scratch. Thankfully, they are not difficult to implement - just tedious. Care should be taken to handle NULL pointers and other special cases described in the MSDN documentation.
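As a rough idea of what "from scratch" looks like, here is a sketch of three such functions. The tiny_ prefix is an assumption made so the example can coexist with a real CRT, and the bodies are straightforward textbook versions rather than the library's own source. The case-insensitive compare only folds ASCII letters.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical replacements; the tiny_ prefix avoids clashing with the real CRT.

std::size_t tiny_strlen(const char *s)
{
    assert(s != nullptr);
    const char *end = s;
    while (*end)
        ++end;
    return static_cast<std::size_t>(end - s);
}

void *tiny_memset(void *dst, int value, std::size_t count)
{
    unsigned char *p = static_cast<unsigned char *>(dst);
    while (count--)
        *p++ = static_cast<unsigned char>(value);
    return dst;
}

// ASCII-only lowercase helper used by the case-insensitive compare.
static int ascii_lower(int c)
{
    return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

// Returns <0, 0, or >0, like stricmp, comparing ASCII letters case-insensitively.
int tiny_stricmp(const char *a, const char *b)
{
    assert(a != nullptr && b != nullptr);
    for (;; ++a, ++b)
    {
        int ca = ascii_lower(static_cast<unsigned char>(*a));
        int cb = ascii_lower(static_cast<unsigned char>(*b));
        if (ca != cb || ca == 0)
            return ca - cb;
    }
}
```

One practical wrinkle when a function like this is given its real name: the compiler may treat memset as an intrinsic, so defining it yourself can require something like #pragma function(memset) first. Check this against your compiler version.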
Wide Character (Unicode) Support
This is the major new feature in this library. It is still under development and hasn't undergone extensive testing yet.
As suggested by Hans Dietrich I have started to add wide-character support to the library. Basically that means implementing wide-character versions of various functions.
Uppercase and lowercase
When dealing with ASCII, functions like isalpha, toupper, and strlwr are trivial to implement. But as soon as Unicode enters the picture, they become much more complicated. There are different rules for uppercase versus lowercase and alphabetic versus numeric, so some operating system help is in order. To fix this problem, the function GetStringTypeW is used to implement the isXYZ family of functions, and the functions CharUpper and CharLower are used to implement toupper and tolower, respectively.
File encoding
Up until VS2005 even the Unicode file library functions could only write ASCII characters. Output from wprintf, fwprintf, and fputws in text mode is translated from Unicode before it is written to the file.
Because adding support for UTF-8, UTF-16, and other forms of file encoding would just add bloat to this library, I have made the decision to not include it. The behavior will remain compatible with the pre-VS2005 CRT. If you need to deal with file encodings, you probably need the full CRT anyway.
Why are you adding all this stuff? Why not keep it simple!
Simple: Only the stuff that you call is included in a release build!
But then why is Microsoft's CRT so bloated if you don't call much stuff? Again - because you do, but don't know it. The CRT startup code itself calls lots of functions that in turn call other functions - and a lot of it is garbage that isn't needed 90% of the time. Locales, exception handling, etc. have their place, but not in all programs. If your program doesn't use it, why should it have to pay the price of Microsoft's startup code using it?
The startup code and various functions in this CRT library are designed to rely on as little functionality as possible. Thus only the essentials are included.
Using the code
Add the tlibc (Tiny Libc) project to your project's solution, and add it as a referenced project. Alternatively, compile the library and add it to your project's linker options.
Because we are replacing the default CRT, C++ exception handling and SEH will not be handled properly. So don't use them! You will also need to turn off Buffer Security Check, set Runtime Checks to default, and disable Runtime Type Information.
Make sure to link with Ignore Default Libraries turned on! And to generate the smallest code, compile with link-time code generation on, optimize for size, turn string pooling on, and enable COMDAT folding and eliminate unreferenced data.
Results
After recompiling the program with libctiny and the method above, the EXE shrank from a giant 64K to a much more reasonable 4K (4096 bytes, to be exact)! For comparison, the entire code section of the linker map file is reproduced below:
0001:00000000 ?DumpFile@@YAXPAD@Z 00401000 f hd.obj
0001:0000013d _main 0040113d f hd.obj
0001:0000021a _fopen 0040121a f libct:file.obj
0001:000002a7 _fread 004012a7 f libct:file.obj
0001:000003c2 _fwrite 004013c2 f libct:file.obj
0001:0000048b _fgetc 0040148b f libct:file.obj
0001:000004b6 _printf 004014b6 f libct:printf.obj
0001:000004ef _memset 004014ef f libct:memory.obj
0001:0000050e __doexit 0040150e f libct:initterm.obj
0001:0000053a _mainCRTStartup 0040153a f libct:crt0tcon.obj
0001:000005f5 _malloc 004015f5 f libct:alloc.obj
0001:00000607 __init_args 00401607 f libct:argcargv.obj
0001:00000705 __ismbcspace 00401705 f libct:isctype.obj
History
2007-03-25
- Fixed strnicmp, pointed out by mpj
2006-08-19
- Wide-character bugfixes (_fgetws)
- Non-Unicode builds now set to SBCS rather than MBCS
- Fixed typo bug in stderr, pointed out by Hans
2006-08-13
- Preliminary wide-character support
- Fixed memory leak in command-line parsing (existed in original)
- Fixed memory leak in fread
- Fixed behavior of feof
- Fixed a rather embarrassing problem with strncpy
- Added _DllMainCRTStartup that I accidentally omitted
2006-08-12
- Submission to CodeProject
Comments, complaints, questions, etc. are welcome. Please let me know if you actually use this for something. If you need a function that is not included in this library, let me know and I will update the code. Comments on my 'comments' are also welcome.