The code shown in this tip makes it easy to keep the application settings in an INI file using UTF-8 encoding.
Introduction
In my previous article, "Doing UTF-8 in Windows", I showed how you can work with UTF-8 using basically only two functions, utf8::narrow and utf8::widen. For general file I/O, you just have to convert the file name from UTF-8 to UTF-16 and all the reading and writing functions remain unchanged:
FILE *f = utf8::fopen (u8"ܐܪܡܝܐ.txt", "w");
fputs (u8"This text is in Aramaic ܐܪܡܝܐ", f);
fclose (f);
There is one case that is not covered by these rules: the INI files, also called "profile files" in Microsoft parlance. Although there are many other ways of storing application settings, INI files are still widely used either for compatibility reasons or because they are simple to work with.
The problem is that the basic Windows API calls for reading and writing INI files, GetPrivateProfileString and WritePrivateProfileString, combine both the file name and the information to be read or written in one API call. As an example, here is the signature of the GetPrivateProfileStringW function:
DWORD GetPrivateProfileStringW(
  LPCWSTR lpAppName,
  LPCWSTR lpKeyName,
  LPCWSTR lpDefault,
  LPWSTR  lpReturnedString,
  DWORD   nSize,
  LPCWSTR lpFileName
);
If we used the utf8::widen function to convert all our UTF-8 strings, we would end up with an INI file that contains UTF-16 characters.
The solution is to completely forget about the Windows API functions and roll our own implementation for accessing INI files. This is far from the only implementation of INI files that you can find out there. For a list of implementations, you can check the Wikipedia page. Some of them might be a bit over-hyped; one such project claims to be "the ultimate and most consistent INI file parser library written in C". The only claim I make is that my implementation strives to be as compatible as possible with the original Windows API.
As such, you will find no arbitrary extensions to the file format, and I've done a lot of testing to identify different corner cases. Here are the rules I discovered by trying different combinations of calls to the original Windows API:
- The only comment lines are the ones starting with a semicolon (hashes are not considered comments by the Windows API).
- There are no trailing comments; anything after the '=' sign is part of the key value.
- Leading and trailing spaces are removed both from returned strings and from parameters.
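The three rules above can be sketched as a small line classifier. This is only an illustration under stated assumptions, not the code shipped with the article: the names LineKind, ParsedLine, trim and parse_line are hypothetical, and the handling of section headers (bracketed names, trimmed inside the brackets) is assumed rather than taken from the text.

```cpp
#include <cstddef>
#include <string>

// Hypothetical types and names, for illustration only.
enum class LineKind { Blank, Comment, Section, KeyValue, Other };

struct ParsedLine {
  LineKind kind;
  std::string name;   // section name or key name
  std::string value;  // key value (KeyValue lines only)
};

// Strip leading and trailing whitespace (space, tab, CR, LF).
std::string trim(const std::string& s) {
  const char* ws = " \t\r\n";
  std::size_t b = s.find_first_not_of(ws);
  if (b == std::string::npos) return "";
  std::size_t e = s.find_last_not_of(ws);
  return s.substr(b, e - b + 1);
}

ParsedLine parse_line(const std::string& raw) {
  std::string line = trim(raw);
  if (line.empty()) return {LineKind::Blank, "", ""};
  // Only ';' starts a comment; a leading '#' is treated as ordinary text.
  if (line[0] == ';') return {LineKind::Comment, "", ""};
  if (line[0] == '[') {
    // Assumption: a section header is "[name]" with the name trimmed.
    std::size_t close = line.find(']');
    if (close != std::string::npos)
      return {LineKind::Section, trim(line.substr(1, close - 1)), ""};
    return {LineKind::Other, "", ""};
  }
  std::size_t eq = line.find('=');
  if (eq == std::string::npos) return {LineKind::Other, "", ""};
  // No trailing comments: everything after the first '=' is the value,
  // including any ';' it may contain.
  return {LineKind::KeyValue, trim(line.substr(0, eq)),
          trim(line.substr(eq + 1))};
}
```

With this sketch, a line such as "key1 = value ; still value" yields the key "key1" and the value "value ; still value", which is the "no trailing comments" behavior described above.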
The only changes compared to the Windows API are:
- Line length defaults to 1024 (the INI_BUFFER_SIZE value) while Windows limits it to 256 characters.
- Files without a path are in the current directory while Windows places them in the Windows folder.
Implementation
An INI file is implemented as an IniFile object. The basic member functions IniFile::GetString and IniFile::PutString allow you to read or write settings in the INI file, as in the code below:
utf8::IniFile test ("test.ini");
test.PutString ("key1", "value11", "section1");
string val = test.GetString ("key1", "section1");
The original Windows API handles only two data types for INI files: strings and integer numbers (the GetPrivateProfileString and GetPrivateProfileInt functions). I thought it was useful to extend these functions to additional data types and also add some utility functions. This is not an extension of the file format; it is just an extension of the API for accessing these files. Here are some of these functions:
- PutInt and GetInt for integer values
- PutDouble and GetDouble for floating point values
- PutBool and GetBool for boolean variables (when reading, the code understands things like "on" or "0" or "OFF")
- PutColor and GetColor for RGB color representations
- PutFont and GetFont to save and retrieve font settings
- HasKey and HasSection to check if a key or a section exists in the INI file
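To make the lenient boolean reading more concrete, here is a minimal sketch of how text such as "on", "0" or "OFF" could be mapped to a bool. It is not the GetBool implementation from the article: the function name parse_bool, the fallback parameter and the exact set of accepted spellings (on/off, true/false, yes/no, 1/0) are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper, not the library's GetBool. The accepted
// spellings below are an assumption made for this sketch.
bool parse_bool(std::string s, bool fallback) {
  auto not_space = [](unsigned char c) { return !std::isspace(c); };
  // Trim leading and trailing whitespace.
  s.erase(s.begin(), std::find_if(s.begin(), s.end(), not_space));
  s.erase(std::find_if(s.rbegin(), s.rend(), not_space).base(), s.end());
  // Compare case-insensitively by lowering the text first.
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  if (s == "on" || s == "true" || s == "yes" || s == "1") return true;
  if (s == "off" || s == "false" || s == "no" || s == "0") return false;
  return fallback;  // unrecognized text: keep the caller's default
}
```

Returning a caller-supplied default for unrecognized text mirrors the way the Windows profile functions take a default value, but whether the real GetBool behaves this way is not stated in the article.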
Looking at the code, there are a few points of interest.
There is no in-memory buffering for the INI files. Everything is written out to disk as quickly as possible. This was a design decision because:
- that's what Windows does and I wanted to be as compatible as possible, and
- it is quite annoying when parameters don't get saved if the application has crashed or otherwise unexpectedly ended.
The drawback is that INI files become less efficient, but they are not meant to be general data files.
Moreover, every time a key is written in an INI file, the whole file gets re-written; as I was saying, efficiency was not a design goal.
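The "rewrite the whole file on every write" approach can be sketched as follows. This is a simplified illustration, not the IniFile code: put_string is a hypothetical free function, and it assumes exact, untrimmed matching of "[section]" and "key=" at the start of a line, which is stricter than the rules listed earlier.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical sketch, not the IniFile implementation.
// Sets key=value inside [section] by reading every line, changing one,
// and writing the whole file back. Returns false if the write failed.
bool put_string(const std::string& path, const std::string& section,
                const std::string& key, const std::string& value) {
  std::vector<std::string> lines;
  {
    std::ifstream in(path);  // a missing file simply yields no lines
    for (std::string l; std::getline(in, l);) lines.push_back(l);
  }
  const std::string header = "[" + section + "]";
  const std::string prefix = key + "=";
  bool in_section = false, done = false;
  std::size_t insert_at = lines.size();
  for (std::size_t i = 0; i < lines.size() && !done; ++i) {
    const std::string& l = lines[i];
    if (!l.empty() && l[0] == '[') {
      if (in_section) {  // next section reached: key is absent
        insert_at = i;
        break;
      }
      in_section = (l == header);
    } else if (in_section && l.compare(0, prefix.size(), prefix) == 0) {
      lines[i] = prefix + value;  // replace the existing entry
      done = true;
    }
  }
  if (!done) {
    if (!in_section) {  // section not found: append it at the end
      lines.push_back(header);
      insert_at = lines.size();
    }
    lines.insert(lines.begin() + insert_at, prefix + value);
  }
  std::ofstream out(path, std::ios::trunc);  // rewrite the entire file
  for (const std::string& l : lines) out << l << '\n';
  return static_cast<bool>(out);
}
```

Every call pays for a full read and a full write, which is the inefficiency mentioned above; in exchange, the file on disk reflects each setting as soon as the call returns.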
Conclusion
The code shown in this article makes it easy to keep the application settings in an INI file using UTF-8 encoding.
This concludes the series about UTF-8 in Windows. The previous two articles in this series are:
For reference, the code included with this article also contains the code from the previous ones.
History
- 2nd April, 2020: Initial version