This article presents the structure used in the Kigs framework to efficiently convert strings (std::string, const char *) to unsigned int IDs with the intention of using them as keys, generally for unordered_map structures.
Table of Contents
The base class of the kigs framework (CoreModifiable
) provides access to some member variables, called attributes, using a string name:
instance->setValue("IntValue", 16);
int v = instance->getValue<int>("IntValue");
int v1;
if(!instance->getValue("IntValue",v1))
{
}
The instance factory also uses string names for classes:
myApplicationTimer = KigsCore::GetInstanceOf("ApplicationTimer", "Timer");
And we have a lot of other cases where strings are used to access values stored in maps (ordered or unordered).
But for performance reasons, it's not a good choice to use a std::string as a map key.
We looked for a way to convert string
(const char*
or std::string
) to unsigned int ID
in an efficient way. The string
s we use are generally one or two alphanumeric words, and the map contains at most a few dozen values. We wanted to avoid collisions (having the same ID for two different string
s) for this kind of usage.
Here is the KigsID struct
code:
struct KigsID
{
KigsID() = default;
KigsID(const KigsID& other) = default;
KigsID(KigsID&& other) = default;
KigsID& operator=(const KigsID& other) = default;
KigsID& operator=(KigsID&& other) = default;
bool operator==(const KigsID& other) const
{
return (_id == other._id);
}
bool operator<(const KigsID& other) const
{
return _id < other._id;
}
unsigned int toUInt() const { return _id; }
#ifdef KEEP_NAME_AS_STRING
template<size_t _Size> KigsID(const char(&aid)[_Size]) : _id_name(aid),
_id(CharToID::GetID(aid)) {};
KigsID(const kstl::string& aid) : _id_name(aid), _id(CharToID::GetID(aid)) {};
KigsID(const kstl::string_view& aid) : _id_name(aid), _id(CharToID::GetID(aid)) {};
KigsID(unsigned int aid) : _id_name("*unknown*"), _id(aid) {};
KigsID& operator=(const kstl::string& aid)
{ _id_name = aid; _id = CharToID::GetID(aid); return *this; };
template<size_t _Size> KigsID& operator=(const char(&aid)[_Size])
{ _id_name = aid; _id = CharToID::GetID(aid); return *this; };
KigsID& operator=(unsigned int aid)
{ _id_name = "*unknown*"; _id = aid; return *this; };
kstl::string _id_name;
#else
template<size_t _Size>
KigsID(const char(&aid)[_Size]) : _id(CharToID::GetID<_Size>(aid)) {};
KigsID(const kstl::string& aid) : _id(CharToID::GetID(aid)) {};
KigsID(const kstl::string_view& aid) : _id(CharToID::GetID(aid)) {};
KigsID(unsigned int aid) : _id(aid) {};
template<size_t _Size>
KigsID& operator=(const char(&aid)[_Size])
{ _id = CharToID::GetID<_Size>(aid); return *this; };
KigsID& operator=(const kstl::string& aid)
{ _id = CharToID::GetID(aid); return *this; };
KigsID& operator=(const kstl::string_view& aid)
{ _id = CharToID::GetID(aid); return *this; };
KigsID& operator=(unsigned int aid) { _id = aid; return *this; };
#endif
unsigned int _id;
};
The define KEEP_NAME_AS_STRING
is set when building in debug mode, so that it's easy when debugging to see the original string
used to initialize the ID.
Here is a sample code to illustrate this:
void Sample::KigsIDTest(const KigsID& id)
{
printf("KigsID unsigned int value = %08x\n", id._id);
printf("sizeof(KigsID) = %d\n", sizeof(KigsID));
}
The KigsIDTest
method is called like this:
KigsIDTest("HelloWorld");
And here is the id
variable inspection in debug mode in Visual Studio C++ debugger:
If KEEP_NAME_AS_STRING
is not defined, only unsigned int _id
member is available.
So the interesting part now is that in release mode, with optimizations activated, a call to KigsIDTest("HelloWorld");
outputs this:
So sizeof(KigsID)
is 4 bytes (32 bits unsigned int
).
Another interesting thing is that here, the "HelloWorld
" string
conversion to int is evaluated at compile time as we can see if we look at the disassembly code:
It is the case for const char*
strings up to 19 characters with Visual C++ compiler (x86 build).
For longer string
s or std::string
, the ID is evaluated at run time using a few logical operations (XOR
, ROL
) on 4 bytes packets.
To test the performances of the structure, we have set up a little benchmark code. For the use we make of it, the critical part is the access to the map values, so the benchmark focuses on that.
template<typename maptype, typename vectorkeys>
double TestKigsIDPerfs::testUMap()
{
double startTime = mApplicationTimer->GetTime();
maptype testMap;
std::vector<vectorkeys> testVector;
testVector.push_back("testString1");
testVector.push_back("testString2");
testVector.push_back("testString3");
testVector.push_back("testString4");
testVector.push_back("testString5");
testVector.push_back("ajzhgjhsqfd");
testVector.push_back("xcwvksqjdfh");
testVector.push_back("nzsbdocdsdf");
testVector.push_back("zieulkjvsqd");
testVector.push_back("poizerbhncs");
testVector.push_back("az");
testVector.push_back("aze");
testVector.push_back("azer");
testVector.push_back("azert");
testVector.push_back("azerty");
testVector.push_back("azertyu");
testVector.push_back("azertyui");
testVector.push_back("azertyuio");
testVector.push_back("azertyuiop");
testVector.push_back("azertyuiopq");
for (unsigned loop = 0; loop < testVector.size(); loop++)
{
testMap[testVector[loop]] = (loop + 1) * (loop + 1);
}
int vectorSize = testVector.size();
unsigned int voidComputation = 0;
for (unsigned int accessloop = 0; accessloop < (1000 * 1000 * 100); accessloop++)
{
voidComputation = voidComputation << 2;
voidComputation ^= testMap[testVector[accessloop % vectorSize]];
}
double endTime = mApplicationTimer->GetTime();
printf("voidComputation = %d\n", voidComputation);
return endTime - startTime;
}
We tested two kinds of unordered maps: std::unordered_map
and robin_hood::unordered_map
(https://github.com/martinus/robin-hood-hashing)
We also tested with the default hash method, and with the KigsIDHash
method, just returning the ID itself:
struct KigsIDHash
{
std::size_t operator()(const KigsID& k) const
{
return k._id;
}
};
Here are all the different benchmark calls:
double ti=testUMap<std::unordered_map<std::string,unsigned int>, std::string >();
printf("Unordered map with string map keys and string access: %f s\n", ti);
ti = testUMap<std::unordered_map<KigsID, unsigned int>, std::string >();
printf("Unordered map with KigsID map keys and string access: %f s\n", ti);
ti = testUMap<std::unordered_map<KigsID, unsigned int>, KigsID>();
printf("Unordered map with KigsID map keys and KigsID access: %f s\n", ti);
ti = testUMap<std::unordered_map<KigsID, unsigned int, KigsIDHash>, KigsID>();
printf("Unordered map with KigsID map keys, KigsID access and KigsIDHash: %f s\n", ti);
ti = testUMap<robin_hood::unordered_map<std::string,unsigned int>, std::string>();
printf("Robin hood unordered map with string map keys and string access: %f s\n", ti);
ti = testUMap<robin_hood::unordered_map<KigsID, unsigned int>, std::string>();
printf("Robin hood unordered map with KigsID map keys and string access: %f s\n", ti);
ti = testUMap<robin_hood::unordered_map<KigsID, unsigned int>, KigsID>();
printf("Robin hood unordered map with KigsID map keys and KigsID access: %f s\n", ti);
ti = testUMap<robin_hood::unordered_map<KigsID, unsigned int, KigsIDHash>, KigsID>();
printf("Robin hood unordered map with KigsID map keys,
KigsID access and KigsIDHash: %f s\n", ti);
To easily build and test the code (Git, Microsoft Visual Studio and CMake needed, see Kigs framework Getting Started or watch this video tutorial), clone https://github.com/Kigs-framework/kigs. Then download and extract TestKigsIDPerfs.zip in the kigs/projects directory.
Open kigs/projects/CMakeLists.txt with your favorite text editor, and add the following line:
add_subdirectory(TestKigsIDPerfs)
Save and close CMakeLists.txt.
The solution generation script : kigs/scripts/generateWinCMake.bat works with Visual Studio 2022 now. If you work with Visual Studio 2019, please edit generateWinCMake.bat and change "Visual Studio 17 2022" by "Visual Studio 16 2019" before launching the script.
Go to kigs/scripts directory and launch generateWinCMake.bat script (double click on it).
Then go to Build/solutionWinCMake/kigs/projects/TestKigsIDPerfs directory and double click TestKigsIDPerfs.sln to open the solution in Visual Studio.
In Visual Studio, select TestKigsIDPerfs
as your "startup project" and StaticRelease
as your build configuration. Generate and launch.
The benchmark was built with full optimizations and launched on:
- a Windows 10 PC (Core i7-4790K 4.00 Ghz), build with Visual Studio 2019 compiler.
- Firefox (HTML5) on the same Windows 10 PC, build with Emscripten under WSL.
- an Android Samsung Galaxy Tab S2 (armebiv7a), build with CLANG 5.0 compiler under Visual Studio 2019.
- Hololens 2 (Arm64), build under Visual Studio 2019.
All times are in seconds.
| Win10 | HTML5 | Android | Hololens 2 |
std::unordered_map, string key, string access, std::hash | 2.765395 | 100% | 5.633000 | 100% | 25.914861 | 100% | 8.705835 | 100% |
std::unordered_map, KigsID key, string access, std::hash | 1.785608 | 155% | 2.444000 | 230% | 19.587981 | 132% | 5.171553 | 168% |
std::unordered_map, KigsID key, KigsID access, std::hash | 0.722537 | 383% | 1.947000 | 289% | 17.237732 | 150% | 2.843732 | 306% |
std::unordered_map, KigsID key, KigsID access, KigsIDHash | 1.002372 | 276% | 1.916000 | 294% | 17.229091 | 150% | 2.866783 | 304% |
robin_hood::unordered_map, string key, string access, robin_hood::hash | 3.930436 | 70% | 4.103000 | 137% | 15.034525 | 172% | 7.228063 | 120% |
robin_hood::unordered_map, KigsID key, string access, robin_hood::hash | 2.069901 | 134% | 2.766000 | 204% | 10.752449 | 241% | 4.625862 | 188% |
robin_hood::unordered_map, KigsID key, KigsID access, robin_hood::hash | 1.138290 | 243% | 1.638000 | 344% | 8.765440 | 296% | 2.956348 | 294% |
robin_hood::unordered_map, KigsID key, KigsID access, KigsIDHash | 1.142882 | 242% | 1.647000 | 342% | 8.845147 | 293% | 2.946729 | 295% |
Using the KigsID
structure as the access key for unordered maps is always a good choice for our usage in the Kigs framework. On the contrary, using KigsIDHash
seems to be a bad idea.
Choosing between std::unordered_map and robin_hood::unordered_map depends a lot on the compiler and targeted platform.
If you want to have a closer look at the KigsID
code, you can find all the Kigs framework code here:
The KigsID
structure is defined here:
You can also start reading the introduction to the Kigs framework article series on CodeProject:
And follow us on Twitter or YouTube !
- 27th January, 2020: Initial version
- 19th February, 2020: Added Table of Contents, Benchmark and source code download
- 21th June, 2022: change links to the Kigs-framework GitHub repo, fixed the code and added link to video tutorial