Introduction
I think C++ is the most beautiful programming language but some functionality of other languages is useful. The list' extension of the DotNet GroupBy is one of them. This is my C++ simple implementation.
Requirements
Function
The idea is to write a function that executes an input function on every item of a generic input container range and extracts a key that will be used to split the input container in a key-sublist values.
With a little bit of metaprogramming:
template<class KeyFunc, typename It >
struct ___st_GroupBy___
{
typedef typename std::invoke_result<KeyFunc,It>::type Key_Type;
};
template<class KeyFunc, typename It >
std::unordered_map<typename ___st_GroupBy___<KeyFunc, It>::Key_Type, std::vector<It>> GroupBy(KeyFunc&& Func, It _begin, It _end )
{
std::unordered_map<typename ___st_GroupBy___<KeyFunc, It>::Key_Type, std::vector<It>> map;
for(It Curr = _begin; Curr != _end; Curr++ )
map[Func(Curr)].push_back(Curr);
return map;
}
The function uses an auxiliary internal struct to determine the Key type returned by the input function.
The returned type is a map has as key, the returned key value from input function call, and as value, a vector of iterator of the original input container. This solution is a little bit more complex as syntax to use but avoid copy of item inside sublist output. The disadvantage is that if I modify the original input container, the iterators inside the returned map can be invalid and you can fall in access violation exception.
Another version that is less optimized but safer can be the following:
template<class KeyFunc, typename It >
std::unordered_map<typename ___st_GroupBy___<KeyFunc, It>::Key_Type, std::vector<typename It::value_type>> GroupByCpy(KeyFunc&& Func, It _begin, It _end )
{
std::unordered_map<typename ___st_GroupBy___<KeyFunc, It>::Key_Type, std::vector<typename It::value_type>> map;
for(It Curr = _begin; Curr != _end; Curr++ )
map[Func(Curr)].push_back(*Curr);
return map;
}
This function creates a copy of the elements of the input container and it requires that the copy operator is been defined in the type of the items of the input container.
Usage
#include <string>
#include <vector>
using std::string;
struct TestData
{
string Key;
int Data1;
int Data2;
};
vector<TestData> InputData{{"pippo",1,2},{"pippo",2,3},{"pippo",3,4},{"pluto",1,0},{"paperino",2,3},{"paperino",3,4}};
auto result = GroupBy( [&](decltype(InputData.begin()) Item)->string
{ return Item->Key; }, InputData.begin(), InputData.end());
for (auto KeyValuesPair : result)
{
string str_Key = KeyValuesPair.first;
vector<vector<TestData>::iterator> &vec_Vector = KeyValuesPair.second;
std::cout << "Key : " << str_Key << std::endl;
for (const auto &it_Value : vec_Vector)
std::cout << "\tData1 : " << it_Value->Data1 << " - " << "\tData2 : " << it_Value->Data2 << std::endl;
}
std::unordered_map<string, vector<TestData>> result2 = GroupByCpy( [&](decltype(InputData.begin()) Item)->string{ return Item->Key; }, InputData.begin(), InputData.end());
for (auto KeyValuesPair : result2)
{
string str_Key = KeyValuesPair.first;
vector<TestData> &vec_Vector = KeyValuesPair.second;
std::cout << "Key : " << str_Key << std::endl;
for (const auto &obj_Value : vec_Vector)
std::cout << "\tData1 : " << obj_Value.Data1 << " - " << "\tData2 : " << obj_Value.Data2 << std::endl;
}
Another variation can be using std::map
instead of unordered_map
in order to have sorting by key.
History
- 17th March, 2023: Initial version