C++: Simple Permutation and Combination Parallelism

Shao Voon Wong

5.00/5 (6 votes)

6 Jun 2018CPOL4 min read

13.5K

104

Simple Permutation and Combination Parallelism Examples

Download concurrent_permcomb-master.zip - 28.2 KB

Primary Motivation
Single Threaded Permutation
Multi-Threaded Permutation
Single Threaded Combination
Multi-Threaded Combination

Primary Motivation

My Github repository, Boost Concurrent Permutations and Combinations on CPU, has been up since January 2017. However, download is close to none while my older single threaded next_combination has plenty of downloads. So I figured it must be the code usage is deemed too complex. Aim of this article is to show a very simple, albeit inefficient, multithreaded example to give users a guide. It should be relatively easy for user to come up with an efficient multithreaded code, once they understand the examples in this article. For each permutation and combination section, I am going to start off with a single threaded example then followed by a multi-threaded example.

Single Threaded Permutation

First, I'll show you the example on using next_permutation in single threaded scenario. The while loop displays std_permuted until next_permutation returned false when std_permuted is detected to be in descending order. That is "54321".

C++

void usage_of_next_perm()
{
    std::string std_permuted = "12345";
    do
    {
        std::cout << std_permuted << std::endl;
    } while (std::next_permutation(std_permuted.begin(), std_permuted.end()));
}

Output is as follows:

Multi-Threaded Permutation

Next, I'll show you how to find permutation in multithreaded scenario. There are a total of 120 different permutations in collection of 5 elements. We'll use 4 threads in my example, meaning each thread will find 120/4=30 permutations. To tell you the truth, each thread will find 30-1=29 permutations, because the 1^st one is already computed by find_perm_by_idx in the main thread and passed to the thread. To keep the discussion simple, we pretend each thread computes 30 permutations. 1^st thread finds 0^th to 29^th permutations, while 2^nd thread starts from 30^th to 59^th permutations, and 3^rd thread begins from 60^th to 89 and so on. From 2^nd thread and 3^rd thread to start from 30^th and 60^th permutation respectively, we need to make use of find_perm_by_idx to find 30^th and 60^th permutation given their index, we'll find the rest with next_permutation. Technically, it is possible to use find_perm_by_idx to find all permutations by giving indices from 0 to 119 but it is simply not fleasible to do so, because find_perm_by_idx is much much slower than next_permutation. As a word of caution, index_to_find must be of type that is large enough to store the factorial(n). For instance, you are just finding the 1st 10 permutations, index_to_find type can store 0 to 10 but if it cannot store the factorial(n), find_perm_by_idx will not generate the permutation correctly. This applies to find_comb_by_idx as well. If your largest integer type is not large enough, you can consider using Boost Multiprecision Integer. And vector_type can be any collection type that supports push_back() and size() methods, like std::string. You can see I use std::string in my example.

I join my theads before starting the next thread because all_results is shared with all threads and it is not guarded with a lock, therefore not thread-safe. It is only meant to be a simple example on how to find_perm_by_idx. You can see I call the slow find_perm_by_idx in the main thread. It is your job to try to figure out how to call find_perm_by_idx in the worker threads.

C++

template<typename int_type, typename vector_type>
vector_type find_perm_by_idx(int_type index_to_find,
	vector_type& original_vector);

void usage_of_perm_by_idx()
{
    uint64_t index_to_find = 0;
    std::string original_text = "12345";
    std::string std_permuted = "12345";
    std::vector<std::string> all_results;
    size_t repeat_times = 30;
    for (size_t i=0; i<4; ++i)
    {
        std::string permuted = concurrent_perm::find_perm_by_idx(index_to_find, original_text);

        std::thread th([permuted, &all_results, repeat_times] () mutable {

            // do your work on the permuted result, instead of pushing to vector
            all_results.push_back(permuted); 
            for (size_t j=1; j<repeat_times; ++j)
            {
                std::next_permutation(permuted.begin(), permuted.end());
                // do your work on the permuted result, instead of pushing to vector
                all_results.push_back(permuted); 
            }
        });
        th.join();

        index_to_find += repeat_times;
    }
}

I skip the output here, I have verified they are the same as generated by single threaded next_permutation.

Single Threaded Combination

Next, we have come to next_combination section. To compute the total combination count, we use compute_total_comb. total must be of type that is large enough to store the factorial(n).

C++

void usage_of_next_comb()
{
    std::string original_text = "123456";
    std::string std_combined = "123";
    uint64_t total = 0;
    if(concurrent_comb::compute_total_comb(original_text.size(), std_combined.size(), total))
    {
        std::cout << std_combined << std::endl;
        for (uint64_t i = 1; i < total; ++i)
        {
            stdcomb::next_combination(original_text.begin(), original_text.end(), 
                                      std_combined.begin(), std_combined.end());
            std::cout << std_combined << std::endl;
        }
    }
}

Output is as follows. If your collection is sorted like original_text, all generated combination will be sorted as well, as you can see below:

Multi-Threaded Combination

index_to_find must be of type that is large enough to store the factorial(n) and vector_type can be any collection type that supports push_back() and size() methods, like std::string. Similar to find_perm_by_idx mentioned above, it is possible to find all combinations with find_comb_by_idx but it is much slower compared to next_combination, so we resort to next_combination to find the rest of combinations. There are total of 20 combinations, so each of the 4 thread will find 20/4=5 combinations. Actually, each thread will find 5-1=4 combination, because the 1^st one is already computed by find_comb_by_idx in the main thread and passed to the thread.

C++

template<typename int_type, typename vector_type>
vector_type find_comb_by_idx(const uint32_t subset,
	int_type index_to_find,
	vector_type& original_vector);

void usage_of_comb_by_idx()
{
    uint64_t index_to_find = 0;
    std::string original_text = "123456";
    std::string std_combined = "123";
    std::vector<std::string> all_results;
    size_t repeat_times = 5;
    for (size_t i = 0; i < 4; ++i)
    {
        std::string combined = concurrent_comb::find_comb_by_idx(std_combined.size(), 
                                           index_to_find, original_text);

        std::thread th([original_text, combined, &all_results, repeat_times] () mutable {

            // do your work on the combined result, instead of pushing to vector
            all_results.push_back(combined);
            for (size_t j = 1; j < repeat_times; ++j)
            {
                stdcomb::next_combination(original_text.begin(), original_text.end(), 
                                             combined.begin(), combined.end());
                // do your work on the combined result, instead of pushing to vector
                all_results.push_back(combined);
            }
        });
        th.join();

        index_to_find += repeat_times;
    }
}

Output is skipped here, afterall I have verified they are the same as generated by single threaded next_combination.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)