Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / Languages / Java

Generating Subsets

2.00/5 (1 vote)
23 Sep 2018CPOL2 min read 6K  
How to generate subsets

Every once in a while, I am being challenged with problems involving optimal subsets from a given set of objects. It happened again a few weeks ago. The problem was of a knapsack style (finding the most profitable subset of items of bounded total size) and ... of course, I forgot how to tackle it using dynamic programming. But, many years ago, I learned a very simple (to memorise) algorithm which saved the day and I will describe it in this article.

The Algorithm Design Manual, page 453, defines it as binary counting. The idea behind it is the one to one mapping between the numbers 0 and 2n-1 (total number of numbers is 2n) and binary strings of length n (total number of strings is 2n) where bit 1 at the index i means "consider the object at the index i in the original set for the given subset". We also know that the total number of subsets of a set of size n is 2n. We have 2n everywhere, awesome!

Let's see it in action by looking at a simple example, a small set {A, B, C} with n=3. We are not interested in the empty set, thus we will ignore the 0-th iteration, leaving us with a total of 23-1=7 subsets:

Iteration Binary string Indexes of the objects to consider Subset
1 001 1 {A}
2 010 2 {B}
3 011 1 and 2 {A, B}
4 100 3 {C}
5 101 1 and 3 {A, C}
6 110 2 and 3 {B, C}
7 111 1, 2 and 3 {A, B, C}

I need to stress the fact that it's an exponential (or O(2n)) complexity algorithm. It is very inefficient for big sets. Here is a code example:

Java
import java.util.LinkedHashSet;
import java.util.Set;

public class AllSubSets {
 public static void main(String args[]) {
  String[] setValues = new String[] { "A", "B", "C", "D", "E",
    "F", "G", "H", "I", "J" };

  long limit = 1 << setValues.length; // this is 2^length

  for (long l = 1; l < limit; l++) {
   Set<string> subSet = new LinkedHashSet<>();
   for (int i = 0; i < setValues.length; i++) {
    if ((l & (1 << i)) > 0) {
     subSet.add(setValues[i]);
    }
   }
   subSet.forEach(System.out::print);
   System.out.println();
  }
 }
}

and the result:

A
B
AB
C
AC
BC
ABC
D
... (trimmed due to the size) ...
CDEFGHIJ
ACDEFGHIJ
BCDEFGHIJ
ABCDEFGHIJ

A careful reader will almost immediately reply that this version is limited to sets with maximum 64 objects (with Java 8, long is limited to 64 bits and can be treated as unsigned). BigInteger, which supports bitwise operations, addresses this limitation. And yes, the extra subset construction loop makes it an O(n2n) algorithm, it's still exponential though.

However, as it was mentioned earlier, this is not an efficient algorithm. Anything bigger than 64 objects will produce more than 264 subsets and this is a big number. Even 232 is pretty decent in size, so don't use this algorithm for big sets.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)