Question

Suppose I am given a sorted list of elements and I want to generate all subsets satisfying some condition, so that if a given set does not satisfy the condition, then a larger subset will also not satisfy it, and all sets of one element do satisfy it.

For example, given a list of all positive integers smaller than 100, determine subsets whose sum is smaller than 130: (100,29) (0,1,40), (0), etc...

How can I do that (preferably in Python)?

Thanks! :)

Was it helpful?

Solution

You can generate all the subsets using a Branch-and-bound technique: you can generate all the subsets in an incremental fashion (generating superset of subsets already determined), using as a prune condition "does not explore this branch of the tree if the root does not satify the constraint".

If you want to be generic regarding the constraint, I think this is the best strategy.

Be sure to write in a correct manner the code that generates the subsets, otherwise you generate many time the same subsets: in order to avoid memoization, which can be time-consuming due to map lookups and introduce memory overhead, you can generate the subsets in that manner:

GetAllSubsets(List objects) {
    List generated = {};
    GetAllSubsets(generated, [], objects);
    return generated;
}

GetAllSubsets(List subsetGenerated, List objectFixed, List objectsToFix) {
    GetAllSubsets(subsetGenerated, objectFixed, objectsToFix.sublist(1, objectsToFix.length());
    if (satisfy(toCheck = objectsFixed.add(objectsToFix.get(0)))) {
        subsetGenerated.add(toCheck);
        GetAllSubsets(subsetGenerated, toCheck, objectsToFix.sublist(1, objectsToFix.length());
    }
}

In fact, the subsets added by the first invocation of GetAllSubsets doesn't have the first element of objectsToFix, where the subsets added by the second call (if the pruning condition isn't violated) have that element, so the intersection of the two sets of subsets generated is empty.

OTHER TIPS

You could construct your sets recursively, starting with the empty set and attempting to add more elements, giving up on a recursive line of execution if one of the subsets (and thus all of its supersets) fails to meet the condition. Here's some pseudocode, assuming a set S whose condition-satisfying subsets you would like to list. For convenience, assume that the elements of S can be indexed as x(0), x(1), x(2), ...

EnumerateQualifyingSets(Set T)
{
    foreach (x in S with an index larger than the index of any element in T)
    {
            U = T union {x}

            if (U satisfies condition)
            {
                print U
                EnumerateQualifyingSets(U)
            }
    }
}

The first call would be with T as the empty set. Then, all the subsets of S matching the condition would be printed. This strategy relies crucially on the fact that a subset of S which does not meet the condition cannot be contained in one that does.

There are certainly ways to do it, but unless you can constrain the condition somehow, it's going to take O(2^n) steps. If you consider, for example a condition on 1–100 where all the subsets would be selected (eg, < Σ i for i in 1-n), then you would end up enumerating all the subsets.

You'd be looking at

for i in the powerset of {1-n}
    if cond(i)
       note that set

You can get the powerset of the set by simply generating all binary numbers from 0 to sn-1, and choosing element i for the subset when bit i is 1.

I've done something similar for a class schedule generating algorithm. Our schedule class had 2 elements - a list of courses added to the schedule, and a list of courses available to add.

Pseudocode:

queue.add(new schedule(null, available_courses))
while( queue is not empty )
    sched = queue.next()
    foreach class in sched.available_courses
        temp_sched = sched.copy()
        temp_sched.add(class)
        if(temp_sched.is_valid())
            results.add(temp_sched)
            queue.add(temp_sched)

The idea is to start with an empty schedule and the list of available classes, and to search down the tree for valid schedules (valid meaning fits the requirements given by the user, doesn't have time conflicts, etc). If a schedule is invalid, it is thrown away - we can't make an invalid schedule valid by adding classes (ie, pruning the tree).

It should be pretty easy to modify this to work with your problem.

Here's a concrete example of akappa's answer, using a recursive function to generate the subsets:

def restofsubsets(goodsubset, remainingels, condition):
    answers = []
    for j in range(len(remainingels)):
        nextsubset = goodsubset + remainingels[j:j+1]
        if condition(nextsubset):
            answers.append(nextsubset)
            answers += restofsubsets(nextsubset, remainingels[j+1:], condition)
    return answers

 #runs slowly
 easieranswer = restofsubsets([], range(101), lambda l:sum(l)<40)

 #runs much faster due to eliminating big numbers first
 fasteranswer = restofsubsets([], range(100,-1,-1), lambda l:sum(l)<40)

 #runs extremely slow even with big-numbers-first strategy
 finalanswer = restofsubsets([], range(100,-1,-1), lambda l:sum(l)<130)

I think in the worst case, you still have to generate all subsets and calculate the sum of each set to determine if it is qualify or not. Asymptotically, it is the cost of subsets generating procedure.

The following is the method I implemented in javascript for the same idea.

//this is to generate an array to test
var numbers = (function(start, end){
    var result = [],
        i =  start; 
    for(; i <= end; i++){
        result.push(i);
    }
    return result; 
})(1, 12);

//this is the qualifying function to determine if the generated array is qualified
var fn = (function(maxSum){
    return function(set){
        var sum = 0;
        for(var i = 0 ; i< set.length; i++){
            sum += set[i];
            if( sum > maxSum ){
                return false;
            }
        }
        return true;
    }
})(30);

//main function
(function(input, qualifyingFn){
    var result, mask, total = Math.pow(2, input.length);
    for(mask = 0; mask < total; mask++){

        result = [];
        sum = 0;

        i = input.length - 1; 
        do{
            if( (mask & (1 << i)) !== 0){
                result.push(input[i]);
                sum += input[i];
                if( sum > 30 ){
                    break;
                }
            }
        }while(i--);
        if( qualifyingFn(result) ){
            console.log(JSON.stringify(result));
        }
    }

})(numbers, fn);
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top