How do I solve a set of constraints in Perl?

https://stackoverflow.com/questions/575515

05-09-2019
|

Question

I have the following set of constraints in Perl (just a sample set of constraints, not the ones I really need):

$a < $b
$b > $c
$a is odd => $a in [10..18]
$a > 0
$c < 30

And I need to find a list ($a, $b, $c) that meet the constraints. My naive solution is

sub check_constraint {
    my ($a, $b, $c) = @_;
    if !($a < $b) {return 0;}
    if !($b > $c) {return 0;}
    if (($a % 2) && !(10 <= $a && $a <= 18)) {return 0;}
    if !($a > 0) {return 0;}
    if !($c < 30) {return 0;}
    return 1;
}

sub gen_abc {
    my $c = int rand 30;
    my $b = int rand $c;
    my $a = int rand $b;
    return ($a, $b, $c);
}

($a, $b, $c) = &gen_abc();
while (!&check_constraint($a, $b, $c)) {
    ($a, $b, $c) = &gen_abc();
}

Now, this solution isn't guaranteed to end, and is pretty inefficient in general. Is there a better way to do this in Perl?

Edit: I need this for a random test generator, so the solution needs to use random functions such as rand(). A solution that's completely deterministic isn't enough, although if that solution can give me a list of possible combinations I can select an index at random:

@solutions = &find_allowed_combinations(); # solutions is an array of array references
$index = int rand($#solutions);
($a, $b, $c) = @$solution[$index];

Edit 2: The constraints here are simple to solve with brute force. However, if there are many variables with a large range of possible values, brute force isn't an option.

Solution

The main challenge in this optimization problem is mathematical in nature.

Your goal, as I can infer from your definition of the gen_abc method, is to prune your search space by finding bounding intervals for your various variables ($a, $b etc.)

The best strategy is to extract as many linear constraints from your full set of constraints, attempt to infer the bounds (using linear programming techniques, see below), then proceed with exhaustive (or non-deterministic) trial-and-error tests against a pruned variable space.

A typical linear programming problem is of the form:

minimize (maximize) <something>
subject to <constraints>

For example, given three variables, a, b and c, and the following linear constraints:

<<linear_constraints>>::
  $a < $b
  $b > $c
  $a > 0
  $c < 30

You can find upper and lower bounds for $a, $b and $c as follows:

lower_bound_$a = minimize $a subject to <<linear_constraints>>
upper_bound_$a = maximize $a subject to <<linear_constraints>>
lower_bound_$b = minimize $b subject to <<linear_constraints>>
upper_bound_$b = maximize $b subject to <<linear_constraints>>
lower_bound_$c = minimize $c subject to <<linear_constraints>>
upper_bound_$c = maximize $c subject to <<linear_constraints>>

In Perl you may employ Math::LP to this purpose.

EXAMPLE

A linear constraint is of the form "C eqop C1×$V1 ± C2×$V2 ± C3×$V3 ...", where

eqop is one of <, >, ==, >=, <=
$V1, $V2 etc. are variables, and
C, C1, C2 etc. are constants, possibly equal to 0.

For example, given...

$a < $b
$b > $c
$a > 0
$c < 30

...move all variables (with their coefficients) to the left of the inequality, and the lone constants to the right of the inequality:

$a - $b       <  0
     $b - $c  >  0
$a            >  0
          $c  < 30

...and adjust the constraints so that only =, <= and >= (in)equalities are used (assuming discrete i.e. integer values for our variables):

'... < C' becomes '... <= C-1'
'... > C' becomes '... >= C+1'

...that is,

$a - $b       <= -1
     $b - $c  >=  1
$a            >=  1
          $c  <= 29

...then write something like this:

use Math::LP qw(:types);             # imports optimization types
use Math::LP::Constraint qw(:types); # imports constraint types

my $lp = new Math::LP;

my $a  = new Math::LP::Variable(name => 'a');
my $b  = new Math::LP::Variable(name => 'b');
my $c  = new Math::LP::Variable(name => 'c');

my $constr1 = new Math::LP::Constraint(
    lhs  => make Math::LP::LinearCombination($a, 1, $b, -1), # 1*$a -1*$b
    rhs  => -1,
    type => $LE,
);
$lp->add_constraint($constr1);
my $constr2 = new Math::LP::Constraint(
    lhs  => make Math::LP::LinearCombination($b, 1, $c, -1), # 1*$b -1*$c
    rhs  => 1,
    type => $GE,
);
$lp->add_constraint($constr2);

...

my $obj_fn_a = make Math::LP::LinearCombination($a,1);
my $min_a = $lp->minimize_for($obj_fn_a);
my $max_a = $lp->maximize_for($obj_fn_a);

my $obj_fn_b = make Math::LP::LinearCombination($b,1);
my $min_b = $lp->minimize_for($obj_fn_b);
my $max_b = $lp->maximize_for($obj_fn_b);

...

# do exhaustive search over ranges for $a, $b, $c

Of course, the above can be generalized to any number of variables V1, V2, ... (e.g. $a, $b, $c, $d, ...), with any coefficients C1, C2, ... (e.g. -1, 1, 0, 123, etc.) and any constant values C (e.g. -1, 1, 30, 29, etc.) provided you can parse the constraint expressions into a corresponding matrix representation such as:

   V1  V2  V3     C
[ C11 C12 C13 <=> C1 ]
[ C21 C22 C23 <=> C2 ]
[ C31 C32 C33 <=> C3 ]
... ... ... ... ... ...

Applying to the example you have provided,

  $a  $b  $c     C
[  1  -1   0 <= -1 ]   <= plug this into a Constraint + LinearCombination
[  0   1  -1 >=  1 ]   <= plug this into a Constraint + LinearCombination
[  1   0   0 >=  1 ]   <= plug this into a Constraint + LinearCombination
[  0   0   1 <= 29 ]   <= plug this into a Constraint + LinearCombination

NOTE

As a side note, if performing non-deterministic (rand-based) tests, it may or may not be a good idea to keep track (e.g. in a hash) of which ($a,$b,$c) tuples have already been tested, as to avoid testing them again, if and only if:

the method being tested is more expensive than a hash lookup (this is not the case with the sample code you provided above, but may or may not be an issue with your real code)
the hash will not grow to enormous proportions (either all variables are bound by finite intervals, whose product is a reasonable number - in which case checking the hash size can indicate whether you have completely explored the entire space or not -, or you can clear the hash periodically so at least for one time interval at a time you do have some collision detection.)
- ultimately, if you think that the above could apply to you, you can time various implementation options (with and without hash) and see whether it is worth implementing or not.

OTHER TIPS

I use Data::Constraint. You write little subroutines that implement the individual constraints then serially apply all of the constraints that you want. I talk about this a little in Mastering Perl in the "Dynamic Subroutines" chapter.

use Data::Constraint;

Data::Constraint->add_constraint(
    'a_less_than_b',
    run         => sub { $_[1] <  $_[2] },
    description => "a < b",
    );

Data::Constraint->add_constraint(
    'b_greater_than_c',
    run         => sub { $_[2] >  $_[3] },
    description => "b > c",
    );

Data::Constraint->add_constraint(
    'a_greater_than_0',
    run         => sub { $_[1] > 0 },
    description => "a > 0",
    );

Data::Constraint->add_constraint(
    'c_less_than_30',
    run         => sub { $_[3] < 30 },
    description => "c < 30",
    );

Data::Constraint->add_constraint(
    'a_is_odd_between_10_18',
    run         => sub { 
        return 1 if( $_[1] < 10 or $_[1] > 18);
        return 0 unless $_[1] % 2,
        },
    description => "a is odd between 10 and 18",
    );

for ( 1 .. 10 )
    {
    my( $a, $b, $c ) = gen_abc(); 
    print "a = $a | b = $b | c = $c\n";

    foreach my $name ( Data::Constraint->get_all_names )
        {
        print "\tFailed $name\n"
            unless Data::Constraint->get_by_name( $name )->check( $a, $b, $c ),
        }
    }

sub gen_abc {
    my $c = int rand 30;
    my $b = int rand $c;
    my $a = int rand $b;
    return ($a, $b, $c);
}

Doing it this way means it's easy to inspect the result to see what failed instead of an overall failure:

a = 2 | b = 4 | c = 5
    Failed a_less_than_b
    Failed b_greater_than_c
a = 0 | b = 0 | c = 2
    Failed a_greater_than_0
    Failed a_less_than_b
    Failed b_greater_than_c
a = 0 | b = 0 | c = 2
    Failed a_greater_than_0
    Failed a_less_than_b
    Failed b_greater_than_c
a = 7 | b = 14 | c = 25
    Failed a_less_than_b
    Failed b_greater_than_c
a = 0 | b = 0 | c = 29
    Failed a_greater_than_0
    Failed a_less_than_b
    Failed b_greater_than_c
a = 0 | b = 0 | c = 20
    Failed a_greater_than_0
    Failed a_less_than_b
    Failed b_greater_than_c
a = 0 | b = 4 | c = 22
    Failed a_greater_than_0
    Failed a_less_than_b
    Failed b_greater_than_c
a = 4 | b = 16 | c = 28
    Failed a_less_than_b
    Failed b_greater_than_c
a = 0 | b = 22 | c = 26
    Failed a_greater_than_0
    Failed a_less_than_b
    Failed b_greater_than_c
a = 0 | b = 3 | c = 6
    Failed a_greater_than_0
    Failed a_less_than_b
    Failed b_greater_than_c

If you want something more hardcore, my Brick module handles trees of constraints, including pruning and branching. These things make sense for bigger systems where you will mix and match the various constraints for different situations since most of the code is setting up the constraint objects. If you only have your one situation, you probably just want to stick with what you have.

Good luck, :)

A "real" answer would require parsing the expressions and reasoning about relationships. Short of that, I'd suggest using a systematic traversal of the value space, rather than just trying values at random. For instance,

my $count = 0;
for (my $c = 0; $c < 30 && $count < $SOMELIMIT; ++$c) {
    # check all other constraints on only $c here
    # next if any fail
    for (my $b = $c + 1; $b < $UPPERLIMIT && $count < $SOMELIMIT; ++$b) {
        # check all other constraints on only $b and $c here
        # next if any fail
        for (my $a = 1; $a < $b && $count < $SOMELIMIT; ++$a) {
            #check all remaining constraints on $a, $b, and $c here
            # next if any fail
            # now use surviving combinations
            ++$count;
        }
    }
}

I'd put the variable with the most individual constraints at the outermost level, next most constrained next, etc.

At least with this structure you won't test the same combination multiple times (as the random version is very likely to do) and if you watch it run, you may see patterns emerge that lets you cut the execution short.

I am not sure you're going to find a simple answer to this (although I'd like to be proven wrong!).

It seems that your problem would be well suited for a genetic algorithm. The fitness function should be easy to write,just score 1 for each satisfied constraint, 0 otherwise. AI::Genetic seem to be a module that could help you, both to write the code and to understand what you need to write.

This should be faster than a brute force method.

it seems that Simo::Constrain is what you want

I would instead create an algorithm that generates a bunch of valid lists, randomly generated or not (it should be trivial), write them to a file, and then use that file to feed the test program, so he can randomly pick any list it wants.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow