Question

I found a solution, see answer below and/or Github Gist, which has newer optimizations.

I've got an array of credit card charges and a batch total... sometimes the SUM() of the amounts = the batch amount, so that's easy to group to the batch and we can check off transactions contributing to that day's deposit to the checking account.

Sometimes, the proper sum is for a subset of these charges, where 1 or more are batched the next day. Would you help me programmatically solve this? It's for my authorize.net batch accounting, which sucks, so I'm making a tool for my bookkeeper.

+------------+--------+
| transId    | amount |
+------------+--------+
| 2863996511 | 154.00 |
| 2863992361 | 762.20 |
| 2863898975 |  49.00 |
| 2863894564 |   5.44 |
| 2863879671 |  10.88 |
| 2863858891 | 209.00 |
| 2863856334 | 148.00 |
| 2863367273 | -25.01 |
+------------+--------+

And the batch total for the day is $1302.63. As often happens, a charge didn't end up in that batch, so the batch is some subset of the sum of charges in the array. In this case, the $10.88 charge was in the next day's batch. This little bit of pseudocode can catch that via two nested for loops:

        for( $skipdx=0; $skipdx<$size; $skipdx++){
            $total=0;
            for( $idx=0; $idx<$size; $idx++){
                if($idx!=$skipdx){
                    $total+=$charges[$idx]['amount'];
                    $thisBatch[]=$charges[$idx]['transId'];
                }
                if( abs($total-$batch['total']) < .04 ) {
                    echo "sum of charges matches $date batch total: $$total  (line: ". __LINE__ .")\n";
                    $foundIt=TRUE;
                    break;
                }
            }
            if($foundIt==TRUE)
                break;
        }

How can I dynamically choose to search the charges where TWO are not added in? Then THREE? I can see that if $skipdx is one charge omitted, then two charges skipped would add a skip2dx nested loop. And if still not found, skip3dx would be 3rd level of nesting.

I'm usually really good at algorithms until recursion and then I go stupid.

Was it helpful?

Solution

It came to me! The "present layer" of the recursion does this:

  1. Sum the charge amounts.
  2. Iterate through the array of charges, subtracting out 1 charge at a time and test against the target batch amount.
  3. If the amount matches in this iteration, unset the id of this charge and return the array of results.
  4. If not found, iterate again through the array of charges this time removing the current charge. call the function again with the same target amount and the new shortened array to the same function again.

Working code (Github has newer optimizations, too):

$charges=Array(
     '2863996511' => 154.00 ,'2863879671' => 10.88
    ,'2863992361' => 762.20 ,'2863858891' => 209.00
    ,'2863898975' => 49.00  ,'2863856334' => 148.00
    ,'2863894564' => 5.44   ,'2863367273' => -25.01
); print_r($charges);
$targets=Array(1302.63, 1327.64, 1322.20 );
foreach( $targets as $batch ) {
    printf( "We seek combination of transIds to = $%-7.2f\n", $batch);
    $answer=reco( $batch, $charges );
    if( is_array($answer) )
        echo eval( "return ".implode("+",$answer).";" )."\nWIN!\n";
    else
        echo "Bust!\n";
}   

function reco( $target, $arr ){
    if( count($arr) < 2 )
        return FALSE;
    $killid='';
    $sum = eval( "return ".implode("+",$arr).";" );
    foreach( $arr as $id=>$amt ){
        if( abs($sum-$target-$amt) < .01 ) {
            $killid=$id;
            break;
        }
    }
    if( strlen($killid) > 1 ) {
        echo ".  omit $killid : ".$arr[$killid]."\n";
        unset($arr[$killid]); 
        return $arr;
    }
    foreach( $arr as $id=>$amt ){
        $tmp=$arr;
        unset($tmp[$id]);
        $rtn = reco( $target, $tmp );
        if( is_array($rtn) ) {
            echo ".. omit $id : ".$arr[$id]."\n";
            return $rtn;
        }
    }
    return FALSE;
}

And the output:

We seek combination of transIds to = $1302.63
. omit 2863879671 : 10.88
1302.63
WIN!
We seek combination of transIds to = $1327.64
. omit 2863367273 : -25.01
.. omit 2863879671 : 10.88
1327.64
WIN!
We seek combination of transIds to = $1322.20
. omit 2863367273 : -25.01
.. omit 2863879671 : 10.88
.. omit 2863894564 : 5.44
1322.2
WIN!

Performance isn't a great concern; to the third level of recursion processed in about a second.

OTHER TIPS

When you start looking at ALL possible subsets of your set of charges you may notice that sometimes going for the first subset that matches your batch sum is not a good accounting solution. There may be several possible subsets of charges that add up to your batch total.

Which one would you choose?

Actually this leads to a well known problem called Subset Sum Problem - see wiki here.

There is also a cheap Excel add-in called SumMatch that will find all combinations of numbers matching a given sum. Consider if it is worth of you programming time to work on a correct algorithm.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top