문제

I have a heuristic of the following format:

if a == 1 and b == 1 and c == 1:
    do something
if a == 1 and b == 1 and c == 2:
    do something
if a == 3 and b == 2 and c == 1:
    do something
if a == 2 and b == 2 and c == 1:
    do something
if a == 3 and b == 1 and c == 3:
    do something
etc.

Of course, this makes many unnecessary comparisons which could be avoided if the statements were nested like this:

if a == 1:
    if b == 1:
        if c == 1:
            do something
        if c == 2:
            do something
etc.

Assuming the set of tuples (a, b, c) for which I have a case is finite, and each one has the same likelihood of being received by the algorithm, how can I generate a decision tree which is optimal, i.e., that it makes the least number of comparisons for the general case, or that minimizes the total number of comparisons when all the inputs are run through it?

I imagine something like this:

In: optimal_tree([(1, 1, 1), (1, 1, 2)])
Out: "if a == 1:
          if b == 1:
              if c == 1:
                  do something
              if c == 2:
                  do something"

In: optimal_tree([(1, 1), (2, 1), (2, 2)])
Out: "if a == 2:
          if b == 1:
              do something
          if b == 2:
              do something
      if a == 1:
          do something"

     OR

     "if b == 1:
          if a == 1:
              do something
          if a == 2:
              do something
      if b == 2:
          do something"
도움이 되었습니까?

해결책

Rule engines and database queries regularly deal with this problem too. You might want to look into the implementation behind those.

They have several ways to deal with this (although none is perfect):

  • Do the comparisons in the order defined in the Rule/Query.
  • Reorder the comparison based on statistical information. Databases will for example to first handle the indexed columns with the most value variance.

If you want to make your algorithm faster, you might want to look into hashing & indexing if you're not doing that already. That brings in a much bigger scale advantage.

라이센스 : CC-BY-SA ~와 함께 속성
제휴하지 않습니다 StackOverflow
scroll top