Question

I used IDA$^*$ to solve the 8-puzzle optimally and my friend used A$^*$ (with the same Manhattan distance heuristic).

I measured the average running time and the average number of nodes visited over 20 examples for both algorithms. My algorithm's average time was much lower than my friend's, but my average number of nodes visited was much higher.

I know IDA$^*$ visits each node more than once but why is it faster than A$^*$?


Solution

Since you already implemented IDA$^*$ you certainly understand why it expands more nodes than A$^*$: in each iteration it starts a fresh depth-first traversal from the start state. Note first that the overall number of nodes visited by IDA$^*$, while necessarily larger than for A$^*$, is not that much larger. The reason is that the number of nodes at each depth grows as a geometric series with ratio $b$, the branching factor. As a result, the number of nodes at depth $d$, namely $b^d$, exceeds the sum of all nodes expanded in the previous iterations, i.e., $b^d>\sum_{i=0}^{d-1}b^i$ for $b\geq 2$. From this it turns out that IDA$^*$ requires only a factor of $\frac{b}{b-1}$ additional expansions, which is asymptotically optimal since this factor tends to 1 for large $b$. In short, for any algorithm, visiting all nodes at depth $d$ is much more expensive than visiting all nodes at the preceding depths combined.
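You can check this arithmetic directly. The snippet below uses $b=4$ purely as an illustrative branching factor (the 8-puzzle's effective branching factor is smaller, but the argument is the same): the nodes re-expanded by all earlier iterations combined are only a fraction of the $b^d$ nodes at the deepest level.

```python
# Sanity check of the iterative-deepening overhead: the nodes re-expanded
# in iterations 0..d-1 are a small fraction of the b^d nodes at depth d.
# b = 4 is an illustrative branching factor, not the 8-puzzle's actual one.

def nodes_at_depth(b, d):
    """Nodes at depth d of a uniform tree with branching factor b."""
    return b ** d

def total_reexpanded(b, d):
    """sum_{i=0}^{d-1} b^i: nodes visited again by all earlier iterations."""
    return sum(b ** i for i in range(d))

b, d = 4, 10
print(nodes_at_depth(b, d))    # 1048576
print(total_reexpanded(b, d))  # 349525, about a third of the last level
print(b / (b - 1))             # overhead factor b/(b-1) -> 4/3
```

As the comments show, even at a modest depth the last level alone dwarfs everything re-expanded before it, which is why the repeated traversals cost so little.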

In case you want to delve deeper into this issue, I strongly recommend reading the original paper: Korf, Richard E. Depth-first iterative-deepening: An optimal admissible tree search. Artificial Intelligence (27), 97-109, 1985. See in particular Theorem 4.2:

Depth-first iterative-deepening is asymptotically optimal among brute-force tree searches in terms of time, space, and length of solution.

Of course, it delivers optimal solutions, so it is admissible and hence asymptotically optimal in terms of solution length. It performs only depth-first traversals and is therefore asymptotically optimal in terms of space as well (a good implementation requires only $O(d)$ memory).

As for the time, I already outlined the main theoretical reason, but let me now highlight three main reasons why it is so fast in practice:

  1. First of all, for almost all purposes, the overall running time of any algorithm is dominated by the time it takes to expand nodes (other operations are rather simple and atomic). It is indeed when expanding nodes that IDA$^*$ can be extraordinarily fast because:

    1.1. IDA$^*$ is implemented recursively, so all you need is to take the state given as an argument and generate one child at a time (in your specific case this simply means swapping the blank with an adjacent tile; that is a single statement!). For A$^*$, by contrast, expanding a node requires first popping a state from the queue and then generating all of its children (i.e., moving the blank tile in every possible direction).

    1.2. While the preceding difference matters, the really important one is that A$^*$ must keep the nodes in OPEN ordered. Even if you do this with buckets (so that each insertion takes $O(1)$), note that IDA$^*$ does not need to order nodes at all: whereas A$^*$ spends time at least linear in the number of nodes expanded on maintaining OPEN, IDA$^*$ spends none.

  2. Finally, one of the contributions of best-first search algorithms (such as A$^*$) is that they avoid re-expanding nodes by using a CLOSED list (which, in spite of its name, is usually implemented as a set!). IDA$^*$ has no duplicate-detection mechanism, so here again A$^*$ performs extra work that IDA$^*$ skips entirely. If you want to know more about duplicate detection in IDA$^*$, see: Reinefeld, A.; Marsland, T. Enhanced iterative-deepening search. IEEE Transactions on Pattern Analysis and Machine Intelligence (16) 7, 701-710, 1994.

This last point is truly relevant. The sliding-tile puzzle does not, in fact, have many transpositions: the shortest cycle consists of 12 moves, so the number of re-expansions is not that large. Putting all this together, IDA$^*$ is a killer machine for all sizes of the sliding-tile puzzle.
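To make points 1.1 and 2 concrete, here is a minimal recursive IDA$^*$ sketch for the 8-puzzle. It assumes states are 9-tuples with 0 for the blank; the helper names and the example start state are mine, for illustration only, and the code is a sketch of the structure, not an optimized solver. Note what is absent: no OPEN, no CLOSED, no sorting; expansion is just a swap, and memory is the $O(d)$ path.

```python
# Minimal recursive IDA* for the 8-puzzle (illustrative sketch).
# States are 9-tuples, 0 is the blank; GOAL puts the blank last.

GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)

def manhattan(state):
    """Sum of Manhattan distances of the tiles (blank excluded)."""
    h = 0
    for pos, tile in enumerate(state):
        if tile == 0:
            continue
        goal_pos = tile - 1  # tile t belongs at index t-1 in GOAL
        h += abs(pos // 3 - goal_pos // 3) + abs(pos % 3 - goal_pos % 3)
    return h

def children(state):
    """Generate successors by swapping the blank with an adjacent tile."""
    blank = state.index(0)
    r, c = divmod(blank, 3)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < 3 and 0 <= nc < 3:
            s = list(state)
            nb = nr * 3 + nc
            s[blank], s[nb] = s[nb], s[blank]
            yield tuple(s)

def search(state, g, bound, path):
    """Bounded DFS on f = g + h; returns 'FOUND' or the next larger bound."""
    f = g + manhattan(state)
    if f > bound:
        return f
    if state == GOAL:
        return "FOUND"
    minimum = float("inf")
    for child in children(state):
        if len(path) >= 2 and child == path[-2]:
            continue  # don't immediately undo the previous move
        path.append(child)
        t = search(child, g + 1, bound, path)
        if t == "FOUND":
            return "FOUND"
        minimum = min(minimum, t)
        path.pop()
    return minimum

def ida_star(start):
    """Restart the DFS with an increasing f-bound until the goal is found."""
    bound = manhattan(start)
    path = [start]
    while True:
        t = search(start, 0, bound, path)
        if t == "FOUND":
            return path  # optimal solution as a list of states
        if t == float("inf"):
            return None  # unsolvable instance
        bound = t

if __name__ == "__main__":
    start = (1, 2, 3, 4, 5, 6, 0, 7, 8)  # two moves from the goal
    solution = ida_star(start)
    print(len(solution) - 1)  # 2
```

The parent check in `search` is the cheapest form of duplicate avoidance; a full CLOSED set is exactly what this algorithm deliberately omits, which is why it stays so fast on this domain.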

However, in spite of all its advantages, IDA$^*$ might not be of practical interest in applications with a large number of transpositions. There have been various attempts to overcome this difficulty, with limited success; see for example:

Dow, P. Alex; Korf, Richard E. Duplicate avoidance in depth-first search with applications to treewidth. International Joint Conference on Artificial Intelligence, 480-485, 2009.

Hope this helps,

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange