A* Algorithm with Manhattan Distance Heuristic

https://stackoverflow.com/questions/23189211

06-07-2023
|

Question

I've been working on a 15-puzzle solver in C. And I had some issues with the huge amounts of memory that my code uses.

I won't be posting my code because it's too long... I've implemented most of the libraries I'm using and it will probably just bring confusion to you. Let's start on the basics.

The things that I'm using right now are: (All of them implemented in C)

- Fibonacci Heap:

/* Struct for the Fibonacci Heap */
typedef struct _fiboheap {
    int  size;           // Number of nodes in the heap
    node min;            // Pointer to the minimun element in the Heap
    funCompare compare;  // Function to compare within nodes in the heap
    freeFunction freeFun;// Function for freeing nodes in the heap
} *fiboheap;


/* Struct of a Fibonacci Heap Node */
typedef struct _node {
    node parent;    // Pointer to the node parent
    node child;     // Pointer to the child node
    node left;      // Pointer to the right sibling
    node right;     // Pointer to the left sibling
    int  degree;    // Degree of the node
    int  mark;      // Mark
    void *key;      // Value of the node (Here is the element)
} *node;

- Hash Table

The only thing I haven't implemented myself, I'm using uthash.

- 15-Puzzle States representation

This is an interesting topic. Before explaining the situation here, let's just think a little bit about the 15-puzzle problem... As we know, 15-Puzzle has 16 tiles (We're counting the blank tile as the tile with the number 0). So, how many possible states has the 15-puzzle problem? Well, it's factorial(16) states. So, that's a lot of states. That means that we want our states to be as small as possible... If our initial state is too far away from the solution, we may explore so much states that our program memory will just explode.

So... my 15-puzzle representation consist of using bits and logical operators:

/* Struct for the 15-puzzle States Representation */
typedef struct _state {
    unsigned int quad_1; /* This int represent the first 8 numbers */
    unsigned int quad_2; /* This int represent the last 8 numbers */
    unsigned short zero; /* This is the position of the zero */
} *state;

So what I'm doing is using logical operators to move and change the numbers, using the minimum space.

Note that the size of this struct is 12 bytes (Because it has three integers)

- Manhattan Distance

This is just the well known Manhattan Distance Heuristic. Where you basically calculate the sum of the distances of each number current position to the number position in the goal state.

- A* Implementation and Node structure to be used with A*

Let's start with the Nodes.

typedef struct nodo_struct {
    nodo parent;        // Pointer to the parent of the node
    state estado;       // State associated with the node
    unsigned int g : 8; // Cost of the node
    action a;           // Action by which we arrived to this node
                        // If null, it means that this is the initial node
} *nodo;

Note that these nodes had nothing to do with the nodes in fibonacci heap.

And now the main reason of this topic... The pseudocode of A* I'm currently using.

a_star(state initial_state) {
    q = new_fibo_heap; // Sorted by (Cost g) + (Manhattan Distance)
                       // It will have nodes which contain a pointer to the state
    q.push(make_root_node(initial_state));
    closed = new_hash_table();
    while (!q.empty()) {
        n = q.pop();
        if ((n->state ∉ closed) || (n->g < dist(n->state))) { 
        /* The dist used above is stored in the hash table as well */
            closed.insert(n->state);
            dist(n->state) = n->g;   // Update the hash table
            if (is_goal(n->state)) {
                return extract_solution(n); // Go through parents, return the path
            }
            for <a,s> ∈ successors(n->state) {
            // 'a' is the action, It can be 'l', 'r', 'u', 'd'
            // 's' is the state that is a successor of n
                if (manhattan(s) < Infinity) {
                    q.push(make_node(n,a,s));
                    // Assuming that manhattan is safe
                }
            }
        }
    }
    return NULL;
}

So the questions I haven't been able to answer yet are...

How do you efficiently manage the memory? Can you re-use states or nodes? What consequences does that bring?

I mean, If you take a look at the pseudocode. It's not considering re-using states or nodes. It just keeps allocating over and over nodes and states, even though they've been calculated before.

I've been thinking a lot about this... Each time you run the solver, it can expand millions of nodes real quick. And as we know, A* may re-explore nodes when you find another path with a cost less than the previous one. That means... If we explore 1 million nodes, it will be 24 millions of bytes (Wow)... And considering that each node has a state, that would be 14 bytes per node, those are 14 millions of bytes...

In the end, What I need is a way of reusing/freeing space so my computer won't crash when I execute this solver.

(PD: Sorry for the long post)

Solution

Are you doing this for an assignment or for fun?

If it's for fun, then don't use A*, use IDA*. IDA* will be faster and it uses almost no memory. Furthermore, you can use the extra memory to build better heuristics, and get even better performance. (You should find sufficient information if you google "pattern database". These were invented by Jonathan Schaeffer and Joe Culberson, but studied in significant detail by Rich Korf and Ariel Felner.)

IDA* has some drawbacks and doesn't work in every domain, but it is just about perfect for the sliding-tile puzzle.

Another algorithm which could help is Breadth-First Heuristic Search. This and other papers discuss how you can avoid storing the closed list entirely.

Basically, a lot of smart people have tackled this problem before and they've published their methods/results so you can learn from them.

Here are some tips to improve your A*:

I have found that there isn't much of a speed gain from Fibonacci heaps, so you might want to use a simpler data structure. (Although available implementations might have improved since I tried this last.)
The f-cost of a node will jump in increments of 2. Thus, you can bucket your f-costs and only worry about sorting items in the same f-cost layer. A FIFO queue actually works pretty well.
You can use the ideas in this paper to convert the 15-puzzle representation into a permutation, which will take about 43 bits for the full representation. But, it becomes more expensive to expand states because you have to convert into a different representation to generate moves.
Avoid storing the closed list entirely using forbidden operators. (See the previous Breadth-First Heuristic Search paper or this paper on Best-First Frontier search for more details.)

Hopefully these points will address your issues. I'm happy to provide more links or clarification if you need them.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow