Detecting infinite loop in brainfuck program

https://stackoverflow.com/questions/367571

21-08-2019
|

Question

I have written a simple brainfuck interpreter in MATLAB script language. It is fed random bf programs to execute (as part of a genetic algorithm project). The problem I face is, the program turns out to have an infinite loop in a sizeable number of cases, and hence the GA gets stuck at the point.
So, I need a mechanism to detect infinite loops and avoid executing that code in bf.
One obvious (trivial) case is when I have

[]

I can detect this and refuse to run that program.
For the non-trivial cases, I figured out that the basic idea is: to determine how one iteration of the loop changes the current cell. If the change is negative, we're eventually going to reach 0, so it's a finite loop. Otherwise, if the change is non-negative, it's an infinite loop.
Implementing this is easy for the case of a single loop, but with nested loops it becomes very complicated. For example, (in what follows (1) refers to contents of cell 1, etc. )

++++ Put 4 in 1st cell (1)
>+++ Put 3 in (2)
<[   While( (1) is non zero)
    --   Decrease (1) by 2
    >[   While( (2) is non zero)
        -    Decrement (2)
        <+   Increment (1) 
    >]   
    (2) would be 0 at this point
    +++  Increase (2) by 3 making (2) = 3
<]   (1) was decreased by 2 and then increased by 3, so net effect is increment

and hence the code runs on and on. A naive check of the number of +'s and -'s done on cell 1, however, would say the number of -'s is more, so would not detect the infinite loop.
Can anyone think of a good algorithm to detect infinite loops, given arbitrary nesting of arbitrary number of loops in bf?

EDIT: I do know that the halting problem is unsolvable in general, but I was not sure whether there did not exist special case exceptions. Like, maybe Matlab might function as a Super Turing machine able to determine the halting of the bf program. I might be horribly wrong, but if so, I would like to know exactly how and why.

SECOND EDIT: I have written what I purport to be infinite loop detector. It probably misses some edge cases (or less probably, somehow escapes Mr. Turing's clutches), but seems to work for me as of now. In pseudocode form, here it goes:

subroutine bfexec(bfprogram)
begin
    Looping through the bfprogram,
        If(current character is '[')
            Find the corresponding ']'
            Store the code between the two brackets in, say, 'subprog'
            Save the value of the current cell in oldval
            Call bfexec recursively with subprog
            Save the value of the current cell in newval
            If(newval >= oldval)
                Raise an 'infinite loop' error and exit
            EndIf
        /* Do other character's processings */
        EndIf
    EndLoop
end

Solution

When I used linear genetic programming, I just used an upper bound for the number of instructions a single program was allowed to do in its lifetime. I think that this is sensible in two ways: I cannot really solve the halting problem anyway, and programs that take too long to compute are not worthy of getting more time anyway.

OTHER TIPS

Alan Turing would like to have a word with you.

http://en.wikipedia.org/wiki/Halting_problem

Let's say you did write a program that could detect whether this program would run in an infinite loop. Let's say for the sake of simplicity that this program was written in brainfuck to analyze brainfuck programs (though this is not a precondition of the following proof, because any language can emulate brainfuck and brainfuck can emulate any language).

Now let's say you extend the checker program to make a new program. This new program exits immediately when its input loops indefinitely, and loops forever when its input exits at some point.

If you input this new program into itself, what will the results be?

If this program loops forever when run, then by its own definition it should exit immediately when run with itself as input. And vice versa. The checker program cannot possibly exist, because its very existence implies a contradiction.

As has been mentioned before, you are essentially restating the famous halting problem: http://en.wikipedia.org/wiki/Halting_problem

Ed. I want to make clear that the above disproof is not my own, but is essentially the famous disproof Alan Turing gave back in 1936.

State in bf is a single array of chars.

If I were you, I'd take a hash of the bf interpreter state on every "]" (or once in rand(1, 100) "]"s*) and assert that the set of hashes is unique.

The second (or more) time I see a certain hash, I save the whole state aside.

The third (or more) time I see a certain hash, I compare the whole state to the saved one(s) and if there's a match, I quit.

On every input command ('.', IIRC) I reset my saved states and list of hashes.

An optimization is to only hash the part of state that was touched.

I haven't solved the halting problem - I'm detecting infinite loops while running the program.

*The rand is to make the check independent of loop period

Infinite loop cannot be detected, but you can detect if the program is taking too much time.

Implement a timeout by incrementing a counter every time you run a command (e.g. <, >, +, -). When the counter reaches some large number, which you set by observation, you can say that it takes very long time to execute your program. For your purpose, "very long" and infinite is a good-enough approximation.

As already mentioned this is the Halting Problem. But in your case there might be a solution: The Halting Problem is considering is about the Turing machine, which has unlimited memory.

In case you know that you have a upper limit of memory (e.g. you know you dont use more than 10 memory cells), you can execute your programm and stop it. The idea is that the computation space bounds computation time (as you cant write more than one cell at one step). After you executed as much steps as you can have different memory configurations, you can break. E.g. if you have 3 cells, with 256 conditions, you can have at most 3^256 different states, and so you can stop after executing that many steps. But be careful, there are implicit cells, like the instruction pointer and the registers. You do it even shorter, if you save every state configuration and as soon as you detect one, which you already had, you have an infite loop. This approach is definitly much better in the run time, but therefor needs much more space (here it might be suitable to hash the configurations).

This is not the halting problem, however, it is still not reasonable to try to detect halting even in such a limited machine as a 1000 cell BF machine.

Consider this program:

+[->[>]+<[-<]+]

This program will not repeat until it has filled up the entire of memory which for just 1000 cells will take about 10^300 years.

Off the top of my head (and I could be wrong), I would think it would be a little bit difficult to detect whether or not a program has an infinite loop without actually executing the program itself.

As the conditional execution of portions of the program depends on the execution state of the program, it will be difficult to know the particular state of the program without actually executing the program.

If you don't require that a program with an infinite loop be executed, you could try having an "instructions executed" counter, and only execute a finite number of instructions. This way, if a program does have an infinite loop, the interpreter can terminate the program which is stuck in an infinite loop.

If I remember correctly, the halting problem proof was only true for some extreme case that involved self reference. However it's still trivial to show a practical example of why you can't make an infinite loop detector.

Consider Fermat's Last Theorem. It's easy to create a program that iterates through every number (or in this case 3 numbers), and detects if it's a counterexample to the theorem. If so it halts, otherwise it continues.

So if you have an infinite loop detector, it should be able to prove this theorem, and many many others (perhaps all others, if they can be reduced to searching for counterexamples.)

In general, any program that involves iterating through numbers and only stopping under some condition, would require a general theorem prover to prove if that condition can ever be met. And that's the simplest case of looping there is.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow