Should I avoid tail recursion in Prolog and in general?

Question 1

Short answer: Tail recursion is desirable, but don't over-emphasize it.

Your original program is as tail recursive as you can get in Prolog. But there are more important issues: Correctness and termination.

In fact, many implementations are more than willing to sacrifice tail-recursiveness for other properties they consider more important, for example steadfastness.

But your attempted optimization does have a point, at least from a historical perspective.

Back in the 1970s, the major AI language was LISP. And the corresponding definition would have been

(defun addone (xs)
  (cond ((null xs) nil)
    (t (cons (+ 1 (car xs))
         (addone (cdr xs))))))

which is not directly tail-recursive. The reason is the cons: in implementations of that time, its arguments were evaluated first, and only then could the cons itself be executed. So rewriting this as you have indicated (and reversing the resulting list) was a possible optimization technique.

In Prolog, however, you can create the cons before its actual values are known, thanks to logic variables. So many programs that were not tail-recursive in LISP translate to tail-recursive programs in Prolog.
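The question's original program is not reproduced here, so the following is only a minimal sketch of what a direct Prolog definition might look like, with the name addone/2 assumed from the LISP version above. The head of the second clause builds the list cell [Y|Ys] while Ys is still an unbound logic variable, which is why the recursive call can be the last goal.

```prolog
% addone(+Xs, -Ys): Ys is Xs with every element incremented by one.
% Assumed reconstruction for illustration, not the original poster's exact code.
addone([], []).
addone([X|Xs], [Y|Ys]) :-
    Y is X + 1,        % fill in the first element of the output cell
    addone(Xs, Ys).    % last call: Ys gets bound by the recursion
```

With this definition, the query `?- addone([1,2,3], Ys).` answers `Ys = [2,3,4]`, and no list reversal is needed.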

The repercussions of this can still be found in many Prolog textbooks.

Question 2

Your addOne procedure already is tail recursive.

There are no choice points between the head and the last recursive call, because is/2 is deterministic.

Accumulators are sometimes added to allow tail recursion. The simplest example I can think of is reverse/2. Here is a naive reverse (nreverse/2), which is not tail recursive:

nreverse([], []).
nreverse([X|Xs], R) :- nreverse(Xs, Rs), append(Rs, [X], R).

If we add an accumulator,

reverse(L, R) :- reverse(L, [], R).
reverse([], R, R).
reverse([X|Xs], A, R) :- reverse(Xs, [X|A], R).

then reverse/3 is tail recursive: the recursive call is the last goal, and no choice point is left.

Question 3

O.P. said:

But I have read that it is better to avoid [tail] recursion for performance reasons. Is this true? Is it considered 'good practice' to use tail recursion always? Will it be worth the effort to use accumulators to get into a good habit?

It is a fairly straightforward optimization to convert a tail-recursive construct into iteration (a loop). Since the tail (recursive) call is the last thing done, the current stack frame can be reused for the recursive call, which turns the recursion, for all intents and purposes, into a loop that simply jumps back to the beginning of the predicate/function/method/subroutine. Thus, a tail-recursive predicate will not overflow the stack. Tail-recursive constructs, with the optimization applied, have the following benefits:

Slightly faster execution as new stack frames don't need to be allocated/freed; further, you get better locality of reference, so arguably less paging.
No upper bound on the depth of recursion.
No stack overflows.
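To make the stack-frame argument concrete, here is a small sketch contrasting two ways of summing a list. The predicate names sum_naive/2 and sum_acc/2 (with its helper sum_acc/3) are made up for illustration. In the first version the addition still has to run after the recursive call returns, so every frame must be kept. In the second, nothing remains to be done after the recursive call, so a system that applies the optimization can reuse the frame.

```prolog
% Not tail recursive: "S is S0 + X" runs after the recursive call returns.
sum_naive([], 0).
sum_naive([X|Xs], S) :-
    sum_naive(Xs, S0),
    S is S0 + X.

% Tail recursive: the running total is carried in an accumulator.
sum_acc(Xs, S) :- sum_acc(Xs, 0, S).

sum_acc([], S, S).
sum_acc([X|Xs], A0, S) :-
    A is A0 + X,
    sum_acc(Xs, A, S).   % last call, no pending work afterwards
```

Both `?- sum_naive([1,2,3], S).` and `?- sum_acc([1,2,3], S).` answer `S = 6`. Only the stack behaviour on very long lists should differ.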

The possible downsides?

Loss of a useful stack trace. Not an issue if tail recursion optimization (TRO) is only applied in a release/optimized build and not in a debug build, but...
developers will write code that depends on TRO, which means that code that runs fine with TRO applied will fail without it. In the above case (TRO only in release/optimized builds), a functional difference then exists between release and debug builds: one's choice of compiler options generates two different programs from identical source code.

This is, of course, not an issue when the language standard demands tail recursion optimization.

To quote Wikipedia:

Tail calls are significant because they can be implemented without adding a new stack frame to the call stack. Most of the frame of the current procedure is not needed any more, and it can be replaced by the frame of the tail call, modified as appropriate (similar to overlay for processes, but for function calls). The program can then jump to the called subroutine. Producing such code instead of a standard call sequence is called tail call elimination, or tail call optimization.

See also:

I've never understood why more languages don't implement tail recursion optimization

Question 4

I don't think that the first version of addone should lead to less efficient code. It is also a lot more readable, so I see no reason why it should be good practice to avoid it.

In more complex examples, the compiler might not be able to transform the code into tail-recursive form automatically. Then it may be reasonable to rewrite it by hand as an optimization, but only if it is really necessary.

So, how can you implement a working tail-recursive version of addone? It may be cheating, but assuming that reverse is implemented with tail recursion (e.g., see here), it can be used to fix your problem:

accAddOne([X|Xs],Acc,Result) :- Xnew is X+1, accAddOne(Xs,[Xnew|Acc],Result).
accAddOne([],Acc,Result) :- reverse(Acc, Result).
addone(List,Result) :- accAddOne(List,[],Result).

It is extremely clumsy, though. :-)

By the way, I cannot find a simpler solution. It may be for the same reason that foldr in Haskell is normally not defined with tail recursion.

Question 5

In contrast to some other programming languages, certain Prolog implementations are well suited for tail-recursive programs. Tail recursion can be handled as a special case of last call optimization (LCO). For example, this Java program fails once the recursion gets deep:

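public class Count {  // class name as it appears in the stack trace below
    // Counts down to zero; the recursive call is in tail position.
    public static boolean count(int n) {
        if (n == 0) {
            return true;
        } else {
            return count(n-1);
        }
    }

    public static void main(String[] args) {
        System.out.println("count(1000)="+count(1000));
        System.out.println("count(1000000)="+count(1000000));
    }
}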

The result will be:

count(1000)=true
Exception in thread "main" java.lang.StackOverflowError
    at protect.Count.count(Count.java:9)
    at protect.Count.count(Count.java:9)

On the other hand, major Prolog implementations don't have any problem with it:

 ?- [user].
 count(0) :- !.
 count(N) :- M is N-1, count(M).
 ^D

The result will be:

?- count(1000).
true.
?- count(1000000).
true.

The reason Prolog systems can do that is that their execution is most often trampoline-style anyway, so last call optimization becomes a matter of choice point elimination and environment trimming. Environment trimming was already documented in the early WAM (Warren Abstract Machine).

But yes, debugging might be a problem.
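Returning to the choice-point aspect, here is a sketch of a variant of count/1 that replaces the cut with an explicit guard. The name count_guarded/1 is hypothetical. It assumes a system with first-argument indexing: for a positive integer argument only the second clause is a candidate, so no choice point should be created during the recursion and the recursive call remains a last call. The guard also makes the predicate fail for negative input, whereas the cut-based version above would keep counting downwards.

```prolog
% count_guarded(+N): succeeds for non-negative integers N.
% Hypothetical variant of count/1 for illustration.
count_guarded(0).
count_guarded(N) :-
    N > 0,               % guard: rules out zero and negative numbers
    M is N - 1,
    count_guarded(M).    % last call
```

One difference from the cut-based version: when the recursion reaches 0, both clauses are candidates, so a single choice point may be left at the very end, and the toplevel may offer to search for further solutions.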