What are the pros and cons of using manual list iteration vs recursion through fail

https://stackoverflow.com/questions/9744641

24-05-2021
|

Question

I come up against this all the time, and I'm never sure which way to attack it. Below are two methods for processing some season facts.

What I'm trying to work out is whether to use method 1 or 2, and what are the pros and cons of each, especially large amounts of facts.

methodone seems wasteful since the facts are available, why bother building a list of them (especially a large list). This must have memory implications too if the list is large enough ? And it doesn't take advantage of Prolog's natural backtracking feature.

methodtwo takes advantage of backtracking to do the recursion for me, and I would guess would be much more memory efficient, but is it good programming practice generally to do this? It's arguably uglier to follow, and might there be any other side effects?

One problem I can see is that each time fail is called, we lose the ability to pass anything back to the calling predicate, eg. if it was methodtwo(SeasonResults), since we continually fail the predicate on purpose. So methodtwo would need to assert facts to store state.

Presumably(?) method 2 would be faster as it has no (large) list processing to do?

I could imagine that if I had a list, then methodone would be the way to go.. or is that always true? Might it make sense in any conditions to assert the list to facts using methodone then process them using method two? Complete madness?

But then again, I read that asserting facts is a very 'expensive' business, so list handling might be the way to go, even for large lists?

Any thoughts? Or is it sometimes better to use one and not the other, depending on (what) situation? eg. for memory optimisation, use method 2, including asserting facts and, for speed use method 1?

season(spring).
season(summer).
season(autumn).
season(winter).

 % Season handling
showseason(Season) :-
    atom_length(Season, LenSeason),
    write('Season Length is '), write(LenSeason), nl.

% -------------------------------------------------------------
% Method 1 - Findall facts/iterate through the list and process each
%--------------------------------------------------------------
% Iterate manually through a season list
lenseason([]).
lenseason([Season|MoreSeasons]) :-
    showseason(Season),
    lenseason(MoreSeasons).


% Findall to build a list then iterate until all done
methodone :-
    findall(Season, season(Season), AllSeasons),
    lenseason(AllSeasons),
    write('Done').

% -------------------------------------------------------------
% Method 2 - Use fail to force recursion
%--------------------------------------------------------------
methodtwo :-
    % Get one season and show it
    season(Season),
    showseason(Season),

    % Force prolog to backtrack to find another season
    fail.

% No more seasons, we have finished
methodtwo :-
    write('Done').

Solution

method one seems wasteful since the facts are available, why bother building a list of them (especially a large list). This must have memory implications too if the list is large enough ?

Yes, method 1 takes Θ(n) memory. It's primary benefit is that it is declarative, i.e. it has a straightforward logical meaning.

Method 2, the "failure-driven loop" as Prolog programmers call it, takes constant memory, is procedural and may be preferred when you're doing procedural (extra-logical) things anyway; i.e., in I/O code it's ok to use it.

Note that SWI-Prolog has a third way of writing this loop:

forall(season(S), showseason(S)).

This only works if showseason succeeds for each binding of season(S).

OTHER TIPS

Lets look at your example. It is very simple, so we will imagine it being more complex. However, it seems you take for granted that side effects are essential. Let me question that a bit:

In your example you have made a very interesting discovery: The names of all seasons are of same length. What an earth-shattering insight! But wait, is it really true? The most straight-forward way to verify this, is:

?- season(S), atom_length(S,L).
S = spring,
L = 6 ;
S = summer,
L = 6 ;
S = autumn,
L = 6 ;
S = winter,
L = 6.

No need for findall/3, no need for write/1.

For a larger number of answers, visual inspection is not practical. Imagine 400 seasons. But we can verify this with:

?- season(S), atom_length(S,L), dif(L,6).
false.

So we now know for sure that there is no season of a different length.

That is my very first answer to your question:

As long as you can, use the toplevel shell and not your own side effecting procedures! Stretch things a little bit further to avoid side-effects altogether. This is the best way to avoid failure driven loops right from the beginning.

There are more reasons why sticking to the toplevel shell is a good idea:

If your programs can be easily queried on the toplevel, it will be trivial to add test cases for them.
The toplevel shell is used by many other users and therefore is very well tested. Your own writing is often flawed and untested. Think of constraints. Think of writing floats. Will you use write/1 for floats too? What is the right way to write floats such that they can be read back accurately? There is a way to do this in iso-prolog. Here is the answer:

In ISO, writeq/1,2, write_canonical/1,2, write_term/2,3 with option quoted(true) guarantee that floats can be read back accurately. That is, they are the same w.r.t. (==)/2

The toplevel shell shows you valid Prolog text. In fact, the answer itself is a query! It can be pasted back into the toplevel - only to get back the very same answer. In this manner you will learn the more exotic but unavoidable details of Prolog, like quoting, escaping and bracketing. It is practically impossible to learn the syntax otherwise, since Prolog parsers are often extremely permissive.
Your program will be most probably more accessible to declarative reasoning.

Very likely, your two procedures methodone and methodtwo are incorrect: You forgot a newline after writing Done. So methodone, methodone contains a garbled line. How to test that easily?

But lets look a little bit further into your program. What is so typical for failure driven loops is that they start innocently as something doing "only" side effects but sooner or later they tend to attract more semantic parts as well. In your case, atom_length/2 is hidden down in the failure driven loop completely inaccessible to testing or reasoning.

Efficiency considerations

Prolog systems often implement failure by deallocating a stack. Therefore, failure driven loops will not require a garbage collector. That's why people believe that failure driven loops are efficient. However, this is not necessarily the case. For a goal like findall(A, season(A), As) every answer for A is copied into some space. This is a trivial operation for something like atoms but imagine a bigger term. Say:

blam([]).
blam([L|L]) :- blam(L).

bigterm(L) :- length(L,64), blam(L).

In many systems, findall/3 or assertz/1 for this big term will freeze the system.

Also, systems like SWI, YAP, SICStus do have quite sophisticated garbage collectors. And using fewer failure driven loops will help to improve those systems even further, since this creates a demand for more sophisticated techniques.

If using findall already, why not maplist as well:

findall(S, season(S), L), maplist( showseason, L).

~~Both are not in pure logical Prolog core.~~ And yes, you allocate a whole list for all the solutions.

Your second method is called "failure-driven loop" and there isn't anything wrong with it, except there's no way to get at the previous solutions after backtracking through failure. That's why findall is extra-logical. Internally, it could be impl'd as failure-driven loop which stores its interim results via asserting. So the second is conceptually cleaner as well, in addition to not allocating any extra memory. It is usually employed in top-level "driver" (i.e., UI) predicates.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow