Question

I am trying to get my head around as to why having a local variable or a for loop inside a function is not considered to be pure functional programming.

Given this function:

int as_int(char *str)
{
    int acc; /* accumulate the partial result */

    for (acc = 0; isdigit(*str); str++) {
        acc = acc * 10 + (*str - '0');
    }

    return acc;
}

Under what circumstances would the variable acc be a side-effect ? Even in a concurrent environment each invocation of the function would have its own copy of acc. So I don't quite get why it isn't allowed in functional programming.

Was it helpful?

Solution

Looping in functional programming isn't done with control statements like for and while, it's done with explicit calls to functions like map, fold, or recursion - all of which involve placing the inner loop call inside another function. If the loop code mutates variables outside the loop, this inner loop function would be manipulating variables outside its scope and would thus be impure. So the whole outer function is pure, but the looping is not. Looping constructs in functional programming require you to make the state explicit. Translating your code to something using functional programming looping tools reveals the impurity:

int as_int(char *str)
{
    int acc = 0; /* accumulate the partial result */

    map(takeWhile(isdigit, str), void function(char *chr) {
      acc = acc * 10 + (chr - '0');
    });

    return acc;
}

(Note - this syntax is approximate to get the general idea across)

This code uses an inner function for the loop body which must mutate the variable acc, which is outside it's scope. This is impure - the inner loop function depends on the outer loop context, calling it multiple times with the same character will have side-effects, and the order you call it on the sequence of characters matters. In functional programming, in order to make this a pure function you would have to make this dependency on state passed between loop iterations explicit with fold:

int as_int(char *str)
{
    return fold(takeWhile(isdigit, str), 0, int function(char *chr, int acc) {
      return acc * 10 + (chr - '0');
    });
}

fold uses a function of two arguments for the inner loop body: the first argument is an item in the sequence that fold is looping over, while the second is some value that the inner loop body uses to build up partial results. For the first loop iteration, acc is 0, for the second, acc is whatever the first inner loop function call returned, for the third, it's whatever the second inner loop returned, and the final loop returns the result of the whole fold expression.

Note that this isn't really a problem with your code from the perspective of the rest of your program - both definitions of as_int are pure. The difference is by making the inner loop code a pure function, you can take advantage of the massive array of tools functional programming offers to decompose the loop into something more declarative (e.g. using takeWhile, fold, filter, map, etc. etc.)

Licensed under: CC-BY-SA with attribution
scroll top