Question

A pure function in functional programming is one that has no side effects. Among other things, this means it cannot change the values of its input parameters. Can this be seen as a disadvantage for memory utilization?
For example, let's say I have a function which simply takes a list and appends an element to it. In C++ it could be as simple as


void addElem(std::vector<int>& vec, int a)
{
    vec.push_back(a);
}

This function clearly doesn't use much more memory than is already taken by the passed object.
But the same thing in Haskell would come to something like this.

addElem :: [Int] -> Int -> [Int] 
addElem xs a = xs ++ [a]

Now in this computation, xs does not change its value. So am I correct in saying that at some point the function will consume double the memory of xs, one copy for xs and one for the return value? Or does lazy evaluation somehow make sure that xs is returned with only one element added at the end?
I know that a Monad is the way to have something which can have side effects. But can a Monad be used to simply modify the input and return its value?
Also, does changing xs ++ [a] to a : xs consume less memory?


Solution

Purity implies that the function cannot make destructive updates, so if

xs ++ [a]

is fully evaluated, a copy of the spine of xs must be made. That can happen in constant space if xs is lazily generated and not referenced anywhere else, so that it can be garbage collected as it is created. Otherwise a copy is needed in addition to xs; however, purity allows the cells of both lists to point to the same elements, so the element data need not be copied, only the spine (extended by the final cons). And if a copy is made, the old value xs is still fully available.
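The spine copy can be seen in how (++) behaves; here is a minimal sketch (the helper name appendOne is ours, not from the Prelude):

```haskell
-- Sketch of what xs ++ [a] does on the standard list type.
-- Each recursive step allocates a fresh spine cell (y : ...), but the
-- element y itself is shared with the original list, not copied.
appendOne :: [a] -> a -> [a]
appendOne xs a = go xs
  where
    go []       = [a]        -- reached the end: attach the new element
    go (y : ys) = y : go ys  -- new spine cell, shared element
```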

Also, does changing xs ++ [a] to a : xs consume less memory?

Yes, that one can simply reuse xs: all that is needed is to create a single new list cell and let its tail pointer point to xs.
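As a sketch (the name addElemFront is our own):

```haskell
-- O(1): exactly one new cons cell is allocated; its tail pointer is
-- the original list xs, which is shared unchanged.
addElemFront :: Int -> [Int] -> [Int]
addElemFront a xs = a : xs
```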

From a comment:

I don't want the C++ function to be pure. I want it to change the state of the input. And that is what I wanted to be the whole point of the question.

In that case, you need an impure function/action. That is possible to implement for some data structures, but for lists, the cells have to be copied/constructed anew. Note that adding an element to a std::vector<T> may also need reallocation.

OTHER TIPS

Yes, pure FP may be more memory-intensive than imperative programming, though it depends on how you write your programs and how smart your compiler is. Haskell compilers in particular have very strong optimizers and might be able to transform pure FP code into quite efficient machine code that reuses allocated memory. That requires writing good FP code though, as even the smartest compiler will not include optimizations to handle programs that just simulate imperative ones with superficially similar FP constructs.

Mind you, the memory behavior of your C++ example depends on which operation you meant. If you meant

v[0] = a;  // assuming v.size() > 0

then that doesn't do any allocation. If you meant

v.push_back(a);

then that may or may not allocate, depending on the capacity of v.

Or does lazy evaluation somehow make sure that xs is returned with only one element added at the end?

That depends on the implementation and on how the expression is used in its context. When xs ++ [a] is fully evaluated, the entire spine of xs is copied. If it is only partially evaluated, it might do anywhere from zero to length xs allocations.
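Partial evaluation can be demonstrated directly. In this sketch, undefined stands in for a suffix that is never forced:

```haskell
-- head only demands the first cons cell of the result, so (++) never
-- walks (or copies) the rest of the spine; the undefined suffix is
-- never touched and no error is raised.
firstOfAppend :: Int
firstOfAppend = head ([1, 2, 3] ++ undefined)
```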

Also, does changing xs ++ [a] to a : xs consume less memory?

Yes, that changes it from O(n) to O(1) worst-case allocations/extra memory use; the same goes for time complexity. When handling lists in Haskell, never append to the end. If appending is really what you need, use a difference list.
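A minimal difference-list sketch (the names here are our own; the dlist package provides a production-quality version):

```haskell
-- A difference list represents a list as a function that prepends it
-- to whatever comes after. "Appending" is then function composition,
-- which is O(1); the real list is only materialized at the end.
type DList a = [a] -> [a]

toDList :: [a] -> DList a
toDList xs = (xs ++)

snocD :: DList a -> a -> DList a
snocD f a = f . (a :)  -- append one element in O(1)

fromDList :: DList a -> [a]
fromDList f = f []
```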

I think the relevant difference between your two examples is the choice of data structure rather than the question of pure vs impure in itself.

You can have singly linked lists in both approaches, and you can have arrays with constant time update in both. In the pure case, an update to a data structure may semantically make a copy, but that does not necessarily mean that it does so physically.

Ropes, for example, are a variant of immutable arrays that allows constant-time concatenation. You can also have an immutable array abstraction with constant-time functional update (where a.set(i, x) semantically returns a copy) by using a copy-on-write scheme internally that only performs a physical copy if you actually access the old version after creating the new one (i.e. if you make use of the persistence capability of the pure data structure, which is not available in the impure case).
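The same idea can be sketched on plain lists rather than arrays: a functional update rebuilds only the spine up to the changed index, while the entire suffix is shared between the old and new versions (setAt is our own illustrative helper):

```haskell
-- Semantically returns a modified copy; physically, the suffix after
-- position i is shared with the original list, not copied.
setAt :: Int -> a -> [a] -> [a]
setAt _ _ []       = []
setAt 0 x (_ : ys) = x : ys                  -- ys is shared, not copied
setAt i x (y : ys) = y : setAt (i - 1) x ys  -- rebuild one spine cell
```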

ab·strac·tion [ab-strak-shuhn]

the act of considering something as a general quality or characteristic, apart from concrete realities, specific objects, or actual instances.

tradeoff

a balancing of factors all of which are not attainable at the same time

leaky abstractions

All non-trivial abstractions, to some degree, are leaky.

Functional programming can be seen as an abstraction, and since all abstractions leak, there are some tradeoffs to be made.

Now in this computation, xs does not change its value. So am I correct in saying that at some point the function will consume double the memory of xs, one copy for xs and one for the return value?

Not necessarily. The advantage of immutable data is that it can be shared. The new list can point to the same elements as xs, so only the spine needs new allocation; the element data itself takes no extra space, much as in the C++ case.

Or does lazy evaluation somehow make sure that xs is returned with only one element added at the end?

Laziness just means a value is evaluated only when it is actually required; it does not guarantee less memory usage. On the other hand, the thunks created by laziness might use more memory.
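The classic example of thunk buildup is summing with foldl, which accumulates a chain of unevaluated (+) applications, versus the strict foldl', which forces the accumulator at each step and runs in constant space; both produce the same result:

```haskell
import Data.List (foldl')

sumLazy :: [Int] -> Int
sumLazy = foldl (+) 0    -- builds ((0 + 1) + 2) + ... as thunks first

sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0 -- forces the accumulator at each step
```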

I know that a Monad is the way to have something which can have side effects. But can a Monad be used to simply modify the input and return its value?

You are partially right. Monads are used to compose code that has side effects. You can very well use a mutable vector and write code very similar to your C++ example in the IO monad.
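For example, using an IORef from base (Data.Vector.Mutable from the vector package would be an even closer analogue to std::vector; the names here are our own):

```haskell
import Data.IORef

-- Destructively updates the reference in place, much as the C++
-- version mutates its argument. The pure list inside is still rebuilt,
-- but the reference itself now observably changes state.
addElemIO :: IORef [Int] -> Int -> IO ()
addElemIO ref a = modifyIORef' ref (++ [a])

demo :: IO [Int]
demo = do
  ref <- newIORef [1, 2]
  addElemIO ref 3
  readIORef ref
```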

Also, does changing xs ++ [a] to a : xs consume less memory?

Yes. Prepending with a : xs does not copy the list at all: it allocates a single new cell whose tail is xs. By contrast, xs ++ [a] has to rebuild the whole spine of xs, so it uses more memory.

You can always profile your code and check its memory consumption, which is the best way to learn this.

Different languages have different common idioms, and the implementers work to make these idioms as effective as possible. I would not assume that because something entails extra overhead in C++, the same would be true in another language, just as I would not assume that the most effective idioms in another language would be the most effective in C++.

Having said that: I'm currently working on a large, high-performance application where we often return std::vector by value, and we've not found this to be a problem. Things like NRVO and move semantics mean that this can be very efficient in C++ as well.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow