Do functional programming languages disallow side effects?

https://softwareengineering.stackexchange.com/questions/369770

05-02-2021

Question

According to Wikipedia, functional programming languages, being declarative, disallow side effects. Declarative programming in general attempts to minimize or eliminate side effects.

Also according to Wikipedia, a side effect is related to state changes. In that sense, functional programming languages do eliminate side effects, since they keep no state.

But a side effect also has another definition. A function with a side effect

has an observable interaction with its calling functions or the outside world besides returning a value. For example, a particular function might modify a global variable or static variable, modify one of its arguments, raise an exception, write data to a display or file, read data, or call other side-effecting functions.

In that sense, functional programming languages do allow side effects, since there are countless examples of functions affecting the outside world: calling other functions, raising exceptions, writing to files, etc.

So, finally, do functional programming languages allow side effects or not?

Or do I misunderstand what qualifies as a "side effect", such that imperative languages allow them and declarative languages don't? From the above, as far as I can tell, no language eliminates side effects, so either I am missing something about side effects, or the Wikipedia definition is too broad.

Solution

Functional programming includes many different techniques. Some techniques are fine with side effects. But one important aspect is equational reasoning: If I call a function on the same value, I always get the same result. So I can substitute a function call with the return value, and get equivalent behaviour. This makes it easier to reason about the program, especially when debugging.

Should the function have side effects, this doesn't quite hold. The return value is not equivalent to the function call, because the return value doesn't contain the side effects.
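To make the substitution argument concrete, here is a minimal Python sketch. The names add_tax, add_tax_logged and log are made up for illustration. Both functions return the same number, but only the pure one can be replaced by its return value without changing what the program does.

```python
def add_tax(price_cents: int) -> int:
    """Pure: the result depends only on the argument."""
    return price_cents * 120 // 100


log: list[str] = []  # hypothetical global, used only to make the side effect visible


def add_tax_logged(price_cents: int) -> int:
    """Impure: same return value, but it also appends to a global list."""
    log.append(f"taxing {price_cents}")
    return price_cents * 120 // 100


# Substituting the pure call with its return value changes nothing.
total_with_calls = add_tax(1000) + add_tax(1000)
total_with_values = 1200 + 1200
assert total_with_calls == total_with_values

# The impure version yields the same total, but the calls also changed `log`.
# Replacing them with the literal 1200 would silently drop that effect.
total_logged = add_tax_logged(1000) + add_tax_logged(1000)
assert total_logged == total_with_values
assert log == ["taxing 1000", "taxing 1000"]
```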

The solution is to stop using side effects and instead encode these effects in the return value. Different languages have different effect systems. For example, Haskell uses monads to encode certain effects such as I/O or state mutation. C, C++, and Rust have type systems that can disallow mutation of some values.

In an imperative language, a print("foo") function will print something and return nothing. In a pure functional language like Haskell, a print function conceptually also takes an object representing the state of the outside world, and returns a new object representing the state after the output has been performed, something like newState = print "foo" oldState. I can create as many new states from the old state as I like, but only one will ever be used by the main function. So I need to sequence the states from multiple actions by chaining the functions. To print foo bar, I might write something like print "bar" (print "foo" originalState).
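The state-passing idea can be imitated in Python. This is only a rough sketch of the mental model, not how Haskell is actually implemented: World, pure_print and run are hypothetical names, and the "outside world" is reduced to the text printed so far.

```python
from typing import Callable, NamedTuple


class World(NamedTuple):
    """Hypothetical stand-in for the outside world: the text printed so far."""
    output: tuple[str, ...] = ()


def pure_print(text: str, world: World) -> World:
    """Return a new World in which `text` has been printed; the old one is untouched."""
    return World(world.output + (text,))


def main(world: World) -> World:
    # A state that nobody uses. Python is eager, so this value is computed,
    # but it is never returned and therefore never reaches the screen.
    unused = pure_print("never shown", world)
    # Sequencing by chaining: the state after "foo" feeds the call for "bar".
    return pure_print("bar", pure_print("foo", world))


def run(program: Callable[[World], World]) -> None:
    """The only impure part: perform the output recorded in the final state."""
    final = program(World())
    for line in final.output:
        print(line)


run(main)
```

Only the state returned by main is ever handed to run, so the order of the output is fixed by how the calls are nested rather than by the order of statements.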

If an output state is not used, Haskell doesn't perform the actions leading up to that state, because it is a lazy language. Conversely, this laziness is only possible because all effects are explicitly encoded as return values.

Note that Haskell is the only commonly used functional language that takes this route. Other functional languages, including the Lisp family, the ML family, and newer functional languages like Scala, discourage side effects but still allow them; they could be called imperative-functional languages.

Using side effects for I/O is probably fine. Often, I/O (other than logging) is only done at the outer boundary of your system. No external communication happens within your business logic. It is then possible to write the core of your software in a pure style, while still performing impure I/O in an outer shell. This also means that the core can be stateless.
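As a rough sketch of that split, assume a toy task: summing the numbers found in some input. The names summarize and shell are invented for this example. The core is a pure function on data already in memory, and the shell is the only place that touches streams.

```python
import io
from typing import TextIO


def summarize(lines: list[str]) -> str:
    """Pure core: compute a report from data already in memory."""
    amounts = [int(line) for line in lines if line.strip()]
    return f"count={len(amounts)} total={sum(amounts)}"


def shell(source: TextIO, sink: TextIO) -> None:
    """Impure shell: read the input, call the core, write the result."""
    lines = source.read().splitlines()
    sink.write(summarize(lines) + "\n")


# Usage with in-memory streams; a real program might pass sys.stdin and sys.stdout.
out = io.StringIO()
shell(io.StringIO("3\n4\n\n5\n"), out)
```

The core can be tested by calling summarize directly with a list of strings, without any files or streams involved.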

Statelessness has a number of practical advantages, such as making the system easier to reason about and easier to scale. This is very popular for web application backends. Any state is kept outside, in a shared database. This makes load balancing easy: I don't have to stick sessions to a specific server. What if I need more servers? Just add another, because it's using the same database. What if one server crashes? I can redo any pending requests on another server. Of course, there still is state: it lives in the database. But I've made it explicit and extracted it, and could use a pure functional approach internally if I want to.
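A toy illustration of why this makes load balancing easy, under heavy simplification: Database is a hypothetical in-memory stand-in for the shared database, and the two "servers" are just the same stateless handler function.

```python
class Database:
    """Hypothetical stand-in for a shared external database."""

    def __init__(self) -> None:
        self.visits: dict[str, int] = {}


def next_count(current: int) -> int:
    """Pure business rule: no I/O, no hidden state."""
    return current + 1


def handle_request(db: Database, user: str) -> str:
    """The handler keeps nothing between calls; all state lives in `db`."""
    count = next_count(db.visits.get(user, 0))
    db.visits[user] = count
    return f"{user}: visit {count}"


db = Database()
server_a = handle_request  # two "servers" sharing one database
server_b = handle_request

first = server_a(db, "ann")
second = server_b(db, "ann")  # a different server picks up where the first left off
```

Because neither server remembers anything itself, it does not matter which one handles the second request.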

OTHER TIPS

No programming language eliminates side effects. I think it's better to say that declarative languages contain (that is, confine) side effects while imperative languages do not. However, I'm not sure that any of this talk about side effects gets at the fundamental difference between the two types of languages, which seems to be what you are really seeking.

I think it helps to illustrate the difference with an example.

a = b + c

The above line of code could be written in virtually any language, so how can we determine whether we are using an imperative or a declarative language? How are the properties of that line of code different in the two classes of language?

In an imperative language (C, Java, JavaScript, &c.) that line of code merely represents a step in a process. It doesn't tell us anything about the fundamental nature of any of the values. It tells us that at the moment after this line of code (but before the next line) a will equal b plus c, but it doesn't tell us anything about a in the larger sense.

In a declarative language (Haskell, Scheme, Excel, &c.) that line of code says a whole lot more. It establishes an invariant relationship between a and the other two objects, such that it will always be the case that a is equal to b plus c. Note that I included Excel in the list of declarative languages because even if b or c changes value, the fact will still remain that a is equal to their sum.

To my mind this, not side effects or state, is what makes the two types of languages different. In an imperative language, any particular line of code tells you nothing about the overall meaning of the variables in question. In other words, a = b + c only means that for a very brief moment in time, a happened to equal the sum of b and c.

Meanwhile, in declarative languages every line of code establishes a fundamental truth that will exist throughout the entire lifetime of the program. In these languages, a = b + c tells you that no matter what happens in any other line of code a will always be equal to the sum of b and c.
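The two readings of a = b + c can be contrasted in Python. Python itself is imperative, so the declarative reading is only emulated here, using a hypothetical Sheet class whose a is recomputed on every access, roughly the way a spreadsheet cell would be.

```python
# Imperative reading: `a` is a snapshot taken at the moment the line runs.
b, c = 1, 2
a = b + c
b = 10
snapshot = a  # still 3; the later change to b does not affect a


# Declarative reading, emulated: `a` is defined as the sum and stays that way.
class Sheet:
    """Hypothetical spreadsheet-like object with cells a, b and c."""

    def __init__(self, b: int, c: int) -> None:
        self.b = b
        self.c = c

    @property
    def a(self) -> int:
        # Recomputed on every access, so a == b + c always holds.
        return self.b + self.c


sheet = Sheet(1, 2)
before = sheet.a  # 3
sheet.b = 10
after = sheet.a  # 12, because the relationship still holds
```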

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange