Purity vs Referential transparency

https://stackoverflow.com/questions/11919992

25-06-2021
|

Question

The terms do appear to be defined differently, but I've always thought of one implying the other; I can't think of any case when an expression is referentially transparent but not pure, or vice-versa.

Wikipedia maintains separate articles for these concepts and says:

From Referential transparency:

If all functions involved in the expression are pure functions, then the expression is referentially transparent. Also, some impure functions can be included in the expression if their values are discarded and their side effects are insignificant.

From Pure expressions:

Pure functions are required to construct pure expressions. [...] Pure expressions are often referred to as being referentially transparent.

I find these statements confusing. If the side effects from a so-called "impure function" are insignificant enough to allow not performing them (i.e. replace a call to such a function with its value) without materially changing the program, it's the same as if it were pure in the first place, isn't it?

Is there a simpler way to understand the differences between a pure expression and a referentially transparent one, if any? If there is a difference, an example expression that clearly demonstrates it would be appreciated.

Solution

If I gather in one place any three theorists of my acquaintance, at least two of them disagree on the meaning of the term "referential transparency." And when I was a young student, a mentor of mine gave me a paper explaining that even if you consider only the professional literature, the phrase "referentially transparent" is used to mean at least three different things. (Unfortunately that paper is somewhere in a box of reprints that have yet to be scanned. I searched Google Scholar for it but I had no success.)

I cannot inform you, but I can advise you to give up: Because even the tiny cadre of pointy-headed language theorists can't agree on what it means, the term "referentially transparent" is not useful. So don't use it.

P.S. On any topic to do with the semantics of programming languages, Wikipedia is unreliable. I have given up trying to fix it; the Wikipedian process seems to regard change and popular voting over stability and accuracy.

OTHER TIPS

All pure functions are necessarily referentially transparent. Since, by definition, they cannot access anything other than what they are passed, their result must be fully determined by their arguments.

However, it is possible to have referentially transparent functions which are not pure. I can write a function which is given an int i, then generates a random number r, subtracts r from itself and places it in s, then returns i - s. Clearly this function is impure, because it is generating random numbers. However, it is referentially transparent. In this case, the example is silly and contrived. However, in, e.g., Haskell, the id function is of type a - > a whereas my stupidId function would be of type a -> IO a indicating that it makes use of side effects. When a programmer can guarantee through means of an external proof that their function is actually referentially transparent, then they can use unsafePerformIO to strip the IO back away from the type.

I'm somewhat unsure of the answer I give here, but surely somebody will point us in some direction. :-)

"Purity" is generally considered to mean "lack of side-effects". An expression is said to be pure if its evaluation lacks side-effects. What's a side-effect then? In a purely functional language, side-effect is anything that doesn't go by the simple beta-rule (the rule that to evaluate function application is the same as to substitute actual parameter for all free occurrences of the formal parameter).

For example, in a functional language with linear (or uniqueness, this distinction shouldn't bother at this moment) types some (controlled) mutation is allowed.

So I guess we have sorted out what "purity" and "side-effects" might be.

Referential transparency (according to the Wikipedia article you cited) means that variable can be replaced by the expression it denotes (abbreviates, stands for) without changing the meaning of the program at hand (btw, this is also a hard question to tackle, and I won't attempt to do so here). So, "purity" and "referential transparency" are indeed different things: "purity" is a property of some expression roughly means "doesn't produce side-effects when executed" whereas "referential transparency" is a property relating variable and expression that it stands for and means "variable can be replaced with what it denotes".

Hopefully this helps.

These slides from one ACCU2015 talk have a great summary on the topic of referential transparency.

From one of the slides:

A language is referentially transparent if (a) every subexpression can be replaced by any other that’s equal to it in value and (b) all occurrences of an expression within a given context yield the same value.

You can have, for instance, a function that logs its computation to the program standard output (so, it won't be a pure function), but you can replace calls for this function by a similar function that doesn't log its computation. Therefore, this function have the referential transparency property. But... the above definition is about languages, not expressions, as the slides emphasize.

[...] it's the same as if it were pure in the first place, isn't it?

From the definitions we have, no, it is not.

Is there a simpler way to understand the differences between a pure expression and a referentially transparent one, if any?

Try the slides I mentioned above.

I'll quote John Mitchell Concept in programming language. He defines pure functional language has to pass declarative language test which is free from side-effects or lack of side effects.

"Within the scope of specific deceleration of x1,...,xn , all occurrence of an expression e containing only variables x1,...,xn have the same value."

In linguistics a name or noun phrase is considered referentially transparent if it may be replaced with the another noun phrase with same referent without changing the meaning of the sentence it contains.

Which in 1st case holds but in 2nd case it gets too weird.

Case 1: "I saw Walter get into his new car."

And if Walter own a Centro then we could replace that in the given sentence as:

"I saw Walter get into his Centro"

Contrary to first :

Case #2 : He was called William Rufus because of his read beard.

Rufus means somewhat red and reference was to William IV of England.

"He was called William IV because of his read beard." looks too awkward.

Traditional way to say is, a language is referentially transparent if we may replace one expression with another of equal value anywhere in the program without changing the meaning of the program.

So, referential transparency is a property of pure functional language. And if your program is free from side effects then this property will hold.

So give it up is awesome advice but get it on might also look good in this context.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow