Does the existence of rseq/seq break referential transparency? Are there alternative approaches that don't?

StackOverflow https://stackoverflow.com/questions/13673095

Question

I always thought that replacing an expression x :: () with () :: () would be one of the most basic optimizations when compiling Haskell programs. Since () has a single inhabitant, no matter what x is, its result is (). This optimization seemed to me a significant consequence of referential transparency. And we could do such an optimization for any type with just a single inhabitant.

(Update: My reasoning in this matter comes from natural deduction rules. There the unit type corresponds to truth (⊤) and we have an expansion rule "if x : ⊤ then () : ⊤". For example see this text p. 20. I assumed that it's safe to replace an expression with its expansion or contractum.)

One consequence of this optimization would be that undefined :: () would be replaced by () :: (), but I don't see this as a problem - it would simply make programs a bit lazier (and relying on undefined :: () is certainly a bad programming practice).

However today I realized that such optimization would break Control.Seq completely. Strategy is defined as

type Strategy a = a -> ()

and we have

-- | 'rseq' evaluates its argument to weak head normal form.
rseq :: Strategy a
rseq x = x `seq` ()

But rseq x :: (), so the optimization would simply discard the required evaluation of x to WHNF.

So where is the problem?

  • Does the existence of rseq and seq break referential transparency (even if we consider only terminating expressions)?
  • Or is this a flaw in the design of Strategy, and could we devise a better way to force expressions to WHNF that is compatible with such optimizations?
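The tension can be made concrete with a small sketch (the names `bad` and `forceUnit` are mine, not from the question): `seq` can observe the difference between `undefined :: ()` and `() :: ()`, which is exactly the difference the proposed rewrite would erase.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- A unit-typed value that is bottom, not ().
bad :: ()
bad = undefined

-- Forcing `bad` diverges (here: throws), while forcing `()` does not.
-- Rewriting every x :: () to () would turn the first case into the second.
forceUnit :: () -> IO (Either SomeException ())
forceUnit u = try (evaluate (u `seq` ()))
```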

Solution

Referential transparency is about equality statements and variable references. When you say x = y and your language is referentially transparent, then you can replace every occurrence of x by y (modulo scope).

If you haven't specified x = (), then you can't safely replace x by (), as in your case. You are also mistaken about the inhabitants of (): in Haskell there are two of them. One is the only constructor of (), namely (). The other is the value that is never computed; you may call it bottom or undefined:

x :: ()
x = x

You certainly can't replace any occurrence of x by () here, because that would change the semantics. The existence of bottom in the language semantics allows some awkward edge cases, especially when you have a seq combinator, with which you can even prove every monad wrong. That's why for many formal discussions we disregard the existence of bottom.

However, referential transparency does not suffer from that. Haskell is still referentially transparent and as such a purely functional language.

OTHER TIPS

Your definition of referential transparency is incorrect. Referential transparency does not mean you can replace x :: () with () :: () and everything stays the same; it means you can replace all occurrences of a variable with its definition and everything stays the same. seq and rseq are not in conflict with referential transparency if you use this definition.
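Under this definition, the following sketch (the names `y`, `viaVariable`, and `viaDefinition` are mine) shows that substituting a variable's definition for the variable preserves meaning even in the presence of seq:

```haskell
-- A unit-typed variable whose definition is bottom.
y :: ()
y = undefined

-- Replacing y by its definition changes nothing: both expressions
-- denote bottom, even though seq is involved. That is all referential
-- transparency promises.
viaVariable, viaDefinition :: Int
viaVariable   = y `seq` 1
viaDefinition = (undefined :: ()) `seq` 1
```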

seq is a red herring here.

unitseq :: () -> a -> a
unitseq x y = case x of () -> y

This has the same semantics as seq on unit, and requires no magic. In fact, seq only has something "magic" when working on things you can't pattern match on -- i.e., functions.

Replacing undefined :: () with () has the same ill effects with unitseq as with the more magical seq.
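As a quick check (the `demoDefined`/`demoBottom` wrappers are mine; unitseq is reproduced from the answer so the sketch is self-contained), unitseq forces its first argument exactly as seq would at type ():

```haskell
import Control.Exception (SomeException, evaluate, try)

-- The answer's unitseq: the case expression forces the unit argument.
unitseq :: () -> a -> a
unitseq x y = case x of () -> y

-- Forcing a defined unit succeeds...
demoDefined :: Int
demoDefined = unitseq () 42

-- ...while forcing bottom diverges, mirroring seq's behaviour.
demoBottom :: IO (Either SomeException Int)
demoBottom = try (evaluate (unitseq undefined 42))
```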

In general, we think of values not only as "what" they are, but how defined they are. From this perspective, it should be clear why we can't go implementing definedness-changing transformations willy-nilly.

Thanks all for the inspiring answers. After thinking about it more, I have come to the following views:

View 1: Considering terminating expressions only

If we restrict ourselves to a normalizing system, then seq (or rseq etc.) has no influence on the result of a program. It can change the amount of memory the program uses and increase the CPU time (by evaluating expressions we won't actually need), but the result is the same. So in this case, the rule x :: () --> () :: () is admissible - nothing is lost, CPU time and memory can be saved.
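A tiny sketch of this view (the names `withSeq`/`withoutSeq` are mine): when every expression involved terminates, adding seq can change when things are evaluated but not what is computed.

```haskell
-- In a terminating program, seq only affects evaluation order;
-- the two definitions compute the same result.
withSeq, withoutSeq :: Int
withSeq    = let x = 2 + 3 in x `seq` (x * 2)
withoutSeq = let x = 2 + 3 in x * 2
```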

View 2: Curry-Howard isomorphism

Since Haskell is Turing-complete, every type is inhabited by an infinite loop, so the logic corresponding to it under C-H is inconsistent (as pointed out by Vitus) - anything can be proved. So actually any rule is admissible in such a system. This is probably the biggest flaw in my original thinking.

View 3: Expansion rules

Natural deduction rules give us reductions and expansions for different type operators. For example for -> it's β-reduction and η-expansion, for (,) it is

fst (x, y) --reduce--> x (and similarly for snd)

x : (a,b) --expand--> (fst x, snd x)

etc. With the presence of bottom, the reduction rules are still admissible (I believe), but the expansion rules are not! In particular, for ->:

rseq (undefined :: Int -> Int) = undefined

but its η-expansion

rseq (\x -> (undefined :: Int -> Int) x) = ()

or for (,):

case undefined of (a,b) -> ()

evaluates to undefined, but its expansion

case (fst undefined, snd undefined) of (a,b) -> ()

evaluates to ()

etc. So in the same way, the expansion rule for () is not admissible.
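The η-expansion case can be observed directly (rseq is reproduced here so the sketch is self-contained; the name `bottomFun` is mine):

```haskell
import Control.Exception (SomeException, evaluate, try)

rseq :: a -> ()
rseq x = x `seq` ()

bottomFun :: Int -> Int
bottomFun = undefined

-- The original is bottom, but its η-expansion is a lambda, which is
-- already in WHNF, so rseq returns () for it: expansion changed
-- definedness.
original, expanded :: IO (Either SomeException ())
original = try (evaluate (rseq bottomFun))
expanded = try (evaluate (rseq (\x -> bottomFun x)))
```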

View 4: Lazy pattern matching

Allowing x :: () to expand to () :: () could be viewed as a special case of forcing all pattern matching on single-constructor data types to be lazy. I've never seen lazy pattern matching used on a multi-constructor data type, so I believe this change would allow us to get rid of the ~ lazy pattern matching symbol completely. However, it would make the semantics of the language lazier (see also Daniel Fischer's comment on the question), and moreover (as described in the question) it would break Strategy.
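The difference can be sketched as follows (the function names are mine); the proposed expansion rule would in effect make every single-constructor match behave like the ~ version:

```haskell
-- Ordinary case: forces its argument to WHNF before matching.
strictMatch :: () -> Int
strictMatch u = case u of () -> 1

-- Irrefutable (lazy) pattern: never forces the argument,
-- so even bottom :: () produces a result.
lazyMatch :: () -> Int
lazyMatch u = case u of ~() -> 1
```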

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow