Does the existence of rseq/seq break referential transparency? Are there alternative approaches that don't?

StackOverflow https://stackoverflow.com/questions/13673095

Question

I always thought that replacing an expression x :: () with () :: () would be one of the most basic optimizations when compiling Haskell programs. Since () has a single inhabitant, no matter what x is, its result is (). This optimization seemed to me a significant consequence of referential transparency. And we could do such an optimization for any type with just a single inhabitant.

(Update: My reasoning in this matter comes from natural deduction rules. There the unit type corresponds to truth (⊤) and we have an expansion rule "if x : ⊤ then () : ⊤". For example see this text p. 20. I assumed that it's safe to replace an expression with its expansion or contractum.)

One consequence of this optimization would be that undefined :: () would be replaced by () :: (), but I don't see this as a problem - it would simply make programs a bit lazier (and relying on undefined :: () is certainly a bad programming practice).

However today I realized that such optimization would break Control.Seq completely. Strategy is defined as

type Strategy a = a -> ()

and we have

-- | 'rseq' evaluates its argument to weak head normal form.
rseq :: Strategy a
rseq x = x `seq` ()

But rseq x :: (), so the optimization would simply discard the required evaluation of x to WHNF.

So where is the problem?

  • Does the existence of rseq and seq break referential transparency (even if we consider only terminating expressions)?
  • Or is this a flaw in the design of Strategy, and could we devise a better way to force expressions to WHNF that is compatible with such optimizations?
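The tension can be made concrete with a small sketch (the names `bad` and `forceUnit` are mine, not from the question): `seq` can observe the difference between `undefined :: ()` and `() :: ()`, which is exactly the difference the proposed rewrite would erase.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- A unit-typed value that is bottom, not ().
bad :: ()
bad = undefined

-- Forcing `bad` diverges (here: throws), while forcing `()` does not.
-- Rewriting every x :: () to () would turn the first case into the second.
forceUnit :: () -> IO (Either SomeException ())
forceUnit u = try (evaluate (u `seq` ()))
```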

Solution

Referential transparency is about equality statements and variable references. When you say x = y and your language is referentially transparent, then you can replace every occurrence of x by y (modulo scope).

If you haven't specified x = (), then you can't safely replace x by (), as in your case. You are also mistaken about the inhabitants of (): in Haskell there are two of them. One is the only constructor of (), namely (). The other is the value that is never computed; you may call it bottom or undefined:

x :: ()
x = x

You certainly can't replace any occurrence of x by () here, because that would change the semantics. The existence of bottom in the language semantics allows some awkward edge cases, especially when you have a seq combinator, with which you can even prove every monad wrong. That's why for many formal discussions we disregard the existence of bottom.

However, referential transparency does not suffer from that. Haskell is still referentially transparent and as such a purely functional language.

OTHER TIPS

Your definition of referential transparency is incorrect. Referential transparency does not mean you can replace x :: () with () :: () and everything stays the same; it means you can replace all occurrences of a variable with its definition and everything stays the same. seq and rseq are not in conflict with referential transparency if you use this definition.
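Under this definition, the following sketch (the names `y`, `viaVariable`, and `viaDefinition` are mine) shows that substituting a variable's definition for the variable preserves meaning even in the presence of seq:

```haskell
-- A unit-typed variable whose definition is bottom.
y :: ()
y = undefined

-- Replacing y by its definition changes nothing: both expressions
-- denote bottom, even though seq is involved. That is all referential
-- transparency promises.
viaVariable, viaDefinition :: Int
viaVariable   = y `seq` 1
viaDefinition = (undefined :: ()) `seq` 1
```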

seq is a red herring here.

unitseq :: () -> a -> a
unitseq x y = case x of () -> y

This has the same semantics as seq on unit, and requires no magic. In fact, seq only has something "magic" when working on things you can't pattern match on -- i.e., functions.

Replacing undefined :: () with () has the same ill effects with unitseq as with the more magical seq.
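As a quick check (the `demoDefined`/`demoBottom` wrappers are mine; unitseq is reproduced from the answer so the sketch is self-contained), unitseq forces its first argument exactly as seq would at type ():

```haskell
import Control.Exception (SomeException, evaluate, try)

-- The answer's unitseq: the case expression forces the unit argument.
unitseq :: () -> a -> a
unitseq x y = case x of () -> y

-- Forcing a defined unit succeeds...
demoDefined :: Int
demoDefined = unitseq () 42

-- ...while forcing bottom diverges, mirroring seq's behaviour.
demoBottom :: IO (Either SomeException Int)
demoBottom = try (evaluate (unitseq undefined 42))
```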

In general, we think of values not only as "what" they are, but how defined they are. From this perspective, it should be clear why we can't go implementing definedness-changing transformations willy-nilly.

Thanks all for the inspiring answers. After thinking about it more, I have come to the following views:

View 1: Considering terminating expressions only

If we restrict ourselves to a normalizing system, then seq (or rseq etc.) has no influence on the result of a program. It can change the amount of memory the program uses and increase the CPU time (by evaluating expressions we won't actually need), but the result is the same. So in this case, the rule x :: () --> () :: () is admissible - nothing is lost, CPU time and memory can be saved.
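A tiny sketch of this view (the names `withSeq`/`withoutSeq` are mine): when every expression involved terminates, adding seq can change when things are evaluated but not what is computed.

```haskell
-- In a terminating program, seq only affects evaluation order;
-- the two definitions compute the same result.
withSeq, withoutSeq :: Int
withSeq    = let x = 2 + 3 in x `seq` (x * 2)
withoutSeq = let x = 2 + 3 in x * 2
```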

View 2: Curry-Howard isomorphism

Since Haskell is Turing-complete, every type is inhabited by an infinite loop, so the logic corresponding to it under C-H is inconsistent (as pointed out by Vitus) - anything can be proved. So actually any rule is admissible in such a system. This is probably the biggest flaw in my original thinking.

View 3: Expansion rules

Natural deduction rules give us reductions and expansions for different type operators. For example for -> it's β-reduction and η-expansion, for (,) it is

fst (x, y) --reduce--> x (and similarly for snd)

x : (a,b) --expand--> (fst x, snd x)

etc. With the presence of bottom, the reduction rules are still admissible (I believe), but the expansion rules are not! In particular, for ->:

rseq (undefined :: Int -> Int) = undefined

but its η-expansion

rseq (\x -> (undefined :: Int -> Int) x) = ()

or for (,):

case undefined of (a,b) -> ()

evaluates to undefined, but its expansion

case (fst undefined, snd undefined) of (a,b) -> ()

evaluates to ()

etc. So in the same way, the expansion rule for () is not admissible.
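The η-expansion case can be observed directly (rseq is reproduced here so the sketch is self-contained; the name `bottomFun` is mine):

```haskell
import Control.Exception (SomeException, evaluate, try)

rseq :: a -> ()
rseq x = x `seq` ()

bottomFun :: Int -> Int
bottomFun = undefined

-- The original is bottom, but its η-expansion is a lambda, which is
-- already in WHNF, so rseq returns () for it: expansion changed
-- definedness.
original, expanded :: IO (Either SomeException ())
original = try (evaluate (rseq bottomFun))
expanded = try (evaluate (rseq (\x -> bottomFun x)))
```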

View 4: Lazy pattern matching

Allowing x :: () to expand to () :: () could be viewed as a special case of forcing all pattern matching on single-constructor data types to be lazy. I've never seen lazy pattern matching used on a multi-constructor data type, so I believe this change would allow us to get rid of the ~ lazy pattern matching symbol completely. However, it would make the semantics of the language lazier (see also Daniel Fischer's comment on the question), and moreover (as described in the question) it would break Strategy.
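The difference can be sketched as follows (the function names are mine); the proposed expansion rule would in effect make every single-constructor match behave like the ~ version:

```haskell
-- Ordinary case: forces its argument to WHNF before matching.
strictMatch :: () -> Int
strictMatch u = case u of () -> 1

-- Irrefutable (lazy) pattern: never forces the argument,
-- so even bottom :: () produces a result.
lazyMatch :: () -> Int
lazyMatch u = case u of ~() -> 1
```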

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow