Explanation of untyped terms

https://stackoverflow.com/questions/10360159

haskell
gadt

04-06-2021
|

Question

I have a continuation exercise from uni for my Haskell subject where I've been given the following:

data Expr = Con Value
          | And Expr Expr

data Value = IntValue  Int
           | BoolValue Bool

est :: Expr -> Val
est (Con v) = v
est (And x y) = 
  case (est x, est y) of
    (BoolValue bool1, BoolValue bool2) -> BoolValue $ bool1 && bool2
    _                    -> error "And: expected Boolean arguments"

And I'm not sure what it truly does. It seems to be an evaluator of the terms defined in Expr. Could someone please explain it to me? My exercise involves me converting that into a GADTs which I've done as:

data Expr e where
  Con :: Val -> Expr Val
  And :: Expr e -> Expr e -> Expr e

Now they're asking me to implement the following statically and make it type safe:

est :: Expr e -> e
est _ = -- implement this

Solution

I think you're on the wrong track with your GADT. To see why, we'll first look at the untyped version of Expr and its evaluator (the first part of your question).

Here are a couple of values of type Expr which you can construct:

expr1 = Con (IntValue 42)
expr2 = Con (BoolValue True)
expr3 = Con (BoolValue False)

So far, so good: expr1 represents the integer constant 42, expr2 and expr3 the boolean constants True and False. All values of type Expr with Con as their outermost constructor essentially look like this.

Things start getting interesting when we add the And constructor to the mix:

expr4 = And (Con (BoolValue True)) (Con (BoolValue True))
expr5 = And (Con (BoolValue True)) (And (Con (BoolValue False)) (Con (BoolValue False)))
expr6 = And (Con (BoolValue True)) (Con (IntValue 42))

expr4 and expr5 are fine; they represent the expressions True && True and True && (False && False), respectively. We would expect them to evaluate to True and False, but more on that soon. However, expr6 looks fishy: it represents the expression True && 42 which doesn't make sense (in Haskell, at least!).

The expressions we have seen so far, except number 6, all have a value: expr1 has the integer value 42, the rest are booleans (True, False, True, False for exprs 2 through 5). As you can see, the values are either integers or booleans and so can be represented as values of type Value.

We can write an evaluator which takes an Expr and returns its Value. In words, the evaluator should return the value of a constant expression and if the expression is a logical 'and', it should return the logical 'and' of the values of constituent expressions, which need to be boolean values - you can't take the logical and of an integer and a boolean, or two integers. In code, this translates to

est :: Expr -> Value -- takes an Expr and returns its Value
est (Con value) = value -- the value of a constant expression is the wrapped value
est (And e1 e2) = let value1 = est e1 -- the value of the first operand
                      value2 = est e2 -- the value of the second operand
                  in logicalAndValue value1 value2

logicalAndValue :: Value -> Value -> Value
logicalAndValue (BoolValue b1) (BoolValue b2) = BoolValue (b1 && b2)
logicalAndValue (BoolValue b1) (IntValue i2)  = error "Can't take the logical 'and' of a boolean and an integer" 
logicalAndValue (IntValue i1)  (BoolValue b2) = error "Can't take the logical 'and' of an integer and a boolean"
logicalAndValue (IntValue i1)  (IntValue i2)  = error "Can't take the logical 'and' of two integers"

This is just a more verbose version of the first est, with the operation of taking the logical 'and' of two evaluated expressions spun off into a separate function and slightly more informative error messages.

Okay, hopefully this answers your first question! The problem boils down to the fact that Expr values can either have a boolean or an integer value, and you can't "see" that type anymore, so it's possible to And two expressions together for which this doesn't make sense.

One way to solve this would be to split Expr into two new expression types, one having integer values and the other with boolean values. That would look something like (I'll give the equivalents of the exprs used above as well):

data IntExpr = ConInt Int
expr1 :: IntExpr
expr1 = ConInt 42

data BoolExpr = ConBool Bool
              | AndBool BoolExpr BoolExpr
expr2 :: BoolExpr
expr2 = ConBool True
expr3 = ConBool False
expr4 = AndBool (ConBool True) (ConBool True)
expr5 = AndBool (ConBool True) (AndBool (ConBool False) (ConBool False))

Two things are interesting to note: we've gotten rid of the Value type, and now it's become impossible to construct the equivalent of expr6 - this is because its obvious translation AndBool (ConBool True) (ConInt 42) will be rejected by the compiler (it might be worth trying this out), because of a type error: ConInt 42 is of type IntExpr and you can't use that as the second argument to AndBool.

The evaluator would also need two versions, one for integer expressions and one for boolean expressions; let's write them! IntExpr should have Int values, and BoolExpr should evaluate to Bools:

 evalInt :: IntExpr -> Int
 evalInt (ConInt n) = n

 evalBool :: BoolExpr -> Bool
 evalBool (ConBool b) = b
 evalBool (AndBool e1 e2) = let v1 = evalBool e1 -- v1 is a Bool
                                v2 = evalBool e2 -- v2 as well
                            in v1 && v2

As you can imagine, this is going to get tiresome fast as you add more types of expressions (like Char, lists, Double) or ways to combine expressions such as adding two numbers, building 'if' expressions or even variables where the type is not given in advance...

This is where GADTs come in! Instead of making a separate type of expression for each possible resulttype of the evaluation (IntExpr and BoolExpr above), we are going to "tag" the expression type itself with the type it will evaluate to. So we will be sure that the result of evaluating a value of type Expr Int will be an Int and that a Expr Bool will give us a Bool. In effect, we'll let the compiler do the typechecking for us, instead of doing it at runtime (as in the function logicalAndValue above).

For the moment, we have just two ways of constructing expressions: making a constant expression, and 'and'-ing two boolean values together. The first way works like this: if we have an Int, we turn it into a Expr Int, a Bool gets turned into a Expr Bool and so on. So, the type signature for the "make constant expression" constructor will be:

Con :: v -> Expr v

The second constructor takes two expressions which represent boolean values (so those two expressions are of type Expr Bool) and returns another expression with a boolean value, i.e. the type of this constructor is

And :: Expr Bool -> Expr Bool -> Expr Bool

Putting the pieces together, we get the following Expr type:

data Expr e where
    Con :: v -> Expr v
    And :: Expr Bool -> Expr Bool -> Expr Bool

Some example values:

expr1 :: Expr Int
expr1 = Con 42
expr2 :: Expr Bool
expr2 = Con True
expr3 :: Expr Bool
expr3 = Con False
expr4 :: Expr Bool
expr4 = And (Con True) (Con True)
expr5 :: Expr
expr5 = And (Con True) (And (Con False) (Con False))

Once again, the equivalent of expr6 doesn't pass the typechecker: it would be And (Con True) (Con 42), but Con 42 is a Expr Int and therefore can't be used as an argument to And - it needs to be an Expr Bool.

The evaluator becomes really easy now. Given an Expr e (remember, this means that it's an expression which has a value of type e) it returns an e. The value of a constant expression is the constant itself, and the value of the logical 'and' expression is the 'and' of the values of the operands, and we're sure that those values are Bools. This gives:

est :: Expr e -> e
est (Con v) = v
est (And e1 e2) = let b1 = est e1 -- this will be a Bool, since e1 is an Expr Bool
                      b2 = est e2 -- likewise
                  in b1 && b2

The GADT you've given is the direct equivalent of the untyped Expr, and it will still allow you to construct 'bad' values such as And (Con (BoolValue True)) (Con (IntValue 42)).

By getting rid of the 'Value' type, we can be much more precise in stating what the type of an expression is; instead of saying "the type of an expression is an integer or a boolean, but I don't know just yet" and running into problems when evaluating expressions, we make sure from the beginning that we know the type of the value of an expression and that we don't combine them in ways that don't make sense.

I hope you've made it so far - all those types, values, and expressions on various levels can get confusing! - and that you'll be able to experiment a bit with extending the Expr type and the evaluator on your own.

Simple things to try are making an expression that adds two integer values, using string or char constants, or making an 'if-then-else' expression which takes three arguments: the first of boolean type, and the second and third of the same type (but that type could be Int, Bool, Char or whatever).

OTHER TIPS

The point of using GADTs is to ensure that all expressions are well-typed by construction. This means that if you have an Expr Int, you know it's well-typed and that it evaluates to an Int.

To enforce this, you need to ensure that constant expressions are tagged with the type they contain, so that Con 0 has type Expr Int while Con True is Expr Bool.

Con :: v -> Expr v

Similarly, you need to ensure that And can only be used on expressions that evaluate to Bool.

And :: Expr Bool -> Expr Bool -> Expr Bool

This way, something like Con 0 `And` Con 1 won't even compile, since the constants have type Expr Int while And requires Expr Bool.

Once you've set this up correctly, implementing est :: Expr e -> e should be a trivial exercise.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow