Strange type inference behaviour with GADT type (for fixed length vectors)
Frage
I have the following GADT
data Vec n a where
T :: Vec VZero a
(:.) :: (VNat n) => a -> (Vec n a) -> (Vec (VSucc n) a)
to model fixed length vectors, using the class
class VNat n
data VZero
instance VNat VZero
data VSucc n
instance (VNat n) => VNat (VSucc n)
I tried to programm an append-function on vectors:
vAppend :: Vec n b -> Vec m b -> Vec nm b
vAppend T T = T -- nonsense, -- its just a minimal def for testing purposes
The type checker doesnt like it:
Could not deduce (nm ~ VZero)
from the context (n ~ VZero)
bound by a pattern with constructor
T :: forall a. Vec VZero a,
in an equation for `vAppend'
at VArrow.hs:20:9
or from (m ~ VZero)
bound by a pattern with constructor
T :: forall a. Vec VZero a,
in an equation for `vAppend'
at VArrow.hs:20:11
`nm' is a rigid type variable bound by
the type signature for vAppend :: Vec n b -> Vec m b -> Vec nm b
at VArrow.hs:20:1
Expected type: Vec nm b
Actual type: Vec VZero b
In the expression: T
In an equation for `vAppend': vAppend T T = T
Failed, modules loaded: Vectors.
Can anyone explain GHCs problems with the type variable nm
? And what exactly means the ~
in the error message?
Lösung
As it stands you are saying you can get a vector of any length by appending any two vectors. If you scrap the signature ghc
infers that vAppend
is supposed to yield vector of length VZero given any two vectors -- that makes sense but isn't what you want. You need a Plus
type associated with your VNat
s to constrain the result type of vAppend
on vectors. The natural way would be a type family of some sort, but I couldn't get it under the VNat
class. In any case, this whole sort of idea is much more clearly realized with the promoted types of the DataKinds
extension (in ghc-7.4 and later) -- though maybe you were deliberately trying to avoid that extension? This gets rid of the obnoxious unclosed character of VSucc
, which admits VSucc Char
etc. If you weren't trying to avoid DataKinds
, then your module might look something like this:
{-#LANGUAGE GADTs, TypeFamilies, DataKinds, TypeOperators#-}
data Vec n a where -- or: data Vec :: VNat -> * -> * where
T :: Vec VZero a
(:.) :: a -> Vec n a -> Vec (VSucc n) a
-- no class constraint
data VNat = VZero | VSucc VNat -- standard naturals
type family n :+ m :: VNat -- note the kind of a ":+" is a promoted VNat
type instance VZero :+ n = n
type instance VSucc n :+ m = VSucc (n :+ m)
vAppend :: Vec n b -> Vec m b -> Vec (n :+ m) b
vAppend T v = v
vAppend (a :. u) v = a :. (vAppend u v)
Edit: It occurs to me, looking at this, that the line for the Plus
- or :+
- type family should have explicitly constrained the kinds of the arguments
type family (n::VNat) :+ (m::VNat) :: VNat
to keep out garbage types like Char :+ VZero
and so on -- that is, on the same principle used to prefer DataKinds
to types like data VSucc n
. Also, then we can see that the two instances specify it completely, though I don't know how much use the compiler can make of this.
Andere Tipps
All type variables in a type signature are universally quantified. This means that if you say the function has type
Vec n b -> Vec m b -> Vec nm b
then for any choice of n
, m
, nm
and b
, this type must be valid. In particular, for example,
Vec VZero Int -> Vec VZero Int -> Vec (VSucc VZero) Int
must also be a valid type of your function. However, appending two vectors does not have such a general type. There are constraints on nm
, namely that nm
is the sum of the type-level numbers n
and m
. You have to express these constraints in the type of the function, otherwise you will get type errors.
In your case, GHC complains that in your definition, nm
is actually VZero
, so you are making assumptions about nm
that your type indicates you are not allowed to make. The ~
is just GHC's symbol for type equality.
When writing a function by pattern matching on values of a GADT, GHC uses information about the expected behaviour at runtime of your function when type checking each clause. Your vAppend
function has only one clause, that pattern matches a value of type Vec n b
and a another value of type Vec m b
. GHC reasons that if at runtime vAppend
is applied to actual arguments that both match against the pattern T
, then the actual type of the actual arguments must be of the form Vec VZero b
, which is a more informative type than just Vec n b
or Vec m b
. The way this reasoning is implemented in GHC is that when type checking the RHS of the unique clause of vAppend
, it records an assumption that n
must surely actually be VZero
, written n ~ VZero
, and likewise m ~ VZero
.
The type you write for a function sets out the contract that it must fulfill. The error message that you are getting is because the contract that must be fulfilled by the implementation of vAppend
is much too general. You are saying that given two vectors of some lengths n
and m
, vAppend
must produce a vector that can be considered to be of any size. Indeed, GHC notices that your implementation does not fulfill this contract, because the type of your RHS, Vec VZero b
, does not match the expected type of the RHS, Vec nm b
, and there is no assumption saying the nm ~ VZero
. Indeed the only assumptions available, GHC tells us, are the ones from the previous paragraph.
The only possible way you could fulfill this contract is by writing undefined
as the RHS of your clause. That is obviously not what you want. The trick to getting the type right for vAppend
is to somehow relate the size of the output vector to the respective size of the two input vectors. That might go like this:
type family Plus n m
type instance Plus VZero m = m
type instance Plus (VSucc n) m = VSucc (Plus n m)
vAppend :: Vec n b -> Vec m b -> Vec (Plus n m) b
vAppend T T = T
What we have done here is said that that the length is determined by the lengths of the inputs to vAppend
, through some function on types called Plus
. In the case where both input lengths are VZero
, then we know that Plus n m
is the same as Plus VZero VZero
since n ~ VZero
and m ~ VZero
. Since Plus VZero VZero
is of the shape of the first type family instance, GHC knows that it is the same as VZero
. Therefore in this branch GHC expects a RHS of type Vec VZero b
, which we can readily construct!