Hindley-Milner algorithm: using types to ensure bindings are applied

https://stackoverflow.com//questions/9716301

16-12-2019
|

Question

I am implementing the Hindley-Milner type inference algorithm, following the tutorials of Mark Jones and Oleg Kiselyov. Both of these have an "apply bindings" operation with a type roughly of the form

applyBindings :: TyEnv -> Type -> Type

which applies the tyvar -> ty bindings in TyEnv to the given Type. I have found it a common mistake in my code to forget to call applyBindings, and I get no help from Haskell's type system, since ty has the same type as applyBindings tyenv ty. I am looking for a way to enforce the following invariant in the type system:

when doing type inference, bindings must be applied before returning a 'final' result

When doing type inference for a monomorphic object language, there is a natural way to enforce this, as implemented in wren ng thornton's unification-fd package: we define two datatypes for Types:

-- | Types not containing unification variables
type Type = ...          -- (Fix TypeF) in wren's package

-- | Types possibly containing unification variables
type MutType = ...       -- (MutTerm IntVar TypeF) in wren's package

and give applyBindings the type

-- | Apply all bindings, returning Nothing if there are still free variables
-- otherwise just
applyBindings :: TyEnv -> MutType -> Maybe Type

(this function is actually freeze . applyBindings in unification-fd). This enforces our invariant - if we forget to applyBindings, then we will get a type error.

This is the kind of solution I am looking for, but for object languages with polymorphism. The above approach, as it stands, doesn't apply, since our object-language types may have type variables -- indeed, if there variables free after applying bindings, we don't want to return Nothing, but we we want to generalise over these variables.

Is there a solution along the lines I describe, i.e. one which gives applyBindings a different type from const id? Do real compilers use the same punning (between unification variables and object-language type variables) that Mark's and Oleg's tutorials do?

Solution

I'm taking a stab in the dark here, because I think there may be other problems with the solution you propose, but I can address at least one difficulty:

Your type checker should have different representations for unification type variables and object-language type variables.

This variation is not difficult to implement, and in fact I think the GHC type checker worked this way, at least at one time. You might want to check the paper Practical Type Inference for Arbitrary-Rank Types; the appendix contains a lot of very helpful code.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow