Type inference with overloading

https://cs.stackexchange.com/questions/123894

29-09-2020
|

Question

I am working on a type system supporting overloading. I have a rough idea of how type inference is usually implemented in such a scenario, but I am wondering how - after type inference is completed - the correct implementation of an overloaded operator can be chosen. Or, in other words, how the inferred type can be passed back down the syntax tree to the operator.

For a small example, consider the expression (x + y) + 1 where x :: N | S, y :: a, + :: (N -> N -> N) | (S -> S -> S), 1 :: N. :: stands for type of, and a | b stands for type a or type b.

The way, I assume, type inference would now work, is to traverse the syntax tree, and for each node return a type constraint:

(x + y) + 1 => ((N & (N[a=N] | S[a=S])), (N & N) -> N) | ((S & (N[a=N] | S[a=S])), (S & N) -> S) => N[a=N]
  1         => N
  +         => (N -> N -> N) | (S -> S -> S)
  x + y     => ((N & (N | S)), (N & a) -> N) | ((S & (N | S)), (S & a) -> S) => N[a=N] | S[a=S]
    x       => N | S
    y       => a
    +       => (N -> N -> N) | (S -> S -> S)

a & b in this example stands for unifying the types a and b, [a=T, b=U] is a set of equality constraints for type variables.

As expected the return type of the given expression is inferred as N[a=N], that is N where the type variable a is expected to be N.

Therefore, of the two provided implementations for the + operator (N -> N -> N, S -> S -> S), N -> N -> N should be used. In the given example, the resulting type is inferred, but not the type of the overloaded operator. My question is if there is a common pattern that is used to inform the + node in the syntax tree of the used implementation.

Solution

You could organize type inference as follows.

Suppose your input syntax has type Input. Define output syntax Output to be like Input but with explicit type annotations on all variables. The type inference would have type

infer : Input -> List (Output * Type)

That is, given some input e, it returns a list of possible answers. Each answer is a pair (e', t) where e' is e with variables annotated by types, and t is the inferred type of e.

You can view this as all happenning in the nondeterminism monad. Whenever you need to infer the type of a variable x, you look up its possible types S | T | ... and branch on each one of them. This way you do not have to "pass back" any information to sub-expressions. Instead, each sub-expressions already comes annotated, in all possible ways.

Licensed under: CC-BY-SA with attribution

Not affiliated with cs.stackexchange