Question

Wikipedia says:

fun factorial n = 
    if n = 0 then 1 else n * factorial (n-1) 

A Standard ML compiler is required to infer the static type int -> int of this function without user-supplied type annotations. I.e., it has to deduce that n is only used with integer expressions, and must therefore itself be an integer, and that all value-producing expressions within the function return integers.

I don't understand how a compiler could infer this. It sounds like SML is essentially solving the halting problem for the factorial function, and showing that it only halts on positive integer inputs.

Am I missing something?


Solution

The classic Hindley-Milner type inference algorithm requires each literal value to have an unambiguous type: 0 is an int, while 0.0 is a real. Haskell's type system extends this with a form of bounded polymorphism, so that 0 can have any numeric type (more precisely, any type that is an instance of the Num type class: 0 has type Num a => a).
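In a toy Hindley-Milner style checker, this shows up as literal typing being a fixed case analysis rather than anything polymorphic. A minimal Python sketch (the function name and type names are hypothetical, chosen to mirror SML's int and real):

```python
def type_of_literal(lit):
    """Assign each literal its one unambiguous type, as classic
    Hindley-Milner requires (no Num-style overloading)."""
    if isinstance(lit, bool):   # check bool before int: bool subclasses int in Python
        return "bool"
    if isinstance(lit, int):
        return "int"
    if isinstance(lit, float):
        return "real"
    raise TypeError("unknown literal")
```

Here type_of_literal(0) yields "int" while type_of_literal(0.0) yields "real"; there is no single type that covers both, which is exactly the ambiguity Haskell's Num class removes.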

See the Wikipedia article on Hindley-Milner for further details on the algorithm (its explanations are much better than anything I could write on that topic).

OTHER TIPS

I think it is important to realise what the type $\mathtt{int \rightarrow int}$ means and does not mean.

It does not mean that a program P having that type must return an integer, contrary to what the original poster assumed. Instead, it means that if P terminates on a given input (of type int), then P terminates by returning an integer. This way the halting problem is not an issue.
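A rough analogy in Python, reading the annotation as the static type (the deliberately guard-free loop is the hypothetical part):

```python
def factorial(n: int) -> int:
    # The "int -> int" type promises only that *if* this call returns,
    # it returns an int. For n < 0 the loop below never terminates,
    # yet the type remains correct: partial correctness, not totality.
    result = 1
    while n != 0:
        result *= n
        n -= 1
    return result
```

For non-negative inputs the function returns an int, as the type says; for negative inputs it diverges, and the type makes no claim about that at all.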

In addition, well-typing guarantees that programs don't suffer run-time type errors even when not terminating.

The original Damas–Milner paper, Principal type-schemes for functional programs, is very readable. At six pages, it is also short.

Finally, it is highly instructive, and not much effort, to implement a type-inferencer for a toy language. I recommend doing this to really understand how types work in programming languages.

Working with types is not as "hard" as working with values with respect to computability. You can know statically that a function will return a bool without knowing statically that it will return true, for example. Moreover, it's entirely possible and sometimes trivially easy to prove that a specific given function halts with a certain input, but this is a different problem than writing a general algorithm that determines whether or not any other given function will halt on an input.
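For instance, a trivial Python sketch of that distinction:

```python
def is_even(n: int) -> bool:
    # Statically we know the result is a bool; *which* bool it is
    # depends on the runtime value of n, which types do not track.
    return n % 2 == 0
```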

With respect to how type inference works, let me recommend some sources which may help:

  1. Unification on Wikipedia
  2. Type inference on Wikipedia
  3. Types and Programming Languages by Benjamin C. Pierce (chapter 22, especially)
  4. Hindley-Milner type system on Wikipedia

The type inference algorithm happens essentially in two phases: constraint generation, and unification. During constraint generation, the AST of the program is traversed and a set of type constraints is generated.

For example, if somewhere in the program there is the application expression myfun (3), the inference algorithm generates the constraint that myfun has type $\tt{int} \rightarrow \alpha$. If in another part of the program the expression 4 + myfun (y) is encountered, the constraint that myfun has type $\beta \rightarrow \tt{int}$ is generated. During the unification step, such constraints are (successfully or unsuccessfully) unified. At this point the set of constraints might include

  • $ \tt{myfun} : \tt{int} \rightarrow \alpha$
  • $ \tt{myfun} : \beta \rightarrow \tt{int}$
  • $ \tt{y} : \tt{int}$
  • $ \tt{y} : \beta$

From this set (really system) of constraints, we could infer that $\alpha = \tt{int}$ and $\beta = \tt{int}$, and everything would be consistent and the program would type check. It's possible for unification to fail, though, depending on what else is in the set of constraints.
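The unification step on the constraints above can be sketched directly. A minimal Python implementation of first-order unification, using a substitution map and omitting the occurs check for brevity; type variables are written 'a, 'b:

```python
def is_var(t):
    """Type variables are strings beginning with an apostrophe, SML-style."""
    return isinstance(t, str) and t.startswith("'")

def resolve(t, subst):
    """Follow variable bindings in the substitution to a representative."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def unify(t1, t2, subst):
    """Unify two type terms under subst. Types are either base types
    like "int", type variables like "'a", or arrows ("->", arg, res).
    (No occurs check -- a real implementation needs one.)"""
    t1, t2 = resolve(t1, subst), resolve(t2, subst)
    if t1 == t2:
        return subst
    if is_var(t1):
        return {**subst, t1: t2}
    if is_var(t2):
        return {**subst, t2: t1}
    if isinstance(t1, tuple) and isinstance(t2, tuple) and t1[0] == t2[0] == "->":
        subst = unify(t1[1], t2[1], subst)   # unify argument types
        return unify(t1[2], t2[2], subst)    # then result types
    raise TypeError(f"cannot unify {t1} with {t2}")

# The constraints from the text: myfun : int -> 'a and myfun : 'b -> int,
# plus y : int and y : 'b.
s = {}
s = unify(("->", "int", "'a"), ("->", "'b", "int"), s)  # the two types of myfun
s = unify("int", "'b", s)                               # the two types of y
```

After unification, resolve("'a", s) and resolve("'b", s) both come out as "int", matching the conclusion above: myfun ends up with type int → int. Passing an inconsistent pair, say unify("int", ("->", "'a", "'b"), s), raises instead, which is what a failed type check looks like in this setting.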

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange