Question

I am looking into the Scheme compiler Stalin. It is big and complex. Also, if I understood correctly, the author was planning to write a series of papers detailing aspects of the implementation, but never got around to doing that.

The aspect of Stalin I am interested in is global type inference: deducing the types of things based on their usage in other places in the program. Does Stalin indeed do that? If yes, how, and where in its code base? Does it use a variant / extension of a Hindley-Milner algorithm?

Was it helpful?

Solution

From the README:

Stalin does global static type analysis using a soft type system that supports recursive union types. Stalin can determine a narrow or even monomorphic type for each source code expression in arbitrary Scheme programs with no type declarations. This allows Stalin to reduce, or often eliminate, run-time type checking and dispatching. Stalin also does low-level representation selection on a per-expression basis. This allows the use of unboxed base machine data representations for all monomorphic types resulting in extremely high-performance numeric code.

From a 1997 talk by Siskind:

Stalin performs type inference using set-based analysis (SBA aka 0CFA). This analysis is augmented to support polyvariant procedure splitting. The results of SBA are used to reduce run-time type checking and dispatching. The results of SBA are also used to do low-level representation selection on a per-expression basis. This has two benefits. First, type tags can be eliminated for monotypes, allowing the use of base machine representations for primitive data. Second, boxing can be eliminated, alleviating the costs of indirection, allocation, and reclamation associated with boxing. Eliminating boxing requires that the run-time organization allow variables, parameters, structure slots, and vector slots to have different widths depending on the type of data that they hold. Furthermore, user-defined structures can be unboxed only if those structures are immutable. SBA is extended to determine data widths and mutability at compile time.

The actual type inference algorithm seems to be mostly implemented in the source file source/stalin3b.sc.

It seems that SBA/0CFA is an entirely separate algorithm to Hindley-Milner. However, Hindley-Milner can also be used to implement soft typing.

Here's a nicer description of the 0CFA algorithm.

Relevant papers are Olin Shivers' 1991 PhD thesis Control-flow Analysis of Higher-Order Languages or Taming Lambda and Flanagan & Felleisen's 1995 paper Set-Based Analysis for Full Scheme and Its Use in Soft-Typing.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top