According to the MLton documentation:
Standard ML requires types to be defined before they are used. [link]
Not all implementations enforce this requirement (for example, SML/NJ doesn't), but the above-linked page makes a good case for why it might be needed for soundness (depending on how the implementation handles the value restriction), and it accords with some of the commentary in the Definition:
Although not assumed in our definitions, it is intended that every context C = T, U, E has the property that tynames E ⊆ T. Thus T may be thought of, loosely, as containing all type names which "have been generated". […] Of course, remarks about what "has been generated" are not precise in terms of the semantic rules. But the following precise result may easily be demonstrated:
Let S be a sentence T, U, E ⊢ phrase ⇒ A such that tynames E ⊆ T, and let S′ be a sentence T′, U′, E′ ⊢ phrase′ ⇒ A′ occurring in a proof of S; then also tynames E′ ⊆ T′.
[page 21]
But I'm doubly confused by this.
Firstly — the above theorem seems backward. If I correctly understand the phrase "occurring in a proof of S", then this seems to mean (by contrapositive) "once you have a context that violates the intention that tynames E ⊆ T, all subsequent contexts will also violate that intention". Even if that's true, it seems that it would be much more useful and meaningful to assert the converse, namely, "if all contexts so far conform to the intention that tynames E ⊆ T, then any subsequently inferable context will also conform to that intention". No?
Secondly — neither MLton's statement nor the Definition's statement actually seems to be supported by the inference rules (or the "Further Restrictions" that follow them). A few inference rules have "tynames τ ⊆ T of C" or "tynames VE ⊆ T of C" as a side-condition, but none of those rules is needed for this program (given in the above-linked documentation):
val r = ref NONE
datatype t = A | B
val () = r := SOME A
(Specifically: rule (4) has to do with let
, rule (14) with =>
, and rule (26) with rec
. None of those is used in this program.)
And coming at it from the other direction, rule (17), which covers datatype
declarations, requires only that the generated type names not be in T of C; so it doesn't prevent the generation of a type name used in the existing value environment (except insofar as it's already true that tynames VE ⊆ T of C).
I feel like I'm probably missing something pretty basic here, but I have no idea what it could be!