Excessive explanation, part twelve
Last time we saw how we can make a substitution for a free variable to produce an instance of a type scheme. This time we'll look at a much more powerful form of substitution.
By contrast a type-scheme If = aI1...Im I has a "generic instance"If' = aI^21...I^2n I' if I' = [Ii/Ii]I for some types I1,...,Im and the I^2j are not free in If.
This definitely needs an example.
Suppose we have the identity function Ix.x. Its type scheme is aI I a' I. Now we might want to perform a substitution here of int for I. But we've only defined substitution on free variables, and we need to make a substitution on bound variables, to eliminate the bound variable entirely.
This mechanism of "generic instantiation" says: a type scheme consists of zero or more quantifiers and a type. Take just the type and perform a substitution on it, and then put as many quantifiers as you like before the substituted type, provided that the new quantifiers are not turning free variables into bound variables.
Let's try it. Extract the type from the type scheme: I a' I. We perform our substitution [int/I] on that type to get int a' int. And then we put zero or more quantifiers on top of that. Let's put zero quantifiers on top of it. So we have the original type scheme aI I a' I - that is, for all I, there's a function from I to I. Now we want a specific generic instance of that, and it is int a' int.
This is in a sense the sort of generic instantiation that you're used to from C#. Notice that we've eliminated not just the type variable, but the quantifier on top of the type scheme.
Now let's look at a more complicated example.
Suppose we have some type scheme for some function, let's say aI^2aI^3 I^2 a' I^3. And let's suppose we pass that thing to the identity function. The type scheme of the identity function is aI I a' I: the quantifier says that this works for any I we care to name, as long as the type provided for I is a type, not a type scheme; remember, so far we've only described type substitution by saying that we're going to replace a free variable with a type, but what we have is a type scheme with two bound type variables: aI^2aI^3 I^2 a' I^3.
How do we deal with this? We'll do our generic instantiation process again. We have type schemes aI I a' I and aI^2aI^3 I^2 a' I^3. Take just the type of the first without the quantifiers: I a' I. Now make the substitution with the type of the second without the quantifiers: [I^2 a' I^3 / I] to get the type (I^2 a' I^3) a' (I^2 a' I^3). Then slap a couple of quantifiers on top of that: aI^2aI^3 (I^2 a' I^3) a' (I^2 a' I^3).
We've produced an entirely new type scheme, but clearly this type scheme has the same "fundamental structure" as the type scheme for the identity function, even though this type scheme does not have the type variable of the original scheme, or even the same number of type variables; it's got one more.
Why did we have to say that all of the introduced type variables must be "not free"? Well, let's see what happens if we violate that constraint. Suppose we have type scheme aI I^2 a' I. Clearly I^2 is free.
We extract the type: I^2 a' I. We perform a substitution [int/I] to get I^2 a' int. And then we put a quantifier on top: aI^2 I^2 a' int. But that turns I^2 from a free variable into a bound variable, so this is not a generic instance of a type scheme with a free variable I^2. Thus when creating generic instances we are restricted to introduce quantifiers only on non-free variables.
In this case we shall write If > If'.
Conceptually this is pretty straightforward. Given two type schemes, possibly one is a generic instance of the other. If a schema If' is a generic instance of a scheme If then we say that If' is smaller than If. This should make some sense given our examples so far. If you think of schemes as patterns that can be matched, the pattern aI I a' I is way more general than the pattern int a' int, so we say that the latter is "smaller" - less general - than the former.
It is slightly problematic that the authors chose the "greater than" symbol instead of, say, the "greater than or equal" symbol. Why? Well consider aI I a' I and aI^2 I^2 a' I^2. It should not be hard to see that both of these are generic instances of each other, and therefore both are "smaller" than the other, which is silly. Plainly they are equally general!
Note that instantiation acts on free variables, while genericinstantiation acts on bound variables.
This line should now be clear.
It follows that If > If' implies S If > S If'.
Let's look at a quick example just to make sure that's clear. Suppose we have If = aI I a' I^2 and If' = int a' I^2. Clearly If > If'. If we now make the substitution of string for I^2 in both, then we have aI I a' string which is still more general than int a' string; clearly the greater-than relationship is preserved by substitution of free variables.
And that's it for section 3! Next time, we'll look at the formal semantics of the Exp language.