Excessive explanation, part ten

ericlippert

from Fabulous adventures in coding on 2017-01-12 14:40 (#28MCJ)

In this short section the paper gives a somewhat formal definition of what we mean by generic type substitution. You are of course familiar with the basic idea in C#: if you have

delegate T MyDelegate<T, U>(T t, List<U> us);

then we can make a substitution of int for T and double for U to get the type MyDelegate<int, double> which represents a method with signature

int MyDelegate(int t, List<double> us);

Let's see how the paper defines this idea. There are some subtleties that are not readily apparent in C#!

I'm just going to quote the entirety of section 3; it is only two paragraphs. But we'll need to take this thing apart word by word to understand it.

3 Type instantiationIf S is a substitution of types for type variables, often written[I₁/I₁, ... ,I_n/I_n] or [I_i/I_i] and If is a type-scheme, then S If is the type-scheme obtained by replacing each free occurrence of I_i in If by I_i, renaming the generic variables of If if necessary. Then S If is called an "instance" of If; the notions of substitution and instance extend naturally to larger syntactic constructs containing type-schemes.By contrast a type-scheme If = aI₁...I_m I has a "generic instance"If' = aI^2₁...I^2_n I' if I' = [I_i/I_i]I for some types I₁,...,I_m and the I^2_j are not free in If. In this case we shall write If > If'. Note that instantiation acts on free variables, while genericinstantiation acts on bound variables. It follows that If > If'implies S If > S If'.

Just" wow. We already know what type variables and type schemes are, but now we have type instances somehow distinct from generic type instances, and free variables somehow distinct from bound variables. And an ordering relation on type schemes has just been introduced.

Let's start by saying what we mean by "free" and "bound" variables in the Exp language, and then say what we mean by free and bound variables in the type language.

Basically, by "bound" we mean that a variable is declared somewhere in the expression, and by "free" we mean it is used but not declared. So for example, in the expression x y, both x and y are free variables. The variable q which does not appear, is not considered free in x y, but nor is it bound.

In the expression let x = q in x y, x is no longer a free variable; the name "x" is associated with a particular syntactic location: the first operand of the "let". In this larger expression, q and y are free, as both are used but not defined.

Let's now connect this with the type grammar. Suppose we have expression Ix.Iy.x. First off, let's make sure that we understand what this is. It's a function; all functions take a value and return a value. The outer function takes a value x and returns a function that takes a value y, ignores it, and returns x.

What is the type scheme of this expression? x and y can be any type, and they need not be the same. So let's make two type variables and quantify both of them. The type scheme of this expression is aI aI^2 I a' (I^2 a' I). That is, it's a function that takes any type alpha, and returns a function that takes any type beta and returns an alpha.

So now maybe it is clear what "bound" means in type schemes. In the type I a' (I^2 a' I) found inside of that type scheme, both I and I^2 are bound; they have quantifiers that introduce them. But what is "free"? Well, what's the type scheme of the sub-expression Iy.x found inside Ix.Iy.x? It's not aI aI^2 I^2 a' I. Why not? Because x cannot take on any type whatsoever; it has to take on the type of the expression that was passed to the outer lambda! The type scheme of the inner lambda is
aI^2 I^2 a' I, and I is a free type variable from the perspective of the inner lambda. A type variable I^3 which does not appear anywhere would be considered to be neither free nor bound in these expressions.

Maybe looking at this example more explicitly will help. An equivalent but more verbose expression is

let outer = Ix.let inner = Iy.x in inner in outer

If you're finding this terse syntax hard to read, in C# this would be (ignoring that you can't use var for lambdas!)

var outer = x => { var inner = y => x; return inner;};

So what is the type scheme of outer? aI aI^2 I a' (I^2 a' I). Both type variables are bound. What is the type scheme of inner? aI^2 I^2 a' I. I is free, I^2 is bound.

In short, "free" really means "the meaning of this name will be provided by some outer context, not here", and "bound" means "this name is given its meaning somewhere in here", regardless of whether we're talking about identifiers in the Exp language or type variables in the type language.

Apropos of which: if you're familiar with the C# specification for lambda expressions, in there we refer to the "free" variables of a lambda as the "outer" variables, which I think is a little more clear.

Next time, we'll actually look at substitution, I promise.

Source	RSS or Atom Feed
Feed Location	http://ericlippert.com/feed
Feed Title	Fabulous adventures in coding
Feed Link	https://ericlippert.com/