Excessive explanation, part thirteen
So far we've gotten through the introduction, the grammar of the Exp language, the grammar of the type description language, and what it means to make substitutions in type schemes. For the next few episodes we'll consider the semantics of the Exp language.
Having the grammar of the language is enough to be able to parse it, but ultimately computer programs are intended to be executed. Programs manipulate values, and so we should be able to say formally what the rules are for values and how different program elements compute them.
4 SemanticsThe semantic domain V for Exp is a complete partial order satisfying the following equations up to isomorphism, where Bi is a cpo corresponding to primitive type I^1i :V = B0 + B1 + ... + F + W (disjoint sum)F = V a' V (function space)W = {} (error element)
HOLY GOODNESS that is a lot of jargon with no explanation.
I'm not going to go into too much detail here as formal semantic analysis using domain theory is not my strong suit. But we can unpack some of this jargon.
First off, by "semantic domain V" we pretty much mean "what are all the possible values that a variable can have?" This might not be as straightforward as you think! Consider C#:
class C<T> { int x; T y;}
It's pretty clear that the values of variable x can be any of the four billion possible ints. But what are the possible values of y? T is, from the perspective of the compiler, a perfectly legal type for y. We can now come to two incompatible conclusions. First: T can be any type, so y can have any value, including, say, 123. Second: there are no values of type T - please, try to find a running program that has a T stored in y! - so since y is of type T, there can be no values for y.
Plainly at most one of those statements can be true. The second conclusion is unsound but it is tricky to try to figure out where exactly the reasoning goes wrong.
Anyways, my point is that in a language with generics we need to be careful to describe exactly what are the possible values are that can be manipulated by the program. The "semantic domain" that we're calling "V" here is the set of all values that can be manipulated by the Exp language.
But we say more than just that it is a set; we say that it is a "complete partial order". What on earth is a cpo, and why do we care?
Long-time readers of this blog already know what a "total ordering" is. It's a bunch of things, plus a relation on them that takes any two things A and B, and decides consistently whether A is larger than, smaller than, or equal to B. Total orders are necessary for sorting algorithms to work properly. A total order has all kinds of restrictions - things equal to the same are equal to each other, if A is smaller than B and B is smaller than C then A is smaller than C, A and B are never both smaller than each other, and so on. But the easiest way to think about a total order is that given any two elements, they are either equal, or one is bigger than the other.
A partial order relaxes this requirement. In a partial order, given any two elements they are equal, or one is bigger than the other, or they are unordered with respect to each other. "Unordered" and "equal" are very different!
Consider assignment compatibility in C# for example. We can make a partial order on types based on assignment compatibility.
An expression of type int is assignable to a variable of type long, so int is "less than" long. Similarly int is "less than" object. A string is assignable to object, so string is "less than" object. But ints and strings are not assignable to each other, and they certainly are not equal! They are neither bigger, nor smaller, nor equal to each other as far as the "assignment compatibility" relationship is concerned. Assignment compatibility is a partial order.
A set that comes with a partial order is called a "poset".
A "complete" partial order is a partial order with an additional property which would be somewhat tedious for me to define, so I'm going to skip it. In fact, all of this is a bit of a digression. There are technical reasons why the study of programming language value domains needs complete partial orders. Recall that we mentioned fixed-point analysis as it applied to recursive functions a while back. Suffice to say, the mathematics of analyzing recursive functions, function application, and so on, requires that there be a partial order.
So what are we really saying here? We're saying that the values that can be manipulated by an Exp program are:
- Any value of one of the "primitive" types
- A function that takes a value and returns a value
- A value representing "this program is an error".
Frankly it is not 100% clear to me why we need the error element. It is never mentioned again in the paper! Anyone who has more background in domain theory who wants to clue me in here, please leave a comment.
All right, we have come to the not entirely surprising conclusion that the values manipulated in a functional language are primitive values and functions of one parameter.
Next time: typing judgments!