Fixing Random, part 20
Without further ado, here's my proposed stripped-down C# that could be a DSL for probabilistic workflows; as we'll see, it is quite similar to both enumerator blocks from C# 2 and async/await from C# 5. (Code for this episode can be found here.)
- The workflow must be a function with return type of IDiscreteDistribution<T> for some T.
- Just as methods that use await must be marked async, our DSL methods must be marked probabilistic.
- The function has an ordinary function body: a block of statements.
Obviously I am using C# 5 asynchronous workflows as an example of how to add a new mechanism to C#. The statements and the expressions in a probabilistic method are restricted as follows:
- Declaration statements must have an initializer.
- All the locals must have unique names throughout the method.
- Assignments can only be in declarations or as statements; no use of assignments in the middle of an expression is allowed.
- No compound assignments, increments or decrements.
- No await, yield, try, throw, switch, while, do, break, continue, goto, fixed, lock, using or labeled statements allowed. What's left? if and return statements are fine, as are blocks { }. I told you I was stripping things down!
- No lambdas, anonymous functions or query comprehensions.
- No use of in, out or ref.
- All the other perfectly normal operations are allowed - function calls, object initializers, member access, indexers, arithmetic, that's all just fine.
- However, all function calls and whatnot must be pure; there can be no side effects, and nothing should throw.
Basically what I'm doing here is making a little super-simple subset of C# that doesn't have any of the weird, hard stuff that keep compiler writers busy. In this pleasant world locals are always assigned, there are no closures, there are no worries about what happens when an exception is thrown, and all that sort of stuff.
This is pretty draconian, I know. We'll sketch out how to soften some of those restrictions in later episodes. I want to show that we can describe how to implement a simple core of a language; harder features can come later.
To this little language I'm going to add a new unary operator and a new statement. The new operator is sample, and it may appear only as the right operand of an assignment, or the initializer of a declaration:
int x = sample WeightedInteger.Distribution(8, 2, 7, 3);
The operand must be of type IDiscreteDistribution<T>, and the type of the expression is T. Again, this should feel familiar if you're used to await.
The new statement is condition, and it has the form
condition expression;
The expression must be convertible to bool. The meaning of this thing is much the same as Where: if the Boolean condition is not met, then a value is filtered out from the distribution by setting its weight to zero. But I'm getting ahead of myself.
Aside: Last time I talked a bit about how language designers must choose how general or specific a language element is, and that this must be reflected in syntax; this is a great example of such a choice.
I've chosen condition because I think of this operation as creating a conditional distribution; I could have said where instead of condition and had it mean basically the same thing, but that would be moving towards the established C# jargon for sequences, which I'm explicitly trying to avoid.
However, as we've seen, a primary usage case for a conditional distribution is computing the posterior distribution when given a prior and a likelihood. But why do we wish to compute the posterior at all? Typically because we have observed something. That is, the development cycle of the probabilistic program is:
- Before the program is written, data scientists and developers compute priors, like "What percentage of emails are spam?"
- They also compute likelihood functions: "what word combinations are likely, given that an email is spam? What word combinations are likely given that an email is not spam?"
- A spam detection program must now answer the question: given that we have observed an email with certain word combinations, what is the posterior probability that it is spam? This is where the probabilistic workflow comes in.
For that reason, we could reasonably choose observation or observe as our keyword here, instead of condition, emphasizing to the developer "the reason you use this feature is to express the computation of a posterior distribution for a given observation". That would make the feature feel a little bit less general, but might help more clearly express the desired semantics in the mind of the typical programmer designing a workflow that computes a posterior.
I'm going to stick with condition, but I just wanted to point out that this is a choice that has user experience consequences, were we actually to implement this feature in a language like C#.
Let's look at some examples:
probabilistic IDiscreteDistribution<bool> Flip()
{
int x = sample Bernoulli.Distribution(1, 1);
return x == 0;
}
What I would like is for this method to have the exact same semantics as if I had written:
IDiscreteDistribution<bool> Flip() =>
Bernoulli.Distribution(1, 1).Select(x => x == 0);
What do you think about these two implementations? I think the former looks a lot more like the straightforward, imperative code that we're used to.
Let's look at another:
probabilistic IDiscreteDistribution<int> TwoDSix()
{
var d = SDU.Distribution(1, 6);
int x = sample d;
int y = sample d;
return x + y;
}
Again, it seems to me that this is more clear than
IDiscreteDistribution<int> TwoDSix()
{
var d = SDU.Distribution(1, 6);
return from x in d from y in d select x + y;
}
And it certainly seems more clear, particularly to the novice, than
IDiscreteDistribution<int> TwoDSix()
{
var d = SDU.Distribution(1, 6);
return d.SelectMany(x => d, (x, y) => x + y);
}
LINQ is awesome for sequences, but I think the statement-based workflow is much easier to understand for distributions.
Now let's look at a really complicated workflow, this one with conditioning:
probabilistic IDiscreteDistribution<string> Workflow(int z)
{
int ii = sample TwoDSix();
if (ii == 2)
return "two";
condition ii != z;
bool b = sample Flip();
return b ? "heads" : ii.ToString();
}
The first two were easy to see how they corresponded to query syntax, but what even is the distribution represented by this workflow? It depends on the value of the parameter z, for one thing.
What I want here is: when you call this method, you get a distribution back. When you Sample() that distribution, it should logically have the same effect as: all the sample operations Sample() their operand, and the returned value is, well, the returned value. However, if any condition is false then we abandon the current attempt and start over.
The trick is implementing those semantics without actually running the body in a loop!
Exercise: Pick a value for z; let's say 3. See if you can work out what the distribution of strings is that should come out the other end of this thing. I'll give the answer in the next episode.
Exercise: If you had to represent this workflow as a query, how would you do it?
Next time on FAIC: Suppose we were writing a compiler for this feature. We know that LINQ works by turning query comprehensions into function calls; we know that the compiler turns async workflows and iterator blocks build state-machine coroutines. How might we lower this new kind of code into ordinary C# 7 code? Probably somehow using our existing query operators" but what exactly would work?