Fixing Random, part 18

ericlippert

from Fabulous adventures in coding on 2019-04-02 15:45 (#4CAV1)

Before that silly diversion I mentioned that we will be needing the empty distribution; today, we'll implement it. It's quite straightforward, as you'd expect. [Code for this episode is here.]

public sealed class Empty<T> : IDiscreteDistribution<T>
{
    public static readonly Empty<T> Distribution = new Empty<T>();
    private Empty() { }
    public T Sample() =>
        throw new Exception("Cannot sample from empty distribution");
    public IEnumerable<T> Support() =>
        Enumerable.Empty<T>();
    public int Weight(T t) => 0;
}
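To make the behaviour concrete, here is a minimal, self-contained sketch of how a caller might probe the empty distribution without setting off the exception. The interface shown is an assumed, cut-down version of the one used in this series (the real one may have more members), and TrySample is a hypothetical helper that is not part of the episode's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed minimal shape of the series' interface.
public interface IDiscreteDistribution<T>
{
    T Sample();
    IEnumerable<T> Support();
    int Weight(T t);
}

// Same class as above, repeated so that the sketch compiles on its own.
public sealed class Empty<T> : IDiscreteDistribution<T>
{
    public static readonly Empty<T> Distribution = new Empty<T>();
    private Empty() { }
    public T Sample() =>
        throw new Exception("Cannot sample from empty distribution");
    public IEnumerable<T> Support() =>
        Enumerable.Empty<T>();
    public int Weight(T t) => 0;
}

public static class DistributionExtensions
{
    // Hypothetical helper: samples only when the support is non-empty,
    // so the caller never triggers the exception in Empty<T>.Sample.
    public static bool TrySample<T>(
        this IDiscreteDistribution<T> d, out T value)
    {
        if (!d.Support().Any())
        {
            value = default(T);
            return false;
        }
        value = d.Sample();
        return true;
    }
}
```

With this in place, Empty<int>.Distribution.TrySample(out int x) returns false rather than throwing, while Support() yields nothing and Weight reports zero for every argument.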

Easy peasy. Now that we have this, we can fix up our other distributions to use it. The WeightedInteger factory becomes:

public static IDiscreteDistribution<int> Distribution(
    IEnumerable<int> weights)
{
    List<int> w = weights.ToList();
    if (w.Any(x => x < 0))
        throw new ArgumentException();
    if (!w.Any(x => x > 0))
        return Empty<int>.Distribution;
    […]

And the Bernoulli factory becomes:

public static IDiscreteDistribution<int> Distribution(
    int zero, int one)
{
    if (zero < 0 || one < 0)
        throw new ArgumentException();
    if (zero == 0 && one == 0)
        return Empty<int>.Distribution;
    […]

And the StandardDiscreteUniform factory becomes:

public static IDiscreteDistribution<int> Distribution(
    int min, int max)
{
    if (min > max)
        return Empty<int>.Distribution;
    […]

And the Projected factory becomes:

public static IDiscreteDistribution<R> Distribution(
    IDiscreteDistribution<A> underlying, Func<A, R> projection)
{
    var result = new Projected<A, R>(underlying, projection);
    if (result.weights.Count == 0)
        return Empty<R>.Distribution;
    […]

And one more thing needs to change. Our computation in SelectMany assumed that none of the total weights are zero. Easily fixed:

int lcm = prior.Support()
    .Select(a => likelihood(a).TotalWeight())
    .Where(x => x != 0)
    .LCM();

We also have a division by total weight; don't we have to worry about dividing by zero? Nope. Remember, the empty distribution's support is the empty sequence, so when we then say:

var w = from a in prior.Support()
        let pb = likelihood(a)
        from b in pb.Support()
        group prior.Weight(a) * pb.Weight(b) *
            lcm / pb.TotalWeight()
        by projection(a, b);

If prior.Support() is empty then the whole query is empty and so the division is never executed. If prior.Support() is not empty but one of the pb.Support() is empty then there is no b from which to compute a group key. We never actually divide by a zero total weight, and so there is no division by zero error to avoid.
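If you'd like to convince yourself, here is a small stand-alone sketch of the same query shape. It is not the series' implementation: plain dictionaries from value to weight stand in for distributions, an empty dictionary plays the part of the empty distribution, and Gcd, Lcm and Combine are hypothetical names. In particular, I am assuming here that the LCM of an empty sequence is 1.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EmptySupportDemo
{
    static int Gcd(int a, int b) => b == 0 ? a : Gcd(b, a % b);

    // Assumption: the LCM of an empty sequence is taken to be 1.
    static int Lcm(IEnumerable<int> xs) =>
        xs.Aggregate(1, (a, b) => a / Gcd(a, b) * b);

    // Dictionaries (value -> weight) stand in for distributions.
    public static Dictionary<int, int> Combine(
        Dictionary<string, int> prior,
        Func<string, Dictionary<int, int>> likelihood)
    {
        // Zero totals are filtered out, as in the fixed SelectMany.
        int lcm = Lcm(prior.Keys
            .Select(a => likelihood(a).Values.Sum())
            .Where(x => x != 0));

        // When pb is empty, the inner "from" yields no b at all,
        // so the division below is never evaluated for that a.
        var w = from a in prior.Keys
                let pb = likelihood(a)
                from b in pb.Keys
                group prior[a] * pb[b] * lcm / pb.Values.Sum()
                by b;

        return w.ToDictionary(g => g.Key, g => g.Sum());
    }
}
```

Calling Combine with a prior whose likelihood returns an empty dictionary for one of its keys completes without a DivideByZeroException; that key simply contributes nothing to the result.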

That was relatively painless, but it is probably still very unclear why we'd ever need an empty distribution. It seems to be like a little bomb hiding in the program, waiting for someone to sample it. Have we just committed another "null reference" design fault? In a few episodes we'll see what benefits justify the costs.

Next time on FAIC: We've been having a lot of fun treating distributions as monads that we can use query comprehensions on, but is that really the best possible syntax?
