Article 54R3K Life, part 16

Life, part 16

by
ericlippert
from Fabulous adventures in coding on (#54R3K)

Source code for this episode is here.

We are continuing with our project to gradually morph Abrash's remember the living neighbour counts" algorithm into Stafford's algorithm. I'm going to start today by adding two very small bit-twiddling optimizations to the Triplet wrapper struct I introduced in the previous episode.

The first optimization is: I'm going to need to update neighbour counts, but remember, we're going to be updating a short with three neighbour counts in it. Let's look at an example of a 3*3 grid represented by three shorts, and assume that every other cell in the larger grid is dead.

I'll give the current state as A" for alive, D" for dead, and the counts are raw" counts; that is, the left-hand living neighbour counts are not including the state of the middle cell, and the middle neighbour counts are not including the left or right cells, and so on.

The scenario is that an L-shaped triomino is about to become a block:

 Before After A-1 A-1 D-0 A-2 A-2 D-1 A-2 D-2 D-1 ==> A-2 A-2 D-1 D-1 D-1 D-0 D-2 D-2 D-1

What has to happen? The middle triplet's living neighbour counts do not need to change. But when the middle-of-the-middle triplet cell becomes alive, we need to increase all three living neighbour counts by one in both the upper and lower triplet.

Suppose the middle triplet is in a rectangular array of triplets at x, y. We do not want to have to write for this scenario:

n = triplet[x, y - 1];triplets[x, y - 1] = n.SetLeftCountRaw(n.LeftCountRaw+1) .SetMiddleCountRaw(n.MiddleCountRaw+1) .SetRightCountRaw(n.RightCountRaw+1);s = triplet[x, y + 1];... oh the boredom ...

That's both tedious and unnecessarily slow because of all the bit twiddling. Instead I'm going to write 15 helper functions as follows:

private const int lcountone = 1 << lcount;private const int mcountone = 1 << mcount;private const int rcountone = 1 << rcount;public Triplet UUP() => new Triplet(rcountone + triplet);public Triplet UUM() => new Triplet(-rcountone + triplet);public Triplet UPU() => new Triplet(mcountone + triplet);public Triplet UPP() => new Triplet(mcountone + rcountone + triplet);public Triplet UMU() => new Triplet(-mcountone + triplet);public Triplet UMM() => new Triplet(-mcountone - rcountone + triplet);public Triplet PUU() => new Triplet(lcountone + triplet);public Triplet PUM() => new Triplet(lcountone - rcountone + triplet);public Triplet PPU() => new Triplet(lcountone + mcountone + triplet);public Triplet PPP() => new Triplet(lcountone + mcountone + rcountone + triplet);public Triplet MUU() => new Triplet(-lcountone + triplet);public Triplet MUP() => new Triplet(-lcountone + rcountone + triplet);public Triplet MMU() => new Triplet(-lcountone - mcountone + triplet);public Triplet MMM() => new Triplet(-lcountone - mcountone - rcountone + triplet);

To add one to a three-bit integer embedded inside a short, all we need to do is add one shifted to the right position in the short! We do not need to extract the bits, put them in an integer, do the arithmetic, and then put them back.

(Also, fun bonus nano-optimization: since I put the constants on the left, the ones with multiple operations will get folded at compile time.)

The naming is of course P for plus, M for minus, U for unchanged; the operation I described above would be PPP".

The second optimization is: I want to be able to quickly answer the question is anything going to change in this triplet on this tick?" What we're going to do, recall, is set three next state" bits, so the answer to my question is: are the next-state bits as an integer equal to the current-state bits as an integer? If the answer is yes, then there was no change.

private const int currentm = lcm | mcm | rcm;private const int nextm = lnm | mnm | rnm;public int CurrentState => (currentm & triplet) >> rcur;public int NextState => (nextm & triplet) >> rnext;public bool Changed => CurrentState != NextState;

All right, let's get into it. The basic layout of the class is by design almost exactly the same as our previous algorithm because that's the whole idea of this pedagogy:

class StaffordOne : ILife{ private int height = 258; private int width = 88; private Triplet[,] triplets; private List<(int, int)> changes;

We have a rectangular array of 258*88 triplets. The top and bottom row, and the left and right triplet will be a rectangle of death, so that gives us 256*258 cells to play with.

The change list coordinates are triplet array coordinates, not Life grid coordinates.

I've still got my BecomeAlive" and BecomeDead" helper methods. They take Life grid coordinates. They are still idempotent but they no longer update the change list because I want to do that on a per-triplet granularity, not a per-cell-changed granularity, and I don't want to use a hash table to deduplicate the change list.

Rather, they return true or false, changed or unchanged, and let the caller decide how to update the change list.

private bool BecomeAlive(int x, int y){ int tx = x / 3; Triplet t = triplets[tx, y]; switch (x % 3) { case 0: if (t.LeftCurrent) return false; triplets[tx - 1, y - 1] = triplets[tx - 1, y - 1].UUP(); triplets[tx, y - 1] = triplets[tx, y - 1].PPU(); triplets[tx - 1, y] = triplets[tx - 1, y].UUP(); triplets[tx, y] = t.SetLeftCurrent(true); triplets[tx - 1, y + 1] = triplets[tx - 1, y + 1].UUP(); triplets[tx, y + 1] = triplets[tx, y + 1].PPU(); break;

As in Abrash's algorithm, we do the expensive work of updating the living neighbour counts only when something changed. If the left cell in a triplet changed then we need to update the counts of the triplets to the north, south, west, northwest and southwest.

We don't need to update the count of the middle cell in the current triplet because changing the state of the left state bit is what is going to update that count.

And we do not need to update the counts of the triplets to the east, northeast or southeast because those triplets have no neighbours in common with the left cell of this triplet.

I'm going to omit the middle and right cells; you see how this goes I'm sure. And the become dead" logic is all the same with the signs inverted so let's skip that too.

The real business logic is in the step function. As before, we do it in two passes.

In the first pass we look at every triplet on the previously-changed list and the neighbour triplets of all those triplets, which might include duplicates. We make a current changes" list of every triplet that is going to change, and stash that change in the next" bits of the triplet.

In the second pass we update the neighbour counts and current state bits of every changed cell. As we do so, we create a deduplicated changes list for the next tick.

public void Step(){ var currentChanges = new List<(int, int)>(); foreach ((int cx, int cy) in changes) { int minx = Max(cx - 1, 1); int maxx = Min(cx + 2, width - 1); int miny = Max(cy - 1, 1); int maxy = Min(cy + 2, height - 1); for (int y = miny; y < maxy; y += 1) { for (int x = minx; x < maxx; x += 1) {

This is basically the same code as before, but something to note here. Suppose we have a triplet that is on the changed" list because its left cell, and only its left cell, changed on the previous tick. We are now examining the neighbouring triplets, which includes the northeast, east and southeast neighbouring triplets, but those are not affected by a change in the left cell. We are therefore frequently doing unnecessary work here. Isn't this bad?

There are ways to avoid that, but as we keep adding optimizations to our implementation of Stafford's algorithm, we'll see that it becomes very cheap to process a neighbour that is not going to change. In cases where it is more expensive to run an avoidable work detector" than it is to simply do the work, it's better to do the work. I'm therefore going to ignore this small problem for the remainder of the exploration of the algorithm.

Now we get into the meat of the algorithm, as usual:

 Triplet c = triplets[x, y]; Triplet t = c; int lc = t.LeftCount; int mc = t.MiddleCount; int rc = t.RightCount; t = t.SetLeftNext(lc == 3 | t.LeftCurrent & lc == 2); t = t.SetMiddleNext(mc == 3 | t.MiddleCurrent & mc == 2); t = t.SetRightNext(rc == 3 | t.RightCurrent & rc == 2); if (t.Changed) { currentChanges.Add((x, y)); triplets[x, y] = t; } } } } changes.Clear();

That gets us through the first pass. The second pass is rather more verbose than our previous version because it is doing three cells per loop iteration, but is logically the same.

 foreach ((int x, int y) in currentChanges) { Triplet t = triplets[x, y]; if (!t.Changed) continue; bool changed = false; if (t.LeftNext) changed |= BecomeAlive(x * 3, y); else changed |= BecomeDead(x * 3, y); if (t.MiddleNext) changed |= BecomeAlive(x * 3 + 1, y); else changed |= BecomeDead(x * 3 + 1, y); if (t.RightNext) changed |= BecomeAlive(x * 3 + 2, y); else changed |= BecomeDead(x * 3 + 2, y); Debug.Assert(changed); changes.Add((x, y)); }}

The changed flag I'm tracking there is just a cheap sanity check to make sure we're not doing unnecessary work, and that BecomeAlive/BecomeDead are doing the right thing.

That's it; rather more tedious code but the same algorithm as previous. We should expect that there is a performance penalty as we have added a bunch of bit twiddling that we did not have before but have only gained a very small amount of parallelism, like setting three neighbour counts with a single addition. If we run the performance test on the release build we get:

Algorithm time(ms) ticks size(quad) megacells/sNaive (Optimized): 4000 5K 8 82Abrash 550 5K 8 596Abrash w/ changes 190 5K 8 1725 Abrash, O(c) 240 5K 8 1365Stafford v1 340 5K 8 963

We've lost almost half the speed of our previous best version of this algorithm. I sure hope there was a point to all this.

Next time on FAIC: We'll take another brief foray into some interesting Life forms, and improve the speed controls in the client.

Coming up after that: How can we optimize the heck out of the first pass inner loop given this data structure?

For today's extra content: the creepiest spaceship, Spider:spider.gif?w=584

Image from the wiki.

External Content
Source RSS or Atom Feed
Feed Location http://ericlippert.com/feed
Feed Title Fabulous adventures in coding
Feed Link https://ericlippert.com/
Reply 0 comments