Detecting a short period in an RNG
The last couple posts have been looking at the Cliff random number generator. I introduce the generator here and look at its fixed points. These turn out to be less of a problem in practice than in theory.
Yesterday I posted about testing the generator with the DIEHARDER test suite, the successor to George Marsaglia's DIEHARD battery of RNG tests.
This morning I discovered something about the Cliff RNG which led to discovering something about DIEHARDER.The latter is more important: I don't think anyone is using the Cliff RNG for serious work, but people are definitely using DIEHARDER.
The Cliff RNG has a short period, and yet many of the DIEHARDER tests passed. However, several of the tests failed as well, and perhaps these tests failed due to the short period, in particular rgb_lagged_sum. But at least some tests can pass despite a short period.
Finding the periodSince the Cliff RNG maintains internal state as a floating point number and outputs integers, the period is a bit subtle.
The state of the Cliff RNG is a floating point number between 0 and 1, and so it has 253 possible values. (See Anatomy of a floating point number.) That means the maximum possible period is 253. We use the internal state x to create 32-bit integers n by multiplying x by 232 and truncating to an integer value. The maximum period could conceivably be larger than 232 because an output value n could repeat but correspond to two different x's. The actual period, at least in my experiment, was between 222 and 223.
I seeded the Cliff RNG with I - 3 (why?) and found that starting with index i = 3,075,302, the output values repeat with period p = 6,705,743. So there was a burn-in period before the state entered a cycle, but it would repeat that cycle forever. Not only are the output values repeating, the state values x repeat. (Otherwise it would be possible for the integer outputs to cycle for a while then break out.)
It's possible that starting with other seeds, the generator has other cycle lengths, longer or shorter. But I stumbled across one cycle of period 6,705,743.
Testing RNGsI wrote a chapter for O'Reilly's book Beautiful Testing in which I discuss How to test a random number generator. More specifically, now to test a non-uniform random number generator. That is, starting with a trusted uniform random number generator, transform the values to have some other probability distribution. This is the most common scenario. Few developers write their own core RNG, but it's fairly common to have to use a core RNG to create a custom distribution.
If you do want to test a uniform random number generator, as I do in this post, there are test suites like DIEHARDER. This is one of the oldest and best known test suites. There are newer and more rigorous suites, like TestU01 that I blog about here. There's also the NIST statistical test suite that I write about in the same post.
These tests are a little challenging to build and use. I've had clients ask me to run these tests for them and help them interpret the results. If you'd like for me to do that for you, let's talk.