Feed john-d-cook John D. Cook

Link https://www.johndcook.com/blog
Feed http://feeds.feedburner.com/TheEndeavour?format=xml
Updated 2025-11-23 16:01
Bayesian methods at Bletchley Park
From Nick Patterson’s interview on Talking Machines: GCHQ in the ’70s, we thought of ourselves as completely Bayesian statisticians. All our data analysis was completely Bayesian, and that was a direct inheritance from Alan Turing. I’m not sure this has ever really been published, but Turing, almost as a sideline during his cryptoanalytic work, reinvented […]
Sphere packing
The previous couple blog posts touched on a special case of sphere packing. We looked at the proportion of volume contained near the corners of a hypercube. If you take the set of points within a distance 1/2 of a corner of a hypercube, you could rearrange these points to form a full ball centered […]
Is most volume in the corners or not?
I’ve written a couple blog posts that may seem to contradict each other. Given a high-dimensional cube, is most of the volume in the corners or not? I recently wrote that the corners of a cube stick out more in high dimensions. You can quantify this by centering a ball at a corner and looking […]
Corners stick out more in high dimensions
High-dimensional geometry is full of surprises. For example, nearly all the area of a high-dimensional sphere is near the equator, and by symmetry it doesn’t matter which equator you take. Here’s another surprise: corners stick out more in high dimensions. Hypercubes, for example, become pointier as dimension increases. How might we quantify this? Think of […]
Math diagrams updated
I updated several of the math diagrams on this site today. They’re SVG now, so they resize nicely if you want to zoom in or out. The updated diagrams: Special functions, Topological vector spaces, Category theory concepts, General topology, Gamma function identities.
Discrete example of concentration of measure
The previous post looked at a continuous example of concentration of measure. As you move away from a thin band around the equator, the remaining area in the rest of the sphere decreases as an exponential function of the dimension and the distance from the equator. This post will show a very similar result for […]
Nearly all the area in a high-dimensional sphere is near the equator
Nearly all the area of a high-dimensional sphere is near the equator. And by symmetry, it doesn’t matter which equator you take. Draw any great circle and nearly all of the area will be near that circle. This is the canonical example of “concentration of measure.” What exactly do we mean by “nearly all the […]
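A rough way to see this numerically is to sample points uniformly on the sphere and count how many land within a thin band around one equator. The sketch below is only an illustration: the function name, the band width, and the sample size are assumptions, and uniform points are generated by normalizing vectors of independent Gaussians.

```python
import math
import random

def fraction_near_equator(dim, width, samples=200, seed=0):
    """Estimate the fraction of the unit sphere in `dim` dimensions whose
    first coordinate is within `width` of zero (a band around the equator)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        # A normalized Gaussian vector is uniformly distributed on the sphere.
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(t * t for t in v))
        if abs(v[0]) / norm < width:
            hits += 1
    return hits / samples

for dim in (3, 100, 1000):
    print(dim, fraction_near_equator(dim, 0.1))
```

In three dimensions only about a tenth of the area falls in the band, while in 1000 dimensions nearly all sampled points do.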
DIEHARDER random number generator test results for PCG and MWC
A few days ago I wrote about testing the PCG random number generator using the DIEHARDER test suite. In this post I’ll go into a little more background on this random number generator test suite. I’ll also show that like M. E. O’Neill’s PCG (“permuted congruential generator”), George Marsaglia’s MWC (“multiply with carry”) generator does quite […]
The chaos game and the Sierpinski triangle
The chaos game is played as follows. Pick a starting point at random. Then at each subsequent step, pick a triangle vertex at random and move half way from the current position to that vertex. The result looks like a fractal called the Sierpinski triangle or Sierpinski gasket. Here’s an example: If the random number […]
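The rules above translate into a few lines of code. This is a minimal sketch: the triangle's coordinates, the seed, and the function name are illustrative choices, not taken from the post.

```python
import random

def chaos_game(steps, seed=0):
    """Return the list of points visited by the chaos game."""
    rng = random.Random(seed)
    # Vertices of an equilateral triangle (an arbitrary choice of triangle).
    vertices = [(0.0, 0.0), (1.0, 0.0), (0.5, 3 ** 0.5 / 2)]
    # Pick a starting point at random.
    x, y = rng.random(), rng.random()
    points = []
    for _ in range(steps):
        # Pick a vertex at random and move halfway toward it.
        vx, vy = rng.choice(vertices)
        x, y = (x + vx) / 2, (y + vy) / 2
        points.append((x, y))
    return points

points = chaos_game(10_000)
```

Plotting `points` as a scatter plot, with whatever plotting library you prefer, should show the Sierpinski pattern emerging after the first few steps.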
Testing the PCG random number generator
M. E. O’Neill’s PCG family of random number generators looks very promising. It appears to have excellent statistical and cryptographic properties. And it takes remarkably little code to implement. (PCG stands for Permuted Congruential Generator.) The journal article announcing PCG gives the results of testing it with the TestU01 test suite. I wanted to try it out […]
Simple random number generator does surprisingly well
I was running the NIST statistical test suite recently. I wanted an example of a random number generator where the tests failed, and so I used a simple generator, a linear congruence generator. But to my surprise, the generator passed nearly all the tests, even though some more sophisticated generators failed some of the same […]
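For reference, a linear congruential generator updates its state by state = (a*state + c) mod m. The excerpt doesn't say which parameters were tested, so the constants below are an assumption: they are a commonly cited textbook choice for m = 2^32, used here only to show the shape of the algorithm.

```python
def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Yield floats in [0, 1) from a linear congruential generator.

    The default a, c, m are illustrative textbook constants, not
    necessarily the ones used in the post.
    """
    state = seed
    while True:
        state = (a * state + c) % m
        yield state / m

gen = lcg(seed=42)
sample = [next(gen) for _ in range(5)]
```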
Least common multiple of the first n positive integers
Here’s a surprising result: The least common multiple of the first n positive integers is approximately exp(n). More precisely, let φ(n) equal the log of the least common multiple of the numbers 1, 2, …, n. There are theorems that give upper and lower bounds on how far φ(n) can be from n. We won’t prove or […]
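A quick numerical check of the claim, as a sketch: `log_lcm` is a name introduced here for the log of the least common multiple, and `math.lcm` requires Python 3.9 or later.

```python
from math import lcm, log

def log_lcm(n: int) -> float:
    """Return the natural log of lcm(1, 2, ..., n)."""
    return log(lcm(*range(1, n + 1)))

for n in (10, 100, 1000):
    # The ratio log_lcm(n) / n should drift toward 1 as n grows.
    print(n, log_lcm(n), log_lcm(n) / n)
```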
Subscribing by email
You can subscribe to my blog by email or RSS. I also have a brief newsletter you could sign up for. There are links to these in the sidebar of the blog: If you subscribe by email, you’ll get an email each morning containing the post(s) from the previous day. I just noticed a problem […]
Effective sample size for MCMC
In applications we’d like to draw independent random samples from complicated probability distributions, often the posterior distribution on parameters in a Bayesian analysis. Most of the time this is impractical. MCMC (Markov Chain Monte Carlo) gives us a way around this impasse. It lets us draw samples from practically any probability distribution. But there’s a […]
Quicksort and prime numbers
The average number of operations needed for quicksort to sort a list of n items is approximately 10 times the nth prime number. Here’s some data to illustrate this.
|------+-----------------+---------|
| n    | avg. operations | 10*p(n) |
|------+-----------------+---------|
| 100  |          5200.2 |    5410 |
| 200  |         12018.3 |   12230 |
| 300  |         19446.9 […]
Why do linear prediction confidence regions flare out?
Suppose you’re tracking some object based on its initial position x0 and initial velocity v0. The initial position and initial velocity are estimated from normal distributions with standard deviations σx and σv. (To keep things simple, let’s assume our object is moving in only one dimension and that the distributions around initial position and velocity […]
Polynomials evaluated at integers
Let p(x) = a_0 + a_1 x + a_2 x^2 + … + a_n x^n and suppose at least one of the coefficients a_i is irrational for some i ≥ 1. Then a theorem by Weyl says that the fractional parts of p(n) are equidistributed as n varies over the integers. That is, the proportion of values that land in some interval […]
Leading digits of powers of 2
The first digit of a power of 2 is a 1 more often than any other digit. Powers of 2 begin with 1 about 30% of the time. This is because powers of 2 follow Benford’s law. We’ll prove this below. When is the first digit of 2^n equal to k? When 2^n is between […]
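The 30% figure is easy to check empirically. Benford’s law predicts that a leading 1 occurs with frequency log10(2) ≈ 0.301. The sketch below tallies leading digits of 2^n for n from 1 to 1000; the cutoff of 1000 is an arbitrary choice.

```python
from collections import Counter
from math import log10

def leading_digit_counts(max_exponent):
    """Count the leading decimal digits of 2^n for n = 1, ..., max_exponent."""
    return Counter(int(str(2 ** n)[0]) for n in range(1, max_exponent + 1))

counts = leading_digit_counts(1000)
fraction_of_ones = counts[1] / 1000
print(fraction_of_ones, log10(2))
```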
Extreme beta distributions
A beta probability distribution has two parameters, a and b. You can think of these as the number of successes and failures out of a+b trials. The PDF of a beta distribution is approximately normal if a and b are approximately equal and a + b is large. If a and b are close, they don’t have to be very large for the beta […]
Fractional parts, invariant measures, and simulation
A function f: X → X is measure-preserving if each iteration of f sends the same amount of stuff into a given set. To be more precise, given a measure μ and any μ-measurable set E with μ(E) > 0, we have μ(E) = μ(f^(-1)(E)). You can read the right side of […]
Most popular posts this year so far
These have been the most popular posts for the first half of 2017: Golden powers are nearly integers; How efficient is Morse code?; Putting SHA1 in perspective; Improving on the Unix shell; Three proofs that 2017 is prime; How areas of math are connected.
One practical application of functional programming
Arguments in favor of functional programming are often unconvincing. For example, the most common argument is that functional programming makes it easier to “reason about your code.” That’s true to some extent. All other things being equal, it’s easier to understand a function if all its inputs and outputs are explicit. But all other things […]
Dividing projects into math, statistics, and computing
If you’ve read this blog for long, you know that my work is a combination of math, statistics, and computing. I was looking over my records and tried to see how my work divides into these three areas. In short, it doesn’t. The boundaries between these areas are fuzzy or arbitrary to begin with, but […]
Gaussian correlation inequality
The Gaussian correlation inequality was proven in 2014, but the proof only became widely known this year. You can find Thomas Royen’s remarkably short proof here. Let X be a multivariate Gaussian random variable with mean zero and let E and F be two symmetric convex sets, both centered at the origin. The Gaussian correlation inequality says that Prob(X in E […]
Irrational rotations are ergodic
In a blog post yesterday, I mentioned that the golden angle is an irrational portion of a circle, and so a sequence of rotations by the golden angle will not repeat itself. We can say more: rotations by an irrational portion of a circle are ergodic. Roughly speaking, this means that not only does the […]
Saxophone with two octave keys
Last year I wrote a post about saxophone octave keys. I was surprised to discover, after playing saxophone for most of my life, that a saxophone has not one but two octave holes. Modern saxophones have one octave key, but two octave holes. Originally saxophones had a separate octave key for each octave hole; you had to […]
Listening to golden angles
The other day I wrote about the golden angle, a variation on the golden ratio. If φ is the golden ratio, then a golden angle is 1/φ^2 of a circle, approximately 137.5°, a little over a third of a circle. Musical notes go around in a circle. After 12 half steps we’re back where we […]
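The arithmetic behind the 137.5° figure, as a quick check. The variable names are illustrative. The identity 1/φ^2 = 2 − φ follows from φ^2 = φ + 1.

```python
from math import sqrt

phi = (1 + sqrt(5)) / 2                  # golden ratio
fraction_of_circle = 1 / phi**2          # equals 2 - phi
golden_angle_degrees = 360 * fraction_of_circle

print(fraction_of_circle)                # a little over 1/3
print(golden_angle_degrees)              # roughly 137.5
```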
Color theory questions
Here’s a script I wanted to write: given a color c specified in RGB and an angle θ, rotate c on the color wheel by θ and return the RGB value of the result. You can’t rotate RGB values per se, but you can rotate hues. So my initial idea was to convert RGB to […]
A sixth sense for category theory
From Paul Phillips: “I see adjoint functors.” “How often do you see them?” “All the time. They’re everywhere.” pic.twitter.com/6PkGJ9wP4A (Paul Phillips, @contrarivariant, May 27, 2017). A mashup of Saunders Mac Lane’s quip “Adjoint functors arise everywhere” and Haley Joel Osment’s famous line from Sixth Sense. Related: Applied category theory
Student’s future, teacher’s past
“Teachers should prepare the student for the student’s future, not for the teacher’s past.” — Richard Hamming I ran across the above quote from Hamming this morning. It made me wonder whether I tried to prepare students for my past when I used to teach college students. How do you prepare a student for the […]
Ideal background for algebraic geometry
From Foundations of Algebraic Geometry: … in an ideal world, people would learn this material over many years, after having background courses in commutative algebra, algebraic topology, differential geometry, complex analysis, homological algebra, number theory, and French literature.
Changing your mind
From Dorothy Sayers’ essay Why Work? It is always strange and painful to have to change a habit of mind; though, when we have made the effort, we may find a great relief, even a sense of adventure and delight, in getting rid of the false and returning to the true.
Volume of a rose-shaped torus
Start with a rose, as described in the previous post: Now spin that rose around a vertical line a distance R from the center of the rose. This makes a torus (doughnut) shape whose cross sections look like the rose above. You could think of having a cutout shaped like the rose above and extruding Play-Doh […]
Length of a rose
The polar graph of r = cos(kθ) is called a rose. If k is even, the curve will trace out 2k petals as θ runs between 0 and 2π. If k is odd, it will trace out k petals, tracing each one twice. For example, here’s a rose with k = 5. (I rotated the […]
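One way to estimate the length of a rose numerically is to integrate the standard polar arc length element sqrt(r^2 + (dr/dθ)^2) over a single traversal of the curve. This is a sketch using the midpoint rule, not necessarily the approach taken in the post; the function name and step count are assumptions.

```python
from math import cos, sin, sqrt, pi

def rose_length(k, steps=100_000):
    """Approximate the length of the rose r = cos(k*theta) by the midpoint rule."""
    # For odd k the curve is traced twice over [0, 2*pi], so one
    # traversal is [0, pi]. For even k one traversal is [0, 2*pi].
    upper = pi if k % 2 else 2 * pi
    h = upper / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        # r = cos(k t), dr/dt = -k sin(k t)
        total += sqrt(cos(k * t) ** 2 + (k * sin(k * t)) ** 2)
    return total * h

print(rose_length(5))
```

As a sanity check, k = 1 gives a circle of diameter 1, so `rose_length(1)` should be very close to π.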
When length equals area
The graph of hyperbolic cosine is called a catenary. A catenary has the following curious property: the length of a catenary between two points equals the area under the catenary between those two points. The proof is surprisingly simple. Start with the following: Now integrate the first and last expressions between two points a and […]
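The starting equation did not survive in this excerpt. One standard route to the result, which may or may not match the post's exact wording, uses the identity 1 + sinh^2 x = cosh^2 x:

```latex
\begin{align*}
y &= \cosh x, \qquad y' = \sinh x, \\
\sqrt{1 + (y')^2} &= \sqrt{1 + \sinh^2 x} = \cosh x = y, \\
\underbrace{\int_a^b \sqrt{1 + (y')^2}\, dx}_{\text{length}}
  &= \underbrace{\int_a^b y \, dx}_{\text{area}}.
\end{align*}
```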
Solving systems of polynomial equations
In a high school algebra class, you learn how to solve polynomial equations in one variable, and systems of linear equations. You might reasonably ask “So when do we combine these and learn to solve systems of polynomial equations?” The answer would be “Maybe years from now, but most likely never.” There are systematic ways to […]
Generating pink noise
Different colors of noise are named by analogy with colors of light. Pink noise is between white noise and red noise. White noise has equal power at all frequencies, just as white light is a combination of all the frequencies of the visible spectrum. The components of red noise are weighted toward low frequencies, just […]
Building software the right way
Yesterday a friend told me about a software project whose owners said “We’re going to do this the right way.” I told him I have two opposite reactions when I hear that: (1) Ooh, that sounds like fun! (2) Run away! I’ve been on several projects where the sponsors have identified some aspect of the status quo […]
The 3n+1 problem and Benford’s law
This is the third, and last, of a series of posts on Benford’s law, this time looking at a famous open problem in computer science, the 3n + 1 problem, also known as the Collatz conjecture. Start with a positive integer n. Compute 3n + 1 and divide by 2 repeatedly until you get an odd […]
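The step described above can be sketched as follows. The excerpt is cut off, so the assumption here is that the process repeats from each odd number until it reaches 1. The function names and the `max_steps` guard are illustrative additions.

```python
def next_odd(n: int) -> int:
    """Compute 3n + 1, then divide by 2 until the result is odd."""
    m = 3 * n + 1
    while m % 2 == 0:
        m //= 2
    return m

def odd_orbit(n: int, max_steps: int = 10_000) -> list[int]:
    """Iterate next_odd from n until reaching 1 (or hitting max_steps)."""
    orbit = [n]
    # The guard avoids looping forever in case a start value never reaches 1;
    # whether that can happen is exactly the open question.
    while orbit[-1] != 1 and len(orbit) <= max_steps:
        orbit.append(next_odd(orbit[-1]))
    return orbit

print(odd_orbit(7))  # [7, 11, 17, 13, 5, 1]
```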
Cauchy, Benford, and a problem with NHST
Introduction. Samples from a Cauchy distribution nearly follow Benford’s law. I’ll demonstrate this below. The more data you see, the more confident you should be of this. But with a typical statistical approach, crudely applied NHST (null hypothesis significance testing), the more data you see, the less convinced you are. This post assumes you’ve read the […]
Weibull distribution and Benford’s law
Introduction to Benford’s law. In 1881, Simon Newcomb noticed that the edges of the first pages in a book of logarithms were dirty while the edges of the later pages were clean. From this he concluded that people were far more likely to look up the logarithms of numbers with leading digit 1 than of […]
Golden angle
The golden angle is related to the golden ratio, but it is not as well known. And the relationship is not quite what you might think at first. The golden ratio φ is (1 + √5)/2. A golden rectangle is one in which the ratio of the longer side to the shorter side is φ. […]
What personality classifications have in common
There are many ways to divide people into four personality types, from the classical—sanguine, choleric, melancholic, and phlegmatic—to contemporary systems such as the DISC profile. The Myers-Briggs system divides people into sixteen personality types. I just recently ran across the “enneagram,” an ancient system for dividing people into nine categories. There’s one thing advocates of […]
Denver airport, Weierstrass, and A&S
Last night I was driving toward the Denver airport and the airport reminded me of the cover of Abramowitz and Stegun’s Handbook of Mathematical Functions. Here’s the airport: And here’s the book cover: I’ve written about the image on the book cover before. Someone asked me what function it graphed and I decided it was probably […]
Resisting simplicity
As much as we admire simplicity and strive for simplicity, something in us isn’t happy when we achieve it. Sometimes we’re disappointed with a simple solution because, although we don’t realize it yet, we didn’t properly frame the problem it solves. I’ve been in numerous conversations where someone says effectively, “I understand that 2+3 = […]
Flying through a 3D fractal
A Menger sponge is created by starting with a cube and recursively removing chunks of it. Draw a 3×3 grid on one face of the cube, then remove the middle square, all the way through the cube. Then do the same for each of the eight remaining squares. Repeat this process over and over, and do it […]
Computing harmonic numbers
The harmonic numbers are defined by H_n = 1 + 1/2 + 1/3 + … + 1/n. Harmonic numbers are sort of a discrete analog of logarithms since log n = ∫_1^n dx/x. As n goes to infinity, the difference between H_n and log n converges to Euler’s constant γ = 0.57721… [1] How would you compute H_n? For small n, simply use the definition. But if n is very large, there’s a way […]
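Here is a small comparison of the definition against the leading-order approximation log n + γ. This is only a sketch: the excerpt is truncated before it describes the method for very large n, so nothing below should be read as that method. The function names are illustrative.

```python
from math import log

EULER_GAMMA = 0.5772156649015329

def harmonic(n: int) -> float:
    """Compute H_n directly from the definition."""
    return sum(1.0 / k for k in range(1, n + 1))

def harmonic_approx(n: int) -> float:
    """Leading-order approximation: H_n is roughly log(n) + gamma."""
    return log(n) + EULER_GAMMA

for n in (10, 1000, 100_000):
    print(n, harmonic(n), harmonic_approx(n))
```

The gap between the two columns shrinks roughly like 1/(2n).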
Technical notes and other relatively hidden content
I’ve written quite a few pages that are separate from the timeline of the blog. These are a little hidden, not because I want to hide them, but because you can’t make everything equally easy to find. These notes cover a variety of topics: Math diagrams, Numerical computing, Probability and approximations, Differential equations, Python, Regular expressions […]
New Twitter icons
I’ve updated the icons for my Twitter accounts.
Mercury and the bandwagon effect
The study of the planet Mercury provides two examples of the bandwagon effect. In her new book Worlds Fantastic, Worlds Familiar, planetary astronomer Bonnie Buratti writes The study of Mercury … illustrates one of the most confounding bugaboos of the scientific method: the bandwagon effect. Scientists are only human, and they impose their own prejudices […]