Feed john-d-cook John D. Cook

Favorite IconJohn D. Cook

Link https://www.johndcook.com/blog
Feed http://feeds.feedburner.com/TheEndeavour?format=xml
Updated 2025-09-11 22:17
Fixed points of the Cliff random number generator
I ran across the Cliff random number generator yesterday. Given a starting value x0 in the open interval (0, 1), the generator proceeds by xn+1 = | 100 log(xn) mod 1 | for n > 0. The article linked to above says that this generator passes a test of randomness based on generating points on […]
Ease of learning vs relearning
Much more is written about how easy or hard some technology is to learn than about how hard it is to relearn. Maybe this is because people are more eager to write about something while the excitement or frustration of their first encounter is fresh. Advocates of difficult-to-learn technologies say that tools should be optimized […]
Uniform approximation paradox
What I’m going to present here is not exactly a paradox, but I couldn’t think of a better way to describe it in the space of a title. I’ll discuss two theorems about uniform convergence that seem to contradict each other, then show by an example why there’s no contradiction. Weierstrass approximation theorem One of […]
Nearly parallel is nearly transitive
We begin with a bit of geometry, then show its relevance to statistics. Geometry Let X, Y, and Z be three unit vectors. If X is nearly parallel to Y, and Y is nearly parallel to Z, then X is nearly parallel to Z. Here’s a proof. Think of X, Y, and Z as points […]
Angles in the spiral of Theodorus
The previous post looked at how to plot the spiral of Theodorus shown below. We stopped the construction where we did because the next triangle to be added would overlap the first triangle, which would clutter the image. But we could certainly have kept going. If we do keep going, then the set of hypotenuse […]
How to plot the spiral of Theodorus
You may have seen the spiral of Theodorus. It sticks a sequence of right triangles together to make a sort of spiral. Each triangle has a short side of length 1, and the hypotenuse of each triangle becomes the long leg of the next triangle as shown below. How would you plot this spiral? At […]
Encryption as secure as factoring
RSA encryption is based on the assumption that factoring large integers is hard. However, it’s possible that breaking RSA is easier than factoring. That is, the ability to factor large integers is sufficient for breaking RSA, but it might not be necessary. Two years after the publication of RSA, Michael Rabin created an alternative that […]
Accelerating convergence with Aitken’s method
The previous post looked at Euler’s method for accelerating the convergence of a slowly converging alternating series. Both hypotheses are necessary. The signs must alternate between terms, and applying the method to a series that is already converging quickly can slow down convergence. Aitken’s method This post looks at Aitken’s method for speeding up the […]
Accelerating an alternating series
The most direct way of computing the sum of an alternating series, simply computing the partial sums in the terms get small enough, may not be the most efficient. Euler figured this out in the 18th century. For our demo we’ll evaluate the Struve function defined by the series Note that the the terms in […]
Data breach trends
Are data breaches becoming more or less common? This post gives a crude, back-of-the-envelope calculation to address the question. We won’t look at number of breaches per se but number of records breached. There’s a terrific visualization of data breach statistics at Information is Beautiful, and they share their data here. Note that the data […]
Beating the odds on the Diffie-Hellman decision problem
There are a couple variations on the Diffie-Hellman problem in cryptography: the computation problem (CDH) and the decision problem (DDH). This post will explain both and give an example of where the former is hard and the latter easy. The Diffie-Hellman problems The Diffie-Hellman problems are formulated for an Abelian group. The main group we […]
Magic square links and errata
Someone pointed out that what I called a knight’s tour magic square is technically a semi-magic square: the rows and columns add up to the same constant, but the diagonals do not. It turns out there are no strict magic squares formed by knight’s tours. This was proved in 2003. See a news article here. […]
Quaternion reference in the Vulgate
To contemporary ears “quaternion” refers to a number system discovered in the 19h century, but there were a couple precedents. Both refer to something related to a group of four things, but there is no relation to mathematical quaternions other than that they have four dimensions. I’ve written before about Milton’s use of the word […]
Fame, difficulty, and usefulness
Pierre Fermat is best known for two theorems, dubbed his “last” theorem and his “little” theorem. His last theorem is famous, difficult to prove, and useless. His little theorem is relatively arcane, easy to prove, and extremely useful. There’s little relation between technical difficulty and usefulness. Fermat’s last theorem Fermat’s last theorem says there are […]
Twisted elliptic curves
This morning I was sitting at a little bakery thinking about what to do before I check out of my hotel. I saw that the name of the bakery was Twist Bakery & Cafe, and that made me think of writing about twisted elliptic curves when I got back to my room. Twist of an […]
Hashing names does not protect privacy
Secure hash functions are practically impossible to reverse, but only if the input is unrestricted. If you generate 256 random bits and apply a secure 256-bit hash algorithm, an attacker wanting to recover your input can’t do much better than brute force, hashing 256-bit strings hoping to find one that matches your hash value. Even […]
May the best technology win
I’ve become skeptical of arguments of the form “X is a better technology, but people won’t quit using Y.” Comparisons of technologies are multi-faceted. When someone says “X is better than Y” I want to ask “By all criteria? There’s nothing better about Y?” When people say X is better but Y won, it’s often […]
Integral approximation trick
Here’s a simple integration approximation that works remarkably well in some contexts. Suppose you have an integrand that looks roughly like a normal density, something with a single peak that drops off fairly quickly on either side of the peak. The majority of integrals that arise in basic applications of probability and statistics fit this […]
Number of feet in a mile
Here are a couple amusing things I’ve run across recently regarding the number of feet in a mile. Both are frivolous, but also have a more serious side. Mnemonic First, you can use “five tomatoes” as a mnemonic for remembering that there are 5280 feet in a mile. “Five tomatoes” is a mnemonic for the […]
Discriminant of a cubic
The discriminant of a quadratic equation ax² + bx + c = 0 is Δ = b² – 4ac. If the discriminant Δ is zero, the equation has a double root, i.e. there is a unique x that makes the equation zero, and it counts twice as a root. If the discriminant is not zero, […]
Distribution of quadratic residues
Let p be an odd prime number. If the equation x² = n mod p has a solution then n is a square mod p, or in classical terminology, n is a quadratic residue mod p. Half of the numbers between 0 and p are quadratic residues and half are not. The residues are distributed […]
What does CCPA say about de-identified data?
The California Consumer Privacy Act, or CCPA, takes effect January 1, 2020, less than six months from now. What does the act say about using deidentified data? First of all, I am not a lawyer; I work for lawyers, advising them on matters where law touches statistics. This post is not legal advice, but my […]
Serious applications of a party trick
In a group of 30 people, it’s likely that two people have the same birthday. For a group of 23 the probability is about 1/2, and it goes up as the group gets larger. In a group of a few dozen people, it’s unlikely that anyone will have a particular birthday, but it’s likely that […]
Channel quantity and quality
Years ago, when there were a couple dozen television stations, someone [1] speculated that when we got more channels we’d also get better content. The argument was that people are more similar in their base interests than in their more refined interests. Therefore if there are only a few channels, they will all appeal to […]
Symmetry in exponential sums
Today’s exponential sum is highly symmetric: These sums are often symmetric, but not always. For example, here’s the sum from a couple days ago: It’s not obvious from looking at the parameters whether a sum will be symmetric or not. Maybe someone could find a prove criteria for a sum to have certain symmetries. For […]
Proto-calculus
David Bressoud has written a new book entitled Calculus Reordered: A History of the Big Ideas. He presents the major themes of calculus in historical order, which is very different from the order in which it is now taught. We now begin with limits, then differentiation, integration, and infinite series. Historically, integration came first and […]
Homomorphic encryption
A function that satisfies f(x*y) = f(x)*f(y) is called a homomorphism. The symbol “*” can stand for any operation, and it need not stand for the same thing on both sides of the equation. Technically * is the group operation, and if the function f maps elements of one group to another, the group operation […]
Journalistic stunt with Emacs
Emacs has been called a text editor with ambitions of being an operating system, and some people semi-seriously refer to it as their operating system. Emacs does not want to be an operating system per se, but it is certainly ambitious. It can be a shell, a web browser, an email client, a calculator, a […]
Notes on computing hash functions
A secure hash function maps a file to a string of bits in a way that is hard to reverse. Ideally such a function has three properties: pre-image resistance collision resistance second pre-image resistance Pre-image resistance means that starting from the hash value, it is very difficult to infer what led to that output; it […]
Translating poetry
You can’t preserve every aspect of a text when translating. A strict word-for-word translation attempts to be faithful to the words but may be ungrammatical in the target language. An idea-for-idea translation is more readable, but still may not convey the style of the original. Translation reminds me of making maps. There have been countless […]
Category theory for data science: cautious optimism
I’m cautiously optimistic about applications of category theory. I believe category theory has the potential to be useful, but I’m skeptical of most claims to have found useful applications. Category theory has found useful application, especially behind the scenes, but a lot of supposed applications remind me of a line from Colin McLarty: [Jean-Pierre] Serre […]
Software to factor integers
In my previous post, I showed how changing one bit of a semiprime (i.e. the product of two primes) creates an integer that can be factored much faster. I started writing that post using Python with SymPy, but moved to Mathematica because factoring took too long. SymPy vs Mathematica When I’m working in Python, SymPy […]
Making public keys factorable with Rowhammer
The security of RSA encryption depends on the fact that the product of two large primes is difficult to factor. So if p and q are large primes, say 2048 bits each, then you can publish n = pq with little fear that someone can factor n to recover p and q. But if you […]
Bounds on the nth prime
The nth prime is approximately n log n. For more precise estimates, there are numerous upper and lower bounds for the nth prime, each tighter over some intervals than others. Here I want to point out upper and lower bounds from a dissertation by Christian Axler on page viii. First, define Then for sufficiently large […]
Converting between nines and sigmas
Nines and sigmas are two ways to measure quality. You’ll hear something has four or five nines of reliability or that some failure is a five sigma event. What do these mean, and how do you convert between them? Definitions If a system has fives nines of availability, that means the probability of the system […]
Maybe it’s just hard
If someone tells you repeatedly that something isn’t hard, maybe it’s just hard. Monads A post by Gilad Bracha got me thinking about this. He says Last time I looked, the Haskell wiki listed 29 tutorials on [monads]. … Could it just be that people just have a hard time understanding monads? If so, what […]
Why are regular expressions difficult?
Regular expressions are challenging, but not for the reasons commonly given. Non-reasons Here are some reasons given for the difficulty of regular expressions that I don’t agree with. Cryptic syntax I think complaints about cryptic syntax miss the mark. Some people say that Greek is hard to learn because it uses a different alphabet. If […]
Feller-Tornier constant
Here’s kind of an unusual question: What is the density of integers that have an even number of prime factors with an exponent greater than 1? To define the density, you take the proportion up to an integer N then take the limit as N goes to infinity. It’s not obvious that the limit should […]
Translating Robert Burns
Last year Adam Roberts had some fun with Finnegans Wake [1], seeing how little he could edit it and turn it into something that sounded like Return of the Jedi. I wrote a blog post where I quantified the difference between the original and the parody using Levenshtein distance, basically how many edits it takes […]
Protecting privacy while keeping detailed date information
A common attempt to protect privacy is to truncate dates to just the year. For example, the Safe Harbor provision of the HIPAA Privacy Rule says to remove “all elements of dates (except year) for dates that are directly related to an individual …” This restriction exists because dates of service can be used to […]
Per stirpes and random walks
If an inheritance is to be divided per stirpes, each descendant gets an equal share. If a descendant has died but has living descendants, his or her share is distributed by applying the rule recursively. Example For example, suppose a man had two children, Alice and Bob, and stipulates in his will that his estate […]
SQRL: Secure Quick Reliable Login
Steve Gibson’s Security Now is one of the podcasts I regularly listen to, and so I’ve been hearing him talk about his SQRL for a while. This week he finally released SQRL: Secure Quick Reliable Login. You can read more about SQRL in the white paper posted on the GRC web site. Here’s a tease […]
The cost of no costs
The reason businesses have employees rather than contracting out everything is to reduce transaction costs. If a company needs enough graphics work, they hire a graphic artist rather than outsourcing every little project, eliminating the need to evaluate bids, write contracts, etc. Some things are easier when no money has to change hands. But some […]
Trott’s constant
Trott’s constant is the unique number whose digits equal its continued fraction coefficients. Uniqueness assumes the number is expanded into a simple continued fraction, i.e. one with all numerators equal to 1. See OEIS sequence A039662. More continued fraction posts Best rational approximations to π Continued fraction cryptography Normal hazard continued fraction
Using one RNG to sample another
Suppose you have two pseudorandom bit generators. They’re both fast, but not suitable for cryptographic use. How might you combine them into one generator that is suitable for cryptography? Coppersmith et al [1] had a simple but effective approach which they call the shrinking generator, also called irregular decimation. The idea is to use one […]
Rock, paper, scissors, algebra
Aatish Bhatia posted something interesting on Twitter: if you define multiplication on Rock, Paper, Scissors to be the winner of a match, the result is commutative but not associative. Here's a neat thing about the algebra of Rock, Paper, Scissors. If you define 'multiplication' as the game's winner, then it's commutative, i.e. P x R […]
Liminal and subliminal
It occurred to me for the first time this morning that the words liminal and subliminal must be related, just after reading an article by Vicki Boykis that discusses liminal spaces. I hear the two words in such in different contexts—architecture versus psychology—and hadn’t thought about the connection until now. If I were playing a […]
Cop with a mop
Yesterday I was at a wedding, and a vase broke in the aisle shortly before the bridal party was to enter. Guests quickly picked up the pieces, but the vase left a pool of water on the hard floor. A security guard ran (literally) for a mop and cheerfully picked up the water. He could […]
R with Conda
I’ve been unable to get some R libraries to install on my Linux laptop. Two libraries in particular were tseries and tidyverse. The same libraries installed just fine on Windows. (Maybe you need to install Rtools first before installing these on Windows; I don’t remember.) I use conda all the time with Python, but I […]
On this day
This morning as a sort of experiment I decided to look back at all my blog posts written on May 30 each year. There’s nothing special about this date, so I thought it might give an eclectic cross section of things I’ve written about. *** Last year on this day I wrote about Calendars and […]
...37383940414243444546...