Feed john-d-cook John D. Cook

Link https://www.johndcook.com/blog
Feed http://feeds.feedburner.com/TheEndeavour?format=xml
Updated 2025-03-07 01:01
Fame, difficulty, and usefulness
Pierre Fermat is best known for two theorems, dubbed his “last” theorem and his “little” theorem. His last theorem is famous, difficult to prove, and useless. His little theorem is relatively arcane, easy to prove, and extremely useful. There’s little relation between technical difficulty and usefulness. Fermat’s last theorem Fermat’s last theorem says there are […]
Twisted elliptic curves
This morning I was sitting at a little bakery thinking about what to do before I check out of my hotel. I saw that the name of the bakery was Twist Bakery & Cafe, and that made me think of writing about twisted elliptic curves when I got back to my room. Twist of an […]
Hashing names does not protect privacy
Secure hash functions are practically impossible to reverse, but only if the input is unrestricted. If you generate 256 random bits and apply a secure 256-bit hash algorithm, an attacker wanting to recover your input can’t do much better than brute force, hashing 256-bit strings hoping to find one that matches your hash value. Even […]
May the best technology win
I’ve become skeptical of arguments of the form “X is a better technology, but people won’t quit using Y.” Comparisons of technologies are multi-faceted. When someone says “X is better than Y” I want to ask “By all criteria? There’s nothing better about Y?” When people say X is better but Y won, it’s often […]
Integral approximation trick
Here’s a simple integration approximation that works remarkably well in some contexts. Suppose you have an integrand that looks roughly like a normal density, something with a single peak that drops off fairly quickly on either side of the peak. The majority of integrals that arise in basic applications of probability and statistics fit this […]
Number of feet in a mile
Here are a couple amusing things I’ve run across recently regarding the number of feet in a mile. Both are frivolous, but also have a more serious side. Mnemonic First, you can use “five tomatoes” as a mnemonic for remembering that there are 5280 feet in a mile. “Five tomatoes” is a mnemonic for the […]
Discriminant of a cubic
The discriminant of a quadratic equation ax² + bx + c = 0 is Δ = b² – 4ac. If the discriminant Δ is zero, the equation has a double root, i.e. there is a unique x that makes the equation zero, and it counts twice as a root. If the discriminant is not zero, […]
Distribution of quadratic residues
Let p be an odd prime number. If the equation x² = n mod p has a solution then n is a square mod p, or in classical terminology, n is a quadratic residue mod p. Half of the numbers between 0 and p are quadratic residues and half are not. The residues are distributed […]
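The counting claim in this excerpt is easy to check by brute force. The sketch below squares every nonzero x mod p and collects the results; the function name `quadratic_residues` is made up for illustration, and 0 is left out of the count.

```python
def quadratic_residues(p):
    """Return the set of nonzero squares mod p, for an odd prime p."""
    return {(x * x) % p for x in range(1, p)}


for p in (7, 11, 13, 101):
    residues = quadratic_residues(p)
    # Exactly half of the numbers 1, ..., p-1 should be residues.
    print(p, len(residues), (p - 1) // 2)
```

Each nonzero residue is hit by exactly two values, x and p - x, which is why the count comes out to (p - 1)/2.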
What does CCPA say about de-identified data?
The California Consumer Privacy Act, or CCPA, takes effect January 1, 2020, less than six months from now. What does the act say about using deidentified data? First of all, I am not a lawyer; I work for lawyers, advising them on matters where law touches statistics. This post is not legal advice, but my […]
Serious applications of a party trick
In a group of 30 people, it’s likely that two people have the same birthday. For a group of 23 the probability is about 1/2, and it goes up as the group gets larger. In a group of a few dozen people, it’s unlikely that anyone will have a particular birthday, but it’s likely that […]
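The probabilities quoted in this excerpt can be reproduced in a few lines. This is a minimal sketch that assumes 365 equally likely birthdays and ignores leap years; the function name `prob_shared_birthday` is hypothetical.

```python
def prob_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday,
    assuming birthdays are independent and uniform over `days` days."""
    p_distinct = 1.0
    for i in range(n):
        # The (i+1)th person must avoid the i birthdays already taken.
        p_distinct *= (days - i) / days
    return 1.0 - p_distinct


for n in (10, 23, 30, 50):
    print(n, round(prob_shared_birthday(n), 4))
```

Under these assumptions the probability crosses 1/2 at n = 23 and is about 0.7 at n = 30.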
Channel quantity and quality
Years ago, when there were a couple dozen television stations, someone [1] speculated that when we got more channels we’d also get better content. The argument was that people are more similar in their base interests than in their more refined interests. Therefore if there are only a few channels, they will all appeal to […]
Symmetry in exponential sums
Today’s exponential sum is highly symmetric: These sums are often symmetric, but not always. For example, here’s the sum from a couple days ago: It’s not obvious from looking at the parameters whether a sum will be symmetric or not. Maybe someone could find and prove criteria for a sum to have certain symmetries. For […]
Proto-calculus
David Bressoud has written a new book entitled Calculus Reordered: A History of the Big Ideas. He presents the major themes of calculus in historical order, which is very different from the order in which it is now taught. We now begin with limits, then differentiation, integration, and infinite series. Historically, integration came first and […]
Homomorphic encryption
A function that satisfies f(x*y) = f(x)*f(y) is called a homomorphism. The symbol “*” can stand for any operation, and it need not stand for the same thing on both sides of the equation. Technically * is the group operation, and if the function f maps elements of one group to another, the group operation […]
Journalistic stunt with Emacs
Emacs has been called a text editor with ambitions of being an operating system, and some people semi-seriously refer to it as their operating system. Emacs does not want to be an operating system per se, but it is certainly ambitious. It can be a shell, a web browser, an email client, a calculator, a […]
Notes on computing hash functions
A secure hash function maps a file to a string of bits in a way that is hard to reverse. Ideally such a function has three properties: pre-image resistance collision resistance second pre-image resistance Pre-image resistance means that starting from the hash value, it is very difficult to infer what led to that output; it […]
Translating poetry
You can’t preserve every aspect of a text when translating. A strict word-for-word translation attempts to be faithful to the words but may be ungrammatical in the target language. An idea-for-idea translation is more readable, but still may not convey the style of the original. Translation reminds me of making maps. There have been countless […]
Category theory for data science: cautious optimism
I’m cautiously optimistic about applications of category theory. I believe category theory has the potential to be useful, but I’m skeptical of most claims to have found useful applications. Category theory has found useful application, especially behind the scenes, but a lot of supposed applications remind me of a line from Colin McLarty: [Jean-Pierre] Serre […]
Software to factor integers
In my previous post, I showed how changing one bit of a semiprime (i.e. the product of two primes) creates an integer that can be factored much faster. I started writing that post using Python with SymPy, but moved to Mathematica because factoring took too long. SymPy vs Mathematica When I’m working in Python, SymPy […]
Making public keys factorable with Rowhammer
The security of RSA encryption depends on the fact that the product of two large primes is difficult to factor. So if p and q are large primes, say 2048 bits each, then you can publish n = pq with little fear that someone can factor n to recover p and q. But if you […]
Bounds on the nth prime
The nth prime is approximately n log n. For more precise estimates, there are numerous upper and lower bounds for the nth prime, each tighter over some intervals than others. Here I want to point out upper and lower bounds from a dissertation by Christian Axler on page viii. First, define Then for sufficiently large […]
Converting between nines and sigmas
Nines and sigmas are two ways to measure quality. You’ll hear something has four or five nines of reliability or that some failure is a five sigma event. What do these mean, and how do you convert between them? Definitions If a system has five nines of availability, that means the probability of the system […]
Maybe it’s just hard
If someone tells you repeatedly that something isn’t hard, maybe it’s just hard. Monads A post by Gilad Bracha got me thinking about this. He says Last time I looked, the Haskell wiki listed 29 tutorials on [monads]. … Could it just be that people just have a hard time understanding monads? If so, what […]
Why are regular expressions difficult?
Regular expressions are challenging, but not for the reasons commonly given. Non-reasons Here are some reasons given for the difficulty of regular expressions that I don’t agree with. Cryptic syntax I think complaints about cryptic syntax miss the mark. Some people say that Greek is hard to learn because it uses a different alphabet. If […]
Feller-Tornier constant
Here’s kind of an unusual question: What is the density of integers that have an even number of prime factors with an exponent greater than 1? To define the density, you take the proportion up to an integer N then take the limit as N goes to infinity. It’s not obvious that the limit should […]
Translating Robert Burns
Last year Adam Roberts had some fun with Finnegans Wake [1], seeing how little he could edit it and turn it into something that sounded like Return of the Jedi. I wrote a blog post where I quantified the difference between the original and the parody using Levenshtein distance, basically how many edits it takes […]
Protecting privacy while keeping detailed date information
A common attempt to protect privacy is to truncate dates to just the year. For example, the Safe Harbor provision of the HIPAA Privacy Rule says to remove “all elements of dates (except year) for dates that are directly related to an individual …” This restriction exists because dates of service can be used to […]
Per stirpes and random walks
If an inheritance is to be divided per stirpes, each descendant gets an equal share. If a descendant has died but has living descendants, his or her share is distributed by applying the rule recursively. Example For example, suppose a man had two children, Alice and Bob, and stipulates in his will that his estate […]
SQRL: Secure Quick Reliable Login
Steve Gibson’s Security Now is one of the podcasts I regularly listen to, and so I’ve been hearing him talk about his SQRL for a while. This week he finally released SQRL: Secure Quick Reliable Login. You can read more about SQRL in the white paper posted on the GRC web site. Here’s a tease […]
The cost of no costs
The reason businesses have employees rather than contracting out everything is to reduce transaction costs. If a company needs enough graphics work, they hire a graphic artist rather than outsourcing every little project, eliminating the need to evaluate bids, write contracts, etc. Some things are easier when no money has to change hands. But some […]
Trott’s constant
Trott’s constant is the unique number whose digits equal its continued fraction coefficients. Uniqueness assumes the number is expanded into a simple continued fraction, i.e. one with all numerators equal to 1. See OEIS sequence A039662. More continued fraction posts Best rational approximations to π Continued fraction cryptography Normal hazard continued fraction
Using one RNG to sample another
Suppose you have two pseudorandom bit generators. They’re both fast, but not suitable for cryptographic use. How might you combine them into one generator that is suitable for cryptography? Coppersmith et al [1] had a simple but effective approach which they call the shrinking generator, also called irregular decimation. The idea is to use one […]
Rock, paper, scissors, algebra
Aatish Bhatia posted something interesting on Twitter: if you define multiplication on Rock, Paper, Scissors to be the winner of a match, the result is commutative but not associative. Here's a neat thing about the algebra of Rock, Paper, Scissors. If you define 'multiplication' as the game's winner, then it's commutative, i.e. P x R […]
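The commutative-but-not-associative claim can be verified exhaustively, since there are only three elements. The sketch below assumes that a move "multiplied" by itself gives itself (a tie returns the common move); the excerpt does not spell out that case, so treat it as an assumption. The names `BEATS` and `mult` are made up for this example.

```python
from itertools import product

MOVES = "RPS"
# (winner, loser) pairs: rock beats scissors, scissors beats paper, paper beats rock.
BEATS = {("R", "S"), ("S", "P"), ("P", "R")}


def mult(a, b):
    """'Multiply' two moves by returning the winner of the match.
    Assumption: a tie returns the common move."""
    if a == b:
        return a
    return a if (a, b) in BEATS else b


commutative = all(mult(a, b) == mult(b, a) for a, b in product(MOVES, repeat=2))
associative = all(
    mult(mult(a, b), c) == mult(a, mult(b, c))
    for a, b, c in product(MOVES, repeat=3)
)
print("commutative:", commutative)
print("associative:", associative)
```

One counterexample to associativity: (R x P) x S = P x S = S, while R x (P x S) = R x S = R.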
Liminal and subliminal
It occurred to me for the first time this morning that the words liminal and subliminal must be related, just after reading an article by Vicki Boykis that discusses liminal spaces. I hear the two words in such different contexts (architecture versus psychology) and hadn’t thought about the connection until now. If I were playing a […]
Cop with a mop
Yesterday I was at a wedding, and a vase broke in the aisle shortly before the bridal party was to enter. Guests quickly picked up the pieces, but the vase left a pool of water on the hard floor. A security guard ran (literally) for a mop and cheerfully picked up the water. He could […]
R with Conda
I’ve been unable to get some R libraries to install on my Linux laptop. Two libraries in particular were tseries and tidyverse. The same libraries installed just fine on Windows. (Maybe you need to install Rtools first before installing these on Windows; I don’t remember.) I use conda all the time with Python, but I […]
On this day
This morning as a sort of experiment I decided to look back at all my blog posts written on May 30 each year. There’s nothing special about this date, so I thought it might give an eclectic cross section of things I’ve written about. *** Last year on this day I wrote about Calendars and […]
Sum of all Spheres
I ran across a video this afternoon that explains that the sum of volumes of all even-dimensional unit spheres equals e^π. Why is that? Define vol(n) to be the volume of the unit sphere in dimension n. Then and so the sum of the volumes of all even dimensional spheres is But what if you […]
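A quick numerical check of this result, using the standard formula vol(2k) = π^k / k! for the volume enclosed by the unit sphere in dimension 2k. The sketch assumes the sum starts at dimension 0, counted as volume 1, which makes the series exactly the power series for e^π; the helper name `vol_even` is hypothetical.

```python
from math import exp, factorial, pi


def vol_even(k):
    """Volume of the unit ball in dimension 2k: pi**k / k!."""
    return pi**k / factorial(k)


# The terms shrink factorially, so 60 terms is far more than enough.
total = sum(vol_even(k) for k in range(60))
print(total)
print(exp(pi))
```

The two printed values should agree to within floating point rounding.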
Inside the AES S-box
The AES (Advanced Encryption Standard) algorithm takes in blocks of 128 or more bits [1] and applies a sequence of substitutions and permutations. The substitutions employ an “S-box”, named the Rijndael S-box after its designers [2], an invertible nonlinear transformation that works on 8 bits at a time. There are 256 = 16 × 16 […]
Random sampling from a file
I recently learned about the Linux command line utility shuf from browsing The Art of Command Line. This could be useful for random sampling. Given just a file name, shuf randomly permutes the lines of the file. With the option -n you can specify how many lines to return. So it’s doing sampling without replacement. […]
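For comparison, here is a rough Python analogue of sampling lines without replacement. It is not how shuf is implemented; it just shows the same idea with the standard library's `random.sample`. The function name `sample_lines` and the optional `seed` parameter are made up for this sketch.

```python
import random


def sample_lines(lines, n, seed=None):
    """Return n lines chosen without replacement, in random order."""
    rng = random.Random(seed)
    return rng.sample(lines, n)


lines = [f"line {i}" for i in range(100)]
for line in sample_lines(lines, 5, seed=42):
    print(line)
```

To sample from a real file, read it first with something like `open(path).read().splitlines()` and pass the resulting list in.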
Between now and quantum
The National Security Agency has stated clearly that they believe this is the time to start moving to quantum-resistant encryption. Even the most optimistic enthusiasts for quantum computing believe that practical quantum computers are years away, but so is the standardization of post-quantum encryption methods. The NSA has also made some suggestions for what to […]
Cosmic rays flipping bits
A cosmic ray striking computer memory at just the right time can flip a bit, turning a 0 into a 1 or vice versa. While I knew that cosmic ray bit flips were a theoretical possibility, I didn’t know until recently that there had been documented instances on the ground [1]. Radiolab did an episode […]
Strong primes
There are a couple different definitions of a strong prime. In number theory, a strong prime is one that is closer to the next prime than to the previous prime. For example, 11 is a strong prime because it is closer to 13 than to 7. In cryptography, strong primes are, roughly speaking, primes […]
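The number theory definition is simple to test directly. The sketch below uses trial division, which is fine for small numbers, and the helper names `is_prime` and `strong_primes` are hypothetical. It covers only the number theory sense of "strong", not the cryptographic one.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def strong_primes(limit):
    """Primes p below limit that are closer to the next prime than to the
    previous one. The largest prime below limit is not tested, because its
    successor is not in the list."""
    primes = [n for n in range(2, limit) if is_prime(n)]
    return [
        p
        for prev, p, nxt in zip(primes, primes[1:], primes[2:])
        if nxt - p < p - prev
    ]


print(strong_primes(50))
```

For 11 the gaps are 13 - 11 = 2 and 11 - 7 = 4, so 11 passes the test, matching the example in the excerpt.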
Unifiers and Diversifiers
I saw a couple tweets this morning quoting Freeman Dyson’s book Infinite in All Directions. Unifiers are people whose driving passion is to find general principles which will explain everything. They are happy if they can leave the universe looking a little simpler than they found it. Diversifiers are people whose passion is to explore […]
Internet privacy as seen from 1975
Science fiction authors set stories in the future, but they don’t necessarily try to predict the future, and so it’s a little odd to talk about what they “got right.” Getting something right implies they were making a prediction rather than imagining a setting of a story. However, sometimes SF authors do indeed try to […]
Impossible to misunderstand
“The goal is not to be possible to understand, but impossible to misunderstand.” I saw this quote at the beginning of a math book when I was a student and it stuck with me. I would think of it when grading exams. Students often assume it is enough to be possible to understand, possible for […]
Comparing Truncation to Differential Privacy
Traditional methods of data de-identification obscure data values. For example, you might truncate a date to just the year. Differential privacy obscures query values by injecting enough noise to keep from revealing information on an individual. Let’s compare two approaches for de-identifying a person’s age: truncation and differential privacy. Truncation First consider truncating birth date […]
Golden ratio primes
The golden ratio is the larger root of the equation φ² – φ – 1 = 0. By analogy, golden ratio primes are prime numbers of the form p = φ² – φ – 1 where φ is an integer. To put it another way, instead of solving the equation φ² – φ – 1 […]
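The definition can be explored with a short search over integer values of φ. This is a sketch using trial division, with hypothetical helper names `is_prime` and `golden_ratio_primes`; it only looks at small values, far below the sizes relevant to cryptography.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def golden_ratio_primes(max_phi):
    """Return (phi, p) pairs with p = phi**2 - phi - 1 prime, 2 <= phi <= max_phi."""
    result = []
    for phi in range(2, max_phi + 1):
        p = phi * phi - phi - 1
        if is_prime(p):
            result.append((phi, p))
    return result


for phi, p in golden_ratio_primes(10):
    print(phi, p)
```

For example, φ = 3 gives p = 9 - 3 - 1 = 5, while φ = 8 gives 55, which is not prime.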
Goldilocks and the three multiplications
Mike Hamburg designed an elliptic curve for use in cryptography he calls Ed448-Goldilocks. The prefix Ed refers to the fact that it’s an Edwards curve. The number 448 refers to the fact that the curve is over a prime field where the prime p has size 448 bits. But why Goldilocks? Golden primes and Goldilocks […]
Tricks for arithmetic modulo NIST primes
The US National Institute of Standards and Technology (NIST) originally recommended 15 elliptic curves for use in elliptic curve cryptography [1]. Ten of these are over a field of size 2ⁿ. The other five are over prime fields. The sizes of these fields are known as the NIST primes. The NIST curves over prime fields […]