Feed john-d-cook John D. Cook


Link https://www.johndcook.com/blog
Feed http://feeds.feedburner.com/TheEndeavour?format=xml
Updated 2025-04-20 06:31
Building high frequencies out of low frequencies
If you have sines and cosines of some fundamental frequency, and you’re able to take products and sums, then you can construct sines and cosines at any multiple of the fundamental frequency. Here’s a proof. Taking real parts gives us cos nθ in the first equation and the even terms of the sum in the […]
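The claim in this excerpt can be spot-checked numerically. Taking the real part of (cos θ + i sin θ)ⁿ expresses cos nθ in products and sums of the fundamental frequency; here is a minimal sketch for n = 3 (the choice of n and sample points is illustrative, not from the post):

```python
import math

def cos3(theta):
    """cos 3θ built only from products and sums of cos θ:
    the real part of (cos θ + i sin θ)³ is 4cos³θ − 3cosθ."""
    c = math.cos(theta)
    return 4 * c**3 - 3 * c

# Agrees with cos(3θ) at a handful of sample points.
assert all(abs(cos3(t) - math.cos(3 * t)) < 1e-12 for t in [0.0, 0.7, 1.4, 2.1])
```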
Prime plus power of 2
A new article [1] looks at the problem of determining the proportion of odd numbers that can be written as the sum of a power of 2 and a prime. A. de Polignac conjectured in 1849 that all odd numbers have this form. A counterexample is 127, and so apparently the conjecture was that every odd number is […]
Analogy between Fibonacci and Chebyshev
Quick observation: I recently noticed that Chebyshev polynomials and Fibonacci numbers have analogous formulas. The nth Chebyshev polynomial satisfies for |x| ≥ 1, and the nth Fibonacci number is given by There’s probably a way to explain the similarity in terms of the recurrence relations that both sequences satisfy. More on Chebyshev polynomials Yogi Berra […]
On using your computer as more than a terminal
Here are five things I appreciate about using Emacs. They also apply to any software that runs entirely on your computer. It doesn’t track me. It doesn’t show me ads. It doesn’t require two-factor authentication. It doesn’t change unless I change it. It doesn’t stop working if my internet connection stops working. Related post: Negative […]
A spring, a rubber band, and chaos
Suppose you have a mass suspended by the combination of a spring and a rubber band. A spring resists being compressed but a rubber band does not. So the rubber band resists motion as the mass moves down but not as it moves up. In [1] the authors use this situation to motivate the following […]
Hypocycloids
In the previous post, I showed how to plot “envelopes of epicycloids.” This post will consider a variation on the same theme, hypocycloids. For the epicycloid post, we imagined two ants crawling around a circle at different speeds, and drawing lines between their positions at various times. Although the ants were traveling at different speeds, […]
Envelopes of epicycloids (pretty pictures!)
Imagine two ants crawling around a circle at different speeds and draw a line between the two ants at regular time intervals. The position of the two ants at time t are (cos pt, sin pt) and (cos qt, sin qt) where p and q are integers, p > q > 0, and t comes from dividing the […]
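The construction in this excerpt is easy to compute: sample t at evenly spaced times and record the chord between the two ants. A sketch under assumed parameters (p = 5, q = 2, and 200 time steps are illustrative choices, not the post's):

```python
import math

def envelope_segments(p, q, n):
    """Endpoints of the n chords drawn between two 'ants' at
    (cos pt, sin pt) and (cos qt, sin qt) for t = 2πk/n."""
    segs = []
    for k in range(n):
        t = 2 * math.pi * k / n
        a = (math.cos(p * t), math.sin(p * t))
        b = (math.cos(q * t), math.sin(q * t))
        segs.append((a, b))
    return segs

segs = envelope_segments(5, 2, 200)
# Both endpoints of every chord lie on the unit circle.
assert all(abs(x * x + y * y - 1) < 1e-9 for a, b in segs for x, y in (a, b))
```

Feeding these segments to any line-plotting routine reproduces the envelope pictures.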
Short essays on programming languages
I saw a link to So You Think You Know C? by Oleksandr Kaleniuk on Hacker News and was pleasantly surprised. I expected a few comments about tricky parts of C, and found them, but there’s much more. The subtitle of the free book is And Ten More Short Essays on Programming Languages. Good reads. […]
Probability that a cubic has two turning points
Most cubic polynomials with real coefficients have two turning points, a local maximum and a local minimum. But how do you quantify “most”? Here’s how one author did it [1]. Start with the cubic polynomial x³ + ax² + bx + c Since multiplying a polynomial by a nonzero constant doesn’t change how many turning […]
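The turning-point count itself is elementary, whatever the probability model: x³ + ax² + bx + c has two turning points exactly when its derivative 3x² + 2ax + b has two distinct real roots. A minimal sketch (the test cases are illustrative):

```python
def turning_points(a, b):
    """Number of turning points of x³ + ax² + bx + c. The derivative
    3x² + 2ax + b has two distinct real roots iff its discriminant
    (2a)² − 12b = 4a² − 12b is positive; c is irrelevant."""
    disc = 4 * a * a - 12 * b
    if disc > 0:
        return 2
    return 0  # disc ≤ 0: monotone, or a single stationary inflection when disc = 0

assert turning_points(0, -1) == 2   # x³ − x has a local max and min
assert turning_points(0, 1) == 0    # x³ + x is monotone
```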
Truncated distributions vs clipped distributions
In the previous post, I looked at truncated probability distributions. A truncated normal distribution, for example, lives on some interval and has a density proportional to a normal distribution; the proportionality constant is whatever it has to be to make the density integrate to 1. Truncated distributions Suppose you wanted to generate random samples from […]
The truncated Cauchy paradox
The Cauchy distribution is the probability distribution on the real line with density proportional to 1/(1 + x²). It comes up often as an example of a fat-tailed distribution, one that can wreak havoc on intuition and on applications. It has no mean and no variance. The truncated Cauchy distribution has the same density, with […]
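Since the Cauchy CDF is F(x) = 1/2 + arctan(x)/π, sampling the truncated distribution by inverse CDF is a one-liner. A sketch, assuming the standard Cauchy and an interval [a, b] (the interval and sample count below are illustrative):

```python
import math
import random

def truncated_cauchy(a, b, rng=random.random):
    """Sample a standard Cauchy truncated to [a, b] by inverse CDF:
    draw u uniformly from (F(a), F(b)), then apply F⁻¹(u) = tan(π(u − 1/2))."""
    F = lambda x: 0.5 + math.atan(x) / math.pi
    u = F(a) + (F(b) - F(a)) * rng()
    return math.tan(math.pi * (u - 0.5))

random.seed(1)
xs = [truncated_cauchy(-10, 10) for _ in range(1000)]
assert all(-10 <= x <= 10 for x in xs)
```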
LaTeX command frequencies
In the previous post I presented a bash one-liner to search directories for LaTeX files and count the commands used. College files: I first tried this out on a directory that included some old files from grad school. I chose this directory because I knew it had a lot of LaTeX files, but I was […]
A shell one-liner to search directories
I started this post by wanting to look at the frequency of LaTeX commands, but then thought that some people might find the code to find the frequencies more interesting than the frequencies themselves. So I’m splitting this into two posts. This post will look at the shell one-liner to find command frequencies, and the […]
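The same frequency count is easy to sketch in Python; this is a stand-in for the post's shell pipeline, not its actual one-liner, and the command pattern \[A-Za-z]+ is an assumption:

```python
import re
from collections import Counter
from pathlib import Path

def latex_command_counts(root="."):
    """Count LaTeX commands (a backslash followed by letters)
    across all .tex files under root."""
    counts = Counter()
    for path in Path(root).rglob("*.tex"):
        counts.update(re.findall(r"\\[A-Za-z]+", path.read_text(errors="ignore")))
    return counts

# The counting step works the same on a plain string:
sample = r"\begin{align} x &= \frac{1}{2} \\ y &= \frac{3}{4} \end{align}"
c = Counter(re.findall(r"\\[A-Za-z]+", sample))
assert c[r"\frac"] == 2 and c[r"\begin"] == 1
```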
Expected length of longest common DNA substrings
If we have two unrelated sequences of DNA, how long would we expect the longest common substring of the two sequences to be? Since DNA sequences come from a very small alphabet, just four letters, any two sequences of moderate length will have some common substring, but how long should we expect it to […]
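Computing the longest common substring of two given sequences is a standard dynamic program; a sketch (the example strings are illustrative, not from the post):

```python
def longest_common_substring(a, b):
    """Length of the longest common contiguous substring of a and b,
    via the classic dynamic program over prefix pairs."""
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0] * (len(b) + 1)
        for j, y in enumerate(b, 1):
            if x == y:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best

assert longest_common_substring("GATTACA", "TACAGA") == 4   # "TACA"
```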
Powers that don’t change the last digit
If you raise any number to the fifth power, the last digit doesn’t change. Here’s a little Python code to verify this claim.

    >>> [n**5 for n in range(10)]
    [0, 1, 32, 243, 1024, 3125, 7776, 16807, 32768, 59049]

In case you’re not familiar with Python, or familiar with Python but not familiar with list […]
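The claim is equivalent to n⁵ ≡ n (mod 10), which follows from Fermat's little theorem modulo 2 and modulo 5; a brute-force check over a larger range:

```python
# n⁵ ends in the same digit as n, i.e. n⁵ ≡ n (mod 10).
# Checking the first ten thousand integers (the range is arbitrary).
assert all(n**5 % 10 == n % 10 for n in range(10_000))
```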
An application of Kronecker products
A while back I wrote about Kronecker products in the context of higher order Taylor series. Here’s how I described the Kronecker product in that post. The Kronecker product of an m × n matrix A and a p × q matrix B is a mp × nq matrix K = A ⊗ B. You can think of K as a block partitioned matrix. The ij block […]
A wrinkle in Clojure
Bob Martin recently posted a nice pair of articles, A Little Clojure and A Little More Clojure. In the first article he talks about how spare and elegant Clojure is. In the second article he shows how to write a program to list primes using map and filter rather than if and while. He approaches […]
Exponential growth vs logistic growth
This seems like a good time to discuss the difference between exponential growth and logistic growth as the COVID-19 pandemic is starting to look more like a logistic model and less like an exponential model, at least in many parts of the world [1]. This post is an expansion of a Twitter thread I wrote […]
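The key contrast is that a logistic curve looks exponential early on but saturates later. A minimal sketch, where the carrying capacity K, rate r, and midpoint t₀ are illustrative parameters:

```python
import math

def logistic(t, K=1.0, r=1.0, t0=0.0):
    """Logistic curve K / (1 + exp(−r(t − t0)))."""
    return K / (1 + math.exp(-r * (t - t0)))

# Early on, growth per unit time is nearly the exponential factor e^r ...
early_ratio = logistic(-8) / logistic(-9)
assert abs(early_ratio - math.e) < 0.01
# ... but eventually the curve saturates at K.
assert abs(logistic(50) - 1.0) < 1e-12
```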
Sine series for a sine
The Fourier series of an odd function only has sine terms—all the cosine coefficients are zero—and so the Fourier series is a sine series. What is the sine series for a sine function? If the frequency is an integer, then the sine series is just the function itself. For example, the sine series for sin(5x) […]
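The sine coefficients bₙ = (2/π) ∫₀^π f(x) sin(nx) dx can be checked numerically; for an integer frequency only the matching coefficient survives. A sketch using the trapezoid rule (the grid size is an arbitrary choice):

```python
import math

def sine_coefficient(f, n, m=10_000):
    """b_n = (2/π) ∫₀^π f(x) sin(nx) dx by the trapezoid rule on m panels."""
    h = math.pi / m
    total = 0.5 * (f(0.0) * math.sin(0.0) + f(math.pi) * math.sin(n * math.pi))
    total += sum(f(k * h) * math.sin(n * k * h) for k in range(1, m))
    return 2 / math.pi * h * total

# For sin(5x) the sine series is the function itself: b_5 = 1, other b_n = 0.
f = lambda x: math.sin(5 * x)
assert abs(sine_coefficient(f, 5) - 1) < 1e-6
assert abs(sine_coefficient(f, 3)) < 1e-6
```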
Two meanings of QR code
“QR code” can mean a couple different things. There is a connection between these two, though that’s not at all obvious. What almost everyone thinks of as a QR code is a quick response code, a grid of black and white squares that encode some data. For example, the QR code below contains my contact […]
Center of mass and vectorization
Para Parasolian left a comment on my post about computing the area of a polygon, suggesting that I “say something similar about computing the centroid of a polygon using a similar formula.” This post will do that, and at the same time discuss vectorization. Notation We start by listing the vertices starting anywhere and moving […]
Making an invertible function out of non-invertible parts
How can you make an invertible function out of non-invertible parts? Why would you want to? Encryption functions must be invertible. If the intended recipient can’t decrypt the message then the encryption method is useless. Of course you want an encryption function to be really hard to invert without the key. It’s hard to think […]
Underestimating risk
When I hear that a system has a one in a trillion (1,000,000,000,000) chance of failure, I immediately translate that in my mind to “So, optimistically the system has a one in a million (1,000,000) chance of failure.” Extremely small probabilities are suspicious because they often come from one of two errors: Wrongful assumption of […]
Reasoning under uncertainty
Reasoning under uncertainty sounds intriguing. Brings up images of logic, philosophy, and artificial intelligence. Statistics sounds boring. Brings up images of tedious, opaque calculations followed by looking up some number in a table. But statistics is all about reasoning under uncertainty. Many people get through required courses in statistics without ever hearing that, or at least […]
Lee distance: codes and music
The Hamming distance between two sequences of symbols is the number of places in which they differ. For example, the Hamming distance between the words “hamming” and “farming” is 2, because the two words differ in their first and third letters. Hamming distance is natural when comparing sequences of bits because bits are either the […]
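The definition above translates directly into code; a minimal sketch:

```python
def hamming(a, b):
    """Hamming distance: the number of positions at which two
    equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

# The example from the post: "hamming" and "farming" differ in two places.
assert hamming("hamming", "farming") == 2
```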
Conditional independence notation
Ten years ago I wrote a blog post that concludes with this observation: The ideas of being relatively prime, independent, and perpendicular are all related, and so it makes sense to use a common symbol to denote each. This post returns to that theme, particularly looking at independence of random variables. History Graham, Knuth, and […]
Three composition theorems for differential privacy
This is a brief post, bringing together three composition theorems for differential privacy. The composition of an ε₁-differentially private algorithm and an ε₂-differentially private algorithm is an (ε₁+ε₂)-differentially private algorithm. The composition of an (ε₁, δ₁)-differentially private algorithm and an (ε₂, δ₂)-differentially private algorithm is an (ε₁+ε₂, δ₁+δ₂)-differentially private algorithm. The composition of an (α, […]
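The first two theorems say privacy budgets add coordinatewise under basic composition. A minimal sketch of budget bookkeeping (just the additive rule from the excerpt; tighter advanced-composition bounds exist and are not shown):

```python
def compose(*budgets):
    """Basic composition for (ε, δ)-differential privacy:
    budgets add coordinatewise. Pure ε-DP is the case δ = 0."""
    eps = sum(b[0] for b in budgets)
    delta = sum(b[1] for b in budgets)
    return (eps, delta)

assert compose((0.5, 0.0), (0.5, 0.0)) == (1.0, 0.0)
assert compose((1.0, 1e-6), (2.0, 1e-6)) == (3.0, 2e-6)
```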
Minimizing worst case error
It’s very satisfying to know that you have a good solution even under the worst circumstances. Worst-case thinking doesn’t have to be concerned with probabilities, with what is likely to happen, only with what could happen. But whenever you speak of what could happen, you have to limit your universe of possibilities. Suppose you ask […]
Pecunia non olet
I’ve been rereading That Hideous Strength. I’m going through it slowly this time, paying attention to details I glossed over before. For example, early in the book we’re told that the head of a college has the nickname N.O. N.O., which stood for Non-Olet, was the nickname of Charles Place, the warden of Bracton. The […]
Simple clinical trial of four COVID-19 treatments
A story came out in Science yesterday saying the World Health Organization is launching a trial of what it believes are the four most promising treatments for COVID-19 (a.k.a. SARS-CoV-2, novel coronavirus, etc.) The four treatment arms will be:

- Remdesivir
- Chloroquine and hydroxychloroquine
- Ritonavir + lopinavir
- Ritonavir + lopinavir + interferon beta

plus standard […]
Product of copulas
A few days ago I wrote a post about copulas and operations on them that have a group structure. Here’s another example of group structure for copulas. As in the previous post I’m just looking at two-dimensional copulas to keep things simple. Given two copulas C₁ and C₂, you can define a sort of product […]
How to Set Num Lock on permanently
When I use my Windows laptop, I’m always accidentally brushing against the Num Lock key. I suppose it’s because the keys are so flat; I never have this problem on a desktop. I thought there must be some way to set it so that it’s always on, so I searched for it. First I found […]
New Asymptotic function in Mathematica 12.1
One of the new features in Mathematica 12.1 is the function Asymptotic. Here’s a quick example of using it. Here’s an asymptotic series for the log of the gamma function I wrote about here. If we ask Mathematica Asymptotic[LogGamma[z], z -> Infinity] we get simply the first term: But we can set the argument SeriesTermGoal […]
Extended floating point precision in R and C
The GNU MPFR library is a C library for extended precision floating point calculations. The name stands for Multiple Precision Floating-point Reliable. The library has an R wrapper Rmpfr that is more convenient for interactive use. There are also wrappers for other languages. It takes a long time to install MPFR and its prerequisite GMP, […]
When is round-trip floating point radix conversion exact?
Suppose you store a floating point number in memory, print it out in human-readable base 10, and read it back in. When can the original number be recovered exactly? D. W. Matula answered this question more generally in 1968 [1]. Suppose we start with base β with p places of precision and convert to base […]
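For IEEE doubles (β = 2, p = 53) the standard consequence of such results is that 17 significant decimal digits always suffice for an exact round trip. Python's repr already emits a round-trippable shortest string, which makes the property easy to observe:

```python
# Python's repr for a float is the shortest decimal string that
# parses back to the same double; 17 significant digits always suffice.
x = 0.1 + 0.2
assert float(repr(x)) == x            # shortest round-trippable form
assert float(f"{x:.17g}") == x        # 17 digits are always enough
```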
Group symmetry of copula operations
You don’t often see references to group theory in a statistics book. Not that there aren’t symmetries in statistics that could be described in terms of groups, but this isn’t often pointed out. Here’s an example from An Introduction to Copulas by Roger Nelsen. Show that under composition the set of operations of forming the […]
Product of Chebyshev polynomials
Chebyshev polynomials satisfy a lot of identities, much like trig functions do. This post will look briefly at just one such identity. Chebyshev polynomials Tₙ are defined for n = 0 and 1 by T₀(x) = 1 and T₁(x) = x, and for larger n using the recurrence relation Tₙ₊₁(x) = 2x Tₙ(x) − Tₙ₋₁(x). This implies […]
The Brothers Markov
The Markov brother you’re more likely to have heard of was Andrey Markov. He was the Markov of Markov chains, the Gauss-Markov theorem, and Markov’s inequality. Andrey had a lesser known younger brother Vladimir who was also a mathematician. Together the two of them proved what is known as the Markov Brothers’ inequality to distinguish […]
Finding coffee in Pi
It is widely believed that π is a “normal number,” which would mean that you can find any integer sequence you want inside the digits of π, in any base, if you look long enough. So for Pi Day, I wanted to find c0ffee inside the hexadecimal representation of π. First I used TinyPI, a […]
Chebyshev approximation
In the previous post I mentioned that the Remez algorithm computes the best polynomial approximation to a given function f as measured by the maximum norm. That is, for a given n, it finds the polynomial p of order n that minimizes the absolute error || f − p ||∞. The Mathematica function MiniMaxApproximation minimizes the relative […]
Remez algorithm and best polynomial approximation
The best polynomial approximation, in the sense of minimizing the maximum error, can be found by the Remez algorithm. I expected Mathematica to have a function implementing this algorithm, but apparently it does not have one. (But see update below.) It has a function named MiniMaxApproximation which sounds like Remez algorithm, and it’s close, but […]
MDS codes
A maximum distance separable code, or MDS code, is a way of encoding data so that the distance between code words is as large as possible for a given data capacity. This post will explain what that means and give examples of MDS codes. Notation A linear block code takes a sequence of k symbols […]
Automatic data reweighting
Suppose you are designing an autonomous system that will gather data and adapt its behavior to that data. At first you face the so-called cold-start problem. You don’t have any data when you first turn the system on, and yet the system needs to do something before it has accumulated data. So you prime the […]
Maximum gap between binomial coefficients
I recently stumbled on a formula for the largest gap between consecutive items in a row of Pascal’s triangle. For n ≥ 2, where For example, consider the 6th row of Pascal’s triangle, the coefficients of (x + y)⁶: 1, 6, 15, 20, 15, 6, 1. The largest gap is 9, the gap between 6 […]
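The excerpt's formula is not reproduced here, but the quantity itself is easy to compute by brute force, which also confirms the row-6 example:

```python
from math import comb

def max_gap(n):
    """Largest difference between consecutive entries
    in row n of Pascal's triangle."""
    row = [comb(n, k) for k in range(n + 1)]
    return max(b - a for a, b in zip(row, row[1:]))

# Row 6 is 1, 6, 15, 20, 15, 6, 1; the largest gap is 15 − 6 = 9.
assert max_gap(6) == 9
```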
Formatting in comments
The comments to the posts here are generally very helpful. I appreciate your contributions to the site. I wanted to offer a tip for those who leave comments and are frustrated by the way the comments appear, especially those who write nicely formatted snippets of code only to see the formatting lost. There is a […]
Sum of squared digits
Take a positive integer x, square each of its digits, and sum. Now do the same to the result, over and over. What happens? To find out, let’s write a little Python code that sums the squares of the digits.

    def G(x):
        return sum(int(d)**2 for d in str(x))

This function turns a number into a […]
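Iterating G answers the "what happens?" question experimentally: every trajectory either reaches the fixed point 1 or falls into the cycle containing 4 (this is the classic happy-numbers dichotomy; the starting values below are illustrative):

```python
def G(x):
    """Sum of the squared digits of x."""
    return sum(int(d)**2 for d in str(x))

def trajectory(x, steps=50):
    """Apply G repeatedly, recording the values visited."""
    seen = [x]
    for _ in range(steps):
        x = G(x)
        seen.append(x)
    return seen

assert 1 in trajectory(7)   # 7 → 49 → 97 → 130 → 10 → 1 ("happy")
assert 4 in trajectory(2)   # 2 → 4 → 16 → 37 → 58 → 89 → 145 → 42 → 20 → 4
```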
Computing the area of a thin triangle
Heron’s formula computes the area of a triangle given the length of each side. where If you have a very thin triangle, one where two of the sides approximately equal s and the third side is much shorter, a direct implementation of Heron’s formula may not be accurate. The cardinal rule of numerical programming is to […]
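A standard remedy for the cancellation problem is Kahan's rearrangement of Heron's formula, sketched below (Kahan's method is a well-known fix for this issue, not necessarily the one the post settles on):

```python
import math

def area_kahan(a, b, c):
    """Kahan's numerically stable rearrangement of Heron's formula.
    The parenthesization is essential and must not be 'simplified'."""
    a, b, c = sorted((a, b, c), reverse=True)   # ensure a ≥ b ≥ c
    if c - (a - b) < 0:
        raise ValueError("not a valid triangle")
    return 0.25 * math.sqrt(
        (a + (b + c)) * (c - (a - b)) * (c + (a - b)) * (a + (b - c))
    )

assert area_kahan(3, 4, 5) == 6.0   # right triangle, computed exactly
```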
A tale of two iterations
I recently stumbled on a paper [1] that looks at a cubic equation that comes out of a problem in orbital mechanics: σx³ = (1 + x)² Much of the paper is about the derivation of the equation, but here I’d like to focus on a small part of the paper where the author looks […]
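One natural way to solve σx³ = (1 + x)² is fixed-point iteration on x ← ((1 + x)²/σ)^(1/3); a sketch with an illustrative σ and starting point (the paper compares iterations, and this is just one of the candidates):

```python
# Solve σx³ = (1 + x)² by iterating x ← ((1 + x)² / σ)^(1/3).
# σ = 2 and the starting point x = 1 are arbitrary illustrative choices.
sigma = 2.0
x = 1.0
for _ in range(100):
    x = ((1 + x)**2 / sigma) ** (1 / 3)

# The fixed point satisfies the original cubic to high accuracy.
assert abs(sigma * x**3 - (1 + x)**2) < 1e-9
```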
Perfect codes
A couple days ago I wrote about Hamming codes and said that they are so-called perfect codes, i.e. codes for which Hamming’s upper bound on the number of code words with given separation is exact. Not only are Hamming codes perfect codes, they’re practically the only non-trivial perfect codes. Specifically, Tietavainen and van Lint proved […]
Computing parity of a binary word
The previous post mentioned adding a parity bit to a string of bits as a way of detecting errors. The parity of a binary word is 1 if the word contains an odd number of 1s and 0 if it contains an even number of 1s. Codes like the Hamming codes in the previous post […]
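The definition translates into a short function; this sketch clears one set bit per loop iteration (the x &= x − 1 trick), which is one common way to do it:

```python
def parity(x):
    """Parity of a nonnegative integer's binary representation:
    1 for an odd number of 1 bits, 0 for an even number."""
    p = 0
    while x:
        p ^= 1
        x &= x - 1   # clear the lowest set bit
    return p

assert parity(0b1011) == 1   # three 1 bits
assert parity(0b1111) == 0   # four 1 bits
assert all(parity(n) == bin(n).count("1") % 2 for n in range(1024))
```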