Feed john-d-cook John D. Cook

Favorite IconJohn D. Cook

Link https://www.johndcook.com/blog
Feed http://feeds.feedburner.com/TheEndeavour?format=xml
Updated 2025-09-11 03:16
A strange take on the harmonic series
It is well known that the harmonic series 1 + + + 1/4 + ... diverges. But if you take the denominators as numbers in base 11 or higher, the series converges [1]. I wonder what inspired this observation. Maybe Brewster was bored, teaching yet another cohort of students that the harmonic series [...]The post A strange take on the harmonic series first appeared on John D. Cook.
Variance matters more than mean in the extremes
Suppose you have two normal random variables, X and Y, and that the variance of X is less than the variance of Y. Let M be an equal mixture of X and Y. That is, to sample from M, you first chose X or Y with equal probability, then you choose a sample from the [...]The post Variance matters more than mean in the extremes first appeared on John D. Cook.
Increasing speed due to friction
Orbital mechanics is fascinating. I've learned a bit about it for fun, not for profit. I seriously doubt Elon Musk will ever call asking me to design an orbit for him. [1] One of the things that makes orbital mechanics interesting is that it can be counter-intuitive. For example, atmospheric friction can make a satellite [...]The post Increasing speed due to friction first appeared on John D. Cook.
Ptolemy’s theorem
Draw a quadrilateral by pick four arbitrary points on a circle and connecting them cyclically. Now multiply the lengths of the pairs of opposite sides. In the diagram below this means multiplying the lengths of the two horizontal-ish blue sides and the two vertical-ish orange sides. Ptolemy's theorem says that the sum of the two [...]The post Ptolemy's theorem first appeared on John D. Cook.
Rule for converting trig identities into hyperbolic identities
There is a simple rule of thumb for converting between (circular) trig identities and hyperbolic trig identities known as Osborn's rule: stick an h on the end of trig functions and flip signs wherever two sinh functions are multiplied together. Examples For example, the circular identity sin( + ) = sin() cos() + cos() sin() [...]The post Rule for converting trig identities into hyperbolic identities first appeared on John D. Cook.
Interpolation and the cotanc function
This weekend I wrote three posts related to interpolation: Compression and interpolation Bessel, Everett, and Lagrange interpolation Binomial coefficients with non-integer arguments The first post looks at reducing the size of mathematical tables by switching for linear to quadratic interpolation. The immediate application is obsolete, but the principles apply to contemporary problems. The second post [...]The post Interpolation and the cotanc function first appeared on John D. Cook.
Binomial coefficients with non-integer arguments
When n and r are positive integers integers, with n >= r, there is an intuitive interpretation of the binomial coefficient C(n, r), namely the number of ways to select r things from a set of n things. For this reason C(n, r) is usually pronounced n choose r." But what might something like C(4.3, [...]The post Binomial coefficients with non-integer arguments first appeared on John D. Cook.
Bessel, Everett, and Lagrange interpolation
I never heard of Bessel or Everett interpolation until long after college. I saw Lagrange interpolation several times. Why Lagrange and not Bessel or Everett? First of all, Bessel interpolation and Everett interpolation are not different kinds of interpolation; they are different algorithms for carrying out the same interpolation as Lagrange. There is a unique [...]The post Bessel, Everett, and Lagrange interpolation first appeared on John D. Cook.
Compression and interpolation
Data compression is everywhere. We're unaware of it when it is done well. We only become aware of it when it is pushed too far, such as when a photo looks grainy or fuzzy because it was compressed too much. The basic idea of data compression is to not transmit the raw data but to [...]The post Compression and interpolation first appeared on John D. Cook.
Chebyshev polynomials as distorted cosines
Forman Acton's book Numerical Methods that Work describes Chebyschev polynomials as cosine curves with a somewhat disturbed horizontal scale, but the vertical scale has not been touched. The relation between Chebyshev polynomials and cosines is Tn(cos ) = cos(n). Some sources take this as the definition of Chebyshev polynomials. Other sources define the polynomials differently [...]The post Chebyshev polynomials as distorted cosines first appeared on John D. Cook.
Math’s base 32 versus Linux’s base 32
The convention in math for writing numbers in bases larger than 10 is to insert capital letters after 9, starting with A. So, for example, the digits in base 12 are 0, 1, 2, ..., 9, A, and B. So if you're familiar with math but not Linux, and you run across the base32 utility, [...]The post Math's base 32 versus Linux's base 32 first appeared on John D. Cook.
Editing a file without an editor
I don't use sed very often, but it's very handy when I do use it, particularly when needing to make a small change to a large file. Fixing a JSON file Lately I've been trying to fix a 30 MB JSON file that has been corrupted somehow. The file is one very long line. Emacs [...]The post Editing a file without an editor first appeared on John D. Cook.
Interpolating the gamma function
Suppose you wanted to approximate (10.3). You know it's somewhere between (10) = 9! and (11) = 10!, and linear interpolation would give you (10.3) 0.7 * 9! + 0.3 * 10! = 1342656. But the exact value is closer to 716430.69, and so our estimate is 53% too high. Not a very good [...]The post Interpolating the gamma function first appeared on John D. Cook.
Too clever Monte Carlo
One way to find the volume of a sphere would be to imagine the sphere in a box, randomly select points in the box, and count how many of these points fall inside the sphere. In principle this would work in any dimension. The problem with naive Monte Carlo We could write a program to [...]The post Too clever Monte Carlo first appeared on John D. Cook.
Evaluating a class of infinite sums in closed form
The other day I ran across the surprising identity and wondered how many sums of this form can be evaluated in closed form like this. Quite a few it turns out. Sums of the form evaluate to a rational number when k is a non-negative integer and c is a rational number with |c| > [...]The post Evaluating a class of infinite sums in closed form first appeared on John D. Cook.
Sphere spilling out
Center a small blue sphere on every corner of ann-dimensional unit hypercube. These are the points in n for which every coordinate is either a 0 or a 1. Now inflate each of these small spheres at the same time until they touch. Each sphere will have radius 1/2. For example, the spheres centered at [...]The post Sphere spilling out first appeared on John D. Cook.
A variation on Rock, Paper, Scissors
Imagine in a game of Rock, Paper, Scissors one player is free to play as usual but the other is required to choose each option the same number of times. That is, in 3n rounds of the game, the disadvantaged player much choose Rock n times, Paper n times, and Scissors n times. Obviously the [...]The post A variation on Rock, Paper, Scissors first appeared on John D. Cook.
q-analog of rising powers
The previous post looked at the probability that a random n by n matrix over a finite field of order q is invertible. This works out to be This function of q and n comes up in other contexts as well and has a name that we will get to shortly. Pochhammer symbols Leo August [...]The post q-analog of rising powers first appeared on John D. Cook.
Solvability of linear systems over finite fields
If you haven equations in n unknowns over a finite field with q elements, how likely is it that the system of equations has a solution? The number of possible n * n matrices with entries from a field of size q is qn^2. The set of invertible n * n matrices over a field [...]The post Solvability of linear systems over finite fields first appeared on John D. Cook.
Why do medical tests always have error rates?
Most people implicitly assume medical tests are infallible. If they test positive for X, they assume they have X. Or if they test negative for X, they're confident they don't have X. Neither is necessarily true. Someone recently asked me why medical tests always have an error rate. It's a good question. A test is [...]The post Why do medical tests always have error rates? first appeared on John D. Cook.
Rényi’s parking constant
Imagine parallel parking is available along the shoulder of a road, but no parking spaces are marked. The first person to park picks a spot on the shoulder at random. Then another car also chooses a spot along the shoulder at random, with the constraint that the second car can't overlap the first. This process [...]The post Renyi's parking constant first appeared on John D. Cook.
Calculating when a planet will appear to move backwards
When we say that the planets in our solar system orbit the sun, not the earth, we mean that the motions of the planets is much simpler to describe from the vantage point of the sun. The sun is no more the center of the universe than the earth is. Describing the motion of the [...]The post Calculating when a planet will appear to move backwards first appeared on John D. Cook.
Do incremental improvements add, multiply, or something else?
Suppose you make an x% improvement followed by a y% improvement. Together do they make an (x + y)% improvement? Maybe. The business principle of kaizen, based on the Japanese for improvement, is based on the assumption that incremental improvements accumulate. But quantifying how improvements accumulate takes some care. Add or multiply? Two successive [...]The post Do incremental improvements add, multiply, or something else? first appeared on John D. Cook.
The Clausen function
I ran across the Clausen function the other day, and when I saw a plot of the function my first thought was that it looks sorta like a sawtooth wave. I wondered whether it also sounds like a sawtooth wave, and indeed it does. More on that shortly. The Clausen function can be defined in [...]The post The Clausen function first appeared on John D. Cook.
Limit of a doodle
Suppose you're in a boring meeting and you start doodling. You draw a circle, and then you draw a triangle on the outside of that circle. Next you draw a circle through the vertices of the triangle, and draw a square outside that. Then you draw a circle through the vertices of the square, and [...]The post Limit of a doodle first appeared on John D. Cook.
National Provider Identifier (NPI) and its checksum
Healthcare providers in the United States are required to have an ID number known as the NPI (National Provider Identifier). This is a 10-digit unique identifier which serves as the primary key in a publicly available database. You can use the NPI number to look up a provider's name, credentials, their practice location, etc. The [...]The post National Provider Identifier (NPI) and its checksum first appeared on John D. Cook.
Getting some (algorithmic) SAT-isfaction
How can you possibly solve a mission-critical problem with millions of variables-when the worst-case computational complexity of every known algorithm for that problem is exponential in the number of variables? SAT (Satisfiability) solvers have seen dramatic orders-of-magnitude performance gains for many problems through algorithmic improvements over the last couple of decades or so. The SAT [...]The post Getting some (algorithmic) SAT-isfaction first appeared on John D. Cook.
Computing Γ(z) for complex z with tables
In the previous post I mentioned that the general strategy for computing a mathematical function using tables is to first reduce the function argument to be within the range of the tabulated values, and then to use interpolation to compute the function at values that are not directly tabulated. The second step is always the [...]The post Computing (z) for complex z with tables first appeared on John D. Cook.
Calculating trig functions from tables
It takes some skill to use tables of mathematical functions; it's not quite as simple as it may seem. Although it's no longer necessary to use tables, it's interesting to look into the details of how it is done. For example, the Handbook of Mathematical Functions edited by Abramowitz and Stegun tabulates sines and cosines [...]The post Calculating trig functions from tables first appeared on John D. Cook.
Rapidly convergent series for ellipse perimeter
The previous post looked at two simple approximations for the perimeter of an ellipse. Approximations are necessary since the perimeter of an ellipse cannot be expressed as an elementary function of the axes. Kepler noted in 1609 that you could approximate the perimeter of an ellipse as the perimeter of a circle whose radius is [...]The post Rapidly convergent series for ellipse perimeter first appeared on John D. Cook.
Kepler’s ellipse perimeter approximations
In 1609, Kepler remarked that the perimeter of an ellipse with semiaxes a and b could be approximated either as P 2(ab) or P (a + b). In other words, you can approximate the perimeter of an ellipse by the circumference of a circle of radius r where r is either the geometric mean [...]The post Kepler's ellipse perimeter approximations first appeared on John D. Cook.
Power method and centrality
A few days ago I wrote about eigenvector centrality, a way of computing which nodes in a network are most important. Rather than simply looking for the most highly connected nodes, it looks for nodes that are highly connected to nodes that are highly connected. It's the idea behind Google's PageRank algorithm. Adjacency matrices One [...]The post Power method and centrality first appeared on John D. Cook.
The search for the perfect prompt
Anyone with more than casual experience with ChatGPT knows that prompt engineering is a thing. Minor or even trivial changes in a chatbot prompt can have significant effects, sometimes even dramatic ones, on the output [1]. For simple requests it may not make much difference, but for detailed requests it could matter a lot. Industry [...]The post The search for the perfect prompt first appeared on John D. Cook.
Eigenvector centrality
A basic question to ask about a network is which nodes are most important. A simple way of answering this question would be to say the most well-connected nodes, i.e. the nodes with the most edges. This approach is known as degree centrality. It's not a bad place to start. It's easy to understand and [...]The post Eigenvector centrality first appeared on John D. Cook.
Fitting a line without an intercept term
The other day I was looking at how many lumens LED lights put out per watt. I found some data on Wikipedia, and as you might expect the relation between watts and lumens is basically linear, though not quite. If you were to do a linear regression on the data you'd get a relation lumens [...]The post Fitting a line without an intercept term first appeared on John D. Cook.
Solutions to tan(x) = x
I read something recently that said in passing that the solutions to the equation tan x = x are the zeros of the Bessel function J3/2. That brought two questions to mind. First, where have I seen the equation tan x = x before? And second, why should its solutions be the roots of a [...]The post Solutions to tan(x) = x first appeared on John D. Cook.
Computing logarithms of complex numbers
The previous post showed how to compute logarithms using tables. It gives an example of calculating a logarithm to 15 figures precision using tables that only allow 4 figures of precision for inputs. Not only can you bootstrap tables to calculate logarithms of real numbers not given in the tables, you can also bootstrap a [...]The post Computing logarithms of complex numbers first appeared on John D. Cook.
Using a table of logarithms
My favorite quote from Richard Feynman is his remark that nearly everything is really interesting if you go into it deeply enough." This post will look at something that seems utterly trivial-looking up numbers in a table-and show that there's much more to it when you dig a little deeper. More than just looking up [...]The post Using a table of logarithms first appeared on John D. Cook.
Duct tape value creation
Excerpt from from John Carmack's review of the book Bullshit Jobs. He talks about how software developers bemoan duct taping systems together, and would rather work on core technologies. He thinks it is some tragic failure, that if only wise system design was employed, you wouldn't be doing all the duct taping. Wrong. Every expansion [...]The post Duct tape value creation first appeared on John D. Cook.
Condition on your data
Suppose you design an experiment, an A/B test of two page designs, randomizing visitors to Design A or Design B. You planned to run the test for 800 visitors and you calculated some confidence level for your experiment. You decide to take a peek at the data after only 300 randomizations, even though your [...]The post Condition on your data first appeared on John D. Cook.
Can you look at experimental results along the way or not?
Suppose you're running an A/B test to determine whether a web page produces more sales with one graphic versus another. You plan to randomly assign image A or B to 1,000 visitors to the page, but after only randomizing 500 visitors you want to look at the data. Is this OK or not? Of course [...]The post Can you look at experimental results along the way or not? first appeared on John D. Cook.
One-liner to troubleshoot LaTeX references
In LaTeX, sections are labeled with commands like \label{foo} and referenced like \ref{foo}. Referring to sections by labels rather than hard-coded numbers allows references to automatically update when sections are inserted, deleted, or rearranged. For every reference there ought to be a label. A label without a corresponding reference is fine, though it might be [...]The post One-liner to troubleshoot LaTeX references first appeared on John D. Cook.
A “well-known” series
I was reading an article [1] that refers to a well-known trigonometric series" that I'd never seen before. This paper cites [2] which gives the series as Note that the right hand side is not a series in but rather in sin . Motivation Why might you know sin and want to calculate [...]The post A well-known" series first appeared on John D. Cook.
Probability, cryptography, and naïveté
Probability and cryptography have this in common: really smart people can be confidently wrong about both. I wrote years ago about how striking it was to see two senior professors arguing over an undergraduate probability exercise. As I commented in that post, Professors might forget how to do a calculus problem, or make a mistake [...]The post Probability, cryptography, and naivete first appeared on John D. Cook.
Thinking by playing around
Richard Feynman's Nobel Prize winning discoveries in quantum electrodynamics were partly inspired by his randomly observing a spinning dinner plate one day in the cafeteria. Paul Feyerabend said regarding science discovery, The only principle that does not inhibit progress is: anything goes" (within relevant ethical constraints, of course). Ideas can come from anywhere, including physical [...]The post Thinking by playing around first appeared on John D. Cook.
Approximation by prime powers
The well-known Weierstrass approximation theorem says that polynomials are dense in C[0, 1]. That is, given any continuous function f on the unit interval, and any > 0, you can find a polynomial P such that f and P are never more than apart. This means that linear combinations of the polynomials 1, [...]The post Approximation by prime powers first appeared on John D. Cook.
Logarithm approximation curiosity
I've written before about three simple approximations for logarithms, for base 10 log10(x) (x - 1)/(x + 1) base e, loge(x) 2(x - 1)/(x + 1) and base 2 log2(x) 3(x - 1)/(x + 1). These can be used to mentally approximate logarithms to moderate accuracy, accurate enough for quick estimates. Here's [...]The post Logarithm approximation curiosity first appeared on John D. Cook.
Iterated Mersenne primes
A Mersenne number is a number of the form 2k - 1. A Mersenne prime is a Mersenne number which is also a prime. It turns out that if 2k - 1 is prime then k must be prime, so Mersenne numbers have the form 2p - 1 is prime. What about the converse? If [...]The post Iterated Mersenne primes first appeared on John D. Cook.
Small probabilities add, big ones don’t
A video has been making the rounds in which a well-known professor [1] says that if something has a 20% probability of happening in one attempt, then it has a 40% chance of happening in two attempts, a 60% chance in happening in three attempts, etc. This is wrong, but it's a common mistake. And [...]The post Small probabilities add, big ones don't first appeared on John D. Cook.
Logistic regression quick takes
This post is a series of quick thoughts related to logistic regression. It starts with this article on moving between logit and probability scales. *** Logistic regression models the probability of a yes/no event occurring. It gives you more information than a model that simply tries to classify yeses and nos. I advised a client [...]The post Logistic regression quick takes first appeared on John D. Cook.
...45678910111213...