Feed john-d-cook John D. Cook

Link https://www.johndcook.com/blog
Feed http://feeds.feedburner.com/TheEndeavour?format=xml
Updated 2025-09-12 00:01
Fixed points of logistic function
Here’s an interesting problem that came out of a logistic regression application. The input variable was between 0 and 1, and someone asked when and where the logistic transformation f(x) = 1/(1 + exp(a + bx)) has a fixed point, i.e. f(x) = x. So given logistic regression parameters a and b, when does the logistic curve given by y […]
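A minimal sketch of the "where" part of that question, not taken from the post itself: because 0 < f(x) < 1 for every x, the difference f(x) - x is positive at x = 0 and negative at x = 1, so bisection on [0, 1] locates a fixed point. The function names and the sample parameters a = 1, b = 2 are illustrative assumptions.

```python
import math

def logistic(x, a, b):
    """f(x) = 1/(1 + exp(a + b*x)), as written in the excerpt."""
    return 1.0 / (1.0 + math.exp(a + b * x))

def fixed_point(a, b, tol=1e-12):
    """Bisection on g(x) = f(x) - x over [0, 1] (hypothetical helper)."""
    lo, hi = 0.0, 1.0  # g(0) > 0 and g(1) < 0 because 0 < f < 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if logistic(mid, a, b) - mid > 0:
            lo = mid  # sign change lies to the right of mid
        else:
            hi = mid  # sign change lies to the left of mid
    return 0.5 * (lo + hi)

x = fixed_point(a=1.0, b=2.0)  # sample parameters, chosen for illustration
print(x, logistic(x, 1.0, 2.0))
```

Bisection only finds one sign change; it does not by itself settle how many fixed points there are for a given a and b.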
Line art
A new video from 3Blue1Brown is about visualizing derivatives as stretching and shrinking factors. Along the way they consider the function f(x) = 1 + 1/x. Iterations of f converge on the golden ratio, no matter where you start (with one exception). The video creates a graph where they connect values of x on one […]
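The convergence claim is easy to try numerically. The sketch below is an illustration rather than anything from the video or the post; the starting values and the iteration count are arbitrary choices, and the code does not guard against a starting point that lands exactly on zero.

```python
def iterate(x, n=60):
    """Apply f(x) = 1 + 1/x a total of n times."""
    for _ in range(n):
        x = 1 + 1 / x
    return x

phi = (1 + 5 ** 0.5) / 2  # golden ratio

for start in (1.0, -3.0, 100.0):  # arbitrary starting points
    print(start, iterate(start), phi)
```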
Causal inference and cryptic syntax
I just made one of those O’Reilly parody book covers. It’s a joke on Judea Pearl, expert in causal inference, and the Perl programming language, known for its unusual, terse syntax.
Making a career out of the chain rule
When I was a teenager, my uncle gave me a calculus book and told me that mastering calculus was the most important thing I could do for starting out in math. So I learned the basics of calculus from that book. Later I read Michael Spivak’s two calculus books. I took courses that built on […]
Robustness and tests for equal variance
The two-sample t-test is a way to test whether two data sets come from distributions with the same mean. I wrote a few days ago about how the test performs under ideal circumstances, as well as less than ideal circumstances. This is an analogous post for testing whether two data sets come from distributions with the same […]
Ellipsoid geometry and Haumea
To first approximation, Earth is a sphere. A more accurate description is that the earth is an oblate spheroid, the polar axis being a little shorter than the equatorial diameter. See details here. Other planets are oblate spheroids as well. Jupiter is more oblate, i.e. further from spherical, than the earth. The general equation […]
Two-sample t-test and robustness
A two-sample t-test is intended to determine whether there’s evidence that two samples have come from distributions with different means. The test assumes that both samples come from normal distributions. Robust to non-normality, not to asymmetry It is fairly well known that the t-test is robust to departures from a normal distribution, as long as the actual […]
Spectral sparsification
The latest episode of My Favorite Theorem features John Urschel, former offensive lineman for the Baltimore Ravens and current math graduate student. His favorite theorem is a result on graph approximation: for every weighted graph, no matter how densely connected, it is possible to find a sparse graph whose Laplacian approximates that of the original […]
Reciprocals of primes
Here’s an interesting little tidbit: For any prime p except 2 and 5, the decimal expansion of 1/p repeats with a period that divides p-1. The period could be as large as p-1, but no larger. If it’s less than p-1, then it’s a divisor of p-1. Here are a few examples. 1/3 = 0.33… […]
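A small check of the divisibility claim, offered as a sketch: the period of 1/p is the multiplicative order of 10 modulo p, which the hypothetical helper `decimal_period` computes by repeated multiplication. It assumes p is coprime to 10 and would loop forever for p = 2 or p = 5.

```python
def decimal_period(p):
    """Length of the repeating block of 1/p: the order of 10 modulo p."""
    k, r = 1, 10 % p
    while r != 1:
        r = (r * 10) % p
        k += 1
    return k

for p in (3, 7, 11, 13, 17, 37):
    k = decimal_period(p)
    # Third column reports whether the period divides p - 1.
    print(p, k, (p - 1) % k == 0)
```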
Rise and fall of the Windows Empire
This morning I ran across the following graph via Horace Dediu. I developed Windows software during the fattest part of the Windows curve. That was a great time to be in the Windows ecosystem. Before that I was in an academic bubble. My world consisted primarily of Macs and various flavors of Unix. I had […]
Robust statistics
P. J. Huber gives three desiderata for a statistical method in his book Robust Statistics: It should have a reasonably good (optimal or nearly optimal) efficiency at the assumed model. It should be robust in the sense that small deviations from the model assumptions should impair the performance only slightly. Somewhat larger deviations from the […]
Optimal low-rank matrix approximation
Matrix compression Suppose you have an m by n matrix A, where m and n are very large, that you’d like to compress. That is, you’d like to come up with an approximation of A that takes less data to describe. For example, consider a high resolution photo as a matrix of gray scale values. An approximation to the matrix […]
Least squares solutions to over- or underdetermined systems
It often happens in applications that a linear system of equations Ax = b either does not have a solution or has infinitely many solutions. Applications often use least squares to create a problem that has a unique solution. Overdetermined systems Suppose the matrix A has dimensions m by n and the right hand side vector b has dimension m. Then the […]
Computing SVD and pseudoinverse
In a nutshell, given the singular value decomposition of a matrix A, A = UΣV*, the Moore-Penrose pseudoinverse is given by A⁺ = VΣ⁺U*. This post will explain what the terms above mean, and how to compute them in Python and in Mathematica. Singular Value Decomposition (SVD) The singular value decomposition of a matrix is a sort of change of coordinates that makes […]
Probit regression
The previous post looked at how probability predictions from a logistic regression model vary as a function of the fitted parameters. This post goes through the same exercise for probit regression and compares the two kinds of nonlinear regression. Generalized linear models and link functions Logistic and probit regression are minor variations on a theme. […]
Sensitivity of logistic regression prediction on coefficients
The output of a logistic regression model is a function that predicts the probability of an event as a function of the input parameter. This post will only look at a simple logistic regression model with one predictor, but similar analysis applies to multiple regression with several predictors. Here’s a plot of such a curve […]
Tridiagonal systems, determinants, and natural cubic splines
Tridiagonal matrices A tridiagonal matrix is a matrix that has nonzero entries only on the main diagonal and on the adjacent off-diagonals. This special structure comes up frequently in applications. For example, the finite difference numerical solution to the heat equation leads to a tridiagonal system. Another application, the one we’ll look at in detail […]
Probability of coprime sets
The latest blog post from Gödel’s Lost Letter and P=NP looks at the problem of finding relatively prime pairs of large numbers. In particular, they want a deterministic algorithm. They mention in passing that the probability of a set of k large integers being relatively prime (coprime) is 1/ζ(k) where ζ is the Riemann zeta function. This […]
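For k = 2 the stated probability is 1/ζ(2) = 6/π². The sketch below is only a finite illustration of that limit: it counts coprime ordered pairs in a box of side n, where n = 200 is an arbitrary choice and `coprime_fraction` is a hypothetical helper.

```python
from math import gcd, pi

def coprime_fraction(n):
    """Fraction of ordered pairs (i, j), 1 <= i, j <= n, with gcd(i, j) == 1."""
    hits = sum(1 for i in range(1, n + 1)
                 for j in range(1, n + 1) if gcd(i, j) == 1)
    return hits / (n * n)

# Compare the finite count against the limiting value 1/zeta(2) = 6/pi^2.
print(coprime_fraction(200), 6 / pi ** 2)
```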
The quadratic formula and low-precision arithmetic
What could be interesting about the lowly quadratic formula? It’s a formula after all. You just stick numbers into it. Well, there’s an interesting wrinkle. When the linear coefficient b is large relative to the other coefficients, the quadratic formula can give wrong results when implemented in floating point arithmetic. Quadratic formula and loss of precision The […]
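To make the wrinkle concrete, here is a sketch in ordinary double precision (an assumption; the post's title refers to low-precision arithmetic). With b much larger than a and c, the textbook formula subtracts two nearly equal numbers to get the small root. A common workaround, shown as `roots_stable`, computes the large root first and recovers the small one from the product of the roots. Both functions assume real roots, and the coefficients are illustrative.

```python
import math

def roots_naive(a, b, c):
    """Textbook quadratic formula; assumes a real discriminant."""
    d = math.sqrt(b * b - 4 * a * c)
    return (-b + d) / (2 * a), (-b - d) / (2 * a)

def roots_stable(a, b, c):
    """Avoid cancellation: add b and d with matching signs, then use x1*x2 = c/a."""
    d = math.sqrt(b * b - 4 * a * c)
    q = -0.5 * (b + math.copysign(d, b))
    return q / a, c / q  # (large-magnitude root, small-magnitude root)

a, b, c = 1.0, 1e8, 1.0  # sample coefficients with b dominant
print(roots_naive(a, b, c))
print(roots_stable(a, b, c))
```

The small root should be close to -1e-8; comparing the two printed results shows how far the naive version drifts.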
Off by one character
There was a discussion on Twitter today about a mistake calculus students make: I pointed out that it’s only off by one character: The first equation is simply wrong. The second is correct, but a gross violation of convention, using x as a constant and e as a variable.
New Twitter account: BasicStatistics
I’ve started a new Twitter account: @BasicStatistics. The new account is for people who are curious about statistics. It’s meant to be accessible to a wider audience than @DataSciFact. More Twitter accounts here.
Review of Matrix Mathematics
Bernstein’s Matrix Mathematics is impressive. It’s over 1500 pages and weighs 5.3 pounds (2.4 kg). It’s a reference book, not the kind of book you just sit down to read. (Actually, I have sat down to read parts of it.) I’d used a library copy of the first edition, and so when Princeton University Press […]
Moore-Penrose pseudoinverse is not an adjoint
The Moore-Penrose pseudoinverse of a matrix is a way of coming up with something like an inverse for a matrix that doesn’t have an inverse. If a matrix does have an inverse, then the pseudoinverse is in fact the inverse. The Moore-Penrose pseudoinverse is also called a generalized inverse for this reason: it’s not just […]
It’s like this other thing except …
One of my complaints about math writing is that definitions are hardly ever subtractive, even if that’s how people think of them. For example, a monoid is a group except without inverses. But that’s not how you’ll see it defined. Instead you’ll read that it’s a set with an associative binary operation and an identity […]
Obesity index: Measuring the fatness of probability distribution tails
A probability distribution is called “fat tailed” if its probability density goes to zero slowly. Slowly relative to what? That is often implicit and left up to context, but generally speaking the exponential distribution is the dividing line. Probability densities that decay faster than the exponential distribution are called “thin” or “light,” and densities that […]
Duffing equation for nonlinear oscillator
The Duffing equation is an ordinary differential equation describing a nonlinear damped driven oscillator. If the parameter μ were zero, this would be a damped driven linear oscillator. It’s the nonlinear x³ term that makes things nonlinear and interesting. Using an analog computer in 1961, Yoshisuke Ueda discovered that this system was chaotic. It was […]
Surface area of an egg
The first post in this series looked at a possible formula for the shape of an egg, how to fit the parameters of the formula, and the curvature of the shape at each end of the egg. The second post looked at the volume. This post looks at the surface area. If you rotate the […]
Volume of an egg
The previous post looked at an equation to fit the shape of an egg. In two dimensions we had In this post, we’ll rotate that curve around the x-axis to find the volume. Then we’ll see how it compares to that of an ellipsoid. If we rotate the graph of a function f(x) around the x-axis with x ranging […]
Equation to fit an egg
How would you fit an equation to the shape of an egg? This site suggests an equation of the form Note that if k = 0 we get an ellipse. The larger the parameter k is, the more asymmetric the shape is about the y-axis. Let’s try that out in Mathematica: ContourPlot[ x^2/16 + y^2 (1 + 0.1 […]
Viability of unpopular programming languages
I said something about Perl 6 the other day, and someone replied asking whether anyone actually uses Perl 6. My first thought was I bet more people use Perl 6 than Haskell, and it’s well known that people use Haskell. I looked at the TIOBE Index to see whether that’s true. I won’t argue how […]
Eight-bit floating point
Researchers have discovered that for some problems, deep neural networks (DNNs) can get by with low precision weights. Using fewer bits to represent weights means that more weights can fit in memory at once. This, as well as embedded systems, has renewed interest in low-precision floating point. Microsoft mentioned its proprietary floating point formats ms-fp8 and […]
Comparing range and precision of IEEE and posit
The IEEE standard 754-2008 defines several sizes of floating point numbers—half precision (binary16), single precision (binary32), double precision (binary64), quadruple precision (binary128), etc.—each with its own specification. Posit numbers, on the other hand, can be defined for any number of bits. However, the IEEE specifications share common patterns so that you could consistently define theoretical […]
Categorical Data Analysis
Categorical data analysis could mean a couple of different things. One is analyzing data that falls into unordered categories (e.g. red, green, and blue) rather than numerical values (e.g. height in centimeters). Another is using category theory to assist with the analysis of data. Here “category” means something more sophisticated than a list of items you […]
Anatomy of a posit number
This post will introduce posit numbers, explain the interpretation of their bits, and discuss their dynamic range and precision. Posit numbers are a new way to represent real numbers for computers, an alternative to the standard IEEE floating point formats. The primary advantage of posits is the ability to get more precision or dynamic range out […]
Up arrow and down arrow notation
I recently ran into a tweet saying that if ** denotes exponentiation then // should denote logarithm. With this notation, for example, if we say 3**4 == 81 we would also say 81 // 3 == 4. This runs counter to convention since // has come to be a comment marker or a notation for integer […]
Asymmetric surprise
Motivating example: planet spacing My previous post showed that planets are roughly evenly distributed on a log scale, not just in our solar system but also in extrasolar planetary systems. I hadn’t seen this before I stumbled on it by making some plots. I didn’t think it was an original discovery—I assume someone did this […]
Planets evenly spaced on log scale
The previous post was about Kepler’s observation that the planets were spaced out around the sun the same way that nested regular solids would be. Kepler only knew of six planets, which was very convenient because there are only five regular solids. In fact, Kepler thought there could only be six planets because there are only […]
Planets and Platonic solids
Johann Kepler discovered in 1596 that the ratios of the orbits of the six planets known in his day were the same as the ratios between nested Platonic solids. Kepler was understandably quite impressed with this discovery and called it the Mysterium Cosmographicum. I heard of this in a course in the history of astronomy […]
Hypothesis testing vs estimation
I was looking at my daughter’s statistics homework recently, and there were a pair of questions about testing the level of lead in drinking water. One question concerned testing whether the water was safe, and the other concerned testing whether the water was unsafe. There’s something bizarre, even embarrassing, about this. You want to do […]
Curvature and automatic differentiation
Curvature is tedious to calculate by hand because it involves calculating first and second order derivatives. Of course other applications require derivatives too, but curvature is the example we’ll look at in this post. Computing derivatives It would be nice to write programs that only explicitly implement the original function and let software take care […]
Generalized normal distribution and kurtosis
The generalized normal distribution adds an extra parameter β to the normal (Gaussian) distribution. The probability density function for the generalized normal distribution is Here the location parameter μ is the mean, but the scaling factor σ is not the standard deviation unless β = 2. For small values of the shape parameter β, the […]
Gravitational attraction of stars and cows
One attempt at rationalizing astrology is to say that the gravitational effects of celestial bodies impact our bodies. To get an idea how hard the stars and planets pull on us, let’s compare their gravitational attraction to that of cows at various distances. Newton’s law of gravity says that the gravitational attraction between two objects […]
Sums of palindromes
Every positive integer can be written as the sum of three palindromes, numbers that remain the same when their digits are reversed. For example, 389 = 11 + 55 + 323. This holds not just for base 10 but for any base b ≥ 5. The result and algorithms for finding the palindromes were published […]
Squared digit sum
Take any positive integer n and sum the squares of its digits. If you repeat this operation, eventually you’ll either end at 1 or cycle between the eight values 4, 16, 37, 58, 89, 145, 42, and 20. For example, pick n = 389. Then 3² + 8² + 9² = 9 + 64 + […]
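The claim can be checked by brute force over a range of starting values. This is a sketch, with the range 1 to 9999 an arbitrary choice; the loop in `fate` relies on the stated result to terminate.

```python
CYCLE = {4, 16, 37, 58, 89, 145, 42, 20}  # the eight values from the excerpt

def step(n):
    """Sum of the squares of the decimal digits of n."""
    return sum(int(d) ** 2 for d in str(n))

def fate(n):
    """Iterate until reaching 1 or a member of the eight-value cycle."""
    while n != 1 and n not in CYCLE:
        n = step(n)
    return n

print(step(389), fate(389))
print(all(fate(n) == 1 or fate(n) in CYCLE for n in range(1, 10000)))
```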
Asymptotic solution to ODE
Our adventure starts with the following ordinary differential equation: Analytic solution We can solve this equation in closed-form, depending on your definition of closed-form, by multiplying by an integrating factor. The left side factors to and so The indefinite integral above cannot be evaluated in elementary terms, though it can be evaluated in terms of […]
Approximating gamma ratios
Ratios of gamma functions come up often in applications. If the two gamma function arguments differ by an integer, then it’s easy to calculate their ratio exactly by using (repeatedly if necessary) the fact that Γ(x + 1) = x Γ(x). If the arguments differ by 1/2, there is no closed formula, but there […]
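For the integer case, applying Γ(x + 1) = x Γ(x) repeatedly gives Γ(x + n)/Γ(x) = x(x + 1)⋯(x + n − 1). A minimal sketch, with `gamma_ratio` a hypothetical helper name and the sample arguments chosen for illustration:

```python
import math

def gamma_ratio(x, n):
    """Gamma(x + n) / Gamma(x) for a nonnegative integer n, via the recurrence."""
    result = 1.0
    for k in range(n):
        result *= x + k
    return result

x, n = 2.5, 4
# The product form and the direct ratio of gamma values should agree.
print(gamma_ratio(x, n), math.gamma(x + n) / math.gamma(x))
```

The product form avoids evaluating two large gamma values separately, which can overflow even when their ratio is moderate.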
Generating Laplace random variables
Differential privacy adds Laplace-distributed random noise to data to protect individual privacy. (More on that here.) Although it’s simple to generate Laplacian random values, the Laplace distribution is not always one of the built-in options for random number generation libraries. The Laplace distribution with scale β has density The Laplace distribution is also called the double […]
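One way to generate such values when a library lacks a Laplace generator is to invert the CDF. This is a sketch and not necessarily the method the post uses. It assumes the standard density exp(−|x − μ|/β)/(2β) with location μ and scale β; the function names are hypothetical.

```python
import math
import random

def laplace_from_uniform(u, mu=0.0, beta=1.0):
    """Inverse CDF of the Laplace(mu, beta) distribution for 0 < u < 1."""
    if u < 0.5:
        return mu + beta * math.log(2 * u)
    return mu - beta * math.log(2 * (1 - u))

def laplace_sample(mu=0.0, beta=1.0, rng=random):
    """Draw one Laplace value from a uniform random number."""
    u = rng.random()
    while u == 0.0:  # random() can return 0.0, where log is undefined
        u = rng.random()
    return laplace_from_uniform(u, mu, beta)

print([round(laplace_sample(beta=2.0), 3) for _ in range(5)])
```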
Could you read on Pluto?
I heard somewhere that Pluto receives more sunlight than you might think, enough to read by, and that sunlight on Pluto is much brighter than moonlight on Earth. I forget where I heard that, but I’ve done a back-of-the-envelope calculation to confirm that it’s true. Pluto is about 40 AU from the sun, i.e. forty times […]
Minimizing relative error
Suppose you know a number is between 30 and 42. You want to guess the number while minimizing how wrong you could be in the worst case. Then you’d guess the midpoint of the two ends, which gives you 36. But suppose you want to minimize the worst case relative error? If you chose 36, […]
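A sketch of the comparison, assuming relative error means |guess − true| / true (the excerpt is cut off before it says). Under that assumption the worst case occurs at an endpoint of the interval, and balancing the two endpoint errors gives the harmonic mean 2ab/(a + b) as the guess. The function names are hypothetical.

```python
def worst_relative_error(guess, lo, hi):
    """Largest |guess - x| / x over x in [lo, hi], attained at an endpoint."""
    return max(abs(guess - lo) / lo, abs(guess - hi) / hi)

def minimax_relative_guess(lo, hi):
    """Guess that equalizes the relative error at both endpoints."""
    return 2 * lo * hi / (lo + hi)

lo, hi = 30, 42
g = minimax_relative_guess(lo, hi)
print("midpoint:", 36, worst_relative_error(36, lo, hi))
print("balanced:", g, worst_relative_error(g, lo, hi))
```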
Bits of information in age, birthday, and birthdate
The previous post looked at how much information is contained in zip codes. This post will look at how much information is contained in someone’s age, birthday, and birth date. Combining zip code with birthdate will demonstrate the plausibility of Latanya Sweeney’s famous result [1] that 87% of the US population can be identified based […]