Feed john-d-cook John D. Cook

Link https://www.johndcook.com/blog
Feed http://feeds.feedburner.com/TheEndeavour?format=xml
Updated 2025-09-12 03:31
Interim analysis, futility monitoring, and predictive probability
An interim analysis of a clinical trial is an unusual analysis. At the end of the trial you want to estimate how well some treatment X works. For example, you want to know how likely it is that treatment X works better than the control treatment Y. But in the middle of the trial you want to know something more subtle. It’s […]
Periods of fractions
Suppose you have a fraction a/b where 0 < a < b, and a and b are relatively prime integers. The decimal expansion of a/b either terminates or it has an initial non-repeating part followed by a repeating part. How long is the non-repeating part? How long is the period of the repeating part? The answer depends on the prime factorization […]
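The excerpt stops before giving the answer, so the following is only a minimal sketch based on the standard result, not necessarily the post's own presentation. It assumes base 10: the non-repeating part has length max(i, j), where 2^i and 5^j are the powers of 2 and 5 dividing b, and the period is the multiplicative order of 10 modulo what remains of b. The function name `decimal_structure` is hypothetical.

```python
from math import gcd


def decimal_structure(a: int, b: int) -> tuple[int, int]:
    """Return (length of non-repeating part, period length) of a/b in base 10."""
    b //= gcd(a, b)  # reduce to lowest terms; only the denominator matters

    # Strip the factors of 2 and 5, counting how many of each there are.
    twos = fives = 0
    while b % 2 == 0:
        b //= 2
        twos += 1
    while b % 5 == 0:
        b //= 5
        fives += 1
    prefix = max(twos, fives)

    if b == 1:
        return prefix, 0  # the expansion terminates

    # The period is the smallest k with 10**k congruent to 1 modulo b.
    period, r = 1, 10 % b
    while r != 1:
        r = (r * 10) % b
        period += 1
    return prefix, period


print(decimal_structure(1, 7))   # 1/7 = 0.142857 142857 ...
print(decimal_structure(1, 12))  # 1/12 = 0.08 333 ...
```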
Speeding up R code
People often come to me with R code that’s running slower than they’d like. It’s not unusual to make the code 10 or even 100 times faster by rewriting it in C++. Not all that speed improvement comes from changing languages. Some of it comes from better algorithms, eliminating redundancy, etc. Why bother optimizing? If […]
The big deal about neural networks
In their book Computer Age Statistical Inference, Brad Efron and Trevor Hastie give a nice description of neural networks and deep learning. The knee-jerk response [to neural networks] from statisticians was “What’s the big deal? A neural network is just a nonlinear model, not too different from many other generalizations of linear models.” While this […]
Gentle introduction to R
The R language is closely tied to statistics. Its ancestor was named S, because it was a language for Statistics. The open source descendant could have been named ‘T’, but its creators chose to call it ‘R.’ Most people learn R as they learn statistics: Here’s a statistical concept, and here’s how you can compute it in R. […]
Turning math inside-out
Here’s one of the things about category theory that takes a while to get used to. Mathematical objects are usually defined internally. For example, the Cartesian product P of two sets A and B is defined to be the set of all ordered pairs (a, b) where a comes from A and b comes from B. The definition of P depends on the elements […]
Optimal team size
Kevlin Henney’s keynote at GOTO Copenhagen this year discussed how project time varies as a function of the number of people on the project. The most naive assumption is that the time is inversely proportional to the number of people. That is t = W/n where t is the calendar time to completion, W is a measure […]
Efficiency of C# on Linux
This week I attended Mads Torgersen’s talk Why you should take another look at C#. Afterward I asked him about the efficiency of C# on Linux. When I last looked into it, it wasn’t good. A few years ago I asked someone on my team to try running some C# software on Linux using Mono. The code worked […]
GOTO Copenhagen
I gave a talk this morning at GOTO Copenhagen 2016 on ways to mix R with other programming languages: Rcpp, HaskellR, R Markdown, etc. It’s been fun to see some people I haven’t seen since I spoke at the GOTO and YOW conferences four years ago. Photo above by conference photographer Fritz Schumann.
Mathematical modeling for medical devices
We’re about to see a lot of new, powerful, inexpensive medical devices come out. And to my surprise, I’ve contributed to a few of them. Growing compute power and shrinking sensors open up possibilities we’re only beginning to explore. Even when the things we want to observe elude direct measurement, we may be able to infer them from […]
Publishable
For an article to be published, it has to be published somewhere. Each journal has a responsibility to select articles relevant to its readership. Articles that make new connections might be unpublishable because they don’t fit into a category. For example, I’ve seen papers rejected by theoretical journals for being too applied, and the same papers […]
One of my favorite proofs: Lagrange multipliers
One of my lightbulb moments in college was when my professor, Jim Vick, explained the Lagrange multiplier theorem. The way I’d seen it stated in a calculus text gave me no feel for why it should be true, but his explanation made sense immediately. Suppose f(x) is a function of several variables, i.e. x is a vector, and g(x) = c […]
Uncertainty in a probability
Suppose you did a pilot study with 10 subjects and found a treatment was effective in 7 out of the 10 subjects. With no more information than this, what would you estimate the probability to be that the treatment is effective in the next subject? Easy: 0.7. Now what would you estimate the probability to be […]
New Twitter account: FormalFact
I’m starting a new Twitter account for logic and formal methods: @FormalFact. Expect to see tweets about constructive logic, type theory, formal proofs, proof assistants, etc. The image for the account is a bowtie, a pun on formality. It’s also the symbol for natural join in relational algebra.
Münchausen numbers
The number 3435 has the following curious property: 3435 = 3^3 + 4^4 + 3^3 + 5^5. It is called a Münchausen number, an allusion to the fictional Baron Münchausen. When each digit is raised to its own power and summed, you get the original number back. The only other Münchausen number is 1. At least in […]
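The property is easy to check by brute force. The sketch below is an illustration rather than anything taken from the post; the helper name `is_munchausen` and the search bound of 10,000 are assumptions, and it relies on Python evaluating 0**0 as 1, which affects how digits equal to zero are counted.

```python
def is_munchausen(n: int) -> bool:
    """True if n equals the sum of its decimal digits each raised to itself."""
    # Python evaluates 0**0 as 1; that convention is assumed here.
    return n == sum(int(d) ** int(d) for d in str(n))


# Search a small range (the bound is arbitrary, chosen for illustration).
found = [n for n in range(1, 10_000) if is_munchausen(n)]
print(found)
```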
Beta reduction: The difference typing makes
Beta reduction is essentially function application. If you have a function described by what it does to x and apply it to an argument t, you rewrite the xs as ts. The formal definition of β-reduction is more complicated than this in order to account for free versus bound variables, but this informal description is sufficient […]
Less likely to get half, more likely to get near half
I was catching up on Engines of our Ingenuity episodes this evening when the following line jumped out at me: If I flip a coin a million times, I’m virtually certain to get 50 percent heads and 50 percent tails. Depending on how you understand that line, it’s either imprecise or false. The more times you […]
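The claim in the title can be checked numerically with exact binomial counts. This is a sketch under stated assumptions: a fair coin, and "near half" taken to mean within 5 percentage points of 50% heads, which is a window chosen here for illustration and not one given in the excerpt. The function names are hypothetical.

```python
from math import comb


def prob_exactly_half(n: int) -> float:
    """Probability of exactly n/2 heads in n fair flips (n even)."""
    return comb(n, n // 2) / 2**n


def prob_near_half(n: int, pct: int = 5) -> float:
    """Probability that the share of heads is within pct points of 50%."""
    # |k/n - 1/2| <= pct/100 is equivalent to 50*|2k - n| <= pct*n,
    # which keeps the comparison in exact integer arithmetic.
    favorable = sum(comb(n, k) for k in range(n + 1)
                    if 50 * abs(2 * k - n) <= pct * n)
    return favorable / 2**n


for n in (10, 100, 1000):
    print(n, prob_exactly_half(n), prob_near_half(n))
```

As n grows, the first probability shrinks while the second climbs toward 1.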
Insufficient statistics
Experience with the normal distribution makes people think all distributions have (useful) sufficient statistics [1]. If you have data from a normal distribution, then the sufficient statistics are the sample mean and sample variance. These statistics are “sufficient” in that the entire data set isn’t any more informative than those two statistics. They effectively condense […]
Reversing WYSIWYG
The other day I found myself saying that I preferred org-mode files to Jupyter notebooks because with org-mode, what you see is what you get. Then I realized I was using “what you see is what you get” (WYSIWYG) in exactly the opposite of the usual sense. Jupyter notebooks are WYSIWYG in the same sense […]
Floating point: between blissful ignorance and unnecessary fear
Most programmers are at one extreme or another when it comes to floating point arithmetic. Some are blissfully ignorant that anything can go wrong, while others believe that danger lurks around every corner when using floating point. The limitations of floating point arithmetic are something to be aware of, and ignoring these limitations can cause problems, like crashing […]
Proofs and programs
Here’s an interesting quote comparing writing proofs and writing programs: Building proofs and programs are very similar activities, but there is one important difference: when looking for a proof it is often enough to find one, however complex it is. On the other hand, not all programs satisfying a specification are alike: even if the […]
ETAOIN SHRDLU and all that
Statistics can be useful, even if its idealizations fall apart on close inspection. For example, take English letter frequencies. These frequencies are fairly well known. E is the most common letter, followed by T, then A, etc. The string of letters “ETAOIN SHRDLU” comes from the days of Linotype when letters were arranged in that order, […]
What is calculus?
When people ask me what calculus is, my usual answer is “the mathematics of change,” studying things that change continually. Algebra is essentially static, studying things frozen in time and space. Calculus studies things that move, shapes that bend, etc. Algebra deals with things that are exact and consequently can be fragile. Calculus deals with […]
Big Logic
As systems get larger and more complex, we need new tools to test whether these systems are correctly specified and implemented. These tools may not be new per se, but they may be applied with new urgency. Dimensional analysis is a well-established method of error detection. Simply checking that you’re not doing something like adding […]
Duality in spherical trigonometry
This evening I ran across an unexpected reference to spherical trigonometry: Thomas Hales’ lecture on lessons learned from the formal proof of the Kepler conjecture. He mentions at one point a lemma that was awkward to prove in its original form, but that became trivial when he looked at its spherical dual. The sides of […]
Primitive recursive functions and enumerable sets
The set of primitive recursive (PR) functions is the smallest set of functions of several integer arguments satisfying five axioms: Constant functions are PR. The function that picks the ith element of a list of n arguments is PR. The successor function S(n) = n+1 is PR. PR functions are closed under composition. PR functions are closed under primitive […]
Some ways linear algebra is different in infinite dimensions
There’s no notion of continuity in linear algebra per se. It’s not part of the definition of a vector space. But a finite dimensional vector space over the reals is isomorphic to a Euclidean space of the same dimension, and so we usually think of such spaces as Euclidean. (We’re only going to consider real vector spaces […]
Solar power and applied math
The applied math featured here tends to be fairly sophisticated, but there’s a lot you can do with the basics as we’ll see in the following interview with Trevor Dawson of Borrego Solar, a company specializing in grid-connected solar PV systems. JC: Can you say a little about yourself? TD: I’m Trevor Dawson, I’m 25, born in the […]
Area of a triangle and its projections
Let S be the area of triangle T in three-dimensional space. Let A, B, and C be the areas of the projections of T to the xy, yz, and xz planes respectively. Then S^2 = A^2 + B^2 + C^2. There’s an elegant proof of this theorem here using differential forms. Below I’ll sketch a less elegant but more elementary proof. You could prove the identity above by using the fact that the […]
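A quick numerical check of the identity, offered as a sketch and not as the proof the post goes on to give. It assumes the triangle is specified by three vertices and uses the fact that the area is half the norm of the cross product of two edge vectors, so that each projected area is half the absolute value of one component. The function names and the sample vertices are illustrative.

```python
from math import isclose, sqrt


def cross(u, v):
    """Cross product of two 3-vectors given as sequences."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def triangle_areas(p, q, r):
    """Return (S, A, B, C): the area and its xy, yz, xz projected areas."""
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    cx, cy, cz = cross(u, v)
    S = 0.5 * sqrt(cx * cx + cy * cy + cz * cz)
    A = 0.5 * abs(cz)  # projection to the xy plane drops z
    B = 0.5 * abs(cx)  # projection to the yz plane drops x
    C = 0.5 * abs(cy)  # projection to the xz plane drops y
    return S, A, B, C


S, A, B, C = triangle_areas((1, 0, 0), (0, 2, 0), (0, 0, 3))
print(S**2, A**2 + B**2 + C**2)
```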
Amistics
Neal Stephenson coins a useful word Amistics in his novel Seveneves: … it was a question of Amistics, which was a term that had been coined ages ago by a Moiran anthropologist to talk about the choices that different cultures made as to which technologies they would, and would not, make part of their lives. […]
How many ways can you tile a chessboard with dominoes?
Suppose you have an n by m chessboard. How many ways can you cover the chessboard with dominoes? It turns out there’s a remarkable closed-form solution: Here are some questions you may have. But what if n and m are both odd? You can’t tile such a board with dominoes. Yes, in that case the formula evaluates to […]
Acoustic roughness examples
Amplitude modulated signals sound rough to the human ear. The perceived roughness increases with modulation frequency, then decreases, and eventually disappears. The point where roughness reaches its maximum depends on the carrier signal, but for a 1 kHz tone roughness reaches a maximum for modulation at 70 Hz. Roughness also increases as a function […]
Tensors 5: Scalars
There are two uses of the word scalar, one from linear algebra and another from tensor calculus. In linear algebra, vector spaces have a field of scalars. This is where the coefficients in linear combinations come from. For real vector spaces, the scalars are real numbers. For complex vector spaces, the scalars are complex numbers. […]
Tensors 4: Behavior under change of coordinates
In the first post in this series I mentioned several apparently unrelated things that are all called tensors, one of these being objects that behave a certain way under changes of coordinates. That’s what we’ll look at this time. In the second post we said that a tensor is a multilinear functional. A k-tensor takes k vectors and […]
Tensors 3: Tensor products
In the previous post, we defined the tensor product of two tensors, but you’ll often see tensor products of spaces. How are these tensor products defined? Tensor product splines For example, you may have seen tensor product splines. Suppose you have a function over a rectangle that you’d like to approximate by patching together polynomials so that […]
Tensors 2: Multilinear operators
The simplest definition of a tensor is that it is a multilinear functional, i.e. a function that takes several vectors, returns a number, and is linear in each argument. Tensors over real vector spaces return real numbers, tensors over complex vector spaces return complex numbers, and you could work over other fields if you’d like. A dot product is […]
Tensors 1: What is a tensor?
The word “tensor” is shrouded in mystery. The same term is applied to many different things that don’t appear to have much in common with each other. You might have heard that a tensor is a box of numbers. Just as a matrix is a rectangular collection of numbers, a tensor could be a cube of […]
From triangles to the heat equation
“Mathematics compares the most diverse phenomena and discovers the secret analogies that unite them.” — Joseph Fourier The above quote makes me think of a connection Fourier made between triangles and thermodynamics. Trigonometric functions were first studied because they relate angles in a right triangle to ratios of the lengths of the triangle’s sides. For […]
Contradictory news regarding ABC conjecture
“Research is what I’m doing when I don’t know what I’m doing.” — Wernher von Braun I find Shinichi Mochizuki’s proof of the abc conjecture fascinating. Not the content of the proof—which I do not understand in the least—but the circumstances of the proof. Most mathematics, no matter how arcane it appears to outsiders, is […]
Practical continuity
I had a discussion recently about whether things are really continuous in the real world. Strictly speaking, maybe not, but practically yes. The same is true of all mathematical properties. There are no circles in the real world, not in the Platonic sense of a mathematical circle. But a circle is a very useful abstraction, […]
Yet another way to define fractional derivatives
Fractional integrals are easier to define than fractional derivatives. And for sufficiently smooth functions, you can use the former to define the latter. The Riemann-Liouville fractional integral starts from the observation that for positive integer n, This motivates a definition of fractional integrals which is valid for any complex α with positive real part. Derivatives and integrals are […]
Quantifying how annoying a sound is
Eberhard Zwicker proposed a model for combining several psychoacoustic metrics into one metric to quantify annoyance. It is a function of three things: N5, the 95th percentile of loudness, measured in sone (which is confusingly called the 5th percentile) ωS, a function of sharpness in asper and of loudness ωFR, fluctuation strength (in vacil), roughness (in […]
Loudness units
I’ve posted an online calculator to convert between two commonly used units of loudness, phon and sone. The page describes the purpose of both units and explains how to convert between them.
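For orientation, here is a minimal sketch of the commonly used conversion. It is not the calculator's code. It assumes the standard rule that 40 phon corresponds to 1 sone and that loudness in sone doubles for every additional 10 phon, a relation usually applied only at about 40 phon and above. The function names are hypothetical.

```python
from math import log2


def phon_to_sone(phon: float) -> float:
    """Convert loudness level (phon) to loudness (sone); assumes phon >= 40."""
    return 2 ** ((phon - 40) / 10)


def sone_to_phon(sone: float) -> float:
    """Inverse conversion; assumes sone >= 1."""
    return 40 + 10 * log2(sone)


print(phon_to_sone(60))  # two doublings above 40 phon
print(sone_to_phon(2))
```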
Physical models
The most recent episode of 99% Invisible tells the story of the Corps of Engineers’ enormous physical model of the Mississippi basin, nearly half of the area of the continental US. Spanning over 200 acres, the model was built during WWII and was shut down in 1993. Here are some of my favorite lines from the show: […]
Another way to define fractional derivatives
There are many ways to define fractional derivatives, and in general they coincide on nice classes of functions. A long time ago I wrote about one way to define fractional derivatives using Fourier transforms. From that post: Here’s one way fractional derivatives could be defined. Suppose the Fourier transform of f(x) is g(ξ). Then for […]
Humble Lisp programmers
Maybe from the headline you were expecting a blank post? No, that’s not where I’m going. Yesterday I was on Amazon.com and noticed that nearly all the books they recommended for me were either about Lisp or mountain climbing. I thought this was odd, and mentioned it on Twitter. Carl Vogel had a witty reply: […]
Integral equation types
There are four basic types of integral equations. There are many other integral equations, but if you are familiar with these four, you have a good overview of the classical theory. All four involve the unknown function φ(x) in an integral with a kernel K(x, y) and all have an input function f(x). In all […]
What is a vacil?
Fluctuation strength is similar to roughness, though at much lower modulation frequencies. Fluctuation strength is measured in vacils (from vacilare in Latin or vacillate in English). Police sirens are a good example of sounds with high fluctuation strength. Fluctuation strength reaches its maximum at a modulation frequency of around 4 Hz. For much higher modulation frequencies, one […]
What is an asper?
Acoustic roughness is measured in aspers (from the Latin word for rough). An asper is the roughness of a 1 kHz tone, at 60 dB, 100% modulated at 70 Hz. That is, the signal (1 + sin(140πt)) sin(2000πt) where t is time in seconds. Here’s what that sounds like (if you play this at 60 […]
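The signal above can be sampled directly from its formula. This is a sketch only: the sample rate of 44,100 Hz and the one-second duration are assumptions, and nothing here calibrates playback to 60 dB, which depends on the audio hardware. The function name `asper_signal` is hypothetical.

```python
from math import pi, sin


def asper_signal(duration: float = 1.0, sample_rate: int = 44_100) -> list[float]:
    """Sample (1 + sin(140*pi*t)) * sin(2000*pi*t): a 1 kHz carrier,
    100% amplitude modulated at 70 Hz, with t in seconds."""
    n = int(duration * sample_rate)
    return [(1 + sin(140 * pi * i / sample_rate)) * sin(2000 * pi * i / sample_rate)
            for i in range(n)]


samples = asper_signal()
# The envelope 1 + sin(...) ranges over [0, 2], so scale the samples by 1/2
# before writing them to a file format that expects values in [-1, 1].
print(len(samples), max(abs(s) for s in samples))
```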
Mittag-Leffler function and probability distribution
The Mittag-Leffler function is a generalization of the exponential function. Since k! = Γ(k + 1), we can write the exponential function’s power series as exp(x) = Σ x^k / Γ(k + 1), summed over k ≥ 0, and we can generalize this to the Mittag-Leffler function E_{α,β}(x) = Σ x^k / Γ(αk + β), which reduces to the exponential function when α = β = 1. There are a few other values of α and β for […]
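The reduction to the exponential can be checked numerically. The sketch below assumes the standard two-parameter series, the sum over k ≥ 0 of x^k / Γ(αk + β), and simply truncates it, which is adequate only for moderate |x|. The function name, the default number of terms, and the cutoff that avoids overflow in `math.gamma` are illustrative choices.

```python
import math


def mittag_leffler(x: float, alpha: float, beta: float, terms: int = 60) -> float:
    """Truncated power series for the two-parameter Mittag-Leffler function."""
    total = 0.0
    for k in range(terms):
        arg = alpha * k + beta
        if arg > 170:  # math.gamma overflows a float shortly past this point
            break
        total += x**k / math.gamma(arg)
    return total


# With alpha = beta = 1 the series is the exponential series.
print(mittag_leffler(1.0, 1, 1), math.e)
# With alpha = 2, beta = 1 and argument -x**2 the series is that of cos(x).
print(mittag_leffler(-(0.5**2), 2, 1), math.cos(0.5))
```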