Feed john-d-cook John D. Cook

Link https://www.johndcook.com/blog
Feed http://feeds.feedburner.com/TheEndeavour?format=xml
Updated 2025-04-26 03:16
Visualizing search keyword overlap
The other day someone asked me to look at the search data for Debug Pest Control, a pest management company based in Rhode Island. One of the things I wanted to visualize was how the search terms overlapped with each other. To do this, I created a graph where the nodes are the keywords and edges join nodes that […]
Spectral coordinates in Python
A graph doesn’t have any geometric structure unless we add it. The vertices don’t come with any position in space. The same graph can look very different when arranged different ways. Spectral coordinates are a natural way to draw a graph because they are determined by the properties of the graph, not arbitrary aesthetic choices. Construct the Laplacian […]
Magic square made of dominoes
You can arrange a standard set of dominoes into a magic square of sorts. There are 28 dominoes, each with two ends, so the number of ends isn’t a perfect square. But if you ignore the row of blanks at the bottom, you have a 7 by 7 square where every row, column, and diagonal […]
Seven questions a statistician could answer for a lawyer
A statistician could help a lawyer answer the following questions. Was this data collected in a proper way? Does common sense apply here, or is there something subtle going on? What conclusions can we draw from the data? Is this analysis routine or is there something unusual about it? How much confidence can we place […]
Visualizing the DFT matrix
The discrete Fourier transform (DFT) of length N multiplies a vector by a matrix whose (j, k) entry is ω^{jk} where ω = exp(-2πi/N), with j and k running from 0 to N – 1. Each element of the matrix is a rotation, so if N = 12, we can represent each element by an hour on a clock. The angle […]
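As a rough illustration of the matrix this excerpt describes, the sketch below builds the DFT matrix from the stated definition and, for N = 12, records each entry's exponent jk mod 12 as a clock position. The function names and the convention that position 0 stands for 12 o'clock are assumptions made here, not details taken from the post.

```python
import cmath


def dft_matrix(n):
    """Return the n-by-n DFT matrix whose (j, k) entry is omega**(j*k)."""
    omega = cmath.exp(-2j * cmath.pi / n)
    return [[omega ** (j * k) for k in range(n)] for j in range(n)]


def clock_positions(n=12):
    """Return j*k mod n for each entry.

    Every entry is omega raised to an integer power, so it is fully
    described by that power modulo n. Reading 0 as 12 o'clock is an
    assumption of this sketch.
    """
    return [[(j * k) % n for k in range(n)] for j in range(n)]


if __name__ == "__main__":
    for row in clock_positions(12):
        print(" ".join(f"{p:2d}" for p in row))
```

Storing only the exponent avoids floating point entirely, which is why the clock picture works: multiplying two entries corresponds to adding their positions mod 12.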
Spectra of complete graphs, stars, and rings
A few examples help build intuition for what the eigenvalues of the graph Laplacian tell us about a graph. The smallest eigenvalue is always zero (see explanation in footnote here). For a complete graph on n vertices, all the eigenvalues except the first equal n. The eigenvalues of the Laplacian of a graph with n […]
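The claim about complete graphs can be checked by hand without an eigenvalue solver. The sketch below, using hypothetical helper names, builds the Laplacian of the complete graph on n vertices and verifies that the all-ones vector has eigenvalue 0 while n − 1 independent vectors of the form e_i − e_0 have eigenvalue n.

```python
def complete_graph_laplacian(n):
    """Laplacian D - A of the complete graph: n - 1 on the diagonal, -1 elsewhere."""
    return [[n - 1 if i == j else -1 for j in range(n)] for i in range(n)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def check_complete_graph_spectrum(n):
    """Return True if L has eigenvalue 0 once and eigenvalue n on n - 1 vectors."""
    lap = complete_graph_laplacian(n)
    # The all-ones vector is in the null space of every graph Laplacian.
    if matvec(lap, [1] * n) != [0] * n:
        return False
    # e_i - e_0 for i = 1..n-1 are linearly independent eigenvectors for n.
    for i in range(1, n):
        v = [0] * n
        v[0], v[i] = -1, 1
        if matvec(lap, v) != [n * x for x in v]:
            return False
    return True
```

Because the arithmetic is all in integers, the check is exact rather than approximate.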
Adding an edge increases eigenvalues
When you add an edge to a graph, each of the eigenvalues of the graph Laplacian either increases or stays the same. As I wrote in my previous post, the smallest eigenvalue is always 0. The second smallest eigenvalue is 0 if and only if the graph is disconnected. So no edge will increase the […]
Measuring connectivity with graph Laplacian eigenvalues
If a graph can be split into two components with no links between them, that’s probably easy to see. It’s also unlikely, unless there’s a good reason for it. The less obvious and more common case is a graph that can almost be split into two components. Spectral graph theory, looking at the eigenvalues of the graph Laplacian, […]
Big p, Little n
Statisticians use n to denote the number of subjects in a data set and p to denote nearly everything else. You’re supposed to know from context what each p means. In the phrase “big n, little p” the symbol p means the number of measurements per subject. Traditional data sets are “big n, little p” […]
Connecting on LinkedIn
I only connect to people on LinkedIn that I know. This almost always means people I have met face-to-face or at least talked to over the phone. If you’d like to connect on LinkedIn and we haven’t met, please contact me to set up a phone call. I look forward to talking to you.
Relating Fourier series and Fourier transforms
Fourier series and Fourier transforms may seem more different than they are because of the way they’re typically taught. Fourier series are presented more as a representation of a function, not a transformation. Here’s a function on an interval. We can write it as a sum of sines and cosines, just as we can write […]
An example of coming full circle
Here’s an interesting line from Brad Osgood: Isn’t it a little embarrassing that multibillion dollar industries seem to depend on integrals that don’t converge? In context, he’s not saying that huge companies are blithely using bad math. Some are, but that’s not what he’s getting at here. His discussion is an example of coming full […]
Finding 2016 in pi
2016 appears in π starting at the 7173rd decimal place. You can confirm this with Mathematica or Wolfram Alpha:

    Mod[ Floor[10^7177 pi] , 10000]

I found it using the following Python code:

    >>> from sympy import pi
    >>> digits = str(pi.evalf(10000))[2:]
    >>> digits.find('2016')
    7173

By the way, it’s also true that 2016 = 1 + 2 […]
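If SymPy is not available, the same search can be sketched with the standard library alone. The version below swaps in Machin's formula, π = 16 arctan(1/5) − 4 arctan(1/239), evaluated in integer arithmetic with a few guard digits. This is a different technique from the post's SymPy call, and the function names are made up for illustration. The post reports index 7173 for the first occurrence of 2016.

```python
def arctan_inv(x, unity):
    """Approximate arctan(1/x) * unity with the Taylor series in integers."""
    power = unity // x          # unity / x**(2j+1), starting at j = 0
    total = power
    x_squared = x * x
    divisor = 1
    sign = -1
    while power:
        power //= x_squared
        divisor += 2
        total += sign * (power // divisor)
        sign = -sign
    return total


def pi_digits(n, guard=10):
    """Return the first n decimal digits of pi after the decimal point."""
    unity = 10 ** (n + guard)
    pi_scaled = 4 * (4 * arctan_inv(5, unity) - arctan_inv(239, unity))
    # Drop the guard digits, then the leading "3".
    return str(pi_scaled // 10 ** guard)[1:]


if __name__ == "__main__":
    digits = pi_digits(10000)
    print(digits.find("2016"))
```

The guard digits absorb the truncation error that accumulates from the integer divisions, so the digits far from the end of the string are reliable.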
Get rid of something every Thursday
I heard of someone who had a commitment to get rid of something every Thursday. I don’t know anything about how they carried that out. It could mean throwing out or donating to charity a physical object each Thursday. Or maybe it could be handing over a responsibility or letting go of an ambition. It could be a […]
Most popular posts of 2015
Here are this blog’s most popular technology posts of 2015: The most important skill in software development; Automate to save mental energy, not time; The success of object oriented programming; Learning (needlessly) hard technology. And here are the most popular math posts of 2015: Defining zero factorial; Life lessons from differential equations; Distance to Mars.
Automate to save mental energy, not time
Automation doesn’t always save as much time or effort as we expect. The xkcd cartoon above is looking at automation as an investment. Does the work I put in now eventually save more work than I put into it? Automation may be well worth it even if the answer is “no.” Automation can be like […]
The Dirac comb or Sha function
The sha function, also known as the Dirac comb, is denoted with the Cyrillic letter sha (Ш, U+0428). This letter was chosen because it looks like how people visualize the function, a long series of vertical spikes. The function is called the Dirac comb for the same reason. This function is very important in Fourier […]
The longer it has taken, the longer it will take
Suppose project completion time follows a Pareto (power law) distribution with parameter α. That is, for t > 1, the probability that completion time is bigger than t is t^{-α}. (We start out time at t = 1 because that makes the calculations a little simpler.) Now suppose we know that a project has lasted […]
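The excerpt is cut off before its own calculation, so the following is offered independently of it: a small sketch of the standard conditioning property of the Pareto distribution, with hypothetical function names. Given that a project has already lasted t0, the survival probability depends only on the ratio t/t0, and for α > 1 the expected additional time is t0/(α − 1), which grows with t0.

```python
def survival(t, alpha):
    """P(T > t) = t**(-alpha) for t >= 1, the distribution in the excerpt."""
    if t < 1:
        raise ValueError("t must be at least 1")
    return t ** (-alpha)


def conditional_survival(t, t0, alpha):
    """P(T > t | T > t0) for t >= t0 >= 1; equals (t / t0)**(-alpha)."""
    if t < t0:
        raise ValueError("t must be at least t0")
    return survival(t, alpha) / survival(t0, alpha)


def expected_additional_time(t0, alpha):
    """E[T - t0 | T > t0] = t0 / (alpha - 1), finite only when alpha > 1."""
    if alpha <= 1:
        raise ValueError("the conditional mean is infinite for alpha <= 1")
    return t0 / (alpha - 1)
```

For example, with α = 2 a project that has lasted 10 time units is expected to run 10 more, and one that has lasted 20 is expected to run 20 more.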
Two meanings of distribution
There are a couple common uses of the term distribution in math. The most familiar is probability distribution, such as a beta distribution, a Poisson distribution, etc. Less familiar but still common is distributions in the sense of generalized functions, like the Dirac delta distribution. Anybody with much exposure to math will have heard of a […]
Restarting @DSP_fact, ending @PerlRegex
I’m making a couple changes to my Twitter accounts. First, I’m winding down @PerlRegex. I’ll stop tweeting there when my scheduled tweets run out. I suggest that everyone who has been following @PerlRegex start following @RegexTip instead. The latter is more general, but is mostly compatible with Perl. Second, I’m reviving my @DSP_Fact. I stopped […]
Retooling
I was listening to a classical music station yesterday, and I heard the story of a professional pianist whose hand was injured in an accident. He then started learning trumpet and two years later he was a professional trumpeter. I didn’t catch the musician’s name. I was not surprised that a professional in one instrument could become […]
Sinc and Jinc sums
In the previous post, we looked at an elegant equation involving integrals of the sinc function and computed the corresponding integrals for the jinc function. It turns out the analogous equation holds for sums as well: As before, we’d like to compute these two sums and see whether we can compute the corresponding sums for the […]
Sinc and Jinc integrals
The sinc function is defined by sinc(x) = sin(x)/x. Philip Woodward introduced the name of the function in 1952, saying it “occurs so often in Fourier analysis and its applications that it does seem to merit some notation of its own.” Here’s an elegant equation involving the integrals of the sinc function: When I ran […]
Unicode / LaTeX page updated
Almost three years ago I put up a web page to let you go back and forth between Unicode code points and LaTeX commands. Here’s the page and here’s a blog post explaining it. I’ve expanded the data the page uses by merging in data from the STIX Project. More queries should return successfully now. […]
Typesetting and computing continued fractions
The other day I ran across the following continued fraction for π. Source: L. J. Lange, An Elegant Continued Fraction for π, The American Mathematical Monthly, Vol. 106, No. 5 (May, 1999), pp. 456-458. While the continued fraction itself is interesting, I thought I’d use this as an example of how to typeset and compute […]
Fourier analysis notes
There are six or eight ways to define a Fourier transform. The differences in the various conventions are minor, but they lead to differences in the basic results. So whenever you look up a result, you have to make sure the reference’s definition matches the one you’re expecting. Or maybe you re-derive the result. This is good […]
Big data paradox
This is what the book Social Media Mining calls the Big Data Paradox: Social media data is undoubtedly big. However, when we zoom into individuals for whom, for example, we would like to make relevant recommendations, we often have little data for each specific individual. We have to exploit the characteristics of social media and […]
Alternating sums of factorials
Richard Guy’s Strong Law of Small Numbers says There aren’t enough small numbers to meet the many demands made of them. In his article by the same name [1] Guy illustrates his law with several examples of patterns that hold for small numbers but eventually fail. One of these examples is 3! – 2! + 1! = […]
Non-technical books I’ve written about this year
Here are some of the non-technical books I’ve mentioned in blog posts this year. I posted the technical list a couple days ago. Maybe I should say “less technical” rather than “non-technical.” For example, Surely You’re Joking, Mr. Feynman is a book about a physicist, but it’s at least as much a human interest book as […]
Timidity about approximating
“Nature does not consist entirely, or even largely, of problems designed by a Grand Examiner to come out neatly in finite terms, and whatever subject we tackle the first need is to overcome timidity about approximating.” H. and B. S. Jeffreys, Methods of Mathematical Physics, 2nd ed., Cambridge University Press, 1950, p. 8. Related post: […]
Four ways to find hidden RSS feeds
RSS lets you subscribe to blogs. It also lets you read posts in peace, free from distracting peripheral ads. This explains why Google would kill off the world’s most popular RSS reader. Blogs used to display an icon linking to the site’s RSS feed, and many still do. Blogging software still creates RSS feeds, […]
Technical books I’ve written about this year
Here are some of the books I’ve mentioned in blog posts this year. One of these books may be just the present for a geek in your life. This post looks at technical books: math, science, engineering, programming. I’ll have a follow-up post with non-technical books I’ve written about. (Update: here’s the non-technical list.) Programming Book […]
Algorithms vs Moore’s Law
I saw an impressive chart once of how numerical linear algebra algorithm efficiency has improved over time. I haven’t been able to find that chart, but here is something similar. Thanks to Naveen Palli for pointing this out. Even more remarkable [than Moore’s law] — and even less widely understood — is that in many areas, performance gains due […]
It all boils down to linear algebra
When I was in college, my view of applied math was something like the following. Applied math is mostly mathematical physics. Mathematical physics is mostly differential equations. Numerical solution of differential equations boils down to linear algebra. Therefore the heart of applied math is linear algebra. I still think there’s a lot of truth in […]
Colors of noise
The term white noise is fairly common. People unfamiliar with its technical meaning will describe some sort of background noise, like a fan, as white noise. Less common are terms like pink noise, red noise, etc. The colors of noise are defined various ways, but they’re all based on an analogy between the power spectrum […]
Toy problems
A toy problem is a simplified problem meant to be a warm-up to a more complicated problem. I worked on a project earlier this year that was so complex that the write-up of the toy version grew to over 100 pages. We had to make a toy version of the toy version in order to have […]
R lists and XML
Hadley Wickham posted a photo on Twitter back in September illustrating R list indices with pepper: Then a few days ago, Jenny Bryan posted on Twitter her follow up, an analogous photo for XML: Related post: R without Hadley Wickham
Graph Laplacian and other matrices associated with a graph
There are several ways to associate a matrix with a graph. The properties of these matrices, especially spectral properties (eigenvalues and eigenvectors) tell us things about the structure of the corresponding graph. This post is a brief reference for keeping these various matrices straight. The most basic matrix associated with a graph is its adjacency […]
Dimensional analysis and types
This weekend I mentioned on Twitter that it’s spooky how well dimensional analysis catches errors. If you’re trying to calculate a number of horses, does your final result have units of horses? If it has units of cats or kilograms, something has gone wrong. This is such a simple idea, it’s remarkable that it’s worth checking. […]
Overestimating the competition
Richard Feynman tells a story in Surely You’re Joking, Mr. Feynman that I’m reminded of periodically when I realize something is smaller and less sophisticated than I imagined. [Update: A couple people pointed out in the comments that I got the roles of the two characters in this story reversed, so I’ve corrected this.] Feynman tells the story of […]
Learning (needlessly) hard technology
A few years ago, a friend told me he was thinking about learning a certain technology because it was really hard to use. This was not something that had to be complex to solve a complex problem, but something that was unnecessarily complex. Why would anyone do that? His reasoning was that as a consultant, […]
Estimating the exponent of discrete power law data
Suppose you have data from a discrete power law with exponent α. That is, the probability of an outcome n is proportional to n^{-α}. How can you recover α? A naive approach would be to gloss over the fact that you have discrete data and use the MLE (maximum likelihood estimator) for continuous data. That […]
Twitter account wordclouds
Here are wordclouds for some of my most popular Twitter accounts. Thanks to Mike Croucher for creating these images. He explains on his blog how to create your own Twitter wordclouds using R. My most popular account is CompSciFact, tweets about computer science and related topics. AlgebraFact is for algebra, number theory, and miscellaneous pure […]
Mathematical alchemy and wrestling
David Mumford wrote a blog post a few weeks ago in which he identified four tribes of mathematicians. Here’s a summary of his description of the four tribes. Explorers are people who ask — are there objects with such and such properties and if so, how many? … Alchemists … are those whose greatest excitement comes from […]
Numerical differentiation
Today I needed the derivative of the zeta function. SciPy implements the zeta function, but not its derivative, so I needed to write my own version. The most obvious way to approximate a derivative would be to simply stick a small step size into the definition of derivative: f’(x) ≈ (f(x+h) – f(x)) / […]
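The "most obvious" approach in this excerpt is the textbook forward difference. The sketch below implements it alongside a central difference for comparison. The central difference is an addition made here, not something the truncated excerpt states, and the default step sizes are assumptions rather than recommendations from the post. The test function is sin rather than zeta so that the example needs only the standard library.

```python
import math


def forward_difference(f, x, h=1e-6):
    """Approximate f'(x) by (f(x + h) - f(x)) / h; error is O(h)."""
    return (f(x + h) - f(x)) / h


def central_difference(f, x, h=1e-5):
    """Approximate f'(x) by (f(x + h) - f(x - h)) / (2h); error is O(h**2)."""
    return (f(x + h) - f(x - h)) / (2 * h)


if __name__ == "__main__":
    exact = math.cos(1.0)
    print(abs(forward_difference(math.sin, 1.0) - exact))
    print(abs(central_difference(math.sin, 1.0) - exact))
```

Making h smaller does not help indefinitely: below a certain point, floating point cancellation in the numerator outweighs the reduced truncation error.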
Splitting proofs in two
“Ever since Euclid, mathematical proofs have served a dual purpose: certifying that a statement is true and explaining why it is true. In the future these two epistemological functions may be divorced. In the future, the computer assistant may take care of the certification and leave the mathematician to look for an explanation that humans […]
Anthony Scopatz on xonsh and shells in general
Anthony Scopatz did an interview for Podcast.__init__ recently talking about xonsh, a command shell that blends Python and some traditions from bash. One line from the interview jumped out at me: … thinking very critically about what shells get used for and what they’re actually good at and what they’re not good at. I’ve wondered about […]
You do not want to be an edge case
Hilary Mason made an important observation on Twitter a few days ago: You do not want to be an edge case in this future we are building. Systems run by algorithms can be more efficient on average, but make life harder on the edge cases, people who are exceptions to the system developers’ expectations. Algorithms, whether encoded in software or […]
Project lead time
Large companies take longer to start projects. How much longer? A plausible guess is that project lead time would be proportional to the logarithm of the company size. If a company with n employees has a hierarchy with every manager having m subordinates, the number of management layers would be around log_m(n). If every project has […]
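To get a feel for how slowly log_m(n) grows, here is a minimal sketch with a hypothetical function name. It simply evaluates the layer estimate from the excerpt and makes no attempt at the lead-time argument that follows in the truncated text.

```python
import math


def management_layers(n_employees, span):
    """Estimate the number of management layers as log base `span` of n."""
    if n_employees < 1 or span <= 1:
        raise ValueError("need n_employees >= 1 and span > 1")
    return math.log(n_employees) / math.log(span)


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(n, round(management_layers(n, 10), 2))
```

With ten subordinates per manager, going from 100 employees to a million adds only about four layers.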
Bastrop State Park, four years later
Four years ago I wrote about the wildfires in Bastrop, Texas. Here’s a photo from the time by Kerri West, used by permission. Today I visited Bastrop State Park on the way home from Austin. Some trees, particularly oaks, survived the fires. Pines have come back on their own in parts of the park. A volunteer working […]