Feed john-d-cook John D. Cook

Link https://www.johndcook.com/blog
Feed http://feeds.feedburner.com/TheEndeavour?format=xml
Updated 2025-04-26 08:31
The name we give to bright ideas
From The Book of Strange New Things: … I said that if science could come up with something like the Jump it could surely solve a problem like that. Severin seized hold of that word, “science.” Science, he said, is not some mysterious larger-than-life force, it’s just the name we give to bright ideas that […]
Algorithmic wizardry
Last week I wrote a short commentary on James Hague’s blog post Organization skills beat algorithmic wizardry. This week that post got more traffic than my server could handle. I believe it struck a chord with experienced software developers who know that the challenges they face now are not like the challenges they prepared for in school. Although […]
The Nickel Tour
If you’re new to this blog, welcome! Let me show you around. Here are some of the most popular posts on this site and some other things I’ve written. If you’d like to subscribe to this site you can do so by RSS or email. I also have a monthly newsletter. You can find out more about me and my background here. […]
The most important skill in software development
Here’s an insightful paragraph from James Hague’s blog post Organization skills beat algorithmic wizardry: When it comes to writing code, the number one most important skill is how to keep a tangle of features from collapsing under the weight of its own complexity. I’ve worked on large telecommunications systems, console games, blogging software, a bunch […]
AI Spring
Artificial intelligence, or at least the perception of artificial intelligence, has gone from disappointing to frightening in the blink of an eye. As Marc Andreessen said on Twitter this morning: AI: From “It’s so horrible how little progress has been made” to “It’s so horrible how much progress has been made” in one step. When […]
Ursula K. Le Guin has it backward
Ursula K. Le Guin is asking people to not buy books from Amazon because they market bestsellers, the literary equivalent of junk food. She said last week I believe that reading only packaged microwavable fiction ruins the taste, destabilizes the moral blood pressure, and makes the mind obese. I agree with that. That’s why I shop […]
Reading equations forward and backward
There is no logical difference between writing A = B and writing B = A, but there is a psychological difference. Equations are typically applied left to right. When you write A = B you imply that it may be useful to replace A with B. This is helpful to keep in mind when learning something […]
Launching missiles with Haskell
Haskell advocates are fond of saying that a Haskell function cannot launch missiles without you knowing it. Pure functions have no side effects, so they can only do what they purport to do. In a language that does not enforce functional purity, calling a function could have arbitrary side effects, including launching missiles. But this […]
Mystery curve
This afternoon I got a review copy of the book Creating Symmetry: The Artful Mathematics of Wallpaper Patterns. Here’s a striking curve from near the beginning of the book, one that the author calls the “mystery curve.” The curve is the plot of exp(it) – exp(6it)/2 + i exp(-14it)/3 with t running from 0 to 2π. Here’s Python […]
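The excerpt cuts off before the post's own code, so what follows is not that code. It is a minimal sketch of how the stated formula could be sampled, using only the standard library. The function names `mystery_curve` and `sample_curve` are made up for illustration.

```python
import cmath
import math

def mystery_curve(t):
    """Point exp(it) - exp(6it)/2 + i*exp(-14it)/3, returned as a complex number."""
    return (cmath.exp(1j * t)
            - cmath.exp(6j * t) / 2
            + 1j * cmath.exp(-14j * t) / 3)

def sample_curve(n=1000):
    """Sample n + 1 points with t running from 0 to 2*pi inclusive."""
    return [mystery_curve(2 * math.pi * k / n) for k in range(n + 1)]

points = sample_curve()
xs = [z.real for z in points]  # x coordinates, ready for any plotting tool
ys = [z.imag for z in points]  # y coordinates
```

The three frequencies 1, 6, and -14 all leave remainder 1 when divided by 5, so advancing t by 2π/5 rotates the point by a fifth of a turn. That is one way to see why the plot has five-fold symmetry.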
RSS feeds for categories
You can subscribe to this blog using this RSS feed. If you would like to only subscribe to posts in certain categories, you can do so using the category-specific feeds below. Business Clinical trials Computing Creativity Graphics Machine learning Math Music Python Science Software development Statistics Typography Misc You can also subscribe to my Twitter […]
Unix-like shells on Windows
This post gives some notes on ways to create a Unix-like command line experience on Windows, without using a virtual machine like VMWare or a quasi-virtual machine like Cygwin. Finding Windows ports of Unix utilities is easy. The harder part is finding a shell that behaves as expected. (Of course “as expected” depends on your expectations!) There […]
Data, code, and regulation
Data is code and code is data. The distinction between software (“code”) and input (“data”) is blurry at best, arbitrary at worst. And this distinction, or lack thereof, has interesting implications for regulation. In some contexts software is regulated but data is not, or at least software comes under different regulations than data. For example, […]
Subway map of the solar system
This is a thumbnail version of a large, high-resolution image by Ulysse Carion. Thanks to Aleksey Shipilëv (@shipilev) for pointing it out. It’s hard to see in the thumbnail, but the map gives the change in velocity needed at each branch point. You can find the full 2239 x 2725 pixel image here or click on the […]
Fibonacci number system
Every positive integer can be written as the sum of distinct Fibonacci numbers. For example, 10 = 8 + 2, the sum of the fifth Fibonacci number and the second. This decomposition is unique if you impose the extra requirement that consecutive Fibonacci numbers are not allowed. [1] It’s easy to see that the rule against consecutive […]
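One way to find such a decomposition is the greedy approach: repeatedly subtract the largest Fibonacci number that fits. The sketch below assumes that approach, which the excerpt does not state, and numbers the Fibonacci sequence 1, 2, 3, 5, 8, … so that 8 is the fifth term and 2 the second, matching the example above. The name `zeckendorf` is illustrative.

```python
def zeckendorf(n):
    """Return distinct, non-consecutive Fibonacci numbers summing to n, largest first."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Build the Fibonacci numbers 1, 2, 3, 5, 8, ... that do not exceed n.
    fibs = [1, 2]
    while fibs[-1] + fibs[-2] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    parts = []
    # Greedy step: take each Fibonacci number that still fits, largest first.
    for f in reversed(fibs):
        if f <= n:
            parts.append(f)
            n -= f
    return parts

example = zeckendorf(10)  # [8, 2]
```

After the greedy step takes a Fibonacci number, the remainder is smaller than the next lower one, so two consecutive Fibonacci numbers never both appear.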
New monthly newsletter
Thank you for reading my blog. I’m starting a new email newsletter to address two things that readers have mentioned. Some say they enjoy the blog, but I post more often than they care to keep up with, particularly if they’re only interested in the non-technical posts. Others have said they’d like to know more about […]
Information hiding
One of the basic principles of software development is information hiding. People agree that it’s desirable, but may not realize they have different ideas of what it means. And when done poorly, well-meaning attempts to make software more maintainable backfire. Leo Brodie cautions … we should clarify. From what, or whom, are we hiding information? […]
Rotating PDF pages with Python
Yesterday I got a review copy of Automate the Boring Stuff with Python. It explains, among other things, how to manipulate PDFs from Python. This morning I needed to rotate some pages in a PDF, so I decided to try out the method in the book. The sample code uses PyPDF2. I’m using Conda for […]
RSS feeds for Twitter accounts
Twitter once provided RSS feeds for all Twitter accounts. They no longer provide this service. However, third parties can create RSS feeds from the content of Twitter accounts. BazQux has done this for my daily tip accounts, so you can subscribe to any of my accounts via RSS using the feeds linked to below. AlgebraFact AnalysisFact […]
Scientifically valid, practically invalid
In a recent episode of EconTalk, Phil Rosenzweig describes how the artificial conditions necessary to make experiments scientifically valid can also make the results practically invalid. Rosenzweig discusses experiments designed to study decision making. In order to make clean comparisons, subjects are presented with discrete choices over which they have no control. They cannot look for […]
The Mozart Myth
I don’t know how many times I’ve heard about how Mozart would compose entire musical scores in his head and only write them down once they were finished. Even authors who stress that creativity requires false starts and hard work have said that Mozart may have been an exception. But maybe he wasn’t. In his new book How to […]
Pedantic arithmetic rules
Generations of math teachers have drilled into their students that they must reduce fractions. That serves some purpose in the early years, but somewhere along the way students need to learn reducing fractions is not only unnecessary, but can be bad for communication. For example, if the fraction 45/365 comes up in the discussion of […]
QR Codes and Percolation
Percolation theory looks at problems such as the probability of being able to traverse some region with random obstacles. It is motivated by problems such as modeling the flow of a fluid in a porous medium. Here’s a percolation problem for QR codes: What is the probability that there is a path from one side […]
Why is an empty sum 0 and an empty product 1?
In response to my earlier post on why 0! should be 1, several people replied that 0! = 1 because an empty product is 1. You can define the factorial of an integer n as the product of all positive integers less than or equal to n. There are no positive integers less than or equal […]
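Python's standard library follows the same conventions, which makes for a quick check. This is a small sketch, and the helper name `factorial_as_product` is made up for illustration.

```python
import math

empty_sum = sum([])            # 0: adding nothing leaves the additive identity
empty_product = math.prod([])  # 1: multiplying nothing leaves the multiplicative identity

def factorial_as_product(n):
    """n! as the product of the positive integers up to n; an empty range when n == 0."""
    return math.prod(range(1, n + 1))

zero_factorial = factorial_as_product(0)  # 1, agreeing with math.factorial(0)
```

With these conventions, splitting a list in two and combining the partial sums or products gives the right answer even when one part is empty.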
Quantifying uncertainty
The primary way to quantify uncertainty is to use probability. Subject to certain axioms that aim to capture common-sense rules for quantifying uncertainty, probability theory is essentially the only way. (This is Cox’s theorem.) Other methods, such as fuzzy logic, may be useful, though they must violate common sense (at least as defined by Cox’s theorem) […]
Defining zero factorial
Things are defined the way they are for good reasons. This seems blatantly obvious now, but it was eye-opening when I learned this my first year in college. Our professor, Mike Starbird, asked us to go home and think about how convergence of a series should be defined. Not how it is defined, but how […]
Why not statistics
Jordan Ellenberg’s parents were both statisticians. In his interview with Strongly Connected Components Jordan explains why he went into mathematics rather than statistics. I tried. I tried to learn some statistics actually when I was younger and it’s a beautiful subject. But at the time I think I found the shakiness of the philosophical underpinnings […]
Another reason we don’t apply the 80-20 rule
I’ve written about the 80-20 rule several times because it keeps coming up. I’d like to believe that each time I revisit it I understand it a little better. In its simplest form the 80-20 rule says 80% of your outputs come from 20% of your inputs. You might find that 80% of your revenue comes from 20% of […]
Endorsements
I’ve added a page for endorsements to my site. Thanks to everyone who let me use their photo and quote. If you’d like to contribute an endorsement, please contact me.
Magicians vs Repairmen
From The World Beyond Your Head: The appeal of magic is that it promises to render objects plastic to the will without one’s getting too entangled with them. Treated at arm’s length, the object can issue no challenge to the self. … The clearest contrast … that I can think of is the repairman, who […]
Looking ten years ahead
From Freeman Dyson: Economic forecasting is useful for predicting the future up to about ten years ahead. Beyond ten years the quantitative changes which the forecast assesses are usually sidetracked or made irrelevant by qualitative changes in the rules of the game. Qualitative changes are produced by human cleverness … or by human stupidity … Neither […]
Key fobs and interstellar space
From JPL scientist Rich Terrile: In everyone’s pocket right now is a computer far more powerful than the one we flew on Voyager, and I don’t mean your cell phone—I mean the key fob that unlocks your car. These days technology is equated with computer technology. For example, the other day I heard someone talk […]
Integration by Darts
Monte Carlo integration has been called “Integration by Darts,” a clever pun on “integration by parts.” I ran across the phrase looking at some slides by Brian Hayes, but apparently it’s been around a while. The explanation that Monte Carlo is “integration by darts” is fine as a 0th order explanation, but it can be […]
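The darts picture corresponds to hit-or-miss Monte Carlo: throw random points at a bounding box and count the fraction that land under the curve. The sketch below assumes a nonnegative integrand with a known upper bound. The function name and parameters are illustrative and are not taken from the post or the slides.

```python
import random

def integrate_by_darts(f, a, b, height, n=100_000, seed=0):
    """Estimate the integral of f over [a, b], assuming 0 <= f(x) <= height there."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    hits = 0
    for _ in range(n):
        x = rng.uniform(a, b)       # horizontal position of the dart
        y = rng.uniform(0, height)  # vertical position of the dart
        if y <= f(x):
            hits += 1               # the dart landed under the curve
    # Fraction of hits times the area of the bounding box.
    return (b - a) * height * hits / n

# Integral of x^2 over [0, 1]; the exact value is 1/3.
estimate = integrate_by_darts(lambda x: x * x, 0.0, 1.0, 1.0)
```

The error of such an estimate shrinks like one over the square root of the number of darts, so each extra digit of accuracy costs roughly a hundred times more samples.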
Bayes factors vs p-values
Bayesian analysis and Frequentist analysis often lead to the same conclusions by different routes. But sometimes the two forms of analysis lead to starkly different conclusions. The following illustration of this difference comes from a talk by Luis Pericchi last week. He attributes the example to “Bernardo (2010)” though I have not been able to find the exact […]
Pros and cons of the term “data science”
I’ve resisted using the term “data science,” and enjoy poking fun at it now and then, but I’ve decided it’s not such a bad label after all. Here are some of the pros and cons of the term. (Listing “cons” first seems backward, but I’m currently leaning toward the pro side, so I thought I […]
Replace data with measurements
To tell whether a statement about data is over-hyped, see whether it retains its meaning if you replace data with measurements. So a request like “Please send me the data from your experiment” becomes “Please send me the measurements from your experiment.” Same thing. But rousing statements about the power of data become banal or even […]
Clinical trials and machine learning
Arguments over the difference between statistics and machine learning are often pointless. There is a huge overlap between the two approaches to analyzing data, sometimes obscured by differences in vocabulary. However, there is one distinction that is helpful. Statistics aims to build accurate models of phenomena, implicitly leaving the exploitation of these models to others. Machine learning aims to solve […]
Fitting a triangular distribution
Sometimes you only need a rough fit to some data and a triangular distribution will do. As the name implies, this is a distribution whose density function graph is a triangle. The triangle is determined by its base, running between points a and b, and a point c somewhere in between where the altitude intersects the base. […]
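Since the triangle must have area 1, its height at c is 2/(b – a). A minimal sketch of the resulting density follows. It covers only the density itself, not the fitting procedure, which the excerpt cuts off before describing. The name `triangular_pdf` is illustrative.

```python
def triangular_pdf(x, a, b, c):
    """Density of the triangular distribution with base [a, b] and peak at c."""
    if not (a < b and a <= c <= b):
        raise ValueError("need a < b and a <= c <= b")
    if x < a or x > b:
        return 0.0
    peak = 2.0 / (b - a)  # height that makes the triangle's area equal to 1
    if x < c:
        return peak * (x - a) / (c - a)  # rising edge from a up to c
    if x > c:
        return peak * (b - x) / (b - c)  # falling edge from c down to b
    return peak

density_at_peak = triangular_pdf(1.0, a=0.0, b=4.0, c=1.0)  # 0.5
```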
A subtle way to over-fit
If you train a model on a set of data, it should fit that data well. The hope, however, is that it will fit a new set of data well. So in machine learning and statistics, people split their data into two parts. They train the model on one half, and see how well it […]
Mathematical arbitrage
I suspect there’s a huge opportunity in moving mathematics from the pure column to the applied column. There may be a lot of useful math that never sees application because the experts are unconcerned with or unaware of applications. In particular I wonder what applications there may be of number theory, especially analytic number theory. […]
Mathematical modeling in Milton
In Book VIII of Paradise Lost, the angel Raphael tells Adam what difficulties men will have with astronomy: Hereafter, when they come to model heaven And calculate the stars: how they will wield The mighty frame, how build, unbuild, contrive To save appearances, how gird the sphere With centric and eccentric scribbled o’er, Cycle […]
Partitioning natural numbers with pi
Every positive integer is either part of the sequence ⌊ nπ ⌋ or the sequence ⌊ nπ/(π – 1) ⌋ where n ranges over positive integers, and no positive integer is in both sequences. This is a special case of Beatty’s theorem.
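The claim is easy to check numerically for small values. The sketch below does so with ordinary floating point, which is assumed to be accurate enough for the range tested. The helper name `beatty` and the cutoff of 1000 are arbitrary choices for illustration.

```python
import math

def beatty(alpha, limit):
    """Terms floor(n * alpha) for n = 1, 2, ... that do not exceed limit."""
    terms = []
    n = 1
    while math.floor(n * alpha) <= limit:
        terms.append(math.floor(n * alpha))
        n += 1
    return terms

limit = 1000
first = beatty(math.pi, limit)                    # 3, 6, 9, 12, ...
second = beatty(math.pi / (math.pi - 1), limit)   # 1, 2, 4, 5, ...

# Every integer from 1 to limit should appear exactly once across both lists.
is_partition = sorted(first + second) == list(range(1, limit + 1))
```

Beatty's theorem applies here because 1/π + (π – 1)/π = 1 and π is irrational.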
Extremely small probabilities
One objection to modeling adult heights with a normal distribution is that the former is obviously positive but the latter can be negative. However, by this model negative heights are astronomically unlikely. I’ll explain below how one can take “astronomically” literally in this context. A common model says that men’s and women’s heights are normally […]
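To get a feel for the size of such a probability, here is a sketch that evaluates the normal lower tail with the complementary error function, which stays accurate far out in the tail. The excerpt ends before giving the model's parameters, so the mean of 70 inches and standard deviation of 3 inches below are hypothetical values chosen only for illustration.

```python
import math

def normal_cdf(x, mean, sd):
    """P(X <= x) for X ~ Normal(mean, sd), using erfc for accuracy in the far tail."""
    z = (x - mean) / sd
    return 0.5 * math.erfc(-z / math.sqrt(2))

# Hypothetical parameters (inches), not taken from the post.
p_negative = normal_cdf(0.0, mean=70.0, sd=3.0)
```

With these assumed parameters a height of zero lies more than 23 standard deviations below the mean, and the resulting probability is far smaller than one over the number of atoms in the observable universe, which is commonly put at around 10^80.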
Atavachron
In the Star Trek episode “All Our Yesterdays” the people of the planet Sarpeidon have escaped into their past because their sun is about to become a supernova. They did this via a time machine called the Atavachron. One detail of the episode has stuck with me since I first saw it many years ago: although people can go back […]
Why isn’t everything normally distributed?
Adult heights follow a Gaussian, a.k.a. normal, distribution [1]. The usual explanation is that many factors go into determining one’s height, and the net effect of many separate causes is approximately normal because of the central limit theorem. If that’s the case, why aren’t more phenomena normally distributed? Someone asked me this morning specifically about […]
Machine learning and magic
When I first heard about a lie detector as a child, I was puzzled. How could a machine detect lies? If it could, why couldn’t you use it to predict the future? For example, you could say “IBM stock will go up tomorrow” and let the machine tell you whether you’re lying. Of course lie […]
Quaternions in Paradise Lost
Last night I checked a few books out from a library. One was Milton’s Paradise Lost and another was Kuipers’ Quaternions and Rotation Sequences. I didn’t expect any connection between these two books, but there is one. The following lines from Book V of Paradise Lost, starting at line 180, are quoted in Kuipers’ book: Air […]
Technical notes
For the last fifteen Wednesdays I’ve been posting links to technical notes. This is the end of the series. You can find most of the links from previous Wednesday posts on one page by going to technical notes from the navigation menu at the top of the site.
Oil on a parking lot
Oil on a wet parking lot
Graphemes
Here’s something amusing I ran across in the glossary of Programming Perl: grapheme A graphene is an allotrope of carbon arranged in a hexagonal crystal lattice one atom thick. Grapheme, or more fully, a grapheme cluster string is a single user-visible character, which in turn may be several characters (codepoints) long. For example … a “ȫ” […]
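The point about one visible character spanning several codepoints can be seen from Python's standard `unicodedata` module. A small sketch, using the precomposed character U+022B for "ȫ":

```python
import unicodedata

composed = "\u022b"  # "ȫ" as one codepoint: LATIN SMALL LETTER O WITH DIAERESIS AND MACRON
decomposed = unicodedata.normalize("NFD", composed)  # base letter plus combining marks

composed_length = len(composed)      # 1 codepoint
decomposed_length = len(decomposed)  # 3 codepoints: "o", U+0308, U+0304
names = [unicodedata.name(ch) for ch in decomposed]
```

Both strings display as the same single grapheme, yet `len` reports different lengths because it counts codepoints.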
Too easy
When people sneer at a technology for being too easy to use, it’s worth trying out. If the only criticism is that something is too easy or “OK for beginners” then maybe it’s a threat to people who invested a lot of work learning to do things the old way. The problem with the “OK […]