Feed john-d-cook John D. Cook

Link https://www.johndcook.com/blog
Feed http://feeds.feedburner.com/TheEndeavour?format=xml
Updated 2024-11-21 16:32
Duct tape value creation
Excerpt from John Carmack's review of the book Bullshit Jobs. He talks about how software developers bemoan duct taping systems together, and would rather work on core technologies. He thinks it is some tragic failure, that if only wise system design was employed, you wouldn't be doing all the duct taping. Wrong. Every expansion [...]The post Duct tape value creation first appeared on John D. Cook.
Condition on your data
Suppose you design an experiment, an A/B test of two page designs, randomizing visitors to Design A or Design B. You planned to run the test for 800 visitors and you calculated some confidence level for your experiment. You decide to take a peek at the data after only 300 randomizations, even though your [...]The post Condition on your data first appeared on John D. Cook.
Can you look at experimental results along the way or not?
Suppose you're running an A/B test to determine whether a web page produces more sales with one graphic versus another. You plan to randomly assign image A or B to 1,000 visitors to the page, but after only randomizing 500 visitors you want to look at the data. Is this OK or not? Of course [...]The post Can you look at experimental results along the way or not? first appeared on John D. Cook.
One-liner to troubleshoot LaTeX references
In LaTeX, sections are labeled with commands like \label{foo} and referenced like \ref{foo}. Referring to sections by labels rather than hard-coded numbers allows references to automatically update when sections are inserted, deleted, or rearranged. For every reference there ought to be a label. A label without a corresponding reference is fine, though it might be [...]The post One-liner to troubleshoot LaTeX references first appeared on John D. Cook.
A “well-known” series
I was reading an article [1] that refers to "a well-known trigonometric series" that I'd never seen before. This paper cites [2] which gives the series as Note that the right hand side is not a series in but rather in sin . Motivation Why might you know sin and want to calculate [...]The post A "well-known" series first appeared on John D. Cook.
Probability, cryptography, and naïveté
Probability and cryptography have this in common: really smart people can be confidently wrong about both. I wrote years ago about how striking it was to see two senior professors arguing over an undergraduate probability exercise. As I commented in that post, Professors might forget how to do a calculus problem, or make a mistake [...]The post Probability, cryptography, and naïveté first appeared on John D. Cook.
Thinking by playing around
Richard Feynman's Nobel Prize winning discoveries in quantum electrodynamics were partly inspired by his randomly observing a spinning dinner plate one day in the cafeteria. Paul Feyerabend said regarding science discovery, "The only principle that does not inhibit progress is: anything goes" (within relevant ethical constraints, of course). Ideas can come from anywhere, including physical [...]The post Thinking by playing around first appeared on John D. Cook.
Approximation by prime powers
The well-known Weierstrass approximation theorem says that polynomials are dense in C[0, 1]. That is, given any continuous function f on the unit interval, and any ε > 0, you can find a polynomial P such that f and P are never more than ε apart. This means that linear combinations of the polynomials 1, [...]The post Approximation by prime powers first appeared on John D. Cook.
Logarithm approximation curiosity
I've written before about three simple approximations for logarithms: for base 10, log10(x) ≈ (x - 1)/(x + 1); for base e, loge(x) ≈ 2(x - 1)/(x + 1); and for base 2, log2(x) ≈ 3(x - 1)/(x + 1). These can be used to mentally approximate logarithms to moderate accuracy, accurate enough for quick estimates. Here's [...]The post Logarithm approximation curiosity first appeared on John D. Cook.
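As a rough check of the three approximations quoted above, the following sketch compares each one with the library logarithm at a single point. The function name `log_approximations` and the test value x = 1.2 are illustrative assumptions, not taken from the post.

```python
import math

def log_approximations(x):
    """Return (approximation, exact) pairs for bases 10, e, and 2.

    Hypothetical helper: each approximation is a multiple of (x - 1)/(x + 1).
    """
    r = (x - 1) / (x + 1)
    return {
        "log10": (r, math.log10(x)),
        "ln": (2 * r, math.log(x)),
        "log2": (3 * r, math.log2(x)),
    }

# Compare the approximations with the exact values near x = 1.
for name, (approx, exact) in log_approximations(1.2).items():
    print(f"{name}: approx={approx:.4f} exact={exact:.4f}")
```

The approximations are only meant for quick estimates, so the comparison is best read as a sanity check for values of x near 1.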
Iterated Mersenne primes
A Mersenne number is a number of the form 2^k - 1. A Mersenne prime is a Mersenne number which is also a prime. It turns out that if 2^k - 1 is prime then k must be prime, so Mersenne primes have the form 2^p - 1 where p is prime. What about the converse? If [...]The post Iterated Mersenne primes first appeared on John D. Cook.
Small probabilities add, big ones don’t
A video has been making the rounds in which a well-known professor [1] says that if something has a 20% probability of happening in one attempt, then it has a 40% chance of happening in two attempts, a 60% chance of happening in three attempts, etc. This is wrong, but it's a common mistake. And [...]The post Small probabilities add, big ones don't first appeared on John D. Cook.
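To see why the addition rule fails, here is a minimal sketch that assumes the attempts are independent (an assumption not stated in the excerpt). Under that assumption the chance of at least one success in n attempts is 1 - (1 - p)^n. The function name `at_least_once` is hypothetical.

```python
def at_least_once(p, n):
    """Probability of at least one success in n independent attempts,
    each succeeding with probability p (independence is assumed)."""
    return 1 - (1 - p) ** n

# Compare the naive sum n*p with the value under independence.
for p in (0.2, 0.001):
    for n in (1, 2, 3):
        print(f"p={p}, n={n}: naive={n * p:.6f}, actual={at_least_once(p, n):.6f}")
```

With p = 0.2 the two-attempt probability is 0.36 rather than 0.40, while for very small p the naive sum and the actual value nearly coincide, which matches the post's title.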
Logistic regression quick takes
This post is a series of quick thoughts related to logistic regression. It starts with this article on moving between logit and probability scales. *** Logistic regression models the probability of a yes/no event occurring. It gives you more information than a model that simply tries to classify yeses and nos. I advised a client [...]The post Logistic regression quick takes first appeared on John D. Cook.
Numerical application of mean value theorem
Suppose you'd like to evaluate the function f(z) = (exp(z) - 1 - z)/z^2 for small values of z, say z = 10^-8. This example comes from [1]. The Python code from numpy import exp def f(z): return (exp(z) - 1 - z)/z**2 print(f(1e-8)) prints -0.607747099184471. Now suppose you suspect numerical difficulties and compute your result to 50 decimal places using bc [...]The post Numerical application of mean value theorem first appeared on John D. Cook.
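The expression in the excerpt's code, (exp(z) - 1 - z)/z**2, loses precision for tiny z because the numerator subtracts nearly equal numbers. One common workaround, which is not necessarily the approach the post goes on to describe, is to use the Taylor series 1/2 + z/6 + z^2/24 + ... for small z. The names `f_naive` and `f_series` below are hypothetical.

```python
import math

def f_naive(z):
    # Direct formula; suffers from cancellation when z is tiny.
    return (math.exp(z) - 1 - z) / z**2

def f_series(z, terms=6):
    # Taylor series of (exp(z) - 1 - z)/z^2: sum of z^k / (k + 2)!.
    # Only intended for small |z|, where a few terms suffice.
    return sum(z**k / math.factorial(k + 2) for k in range(terms))

print(f_naive(1e-8))   # unreliable at this scale
print(f_series(1e-8))  # close to the limiting value 1/2
```

For moderate z, such as z = 0.1, the two functions agree closely, so the series is only needed in the small-z regime.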
Numerical differentiation with a complex step
The most obvious way to approximate the derivative of a function numerically is to use the definition of derivative and stick in a small value of the step size h. f'(x) ≈ ( f(x + h) - f(x) ) / h. How small should h be? Since the exact value of the derivative is the [...]The post Numerical differentiation with a complex step first appeared on John D. Cook.
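The excerpt stops before the complex-step idea named in the title, so the following is only a sketch of the standard technique: for a real-analytic f, f'(x) ≈ Im f(x + ih) / h, which involves no subtraction and so tolerates a very small h. The function names and the default step sizes are assumptions chosen for illustration.

```python
import cmath
import math

def forward_difference(f, x, h=1e-8):
    # Definition-based estimate: (f(x + h) - f(x)) / h.
    return (f(x + h) - f(x)) / h

def complex_step(f, x, h=1e-20):
    # Complex-step estimate: Im f(x + ih) / h.
    # f must accept complex arguments.
    return f(complex(x, h)).imag / h

# Derivative of exp at x = 1 is e.
fd = forward_difference(math.exp, 1.0)
cs = complex_step(cmath.exp, 1.0)
print(abs(fd - math.e), abs(cs - math.e))
```

The forward difference has to balance truncation error against rounding error, whereas the complex step can use a tiny h because nothing is subtracted.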
MCMC and the coupon collector problem
Bob Carpenter wrote today about how Markov chains cannot thoroughly cover high-dimensional spaces, and that they do not need to. That's kinda the point of Monte Carlo integration. If you want systematic coverage, you can/must sample systematically, and that's impractical in high dimensions. Bob gives the example that if you want to get one integration [...]The post MCMC and the coupon collector problem first appeared on John D. Cook.
Up and down the abstraction ladder
It's easier to go up a ladder than to come down, literally and metaphorically. Gian-Carlo Rota made a profound observation on the application of theory. One frequently notices, however, a wide gap between the bare statement of a principle and the skill required in recognizing that it applies to a particular problem. This isn't quite [...]The post Up and down the abstraction ladder first appeared on John D. Cook.
Making documents with makefiles
I learned to use the command line utility make in the context of building C programs. The program make reads an input file to tell it how to make things. To make a C program, you compile the source files into object files, then link the object files together. You can tell make what depends [...]The post Making documents with makefiles first appeared on John D. Cook.
Applied abstraction
"Good general theory does not search for the maximum generality, but for the right generality." - Saunders Mac Lane One of the benefits of a scripting language like Python is that it gives you generalizations for free. For example, take the function sorted. If you give it a list of integers, it will return [...]The post Applied abstraction first appeared on John D. Cook.
A deck of cards
One time when I was in grad school, I was a teaching assistant for a business math class that included calculus and a smattering of other topics, including a little bit of probability. I made up examples involving a deck of cards, but then learned to my surprise that not everyone was familiar with playing [...]The post A deck of cards first appeared on John D. Cook.
What can JWST see?
The other day I ran across this photo of Saturn's moon Titan taken by the James Webb Space Telescope (JWST). If JWST can see Titan with this kind of resolution, how well could it see Pluto or other planets? In this post I'll do some back-of-the-envelope calculations, only considering the apparent size of objects, ignoring [...]The post What can JWST see? first appeared on John D. Cook.
Fizz buzz walk
I ran across a graphic yesterday made by taking a sequence of steps of the same length, turning left on the nth step if n is prime, and otherwise continuing in the same direction. Here's my recreation of the first 1000 steps: You can see that in general it makes a lot of turns at [...]The post Fizz buzz walk first appeared on John D. Cook.
Closed-form solutions to nonlinear PDEs
The traditional approach to teaching differential equations is to present a collection of techniques for finding closed-form solutions to ordinary differential equations (ODEs). These techniques seem completely unrelated [1] and have arcane names such as integrating factors, exact equations, variation of parameters, etc. Students may reasonably come away from an introductory course with the false [...]The post Closed-form solutions to nonlinear PDEs first appeared on John D. Cook.
Choosing a Computer Language for a Project
Julia. Scala. Lua. TypeScript. Haskell. Go. Dart. Various computer languages new and old are sometimes proposed as better alternatives to mainstream languages. But compared to mainstream choices like Python, C, C++ and Java (cf. Tiobe Index), are they worth using? Certainly it depends a lot on the planned use: is it a one-off small project, or [...]The post Choosing a Computer Language for a Project first appeared on John D. Cook.
On greedy algorithms and rodeo clowns
This weekend I ran across a blog post The Rodeo Clown Theory of Personal Development. The title comes from a hypothetical example of a goal you don't know how to achieve: becoming a rodeo clown. Let's say you decide you want to be a rodeo clown. And let's say you're me and you have no [...]The post On greedy algorithms and rodeo clowns first appeared on John D. Cook.
Finding strings in binary files
There's a little program called strings that searches for what appear to be strings inside binary files. I'll refer to it as strings(1) to distinguish the program name from the common English word strings. [1] What does strings(1) consider to be a string? By default it is a sequence of four or more bytes that [...]The post Finding strings in binary files first appeared on John D. Cook.
Extract text from a PDF
Arshad Khan left a comment on my post on the less and more utilities saying "on ubuntu if I do less on a pdf file, it shows me the text contents of the pdf." Apparently this is an undocumented feature of GNU less. It works, but I don't see anything about it in the man [...]The post Extract text from a PDF first appeared on John D. Cook.
Length of a general Archimedean spiral
This post ties together the previous three posts. In this post, I said that an Archimedean spiral has the polar equation r = b θ^(1/n) and applied this here to rolls of carpet. When n = 1, the length of the spiral for θ running from 0 to T is approximately ½ bT^2 with the [...]The post Length of a general Archimedean spiral first appeared on John D. Cook.
How big will a carpet be when you roll or unroll it?
If you know the dimensions of a carpet, what will the dimensions be when you roll it up into a cylinder? If you know the dimensions of a rolled-up carpet, what will the dimensions be when you unroll it? This post answers both questions. Flexible carpet: solid cylinder The edge of a rolled-up carpet can [...]The post How big will a carpet be when you roll or unroll it? first appeared on John D. Cook.
Approximating a spiral by rings
An Archimedean spiral has the polar equation r = b θ^(1/n). This post will look at the case n = 1. I may look at more general values of n in a future post. (Update: See here.) The case n = 1 is the simplest case, and it's the case I needed for the client [...]The post Approximating a spiral by rings first appeared on John D. Cook.
Hypergeometric function of a large negative argument
It's occasionally necessary to evaluate a hypergeometric function at a large negative argument. I was working on a project today that involved evaluating F(a, b; c; z) where z is a large negative number. The hypergeometric function F(a, b; c; z) is defined by a power series in z whose coefficients are functions of a, [...]The post Hypergeometric function of a large negative argument first appeared on John D. Cook.
More is less
When I first started using Unix, I used a program called "more" to read files. The name makes sense because each time you press the space bar, more will show you more of your file, one screen at a time. Now everyone uses less, and more is all but forgotten. Daniel Halbert wrote more in [...]The post More is less first appeared on John D. Cook.
Precise answers to useless questions
I recently ran across a tweet from Allen Downey saying So much of 20th century statistics was just a waste of time, computing precise answers to useless questions. He's right. I taught mathematical statistics at GSBS [1, 2] several times, and each time I taught it I became more aware of how pointless some of [...]The post Precise answers to useless questions first appeared on John D. Cook.
Pairs in poker
An article by Y. L. Cheung [1] gives reasons why poker is usually played with five cards. The author gives several reasons, but here I'll just look at one reason: pairs don't act like you might expect if you have more than five cards. In five-card poker, the more pairs the better. Better here means [...]The post Pairs in poker first appeared on John D. Cook.
Solar system means
Yesterday I stumbled on the fact that the size of Jupiter is roughly the geometric mean between the sizes of Earth and the Sun. That's not too surprising: in some sense (i.e. on a logarithmic scale) Jupiter is the middle sized object in our solar system. What I find more surprising is that a systematic [...]The post Solar system means first appeared on John D. Cook.
Earth : Jupiter :: Jupiter : Sun
The size of Jupiter is approximately the geometric mean of the sizes of Sun and Earth. In terms of radii, The ratio on the left equals 9.95 and the ratio on the right equals 10.98. The subscripts are the astronomical symbols for the Sun (☉, U+2609), Jupiter (♃, U+2643), and Earth (🜨, U+1F728). I produced [...]The post Earth : Jupiter :: Jupiter : Sun first appeared on John D. Cook.
Gravity on Jupiter
I was listening to the latest episode of the Space Rocket History podcast. The show includes some audio from a documentary on Pioneer 11 that mentioned that a man would weigh 500 pounds on Jupiter. My immediate thought was "Is that all?! Is this 'man' a 100 pound boy?" The documentary was correct and my [...]The post Gravity on Jupiter first appeared on John D. Cook.
Are guidance documents laws?
Are guidance documents laws? No, but they can have legal significance. The people who generate regulatory guidance documents are not legislators. Legislators delegate to agencies to make rules, and agencies delegate to other organizations to make guidelines. For example [1], Even HHS, which has express cybersecurity rulemaking authority under the Health Insurance Portability and Accountability [...]The post Are guidance documents laws? first appeared on John D. Cook.
More Laguerre images
A week or two ago I wrote about Laguerre's root-finding method and made some associated images. This post gives a couple more examples. Laguerre's method is very robust in the sense that it is likely to converge to a root, regardless of the starting point. However, it may be difficult to predict which root the [...]The post More Laguerre images first appeared on John D. Cook.
A Bayesian approach to proving you’re human
I set up a GitHub account for a new employee this morning and spent a ridiculous amount of time proving that I'm human. The captcha was to listen to three audio clips at a time and say which one contains bird sounds. This is a really clever test, because humans can tell the difference between [...]The post A Bayesian approach to proving you're human first appeared on John D. Cook.
Antenna length: Another rule of 72
The famous Rule of 72 says that to find out how many years it takes an investment to double in value, divide 72 by the annual percentage rate. I'll come back to that in a little bit. This morning I read a really good article, Fifty Things you can do with a Software Defined Radio. [...]The post Antenna length: Another rule of 72 first appeared on John D. Cook.
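The Rule of 72 described above can be compared with the exact doubling time. This sketch assumes interest compounds once per year, so the exact time is ln 2 / ln(1 + r/100); the function names are hypothetical.

```python
import math

def doubling_time_rule72(rate_percent):
    # Rule of thumb: years to double is about 72 / rate.
    return 72 / rate_percent

def doubling_time_exact(rate_percent):
    # Exact doubling time, assuming annual compounding.
    return math.log(2) / math.log(1 + rate_percent / 100)

for rate in (2, 4, 8, 12):
    print(rate, doubling_time_rule72(rate), round(doubling_time_exact(rate), 3))
```

The two values agree most closely for rates near 8%, and drift apart slowly as the rate moves away from that.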
Hallucinations of AI Science Models
AlphaFold 2, FourCastNet and CorrDiff are exciting. AI-driven autonomous labs are going to be a big deal [1]. Science codes now use AI and machine learning to make scientific discoveries on the world's most powerful computers [2]. It's common practice for scientists to ask questions about the validity, reliability and accuracy of the mathematical and [...]The post Hallucinations of AI Science Models first appeared on John D. Cook.
Double super factorial
I saw someone point out recently that 10! = 7! * 5! * 3! * 1! Are there more examples like this? What would you call the pattern on the right? I don't think there's a standard name, but here's why I think it should be called double super factorial or super double factorial. Super [...]The post Double super factorial first appeared on John D. Cook.
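The identity in the excerpt is easy to verify by machine. The sketch below also defines a hypothetical helper, `double_super_factorial`, as the product k! (k-2)! (k-4)! ... down to 1! or 2!. That definition is an assumption based on the pattern shown, since the excerpt is cut off before the post defines the term.

```python
import math

def double_super_factorial(k):
    # Hypothetical definition: k! * (k-2)! * (k-4)! * ... (down to 1! or 2!).
    return math.prod(math.factorial(j) for j in range(k, 0, -2))

# Check the identity 10! = 7! * 5! * 3! * 1! quoted in the excerpt.
print(double_super_factorial(7) == math.factorial(10))

# Look for other small k whose product is itself a factorial.
factorials = {math.factorial(n): n for n in range(1, 60)}
for k in range(1, 20):
    n = factorials.get(double_super_factorial(k))
    if n is not None:
        print(f"k={k}: product equals {n}!")
```

The search range is arbitrary; it is only meant to show how one might look for further examples.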
Laguerre’s root finding method
Edmond Laguerre (1834-1886) came up with a method for finding zeros of polynomials. Unlike Newton's method for finding zeros of general functions, Laguerre's method is specialized for polynomials. Laguerre's method converges an order of magnitude faster than Newton's method, i.e. the error is cubed on each step rather than squared. The most interesting thing about [...]The post Laguerre's root finding method first appeared on John D. Cook.
Breach Safe Harbor
In the context of medical data, Safe Harbor typically refers to the Safe Harbor provisions of the HIPAA Privacy Rule explained here. Breach Safe Harbor is a little different. It basically means you're off the hook if you breach encrypted health data. (But not necessarily. More on that below.) I'm not a lawyer, so this [...]The post Breach Safe Harbor first appeared on John D. Cook.
MD5 hash collision example
Marc Stevens gave an example of two alphanumeric strings that differ in only one byte that have the same MD5 hash value. It may seem like beating a dead horse to demonstrate weaknesses in MD5, but it's instructive to study the flaws of broken methods. And despite the fact that MD5 has been broken for [...]The post MD5 hash collision example first appeared on John D. Cook.
Distance from a point to a line
Eric Lengyel's new book Projective Geometric Algebra Illuminated arrived yesterday and I'm enjoying reading it. Imagine if someone started with ideas like dot products, cross products, and determinants that you might see in your first year of college, then thought deeply about those things for years. That's kinda what the book is. Early in the [...]The post Distance from a point to a line first appeared on John D. Cook.
Experiences with Thread Programming in Microsoft Windows
Lately I've been helping a colleague to add worker threads to his GUI-based Windows application. Thread programming can be tricky. Here are a few things I've learned along the way. Performance. This app does compute-intensive work. It is helpful to offload this very compute-heavy work to a worker thread. Doing this frees the main thread [...]The post Experiences with Thread Programming in Microsoft Windows first appeared on John D. Cook.
Accelerating Archimedes
One way to approximate π is to find the areas of polygons inscribed inside a circle and polygons circumscribed outside the circle. The approximation improves as the number of sides in the polygons increases. This idea goes back at least as far as Archimedes (287-212 BC). Maybe you've tried this. It's a lot of work. [...]The post Accelerating Archimedes first appeared on John D. Cook.
How much will a cable sag? A simple approximation
Suppose you have a cable of length 2s suspended from two poles of equal height a distance 2x apart. Assuming the cable hangs in the shape of a catenary, how much does it sag in the middle? If the cable were pulled perfectly taut, we would have s = x and there would be no [...]The post How much will a cable sag? A simple approximation first appeared on John D. Cook.
Unique letter patterns in words
The word Mississippi has a unique pattern of letters. If you were solving a cryptogram puzzle and saw ZVFFVFFVCCV you might guess that the word is Mississippi. Is the pattern of letters in Mississippi literally unique or just uncommon? What is the shortest word with a unique letter pattern? The longest word? We can answer [...]The post Unique letter patterns in words first appeared on John D. Cook.
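One way to make "pattern of letters" concrete is to replace each letter by the order in which it first appears, so that two words share a pattern exactly when one is a simple substitution cipher of the other. This is a sketch of that idea, not necessarily the method used in the post, and the function name `letter_pattern` is hypothetical.

```python
def letter_pattern(word):
    """Map each letter to the index of its first appearance in the word."""
    first_seen = {}
    pattern = []
    for ch in word.lower():
        if ch not in first_seen:
            first_seen[ch] = len(first_seen)
        pattern.append(first_seen[ch])
    return tuple(pattern)

# The cryptogram from the excerpt has the same pattern as Mississippi.
print(letter_pattern("ZVFFVFFVCCV") == letter_pattern("Mississippi"))
```

Grouping a word list by `letter_pattern` would then show which patterns belong to a single word.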