Feed john-d-cook John D. Cook

Link https://www.johndcook.com/blog
Feed http://feeds.feedburner.com/TheEndeavour?format=xml
Updated 2024-11-23 14:01
Sinc and Jinc sums
In the previous post, we looked at an elegant equation involving integrals of the sinc function and computed the corresponding integrals for the jinc function. It turns out the analogous equation holds for sums as well: As before, we’d like to compute these two sums and see whether we can compute the corresponding sums for the […]
Sinc and Jinc integrals
The sinc function is defined by sinc(x) = sin(x)/x. Philip Woodward introduced the name of the function in 1952, saying it “occurs so often in Fourier analysis and its applications that it does seem to merit some notation of its own.” Here’s an elegant equation involving the integrals of the sinc function: When I ran […]
Unicode / LaTeX page updated
Almost three years ago I put up a web page to let you go back and forth between Unicode code points and LaTeX commands. Here’s the page and here’s a blog post explaining it. I’ve expanded the data the page uses by merging in data from the STIX Project. More queries should return successfully now. […]
Typesetting and computing continued fractions
Pi The other day I ran across the following continued fraction for π. Source: L. J. Lange, An Elegant Continued Fraction for π, The American Mathematical Monthly, Vol. 106, No. 5 (May, 1999), pp. 456-458. While the continued fraction itself is interesting, I thought I’d use this as an example of how to typeset and compute […]
Fourier analysis notes
There are six or eight ways to define a Fourier transform. The differences in the various conventions are minor, but they lead to differences in the basic results. So whenever you look up a result, you have to make sure the reference’s definition matches the one you’re expecting. Or maybe you re-derive the result. This is good […]
Big data paradox
This is what the book Social Media Mining calls the Big Data Paradox: Social media data is undoubtedly big. However, when we zoom into individuals for whom, for example, we would like to make relevant recommendations, we often have little data for each specific individual. We have to exploit the characteristics of social media and […]
Alternating sums of factorials
Richard Guy’s Strong Law of Small Numbers says There aren’t enough small numbers to meet the many demands made of them. In his article by the same name [1] Guy illustrates his law with several examples of patterns that hold for small numbers but eventually fail. One of these examples is 3! – 2! + 1! = […]
Non-technical books I’ve written about this year
Here are some of the non-technical books I’ve mentioned in blog posts this year. I posted the technical list a couple days ago. Maybe I should say “less technical” rather than “non-technical.” For example, Surely You’re Joking, Mr. Feynman is a book about a physicist, but it’s at least as much a human interest book as […]
Timidity about approximating
“Nature does not consist entirely, or even largely, of problems designed by a Grand Examiner to come out neatly in finite terms, and whatever subject we tackle the first need is to overcome timidity about approximating.” H. and B. S. Jeffreys, Methods of Mathematical Physics, 2nd ed., Cambridge University Press, 1950, p. 8. Related post: […]
Four ways to find hidden RSS feeds
RSS feeds RSS lets you subscribe to blogs. It also lets you read posts in peace, free from distracting peripheral ads. This explains why Google would kill off the world’s most popular RSS reader. Blogs used to display an icon linking to the site’s RSS feed, and many still do. Blogging software still creates RSS feeds, […]
Technical books I’ve written about this year
Here are some of the books I’ve mentioned in blog posts this year. One of these books may be just the present for a geek in your life. This post looks at technical books: math, science, engineering, programming. I’ll have a follow-up post with non-technical books I’ve written about. (Update: here’s the non-technical list.) Programming Book […]
Algorithms vs Moore’s Law
I saw an impressive chart once of how numerical linear algebra algorithm efficiency has improved over time. I haven’t been able to find that chart, but here is something similar. Thanks to Naveen Palli for pointing this out. Even more remarkable [than Moore’s law] — and even less widely understood — is that in many areas, performance gains due […]
It all boils down to linear algebra
When I was in college, my view of applied math was something like the following. Applied math is mostly mathematical physics. Mathematical physics is mostly differential equations. Numerical solution of differential equations boils down to linear algebra. Therefore the heart of applied math is linear algebra. I still think there’s a lot of truth in […]
Colors of noise
The term white noise is fairly common. People unfamiliar with its technical meaning will describe some sort of background noise, like a fan, as white noise. Less common are terms like pink noise, red noise, etc. The colors of noise are defined various ways, but they’re all based on an analogy between the power spectrum […]
Toy problems
A toy problem is a simplified problem meant to be a warm-up to a more complicated problem. I worked on a project earlier this year that was so complex that the write-up of the toy version grew to over 100 pages. We had to make a toy version of the toy version in order to have […]
R lists and XML
Hadley Wickham posted a photo on Twitter back in September illustrating R list indices with pepper: Then a few days ago, Jenny Bryan posted on Twitter her follow up, an analogous photo for XML: Related post: R without Hadley Wickham
Graph Laplacian and other matrices associated with a graph
There are several ways to associate a matrix with a graph. The properties of these matrices, especially spectral properties (eigenvalues and eigenvectors) tell us things about the structure of the corresponding graph. This post is a brief reference for keeping these various matrices straight. The most basic matrix associated with a graph is its adjacency […]
Dimensional analysis and types
This weekend I mentioned on Twitter that it’s spooky how well dimensional analysis catches errors. If you’re trying to calculate a number of horses, does your final result have units of horses? If it has units of cats or kilograms, something has gone wrong. This is such a simple idea, it’s remarkable that it’s worth checking. […]
Overestimating the competition
Richard Feynman tells a story in Surely You’re Joking, Mr. Feynman that I’m reminded of periodically when I realize something is smaller and less sophisticated than I imagined. [Update: A couple people pointed out in the comments that I got the roles of the two characters in this story reversed, so I’ve corrected this.] Feynman tells the story of […]
Learning (needlessly) hard technology
A few years ago, a friend told me he was thinking about learning a certain technology because it was really hard to use. This was not something that had to be complex to solve a complex problem, but something that was unnecessarily complex. Why would anyone do that? His reasoning was that as a consultant, […]
Estimating the exponent of discrete power law data
Suppose you have data from a discrete power law with exponent α. That is, the probability of an outcome n is proportional to n^(-α). How can you recover α? A naive approach would be to gloss over the fact that you have discrete data and use the MLE (maximum likelihood estimator) for continuous data. That […]
Twitter account wordclouds
Here are wordclouds for some of my most popular Twitter accounts. Thanks to Mike Croucher for creating these images. He explains on his blog how to create your own Twitter wordclouds using R. My most popular account is CompSciFact, tweets about computer science and related topics. AlgebraFact is for algebra, number theory, and miscellaneous pure […]
Mathematical alchemy and wrestling
David Mumford wrote a blog post a few weeks ago in which he identified four tribes of mathematicians. Here’s a summary of his description of the four tribes. Explorers are people who ask — are there objects with such and such properties and if so, how many? … Alchemists … are those whose greatest excitement comes from […]
Numerical differentiation
Today I needed the derivative of the zeta function. SciPy implements the zeta function, but not its derivative, so I needed to write my own version. The most obvious way to approximate a derivative would be to simply stick a small step size into the definition of derivative: f’(x) ≈ (f(x+h) – f(x)) / […]
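The forward difference mentioned in this excerpt can be sketched in a few lines. This is only an illustration of the standard definition-based approximation, not the code from the post: the function name `forward_difference`, the default step size `h`, and the use of `math.exp` as a test function are all assumptions made for the example.

```python
import math

def forward_difference(f, x, h=1e-6):
    """Approximate f'(x) with the forward difference (f(x + h) - f(x)) / h.

    The step size h is an assumed default chosen for illustration.
    """
    return (f(x + h) - f(x)) / h

# Compare against a function whose derivative is known exactly:
# the derivative of exp at 1 is e.
approx = forward_difference(math.exp, 1.0)
print(approx, math.e)
```

The same helper would accept any callable of one real argument, so a zeta implementation could be passed in place of `math.exp`.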
Splitting proofs in two
“Ever since Euclid, mathematical proofs have served a dual purpose: certifying that a statement is true and explaining why it is true. In the future these two epistemological functions may be divorced. In the future, the computer assistant may take care of the certification and leave the mathematician to look for an explanation that humans […]
Anthony Scopatz on xonsh and shells in general
Anthony Scopatz did an interview for Podcast.__init__ recently talking about xonsh, a command shell that blends Python and some traditions from bash. One line from the interview jumped out at me: … thinking very critically about what shells get used for and what they’re actually good at and what they’re not good at. I’ve wondered about […]
You do not want to be an edge case
Hilary Mason made an important observation on Twitter a few days ago: You do not want to be an edge case in this future we are building. Systems run by algorithms can be more efficient on average, but make life harder on the edge cases, people who are exceptions to the system developers’ expectations. Algorithms, whether encoded in software or […]
Project lead time
Large companies take longer to start projects. How much longer? A plausible guess is that project lead time would be proportional to the logarithm of the company size. If a company with n employees has a hierarchy with every manager having m subordinates, the number of management layers would be around log_m(n). If every project has […]
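As a rough sketch of the estimate in this excerpt, the number of management layers can be computed as a logarithm with base m. The function name `management_layers` and the sample company sizes are hypothetical, chosen only to show how slowly the estimate grows with n.

```python
import math

def management_layers(n, m):
    """Estimate the number of management layers as log base m of n.

    n is the number of employees, m the number of subordinates per manager.
    """
    return math.log(n) / math.log(m)

# Hypothetical company sizes, each with 10 subordinates per manager.
for employees in (100, 10_000, 1_000_000):
    print(employees, round(management_layers(employees, 10), 2))
```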
Bastrop State Park, four years later
Four years ago I wrote about the wildfires in Bastrop, Texas. Here’s a photo from the time by Kerri West, used by permission. Today I visited Bastrop State Park on the way home from Austin. Some trees, particularly oaks, survived the fires. Pines have come back on their own in parts of the park. A volunteer working […]
Interpreting scientific literature about your product
A medical device company approached me with the following problem. Scientists had written academic journal articles about their product, but the sales force couldn’t understand what they said. My task was to read the articles, then tell the people in sales what the articles were saying in laymen’s terms. One of the questions that came […]
Skin in the game for observational studies
The article Deming, data and observational studies by S. Stanley Young and Alan Karr opens with Any claim coming from an observational study is most likely to be wrong. They back up this assertion with data about observational studies later contradicted by prospective studies. Much has been said lately about the assertion that most published results are false, particularly […]
Intellectual property is hard to steal
It’s hard to transfer intellectual property. When I was managing software projects, it would take months to fully transfer a project from one person to another. This was with full access to and encouragement from the original developer. This was a transfer between peers, both part of the same environment with all its institutional memory. […]
New data, not just bigger data
The Insight 2015 conference highlighted some impressive applications of big data: predicting the path of hurricanes more accurately (as we saw with hurricane Patricia), improving the performance of athletes, making cars safer, etc. These applications involve large amounts of data. But more importantly they involve new data, not simply greater quantities of data we’ve had before. […]
Balancing profit and learning in A/B testing
A/B testing, or split testing, is commonly used in web marketing to decide which of two design options performs better. If you have so many visitors to a site that the number of visitors used in a test is negligible, conventional randomization schemes are the way to go. They’re simple and effective. But if you […]
Insight 2015
A few weeks ago I got a message on Twitter saying that IBM’s Watson had identified me as an “influencer” and invited me to the company’s Insight 2015 conference. So that’s where I am this week. I had a brief interview last night. Someone took this photo as we were setting up.
Distance to Mars
The distance between the Earth and Mars depends on their relative positions in their orbits and varies quite a bit over time. This post will show how to compute the approximate distance over time. We’re primarily interested in Earth and Mars, though this shows how to calculate the distance between any two planets. The planets […]
Impulse response
You may expect that a burst of input will cause a burst of output. Sometimes that’s the case, but often a burst of input results in a long, smoothly decreasing succession of output. You may not get immediate results, but long-term results. This is true of life in general, but it’s also true in a precise sense of differential equations. […]
Permutations and tests
Suppose a test asks you to place 10 events in chronological order. Label these events A through J so that chronological order is also alphabetical order. If a student answers BACDEFGHIJ, then did they make two mistakes or just one? Two events are in the wrong position, but they made one transposition error. The simplest way […]
How did our ancestors sleep?
Electric lighting has changed the way we sleep, encouraging us to lose sleep by staying awake much longer after dark than we otherwise would. Or maybe not. A new study of three contemporary hunter-gatherer tribes found that they stay awake long after dark and sleep an average of 6.5 hours a night. They also don’t nap […]
Fibonacci formula for pi
Here’s an unusual formula for pi based on the product and least common multiple of the first m Fibonacci numbers. Unlike the formula I wrote about a few days ago relating Fibonacci numbers and pi, this one is not as simple to prove. The numerator inside the root is easy enough to estimate asymptotically, […]
PACE: Property Assessed Clean Energy
Energy efficiency improvements can pay for themselves in the long run. Financing can make the improvements immediately cash-flow positive, but only if the loan tenor can match the useful life of the equipment. This enables the payments to be low enough that the projected energy savings exceeds the payments. PACE, which stands for Property Assessed Clean Energy, is […]
A rose by any other name: Data science etc.
I help people make decisions in the face of uncertainty. Sounds interesting. I’m a data scientist. Not sure what that means, but it sounds cool. I study machine learning. Hmm. Maybe interesting, maybe a little ominous. I’m into big data. Exciting or passé, depending on how many times you’ve heard the term. Even though each […]
Fibonacci numbers, arctangents, and pi
Here’s an unusual formula for π. Let F_n be the nth Fibonacci number. Then As mysterious as this equation may seem, it’s not hard to prove. The arctangent identity shows that the sum telescopes, leaving only the first term, arctan(1) = π/4. To prove the arctangent identity, take the tangent of both sides, use the addition law for tangents, and […]
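The formula itself did not survive in this excerpt. A numerical check is sketched below under the assumption that it is the well-known identity π/4 = arctan(1/F_3) + arctan(1/F_5) + arctan(1/F_7) + …, which telescopes to arctan(1) as the excerpt describes. The helper names `fib` and `partial_sum` are introduced here for illustration.

```python
import math

def fib(k):
    """Return the k-th Fibonacci number, with F_1 = F_2 = 1."""
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a

def partial_sum(terms):
    """Sum arctan(1 / F_(2n+1)) for n = 1, ..., terms (assumed form of the series)."""
    return sum(math.atan(1 / fib(2 * n + 1)) for n in range(1, terms + 1))

# The partial sums should approach pi/4 if the assumed identity is the intended one.
print(partial_sum(30), math.pi / 4)
```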
Second languages and selection bias
When I was growing up, I was told that you could never become fluent in a second language, and I believed it. I had no reason not to. I didn’t know anybody who had become fluent at a second language, and I could think of plenty of people who had learned English as a second […]
Number of digits in n!
The other day I ran across the fact that 23! has 23 digits. That made me wonder how often n! has n digits. There can only be a finite number of cases, because n! grows faster than 10^n for n > 10, and it’s reasonable to guess that 23 might be the largest case. Turns out it’s […]
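The question in this excerpt is easy to explore by brute force. The sketch below simply counts the decimal digits of n! for each n up to a chosen limit; the function name `n_with_n_digits` and the limit of 100 are assumptions for the example.

```python
import math

def n_with_n_digits(limit):
    """Return every n in 1..limit for which n! has exactly n decimal digits."""
    return [n for n in range(1, limit + 1)
            if len(str(math.factorial(n))) == n]

# Search a modest range; Python integers are exact, so no rounding is involved.
print(n_with_n_digits(100))
```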
Data analysis vs statistics
John Tukey preferred the term “data analysis” over “statistics.” In his paper Data Analysis, Computation and Mathematics, he explains why. My title speaks of “data analysis” not “statistics”, and of “computation” not “computing science”; it does not speak of “mathematics”, but only last. Why? … My brother-in-squared-law, Francis J. Anscombe has commented on my use of […]
Technical arbitrage
There are huge opportunities to take technology that is well-known and undervalued in one context and apply it in another where it is unknown but valuable. You could call this technical arbitrage, analogous to financial arbitrage, taking advantage of the price difference of something in two markets. As with financial arbitrage, the hard part is […]
The academic cocoon
In the novel Enchantment, the main character, Ivan, gives a bitter assessment of his choice of an academic career, saying it was for “men who hadn’t yet grown up.” The life he had chosen was a cocoon. Surrounded by a web of old manuscripts and scholarly papers, he would achieve tenure, publish frequently, teach a group of […]
Doubly and triply periodic functions
A function f is periodic if there exists a constant period ω such that f(x) = f(x + ω) for all x. For example, sine and cosine are periodic with period 2π. There’s only one way a function on the real line can be periodic. But if you think of functions of a complex variable, […]
Taking away a damaging tool
This is a post about letting go of something you think you need. It starts with an illustration from programming, but it’s not about programming. Bob Martin published a dialog yesterday about the origin of structured programming, the idea that programs should not be written with goto statements but should use less powerful, more specialized […]