Feed john-d-cook John D. Cook

Link https://www.johndcook.com/blog
Feed http://feeds.feedburner.com/TheEndeavour?format=xml
Updated 2024-11-21 16:32
How to Organize Technical Research?
64 million scientific papers have been published since 1996 [1]. Assuming you can actually find the information you want in the first place, how can you organize your findings to be able to recall and use them later? It's not a trifling question. Discoveries often come from uniting different obscure pieces of information in a [...]The post How to Organize Technical Research? first appeared on John D. Cook.
A surprising result about surprise index
Surprise index Warren Weaver [1] introduced what he called the surprise index to quantify how surprising an event is. At first it might seem that the probability of an event is enough for this purpose: the lower the probability of an event, the more surprise when it occurs. But Weaver's notion is more subtle than [...]The post A surprising result about surprise index first appeared on John D. Cook.
Estimating an author’s vocabulary
How would you estimate the size of an author's vocabulary? Suppose you have analyzed the author's available works and found n words, x of which are unique. Then you know the author's vocabulary was at least x, but it's reasonable to assume that the author may have known words he never used in writing, [...]The post Estimating an author's vocabulary first appeared on John D. Cook.
Detecting the language of encrypted text
Imagine you are a code breaker living a century ago. You've intercepted a message, and you go through your bag of tricks, starting with the simplest techniques first. Maybe the message has been encrypted using a simple substitution cipher, so you start with that. Simple substitution ciphers can be broken by frequency analysis: the most [...]The post Detecting the language of encrypted text first appeared on John D. Cook.
Blow up in finite time
A few years ago I wrote a post about approximating the solution to a differential equation even though the solution did not exist. You can ask a numerical method for a solution at a point past where the solution blows up to infinity, and it will dutifully give you a finite solution. The result is [...]The post Blow up in finite time first appeared on John D. Cook.
Normal subgroups are subtle
The definition of a subgroup is obvious, but the definition of a normal subgroup is subtle. Widgets and subwidgets The general pattern of widgets and subwidgets is that a widget is a set with some kind of structure, and a subwidget is a subset that has the same structure. This applies to vector spaces and [...]The post Normal subgroups are subtle first appeared on John D. Cook.
Finite differences and Pascal’s triangle
The key to solving a lot of elementary what-number-comes-next puzzles is to take first or second differences. For example, if asked for the next item in the series 14, 29, 50, 77, 110, ..., the answer (or at least the answer the person posing the question is most likely looking for) is 149. You might [...]The post Finite differences and Pascal's triangle first appeared on John D. Cook.
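As a minimal sketch of the differencing idea in the excerpt above (the helper names `differences` and `next_term` are made up for illustration, and the extrapolation assumes the second differences stay constant):

```python
def differences(seq):
    """Return the list of consecutive differences of seq."""
    return [b - a for a, b in zip(seq, seq[1:])]


def next_term(seq):
    """Extrapolate one term, assuming the second differences are constant."""
    first = differences(seq)
    second = differences(first)
    if len(set(second)) != 1:
        raise ValueError("second differences are not constant")
    # Next first difference = last first difference + the constant second difference.
    return seq[-1] + first[-1] + second[-1]


series = [14, 29, 50, 77, 110]
print(differences(series))               # [15, 21, 27, 33]
print(differences(differences(series)))  # [6, 6, 6]
print(next_term(series))                 # 149
```

The second differences are all 6, so the next first difference is 39 and the next term is 110 + 39 = 149, matching the answer quoted in the excerpt.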
Archiving data on paper
This is a guest post by Ondřej Čertík. Ondřej formerly worked at Los Alamos National Labs and now works for GSI Technologies. He is known in the Python community for starting the SymPy project and in the Fortran community for starting LFortran. - John I finally got to experiment a bit with archiving data [...]The post Archiving data on paper first appeared on John D. Cook.
Emails moved to Substack
Until recently I used two email services: one to send out daily blog post announcements and another for monthly blog highlights. I've combined these into one Substack account for weekly blog highlights. Apparently readers really like this move. Daily and monthly email subscriptions flatlined some time ago, but Substack subscriptions are going up steadily. Substack [...]The post Emails moved to Substack first appeared on John D. Cook.
What’s the Best Code Editor?
Emacs, vi, TextEdit, nano, Sublime, Notepad, Wordpad, Visual Studio, Eclipse, etc., etc.: everyone's got a favorite. I used Visual Studio previously and liked the integrated debugger. Recently I started using VS again and found the code editing windows rather cluttered. Thankfully you can tone this down, if you can locate the right options. Eclipse for Java [...]The post What's the Best Code Editor? first appeared on John D. Cook.
Bounding the perimeter of a triangle between circles
Suppose you have a triangle and you know the size of the largest circle that can fit inside (the incircle) and the size of the smallest circle that can fit outside (the circumcircle). How would you estimate the perimeter of the triangle? In terms of the figure below, if you know the circumference of the [...]The post Bounding the perimeter of a triangle between circles first appeared on John D. Cook.
Music of the spheres
The idea of "music of the spheres" dates back to the Pythagoreans. They saw an analogy between orbital frequency ratios and musical frequency ratios. HD 110067 is a star 105 light years away that has six known planets in orbital resonance. The orbital frequencies of the planets are related to each other by small integer [...]The post Music of the spheres first appeared on John D. Cook.
The Real Book
I listened to the 99% Invisible podcast about The Real Book this morning and thought back to my first copy. My first year in college I had a jazz class, and I needed to get a copy of The Real Book, a book of sheet music for jazz standards. The book that was illegal at [...]The post The Real Book first appeared on John D. Cook.
Substack replacing email subscription
The service that sent out my email to blog subscribers stopped working a couple weeks ago, and I'm trying out Substack as a replacement. You can find my Substack account here. My plan for now is to use this account to make blog post announcements, maybe once a week, with a little introductory commentary for [...]The post Substack replacing email subscription first appeared on John D. Cook.
Determinant of an infinite matrix
What does the infinite determinant mean and when does it converge? The determinant D above is the limit of the determinants D_n defined by [equation not shown]. If all the a's are 1 and all the b's are -1, then this post shows that D_n = F_n, the nth Fibonacci number. The Fibonacci numbers obviously don't converge, so [...]The post Determinant of an infinite matrix first appeared on John D. Cook.
Area of quadrilateral as a determinant
I've written several posts about how determinants come up in geometry. These determinants often look similar, having columns related to coordinates and a column of ones. You can find several examples here along with an explanation for this pattern. If you have three points z1, z2, and z3 in the complex plane, you can find [...]The post Area of quadrilateral as a determinant first appeared on John D. Cook.
A very accurate logarithm approximation
The previous post looked at an efficient way to approximate nth roots of fractions near 1 by hand. This post does the same for logarithms. As before, we assume x = p/q and define s = p + q and d = p - q. Because we're interested in values of x near 1, d is [...]The post A very accurate logarithm approximation first appeared on John D. Cook.
Handy approximation for roots of fractions
This post will discuss a curious approximation with a curious history. Approximation: Let x be a number near 1, written as a fraction x = p / q. Then define s and d as the sum and difference of the numerator and denominator: s = p + q and d = p - q. Since we [...]The post Handy approximation for roots of fractions first appeared on John D. Cook.
Uncovering names masked with stars
Sometimes I'll see things like my name partially concealed as J*** C*** and think "a lot of good that does." Masking letters reveals more than people realize. For example, when you see that someone's first name is four letters and begins with J, there's about a 70% chance they're male and there's a 44% chance [...]The post Uncovering names masked with stars first appeared on John D. Cook.
Almost ASCII
I was working recently with a gigabyte file that had a dozen non-ASCII characters. This is very common. The ASCII character set is not quite big enough for a lot of tasks. Of course it's completely inadequate if you're writing Japanese, but it's almost enough for documents written in English and a few other languages. [...]The post Almost ASCII first appeared on John D. Cook.
A knight’s tour of an infinite chessboard
Let ℤ² be the lattice of points in the plane with integer coordinates. You could think of these points as being the centers of the squares in a chessboard extending to infinity in every direction. Cantor tells us that the points in ℤ² are countable. What's more surprising is that you could count the points [...]The post A knight's tour of an infinite chessboard first appeared on John D. Cook.
Natural one-liners
I learned to use Unix in college (this was before Linux), but it felt a little mysterious. Clearly it was developed by really smart people, but what were the problems that motivated their design choices? Some of these are widely repeated. For example, commands have terse names because you may have to transmit commands over a glacial [...]The post Natural one-liners first appeared on John D. Cook.
When is less data less private?
If I give you a database, I give you every row in the database. So if you delete some rows from the database, you have less information, not more, right? This seems very simple, and it mostly is, but there are a couple subtleties. A common measure in data privacy is k-anonymity. The idea is [...]The post When is less data less private? first appeared on John D. Cook.
Additive functions
A function f from positive integers to real numbers is defined to be additive if for relatively prime numbers m and n, f(mn) = f(m) + f(n). The function f is called completely additive if the above holds for all positive integers m and n, i.e. we drop the requirement that m and n are relatively [...]The post Additive functions first appeared on John D. Cook.
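To make the two definitions concrete, here is a small sketch using two standard examples that are not taken from the excerpt: the number of distinct prime factors (additive, but not completely additive) and the number of prime factors counted with multiplicity (completely additive). The function names are illustrative only.

```python
from math import gcd


def prime_factors(n):
    """Prime factors of n with multiplicity, found by trial division."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def distinct_prime_count(n):
    """Number of distinct primes dividing n."""
    return len(set(prime_factors(n)))


def total_prime_count(n):
    """Number of prime factors of n, counted with multiplicity."""
    return len(prime_factors(n))


for m in range(1, 60):
    for n in range(1, 60):
        # Completely additive: holds for every pair m, n.
        assert total_prime_count(m * n) == total_prime_count(m) + total_prime_count(n)
        # Additive: only guaranteed when m and n are relatively prime.
        if gcd(m, n) == 1:
            assert distinct_prime_count(m * n) == distinct_prime_count(m) + distinct_prime_count(n)

# A pair that is not relatively prime shows the coprimality condition matters.
print(distinct_prime_count(4), distinct_prime_count(2) + distinct_prime_count(2))  # 1 2
```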
Frequency analysis
Suppose you have a list of encrypted surnames of US citizens. If the list is long enough, the encrypted name that occurs most often probably corresponds to Smith. The second most common encrypted name probably corresponds to Johnson, and so forth. This kind of inference is analogous to solving a cryptogram puzzle by counting [...]The post Frequency analysis first appeared on John D. Cook.
Security by obscurity
Security-by-obscurity is a bad idea in general. It's better, for example, to have a login page than to give your site an obscure URL. It's better to encrypt a file than to hide it in some odd directory. It's better to use a well-vetted encryption algorithm than to roll your own. There are people [...]The post Security by obscurity first appeared on John D. Cook.
Advanced questions about a basic diagram
I saw a hand-drawn version of the diagram above yesterday and noticed that the points were too evenly distributed. That got me to thinking: is there any objective way to say that this famous diagram is in some sense complete? If you were to make a diagram with more points, what would they be? Simple [...]The post Advanced questions about a basic diagram first appeared on John D. Cook.
How much metadata is in a photo?
A few days ago I wrote about the privacy implications of metadata in a PDF. This post will do the same for photos. You can see the metadata in a photo using exiftool. By default cameras include time and location data. I ran this tool on a photo I took in Seattle a few years [...]The post How much metadata is in a photo? first appeared on John D. Cook.
The Borwein integrals
The Borwein integrals introduced in [1] are a famous example of how proof-by-example can go wrong. Define sinc(x) as sin(x)/x. Then the following equations hold [equations not shown]. However, [equation not shown], where the remaining quantity is approximately 2.3 × 10^-11. This is where many presentations end, concluding with the moral that a pattern can hold for a while and then stop. But I'd [...]The post The Borwein integrals first appeared on John D. Cook.
Avoiding Multiprocessing Errors in Bash Shell
Suppose you have two Linux processes trying to modify a file at the same time and you don't want them stepping on each other's work and making a mess. A common solution is to use a "lock" mechanism (a.k.a. "mutex"). One process locks the "lock" and by this action has sole ownership of a [...]The post Avoiding Multiprocessing Errors in Bash Shell first appeared on John D. Cook.
This-way-up and Knuth arrows
I was looking today at a cardboard box that had the "this way up" symbol on it and wondered whether there is a Unicode value for it. Apparently not. But there is an ISO code for it: ISO 7000 symbol 0623. It's an international standard symbol for indicating how to orient a package. The name [...]The post This-way-up and Knuth arrows first appeared on John D. Cook.
Factoring pseudoprimes
Fermat's little theorem says that if p is a prime number, then for any positive integer b < p we have b^(p-1) = 1 (mod p). This theorem gives a necessary but not sufficient condition for a number to be prime. Fermat's primality test: The converse of Fermat's little theorem is not always true, but [...]The post Factoring pseudoprimes first appeared on John D. Cook.
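The congruence in the excerpt above can be checked directly with Python's built-in three-argument `pow`. This is only a sketch: the function name `passes_fermat` and the choice of 97 and 341 as test values are illustrative and do not come from the post.

```python
def passes_fermat(n, b):
    """True if b^(n-1) = 1 (mod n), the condition in Fermat's little theorem."""
    return pow(b, n - 1, n) == 1


# For a prime p, every base 1 <= b < p satisfies the congruence.
print(all(passes_fermat(97, b) for b in range(1, 97)))  # True

# 341 = 11 * 31 is composite, yet it satisfies the congruence for base 2,
# which is why the condition is necessary but not sufficient.
print(passes_fermat(341, 2))  # True

# A different base exposes 341 as composite.
print(passes_fermat(341, 3))  # False
```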
Do comments in a LaTeX file change the output?
When you add a comment to a LaTeX file, it makes no visible change to the output. The comment is ignored as far as the appearance of the file. But is that comment somehow included in the file anyway? If you compile a LaTeX file to PDF, then edit it by throwing in a comment, [...]The post Do comments in a LaTeX file change the output? first appeared on John D. Cook.
Your PDF may reveal more than you intend
When you create a PDF file, what you see is not all you get. There is metadata embedded in the file that might be useful. It also might reveal information you'd rather not reveal. The previous post looked at just the time stamp on a file. This post will look at more metadata, focusing on [...]The post Your PDF may reveal more than you intend first appeared on John D. Cook.
If you save a file as PDF twice, you get two different files
If you save a file as a PDF twice, you won't get exactly the same file both times. To illustrate this, I created a LibreOffice document containing "Hello world." and saved it twice, first as humpty.pdf then as dumpty.pdf. Then I compared the two files. % diff humpty.pdf dumpty.pdf Binary files humpty.pdf and dumpty.pdf differ [...]The post If you save a file as PDF twice, you get two different files first appeared on John D. Cook.
Is Low Precision Arithmetic Safe?
The popularity of low precision arithmetic for computing has exploded since the 2017 release of the Nvidia Volta GPU. The half precision tensor cores of Volta offered a massive 16X performance gain over double precision for key operations. The "race to the bottom" for lower precision computations continues: some have even solved significant problems using [...]The post Is Low Precision Arithmetic Safe? first appeared on John D. Cook.
How likely is a random variable to be far from its center?
There are many answers to the question in the title: How likely is a random variable to be far from its center? The answers depend on how much you're willing to assume about your random variable. The more you can assume, the stronger your conclusion. The answers also depend on what you mean by "center," [...]The post How likely is a random variable to be far from its center? first appeared on John D. Cook.
Connecting the FFT and quadratic reciprocity
Some readers will look at the title of this post and think "Ah yes, the FFT. I use it all the time. But what is this quadratic reciprocity?" Others will look at the same title and think "Gauss called the quadratic reciprocity theorem the jewel in the crown of mathematics. But what is this FFT [...]The post Connecting the FFT and quadratic reciprocity first appeared on John D. Cook.
Two-digit zip codes
It's common to truncate US zip codes to the first three digits for privacy reasons. Truncating to the first two digits is less common, but occurs in some data sets. HIPAA Safe Harbor requires sparse 3-digit zip codes to be suppressed; even when rolled up to three digits some regions are still sparsely populated. How [...]The post Two-digit zip codes first appeared on John D. Cook.
Bessel zero spacing
Bessel functions are to polar coordinates what sines and cosines are to rectangular coordinates. This is why Bessel functions often arise in applications with radial symmetry. The locations of the zeros of Bessel functions are important in applications, and so you can find software for computing these zeros in mathematical libraries. In days gone by [...]The post Bessel zero spacing first appeared on John D. Cook.
Coloring the queen’s graph
Suppose we have an n × n chessboard. The case n = 8 is of course most common, but we consider all positive integer values of n. The graph of a chess piece has an edge between two squares if and only if the piece can legally move between the two squares. Now suppose we [...]The post Coloring the queen's graph first appeared on John D. Cook.
Regex to match SWIFT-BIC codes
A SWIFT-BIC number identifies a bank, not a particular bank account. The BIC part stands for Bank Identifier Code. I had to look up the structure of SWIFT-BIC codes recently, and here it is: four letters to identify the bank; two letters to identify the country; two letters or digits to identify the location; optionally, [...]The post Regex to match SWIFT-BIC codes first appeared on John D. Cook.
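A rough sketch of a regular expression for just the three fields listed in the excerpt above. It is not the regex from the post: the optional trailing part is cut off in the excerpt and is deliberately left out, uppercase input is an assumption, and the sample codes below are made up.

```python
import re

# Four letters (bank), two letters (country), two letters or digits (location).
# The optional trailing field is truncated in the excerpt and is NOT modeled,
# so codes that carry it will be rejected by this pattern.
BIC8 = re.compile(r"[A-Z]{4}[A-Z]{2}[A-Z0-9]{2}")


def looks_like_bic8(code):
    """True if code consists of exactly the three fields described above."""
    return BIC8.fullmatch(code) is not None


print(looks_like_bic8("ABCDUS33"))  # True: hypothetical code with the right shape
print(looks_like_bic8("ABCD1S33"))  # False: digit in the country field
print(looks_like_bic8("ABCDUS3"))   # False: location field too short
```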
Bad takes on chaos theory
I just finished reading The Three Body Problem. At the end of the book is a preview of Cixin Liu's book Supernova Era. A bit of dialog in that preview stood out to me because it touches on themes I've written about before. I've heard about that. When a butterfly flaps its wings, there's [...]The post Bad takes on chaos theory first appeared on John D. Cook.
New Ways To Make Code Run Faster
The news from Meta last week is a vivid reminder of the importance of making code run faster and more power-efficiently. Meta intends to purchase 350,000 Nvidia H100 GPUs this year [1]. Assuming 350W TDP [2] and $0.1621 per kW-h [3] average US energy cost, one expects a figure of $174 million per year in [...]The post New Ways To Make Code Run Faster first appeared on John D. Cook.
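The $174 million figure in the excerpt above can be reproduced with a few lines of arithmetic. This sketch assumes every GPU draws its full 350 W continuously for a 365-day year, which is an assumption made here and not stated in the excerpt. The variable names are illustrative.

```python
gpu_count = 350_000          # H100 GPUs, from the excerpt
watts_per_gpu = 350          # TDP in watts, from the excerpt
dollars_per_kwh = 0.1621     # average US energy cost, from the excerpt
hours_per_year = 24 * 365    # assumption: running flat out all year

total_kw = gpu_count * watts_per_gpu / 1000
kwh_per_year = total_kw * hours_per_year
cost_per_year = kwh_per_year * dollars_per_kwh

print(f"${cost_per_year / 1e6:.0f} million per year")  # $174 million per year
```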
Brute force cryptanalysis
A naive view of simple substitution ciphers is that they are secure because there are 26! ways to permute the English alphabet, and so an attacker would have to try 26! ≈ 4 × 10^26 permutations. However, such brute force is not required. In practice, simple substitution ciphers are breakable by hand in a few [...]The post Brute force cryptanalysis first appeared on John D. Cook.
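The size of the key space quoted in the excerpt above is easy to confirm with the standard library:

```python
import math

# Number of ways to permute the 26 letters of the English alphabet.
key_space = math.factorial(26)

print(key_space)           # 403291461126605635584000000
print(f"{key_space:.1e}")  # 4.0e+26
```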
Straddling checkerboard encryption
Introduction Computers fundamentally changed cryptography, opening up new possibilities for making and breaking codes. At first it may not have been clear which side benefited most, but now it's clear that computers gave more power to code makers than code breakers. We now have cryptographic primitives that cannot be attacked more efficiently than by brute [...]The post Straddling checkerboard encryption first appeared on John D. Cook.
Email subscription changes
I will soon be discontinuing the email subscription option for this blog. I recommend that email subscribers switch over to subscribing to the RSS feed for the blog. If you're unfamiliar with RSS, here is an article on how to get started. (I recommend RSS in general, and not just for subscribing to this blog. [...]The post Email subscription changes first appeared on John D. Cook.
Beta inequality symmetries
I was thinking about the work I did when I worked in biostatistics at MD Anderson. This work was practical rather than mathematically elegant, useful in its time but not of long-term interest. However, one result came out of this work that I would call elegant, and that was a symmetry I found. Let X [...]The post Beta inequality symmetries first appeared on John D. Cook.
When is a function of two variables separable?
Given a function f(x,y), how can you tell whether f can be factored into the product of a function g(x) of x alone and a function h(y) of y alone? Depending on how an expression for f is written, it may or may not be obvious whether f(x, y) can be separated into g(x) h(y). There [...]The post When is a function of two variables separable? first appeared on John D. Cook.
Applications of Bernoulli differential equations
When a nonlinear first order ordinary differential equation has the form y' + P(x) y = Q(x) y^n with n ≠ 1, the change of variables u = y^(1-n) turns the equation into a linear equation in u. The equation is known as Bernoulli's equation, though Leibniz came up with the same technique. Apparently the history is complicated [1]. It's nice that Bernoulli's equation can [...]The post Applications of Bernoulli differential equations first appeared on John D. Cook.