Feed john-d-cook John D. Cook

Link https://www.johndcook.com/blog
Feed http://feeds.feedburner.com/TheEndeavour?format=xml
Updated 2024-11-21 16:32
DFT conventions: NumPy vs Mathematica
Just as there are multiple conventions for defining the Fourier transform, there are multiple conventions for defining the discrete Fourier transform (DFT), better known as the fast Fourier transform (FFT). [1] This post will look at two DFT conventions, one used in Python's NumPy library, and one used in Mathematica. There are more conventions in [...]The post DFT conventions: NumPy vs Mathematica first appeared on John D. Cook.
DFT mandalas
Math books often use some illustration from the book contents as cover art. When they do, there's often some mystery to the cover art, and a sense of accomplishment when you get far enough into the book to understand the significance of the cover. (See examples here.) William L. Briggs and Van Emden Henson wrote [...]The post DFT mandalas first appeared on John D. Cook.
Sonnets are square
In his book How to Read Literature Like a Professor, Thomas Foster says that if a poem looks like a square on the printed page, it's likely a sonnet. The miracle of the sonnet, you see, is that it is fourteen lines long and written almost always in iambic pentameter. ... suffice it to say [...]The post Sonnets are square first appeared on John D. Cook.
First time seeing a rare event
Suppose you've been monitoring a rare event for a long time, then you see your first occurrence on the Nth observation. Now what would you say about the event's probability? For example, suppose you're wondering whether dogs ever have two tails. You observe thousands of dogs and never see two tails. But then you see [...]The post First time seeing a rare event first appeared on John D. Cook.
Stellar magnitude
Imagine the following dialog. "Logarithms are usually taken to integer bases, like 2 or 10." "What about e?" "OK, that's an example of an irrational base, but it's the only one." "Decibels are logarithms to base 10^(1/10)." "Really?!" "Yeah, you can read about this here." "That's weird. But logarithms are always taken to bases bigger than [...]The post Stellar magnitude first appeared on John D. Cook.
Area codes
US telephone area codes are allocated somewhat randomly. There was a deliberate effort to keep geographical proximity from corresponding to numerical proximity, unlike zip codes. (More on zip code proximity here.) In particular, consecutive area codes should belong to different states. The thought was that this would reduce errors. It's still mostly the case that [...]The post Area codes first appeared on John D. Cook.
Curvature at Cairo
I was flipping through Gravitation [1] this weekend and was curious about an illustration on page 309. This post reproduces that graph. The graph is centered at Cairo, Egypt and includes triangles whose side lengths are the distances between cities. The triangles are calculated using only distances, not by measuring angles per se. The geometry [...]The post Curvature at Cairo first appeared on John D. Cook.
Calculating the intersection of two circles
Given the equations for two circles, how can you tell whether they intersect? And if they do intersect, how do you find the point(s) of intersection? MathWorld gives a derivation, but I'd like to add to the derivation there in two ways. First, I'd like to be more explicit about the number of solutions. Second, I'd [...]The post Calculating the intersection of two circles first appeared on John D. Cook.
A small programming language
Paul Graham said "Programming languages teach you not to want what they don't provide." He meant that as a negative: programmers using less expressive languages don't know what they're missing. But you could also take that as a positive: using a simple language can teach you that you don't need features you thought you needed. [...]The post A small programming language first appeared on John D. Cook.
Quadrature rules and an impossibility theorem
Many numerical integration formulas over a finite interval have the form That is, the integral on the left can be approximated by evaluating the integrand f at particular nodes and taking the weighted sum, and the error is some multiple of a derivative of f evaluated at a point in the interval [a, b]. This [...]The post Quadrature rules and an impossibility theorem first appeared on John D. Cook.
Jigs
In his book The World Beyond Your Head Matthew Crawford talks about jigs literally and metaphorically. A jig in carpentry is something to hold parts in place, such as aligning boards that need to be cut to the same length. Crawford uses the term more generally to describe labor-saving (or more importantly, thought-saving) techniques in [...]The post Jigs first appeared on John D. Cook.
Can the chi squared test detect fake primes?
This morning I wrote about Dan Piponi's fake prime function. This evening I thought about it again and wondered whether the chi-squared test could tell the difference between the distribution of digits in real primes and fake primes. If the distributions were obviously different, this would be apparent from looking at histograms. When distributions are [...]The post Can the chi squared test detect fake primes? first appeared on John D. Cook.
Mastodon account
I have an account on Mastodon: johndcook@mathstodon.xyz. Note that's @math... and not @mast... One advantage to Mastodon is that you can browse content there without logging in, while Twitter is becoming more of a walled garden. You can browse my account, for example, by going to the URL https://mathstodon.xyz/@johndcook There's hardly any content there at this [...]The post Mastodon account first appeared on John D. Cook.
Fake primes
Someone asked on Math Overflow about the distribution of digits in primes. It seems 0 is the least common digit and 1 the most common digit. Dan Piponi replies this is probably just a combination of general properties of sets of numbers with a density similar to the primes and the fact that primes end [...]The post Fake primes first appeared on John D. Cook.
Do 5% less
I've been thinking about things that were ruined by doing about 5% more than was necessary, like an actor whose plastic surgery looks plastic. Sometimes excellence requires pushing ourselves to do more than we want to do or more than we think we can do. But sometimes excellence requires restraint. Context is everything. A few [...]The post Do 5% less first appeared on John D. Cook.
Tracking and the Euler rotation theorem
Suppose you are in an air traffic control tower observing a plane moving in a straight line and you want to rotate your frame of reference to align with the plane. In the new frame the plane is moving along a coordinate axis with no component of motion in the other directions. You could do [...]The post Tracking and the Euler rotation theorem first appeared on John D. Cook.
Using WordNet to create a PAO system
NLP software infers parts of speech by context. For example, the SpaCy NLP software can determine the parts of speech in the poem Jabberwocky even though the words are nonsense. More on this here. If you want to tell the parts of speech for isolated words, maybe software like SpaCy isn't the right tool. You [...]The post Using WordNet to create a PAO system first appeared on John D. Cook.
Memorizing four-digit numbers
The Major mnemonic system is a method of converting numbers to words that can be more easily memorized. The basics of the system can be written on an index card, but there are practical details that are seldom written down. Presentations of the Major system can be misleading, intentionally or unintentionally, by implying that it [...]The post Memorizing four-digit numbers first appeared on John D. Cook.
The numerical range ellipse
Let A be an n × n complex matrix. The numerical range of A is the image of x*Ax over the unit sphere. That is, the numerical range of A is the set W(A) in ℂ defined by W(A) = {x*Ax | x ∈ ℂ^n and ||x|| = 1} where x* is the conjugate transpose of [...]The post The numerical range ellipse first appeared on John D. Cook.
Random slices of a sphube
Ben Grimmer posted something yesterday on Twitter: A nice mathematical puzzle If you take a 4-norm ball and cut it carefully, you will find a two-norm ball. 3D printed visual evidence below. The puzzle: Why does this happen and how much more generally does it happen? (This question was first posed to me by Pablo [...]The post Random slices of a sphube first appeared on John D. Cook.
Twin stars and twin primes
Are there more twin stars or twin primes? If the twin prime conjecture is true, there are an infinite number of twin primes, and that would settle the question. We don't know whether there are infinitely many twin primes, and it's a little challenging to find any results on how many twin primes we're sure [...]The post Twin stars and twin primes first appeared on John D. Cook.
Simple way to distribute points on a sphere
Evenly placing points on a sphere is a difficult problem. It's impossible in general, and so you distribute the points as evenly as you can. The results vary according to how you measure how evenly the points are spread. However, there is a fast and simple way to distribute points that may be good enough, [...]The post Simple way to distribute points on a sphere first appeared on John D. Cook.
Spherical coordinate Rosetta Stone
If you've only seen one definition of spherical coordinates, you may be shocked to discover that there are multiple conventions. In particular, mathematicians and geoscientists have different conventions. As Volker Michel put it in his book on constructive approximation, "Many mathematicians have faced weird jigsaw puzzles with misplaced continents after using a data set from a [...]The post Spherical coordinate Rosetta Stone first appeared on John D. Cook.
Creating a Traveling Salesman Tour of Texas with Mathematica
A Traveling Salesman tour visits a list of destinations using the shortest path. There's an obvious way to find the shortest path connecting N points: try all N! paths and see which one is shortest. Unfortunately, that might take a while. Texas has 254 counties, and so calculating a tour of Texas counties by brute [...]The post Creating a Traveling Salesman Tour of Texas with Mathematica first appeared on John D. Cook.
Area and volume of hypersphere cap
A spherical cap is the portion of a sphere above some horizontal plane. For example, the polar ice cap of the earth is the region above some latitude. I mentioned in this post that the area above a latitude φ is A = 2πR²(1 - sin φ) where R is the earth's radius. Latitude is the angle up from the equator. [...]The post Area and volume of hypersphere cap first appeared on John D. Cook.
Random points in a high-dimensional orthant
In high dimensions, randomly chosen vectors are very likely nearly orthogonal. I'll unpack this a little bit then demonstrate it by simulation. Then I'll look at what happens when we restrict our attention to points with positive coordinates. *** The lengths of vectors don't contribute to the angles between them, so we may as well [...]The post Random points in a high-dimensional orthant first appeared on John D. Cook.
Cosine similarity does not satisfy the triangle inequality
The previous post looked at cosine similarity for embeddings of words in vector spaces. Word embeddings like word2vec map words into high-dimensional vector spaces in such a way that related words correspond to vectors that are roughly parallel. Ideally the more similar the words, the smaller the angle between their corresponding vectors. The cosine similarity [...]The post Cosine similarity does not satisfy the triangle inequality first appeared on John D. Cook.
Angles between words
Natural language processing represents words as high-dimensional vectors, on the order of 100 dimensions. For example, the glove-wiki-gigaword-50 set of word vectors contains 50-dimensional vectors, and the glove-wiki-gigaword-200 set of word vectors contains 200-dimensional vectors. The intent is to represent words in such a way that the angle between vectors is related to similarity [...]The post Angles between words first appeared on John D. Cook.
Productive constraints
This post will discuss two scripting languages, but that's not what the post is really about. It's really about expressiveness and (or versus) productivity. *** I was excited to discover the awk programming language sometime in college because I had not used a scripting language before. Compared to C, awk was high-level luxury. Then a [...]The post Productive constraints first appeared on John D. Cook.
Möbius transformations over a finite field
A Möbius transformation is a function of the form z ↦ (az + b)/(cz + d) where ad - bc = 1. We usually think of z as a complex number, but it doesn't have to be. We could define Möbius transformations in any context where we can multiply, add, and divide, i.e. over any field. In particular, we could work over [...]The post Möbius transformations over a finite field first appeared on John D. Cook.
Sort and remove duplicates
A common idiom in command line processing of text files is ... | sort | uniq | ... Some process produces lines of text. You want to pipe that text through sort to sort the lines in alphabetical order, then pass it to uniq to filter out all but the unique lines. The uniq utility [...]The post Sort and remove duplicates first appeared on John D. Cook.
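The effect of the `sort | uniq` idiom described above can be sketched in Python. This is only an illustration: the function name `sort_uniq` is hypothetical, and Python orders strings by code point, which may differ from the locale-dependent ordering that the `sort` utility uses.

```python
def sort_uniq(lines):
    """Return the distinct lines in sorted order, like `sort | uniq`.

    Assumption: code point ordering stands in for the ordering
    that the shell's sort would apply under the current locale.
    """
    # A set drops duplicates; sorted() then orders what remains.
    return sorted(set(lines))


if __name__ == "__main__":
    for line in sort_uniq(["pear", "apple", "pear", "fig"]):
        print(line)
```

Building a set first means duplicates are removed whether or not they are adjacent, so the result matches what the pipeline produces after sorting.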
Swish function and a Swiss mathematician
The previous post looked at the swish function and related activation functions for deep neural networks designed to address the "dying ReLU problem." Unlike many activation functions, the function f(x) is not monotone but has a minimum near x0 = -1.2784. The exact location of the minimum is x0 = -W(1/e) - 1 where W is the Lambert W function, [...]The post Swish function and a Swiss mathematician first appeared on John D. Cook.
Swish, mish, and serf
Swish, mish, and serf are neural net activation functions. The names are fun to say, but more importantly the functions have been shown to improve neural network performance by solving the "dying ReLU problem." This happens when a large number of node weights become zero during training and do not contribute further to the learning [...]The post Swish, mish, and serf first appeared on John D. Cook.
Generating and inspecting an RSA private key
In principle you generate an RSA key by finding two large prime numbers, p and q, and computing n = pq. You could, for example, generate random numbers by rolling dice, then type the numbers into Mathematica to test each for primality until you find a couple of prime numbers of the right size. In practice [...]The post Generating and inspecting an RSA private key first appeared on John D. Cook.
RSA encryption in practice
At its core, RSA encryption is modular exponentiation. That is, given a message m, the encrypted form of m is x = m^e mod n where e is a publicly known exponent and n is a product of two large primes. The number n is made public but only the holder of the private key [...]The post RSA encryption in practice first appeared on John D. Cook.
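The modular exponentiation at the core of RSA can be sketched with Python's built-in three-argument `pow`. The primes, exponent, and message below are toy values chosen purely for illustration, far too small for real use, and the sketch omits the padding that practical RSA requires.

```python
# Toy parameters (assumptions for illustration only, not secure).
p, q = 61, 53
n = p * q                  # public modulus, 3233
e = 17                     # public exponent
phi = (p - 1) * (q - 1)    # Euler totient of n
d = pow(e, -1, phi)        # private exponent (modular inverse, Python 3.8+)


def encrypt(m):
    """Encrypt the integer message m as m^e mod n."""
    return pow(m, e, n)


def decrypt(x):
    """Recover the message as x^d mod n."""
    return pow(x, d, n)


if __name__ == "__main__":
    m = 65
    x = encrypt(m)
    print(x, decrypt(x))
```

Using `pow(m, e, n)` rather than `m**e % n` keeps intermediate values small, which matters once n has thousands of bits.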
Code to convert words to Major system numbers
A few days ago I wrote about using the CMU Pronouncing Dictionary to search for words that decode to certain numbers in the Major mnemonic system. You can find a brief description of the Major system in that post. As large as the CMU dictionary is, it did not contain words mapping to some three-digit [...]The post Code to convert words to Major system numbers first appeared on John D. Cook.
Software and the Allee effect
The Allee effect is named after Warder Clyde Allee who added a term to the famous logistic equation. His added term is highlighted in blue. Here N is the population of a species over time, r is the intrinsic rate of increase, K is the carrying capacity, and A is the critical point. If you [...]The post Software and the Allee effect first appeared on John D. Cook.
Solved problems becoming unsolved
"That's a solved problem. So nobody knows how to solve it anymore." Once a problem is deemed "solved" interest in the problem plummets. "Solved" problems may not be fully solved, but sufficiently solved that the problem is no longer fashionable. Practical issues remain, but interest moves elsewhere. The software written for the problem slowly decays. [...]The post Solved problems becoming unsolved first appeared on John D. Cook.
The cobbler’s son
There's an old saying "The cobbler's son has no shoes." It's generally taken to mean that we can neglect to do for ourselves something we do for other people. I've been writing a few scripts for my personal use, things I've long intended to do but only recently got around to doing. I said something [...]The post The cobbler's son first appeared on John D. Cook.
Date sequence from the command line
I was looking back at Jeroen Janssen's book Data Science at the Command Line and his dseq utility caught my eye. This utility prints out a sequence of dates relative to the current date. I've needed this and didn't know it. Suppose you have a CSV file and you need to add a column of [...]The post Date sequence from the command line first appeared on John D. Cook.
Up-down permutations
An up-down permutation of an ordered set is a permutation such that as you move from left to right the permutation alternates up and down. For example 1, 5, 3, 4, 2 is an up-down permutation of 1, 2, 3, 4, 5 because 1 < 5 > 3 < 4 > 2. Up-down permutations are [...]The post Up-down permutations first appeared on John D. Cook.
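The definition above translates directly into a check on adjacent pairs. The following is a minimal sketch, with hypothetical function names, that tests a sequence and counts up-down permutations by brute force; the brute-force count is only practical for small n.

```python
from itertools import permutations


def is_up_down(seq):
    """True if seq alternates up, down, up, ... starting with a rise."""
    return all(
        (a < b) if i % 2 == 0 else (a > b)
        for i, (a, b) in enumerate(zip(seq, seq[1:]))
    )


def count_up_down(n):
    """Count up-down permutations of 1..n by checking all n! of them."""
    return sum(is_up_down(p) for p in permutations(range(1, n + 1)))


if __name__ == "__main__":
    print(is_up_down([1, 5, 3, 4, 2]))
    print([count_up_down(n) for n in range(1, 7)])
```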
Variance of binned data
Suppose you have data that for some reason has been summarized into bins of width h. You don't have the original data, only the number of counts in each bin. You can't exactly find the sample mean or sample variance of the data because you don't actually have the data. But what's the best you [...]The post Variance of binned data first appeared on John D. Cook.
Ancient estimate of π and modern numerical analysis
A very crude way to estimate π would be to find the perimeter of squares inside and outside a unit circle. The outside square has sides of length 2, so 2π < 8. The inside square has sides of length 2/√2, so 8/√2 < 2π. This tells us π is between 2.82 and 4. Not [...]The post Ancient estimate of π and modern numerical analysis first appeared on John D. Cook.
ARPAbet and the Major mnemonic system
ARPAbet is a phonetic spelling system developed by, you guessed it, ARPA, before it became DARPA. The ARPAbet system is less expressive than IPA, but much easier for English speakers to understand. Every sound is encoded as one or two English letters. So, for example, the sound denoted ʒ in IPA is ZH in ARPAbet. In [...]The post ARPAbet and the Major mnemonic system first appeared on John D. Cook.
Ruzsa distance
A few days ago I wrote about Jaccard distance, a way of defining a distance between sets. The Ruzsa distance is similar, except it defines the distance between two subsets of an Abelian group. Subset difference Let A and B be two subsets of an Abelian (commutative) group G. Then the difference A - B [...]The post Ruzsa distance first appeared on John D. Cook.
Finding the imaginary part of an analytic function from the real part
A function f of a complex variable z = x + iy can be factored into real and imaginary parts: f(x + iy) = u(x, y) + iv(x, y) where x and y are real numbers, and u and v are real-valued functions of two real values. Suppose you are given u(x, y) and you want to find v(x, y). The function v is called [...]The post Finding the imaginary part of an analytic function from the real part first appeared on John D. Cook.
Every Japanese prefecture shrinking
It's well known that the population of Japan has been decreasing for years, and so I was a little puzzled by a recent headline saying that Japan's population has dropped in every one of its 47 prefectures. Although the national population is in decline, until now not all of the nation's 47 prefectures dropped in [...]The post Every Japanese prefecture shrinking first appeared on John D. Cook.
Named entity recognition
Named entity recognition (NER) is a task of natural language processing: pull out named things from text. It sounds trivial at first. Just create a giant list of named things and compare against that. But suppose, for example, University of Texas is on your list. If Texas is also on your list, do you report [...]The post Named entity recognition first appeared on John D. Cook.
Jaccard index and jazz albums
Jaccard index is a way of measuring the similarity of sets. The Jaccard index, or Jaccard similarity coefficient, of two sets A and B is the number of elements in their intersection, A ∩ B, divided by the number of elements in their union, A ∪ B. Jaccard similarity is a robust way to compare [...]The post Jaccard index and jazz albums first appeared on John D. Cook.
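The definition above maps directly onto Python's set operations. This is a minimal sketch with a hypothetical function name; returning 1.0 when both sets are empty is an assumption made here to avoid dividing by zero, not something the definition specifies.

```python
def jaccard_index(a, b):
    """Size of the intersection of a and b divided by the size of their union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Assumption: treat two empty sets as identical.
        return 1.0
    return len(a & b) / len(union)


if __name__ == "__main__":
    print(jaccard_index({1, 2, 3}, {2, 3, 4}))
```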
Trying NLP on Middle English
It's not fair to evaluate NLP software on a language it wasn't designed to process, but I wanted to try it anyway. The models in the spaCy software library were trained on modern English text and not on Middle English. Nevertheless, spaCy does a pretty good job of parsing Chaucer's Canterbury Tales, written over 600 [...]The post Trying NLP on Middle English first appeared on John D. Cook.