John D. Cook

Link	https://www.johndcook.com/blog
Feed	http://feeds.feedburner.com/TheEndeavour?format=xml
Updated	2025-07-18 21:31

How to eliminate the first order term from a second order ODE

by

John

on 2017-10-20 21:28 (#35MQJ)

Authors will often say that “without loss of generality” they will assume that a differential equation has no first order derivative term. They’ll explain that there’s no need to consider [equation] because a change of variables can turn the above equation into one of the form [equation]. While this is true, the change of variables is seldom […]
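
The equations in this teaser did not survive extraction. As a sketch of the standard textbook substitution (not necessarily the notation of the original post), assume the general linear form below; the coefficient names p and q are placeholders chosen here.

```latex
% Assumed general form (p, q are placeholder coefficient names):
y'' + p(x)\,y' + q(x)\,y = 0

% Change of variables chosen so that the u' terms cancel:
y(x) = u(x)\,\exp\!\left(-\tfrac{1}{2}\int p(x)\,dx\right)

% Resulting equation for u, with no first order term:
u'' + \left(q(x) - \tfrac{1}{2}\,p'(x) - \tfrac{1}{4}\,p(x)^2\right)u = 0
```

Substituting y = uv with v' = -(p/2)v gives a u' coefficient of 2v' + pv = 0, which is why this particular exponential factor works.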

Common words that have a technical meaning in math

by

John

on 2017-10-19 11:24 (#35FSJ)

Mathematical writing is the opposite of business writing in at least one respect. Math uses common words as technical terms, whereas business coins technical terms to refer to common ideas. There are a few math terms I use fairly often and implicitly assume readers understand. Perhaps the most surprising is “almost,” as in “almost everywhere.” […]

A uniformly distributed sequence

by

John

on 2017-10-19 09:34 (#35FJ6)

If you take the fractional parts of the set of numbers {n cos nx : integer n > 0} the result is uniformly distributed for almost all x. That is, in the limit, the number of times the sequence visits a subinterval of [0, 1] is proportional to the length of the interval. (Clearly it’s not true […]
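
A rough numerical check of this claim can be sketched as follows. The value x = 1.0, the number of terms, and the number of bins are arbitrary choices made here, and `bin_counts` is a hypothetical helper name. Because the result holds only for almost all x, a single x is an illustration and not evidence for the theorem.

```python
import math

def bin_counts(x, n_terms=100_000, n_bins=10):
    """Count how often frac(n * cos(n * x)) lands in each of n_bins equal bins."""
    counts = [0] * n_bins
    for n in range(1, n_terms + 1):
        frac = (n * math.cos(n * x)) % 1.0  # fractional part, in [0, 1]
        # Guard against frac rounding up to exactly 1.0 for tiny negative inputs.
        counts[min(int(frac * n_bins), n_bins - 1)] += 1
    return counts

if __name__ == "__main__":
    # If the sequence is uniformly distributed, the counts should be roughly equal.
    print(bin_counts(1.0))
```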

Applying probability to non-random things

by

John

on 2017-10-14 18:00 (#351V6)

Probability has surprising uses, including applications to things that absolutely are not random. I’ve touched on this a few times. For example, I’ve commented on how arguments about whether something is really random are often moot: Random is as random does. This post will take non-random uses for probability in a different direction. We’ll start […]

Misplacing a continent

by

John

on 2017-10-14 13:39 (#3518Z)

There are many conventions for describing points on a sphere. For example, does latitude zero refer to the North Pole or the equator? Mathematicians tend to prefer the former and geoscientists the latter. There are also varying conventions for longitude. Volker Michel describes this clash of conventions colorfully in his book on constructive approximation. Many […]

Quaint supercomputers

by

John

on 2017-10-10 14:06 (#34MRZ)

The latest episode of Star Trek Discovery (S1E4) uses the word “supercomputer” a few times. This sounds jarring. The word has become less common in contemporary usage, and seems even more out of place in a work of fiction set more than two centuries in the future. According to Google’s Ngram Viewer, the term “supercomputer” […]

Exponential sum of the day

by

John

on 2017-10-09 19:01 (#34JDR)

I’ve written a page that will show a different exponential sum each day, images along the lines of the post Exponential sums make pretty pictures. Here’s the page: https://www.johndcook.com/expsum/ Here are a few sample images. Small changes in the coefficients can make a big change in the appearance of the graphs.

Something that bothers me about deep neural nets

by

John

on 2017-10-09 18:07 (#34JB5)

Overfitting happens when a model does too good a job of matching a particular data set and so does a poor job on new data. The way traditional statistical models address the danger of overfitting is to limit the number of parameters. For example, you might fit a straight line (two parameters) to 100 data […]

Exponential sums make pretty pictures

by

John

on 2017-10-07 17:09 (#34D1Z)

Exponential sums are a specialized area of math that studies series with terms that are complex exponentials. Estimating such sums is delicate work. General estimation techniques are ham-fisted compared to what is possible with techniques specialized for these particular sums. Exponential sums are closely related to Fourier analysis and number theory. Exponential sums also make […]

No critical point between two peaks

by

John

on 2017-10-04 23:40 (#344V7)

If a function of one variable has two local maxima, it must have a local minimum in between. What about a function of two variables? If it has two local maxima, does it need to have a local minimum? No, it could have a saddle point in between, a point that is a local minimum […]

Clean obfuscated code

by

John

on 2017-10-03 11:59 (#33ZBS)

One way to obfuscate code is clever use of arcane programming language syntax. Hackers are able to write completely unrecognizable code by exploiting dark corners of programming techniques and languages. Some of these attempts are quite impressive. But it’s also possible to write clean source code that is nevertheless obfuscated. For example, it’s not at […]

How many musical scales are there?

by

John

on 2017-09-30 21:18 (#33QZ4)

How many musical scales are there? That’s not a simple question. It depends on how you define “scale.” For this post, I’ll only consider scales starting on C. That is, I’ll only consider changing the intervals between notes, not changing the starting note. Also, I’ll only consider subsets of the common chromatic scale; this post […]

Toxic pairs, re-identification, and information theory

by

John

on 2017-09-30 18:53 (#33QPP)

Database fields can combine in subtle ways. For example, nationality is not usually enough to identify anyone. Neither is religion. But the combination of nationality and religion can be surprisingly informative. Information content of nationality: How much information is contained in nationality? That depends on exactly how you define nations versus territories etc., but for […]

Chaos and the beta distribution

by

John

on 2017-09-27 11:00 (#33CD7)

Iteration of the quadratic function f(x) = 4x(1-x) is a famous example in chaos theory. Here’s what the first few iterations look like, starting with 1/√3. (There’s nothing special about that starting point; any point that doesn’t iterate to exactly zero will do.) The values appear to bounce all over the place. Let’s look at a […]
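
The plot from the original post is not reproduced in this listing. A minimal sketch of the iteration it describes, using the starting point 1/√3 from the excerpt, might look like this; `logistic_orbit` is a name chosen here for illustration.

```python
import math

def logistic_orbit(x0, n):
    """Return [x0, f(x0), f(f(x0)), ...] with n iterations of f(x) = 4x(1 - x)."""
    orbit = [x0]
    x = x0
    for _ in range(n):
        x = 4 * x * (1 - x)
        orbit.append(x)
    return orbit

if __name__ == "__main__":
    # Print the first few iterates starting from 1/sqrt(3).
    for value in logistic_orbit(1 / math.sqrt(3), 10):
        print(value)
```

Floating point error grows quickly under this map, so long orbits computed this way are only qualitatively meaningful.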

Cellular automata with random initial conditions

by

John

on 2017-09-25 14:30 (#3360B)

The previous post looked at a particular cellular automaton, the so-called Rule 90. When started with a single pixel turned on, it draws a Sierpinski triangle. With random starting pixels, it draws a semi-random pattern that retains features like the Sierpinski triangle. There are only 256 possible elementary cellular automata, so it’s practical to plot […]
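
As a sketch of Rule 90 itself: each cell's next state is the XOR of its two neighbors. The wrap-around boundary and the function names below are choices made for this example, not details taken from the post.

```python
def rule90_step(cells):
    """Apply one step of Rule 90: new cell = left neighbor XOR right neighbor."""
    n = len(cells)
    # cells[i - 1] wraps around at i = 0; (i + 1) % n wraps at the right edge.
    return [cells[i - 1] ^ cells[(i + 1) % n] for i in range(n)]

def rule90_rows(width, steps):
    """Start from a single live cell in the middle and collect successive rows."""
    cells = [0] * width
    cells[width // 2] = 1
    rows = [cells]
    for _ in range(steps):
        cells = rule90_step(cells)
        rows.append(cells)
    return rows

if __name__ == "__main__":
    for row in rule90_rows(63, 31):
        print("".join("#" if c else " " for c in row))
```

Run from a single live cell, the printed rows trace out the Sierpinski pattern the excerpt mentions. Replacing the initial row with random bits gives the semi-random variant.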

Sierpinski triangle strikes again

by

John

on 2017-09-23 17:31 (#3311X)

A couple months ago I wrote about how a simple random process gives rise to the Sierpinski triangle. Draw an equilateral triangle and pick a random point in the plane. Repeatedly pick a triangle vertex at random and move halfway from the current position to that vertex. The result converges to a Sierpinski triangle. […]
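
The random process described above can be sketched in a few lines. The vertex coordinates, the choice of a starting point in the unit square, and the name `chaos_game` are assumptions made for this illustration.

```python
import random

# Vertices of an equilateral triangle with unit side length (an arbitrary choice).
VERTICES = [(0.0, 0.0), (1.0, 0.0), (0.5, 3 ** 0.5 / 2)]

def chaos_game(n_points, seed=0):
    """Repeatedly move halfway toward a randomly chosen vertex; return visited points."""
    rng = random.Random(seed)
    x, y = rng.random(), rng.random()  # random starting point in the unit square
    points = []
    for _ in range(n_points):
        vx, vy = rng.choice(VERTICES)
        x, y = (x + vx) / 2, (y + vy) / 2
        points.append((x, y))
    return points

if __name__ == "__main__":
    pts = chaos_game(10_000)
    print(pts[-3:])  # plot pts with any plotting tool to see the triangle emerge
```

The first few points depend on the starting position. Later points settle onto the triangle because each step halves the distance to it.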

A cryptographically secure random number generator

by

John

on 2017-09-21 18:30 (#32TY8)

A random number generator can have excellent statistical properties and yet not be suited for use in cryptography. I’ve written a few posts to demonstrate this. For example, this post shows how to discover the seed of an LCG random number generator. This is not possible with a secure random number generator. Or more precisely, […]

Aerial video of Hurricane Harvey aftermath and cleanup

by

John

on 2017-09-21 13:29 (#32SWM)

Video by my friend Aaron Benzel showing the debris and cleanup typical of neighborhoods that flooded in Harvey.

Adding Laplace or Gaussian noise to database for privacy

by

John

on 2017-09-20 12:56 (#32PCX)

In the previous two posts we looked at a randomization scheme for protecting the privacy of a binary response. This post will look briefly at adding noise to continuous or unbounded data. I like to keep the posts here fairly short, but this topic is fairly technical. To keep it short I’ll omit some of […]

Quantifying privacy loss in a statistical database

by

John

on 2017-09-20 12:00 (#32P80)

In the previous post we looked at a simple randomization procedure to obscure individual responses to yes/no questions in a way that retains the statistical usefulness of the data. In this post we’ll generalize that procedure, quantify the privacy loss, and discuss the utility/privacy trade-off. More general randomized response: Suppose we have a binary response […]

Randomized response, privacy, and Bayes theorem

by

John

on 2017-09-19 12:00 (#32JSB)

Suppose you want to gather data on an incriminating question. For example, maybe a statistics professor would like to know how many students cheated on a test. Being a statistician, the professor has a clever way to find out what he wants to know while giving each student deniability. Randomized response: Each student is asked […]

Why don’t you simply use XeTeX?

by

John

on 2017-09-17 19:59 (#32DPJ)

From an FAQ post I wrote a few years ago: This may seem like an odd question, but it’s actually one I get very often. On my TeXtip twitter account, I include tips on how to create non-English characters such as using \AA to produce Å. Every time someone will ask “Why not use XeTeX and just […]

Pascal’s triangle and Fermat’s little theorem

by

John

on 2017-09-14 23:58 (#3266N)

I was listening to My Favorite Theorem when Jordan Ellenberg said something in passing about proving Fermat’s little theorem from Pascal’s triangle. I wasn’t familiar with that, and fortunately Evelyn Lamb wasn’t either and so she asked him to explain. Fermat’s little theorem says that for any prime p and any integer a, a^p = a […]
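
The excerpt is cut off before the argument. The usual link between the two is that for a prime p, every interior entry of row p of Pascal's triangle is divisible by p, so (a + 1)^p and a^p + 1 agree mod p, and the theorem follows by induction on a. The sketch below only checks both facts numerically; `pascal_and_fermat` is a hypothetical helper name, and the check is not a proof.

```python
from math import comb

def pascal_and_fermat(p, max_a=50):
    """Return (interior binomials divisible by p, a^p == a mod p for 0 <= a < max_a)."""
    interior_divisible = all(comb(p, k) % p == 0 for k in range(1, p))
    fermat_holds = all(pow(a, p, p) == a % p for a in range(max_a))
    return interior_divisible, fermat_holds

if __name__ == "__main__":
    for p in (2, 3, 5, 7, 11, 13):
        print(p, pascal_and_fermat(p))
    # A composite exponent for contrast: C(4, 2) = 6 is not divisible by 4.
    print(4, pascal_and_fermat(4))
```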

Making a problem easier by making it harder

by

John

on 2017-09-12 12:50 (#31XEH)

In the oral exam for my PhD, my advisor asked me a question about a differential equation. I don’t recall the question, but I remember the interaction that followed. I was stuck, and my advisor countered by saying “Let me ask you a harder question.” I was still stuck, and so he said “Let me […]

Quantifying the information content of personal data

by

John

on 2017-09-12 11:55 (#31X9G)

It can be surprisingly easy to identify someone from data that’s not directly identifiable. One commonly cited result is that the combination of birth date, zip code, and sex is enough to identify most people. This post will look at how to quantify the amount of information contained in such data. If the answer to […]

Negative correlation introduced by success

by

John

on 2017-09-10 23:06 (#31RTH)

Suppose you measure people on two independent attributes, X and Y, and take those for whom X+Y is above some threshold. Then even though X and Y are uncorrelated in the full population, they will be negatively correlated in your sample. This article gives the following example. Suppose beauty and acting ability were uncorrelated. Knowing how […]
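
A small simulation makes the effect concrete. This sketch assumes X and Y are independent standard normal variables and uses a threshold of 1; both choices, and the helper names, are illustrative rather than taken from the post.

```python
import random

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def selection_demo(n=20_000, threshold=1.0, seed=0):
    """Return (correlation in full population, correlation among those with X + Y > threshold)."""
    rng = random.Random(seed)
    pairs = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    chosen = [(x, y) for x, y in pairs if x + y > threshold]
    return pearson(*zip(*pairs)), pearson(*zip(*chosen))

if __name__ == "__main__":
    r_all, r_chosen = selection_demo()
    print("full population:", r_all)
    print("selected sample:", r_chosen)
```

The full-population correlation should be near zero, while the selected sample shows a clearly negative value: among those who cleared the threshold, a low X has to be offset by a high Y.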

Highly cited theorems

by

John

on 2017-09-07 12:38 (#31ERN)

Some theorems are cited far more often than others. These are not the most striking theorems, not the most advanced or most elegant, but ones that are extraordinarily useful. I first noticed this when taking complex analysis where the Cauchy integral formula comes up over and over. When I first saw the formula I thought […]

Width of mixture PDFs

by

John

on 2017-09-05 23:26 (#31A26)

Suppose you draw samples from two populations, one of which has a wider probability distribution than the other. How does the width of the distribution of the combined sample vary as you change the proportions of the two populations? The extremes are easy. If you pick only from one population, then the resulting distribution will […]

Team dynamics and encouragement

by

John

on 2017-09-02 23:56 (#3128Q)

When you add people to a project, the total productivity of the team as a whole may go up, but the productivity per person usually goes down. Someone suggested that as a rule of thumb, a company needs to triple its number of employees to double its productivity. Fred Brooks summarized this saying “Many hands […]

Relearning from a new perspective

by

John

on 2017-08-31 21:58 (#30WBB)

I had a conversation with someone today who said he’s relearning logic from a categorical perspective. What struck me about this was not the specifics but the pattern: Relearning _______ from a _______ perspective. Not relearning something forgotten, but going back over something you already know well, but from a different starting point, a different […]

Hurricane Harvey update

by

John

on 2017-08-27 18:47 (#30EK8)

As you may know, I live in the darkest region of the rainfall map below. My family and I are doing fine. Our house has not flooded, and at this point it looks like it will not flood. We’ve only lost electricity for a second or two. Of course not everyone in Houston is doing […]

Defining the Fourier transform on LCA groups

by

John

on 2017-08-27 15:34 (#30E81)

My previous post said that all the familiar variations on Fourier transforms (Fourier series analysis and synthesis, Fourier transforms on the real line, discrete Fourier transforms, etc.) can be unified into a single theory. They’re all instances of a Fourier transform on a locally compact Abelian (LCA) group. The difference between them is the underlying group. Given […]

Unified theory of Fourier transforms

by

John

on 2017-08-27 01:23 (#30CVS)

You can take a periodic function and analyze it into its Fourier coefficients, or use the Fourier coefficients in a sum to synthesize a periodic function. You can take the Fourier transform of a function defined on the whole real line and get another such function. And you can compute the discrete Fourier transform via […]

Solving problems we wish we had

by

John

on 2017-08-25 12:24 (#308MN)

There’s a great line from Heather McGaw toward the end of the latest episode of 99 Percent Invisible: “Sometimes … we can start to solve problems that we wish were problems because they’re easy to solve.” Reminds me of an excerpt from Richard Weaver’s book Ideas Have Consequences: Obsession, according to the canons of psychology, […]

Predicting when an RNG will output a given value

by

John

on 2017-08-22 12:00 (#2ZYN6)

A few days ago I wrote about how to pick the seed of a simple random number generator so that a desired output came n values later. The number n was fixed and we varied the seed. In this post, the seed will be fixed and we’ll solve for n. In other words, we ask when a […]

Programming language life expectancy

by

John

on 2017-08-19 18:08 (#2ZQ8Q)

The Lindy effect says that what’s been around the longest is likely to remain around the longest. It applies to creative artifacts, not living things. A puppy is likely to live longer than an elderly dog, but a book that has been in press for a century is likely to be in press for another century. […]

Reverse engineering the seed of a linear congruential generator

by

John

on 2017-08-16 13:11 (#2ZDMZ)

The previous post gave an example of manipulating the seed of a random number generator to produce a desired result. This post will do something similar for a different generator. A couple times I’ve used the following LCG (linear congruential random number generator) in examples. An LCG starts with an initial value of z and updates z […]

Manipulating a random number generator

by

John

on 2017-08-16 12:00 (#2ZDEP)

With some random number generators, it’s possible to select the seed carefully to manipulate the output. Sometimes this is easy to do. Sometimes it’s hard but doable. Sometimes it’s theoretically possible but practically impossible. In my recent correspondence with Melissa O’Neill, she gave me an example that seeds a random number generator so that the […]

Testing RNGs with PractRand

by

John

on 2017-08-15 01:58 (#2Z9DG)

PractRand is a random number generator test suite, somewhat like the DIEHARDER and NIST tests I’ve written about before, but more demanding. Rather than running to completion, it runs until a test fails with an infinitesimally small p-value. It runs all tests at a given sample size, then doubles the sample and runs the tests again. […]

Random minimum spanning trees

by

John

on 2017-08-09 17:52 (#2YTSG)

I just ran across a post by John Baez pointing to an article by Alan Frieze on random minimum spanning trees. Here’s the problem. Create a complete graph with n nodes, i.e. connect every node to every other node. Assign each edge a uniform random weight between 0 and 1. Find the minimum spanning tree. Add up […]
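
The experiment described above can be sketched with a dense version of Prim's algorithm, one standard way to find a minimum spanning tree (the post may have used something else). The function name and the graph sizes are choices made here. Frieze's result is that the expected total weight tends to ζ(3) ≈ 1.202 as n grows, so the printed totals should hover near that value.

```python
import random

def random_mst_weight(n, rng):
    """Total weight of the MST of a complete graph on n nodes with uniform(0, 1) edge weights."""
    # Symmetric weight matrix; the diagonal is unused.
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w[i][j] = w[j][i] = rng.random()

    in_tree = [False] * n
    best = [float("inf")] * n  # cheapest known edge linking each node to the tree
    best[0] = 0.0              # start the tree at node 0 at no cost
    total = 0.0
    for _ in range(n):
        # Pick the node outside the tree with the cheapest connecting edge.
        u = min((v for v in range(n) if not in_tree[v]), key=best.__getitem__)
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v] and w[u][v] < best[v]:
                best[v] = w[u][v]
    return total

if __name__ == "__main__":
    rng = random.Random(0)
    for n in (50, 100, 200):
        print(n, random_mst_weight(n, rng))
```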

Selecting things in Emacs

by

John

on 2017-08-09 16:27 (#2YTGZ)

You can select blocks of text in Emacs just as you would in most other environments. You could, for example, drag your mouse over a region. You could also hold down the Shift key and use arrow keys. But Emacs also has a number of commands that let you work in larger semantic units. That […]

Random walk on quaternions

by

John

on 2017-08-05 15:00 (#2YF3M)

The previous post was a riff on a tweet asking what you’d get if you extracted all the i’s, j’s, and k’s from Finnegans Wake and multiplied them as quaternions. This post is a probabilistic variation on the previous one. If you randomly select a piece of English prose, extract the i’s, j’s, and k’s, and multiply them together as quaternions, what […]

Wolfram Alpha, Finnegans Wake, and Quaternions

by

John

on 2017-08-02 13:29 (#2Y5CM)

I stumbled on a Twitter account yesterday called Wolfram|Alpha Can’t. It posts bizarre queries that Wolfram Alpha can’t answer. Here’s one that caught my eye: “result of extracting the i’s, j’s, and k’s in order from Finnegans Wake and interpreting as a quaternion product” (Wolfram|Alpha Can’t, @wacnt, May 17, 2017). Suppose you did extract […]
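
The computation in the query is easy to sketch for any text you have on hand. The code below represents a quaternion as a tuple (real, i, j, k) and multiplies the letters in order with the Hamilton product. Treating capital letters the same as lowercase is an assumption made here, and the function names are illustrative.

```python
# Quaternions as (real, i, j, k) tuples.
UNITS = {"i": (0, 1, 0, 0), "j": (0, 0, 1, 0), "k": (0, 0, 0, 1)}

def qmul(a, b):
    """Hamilton product of two quaternions."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
        a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
        a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0,
    )

def quaternion_product_of_text(text):
    """Multiply the i's, j's, and k's of text, in order, as unit quaternions."""
    result = (1, 0, 0, 0)
    for ch in text.lower():  # assumption: case is ignored
        if ch in UNITS:
            result = qmul(result, UNITS[ch])
    return result

if __name__ == "__main__":
    print(quaternion_product_of_text("ijk"))  # i * j * k = -1
```

Because the product of unit quaternions i, j, k is always one of ±1, ±i, ±j, ±k, the result for any text is one of eight values.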

The cross polytope

by

John

on 2017-07-30 21:36 (#2XX88)

There are five regular solids in three dimensions: tetrahedron, octahedron (pictured above), hexahedron (cube), dodecahedron, icosahedron. I give a proof here that these are the only five. The first three of these regular solids generalize to all dimensions, and these generalizations are the only regular solids in dimensions 5 and higher. (There are six regular […]

Bayesian methods at Bletchley Park

by

John

on 2017-07-25 16:51 (#2XDM6)

From Nick Patterson’s interview on Talking Machines: GCHQ in the ’70s, we thought of ourselves as completely Bayesian statisticians. All our data analysis was completely Bayesian, and that was a direct inheritance from Alan Turing. I’m not sure this has ever really been published, but Turing, almost as a sideline during his cryptoanalytic work, reinvented […]

by

John

on 2017-07-24 13:54 (#2X9PX)

The previous couple blog posts touched on a special case of sphere packing. We looked at the proportion of volume contained near the corners of a hypercube. If you take the set of points within a distance 1/2 of a corner of a hypercube, you could rearrange these points to form a full ball centered […]
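
To put numbers on this, assume a unit hypercube. Each of the 2^n corners then contributes a 1/2^n slice of a ball of radius 1/2, so the total volume near the corners equals the volume of one such ball. The sketch below evaluates the standard n-ball volume formula; `ball_volume` is a name chosen here.

```python
import math

def ball_volume(n, r=0.5):
    """Volume of an n-dimensional ball of radius r: pi^(n/2) / Gamma(n/2 + 1) * r^n."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1) * r ** n

if __name__ == "__main__":
    # Fraction of a unit hypercube's volume within distance 1/2 of some corner.
    for n in (2, 3, 5, 10, 20):
        print(n, ball_volume(n))
```

The printed fractions shrink rapidly as the dimension grows, since the cube's volume stays at 1 while the ball's volume tends to zero.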

Is most volume in the corners or not?

by

John

on 2017-07-24 13:50 (#2X9PY)

I’ve written a couple blog posts that may seem to contradict each other. Given a high-dimensional cube, is most of the volume in the corners or not? I recently wrote that the corners of a cube stick out more in high dimensions. You can quantify this by centering a ball at a corner and looking […]

Corners stick out more in high dimensions

by

John

on 2017-07-19 12:31 (#2WVT4)

High-dimensional geometry is full of surprises. For example, nearly all the area of a high-dimensional sphere is near the equator, and by symmetry it doesn’t matter which equator you take. Here’s another surprise: corners stick out more in high dimensions. Hypercubes, for example, become pointier as dimension increases. How might we quantify this? Think of […]

Math diagrams updated

by

John

on 2017-07-15 22:19 (#2WJDK)

I updated several of the math diagrams on this site today. They’re SVG now, so they resize nicely if you want to zoom in or out. Special functions, Topological vector spaces, Category theory concepts, General topology, Gamma function identities.

Discrete example of concentration of measure

by

John

on 2017-07-14 12:00 (#2WESC)

The previous post looked at a continuous example of concentration of measure. As you move away from a thin band around the equator, the remaining area in the rest of the sphere decreases as an exponential function of the dimension and the distance from the equator. This post will show a very similar result for […]
