John D. Cook

Link	https://www.johndcook.com/blog
Feed	http://feeds.feedburner.com/TheEndeavour?format=xml
Updated	2025-07-04 05:31

Powers that don’t change the last digit

by

John

on 2020-04-14 10:40 (#5248F)

If you raise any number to the fifth power, the last digit doesn’t change. Here’s a little Python code to verify this claim.
>>> [n**5 for n in range(10)]
[0, 1, 32, 243, 1024, 3125, 7776, 16807, 32768, 59049]
In case you’re not familiar with Python, or familiar with Python but not familiar with list […]
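
A slightly more explicit version of that check, as a sketch: it compares last digits directly instead of leaving the comparison to the eye. The helper name `last_digit_unchanged` is made up for illustration and does not come from the post.

```python
def last_digit_unchanged(n, power=5):
    """Return True if n**power ends in the same decimal digit as n."""
    # Three-argument pow reduces mod 10 without building the full power.
    return pow(n, power, 10) == n % 10

# The last digit of n**5 depends only on the last digit of n,
# so checking 0..9 covers every non-negative integer.
print(all(last_digit_unchanged(n) for n in range(10)))
```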

An application of Kronecker products

by

John

on 2020-04-13 14:00 (#5231W)

A while back I wrote about Kronecker products in the context of higher order Taylor series. Here’s how I described the Kronecker product in that post. The Kronecker product of an m × n matrix A and a p × q matrix B is a mp × nq matrix K = A ⊗ B. You can think of K as a block partitioned matrix. The ij block […]
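
To make the block structure concrete, here is a minimal pure-Python sketch using the standard definition, in which the ij block of K is a_ij times B. The function name `kron` and the list-of-lists representation are choices made for this illustration; in practice a library routine such as NumPy's `numpy.kron` would normally be used.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    m, n = len(A), len(A[0])
    p, q = len(B), len(B[0])
    # Entry (i, j) of K lies in block (i // p, j // q) and is that
    # entry of A multiplied by entry (i % p, j % q) of B.
    return [
        [A[i // p][j // q] * B[i % p][j % q] for j in range(n * q)]
        for i in range(m * p)
    ]

A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]
for row in kron(A, B):
    print(row)
```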

A wrinkle in Clojure

by

John

on 2020-04-11 13:20 (#5215P)

Bob Martin recently posted a nice pair of articles, A Little Clojure and A Little More Clojure. In the first article he talks about how spare and elegant Clojure is. In the second article he shows how to write a program to list primes using map and filter rather than if and while. He approaches […]

Exponential growth vs logistic growth

by

John

on 2020-04-09 16:53 (#51YEG)

This seems like a good time to discuss the difference between exponential growth and logistic growth as the covid19 pandemic is starting to look more like a logistic model and less like an exponential model, at least in many parts of the world [1]. This post is an expansion of a Twitter thread I wrote […]

Sine series for a sine

by

John

on 2020-04-07 23:31 (#51VRF)

The Fourier series of an odd function only has sine terms (all the cosine coefficients are zero), and so the Fourier series is a sine series. What is the sine series for a sine function? If the frequency is an integer, then the sine series is just the function itself. For example, the sine series for sin(5x) […]

Two meanings of QR code

by

John

on 2020-04-07 14:06 (#51V6V)

“QR code” can mean a couple different things. There is a connection between these two, though that’s not at all obvious. What almost everyone thinks of as a QR code is a quick response code, a grid of black and white squares that encode some data. For example, the QR code below contains my contact […]

Center of mass and vectorization

by

John

on 2020-04-05 19:38 (#51RBX)

Para Parasolian left a comment on my post about computing the area of a polygon, suggesting that I “say something similar about computing the centroid of a polygon using a similar formula.” This post will do that, and at the same time discuss vectorization. Notation We start by listing the vertices starting anywhere and moving […]

Making an invertible function out of non-invertible parts

by

John

on 2020-04-04 14:20 (#51Q4T)

How can you make an invertible function out of non-invertible parts? Why would you want to? Encryption functions must be invertible. If the intended recipient can’t decrypt the message then the encryption method is useless. Of course you want an encryption function to be really hard to invert without the key. It’s hard to think […]

Underestimating risk

by

John

on 2020-04-01 13:52 (#51J7P)

When I hear that a system has a one in a trillion (1,000,000,000,000) chance of failure, I immediately translate that in my mind to “So, optimistically the system has a one in a million (1,000,000) chance of failure.” Extremely small probabilities are suspicious because they often come from one of two errors: Wrongful assumption of […]

Reasoning under uncertainty

by

John

on 2020-03-30 20:12 (#51FK0)

Reasoning under uncertainty sounds intriguing. Brings up images of logic, philosophy, and artificial intelligence. Statistics sounds boring. Brings up images of tedious, opaque calculations followed by looking up some number in a table. But statistics is all about reasoning under uncertainty. Many people get through required courses in statistics without ever hearing that, or at least […]

Lee distance: codes and music

by

John

on 2020-03-29 18:47 (#51DY0)

The Hamming distance between two sequences of symbols is the number of places in which they differ. For example, the Hamming distance between the words “hamming” and “farming” is 2, because the two words differ in their first and third letters. Hamming distance is natural when comparing sequences of bits because bits are either the […]
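
The definition translates almost word for word into code. This is a small sketch; the function name `hamming_distance` and the choice to reject inputs of unequal length are assumptions made for the example.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    # True counts as 1 when summed, so this counts mismatched positions.
    return sum(x != y for x, y in zip(a, b))

print(hamming_distance("hamming", "farming"))  # → 2
```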

Conditional independence notation

by

John

on 2020-03-27 15:28 (#51BGK)

Ten years ago I wrote a blog post that concludes with this observation: The ideas of being relatively prime, independent, and perpendicular are all related, and so it makes sense to use a common symbol to denote each. This post returns to that theme, particularly looking at independence of random variables. History Graham, Knuth, and […]

Three composition theorems for differential privacy

by

John

on 2020-03-25 19:46 (#518K2)

This is a brief post, bringing together three composition theorems for differential privacy. The composition of an ε₁-differentially private algorithm and an ε₂-differentially private algorithm is an (ε₁+ε₂)-differentially private algorithm. The composition of an (ε₁, δ₁)-differentially private algorithm and an (ε₂, δ₂)-differentially private algorithm is an (ε₁+ε₂, δ₁+δ₂)-differentially private algorithm. The composition of an (α, […]

Minimizing worst case error

by

John

on 2020-03-24 17:48 (#516CR)

It’s very satisfying to know that you have a good solution even under the worst circumstances. Worst-case thinking doesn’t have to be concerned with probabilities, with what is likely to happen, only with what could happen. But whenever you speak of what could happen, you have to limit your universe of possibilities. Suppose you ask […]

Pecunia non olet

by

John

on 2020-03-24 15:11 (#5162A)

I’ve been rereading That Hideous Strength. I’m going through it slowly this time, paying attention to details I glossed over before. For example, early in the book we’re told that the head of a college has the nickname N.O. N.O., which stood for Non-Olet, was the nickname of Charles Place, the warden of Bracton. The […]

Simple clinical trial of four COVID-19 treatments

by

John

on 2020-03-23 13:50 (#514AM)

A story came out in Science yesterday saying the World Health Organization is launching a trial of what it believes are the four most promising treatments for COVID-19 (a.k.a. SARS-CoV-2, novel coronavirus, etc.) The four treatment arms will be: Remdesivir; Chloroquine and hydroxychloroquine; Ritonavir + lopinavir; Ritonavir + lopinavir + interferon beta plus standard […]

Product of copulas

by

John

on 2020-03-23 00:12 (#513F5)

A few days ago I wrote a post about copulas and operations on them that have a group structure. Here’s another example of group structure for copulas. As in the previous post I’m just looking at two-dimensional copulas to keep things simple. Given two copulas C1 and C2, you can define a sort of product […]

How to Set Num Lock on permanently

by

John

on 2020-03-21 13:50 (#5126X)

When I use my Windows laptop, I’m always accidentally brushing against the Num Lock key. I suppose it’s because the keys are so flat; I never have this problem on a desktop. I thought there must be some way to set it so that it’s always on, so I searched for it. First I found […]

New Asymptotic function in Mathematica 12.1

by

John

on 2020-03-21 01:59 (#511H8)

One of the new features in Mathematica 12.1 is the function Asymptotic. Here’s a quick example of using it. Here’s an asymptotic series for the log of the gamma function I wrote about here. If we ask Mathematica Asymptotic[LogGamma[z], z -> Infinity] we get simply the first term: But we can set the argument SeriesTermGoal […]

Extended floating point precision in R and C

by

John

on 2020-03-18 14:54 (#50X2E)

The GNU MPFR library is a C library for extended precision floating point calculations. The name stands for Multiple Precision Floating-point Reliable. The library has an R wrapper Rmpfr that is more convenient for interactive use. There are also wrappers for other languages. It takes a long time to install MPFR and its prerequisite GMP, […]

When is round-trip floating point radix conversion exact?

by

John

on 2020-03-16 23:20 (#50T66)

Suppose you store a floating point number in memory, print it out in human-readable base 10, and read it back in. When can the original number be recovered exactly? D. W. Matula answered this question more generally in 1968 [1]. Suppose we start with base β with p places of precision and convert to base […]
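
The question can be tried out for one familiar special case. The sketch below assumes Python floats are IEEE 754 doubles, which is true on essentially all current platforms; for doubles, 17 significant decimal digits are always enough for an exact round trip, while 15 are not always enough. It illustrates that special case only, not Matula's general result, and the helper name `round_trips` is made up for the example.

```python
def round_trips(x, digits):
    """True if printing x with `digits` significant decimal digits
    and reading the text back recovers x exactly."""
    text = format(x, f".{digits}g")
    return float(text) == x

x = 0.1 + 0.2
print(round_trips(x, 17))  # 17 digits: enough for any double
print(round_trips(x, 15))  # 15 digits: loses this particular value
```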

Group symmetry of copula operations

by

John

on 2020-03-16 19:05 (#50SY6)

You don’t often see references to group theory in a statistics book. Not that there aren’t symmetries in statistics that could be described in terms of groups, but this isn’t often pointed out. Here’s an example from An Introduction to Copulas by Roger Nelsen. Show that under composition the set of operations of forming the […]

Product of Chebyshev polynomials

by

John

on 2020-03-15 19:12 (#50RMT)

Chebyshev polynomials satisfy a lot of identities, much like trig functions do. This post will look briefly at just one such identity. Chebyshev polynomials T_n are defined for n = 0 and 1 by T_0(x) = 1 and T_1(x) = x, and for larger n using the recurrence relation T_{n+1}(x) = 2x T_n(x) − T_{n−1}(x). This implies […]
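
The recurrence can be run directly to evaluate a Chebyshev polynomial at a point. This is a minimal sketch; the function name `chebyshev_T` is chosen for the example, and the check against cos(nθ) relies on the standard identity T_n(cos θ) = cos(nθ), which is not stated in the excerpt.

```python
import math

def chebyshev_T(n, x):
    """Evaluate T_n(x) with the three-term recurrence."""
    t_prev, t_curr = 1.0, x          # T_0(x) and T_1(x)
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        # T_{k+1}(x) = 2x T_k(x) - T_{k-1}(x)
        t_prev, t_curr = t_curr, 2 * x * t_curr - t_prev
    return t_curr

theta = 0.7
print(chebyshev_T(5, math.cos(theta)), math.cos(5 * theta))
```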

The Brothers Markov

by

John

on 2020-03-14 14:46 (#50Q4E)

The Markov brother you’re more likely to have heard of was Andrey Markov. He was the Markov of Markov chains, the Gauss-Markov theorem, and Markov’s inequality. Andrey had a lesser known younger brother Vladimir who was also a mathematician. Together the two of them proved what is known as the Markov Brothers’ inequality to distinguish […]

Finding coffee in Pi

by

John

on 2020-03-14 06:05 (#50PYS)

It is widely believed that π is a “normal number,” which would mean that you can find any integer sequence you want inside the digits of π, in any base, if you look long enough. So for Pi Day, I wanted to find c0ffee inside the hexadecimal representation of π. First I used TinyPI, a […]

Chebyshev approximation

by

John

on 2020-03-11 12:25 (#50HZY)

In the previous post I mentioned that the Remez algorithm computes the best polynomial approximation to a given function f as measured by the maximum norm. That is, for a given n, it finds the polynomial p of order n that minimizes the absolute error || f − p ||∞. The Mathematica function MiniMaxApproximation minimizes the relative […]

Remez algorithm and best polynomial approximation

by

John

on 2020-03-10 22:23 (#50HCR)

The best polynomial approximation, in the sense of minimizing the maximum error, can be found by the Remez algorithm. I expected Mathematica to have a function implementing this algorithm, but apparently it does not have one. (But see update below.) It has a function named MiniMaxApproximation which sounds like the Remez algorithm, and it’s close, but […]

by

John

on 2020-03-07 15:56 (#50D33)

A maximum distance separable code, or MDS code, is a way of encoding data so that the distance between code words is as large as possible for a given data capacity. This post will explain what that means and give examples of MDS codes. Notation A linear block code takes a sequence of k symbols […]

Automatic data reweighting

by

John

on 2020-03-04 11:10 (#507KX)

Suppose you are designing an autonomous system that will gather data and adapt its behavior to that data. At first you face the so-called cold-start problem. You don’t have any data when you first turn the system on, and yet the system needs to do something before it has accumulated data. So you prime the […]

Maximum gap between binomial coefficients

by

John

on 2020-03-02 12:19 (#5044S)

I recently stumbled on a formula for the largest gap between consecutive items in a row of Pascal’s triangle. For n ≥ 2, where For example, consider the 6th row of Pascal’s triangle, the coefficients of (x + y)^6. 1, 6, 15, 20, 15, 6, 1 The largest gap is 9, the gap between 6 […]
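
The worked example can be checked by brute force. This sketch does not use the closed-form formula from the post (which is not reproduced in the excerpt); it simply builds the row and takes the largest difference between neighbors. The function name `max_gap` is made up for the example, and `math.comb` needs Python 3.8 or later.

```python
from math import comb

def max_gap(n):
    """Largest absolute difference between consecutive entries
    of row n of Pascal's triangle, found by direct search."""
    row = [comb(n, k) for k in range(n + 1)]
    return max(abs(b - a) for a, b in zip(row, row[1:]))

print(max_gap(6))  # → 9
```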

Formatting in comments

by

John

on 2020-03-02 12:16 (#5044T)

The comments to the posts here are generally very helpful. I appreciate your contributions to the site. I wanted to offer a tip for those who leave comments and are frustrated by the way the comments appear, especially those who write nicely formatted snippets of code only to see the formatting lost. There is a […]

Sum of squared digits

by

John

on 2020-02-28 13:13 (#500QS)

Take a positive integer x, square each of its digits, and sum. Now do the same to the result, over and over. What happens? To find out, let’s write a little Python code that sums the squares of the digits. def G(x): return sum(int(d)**2 for d in str(x)) This function turns a number into a […]
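
One way to explore the question is to apply G repeatedly and stop as soon as a value repeats. The sketch below reuses the function G from the excerpt; the helper `orbit` is a hypothetical addition for illustration, not code from the post.

```python
def G(x):
    return sum(int(d)**2 for d in str(x))

def orbit(x):
    """Apply G repeatedly, returning the values visited
    up to (but not including) the first repeated value."""
    seen = []
    while x not in seen:
        seen.append(x)
        x = G(x)
    return seen

print(orbit(7))  # → [7, 49, 97, 130, 10, 1]
print(orbit(4))  # → [4, 16, 37, 58, 89, 145, 42, 20]
```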

Computing the area of a thin triangle

by

John

on 2020-02-27 17:14 (#4ZZ9C)

Heron’s formula computes the area of a triangle given the length of each side: A = √(s(s − a)(s − b)(s − c)), where s = (a + b + c)/2. If you have a very thin triangle, one where two of the sides approximately equal s and the third side is much shorter, a direct implementation of Heron’s formula may not be accurate. The cardinal rule of numerical programming is to […]
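
For comparison, here is a sketch of the direct implementation next to a rearrangement usually attributed to William Kahan, which sorts the sides and groups the arithmetic to reduce cancellation. The excerpt is cut off before any remedy is described, so the second function is offered as one known alternative, not necessarily the approach the post takes. Both function names are made up for the example, and both assume the three lengths form a valid triangle.

```python
import math

def heron_direct(a, b, c):
    """Heron's formula exactly as written: can lose accuracy
    when the triangle is very thin."""
    s = (a + b + c) / 2
    return math.sqrt(s * (s - a) * (s - b) * (s - c))

def heron_kahan(a, b, c):
    """Kahan's rearrangement. The parentheses matter: they fix
    the order of the floating point operations."""
    a, b, c = sorted((a, b, c), reverse=True)   # ensure a >= b >= c
    return 0.25 * math.sqrt(
        (a + (b + c)) * (c - (a - b)) * (c + (a - b)) * (a + (b - c))
    )

# A thin triangle: two long sides and one very short side.
print(heron_direct(1.0, 1.0, 1e-8))
print(heron_kahan(1.0, 1.0, 1e-8))
```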

A tale of two iterations

by

John

on 2020-02-27 00:59 (#4ZY5B)

I recently stumbled on a paper [1] that looks at a cubic equation that comes out of a problem in orbital mechanics: σx³ = (1 + x)² Much of the paper is about the derivation of the equation, but here I’d like to focus on a small part of the paper where the author looks […]

by

John

on 2020-02-26 14:01 (#4ZX8B)

A couple days ago I wrote about Hamming codes and said that they are so-called perfect codes, i.e. codes for which Hamming’s upper bound on the number of code words with given separation is exact. Not only are Hamming codes perfect codes, they’re practically the only non-trivial perfect codes. Specifically, Tietavainen and van Lint proved […]

Computing parity of a binary word

by

John

on 2020-02-24 22:47 (#4ZTNA)

The previous post mentioned adding a parity bit to a string of bits as a way of detecting errors. The parity of a binary word is 1 if the word contains an odd number of 1s and 0 if it contains an even number of 1s. Codes like the Hamming codes in the previous post […]
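
A direct way to compute parity from this definition is to XOR the bits together one at a time. This is a simple sketch for non-negative integers, not an optimized routine, and the function name `parity` is chosen for the example.

```python
def parity(word):
    """Return 1 if the non-negative integer `word` has an odd
    number of 1 bits, and 0 if it has an even number."""
    p = 0
    while word:
        p ^= word & 1   # fold the lowest bit into the running parity
        word >>= 1      # then discard it
    return p

print(parity(0b1011))  # three 1s → 1
print(parity(0b1001))  # two 1s → 0
```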

A gentle introduction to Hamming codes

by

John

on 2020-02-24 16:14 (#4ZSXG)

The previous post looked at how to choose five or six letters so that their Morse code representations are as distinct as possible. This post will build on the previous one to introduce Hamming codes. The problem of finding Hamming codes is much simpler in some ways, but also more general. Morse code is complicated […]

ADFGVX cipher and Morse code separation

by

John

on 2020-02-23 00:50 (#4ZR1F)

A century ago the German army used a field cipher that transmitted messages using only six letters: A, D, F, G, V, and X. These letters were chosen because their Morse code representations were distinct, thus reducing transmission error. The ADFGVX cipher was an extension of an earlier ADFGV cipher. The ADFGV cipher was based […]

ChaCha RNG with fewer rounds

by

John

on 2020-02-23 00:45 (#4ZR1G)

ChaCha is a CSPRNG, a cryptographically secure pseudorandom number generator. When used in cryptography, ChaCha typically carries out 20 rounds of its internal scrambling process. Google’s Adiantum encryption system uses ChaCha with 12 rounds. The runtime for ChaCha is proportional to the number of rounds, so you don’t want to do more rounds than necessary […]

Popcount: counting 1’s in a bit stream

by

John

on 2020-02-21 14:29 (#4ZPBT)

Sometimes you need to count the number of 1’s in a stream of bits. The most direct application would be summarizing yes/no data packed into bits. It’s also useful in writing efficient, low-level bit twiddling code. But there are less direct applications as well. For example, three weeks ago this came up in a post […]
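
As an illustration, here is a sketch of one classic way to count set bits, often credited to Kernighan: `n & (n - 1)` clears the lowest set bit, so the loop runs once per 1 bit. The function name `popcount` is chosen for the example, and the excerpt does not say which method the post itself uses.

```python
def popcount(n):
    """Count the 1 bits in a non-negative integer."""
    count = 0
    while n:
        n &= n - 1   # clear the lowest set bit
        count += 1
    return count

print(popcount(0b101101))  # → 4
```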

A brief comment on hysteresis

by

John

on 2020-02-20 16:30 (#4ZMNV)

You might hear hysteresis described as a phenomenon where the solution to a differential equation depends on its history. But that doesn’t make sense: the solution to a differential equation always depends on its history. The solution at any point in time depends (only) on its immediately preceding state. You can take the state at […]

Safe Harbor ain’t gonna cut it

by

John

on 2020-02-20 16:22 (#4ZMNW)

There are two ways to deidentify data to satisfy HIPAA: Safe Harbor, § 164.514(b)(2), and Expert Determination, § 164.514(b)(1). And for reasons explained here, you may need to be concerned with HIPAA even if you’re not a “covered entity” under the statute. To comply with Safe Harbor, your data may not contain any of eighteen […]

Inverse congruence RNG

by

John

on 2020-02-19 15:03 (#4ZK2M)

Linear congruence random number generators have the form x_{n+1} = a x_n + b mod p. Inverse congruence generators have the form x_{n+1} = a x_n^{-1} + b mod p, where x^{-1} means the modular inverse of x, i.e. the value y such that xy = 1 mod p. It is possible that x = […]
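
Here is a minimal sketch of the inverse congruence recurrence. Two things are assumptions rather than content from the excerpt: the toy parameters (p = 7, a = 2, b = 1, seed 3) are picked only to keep the arithmetic checkable by hand, and the inverse of 0 is taken to be 0, a common convention for these generators. The excerpt is cut off before it says how the post handles that case. Computing a modular inverse with `pow(x, -1, p)` needs Python 3.8 or later.

```python
from itertools import islice

def inverse_congruence(seed, a, b, p):
    """Yield x_1, x_2, ... from x_{n+1} = a * x_n^{-1} + b mod p,
    where p is prime. Assumption: the inverse of 0 is taken to be 0."""
    x = seed % p
    while True:
        inv = pow(x, -1, p) if x else 0   # modular inverse of x mod p
        x = (a * inv + b) % p
        yield x

# Toy parameters, far too small for real use.
print(list(islice(inverse_congruence(seed=3, a=2, b=1, p=7), 5)))
# → [4, 5, 0, 1, 3]
```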

A better adaptive Runge-Kutta method

by

John

on 2020-02-19 14:08 (#4ZK2N)

This is the third post in a series on Runge-Kutta methods. The first post in the series introduces Runge-Kutta methods and Butcher tableau. The next post looked at Fehlberg’s adaptive Runge-Kutta method, first published in 1969. This post looks at a similar method from Dormand and Prince in 1980. Like Fehlberg’s method, the method of […]

How to estimate ODE solver error

by

John

on 2020-02-19 14:05 (#4ZK2P)

This post brings together several themes I’ve been writing about lately: caching function evaluations, error estimation, and Runge-Kutta methods. A few days ago I wrote about how Runge-Kutta methods can all be summarized by a set of numbers called the Butcher tableau. These methods solve by evaluating f at some partial step, then evaluating f […]

Trapezoid rule and Romberg integration

by

John

on 2020-02-18 18:06 (#4ZHRB)

This post will look at two numerical integration methods, the trapezoid rule and Romberg’s algorithm, and memoization. This post is a continuation of ideas from the recent posts on Lobatto integration and memoization. Although the trapezoid rule is not typically very accurate, it can be in special instances, and Romberg combined it with extrapolation to […]

Python and the Tell-Tale Heart

by

John

on 2020-02-18 02:53 (#4ZGHV)

I was browsing through SciPy documentation this evening and ran across a function in scipy.misc called electrocardiogram. What?! It’s an actual electrocardiogram, sampled at 360 Hz. Presumably it’s included as convenient example data. Here’s a plot of the first five seconds. I wrote a little code using it to turn the ECG into an audio […]

Why HIPAA matters even if you’re not a “covered entity”

by

John

on 2020-02-17 16:05 (#4ZFZD)

The HIPAA privacy rule only applies to “covered entities.” This generally means insurance plans, healthcare clearinghouses, and medical providers. If your company is using health information but isn’t a covered entity per the HIPAA statute, there are a couple reasons you might still need to pay attention to HIPAA [1]. The first is that […]

Scaling and memoization

by

John

on 2020-02-16 21:33 (#4ZF2W)

The previous post explained that Lobatto’s integration method is more efficient than Gaussian quadrature when the end points of the interval need to be included as integration points. It mentioned that this is an advantage when you need to integrate over a sequence of contiguous intervals, say [1, 2] then [2, 3], because the function […]

Lobatto integration

by

John

on 2020-02-16 03:11 (#4ZE89)

A basic idea in numerical integration is that if a method integrates polynomials exactly, it should do well on polynomial-like functions [1]. The higher the degree of polynomial it integrates exactly, the more accurate we expect it will be on functions that behave like polynomials. The best known example of this is Gaussian quadrature. However, […]
