Distribution of coordinates on a sphere
Almost Sure posted an interesting fact on X:
If a point (x, y, z) is chosen at random uniformly on the unit sphere, then x, y, and z each have the uniform distribution on [-1, 1] and zero correlations (but not independent!) This follows from Archimedes' On the Sphere and Cylinder" published in 225BC.
Archimedes effectively showed that projecting a sphere to a cylinder wrapping round its equator preserves area, so projection of (x,y,z) is uniform on the cylinder. Used in Lambert cylindrical equal area projection.
In this post I want to empirically demonstrate the distribution of the coordinates, their lack of linear correlation, and their dependence.
The usual way to generate random points on a sphere is to generate three samples from a standard normal distribution and normalize by dividing by the Euclidean norm of the three points. So to generate a list ofN points on a sphere, I generate anN * 3 matrix of normal samples and divide each row by its norm.
import numpy as npN = 10000# Generate N points on a sphereM = norm.rvs(size=(N, 3))M /= np.linalg.norm(M, axis=1)[:, np.newaxis]
Next, I find the correlation of each column with the others.
x = M[:, 0]y = M[:, 1]z = M[:, 2]cxy, _ = pearsonr(x, y)cyz, _ = pearsonr(y, z)cxz, _ = pearsonr(x, z)print(cxy, cyz, cxz)
When I ran this I got correlations of -0.0023, 0.0047, and -0.0177. For reasons explained in the previous post, I'd expect values in the interval [-0.02, 0.02] if the columns were independent, so the correlations are consistent with that hypothesis.
To demonstrate that each column is uniformly distributed on [-1, 1] you could look at histograms.
import matplotlib.pyplot as pltplt.hist(x, bins=int(N**0.5))plt.show()plt.hist(y, bins=int(N**0.5))plt.show()plt.hist(z, bins=int(N**0.5))plt.show()
I'll spare you the plots, but they look like what you'd expect. If you wanted to do something more analytical than visual, you could compute something like the Kolmogorov-Smirnov statistic.
The coordinates are not independent, though they have zero linear correlation. Maybe a test for nonlinear dependence will work. I tried Spearman's rank correlation and Kendall's tau, and neither detected the dependence. But distance correlation did.
from dcor import distance_correlationcxy = distance_correlation(x, y)cyz = distance_correlation(y, z)cxz = distance_correlation(x, z)print(cxy, cyz, cxz)
Each of the distance correlations was approximately 0.11, substantially greater than zero, implying dependence between the coordinates.
Related posts- Alternative way to generate points on a sphere
- Generate random points inside a sphere
- Approximately evenly spaced points on a sphere