Estimating normal tail extreme probabilities
In the previous post I said the probability of a normal distribution being 50 standard deviations from its mean was absurdly small, but I didn't quantify it.
You can't simply fire up something like R and ask for the probability. The actual value is smaller than can be represented by a floating point number, so you'll just get zero.
Though there's no practical reason to compute the probability of such events, we'll do it anyway. The technique shown here is useful for less extreme cases that could be practical.
Let Z be a normal random variable with mean 0 and standard deviation 1. We need to estimatethe complementary CDF [1], given by
for t = 50. This post gives us the upper and lower bounds we need:
The same post also gives tighter bounds, which are useful for less extreme cases, but not necessary here.
The bounds above tell us the probability of Z taking on a value of 50 or more is approximately
We'd like to write this value in scientific notation. We can't directly evaluate it with most software because, as we said above, it's too small. If you were to type exp(-2500) in Python, for example, it would return exactly 0. We can, however, evaluate the log base 10 of the expression above using Python.
>>> from math import exp, pi, log10 >>> -1250*log10(exp(1)) - 0.5*log10(2*pi) - log10(50) -544.9661623175799
So our probability is about 10-545. This is far smaller than the probability of selecting a particular elementary particle at random from the universe. (See here.)
Related posts[1] Numerical software packages will provide functions for CDFs and CCDFS, e.g. for and c = 1 - . This may seem unnecessary, and it would be if we could compute with infinite precision. But it is possible, and often necessary, to compute c(x) for values of x so large that all precision would be lost when evaluating 1 - (x). For less extreme values of x we wouldn't lose all precision, but we'd lose some. See the discussion of erf and erfc and other functions that seem redundant here.
The post Estimating normal tail extreme probabilities first appeared on John D. Cook.