
Chi squared approximations

by John D. Cook

In the previous post I needed to know the tail percentile points for a chi squared distribution with a huge number of degrees of freedom. When the number of degrees of freedom ν is large, a chi squared random variable has approximately a normal distribution with the same mean and variance, namely mean ν and variance 2ν.

In that post, ν was 9999 and we needed to find the 2.5 and 97.5 percentiles. Here are the percentiles for χ²(9999):

 >>> from scipy.stats import chi2, norm
 >>> chi2(9999).ppf([0.025, 0.975])
 array([ 9723.73223701, 10278.05632026])

And here are the percentiles for N(9999, 19998):

 >>> norm(9999, (2*9999)**0.5).ppf([0.025, 0.975])
 array([ 9721.83309451, 10276.16690549])

So the results on the left end agree to three significant figures and the results on the right agree to four.

Fewer degrees of freedom

When ν is more moderate, say ν = 30, the normal approximation is not so hot. (We're stressing the approximation by looking fairly far out in the tails. Closer to the middle the fit is better.)

Here are the results for χ²(30):

 >>> chi2(30).ppf([0.025, 0.975])
 array([16.79077227, 46.97924224])

And here are the results for N(30, 60):

 >>> norm(30, (60)**0.5).ppf([0.025, 0.975])
 array([14.81818426, 45.18181574])

The normal distribution is symmetric and the chi squared distribution is not, though it becomes more symmetric as ν increases. Transformations of the chi squared distribution that make it more symmetric may also improve the approximation accuracy. That wasn't important when we had ν = 9999, but it is more important when ν = 30.

Fisher transformation

If X ~ χ²(ν), Fisher suggested the approximation √(2X) ~ N(√(2ν − 1), 1).

Let Y be a N(√(2ν − 1), 1) random variable and Z a standard normal random variable, N(0, 1). Then we can estimate χ² probabilities from normal probabilities.

 P(X ≤ x) = P(√(2X) ≤ √(2x)) ≈ P(Y ≤ √(2x)) = P(Z ≤ √(2x) − √(2ν − 1))

So if we want to find the percentage points for X, we can solve for corresponding percentage points for Z.

If z is the point where P(Z ≤ z) = p, then

 x = (z + √(2ν − 1))² / 2

is the point where P(X ≤ x) = p.

If we use this to find the 2.5 and 97.5 percentiles for a ^2(30) random variable, we get 16.36 and 46.48, an order of magnitude more accurate than before.
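Here is a quick sketch of that calculation in Python, assuming scipy is available. (The helper name fisher_ppf is just a label for illustration, not a standard function.)

 from scipy.stats import norm

 def fisher_ppf(p, nu):
     """Approximate chi squared percentile points using Fisher's
     approximation sqrt(2X) ~ N(sqrt(2 nu - 1), 1)."""
     z = norm.ppf(p)                      # standard normal percentile points
     return (z + (2*nu - 1)**0.5)**2 / 2  # invert the square root transformation

 # Reproduces the values quoted above for nu = 30.
 print(fisher_ppf([0.025, 0.975], 30))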

When ν = 9999, the Fisher transformation gives us percentiles that are two orders of magnitude more accurate than before.

Wilson-Hilferty transformation

If X ~ χ²(ν), the Wilson-Hilferty transformation says that (X/ν)^(1/3) is approximately normal with mean 1 − 2/(9ν) and variance 2/(9ν).

This transformation is a little more complicated than the Fisher transform, but also more accurate. You could go through calculations similar to those above to approximate percentage points using the Wilson-Hilferty transformation.
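For example, here is a sketch analogous to the one above, again assuming scipy and using a made-up helper name:

 from scipy.stats import norm

 def wilson_hilferty_ppf(p, nu):
     """Approximate chi squared percentile points using the
     Wilson-Hilferty cube root approximation."""
     z = norm.ppf(p)        # standard normal percentile points
     c = 2 / (9 * nu)       # 2/(9 nu): the variance, and the amount the mean falls below 1
     return nu * (1 - c + z * c**0.5)**3

 # For nu = 30 this gives values much closer to the exact
 # percentiles 16.79 and 46.98 than the Fisher approximation does.
 print(wilson_hilferty_ppf([0.025, 0.975], 30))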

The main use for approximations like this is now for analytical calculations; software packages can give accurate numerical results. For analytical calculation, the simplicity of the Fisher transformation may outweigh the improved accuracy of the Wilson-Hilferty transformation.
