Estimating standard deviation from range

by John D. Cook

Suppose you have a small number of samples, say between 2 and 10, and you'd like to estimate the standard deviation σ of the population these samples came from. Of course you could compute the sample standard deviation, but there is a simple and robust alternative.

Let W be the range of our samples, the difference between the largest and smallest value. Think "w" for "width." Then

    W / d_n

is an unbiased estimator of σ, where the constants d_n depend on the sample size n and can be looked up in a table [1].

    | n  | 1/d_n |
    |----+-------|
    | 2  | 0.886 |
    | 3  | 0.591 |
    | 4  | 0.486 |
    | 5  | 0.430 |
    | 6  | 0.395 |
    | 7  | 0.370 |
    | 8  | 0.351 |
    | 9  | 0.337 |
    | 10 | 0.325 |
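As a minimal sketch of how the table might be used in practice, the helper below multiplies the sample range by the tabulated 1/d_n. The names `INV_DN` and `range_sd` are made up for this illustration, and the sample values are arbitrary.

```python
import numpy as np

# 1/d_n for n = 2..10, copied from the table above.
INV_DN = {2: 0.886, 3: 0.591, 4: 0.486, 5: 0.430, 6: 0.395,
          7: 0.370, 8: 0.351, 9: 0.337, 10: 0.325}

def range_sd(samples):
    """Estimate the standard deviation as W / d_n (hypothetical helper)."""
    x = np.asarray(samples, dtype=float)
    n = x.size
    if n not in INV_DN:
        raise ValueError("the table only covers sample sizes 2 through 10")
    w = x.max() - x.min()  # the range W
    return w * INV_DN[n]

# Five arbitrary values: the range is 3.4 - 1.9 = 1.5 and n = 5.
print(range_sd([2.1, 3.4, 1.9, 2.8, 3.0]))
```

Here the estimate is roughly 1.5 × 0.430, or about 0.645.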

The values d_n in the table were calculated from the expected value of W/σ for normal random variables, but the method may be used on data that do not come from a normal distribution.
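One way to reproduce the table numerically, offered as a sketch rather than as the method used in [1]: for n independent samples with CDF F, the expected range is the integral of 1 - F(x)^n - (1 - F(x))^n over the real line. Taking F to be the standard normal CDF gives d_n directly. The function name `d_n` is chosen for this illustration.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def d_n(n):
    """Expected range of n standard normal samples, by numerical integration."""
    # norm.sf(x) is 1 - norm.cdf(x), computed without cancellation in the tail.
    integrand = lambda x: 1.0 - norm.cdf(x)**n - norm.sf(x)**n
    value, _abs_err = quad(integrand, -np.inf, np.inf)
    return value

# Print 1/d_n for comparison with the table.
for n in range(2, 11):
    print(n, round(1.0 / d_n(n), 3))
```

For n = 2 the integral has the closed form 2/√π, so 1/d_2 = √π/2 ≈ 0.886, which matches the first row of the table.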

Let's try this out with a little Python code. First we'll take samples from a standard normal distribution, so the population standard deviation is 1. We'll draw five samples of size 10, and for each one estimate the standard deviation two ways: by the method above and by the sample standard deviation.

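    from scipy.stats import norm, gamma

    for _ in range(5):
        x = norm.rvs(size=10)     # one sample of size n = 10
        w = x.max() - x.min()     # the range W
        # range estimate (1/d_10 = 0.325), then sample standard deviation
        print(w*0.325, x.std(ddof=1))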

Here's the output:

    | w/d_n | std   |
    |-------+-------|
    | 1.174 | 1.434 |
    | 1.205 | 1.480 |
    | 1.173 | 0.987 |
    | 1.154 | 1.277 |
    | 0.921 | 1.083 |

Just from this example it seems the range method does about as well as the sample standard deviation.

For a non-normal example, let's repeat our exercise using a gamma distribution with shape 4, which has standard deviation 2.

    | w/d_n | std   |
    |-------+-------|
    | 2.009 | 1.827 |
    | 1.474 | 1.416 |
    | 1.898 | 2.032 |
    | 2.346 | 2.252 |
    | 2.566 | 2.213 |
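The code for the gamma run isn't shown. A minimal sketch, assuming it is the same loop with `gamma.rvs` and shape parameter 4 (and the default scale of 1), printing the range estimate first to match the table's column order:

```python
from scipy.stats import gamma

# Shape 4 with the default scale 1 gives variance 4, so sigma = 2.
print(gamma.std(4))

for _ in range(5):
    x = gamma.rvs(4, size=10)   # one sample of size n = 10
    w = x.max() - x.min()       # the range W
    # range estimate (1/d_10 = 0.325), then sample standard deviation
    print(w * 0.325, x.std(ddof=1))
```

The draws are random, so the numbers will differ from the table on every run.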

Once again, it seems both methods do about equally well. In both examples the uncertainty due to the small sample size is more important than the difference between the two methods.
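Five draws are too few to say much about either estimator. As a rough check, here is a sketch that repeats the normal experiment many times and summarizes each estimator by its mean and root mean square error. The seed and the number of repetitions are arbitrary choices made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility
n, reps = 10, 100_000            # sample size and number of repetitions
inv_dn = 0.325                   # 1/d_10 from the table

# Each row is one sample of size n from a standard normal (sigma = 1).
x = rng.standard_normal((reps, n))
range_est = (x.max(axis=1) - x.min(axis=1)) * inv_dn
std_est = x.std(axis=1, ddof=1)

for name, est in [("range", range_est), ("std", std_est)]:
    rmse = np.sqrt(np.mean((est - 1.0) ** 2))
    print(f"{name:>5}: mean = {est.mean():.3f}, RMSE = {rmse:.3f}")
```

The sample standard deviation with `ddof=1` is itself slightly biased low as an estimator of σ, so its mean should come out a little under 1, while the mean of the range estimate should sit close to 1.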

Update: To calculate d_n for other values of n, see this post.

[1] Source: H. A. David. Order Statistics. John Wiley and Sons, 1970.

The post Estimating standard deviation from range first appeared on John D. Cook.