Sum of independent but differently distributed variables


from John D. Cook on 2021-06-23 15:47 (#5KECA)

It's well known that a binomial random variable can be approximated by a Poisson random variable, and the circumstances under which the approximation is particularly good are also well known. See, for example, this post.

A binomial random variable is the sum of iid (independent, identically distributed) Bernoulli random variables. But what if the Bernoulli random variables don't have the same distribution? That is, suppose you're counting the number of heads seen in flipping n coins, where each coin has a potentially different probability of coming up heads. Will a Poisson approximation still work?

This post will cite three theorems on the error in the Poisson approximation to a sum of n independent Bernoulli random variables, each with a potentially different probability of success p_i. I'll state each theorem and very briefly discuss its advantages. The theorems can be found in [1].

Setup

For i = 1, 2, 3, ..., n let X_i be independent Bernoulli random variables with

Prob(X_i = 1) = p_i

and let X with no subscript be their sum:

X = X₁ + X₂ + X₃ + ... + X_n

We want to approximate the distribution of X with a Poisson distribution with parameter λ. We will measure the error in the Poisson approximation by the maximum difference between the probability mass function for X and the probability mass function for a Poisson(λ) random variable.
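To make the error measure concrete, here is a minimal sketch in Python using only the standard library. It builds the exact probability mass function of X by convolving in one Bernoulli variable at a time, then takes the largest absolute difference from a Poisson(λ) mass function. The function names, the `extra_terms` cutoff, and the example probabilities are my own illustrative choices, not something taken from [1].

```python
import math

def bernoulli_sum_pmf(ps):
    """Exact pmf of X = X_1 + ... + X_n for independent Bernoulli(p_i)."""
    pmf = [1.0]  # pmf of the empty sum: X = 0 with probability 1
    for p in ps:
        new = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            new[k] += mass * (1 - p)   # this trial fails
            new[k + 1] += mass * p     # this trial succeeds
        pmf = new
    return pmf

def poisson_pmf(k, lam):
    """Poisson(lam) pmf at k, computed on the log scale; assumes lam > 0."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def max_pmf_error(ps, lam, extra_terms=50):
    """Largest |P(X = k) - Poisson pmf at k| over k = 0, ..., n + extra_terms.

    X cannot exceed n, so for k > n the exact mass is zero. The
    extra_terms cutoff is an assumption that the Poisson tail beyond
    it is negligible.
    """
    exact = bernoulli_sum_pmf(ps)
    n = len(ps)
    err = 0.0
    for k in range(n + 1 + extra_terms):
        exact_k = exact[k] if k <= n else 0.0
        err = max(err, abs(exact_k - poisson_pmf(k, lam)))
    return err

ps = [0.1, 0.2, 0.05, 0.15]   # hypothetical success probabilities
lam = 0.5                     # illustrative parameter value
print(max_pmf_error(ps, lam))
```

The convolution takes on the order of n² operations, which is fine for checking the bounds numerically on modest n.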

Sum of ps

We consider two ways to choose λ. The first is

λ = p₁ + p₂ + p₃ + ... + p_n.

For this choice we have two different theorems that give upper bounds on the approximation error. One says that the error is bounded by the sum of the squares of the ps

p₁^2 + p₂^2 + p₃^2 + ... + p_n^2

and the other says it is bounded by 9 times the maximum of the ps

9 max(p₁, p₂, p₃, ..., p_n).

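The sum of squares bound will be smaller when n is small and the maximum bound will be smaller when n is large. For example, if every p_i equals a common value p, the first bound is np^2 and the second is 9p, so the sum of squares bound is the smaller one exactly when np < 9.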
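The trade-off between the two bounds is easy to see numerically. The following Python sketch evaluates both for a short and a long list of probabilities. The function names and the two example lists are hypothetical, chosen only for illustration.

```python
def sum_of_squares_bound(ps):
    """First bound: p_1^2 + p_2^2 + ... + p_n^2."""
    return sum(p * p for p in ps)

def max_bound(ps):
    """Second bound: 9 times the largest p_i."""
    return 9 * max(ps)

few = [0.05] * 10      # n = 10 coins, each with p = 0.05
many = [0.05] * 1000   # n = 1000 coins, each with p = 0.05

for name, ps in [("few", few), ("many", many)]:
    print(name, sum_of_squares_bound(ps), max_bound(ps))
```

With equal probabilities the sum of squares grows linearly in n while the maximum bound stays fixed, so which bound is tighter depends on how many terms there are.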

Sum of transformed ps

The second way to choose λ is

λ = λ₁ + λ₂ + λ₃ + ... + λ_n

where

λ_i = -log(1 - p_i).

In this case the bound on the error is one half the sum of the squared λ's:

(λ₁^2 + λ₂^2 + λ₃^2 + ... + λ_n^2)/2.

When p_i is small, λ_i ≈ p_i. In this case the error bound for the transformed Poisson approximation will be about half the sum of squares bound above.
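As a rough check on the "about half" claim, the Python sketch below computes the transformed parameters λ_i = -log(1 - p_i) and compares the resulting bound with the sum of squares of the p's. The function names and the sample probabilities are assumptions made for illustration. One observation of my own rather than from [1]: with this choice exp(-λ) equals the product of the (1 - p_i), so the Poisson approximation matches P(X = 0) exactly.

```python
import math

def transformed_lambdas(ps):
    """lambda_i = -log(1 - p_i); log1p keeps precision when p_i is small."""
    return [-math.log1p(-p) for p in ps]

def transformed_bound(ps):
    """Half the sum of the squared lambda_i."""
    return 0.5 * sum(l * l for l in transformed_lambdas(ps))

ps = [0.01, 0.02, 0.015, 0.005]   # hypothetical small probabilities
lams = transformed_lambdas(ps)
lam = sum(lams)

sum_sq = sum(p * p for p in ps)
ratio = transformed_bound(ps) / sum_sq
print(lam, transformed_bound(ps), sum_sq, ratio)
```

For probabilities this small the ratio of the two bounds comes out slightly above one half, since each λ_i is slightly larger than the corresponding p_i.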

[1] R. J. Serfling. Some Elementary Results on Poisson Approximation in a Sequence of Bernoulli Trials. SIAM Review, Vol. 20, No. 3 (July, 1978), pp. 567-579.

The post Sum of independent but differently distributed variables first appeared on John D. Cook.
