Article 6MM03 MCMC and the coupon collector problem

MCMC and the coupon collector problem

by
John
from John D. Cook on (#6MM03)

Bob Carpenter wrote today about how Markov chains cannot thoroughly cover high-dimensional spaces, and that they do not need to. That's kinda the point of Monte Carlo integration. If you want systematic coverage, you can/must sample systematically, and that's impractical in high dimensions.

Bob gives the example that if you want to get one integration point in each quadrant of a 20-dimensional space, you need a million samples. (220 to be precise.) But you may be able to get sufficiently accurate results with a lot less than a million samples.

If you wanted to be certain to have one sample in each quadrant, you could sample at (1, 1, 1, ..., 1). But if for some weird reason you wanted to sample randomly and hit each quadrant, you have a coupon collector problem. The expected number of samples to hit each of N cells with uniform [1] random sampling with replacement is

N( log(N) + )

where is Euler's constant. So if N = 220, the expected number of draws would be over 15 million.

Related posts

[1] We're assuming each cell is sampled independently with equal probability each time. Markov chain Monte Carlo is more complicated than that because the draws are not independent.

The post MCMC and the coupon collector problem first appeared on John D. Cook.
External Content
Source RSS or Atom Feed
Feed Location http://feeds.feedburner.com/TheEndeavour?format=xml
Feed Title John D. Cook
Feed Link https://www.johndcook.com/blog
Reply 0 comments