Higher-order Laplace approximation
Yesterday's post presented the most common form of Laplace approximation, the second order version, and mentioned in passing that there are higher order versions. The extension to higher order is not trivial, so post gives a high level overview of how you'd do it.
The previous post looked at integrating exp( log( g(x) ) ) by expanding g(x) in a power series centered at its maximum. Without loss of generality we can assume the maximum occurs at 0; otherwise do a change of variables first to make this happen.
The first order term of the series is missing because the derivative of g is zero at its maximum. Taking the terms of the series up to second order gives us something of the form a - bx^2 where b > 0. Then
exp(a - bx^2) = exp(a) exp(-bx^2).
The first term on the right is a constant that cam be pulled out of the integral, and the second term is a normal distribution density, modulo vertical scaling.
Now suppose we want to use more terms from the power series for g. Let's write out the series as
g(x) = a - bx^2 + R(x)
where R(x) is the remainder of the series after the quadratic terms. When we exponentiate the series for g we get
exp(g(x)) = exp(a) exp(-bx^2) exp(R(x)).
We think of this as the expected value of exp(R(X)) where X is a normal random variable with variance 1/b, up to some constant we pull out of the integral.
We could approximate R(x) with the polynomial formed by the first few terms, but that's not enough by itself. We need to approximate the expected value of exp(R(X)), not of R(X). The trick is to create another polynomial and find the expectation of that polynomial applied to X.
We need to find a power series for exp(R(x)) and truncate it after a few terms. We don't need to analytically find the power series for exp(R(x)) directly before truncating it. We can apply the first few terms of the power series for exp to the first few terms of the power series for R and keep all the terms up to the order we want, say up to order k, giving us a kth order polynomial P(x).
Once we have the polynomial P(x), we can find the expectation P(X) in terms of normal moments (see this post) or alternatively by writing the polynomial as a sum of Hermite polynomials.