Why do linear prediction confidence regions flare out?
Suppose you're tracking some object based on its initial position x0 and initial velocity v0. The initial position and initial velocity are estimated from normal distributions with standard deviations σx and σv. (To keep things simple, let's assume our object is moving in only one dimension and that the distributions around initial position and velocity are independent.)
The confidence region for the object flares out over time, something like the bell of a trumpet.
Why does the region get larger? Because there's uncertainty in the velocity, and the velocity gets multiplied by elapsed time.
Why isn't the confidence region a cone? Because that would ignore the uncertainty in the initial position. The result would be too small.
Why isn't the confidence region a truncated cone? That's not a bad approximation, though it's a bit too large. If we ignore probability for a moment and treat confidence intervals as deterministic limits, then we get a truncated cone. For example, suppose we assume position and velocity are each within two standard deviations of their estimates. Then we'd estimate position to be between x0 - 2σx + (v0 - 2σv) t on the low end and x0 + 2σx + (v0 + 2σv) t on the high end. This is only an approximation because we've ignored probability, and it's pessimistic because it assumes extreme error values for both estimates at the same time.
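As a minimal sketch of the deterministic limits just described, the following Python function computes the low and high ends of the truncated cone at a given time. The function name, the parameter k (the number of standard deviations, 2 in the text), and the example numbers are illustrative assumptions, not part of the original argument. It assumes t is nonnegative.

```python
def truncated_cone_bounds(x0, v0, sigma_x, sigma_v, t, k=2.0):
    """Worst-case position limits at time t >= 0, treating position and
    velocity as each lying within k standard deviations of their estimates."""
    low = x0 - k * sigma_x + (v0 - k * sigma_v) * t
    high = x0 + k * sigma_x + (v0 + k * sigma_v) * t
    return low, high


# Hypothetical numbers chosen only for illustration.
low, high = truncated_cone_bounds(x0=0.0, v0=1.0, sigma_x=0.5, sigma_v=0.25, t=4.0)
print(low, high)
```

The width of this interval is 2k(σx + σv t), which grows linearly in t from a nonzero starting width. That is what makes the region a truncated cone.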
So what is the confidence region? It's somewhere between the cone and the truncated cone.
The position x + t v is the sum of two random variables. The first has variance σx^2 and the second has variance t^2 σv^2. Variances of independent random variables add, so the standard deviation for the sum is
√(σx^2 + t^2 σv^2) = t √(σx^2 / t^2 + σv^2)
Note that as t increases, the latter approaches t σv from above. Ignoring the uncertainty in initial position underestimates standard deviation, but the relative error decreases as t increases.
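To make the comparison concrete, here is a small Python sketch that evaluates the standard deviation of the sum alongside the cone value (velocity uncertainty only) and the truncated-cone value (the two standard deviations simply added). The function names and the sample values of the two standard deviations are assumptions made for illustration.

```python
import math


def position_sd(sigma_x, sigma_v, t):
    """Standard deviation of x + t v for independent x and v."""
    return math.sqrt(sigma_x**2 + (t * sigma_v) ** 2)


def cone_sd(sigma_v, t):
    """Cone: ignores the uncertainty in initial position."""
    return t * sigma_v


def truncated_cone_sd(sigma_x, sigma_v, t):
    """Truncated cone: adds the two standard deviations directly."""
    return sigma_x + t * sigma_v


# Hypothetical standard deviations chosen only for illustration.
sigma_x, sigma_v = 0.5, 0.25
for t in [1, 2, 4, 8, 16, 32]:
    exact = position_sd(sigma_x, sigma_v, t)
    cone = cone_sd(sigma_v, t)
    trunc = truncated_cone_sd(sigma_x, sigma_v, t)
    rel_err = (exact - cone) / exact  # how much the cone underestimates
    print(f"t={t:2d}  cone={cone:.4f}  exact={exact:.4f}  "
          f"truncated={trunc:.4f}  cone relative error={rel_err:.4f}")
```

For every positive t the exact value sits strictly between the other two, and the cone's relative error shrinks as t grows.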
For large t, the width of a confidence interval for position at time t is approximately proportional to t, so the confidence region looks like a cone. But for small t, the dependence on t is less linear and more curved.
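One way to sanity-check the formula is a quick simulation. The sketch below draws independent normal errors for initial position and velocity, forms the position error at time t, and compares the sample standard deviation with √(σx^2 + t^2 σv^2). The function name, sample size, seed, and numeric values are all assumptions for illustration. The means x0 and v0 are set to zero because they shift the position without affecting its spread.

```python
import math
import random
import statistics


def simulate_position_sd(sigma_x, sigma_v, t, n=100_000, seed=0):
    """Monte Carlo estimate of the standard deviation of x + t v,
    with x and v drawn independently from zero-mean normals."""
    rng = random.Random(seed)
    positions = [
        rng.gauss(0.0, sigma_x) + t * rng.gauss(0.0, sigma_v)
        for _ in range(n)
    ]
    return statistics.pstdev(positions)


# Hypothetical values chosen only for illustration.
sigma_x, sigma_v, t = 0.5, 0.25, 4.0
simulated = simulate_position_sd(sigma_x, sigma_v, t)
predicted = math.sqrt(sigma_x**2 + (t * sigma_v) ** 2)
print(f"simulated={simulated:.4f}  predicted={predicted:.4f}")
```

With this many samples the two numbers should agree to within a percent or so. The exact simulated value depends on the seed.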