
Optimizing associative operations

by ericlippert
from Fabulous adventures in coding

A question I occasionally get is, suppose I have code like this:

const double x = 200.0;
const double y = 0.5;
...
void M(double z)
{
    double r = z * x * y;
    ...
}

Does the compiler generate code as though you'd written z * 100.0?

Well, if you try compiling this and examine the output of the compiler with ILDASM you'll see that no, in fact there are two multiplications generated. But if you'd written x * y * z then in fact there would be just a multiplication by 100.0 in the IL. What's going on?

The issue here is that multiplication is left associative. That is, a * b * c must be evaluated as (a * b) * c, and not a * (b * c). Note that this has nothing to do with the ordering in time of the evaluations of the operands; C# guarantees that the operands are evaluated left to right regardless of how the expression is parenthesized. Rather, the compiler must guarantee that if there is possibly a difference in the values of (a * b) * c and a * (b * c), the former must be chosen.

Could there be such a difference? Sure. Suppose z is so large that multiplying it by 100.0 is finite, but multiplying it by 200.0 results in an infinity. Infinity times 0.5 is still infinity, so the correct answer here would be infinity.
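C# doubles are IEEE 754 64-bit floats, so the same effect can be demonstrated in Python, whose floats are also IEEE doubles. The value 1.5e306 here is simply an arbitrary choice large enough to trigger the overflow:

```python
import math

z = 1.5e306                      # large enough that z * 200.0 overflows

left_assoc = (z * 200.0) * 0.5   # how C# must evaluate z * x * y
folded = z * 100.0               # the hypothetical "optimized" version

print(math.isinf(left_assoc))    # True: z * 200.0 overflows to infinity
print(math.isinf(folded))        # False: z * 100.0 is still finite
```

The two expressions really do produce different answers, so the compiler is not free to substitute one for the other.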

Now we have to throw in a little wrinkle here: the C# specification notes that the runtime is permitted to do floating point operations using extra precision bits if it so chooses, and in fact many implementations of the CLR do. It is entirely possible that even after doing the multiplications by 200.0 and 0.5 on such a value, the result is the same as though it were multiplied by 100.0. The question at hand is not whether you can get that result - you can - but whether the compiler is allowed to optimize z * 200.0 * 0.5 by generating code for z * 100.0. Since there could be a difference, the compiler cannot make that optimization.

And similarly, if we had something like z * y * x, then z could be so small that multiplying it by 0.5 underflows to zero, and then the multiplication by 200.0 results in zero, even though z * 100.0 might be nonzero.
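The underflow case is just as easy to see with IEEE doubles. Here 5e-324 is the smallest positive (subnormal) double; multiplying it by 0.5 rounds to zero:

```python
z = 5e-324                       # smallest positive subnormal double

left_assoc = (z * 0.5) * 200.0   # how C# must evaluate z * y * x
folded = z * 100.0               # the hypothetical "optimized" version

print(left_assoc)                # 0.0: z * 0.5 underflows to zero
print(folded != 0.0)             # True: z * 100.0 is tiny but nonzero
```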

Once again we see that floating point arithmetic is weird. What about integer arithmetic? If we are entirely in integers then it is the case that (a * b) * c is equal to a * (b * c), so could this optimization be made if b and c are constant integers? Give it some thought before you read on below.

.

.

.

.

.

" There is one bizarre case. Suppose we have q * 2 * 0 and we are in a checked context. The compiler must generate the multiplication by 2 because that multiplication could overflow before the multiplication by zero, so this cannot be reduced to simply q * 0. Other than that, generally speaking integer multiplication is an associative operation in C#, so the compiler could turn q * 2 * 3 into q * 6. To my knowledge no C# compiler does so. The costs of the work needed to design, implement, debug and test the optimization is not justified by the tiny benefit that accrues to almost no code.

There is however one situation in which the C# compiler does realize that an operation can be optimized by harmlessly parenthesizing on the right instead of the left: string concatenation. Code like:

r = s + "blah" + "foo";

is common, particularly in machine-generated code. The C# compiler does realize that this can be optimized to s + "blahfoo". I wrote a bit about that a few years ago, so check that out if you're interested.
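The reason this right-parenthesization is harmless is that string concatenation, unlike floating point multiplication, really is associative and cannot overflow or lose precision. A quick check (in Python here, but the same equivalence holds for C# strings):

```python
s = "prefix "

left_assoc = (s + "blah") + "foo"   # how the expression parses
folded = s + "blahfoo"              # what the compiler may emit instead

print(left_assoc == folded)         # True: the fold is always safe
```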

