Precedence: ordering or grouping?
As I've mentioned before, I'm part of the technical group looking at updating the ECMA-334 C# standard to reflect the C# 5 Microsoft specification. I recently made a suggestion that I thought would be uncontroversial, but which caused some discussion - and prompted this "request for comment" post, effectively.
What does the standard say about precedence?
The current proposed standard includes the following text:
The order of evaluation of operators in an expression is determined by the precedence and associativity of the operators (13.4.2).
Operands in an expression are evaluated from left to right.
...
When an expression contains multiple operators, the precedence of the operators controls the order in which the individual operators are evaluated. [Note: For example, the expression x + y * z is evaluated as x + (y * z) because the * operator has higher precedence than the binary + operator. end note]
I like the example in the note, but I'm not keen on the rest of the wording. It's very easy to miss the difference between two statements: operands in any given expression are always evaluated left to right, whereas operators are evaluated in an order determined by precedence.
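To make that distinction concrete, here's a minimal C# sketch. The Operand helper and the values 2, 3 and 4 are made up purely for illustration; the helper just records when each operand is evaluated. It suggests that the operands are evaluated x, y, z (left to right), while the result is the one you'd get from x + (y * z).

```csharp
using System;
using System.Collections.Generic;

// Records the order in which operands are evaluated.
var log = new List<string>();

// Hypothetical helper: logs its name, then returns the value.
int Operand(string name, int value)
{
    log.Add(name);
    return value;
}

// x + y * z with x = 2, y = 3, z = 4
int result = Operand("x", 2) + Operand("y", 3) * Operand("z", 4);

Console.WriteLine(string.Join(", ", log)); // x, y, z
Console.WriteLine(result);                 // 14, i.e. 2 + (3 * 4), not (2 + 3) * 4
```

Precedence changes the value that comes out, but it doesn't change the fact that x is evaluated first.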
I've always thought about precedence in terms of grouping, not ordering - I think of one operator as "binding tighter" than another rather than as being "executed before" it. When I consider precedence, I mentally apply brackets to group operators and operands together explicitly. Eric Lippert has blogged along similar lines but I wouldn't want to put words into his mouth by suggesting he agrees with me. Interestingly, he includes:
Order of evaluation rules describe the order in which each operand in an expression is evaluated.
That's certainly true about the order of evaluation of operands in an expression, but as seen earlier, precedence is also specified in terms of "order of evaluation". To me, that's what makes the standard confusing.
Importantly though, when I expressed this in a meeting, smarter people than me said that they thought of precedence exactly in terms of order of evaluation. Grouping was just another way of looking at it, but a sort of secondary approach.
What does "order of evaluation" even mean?
Let's take a closer look at the wording of the standard. It's fairly clear what "operands in an expression are evaluated from left to right" means (ignoring the possibility that the "left" operand actually occurs to the right of the "right" operand physically due to line breaks). The left operand is completely evaluated, from start to finish, before the right operand is evaluated. Great.
But what about the "order of evaluation of operators"? Here it's trickier. Does "evaluating" a + b include first evaluating a and b? Should "order of evaluation" mean "order of starting to evaluate"? If so, the standard would actually be inaccurate. Let's go back to the example of x + y * z. We can view that as a sequence of steps:
1. Evaluate x.
2. Evaluate y.
3. Evaluate z.
4. Multiply the results of steps 2 and 3.
5. Add the results of steps 1 and 4.
Note that the multiplication definitely occurs before the addition. So that looks right. But if I rewrite it just a little to give more context (sorry about the bullets; it's the only way I could get the formatting right):
- Evaluate x + y * z
  - 1 Evaluate x
  - 2 Evaluate y * z
    - 2a Evaluate y
    - 2b Evaluate z
    - 2c Multiply the results of steps 2a and 2b; this is the result of step 2
  - 3 Add the results of steps 1 and 2; this is the result of the expression
At that point, it's clear that we've started evaluating the + operator before we've started evaluating the * operator.
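The nested steps above can be sketched in C#, with the caveat that this is only a toy model of evaluation and not how any C# compiler works. Leaf and Binary are hypothetical helpers, and the "start" log entry reflects my assumption about what "starting to evaluate" an operator means.

```csharp
using System;
using System.Collections.Generic;

var log = new List<string>();

// Hypothetical leaf node: evaluating it just yields its value.
Func<int> Leaf(string name, int value) => () =>
{
    log.Add($"evaluate {name}");
    return value;
};

// Hypothetical operator node: evaluate the left operand, then the right,
// and only then apply the operator to the two results.
Func<int> Binary(string symbol, Func<int> left, Func<int> right, Func<int, int, int> apply) => () =>
{
    log.Add($"start {symbol}");
    int l = left();
    int r = right();
    log.Add($"apply {symbol}");
    return apply(l, r);
};

// x + y * z, grouped as x + (y * z), with x = 2, y = 3, z = 4
var expression = Binary("+",
    Leaf("x", 2),
    Binary("*", Leaf("y", 3), Leaf("z", 4), (a, b) => a * b),
    (a, b) => a + b);

int total = expression();
foreach (var entry in log)
{
    Console.WriteLine(entry);
}
// start +, evaluate x, start *, evaluate y, evaluate z, apply *, apply +
```

In this model + is started before *, but * is applied before +, which is exactly the ambiguity in "order of evaluation of operators".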
Similarly, if we view it as a tree:
      +
     / \
    x   *
       / \
      y   z
... we hit the + node before we hit the * node.
So from the "starting to evaluate" perspective, precedence appears to fall apart. The ordering only makes sense when you start talking about the operator performing its duty with the already-evaluated operands - which is tricky for ??, ?. and ?: which don't always evaluate all their operands. I suspect that's fixable though, with careful wording - and I'm influenced by the fact that the term "precedence" is naturally ordering-related (one thing preceding another). Maybe it's as simple as talking about the order of completing evaluation of operands.
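As a rough illustration of why ?? and ?: are awkward here, the following sketch logs which operands actually get evaluated. Again, Operand is a hypothetical helper and the values are arbitrary.

```csharp
using System;
using System.Collections.Generic;

var log = new List<string>();

// Hypothetical helper: logs its name, then returns the (possibly null) value.
int? Operand(string name, int? value)
{
    log.Add(name);
    return value;
}

// The left operand is non-null, so the right operand of ?? is never evaluated.
int? first = Operand("a", 1) ?? Operand("b", 2);

// The condition is true, so only the second operand of ?: is evaluated.
int? second = Operand("c", 3) != null ? Operand("d", 4) : Operand("e", 5);

// The left operand is null, so this time both operands of ?? are evaluated.
int? third = Operand("f", null) ?? Operand("g", 6);

Console.WriteLine(string.Join(", ", log)); // a, c, d, f, g
```

Operands b and e never appear in the log, so any wording based on "completing evaluation of operands" would need to allow for operands that are skipped entirely.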
Non-conclusion: over to you
So, what should we do in the standard? Given the range of views on the technical group, I said I'd write this blog post and canvass opinion. Readers: how do you think about precedence? How should the standard talk about precedence? Is any aspect of the existing wording (in the C# 5 specification it's section 7.3.1; in ECMA-334 4th edition it's section 14.2.1) particularly helpful or confusing?
The technical group is full of very smart people - all of them smarter than me and with a deeper computer science (and/or C# compiler implementation) background. That makes me simultaneously nervous of proposing changes - but also confident in my role of "interested amateur" in that if I find something confusing, I suspect some other readers will too.
I'm in no way saying it's wrong to think of precedence in terms of ordering - albeit with a more precise definition of ordering than we've got now - but I'm suggesting it's not the most helpful way of expressing it for readers. Just to be entirely clear, I'm not suggesting any sort of semantic change - if we change the wording of the standard, it would be purely about clarification, with no behavioural change.
I had originally intended to make this blog post as "on the fence" as possible, but the more I've looked at it, the more I've reinforced my original position - I can only apologise for not being terribly even-handed. I'm very happy to be corrected though, and look forward to reading plenty of comments. Don't be shy.