Inferring from “is”, part two
In part one I gave a bunch of reasons to reject the proposed feature where the compiler infers additional type information about a local variable when inside the consequence of a conditional statement:
if (animal is Dog)
{
    animal.Bark(); // instead of ((Dog)animal).Bark();
}
But the problem still remains that this is a fairly common pattern, and that it seems weird that the compiler cannot make the necessary inference.
A number of readers anticipated my denouement and made some of the same proposals I was going to make in this part. Let's go through a few of them; see the comments to the previous article for a few more.
First, allow the if statement to declare a new variable, as the for, foreach and using statements already do. Some proposed syntaxes:
if (var dog when animal is Dog) dog.Bark();
Or
if (Dog dog from animal) dog.Bark();
Or
if (animal is Dog dog) dog.Bark();
If I recall correctly, the last one there has been proposed for C# 7.
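For concreteness, here is a minimal sketch of the third form. It assumes a compiler that accepts `animal is Dog dog` (C# 7.0 and later do, as the "is" type pattern). The Animal, Dog and Cat classes are hypothetical stand-ins, and Bark returns a string here only so that the result is easy to inspect.

```csharp
using System;

// Hypothetical stand-in types for illustration.
class Animal { }
class Dog : Animal { public string Bark() => "Woof"; }
class Cat : Animal { }

static class IsPatternDemo
{
    // Returns the bark if the animal is a Dog, otherwise null.
    public static string BarkIfDog(Animal animal)
    {
        // 'dog' is a new local of type Dog; it is assigned only when the test succeeds.
        if (animal is Dog dog)
            return dog.Bark();
        return null;
    }

    static void Main()
    {
        Console.WriteLine(BarkIfDog(new Dog()) ?? "(not a dog)");
        Console.WriteLine(BarkIfDog(new Cat()) ?? "(not a dog)");
    }
}
```

The original variable animal keeps its static type; all of the new type information lives in the new local.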
One of the hardest aspects of language design is deciding how general a feature should be. Should the syntaxes proposed above be restricted to the if statement? Can we add a larger, more orthogonal feature to the language and make the whole language more powerful? Suppose var dog when animal is Dog is simply a Boolean expression with the semantics of "declare a local variable of appropriate scope, initialize the local variable appropriately, the value of the expression is the value produced by the is subexpression." Then you could use this construct in other locations. But that then raises other problems, as a commenter noted.
if (foo || var dog when animal is Dog) dog.Bark();
If that's an expression, then it can be the right side of a logical operator, and therefore might not be evaluated! Should this be a "use of uninitialized variable" error? Seems likely. But these are solvable problems.
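The same question can be tried out with the `animal is Dog dog` form in place of the hypothetical `var dog when` syntax, assuming a C# 7.0 or later compiler. In this sketch the types and the Speak helper are hypothetical. The `||` version is left in a comment because the definite assignment rules reject it; the `&&` version compiles because the body can only run after the right operand was evaluated and produced true.

```csharp
using System;

// Hypothetical stand-in types for illustration.
class Animal { }
class Dog : Animal { public string Bark() => "Woof"; }

static class DefiniteAssignmentDemo
{
    public static string Speak(bool foo, Animal animal)
    {
        // With ||, the right operand is skipped when foo is true, so 'dog'
        // might never be assigned. The compiler rejects this line:
        //   if (foo || animal is Dog dog) return dog.Bark();   // error CS0165
        //
        // With &&, reaching the body implies the right operand ran and succeeded.
        if (foo && animal is Dog dog)
            return dog.Bark();
        return "(silence)";
    }

    static void Main()
    {
        Console.WriteLine(Speak(true, new Dog()));
        Console.WriteLine(Speak(false, new Dog()));
    }
}
```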
I want to get back to the idea of generality though. If the feature is to allow a variable to be introduced in an expression and produce a value, then I say let's just go all the way. (Something like this was proposed for C# 6, but was unfortunately cut.)
if ((var dog = animal as Dog) != null) dog.Bark();
Make a local variable declaration with an initializer an expression whose value is the value that was assigned to the variable. (Note, not the value of the initializer; that might be of a type different than the variable!)
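The closest thing available without such a feature is to declare the local in its own statement and use an ordinary assignment expression in the condition. The sketch below uses hypothetical types and helper names. Its second method illustrates the parenthetical point by analogy with plain assignment: the value of an assignment expression has the type of the variable, not the type of the right-hand side.

```csharp
using System;

// Hypothetical stand-in types for illustration.
class Animal { }
class Dog : Animal { public string Bark() => "Woof"; }

static class AssignmentExpressionDemo
{
    public static string Speak(Animal animal)
    {
        Dog dog; // the declaration has to be a separate statement
        if ((dog = animal as Dog) != null)
            return dog.Bark();
        return "(silence)";
    }

    // The literal 42 is an int, but the assignment expression is a long
    // because the variable being assigned is a long.
    public static Type TypeOfAssignedValue()
    {
        long big;
        var result = (big = 42);
        return result.GetType();
    }

    static void Main()
    {
        Console.WriteLine(Speak(new Dog()));
        Console.WriteLine(TypeOfAssignedValue());
    }
}
```

The cost of this workaround is that dog stays in scope, and stays writable, for the rest of the enclosing block.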
There are a few tricky cases you have to consider here regarding what exactly is the scope of the variable depending on where it was declared lexically, and I'm not going to go into those today. Basically the idea here is to solve the problem by declaring a new variable that is clearly of a particular type. However there are other ways to solve the problems we raised last time.
Many of those problems arose from the fact that a variable can change; variables vary. But C# does have a few mechanisms whereby variables can be introduced that change only once, and are treated as values. The obvious example is readonly fields, which are variables only in a constructor and values otherwise. The "variables" introduced by foreach and using, and by let in a query, also cannot be changed, passed by ref and so on. The lack of generality here has always bugged me. One of the few features Java has that C# lacks is "final" local variables. C# of course has const locals and fields, but they can only be initialized to compile-time constants.
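A small sketch of the existing mechanisms, using a hypothetical Kennel class. The lines that the compiler would reject are left as comments so that the rest compiles.

```csharp
using System;

// Hypothetical class for illustration.
class Kennel
{
    private readonly string name; // a variable only inside the constructor

    public Kennel(string name)
    {
        this.name = name; // allowed: we are in the constructor
    }

    public string Describe(string[] dogs)
    {
        const string separator = ", "; // const locals need a compile-time constant
        // this.name = "other";         // error CS0191: readonly field outside a constructor

        var result = name + ": ";
        foreach (var d in dogs)
        {
            // d = d.ToUpper();         // error CS1656: 'd' is a foreach iteration variable
            result += d + separator;
        }
        return result.TrimEnd(',', ' ');
    }

    static void Main()
    {
        Console.WriteLine(new Kennel("Home").Describe(new[] { "Rex", "Fido" }));
    }
}
```

Note that result, an ordinary local, is the only name in Describe that the compiler lets us reassign.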
The argument against adding readonly locals to C# is that the feature is unnecessary. Locals have local scope, obviously. The region of code in which the name is valid is of a size of your choosing, and ideally that size is small enough that you can easily know whether the local is written more than once. If you choose to write exactly once, that's your choice; there's no need to have the compiler there to enforce that decision. I used to like that argument, but I am liking it less and less as time goes on.
Readonly locals allow the developer to express to the compiler "this variable is actually a named value, not a variable; an attempt to use it as a variable is wrong, and changing the code to make it a variable may be a breaking change, the costs of which I am willing to bear in the future should it become necessary; please feel free to introduce as many optimizations as you like assuming that this is a value, not a variable."
So I really like readonly locals as a proposed future feature; when combined with the "declare a variable in an expression" feature, it gets even more useful. C# could really use something like a query let that works everywhere.
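For reference, this is what a query let looks like today. The sketch uses hypothetical Animal, Dog and Cat types and a hypothetical Barks helper. The range variable dog is introduced in the middle of an expression, is given a value exactly once per element, and cannot be assigned to afterwards.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in types for illustration.
class Animal { }
class Dog : Animal { public string Bark() => "Woof"; }
class Cat : Animal { }

static class QueryLetDemo
{
    public static List<string> Barks(IEnumerable<Animal> animals)
    {
        var query = from animal in animals
                    let dog = animal as Dog   // a named value, not a variable
                    where dog != null
                    select dog.Bark();
        return query.ToList();
    }

    static void Main()
    {
        var zoo = new Animal[] { new Dog(), new Cat(), new Dog() };
        Console.WriteLine(string.Join(", ", Barks(zoo)));
    }
}
```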
A third proposal is to introduce pattern matching / type switching / etc. A commenter points out that Nemerle uses:
match (animal)
{
  | dog is Dog => dog.Bark()
  | cat is Cat => cat.MaintainDignity()
}
I'm not super thrilled with the punctuation there but I like the general idea. There are proposals for adding sophisticated pattern matching to C# that I think I will deal with at another time.
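As a rough C# counterpart to the Nemerle snippet, here is a sketch that assumes a compiler supporting type patterns in switch (C# 7.0 and later). The types and the React helper are hypothetical, and the methods return strings only so the result can be inspected.

```csharp
using System;

// Hypothetical stand-in types for illustration.
class Animal { }
class Dog : Animal { public string Bark() => "Woof"; }
class Cat : Animal { public string MaintainDignity() => "..."; }

static class TypeSwitchDemo
{
    public static string React(Animal animal)
    {
        switch (animal)
        {
            case Dog dog: return dog.Bark();            // 'dog' is in scope only in this section
            case Cat cat: return cat.MaintainDignity();
            default: return "(no reaction)";
        }
    }

    static void Main()
    {
        Console.WriteLine(React(new Dog()));
        Console.WriteLine(React(new Cat()));
        Console.WriteLine(React(new Animal()));
    }
}
```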