DRY out your policies

ericlippert

from Fabulous adventures in coding on 2015-04-23 14:20 (#7NQK)

Occasionally I'm asked to review code that has a lot of repetition in it. Like, for instance, someone is writing a function memoizer:

    static Func<A, R> Memoize<A, R>(this Func<A, R> function)
    {
        var cache = new Dictionary<A, R>();
        return argument =>
        {
            R result;
            if (!cache.TryGetValue(argument, out result))
            {
                result = function(argument);
                cache[argument] = result;
            }
            return result;
        };
    }

And then they will write remarkably similar code for functions of two arguments, functions of three arguments, functions of four arguments, and so on. "I want my code to be DRY; how can I avoid repeating myself?" is what the developer asking for review usually says.

DRY of course means "Don't Repeat Yourself". In this case I simply deny the premise of the question. This sort of code is WET: We Enjoy Typing. There's no good way in the C# type system to make abstractions across generic types that have a varying number of type arguments. In this case you're just going to have to grit your teeth and deal with it. And besides, it's not that much typing.
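To make the repetition concrete, here is a rough sketch of the one-argument memoizer sitting next to a two-argument sibling. The class name MemoizeExtensions and the choice of Tuple as the cache key are my assumptions for illustration, not something from the code under review. Notice that the second method is the first one retyped with one more type parameter.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical container class; extension methods must live in a static class.
static class MemoizeExtensions
{
    // One-argument version, as in the article.
    public static Func<A, R> Memoize<A, R>(this Func<A, R> function)
    {
        var cache = new Dictionary<A, R>();
        return argument =>
        {
            R result;
            if (!cache.TryGetValue(argument, out result))
            {
                result = function(argument);
                cache[argument] = result;
            }
            return result;
        };
    }

    // Two-argument version: the same mechanism, retyped. The arguments are
    // packed into a Tuple (an assumption of this sketch) so that the pair
    // can serve as a single dictionary key.
    public static Func<A1, A2, R> Memoize<A1, A2, R>(this Func<A1, A2, R> function)
    {
        var cache = new Dictionary<Tuple<A1, A2>, R>();
        return (argument1, argument2) =>
        {
            var key = Tuple.Create(argument1, argument2);
            R result;
            if (!cache.TryGetValue(key, out result))
            {
                result = function(argument1, argument2);
                cache[key] = result;
            }
            return result;
        };
    }
}
```

A three-argument version would be one more copy with Tuple<A1, A2, A3> as the key. That is repetitive, but it is repetition of mechanism, and no business rule lives in it.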

But more generally, I would say that the developer asking for review is misunderstanding the true value of the DRY principle. The fundamental idea of Don't Repeat Yourself is not "never write similar code twice", but rather, as Wikipedia helpfully points out, "every piece of knowledge must have a single, unambiguous, authoritative representation in the system". What does that have to do with avoiding cutting-and-pasting to make a half-dozen slightly different function memoizers? Really, not much at all. Those things are not themselves representations of knowledge; they're the springs and gears of the machine that represents the knowledge.

The real value of DRY is not in the mechanism domain of the program at all; the underlying computational machinery, like our memoizer above, is what I mean by the mechanism. You can tell it is a mechanism because this code is not part of the business domain of the program; this memoizer might be used in everything from a video game to online banking software. In the business domain of those programs - the wizards and lasers, or the accounts and debits - that is where you want to ensure that all the knowledge in the system has exactly one clear representation, because that's the stuff that is likely to change due to user requirements.

In particular, when applying DRY to my own code I think hard about DRYing out the "policy" code: the code that determines the rules of the business domain of the program. If I want to add a new rule about how wizards interact with lasers, I want to do that in as few places as possible, and ideally only one.

A good way to tell whether policy code is insufficiently DRY is if any of the policy code contains comments like "if you change something here, remember to change it elsewhere too". Back in the pre-Roslyn days of the C# compiler team we had this problem all the time. There would be multiple (and often slightly inconsistent) implementations of, say, the method that computes "is this expression convertible to this type?" - as clear an example of "policy" code as you can imagine. There would be comments in the code saying "if you add a new kind of conversion here, don't forget to add it in the IntelliSense code as well". Because of course we enjoy typing in everything twice, writing the test cases twice, regression testing everything twice, and so on. That's where DRY saves real time, effort and money; spending extra developer effort to avoid a few hundred duplicated keystrokes in a mechanism somewhere simply isn't worth it most of the time.
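Here is a toy sketch of what the DRY version of that policy might look like. Everything in it is hypothetical: the names ConversionPolicy, TypeChecker and CompletionProvider are invented for illustration, types are represented as plain strings, and the rule table is a tiny stand-in for the real conversion rules. The point is only the shape: one authoritative answer to "is this convertible to that?", and two consumers that both ask it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical policy class: the single place that knows the conversion rules.
static class ConversionPolicy
{
    // Toy rule table for illustration only; the real rules are far richer.
    private static readonly HashSet<Tuple<string, string>> implicitConversions =
        new HashSet<Tuple<string, string>>
        {
            Tuple.Create("int", "long"),
            Tuple.Create("int", "double"),
            Tuple.Create("long", "double"),
        };

    public static bool IsImplicitlyConvertible(string from, string to)
    {
        return from == to || implicitConversions.Contains(Tuple.Create(from, to));
    }
}

// Hypothetical consumer 1: the type checker asks the policy.
static class TypeChecker
{
    public static bool AssignmentIsLegal(string targetType, string valueType)
    {
        return ConversionPolicy.IsImplicitlyConvertible(valueType, targetType);
    }
}

// Hypothetical consumer 2: the completion list asks the same policy.
static class CompletionProvider
{
    public static IEnumerable<string> AssignableTo(
        string targetType, IEnumerable<string> candidateTypes)
    {
        foreach (var candidate in candidateTypes)
        {
            if (ConversionPolicy.IsImplicitlyConvertible(candidate, targetType))
            {
                yield return candidate;
            }
        }
    }
}
```

Adding a new kind of conversion now means adding one entry to one table, and no comment reminding anyone to update a second copy is needed.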

A commenter asks when it is worthwhile to use a templating language or some such thing to generate highly repetitive code. This is a judgment call; for me, it comes down to the question of first, just how much mechanism code is being produced, and second, what is the likelihood that it is going to change? We used the Visitor Pattern extensively in Roslyn, and it is famous for having absolutely huge amounts of boilerplate code in the visitor base classes. We knew that we would be adding new syntax nodes all the time during the creation of Roslyn, and that even after it was done there would be new nodes to visit in the future, so I wrote up a trivial little code generator that took a list of node types written in XML and produced the visitor base classes automatically when Roslyn was rebuilt.
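To give a flavor of how little such a generator needs to do, here is a minimal sketch. It is not the Roslyn generator: the XML shape (Node elements with a Name attribute), the class name VisitorGenerator, and the emitted SyntaxVisitor and DefaultVisit members are all assumptions made up for this example.

```csharp
using System;
using System.Linq;
using System.Text;
using System.Xml.Linq;

// Hypothetical generator: reads a list of node types from XML and emits
// the source text of a visitor base class with one Visit method per node.
static class VisitorGenerator
{
    public static string Generate(string xml)
    {
        // Assumed input shape: <Nodes><Node Name="IfStatement"/>...</Nodes>
        var names = XDocument.Parse(xml)
            .Descendants("Node")
            .Select(node => (string)node.Attribute("Name"))
            .Where(name => !string.IsNullOrEmpty(name));

        var source = new StringBuilder();
        source.AppendLine("abstract class SyntaxVisitor");
        source.AppendLine("{");
        foreach (var name in names)
        {
            // One boilerplate method per node type.
            source.AppendLine(
                "    public virtual void Visit" + name + "(" + name +
                " node) { DefaultVisit(node); }");
        }
        source.AppendLine("    protected virtual void DefaultVisit(object node) { }");
        source.AppendLine("}");
        return source.ToString();
    }
}
```

Feeding it a document with IfStatement and WhileStatement nodes would produce a SyntaxVisitor class containing VisitIfStatement and VisitWhileStatement methods. Adding a new node type then means adding one line of XML, and the boilerplate is regenerated on the next build.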
