Dimensional analysis and types

from John D. Cook on 2015-12-01 13:08 (#W9QQ)

This weekend I mentioned on Twitter that it's spooky how well dimensional analysis catches errors. If you're trying to calculate a number of horses, does your final result have units of horses? If it has units of cats or kilograms, something has gone wrong. This is such a simple idea that it's remarkable it's worth checking. You can find a surprising number of errors simply by asking whether your answer has consistent units (did you add a length to a mass somewhere?) and whether it has the right units (if you're computing a dollar amount, does your answer come out in dollars?).
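To make the unit check concrete, here is a minimal sketch in Python that tracks dimensions alongside values. The `Quantity` class and the three-dimension basis (length, mass, time) are assumptions made for illustration, not a reference to any particular units library.

```python
from dataclasses import dataclass

# Dimension exponents in the order (length, mass, time).
# This small basis is a hypothetical choice for the example.
Dim = tuple[int, int, int]

LENGTH: Dim = (1, 0, 0)
MASS: Dim = (0, 1, 0)
TIME: Dim = (0, 0, 1)


@dataclass(frozen=True)
class Quantity:
    value: float
    dim: Dim

    def __add__(self, other: "Quantity") -> "Quantity":
        # Addition only makes sense when the dimensions agree.
        if self.dim != other.dim:
            raise TypeError(f"cannot add dimensions {self.dim} and {other.dim}")
        return Quantity(self.value + other.value, self.dim)

    def __truediv__(self, other: "Quantity") -> "Quantity":
        # Dividing quantities subtracts their dimension exponents.
        new_dim = tuple(a - b for a, b in zip(self.dim, other.dim))
        return Quantity(self.value / other.value, new_dim)


distance = Quantity(100.0, LENGTH)  # metres
elapsed = Quantity(9.58, TIME)      # seconds
speed = distance / elapsed          # dimension (1, 0, -1): length per time

try:
    distance + Quantity(70.0, MASS)  # adding a length to a mass
except TypeError as err:
    print("caught:", err)
```

The check here happens when the code runs. The same bookkeeping can be done by hand on the back of an envelope, which is why it costs so little.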

A few people replied, making the analogy with type systems for programming languages. Type systems prevent you, for example, from adding a number and a word or from trying to take the square root of an image. You could think of dimensional analysis as a subset of type theory.

I believe there's an argument implicit in the comparison of dimensional analysis and type theory: If a minimal amount of type checking, i.e. checking units of measurement, is effective at catching errors, more type checking should catch more errors. Although I mostly agree with this argument, it leaves out cost. Stronger type checking catches more errors, but at what cost? Dimensional analysis is practically free. It can often be a literal back-of-an-envelope calculation and has great return on effort. What about type systems?

Most people would agree that some minimal amount of type checking is worth the effort. The controversy is over when it is no longer economical to add additional structure. Haskell has a much more expressive type system than most languages, and yet the Haskell community is looking for ways to make the type system even more expressive. Some would say we've yet to discover the point where additional typing isn't worth the effort. At the opposite end of the spectrum are people who believe that no amount of explicit typing is worth it.

As with most technical controversies, the resolution depends on context. You don't want types to get in your way when you're writing a quick-and-dirty script. But you might greatly appreciate them in large, mission-critical projects.

It takes skill to get the most benefit from a type system. A good type system won't automatically do anything for your code. You could actively work against the type system by writing "stringly typed" code, for example, in which every function takes in a string and returns a string. You could passively work with the type system, using built-in types as intended but not creating any new types. Or you could actively work with the type system, creating new types so that more logical errors will become compiler errors.
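As a rough illustration of the difference between working against a type system and working with it, here is a small Python sketch. The function names and the `Unit` enumeration are hypothetical. Python checks these things at run time, or ahead of time with an optional static checker, rather than in a compiler, so treat this as an analogy.

```python
from enum import Enum


# Stringly typed: any string is accepted, so a typo goes unnoticed.
def is_metric_stringly(unit: str) -> bool:
    return unit in ("kilogram", "metre")


print(is_metric_stringly("kilogarm"))  # False, with no hint that the name was misspelled


# Working with the type system: only the listed units can be expressed.
class Unit(Enum):
    KILOGRAM = "kilogram"
    METRE = "metre"
    POUND = "pound"


def is_metric(unit: Unit) -> bool:
    return unit in (Unit.KILOGRAM, Unit.METRE)


print(is_metric(Unit.KILOGRAM))  # True
```

With the enumeration, a misspelling such as `Unit.KILOGARM` fails immediately with an `AttributeError` instead of quietly producing a wrong answer.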

An example of passively using a type system would be distinguishing integers and floating point numbers, say, using integers for counting things, floating point numbers for measurements like temperature and mass, and strings for character data. You could argue that there aren't many natural types in such a program, and so a strong type system wouldn't be that helpful. A more active approach would be, for example, to introduce different types for temperatures and masses. You could go much further than this, organizing data into more complex types.
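Here is one way the more active approach might look in Python. The `Kelvin` and `Kilograms` classes and the `total_mass` function are names invented for this sketch. Both wrap a plain float, but they can no longer be mixed by accident.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kelvin:
    value: float


@dataclass(frozen=True)
class Kilograms:
    value: float

    def __add__(self, other: "Kilograms") -> "Kilograms":
        # Refuse to combine a mass with anything that is not a mass.
        if not isinstance(other, Kilograms):
            return NotImplemented
        return Kilograms(self.value + other.value)


def total_mass(items: list[Kilograms]) -> Kilograms:
    total = Kilograms(0.0)
    for item in items:
        total = total + item
    return total


print(total_mass([Kilograms(1.5), Kilograms(2.0)]))  # Kilograms(value=3.5)

try:
    Kilograms(1.5) + Kelvin(300.0)  # a mass plus a temperature
except TypeError as err:
    print("caught:", err)
```

If both quantities were bare floats, the last addition would succeed and return 301.5, a number with no meaning. In a statically typed language the same mistake would be rejected before the program ran; in Python, a static checker such as mypy can play that role using the annotations above.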

All other things being equal, I like strong, static typing. But there are usually other factors that outweigh my typing preferences: availability of libraries, convenience, client requirements, etc.

Source	RSS or Atom Feed
Feed Location	http://feeds.feedburner.com/TheEndeavour?format=xml
Feed Title	John D. Cook
Feed Link	https://www.johndcook.com/blog