Article 1KYV8 Formal methods let you explore the corners

Formal methods let you explore the corners

by
John
from John D. Cook on (#1KYV8)

I heard someone say the other day that the advantage of formal software validation methods is that they let you explore the corners, cases where intuition doesn't naturally take you.

This made me think of corners in the geometric sense. If you have a sphere in a box in high dimensions, nearly all the volume is in the corners, i.e. outside the sphere. This is more than a metaphor. You can think of software options geometrically, with each independent choice corresponding to a dimension. Paths through a piece of software that are individually rare may account for nearly all use when considered together.

With a circle inside a square, nearly 78.5% of the area is inside the circle. With a ball sitting inside a 3-D box, 52.4% of the volume is inside the ball. As the dimension increases, the proportion of volume inside the sphere rapidly decreases. For a 10-dimensional sphere sitting in a 10-dimensional box, 0.25% of the volume is in the sphere. Said another way, 99.75% of the volume is in the corners.

When you go up to 100 dimensions, the proportion of volume inside the sphere is about 2 parts in 1070, a 1 followed by 70 zeros [1]. If 100 dimensions sounds like pure fantasy, think about a piece of software with more than 100 features. Those feature combinations multiply like geometric dimensions [2].

Here's a little Python code you could use to see how much volume is in a sphere as a function of dimension.

 from scipy.special import gamma from math import pi def unit_sphere_volume(n): return pi**(0.5*n)/gamma(0.5*n + 1) def unit_cube_volume(n): return 2**n def ratio(n): return unit_sphere_volume(n) / unit_cube_volume(n) print( [ratio(n) for n in range(1, 20)] )

* * *

[1] There are names for such extremely large numbers. These names are hardly ever used-scientific notation is much more practical- but they're fun to say. 1070 is ten duovigintillion in American nomenclature, ten undecilliard in European.

[2] Geometric dimensions are perfectly independent, but software feature combinations are not. In terms of logic, some combinations may not be possible. Or in terms of probability, the probability of exploring some paths is conditional on the probability of exploring other paths. Even so, there are inconceivably many paths through any large software system. And in large-scale operations, events that should "never happen" happen regularly.

iPRSvuxCIHo
External Content
Source RSS or Atom Feed
Feed Location http://feeds.feedburner.com/TheEndeavour?format=xml
Feed Title John D. Cook
Feed Link https://www.johndcook.com/blog
Reply 0 comments