Fission for Program Comprehension

Jeremy Gibbons (2006). Fission for Program Comprehension. Submitted for publication.

Fusion is a program transformation that combines adjacent computations, flattening structure and improving efficiency at the cost of clarity. Fission is the same transformation, in reverse: creating structure, ex nihilo. We explore the use of fission for program comprehension, that is, for reconstructing the design of a program from its implementation. We illustrate through rational reconstructions of the designs for three different C programs that count the words in a text file.

The paper works through the examples meticulously and highlights their recursion schemes. The claim is that the three different wordcount programs might all have arisen from the same high-level design, namely the composition length o words.

The more audacious claim is that [i]f one accepts the claim that design patterns in object-oriented programming correspond to recursion patterns in generic functional programming, then this is further support for Johnsonâ€™s slogan that â€˜patterns document architecturesâ€™.

Comment viewing options

design patterns in oo correspond to recursion patterns in gfp

Is there more to this claim than Norvig's observation at http://www.norvig.com/design-patterns/ ? If so, does anyone have any links/references?

Design patterns as higher-order datatype-generic programs

My recollection of Norvig's point is that many design patterns in OO become simply "standard programming idioms" in dynamic languages; they are ways of simulating those idioms in a less dynamic context. I don't disagree with his point, but I meant something else.

Traditional design patterns are, necessarily, described informally, with prose and pictures. My claim is that with suitable language features, some at least can be described formally, as reusable library code. The language features you need are proper higher-order constructs (closures) and datatype-genericity (parametrization by type functors, such as "list of" and "tree of").

I argued this claim in tutorials at ECOOP and OOPSLA last year, and am in the process of writing the case up as a paper. I will have another shot at presenting it at the SSDGP school in April.

Some will want to check out

Some will want to check out our recent discussion re the Spring School on Datatype-Generic Programming.

interesting, but is this wise?

My claim is that with suitable language features, some at least can be described formally, as reusable library code.

When GoF first came out I recall there was a rush in certain quarters to build libraries (in C++, my working language at the time) that did exactly this--within the constraints of the language. From a software engineering point of view this was disasterous, since code written with it became clogged up with references to incidental implementation decisions (a SuchAndSoVisitor here, a WhozitzStrategy there), rather than the patterns fading into the ground, leaving the intent as figure--as it's my understanding that they should.

Also, I'm reminded of the Lisp and Smalltalk programmers' idiom of buiding new control structures as required (and them being incorporated into the standard images if they prove to be broadly applicable) without much in the way of tricky language extentions. Is this not the same sort of thing as you're disucssing? I'm sure that it can't be that you need to introduce datatype-genericity cleverness to work around the shackles of static typing ;)

Shackles? I love 'em

Lisp and Smalltalk are very flexible, but untyped (or rather, dynamically typed). So yes, I guess it is the shackles of static typing I am trying to throw off. But I'd prefer to think of it as obtaining the flexibility of Lisp or Smalltalk while maintaining the benefits of static typing.

refactoring?

I will second the question about the correspondence claim, and add another: What is the relationship between fission (and program comprehension generally) and refactoring as a syntactic formal code transformation, specifically?

There seem to be at least one dubious assertion in the paper:

functional programming encourages the definition of tailor-made datatypes, rather than shoe-horning a problem into one of a fixed collection of general-purpose but sometimes ill-fitting datatypes.

On the other hand, while I do not know the meaning of the word "architectures" in the slogan, I do think that the final results of the transformations express rather deep differences between the initial implementations.

the whole hog

By making yet another design decision (nonblank < blank) it's possible to go beyond games involving hiding the parity in the pc, or deciding whether to chop the pig from snout to tail or tail to snout, and simply process the whole hog (well, everything but the snoc) in straight line, noniterative, nonrecursive, code.

with apologies to APL...

 from Numeric import array, sum def wc(str): def length(msk): return sum(msk) def words(str): # Gibbons' unlengths are nondecreasing runs, eg 00001111 def decreases(m): return m[1:] < m[:-1] def blank(a): return (a==' ')+(a=='\t')+(a=='\n') return decreases(blank(array(' '+str))) return length(words(str)) 

:: :: ::
and while we're on the subject of relations: I believe that fission is supposed to imply moving uphill, from a more concrete implementation to a more abstract design, whereas refactoring usually refers to hiking along a contour, transforming one implementation into another.