A System to Understand Incorrect Programs

An ancient paper (July 1978: 30 years ago) from the long-gone Lisp Bulletin, by Harald Wertz.

The paper describes a system that attempts to improve incompletely specified Lisp programs without, however, resorting to more information in the form of specifications, test cases, or the like.

A second paper on the system is Stereotyped Program Debugging: an Aid for Novice Programmers.

In the same vein

I would also recommend a paper by Daniel Goossens, who happens to be both an AI researcher and a first-rate cartoonist; alas, it is only in French: La Méta-Évaluation au service de la compréhension de programmes (Meta-Evaluation in the Service of Program Understanding).
He was on the same team as Wertz, Greussay, and Chailloux. I was around at the time, and I can tell you that the true motivation for those "program understanding and improving" tools was to quickly grade and correct their students' assignments.
There is an especially interesting idea in Goossens' paper about organising the (meta) control flow around alternations of computations proper (what he calls "points-de-rupture", break points) and trees of pattern matching on the results of those "points-de-rupture".
The aim was to improve the intelligibility of the underlying logic, especially when it was faulty, so that the students could work nearly on their own.

U.S. Air Force interested in Understanding Incorrect Programs

When I gave papers at conferences sponsored by, and won contracts from, the RADC (Rome Air Development Center) of the U.S. Air Force, they were rather interested in this work.

Of course, they excluded Lisp from the list of Military Standard allowed languages (in favor of Ada, Fortran 77, Jovial J3, Jovial J73, CMS-2, and the like).

But they were fearful that the supply of programmers available for DOD contracts was an order of magnitude less than the demand, even as the cost of maintaining legacy software often hit $1,000 per line of code.

They had a notion that automated aids to programming, with built-in program understanding, requirements tracing, automated test generation and interpretation, pattern matching, meta-logic, semantics, and so forth, would allow a 10-fold increase in the rate of programming, and of detecting and correcting bad code, without labor-intensive reverse engineering of the legacy code back to an inferred set of requirements and specifications, which could then be modified, the modifications trickling down to new code that was better documented, verified, and validated.

I confess that my colleagues and I were happy to present the papers (at, for instance, the Goddard/University of Maryland/CSC venues) and to take the R&D money, whether or not we believed in the likelihood of success.

I shall not go into my experience supervising the reverse engineering of mission-critical Space Shuttle code (some 20,000,000 LOC, differing for each flight, in an arcane language, HAL/S, also used only on certain Shuttle payloads, on hardware from vendors no longer in existence). But NASA, like the FAA, had a strong and plausible aversion to anything smacking of Artificial Intelligence. The Pentagon was afraid of the HAL 9000.

Some neat examples from the 1980 paper

You don't need much Lisp for this, except maybe to know that "de" is the defining function, with the same syntax as "defun" in Common Lisp. These examples are drawn from the 1980 paper, and depend only on some knowledge about spelling errors, about Lisp primitives, and about the domains of S-expressions and numbers:
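
For readers who have never seen "de", here is a minimal illustration of the correspondence (the "square" function is just a made-up example, not from the paper):

(de square (x) (times x x))       ; definition in the old dialect used in the paper
(defun square (x) (* x x))        ; the same function written with Common Lisp's defun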

Input #1:

(de rev l1 l2 cond ull l22 a1 t rve a1 ons cra a1 a2)

Syntactic repair:

(de rev (l1 l2)
  (cond
    ((null l2) l1)
    (t (rev l1 (cons (car l1) l2)))))

Semantic processing, step 1:

(de rev (l1 l2)
  (cond
    ((null l2) l1)
    (t (rev (cdr l1) (cons (car l1) l2)))))

Semantic processing, step 2, assumed missing base case:

(de rev (l1 l2)
  (cond
    ((null l2) l1)
    ((null l1) l2)
    (t (rev (cdr l1) (cons (car l1) l2)))))

Semantic processing, step 2, assumed inverted arguments:

(de rev (l1 l2)
  (cond
    ((null l1) l2)
    (t (rev (cdr l1) (cons (car l1) l2)))))
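
For what it's worth, this last repair is just the usual accumulator formulation of reverse. Assuming the dialect above, calling it with an empty accumulator gives:

(rev '(a b c) nil)        ; => (c b a)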

Input #2:

(de fact n times n fact n)

Syntactic repair:

(de fact (n) (times n (fact n)))

Semantic processing:

(de fact (n) (cond
  ((le n 0) 1)
  (t (times n (fact (sub1 n))))))
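
As a quick sanity check on the repaired definition (assuming "le", "times", and "sub1" are the usual numeric primitives):

(fact 5)        ; => 120
(fact 0)        ; => 1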

Input #3:

(de addit m n (((zerop n) m))
                   (t (addit sub1 m add1 n)))

Syntactic repair:

(de addit (m n)
  (cond
    ((zerop n) m)
    (t (addit (sub1 m) (add1 n)))))

Semantic processing:

(de addit (m n)
  (cond
    ((zerop n) m)
    ((le m 0) n)
    (t (addit (sub1 m) (add1 n)))))

Ugh: scanned dot-matrix

Read a (nearly-verbatim) text version here:

http://www.paci.us/copyright-violations/wertz.txt

Attractor Dynamics

Given these examples, it would appear that you could type pretty much any gibberish into the system and it would "repair" it into either factorial or reverse. Note the second example: is the input program really more consistent with factorial than with, say,

f(n) = n * (n-2) * ...

or f(n,k) = (n choose k)

or any of a large suite of combinatorial numeric functions?
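
To make the point concrete, here is a sketch of my own (not from the paper) of how small an edit would take the syntactic repair to the double factorial n * (n-2) * ... instead, written with the same primitives as the examples above:

(de fact (n) (cond
  ((le n 1) 1)                              ; base case for the double factorial
  (t (times n (fact (sub1 (sub1 n)))))))    ; recurse on n-2 instead of n-1

That is one base case plus one change to the recursive argument away from the syntactic repair, which is roughly the same size of edit the system made to reach factorial.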

The last example (addit) is really scary: it added a clause that turned what looked like a partially correct program into a seemingly incorrect one.

In modern terms, we might say that the system has a strong prior peaked at FACTORIAL and REVERSE (and probably APPEND).

This seems like a lot of glitter, but I don't see the depth.

I think you are right. The system seems very biased, and indeed I am rather skeptical about this type of system in general. Historically, however, this is an interesting attempt and a precursor of more advanced systems developed later.