Managing missing information through Condition Propagation

I'm a bit nervous to be posting here, but I'm interested in feedback from the many bright minds who visit this site on an idea I've been considering for a while:
Here is my official blog posting on the subject

The essence is this:
"Under Condition Propagation missing information is explicitly noted and antecedently avoided. Rather than extend logic systems or stop execution at run time, the compiler in-effect ensures that expressions are never evaluated that do not satisfy the conditions. The logical implications of this are significant. Unlike null-like solutions, operators, and thus truth-tables, remain logically unaffected by the missing information. For example, the AND truth table remains in it's traditional, four entry, form. Each primitive operator may publish compiler meta-data regarding the conditions under which the operator is able to operate. With this information the compiler ensures that any conditions are resolved prior to evaluation. The compiler also propagates the condition meta-data to allow the developer to choose at what point prior to evaluation to resolve the condition. Logic, arithmetic, relational and other operations can all be defined in their true logical form, unaware of missing information because they will never be requested to deal with it."


This sounds like an odd

This sounds like an odd version of abstract interpretation?

Here's the original paper by Cousot and Cousot
The Original POPL 77 paper

Could someone who has more experience with abstract interpretation step in?

Abstract interpretation? I'm not sure I see the relevance.

I could possibly see how an implementation of condition propagation might "do" abstract interpretation at some level, but I would not construe condition propagation to be a "version" of abstract interpretation. Unless I am mistaken, abstract interpretation is a name for, and a more formal description of, what we do when we analyze code for a purpose other than ordinary execution. Condition propagation is a specific proposal for managing missing information through logical layering. The two concepts seem to be at different levels.

How about you read the paper

How about you read the paper and then explain to me where the similarities and differences between abstract interpretation and your "condition propagation" lie? The paper I linked to only works through a few simple examples, but it should give you a more precise understanding of what abstract interpretation IS.

I certainly feel a lot dumber. ;-)

I installed a new PDF reader and was able to see the paper (it was garbled in my old reader). I can see that abstract interpretation gives an apparently unified way of reasoning about programs in a given language. One application of abstract interpretation in the vein of CP is the verification of boundaries; the paper even specifically gives the example of verifying array bounds in a Pascal-like language. If the task were to produce, or at least approximate, the boundary conditions of a given program, I could see abstract interpretation being invaluable. If, on the other hand, the conditions were hard-wired into the primitive operators and propagated from there, that complex analysis could be avoided. I do see one possible sticky place for the compiler, however: determining which condition(s) are satisfied by a particular resolution expression. This may be a place where abstract interpretation could help, because it involves proving that a particular condition is satisfied by a given assertion. Anyway, thanks for the link; maybe I'll understand it better on another read. ;-)

I found the 1977 paper a

I found the 1977 paper a tough read, too. What helped me was to work through Dan Friedman and Anurag Mendhekar's Using an Abstracted Interpreter to Understand Abstract Interpretation to get a hands-on hacker's intuition before trying Cousot and Cousot again.

Interesting

Anything that tries to address the foibles of software is worth a look. Pardon my confusion: how much of CP is static vs. dynamic? I wonder if the CP formulation could, in effect, be translated into a larger but more traditional system as a demo or a means of explaining it; presumably, the translation would insert many conditionals and alternate code paths. Similarly with your row/record examples: I think the more traditional way would be to have N related structs, one per possible "state" of the row, and CP could be translated/expanded into that.
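
To make that concrete, here's a rough Python sketch (all names invented) of the N-structs translation: one class per possible "state" of the row, with the row type a tagged union over them.

# One class per possible "state" of the row, as suggested above; a row is
# then a tagged union over these states. Names are illustrative only.
from dataclasses import dataclass
from typing import Union

@dataclass
class FullRow:            # every column present
    key: int
    name: str
    age: int

@dataclass
class RowWithoutAge:      # the 'age' column is missing
    key: int
    name: str

Row = Union[FullRow, RowWithoutAge]

def describe(row: Row) -> str:
    # The "condition" (is age present?) is decided by the static shape
    # of the row rather than by a null check at evaluation time.
    if isinstance(row, FullRow):
        return f"{row.name}, age {row.age}"
    return f"{row.name}, age unknown"

print(describe(FullRow(1, "Ada", 36)))
print(describe(RowWithoutAge(2, "Alan")))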

I'm probably not understanding this...

...but isn't this some variation on a lazy evaluation strategy? That is, if the value is unknown, wrap it up in a function which can be forced at a later point in time?

Proactive... not lazy

The aim is similar (avoid evaluating), but lazy evaluation puts off obtaining values in general, in hopes that the value will not actually be required. CP avoids evaluating in order to contain boundary conditions. CP operators expose the conditions under which a value may be obtained and let the evaluator decide whether to: a) satisfy the conditions and evaluate, b) return some other value or perform some other action, or c) propagate the condition and decide "earlier" in the expression what to do.

Consider this:

insert RowX into TableY
  when not RowUnique then
    update TableY set RowX

And this:

if (insert RowX into TableY) satisfies true then...

The former example inserts the row if it isn't in the table, or updates it if it is. Note that unlike exception handling, no attempt is made to actually insert the row and then fall back to an update. Instead, the same check that the system must already perform to avoid a key violation is used, and the developer decides what to do if that condition fails. Note that the condition only needs to be evaluated once.

I suspect that CP and lazy evaluation are orthogonal, and can imagine a language that does both CP and lazy evaluation. ...Interesting.
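
To make the contrast concrete, here is a rough Python sketch (the names are mine): a lazy thunk defers the failure until it is forced, while a CP-style condition is a separately queryable predicate that lets the caller decide before any evaluation happens.

# Lazy evaluation: wrap the computation in a thunk; the divide-by-zero is
# only discovered if and when the thunk is forced.
lazy_value = lambda: 10 / 0        # nothing happens yet
# lazy_value()                     # forcing it here would raise

# Condition propagation (toy version): the condition is a first-class,
# separately queryable predicate, so the caller can test it *before*
# deciding whether to evaluate at all.
def divide_condition(x, y):
    return y != 0                  # the DivideByZero condition

x, y = 10, 0
if divide_condition(x, y):         # a) satisfy the condition and evaluate
    result = x / y
else:
    result = 0.0                   # b) return some other value instead

print(result)                      # 0.0; the division was never attempted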

vs. exceptions

Note that unlike exception handling, no attempt is made to actually insert the row and then fall back to an update. Instead, the same check that the system must already perform to avoid a key violation is used, and the developer decides what to do if that condition fails. Note that the condition only needs to be evaluated once.

I don't see the difference. With exceptions, 'insert RowX into TableY' for a duplicate row would fail as soon as the key violation test is performed, and control would transfer to the exception handler. The test for row uniqueness is performed the same number of times (once for the insert, once for the update).

Similarly, in examples like the array boundary and parsing examples, I don't see how the described system is observably different from a system with resumable exceptions. E.g. when an array access exception occurs, an exception handler would be called essentially as an ordinary function, and one of the choices available to the handler is to return a value like Point(0, 0) to the point at which the exception occurred.
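
For illustration, here's a rough Python simulation of that resumable behavior (Python's exceptions are not resumable, so the handler is passed in explicitly; all names are mine):

# Python exceptions unwind the stack, so resumability is simulated here by
# passing the handler to the point of the potential failure. The handler's
# return value becomes the result at the point the "exception" occurred.
def point(x, y):
    return (x, y)

def safe_index(arr, i, on_out_of_bounds):
    if 0 <= i < len(arr):
        return arr[i]
    # Call the handler "essentially as an ordinary function" and resume
    # with whatever it returns, e.g. Point(0, 0).
    return on_out_of_bounds(arr, i)

points = [point(1, 2), point(3, 4)]
print(safe_index(points, 5, lambda a, i: point(0, 0)))  # (0, 0)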

Sparse rows look like a different sort of feature to me. Without working out the implementation details myself, I can only take it on faith that their implementation mechanism via CP would be similar to that of the other examples. It'd be useful to see even a sketch of a toy implementation. That'd also help answer the various questions about how CP relates to abstract interpretation, lazy evaluation, and dependent types. ;)

vs. exceptions

I assume your perception is that an exception version of the same logic would look like this:

try
  insert RowX into TableY;
catch
  on KeyViolation do
    update TableY set RowX;
end;

In that case, you are correct that the check is only performed once, but the prescribed methodology for exception handling is to avoid exceptions and keep them "exceptional". Thus, a "proper" implementation based on exceptions might be:

if not exists (TableY where KeyCol = RowX.KeyCol) then
  insert RowX into TableY
else
  update TableY set RowX;

That is, unless there is an operator like:

TryInsert(Table : Table, Row : Row) : Boolean;

As I state in my main posting on the subject, exception languages have evolved toward such operators precisely to allow exceptions to be avoided. To restate another point: when is something an exception, and when is it an expected outcome? The current mantra in the .NET space, for example, is to avoid exceptions altogether. Under this view, exceptions become almost synonymous with defects: places where the developer did not adequately avoid them.

Beyond the philosophical reasons, there are also practical reasons to avoid them. There is an often-costly performance hit, not just due to particular implementations but by nature: an exception isn't as "cheap" as a simple branch, which is another reason the "TryParse" pattern has evolved.
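
For example, a quick Python sketch of the Try pattern's shape (the helper is hypothetical; in Python it must still be built on a raising primitive, but the caller sees only a branch):

# A "TryParse"-shaped helper: failure is reported through the return value,
# so the common failure case costs the caller only a branch and no
# exception escapes. (Python's int() raises, so this wrapper still pays
# for the exception internally; a native Try operator would simply test
# the condition.)
def try_parse_int(s):
    """Return (True, value) on success, (False, None) on failure."""
    try:
        return True, int(s)
    except ValueError:
        return False, None

ok, value = try_parse_int("42")
print(value + 1 if ok else "no value")   # 43

ok, value = try_parse_int("not a number")
print(ok)                                # False; the caller just branches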

As the latter example in my prior post shows, it is also important to be able to query a condition without attempting execution. Again, with exception handling it is incumbent on the developer to know and anticipate the conditions of any given operation in order to avoid an exception. This becomes increasingly difficult or even impossible for complex operations like database manipulation. It is impossible when the ability to test the condition without attempting the operation is not exposed, which it often isn't.

CP is largely about splitting the condition from the operation. This allows the operation's conditions to be evaluated independently of the operation, and it allows the conditions to be resolved earlier in the expression than the point of evaluation. In usage this is not unlike null propagation: conditions can be addressed at different places in the expression.

select Rotate(Translate(Point((5 / X) * 100, Y)), 45) resolve DivideByZero as Point(0, 0);

Unlike null, however, the compiler ensures that the DIV operator never receives a zero denominator. This avoids the run-time null-propagation logic as well as the boundary checks within the actual operators.

Again, the conditions can also be queried independently:

select Rotate(Translate(Point((5 / X) * 100, Y)), 45) satisfies DivideByZero; // true if X isn't 0
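
Here is a rough Python model of how "resolve" and "satisfies" might behave (the Expr machinery is invented for illustration; the proposal would do this work in the compiler):

# Toy expression objects that carry condition metadata upward, mirroring
# the 'resolve'/'satisfies' examples above.

class Expr:
    def __init__(self, evaluate, conditions):
        self.evaluate = evaluate        # thunk producing the value
        self.conditions = conditions    # dict: condition name -> predicate

    def satisfies(self, name):
        # Query a condition without attempting the operation.
        return self.conditions[name]()

    def resolve(self, name, fallback):
        # Choose, at this point in the expression, what to do if the
        # condition fails; the failing operator is never evaluated.
        if self.satisfies(name):
            return self.evaluate()
        return fallback

def div(x, y):
    return Expr(lambda: x / y, {"DivideByZero": lambda: y != 0})

X = 0
expr = div(5, X)
print(expr.satisfies("DivideByZero"))     # False when X is 0
print(expr.resolve("DivideByZero", 0.0))  # 0.0; the division never runs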

One more point regarding exceptions: the exception-embracing languages that I know of don't allow exceptions to be resolved within an expression context. I don't necessarily see a reason why such languages couldn't be extended to support this, but most of them are not richly expressive, so I assume this simply hasn't been perceived as a major obstacle. Languages that are more expressive, however, would clearly need some sort of try...catch expression. I can only imagine the nightmare this would create for an optimizer that can't "see" the conditions leading to these exceptions.

Regarding sparse rows: I agree that this concept is largely distinct from CP. It is clear that CP could exist without sparse rows, but the converse is not true. Frankly, of the two ideas, I am much more comfortable with CP. I think I could build an implementation of CP right now, while I'm not sure I could say the same about sparse rows. I agree that a prototype would be a good way to exhibit the concept. Perhaps I'll get a chance to modify our D4 language to support CP.

Dependent types?

How (joining in the "guess the related topic" game) does this differ from dependent types?