Rewriting rules for deducing properties of functions

Commutativity of operations and functions can be a useful thing for a compiler to know. So how do you you tell the compiler an operator or function is commutative (or associative, etc.)? One obvious option would be to let programmers tag their functions. This is bound to introduce subtle and nasty errors in to the code if the programmer makes a mistake. Not to mention it is inelegant, and it isn’t the kind of thing I want to see mixed in people's code (optimization hints that is).

So the big question is, how do we automatically recognize proprerties of functions like commutativity? In Cat (like Haskell) there is already a system for expressing rewriting rules (e.g. rule { $a $b f } => { $b $a f }). This fairly obviously demonstrates the commutativity of 'f'. In fact if we can follow any sequence of rewriting rules to arrive at this equality (which is a simple graph search problem), then we can also deduce commutativity.

I am trying to track down related research on the subject. If anyone has any pointers, I would be most appreciative.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Term Rewriting and All That

The book Term Rewriting and All That might be relevant to your question. In particular, chapter 10, Equational Unification, perhaps discusses related problems. Disclaimer: While I have a copy of the book, I have not yet read it; the book is on my queue of books and papers to read.

Thanks

Thanks Vesa. I wonder if this is a worthwhile book to purchase and read? If anyone has comments on it, or other suggestions for a good textbook on term rewriting, please throw them at me. Thanks!

A Good Book

The book Term Rewriting and All That might be relevant to your question.

I can confirm that this is a good book, and I would recommend it to anyone who wants to think deep thoughts about term-rewriting, lambda calculus and computation.

I don't know if it will help with the originally-posted problem in immediate practical terms, but if anyone wants to think hard about the question, it is probably a good book to absorb as background.

Quoting from Principles of

Quoting from Principles of Programming with Complex Objects and Collection Types:

"Unfortunately, even for simple programming languages featuring structural recursion (together with a few basic constructs such as the ability to manipulate pairs), asking whether equational conditions like associativity, left-idempotence, commutativity, etc., are true about program phrases is undecidable, in fact, not even recursively axiomatizable."

Disclaimer: While I have read the paper, I have not yet understood it; the paper is on my queue of books and papers to understand.

Undecidability not a big concern

Hi Roly, thanks for the link.

"asking whether equational conditions like associativity, left-idempotence, commutativity, etc., are true about program phrases is undecidable"

Yes, this is fairly obviously true. However decidability is not a deterrent: we can still search for proofs of certain properties of functions within the infinite space created by rewrite rules. Whatever properties we discover proofs of, we get to use.

So to give an example, consider the following rewriting rules:

rule { $a $b f } => { $a $b g }
rule { $a $b g } => { $b $a f }

We can follow these rules and find something useful, we just aren't always guaranteed to find something useful, and to get to the end of the search.

It just like ordinary optimization: finding the fastest program is undecidable, but we can still look for faster programs.

Proof

If proofs could be provided to the language, then the compiler could be assured of the correctness of the annotations and subsequent optimisations.

Some automated algorithms will be able to do these proofs in some cases. For instance, associativity and commutativity of addition are provable automatically using distillation (an extension of supercompilation). In this case we would have to trust the correctness of the automated procedure.

Alternatively you could prove it manually. This requires a proof system that can be used to reason about the term language and a trusted proof checker. Isabelle and Coq have such facilities and could, in principle, be used for proofs for optimisation.

I think I saw something by Alberto Pettorossi, Maurizio Proietti, that talked about utilising associativity and commutativity for program transformation, but it was for logic-programming and I can't seem to find it right now.

Found the Pettorossi paper

I believe I found the Pettorossi paper, thanks Gavin.

Automatic Derivation of Logic Programs by Transformation (2000) Alberto Pettorossi, Maurizio Proietti

Abstract: We present the program transformation methodology for the automatic development of logic programs based on the rules + strategies approach. We consider both definite programs and normal programs and we present the basic transformation rules and strategies which are described in the literature. To illustrate the power of the program transformation approach we also give some examples of program development. Finally, we show how to use program transformations for proving properties of predicates...

I will look closely at this, it seems to be exactly what I am suggesting.

Two problems

There are two problems that you are mixing there: it is clear that deciding if a function is commutative is too hard, but providing the compiler with a proof is possible to do in an elegant way in a system with dependent types. You just have a proof term for the commutativity of your function that the compiler can check and use. Related to your question, there are tools to do generalized rewriting (in Coq at least) that allows one to rewrite using arbitrary lemmas as rewrite rules. This can be used to do automatic optimization, if you know a strategy to apply the rules. Moca http://moca.inria.fr/eng.htm is also related: it is an extension to ML which handles invariants like commutativity on binary constructors.

"providing the compiler with

"providing the compiler with a proof is possible to do in an elegant way in a system with dependent types. You just have a proof term for the commutativity of your function that the compiler can check and use."

This is a good point. Thank you for sharing it.

"it is clear that deciding if a function is commutative is too hard"

I am convinced that it is in fact not too hard to be useful. Why do you say that? I suspect you are referring to the general case, which as I mentioned earlier is not neccessary to try and solve. I am not trying to prove that all functions are commutative or not, just trying to find as many commutative functions as possible.

While I appreciate the links for using lemmas as rewrite rules, I should point out that I am proposing the opposite: using rewriting rules as axioms (or lemmas). I am trying to figure out whether such research has some novelty.

Not too hard.. to be useful

Of course I do not dispute the fact that inference of properties is useful, I'm quite happy when people design programs to do work humans would have to do otherwise (isn't that the whole point?). People have already provided links on distillation and rewriting-based techniques for proving equations. There is also the rippling technique which might be of interest: a kind of mix between heuristics and rewriting for automated deduction.

Rippling is cool

This ripppling technique is very interesting, I was not aware of it, thank you very much Matthieu.

This is a good reference for

This is a good reference for rippling. Although it's a bit old now, and doesn't cover some recent developments (dynamic rippling, relational rippling etc.)

There's also, in addition to the rippling book, a chapter in the Handbook of Automated Reasoning (Chapter 7, pgs 876 onwards) that covers some material on rippling.

Working on similar problems..

I'm working on a similar problem; using rewrite rules as axioms for program transformation & proofs (specifically optimization and correctness). I've found much of the term-graph rewrite literature very helpful as background, they have good definitions for what it means to perform rewrites and what types of rewrites will achieve certain theoretical properties (normalization, first-order axiomatization, etc).

I've also spent some time digging through the global optimization literature to check that the graphs produced in this process can be searched in a reasonable amount of time; it appears so (they report successful searches over millions-of-nodes regularly now).

I agree, undecidability does not appear to be an issue for this kind of system, it simply places limits on what can be discovered in a reasonable amount of time. And once found, that derivation need not be searched for again.

As for books/references, I think the gold standard here is "Algebra of Programming" by Bird & DeMoor. I have yet to find a copy, unfortunately (you can see I was looking for a copy last year in that thread)...

Papers I've found insightful:
"Denali Superoptimizer"
"Automatically Restructuring Programs for the Web"

Great stuff

Thank you very much for the links Bryan!

Twelf

It might be worth taking a look at the way that the Twelf folks automate some structural induction: Frank Pfenning and Carsten Schurmann had a CADE98 paper, Automated Theorem Proving in a Simple Meta-Logic for LF introducing their technique. The Twelf wiki has a page showing how such automated proofs work in practice, using commutativity as an example.