Algebraic

I program in scheme. And scheme being expression based means that its possible to do algebraic programming -- or so I heard. My question is, what does it mean to do 'algebraic programming'. I have also read that some people where able to reduce (or simplify) their code algebraically. How does this work? I'm eager to understand this.

Comment viewing options

Auto or Just easy?

First of all, do you mean automatically, with some sort of function? If so, it is schemes s-expr syntax that makes this easy. I think there is an example of this in SICP. Unfortunatly, because scheme is an imperative language, automatic algebraic simplification cant do much more than simple arithmetic expressions.

For those expressions that can be reduced, algebra systems do this by looking for patterns that have a known reduction. For example a sum where one of the terms is a sum can be flattened: (+ 1 0 (+ 5 6) 7 0 9) = (+ 1 0 5 6 7 0 9). Zeros in a sum can be removed: =(+ 1 5 6 7 9). Sums of one term are just that term, etc etc etc. Factoring and such are the really useful operations, but are much harder to do.

Scheme isnt really much more expression based than any other language, except that things like if and begin are expressions. Of course, even C has ?: and ,.

Scheme is really not the

Scheme is really not the best place for this. Haskell is much better. Many functional pearls have nice example, or read the gentle introduction in The Little Haskellist.

Scheme at least has some history here

I'm not sure it's fair to say that Haskell is much better than Scheme in this regard. Perhaps the Haskell literature is richer. Not a debate I think I'm ready to have, though.

For just getting the basic concept, I'm sure Scheme is pretty awesome. In particular, although its style is not modern or mathematically sophisticated, I'd suggest the Rabbit Thesis, calling attention to the various simplifying transforms it uses.

side effects

Inlining implies algebraic simplification, but the presence of side effects makes some inlining impossible, because one must preserve the order of effects.

For example, in (let ((x X)) (y x)), one must prove either X or y to be without side effects in order to inline to (y X). That's not always possible to do in Scheme.

I found "Fast and Effective Procedure Inlining" to be enlightening.

http://www.cs.indiana.edu/~dyb/pubs/inlining.pdf

side effects v. algebra

For example, in (let ((x X)) (y x)), one must prove either X or y to be without side effects in order to inline to (y X). That's not always possible to do in Scheme.

Sorta. Strictly speaking(!), you "only" have to prove that if X or y have side effects that the outcome is the same no matter in which order the two subexpressions are evaluated. (You could also be doing an implementation-specific optimization in which case you might be able to do that inlining without concern for side-effects.)

Of course, to simplify the code to a LET-free dialect, ((lambda (x) (y x)) X) is always a safe algebraic transformation and is closer to Scheme's significance in history, here.

I'm not so sure that the presence of side effects "spoils" Scheme's algebraic properties. By analogy, in arithmetic, the transformation from a = b * c to a / c = b is only valid when the "lexical context" makes it apparent that c can not be 0.

Similarly, Scheme's lexical scoping rules (R6RS notwithstanding) give programmers fairly straightforward and workable control over when and where the possibility of side effects is apparent to a code transformer.

For example, in (let ((x X))

For example, in (let ((x X)) (y x)), one must prove either X or y to be without side effects in order to inline to (y X).

Why? Edit: Oh I see you meant that y could be an arbitrary expression, not just a variable.

Maybe try Pure?

I'm not sure whether this is what you're looking for, but Pure is based on term rewriting, so it allows you to simply write down algebraic rules like these (disjunctive normal form of logical expressions):

// eliminate double negations:
~~a		= a;

// push negations inward (de Morgan's laws):
~(a || b)	= ~a && ~b;
~(a && b)	= ~a || ~b;

// distributivity laws:
a && (b || c)	= a && b || a && c;
(a || b) && c	= a && c || b && c;

// associativity laws:
(a && b) && c	= a && (b && c);
(a || b) || c	= a || (b || c);

// Example:
a || ~(b || (c && ~d)); // yields a||~b&&~c||~b&&d


It also has a kind of local rule sets which is useful for doing algebraic simplifications depending on context:

expand = reduce with
(a+b)*c = a*c+b*c;
a*(b+c) = a*b+a*c;
end;

factor = reduce with
a*c+b*c = (a+b)*c;
a*b+a*c = a*(b+c);
end;

expand ((a+b)*2); // yields a*2+b*2
factor (a*2+b*2); // yields (a+b)*2


Note that while Haskell has function definitions by pattern matching which look pretty much like Pure's rewriting rules, you can't directly use general rewriting rules like (a+b)*c = a*c+b*c in Haskell because they violate the "constructor discipline". Instead, you'll have to encode expressions as an algebraic data type and then define functions working on these (which essentially amounts to building a term rewriting interpreter on top of Haskell).

You can find Pure at http://pure-lang.googlecode.com/.

Another possibility would be to use a computer algebra system like Mathematica. These usually allow you to evaluate expressions using your own rewriting rules, too. Pure is probably more efficient, though, since it compiles the rewriting rules to native code. And it also offers the option to apply rewriting rules at compile time as a kind of rewriting-based macro facility.

Pure...

...is on my short list of dynamically-typed languages to learn (Oz being the other). Thanks for the great work!

RFC(larification)

Because

1. I'm not sure of the topic myself
2. no one seemed to be interested in the original question as much as they are in evaluating the assertion of Scheme's fitness for it
I'd like to recap here:

Is simplifying elementary algebra formulae what ... it mean to do 'algebraic programming'"?

According to a Mathematica article (http://www.mathematica-journal.com/issue/v9i2/contents/AlgebraicProgramming/AlgebraicProgramming_1.html) it is solving programming problems by using concepts of abstract algebra, such as distributivity, commutativity, and associativity which sounds ethereal enough to satisfy an Agda programmer! (just kidding) Anyway, could someone expand and expound more on what "AP" is, please?

Hmm...

"simplifying elementary algebra formulae"
"solving....problems by using concepts of....algebra, such as distributivity, commutativity, and associativity"

These seem to be saying exactly the same thing.

I actually took the question

I actually took the question at face value, and suggested where you can find clear examples of algebraic (equational) reasoning. The link I gave, by the way, presents Scheme code...

We have tons of links to articles etc. that demonstrate the power (and limitations) of algebraic manipulation in the archives. The classic reference, I supposed, would be the book The Algebra of Programming.

I took "algebraic

I took "algebraic programming" to mean heavy use of term rewriting - that is substitution/simplification based on some notion of equality. FWIW, I have no idea why Scheme is particularly relevant to that, but it is a little easier to efficiently program this type of thing in a language built in GC and compiler support for atoms.

I would like a trivial

I would like a trivial example of where algebraic cancellation/simplifying is practiced (using scheme syntax). The reason for my curiosity stems from this short paper:

http://www.cs.indiana.edu/~jsobel/c455-c511.updated.txt

Because of its extremely algorithmic---almost
mathematical---nature, Scheme can be easily manipulated in a sort of
algebraic style. One can follow a series of rewrite rules (just about
blindly) to transform a program into another form with some desirable
property. This was exactly what I needed.

I have problems wrapping my brain around what exactly he might have done. (a trivial use-case would go a far way)

List monoid

A simple case is lists. List concatenation and the empty list form a monoid, which means these laws hold; for lists xs, ys and zs:

(append (append xs ys) zs) = (append xs (append ys zs)),
(append nil xs) = xs = (append xs nil).


The first law says append is associative. The second two are identity laws analagous to 1 * x = x = x * 1.

Some sketchy definitions

I will give you a couple distinct, sketchy definitions; "sketchy", because "algebraic programming" is not exactly a well-defined term.

First of all, an "algebra" in this sense is a type X together with a set of operators with types of the form F(X) -> X, that is, all the operators "construct" the type X. You will note that familiar operators from arithmetic all have this type; for example, multiplication and addition both have type X * X -> X. Unlike in Haskell datatype constructors, though, these operators might satisfy nontrivial laws, which is to say that they are not just constants.

I suspect the most common notion of "algebraic programming language" is a language in the OBJ family, which includes Maude and CafeOBJ. These are languages based on term rewriting, where algebras sort of fill the role of program modules. They are roughly similar to FP languages, except that there are no higher-order functions and programs rewriting is not necessarily confluent/Church-Rosser. This implies that, although there are no side effects, you have to pay attention to the order in which things are evaluated.

Another notion of "algebraic programming language" is "functional programming language (possibly) minus higher-order functions". There aren't many of these, probably because adding higher-order functions is not a huge hardship. (The main reason to avoid is contravariance.) There used to be a language Opal which belong to this class, but I think they ended up adding HOFs. Charity is arguably in this class as well, but it also eventually gained HOFs.

In the broadest sense, you could try to characterize algebraic programming by some distinguishing features, not all of which are technical. The most important thing, if you ask me, is a sense that there is a clear and close relationship between the language and its semantics. In other words, an important goal of such a language is that it has rigorous semantic foundations. It means that programs are composed of "terms", which are built up from operators and constants and variables. It means that programs are executed by rewriting terms. It usually means that rewriting is only allowed when terms are equal. More concretely, it means that substitution of equals works -- that is, you have referential transparency (though not necessarily, as I said, confluence).