Foundations for J, APL, etc.

I have been playing around with Haskell, and am finding ADTs a little *choke* restrictive. It seems every evaluation has to be done in the order of creation.

I am wondering if there is a non-functional approach to pattern-matching that can handle abstract operations on arrays and other complex data structures.

I am thinking in the direction of APL, but more a language in which the APL primitives can be built from scratch.

From a scan of Barry Jay's publications, his pattern calculus might be a good fit.

Maybe I can already do this in Haskell and just don't realise it? Perhaps 'active patterns' will save me?

Alix

A recommendation

it seems every evaluation has to be done in the order of creation

Not sure what you mean by this. (Or even whether this will end up being on-topic for LtU.)

I would recommend that you give a snippet of code showing what you are trying to do and how existing language features prevent you from doing it, and then at least we can assess the question.

(Depending on the issues, this may be a question for the Haskell mailing list.)

I don't really understand what's being asked...

However, it sounds like you may be asking for generic functions as in Common Lisp or Factor (and others).

I mean that the encoding of

I mean that the encoding of any data type entails dealing with that data type in a specific way.

e.g.

data Tree = Empty
          | Leaf Int
          | Node Tree Tree


An algorithm that processes the tree needs to decompose it in terms of the given constructors, and must proceed from the root or a known branch.

There is no way to pattern-match against arbitrary structures within the tree, for example all nodes at a certain level, without evaluating from the top. And then I wonder how to deal with arbitrary perspectives on multi-dimensional arrays.
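To make that concrete, here is a sketch of collecting the leaf values at a given depth in the Tree above (the helper name is mine, nothing standard): even this simple query has to thread its way down from the root, one constructor at a time.

leavesAtDepth :: Int -> Tree -> [Int]
leavesAtDepth _ Empty      = []
leavesAtDepth 0 (Leaf n)   = [n]
leavesAtDepth _ (Leaf _)   = []
leavesAtDepth 0 (Node _ _) = []
leavesAtDepth d (Node l r) = leavesAtDepth (d - 1) l ++ leavesAtDepth (d - 1) r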

I guess in the system I am describing, there is no concrete encoding, only inferred structure. The description of the data, and of how the data is evaluated, is worked out by the compiler.

I hope that makes sense. Maybe I need to look into relational theory again.

Until next time...

It seems like you want something like a view

See "A simple implementation technique for priority search queues" for an example. "Views for Standard ML" is something I haven't looked at yet.

There are probably newer techniques for using these types of things; I haven't looked into it too much.
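For flavour, here is a minimal sketch of the idea using GHC's ViewPatterns extension (the SnocView name is illustrative): an ordinary view function lets callers pattern-match on the last element of a list, a structure the [a] type doesn't expose directly.

{-# LANGUAGE ViewPatterns #-}

-- A view of a list "from the back", computed by an ordinary function.
data SnocView a = Nil | Snoc [a] a

snocView :: [a] -> SnocView a
snocView []     = Nil
snocView (x:xs) = case snocView xs of
  Nil       -> Snoc [] x
  Snoc ys y -> Snoc (x:ys) y

-- Callers match against the view, not the concrete encoding:
lastElem :: [a] -> Maybe a
lastElem (snocView -> Snoc _ y) = Just y
lastElem _                      = Nothing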

APL and vector-like data types

So, for most scalar operations in APL, the compiler lets you apply the operation to a vector data type, with the appropriate lifting. A Haskell analogy would be the programmer writing "f xs" and the compiler interpreting it as "map f xs" when xs is a list.
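Written out explicitly, the lifting in question is just the following (the liftScalar names are illustrative only):

-- The implicit APL behaviour, made explicit in Haskell:
liftScalar :: (a -> b) -> [a] -> [b]
liftScalar = map                  -- unary scalar function over a vector

liftScalar2 :: (a -> b -> c) -> [a] -> [b] -> [c]
liftScalar2 = zipWith             -- binary scalar function over two vectors

-- liftScalar2 (+) [1,2,3] [10,20,30]  ==  [11,22,33]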

I spent some time investigating APL-like lifting of functions to vectors, arrays, etc., in a more general programming-language setting.

In general, that doesn't work, because the resulting system is ambiguous. For example, say that you define a+b on integers to indicate addition, and s+t on arrays to indicate concatenation. How should the compiler interpret a+b on arrays of integers? It could be either element-wise vector addition or array concatenation.

Now you're probably thinking the compiler could define some precedence rules to disambiguate such an expression. That would be unpredictable: a feature that works in some cases would, confusingly, fail in others. Depending on context, maybe the compiler would interpret a-b as element-wise subtraction, but a+b as concatenation. Worse, given strong implicit typing features (Java generics, Haskell typeclasses), the general case can't be disambiguated for lack of concrete context.
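A hypothetical Haskell rendering of the ambiguity (the Plus class and the (+.) operator are made up for illustration):

class Plus a where
  (+.) :: a -> a -> a

instance Plus Int where
  (+.) = (+)

-- Both of the following are plausible meanings of (+.) at type [Int],
-- and nothing in the program picks one over the other:
instance Plus a => Plus [a] where
  (+.) = zipWith (+.)           -- element-wise lifting
-- instance Plus [a] where
--   (+.) = (++)                -- concatenation; overlaps with the above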

So this aspect of APL is a clever trick for code reduction, but it's unsound and inappropriate in a general programming language. There, we need to clearly specify how we want to lift a function (using map, fold, etc, or explicit recursion and case decomposition) when applying it to a collection data type.

Note that languages like Haskell could make this process simpler than they do. For example, when you define a datatype like "Tree t = Leaf t | Node (Tree t) (Tree t)", which is structured as a covariant functor, the compiler could automatically generate the map function for that type, as it's uniquely determined by the type's structure. When you define a recursive datatype as a fixed point of a functor, the compiler could automatically generate the fold function.
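For what it's worth, GHC's DeriveFunctor and DeriveFoldable extensions do exactly this kind of generation; a minimal sketch:

{-# LANGUAGE DeriveFunctor, DeriveFoldable #-}

-- The structure-determined map (fmap) and fold (foldr/foldMap) are
-- derived directly from the shape of the datatype:
data Tree t = Leaf t | Node (Tree t) (Tree t)
  deriving (Functor, Foldable)

-- fmap (+1) (Node (Leaf 1) (Leaf 2))  ==  Node (Leaf 2) (Leaf 3)
-- sum       (Node (Leaf 1) (Leaf 2))  ==  3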

Shape polymorphism?

In general, that doesn't work, because the resulting system is ambiguous. For example, say that you define a+b on integers to indicate addition, and s+t on arrays to indicate concatenation. How should the compiler interpret a+b on arrays of integers? It could be either element-wise vector addition or array concatenation.

Addition and concatenation aren't really semantically equivalent. It would be more useful to describe addition for arrays as ... something. Obviously the meaning of (+) is still heavily overloaded, which is why I am interested in composing the exact meaning, sans evaluation.

So this aspect of APL is a clever trick for code reduction, but it's unsound and inappropriate in a general programming language. There, we need to clearly specify how we want to lift a function (using map, fold, etc, or explicit recursion and case decomposition) when applying it to a collection data type.

That is the kind of thing that I want to do, but without the evaluation mechanism implied by map and fold. Maybe interpreting a functional language as declarative would be the way to go...

A better example? (and Trenchard More's Array Theory)

For example, say that you define a+b on integers to indicate addition, and s+t on arrays to indicate concatenation. How should the compiler interpret a+b on arrays of integers? It could be either element-wise vector addition or array concatenation.

Tim, your example seems to include an aspect of ad-hoc polymorphism (overloading of +) that may be clouding the issue. For a purely parametric example, consider instead the reversal of a list of lists (in APL2, an array of arrays). We could either reverse each of the inner structures or reverse the outer one. In some sense the behavior of things like + on vectors in APL constitutes the application of an implicit map (called "each" in APL2), so the question is, why not apply it implicitly in the case of reverse?

In APL2 the "ambiguity" of reversal is resolved in favor of the outer structure and we can recover the other behavior via the each operator: this is similar to the way that an explicit map would be necessary in Haskell (or Miranda, or ML). There is no question of an implicit map in the modern languages, but in APL2 we need to account for when and how these implicit operators/coercions are applied.
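In Haskell terms, where nothing is implicit, the two readings have to be spelled out:

xss :: [[Int]]
xss = [[1,2],[3,4]]

outer = reverse xss       -- [[3,4],[1,2]]  : APL2's default reading
inner = map reverse xss   -- [[2,1],[4,3]]  : recovered via "each"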

The general behavior in APL2, as exhibited by reversal, +, and operators of other "rank", is actually something like "seek the first appropriate depth". (In "vanilla" APL there is no inner/outer distinction, but slicing allows one to specify an axis other than the default first one for multi-dimensional arrays.)

I think you can make a consistent story out of this, given that you will still have to resolve the ambiguities somehow (via explicit map/each/whatever). I had this ultimate goal in mind when I started my dissertation work, but first had to address the issue of non-array data structures, i.e. basically algebraic data types; this was my (too!) early version of what is now called polytypism or generics (in the Haskell sense).

Satish Thatte (my advisor) did some work on the APL-style implicit map (zips would also be needed, and perhaps some related things), but I don't think it was ever implemented. (Google indicates that Tim has already been pointed toward Satish's "A Type System for Implicit Scaling" by Andrew Kennedy back in 2000. There still doesn't seem to be a copy available on-line, though.)

But for a reasonably formal account that best honors APL sensibilities, I would look toward Trenchard More's Array Theory (which was crucial to APL2 and Nial): this was an attempt to give a formal foundation to nested arrays comparable to (and perhaps in competition with) set theory. The latter goal was perhaps not very realistic, but influenced the development in a positive way.

I haven't really looked at Array Theory in 25-30 years, so I can't vouch for it, but I believe that the "depth-seeking implicit map" behavior was accounted for quite well there, at least in terms of an arrays-only perspective.

Rank

J has explicit mechanisms to control rank, as well as implicit rules that determine the rank of various phrasal forms ("trains", "hooks", "forks"), which are essentially point-free-style definitions. So I suggest looking at J before giving up on this project...
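As a rough intuition only (this is an analogy, not J's actual rank semantics), controlling rank in Haskell terms amounts to controlling how many maps stand between a function and the data:

-- Pushing a function to different depths of a nested list.
-- (Illustrative names; J's rank conjunction f"k counts in terms of
-- the cells f consumes, not nesting depth from the top.)
atDepth0 :: (a -> b) -> a -> b
atDepth0 f = f                    -- apply to the whole argument

atDepth1 :: (a -> b) -> [a] -> [b]
atDepth1 = map                    -- apply one level down

atDepth2 :: (a -> b) -> [[a]] -> [[b]]
atDepth2 = map . map              -- apply two levels down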

Link to Trenchard More Array Theory paper posted

Greg Buchholz has posted a link to a paper by More, "Axioms and Theorems for a Theory of Arrays", in a more recent, standalone post to LtU.

Thanks, Greg!

Generics

the compiler could automatically generate the map function for that type


As I'm sure you already know, there's no shortage of papers on how to do this in Haskell. The catch is that all but one (that I know of) require extensions to Haskell 98. Here's a survey of methods that looks at your map example in particular. There's also Hinze's "Generics for the Masses", but that still requires some boilerplate.

That survey is exactly what

That survey is exactly what I've been looking for, thanks! LtU is very timely sometimes. :-)

[Edit: this is the updated link for that report]