Functions shouldn't be lists, functions should be cast to lists

In Joy, and other similar dynamic languages, values can be evaluated like functions that simply yield themselves. This means that any list can be treated as a function. While elegant, this has several drawbacks:

  • compilation requires an extra indirection step (consider how integers and instructions that place integers on the stack are represented differently in most machine code)
  • it is easy to construct ill-typed functions by concatenating lists
  • you lose referential transparency

I've gone into more detail in this article on my web-site.

As a result of these issues, I've decided to differentiate functions from lists of functions in the Cat type system. In order to reclaim some of the elegance of Joy (e.g. using functions and lists interchangeably), I've decided to use implicit function-to-list casts.

As a general rule I feel implicit conversions are evil, but I think the elegance gained is worth it in this circumstance, because instead of writing:

[1 2 3] list [4 5] list cat

I can now simply write:

[1 2 3] [4 5] cat
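
To make the contrast concrete, here is a minimal sketch in Haskell rather than Cat (this is not Cat's actual implementation; the Value type, the Q constructor and quotToList are invented for illustration). It keeps quotations and lists as distinct values, with a single quotation-to-list coercion that a compiler could insert implicitly:

    -- A sketch only: quotations (functions) and lists are distinct
    -- values, and the one allowed cast turns a quotation into the
    -- list of values it would push.
    data Value
      = I Int          -- an integer
      | L [Value]      -- a list value
      | Q [Value]      -- a quotation (a function that pushes its items)
      deriving Show

    quotToList :: Value -> Value
    quotToList (Q vs) = L vs
    quotToList v      = v

    -- 'cat' concatenates lists, accepting quotations via the cast.
    cat :: Value -> Value -> Value
    cat a b = case (quotToList a, quotToList b) of
      (L xs, L ys) -> L (xs ++ ys)
      _            -> error "cat: expected lists"

    -- [1 2 3] [4 5] cat  ~  cat (Q [I 1, I 2, I 3]) (Q [I 4, I 5])
    example :: Value
    example = cat (Q [I 1, I 2, I 3]) (Q [I 4, I 5])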

This is hopefully a useful insight (if it isn't too obvious) to anyone looking at the relationship between dynamically typed and statically typed languages.

Any thoughts? Am I stating the obvious? Or is this in fact a useful observation? Are there any foreseeable problems with the approach?


Automatic promotion

Personally, I heavily dislike implicit casting.

However, there are ways to automatically promote constants of all sorts. I rather like the way it is done in Elemental Function Overloading in Explicitly Typed Languages. In Haskell, in the simplest cases, this is the same as saying that one can put an arbitrary number (i.e. 0 or more) of const calls in the right places. This gives a new sort of polymorphism (over Nat) which allows you to think of any object as always having enough "arrows on the left" to make things work. So 2 is both a constant and a constant function (and a curried binary function, and ...).
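
A rough sketch of that idea in plain Haskell (this is not the actual machinery from the paper, just the manual version of inserting const):

    two0 :: Int
    two0 = 2                  -- 2 as a constant

    two1 :: a -> Int
    two1 = const 2            -- 2 as a constant function

    two2 :: a -> b -> Int
    two2 = const (const 2)    -- 2 as a constant "binary function"

    -- The overloading described in the paper amounts to letting the
    -- compiler choose how many const wrappers to insert so that a
    -- value has "enough arrows on the left" for its context.
    useAsFunction :: (Char -> Int) -> Int
    useAsFunction f = f 'x'

    promoted :: Int
    promoted = useAsFunction two1   -- 2 used where a function is expected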

The nice thing is that this corresponds exactly to mathematical practice.

Thanks for the explanation

Thanks for the clear explanation of Haskell promotions and Elemental Function Overloading. I wasn't aware of the mechanisms.

On implicit conversion

I've actually recently been thinking about implicit casting, more specifically the ability to have multiple, interchangeable data structures for one type of data. These data structures have different properties, including different algorithmic runtime characteristics. Some might not even implement some functions. What allows this is that you may freely convert between them (though it might be costly), and the compiler will do so as necessary.

The main requirement of these conversions is that there is no data loss. If both conversion and equality are defined, converting to and from should produce an equal object (not necessarily data equality, though...).

A secondary requirement is that, given a value acceptable to all representations, you may reach any representation from any other.
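
As a rough sketch of those two requirements in Haskell (the class, the names and the toy instance below are hypothetical, not taken from any existing library):

    -- Every representation converts to and from one reference form,
    -- and the round trip must give back an equal value (equality of
    -- the abstract value, not of the concrete structure).
    class SequenceRep r where
      toListRep   :: r a -> [a]
      fromListRep :: [a] -> r a

    roundTripOk :: (SequenceRep r, Eq a) => r a -> Bool
    roundTripOk x =
      toListRep (fromListRep (toListRep x) `asTypeOf` x) == toListRep x

    -- A toy alternative representation: the same data stored reversed.
    -- The round trip is value-equal even though the bits differ.
    newtype Reversed a = Reversed [a]

    instance SequenceRep Reversed where
      toListRep (Reversed xs) = reverse xs
      fromListRep xs          = Reversed (reverse xs)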

One problem with this is that it doesn't obviously accommodate things like floating-point numbers or JPEG images. At that point it becomes a speed and precision tradeoff, something that's difficult to specify in code. One solution might be to specify error functions for conversions and operations on lossy data, and then specify an acceptable error for the algorithm.

I've just realized that this doesn't have a whole lot to do with your issue. I think this is more of a quirk of stack languages than a matter of implicit casting; in this case it has more to do with syntactic/semantic special cases.

Abstract Data Types

This is actually what was originally envisioned for ADTs long ago. An ADT was supposed to be (mostly!) a specification-time entity, so that it was clear that you could have multiple implementations of it. And of course all implementations have to be interchangeable (by definition!).

What is somewhat novel in your approach is that you want to allow complete conversion. There are two ways of doing this: 1) for each pair of implementations, you specify conversions, or 2) you have conversions into and out of each implementation via some universal intermediate type that records all necessary invariants. Method 1 requires O(n^2) conversion implementations; method 2 needs only O(n). Of course, the hard part in case 2 is to ensure that your compiler can always deforest the intermediate structure.
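
A small sketch of method 2 in Haskell (the class and instance names are invented; the intermediate type here is just a plain list):

    import qualified Data.Sequence as Seq

    -- Every implementation converts in and out of one universal
    -- intermediate form, so n implementations need n pairs of
    -- conversions rather than n^2.
    class ViaList r where
      out :: r a -> [a]
      inn :: [a] -> r a

    -- Any two representations convert through the intermediate list;
    -- the compiler's job would be to deforest it away.
    convert :: (ViaList r, ViaList s) => r a -> s a
    convert = inn . out

    instance ViaList [] where
      out = id
      inn = id

    newtype AsSeq a = AsSeq (Seq.Seq a)

    instance ViaList AsSeq where
      out (AsSeq s) = foldr (:) [] s
      inn           = AsSeq . Seq.fromList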

On top of that, you'd need some serious infrastructure for computing asymptotic series expansions for worst-case (average case??) behaviour of your implementations, to have the compiler reliably know when to switch. That would be amazing... but it does not sound easy.

Abstract Data Types and Term Rewriting

I like this idea a lot. See http://lambda-the-ultimate.org/node/2141 for my previous post on the subject. Rather than leaving it all up to the compiler, I'd like to give more control to the programmer in the form of a declarative term rewriting language. Perhaps the Stratego term language could be used? This approach, if feasible, would make the problem of data type selection much more tractable since you only have to solve it for a specific domain, and a specific piece of software.

In general a term rewriting language could be used for multiple purposes: data type selection, expression of crosscutting concerns (ala AOP) and optimization. I also wonder if a sufficiently powerful term rewriting language could be extended to do type analysis? What a strange new world it would be if we could include novel type systems with our software!

Yep!

Originally when I thought of this I actually figured that I had reinvented ADTs, but quickly realized that it was something quite a bit different.

Actually, as far as the conversions go, I would want to allow any connected, directed graph of conversions to be used. Both of your methods of course fit within this, but it also allows the user the flexibility of incremental conversion implementation.

Another property of these conversions is that they need not cover the full domain. They might fail, leading to other conversion attempts, or to full failure. I think that in most of the cases where this would happen, there's no way for the algorithm to succeed anyway - it makes no sense for the failing domain.
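
A sketch of what such partial conversions might look like in Haskell (all names here are made up): each conversion may fail, and the remaining edges of the conversion graph are tried before giving up entirely.

    import Data.Maybe (listToMaybe, mapMaybe)

    -- Try a list of candidate conversions in order; Nothing means
    -- full failure.
    tryConversions :: [a -> Maybe b] -> a -> Maybe b
    tryConversions convs x = listToMaybe (mapMaybe ($ x) convs)

    -- A conversion that does not cover its full domain: a list of
    -- Doubles converts to Ints only when every element is whole.
    wholeInts :: [Double] -> Maybe [Int]
    wholeInts = traverse toWhole
      where
        toWhole d =
          let n = round d :: Int
          in if fromIntegral n == d then Just n else Nothing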

And yes, the eventual plan is to allow the compiler to figure out switches, and even runtime cases to decide between implementations. Some sort of lazy optimization tree might be used to track down likely optimizations, compare and contrast predicted performance - maybe even test it out! It's definitely not easy, but the beauty of it is that it can be added later - initial implementations only need to convert when really necessary.

Perspectives on data

... more specifically the ability have multiple, interchangeable data structures for one type of data.
(from Michael Sloan)

Yes, that would be nice, especially if done in O(1) time, that is, without copying. I mean, it would be truly useful if different views of your TB dataset could be had without replicating it.

Views of Lists

I'm not sure if this is directly what you are looking for, but I've taken a similar approach where operations like "map", "filter", "cat", etc. create views of lists rather than copying them, in the C# functional-lists library ( http://code.google.com/p/functional-lists/ ).
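
For what it's worth, here is a minimal sketch of the view idea in Haskell rather than C# (the View type and these names are invented for illustration, not the library's actual API): operations build a description of the result instead of copying the underlying data.

    -- An indexed view: a length and a way to read element i.
    data View a = View { size :: Int, at :: Int -> a }

    fromListV :: [a] -> View a
    fromListV xs = View (length xs) (xs !!)

    mapV :: (a -> b) -> View a -> View b
    mapV f (View n ix) = View n (f . ix)              -- O(1), no copy

    catV :: View a -> View a -> View a
    catV (View n ix) (View m iy) =
      View (n + m) (\i -> if i < n then ix i else iy (i - n))  -- O(1)

    toListV :: View a -> [a]
    toListV (View n ix) = map ix [0 .. n - 1]         -- materialise on demand

(A filter view is less tidy, since its length can't be known without examining the data.)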