Points in the Pattern Matching Design Space

Functional languages often have a pattern matching construct. The weakest form is destructuring:

let (x,y) = f(z) in x+y

The left-hand side looks like an expression that constructs a data structure, but because it appears in a pattern position the compiler emits code for the inverse operation: instead of constructing a pair, we destructure one. The general form is:

let f(x) = y in ...

The compiler inverts the operation f and emits this:

let x = f^-1(y) in ...
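
For concreteness, here is a rough Scala analogue of the pair case (only a sketch; Scala is not the notation used above):

    // Construction: a pair expression builds a value.
    val p: (Int, Int) = (1, 2)

    // Destructuring: the same shape in a pattern position runs the
    // inverse operation, binding x and y to the two components.
    val (x, y) = p   // x = 1, y = 2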

However, the operation f is not always surjective; that is, the equation f(x) = y may not have a solution for every y. For example, x::xs only ever produces non-empty lists, so the pattern x::xs only matches non-empty lists. Therefore languages like ML combine destructuring with conditionals:

match ys with
| [] -> ...
| x::xs -> ...

The other limitation is that f is not always injective; that is, the equation f(x) = y may have multiple solutions x, for example f(x) = abs(x). To handle this we can combine pattern matching with iteration:

for abs(x) = y do print(x)

For example, if y is 2, then x gets bound to -2 and then to 2, iterating over all solutions. Note that we can generalise this to an arbitrary predicate:

for P do ...

Where P is a predicate. For example:

for x in xs do ...

To iterate over all pairs x in xs and y in ys where x < y, we can write:

for x in xs && y in ys && x < y do ...
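
In a language without such a construct, one way to approximate it is to make each predicate produce the stream of its solutions and then filter and combine those streams. A hedged Scala sketch (the helper absSolutions is my own invention, not part of any of the languages mentioned here):

    // Enumerate every x with abs(x) == y.
    def absSolutions(y: Int): Seq[Int] =
      if (y < 0) Seq.empty
      else if (y == 0) Seq(0)
      else Seq(-y, y)

    // "for abs(x) = 2 do print(x)" prints -2 and then 2.
    for (x <- absSolutions(2)) println(x)

    // "for x in xs && y in ys && x < y do ..." as a for-comprehension.
    val xs = List(1, 2, 3)
    val ys = List(2, 3)
    for (x <- xs; y <- ys if x < y) println((x, y))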

JMatch is an example of this kind of pattern matching.

This raises the question: which predicates can we use in a pattern match? For example:

for prime(n) && n < 100 do ...

We'd like this to iterate over the prime numbers smaller than 100, but how can we implement it? Doing this efficiently for an arbitrary predicate is impossible, so how can we give programmers the tools to implement the patterns they want to use efficiently?
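
One possible answer is to let the programmer supply a generator for the expensive part of the predicate and let the cheap, monotone conditions filter it. A minimal Scala sketch under that assumption (the primes generator is written by hand here, not derived from the predicate automatically):

    // A naive, lazily evaluated stream of primes (trial division).
    def isPrime(n: Int): Boolean =
      n > 1 && (2 to math.sqrt(n.toDouble).toInt).forall(n % _ != 0)

    val primes: LazyList[Int] = LazyList.from(2).filter(isPrime)

    // "for prime(n) && n < 100 do ..." becomes: enumerate primes and
    // stop as soon as the monotone bound n < 100 fails.
    for (n <- primes.takeWhile(_ < 100)) println(n)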

Another extension of pattern matching is logic variables, as in Prolog. These let us reason about incomplete information, where part of a data structure is left unspecified. For example, executing [X] = Y arranges that Y is a list containing an unknown value X. If we later execute X = 2, then Y is a list containing 2.
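
A very small Scala model of that behaviour, under the assumption that a logic variable is a cell that is either unbound or bound to a term (the names LVar, walk and unify are mine; a real system would also need an occurs check, substitution and backtracking):

    sealed trait Term
    case class Atom(value: Int) extends Term
    case class Cons(head: Term, tail: Term) extends Term
    case object Empty extends Term
    class LVar(var binding: Option[Term] = None) extends Term

    // Follow bindings until we hit an unbound variable or a non-variable term.
    def walk(t: Term): Term = t match {
      case v: LVar if v.binding.isDefined => walk(v.binding.get)
      case other                          => other
    }

    // Unify two terms by binding unbound variables (no occurs check, no undo).
    def unify(a: Term, b: Term): Boolean = (walk(a), walk(b)) match {
      case (x, y) if x == y             => true
      case (v: LVar, t)                 => { v.binding = Some(t); true }
      case (t, v: LVar)                 => { v.binding = Some(t); true }
      case (Cons(h1, t1), Cons(h2, t2)) => unify(h1, h2) && unify(t1, t2)
      case _                            => false
    }

    val X = new LVar(); val Y = new LVar()
    unify(Cons(X, Empty), Y)  // [X] = Y: Y is now a one-element list holding X
    unify(X, Atom(2))         // X = 2:   Y is now, transitively, the list [2]
    println(walk(Y))          // Cons of an LVar bound to Atom(2), then Empty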

A further extension is constraint programming, where we can not only leave part of a data structure unspecified, we can also state partial information about a value, for example the constraint X*X < 10. If we also add the constraint X in [-2,5], then X gets bound to -2. New types of constraints require new specialized algorithms to solve them efficiently. How do constraint handling rules fit in? How can we give the programmer mechanisms for adding constraint-solving algorithms of their own? Existing constraint systems are also limited in that variables must be declared to range over a finite set of values. Can we lift this restriction?
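
For the finite-domain case the example can be checked mechanically. A toy Scala sketch of domain narrowing (just filtering an explicit candidate set, which is a long way from a real constraint propagator):

    // Constraint X*X < 10 over the explicit domain {-2, 5}.
    val domain = Set(-2, 5)
    val narrowed = domain.filter(x => x * x < 10)   // Set(-2)

    // A single remaining candidate means X is determined.
    if (narrowed.size == 1) println(s"X = ${narrowed.head}")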

What is the state of the art in pattern matching and its extensions? How can we extend the scope of patterns that can be matched, and in which other ways can we increase the power of pattern matching?


Binding

Your account of pattern matching doesn't mention binding. For example you say that `x in xs` is a predicate, but you cannot neglect that `x` is a free variable while `xs` refers to an existing value. If you want to describe the pattern matching of functional languages, you have to describe its effect on the environment.

The setting is very different in logic programming, where unification is used instead of variable binding. Yet I don't think that everything should be considered a specific case of such loose behavior.

One point of view I like about pattern matching, which is quite different from the `inverse functions` approach you take -- not necessarily the most fruitful one, imho -- is the idea that patterns are destructors for data. Values are built by constructors and taken apart by pattern matching. Conversely, functions are built by abstraction and taken apart by application. The polarity/focusing approach of the logic people goes in that direction. See for example neelk's "Focusing on Pattern Matching" discussed on LtU.

Generative Grammars, Term Rewrite Systems

You might review some links in an earlier discussion on generative grammars. Generative grammars have the rather nice property of simultaneously matching and generating sets, which allows them to be compiled for either role. (They are not functions.)

State-of-the-art constraint programming usually controls its domains. That is, you're allowed to create limited constraints over integers, doubles and the like, and there are a lot of default strategies for exploring those spaces, but expressiveness is quite limited - i.e. defining a 'prime' constraint might not be available to you, much less requesting "provide me a function called 'prime' with these properties". By limiting expression, they can provide faster solutions.

In order to provide new mechanisms and domains: I imagine that programmers should be able to annotate 'suggested' strategies, expected profiles, and partial-solution heuristics for solving constraints, with the understanding that those are suggestions and do not affect the semantics of the program.

I believe something missing from these pattern-matching systems is the ability to control propagation of constraints in a securable modular fashion (i.e. that could work across organizational boundaries, similar to message passing), and the ability to obtain advice from expert systems and databases, leverage outside services, et cetera. My earlier pursuit of generative grammar approaches was influenced by this belief. For now, I've decided to keep these constraint-solving and grammar things to DSLs and frameworks rather than the language proper.

Term rewrite systems, as seen in the Maude and OBJ family, are also nicely invertible. They are often used to answer a question: can I reach this state?

I pursued TRS for a while, but I'm not really satisfied with it... TRS is far too implicitly stateful, IMO. Constraint programming, logic programming, functions, and generative grammars can be orthogonally extended with temporal semantics, but in TRS modeling that extension is not orthogonal. Temporal semantics allow shoehorning the paradigm into a reactive or real-time system.

Single Assignment Variables

Single assignment variables, as seen in Oz or Janus, seem to be a common point in this design space (just above expressiveness of local destructuring). They also deserve a mention.

What's the difference with

What's the difference with logic variables?

What is the similarity?

Logic variables simply stand in for a set of objects with particular qualifications. Single assignment variables carry one value, assigned at a different time or place than their declaration. The semantics don't seem at all similar, IMO - though perhaps that becomes more obvious once you start considering time, reactivity, modularity and such.

If assignment is expressed through unification, and the logic is evaluated by unification (as opposed to a half-dozen other techniques), then I suppose the two can appear vaguely similar from a mechanical perspective.

In what way are they

In what way are they different? This seems to me exactly what logic variables are:

Single assignment variables carry one value, assigned at a different time or place than their declaration.

Of course due to backtracking the variables can lose their value again, and get another value. Is this different with logic variables?

Single assignment variables

Single-assignment variables never lose their value, so computation with them is monotonic. Code in a single-assignment language looks a bit like: declare A,B,C in Expression, which creates new variables A, B, C and then evaluates Expression. Each new variable can be assigned exactly one value, ever. If you want backtracking, you'll need to model it explicitly.

Single-assignment variables are related to constraint variables, where the only legal constraint is '=' and where one side must be a variable or simple data structure (not an arbitrary function... i.e. generally, nothing like cosine(X)=0.17174 is possible).

You must be thinking something different than me when you say 'logic variables'. In logic programming, I'll declare logic expressions such as: foo(X,Y) :- bar(X,Z),baz(Y,frob,Z). There are three variables in that expression, which is really shorthand for: ∀X.∀Y.foo(X,Y) :- ∃Z.bar(X,Z),baz(Y,frob,Z). Logic variables do not imply backtracking. (That's just an evaluation strategy, not necessary for logic programming semantics!) They do not imply unification. They are never 'assigned'. They're just sitting there, declaratively standing in for a quantified relationship.

That's just two ways of

That's just two ways of describing the same thing. Oz's "single" assignment variables can in fact be assigned multiple times due to backtracking, exactly as in Prolog.

Oz's variables can be

Oz's variables can be assigned multiple times if you use cells and explicit mutation (Oz is not pure). For logic programming, Oz explicitly models lists and streams. Review chapter 9 of CTM, if you're interested.

I'm going to need something stronger than repeated argument by assertion that single assignment variables and logic variables are two ways of describing the same thing. To me, they seem totally different, e.g. with regards to modularity and reactive computation. Certainly, X and Y aren't 'assigned' by any particular 'foo' expression, since there might be a whole ton of foo expressions. Rationally, I expect two different descriptions to have two different meanings until proven otherwise. If your attention is focused upon equivalence in some specific attribute or use-case, keep in mind: my crayon is yellow, and the sun is yellow, therefore the sun is my crayon.

Vague intuition

My knowledge of logic programming and Oz is quite weak, but I must say that, when learning both, I was also struck by a likeness in the handling of variables. Nothing specific, and I wouldn't claim they are the same, but intuitively they felt similar.

That was a few years ago and I have forgotten the details. One similarity I see is the bidirectionality in the way you program them. When defining a predicate in Prolog, it's not always clear which variables[1] are "known" and which will get their value deduced. Similarly in Oz, when you see a variable use site, you don't necessarily know whether the variable has already been assigned, or whether this use will cause this part of the program to block until the value is assigned. I don't remember exactly, but what happens if you assign an Oz variable twice with unifiable content, or unify two compatible partially defined variables? Doesn't it work?

[1] Your definition of a logic variable as "just" a quantified variable is not borne out by my -- small -- experience with Prolog programming. I rather considered all variables as partially undefined values, and when writing a predicate the question is how to "compute" a result, that is, give more definedness to the variables of the goal. Maybe I was doing it wrong and your point of view is more high-level. It seems at least that my p.o.v. was more operational, while yours is more denotational.

I don't remember exactly,

I don't remember exactly, but what happens if you assign an Oz variable twice with unifiable content, or unify two compatible partially defined variables? Doesn't it work?

If both partially defined variables are compatible (i.e. all the well-defined chunks are equal in structure and value) then the remaining variables will be unified. Otherwise you get a fail condition.

You can also use !!X to create a read-only reference for variable X, in which case trying to unify against it is an error. Oz developers found themselves having some trouble controlling direction of assignments - not a problem for bug-free code, but how many of us write bug-free code?

It seems at least that my p.o.v was more operational, while yours is more denotational.

Maybe so. Prolog isn't pure logic programming (due to 'cut' and side-effects). Datalog, I think, is a cleaner example of the paradigm.

Oz variables cannot be

Oz variables cannot be assigned multiple times. To support mutation it has a cell data structure like a ref in OCaml. Also Oz doesn't model lists and streams any more explicitly than Prolog. They fall out of cons cells + single assignment variables, like they fall out of cons cells + logic variables in Prolog.

I cannot prove that logic variables are the same as single assignment variables, because as far as I know there is no strict mathematical definition of either.

However constructs related to logic variables in Prolog map one to one to constructs related to single assignment variables in Oz. Both support unification. Both support reasoning about yet unknown values. Both support undoing of assignments when backtracking.

To me, they seem totally different, e.g. with regards to modularity and reactive computation.

What exactly do you mean by this? Can you perhaps give a concrete example of something you can do with one and not with the other, or an example where they behave differently?

Certainly, X and Y aren't 'assigned' by any particular 'foo' expression, since there might be a whole ton of foo expressions.

Absolutely, and indeed in this respect logic variables and single assignment variables are the same: there is no one expression that binds a variable, instead the variables get constrained by unification.

logic variables and single

logic variables and single assignment variables are the same: there is no one expression that binds a variable, instead the variables get constrained by unification.

With single-assignment variables you can point to a single location in code that was the binding point for that variable (at least, once it is assigned). It just happens that this binding point need not be local to the variable's creation, nor need it be determined at the time of creation. The assignment may contain references to other variables, which gives this technique decent flexibility.

With logic variables, there is no binding point. All disjunctions are equally valid, all the time. For the existential variable Z in the 'foo' example, you don't ever need to bind the variable, just prove that a valid binding exists (which is sometimes an easier problem).

Single assignment variables are operationally very similar to futures/promises, except that they are much more transparent and generally have better syntactic integration (e.g. with pattern decomposition assignment).
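
To make that comparison concrete, a rough Scala sketch using Promise/Future to play the teller and asker roles (only an operational analogy: real dataflow variables are transparent, whereas a Future must be explicitly awaited):

    import scala.concurrent.{Await, Future, Promise}
    import scala.concurrent.ExecutionContext.Implicits.global
    import scala.concurrent.duration._

    // The promise is the write ("tell") half, the future the read ("ask") half.
    val p: Promise[Int] = Promise[Int]()
    val ask: Future[Int] = p.future

    // A reader blocks until the single assignment happens.
    val reader = Future { Await.result(ask, 1.second) + 1 }

    p.success(41)                              // the one and only assignment
    // p.success(42)                           // would throw: already completed
    println(Await.result(reader, 1.second))    // prints 42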

Both support unification. Both support reasoning about yet unknown values. Both support undoing of assignments when backtracking.

Unification is not critical to the notion of single-assignment variables. Several languages with single-assignment variables (including Janus, which I mentioned earlier) separate the assignment and reading halves into tellers and askers. Oz allows something similar via !!X.

Similarly, backtracking is unnecessary for understanding of single-assignment variables.

Oz does support a notion of hierarchical 'computation spaces' that allow something similar to logic programming, and which (operationally) might be understood in terms of 'backtracking'. But that's orthogonal to the notion of single-assignment. In the top-level space (or, indeed, in any particular computation space) a variable in Oz is assigned at most once. It is never 'undone' - i.e. you cannot observe the variable at two different times and say, "Hey, the assignment of this variable has been undone!", and if you can't observe it then it didn't happen as far as semantics are concerned. Instead, a program will suspend until exactly one consistent choice for binding remains, then it will commit to this remaining choice. (If every choice would result in inconsistent bindings, a failure condition is raised - which results in an exception in the top-level space.) There are some ways to explore or control search of these spaces.

If you're going to use Oz in particular to grok single assignment variables, then you should understand single-assignment variables in terms of their top-level computation space semantics. The rest is semantic baggage - part of Oz, perhaps, but not part of the 'single assignment variables' concept.

I am not familiar with

I am not familiar with Janus, or with what "single assignment variable" means in a general context, but Oz single assignment variables do correspond to logic variables as found in Prolog, don't they? Of course Oz is richer than Prolog, for example in its support for !! and constraint programming (although some Prolog implementations support that as well).

In Prolog you can also always point to a place (a unification X=Y) where a logic variable got assigned in a particular node of the backtracking tree, exactly as in Oz (in Oz you'd call a "node in the backtracking tree" a computation space).

So you are saying that single assignment variables, in the general meaning of the word (i.e. not in the Oz meaning of the word), are strictly less powerful than logic variables. And that this is the case because logic variables imply things like unification and some kind of quantification / search, which single assignment variables don't imply but nonetheless support, e.g. in Oz. It seems that we're now quibbling over definitions.

Why do you insist on

Why do you insist on attributing features to 'single assignment variable' that can cleanly be attributed to other, orthogonal Oz constructs? (in this case, 'dis') Would you also argue that single assignment variables imply transparent distribution and garbage collection because Oz provides those features?

In logic programming variables aren't 'assigned' in any meaningful sense. You can't take query predicate foo(X,Y) and meaningfully pass X to one module and Y to another, like you could with assigned variables.

seems that we're now quibbling over definitions

Just now it seems that way?

Since definitions are under

Since definitions are under dispute, I'll point to the definitions used by the Mozart/Oz documentation:

A logic variable is a single assignment variable that can also be equated with another variable.

This definition is consistent with literature on Flow Java, Alice ML, and other concurrent languages.

Thanks for clarifying that.

Thanks for clarifying that. If that had been the response to my earlier clarification ("You must be thinking something different than me when you say 'logic variables'. In logic programming [...]"), I certainly wouldn't have objected to it.

Single assignment variables, in general, have very different properties than variables seen in Datalog or Dedalus logic programming, or Concurrent Constraint Programming, or generative grammars, and so on.

So it seems that indeed

Thank you, that sheds some light on the issue. So it seems that logic variables are indeed single assignment variables, but single assignment variables are not necessarily logic variables, and that "single assignment variables, as seen in Oz" are indeed also logic variables by this definition, because they can be equated with another variable.

Logic Programing in Oz

If you use the search features of Oz, for example the 'dis' construct, it behaves almost exactly like a logic language such as Prolog, and the dataflow variables behave as logic variables (i.e. you can assign them a value, backtrack, and assign a different value).

Complete Functional Synthesis

You might be interested in the work of Viktor Kuncak et al. at EPFL on Complete Functional Synthesis.

This is very interesting,

This is very interesting, thanks. Using decision procedures to solve the equations seems exactly what's needed here to get a more powerful pattern matching facility, and they claim they can even combine methods for different domains automatically. Very interesting indeed, I'm definitely going to study this paper.

Pattern Calculus

Barry Jay's work on the pattern calculus goes in a rather different direction than the ones you outline and is quite powerful. Relevant links: Barry's pattern calculus page, Bondi, First class patterns (PDF). Note that as you read these, the pattern calculus has evolved quite a lot and quite rapidly, so many of the earlier papers are nowhere near as compelling as the later ones. This is part of why I included "First class patterns" which is, I believe, the latest in the theory of pattern calculus (and is not, I believe, embodied in bondi currently.) Section 1.2 of the introduction of "Pattern Calculus: Computing with Functions and Structures", as its name suggests, provides motivation.

OOPifying

I just got finished getting pattern matching working in Magpie. It doesn't do anything particularly novel, but one area where it differs slightly from ML is that it can match against the subtype relation. Since Magpie doesn't have algebraic datatypes, it uses subtypes/subclasses instead to model variant types. Thus, to be able to match on those, you need to be able to match against a value's type (and then bind a "destructured" variable that's the same as the original value but safely downcast to that type).

It looks like this:

    interface Expr
    end

    class Literal
        def value Int
    end

    class Add
        def left Expr
        def right Expr
    end

    // Note: implementing an interface is implicit, like Go.

    def evaluate(expr Expr -> Int)
        match expr
            case literal Literal then literal value
            case add     Add     then evaluate(add left) + evaluate(add right)
        end
    end

The grammar is pretty simple. A variable name (lowercase) followed by a type annotation is parsed as a type pattern. These can nest inside tuple and field patterns, which lets me use the same system for function parameter declarations:

def foo(a Int, b String, c Bool)
        |---------------------| a tuple pattern containing three type patterns

Case classes and XDUCE

You have probably heard of Scala case classes, which are a bit less general than your approach but allow for more expressive patterns containing variable bindings; yours are mostly conditional tests so far, plus one local renaming -- which is good, but maybe not enough.

Another use of the idea that "types" (or classes) are patterns is the definition of patterns in CDuce as "types plus binding variables": `(Int, Float)` is syntax for the product type, `nil` is a constructor but also the singleton type of that constructor, `<cons> (p, nil)` is a pattern matching a one-element list, and `<cons> (Int, nil)` is the type of one-element lists of integers.
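
A rough Scala rendering of the "type plus binding variables" idea (an approximation only; Scala patterns are tied to constructors and extractors rather than to a full type algebra):

    // Each pattern both tests the shape/type and binds variables.
    def describe(v: Any): String = v match {
      case (i: Int, f: Float) => s"pair of Int $i and Float $f"  // type tests + bindings
      case x :: Nil           => s"one-element list holding $x"  // like <cons> (p, nil)
      case Nil                => "the empty list"                // constructor as pattern
      case _                  => "something else"
    }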

I'm roughly familiar with

I'm roughly familiar with case classes. I've got tentative plans to implement something like unapply in Magpie so you can do arbitrary class-specific destructuring in patterns, but I'm still trying to figure out 1) how useful that actually is and 2) what the semantics are.

For the first point, I'm not sure if:

    def evaluate(expr Expr -> Int)
        match expr
            case Literal(value)   then value
            case Add(left, right) then evaluate(left) + evaluate(right)
        end
    end

is really any better than:

    def evaluate(expr Expr -> Int)
        match expr
            case literal Literal then literal value
            case add     Add     then evaluate(add left) + evaluate(add right)
        end
    end

It is a little more terse, which is always good, but it doesn't have the nice mirroring in Magpie that it would have in another language. You don't create an object by writing Point(1, 2), so it isn't obvious that that's how you'd destructure it.

The second point is that I'm not sure if destructuring like this should pull out values or type parameters. You'd expect a pattern like Point(x, y) to destructure fields. But the type annotation for a generic list is List(T), so you'd expect List(element) to destructure the type parameter. What would Pair(left, right) do then? It could go either way.

So, I'd like to have something along these lines, but I'm still trying to figure out what.

Not just destructuring, pattern matching

The idea of binding in a destructuring is that "case Add(left, right)" matches how Add was constructed. Which is nice, if not huge. But pattern matching adds the ability to say richer things without a bunch of ifs:

case (Add(left1, Literal(0))::tail1, Add(left2, Literal(1))::tail2) => 
// if you have a tuple consisting of two lists with Adds as their 
// heads and the first one has a literal 0 as its right argument and 
// the second one has a literal 1 as its right argument then give me the 
// corresponding left expressions, plus the tails.

In that sense, Scala behaves pretty similarly to pattern matching in MLish languages.

Scala's extractors (and GHC's views and F#'s active patterns) are there to allow separation of pattern matching from details of representation. The semantics are just that unapply is called and it returns an Option. If the Option is a None then it didn't match. If it's a Some, then the contents of the Some are used in the pattern match normally (i.e. bound to variables, compared with values in the pattern, etc).
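
A minimal sketch of that protocol, using a hand-written extractor instead of a case class (the Temperature/Freezing example is invented for illustration):

    class Temperature(val celsius: Double)

    object Freezing {
      // unapply returns Some(bindings) on a match, None otherwise.
      def unapply(t: Temperature): Option[Double] =
        if (t.celsius <= 0.0) Some(t.celsius) else None
    }

    def report(t: Temperature): String = t match {
      case Freezing(c) => s"freezing at $c C"   // calls Freezing.unapply, binds c
      case _           => "above freezing"
    }

    println(report(new Temperature(-3.0)))   // freezing at -3.0 C
    println(report(new Temperature(12.0)))   // above freezing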

Understood. Magpie has both

Understood. Magpie has both aspects of pattern matching: the predicate testing and the destructuring, and of course they can compose arbitrarily deeply. Some examples:

// match a tuple whose first field is 1, bind the second field to a
case (1, a) then ...

// match an object whose "a" member is 1 and where
// the second field of the tuple returned by the "b" member
// is an Int, and bind "c" and "d" to the fields of that tuple
case a: 1, b: (c, d Int) then ...

What you can't do is something like:

case Add(left, right) then left + right

You'd have to do either:

case add Add then add left + add right

Or:

case left: left, right: right then left + right

The last example will pull out the fields, but it won't match on type: it's just duck-typed. There's currently no way to combine those two operations: if the value is of this type, then pull out these fields. In an OOP language without algebraic datatypes, I don't think that facility is quite as valuable since you can just do foo bar to pull a field out once foo has been cast to the proper subtype.

Emulation

A technique to emulate MLish pattern matching in your system would then be, for each case class, to add -- by hand -- a unit field with the name of the class. You could then do:

match expr
  case literal:t, value:value then ..
  case add:t, left:left, right:right then ...

Additionally, having "left:" desugar to "left:left" may be helpful (a similar idea has been added in OCaml 3.12 for record patterns):

match expr
  case literal:, value: then ..
  case add:, left:, right: then ...

Another possibility, if you have enough type reification (which I'm not sure is a good idea in that case), is to have a 't' field containing the class itself, with the relevant semantics for class-name patterns: don't bind, check for subclassing. I don't like it, but it's the kind of "clever" trick you may be interested in:

match expr
  case t:Literal, value: then ..
  case t:Add, left:, right: then ...

PS: Note that it allows the quite natural

match class(obj)
  case Foo then ...
  case Bar then ...

Another possibility, if you

Another possibility, if you have enough type reification, which I'm not sure is a good idea in that case, is to have a 't' field containing the class itself

Magpie does have a fully reified type system. In fact, most of the type system is implemented in Magpie. You can do this right now:

match expr
    case type: Literal, value: v             then v
    case type: Add, left: left, right: right then left + right

That would work since all objects have a type property that gets their type, but it has some flaws:

  1. You're doing an equivalence test on the type, not a subtype test, so this breaks Liskov.
  2. It doesn't know how to correctly type the value, left, and right variables. It will work at runtime, but wouldn't type-check.
  3. It's pretty ugly.

That is an interesting idea, though. I hadn't considered testing on the type as a field directly.

PS: Note that it allows the quite natural

match class(obj)
  case Foo then ...
  case Bar then ...

That's actually valid Magpie syntax today (except that there isn't a function called class). It tests to see if the value of obj is equal to any of those type objects. Since types are first-class, it's valid (though uncommon) to treat them like values and test against them directly.

To be able to syntactically distinguish between a type test and a value test, Magpie requires a variable (or _) before a type test:

match obj
case Foo then "obj is itself the type object Foo"
case _ Foo then "obj's type is a subtype of Foo"
end

It's a little syntactically fishy (though it reads normally in the common cases, which is what I care about the most) but was the best solution I could find for allowing matching on types and matching on types-as-values (since they are first class objects themselves).

You're doing an equivalence

You're doing an equivalence test on the type, not a subtype test, so this breaks Liskov.

Why did you decide to have class patterns test equivalence rather than substitutability? With that semantics it is indeed quite fragile, but I don't see what you gain from equivalence. I was under the impression that you didn't want to be able to observe that an inherited instance is "not quite" an instance of the base class.

Of course, this is not a strong (~formal) justification, as ad-hoc reasoning on the type directly is a bit fishy anyway; moreover you can still distinguish between Base and Derived by putting a Derived case before the Base case -- using the first-match semantics.

Why did you decide to have

Why did you decide to have class patterns test equivalence rather than substitutability?

A normal type pattern (which is what you'd use 99% of the time) does do a subtype test:

match SomeDerived new()
case b SomeBase then "this will match"
end

But note the variable name before the type. That distinguishes it from a value test:

match SomeDerived // note: passing type itself, not object of type
case SomeBase then "this will not match"
end

It's just weird here because you can coincidentally use types in a value pattern because types are values in Magpie. In most cases, you wouldn't actually do this in practice.

On purpose

In most cases, you wouldn't actually do this in practice.

You mean: in most cases, you wouldn't actually do this on purpose. :)

Is the T in List(T) a

Is the T in List(T) a runtime argument?

Yes and no. Magpie's

Yes and no. Magpie's generics story is a little unusual since types are first-class and type annotations are normal expressions.

Consider:

def foo(items List(Int) ->)
    var listType = List(String)
    print(listType members count) // prints 10
end

Here, List(Int) is a type annotation. During type-checking, that will be evaluated by invoking the interpreter directly to evaluate the expression. Concretely, List is an instance of GenericInterface. That class has a call method that lets you call it like a function. That method expects a type and returns an instance of Interface. So when you do List(Int) in a type annotation, you're sending a call message to List which instantiates the interface with the given type.

Then the type-checker uses that object to type check the body of the function and its callsites.

But, since that's all just normal runtime evaluation stuff, you could also call it directly at runtime, which is what that List(String) call does in the body of the function. That also just instantiates and returns an Interface (the class of interface objects). Interfaces have a collection of members, so the print line just prints the size of that collection.

Separate from this, Magpie does also have a distinct syntax/semantics for real generic methods. For example:

var foo = bar as[Int | String]

Here we're invoking as and passing in a type parameter, Int | String. That's also just a normal Magpie expression, but there are two important distinctions:

  1. It's evaluated in a different name context than normal expressions. It only has access to global names, and names of type parameters of surrounding functions. In other words, its closure just contains other stuff in square brackets (and the top level global scope).
  2. At type-check time, the checker will evaluate Int | String instead of checking it. That way, the resulting type can be used in the return type of the method.

It's a little odd, and there's still some corner cases I'm working on, but it covers the main use cases I care about:

  1. You should be able to define functions whose return type is dependent on the argument type. For example:
    def identity[T](obj T -> T) obj
    var i = identity(123) // the type-checker knows i is an Int
    
  2. You should be able to define functions whose return type is based on the (type) argument value:
    def none[T](-> T | Nothing) nothing
    var i = none[Int] // i is of type Int | Nothing
    

What I just discovered yesterday is that almost all of this mirrors StrongTalk, which is reassuring but could have saved me a lot of time if I'd known that sooner. :)

a last suggestion

Your current construction for optionally typed "value patterns" is `$name $type?`, though perhaps not exactly `$name`, since there is also `_`, which I suppose is not a valid name in other contexts.

Why not use `$pattern $type` as a more general rule, with the semantics that if `p` matches a value `v`, then `p T` matches `v` when `v` is an instance of `T`?

You could then write:

match expr
    case (value:) Literal then value
    case (left:, right:) Add then evaluate left + evaluate right
end

PS: I know that LtU is not supposed to be a place for design discussions, so I suppose my various remarks on this thread were quite off-topic and I'll try to stop. Do you have a URL (e.g. a blog post) where such suggestions could be posted as comments instead?

Hmm, I don't follow what

Hmm, I don't follow what you're saying. Can you try to clarify (but likely not here at LtU)?

Do you have a URL (e.g. a blog post) where such suggestions could be posted as comments instead?

The best place is likely the mailing list here: http://groups.google.com/group/magpie-lang?pli=1. There's almost no traffic yet, but you could help fix that. :)

Can you write code like

Can you write code like this:

def evaluate args
  ...
end

If (expr Expr -> Int) is a 1-tuple pattern, does just using args capture that tuple in a variable?

Pretty much

Pretty much. You need some parens there, but yes:

def evaluate(args)
    print(args _0 ~~ args _1)
end

evaluate("a", "b") // prints: "a b"

If you wanted to strongly type the tuple, but not destructure it:

def evaluate(args (String, String) ->) ...

You could also destructure it and also keep the whole enchilada:

def evaluate(args (a String, b String) ->) ...

How do you distinguish

How do you distinguish between evaluate(("a","b")) and evaluate("a","b") inside evaluate?

If you do evaluate(1) then args is not a tuple, but if you do evaluate(1,2) then it is?

How do you distinguish

How do you distinguish between evaluate(("a","b")) and evaluate("a","b") inside evaluate?

You can't. Those two expressions are identical.

If you do evaluate(1) then args is not a tuple, but if you do evaluate(1,2) then it is?

It's sort of a tuple in both cases. In the first case, it's a tuple whose first field is the number itself and where the other fields are nothing.

A full set of examples will help, I hope:

def one(a) print(a)
def two(a, b) print(a ~~ b)

one(1)    // "1"
one(1, 2) // "(1, 2)" (1)
two(1)    // "1 nothing" (2)
two(1, 2) // "1 2"

(1) Converting a tuple to a string looks like that.

(2) This works because Object has a _0 property that returns itself. That lets you treat any object like a tuple of arity one. The tuple destructuring in a pattern just calls _n properties on the objects. Any missing property returns nothing. So destructuring the number 1 as if it were a 2-tuple returns 1 for the first field and nothing for the second.

This is a kind of ugly corner case of the language. If you pass a tuple to a function that expects a single argument, it doesn't magically know to just destructure the first field, which is more or less how something like JavaScript works. The behavior is consistent, but very possibly surprising. I think the overall gain in elegance is worth it, but I'll have to see how it works in practice.

Ah, so the parens are part

Ah, so the parens are part of the function call syntax, not part of the tuple literal syntax. Giving all objects a _0 method is an interesting approach.

The only way I see this could cause trouble is if you add overloading. For example:

def foo(a) ...
def foo(a,b) ...

Now a call foo(1,2) is ambiguous. If you differentiate between 1-tuples (a) and plain objects a, then you get in trouble with parentheses, e.g. (a+b)*c... there doesn't seem to be a nice solution that solves both problems.

Giving all objects a _0

Giving all objects a _0 method is an interesting approach.

Yeah, it feels a bit dirty, but it accomplishes my main goal, which is to make tuples concatenations of values, not containers. A 1-tuple should just be a value, not a thing containing a value (unlike, say, Python). Those semantics don't exactly hold up, but that's what I was going for.

Now a call foo(1,2) is ambiguous.

I'm still thinking about multimethods. In this case, it would just have to pick the more specialized match and maybe I could have longer tuples take precedence over shorter ones? I'm not sure. I'd have to think about it some more.

If I wanted to make tuples container-like, I could address that by having a comma with nothing after it create a tuple. So you could do (a,) to create a 1-tuple.