Multiple Value Return - Common Lisp vs. Tuples and destructuring

I'm most familiar with the Common Lisp style implementation of the multiple-value-return feature.

But I notice that several other languages, including Python, use tuple return values and do destructuring on these tuples.

What are the advantages/disadvantages to each? Semantically? Implementation and performance?

Are there other (better) models of multiple-value-return that I should study?

Thanks much!

Scott
- No, I'm not a student :-)

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Scheme

In Scheme, CPS-conversion is a common strategy for compiling, and in CPS, returning multiple values is just calling the continuation with multiple arguments. The only point of nonduality between arguments and values is that you know how many arguments a Scheme function accepts (well, unless it has a rest argument), whereas you cannot always statically determine how many values it returns.

Now that I think about it....

...it's always undecidable at compile time how many values a Scheme function returns, unless it never calls an unknown function.

Great point.

Somehow, I missed this comment months ago. I read it, but it didn't stick. It's kind of obvious in retrospect; I've recently discovered some implications that make me appreciate it greatly now.

I came back to re-read Matt's comments, but then I saw this. Hopefully, more later...

[edit: to be clear, I was referring to your original post.]

Haskell and CPS

I spent last evening exploring how one could use CPS to work around GHC's weaknesses with respect to returning lazy tuples. A few examples work out beautifully; so beautifully I felt sure this idea was going to work well in more cases. But alas, the beautiful examples seem to be special cases.

If you wrap a Control.Monad.State in a ContT, or vice-versa, and work out the corresponding get and put operations, you come up with a rather elegant monad with a number of tradeoffs versus a "normal" state monad. A striking feature is the way that repeated tupling disappears, returning only one tuple instead of many. (Under typical patterns of usage, anyway, only involving callCC, put, and get, and not the more exotic mapCont operator.)

However, in many (most? all?) of the cases where you want to incrementally return a lazy list, such as the splitAt example below, using CPS to avoid returning tuples is not obviously useful. The problem is that, without tuples, your one "channel" out of the monad (through the appropriate run operation) is occupied by providing partial data. Using tuples there would then mean that you'll be returning quite a few of them, unlike typical applications of ContT (State st) r a.

[edit: I missed the obvious caveat: wrapping a lazy state monad in a ContT produces a strict semantics, although laziness can be easily recovered via mapCont. Something to keep in mind...]

As is the amount of argument

As is the amount of argument it consumes (reverse-engineering your definition of 'decidable', decidable usually concerns a binary question).

You cannot statically in Scheme, or any language with support for variadic functions really, determine in each function call how many arguments are going to be passed.

No such thing

CPS is a good suggestion. I'd also look at a language like Haskell, which formally speaking only has single-argument functions. This makes things consistent by going the other way, and of course tuples or other data types can always be used to return multiple values.

CL style return values and performance

One thing nice with CL is that one doesn't have to cons in order to return multiple values, and one can accept just one value and ignore the rest. I can imagine the same is true of Scheme (the consing part), and from what I understand of CPS, there as well.

I recall a paper (Dybvig?) about Scheme where the number of desired return values was encoded in the instruction stream itself, with the callee peeking into the instruction stream.

Do tuple returning language typically cons in order to return multiple values. Would optimizing this away be a static vs. dynamic language issue?

Many thanks.

Scott

Glasgow Haskell

Well, R5RS Scheme doesn't require values and call-with-values to be implemented in any particular way. They could be implemented with simple consing. However, Chez Scheme does implement it efficiently, as described in Kent Dybvig and J Michael Ashley's paper. And yes, this is an eminently practical feature.

As for optimizing consing away in statically typed languages... this would be quite nice, but the Glasgow Haskell Compiler doesn't do this. One of GHC's dirty little secrets is that the prelude definition of splitAt is not very efficient. Traversing a million elements twice is more efficient than returning a pair a million times to avoid the extra traversal: [edit: this has been fixed sometime between 6.8.3 and 6.10.3]

module SplitAt(reportSplitAt, splitAt, demandList, demand) where

reportSplitAt :: Int -> [a] -> ([a], [a])
reportSplitAt n xs = (take n xs, drop n xs)

demandList []     = ()
demandList (x:xs) = demandList xs

demand (xs, ys) = (demandList xs, demandList ys)

splitAt is defined as follows inside of GHC.List and re-exported through the Prelude:

splitAt (I# n#) ls
  | n# <# 0#    = ([], ls)
  | otherwise   = splitAt# n# ls
    where
        splitAt# :: Int# -> [a] -> ([a], [a])
        splitAt# 0# xs     = ([], xs)
        splitAt# _  xs@[]  = (xs, xs)
        splitAt# m# (x:xs) = (x:xs', xs'')
          where
            (xs', xs'') = splitAt# (m# -# 1#) xs

The hash marks can be ignored, all this means is that the Ints are unboxed, meaning they don't get wrapped up in thunks. The real point of interest is that this "optimized" version of splitAt iterates over the list only once. Lets see how it does:

lps@azarel:~/Programs/snippets $  ghc -O -c splitAt.hs
lps@azarel:~/Programs/snippets $  ghci splitAt.hs
GHCi, version 6.8.2: http://www.haskell.org/ghc/  :? for help
Loading package base ... linking ... done.
Ok, modules loaded: SplitAt.
Prelude SplitAt> :set +s
Prelude SplitAt> demand (reportSplitAt 2500000 [1..2500000])
((),())
(7.09 secs, 274966016 bytes)
Prelude SplitAt> demand (splitAt 2500000 [1..2500000])
((),())
(8.59 secs, 261429012 bytes)

Disappointing? A bit. I suppose somebody should report this as a buglet, but I doubt that cleaning up the definition of splitAt would make much difference in anybody's code. [edit: It's worse than that. Using splitAt as defined in the Haskell 98 report in place of GHC's splitAt potentially introduces memory leaks. That is fixable, but one must tread carefully.]

The bigger problem is that this type of recursion should often be avoided in GHC, if you care about every last ounce of performance. The real payoff here would be to optimize away the consing as you suggest. This would almost certainly "pay it's own way" in Kent Dybvig's sense of the phrase, in that the compiler itself would get faster, despite doing more work.

Speaking of...

... does anybody know of a compiler (OCaml, MLton, maybe?) that will optimize away consing in these types of cases?

Supero perhaps

Have you tried Supero? Your code should run as is and the compiler does some pretty deep optimizations.

ML

From a variety of sources I've heard it claimed that ML implementations do a lot of work to avoid tuples due to the uncurried style that is normally used. So I highly suspect MLton and SML/NJ handle these and many other scenarios. There are almost certainly some good papers about those aspects of those implementations.

Shrinking to avoid consing

One basic way to avoid consing is to transform code of the form

let
   val t = (t_1, t_2)
in
   (* ... *) #i t (* ... *)
end

to

let
   val t = (t_1, t_2)
in
   (* ... *) t_i (* ... *)
end

Now, if t is no longer used, it can be eliminated. This kind of optimization is fairly easy to perform in a CPS or SSA based intermediate language. It doesn't directly apply to an arbitrary multiple return values case, but may be applicable after the function being called is inlined.

Shrinking Lambda Expressions in Linear Time and Compiling with Continuations, Continued use this optimization as one of the examples.

Not so sure...

It looks like GHC will indeed avoid consing in many of these cases. Here's an example of this in action.

I'm not sure why it wouldn't work for splitAt. You may know more about this than I do, so maybe you can clarify?

I was expecting somebody to yell.

Hmm... I stand corrected. Trying to understand the performance of GHC is not a trivial task, it's easy to make mistakes, and I've been wrong in the past. (Indeed, my first assertions don't have a good track record to date.) Thanks for the articles!

My misconception was based off of more than just splitAt: my experience has been that multiple return values usually incur much steeper penalties than they should. Many of these experiences revolve around implementations of various monads and splitAt-style recursions. So I'll have to dig deeper.

Off the cuff, I'd bet that this has something to do with laziness. "Avoiding consing" in this case is a bit of a misnomer: laziness involves heap allocations in GHC, it has been called "evaluation by allocation". It's just that one hopes that returning multiple values doesn't involve extra heap allocations, if indeed allocation is at the heart of this performance discrepancy.

I also need to think about how the fact that Haskell differentiates between _|_ and ( _|_ , _|_ ) impacts potential implementation techniques here. Perhaps equating these two (conceptual) values, like Miranda, is a better choice.

Shortly after I posted yesterday, I replaced the pairs inside of splitAt# with strict pairs. The result was something that runs somewhat faster than the definition in the Haskell 98 Report, sort of. Except it has a tendancy to blow the stack, and it now computes all 2.5 million elements of the first list even if you only want thirty, which means use cases outside this particular microbenchmark can get dramatically slower. In any case I was soon doubting my hypothesis.

GHC is indeed glorious, but the downside of all this glory is these types of cognitive challenges. There is a lot to be said for understanding (approximately) how your compiler works. Playing devil's advocate a bit, I'd say Scheme/Common Lisp is the ultimate high-level language, largely because of gratifyingly good implementations, and these implementations are (relatively) easy to understand. Haskell, sadly, has a ways to go before reaching the same level of implementation maturity.

After all, languages themselves are useless. Only implementations make them useful.

Agreed, on all counts

  1. Yes, "avoid consing" is confusing terminology here. It's clearly a lispism that doesn't really translate... I just figured I'd stick with the jargon of the original question.
  2. Yes, I think the distinction between strict and non-strict data constructors is at least as important here as the other optimization. However, I don't think that making tuples strict by default would be a good change.
  3. I agree that complexity is a significant downside to GHC's (significant) glory. I'm not able to come up with this stuff on my own (I linked to a blog post by Don Steward for a reason), although I like to imagine that if I wrote Haskell for a living, I'd develop the expertise.

In any case, the main point I'd like to emphasize is that GHC is really, really glorious.

Strict Pairs

Haha, yes. The reputation of the Simons is well deserved. Most of us here at LtU are mere mortals by comparison. (I'd make an exception for Oleg, at least, and I think I've seen Lennart post at least one comment.)

I'm not sure I quite understand what your second point is getting at. The difference between Miranda, whose pairs are also lazy, and Haskell can be demonstrated with the following code:

bot = bot

konst (x,y) = 3

Now, what's the value of konst bot? In Haskell, the value is _|_, while Miranda returns 3. Pattern matching on a pair cannot fail, which is quite a bit different than a "strict pair" as used in Don Stewart's rather informative blog post.

And yes, even this would be significant change for Haskell that probably can't be justified. Moving to truly strict pairs would create an entirely different language. Even so, I think I would prefer a pure, strict-by-default language, that I can make lazy as appropriate. But that language isn't Haskell.

Irrefutable patterns

Just want to mention that this is an issue of pattern matching semantics, not tied to tuples, so any kind of data structure would have the same issue. Lazy pattern matching is available (in the form of irrefutable patterns):

bot = bot
konst ~(x,y) = 3

Any pattern can be prefixed by a tilde to become irrefutable, moving the possible bottom deeper inside the lambda.

Almost.

Your response is 90% right. However, there is a subtle difference: in Miranda, those two values are totally indistinguishable, whereas in Haskell, some other function can still distinguish them. Technically, the difference is deeper than pattern matching semantics. It modifies the Complete Partial Order of the domain: as far as tuples are concerned, there is one more distinct point. (I'm certainly no expert in Domain Theory. I can, to this day, rattle off a number of definitions and describe the basic ideas correctly, but I still don't "get it".)

You would be 92% right if Haskell required every pattern match against a pair to be irrefutable. However, this also makes patterns inside the pair irrefutable, so you'd have to translate konst (x:xs, y) = ... in Miranda to something like konst ~(a,y) = case a of { (x:xs) -> ... } in Haskell, where a is a fresh variable, except this translation doesn't quite work if konst has more than one rule, because if the pattern matching on the list fails, Miranda will fall through to the next rule, whereas Haskell will have already committed.

Semantically, irrefutable patterns are exactly equivalent to this kind of translation (from Haskell to Haskell), and this works for an arbitrary number of rules:

konst ~(x,y) = ...

konst a = let x = fst a
              y = snd a
           in ...

(However, this is not how the Haskell 98 report defines irrefutable patterns, rather they define let and other constructs in terms of "primitive" irrefutable patterns. Though this approach is equivalent, I find it much less intuitive, personally.)

As far as implementation is concerned, distinguishing these values means that the difference must somehow be reflected in memory. A naive implementation of Miranda might represent every tuple as an array of pointers: one to each of it's components, and each pointer would point to either a thunk (to compute that component), or a concrete value that's already been computed. A similarly naive implementation of Haskell would require the tuple itself to be represented by a pointer, which would point to either a thunk or an array of pointers a la Miranda. Obviously, this makes allocating a tuple directly inside something else (e.g. inside another thunk's closure) a bit harder to accomplish in Haskell.

Aha, I see

I see your point now with regard to #2. I misinterpreted it as a response to my musing if whether Haskell's lifted pairs were a good thing. However, I did bring up strict pairs as well, so I understand the source of confusion. Making Haskell's pairs unlifted like Miranda would be a minor-ish change, but as you rightly point out, making them strict by default would be truly cataclysmic.

The reasoning behind Haskell's pairs being lifted, it turns out, is that unlifted pairs have a complicated interaction with seq, which Miranda doesn't have. You can force evaluation by pattern matching in Miranda, it's just that this solution isn't polymorphic on the things you want to force.

I'm also now pretty certain that laziness is behind these inefficiencies. If you read the paper Constructed Product Result Analysis in Haskell, you'll find that GHC's optimizations only deal with strict pairs. (Or those that can be proven strict by the analyzer.)

I tried hacking a CPS + State monad that could return multiple results from run, and incrementally build one of them lazily, except that it only returned a single tuple, right at the end.

To get the extra result out, I used unsafePerformIO and IORefs to open up a "side channel", if you will. It's a thought-provoking exercise. It's mildly difficult to get it to mesh properly with laziness and preserve the desired runtime semantics. And when I did finally get something that worked, it broke when I turned on optimizations, so I had to figure out which optimization to turn off...

I'm suspicious that I ended up creating something remarkably similar to what GHC already does. At least, it exhibits the same performance characteristics as the safe and clean solution of returning many tuples, although these native tuples offer a substantially lower constant factor, taking about 1/3 the time.

Post it!

You should post your code somewhere!

Implementations are useless. Only languages make them useful.

My two citations would be (1) tensor calculus, and (2) the following:

By relieving the brain of all unnecessary work, a good notation sets it free to concentrate on more advanced problems, and, in effect, increases the mental power of the race. ... Civilisation advances by extending the number of important operations which we can perform without thinking about them.

Alfred North Whitehead, An Introduction to Mathematics, 1911

Simple optimisation

You could simply put the elements of the tuple into the appropriate registers/stack locations defined by your ABI instead on consing. In the case where you are returning an already constructed tuple, assuming tuples are immutable, simply copy the elements into the appropriate locations.

Edit: this won't work unless you have the static return type!

Oz

Inputs and outputs of functions are the same thing in Oz. For example:

(define (pair x y)
  (list x y))
(define (unpair p)
  (values (first p) (second p)))

e.g. (unpair (pair 1 2)) => (values 1 2)

in Oz:

proc {Pair X Y P}
  P = [X Y]
end

proc {UnPair P X Y}
  [X Y] = P
end

e.g.

{Pair 1 2 P}
{UnPair P A B} // A is now 1 and B is 2

Note that that is the same as

{Pair 1 2 P}
{Pair A B P} // A is now 1 and B is 2

We can even ask for the values in P before constructing P!

{Pair A B P} // bind A and B to the values that will eventually be in P
{Pair 1 2 P} // put 1 and 2 in P, now A=1 and B=2

Implementation?

Do you know how this is actually implemented?

Unification

Oz is a logic programming language, so this is simply unification of logic variables.

Right... so is Mercury, but

Right... so is Mercury, but they work hard to avoid unification when they can. Does Mozart/OZ have any important optimizations that users should be aware of in this regard?

A construct like this?

In my toy language language design, I've been trending toward a simple construct that supports just multi-value bind and assign.

def (a,b,c) = vals(1,2,3+4);
var m,n;
(m,n) = returns2values();

I'll encode the a pointer to (stack) storage for extra return values (if any) in the call frame, so no consing. Functions that return a single value can ignore this altogether.

It's nice, because with efficient mv return, it can cut down on needless funcalls and wacky return value encodings, allow for multiple value iterators and generators, cleanly define methods like Sequence.find(val:Obj -> Bool, Int) for found and index position and so on and so forth.

The only thing I'm sorely missing with this mv scheme is a similarly efficient call-with-values construct, which is actually really useful for abstractly composing all kinds of good stuff.

Scott

call-with-values syntax

I've always liked Python's syntax for call-with-values, which is extended into keyword arguments as well. If t is a tuple, d is a dictionary (of strings to values), x is some other value, f is a function that returns multiple values, and g is some other function, then

g(x, *t, *f(), **d)

calls the function g with x as a positional parameter, expands t into a sequence of positional parameters, expands the result of calling f into its constituent positional parameters, and expands d into a sequence of named argument=value pairs. In a statically typed language, everything except the **d expression can be implemented as efficiently as you like.

You're reading my mind

So far, my syntax design uses Python's *rest syntax for the rest argument to functions, although my *rest arg is a special stack allocated vector object that can be reified into a normal sequence on demand if need be.

Ideally, for calls and return values, I'd love to support.

// I don't support keyword args, so ** can distinguish
// call-with-values from *bar() unpack-seq-returned-by-bar
// as args.
foo(1, **bar(), *sequence)
vals(*seq, **call())
return(*seq, **bar())

In general, Python is a kind of inspiration for my toy language.

I like that it starts with a butt simple imperative, scripting language and offers *optional* advanced features that offer power to the programmer should s/he need them. The programmer doesn't *have* to learn a textbook's worth of concepts just to get going. While a very different language, Scheme is like this: (display "hello, world!")

I have a different set of optional features to offer the developer, among several others - optional static typing, lexical closures, a symbol data type, TCO, compilation to native code, binary data structs and linkage conventions, yada yada yada :-)

Scott

The only thing I'm sorely

The only thing I'm sorely missing with this mv scheme is a similarly efficient call-with-values construct

How does this call-with-values differ from the ordinary activation frame of the function?

I'm glad you asked that question :)

I was just reading Dybvig's paper again. My scheme thus far (doc'd, not implemented) places a compile-time *fixed* number of requested mv's (> 1) into some arbitrary location in the callee's stack frame.

I see that Dybvig's scheme places mv's at the end of the callee's stack frame upon function return, thus facilitating the caller returning an *arbitrary* number of mv's nicely passed in a call-with-values context.

I've got to grok Dybvig's scheme more closely, draw out some detailed stack diagrams and see if it fits in with the other stack frame invariants that I've presumed to date.

Scott

Not preemption-safe

Any scheme that leaves content "below" the stack pointer (i.e. after the end of stack) is not preemption-safe. Both UNIX and Windows have things that amount to signal frames that can be pushed below SP at any time, in such a way that the application can't see it coming. Unless your ABI gives you a well-defined "red zone" at the bottom of the stack (e.g. thumb, but almost nothing else), or you use heap-allocated frames, this technique simply won't work.

In a statically typed language, however, there is really no need. Tuples are, in effect, anonymous records whose arity (therefore size) is known at compile time. The compiler can leave a suitable "hole" in the caller-constructed frame, and either pass a pointer to the hole (making the MV return value a by-reference OUT parameter) or rely on the callee's ability to compute the location of the hole based on the SP-at-entry and knowledge of the calling convention rules. The by-reference approach becomes necessary when functions have a variable number of arguments. It's probably also necessary when curried application is used.

Multiple-value return in Common Lisp

When we added the multiple-value return feature to Lisp machine Lisp (from whence it went into Common Lisp), I believe that the desire to avoid consing was one of the serious motivations. Back then, we did not have good GC technology, and writing Lisp programs to carefully minimize consing was very common.

The other reason was for expressiveness: to make it clear that the function was really returning more than one result, rather than introducing a little list "type" (e.g. "a two-list of the file handle and a boolean to say whether...").

In Java, I've seen definitions of little classes whose only purpose is to be a way to return multiple values. The advantage of this approach is that the types and meanings of the returned values are documented (at least if you pick clear names and put in comments!). But it's rather verbose, in my opinion.

When I was first reviewing the early Java spec, I was surprised to see that there was no way to return multiple values. I wanted to lobby for adding that, so I tried to come up with a use case that was both simple and very compelling. I was unable to do so! So I didn't try lobbying for it. The designers of Java were pretty clearly trying to avoid a lot of bells and whistles (and syntactic sugar), and there's certainly a lot to be said for that.

Stack-based languages

You might want to look at concatenative languages (Joy, Factor, etc) where production (and subsequent consumption) of multiple values is trivial.

Of "consing" and avoiding multiple fun calls with mv return

First, I think "cons'ing" is a perfectly good term. Yes it is a "Lispism", but so what - most of the birth of FP (and GC and recursion in PL and on and on) comes from early Lisp tradition. In any case, it's a short hand preferable to the lengthy "heap allocating a new data structure."

But anyway, I've noted that mv return can also be used to avoid needless multiple function calls. Imagine a simple iterator construct. With MV return, we can obtain successive values from a collection (1) with a single funcall and (2) while avoiding needlessly allocating some kind of Some/None Option data type. To elaborate on the three idioms under comparison.

// MV call iterator - one call, no mem alloc
def Iter[T].next(-> Bool, T);

// HasNext/next Java style iterator - requires two calls
Iter[T].hasNext(-> Bool);
Iter[T].next(-> T);

// Some/None style iterator - allocs mem if optimization
//  Typically requires a pattern match construct, although
//   this often enjoys a speedy implementation.
datatype Option[T] = None | Some[T];
Iter[T].next(-> Option[T]);

Anyway, I'm surprised more languages don't support efficient multiple value return, at least allowing for fixed arity value return and ignoring of arguments "trailing" those of interest.

The semantics are, or can be made, clear, and the implementation technology is no mystery.

Different issue

scottmcl:

// MV call iterator - one call, no mem alloc
def Iter[T].next(-> Bool, T);

The reason Lisp does that is not that it has MV return. Any language with tuples could do that (which includes all those that have an Option type). In principle at least.

The real reason is that Lisp is not concerned with types. The above does not work in a typed language, because you cannot simply cough up a value of arbitrary type T in the empty case.

Anyway, I'm surprised more languages don't support efficient multiple value return, at least allowing for fixed arity value return and ignoring of arguments "trailing" those of interest

I'm rather surprised more languages don't support efficient tuples.

Pattern matching MV returns?

Designing my toy language, I hit on the idea of pattern matching multiple return values in order to avoid the problem with type safety pointed out by Andreas. An iterator example might look something like this.

mv-match some-mv-call()
| false => we-are-done()
| true, a ... => do-something-with-value(a) ;

The ... signals that we are ignoring any additional return values, although we could make this implicit as well. We could have an "| else" (or | _) clause as well.

MV bind and assign seem to be easily translated into mv-match based expressions, where too few return values throws a run time exception (unless verified by the type checker at compile time).

I'm curious - does this indeed solve the type safety issue? Am I missing some problem with the semantics of an mv-match style construct?

Thanks much.

Scott

equivalent to CPS

If you replace this boolean value true/false with an enumeration which you can call the continuation labels, you have basically recreated continuation-passing style (although you are branching in two steps, but true CPS is only one optimisation away...)

Consing

Consing is perfectly good terminology. It just didn't translate too well when the discussion veered off into the world of lazy evaluation. :-) (The terminology is even sometimes used inside the Haskell community, despite this issue)

Also, who says that using an Option type necessarily involves heap allocation? There are several implementation strategies that could avoid this...

Chez, at least, uses the somewhat common technique of using least significant bits for type tags, so many values don't require boxing. Small integers have these bits zero, for efficiency purposes: additions are simple hardware additions (modulo some run-time type checking, which is optimized away where possible), and multiplications are a single shift followed by a hardware multiplication. Memory allocations are aligned, freeing up the low-order pointer bits. No reference points exactly to the start of it's value.

The list of tricks goes on and on, and I'm certainly not familiar with all of them. However, if you read Chez's header for FFI, which allows C code to read and write many Scheme values inside the system and call Scheme from C, among other things, you can get some idea of what the machine representation looks like.

GHC

A relatively recent release of the Glasgow Haskell Compiler added an optimization called pointer tagging. This causes the cases of algebraic data types to be encoded in the lower-order bits of the pointer. This will do exactly what you want here. Maybe a (the Haskell equivalent to SML's option) will always be a tagged pointer regardless of what 'a' is.

Lua

Lua has a very nice simple syntax and semantics for multi-value call and return. It includes nice support for variable numbers of arguments and passing the "rest" of the arguments to another function. Internally this uses the stack so avoids consing.

Dao

It might be interesting for you if I mention how I implemented returning multiple values in Dao. In Dao multiple values are returned as a tuple. Noting that, in Dao, each element of a tuple is typed independently, eg., a tuple of an integer, a float and a string is typed as tuple<int,float,string>. The elements of a tuple can be unpacked into variables by:
(v1, v2, ...) = a_tuple;
And in doing this, the types of the variables are checked at compiling time against the types of the corresponding tuple elements.

In fact, supporting type checking of individual values in a multiple value return was one of the main motivation to add the tuple type into Dao. By the way, a Dao tuple has other interesting properties, for example, a field name can be assigned each element of a tuple, so that the tuple elements can be accessed by field names in the same way as class instances. As an example:
a_tuple : tuple<x:float,y:float,z:float> = ( 1.0, 2.5, 3.0 );
x = a_tuple.x;

This is exactly how tuples

This is exactly how tuples work in any statically typed language that has them.

Well, it is natural to have

Well, it is natural to have tuple to work in this way, otherwise there is no strong reason to have tuple type. But I am unaware of other languages where tuple can have named items, I am sure there are, but maybe not in mainstream languages.

Tuple with named fields = record

A tuple with labelled fields is usually called a record. Languages with structural record types do what you describe. Standard ML is one example (where a tuple just is a special case of a record with consecutive numeric labels).

Thanks, I didn't know it got

Thanks, I didn't know it got a different name as record. It seems to me that it is not necessary to keep them as seperated types, because the field names can be simply attached to the typing object instead of the tuple itself. In ML, is it possible to assign freely a compatible tuple to a record, and vice verse?

Implicit conversions

In SML, a tuple is a record, so your question boils down to: can I implicitly convert between "compatible" record types? The answer is no. It is not clear what it should mean, because records are equivalent up to reordering of fields. That is, {a=4, b=5} is the same value as {b=5, a=4}. So if you allowed binding

val x = {a=4, b=5}
val {c, d} = x

then it is not clear whether c=4 and d=5 or the other way round, and there would be no reason to favor one over the other.

Also, ML (and functional languages in general) have a strong tradition of avoiding any kind of implicit conversion, because they are considered harmful, and don't go well with type inference.

Tuple with named fields = record ?

Are you sure they are really the same? As you have shown, the ordering of items in a record is irrelevant in SML, but a tuple should have ordered items as I understood.

They are really the same.

A tuple in SML is precisely the same as a record with fields named 1, 2, ... So, for example:

{1 = "hi", 2 = 58008} 
("hi", 58008)
{2 = 58008, 1 = "hi"} 

are all equivalent. Tuple syntax is literally nothing but syntactic sugar, as the following shows:

- {1="hi", 2=58008};
val it = ("hi",58008) : string * int

[Edit: fixed my SML record syntax]

Difference

He's asking whether tuple with named fields = record , not whether tuple = record with certain names. To model his "tuples with named fields" as records, I think you'd have to have support name aliases - e.g. #2 and 'b' are both names for the same second element.

Should've been more clear

I was only saying that yes, I'm sure that tuples and records are "the same" in SML. But I agree with you: the sameness of tuples and records in SML is not the same sameness that the original poster wants.

Wait, I guess I should still have been more clear. ;) I'm just not sure how to say it...

It's clear

Indeed, such sameness regarding the tuple and record type in SML is not the kind I expected.

Aha - CPS!

I thought this looked sorta familiar :-) I'll pursue this line of inquiry further.

I came up in the 80's with 128k and 640K machines (don't laugh), so no matter how fancy the GC, allocating memory needlessly still just seems wrong to me, particularly in an inner loop.

Iteration is one *very* common example of an inner loop, hence my interest in stack based MV return - particularly since I want to support MV iteration, such as iterating over (key,val) pairs in a dict (hash, map, whatever) or (val, index) pairs in a vector, etc.

I'll check out tensor calculus - new term to me.

Scott

See also

See also an earlier LtU thread on optimizing calling conventions for functional languages. The intermediate language proposed by that paper has symmetric multiple-parameter/multiple-result function calls.

It won't help much on the question of which syntax is easier for your target audience of humans to understand, but it does explore the related optimization questions in reasonable depth.

Thanks for the reference!

Thanks for the reference!

Multiple value return with J and APL

Both languages are multiple input value, multiple return. The same code will handle single input, single return, multiple input, multiple return, and the various permutations of single input/multiple return, multiple input/single return, etc. automatically. Most of the time there is no need for the programmer to have different code to handle all of these cases.

If the last line executed in a function you write results in a single value, that single value will be returned. If it results in a multiple value, a multiple value will be returned.

Here is an example of using a built-in function (addition) in the J interpreter to add numbers. This will return single or multiple values depending on the inputs. It works the same way with functions you define yourself.

   NB. Add 2 single values to get a single value
   1+2
3
   NB. Add 2 arrays (lists, vectors, etc.) to get an array back
   1 2 3 + 4 5 6
5 7 9