Help with N-Ary functions?

I feel incredibly silly posting this, but I could use some pointers.

There seem to be two schools of function specification: one in which all function types are expressed as T1 -> T2, and the other in which function types are expressed as T1 x T2 x ... Tn -> Tout. The difference here is that the second makes function arity part of the type.

In BitC, we initially chose the N-ary functions because I was slow figuring out how to make C's void type come out right. Later, we retained it because I thought that arity checking was important. That is, given a function:

(define (f x y) ...)

it seemed like a good thing to be able to notice that:

(f 1 2 3)

was a malformed call because the number of arguments do not agree. The alternative was to define application to occur by pair consing (similar to Scheme style, but with using an unboxed construction):

(f 1 2 3) =>

(f (1, 2, 3))

My concern about this is that type checking doesn't accomplish arity checking in the case where the type of the last parameter to the function is parametric. That is, if we get a function f whose type is something like:

(int32, float, 'a) -> bool

then you'll get to pass as many arguments as you want.

The N-ary function thing is now getting in our way, and I would like to switch to something like the pair-consed view. There seem to be three resolutions:

Conclude that losing the arity check when the last parameter is polymorphic is survivable.
Check arity as a separate syntactic matter that is independent from type.
Ask how other people have dealt with this.

So: how do other polymorphic languages handle this?

Arity and curried functions

If you ditch the variadic (N-Ary) thing then currying and arity checking all fall out with no problem. It's always seemed to me that while most Lisps have supported variadic functions, they really don't need to since lists are so syntactically easy to create using a quote or something similar. E.g. instead of a variadic sum that lets me write (sum 1 2 3 4 5) I use sum over lists (sum '(1 2 3 4 5)). There is a syntactic loss here, but the gain in currying may be worth it, especially for higher order functions.

(define (incList xs) 
  (map (+ 1) xs))

That said, one thing to be aware of in the interplay of currying, polymorphism, and side effects is the issue that MLers call the "monomorphic value restriction" or just "value restriction." From the comp.lang.ml faq:

What is the value restriction?

The value restriction is a feature of SML '97 which was introduced to
address some issues with polymorphism in the presence of effects.
The basic idea is that when a variable is bound to a polymorphic
expression, it must be the case that the expression is tantamount to
a value: that is, that it is guaranteed not to raise an exception or
allocate memory. For the purposes of typechecking, the class of
values is conservatively approximated by the syntactic notion of
"non-expansiveness".

Values (non-expansive expressions) are:

- functions
- constants
- variables
- records of values
- constructors applied to values

So for example, while in the following code f has type 'a -> 'b -> 'b,
g cannot be given the type 'b -> 'b because of the value restriction.

fun f x y = y
val g = f 3

The expression (f 3) is polymorphic (at type 'b -> 'b) but is not a
value, and so the value restriction forbids g from being bound
polymorphically. Either the code will be rejected, or g will be
given a useless "dummy" type.

In practice, this is usually easy to get around by eta-expanding:

fun f x y = y
val g = fn x => f 3 x

More information on the value restriction is available from a number
of sources, in particular:

- Section 4.7 of "The Definition of Standard ML". (see above)

- Pages 321-326 of Paulson's "ML for the working programmer". (see
above)

- http://cm.bell-labs.com/cm/cs/what/smlnj/doc/Conversion/types.html#Value
This includes discussion of the particulars of SML/NJ's treatment
of the value restriction.

- Extensive discussion in the comp.lang.ml archives.

By James Iry at Mon, 2008-09-29 04:48 | login or register to post comments

Value Restriction - don't look at it.

I still wonder whether that value restriction was just a dumb idea, or a legacy mistake. Seems like the ML inferencer doesn't skolemize type variables at the right location?

By marco at Mon, 2008-09-29 17:07 | login or register to post comments

Nasty issue

Actually, it's a nasty issue, and it gets a lot worse with explicit unboxing. The question is whether an aggregate containing a single mutable field must be subjected to the value restriction.

Fortunately, the issue does not arise for function parameters, because the function will be specialized at application time regardless.

By shap at Tue, 2008-09-30 00:52 | login or register to post comments

Thanks

Thanks for your comments, James. Currying, in our view, is a bug rather than a feature. It is an explicit goal of the BitC design to support strictly non-allocating code. The problem with the currying construction for multi-argument procedures is that it leads very easily and obscurely to heap-based storage allocation at run time. We therefore made a conscious decision not to adopt currying.

We are aware that sufficiently sophisticated compilers can then remove the overheads, but this in turn introduces serious challenges with assuring the compiler itself and also with the developer-predictability of the outcome. In effect, currying (along with many other possible features) would place us in the position of mandating certain optimizations in every conforming BitC implementation.

In light of this, the question here is whether we will continue to have N-ary functions or whether we will adopt implicit pair-consing of parameters.

In case it isn't clear, what I mean by pair-consing is a bit different from lisp-style consing. The end result in an N-argument call is an N-tuple of unboxed elements. Structurally, the net result on the call stack has identical layout to the C calling convention on most targets.

By shap at Tue, 2008-09-30 00:02 | login or register to post comments

Got ya

Now I understand the question better. When you asked about T1 -> T vs (T1 x T2) -> T I misread the question as T1 -> T2 -> T vs (T1 x T2) -> T

Anyway, in my limited experience with languages that have both polymorphism and variadic arguments (Java and Scala), the user is required to explicitly mark the variadic formal parameter - i.e. a polymorphic function type like 'a -> bool will never match to a variable argument list, but some notation like 'a* -> bool will (or, in C tradition, 'a.. -> bool).

If you can live with those constraints, it seems like you can follow typical C style calling conventions - actual variadic parameters are pushed on the stack per normal with an extra value on the stack indicating the number of values in the variadic list.

Here's a Scala example. Note that the JVM's stack security prevents a function from using variable sized stack arguments so Scala uses a list style sequence. But the representation and the typing issues should be orthogonal.

So here's a function that takes x and y of the same type and then a variable number of arguments of the same type. It returns a 3-tuple consisting of the first two arguments plus the variadic argument transformed to an actual list. To read the example, Scala uses inline "definition : type" notation, and [A] indicates that "A" is a ("for all") type variable. Underneath each expression is a result labeled resn : type = value of resn

scala> def foo[A](x : A, y : A, as : A *) = (x,y,as.toList)
foo: [A](A,A,A*)(A, A, List[A])

arity is enforced

scala> foo(1)                                              
:6: error: wrong number of arguments for method foo: (A,A,A*)(A, A, List[A])
       foo(1)
       ^

the variadic argument is optional

scala> foo(1,2)
res1: (Int, Int, List[Int]) = (1,2,List())

An example with many values

scala> foo(1,2,3,4,5)
res2: (Int, Int, List[Int]) = (1,2,List(3, 4, 5))

The variadic argument plays in the type system. Scala has a top type called Any which is the only type that unifies integers and strings

scala> foo(1,2,"hello!","world!")
res3: (Any, Any, List[Any]) = (1,2,List(hello!, world!))

Another function that puts a limit on the type of A will prevent that unification from working

scala> def bar[A <: AnyRef](x:A, y:A, as:A*)=(x,y,as.toList)
bar: [A  bar(1,2,"hello!","world!")
:6: error: inferred type arguments [Any] do not conform to method bar's type parameter bounds [A <: AnyRef]
       bar(1,2,"hello!","world!")
       ^

scala> bar("hello!","world!",1,2)    
:6: error: inferred type arguments [Any] do not conform to method bar's type parameter bounds [A <: AnyRef]
       bar("hello!","world!",1,2)

Polymorphic arguments are not automatically variadic

scala> def baz[A](x:A,y:A) = (x,y)
baz: [A](A,A)(A, A)

scala> baz(1,2,3,4)
:6: error: wrong number of arguments for method baz: (A,A)(A, A)
       baz(1,2,3,4)

By James Iry at Tue, 2008-09-30 02:35 | login or register to post comments

Thanks. That is very

Thanks. That is very helpful. Are there other languages whose handling of this I should look at in addition to Scala?

By shap at Tue, 2008-09-30 03:27 | login or register to post comments

Subtyping?

Oh, and what subtyping relationship (if any) exists between variadic and non-variadic functions? In particular is 'a...->bool to be accepted where 'a->bool is expected?

By shap at Tue, 2008-09-30 03:35 | login or register to post comments

Subtyping Relationship

There isn't one. In a system like that "'a.." is treated a bit like there's a type constructor called "..". Using ML style notation it's as if "'a.. -> Bool" is a syntactic shortcut for "Variadic 'a -> Bool" but unlike type constructors like List, the caller doesn't have to explicitly construct a Variadic argument since the compiler will implicitly do so.

Scala's variadic arguments are in 4.6.2 of the Scala Language Specification. In Scala, T* is explicitly indicated to be a synonym for Seq[T] plus some syntactic magic. However, for your needs, the "Variadic" type need not be denotable by the user except using ".." on a formal parameter.

A very similar variadic system can be found in Java where "T.." is synonymous with T[] (an array of T) plus syntactic magic.

By James Iry at Tue, 2008-09-30 03:59 | login or register to post comments

Row polymorphism?

Is Scala's "variadic typing" just the same thing as row polymorphism?

By Charles Stewart at Tue, 2008-09-30 11:13 | login or register to post comments

Cons trees

My concern about this is that type checking doesn't accomplish arity checking in the case where the type of the last parameter to the function is parametric. That is, if we get a function f whose type is something like:
(int32, float, 'a) -> bool
then you'll get to pass as many arguments as you want.

There are two obvious ways to represent parameters using cons trees:

(cons* arg1 arg2 arg3)
(cons* arg1 arg2 arg3 '())

AFAICS, this problem with last argument parametricity can only trouble representation 1. Have I missed something?

If you don't like the overhead of passing about the nil token, you can do what ML &c do and use n-ary tuple types, which seems natural enough.

By Charles Stewart at Mon, 2008-09-29 07:50 | login or register to post comments

Pair consing is not list consing

I understand the comment, but I'm not sure that your point really helps us. The pair-consing option yields an N-tuple of unboxed elements rather than a list. The analogous construction would implicitly encode the procedure:

(define (f i:int32 c:char) ...)

as taking a tuple of type int32 x char x Unit, and then take the argument assembly rule would be that the presented arguments are pair-consed with an implicit Unit on the right-most position.

I don't really see any advantage to this vs. just doing the arity checking by hand, and it has the disadvantage that it would preclude variadic procedures.

By shap at Tue, 2008-09-30 00:09 | login or register to post comments

Do not understand...

Does your type (int32, float, 'a) -> bool mean "a function which gets three arguments where the type of the last argument is unknown", or "a function which gets any number of arguments of which only the first two types are known"?

I think it is easy enough to write a type-checker for both kind of meanings (or even mixed).

I suppose you just need an extra rule in a standard HM type checker that any function which ends with a variadic type will eat all arguments, and maybe some syntactical rule what makes up all arguments (if you go higher order).

Just guessing.

[Extra thought: in the case of higher order languages, arity of functions is unknown, but can in most cases be derived syntactically from the function definition. It is not that difficult to extend HM type inference with extra integer constants and unknowns for the arity check]

By marco at Mon, 2008-09-29 17:19 | login or register to post comments

This is the heart of the question at hand.

In the current specification, the type

(fn (int32 float 'a) bool)

means "a function taking three arguments where the type of the last argument is unrestricted, and the function will be instantiated appropriately according to the type of the actual parameter in the usual way associated with polymorphic application." That is, our current specification is N-ary.

The option under consideration is to switch to a tupleized (pair-consed) approach, in which the corresponding type would be

(fn (pair int32 (pair float 'a)) bool)

In this scheme, all functions take exactly one argument, but implicit tuplization is done at the call site:

(f 3 #\c "abc") => (<prim-apply> f (pair 3 (pair #\c "abc")))

The problem with this approach is illustrated by this call:

(f 3 #\c "abc" "def") = > (<prim-apply> f (pair 3 (pair #\c (pair "abc" "def"))))

with the result that the third argument type 'a will unify with (pair string string). This unification has the side effect that arity violations go uncaught when the last formal parameter has type 'a. The catch, of course, is that this is exactly what we want to have happen when the procedure is intended to be variadic.

One possible view of this (which I am considering) is to adopt the position that enforcement of arity is orthogonal to type checking -- or at least that variadicity is encoded into function types in some fashion other than the instantiability of the final parameter. Perhaps introduce two function types fn and vfn. If we distinguish these, the HM adaptation is straightforward.

Another position is to declare by fiat that I'm worrying over a whole lot of nothing here, and just switch to pair-consing.

The catch with the fn/vfn approach is that interesting issues arise about parameter compatibility at argument passing if multiple function types co-exist that are potentially copy compatible. This issue doesn't arise for methods, because methods and functions are not, in general, copy compatible.

By shap at Tue, 2008-09-30 00:29 | login or register to post comments

Lambda the Ultimate

User login

Navigation