How best to add a record type to my typed Scheme variant?

I'm working on a compiler for a Scheme-like language, 'Irken'. The purpose of Irken is to generate a VM for a (yet to be designed) python-like language. I intend for Irken to be the implementation language for the VM, and also to be the language by which the language is extended - similar to the role of pre-scheme in scheme48.

http://www.nightmare.com/rushing/irken/

One goal here is to eliminate the need for end users to program in C.

Another reason for the roundabout approach: I want massively scalable cooperative threads (think 100,000+ threads, think Erlang)... so using the C stack is out (as is BitC, which I've recently learned about here).

Irken is a whole-program compiler that generates one large C function - 'vm()'. Irken uses the gcc address-of-label extension and 'goto' to implement closures. The current continuation (i.e., 'stack') is heap-allocated.

A couple of months ago my desire for a speedy VM got the best of me and I decided to dabble with type inference - mostly so I could get rid of runtime type checks. It's also nice to get the type safety. 8^)

I now have ML-style let-polymorphism - the algebraic data types. I can implement nice polymorphic data structures like red-black trees, parsers, etc. I've been pleasantly surprised at how little trouble the type inference has been. I've been able to write well-typed generators using call/cc, for example.

I put off thinking about records for a while, assuming that it would be a sugary detail. Now I'm seeing that it's more of a tar pit.

So I'm asking for advice - given the purpose of Irken, what approach should I take in adding a record type? Depending on the amount of work and complexity involved, I could accept anything in the range of 'all records require type annotation' to 'everything is figured out magically by the compiler and has no run-time cost'. Leaning toward the latter direction, though. Also, it'd be nice if I could sugar it up to support some simple OO features.

I've read Wand's "Type Inference for Record Concatenation..." and a bunch of stuff by Remy related to OCaml's row polymorphism. I've also been reading Ohori's "A Polymorphic Record Calculus...".

The impression I have right now is:

Wand: simple monomorphic records + recursive types.
Remy: less simple polymorphic records + recursive types.
Ohori: really cool polymorphic records but no recursive types?

The Ohori approach also looks like I'd have to throw my existing code away - it seems to replace products and sums with labeled records?

Thanks for any and all advice!

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

[Admin]

Design questions of this kind are off-topic for LtU. Those wanting to offer advice should contact Sam offline.

Please read the Policy

Please read the policies document, especially 4a.

There are plenty of times I'd like to ask similar questions...

Anyhow, I rather like how Oz handles records. I've adopted it and have been tweaking it for total functional programming with structural typing.

There shouldn't be any problem with recursive (and corecursive) record types. I've managed to fit them readily enough into my own language, even with structural typing.

Why not BitC?

Sam: could you please identify your concerns about bitc on the bitc-dev list? It's not obvious to me why this doesn't address your needs fairly exactly, and I'ld like to understand that. The list subscription info is at www.bitc-lang.org/mailman/listinfo/bitc-dev.

I expect that it does, but

I expect that it does, but that he either isn't aware of continuation-passing style, or is prejudiced against it like I was for awhile. Any language that guarantees tail calls is a legitimate target for Erlang-scale concurrency once a program is in CPS form.

BitC Tail Calls

Last I saw, BitC's tail call optimization was insufficient for general CPS. It only optimizes locally declared methods, and thus is only sufficient for folds and looping tasks.

The author of Irken, Sam, is aware of continuation passing style; it receives specific mention in the Irken README file:

Irken

The Goal:

A new dialect and implementation of a python-like language that supports massively scalable cooperative multi-threading, and eventually, erlang-style concurrency.

How:

A VM that supports continuations. 'threads' will be built using continuations. The VM will be written in a new high-level language ('irken-low') that compiles to C. Continuations will be available in both the target language and the implementation language, and since we want to support millions of threads, we cannot use the C stack. Our approach is to generate continuation-passing style (CPS) code in C.

Apparently the author plans to lift the evaluation via a loop in C, and is likely disfavoring BitC as a 'target' language because it provides no significant advantage for this particular task... and happens to have a much smaller installation base.

My question wasn't so much

My question wasn't so much "why not target BitC" as it was "why not just use/contribute to BitC instead of building yet another new language here"?

There are a few different

There are a few different reasons why I hadn't considered BitC, the main one being that I didn't discover it until recently - probably because I hadn't considered using type inference before.

It's possible that the target language could do without the C stack, but I've learned painfully that the interaction with C turns out to be pretty important. I'd also like to have full continuations at both levels if possible.

With my previous company we did a lot of work with Stackless Python, and whenever we needed to drop to C for performance we had to write that code in CPS 'by hand'.

My impression from reading about BitC (still trying to get it up and running) is that it wants to preserve separate compilation and assumed that this implied use of the C stack.

I'll post again on bitc-dev, regardless of what happens with Irken I'm very interested in it and EROS.