archives

Help with study of functional programmers

Are you currently developing or maintaining a medium to large-sized
program written in a functional language, such as Haskell, F#, OCaml,
or Lisp? I'm a PhD student doing a study of functional programmers, as part of
a research internship at Microsoft, and I would like the opportunity to look over
your shoulder while you do debugging or coding on your project.

I'm looking for people with at least a year's experience doing
functional programming, and who are currently working on a real
project (i.e. for some purpose other than learning functional
programming). I'd simply come
watch you work, and ask a few questions along the way. You'd do
whatever you would normally be doing. If you're near Seattle or
Portland, I'd come to your office for a couple of hours. If you're
not near Seattle or Portland, then we'd set you up with LiveMeeting
or some other remote screencast software so I can watch you from here.

Obviously security concerns are an issue - I will not share any
proprietary information that I learn about while visiting you.

In exchange for your help, Microsoft will offer you your pick of free
software off its gratuity list (which has about 50 items, including
Visual Studio Professional, Word for Mac, XBOX 360 games) or any book
from MS Press.

We're doing this because expert functional programmers have not been
studied much. We plan to share our findings through academic
publications, to help tool developers create debugging tools that are
genuinely helpful in real-world settings.

I'm hoping to finish my observations by August 8th, so please contact
me immediately if you're interested!

Thank you,

Chris Bogart
425-538-3562
t-chribo@microsoft.com

(Correction 7/18/08: I was mistaken in my original posting: I *can* use
participants in Europe as well as the US)

Ada, the Ultimate Lambda?

Chris Oakasaki has a blog post about teaching functional programming using Ada. He says

Why, with excellent functional programming languages like Haskell or ML easily available, would I choose to do a functional programming project in Ada? (I CAN STILL HEAR YOU LAUGHING!) Partly because we use Ada in some courses at our institution, and I wanted a library that could help in teaching recursion to novices. But, honestly, mostly I just wanted to see if it could be done

The idea of crossing paradigms has been around awhile. SICP rather famously introduces object orientation by building it on top of Scheme.

What do you think about teaching or learning paradigm A in a language strongly oriented towards paradigm B? What's gained? What's lost?

Practical Bits of Making a Compiler for a New Language

I'm designing (and hopefully implementing) a new programming language. Now, I've taken a bit of language-design advice I once heard from a wise old professor, which is: "Never add an entirely new feature to an entirely new language." Instead, I'm consolidating older, thoroughly-tested ideas whose time has come into a language suited for a domain that has yet to see such features.

The vision? An advanced, almost-type-safe imperative language with functional and object-oriented features for kernel and base-library level systems programming. The goal in creating this? To bring the convenience of high-level language features to low-level programming.

The very foundation of the language is:

1) Lexical scoping.
2) Imperative block structure based on statements.
3) A strong type system with a single exceptional case of code marked "unsafe" (because when writing a kernel you either allow unsafe code somewhere, or develop a type system that could probably win a Turing award).

The features I'm putting in are:

1) Module/unit files. See: Delphi, Python.
2) Lambda functions and full lexical closures. Every function treated as data comes with a closure by default, even when it has no closed-over lexical variables. When a function lacks such variables, the compiler can optimize out their storage, but the type system must treat them as the same! Tail-recursion optimization will be mandatory.
3) Exception handling as seen in Java, C++, Delphi, Python and practically every other modern language. Not having it is downright silly, especially when a clever compiler can allocate an exception object on the stack, thus allowing even the heap allocator to safely throw exceptions.
4) Templates, as almost directly stolen from D. Only types can be passed to templates.
5) Object orientation based on generic functions and specializing multi-methods. However, there won't be pre or post methods, and each class can only inherit from one other class (thus VASTLY simplifying inheritance-related problems). Generic functions and specializing methods are used to provide dynamic method dispatch.
6) The unification of tuples, structures, and classes. A class is merely a structure that can inherit from another class, have methods specialized on it, and have multiple scopes (private, public, protected, et al). A structure is merely a tuple with named members. I see no reason that I can't declare a tuple type in which only some members have names, and leave the language to provide indexes for all, then declare a new tuple type that inherits from the first, and finally specialize a generic function on that latter type.
7) Dynamic arrays, stolen right from Delphi. I will also include static arrays just for the convenience of allocating a known number of items off the stack when heap space often runs scarce at the lowest level.
8) Typed reference values. These function like pointers in that their implementation identifies an address in memory, but to the programmer they can either contain the NULL value or reference a known object whose address was obtained via a take-the-address operator. Objects (as in instances of classes) will be referred to by references only. I'm actually quite on-edge about this one as it will introduce the possibility of dangling references. The level of safety provided by an utterly pointer-less language would be nice, but so would not copying huge amounts of data to pass an object into a function.
9) Blocks of code marked unsafe in which pointers, pointer arithmetic, inline assembly and re-interpretive type casting all become legal. These should only be used for ultra-low-level functions such as writing to device registers, and a perfectly-formed unsafe block will make the compiler emit a warning unless the programmer also marks it assumed safe.
10) Let blocks for variable declaration. Sod just declaring a new variable out of the blue, we should be marking where our variables go into and out of scope.

Features left out as deliberate design decisions are:
1) Type inference. It requires writing an inconveniently extensive type system and disallowing either references or mutable variables.
2) Garbage collection. You guys will hate me for this (because everyone and everything has garbage collection nowadays), but when writing an operating system or a device driver one can't find themselves left at the mercy of a GC, either for performance or safety reasons. For these domains you need a language with such a small runtime-library you can write it in 3 pages of code, or none at all.
3) Looping constructs. In the presence of tail-recursion optimization and templates, they simply aren't needed. Let people write functional code!
4) Concurrency. Once again, you simply don't assume certain language features at the lowest level.
5) Implicit type conversions. The only working implicit conversion will be the assignment of an integer to a floating-point variable.
6) Operator overloading. It makes parsing that much harder.

Now, the things I need your advice about are:
1) Syntax. Should I stick with C-like syntax for familiarity, switch to a Lispy s-expression syntax for metaprogramming capabilities, or use some completely other syntax I haven't thought of?
2) The references. Is not copying data really worth aliasing bugs?
3) Any advice on the implementation of generic-function dispatch, particularly in the case where methods might only become known at link time? I've got a system that will work fairly well already, but it only works when the programmer can only specialize on a generic function in the same module that declares and implements that generic function. Still, assuming that, it can do some work to type-check and ambiguity-check the generic-function call at compile time, resulting in only a small list of the type-safe possible dispatches to be looked up at runtime.
4) Do you think I've completely missed something?

I've got a textbook on compilation techniques and I'm raring to go at this, so please write in any ideas or criticisms you might have. Except about the mandate of garbage collection. I refuse to have my thread scheduler pause for a GC, though I feel quite happy about having one for writing desktop applications.