Sound and Complete Type Inference in BitC

Just a heads-up that the extended version of the mentioned paper is now available on the BitC docs page. Given the amount of type algebra, it doesn't do very well in HTML. The PDF version is SRL2008-02. This is an extended version of our upcoming APLAS paper that includes all of the supporting proofs. So far as we know, this is the first sound and complete, integrative treatment of polymorphism and mutability in a single inference scheme.

Subsequent to this work, the BitC type system was revised to add a const meta-constructor and to re-specify mutability on a path-wise basis. This corrects a problem with deep immutability that was revealed by by-reference parameters and inner references. The revised type rules can be found in SRL2008-03. The revised rules remain sound and complete, and should be implemented in the compiler by some time early next week.

Still to come before we can release the first compiler are existential types, effect types, and structure subtyping.

I don't want to pick on Clojure (which looks very cool), but in light of the current discussion about state in Clojure, it may be appropriate to offer a counter-position:

Claim: If it isn't possible to efficiently self-host the runtime of your favorite functional language, then you can't argue credibly against imperative programming yet. At most, you might sustain an argument that imperative programming can (and perhaps should) be relegated to a very restricted part of your runtime library. This argument, in our view, evaporates in a puff of logic when performance criteria are admitted into the language evaluation.

In BitC, we take the position that pure programming is good when you can afford it (which is most places), but that there are places where we don't yet have any realistic alternative to imperative programming idioms.


In SRL2008-03

It is not clear to me what |_ \tau _| refers to. Is it "const"? Also, what are the semantics of const?

I have asked Swaroop to respond, but...

The "floor" notation is indeed const, so floor(T) is the type (const T). If T is not a type variable, then (const T) is some type T' that is copy compatible with T and is constructed by removing all shallow mutability from T. If T is a type variable matters are a bit more complicated and the const has to be handled in a sticky way (which is why it is a meta-constructor). For example, we cannot naively unify 'a and 'b when (const 'a) and (const 'b) are unified. Fortunately, a re-write is possible using a copy compatibility constraint that allows us to dodge the bullet here. When (const 'a) and (const 'b) are unified, we introduce the constraint (copy-compat 'a 'b). Or at least, this works conceptually. I'm not sure if this is what Swaroop actually did.
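The constraint rewrite described above can be sketched concretely. This is a hypothetical illustration (not BitC's actual implementation, and the class and constraint names are my own): when two const-wrapped type variables are unified, the unifier records a copy-compatibility constraint instead of linking the variables directly.

```python
# Hypothetical sketch of unifying const-wrapped types.  'Var' is a type
# variable, 'Const' models the const meta-constructor, and 'constraints'
# collects deferred copy-compat obligations.  Names are illustrative only.

class Var:
    def __init__(self, name):
        self.name = name
        self.link = None          # set once the variable is resolved

class Const:
    def __init__(self, inner):
        self.inner = inner

def resolve(t):
    # Follow variable links to the current representative.
    while isinstance(t, Var) and t.link is not None:
        t = t.link
    return t

def unify(t1, t2, constraints):
    t1, t2 = resolve(t1), resolve(t2)
    if t1 is t2:
        return
    if isinstance(t1, Const) and isinstance(t2, Const):
        a, b = resolve(t1.inner), resolve(t2.inner)
        if isinstance(a, Var) and isinstance(b, Var):
            # Do NOT naively unify 'a and 'b: under const they need
            # only be copy compatible.  Record the constraint instead.
            constraints.append(("copy-compat", a, b))
            return
        return unify(a, b, constraints)
    if isinstance(t1, Var):
        t1.link = t2
        return
    if isinstance(t2, Var):
        t2.link = t1
        return
    if t1 != t2:
        raise TypeError("cannot unify")

constraints = []
a, b = Var("'a"), Var("'b")
unify(Const(a), Const(b), constraints)
# 'a and 'b stay distinct; only a copy-compat constraint was emitted.
assert a.link is None and b.link is None
assert constraints == [("copy-compat", a, b)]
```

The point of the deferral is that `'a` and `'b` may later be instantiated with types that differ only in shallow mutability, which direct unification would wrongly forbid.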

Const Types

To add to Dr. Shapiro's reply:

The `const' type |_t_| represents the shallowly immutable
form of a type t, where the immutability is interpreted lazily.

In a declarative sense, the semantics of |_ _| are similar to the
\downtriangle operator in Figure 2. However, during inference, it
preserves the const-ness across future unifications of type-variables.

For example,
|_bool_| === bool
|_ (bool, (mutable bool)) _| === (bool, bool)
|_ ref(mutable bool) _| === ref(mutable bool)
|_ ('a, 'b) _| === (|_'a_|, |_'b_|)

In the paper, \Psi represents the mutable constructor, and
\Uparrow represents a reference type.

The S-Const1/2 rules in Figure-3 state that a const type occupies
the highest (most immutable) position in the lattice of
copy-compatible types.
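The worked examples above can be modeled in a few lines. This is an illustrative sketch only, using ad-hoc tagged tuples rather than BitC's real type representation; the `floor` function strips shallow mutability, distributes over components, stops at references, and stays "sticky" on type variables.

```python
# Toy model of the floor (const) operator from the examples above.
# Types are tagged tuples: ("mutable", t), ("ref", t), ("pair", a, b),
# ("var", name); base types are plain strings.  Illustrative only.

def floor(t):
    """Return the shallowly immutable form of t."""
    if isinstance(t, tuple):
        tag = t[0]
        if tag == "mutable":       # |_ (mutable T) _| = |_T_|
            return floor(t[1])
        if tag == "ref":           # const is shallow: refs are unchanged
            return t
        if tag == "pair":          # distribute componentwise
            return ("pair", floor(t[1]), floor(t[2]))
        if tag == "var":           # sticky on type variables
            return ("const", t)
    return t                       # base types are already immutable

# The four examples from the text:
assert floor("bool") == "bool"
assert floor(("pair", "bool", ("mutable", "bool"))) == ("pair", "bool", "bool")
assert floor(("ref", ("mutable", "bool"))) == ("ref", ("mutable", "bool"))
assert floor(("pair", ("var", "'a"), ("var", "'b"))) == \
       ("pair", ("const", ("var", "'a")), ("const", ("var", "'b")))
```

Note how the variable case mirrors the "lazy interpretation" described above: since `'a` is not yet known, the const wrapper must be retained until unification resolves it.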

No argument here

> If it isn't possible to efficiently self-host the runtime of your favorite functional language, then you can't argue credibly against imperative programming yet.

One can argue for a reduction in imperative programming, and identify its shortcomings, and that's all that Clojure does.

I've stated elsewhere that Clojure's core persistent data structures couldn't have been written efficiently in a purely functional manner, and recognize that, if I had to do it in the runtime, you might have to in an application. Thus, Clojure provides the tools for accessing Java primitives and arrays, type hints etc.

Providing and describing a toolset for doing the right thing in most places is not the same as arguing for the banishment of more dangerous tools. There are certainly cases where using imperative programming as an implementation detail of a function that is pure in its interface is fine, and necessary for performance. Clojure is impure and practical - you don't need to drop out of Clojure to pursue performance objectives.

However, bashing bits of memory that can be seen concurrently by other threads has the problems, and requires the difficult-to-get-right coordination, that I've described. Is there anything in BitC that addresses that? Sorting const from mutable using a type system is insufficient, although it might form the basis for some future analysis of when/where locks are required.

Until then, immutability combined with some system-managed coordination like STM/Actors/Agents remains a good strategy.

Partial responses...

Rich: when I said that I didn't want to knock Clojure, I really meant it. You're doing some very cool stuff. To answer some of your questions and points:

I agree about reduction in imperative programming in the abstract, and in most codes in practice. One of the problems that motivated BitC was writing kernels that need to run in constant space and with zero allocations. In pure languages, it can be exceptionally hard to develop any picture of how allocations occur and what bounds may or may not exist. I guess my sense is that this type of requirement constitutes a precipice that (so far) forces one to take a fairly polarized position on use of stateful idioms. Thankfully, it is also a rare use case.

I also agree that concurrency is hard, but I think that the fundamental problem is concurrency, and that state is merely a (substantial) contributing factor. There are complex issues in scaling event-driven systems as well. With respect to BitC, we made a conscious decision to exclude concurrency primitives from the language. My current view on this is that (a) there isn't one good answer, (b) for any particular answer one might select there are use cases for which that answer is utterly fatal, (c) because of this, concurrency primitives and libraries don't get along (consider Java locking vs. GUIs) and (d) concurrency safety is something that can be checked using static analysis, and this is the right place to do it until we have a solid handle on what support we actually want in the programming language. None of this disputes your point, and it is clear that we will soon need to address this deficiency in the language.

BitC does (or will shortly) provide some mechanisms that we hope will help. The first is effect typing, which allows an expression invocation to require that the computation is pure (allows local non-escaping mutation only). The second is type system support for showing that both shallow and deep data structures are deeply immutable. One thing I'm interested to experiment with is whether this may give us enough to implement any interesting concurrency support libraries that retain some reasonable semantics. It gives us quite a lot for fine-grain, compiler-exploitable parallelism, but that is not the type of concurrency that we are discussing here.

In the large, the problem with most concurrency control mechanisms is that they deeply violate notions of modularity. In the small, the problem is that the best concurrency mechanism is highly intertwined with the requirements of the particular application. The locking mechanism that we use in Coyotos, for example, probably doesn't generalize to many other codes, but it is both beautiful and elegant in that particular context.