
Compile-Time Execution in an Object Oriented Language

I'm designing a generic object-oriented language, and I would like to replace all of those silly declaration modifiers you often see (e.g. public, protected, private, internal, const, static, etc.) with annotations. My plan is to introduce a preprocess phase, executed immediately after compilation, that allows the program to verify that the semantics associated with the various annotations are respected.

This phase would simply be a call to some "preprocess" function from the compiler, which uses the language's own reflection facilities to let the programmer walk the code and verify things. This seems relatively straightforward, but I am concerned that there are well-known pitfalls that in my infinite naivete I may be overlooking. There is of course the really obvious stuff, like the possibility of a non-terminating or intractably complex algorithm at compile time, but that I can live with. Thanks in advance for your suggestions.
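To make that concrete, here is a rough sketch (in Haskell, for lack of my language existing yet) of the sort of check the preprocess phase would run. Every type and name here is made up purely for illustration; in the real language this data would come from the compiler's reflection API:

-- Illustrative sketch only: a made-up reflected representation of
-- declarations and statements, plus one annotation check.
data Annotation = Public | Private | Const deriving (Eq, Show)

data Decl = Field { declName :: String, annotations :: [Annotation] }

data Stmt = Assign String   -- assignment to a named field
          | Call String     -- some other statement

-- Reject any assignment to a field annotated Const.
preprocess :: [Decl] -> [Stmt] -> Either String ()
preprocess decls = mapM_ check
  where
    constFields = [ declName d | d <- decls, Const `elem` annotations d ]
    check (Assign n)
      | n `elem` constFields = Left ("cannot assign to const field " ++ n)
    check _ = Right ()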

The Design and Implementation of Typed Scheme

Tobin-Hochstadt, Felleisen. The Design and Implementation of Typed Scheme. POPL 2008.

When scripts in untyped languages grow into large programs, maintaining them becomes difficult. A lack of types in typical scripting languages means that programmers must (re)discover critical pieces of design information every time they wish to change a program. This analysis step both slows down the maintenance process and may even introduce mistakes due to the violation of undiscovered invariants. This paper presents Typed Scheme, an explicitly typed extension of an untyped scripting language. Its type system is based on the novel notion of occurrence typing, which we formalize and mechanically prove sound. The implementation of Typed Scheme additionally borrows elements from a range of approaches, including recursive types, true unions and subtyping, plus polymorphism combined with a modicum of local inference. Initial experiments with the implementation suggest that Typed Scheme naturally accommodates the programming style of the underlying scripting language, at least for the first few thousand lines of ported code.

The key feature of occurrence typing is the ability of the type system to assign distinct types to distinct occurrences of a variable based on control flow criteria. You can think of it as a generalization of the way pattern matching on algebraic datatypes works.
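The canonical example is something like (if (number? x) (add1 x) (string-length x)), where x is typed as a number in the then-branch and as a string in the else-branch. For comparison, here is a rough Haskell analogue, where the refinement comes from matching on an explicitly tagged union rather than from a predicate test:

-- Pattern matching on a tagged union refines the value to one variant
-- per branch; occurrence typing performs the same refinement on
-- untagged unions, driven by predicate tests like number?.
data NumOrString = N Int | S String

size :: NumOrString -> Int
size (N n) = n + 1       -- here the value is known to carry an Int
size (S s) = length s    -- here it is known to carry a String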

Go to sections 4.4 and 7 to whet your appetite...

Where are all the editors? LtU needs your TLC...

Numbers in Smalltalk

Hey,

I was just reading the discussion where William Cook is critiquing CTM. In this post, where he is contrasting Objects and ADTs, he says:

...the well-known problem of 'binary methods' in object-oriented programming. It is also one reason why numbers are hard to implement as objects: the operations +, *, / etc all require access to representations unless you are willing to give up a lot of performance. The way Smalltalk handles numbers is really fascinating, and not very well known.

Further down, he illustrates objects as follows:

The argument O is not the Client, the function is the Client:

Client = Fun{Container O}.... {O.get...} {O.put....} ... end.

It gets even more interesting when you have operations that take multiple object values, like "compare dictionary" or "append lists". Then you would have

Client2 = Fun{Container O1, Container O2}.... {O1.append(O2)...} {O2.compare(O1)....} ... end.

In this case the "class" of the two container objects O1 and O2 can be completely unrelated and everything will still work! This is a strong illustration of the power of objects. Again, see my paper on ADT vs OOP for more discussion. This is how numbers work in Smalltalk.

Oz makes me a bit cross-eyed, but this seems a pretty straightforward use of OO. Hopefully someone can explain why binary operators are so fascinating in Smalltalk...
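For reference, my current understanding (which may well be off) is that Smalltalk numbers answer binary operators by dispatching a second time on the argument, coercing to a common representation and retrying. Here is a rough Haskell sketch of the problem and that resolution, with made-up number representations:

-- A binary operation over two differently represented numbers needs
-- both representations at once. A tagged union plus coerce-and-retry
-- roughly mimics Smalltalk's double dispatch over its number hierarchy.
data Number = SmallInt Integer
            | Fraction Integer Integer   -- numerator, denominator

add :: Number -> Number -> Number
add (SmallInt a)   (SmallInt b)   = SmallInt (a + b)
add (Fraction a b) (Fraction c d) = Fraction (a*d + c*b) (b*d)
add (SmallInt a)   f              = add (Fraction a 1) f   -- coerce, retry
add f              (SmallInt b)   = add f (Fraction b 1)   -- coerce, retry

In pure object terms, O1.add(O2) cannot see O2's representation, which seems to be exactly Cook's point; the second dispatch is how Smalltalk gets that access back.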

Cheers...

Induction of variadic functions, functions over tuples, etc.

Hey again,

I was just working through "Faking It: ..." by Conor McBride.

In one of the examples in the paper, he introduces Fridlender & Indrika's variadic zipWith:

(<<) :: [a -> b] -> [a] -> [b]
(f:fs) << (a:as) = (f a) : (fs << as)
_ << _ = []

z = id
s = \n fs as -> n (fs << as)

s :: ([b] -> c) -> [a -> b] -> [a] -> c

And then creates a type-safe version by induction:

-- Needs MultiParamTypeClasses, FunctionalDependencies,
-- FlexibleInstances and UndecidableInstances in GHC.
data Zero  = Zero
data Suc n = Suc n

class Nat n
instance Nat Zero
instance Nat n => Nat (Suc n)

...

-- The numeral n drives the induction; the fundep n s -> t computes
-- the type of the n-ary zip from the type s of the function zipped.
class Nat n => ZipWithN n s t | n s -> t where
  manyAppN :: n -> [s] -> t
  zipWithN :: n -> s -> t
  zipWithN n f = manyAppN n (repeat f)

instance ZipWithN Zero t [t] where
  manyAppN Zero fs = fs

instance ZipWithN n t ts => ZipWithN (Suc n) (s -> t) ([s] -> ts) where
  manyAppN (Suc n) fs ss = manyAppN n (fs << ss)

[Note: This is not exactly how the code appears.]

To use this, you need to supply a faux-numeric argument like Suc (Suc (... Zero ...)).
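For example (my own quick test, not from the paper):

-- Zipping a binary function over two lists:
two :: Suc (Suc Zero)
two = Suc (Suc Zero)

sums :: [Int]
sums = zipWithN two ((+) :: Int -> Int -> Int) [1, 2, 3] [4, 5, 6]
-- sums == [5, 7, 9]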

Slightly related is the fact that I just wrote a linear algebra library based on lists, but I would have built it on tuples had there been a (syntactically) nice way of mapping, folding, etc. over them. I know that tuples aren't really recursive, but the "Faking It" paper makes a convincing argument that variadic functions like zipWith can (and should) be defined inductively.

This is a vague question about what kind of environment would make this possible. Since it is desirable to remove the numeric argument, I question whether dependent types are really what I want. I haven't delved much into dependent types, but I feel like this is more about induction on the structure of the function than induction on a numeric argument.
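For what it's worth, the closest thing I've found to "induction on the structure of the function" in plain Haskell is the classic variadic-function trick, where the instance is chosen by the result type and no numeral is needed (a toy variadic sum, my own example):

-- The class recurses on the shape of the result type: either we are
-- done (Integer) or we still expect one more argument (a -> r).
class SumRes r where
  sumOf :: Integer -> r

instance SumRes Integer where
  sumOf = id

instance (Integral a, SumRes r) => SumRes (a -> r) where
  sumOf acc x = sumOf (acc + toInteger x)

-- usage: sumOf 1 2 3 :: Integer  ==>  6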

It will be a while before I can fully appreciate the power of staging, but some of the stuff I have read here recently by Carette, Kiselyov and Shan looks pretty cool. Jay's work on FISh and program shape analysis looks interesting as well. I also heard (somewhere on here) of Prolog being able to infer structure like this.

My brain works a lot faster than I can learn, and I am really trying to get a feel for what I might study (honors) next year. So thanks if you can find the time to point me in the right direction...

[fixed: '(f s)' became '(f a)'. thanks fruehr]

How useful is reflection, anyway?

In a post in a different thread, Ben Titzer explains why he doesn't include reflection capabilities in his language Virgil, a language intended for very small and/or deeply embedded systems. In the context of his requirements, the exclusion makes perfect sense.

Introspection (the ability of a program to inspect itself at runtime) is well-established as a powerful facility--especially in languages with otherwise weak type systems, or in dynamically typed languages. Some argue that it breaks Liskov substitution (since via introspection one can discover the differences between a subtype and a supertype--differences which are not visible from the interface of the type), but it's useful to have around.
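For the statically typed side of the fence, here's a minimal sketch of introspection using Haskell's Data.Typeable--runtime type inspection only, no modification:

import Data.Typeable

-- Branch on a value's runtime type: recover an Int if that's what we
-- were given, otherwise just report the type's name.
describe :: Typeable a => a -> String
describe x = case cast x :: Maybe Int of
  Just n  -> "an Int: " ++ show n
  Nothing -> "something of type " ++ show (typeOf x)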

But what of reflection--the ability for a running program to *modify* itself--changing subroutines or type definitions "on the fly", and in some cases having extant data objects reconfigured when the underlying schema changes? Many LISPs (culminating in CLOS; far less so in Scheme) are highly reflective, as is Smalltalk. More recently, many of the popular scripting languages feature reflective capabilities. RDBMSs, which generally have no phase distinction as they have to be running 24/7, are highly reflective systems--the schema of a particular relation (table) is defined in a different relation, which can be altered by a user with sufficient authority (a DBA and nobody else, one hopes).

Excluding the RDBMS example, and other high-availability, dynamically-reconfigurable systems--the most common use of reflection capabilities (again, here I mean mutating runtime structures on the fly; not merely inspecting them) seems to be... debugging and development. A common complaint about "traditional" languages with a phase distinction is that the edit/compile/run/debug cycle is too long; and that use of reflection improves programmer productivity. OTOH, many non-reflective languages nowadays come with rapid-turnaround editing environments (whether a REPL or something fancier).

Again excluding databases and such ('cause they've been covered above, not because I don't think they are worth considering)--and now excluding debugging/development--what about production code? Should deployed code, in the general case, ever be using reflection capabilities to "rewrite" itself--especially "destructive" changes? What user requirements call for reflective capabilities--requirements which are harder to implement in languages with a phase distinction (and lack of mutability of structures handled in earlier phases)?