Virgil: Objects on the Head of a Pin

Remarkable little language. This thread on embedded languages got me looking, and I came across this paper:

Embedded microcontrollers are becoming increasingly prolific, serving as the primary or auxiliary processor in products and research systems from microwaves to sensor networks. Microcontrollers represent perhaps the most severely resource-constrained embedded processors, often with as little as a few bytes of memory and a few kilobytes of code space. Language and compiler technology has so far been unable to bring the benefits of modern object-oriented languages to such processors. In this paper, I will present the design and implementation of Virgil, a lightweight object-oriented language designed with careful consideration for resource-limited domains. Virgil explicitly separates initialization time from runtime, allowing an application to build complex data structures during compilation and then run directly on the bare hardware without a virtual machine or any language runtime. This separation allows the entire program heap to be available at compile time and enables three new data-sensitive optimizations: reachable members analysis, reference compression, and ROM-ization. Experimental results demonstrate that Virgil is well suited for writing microcontroller programs, with five demonstrative applications fitting in less than 256 bytes of RAM with fewer than 50 bytes of metadata. Further results show that the optimizations presented in this paper reduced code size between 20% and 80% and RAM size by as much as 75%.
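To make two of the abstract's optimizations a bit more concrete, here is a minimal Python sketch (not Virgil, and not the paper's actual algorithms) of what becomes possible once the whole heap is known before the program runs: objects that cannot be reached from the program's roots can be dropped, and references can be stored as small indices instead of full pointers. The object names, the `heap` layout, and both helper functions are hypothetical and only serve the illustration.

```python
# Hypothetical heap snapshot taken after initialization:
# object name -> names of the objects it references.
heap = {
    "main":   ["timer", "led"],
    "timer":  ["led"],
    "led":    [],
    "logger": ["buffer"],   # never referenced from "main"
    "buffer": [],
}

def reachable(heap, roots):
    """Return the set of objects reachable from the roots (plain graph walk)."""
    seen = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue
        seen.add(obj)
        stack.extend(heap[obj])
    return seen

def reference_bits(object_count):
    """Bits needed to tell object_count objects apart by index.

    Assumes object_count >= 1 and ignores the extra slot a null
    reference would need.
    """
    return max(1, (object_count - 1).bit_length())

live = reachable(heap, ["main"])
bits = reference_bits(len(live))
```

In this toy heap `logger` and `buffer` are discarded, and the three surviving objects can be addressed with 2-bit indices. By the same arithmetic, a heap of 256 objects needs only 8-bit references.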

Looks very promising. Its notion of "initialization-time" is worthy of further thought. See also the Virgil homepage, where you can download preliminary source releases. The features page lists a number of interesting tidbits as well: Virgil will soon support tail call optimization! In particular, see what's coming in Virgil II:

* Parametric types
* Module system
* Tuples
* Generalized Algebraic Types
* Integration with non-Virgil code

Link

The paper is here.

Why for microcontrollers only?

I think desktop applications would equally benefit - there is a set of applications that always seem slow: IDEs, word processors, etc.

I was thinking the same re:

I was thinking the same re: broader usefulness than embedded systems. There's no reason why the ideas can't be applied to ordinary languages, as I think he's hit a sweet spot in the features that people actually use. I suspect his research interests simply lie in this domain.

no dynamic allocation, for one

Virgil is designed so all memory allocation is done at compile time. You wouldn't want to write an IDE or word processor without dynamic allocation!

In general, it seems that most of Virgil's ideas are targeted at reducing memory footprint, and RAM usage in particular. Desktop applications are not memory constrained in the way tiny embedded systems are, and there is no ROM/RAM distinction, so a lot of Virgil's assumptions don't make sense.

I've done most of my work on embedded systems. Something like Virgil would have been wonderful for the really constrained environments (e.g., 8 bits, 4 KB ROM, 256 bytes RAM). But even for mid-size environments (e.g., 32 bits, 1 MB ROM, 1 MB RAM), I couldn't have used Virgil, except maybe for the OS. Those apps need dynamic memory, and the code is copied to RAM (for speed) in any case.

Tiny Embedded really is a world of its own, and Virgil looks like an excellent language for that domain. Now, if we could only have a well-designed high-level language to replace Verilog!

Virgil is designed so all

Virgil is designed so all memory allocation is done at compile time. You wouldn't want to write an IDE or word processor without dynamic allocation!

Well, initialization-time, but your point stands. Fortunately, static allocation isn't a limitation of the language so much as the current implementation.
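For readers unfamiliar with the distinction, the two-phase model can be sketched roughly as follows. This is Python rather than Virgil, and the `Heap` class, its methods, and `FrozenHeapError` are all hypothetical names. The sketch also assumes that objects allocated during initialization stay mutable at runtime, while new allocation is refused once initialization ends.

```python
class FrozenHeapError(RuntimeError):
    """Raised when code tries to allocate after initialization has ended."""

class Heap:
    def __init__(self):
        self._objects = []
        self._frozen = False

    def allocate(self, value):
        # Allocation is only legal during the initialization phase.
        if self._frozen:
            raise FrozenHeapError("allocation is only allowed at initialization time")
        self._objects.append(value)
        return len(self._objects) - 1   # the index acts as the reference

    def freeze(self):
        # Marks the end of initialization; a compiler would emit the heap image here.
        self._frozen = True

    def get(self, ref):
        return self._objects[ref]

    def set(self, ref, value):
        # Mutating an existing object is still allowed at runtime (assumption).
        self._objects[ref] = value

heap = Heap()
counter = heap.allocate(0)                  # initialization time
heap.freeze()

heap.set(counter, heap.get(counter) + 1)    # runtime: update in place
try:
    heap.allocate("new")                    # runtime: allocation is rejected
    allocated_at_runtime = True
except FrozenHeapError:
    allocated_at_runtime = False
```

Because the object count is fixed at `freeze()`, the size of the heap is known exactly before the program ever runs on the device.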

I think Virgil's more important contribution comes from its approach to OO programming (no superclass in particular). This brings it much closer to functional languages and the powerful abstractions their module systems are capable of. Scala takes this route too, where:

Object = Module
Object type = Signature
Class = Method = Functor

I would think this could be used to great advantage in any environment, embedded systems included.
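A rough illustration of that mapping, written in Python rather than Scala or Virgil, so treat it as an analogy only. All names here (`Ordering`, `Ascending`, `Descending`, `Sorter`) are made up for the example: the protocol plays the signature, instances play modules, and a class whose constructor takes a module and yields a new object plays the functor.

```python
from typing import Protocol

class Ordering(Protocol):            # object type ~ signature
    def less(self, a: int, b: int) -> bool: ...

class Ascending:                     # instances of this class act as modules
    def less(self, a, b):
        return a < b

class Descending:
    def less(self, a, b):
        return a > b

class Sorter:                        # class ~ functor: module in, module out
    def __init__(self, order: Ordering):
        self._order = order

    def sort(self, items):
        result = []
        for item in items:
            # Find the first element that item should come before.
            pos = 0
            while pos < len(result) and not self._order.less(item, result[pos]):
                pos += 1
            result.insert(pos, item)
        return result

up = Sorter(Ascending()).sort([3, 1, 2])
down = Sorter(Descending()).sort([3, 1, 2])
```

`Sorter` never needs a common superclass for its argument; anything matching the `Ordering` signature will do, which is the point of the comparison with module systems.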

[Edit: corrected mistake in Scala mapping to module systems]

Hey Cool!

Hey Guys,

I've been lurking around LtU for a while now, good to have my work be noticed :)

The reason Virgil is targeted so specifically at embedded microcontrollers is mostly tactical; I've worked on embedded sensor networks (mostly simulation) and UCLA has a major research center dedicated to sensor networks, so I have a captive market in a sense. But the domain also has a couple of advantages: 1.) The software systems people tend to build are small and manageable enough to be completely self-contained and built by just a couple of individuals, 2.) The operating system abstractions don't have to be all that complex (e.g. protected memory, isolation, inter-process communication, concurrency, etc. aren't required), 3.) It offers a good excuse to throw out the baby with the bathwater and start from scratch, getting rid of whole runtime systems and language features and going back to the basics.

Things are coming along pretty well now, and I'll probably be making another release soon. Parametric types are implemented in the front end and supported in the interpreter, so the extensibility limitation of no universal superclass is gone. Some more work has to happen to push the polymorphism through the backend and produce monomorphic C code, using some tricks I still have up my sleeve (tradeoffs in code size, heap size, and performance are important here).

The restriction of no dynamic memory allocation I don't really believe is inherent in the language, but more tied to the class of microcontroller devices. In the future, for larger classes of systems, I hope to have a precise garbage collector as well as possibly some points in between such as regions or stack allocation only. I hope this can be done without leaving microcontroller class systems high and dry, but still supporting the same no-allocation model for them.

In addition to running the constructors in the compiler using the built-in interpreter, I now have a second entrypoint where one can run the whole program in the interpreter too, so you actually can write programs that allocate memory dynamically :) I've toyed with the idea of bootstrapping the compiler (i.e. rewriting key parts, or all of it, in Virgil), but more debugging and development tools are probably better to focus on at the moment.

Thanks for your interest guys!

Module and backend systems for Virgil

Have you considered Scala's approach to modules? It seems to integrate nicely with an object-oriented language, instead of introducing a new "module glue language" on top of the underlying OO language.

Also, I haven't done much embedded programming beyond some assembler-level stuff, so perhaps I'm just ignorant of the restrictions, but how would LLVM suit your needs as a platform-independent back-end rather than C? As long as you don't use any of the LLVM services, they are left out of the final binary, so I believe it's theoretically possible to match the size of a gcc-compiled C binary. LLVM has a gcc-port, so perhaps comparing the size of the static binaries produced by both would show what additional overhead, if any, LLVM introduces.

Thanks for the interesting paper! It's given me some things to think about as I hack on my own language. :-)

Scala is Interesting

Yes, I am looking at Scala for ideas, too. I've even toyed with the idea of implementing a Scala frontend for my compiler. Somehow, you never really fully understand a language until you've implemented it.

I have looked at LLVM somewhat. Another student in my lab has worked extensively on register allocation in LLVM, so I have some second-hand experience. It doesn't have a backend for the microcontroller architectures that I am interested in, but it does have pretty decent backends for say x86 and PPC. For that reason, it might make a pretty good target for Virgil programs that don't have to run on microcontrollers but run on embedded Linux or desktop systems. Emitting Java bytecode and running on the JVM is also a possibility. Or Mono, too. It would be really cool if Virgil could run on any of these. Without any built-in runtime or class library, this might be easier than it otherwise would be.

Good luck with your language projects.

P.S. If I had it to do all over again, I'd write my Virgil compiler in ML or Haskell, not Java. I hate writing compilers in Java!

LLVM can emit C

It doesn't have a backend for the microcontroller architectures that I am interested in, but it does have pretty decent backends for say x86 and PPC.

If compiling Virgil emits C, LLVM has a C backend. So you can get your C portability, and a native backend for free if it's supported. Win-win.

Good luck with your language projects.

You too! Actually, I found your "lightweight confinement" a really interesting side effect of getting rid of the universal superclass. In my opinion, that omission is probably the single most important change you've made to OO programming with Virgil. I'm now convinced it's The Right Way™.