HOPL-III: A History of Erlang

A History of Erlang and the accompanying Presentation Slides by Joe Armstrong are a must-read for anyone interested in PL history.

Erlang was designed for writing concurrent programs that "run forever". Erlang uses concurrent processes to structure the program. These processes have no shared memory and communicate by asynchronous message passing. Erlang processes are lightweight and belong to the language and not the operating system. Erlang has mechanisms to allow programs to change code "on the fly" so that programs can evolve and change as they run. These mechanisms simplify the construction of software for implementing non-stop systems.

(Link to previous HOPL-III papers on LtU).

on the fly code changes

For me, "on the fly" code swapping functionality is most interesting. The ability to change small parts of the logic, in a controlled manner, is extremely useful for systems that shouldn't be brought down.

I think it will be helpful to integrate this hot-swapping ability with source control. In other words, a programmer could commit a change to the repository, tag it as "currently-running" and the running system would pick up the change immediately. If the change turns out to be problematic, the programmer (or even the system itself) could simply roll-back the change. If the change is good, it becomes part of the current version.
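As a rough illustration of the commit / roll-back workflow described above, here is a minimal sketch in Python (not Erlang, and not tied to any real source-control system). The class `HotSwappable` and the functions `price_v1` and `price_v2` are hypothetical names invented for this example. The point is only that callers look up the current version on every call, so a commit takes effect immediately and a roll-back restores the previous version.

```python
class HotSwappable:
    """Hypothetical holder for a function whose implementation can change at run time."""

    def __init__(self, initial):
        # History of committed implementations; the last entry is "currently-running".
        self._versions = [initial]

    def __call__(self, *args, **kwargs):
        # Resolve the current version on every call, so a swap is picked up immediately.
        return self._versions[-1](*args, **kwargs)

    def commit(self, new_impl):
        # Make new_impl the currently running version.
        self._versions.append(new_impl)

    def rollback(self):
        # Drop the newest version, but never remove the original one.
        if len(self._versions) > 1:
            self._versions.pop()


def price_v1(amount):
    return amount


def price_v2(amount):
    # Assumed "new logic" for the example: double the amount.
    return amount * 2


price = HotSwappable(price_v1)
before = price(100)      # runs price_v1
price.commit(price_v2)
after = price(100)       # runs price_v2 without restarting anything
price.rollback()
restored = price(100)    # back to price_v1
```

A real system would also have to deal with state held by the old version and with calls that are in flight during the swap, which this sketch ignores.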

Does anyone know of literature that discusses 'on the fly' changes to code in a systematic manner? There has been at least one discussion about 'live programming' but I don't think that model quite fits long-running processes.

UpgradeJ

You might be interested in UpgradeJ, an approach to dynamic code linking that sees versioning as an integral part of the type system. Gavin Bierman, Matthew Parkinson and James Noble had a paper about it at ECOOP 2008.

Change boxes

You might be interested as well in change boxes, an extension of Squeak that makes change a first-class element of the language. All versions of the software exist at the same time, and code can be run against any version. You can find more information in Pascal Zumkehr's thesis.

on-line code update projects

There's a groundswell of interest, in my opinion, in this research area. Some more mature projects (there are many others):

I've been working on problems here since my doctoral work in 2001. A good paper to read, IMHO, that highlights the underlying issues is our PLDI 06 paper.

To me, the interesting question is how to provide this support without imposing a serious burden on the programmer while providing some assurance that a dynamic update will be applied correctly. An easy way to answer this question is to not allow dynamic updates to be very full featured; this is the approach taken by Ksplice. The more flexibility you allow, the more you can do, but the harder it is to provide assurances/ease of use. We have been exploring the latter space for a bit now.

To me, the interesting

To me, the interesting question is how to provide this support without imposing a serious burden on the programmer while providing some assurance that a dynamic update will be applied correctly.

Erlang programmers, in contrast, focus on the assurance that if the update is applied incorrectly, the damage will be predictable and well contained (i.e. crash and restart of some particular process). The important practical feature for hot code upgrade in Erlang is the fault isolation, not the module-reloading semantics.
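To make the fault-isolation argument concrete, here is a minimal single-threaded sketch in Python. It is not how Erlang/OTP supervisors are implemented; `Worker`, `Supervisor`, `good_handler` and `buggy_handler` are hypothetical names, and falling back to the last known-good handler on a crash is an assumption made for the example. Each worker owns its state, and a crash caused by a bad upgrade resets only that worker.

```python
def good_handler(state, msg):
    return state + msg


def buggy_handler(state, msg):
    # Stands in for an upgrade that was applied incorrectly.
    raise ValueError("bad upgrade")


class Worker:
    def __init__(self, handler):
        self.handler = handler
        self.known_good = handler
        self.state = 0          # private to this worker, never shared
        self.restarts = 0


class Supervisor:
    def __init__(self):
        self.workers = {}

    def start(self, name, handler):
        self.workers[name] = Worker(handler)

    def upgrade(self, name, handler):
        # Swap in new code for one worker; remember the old code as known-good.
        w = self.workers[name]
        w.known_good = w.handler
        w.handler = handler

    def send(self, name, msg):
        w = self.workers[name]
        try:
            w.state = w.handler(w.state, msg)
        except Exception:
            # The crash is contained: restart this worker only, with fresh state
            # and the last known-good code (an assumption of this sketch).
            w.handler = w.known_good
            w.state = 0
            w.restarts += 1


sup = Supervisor()
sup.start("a", good_handler)
sup.start("b", good_handler)
sup.send("a", 1)
sup.send("b", 2)
sup.upgrade("a", buggy_handler)
sup.send("a", 5)   # crashes inside worker "a"; "a" is restarted
sup.send("b", 3)   # worker "b" is unaffected
sup.send("a", 4)   # "a" is running known-good code again
```

Note that the sketch only catches upgrades that crash; an upgrade that silently computes wrong answers would pass straight through.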

failing fast

Doesn't this assume that bugs resulting from an upgrade will manifest themselves rather quickly? (I'm guessing your answer will be that since Erlang programs are built to fail as fast as possible at every level, this tends to happen anyway.) Still, I could imagine subtle changes to the semantics during an upgrade that might manifest in components producing wrong answers to computations but not crashing.

(I'm not disagreeing, just curious, having never written Erlang.)

Your assessment is right. I

Your assessment is right. I don't think it's fundamentally different to loading new code into Lisp systems, which I think you do have experience with. But I find that I mess up Lisp images much more often than Erlang nodes and I attribute this to Erlang's fault-isolation.

But different parts of the Erlang community do things very differently depending on their products and customers. Some groups make OTP upgrade packages with well-defined (hand-written) update procedures, others seldom load new code into running systems (restart instead), and we adventurous few tend to load code directly from Emacs into production systems and rely on our wits to catch problems that aren't handled automatically. The only way to live IMHO :-)

Speaking only for myself and not the whole Erlang community. :-)

Interesting but probably pointless

There are a lot of interesting papers on the topic and I'm sure there will be many more.

That doesn't mean such a technique is helpful or will actually be deployed in a non-academic context though. Whilst it's true that there are many systems that shouldn't be brought down, being able to upgrade a program "on the fly" is redundant, because they already need to be regularly checkpointing their state so the system can handle hardware failure and upgrades of the underlying system (e.g., CPU, kernel, whatever). Any system that requires serious uptime needs some kind of redundancy and the ability to fail over to other instances of the same program, which means thinking carefully about persistent state and how to quickly restore your program from it.
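The checkpoint-and-restore alternative can be sketched roughly as follows, in Python. The file name, the JSON format, the `version` field and the version 1 to version 2 migration are all assumptions made up for the example, not anything prescribed above. The old instance writes its state out; a new instance, possibly running upgraded code, restores it and migrates the data structure.

```python
import json
import os
import tempfile


def checkpoint(state, path):
    # Write to a temporary file first, then replace atomically, so a crash
    # never leaves a half-written checkpoint behind.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def restore(path):
    with open(path) as f:
        state = json.load(f)
    # Hypothetical data-structure evolution: version 1 stored only "count";
    # version 2 of the program also tracks "errors".
    if state.get("version", 1) == 1:
        state = {"version": 2, "count": state["count"], "errors": 0}
    return state


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "state.json")
    # The "old" program instance checkpoints its state...
    checkpoint({"version": 1, "count": 41}, path)
    # ...and the "new" instance picks it up and carries on.
    restored = restore(path)
    restored["count"] += 1
```

The cost of this approach is the fail-over window between the last checkpoint and the restore, which is exactly what in-place code upgrade tries to avoid.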

I expect that over time, some of the ideas will percolate out of these custom languages and fancy type systems, resulting in better checkpointing and data structure evolution tools. I predict that exactly zero industrial programs will actually be written in such languages though.

That said, there are a few cases where hot-swapping of code can be useful .... the way Eclipse lets you change code whilst it's being debugged is one obvious case which lends itself nicely to very fast iterations. It works because the nature of the hot swaps is limited. You can modify methods (typically) but not always add/remove them or change their prototypes. And of course there's an efficiency impact to that, so it's not a technique you'd use on a production system.

Ceci n'est pas un advertisement

javarebel is apparently not too suckful; gets one a bit closer to the Erlang style hot swap?

There's always live

There's always live programming; you can do a lot with that, but see my OOPSLA paper. It's a big development boost to continuously run a program during development, and I'm sure it's in the future of PL to support better hot swapping.

Ensuring credit is given where it is due, Eclipse hot swapping is actually a product of the Sun folks and supported directly by the JavaVM (you could take advantage of it yourself in your own IDE with very little effort). From what I understand, it grew out of support for hot swapping in Smalltalk VMs, where hot swapping has been supported for a while.

Microsoft also supports a form of hot swapping in Visual Studio called edit and continue; however, it is much more limited than Sun's hot swapping, as the program must be paused before the code can be modified.

You'd be surprised how unimportant efficiency is to some applications. The 10 or so percentage points needed to support hot swapping would actually be acceptable in many domains given the ability to upgrade code on the fly. Actually, this has been shown in Erlang and other systems many times over.

Microsoft also supports a

Microsoft also supports a form of hot swapping in Visual Studio called edit and continue; however, it is much more limited than Sun's hot swapping, as the program must be paused before the code can be modified.

With F# in VS, you just evaluate your code with a highlight and alt-enter, and if you want to change something (say, a form), you just enter it in the REPL, or write an expression, highlight, alt-enter.

Yes, but in the JVM, you

Yes, but in the JVM, you just change the code, no need to do anything with the keyboard. If the code is running in a while true loop, it will eventually re-execute with no extra keyboard inputs.

SuperGlue goes beyond this and processes all changes in real time as soon as they are not in an error state. Let me know when you can support no-highlight, no alt-enter in F#, then I'll be impressed :) To be fair, F#'s interpreter support is nice and useful, but not what I would consider real live programming, which requires changing code rather than just adding new commands to an existing program.

Rewriting programs at runtime

Systems for interactive programming are increasingly used in the algorithmic art community. Mainly originating from improvising with programming languages for sound synthesis, this practice has become known as 'live coding'. These methods have spread over the last decade to visual art and also to more scientific fields like sonification and musicology.

It stands to reason that algorithmic acoustics encourages such approaches because sound is inherently temporal, so that the system is required to remain active while being modified. There are interesting issues that come up in systems where temporal semantics is central.

Various languages are used, including Haskell, Scheme, SuperCollider, Perl, ChucK, and Smalltalk.

A mix of papers concerning this field can be found here: http://www.toplap.org/index.php/ToplapPapers

SPOTS

I can't find the results of the SPOTS project on the Web.

[10] B. Dacker, N. Elshiewy, P. Hedeland, C-W. Welin and M.C. Williams.
Experiments with programming languages and techniques for telecommunications
applications. SETSS ’86. Eindhoven, 14-18 April, 1986.

Does anybody know if it is published somewhere?