What Makes Erlang Processes Tick?

I was amazed by the article comparing the C-implemented Apache with the Erlang-implemented Yaws, and immediately went looking for how Erlang implements its processes. What I could not find after reading "Making reliable distributed systems in the presence of software errors" and "The development of Erlang" was what makes the magic happen. To what is this high level of concurrency attributed: bytecode instructions, abstractions built upon continuations, Erlang-style trampolines, what?

I would like to learn more about concurrency in general (and more specifically, its implementation), even to the point of playing around with these ideas in Common Lisp and Scheme. I have reviewed CL-MUPROC, Termite, and Distel; so far Distel seems the most interesting, because it doesn't need any underlying thread or process implementation.

Can anyone provide some insight, whether an explanation of how Erlang works its process magic, pointers to Lisp equivalents, or reference reading to learn more about the subject?

Thank you for your time,

Mark Stahl

References:
Apache vs. Yaws
Making reliable distributed systems in the presence of software errors
The development of Erlang
CL-MUPROC
Termite
Distel


Implementation techniques

In a typical bytecode interpreter that doesn't use the C stack for procedure calls, you usually get coroutines, threads, etc. almost for free, without having to make any significant architectural changes. Such an interpreter will usually have a representation of an execution context, activation frame, or whatever you want to call it. A process is then just a bunch of process-related metadata plus a pointer to its active execution context. The only structural change you might make is having the evaluation stack be shared between execution contexts belonging to the same process, rather than having it be global or per-context. On top of all this sits a scheduler, which maintains a list of processes and decides which should run at a given point in time.
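To make that concrete, here is a toy sketch in Python (this is not Erlang's actual implementation; the class names and opcodes are invented for illustration) of per-process execution contexts driven by a round-robin scheduler:

```python
from collections import deque

class Context:
    """An activation frame: code, instruction pointer, evaluation stack."""
    def __init__(self, code):
        self.code = code     # list of (opcode, argument) pairs
        self.ip = 0          # instruction pointer
        self.stack = []      # per-process evaluation stack

class Process:
    """Process-related metadata plus a pointer to its active context."""
    def __init__(self, pid, code):
        self.pid = pid
        self.ctx = Context(code)

results = []  # stands in for observable output

def step(proc):
    """Run one instruction; return False once the process has finished."""
    ctx = proc.ctx
    if ctx.ip >= len(ctx.code):
        return False
    op, arg = ctx.code[ctx.ip]
    ctx.ip += 1
    if op == "PUSH":
        ctx.stack.append(arg)
    elif op == "ADD":
        b, a = ctx.stack.pop(), ctx.stack.pop()
        ctx.stack.append(a + b)
    elif op == "OUT":
        results.append((proc.pid, ctx.stack.pop()))
    return True

def scheduler(procs):
    """Round-robin: one instruction per process, then switch."""
    ready = deque(procs)
    while ready:
        proc = ready.popleft()
        if step(proc):
            ready.append(proc)

prog = lambda a, b: [("PUSH", a), ("PUSH", b), ("ADD", None), ("OUT", None)]
scheduler([Process(1, prog(1, 2)), Process(2, prog(10, 20))])
```

Since each process carries its own stack and instruction pointer, a context switch is just picking a different `Context` to step; the host language's call stack is never involved.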

I know no Erlang internals, but ...

... the "magic" seems to be mainly what is generally referred to as "green threads", although green threads are usually cooperative while Erlang's processes are preemptible, AFAIK. Another point in Erlang's favor is that it is functional inside a process, and it is a safe language (no pointers, etc.).

The problems with OS threads are (at least):

  • They need more memory. Usually it's at least one kernel page (4/8 KB) plus at least one more page for thread-local storage. An Erlang process, by contrast, has an overhead of only about 300 bytes (IIRC).

    Kernel memory is also limited, so one reason for Apache dying this quickly may simply be too little kernel space. (Although I'd expect the numbers to be a bit higher.)

  • Syscall overhead. OK, there are probably some tricks possible, but I don't think the need for syscalls can be eliminated entirely. Syscalls are slow because mode switches are slow, and entering the kernel pollutes the (L1) cache.

Erlang basically reimplements all OS features in userland, and it isn't limited to allocating memory at page boundaries, so it needs much less memory and is also faster. Maybe they also have some special lock-free implementation for mailboxes. In any case, they can fully avoid expensive thread-blocking calls.

CL-MUPROC uses OS threads, and thus cannot be expected to perform any better than Apache. Distel threads seem to run on top of Emacs' VM, so I wouldn't expect it to perform terribly well either--although having the right abstraction available can be a big win too!

From the most recent version of the Termite paper, however, it seems they might be able to come very close to Erlang's performance using Gambit-C's lightweight threads. Like Erlang, it also removes side effects inside a process (you still have global state). This way data can be passed around (locally) without copying (and possibly without locking either).

So, in summary, the answer to your question seems to be: lightweight threads + side-effect-free processes.

But I'm just thinking loudly, so feel free to correct me.

Preemptive?

I wondered about this issue as well. How can green threads be preemptive? Is the interpreter integrated with the scheduler so as to do a context switch every few instructions (or, more likely, according to a more complicated policy)?

Scheduling policy

This post by Ulf Wiger on the Erlang mailing list helps to explain how the scheduler works.
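In short, the mechanism is reduction counting: each process gets a budget of "reductions" (roughly, function calls) per timeslice, and the VM swaps it out once the budget is spent, so the scheduler never has to interrupt a process mid-instruction. A toy sketch in Python, with each yield standing in for one reduction and an artificially tiny budget (the real budget is in the thousands, IIRC):

```python
from collections import deque

REDUCTIONS = 3  # per-timeslice budget; artificially small for the demo

def run(procs):
    """Each process is a generator; every yield counts as one reduction."""
    trace = []
    ready = deque(procs)
    while ready:
        name, proc = ready.popleft()
        for _ in range(REDUCTIONS):
            try:
                next(proc)              # execute one "reduction"
                trace.append(name)
            except StopIteration:
                break                   # process finished: drop it
        else:
            ready.append((name, proc))  # budget spent: preempt, requeue
    return trace

def work(n):
    """A process that performs n reductions."""
    for _ in range(n):
        yield

trace = run([("a", work(5)), ("b", work(5))])
```

Running this interleaves the two processes in bursts of three reductions each, which is exactly the "preemption" a green-thread VM can offer: the check happens at reduction boundaries, not at arbitrary machine instructions.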

Thanks, that was very

Thanks, that was very helpful.

Now also with SMP support

Ulf's explanation was very good, but it's a bit dated. As of release 11, Erlang now has SMP support (and it's not just experimental), so two or more Erlang processes may execute in parallel within the same VM.

http://www.erlang.org/doc/doc-5.5/doc/highlights.html

Why was Apache so

Why was Apache so absurdly configured?

Confused

OK, I'll admit it: I don't understand why you feel Apache was "absurdly configured" in this case. Could you perhaps elucidate?

From the plots it seems it

From the plots, it seems it was configured in one case with 250 threads in one process, and in the other case with 250 threads across 64 processes.