User loginNavigation |
emulation for Erlang style processesIn The Problem With Threads Peter Van Roy wrote: The real problem is not threads as such; it is threads plus shared mutable state. To solve this problem, it's not necessary to throw away threads. It is sufficient to disallow mutable state shared between threads The last few weeks I've been thinking about avoiding shared mutable state in new language dialects, while retaining convenient thread styles. Recently I had a fun idea which -- as an unplanned side effect -- seems to solve shared mutable state as a problem well enough for my purposes. Suddenly I told myself, hey, I'm going to get Erlang style processes out of this. Then when I googled a bit for references, I hit the LtU thread I cited above, which I'd read recently. Note I'm unsure how to present the whole idea yet, which I'm still groveling over in a zillion pieces, like a car that's been totally disassembled. (It's hard for me to point at this pile and say, "This would be a nice car, if assembled." To you it might seem merely a pile of junk.) Note I'm not trying to build this up; I'm apologizing in advance for how lateral this likely sounds. The disturbing part is I seem to have veered off the language path into OS territory without meaning to do so. I'm not sure. Maybe that's always what happens when you plan to embed VM systems in host operating systems. The VM world starts to sound an awful lot like an operating system by the time you start getting processes (with no shared mutable state between threads). Try to ignore this transgression into OS territory as a separate issue. The reason I'm describing this here is because I think a lot of folks would like to get Erlang's safe and efficient use of many thousands of threads into other languages. So this is a tactic you might consider when brainstorming on the topic. I was thinking about instruction sets for virtual machines, because recently I was going to post a story to the history department about the Smalltalk blue book, with a minor note of interest in the Smalltalk-80 VM instruction set. Then I got derailed thinking about instruction sets, and now I'm posting about Erlang processes instead. The missing part in the middle, which is still spinning off distracting derivative ideas for me, was a notion of using Motorala 68K family 16-bit instruction sets as a VM instruction set, instead of something like the 8-bit Smalltalk-80 byte codes. One of several reasons for this idea is a rule of thumb: "don't invent things." If you consider a virtual assembler for a VM, don't invent an assembler (or, not until you succumb to exotic extensions). Instead, get a free ride on lots of existing books and tools for some existing, well-known, easy-to-understand instruction set. I happen to like m68k. Of course, if you don't run on an actual m68k chip, this means emulation of some kind. It's the emulation that leads to a wonderful game making the Erlang process style easy. The emulation of a chip's machine code adds an indirection causing some extra layer cost, but suddenly making many other complex things easier and perhaps cheaper to do. My original motive was something like this: debug all the compilation issues under a simulation, and later worry about real machine code. Then I noticed that sandboxing was going to be trivial. An emulator can manually memory map all space seen by the code run on the VM, so it's impossible for VM code to either read or write space unless the emulator wants to let it happen. So VM code simply can't touch other pages in the emulator's host process except those the emulator chooses to let appear in the virtual address space. Then, it's easy to think about sharing readonly pages among separate VM processes, simply by letting the VM's emulator map pages into multiple address spaces, with suitable refcounting. And while you're at it, you can make inter process messaging cheap too, because it's not necessary to actually copy any bytes around in memory to deliver a message in the same host process; you can simply map parts of the host address space into a virtual address space of a VM process. Also, when sychronization primitives for concurrency are done in the VM engine, then multitasking starts to look rather cooperative (read: cheap) in each VM process. And high level operations performed by a VM engine can be atomic from the perspective of each VM process. As an example, suppose you mimic m68k machine code and utilize A-line traps so instructions beginning with hex $A are not defined by the emulated chip, and instead cause (emulated) trap exceptions to system defined high level routines. This is how the original Macintosh architecture used 68000 traps to implement the Mac toolbox in ROM very cheaply, since each A-line instruction was only 16-bits in size. Effectively you get to dispatch to an arbitrary late-binding method using only one 16-bit instruction. On a real 68K chip, A-line traps are not cheap because because they cause exceptions with a lot of register maintenance costs as part of the dispatch to other 68K code somewhere else. But when a 68K engine is emulated, traps are even faster than a normal dispatch in VM code, because it can be done directly in native code in the VM engine, without emulation. Complex system operations can run native in a VM engine, and atomically from the view of a VM process. If you wanted to implement a Scheme system with, say, around 250 primitive methods, each primitive could be a separate 16-bt A-line trap instruction running native in the VM engine. Since the leading $A consumes only 4 bits of an instruction, the other 12 bits are good for 4096 different traps, without needing to start a new VM engine. So you could implement a very large high level library as atomic VM engine operations this way, without needing any emulation for code in the library. The last paragraph has little to do with Erlang processes per se. But it helps to illustrate the idea that a VM engine for a hosted language is basically the operating system from the point of the hosted language. The VM engine can use and manage space invisible to a VM process, and perform magic like high level atomic operations without a VM process knowing how or why. None of this requires any particular language to work. It works with any high level language you want to implement on top of emulated machine code. So it seems to be a runtime idea only, and not a programming language specific idea, unless you count the emulated assembler as the language. Weirdly, you can even use different high level languages in different VM processes, without them being able to interfere with one another. Here I'll stop suddenly because I mentioned all my main points, and I haven't finished thinking about all the details, and so lack any conclusion as yet. By Rys McCusker at 2006-06-05 03:31 | LtU Forum | previous forum topic | next forum topic | other blogs | 8032 reads
|
Browse archives
Active forum topics |
Recent comments
22 weeks 6 days ago
22 weeks 6 days ago
22 weeks 6 days ago
45 weeks 15 hours ago
49 weeks 2 days ago
50 weeks 6 days ago
50 weeks 6 days ago
1 year 1 week ago
1 year 6 weeks ago
1 year 6 weeks ago