
Seeking nearly anything re: so-called language "bootstrapping" process

So like 30 years ago in Byte, some cool dude had an m4 clone and an assembler on a CP/M machine and managed to bootstrap a full Pascal development environment.

Ok, I exaggerate, but it's hard to meet an old Forth'er who doesn't have more than one Forth system bootstrap story.

I don't know if there's a formal definition of "bootstrap" (esp. re: programming languages/environments), so any help here is welcome too.

But in this day of compiling to C, the JVM or CLR, LLVM, C--, Boehm's "instant" GC, huge and complex runtime systems and tools + tools + tools, it got me thinking about the days of yore (or current practice, even better) of "bootstrapping" a language and programming environment from "minimal components," to say the least.

I'm also interested if there are *qualities* of various languages/environments that lend themselves to bootstrapping from small parts, while other languages lack these qualities for whatever reasons.

Thanks much in advance.

Scott

Bytecodes meet Combinators: invokedynamic on the JVM

Bytecodes meet Combinators: invokedynamic on the JVM. John Rose. VMIL'09.

The Java Virtual Machine (JVM) has been widely adopted in part because of its classfile format, which is portable, compact, modular, verifiable, and reasonably easy to work with. However, it was designed for just one language—Java—and so when it is used to express programs in other source languages, there are often “pain points” which retard both development and execution. The most salient pain points show up at a familiar place, the method call site.
To generalize method calls on the JVM, the JSR 292 Expert Group has designed a new invokedynamic instruction that provides user-defined call site semantics. In the chosen design, invokedynamic serves as a hinge-point between two coexisting kinds of intermediate language: bytecode containing dynamic call sites, and combinator graphs specifying call targets. A dynamic compiler can traverse both representations simultaneously, producing optimized machine code which is the seamless union of both kinds of input. As a final twist, the user-defined linkage of a call site may change, allowing the code to adapt as the application evolves over time. The result is a system balancing the conciseness of bytecode with the dynamic flexibility of function pointers.

The abstract is pretty vague, but this paper is actually quite interesting, particularly if you're interested in meta-object protocols and if, like me, you don't have the interest or patience to read JSRs. Of course, invokedynamic has been discussed many times over the years. The wheels of Java turn slowly...
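
For a rough feel of what "relinkable call sites" and "combinators" mean in practice, here is a minimal sketch (mine, not from the paper) using the java.lang.invoke API, which is where the JSR 292 work ended up in the standard library. The class RelinkSketch and the two greet methods are made up for illustration. Java source cannot emit an invokedynamic instruction directly, so the sketch calls through the site's dynamicInvoker() instead; a MutableCallSite plays the role of the dynamic call site, and MethodHandles.filterReturnValue is one small example of a combinator.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

public class RelinkSketch {
    // Two hypothetical call targets sharing the signature (String)String.
    static String greetFormal(String name) { return "Good day, " + name; }
    static String greetCasual(String name) { return "hey " + name; }

    static final MethodHandles.Lookup LOOKUP = MethodHandles.lookup();
    static final MethodType SIG = MethodType.methodType(String.class, String.class);

    // Resolve one of the static methods above into a method handle.
    static MethodHandle find(String name) throws ReflectiveOperationException {
        return LOOKUP.findStatic(RelinkSketch.class, name, SIG);
    }

    public static String demo() {
        try {
            // The call site starts out linked to greetFormal.
            MutableCallSite site = new MutableCallSite(find("greetFormal"));
            // The invoker always dispatches to whatever the site's current target is.
            MethodHandle invoker = site.dynamicInvoker();
            String first = (String) invoker.invokeExact("Scott");

            // Build a small combinator graph: greetCasual, then String.toUpperCase.
            MethodHandle upper = LOOKUP.findVirtual(
                    String.class, "toUpperCase", MethodType.methodType(String.class));
            MethodHandle shouted = MethodHandles.filterReturnValue(find("greetCasual"), upper);

            // Relink the same call site; the invoker handle itself is unchanged.
            site.setTarget(shouted);
            String second = (String) invoker.invokeExact("Scott");

            return first + " | " + second;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "Good day, Scott | HEY SCOTT"
    }
}
```

The point of the design, as I read the abstract, is that the caller's code never changes: only the target behind the call site does, and the JIT is free to inline through the handle graph.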

The perfect advanced programming language for the productive industrial developer

To each their own language, but am I alone in aspiring to a productive, efficient and fun programming language for the advanced programming professional?

Here's a list of 50 items I'd like to see for the basis of a new language:

* runs really fast (compiles to native code) and has a performance profile within 20% of C.
* compiles really fast (have you seen Go compile?)
* has an "interpreter" (compiles and runs code on the fly)
* can be used as a systems programming language (i.e., you can write device drivers with this stuff)
* supports both functional and imperative programming (leans on the functional)
* functional programming is pure and identifiable at compile time.
* imperative programming is identifiable at compile time.
* passing an imperative function to a higher order pure function renders the pure code impure.
* for/while loops and mutability allowed in pure functions so long as the output is pure (and there's no IO and such)
* Mutability allowed within imperative code, but contained within function body for pure code.
* OOP but no classes (see Haskell's type classes and Go's interfaces)
* Strongly typed and supporting Generic programming
* Soft-realtime GC
* NO null pointers
* Continuations
* Data types can be extended, methods can be specialised.
* No exceptions, but a similar throw/handle mechanism for errors _within_ a single body of code to increase readability.
* Fault-Tolerance built right into the library
* Inner functions
* name spaces
* modules
* neither lazy nor eager evaluation by default, but three distinct function calling operators. CALLER PICKS {lazy, eager, usecallersemantics}
* exception to the above, function arguments -- lazy by default (so you can write your own control loops).
* understandable compiler errors
* efficient pure array programming
* efficient in-memory database structure (multi-index table data structure)
* easily parse binary data (see Erlang's bit syntax)
* compiled regular expression syntax
* extensive message-based concurrency
* multi-core, multi-node runtime environment (can relocate a computation from one box to the next and back again)
* parsable code so an IDE is easy to build
* possible and straightforward to provide a contextual method list in the IDE: it's a huge productivity booster
* executable carries meta-data such as dependencies and run-time reflection.
* supports embedding DSLs with code (e.g. so you can declare XML data and have it parsed and turned into code at compile time)
* supports embedded assembly code --automatically marking the code as "unsafe"
* scalable: not declaring a module header imports a standard bunch of stuff that makes it easy to write the kind of programs people write in Python, Perl, or Ruby.
* must be able to mark an executable as "safe" and indicate dependencies in meta-data so it can be run in a sand-boxed environment
* fast string processing
* UTF-8 support out of the box
* Module split into "public" (on top and default) and "private" sections. --no mucking about retyping all the "exported" functions
* Immediate guaranteed collection of objects living on the stack: support for RAII (great for managing resources other than memory)
* Haskell-style indenting
* pattern matching
* guards
* Good debugger which supports back-stepping
* No debug build vs opt build.. Debug info efficiently ignored unless run under a debugger.
* currying possible, but explicit .. e.g: map(someFun,__): Currying can lead to hard-to-figure-out compile errors.
* Scala-style _. i.e. _.moo() creates a lambda on the object. Similarly run(_, 42, _) returns a function with two args.
* no semi-colons at the end of the line
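
To make a couple of these items concrete, here is a rough sketch of how two of them can be approximated in an existing language (Java, picked only because it is widely known). Everything named here (WishlistSketch, myWhile, run, runWith42, sumTo) is hypothetical. Lazy function arguments, the thing that lets you write your own control loops, are emulated by passing closures, and the explicit run(_, 42, _) style of partial application is emulated with a lambda.

```java
import java.util.function.BiFunction;
import java.util.function.BooleanSupplier;

public class WishlistSketch {
    // A user-defined control loop. Both arguments are passed unevaluated
    // (as closures), which is the effect "lazy by default" arguments would give.
    static void myWhile(BooleanSupplier cond, Runnable body) {
        while (cond.getAsBoolean()) {
            body.run();
        }
    }

    // A hypothetical three-argument function to partially apply.
    static int run(int a, int b, int c) {
        return a * b + c;
    }

    // Explicit partial application: the analogue of run(_, 42, _),
    // yielding a function of the two remaining arguments.
    static BiFunction<Integer, Integer, Integer> runWith42() {
        return (a, c) -> run(a, 42, c);
    }

    // Sums 1..n using the user-defined loop. The mutation is confined to
    // this function body, so the function looks pure from the outside.
    static int sumTo(int n) {
        int[] state = {0, 0}; // state[0] = counter, state[1] = running total
        myWhile(() -> state[0] < n, () -> {
            state[0]++;
            state[1] += state[0];
        });
        return state[1];
    }

    public static void main(String[] args) {
        System.out.println(sumTo(4));                 // prints 10
        System.out.println(runWith42().apply(2, 3));  // prints 87
    }
}
```

The closure wrapping and the one-element-array trick for mutable state are exactly the kind of ceremony the wishlist is trying to get rid of, which is rather the point.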