Cross language runtimes
Long HN thread ensues. Many of the comments discuss the benefits and costs of basing pipes on typed objects rather than text streams. As someone who should be inclined to favor the typed object approach, I have to say that I think the text-only folks have the upper hand at the moment. The primary reason is that text as a lingua franca between programs ensures interoperability (and insurance against future changes to underlying object models) and encourages self-documenting code. Clearly the Achilles' heel is parsing/unparsing.
As happens often, one is reminded of the discussions of DSLs and pipelines in Jon Bentley's Programming Pearls...
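To make the trade-off concrete, here is a minimal sketch of the same filter written once over typed objects and once over a text stream. The Proc record, its fields, and the tab-separated line format are all made up for illustration; they are not taken from the thread.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    # Hypothetical record type, used only for this illustration.
    pid: int
    name: str
    rss_kb: int


def unparse(p):
    """Render a record as one tab-separated text line."""
    return f"{p.pid}\t{p.name}\t{p.rss_kb}"


def parse(line):
    """Recover a record from a text line; raises ValueError on a malformed line."""
    pid, name, rss_kb = line.split("\t")
    return Proc(int(pid), name, int(rss_kb))


procs = [Proc(1, "init", 1200), Proc(42, "my editor", 88000)]

# Typed pipeline: the filter works directly on fields.
typed_result = [p.name for p in procs if p.rss_kb > 10000]

# Text pipeline: every stage has to unparse and re-parse.
lines = [unparse(p) for p in procs]
text_result = [parse(line).name for line in lines if parse(line).rss_kb > 10000]

# The Achilles' heel: a field containing the delimiter breaks the parser.
try:
    parse(unparse(Proc(7, "a\tb", 1)))
    delimiter_broke_parser = False
except ValueError:
    delimiter_broke_parser = True
```

Both pipelines give the same answer here, and the text form can be consumed by any program that reads lines. The price is that each stage carries its own parser, and that parser can fail on data the typed version handles without trouble.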
Don Syme receives the Royal Academy of Engineering's Silver Medal for his work on F#. The citation reads:
F# is known for being a clear and more concise language that interoperates well with other systems, and is used in applications as diverse as analysing the UK energy market to tackling money laundering. It allows programmers to write code with fewer bugs than other languages, so users can get their programme delivered to market both rapidly and accurately. Used by major enterprises in the UK and worldwide, F# is both cross-platform and open source, and includes innovative features such as unit-of-measure inference, asynchronous programming and type providers, which have in turn influenced later editions of C# and other industry languages.
Draining the Swamp: Micro Virtual Machines as Solid Foundation for Language Development
Kunshan Wang, Yi Lin, Stephen Blackburn, Michael Norrish, Antony Hosking
Many of today's programming languages are broken. Poor performance, lack of features and hard-to-reason-about semantics can cost dearly in software maintenance and inefficient execution. The problem is only getting worse with programming languages proliferating and hardware becoming more complicated. An important reason for this brokenness is that much of language design is implementation-driven. The difficulties in implementation and insufficient understanding of concepts bake bad designs into the language itself. Concurrency, architectural details and garbage collection are three fundamental concerns that contribute much to the complexities of implementing managed languages. We propose the micro virtual machine, a thin abstraction designed specifically to relieve implementers of managed languages of the most fundamental implementation challenges that currently impede good design. The micro virtual machine targets abstractions over memory (garbage collection), architecture (compiler backend), and concurrency. We motivate the micro virtual machine and give an account of the design and initial experience of a concrete instance, which we call Mu, built over a two year period. Our goal is to remove an important barrier to performant and semantically sound managed language design and implementation.
Inside you will find the specification of an LLVM-inspired virtual instruction set with a memory model (which enables proper GC support), including a specification of concurrent weak-memory operations (reusing C(++)11, a debatable choice), relatively rich control-flow primitives (complete stack capture enabling coroutines or JIT-style de-optimization), and live code update.
The .NET Compiler Platform (Roslyn) provides open-source C# and Visual Basic compilers with rich code analysis APIs. You can build code analysis tools with the same APIs that Microsoft is using to implement Visual Studio!
In a nutshell: OPEN SOURCE C# COMPILER. Putting aside possible practical implications of this for the .NET ecosystem, I think it is good for programming language geeks to be able to peruse the source code for compilers and language tools.
For the debate about MS being evil, you can head directly to HN where you'll also find an explanation of what bootstrapping a compiler means.
Here's Gil Tene on Azul's Pauseless Garbage Collector for the JVM.
One of the key techniques that we use is massive and rapid manipulation of virtual memory mappings. We will change mappings of virtual to physical memory at the rate of Java allocation.
The same read barrier I mentioned before will also intercept any attempt to read a reference to an object that has been relocated, and that allows us to lazily relocate references without needing a pause. We compact by moving an entire virtual page worth of objects, we kind of blow it up, moving all the live objects to other places and thereby compacting them. But we don't try and locate and find all the pointers to that page immediately.
The challenge seems to be that standard OSes don't currently expose enough hooks for Azul to do this kind of thing, so their runtime must live either on their custom hardware and OS or in a virtual machine.
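The lazy relocation that Tene describes can be sketched with a toy heap. This is only an illustration of the idea of a read barrier healing stale references on load. It is not Azul's implementation, which works through virtual memory mappings rather than a lookup table, and the Heap class with its methods is hypothetical.

```python
class Heap:
    """Toy heap: integer addresses map to objects (dicts of fields)."""

    def __init__(self):
        self.objects = {}      # address -> dict of fields
        self.forwarding = {}   # old address -> new address
        self.next_addr = 0

    def alloc(self, fields):
        addr = self.next_addr
        self.next_addr += 1
        self.objects[addr] = dict(fields)
        return addr

    def relocate(self, addr):
        """Move an object without fixing up the pointers that refer to it."""
        new_addr = self.alloc(self.objects.pop(addr))
        self.forwarding[addr] = new_addr
        return new_addr

    def load_ref(self, holder, field):
        """Read barrier: every reference load goes through this check."""
        ref = self.objects[holder][field]
        healed = ref
        while healed in self.forwarding:       # follow forwarding entries
            healed = self.forwarding[healed]
        if healed != ref:
            self.objects[holder][field] = healed   # heal the stale reference
        return healed


heap = Heap()
target = heap.alloc({"value": 42})
holder = heap.alloc({"child": target})

new_target = heap.relocate(target)             # no pause to find all pointers
stale_before_load = heap.objects[holder]["child"]

loaded = heap.load_ref(holder, "child")        # barrier intercepts the stale read
healed_after_load = heap.objects[holder]["child"]
```

The point of the sketch is the ordering. The object moves first, the holder keeps a stale reference for a while, and the reference is corrected only when something actually reads it.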
a dynamically-typed concurrent language in which lightweight isolated processes communicate by message passing. Thorn includes powerful aggregate data types, a class-based object system, first-class functions, an expressive module system, and a variety of features supporting the gradual evolution of prototype scripts into robust programs.
Thorn is implemented by a compiler targeting the JVM and a Java interpreter, and syntactically resembles Scala, at least superficially.
One of those "features" is a unique (as far as I know) soft type system:
In Thorn, the type dyn (for dynamic) is assumed as default (and never written explicitly). At the other extreme, Thorn supports concrete types, as used in statically typed programming languages. A variable of a concrete type T is guaranteed to refer to a value of that type (or a subtype). [...] While concrete types help with performance and correctness, they introduce restrictions on how software can be used and make rapid development more difficult; scripting languages do not favor them.
As an intermediate step between the two, we propose like types, getting some of the safety of concrete types while retaining the flexibility of dynamic types. Concrete types for var x:T or fun f(x:T) are used in two main places. At a method call x.m(), a static type check ensures that x actually has an m method. At a binding or assignment, like x := y; or f(y), a static type check can ensure that y's value makes sense to assign to x, can reject it entirely, or can inspire a dynamic check. Like types, var x: like T or fun f(x:like T), give the expressive power of concrete type checks on method calls, but do not constrain binding or assignment. They do require runtime checks and thus may cause programs to fail with runtime type errors: sometimes fewer and never more than dynamic types do.
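The behaviour described in the quote can be roughly emulated in a few lines. This is a sketch under a big assumption: Python has no static checker here, so the check that Thorn performs statically at a method call (that T declares the method) is simulated at call time. The Like wrapper and the Point, Sprite and Rock classes are hypothetical names introduced for illustration.

```python
class Like:
    """Hypothetical emulation of a Thorn variable declared as 'like T'."""

    def __init__(self, interface, value):
        self._interface = interface
        self._value = value    # binding is unconstrained: any value is accepted

    def __getattr__(self, name):
        # Stand-in for the static check: T must declare the method.
        if not callable(getattr(self._interface, name, None)):
            raise TypeError(f"{self._interface.__name__} declares no method {name!r}")
        # Runtime check: the bound value must actually provide it.
        method = getattr(self._value, name, None)
        if not callable(method):
            raise TypeError(f"{type(self._value).__name__} value has no method {name!r}")
        return method


class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def move(self, dx, dy):
        return Point(self.x + dx, self.y + dy)


class Sprite:
    # Unrelated to Point by inheritance, but it has a move method.
    def move(self, dx, dy):
        return f"sprite moved by ({dx}, {dy})"

    def jump(self):
        return "boing"


class Rock:
    pass


sprite_as_point = Like(Point, Sprite())   # accepted although Sprite is not a Point
moved = sprite_as_point.move(1, 2)        # allowed: Point declares move
```

Calling sprite_as_point.jump() is rejected because Point declares no jump, even though the Sprite underneath has one. Like(Point, Rock()) binds without complaint and fails only when move is actually called, which is the runtime type error the quote warns about.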
Concurrency is also a little odd:
Every component (marked by the keyword spawn) runs in a different JVM. Component handles contains sufficient information to identify the node and port on which the component runs.
A couple of papers are linked from the home page: "Thorn - Robust, Concurrent, Extensible Scripting on the JVM", by Bard Bloom et al., is a general description of the language, from which the quotes above come; and "Integrating Typed and Untyped Code in a Scripting Language", by Tobias Wrigstad et al., has more information about like types.
I have not seen Thorn here before. Apologies if I have just missed it.
One of the future additions to C# announced by Anders Hejlsberg in this entertaining video from 2008 is Compiler as a Service. By that he means the ability to eval code strings (and I'm guessing that this will also be integrated with C#'s built-in AST objects).
He shows this off at around minute 59, to great effect and great excitement by the audience. It feels like an inflection point. There probably won't be another REPL-less language from now on.
I predict that after that, they'll add hygienic macros and quasisyntax.
Here's a little sausage making article for JVM language implementors. In Compiling Structural Types on the JVM: A Comparison of Reflective and Generative Techniques from Scala’s Perspective, Gilles Dubochet and Martin Odersky describe
Scala’s compilation technique of structural types for the JVM. The technique uses Java reflection and polymorphic inline caches. Performance measurements of this technique are presented and analysed. Further measurements compare Scala’s reflective technique with the “generative” technique used by Whiteoak to compile structural types. The article ends with a comparison of reflective and generative techniques for compiling structural types. It concludes that generative techniques may, in specific cases, exhibit higher performances than reflective approaches, but that reflective techniques are easier to implement and have fewer restrictions.
There's no discussion of the proposed JVM "method handles" and whether they might be an even better solution than runtime reflection.
Whiteoak was mentioned previously on LtU.
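For readers who have not met the combination of reflection and polymorphic inline caches, here is a rough sketch of the general idea. It is not the code the Scala compiler generates: the StructuralCallSite class, its max_entries limit, and the Duck and Robot classes are assumptions made for illustration, and it handles only ordinary instance methods.

```python
class StructuralCallSite:
    """One call site for a structurally typed call such as x.quack()."""

    def __init__(self, method_name, max_entries=4):
        self.method_name = method_name
        self.max_entries = max_entries   # hypothetical bound on cache size
        self.cache = {}                  # receiver class -> resolved function
        self.misses = 0

    def invoke(self, receiver, *args):
        cls = type(receiver)
        func = self.cache.get(cls)
        if func is None:
            # Slow path: reflective lookup by name on the receiver's class.
            # Raises AttributeError if the class lacks the method.
            self.misses += 1
            func = getattr(cls, self.method_name)
            if len(self.cache) < self.max_entries:
                self.cache[cls] = func
        # Fast path on later calls: one dictionary hit, no name lookup.
        return func(receiver, *args)


class Duck:
    def quack(self):
        return "Quack"


class Robot:
    def quack(self):
        return "Beep"


site = StructuralCallSite("quack")
results = [site.invoke(x) for x in (Duck(), Robot(), Duck(), Robot())]
```

Duck and Robot share no base class beyond object, yet both satisfy the structural requirement of having a quack method. The call site pays for the reflective lookup once per receiver class and reuses the cached result afterwards.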
ACM Press Release:
The ACM Special Interest Group on Programming Languages (SIGPLAN) today presents its first-ever Programming Languages Software Award to Chris Lattner of Apple Inc. for his design and development of the Low Level Virtual Machine (LLVM), a compiler infrastructure that has been quickly adopted by a wide array of industry and academic organizations. Since LLVM’s release as an open source compiler infrastructure in October 2003, companies including Apple, Adobe, and Cray have incorporated it into their commercial products, reflecting its simplicity, flexibility, and versatility.
VMKit: a Substrate for Managed Runtime Environments, VEE '10
Managed Runtime Environments (MREs), such as the JVM and the CLI, form an attractive environment for program execution, by providing portability and safety, via the use of a bytecode language and automatic memory management, as well as good performance, via just-in-time (JIT) compilation. Nevertheless, developing a fully featured MRE, including e.g. a garbage collector and JIT compiler, is a herculean task. As a result, new languages cannot easily take advantage of the benefits of MREs, and it is difficult to experiment with extensions of existing MRE based languages.
This paper describes and evaluates VMKit, a first attempt to build a common substrate that eases the development of high-level MREs. We have successfully used VMKit to build two MREs: a Java Virtual Machine and a Common Language Runtime. We provide an extensive study of the lessons learned in developing this infrastructure, and assess the ease of implementing new MREs or MRE extensions and the resulting performance. In particular, it took one of the authors only one month to develop a Common Language Runtime using VMKit. VMKit furthermore has performance comparable to the well established open source MREs Cacao, Apache Harmony and Mono, and is 1.2 to 3 times slower than JikesRVM on most of the DaCapo benchmarks.
So... One person built a CLR using VMKit in one month. One consequence of such faster development speeds is that language designers do not have to feel so restricted when targeting a Managed Runtime Environment for their language. If the MRE they want to target has restrictions, they can fork it. If the MRE specification has a gray area, then they can quickly prototype a solution to clarify what the behavior should be for that gray area of the specification. If you are a researcher/student and want to experiment with a new language design and implementation, then you can do so incrementally by first augmenting the MRE and then targeting your language to that new MRE; you can then benchmark the improvements by using the original MRE as a baseline.