Cross language runtimes

Project Snowflake: Non-blocking safe manual memory management in .NET

Project Snowflake: Non-blocking safe manual memory management in .NET by Matthew Parkinson, Kapil Vaswani, Manuel Costa, Pantazis Deligiannis, Aaron Blankstein, Dylan McDermott, Jonathan Balkind, Dimitrios Vytiniotis:

Garbage collection greatly improves programmer productivity and ensures memory safety. Manual memory management on the other hand often delivers better performance but is typically unsafe and can lead to system crashes or security vulnerabilities. We propose integrating safe manual memory management with garbage collection in the .NET runtime to get the best of both worlds. In our design, programmers can choose between allocating objects in the garbage collected heap or the manual heap. All existing applications run unmodified, and without any performance degradation, using the garbage collected heap. Our programming model for manual memory management is flexible: although objects in the manual heap can have a single owning pointer, we allow deallocation at any program point and concurrent sharing of these objects amongst all the threads in the program. Experimental results from our .NET CoreCLR implementation on real-world applications show substantial performance gains especially in multithreaded scenarios: up to 3x savings in peak working sets and 2x improvements in runtime.

Rather than trying to solve safe manual memory management using compile-time reasoning, you can move some of it into the runtime and raise dynamic failures on use-after-free and other hazards. With some judicious special types that track ownership and a type of borrowing, and some reasonable restrictions on how these types can be handled, you can achieve a nice framework for integrating manual and automatic memory management. The performance benefits for large heaps looks pretty clear.

PowerShell is open sourced and is available on Linux

Long HN thread ensues. Many of the comments discuss the benefits/costs of basing pipes on typed objects rather than text streams. As someone who should be inclined in favor of the typed object approach I have to say that I think the text-only folks have the upper hand at the moment. Primary reason is that text as a lingua franca between programs ensures interoperability (and insurance against future changes to underlying object models) and self-documenting code. Clearly the Achilles' heel is parsing/unparsing.

As happens often, one is reminded of the discussions of DSLs and pipelines in Jon Bentley's Programming Pearls...

Don Syme receives a medal for F#

Don Syme receives the Royal Academy of Engineering's Silver Medal for his work on F#. The citation reads:


F# is known for being a clear and more concise language that interoperates well with other systems, and is used in applications as diverse asanalysing the UK energy market to tackling money laundering. It allows programmers to write code with fewer bugs than other languages, so users can get their programme delivered to market both rapidly and accurately. Used by major enterprises in the UK and worldwide, F# is both cross-platform and open source, and includes innovative features such as unit-of-measure inference, asynchronous programming and type providers, which have in turn influenced later editions of C# and other industry languages.

Congratulations!

Draining the Swamp: Micro Virtual Machines as Solid Foundation for Language Development

Draining the Swamp: Micro Virtual Machines as Solid Foundation for Language Development
Kunshan Wang, Yi Lin, Stephen Blackburn, Michael Norrish, Antony Hosking
2015

Many of today's programming languages are broken. Poor performance, lack of features and hard-to-reason-about semantics can cost dearly in software maintenance and inefficient execution. The problem is only getting worse with programming languages proliferating and hardware becoming more complicated. An important reason for this brokenness is that much of language design is implementation-driven. The difficulties in implementation and insufficient understanding of concepts bake bad designs into the language itself. Concurrency, architectural details and garbage collection are three fundamental concerns that contribute much to the complexities of implementing managed languages. We propose the micro virtual machine, a thin abstraction designed specifically to relieve implementers of managed languages of the most fundamental implementation challenges that currently impede good design. The micro virtual machine targets abstractions over memory (garbage collection), architecture (compiler backend), and concurrency. We motivate the micro virtual machine and give an account of the design and initial experience of a concrete instance, which we call Mu, built over a two year period. Our goal is to remove an important barrier to performant and semantically sound managed language design and implementation.
Inside you will find the specification of an LLVM-inspired virtual instruction set with a memory model (enables proper GC support) including a specification of concurrent weak-memory operations (reusing C(++)11, a debatable choice), relatively rich control-flow primitive (complete stack capture enabling coroutines or JIT-style de-optimization), and live code update.

.NET Compiler Platform ("Roslyn")

The .NET Compiler Platform (Roslyn) provides open-source C# and Visual Basic compilers with rich code analysis APIs. You can build code analysis tools with the same APIs that Microsoft is using to implement Visual Studio!

In a nutshell: OPEN SOURCE C# COMPILER. Putting aside possible practical implications of this for the .NET ecosystem, I think it is good for programming language geeks to be able to peruse the source code for compilers and language tools.

For the debate about MS being evil, you can head directly to HN where you'll also find an explanation of what bootstrapping a compiler means.

Azul's Pauseless Garbage Collector

Here's Gil Tene on Azul's Pauseless Garbage Collector for the JVM.

One of the key techniques that we use is massive and rapid manipulation of virtual memory mappings. We will change mappings of virutal to physical memory at the rate of Java allocation.

And

The same read barrier I mentioned before will also intercept any attempt to read a reference to an object that has been relocated, and that allows us to lazily relocate references without needing a pause. We compact by moving an entire virtual page worth of objects, we kind of blow it up, moving all the live objects to other places and thereby compacting them. But we don't try and locate and find all the pointers to that page immediately.

The challenge seems to be that standard OSes don't currently have enough hooks for them to do this kind of thing so their runtime must live in either their custom hardware and OS or a virtual machine.

Thorn

Thorn is

a dynamically-typed concurrent language in which lightweight isolated processes communicate by message passing. Thorn includes powerful aggregate data types, a class-based object system, first-class functions, an expressive module system, and a variety of features supporting the gradual evolution of prototype scripts into robust programs.

Thorn is implemented by a compiler targeting the JVM and a Java interpreter, and syntactically resembles Scala, at least superficially.

One of those "features" is a unique (as far as I know) soft type system:

In Thorn, the type dyn (for dynamic) is assumed as default (and never written explicitly). At the other extreme, Thorn supports concrete types, as used in statically typed programming languages. A variable of a concrete type T is guaranteed to refer to a value of that type (or a subtype). [...] While concrete types help with performance and correctness, they introduce restrictions on how software can be used and make rapid development more difficult; scripting languages do not favor them.

As an intermediate step between the two, we propose like types, getting some of the safety of concrete types while retaining the flexibility of dynamic types. Concrete types for var x:T or fun f(x:T) are used in two main places. At a method call x.m(), a static type check ensures that x actually has an m method. At a binding or assignment, like x := y; or f(y), a static type check can ensure that y's value makes sense to assign to x, can reject it entirely, or can inspire a dynamic check. Like types, var x: like T or fun f(x:like T), give the expressive power of concrete type checks on method calls, but do not constrain binding or
assignment. They do require runtime checks and thus may cause programs to fail with runtime type errors: sometimes fewer and never more than dynamic types do.

Concurrency is also a little odd:

Every component (marked by the keyword spawn) runs in a different JVM. Component handles contains sufficient information to identify the node and port on which the component runs.

A couple of papers are linked to the home page; "Thorn - Robust, Concurrent, Extensible Scripting on the JVM", by Bard Bloom, et. al., is a general description of the language, from which come the quotes above; and "Integrating Typed and Untyped Code in a Scripting Language", by Tobias Wrigstad, et. al., with more information about like types.

I have not seen Thorn here before. Apologies if I have just missed it.

The Future of C#

One of the future additions to C# announced by Anders Hejlsberg in this entertaining video from 2008 is Compiler as a Service. By that he means the ability to eval code strings (and I'm guessing that this will also be integrated with C#'s built-in AST objects).

He shows this off at around minute 59, to great effect and great excitement by the audience. It feels like an inflection point. There probably won't be another REPL-less language from now on.

I predict that after that, they'll add hygienic macros and quasisyntax.

Compiling Structural Types on the JVM

Here's a little sausage making article for JVM language implementors. In Compiling Structural Types on the JVM: A Comparison of Reflective and Generative Techniques from Scala’s Perspective, Gilles Dubochet and Martin Odersky describe

Scala’s compilation technique of structural types for the JVM. The technique uses Java reflection and polymorphic inline caches. Performance measurements of this technique are presented and analysed. Further measurements compare Scala’s reflective technique with the “generative” technique used by Whiteoak to compile structural types. The article ends with a comparison of reflective and generative techniques for compiling structural types. It concludes that generative techniques may, in specific cases, exhibit higher performances than reflective approaches, but that reflective techniques are easier to implement and have fewer restrictions.

There's no discussion of the the proposed JVM "method handles" and whether they might be an even better solution than runtime reflection.

Whiteoak was mentioned previously on LtU.

SIGPLAN's first Programming Languages Software Award goes to LLVM

ACM Press Release:

The ACM Special Interest Group on Programming Languages (SIGPLAN) today presents its first-ever Programming Languages Software Award to Chris Lattner of Apple Inc. for his design and development of the Low Level Virtual Machine (LLVM), a compiler infrastructure that has been quickly adopted by a wide array of industry and academic organizations. Since LLVM’s release as an open source compiler infrastructure in October 2003, companies including Apple, Adobe, and Cray have incorporated it into their commercial products, reflecting its simplicity, flexibility, and versatility.

XML feed