Proceedings of the ACM on Programming Languages

Proceedings of the ACM on Programming Languages (PACMPL) is a Gold Open Access journal publishing research on all aspects of programming languages, from design to implementation and from mathematical formalisms to empirical studies. Each issue of the journal is devoted to a particular subject area within programming languages and will be announced through publicized Calls for Papers.

See the ToC of the September 2017, ICFP issue, here. Some very cool stuff.

Congrats!

Graydon Hoare: What next for compiled languages?

Since everybody is talking about this post,we might as well.

Key topics discussed: modules(you know, real ones); errors ("there are serious abstraction leakages and design trade-offs in nearly every known approach"); Coroutines, async/await, "user-visible" asynchronicity; effect systems, more generally (you could see that coming, couldn't you?); Extended static checking (ESC), refinement types, general dependent-typed languages; and formalization ("we have to get to the point where we ship languages -- and implementations -- with strong, proven foundations").

He goes on to discuss a whole grab bag of "potential extras" for mainstream languages, including the all time favorite: units of measure.

Feel free to link to the relevant discussions from the LtU archive...

Review of Graham Hutton's Programming in Haskell, 2e

A concise review by Simon Thompson of the second edition of Graham Hutton's Programming in Haskell. The first edition was published in 2007, but chapters were written earlier, and the review focuses on how the language has changed since then, embracing the "categorical / algebraic approach more fully".

Happy Birthday, dear Lambda: 17 is good edition

Seventeen years ago to the day, LtU was born. I guess it's about time I stop opening these birthday messages by saying how remarkable this longevity is (this being the fate of Hollywood actresses over 25). Still, I cannot resist mentioning that 17 is "good" (טוב) in gematria, which after all is one of the oldest codes there are. It's is very cool that the last couple of weeks had a flurry of activity. This old site still got game.

I will not try to summarize or pontificate. The community has grown too big and too unruly for that and I have been more an absent landlord recently than a true participant (transitioning from CS to being a professional philosopher of science and having kids took a bit more of my free time than I expected).

One thing I always cherished about LtU was that we welcomed both professional, academic work, and everything and anything that was cool and fun in programming languages. It was never a theory only site. So here's a little birthday party game instead of a long summary.

Which new (or old) languages inspire you to think that a good language can smoothly allow people to reach heretofore hard to reach semantic or abstraction levels?

I mean things that affect how the little guy thinks, not formal semantics, category theory, or what have you.

I'll start with two unconventional languages that I have had the pleasure (and exasperation) of using recently. Both are in some respects descendants of Logo, the language through which I was introduced to programming when I was a ten year old child in Brookline. They are NetLogo and ScratchJr.

NetLogo is a language for building agent based models (here's a classic ABM for you to enjoy; if you install NetLogo there's an implementation in the model library). While some aspects of the language semantics (and syntax) are irritating, NetLogo is very good at what it does. I may say more in the comments, but the key is that a simulation consists of multiple agents, who can move and interact, and the language makes building such simulations straightforward. There is a central clock; you can address multiple agents using conditions ("ask guys with [color = red] [die]"); implicitly have code run by each agent etc. In fact, you hardly think about these issues. If you have no previous background in programming, it feels natural that keeping track of these and other details is not part of the task programming.

ScratchJr is for young kids. It allows them to create little animated scenes, which may be responsive to touch and so on. You can record sounds and take pictures and use them as first class elements in your animations. But the nice thing for me was to notice how natural it is for kids to use the event-driven model (you can "write" several bits of code, and each will execute when the triggering event happens; no need to think about orchestrating this) as well as intuitively understand how this may involves things happening concurrently. These things just emerge from the way the animations are built, they are not concepts the programmer has to explicitly be aware of (which is a good thing, considering she is typically a five year old).

When I see philosophy students writing netlogo and reasoning about the behavior of the agents and when I see kids playing with ScratchJr, I am reminded why I found this business of "engineering abstractions" so enticing when I first used structured programming to design vocabulary for the program I was writing and when I heard some language described as a language for stratified design.

So which are your nominations for cool language based abstractions, for the little guy? Just to give us all some motivation and maybe get me worked up enough to finally delve into algebraic effects?

Happy birthday, LtU-ers. Keep fighting the good fight!

Implementing Algebraic Effects in C

Implementing Algebraic Effects in C by Daan Leijen:

We describe a full implementation of algebraic effects and handlers as a library in standard and portable C99, where effect operations can be used just like regular C functions. We use a formal operational semantics to guide the C implementation at every step where an evaluation context corresponds directly to a particular C execution context. Finally we show a novel extension to the formal semantics to describe optimized tail resumptions and prove that the extension is sound. This gives two orders of magnitude improvement to the performance of tail resumptive operations (up to about 150 million operations per second on a Core i7@2.6GHz)

Another great paper by Daan Leijen, this time on a C library with immediate practical applications at Microsoft. The applicability is much wider though, since it's an ordinary C library for defining and using arbitrary algebraic effects. It looks pretty usable and is faster and more general than most of the C coroutine libraries that already exist.

It's a nice addition to your toolbox for creating language runtimes in C, particularly since it provides a unified, structured way of creating and handling a variety of sophisticated language behaviours, like async/await, in ordinary C with good performance. There has been considerable discussion here of C and low-level languages with green threads, coroutines and so on, so hopefully others will find this useful!

Project Snowflake: Non-blocking safe manual memory management in .NET

Project Snowflake: Non-blocking safe manual memory management in .NET by Matthew Parkinson, Kapil Vaswani, Manuel Costa, Pantazis Deligiannis, Aaron Blankstein, Dylan McDermott, Jonathan Balkind, Dimitrios Vytiniotis:

Garbage collection greatly improves programmer productivity and ensures memory safety. Manual memory management on the other hand often delivers better performance but is typically unsafe and can lead to system crashes or security vulnerabilities. We propose integrating safe manual memory management with garbage collection in the .NET runtime to get the best of both worlds. In our design, programmers can choose between allocating objects in the garbage collected heap or the manual heap. All existing applications run unmodified, and without any performance degradation, using the garbage collected heap. Our programming model for manual memory management is flexible: although objects in the manual heap can have a single owning pointer, we allow deallocation at any program point and concurrent sharing of these objects amongst all the threads in the program. Experimental results from our .NET CoreCLR implementation on real-world applications show substantial performance gains especially in multithreaded scenarios: up to 3x savings in peak working sets and 2x improvements in runtime.

Rather than trying to solve safe manual memory management using compile-time reasoning, you can move some of it into the runtime and raise dynamic failures on use-after-free and other hazards. With some judicious special types that track ownership and a type of borrowing, and some reasonable restrictions on how these types can be handled, you can achieve a nice framework for integrating manual and automatic memory management. The performance benefits for large heaps looks pretty clear.

The Syntax and Semantics of Quantitative Type Theory

The Syntax and Semantics of Quantitative Type Theory by Robert Atkey:

Type Theory offers a tantalising promise: that we can program and reason within a single unified system. However, this promise slips away when we try to produce efficient programs. Type Theory offers little control over the intensional aspect of programs: how are computational resources used, and when can they be reused. Tracking resource usage via types has a long history, starting with Girard's Linear Logic and culminating with recent work in contextual effects, coeffects, and quantitative type theories. However, there is conflict with full dependent Type Theory when accounting for the difference between usages in types and terms. Recently, McBride has proposed a system that resolves this conflict by treating usage in types as a zero usage, so that it doesn't affect the usage in terms. This leads to a simple expressive system, which we have named Quantitative Type Theory (QTT).

McBride presented a syntax and typing rules for the system, as well as an erasure property that exploits the difference between “not used” and “used”, but does not do anything with the finer usage information. In this paper, we present present a semantic interpretation of a variant of McBride's system, where we fully exploit the usage information. We interpret terms simultaneously as having extensional (compile-time) content and intensional (runtime) content. In our example models, extensional content is set-theoretic functions, representing the compile-time or type-level content of a type-theoretic construction. Intensional content is given by realisers for the extensional content. We use Abramsky et al.'s Linear Combinatory Algebras as realisers, yield a large range of potential models from Geometry of Interaction, graph models, and syntactic models. Read constructively, our models provide a resource sensitive compilation method for QTT.

To rigorously define the structure required for models of QTT, we introduce the concept of a Quantitative Category with Families, a generalisation of the standard Category with Families class of models of Type Theory, and show that this class of models soundly interprets Quantitative Type Theory.

Resource-aware programming is a hot topic these days, with Rust exploiting affine and ownership types to scope and track resource usage, and with Ethereum requiring programs to spend "gas" to execute. Combining linear and dependent types has proven difficult though, so making it easier to track and reason about resource usage in dependent type theories would then be a huge benefit to making verification more practical in domains where resources are limited.

hobbes, Morgan Stanley OSS

Over the last few years, I have been developing hobbes -- a programming language, JIT compiler, and database system -- as part of my work for Morgan Stanley. It has become a critical piece of infrastructure in our low-latency, high-volume trading applications, and we have decided to release the source code to the public on github (currently can be built for recent Linux and macOS platforms):

github.com/Morgan-Stanley/hobbes

The database system is a lightweight (self-contained, header-only) library used by applications to efficiently log structured (binary) data over shared memory to minimize application latency and reflect application type structure as accurately as possible in log files. We use this to record a very detailed image of application state over time, which we analyze/query out of band.

The JIT compiler can be used embedded in another application (e.g. to "hot patch" an application with very efficient, precisely typed intercepts) or as a standalone interactive interpreter (e.g.: to monitor and query application log data).

The programming language is a variant of Haskell (HM type inference, algebraic data types, qualified types / type classes) with some adjustments to help reduce boilerplate and derive very efficient machine code. For example, we use "structural" record, variant, and (iso-)recursive types to represent data as it's naturally represented in applications and can be deconstructed generically at compile time (e.g. a record can be printed if its first field can be printed and its rest can be printed, a variant can be printed if its first case can be printed and its rest cases can be printed, a recursive type can be printed if its one-step unrolling can be printed, etc).

We are actively using this on major projects, we will continue to develop this github project as we need new features, and we are interested in engaging others outside of the firm for their thoughts, feedback, and hopefully pull requests. :)

p5.js

p5.js is a JavaScript library inspired by Processing. Seems it could be a fun way to introduce non-CS types to programming. The demo is particularly well done; check it out first. The actual home of the project is here.

RustBelt: Securing the Foundations of the Rust Programming Language

RustBelt: Securing the Foundations of the Rust Programming Language by Ralf Jung, Jacques-Henri Jourdan, Robbert Krebbers, Derek Dreyer:

Rust is a new systems programming language that promises to overcome the seemingly fundamental tradeoff between high-level safety guarantees and low-level control over resource management. Unfortunately, none of Rust’s safety claims have been formally proven, and there is good reason to question whether they actually hold. Specifically, Rust employs a strong, ownership-based type system, but then extends the expressive power of this core type system through libraries that internally use unsafe features. In this paper, we give the first formal (and machine-checked) safety proof for a language representing a realistic subset of Rust. Our proof is extensible in the sense that, for each new Rust library that uses unsafe features, we can say what verification condition it must satisfy in order for it to be deemed a safe extension to the language. We have carried out this verification for some of the most important libraries that are used throughout the Rust ecosystem.

Rust is definitely pushing the envelope in a new direction, but there's always a little wariness around using libraries that make use of unsafe features, since "safety with performance" is a main reason people want to use Rust. So this is a great step in the right direction!