Presented annually to the author of the outstanding doctoral dissertation in the area of Programming Languages. The award includes a prize of $1,000. The winner can choose to receive the award at ICFP, OOPSLA, POPL, or PLDI.
Nominations must be submitted to the secretary of SIGPLAN by January 5th 2014 to be considered for this year's award. The nominated dissertation must be available in an English language version to facilitate evaluation by the selection committee and have been awarded in the year 2013.
I know of some outstanding dissertations. I am sure you do to. So why not nominate them for this honor (for further details see here)?
Taking Off the Gloves with Reference Counting Immix, by Rifat Shahriyar, Stephen M. Blackburn, and Kathryn S. McKinley:
Despite some clear advantages and recent advances, reference counting remains a poor cousin to high-performance tracing garbage collectors. The advantages of reference counting include a) immediacy of reclamation, b) incrementality, and c) local scope of its operations. After decades of languishing with hopelessly bad performance, recent work narrowed the gap between reference counting and the fastest tracing collectors to within 10%. Though a major advance, this gap remains a substantial barrier to adoption in performance-conscious application domains. Our work identiﬁes heap organization as the principal source of the remaining performance gap. We present the design, implementation, and analysis of a new collector, RCImmix, that replaces reference counting’s traditional free-list heap organization with the line and block heap structure introduced by the Immix collector. The key innovations of RCImmix are 1) to combine traditional reference counts with per-line live object counts to identify reusable memory and 2) to eliminate fragmentation by integrating copying with reference counting of new objects and with backup tracing cycle collection. In RCImmix, reference counting offers efﬁcient collection and the line and block heap organization delivers excellent mutator locality and efﬁcient allocation. With these advances, RCImmix closes the 10% performance gap, outperforming a highly tuned production generational collector. By removing the performance barrier, this work transforms reference counting into a serious alternative for meeting high performance objectives for garbage collected languages.
A new reference counting GC based on the Immix heap layout, which purports to close the remaining performance gap with tracing collectors. It builds on last year's work, Down for the count? Getting reference counting back in the ring, which describes various optimizations to raw reference counting that make it competitive with basic tracing. There's a remaining 10% performance gap with generational tracing that RCImmix closes by using the Immix heap layout with bump pointer allocation (as opposed to free lists typically used in RC). The improved cache locality of allocation makes RCImmix even faster than the generational tracing Immix collector.
However, the bump pointer allocation reduces the incrementality of reference counting and would impact latency. One glaring omission of this paper is the absence of latency/pause time measurements, which is typical of reference counting papers since ref counting is inherently incremental. Since RCImmix trades off some incrementality for throughput by using bump pointer allocation and copy collection, I'm curious how this impacts the pause times.
Reference counting has been discussed a few times here before, and some papers on past ref-counting GC's have been posted in comments, but this seems to be the first top-level post on competitive reference counting GC.
LVars are one outcome of Lindsey Kuper's ongoing
PhD work at Indiana University. They generalize existing models for
deterministic parallelism by considering a general framework of
monotonic read and write operations. They were briefly mentioned
on LtU before (along with the strongly related work on Bloom in the distributed
systems community), and were recently presented in two
distinct and complementary articles.
The first article describes the basic building blocks and ideas of LVars:
LVars: Lattice-Based Data Structures for Deterministic Parallelism
Lindsey Kuper, Ryan R. Newton
Programs written using a deterministic-by-construction model of
parallel computation are guaranteed to always produce the same
observable results, offering programmers freedom from subtle,
hard-to-reproduce nondeterministic bugs that are the scourge of
parallel software. We present LVars, a new model for deterministic-
by-construction parallel programming that generalizes existing
single-assignment models to allow multiple assignments that are
monotonically increasing with respect to a user-specified lattice.
LVars ensure determinism by allowing only monotonic writes and
"threshold" reads that block until a lower bound is reached. We
give a proof of determinism and a prototype implementation for a
language with LVars and describe how to extend the LVars model
to support a limited form of nondeterminism that admits failures
but never wrong answers
The second relaxes the original model by introducing failure, which
widens its applicability:
Freeze After Writing: Quasi-Deterministic Parallel Programming with LVars
Lindsey Kuper, Aaron Turon, Neelakantan Krishnaswami, Ryan R. Newton
Deterministic-by-construction parallel programming models offer
programmers the promise of freedom from subtle, hard-to-reproduce
nondeterministic bugs in parallel code. A principled approach to
deterministic-by-construction parallel programming with shared state
is offered by LVars: shared memory locations whose semantics are
defined in terms of a user-specified lattice. Writes to an LVar take
the least upper bound of the old and new values with respect to the
lattice, while reads from an LVar can observe only that its contents
have crossed a specified threshold in the lattice. Although it
guarantees determinism, this interface is quite limited.
We extend LVars in two ways. First, we add the ability to “freeze”
and then read the contents of an LVar directly. Second, we add the
ability to attach callback functions to an LVar, allowing events to be
triggered by writes to it. Together, callbacks and freezing enable
an expressive and useful style of parallel programming. We prove that
in a language where communication takes place through freezable LVars,
programs are at worst quasi-deterministic: on every run, they either
produce the same answer or raise an error. We demonstrate the
viability of our approach by implementing a library for Haskell
supporting a variety of LVar-based data structures, together with
two case studies that illustrate the programming model and yield
promising parallel speedup.
Something I personally found surprising and impressive about LVars is
that, while I was initially interested in the very formal aspects of
providing a theoretical framework for deterministic concurrency, it
very quickly produced a practical library that people can use to write
parallel program -- and competitive with existing high-performance
approaches. As described in a
recent blog post, a Haskell library is available on Hackage -- but
surely LVars-inspired libraries could make sense in a lot of other
languages as well.
Andreas Abel and Brigitte Pientka's Well-Founded
Recursion with Copatterns; a Unified Approach to Termination and
Productivity is one of my highlights of the just-finished ICFP
2013, but it makes sense to focus on the first paper on this work,
published at POPL back in January.
Copatterns: Programming Infinite Structures by Observations
Andreas Abel, Brigitte Pientka, David Thibodeau, Anton Setzer
Inductive datatypes provide mechanisms to define finite data such as
finite lists and trees via constructors and allow programmers to
analyze and manipulate finite data via pattern matching. In this
paper, we develop a dual approach for working with infinite data
structures such as streams. Infinite data inhabits coinductive
datatypes which denote greatest fixpoints. Unlike finite data which is
defined by constructors we define infinite data by observations. Dual
to pattern matching, a tool for analyzing finite data, we develop the
concept of copattern matching, which allows us to synthesize infinite
data. This leads to a symmetric language design where pattern matching
on finite and infinite data can be mixed. We present a core language
for programming with infinite structures by observations together with
its operational semantics based on (co)pattern matching and describe
coverage of copatterns. Our language naturally supports both
call-by-name and call-by-value interpretations and can be seamlessly
integrated into existing languages like Haskell and ML. We prove type
soundness for our language and sketch how copatterns open new
directions for solving problems in the interaction of coinductive and
Codata has been often discussed here and elsewhere. See notably the
discussion on Turner's Total Functional
Programming (historical note: this 2004 beautification of the
original 1995 paper which had much of the same ideas), and on the
language. Given those precedents, it would be easy for the quick
reader to "meh" on the novelty of putting "observation" first
(elimination rather than introduction rules) when talking about
codata; yet the above paper is the first concrete, usable presentation
of an observation in a practical setting that feels right,
and it solves long-standing problem that current dependently-typed
languages (Coq and Agda) have.
Coinduction has an even more prominent role, due to its massive use
to define program equivalence in concurrent process calculi; the
relevant LtU discussion being about Davide Sangiorgi's On the origins of
Bisimulation, Coinduction, and Fixed Points. The POPL'13 paper
doesn't really tell us how coinduction should be seen with
copatterns. It does not adress the question of termination, which is
the topic of the more recent ICFP'13 paper, but I would say that the
answer on that point feels less definitive.
XKCD comic on functional programming. Overlay money quote:
Functional programming combines the flexibility and power of abstract mathematics with the intuitive clarity of abstract mathematics.
The Size-Change Termination Principle for Constructor Based Languages, by Pierre Hyvernat:
This paper describes an automatic termination checker for a generic first-order call-by-value language in ML style. We use the fact that value are built from variants and tuples to keep some information about how arguments of recursive call evolve during evaluation. The result is a criterion for termination extending the size-change termination principle of Lee, Jones and Ben-Amram that can detect size changes inside subvalues of arguments. Moreover the corresponding algorithm is easy to implement, making it a good candidate for experimentation.
Looks like a relatively straightforward and complete description of a termination checker based on a notion of 'sized types' limited to first-order programs. LtU has covered this topic before, although this new paper doesn't seem to reference that particular Abel work.
Types for Flexible Objects, by Pottayil Harisanker Menon, Zachary Palmer, Alexander Rozenshteyn, Scott Smith:
Scripting languages are popular in part due to their extremely flexible objects. These languages support numerous object features, including dynamic extension, mixins, traits, and ﬁrst-class messages. While some work has succeeded in typing these features individually, the solutions have limitations in some cases and no project has combined the results.
In this paper we deﬁne TinyBang, a small typed language containing only functions, labeled data, a data combinator, and pattern matching. We show how it can directly express all of the aforementioned ﬂexible object features and still have sound typing. We use a subtype constraint type inference system with several novel extensions to ensure full type inference; our algorithm reﬁnes parametric polymorphism for both flexibility and efficiency. We also use TinyBang to solve an open problem in OO literature: objects can be extended after being messaged without loss of width or depth subtyping and without dedicated metatheory. A core subset of TinyBang is proven sound and a preliminary implementation has been constructed.
An interesting paper I stumbled across quite by accident, it purports quite an ambitious set of features: generalizing previous work on first-class cases while supporting subtyping, mutation, and polymorphism all with full type inference, in an effort to match the flexibility of dynamically typed languages.
It does so by introducing a host of new concepts that are almost-but-not-quite generalizations of existing concepts, like "onions" which are kind of a type-indexed extensible record, and "scapes" which are sort of a generalization of pattern matching cases.
Instead of approaching objects via a record calculus, they approach it using its dual as variant matching. Matching functions then have degenerate dependent types, which I first saw in the paper Type Inference for First-Class Messages with Match-Functions. Interesting aside, Scott Smith was a coauthor on this last paper too, but it isn't referenced in the "flexible objects" paper, despite the fact that "scapes" are "match-functions".
Overall, quite a dense and ambitous paper, but the resulting TinyBang language looks very promising and quite expressive. Future work includes making the system more modular, as it currently requires whole program compilation, and adding first-class labels, which in past work has led to interesting results as well. Most work exploiting row polymorphism is particularly interesting because it supports efficient compilation to index-passing code for both records and variants. It's not clear if onions and scapes are also amenable to this sort of translation.
Edit: a previous paper was published in 2012, A Practical, Typed Variant Object Model -- Or, How to Stand On Your Head and Enjoy the View. BigBang is their language that provides syntactic sugar on top of TinyBang.
Edit 2: commas fixed, thanks!
Dynamic Region Inference, by David Pereira John Aycock:
We present a garbage collection scheme based on reference counting and region inference which, unlike the standard reference counting algorithm, handles cycles correctly. In our algorithm, the fundamental operations of region inference are performed dynamically. No assistance is required from the programmer or the compiler, making our algorithm particularly well-suited for use in dynamically-typed languages such as scripting languages. We provide a detailed algorithm and demonstrate how it can be implemented efficiently.
A novel garbage collector that solves reference counting's cycle problems by introducing "regions", which demarcate possibly cyclic subgraphs. These regions are updated by merge and split operations that occur on pointer update and incrementally on region allocation, respectively, ie. adding a pointer to B into aggregate C merges their regions, and trying to allocate a new region first attempts to split some random candidate region by computing the local reference counts via union-find of the region's members.
Obviously dynamic regions don't share contiguous storage like their static counterparts, so "regions" here are merely a concept to get reference counting to work. The implementation adds two words to each object, one for pointing to the object's current region, the other for a "next" pointer for the next object in the region.
The practicality of this approach isn't clear compared to other cycle detection algorithms, and no benchmarks are provided. I haven't found any follow-up work either.
Conor McBride gave an 8-lecture summer course on Dependently typed metaprogramming (in Agda) at the Cambridge University Computer Laboratory:
Dependently typed functional programming languages such as Agda are capable of expressing very precise types for data. When those data themselves encode types, we gain a powerful mechanism for abstracting generic operations over carefully circumscribed universes. This course will begin with a rapid depedently-typed programming primer in Agda, then explore techniques for and consequences of universe constructions. Of central importance are the “pattern functors” which determine the node structure of inductive and coinductive datatypes. We shall consider syntactic presentations of these functors (allowing operations as useful as symbolic differentiation), and relate them to the more uniform abstract notion of “container”. We shall expose the double-life containers lead as “interaction structures” describing systems of effects. Later, we step up to functors over universes, acquiring the power of inductive-recursive definitions, and we use that power to build universes of dependent types.
The lecture notes, code, and video captures are available online.
As with his previous course, the notes contain many(!) mind expanding exploratory exercises, some of which quite challenging.
Parsing@SLE is a new workshop on parsing programming languages and other software languages. The intended participants are the authors of parser generation tools and parsers for programming languages and other software languages. For the purpose of this workshop "parsing" is a computation that takes a sequence of characters as input and produces a tree or graph shaped model as output. This possibly includes tokenization using regular expressions, deriving trees using context-free grammars, mapping to abstract syntax trees and perhaps even some semantic analysis.
The goal of the workshop is to bring together today's experts in the field of parsing, in order to explore open questions and possibly forge new collaborations. The topics may include algorithms, implementation and generation techniques, syntax and semantics of meta formalisms (BNF), etc. We expect to attract participants that have been or are developing theory, techniques and tools in the broad area of parsing non-natural languages such as programming languages and other software languages (domain specific languages, configuration languages, build languages, data description languages, query languages, etc.)
We solicit short abstracts, asking for positions, demonstrations and early achievements. The submissions will be reviewed on relevance and clarity, and used to plan the mostly interactive sessions of the day.
* workshop website
* Deadline August 15 2013
* Notification September 1 2013
* Submit a position paper