Coroutines with async and await syntax (Python 3.5)

With Python 3.5 released, the thing that drew my attention is the support for asynchronous programming through PEP 0492 -- Coroutines with async and await syntax, which added awaitable objects, coroutine functions, asynchronous iteration, and asynchronous context managers. I found the mailing list discussions (look both above and below) particularly helpful in understanding what exactly is going on.

Ancient use of generators

Guido van Rossum reminisces a bit about early discussions of generators in the Python community (read the other messages in the thread as well). I think we talked about the articles he mentions way back when. Earlier still, and beyond the discussion by Guido here, was Icon, a clever little language that I have a soft spot for. i don't think we ever fully assessed its influence on Python and other languages.

Portable Efficient Assembly Code-generation in High-level Python

PeachPy is a Python framework for writing high-performance assembly kernels.

PeachPy aims to simplify writing optimized assembly kernels while preserving all optimization opportunities of traditional assembly.

You can use the same code to generate assembly for Windows, Unix, and Golang assembly. The library handles the various ABIs automatically. I haven't seen this cool project before.

Among the cool features is the ability to invoke the generated assembly as regular Python functions. Nice.

Freer Monads, More Extensible Effects

Freer Monads, More Extensible Effects, by Oleg Kiselyov and Hiromi Ishii:

We present a rational reconstruction of extensible effects, the recently proposed alternative to monad transformers, as the confluence of efforts to make effectful computations compose. Free monads and then extensible effects emerge from the straightforward term representation of an effectful computation, as more and more boilerplate is abstracted away. The generalization process further leads to freer monads, constructed without the Functor constraint.

The continuation exposed in freer monads can then be represented as an efficient type-aligned data structure. The end result is the algorithmically efficient extensible effects library, which is not only more comprehensible but also faster than earlier implementations. As an illustration of the new library, we show three surprisingly simple applications: non-determinism with committed choice (LogicT), catching IO exceptions in the presence of other effects, and the semi-automatic management of file handles and other resources through monadic regions.

We extensively use and promote the new sort of ‘laziness’, which underlies the left Kan extension: instead of performing an operation, keep its operands and pretend it is done.

This looks very promising, and includes some benchmarks comparing the heavily optimized and special-cased monad transformers against this new formulation of extensible effects using Freer monads.

See also the reddit discussion.

Haskell for Mac

Available here with hackernews and reddit discussions ongoing.

Even though I'm not a big fan of Haskell, I'm pretty excited about this. It represents a trend where PL is finally taking holistic programmer experiences seriously, and a move toward interactivity in program development that takes advantage of our (a) rich type systems, and (b) increasing budget of computer cycles. Even that they are trying to sell this is good: if people can get used to paying for tooling, that will encourage even more tooling via a healthy market feedback loop. The only drawback is the MAS sandbox, app stores need to learn how to accept developer tools without crippling them.

Reagents: Expressing and Composing Fine-grained Concurrency

Reagents: Expressing and Composing Fine-grained Concurrency, by Aaron Turon:

Efficient communication and synchronization is crucial for finegrained parallelism. Libraries providing such features, while indispensable, are difficult to write, and often cannot be tailored or composed to meet the needs of specific users. We introduce reagents, a set of combinators for concisely expressing concurrency algorithms. Reagents scale as well as their hand-coded counterparts, while providing the composability existing libraries lack.

This is a pretty neat approach to writing concurrent code, which lies somewhere between manually implementing low-level concurrent algorithms and STM. Concurrent algorithms are expressed and composed semi-naively, and Reagents automates the retries for you in case of thread interference (for transient failure of CAS updates), or they block waiting for input from another thread (in case of permanent failure where no input is available).

The core seems to be k-CAS with synchronous communication between threads to coordinate reactions on shared state. The properties seem rather nice, as Aaron describes:

When used in isolation, reagents are guaranteed to perform only the CASes that the hand-written algorithm would, so they introduce no overhead on shared-memory operations; by recoding an algorithm use reagents, you lose nothing. Yet unlike hand-written algorithms, reagents can be composed using choice, tailored with new blocking behavior, or combined into larger atomic blocks.

The benchmarks in section 6 look promising. This appears to be work towards Aaron's thesis which provides many more details.

OcaPic: Programming PIC microcontrollers in OCaml

Most embedded systems development is done in C. It's rare to see a functional programming language target any kind of microcontroller, let alone an 8-bit microcontroller with only a few kB of RAM. But the team behind the OcaPic project has somehow managed to get OCaml running on a PIC18 microcontroller. To do so, they created an efficient OCaml virtual machine in PIC assembler (~4kB of program memory), and utilized some clever techniques to postprocess the compiled bytecode to reduce heap usage, eliminate unused closures, reduce indirections, and compress the bytecode representation. Even if you're not interested in embedded systems, you may find some interesting ideas there for reducing overheads or dealing with constrained resource budgets.

Eric Lippert's Sharp Regrets

In an article for InformIT, Eric Lippert runs down his "bottom 10" C# language design decisions:

When I was on the C# design team, several times a year we would have "meet the team" events at conferences, where we would take questions from C# enthusiasts. Probably the most common question we consistently got was "Are there any language design decisions that you now regret?" and my answer is "Good heavens, yes!"

This article presents my "bottom 10" list of features in C# that I wish had been designed differently, with the lessons we can learn about language design from each decision.

The "lessons learned in retrospect" for each one are nicely done.

STABILIZER : Statistically Sound Performance Evaluation

My colleague Mike Rainey described this paper as one of the nicest he's read in a while.

STABILIZER : Statistically Sound Performance Evaluation
Charlie Curtsinger, Emery D. Berger
2013

Researchers and software developers require effective performance evaluation. Researchers must evaluate optimizations or measure overhead. Software developers use automatic performance regression tests to discover when changes improve or degrade performance. The standard methodology is to compare execution times before and after applying changes.

Unfortunately, modern architectural features make this approach unsound. Statistically sound evaluation requires multiple samples to test whether one can or cannot (with high confidence) reject the null hypothesis that results are the same before and after. However, caches and branch predictors make performance dependent on machine-specific parameters and the exact layout of code, stack frames, and heap objects. A single binary constitutes just one sample from the space of program layouts, regardless of the number of runs. Since compiler optimizations and code changes also alter layout, it is currently impossible to distinguish the impact of an optimization from that of its layout effects.

This paper presents STABILIZER, a system that enables the use of the powerful statistical techniques required for sound performance evaluation on modern architectures. STABILIZER forces executions to sample the space of memory configurations by repeatedly re-randomizing layouts of code, stack, and heap objects at runtime. STABILIZER thus makes it possible to control for layout effects. Re-randomization also ensures that layout effects follow a Gaussian distribution, enabling the use of statistical tests like ANOVA. We demonstrate STABILIZER's efficiency (< 7% median overhead) and its effectiveness by evaluating the impact of LLVM’s optimizations on the SPEC CPU2006 benchmark suite. We find that, while -O2 has a significant impact relative to -O1, the performance impact of -O3 over -O2 optimizations is indistinguishable from random noise.

One take-away of the paper is the following technique for validation: they verify, empirically, that their randomization technique results in a gaussian distribution of execution time. This does not guarantee that they found all the source of measurement noise, but it guarantees that the source of noise they handled are properly randomized, and that their effect can be reasoned about rigorously using the usual tools of statisticians. Having a gaussian distribution gives you much more than just "hey, taking the average over these runs makes you resilient to {weird hardward effect blah}", it lets you compute p-values and in general use statistics.

State of the Haskell ecosystem - August 2015

Interesting survey.

Based on a brief look I am not sure I agree with all the conclusions/rankings. But most seem to make sense and the Notable Libraries and examples in each category are helpful.