Sergi Valverde and Ricard Solé, "Punctuated equilibrium in the large scale evolution of programming languages", SFI working paper 2014-09-030
Here we study the large scale historical development of programming languages, which have deeply marked social and technological advances in the last half century. We analyse their historical connections using network theory and reconstructed phylogenetic networks. Using both data analysis and network modelling, it is shown that their evolution is highly uneven, marked by innovation events where new languages are created out of improved combinations of different structural components belonging to previous languages. These radiation events occur in a bursty pattern and are tied to novel technological and social niches. The method can be extrapolated to other systems and consistently captures the major classes of languages and the widespread horizontal design exchanges, revealing a punctuated evolutionary path.
The results developed here are perhaps not that surprising to people familiar with the history of programming languages. But it's interesting to see it all formalized and analyzed.
Don Syme receives the Royal Academy of Engineering's Silver Medal for his work on F#. The citation reads:
F# is known for being a clear and more concise language that interoperates well with other systems, and is used in applications as diverse asanalysing the UK energy market to tackling money laundering. It allows programmers to write code with fewer bugs than other languages, so users can get their programme delivered to market both rapidly and accurately. Used by major enterprises in the UK and worldwide, F# is both cross-platform and open source, and includes innovative features such as unit-of-measure inference, asynchronous programming and type providers, which have in turn influenced later editions of C# and other industry languages.
In case this one went under the radar, at POPL'12, Martín Escardó gave a tutorial on seemingly impossible functional programs:
Programming language semantics is typically applied to
prove compiler correctness and allow (manual or automatic) program
verification. Certain kinds of semantics can also be applied to
discover programs that one wouldn't have otherwise thought of. This is
the case, in particular, for semantics that incorporate topological
ingredients (limits, continuity, openness, compactness). For example,
it turns out that some function types (X -> Y) with X infinite (but
compact) do have decidable equality, contradicting perhaps popular
belief, but certainly not (higher-type) computability theory. More
generally, one can often check infinitely many cases in finite time.
I will show you such programs, run them fast in surprising instances,
and introduce the theory behind their derivation and working. In
particular, I will study a single (very high type) program that (i)
optimally plays sequential games of unbounded length, (ii) implements
the Tychonoff Theorem from topology (and builds finite-time search
functions for infinite sets), (iii) realizes the double-negation shift
from proof theory (and allows us to extract programs from classical
proofs that use the axiom of countable choice). There will be several
examples in the languages Haskell and Agda.
A shorter version (coded in Haskell) appears in Andrej Bauer's blog.
In his blog, Bob Harper, in joint effort with Dave MacQueen and Lars Bergstrom, announces the launch of sml-family.org:
The Standard ML Family project provides a home for online versions of various formal definitions of Standard ML, including the "Definition of Standard ML, Revised" (Standard ML 97). The site also supports coordination between different implementations of the Standard ML (SML) programming language by maintaining common resources such as the documentation for the Standard ML Basis Library and standard test suites. The goal is to increase compatibility and resource sharing between Standard ML implementations.
The site includes a history section devoted to the history of ML, and of Standard ML in particular. This section will contain a collection of original source documents relating to the design of the language.
Logical methods in computer science just published Matija Pretnar's latest take on algebraic effects and handlers:
We present a complete polymorphic effect inference algorithm for an ML-style language with handlers of not only exceptions, but of any other algebraic effect such as input & output, mutable references and many others. Our main aim is to offer the programmer a useful insight into the effectful behaviour of programs. Handlers help here by cutting down possible effects and the resulting lengthy output that often plagues precise effect systems. Additionally, we present a set of methods that further simplify the displayed types, some even by deliberately hiding inferred information from the programmer.
Pretnar and Bauer's Eff has made previous appearances here on LtU. Apart from the new fangled polymorphic effect system, this paper also contains an Eff tutorial.
A Next Generation Smart Contract and Decentralized Application Platform, Vitalik Buterin.
When Satoshi Nakamoto first set the Bitcoin blockchain into motion in January 2009, he was simultaneously introducing two radical and untested concepts. The first is the "bitcoin", a decentralized peer-to-peer online currency that maintains a value without any backing, intrinsic value or central issuer. So far, the "bitcoin" as a currency unit has taken up the bulk of the public attention, both in terms of the political aspects of a currency without a central bank and its extreme upward and downward volatility in price. However, there is also another, equally important, part to Satoshi's grand experiment: the concept of a proof of work-based blockchain to allow for public agreement on the order of transactions. Bitcoin as an application can be described as a first-to-file system: if one entity has 50 BTC, and simultaneously sends the same 50 BTC to A and to B, only the transaction that gets confirmed first will process. There is no intrinsic way of determining from two transactions which came earlier, and for decades this stymied the development of decentralized digital currency. Satoshi's blockchain was the first credible decentralized solution. And now, attention is rapidly starting to shift toward this second part of Bitcoin's technology, and how the blockchain concept can be used for more than just money.
Commonly cited applications include using on-blockchain digital assets to represent custom currencies and financial instruments ("colored coins"), the ownership of an underlying physical device ("smart property"), non-fungible assets such as domain names ("Namecoin") as well as more advanced applications such as decentralized exchange, financial derivatives, peer-to-peer gambling and on-blockchain identity and reputation systems. Another important area of inquiry is "smart contracts" - systems which automatically move digital assets according to arbitrary pre-specified rules. For example, one might have a treasury contract of the form "A can withdraw up to X currency units per day, B can withdraw up to Y per day, A and B together can withdraw anything, and A can shut off B's ability to withdraw". The logical extension of this is decentralized autonomous organizations (DAOs) - long-term smart contracts that contain the assets and encode the bylaws of an entire organization. What Ethereum intends to provide is a blockchain with a built-in fully fledged Turing-complete programming language that can be used to create "contracts" that can be used to encode arbitrary state transition functions, allowing users to create any of the systems described above, as well as many others that we have not yet imagined, simply by writing up the logic in a few lines of code.
Includes code samples.
Conor McBride gave an 8-lecture summer course on Dependently typed metaprogramming (in Agda) at the Cambridge University Computer Laboratory:
Dependently typed functional programming languages such as Agda are capable of expressing very precise types for data. When those data themselves encode types, we gain a powerful mechanism for abstracting generic operations over carefully circumscribed universes. This course will begin with a rapid depedently-typed programming primer in Agda, then explore techniques for and consequences of universe constructions. Of central importance are the â€œpattern functorsâ€ which determine the node structure of inductive and coinductive datatypes. We shall consider syntactic presentations of these functors (allowing operations as useful as symbolic differentiation), and relate them to the more uniform abstract notion of â€œcontainerâ€. We shall expose the double-life containers lead as â€œinteraction structuresâ€ describing systems of effects. Later, we step up to functors over universes, acquiring the power of inductive-recursive definitions, and we use that power to build universes of dependent types.
The lecture notes, code, and video captures are available online.
As with his previous course, the notes contain many(!) mind expanding exploratory exercises, some of which quite challenging.
Oleg Kiselyov, Simon Peyton-Jones and Amr Sabry: Simple Generators:
Incremental stream processing, pervasive in practice, makes the best case for lazy evaluation. Lazy evaluation promotes modularity, letting us glue together separately developed stream producers, consumers and transformers. Lazy list processing has become a cardinal feature of Haskell. It also brings the worst in lazy evaluation: its incompatibility with effects and unpredictable and often extraordinary use of memory. Much of the Haskell programming lore are the ways to get around lazy evaluation.
We propose a programming style for incremental stream processing based on typed simple generators. It promotes modularity and decoupling of producers and consumers just like lazy evaluation. Simple generators, however, expose the implicit suspension and resumption inherent in lazy evaluation as computational effects, and hence are robust in the presence of other effects. Simple generators let us accurately reason about memory consumption and latency. The remarkable implementation simplicity and efficiency of simple generators strongly motivates investigating and pushing the limits of their expressiveness.
To substantiate our claims we give a new solution to the notorious pretty-printing problem. Like earlier solutions, it is linear, backtracking-free and with bounded latency. It is also modular, structured as a cascade of separately developed stream transducers, which makes it simpler to write, test and to precisely analyze latency, time and space consumption. It is compatible with effects including IO, letting us read the source document from a file, and format it as we read.
This is fascinating work that shows how to gain the benefits of lazy evaluation - decoupling of producers, transformers, and consumers of data, and producing only as much data as needed - in a strict, effectful setting that works well with resources that need to be disposed of once computation is done, e.g. file handles.
The basic idea is that of Common Lisp signal handling: use a hierarchical, dynamically-scoped chain of handler procedures, which get called - on the stack, without unwinding it - to parameterize code. In this case, the producer code (which e.g. reads a file character by character) is the parameterized code: every time data (a character) is produced, it calls the dynamically innermost handler procedure with the data (it yields the data to the handler). This handler is the data consumer (it could e.g. print the received character to the console). Through dynamic scoping, each handler may also have a super-handler, to which it may yield data. In this way, data flows containing multiple transformers can be composed.
I especially like the OCaml version of the code, which is just a page of code, implementing a dynamically-scoped chain of handlers. After that we can already write map and fold in this framework (fold using a loop and a state cell, notably.) There's more sample code.
This also ties in with mainstream yield.
Tiny Transactions on Computer Science (TinyToCS) is the premier venue for computer science research of 140 characters or less.
This is an interesting idea: CS papers whose body fits in 140 characters - the abstract may be longer, watering the concept down a bit.
A blog post about Call-By-Push-Value by Rob Simmons: What does focusing tell us about language design?
I think that one of the key observations of focusing/CBPV is that programs are dealing with two different things - data and computation - and that we tend to get the most tripped up when we confuse the two.
- Data is classified by data types (a.k.a. positive types). Data is defined by how it is constructed, and the way you use data is by pattern-matching against it.
- Computation is classified by computation types (a.k.a. negative types). Computations are defined their eliminations - that is, by how they respond to signals/messages/pokes/arguments.
There are two things I want to talk about, and they're both recursive types: call-by-push-value has positive recursive types (which have the feel of inductive types and/or algebras and/or what we're used to as datatypes in functional languages) and negative recursive types (which have the feel of recursive, lazy records and/or "codata" whatever that is and/or coalgebras and/or what William Cook calls objects). Both positive and negative recursive types are treated by Paul Blain Levy in his thesis (section 5.3.2) and in the Call-By-Push Value book (section 4.2.2).
In particular, I want to claim that Call-By-Push-Value and focusing suggest two fundamental features that should be, and generally aren't (at least simultaneously) in modern programming languages:
- Support for structured data with rich case analysis facilities (up to and beyond what are called views)
- Support for recursive records and negative recursive types.
Previously on Rob's blog: Embracing and extending the Levy language; on LtU: Call by push-value, Levy: a Toy Call-by-Push-Value Language.
And let me also repeat CBPV's slogan, which is one of the finest in PL advocacy: Once the fine structure has been exposed, why ignore it?