LtU Forum

BCS FACS - Annual Peter Landin Semantics Seminar: Compiling Without Continuations, Prof Simon Peyton Jones, 12th Dec, 6pm, London

BCS FACS - Annual Peter Landin Semantics Seminar: Compiling Without Continuations

Date/Time: Tuesday 12 December 2017, 6.00pm - 9.00pm

Venue: BCS, 1st Floor, The Davidson Building, 5 Southampton Street, London, WC2E 7HA

Speaker: Professor Simon Peyton Jones, FRS (Microsoft Research)

Cost: Free

Booking: https://events.bcs.org/book/2701/

Synopsis:

Peter Landin (1930 - 2009) was a pioneer whose ideas underpin modern computing. In the 1950s and 1960s, Landin showed that programs could be defined in terms of mathematical functions, translated into functional expressions in the lambda calculus, and their meaning calculated with an abstract mathematical machine. Compiler writers and designers of modern-day programming languages alike owe much to Landin's pioneering work.

Each year, a leading figure in computer science will pay tribute to Landin's contribution to computing through a public seminar. This year's seminar is entitled “Compiling Without Continuations” and will be given by Professor Simon Peyton Jones, FRS (Microsoft Research).

Programme

5.15pm Coffee

6.00pm Welcome & Introduction

6.05pm Peter Landin Semantics Seminar

Compiling Without Continuations

Professor Simon Peyton Jones, FRS (Microsoft Research)

7.20pm Drinks Reception

Seminar details

GHC compiles Haskell via Core, a tiny intermediate language based closely on the lambda calculus. Almost all GHC’s optimisations happen in Core, but until recently there was an important kind of optimisation that Core really did not handle well. In this talk I’ll show you what the problem was, and how Core’s new “join points” solve it simply and beautifully, by effectively allowing Core to express control flow as well as data flow; there are strong links to so-called “continuation passing style” (CPS) here.

Understanding join points can help you as a programmer too, because you can write code confident that it will optimise well. I’ll show you a rather compelling example of this: “skip-less streams” now fuse well, for the first time, which allows us to drop the previous (ingenious but awkward) workarounds.
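For readers who have not met join points: very roughly, they arise from programs like the following hand-written example (not taken from the talk), where a local function is only ever tail-called from the branches of a case. GHC can then compile j as a labelled block that the branches jump to, rather than allocating a closure for it:

-- Hand-written illustration, not from the talk: j is only ever tail-called,
-- so Core can turn it into a join point, i.e. a jump target rather than a
-- heap-allocated function closure.
classify :: Either Int Int -> Int
classify x =
  let j y = y + 1
  in case x of
       Left  a -> j a
       Right b -> j (b * 2)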

SK in Prolog

A thought experiment I am too lazy to do, so I'll ask you folks.

Define SK, define a reduction relation, ask whether two terms are equal/reduce to similar terms.

Can you do this in Prolog? (Asked out of interest in current unification-based languages like miniKanren.)
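For concreteness, here is the setup being asked about, written as a small functional program rather than as Prolog clauses (a minimal Haskell sketch; the open question in the post is whether the same reduction relation can be stated directly as Horn clauses and queried relationally in Prolog or miniKanren):

-- SK terms and one-step reduction. Normalisation may diverge,
-- since SK reduction need not terminate.
data Term = S | K | App Term Term deriving (Eq, Show)

step :: Term -> Maybe Term
step (App (App K x) _)         = Just x                          -- K x y    ~> x
step (App (App (App S f) g) x) = Just (App (App f x) (App g x))  -- S f g x  ~> f x (g x)
step (App f x)                 = case step f of
                                   Just f' -> Just (App f' x)
                                   Nothing -> App f <$> step x
step _                         = Nothing

normalize :: Term -> Term
normalize t = maybe t normalize (step t)

-- Do two terms reduce to the same normal form? (Only semi-decidable in general.)
equivalent :: Term -> Term -> Bool
equivalent a b = normalize a == normalize b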

Advancement in TDFA and POSIX submatch extraction

It came up in an old LtU thread about regular expressions. There was an argument about whether the tagged finite automata invented by Ville Laurikari can support POSIX disambiguation semantics. It turns out they can: it is possible to construct efficient Laurikari TDFA with POSIX semantics, as well as with leftmost-greedy semantics (details in this paper).

Back in 2007 Chris Kuklewicz suggested an algorithm which is implemented in his Regex-TDFA Haskell library. He also wrote an informal description of his algorithm, but never fully formalized it. Some ten years later I stumbled upon this thread when I was trying to implement fast submatch extraction in the open-source lexer generator re2c. I revised both the original algorithm by Laurikari and the modification by Kuklewicz, and found a number of improvements and bugs in Regex-TDFA.

Language features for tracing JIT?

Are there any special programming language features ("superpowers") that are made possible by specifically exploiting a tracing JIT compiler?

It seems to me like tracing JITs are often developed for existing languages (JavaScript, Python, Lua, etc) and tend to focus on efficiently compiling the existing code bases that are written with their traditional programming idioms. I am interested in the opposite approach: what programming idioms and language extensions can help programmers to make the most of their tracing JIT compiler?

I can offer one candidate. Tracing JIT compilers can choose to specialize machine code based on the exact values of parameters that are only available at runtime, and the language can facilitate "hinting" whether a certain variable should be specialized on its value (e.g. generate machine code for the specific case that i = 42) or on its type (e.g. generate machine code for the general case that i is an integer). Often a certain operation can be compiled much more efficiently when a parameter is constant, for example the size argument to a memcpy or memcmp, and so a potential "superpower" is to apply these optimizations automatically without having to statically generate code that anticipates each possible value.
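To make the shape of such a hint concrete, here is a purely hypothetical sketch in Haskell syntax. None of these names exist in any real JIT; the stubs are identity functions so the snippet compiles, and the point is only the source-level interface a language could offer:

-- Hypothetical hints; a tracing JIT would recognise them as annotations.
-- Here they are just identities so the example runs as ordinary code.
specializeOnValue :: a -> a
specializeOnValue = id     -- trace and compile for this exact value

specializeOnType :: a -> a
specializeOnType = id      -- trace and compile for this type only

-- A copy whose length is almost always the same at a given call site:
-- with the value hint, the JIT could emit a fixed-size, unrolled copy.
copyN :: Int -> [a] -> [a]
copyN n = take (specializeOnValue n)

main :: IO ()
main = print (length (copyN 4096 (repeat ())))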

Is that a legitimate superpower? What are some others? What tricks can tracing JIT users exploit to leave C/C++ hackers eating their dust? :-)

Something I forgot

I think it was discussed here before but it came up in a conversation I had, and I forgot whether I ever got an answer to the following question:

Why doesn't Haskell need a value restriction à la ML to deal with unsound interactions between mutable state and polymorphism?

Did no-one ever check this or does it have to do with invariance? I forgot.
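For reference, this is the kind of program the ML value restriction exists to rule out, and what happens to its direct Haskell rendering (a quick sketch, not a full answer to the question):

import Data.IORef

-- In ML, `let r = ref [] in ...` would, without the value restriction, give r
-- the generalised type 'a list ref and allow it to be written at one element
-- type and read at another.  The Haskell rendering never gets that far:
-- newIORef is an IO action, the variable it binds is lambda-bound (via >>=)
-- rather than let-bound, and lambda-bound variables are not generalised, so
-- r receives one monomorphic type chosen by its uses.
example :: IO ()
example = do
  r <- newIORef []               -- r :: IORef [a] for a single a
  writeIORef r [1 :: Int]        -- ...which this line fixes to Int
  -- writeIORef r "boom"         -- would now be rejected: [Char] /= [Int]
  readIORef r >>= print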

New simple proof system

I took basic logic and made it as familiar and easy to learn as possible. Proofs look just like proofs in elementary school algebra--a series of steps where each step has a formula on the left and a justification on the right. Formulas are manipulated using rewrite rules (commutativity, etc.) rather than the typical deduction rules. This isn't any sort of limitation. It can implement both intuitionistic and classical logic using only rewrite rules (currently it uses intuitionistic).
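By way of illustration only (a generic elementary-algebra example showing the format, not one of the site's actual problems), such a derivation might read:

(x + y) · z          start
z · (x + y)          commutativity of ·
z · x + z · y        distributivity of · over +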

I wrote up the result as a sort of game. The first real proof is problem 8, which is just a few steps. Some of the later problems are ferociously difficult (but possible)--no one I've shown it to has gotten close to solving them, so if anyone manages to solve problem 25, 35, 36, or 37, please let me know.

https://sftm.schlussweisen.com

It's completely free.

NOOL 2017

Co-located with SPLASH 2017, there will be another NOOL workshop - see here and here for previous editions.

Please consider submitting an abstract! It's a fun, friendly event and this year it's followed by the LIVE workshop.

Type system based on epistemic modal logic?

This is something I’ve been mulling over lately, and I’m curious to hear opinions on it and whether there’s any existing research (I couldn’t find any).

Since type systems based on linear logic are receiving more attention lately in Rust and the forthcoming linear types extension for Haskell, I was looking around at different logics to see if they would make a useful basis for type system features in a programming language. I was aware of modal logics, but hadn’t really looked into them since I read From Löb’s Theorem to Spreadsheets a few years ago.

I came across epistemic modal logic, a logic for reasoning about knowledge. That struck me as potentially very useful for distributed data stores, in which different agents (servers) know different things (have different data), and also know things about the things that they and other agents know.

For instance, if a client asks a server for some data, the server may be aware that it doesn’t have the data in question, by the “negative introspection axiom” ¬Kᵢφ ⇒ Kᵢ¬Kᵢφ: if an agent i does not know some fact φ (¬Kᵢφ), then that agent knows that it does not know it (Kᵢ¬Kᵢφ). However, it may know that some other agent does know the fact, and can go and fetch it accordingly, or arrange for the other agent to send it directly to the client.

This could incorporate other modalities such as possibility (◊), necessity (□), “everyone knows” (E), “common knowledge” (C), “distributed knowledge” (D). Just brainstorming, I feel like a type system based on these ideas might be able to enforce things like:

  • You can only make a synchronous request for some data from a server if that server necessarily has the data and can therefore respond immediately. Pseudocode:

    syncRequest (fact : FactID, server : ServerID) : Fact
      requires necessarily (server knows fact)
  • After you send some data to a server, you know that it possibly knows the data and that you can ask for it asynchronously.

    send (fact : FactID, server : ServerID) : ()
      ensures possibly (server knows fact)
  • Any part of a message (metadata or data) can be lost, and the server can recover or negotiate repair to achieve eventual consistency. (I’m envisioning a sort of Erlang-style “fire and forget” message passing system, for example over UDP.)

  • Constraints on server & client knowledge to express tradeoffs between consistency and availability.

I’m just riffing informally here. Let me know your thoughts, or if you can point me to any resources. :)
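To make one of these bullets slightly more concrete, here is a very rough evidence-passing sketch in plain Haskell. Every name below is invented for illustration and this is not an existing library; it only shows how a "requires: server knows fact" precondition could be approximated with an abstract proof token, not how possibility, common or distributed knowledge would be modelled:

-- Hypothetical sketch; not an existing library.
newtype ServerId s = ServerId String
newtype FactId f   = FactId String
newtype Fact       = Fact String

-- Proof token: "server s (at least possibly) knows fact f".  If its
-- constructor is not exported, the only way to obtain one is from send,
-- so syncRequest can demand it as a precondition.
newtype Knows s f = Knows ()

-- ensures: afterwards, s possibly knows f
send :: FactId f -> ServerId s -> IO (Knows s f)
send _ _ = pure (Knows ())                         -- stub standing in for a real RPC

-- requires: evidence that s knows f, so a synchronous request is safe
syncRequest :: Knows s f -> FactId f -> ServerId s -> IO Fact
syncRequest _ (FactId name) _ = pure (Fact name)   -- stub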

The question of the possibility of a simple formal foundation to the natural languages.

I think that it can safely be said that the natural languages can transport any formal structure; that we can communicate any mathematical structure using the natural language. That's the essence of metamathematics.

But then we are led to believe that the natural language has no proper formal structure. It is informal, in the sense that it is not possible to feed Don Quijote to an algorithm that will be able to pinpoint any formal inconsistency implicit in the text (without resorting to any other text or knowledge). An example imaginary inconsistency would be if at one point Cervantes says that Don Quijote always likes to love Dulcinea, and at another he says that he does not like to love her. These are the kind of inconsistencies that would destroy any formal structure if inadvertently injected in transit.

These two previous paragraphs seem a bit paradoxical to me, since, if there is no procedure to decide whether a natural text is inconsistent (or if there are only heuristic, fallible procedures), metamathematics should be impossible, and mathematics could not have been born.

If we take a formal structure expressed in some mathematical formalism, and express it with the natural language, metamathematically, we need to be sure that there are no inconsistencies in either case. In the former case we can check, and there are algorithms that can check. In the latter case we can check, but can no algorithm check in general?

I know that there is a history to this, from Llull and Leibniz, reaching a summit with Frege, to be toppled by Russell, and followed by the neopositivists etc., and later experiments like the semantic web.

So my question here is: Is there some kind of proof or argument showing that the natural languages cannot be provided with a simple mathematical foundation? Some recognizable fundamental property of the natural language that is inherently inconsistent? (Are both questions the same?)

What I mean by a "simple" mathematical foundation is that it must consist of a core formal theory (that can then be supplemented with a number of ad-hoc rules that provide for shortcuts and phrases in the natural languages). A priori, it should be totally devoid of semantics. Then we should be able to interpret extensions built on that foundation in natural language texts, taken as abstract structures of words; and then, by proxy, we might provide those extensions with the semantics of the natural language texts in which they are interpreted. (Perhaps we need some a priori semantics, for things like time [edited to add: or "I"? That would seem to lead us towards Gödel's pit, damn. However, I think that we can settle in principle for a descriptive language without person], that are deeply involved in the basic structure of the natural language?)

So, simple in the sense that Frege's proposal was simple, or that the semantic web is simple (my previous paragraph is meant to be interpreted in their intended use), but a huge neural network with crazy amounts of delicately balanced branches trained on all books ever published is not simple. Notwithstanding the problem that with the neural network solution we are dealing with an ungodly mixture of syntax and semantics.

I really don't know whether I am using the right terminology to phrase my question - or even whether there exists a terminological framework where it can be meaningful and exact. So apologies if I have taken a couple of poetic licences in trying to lay it out.

Question: do you have to climb the tower of interpreters?

You know what is interesting about meta-circular interpreters? The infinite recursion of eval-apply is always broken in the end by the application of some primitives. So in the end, no matter what your computational model is, you are still executing a stream of primitives in your "ground" language. And, usually, "lambda" is all you'll ever need as a primitive. But looking at so many toy interpreters you'd see that "lambda" is not quite so "primitive" and requires a lot of machinery to work.
So instead of providing some means of abstraction directly as a ground mechanism to build everything else around, I wanted to do a simple thought exercise and go from the other direction: if we have a fixed set of primitives, without any means of abstraction, what are the ways to compose and combine them?

Imagine we have a really old CPU without support for a call instruction, but with a conditional jump, ALU instructions, and loads/stores. It's a simple exercise to define what a call instruction would look like (in pseudo-assembler):

define CALL(target_addr):
    load  stack_pointer, tmp     // tmp := current stack pointer
    store tmp, the-address-of-the-next-instruction-after-call  // push the return address (don't care for now how we obtain it)
    dec   tmp                    // grow the stack downwards
    store stack_pointer, tmp     // write the updated stack pointer back
    // frame-pointer manipulation skipped
    branch target_addr           // jump to the callee

So, in general, our CPU IS capable of performing the steps needed to create an abstraction mechanism, but there is no way for the CPU itself to "plug it in" unless our preprocessor rewrites all "CALL" tokens into these primitives. Not good.
The only other way I could think of is to interpret another language which has CALL as a primitive, i.e. the CPU itself becomes such a "preprocessor". But does that mean that the only way to escape the deficiencies of your current computational model is to climb the tower of interpreters? (That is, interpret another language that has that capability. My favorite examples: a Scheme interpreter with first-class macros in Scheme by Matt Might, a Kernel interpreter in Scheme by J. Shutt, 3-LISP by Brian Smith.)

Are there any other ways, besides interpreting another language? Maybe there is a way to build a reflective virtual machine that can modify and extend itself? I haven't seen one yet.

I've recently stumbled upon Ian Piumarta's "Open, extensible composition models" (VPRI Technical Report TR-2011-002), which describes a meta-circular evaluator where, instead of the usual big cond-statement in eval, you select your evaluator based on the type of expression you are evaluating (same for applicators), and this mapping is available to the programmer. I found this a very interesting read; maybe someone could comment on this paper, as I've surely missed a lot.
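As a rough illustration of that dispatch idea (a paraphrase in Haskell, not Piumarta's actual design): eval looks its handler up in an ordinary mutable table keyed by the expression's tag, and because the table is a first-class value, the running program can install handlers for new expression forms:

import Data.IORef
import qualified Data.Map as Map

data Expr = Num Int | Add Expr Expr | Tagged String [Expr]

-- The table maps an expression tag to its evaluator; the newtype breaks the
-- type-level recursion between the two definitions.
newtype Evaluator = Evaluator (EvalTable -> Expr -> IO Int)
type EvalTable = IORef (Map.Map String Evaluator)

tagOf :: Expr -> String
tagOf (Num _)      = "num"
tagOf (Add _ _)    = "add"
tagOf (Tagged t _) = t

-- Instead of one closed case-expression, eval dispatches through the table.
eval :: EvalTable -> Expr -> IO Int
eval table e = do
  handlers <- readIORef table
  case Map.lookup (tagOf e) handlers of
    Just (Evaluator h) -> h table e
    Nothing            -> error ("no evaluator for " ++ tagOf e)

-- Each handler only ever sees expressions carrying its own tag.
defaultTable :: IO EvalTable
defaultTable = newIORef (Map.fromList
  [ ("num", Evaluator (\_ (Num n)   -> pure n))
  , ("add", Evaluator (\t (Add a b) -> (+) <$> eval t a <*> eval t b)) ])

-- The running program can install a new expression form at run time, e.g.:
--   modifyIORef table (Map.insert "double"
--     (Evaluator (\t (Tagged _ [e]) -> (2 *) <$> eval t e)))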

I have surely asked a lot of questions in a very unstructured form, but if anyone could give some useful pointers (something to read on the topic, perhaps), it would be greatly appreciated.
