LtU Forum

Are Monads a Waste of Time?

I was thinking about this yesterday. If the point of functional code is that it is easier to reason about, should we like monads? Suppose we have an interpreter, and that we construct a sequence of commands in the free monad. Given a program like this:

f = do
   s <- newRef 
   writeRef s 23
   writeRef s 42
   readRef s

Now, I have deliberately not specified the implementation of the interpreter: it could be pure, using state threading; it could be impure, using IORef; or it could be any other valid implementation in Haskell, provided we don't allow "unsafe" code.
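
To make this concrete, here is a minimal sketch of one possible pure implementation (my own encoding, not from the post; Prog, Store and run are assumed names, and this operational-style deep embedding merely stands in for the free monad). The "mutable" store is threaded explicitly through the interpreter:

{-# LANGUAGE GADTs #-}

import qualified Data.Map as M

-- References are just indices into an explicitly threaded Map.
newtype Ref = Ref Int

-- The post's command vocabulary, as a deep embedding.
data Prog a where
  Return   :: a -> Prog a
  Bind     :: Prog b -> (b -> Prog a) -> Prog a
  NewRef   :: Prog Ref
  WriteRef :: Ref -> Int -> Prog ()
  ReadRef  :: Ref -> Prog Int

instance Functor Prog where
  fmap g m = Bind m (Return . g)

instance Applicative Prog where
  pure      = Return
  mf <*> mx = Bind mf (\g -> fmap g mx)

instance Monad Prog where
  (>>=) = Bind

newRef :: Prog Ref
newRef = NewRef

writeRef :: Ref -> Int -> Prog ()
writeRef = WriteRef

readRef :: Ref -> Prog Int
readRef = ReadRef

-- A pure interpreter: the store is a fresh-name counter plus a map
-- from references to values, threaded through every step.
type Store = (Int, M.Map Int Int)

run :: Prog a -> Store -> (a, Store)
run (Return x)           st     = (x, st)
run (Bind m k)           st     = let (x, st') = run m st in run (k x) st'
run NewRef               (n, h) = (Ref n, (n + 1, h))
run (WriteRef (Ref r) v) (n, h) = ((), (n, M.insert r v h))
run (ReadRef (Ref r))    (n, h) = (h M.! r, (n, h))

Running the post's f as run f (0, M.empty) returns 42 without any real mutation; the do-notation nevertheless reads imperatively, which is exactly the point at issue.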

The question is: just looking at the above code, it looks like "s" is mutated. Yes, the implementation of the monad could be pure and use state threading, but how does that help us reason about the imperative program above? How is it any better than the following JavaScript program:

function f() {
   var s
   s = 23
   s = 42
   return s
}

Ignoring syntax differences (because we can trivially translate from one syntax to the other), surely any argument about implementation in the free monad with an interpreter that we can make about the first example, we can also make about this code? Surely this means the second, imperative program is no easier or harder to reason about than the first?
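
For what it's worth, the standard desugaring of the do-block makes the absence of real mutation explicit, yet does not obviously make the program any easier to reason about:

f = newRef >>= \s -> writeRef s 23 >> writeRef s 42 >> readRef s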

Hence my deliberately controversial title: why write an imperative program in a monad? Either we should actually write code in a visibly referentially transparent way, like this:

f = let s1 = 23 in let s2 = 42 in s2

Or we may as well just use an impure language to start with. (I am deliberately ignoring the fact that parts of a program could be pure in Haskell).

The second part of this question is: if making a program look as if it has state is as bad, from an analysis point of view, as actually having state, how could we prevent this in the type system? What loophole allows "s" to behave as if it had state, and how can it be closed?

Type systems for acyclic terms

I've been looking for type systems which guarantee that terms are acyclic, in a context with mutability, of course. Any references? Thoughts?

Mark – A simple and unified notation for both object and markup data

Glad to announce the public beta release of Mark, a simple and unified notation for both object and markup data, at https://github.com/henry-luo/mark.

The notation is a superset of what can be represented by JSON, HTML and XML, but it overcomes many limitations of these popular data formats while still having a very clean syntax and a simple data model.

  • It has a clean syntax with a fully-typed data model (like JSON or even better)
  • It is generic and extensible (like XML or even better)
  • It has built-in mixed-content support (like HTML5 or even better)
  • It supports higher-order composition (like S-expressions or even better)

Your feedback is welcome!

Defunctionalization+Refunctionalization+Expression Problem

An interesting paper which suggests that you can solve the expression problem by defunctionalizing, converting data to codata, and refunctionalizing (and vice versa). It also suggests that the function type is a special case of codata with a single destructor.
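
As a toy illustration of the terminology (my own sketch, not the paper's): defunctionalization turns a function type into a first-order data type with one constructor per lambda in the program, plus a single apply destructor; refunctionalization goes the other way. Viewed as codata, a function is determined entirely by how it responds to its one destructor, application.

data Fun = Add Int   -- stands for \x -> x + n
         | Mul Int   -- stands for \x -> x * n

-- The single destructor: how each first-order code reacts to application.
apply :: Fun -> Int -> Int
apply (Add n) x = x + n
apply (Mul n) x = x * n

-- Refunctionalization recovers the function type from apply.
refunctionalize :: Fun -> (Int -> Int)
refunctionalize = apply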

An impure solution to the problem of matching fans

Some time ago, I found an impure solution to the problem of matching fans in Lamping's abstract algorithm. It is described in [1] and implemented in [2], the essential part of source code being available in [3].

My understanding is that the algorithm effectively eliminates the need for bookkeeping nodes (the so-called "oracle") for optimal reduction of arbitrary untyped λ-terms. Although I have no formal proof of its correctness yet, the amount of testing [4, 5] that has already been done leaves little room for counterexamples.

The questions remaining open are how to (dis)prove correctness of the algorithm, and how to simplify and improve it. Any help would be highly appreciated.

[1] https://arxiv.org/abs/1710.07516
[2] https://codedot.github.io/lambda/
[3] https://github.com/codedot/lambda/blob/master/encoding/abstract/template.txt
[4] https://gist.github.com/codedot/721469173df8dd197ba5bddbe022c487
[5] https://www.npmjs.com/package/@alexo/lambda (the "Benchmarks" section)

V-Parser

V-Parser is a novel chart parsing algorithm I've been developing recently. The first open-source implementation is on GitHub, and you can test it in an online grammar development environment. This is the pseudocode:
01  DECLARE chart: [][], text: STRING;
02 
03  FUNCTION Parse (grammar, input)
04      text ← input;
05      chart.CLEAR ();
06      MergeItem (0, [grammar.TOP_RULE], 0, null);
07      FOR each new column in chart
08          FOR each new item in column
09              FOR each alternation of item.Sequence[item.Index]
10                  MergeItem (column.Index, alternation.sequence, 0, item);
11  
12      RETURN chart;
13  
14  PROCEDURE MergeItem (offset, sequence, index, parent)
15      item ← chart[offset].FIND (sequence, index);
16      IF item is not found THEN
17          item ← {Sequence: sequence, Index: index, Inheritable: [], Inheritors: [], BringOver: []};
18          chart[offset].ADD (item);
19  
20      inheritors ← [item] UNION item.Inheritors;
21      IF item.Index + 1 == item.Sequence.LENGTH THEN
22          inheritable ← IF parent is null THEN [] ELSE [parent] UNION parent.Inheritable;
23      ELSE
24          inheritable ← [item];
25          IF parent is not null THEN item.BringOver.ADD_IF_NOT_EXIST (parent);
26  
27      FOR each x in inheritable
28          FOR each y in inheritors
29              x.Inheritors.ADD (y);
30              IF (x.Sequence, x.Index) not in y.Inheritable THEN
31                  y.Inheritable.ADD (x);
32                  IF x.Index + 1 < x.Sequence.LENGTH AND y is a terminal that succeeded in text at offset THEN
33                      FOR each z in x.BringOver
34                          MergeItem (offset + y.LENGTH, x.Sequence, x.Index + 1, z);

For the sake of simplicity, we present the algorithm as operating on a classical context-free grammar (CFG), where each rule is represented in the following form:

X -> A B C ...

where X is a rule name and A B C ... is a sequence of rule names: A, B, C, and so on. A, B, or C may also be terminal constants. Alternations are denoted by repeating the same rule name on the left side over multiple rule definitions in a grammar, as in the toy example below.
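
For instance, in the following toy grammar (my example, not from the original post), the rule sum has two alternations:

sum -> sum + product
sum -> product
product -> 0
product -> 1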

V-Parser is a chart parser that groups parsing items into columns corresponding to offsets from the beginning of the input string. Columns are processed incrementally, never looking back into previous columns of the chart. V-Parser stores its items in the chart as pairs of a sequence and an index into that sequence, so it is always possible to know the ahead element of the current item (we simply increment the index property). The main function, Parse, serves as a loop over columns, items and their alternations, repeatedly calling the MergeItem procedure to populate the chart onwards.

The MergeItem procedure creates a new item in the current column (determined by offset) only if the item doesn't already exist. The properties Inheritable and Inheritors hold pointers to parents and to items that inherit those pointers, respectively. Parents in the Inheritable property are accumulated over children, meaning that each child holds pointers to all of its direct and indirect parents.

Lines 20-25 make sure that Inheritable is properly set up in the case of pointing to a non-last index of the symbol sequence. The BringOver property is used to remember parent ahead symbols, and comes into play when we reach the last symbol of a sequence during parsing.

Lines 27-34 loop over each inheritable and further reach out to each inheritor. If the inheritor is a successfully parsed terminal, and the inheritable is an item with a non-last index, the algorithm populates the ahead item in the corresponding chart column. The whole loop basically makes sure that newly realized ahead symbols are properly distributed over the chart, at positions determined by the relevant previously parsed terminals, including the current one.

The algorithm stops when it runs out of new items in further columns.

Building a compact parse forest (CPF) with V-Parser is trivial, because we just have to assign parsed extents to terminals. Reporting errors should also be relatively easy, by analysing the last column in the chart.

Are "jets" a good idea?

I've noticed the beginning of a trend in hobbyist/amateur programming language theory. There have been a number of recent programming languages with extremely minimalistic semantics, to the point of being anemic. These languages typically define function types, product types, the unit type, perhaps natural numbers, and not much else. They are typically capable of universal computation, but executing programs in them directly would be comically slow, so what they do is recognize special strings of code and replace them with implementations written in a faster meta-language. The pairing of a recognized string of code with its efficient replacement is known as a "jet", or an "accelerator".
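
To make the mechanism concrete, here is a toy sketch in Haskell (entirely my own; it is not Nock, Simplicity or Awelon): an interpreter for a minimal term language that first checks each term against a table of recognized shapes and, on a hit, runs native metalanguage code instead of reducing step by step.

-- Object language: Peano naturals with a slow, by-the-rules addition.
data Term = Zero | Succ Term | Add Term Term

-- Reference semantics: the ground truth a jet must agree with.
evalSlow :: Term -> Integer
evalSlow Zero      = 0
evalSlow (Succ t)  = 1 + evalSlow t
evalSlow (Add a b) = evalSlow a + evalSlow b

-- Jetted evaluator: recognized shapes dispatch to fast native code;
-- everything else falls through to ordinary interpretation.
evalJetted :: Term -> Integer
evalJetted t =
  case lookupJet t of
    Just v  -> v
    Nothing -> case t of
      Zero    -> 0
      Succ u  -> 1 + evalJetted u
      Add a b -> evalJetted a + evalJetted b

-- The "jet": addition of two numerals is replaced by the host
-- language's (+). The correctness obligation is that this agrees
-- with evalSlow on every term it fires on.
lookupJet :: Term -> Maybe Integer
lookupJet (Add a b) = (+) <$> asNumeral a <*> asNumeral b
lookupJet _         = Nothing

asNumeral :: Term -> Maybe Integer
asNumeral Zero     = Just 0
asNumeral (Succ t) = succ <$> asNumeral t
asNumeral _        = Nothing

The open question in this post is precisely how to discharge that correctness obligation at scale.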

Some languages that do this:

  • Urbit's Nock, which, other than the jet concept, I honestly think is either a strange art project or snake oil, so I won't discuss this further.
  • Simplicity, a smart contract language for blockchain programming by Blockstream.
  • David Barbour's Awelon, which is the most minimalistic of the three, defining only function types!

My question is this: is this idea of "jets" or "accelerators" a serious approach to programming language implementation? There's an obvious question that comes to mind: how do you know the object language program and the meta-language program you've replaced it with are equivalent? Simplicity claims to have done a number of Coq proofs of equivalence, but I can't imagine that being sustainable given the difficulty of theorem proving and the small number of developers able to write such a proof. Urbit takes a rather cavalier attitude towards jet equivalence by advocating testing and simply treating the jet as the ground truth in the case of a conflict. And I know David posts here so I hope he will respond with Awelon's take on this matter.

Here is an excerpt from the Simplicity document arguing in favor of jets:

• Jets provide a formal specification of their behavior. The implementation of a jet must produce output identical to the output that would be produced by the Simplicity expression being replaced. There is no possibility for an ambiguous interpretation of what a jet computes.
• Jets cannot accidentally introduce new behavior or new side effects because they can only replicate the behavior of Simplicity expressions. To add new behavior to Simplicity we must explicitly extend Simplicity (see Section 4).
• Jets are transparent when it comes to reasoning about Simplicity expressions. Jets are logically equal to the code they replace. Therefore, when proving properties of one's Simplicity code, jets can safely be ignored.

Naturally, we expect jetted expressions to have properties already proven and available; this will aid reasoning about Simplicity programs that make use of jets.

Because jets are transparent, the static analyses of resource costs are not affected by their existence. To encourage the use of jets, we anticipate discounts to be applied to the resource costs of programs that use jets based on the estimated savings of using jets.

When a suitably rich set of jets is available, we expect the bulk of the computation specified by a Simplicity program to be made up of these jets, with only a few combinators used to combine the various jets. This should bring the computational requirements needed for Simplicity programs in line with existing blockchain languages. In light of this, one could consider Simplicity to be a family of languages, where each language is defined by a set of jets that provide computational elements tailored for its particular application.

In the interest of charity, I'll also try to make an argument in favor of this approach, although I remain skeptical: an extremely minimal programming language semantics means a language designer can truly consider their work to be done at some point, with no further room for improvement. A minimalistic Python would have largely avoided the fiasco involved in the upgrade from Python 2 to Python 3, by pushing all of the breaking changes to the library level and allowing for more gradual adoption. Language features added to JavaScript in recent years (like classes, async/await, modules and so on) would likewise have been libraries and presentation-layer additions instead of breaking changes at the implementation level.

Help with Herbelin

DISCLAIMER: I'm uneducated with PLT and don't know what I'm talking about, so please forgive any whacked terminology.

Ok, that outta the way...

I'm trying to make my way through The Duality of Computation (https://pdfs.semanticscholar.org/048f/c94be2ec752bb210c5f688cba0200c1a1f92.pdf), and this stuff, like most everything posted on this fascinating website, is way over my head. I can't follow most of the back half of the paper, but I was hoping someone might have the spare time to answer two newbie, school-level questions...

1) (The silly one.) How on earth do you pronounce the "mu, mu with tilde" calculus? It would literally be easier to read this if I knew how to read it. Is there some definitive guide for people who just figured out that the funny squiggle in so many of these papers is pronounced "eta"? :)

2) (The basic one.) The authors talk about the call-by-value side of the duality as dealing with "environments with holes." This reminds me of PTS, with its capital-Pi lambda expressions that can close over terms or types. What's the difference between an "environment with a hole" and a lambda over environments? And what does that mean for compiled languages, where environments aren't parameters?

Any help on either question would be appreciated.

Project Loom: adding fibers and continuations to Java

Just saw this on Hacker News -- Project Loom: Fibers and Continuations for the Java Virtual Machine with the following overview:

Project Loom's mission is to make it easier to write, debug, profile and maintain concurrent applications meeting today's requirements. Threads, provided by Java from its first day, are a natural and convenient concurrency construct (putting aside the separate question of communication among threads) which is being supplanted by less convenient abstractions because their current implementation as OS kernel threads is insufficient for meeting modern demands, and wasteful in computing resources that are particularly valuable in the cloud. Project Loom will introduce fibers as lightweight, efficient threads managed by the Java Virtual Machine, that let developers use the same simple abstraction but with better performance and lower footprint. We want to make concurrency simple(r) again! A fiber is made of two components — a continuation and a scheduler. As Java already has an excellent scheduler in the form of ForkJoinPool, fibers will be implemented by adding continuations to the JVM.
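
The "continuation plus scheduler" decomposition can be sketched in a few lines of Haskell (a concept demo of cooperative fibers in general, emphatically not Loom's Java API; Fiber and schedule are names I made up):

import Data.Sequence (Seq, ViewL (..), fromList, viewl, (|>))

-- A fiber either finishes or pauses with an action to run now and a
-- continuation to resume later.
data Fiber = Done | Step (IO ()) Fiber

-- A trivial round-robin scheduler: run one step of the front fiber,
-- then requeue its continuation at the back.
schedule :: Seq Fiber -> IO ()
schedule q = case viewl q of
  EmptyL            -> return ()
  Done      :< rest -> schedule rest
  Step io k :< rest -> io >> schedule (rest |> k)

-- Two toy fibers, interleaved cooperatively on one OS thread.
fiber :: String -> Int -> Fiber
fiber _    0 = Done
fiber name n = Step (putStrLn (name ++ " " ++ show n)) (fiber name (n - 1))

main :: IO ()
main = schedule (fromList [fiber "A" 3, fiber "B" 3])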

I'm a fan of fibers, and this has quite a bit of interesting material in it for like-minded folks.

Non-determinism: a sublanguage rather than a monad

A puzzlingly named, exceedingly technical device introduced to structure the denotational semantics has by now achieved cult status. It has been married to effects -- more than once. It is compulsively looked for in all manner of things, including burritos. At least two ICFP papers brought it up without a rhyme or reason (or understanding), as the authors later admitted. I am talking about monads.

In truth, effects are not married to monads and are approachable directly. The profound insight behind monads is the structuring: the separation of `pure' (context-independent) and effectful computations. The structuring can be done without explicating mathematical monads, and especially without resorting to vernacular monads such as State, etc. This article gives an example: a simple, effectful, domain-specific sublanguage embedded into an expressive `macro' metalanguage. Abstraction facilities of the metalanguage such as higher-order functions and modules help keep the DSL to the bare minimum, often to the first order, easier to reason about and implement.

The key insight predates monads and goes all the way back to the origins of ML, as a scripting language for the Edinburgh LCF theorem prover. What has not been clear is how simple an effectful DSL may be while remaining useful. How convenient it is, especially compared to the monadic encodings. How viable it is to forsake the generality of first-class functions and monads and what benefits it may bring. We report on an experiment set out to explore these questions.

We pick a rather complex effect -- non-determinism -- and use it in OCaml, which at first blush seems unsuitable since it is call-by-value and has no monadic sugar. And yet, we can write non-deterministic programs just as naturally and elegantly as in Haskell or Curry.

The running tutorial example is computing all permutations of a given list of integers. The reader may want to try doing that in their favorite language or in plain OCaml. Albeit a simple exercise, the code is often rather messy and not obviously correct. In the functional-logic language Curry, it is strikingly elegant: a mere foldr insert []. It is a re-statement of the specification: a permutation is moving the elements of the source list one-by-one into some position in the initially empty list. The code immediately tells us that the number of possible permutations of n elements is n!. From its very conception in Rabin and Scott's 1959 paper, non-determinism was called for to write clear specifications -- and then to make them executable. That is what we shall do.
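
For comparison, here is a sketch of the monadic encoding the article alludes to, in Haskell's list monad (my own transcription of the Curry one-liner, not the article's OCaml code): insert places an element non-deterministically at every position, and the fold is the same re-statement of the specification.

-- Non-deterministically insert x at each possible position of a list.
insert :: a -> [a] -> [[a]]
insert x []       = [[x]]
insert x (y : ys) = (x : y : ys) : map (y :) (insert x ys)

-- Move the elements one by one into an initially empty list; the list
-- monad's (>>=) collects all alternatives, so
-- length (perms [1..n]) == n!.
perms :: [a] -> [[a]]
perms = foldr (\x alts -> alts >>= insert x) [[]]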
