Computer History Museum releases PostScript source

The Computer History Museum, in conjunction with Adobe, has released the PostScript source code. Here is the release, with some helpful historical context and several photos:

The story of PostScript has many different facets. It is a story about profound changes in human literacy as well as a story of trade secrets within source code. It is a story about the importance of teams, and of geometry. And it is a story of the motivations and educations of engineer-entrepreneurs.

The Computer History Museum is excited to publicly release, for the first time, the source code for the breakthrough printing technology, PostScript. We thank Adobe, Inc. for their permission and support, and John Warnock for championing this release.

The Verse Calculus: a Core Calculus for Functional Logic Programming

The Verse Calculus: a Core Calculus for Functional Logic Programming

https://simon.peytonjones.org/assets/pdfs/verse-conf.pdf

  • LENNART AUGUSTSSON, Epic Games, Sweden
  • JOACHIM BREITNER
  • KOEN CLAESSEN, Epic Games, Sweden
  • RANJIT JHALA, Epic Games, USA
  • SIMON PEYTON JONES, Epic Games, United Kingdom
  • OLIN SHIVERS, Epic Games, USA/li>
  • TIM SWEENEY, Epic Games, USA

Functional logic languages have a rich literature, but it is tricky to give them a satisfying semantics. In this
paper we describe the Verse calculus, VC, a new core calculus for functional logical programming. Our main
contribution is to equip VC with a small-step rewrite semantics, so that we can reason about a VC program
in the same way as one does with lambda calculus; that is, by applying successive rewrites to it.


This draft paper describes our current thinking about Verse. It is very much a work in progress, not a finished
product. The broad outlines of the design are stable. However, the details of the rewrite rules may well change; we
think that the current rules are not confluent, in tiresome ways. (If you are knowledgeable about confluence proofs,
please talk to us!)We are eager to enagage in a dialogue with the community. Please do write to us.

LtU is now running in a new, more stable environment

LtU has experienced a long period of downtime recently. Its software infrastructure was outdated enough that it became difficult to maintain when problems arose. It has now been migrated to a brand new environment. It should be much more stable from now on.

Graydon Hoare: 21 compilers and 3 orders of magnitude in 60 minutes

In 2019, Graydon Hoare gave a talk to undergraduates (PDF of slides) trying to communicate a sense of what compilers looked like from the perspective of people who did it for a living.

I've been aware of this talk for over a year and meant to submit a story here, but was overcome by the sheer number of excellent observations. I'll just summarise the groups he uses:

  • The giants: by which he means the big compilers that are built the old-fashioned way that throw massive resources at attaining efficiency
  • The variants, which use tricks to avoid being so massive:
    1. Fewer optimisations: be traditional, but be selective and only the optimisations that really pay off
    2. Use compiler-friendly languages, by which he is really taking about languages that are good for implementing compilers, like Lisp and ML
    3. Theory-driven meta-languages, esp. how something like yacc allows a traditional Dragon-book style compiler to be written more easily
    4. Base compiler on a carefully designed IR that is either easy to compile or reasonable to bytecode-interpret
    5. Exercise discretion to have the object code be a mix of compiled and interpreted
    6. Use sophisticated partial evaluation
    7. Forget tradition and implement everything directly by hand

I really recommend spending time working through these slides. While much of the material I was familiar with, enough was new, and I really appreciated the well-made points, shout-outs to projects that deserve more visibility, such as Nanopass compilers and CakeML, and the presentation of the Futamura projections, a famously tricky concept, at the undergraduate level.

Latent Effects for Reusable Language Components

Latent Effects for Reusable Language Components, by Birthe van den Berg, Tom Schrijvers, Casper Bach Poulsen, Nicolas Wu:

The development of programming languages can be quite complicated and costly. Hence, much effort has been devoted to the modular definition of language features that can be reused in various combinations to define new languages and experiment with their semantics. A notable outcome of these efforts is the algebra-based “datatypes "a la carte" (DTC) approach. When combined with algebraic effects, DTC can model a wide range of common language features. Unfortunately, the
current state of the art does not cover modular definitions of advanced control-flow mechanisms that defer execution to an appropriate point, such as call-by-name and call-by-need evaluation, as well as (multi-)staging. This paper defines latent effects, a generic class of such control-flow mechanisms. We demonstrate how function abstractions, lazy computations and a MetaML-like staging can all be expressed in a modular fashion using latent effects, and how they can be combined in various ways to obtain complex semantics. We provide a full Haskell implementation of our effects and handlers with a range of examples.

Looks like a nice generalization of the basic approach taken by algebraic effects to more subtle contexts. Algebraic effects have been discussed here on LtU many times. I think this description from section 2.3 is a pretty good overview of their approach:

LE&H is based on a different, more sophisticated structure than AE&H’s free monad. This structure supports non-atomic operations (e.g., function abstraction, thunking, quoting) that contain or delimit computations whose execution may be deferred. Also, the layered handling is different. The idea is still the same, to replace bit by bit the structure of the tree by its meaning. Yet, while AE&H grows the meaning around the shrinking tree, LE&H grows little “pockets of meaning” around the individual nodes remaining in the tree, and not just around the root. The latter supports deferred effects because later handlers can still re-arrange the semantic pockets created by earlier handlers.

Introducing PathQuery, Google's Graph Query Language

Introducing PathQuery, Google's Graph Query Language

We introduce PathQuery, a graph query language developed to scale with Google's query and data volumes as well as its internal developer community. PathQuery supports flexible and declarative semantics. We have found that this enables query developers to think in a naturally "graphy" design space and to avoid the additional cognitive effort of coordinating numerous joins and subqueries often required to express an equivalent query in a relational space. Despite its traversal-oriented syntactic style, PathQuery has a foundation on a custom variant of relational algebra -- the exposition of which we presently defer -- allowing for the application of both common and novel optimizations. We believe that PathQuery has withstood a "test of time" at Google, under both large scale and low latency requirements. We thus share herein a language design that admits a rigorous declarative semantics, has scaled well in practice, and provides a natural syntax for graph traversals while also admitting complex graph patterns.

Things that are somewhat interesting to me, from an engineering standpoint:

1. PathQuery has a module/compilation system, enabling re-use of PathQuery modules across projects. (Someone had mentioned that Google has around 40,000 PathQuery modules already, internally...)
2. PathQuery supports native functions so that some query pieces can be evaluated procedurally (peephole optimization)
3. Use of relational algebra to enable a lot of known optimizations, plus future optimizations

Also, from a socio-linguistic perspective, Graph Languages are effectively the new Object-Relational Mapping layer, but they solve an interesting organizational problem of allowing multiple teams to code in different languages, without needing to re-write / re-implement entities and mapping configurations in each language. It's the Old New Thing again...

Google announces Logica: organizing your data queries, making them universally reusable and fun

You can read more about it at the Google Open Source blog post, Logica: organizing your data queries, making them universally reusable and fun.

They advocate for datalog-like language they developed internally at Google.

The reason?

Good programming is about creating small, understandable, reusable pieces of logic that can be tested, given names, and organized into packages which can later be used to construct more useful pieces of logic. SQL resists this workflow. Although you can encapsulate certain repeated computations into views and functions, the syntax and support for these can vary among implementations, the notions of packages and imports are generally nonexistent, and higher-level constructions (e.g. passing a function to a function) are impossible.

Coq will be renamed

From the Coq-club:

The Coq development team acknowledges the recent discussions (started on the Coq-Club mailing list) around Coq's logo and name.

We wish to thank everyone that participated in these discussions. Testimonies from people who experienced harassment or awkward situations, reports about students (notably women) who ended up not learning / using Coq because of its name, were all very important so that the community could fully recognize the impact of the current name and its slang meaning in English, especially with respect to gender-diversity in the Coq community.

For these reasons, the Coq development team is open to a renaming.

Suggestions for alternative names go here.

LAMBDA: The ultimate Excel worksheet function

Post by Andy Gordon and Simon Peyton Jones on LAMBDA giving Excel users the ability to define functions.

Ever since it was released in the 1980s, Microsoft Excel has changed how people organize, analyze, and visualize their data, providing a basis for decision-making for the millions of people who use it each day. It’s also the world’s most widely used programming language. Excel formulas are written by an order of magnitude more users than all the C, C++, C#, Java, and Python programmers in the world combined. Despite its success, considered as a programming language Excel has fundamental weaknesses. Over the years, two particular shortcomings have stood out: (1) the Excel formula language really only supported scalar values—numbers, strings, and Booleans—and (2) it didn’t let users define new functions.

Until now.

Google Brain's Jax and Flax

Google's AI division, Google Brain, has two main products for deep learning: TensorFlow and Jax. While TensorFlow is best known, Jax can be thought of as a higher-level language for specifying deep learning algorithms while automatically eliding code that doesn't need to run as part of the model.

Jax evolved from Autograd, and is a combination of Autograd and XLA. Autograd "can automatically differentiate native Python and Numpy code. It can handle a large subset of Python's features, including loops, ifs, recursion and closures, and it can even take derivatives of derivatives of derivatives. It supports reverse-mode differentiation (a.k.a. backpropagation), which means it can efficiently take gradients of scalar-valued functions with respect to array-valued arguments, as well as forward-mode differentiation, and the two can be composed arbitrarily. The main intended application of Autograd is gradient-based optimization."

Flax is then built on top of Jax, and allows for easier customization of existing models.

What do you see as the future of domain specific languages for AI?