LtU Forum

A Farewell to FRP in Elm

Making signals unnecessary with The Elm Architecture

...the big benefit is that Elm is now significantly easier to learn and use. As the design of subscriptions emerged, we saw that all the toughest concepts in Elm (signals, addresses, and ports) could collapse into simpler concepts in this new world. Elm is designed for ease-of-use, so I was delighted to stumble upon a path that would take us farther with fewer concepts. To put this in more alarmist terms, everything related to signals has been replaced with something simpler and nicer.

Algebra Of Pointers

I have been thinking a little about pointers, and what kind of mathematical structure they form. They seem to be like a one-dimensional vector (as opposed to a simple scalar). If we use the terms location for a point in n-dimensional space, and distance for a vector, then we get the following properties: adding a distance to a location results in a location; subtracting a distance from a location results in a location; subtracting a location from a location results in a distance.

Where it gets more tricky is that adding a location to a location would seem to be invalid (I don't know what type of thing the result would be), and yet averaging two locations to get the mid-point seems perfectly reasonable. Topologically, what is going on here? What is the type of a location added to a location?

If we treat all locations as distances with an origin at zero, then everything makes sense again in terms of types, except for zero itself which seems to still be a location and not a distance (or it all gets horribly self-referential).

Any thoughts on how to make sense of this, what the type of the sum of two locations might be, etc?
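
For concreteness, here is a minimal Rust sketch of the structure described above (the Location/Distance newtypes and the one-dimensional integer address space are purely illustrative). It admits exactly the three listed operations and recovers the midpoint as an affine combination rather than as location plus location:

```rust
// Illustrative sketch: only the location/distance combinations described above
// are given operator impls; location + location deliberately has no type here.
use std::ops::{Add, Sub};

#[derive(Clone, Copy, Debug, PartialEq)]
struct Location(i64); // a point: an address

#[derive(Clone, Copy, Debug, PartialEq)]
struct Distance(i64); // a vector: a difference between two points

impl Add<Distance> for Location {               // location + distance = location
    type Output = Location;
    fn add(self, d: Distance) -> Location { Location(self.0 + d.0) }
}

impl Sub<Distance> for Location {               // location - distance = location
    type Output = Location;
    fn sub(self, d: Distance) -> Location { Location(self.0 - d.0) }
}

impl Sub for Location {                         // location - location = distance
    type Output = Distance;
    fn sub(self, other: Location) -> Distance { Distance(self.0 - other.0) }
}

// The midpoint never adds two Locations: it adds half the distance between them
// to one of them, i.e. an affine combination.
fn midpoint(a: Location, b: Location) -> Location {
    a + Distance((b - a).0 / 2)
}

fn main() {
    let a = Location(10);
    let b = Location(20);
    assert_eq!(b - a, Distance(10));
    assert_eq!(midpoint(a, b), Location(15));
}
```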

Session Types in a Linearly Typed Multi-Threaded Lambda-Calculus

The paper Session Types in a Linearly Typed Multi-Threaded Lambda-Calculus, Hongwei Xi, Zhiqiang Ren, Hanwen Wu, William Blair, 2016.

We present a formalization of session types in a multi-threaded lambda-calculus (MTLC) equipped with a linear type system, establishing for the MTLC both type preservation and global progress. The latter (global progress) implies that the evaluation of a well-typed program in the MTLC can never reach a deadlock. As this formulated MTLC can be readily embedded into ATS, a full-fledged language with a functional programming core that supports both dependent types (of DML-style) and linear types, we obtain a direct implementation of session types in ATS. In addition, we gain immediate support for a form of dependent session types based on this embedding into ATS. Compared to various existing formalizations of session types, we see the one given in this paper as unique in its closeness to concrete implementation. In particular, we report such an implementation ready for practical use that generates Erlang code from well-typed ATS source (making use of session types), thus taking great advantage of the infrastructural support for distributed computing in Erlang.
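
To give a rough flavour of the idea, here is a toy Rust typestate sketch, not the paper's ATS/linear-type formalization: all names are made up and the actual transport is elided. The channel's type records the remaining protocol, and each operation consumes the channel and returns one at the next state, so using a channel out of order, twice, or after the protocol ends fails to compile.

```rust
// Toy session-type sketch via typestate. Purely illustrative.
use std::marker::PhantomData;

struct Snd<T, Next>(PhantomData<(T, Next)>); // protocol: send a T, then continue as Next
struct Rcv<T, Next>(PhantomData<(T, Next)>); // protocol: receive a T, then continue as Next
struct End;                                  // protocol: finished

struct Chan<P>(PhantomData<P>);              // a channel at protocol state P (transport elided)

impl<T, Next> Chan<Snd<T, Next>> {
    fn send(self, _msg: T) -> Chan<Next> {   // consumes the channel at the old state
        Chan(PhantomData)
    }
}

impl<T: Default, Next> Chan<Rcv<T, Next>> {
    fn recv(self) -> (T, Chan<Next>) {       // returns a dummy value; transport elided
        (T::default(), Chan(PhantomData))
    }
}

impl Chan<End> {
    fn close(self) {}
}

// A client that must send an i64, then receive a String, then stop.
type Client = Snd<i64, Rcv<String, End>>;

fn main() {
    let c: Chan<Client> = Chan(PhantomData);
    let c = c.send(42);
    let (_reply, c) = c.recv();
    c.close();
    // c.send(1); // would not compile: the protocol is already finished
}
```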

Programming by poking: why MIT stopped teaching SICP

A blog post; excerpt:

In this talk at the NYC Lisp meetup, Gerry Sussman was asked why MIT stopped teaching the legendary 6.001 course, which was based on Sussman and Abelson’s classic text The Structure and Interpretation of Computer Programs (SICP). Sussman’s answer was that (1) he and Hal Abelson got tired of teaching it (having done it since the 1980s), so in 1997 they walked into the department head’s office and said: “We quit. Figure out what to do.” And, more importantly, (2) they felt that the SICP curriculum no longer prepared engineers for what engineering is like today. Sussman said that in the 80s and 90s, engineers built complex systems by combining simple and well-understood parts. The goal of SICP was to provide the abstraction language for reasoning about such systems.

Today, this is no longer the case. Sussman pointed out that engineers now routinely write code for complicated hardware that they don’t fully understand (and often can’t understand because of trade secrecy.) The same is true at the software level, since programming environments consist of gigantic libraries with enormous functionality. According to Sussman, his students spend most of their time reading manuals for these libraries to figure out how to stitch them together to get a job done. He said that programming today is “More like science. You grab this piece of library and you poke at it. You write programs that poke it and see what it does. And you say, ‘Can I tweak it to do the thing I want?'”. The “analysis-by-synthesis” view of SICP — where you build a larger system out of smaller, simple parts — became irrelevant. Nowadays, we do programming by poking.

Also, see the Hacker News thread. I thought this comment was useful:

What should we consider fundamental?

A fair question, and a full answer would be too long for a comment (though it would fit in a blog post, which I'll go ahead and write now since this seems to be an issue). But I'll take a whack at the TL;DR version here.

AI, ML, NLP, and web design are application areas, not fundamentals. (You didn't list computer graphics, computer vision, robotics, embedded systems -- all applications, not fundamentals.) You can cover all the set theory and graph theory you need in a day. Most people get this in high school. The stuff you need is just not that complicated. You can safely skip category theory. What you do need is some amount of time spent on the idea that computer programs are mathematical objects which can be reasoned about mathematically. This is the part that the vast majority of people are missing nowadays, and it can be a little tricky to wrap your brain around at first. You need to understand what a fixed point is and why it matters. You need automata theory, but again, the basics are really not that complicated. You need to know about Turing-completeness, and that in addition to Turing machines there are PDAs and FSAs. You need to know that TMs can do things that PDAs can't (like parse context-sensitive grammars), that PDAs can do things that FSAs can't (like parse context-free grammars), and that FSAs can parse regular expressions, and that's all they can do.
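
To make the FSA/PDA contrast in that comment concrete, here is a small, purely illustrative Rust sketch: a two-state machine suffices for the regular language (ab)*, while checking balanced parentheses needs the unbounded counter that a PDA's stack provides.

```rust
// An FSA for (ab)*: finite state, no memory beyond the current state.
fn matches_ab_star(s: &str) -> bool {
    // States: 0 = expecting 'a' (accepting), 1 = expecting 'b'.
    let mut state = 0;
    for c in s.chars() {
        state = match (state, c) {
            (0, 'a') => 1,
            (1, 'b') => 0,
            _ => return false,
        };
    }
    state == 0
}

// Balanced parentheses: the counter stands in for a PDA's stack of identical symbols.
fn balanced_parens(s: &str) -> bool {
    let mut depth: i64 = 0;
    for c in s.chars() {
        match c {
            '(' => depth += 1,
            ')' => {
                depth -= 1;
                if depth < 0 {
                    return false;
                }
            }
            _ => return false,
        }
    }
    depth == 0
}

fn main() {
    assert!(matches_ab_star("ababab"));
    assert!(!matches_ab_star("abba"));
    assert!(balanced_parens("(()())"));
    assert!(!balanced_parens("(()"));
}
```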

You need some programming language theory. You need to know what a binding is, and that there are two types of bindings that matter: lexical and dynamic. You need to know what an environment is (a mapping between names and bindings) and how environments are chained. You need to know how evaluation and compilation are related, and the role that environments play in both processes. You need to know the difference between static and dynamic typing. You need to know how to compile a high-level language down to an RTL. For operating systems, you need to know what a process is, what a thread is, some of the ways in which parallel processes lead to problems, and some of the mechanisms for dealing with those problems, including the fact that some of those mechanisms require hardware support (e.g. atomic test-and-set instructions). You need a few basic data structures. Mainly what you need is to understand that what data structures are really all about is making associative maps that are more efficient for certain operations under certain circumstances.
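
Since "an environment is a mapping between names and bindings" and "environments are chained" carry a lot of weight in that paragraph, here is a minimal Rust sketch of lexical environment chaining; the names and representation are illustrative, not any particular interpreter's.

```rust
// Minimal sketch of chained environments: lookup walks outward from the
// innermost scope to the enclosing ones.
use std::collections::HashMap;

struct Env<'a> {
    bindings: HashMap<String, i64>,   // name -> binding (values kept simple)
    parent: Option<&'a Env<'a>>,      // the lexically enclosing environment
}

impl<'a> Env<'a> {
    fn new(parent: Option<&'a Env<'a>>) -> Self {
        Env { bindings: HashMap::new(), parent }
    }
    fn define(&mut self, name: &str, value: i64) {
        self.bindings.insert(name.to_string(), value);
    }
    fn lookup(&self, name: &str) -> Option<i64> {
        match self.bindings.get(name) {
            Some(v) => Some(*v),
            None => self.parent.and_then(|p| p.lookup(name)), // chain outward
        }
    }
}

fn main() {
    let mut global = Env::new(None);
    global.define("x", 1);

    let mut inner = Env::new(Some(&global));
    inner.define("y", 2);

    assert_eq!(inner.lookup("y"), Some(2)); // found locally
    assert_eq!(inner.lookup("x"), Some(1)); // found by chaining to the parent
    assert_eq!(inner.lookup("z"), None);    // unbound
}
```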

You need a little bit of database knowledge. Again, what you really need to know is that what databases are really all about is dealing with the architectural differences between RAM and non-volatile storage, and that a lot of these are going away now that these architectural differences are starting to disappear. That's really about it.

Was there a language with an explicit call stack?

When writing a runtime system or an interpreter in C (though TBH it’s true about all programming languages that I know of) with, e.g., moving garbage collection or support for continuations, you find that, surprisingly, the call stack is too abstracted away by the semantics, and you basically have to roll a separate one yourself (without the benefits of hardware support) or resort to things like getcontext(3) or GCC’s split stacks, which technically are extensions of the semantics that need their own definition. LLVM tries to support precise GC, but falls short of real-world uses. Another facet of this problem is the unpredictability of stack overflows with alloca(3), C99 variable-length arrays, or just plain old recursion. (I have to count my recursive calls so that my recursive-descent parser doesn’t crash on malicious input? Really?)

This motivates the question: was there a language that officially permitted or even required managing one’s own call stack, inspecting stack frames and the like? Note that this doesn’t mean not having some sort of subroutine linkage, but could mean banning recursion unless the user explicitly allocates space for it, so that the implicit stack is statically guaranteed to be bounded. (FORTRAN didn’t have recursion, of course, but I doubt it was for this reason.) Things I’d want to (be able to) implement in such a language are, again, a moving GC, closures, coroutines, and continuations. I’d expect the language to be reasonably ALGOL- or ML-like so that FORTH and its kin are excluded.

On second thought, it looks like the requirements of a moving GC are orthogonal to those of continuations and well-defined stack-overflow behaviour, but closures seem to be somewhere in between, so I’m leaving this as one question with possibly multiple valid answers.
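
For what it's worth, the style the question asks about can be mimicked inside an ordinary language by reifying frames as data. A minimal Rust sketch (purely illustrative, not any real runtime's layout): recursion becomes pushes onto an explicit, heap-allocated frame stack, which can then be inspected, bounded, or relocated in the way a moving GC or continuation capture would need.

```rust
// Factorial evaluated with an explicit frame stack instead of host recursion.
enum Frame {
    Call(u64),  // about to compute fact(n)
    MulBy(u64), // waiting to multiply the accumulated result by n
}

fn fact(n: u64) -> u64 {
    let mut stack = vec![Frame::Call(n)]; // the explicit "call stack"
    let mut result = 1u64;
    while let Some(frame) = stack.pop() {
        match frame {
            Frame::Call(0) => {}                 // base case: result unchanged
            Frame::Call(k) => {
                stack.push(Frame::MulBy(k));     // "return address" plus saved local
                stack.push(Frame::Call(k - 1));  // the recursive call, as data
            }
            Frame::MulBy(k) => result *= k,
        }
    }
    result
}

fn main() {
    assert_eq!(fact(10), 3_628_800);
    // Depth is now just a Vec length: easy to cap, grow, or walk for GC roots,
    // which is exactly what the implicit hardware stack makes awkward.
}
```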

PL's hotness challenge

Blog post on HN. Only the intro is related to PL:

I’m trying to get into neural networks. There have been a couple of big breakthroughs in the field in recent years, and suddenly my side project of messing around with programming languages seemed short-sighted. It almost seems like we’ll have real AI soon, and I want to be working on that. While making my first couple of steps into the field, it’s hard to keep that enthusiasm. A lot of the field is still kinda handwavy: when you want to find out why something is used the way it’s used, the only answer you can get is “because it works like this and it doesn’t work if we change it.”

Putting my ear to the ground, I hear that competition from ML has become more and more common, not just in PL but also in systems. I know many researchers who are re-purposing their skills right now, and it has become more difficult to find grad students and interns.

So, is this something we should be worried about? What could happen to make PL more competitive in terms of exciting results/mindshare?

Chez Scheme now open-source

Kent Dybvig (Cadence Research, Cisco Systems) has released the commercial Scheme compiler Chez Scheme (scheme.com) as open source on GitHub. Chez Scheme is a native-code-generating, optimizing compiler for R6RS that focuses on performance and productivity. It supports cross-compilation, threading, and many other extensions. The current version is 9.4.

I'm excited to see what the community will build with this great tool.

https://github.com/cisco/chezscheme

Elements of Programming in Rust

I have been working on implementing the algorithms and concepts in Stepanov's "Elements of Programming" in Rust. I have started with chapter 6, "Iterators", as I think it involves the most complex type mechanics, and if it is possible to translate this, the rest should be doable. I have posted what I have so far, which is all of the plain iterator algorithms, as I now think it will be possible to complete the translation.

There are some issues:

- I cannot define the dereference operators to work on values (they should be the identity function on values); this limits the generality of the algorithms, in that we cannot write an algorithm that will operate on both a value and an iterator.
- I am not happy with passing some arguments by moved value and some by borrowed reference, as it makes the function APIs harder to work with, but at the moment I can see no alternative. Ideally there should be a way of defining an algorithm generically over both.
- Currently I am passing closures by moved value, as this makes them cleaner to pass in, but it means that where a closure is passed to other functions more than once, a reference has to be taken. This will cause problems in the recursive case, where references will accumulate with recursion depth; however, auto-dereferencing in function application stops the code getting ugly. I am not sure if this results in a performance issue. One solution would be to pass all closures by mutable reference (see the sketch below), but I am not sure this is the best answer.
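
Regarding the last point, here is a sketch of the mutable-reference option; the function and names are illustrative, not taken from the repository. Taking the predicate as `&mut F` means every level of a recursive algorithm hands along the same reference, instead of stacking up `&mut &mut ...` layers as it would if the closure were taken by value and re-borrowed at each call.

```rust
// Divide-and-conquer count: the same `&mut F` is passed to both recursive calls.
fn count_if<F>(data: &[i32], f: &mut F) -> usize
where
    F: FnMut(&i32) -> bool,
{
    match data {
        [] => 0,
        [x] => usize::from(f(x)),
        _ => {
            let mid = data.len() / 2;
            count_if(&data[..mid], f) + count_if(&data[mid..], f)
        }
    }
}

fn main() {
    let xs = [1, 2, 3, 4, 5, 6];
    let mut calls = 0; // the closure mutates captured state, hence FnMut
    let mut is_even = |x: &i32| { calls += 1; x % 2 == 0 };
    assert_eq!(count_if(&xs, &mut is_even), 3);
    assert_eq!(calls, 6);
}
```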

Here's the link to the repo: https://github.com/keean/elements_in_rust

I am interested in comments about improving the implementation, how generic programming looks in Rust, how Rust could be improved to deal with the issues above, and any other issues that people can spot.

Remora: An Array-Oriented Language with Static Rank Polymorphism

The paper: An Array-Oriented Language with Static Rank Polymorphism, Justin Slepak, Olin Shivers, and Panagiotis Manolios, Northeastern University, 2014.
Abstract. The array-computational model pioneered by Iverson’s languages APL and J offers a simple and expressive solution to the “von Neumann bottleneck.” It includes a form of rank, or dimensional, polymorphism, which renders much of a program’s control structure implicit by lifting base operators to higher-dimensional array structures. We present the first formal semantics for this model, along with the first static type system that captures the full power of the core language.

The formal dynamic semantics of our core language, Remora, illuminates several of the murkier corners of the model. This allows us to resolve some of the model’s ad hoc elements in more general, regular ways. Among these, we can generalise the model from SIMD to MIMD computations, by extending the semantics to permit functions to be lifted to higher-dimensional arrays in the same way as their arguments.

Our static semantics, a dependent type system of carefully restricted power, is capable of describing array computations whose dimensions cannot be determined statically. The type-checking problem is decidable and the type system is accompanied by the usual soundness theorems. Our type system’s principal contribution is that it serves to extract the implicit control structure that provides so much of the language’s expressive power, making this structure explicitly apparent at compile time.
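
To give a rough flavour of the "lifting" the abstract describes, here is a toy Rust stand-in; it captures none of Remora's actual semantics or type system, and the `Lift` trait is invented purely for illustration. A scalar operation applies directly to a rank-0 value and maps itself over each outer axis of a higher-rank argument.

```rust
// Toy rank-lifting sketch: scalar functions are lifted over nested Vec "arrays".
trait Lift {
    fn lift(&self, f: &dyn Fn(f64) -> f64) -> Self;
}

impl Lift for f64 {                      // rank 0: apply the operation directly
    fn lift(&self, f: &dyn Fn(f64) -> f64) -> f64 { f(*self) }
}

impl<T: Lift> Lift for Vec<T> {          // rank n+1: map the lift over the outer axis
    fn lift(&self, f: &dyn Fn(f64) -> f64) -> Vec<T> {
        self.iter().map(|x| x.lift(f)).collect()
    }
}

fn main() {
    let scalar = 2.0_f64;
    let matrix = vec![vec![1.0, 2.0], vec![3.0, 4.0]];
    let double = |x: f64| x * 2.0;
    assert_eq!(scalar.lift(&double), 4.0);
    assert_eq!(matrix.lift(&double), vec![vec![2.0, 4.0], vec![6.0, 8.0]]);
}
```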

Binary Representation - Is it something to rise above?

There is a tension in programming language design. Are we describing operations, or are we describing semantics?

I have a strong opinion that these are separate spheres - that representation should be either fully above or fully below the language's native level of abstraction - and that it is therefore hard for a language to do both well.

At one extreme, C, Pascal, Fortran, etc -- all those old classic statically compiled languages -- are primarily for describing machine operations, and specifically machine operations on bits and bytes. To the extent that they have type systems at all, the types each refer to a specific range of memory - bits and bytes - mapped directly to a template that specifies the bit-level encoding of each value.

At the other end, R4RS scheme - I think that was the last version that this mostly applied to. R4RS scheme was, aside from a few warts, a language in which caring how something was represented in bits and bytes just wasn't a relevant question. The programming model was a simple one in which things were values, or containers for values, and if you cared how a character or a number was represented in bits, then it could only be because you were making a mistake.

Numbers were numeric values, not a specific number of bits with a specific mapping of bit patterns to numbers. Characters were character values, not a specific number of bits with a specific mapping; they had no numeric values. In principle you didn't know, or even care, whether you were working on a machine where everything was represented in nine-trit "trytes" at the bottom level; there was just no reference to representation anywhere in the language's semantics, because the abstraction barrier was drawn above the level of representation.

I recently observed that C was a language for describing machine operations, and scheme a language for describing semantics, and this is what I meant. The C programming model makes no sense if types don't represent specific memory-aligned areas with specific bit-pattern-to-value mappings. R4RS scheme's programming model would have made no sense if they did. They operate at entirely different levels of abstraction.

Another topic recently asked what would be the characteristics of a good successor language to scheme, or of a future direction for scheme - and I think that representational abstraction barrier is an important property of any high-level language which ought to be preserved.

If you're not building a binary interface to something (like hardware or a communications channel that's specified in binary), and you're not working in a hardware-oriented language where those units and specific mappings of bit patterns to values ARE the sole underlying paradigm of your whole type system, and yet the language still forces bit-level representation on you, then I submit that there is a deficiency in the design of the language.

Conversely (and here R4RS scheme failed horribly), in such a language, when you *are* building an interface to something, the patterns of bits you read or write should get interpreted according to a very specific mapping. A language that abstracts above representation issues for other operations cannot be used to interface with anything unless the programmer specifies explicitly what mapping of bits to values is to be used for I/O. And this is because you're interfacing to things that *DON'T* have a common language for communication that abstracts above the level of representation, so you can't keep the abstraction barrier above that level for the channel itself. A particularly egregious failure of this principle happened in a graphics library written in Scheme. Scheme provided no 'blob' type, and there was no way to specify a bits-to-values mapping on the I/O channels, so the implementor used strings to hold bytes, assuming that the I/O would be done in ASCII + Code page 1386. And of course the code exploded when used on an interface where reading and writing characters was by default done with a different code page or in UTF-16.
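
As a concrete contrast to that failure, here is a small Rust sketch of the discipline argued for above, where the bits-to-values mapping (byte order, width, text encoding) is stated explicitly at the I/O boundary rather than inherited from a platform default; the record layout is invented purely for illustration.

```rust
// Read an invented record: a little-endian u32 length, then that many UTF-8 bytes.
use std::io::{self, Read};

fn read_record<R: Read>(mut input: R) -> io::Result<(u32, String)> {
    // Field 1: a 32-bit unsigned integer, explicitly little-endian.
    let mut len_bytes = [0u8; 4];
    input.read_exact(&mut len_bytes)?;
    let len = u32::from_le_bytes(len_bytes);

    // Field 2: `len` bytes of text, explicitly UTF-8 (not "whatever the locale says").
    let mut text_bytes = vec![0u8; len as usize];
    input.read_exact(&mut text_bytes)?;
    let text = String::from_utf8(text_bytes)
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;

    Ok((len, text))
}

fn main() -> io::Result<()> {
    // 5 encoded as a little-endian u32, followed by the UTF-8 bytes of "hello".
    let wire: &[u8] = &[5, 0, 0, 0, b'h', b'e', b'l', b'l', b'o'];
    let (len, text) = read_record(wire)?;
    assert_eq!((len, text.as_str()), (5, "hello"));
    Ok(())
}
```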

So... in designing a programming language to operate above the representation abstraction barrier, I would absolutely ensure that bit and byte width of representations entered into the language semantics only when invoked specifically for I/O purposes. If the word "bit" came up in any chapter of the documentation other than the one about how to do I/O, I would be certain that I had made a mistake. If it were used referring to something the programmer actually needs to know to get useful work done, I would be certain that the mistake was egregious.

And this is why I am so completely certain that "uniform vectors" are a mistake in Scheme. They mean the programmer has to care about binary representation for some reason other than I/O.
