LtU Forum
"Screw it, I'll make my own!"
If you ask me why I made a programming language, I could justify it in a lot of ways, point out its strengths, what I think it does better than the others, and so on. But I don't think that's really the driving force. As I see it, that driving force is, basically, a kind of conceit. A typical programmer will learn one or several well-established languages, depending on what they aim to achieve. They will adapt their way of thinking to fit these tools as best they can. But perhaps you don't want to adapt. You don't like any of the tools you can find because they are never exactly the way you want them, it's like they don't fit your brain at the moment. If you won't adapt to the language, the only alternative is to adapt the language to you. And if you are like me it becomes a bit of an obsession, an itch you just have to scratch...
As part of an introduction to my own academic work, I wrote a rather general introduction (for non-specialists) to the mathematical approach to programming language design. This introduction resonates with some eternal LtU debates, so I thought I would share it here.
I suspect that many frequent posters will find it very old-school. It is old school, and it is not a radical text at all -- I could be more radical, but that would not be appropriate for this document. In particular, it is mostly a post-hoc justification of the general "mathematical" approach as represented in mainstream PL conferences. I'm in favor of supporting diversity and underdog approaches, but I still think the mathematical approach has a very strong case, and I believe the claims in this text could find consensus among many LtU members. Feedback is welcome.
Human programmers have invented many different symbolic
representations for computer programs, which are called programming
languages. One can think of them as languages used to
communicate with the computer, but it is important to remember that
programming is also a social activity, in the sense that many programs
are created by a collaboration of several programmers, or that
programs written by one programmer may be reused, inspected or
modified by others. Programs communicate intent to a computer, but
also to other human programmers.
Programmers routinely report frustration with the limitations of the
programming language they use -- it is very hard to design
a good programming language. At least the following three qualities are expected:
concision: Simple tasks should be described by simple programs, not large or complex ones. Complex tasks require complex
programs, but their complexity should come solely from the problem
domain (the specificity of the required task), not accidental
complexity imposed by the programming language.
For example, early Artificial Intelligence research highlighted the
need for language-level support for backtracking (giving up
on a series of decisions made toward a goal to start afresh through
a different method), and some programming languages make this substantially easier than others (a small code sketch of such a backtracking search follows this list of qualities).
clarity: By reading a program description it should be
easy to understand the intent of its author(s). We say that
a program has a bug (a defect) when its meaning does not
coincide with the intent of its programmers -- they made a mistake
when transcribing their thoughts into programs. Clarity is thus an
essential component of safety (avoiding program defects), and should
be supported by mechanized tools to the largest possible extent. To
achieve clarity, some language constructs help programmers
express their intent, and programming language designers work on
tools to automatically verify that this expressed intent is
consistent with the rest of the program description.
For example, one of the worst security issues discovered in 2014 (the failure of Apple computers and mobile phones to verify the authenticity of connections to secure websites) was due to a single
line of program text that had been duplicated (written twice instead
of only once). The difference between the programmer intent
(ensure security of communications) and the effective behavior of
the program (allowing malicious network nodes to inspect your
communications with your online bank) was dramatic, yet neither the
human programmers nor the automated tools used by these programmers
reported this error.
consistency: A programming language should be regular and
structured, making it easy for users to guess how to use the parts
of the language they are not already familiar with. Programming
languages must be vastly easier to learn than human languages,
because their use requires an exacting precision and absence of
ambiguity. In particular, consistency supports clarity, as
recovering intent from program description requires a good knowledge
of the language: the more consistent the language, the lower the risk of misunderstanding.
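To make the backtracking example above concrete, here is a minimal sketch of a hand-rolled backtracking search, written in OCaml; every name in it is invented for the illustration. The point is the amount of scaffolding needed when the language offers no direct support: the search, the undoing of choices, and the plumbing of partial results all have to be spelled out by hand.

```ocaml
(* A hand-rolled backtracking search: assign a digit to each variable and,
   when a partial assignment cannot be completed, give up on it and try the
   next candidate. *)
let rec search ok assigned vars =
  match vars with
  | [] -> if ok assigned then Some assigned else None
  | var :: rest ->
      (* try each candidate digit; on failure, backtrack to the next one *)
      List.fold_left
        (fun found digit ->
           match found with
           | Some _ -> found
           | None -> search ok ((var, digit) :: assigned) rest)
        None
        [0; 1; 2; 3; 4; 5; 6; 7; 8; 9]

(* Example: find digits x and y such that x + y = 7 and x < y. *)
let () =
  let ok a =
    List.assoc "x" a + List.assoc "y" a = 7 && List.assoc "x" a < List.assoc "y" a
  in
  match search ok [] ["x"; "y"] with
  | Some solution ->
      List.iter (fun (v, d) -> Printf.printf "%s = %d\n" v d) solution
  | None -> print_endline "no solution"
```

In a language with built-in nondeterminism (Prolog, say, or a nondeterminism monad), the programmer states only the candidates and the constraint, and the give-up-and-retry machinery comes from the language; that is the kind of concision the example refers to.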
Of course, the list above is to be understood as the informal opinion
of a practitioner, rather than a scientific claim in
itself. Programming is a rich field that spans many activities, and
correspondingly programming language research can and should be
attacked from many different angles: mathematics (formalization),
engineering, design, human-machine interface, ergonomics, psychology,
linguistics, sociology, and working programmers all have something to say about how to make better programming languages.
This work was conducted within a research group -- and a research
sub-community -- that uses mathematical formalization as its main tool
to study, understand and improve programming languages. To work with
a programming language, we give it one or several formal semantics
(defining programs as mathematical objects, and their meaning as
mathematical relations between programs and their behavior); we can
thus prove theorems about programming languages themselves, or about
formal program analyses or transformations.
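To give a concrete, if tiny, flavour of what "programs as mathematical objects" means, here is a toy expression language together with an evaluation function defining its meaning, written in OCaml. This is only an illustrative sketch, not the semantics of any real language discussed here; on paper the same definition would typically be presented as a grammar plus inference rules or equations.

```ocaml
(* Programs as mathematical objects: a datatype of expressions ... *)
type expr =
  | Int of int
  | Add of expr * expr
  | If of expr * expr * expr   (* the condition 0 is treated as "false" *)

(* ... and their meaning: a function from programs to the values they compute. *)
let rec eval (e : expr) : int =
  match e with
  | Int n -> n
  | Add (e1, e2) -> eval e1 + eval e2
  | If (c, e_then, e_else) -> if eval c <> 0 then eval e_then else eval e_else

(* Once the meaning is pinned down, claims such as "this transformation
   preserves behavior" become precise statements about [eval] that can be
   proved or refuted. *)
let () = assert (eval (If (Int 1, Add (Int 2, Int 3), Int 0)) = 5)
```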
The details of how mathematical formalization can be used to guide
programming language design are rather fascinating -- it is a very
abstract approach to a very practical activity. The community shares a common body of properties that may or may not apply to any given
proposed design, and are understood to capture certain usability
properties of the resulting programming language. These properties are
informed by practical experience using existing languages (designed
using this methodology or not), and our understanding of them evolves
over time.
Having a formal semantics for the language under study is a solid way to
acquire an understanding of what the programs in this language
mean, which is a necessary first step for clarity -- the
meaning of a program cannot be clear if we don't first agree on what
it is. Formalization is a difficult (technical) and time-consuming
activity, but its simplifying power cannot be overstated: the
formalization effort naturally suggests many changes that can
dramatically improve consistency. By encouraging designers to build the language around a small core of independent concepts (the best way to reduce
the difficulty of formalization), it can also improve concision, as
combining small building blocks can be a powerful way to simply
express advanced concepts. Finding the right building blocks,
however, still depends very much on domain knowledge, and radical
ideas often occur through prototyping or use-case studies,
independently of formalization. Our preferred design technique would
therefore be formalization and implementation co-evolution, with
formalization and programming activities occurring jointly to inform
and direct the language design process.
A technical report by Ravi Chugh et al. Abstract:
We present the SKETCH-N-SKETCH editor for Scalable Vector Graphics (SVG) that integrates programmatic and direct manipulation, two modes of interaction with complementary strengths. In SKETCH-N-SKETCH, the user writes a program to generate an output SVG canvas. Then the user may directly manipulate the canvas while the system infers realtime updates to the program in order to match the changes to the output. To achieve this, we propose (i) a technique called trace-based program synthesis that takes program execution history into account in order to constrain the search space and (ii) heuristics for dealing with ambiguities. Based on our experience writing more than 40 examples and from the results of a study with 25 participants, we conclude that SKETCH-N-SKETCH provides a novel and effective workflow between the boundaries of existing programmatic and direct manipulation systems.
This was demoed at PLDI to a lot of fanfare. Also see some videos. And a demo that you can actually play with, sweet!
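If you want a rough mental model of the trace-based idea before diving into the paper, here is a deliberately tiny sketch (my own reconstruction in OCaml, not Sketch-n-Sketch's actual machinery): every value in the output remembers which program constant it came from, so a direct edit of the output can be pushed back through that trace into the program.

```ocaml
(* A toy "program": named constants that the output is computed from. *)
let program = [ ("cx", 100); ("cy", 50) ]

(* Rendering produces output attributes paired with a trace: the name of the
   constant each value originates from. *)
let render prog =
  [ ("circle.cx", (List.assoc "cx" prog, "cx"));
    ("circle.cy", (List.assoc "cy" prog, "cy")) ]

(* Direct manipulation: the user drags the circle, and the edit is pushed
   back to the originating constant via the trace. *)
let direct_manipulate prog attr new_value =
  let _, origin = List.assoc attr (render prog) in
  List.map (fun (name, v) -> if name = origin then (name, new_value) else (name, v)) prog

let () =
  let prog' = direct_manipulate program "circle.cx" 120 in
  Printf.printf "cx is now %d\n" (List.assoc "cx" prog')  (* prints 120 *)
```

The real system has to cope with output values computed from several program locations at once, which is where the constrained synthesis search and the ambiguity heuristics from the abstract come in.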
I want to know exactly what my software is doing. Better still, I want cryptographic proof that my software executes each and every computation step correctly.
Wouldn’t you?
Currently, I don’t know what Windows 10 is doing (or it is very hard to find out) and I hate that.
That’s because most of Windows 10:
- is compiled to machine code that bears no resemblance to the original source code,
- hides intricate networks of mutable objects behind abstract interfaces,
- and destroys precious machine state after machine state.
Welcome to SPREAD!
In SPREAD, all data, machines states, code and computations are cryptographically authenticated.
To put it mildly, some practical issues had to be resolved to make SPREAD a reality. First and foremost, SPREAD almost completely eradicates mutable state.
Obviously, keeping all the state for authentication purposes would require enormous amounts of storage. SPREAD solves that issue by cryptographically ‘signing’ states incrementally, while still allowing full user-control over which ones need to be signed, and at what level of granularity.
Alternatively, full machine states can also be stored incrementally by SPREAD. In turn, this allows the pervasive re-use of state that was calculated earlier.
So SPREAD kinda acts like a spreadsheet, because spreadsheets also re-use the previous states of a workbook to optimally recalculate the next.
Unlike SPREAD, however, spreadsheets are incapable of keeping all their versions around. And typically, Excel plug-ins completely destroy the (otherwise) purely functional nature of spreadsheets. In contrast, SPREAD only allows referentially transparent functions as primitives.
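One way to picture that spreadsheet-like reuse (my sketch below, not SPREAD's actual machinery) is memoization keyed by a cryptographic digest of the input: a state that hashes to a known key is never recomputed, and referential transparency is exactly what makes that reuse safe.

```ocaml
(* Reuse of previously computed states, sketched as a cache keyed by a
   cryptographic digest of the input. Uses OCaml's stdlib Digest module (MD5)
   purely for illustration. *)
let cache : (string, int) Hashtbl.t = Hashtbl.create 16

(* stand-in for an expensive but referentially transparent computation *)
let expensive (input : string) : int = 2 * String.length input

let cached_compute (input : string) : int =
  let key = Digest.to_hex (Digest.string input) in
  match Hashtbl.find_opt cache key with
  | Some result -> result              (* this state was seen before: reuse it *)
  | None ->
      let result = expensive input in
      Hashtbl.add cache key result;    (* record the freshly 'signed' state *)
      result

let () =
  assert (cached_compute "hello" = 10);
  assert (cached_compute "hello" = 10)  (* the second call hits the cache *)
```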
SPREAD builds on top of a recent breakthrough in cryptography called SeqHash. Unfortunately, SeqHash has gone largely unnoticed, but hopefully this post will stir some renewed interest. To honour SeqHash, I have called my extension SplitHash:
SplitHash is an immutable, uniquely represented Sequence ADT (Authenticated Data Structure):
- Like SeqHashes, SplitHashes can be concatenated in O(log(n)).
- But SplitHash extends SeqHash by allowing Hashes to also be split in O(log(n)).
- It also solves SeqHash's issue with repeating nodes by applying RLE (Run Length Encoding) compression.
- And to improve cache coherence and memory bandwidth, SplitHashes can be optionally chunked into n-ary trees.
SplitHash is the first known History-Independent (HI) ADT that holds all these properties.
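To give outsiders a mental model, here is a heavily simplified OCaml sketch of an authenticated sequence: each node carries a digest that commits to its contents, so concatenation only has to hash the two root digests instead of re-hashing the whole sequence. Note what the sketch deliberately leaves out -- unique (history-independent) representation, O(log(n)) balanced concatenation and splitting, RLE of repeated nodes, chunking -- which is exactly the hard part that SeqHash and SplitHash contribute.

```ocaml
(* A naive authenticated sequence: every node's digest commits to its
   contents. This is an illustration only, not SplitHash. *)
type 'a auth_seq =
  | Leaf of { value : 'a; digest : string }
  | Node of { left : 'a auth_seq; right : 'a auth_seq; digest : string }

let digest_of = function
  | Leaf { digest; _ } -> digest
  | Node { digest; _ } -> digest

let leaf (show : 'a -> string) (v : 'a) : 'a auth_seq =
  Leaf { value = v; digest = Digest.to_hex (Digest.string ("leaf:" ^ show v)) }

(* Concatenation hashes only the two root digests, not the elements. *)
let concat (l : 'a auth_seq) (r : 'a auth_seq) : 'a auth_seq =
  let digest =
    Digest.to_hex (Digest.string ("node:" ^ digest_of l ^ "/" ^ digest_of r))
  in
  Node { left = l; right = r; digest }

(* Recover the plain sequence from the authenticated structure. *)
let rec to_list = function
  | Leaf { value; _ } -> [ value ]
  | Node { left; right; _ } -> to_list left @ to_list right

let () =
  let s1 = concat (leaf string_of_int 1) (leaf string_of_int 2) in
  let s2 = concat (leaf string_of_int 1) (leaf string_of_int 2) in
  assert (to_list s1 = [1; 2]);
  assert (digest_of s1 = digest_of s2)  (* same construction, same digest *)
```

In this naive version the digest still depends on how a sequence was concatenated, which is precisely the history-dependence that SeqHash's uniquely represented trees remove.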
Sorry about the obvious self interest, but I'm just so excited about what I've created that I need to share this.
(This group and the diaspora seem to be doing all sorts of interesting things. I got around to this by wondering about static analysis of JavaScript wrt performance.)
As a developer
Who ends up on concurrent systems
I would like to be able to debug them.
(Even before I run them.)
Persuasive Prediction of Concurrency Access Anomalies
Jeff Huang, Charles Zhang
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology
Predictive analysis is a powerful technique that exposes concurrency bugs in un-exercised program executions. However, current predictive analysis approaches lack the persuasiveness property as they offer little assistance in helping programmers fully understand the execution history that triggers the predicted bugs. We present a persuasive bug prediction technique as well as a prototype tool, PECAN, for detecting general access anomalies (AAs) in concurrent programs. The main characteristic of PECAN is that, in addition to predicting AAs in a more general way, it generates "bug hatching clips" that deterministically instruct the input program to exercise the predicted AAs. The key ingredient of PECAN is an efficient offline schedule generation algorithm, with proof of the soundness, that guarantees to generate a feasible schedule for every real AA in programs that use locks in a nested way. We evaluate PECAN using twenty-two multi-threaded subjects including six large concurrent systems, and our experiments demonstrate that PECAN is able to effectively predict and deterministically expose real AAs. Several serious and previously unknown bugs in large open source concurrent systems were also revealed in our experiments.
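For readers who have not met the term: an "access anomaly" is, roughly, unsynchronized conflicting access to shared state whose outcome depends on thread scheduling. Here is a minimal sketch of one (my own, written with OCaml 5 domains; the paper's subjects are Java programs):

```ocaml
(* An unsynchronized check-then-update: with an unlucky interleaving both
   withdrawals pass the check and the balance goes negative. Requires OCaml 5. *)
let balance = ref 100

let withdraw amount =
  if !balance >= amount then
    (* another domain may run between the check and the update *)
    balance := !balance - amount

let () =
  let d1 = Domain.spawn (fun () -> withdraw 80) in
  let d2 = Domain.spawn (fun () -> withdraw 80) in
  Domain.join d1;
  Domain.join d2;
  Printf.printf "balance = %d\n" !balance
```

A predictive tool in the style of PECAN takes one observed execution of such a program and tries both to predict the anomaly and to produce a schedule that deterministically reproduces it.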
(sotto voce: I guess there's something to be said for using just utterly unprincipled, unrestricted, unconstrained, awful things like rampant concurrency, and Java, JavaScript, et al., because it gives the Really Smart People in the world something to attack and improve and show off around. I mean, if we all had the luxury of Doing Things Right from the get-go, I feel like lots of valuable insights (with wider application than their originating research) would never have been discovered or created.)
Knitting is the process of creating textile surfaces out of interlocked loops of yarn. With the repeated application of a basic operation – pulling yarn through an existing loop to create a new loop – complicated three-dimensional structures can be created [4]. Knitting machines automate this loop-through-loop process, with some physical limitations arising from their method of storing loops and accessing yarns [1, 3]. Currently, knitting machines are programmed at a very low level. Projects such as AYAB [2] include utilities for designing knit colorwork, but only within a limited stitch architecture; designers working in 3D usually do so via a set of pre-designed templates [4].
From Lea Albaugh and James McCann, "Challenges Facing a High-Level Language for Machine Knitting", POPL 2016.
Project Lamdu, a live-programming environment with a little something for everyone, including things like:
- ...the canonical representation of programs should not be text, but rich data structures: Abstract syntax trees.
- Effect Typing... allows a live environment to actually execute code as it is being edited, safely, and bring the benefits of spreadsheets to general purpose programming
- When types are rich enough, much of the program structure can be inferred from the types.
- Integrated revision control and live test cases will allow "Regression Debugging".
I have used Typed Lua. I have kept an eye on Typed Racket and Typed Clojure. Overall, I currently get the impression that they somehow don't quite get over some hurdles that would allow them to really shine. And thus people who wanted to love and use them are leaving them instead.
A post today re: Typed Clojure echoes this previous one:
In September 2013 we blogged about why we’re supporting Typed Clojure, and you should too! Now, 2 years later, our engineering team has made a collective decision to stop using Typed Clojure (specifically the core.typed library).
I come not to bury Typed X, but to praise them. Can somebody please work on figuring out what it is that is missing or needs to be tweaked to make them more usable? (Anything from speed to culture.) Or should we truly conclude that trying to augment / paper over a dynamic ecosystem is just for the most part doomed to fail?
Someone has a need to talk.
People have mentioned Bedrock on and off here over the years. I found the latest paper to be an exciting read since for some strange reason I like the idea of verification.
From Network Interface to Multithreaded Web Applications:
A Case Study in Modular Program Verification.
This paper reports on one case study applying modular proof techniques in the Coq proof assistant. To our knowledge, it is the first modular verification certifying a system that combines infrastructure with an application of interest to end users. We assume a nonblocking API for managing TCP networking streams, and on top of that we work our way up to certifying multithreaded, database-backed Web applications.