LtU Forum

Nullable type is needed to fix Tony Hoare's "billion dollar mistake".

The Nullable type is needed to fix Tony Hoare's "billion dollar mistake".
But implementing the idea is full of pitfalls, e.g., Null by itself should not be an expression.

▮ is the terminator that marks the end of a top-level construct.
An Expression◅aType▻ is an expression that, when executed, returns aType.

In an expression:
1. ⦾ followed by an expression for a nullable returns the Actor in the nullable or
throws an exception iff it is null.
2. Nullable followed by an expression for an Actor returns a nullable with the Actor.
3. Null followed by aType returns a Nullable◅aType▻.
4. ⦾? followed by an expression for a nullable returns True iff it is not null.

* 3▮ is equivalent to ⦾Nullable 3▮
* ⦾Null Integer▮ throws an exception
* True is equivalent to ⦾?Nullable 3▮
* False is equivalent to ⦾?Null Integer▮

In a pattern:
1. ⦾ followed by a pattern matches a Nullable◅aType▻ iff
it is non-null and the Actor it contains matches the pattern.
2. Null matches a Nullable◅aType▻ iff it is null

* The pattern ⦾x matches Nullable 3, binding x to 3
* The pattern Null matches Null Integer
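
For readers who prefer a mainstream notation, the expression forms and pattern forms above correspond loosely to operations on an option type. Here is a rough sketch in C++ using std::optional (this is only an analogy, not ActorScript semantics):

#include <cassert>
#include <optional>

int main() {
    // "Nullable 3": a nullable holding the Actor 3.
    std::optional<int> some = 3;
    // "Null Integer": a Nullable◅Integer▻ holding nothing.
    std::optional<int> none = std::nullopt;

    // "⦾ expression": extract the contents, throwing iff null.
    assert(some.value() == 3);          // like ⦾Nullable 3
    try {
        none.value();                   // like ⦾Null Integer: throws
    } catch (const std::bad_optional_access&) {
        // the exception case
    }

    // "⦾? expression": True iff not null.
    assert(some.has_value());           // like ⦾?Nullable 3
    assert(!none.has_value());          // like ⦾?Null Integer

    // Pattern side: match the non-null case and bind its contents, or match null.
    if (some) {
        int x = *some;                  // like the pattern ⦾x, binding x to 3
        assert(x == 3);
    }
    if (!none) {
        // like the pattern Null matching Null Integer
    }
}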

Edited for clarity

Eric Lippert's Sharp Regrets

In an article for InformIT, Eric Lippert runs down his "bottom 10" C# language design decisions:

When I was on the C# design team, several times a year we would have "meet the team" events at conferences, where we would take questions from C# enthusiasts. Probably the most common question we consistently got was "Are there any language design decisions that you now regret?" and my answer is "Good heavens, yes!"

This article presents my "bottom 10" list of features in C# that I wish had been designed differently, with the lessons we can learn about language design from each decision.

The "lessons learned in retrospect" for each one are nicely done.

Big questions

So, I've been (re)reading Hamming's /The Art of Doing Science and Engineering/, which includes the famous talk "You and Your Research". That's the one where he recommends thinking about the big questions in your field. So here's one that we haven't talked about in a while. It seems clear that more and more things are being automated, machine learning is improving, systems are becoming harder to tinker with, and so on. So for how long are we going to be programming in ways similar to those we are used to, which have been with us essentially since the dawn of computing? Clearly, some people will be programming as long as there are computers. But is the number of people churning code going to remain significant? In five years - of course. Ten? Most likely. Fifteen - I am not so sure. Twenty? I have no idea.

One thing I am sure of: as long as programming remains something many people do, there will be debates about static type checking.

Update: To put this in perspective - LtU turned fifteen last month. Wow.

Update 2: Take the poll!

How long will we still be programming?

Continuous feedback in PL

One of the tricks to creating a good user interface is to provide a continuous feedback loop between the user and the system. This can be done in many ways: scrolling, live "scrubbing" of parameters via a slider, combo list, or whatever. Bringing this to PL is possible to an extent, but is limited by code that is basically non-continuous (to be described later!).

In math, we have the notion of a continuous function: a function whose outputs change a little bit when its inputs change a little bit. It is a prerequisite that the inputs also be able to "change a little bit"; i.e. are themselves continuous. This is a bit harsh, actually, and it is interesting to also consider discrete values that have some kind of order in a space (e.g. 1 ... 2) where going to an adjacent value is considered as the "small change." In any case, a continuous function can be augmented with excellent continuous feedback if the user is viewing its output while scrubbing its input (e.g. see Conal Elliott's Eros).

So we could treat code as a value "x" that is interpreted by a function "f", so "f(x)" gives us the output of the program defined by "x". If "x" were somehow a continuous value (or a discrete value in an ordered continuum), and "f" were a continuous function, then we'd have the ultimate live programming environment: programmers simply move "x" around in its space according to how "f(x)" approaches or diverges from the desired output.
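
To make the scrubbing half of that concrete, here is a tiny sketch (hypothetical, with "x" reduced to a single number rather than a real program): nudge x, re-evaluate f, and immediately show how far f(x) is from the desired output.

#include <cstdio>

// Stand-in for "f": the interpreter of the (here purely numeric) "program" x.
double f(double x) { return x * x; }

int main() {
    const double target = 2.0;   // the output the programmer is steering toward
    double x = 1.0;              // the "program" being scrubbed

    // Continuous feedback: every small nudge of x immediately shows the new error.
    for (int step = 0; step < 10; ++step) {
        double error = f(x) - target;
        std::printf("x = %.4f   f(x) = %.4f   error = %+.4f\n", x, f(x), error);
        x -= 0.1 * error;        // move x a little in the direction that shrinks the error
    }
}

The whole point of the post, of course, is that real code is nothing like this single scalar x, which is where the feedback loop breaks down.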

Of course, this isn't possible in general: code is hardly continuous in its value or interpretation. For small fragments of code, we can totally do this: e.g. scrubbing the position or color of a shape in a program. We can even scrub functions to a certain extent (e.g. shapes or colors in APX), but it is all quite limited to simple graphical/physical logic. Accordingly, most of the successful demos of live programming are simple UI examples. I wonder also if this is the real bottleneck that visual programming runs into: that it is unable to provide reasonable feedback for many kinds of programming tasks, not because of (or at least, not just because of) real estate issues, but because the abstractions used are not amenable to continuous feedback.

The way forward is clear but perhaps difficult: we need to design programming languages and abstractions that are more continuous. Not just for graphical/physical computations, but for everything. One vague idea here is to distinguish between abstraction selection, a discrete non-continuous process, and abstraction configuration, which should be made as continuous as possible. Any other thoughts?

hashing facts (the secret of nnimh)

My intention here is to start a topic on "entertaining hashing facts", in case a programming language decides to build some particular hash algorithm into a module with persistent impact (in a manner obligating backwards compatibility). In the spirit of the proverb "measure twice and cut once", such decisions ought to be informed by reliable facts instead of guesses. I wonder if others know things useful in planning PL effects.

It's possible there's not much interesting to say, in which case the discussion goes nowhere. That's okay. But here's an example of what I mean. A few months ago I was thinking about putting a checksum in a persistent file format, where it was going to be important that a PL portably knew how to do the right checksum. The 32-bit checksum used by gzip has a polynomial specified by RFC, which appears in the penultimate four bytes of a gzip file format, in little endian order (followed by uncompressed length in the final four bytes, also in little endian order). While you can just use crc32() from the zlib library for this, what if you didn't, perhaps because a 'better' version is dictated by local code rules? How can you informally check the right checksum is generated for a standard input?

I wrote a search over four- and five-letter lowercase sequences, followed by a newline, to find crc checksums with a memorable-looking hex pattern, by printing those with patterns like anagrams in hex nibbles. I was really lucky, because I found an input whose hash is the same hex digit repeated eight times, so big- and little-endian order look the same.

% echo nnimh > nnimh.txt
% hexdump nnimh.txt
0000000 6e 6e 69 6d 68 0a                              
% gzip nnimh.txt 
% hexdump nnimh.txt.gz 
0000000 1f 8b 08 08 80 e2 cb 55 00 03 6e 6e 69 6d 68 2e
0000010 74 78 74 00 cb cb cb cc cd e0 02 00 ee ee ee ee
0000020 06 00 00 00                                    

As you can see, the value of crc32(0, "nnimh\n", 6) is equal to 0xeeeeeeee, and this is easy to check on a command line when an environment has ready access to gzip and hexdump. As a secondary bonus, to me the pattern has a mnemonic, because it reminds me of a Disney movie, The Secret of NIMH.
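
If zlib is at hand, the same check can also be scripted rather than eyeballed in a hexdump. A small C++ sketch (the expected constant is simply the value read out of the gzip trailer above; build with -lz):

#include <zlib.h>
#include <cstdio>

int main() {
    const char data[] = "nnimh\n";
    // Same crc32() call zlib itself uses when writing the gzip trailer.
    uLong crc = crc32(0L, reinterpret_cast<const Bytef*>(data), 6);
    std::printf("crc32(\"nnimh\\n\") = 0x%08lx\n", static_cast<unsigned long>(crc));
    // 0xeeeeeeee is the value shown in the hexdump above.
    return crc == 0xeeeeeeeeUL ? 0 : 1;
}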

Maybe discussing hashes and checksums is a dumb idea, in which case I apologize for bringing it up. I guess I'm hoping folks will avoid an argument about the fastest or most effective hash, which are things easy to measure empirically if you write a careful test. (For example, in a hashmap resolving collision by probing, a hash is best when unused slots are distributed uniformly, which you can measure as standard deviation of distance between non-adjacent empty buckets. I don't think intuitive opinion has a place in arguments based on empirical tests.)
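
As an aside, here is one reading of that measurement as a sketch (hypothetical; a real test would drive it from the actual probing hashmap under study, and the "non-adjacent" qualifier might mean collapsing runs of adjacent empty slots first):

#include <cmath>
#include <cstdio>
#include <vector>

// Standard deviation of the gaps between consecutive empty buckets in an
// open-addressing table: the more uniform the gaps, the more evenly the
// hash spreads keys across the table.
double emptyGapStddev(const std::vector<bool>& occupied) {
    std::vector<double> gaps;
    bool seenEmpty = false;
    size_t prev = 0;
    for (size_t i = 0; i < occupied.size(); ++i) {
        if (occupied[i]) continue;
        if (seenEmpty) gaps.push_back(static_cast<double>(i - prev));
        prev = i;
        seenEmpty = true;
    }
    if (gaps.size() < 2) return 0.0;
    double mean = 0.0;
    for (double g : gaps) mean += g;
    mean /= gaps.size();
    double var = 0.0;
    for (double g : gaps) var += (g - mean) * (g - mean);
    return std::sqrt(var / gaps.size());
}

int main() {
    // Toy occupancy map of a 12-slot table after some insertions.
    std::vector<bool> table = {true, false, true, true, false, true,
                               false, false, true, false, true, true};
    std::printf("empty-gap stddev = %.3f\n", emptyGapStddev(table));
}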

Parsing@SLE 2015: Call for talk proposals


Third Workshop on Parsing@SLE 2015

October 25, 2015
Pittsburgh, Pennsylvania, United States
Co-located with SPLASH 2015: OOPSLA, Onward!, SLE and GPCE
Deadline for talk proposals: August 31, 2015

The goal of this workshop is to bring together today's experts in the fields of parser construction and parser application from across the diverse application areas. Participants will present ongoing work as well as explore the challenges that lie ahead. By bringing the whole community together (a rare occurrence, given the diversity of domain-specific conferences/workshops), we hope to have a wide-ranging collection of talks on parsing-related topics, and forge new collaborations.

*** Topics ***

While parsing and parser generation, both in theory and in practice, are mature topics, there are still many challenging problems with respect to the construction, maintenance, optimization, and application of parsers in real world scenarios.

Especially in the context of real programming languages there are ample theoretical as well as practical obstacles to be overcome. Contemporary parsing challenges are caused by programming-language evolution and diversity in the face of new application areas such as IDE construction, reverse engineering, software metrics, domain specific (embedded) languages, etc. What are modular formalisms for parser generation? How to obtain (fast and correct) parsers for both legacy and new languages that require more computational power than context-free grammars and regular expressions can provide? How to use increasing parallelism offered by multi-cores and GPUs in parsers? How to enable the verified construction or prototyping of parsers for languages such as COBOL, C++ and Scala without years of effort?

In addition to the traditional programming-language applications of parsing technology, several other areas of computing also depend heavily on parsers. Examples include computational linguistics, network traffic classification, network security, and bioinformatics. Those areas often have their own unusual requirements, such as: speed (e.g. in network algorithmics), memory efficiency (e.g. embedded devices for networks, but also computational linguistics), or rapid/dynamic parser construction (e.g. in network traffic classification and in bioinformatics) as grammars are adapted. We encourage talk proposals on parsing challenges and solutions in such non-traditional areas as well.

*** Call for Submissions ***

We solicit talk proposals in the form of short abstracts (max. 2 pages in ACM 2-column format). A good talk proposal describes an interesting position, demonstration, or early achievement. The submissions will be reviewed on relevance and clarity, and used to plan the mostly interactive sessions of the workshop day. Parsing@SLE is not a publication venue. Publication of accepted abstracts and slides on the website is voluntary.

* Deadline for talk proposals: Monday August 31, 2015
* Notification: Monday September 7, 2015
* Workshop: Sunday October 25, 2015
* Submission website:

*** Previous Editions ***

Parsing@SLE 2013:
Parsing@SLE 2014:

*** Organizers ***

* Loek Cleophas, Umeå University, Sweden; and FASTAR, Stellenbosch University, South Africa
* Ali Afroozeh, CWI research institute for mathematics and computer science in the Netherlands

Don't use "Yield" for co-routines; instead use "Postpone"

A pattern followed by "↑" and a type can be used to match the pattern against something that has been extended from the type. For example,

PostponedFringe.[aTree:Tree]:FutureList◅String▻  ≡ 
   aTree � 
       aLeaf↑Leaf ⦂ [aLeaf.getString[ ]] ⍌
       aFork↑Fork ⦂ [⩛Fringe.[aFork.getLeft[ ]],  ⩛Postpone Fringe.[aFork.getRight[ ]]]

For example, ["The", "boy"]▮ is equivalent to the following:
PostponedFringe.[Fork.[Leaf.["The"], Leaf.["boy"]]]▮

The procedure PostponedFringe can be used to define SameFringe?, which determines if two trees have the same fringe [Hewitt 1972]:

   SameFringe?.[aTree:Tree, anotherTree:Tree]:Boolean ≡  //  test if two trees have the same fringe
               PostponedFringe.[aTree] = PostponedFringe.[anotherTree]▮

For example, [False, True]▮ is equivalent to the following:

Let aList ← PostponedFringe.[Fork.[Leaf.["The"], Leaf.["boy"]]]。    // aList = ["The", "boy"]
  [SameFringe?.[aList, ["The", "girl"]],             // False
   SameFringe?.[aList, ["The", "boy"]]]▮           // True

Note that Leaf can be defined as an extension of Tree:

Actor Leaf.[aString:String] extension Tree using
        getString[ ]:String → aString▮

and Fork can be defined as an extension of Tree:

Actor Fork.[aLeft:Tree, aRight:Tree] extension Tree using
        getLeft[ ]:Tree → aLeft,
        getRight[ ]:Tree → aRight▮
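
For comparison, here is a rough analogue of the same idea in a mainstream language (C++, not ActorScript): the "postponed" remainder of each fringe is kept as an explicit stack of unvisited subtrees, so two fringes are compared leaf by leaf without flattening either tree up front.

#include <cassert>
#include <memory>
#include <stack>
#include <string>

// A binary tree whose leaves carry strings, mirroring Leaf and Fork above.
struct Tree {
    std::string leaf;                    // meaningful only for leaves
    std::unique_ptr<Tree> left, right;   // non-null only for forks
};

std::unique_ptr<Tree> makeLeaf(std::string s) {
    auto t = std::make_unique<Tree>();
    t->leaf = std::move(s);
    return t;
}

std::unique_ptr<Tree> makeFork(std::unique_ptr<Tree> l, std::unique_ptr<Tree> r) {
    auto t = std::make_unique<Tree>();
    t->left = std::move(l);
    t->right = std::move(r);
    return t;
}

// Walks a fringe lazily: the stack holds the postponed, not-yet-visited subtrees.
struct FringeWalker {
    std::stack<const Tree*> pending;
    explicit FringeWalker(const Tree* root) { if (root) pending.push(root); }
    const std::string* next() {          // nullptr once the fringe is exhausted
        while (!pending.empty()) {
            const Tree* t = pending.top();
            pending.pop();
            if (t->left) {               // a fork: postpone the right, descend the left
                pending.push(t->right.get());
                pending.push(t->left.get());
            } else {
                return &t->leaf;         // a leaf: yield it
            }
        }
        return nullptr;
    }
};

bool sameFringe(const Tree* a, const Tree* b) {
    FringeWalker wa(a), wb(b);
    for (;;) {
        const std::string* x = wa.next();
        const std::string* y = wb.next();
        if (!x || !y) return x == y;     // equal only if both fringes end together
        if (*x != *y) return false;      // mismatching leaves: stop without walking further
    }
}

int main() {
    auto t1 = makeFork(makeLeaf("The"), makeFork(makeLeaf("boy"), makeLeaf("runs")));
    auto t2 = makeFork(makeFork(makeLeaf("The"), makeLeaf("boy")), makeLeaf("runs"));
    auto t3 = makeFork(makeLeaf("The"), makeLeaf("girl"));
    assert(sameFringe(t1.get(), t2.get()));   // same fringe, different shape
    assert(!sameFringe(t1.get(), t3.get()));
}

The explicit stack plays the same role as Postpone: the right subtree of each Fork is not walked until the comparison actually needs its leaves.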

Edited for clarity.

RAII and Async Stackless Fibers

I like RAII for exception-safe programming, and as a general model for resource management. In another thread it was mentioned that this does not work well with an asynchronous stackless fiber model (where stackless might mean cactus stacks). As I also like the fiber model for managing concurrency, because you don't need to have threads waiting on high-latency services, I wanted to continue that discussion here, focusing on: is RAII really a problem with this kind of fiber? What is the core problem? How could a new language combine these features in a way that avoids the problem (if the problem is a significant one)?

I don't really see what the problem is: in simple terms, RAII attaches a resource (be it memory, a file handle, or anything else) to a handle that is placed on a stack. The resource is allocated when the handle is created, and freed when the handle is destroyed. The handle is uncopyable, so it can only be transferred using move semantics. Naively this would seem to work fine with cactus stacks.
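
To make the handle described above concrete, here is a small C++ sketch (a FILE* is just a stand-in for any resource): acquisition in the constructor, release in the destructor, copying deleted, ownership transferred only by move.

#include <cstdio>
#include <stdexcept>

// A move-only RAII handle: the resource lives exactly as long as its handle.
class File {
public:
    explicit File(const char* path) : f_(std::fopen(path, "r")) {
        if (!f_) throw std::runtime_error("open failed");   // acquisition is initialization
    }
    ~File() { if (f_) std::fclose(f_); }                     // release is tied to destruction

    File(const File&) = delete;                              // uncopyable: one owner at a time
    File& operator=(const File&) = delete;

    File(File&& other) noexcept : f_(other.f_) { other.f_ = nullptr; }   // ownership moves
    File& operator=(File&& other) noexcept {
        if (this != &other) {
            if (f_) std::fclose(f_);
            f_ = other.f_;
            other.f_ = nullptr;
        }
        return *this;
    }

    std::FILE* get() const { return f_; }

private:
    std::FILE* f_;
};

The question from the thread then becomes: if the activation record that owns such a handle is a heap-allocated frame in a cactus stack rather than a slot on a conventional call stack, is there still a well-defined point at which the destructor runs, especially when the fiber is abandoned or fails?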

"Inconsistency Robustness" now available

Inconsistency robustness is information-system performance in the face of continually pervasive inconsistencies---a shift from the previously dominant paradigms of inconsistency denial and inconsistency elimination, which attempt to sweep inconsistencies under the rug. Inconsistency robustness is both an observed phenomenon and a desired feature:

• Inconsistency Robustness is an observed phenomenon because large information systems are required to operate in an environment of pervasive inconsistency.
• Inconsistency Robustness is a desired feature because we need to improve the performance of large information systems.

This volume contains revised versions of refereed articles and panel summaries from the first two International Symposia on Inconsistency Robustness, conducted under the auspices of the International Society for Inconsistency Robustness (iRobust). The articles are broadly based on theory and practice, addressing fundamental issues in inconsistency robustness.

The field of Inconsistency Robustness aims to provide practical rigorous foundations for computer information systems dealing with pervasively inconsistent information.

Please write a review here.

Branch Prediction and the Performance of Interpreters - Don’t Trust Folklore

The following paper has been posted on Hacker News and seems very relevant to the LtU audience, including its more practically-minded part.

Branch Prediction and the Performance of Interpreters - Don’t Trust Folklore
Erven Rohou, Bharath Narasimha Swamy, André Seznec

Interpreters have been used in many contexts. They provide portability and ease of development at the expense of performance. The literature of the past decade covers analysis of why interpreters are slow, and many software techniques to improve them. A large proportion of these works focuses on the dispatch loop, and in particular on the implementation of the switch statement: typically an indirect branch instruction. Folklore attributes a significant penalty to this branch, due to its high misprediction rate. We revisit this assumption, considering state-of-the-art branch predictors and the three most recent Intel processor generations on current interpreters. Using both hardware counters on Haswell, the latest Intel processor generation, and simulation of the ITTAGE, we show that the accuracy of indirect branch prediction is no longer critical for interpreters. We further compare the characteristics of these interpreters and analyze why the indirect branch is less important than before.

The paper measures the accuracy of branch predictors on the indirect branch found in "naive" instruction-decoding loops of bytecode interpreters. Experiments are performed with CPython, JavaScript (SpiderMonkey) and CLI (.NET) on the interpreter side, and recent Intel processors as well as the authors' own state-of-the-art predictor (simulated) on the predictor side. The conclusion is that, assuming reasonably sized predictor state, current prediction schemes fare very well on the dreaded indirect branch of while/switch interpreters. Reading between the lines, this makes well-known dispatch optimization techniques (e.g., jump threading) somewhat obsolete, at least on top-of-the-line Intel processors.
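
For readers who have not looked inside such an interpreter recently, the branch in question is the one generated for the switch in a loop like this toy sketch (not one of the interpreters measured in the paper):

#include <cstdint>
#include <cstdio>
#include <vector>

// A toy stack-machine interpreter. The switch below typically compiles to a
// single indirect jump, executed once per dispatched opcode: the branch whose
// prediction accuracy the paper measures.
enum Op : std::uint8_t { PUSH, ADD, MUL, PRINT, HALT };

void run(const std::vector<std::uint8_t>& code) {
    std::vector<std::int64_t> stack;
    std::size_t pc = 0;
    for (;;) {
        std::uint8_t op = code[pc++];
        switch (op) {                                 // the indirect branch lives here
        case PUSH: stack.push_back(code[pc++]); break;
        case ADD: { std::int64_t b = stack.back(); stack.pop_back(); stack.back() += b; break; }
        case MUL: { std::int64_t b = stack.back(); stack.pop_back(); stack.back() *= b; break; }
        case PRINT: std::printf("%lld\n", static_cast<long long>(stack.back())); break;
        case HALT: return;
        }
    }
}

int main() {
    run({PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, PRINT, HALT});   // prints 20 = (2 + 3) * 4
}

Jump-threaded interpreters replicate that dispatch at the end of every handler so the predictor sees a separate branch per opcode pair; the paper's claim is that on recent predictors the plain loop above already predicts well enough that this buys little.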

I, for one, would like to see more work in this style, with hardware people bringing their expertise and experimental rigor to low-level PL implementation issues.

Also, I suspect that the PL implementation community would be interested in additional results, such as other interpreters and less aggressive microarchitectures (ARM processors, anyone?).
