Programming and Scaling

Programming and Scaling, a one-hour lecture by Alan Kay at his finest (and that's saying something!)

Some of my favorite quotes:

  • "The biggest problem we have as human beings is that we confuse our beliefs with reality."
  • "We could imagine taking the internet as a model for doing software modules. Why don't people do it?" (~00:17)
  • "One of the mistakes that we made years ago is that we made objects too small." (~00:26)
  • "Knowledge in many cases trumps IQ. [Henry] Ford was powerful because Isaac Newton changed the way we think." (~00:28)
  • "Knowledge is silver. Outlook is gold. IQ is a lead weight." (~00:30)
  • "Whatever we [in computing] do is more like what the Egyptians did. Building pyramids, piling things on top of each other."
  • "The ability to make science and engineering harmonize with each other - there's no greater music." (~00:47)

And there are some other nice ideas in there: "Model-T-Shirt Programming" - software the definition of which fits on a T-shirt. And imagining source code sizes in terms of books: 20,000 LOC = a 400-page book. A million LOC = a stack of books one meter high. (Windows Vista: a 140m stack of books.)
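
The book arithmetic is easy to check. Here is a rough sketch that takes only the two ratios quoted above (20,000 LOC per 400-page book, one million LOC per meter of stacked books) and derives the rest from them. The function names are made up for illustration, and the line count it yields for a 140m stack is only what those ratios imply, not a figure from the talk.

```python
# Ratios as quoted in the post; everything else is derived from them.
LOC_PER_BOOK = 20_000        # one 400-page book
PAGES_PER_BOOK = 400
LOC_PER_METER = 1_000_000    # a one-meter stack of books


def lines_per_page() -> float:
    """Lines of code on one page, under the quoted book size."""
    return LOC_PER_BOOK / PAGES_PER_BOOK


def books_per_meter() -> float:
    """How many 400-page books make up a one-meter stack."""
    return LOC_PER_METER / LOC_PER_BOOK


def stack_height_m(loc: int) -> float:
    """Height in meters of the stack of books holding `loc` lines."""
    return loc / LOC_PER_METER


def loc_for_stack(height_m: float) -> int:
    """Lines of code implied by a stack of the given height."""
    return round(height_m * LOC_PER_METER)
```

Under these assumptions a page holds 50 lines, a meter of shelf holds 50 books (about 2 cm per book), and a 140m stack corresponds to 140 million lines.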

Note: this is a Flash video; other formats are available.

Rob Pike: Public Static Void

Rob Pike's talk about the motivation for Go is rather fun, but doesn't really break new ground. Most of what he says has been said here many times, from the criticism of the verbosity of C++ and Java to the skepticism about dynamic typing. Some small details are perhaps worth arguing with, but in general Pike is one of the good guys -- it's all motherhood and apple pie.

So why mention this at all (especially since it is not even breaking news)? Well, what caught my attention was the brief reconstruction of history that Pike presents. While he is perfectly honest about not being interested in history, and merely giving his personal impressions, the description is typical. What bugs me, particularly given the context of this talk, is that the history is totally sanitized. It's the "history of ideas" in the bad sense of the term -- nothing about interests (commercial and otherwise), incentives, marketing, social power, path dependence, anything. Since we had a few discussions recently about the historiography of the field, I thought I'd bring this up (the point is not to target Pike's talk in particular).

Now, when you think about Java, for example, it is very clear that the language didn't simply take over because of the reasons Pike marshals. Adoption is itself a process, and one that is worth thinking about. More to the point, I think, is that Java was (a) energetically marketed; and (b) essentially a commercial venture, aimed at furthering the interests of a company (that is no longer with us...). Somehow I think all this is directly relevant to Go. But of course, it is hard to see Go achieving the success of Java.

All this is to say that history is not just "we had a language that did x well, but not y, so we came up with a new language, that did y but z only marginally, so now we are building Go (which compiles real fast, you know) etc. etc."

Or put differently, those who do not know history are doomed to repeat it (or some variation of this cliche that is more authentic). Or does this not hold when it comes to PLs?

Passing a Language through the Eye of a Needle

Roberto Ierusalimschy, Luiz Henrique de Figueiredo, and Waldemar Celes, "Passing a Language through the Eye of a Needle: How the embeddability of Lua impacted its design", ACM Queue vol. 9, no. 5, May 2011.

A key feature of a scripting language is its ability to integrate with a system language. This integration takes two main forms: extending and embedding. In the first form, you extend the scripting language with libraries and functions written in the system language and write your main program in the scripting language. In the second form, you embed the scripting language in a host program (written in the system language) so that the host can run scripts and call functions defined in the scripts; the main program is the host program.
In this article we discuss how embeddability can impact the design of a language, and in particular how it impacted the design of Lua from day one. Lua is a scripting language with a particularly strong emphasis on embeddability. It has been embedded in a wide range of applications and is a leading language for scripting games.

An interesting discussion of some of the considerations that go into supporting embeddability. The design of a language clearly has an influence over the API it supports, but conversely the design of an API can have a lot of influence over the design of the language.
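
The two directions are easy to confuse, so here is a toy sketch of both. It is only an analogy: Python's `exec()` stands in for the embedded interpreter, and `ScriptEngine`, `host_distance`, and `on_move` are hypothetical names invented for this example. It does not show the Lua C API.

```python
# Toy stand-in for a scripting engine: Python's exec() plays the role of
# the embedded interpreter. This is NOT the Lua C API, only an analogy.

class ScriptEngine:
    """Hypothetical host-side wrapper around an embedded script environment."""

    def __init__(self) -> None:
        # The script's global environment, owned by the host.
        # Empty builtins keep the toy environment small; this is not a sandbox.
        self.env: dict = {"__builtins__": {}}

    def register(self, name: str, func) -> None:
        """Extending: expose a host function to scripts under `name`."""
        self.env[name] = func

    def run(self, source: str) -> None:
        """Embedding: the host runs a chunk of script code."""
        exec(source, self.env)

    def call(self, name: str, *args):
        """Embedding: the host calls a function that a script defined."""
        return self.env[name](*args)


def host_distance(x: float, y: float) -> float:
    """A 'system language' function that the host makes available to scripts."""
    return (x * x + y * y) ** 0.5


SCRIPT = """
def on_move(x, y):
    # The script calls back into the host-provided function.
    return host_distance(x, y) > 5
"""

engine = ScriptEngine()
engine.register("host_distance", host_distance)  # extending
engine.run(SCRIPT)                                # embedding: load the script
far_away = engine.call("on_move", 6, 8)           # embedding: host calls script
```

Even in this toy, the shape of the API (`register`, `run`, `call`) dictates what the script language must provide: a global environment the host can read and write, and functions that are ordinary values. That is the kind of pressure from API to language design the article describes.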

Memory Models: A Case for Rethinking Parallel Languages and Hardware, CACM, August 2010

Memory Models: A Case for Rethinking Parallel Languages and Hardware by Sarita V. Adve and Hans-J. Boehm

This is a pre-print of the actual version.

The era of parallel computing for the masses is here, but writing correct parallel programs remains far more difficult than writing sequential programs. Aside from a few domains, most parallel programs are written using a shared-memory approach. The memory model, which specifies the meaning of shared variables, is at the heart of this programming model. Unfortunately, it has involved a tradeoff between programmability and performance, and has arguably been one of the most challenging and contentious areas in both hardware architecture and programming language specification. Recent broad community-scale efforts have finally led to a convergence in this debate, with popular languages such as Java and C++ and most hardware vendors publishing compatible memory model specifications. Although this convergence is a dramatic improvement, it has exposed fundamental shortcomings in current popular languages and systems that prevent achieving the vision of structured and safe parallel programming.

This paper discusses the path to the above convergence, the hard lessons learned, and their implications. A cornerstone of this convergence has been the view that the memory model should be a contract between the programmer and the system - if the programmer writes disciplined (data-race-free) programs, the system will provide high programmability (sequential consistency) and performance. We discuss why this view is the best we can do with current popular languages, and why it is inadequate moving forward. We then discuss research directions that eliminate much of the concern about the memory model, but require rethinking popular parallel languages and hardware. In particular, we argue that parallel languages should not only promote high-level disciplined models, but they should also enforce the discipline. Further, for scalable and efficient performance, hardware should be co-designed to take advantage of and support such disciplined models. The inadequacies of the state-of-the-art and the research agenda we outline have deep implications for the practice, research, and teaching of many computer science sub-disciplines, spanning theory, software, and hardware.
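
The contract is easiest to see in miniature. Below is a minimal sketch in Python, which is not one of the languages the paper discusses and has its own runtime rules; it is used here only for brevity. The `Counter` class and `run_workers` function are hypothetical names. Every access to the shared value is ordered either by the lock or by `join()`, so the program is data-race-free and can be reasoned about as a simple interleaving of increments.

```python
import threading


class Counter:
    """Shared counter whose every update is protected by one lock."""

    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # The lock orders all conflicting accesses to `value`,
        # which is what makes the program data-race-free.
        with self._lock:
            self.value += 1


def run_workers(n_threads: int = 4, n_increments: int = 10_000) -> int:
    """Run several threads that all increment one shared counter."""
    counter = Counter()

    def work() -> None:
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # join orders the workers' writes before the final read
    return counter.value
```

With the defaults, `run_workers()` returns 40000 on every run. Note that nothing stops other code from writing `counter.value` without taking the lock; the discipline is a convention, not something the language enforces, which is exactly the gap the authors want future languages to close.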

The IO Monad is 45 years old

Oleg Kiselyov sent a message to haskell-cafe today titled "The IO Monad is 45 years old". I thought LtU readers might like this.

Pure and Declarative Syntax Definition: Paradise Lost and Regained, Onward 2010

Pure and Declarative Syntax Definition: Paradise Lost and Regained by Lennart C. L. Kats, Eelco Visser, Guido Wachsmuth from Delft

Syntax definitions are pervasive in modern software systems, and serve as the basis for language processing tools like parsers and compilers. Mainstream parser generators pose restrictions on syntax definitions that follow from their implementation algorithm. They hamper evolution, maintainability, and compositionality of syntax definitions. The pureness and declarativity of syntax definitions is lost. We analyze how these problems arise for different aspects of syntax definitions, discuss their consequences for language engineers, and show how the pure and declarative nature of syntax definitions can be regained.

I haven't compared this version with the Onward 2010 version, but they look essentially the same. It seems timely to post this paper, considering the other recent story Yacc is dead. There is not a whole lot to argue against in this paper, since we all "know" the other approaches aren't as elegant and only resort to them for specific reasons such as efficiency. Yet, this is the first paper I know of that tries to state the argument to software engineers.

For example, the Dragon Book, in every single edition, effectively brushes these topics aside. In particular, the Dragon Book does not even mention scannerless parsing as a technique, and instead only explains the "advantages" of using a scanner. Unfortunately, the authors of this paper don't consider other design proposals, either, such as Van Wyk's context-aware scanners from GPCE 2007. It is examples like these that made me wish the paper were a bit more robust in its analysis; the examples seem focused on the authors' previous work.

If you are not familiar with the authors' previous work in this area, the paper covers it in the references. It includes Martin Bravenboer's work on modular Eclipse IDE support for AspectJ.

The Triumph of Types: Principia Mathematica's Impact on Computer Science

The Triumph of Types: Principia Mathematica's Impact on Computer Science. Robert L. Constable

The role the ideas of Principia Mathematica played in type theory in programming languages is often alluded to in our discussions, making this contribution to a meeting celebrating the hundredth anniversary of Whitehead-and-Russell's opus provocative.

To get your juices going, here is a quote from page 3:

...I will discuss later our efforts at Cornell to create one such type theory, Computational Type Theory (CTT), very closely related to two others, the Calculus of Inductive Constructions (CIC) implemented in the Coq prover and widely used, and Intuitionistic Type Theory (ITT) implemented in the Alf and Agda provers. All three of these efforts, but especially CTT and ITT, were strongly influenced by Principia and the work of Bishop presented in his book Foundations of Constructive Analysis.

The Resurgence of Parallelism

Peter J. Denning and Jack B. Dennis, The Resurgence of Parallelism, Communications of the ACM, Vol. 53 No. 6, Pages 30-32, 10.1145/1743546.1743560

"Multi-core chips are a new paradigm!" "We are entering the age of parallelism!" These are today's faddish rallying cries for new lines of research and commercial development. ... The parallel architecture research of the 1960s and 1970s solved many problems that are being encountered today. Our objective in this column is to recall the most important of these results and urge their resurrection.

A brief but timely reminder that we should avoid reinventing the wheel. Denning and Dennis give a nice capsule summary of the history of parallel computing research, and highlight some of the key ideas that came out of earlier research on parallel computing. This isn't a technically deep article. But it gives a quick overview of the field, and tries to identify some of the things that actually are research challenges rather than problems for which the solutions have seemingly been forgotten.

Algol 58/60

Paul McJones has been curating the ALGOL section of the Software Preservation Group. He notes:

I recently created an ALGOL section at the Computer History Museum’s Software Preservation Group web site, covering the language standardization efforts — for ALGOL 58 (also known as the International Algebraic Language), ALGOL 60, and ALGOL 68 — and also covering many implementations, dialects, and offshoots, complete with source code, manuals, and papers for many of these. The history of ALGOL has attracted many writers, and the final section of the web site links to many of their papers.

Also see his follow-up blog post about Whetstone ALGOL.

A Formal System For Euclid's Elements

A Formal System For Euclid's Elements, Jeremy Avigad, Edward Dean, and John Mumma. Review of Symbolic Logic, Vol. 2, No. 4, 2009.

Abstract. We present a formal system, E, which provides a faithful model of the proofs in Euclid’s Elements, including the use of diagrammatic reasoning.

Diagrammatic languages are a perennial favorite discussion topic here, and Euclid's proofs constitute one of the oldest diagrammatic languages around. And yet for hundreds of years (at least since Leibniz) people have argued about whether or not the diagrams are really part of a formal system of reasoning, or whether they are simply visual aids hanging on the side of the true proof. The latter position is the one that Hilbert and Tarski took as well when they gave formal axiomatic systems for geometry.

But was this necessary, or just a contingent fact of the logical machinery available to them? Avigad and his coauthors show that the former point of view also works, and that you can do it with very basic proof theory (there's little here unfamiliar to anyone who has read Pierce's book). Yet it sheds a lot of light on how the diagrams in the Elements work, in part because of their very careful analysis of how to read the diagrams -- that is, which conclusions a diagram really licenses you to draw, and which are accidents of the specific figure on the page. How they consider these issues is a good model for anyone designing their own visual programming languages.
