History

Algol 58/60

Paul McJones has been curating the ALGOL section of the Computer History Museum's Software Preservation Group website. He notes:

I recently created an ALGOL section at the Computer History Museum’s Software Preservation Group web site, covering the language standardization efforts — for ALGOL 58 (also known as the International Algebraic Language), ALGOL 60, and ALGOL 68 — and also covering many implementations, dialects, and offshoots, complete with source code, manuals, and papers for many of these. The history of ALGOL has attracted many writers, and the final section of the web site links to many of their papers.

Also see his follow-up blog post about Whetstone ALGOL.

A Formal System For Euclid's Elements

A Formal System For Euclid's Elements, Jeremy Avigad, Edward Dean, and John Mumma. Review of Symbolic Logic, Vol. 2, No. 4, 2009.

Abstract. We present a formal system, E, which provides a faithful model of the proofs in Euclid’s Elements, including the use of diagrammatic reasoning.

Diagrammatic languages are a perennial favorite discussion topic here, and Euclid's proofs constitute one of the oldest diagrammatic languages around. And yet for hundreds of years (at least since Leibniz) people have argued about whether the diagrams are really part of a formal system of reasoning, or merely visual aids hanging off the side of the true proof. The latter is the position Hilbert and Tarski took when they gave their formal axiomatic systems for geometry.

But was this necessary, or just a contingent fact of the logical machinery available to them? Avigad and his coauthors show that the former point of view works too, and that you can do it with very basic proof theory (there's little here unfamiliar to anyone who has read Pierce's book). The paper also sheds a lot of light on how the diagrams in the Elements work, in part through its very careful analysis of how to read them -- that is, which conclusions a diagram really licenses you to draw, and which are accidents of the specific figure on the page. The way the authors work through these issues is a good model for anyone designing a visual programming language.

Google TechTalk: The Evolution of End-User Programming

End-user programming has been a topic of discussion lately in mainstream software outlets. The journal IEEE Software recently had an issue dedicated to end-user programming challenges; see Joel Brandt's Opportunistic Programming: Writing Code to Prototype, Ideate and Discover and Martin Erwig's Software Engineering for Spreadsheets. Also, a few years ago a consortium of universities formed End-Users Shaping Effective Software, which includes Martin Erwig's PLT work on bringing type systems to spreadsheets.

Recently, Google invited Allen Cypher to give a TechTalk on The Evolution of End-User Programming, which appears to be a recapitulation of his VL/HCC paper by the same name. Allen was the editor of Watch What I Do (an LtU recommended reading).

Towards the end of the talk, Allen mentions the practical issue of knowing when to use which tool, and notes that novice users struggle to find the right tool for the job. What's notable about discussions of end-user software engineering is how little attention its proponents pay to their critics' biggest objection: security. In the IEEE Software realm, probably the most outspoken critic has been Warren Harrison (see The Dangers of End-User Programming). For example, Ko's 2009 ACM Computing Surveys article The State of the Art in End-User Software Engineering mentions security only once, in the context of designing end-user description languages for security, and does not assess how well that technique compares with the techniques software engineers might employ. It seems strange that leading researchers in visual languages and end-user programming do not discuss the potential of object capability systems, especially as companies try to monetize a percentage of the value added by users who mash up their services with other services.
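
For concreteness, here is a minimal sketch of the object-capability style in Python. It is my own illustration, not code from any of the works cited above: authority is conveyed only by object references, so an end-user mashup can be handed an attenuated facet of a service rather than ambient access to the whole thing.

    class Cell:
        """A mutable service; holding a reference to it confers full authority."""
        def __init__(self, value):
            self._value = value
        def get(self):
            return self._value
        def set(self, value):
            self._value = value

    class ReadOnlyCell:
        """An attenuating facet: it grants read authority and nothing else."""
        def __init__(self, cell):
            self._cell = cell
        def get(self):
            return self._cell.get()

    cell = Cell(42)
    view = ReadOnlyCell(cell)   # hand the untrusted mashup only this facet
    print(view.get())           # 42; there is no view.set()

Attenuation holds by construction, with no access-control lists or runtime permission checks. (Python itself is not capability-safe -- view._cell is still reachable via reflection -- so treat this strictly as a sketch of the discipline.)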

Why Normalization Failed to Become the Ultimate Guide for Database Designers?

While trying to find marshall's claim that Alberto Mendelzon says the universal relation is an idea re-invented once every 3 years (and later finding a quote by Jeffrey Ullman that the universal relation is re-invented 3 times a year), I stumbled across a very provocative rant by a researcher/practitioner: Why Normalization Failed to Become the Ultimate Guide for Database Designers? by Martin Fotache. It shares a wealth of experience and knowledge about logical design. The author is obviously well-read and, unlike the usual debates I've seen on this topic, presents the argument thoroughly and comprehensively.

The abstract is:

With an impressive theoretical foundation, normalization was supposed to bring rigor and relevance into such a slippery domain as database design is. Almost every database textbook treats normalization in a certain extent, usually suggesting that the topic is so clear and consolidated that it does not deserve deeper discussions. But the reality is completely different. After more than three decades, normalization not only has lost much of its interest in the research papers, but also is still looking for practitioners to apply it effectively. Despite the vast amount of database literature, comprehensive books illustrating the application of normalization to effective real-world applications are still waited. This paper reflects the point of view of an Information Systems academic who incidentally has been for almost twenty years a practitioner in developing database applications. It outlines the main weaknesses of normalization and offers some explanations about the failure of a generous framework in becoming the so much needed universal guide for database designers. Practitioners might be interested in finding out (or confirming) some of the normalization misformulations, misinterpretations, inconsistencies and fallacies. Theorists could find useful the presentation of some issues where the normalization theory was proved to be inadequate, not relevant, or source of confusion.

The body of the paper presents an explanation for why practitioners have rejected normalization. The author also shares his opinions on potentially underexplored ideas, drawing on an obviously well-researched depth of knowledge. In recent years some researchers, such as Microsoft's Pat Helland, have even said Normalization is for sissies (only to follow this up with more formal publications, such as Building on Quicksand). Yet the PLT community is pushing in exactly the opposite direction: language theory is firmly rooted in formal grammars and proven-correct 'tricks' for manipulating and using those grammars, and it does no good to define a language that lacks mathematical properties ensuring reliability and repeatability of results. This is a real tension between systems practice and PLT.
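
To make the stakes concrete, here is a toy illustration (mine, not the paper's) of the update anomaly that normalization is meant to eliminate: when one table records two kinds of facts, a single real-world change must touch many rows, and missing one silently corrupts the data.

    # One unnormalized table mixing two facts: orders, and customer addresses.
    orders = [
        {"order": 1, "customer": "ACME", "address": "1 Main St", "item": "bolt"},
        {"order": 2, "customer": "ACME", "address": "1 Main St", "item": "nut"},
    ]
    orders[0]["address"] = "2 Oak Ave"  # ACME moves, but only row 0 gets updated
    assert orders[0]["address"] != orders[1]["address"]  # the data now disagrees

    # The normalized design records the customer -> address fact exactly once,
    # because the functional dependency customer -> address drives the split:
    customers = {"ACME": "2 Oak Ave"}
    orders_nf = [
        {"order": 1, "customer": "ACME", "item": "bolt"},
        {"order": 2, "customer": "ACME", "item": "nut"},
    ]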

I realize this paper focuses on methodologies for creating model primitives, comparing mathematical frameworks with frameworks guided by intuition and then mapped onto mathematical notions (relations, in the relational model), and some may not see it as PLT. Others, such as Date, closely relate the understanding of primitives to PLT: Date claims the SQL language is to blame and has gone to the length of creating a teaching language, Tutorial D, for teaching relational theory. In my experience, nothing affects the lines of code in an enterprise system more than schema design, both in the data layer and the logic layer, and often an inverse relationship exists between the two; hence the use of object-relational mapping layers to consolidate the inevitable problems where there will be The Many Forms of a Single Fact (Kent, 1988). Mapping stabilizes the problem domain by labeling the correspondences between all the possible unique structures. I refer to this among friends and coworkers as the N+1 Schema Problem: there is generally 1 schema thought to be canonical, either extensionally or intensionally, and N other versions of that schema.

Question: should interactive programming languages aid practitioners in reasoning about their bad data models (hand waving), perhaps by modeling each unique structure and explaining how the structures relate? I can see several reasons why that would be a bad idea, but as the above paper suggests, math is not always the best indicator of what practitioners will adopt. In many ways this seems to be the spirit of the idea behind work such as Stephen Kell's on approaching modularity by supporting evolutionary compatibility between APIs (source texts) and ABIs (binaries), covered in his Onward! paper The Mythical Matched Modules: Overcoming the Tyranny of Inflexible Software Construction. Similar ideas have been present in middleware systems for years under the name wrapper architectures (e.g., Don’t Scrap It, Wrap It!), but they haven't seen much PLT interest that I'm aware of; "middleware" might as well be a synonym for Kell's "integration domains" concept.
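
As a hedged sketch of the wrapper idea (my example, not code from the cited papers): the legacy component is kept untouched, and a thin layer adapts its vocabulary to the interface new clients are written against -- exactly the kind of labeled correspondence between schema versions described above.

    class LegacyOrderStore:
        """Untouched legacy component, with its old field names."""
        def fetch(self, order_id):
            return {"cust_nm": "ACME", "itm": "bolt"}

    class OrderService:
        """The interface new code sees; it wraps rather than rewrites."""
        def __init__(self, legacy):
            self._legacy = legacy
        def order(self, order_id):
            row = self._legacy.fetch(order_id)
            return {"customer": row["cust_nm"], "item": row["itm"]}

    print(OrderService(LegacyOrderStore()).order(1))
    # {'customer': 'ACME', 'item': 'bolt'}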

Back to the Future: Lisp as a Base for a Statistical Computing System

Back to the Future: Lisp as a Base for a Statistical Computing System by Ross Ihaka and Duncan Temple Lang, and the accompanying slides.

This paper was previously discussed on comp.lang.lisp, but apparently not covered on LtU before.

The application of cutting-edge statistical methodology is limited by the capabilities of the systems in which it is implemented. In particular, the limitations of R mean that applications developed there do not scale to the larger problems of interest in practice. We identify some of the limitations of the computational model of the R language that reduces its effectiveness for dealing with large data efficiently in the modern era.

We propose developing an R-like language on top of a Lisp-based engine for statistical computing that provides a paradigm for modern challenges and which leverages the work of a wider community. At its simplest, this provides a convenient, high-level language with support for compiling code to machine instructions for very significant improvements in computational performance. But we also propose to provide a framework which supports more computationally intensive approaches for dealing with large datasets and position ourselves for dealing with future directions in high-performance computing.

We discuss some of the trade-offs and describe our efforts to realizing this approach. More abstractly, we feel that it is important that our community explore more ambitious, experimental and risky research to explore computational innovation for modern data analyses.

Footnote:
Ross Ihaka co-developed the R statistical programming language with Robert Gentleman. For those unaware, R is effectively an open-source implementation of the S language, of which S-PLUS was the commercial implementation. R is more or less the lingua franca of statistics, and you can usually find R code in the back of many Springer-Verlag monographs.

Duncan Temple Lang is a core developer of R and has worked on the core engine for TIBCO's S-PLUS.

Thanks to LtU user bashyal for providing the links.

The Development of Sage

Sage is a project to create a viable free open source alternative to Magma, Maple, Mathematica and Matlab. The lead developer/manager William Stein has recently written Mathematical Software and Me: A Very Personal Recollection, a rather enjoyable story of his experience with mathematical software, especially Magma, and how Sage came to be.

One of the difficulties of writing broadly useful math software is the sheer size and scope of such a project. It is easily outside the abilities of even the most prodigious lone developer. So the focus of Sage, at least up until recently, has been on creating Python-based interfaces to existing mathematical software. For example, for symbolic calculation the Sage distribution includes Maxima (written in Common Lisp), a fork of Macsyma dating back to the early 1980s, and released as open-source software by the US Department of Energy approximately 10 years ago. In addition to Maxima, Sage includes the ability to call out to Magma, Mathematica, and Maple.
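
To give a flavor of that interface style, here is a hedged sketch of a Sage session (to be run inside Sage, not plain Python; the calls reflect my understanding of Sage's bundled pexpect wrappers):

    # Inside Sage, external systems appear as Python-callable interface objects.
    f = maxima('integrate(sin(x)^2, x)')  # symbolic work is delegated to Maxima
    print(f)                              # x/2 - sin(2*x)/4, up to formatting
    # The same pattern covers the optional commercial backends, for example
    # mathematica(...) or magma(...), when those systems are installed.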

There are some interesting PLT-related snippets: for example, Magma's language is frequently criticized even though its algorithms are widely praised. In conversations with others, OCaml and Haskell were brought up, but William Stein chose Python because he felt it was more accessible. Also, Axiom, which includes the dependently typed language Aldor, was rejected in favor of Maxima because Maxima was less esoteric and much more widely used.

Two Bits: The Cultural Significance of Free Software

Christopher Kelty's book, Two Bits: The Cultural Significance of Free Software, can be read online, and I think parts of it will interest many here.

It seems that programming languages, while mentioned, do not receive a lot of attention in this work. I would argue that they are a significant factor in the history that is being told, and an important resource for historians (though reading the history from the languages is not a trivial undertaking by any means).

Still, it seems like a very good discussion and well worth pursuing.

Edited to add: As Z-Bo mentions in the comments, the website of the book invites people to re-mix it (or "modulate" it). Motivated readers can thus add the relevant PL perspective, if they so wish.

On Understanding Data Abstraction, Revisited

One of the themes of Barbara Liskov's Turing Award lecture (“CS History 101”) was that nobody has invented a better programming concept than abstract data types. William Cook wrote a paper for OOPSLA '09 that looks at how well PLT'ers understand their own vocabulary, in particular abstract data types and the concepts that, on the syntactic surface, blur together so that everything seems like an ADT. The paper is On Understanding Data Abstraction, Revisited.

In 1985 Luca Cardelli and Peter Wegner, my advisor, published an ACM Computing Surveys paper called “On understanding types, data abstraction, and polymorphism”. Their work kicked off a flood of research on semantics and type theory for object-oriented programming, which continues to this day. Despite 25 years of research, there is still widespread confusion about the two forms of data abstraction, abstract data types and objects. This essay attempts to explain the differences and also why the differences matter.

The Introduction goes on to say:

What is the relationship between objects and abstract data types (ADTs)? I have asked this question to many groups of computer scientists over the last 20 years. I usually ask it at dinner, or over drinks. The typical response is a variant of “objects are a kind of abstract data type”. This response is consistent with most programming language textbooks.

[...]

So what is the point of asking this question? Everyone knows the answer. It’s in the textbooks.

[...]

My point is that the textbooks mentioned above are wrong! Objects and abstract data types are not the same thing, and neither one is a variation of the other.

Ergo, if the textbooks are wrong, then your Dinner Answer to (the) Cook is wrong! The rest of the paper explains how Cook makes computer scientists sing for their supper ;-)

When I’m inciting discussion of this topic over drinks, I don’t tell the full story up front. It is more fun to keep asking questions as the group explores the topic. It is a lively discussion, because most of these ideas are documented in the literature and all the basic facts are known. What is interesting is that the conclusions to be drawn from the facts are not as widely known.
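
Here is a compressed sketch of the distinction in Python (the set-as-characteristic-function encoding follows the paper's object example; the names and details are mine). An ADT has a single hidden representation that its operations may freely inspect; an object exposes only an interface, so independently written implementations interoperate.

    # ADT style: one private representation; a binary operation such as union
    # may look inside *both* arguments, because the implementation is unique.
    class IntSetADT:
        def __init__(self, elems=()):
            self._elems = frozenset(elems)
        def insert(self, n):
            return IntSetADT(self._elems | {n})
        def contains(self, n):
            return n in self._elems
        def union(self, other):
            return IntSetADT(self._elems | other._elems)  # peeks at other's rep

    # Object style: a set is *anything* that answers membership queries;
    # union can use only the interface, never a representation.
    def empty(n):
        return False
    def insert_obj(s, m):
        return lambda n: n == m or s(n)
    def union_obj(s, t):
        return lambda n: s(n) or t(n)

    evens = lambda n: n % 2 == 0            # a set no finite ADT value could be
    mixed = union_obj(insert_obj(empty, 3), evens)
    print(mixed(3), mixed(4), mixed(5))     # True True False

The trade-off Cook emphasizes falls out directly: the ADT union can exploit the shared representation, while the object version happily admits implementations (like evens) that no concrete data structure could enumerate, and lets them all interoperate.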

Liskov's list of papers

Ralph Johnson posted the list of papers that Liskov mentioned as having influenced her.

As good a place to start as any, I'd say.

Retrospective: An Axiomatic Basis for Computer Programming

Retrospective: An Axiomatic Basis for Computer Programming, by C.A.R. Hoare:

This month marks the 40th anniversary of the publication of the first article I wrote as an academic. I have been invited to give my personal view of the advances that have been made in the subject since then, and the further advances that remain to be made. Which of them did I expect, and which of them surprised me?

An interesting review of the history of computing. He has some nice perspectives on the complementarity of testing and formal methods, and how the growing cracking industry became an unexpected driving force behind industrial interest in verification.
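
For readers who have not gone back to the 1969 paper: its core is the triple notation, where {P} C {Q} asserts that if precondition P holds and command C terminates, then postcondition Q holds afterwards. The axiom of assignment, for instance, reads (in modern notation; Hoare's original wrote the command inside the braces, as P0 {x := f} P):

    { P[e/x] }  x := e  { P }

    e.g.  { x + 1 <= N }  x := x + 1  { x <= N }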
