Theory

On Understanding Data Abstraction, Revisited

One of the themes of Barbara Liskov's Turing Award lecture ("CS History 101") was that nobody has invented a better programming concept than abstract data types. William Cook wrote a paper for OOPSLA '09 that looks at how well PLT'ers understand their own vocabulary, in particular abstract data types and the concepts that, on the syntactic surface, all blend together to look like ADTs. The paper is On Understanding Data Abstraction, Revisited.

In 1985 Luca Cardelli and Peter Wegner, my advisor, published an ACM Computing Surveys paper called “On understanding types, data abstraction, and polymorphism”. Their work kicked off a flood of research on semantics and type theory for object-oriented programming, which continues to this day. Despite 25 years of research, there is still widespread confusion about the two forms of data abstraction, abstract data types and objects. This essay attempts to explain the differences and also why the differences matter.

The Introduction goes on to say:

What is the relationship between objects and abstract data types (ADTs)? I have asked this question to many groups of computer scientists over the last 20 years. I usually ask it at dinner, or over drinks. The typical response is a variant of “objects are a kind of abstract data type”. This response is consistent with most programming language textbooks.

[...]

So what is the point of asking this question? Everyone knows the answer. It’s in the textbooks.

[...]

My point is that the textbooks mentioned above are wrong! Objects and abstract data types are not the same thing, and neither one is a variation of the other.

Ergo, if the textbooks are wrong, then your Dinner Answer to (the) Cook is wrong! The rest of the paper explains how Cook makes computer scientists sing for their supper ;-)

When I’m inciting discussion of this topic over drinks, I don’t tell the full story up front. It is more fun to keep asking questions as the group explores the topic. It is a lively discussion, because most of these ideas are documented in the literature and all the basic facts are known. What is interesting is that the conclusions to be drawn from the facts are not as widely known.
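
For concreteness, here is a minimal Haskell rendering of the paper's running integer-set example; the names and details are mine, not Cook's code.

    -- ADT flavour: one hidden representation; an operation may inspect
    -- the representation of *both* of its arguments.  (In a real
    -- module the SetADT constructor would not be exported.)
    newtype SetADT = SetADT [Int]

    emptyA :: SetADT
    emptyA = SetADT []

    insertA :: Int -> SetADT -> SetADT
    insertA x (SetADT xs) = SetADT (x : xs)

    memberA :: Int -> SetADT -> Bool
    memberA x (SetADT xs) = x `elem` xs

    unionA :: SetADT -> SetADT -> SetADT    -- binary operations are easy:
    unionA (SetADT xs) (SetADT ys) = SetADT (xs ++ ys)  -- both reps visible

    -- Object flavour: a set *is* its observable behaviour, a record of
    -- methods.  Different implementations interoperate freely, but code
    -- can only reach another set through its interface.
    data SetObj = SetObj
      { memberO :: Int -> Bool
      , insertO :: Int -> SetObj
      }

    emptyO :: SetObj
    emptyO = SetObj { memberO = const False
                    , insertO = \x -> addO x emptyO }

    addO :: Int -> SetObj -> SetObj
    addO x s = self
      where self = SetObj { memberO = \y -> y == x || memberO s y
                          , insertO = \y -> addO y self }

    -- Something no list-backed ADT value can be: the infinite set of
    -- even numbers is a perfectly good object.
    evensO :: SetObj
    evensO = SetObj { memberO = even
                    , insertO = \x -> addO x evensO }

The asymmetry is the point: the ADT supports binary operations over a single hidden representation, while objects are autonomous, interoperating implementations of an interface; neither subsumes the other.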

A Type-theoretic Foundation for Programming with Higher-order Abstract Syntax and First-class Substitutions

A Type-theoretic Foundation for Programming with Higher-order Abstract Syntax and First-class Substitutions by Brigitte Pientka appeared in POPL '08.

Higher-order abstract syntax (HOAS) is a simple, powerful technique for implementing object languages, since it directly supports common and tricky routines dealing with variables, such as capture-avoiding substitution and renaming. This is achieved by representing binders in the object-language via binders in the meta-language. However, enriching functional programming languages with direct support for HOAS has been a major challenge, because recursion over HOAS encodings requires one to traverse λ-abstractions and necessitates programming with open objects.

We present a novel type-theoretic foundation based on contextual modal types which allows us to recursively analyze open terms via higher-order pattern matching. By design, variables occurring in open terms can never escape their scope. Using several examples, we demonstrate that our framework provides a name-safe foundation to operations typically found in nominal systems. In contrast to nominal systems however, we also support capture-avoiding substitution operations and even provide first-class substitutions to the programmer. The main contribution of this paper is a syntax directed bi-directional type system where we distinguish between the data language and the computation language together with the progress and preservation proof for our language.

It's been a while since I posted, but this paper looks like it may be of interest to some members of this community. It looks interesting to me, but I just wish I understood all the terminology. I don't know what "open objects" are, or why they are a problem. I don't understand what HOAS is. I don't even know what binders are. The list goes on. I surely can't be the only person who is interested but feels that this is just out of their grasp. I bet I could understand these things with a minimum of explanation, given that I have experience implementing languages. If anyone is interested in rephrasing the abstract in more basic terms, I would be very appreciative. [Edit: corrected spelling of Brigitte Pientka. My apologies.]
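
Update: as a rough first gloss on the vocabulary (pieced together from standard references, not from the paper's system), in Haskell:

    -- "Binders" are constructs like lambda that introduce variables.
    -- A first-order encoding names variables explicitly, so renaming
    -- and capture-avoiding substitution must be written by hand:
    data Tm = Var String | App Tm Tm | Lam String Tm

    -- Higher-order abstract syntax (HOAS) instead represents an
    -- object-level binder by a binder of the meta-language (here, a
    -- Haskell function), so substitution is just application and
    -- capture-avoidance comes for free:
    data Exp = EApp Exp Exp | ELam (Exp -> Exp)

    -- Beta-reduction by applying the meta-level function:
    beta :: Exp -> Exp
    beta (EApp (ELam f) a) = f a
    beta e                 = e

    -- An "open" term is one containing free variables, e.g. the body
    -- of a lambda encountered mid-recursion.  Under HOAS that body is
    -- locked inside a Haskell function, which is why recursing over
    -- HOAS encodings is the hard part the paper addresses.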

Perl Cannot Be Parsed: A Formal Proof

Perl Cannot Be Parsed: A Formal Proof via Perl Monks.

Elegantly proved via reduction from the Halting Problem. The crux: whether a line like "whatever / 25 ; # / ; die" parses as a division or as the start of a regex match depends on whether whatever was declared with an empty prototype, and that declaration can be made conditionally, at compile time, by arbitrary code in a BEGIN block, so a static parser would have to solve the Halting Problem.

MitchFest 2009: Symposium in Honor of Mitchell Wand

I'm pleased to announce that we are planning a celebration for Mitch Wand's 60th birthday!

From the MitchFest home page:

Northeastern University is hosting a special Symposium in celebration of Dr. Mitchell Wand's 60th birthday and honoring his pioneering work in the field of programming languages. For over 30 years Mitch has made important contributions to many areas of programming languages, including semantics, continuations, type theory, hygienic macros, compiler correctness, static analysis and formal verification.

After receiving his PhD from MIT in 1973, Mitch taught at Indiana University where he and colleague Daniel P. Friedman wrote the first edition of their seminal text, Essentials of Programming Languages. Described by a reviewer as "so influential that the initials EOPL are a widely understood shorthand," the text is now in its third edition from MIT Press. Mitch joined the faculty of Northeastern University in 1985 and has been a leader in the College of Computer and Information Science. In 2007, Mitch was inducted as a Fellow of the Association for Computing Machinery.

Please join us at Northeastern on August 23rd and 24th as we celebrate this personal milestone and pay tribute to a great computer scientist, researcher, teacher and colleague, Dr. Mitchell (Mitch) Wand.

LtU regulars will recall that we've discussed DanFest 2004 here before, as well as the talk videos.

MitchFest is open to the public and coordinated with Scheme Workshop 2009, which will be at MIT on August 22nd (the same weekend). More event information, including registration, is available on the MitchFest home page. Following the Symposium, we will be publishing a special edition of HOSC as a Festschrift in honor of Mitch.

We will post a schedule on the web site soon, but for now you can view the preliminary list of papers in the Call for Participation.

Update: added link to HOSC.

Lectures on Jacques Herbrand as a Logician

Wirth, Siekmann, Benzmueller & Autexier. Lectures on Jacques Herbrand as a Logician. SEKI Report SR-2009-01.

Herbrand's work, more than that of anyone else, provides the intellectual foundations of logic programming. This very readable article discusses Herbrand's contributions to proof theory and to the formulation of the idea of a recursive function, and, most importantly for PL, his fundamental theorem, which yields a semi-decision algorithm for first-order logic, and his unification algorithm.
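
To make that last item concrete, here is a minimal first-order unification sketch in Haskell, in the spirit of the algorithm descended from Herbrand's thesis (and Robinson's later formulation); the types and names are illustrative only.

    import           Control.Monad (foldM)
    import qualified Data.Map as M

    data Term  = Var String | Fn String [Term] deriving (Eq, Show)
    type Subst = M.Map String Term

    -- Chase variable bindings until a non-variable or an unbound
    -- variable is reached.
    walk :: Subst -> Term -> Term
    walk s t@(Var x) = maybe t (walk s) (M.lookup x s)
    walk _ t         = t

    -- Occurs check: does variable x appear in t under substitution s?
    occurs :: Subst -> String -> Term -> Bool
    occurs s x t = case walk s t of
      Var y   -> x == y
      Fn _ ts -> any (occurs s x) ts

    -- Extend s to a unifier of t and u, or fail.
    unify :: Term -> Term -> Subst -> Maybe Subst
    unify t u s = case (walk s t, walk s u) of
      (Var x, Var y) | x == y           -> Just s
      (Var x, u') | not (occurs s x u') -> Just (M.insert x u' s)
      (t', Var y) | not (occurs s y t') -> Just (M.insert y t' s)
      (Fn f ts, Fn g us)
        | f == g && length ts == length us
          -> foldM (\s' (a, b) -> unify a b s') s (zip ts us)
      _ -> Nothing

For example, unify (Fn "f" [Var "x", Fn "a" []]) (Fn "f" [Fn "b" [], Var "y"]) M.empty binds x to b and y to a; a resolution-based logic programming engine runs essentially this at every inference step.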

Generic Discrimination: Sorting and Partitioning Unshared Data in Linear Time

Generic Discrimination: Sorting and Partitioning Unshared Data in Linear Time, Fritz Henglein. ICFP 2008.

We introduce the notion of discrimination as a generalization of both sorting and partitioning and show that worst-case linear-time discrimination functions (discriminators) can be defined generically, by (co-)induction on an expressive language of order denotations. The generic definition yields discriminators that generalize both distributive sorting and multiset discrimination. The generic discriminator can be coded compactly using list comprehensions, with order denotations specified using Generalized Algebraic Data Types (GADTs). A GADT-free combinator formulation of discriminators is also given.

We give some examples of the uses of discriminators, including a new most-significant-digit lexicographic sorting algorithm.

Discriminators generalize binary comparison functions: They operate on n arguments at a time, but do not expose more information than the underlying equivalence, respectively ordering relation on the arguments. We argue that primitive types with equality (such as references in ML) and ordered types (such as the machine integer type), should expose their equality, respectively standard ordering relation, as discriminators: Having only a binary equality test on a type requires Θ(n^2) time to find all the occurrences of an element in a list of length n, for each element in the list, even if the equality test takes only constant time. A discriminator accomplishes this in linear time. Likewise, having only a (constant-time) comparison function requires Θ(n log n) time to sort a list of n elements. A discriminator can do this in linear time.

If you're like me, at some point you probably heard about linear time sorts like radix sort and then dismissed them as an interesting curiosity --- the restriction to sorting numbers seems pretty limiting, after all.

Henglein did not, and in this paper he shows how to write a teeny-tiny library of type-directed combinators that can lift linear time sorts to structured types like lists and trees, and in addition lets you quotient out by order or repetition, if you want. It's terrifically pretty code.
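
To give the flavor, here is a much-simplified sketch; Henglein's actual library indexes discriminators by a GADT of order denotations, and these names are mine.

    {-# LANGUAGE RankNTypes #-}
    import Data.Array (accumArray, elems)

    -- A discriminator groups values by key into equivalence classes,
    -- listed in key order; it never compares keys pairwise.
    type Disc k = forall v. [(k, v)] -> [[v]]

    -- Base case: keys in [0, n), grouped in one bucketing pass
    -- (the single-digit core of radix sort).
    discNat :: Int -> Disc Int
    discNat n ps =
      filter (not . null) . map reverse $
        elems (accumArray (flip (:)) [] (0, n - 1) ps)

    -- Discriminators compose where comparison functions don't:
    -- discriminate pairs by the first component, then refine each
    -- class by the second.
    discPair :: Disc a -> Disc b -> Disc (a, b)
    discPair da db ps =
      concatMap db (da [ (a, (b, v)) | ((a, b), v) <- ps ])

    -- Sorting falls out as a special case: pair each key with itself
    -- and flatten the classes.
    dsort :: Disc k -> [k] -> [k]
    dsort d xs = concat (d [ (x, x) | x <- xs ])

    -- dsort (discNat 256) [3,1,2,1]              == [1,1,2,3]
    -- dsort (discPair (discNat 10) (discNat 10))
    --       [(1,2),(0,9),(1,0)]                  == [(0,9),(1,0),(1,2)]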

Note: This is a link to the ACM digital library, and so is behind a paywall. I found a link to a tech report by Henglein covering the same material, but I don't know exactly how the two differ.

A Computer-Generated Proof that P=NP

Doron Zeilberger announced yesterday that he has proven that P=NP.

Using 3000 hours of CPU time on a CRAY machine, we settle the notorious P vs. NP problem in the affirmative, by presenting a “polynomial” time algorithm for the NP-complete subset sum problem.

The paper is available here and his 98th Opinion is offered as commentary.

A Foundation for Flow-Based Program Matching Using Temporal Logic and Model Checking

A Foundation for Flow-Based Program Matching Using Temporal Logic and Model Checking, Julien Brunel, Damien Doligez, René Rydhof Hansen, Julia Lawall, Gilles Muller. POPL 2009.

Reasoning about program control-flow paths is an important functionality of a number of recent program matching languages and associated searching and transformation tools. Temporal logic provides a well-defined means of expressing properties of control-flow paths in programs, and indeed an extension of the temporal logic CTL has been applied to the problem of specifying and verifying the transformations commonly performed by optimizing compilers.

Nevertheless, in developing the Coccinelle program transformation tool for performing Linux collateral evolutions in systems code, we have found that existing variants of CTL do not adequately support rules that transform subterms other than the ones matching an entire formula. Being able to transform any of the subterms of a matched term seems essential in the domain targeted by Coccinelle. In this paper, we propose an extension to CTL named CTL-VW (CTL with variables and witnesses) that is a suitable basis for the semantics and implementation of Coccinelle’s program matching language.

Our extension to CTL includes existential quantification over program fragments, which allows metavariables in the program matching language to range over different values within different control-flow paths, and a notion of witnesses that record such existential bindings for use in the subsequent program transformation process. We formalize CTL-VW and describe its use in the context of Coccinelle. We then assess the performance of the approach in practice, using a transformation rule that fixes several reference count bugs in Linux code.

The Coccinelle tool is quite fun to play with. You write things that look like the output of patch, only with some extra patterns and boolean conditions in it, and the tool will go over your C source, find all the source code that matches it, and apply all the changes you've specified. It's open source and available online.
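
For a taste of the matching language, here is a semantic patch along the lines of Coccinelle's documented kmalloc-to-kzalloc example (treat the exact syntax as approximate; it varies across versions). The "..." operator matches along control-flow paths between the two statements, which is precisely where the paper's CTL machinery comes in:

    @@
    expression x, E;
    @@
    - x = kmalloc(E, GFP_KERNEL);
    + x = kzalloc(E, GFP_KERNEL);
      ...
    - memset(x, 0, E);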

The theory described in this paper is quite fun, too -- the algorithms they describe are (surprisingly) not too complicated and apparently quite speedy.

Dana

Luke Palmer and Nick Szabo can shoot me for this if they want, but I think this is warranted, and I want to connect a couple of dots as well. Luke is one of a number of computer scientists, with Conal Elliott probably being the best known, who have devoted quite a bit of attention to Functional Reactive Programming, or FRP. FRP has been discussed on LtU off and on over the years, but, unusually for LtU IMHO, does not seem to have gotten the traction that some other similarly abstruse subjects have.
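
For readers who haven't crossed paths with FRP: here is a minimal denotational sketch in Haskell, after Elliott and Hudak's classical formulation (the names are illustrative, not Reactive's or Dana's API).

    type Time = Double

    newtype Behavior a = Behavior (Time -> a)  -- a value varying over time
    newtype Event a    = Event [(Time, a)]     -- time-stamped occurrences,
                                               -- assumed in time order

    -- Behaviors combine pointwise.
    instance Functor Behavior where
      fmap f (Behavior b) = Behavior (f . b)

    -- An event of behaviors switches which behavior is in effect:
    -- follow b0 until the first occurrence, thereafter the most
    -- recently supplied behavior.
    switcher :: Behavior a -> Event (Behavior a) -> Behavior a
    switcher (Behavior b0) (Event occs) = Behavior $ \t ->
      case [ b | (t', Behavior b) <- occs, t' <= t ] of
        [] -> b0 t
        bs -> last bs t

Ordering in time is the organizing principle here: behaviors and events are meaningful only relative to a timeline, and combinators like switcher express that ordering directly.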

In parallel, LtU has had a couple of interesting threads about Second Life's economy, smart contracts, usage control, denial of service, technical vs. legal remedies, and the like. I would particularly like to call attention to this post by Nick Szabo, in which he discusses a contract language that he designed:

Designing the contract language radically changed my idea of what program flow and instruction pointers can be. Its successor, a general-purpose programming language, may thereby make event-oriented programming and concurrency far easier. The language is targeted at GUI programming as well as smart contracts, real-time, and workflow programming. My answer to the problems of concurrency and event handling is to make the ordering of instructions the syntactic and semantic core of the language. The order of execution and event handlers are the easiest things to express in the language rather than kludged add-ons to a procedural or functional language.

In recent private correspondence, Nick commented that he'd determined that he was reinventing synchronous programming à la Esterel, and mentioned "Reactive" programming.

Ding!

To make a potentially long entry somewhat shorter, Luke is working on a new language, Dana, which appears to have grown out of some frustration with existing FRP systems, including Conal Elliott's Reactive, currently perhaps the linchpin of FRP research. Luke's motivating kickoff post for the Dana project can be found here, and there are several follow-up posts, including links to experimental source code repositories. Of particular motivating interest, IMHO, is this post, where Luke discusses FRP's interaction with garbage collection succinctly but nevertheless in some depth. Luke's most recent post makes explicit the connection from Dana, which Luke has determined needs a dependently-typed core, to Illative Combinatory Logic, and offers a ~100 line type checker for the core.

I find this very exciting: I find the project of expressing computation centered on time, in the sense of Nick's contract language, in easy and safe ways extremely compelling. I've intuited for some time now that FRP represents a real breakthrough in moving us past the von Neumann runtime paradigm in fundamental ways, and between Conal Elliott's and Luke's work (and no doubt that of others), it seems to me that my sense of this may be borne out, with Nick's contract language, or something like it, as an initial application realm.

So I wanted to call attention to Luke's work, and by extension recapitulate Conal's and Nick's, both for the PLT aspects that Luke's work clearly represents, but also as a challenge to the community to assist in the realization of Nick's design efforts, if at all possible.

AMS: A Special Issue on Formal Proof

Formal proof, especially in support of mathematics, is not something that I work with (I use a lot of intuition in analyzing my code). I found the articles from the American Mathematical Society's A Special Issue on Formal Proof to be fairly good introductions to the subject of using proof assistants in efforts such as Formalizing 100 Theorems.

Traditional mathematical proofs are written in a way to make them easily understood by mathematicians. Routine logical steps are omitted. An enormous amount of context is assumed on the part of the reader. Proofs, especially in topology and geometry, rely on intuitive arguments in situations where a trained mathematician would be capable of translating those intuitive arguments into a more rigorous argument.

A formal proof is a proof in which every logical inference has been checked all the way back to the fundamental axioms of mathematics. All the intermediate logical steps are supplied, without exception. No appeal is made to intuition, even if the translation from intuition to logic is routine. Thus, a formal proof is less intuitive, and yet less susceptible to logical errors.

Though the articles are focused on the application to mathematical proofs, they do give background on languages that are continually mentioned on LtU (HOL, Coq, Isabelle, etc.).
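
For a taste of what "every logical inference has been checked" means in practice, here is a tiny machine-checked proof. It is written in Lean, a proof assistant in the same family as the HOL systems, Coq, and Isabelle; the example is mine, not drawn from the articles.

    -- With addition defined by recursion on its second argument,
    -- 0 + n = n does not hold by mere computation, so we prove it by
    -- induction; the kernel checks every step back to the axioms.
    theorem zero_add' (n : Nat) : 0 + n = n := by
      induction n with
      | zero => rfl
      | succ k ih =>
        show Nat.succ (0 + k) = Nat.succ k  -- unfold one step of (+)
        rw [ih]                             -- rewrite with the hypothesis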
