Specification and Verification: The Spec# Experience

Mike Barnett, Manuel Fahndrich, K. Rustan M. Leino, Peter Muller, Wolfram Schulte, and Herman Venter, Specification and Verification: The Spec# Experience" Preprint of an article appearing in the June 2011 CACM.

CACM tagline: Can a programming language really help programmers write better programs?

Spec# is a programming system that facilitates the development of correct software. The Spec# language extends C# with contracts that allow programmers to express their design intent in the code. The Spec# tool suite consists of a compiler that emits run-time checks for contracts, a static program verifier that attempts to mathematically prove the correctness of programs, and an integration into the Visual Studio development environment. Spec# shows how contracts and verifiers can be integrated seamlessly into the software development process. This paper reflects on the six-year history of the Spec# project, scientific contributions it has made, remaining challenges for tools that seek to establish program correctness, and prospects of incorporating verification into everyday software engineering.

Spec# is, in some ways, quite similar to JML+ESC/Java2. But Spec# is a language rather than a set of annotations, which allows it to incorporate features such as a non-null type system and a very tight integration with the IDE.

Spec# was previously mentioned on LtU back in 2005.

Passing a Language through the Eye of a Needle

Roberto Ierusalimschy, Luiz Henrique de Figueiredo, and Waldemar Celes, "Passing a Language through the Eye of a Needle: How the embeddability of Lua impacted its design", ACM Queue vol. 9, no. 5, May 2011.

A key feature of a scripting language is its ability to integrate with a system language. This integration takes two main forms: extending and embedding. In the first form, you extend the scripting language with libraries and functions written in the system language and write your main program in the scripting language. In the second form, you embed the scripting language in a host program (written in the system language) so that the host can run scripts and call functions defined in the scripts; the main program is the host program.
In this article we discuss how embeddability can impact the design of a language, and in particular how it impacted the design of Lua from day one. Lua is a scripting language with a particularly strong emphasis on embeddability. It has been embedded in a wide range of applications and is a leading language for scripting games.

An interesting discussion of some of the considerations that go into supporting embeddability. The design of a language clearly has an influence over the API it supports, but conversely the design of an API can have a lot of influence over the design of the language.

One Pass Real-Time Generational Mark-Sweep Garbage Collection

In One Pass Real-Time Generational Mark-Sweep Garbage Collection Joe Armstrong and Robert Virding talk about a very simple garbage collector used in Erlang*.

Traditional mark-sweep garbage collection algorithms do not allow reclamation of data until the mark phase of the algorithm has terminated. For the class of languages in which destructive operations are not allowed we can arrange that all pointers in the heap always point backwards towards "older" data. In this paper we present a simple scheme for reclaiming data for such language classes with a single pass mark-sweep collector.

We also show how the simple scheme can be modified so that the collection can be done in an incremental manner (making it suitable for real-time collection). Following this we show how the collector can be modified for generational garbage collection, and finally how the scheme can be used for a language with concurrent processes.

Unfortunately it's very restrictive. In particular even the "hidden" destructive updates used in a call-by-need language are problematic for this kind of collector.

* I'm not sure whether the described collector is still used in Erlang.

Asynchronous Proof Processing with Isabelle/Scala and Isabelle/jEdit

Asynchronous Proof Processing with Isabelle/Scala and Isabelle/jEdit. Makarius Wenzel, UITP 2010.

After several decades, most proof assistants are still centered around TTY-based interaction in a tight read-eval-print loop. Even well-known Emacs modes for such provers follow this synchronous model based on single commands with immediate response, meaning that the editor waits for the prover after each command. There have been some attempts to re-implement prover interfaces in big IDE frameworks, while keeping the old interaction model. Can we do better than that?

Ten years ago, the Isabelle/Isar proof language already emphasized the idea of proof document (structured text) instead of proof script (sequence of commands), although the implementation was still emulating TTY interaction in order to be able to work with the then emerging Proof General interface. After some recent reworking of Isabelle internals, to support parallel processing of theories and proofs, the original idea of structured document processing has surfaced again.

Isabelle versions from 2009 or later already provide some support for interactive proof documents with asynchronous checking, which awaits to be connected to a suitable editor framework or full-scale IDE. The remaining problem is how to do that systematically, without having to specify and implement complex protocols for prover interaction.

This is the point where we introduce the new Isabelle/Scala layer, which is meant to expose certain aspects of Isabelle/ML to the outside world. The Scala language (by Martin Odersky) is sufficiently close to ML in order to model well-known prover concepts conveniently, but Scala also runs on the JVM and can access existing Java libraries directly. By building more and more external system wrapping for Isabelle in Scala, we eventually reach the point where we can integrate the prover seamlessly into existing IDEs (say Netbeans).

To avoid getting side-tracked by IDE platform complexity, our current experiments are focused on jEdit, which is a powerful editor framework written in Java that can be easily extended by plugin modules. Our plugins are written again in Scala for our convenience, and to leverage the Scala actor library for parallel and interactive programming. Thanks to the Isabelle/Scala layer, the Isabelle/jEdit implementation is very small and simple.

I thought this was a nice paper on the pragmatics of incremental, interactive proof editing. I've suspected for a while that as programming languages and IDEs grow more sophisticated and do more computationally-intensive checks at compile time (including but not limited to theorem proving), it will become similarly important to design our languages to support modular and incremental analysis.

However, IDE designs also need more experimentation, and unfortunately the choice of IDEs to extend seem to be limited to archaic systems like Emacs or industrial behemoths like Eclipse or Visual Studio, both of which constrain the scope for new design -- Emacs is too limited, and the API surface of Eclipse/VS is just too big and irregular. (Agda-mode for Emacs is a heroic but somewhat terrifying piece of elisp.)


Kona is a new open-source implementation of Arthur Whitney's K, an ASCII-based APL like language. Kona is a fully working version of K3.

If you haven't ever tried APL/J/K or ilk you might find this language incomprehensible at first -- unless you like a challenge! Watch the screencasts or read some of our earlier APL/J stories.

Regardless of your interest in K, any LtUer worth his salt will enjoy the source code. We wrote a bit about the history of the remarkable C coding style used in the past, but I can't locate the link at the moment.

Finding and Understanding Bugs in C Compilers

In Finding and Understanding Bugs in C Compilers Xuejun Yang, Yang Chen, Eric Eide, and John Regehr of University of Utah, School of Computing describe Csmith, a fuzzer for testing C compilers. The hard part was avoiding undefined behavior.

Compilers should be correct. To improve the quality of C compilers, we created Csmith, a randomized test-case generation tool, and spent three years using it to find compiler bugs. During this period we reported more than 325 previously unknown bugs to compiler developers. Every compiler we tested was found to crash and also to silently generate wrong code when presented with valid input. In this paper we present our compiler-testing tool and the results of our bug-hunting study. Our first contribution is to advance the state of the art in compiler testing. Unlike previous tools, Csmith generates programs that cover a large subset of C while avoiding the undefined and unspecified behaviors that would destroy its ability to automatically find wrong code bugs. Our second contribution is a collection of qualitative and quantitative results about the bugs we have found in open-source C compilers.

Two bits really stuck out for me. First, formal verification has a real positive impact

The striking thing about our CompCert results is that the middleend bugs we found in all other compilers are absent. As of early 2011, the under-development version of CompCert is the only compiler we have tested for which Csmith cannot find wrong-code errors. This is not for lack of trying: we have devoted about six CPU-years to the task. The apparent unbreakability of CompCert supports a strong argument that developing compiler optimizations within a proof framework, where safety checks are explicit and machine-checked, has tangible benefits for compiler users.

And second, code coverage is inadequate for ensuring good test thoroughness for software as complex as a compiler.

Because we find many bugs, we hypothesized that randomly generated programs exercise large parts of the compilers that were not covered by existing test suites. To test this, we enabled code coverage monitoring in GCC and LLVM. We then used each compiler to build its own test suite, and also to build its test suite plus 10,000 Csmith-generated programs. Table 3 shows that the incremental coverage due to Csmith is so small as to be a negative result. Our best guess is that these metrics are too shallow to capture Csmith’s effects, and that we would generate useful additional coverage in terms of deeper metrics such as path or value coverage.

Leveled Garbage Collection

Leveled Garbage Collection by Guanshan Tong and Michael J. O'Donnell:

Generational garbage collection (GGC) is one of the most popular garbage collection techniques. GGC gains a performance advantage by performing minor collections on the younger objects in the heap, reducing the number of major collections of the whole heap. A promotion policy determines when an object moves from the younger generation to the older. The design of GGC has been justified by the plausible assumption that many objects die very young, and a few live a very long time. But, attempts to tune the performance of GGC by adjusting the promotion policy have been disappointing - only the simplest immediate promotion policy has proved attractive. The success of GGC is probably due to simplicity and to avoiding scans of the whole heap, rather than to accurate lifetime predictions.
This paper presents Leveled Garbage Collection (LGC), a new algorithm that is not based on object ages. It uses a heap structure and collection scheme similar to those of generational garbage collectors, and has a non-age-based promotion policy that doesn't promote all of the live objects, but still guarantees ample free space immediately after each garbage collection. By tuning LGC's promotion policy, we can often improve on GGC with immediate promotion.
Performance comparisons show that LGC outperforms GGC with immediate promotion policy in many cases, while losing only slightly on cases favorable to immediate promotion. LGC has a substantial advantage when the heap fits in main memory, and an even greater advantage as the heap gets paged to disk.

Leveled GC is based on a more general heuristic than generational GC, in that it tries to keep as many objects as possible in the nursery because minor collections are so much cheaper. What I found most interesting about this paper is that it scales well with virtual memory, which as we know can degrade performance significantly. They provide benchmarks demonstrating a marked difference when large heap sizes trigger paging (Section 5.2.2). LGC performance is hardly affected, while the runtime of generational GC degrades significantly.

Memory Models: A Case for Rethinking Parallel Languages and Hardware, CACM, August 2010

Memory Models: A Case for Rethinking Parallel Languages and Hardware by Sarita V. Adve and Hans-J. Boehm

This is a pre-print of the actual version.

The era of parallel computing for the masses is here, but writing correct parallel programs remains far more difficult than writing sequential programs. Aside from a few domains, most parallel programs are written using a shared-memory approach. The memory model, which specifies the meaning of shared variables, is at the heart of this programming model. Unfortunately, it has involved a tradeoff between programmability and performance, and has arguably been one of the most challenging and contentious areas in both hardware architecture and programming language specification. Recent broad community-scale efforts have finally led to a convergence in this debate, with popular languages such as Java and C++ and most hardware vendors publishing compatible memory model specifications. Although this convergence is a dramatic improvement, it has exposed fundamental shortcomings in current popular languages and systems that prevent achieving the vision of structured and safe parallel programming.

This paper discusses the path to the above convergence, the hard lessons learned, and their implications. A cornerstone of this convergence has been the view that the memory model should be a contract between the programmer and the system - if the programmer writes disciplined (data-race-free) programs, the system will provide high programmability (sequential consistency) and performance. We discuss why this view is the best we can do with current popular languages, and why it is inadequate moving forward. We then discuss research directions that eliminate much of the concern about the memory model, but require rethinking popular parallel languages and hardware. In particular, we argue that parallel languages should not only promote high-level disciplined models, but they should also enforce the discipline. Further, for scalable and efficient performance, hardware should be co-designed to take advantage of and support such disciplined models. The inadequacies of the state-of-the-art and the research agenda we outline have deep implications for the practice, research, and teaching of many computer science sub-disciplines, spanning theory, software, and hardware.

Invertible Syntax Descriptions: Unifying Parsing and Pretty Printing

In Invertible Syntax Descriptions: Unifying Parsing and Pretty Printing, Rendel Tillmann and Klaus Ostermann at the University of Marburg, Germany apply the "don't repeat yourself" principle to parsers and pretty printers.

Parsers and pretty-printers for a language are often quite similar, yet both are typically implemented separately, leading to redundancy and potential inconsistency. We propose a new interface of syntactic descriptions, with which both parser and pretty-printer can be described as a single program. Whether a syntactic description is used as a parser or as a pretty-printer is determined by the implementation of the interface. Syntactic descriptions enable programmers to describe the connection between concrete and abstract syntax once and for all, and use these descriptions for parsing or pretty-printing as needed. We also discuss the generalization of our programming technique towards an algebra of partial isomorphisms.

Language Virtualization for Heterogeneous Parallel Computing

Hassan Chafi, Zach DeVito, Adriaan Moors, Tiark Rompf, Arvind Sujeeth, Pat Hanrahan, Martin Odersky, and Kunle Olukotun describe an approach to parallel DSLs that is a hybrid between external DSLs and internal DSLs in Language Virtualization for Heterogeneous Parallel Computing.

As heterogeneous parallel systems become dominant, application developers are being forced to turn to an incompatible mix of low level programming models (e.g. OpenMP, MPI, CUDA, OpenCL). However, these models do little to shield developers from the difficult problems of parallelization, data decomposition and machine-specific details. Ordinary programmers are having a difficult time using these programming models effectively. To provide a programming model that addresses the productivity and performance requirements for the average programmer, we explore a domain-specific approach to heterogeneous parallel programming.

We propose language virtualization as a new principle that enables the construction of highly efficient parallel domain specific languages that are embedded in a common host language. We define criteria for language virtualization and present techniques to achieve them.We present two concrete case studies of domain-specific languages that are implemented using our virtualization approach.

While the motivation of the paper is parallelization the proposed design looks like LINQ expression trees dialed to 11.

XML feed