Software Engineering with Formal Methods: The Development of a Storm Surge Barrier Control System - Seven Myths of Formal Methods Revisited (2001), by Jan Tretmans, Klaas Wijbrans, Michel Chaudron:
Bos is the software system which controls and operates the storm surge barrier in the Nieuwe Waterweg near Rotterdam. It is a complex, safety-critical system of average size, which was developed by CMG Den Haag B.V., commissioned by Rijkswaterstaat (RWS) â€“ the Dutch Ministry of Transport, Public Works and Water Management. It was completed in October 1998 on time and within budget.
CMG used formal methods in the development of the Bos software. This paper discusses the experiences obtained from their use. Some people claim that the use of formal methods helps in developing correct and reliable software, others claim that formal methods are useless and unworkable. Some of these claims have almost become myths. A number of these myths are described and discussed in a famous article: Seven Myths of Formal Methods [Hal90]. The experiences obtained from using formal methods for the development of Bos will be discussed on the basis of this article. We will discuss to what extent these myths are true for the Bos project.
The data for this survey were collected by means of interviews with software engineers working on the Bos project. These include the project manager, designers, implementers and testers, people who participated from the beginning in 1995 until the end in 1998 as well as engineers who only participated in the implementation phase, and engineers with and without previous, large-scale software engineering experience.
This paper concentrates on the experiences of the software engineers with formal methods. These experiences, placed in the context of the seven myths, are described in section 3. This paper does not discuss technical details about the particular formal methods used or the way they were used; see [Kar97, Kar98] for these aspects. Moreover, formal methods were only one technique used in the development of Bos. The overall engineering approach and the way diï¬€erent methods and techniques were combined to assure the required safetycritical quality, are described [WBG98, WB98]. Testing in Bos is described in more detail in [GWT98], while [CTW99] will give a more systematic analysis of the results of the interviews
with the developers.
Discussion of formal methods and verification has come up a few times here on LtU. In line with the recent discussions on the need for more empirical data in our field, this was an interesting case study on the use of formal methods. The seven myths of formal methods are reviewed in light of a real project:
- Myth 1: Formal methods can guarantee that software is perfect
- Myth 2: Formal methods are all about program proving
- Myth 3: Formal methods are only useful for safety-critical system
- Myth 4: Formal methods require highly trained mathematicians
- Myth 5: Formal methods increase the cost of developmen
- Myth 6: Formal methods are unacceptable to users
- Myth 7: Formal methods are not used on real, large-scale software
Extensible Programming with First-Class Cases, by Matthias Blume, Umut A. Acar, and Wonseok Chae:
We present language mechanisms for polymorphic, extensible records and their exact dual, polymorphic sums with extensible first-class cases. These features make it possible to easily extend existing code with new cases. In fact, such extensions do not require any changes to code that adheres to a particular programming style. Using that style, individual extensions can be written independently and later be composed to form larger components. These language mechanisms provide a solution to the expression problem.
We study the proposed mechanisms in the context of an implicitly typed, purely functional language PolyR. We give a type system for the language and provide rules for a 2-phase transformation: first into an explicitly typed Î»-calculus with record polymorphism, and finally to efficient index-passing code. The first phase eliminates sums and cases by taking advantage of the duality with records.
We implement a version of PolyR extended with imperative features and pattern matchingâ€”we call this language MLPolyR. Programs in MLPolyR require no type annotationsâ€”the implementation employs a reconstruction algorithm to infer all types. The compiler generates machine code (currently for PowerPC) and optimizes the representation of sums by eliminating closures generated by the dual construction.
This is an elegant solution to the expression problem for languages with pattern matching. This paper was posted twice in LtU comments, but it definitely deserves its own story. Previous solutions to the exression problem are rather more involved, like Garrigue's use of recursion and polymorphic variants, because they lack support for extensible records which makes this solution so elegant.
Extensible records and first-class cases unify object-oriented and functional paradigms on a deeper level, since they enable first-class messages to be directly encoded. Add a sensible system for dynamics, and I argue you have most of the power people claim of dynamic languages without sacrificing the safety of static typing.
Microsoft has recently detailed their "Compiler as a Service" initiative in a whitepaper. The whitepaper calls the project Roslyn.
Related, IBM sponsors the Eclipse IMP project for its X10 language (and the X10DT). IMP is also used for Eelco Visser's Spoofax IDE and WebDSL IDE.
Mike Barnett, Manuel Fahndrich, K. Rustan M. Leino, Peter Muller, Wolfram Schulte, and Herman Venter, Specification and Verification: The Spec# Experience" Preprint of an article appearing in the June 2011 CACM.
CACM tagline: Can a programming language really help programmers write better programs?
Spec# is a programming system that facilitates the development of correct software. The Spec# language extends C# with contracts that allow programmers to express their design intent in the code. The Spec# tool suite consists of a compiler that emits run-time checks for contracts, a static program verifier that attempts to mathematically prove the correctness of programs, and an integration into the Visual Studio development environment. Spec# shows how contracts and verifiers can be integrated seamlessly into the software development process. This paper reflects on the six-year history of the Spec# project, scientific contributions it has made, remaining challenges for tools that seek to establish program correctness, and prospects of incorporating verification into everyday software engineering.
Spec# is, in some ways, quite similar to JML+ESC/Java2. But Spec# is a language rather than a set of annotations, which allows it to incorporate features such as a non-null type system and a very tight integration with the IDE.
Spec# was previously mentioned on LtU back in 2005.
Roberto Ierusalimschy, Luiz Henrique de Figueiredo, and Waldemar Celes, "Passing a Language through the Eye of a Needle: How the embeddability of Lua impacted its design", ACM Queue vol. 9, no. 5, May 2011.
A key feature of a scripting language is its ability to integrate with a system language. This integration takes two main forms: extending and embedding. In the first form, you extend the scripting language with libraries and functions written in the system language and write your main program in the scripting language. In the second form, you embed the scripting language in a host program (written in the system language) so that the host can run scripts and call functions defined in the scripts; the main program is the host program.
In this article we discuss how embeddability can impact the design of a language, and in particular how it impacted the design of Lua from day one. Lua is a scripting language with a particularly strong emphasis on embeddability. It has been embedded in a wide range of applications and is a leading language for scripting games.
An interesting discussion of some of the considerations that go into supporting embeddability. The design of a language clearly has an influence over the API it supports, but conversely the design of an API can have a lot of influence over the design of the language.
Asynchronous Proof Processing with Isabelle/Scala and Isabelle/jEdit. Makarius Wenzel, UITP 2010.
After several decades, most proof assistants are still centered around TTY-based interaction in a tight read-eval-print loop. Even well-known Emacs modes for such provers follow this synchronous model based on single commands with immediate response, meaning that the editor waits for the prover after each command. There have been some attempts to re-implement prover interfaces in big IDE frameworks, while keeping the old interaction model. Can we do better than that?
Ten years ago, the Isabelle/Isar proof language already emphasized the idea of proof document (structured text) instead of proof script (sequence of commands), although the implementation was still emulating TTY interaction in order to be able to work with the then emerging Proof General interface. After some recent reworking of Isabelle internals, to support parallel processing of theories and proofs, the original idea of structured document processing has surfaced again.
Isabelle versions from 2009 or later already provide some support for interactive proof documents with asynchronous checking, which awaits to be connected to a suitable editor framework or full-scale IDE. The remaining problem is how to do that systematically, without having to specify and implement complex protocols for prover interaction.
This is the point where we introduce the new Isabelle/Scala layer, which is meant to expose certain aspects of Isabelle/ML to the outside world. The Scala language (by Martin Odersky) is suï¬ƒciently close to ML in order to model well-known prover concepts conveniently, but Scala also runs on the JVM and can access existing Java libraries directly. By building more and more external system wrapping for Isabelle in Scala, we eventually reach the point where we can integrate the prover seamlessly into existing IDEs (say Netbeans).
To avoid getting side-tracked by IDE platform complexity, our current experiments are focused on jEdit, which is a powerful editor framework written in Java that can be easily extended by plugin modules. Our plugins are written again in Scala for our convenience, and to leverage the Scala actor library for parallel and interactive programming. Thanks to the Isabelle/Scala layer, the Isabelle/jEdit implementation is very small and simple.
I thought this was a nice paper on the pragmatics of incremental, interactive proof editing. I've suspected for a while that as programming languages and IDEs grow more sophisticated and do more computationally-intensive checks at compile time (including but not limited to theorem proving), it will become similarly important to design our languages to support modular and incremental analysis.
However, IDE designs also need more experimentation, and unfortunately the choice of IDEs to extend seem to be limited to archaic systems like Emacs or industrial behemoths like Eclipse or Visual Studio, both of which constrain the scope for new design -- Emacs is too limited, and the API surface of Eclipse/VS is just too big and irregular. (Agda-mode for Emacs is a heroic but somewhat terrifying piece of elisp.)
In Finding and Understanding Bugs in C Compilers Xuejun Yang, Yang Chen, Eric Eide, and John Regehr of University of Utah, School of Computing describe Csmith, a fuzzer for testing C compilers. The hard part was avoiding undefined behavior.
Compilers should be correct. To improve the quality of C compilers, we created Csmith, a randomized test-case generation tool, and spent three years using it to ï¬nd compiler bugs. During this period we reported more than 325 previously unknown bugs to compiler developers. Every compiler we tested was found to crash and also to silently generate wrong code when presented with valid input. In this paper we present our compiler-testing tool and the results of our bug-hunting study. Our ï¬rst contribution is to advance the state of the art in compiler testing. Unlike previous tools, Csmith generates programs that cover a large subset of C while avoiding the undeï¬ned and unspeciï¬ed behaviors that would destroy its ability to automatically ï¬nd wrong code bugs. Our second contribution is a collection of qualitative and quantitative results about the bugs we have found in open-source C compilers.
Two bits really stuck out for me. First, formal verification has a real positive impact
The striking thing about our CompCert results is that the middleend bugs we found in all other compilers are absent. As of early 2011, the under-development version of CompCert is the only compiler we have tested for which Csmith cannot ï¬nd wrong-code errors. This is not for lack of trying: we have devoted about six CPU-years to the task. The apparent unbreakability of CompCert supports a strong argument that developing compiler optimizations within a proof framework, where safety checks are explicit and machine-checked, has tangible beneï¬ts for compiler users.
And second, code coverage is inadequate for ensuring good test thoroughness for software as complex as a compiler.
Because we ï¬nd many bugs, we hypothesized that randomly generated programs exercise large parts of the compilers that were not covered by existing test suites. To test this, we enabled code coverage monitoring in GCC and LLVM. We then used each compiler to build its own test suite, and also to build its test suite plus 10,000 Csmith-generated programs. Table 3 shows that the incremental coverage due to Csmith is so small as to be a negative result. Our best guess is that these metrics are too shallow to capture Csmithâ€™s effects, and that we would generate useful additional coverage in terms of deeper metrics such as path or value coverage.
Habit is a systems programming dialect of Haskell from the High-Assurance Systems Programming (HASP) project at Portland State University. From The Habit Programming Language: The Revised Preliminary Report
This report presents a preliminary design for the programming language Habit, a dialect of Haskell that supports the development of high quality systems software. The primary commitments of the design are as follows:
* Systems programming: Unlike Haskell, which was intended to serve as a general purpose functional programming language, the design of Habit focusses on features that are needed in systems software development. These priorities are reï¬‚ected fairly directly in the new features that Habit provides for describing bit-level and memory-based data representations, the introduction of new syntactic extensions to facilitate monadic programming, and, most signiï¬cantly, the adoption of a call-by-value semantics to improve predictability of execution. The emphasis on systems programming also impacts the design in less direct ways, including assumptions about the expected use of whole program compilation and optimization strategies in a practical Habit implementation.
* High assurance: Although most details of Haskellâ€™s semantics have been formalized at some point in the research literature, there is no consolidated formal description of the whole language. There are also known differences in semantics, particularly with respect to operational behavior, between different Haskell implementations in areas where the Haskell report provides no guidance. Although it is not addressed in the current report, a high-priority for Habit is to provide a full, formal semantics for the complete language that can be used as a foundation for reasoning and formal veriï¬cation, a mechanism for ensuring consistency between implementations, and a basis for reliably predicting details about memory allocation, asymptotic behavior, and resource utilization.
HASP has a couple of postdoc positions open to help with Habit.
Not strictly PLT-related, but Neil Brown has contributed an amazing series of articles to Linux Weekly News:
For this series we try to look for patterns which become visible only over an extended time period. As development of a system proceeds, early decisions can have consequences that were not fully appreciated when they were made. If we can find patterns relating these decisions to their outcomes, it might be hoped that a review of these patterns while making new decisions will help to avoid old mistakes or to leverage established successes.
Pure and Declarative Syntax Definition: Paradise Lost and Regained by Lennart C. L. Kats, Eelco Visser, Guido Wachsmuth from Delft
Syntax definitions are pervasive in modern software systems, and serve as the basis for language processing tools like parsers and compilers. Mainstream parser generators pose restrictions on syntax definitions that follow from their implementation algorithm. They hamper evolution, maintainability, and compositionality of syntax definitions. The pureness and declarativity of syntax definitions is lost. We analyze how these problems arise for different aspects of syntax definitions, discuss their consequences for language engineers, and show how the pure and declarative nature of syntax definitions can be regained.
I haven't compared this version with the Onward 2010 version, but they look essentially the same. It seems timely to post this paper, considering the other recent story Yacc is dead. There is not a whole lot to argue against in this paper, since we all "know" the other approaches aren't as elegant and only resort to them for specific reasons such as efficiency. Yet, this is the first paper I know of that tries to state the argument to software engineers.
For example, the Dragon Book, in every single edition, effectively brushes these topics aside. In particular, the Dragon Book does not even mention scannerless parsing as a technique, and instead only explains the "advantages" of using a scanner. Unfortunately, the authors of this paper don't consider other design proposals, either, such as Van Wyk's context-aware scanners from GPCE 2007. It is examples like these that made me wish the paper was a bit more robust in its analysis; the examples seem focused on the author's previous work.
If you are not familiar with the author's previous work in this area, the paper covers it in the references. It includes Martin Bravenboer's work on modular Eclipse IDE support for AspectJ.