Synchronic Computation II

May I draw members' attention to an updated version of an earlier attempt to bypass the iteration, data structure, and deadlock issues associated with conventional dataflow models. As far as I am aware, no pure λ-calculus simulation environment with viable time and space complexity characteristics has ever been constructed for a high-level programming language such as Haskell. That difficulty lends support to the view that alternative models should be considered, in which software is not fully abstracted from hardware. An 8-page PDF summary of the material may be accessed here.

Abstract

Space is a dataflow-oriented language that exploits the massive parallelism available in a formal model of computation called the Synchronic A-Ram, and provides a framework for developing general-purpose parallel environments for FPGA and reconfigurable architectures. Space is an example of an interlanguage: a circuit-oriented, textual environment that was developed to address shortcomings associated with conventional tree languages for representing dataflow, allowing complex syntax trees to be collapsed into a simplified form. Space expresses variable-grained MIMD parallelism, is modular, strictly typed, and deterministic. Barring operations associated with memory allocation and compilation, modules cannot access global variables, and are referentially transparent. Modules exhibit a small, sequential state transition system, aiding verification. Space deals with communication, scheduling, and resource contention issues in parallel computing by resolving them explicitly in an incremental manner, module by module, whilst ascending the ladder of abstraction. An environment has been developed that incorporates a simulator and a compiler, which transforms Space programs into Synchronic A-Ram machine code consisting of only three bit-level instructions and a marking instruction. Space and the Synchronic A-Ram do not exhibit the iteration, data structure, and deadlock related issues associated with conventional dataflow models. The implementation of high-level (parallel) computation on a simple formal model supplies a missing link in computer science, and points to architectures and environments that are less susceptible to the contention and programmability issues associated with multithreading on processor networks.

Finding and Understanding Bugs in C Compilers

In Finding and Understanding Bugs in C Compilers, Xuejun Yang, Yang Chen, Eric Eide, and John Regehr of the University of Utah School of Computing describe Csmith, a fuzzer for testing C compilers. The hard part was avoiding undefined behavior.

Compilers should be correct. To improve the quality of C compilers, we created Csmith, a randomized test-case generation tool, and spent three years using it to find compiler bugs. During this period we reported more than 325 previously unknown bugs to compiler developers. Every compiler we tested was found to crash and also to silently generate wrong code when presented with valid input. In this paper we present our compiler-testing tool and the results of our bug-hunting study. Our first contribution is to advance the state of the art in compiler testing. Unlike previous tools, Csmith generates programs that cover a large subset of C while avoiding the undefined and unspecified behaviors that would destroy its ability to automatically find wrong code bugs. Our second contribution is a collection of qualitative and quantitative results about the bugs we have found in open-source C compilers.
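
To make the mechanics concrete, here is a minimal Scala sketch of the differential-testing loop that a Csmith-style tool enables. It is illustrative rather than the paper's actual harness: it assumes csmith, gcc, and clang are on the PATH, that CSMITH_HOME points at the Csmith runtime headers, and that the generated program prints a checksum and terminates promptly (a real harness would also impose timeouts); the object name DiffTest is mine.

import scala.sys.process._

// Differential testing in the spirit of Csmith: generate one random,
// UB-free C program, build it with several compilers, and compare the
// checksums the binaries print. Divergence flags a wrong-code candidate.
object DiffTest {
  def main(args: Array[String]): Unit = {
    val include = sys.env.getOrElse("CSMITH_HOME", ".") + "/include"
    Seq("sh", "-c", "csmith > test.c").!                    // one random test case
    val results = for (cc <- Seq("gcc", "clang")) yield {
      Seq(cc, "-O2", "-w", s"-I$include", "test.c", "-o", s"test_$cc").!
      cc -> Seq(s"./test_$cc").!!.trim                      // capture the printed checksum
    }
    if (results.map(_._2).distinct.size > 1)
      println(s"Possible wrong-code bug: $results")
    else
      println(s"Compilers agree: ${results.head._2}")
  }
}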

Two bits really stuck out for me. First, formal verification has a real positive impact:

The striking thing about our CompCert results is that the middle-end bugs we found in all other compilers are absent. As of early 2011, the under-development version of CompCert is the only compiler we have tested for which Csmith cannot find wrong-code errors. This is not for lack of trying: we have devoted about six CPU-years to the task. The apparent unbreakability of CompCert supports a strong argument that developing compiler optimizations within a proof framework, where safety checks are explicit and machine-checked, has tangible benefits for compiler users.

And second, code coverage is inadequate for ensuring test thoroughness in software as complex as a compiler:

Because we find many bugs, we hypothesized that randomly generated programs exercise large parts of the compilers that were not covered by existing test suites. To test this, we enabled code coverage monitoring in GCC and LLVM. We then used each compiler to build its own test suite, and also to build its test suite plus 10,000 Csmith-generated programs. Table 3 shows that the incremental coverage due to Csmith is so small as to be a negative result. Our best guess is that these metrics are too shallow to capture Csmith’s effects, and that we would generate useful additional coverage in terms of deeper metrics such as path or value coverage.
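
For anyone wanting to reproduce the flavour of that experiment on a smaller scale, the mechanics look roughly as follows. This Scala sketch shells out to GCC's real --coverage instrumentation and the gcov tool; the file names and the CoverageDemo object are mine, and the paper of course instruments the compilers themselves rather than a toy program.

import scala.sys.process._

// Compile with instrumentation, run the binary, then ask gcov how many
// lines were actually executed. Running more test inputs before calling
// gcov shows how much (or how little) each one grows the line coverage.
object CoverageDemo {
  def main(args: Array[String]): Unit = {
    Seq("gcc", "--coverage", "-O0", "test.c", "-o", "test_cov").!
    "./test_cov".!
    val summary = Seq("gcov", "test.c").!!   // prints e.g. "Lines executed: 87.50% of 24"
    println(summary)
  }
}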

Extensible nesting of classes

Many languages support aggressive nesting, from the syntax-only nesting of Java and C# to the more semantically meaningful nesting of BETA and Scala; e.g., a nested class in Scala is qualified by the type of its creating object. In any case, a class Inner is only nested in a class Outer if it is physically expressed within it.
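
For readers who haven't met the Scala feature, a small illustrative example of what "qualified by the type of its creating object" means in practice (the names are mine):

class Outer {
  class Inner {
    // The enclosing instance is part of Inner's identity.
    def owner: Outer = Outer.this
  }
}

object NestingDemo {
  def main(args: Array[String]): Unit = {
    val a = new Outer
    val b = new Outer
    val x: a.Inner = new a.Inner      // x has the path-dependent type a.Inner
    // val y: a.Inner = new b.Inner   // rejected: b.Inner is not a.Inner
    println(x.owner eq a)             // true
  }
}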

I was wondering if anyone has ever considered containment constraints before. Rather than nest class Inner in class Outer, why not just specify that class Inner can only be instantiated in the context of an instance of class Outer? This would make containment more like inheritance: class Subclass can inherit from class Superclass after Superclass has been defined, perhaps even in a different module. Why not do the same with containment?
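
Absent a language feature for this, the closest approximation I can sketch is something like the following Scala fragment, where the requirement for an enclosing Outer instance is encoded as an implicit constructor parameter rather than by physical nesting. The names are illustrative, and a genuine containment constraint would be enforced by the language rather than simulated this way.

class Outer

// Declared outside Outer, perhaps even in another module, yet Inner
// cannot be constructed unless an Outer instance supplies its context.
class Inner(implicit val context: Outer)

object ContainmentDemo {
  def main(args: Array[String]): Unit = {
    val o = new Outer
    // new Inner                  // rejected: no implicit Outer in scope
    implicit val ctx: Outer = o
    val i = new Inner             // fine: instantiated "inside" o
    println(i.context eq o)       // true
  }
}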

Any references would be helpful. I've turned up some of the basic nesting papers, but I haven't found anyone considering containment constraints. That seems strange, since this looks like one of those obvious generalizations.