Compiler Validation through Program Analysis

Anna Zaks and Amir Pnueli (2008). CoVaC: Compiler Validation by Program Analysis of the Cross-Product. In Proc. 15th Symp. Formal Methods.

From the introduction:

In this paper, we present a Compiler Verification by Program Analysis of the Cross-Product framework (CoVaC ) - a novel translation validation approach, in which one constructs a comparison system - a cross-product of the source and target programs. The input programs are equivalent if and only if the comparison system satisfies a certain specification. This allows us to leverage the existing methods of proving properties of a single program instead of relying on program analysis and proof rules specialized to translation validation, used by the existing frameworks.

and

Finally, this paper reports on a prototype tool CoVaC that applies the developed framework to verification of optimizing transformations performed by LLVM 1.9 - a very aggressive open-source compiler.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Translation Validation

A. Pnueli, M. Siegel, E. Singerman (1998). Translation Validation (citeseerx). In LNCS 1389: Proc. 4th Conf. TACAS, pp. 151-166.

Abstract:

We present the notion of translation validation as a new approach to the verification of translators (compilers, code generators). Rather than proving in advance that the compiler always produces a target code which correctly implements the source code (compiler verification), each individual translation (i.e. a run of the compiler) is followed by a validation phase which verifies that the target code produced on this run correctly implements the submitted source program. Several ingredients are necessary to set up the-- fully automatic-- translation validation process, among which are:

  1. A common semantic framework for the representation of the source code and the generated target code.

  2. A formalization of the notion of "correct implementation " as a refinement relation.
  3. A syntactic simulation-based proof method which allows to automatically verify that one model of the semantic framework, representing the produced target code, correctly implements another model which represents the source.
These, and other ingredients are elaborated in this paper, in which we illustrate the new approach in a most challenging case. We consider a translation (compilation) from the synchronous multi-clock data-flow language Signal to asynchronous (sequential) C-code.

Another paper of interest (I

Another paper of interest (I mentioned in the Chrome thread with regards to proving different JS implementations correct / different):

Towards Automatic Discovery of Deviations in Binary Implementations with Applications to Error Detection and Fingerprint Generation.
David Brumley, Juan Caballero, Zhenkai Liang, James Newsome, and Dawn Song. To appear in Proceedings of USENIX Security Symposium, Aug 2007.
Best Paper Award

In this context, the task would be to either:

1) find differences between a naively compiled program and an optimal one

and

2) find differences between a naive compiler and an optimal one

Some types of code is horrible for these binary tools -- a small world assumption mixed with enumerations for 1) might actually be more feasible than 2) !