DySy: Dynamic Symbolic Execution for Invariant Inference

DySy: Dynamic Symbolic Execution for Invariant Inference. Christoph Csallner, Nikolai Tillmann, Yannis Smaragdakis

Dynamically discovering likely program invariants from concrete test executions has emerged as a highly promising software engineering technique. Dynamic invariant inference has the advantage of succinctly summarizing both “expected”program inputs and the subset of program behaviors that is normal under those inputs. In this paper, we introduce a technique that can drastically increase the relevance of inferred invariants, or reduce the size of the test suite required to obtain good invariants. Instead of falsifying invariants produced by pre-set patterns, we determine likely program invariants by combining the concrete execution of actual test cases with a simultaneous symbolic execution of the same tests. The symbolic execution produces abstract conditions over program variables that the concrete tests satisfy during their execution. In this way, we obtain the benefits of dynamic inference tools like Daikon: the inferred invariants correspond to the observed program behaviors. At the same time, however, our inferred invariants are much more suited to the program at hand than Daikon’s hardcoded invariant patterns. The symbolic invariants are literally derived from the program text itself, with appropriate value substitutions as dictated by symbolic execution. We implemented our technique in the DySy tool, which utilizes a powerful symbolic execution and simplification engine. The results confirm the benefits of our approach. In Daikon’s prime example benchmark, we infer the majority of the interesting Daikon invariants, while eliminating invariants that a human user is likely to consider irrelevant.

The Daikon invariant detector mentioned in the abstract is here.

Seems like a pretty straightforward idea, I guess, but as always: God is in the details.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Symbolic/Concolic Signature Generators

An interesting similar idea is practiced in the security community with respect to using symbolic and concolic techniques to make exploit signatures with no false positives. The security folks lump together multiple paths leading to the goal deviant state to characterize the malicious input. As a parallel to the above cited article, you would say the paths are those reached by the test suite (a lot of work in this area is around discovering them), and allow intermediate values instead of optimizing them away. Definitely an exciting space.

The comparison to Daikon seemed a little strange, as the hard coding of templates is just shifted around to the solver / path constraints. Definitely want to reread it to understand the claim better. Type (like taint) invariants would still involve 'templating', such as their purity example.

Just to throw it out there: one thought we toyed with awhile back was the inverse of their approach. Dynamic analyses win for statistical needs, so instead of using the path conditions as summaries, take them to generate as much concrete data as needed and work with partitioning from there (statistically, not logically).

This idea sounds rather like

This idea sounds rather like explanation-based generalization in the AI literature.