Automated Whitebox Fuzz Testing

Automated Whitebox Fuzz Testing. Patrice Godefroid; Michael Levin; David Molnar.

Fuzz testing is an effective technique for finding security vulnerabilities in software. Traditionally, fuzz testing tools apply random mutations to well-formed inputs and test the program on the resulting values. We present an alternative whitebox fuzz testing approach inspired by recent advances in symbolic execution and dynamic test generation. Our approach records an actual run of a program under test on a well-formed input, symbolically evaluates the recorded trace, and generates constraints capturing how the program uses its inputs. The generated constraints are used to produce new inputs which cause the program to follow different control paths. This process is repeated with the help of a code-coverage maximizing heuristic designed to find defects as fast as possible. We have implemented this algorithm in SAGE (Scalable, Automated, Guided Execution), a new tool employing x86 instruction-level tracing and emulation for whitebox fuzzing of arbitrary file-reading Windows applications.

I wonder how moving this type of system to the level of high-level programming language would impact its effectiveness.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Sounds disturbingly familiar

See "EXE: Automatically Generating Inputs of Death" on Dawson Engler's page.

From the abstract it's hard to tell how this is different at all from the above, but it does have some comparison in the Related Work section.

A couple differences

After a quick read, I think I see a couple of differences:

  • SAGE does not require source code; it works directly on x86 binaries.
  • SAGE uses a record / replay approach to obtaining information about an execution, rather than instrumenting the running program. This helps from an engineering standpoint and also prevents the instrumentation from perturbing the application.
  • There seem to be various new search heuristics for handling really large programs.
There may be other differences, but these caught my eye.

Why "disturbingly"?

What are you implying?

Not much considering it's a

Not much considering it's a technical report.


Patrice presented DART with Koushik Sen, so I'd imagine familiarity with the literature should make it more interesting, not disturbing. In NLP, there are a lot of basic problems, and experimental approaches are very incremental. Thanks for the link, looking forward to the read on the plane!

Also of interest

Also of interest is ``Exploring Multiple Execution Paths for Malware Analysis'' by A. Moser, C. Kruegel, and E. Kirda, which was presented at the last Security & Privacy [PDF]