System-level Static Analysis

Here's a shocker: static analysis of C or ASM is tricky. I'd rather be analyzing Pi-calculus or Haskell. Now, of course, when

  • you have to deal with existing system-level software (read Linux kernel or kernel modules)
  • you need to deal with infinite behaviours
  • you want guarantees of safety, in terms of what system calls are performed and when they are performed
  • you want these guarantees to be expressed in a manner that can be checked against a local policy
  • you can't afford to develop a general kernel-level sandbox
  • you can afford to assume that array accesses and pointer operations are safe
  • static policies for dynamic control, as provided by SELinux are not powerful enough

it starts to look like you don't have much choice.

In this case, after some thinking, I believe that my best angle would be type-and-effects, either on C (preferably) or ASM. Of course, this requires a degree of formal semantics and/or type system framework for the chosen language. Also of course, I'd be willing to go the PCC way if there's anything available. Again, preferably for C.

So far, my googling on the subject has located:

  • A message on LtU Classics, unfortunately pointing to a 404.
  • Nikolaos S. Papaspyrou's PhD thesis on the formal semantics of C, which I'm trying to read at the moment.
  • Tal and DTal, which are probably quite usable but a bit too close to the metal for my tastes.
  • Jeff Cook's formalization of C in Nqthm (I haven't been able to locate the actual document yet and I'm not familiary with Nqthm).
  • Michael Norrish's formalization of C in HOL.
  • CQual, whose documentation I'm currently reading.

Before proceeding any further, I would like to hear your opinion. Is there any important angle or previous work I've forgotten ? Is anyone around here working on something similar ? Do you have any suggestion ?

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Dawson Engler

Please see Dawson Engler's home page. Much of his recent work revolves around bug-finding in real-world systems.

the ieee security & privacy

the ieee security & privacy series featured this paper, which, beyond the best skipped article, concludes with references to some very interesting static analysis research projects. most of these are specifically targeting large c systems.

Updated link

The broken link on the LtU Classic page should point here.

That entire Types forum thread may be of interest; it begins here.

Some related work

Caduceus is a verification tool for C programs that uses annotations in comments similar to JML. It doesn't support full C, in particular it doesn't handle arbitrary casts and unions.

The VFiasco project aims to proof properties of a micro kernel written in C++. Because of this application domain, they've also looked at some more low-level details of C such as a verification of Duff's Device.

Caduceus deprecated in

Caduceus deprecated in favour of Frama-C.