SLAM (at POPL)
started 1/24/2002; 2:31:19 AM - last post 1/28/2002; 12:54:32 PM
|
|
Ehud Lamm - SLAM (at POPL)
1/24/2002; 2:31:19 AM (reads: 2474, responses: 12)
|
|
|
Ehud Lamm - Re: SLAM (at POPL)
1/24/2002; 2:33:55 AM (reads: 1669, responses: 0)
|
|
On the first day of POPL, Thomas Ball from Microsoft Research
presented an invited talk on SLAM: "The SLAM Project: Debugging system
software via static analysis", on behalf of himself and Sriram
K. Rajamani. SLAM has already been covered on LtU. Hopefully, my
summary provides a bit more detail.
Most of the time, programmers write code within a particular framework
-- (X) Windows API, Qt, device driver API, etc. API is a set of
functions, which must be invoked in a correct sequence. The most
primitive example (discussed in a talk) is a locking API: given a lock
handle, you can invoke a lock() function -- but should do it only
once. After that, you must eventually invoke an unlock() function --
again, exactly once. Verifying this pattern of usage seems trivial,
until we note that calls to a lock() function may occur within a
conditional statement. The call to an unlock() function must also
occur within a conditional statement -- such that the condition of the
first statement must imply the condition of the second. Checking this
implication given the imperative source language is highly
non-trivial. Naive model checking (which assumes that the conditional
branch can be either taken or not) will report a violation of the
locking API -- which often is a false positive. To be sure that we
notice a real bug, we need to know which of the possible program paths
are feasible.
SLAM tool takes C code -- as is, without any annotations. SLAM also
takes a specification of the API -- as a finite state machine. SLAM
iteratively infers program invariants and tries to improve the
analysis to eliminate false positives.
- SLAM creates an abstraction of the code: at first the
abstraction is cruel: all variables can take any possible value and
all conditional branches can be either taken or bypassed.
- SLAM runs a model checker on the abstract program to see if
calls to API functions fit the API specification (the FSM). If the
model checker finds no counter-example, the program is correct (wrt the
API usage).
- If the model checker finds a counter-example (which shows a
possible program trace), a special component (called 'Newton') checks
the conditionals of the branches taken in the counter-example trace,
notes the variables used in these conditions, and symbolically
evaluates the program with respect to the noted variables. This
evaluation yields new predicates -- constraints on the values
variables can really take and constraints on the branches
taken. Newton adds these new predicates to the abstraction of the
program and repeats the model checking.
Thus SLAM is a combination of three tools -- create abstraction, model
check, symbolic evaluation (abstract interpretation). The tools are
indeed independently-compiled modules run through a command line. Many
of the tools are off-the-shelf components: one component was written
in OCaml! OCaml was also used as a glue language.
Thomas Ball claimed that the approach scales to 100k lines of code. In
their experiments, they have analyzed several real (Winxx) device drivers and
checked their use of DDR API. They found several real bugs in a floppy
device driver.
|
|
Oleg - Mining specifications
1/24/2002; 8:00:27 PM (reads: 1795, responses: 4)
|
|
Mining Specifications
Glenn Ammons, Rastislav Bodik (U. Wisconsin @ Madison); James R. Larus
(Microsoft Research). POPL, Jan 16, 2002.
To verify program's compliance to an API specification, we need the
specification. The Windows Device Driver API used in the SLAM project
was derived by hand, with expertise of people who developed the
API. It is often the case however that (plain-text) API manuals are
not precise enough for formal verification, and the original
developers are not available for comments.
The talk proposes another, AI inspired, model: learn the behavior from
observing a training set. In our case, the training set is running a
set of test cases. Observations are dynamic traces of the program
executions. The authors have built a tool that, given such traces,
outputs a finite state machine that specifies the correct protocol of
API usage. The main assumption is that common behavior is the correct
behavior.
In more detail, the tool ingests the traces, notes calls to API
functions, and builds a probabilistic finite state machine. Removing
infrequent transitions from the latter yields the desired API
specification.
The probabilistic FSA learner was "off the shelf". The contributions of the paper are
- Figuring out which calls to API functions are related
(by examining data dependencies).
E.g., open() returning 7, write(7,...) and close(7) are related,
but write(1,...) is not related to the above
- Simplifying the flow control to "scenarios". This involves
some manual tuning and seeding, but drastically reduces
the size of training data and makes learning faster
- Deciding which scenarios are equivalent:
open()->7...close(7) and open()->8...close(8) are equivalent.
The authors inferred the basic X windows API (libX and libXi) by
analyzing traces of 17 programs (five from the X Windows
distribution). They used human-assisted learning: run a program,
collect a trace, verify against the previously inferred API. If the
verification fails, get the expert to inspect the failure. If the
expert deems the program correct, extend the API specification. The
authors found a real bug in two programs and benign violations of the
X window protocol rules in three other programs (xconsole, xclipboard
and clipboard).
|
|
Ehud Lamm - Re: Mining specifications
1/25/2002; 6:02:53 AM (reads: 1793, responses: 0)
|
|
Any more info on the FSA learner?
|
|
Frank Atanassow - Re: SLAM (at POPL)
1/25/2002; 11:58:27 AM (reads: 1443, responses: 1)
|
|
Re: Mining Specifications: Sounds like "correctness by consensus"?
I wonder what happens when there is no most-general-model of two traces, i.e., how inconsistency is treated. This is probabilistic, so I guess the users get to decide "what the specification ought to be". Very strange, thought-provoking, politically correct...! :)
|
|
Ehud Lamm - Re: SLAM (at POPL)
1/25/2002; 12:14:12 PM (reads: 1500, responses: 0)
|
|
Right on. I wanted to comment on the fact that as far as I recall specification ought to come before implemenetation
|
|
Oleg - Re: SLAM (at POPL)
1/25/2002; 8:16:51 PM (reads: 1404, responses: 1)
|
|
Any more info on the FSA learner?
As the authors indicate, they used an off-the-shelf PFSA learner
Anand V. Raman and Jon D. Patrick
The sk-strings method for inferring PFSA.
http://citeseer.nj.nec.com/103674.html
"which is the variation of the classical k-tails algorithm"
@article{ biermann72synthesis,
author = "A. W. Biermann and J. A. Feldman",
title = "On the synthesis of finite-state machines from samples of their behaviour",
journal = "IEEE Transactions on Computers",
volume = "C",
number = "21",
pages = "592--597",
year = "1972"
}
|
|
Oleg - Re: SLAM (at POPL)
1/25/2002; 8:17:24 PM (reads: 1389, responses: 0)
|
|
The paper tries to solve the following problem:
- an API manipulates at most k objects
For example, the file API manipulates only one object: the file
handle
- suppose the correct protocol of API usage can be represented
by M, a probabilistic finite state automaton (PFSA). PFSA is a
non-deterministic FSA with output alphabet being the set of
API function names and with a probabilistic state transition function.
Note that the correct API protocol is still
probabilistic: some usage patterns are common and some are less common
(either because they are rarely used or because we can know the
correct API only approximately)
- suppose we have a sample of execution traces of mostly
correct programs.
The problem is to efficiently derive M', which is an
approximation of M, given a confidence parameter, an approximation
parameter, and the upper bound on the number of states in M.
Again, the fundamental assumption is that bugs are relatively rare --
at least for the programs and the traces used in a training set.
When it came to experiments, the authors did not have any reliable
training set. Therefore, they tried to construct one
incrementally. They started with traces of one program believed
correct, extracted scenarios and inferred the automaton. They picked
another program, got the set of traces, and extracted scenarios. The
author then checked if all such scenarios (up to the specified max
size) can be described by the previously learned automaton. Scenarios
rejected by the automaton are manually inspected. If an expert
believes a scenario is correct, it is added to the learning set, and a
new automaton is inferred.
|
|
Ehud Lamm - Re: SLAM (at POPL)
1/26/2002; 7:01:15 AM (reads: 1455, responses: 0)
|
|
Thanks. Looks interesting.
|
|
Ehud Lamm - Re: Mining specifications
1/26/2002; 1:07:53 PM (reads: 1663, responses: 2)
|
|
Let's connect this to programming languages.
The issue here is that most languages are not expressive enough, and don't allow libraries to specify correct order for subroutine calls.
Notice that even if a tool such as the one described deduces a correct specificatoin, this specification can not be enforced by the programming language (e.g. C). Contrast this, for example, with deducing assertions, which can then be put inside the code - so that they will be enforced in the future.
Holland and Helm wrote some papers about reifying Contracts, which among other things enable the specification of allowed interactions between objects and methods.
|
|
Frank Atanassow - Re: Mining specifications
1/26/2002; 1:21:05 PM (reads: 1720, responses: 1)
|
|
C has assertions. All you need for assertions is booleans and a mechanism for crashing your program...
|
|
Ehud Lamm - Re: Mining specifications
1/26/2002; 2:05:04 PM (reads: 1786, responses: 0)
|
|
I guess I didn't make myself clear (though I did use the word contrast)...
Assertions are easy. Specifying API usage is much more problematic. (Examples: if open(file-id) is called, then eventually (before the program terminates) close(fid) must be called; read(file-id) is legal if open() was successful, and close() haven't been called yet.)
|
|
Noel Welsh - Re: SLAM (at POPL)
1/28/2002; 12:54:32 PM (reads: 1323, responses: 0)
|
|
This is interesting work! I wonder if having extracted the scenarios, one could run the FSA backwards to generate code. This could have interesting applications in smart IDEs. Imagine a code wizard looking over your shoulder and suggesting common usage patterns. Should be good for rookies; would probably drive the grey-beards insane! :)
|
|
|
|