General

Automatic Patch-Based Exploit Generation

Brumley, Poosankam, Song & Zheng, 2008. Automatic Patch-Based Exploit Generation is Possible: Techniques and Implications:
The automatic patch-based exploit generation problem is: given a program P and a patched version of the program P′, automatically generate an exploit for the potentially unknown vulnerability present in P but fixed in P'. In this paper, we propose techniques for automatic patch-based exploit generation, and show that our techniques can automatically generate exploits for 5 Microsoft programs based upon patches provided via Windows Update. Although our techniques may not work in all cases, a fundamental tenet of security is to conservatively estimate the capabilities of attackers. Thus, our results indicate that automatic patch-based exploit generation should be considered practical. One important security implication of our results is that current patch distribution schemes which stagger patch distribution over long time periods, such as Windows Update, may allow attackers who receive the patch first to compromise the significant fraction of vulnerable hosts who have not yet received the patch.
The technique is based on flow analysis, to test code that gets changed for boundaries where safety properties fail. The limitations of the technique they have developed automatically generate vulnerabilities for only a small fraction of propagated updates. Nonetheless I find it astonishing that such a simple analysis can provide such a payoff. Via Bruce Schneier.

Safe and Secure Software in Ada

A first installment of a booklet by John Barnes titled Safe and Secure Software: An Introduction to Ada 2005.

The purpose of this booklet is to illustrate the ways in which Ada 2005 can help in the construction of reliable software, by illustrating some aspects of its features. It is hoped that it will be of interest to programmers and managers at all levels.

It must be stressed that the discussion is not complete. Each chapter selects a particular topic under the banner of Safe X where Safe is just a brief token to designate both safety and security. For the most critical software, use of the related SPARK language appears to be very beneficial, and this is outlined in Chapter 11.

The introduction is rather amusing, so even non Ada fans may want to take look.

CERT C Secure Coding Standard

From SC-L:

We would like to invite the community to review and comment on the current version of the CERT C Secure Coding Standard available online at www.securecoding.cert.org before Version 1.0 is published.

Hey, maybe this is also relevant for the curriculum thread.

When Are Two Algorithms the Same?

When Are Two Algorithms the Same? Andreas Blass, Nachum Dershowitz, Yuri Gurevich. February 2008

People usually regard algorithms as more abstract than the programs that implement them. The natural way to formalize this idea is that algorithms are equivalence classes of programs with respect to a suitable equivalence relation. We argue that no such equivalence relation exists.

A bit more philosophical than usual, but the issue is quite relevant to discussions in the field.

It is possible to stipulate any equivalence relation that is considered useful (e.g., equivalence up to local transformations) but the notion of a universally applicable relation is indeed problematic.

Programmers At Work

Via Scott Rosenberg (whose book Dreaming in Code I think we mentioned here before), I learn than Susan Lammers is making the interviews from her 1984 book Programmers At Work available on the Web.

Here is how she describes the goals of her new site:

Many people have urged me over the years to do a second Programmers At Work (PAW) with a new generation of programmers and I’ve sketched out the project, made lists of new folks to feature, done inquiries, thought about going back to talk to the guys in the original edition, and other variations . Now with this web site, I will make the original interviews available online, and perhaps it will become the seed for a more “thoroughly modern” approach to the PAW series. What I’m hoping we can kindle on this site is an ongoing exploration and dynamic conversation with the "connected" community of programmers on the web about the creative process in programming.

Who better than the LtU community to contribute to such a conversation?

The first PAW interview posted to the site is the interview with Charles Simonyi, a man whose views of the future of programming, and programming languages, are mentioned here often (though not always with great enthusiasm).

Arc is released

Make of it what you will, but Arc is now officially released.

This part of Graham's announcement is a gem:

I worry about releasing it, because I don't want there to be forces pushing the language to stop changing. Once you release something and people start to build stuff on top of it, you start to feel you shouldn't change things. So we're giving notice in advance that we're going to keep acting as if we were the only users. We'll change stuff without thinking about what it might break, and we won't even keep track of the changes.

I realize that sounds harsh, but there's a lot at stake. I went to a talk last summer by Guido van Rossum about Python, and he seemed to have spent most of the preceding year switching from one representation of characters to another. I never want to blow a year dealing with characters. Why did Guido have to? Because he had to think about compatibility. But though it seems benevolent to worry about breaking existing code, ultimately there's a cost: it means you spend a year dealing with character sets instead of making the language more powerful.

This sure made me smile...

Prediction for 2008

So, what are your prediction for 2008? Naturally, we are only interested with predictions related to programming languages...

Three types of predictions are thus in order: (1) Predictions about PLT research (direction, fads, major results) (2) Predictions about programming languages (whether about specific languages, or about families etc.) and (3) Predictions about industrial use of languages/language-inspired techniques (adoption, popularity).

OCaml Light: A Formal Semantics For a Substantial Subset of the Objective Caml Language

OCaml Light: a formal semantics for a substantial subset of the Objective Caml language.

OCaml light is a formal semantics for a substantial subset of the Objective Caml language. It is written in Ott, and it comprises a small-step operational semantics and a syntactic, non-algorithmic type system. A type soundness theorem has been proved and mechanized using the HOL-4 proof assistant, thereby ensuring that the proof is free from errors. To ensure that the operational semantics accurately models Objective Caml, an executable version of the semantics has been created (and proved equivalent in HOL to the original, relational version) and tested on a number of small test cases.

Note that while we have tried to make the semantics accurate, we are not part of the OCaml development team - this is in no sense a normative specification of the implemented language.

From a team including Peter Sewell (Acute, HashCaml, Ott).

I continue to believe that things are heating up nicely in mechanized metatheory, which, in the multicore/multiprocessor world in which we now live, is extremely good news.

Online Learning of Relaxed CCG Grammars for Parsing to Logical Form

Online Learning of Relaxed CCG Grammars for Parsing to Logical Form, Luke S. Zettlemoyer and Michael Collins. Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2007.

We consider the problem of learning to parse sentences to lambda-calculus representations of their underlying semantics and present an algorithm that learns a weighted combinatory categorial grammar (CCG). A key idea is to introduce non-standard CCG combinators that relax certain parts of the grammar—for example allowing flexible word order, or insertion of lexical items— with learned costs. We also present a new, online algorithm for inducing a weighted CCG. Results for the approach on ATIS data show 86% F-measure in recovering fully correct semantic analyses and 95.9% F-measure by a partial-match criterion, a more than 5% improvement over the 90.3% partial-match figure reported by He and Young (2006)

This paper isn't exactly PL, though Ehud has been okay with the odd foray into computational linguistics before. I thought it was interesting to see machine learning work make use a typed formalism like categorial grammars to handle long-range dependencies, and it leaves me wondering if it's possible to set these kinds of techniques onto program analysis problems.

One neat thing about the CCG formalism is that you have parsing, which is described more or less as a typechecking problem, and separately you have the semantic constraints, which are basically lambda-calculus terms that build up a term in first-order logic. So I can imagine doing something like writing down how you're supposed to use a library as semantic constraints that add up to a Prolog program, and then at compile time the compiler walks the typing derivation to construct the program, and then runs it to figure out if you're doing something dumb (according to the library designers). I don't know if that actually makes sense, but this work certainly prompted me to think.

XML feed