Online Learning of Relaxed CCG Grammars for Parsing to Logical Form

Online Learning of Relaxed CCG Grammars for Parsing to Logical Form, Luke S. Zettlemoyer and Michael Collins. Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2007.

We consider the problem of learning to parse sentences to lambda-calculus representations of their underlying semantics and present an algorithm that learns a weighted combinatory categorial grammar (CCG). A key idea is to introduce non-standard CCG combinators that relax certain parts of the grammar—for example allowing flexible word order, or insertion of lexical items—with learned costs. We also present a new, online algorithm for inducing a weighted CCG. Results for the approach on ATIS data show 86% F-measure in recovering fully correct semantic analyses and 95.9% F-measure by a partial-match criterion, a more than 5% improvement over the 90.3% partial-match figure reported by He and Young (2006).
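For readers who haven't met CCG, here is a minimal Haskell sketch of the core formalism the abstract refers to; it is not the paper's implementation, and the toy lexicon (list, flights) is invented for illustration. Each category pairs a syntactic type with a lambda-calculus semantics, and the application combinators check the types while composing the meanings.

```haskell
-- Toy CCG: nothing here is from the paper beyond the standard formalism.
data Cat = NP | S | FSlash Cat Cat | BSlash Cat Cat  -- X/Y and X\Y
  deriving (Eq, Show)

data Sem = Const String | Var String | Lam String Sem | App Sem Sem
  deriving Show

type Item = (Cat, Sem)

-- Forward application (>): X/Y combined with Y yields X, semantics f a.
fwdApp :: Item -> Item -> Maybe Item
fwdApp (FSlash x y, f) (y', a) | y == y' = Just (x, App f a)
fwdApp _ _ = Nothing

-- Backward application (<): Y combined with X\Y yields X, semantics f a.
bwdApp :: Item -> Item -> Maybe Item
bwdApp (y', a) (BSlash x y, f) | y == y' = Just (x, App f a)
bwdApp _ _ = Nothing

-- "list flights": (S/NP, \x. list(x)) applied to (NP, flights)
example :: Maybe Item
example = fwdApp (FSlash S NP, Lam "x" (App (Const "list") (Var "x")))
                 (NP, Const "flights")
-- Just (S, (\x. list(x)) flights), i.e. the logical form list(flights)
```

The parse succeeds only when the argument category matches the slash's expectation, which is why the post below describes parsing as "more or less a typechecking problem."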

This paper isn't exactly PL, though Ehud has been okay with the odd foray into computational linguistics before. I thought it was interesting to see machine learning work make use of a typed formalism like categorial grammars to handle long-range dependencies, and it leaves me wondering if it's possible to set these kinds of techniques onto program analysis problems.

One neat thing about the CCG formalism is that you have parsing, which is described more or less as a typechecking problem, and separately you have the semantic constraints, which are basically lambda-calculus terms that build up a term in first-order logic. So I can imagine doing something like writing down how you're supposed to use a library as semantic constraints that add up to a Prolog program, and then at compile time the compiler walks the typing derivation to construct the program, and then runs it to figure out if you're doing something dumb (according to the library designers). I don't know if that actually makes sense, but this work certainly prompted me to think.
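To make that speculation concrete, here is a hypothetical sketch, again in Haskell rather than Prolog; every name in it (Fact, Opened, readAfterClose, the file-handle scenario) is invented for illustration and comes from neither the paper nor any real compiler. The idea is just that walking a typing derivation yields a set of first-order facts, and the library designer's "dumb usage" rule is then checked against them.

```haskell
-- Purely hypothetical: facts a compiler might emit while walking the
-- typing derivation of:  h <- open "f"; read h; close h; read h
data Fact = Opened String | Read String | Closed String
  deriving (Eq, Show)

derivationFacts :: [Fact]
derivationFacts = [Opened "h", Read "h", Closed "h", Read "h"]

-- The library designer's rule, standing in for a Prolog clause like
--   dumb(H) :- closed(H), later_read(H).
readAfterClose :: [Fact] -> Bool
readAfterClose fs =
  not (null [ () | (i, Closed h) <- indexed
                 , (j, Read h')  <- indexed
                 , h == h', j > i ])
  where indexed = zip [(0 :: Int) ..] fs

main :: IO ()
main = print (readAfterClose derivationFacts)  -- True: "something dumb"
```

Whether a real compiler could harvest such facts from its typing derivations cheaply enough is exactly the open question here.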


dumb(x) -> fails(x)

it leaves me wondering if it's possible to set these kinds of techniques onto program analysis problems.

My sense is that PL compilers are already ahead of the curve on this. Since even the shoddiest PL is much more rigorously defined in both syntax and semantics than any human language, it is much easier to accomplish the same kinds of analysis in the standard phased compilation approach.

There may still be some inspiration there though...

the compiler walks the typing derivation to construct the program, and then runs it to figure out if you're doing something dumb

The challenge here would be to define "something dumb" as a first order predicate of appropriate effect. ;-)

CCG and programming language theory

There are many parallels between CCG, other approaches to what linguists call "type-logical grammar," and things which have been done in theoretical computer science. They all have some connection to twentieth-century work in logic.

I recommend:

- An (out of print) book by Simon Thompson, Type Theory and Functional Programming, which you can download from

http://www.cs.kent.ac.uk/people/staff/sjt/TTFP/

It goes into detail on how Martin-Löf type theory works. You can read it without caring about natural language at all.

- A book on type logical semantics by Bob Carpenter

http://www.colloquial.com/carp/books/tls/index.html

- Mark Steedman's The Syntactic Process
(http://linguistlist.org/issues/11/11-2540.html)

Chicken or Egg?

Let me add Computational Semantics and Type Theory

I'm curious if you agree with my assessment that this is a case of applying PLT successes to help deal with NLP issues, or if you agree with Neel's conjecture that the NLP case has significant potential to illuminate PLT issues?

In other words, do you agree that PLT is way ahead of the game?