Lambda the Ultimate

Prolog and Natural-Language Analysis
started 10/8/2002; 2:09:16 PM - last post 11/2/2002; 12:50:19 PM

Ken Shan - Prolog and Natural-Language Analysis

10/8/2002; 2:09:16 PM (reads: 2337, responses: 11)

Prolog and Natural-Language Analysis

From the introduction to the book:

This book is an introduction to elementary computational linguistics from the point of view of logic programming. The connection between computational linguistics and logic programming has both formal and utilitarian aspects. On the formal side, we shall explore the restricted logical language of definite clauses as a means of expressing linguistic analyses and representations. On the utilitarian side, we shall introduce the logic-programming language Prolog, whose backbone is the definite-clause formalism, as a tool for implementing the basic components of natural-language-processing systems.

Now available online.

(The idea of parsing as deduction has been mentioned here before.)
Posted to Logic/Declerative by Ken Shan on 10/8/02; 2:15:07 PM

jon fernquest - Re: Prolog and Natural-Language Analysis

10/9/2002; 12:33:40 AM (reads: 1542, responses: 1)

Here is a course page with source code from the book:

http://www.let.uu.nl/~Richard.Moot/personal/prolog.html

Putting this book online means that the starting point for a lot of other great online material is available:

Covington's untyped feature structures: GULP 3.1, an extension of Prolog for unification-based grammar (PDF, Postscript, Prolog Source) Preprocessor for handling feature structures, such as case:nom..number:plural, in Prolog programs.
Robin Cooper's "Prolog and Natural Language Semantics" [1, 2]
A First and Second Course in Computational Semantics

You can go quite far in modelling natural language and building useful natural language applications with the ideas in these books.

Another great (easy to understand, but not online) Prolog natural language book is:

Matthews, Clive (1998) An Introduction to Natural Language Processing Through Prolog. Longmans.

One chapter analyzes several non-trivial English grammatical constructions in Prolog. He suggests getting a comprehensive book on transformational grammar and code up the grammatical constructions in Prolog. I can't find the book he recommends, but here's a review of my current favorite, comprehensive grammar:

http://www-writing.berkeley.edu/TESL-EJ/ej12/r7.html

Ehud Lamm - Re: Prolog and Natural-Language Analysis

10/9/2002; 2:36:34 AM (reads: 1561, responses: 0)

From a theoretical point of view, is Prolog really appropriate for modeling natural languages?

jon fernquest - Re: Prolog and Natural-Language Analysis

10/10/2002; 3:43:15 AM (reads: 1482, responses: 0)

> From a theoretical point of view, is Prolog
> really appropriate for modeling natural languages?

According to my understanding, that's what Colmeraurer in France originally designed it for.

There might be better technologies, but if you want to write programs that model language I'd recommend Prolog and this online book, because there's the real danger that you'll get so caught up in a search for the ultimate tool or grammatical formalism that you'll never get any work done.

In the '80's Pereira and Schieber wrote several great papers and little systems using Prolog, like a cool geography question answerer Chat-80 that I got to run pretty easily on SWI-Prolog (eliminate the statistics calls on the mainline).

Typed feature structure unification like that found in HPSG seems to be a better programming language (the list of grammatical phenomenon that it handles in the example sentence list in the distribution is amazing), but you'll have to buy proprietary Lisp to use Stanford's HPSG.....I wasted many an hour trying to get it working with command line based clisp ...maybe the situation has changed...other systems require proprietary SICSTUS prolog....

Ken Shan - Re: Prolog and Natural-Language Analysis

10/10/2002; 10:16:54 AM (reads: 1483, responses: 1)

A simple linguistic intuition behind using a unification-based formalism to model natural language is agreement (e.g., number agreement between subject and verb in English; case agreement (i.e., dative vs accusative) between verb and object in German).

Here is a programming-language way to think about unification grammars: Grammars are type systems that specify which expressions are well-formed and which are not. For example, "Alice slept Bill" is not grammatical, for perhaps the same reason "sleep(alice, bill)" would not be grammatical if alice and bill are of type "person" and sleep is of type "person -> bool". Unification is useful in modeling natural language grammars, then, for the same reason it is useful in designing programming language type systems. Search is useful in natural language parsers for the same reason type inference is useful in programming language compilers.

One might expect from the above intuition that other aspects of PL type systems may also be applicable to NL grammars. For example, subtyping and intersection types may help us understand how "frauen" in German ('women' plural) can appear as the object of both accusative and dative verbs. (Agreement issues quickly get complicated as you consider say conjunction, e.g., English "and".)

Ehud Lamm - Re: Prolog and Natural-Language Analysis

10/10/2002; 1:26:06 PM (reads: 1512, responses: 0)

One might expect from the above intuition that other aspects of PL type systems may also be applicable to NL grammars.

Well, Prolog does unification, of course, but it doesn't have a powerful type system.

Ken Shan - Re: Prolog and Natural-Language Analysis

10/10/2002; 4:26:41 PM (reads: 1469, responses: 1)

You are correct that Prolog doesn't have a powerful type system. That makes it difficult to implement type systems in Prolog whose types have complex kinds. Note that when you implement a type system in Prolog, you are treating types in the object language as Prolog terms (i.e., data in the meta-language). Hence kinds in the object language correspond to types in the meta-language (i.e., Prolog types, if it had them).

Ehud Lamm - Re: Prolog and Natural-Language Analysis

10/13/2002; 5:48:04 AM (reads: 1508, responses: 0)

Of course, you can model a typed object language using an untype meta language, in the same way you can use unification even if the meta-language deosn't have this facility. It just means you have to implement these yourself.

So the real question is whther you use Prolog because it is a good all around langugae, or because it has features that resemble features of the object language.

jon fernquest - Re: Prolog and Natural-Language Analysis

10/17/2002; 2:23:19 AM (reads: 1415, responses: 0)

> Of course, you can model a typed object
> language using an untype meta language,
> in the same way you can use unification
> even if the meta-language deosn't
> have this facility. It just means
> you have to implement these yourself.

Carpenter chose Prolog to implement typed feature structure ALE and at the same time wrote papers that discussed a lower level, more efficient virtual machine implementation.

Efficient virtual machine implementations like WAM for prolog always seem to come a long time after the initial language definition, cf ML, first there was ML and then efficient CAML.

I guess you could call implementations like the object language ALE in the untyped meta-language Prolog "proto-type implementations" as opposed to efficient "production implementations" like Prolog implemented with the WAM virtual machine.

One advantage of using untyped Prolog over typed Mercury as a meta-language is portability. You can run ALE on many different Prologs, I experimented with on free SWI-Prolog. Also the definition of Prolog is pretty stable and unchanging (Mercury?) which reduces implementation maintenance. Since there are so many open questions in the area of type systems, strong statically typed languages like Haskell are research languages in a constant state of flux, e.g. when you read the circa 1995 research on modular monadic semantics it's all in Gofer with its type system, when you read research circa 2000 it uses Haskell98 type extensions. Untyped meta-languages are probably safer until some consensus on type systems forms.

In the posting above I said "untyped" when I meant "weakly typed."

My understanding: The stronger the type checking is, the *earlier* (compile-time static checking vs. run-time dynamic checking vs. no checking at all) and *more extensive* (not just in data types but in functions and parametrized types also) it is. (i.e. stronger = early and often).

Anyone know of a nice, complete set of definitions?

jon fernquest - Re: Prolog and Natural-Language Analysis

10/19/2002; 3:51:15 AM (reads: 1392, responses: 0)

> A simple linguistic intuition behind using a
> unification-based formalism to model
> natural language is agreement...
> One might expect from the above intuition
> that other aspects of PL type systems may
> also be applicable to NL grammars.
> For example, subtyping and intersection types
> may help us understand how "frauen" in
> German ('women' plural) can appear
> as the object of both accusative and dative verbs.
> (Agreement issues quickly get complicated
> as you consider say conjunction, e.g., English "and".)

Benjamin Pierce quote:

"Type systems describe approximations of program behaviors, concentrating on the shapes of the values passed between different parts of the program." (see From prior LTU posting)

What's cool about the examples you give is that they go beyond "approximations of program behavior." They seem to be "exact specifications of program behaviour."

Supposedly dynamic languages like Lisp and Perl are popular because they are not strongly typed and it easier to break and remold abstractions. But if type systems every get to point where they are highly expressive languages for building (and for taking apart and rebuilding) abstractions (like the linguistic abstractions you describe) it seems like they might be a lot more popular and even supercede popular dynamic languages in popularity. Hudak's Haskell music DSL is also a useful type based language for building abstractions in a specific domain: music. [See also Javore (Java version) and also check out jMusic]

Ken Shan - Re: Prolog and Natural-Language Analysis

11/2/2002; 12:25:13 PM (reads: 1276, responses: 1)

I agree that, as type systems become more and more powerful, they are likely to be used for more things that are done dynamically today. In the case of modeling natural language with types, though, I'm not sure if a specification of grammaticality would constitute an "exact specification of program behaviour".

It depends on what you mean by "program behaviour": As someone studying semantics, I am used to thinking that program behaviour includes whether a sentence is true or not. Type systems of the sort I described only tell you whether "Alice loves Bob" is well-formed; they don't tell you whether it is true...

Ehud Lamm - Re: Prolog and Natural-Language Analysis

11/2/2002; 12:50:19 PM (reads: 1336, responses: 0)

Did you see the recent post about Type-Directed Natural Language Parsing?