Middle History of Logic Programming

Resolution, Planner, Edinburgh LCF, Prolog, and the Japanese Fifth Generation Project arXiv:0904.3036

Logic Programming can be broadly defined as “using logic to infer computational steps from existing propositions” (although this has been opposed by Kowalski; see below). The focus of this article is on the middle period of the development of this idea from the advent of uniform proof procedures using resolution to the Japanese Fifth Generation Project.

Kowalski [1988] stated “Looking back on our early discoveries, I value most the discovery that computation could be subsumed by deduction.” According to Hewitt et al., and contrary to Kowalski, computation in general cannot be subsumed by deduction. Hewitt and Agha [1991] and other published work argued that mathematical models of concurrency did not determine particular concurrent computations, as follows: The Actor Model makes use of arbitration for determining which message is next in the arrival order of an Actor that is sent multiple messages concurrently. For example, Arbiters can be used to implement the arrival order of messages sent to an Actor, an order that is subject to indeterminacy. Since arrival orders are in general indeterminate, they cannot be inferred from prior information by mathematical logic alone. Therefore mathematical logic cannot implement concurrent computation in open systems. Consequently, Procedural Embedding of Knowledge [Hewitt 1971] is strictly more general than Logic Programming.

Over the course of history, the term “functional programming” has grown more precise and technical as the field has matured. Logic Programming should be on a similar trajectory. Accordingly, “Logic Programming” should have a more precise characterization, e.g., “the logical inference of computational steps.” Kowalski's approach has been to advocate Logic Programming in terms of backward-chaining-only inference, building on the resolution uniform proof procedure paradigm for proving theorems. In contrast, our approach was to reject the resolution uniform proof procedure paradigm and to explore Logic Programming defined by a principled criterion, namely, “the logical inference of computational steps”.

Note: This article is about the middle history of Logic Programming. See ArXiv:0901.4934 for conceptual development of the role of Logic Programming from its origins to the present.

Kowalski given IJCAI-11 Award for Research Excellence

Congratulations to Robert Kowalski who has been given the IJCAI-11 Award for Research Excellence. (The original citation was revised to remove fundamental contributions to procedural embedding of knowledge and to include logic programming.)

Logic Programming is currently being neglected

It was good that IJCAI revised Kowalski's citation to include Logic Programming because it is currently being neglected.

PLT Newbie's Query

There are elegant "multi-paradigm" languages today, but none that I am aware of include logic programming amongst the "paradigms" they embrace. Why?

- SMcL

Mozart/Oz

Mozart/Oz

Embeddability

It's relatively easy to embed SLD resolution within other programming languages. See, for example, Kanren for Scheme and JStetL for Java. Note that decent metaprogramming facilities, such as those in Scheme, make embedding much more pleasant.
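To make the "relatively easy" part concrete, here is a minimal sketch (not Kanren's actual code, and with no occurs check, just like standard Prolog) of first-order unification in Haskell. Unification is the core operation behind SLD resolution; a real embedding layers logic variables, goals, and a search strategy on top of it.

    import qualified Data.Map as M

    -- Terms are either logic variables or function symbols applied to arguments.
    data Term = Var String | Fun String [Term] deriving (Eq, Show)

    -- A substitution maps variable names to terms.
    type Subst = M.Map String Term

    -- Follow variable bindings until an unbound variable or a function term is reached.
    walk :: Subst -> Term -> Term
    walk s t@(Var v) = maybe t (walk s) (M.lookup v s)
    walk _ t         = t

    -- Unify two terms under a substitution, extending it on success.
    unify :: Term -> Term -> Subst -> Maybe Subst
    unify a b s = case (walk s a, walk s b) of
      (Var x, Var y) | x == y -> Just s
      (Var x, t)              -> Just (M.insert x t s)
      (t, Var y)              -> Just (M.insert y t s)
      (Fun f as, Fun g bs)
        | f == g && length as == length bs ->
            foldl (\ms (x, y) -> ms >>= unify x y) (Just s) (zip as bs)
      _ -> Nothing

    -- unify (Fun "f" [Var "X", Fun "a" []]) (Fun "f" [Fun "b" [], Var "Y"]) M.empty
    --   ==> Just (X bound to b, Y bound to a)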

Hansei

One reason is that once you have continuations in a language, implementing other paradigms than what comes "out of the box" can be done as a library rather than as explicit language support. A good example of this is Hansei, which embeds lazy non-deterministic probabilistic programming in OCaml, which is a strict object-functional language to which delimited continuations have been added, also as a library.

Haynes

And of course any language with tail recursion can directly model continuations if you can carry out a CPS transformation, as Siskind's Screamer does.

A key reference here is Haynes, 1986, Logic Continuations, which describes how continuations can be used to break up backtracking into a series of entries into a continuation. I recommend paying attention to Haynes' classification in section 3.
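As a hedged illustration of the general idea (not Haynes' actual formulation), here is backtracking in Haskell written with an explicit success continuation and failure continuation: each answer is delivered by entering the success continuation, and backtracking is just invoking the failure continuation.

    {-# LANGUAGE RankNTypes #-}

    type FK r   = r               -- failure continuation: what to do when choices run out
    type SK a r = a -> FK r -> r  -- success continuation: an answer plus a way to backtrack

    newtype ND a = ND { run :: forall r. SK a r -> FK r -> r }

    ret :: a -> ND a
    ret x = ND (\sk fk -> sk x fk)

    -- Nondeterministic choice: try each element, backtracking into the rest on failure.
    amb :: [a] -> ND a
    amb xs = ND (\sk fk -> foldr (\x rest -> sk x rest) fk xs)

    bind :: ND a -> (a -> ND b) -> ND b
    bind m f = ND (\sk fk -> run m (\a fk' -> run (f a) sk fk') fk)

    -- Fail unless the condition holds.
    require :: Bool -> ND ()
    require True  = ret ()
    require False = ND (\_ fk -> fk)

    -- Collect every answer by re-entering the continuations.
    answers :: ND a -> [a]
    answers m = run m (\a more -> a : more) []

    -- Example: Pythagorean triples with components up to 20.
    triples :: ND (Int, Int, Int)
    triples =
      amb [1..20] `bind` \a ->
      amb [a..20] `bind` \b ->
      amb [b..20] `bind` \c ->
      require (a*a + b*b == c*c) `bind` \_ ->
      ret (a, b, c)
    -- answers triples ==> [(3,4,5),(5,12,13),(6,8,10),(8,15,17),(9,12,15),(12,16,20)]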

Logic/Functional languages

I would propose Lambda Prolog and Mercury as logic programming languages with a strong functional programming influence, and Monadic Constraint Programming as an integration of the logic/constraint paradigm into a functional language.

Most early multi-paradigm languages had logic paradigms

e.g. Leda, Oz.

The later multi-paradigm languages arose so that future modern languages would not "forget" the lessons that had been continuously forgotten in language design. There are position papers from around 1994-1999 that specifically called for the integration of abstract objects and ADTs. This led to a huge research boom around the expression problem, the extensibility problem, Scrap Your Boilerplate, and so forth.

Tim Budd's book, Multi-Paradigm Programming in Leda, actually shows some very elegant examples of merging OO with logic programming. But, as Charles and Paul have already mentioned, there are even snazzier ways that lead to a simpler overall core language.

Any language with continuations

For example, Racket provides quite nice versions of Datalog and Prolog:

http://docs.racket-lang.org/racklog/index.html

The victory of logic programming

I'll argue that logic programming did achieve prevalence among forward-looking languages. Challenge: Find the backtracking unification-based logic language at the heart of Haskell and Scala, the ad-hoc one in C++, and the more sensible one that was cut from C++0x. Hint: You won't find them at runtime!

It is right that logic programming won as a compile-time feature of advanced type systems and disappeared as a runtime feature. The former is highly expressive and fast enough; the latter is nondeterministic and far too slow; and the two cannot soundly be mixed.
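For a concrete (and hedged) illustration of the hint: in Haskell, type-class instance declarations behave like Horn clauses that the compiler resolves during type checking. The toy example below, assuming the usual GHC extensions, does Peano addition entirely at compile time; Scala implicits and C++ templates admit the same reading.

    {-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies,
                 FlexibleInstances, UndecidableInstances #-}

    -- Read the instances as the Prolog program
    --   add(z, B, B).
    --   add(s(A), B, s(C)) :- add(A, B, C).
    -- which instance resolution "runs" during type checking.
    data Z   = Z
    data S n = S n

    class Add a b c | a b -> c where
      add :: a -> b -> c

    instance Add Z b b where
      add _ b = b

    instance Add a b c => Add (S a) b (S c) where
      add (S a) b = S (add a b)

    -- two + one: the result type S (S (S Z)) is computed by the type checker.
    three :: S (S (S Z))
    three = add (S (S Z)) (S Z)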

small victory

I agree with your premises - it is difficult to grok the runtime performance of a logic program, and nondeterminism doesn't play nice with side-effects. But I'm not convinced of your conclusion.

Haskell, Scala, and C++ are not languages designed for runtime logic programming: They aren't collection-oriented. None of these languages has ad-hoc queryable sets. The binding between input and output for functions in these languages is inflexible. They don't have IO concepts built with logic in mind.

I think Bloom is an interesting example of using logic programming as a basis for performance (in this case, leveraging the properties of logic programming for stream processing, filtering, conditionals). I believe functional relational programming is interesting and feasible, though really requires late-binding of the query optimizations (which could be achieved by language support or a staged interpreter). LINQ also fits in here somewhere, seeing as relational and logic programming are very closely related.

I would be pleasantly surprised to see more runtime logic in mainstream languages... but I wouldn't think the mix "unsound".

Any rule-based system has

Any rule-based system has some run-time logic programming in it. Many systems utilize backward and forward chaining even if support for such models isn't first class in the language. The main problem with logic programming languages is that they fix the logic and the logic is not nearly flexible and/or fast and/or powerful enough for application needs, so we rely on lots of domain specific solutions. I don't think the answer is logic programming; maybe the answer is just good general purpose abstractions to implement domain specific logic programming systems/abstractions.

The main problem

The main problem with logic programming languages is that they fix the logic and the logic is not nearly flexible and/or fast and/or powerful enough for application needs, so we rely on lots of domain specific solutions.

I don't believe this problem is unique to logic programming. Indeed, substitute 'logic' with 'single-paradigm' and I suspect you'll be closer to the mark. ;)

I've been considering the possibility of logic or grammars in-the-small, combined with reactive multi-agent programming in-the-large.

The idea would be that agents, conceptually, communicate and operate upon entire databases as the primary agglutinative data type - databases instead of records or lists or uniform sets - and the primary operations are joins and filters on these databases. (Logic programming tends to involve a transitive join.) This has a lot of promise for incremental computation (i.e. a minor change over time - just propagate the relevant changes). I also get ad-hoc distributed queries with forwards and backwards chaining.
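To make that flavor concrete, here is a toy sketch in Haskell (names are made up, not any real system): relations as first-class values, with filter, join, and a transitive join as the primary operations.

    import Data.List (nub)

    -- A binary relation as a plain value.
    type Rel a = [(a, a)]

    selectR :: ((a, a) -> Bool) -> Rel a -> Rel a
    selectR = filter

    -- Relational join on the shared middle column.
    joinR :: Eq a => Rel a -> Rel a -> Rel a
    joinR r s = [ (x, z) | (x, y) <- r, (y', z) <- s, y == y' ]

    -- Transitive join: keep joining the relation with itself until a fixed point.
    closure :: Eq a => Rel a -> Rel a
    closure r
      | r' == r   = r
      | otherwise = closure r'
      where r' = nub (r ++ joinR r r)

    parent :: Rel String
    parent = [("alice", "bob"), ("bob", "carol")]

    ancestor :: Rel String
    ancestor = closure parent
    -- ==> [("alice","bob"),("bob","carol"),("alice","carol")]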

maybe the answer is just good general purpose abstractions to implement domain specific programming systems/abstractions [slightly edited]

Implementing little-languages to control expressiveness will certainly be part of our answer no matter which paradigm we use for the foundation. It would be reasonable to expect staging, specialization, or partial-evaluation so that our little-languages can achieve first-class performance.

But calling it "the" answer is regressive. The challenge is determining those "good general purpose abstractions", and thus we are right back where we started: deciding what makes a "good" programming language. All we've really managed to do is add a requirement: a "good" programming language will enable developers to somehow extend the language with DSLs (raising issues of modularization, extension, composition, safety and security) then smash the tower of interpreters for first-class performance (optimization, compilation, parallelization).

Recognizing this regression allows you to get back on track to whatever paradigm zealotry you were espousing before it was raised. E.g. I could tell you all about why my RDP is better than mainstream paradigm Foo (for most values of Foo) with regards to this DSL business. ;)

Alan Kay's Vivarium project at Apple

The idea would be that agents, conceptually, communicate and operate upon entire databases as the primary agglutinative data type - databases instead of records or lists or uniform sets - and the primary operations are joins and filters on these databases. (Logic programming tends to involve a transitive join.) This has a lot of promise for incremental computation (i.e. a minor change over time - just propagate the relevant changes). I also get ad-hoc distributed queries with forwards and backwards chaining.

Just an FYI, but this exact idea constitutes a precise summary of the Vivarium project at Apple in the 1980s...

Vivarium project

Big ideas are reinvented all the time. It's fitting a bunch of big ideas into a small, simple, consistent design that's the problem.

I've not read much about Alan Kay's project Vivarium at Apple. I've been able to find a few decent sources of information on the topic, but not enough to tell me where they failed. Do you have any others?

Many-core concurrency has supplanted backtracking

Many-core concurrency has supplanted backtracking.

Fortunately, logic programming does not depend on backtracking, e.g., ActorScript

Many-core concurrency has

Many-core concurrency has supplanted backtracking.

What do you mean by this?

A modern programming language ought to compile in linear time

A modern programming language ought to compile in linear time.

A thesis statement ought to have justification.

A thesis statement ought to have justification.

This could be said to just

This could be said to just be common sense. Anything that is super-linear will probably not scale to accommodate large projects.

Whether a language scales to

Whether a language scales to accommodate large projects is more likely a function of its modularity and composition properties than of unit compilation costs.

True. But if you have good

True. But if you have good modularity, you are much more likely to reach linear build times with respect to project size. Perhaps we are defining compile times differently.

There was (maybe still is) a case in Scala where the compiler would try really hard to infer a type, which would cause an apparent freeze. The compiler will eventually give up but uses heuristics to determine when to give up, hence you could always run into a new case that wasn't covered by the heuristic. Any super-linear compile time would have a similar problem; compile times should be deterministically linear.

Type inference in linear time should be a requirement

Performing type inference in linear time should be a requirement for a modern programming language.

Not a real issue

I'm not aware of any interesting type inference algorithm that has linear worst-case behaviour. Nor do I think that is a problem - see e.g. Hindley/Milner.

Actually, the trend is rather the opposite: we are seeing more and more undecidable type systems, where even plain type checking can take potentially infinite time.

What matters more is the "normal" case, not the theoretical complexity bound - again, see Hindley/Milner.

For production languages,

For production languages, fast and predictable compilation performance certainly is a requirement. So C# forgoes many type inference features, even very simple ones, while Scala pulls out all the stops. Even C++ is pretty linear in compile time if templates aren't abused.

In research, I agree, the trend is increasingly toward experimental languages with beefy type systems and no real performance guarantees. But I don't think industry could adopt these languages; in fact it's more likely for an experimental dynamic language to be adopted.

OCaml has a really fast compiler

faster than lots of more generally accepted "production" languages. And this despite the fact that the worst-case performance of H-M type-inference is absurdly bad. I think Andreas' point is right: theoretical niceties matter far less than the practical speed. And at least for H-M type-systems, the practical speed is far far better than the theoretical performance bounds would indicate.

And...

And that although OCaml is in fact one of those languages with an undecidable type system...

Really? I didn't know that.

Really? I didn't know that. What type systems features have made it undecidable?

Abstract signatures

It's the ability to have abstract signatures in the presence of higher-order modules; see this old example of mine.

I agree practical speed is

I agree practical speed is important. When I talk about predictability, I mean reliable performance rather than formally provable performance.

Not funny

For C++ to be linear in compile time you at least have to forbid templates, macros, include files, and probably even user-defined conversions and overloading. At that point the language is more than useless.

I cannot say too much about C#, but I highly doubt that it is linear, given generics, F-bounded quantification, and other type system features alone. So until I see a serious proof published somewhere I'm not ready to believe it even for a minute. (Just recall the original conjectures about the complexity of Hindley/Milner.)

I'm pretty sure C++ is

I'm pretty sure C++ is pragmatically linear if you consider macros, include files, and anything else beyond templates (which are well known to be Turing complete). C++ is designed explicitly for fast compilation; in this regard the user does a lot of manual upfront work so the compiler doesn't have to do any guesswork that might make compilation slower.

C#'s support for generics is fairly simple, there isn't that much type inference. Both C++ and C# compilers are in the area of a million LOC/sec, fairly consistently, with a few caveats for C++.

Linear per compilation unit?

C++ may be "pragmatically linear" (which I assume includes n log n, etc.) for each compilation unit, but if you grow your code base by increasing the number of bounded size compilation units, with a link step at the end, it's not linear in theory or, in my experience, practice.

Come on...

C++ is the prime example of a language where super-linear compilation times actually are a real and persistent problem in practice. Of course, if you "pragmatically" choose to ignore all features that cause this (but yet are essential) then your statement is almost vacuously true.

C++ designed for fast compilation? This is the first time I'm hearing that. Do you have a citation? If it's really true then it's a substantial fail.

C#'s support for generics is fairly simple, there isn't that much type inference.

You don't need type inference to go far beyond linear complexity. The subtyping relation alone can quickly become costly or even undecidable to check. I don't know if that is the case, given that C# has nominal subtyping, but in the presence of generics with bounds etc I'm not sure that you cannot construct counter examples either. They won't really matter in practice, but that was my point.

Interestingly, in the

Interestingly, in the article series by Eric Lippert I linked to in my other post, early implementations of LINQ triggered exponential blow-up in the initial implementation of lambda type inference in the C# type checker; so it can matter in practice.

C++ compile times

C++ tends to be dog slow to compile in large part because of header processing. A header included multiple times may have different meanings at each include point (e.g. due to preprocessor variables being redefined). In general a C++ compiler has little choice but to re-process a header all the way from source each time it's encountered.

"Pre-compiled" headers are non-standard and only work if you carefully design your header to not change meanings.

Some compilers can optimize a bit if you use a standard #ifdef guard, only processing a given header once per compilation unit, but that optimization is pretty fragile and not supported across all compilers.

Indeed

Indeed, as I have recently argued, header files yield quadratic compile times in the worst case.

C++ header file slowness incented development of Google Go

C++ header file slowness incented the development of Google Go. See the excellent presentation by Rob Pike at The Go Programming Language.

Citation needed

If you can come up with a million LOC/sec C++ compiler, I'd like to see it. Similarly C#. There's no reason to expect a C# compiler to be much faster than JavaC, which in my experience tops out around 100KLOC/sec after getting really, really warmed up.

C# does not have a linear

C# does not have a linear typechecker; it is exponential in edge cases. In fact, 3SAT can be expressed as a C# typing problem, making type checking NP-hard.

Nice

Thanks for the links. I'm not surprised. Ad-hoc overloading is evil in the presence of (even limited) type inference.

compile times and project size

if you have good modularity, you are much more likely to reach linear build times with respect to project size. Perhaps we are defining compile times differently

That may well be the case. I happen to include in 'compile-time' everything that happens just after a developer presses 'compile': parsing, macro expansions, staged computation or partial-evaluation, safety analysis, optimization passes, any static linking. I sort of feel we should include unit tests, too, at least if they're executed automatically.

It does not bother me if compile-time is super-linear, so long as it is predictable and justified - i.e. a developer can grok how much time a given compile will take, and can agree with the reasoning behind it. Super-linear does not imply unpredictable - e.g. Θ(N³) for a compilation unit of size N is super-linear but predictable. With staging (compile-time execution of arbitrary code), compile-time may easily be unbounded, but this is less an issue since the developer is in control of expensive bits.

The whole purpose of compilation is to automate some work that the programmer would otherwise need to do by hand. I have this impression that, if your compilation costs aren't super-linear, then your compiler probably isn't doing enough.

Another possibility is that we define 'project size' differently.

I happen to scope 'project' as including all the relevant components - libraries, services, scripts, data, art. We need common architecture, communication framework, protocols, interfaces, and assumptions (e.g. that 'GET' is effectively pure and 'PUT' is idempotent) in order to link components into a common 'project'.

Not many languages are designed for scaling to large projects. (Most aren't even designed to scale beyond a single-process boundary.) This is rather unfortunate because a common language would potentially allow for deeper integration between components - i.e. automating safety and sanity checks, 'deep' object references, promise pipelining and laziness, sharding and code distribution, late-bound optimizations. (We see some of these features with languages designed for scaling, such as E and Mozart.)

Many languages that fail to scale beyond a single process - due, ultimately, to poor choice of general purpose abstractions for modularization and composition - rely heavily on 'libraries' to instantiate, adapt, and provide services. So we end up with a few monolithic processes in our project, each of them replicating a large number of services.

With such monolithic compilation units, I can see why you might really want linear compilation. But I feel that linear compilation is curing the symptom. If we chose better abstractions to start with, so the language could scale across processes and services, the individual components could be much smaller and the unit compilation costs much less relevant.

Compile time = wait time till programmer input is executable

Compile time should include all wait time till programmer input is executable. In this conception, the ideal compile time is zero.

Staging

With staging, it is somewhat unclear whether programmer input isn't already being executed at compile-time.

Should we view 'compilation' as a stage of program execution? Or should we view it as a pure transform between program descriptions? The latter view seems inadequate once we include compile-time evaluation semantics, makefiles and compilation scripts, macros with side-effects, et cetera.

Compile time = wait time till program is executable

Try the following mod:
Compile time = wait time till program is executable

Compiling in linear time

Carl,

Compilation should be incremental, and a language's syntax should have minimal importance. The focus needs to be on the semantics. Unfortunately, many of the languages your ActorScript targets today have semantics that make this very hard, e.g. for concurrency and parallelism refactorings. Thoughts?

Tools need to finally get up to speed on what "software development lifecycle" really should mean, as demonstrated by the Vesta system. Once we get to that point, with all "modern" languages, then we can start building cooler tools such as What If and Provenance debugging tools, along with artificially intelligent help systems.

Moreover, with theories like Pluggable Types, we should consider the possibility of layering type inference systems and not necessarily worry if one layer has an undecidable type system.

If this post is too buzzword-y or namedroppy, let me know.

Updated version

According to http://arxiv.org/abs/0904.3036:

Logic Programming can be broadly defined as “using logic to infer computational steps from existing propositions.” However, mathematical logic cannot always infer computational steps because computational systems make use of arbitration for determining which message is processed next by a recipient that is sent multiple messages concurrently. Since arrival orders are in general indeterminate, they cannot be inferred from prior information by mathematical logic alone. Therefore mathematical logic cannot in general implement computation. This conclusion is contrary to Robert Kowalski, who stated “Looking back on our early discoveries, I value most the discovery that computation could be subsumed by deduction.”

Over the course of history, the term “functional programming” has grown more precise and technical as the field has matured. Logic Programming should be on a similar trajectory. Accordingly, “Logic Programming” should have a general precise characterization. Kowalski's approach has been to advocate limiting Logic Programming to backward-chaining only inference building on the resolution uniform proof procedure paradigm. In contrast, our approach has been to explore Logic Programming defined by a general principled criterion, namely, “the logical inference of computational steps” using inconsistency-robust Natural Deduction that is contrary to the resolution approach requiring reduction to conjunctive normal form followed by proof by contradiction, e.g., connection graphs [Kowalski 1975].

In order to prove a goal G using resolution, (not G) (the negation of G) is placed in the data base and a contradiction is derived. However, this approach can easily prove any proposition from an inconsistency. For example, suppose that there is a simple inconsistent data base with just P and (not P). To prove the false proposition MadeOfCheese[Moon], first prove the lemma (P or MadeOfCheese[Moon]) based on proof by contradiction using resolution. With the lemma in the data base, MadeOfCheese[Moon] follows immediately using resolution.
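Written out in clause notation, the derivation sketched above is (□ is the empty clause):

    \begin{align*}
    1.&\ \{P\} && \text{database}\\
    2.&\ \{\lnot P\} && \text{database}\\
    3.&\ \{\lnot P\},\ \{\lnot \mathit{MadeOfCheese}[\mathit{Moon}]\} && \text{negation of the lemma, for refutation}\\
    4.&\ \Box && \text{resolve 1 with } \lnot P \text{ from 3: the lemma } P \lor \mathit{MadeOfCheese}[\mathit{Moon}] \text{ is established}\\
    5.&\ \{P,\ \mathit{MadeOfCheese}[\mathit{Moon}]\} && \text{the lemma, as a clause in the database}\\
    6.&\ \{\mathit{MadeOfCheese}[\mathit{Moon}]\} && \text{resolve 5 with 2}
    \end{align*}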

Because contemporary large software systems are pervasively inconsistent, it is not safe to reason about them using classical logic.

Note: See ArXiv:0901.4934 for a more comprehensive view that includes Logic Programming.