## automatic theorem proving

I've been looking at Coq and automatic theorem proving for a while, although I'm not sure if I completely understand it. From what I can tell, the idea is to create a core set of routines from which a function can be proven correct, given a detailed specification of what it should do. Is this right, or am I completely off track?

### Basically Right

I'd phrase it this way: you specify what your function does in an impractically powerful type system (the Calculus of Inductive Constructions), then you prove that your specification is sound by implementing it in the proof language (Gallina). Thanks to the Curry-Howard Isomorphism, this really is "defining the type of a function and implementing the function." Then you can extract your function to O'Caml, Haskell, or Scheme. Since O'Caml and Haskell are statically typed, but with weaker type systems than Coq's, the extracted code may not be typable in those type systems, and so might have to rely on unsound mechanisms (Obj.magic in O'Caml; Unsafe________ in Haskell), but that's OK, because the operations have been proven sound in a richer type system. In Scheme, of course, no such magic is necessary.
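
To make that concrete, here's a minimal sketch of the specify/prove/extract cycle. The names `double` and `double_correct` are my own toy example, and the exact extraction incantations vary a bit across Coq versions (recent versions want `Require Extraction` first):

```coq
(* Implement a function in Gallina. *)
Definition double (n : nat) : nat := n + n.

(* Prove it meets a specification, stated as a theorem. *)
Lemma double_correct : forall n : nat, double n = 2 * n.
Proof.
  intro n. unfold double. simpl. rewrite <- plus_n_O. reflexivity.
Qed.

(* Extract the certified function to O'Caml. *)
Require Extraction.  (* needed in recent Coq versions *)
Extraction Language OCaml.
Extraction double.
```

The extracted O'Caml code is an ordinary function; the proof is erased, having done its job at development time.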

If you're seriously exploring Coq, then I think Coq'Art is a must have.

Finally, a minor point: Coq is not an automated theorem prover, but rather a proof assistant: it supports interactive, rather than automated, theorem proving. This is a perhaps subtle, but crucial, distinction: Coq's logic/type system is so powerful that it can't automagically prove most theorems of any interest, although auto with... and eauto with... are awfully impressive!

### I'd phrase it this way: you

I'd phrase it this way: you specify what your function does in an impractically powerful type system (the Calculus of Inductive Constructions), then you prove that your specification is sound by implementing it in the proof language (Gallina). Thanks to the Curry-Howard Isomorphism, this really is "defining the type of a function and implementing the function." Then you can extract your function to O'Caml, Haskell, or Scheme. Since O'Caml and Haskell are statically typed, but with weaker type systems than Coq's, the extracted code may not be typable in those type systems, and so might have to rely on unsound mechanisms (Obj.magic in O'Caml; Unsafe________ in Haskell), but that's OK, because the operations have been proven sound in a richer type system. In Scheme, of course, no such magic is necessary.

I'm not sure if I'm understanding you correctly. If you eventually export to a programming language, doesn't that make it one itself?

If you're seriously exploring Coq, then I think Coq'Art is a must have.

Finally, a minor point: Coq is not an automated theorem prover, but rather a proof assistant: it supports interactive, rather than automated, theorem proving. This is a perhaps subtle, but crucial, distinction: Coq's logic/type system is so powerful that it can't automagically prove most theorems of any interest, although auto with... and eauto with... are awfully impressive!

I'll keep that in mind as I look into it, thanks :-)

### Yep!

Curtis W: I'm not sure if I'm understanding you correctly. If you eventually export to a programming language, doesn't that make it one itself?

It's one itself whether it allows you to extract to another programming language or not; that's the meaning of the Curry-Howard Isomorphism. Put another way, all programming languages are theorem provers (the popular ones being theorem provers for relatively weak logics), and all theorem provers are programming languages (albeit generally impractical ones; try timing Coq's "eval compute in..." sometime). See Djinn for another excellent example of the ramifications of this. Another good one is Oleg Kiselyov's De-typechecker.
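
For instance (a toy illustration of my own devising), the theorem "if P implies Q and P holds, then Q holds" is, under Curry-Howard, just the type of function application:

```coq
(* The theorem is a type; the proof is a program inhabiting it. *)
Theorem modus_ponens : forall P Q : Prop, (P -> Q) -> P -> Q.
Proof.
  intros P Q f p. exact (f p).
Qed.

(* Printing the proof term reveals an ordinary function, roughly:
   fun (P Q : Prop) (f : P -> Q) (p : P) => f p *)
Print modus_ponens.
```

The tactic script and the lambda term are two views of the same object.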

Update: While you're waiting for Coq'Art, check out the Coq tutorial, and just to prove the point about Coq being a programming language, note that you can use it as an outrageously powerful calculator:

```
Coq < Eval compute in (2 + 2).
= 4
: nat
```

This simple interaction hides a stupendous amount of expressive power! Consider, for example, that "nat" is not a primitive type in Coq (in fact, the Calculus of Inductive Constructions has no primitive types); that "compute" is short for "cbv beta delta iota zeta," that is, "use call-by-value employing the beta, delta, iota, and zeta reduction strategies;" and that addition isn't primitive in Coq. Heck, even "2" is just syntactic sugar for (S (S O)), likewise "4" is sugar for (S (S (S (S O)))). No, I'm not kidding.
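
If you don't believe it, you can rebuild the whole apparatus yourself in a few lines (primed names here are mine, chosen to avoid clashing with the standard library):

```coq
(* Naturals and addition as ordinary user-level definitions. *)
Inductive mynat : Set :=
  | O' : mynat
  | S' : mynat -> mynat.

Fixpoint plus' (n m : mynat) : mynat :=
  match n with
  | O' => m
  | S' p => S' (plus' p m)
  end.

(* 2 + 2, spelled out in full: *)
Eval compute in plus' (S' (S' O')) (S' (S' O')).
(* = S' (S' (S' (S' O'))) : mynat *)
```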

### Isn't it very hard to

Isn't it very hard to comprehend what has gone wrong in Coq? Deciphering the tracebacks appears not to be pleasurable. I checked out Alloy once for model checking and, besides some awkwardness in mentally translating the pseudo-OO notation into the underlying relational logic, I found the relation graphs close to incomprehensible for non-trivial models.

PS. I've sent a message to Avi Pfeffer, the author of IBAL. I asked him whether IBAL is still maintained, because the last release was old and didn't even build correctly. He responded that IBAL continues to be maintained, and he has updated the release to include the /Tests directory.

### I'm Sure It Is!

Kay Schluehr: Isn't it very hard to comprehend what has gone wrong in Coq? Deciphering the tracebacks appears not to be pleasurable.

If you mean "if I develop a function in Coq and evaluate it and it doesn't work, isn't it very hard..." I have to imagine that the answer is yes, given the extremely high level of abstraction of the concepts being employed both in the expression and in the evaluation of it. Coq'Art makes very clear that Coq isn't really intended to be used for programming itself for this reason: it's both hard to figure out what's going on when things don't go as you expect and it's really slow, both as consequences of its extreme abstraction.

But so far, I haven't used Coq that way. I'm using it purely as an interactive theorem prover (from which I can then extract code). In that process, Coq really shines: I can undo the application of tactics that didn't help when I thought they would, I can restart the entire proof from scratch, I can take a detour to identify and prove a lemma and then use that lemma in my main proof... and I have a slew of ready-to-use tactics for various situations and, if I need, a pretty extensive library of user contributions in various domains. I'm enjoying it a lot, and looking forward to studying Xavier Leroy's work on implementing a compiler in Coq.
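
As a small illustration of that lemma-detour workflow (my own toy example, using modern Coq's bullet syntax): prove a helper lemma first, then lean on it in the main proof:

```coq
(* The detour: a lemma about successor on the right of +. *)
Lemma plus_S : forall n m : nat, n + S m = S (n + m).
Proof.
  induction n as [| n IH]; intro m; simpl.
  - reflexivity.
  - rewrite IH. reflexivity.
Qed.

(* The main proof reuses the lemma. *)
Theorem plus_comm' : forall n m : nat, n + m = m + n.
Proof.
  induction n as [| n IH]; intro m; simpl.
  - rewrite <- plus_n_O. reflexivity.
  - rewrite IH. rewrite plus_S. reflexivity.
Qed.
```

If a tactic turns out to be a dead end, `Undo` (or just editing the script) backs it out; nothing is committed until `Qed`.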

### ok

I think I get what you're saying, although I'm still having trouble grasping what Coq is proving if you're just giving it a program. Surely you have to provide it with a spec of what the code should be doing, or am I just completely misunderstanding what it does?

p.s. Thanks for taking the time to answer my questions. I know I can be a bit sparse on the details sometimes, but I appreciate your help :-)

### You're On the Right Track!

Curtis W: I think I get what you're saying, although I'm still having trouble grasping what Coq is proving if you're just giving it a program. Surely you have to provide it with a spec of what the code should be doing, or am I just completely misunderstanding what it does?

Nope, you're not misunderstanding at all. What's confusing when you're new to interactive theorem proving and the Curry-Howard Isomorphism, I think, is that you call the thing you're trying to prove a "theorem" and the list of tactics that you apply in the process a "proof," but there's a one-to-one correspondence (hence "isomorphism") between "theorems" and "types," and "proofs" and "programs." So yeah, this raises the question: what's the point? The point, as I see it, includes, but is not necessarily limited to, the following observations:

• The theorems are as expressive as the entirety of constructive mathematics, and if you want, you can even add the axioms necessary to support classical logic, at the expense of extractability. That is, if you stick to constructive (also called "intuitionistic") logic, you can extract correct code, whereas classical logic lacks this capability. You've seen me refer to the Calculus of Inductive Constructions, which is the logic underlying Coq. It might be helpful to note that the CiC lies at the "top right rear" corner of the Barendregt Cube, which is kind of a neat visualization of various typed lambda calculi and how they relate to each other in expressive power. In other words, Coq's type system is as expressive as it gets. By comparison, it's worth noting that Standard ML and O'Caml sit at the "bottom left front" corner of the cube. Their type systems are relatively impoverished!
• Proving a theorem with Coq is kind of like writing code in an IDE with good undo support, interactive evaluation, and so on. But only kind of. It's harder to explain than it is to try. Suffice it to say that this takes practice and that you build up an intuition about what will work and what won't over time, just as with any other programming process. But in this case you could, e.g. learn how to prove the four-color map theorem, or how to prove Fermat's Last Theorem, or how to prove the Riemann Hypothesis. Do it constructively and then you can extract code to implement it automagically! (Doing that last one will instantly make you world famous, by the way.)
• Proving things is almost always harder than you think. Try proving that 1+1=2 sometime! The miracle of modern computing is precisely that we don't have to do this stuff from first principles; the miracle of a proof assistant like Coq is that we can when we want or need to.
• If you work through Types and Programming Languages or its ilk, you'll be writing a lot of proofs. Doing them by hand is tedious and error-prone; doing them with an assistant helps ensure accuracy and tends to be faster, at least once you have some experience with the assistant. In this case, being able to extract, e.g. a working type checker or inferencer from your proof might just be a bonus, or it might be the point of the exercise.
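
On that 1+1=2 dare: in Coq the statement reduces by computation, so the proof is mercifully short, but only because the definitions of 1, 2, and + are quietly doing all the work:

```coq
(* 1 is S O, 2 is S (S O), and + is a recursive definition,
   so 1 + 1 and 2 compute to the same normal form. *)
Theorem one_plus_one : 1 + 1 = 2.
Proof. simpl. reflexivity. Qed.
```

Doing the same thing on paper, from the Peano axioms, is exactly the tedium a proof assistant spares you.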

I think that's enough for starters. The crucial take-away, IMHO, is that Coq's type system is wildly more expressive than that of any programming language except those that are also based on dependent types—and some such programming languages, e.g. Epigram, impose restrictions in order to keep type inferencing decidable. As Philippa Cowderoy said in another thread (I'm paraphrasing wildly): if it can't be proven in the Calculus of Inductive Constructions, chances are it can't be proven, period, since you essentially have all of constructive mathematics to work with. And again, thanks to Curry-Howard, we have theorems <-> types, and proofs <-> programs, which Coq's extraction facilities exploit to give us O'Caml, Haskell, or Scheme code. So you can prove things about data structures and algorithms involved in, e.g. distributed systems, cryptographic protocols, information flow, etc. that simply can't be expressed in any popular programming language, and then extract code that uses Obj.magic/Unsafe__________ but has been proven to do so correctly.

Does that help clarify?

Curtis W: p.s. Thanks for taking the time to answer my questions. I know I can be a bit sparse on the details sometimes, but I appreciate your help :-)

That's very kind of you. I confess to having grown impatient with your questions/comments in the past, because I felt that you were attempting to make points based on very limited analogies, making the points suspect. However, I think I failed to support my point that the analogy was limited and instead only revealed my frustration on a personal level, which I regret. I'm heartened to see that I didn't scare you away. :-)

### It might be helpful to note

It might be helpful to note that the CiC lies at the "top right rear" corner of the Barendregt Cube, which is kind of a neat visualization of various typed lambda calculi and how they relate to each other in expressive power. In other words, Coq's type system is as expressive as it gets.

The Barendregt Cube is far from comprehensive, e.g. Pure Type Systems do not fit into it. That said, CiC should do you.

But in this case you could, e.g. learn how to prove the four-color map theorem, or how to prove Fermat's Last Theorem, or how to prove the Riemann Hypothesis. Do it constructively and then you can extract code to implement it automagically! (Doing that last one will instantly make you world famous, by the way.)

As well as net you 1,000,000 USD: the Riemann Hypothesis is one of the Clay Millennium Prize Problems. P vs. NP (a more computationally oriented result) is also one of them.

### Oops!

Derek Elkins: The Barendregt Cube is far from comprehensive, e.g. Pure Type Systems do not fit into it. That said, CiC should do you.

Good point. I can't recall; are Pure Type Systems actually more expressive than the Calculus of Constructions (inductive or otherwise)? I recall reading them characterized as "a generalization of the Lambda Cube," but it wasn't clear to me from that reading whether they added expressive power or not. Regardless, the Open Calculus of Constructions, CINNI, Uniform Pure Type Systems, and Maude all seem quite fascinating, and strike me as if they could serve as an excellent alternative to Coq and the CiC.

### ML is not that impoverished

It might be helpful to note that the CiC lies at the "top right rear" corner of the Barendregt Cube, which is kind of a neat visualization of various typed lambda calculi and how they relate to each other in expressive power. In other words, Coq's type system is as expressive as it gets. By comparison, it's worth noting that Standard ML and O'Caml sit at the "bottom left front" corner of the cube. Their type systems are relatively impoverished!

That is not true. ML subsumes a significant subset of System F_omega (with higher-order modules, it actually contains F_omega), so if the CoC lies "top right rear", then ML lies close to "top right front" (or "top left rear", or "bottom right rear", depending on how you rotate it).

"Bottom left front" is the simply typed lambda calculus, corresponding to a boring language like Pascal or C.

### Right Again :-)

Yep, you're right; I misspoke and said "Standard ML and O'Caml," when I should have said "Core ML." Very definitely, with their module systems, Standard ML and O'Caml give us universal and existential quantification and a host of other goodies that make them more expressive than Core ML! Thanks for cleaning up after me!

### Clarification

Just to clarify a bit more :-), even Core ML is "almost" F_omega. The missing bit is that type lambdas (on both term and type level) are restricted to first-order. More precisely, polymorphic types have to be in prenex form (rank 1 polymorphism) and type constructors cannot be higher-order. Moreover, type lambdas on term level - like all types there - are implicit.
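
To see the rank-1 restriction from the other side (example mine): the following definition takes a polymorphic function as an argument and uses it at two different types. That makes its type rank-2, so Core ML rejects it, while Coq accepts it happily:

```coq
(* A rank-2 type: the argument f must itself be polymorphic,
   which prenex (rank-1) polymorphism cannot express. *)
Definition apply_poly (f : forall A : Type, A -> A) : nat * bool :=
  (f nat 0, f bool true).

Eval compute in apply_poly (fun A (x : A) => x).
(* = (0, true) : nat * bool *)
```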

### There I Go Again!

Ach, of course, you're right: even let polymorphism is strictly more expressive than the simply typed lambda calculus! I think someone should draw a Barendregt Cube annotated with where various historical and current languages fit on it—that might actually help future discussions about expressive power. In any case, thanks for your patient help. :-)