A refutation of Gödel's first incompleteness theorem

Contradictions in Gödel's First Incompleteness Theorem.

Notes for the second edition. I have edited this post a bit, sorry if it is against the etiquette. The original post can be found here. I have removed a couple of -periopheral- mistakes. However I want to also warn that the argument is a lot clearer when adapted to a modern rendition of Gödel's theorem, based on Turing machines, as suggested in i this comment. That adaptation is carried out in the comments that hang from the previous referenced comment, this an onwards.

Intro

This refers to Kurt Gödel's "On formally undecidable propositions of principia mathematica and related systems". The notation used here will be the same as that used by Gödel in that paper.

In that work, Gödel starts with a description of the formal system P, which, according to himself, "is essentially the system obtained by superimposing on the Peano axioms the logic of PM".

Then he goes on to define a map Phi, which is an arithmetization of system P. Phi is a one-to-one correspondence that assigns a natural number, not only to every basic sign in P, but also to every finite series of such signs.

One

There are alternative arithmetizations of system P. I will later delve on how many.

This is obvious from simply considering a different order in the basic signs when they are assigned numbers. For example, if we assign the number 13 to "(" and the number 11 to ")", we obtain a different Phi.

If we want Gödel's proof to be well founded, it should obviously be independent of which arithmetization is chosen to carry it out. The procedure should be correct for any valid Phi chosen. otherwise it would **not** apply to system P, but to system P **and** some particular Phi.

To take care of this, in Gödel's proof we may use a symbol for Phi that represents abstractly any possible valid choice of Phi, and that we can later substitute for a particular Phi when we want to actually get to numbers. This is so that we can show that substituting for any random Phi will produce the same result.

The common way to do this is to add an index i to Phi, coming from some set I with the same cardinality as the set of all possible valid Phi's, so we can establish a bijection among them - an index. Thus Phi becomes here Phi^i.

Two

Later on, Gödel proceeds to spell out Phi, his Phi, which we might call Phi^0, with his correspondence of signs and numbers and his rules to combine them.

And then Gödel proceeds to define a number of metamathematical concepts about system P, that are arithmetizable with Phi^0, with 45 consecutive definitions, culminating with the definition of provable formula.

Definition of provable formula means, in this context, definition of a subset of the natural numbers, so that each number in this set corresponds biunivocally with a provable formula in P.

Let's now stop at his definition number (10):

E(x) === R(11) * x * R(13)

Here Gödel defines "bracketing" of an expression x, and this is the first time Gödel makes use of Phi^0, since:

Phi^0( '(' ) = 11

Phi^0( ')' ) = 13

If we want to remain general, we may rather do:

E^i(x) === R(Phi^i( '(' )) * x * R(Phi^i( ')' ))

Two little modifications are made in this definition. First, we substitute 11 and 13 for Phi^i acting on "(" and ")". 11 and 13 would be the case if we instantiate the definition with Phi^0.

And second, E inherits an index i; obviously, different Phi^i will define different E^i. And so do most definitions afterwards.

Since, for the moment, in the RHS of definitions from (10) onwards, we are not encoding in Phi^i the index i, which has sprouted on top of all defined symbols, we cease to have an actual number there (in the RHS); we now have an expresion that, given a particular Phi^i, will produce a number.

So far, none of this means that any of Gödel's 45 deffinitions are in any way inconsistent; we are just building a generalization of his argument.

Three

There is something to be said of the propositions Gödel labels as (3) and (4), immediately after his 1-45 definitions. With them, he establishes that, in his own words, "every recursive relation [among natural numbers] is definable in the [just arithmetized] system P", i.e., with Phi^0.

So in the LHS of these two propositions we have a relation among natural numbers, and in the RHS we have a "number", constructed from Phi^0 and his 45 definitions. Between them, we have an arrow from LHS to RHS. It is not clear to me from the text what Gödel was meaning that arrow to be. But it clearly contains an implicit Phi^0.

If we make it explicit and generalized, we must add indexes to all the mathematical and metamathematical symbols he uses: All Bew, Sb, r, Z, u1... must be generalized with an index i.

Then, if we instantiate with some particular Phi^i, it must somehow be added in both sides: in the RHS to reduce the given expression to an actual number, and in the LHS to indicate that the arrow now goes from the relation in the LHS **and** the particular Phi^i chosen, to that actual number.

Obviously, if we want to produce valid statements about system P, we must use indexes, otherwise the resulting numbers are just talking about P and some chosen Phi^i, together.

Only after we have reached some statement about system P that we want to corroborate, should we instantiate some random Phi^i and see whether it still holds, irrespective of any particularity of that map.

These considerations still do not introduce contradiction in Gödel's reasoning.

Four

So we need to keep the indexes in Gödel's proof. And having indexes provides much trouble in (8.1).

In (8.1), Gödel introduces a trick that plays a central role in his proof. He uses the arithmetization of a formula y to substitute free variables in that same formula, thereby creating a self reference within the resulting expression.

However, given all previous considerations, we must now have an index in y, we need y^i, and so, it ceases to be a number. But Z^i is some function that takes a natural number and produces its representation in Phi^i. It needs a number.

Therefore, to be able to do the trick of expressing y^i with itself within itself, we need to convert y^i to a number, and so, we must also encode the index i with our 45 definitions.

The question is that if we choose some Phi^i, and remove the indexes in the RHS to obtain a number, we should also add Phi^i to the LHS, for it is now the arithmetic relattion **plus** some arizmetization Phi^i which determine the number in the RHS, and this is not wanted.

Five

But to encode the index, we ultimately need to encode the actual Phi^i. In (3) and (4), If in the RHS we are to have a number, in the LHS we need the actual Phi^i to determine that number. If we use a reference to the arithmetization as index, we'll also need the "reference map" providing the concrete arithmetizations that correspond to each index. Otherwise we won't be able to reach the number in the RHS.

Thus, if we want definitions 1-45 to serve for Gödel's proof, we need an arithmetization of Phi^i itself -with itself.

This may seem simple enough, since, after all, the Phi^i are just maps, But it leads to all sorts of problems.

Five one

Now, suppose that we can actually arithmetize any Phi^i with itself, and that we pick some random Phi^i, let's call it Phi^0: we can define Phi^0 with Phi^0, and we can use that definition to further define 10-45.

But since Phi^0 is just a random arithmetization of system P, if it suffices to arithmetize Phi^0, then it must also suffice to arithmetize any other Phi^i equally well. However, with Phi^0, we can only use the arithmetization of Phi^0 as index to build defns 10-45.

This means that, as arithmetizations of system P, the different Phi^i are not identical among them, because each one treats differently the arithmetization of itself from the arithmetization of other Phi^i.

Exactly identical arithmetical statements, such as definition (10) instatiated with some particular Phi^i, acquire different meaning and truth value when expressed in one or another Phi^i.

Among those statements, Gödel's theorem.

Five two [This argument does not hold if we have to consider recursively enumerable arithmetiazations.]

A further argument that shows inconsistency in Gödel's theorem comes from considering that if we are going to somehow encode the index with Phi^i, we should first consider what entropy must that index have, since it will correspond to the size of the numbers that we will need to encode them. And that entropy corresponds to the logarithm of the cardinality of I, i.e., of the number of valid Phi^i.

To get an idea about the magnitude of this entropy, it may suffice to think that variables have 2 degrees of freedom, both with countably many choices. Gödel very conveniently establishes a natural correspondence between the indexes of the variables and the indexes of primes and of their consecutive exponentiations, but in fact any correspondence between both (indexes of vars, and indexes of primes and exponents) should do. For example, we can clearly have a Phi^i that maps the first variable of the first order to the 1000th prime number exponentiated to the 29th power.

This gives us all permutations of pairs of independent natural numbers, and so, uncountably many choices for Phi^i; so I must have at least the same cardinality as the real line. Therefore y^i doesn't correspond to a natural number, since it needs more entropy than a natural number can contain, and cannot be fed into Z^i, terminating the proof.

Comment viewing options

Probably not

Godel's theorem has been checked thousands of times, including mechanically, so the inconsistency you've found is probably one of your contributions. I don't follow it, but what you've written about needing to check every encoding sounds wrong. You only need one encoding. The Godel statement you produce will depend on the encoding you choose, but one Godel statement is enough.

mechanized proof

A reference to a mechanized proof would be the Coq proof by Russell O'Connor. The proof is not completely identical to Gödel's, it contains simplification in places and additional developments related to the question of variable representation, but it should be able to lift most doubts that the result actually holds.

re: mechanized proof

Having a mechanized proof about something that is essentially metamathematical, and where missing a metamathematical step may produce meaningless but coherent mathematics wrongly interpreted as metamathematically meaningful, doesn't help here.

The Book

Nicely put. I might add that there are no mechanized proofs in The Book. Claimed refutations of Gödel's Theorem all, in my experience, treat Gödel's proof as esoteric manipulation of symbols. The original proof was a mess, of course; but by now it's down to a straightforward exercise in diagonalization, and not vulnerable to a complicated argument. It might conceivably be bypassed, by a simple and (once pointed out) obvious bit of reasoning that allows us to simply breeze past it; but I'd expect that to take just a few sentences to explain.

Perhaps

I repeat what was nicely put, and add: perhaps it's fun. If I'm going to answer to an argument that was put forward by Gödel, and that answer can be maintained in the modern incarnations of his words, I should not be so irrespectful so as not to answer directly to him... that was my thought at least :)

My take

My own treatment of Gödel is yonder.

The first proof of something like Gödel's Theorem typically is a mess for, more or less, two reasons; or perhaps one reason with two halves. At first, nobody knows how to prove it cleanly. Iirc, the first simple, clear proof of the Church-Rosser Theorem started circulating about 30–35 years after Church and Rosser's paper. So (1) the originator of the proof doesn't know how to write a really neat proof, and (2) the very awkwardness of the proof makes everyone that much more suspicious and therefore forces the prover to go into inordinate detail to try to make sure they're not overlooking something. Of course, they might be overlooking something anyway, not because they're "wrong" in the essence of their result but because their inordinately complicated method of proof is itself error-prone. Later on, once the essence is better understood and everything looks much simpler, if there was any truly fundamental problem with the result it will become more and more obvious; and by that point, you're not going to convince people there's a problem with the basic result by pointing to the original proof. As the result can be explained simply, any discussion of it must be related to the simple explanation rather than to the unnecessarily difficult original proof; if a new objection can't be related to the simple new treatment of the result, the natural suspicion will be that the lack of clarity is hiding a flaw in the objection.

Ok

I see your point, I concede it. I will try to migrate my argument to the modern incantations of the theorem. Thanks!

My take on your take

In your take you speak about a Turing machine L and another A. The Turing machine L takes a Turing machine f and an input i and works out the output. And A uses L to produce the contrary result; So when applied to itself, produces a contradiction.

the machine A must obviously contain L.
No problem with that. Let's note it with A(L)

L is about implementation, in the sense that to be able to, given some machine f, determine that it produces some output, either you run the machine, or you already have within L all possible results of all possible machines, which is crazy.

So L corresponds with a given way to encode Turing machines and run them.

There are many, many ways of doing this (Encoding and running Turing machines). Let's note this by adding an index to L, L^i. It's totally harmless, and we can still have L^1, that is exactly and simply L, by
definition here and now.

However now we also have different A(L^i), and also A(L^1).

In your text, we may say that you encode A(L^1) in L^1 and feed it to A(L^1).
And this produces a contradiction.

But using indexes allows us to wonder:

What happens if we encode the A(L^1) in L^2 and feed it to the A(L^2) machine?
It will find no trouble at all, no contradiction.
And feeding it to A(L^2) amounts to nothing but running it;
So, running the Turing machine A(L^1) in different environments leads to different
results. The results are implementation dependent.

Expressing the same Turing machine, A(L^1), in 2 different encodings, L^1 and L^2, leads to different results.

Take 3

And here we go :-D ...

The Turing machine L takes a Turing machine f and an input i and works out the output.

Actually, no. All I assumed L would do was decide whether or not f could have input/output pair (i,j). That is, given triple (f,i,j), L would decide whether or not f on input i would produce output j. We don't know how L does its thing. I didn't want a proof that only applies to machines L that work in a certain way, you see; so I made no assumption about how it works (except, of course, the assumption that it's a Turing machine). If there is some way to do what L is supposed to do without L having to work out f's output on i, the proof should cover that.

To bring out the point a bit more: I noted that if L never makes a mistake, there must be some (f,i,j) for which L answers neither "yes" nor "no", because if L never makes a mistake and always answers, we could solve the halting problem. Build a machine L' such that L' takes input (f,i); L' automatically constructs another machine f' such that f' runs f on i and then outputs "confirm"; and L' sends (f',i,confirm) to L. So L' will take (f,i) and tell us "yes" or "no" whether or not f halts on input i. But L' does not necessarily run f on input i. It constructs an f' that would run f on i, but then instead of running f', it passes (f',i,confirm) to L; and we don't know for sure that L runs f' on input i, either.

the machine A must obviously contain L.
No problem with that. Let's note it with A(L)

This is doable, but unnecessary. I do describe how to build A, given L; therefore one could view the construction as a function of L. However, we have only one L (although we don't know which one that is); so I don't see why we should depict A as a function; I'm only interested in the A constructed from the particular L we're looking at. Granted, viewing A as a function of L is doable.

However, at this point you proceed with a bunch of reasoning that depends on assumptions about how L must work. Again: I deliberately made no assumptions about how L works, beyond requiring L to be a Turing machine. If I had made assumptions about how L works, my proof by contradiction would only show that there could not be an L that works the way I'd assumed; and I don't want my proof to be so limited, so I made no such assumptions.

Thanks for the patience

> And here we go :-D ...

Thanks for the patience. Really.

> I didn't want a proof that only applies to machines L that work in a certain way, you see; so I made no assumption about how it works

Nor do I say any assumption has to be made.

But the fact that there are alternative L's is also true. And that fact means that L + A expressed in L behaves differently than expressed in L' . The same Turing machine (A + L in some combination, be L part of A, or part of the input) with the same input, run in different environments -L and L', that are supposedly equivalent as regards running Turing machines-, behaves differently.

on-going

Thanks for the patience. Really.

Pushing the envelope with extra-paradigm scientific ideas is risky and difficult. There are a wide variety of things that can go wrong, both in implementation and presentation. I have some respect and sympathy for someone rational who's willing to try it.

Nor do I say any assumption has to be made.

I did see you making some assumptions, though:

L is about implementation, in the sense that to be able to, given some machine f, determine that it produces some output, either you run the machine, or you already have within L all possible results of all possible machines, which is crazy.

So L corresponds with a given way to encode Turing machines and run them.

That's an unnecessary assumption. You're assuming L has to run the machine. In fact, when someone manages to prove that something surprising can be done, often they've found a surprising way to do it — hence my particular caution not to assume that L has to actually run f in order to deduce things about its behavior.

It is true that L has to use some encoding of f.

There are many, many ways of doing this (Encoding and running Turing machines). Let's note this by adding an index to L, L^i. It's totally harmless, and we can still have L^1, that is exactly and simply L, by definition here and now.

It isn't possible for L to take f unencoded, because the input to a Turing machine has to be a string of symbols, and a Turing machine is not itself a string of symbols. This seems an important point to explore, as you remark (in another comment, below),

> However, we have only one L

The point is that we may easily have more than one. And then things get weird.

It doesn't matter what other Turing machines exist. We suppose that a machine exists with such-and-such properties, and show that this supposition leads to a contradiction. So there wasn't such a machine in the first place.

My point about encodings is that a Turing machine is actually a graph; a set of states, with directed transitions between them labeled with symbols and actions. It's about as un-string-like as you can get. When I speak of L taking as input (f,i,j), what I really mean is that we stipulate some manner of encoding Turing machines as strings, and L takes as input a triple of strings where the first is an encoding of f and the second and third are i and j. There's another encoding there: we have some way to represent a triple of strings; but there are various ways to do that, and the details of how we choose to do it shouldn't matter.

Machine A takes as its input a string i; it figures out what the column number m is for that i; it figures out what the encoding-of-a-Turing-machine is that labels row m; it uses L to decide whether that encoded machine on input i would output "confirm"; and if L produces an answer "no" then A outputs "confirm", if L produces "yes" then A loops forever, and of course if L doesn't produce an answer then neither does A. I have just described a Turing machine, which I'm calling A. We can encode A, producing a string that must appear as the label on some row of the table.

That is, if our initial assumption was true in the first place — that there exists an L with such-and-such properties. But the whole point is that our initial assumption could not have been true, because it leads to a contradiction.

For the later bits of your reasoning, it seems pretty clear to me (tbh) you've gotten yourself tangled up somehow, and I'm not 100% sure what's happened but I have a suggestion. You say

In your text, we may say that you encode A(L^1) in L^1 and feed it to A(L^1).
And this produces a contradiction.

But using indexes allows us to wonder:

What happens if we encode the A(L^1) in L^2 and feed it to the A(L^2) machine?

If I'm reading you rightly, L^1 and L^2 are the same logic except that they use different encodings of machines as strings. A(L^1) and A(L^2) would then be the corresponding machines "A" for these two different versions of "L", although it's a bit dicey to put it this way because, although constructing an "A" for each of two different logics might be fairly described as a function A(L), when you change the encoding used by L you may have to change the way you do the construction of your "A" in ways that are unclear. Glossing over that, though, there's a more basic problem (if I'm reading you rightly): we do not get to choose which machine we encode in which encoding and send to which machine. I'll walk through that carefully.

The machine A(L^1) is doing its own internal processing, based on an input string i. Because A(L^1) is (by supposition) a Turing machine, its encoding (using encoding number "1" that both machines L^1 and A(L^1) use) does in fact occur as the label on some row of the table; since we have A(L^1) in hand, and we know how encoding number "1" works, we could in principle figure out which row m it is, and what input i labels column m, and then we could give input i to A(L^1). It would be fair to think of i as itself an encoding of A(L^1), since we have a unique mapping from Turing machines to input strings i; it probably isn't encoding number "1" (nor encoding number "2"), but technically it is an encoding. And we reason about what would happen if we gave input i to A(L^1), as a consequence of the behavior we've built in to machine A(L^1).

If we wanted to base our reasoning off of encoding number "2", we'd have machine L^2, machine A(L^2), and a whole new table (the ordering of rows could be very different, due to the different encoding). We could apply encoding number "2" to A(L^2), figure out what row the encoding occurs on (call it m2), figure out the input on the same-numbered column (call it i2), and reason about what would happen if we give that input to A(L^2). There isn't anything useful we can say, in general, about what would happen if we gave input i to A(L^2), or input i2 to A(L^1); as far as A(L^2) is concerned, i is just some arbitrary number, likewise i2 has no significance for A(L^1).

Mostly agree with you

> I have some respect and sympathy for someone rational who's willing to try it.

Excellent :)

> There isn't anything useful we can say, in general,
> about what would happen if we gave input i to A(L^2), or input i2 to A(L^1);
> as far as A(L^2) is concerned, i is just some arbitrary number,
> likewise i2 has no significance for A(L^1).

I agree with everything you say, except for one point, that might be recognized on the above quoted words, and which I try to clarify below.

Both L^1 and L^2 embody a way of encoding **any** Turing machine, and working out whether, given some input and output (also properly encoded), the encoded machine would stop with the given output.

Suppose we have a Turing machine f, and an input i and output o appropriate for that machine. Encoding them all to L^1 and giving them as input to Turing machine L^1, should provide the same result as if we encode it in L^2 and work it out with L^2. As an example, we may imagine a Turing machine f1 that determines whether a number n is bigger than 10. If we encode the machine in L^1, encode the number 11 in L^1, and provide both encoded objects as input for L^1, it should stop at "true", exactly the same as if we repeat the procedure with L^2.

I think that we can agree in that if we have some arbitrary L^1 and L^2, and also some particular Turing machine f and input i, and we get a different result if we provide f and i (appropriately encoded) as input to L^1, than we get when we do the same with L^2, we have a problem. We have an implementation dependent result.

Now, the combination of A and L^1 (or any other L) is just pure and simply a combination of a Turing machine and an input for it. So encoding them in L^1 and providing them as input for the L^1 Turing machine should provide the same result as encoding them in L^2 and using it as input for L^2.

And my point is that they do not provide the same result.

(edited for clarity)

More engaged response

If I'm reading you rightly, L^1 and L^2 are the same logic except that they use different encodings of machines as strings

yes. And I agree to most of the previous, although you must admit there is a bit of a drift.

A(L^1) and A(L^2) would then be the corresponding machines "A" for these two different versions of "L", although it's a bit dicey to put it this way because, although constructing an "A" for each of two different logics might be fairly described as a function A(L), when you change the encoding used by L you may have to change the way you do the construction of your "A" in ways that are unclear.

If you change the encoding used by L, you have to change A(L) so that it uses the new encoding. It will just be a new L. I think there is no difference whether you assume that L is a part of A, or an input to it. In both cases, if you want to "perform" A with it, with A encoded in some L, L will have to be encoded to L.

And of course the question is what happens if the L you use to encode and the encoded L are different.

The machine A(L^1) is doing its own internal processing, based on an input string i. Because A(L^1) is (by supposition) a Turing machine, its encoding (using encoding number "1" that both machines L^1 and A(L^1) use) does in fact occur as the label on some row of the table; since we have A(L^1) in hand, and we know how encoding number "1" works, we could in principle figure out which row m it is, and what input i labels column m, and then we could give input i to A(L^1)

Sorry if I misunderstand you, but encoding number "1" is totally out of scope. L^1 is a Turing machine, and as such, it can be encoded in L^1. No need to encode a reference to it.

It would be fair to think of i as itself an encoding of A(L^1), since we have a unique mapping from Turing machines to input strings i

exactly, an encoding of A(L^1) (i.e., of the very simple A combined -as part or as currying- with the quite complex L^1) in L^1

as far as A(L^2) is concerned, i is just some arbitrary number, likewise i2 has no significance for A(L^1).

I totally agree with this. i corresponds to the encoding of A(L^1) in L^1, and to L^2 it is just a number. However,there is also the encoding of A(L^1) in L^2. And that is not just a number to L^2, it corresponds to a particular Turing machine, the same one that corresponds to i in L^1.

And, working out the output of that Turing machine - A(L^1) - in L^1 or L^2 gives different results.

Final take

I will go step by step through the diagonalization method in your post, taking into account the possibility of different encodings for Turing machines, different L^i.

Each L^i must correspond to some table, T^i. Both the rows (Turing machines) and columns (inputs) must be in some encoding that can be fed to L^1 so it can decide whether the Turing machines halt on some output.

To the A(L^i) Turing machines, the tables are oracles, and we'll imagine solved the problem of providing the same oracle to all encodings of each A(L^i).

So now suppose we have a physical implementation of A(L^1), connected to an oracle with the table T^1.

Now we must feed the nth input (encoded in L^1) to the unencoded A(L^1); which corresponds in the nth column to an L^1 encoded copy of itself, of A(L^1). Which therefore must have a connexion with the T^1 itself. Now if we check the nxn cell, there is nothing but chaos. No doubt about it.

Let's imagine now in the same setup that there is a row corresponding to the encoding into L^1 of A(L^2), being L^2 another machine complying with L. To A(L^1) now we provide the ith input that corresponds to the encoding of A(L^2). But now, that input, in the T^2 table to which A(L^2) is connected, doesn't correspond to the A(L^2) encoded in L^2. So there is no trouble in the concerned cells in any of the tables.

This means that different L^i have tables with different entries in some cell that corresponds to the same particular specification of a Turing machine.

The row that corresponds to A(L^1) in the T^1 oracle has some problematic cell, whereas in the T^2 oracle it has none.

Imagine that 2 T^i had different entries in the cells that corresponded to the some Turing machine that checked whether some number is bigger than 10.

The whole edifice around L seems to have a flaw, in which some propositions, some Turing machines, have a certain a meaning or behaviour in every one of the possible L^i except in one, which corresponds to its own encoding. So if this contex is correct, Gödel's proof might survive in it. However, it would be very diminished, since exactly the same propositions that are problematic in some implementation, have a clear meaning in all others. It stops being about Truth, and becomes about implementing Turing machines, or arithmetizations, etc.

More trouble comes from the fact that, if L^1 and L^2 are both correct implementations of L, it should be impossible for a Turing machine to distinguish whether it is being encoded and run in one or another L. However some Turing machine connected to the oracle might search for the rarity in the table, and work out exactly its own implementation by reverse engineering?

It is not difficult to add the concept of "compiler" and "decompiler" Turing machines that take formal specifications of a Turing machine and encode them into some particular L^1, and take their outputs and produce formal specs, and use these machines to clarify the previous algebra.

Conpleteness

This appears to be a criticism of Godel's method, but it is also not a proof of completeness. Where does that leave us? Intuitively do you think a proof of completeness is possible in some systems, or do you think Godel was right, but the proof is flawed?

re completeness

At the beginning of this discussion, I thought I could see an invalidation of Gödel's method. However at the moment I'm not so sure about that; what I see, if my assumptions are correct, is that his method, his proof, is not incorrect, but his conclussions are; his constructions cease to speak about completeness of system P.

And, in my view, if there is no support for such a crazy idea as the incompleteness that Gödel was deriving from his proof, we should better dissmiss it.

But even crazier is the idea that I have really found a flaw in Gödel's argumentation! So I'm not yet thinking too much about consecuences, I am basically going through my argument over and over looking for flaws...

We may have more

> However, we have only one L

The point is that we may easily have more than one. And then things get weird.

Last off

> > The Turing machine L takes a Turing machine f and an input i and works out the output.

> Actually, no. All I assumed L would do was decide whether or not f could have input/output pair (i,j).

To me this amounts to the same thing. The easiest way to take that decision is to run the machine; any other way just adds complexity. Perhaps I'm wrong. But the point is simply that there are alternative L's, not what form might they adopt.

Looking at the treatment of Godel on your blog, we find this:

To pin this down, we first need to work out what "sufficiently powerful" means. Gödel wanted a system powerful enough to reason about arithmetic: we can boil this down to, for an arithmetic function f and integer i, does f(i)=j or doesn't it? The functions of interest are, of course, general recursive functions, which are equi-powerful with Turing machines and with λ-calculus; so we can equivalently say we want to reason about whether a given Turing machine with input i will or will not produce output j. But a formal system is itself essentially a Turing machine; so in effect we're talking about a Turing machine L (the formal system; L for logic) that determines whether or not a Turing machine (the function f) on given input produces given output. The system would be consistent and complete if it confirms every true statement about whether or not f on given input produces given output, and doesn't confirm any false such statements.

I think you may have boiled it down too far, losing a couple of important aspects of the incompleteness theorem in the process. One of these losses makes the problem easier, and the other makes it harder, and they seem to sort of cancel each other out and you're left with what is essentially the halting problem. The missing pieces are:

1. Your version of the problem requires only checking a single input to a function, whereas arithmetic has quantifiers. This makes your version easier.

2. Your version doesn't allow for a proof term to accompany the problem. This makes the problem harder. Checking a function at a single input becomes easy if you're allowed a proof annotation -- just choose the number of steps needed for the Turing machine to terminate as your proof term. Affirming that f(i)=j in N steps just requires a universal machine to run f.

Because of these differences, I'm worried that your treatment in the blog doesn't really capture Godel's incompleteness theorem. Thoughts?

Some thoughts

First, a general observation about these sorts of profound principles named after their discoverer. In such cases, the name honors the person credited with discovering the profound principle, but we're likely to have neatified and generalized the principle since, so that what we end up crediting to them is broader than their original work. I'm after the general principle, the thing that's a deep feature of the fabric of reality. With that in mind...

Working from my belief that the principle involved is very profound and general and should afford a simple proof, I viewed entanglement with quantifiers as a defect; a distraction, making it harder to see the principle and giving an impression that the principle is far more dependent on specifics of the system than it actually is.

The part about proof annotations... is a bit more fraught, perhaps. I suspect it should be considered an adjunct to the introduction of quantifiers.

Concerning the halting problem — well, yes. I have always conceptualized Gödel as another form of the same primordial feature of reality as the halting problem. Gödel being the more elaborated form.

Sure

Generalization is what mathematics is about. My question is whether your treatmeant actually generalizes Godel. I don't think it does at the moment.

Two roads diverged in a wood

I think there are two ways to tackle this. On one hand, there's the matter of what general principle one perceives Gödel to be part of. On the other hand, there's the question of technical reducibility of one problem to another. The first question is subjective. The second seems a technical challenge that I, at least, would probably have to launch a major effort to sort out (and perhaps eventually I will; I've no clue how far in the future that might be, though).

It's been a pleasure

It's been the best conversation I've had in a long time :)

:-)

I also have found it quite pleasant.

Universal quantifier on f

we're talking about a Turing machine L (the formal system; L for logic) that determines whether or not a Turing machine (the function f) on given input produces given output.

I would say that there is an implicit closure here.

I mean, we wouldn't want "a

I mean, we wouldn't want "a Turing machine L (the formal system; L for logic)" that can only check the output of some given f. So I think there is a universal quantifier hidden in that sentence.

Yeah, probably not - but

> needing to check every encoding

You don't need to check every encoding, but you need a proof that is independent of the encoding, i.e., that can be carried out with any valid encoding.

> The Godel statement you produce will depend on the encoding you choose, but one Godel statement is enough.

Only if you can prove that it does not depend on your choice of encoding.

One choice is enough

I don't agree that you need a proof that is independent of the encoding. The encoding is used to make precise, in terms of arithmetic, the mechanics of the proof system. Once you believe that this encoding faithfully captures those mechanics, proving the theorem in that setting is enough.

By analogy, suppose I want to prove the triangle inequality in the plane. I introduce coordinates for the plane and do the proof in coordinates. Do I need to worry that my proof works in any coordinate system? No. I can choose coordinates that are convenient to me.

One choice may not be enough

> Do I need to worry that my proof works in any coordinate system?

No; you need to worry if your proof doesn't work in some valid coordinate system. So you have to protect against it.

> I can choose coordinates that are convenient to me.

Only once you have a description that is independent of the coordinates. Engineers choose coorddinates.

the question is

> I don't agree that you need a proof that is independent of the encoding.

In any case, the question is not whether we **need** such a proof. Such a proof should be obvious and immediate if his proof were right. But making it explicit, that we may try a more general proof that is independent of the encoding, helps to uncover a very big understatement in Gödel's propositions (3) and (4), where he hides behind an arrow the whole of the structure of his arithmetization, while making it seem a simple logical implication.

> The encoding is used to

> The encoding is used to make precise, in terms of arithmetic, the mechanics of the proof system. Once you believe that this encoding faithfully captures those mechanics, proving the theorem in that setting is enough.

If there are alternative encodings that also faithfully capture those mechanics, what harm can be found in seeing whether the theorem also works in them?

But you're not just

But you're not just considering alternate encodings. It sounds like you're requiring these alternative encodings to be "first class":

Therefore, to be able to do the trick of expressing y^i with itself within itself, we need to convert y^i to a number, and so, we must also encode the index i with our 45 definitions.

That is, you're not just generalizing the proof to handle one arbitrary encoding, but are insisting that the proof handle all of them at once. I don't see how you're thinking that's necessary.

> are insisting that the

> are insisting that the proof handle all of them at once

I really do not think that is correct. The index that would be encoded would be that corresponding to a single arithmetization. The problem is the entropy that is needed to discriminate it from the alternatives.

Well you're trying to make

Well you're trying to make the encoding aware of the entire family of encodings. Otherwise you wouldn't need to encode the encoding index i. No?

I think that I don't try to

I think that I don't try to make the encoding aware of anything. What I find is that if you try to make the proof independent of the encoding, you end up having to encode the actual encoding - or the index, which ultimately accounts for the same thing.

Equality

Sometimes i think i understand Godel, then i don't think about it for a while, and i find i lose any clarity i might have had.

To me there seems to be some implicit assumptions that divide mathematics and computation. Equality in mathematics seems to be intentional, things are equal for all time, proofs are true for all time.

Computation on the other hand includes time, equality is extensional, the halting problem is a classic example of something we can only know by 'executing' the code. As far as i know this was all introduced by Turing, and the Turing machine.

My understanding of the kind of mathematics Godel was talking about was the former where consistency is a property, if something is true now we mean it is true for all time. A Turing machine on the other hand does not have to do so, it has a memory, and a computation may oscillate being true one time and false the next. As such a Turing machine is not part of the mathematical universe described by Godel. A Turing machine is more a model of a mathematician, than it is a model of mathematics.

In the world of the Turing machine, just because 2 + 2 = 4 today, does not mean it must tomorrow. A Turing machine exists in a larger universe than mathematics because it includes time. In effect you cannot prove a Turing machine consistent from the outside, because it may change the rules with time (the Y2K bug is an example here).

Edit: a little more clarity on the oscillating computation. The Russel paradox is a problem for classical set theory, because it creates a contradiction, but only because we require all proofs to be true for all time. However if we take the universe of one element sets, and populate it in a computer (a set that contains itself, and a set that contains the empty set), and then flag them if they are not members of themselves, there is no contradiction. The contradiction literally cannot exist because a computer cannot create an infinite set, it cannot perform a literal inductive proof. The computer can iteratively deepen the proof but it can never reach infinity. It appears to me that a computation can never have completeness, and we cannot prove the consistency of its calculations (without looking inside the box to see how it is implemented).

If you drop your keys into a river of molten lava, let 'em go

In the world of the Turing machine, just because 2 + 2 = 4 today, does not mean it must tomorrow.

What does it mean for something to be true in the world of a Turing machine?

Nothing much

I am not sure the question makes much sense. To bring a little Wittgenstein into it, we have one language game called mathematics, where truth has a meaning defined in the context of mathematical proofs, and we have another different language game called computation where a Boolean value can be true or false.

So in the world of a Turing machine, true is one of the two states of a Boolean variable, it has no relation to mathematical truth, except that we can model mathematics in computation using certain algorithms. But I think it's a mistake to conflate the use of the truth value to represent mathematical truth with it actually being "truth".

mutable state doesn't exist in the universe described by Gödel.

Keean Schupke wrote:
> More a model of a mathematician than a model of mathematics.

This is a key observation. To elaborate it more formally, you're trying to elucidate the idea that Turing machines have mutable state, and sets of axioms do not. Gödel described the properties of mathematical systems (ie, sets of axioms) rather than the properties of sequential computation (axioms plus mutable state).

I believe that this elaboration of Enrique's proof fails because it needs an encoding with additional properties to attempt a refutation of a statement restricted to sets which do not include members having that property. The property stated by Enrique's proof does not generalize to systems without mutable state, and therefore does not address the property proven by Gödel.

IOW, what we have here is applicable to stateful rather than functional systems, but stateful systems were, as you correctly point out, already known to be inconsistent; the value of a variable can, by definition, change where the value of a constant cannot.

Not convinced.

If we want Gödel's proof to be well founded, it should obviously be independent of which arithmetization is chosen to carry it out. The procedure should be correct for any valid Phi chosen. otherwise it would **not** apply to system P, but to system P **and** some particular Phi.

To take care of this, in Gödel's proof we must use a symbol for Phi that represents abstractly any possible valid choice of Phi, and that we can later substitute for a particular Phi when we want to actually get to numbers. This is so that we can show that substituting for any random Phi will produce the same result, and thus that the result is well founded.

I am not convinced by your line of argument. I've highlighted what I believe to be the problematic word in your reasoning. The word "must" indicates that there is no valid alternative approach, and the rest of your arguments seem to be derived from here.

One alternative approach is to pick an arbitrary phi and show that it works (as Gödel did). If there exists another phi that is a valid encoding then there must be a bijection between the new phi, and the old. If there is no bijection then the two encoding disagree about the set of valid statements.

As long as the bijection holds we can wrap the original phi in a function that produces the new phi. Thus we do not need to show the result for every phi - only for one phi that is a canonical rendering of the set of statements, and we need an argument that no other valid phi exists so that there is no bijection between them. This is generally understand as phi being a canonical representation of the set.

Agreed

The original rendition of the argument has some flaws, and the one you point out is one of them. The adaptation of the argument to the expression of Gödel's theorem through Turing machines, developed in the discussion with John Shutt in the coments, follows your line of reasoning, and is in my opinion more robust than the original post, although also has weaker -but if correct, still significative- results.

Encodings

I don't really see the connection, but then I find your description to be quite informal so it is certainly possible that I've filled in some of the dots in the wrong way.

In your take you speak about a Turing machine L and another A. The Turing machine L takes a Turing machine f and an input i and works out the output. And A uses L to produce the contrary result; So when applied to itself, produces a contradiction.

Here is an example of where I got lost along the way. How can the machine L take the machine f as input? A machine executes on a tape that encodes it input. So where you say $$f$$ in the above I can understand what you would mean if you said $$f_L$$ to indicate the machine $$f$$ encoded into the language accepted by the machine $$L$$. Or as it is normally phrased, that L is a simulator (interpreter) of the machine $$f$$ (of the language that $$f$$ is implemented in).

Any input to a machine that is not a tape (in the form used in the definition of the machine) requires an encoding. Some of these encodings seem to be missing in your description.

I see mention of the final encoding (outermost wrapper) in each case, but no mention of the inner encodings. The claim that feeding the Gödel-string of $$L_1$$ into $$L_2$$ would have no effect seems to skip over this encoding issue. I would claim (without checking) that if you plug in the extra encodings and keep track of the bijections between them that the issue of translating Gödel-strings from one machine/encoding to another will disappear. Probably.

Encodings are there

In the paragraph you quote, I was trying to link with the terminology used for the common diagonalization method as described in John Shutt's rendering of the theorem, where there is no talk of encodings. Then, in the rest of that comment, and in the following comments in that subthread, I take into account all the necessary encodings, exactly as you suggest, and it is this taking them into account that surfaces the alleged problem I am seeing with the proof. Take this comment and the 2 that follow.

Yes

Yes - thanks for the link to the right comment, I see that you did put the description of those encodings in there. Then I went back over John Shutt's post again and in fact I'm just paraphrasing a point that he has already made, but I'm paraphrasing it quite poorly so I'll just stop there.

Meta things

This discussion seems finished. Thanks to all involved, and to the host, and sorry if someone has felt spammed. (Of course, if anyone else has something more to say, I'm all ears).

I've been a few days sleeplessly agonizing over all that has been put forward here, specilally in the conversation with John Shutt. Most of my original arguments are gone; even the one still left as Five two about cardinality, which was the first I saw, seems to not hold with with recursively enumerable Turing machines.

However, the originating idea of considering and following the alternative encodings, or arithmetizations, still seems to me to lead to a significative result about the metamathematical implications of Gödel's exercise, explicit in the comments around this one, that was buried in a lot of trash in my original post.

So now I think I will try to write a paper about it and see how this holds in the academic community, if it is ever allowed entry at all. I will be doing it here at github. Anyone interested in contributing, either in the redaction or in any other way, will be most welcome.

First, be careful

My advice would be to go over your reasoning even more slowly than you have here. Also, if you can distill a 10,000 feet version that conveys the intuition of what you think goes wrong, that would also help. I had trouble figuring out which your steps was supposed to be the important idea. When the proof has vague or imprecise steps and I don't get the big picture, it's really hard to help.

Thanks

Heartful thanks for the advice. I will follow it and calm down a bit, I was getting carried away.

The problem in understanding the argument is I think that it is necessary to understand a very weird Turing machine called "A" that is very well described in John Shutt's post, it takes a while. Once that is taken, I think that my argument is easy. It boils down to the fact that Turing's statement is made in some encoding, and it contains an encoding, and exactly it, Gödel's statement, expressed in a different encoding than that used within, ceases to be problematic; So Gödel's proposition is just an arbitrary arithmetic proposition in almost any encoding possible. Which in my view is akin to saying that "9 > 8" has different truth value in one or another encoding.

Ah, thanks for explaining

Ah, thanks for explaining the connection to John's blog post. I didn't follow that link and ended up confused when I tried to follow along with your exchange a few days ago.

I believe the situation is that different encodings lead to different Godel sentences. All of them are true but unprovable in the logic (regardless of how you encode them). Any statement which asserts "I am unprovable" -- under any encoding -- is true but unprovable (a fact which holds without regard to any encoding). Of course this isn't going to be very many statements, because it will need to assert that there exists a "proof" which is a very particular mathematical structure under any encoding.

I believe the situation is

I believe the situation is that different encodings lead to different Godel sentences.

Yes, agreed. And each Gödel sentence is a perfectly defined arithmetic recursive function. So what we have here is just a bunch of plain old recursive functions. Each and every one of them can be expressed in all possible encodings.

All of them are true but unprovable in the logic (regardless of how you encode them)

All of them lead to problems when used for Gödel's argument. But also, none of them give any problem when expressed in a different encoding than the one they use internally.

Supposing we are given 2 different arithmetizations of system P or of some equivalent system, and are asked for a proof that at least one of them is invalid as regards being interpreted in system P (or equivalent), a perfecctly sensible strategy would be to find some plain old recursive expression in P that exhibits different behaviour in one or the other arithmetization.

Which is exactly what obtains if you follow Gödel's argument taking into account alternative encodings.

Any statement which asserts "I am unprovable" [...]

If, in different encodings, the exact same statement asserts different things, then you can't say "... unprovable in system P", you have to say "... unprovable in this arithmetization of system P". Which doesn't make sense: depending on a particular implementation to check some general (i.e., independent of the implementation) truth doesn't make sense. So my conclussion is that we have to think of some other interpretation for the weirdness obtained by expressing Gödel's statement in the encoding that it uses internally.

Nope

If, in different encodings, the exact same statement asserts different things, then you can't say "... unprovable in system P", you have to say "... unprovable in this arithmetization of system P".

That's not true. The statement can be interpreted as "unprovable in system P". "Unprovable in an arithmetization" looks like a category error to me.

I tried the analogy of coordinates before, but let me give another analogy. Suppose I see an endgame chess position and want to prove that white can force a win, and decide to write a computer program to check all of the possibilities. I assume you're a programmer, and so you understand how this would work. The first thing I do is pick an encoding of the chess board and implement the rules of chess in terms of that encoding showing that whatever move black makes, white can counter it in a way that leads to a win.

Do you agree that if I do this correctly, I can prove something about the original chess position and not just about my encoding of the chess position?

"Unprovable in an

"Unprovable in an arithmetization" looks like a category error to me.

That's what I meant with "makes no sense".

if I do this correctly, I can prove something about the original chess position and not just about my encoding of the chess position

Yes. And if you obtain a different result if you pick a different encoding and implement the same rules in terms of that other encoding, you should think that you are not doing it correctly.

You're right

My analogy wasn't very good because the thing we want to prove about our logic depends upon the choice of encoding.

If you take a Godel sentence that was generated by an unknown encoding, it will just look like a sentence about numbers. To demonstrate that the sentence is true but unprovable, you need to know that the sentence can be understood as referring to its own provability under some encoding. But it remains true & unprovable even if you don't see how to demonstrate that it is. What is problematic about this?

... as referring to its own

... as referring to its own provability under some encoding

...

That we have agreed that "provability under some encoding" is a categorical error.

Wrong parse

My sentence was meant to be parsed as if it was written:

... you need to know that under some encoding, the sentence can be understood as referring to its own provability.

So "provability under some encoding" is a category error but I didn't use it as such.

If Gi is a Godel statement (which I would assume is a statement -- a symbolic encoding of a proposition -- not a number), constructed on the basis of encoding i, then Gi is either provable or not in *arithmetic*. It doesn't make sense to ask whether it's provable in a different encoding. That's the category error.

You can encode Gi into another encoding, but nothing "goes wrong". Doing so just isn't helpful.

sorry [edited to mark it as off-topic]

[edited: this comment is a bit philosophical, I would steer away from it.]

Gi is either provable or not in *arithmetic*. It doesn't make sense to ask whether it's provable in a different encoding

Yes, but with some caution. I think that there is some important nuance missing in your comment. You don't have arithmetics, you have encodings, models of it. You prove formulae in those models. And to know whether those models are good, you compare the proofs in different models, among them your intuitive model of arithmetics. If all models coincide in a proof, you consider the formula provable in arithmetics.

Once you have a model you trust, you prove in that model, and say that the proof is "in arithmetics". But that is only because you implicitly trust the encoding of arithmetics in which you work. If you were to be offered another model, that made perfect sense as model for arithmetics, and that allowed for different proofs, different truths, you would rightly suspect that one or the other model was not really a good model for arithmetics.

So effecctively you prove things in encodings, in models, but the things are provable in arithmetics, and so proofs of provable formulae can be constructed in valid encodings of arithmetics. [added: The difference I meant here was between verb for models and noun for arithmetics, very lame.]

The problem comes when you have formulae that seem to have proofs in some models but not in others.

I think that that means that you cannot reach arithmetics from your proofs in your models.

By Arithmetic I mean a

By Arithmetic I mean a formal system. When you speak of a model I think you mean an encoding of the formal system. I don't think checking that an encoding faithfully captures a formal system is as interesting as you imply. That's easy math. Also that use of 'model' is going to confuse logicians.

I'm going to have to break for a while but I still don't see any problem. Good luck.

Thanks

I'm going to have to break for a while but I still don't see any problem. Good luck.

Thank you, for all :)

My problem is that I still see a problem...

My problem is that I still see a problem...

[Btw, I know for sure at least some details have gotten skewed in my memory, because I see from the mathematics geneaology project that, strictly by line of dissertation supervisors, my master's thesis advisor isn't a descendant of Juris Hartmanis, although I'm sure that was the name the trail ended with in his answer to me; instead, he's academically descended from Stephen Kleene. :-P]

I've had the feeling for

I've had the feeling for some time that you were making some subtle conceptual mistake that is difficult for us to identify because it causes you to say things that we interpret differently than you do. It made me uncomfortable that you would speak of "encoding in L^i", but L^i is not an encoding and doesn't do any encoding. Encoding is something the human meta-mathematician does; encoding is taking our meta-mathematical meaning and casting it into formal mathematics. The formal mathematics has no meaning except what we give it. I was daunted by the challenge of how to make this aspect of the situation lucid for you, and of course all the effort I might put into that challenge might fail to produce apparent results and yet I wouldn't know whether the failure was because I didn't explain it well enough or because I was wrong about the misunderstanding I was trying to clear up. But in the above comment I think I see a related confusion.

each Gödel sentence is a perfectly defined arithmetic recursive function.

But no, a Gödel sentence isn't any kind of function. It's just an assemblage of symbols. This is part of why I preferred, in my blog post, to use Turing machines: the input to a Turning machine is explicitly a string of symbols (which hopefully is easy to think of as very concrete), and the machine either halts with an output string of symbols, or doesn't. I suspect you're reading too much meaning into the input strings, possibly because you're not sufficiently separating an input string from the maching/function/whatever that, under some particular meta-mathematical encoding convention, we take it to "mean".

[wait, I think I said that wrong; isn't a Gödel sentence actually an integer? all the more lacking in inherent meaning, of course; doesn't change my point.]

That was not well put

each Gödel sentence is a perfectly defined arithmetic recursive function.
a Gödel sentence isn't any kind of function. It's just an assemblage of symbols
...
isn't a Gödel sentence actually an integer?...

What I meant is that each Gödel sentence is an integer, that, under the encoding in which it is a Gödel sentence, corresponds to a perfectly defined arithmetic relation. Take into account that the arithmetic properties of a Gödel number relate it not only to a string of symbols, but also to the syntactic properties of that string of symbols under sytem P, which, if system P is sound, should correspond to its meaning. [added later: what follows in this paragraph can be dissmissed. see next comment. In other words, we cannot have a Gödel number for "3 > 4", even though we can have that string of symbols [added later: Yes, you can have a Gödel number for "3 > 4". Perhaps what I should have said is that you cannot obtain "Bew( 4 < 3 )", or something, but it now seems slightly off-topic ... I'm tripping here, hold on].].

So, if you decode a Gödel sentence under the encoding in which it is a Gödel sentence, you obtain an arithmetic recursive relation; and if you decode another Gödel sentence, under its own encoding, you obtain a different arithmetic recursive relation. Those 2 different plain old arithmetic relations can both be encoded into one or the other encodings, and, depending on which, they will have different "behaviour".

I also totally agree that Turing machines are a much better framework to discuss this than logic.

[sorry about the bracketed

[sorry about the bracketed editing of the previous comment, I will continue here.]

we cannot have a Gödel number for "3 > 4"

What I was thinking about here is that the numbers we use to reach a Gödel sentence are not just Gödel numbers, but Gödel numbers that belong to certain sets, that are defined metamathematically, and so, for example, you have the set of Gödel numbers that correspond to provable formulae. Ultimately, those metamathematical concepts are defined as recursive functions on the integers, and so, conversely, the arithmetic properties of a Gödel sentence carry, not only the formula, but its relation with system P, since it is built with concepts (Bew, etc) that carry the axioms of system P in their definitions.

State of conversation

Honestly, I only meant to slow down my comments here, not end them. It was taking more of my time than I could afford to let it take, so I decided to step back for a few days; I may still have something to say here.

Ok!

I'll wait then, I also need a rest :)

A bit more detail

I will try here to describe the sketch of a conceptual framework where the argument put forward can be examined in more detail.

First we have an informal definition of Turing machine. Infinite tape, internal state, transitions, etc.

Then we have different formal models for that informal definition of Turing machine. One may be a septuple of sets and functions etc, as specified in the wikipedia.

We also have informal specifications for particular Turing machines. To each informally specified Turing machine, we provide a name.

Each of those formal models allows us to produce a formal specification for any informally specified Turing machine. Otherwise, either our informal definition of Turing machine, or our informal specification of the particular Turing machine, or the formal model, is invalid.

So we might define "informal specification of a Turing machine" as that which a human needs to test whether a Turing machine is valid - according to specifications.

Both formal and informal specifications of Turing machines are made by humans, and we may assume that the alphabet used to make formal specifications can be printed to a tape and used as input for Turing machines.

Now we informally specify a particular Turing machine, L for logic. This machine takes as input a Turing machine f and an input i, and determines what the output of f would be if run with input i - or hangs.

To take a Turing machine f as input, f (and i) must be encoded to some particular format, specific for the machine L. So we may think of different actual formal specifications for L, each requiring a possibly different encoding of the machines it uses as input. Let's call L^i to the different formal specifications of L.

So if we want to feed machine f to L^i, then from our formal, human encoding of f, we must obtain an encoding appropriate for the L^i. For this we define a new Turing machine, which we call C for compiler. To each L^i (and each formal model for Turing machines, but for simplicity we'll assume here one single formal model) there corresponds a C^i compiler, that can take a formal specification of a Turing machine, and produce an encoding of it appropriate to be used as input for L^i.

It's clear that the formal specification of some machine f, and the encoding produced from it by some C^i, are equivalent, and, for example, some L^i machine might require the encoding directly produced by humans here. So our compilers might better be called transpilers. But the fact is that there may be alternative formal encodings, be they made by humans of by compiler machines; and that to take them into account, in this discussion, it's clearer to account for them with compiler Turing machines, than with human incantations. So, without loss of generality, we assume a single encoding produced by humans, and all alternative encodings produced by machines.

One final Turing machine we need to define is a decompiler, D^i, that can take the output of machine L^i and encode it to the language of human formal specifications for Turing machines, so humans can check the output.

Now, with this terminology, I think that we can agree that a requirement for all this to be valid, a sanity check, could be as follows:

• Let's say we have a formal specification for a Turing machine f, and for an input x for f. We encode both with any C^i, and feed them to the corresponding L^i. If there is output, we take it and encode it with the corresponding D^i, to obtain a human understandable o. For example, if f is "input is bigger than 10", and i is "9", o should be "false".
• Repeat the procedure, starting with exactly the same formal definitions for f and x, but now using different machines C^j, L^j, and D^j, to obtain o'.
• The requirement is obviously that o and o' should be identical.

An ad hoc formula for this might be (assuming that m(n) stands for the output of machine m when given n as input - or for hanging) :

For all i, j: Di( Li ( Ci (f), Ci (x) ) ) === Dj( Lj ( Cj (f), Cj (x) ) )

But if you do the former with the A machine, the requirement is broken. Substitute f for A(L^1), and there will be some input for which the equality doesn't hold.

Compiler

If I'm following you (important qualification, that), your compiler C^i is what I'm saying isn't real. You're talking (I think) about mapping a formal Turing machine — such as a seven-tuple of sets and functions — to an encoded string. But that mapping literally cannot be performed by a Turing machine, because the input has to already be encoded if it's to be processed by a Turing machine. You might write a Turing machine C^{i,j} that converts an i-encoding of a Turing machine to a j-encoding, but the input to C^{i,j} has to already be encoded. Indeed, the encoding is all in our heads; these actual Turing machines just do symbol manipulations, and only we as thinking observers perceive a correspondence of these symbol manipulations to any sort of "reasoning" about the behavior of described Turing machines.

I'm reminded of the term "meta-circular evaluator". John C. Reynolds coined this name for a technique of describing the semantics of a programming language. The idea (to remind) is that you specify the behavior of a programming language by writing an interpreter for it in another programming language. This is a technique used to good effect in the Wizard Book. He called it "meta-circular" because you aren't (in general) using a programming language to define itself, which would be a circular definition, but, because you are using one programming language to define another, you can't use this technique to define programming languages as a class: you have to already know at least one programming language before you can use the technique. (I'm not sure that's quite what I would ordinarily call "meta", but admittedly the name "meta-circular evaluator" sounds very cool and has turned out to be a very good meme.)

Btw, Alan Turing in his dissertation explored the idea of a hierarchy of formal logics, where you take a logic L that "ought to" be able to prove some proposition P(L) about itself but can't, and you form a new logic L' that's just like L except that P(L) has been added as an axiom. For example P(L) might be the proposition "L is consistent", which L can't itself prove (unless we've mistakenly started with an L that isn't consistent). L' can prove that L is consistent, because it has an axiom saying so; but L' can't prove P(L'). So we have L'' that adds axiom P(L'). L'' can prove P(L'), but cannot prove P(L''). And so on. Those primes get awkward (L''', L''''), so we might as well use a nonnegative integer index (I'll use subscripts, to avoid clashing with your use of superscripts): L_0, L_1, etc. We can also have a logic L_omega, where omega is the number of integers: that is, L_omega is L augmented by axioms P(L_n) for all finite nonnegative integers n. And we can do this for even larger ordinals, noting that L_omega can prove P(L_n) for every finite n but cannot prove P(L_omega). So there is some interesting stuff involving the way different logics relate to each other. (pdf)

We agree on this

You're talking (I think) about mapping a formal Turing machine — such as a seven-tuple of sets and functions — to an encoded string. But that mapping literally cannot be performed by a Turing machine, because the input has to already be encoded

That's clear, that's the reason I said:

we may assume that the alphabet used to make formal specifications can be printed to a tape and used as input for Turing machines.

What I meant is basically what you mean, I think. The model of a septuple of objects is treated in my sketch as the formal framework under which we can produce formal specifications for specific Turing machines. A septuple of sets and functions is, effectively, a set of axioms and theorems that provide the terminology to describe Turing machines, and this terminology can be very easily formatted in a tape fit to be used asinput for a Turing machine.

You might write a Turing machine C^{i,j} that converts an i-encoding of a Turing machine to a j-encoding, but the input to C^{i,j} has to already be encoded

Again, I think it's not difficult to see that we are saying the same thing:

So our compilers might better be called transpilers

I am not assuming any special property of encodings of Turing machines made by humans as compared to those made by compilers, other than the fact that they can be understood by humans.

I use compilers only to cover the fact that different L^i may require different encodings. We might have assumed that we already have our machines and inputs appropriately encoded for our L machines, but the whole point of the exercise is to make those encodings explicit. So that's the reason to have one canonical, humanly produced, humanly understandable encoding, and compiler (transpiler) machines that go from this canonical encoding to specific encodings for specific L machines: to be able to explicitly label differently different encoding of the same machine.

The sketched framework would work equally well (or equally bad) if I had chosen some other non-humanly understandable encoding as canonical encoding, or even no canonical encoding at all; it would just be more difficult to reason about it. The canonical encoding just provides us with a ground where we can compare the outputs of different machines in different encodings.

BTW

Thanks for the link, really interesting, looking forward to finding a while to read it in more depth

So

So if your objections to my proposed framework are already covered, perhaps we might agree on it, it should help focus the discussion on the real point of contention we have here, which is, I think, the level at which the 2 encodings that are involved in the A machine happen: Are there 2 encodings, or one? if there are 2, are they independent, does it make sense to consider 2 that are different? what is the dependency between them, if there are 2? If we denote by Ai the A machine connected to the oracle/table made with Li, does this make sense?:

D1( L1 ( C1 (A2) ) )

Notation

I think I need to say what I mean with

For all i, j: Di( Li ( Ci (f), Ci (x) ) ) === Dj( Lj ( Cj (f), Cj (x) ) )

We take the formal human encoding to be formally equivalent to any other encoding that an L machine might need as input; we only distinguish it as canonical as a convenience.

All letters except superscripts correspond to formal specifications of Turing machines and inputs for them, expressed in the canonical encoding. (Obviously, if a Turing machine can be encoded in some encoding, that encoding must be capable of encoding its inputs, since a specification of a Turing machine must speak about its inputs.)

And, as said, we understand "m(n)" to mean the output of turing machine m when given input n.

Thus,

f

is the canonical encoding of a Turing machine,

Ci

is the canonical specification of a compiler Turing maachine that takes as input anything encoded in the canonical encoding and produces as output the same anything that was used as input but encoded in the encoding that Li needs (let's call it the i encoding),

Ci (f)

is the Turing machine f encoded in the i encoding,

Li ( Ci (f), Ci (x) )

is the output of Turing machine f when given x as input, expressed in the i encoding, and finally

Di

takes the previous as input and produces the output of f on x, expressed in the canonical encoding.

C^i(x)

Wanting to be very sure of each detail, here. C^i is a "compiler" that translates a canonical representation of a machine into its representation in encoding i. I have some qualm over this because it's not clear to me that this translation-between-encodings has to be Turing-computable. However, it's easy enough to move past this, at least tentatively, by stipulating that the encodings are limited to ones whose translation-from-canonical is Turing-computable. I'm stuck, though, when you apply C^i to x. That's an arbitrary input string. It might or might not be a canonical representation of a Turing machine. So it doesn't make sense to me to make it input to C^i.

You are right

So it doesn't make sense to me to make it input to C^i

So this also applies to the whole action of D^i, which takes the output of L^i as input.

And I agree that it is unneeded. I was thinking of the possibility of different alphabets for different machines, that might be encoded in other alphabets, but that would mean having different models for specifying Turing machines, and I have already forced agreeing on a single one.

Until there is no talk of types, or while there is only one type that corresponds to the alphabet to be printed in the tape, there is no need to compile inputs or decompile outputs.

So sorry, please simplify and assume C^i(x) and D^i to be noops, the identity function.

it's not clear to me that

it's not clear to me that this translation-between-encodings has to be Turing-computable

I would think it is. An encoded Turing machine is just a big number, or sequence of symbols, that can be decomposed with the aid of the encoding into pieces that correspond to concepts such as internal state, transitions, etc. This decomposition must be Turing-computable, since it is the decomposition that an L machine must somehow do to be able to work out the output of the encoded machine. And all encodings must coincide in their underlying high level concepts, otherwise they wouldn't be equivalent - they are equivalent as regards precisely those high level concepts.

-

Wrong place (really is an ergonomic problem with this interface).

To recap

I will try to round up and point towards a conclusion.

It is commonly agreed that

The Gödel sentence asserts of itself: “I am not provable in S,” where “S” is the relevant formal system

My point is that it may be that what the Gödel sentence asserts of itself is: "I have some arithmetization of system S within myself (or I have some kind of reference to an arithmetization of S, or any other way to link the Gödel sentence with a particular arithmetization - it is clear that there is a link), and, in that particular arithmetization, it is impossible to construct a proof of myself.".

Would this be significative? In my view, yes. If you have a proof in some arithmetization of system S, and want to lift it to the category of "provable in S", you better make sure that your proof can be carried out in any other arithmetization. Otherwise something is wrong.

So, what might be the test to decide which of both interpretations of Gödel's procedure is correct?

In my view, and sorry for changing from Logic parlance to Turing machines, a valid test would be to decide if this (interpreted as given in the "Notation" comment) is a correct expression:

D1( L1 ( C1 (A2) ) )

If it is, then it must have a different value than:

D2( L2 ( C2 (A2) ) )

So I think that "the A2 machine behaves differently in different L", clearly implies that "Gödel's proof cannot be generalized to system S".

I think that to understand my position, all that is needed is this comment, the one about notation, and the one describing the framework used in the notation; and John Shutt's post.

The inequality is:For all

The inequality is:

For all i,j: There exists x: Li ( Ci ( Ai ), x ) =/= Lj ( Cj ( Ai ), x )

i.e., the x that corresponds to the diagonal column of Ai in the table built with Li.

[sorry, added: with i =/= j]

I think I understand the

I think I understand the formal proposition, but I don't see why it would be true. Let's see.

Ai is a Turing machine that takes an input string and does stuff internally that uses encoding i of Turing machines.

Cj takes as input a canonical encoding of a Turing machine, and produces as output the same machine encoded via encoding i. So Cj(Ai) would be a j-encoding of Ai.

Lj takes two arguments, the first of which is a j-encoding of a Turing machine, and the second of which is an arbitrary input string that is supposed to be sent to the Turing machine whose j-encoding was the first argument. You say, I think, that Lj decides whether or not its first argument given its second argument would halt? Okay. The first argument does need to be j-encoded, though, so given a canonical encoding of a Turing machine f, we can j-encode it by sending its canonical encoding through Cj. Thus, to determine whether or not machine f would half on input x, we would run Lj(Cj(f),x).

But, we're assuming machines Cj and Lj are correctly implemented. So for every machine f, every input string x, and every encoding i and j, we have

Li(Ci(f),x) = Lj(Cj(f),x)

And this should be true no matter what f is. The encoding needed by L must be provided by C, but the machine f is the same machine no matter which encoding is used to represent it. In particular, it doesn't matter whether or not f uses some particular encoding internally (or even several different ones); the encoding used by L and C is a matter of how f is represented, not what it does. For example, f could be Ai, and then there's a particular choice of x that we're interested in for the logic; but that's all a matter of what Ai does internally, having no effect on the way our Lj and Cj "reason" about its behavior.

So I don't see where your inequality would come from; my understanding of what the various machines are suggests that those two things would always be equal.

Let Li be an L Turing

Let Li be an L Turing machine, and Ti the corresponding table. In Ti you have rows for both Ci(Ai) and for some arbitrary Ci(Ak).

Let x be the input that corresponds to the diagonal cell in the row for Ci(Ai) in the Ti table, and z the input that corresponds to the diagonal cell in the row for Ck(Ak) in the Tk table.

Now, in the ( Ci(Ai), x ) cell in the Ti table, the value must be the output of Li( Ci(Ai), x ), but that is undecidable.

Exactly the same happens in the cell ( Ck(Ak), z ) of the Tk table.

Now, what happens in the ( Ci(Ak), z ) cell (not necessarily diagonal, or most probably not diagonal) of the Ti table? Its value must be Li( Ci(Ak), z). Li has access to the Tk oracle, through Ci(Ak); it is needed to work out the output of running Ak with input z. There has been trouble building the Tk table in the diagonal cell ( Ck(Ak), z ), undecidability; but, whatever the problem, in the cell, "undecidable" is not a valid entry; either it is blank, or has a value. So Li will not have any trouble filling that ( Ci(Ak), z ) cell. Remember that to run Li( Ci(Ak), z) we do not have to build table Tk, it is given.

Now, of course the same is true in the first case, in the case where we evaluate Li( Ci(Ai), x ) for the Ti table. Li will have access to the Ti oracle, and whatever the trouble there was when building it in the ( Ci(Ai), x ) cell, the value cannot be "undecidable". However, in this case, it is Li that cannot decide on Ai, whatever the value in the cell.

So it seems clear to me that there is a difference executing (Ai, x) in Li or in Lk. It doesn't seem to me now that it is captured in the inequality above, though.

detail

I believe there is a subtle error in your post, in the sixth paragraph under the "Gödel" title. There, first you fill a table

Enumerate the machines f and make them the rows of a table. Enumerate the input/output pairs and make them the columns. In the entry for row n and column m, put a "yes" if L confirms that the nth machine has the mth input/output pair, a "no" if L confirms that it doesn't.

...

let's restrict our yes/no table by looking only at columns [...]

Then, you define machine A, and say:

For given input, go through the column labels till you find the one that matches it (we were careful there'd be only one); call that column number m. Use L to confirm a "no" in table entry m,m

But you don't need to use L to confirm, you have the table already filled; it is an oracle. And not only don't you need: you can't use L. Because if you Use L to confirm a "no" in table entry m,m, and m happens to be A, you enter a loop.

And you don't even need it to find undecidability:

Now, in the ( Ci(Ai), x ) cell in the Ti table, the value must be the output of
Li( Ci(Ai), x ), but that is undecidable.

The value in that cell has to be the contrary of what is found in the cell of the table of the encoded machine, but it actually is the same table and cell, so it must be the same.

However,

Li will not have any trouble filling that ( Ci(Ak), z ) cell.

Li takes the encoded Ak, the table Tk with it, and the input z, goes to the z column in the Tk table, goes to the diagonal cell, takes the value there, and takes the contrary; Then, Li will have no problem using that contrary to fill in the totally different cell ( Ci(Ak), z ) in the totally different table Ti.

It doesn't seem to me now that it is captured in the inequality above, though.

The LHS is undecidable, and the RHS isn't. However, since undecidable is not an output, and the RHS's output depends on an oracle that depends on an undecidable computation for its response, the actual equality may hold.

Um, no

The table is not an oracle. The table exists only in the mind of the meta-mathematician; it's our way of conceptualizing the range of different computations.

Unloop

Here's imho an interesting point.

if you use L to confirm a "no" in table entry m,m, and m happens to be A, you enter a loop.

That's an unnecessary assumption. We don't know how L works. L is making deductions about the behavior of machines, but there's no reason to suppose L must run the machine in order to deduce its behavior. In fact, running the machine would be quite a weak way to make deductions about it, and in particular one would never deduce that way that a machine was not going to halt.

[Okay, "quite a weak way to make deductions" is a little harsh; one can after all make any correct deduction that way about something that the machine will do. Nevertheless, you can't make deductions about non-halting that way, so it makes sense that L probably wouldn't be limited to only things that can be deduced by simulation.]

Nevertheless, you can't make

Nevertheless, you can't make deductions about non-halting that way, so it makes sense that L probably wouldn't be limited to only things that can be deduced by simulation.

Haven't been following too closely, but would abstract interpretation be considered simulation? Because you can infer termination properties using AI.

The abstraction is normally

The abstraction is normally designed so that the abstract machine will terminate (with a loss of precision in the output being calculated). So it is a form of simulation, but of a less powerful class of machine. It would not execute the simulation of the more powerful machine.

e.g. $$L^i$$ would not execute the TM encoded in language $$i$$ - but it would instead abstract it to an encoding on a less powerful machine. So $$L^i$$ would not simulate a Turing Machine as such, but rather a less powerful type of machine that approximates the execution of the machine that $$L^i$$ should infer something about.

Okay

I can concede that it is not needed that the table is an oracle. The looping argumant is invalid.

However, it is also not needed that it isn't. We can assume it is an oracle, even though we don't need to assume that. But doing it helps isolate the problem I am trying to pinpoint.

It seems clear to me that undecidability happens in different steps for Li( Ci(Ai), x ) than for Lk( Ck(Ai), x). In the first case, undecidability is immediate; in the second case, undecidability comes from the need to compute Li( Ci(Ai), x).

Confining the table to an oracle helps to isolate the different steps at which undecidability happens. It surfaces the difference between Li( Ci(Ai), x ) and Lk( Ck(Ai), x).

So, is there any reason the table cannot be in an oracle? In your post, you say: "...and we've conspicuously placed no constraints at all on O, not even computability".

It seems clear to me that

It seems clear to me that undecidability happens in different steps for Li( Ci(Ai), x ) than for Lk( Ck(Ai), x). In the first case, undecidability is immediate; in the second case, undecidability comes from the need to compute Li( Ci(Ai), x).

No. The choice of encoding k has no effect on the behavior of the encoded machine; it doesn't matter that, in this particular case, the encoded machine happens to be one that, internally, uses encoding i.

Lj(Cj(f),x)) uses a certain logic, L, to deduce the behavior of a machine f on an input x. The deduction depends on three things: the algorithm of L, the algorithm of f, and the input string x. It does not depend on the choice of encoding, j. In the particular case you've been looking at, f=Ai, the superscript i denotes what encoding this A uses internally for its computation. That is, Ai and Aj are algorithmically different. But Ci(Ai) and Ck(Ai) are just two different encodings of the same machine, which happens to be Ai. The k-encoding of Ai is an encoding of a machine that internally uses encoding i. And as long as logic engine Lk is given an encoding of Ai in the encoding it knows how to read, k, it uses the rules of L to make deductions about the behavior of the encoded machine, Ai.

There is no leakage of the encoding choice j into the behavior of the encoded machine Ai.

Of course you are right

Of course you are right. And in my "detail" comment there was something missing: L machines don't build tables, some other builder agent does, using L machines.

I was trying to focus in the difference between the ( Ci(Ak), z ) cell in the Ti table (not a diagonal cell) and the ( Ck(Ak), z ) cell in the Tk table (a diagonal cell). Of course the content should be the same, as you have clearly shown.

When you focus on the n,n cell in your post, that corresponds to ( Ck(Ak), z ) here, and you find undecidability, it is the builder agent for the table, using Lk, Ck(Ak), and z, that finds it, right? Of course, if the table is in the head of the meta-mathematician, the builder agent will be the actual A machine; if the table can be in an oracle, the agent would be something else.

But in the non-diagonal case, the builder agent (supposing some arbitrary machine that needs to access that cell - not an A machine obviously), with access to an oracle, might take a shortcut and not work out the output of Ak(z), and be decidable. Simply follow the steps you prescribe in the rest of the paragraph after focusing in the n,n cell, with a non diagonal ( Ci(Ak), z ), and it doesn't work: Just ask the oracle for the diagonal z cell, without bothering with the corresponding machine, and do what A has to do with it. We are filling a totally different cell in a different table, so no problem.

In the diagonal case, the builder agent (a Turing machine?) would be unable to avoid undecidability. Is this significant? If correct, does this make the tables (and so the A machines) dependent on the encoding?

diagonal

I suspect there's a bottomless pit of technicalities one could get tangled up in with all these conversions between encodings; but, we seem to be getting past that to the essential concepts involved, which is where we want to be.

The diagonal in the table is defined by the way we enumerate the input strings (the column indices) and the way we enumerate the machine encodings (the row indices). Those choices are built in to machine A; machine A defines for itself what is the diagonal. We need the diagonal to cross-cut every row of the table, which is to say that we need A to cross-cut every row of the table, and behave in a way that demonstrates a problem with our supposition when it cross-cuts its own row. We don't have to know just when it cross-cuts its own row, which is to say, we don't need to know which row it occupies, exactly because it cross-cuts every row; that's the essence of the diagonalization method. And A doesn't need to know which row it occupies either. My head hurts just thinking about whether or not it's possible to guarantee that A knows its own row number; if it is possible to show that this cannot be guaranteed, there's probably a proof of it using... diagonalization. I recall hearing a math talk once in which the speaker made a pretty good case that mathematicians who research infinities end up in therapy.

I suspect — suspect — part of your intuition of a problem here is that you're asking, what if we consider multiple encodings, so that what is on the diagonal for the permutation in one encoding is not on the diagonal for another encoding? But the diagonal is actually defined by its hardcoding into the algorithm of A — even more specifically, it's defined by its hardcoding into the algorithm of Ai. So each Ai has its own diagonal that it implements.

[edit: tried to straighten out somewhat muddled phrasing re 'what is on the diagonal' for one encoding versus another]

Almost seeking therapy

I really think that I see and understand what you mean. The only thing I'm not sure about is your suspicion of my intuition of the problem. I think it is not correct, but I'm not 100% sure, and I'd hate to repeat understood things when you already have shown me wrong. But since I'm not sure, I'll risk it and try to detail how I see the problem in light of what I think your suspicion is.

And A doesn't need to know which row it occupies either.

Agreed

the diagonal is actually defined by its hardcoding into the algorithm of A — even more specifically, it's defined by its hardcoding into the algorithm of Ai. So each Ai has its own diagonal that it implements.

Also agreed.

Let's suppose we have 2 different encodings, and an enumeration of Turing machines and inputs for each encoding, so that together they determine 2 different tables *, j and k. We can use them to build 2 A machines, Aj and Ak. Let's also take that the machine Aj, in its diagonal cell in the j table, corresponds to the x input, and that the Ak machine in the k table corresponds to input z in the diagonal.

Let's now limit our attention to the j table. What to put in the Aj,x cell? This cell corresponds to the n,n cell in your post, and the procedure will result in the undecidability you describe.

Now, in the same j table, there is also a row for Ak, and also a column for the z input. That row and column will not correspond -most probably- to a diagonal cell. But we can check anyway what should be in the cell - and it should be exactly the same as in the diagonal cell in the k table. This we have already agreed, I think.

But now, with the non diagonal cell Ak,z in the j table, your undecidable procedure will not work - immediately. It will have to descend to the k table to meet undecidability. Your explanation leading to undecidability would not be identical to the one for n,n - it would be a little bit longer.

Is it a difference, or do I definitely seek therapy? ;) **

What an agent -machine or otherwise- must (or may) do to fill in the Aj,x cell in the j-table and find undecidability is different than what it must do for the Ak,z cell in the same table. Your undecidable process in both cells differs.

If the difference is real, I am despairing of being able to isolate it. Oracles don't seem such a good idea, since using an oracle to build itself seems too circular. Perhaps debugging, tracing the steps?

* [edited to add: I am not supposing that the tables are filled finished tables or anything. Just that an L machine, and an enumeration of Turing machines in the encoding required by the L machine, determines a table. Any machine that has those hardcoded has access to the table - can work out the content of any cell.]

**[edited to add: Honest question. (Not the one about therapy, that's what the ;) if for.) From the clarity and cockyness of the first days of this discussion, I've come to, for the last couple of days, being continuously changing between: There is a difference. But it is insignificant. No, it is significant. Woah. No, but there is no difference. Oops, there is a difference, isn't there? ]

Everyone's mad except me and thee...

I'm going to make a suggestion. I'm not 100% sure of it; I'd need to go back to my blog post and reread that part with this suggestion in mind, to see whether it fits; but I'm going to put the suggestion out here now, rather than wait to check it first and risk losing track of it before getting to the check.

I suggest that the undecidability reasoning using Ak is inherently reasoning about the place of Ak in its own table, the table that Ak defines. That is, when you say

But now, with the non diagonal cell Ak,z in the j table, your undecidable procedure will not work - immediately. It will have to descend to the k table to meet undecidability.

I'm suggesting that this descent isn't anything new, but is how the procedure always worked; it just wasn't obvious because we weren't considering multiple encodings.

Living in an encoded world

I suggest that the undecidability reasoning using Ak is inherently reasoning about the place of Ak in its own table

Agreed that the argument is inherently about A's own table. But that is not incompatible with there being other cells in the table that correspond to other A machines, cells that must be undecidable, but where undecidability will only be met when we reach reasoning about them in their own tables.

And what is in those cells is, basically, Lj (Aj, x) for Aj in its own table, and Lk (Aj, x) for the same machine in other tables.

But now, with the non diagonal cell Ak,z in the j table, your undecidable procedure will not work - immediately. It will have to descend to the k table to meet undecidability.
I'm suggesting that this descent isn't anything new, but is how the procedure always worked;

Ok, so you agree that there may be a difference there. [edited to add: you haven't really agreed, have you ;) The question is that that descent is different for Ak in the k table that in any other table, it's shorter with the k table. See the comment "now with encodings" below.]

The problem is that, if there is such a difference, it may leak the encoding to the encoded.

Suppose we are encoded and live in an encoded world. Let's call this encoding w. Can we work out w from within? Of course not, we shouldn't be able to do that. The encoding shouldn't leak to the encoded objeccts. But now imagine the following procedure.

First, note that in this w world, running a Turing machine f is effectively just doing Lw ( f ).

So we can concoct some enumeration of encodings, or of L machines, and use it to build an A machine for each encoding, and run them. Since we are within w, what we are actually doing is making the world run Lw( Ak, z) for all possible k *. Now, should we be able to notice any difference between running Lw (Aw, z), and running any other Lw (Ak, x)? Beacuse these are the 2 things we have agreed that may be different.

If the above difference is correct and noticeable, we might be able to, among all A machines, identify Aw as the one that contains the w encoding, the encoding of our world.

Note that I'm not saying saying that it must be noticeable and thus we must be able to find w. All I say is that Lw (Aw, z) contains all the information needeed to "fail" earlier than Lw (Ak, x). So that whether it is noticeable and we do notice, may depend on the algorithm the Lw follow, which is outside of our control. However, it may also depend on A, which is under our control, and we might be able to tweak A to expose the difference.

* [edited to add: and z, see next comment]

For completeness

making the world run Lw( Ak, z) for all possible k

For all possible k and z. In this scheme, we should obtain undecidabiity (inconsistencies in the outputs) with one input z for each A, the input that corresponds to the diagonal cell in its own table.

And now, once we have appropriate pairs of undecidable (Ak, z), we can question whether we can trace the A machines and distinguish Aw.

Now with encodings

The question I think has become whether there is a difference between running Lw( Cw(Aw), x) and running Lw( Cw(Ak), z). But that's not exact, because within the world w, we do not know what is happening outside, how those objescts are being used. What we can be certain of is that those objects must somehow exist - that, wathever happens outside, it will have all the information that is assumed within Lw, Cw, and Aw or Ak at its disposal.

So now the question becomes that of finding a path to undecidabity with Lw, Cw, Aw, and x, that is unavailable to Lw, Cw, Ak and z, and that is shorter than any available to the later.

So let's examine the following path.

We have Lw, Cw, Cw(Aw), and x . We analyze Cw(Aw) and see that it contains Lw and an enumeration of Cw encoded machines and inputs. We learn that we must use those enumerations to find out the next necessary operation, which is specified in the diagonal row to the x column. We find out what is in that row, and find Cw(Aw). We already have Cw(Aw), so we can recognize it. And now we learn that we have to output the contrary; but we already know that the contrary must be the same, we have already identified both machines as the same. Here we reach an undecidable state.

We have Lw, Cw, Cw(Ak), and z . We analyze Cw(Ak) and see that it contains Lk and an enumeration of Ck encoded machines and inputs. We learn that we must use those enumerations to find out the next necessary operation, which is specified in the diagonal row to the z column. We find out what is in that row, and find Ck(Ak). We already have Cw(Ak), **but** we cannot recognize it; as of now, we do not understand the k encoding. Now we have to use Lk to analyze Ak and understand what to do. We are already past the state where we found undecidability in the previous case.

So at the moment it semms to me that there is a path, the possibility of leaking seems real.

[edited a bit to remove redundancy]

First wrong turn

It's in the nature of this stuff, I think, that once one makes a wrong turn, one has to stop, because nothing beyond that point can be trusted. You say

Agreed that the argument is inherently about A's own table. But that is not incompatible with there being other cells in the table that correspond to other A machines, cells that must be undecidable, but where undecidability will only be met when we reach reasoning about them in their own tables.

The phrasing "undecidability will only be met" comes out a little garbled, but, on to the substance. A given machine A has some particular table built into it, and diagonals that table; if that machine A is then examined by L, it doesn't matter what encoding is used, it's still the same machine (by assumption the encodings are interchangeable). Each entry in any of these tables is identified by the machine f (labeling the row) and the input w (labeling the column), and the contents of that entry are determined by the logic L; all of that is independent both of the choice of encoding and the actual position in the tables. The different tables have all the same entries, just in different places. All this is true regardless of whether or not f happens to be A. Our meta-mathematical reasoning is about the diagonal entry for A in its own table; that's how we derive our conclusion, by reasoning about that entry, and the difference with the other entries in that table is simply that they aren't the one we're doing that particular meta-level-reasoning about.

I've been looking for a parabel to clarify

Mary says Phillip is a scoundrel, in English. Amelie agrees, she also says Phillip is a scoundrel, but she says it in French. Oh-la-la!

The argument seems to be that since Mary and Amelie don't understand each other they don't agree. But Phillip is a scoundrel!

I would complete the parable with English-speaking Susan. If Susan receives the information from Mary, she will be able to immediately extract meaning, to act on that information and stay away from Phillip. If she receives the information from Amelie, Susan will need an interpreter before she can act on it. The final information received by Susan will be the same in both cases, only in the second case it will be more work to get it.

But the parallelism breaks quite soon, I would not try to follow it much longer.

I'm not speaking about different outputs for different L

The different tables have all the same entries, just in different places

Agreed. I agree that different L, when given as input the same Turing machine, must output the same.

But the 2 different L do not need to follow the same algorithm to come to that output.

In particular, if the information included in the Turing machine given as input, happens to have an overlap with the information contained in the L machine that analyses it, and the L machine finds in the machine given as input the need of some computation it has already performed in its role as L machine, I don't see any reason the L machine would have to perform the same computation again, instead of reusing its previous result.

I say this without assuming anything about the inner workings of the L machine; I only say that, if in those black-box workings, the situation described in the previous paragraph arises, the L machine can take the shortcut. And it will obviously result in the same output as if it had not taken the shortcut.

But the 2 different L do not

But the 2 different L do not need to follow the same algorithm to come to that output.

But they do, in order for the idea of "varying the encoding" to make sense.

L is a Turing machine, which in this case we have chosen to view in a way abstracted from the specific encoding of the Turing machine it takes as input. What, then is "L"? In essence, it is an algorithm, with the details of the input-encoding factored out of it. My understanding of what we mean when we index these machines by which encoding of the input they expect, Li versus Lj, is that the functioning of the machine is the same except for the way it handles the representation of the input; otherwise, it seems we would not have grounds for considering them both examples of L.

In particular, if the information included in the Turing machine given as input, happens to have an overlap with the information contained in the L machine that analyses it, and the L machine finds in the machine given as input the need of some computation it has already performed in its role as L machine, I don't see any reason the L machine would have to perform the same computation again, instead of reusing its previous result.

For the behavior you're talking about, the algorithm that is L would have to have built into it a way to recognize itself when it sees the encoding of itself as input. But in order to understand the implications of this, we need to say more precisely what it is recognizing.

• Perhaps L recognizes an input encoding of a machine that is L using a certain encoding; that is, it recognizes Li for a particular i. That would be L, the algorithm independent of which encoding it uses, recognizing when its input is the particular refinement of L that uses encoding i. To be very clear on this, any particular Lj would recognize when it receives the j-encoding of Li; and it would make no difference to this whether or not i=j. In fact, although the particular Lj may be able to recognize, as you suggest, the j-encoding of itself regardless of whether or not the j it's using is i, Lj cannot know what j it's using. That kind of self-knowledge is precluded by the requirement that the algorithm L is the same and the only thing that varies is which input encoding is used.
• Alternatively, perhaps L recognizes a machine that is L using any encoding; that is, it recognizes Li for any i. This implies that the variation between the different L machines for different input encodings is a regular variation that can be recognized by a Turing machine; I'm uncertain whether or not this variation for encoding must be Turing-recognizable in order for the variation-of-encoding scheme to work, but let's assume it's so at least in this case. Then L, the algorithm independent of which encoding it uses, recognizes when its input is any particular refinement of L to use any particular encoding i. But this has no effect on the impossibility of any particular Lj knowing whether the encoding i used by its input is the same encoding it is using.

No need to "recognize itself"

But the 2 different L do not need to follow the same algorithm to come to that output.
But they do, in order for the idea of "varying the encoding" to make sense.

I would have thought that L machines, defined in your post as Logic machines, similarly to Universal Turing machines, are sufficiently defined that way, and need no further constraints. However, this is not necessary for my argument; all it's needed is that L can take different paths for different inputs. So I won't press this.

the L machine finds in the machine given as input the need of some computation it has already performed in its role as L machine
For the behavior you're talking about, the algorithm that is L would have to have built into it a way to recognize itself

You argue against the possibility that L somehow can recognize itself; But I only propose that L can recognize some operation it has already performed. I think I stated clearly what would need to be recogized for my argument to work in the comment "now with encodings" above, and I don't see it unreasonable in the sense you point out. There is no need there for any machine to "recognize itself".

In fact [if I understand correctly,] what would need to be recognized is the same that would have to be recognized for your diagonal argument to work, when you say in your post that:

By construction, if the entry contains "no", then A outputs confirm, and the "no" is incorrect

It has to be recognized that the content of the entry and the output must be the same.

If you don't need L to

If you don't need L to recognize itself, then you're only talking about what algorithm it uses. The meta-mathematical reasoning here doesn't care what algorithm it uses. No matter what its behavior is, we can see meta-mathematically that there's a problem, and that's all that's needed.

I might have something more to say at your "summary", just posted a few minutes ago.

A summary

I have produced a summary of my position in this discussion at the moment. I still see something unexplained there.

I haven't posted it here because it is mostly copy-pasting from the comments above.

And also a weaker form of the argument here.

Edited to fix formatting in second link

simulation

It looks as if you're getting tangled in trying to find a way around something that inherently cannot be gotten around.

If one is correctly simulating something, by definition the simulated behavior of the simulated thing is the same as the behavior of the real thing would be. The simulated thing cannot "recognize" that it's being simulated unless the actual thing would also "recognize" — incorrectly — that it was being simulated; and in that case, the simulated thing isn't really recognizing that it's simulated, it's just imagining it.

If machine L takes as input an encoding of a machine f, and there are different versions of L that use different encoding schemes but they're all otherwise the same L, then it doesn't matter at all which encoding scheme is used. They must all behave the same way when given f in whatever encoding they use; otherwise it is not valid to claim they are all versions of the same L, differing only by what encoding they use.

Clarification

If machine L takes as input an encoding of a machine f, and there are different versions of L that use different encoding schemes but they're all otherwise the same L, then it doesn't matter at all which encoding scheme is used.

But that's not the scheme I'm proposing in the summary (and above). I am proposing a single Lw machine, that takes different Turing machines as input, Aw and Ak.

As regards a correctly simulated something cannot "recognize" that it's being simulated, I agree, that's one of my premises; that's what I mean with it seems clear that this should be impossible.

The question is that the subsequent procedure seems to contradict that impossibility, and I don't see how that procedure is wrong.

special relativity

Well, first of all, to clear up a point that could otherwise cause problems (as everything can, in this tangle), I suggest it is the scheme you're proposing. You say

I am proposing a single Lw machine, that takes different Turing machines as input, Aw and Ak

Right. It's about the relationship between the simulated machine f and the encoding used by the simulating machine L. The relationship between L and f is independent of what machine f is; in this case f is either Aw or Ak; and in particular, the treatment of f by L is independent of the encoding used by L. You are looking at whether or not the encoding used by simulating L is the same as something used internally by f, which happens to be Ai for some i that may or may not be equal to w. But to change whether or not w=i, you can vary either w or i. In my explanation that it doesn't matter whether w=i, I described varying w. In your search for a flaw in the reasoning, you vary i instead of w. This seems a little like two observers on passing trains disagreeing about which one is moving.

Whichever way one says it, I'm saying that for the simulated machine to determine whether it's being simulated — or how it's being simulated — is impossible. You describe elaborately tangled scenarios, and (if I'm understanding you right) you're aware this shouldn't make a difference but there's something still bothering you that it seems like somehow it does make a difference. I maintain that it cannot make a difference. All the elaborations that you suppose are going on in the simulated machine are all simulated faithfully, no matter what they are. You say that something happens earlier when simulated under one encoding than when simulated under another encoding; but no, as seen within the simulated world, everything takes exactly the same amount of time regardless of which way the simulation is done. This is (with a nod to Einstein) because if you have a clock with you in your simulated world, the clock is part of what is being simulated. You mention (somewhere, I think, though I don't see it in the summary) using some sort of caching to not redo anything that's already been done; but if the cache is in the simulating machine, the simulated machine can't see it; while if the cache is in the simulated machine, the cache is also being simulated, and any judgement of what has already been done is also simulated, so the simulated world still looks the same from the inside no matter which way it's being simulated.

(I was struck, this time around, by the similarity to thought experiments about time dilation in relativity. For whatever that may suggest about the degree of mind-bending-ness in this.)

I have answered to you below, to gain some horizontal space.

Davies

BTW did you think anything about my migration of my argument to Martin Davies version of the Theorem that I linked above? Perhaps it is easier to read, to round things up...

My argument at its simplest (I think)

[Edited to change the title and to remove some garbage]

My argument is much simpler than the entangled scenarios these reasonings lead me to. It is simply that, in all versions of Gödels theorem, an encoding or an arithmetization is used to reach an undecidable proposition or state, but that, using any other encoding or arithmetization, the undecidable proposition or state would be different. And it seems to me that the conclussions generally extracted from the procedure correspond to assuming that different encodings would lead to the same proposition or state.

The conclussions extracted from assuming a dependence between encodings and Gödel sentences seem to me much richer and intuitively understandable; but of course I may be seeing things in the clouds.

For example, I think we should be able to make a distinction between systems where undecidability can be met, and systems where it cannot, systems that would enter an infinite loop before reaching an undecidable state. And this distinction seems to me that should be related to the ability of the systems to solve hard problems. Because the procedure you follow in your post to find undecidability corresponds to a hard problem, right?

meta-math

an encoding or an arithmetization is used to reach an undecidable proposition or state, but that, using any other encoding or arithmetization, the undecidable proposition or state would be different.

I wouldn't call it an "undecidable" proposition-or-state; more like, a proposition-or-state that can be used in the meta-mathematical reasoning to show there's a problem.

This isn't mere terminology. It seems you're treating this as a property of the proposition-or-state. It isn't a property of the proposition-or-state; it's a property of the meta-mathematical reasoning. Or, if you really must phrase it as a "property" of the proposition-or-state, the property is that that proposition-or-state is the one that is used by the meta-mathematical reasoning to show that there's a systemic problem. That's it: the property is that it's the one the meta-mathematics uses. That isn't about the outcome of any formal reasoning/derivation/computation/whatever, it's about the meta-mathematician's reasoning. And although the reasoning uses some particular proposition-or-state, it reaches a conclusion that is about the entire scenario, not just about that one proposition-or-state. Naturally when you change the encoding-or-arithmetization, a different proposition-or-state is involved: the meta-mathematical reasoning is that if you give me an encoding-or-arithmetization, I can choose a proposition-or-state by means of which I can show that there's a problem with the scenario. I don't need to use the same proposition-or-state each time, I just need to, each time, find some proposition-or-state to use.

Are we finished?

idk

I may have more to say. I've been short on time just lately, so haven't had a chance, thus far, to really focus in on your most recent set of remarks.

:)

Train of thought

[Edited to add: This is a response to this comment, posted here to gain space.]
You describe elaborately tangled scenarios, and (if I'm understanding you right) you're aware this shouldn't make a difference but there's something still bothering you that it seems like somehow it does make a difference. I maintain that it cannot make a difference.

Spot on.

I'm aware this shouldn't make a difference; in fact, funnily enough, in a previous comment, in an attempt to convince you of it, I wrote (and then deleted thinking it was unnecessary) something like: "obviously, if 2 different Turing machines simulate the same event, but one takes a thousand years when the other takes one hour, from the point of view of the simulated event, its duration will be exactly the same in both cases".

But yes something bothers me. I'll try to show the train of thought that lead me to my elaborate tangled scenarios. But I realize that we are reaching an impasse, and I would not want to abuse your attention. So I would propose, if I don't manage to communicate anything new to you with this comment, to suspend the conversation and see if I am capable of rounding up my argument. If I were, I would most probably reach you, in case you were interested, before doing anything else. Of course, I must say that, in rounding it up, this conversation is being invaluable.

What bothers me appears from analysing the A machines. They are very weird machines, you have to concede that. Their main purpose is not their output, but undecidability. They are a vehicle to find that a table of encoded TMs and all their possible inputs must by necessity be incomplete.

And now, and this is just a remark, if we consider a different encoding, we will come to the same conclussion, with a different table. But will we have just one undecidable cell in each table? No; with different encodings, we have different A machines, that will occupy different rows in the differents tables, and that will each have one undecidable input.

So these A machines seem to be in a correspondence with the encodings. One encoding determines one A machine, that determines one undecidable cell - in each table.

What is the meaning of this apparent correspondence? What happens in The Book? All A machines will be undecidable propositions in The Book, of course, but will there be any more? Or all incompleteness in The Book can be adscribed to self reference within an encoding of itself (The Book)?

How might we test this?

The way that occurs to me to test it, is to look for a a procedure that would use those A machines, or the Gödel sentences, to leak the encoding to the encoded. Obviously, if such a thing is possible, it would show that the undecidable proposition is a characteristic of the particular encoding with which it is built, or in which it is found, since It serves to identify it. [Edited to add: While this would not mean that undecidability is impossible outside self reference within an encoding, it would limit Gödel's and subsequent results to those cases. It would make for a weaker Theorem.]

And this is what leads me to the 2 elaborately tangled scenarios at the end of my "summary". I don't think they are so tangled ;) They are parallel to your scenario in Structural Insight, and in fact the second is a copy-paste of the first, with the end changed. They pretend to show how 2 equivalent A machines, that should behave identically, reach an undecidable state in a different way depending on the encoding they are analysed.

It is a difference that depends on the algorithm chosen to "run", to analyse the machines, granted; It depends on a particular form of L. But it is a perfectly sensible algorithm. And if with a particular but perfectly sensible encoding and L machine, we can use the A machines to leak the encoding to the encoded, we can only conclude that the A machines - or more exactly, the undecidable states that we have designed them to reach, aka the Gödel sentences, are a property of their encodings.

mistake

I wanted to post this "my argument at its simplest" comment here, sorry.

essence

I'm glad for your simplification of this, elsewhere above, which seems to me to reduce the matter to its essence. I'm going to make a second attempt here at reducing my side of this to its essence.

The proof technique is, I want to show that no system/machine has such-and-such property (in this case, it's the property of being both consistent and complete). So I establish that, if you give me any particular system/machine, I can show that it can't have the property because there would be a problem. The means by which I demonstrate the problem happens to involve choosing an encoding (thus, a "table") and identifying a table entry that I can figure some stuff out about. It doesn't matter which encoding I choose, or which table entry I figure some stuff out about, as long as I do manage to show that there's a problem.

Meta-mathematics

The proof technique is, I want to show that no system/machine has such-and-such property (in this case, it's the property of being both consistent and complete). So I establish that, if you give me any particular system/machine, I can show that it can't have the property because there would be a problem.

There is no mis-encounter here, I totally agree. It is very clear that there is a problem with consistency and completeness, given by Gödel. What I propose is that perhaps it is not a problem, but a feature.

The mis-encounter, I think, is in the scope and residence of meta-mathematics. Rather than trying to willfully and perversely imagine your perspective, I will try to lay out mine. I will do it with Gödel's terminology, it seems more appropriate to me.

A Gödel sentence* is 2 things. Mathematically, it is a number. Meta-mathematically, it is a proposition. As a number, a precise number, it contains all the information needed to express the meta-mathematical proposition. It can be factored into a sequence of numbers corresponding to a sequence of symbols that have a meta-mathematical meaning in the intended interpretation of system P. That's what Gödel does with his 40 definitions: establish mathematical relations with meta-mathematical interpretations. Gödel's presuposition is clear: mata-mathematics are, either mathematics, or imagination**. The first -not, the second (or third?)- magnificent idea in his paper is to define mathematically all the meta-mathematical concepts he needs.

So a Gödel sentence is both a number, that can be shown to belong to certain arithmetically defined sets, and a proposition, since it contains all the same information of some given meta-mathematical proposition. What is then left of meta-mathematics in the head of the mathematician, of the reasoner? At the moment, I believe, all that is left is the encoding. The mathematician must know the encoding, to relate the number that the Gödel sentence is mathematically, with the proposition that the Gödel sentence also is meta-mathematically. Because it is clear that the same mathematical number***, with some other encoding, or arithmetization, might mean something completely different; being able to be factored out into so many factors, it is almost guaranteed that some arithmetization exists in which it has some arbitrarily given meaning in the intended interpretation.

Do we really want things in the head of the mathematician? Or, phrased differently, do we want meta-mathematical concepts that can only exist in the head of the mathematician, and do not have any mathematical relevance? I don't think so.

So, let's pull the encoding out of the head of the mathematician; and where does it land? It lands exactly between mathematics and meta-mathematics: between the Gödel sentence as a number and the Gödel sentence as a meta-mathematical proposition.

Does this mean that, if we want to leave the head of the mathematician out of the bussiness, the Gödel sentence needs the encoding to be mathematically meaningful in the same way that it is meta-mathematically meaningful? That we need the encoding to express mechanically, mathematically, with an operational semantics, what until now only was in the head of the mathematician? If that is correct, I think it would be wonderful, because the encoding is obviously arithmetizable; it means that the Gödel sentence(s) ceases to be a "negative result", and becomes a tool to explore the structure of "encodings" in the context of completeness and consistency. (or viceversa?)

*I might have said a Gödel number as well.

**I believe, that the only possible meta-mathematical reasoning is with a proof and a theorem. Then the head of the mathematician puts in the interpretation; but the interpretation adds nothing, it's mere structural correspondence.

***Sorry.

gear shift

Seems we've successfully advanced beyond one set of considerations, to another. I don't want to flood LtU with philosophy... so I'll try to say some of this compactly (and probably botch the job; but, I'll try). Sorry to go over familiar territory, but I'm hoping to save time in the long run by not skipping steps (one can hope).

The Scholastics of the European Dark Ages (!), besides a lot of theological questions, considered whether universals (I might call them "abstract entities") exist. There were, broadly, three positions. One position said universals are just as real as things in the concrete world; that position is generally credited to Plato, so it's called Platonism. One position said universals exist, but only as concepts; that's called conceptualism, and a noted proponent is John Locke; if you've read the Wizard Book, it starts with Locke's definition of abstraction from his Essay Concerning Human Understanding. The third position is that universals have no actual existence, they're just names used to group things together; nobody really believes that in absolutely pure form, but certainly some people are closer to that pole than to either of the others; it's called nominalism; and a particularly famous member of that camp was William of Ockham.

In the second half of the 19th century the paradigm for mathematical foundation was axiomatic. That paradigm went into crisis with the discovery of antinomies like Russell's Paradox at the end of the century, and three schools of thought emerged on how to cope. The logicists put logic prior to mathematics. The intuitionists put mathematics prior to logic. And the formalists took a pragmatic approach to resolving the crisis by setting up formal systems that could be used to study the paradoxes and figure out what could be safely done. W.V. Quine noted some decades later that these were pretty much the three Scholastic positions on existence of universals — logicism corresponding to Platonism, intuitionism to conceptualism, formalism to nominalism. The logicists and intuitionists both tried to avoid the paradoxes by backing off on the axioms so that the paradoxes would no longer be provable. The formalists studied which sets of axioms were and were not consistent. And then Gödel successfully used the formalist approach to prove that the whole problem couldn't (in a sense) be solved.

A quick note on this: if Gödel overturned the gameboard and scattered the pieces all over the floor, what became of those three schools of thought? Pure philosophy is about questions we have no clue how to settle, so skip that. None of them "solved" the paradox problem (there is no prosaic solution, within the rules they were using), so that's a draw. Practically, I suggest they all won. Frege, founder of logicism which is usually cast as a loser in this story, is rightly considered the founder of modern logic, and we discuss logic using the conceptual tools he set up. Intuitionistic logic is what mathematicians use when they're being cautious, the safe platform for when there's no need to stick one's neck out. And the meta-mathematical technique from the formalist Hilbert has been added to everybody's toolkit.

Key here is Hilbert's meta-mathematical technique. There would be three systems involved. There would be a mathematical system to be studied. There would be a formal system, with a correspondence set up between those two whereby the formal system models the informal mathematics. And there would be a meta-mathematics used to reason about the formal system. The mathematics and meta-mathematics were inherently informal. They exist in the mathematician's head; and maybe also in the Platonic realm, if you believe in it, but that's an unnecessary assumption. (Remember, this is the school of thought Quine indirectly linked to William of Ockham; William of Ockham made good use of the principle of necessity, which says entities should not be multiplied unnecessarily, i.e., don't assume more than needed, a principle that centuries later was dubbed "Occam's Razor" because William had used it so effectively to cut his opponents' arguments to ribbons.) A basic premise here was that the formal system had in itself no meaning whatever. When formalist Alonzo Church proposed a novel logic in 1932, he remarked up front that even if the formal system failed as a logic, it might be useful for some entirely different purpose. (It failed as a logic, and a subset of it is now called λ-calculus.) Meaning resides entirely in the two informal systems, not at all in the formal one; the unavoidable, minimal peripheral presence of meaning is like nominalism, which was never completely pure but did encourage assigning as little existence to universals as one could get away with (so not to multiply entities unnecessarily).

Now, about proving Gödel's theorem. There's the correspondence between the formal system and the modeled mathematics; and there's the meta-mathematics used to reason about the formal system. But the formal system, I submit, is still formal, i.e., in itself entirely devoid of meaning. There's no avoiding either of those informal systems; the modeled mathematics is the purpose of the exercise, and the meta-mathematics is the means of thinking about it. The use of Gödel numbers can't shift any of the meaning into the formal system, because meaning cannot ever get in there, not even a little bit; the meaning is still-and-always in the mathematician's head. Which leaves me with objections to quite a few specific passages in your meta-math comment, above. For example

Do we really want things in the head of the mathematician? Or, phrased differently, do we want meta-mathematical concepts that can only exist in the head of the mathematician, and do not have any mathematical relevance? I don't think so.

We don't have a choice; all meaning is in the head of the mathematician, and cannot fail to be. Concepts, of any kind whatever, can only exist in the head of the mathematician, whether we want them to or not. There is no relevance outside the mathematician's head, either; mathematical relevance can't live in the formal system.

Too many such remarks would just be heckling. More interestingly, I did not miss your remark in a footnote,

I believe, that the only possible meta-mathematical reasoning is with a proof and a theorem. Then the head of the mathematician puts in the interpretation; but the interpretation adds nothing, it's mere structural correspondence.

I've actually got most of a blog post drafted about the difference between sapience and non-sapience (e.g., why none of our technology, not even our more powerful AIs, are at all sapient; why we are unable to measure sapience; and visible evidence of sapience despite our lack of a measure for it). Sapience allows us to consider things in their full context, and judge their significance and when it's time to step outside a set of formal rules because something overrides the concerns they address. Whether or not a formal correspondence "exists" with no one to observe it, it might as well not exist since there's no one there to judge its significance (by the principle of necessity, with nobody to observe it there is no need to suppose that it exists :-).

You end with

Does this mean that, if we want to leave the head of the mathematician out of the bussiness, the Gödel sentence needs the encoding to be mathematically meaningful in the same way that it is meta-mathematically meaningful? That we need the encoding to express mechanically, mathematically, with an operational semantics, what until now only was in the head of the mathematician? If that is correct, I think it would be wonderful, because the encoding is obviously arithmetizable; it means that the Gödel sentence(s) ceases to be a "negative result", and becomes a tool to explore the structure of "encodings" in the context of completeness and consistency. (or viceversa?)

Here I cannot tell for sure, atm, whether you're on to something. I can see a reason why you might not be (always the skeptic), but it depends on whether I've got enough of what you have in mind. It seems you're talking about bringing more and more formal reasoning about the system within the formal structure of the formal system; but, I don't think this changes anything essential about the situation. The two informal systems are still there, neither more nor less so than before, and there is still no trace of meaning within the formal system. My reservation is for the possibility that, after all, we are really talking about the same things using different words, and you may have a line on another interesting use of Hilbert's formal technique for learning from the study of a formal system.

[Edit: I'd said 1931 for Church's logic; it's 1932; fixed.]

preliminary response

Do we really want things in the head of the mathematician? Or, phrased differently, do we want meta-mathematical concepts that can only exist in the head of the mathematician, and do not have any mathematical relevance? I don't think so.

We don't have a choice; all meaning is in the head of the mathematician, and cannot fail to be

A thousand times yes. I got carried away, we cannot leave out of the bussiness the head of the mathematician. As you very well say, the head of the mathematician is the holder of the interpretation, the provider of meaning; and it is true that "holding an interpretation", adscribing meaning, is somewhat magical, it's difficult to imagine any other thing capable of holding an interpretation other than the head of a mathematician (not even the head of a physicist, they use math as a tool, most everything is in their heads ;-) ).

But what I hold in this discussion is that in the case of Gödel's theorem, we are puting 2 things, 2 maps in the head of the mathematician. First, the encoding, and then, the interpretation. To show what I mean, imagine that we get in contact with an alien civilization. And the aliens, to impress us with their mathematical knowledge, decide to send us 2 things: pi, and a Gödel sentence. Can we understand that they are sending us pi? No doubt. Can we understand that they are sending us a Gödel sentence? If they don't also send us the encoding, the way of arranging the information in that number into a syntactic form, no way. We will only be able to work out the entropy in that number. If they give us a syntactic form, we can look for structures that have some correspondence, we can look for meaning. If they give us entropy, we can't.

they cannot?

If they don't also send us the encoding

Well they cannot send us an encoding I think. All they can do is send many messages and hope that we see patterns and work out the encoding. I think. I'm a bit lost here. In any case, we need the encoding :)

slightly more complicated

Can we understand that they are sending us pi? No doubt. Can we understand that they are sending us a Gödel sentence? If they don't also send us the encoding

Of course, it is slightly more complicated. Because if they send us a number, it must be expanded in some base, with some word size, and that might be called an encoding; but once we have that, easily obtainable if they send us pi, we will have the Gödel sentence as a number and stil need the encoding to look for meaning in it.

true names

CETI - communication with extraterrestrial intelligence.

Not enough information

Now, about proving Gödel's theorem. There's the correspondence between the formal system and the modeled mathematics; and there's the meta-mathematics used to reason about the formal system.

I would say that what Gödel tries it to transfer all of his meta-mathematical
reasoning into the modelled mathematics. For example, in page 39 of
this edition of his paper
he says:

Metamathematical concepts and propositions thereby become
concepts and propositions concerning natural numbers, or series of
them, and therefore at least partially expressible in the symbols of
the system PM itself. In particular, it can be shown that the
concepts, “formula”, “proof -schema”, “provable formula” are definable in the
system PM, i.e. one can give a formula F(v) of PM -for example- with one free
variable v (of the type of a series of numbers), such that F (v) -interpreted
as to content- states: v is a provable formula. We now obtain an undecidable
proposition of the system PM, i.e. a proposition A, for which neither A nor not
A are provable [...]

But the formal system, I submit, is still formal, i.e., in itself entirely devoid of meaning.

Agreed, we put the meaning, the interpretation. Which, if it is sound,
should have the same structure as the formal system.

There's no avoiding either of those informal systems; the modeled mathematics is the purpose of the exercise, and the meta-mathematics is the means of thinking about it.

I don't see so much difference between both informal systems. Both are
interpretations of the formal system, into math as sets of integers, and into
meta-math as "provable formula" etc.

The use of Gödel numbers can't shift any of the meaning into the formal system, because meaning cannot ever get in there, not even a little bit; the meaning is still-and-always in the mathematician's head.

I have no problem with this, I agree. What I contend is that even using the
mathmatician head, we would not be able to find meaning in a Gödel number, if
we don't know the arithmetization that has been used to construct it. We need
both the Gödel number **and** the arithmetization to hold the meaning that
conveys Gödel's theorem; and adding the arithmetization changes the meaning a
bit. By itself, the Gödel number does not carry enough information to express
the theorem. It contains a lot of roots, but not the information of how to
arrange them to represent a proposition.

Btw, that was a masterful historical introduction.

I see a whole bunch of specifics I could take issue with, but I think there's one underlying issue. In essence, it seems you're underestimating the role of the mathematician's head. A likely starting point is your early remark

I would say that what Gödel tries [is] to transfer all of his meta-mathematical reasoning into the modelled mathematics

It can't be transferred. It can be copied. Here's what I mean. I had this nifty diagram illustrating Hilbert's technique, which I'll clumsily reproduce using ASCII art.

informal
(meta)
|
| used to reason about
|
modeled by      V
informal  ------------>  formal
(subject) <------------
models

There may be a subject mathematical theory in the mathematician's head, that we want to study. That's not always the case; sometimes we want to consider a variety of formal theories without worrying, for the moment, which ones do and don't model such a subject theory; but even if there isn't a subject theory now, our interest in formal theories is as potential models of subject theories. That is, we are the ones who do mathematics, and the point of the exercise is to study the range of ways we might do that.

In order for us to study the formal theory, we have to think about it, mathematically. Every thought we have about it is in our heads, and is done using some mathematical theory which, in the generic situation, is called the meta-mathematics. This meta-theory is probably not the same as the subject theory. Hilbert devised this arrangement in order to apply mathematics to the problem of how to deal with the antinomies in the foundations of mathematics; the point was that mathematicians can't reason mathematically about thinking because thoughts aren't mathematical objects, but a formal theory can be reasoned about, one only has to establish the connection between formal theory and subject theory. The reasoning he allowed in the meta-mathematics was, iirc, called "finitary" in English; it had to be stuff everyone could have great confidence in, and was pretty much the same as intuitionistic logic. He meant to reason about subject theories that he wasn't already sure of, i.e., the subject theory might be more powerful than the finitary methods of the meta-mathematics; it's certainly possible by this means to prove that various theories aren't consistent.

But suppose you wanted to reason about the relationship between the power of the subject theory and the power of the meta theory. We already have the ability to reason about the subject theory, because we've established a corresponding formal theory. We might do the same for the meta theory, which naively might look like this:

informal
(meta-meta)
|
| used to reason about
|
modeled by     V
informal ------------> formal
(meta)  <------------ (meta)
|         models
|
| used to reason about
modeled by      V
informal  ------------>  formal
(subject) <------------ (subject)
models

Since there are now two formal theories, I've labeled them (subject) and (meta) to keep track. The theory in our heads that we use to reason about the formal meta theory, could be correctly called "meta-mathematics", but that's asking for confusion because it probably isn't the same theory as the meta-theory we're studying, so I'm calling it the meta-meta theory. We don't actually need to keep the informal meta theory in play, so long as we keep in mind that the relationship between the formal meta theory and the formal subject theory isn't actually one of reasoning in the cognitive sense; it's a relationship that we reason about in our heads, using the meta-meta theory. We then have:

informal
(meta-meta)
|
| used to reason about
|
V
formal
(meta)
|
| models reasoning about
|
modeled by      V
informal  ------------>  formal
(subject) <------------ (subject)
models

We could, in principle, extend this by any finite number of additional levels. We could have a formal meta-meta theory, and formal meta-meta-meta theory — but somewhere up at the top has to be the mind in the mathematician's head, thinking about it all; and that informal metan theory will have the same need for power that Hilbert's informal meta theory had, so will probably be essentially finitary/intuitionistic.

Gödel realized that one could use encoding to collapse the formal meta theory and formal subject theory into one. We would then have a diagram with just the shape of the first one I showed, Hilbert's basic configuration; well, almost, except we're not limiting our formal subject theory by requiring it to model a particular informal subject theory, so really we might leave the informal subject theory out of it. There's a bit of a problem with labeling, though. The formal theory is now a combination of subject and meta. And the informal theory at the top is... what? It seems a bit odd to call it "meta meta", when there's only one level below it. But if we call it just "meta theory", we'll get confused over its separation from the formal meta theory which has now collided with the formal subject theory. Hence my remark about a messy difficulty over the meaning of "meta-mathematics".

Still not enough information in a Gödel number :)

I think we have a little pool of misunderstanding around our use of the word "reason". I will try to make my use of it explicit, and I think I can do that best if I focus on Gödel's paper.

I think that there are 2 aspects to what we are calling "reason". One is interpretation, semantics; and the other is drawing conclusions, what I would properly call "reason", and is essentially syntax. And with my remark:

I would say that what Gödel tries [is] to transfer all of his meta-mathematical reasoning into the modelled mathematics

I was referring just to drawing conclusions, to syntactic maneuvers. What I meant that I understood that Gödel was doing, was to limit the role of the head of the mathematician to semantics, to interpretation, and move all "drawing consequences" to proofs and theorems.

I totally see Hilbert's technique, it is of course invaluable. But I think that Gödel does a good job of changing it, and of moving the act of drawing meta-matematical conclusions from the head to the formal system. I don't see the need for assuming any other entity at play in Gödel's paper.

What are the new meta-mathematical propositions in his paper? Any system equivalent to PM contains undecidable propositions; any such system is incomplete. And they are backed by a proof. What else does he do, apart from the proof? Introduce and describe the models he is going to formally reason about, show how the new propositions fit in those models, suggest new possible proofs using his new propositions... But no other new meta-mathematical propositions.

So, in essence, the only thing I think we differ as respects your post, is in the reach of Gödel's achievement as regards syntax and heads. But this is not essential for the central argument I am trying to advance here, which can be stated entirely on semantic terms, i.e. that he forgot something in his interpretation.

There is no denying that the relation Gödel establishes between the arithmetic propositions and the meta-mathematical propositions is semantic: we interpret Bew(v) (a number) as v is provable in P, etc.

And I actually thought that I would sway you, with my previous comment, where I was arguing that there is not enough information in a Gödel number to determine a (meta-mathematical) Gödel sentence. Because I think we can agree that if the model as model contains more information than the formal theory, the interpretation is not sound.

Rabbit hole

I think that the combination of model theory and information theory provides for a deeper rabbit hole than we need (is there any work on it? I have found nothing.). What I meant with "not enough information in a Gödel number" is a lot simpler. In short:

We have the formal system P, and also its arithmetization. For each proposition in system P, we have an arithmetized version of it, a Gödel number. Both formal systems have the same semantics (apart from their mutual interrelation). However, each proposition in system P has more information that its corresponding Gödel number. To go from the Gödel number to the proposition, we need more information than the Gödel number, i.e., the arithmetization used to build it. The original interpretation is from system P: Gödel numbers "inherit" their interpretation from the propositions they are built from. So it seems to me, that if to go from a Gödel number to a proposition, we need the arithmetization, we should also need it to go from the Gödel number to the interpretation.

It is not the same in the opposite direction. To go from a proposition to a Gödel number, we can use any arithmetization; to go from a Gödel number to the proposition, we need the actual arithmetization that was used to build it. And also, of course, the original interpretation is from system P, so we need nothing else than system P to have it.

reason

I agree there's something of interest lurking near the word "reason". A note on how slippery the concepts around there are: I also dabble in conlanging, the construction of languages for sapient communication (versus, languages for programming non-sapient computational artifacts), and this has repeatedly brought me to the question of meaning as distinct from any particular language. The difficulty with this is, we cannot isolate meaning in the laboratory; it can only manifest itself through language. Well, except perhaps for our own introspection, where we may... perhaps... glimpse meaning in our own minds in its unbound state, but then we have no direct way to show it to someone else, it's hard to know whether we're misreading ourselves, and our conclusions even if right might be difficult to generalize to others because different people's minds work by a variety of remarkably different principles. And yet, we can perhaps get some clues to the shape of meaning by studying the shapes of the shadows it casts on many different languages.

It seems to me that Hilbert's metamathematical technique is set up to very carefully treat thought as an unknown. There is meant, afaics, to be no assumption about what goes on in the mathematician's head, except for the correspondence set up between the formal system and the modeled mathematics. Hilbert's formal system, and digital computers, are the culimination of the same trend as the axiomatic foundations of mathematics in the late nineteenth century, which was itself a new paradigm for mathematical foundations: at the beginning of the nineteenth century, the foundations of mathematics were intuitive, and had been for many centuries. We may tend to overlook thought, now in the digital age, rather like we overlook the air we breathe, but we're immersed in it. When we reason about a formal system, no matter how trivial the reasoning may seem, what's actually going on is something extraordinary and really quite inscrutable. We have no clue what is really going on there. (I submit it is not possible to objectively measure sapience of an individual; given any standardized test, it's possible in principle to construct a clearly non-thinking artifact that can do better on the test than a sapient being would.) A characteristic of sapience is the ability to recognize when it's time to step outside the system; it's absolutely undetectable as long as one is within the system, and as long as one is within the system a technological artifact will outperform a sapience, hence the problem with measuring sapience. But everything we think, even if it's about the behavior of a formal system, is thought. It can't be moved out of the mathematician's head because it is thought.

dialog

A clarification: I don't agree that drawing conclusions can be shifted out of the mathematician's head. Copied out, perhaps; not moved wholesale.

A general observation on this entire dialog: This stuff is really hard to talk about; the trouble isn't the formal part, but the concepts (in our heads :-). I feel we've come a very long way in developing a common understanding of what we're talking about, but I don't yet fully get where you're coming from, and as long as I don't, somehow your explanations slide past me a bit. It's frustrating, because I am interested to understand your thinking on it. Are you on to something? I don't know; it's not clear to me; and yet, I don't know you aren't on to something. It's also possible that the act of our trying to understand each other could itself turn something up.

So I continue to be interested in the discussion, even though it goes slowly as it time-shares with the non-LtU world (it turns out there is one), and even though I often have to struggle to grip something solidly enough to help move things forward to the next step.

A test

[edited to fix the link]

I will assume here that "reason" stands for linguistic thought, i.e., that it is anything that happens within the head that results in linguistic consecuences, or in newly held truths that can be articulated and communicated. I hope you can agree with this bound.

I don't agree that drawing conclusions can be shifted out of the mathematician's head. Copied out, perhaps; not moved wholesale.

I agree that everything must happen within our heads. Obviously, something that has no echo within our heads doesn't exist, as far as we are concerned. By shifting drawing conclusions out of the head, what I meant was that, if starting from X, you get to Y, but I get to Z, we can recourse to a mechanical device to help us decide whether Z or Y.

If we limit reason to linguistic processes in the head, I think it is not controversial to say that both Frege's and Hilbert's programs were intended to comprehend reason. Frege's program pretended to formalize the use of the natural languages; and if he had managed, he would have taken everything linguistic out of the inscrutability of thought. He would have made all linguistic phenomena comprehensible, unmisterious.

And then Hilbert (kind of) proposed to forget natural language, and center in mathematical thought, to formalize it. But he also failed, since Gódel showed that there are things that can be in the head but not in the formalism. (or viceversa? ;-)

Now we find ourselves with some tools in our hands, tools that were devised to explain reason, and finally do not, but that, in any case, have proven to be very powerful tools. And with reason, in its linguistic aspect, as misterious as in 1800.

However, my position in this discussion changes the way in which I see Gödel's result, and it allows me to see Hilbert's program as fully achievable. It changes my view of reason, it takes reason for me to where Hibert wanted it to be. It's as if understanding Gödel changes the way one understands reason, and therefore the way one understands Gödel itself. Understanding Gödel makes around itself a semantic pearl in your mind.

So, what I mean with this, is that there are some things that we cannot hope to agree among ourselves, before settling the subject matter of this discusion; because they actually are aspects of the subject matter of the discussion.

To settle the issue, at the moment, I would propose this very syntactic and mechanic test. Check this comment and following.

[edited to add: it's the same argument as with Martin Davis' version.]

[edited to add: I'm also interested to know whether the "not enugh information" argument makes sense.]

[edited to add: I understand your intention to lead the discussion towards a broad common understanding, sorry for the pressure towards the formal part :)]

Re: dialog

So I continue to be interested in the discussion, even though it goes slowly as it time-shares with the non-LtU world (it turns out there is one), and even though I often have to struggle to grip something solidly enough to help move things forward to the next step.

I understand and agree. In fact, I'm in a similar situation: I believe I understand Gödel, and I think I see a flaw in his interpretation, but obviously, if it's just me, I better think I'm probably wrong. It's so easy to be wrong in this context... So I have your same interest, and also a non-LtU life :)

And in fact, though the starting point of the argument has not changed, the conclusions have changed over our discussion... So I see this discusion with a lot of interest, and without any hurry.

I look forward to being enlightened, thankful, and a bit shamed; I don't dare to look forward to any other conclusion.

Though I dream with it

I don't dare to look forward to any other conclusion.

Though I dream with it. I believe that Gödel's theorem, from a social perspective, has been terrible for Science and progress; it has twisted knowledge and Truth, the objective of Science and the seat of progress, to an almost unrecognizable state, giving wings to the anti-science and anti-progress social forces.

truth and consequences

I agree Gödel's result has had some unfortunate social consequences, both outside and inside scientific circles. Social consequences are apt to happen, though. Not just Gödel, there was also quantum mechanics; everything started making a lot less sense around then, WWI and WWII did their part too.

Para-paradigmatic perspective is hard to come by. It seems to me there's a certain kind of person who, having been a young and flexibly-minded witness to a scientific revolution, is comfortable with the old and new paradigms both and quite unimpressed by the more extreme demonizing of the old paradigm and agrandizing of the new. (Like Gerhard Gentzen who, afaict, didn't object to Gödel's result, he just wasn't intimidated by it...) Those sorts of people typically occur only in that trans-paradigm generation, and their ability to see past the paradigm boundaries doesn't usually recur in later generations until the new paradigm begins to break down (unless it's the recurrence that causes the breakdown?).

A remark I made in the context of some speculations in basic physics a while back:

Relish the mind bending, take advantage of it to keep your thinking limber, and don't get upset when you're not finding something that might not be there. And at the same time, if you're after something really unlikely, say with only one chance in a million it'll pan out, and you don't leave yourself open to the possibility it will, you might just draw that one chance in a million and miss it, which would be just awful. So treat the universe as if it has a sense of humor, and be prepared to roll with the punchlines.

Too much meta-mathematics is bad for the health

Or, if you really must phrase it as a "property" of the proposition-or-state, the property is that that proposition-or-state is the one that is used by the meta-mathematical reasoning to show that there's a systemic problem. That's it: the property is that it's the one the meta-mathematics uses

Sorry but I think that that is wrong. Let's first distingish between the meta-mathematician and meta-mathematics. In my opinion [edited to add: I mean, in what I gather from Gödel's paper, and like], meta-mathematics is just mathematics, interpreted in the context of the highest level concepts of mathematics. And the meta-mathematician is the one that makes that "interpreting" connection between maths and the high level concepts. The meta-mathematician doesn't invent the connection, the correspondence; it is already there. The meta-mathematician chooses it. There are all sorts of structure preserving maps between all sorts of structures, and any of them might serve as interpretation of the domain in the image. And meta-mathematics is just one of those maps, that happens to go from arithmetics to high level concepts of math.

I don't think meta-mathematics is in some way above or around any other kind of math. If there is a systemic problem, I don't think that we can somehow separate meta-maths from it and say that m-m "uses" a problematic proposition-or-state to show things abroad. The systemic problem is incompleteness, and the property of the proposition is undecidability; you cannot get much more meta-mathematical than that. Meta-mathematics does not use those; it **is** those. And meta-mathematical reasoning is Gödel's proof.

The property of the Gödel sentence is not that "it is the one used by the meta-mathematical reasoning to show a problem"; the property is undecidability, and it has that property because it has the arithmetic properties that we can interpret meta-mathematically as undecidability. I think, am I right? And of course, my point is that to make that interpretation, we need the encoding. We need the encoding to build the sentence, that's clear. But we also need it to interpret it, and it is this need that is being ignored, and implicitly supplied in the head of the meta-mathmatician.

meta-vocabulary

I suspect there's a very messy difficulty here to do with what one chooses to mean by the word "meta-mathematics".

meta-meta-math

Yes, of course you are right, and of course you were not wrong. It was a stupid way to try to keep up the conversation from my part. sorry. I'm still thinking about your last comment.

[Edited to change the title, I'm chuckling. What I mean is that I want to build on your meaning rather than antagonize it. As you do, for which I am grateful. Been on a family gathering, drinking, sorry, ignore this.]

awe

It's aweful to imagine what became of Frege's mind after Russell's violent blow, right? I understand it never recovered. To live within it. We all do, since Gödel.

[Edited t add: We all do. What is pathetic to imagine is the transformation from Frege's perspective. From seeing light to being blind. Not so many historical figures go from light to darkeness and then left to rot in such stark way. Gollum, Sisiphus.]

Back in 2000 I visited the Crazy Horse Memorial. The orientation film loop included a remark by the guy who started that project — long gone, the project now run by his descendants — something along the lines of 'I realized it would be the work of a lifetime, and I happened to have one available.' Sometimes in a moment you'll have a vision that you spend the rest of your life pursuing. Frege had his taken away after he'd already spent the larger part of his life on it.

Frege's vision

Sometimes in a moment you'll have a vision that you spend the rest of your life pursuing. Frege had his taken away after he'd already spent the larger part of his life on it.

I think that in the case of Frege it was worse, because his vision was clarity itself, not just any old vision. His vision was seeing.

[Edited to add: I think that it is like having the face of your god change from Jehovah (or Allah or any other, let's not offend $DEITY) to Cthulthu.] $DEITY

I was, honestly, wondering about the choice of Jehovah, whose reputation is rather stern. If I'm choosing a deity (which, granted, I often don't), I would generally favor either Eris or the Flying Spaghetti Monster, neither of whom is, I think, easily offended (although I wouldn't care to incur either's wrath). I do take your point, though, as Cthulhu's reputation is not at all warm and cuddly. Moreover, a noted theme of Lovecraft's writings is that of humans searching for knowledge only to suffer the horrible consequences of finding it, which seems quite apropos to Frege's case.

Aneris

People don't seem very good at inventing nice deities. I don't blame them. But at least Jehovah et al. have some human measure, offer a paradise, a way out of inhumanness to the faithful, with humanly understandable instructions (though not humanly likable instructions) to reach it. Cthulhu on the other hand seems to only care about things with cosmic dimensions...

Reading up on the links you post, I liked Aneris, "Goddess of order and not-being". Pure Gödel. And her brother is Spirituality, hahah.

Do you know Hehner's work on "Beautifying Godel"?

Beautifying Gödel, Eric C.R. Hehner, U. Toronto

I think it is pretty solid and convincing. It makes Godel's theorem almost a triviality that could hardly be refuted, while debunking some usual but incorrect interpretations of the theorem.

My understanding is that it is unreasonable for a formalism to reason on itself, but that is a good thing actually!

Beautifying Gödel

[edited to change the title]

Sorry Arnaud, I missed your comment.

I had not seen the link you provide; thanks a lot, it is a very clear and concise rendition of Gödel's theorem. And my argument is very straight forward with the toolset provided there.

The Gödel sentence, in that paper, is represented like:

G =  ¬  |–  “G”

I think that this representation can be understood at a glance, perhaps the only thing of note for someone that knows the theorem but not the paper are the quotes around the second G: in this version of Gödel, the encoding is as a string, represented as quoted.

Of course, some symbols have been omitted here for clarity (by Hehner, not by me).

The first is the Theory. Let's add it:

G =  ¬  T  |–  “G”

Now we have 2 variables in this sentence, and, since it is a theorem, we want them bounded, not free. For G we want an existential quantifier; for T we want a universal one. This is clear. The problem is their order.

If we say "there exists a G such that for all T...", we are talking about G as a proposition that is universal, independent of T, a proposition that will be possible in every T and therefore must be in The Book.

But if we say "for all T, there exists a G...", we are talking about each T, in every one of which a **different** G proposition can be found or constructed. In this case, it is doubtful that there is some G in The Book - that is independent of any T, i.e., that is generalized for all T.

And the question is that we cannot reduce the G before the T. To have a particular G, we need to already have chosen a particular T. G is constructed with a T, it contains an implementation of some T. In the proof, a particular T is used to build a G ("Either way, it amounts to writing a theorem-proving program.")

So my conclusion is that the theorem, in full, is:

∀ T: ∃ G: G =  ¬  T  |–  “G”

But it seems to me that it is usually interpreted as if it were:

∃ G: ∀ T: G =  ¬  T  |–  “G”

Beautifying Gödel (cont.)

I mean, there is a difference between "undecidable propositions in PM" and "different undecidable propositions in different implementations of PM". Isn't there?

[edited to add: s/implementations/arithmetizations/]

Beautifying Gödel (cont.)

In my view, this means that we cannot use G ever without first choosing a T. Its use in any sentence must always be preceded by a quantification of T.

Beautifying Gödel (cont. III)

In my view, it means that in abstract PM, or in The Book, G, or the Gödel number, must be parametrized by T, by the arithmetization; and being parametrized it looses its meaning, it no longer is a number. [edited]

And sorry for this comment diarrhea, I'll stop.

Just do not try to compare G and "G"...

Surely, you have got the right "full" theorem, but as your last sentance suggests, the real lesson Hehner draws is:

As long as you consider G and "G" of different types and do not allow your Theory to compare them, you can build as many useful and consistent Theories as you need.

Only if you add this possibility to your Theory in an attempt to make it complete and "universal", Godel's theorem demonstrates that your Theory loses consistency.

In other words, universality is an unreasonable goal...

universal?

As long as you consider G and "G" of different types and do not allow your Theory to compare them, you can build as many useful and consistent Theories as you need.

I think that this is clear, no objection.

Only if you add this possibility to your Theory in an attempt to make it complete and "universal", Godel's theorem demonstrates that your Theory loses consistency.

To add the possibility of comparing G and "G", what you have to add is a particular definition of |-, i.e., a particular theorem proving program. And this consists of a whole bunch of definitions and axioms (like what is later in the paper used to define I). So, to be able to have an inconsistent theory, it is not sufficient to have the theory, you need to add to the theory a stringification or arithmetization of the theory, otherwise the G and "G" remain unrelated. I wouldn't call this "making it universal".

To me, this means that what is inconsitent, is not T, but T and a stringification of itself; T and a particular program to establish correspondences between strings and T.

To make it universal, in my opinion, you would need to remove the particular stringification or arithmetization, you need to remove the definition of |-, or of the I Hehner defines under the "Semantics and Interpreters" title. And this is obviously impossible, a particular |- (or I) is needed to reach a particuar "G" to compare it to G. Therefore, I conclude that G is impossible in T, if you want it abstracted from any particular arithemetization of T.

So for now, I remain convinced that Gödel's theorem does not speak about undecidable propositions in PM; that the undecidable propositions are not just in PM, but need a particular arithmetization of PM to be undecidable (because even though the particular arithmetization has been used to build the proposition, the arithmetization cannot be extracted from the proposition, it has to be provided separately.)

I still think that Gödel's theorem speaks about different undecidable propositions in different arithmetizations of PM; about PM plus arithemtizations of PM rather than about PM itself.

All Gödel sentences are undecidable in the same PM

I still think that Gödel's theorem speaks about different undecidable propositions in different arithmetizations of PM; about PM plus arithemtizations of PM rather than about PM itself.

Don't think so.

IIUC, each arithmetization gives an undecidable sentence, indeed—so you get a "function" (at some level) from arithmetizations to undecidable sentences. But once you apply the function to the argument, you get an actual sentence in Peano Arithmetic, without free variables referring to the arithmetization.
To interpret this Gödel sentence as "I cannot be proven", you need to use the same arithmetization. To conclude that this Gödel sentence is indeed undecidable, you also need to use the same arithmetization. But the sentence is undecidable in Peano arithmetic, full stop.

By (silly) analogy: take f(x) = x + 1, and compute f(5). What you get is just the number 6, not an expression that references f or x. Given the definition of f, and given x = 5, you can reinterpret 6 as f(x). But you don't need to.

alternative arithmetization

To conclude that this Gödel sentence is indeed undecidable, you also need to use the same arithmetization. But the sentence is undecidable in Peano arithmetic, full stop.

The sentence is undecidable in PM (or Peano), no doubt about it. But not by itself. What is undecidable in PM is the conjunction of the Gödel sentence and the arithmetization used to build it.

The Gödel sentence is just a number with enough roots, with enough entropy to be interpreted as the undecidable PM sentence. But keep in mind that that same number, with an alternative arithmetization, might be interpreted as some other perfectly decidable PM arithmetic sentence.

And that having any arbitrary number with enough entropy, it should not be difficult to find an arithemtization in which the said number is a Gödel sentence.

Without the arithmetization, the Gödel sentence is just a number, without a particular or preferred interpretation. You cannot prove that the sentence is undecidable if you don't use in the proof the arithmetization that was used to build it. So my argument is that, for the proof, you need PM, the Gödel number, and the arithmetization used to build it.

Full stop, he said

But the sentence is undecidable in Peano arithmetic, full stop.

... no doubt about it.

Yes!

But not by itself.

Doh! No...

What is undecidable in PM is the conjunction of the Gödel sentence and the arithmetization used to build it.

No, the Godel sentence contains that conjunction already. And note that the Godel sentence is a sentence in PA. It isn't a number.

I heard

[edited to change the title]

Doh! No...

I of course know that that is the other side in this discussion. And of course I know that most probaby, I am experiencing a Dunning-Kruger effect. Other than that, your comment does not allow for much of a response. I'll try.

the Godel sentence contains that conjunction already.

That the arithmetization has been used to construct the Gödel sentence does not mean that the Gödel sentence contains the arithmetization. Full stop ;-)

the Godel sentence is a sentence in PA. It isn't a number.

How do you reconcile this with your previous assertion (that the Gödel sentence contains that conjunction already)? In what kind of sentence in PA are you thinking about?

In my opinion it is quite clear that the Gödel sentence is a number, that with the aid of an arithmetization can be interpreted as a PA sentence. But it needs to be a number to deliver its interpretation, you need to arithmetize it to have self-reference.

Try this?

In my opinion it is quite clear that the Gödel sentence is a number, that with the aid of an arithmetization can be interpreted as a PA sentence. But it needs to be a number to deliver its interpretation, you need to arithmetize it to have self-reference.

No, the Godel sentence is a PA sentence. I just googled for an actual Godel sentence for PA and couldn't find one expanded out into simple PA terms, but it's just something like:

not exists n. forall m. m ... and forall p. p = m*n and ...

What Godel exploits is that a proof in PA is a sequence of statements that follow by inference in a mechanical way. Because of the mechanical nature of this deduction, we can model statements as numbers and write down what it means for an inference to be valid as a simple mathematical relation on numbers -- simple enough that it can be expressed in PA. Now, there is more than one way to do this encoding of statements as numbers, but this fact is irrelevant for Godel's theorem to work. We just need to pick one way to encode sentences and proofs as integers and we need to write down the relation in terms of that one encoding. We need to be able to write this relation in PA:

Proves(p, s) = ... -- Does number p encode a valid PA proof of the sentence encoded in number s

Then we use a trick. Instead of just representing statements as numbers, we encode formulas (statements with a free variables) as well. We need to do it in a nice way so that we can write this relation in PA:

Apply(s, n, m) -- Does number s encode the sentence that is produced when we substitute m for the free variable of encoded formula n?

If we can write both of those down, then we can write down this formula in PA:

G(n) = not exist p. exist s. Apply(s, n, n) and Proves(p, s)

This is a forumla (proposition with free variable n) and so by the work we did above it can be encoded as a number. Let's call that number #G.

Then G(#G) is our Godel sentence. Note: it's this sentence:

not exist p. exist s. Apply(s, #G, #G) and Proves(p, s)

Where #G is some number, like 5. But G(#G) is not a number itself.

And of course then we note that G(#G) asserts that G(#G) is not provable and go from there.

Edit: Added existential quantifier for s in definition of of G(n).

I'll try :)

No, the Godel sentence is a PA sentence.

Well ok, it depends on the formalism. In Gödel's original paper, it is a number (17 Gen r) that must comply and at the same time cannot comply with certain arithmetic relations.

And, in your version of the theorem, of course, G contains the arithmetization, since it is defined on Proves and Apply, which need to contain the arithetization to do their thing.

We just need to pick one way to encode sentences and proofs [...]

We only need to pick one encoding to build a constructive proof of existence.

Then G(#G) is our Godel sentence. Note: it's this sentence:

not exist p. exist s. Apply(s, #G, #G) and Proves(p, s)

That's not a sentence, it is a formula. That we have used a constructive proof to prove it does not mean that we can leave free variables in the theorem. There are in particular 2 free variables: the arithmetization (in G, or in Proves and Apply), and the number #G. So, in my opinion, we must quantify those 2 variables.

And, also in my opinion, the only way to do it is: for each arithmetization, there exists a number #G...

That's all I say: The sentence is undecidable in PA, but only in conjunction with the arithmetization used to build it.

Look at the way Martin Davis states the theorem: "Let F be a formal system. Then there exists a number n0...". It is obvious that it is stating it for all appropriate sound formal systems.

Note that I am not saying that the Gödel sentence it is not undecidable, just that it is so only in conjunction with the arithmetization. In your version the arithmetization is there, but as a free variable.

And it seems to me that the theorem becomes much more useful and comprehensible, if we require the arithmetization within it.

Free variables

In Gödel's original paper, it is a number (17 Gen r) that must comply and at the same time cannot comply with certain arithmetic relations.

Perhaps so. I'm not very familiar with the original. In modern usage, though, a Godel sentence is actually a sentence.

not exist p. exist s. Apply(s, #G, #G) and Proves(p, s)

That's not a sentence, it is a formula.

But Proves, Apply and #G are all constants to be inlined. They aren't free variables. So this is a sentence as a syntactic matter. In a proposition like "exist n. n > 5", the 5 is not a free variable regardless of how I arrived at it and regardless of the fact that this proposition holds "for arbitrary values of 5."

I keep getting the feeling that your difficulty is just in the basic logic of the argument.

Look at the way Martin Davis states the theorem: "Let F be a formal system. Then there exists a number n0...". It is obvious that it is stating it for all appropriate sound formal systems.

I don't follow your point. The F there is generalizing PA, not the arithmetization used. But yes, Godel's theorem can also be generalized to other formal systems.

In your version the arithmetization is there, but as a free variable.

No. The theorem is that PA has a sentence that is neither provable nor disprovable. The statement of the theorem makes no reference to an arithmetization, nor should it.

And it seems to me that the theorem becomes much more useful and comprehensible, if we require the arithmetization within it.

Your wording here confuses me. Do you think Godel's theorem is correct as stated but not useful or comprehensible? Doesn't your proposed modification lead you to trouble? It seems to me that maybe you're saying that you understand Godel and it seems right, but you also have some other way of looking at things that seems contradictory. Usually in that case you just have some error somewhere in your reasoning, but unfortunately it might be hard for someone to help you track it down.

Whatever one chooses

But Proves, Apply and #G are all constants to be inlined. They aren't free variables.

They can be constants if you wish. But if you provide a proof that is valid for any appropriate arithmetization, the usual thing is to provide a theorem generalized for all those arithmetizations. And even if you wish to keep your theorem particularized for some constant arithmetization, I, or any other that reads your proof, can use it to provide a theorem that is general.

The F there is generalizing PA

No, if you read Martin Davis carefully, you realize that the F take the role of arithmetizations there. He says, for example:

We can suppose that, in a particular formal system these propositions are each represented by a corresponding string of symbols we may write as Pn. We need only assume that there is an algorithm for obtaining Pn given n

He starts with arithmetics in general, and uses particular algorithms to represent arithmetic propositions in particular formal systems.

The statement of the theorem makes no reference to an arithmetization

The theorem uses G. And G is defined in terms of Apply and Proves. Take Apply, defined, in your words, by:

Does number s encode the sentence that is produced when we substitute m for the free variable of encoded formula n?"

In my opinion, to be able to decide this, Apply must contain the arithmetization. And so does Proves. Their definition needs to contain the whole arithmetization, if they are capable of taking any s, n and m, or, in the case of Proves, any p and s. If we remove all defined symbols in the sentence, as you proposed in a comment above, the arithmetization should become explicit. And the number #G should be intact, of course.

And it seems to me that the theorem becomes much more useful and comprehensible, if we require the arithmetization within it.

Your wording here confuses me. Do you think Godel's theorem is correct as stated but not useful or comprehensible? Doesn't your proposed modification lead you to trouble?

What I meant was that, if you go from "there are undecidable propositions in PA", to "each arithmetization of PA determines an undecidable proposition in PA + the arithmetization, (that can be thought of as structure whithin the proposition, as I think you do with G(#G), or as an additional set of axioms, as Hehner does for his interpreter I,) and there are no (known or imaginable) other undecidable propositions in PA", things make more sense.

Because there is no denying that intuitively, naïvely, pregodelianly, the fact that there are undecidable propositions in arithmetics seems wrong.

But that arithmetizations of PA, that are not exactly self-reference but certainly are self-something, open the way to self reference, that allows us to propose "I am not provable"; this doesn't seem so wrong.

Other undecidable propositions in PA

There are many undecidable propositions in PA other than Gödel sentences. Examples of "naturally" motivated ones include Goodstein's theorem and the strengthened finite Ramsey theorem. Any of the Gödel sentences constructed using particular arithmetizations are merely "artificial" examples among many. Note that these two theorems, as well as all of the Gödel sentences for PA, are "true" statements about what we usually think of as the natural numbers, but there exist non-standard models of PA in which they are false, as a consequence of Gödel's completeness theorem. One way to think about this is that the axioms of PA do not completely pin down the natural numbers. Stronger systems (e.g., second-order arithmetic, set theory) do, but then it's less surprising that they are incomplete. There's a much longer list of statements independent of ZFC, for example. The axiom of choice itself might be considered the first example (being independent of ZF).

The importance of Gödel's first theorem isn't the particular undecidable sentence used in the proof, which isn't of much independent interest. Rather, it shows once and for all that there's no way to "complete" a set of axioms for a system capable of expressing arithmetic, for example by adding the axiom of choice to ZF, so that all sentences are decidable. The expectation is that for any given system, we would always be able to find "naturally" motivated examples of undecidable sentences besides just the Gödel sentences.

Goodstein

I've been reading up on Goodstein et al, and I was quite mistaken about the stuff. I was thinking that those arguments also used some encoding. So I agree with you that my perspective of considering alternative arithmetizations in Gödel is useless as regards incompleteness. I stand corrected, thank you :)

The importance of Gödel's first theorem isn't the particular undecidable sentence used in the proof, which isn't of much independent interest.

The only thing that remains in my mind from this discussion, that still seems to me that might provide value to the perspective I am taking, is that those sentences might be of interest. In the sense of their correspondence to arithmetizations, in the sense of using them to access the arithmetization.

[edited to remove some stuff, I had rendered wrongly the argument I meant to make. I'll try to express it right.]

This is what I meant

The scenario I am going to paint is slightly convoluted; it is designed to correspond to a simulation scenario, in the sense of the matrix films.

Suppose we have a theorem prover that has, not only PA, but also an arithmetization of PA, so that we can input Gödel numbers and the prover will output 'yes' when the corresponding proposition has a proof and 'no' when its negation has a proof (and perhaps 'invalid' when the number does not correspond to a sentence; and let's for the moment forget problems of time).

And suppose we also have an "encoder", that takes sentences in PA (in whatever appropriate input language) and arithmetizes (or encodes) them -- with the arithmetization that the prover understands.

The need for the encoder is that in this case, we are in the same world as the prover, whereas in the matrix, we are in the encoded world; whatever we do there is already encoded. So, corresponding to that fact, we use the encoder.

So the scenario is that we feed sentences to the encoder, the encoder encodes and feeds them to the prover, and the prover tells whether they are a theorem.

Suppose we have 2 different arithmetizations and know that one of them is the one that the machines use, but not which one. Could we use the Gödel sentence to find out which?

If we build it with the correct arithmetization, and give it to the encoder, it will produce a number which is in itself a Gödel sentence. So the prover should find undecidability very soon, just analysing the number with its own arithmetization: it will find, once the number is fully analysed, some particular numbers that it understands whithin its arithmetization, and determine undecidability.

But if we build it with the incorrect arithmetization, the prover will receive a number that contains a Gödel sentence, but only after it uses the arithmetization that is hidden in that number to find out. The prover will have to work more to find out that the proposition is undecidable, it will have to use the new arithmetization to analyse the particular numbers hidden in the sentence.

Can undecidability be detected? Does this make any sense?

depends on the algorithm?

So the prover should find undecidability very soon

This would depend on the algorithm, on the possibility that the prover might recognize that the received encoding is the same as the one it already had... I'm really not sure.

the usual thing is to

the usual thing is to provide a theorem generalized for all those arithmetizations.

This isn't usual at all. You're the only person I've seen worried about making this generalization.

No, if you read Martin Davis carefully, you realize that the F take the role of arithmetizations there. He says, for example:

We can suppose that, in a particular formal system these propositions are each represented by a corresponding string of symbols we may write as Pn. We need only assume that there is an algorithm for obtaining Pn given n

He starts with arithmetics in general, and uses particular algorithms to represent arithmetic propositions in particular formal systems.

This is not arithmetization, though. This is generalizing the logic from PA. The arithmetization step is (very roughly) about finding an isomorphic structure to your logic in the model of your logic. So for Godel and PA, it's about finding an isomorphic structure to the rules of PA in the natural numbers. Whereas this generalization is about changing the rules of PA to some arbitrary computable logical system.

The theorem uses G.

The theorem does not reference G at all. The particular proof of the theorem I sketched does.

In my opinion, to be able to decide this, Apply must contain the arithmetization. And so does Proves. Their definition needs to contain the whole arithmetization, if they are capable of taking any s, n and m, or, in the case of Proves, any p and s. If we remove all defined symbols in the sentence, as you proposed in a comment above, the arithmetization should become explicit. And the number #G should be intact, of course.

What does it mean for a proposition to "contain the arithmetization"? The construction of the sentence depends on the arithmetization used. The arithmetization is only "explicit" as a really big expression in the proposition, pinning down the structure of PA within the natural numbers. Proves and Apply are built out of forall quantifiers and basic PA primitives (+ * =). #G is just a number.

Your propose this new theorem:

What I meant was that, if you go from "there are undecidable propositions in PA", to "each arithmetization of PA determines an undecidable proposition in PA + the arithmetization, (that can be thought of as structure whithin the proposition, as I think you do with G(#G), or as an additional set of axioms, as Hehner does for his interpreter I,) and there are no (known or imaginable) other undecidable propositions in PA", things make more sense.

You can prove another theorem if you want. The second half of the theorem you've stated looks hard. IMO having the arithmetization as a parameter is going to just make the theorem less useful, because no one cares about that. If you could actually prove that every unprovable sentence arises in this way from some arithmetization, that might make people care, but can you actually prove that? How are you even going to enumerate possible arithmetizations? Things make perfect sense as they are. Imagine I proposed parameterizaing the Pythagorean theorem by a coordinate system so that it made more sense to me. AFAICT, that's completely analogous to your proposal.

Because there is no denying that intuitively, naïvely, pregodelianly, the fact that there are undecidable propositions in arithmetics seems wrong.

I deny that. Why should arithmetic be complete? Hilbert et. al. just kind of hoped (I don't think they even assumed) that it would be complete. Were the Pythagoreans justified in assuming that the rationals were complete? I don't think so.

generally correct

Yes, I now see that you are generally correct. Thanks :)

I think we agree

The following is not inconsistent with boolean logic to me as boolean logic does not say anything about "T |-"
G = not T |- "G"

Only when you add an axiom allowing to "equate", that is, rewrite T |- "G" as G, you can derive:
G = not G

Which is indeed inconsistent with boolean logic axioms

Inconsistency

I'm not sure I understand you. The question here is undecidability / incompleteness, not inconsistency. If you define G first as "X" (not T |- "G") and then as "not X" (T |- "G"), you obviously have a contradiction, but that's trivial...

It would be awesome

It would be awesome if you was right about this. You know, I read somewhere that Stephen Hawking stopped to work on Theory of Everything when he realized what Gödel wrote. Is there any way to formally prove the opposite, that all true terms are provable, when dealing with arithmetic? Isn't that what a completeness proof should be about?

I'm worried about an infinite set of tautologies. If they are not countable, how to prove all of them at once?

Godel did first prove a

Godel did first prove a completeness theorem that says that every proposition that holds in all first order models is provable in first order logic. Then he proved his incompleteness theorem showing how to construct true formulas that aren't provable. But propositions are finite and so the set of them and thus tautologies is countable.