Negation in Logic Languages

This is a follow-up to my earlier post (I will try and put a link here) about negation in Prolog. The question was whether negation is necessary, and I now have some further thoughts (none of this is new, but it is new to me). I can now see that my earlier approach to implementing 'member of' relies on evaluation order, which no longer works now that I have moved to iterative deepening. So I need a way of introducing negation that is logically sound.

Negation as failure leads to unsoundness with variables. Negation as refutation seems to require three-valued logic, and negation as inconsistency requires an additional set of 'negative' goals, which complicates the use of negation. The most promising approach I have found is to convert negation into inequalities, and propagate 'not equal' as a constraint in the same way as CLP. This eliminates negation in goals in favour of equality and disequality. I am interested to hear if anyone has experience of treating negation this way, or any further thoughts about negation in logic programming or logical frameworks.
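To make the idea concrete, here is a minimal Python sketch (all names are mine, purely illustrative): instead of executing '\+ (X = a)' by negation as failure, we record X ≠ a as a constraint attached to the variable and re-check it whenever the variable becomes bound.

```python
# Minimal sketch: a disequality recorded as a constraint rather than
# executed as negation-as-failure. Names are illustrative only.

class Var:
    """A logic variable with a binding slot and attached constraints."""
    def __init__(self, name):
        self.name = name
        self.binding = None       # None means unbound
        self.neq = set()          # ground terms this variable must not equal

def add_neq(var, term):
    """Record the constraint var != term (term assumed ground here)."""
    if var.binding is not None:
        return var.binding != term   # already ground: decide immediately
    var.neq.add(term)                # otherwise defer the check
    return True

def bind(var, term):
    """Ground the variable, re-checking any deferred disequalities."""
    if term in var.neq:
        return False                 # constraint violated: unification fails
    var.binding = term
    return True

x = Var("X")
add_neq(x, "a")        # from rewriting \+ (X = a) into X != a
print(bind(x, "a"))    # False: violates the deferred constraint
print(bind(x, "b"))    # True: consistent with X != a
```

The point of the sketch is only the deferral: with X unbound the constraint neither succeeds nor fails, it simply waits, which is what negation as failure cannot do.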

Closed vs. Open?

Isn't negation as failure logically sound as long as you stick to a closed set of rules to try? Let's say you have rules:

X -| A
X -| B

Those rules are usually interpreted as A implies X and B implies X, which you can also write as:

X = A or B or ...

The ... can be thought of as clauses you haven't seen yet -- open semantics. It's not logically valid to conclude that if A is false and B is false then X is false, because of that ... term. If you want to treat negation as failure, you can do it in a logically sound way if you can arrange for your semantics to actually be:

X = A or B

With closed semantics like this, you can conclude not X if you can prove not A and not B.
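A tiny propositional sketch of this in Python (the representation is mine): with the clause set for X closed, negation by failure coincides with logical negation.

```python
# Sketch: under a closed reading, X = A or B, so not X follows from
# not A and not B. Under an open reading (unknown extra clauses) it wouldn't.

clauses = {"X": [["A"], ["B"]]}   # X :- A.  X :- B.  (the complete set)
facts = {"A": False, "B": False}

def holds(p):
    if p in facts:
        return facts[p]
    # p holds iff the body of some clause for p is entirely true
    return any(all(holds(q) for q in body) for body in clauses.get(p, []))

def neg_closed(p):
    """Sound only because the clause set for p is closed."""
    return not holds(p)

print(neg_closed("X"))   # True: not A and not B, and there are no other clauses
```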

Backtracking and Variables

The problem is with unifying variables and backtracking. The following is unsound:

vr(X) :- \+ \+ (X=0), \+ \+ (X=1).


It is unsound because it reveals the binding state of 'X': vr(X) succeeds when X is unbound, yet it fails for every ground X.

An interesting solution is

An interesting solution is to not build in equality at all. Equality could be a user defined procedure. Similarly, conjunction can be a user defined procedure. Then you get functional logic programming, and you can define negation simply:

not True = False
not False = True


And you define equality yourself:

equal_int Zero Zero = True
equal_int Zero (S _) = False
equal_int (S _) Zero = False
equal_int (S n) (S k) = equal_int n k
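A runnable transcription of those definitions, with Peano numerals encoded as nested tuples (the encoding is my choice, purely for illustration):

```python
# Transcription of user-defined equality and negation into Python, with
# Peano numerals encoded as None (Zero) and ("S", n).

ZERO = None
def S(n):
    return ("S", n)

def not_(b):
    # not True = False ; not False = True
    return not b

def equal_int(a, b):
    if a is ZERO and b is ZERO:
        return True
    if a is ZERO or b is ZERO:
        return False              # one is Zero, the other is S _
    return equal_int(a[1], b[1])  # equal_int (S n) (S k) = equal_int n k

two = S(S(ZERO))
three = S(S(S(ZERO)))
print(equal_int(two, two))          # True
print(not_(equal_int(two, three)))  # True: negation is just boolean complement
```

Because equality returns an ordinary truth value rather than succeeding or failing, negating it is trivial, which is the point made above.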


split

However, we need some way to prevent overlaps (see the description of member below): if the 'true' branch leads to a longer proof, then iterative deepening will find the 'false' solution first, as clause order no longer determines the result. Here's another example:

1. split(X, cons(pair(X, Y), T1), cons(Y, T2), T3) :-
       split(X, T1, T2, T3).
2. split(X, cons(Y, T1), T2, cons(Y, T3)) :-
       split(X, T1, T2, T3).
3. split(X, nil, T2, T3).


This version of split will not work with breadth-first or iterative-deepening search, due to the overlap between clauses (1) and (2). We need to be able to express the disequality of X and Y in clause (2); this is complicated by the fact that X or Y may not be ground, and hence the need to propagate the disequality as a constraint.

I don't understand split.

I don't understand split. Sometimes it has 3 arguments, sometimes 4. Is that intended, or is it a typo? Can you give a description of what split is supposed to do?

Perhaps I still understand your general point. Just like you can translate repeated variables to explicit equalities:

foo(... X ... X ...) :- ...

to

foo(... X ... Y ...) :- X=Y, ...


Can't you fix the case where they shouldn't be equal by adding an explicit disequality X /= Y (or \+ X=Y in your syntax)? In logic languages it's expected that overlapping clauses are tried both ways, right? So if you don't want that, you need to rule it out explicitly.

In Hansei variables are conceptually always grounded. That's exactly what makes negation so easy: you don't need to keep track of equalities or inequalities.

typos

Sorry, I was typing it on a phone and left some variables out. It matches the first argument (the key) against a key/value list in the second argument, gathering all the values where the key matches in the third argument, and all the non-matching key/value pairs in the final argument.
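For concreteness, the description above can be rendered as a plain function (a Python rendering, purely illustrative, not the relational version):

```python
# A functional reading of split, per the description above: gather the
# values whose key matches, and the non-matching key/value pairs.

def split(key, pairs):
    matches, rest = [], []
    for k, v in pairs:
        if k == key:
            matches.append(v)     # clause (1): key matches, keep the value
        else:
            rest.append((k, v))   # clause (2): needs key != k, keep the pair
    return matches, rest

print(split("a", [("a", 1), ("b", 2), ("a", 3)]))
# -> ([1, 3], [('b', 2)])
```

The `else` branch is exactly where the relational version needs the disequality: run backwards with ungrounded variables, "key does not match" has to become a propagated constraint rather than a test.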

I am suggesting adding disequalities as the solution, but this has to be done by constraint propagation to deal with ungrounded variables.

The other bit is that when you have equality (which is normal unification) and disequality, you can get general goal negation by negation elimination, without implementing a specific negation operator. This is useful because all the definitions of a negation operator seem to have problems.

Example of Wrong Result

The vr predicate above can be used as in:

foo(X) :- vr(X), X=1.

?- foo(X).
X = 1
Yes
?- foo(1).
No


If we rewrite '\+ (X=0)' as '(X<>0)', then the double negation '\+ \+ (X=0)' becomes '(X=0)', and vr reduces to '(X=0), (X=1)', which fails for every X. We can use De Morgan's laws to rewrite expressions with multiple negations: '\+ ((X=0) & \+(Y=1))' would become '(X<>0) | (Y=1)', etc...
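The rewriting can be sketched as a small transformation over goal terms (the representation below is illustrative, not from any implementation): 'not' is pushed inward by De Morgan's laws until it reaches an equality, which flips to a disequality.

```python
# Sketch of negation elimination by De Morgan: push 'not' inward until it
# meets an equality, which flips to a disequality (and vice versa).
# Goal representation (mine): ("not", g), ("and", a, b), ("or", a, b),
# ("eq", x, y), ("neq", x, y).

def eliminate(goal):
    tag = goal[0]
    if tag == "not":
        return negate(eliminate(goal[1]))
    if tag in ("and", "or"):
        return (tag, eliminate(goal[1]), eliminate(goal[2]))
    return goal

def negate(goal):
    tag = goal[0]
    if tag == "eq":
        return ("neq", goal[1], goal[2])
    if tag == "neq":
        return ("eq", goal[1], goal[2])
    if tag == "and":   # De Morgan: ~(a & b) = ~a | ~b
        return ("or", negate(goal[1]), negate(goal[2]))
    if tag == "or":    # De Morgan: ~(a | b) = ~a & ~b
        return ("and", negate(goal[1]), negate(goal[2]))
    raise ValueError(tag)

# '\+ ((X=0) & \+(Y=1))' becomes '(X<>0) | (Y=1)':
g = ("not", ("and", ("eq", "X", 0), ("not", ("eq", "Y", 1))))
print(eliminate(g))   # ('or', ('neq', 'X', 0), ('eq', 'Y', 1))
```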

Can you back up to a problem statement?

i.e. can you give an example of the kind of thing you'd like to be able to express in logic?

I'm not much of a logic programming guy, but I agree that using variables in conjunction with negation-as-failure doesn't look sound. There's an implicit existential quantifier that's being negated. When you do a double negation, you should be left with an explicit quantifier ... it's no longer equivalent to the skolemized version for purposes of unification.

Overlapping Clauses

At the moment I need to make sure clauses do not overlap, otherwise iterative deepening can find the wrong solution. Consider:

1. member(X, cons(X, _), true).
2. member(X, nil, false).
3. member(X, cons(Y, T), Z) :-
       member(X, T, Z).


The problem is that (1) and (3) overlap. So I would like to express ~eq(X, Y); being able to negate queries seems more useful than just 'neq', although with negation elimination it turns out to be the same thing. Propagating the disequality like a constraint, until contradiction or until both sides are ground, is necessary to avoid two ungrounded variables being counted as equal (unification can be seen as propagating an equality constraint).

Axiomatic Language

Axiomatic language is a form of logic programming without negation. It is just a formal system for defining recursively enumerable sets of symbolic expressions. The assumption is that "negative" predicates, such as 'not_member', can be defined explicitly, in a positive way, as needed.

A more recent development, extended axiomatic language, uses the complement of the above recursively enumerable sets for specification. This provides a form of negation, but is not computable, in general.

not-equal

I looked at Axiomatic Language, and it has the kind of disequality and equality constraints I am thinking of. I prefer the minimal syntax of Prolog-like languages, but that's obviously a personal thing. I wonder how the complement of recursively enumerable sets relates to the conversion of negative clauses into positive clauses with propagated disequality constraints.

Logic is at least ternary.

The law of the excluded middle is bogus.

If you can derive a definite "False" for both a statement and its negation, due to a lack of evidence supporting either as True, then you are working with a system and laws of logic that are Wrong.

Case in point: God Exists: no proof. God Does Not Exist: no proof. If we admit the law of the excluded middle we have both G: no proof -> False and ~G: no proof -> False. We have derived both G and ~G, so we have a disproof by contradiction of the law of the excluded middle. Therefore logic is at least ternary, and G: no proof -> Maybe.

And yet, people have fought wars, killed, and died because they were unable to accept "Maybe" as an answer.

It's kind of a shame George Boole did such nice rigorous consistent work with 2-valued logic: it's kept everyone from even considering the theory that logic might have more than two values.

I once developed a nice heuristic theorem-proving system that was three-valued: It checked logical propositions in order according to a "Relevance" heuristic that was supposed to be how much value they had in the solution of the desired goal. The test proposition started with a truth value of "Maybe" and relevance of 1.0. Everything that was introduced via backward chaining started with a truth value of "Maybe" and relevance inherited from its parent (or parents, if the same proposition appeared in more than one subtree). This meant that sometimes the system could halt (reach relevance zero) while the test proposition still had a value of "Maybe". And in those cases, I strongly feel that it was right to do so.

So why is this relevant? It sort of parallels the way humans reason. In practical terms, we're usually looking for a self-consistent set of propositions that includes both our axioms (whatever we consider axiomatic) and a truth value for our test proposition (whatever test proposition we want to know an answer for). Overlooking "Maybe" (meaning, given our current information we just can't tell) as an answer is all too common for undisciplined human thought, and can lead to false or inconsistent conclusions.

fuzzy logic?

clueless question: is that not what all those fuzzy logic folks were on about?

fuzzy truth versus fuzzy relevance

He seems to have done something a little different. Fuzzy logic makes the truth value a number in a continuous range, whereas, if I'm tracking, he only had three values (yes no maybe), and a continuous number for relevance. I'd not be surprised if it comes out formally equivalent to fuzzy logic, but I doubt it's ergonomically the same (suspect it produces a different user experience).

Only 3-valued.

I was admitting only three truth values; nothing took on a value of True or False unless it was definitely, provably, True or False.

Relevance in this case had nothing to do with what truth value got determined for a proposition. Rather, it was an estimate of how important it was to LEARN that truth value for purposes of showing the truth or falsehood of the test proposition.

For example, if my test proposition was A, and I had a law that B AND C -> A, then both B and C would have a relevance of 0.25. If I also had a law that ~B AND D -> A, then C and D would have relevance 0.25, but B would now derive relevance from two different propositions, so it would have relevance 7/16. (The combination rule for relevance from different logic subtrees is 1 minus the product of the complements, 1 minus the relevance, in each subtree.)

Anyway, Bayesian logic gives propositions a probability of truth, and fuzzy logic gives propositions a degree of truth. Those are confusingly similar, but not the same, and relevance is similar to neither. To explain the differences....

Consider the proposition, "The liquid in this container is drinkable."

If you have (Fuzzy Logic) truth degree 0.7, then that means it's drinkable, but nasty. It could be ditch water that's likely to make you sick, or it might be whiskey that'll kill you if you drink too much of it at once. But anyway, it's 70% drinkable, and that means you'll be able to take it without risking your life if you're at least a bit careful and take only small doses at a time.

If you have (Bayesian reasoning) truth probability 0.7, then it's 70% likely to be pure refreshing water or fruit juice or milk or something - but also has a 30% chance of being something like poison or a Dengue Fever culture, a single drop of which could be fatal.

And if you have a relevance of 0.7? That says nothing about the truth value: relevance is only defined relative to some test proposition. The test proposition could be something like, "There is food and drink on this table which I can eat and survive." In that case 0.7 would mean that knowledge of the proposition's truth or falsehood is about 70% likely to be necessary in constructing a proof of the truth or falsehood of the test proposition.

Relevance isn't nearly as rigorously defined a notion, and there's really no mathematically correct a priori method for deriving it, but having a decent heuristic for it can change a depth-first or breadth-first search into something like an A* or directed search for a proof.

The things to remember about relevance are first that it does not depend on, nor predict a probability that, nor estimate a degree to which, a proposition is true or false. It is instead an a priori estimate of how important it is to KNOW whether it is true or false. Also, it's wildly unstable and varies during the search for a proof depending on what else you already have proven and disproven.

For example, where the test proposition is T, if you have A AND B AND C AND D AND E -> T and you know the truth value of none of those propositions, then A has relevance 1/32. But with the same law, and also the knowledge that B and C and D and E are all true, A suddenly has relevance 1/2. Nothing has changed except the set of other propositions you've shown and the truth values you've shown them to have.
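The arithmetic above can be checked with a couple of small helpers. This is my reconstruction of the rules from the examples given: a conjunct inherits the parent relevance divided by 2 to the power of the number of still-unknown conjuncts, and relevances from different subtrees combine as 1 minus the product of the complements.

```python
# Checking the relevance arithmetic above (reconstruction, not the
# original system's code).

def conjunct_relevance(parent, unknown_count):
    """A conjunct of a law inherits parent / 2**k, k = unknown conjuncts."""
    return parent / 2 ** unknown_count

def combine(*rs):
    """Relevance from several subtrees: 1 minus the product of (1 - r)."""
    prod = 1.0
    for r in rs:
        prod *= (1.0 - r)
    return 1.0 - prod

# B AND C -> A gives B and C relevance 0.25 (two unknowns, parent 1.0);
# a second law ~B AND D -> A gives B another 0.25, combining to 7/16:
assert conjunct_relevance(1.0, 2) == 0.25
assert combine(0.25, 0.25) == 7 / 16

# A AND B AND C AND D AND E -> T: five unknowns gives A relevance 1/32;
# once B..E are known true, only A is unknown, so its relevance is 1/2:
print(conjunct_relevance(1.0, 5))   # 0.03125, i.e. 1/32
print(conjunct_relevance(1.0, 1))   # 0.5
```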

If you can derive a definite

If you can derive a definite "False" for both a statement and its negation, due to a lack of evidence supporting either as True

This does not follow. A lack of evidence for A is not a proof of A, nor is it a proof of NOT(A). A proof is a series of deductive steps concluding A xor NOT(A). Missing the deductive steps proving A simply means you lack the proof, it does not mean the proof does not exist, nor does it mean NOT(A) must be true. The truth value is simply inconclusive at this time.

Proof and Disproof

A lack of proof could be because there is no proof, or because not all axioms are encoded in the database.

However, the logic may be incomplete, in which case neither a proof nor a disproof may exist for some theorems. The logic may be inconsistent, that is, both a proof and a disproof may exist for some theorems. Invoking Gödel, we cannot have both consistency and completeness (for a sufficiently expressive logic), so one of the above must be true.

My thought is that the logic itself does not need more than two values (true/false), as statements can be either proofs or disproofs. In other words, there should be the following interpretations:

theorem true = proved
theorem false = not proved
~theorem true = disproved
~theorem false = not disproved


From this it is clear negation and complement are not the same thing in general logics.

My understanding is that negation elimination in logical frameworks would have the desired properties without needing a multi-valued logic. As it propagates disequalities, it is searching for a disproof by contradiction: the constrained variables cannot be the same, so failure implies that at some point in the search they either take the same ground value or are unified.

For a four-valued logic including incomplete and inconsistent, you would have to do two searches for every statement, to see whether a proof and/or a disproof can be found, so as to know which of the four values to return. For efficiency reasons I would prefer a two-valued result, where you have to explicitly specify whether you are searching for a proof or a disproof.
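A sketch of that classification (the two proof searches themselves are stubbed out; this only shows how their two results map onto the four values):

```python
# Four-valued scheme: run two searches, one for a proof and one for a
# disproof, and classify the pair of outcomes.

def classify(proof_found, disproof_found):
    if proof_found and disproof_found:
        return "inconsistent"   # both a proof and a disproof exist
    if proof_found:
        return "true"           # proved
    if disproof_found:
        return "false"          # disproved
    return "incomplete"         # neither found: inconclusive

print(classify(True, False))    # true
print(classify(False, False))   # incomplete
```

The cost objection above is visible here: both searches must run to completion before any value can be returned, whereas a two-valued query runs only the search the caller asked for.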

However, the logic may be

However, the logic may be incomplete, in which case neither a proof nor disproof may exist for some theorems.

Yes, but the propositions are still either true or false, regardless of provability, so the excluded middle isn't falsified in any meaningful sense.

Complement is not Negation

Your comment above was "it does not mean the proof does not exist", and as I pointed out, it may indeed mean a proof does not exist. So you have moved the goal-posts to: it's 'true' even if you cannot prove it's true.

The law of the excluded middle says:

theorem false -> ~theorem true
~theorem false -> theorem true


Which is not true for a logic language, as not-proven is clearly not the same as disproved. The law of the excluded middle is valid by definition in a Boolean algebra, but this is not the right model for logic proofs like above. So you are right, there is nothing invalid about the law of the excluded middle for Boolean algebras, it is simply that a Boolean algebra is not the right model for proof search, which is the fundamental mode of operation of a logic language. A Heyting algebra (where complement and negation are not the same operation, and hence there is no law of the excluded middle) seems to be the right model as it corresponds to intuitionistic logic.

Which is not true for a

Which is not true for a logic language

The statements in the original post were not restricted to logic languages, he specifically said "systems and laws of logic" that use excluded middle are wrong. This is clearly incorrect as a general statement, and specifically what I objected to. I certainly haven't moved any goalposts unless you assume my statements were made in a more specific context.

But yes, it's well known that the excluded middle does not translate to constructive logics, although some restricted versions have been formulated.

In General

I probably was assuming a more specific context, so in retrospect you probably weren't moving the goalposts, and I agree the excluded middle is not 'wrong', but in the same sense neither is it 'correct'. It is correct when used in the specific case of a logic in which it holds, which is a subset of all possible logics.

The way I understand it, the existence of some logic in which the law of the excluded middle does not hold means that it cannot be said to hold in general. It is therefore incorrect as a general statement. There exist specific logics (like Boolean logic) in which it does hold, but you cannot generalise to other logics. I guess it's your use of 'general' which bothers me.

The way I understand it is

The way I understand it, the existence of some logic in which the law of the excluded middle does not hold means that it cannot be said to hold in general. It is therefore incorrect as a general statement. [...] I guess it's your use of 'general' which bothers me.

I didn't make a general claim about the excluded middle; I said the original poster's unqualified general claim that the excluded middle is bogus is false (given the context and the examples cited). Other than that, we agree, ie. the excluded middle isn't a general principle that always holds.

Excluded middle

The law of the excluded middle doesn't hold for all logics, but I think we can all agree that for any particular logic, it either holds or it doesn't.

It can also be admissible, that is, the law of the excluded middle can be included as a rule in the logic. So, for example, in intuitionistic logic you can add the law of the excluded middle so that it applies locally, whereas in classical logic it must always apply (you cannot have it locally not apply). Finally, it can be inadmissible, which means that including the law of the excluded middle as a rule will directly result in a contradiction.

Well, exactly!

That was kind of my point. Lack of evidence for A is NOT in fact a proof of NOT A, nor a proof of A. And if you treat it as such, you are manifestly reasoning by bogus rules.

But "logic languages" like Prolog DO in fact treat the inability to prove something true as evidence that that thing is false.

Therefore the rules these systems are using for reasoning are WRONG.

Negation is the problem.

Technically Prolog returns true or false, where the statement could either be a proof or a disproof. For example, 'member(X,Y)' true would prove X is a member of Y. 'not_member(X,Y)' true would prove X is not a member of Y, which is a disproof of 'member(X,Y)'. The point to note is that with negation as failure, '~member(X,Y)' is not the same thing as 'not_member(X,Y)', so it is the definition of negation that is the problem. My understanding is that by using negation elimination (using De Morgan's laws to replace negated equality with disequality) we can get the correct behaviour for negation. What Prolog calls negation is in fact the complement, which is the same as negation in a Boolean algebra, but different in others, for example a Heyting algebra, which I think is what we need here.

closed vs. open universe?

Is that not due to open vs. closed universe, like how Datalog is open and Prolog is closed?

Proof vs Disproof

I think it's due to a lack of proof not being the same thing as a disproof, but I can't prove that :-) Some things may just be unprovable, even in a closed universe.

Not equal easier than negation

I have implemented a not-equal builtin that uses deferred goals, which works similarly to constraint logic programming, except that there is a single not-equal constraint. The asymmetry between equals and not-equals is interesting: we can 'unify' variables that are equal instead of propagating an equality constraint, but it appears we have to treat not-equals as a constraint. Prolog defines 'dif/2', which works like this, so the only novelty in my code is that it does not also provide the unsound not-equals, uses sound unification, and searches by iterative deepening. A membership test can be written like this (the source code for the interpreter is at https://github.com/keean/Clors.git):

1. f(Key1, cons(K1, Tail1)) :-
       neq(Key1, K1),
       f(Key1, Tail1).
2. f(Key2, cons(Key2, Tail2)).

:- f(X1, cons(a, cons(b, cons(c, nil)))), neq(X1, a), neq(X1, b).

DEPTH 9 ELAPSED TIME: 186us

PROOF:
1. f(Key1 = c, cons(K1 = a, Tail1 = cons(b, cons(c, nil)))) :-
neq(Key1 = c, K1 = a),
f(Key1 = c, Tail1 = cons(b, cons(c, nil))).
0. neq(Key1 = c, K1 = a).
1. f(Key2 = c, cons(K2 = b, Tail2 = cons(c, nil))) :-
neq(Key2 = c, K2 = b),
f(Key2 = c, Tail2 = cons(c, nil)).
0. neq(Key2 = c, K2 = b).
2. f(Key3 = c, cons(Key3 = c, Tail3 = nil)).
0. neq(Key2 = c, K2 = b).
0. neq(Key1 = c, K1 = a).
0. neq(X1 = c, a).
0. neq(X1 = c, b).

yes(X1 = c).
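The mechanism can be sketched roughly as follows (a Python illustration of deferred disequality over disjoint-set variables; the details are mine and are not taken from the Clors source):

```python
# Deferred not-equal constraints over union-find variables: a neq is
# stored on both variables and re-evaluated when either is grounded
# or when the two are unified.

class Cell:
    def __init__(self):
        self.parent = self       # disjoint-set parent
        self.term = None         # ground term, once known
        self.neq = []            # cells this one must stay different from

def find(c):
    while c.parent is not c:
        c.parent = c.parent.parent   # path halving
        c = c.parent
    return c

def check_neq(c):
    """A deferred neq fails only once both sides are ground and equal,
    or the two variables have been unified."""
    c = find(c)
    for other in c.neq:
        o = find(other)
        if o is c:
            return False                      # unified: definitely equal
        if c.term is not None and c.term == o.term:
            return False                      # both ground and equal
    return True

def neq(a, b):
    a, b = find(a), find(b)
    a.neq.append(b)
    b.neq.append(a)
    return check_neq(a)

def assign(c, term):
    """Ground a variable, re-evaluating the constraints attached to it."""
    c = find(c)
    c.term = term
    return check_neq(c)

x, k = Cell(), Cell()
print(neq(x, k))          # True: both unbound, so the constraint is deferred
print(assign(k, "a"))     # True: x is still unbound
print(assign(x, "a"))     # False: grounding x to 'a' violates neq(x, k)
```

This mirrors the trace above: neq(Key1, K1) is posted while both sides may be unbound, and only decides once grounding (or unification) makes the answer definite.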


However, negation seems somewhat more problematic, as we would like to be able to derive something like:

even(nil).
even(succ(succ(Y))) :- even(Y).

:- even(X).
yes(X = nil).

not_even(succ(nil)).
not_even(succ(succ(Y))) :- not_even(Y).

:- not_even(X).
yes(X = succ(nil)).

where:

~even = not_even


It seems relatively easy to defer evaluation of 'even' until we have ground terms, and then use negation-as-failure via the deferred-goals mechanism from the not-equals constraint, but that would not return the same result as 'not_even' when passed a variable.

This seems like the approach

This seems like the approach used by miniKanren (section 7).

MiniKanren

I find the reasoning in section 7 very familiar, so its good to see that other people have thought about this the same way. I think the implementation is similar, except that miniKanren is operating with immutable data, whereas I am using a disjoint set to represent variables. This means that rather than simplifying the not-equality constraints, I just need to store them (they get attached to variables as an attribute) and re-evaluate them later when something grounds the variable they are attached to.