The Logic of Plausible Reasoning

The objective Bayesian interpretation of probability theory

The most important contribution to the philosophy of science after Popper's "logic of scientific discovery", and the correction of the greatest flaw in Popper's critical rationalism, is what has been formally named the "objective Bayesian interpretation of probability", but is more appropriately called the "logic of plausible reasoning". It was developed by Cox [Cox 1946] and Jaynes [Jaynes 1983], [Jaynes 2003].

What is the point of this "logic of plausible reasoning"? In our everyday life, plausible reasoning plays an important role. In fact, it is even more important than the rules of formal classical logic, because we are seldom in a situation where these rules would be sufficient to prove something important in our life. Given our incomplete information about the world, we have no other choice but to use plausible reasoning. Since plausible reasoning necessarily gives only plausible, uncertain results, it seems natural to leave the rules of plausible reasoning imprecise and uncertain too. But this turns out to be wrong - while degrees of plausibility represent, by their nature, imprecise knowledge, the rules for handling them are very precise, namely the rules of standard probability theory.

How is this possible? The key is the requirement of consistency. If a conclusion can be reached by different ways of reasoning, the final result should be exactly the same, even if it is only a degree of plausibility. Otherwise, there is some inconsistency in the reasoning. That the thing itself, a degree of plausibility, is something quite vague does not matter at all - logical consistency requires that, whatever the result, it should be the same if we start with the same assumptions and apply only consistent reasoning.

Beyond the requirement of consistency of reasoning, there remains some freedom of choice. But insofar as different sets of consistent rules can be shown to be equivalent, we are free to prefer the simplest one. If, in particular, the degree of plausibility X is simply a real number between 0 (certainly wrong) and 1 (certainly true), then we can also use, instead, some function f(X), as long as it maps 0 to 0, 1 to 1, and greater X to greater f(X). Such a replacement changes the rules - those for X will differ from those for f(X). Whether we call X or f(X) the degree of plausibility changes nothing, if done consistently. But such a change can be used to simplify the mathematical rules of reasoning, and we can prefer the scale on which the rules are simplest.
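A small numerical sketch may make this freedom concrete. It assumes, purely for illustration, the rescaling f(X) = X² and some example numbers; neither is taken from Cox or Jaynes. Under this rescaling the product rule keeps its form, while the sum rule has to be rewritten with square roots, which is one reason to prefer the usual probability scale.

```python
import math

def f(x: float) -> float:
    """Hypothetical monotone rescaling with f(0) = 0 and f(1) = 1."""
    return x * x

def f_inverse(y: float) -> float:
    """Inverse of the hypothetical rescaling on the interval [0, 1]."""
    return math.sqrt(y)

# Example plausibilities on the usual probability scale (illustrative numbers).
p_b = 0.5                      # p(B)
p_a_given_b = 0.4              # p(A|B)
p_ab = p_a_given_b * p_b       # product rule: p(A and B) = p(A|B) p(B)
p_not_b = 1.0 - p_b            # sum rule: p(B) + p(not B) = 1

# The same plausibilities on the rescaled scale.
q_b, q_a_given_b, q_ab, q_not_b = f(p_b), f(p_a_given_b), f(p_ab), f(p_not_b)

# The product rule keeps its form, because squaring commutes with multiplication.
product_rule_holds = math.isclose(q_ab, q_a_given_b * q_b)

# The sum rule does not keep its form: q(B) + q(not B) is no longer 1 ...
naive_sum = q_b + q_not_b
# ... it has to be rewritten as sqrt(q(B)) + sqrt(q(not B)) = 1.
rescaled_sum = f_inverse(q_b) + f_inverse(q_not_b)

print(product_rule_holds, naive_sum, rescaled_sum)
```

Both scales describe the same state of knowledge; only the bookkeeping differs.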

As a consequence, it can be proven that the rules of plausible reasoning are exactly those of probability theory. And the assumptions we have to make to prove this are sufficiently simple and self-evident.

The completion of the equivalence to Kolmogorovian probability theory

Nonetheless, the equivalence between classical (Kolmogorovian) probability theory and the logic of plausible reasoning was incomplete.

Our system of probability, however, differs conceptually from that of Kolmogorov in that we do not interpret propositions in terms of sets, but we do interpret probability distributions as carriers of incomplete information. [ Jaynes 2003, p.50]

The Kolmogorov system of probability (henceforth denoted by KSP) is a game played on a sample space \(\Omega\) of elementary propositions \(\omega_i\) (or ‘events’; it does not matter what we call them at this level of abstraction). ... For all practical purposes, then, our system will agree with KSP if we are applying it in the set-theory context. But in more general applications, although we have a field of discourse F and probability measure P on F with the same properties, we do not need, and do not always have, any set \(\Omega\) of elementary propositions into which the elements of F can be resolved. [ Jaynes 2003, p.651ff]

But this distinction is not necessary. In one direction everything is fine (if there exists a set \(\Omega\), then the rules are the same as in Kolmogorovian probability theory), so we have to care only about the other direction - are there "fields of discourse" F (that is, Boolean algebras of propositions under consideration) for which there exists no such set \(\Omega\) of elementary propositions?

This question was answered by mathematicians long ago, and the result is Stone's representation theorem: Every Boolean algebra can be represented as an algebra of subsets of some set \(\Omega\). The elements of this set are simply the homomorphisms from the "field of discourse" F - the Boolean algebra - into the two-element Boolean algebra consisting of the truth values {false, true}. So, each element of this set of homomorphisms simply assigns, in a logically consistent way, a particular truth value to every proposition. In other words, it defines a logically imaginable possibility.
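For a finite field of discourse the construction can be spelled out directly. The sketch below assumes a Boolean algebra freely generated by three hypothetical atomic propositions named `rain`, `wind` and `cold`; in that special case the homomorphisms into {false, true} are exactly the truth assignments to the atoms. It then checks that the logical operations correspond to set operations on \(\Omega\).

```python
from itertools import product

ATOMS = ("rain", "wind", "cold")  # hypothetical atomic propositions

# Omega: every logically consistent assignment of truth values to the atoms.
# Each assignment extends uniquely to a homomorphism into {False, True}.
OMEGA = [dict(zip(ATOMS, values))
         for values in product((False, True), repeat=len(ATOMS))]

def as_subset(proposition):
    """Represent a proposition by the indices of the omegas in which it is true."""
    return frozenset(i for i, omega in enumerate(OMEGA) if proposition(omega))

def rain(w):
    return w["rain"]

def wind(w):
    return w["wind"]

everything = frozenset(range(len(OMEGA)))

# Conjunction, disjunction and negation become intersection, union and complement.
and_ok = as_subset(lambda w: rain(w) and wind(w)) == as_subset(rain) & as_subset(wind)
or_ok = as_subset(lambda w: rain(w) or wind(w)) == as_subset(rain) | as_subset(wind)
not_ok = as_subset(lambda w: not rain(w)) == everything - as_subset(rain)

print(len(OMEGA), and_ok, or_ok, not_ok)
```

The infinite case needs the full theorem, but the idea is the same: a proposition is identified with the set of logically imaginable possibilities in which it holds.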

Application to Bell's theorem

What seems to be a quite simple, purely technical result, which completes the equivalence between the mathematical apparatus of Kolmogorovian probability theory and the logic of plausible reasoning, has unexpected consequences if applied to Bell's inequalities. Namely, whatever observable o is measured, the expectation value can be computed by an integral over this space \(\Omega\): \[ E(o) = \int_{\omega\in\Omega} o(\omega) \rho(\omega) d \omega.\] If we are interested only in those outcomes where some experimenters have made decisions described by some a, then we can simply write each element of \(\Omega\) as \(\omega=(a,\lambda)\) and obtain the conditional expectation value as a function of a as \[E(o|a) = \int_{\lambda\in\Lambda} o(a,\lambda)\rho(a,\lambda) d\lambda,\] with the density understood to be normalized for the given a (that is, divided by the probability of the decision a). Applied to the experimental situation considered by Bell, this gives the following: \[E(AB|a,b) = \int_{\lambda\in\Lambda} A(a,b,\lambda)B(a,b,\lambda)\rho(a,b,\lambda) d\lambda.\]

To prove Bell's theorem, all we need is to reduce this to \[E(AB|a,b) = \int A(a,\lambda)B(b,\lambda)\rho(\lambda) d\lambda.\]

The reduction \(A(a,b,\lambda)B(a,b,\lambda) \Rightarrow A(a,\lambda)B(b,\lambda)\) is Einstein causality, and the reduction \(\rho(a,b,\lambda) \Rightarrow \rho(\lambda)\) follows from a rejection of superdeterminism. Everything else disappeared. There is no "realism" to be rejected to open a loophole for Einstein causality. The only ways to save Einstein causality are superdeterminism and the rejection of logic.
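The reduced form can be checked numerically in the simplest setting. The sketch below assumes the CHSH form of Bell's inequality, with two settings per side and outcomes ±1, which is one common version and is not spelled out in the text above. It enumerates every deterministic assignment \(A(a,\lambda)\), \(B(b,\lambda)\); since any \(\rho(\lambda)\) only averages over these assignments, the bound found for them holds for the whole integral. For comparison, it evaluates the quantum singlet correlation \(-\cos(\alpha-\beta)\) at the standard angles.

```python
import math
from itertools import product

def chsh(E):
    """CHSH combination S = E(0,0) + E(0,1) + E(1,0) - E(1,1) for setting indices 0, 1."""
    return E(0, 0) + E(0, 1) + E(1, 0) - E(1, 1)

# Each lambda fixes the outcomes A(a, lambda) and B(b, lambda) in {-1, +1}
# for both settings on each side: 2**4 = 16 deterministic assignments.
local_values = []
for A0, A1, B0, B1 in product((-1, 1), repeat=4):
    A, B = (A0, A1), (B0, B1)
    local_values.append(chsh(lambda a, b: A[a] * B[b]))

max_local = max(abs(s) for s in local_values)

# Quantum prediction for the singlet state, E = -cos(alpha - beta),
# at the standard angles (in radians) that maximise the violation.
alice_angles = (0.0, math.pi / 2)
bob_angles = (math.pi / 4, -math.pi / 4)
quantum = chsh(lambda a, b: -math.cos(alice_angles[a] - bob_angles[b]))

print(max_local, abs(quantum))
```

The gap between the two printed numbers is what leaves no room for a model of the reduced form.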

Logical independence

Superdeterminism, if taken seriously (and not only as an excuse to get rid of one particular unwanted result), would be the end of science: No experiment which contains at least some statistical aspect would be able to prove anything if superdeterminism were accepted as an excuse. And, in fact, a second look at superdeterminism reveals that accepting it also contradicts the logic of plausible reasoning.

It is the concept of logical independence which would not allow us to accept superdeterminism.

But logical independence forces us to accept much more serious restrictions, some of which seem quite counterintuitive. So this has to be considered separately and in detail.

The dice: \(\frac{1}{6}\) for all numbers

The straightforward example is that of throwing a dice. The possible outcomes are the numbers from 1 to 6. Assume that we have no information which somehow distinguishes between these six values. Then the logic of plausible reasoning obliges us to assign a degree of plausibility of \(\frac{1}{6}\) to each outcome.

But what about a fake, loaded dice? If the dice were loaded, the frequency of getting, say, a six could be much larger than \(\frac{1}{6}\). So, what is prescribed by this "logic of plausible reasoning" can simply turn out to be wrong. What would be the value of a "logic" which forces us to accept things which, if tested, may turn out to be wrong?

The point which justifies the prescription of accepting the \(\frac{1}{6}\) is that at that moment we have no other information. What the rules of plausible reasoning prescribe is the optimal choice given the information which is available at that particular moment.

So, the prescription works only if there is really no information which makes a difference between the six outcomes. What makes the prescription suspicious is that in our example and in our cultural context we do have such information: Dice are often used in games, these games tend to make those who reach higher results, like 5 and 6, the winners, and so those who play such games have some interest in loading dice so that they show 5 and 6 with a probability higher than \(\frac{1}{6}\). So, our cultural background gives us information which makes a difference, and therefore the logical prescription does not apply: Once we have information which makes a difference, we are not forced to accept the \(\frac{1}{6}\).

Let's illustrate this point with a modified example. Assume that the context tells us that cheating is possible, even common. But assume we are in some foreign country which uses numerals we don't know, so that we simply have no information about which of the six signs we see on the dice, say, ၅,၂,၆,၃,၄,၁, the cheaters would prefer. In this case, we would be, again, forced to accept the \(\frac{1}{6}\). If we start to make a difference, even only in guesses like "that ၆ sign is higher than the others, probably it means something more valuable", the symmetry is broken, and we are no longer forced by logic to accept the \(\frac{1}{6}\) prescription.
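The symmetry argument can be sketched in a few lines. The code below assumes that "no information which makes a difference" is formalised as invariance of the assignment under every relabelling of the six faces; this formalisation and the helper name `is_invariant` are illustrative assumptions. An invariant assignment must give all faces the same value, and since the values sum to 1, that value is \(\frac{1}{6}\).

```python
from fractions import Fraction
from itertools import permutations

FACES = range(6)

def is_invariant(assignment):
    """True if relabelling the faces in any way leaves the assignment unchanged."""
    return all(
        tuple(assignment[perm[i]] for i in FACES) == tuple(assignment)
        for perm in permutations(FACES)
    )

uniform = [Fraction(1, 6)] * 6
# A hypothetical loaded assignment that favours the last face.
loaded = [Fraction(1, 10)] * 5 + [Fraction(1, 2)]

print(is_invariant(uniform), is_invariant(loaded))
```

Adopting the loaded assignment would require knowing which face is the favoured one, which is precisely the information we have assumed to be absent.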

The "null hypothesis"

The other classical application of this is the so-called "null hypothesis" (or "zero hypothesis"): if we have two propositions A and B but no information which suggests some correlation between them, then we have to assume that they are independent, \(p(A\land B) = p(A) p(B)\). In everyday life, this has a similar problem to the \(\frac{1}{6}\) prescription for the dice: it may fail in reality. But, similarly, it is a prescription only if we really have no information about such a connection between A and B. If there is some information about this, however weak, the logical prescription to accept \(p(A\land B) = p(A) p(B)\) is no longer applicable. An actual experiment which measures such a correlation and gets a nontrivial result would give such information, so that after receiving this information there would no longer be a prescription to accept the null hypothesis.
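One way to make this prescription concrete, offered here as an illustration rather than as the argument of the text, is Jaynes' maximum entropy principle: among all joint assignments compatible with given \(p(A)\) and \(p(B)\), the product assignment is the least informative one. The sketch below assumes the example values \(p(A)=0.3\) and \(p(B)=0.6\) and scans the one free parameter \(t = p(A\land B)\) on a grid.

```python
import math

P_A, P_B = 0.3, 0.6  # illustrative marginals, not taken from the text

def entropy(t):
    """Shannon entropy of the 2x2 joint assignment with p(A and B) = t."""
    cells = (t, P_A - t, P_B - t, 1.0 - P_A - P_B + t)
    return -sum(c * math.log(c) for c in cells if c > 0.0)

# Admissible values of t keep all four cells non-negative.
t_min = max(0.0, P_A + P_B - 1.0)
t_max = min(P_A, P_B)

steps = 3000
grid = [t_min + (t_max - t_min) * k / steps for k in range(steps + 1)]
best_t = max(grid, key=entropy)

print(best_t, P_A * P_B)
```

Any other value of t would encode a correlation between A and B, that is, information which by assumption we do not have.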

The rejection of superdeterminism

Once one has understood and accepted the meaning of this notion of logical independence, the rejection of superdeterminism becomes a triviality. We have no evidence, empirical or otherwise, in favor of any correlation between the \(\lambda\) and the free decisions of the experimenters \(a,b\), and this absence of information is already sufficient to prescribe \(\rho(a,b,\lambda) = \rho(a,b)\rho(\lambda)\). The distribution of \(\lambda\) for given \(a,b\) is then \(\rho(a,b,\lambda)/\rho(a,b) = \rho(\lambda)\), independent of the experimenters' decisions.

So you have to reject either Einstein causality or logic

So, once this is accepted, neither realism nor superdeterminism is available as a loophole. To defend Einstein causality, one has to reject logic, even if only in the form of the logic of plausible reasoning, which can be, at least in principle, questioned.

The author tends to think that those who are ready to reject even logic to preserve a particular metaphysical hypothesis are driven by other, non-scientific, emotions, and are no longer open to scientific arguments. Feel free to object, if you think otherwise.