Now it is precisely in cleaning up intuitive ideas for mathematics that one is likely to throw out the baby with the bathwater. So the next step should be viewed with the utmost suspicion. Bell
Assume we have experiments measuring some value y ∈ Y, and the experimenters are free to influence the outcome by setting some input or control parameters a ∈ A. A statistical or empirical theory for these experiments has to predict the resulting probability distributions ρ(y,a)dy.
This theory fulfills the Bell criterion of realism if it also gives a realistic explanation for these distributions. Such an explanation consists of a probability distribution ρ(λ) dλ on a space Λ of "states of the real object" λ ∈ Λ and a function y(λ,a): Λ×A→Y, which describes the result y of the measurement if the object is in state λ and the experimenters have chosen a. As a consequence, for every test function f(y) on Y we have:
∫ f(y) ρ(y,a) dy = ∫ f(y(λ,a)) ρ(λ) dλ.
Philosophers (and other readers without a master's degree in physics or mathematics) may not like the use of formulas in the very definition of a fundamental philosophical principle like realism. But the mathematics used here is only a quite natural way to formalize philosophical concepts and principles.
The most important and interesting example of a non-realistic theory is quantum theory in its minimal interpretation. For every imaginable experiment, it defines a corresponding probability distribution for the outcome, and up to now all the predictions of quantum theory have been verified by observation.
The point of our criterion of realism is not to reject quantum theory, to exclude it from the domain of scientific research as unscientific. The criterion of realism only distinguishes realistic theories, as a subclass of especially beautiful scientific theories, from other empirical, scientific theories.
Therefore it is important to recognize that there is a more general class of scientific theories, namely statistical or empirical theories, which are also a legitimate and useful part of science. In fact, they cover a very important part of science – the part which can be answered uniquely by observation.
The aim of a statistical or empirical theory is only to reproduce the observable statistical results. It does not aim to give a realistic explanation for these observations. The only aim is to give statistical predictions, which can be compared with observation.
Thus, a statistical theory is completely defined if for every experiment the resulting probability distribution is predicted by the theory. This probability distribution, denoted by ρ(y,a) dy, is defined on the space of the possible measurement results y ∈ Y. It also depends on the parameters a ∈ A, which may be controlled by the experimenters.
As a consequence, for every (continuous bounded) function of the measurement results f: Y → R the statistical theory has to define the expectation value
E(f|a) = ∫ f(y) ρ(y,a) dy.
This last property is already sufficient, and can be used to define the probability distribution ρ(y,a) dy: If the expectation values E(f|a) are fixed for all continuous functions f(y), then the probability distribution ρ(y,a) dy is fixed too.
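To see why the expectation values fix the distribution, consider the simplest case of a finite space Y, where every function on Y counts as continuous. The following Python sketch is only an illustration under that assumption; the names `expectation` and `recover_distribution` and the example numbers are hypothetical. It recovers each probability as the expectation value of an indicator function, using nothing but the map f → E(f).

```python
from fractions import Fraction

# Hypothetical example: a finite outcome space Y with a known distribution.
# The distribution is hidden inside `expectation`; only E(f) is exposed.
_RHO = {-1: Fraction(1, 4), 0: Fraction(1, 4), +1: Fraction(1, 2)}
Y = sorted(_RHO)


def expectation(f):
    """E(f) = sum over y of f(y) * rho(y), the discrete analogue of the integral."""
    return sum(f(y) * p for y, p in _RHO.items())


def recover_distribution(E, outcomes):
    """Recover rho(y0) as the expectation value of the indicator function of y0."""
    return {y0: E(lambda y, y0=y0: 1 if y == y0 else 0) for y0 in outcomes}


recovered = recover_distribution(expectation, Y)
print(recovered)
```

In the continuous case the indicator functions would have to be approximated by continuous functions, but the idea is the same.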
In other words, the statistical theory is completely defined by the probability distributions ρ(y,a) dy. The formula for E(f|a) is introduced for another reason – it makes some other formulas, in particular the proof of Bell's theorem, easier.
Realistic theories have to define more than only the probability distributions ρ(y,a) dy. They are obliged to give some realistic explanation, which has to be defined in terms of a real state of the measured object λ ∈ Λ. Here, λ denotes the actual state of reality in an actual experiment, and the space Λ is the space of all possible states of reality.
We don't know, and don't have to know the real state λ. It is not necessary for a realistic explanation to fix the real state λ uniquely by observational data. The state of reality is, in this sense, hypothetical. But what we have to explain is the observable probability distribution ρ(y,a) dy. To explain it, we have to postulate a probability distribution ρ(λ) dλ on the space Λ of all possible states of reality λ.
The important point is that this real state cannot depend on the experimenters' free choice a of what to measure. This is the general philosophical idea of an observer-independent reality. In the formalization, this idea appears as the requirement that the probability distribution ρ(λ) dλ of the real states (unlike the distribution ρ(y,a) dy of the observable results) is not allowed to depend on a.
The real, complete state λ of reality of the actual measured object, together with the control parameters a chosen by the experimenter, already define the result of the actual experiment y. This is described by some function y(λ,a): Λ × A → Y. (This seems to introduce some element of determinism into our definition of realism but in fact doesn't – using a stochastic function instead does not lead to a nontrivial generalization.)
Given the function y(λ,a): Λ × A → Y and the probability distribution ρ(λ) dλ, we can predict all observable expectation values E(f|a) by the formula:
E(f|a) = ∫ f(y) ρ(y,a) dy = ∫ f(y(λ,a)) ρ(λ) dλ.
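A toy example may help to see how the formula works. The following Python sketch assumes a made-up model that is not taken from any physical theory: Λ is the interval [0, 1) with the uniform distribution, the control parameter a lies in [0, 1], and the result is +1 if λ < a and -1 otherwise. All function names are hypothetical. The sketch compares the statistical prediction with the realistic explanation for one arbitrary test function.

```python
# Toy model (all choices here are illustrative assumptions):
#   Lambda = [0, 1) with the uniform distribution rho(lam) = 1,
#   A      = [0, 1], the control parameter a,
#   Y      = {-1, +1}.

def y(lam, a):
    """Measurement result for real state lam and control parameter a."""
    return 1 if lam < a else -1


def expectation_realistic(f, a, n=1000):
    """Right-hand side: integral of f(y(lam, a)) rho(lam) dlam.

    Approximated by the midpoint rule on n equally spaced states. The sample
    points and their weights do not depend on a.
    """
    return sum(f(y((k + 0.5) / n, a)) for k in range(n)) / n


def expectation_statistical(f, a):
    """Left-hand side: the statistical theory predicts P(+1|a) = a, P(-1|a) = 1 - a."""
    return a * f(1) + (1 - a) * f(-1)


def f(outcome):
    return outcome ** 3 + 2 * outcome  # an arbitrary test function


for a in (0.0, 0.3, 0.5, 1.0):
    print(a, expectation_statistical(f, a), expectation_realistic(f, a))
```

Note that a enters only through the function y(λ,a), never through the distribution of λ. This is exactly the restriction which the criterion imposes.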
Thus, the realistic theory makes the same empirical predictions as the statistical theory which it explains. But the requirement to provide a realistic explanation restricts the class of realistic theories: If we have found a statistical theory, for a realist the job of science is not finished. One still has to find a realistic explanation for the observed results. This is a mathematically nontrivial job, because one has to find a realistic model, with a space Λ, a probability distribution ρ(λ) dλ on it, and a function y(λ,a).
Instead, for the positivist, everything is already fine if we have rules to compute the observable probability distribution ρ(y,a) dy. "Explanation" is only a classical prejudice, not the job of physics.
Realism is a notion which belongs to philosophy. As usual in philosophy, it is defined in a verbal way and understood differently by different philosophers. Why do we propose here a formalization of this notion, by providing a formalized criterion of realism?
The main motivation is that the notion of realism I present here is precisely the notion of realism which I want to defend, because I think that realism, as formalized in this particular way, is true. The other part of the motivation, the true reason why I bother about formalizing it, is that the notion of realism, formalized in this way, is in conflict with some (also formalized versions of) relativistic causality and relativistic symmetry. And I believe, and want to argue, that (these versions of) relativistic symmetry and causality have to be rejected, because it is more reasonable to reject a particular symmetry principle than to reject a part of the scientific method as fundamental as realism.
To realize this program, one property of the notion of realism is required: The notion of realism, combined with an appropriate notion of relativistic causality (or, more generally, of relativistic symmetry), has to be sufficient to prove Bell's inequality.
The consequence is that this criterion of realism is rejected by the majority of modern physicists. This decision is, in my opinion, wrong, and the aim of these pages is to justify this opinion.
Nonetheless, I would like to note here that there remains some freedom of choice in the definition of realism. For our aim, we need a definition of realism which is sufficiently strong to prove, together with Einstein causality, Bell's inequalities. But there are different proofs of Bell's inequalities, as well as different inequalities, starting with sometimes slightly different assumptions.
We have used here a definition which is quite close to Bell's original theorem. But what have been the criteria for choosing this particular definition? On the one hand, we want to avoid the following straightforward argument: "Your definition of realism is much too strong, too naive. It makes assumptions which are unjustified even in some classical common sense situations, where classical common sense explanations exist." But this means only that our criterion has to be sufficiently weak, so that it does not exclude any reasonable common sense explanation as a realistic explanation.
We have to say that this criterion holds for all variants of Bell's theorem which have been presented by Bell himself. But there are lots of presentations of Bell's theorem which start with "definitions of realism" which are unreasonably strong: Namely, they often start with the assumption that all results of possible "spin measurements" have to have objective, predefined values.
But this is not an assumption of Bell's theorem. Instead, it is a consequence of the weaker notion of realism which we use here, taken together with Einstein causality. The weaker notion of realism, taken alone, allows, in particular, the following scenario: There are no predefined values of the measurement of the spin components. When Alice "measures the spin" in direction a, the resulting value is created by some interaction between the particle, the measurement device, and some other random local influences. But then, violating Einstein causality, the information about the result is transferred to B, and after this the "measurement of spin" for the same direction b=a already gives a well-defined result B=-A.
So this scenario is excluded not because our criterion of reality requires well-defined measurement results (an assumption named "naive realism"), but because of Einstein causality, which forbids that the non-predefined result of measurement of A is transferred to B. Only because of this the "measurement results" have to be predefined. This is not an assumption, but the result of the first part – the EPR part – of Bell's proof.
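The scenario can be written down as a small toy model, which shows that it fits into our scheme y(λ,a) with a distribution of λ which does not depend on the settings. The following Python sketch is a simplification under several assumptions of my own: the hidden state consists only of two random local influences, Alice's result ignores the direction, and for b ≠ a (a case which the scenario above does not specify) Bob's result is decided by his own local influence. All names are hypothetical.

```python
from itertools import product

# Hypothetical toy version of the scenario; every name and value is an assumption.
# The hidden state lam = (u_a, u_b) collects the random local influences at the
# two devices. All four states are equally likely, independently of (a, b).
STATES = list(product((-1, 1), repeat=2))
RHO = {lam: 0.25 for lam in STATES}
DIRECTIONS = (0, 1, 2)  # labels for measurement directions


def result_a(lam, a):
    """Alice's result is created locally; it is not a predefined spin value."""
    u_a, _ = lam
    return u_a


def result_b(lam, a, b):
    """Bob's result uses Alice's setting and result: the non-local transfer."""
    _, u_b = lam
    if b == a:
        return -result_a(lam, a)  # perfect anticorrelation for equal directions
    return u_b  # assumption: otherwise Bob's local influence decides


def y(lam, setting):
    """The function y(lam, (a, b)) required by the criterion of realism."""
    a, b = setting
    return result_a(lam, a), result_b(lam, a, b)


# B = -A whenever both sides measure the same direction.
anticorrelated = all(
    y(lam, (d, d))[1] == -y(lam, (d, d))[0] for lam in STATES for d in DIRECTIONS
)
# Expectation of the product A*B for a = b = 0, computed with rho(lam).
correlation = sum(RHO[lam] * y(lam, (0, 0))[0] * y(lam, (0, 0))[1] for lam in STATES)
# Bob's result changes with Alice's setting for some state: Einstein causality fails.
depends_on_a = any(
    result_b(lam, a1, b) != result_b(lam, a2, b)
    for lam in STATES for a1 in DIRECTIONS for a2 in DIRECTIONS for b in DIRECTIONS
)
print(anticorrelated, correlation, depends_on_a)
```

The model is realistic in the sense of our criterion, because RHO does not depend on (a, b). What it violates is only Einstein causality, because result_b depends on Alice's setting a.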
So, a notion of realism as strong as "naive realism" would be inappropriate. Our notion of realism is more general, it allows that the "measurement results" are not observations of some predefined quantity, but are the result of some complex interaction, which depends not only on the state of the "observed entity", but also on the state of the measurement device, and, possibly, on some other random influences.
But maybe there is an even weaker notion of realism? Our aim is not to find the weakest possible notion of realism. It should be weak enough to meet the objection that it is too strong, that a reasonable common sense realistic explanation does not fit into this scheme. But this objection is relevant only if a supposed weaker version is no longer in conflict with relativistic symmetry. If we have a choice between different notions of realism, with all of them in conflict with relativity, this argument does not count. In this case, we can use other criteria, in particular simplicity and beauty of presentation, to choose between them.
That's why we have chosen a formalization with a slightly deterministic touch: It postulates that a realistic explanation has to be based on an explicit function y(λ,a). This makes it clearer that such a representation really defines a non-trivial explanation. It also clarifies the causal connections: a and λ are the causes, and y(λ,a) is the effect.
So one may think that this requires too much – some sort of determinism.
This does not matter very much: The point is that there is a notion of realism based only on assumptions about probability distributions, but which is also sufficient to prove (with Einstein causality) Bell's inequality. And, in fact, the difference is not essential: In classical Kolmogorovian probability theory, random (stochastic) functions are defined in a quite similar "deterministic" way. Compare: A stochastic function y is a function y(ω) on some "space of events" Ω with probability measure ρ(ω)dω. And our realistic explanation requires some function y(λ) on some "space of events" Λ with probability measure ρ(λ)dλ. Of course, the "events" ω are hidden, but the λ are hidden variables as well.
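The comparison can be made explicit by absorbing the hidden "event" ω into the hidden state. The following Python sketch assumes a made-up stochastic theory in which the result +1 appears with probability p(λ,a); the three-point state space, the formula for p and all names are hypothetical. The deterministic version uses the enlarged state (λ,ω), with ω uniformly distributed on [0, 1), and gives the same expectation values.

```python
# Illustrative assumptions: a three-point state space, a made-up response
# probability p(lam, a), and outcomes in {-1, +1}.
LAMBDA = (0, 1, 2)
RHO = {lam: 1 / 3 for lam in LAMBDA}


def p_plus(lam, a):
    """Stochastic theory: probability of the outcome +1 given lam and a."""
    return (lam + a) / 4  # lies in [0, 1] for a in {0, 1, 2}


def y_deterministic(lam, omega, a):
    """Deterministic function on the enlarged state (lam, omega), omega in [0, 1)."""
    return 1 if omega < p_plus(lam, a) else -1


def expectation_stochastic(f, a):
    """Average f over the outcome probabilities, then over rho(lam)."""
    return sum(
        RHO[lam] * (p_plus(lam, a) * f(1) + (1 - p_plus(lam, a)) * f(-1))
        for lam in LAMBDA
    )


def expectation_deterministic(f, a, n=1000):
    """Integrate over omega with the midpoint rule; omega is uniform on [0, 1)."""
    total = 0.0
    for lam in LAMBDA:
        inner = sum(f(y_deterministic(lam, (k + 0.5) / n, a)) for k in range(n)) / n
        total += RHO[lam] * inner
    return total


def f(outcome):
    return 3 * outcome + 1  # an arbitrary test function


for a in (0, 1, 2):
    print(a, expectation_stochastic(f, a), expectation_deterministic(f, a))
```

The distribution of the enlarged state (λ,ω) still does not depend on a, so the enlarged model fulfills the same criterion.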
More generally, there is not really a point in distinguishing "deterministic" and "stochastic" theories, except for purely pragmatic reasons: We can always replace a stochastic theory by a deterministic theory which uses some deterministic random number generator. And deterministic theories often enough appear to be chaotic, so that to describe them effectively one has to use a stochastic theory. The difference between deterministic theories and classical (Kolmogorovian) stochastic theories is quite irrelevant for the conceptual problem we consider here: A stochastic explanation is as good as a deterministic one, and both are incompatible with Einstein causality.
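The replacement by a deterministic random number generator can be sketched in a few lines of Python. The example assumes a made-up stochastic theory which gives the result +1 with probability a; the seed of the generator plays the role of the hidden variable. The function names and the choice of 2000 seeds are hypothetical.

```python
import random


def y_stochastic(a, rng):
    """Stochastic theory: outcome +1 with probability a, otherwise -1."""
    return 1 if rng.random() < a else -1


def y_deterministic(seed, a):
    """Deterministic theory: the seed plays the role of the hidden variable."""
    return y_stochastic(a, random.Random(seed))


a = 0.3
seeds = range(2000)  # a hypothetical ensemble of hidden states
results = [y_deterministic(seed, a) for seed in seeds]
frequency = results.count(1) / len(results)
print(frequency)  # close to a, up to sampling fluctuations
```

For a fixed seed the result is completely determined by a, while the ensemble of seeds reproduces the statistics of the stochastic theory.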
To avoid the argumentation presented here, one would need a different, non-Kolmogorovian notion of randomness. So, the reason why we prefer here the "deterministic" version is the simplicity of this presentation.
The EPR paper is famous for its EPR criterion of reality. Instead, we name our criterion the Bell criterion of realism. The reason for using Bell instead of EPR is clear: We focus our interest on the proof of Bell's inequality and thus want to fix what we need to prove this theorem.
But why realism instead of reality? This is, indeed, an interesting point worth explanation. Reality is what exists. Realism is, instead, a principle. It is a property of theories. There are realistic theories, and there are other types of theories, like mystic or solipsistic theories.
Our criterion may be applied to theories. It allows one to distinguish "realistic" theories (which fulfill this criterion) from other theories. Therefore, "criterion of realism" seems more appropriate.
The common sense notion of realism contains more than the EPR-Bell criterion of reality. This is a consequence of our alternative definition: All things which are part of common sense realism, but are not necessary for the proof of Bell's inequality, have been omitted as unnecessary. One very important omitted part is the requirement of consistency between the different partial explanations of different experiments.
Imagine A observes an object B, C observes A together with B, and D observes A, B, and C. All these observations lead to statistical predictions. Following the definition, all these statistical predictions need explanations. What is missing in our definition is that these explanations have to be compatible with each other. As a consequence of such compatibility conditions, realistic theories usually describe not only some particular objects, but the whole universe including observers.
This consistency requirement is what usually gives realistic theories additional predictive power. Imagine a large set of observations. Imagine that, for each particular observation, we have some explanation. But these explanations may be incompatible with each other. In a consistent realistic theory, this would be forbidden. Therefore, this set of observations would falsify the realistic theory. If, instead, we restrict ourselves to the consideration of the statistical predictions, we have no such contradiction. Thus, we have fewer possibilities to falsify the theory and, therefore, less predictive power in the sense of Popper's criterion.
It does not follow that for all observables of quantum theory there exist some predefined hidden variables. This can be seen in the standard example of a realistic interpretation of quantum theory: the pilot wave theory. In this theory, only the configuration space observables have definite values. While there exists a trajectory in configuration space, and thus also some corresponding velocity, the quantum momentum observation p = -i ∂/∂q does not measure this velocity. Instead, it "measures" some result of a complex interaction between the "measured object" and the "measurement device", a result which, in particular, depends on the initial configuration of the "measurement device".
The criterion of realism has also nothing to do with determinism. Instead, every classical stochastic process is realistic according to our criterion.
Given the probability distribution ρ(λ) dλ and the function y(λ,a), it is possible to construct a probability distribution ρ(y,a) dy. To explain how, we have to define probability distributions. How can this be done in a way that laymen can follow? The dual definition seems useful here: A probability distribution ρ(λ)dλ is defined if we have defined, for all continuous bounded functions f(λ), their expectation value E(f), which is usually written as
E(f) = ∫ f(λ) ρ(λ) dλ.
This map f(.) → E(f) has to fulfill some axioms (f≥0 → E(f)≥0, E(1)=1, E(f+g)=E(f)+E(g), E(cf) = cE(f) for constants c). If it does, the map defines a probability distribution. It seems to be a quite natural definition, essentially because we, anyway, want to be able to compute such integrals.
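For a finite space of states the integral becomes a weighted sum, and the axioms can be checked directly. The following Python sketch assumes a hypothetical three-point space with made-up rational weights and two arbitrary functions f and g; it is meant only as an illustration of the axioms, not as a proof.

```python
from fractions import Fraction

# Hypothetical finite state space with exact rational weights.
RHO = {"s1": Fraction(1, 2), "s2": Fraction(1, 3), "s3": Fraction(1, 6)}


def E(f):
    """Discrete analogue of E(f) = integral of f(lam) rho(lam) dlam."""
    return sum(f(lam) * weight for lam, weight in RHO.items())


# Two arbitrary functions on the state space, given by their value tables.
f_values = {"s1": 2, "s2": 0, "s3": 5}   # f >= 0 everywhere
g_values = {"s1": -1, "s2": 4, "s3": 3}
f = f_values.get
g = g_values.get
c = Fraction(7, 2)  # an arbitrary constant

checks = {
    "positivity": E(f) >= 0,
    "normalization": E(lambda lam: 1) == 1,
    "additivity": E(lambda lam: f(lam) + g(lam)) == E(f) + E(g),
    "homogeneity": E(lambda lam: c * f(lam)) == c * E(f),
}
print(checks)
```

All four checks hold because the weights are non-negative and sum to 1, which is all that the weighted sum needs.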
And with this definition, it is quite simple to construct the image ρ(y) dy of ρ(λ) dλ for a given function y(λ). Indeed, all we need is the formula
∫ f(y) ρ(y) dy = ∫ f(y(λ)) ρ(λ) dλ.
If ρ(λ) dλ is a probability distribution, then the term on the right hand side is defined for every function f(.) on Y, and has all the nice properties we need to define a probability distribution on Y. This can be easily checked: all that we need to prove the properties for the left hand side follows from the corresponding properties of the probability distribution ρ(λ) dλ.
In comparison with other ways to define probability distributions, this way seems to be quite simple and nice. Moreover, we would have to introduce the expression for expectation values anyway, because we use it in the proof of Bell's inequality.
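In the finite case the construction of the image can be carried out explicitly. The following Python sketch assumes a hypothetical example with four equally likely states and the parity of the state as result function y(λ); the names `image_distribution` and `expectation` are my own. It builds the image distribution on Y and checks the defining formula for one arbitrary test function.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical finite example: four equally likely states.
RHO_LAMBDA = {lam: Fraction(1, 4) for lam in (0, 1, 2, 3)}


def y(lam):
    """Result function y(lam); here simply the parity of the state."""
    return lam % 2


def image_distribution(rho, func):
    """Image of rho under func: the weight of y0 is the total weight of all
    states lam with func(lam) == y0."""
    image = defaultdict(Fraction)
    for lam, weight in rho.items():
        image[func(lam)] += weight
    return dict(image)


def expectation(rho, f):
    """Discrete analogue of the integral of f against rho."""
    return sum(f(x) * weight for x, weight in rho.items())


RHO_Y = image_distribution(RHO_LAMBDA, y)


def f(outcome):
    return 10 * outcome + 1  # an arbitrary test function on Y


left = expectation(RHO_Y, f)                            # sum of f(y) rho(y)
right = expectation(RHO_LAMBDA, lambda lam: f(y(lam)))  # sum of f(y(lam)) rho(lam)
print(RHO_Y, left, right)
```

Both sides give the same value, and the weights of the image again sum to 1, as required for a probability distribution on Y.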