
Common misrepresentations of inflation

If your knowledge about inflation is based on popular science, you have probably been misled in a quite serious way. Some of the misinformation is so bad that at least some mainstream scientists have corrected the worst nonsense. In particular, Sean Carroll has done a good job of correcting one such claim, namely that "during inflation, the universe 'expanded faster than the speed of light.' It’s extraordinarily common, if utterly and hopelessly incorrect". I agree completely. To quote his main points (see here for details):


  1. The expansion of the universe doesn’t have a "speed." ... The expansion of the universe is quantified by the Hubble constant, which is typically quoted in crazy units of kilometers per second per megaparsec. That’s (distance divided by time) divided by distance, or simply 1/time. Speed, meanwhile, is measured in distance/time. Not the same units! Comparing the two concepts is crazy. ...
  2. There is no well-defined notion of “the velocity of distant objects” in general relativity. ...
  3. There is nothing special about the expansion rate during inflation. ...
What’s special about inflation is that the universe is accelerating. ... Many “misconceptions” in physics stem from an honest attempt to explain technical concepts in natural language, and I try to be very forgiving about those. This one, I believe, isn’t like that; it’s just wrongity-wrong wrong. ...

Similar points have been made by Davis and Lineweaver in Expanding Confusion: common misconceptions of cosmological horizons and the superluminal expansion of the universe. In particular, they criticize:



Another common error in popular presentations of inflation

Unfortunately, the "faster than light expansion" of the universe during inflation is not the only thing that is regularly distorted in popular presentations of inflation theory. Another very common error, and not only in popular presentations, is to conflate cosmological inflation, which is a period of accelerated expansion \(\mathbf{\ddot{a}(\tau)>0}\), with the everyday meaning of "inflation" as something with a high expansion rate. See, for example, the following pictures:
By Yinweichen - Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=31825049

Note that in all these pictures the region named "inflation" contains not only a part with an increasing expansion rate (near the start of "inflation") but also a part with a rapidly decreasing expansion rate (near the end of "inflation"). All this, of course, agrees nicely with the everyday meaning of "inflation" as a period with a high "inflation rate".

Unfortunately, this is not what defines cosmological inflation. It is defined not by a high expansion rate (a large first derivative, \(\dot{a}(\tau) \gg 1\)) but by an increasing expansion rate (a positive second derivative, \(\ddot{a}(\tau)>0\)). If we use this correct definition of inflation, then only the first half of what is named "inflation" in these pictures fits.

This leads to a second piece of misinformation, which is explicit in the two left pictures but implicit in the two pictures on the right too: a wrong idea of what happens without inflation. The pictures suggest what common sense suggests: no high "inflation rates" without "inflation". So we would have much lower expansion rates without inflation, and the curve without "inflation" would have to be higher than with inflation.

But to find the curve without inflation, we have to continue the curve that we have after inflation, with its decreasing expansion rate, \(\ddot{a}(\tau)<0\), backwards in time. And we have to start this continuation near the place where inflation, the region with increasing expansion rate, \(\ddot{a}(\tau)>0\), ends.

In the picture, the curve after inflation is shown in green. I have used \(a(\tau)=\sqrt{\tau}\) here. The continuation of the same \(a(\tau)=\sqrt{\tau}\) to earlier times would be the curve without inflation, in red. Both are curved downwards, \(\ddot{a}(\tau)<0\): the expansion rate decreases with time. Inflation is shown by the blue line. It is curved upwards, \(\ddot{a}(\tau)>0\): the expansion rate increases.

So, we see that the continuation of the curve without inflation lies below the curve for inflation, not above (like the wrong, dotted continuation).

This is completely counterintuitive if one has the usual ideas about the meaning of "inflation" in mind, given that the expansion rates "without inflation" are higher than "with inflation". But this is a necessity: if the expansion rate increases along the inflation curve, decreases along the curve without inflation, and the two rates are equal where the curves meet, then at every earlier time the inflation curve has to have a lower expansion rate than the curve without inflation.
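The argument can be checked with numbers. The following is only a minimal sketch: the curve without inflation is the \(a(\tau)=\sqrt{\tau}\) used in the picture, but the accelerating curve is my own assumption (an exponential matched in value and slope at a hypothetical joining time `TAU_JOIN`), not necessarily the blue line of the picture.

```python
import math

TAU_JOIN = 1.0  # hypothetical moment where inflation ends (arbitrary units)


def a_no_inflation(tau):
    """Radiation-dominated curve a = sqrt(tau), continued back before TAU_JOIN."""
    return math.sqrt(tau)


def rate_no_inflation(tau):
    """Expansion rate da/dtau of the sqrt(tau) curve."""
    return 0.5 / math.sqrt(tau)


# Assumed accelerating curve: an exponential with the same value and the
# same slope as sqrt(tau) at TAU_JOIN.
H = rate_no_inflation(TAU_JOIN) / a_no_inflation(TAU_JOIN)


def a_inflation(tau):
    """Accelerating curve (second derivative positive everywhere)."""
    return a_no_inflation(TAU_JOIN) * math.exp(H * (tau - TAU_JOIN))


def rate_inflation(tau):
    """Expansion rate da/dtau of the accelerating curve."""
    return H * a_inflation(tau)


if __name__ == "__main__":
    for tau in (0.1, 0.25, 0.5, 0.9):
        print(
            f"tau={tau:4.2f}  "
            f"a_infl={a_inflation(tau):.3f}  a_none={a_no_inflation(tau):.3f}  "
            f"rate_infl={rate_inflation(tau):.3f}  rate_none={rate_no_inflation(tau):.3f}"
        )
```

At every time before the joining point, the accelerating curve lies above the continued \(\sqrt{\tau}\) curve and has the lower expansion rate.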

We have to conclude: using "inflation" as a name for what is named inflation is extremely misleading. "Accelerating expansion" would have been much better. Increasing expansion rates, \(\ddot{a}(\tau)>0\), are something different from large values of the expansion rate, \(\dot{a}(\tau)\gg 1\).

But is this misunderstanding important? It is, if you want to understand why we need inflation at all to explain some observed facts. In particular, with this misunderstanding of inflation one cannot correctly understand any of the problems inflation solves. This is because of the following simple fact:

A large expansion rate does not solve any problem,
because without inflation we have arbitrarily large expansion rates anyway

Popular explanations of inflation like to mention some numbers about how many times the size of the universe increases in an extremely short time. The first picture above is an example: it gives the factor \(10^{30}\), an unimaginably large number, as the expansion factor during the inflation epoch, which, at \(10^{-35}s\), is unimaginably short. This seems to be an extremely impressive number. But so what? Let's see what the standard Friedmann solution for a radiation-dominated universe, with \(a(\tau)=\sqrt{\tau}\), gives. Let's try an even shorter time, \(10^{-40}s\), and an even larger factor, \(10^{40}\). Can we find such a large expansion in \(a(\tau)=\sqrt{\tau}\)? Let's try. At the time \(\tau_1 = 10^{-40}s\) we have some number \(a_1=a(\tau_1) =\sqrt{\tau_1}\). Let's now find the time \(\tau_0\) at which \(a_1=10^{40}a_0\): \[ a_0 = 10^{-40} a_1 = 10^{-40} \sqrt{\tau_1} = \sqrt{10^{-80}\tau_1} = a(10^{-80}\tau_1)=a(\tau_0).\] In other words, \(\tau_0 = 10^{-80}\tau_1 = 10^{-120}s > 0\), so that the time interval is \(\Delta \tau = \tau_1-\tau_0<10^{-40}s\), as we wanted. Thus, in the period between \(\tau_0=10^{-120}s\) and \(\tau_1 = 10^{-40}s\) after the big bang, a time interval shorter than \(10^{-40}s\), the universe expands by a factor \(10^{40}\) in the standard Friedmann universe with \(a(\tau)=\sqrt{\tau}\). This is, by construction, \(10^{10}\) times the factor reached by inflation. And, don't forget, inflation needs a \(10^5\) times longer period of time for this miserable result.

And what works for the radiation-dominated universe with \(a(\tau)\sim \tau^{1/2}\) works just as well for the matter-dominated one with \(a(\tau)\sim \tau^{2/3}\), and for arbitrarily high expansion factors during arbitrarily short time intervals.

So, a problem which one could solve by introducing some very large expansion factor during some very short time would not be a problem at all in a universe without inflation.

To support this claim, we will show that the most important problem solved by inflation, the horizon problem, would not be solved but even made worse if those pictures accurately described inflation.

Understanding the horizon problem

This requires some introduction to what the horizon problem is about. Wikipedia gives the following explanation of the horizon problem:

When we look at the CMB it comes from 46 billion comoving light years away. However, when the light was emitted the universe was much younger (300,000 years old). In that time light would have only reached as far as the smaller circles. The two points indicated on the diagram would not have been able to contact each other because their spheres of causality do not overlap.

So far, fine. But what defines the size of the region which can be causally connected? The first idea would be simple: since the universe was 300,000 years old at that time, light could have travelled only 300,000 light-years. So the biggest causally connected region would be a circle of radius 300,000 light-years. Right?

Wrong. Don't forget, the universe is expanding, even during the first 300,000 years. If, say, during the first 100,000 years it was much smaller, say only 1/10 of its later size, then the 100,000 light-years that light travelled during this time would later, 300,000 years after the big bang, have been stretched by a factor of 10. This alone would already give 1,000,000 light-years. So this already shows that the causally connected region was greater than the circle with radius 300,000 light-years, in distances measured 300,000 years after the big bang.

So, to find the size of the largest region that can be causally connected, we have to compute the distance light can travel in an expanding universe. Fortunately, this is not that difficult. We have the FLRW metric \[ds^2 = d\tau^2 - a^2(\tau)(dx^2+dy^2+dz^2). \] For light rays, \(ds^2=0\), so that \(dl = \sqrt{dx^2+dy^2+dz^2} = a^{-1} d\tau\), or, in other words, the coordinate speed of light in the comoving coordinates is simply \(c(\tau)=a^{-1}(\tau)\). So the distance which light can travel is simply \[ l = \int_{0}^{\tau_1} c(\tau) d\tau = \int_{0}^{\tau_1} a^{-1}(\tau) d\tau.\] This is an integral which one can easily compute for the Friedmann universe, both for the radiation-dominated one with \(a(\tau) = \tau^{1/2}\) and for the matter-dominated one with \(a(\tau) = \tau^{2/3}\), in fact for every \(a(\tau) = \tau^{\alpha}\) with \(\alpha<1\): \[ l = \int_{0}^{\tau_1} \tau^{-\alpha} d\tau = \frac{1}{1-\alpha}\int_{0}^{\tau_1} d\tau^{1-\alpha}= \frac{\tau_1^{1-\alpha}}{1-\alpha}.\] The naive formula, which ignores that the coordinate speed of light changes, would have given, instead, \[l = \int_{0}^{\tau_1} a^{-1}(\tau_1)d\tau = \int_{0}^{\tau_1} \tau_1^{-\alpha}d\tau = \tau_1^{1-\alpha}.\] Thus, the correct length a light ray can travel during the first \(\tau_1\) years gets an additional factor \(1 / (1-\alpha)\), that is, 2 or 3 depending on the matter model. That means the radius of the maximal causally connected sphere becomes greater by a factor 2 or 3. So it will be, instead of 300,000 light-years, something below 900,000 light-years.
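For those who prefer to see numbers, the closed form can be compared with a direct numerical integration. This is a rough sketch under my own assumptions: units with \(c=1\), \(\tau\) in years, a pure power law \(a(\tau)=\tau^{\alpha}\), and a small lower cutoff in place of \(\tau=0\) (the integrand is singular there, so the integration is done in \(x=\ln\tau\)). All function names are hypothetical.

```python
import math


def horizon_exact(alpha, tau_1):
    """Closed form of the integral of tau**(-alpha) from 0 to tau_1 (alpha < 1)."""
    return tau_1 ** (1.0 - alpha) / (1.0 - alpha)


def horizon_naive(alpha, tau_1):
    """Naive value: the coordinate speed of light frozen at its value at tau_1."""
    return tau_1 ** (1.0 - alpha)


def horizon_numeric(alpha, tau_1, steps=200_000, cutoff=1e-30):
    """Midpoint rule in x = ln(tau), starting at tau = cutoff * tau_1, not at 0."""
    x_lo = math.log(cutoff * tau_1)
    x_hi = math.log(tau_1)
    h = (x_hi - x_lo) / steps
    total = 0.0
    for i in range(steps):
        x = x_lo + (i + 0.5) * h
        # tau**(-alpha) * dtau/dx = exp((1 - alpha) * x)
        total += math.exp((1.0 - alpha) * x)
    return total * h


if __name__ == "__main__":
    tau_1 = 300_000.0  # years
    for name, alpha in (("radiation", 0.5), ("matter", 2.0 / 3.0)):
        exact = horizon_exact(alpha, tau_1)
        numeric = horizon_numeric(alpha, tau_1)
        ratio = exact / horizon_naive(alpha, tau_1)
        print(f"{name}: exact={exact:.6g}  numeric={numeric:.6g}  exact/naive={ratio:.3f}")
```

The ratio of the exact to the naive value is \(1/(1-\alpha)\), independent of \(\tau_1\).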

The empirical evidence for inflation

Full-sky image derived from nine years' WMAP data

Despite this correction, the size of the region is nonetheless quite small, only \(1.7^\circ\) of angular size on the sky.

And this appears to be far too small to be compatible with what we see in the background radiation.

Why? First of all, the background radiation has a very homogeneous temperature. There would be a simple explanation for this: it is what one expects for a thermal equilibrium. But a thermal equilibrium would require a lot of causal contact between the different parts of the universe, across all \(360^\circ\), not only within small \(1.7^\circ\) circles.

Ok, this problem would have another solution. We would have, anyway, to make some assumptions about the initial state of the universe. What would be more natural than some completely homogeneous initial condition? Occam's razor and so on.

Fine, this would explain the homogeneity. But only as far as it is an ideal homogeneity. Instead, the background radiation we see has some small inhomogeneities. The picture of these inhomogeneities is quite well known. If we had ideally homogeneous initial conditions, such inhomogeneities could have appeared after the big bang, by random fluctuations. But each such fluctuation would appear in a single event. Thus, their maximal size would be that of the maximal causally connected region.

Unfortunately, the inhomogeneities we observe are much, much larger than the allowed \(1.7^\circ\) circles, as one can see in the picture. (Caveat: the picture alone is not sufficient, because one has to distinguish inhomogeneities already present at that time from inhomogeneities caused by various physical effects in the foreground. Such foreground distortions have to be identified and subtracted. I hope that this has been done correctly, but I don't know enough about this question to make definite statements.)

The conclusion is quite simple and clear: the maximal causally connected region has to be much larger, and has to cover, essentially, the whole observable universe.

How can one increase the size of the maximal causally connected region?

Let's now see how we can increase the size of the maximal causally connected region. For this purpose, we need a different expansion function \(a(\tau)\). For this modified function, we would need some moment \(\tau_0\) after the singularity so that for the time \(\tau_1\) when the CMB radiation was emitted we obtain \[l = \int_{\tau_0}^{\tau_1} a^{-1}(\tau) d\tau \gg a^{-1}(\tau_1)(\tau_1-\tau_0).\] Let's try, again, \(a(\tau) = \tau^\alpha\), but now without the restriction \(\alpha<1\). For \(\alpha>1\) we obtain \[ l = \int_{\tau_0}^{\tau_1} \tau^{-\alpha}d\tau = \frac{\tau_1^{1-\alpha}-\tau_0^{1-\alpha}}{1-\alpha} = \frac{\tau_0^{1-\alpha}-\tau_1^{1-\alpha}}{\alpha-1}\to \infty \text{ for } \tau_0 \to 0\] so that the whole universe could have been causally connected. In this case, we have \(\ddot{a}(\tau) > 0\). Another example would be \(a(\tau)=\exp \tau\). This gives \[ l = \int_{\tau_0}^{\tau_1} e^{-\tau}d\tau = e^{-\tau_0} - e^{-\tau_1} \to \infty \text{ for } \tau_0 \to -\infty.\] Again, we have \(\ddot{a}(\tau) > 0\) and an infinitely large region which can be causally connected. What about the border case with \(\ddot{a}(\tau) = 0\)? This gives \(a(\tau)=\tau\), thus: \[ l = \int_{\tau_0}^{\tau_1} \frac{d\tau}{\tau} = \ln(\tau_1)-\ln(\tau_0) \to \infty \text{ for } \tau_0 \to 0.\] That we obtain an infinite causally connected region even in this limiting case \(a(\tau)=\tau\) gives us a simple way to prove the following
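The different behaviour of the three power-law cases can be made visible with a few numbers. A minimal sketch, using only the closed forms derived above; the function name `comoving_distance` and the choice \(\tau_1=1\) are my own.

```python
import math


def comoving_distance(alpha, tau_0, tau_1):
    """Integral of tau**(-alpha) from tau_0 to tau_1 for a(tau) = tau**alpha."""
    if alpha == 1.0:
        return math.log(tau_1 / tau_0)
    return (tau_1 ** (1.0 - alpha) - tau_0 ** (1.0 - alpha)) / (1.0 - alpha)


if __name__ == "__main__":
    tau_1 = 1.0
    starts = (1e-2, 1e-4, 1e-8)
    for alpha in (0.5, 1.0, 2.0):
        values = [comoving_distance(alpha, tau_0, tau_1) for tau_0 in starts]
        print(f"alpha={alpha}: " + ", ".join(f"{v:.6g}" for v in values))
```

For \(\alpha=1/2\) the values stay below the finite limit 2 however early we start, while for \(\alpha=1\) and \(\alpha=2\) they keep growing without bound as \(\tau_0\to 0\).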

Theorem: If \(\mathbf{\ddot{a}(\tau) > 0}\) all the time before some moment \(\mathbf{\tau_1}\), then the causally connected region is infinite. In other words, inflation solves the horizon problem.

How does the proof of this theorem work? We compare the distance which can be travelled for the inflationary \(a(\tau)\) with that for a constant expansion rate, \(a(\tau)=\tau\). We assume that at the end of inflation, at \(\tau_1\), the expansion rate is the same in both cases, \(\dot{a}(\tau_1)=1\). Then at all earlier times \(\dot{a}(\tau)< 1\), because we have inflation and thus the expansion rate increases. This gives: \[ l = \int_{\tau_0}^{\tau_1} a^{-1}(\tau)d\tau = \int_{a_0}^{a_1} a^{-1} \frac{d\tau}{da} da = \int_{a_0}^{a_1} \frac{1}{a \dot{a}} da > \int_{a_0}^{a_1} \frac{1}{a} da = \ln a_1-\ln a_0 \to \infty \text{ for } a_0 \to 0.\]
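The inequality used in the proof can be tested on a concrete example. As an illustration only, I assume \(a(\tau)=\tau^2/2\) with \(\tau_1=1\), which has \(\ddot{a}=1>0\) and \(\dot{a}(\tau_1)=1\), so it satisfies the conditions of the proof. For this choice \(l=\int_{\tau_0}^{\tau_1} 2\tau^{-2}d\tau = 2(1/\tau_0-1/\tau_1)\).

```python
import math

TAU_1 = 1.0  # end of inflation in this toy example


def a(tau):
    """Assumed accelerating scale factor with da/dtau = 1 at TAU_1."""
    return 0.5 * tau ** 2


def distance(tau_0, tau_1=TAU_1):
    """Integral of 1/a from tau_0 to tau_1 for a = tau**2 / 2."""
    return 2.0 * (1.0 / tau_0 - 1.0 / tau_1)


def log_bound(tau_0, tau_1=TAU_1):
    """Lower bound ln(a_1) - ln(a_0) from the proof."""
    return math.log(a(tau_1) / a(tau_0))


if __name__ == "__main__":
    for tau_0 in (0.5, 1e-1, 1e-3, 1e-6):
        print(f"tau_0={tau_0:g}: l={distance(tau_0):.6g}  bound={log_bound(tau_0):.6g}")
```

In each row the distance exceeds the logarithmic bound, and both grow without limit as \(\tau_0\to 0\).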

Why what is suggested in the pictures cannot solve the horizon problem

The technique used in this proof can be used to prove another theorem as well. Assume, as the naive interpretation of "inflation" would suggest, that the expansion rate at the same value of \(a\) without inflation, as predicted by the standard model (SM) alone, would be lower than with inflation, \(\dot{a}_{SM} < \dot{a}_{infl}\). Then the same formula would lead us to \[ l_{infl} = \int_{\tau_0}^{\tau_1} a^{-1}_{infl}(\tau)d\tau = \int_{a_0}^{a_1} \frac{1}{a \dot{a}_{infl}} da < \int_{a_0}^{a_1} \frac{1}{a \dot{a}_{SM}} da = l_{SM},\] which means that the maximal causally connected region would be even smaller than without inflation:

The problem which is in reality solved by inflation would be made even worse if the pictures described reality.

The cause of this error is, essentially, the misleading choice to name this thing "inflation". If, say, \(a(\tau)\) were the price of bread in dollars, you would start to talk about inflation if the rate of increase of the price per year were high. But this would mean that \(\dot{a}(\tau)\) is high. What is named "inflation" is, instead, \(\ddot{a}(\tau)>0\), that is, an increasing inflation rate. But if last year the rate of price increase was 0.1%, and this year 0.2%, would one call this inflation? Or if, on the other hand, the rate was 900% last year, but this year only 800%? This would be a horrible inflation anyway, wouldn't it? But if "inflation" were defined by \(\ddot{a}(\tau)>0\), the former case, with 0.2%, would be inflation, and the latter, with 800%, would not.