C.3. Gravitational light bending and the Shapiro delay

\label{sec-gr-light}

We discuss gravitational lensing by stars, compact objects, galaxies, and clusters of galaxies in Chapter 15. The basic ingredients from the general theory of relativity that we need to study gravitational lensing are the deflection and the time delay experienced by light traveling through a gravitational field. Because the gravitational fields involved are always weak, we can use the Newtonian limit from the previous section to derive these.

In the limit of a weak gravitational field, the metric is that of Equation (C.73), but we’ll use the more general form of Equation (C.74) to clearly distinguish the roles of the Newtonian and curvature potentials. As discussed at the start of Section C.2, we should technically use the metric from Equation C.75, but this is not necessary for our purposes, because we only consider lensing within or by galaxies; only cosmic-shear calculations require the use of Equation C.75. Trajectories of light are again solutions of the geodesic Equation (C.34), but unlike in the previous section, we now have to solve this for relativistic matter. Light travels on null geodesics that have \(\Delta s = 0\). Because of this, we cannot use the proper time to parameterize the light’s path, but we can use a different parameter \(\lambda\) such that the light’s trajectory is \(x^\mu(\lambda)\). The null-geodesic condition is then \begin{equation}\label{eq-gr-null-geodesic} g_{\mu\nu}{\mathrm{d} x^\mu \over \mathrm{d} \lambda}{\mathrm{d} x^\nu \over \mathrm{d} \lambda} = 0\,. \end{equation} For the perturbed metric \(g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}\), this needs to hold to zero-th and first order. We decompose the trajectory as \(x^\mu(\lambda) = x_0^\mu(\lambda)+x_1^\mu(\lambda)\), where \(|x_1^\mu| \ll |x_0^\mu|\). Writing \(\mathrm{d}x_0^\mu / \mathrm{d}\lambda \equiv k^\mu\) and \(\mathrm{d}x_1^\mu / \mathrm{d}\lambda \equiv l^\mu\), the null-geodesic condition to zero-th order implies that \begin{equation} (k^0)^2 = (\vec{k})^2 = k^2\,, \end{equation} where \(k\) is the length of \(\vec{k}\), which is the spatial part of \(k^\mu= (k^0,\vec{k})\). The zero-th order geodesic equation is simply \begin{equation} {\mathrm{d} k^\mu \over \mathrm{d} \lambda} = 0\,, \end{equation} and \(k^\mu\) is therefore constant; this is the straight trajectory of light in flat space.

The first-order null-geodesic equation (C.82) is \begin{equation} \eta_{\mu\nu} k^\mu l^\nu + \eta_{\mu\nu} k^\nu l^\mu + h_{\mu\nu} k^\mu k^\nu = 0\,, \end{equation} or substituting in the metric from Equation (C.74) \begin{equation}\label{eq-gr-veclk} \vec{l}\cdot \vec{k} - k\, l^0 = {k^2 \over c^2}\left(\Phi+\Psi\right)\,. \end{equation}
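
The substitution in the last step can be written out explicitly. This is a sketch that assumes Equation (C.74) has the standard weak-field form with perturbation components \(h_{00} = -2\Phi/c^2\), \(h_{0i} = 0\), and \(h_{ij} = -2\Psi\,\delta_{ij}/c^2\), together with the signature \(\eta_{\mu\nu} = \mathrm{diag}(-1,1,1,1)\) and the future-directed root \(k^0 = k\); none of these are restated in this section, so they should be checked against Section C.2. Under these assumptions, the symmetry of \(\eta_{\mu\nu}\) gives \begin{align*} \eta_{\mu\nu} k^\mu l^\nu + \eta_{\mu\nu} k^\nu l^\mu & = 2\left(\vec{l}\cdot\vec{k} - k\,l^0\right)\,,\\ h_{\mu\nu} k^\mu k^\nu & = -{2\Phi \over c^2}\,(k^0)^2 - {2\Psi \over c^2}\,(\vec{k})^2 = -{2k^2 \over c^2}\left(\Phi+\Psi\right)\,. \end{align*} Setting the sum of these two lines to zero and dividing by two reproduces Equation (C.86).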

To work out the geodesic equation to first order, we compute the Christoffel connection using Equation (C.54) for the metric in Equation (C.74). We can still drop time derivatives of the metric, because they are much smaller than spatial derivatives. However, we need more than just \(\Gamma^{\mu}_{00}\) in this case and now have, to first order, that \begin{align}\label{eq-gr-christoffel-metric-weak-newton-full} c^2\Gamma^{0}_{00} & = c^2\Gamma^{0}_{ij} = c^2\Gamma^{i}_{0j} = c^2\Gamma^{i}_{j0} = 0\,,\\ c^2\Gamma^{i}_{00} & = c^2\Gamma^{0}_{0i} = c^2\Gamma^{0}_{i 0} = \partial_i \Phi\,,\\ c^2\Gamma^{i}_{jk} & = \delta_{jk}\partial_i \Psi - \delta_{ik}\partial_j \Psi - \delta_{ij}\partial_k \Psi\,. \end{align} Because the Christoffel connection is a first-order quantity, the first-order equations of motion for \(l^\mu\) are therefore \begin{align}\label{eq-gr-lensing-trajectory-basic-1} c^2{\mathrm{d} l^0 \over \mathrm{d} \lambda} & = -2 k\,\sum_{i}{(\partial_i \Phi)k^i}\,,\\ c^2{\mathrm{d} l^i \over \mathrm{d} \lambda} & = -k^2(\partial_i [\Phi + \Psi]) +2k^i \sum_{j} (\partial_j \Psi) k^j\,.\label{eq-gr-lensing-trajectory-basic-2} \end{align} Because \(\mathrm{d}\Phi/\mathrm{d}\lambda = \sum_i (\partial_i \Phi) k^i\) along the unperturbed trajectory (again dropping time derivatives), the first of these equations has the solution \begin{equation}\label{eq-gr-lensing-trajectory-l0} l^0 = -{2k \over c^2} \Phi\,, \end{equation} through direct integration and using the boundary condition that \(l^0 = 0\) for \(\Phi = 0\). The component of the perturbation \(\vec{l}\) along the zero-th order direction of light \(\vec{k}\) then follows from plugging this into Equation (C.86): \begin{equation}\label{eq-gr-photon-trajectory-ldotk} \vec{l}\cdot \vec{k} = {k^2 \over c^2}\left(\Psi - \Phi\right)\,. \end{equation} This is zero when the curvature and Newtonian potentials are equal, as is the case in GR, but we’ll work out the general case here. The observable deflection angle is set by the component of \(\vec{l}\) that is perpendicular to \(\vec{k}\): \(\vec{l}_\perp = \vec{l} - k^{-2}(\vec{l}\cdot \vec{k})\vec{k}\). 
To obtain \(\mathrm{d} \vec{l}_\perp / \mathrm{d} \lambda\), we can write Equation (C.91) as \begin{equation} c^2{\mathrm{d} l^i \over \mathrm{d} \lambda} = -k^2(\partial_i [\Phi + \Psi]) +k^i \sum_{j} (\partial_j [\Phi + \Psi]) k^j +k^i \sum_{j} (\partial_j [\Psi - \Phi]) k^j \,. \end{equation} The last term on the right-hand side is parallel to \((\mathrm{d}\vec{l}/\mathrm{d}\lambda\cdot \vec{k})\vec{k}\) and thus does not enter into \(\mathrm{d} \vec{l}_\perp / \mathrm{d} \lambda\), while the first two terms are proportional to the projection of the gradient \(\nabla (\Phi + \Psi)\) onto the plane perpendicular to \(\vec{k}\) and these two terms thus give \(\mathrm{d} \vec{l}_\perp / \mathrm{d} \lambda\) as \begin{equation} c^2{\mathrm{d} \vec{l}_\perp \over \mathrm{d} \lambda} = -k^2\,\nabla_\perp \left(\Phi + \Psi\right)\,, \end{equation} where \(\nabla_\perp f= \nabla f - k^{-2}\,(\vec{k}\cdot \nabla f)\,\vec{k}\).

The observed deflection angle \(\hat{\boldsymbol{\alpha}}\) is a two-dimensional vector given by \begin{equation} \hat{\boldsymbol{\alpha}} = -{\Delta \vec{l}_\perp \over k}\,, \end{equation} where the minus sign comes from the fact that we look backwards along the light path. We can work out this deflection angle as \begin{align} \hat{\boldsymbol{\alpha}} & = -{1 \over k}\int \mathrm{d} \lambda\, {\mathrm{d} \vec{l}_\perp \over \mathrm{d} \lambda} = k\,\int \mathrm{d} \lambda\, \nabla_\perp \left({\Phi + \Psi \over c^2}\right) = \int \mathrm{d} s \,\nabla_\perp \left({\Phi + \Psi \over c^2}\right)\,, \end{align} where in the last step we integrate over the unperturbed path \(\mathrm{d}s = k\mathrm{d}\lambda\), because the deflection angle is small. Because \(\Phi = \Psi\) in the general theory of relativity, the GR-specific result is \begin{align}\label{eq-gr-light-bend-alpha} \hat{\boldsymbol{\alpha}} & = 2\int \mathrm{d} s\, \nabla_\perp \left({\Phi \over c^2}\right)\,. \end{align} Thus, we see that the reason that Einstein’s prediction for the gravitational bending of light is twice the Newtonian prediction is that light travels according to two potentials, the Newtonian and the curvature potential, which are equal. The motion of non-relativistic matter, as we saw in the previous section, is solely determined by the Newtonian potential.
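
As a worked example of the GR expression for \(\hat{\boldsymbol{\alpha}}\) above, consider a point-mass lens. This is a sketch under assumptions that are not part of the derivation: a potential \(\Phi = -GM/r\) for a mass \(M\), and an unperturbed ray with impact-parameter vector \(\vec{b}\) (pointing from the mass to the point of closest approach, with \(b = |\vec{b}|\)), with \(s\) measured along the ray from closest approach so that \(r^2 = b^2 + s^2\). Then \(\nabla_\perp \Phi = GM\,\vec{b}/(b^2+s^2)^{3/2}\) and \begin{equation*} \hat{\boldsymbol{\alpha}} = {2GM \over c^2}\,\vec{b}\int_{-\infty}^{\infty} {\mathrm{d} s \over (b^2+s^2)^{3/2}} = {4GM \over c^2\,b}\,{\vec{b} \over b}\,. \end{equation*} For light grazing the Sun, taking the illustrative values \(GM_\odot \approx 1.327\times10^{20}\,\mathrm{m^3\,s^{-2}}\) and \(b = R_\odot \approx 6.96\times10^{8}\,\mathrm{m}\), this gives \(|\hat{\boldsymbol{\alpha}}| \approx 8.5\times10^{-6}\,\mathrm{rad} \approx 1.75\,\mathrm{arcsec}\). Keeping only \(\Phi\) instead of \(\Phi + \Psi\) in the integrand halves this to about \(0.87\,\mathrm{arcsec}\), which is the Newtonian value.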

Another consequence of the GR equations of motion for light is that light moving in a gravitational field takes longer to reach us than light moving in a flat background. There are two contributions to this: (i) the curved trajectory is geometrically longer and therefore takes longer to traverse, a delay that we’ll call \(\Delta t_\mathrm{geom}\); and (ii) time appears to slow down along the light’s trajectory in the presence of a gravitational field, which we’ll call the gravitational time delay \(\Delta t_\mathrm{Shapiro}\). Both are small compared to the time it takes light to traverse the unperturbed trajectory on galactic or cosmological scales. Thus, we can compute \(\Delta t_\mathrm{geom}\) as the difference in the (spatial) lengths of the curved and straight trajectories divided by the speed of light in vacuum \(c\), \(\Delta t_\mathrm{Shapiro}\) from the slowed-down unperturbed trajectory, and obtain the total time delay as \(\Delta t = \Delta t_\mathrm{geom} + \Delta t_\mathrm{Shapiro}\). The geometric time delay then does not require any further ingredients from GR and is discussed further in Chapter 15.

In the formalism that we are using here, the gravitational time delay has two components: (i) the fact that coordinate time \(\Delta t_\mathrm{coord} = (1/c)\int \mathrm{d} \lambda \,\mathrm{d} x^0 / \mathrm{d} \lambda\) elapses more slowly in a gravitational field and (ii) the fact that part of the perturbation \(\vec{l}\) is along the unperturbed trajectory (see Equation C.93), which we therefore need to subtract. In GR, the latter contribution is zero because \(\Phi = \Psi\), but as before we’ll work out the more general case where we may have that \(\Phi \neq \Psi\). The total gravitational time delay is then \begin{equation} c\Delta t_\mathrm{Shapiro} = \int \mathrm{d} \lambda \,\left( l^0 - l_\parallel\right)\,, \end{equation} where \(l_\parallel = (\vec{l} \cdot \vec{k})/k\). From the discussion above, it is clear that \begin{equation} c^2\,{\mathrm{d} l_\parallel \over \mathrm{d} \lambda} = k (\vec{k}\cdot\nabla [\Psi - \Phi])\,, \end{equation} with solution \begin{equation} l_\parallel = {k \over c^2} \left(\Psi-\Phi\right)\,, \end{equation} through direct integration and using the boundary condition that \(l_\parallel = 0\) for \(\Phi-\Psi = 0\). Combining this with Equation (C.92), we then get \begin{align} c\Delta t_\mathrm{Shapiro} & = - \int \mathrm{d} s \left({\Phi + \Psi \over c^2}\right)\,, \end{align} where as above \(\mathrm{d}s = k\mathrm{d}\lambda\), because we integrate over the unperturbed trajectory. When \(\Phi = \Psi\) as the general theory of relativity predicts, this becomes \begin{align}\label{eq-gr-gravtimedelay} c\Delta t_\mathrm{Shapiro} & = - 2\int \mathrm{d} s \left({\Phi \over c^2}\right)\,. \end{align} This gravitational time delay was first derived by Shapiro (1964) as a test of the general theory of relativity, which it passed with flying colors (e.g., Shapiro et al. 1971; Reasenberg et al. 1979). 
Writing \(\Psi = \gamma \Phi\), measuring the value of \(\gamma\) becomes a strong test of GR, with the GR prediction being \(\gamma = 1\) (\(\gamma\) is a parameter in the parameterized post-Newtonian formalism). In the solar system, measurements of the Shapiro delay of radio signals from the Cassini spacecraft as they pass near the Sun give \(\gamma - 1 = (2.1 \pm 2.3)\times 10^{-5}\) (Bertotti et al. 2003). However, \(\gamma\) could depend on scale in alternatives to GR, and constraints on the scales of galaxies are significantly weaker, e.g., \(\gamma = 0.97 \pm 0.09\) from comparing the \(\Phi\)-dependent kinematics of (non-relativistic) stars in a galaxy that also acts as a \((\Phi+\Psi)\)-dependent gravitational lens (Collett et al. 2018).
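
To make the dependence on \(\gamma\) concrete, the gravitational time delay can be evaluated for a point mass. This is a sketch under assumptions introduced only for illustration: a potential \(\Phi = -GM/r\), a straight unperturbed ray with impact parameter \(b\), and an emitter and receiver at distances \(s_1\) and \(s_2\) along the ray on either side of closest approach. With \(\Psi = \gamma\Phi\), \begin{align*} c\Delta t_\mathrm{Shapiro} & = -(1+\gamma)\int \mathrm{d} s \left({\Phi \over c^2}\right) = (1+\gamma)\,{GM \over c^2}\int_{-s_1}^{s_2} {\mathrm{d} s \over \sqrt{b^2+s^2}}\\ & = (1+\gamma)\,{GM \over c^2}\,\ln\left[{\left(s_1+\sqrt{s_1^2+b^2}\right)\left(s_2+\sqrt{s_2^2+b^2}\right) \over b^2}\right] \approx (1+\gamma)\,{GM \over c^2}\,\ln\left({4\,s_1 s_2 \over b^2}\right)\,, \end{align*} where the last step holds for \(s_1, s_2 \gg b\). For a signal grazing the Sun with \(\gamma = 1\), \(b = R_\odot\), and \(s_1 = s_2 = 1\,\mathrm{AU}\) (values chosen for illustration, not the geometry of any particular experiment), \(2GM_\odot/c^3 \approx 9.85\,\mu\mathrm{s}\) and the logarithm is \(\approx 12.1\), so the one-way delay is roughly \(120\,\mu\mathrm{s}\). Under the same assumptions, the deflection angle scales as \((1+\gamma)/2\) times its GR value, so both observables probe the combination \(\Phi + \Psi\).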