C. The general theory of relativity and galaxies: Section 1 (Dynamics and Astrophysics of Galaxies)

C.1. Einstein’s field equations and geodesic motion

C.1.1. Mathematical background

In his special theory of relativity, Einstein (1905) introduced the notion that space and time are really part of a single four-dimensional spacetime where distances between points are measured using the four-dimensional line element in rectangular coordinates \begin{equation}\label{eq-gr-minkowski-metric} \mathrm{d} s^2 = -c^2\mathrm{d}t^2 + \mathrm{d} x^2 + \mathrm{d} y^2 + \mathrm{d} z^2\,. \end{equation} The line element, or metric, is used to compute distances between two points \((t_1,\vec{x}_1)\) and \((t_2,\vec{x}_2)\). In general we will abbreviate these coordinates as a four-dimensional vector \(x^\mu = (ct,\vec{x})\) where \(\mu\) indexes the vector and write the line element as \begin{equation}\label{eq-gr-minkowski-metric-asmetric} \mathrm{d} s^2 = \eta_{\mu\nu}\mathrm{d}x^\mu\mathrm{d} x^\nu\,, \end{equation} where in the special theory of relativity, \(\eta_{\mu\nu}\) is a diagonal matrix with \(-1\) as the first element and \(+1\) as the other diagonal elements. Depending on the velocity of a body, the line element along its path can be positive (a spacelike path), negative (a timelike path), or zero (the path of light or of any particle moving at the speed of light). Massive particles move at velocities \(v < c\) and therefore on timelike paths. Because \(\mathrm{d} s^2\) is negative in this case, it makes sense to introduce a coordinate \(\tau\), which satisfies \begin{equation}\label{eq-gr-minkowski-propertime} c^2\mathrm{d} \tau^2 = - \mathrm{d} s^2 = c^2\mathrm{d}t^2 - \mathrm{d} x^2 - \mathrm{d} y^2 - \mathrm{d} z^2=-\eta_{\mu\nu}\mathrm{d}x^\mu\mathrm{d} x^\nu\,. \end{equation} The coordinate \(\tau\) is the proper time and it corresponds to the time measured by a clock carried along a body’s path. 
The metric is used to compute distances in spacetime, e.g., for spacelike paths \(x^\mu(\lambda)\) parameterized by a parameter \(\lambda\) \begin{equation} \Delta s = \int_{x_1^\mu=(ct_1,\vec{x}_1)}^{x_2^\mu=(ct_2,\vec{x}_2)}\mathrm{d}\lambda\sqrt{\eta_{\mu\nu}{\mathrm{d}x^\mu \over \mathrm{d} \lambda}{\mathrm{d}x^\nu\over \mathrm{d}\lambda}}\,, \end{equation} while for timelike paths the distance is \(c\) times the difference in proper time, as follows from Equation (C.3), \begin{equation} c\,\Delta \tau = \int_{x_1^\mu=(ct_1,\vec{x}_1)}^{x_2^\mu=(ct_2,\vec{x}_2)}\mathrm{d}\lambda\sqrt{-\eta_{\mu\nu}{\mathrm{d}x^\mu \over \mathrm{d} \lambda}{\mathrm{d}x^\nu\over \mathrm{d}\lambda}}\,. \end{equation} The minus sign in front of the time dimension in the line element means that this metric is not of the usual Euclidean form, but is instead Lorentzian; the specific metric in Equation (C.1) is the Minkowski metric.
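
As a rough numerical illustration of the proper-time integral, the sketch below sums the intervals of many short segments along a straight timelike path at constant speed and compares the result with the familiar closed form \(t/\gamma\). It assumes units in which \(c = 1\); the helper name `proper_time`, the step count, and the chosen speed are all illustrative choices rather than anything taken from the text.

```python
import math

def proper_time(path, lam0, lam1, n=1000):
    """Estimate the proper time along `path` by summing short segments.

    `path(lam)` returns (t, x, y, z) in units with c = 1 (an assumption of
    this sketch). Each segment contributes sqrt(-eta_{mu nu} dx^mu dx^nu).
    """
    h = (lam1 - lam0) / n
    total = 0.0
    for i in range(n):
        a = path(lam0 + i * h)
        b = path(lam0 + (i + 1) * h)
        dt, dx, dy, dz = (b[k] - a[k] for k in range(4))
        # Minkowski interval of the short segment, signature (-,+,+,+)
        ds2 = -dt**2 + dx**2 + dy**2 + dz**2
        total += math.sqrt(-ds2)  # timelike segment, so ds2 < 0
    return total

v = 0.6  # speed as a fraction of c
straight = lambda lam: (lam, v * lam, 0.0, 0.0)  # parameterized by coordinate time
tau = proper_time(straight, 0.0, 10.0)
gamma = 1.0 / math.sqrt(1.0 - v**2)
print(tau, 10.0 / gamma)  # the two numbers agree up to rounding
```

For a curved timelike path the same function applies unchanged, although the segment sum then only approximates the integral and converges as `n` grows.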

The special theory of relativity is built upon the assumption that the laws of nature are the same in all inertial frames, that is, frames that are moving at constant velocity with respect to each other. In particular, the speed of light in vacuum is assumed to be a universal constant. The constancy of the speed of light in different inertial frames leads to the requirement that Lorentz transformations describe the coordinate transformation between two frames \((t,x,y,z)\) and \((t',x',y',z')\) moving with respect to each other; if the primed coordinate frame moves with respect to the unprimed one with a velocity \(v\) in the \(x\) direction, the Lorentz transformation is given by \begin{equation}\label{eq-gr-lorentz} \begin{pmatrix} ct'\\x'\\y'\\z'\end{pmatrix} = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0\\ 0 & 0 & 1 & 0\\ 0& 0& 0& 1\end{pmatrix} \begin{pmatrix} ct\\x\\y\\z\end{pmatrix}\,, \end{equation} where \(\beta = v/c\) and \(\gamma = 1/\sqrt{1-\beta^2}\). The transformation for a general velocity vector \(\vec{v}\) can be obtained by first rotating space such that the velocity is along \(x\), applying the transformation above, and rotating space back to the original direction. Denoting the position four vector as \(x'^{\mu} = (ct',x',y',z')\) and \(x^{\mu} = (ct,x,y,z)\), we can write the general transformation equation as \begin{equation}\label{eq-gr-sr-vectransform} x'^{\mu} = \Lambda^{\mu}_{\phantom{\mu}\nu}\,x^{\nu}\,, \end{equation} where \(\Lambda^{\mu}_{\phantom{\mu}\nu}\) is a Lorentz transformation matrix (e.g., that from Equation C.6 for a velocity along the \(x\) direction). As is standard in discussions of the special and general theories of relativity, we have used the Einstein summation convention here, which states that repeated indices are summed over; we refer to this process as a contraction. Another example of a vector is the four velocity \(U^\mu = \gamma\,(c,v_x,v_y,v_z)\). 
Equation (C.7) states how vectors transform under Lorentz transformations.
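
To make the transformation rule concrete, here is a minimal sketch that builds the boost matrix of Equation (C.6), applies it to an event with the index sum written out explicitly, and checks that the Minkowski interval is unchanged. The function names (`boost_x`, `apply`, `interval`) and the sample event are hypothetical and chosen only for illustration.

```python
import math

def boost_x(beta):
    """Lorentz boost along x with speed v = beta * c, acting on (ct, x, y, z)."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [[g, -beta * g, 0.0, 0.0],
            [-beta * g, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def apply(matrix, vec):
    """x'^mu = Lambda^mu_nu x^nu, with the sum over nu written out."""
    return [sum(matrix[m][n] * vec[n] for n in range(4)) for m in range(4)]

def interval(vec):
    """eta_{mu nu} x^mu x^nu with eta = diag(-1, 1, 1, 1)."""
    return -vec[0]**2 + vec[1]**2 + vec[2]**2 + vec[3]**2

event = [5.0, 3.0, 1.0, -2.0]  # (ct, x, y, z) in arbitrary length units
event_primed = apply(boost_x(0.8), event)
print(interval(event), interval(event_primed))  # the two values agree up to rounding
```

The components of the event change under the boost, but the interval does not, which is the sense in which the line element is a scalar.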

The first important tensor in special relativity is the metric tensor \(\eta_{\mu\nu}\), which gives the line element in Equation (C.2). Because \(\mathrm{d} s^2\) is a scalar and \(\mathrm{d}x^\mu\) and \(\mathrm{d} x^\nu\) are both vectors, it is clear that \(\eta_{\mu\nu}\) is a \((0,2)\) tensor. The Minkowski metric \(\eta_{\mu\nu}\) is simply \begin{equation} \eta_{\mu\nu} = \mathrm{diag}(-1,1,1,1)\,, \end{equation} for \(\mathrm{d}x^\nu = (c\mathrm{d}t,\mathrm{d}x,\mathrm{d}y,\mathrm{d}z)\), where \(\mathrm{diag}(\cdot,\cdot,\cdot,\cdot)\) denotes a diagonal matrix. The inverse metric \(\eta^{\mu\nu}\) is defined such that \(\eta_{\mu\lambda}\,\eta^{\lambda\nu} = \delta_{\mu}^{\nu}\), where \(\delta_{\mu}^{\nu}\) is the Kronecker delta. Because the metric is a matrix in the language of linear algebra, the inverse metric’s elements are given by the inverse matrix’s elements. Because the Minkowski metric is diagonal with entries \(\pm 1\), the elements of the inverse Minkowski metric are equal to those of the Minkowski metric itself. In the general theory of relativity, the metric is generalized, but it is always the case that both the metric and the inverse metric are symmetric. One useful property of the metric is that it allows one to raise or lower indices on tensors to create different tensors. For example, to turn the four velocity \(U^\mu\) into a one form \(U_\mu\), do \begin{equation} U_\mu = \eta_{\mu\nu}\,U^\nu\,. \end{equation}
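
Lowering an index can be tried out numerically. The sketch below lowers the index of a four velocity with the Minkowski metric and then contracts the result with the original vector; for any subluminal velocity the contraction \(U_\mu U^\mu\) comes out as \(-c^2\), which follows from the definition of proper time. The helper names and the sample velocity are assumptions made for this example.

```python
import math

C = 299_792_458.0            # speed of light in m/s
ETA = [-1.0, 1.0, 1.0, 1.0]  # diagonal of eta_{mu nu}

def four_velocity(vx, vy, vz):
    """U^mu = gamma (c, v_x, v_y, v_z)."""
    g = 1.0 / math.sqrt(1.0 - (vx**2 + vy**2 + vz**2) / C**2)
    return [g * C, g * vx, g * vy, g * vz]

def lower(vec):
    """U_mu = eta_{mu nu} U^nu; eta is diagonal, so one term survives per mu."""
    return [ETA[m] * vec[m] for m in range(4)]

U_up = four_velocity(0.3 * C, 0.4 * C, 0.0)
U_down = lower(U_up)
norm = sum(U_down[m] * U_up[m] for m in range(4))  # contraction U_mu U^mu
print(norm / C**2)  # close to -1
```

Only the time component changes sign when the index is lowered, which is all that the Minkowski metric does in rectangular coordinates.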

With this mathematical infrastructure of tensors, we can then re-state special relativity’s postulate that the laws of motion are the same in every inertial reference frame as: any law of physics needs to be written in tensorial form where tensors transform under Lorentz transformations according to Equation (C.10). For example, a law of physics \begin{equation} {\partial F^{\alpha\beta} \over \partial x^\alpha} = \mu_0\,J^\beta\,, \end{equation} where \(\mu_0\) is some constant, \(F^{\alpha\beta}\) is a \((2,0)\) tensor, and \(J^\beta\) is a vector, would transform to a primed frame as \begin{align} {\partial F'^{\alpha\beta} \over \partial x'^\alpha} & = \Lambda^{\alpha}_{\phantom{\alpha}\delta} \Lambda^{\beta}_{\phantom{\beta}\epsilon} \Lambda_{\alpha}^{\phantom{\alpha}\zeta} {\partial F^{\delta\epsilon} \over \partial x^\zeta} = \Lambda^{\beta}_{\phantom{\beta}\epsilon} \delta^{\zeta}_{\delta} {\partial F^{\delta\epsilon} \over \partial x^\zeta} = \Lambda^{\beta}_{\phantom{\beta}\epsilon} {\partial F^{\delta\epsilon} \over \partial x^\delta} = \Lambda^{\beta}_{\phantom{\beta}\epsilon}\,\mu_0\,J^\epsilon = \mu_0\,J'^\beta\,, \end{align} and therefore has the same form. This only serves as an example, but it is in fact one of Maxwell’s equations describing electromagnetism.

In the general theory of relativity, invariance of the laws of physics under Lorentz transformations is generalized to require invariance under any coordinate transformation of spacetime. Spacetime itself is also generalized from the Minkowskian form where distances are computed using Equation (C.1) to being calculated using a general metric \(g_{\mu\nu}\) as \begin{equation}\label{eq-gr-lineelement} \mathrm{d} s^2 = g_{\mu\nu}\mathrm{d}x^\mu\mathrm{d} x^\nu\,, \end{equation} where \(g_{\mu\nu}\) is a symmetric \((0,2)\) tensor and the metric is always dimensionless. Such more general spaces are mathematically known as manifolds and one manifold that is likely familiar is that of the surface of a sphere, which is a two-dimensional analog of the types of manifolds that appear in GR; on the surface of a sphere, the line element is given by \begin{equation} \mathrm{d} s^2 = \mathrm{d}\theta^2 + \sin^2 \theta\,\mathrm{d}\phi^2\,, \end{equation} where \((\theta,\phi)\) are the polar and azimuthal angles in the spherical coordinates of Appendix A.1. Much like the surface of a sphere is curved, spacetime itself can be curved. While we think of a sphere as being curved because we can see it as curved embedded in the three-dimensional space of regular life, an important property of manifolds is that the curvature is in fact an intrinsic property of the manifold that can be determined without any reference to an embedding space. A standard way to make this palatable is to point out that one can figure out the curvature of a sphere from determining the sum of the angles of a triangle drawn on its surface, which is larger than \(180^\circ\) by an amount that depends on the curvature; this measurement can manifestly be performed without reference to the embedding space.
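
The triangle argument can be checked with a small computation. The sketch below evaluates the interior angles of a triangle covering one octant of the unit sphere. It uses the three-dimensional embedding purely as a computational convenience to obtain the angles that an observer confined to the surface would measure; the function names are illustrative.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def angle_at(a, b, c):
    """Interior angle at vertex a of the spherical triangle abc (unit vectors)."""
    # Directions of the great-circle arcs a->b and a->c in the tangent plane at a
    tb = [bi - dot(a, b) * ai for ai, bi in zip(a, b)]
    tc = [ci - dot(a, c) * ai for ai, ci in zip(a, c)]
    cosang = dot(tb, tc) / math.sqrt(dot(tb, tb) * dot(tc, tc))
    return math.acos(max(-1.0, min(1.0, cosang)))

def angle_sum_deg(a, b, c):
    total = angle_at(a, b, c) + angle_at(b, c, a) + angle_at(c, a, b)
    return math.degrees(total)

# One octant of the unit sphere: the north pole and two equatorial points 90 degrees apart
pole, p1, p2 = (0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
print(angle_sum_deg(pole, p1, p2))  # 270 up to rounding
```

The excess over \(180^\circ\) here is \(90^\circ = \pi/2\) radians, which equals the area of the octant on the unit sphere; shrinking the triangle shrinks the excess along with the enclosed area.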

Properly defining vectors, one-forms, and general tensors on manifolds is mathematically difficult and requires the introduction of tangent spaces and cotangent spaces that correspond to the space of vectors and one forms, respectively, at a given point. We’ll try to get by in the following without introducing this level of mathematical rigor. Assuming that vectors, one forms, and tensors can be defined on the spacetime manifold, they transform under a coordinate transformation \(x^\mu \rightarrow x'^\mu\) as, for example, for a vector \(V^\mu\) \begin{equation}\label{eq-gr-gr-vectransform} V'^{\mu} = {\partial x'^\mu \over \partial x^\nu}\,V^{\nu}\,, \end{equation} that is, simply using the Jacobian of the coordinate transformation. One forms \(V_\mu\) transform as \begin{equation}\label{eq-gr-gr-oneformtransform} V'_{\mu} = {\partial x^\nu \over \partial x'^\mu}\,V_{\nu}\,, \end{equation} that is, using the inverse Jacobian. General tensors transform as \begin{equation}\label{eq-gr-gr-tensortransform} T'^{\mu_1 \mu_2\ldots\mu_n}_{\phantom{\mu_1 \mu_2\ldots\mu_n}\nu_1\nu_2\ldots\nu_m} = {\partial x'^{\mu_1} \over \partial x^{\alpha_1}}{\partial x'^{\mu_2} \over \partial x^{\alpha_2}}\ldots{\partial x'^{\mu_n} \over \partial x^{\alpha_n}} {\partial x^{\beta_1} \over \partial x'^{\nu_1}}{\partial x^{\beta_2} \over \partial x'^{\nu_2}}\ldots{\partial x^{\beta_m} \over \partial x'^{\nu_m}}\,T^{\alpha_1 \alpha_2\ldots\alpha_n}_{\phantom{\alpha_1 \alpha_2\ldots\alpha_n}\beta_1\beta_2\ldots\beta_m}\,. \end{equation} It is clear that the tensor-transformation equation from the special theory of relativity (Equation C.10) is a special case of this for coordinate transformations given by the Lorentz transformation.
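
A familiar two-dimensional case may help to see the Jacobian rule at work. The sketch below transforms a vector from Cartesian coordinates \((x,y)\) to polar coordinates \((r,\phi)\) in the flat plane and checks that its squared length, computed with the respective metric in each coordinate system, is the same. The plane and the helper names are assumptions chosen for illustration and do not appear in the text.

```python
import math

def jacobian_cart_to_polar(x, y):
    """Matrix of partial x'^mu / partial x^nu for (x, y) -> (r, phi)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]

def transform_vector(jac, V):
    """V'^mu = (partial x'^mu / partial x^nu) V^nu."""
    return [sum(jac[m][n] * V[n] for n in range(2)) for m in range(2)]

x, y = 3.0, 4.0
V = [1.0, 2.0]  # Cartesian components (V^x, V^y)
Vr, Vphi = transform_vector(jacobian_cart_to_polar(x, y), V)
r = math.hypot(x, y)
# The squared length is a scalar: the Cartesian metric is diag(1, 1) and the
# polar metric is diag(1, r^2)
print(V[0]**2 + V[1]**2, Vr**2 + r**2 * Vphi**2)  # equal up to rounding
```

The individual components change (and even change units), but the contraction with the metric does not, which is the defining behavior of a scalar.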

Laws of physics are generally written as differential equations, relating the rate of change with respect to time or space of different quantities. For example, Newton’s second law relates the acceleration, or the rate of change of the velocity with respect to time, to the force. The force itself is (minus) the gradient of a potential for a conservative force, or the rate of change of the potential with respect to the different spatial coordinates. To write down tensorial laws of physics on general manifolds, we therefore need a notion of a derivative on the manifold. Because the partial derivative operator \(\partial_\mu\) looks like it might be a one form, one might be tempted to think that the partial derivative can just act as a general derivative on any manifold, but in fact, the partial derivative operator acting on a tensor does not transform as a tensor, that is, according to Equation (C.19). However, it turns out that we can fix the partial derivative by adding a linear correction to it in a way that preserves desirable properties of the derivative, such as linearity and satisfying the product rule; this leads to the covariant derivative. For example, for vectors the covariant derivative \(\nabla_\mu\) acting on a vector \(X^\nu\) gives the \((1,1)\) tensor \(\nabla_\mu X^\nu\) through \begin{equation} \nabla_\mu X^\nu = \partial_\mu X^\nu + \Gamma^{\nu}_{\mu\lambda}\,X^\lambda\,. \end{equation} We want this covariant derivative of a vector to transform as a \((1,1)\) tensor, so we need \begin{equation} \nabla'_\mu X'^\nu = {\partial x^\delta \over \partial x'^\mu}{\partial x'^\nu \over \partial x^\epsilon}\nabla_\delta X^\epsilon\,. \end{equation} By requiring that this is the case, we can determine the transformation properties of \(\Gamma^{\nu}_{\mu\lambda}\). 
Working out \(\nabla'_\mu X'^\nu\) with the transformation properties we know, we have that \begin{align} \nabla'_\mu X'^\nu & = {\partial \over \partial x'^\mu} X'^\nu + \Gamma'^{\nu}_{\mu\lambda}\,X'^\lambda = {\partial x^\delta \over \partial x'^\mu}{\partial \over \partial x^\delta}\left( {\partial x'^\nu \over \partial x^\epsilon}X^\epsilon\right) + \Gamma'^{\nu}_{\mu\lambda}\,{\partial x'^\lambda \over \partial x^\epsilon}X^\epsilon\\ & = {\partial x^\delta \over \partial x'^\mu}{\partial x'^\nu \over \partial x^\epsilon}{\partial \over \partial x^\delta}X^\epsilon+{\partial x^\delta \over \partial x'^\mu}X^\epsilon{\partial \over \partial x^\delta}\left( {\partial x'^\nu \over \partial x^\epsilon}\right) + \Gamma'^{\nu}_{\mu\lambda}\,{\partial x'^\lambda \over \partial x^\epsilon}X^\epsilon\,, \end{align} and we want this to equal \begin{equation} {\partial x^\delta \over \partial x'^\mu}{\partial x'^\nu \over \partial x^\epsilon}\nabla_\delta X^\epsilon = {\partial x^\delta \over \partial x'^\mu}{\partial x'^\nu \over \partial x^\epsilon}{\partial \over \partial x^\delta} X^\epsilon+ {\partial x^\delta \over \partial x'^\mu}{\partial x'^\nu \over \partial x^\epsilon} \Gamma^{\epsilon}_{\delta\lambda} X^\lambda\,. \end{equation} The terms containing \(\partial X^\epsilon/\partial x^\delta\) cancel between the two expressions, so the second-derivative term has to be moved to the other side and picks up a minus sign. For the equality to hold for any vector \(X^\mu\), it therefore has to be the case that \begin{equation}\label{eq-gr-connection-coeffs-transform} \Gamma'^{\nu}_{\mu\lambda} = {\partial x^\delta \over \partial x'^\mu}{\partial x^\epsilon \over \partial x'^\lambda}{\partial x'^\nu\over \partial x^\zeta}\Gamma^\zeta_{\delta\epsilon} - {\partial x^\delta \over \partial x'^\mu}{\partial x^\epsilon \over \partial x'^\lambda}{\partial^2 x'^\nu \over \partial x^\delta \partial x^\epsilon}\,. \end{equation} The \(\Gamma^{\nu}_{\mu\lambda}\) coefficients are known as connection coefficients. 
Note that \(\Gamma^{\nu}_{\mu\lambda}\) is not a tensor itself; indeed, it cannot be, because we constructed it to correct the non-tensorial nature of the partial derivative in such a way that the covariant derivative is tensorial itself. Furthermore requiring that the covariant derivative reduces to partial derivatives when applied to scalars and that it commutes with contractions of indices (that is, e.g., \(\nabla_\mu(T^\lambda_{\phantom{\lambda}\lambda\nu}) = (\nabla T)^{\phantom{\mu}\lambda}_{\mu\phantom{\lambda}\lambda\nu}\)) allows one to show that the covariant derivative of a one form \(X_\mu\) is given by \begin{equation} \nabla_\mu X_\nu = \partial_\mu X_\nu - \Gamma^{\lambda}_{\mu\nu}\,X_\lambda\,, \end{equation} where \(\Gamma^{\lambda}_{\mu\nu}\) is the same set of coefficients that appears in the covariant derivative of a vector. Because tensors transform as combinations of \(n\) vectors and \(m\) one-forms, the covariant derivative of a general tensor is \begin{align} & \nabla_{\sigma} T^{\mu_1 \ldots\mu_n}_{\phantom{\mu_1 \ldots\mu_n}\nu_1\nu_2\ldots\nu_m} = \partial_{\sigma} T^{\mu_1 \ldots\mu_n}_{\phantom{\mu_1\ldots\mu_n}\nu_1\nu_2\ldots\nu_m} + \Gamma^{\mu_1}_{\sigma\lambda}T^{\lambda \mu_2\ldots\mu_n}_{\phantom{\lambda \mu_2\ldots\mu_n}\nu_1\nu_2\ldots\nu_m} + \ldots + \Gamma^{\mu_n}_{\sigma\lambda}T^{\mu_1 \mu_2\ldots\lambda}_{\phantom{\mu_1 \mu_2\ldots\lambda}\nu_1\nu_2\ldots\nu_m}\nonumber \\ & \quad - \Gamma^{\lambda}_{\sigma\nu_1}T^{\mu_1 \mu_2\ldots\mu_n}_{\phantom{\mu_1 \mu_2\ldots\mu_n}\lambda\nu_2\ldots\nu_m} - \Gamma^{\lambda}_{\sigma\nu_2}T^{\mu_1 \mu_2\ldots\mu_n}_{\phantom{\mu_1 \mu_2\ldots\mu_n}\nu_1\lambda\ldots\nu_m} - \ldots - \Gamma^{\lambda}_{\sigma\nu_m}T^{\mu_1 \mu_2\ldots\mu_n}_{\phantom{\mu_1 \mu_2\ldots\mu_n}\nu_1\nu_2\ldots\lambda} \,. \end{align}

One can define many different covariant derivatives by choosing different connection coefficients \(\Gamma^{\lambda}_{\mu\nu}\), because these all define a good covariant derivative as long as they satisfy Equation (C.25). But if we additionally require that the connection coefficients are symmetric under the swapping of their lower indices, \(\Gamma^{\lambda}_{\mu\nu} = \Gamma^{\lambda}_{\nu\mu}\), and that the metric is covariantly constant, \(\nabla_\sigma g_{\mu\nu} = 0\), then there is a single covariant derivative that has these properties and it is this covariant derivative that is used in the general theory of relativity for reasons that will become clear in the next subsection. From the two defining properties of these connection coefficients, one can derive an explicit expression for them \begin{equation}\label{eq-gr-christoffel-metric} \Gamma^{\lambda}_{\mu\nu} = {1\over 2}g^{\lambda \epsilon}\,\left(\partial_\mu g_{\nu\epsilon} + \partial_\nu g_{\epsilon\mu}-\partial_\epsilon g_{\mu\nu}\right)\,. \end{equation} This connection is known as the Christoffel connection.
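
The explicit expression for the Christoffel connection is easy to evaluate numerically. The following sketch applies Equation (C.28) to the unit-sphere metric, with the metric derivatives approximated by central finite differences, and compares two components with their well-known analytic values. It is restricted to diagonal metrics to keep the inverse trivial; the function names and the step size `h` are choices made for this example.

```python
import math

def sphere_metric(x):
    """g_{mu nu} for the unit sphere in coordinates x = (theta, phi)."""
    theta = x[0]
    return [[1.0, 0.0], [0.0, math.sin(theta)**2]]

def christoffel(metric, x, h=1e-5):
    """Gamma[l][m][v] = Gamma^l_{m v} from the metric via central differences.

    Assumes a diagonal metric so that the inverse is simply 1 / g_{mu mu}.
    """
    n = len(x)
    g = metric(x)
    ginv = [1.0 / g[i][i] for i in range(n)]
    dg = []  # dg[e][i][j] = partial_e g_{ij}
    for e in range(n):
        xp, xm = list(x), list(x)
        xp[e] += h
        xm[e] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(n)]
                   for i in range(n)])
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for l in range(n):
        for m in range(n):
            for v in range(n):
                # Only epsilon = l contributes because g^{l epsilon} is diagonal
                gamma[l][m][v] = 0.5 * ginv[l] * (dg[m][v][l] + dg[v][l][m] - dg[l][m][v])
    return gamma

theta = 1.0
G = christoffel(sphere_metric, [theta, 0.3])
print(G[0][1][1], -math.sin(theta) * math.cos(theta))  # Gamma^theta_{phi phi}
print(G[1][0][1], math.cos(theta) / math.sin(theta))   # Gamma^phi_{theta phi}
```

The symmetry in the lower indices can be seen directly in the output array, since `G[1][0][1]` and `G[1][1][0]` are computed from the same combination of metric derivatives.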

C.1.2. Generalizing Newton’s second law: the geodesic equation

We will require a bit more mathematical formalism to understand how the presence of matter and energy curves spacetime, but assuming for now that spacetime is curved by the presence of matter and energy, we can derive the GR generalization of Newton’s second law of motion. Because in GR, gravity is not a force, but rather the motion under the influence of gravity is that of motion in the spacetime curved by matter and energy, the relevant form that we need to generalize is the force-free form of Newton’s second law (assuming that no other forces are relevant). For a path \(x^\mu(\lambda)\) parameterized by a so-called affine parameter \(\lambda\), we can write Newton’s second law without force as \begin{equation}\label{eq-gr-newton2} {\mathrm{d}^2 x^\mu \over \mathrm{d} \lambda^2} = 0\,, \end{equation} because in the Newtonian framework, the zero-th component of this is \begin{equation} {\mathrm{d}^2 x^0 \over \mathrm{d} \lambda^2} = c{\mathrm{d}^2 t \over \mathrm{d} \lambda^2}= 0\,, \end{equation} with solution \(t = A\,\lambda+B\), and \(\lambda\) is therefore nothing more than an arbitrarily re-scaled time; choosing \(A=1\), \(B=0\), we have that \(\lambda = t\) and the spatial part of Equation (C.29) is \begin{equation} {\mathrm{d}^2 \vec{x} \over \mathrm{d} t^2} = 0\,, \end{equation} which is equivalent to Newton’s second law in the absence of a force for constant mass. To generalize Equation (C.29) to curved spacetime, we therefore have to write it in tensorial form. To do this, we use the principle of minimal coupling, which we can informally state as saying that the generalization of any law of physics from the Minkowski spacetime of the special theory of relativity to the general curved spacetimes of GR has to be as simple as possible. In particular, we should not introduce the tensor that we will use to describe curvature below or its contractions explicitly. Because we have not introduced the curvature tensor yet, we are in no danger of doing that here! 
Thus, we simply write Equation (C.29) in a covariant way under Lorentz transformations and generalize this form to curved spacetimes. A Lorentz-invariant way of writing Equation (C.29) is as \begin{equation}\label{eq-gr-newton3} {\mathrm{d}^2 x^\mu \over \mathrm{d} \lambda^2} = {\mathrm{d} x^\nu \over \mathrm{d} \lambda}\partial_\nu {\mathrm{d} x^\mu \over \mathrm{d} \lambda} = 0\,, \end{equation} because \({\mathrm{d} x^\mu \over \mathrm{d} \lambda}\) is a vector and in the absence of curvature, partial differentiation of a vector gives a tensor. In curved spacetime, we know from the discussion above that partial differentiation of a vector does not produce a tensor, but we also know that we can correct this by switching to the covariant derivative. The obvious curved-spacetime generalization of Equation (C.32) is therefore \begin{equation} {\mathrm{d} x^\nu \over \mathrm{d} \lambda}\nabla_\nu {\mathrm{d} x^\mu \over \mathrm{d} \lambda} = 0\,. \end{equation} Writing out the covariant derivative explicitly and slightly simplifying the resulting expression, we have \begin{equation}\label{eq-gr-geodesic} {\mathrm{d}^2 x^\mu \over \mathrm{d} \lambda^2} +\Gamma^{\mu}_{\nu\epsilon}{\mathrm{d} x^\nu \over \mathrm{d} \lambda} {\mathrm{d} x^\epsilon \over \mathrm{d} \lambda} = 0\,. \end{equation} This is the geodesic equation and it is the GR generalization of Newton’s second law. Any connection \(\Gamma^{\mu}_{\nu\epsilon}\) on a manifold defines a different geodesic equation, but only the Christoffel connection from Equation (C.28) has the property that the resulting path is the shortest spacetime path between two points \(x_1^\mu\) and \(x_2^\mu\). Proving this explicitly is a somewhat tedious and uninsightful exercise using standard calculus of variations and we will skip this proof here. 
But the important thing to remember is that the geodesic equation using the Christoffel connection implies that bodies move along paths that are the shortest path in spacetime in the absence of non-gravitational forces.
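
As a concrete two-dimensional illustration, the geodesic equation can be integrated on the unit sphere, where the nonzero Christoffel coefficients are \(\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta\) and \(\Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} = \cot\theta\). The sketch below uses a hand-written fourth-order Runge-Kutta step and checks that the resulting path stays in a single plane through the center of the sphere, that is, on a great circle. The initial conditions, step size, and helper names are arbitrary choices for this example.

```python
import math

def geodesic_rhs(state):
    """Geodesic equation on the unit sphere as a first-order system.

    state = (theta, phi, dtheta/dlam, dphi/dlam).
    """
    th, ph, dth, dph = state
    ddth = math.sin(th) * math.cos(th) * dph**2           # -Gamma^theta_{phi phi} dph^2
    ddph = -2.0 * math.cos(th) / math.sin(th) * dth * dph  # -2 Gamma^phi_{theta phi} dth dph
    return (dth, dph, ddth, ddph)

def rk4_step(state, h):
    def shifted(s, k, f):
        return tuple(si + f * ki for si, ki in zip(s, k))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, h / 2))
    k3 = geodesic_rhs(shifted(state, k2, h / 2))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def embed(th, ph):
    """Position in three-dimensional space, used only to test the result."""
    return (math.sin(th) * math.cos(ph), math.sin(th) * math.sin(ph), math.cos(th))

def tangent(th, ph, dth, dph):
    """Embedding-space velocity for the coordinate velocity (dth, dph)."""
    return (math.cos(th) * math.cos(ph) * dth - math.sin(th) * math.sin(ph) * dph,
            math.cos(th) * math.sin(ph) * dth + math.sin(th) * math.cos(ph) * dph,
            -math.sin(th) * dth)

state = (math.radians(60.0), 0.0, 0.3, 0.5)  # arbitrary starting point and direction
r0, v0 = embed(state[0], state[1]), tangent(*state)
# Normal of the plane through the origin spanned by the initial position and velocity
normal = (r0[1] * v0[2] - r0[2] * v0[1],
          r0[2] * v0[0] - r0[0] * v0[2],
          r0[0] * v0[1] - r0[1] * v0[0])
norm = math.sqrt(sum(c * c for c in normal))
max_offset = 0.0
for _ in range(2000):
    state = rk4_step(state, 1e-3)
    r = embed(state[0], state[1])
    max_offset = max(max_offset, abs(sum(a * b for a, b in zip(normal, r))) / norm)
print(max_offset)  # stays tiny: the path remains on one great circle
```

The starting point is chosen away from the poles, where the \((\theta,\phi)\) coordinates are singular and the integration would need more care.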

C.1.3. Curvature

Our remaining task for constructing a fully tensorial theory of gravity in general spacetimes is to generalize Newton’s law of gravity, or equivalently, the Poisson equation (2.2). Because in GR, the force of gravity is replaced by the curvature of spacetime, which is curved by matter and energy, we need to describe the curvature and the matter and energy content and relate them in a tensorial way. Let’s start with curvature. As we saw above, partial differentiation in flat spacetimes is replaced by covariant derivatives in curved spacetimes. In flat space, the order of partial differentiation does not matter, that is, for example \(\partial_\mu \partial_\nu X^\epsilon = \partial_\nu \partial_\mu X^\epsilon\), but this does not hold for covariant derivatives. For the specific case of the Christoffel connection, we can write \begin{equation} \nabla_\mu \nabla_\nu X^\delta - \nabla_\nu \nabla_\mu X^\delta = R^\delta_{\phantom{\delta}\epsilon\mu\nu}X^\epsilon\,, \end{equation} (for connections that are not symmetric under the swapping of their lower indices, there is an additional term proportional to the antisymmetric part of the connection). Because the right-hand side is zero and, thus, \(R^\delta_{\phantom{\delta}\epsilon\mu\nu}=0\) in flat spacetime where the covariant derivative reduces to a partial derivative, it is clear that the tensor \(R^\delta_{\phantom{\delta}\epsilon\mu\nu}\) is a measure of the curvature. This tensor is known as the Riemann tensor and from its definition above, we can obtain an explicit expression for it in terms of the connection \begin{equation}\label{eq-gr-riemann-tensor} R^\delta_{\phantom{\delta}\epsilon\mu\nu} = \partial_\mu\Gamma^\delta_{\nu\epsilon}-\partial_\nu\Gamma^\delta_{\mu\epsilon}+\Gamma^\delta_{\mu\lambda}\Gamma^\lambda_{\nu\epsilon}-\Gamma^\delta_{\nu\lambda}\Gamma^\lambda_{\mu\epsilon}\,. 
\end{equation} Combined with Equation (C.28) for the Christoffel connection, it is clear that the curvature is solely determined by the metric \(g_{\mu\nu}\). We need two further tensors that are computed from the Riemann tensor. The Ricci tensor \(R_{\mu\nu}\) is the following contraction of the Riemann tensor \begin{equation}\label{eq-gr-ricci-tensor} R_{\mu\nu} = R^\delta_{\phantom{\delta}\mu\delta\nu}\,, \end{equation} and we can define the Ricci scalar \(R\) from it: \begin{equation}\label{eq-gr-ricci-scalar} R = g^{\mu\nu}R_{\mu\nu}\,. \end{equation} The Ricci tensor and Ricci scalar satisfy the following useful relation \begin{equation}\label{eq-gr-deinsteintensor} g^{\mu\delta}\nabla_\delta\left(R_{\mu\nu}-{1\over 2}R\,g_{\mu\nu}\right)=0\,. \end{equation} The tensor in the parentheses here is known as the Einstein tensor for reasons that will soon become clear. As we discussed above, the metric is dimensionless and the Riemann tensor, the Ricci tensor, and the Ricci scalar have dimensions of 1/length\(^2\), because they are obtained from second derivatives with respect to \(x^\mu\) (which has units of length). The inverse of the square root of these tensors therefore gives a typical length on which the space is curved, with a large length corresponding to a small curvature. As an example, the Ricci scalar for a sphere with radius \(r\) is \(2/r^2\), which makes sense: a sphere with a large radius (say the Earth) has a smaller curvature than one with a small radius (say a tennis ball you are holding here on Earth).
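
The value \(2/r^2\) for the sphere can be verified by following Equations (C.36) to (C.38) step by step. The sketch below starts from the analytic Christoffel coefficients of a sphere in \((\theta,\phi)\) coordinates, takes their derivatives by central finite differences, builds the Riemann tensor, and contracts it twice. The metric is taken to be \(\mathrm{d}s^2 = r^2(\mathrm{d}\theta^2 + \sin^2\theta\,\mathrm{d}\phi^2)\); the function names and the step size are assumptions of this example.

```python
import math

def gamma(theta):
    """Christoffel symbols G[l][m][n] = Gamma^l_{m n} of a sphere.

    They do not depend on the radius, which only rescales the metric.
    """
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -math.sin(theta) * math.cos(theta)
    G[1][0][1] = G[1][1][0] = math.cos(theta) / math.sin(theta)
    return G

def ricci_scalar(theta, radius, h=1e-5):
    G = gamma(theta)
    Gp, Gm = gamma(theta + h), gamma(theta - h)

    def dG(m, d, n, e):
        """partial_m Gamma^d_{n e}; only theta derivatives (m = 0) are nonzero."""
        return (Gp[d][n][e] - Gm[d][n][e]) / (2 * h) if m == 0 else 0.0

    def riemann(d, e, m, n):
        """R^d_{e m n} following Equation (C.36)."""
        value = dG(m, d, n, e) - dG(n, d, m, e)
        for l in range(2):
            value += G[d][m][l] * G[l][n][e] - G[d][n][l] * G[l][m][e]
        return value

    # Ricci tensor R_{mu nu} = R^delta_{mu delta nu}
    ricci = [[sum(riemann(d, m, d, n) for d in range(2)) for n in range(2)]
             for m in range(2)]
    # Inverse of the diagonal metric r^2 diag(1, sin^2(theta))
    ginv = [1.0 / radius**2, 1.0 / (radius * math.sin(theta))**2]
    return sum(ginv[m] * ricci[m][m] for m in range(2))

print(ricci_scalar(1.0, 3.0), 2.0 / 3.0**2)  # agree up to finite-difference error
```

The result does not depend on the value of \(\theta\) at which it is evaluated, reflecting the fact that a sphere is curved in the same way everywhere.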

C.1.4. Matter and energy

The final ingredient that we need is to express the matter and energy that curve spacetime. So far we haven’t said much about quantities like momentum, energy, etc., instead mainly focusing on the geometric structure of spacetime. But now we will discuss how we can deal with quantities like momentum and energy in a tensorial fashion. As we already discussed above, the position four vector is given by \(x^{\mu} = (ct,x,y,z)\) and the four velocity is \(U^\mu = \gamma\,(c,v_x,v_y,v_z)\); the four velocity is in fact the derivative of the position four vector with respect to the proper time \begin{equation} U^\mu = {\mathrm{d} x^\mu \over \mathrm{d}\tau}\,. \end{equation} (to show that this is equivalent to \(U^\mu = \gamma\,(c,v_x,v_y,v_z)\), use the fact that \(\mathrm{d}t = \gamma \mathrm{d}\tau\)). The four momentum \(p^\mu\) is then given by \begin{equation} p^\mu = m\,U^\mu\,. \end{equation} In the limit of small velocities, \(p^0 = \gamma\,mc \approx mc+ mv^2/[2c] = E/c\), where \(E = mc^2 + mv^2/2\) is the rest energy \(mc^2\) plus the kinetic energy. Similarly, \(p^i\) (where the latin index \(i\) indicates that this only indexes the spatial directions) is \(p^i = \gamma\,mv^i \approx mv^i\), the momentum in classical mechanics. The four momentum therefore generalizes the concepts of energy and momentum.
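
To get a feeling for how good the small-velocity approximation is for galactic dynamics, the short sketch below compares the exact energy \(c\,p^0 = \gamma m c^2\) with the sum of rest energy and Newtonian kinetic energy for a speed of about \(220\,\mathrm{km\,s}^{-1}\), roughly the orbital speed of the Sun in the Milky Way. The function names and the test mass are illustrative.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def energy_exact(m, v):
    """E = c p^0 = gamma m c^2."""
    return m * C**2 / math.sqrt(1.0 - (v / C)**2)

def energy_newtonian(m, v):
    """Rest energy plus Newtonian kinetic energy."""
    return m * C**2 + 0.5 * m * v**2

m = 1.0    # kg
v = 2.2e5  # m/s
rel_diff = (energy_exact(m, v) - energy_newtonian(m, v)) / energy_exact(m, v)
print(rel_diff)  # of order (v/c)^4, far below the kinetic fraction (v/c)^2 / 2
```

The leading correction is of relative size \(3(v/c)^4/8\), which for typical stellar velocities in galaxies is many orders of magnitude below the kinetic energy itself.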

While \(x^\mu\) and \(p^\mu\) describe a single particle, to describe an extended system with macroscopic properties such as density and pressure, we need a more general concept, much like we describe a gas with density, pressure, etc. In this context, such a system is always described as a fluid, of which the important characteristics for how it couples to gravity are its energy and momentum densities, pressure, and anisotropic stress (but it can also have entropy etc.). Counting the number of properties, you see that there are 10 and they can therefore be described by a symmetric \(4\times 4\) matrix (four diagonal elements and six off-diagonal elements). By constructing this matrix such that it is a tensor \(T^{\mu\nu}\) we’ll have what we need to write down Einstein’s equations for coupling matter and energy to gravity.

The tensor \(T^{\mu\nu}\) is the stress-energy tensor or the energy-momentum tensor. It is technically defined as the flux of four-momentum \(p^\mu\) across a surface at constant \(x^\nu\). In a fluid’s rest frame, the first component, \(T^{00}\), is therefore the flux of energy in time, which is the density multiplied by \(c^2\), and the \(T^{i0} = T^{0i}\) components are similarly the momentum density. The spatial components \(T^{ij}\) describe the stresses in the fluid, with the diagonal terms representing pressure in different directions and the off-diagonal terms describing shears. In all of the applications that we consider here, we can approximate the relevant fluids as a perfect fluid, which is a fluid that can be fully characterized by the rest-frame mass density \(\rho\) (equivalently, the rest-frame energy density \(\rho c^2\)) and the rest-frame isotropic pressure \(p\) (\(p\) without an index here represents the pressure rather than the momentum). Thus, in the rest frame of a perfect fluid, the stress-energy tensor is given by \begin{equation} T^{\mu\nu} = \begin{pmatrix} \rho c^2 & 0 & 0 & 0 \\ 0 & p & 0 & 0\\ 0 & 0 & p & 0\\ 0& 0& 0& p\end{pmatrix}\,. \end{equation} To figure out how we can write this in a tensorial manner, we can for example transform it to a moving reference frame by applying a Lorentz transformation. 
For the frame moving at speed \(v\) in the \(x\) direction from Equation (C.6), we get \begin{align} T^{\mu\nu} & = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0\\ 0 & 0 & 1 & 0\\ 0& 0& 0& 1\end{pmatrix}\begin{pmatrix} \rho c^2 & 0 & 0 & 0 \\ 0 & p & 0 & 0\\ 0 & 0 & p & 0\\ 0& 0& 0& p\end{pmatrix}\begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0\\ 0 & 0 & 1 & 0\\ 0& 0& 0& 1\end{pmatrix}\\ & = \gamma^2 \begin{pmatrix} \rho c^2 + p & -\beta [\rho c^2+ p] & 0 & 0 \\ -\beta [\rho c^2+ p]& \beta^2 [\rho c^2 + p] & 0 & 0\\ 0 & 0 & 0 & 0\\ 0& 0& 0& 0\end{pmatrix} +\begin{pmatrix} -p & 0 & 0 & 0 \\ 0& p & 0 & 0\\ 0 & 0 & p & 0\\ 0& 0& 0& p\end{pmatrix} \end{align} We can compare this to the tensor product of the four-velocity \(U^\mu\) with itself, which in this frame is \(U^\mu = \gamma(c,-v,0,0) = \gamma c(1,-\beta,0,0)\) (because the fluid was stationary in the original frame, its velocity in the moving frame is the opposite of the frame’s velocity) and we find that \begin{equation} T^{\mu\nu} = \left(\rho + {p\over c^2}\right)\,U^\mu U^\nu + p\,\eta^{\mu\nu}\,, \end{equation} where we use the Minkowski metric because we are considering only the Lorentz transformations from the special theory of relativity. This is now manifestly a tensor and it turns out to be the correct one, as one can check by applying different Lorentz transformations. The obvious generalization to curved spacetime of this is \begin{equation}\label{eq-gr-stressenergy-perfectfluid} T^{\mu\nu} = \left(\rho + {p\over c^2}\right)\,U^\mu U^\nu + p\,g^{\mu\nu}\,, \end{equation} that is, we simply replace the Minkowski metric with the general metric. Non-relativistic matter in galaxies and the Universe has a pressure that is much smaller than its energy density, and for such pressureless dust, the stress-energy tensor is simply \begin{equation}\label{eq-gr-stressenergy-dust} T^{\mu\nu} = \rho\,U^\mu U^\nu\,. \end{equation}
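
The claim that the boosted rest-frame matrix agrees with the covariant expression can be checked numerically. The sketch below transforms the rest-frame stress-energy tensor with the boost of Equation (C.6) and compares it, component by component, with \(\left(\rho + p/c^2\right)U^\mu U^\nu + p\,\eta^{\mu\nu}\). It assumes units with \(c = 1\), and the density, pressure, and boost speed are arbitrary illustrative numbers; the function names are hypothetical.

```python
import math

def boost_x(beta):
    """Lorentz boost along x acting on (ct, x, y, z)."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [[g, -beta * g, 0.0, 0.0],
            [-beta * g, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def transform(L, T):
    """T'^{mu nu} = Lambda^mu_alpha Lambda^nu_beta T^{alpha beta}."""
    return [[sum(L[m][a] * L[n][b] * T[a][b] for a in range(4) for b in range(4))
             for n in range(4)] for m in range(4)]

def perfect_fluid(rho, p, U, c=1.0):
    """(rho + p/c^2) U^mu U^nu + p eta^{mu nu} with eta = diag(-1, 1, 1, 1)."""
    eta = [-1.0, 1.0, 1.0, 1.0]
    return [[(rho + p / c**2) * U[m] * U[n] + (p * eta[m] if m == n else 0.0)
             for n in range(4)] for m in range(4)]

rho, p, beta = 2.0, 0.5, 0.6  # illustrative values in units with c = 1
rest = [[rho, 0, 0, 0], [0, p, 0, 0], [0, 0, p, 0], [0, 0, 0, p]]
boosted = transform(boost_x(beta), rest)
g = 1.0 / math.sqrt(1.0 - beta**2)
U = [g, -g * beta, 0.0, 0.0]  # fluid four velocity as seen from the moving frame
covariant = perfect_fluid(rho, p, U)
max_diff = max(abs(boosted[m][n] - covariant[m][n])
               for m in range(4) for n in range(4))
print(max_diff)  # zero up to rounding
```

Setting `p = 0.0` reduces both expressions to the dust form \(\rho\,U^\mu U^\nu\).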

Conservation of energy and momentum when a system is invariant under time or space translations is an important aspect of classical mechanics. In the special theory of relativity, this can be expressed as \begin{equation} \partial_\mu T^{\mu\nu} = 0\,. \end{equation} In curved spacetime, this is generalized to \begin{equation} \nabla_\mu T^{\mu\nu} = 0\,, \end{equation} which from the discussion above should come as no surprise. Generally, the stress-energy tensor is derived from whatever theory describes the matter or energy that one is considering. For example, the stress-energy tensor of electromagnetic fields is determined from electromagnetism. We won’t say more about that here, because for our purposes we can get by with the informal discussion of the stress-energy tensor presented here.

C.1.5. Einstein’s field equations

We are now finally at the point where we can state how macroscopic systems of matter and energy curve spacetime. As discussed in the previous few paragraphs, macroscopic systems are described by their stress-energy tensor \(T^{\mu\nu}\), while the curvature of spacetime is expressed using the Riemann tensor \(R^\delta_{\phantom{\delta}\epsilon\mu\nu}\) and its contractions in the form of the Ricci tensor \(R_{\mu\nu}\) and the Ricci scalar \(R\). Because we have to relate curvature to a \((2,0)\) tensor, the obvious choice is the Ricci tensor, but stating that \(R_{\mu\nu} \propto T_{\mu\nu}\) would violate the conservation of the stress-energy tensor (because \(g^{\mu\delta}\nabla_\delta R_{\mu\nu} = g^{\mu\delta}\nabla_\delta R\,g_{\mu\nu}/2\); see Equation C.39). Because of Equation (C.39), however, if the Einstein tensor is proportional to the stress-energy tensor, then the stress-energy tensor is conserved. The resulting equations are the Einstein field equations \begin{equation}\label{eq-gr-fieldeq} R_{\mu\nu}-{1\over 2}R\,g_{\mu\nu} = {8\pi G \over c^4}\,T_{\mu\nu}\,. \end{equation} Because each side is a symmetric \((0,2)\) tensor, this represents ten equations, which is why the plural ‘field equations’ is used. Because the metric is covariantly constant, the following more general version still leads to \(\nabla_\mu T^{\mu\nu} = 0\) \begin{equation}\label{eq-gr-fieldeq-general} R_{\mu\nu}-{1\over 2}R\,g_{\mu\nu} + \Lambda g_{\mu\nu}= {8\pi G \over c^4}\,T_{\mu\nu}\,. \end{equation} The constant \(\Lambda\) here is the cosmological constant. Einstein first introduced this term and then removed it, but one interpretation of the discovery that the expansion of the present-day Universe is accelerating (Riess et al. 1998; Perlmutter et al. 1999) is that \(\Lambda \neq 0\) and that Equation (C.51) therefore describes our Universe; all present observations are consistent with this interpretation. 
Because in the following we work with relatively simple stress-energy tensors, a useful alternative formulation of the field equations can be obtained by contracting Equation (C.51) with the metric, solving for the Ricci scalar, and plugging this into Equation (C.51); this gives \begin{equation}\label{eq-gr-fieldeq-general-alt} R_{\mu\nu}- \Lambda g_{\mu\nu}= {8\pi G \over c^4}\,\left(T_{\mu\nu} -{1 \over 2}T\,g_{\mu\nu}\right)\,, \end{equation} where \(T = g_{\mu\nu} T^{\mu\nu} = g^{\mu\nu} T_{\mu\nu}\).
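
The intermediate steps can be sketched as follows, using \(g^{\mu\nu}g_{\mu\nu} = 4\) in four spacetime dimensions and the shorthand \(\kappa = 8\pi G/c^4\), which is introduced here only for brevity. Contracting Equation (C.51) with \(g^{\mu\nu}\), solving for \(R\), and substituting the result back into Equation (C.51) gives \begin{align} g^{\mu\nu}\left(R_{\mu\nu}-{1\over 2}R\,g_{\mu\nu} + \Lambda g_{\mu\nu}\right) & = R - 2R + 4\Lambda = \kappa\,T \quad\Rightarrow\quad R = 4\Lambda - \kappa\,T\,,\nonumber\\ R_{\mu\nu} - {1\over 2}\left(4\Lambda - \kappa\,T\right)g_{\mu\nu} + \Lambda g_{\mu\nu} & = \kappa\,T_{\mu\nu} \quad\Rightarrow\quad R_{\mu\nu} - \Lambda g_{\mu\nu} = \kappa\left(T_{\mu\nu} - {1\over 2}T\,g_{\mu\nu}\right)\,.\nonumber \end{align}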

Our description of how the general theory of relativity changes Newtonian mechanics and gravitation is now complete. Newton’s second law is replaced by the geodesic equation, Equation (C.34), and Newton’s law of gravitation is replaced by the Einstein field equations, Equation (C.51). In the next section, we demonstrate that these equations reduce to Newton’s laws in the limit of small velocities and weak gravitational fields.


Dynamics and Astrophysics of Galaxies

By Jo Bovy
