15.2. The lensing and Fermat potentials

\label{sec-gravlens-pot}

15.2.1. The lensing potential

\label{sec-gravlens-lensingpot}

While the formalism introduced in the previous section in principle suffices to fully describe lensing of non-variable sources (variable sources have additional lensing effects due to the time delay that accompanies lensing), much understanding is gained by taking a step back and presenting the alternative formalism of lensing contained in the lensing and Fermat potentials. This formalism starts from the observation that we can swap the order of the integral and the gradient in the calculation of the reduced deflection angle obtained from combining Equations (15.1) and (15.6). As usual defining our coordinates such that the unperturbed light ray propagates along the \(z\) axis, we can write the reduced deflection angle as \begin{align}\label{eq-gravlens-reduced-deflect-asgradient} \boldsymbol{\alpha}(\boldsymbol{\theta}) & = \nabla_\perp\left[{D_{\mathrm{LS}}\over D_{\mathrm{S}}}\,{2\over c^2}\,\int \mathrm{d} z\,\Phi(D_{\mathrm{L}}\boldsymbol{\theta},z)\right] = \nabla_{\boldsymbol{\theta}}\left[{D_{\mathrm{LS}}\over D_{\mathrm{S}}\,D_{\mathrm{L}}}\,{2\over c^2}\,\int \mathrm{d} z\,\Phi(D_{\mathrm{L}}\boldsymbol{\theta},z)\right] \equiv \nabla_{\boldsymbol{\theta}} \psi(\boldsymbol{\theta})\,, \end{align} where \(D_{\mathrm{L}}\boldsymbol{\theta} = (x,y)\) and we have replaced the spatial gradient with the equivalent angular gradient. The (dimensionless) function \(\psi(\boldsymbol{\theta})\) that is defined by this equation is \begin{equation}\label{eq-gravlens-lensingpot} \psi(\boldsymbol{\theta}) = {D_{\mathrm{LS}}\over D_{\mathrm{S}}\,D_{\mathrm{L}}}\,{2\over c^2}\,\int \mathrm{d} z\,\Phi(D_{\mathrm{L}}\boldsymbol{\theta},z)\,. \end{equation} The function \(\psi(\boldsymbol{\theta})\) is therefore a scalar function that contains all of the information necessary to compute the two-dimensional deflection on the sky. From now on, we’ll typically drop the argument \(\boldsymbol{\theta}\), but remember that \(\psi(\boldsymbol{\theta})\) is a function of angle. 
The function \(\psi\) turns out to be related to the lens mass distribution in a simple manner, as can be seen by taking the two-dimensional Laplacian of \(\psi(\boldsymbol{\theta})\) \begin{align} \nabla^2_{\boldsymbol{\theta}}\psi & = {D_{\mathrm{LS}}\,D_{\mathrm{L}}\over D_{\mathrm{S}}}\,{2\over c^2}\,\int \mathrm{d} z\,\nabla^2_{\boldsymbol{\xi}} \Phi(D_{\mathrm{L}}\boldsymbol{\theta},z)\\ & = {D_{\mathrm{LS}}\,D_{\mathrm{L}}\over D_{\mathrm{S}}}\,{2\over c^2}\,\int \mathrm{d} z\,\left[\frac{1}{R}\,\frac{\partial}{\partial R}\left(R\,\frac{\partial \Phi(R,\phi,z)}{\partial R}\right) + \frac{1}{R^2}\,\frac{\partial^2 \Phi(R,\phi,z)}{\partial \phi^2}\right]\\ & = {D_{\mathrm{LS}}\,D_{\mathrm{L}}\over D_{\mathrm{S}}}\,{2\over c^2}\,\int \mathrm{d} z\,\left[4\pi G \rho(R,\phi,z)-\frac{\partial^2 \Phi(R,\phi,z)}{\partial z^2}\right]\\ & = 4\pi G\,{D_{\mathrm{LS}}\,D_{\mathrm{L}}\over D_{\mathrm{S}}}\,{2\over c^2}\,\Sigma(R,\phi)\,,\label{eq-gravlens-lensingpot-laplacian-1} \end{align} where we converted the angular Laplacian \(\nabla^2_{\boldsymbol{\theta}}\) to the spatial one \(\nabla^2_{\boldsymbol{\xi}}\) (which brings in a factor of \(D_{\mathrm{L}}^2\)), we used Equation (A.15) to write \(\nabla^2_{\boldsymbol{\xi}}\) in polar coordinates, and we used the Poisson equation in cylindrical coordinates (Equation 7.21) to replace the two-dimensional Laplacian with the density. The integral of the second term in the last step is zero because we assume that \(\partial \Phi / \partial z \rightarrow 0\) as \(z \rightarrow \pm\infty\) in all galactic applications of lensing (this assumption is also used to derive the deflection angle in GR). We thus see that the relation between the Laplacian of \(\psi\) and the surface density looks like a two-dimensional version of the Poisson equation and \(\psi\) is therefore known as the lensing potential or the deflection potential.
Equation (15.20) is generally written as \begin{align}\label{eq-gravlens-2dpoisson} \nabla^2_{\boldsymbol{\theta}}\psi & = {\partial^2 \psi \over \partial \theta_1\partial \theta_1} + {\partial^2 \psi \over \partial \theta_2\partial \theta_2} = 2\kappa(\boldsymbol{\theta}) = 2\,{\Sigma(R,\phi) \over \Sigma_{\mathrm{crit}}}\,, \end{align} where \begin{align}\label{eq-gravlens-convergence} \kappa(\boldsymbol{\theta}) & = {\Sigma(R,\phi) \over \Sigma_{\mathrm{crit}}} = {4\pi G \over c^2}\,{D_{\mathrm{LS}}\,D_{\mathrm{L}}\over D_{\mathrm{S}}}\,\Sigma(R,\phi)\,, \end{align} is the convergence or the dimensionless surface density and \begin{align}\label{eq-gravlens-criticalsurfdens} \Sigma_{\mathrm{crit}} & = {c^2 \over 4\pi G}\,{D_{\mathrm{S}} \over D_{\mathrm{LS}}\,D_{\mathrm{L}}}\,, \end{align} is the critical surface density. The critical surface density plays a significant role in gravitational lensing that we will discuss below.

For any given gravitational potential, we can compute the lensing potential using Equation (15.16). We implement this in the following function:

[11]:
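import numpy
from scipy import integrate
from astropy import units as u
from astropy.cosmology import Planck18 as cosmo
from astropy.constants import c
from galpy.potential import evaluatePotentials
def lensing_potential(Pot,theta,zlens,zsource,zmax=numpy.inf):
    # Lensing potential psi(theta) from Equation (15.16), returned in arcsec^2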
    DL= cosmo.angular_diameter_distance(zlens)
    cosmo_fac= cosmo.angular_diameter_distance_z1z2(zlens,zsource)\
        /cosmo.angular_diameter_distance(zsource)/DL
    # Some gymnastics to deal with units
    R= (numpy.sqrt(theta[0]**2.+theta[1]**2.)*DL)\
        .to(u.kpc,equivalencies=u.dimensionless_angles())
    R= (R/(Pot[0]._ro if isinstance(Pot,list) else Pot._ro)/u.kpc)\
         .to_value(u.dimensionless_unscaled)
    phi= numpy.arctan2(theta[1],theta[0])
    # Result includes more gymnastics to deal with units
    return (cosmo_fac*2./c**2.
            *integrate.quad(lambda z: evaluatePotentials(Pot,R,z,phi=phi,
                                                         use_physical=False),
                            -zmax,zmax)[0]
            *(Pot[0]._ro*Pot[0]._vo**2. if isinstance(Pot,list)
              else Pot._ro*Pot._vo**2.)*u.kpc*u.km**2/u.s**2)\
       .to(u.arcsec**2,equivalencies=u.dimensionless_angles())

For the \(M_\mathrm{vir} = 10^{13}\,M_\odot\) NFW halo that we considered in the previous section, the lensing potential only depends on the distance \(|\boldsymbol{\theta}|\) from the center—this is more generally the case for any axially-symmetric lens mass distribution—and is shown in Figure 15.4.

[12]:
from galpy.potential import NFWPotential
np= NFWPotential(mvir=10,conc=6.37,
                 H=67.1,overdens=200.,wrtcrit=True,
                 ro=8.,vo=220.)
zlens, zsource= 0.5, 1.
DL= cosmo.angular_diameter_distance(zlens)
thetas= numpy.linspace(-np.rvir(H=67.1,overdens=200.,wrtcrit=True)/DL,
                       np.rvir(H=67.1,overdens=200.,wrtcrit=True)/DL,101)\
        .to(u.arcsec,equivalencies=u.dimensionless_angles())
figure(figsize=(6,4))
plot(thetas,
     u.Quantity([lensing_potential(np,u.Quantity([theta,0.*u.arcsec]),
                                   zlens,zsource).to(u.arcsec**2)
                 for theta in thetas]))
xlabel(r'$|\boldsymbol{\theta}|\,(\mathrm{arcsec})$')
ylabel(r'$\psi\,(\mathrm{arcsec}^2)$');
../_images/chapters_III-04.-Gravitational-Lensing_45_0.svg

Figure 15.4: The lensing potential for the NFW halo from Figure 15.2.

Note that we express the lensing potential in units of arcseconds squared here for reasons that will soon become clear.

Because the lensing potential satisfies the two-dimensional Poisson equation, we can derive an expression for the lensing potential in terms of the surface density of the lens. Similar to how the solution of the Poisson equation for a point-mass density in three dimensions is \(-GM/r\) (see Chapter 2.1), the solution of the two-dimensional Poisson equation for a two-dimensional point (a wire in three dimensions) is \(\propto \ln R\). By decomposing the surface density into points, we can therefore obtain the lensing potential as (cf. Equation 2.12) \begin{equation}\label{eq-gravlens-lenspot-from-kappa} \psi (\boldsymbol{\theta})= {1 \over \pi}\,\iint \mathrm{d}^2\boldsymbol{\theta}'\,\kappa(\boldsymbol{\theta}')\,\ln|\boldsymbol{\theta}-\boldsymbol{\theta}'|\,. \end{equation} Because the reduced deflection angle is the gradient of \(\psi\), we can obtain it as \begin{align}\label{eq-gravlens-reduceddeflect-from-kappa} \boldsymbol{\alpha}(\boldsymbol{\theta}) &= \nabla_{\boldsymbol{\theta}} \psi(\boldsymbol{\theta}) = {1 \over \pi}\,\iint \mathrm{d}^2\boldsymbol{\theta}'\,\kappa(\boldsymbol{\theta}')\,{\boldsymbol{\theta}-\boldsymbol{\theta}' \over |\boldsymbol{\theta}-\boldsymbol{\theta}'|^2}\,. \end{align}
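As a rough numerical illustration of the last expression, the double integral can be replaced by a midpoint sum over a pixelized convergence map. The sketch below is only an assumption-laden toy: the function name `deflection_from_kappa`, the grid, and the test lens (a uniform disk with \(\kappa = 1\) inside \(|\boldsymbol{\theta}| < 1\) in arbitrary angular units) are chosen for illustration and are not part of the text. For such a disk, the two-dimensional analog of Newton's theorems gives \(|\boldsymbol{\alpha}| = |\boldsymbol{\theta}|\) inside the disk and \(|\boldsymbol{\alpha}| = 1/|\boldsymbol{\theta}|\) outside.

```python
import numpy

def deflection_from_kappa(kappa, centers, theta):
    """Reduced deflection angle at theta from a convergence map on a square grid.

    kappa[i,j] is the convergence at (centers[i], centers[j]); the double
    integral over theta' is replaced by a midpoint sum over the grid cells.
    The evaluation point theta must not coincide with a cell center.
    """
    h = centers[1] - centers[0]                      # cell size
    t1p, t2p = numpy.meshgrid(centers, centers, indexing='ij')
    d1, d2 = theta[0] - t1p, theta[1] - t2p          # components of theta - theta'
    weight = kappa * h**2 / (numpy.pi * (d1**2 + d2**2))
    return numpy.array([numpy.sum(weight * d1), numpy.sum(weight * d2)])

# Illustrative lens (an assumption): uniform disk with kappa = 1 for |theta| < 1
centers = (numpy.arange(300) + 0.5) * 0.01 - 1.5
t1p, t2p = numpy.meshgrid(centers, centers, indexing='ij')
kappa = (t1p**2 + t2p**2 < 1.0).astype(float)

# Evaluate at cell corners, so nearby cells contribute symmetrically
alpha_in = deflection_from_kappa(kappa, centers, (0.5, 0.0))
alpha_out = deflection_from_kappa(kappa, centers, (1.2, 0.0))
print(alpha_in, alpha_out)
```

The cost of this direct sum scales with the number of pixels for every evaluation point, which is one way to see why the double integral becomes expensive for general lenses.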

We can also derive an alternative expression to Equation (15.1) for the deflection angle \(\hat{\boldsymbol{\alpha}}\) \begin{equation}\label{eq-gravlens-deflect-from-kappa} \hat{\boldsymbol{\alpha}} = {4G\over c^2}\,\iint \mathrm{d}^2\boldsymbol{\xi}'\,\Sigma(\boldsymbol{\xi}')\,{\boldsymbol{\xi}-\boldsymbol{\xi}' \over |\boldsymbol{\xi}-\boldsymbol{\xi}'|^2}\,, \end{equation} where \(\boldsymbol{\xi} = D_L\,\boldsymbol{\theta}\). Thus, this discussion of the lensing potential makes it clear that the gravitational-lensing phenomenon is entirely determined by the density integrated along the line of sight, the surface density \(\Sigma\), or equivalently, the convergence \(\kappa\). For axially-symmetric lenses with \(\Sigma(R,\phi) \equiv \Sigma(R)\), these expressions simplify to \begin{align}\label{eq-gravlens-lensingpot-from-kappa-axially-symm} \psi (\boldsymbol{\theta}) & = 2\,\int_0 ^{|\boldsymbol{\theta}|}\mathrm{d}\theta'\,\theta'\,\kappa(\theta')\,\ln\left({|\boldsymbol{\theta}|\over \theta'}\right)\,,\\ \boldsymbol{\alpha}(\boldsymbol{\theta}) & = {2\over |\boldsymbol{\theta}|}\,\hat{\boldsymbol{\theta}}\,\int_0^{|\boldsymbol{\theta}|}\mathrm{d}\theta'\,\theta'\,\kappa(\theta')\,,\label{eq-gravlens-reduceddeflect-from-kappa-axially-symm}\\ \hat{\boldsymbol{\alpha}} & = {8\pi G\over c^2\,|\boldsymbol{\xi}|}\,\hat{\boldsymbol{\xi}}\,\int_0^{ |\boldsymbol{\xi}|}\mathrm{d}\xi'\,\xi'\,\Sigma(\xi') = {4G M(< |\boldsymbol{\xi}|) \over c^2\,|\boldsymbol{\xi}|}\,\hat{\boldsymbol{\xi}}\,,\label{eq-gravlens-deflect-from-kappa-axially-symm} \end{align} that is, the deflection angles are along the vector connecting the lens’ center to the point of closest approach with unit vectors \(\hat{\boldsymbol{\theta}}\) or \(\hat{\boldsymbol{\xi}}\) depending on whether we use angular or spatial coordinates. 
Similar to how the gravitational force of a spherical mass distribution is fully determined by the enclosed mass, the deflection angle of an axially-symmetric lens is fully determined by the mass \(M(< |\boldsymbol{\xi}|)\) enclosed within the impact parameter.
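The axially-symmetric expressions are one-dimensional integrals that are easy to evaluate numerically. The following minimal sketch (the helper name `axisym_lensing` and the midpoint-rule discretization are our own choices for illustration) applies them to a singular isothermal sphere with \(\kappa(\theta) = \theta_\mathrm{E}/(2\theta)\), for which the integrals can be done by hand to give \(|\boldsymbol{\alpha}| = \theta_\mathrm{E}\) and \(\psi = \theta_\mathrm{E}\,|\boldsymbol{\theta}|\).

```python
import numpy

def axisym_lensing(kappa, theta, n=20000):
    """Return (psi, |alpha|) at radius theta for an axially-symmetric kappa(theta')."""
    edges = numpy.linspace(0.0, theta, n + 1)
    tp = 0.5 * (edges[1:] + edges[:-1])      # midpoints, so theta' = 0 is never hit
    dtp = theta / n
    psi = 2.0 * numpy.sum(tp * kappa(tp) * numpy.log(theta / tp)) * dtp
    alpha = 2.0 / theta * numpy.sum(tp * kappa(tp)) * dtp
    return psi, alpha

theta_E = 1.0                                # Einstein radius, arbitrary angular units

def sis_kappa(t):
    """Convergence of a singular isothermal sphere."""
    return theta_E / (2.0 * t)

psi, alpha = axisym_lensing(sis_kappa, 0.7)
print(psi, alpha)                            # compare with theta_E*0.7 and theta_E
```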

While evaluating the lensing potential for a given mass distribution is straightforward using Equation (15.24) or Equation (15.27), for non-axially-symmetric lenses computing this double integral is generally computationally expensive, especially because lensing analyses of observational data often demand running through large numbers of lens mass distributions. This is similar to the problem we faced in Chapters 7 and 12 when calculating the gravitational potential for flattened or triaxial mass distributions, and the ways of dealing with it are similar to those we considered in those chapters. As in Chapter 7.2, we can circumvent Equation (15.24) altogether by simply positing a form for the lensing potential and computing the resulting convergence from Equation (15.21) (e.g., Blandford & Kochanek 1987). This makes all operations on the lensing potential, such as computing the reduced deflection angle or the magnification matrix defined below, simple. But it has the disadvantage that one cannot directly specify the convergence and that the resulting convergence may be unphysical. When introducing flattening in the lensing potential, for large values of the flattening, the surface density typically develops a dumbbell shape and it may become negative (similar to what happens for the flattened logarithmic potential; see Chapter 7.2.3). Another option is to work with flattened mass models for which the lensing potential, deflection angle, and other lensing quantities can be computed quickly when care is taken with the numerical implementation. This is, for example, the case for a convergence of the form \begin{equation} \kappa(m) \propto \left({a \over m}\right)^t\,,\quad m^2 = \theta_1^2 + {\theta_2^2 \over q^2}\,, \end{equation} where \(q\) is the axis ratio that sets the flattening. For \(t = 1\), this corresponds to a flattened singular-isothermal sphere. 
For this convergence, all lensing quantities can be computed using fast series approximations of the relevant integrals (e.g., Barkana 1998; Tessore & Metcalf 2015). It is also possible to expand the convergence into multipoles as we did in Chapter 12.3.1 for gravitational potentials, but the procedure for lensing is much simpler, because the potential and convergence are only two-dimensional.
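To make the first approach concrete, the sketch below posits a lensing potential and obtains the convergence as half its Laplacian using central finite differences. The specific potential, a softened elliptical form \(\psi = \theta_\mathrm{E}\sqrt{s^2+\theta_1^2+\theta_2^2/q^2}\), and all function names and parameter values are assumptions made purely for illustration; the closed-form convergence used for comparison was worked out by hand for this particular potential.

```python
import numpy

def convergence_from_potential(psi, t1, t2, h=1e-3):
    """kappa = (1/2) Laplacian of psi, using a five-point central-difference stencil."""
    lap = (psi(t1 + h, t2) + psi(t1 - h, t2)
           + psi(t1, t2 + h) + psi(t1, t2 - h) - 4.0 * psi(t1, t2)) / h**2
    return 0.5 * lap

# Illustrative potential (an assumption, not a model from the text)
theta_E, s, q = 1.0, 0.1, 0.8

def elliptical_potential(t1, t2):
    return theta_E * numpy.sqrt(s**2 + t1**2 + t2**2 / q**2)

def kappa_exact(t1, t2):
    """Analytic (1/2) Laplacian of elliptical_potential, derived by hand."""
    m = numpy.sqrt(s**2 + t1**2 + t2**2 / q**2)
    return 0.5 * theta_E * ((1.0 + 1.0 / q**2) * s**2 + (t1**2 + t2**2) / q**2) / m**3

k_num = convergence_from_potential(elliptical_potential, 0.5, 0.3)
print(k_num, kappa_exact(0.5, 0.3))
```

Evaluating `convergence_from_potential` on a grid is a quick way to inspect whether a posited potential yields a sensible (positive, not overly dumbbell-shaped) surface density before using it in a lens model.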

15.2.2. The Fermat potential and the time delay

\label{sec-gravlens-fermatpot}

The lensing potential allows for an alternative statement of the (single-plane) lensing equation (15.5) by writing it as \begin{equation}\label{eq-gravlens-lensing-eq-asgradient} \nabla_{\boldsymbol{\theta}}\left[{1\over 2}\left|\boldsymbol{\theta}-\boldsymbol{\beta}\right|^2-\psi(\boldsymbol{\theta})\right] = 0\,, \end{equation} because \(\nabla_{\boldsymbol{\theta}} \psi = \boldsymbol{\alpha}\). Thus images located at \(\boldsymbol{\theta}\) for a source located at \(\boldsymbol{\beta}\) form where the scalar function in the square brackets is extremized. This scalar function is known as the Fermat potential defined by \begin{equation}\label{eq-gravlens-fermat} \tau(\boldsymbol{\theta};\boldsymbol{\beta}) = {1\over 2}\left|\boldsymbol{\theta}-\boldsymbol{\beta}\right|^2-\psi(\boldsymbol{\theta})\,. \end{equation}

As an example, we compute the Fermat potential for an elliptical galaxy represented as a spherical perfect ellipsoid from Chapter 12.2.3 (\(b=c=1\)) with a mass of \(5\times 10^{11}\,M_\odot\) and a scale radius of \(a=4\,\mathrm{kpc}\) and show the result in Figure 15.5. For three different source positions \(\boldsymbol{\beta}\), indicated by the dashed line, we can compute \(\tau(\boldsymbol{\theta};\boldsymbol{\beta})\) along the line connecting \(\boldsymbol{\beta}\) and the origin where we know the images to lie for an axially-symmetric mass distribution.

[13]:
def fermat_potential(Pot,beta,theta,zlens,zsource,zmax=numpy.inf):
    return 0.5*numpy.sum((beta-theta)**2.)\
        -lensing_potential(Pot,theta,zlens,zsource,zmax=zmax)
def extrema(x,y):
    # Find position of extrema in y
    indx= ((y > numpy.roll(y,1))*(y > numpy.roll(y,-1)))\
         +((y < numpy.roll(y,1))*(y < numpy.roll(y,-1)))
    # Remove edges
    ii= numpy.arange(len(x))
    indx*= (ii != 0)*(ii != (len(x)-1))
    return (x[indx],y[indx])
from galpy.potential import PerfectEllipsoidPotential
pep= PerfectEllipsoidPotential(amp=5*10.**11*u.Msun,a=4*u.kpc,b=1.,c=1.)
zlens, zsource= 0.5, 1.
betas= [0.1,0.7,1.9]
figure(figsize=(11,4))
for ii,beta in enumerate(betas):
    subplot(1,len(betas),ii+1)
    if numpy.fabs(beta) < 0.5:
        thetas= numpy.linspace(-1.5,1.5,30)*u.arcsec
    elif numpy.fabs(beta) < 1.5:
        thetas= numpy.linspace(-4.01,4.01,30)*u.arcsec
    else:
        thetas= numpy.linspace(-5.01,5.01,30)*u.arcsec
    ploty= u.Quantity([fermat_potential(pep,u.Quantity([beta,0.])*u.arcsec,
                                      u.Quantity([theta,0.*u.arcsec]),
                                      zlens,zsource).to(u.arcsec**2)
                       for theta in thetas])
    line= plot(thetas,ploty)
    plot(*extrema(thetas,ploty),'o',color=line[0].get_color())
    axvline(beta,color='k',ls='--')
    xlabel(r'$|\boldsymbol{\theta}|\,(\mathrm{arcsec})$')
    if ii == 0:
        ylabel(r'$\tau\,(\mathrm{arcsec}^2)$');
../_images/chapters_III-04.-Gravitational-Lensing_50_0.svg

Figure 15.5: The Fermat potential for a spherical perfect ellipsoid with \(M=5\times 10^{11}\,M_\odot\) for three different source positions.

We see that the Fermat potential has three extrema labeled with dots for a source position close to the center of the lensing potential, corresponding to three images. For source positions further from the center, only a single extremum (a minimum) is present, but as long as the source remains close to the center, the image position is significantly offset from the source position. As the source position gets further from the center, the \(|\boldsymbol{\theta}-\boldsymbol{\beta}|^2/2\) term in the Fermat potential starts to dominate and the single image is close to the source position.
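The same behavior can be reproduced with a lens whose Fermat potential is analytic. The sketch below uses a softened isothermal potential \(\psi = \theta_\mathrm{E}\sqrt{\theta_c^2+\theta^2}\) with \(\theta_\mathrm{E} = 1''\) and \(\theta_c = 0.2''\); this lens, its parameters, and the helper names are assumptions for illustration and are not the perfect-ellipsoid model of Figure 15.5. Images along the source-lens axis are located as sign changes of \(\partial\tau/\partial\theta\) on a fine grid.

```python
import numpy

def dtau_dtheta(theta, beta, theta_E=1.0, theta_c=0.2):
    """Derivative of the Fermat potential along the source-lens axis for the
    assumed potential psi = theta_E*sqrt(theta_c^2 + theta^2)."""
    return theta - beta - theta_E * theta / numpy.sqrt(theta_c**2 + theta**2)

def image_positions(beta, thetas=numpy.linspace(-5.0, 5.0, 20001)):
    """Extrema of tau, found as sign changes of its derivative and refined
    by linear interpolation between neighboring grid points."""
    d = dtau_dtheta(thetas, beta)
    idx = numpy.nonzero(numpy.sign(d[:-1]) * numpy.sign(d[1:]) < 0)[0]
    return thetas[idx] - d[idx] * (thetas[idx + 1] - thetas[idx]) / (d[idx + 1] - d[idx])

for beta in [0.1, 0.7, 1.9]:
    print(beta, image_positions(beta))
```

For this toy lens the source at \(\beta = 0.1''\) has three extrema, while the two more distant source positions have a single one, mirroring the qualitative picture described above.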

The Fermat potential owes its name to its relation to the time delay experienced by light along its deflected trajectory. As discussed in Appendix C.3, in gravitational fields light is delayed compared to its unperturbed path for two reasons: (i) the deflected path is longer than the unperturbed path, leading to a geometrical time delay \(\Delta t_\mathrm{geom}\); and (ii) time appears to slow down along light’s trajectory in the presence of a gravitational field, which gives the gravitational time delay \(\Delta t_\mathrm{Shapiro}\) (first discussed by Shapiro 1964). In the weak gravitational fields of galactic gravitational lensing, we can approximate these as (i) the geometrical time delay being equal to the extra length of the deflected path relative to the unperturbed path divided by the speed of light \(c\) and (ii) the gravitational time delay being that accumulated along the unperturbed path. To compute the geometric time delay, we can use the extension of the basic lensing diagram shown in Figure 15.6.

[14]:
def angle_plot(line1,line2,offset=1,color='k',origin=(0, 0),
               len_x_axis=1,len_y_axis=1,
               dir1=1,dir2=1):
    # Draw angle arc between two lines
    # Edited from https://stackoverflow.com/a/25228427 and
    # https://gist.github.com/battlecook/0c0bdb7097ec7c8fa160e342b1bf51ef
    from matplotlib.patches import Arc
    # Angle between line1 and x-axis
    l1xy= line1.get_xydata()
    slope1= (l1xy[1][1]-l1xy[0][1])/float(l1xy[1][0]-l1xy[0][0])
    angle1= numpy.degrees(numpy.arctan(slope1))+90.*(dir1-1)
    # Angle between line2 and x-axis
    l2xy= line2.get_xydata()
    slope2= (l2xy[1][1]-l2xy[0][1])/float(l2xy[1][0]-l2xy[0][0])
    angle2= numpy.degrees(numpy.arctan(slope2))+90.*(dir2-1)
    # Angle between them
    theta1= numpy.amin([angle1,angle2])
    theta2= numpy.amax([angle1,angle2])
    angle= theta2-theta1
    return Arc(origin,len_x_axis*offset,len_y_axis*offset,
               angle=0,theta1=theta1,theta2=theta2,
               color=color,label=str(angle)+u"\u00b0")
def line_between_points(o1,o2,*args,**kwargs):
    return plot([o1[0],o2[0]],[o1[1],o2[1]],*args,**kwargs)
def extend_line(o1,o2,xnew,*args,**kwargs):
    """Extend the line between o1 and o2 to xnew"""
    slope= (o2[1]-o1[1])/(o2[0]-o1[0])
    ynew= slope*(xnew-o1[0])+o1[1]
    return line_between_points(o1,[xnew,ynew],*args,**kwargs)
figure(figsize=(11,5.5))
# Source, lens, observer position
source= [0.,.5]
lens= [1.1,.75]
observer= [2.5,0.]
line_os= line_between_points(source,observer,'k--')[0]
line_sl= line_between_points(source,lens,'k-')[0]
line_ol= line_between_points(observer,lens,'k-')[0]
gca().add_patch(angle_plot(line_ol,line_os,-1.5,origin=observer))
line_beta= line_between_points([0,0],observer,'k:',lw=2.)[0]
gca().add_patch(angle_plot(line_beta,line_os,-2.,origin=observer))
gca().add_patch(angle_plot(line_beta,line_ol,-.75,origin=observer))
line_ex= extend_line(lens,observer,source[0],'k:',lw=2.)[0]
# Also compute the apparent position of the source, re-use 'extend-line' code
# We put the apparent source at the position on the sky and
# at the correct geometric time-delay
slope= (lens[1]-observer[1])/(lens[0]-observer[0])
apparent_x= 0.105
apparent= [apparent_x,slope*(apparent_x-observer[0])+observer[1]]
gca().add_patch(angle_plot(line_ex,line_sl,-0.5,origin=lens))
line_sa= line_between_points(source,apparent,'k--')[0]
# Now draw the line between the source and the position along the
# observer-apparent line that is at the same distance from the
# observer as the unperturbed source
# This position is set to make this the case
x_eql= 0.235
eql= [x_eql,slope*(x_eql-observer[0])+observer[1]]
line_se= line_between_points(source,eql,'k-.')[0]
gca().add_patch(angle_plot(line_os,line_se,0.5,origin=source))
gca().add_patch(angle_plot(line_os,line_se,0.55,origin=source))
from matplotlib.lines import Line2D
line_eo= Line2D([observer[0],eql[0]],[observer[1],eql[1]])
gca().add_patch(angle_plot(line_eo,line_se,0.5,origin=eql,dir2=-1))
gca().add_patch(angle_plot(line_eo,line_se,0.55,origin=eql,dir2=-1))
# Also add angles at source plane
gca().add_patch(angle_plot(line_sl,line_sa,0.2,origin=source))
gca().add_patch(angle_plot(line_sl,line_sa,0.22,origin=source))
gca().add_patch(angle_plot(line_sl,line_sa,0.24,origin=source))
gca().add_patch(angle_plot(line_ex,line_sa,0.2,origin=apparent,dir2=-1))
gca().add_patch(angle_plot(line_ex,line_sa,0.22,origin=apparent,dir2=-1))
gca().add_patch(angle_plot(line_ex,line_sa,0.24,origin=apparent,dir2=-1))
gca().add_patch(angle_plot(line_se,line_sa,0.4,origin=source))
# Time delay
annotate(text='',xy=(apparent[0],apparent[1]+0.075),
         xytext=(eql[0]+0.04,eql[1]+0.0475),
         arrowprops=dict(arrowstyle='<|-|>',color='k'))
# Angle labels
galpy_plot.text(0.8,0.8,r'$\hat{\alpha}$',ha='center',va='center',fontsize=18.)
galpy_plot.text(1.74,0.3,r'$\alpha$',ha='center',va='center',fontsize=18.)
galpy_plot.text(1.45,0.1,r'$\beta$',ha='center',va='center',fontsize=18.)
galpy_plot.text(2.1,0.13,r'$\theta$',ha='center',va='center',fontsize=18.)
galpy_plot.text(-0.1,0.85,r'$\delta$',ha='center',va='center',fontsize=18.)
plot([-0.05,0.05],[0.825,0.725],'k-',lw=0.8)
# Some dots
plot([source[0],lens[0],observer[0],lens[0],apparent[0],eql[0]],
     [source[1],lens[1],observer[1],0.,apparent[1],eql[1]],
    'ko',ms=10.)
# Impact parameter
annotate(text='',xy=(lens[0],0.02),xytext=(lens[0],lens[1]-0.02),
         arrowprops=dict(arrowstyle='<|-|>',color='k'))
galpy_plot.text(lens[0],lens[1]/2.,r'$\xi$',
                ha='center',va='center',fontsize=18.,backgroundcolor='w')
# Distance arrows
annotate(text='',xy=(source[0],-0.1),xytext=(lens[0],-0.1),
         arrowprops=dict(arrowstyle='<|-|>',color='k'))
galpy_plot.text(0.5*(source[0]+lens[0]),-0.1,r'$D_{\mathrm{LS}}$',
                ha='center',va='center',fontsize=18.,backgroundcolor='w')
annotate(text='',xy=(observer[0],-0.1),xytext=(lens[0],-0.1),
         arrowprops=dict(arrowstyle='<|-|>',color='k'))
galpy_plot.text(0.5*(observer[0]+lens[0]),-0.1,r'$D_{\mathrm{L}}$',
                ha='center',va='center',fontsize=18.,backgroundcolor='w')
annotate(text='',xy=(source[0],-0.175),xytext=(observer[0],-0.175),
         arrowprops=dict(arrowstyle='<|-|>',color='k'))
galpy_plot.text(0.5*(source[0]+observer[0]),-0.175,r'$D_{\mathrm{S}}$',
                ha='center',va='center',fontsize=18.,backgroundcolor='w')
# Observer, source, lens annotations
galpy_plot.text(source[0],source[1]-0.15,r'$\mathrm{Source}$',
                ha='center',va='bottom',fontsize=18.)
galpy_plot.text(lens[0]-0.1,0.05,r'$\mathrm{Lens}$',
                ha='center',va='bottom',fontsize=18.)
galpy_plot.text(observer[0],observer[1]+0.1,r'$\mathrm{Observer}$',
                ha='center',va='bottom',fontsize=18.)
galpy_plot.text(apparent[0]-0.1,apparent[1]-0.1,r'$\mathrm{A}$',
                ha='center',va='bottom',fontsize=18.)
galpy_plot.text(eql[0]+0.15,eql[1]-0.05,r'$\mathrm{EQ}$',
                ha='center',va='bottom',fontsize=18.)
xlim(-0.28,2.6)
ylim(-.3,1.5)
gca().axis('off');
../_images/chapters_III-04.-Gravitational-Lensing_52_0.svg

Figure 15.6: The geometric time delay in gravitational lensing.

Compared to the previous version in Figure 15.3, we have extended the line towards the apparent position of the source \(\boldsymbol{\theta}\) and added the following two points along that line: point ‘EQ’ that is at the same distance from the observer as the source along the unperturbed trajectory and point ‘A’ that is at the same distance from the observer as the source along the perturbed trajectory. The geometric difference in the path length is therefore the distance between ‘EQ’ and ‘A’. This definition implies that the angles indicated by the double and triple arcs are pairwise equal. Working in a flat geometry and in the limit of small angles, the difference in path length is then equal to the distance between the source and ‘A’ multiplied by the small angle \(\delta\). Consideration of all the angles involved shows that \(\delta = (\hat{\alpha}-\alpha)/2 = \hat{\alpha}/2\times D_\mathrm{L}/D_{\mathrm{S}}\) and that the distance between the source and ‘A’ equals \(\hat{\alpha}\,D_{\mathrm{LS}}\). The geometric time delay is therefore \begin{equation} \Delta t_\mathrm{geom} = {1\over 2c}\,{D_\mathrm{L}\,D_{\mathrm{LS}} \over D_{\mathrm{S}}} |\hat{\boldsymbol{\alpha}}|^2\,= {1\over 2c}\,{D_\mathrm{L}\,D_{\mathrm{S}} \over D_{\mathrm{LS}}} |\boldsymbol{\theta}-\boldsymbol{\beta}|^2\,. \end{equation} We have derived this equation assuming a spatially-flat background, but galaxy or cluster lensing happens at cosmological distances where the Universe is expanding and the Universe may be spatially curved on those scales. Accounting for this is simple, because physically the time delay happens when the light passes through the lens and gravitational lenses are small compared to the scale on which the Universe expands or may be curved. 
For cosmological applications of lensing, the above equation therefore holds in the frame of the lens and to convert the time delay in the lens frame to that of the observer, we need to account for the expansion of the Universe with its accompanying \((1+z)\) lengthening of time intervals \begin{equation} \Delta t_\mathrm{geom} = {1+z_\mathrm{L}\over 2c}\,{D_\mathrm{L}\,D_{\mathrm{LS}} \over D_{\mathrm{S}}} |\hat{\boldsymbol{\alpha}}|^2\,= {1+z_\mathrm{L}\over 2c}\,{D_\mathrm{L}\,D_{\mathrm{S}} \over D_{\mathrm{LS}}} |\boldsymbol{\theta}-\boldsymbol{\beta}|^2\,, \end{equation} where \(z_\mathrm{L}\) is the redshift of the lens. As shown in Appendix C.3, the gravitational time delay is given by Equation (C.103), which in our coordinate system and accounting for the cosmological time delay becomes \begin{align} c\Delta t_\mathrm{Shapiro} & = - {2(1+z_{\mathrm{L}}) \over c^2}\int \mathrm{d} z \,\Phi(D_{\mathrm{L}}\boldsymbol{\theta},z)=-(1+z_{\mathrm{L}})\,{D_{\mathrm{S}}\,D_{\mathrm{L}} \over D_{\mathrm{LS}}}\,\psi(\boldsymbol{\theta})\,, \end{align} where we have used the definition of \(\psi\) in Equation (15.16). The total time delay is therefore given by \begin{align} \Delta t_\mathrm{lensing} & = {1+z_\mathrm{L}\over c}\,{D_\mathrm{L}\,D_{\mathrm{S}} \over D_{\mathrm{LS}}} \left[{1\over 2}|\boldsymbol{\theta}-\boldsymbol{\beta}|^2 -\psi(\boldsymbol{\theta})\right]= {D_{\Delta t}\over c}\,\tau(\boldsymbol{\theta};\boldsymbol{\beta})\,,\label{eq-gravlens-fermat-vs-timedelay} \end{align} where the time-delay distance \(D_{\Delta t}\) is defined by \begin{equation} D_{\Delta t} = (1+z_\mathrm{L})\,{D_\mathrm{L}\,D_{\mathrm{S}} \over D_{\mathrm{LS}}}\,. \end{equation} Equation (15.36) demonstrates that the total time delay is proportional to the Fermat potential. 
Comparing this to Equation (15.31), we can therefore state the lensing equation as (Schneider 1985; Blandford & Narayan 1986) \begin{equation}\label{eq-gravlens-lensing-eq-fermat} \nabla_{\boldsymbol{\theta}}\left(\Delta t_\mathrm{lensing}\right)= 0\,. \end{equation} This means that images form at extrema of the time delay surface, in the same way as they do in geometrical optics according to Fermat’s principle (hence, Fermat potential). In fact, Fermat’s principle holds in the general theory of relativity and we could have derived the lensing equation from it in the first place (Kovner 1990; Nityananda & Samuel 1992).

To get a sense of the magnitude of the time delays involved in gravitational lensing, we can estimate both the geometric and gravitational time delays for a typical lens. Assuming an elliptical galaxy modeled as a singular isothermal sphere with \(\sigma \approx 200\,\mathrm{km\,s}^{-1}\), we estimated above that \(|\hat{\boldsymbol{\alpha}}| \approx 1''\) and because for typical lens systems \(D_{\mathrm{LS}}/D_{\mathrm{S}}\approx 0.5\), similarly \(|\boldsymbol{\alpha}| = |\boldsymbol{\theta}-\boldsymbol{\beta}| \approx 0.5''\). For typical lens systems, \((1+z_{\mathrm{L}})\,D_\mathrm{L}\,D_{\mathrm{S}}/ D_{\mathrm{LS}} \approx 4\,\mathrm{Gpc}\) and we thus have that \begin{equation} \Delta t_\mathrm{geom} \sim 17\,\mathrm{days}\,, \end{equation} corresponding to a distance of \(c\Delta t_\mathrm{geom} \approx 0.014\,\mathrm{pc}\). We can estimate the gravitational time delay from combining \(|\Phi| \approx 2\sigma^2\) with an estimate of the thickness of the lens, which is approximately twice the virial radius of \(\sim 300\,\mathrm{kpc}\), so for a lens at \(z_\mathrm{L}\approx 0.5\) \begin{equation} \Delta t_\mathrm{Shapiro} \sim 5\,\mathrm{yr}\,, \end{equation} which corresponds to a distance of \(c\Delta t_\mathrm{Shapiro} \approx 1.6\,\mathrm{pc}\). The gravitational time delay therefore dominates the total time delay. However, the absolute time delay is unobservable, because we can’t know when the images would have arrived in the absence of the lens. What is observable is the relative time delay between multiple images of the same source, because this can be determined if the source is variable by cross-correlating time series observations of the different images (e.g., if the lensed source is a quasar, its stochastic variability can be used to determine the relative time delays). 
The typical image separation is \(\approx 2\,|\boldsymbol{\alpha}|\), so the relative geometric time delay is \begin{equation} \Delta t^\mathrm{rel}_\mathrm{geom} \sim 34\,\mathrm{days}\,, \end{equation} but for the gravitational time delay, we need the gradient of the integrated potential over the image, multiplying the absolute gravitational time delay by \(\approx b/R_{\mathrm{vir}} \approx 1/60\), because typical impact parameters are \(b \sim \mathcal{O}(5\,\mathrm{kpc})\); the relative gravitational time delay is therefore \begin{equation} \Delta t^\mathrm{rel}_\mathrm{Shapiro} \sim 30\,\mathrm{days}\,. \end{equation} Thus, the relative gravitational time delay between images of the same source is similar in magnitude to the relative geometric time delay. That this has to be the case follows from the lensing equation, which in the Fermat form from Equation (15.38) states that \(\nabla_{\boldsymbol{\theta}}\left(\Delta t_\mathrm{geom}\right)= -\nabla_{\boldsymbol{\theta}}\left(\Delta t_\mathrm{Shapiro}\right)\), because the total time delay is the sum of the two contributions. We have used \(\sim\) signs instead of \(\approx\) to indicate that these are very rough estimates. In addition to the time delays being dependent on the lens mass and the lens and source redshifts, even for a given lens the time delays can vary by an order of magnitude depending on the image configuration. The reason for this is that gravitationally-lensed images must be located in very particular regions surrounding the lens to be observable (e.g., Turner et al. 1984).
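The absolute estimates above can be redone with a few lines of arithmetic. The sketch below is a back-of-the-envelope check under the same round-number assumptions (\(|\boldsymbol{\theta}-\boldsymbol{\beta}| = 0.5''\), a time-delay distance of 4 Gpc, \(\sigma = 200\,\mathrm{km\,s}^{-1}\), a lens thickness of 600 kpc, and \(z_\mathrm{L} = 0.5\)); the function names and the rounded physical constants are our own choices. With exactly these inputs, the geometric delay comes out at roughly two weeks and the gravitational delay at about five years, in line with the order-of-magnitude estimates.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)        # radians per arcsecond
PC = 3.0857e16                             # metres per parsec (rounded)
C = 2.99792458e8                           # speed of light in m/s
DAY, YEAR = 86400.0, 365.25 * 86400.0      # seconds

def geometric_delay_days(offset_arcsec, D_dt_Gpc):
    """Delta t_geom = D_dt/(2c) * |theta-beta|^2, in days."""
    return D_dt_Gpc * 1e9 * PC / C * 0.5 * (offset_arcsec * ARCSEC)**2 / DAY

def shapiro_delay_years(sigma_kms, thickness_kpc, z_lens):
    """|Delta t_Shapiro| ~ (1+z_L) * (2/c^3) * |Phi| * thickness with |Phi| ~ 2 sigma^2, in years."""
    phi_over_c2 = 2.0 * (sigma_kms * 1e3 / C)**2
    path = thickness_kpc * 1e3 * PC        # lens thickness in metres
    return (1.0 + z_lens) * 2.0 * phi_over_c2 * path / C / YEAR

t_geom = geometric_delay_days(0.5, 4.0)
t_shapiro = shapiro_delay_years(200.0, 600.0, 0.5)
print(t_geom, t_shapiro)
```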

15.2.3. The critical surface density

\label{sec-gravlens-criticaldensity}

Before discussing how we can use the Fermat potential to understand where lensed images form and what their properties are, we want to understand the physical meaning of the critical surface density from Equation (15.23). Let’s consider a gravitational lens with constant surface density \(\Sigma(R,\phi) = \Sigma_{\mathrm{crit}}\) and therefore \(\kappa(\boldsymbol{\theta}) \equiv 1\). Then we can compute \(\boldsymbol{\alpha}\) from Equation (15.28) and find \begin{align}\label{eq-gravlens-reduceddeflect-surfdens-eq-critsurfdens} \boldsymbol{\alpha}(\boldsymbol{\theta}) & = {2\over |\boldsymbol{\theta}|}\,\hat{\boldsymbol{\theta}}\,\int_0^{|\boldsymbol{\theta}|}\mathrm{d}\theta'\,\theta' = \boldsymbol{\theta}\,, \end{align} that is, the reduced deflection angle is equal to the image position. The lensing equation then implies that \(\boldsymbol{\beta} = 0\) for every \(\boldsymbol{\theta}\). Such a lens focuses perfectly: a source at \(\boldsymbol{\beta} = 0\) appears at all \(\boldsymbol{\theta}\) or, equivalently, every ray originating from the observer perfectly focuses at \(\boldsymbol{\beta} = 0\) (all within the small-angle approximation of course, so in reality this can’t quite happen). Lenses with \(\Sigma > \Sigma_{\mathrm{crit}}\) are therefore able to focus well and the importance of the critical surface density in gravitational lensing derives from the fact (which we will not prove) that if a lens has \(\Sigma > \Sigma_{\mathrm{crit}}\) anywhere, then it is capable of producing multiple images of a source (Subramanian & Cowling 1986). In general, this is only a sufficient condition, not a necessary condition. However, for an axially-symmetric lens whose surface density peaks at its center (the latter is generally the case for galactic lenses), \(\Sigma > \Sigma_{\mathrm{crit}}\) is a necessary and sufficient condition for multiple images to be possible. 
For example, we found in Section 15.1.3 above that the point-mass lens always produces multiple images, and a point mass indeed always exceeds the critical surface density, because it has infinite density at the center. For the singular-isothermal-sphere lens, similarly, \(\Sigma = \Sigma_{\mathrm{crit}}\) at \(\theta = \theta_\mathrm{E}/2\) and we saw that multiple images are possible when \(|\boldsymbol{\beta}| < \theta_\mathrm{E}\).
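The perfect-focusing argument above can be checked numerically. The following is a minimal sketch, assuming the axially-symmetric deflection formula \(\alpha(\theta) = (2/\theta)\int_0^\theta \mathrm{d}\theta'\,\theta'\,\kappa(\theta')\) and the standard singular-isothermal-sphere convergence \(\kappa = \theta_\mathrm{E}/(2\theta)\) (the latter is not restated in this section); the function names are hypothetical and angles are in units of \(\theta_\mathrm{E}\).

```python
import math


def reduced_deflection(kappa, theta, n=10_000):
    """alpha(theta) = (2/theta) * int_0^theta kappa(t) t dt, via the midpoint rule."""
    dt = theta / n
    integral = sum(kappa((i + 0.5) * dt) * (i + 0.5) * dt for i in range(n)) * dt
    return 2.0 * integral / theta


THETA_E = 1.0  # Einstein radius of the illustrative isothermal lens


def kappa_sheet(t):
    """Constant sheet at exactly the critical surface density."""
    return 1.0


def kappa_sis(t):
    """Singular isothermal sphere (assumed form): kappa = theta_E / (2 theta)."""
    return THETA_E / (2.0 * t)


for theta in (0.3, 1.0, 2.5):
    alpha = reduced_deflection(kappa_sheet, theta)
    beta = theta - alpha  # lensing equation
    print(f"sheet: theta = {theta:.2f}, alpha = {alpha:.6f}, beta = {beta:.1e}")

# For the isothermal sphere the convergence crosses unity at theta_E / 2
print("SIS kappa at theta_E/2:", kappa_sis(0.5 * THETA_E))
print("SIS deflection at 2 theta_E:", reduced_deflection(kappa_sis, 2.0))
```

For the critical sheet, every \(\theta\) maps back to \(\beta = 0\) up to rounding, while the isothermal sphere has a constant deflection \(\theta_\mathrm{E}\) and reaches \(\kappa = 1\) at \(\theta_\mathrm{E}/2\).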

To understand the typical value of the critical surface density, we can write Equation (15.23) as \begin{align}\label{eq-gravlens-criticalsurfdens-units-galaxy} \Sigma_{\mathrm{crit}} & = 0.35\,\mathrm{g\,cm}^{-2} \left({D_{\mathrm{S}} \over D_{\mathrm{LS}}\,D_{\mathrm{L}}}\,1\,\mathrm{Gpc}\right)= 1663\,M_\odot\,\mathrm{pc}^{-2} \left({D_{\mathrm{S}} \over D_{\mathrm{LS}}\,D_{\mathrm{L}}}\,1\,\mathrm{Gpc}\right)\,, \end{align} for galaxy lenses. Massive elliptical galaxies have masses of \(\sim 10^{11}\,M_\odot\) within their central few kpc and therefore typically exceed the critical surface density when observed at Gpc-scale distances. For lensing by stars, the relevant surface density is that within their radius and the typical critical surface density is therefore \begin{align}\label{eq-gravlens-criticalsurfdens-units-stars} \Sigma_{\mathrm{crit}} & = 8.5\times 10^{-7}\,M_\odot\,R_\odot^{-2} \left({D_{\mathrm{S}} \over D_{\mathrm{LS}}\,D_{\mathrm{L}}}\,1\,\mathrm{kpc}\right)= 174\,M_\odot\,R_\odot^{-2} \left({D_{\mathrm{S}} \over D_{\mathrm{LS}}\,D_{\mathrm{L}}}\,1\,\mathrm{AU}\right)\,. \end{align}

The second equation shows that lensing by the Sun does not produce multiple images (\(\Sigma > \Sigma_{\mathrm{crit}}\) is necessary, because the Sun is approximately spherically symmetric), but the first equation demonstrates that lensing by stars in our own Galaxy or at cosmological distances can produce multiple images. Indeed, the critical surface density for lensing by stars is so low that at 1 kpc, the critical surface density is \(\approx 1\,M_\odot/(30\,\mathrm{AU})^2\) and multiple images can therefore form for background stars within 30 projected AU; this allows lensing by stars in our Galaxy to constrain the population of exoplanets at tens of AU (Gould & Loeb 1992). Because the critical surface density is \(\propto D_{\mathrm{S}} /(D_{\mathrm{LS}}\,D_{\mathrm{L}})\), lensing by stars in external galaxies can also produce multiple images out to even \(\sim 1/10\,\mathrm{pc}\), but it requires a source that is smaller than this projected size, because otherwise the multiple images are blurred together. Only the innermost regions of a quasar, the accretion disk and broad-line region (see Chapter 18.3), are small enough for this. Note that the multiple images formed by stars are separated by such a small amount that they cannot be resolved, but the accompanying magnification (see below) can be observed.
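The numerical prefactors in the two expressions for \(\Sigma_{\mathrm{crit}}\) above can be reproduced with a few lines of code. This is a rough sketch, assuming that Equation (15.23) has the standard form \(\Sigma_{\mathrm{crit}} = c^2/(4\pi G)\,D_{\mathrm{S}}/(D_{\mathrm{L}}\,D_{\mathrm{LS}})\) and using rounded cgs constants; the function and variable names are illustrative only.

```python
import math

# Physical constants in cgs units (rounded standard values)
C_LIGHT = 2.99792458e10   # speed of light, cm/s
G_NEWTON = 6.6743e-8      # gravitational constant, cm^3 g^-1 s^-2
M_SUN = 1.98841e33        # solar mass, g
R_SUN = 6.957e10          # solar radius, cm
PC = 3.0856776e18         # parsec, cm
AU = 1.495978707e13       # astronomical unit, cm


def sigma_crit(d_eff_cm):
    """Critical surface density in g/cm^2.

    d_eff_cm is the combination D_L * D_LS / D_S in cm; the form
    c^2 / (4 pi G) * D_S / (D_L * D_LS) is an assumption of this sketch.
    """
    return C_LIGHT**2 / (4.0 * math.pi * G_NEWTON * d_eff_cm)


# Galaxy-scale lenses: D_L D_LS / D_S = 1 Gpc
galaxy_cgs = sigma_crit(1e9 * PC)
galaxy_msun_pc2 = galaxy_cgs * PC**2 / M_SUN

# Stellar lenses: D_L D_LS / D_S = 1 kpc and 1 AU, in Msun per Rsun^2
star_kpc = sigma_crit(1e3 * PC) * R_SUN**2 / M_SUN
star_au = sigma_crit(AU) * R_SUN**2 / M_SUN

# Mean surface density of the Sun within its own radius, in Msun per Rsun^2
sun_mean = 1.0 / math.pi

print(f"galaxy lens: {galaxy_cgs:.2f} g/cm^2 = {galaxy_msun_pc2:.0f} Msun/pc^2")
print(f"stellar lens at 1 kpc: {star_kpc:.2e} Msun/Rsun^2")
print(f"stellar lens at 1 AU:  {star_au:.0f} Msun/Rsun^2")
print(f"mean solar surface density: {sun_mean:.2f} Msun/Rsun^2")
```

The mean surface density of the Sun, \(1/\pi \approx 0.3\,M_\odot\,R_\odot^{-2}\), falls far below the critical value at a distance of 1 AU, but far above the critical value at kpc distances.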

Background sources with images near \(\kappa \approx 1\) display large distortions and magnifications and may be multiply imaged. For this reason, lensing in the regime \(\kappa \approx 1\) is called strong lensing. Lensing when \(\kappa \ll 1\) only gives rise to subtle differences in the shape and magnitude of background sources and multiple images do not occur; this regime is therefore known as weak lensing. There is no precise definition of the boundary between strong and weak lensing, but generally it is the case that the effects of strong lensing can be determined for a single source, while weak lensing signatures are only detectable when combining data on multiple background sources.

15.2.4. Galaxy masses from strong gravitational lensing

\label{sec-gravlens-determine-mass}

Now that we understand the basic ingredients in gravitational lensing, we can start considering the application of gravitational lensing to observational data. The basic observational data in gravitational lensing are:

  • the positions of the images for multiply-imaged sources;

  • the relative arrival times of different images for variable sources;

  • the shape of the images for extended sources;

  • and the relative fluxes of different images (or the relative sizes of extended sources).

To describe shape, flux, and size distortions, we need the derivative of the lensing transformation, which we discuss in the next section, so here we will start by considering the first two observables.

Starting with the image positions \(\boldsymbol{\theta}_i\) for a multiply-imaged source, we know that each position satisfies the lensing equation (15.7). Because each image results from the same source position \(\boldsymbol{\beta}\), we have that \begin{equation} \boldsymbol{\beta} = \boldsymbol{\theta}_i-\boldsymbol{\alpha}(\boldsymbol{\theta}_i)\,. \end{equation} Fully modeling an observed lensing system requires one to infer the unknown source position \(\boldsymbol{\beta}\), but simply from the knowledge that different images come from the same source, we have that \begin{equation} \boldsymbol{\alpha}(\boldsymbol{\theta}_i)-\boldsymbol{\alpha}(\boldsymbol{\theta}_j) = \boldsymbol{\theta}_i-\boldsymbol{\theta}_j\,. \end{equation} Thus, the observed image locations directly constrain the difference in reduced deflection angle between the locations. For axially-symmetric lenses, Equation (15.29) demonstrates that this difference is simply related to the difference in enclosed masses contained in the cylinder traced out by \(|\boldsymbol{\theta}| = |\boldsymbol{\theta}_i|\). For non-axially-symmetric lenses, \(\Delta \boldsymbol{\alpha}\) corresponds to a difference in the integrated convergence, but as long as the lens isn’t too far from axial symmetry, the latter is close to the enclosed mass. This is similar to how measurements of the circular velocity relate to the enclosed mass: for spherical potentials, the circular velocity is directly related to the enclosed mass, whereas for non-spherical potentials \(V_c\) directly relates to the gravitational field instead.
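As a quick illustration of the relation between image separations and deflection differences, the sketch below uses the point-mass lens, assuming its standard reduced deflection \(\alpha = \theta_\mathrm{E}^2/\theta\) and image positions \(\theta_\pm = \tfrac{1}{2}\left(\beta \pm \sqrt{\beta^2+4\,\theta_\mathrm{E}^2}\right)\) along the lens-source axis (these are the usual point-mass results and are not rederived here). All angles are in units of \(\theta_\mathrm{E}\) and the function names are hypothetical.

```python
import math

THETA_E = 1.0  # Einstein radius; all angles are in units of theta_E


def alpha_point_mass(theta):
    """Reduced deflection of a point mass along the lens-source axis."""
    return THETA_E**2 / theta


def images_point_mass(beta):
    """Two image positions (signed, along the lens-source axis)."""
    root = math.sqrt(beta**2 + 4.0 * THETA_E**2)
    return 0.5 * (beta + root), 0.5 * (beta - root)


for beta in (0.1, 0.5, 2.0):
    theta_p, theta_m = images_point_mass(beta)
    delta_alpha = alpha_point_mass(theta_p) - alpha_point_mass(theta_m)
    delta_theta = theta_p - theta_m
    # Both images must also map back to the same source position
    beta_p = theta_p - alpha_point_mass(theta_p)
    beta_m = theta_m - alpha_point_mass(theta_m)
    print(f"beta = {beta:.1f}: delta_alpha = {delta_alpha:.6f}, "
          f"delta_theta = {delta_theta:.6f}, "
          f"recovered beta = ({beta_p:.6f}, {beta_m:.6f})")
```

The difference in deflection equals the image separation for every source position, without the source position itself ever entering the comparison.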

In the examples of the point-mass and singular-isothermal-sphere lenses in Section 15.1.3, we saw that when \(\boldsymbol{\beta} = 0\), an entire ring \(|\boldsymbol{\theta}| = \theta_\mathrm{E}\) solves the lensing equation, with \(\theta_\mathrm{E}\) the Einstein radius. This in fact happens for any axially-symmetric lens, because \(\boldsymbol{\alpha} \parallel \boldsymbol{\theta}\). Using Equation (15.29), the Einstein radius is \begin{equation}\label{eq-gravlens-einstein-radius-units} \theta_\mathrm{E} = \sqrt{{D_{\mathrm{LS}}\over D_{\mathrm{S}}\,D_{\mathrm{L}}}\,{4GM(< \theta_\mathrm{E}) \over c^2}}= 0.902433''\,\left({D_{\mathrm{LS}} \over D_{\mathrm{L}}\,D_{\mathrm{S}}}\,1\,\mathrm{Gpc}\right)^{1/2}\,\left({M[< \theta_\mathrm{E}] \over 10^{11}\,M_\odot}\right)^{1/2}\,, \end{equation} where \(M(< \theta_\mathrm{E})\) is the mass enclosed within \(\theta_\mathrm{E}\) (this is, of course, the same expression as we derived for a point mass in Equation 15.11). Thus, when we observe an Einstein ring, the radius of the ring, combined with the \(D_{\mathrm{LS}}/(D_{\mathrm{L}} D_{\mathrm{S}})\) factor that can be determined using redshifts, directly determines the enclosed mass within the Einstein radius. Because gravitational lensing does not need to make any assumptions about the dynamical state of the lens, this is a highly robust mass measurement. The physical size of the Einstein radius is \begin{equation}\label{eq-gravlens-einstein-radius-kpc} R_\mathrm{E} = D_\mathrm{L}\,\theta_\mathrm{E} = 4.85\,\mathrm{kpc}\,\left({D_{\mathrm{L}}\over 1\,\mathrm{Gpc}}\right)\,\left({\theta_\mathrm{E} \over 1''}\right)\,. \end{equation} Observed Einstein radii are typically \(\sim 1''\), so Equations (15.48) and (15.49) demonstrate that this corresponds to a mass of \(\sim 10^{11}\,M_\odot\) within \(\sim 5\,\mathrm{kpc}\). 
Thus, strong lensing constrains the mass of galaxies in the region where stars (and gas for spirals) contribute significantly to the mass, not at large distances where dark matter dominates.
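The scalings of the Einstein radius and of the enclosed mass can be evaluated directly. The following is a small sketch of the two Einstein-radius expressions above, with rounded cgs constants; the function names and the example distances (\(D_{\mathrm{L}} = 0.5\,\mathrm{Gpc}\), \(D_{\mathrm{S}} = 1\,\mathrm{Gpc}\), \(D_{\mathrm{LS}} = 0.5\,\mathrm{Gpc}\), chosen so that \(D_{\mathrm{LS}}/[D_{\mathrm{L}}\,D_{\mathrm{S}}] = 1/\mathrm{Gpc}\)) are illustrative assumptions rather than values for a real system.

```python
import math

# Physical constants in cgs units (rounded standard values)
C_LIGHT = 2.99792458e10   # speed of light, cm/s
G_NEWTON = 6.6743e-8      # gravitational constant, cm^3 g^-1 s^-2
M_SUN = 1.98841e33        # solar mass, g
KPC = 3.0856776e21        # kiloparsec, cm
GPC = 1e6 * KPC           # gigaparsec, cm
RAD_TO_ARCSEC = 180.0 * 3600.0 / math.pi


def einstein_radius_arcsec(mass_msun, d_l, d_s, d_ls):
    """Einstein radius in arcsec for an enclosed mass in Msun; distances in cm."""
    theta = math.sqrt(4.0 * G_NEWTON * mass_msun * M_SUN / C_LIGHT**2
                      * d_ls / (d_s * d_l))
    return theta * RAD_TO_ARCSEC


def enclosed_mass_msun(theta_e_arcsec, d_l, d_s, d_ls):
    """Mass within the Einstein radius, inverting the expression above."""
    theta = theta_e_arcsec / RAD_TO_ARCSEC
    return theta**2 * C_LIGHT**2 * d_s * d_l / (4.0 * G_NEWTON * d_ls) / M_SUN


def einstein_radius_kpc(theta_e_arcsec, d_l):
    """Physical Einstein radius R_E = D_L * theta_E in kpc."""
    return d_l * theta_e_arcsec / RAD_TO_ARCSEC / KPC


# Illustrative configuration with D_LS / (D_L D_S) = 1 / Gpc
d_l, d_s, d_ls = 0.5 * GPC, 1.0 * GPC, 0.5 * GPC

theta_e = einstein_radius_arcsec(1e11, d_l, d_s, d_ls)
print(f"theta_E = {theta_e:.4f} arcsec")
print(f"M(<theta_E) = {enclosed_mass_msun(theta_e, d_l, d_s, d_ls):.3e} Msun")
print(f"R_E at D_L = 1 Gpc, theta_E = 1 arcsec: "
      f"{einstein_radius_kpc(1.0, GPC):.2f} kpc")
```

Inverting an observed ring radius with `enclosed_mass_msun` is the whole mass measurement: no dynamical information about the lens enters.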

Image positions, or the Einstein ring, in gravitational lens systems quite directly measure the enclosed mass and they are therefore analogous to rotation curve observations, except that they only cover a narrow range of radii. For variable sources, it is also possible to determine the relative arrival times of the images (by looking for when the time series observed in one image repeats in the other images). Equation (15.36) shows that the difference between arrival times then directly determines differences in the Fermat potential at the image locations, multiplied by the time-delay distance that can again be determined from observed redshifts for the source and lens. Modeling of the image locations allows one to determine the source location \(\boldsymbol{\beta}\) and the difference in Fermat potential can then be converted to a difference in the lensing potential and provide a strong constraint on the mass distribution.

As we have already stressed, unlike the situation in dynamical modeling, lensing does not need any additional assumptions and only relies on the general theory of relativity (in dynamical modeling, unless we can directly determine accelerations, we always need additional assumptions about the dynamical state of the system). This makes lensing the most robust method for determining masses. However, lensing does have some disadvantages compared to dynamical modeling:

  • An observed Einstein ring essentially only determines the total mass contained within the Einstein ring. Because the appearance of an Einstein ring indicates that the system is very close to having axial symmetry, the Einstein ring is not sensitive to the details of the mass distribution either inside or outside of the ring. Multiple images (coming from an off-center source or from a non-axially-symmetric lens) similarly appear at \(\kappa \approx 1\) and thus also only constrain the mass distribution over a narrow range in radii. Strong lensing therefore essentially only provides a measurement of the mass within the Einstein radius. Weak lensing allows one to study the mass distribution outside the Einstein radius, but neither strong nor weak lensing can determine the mass distribution at scales \(\ll \theta_\mathrm{E}\), in particular the cores of galaxies (although the general absence of the central, odd image does provide a qualitative constraint; see Section 15.4.3 below).

  • Because an individual galaxy or cluster is so thin compared to the cosmological distances between the observer, lens, and source, gravitational lensing by an individual system is only sensitive to the integrated surface density, and has no way of determining the mass distribution along the line of sight. Thus, with lensing alone, we cannot determine the three-dimensional mass distribution of a galaxy or cluster.

  • All lensing observables with the exception of relative arrival times are dimensionless (the angular positions of images, and the relative magnification and shear that we will consider below). Converting image positions or the Einstein angle to constraints on the mass distribution therefore depends on our knowledge of cosmological distances (through the critical surface density). Furthermore, lensing is very susceptible to exact degeneracies that seriously limit how much information we can extract from a lensing system without additional information.

The degeneracies pose a serious problem. Degeneracies result from transformations of the total time delay in Equation (15.36), from which all lensing observables can be derived, that leave the measured observables unchanged. There are two types of degeneracies: (i) general degeneracies of the time delay that occur for any image positions and (ii) degeneracies for the observed image positions. The latter consist of transformations that would affect images formed for other sources, but not the actual observed images. The main types of degeneracies are similarity degeneracies and the mass-sheet degeneracy (Falco et al. 1985; Gorenstein et al. 1988; Saha 2000). The similarity degeneracies rescale the geometric and Shapiro contributions to the time delay separately. For example, if we multiply both the time-delay distance \(D_{\Delta t}\) and the surface density \(\Sigma\) by the same factor \(s\), then \(\Delta t_\mathrm{lensing} \rightarrow s\,\Delta t_\mathrm{lensing}\). Because the lensing equation is equivalent to \(\nabla_{\boldsymbol{\theta}}\left(\Delta t_\mathrm{lensing}\right)= 0\), this transformation leaves the image positions unchanged, but scales the relative time delays by \(s\). Of course, we can only scale the time-delay distance if we do not know it! Because we can typically determine the lens and source redshifts, the main uncertainty in \(D_{\Delta t}\) comes from the assumed cosmological parameters, with the main uncertainty being the inverse dependence of \(D_{\Delta t}\) on the Hubble constant \(H_0\) (varying the density parameters of dark energy and matter has a much more subtle effect). Thus, any uncertainty in \(H_0\) directly translates into an equivalent uncertainty in the inferred \(\Sigma\) from lensing. Observations of relative time delays break this degeneracy and allow both the Hubble constant and \(\Sigma\) to be determined (Refsdal 1964). 
Another version of the similarity degeneracy is to rescale the source and image positions by \(\sqrt{s}\) instead of rescaling \(D_{\Delta t}\); this can only be done without affecting observables for unresolved lensing, as in the case of microlensing in the Milky Way.

A more serious degeneracy is the mass-sheet degeneracy. This degeneracy only acts on the Fermat potential part of the time delay and so we can work with \(\tau(\boldsymbol{\theta};\boldsymbol{\beta})\). Under the following transformation of the lensing potential \begin{equation}\label{eq-gravlens-mass-sheet-transform} \psi(\boldsymbol{\theta}) \rightarrow {\lambda \over 2}|\boldsymbol{\theta}|^2 + \boldsymbol{\beta}'\cdot\boldsymbol{\theta} + {1\over 2}|\boldsymbol{\beta}|^2-{1\over 2(1-\lambda)}|\boldsymbol{\beta}+\boldsymbol{\beta}'|^2 + (1-\lambda)\,\psi(\boldsymbol{\theta})\,, \end{equation} the Fermat potential transforms as \begin{align} \tau(\boldsymbol{\theta};\boldsymbol{\beta}) & \rightarrow {(1-\lambda)\over 2}\,\left|\boldsymbol{\theta}-{1\over 1-\lambda}\,\left(\boldsymbol{\beta} + \boldsymbol{\beta}'\right)\right|^2 - (1-\lambda)\,\psi(\boldsymbol{\theta})= (1-\lambda)\,\tau\left(\boldsymbol{\theta};{1\over 1-\lambda}\,\left[\boldsymbol{\beta} + \boldsymbol{\beta}'\right]\right)\,. \end{align} Thus, the Fermat potential gets multiplied by \((1-\lambda)\) and the source is shifted by \(\boldsymbol{\beta}'\) and scaled by \(1/(1-\lambda)\). Because the source position is not directly observable, the shift and rescaling of the source are unobservable. Relative time delays scale as \((1-\lambda)\). Because the entire Fermat potential is scaled by \((1-\lambda)\), the lensing equation remains the same, except for the shift and scale in the source position; observed image positions are therefore conserved by the transformation. 
This degeneracy is known as the mass-sheet degeneracy, because the transformation of the lensing potential according to Equation (15.50) modifies the convergence as \begin{align} \kappa & \rightarrow {1\over 2}\nabla^2 \left[{\lambda \over 2}|\boldsymbol{\theta}|^2 + \boldsymbol{\beta}'\cdot\boldsymbol{\theta} + {1\over 2}|\boldsymbol{\beta}|^2-{1\over 2(1-\lambda)}|\boldsymbol{\beta}+\boldsymbol{\beta}'|^2 + (1-\lambda)\,\psi(\boldsymbol{\theta})\right] = \lambda + (1-\lambda)\,\kappa\,, \end{align} (note that \(\nabla^2 |\boldsymbol{\theta}|^2 = 4\)), and therefore corresponds to the addition of a sheet with constant surface density \(\lambda\,\Sigma_{\mathrm{crit}}\) (and, thus, constant convergence \(\lambda\)).
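The mass-sheet transformation can be verified numerically on a simple example. The sketch below assumes a point-mass lens with lensing potential \(\psi = \theta_\mathrm{E}^2\ln|\boldsymbol{\theta}|\) (a standard form, not derived in this section) and arbitrary illustrative values of \(\lambda\) and \(\boldsymbol{\beta}'\); it checks that the images of the original model remain images of the transformed model for a shifted and rescaled source, and that the Fermat-potential difference between the images scales by \((1-\lambda)\). All function names are hypothetical.

```python
import math

THETA_E = 1.0  # Einstein radius of an illustrative point-mass lens


def psi(theta):
    """Point-mass lensing potential (assumed form): theta_E^2 ln|theta|."""
    return THETA_E**2 * math.log(math.hypot(theta[0], theta[1]))


def alpha(theta):
    """Reduced deflection angle, the gradient of psi."""
    r2 = theta[0]**2 + theta[1]**2
    return (THETA_E**2 * theta[0] / r2, THETA_E**2 * theta[1] / r2)


def fermat(theta, beta):
    """Fermat potential tau = |theta - beta|^2 / 2 - psi(theta)."""
    return 0.5 * ((theta[0] - beta[0])**2 + (theta[1] - beta[1])**2) - psi(theta)


def psi_mst(theta, beta, lam, shift):
    """Transformed potential; beta is the source position in the new model."""
    bs2 = (beta[0] + shift[0])**2 + (beta[1] + shift[1])**2
    return (0.5 * lam * (theta[0]**2 + theta[1]**2)
            + shift[0] * theta[0] + shift[1] * theta[1]
            + 0.5 * (beta[0]**2 + beta[1]**2)
            - bs2 / (2.0 * (1.0 - lam))
            + (1.0 - lam) * psi(theta))


def alpha_mst(theta, lam, shift):
    """Gradient of the transformed potential with respect to theta."""
    a = alpha(theta)
    return (lam * theta[0] + shift[0] + (1.0 - lam) * a[0],
            lam * theta[1] + shift[1] + (1.0 - lam) * a[1])


def fermat_mst(theta, beta, lam, shift):
    """Fermat potential of the transformed model."""
    return (0.5 * ((theta[0] - beta[0])**2 + (theta[1] - beta[1])**2)
            - psi_mst(theta, beta, lam, shift))


# Original model: source position and its two point-mass images
beta_old = (0.3, 0.4)
b = math.hypot(beta_old[0], beta_old[1])
root = math.sqrt(b**2 + 4.0 * THETA_E**2)
images = [tuple(0.5 * (b + sign * root) * x / b for x in beta_old)
          for sign in (+1, -1)]

# Transformed model with illustrative lambda and beta'
lam, shift = 0.3, (0.05, -0.02)
beta_new = tuple((1.0 - lam) * bo - sh for bo, sh in zip(beta_old, shift))

# Residual of the transformed lensing equation at the original image positions
residuals = []
for th in images:
    a = alpha_mst(th, lam, shift)
    residuals.append(max(abs(th[i] - a[i] - beta_new[i]) for i in range(2)))

dtau_old = fermat(images[0], beta_old) - fermat(images[1], beta_old)
dtau_new = (fermat_mst(images[0], beta_new, lam, shift)
            - fermat_mst(images[1], beta_new, lam, shift))

print("lens-equation residuals:", residuals)
print("ratio of Fermat-potential differences:", dtau_new / dtau_old)
```

Here `beta_new` is related to the original source position by \(\boldsymbol{\beta}_\mathrm{old} = (\boldsymbol{\beta}_\mathrm{new}+\boldsymbol{\beta}')/(1-\lambda)\), which is the shift and rescaling of the unobservable source described above.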

The mass-sheet degeneracy is a serious issue for measurements of the Hubble constant using relative time delays in strong gravitational lenses. Relative time delays scale as \(H_0^{-1}\) and, under the mass-sheet transformation, as \((1-\lambda)\), so for a fixed observed time delay the inferred Hubble constant changes as \(H_0 \rightarrow H_0\,(1-\lambda)\). By letting \(\lambda\) go to one, the inferred Hubble constant can be made arbitrarily small!

The mass-sheet degeneracy may seem like an unrealistic degeneracy, because it requires an infinite sheet of constant surface density to be added to the system and we may think that a simple boundary condition that \(\kappa \rightarrow 0\) at large distances from the lens would force \(\lambda \rightarrow 0\). But this is a case where a degeneracy that leaves the observed properties unaffected while otherwise changing the model is important. Because the deflection angle for an axially-symmetric lens only depends on the enclosed mass, we can re-distribute the constant-density sheet’s mass into a finite disk as long as we do this in an axially-symmetric way and keep the enclosed mass within and outside of all observed images the same (Saha 2000). This is then a perfectly-plausible mass distribution that is hard to exclude a priori. This degeneracy can only be broken by including non-lensing data, such as measurements of the internal velocities of the lens (Grogin & Narayan 1996).
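The statement that a finite disk can stand in for the infinite sheet can be illustrated with the axially-symmetric deflection formula. This is a minimal sketch, assuming \(\alpha(\theta) = (2/\theta)\int_0^\theta \mathrm{d}\theta'\,\theta'\,\kappa(\theta')\) and arbitrary illustrative values for \(\lambda\) and the disk radius; it only demonstrates that the deflection at image positions inside the disk is unchanged, and the names are hypothetical.

```python
def deflection(kappa, theta, n=20_000):
    """alpha(theta) = (2/theta) * int_0^theta kappa(t) t dt, via the midpoint rule."""
    dt = theta / n
    integral = sum(kappa((i + 0.5) * dt) * (i + 0.5) * dt for i in range(n)) * dt
    return 2.0 * integral / theta


LAM = 0.2          # convergence of the added sheet (illustrative)
THETA_DISK = 2.0   # truncation radius of the finite disk (illustrative)


def kappa_sheet(t):
    """Infinite sheet of constant convergence."""
    return LAM


def kappa_disk(t):
    """Same convergence, truncated beyond THETA_DISK."""
    return LAM if t < THETA_DISK else 0.0


for theta in (0.5, 1.0, 1.9, 3.0):
    a_sheet = deflection(kappa_sheet, theta)
    a_disk = deflection(kappa_disk, theta)
    print(f"theta = {theta:.1f}: sheet alpha = {a_sheet:.4f}, "
          f"disk alpha = {a_disk:.4f}")
```

Inside the disk the two deflections agree exactly, so images at \(\theta <\) `THETA_DISK` cannot distinguish the truncated disk from the infinite sheet; outside the disk the deflection falls off as \(\lambda\,\theta_\mathrm{disk}^2/\theta\), as for any finite mass.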