5.6. Spherical distribution functions¶
Because of the simplicity of spherical systems, a large variety of spherical distribution functions \(f(E,L)\) have been studied in the literature. To find spherical distribution functions \(f(E,L)\), it is convenient to work with slightly different definitions of the potential, energy, and distribution function than we have used until now. We define the relative potential to be \begin{equation}\label{eq-psi-phi} \Psi = -\Phi + \Phi_0\,, \end{equation} for some constant \(\Phi_0\). We also define the relative energy \(\mathcal{E}\) as \begin{equation}\label{eq-rel-energy} \mathcal{E} = -E + \Phi_0 = \Psi -\frac{1}{2}\,v^2\,, \end{equation} where \(v = |\vec{v}|\). This is also a specific energy (that is, per unit mass), but we will simply refer to it as an “energy” in what follows. The constant \(\Phi_0\) will in general be chosen such that \(f > 0\) if \(\mathcal{E} > 0\) and \(f = 0\) when \(\mathcal{E} \leq 0\). For isolated systems with \(\Phi(\infty)=0\) that extend to \(\infty\), we can set \(\Phi_0 = 0\) and we simply have that \(\mathcal{E} = -E\), the binding energy. From the definition of the escape velocity in Equation (3.13), we then also have that \begin{equation}\label{eq-spherequil-varE-vesc} 2\mathcal{E} = v_\mathrm{esc}^2-v^2\,. \end{equation}
The Poisson equation in terms of the relative potential is \begin{equation}\label{eq-poisson-psi} \nabla^2 \Psi = -4\pi\,G\,\rho\,. \end{equation}
Furthermore, we will sometimes define the normalization of the distribution function to be such that \(\int \mathrm{d}\,\vec{x}\,\mathrm{d}\vec{v}\, f(\vec{x},\vec{v}) = M\), the total mass of the system rather than unity. Then we have that \(\rho(\vec{x}) = \int \mathrm{d} \vec{v}\,f(\vec{x},\vec{v})\), which can be useful in simplifying expressions.
5.6.1. Ergodic DFs and the Eddington inversion formula¶
The simplest solution of the collisionless Boltzmann equation for self-consistent systems is a distribution function that only depends on the (relative) energy, \(f \equiv f(\mathcal{E})\). Such distribution functions are called ergodic distribution functions. We will demonstrate that there is a unique ergodic distribution function for any spherical mass distribution \(\rho(r)\). However, this distribution function is not guaranteed to be everywhere non-negative and thus cannot necessarily be a physical solution when considered on its own. Not all spherical mass distributions can therefore be instantiated as ergodic distribution functions.
We start by writing the density \(\rho(r)\) in terms of the distribution function and changing variables from \(v\) to \(\mathcal{E}\) using Equation (5.55) \begin{align} \rho(r) & = \int \mathrm{d}\vec{v}\,f(r,\vec{v})= 4\,\pi\,\int \mathrm{d}v\,v^2\,f(r,v)= 4\,\pi\,\int \mathrm{d}v\,v^2\,f(\mathcal{E}) \label{eq-to-eddington-1-int}\\ & = 4\,\pi\,\int_0^\Psi \mathrm{d}\mathcal{E}\,\sqrt{2(\Psi-\mathcal{E})}\,f(\mathcal{E})\,,\label{eq-to-eddington-1} \end{align} where the lower limit in the final line is zero because \(f = 0\) for \(\mathcal{E}\leq 0\). For any spherical potential, \(\Phi\)—and therefore also \(\Psi\)—is a monotonic function of radius (e.g., Equation 2.23), so we can write \(\rho\) as a function \(\rho(\Psi)\) of \(\Psi\) in a well-defined way and Equation (5.59) can be written as \begin{equation}\label{eq-to-eddington-2} \frac{1}{\sqrt{8}\,\pi}\,\rho(\Psi) = 2\,\int_0^\Psi \mathrm{d}\mathcal{E}\,\sqrt{\Psi-\mathcal{E}}\,f(\mathcal{E})\,. \end{equation} Differentiating with respect to \(\Psi\) gives \begin{equation}\label{eq-to-eddington-3} \frac{1}{\sqrt{8}\,\pi}\,\frac{\mathrm{d}\rho(\Psi)}{\mathrm{d}\Psi} = \int_0^\Psi \mathrm{d}\mathcal{E}\,\frac{f(\mathcal{E})}{\sqrt{\Psi-\mathcal{E}}}\,, \end{equation} because the integrand is zero when \(\mathcal{E} = \Psi\). Equation (5.61) is an Abel integral equation, which can be inverted to (see Appendix B.4.1) \begin{equation}\label{eq-eddington-alt} f(\mathcal{E}) = \frac{1}{\sqrt{8}\,\pi^2}\,\frac{\mathrm{d}}{\mathrm{d}\mathcal{E}}\,\int_0^{\mathcal{E}}\mathrm{d}\Psi\,\frac{1}{\sqrt{\mathcal{E}-\Psi}}\,\frac{\mathrm{d}\rho}{\mathrm{d}\Psi}\,, \end{equation} or without the derivative of the integral \begin{equation}\label{eq-eddington} f(\mathcal{E}) = \frac{1}{\sqrt{8}\,\pi^2}\,\left[\int_0^{\mathcal{E}}\mathrm{d}\Psi\,\frac{1}{\sqrt{\mathcal{E}-\Psi}}\,\frac{\mathrm{d}^2\rho}{\mathrm{d}\Psi^2} +\frac{1}{\sqrt{\mathcal{E}}}\,\frac{\mathrm{d}\rho}{\mathrm{d}\Psi}\Bigg|_{\Psi=0}\right]\,. \end{equation}
This is the Eddington formula (Eddington 1916). From Equation (5.62) it is clear that \(f(\mathcal{E})\) can only be everywhere non-negative if \(\int_0^{\mathcal{E}}\mathrm{d}\Psi\,\frac{1}{\sqrt{\mathcal{E}-\Psi}}\,\frac{\mathrm{d}\rho}{\mathrm{d}\Psi}\) is an increasing function of \(\mathcal{E}\). If this is not the case, then no ergodic distribution function is consistent with the density \(\rho(r)\). Because ergodic distribution functions only depend on the energy, which only depends on velocity through \(v^2 = v_r^2 + v_\theta^2+v_\phi^2\), such systems have \(\overline{\vec{v}} = 0\) and \(\overline{v_r^2} = \overline{v_\theta^2} = \overline{v_\phi^2}\), such that the anisotropy \(\beta = 0\) (see Equation 5.46). This provides an example of a Jeans equation closure that would cause an unphysical situation: if we close the spherical Jeans equation by assuming \(\beta = 0\) for a density that does not satisfy the constraint that \(\int_0^{\mathcal{E}}\mathrm{d}\Psi\,\frac{1}{\sqrt{\mathcal{E}-\Psi}}\,\frac{\mathrm{d}\rho}{\mathrm{d}\Psi}\) must be an increasing function of \(\mathcal{E}\), then the solution found through the spherical Jeans Equation (5.45) would be unphysical.
We did not use the Poisson equation in the derivation of (5.63). This equation therefore applies whether or not the density \(\rho\) that appears in it generates the potential. We also did not rely on the normalization of \(f\), so Equation (5.63) applies no matter how we normalize \(f\).
To sample phase-space points from an ergodic distribution function, it is easiest to proceed as follows:
Sample \(r\) from the cumulative mass profile, e.g., using inverse transform sampling;
Sample random angles on the sphere \((\theta,\phi)\) (keeping in mind that while the distribution of \(\phi\) is uniform in \([0,2\pi]\), the distribution of \(\theta\) is \(p(\theta) \propto \sin\theta\) in \([0,\pi]\));
Sample the magnitude \(v\) of the velocity from \(p(v|r) \propto v^2 f(\mathcal{E}) = v^2 f(\Psi[r]-v^2/2)\). This can also be done using inverse transform sampling between \(v=0\) and the escape velocity \(v= v_\mathrm{esc}(r)\);
Sample velocity angles \((\eta,\psi)\) randomly on the sphere as well, to convert the magnitude of the velocity to the spherical velocity components as
\begin{align} v_r& = v\,\cos \eta\,; \quad v_\theta = v\,\sin\eta\,\cos\psi\,; \quad v_\phi = v\,\sin\eta\,\sin\psi\,. \end{align}
By randomly distributing the magnitude of the velocity over radial and tangential components, an isotropic distribution function is obtained. Note that the tangential velocity \(v_t\) that is used for anisotropic distribution functions below is \(v_t = v\,\sin\eta\).
For the anisotropic distribution functions that we discuss below, the first three steps are the same (or very similar), but the final step will need to be adjusted to ensure the correct anisotropy of the distribution function.
Let’s use the Eddington formula to find a distribution function that self-consistently generates the Hernquist profile discussed in Section 2.4.6. The Hernquist profile has a mass \(M = 2\pi\,\rho_0\,a^3\) (see Equation 2.51) and a non-zero density at all \(r\). We therefore set \(\Phi_0 = 0\). Using the expression for the Hernquist potential from Equation (2.52), we can express \(r\) as a function of \(\Psi\) as \begin{equation}\label{eq-spherdf-hernquist-r-psi} \frac{r}{a} = \frac{1}{\tilde{\Psi}} - 1\,, \end{equation} where \(\tilde{\Psi} = \Psi a/(G M) = -\Phi a/(G M)\). We can then express the Hernquist density from Equation (2.49) in terms of \(\Psi\) as (we use the number density \(\nu = \rho/M\) here rather than the mass density \(\rho\)) \begin{equation}\label{eq-spherdf-hernquist-dens-psi} \nu(\Psi) = \frac{1}{2\pi\,a^3}\,\frac{\tilde{\Psi}^4}{1-\tilde{\Psi}}\,. \end{equation} Taking the derivative of this with respect to \(\Psi\) and using the Eddington formula in the form from Equation (5.63) gives the Hernquist distribution function \begin{equation}\label{eq-df-hern} f_H(\mathcal{E}) = \frac{(G\,M\,a)^{-3/2}}{\sqrt{2}\,(2\pi)^3}\,\frac{\sqrt{\tilde{\mathcal{E}}}}{(1-\tilde{\mathcal{E}})^2}\,\left[(1-2\tilde{\mathcal{E}})(8\tilde{\mathcal{E}}^2-8\tilde{\mathcal{E}}-3)+\frac{3\sin^{-1}\sqrt{\tilde{\mathcal{E}}}}{\sqrt{\tilde{\mathcal{E}}(1-\tilde{\mathcal{E}})}}\right]\,, \end{equation} in terms of the dimensionless relative energy \(\tilde{\mathcal{E}} = \mathcal{E} a/(G M) = -E a/(G M)\).
One advantage of having a full equilibrium distribution-function for a given mass profile is that one can sample a realization of the distribution function as a set of bodies. These could then be used as the starting point of a numerical investigation, like an \(N\)-body simulation of the evolution of a merging satellite galaxy within a larger galaxy. As an example, we draw a realization of a Hernquist mass profile that is similar to the dark-matter halo in the Milky Way (with a mass of \(M = 10^{12}\,M_\odot\) and a scale parameter \(a = 16\,\mathrm{kpc}\)). By binning the distribution of radial velocities \(v_r\) at different \(r\), we can compute the velocity dispersion of the sample of bodies and we compare it in Figure 5.6 to that obtained by solving the Jeans equation (5.47).
[17]:
from galpy.potential import HernquistPotential
from galpy.df import isotropicHernquistdf
from galpy.df import jeans
from matplotlib.ticker import FuncFormatter
hp= HernquistPotential(amp=2.*1e12*u.Msun,a=16.*u.kpc)
dfh= isotropicHernquistdf(pot=hp)
orbs= dfh.sample(n=100000)
rs, vrs= orbs.r(), orbs.vr()
figure(figsize=(8,4))
subplot(1,2,1)
w,e= numpy.histogram(rs,range=[0.*u.kpc,40.*u.kpc],bins=31,weights=numpy.ones_like(rs))
mv2,_= numpy.histogram(rs,range=[0.*u.kpc,40.*u.kpc],bins=31,weights=vrs**2.)
brs= (numpy.roll(e,-1)+e)[:-1]/2.
plot(brs,numpy.sqrt(mv2/w),label=r'$\mathrm{samples}$')
# Compare to Jeans
plot(brs,numpy.array([jeans.sigmar(hp,r,beta=0.).to_value(u.km/u.s) for r in brs]),
label=r'$\mathrm{Jeans\ eq.}$')
legend(frameon=False,fontsize=18.)
subplot(1,2,2)
w,e= numpy.histogram(numpy.log(rs.to(u.kpc).value),
range=[numpy.log(1),numpy.log(100.)],
bins=31,weights=numpy.ones_like(rs))
mv2,_= numpy.histogram(numpy.log(rs.to(u.kpc).value),
range=[numpy.log(1),numpy.log(100.)],
bins=31,weights=vrs**2.)
brs= numpy.exp((numpy.roll(e,-1)+e)[:-1]/2.).to_value(u.dimensionless_unscaled)
semilogx(brs,numpy.sqrt(mv2/w),label=r'$\mathrm{samples}$')
# Compare to Jeans
semilogx(brs,numpy.array([jeans.sigmar(hp,r*u.kpc,beta=0.).to_value(u.km/u.s)
for r in brs]),
label=r'$\mathrm{Jeans\ eq.}$')
legend(frameon=False,fontsize=18.)
subplot(1,2,1)
xlabel(r'$r\,(\mathrm{kpc})$')
ylabel(r'$\sigma_r\,(\mathrm{km\,s}^{-1})$')
subplot(1,2,2)
xlabel(r'$r\,(\mathrm{kpc})$')
gca().xaxis.set_major_formatter(FuncFormatter(
lambda y,pos: (r'${{:.{:1d}f}}$'.format(int(\
numpy.maximum(-numpy.log10(y),0)))).format(y)))
tight_layout();
Figure 5.6: Samples from the ergodic Hernquist distribution function.
To better show the radial behavior, we have binned the bodies both in equal-width bins in radius \(r\) and in \(\log r\). We see that the velocity dispersion of the sample of bodies matches that computed using the Jeans equation very well.
But the full distribution function also allows us to go beyond just a simple velocity dispersion and look at, for example, the distribution of radial velocities at different radii. As an example of that, we plot the distribution of the magnitude \(v\) of the velocity, \(p(v|r) = v^2 f(\mathcal{E}[r,v])\), at three different radii in Figure 5.7.
[18]:
figure(figsize=(6,4))
rs= [1.*u.kpc,8.*u.kpc,32.*u.kpc]
for r in rs:
vs= numpy.linspace(0.,hp.vesc(r),101)
# p(v|r) ~ v^2 f(E[r,v])
pvr= vs**2.*dfh(r*numpy.ones(len(vs)),vs,0.,0.,0.).to_value(u.Msun/(u.kpc*u.km/u.s)**3)
pvr/= numpy.sum(pvr*(vs[1]-vs[0]))
plot(vs,pvr,label=rf'$r = {r.to_value(u.kpc):.0f}\,\mathrm{{kpc}}$')
xlabel(r'$v\,(\mathrm{km\,s}^{-1})$')
ylabel(r'$p(v|r)$')
legend(frameon=False,fontsize=18.);
Figure 5.7: The velocity distribution of the ergodic Hernquist distribution function.
The distributions are shown up to the escape velocity at the different radii and you can see the diminishing escape velocity with increasing radius as the decrease of the extent of the velocity distribution. The width of the velocity distribution is largest for the intermediate radius (being at \(r=8\) kpc in our Milky-Way-inspired model, this is a proxy for the dark-matter velocity distribution near the Sun). But the shape of the velocity distribution changes non-trivially with radius and only a full distribution-function model of a dynamical system can provide this full distribution.
5.6.2. The isothermal sphere¶
A different approach to solving the collisionless Boltzmann equation using the Jeans theorem is to posit a form for \(f(\mathcal{E},L)\)—for a spherical system—and then determine what density profile it implies. In this case you do not know what will come out, but what you get might be useful! This procedure can be followed for self-consistent and non-self-consistent distribution functions: in the former case, we need to use the constraint that the density \(\rho(r) = \int \mathrm{d}v\,v^2\,f(\mathcal{E},L)\) sources the potential through the Poisson equation, in the latter case one simply computes the energy using an external potential.
Let’s consider a distribution function \(f(\mathcal{E})\) of the following form \begin{equation}\label{eq-isothermal-df-form} f(\mathcal{E}) = \frac{\rho_1}{(2\pi\,\sigma^2)^{3/2}}\,\exp\left(\frac{\Psi-\frac{v^2}{2}}{\sigma^2}\right)\,, \end{equation} with parameters \((\rho_1,\sigma)\). Unlike most of the distribution functions that we discuss in this chapter, this distribution function extends all the way to \(\mathcal{E} = -\infty\) (we will see below that this is because this system has infinite mass and therefore all orbits are bound). Because \(\Psi\) only depends on \(r\), the velocity distribution at any given \(r\) is a Gaussian and the density obtained by integrating over all velocities is (see Equation 5.58) \begin{equation}\label{eq-isothermaldf-dens-derivation} \rho(r) = \rho_1\,\exp\left(\frac{\Psi}{\sigma^2}\right)\,. \end{equation}
For a self-consistent system, a second relation between \(\rho(r)\) and \(\Psi\) comes from the Poisson equation in terms of \(\Psi\): Equation 5.57. Substituting Equation (5.69) in the Poisson equation gives \begin{equation}\label{eq-isothermaldf-dens-derivation-equation} \frac{\mathrm{d}}{\mathrm{d} r}\,\left(r^2\,\frac{\mathrm{d} \ln \rho}{\mathrm{d} r}\right) = -\frac{4\pi\,G}{\sigma^2}\,r^2\,\rho\,. \end{equation} One solution of this equation is \begin{equation}\label{eq-dens-isothermal} \rho(r) = \frac{\sigma^2}{2\pi\,G\,r^2}\,, \end{equation} which is the singular isothermal sphere. This solution is “singular”, because it diverges as \(r \rightarrow 0\). It is isothermal, because the velocity distribution at any radius is a spherical Gaussian with dispersion \(\sigma\), which can be seen directly from Equation (5.68): \begin{align} p(\vec{v} | r) & \propto \frac{f(\vec{v},r)}{f(r)} = \frac{1}{(2\pi\,\sigma^2)^{3/2}}\,\exp\left(\frac{-v^2}{2\,\sigma^2}\right)\,. \end{align}
From Equation (5.69), we find the potential \begin{equation}\label{eq-spherequil-sis-potential} \Phi(r) = 2\,\sigma^2\,\ln r + \mathrm{constant}\,, \end{equation} and the circular velocity \(v_c(r)\) is \(v_c(r) = \sqrt{2}\,\sigma\). The singular isothermal sphere therefore corresponds to the distribution function of a spherical distribution of matter that gives rise to a flat rotation curve. For this reason, the isothermal sphere is an important distribution-function model. This correspondence also demonstrates the success of the second approach to solving the collisionless Boltzmann equation. Simply by positing a simple form for the distribution function (Equation 5.68) we found an important self-consistent, approximate model for galaxies with a flat rotation curve!
5.6.3. Distribution functions for globular clusters: King profiles¶
Despite its simplicity, the isothermal sphere model above has significant issues if we want to use it to model real stellar systems. Primary among these is the fact that it has infinite mass and to model a real system we therefore have to cut it off in some manner. A simple way of achieving this is to modify Equation (5.68) to \begin{equation}\label{eq-king-df-form} f(\mathcal{E}) = \frac{\rho_1}{(2\pi\,\sigma^2)^{3/2}}\,\left[\exp\left(\frac{\Psi-\frac{v^2}{2}}{\sigma^2}\right)-1\right]\,, \end{equation} for \(\mathcal{E} > 0\) and let \(f(\mathcal{E})\) be zero otherwise. This family of models is known as King models and they are a popular model for globular clusters (Michie 1963; Michie & Bodenheimer 1963; King 1966). Integrating over velocity gives the density as a function of \(\Psi\) \begin{align} \rho(\Psi) & = \rho_1\,\left[e^{\tilde{\Psi}}\,\mathrm{erf} \sqrt{\tilde{\Psi}}-\sqrt{\frac{4\,\tilde{\Psi}}{\pi}}\,\left(1+\frac{2\,\tilde{\Psi}}{3}\right)\right]\,,\label{eq-king-dens-implicit} \end{align} where we have defined \(\tilde{\Psi} = \Psi/\sigma^2\). The Poisson equation then becomes \begin{equation}\label{eq-king-poisson} \frac{\mathrm{d}}{\mathrm{d} r}\,\left(r^2\,\frac{\mathrm{d} \tilde{\Psi}}{\mathrm{d} r}\right) =-\frac{4\pi\,G\,\rho_1}{\sigma^2}\,r^2\left[e^{\tilde{\Psi}}\,\mathrm{erf} \sqrt{\tilde{\Psi}}-\sqrt{\frac{4\,\tilde{\Psi}}{\pi}}\,\left(1+\frac{2\,\tilde{\Psi}}{3}\right)\right]\,. \end{equation} This equation needs to be solved numerically, which we can do by starting at \(r=0\), assuming \(\mathrm{d}\Psi / \mathrm{d} r = 0\) and starting from some chosen value of \(\Psi(0)\). Equation (5.76) is then an ordinary differential equation that can be solved to give \(\Psi(r)\). As we have already discussed, \(\Psi(r)\) decreases as \(r\) increases for any spherical potential and therefore at some \(r = r_t\) we have that \(\Psi(r_t) = 0\). From Equation (5.75) it is clear that \(\rho = 0\) when \(\Psi =0\) and therefore the density becomes zero at \(r_t\) (it remains zero for \(r > r_t\), because we have that \(f(\mathcal{E}) = 0\) for \(\mathcal{E} \leq 0\) and \(\mathcal{E} \leq 0\) when \(\Psi < 0\)). The radius \(r_t\) is the tidal radius. Once we have found the tidal radius, we can determine the central potential \(\Phi(0)\), because from Newton’s second theorem discussed in Section 2.2 we know that \(\Phi(r_t) = -G\,M/r_t\), where \(M = 4\pi\,\int \mathrm{d}r\,r^2\rho(r)\) is the total mass of the system: \(\Phi(0) = \Phi(r_t) - \Psi(0)\) from Equation (5.54).
Besides the tidal radius, another typical radius for a King model is the King radius \(r_0\) defined as \begin{equation} r_0 \equiv \sqrt{\frac{9\,\sigma^2}{4\pi\,G\,\rho_0}}\,, \end{equation} where \(\rho_0\) is the (finite) central density obtained from Equation (5.75). The ratio of the tidal to the King radius gives the concentration, technically defined as \(c = \log_{10} \left(r_t/r_0\right)\).
King models form a three-dimensional sequence of mass, tidal radius, and concentration. The latter is also often parameterized in terms of \(W_0 = \Psi(0)/\sigma^2\), because this allows the King model for a given mass, tidal radius, and \(\Psi(0)/\sigma^2\) to be straightforwardly computed using the procedure described above. In fact, if you inspect the equations above carefully, you see that the fundamental structure of the model, \(\Psi(r)\) and \(\mathrm{d}\Psi(r) / \mathrm{d}r\), can be computed by working in the scaled parameter \(W = \Psi/\sigma^2\). The obtained solution \((W[r],\mathrm{d} W/\mathrm{d}r)\) can then be used with any desired mass and tidal radius by simply scaling the mass and tidal radius to the desired values.
To get a sense of what these King models look like, let’s compute the density profile for a range of values of the scaled central potential \(W_0\). Because the mass (or density) and length scales are unimportant for the overall shape of the distribution, we plot the radius as a fraction of \(r_0\) and the density as a fraction of the central density \(\rho_0\) in the left panel of Figure 5.8.
[19]:
from galpy.df import kingdf
from matplotlib.ticker import FuncFormatter
W0s= [1.,3.,5.,7.,9.,11]
figure(figsize=(12,4))
for W0 in W0s:
dfk= kingdf(W0=W0)
conc= numpy.log10(dfk.rt/dfk.r0)
rs= numpy.geomspace(dfk.r0/100.,dfk.rt,101)
subplot(1,2,1)
# Density
loglog(rs/dfk.r0,dfk.dens(rs)/dfk.rho0,label=rf'${W0:.0f}$')
subplot(1,2,2)
# Velocity dispersion
# Approach rt logarithmically, to better resolve the behavior there
rs= numpy.hstack((numpy.geomspace(dfk.r0/100.,numpy.amax([dfk.r0,dfk.rt/10.]),101),
sorted(dfk.rt-numpy.geomspace(1e-4,dfk.rt-numpy.amax([dfk.r0,dfk.rt/10.]),101))))
semilogx(rs/dfk.r0,numpy.array([dfk.sigmar(r) for r in rs])
/dfk.sigma,label=rf'${W0:.0f}$')
subplot(1,2,1)
xlim(3e-2,3e2)
ylim(1e-6,2)
xlabel(r'$r/r_0$')
ylabel(r'$\rho(r)/\rho_0$')
gca().xaxis.set_major_formatter(FuncFormatter(
lambda y,pos: (r'${{:.{:1d}f}}$'.format(int(\
numpy.maximum(-numpy.log10(y),0)))).format(y)))
text(8e-2,5e-2,r'$W_0$',fontsize=18.,horizontalalignment='center')
legend(frameon=False,fontsize=18.,loc='lower left')
subplot(1,2,2)
xlim(3e-2,3e3)
ylim(0.,1.1)
xlabel(r'$r/r_0$')
ylabel(r'$\sigma(r)/\sigma$')
gca().xaxis.set_major_formatter(FuncFormatter(
lambda y,pos: (r'${{:.{:1d}f}}$'.format(int(\
numpy.maximum(-numpy.log10(y),0)))).format(y)))
text(400,9.9*1e-1,r'$W_0$',fontsize=18.,horizontalalignment='center')
legend(frameon=False,fontsize=18.,loc='upper right',bbox_to_anchor=(1,0.92))
tight_layout();
Figure 5.8: King models: density (left) and velocity dispersion (right).
We see that for small values of \(W_0\), the King model is essentially a homogeneous-density sphere that is cut off at the tidal radius. At large values of \(W_0\), there is a small core and outside of the core the density drops as \(\rho(r) \approx r^{-2}\) until the tidal radius is reached.
From the definition of the King distribution function in Equation (5.74), it is clear that at large binding energies \(\mathcal{E} \gg 0\) the distribution function is approximately that of the isothermal sphere and the velocity distribution is therefore a spherical Gaussian with dispersion \(\sigma\). As \(r\) increases, \(\mathcal{E} \rightarrow 0\) and the velocity dispersion approaches zero. We can see this behavior in the right panel of Figure 5.8, which displays the velocity dispersion as a function of radius for the models shown above. We have normalized the velocity dispersion by the parameter \(\sigma\) and Figure 5.8 makes clear that this parameter \(\sigma\) is not the actual velocity dispersion of the model anywhere (except for large \(W_0\), where the core dispersion is approximately \(\sigma\), because the distribution function is very close to isothermal there).
Because the King model is an ergodic distribution function, the velocity anisotropy is zero. A generalization of the King models that have non-zero anisotropy is given by the Michie models (Michie & Bodenheimer 1963).
5.6.4. Anisotropic distribution functions¶
So far we have only considered ergodic distribution functions, which are always isotropic because the velocities enter the specific energy \(E\) only through \(v^2 = v_r^2 + v_\theta^2+v_\phi^2\) and the velocity dispersion in different directions therefore have to equal each other. However, realistic stellar systems often have some degree of anisotropy. Dark matter halos in the standard cold-dark-matter paradigm are close to isotropic only near their centers, becoming radially biased in their outskirts (\(\beta \approx 0.5\); Diemand et al. 2004; Wojtak et al. 2005). Stars in globular clusters and stellar halos can also have non-isotropic velocity distributions that are not well captured by ergodic distribution functions. To investigate such systems in more detail, we, therefore, require a way to create anisotropic distribution function models.
We will only consider anisotropic distribution functions for spherical, non-rotating systems. Such distribution functions can only depend on the magnitude \(L = |\vec{L}|\) of the specific angular momentum in addition to the specific energy \(E\), so we will consider functions \(f(E,L)\). As before, distribution functions are most easily expressed in terms of the relative energy \(\mathcal{E}\) defined in Equation (5.55) and we will therefore consider the function \(f(\mathcal{E},L)\). As for ergodic distribution functions, one important model-building task is to find an anisotropic distribution function \(f(\mathcal{E},L)\) that generates a given density profile \(\rho(r)\). The equation relating the distribution function and the density that is analogous to Equation (5.59) for the ergodic case is now \begin{align} \rho(r) & = \int \mathrm{d}\vec{v}\,f(r,\vec{v})= 2\,\pi\,\int \mathrm{d}v_r\,\mathrm{d} v_t\,v_t\,f(r,v_r,v_t)\\ & = 4\,\pi\,\int_0^\Psi \mathrm{d}\mathcal{E}\,\int_0^{\sqrt{2(\Psi-\mathcal{E})}}\mathrm{d} v_t\,v_t\,{f(\mathcal{E},r\,v_t)\over \sqrt{2(\Psi-\mathcal{E})-v_t^2}}\,,\label{eq-dens-anisotropic-general-1} \end{align} where \(v_t = \sqrt{v_\theta^2+v_\phi^2}\) and as in the ergodic case we require knowledge of \(\Psi(r)\). Unlike Equation (5.59), Equation (5.79) cannot be directly inverted, mathematically, because a function of a single variable, \(\rho(r)\), is not sufficient to determine a function of two variables, \(f(\mathcal{E},L)\). Physically what happens is that, once we allow for any anisotropy profile, many distribution functions are consistent with the same spatial density.
To make progress in inverting Equation (5.79), we therefore need to constrain the possible solutions. This could be done by positing specific forms for \(f(\mathcal{E},L)\) and solving the inversion, but this has the disadvantage of appearing quite ad hoc. Because our first inkling of the breaking of the ergodic assumption typically comes from measurements of the orbital anisotropy \(\beta(r)\), we would like a procedure that guides us towards solutions with a given anisotropy profile. One way to do this, is to consider the density defined by the right-hand side of Equation (5.79) as a function not just of \(r\), but as a two-dimensional function of \((\Psi,r)\) called the augmented density (Dejonghe 1986) \begin{align} \tilde{\rho}(\Psi,r) & = 4\,\pi\,\int_0^\Psi \mathrm{d}\mathcal{E}\,\int_0^{\sqrt{2(\Psi-\mathcal{E})}}\mathrm{d} v_t\,v_t\,{f(\mathcal{E},r\,v_t)\over \sqrt{2(\Psi-\mathcal{E})-v_t^2}}\,.\label{eq-dens-anisotropic-general-2} \end{align} The true density is obtained as \(\rho(r) = \tilde{\rho}(\Psi[r],r)\). We can then create unique anisotropic distribution functions for a given density \(\rho(r)\), by positing a form for \(\tilde{\rho}(\Psi,r)\) that satisfies \(\rho(r) = \tilde{\rho}(\Psi[r],r)\); because \(\tilde{\rho}(\Psi,r)\) is a function of two variables, it can uniquely determine \(f(\mathcal{E},L)\) and, in fact, one can show that it does (Dejonghe 1986).
To gain insight into the meaning of the augmented density, we can compute the radial and tangential velocity dispersions, \(\sigma_r\) and \(\sigma_t\). We have that \begin{align} \rho(r)\,\sigma_r^2(r) & = 4\,\pi\,\int_0^\Psi \mathrm{d}\mathcal{E}\,\int_0^{\sqrt{2(\Psi-\mathcal{E})}}\mathrm{d} v_t\,v_t\,{f(\mathcal{E},r\,v_t)\, \sqrt{2(\Psi-\mathcal{E})-v_t^2}}\,, \end{align} and thus using Equation (5.80) \begin{align} \tilde{\rho}(\Psi,r) & = \sigma_r^2(r)\,{\mathrm{d} \over \mathrm{d} \Psi}\Big[\tilde{\rho}(\Psi,r)\Big]\,, \end{align} or \begin{align}\label{eq-anidf-sigr-aug} \sigma_r^2(r) & = {1 \over \rho(r)}\,\int_0^\Psi\mathrm{d}\Psi' \tilde{\rho}(\Psi',r)\,. \end{align}
Similarly, for the tangential velocity dispersion \(\sigma_t\), we have that \begin{align} \rho(r)\,\sigma_t^2(r) & = 4\,\pi\,\int_0^\Psi \mathrm{d}\mathcal{E}\,\int_0^{\sqrt{2(\Psi-\mathcal{E})}}\mathrm{d} v_t\,v_t^3\,{f(\mathcal{E},r\,v_t)\over \sqrt{2(\Psi-\mathcal{E})-v_t^2}}\,. \end{align} To relate this to the augmented density, we first re-write Equation (5.80) as \begin{align} r^2\,\tilde{\rho}(\Psi,r) & = 4\,\pi\,\int_0^\Psi \mathrm{d}\mathcal{E}\,\int_0^{\sqrt{2r^2\,(\Psi-\mathcal{E})}}\mathrm{d} L\,L\,{f(\mathcal{E},L)\over \sqrt{2(\Psi-\mathcal{E})-L^2/r^2}}\,. \end{align} Similar to the derivation of \(\sigma_r^2(r)\) above, it then follows that \begin{align} \sigma_t^2(r) & = {2\over \rho(r)}\,\int_0^\Psi\mathrm{d}\Psi'\,{\mathrm{d} \over \mathrm{d} r^2}\Big[r^2\,\tilde{\rho}(\Psi',r)\Big]\,.\label{eq-anidf-sigt-aug} \end{align}
Therefore, we can easily compute the radial and tangential velocity dispersions given the augmented density using Equations (5.83) and (5.86). The orbital anisotropy directly follows from its definition \begin{align} \beta(r) = 1-{\int_0^\Psi\mathrm{d}\Psi'\,{\mathrm{d} \over \mathrm{d} r^2}\Big[r^2\,\tilde{\rho}(\Psi',r)\Big] \over \int_0^\Psi\mathrm{d}\Psi' \tilde{\rho}(\Psi',r)}\,.\label{eq-anidf-beta-aug} \end{align} Thus, we see that the augmented density is directly related to a model’s anisotropy profile. As one would expect from the Eddington-inversion argument, an augmented density that is a function of the potential alone, \(\tilde{\rho}(\Psi,r) \equiv \rho(\Psi)\), corresponds to an isotropic distribution function and the inversion equation (5.80) reduces to the Eddington inversion.
Unfortunately, inverting Equation (5.80) for a general augmented density is difficult (requiring combined forward and inverse Laplace and Mellin transformations; Dejonghe 1986). However, we can make progress by considering augmented densities that are separable functions of \(\Psi\) and \(r\): \begin{equation}\label{eq-anidf-aug-sep} \tilde{\rho}(\Psi,r) = f_\Psi(\Psi)\,g(r)\,. \end{equation} Then the anisotropy from Equation (5.87) reduces to the simpler expression \begin{equation} \beta(r) = 1-{1\over g(r)}\,{\mathrm{d} \over \mathrm{d} r^2}\Big[r^2\,g(r)\Big]\,.\label{eq-anidf-beta-aug-sep} \end{equation}
5.6.4.1. Distribution functions with constant anisotropy¶
Armed with Equation (5.89), we can now proceed to construct distribution functions that have constant anisotropy as a function of radius. Substituting \(g(r) = r^{-2\beta}\) into Equation (5.89), we find that \(\beta(r) = \beta\), constant with radius, and therefore models with constant anisotropy can be obtained using the augmented density \begin{equation} \tilde{\rho}(\Psi,r) = f_\Psi(\Psi)\,r^{-2\beta}\,.\label{eq-anidf-constbeta-aug} \end{equation} Because \(r\) only enters the relation (5.80) between \(\tilde{\rho}(\Psi,r)\) and \(f(\mathcal{E},L)\) through \(f(\mathcal{E},L)\) itself, it is clear that this augmented density must correspond to a distribution function \begin{equation} f(\mathcal{E},L) = f_{\mathcal{E}}(\mathcal{E})\,L^{-2\beta}\,, \end{equation} where \begin{equation} f_\Psi(\Psi) = 4\,\pi\,\int_0^\Psi \mathrm{d}\mathcal{E}\,\int_0^{\sqrt{2(\Psi-\mathcal{E})}}\mathrm{d} v_t\,v_t^{1-2\beta}\,{f_{\mathcal{E}}(\mathcal{E})\over \sqrt{2(\Psi-\mathcal{E})-v_t^2}}\,. \end{equation} The inner integral can be worked out and using Equation (5.90) to write \(f_\Psi(\Psi)\) in terms of the density, we get \begin{equation} \rho(\Psi)\,[r(\Psi)]^{2\beta} = 2^{3/2-\beta}\,\pi^{3/2}\,{\Gamma(1-\beta) \over \Gamma(3/2-\beta)}\,\int_0^\Psi \mathrm{d}\mathcal{E}\,f_{\mathcal{E}}(\mathcal{E})\,(\Psi-\mathcal{E})^{1/2-\beta}\,.\label{eq-anidf-constbeta-inveq-1} \end{equation}
Before discussing the general solution of this inversion, it is worth noting that the case \(\beta = 1/2\) is especially simple, because then Equation (5.93) simplifies to \(\rho(\Psi)\,r(\Psi) = 2\pi^2\,\int_0^\Psi \mathrm{d}\mathcal{E}\,f_{\mathcal{E}}(\mathcal{E})\), and therefore \begin{equation} f_{\mathcal{E}}(\mathcal{E}) = {1\over 2\pi^2}\,{\mathrm{d} \over \mathrm{d}\Psi}\Big[\rho(\Psi)\,r(\Psi)\Big]\Big|_{\Psi=\mathcal{E}}\,. \end{equation}
As an example, for the Hernquist profile, combining Equations (5.65) and (5.66), we have that \(\rho(\Psi)\,r(\Psi) = \tilde{\Psi}^3/(2\pi a^2)\), where \(\tilde{\Psi} = \Psi a/(GM)\) as before. Therefore, a distribution function with constant anisotropy \(\beta = 1/2\) for the Hernquist profile is given by \begin{equation} f(\mathcal{E},L) = {3\over 4\pi^3}\,{a\over [GM]^3}\,\mathcal{E}^2\,L^{-1}\,. \end{equation} In fact, all half-integer values of \(\beta\) are of a similar simplicity, because we can take a suitable number of derivatives of Equation (5.93) with respect to \(\Psi\) to make the exponent of \((\Psi-\mathcal{E})\) in the integrand equal to zero. For example, for \(\beta = -1/2\), we first take the derivative of Equation (5.93) with respect to \(\Psi\) to get \begin{align} {\mathrm{d} \over \mathrm{d} \Psi}& \Big(\rho(\Psi)\,[r(\Psi)]^{2\beta}\Big) =\nonumber\\& 2^{3/2-\beta}\,\pi^{3/2}\,{\Gamma(1-\beta)\,(1/2-\beta) \over \Gamma(3/2-\beta)}\,\int_0^\Psi \mathrm{d}\mathcal{E}\,f_{\mathcal{E}}(\mathcal{E})\,(\Psi-\mathcal{E})^{-1/2-\beta}\,,\label{eq-anidf-constbeta-inveq-1psideriv} \end{align} and then plugging in \(\beta = -1/2\), we find that \(\mathrm{d}/\mathrm{d} \Psi\left(\rho(\Psi)\,[r(\Psi)]^{-1}\right) = 2\,\pi^{2}\,\int_0^\Psi \mathrm{d}\mathcal{E}\,f_{\mathcal{E}}(\mathcal{E})\), or \begin{equation} f_{\mathcal{E}}(\mathcal{E}) = {1 \over 2\,\pi^{2}}\,{\mathrm{d}^2 \over \mathrm{d} \Psi^2} \Big(\rho(\Psi)\,[r(\Psi)]^{-1}\Big)\Big|_{\Psi=\mathcal{E}}\,. \end{equation} In this case, for the Hernquist profile, we have that \(2\pi a^4 \rho(\Psi)\,[r(\Psi)]^{-1} = \tilde{\Psi}^5/(1-\tilde{\Psi})^2\), where again \(\tilde{\Psi} = \Psi a/(GM)\), and therefore \begin{equation} f_{\mathcal{E}}(\mathcal{E},L) = {1\over 2\pi^3\,a^2\,[GM]^2}\,{\tilde{\mathcal{E}}^3\, (3 \tilde{\mathcal{E}}^2 - 10 \tilde{\mathcal{E}} + 10) \over(1-\tilde{\mathcal{E}})^4}\,L\,, \end{equation} where \(\tilde{\mathcal{E}} = \mathcal{E} a/(GM)\).
For general, half-integer \(\beta\) solution we need to take \(1/2-\beta\) derivatives of Equation (5.93) with respect to \(\Psi\) to make the exponent of \((\Psi-\mathcal{E})\) equal to zero and then another derivative to remove the integral. Thus, we generate a factor \(\Gamma(3/2-\beta)\) from \(\mathrm{d}^{1/2-\beta} (\Psi-\mathcal{E})^{1/2-\beta} / \mathrm{d} \Psi^{1/2-\beta}\) that exactly cancels the same factor in the denominator in Equation (5.93) and the remaining pre-factors can be succinctly written in terms of the double factorial \((-2\beta)!! = 2^{1/2-\beta}\,\Gamma(1-\beta)/\sqrt{\pi}\), leading to (Cuddeford 1991; An & Evans 2006) \begin{equation} f(\mathcal{E},L) = {1 \over 2\pi^2\,(-2\beta)!!}\,L^{-2\beta}\,{\mathrm{d}^{3/2-\beta} \over \mathrm{d} \Psi^{3/2-\beta}}\Big(\rho(\Psi)\,[r(\Psi)]^{2\beta}\Big)\,,\quad 1/2-\beta \in \mathbb{N}\,. \end{equation} For non-half-integer values of \(\beta\), we can solve for \(f_{\mathcal{E}}(\mathcal{E})\) by noting that Equation (5.93) is an Abel integral equation similar to Equation (5.61) for a similar suitable number of derivatives of the equation with respect to \(\beta\) (the number of derivatives required is that to make the exponent of \((\Psi-\mathcal{E})\) in the integrand between \(0\) and \(-1\)); the full solution of this Abel inversion is given by (see Appendix B.4.1) \begin{align} f(\mathcal{E},L) = &{2^\beta\,L^{-2\beta} \over (2\pi)^{3/2}\,\Gamma(1-\alpha)\,\Gamma(1-\beta)}\nonumber\\&\quad \times\,\left[\int_0^{\mathcal{E}}\mathrm{d}\Psi\,{1\over (\mathcal{E}-\Psi)^\alpha}\,{\mathrm{d}^{m+1} \over \mathrm{d} \Psi^{m+1}}\Big(\rho(\Psi)\,[r(\Psi)]^{2\beta}\Big)\right.\label{eq-anidf-constbeta-generalsol} \\&\qquad \ \ \left.+{1\over \mathcal{E}^\alpha}\,{\mathrm{d}^{m} \over \mathrm{d} \Psi^{m}}\Big(\rho(\Psi)\,[r(\Psi)]^{2\beta}\Big)\Big|_{\Psi=0}\right]\,,\nonumber \end{align} where \(m = \lfloor 3/2-\beta \rfloor\) and \(\alpha = 3/2-\beta-m\). While this is not an easy integral to solve, for values of \(\beta\) that are not too far from zero, computing this integral is not that much more onerous than evaluating the equivalent integral for the isotropic case (Equation 5.63). The main bottleneck is computing the derivatives of the density times \(r^{2\beta}\) with respect to \(\Psi\) (see the discussion in Lane et al. 2022). Sampling from these constant-anisotropy distribution functions proceeds in the same way as for ergodic distribution functions as described in Section 5.6.1, except that the velocity angle \(\eta\) is now sampled from \(p(\eta) \propto \sin^{1-2\beta}\eta\) (because of the \(L^{-2\beta} = r^{-2\beta}\,v_t^{-2\beta} = r^{-2\beta}\,v^{-2\beta}\,\sin^{-2\beta}\eta\) additional factor in the distribution function compared to the ergodic case). This can be sampled using inverse transform sampling and the cumulative distribution can be worked out in terms of a hypergeometric function (but it cannot be analytically inverted, so it needs to be inverted using interpolation).
For the Hernquist profile, the integral in Equation (5.100) can be performed analytically for all \(\beta\) and expressed as (Baes & Dejonghe 2002) \begin{align}\label{eq-anidf-hernquist-anybeta} f(\mathcal{E},L) = &{2^\beta \over (2\pi)^{5/2}\,[GMa]^{3/2-\beta}}\,{\Gamma(5-2\beta) \over \Gamma(1-\beta)\,\Gamma(7/2-\beta)}\\&\quad\times\,L^{-2\beta}\,\tilde{\mathcal{E}}^{5/2-\beta}\,{}_2 F_1\left(5-2\beta,1-2\beta;7/2-\beta;\tilde{\mathcal{E}}\right)\,,\nonumber \end{align} where \({}_2 F_1\) is the Gaussian hypergeometric function (see Appendix B.3.7) and, as usual for the Hernquist profile, \(\tilde{\mathcal{E}} = \mathcal{E} a/(GM)\).
As an example, we can look at the distribution of total velocities \(v^2 = v_r^2+v_t^2\) at a given radius in Hernquist profiles with different constant anisotropy parameters. We’ll look at \(r = 8\,\mathrm{kpc}\) in the same Milky-Way dark-matter-halo-like Hernquist profile as in Section 5.6.1 above. Making use of the fact that a full distribution function model allows us to sample points from the distribution function, we look at samples from the distribution function in the left panel of Figure 5.9.
[20]:
from galpy.potential import HernquistPotential
from galpy.df import constantbetaHernquistdf
hp= HernquistPotential(amp=2.*1e12*u.Msun,a=16.*u.kpc)
betas= [0.5,0,-0.5,-1.5]
r= 8.*u.kpc
figure(figsize=(11,4))
subplot(1,2,1)
for beta in betas:
dfh= constantbetaHernquistdf(pot=hp,beta=beta)
orbs= dfh.sample(R=r,z=0.,n=100000)
vs= numpy.sqrt(orbs.vR()**2.+orbs.vT()**2.+orbs.vz()**2.)
w,e= numpy.histogram(vs.to_value(u.km/u.s),bins=61,
range=[0.,hp.vesc(r).to_value(u.km/u.s)])
plotvs= (numpy.roll(e,1)+e)[1:]/2.
plot(plotvs,w/numpy.sum(w)/(plotvs[1]-plotvs[0]),
label=rf'$\beta = {beta:+.1f}$')
xlabel(r'$v\,(\mathrm{km\,s}^{-1})$')
ylabel(r'$p(v|r)$')
legend(frameon=False,fontsize=18.)
subplot(1,2,2)
betas= [-1.5,-10.5,-30.5]
r= 8.*u.kpc
for beta in betas:
dfh= constantbetaHernquistdf(pot=hp,beta=beta)
orbs= dfh.sample(R=r,z=0.,n=100000)
vs= numpy.sqrt(orbs.vR()**2.+orbs.vT()**2.+orbs.vz()**2.)
w,e= numpy.histogram(vs.to_value(u.km/u.s),bins=61,
range=[0.,hp.vesc(r).to_value(u.km/u.s)])
plotvs= (numpy.roll(e,1)+e)[1:]/2.
plot(plotvs,w/numpy.sum(w)/(plotvs[1]-plotvs[0]),
label=rf'$\beta = {beta:+.1f}$')
axvline(hp.vcirc(r).to_value(u.km/u.s),color='k',ls='--',zorder=0,
label=r'$v_c(r)$')
xlabel(r'$v\,(\mathrm{km\,s}^{-1})$')
legend(frameon=False,fontsize=18.)
tight_layout(w_pad=0.5);
Figure 5.9: The velocity distribution of Hernquist models with different orbital anisotropies.
We see that as \(\beta\) decreases, the distribution of velocities narrows and the entire shape changes from a wide distribution at positive \(\beta\) to a more Gaussian-like distribution at tangential values of \(\beta\). As \(\beta\) decreases and the distribution of velocities becomes more and more tangential, eventually all orbits become circular and this distribution narrows to a delta function centered on the circular velocity, as can be seen in the right panel of Figure 5.9 (note that in practice, the hypergeometric functions become so large that numerically it is difficult to go to \(\beta \lesssim -30\)).
As for ergodic distribution functions obtained using the Eddington inversion, there is no guarantee that the constant-anisotropy distribution functions computed using this method are positive everywhere and this needs to be checked before proceeding with using these distribution functions. For example, the anisotropic Hernquist distribution functions derived here have negative density regions whenever \(\beta > 1/2\).
5.6.4.2. Distribution functions with radially-varying anisotropy¶
The orbital distribution of many galactic systems does not have constant anisotropy everywhere, but instead varies with radius. For example, as we already discussed above, dark matter halos in the standard cold-dark-matter paradigm are close to isotropic near the centers, but they become radially biased in their outskirts, with typical values of \(\beta \approx 0.3\) to \(0.7\). To model such systems, we require distribution function models with radially-varying anisotropy for a given density profile. Equation (5.89) in principle provides a way to do this for a given \(\beta(r)\) profile: solve this differential equation for \(g(r)\) that gives rise to a given \(\beta(r)\) profile, use it to define the augmented density from the density using Equation (5.88), and then invert Equation (5.80) to obtain \(f(\mathcal{E},L)\). However, the inversion in the last step does not have a clean solution for most \(\beta(r)\) profiles.
For certain \(\beta(r)\) profiles, the inversion of Equation (5.80) can be done more easily and a particularly elegant solution is possible for the anisotropy profile (Osipkov 1979; Merritt 1985) \begin{equation}\label{eq-anidf-osipkov} \beta(r) = {1 \over 1+r_a^2/r^2}\,. \end{equation} Such an anisotropy profile results from a distribution function, known as an Osipkov-Merritt distribution function, that is a function of \begin{equation}\label{eq-anidf-osipkov-Q} Q = \mathcal{E}-{L^2 \over 2r_a^2}\,, \end{equation} alone, that is \(f \equiv f(Q)\). Before demonstrating that, let us take a look at how \(Q\) behaves as a function of \(\mathcal{E}\) and \(L\) for a given \(r_a\); we will use that to set an additional constraint on the distribution function. Bound orbits populate the region between \(L = 0\) and \(L = L_c(\mathcal{E})\), that is, between radial orbits and circular orbits of a given energy; we plot this as the hatched region below. For a Hernquist potential (using the usual normalized energy \(\tilde{\mathcal{E}} = \mathcal{E}\,a/[GM]\)) and using \(r_a = a/2\), we show the behavior of \(\tilde{Q} = Q\,a/[GM]\) in the left panel of Figure 5.10.
[21]:
from galpy.potential import HernquistPotential
a=16.*u.kpc
ra= a/2
hp= HernquistPotential(amp=2.*1e12*u.Msun,a=a)
Es= numpy.linspace(0.,-0.99,1001)
def Lc(E):
# Return the angular momentum of a circular orbit
# with energy E
rE= numpy.amax(numpy.roots([-E,2*-E-0.5,-E-1])*a)
return rE*hp.vcirc(rE)
figure(figsize=(11,4))
subplot(1,2,1)
plot(Es,[Lc(E)/Lc(-0.5) for E in Es],color='k')
fill_between(Es,[Lc(E)/Lc(-0.5) for E in Es],hatch='/',fc='w')
xlabel(r'$-\tilde{\mathcal{E}}$')
ylabel(r'$L/L_c(\tilde{\mathcal{E}} = 1/2)$')
xs= numpy.linspace(-2.25,1.5,1001)*(-hp(0,0))
ys= numpy.geomspace(0.003,100,101)*Lc(-0.5)
xv, yv = numpy.meshgrid(xs,ys,indexing='ij')
Qv= -xv-yv**2./2./(ra)**2.
C= contour((xv/(-hp(0,0))).to_value(u.dimensionless_unscaled),
(yv/Lc(-0.5)).to_value(u.dimensionless_unscaled),
(Qv/(-hp(0,0))).to_value(u.dimensionless_unscaled),
levels=[-3,-2,-1,-0.5,0.,0.5,0.9])
gca().clabel(C,inline=1,fontsize=14)
galpy_plot.text(1.45,3.6,r'$r_a = a/2$',fontsize=18.,ha='right')
ylim(0.,4)
subplot(1,2,2)
ra= a/20
plot(Es,[Lc(E)/Lc(-0.5) for E in Es],color='k')
fill_between(Es,[Lc(E)/Lc(-0.5) for E in Es],hatch='/',fc='w')
xlabel(r'$-\tilde{\mathcal{E}}$')
ylabel(r'$L/L_c(\tilde{\mathcal{E}} = 1/2)$')
xs= numpy.linspace(-2.25,1.5,1001)*(-hp(0,0))
ys= numpy.geomspace(0.003,100,101)*Lc(-0.5)
xv, yv = numpy.meshgrid(xs,ys,indexing='ij')
Qv= -xv-yv**2./2./(ra)**2.
C= contour((xv/(-hp(0,0))).to_value(u.dimensionless_unscaled),
(yv/Lc(-0.5)).to_value(u.dimensionless_unscaled),
(Qv/(-hp(0,0))).to_value(u.dimensionless_unscaled),
levels=[-3.5,-2.5,-1.5,-0.5,0.,0.5,0.9])
gca().clabel(C,inline=1,fontsize=14)
galpy_plot.text(1.4,0.49,r'$r_a = a/20$',fontsize=18.,ha='right')
ylim(0.,0.55)
tight_layout();
Figure 5.10: Allowed energies and angular momenta in Osipkov-Merritt models.
As expected, curves of constant \(Q\) (the labeled contours) are parabolas that curve leftwards for the axes used in this figure. The region to the left of the hatched region is not populated by any orbit in the potential, whether bound or unbound, because it consists of combinations of the energy and angular momentum that cannot be attained for any phase-space point (there are no energies less than that of the most bound orbit and there are no angular momenta larger than the circular one at a given energy). We can therefore ignore this region. However, the region to the right of the hatched region consists of unbound orbits and could therefore be populated. But such orbits cannot be used to build an equilibrium model and we should therefore not include them in such a model. As can be seen, these orbits all have \(Q < 0\). Orbits with \(Q < 0\) do exist within the bound, hatched region, but we choose to exclude these as well to simplify the resulting distribution function. We could keep them, but then we would have to carefully make sure to exclude all unbound orbits when working with the distribution function. Therefore, we additionally impose that \(Q \geq 0\).
Before computing the anisotropy, we can first get a sense of what type of anisotropy behavior to expect. When \(r_a\) is so large that \(\mathcal{E} >> L^2/[2r_a^2]\), it is clear that \(f(Q)\) becomes approximately \(f(\mathcal{E})\), that is, an isotropic, ergodic distribution function. The opposite extreme, where \(r_a\) is very small, is less straightforward, but we can understand what happens by looking at the contours of \(Q\) when \(r_a\) is small. Therefore, we plot similar contours of \(\tilde{Q}\) as in the example above, but now for \(r_a = a/20\), in the right panel of Figure 5.10. Because \(r_a\) is small, the \(L^2/[2r_a^2]\) terms is large and the contours therefore become more horizontal. Because we are only considering \(Q \geq 0\) to build the distribution function, allowed values of \(L\)—which are located in the intersection between the hatched region and the \(Q \geq 0\) region—are much smaller than before. The model therefore largely consists of radial orbits and as \(r_a\) decreases, the orbit becomes more and more radial until it eventually becomes purely radial as \(r_a \rightarrow 0\). Therefore, we can expect the form \(f(\mathcal{E}-L^2/[2r_a^2])\) to give rise to an anisotropy profile that is isotropic at \(r \ll r_a\) and fully radially anisotropic at \(r \gg r_a\) with a smooth transition between these regimes around \(r \approx r_a\).
To compute the actual anisotropy profile and work out how to obtain \(f(Q)\) for a given density profile, we obtain the augmented density using Equation (5.80). We first interchange the order of the integrals in the general expression and then we substitute the form \(f(Q)\) for \(f(\mathcal{E},r\,v_t)\) to obtain \begin{align} \tilde{\rho}(\Psi,r) & = 2\,\pi\,\int_0^{2\Psi}\mathrm{d} v^2_t\,\int_0^{\Psi-v_t^2/2}\mathrm{d}\mathcal{E} \,{f(\mathcal{E}-[r/r_a]^2\,v_t/2)\over \sqrt{2(\Psi-\mathcal{E})-v_t^2}}\,. \end{align} Re-writing the inner integral in terms of \(Q\) then given \begin{align} \tilde{\rho}(\Psi,r) & = 2\,\pi\,\int_0^{2\Psi}\mathrm{d} v^2_t\,\int_0^{\Psi-v_t^2\,\left(1+[r/r_a]^2\right)/2}\mathrm{d}Q \,{f(Q)\over \sqrt{2(\Psi-Q)-v_t^2\,\left(1+[r/r_a]^2\right)}}\,. \end{align} where we have used the constraint that \(f(Q)\) is only non-zero for \(Q \geq 0\). We then write the outer integral in terms of \(\tilde{v}^2_t = v_t^2\,\left(1+[r/r_a]^2\right)\) and also interchange the order of the integration once more \begin{align} \tilde{\rho}(\Psi,r) & = 2\,\pi\,{1 \over \left(1+[r/r_a]^2\right)}\,\int_0^{\Psi}\mathrm{d}Q\,f(Q) \,\int_0^{2(\Psi-Q)}\mathrm{d} \tilde{v}^2_t\,{1\over \sqrt{2(\Psi-Q)-\tilde{v}^2_t}}\,. \end{align} This inner integral can now be computed to give \begin{align} \tilde{\rho}(\Psi,r) & = 4\sqrt{2}\,\pi\,{1 \over \left(1+[r/r_a]^2\right)}\,\int_0^{\Psi}\mathrm{d}Q\,f(Q) \,\sqrt{\Psi-Q}\,. \end{align} Thus, the augmented density is of the separable form of Equation (5.88) with \(g(r) = 1/(1+r^2/r_a^2)\) and \begin{equation}\label{eq-anidf-osipkov-fpsi} f_\Psi(\Psi) = 2\sqrt{8}\,\pi\,\int_0^{\Psi}\mathrm{d}Q\,f(Q) \,\sqrt{\Psi-Q}\,. \end{equation} From \(g(r)\) and Equation (5.89), we find that the anisotropy profile is given by Equation (5.102). This profile has the behavior that we expected, with \(\beta \approx 0\) for \(r \ll r_a\) and \(\beta \approx 1\) for \(r \gg r_a\). The full behavior is shown in Figure 5.11.
[22]:
figure(figsize=(5,3))
rs= numpy.linspace(0.,5.,101)
plot(rs,1./(1+1./rs))
xlabel(r'$r/r_a$')
ylabel(r'$\beta$');
Figure 5.11: The orbital anisotropy in Osipkov-Merritt models.
The anisotropy rises quickly to \(\beta = 1/2\) at \(r = r_a\) and then slowly approaches \(\beta = 1\). Nevertheless, this anisotropy profile is a reasonable, simple model for an anisotropic dark-matter halo if we set \(r_a \approx r_\mathrm{vir}\) to have \(\beta \approx 0.5\) at the virial radius \(r_\mathrm{vir}\).
To compute the distribution function \(f(Q)\) for a given spherical density profile \(\rho(r)\), we need to invert Equation (5.108). Comparing this equation to Equation (5.60) that we encountered in deriving the Eddington inversion for ergodic distribution functions, we see that these equations are the same if we replace \(\rho(\Psi)\) in Equation (5.60) with \(f_\Psi(\Psi) = \rho(\Psi)/g(r[\Psi]) = \rho(\Psi)\,(1+r^2[\Psi]/r_a^2)\). A similar Abel inversion therefore leads to \begin{equation} \begin{split} f(Q) = \frac{1}{\sqrt{8}\,\pi^2}\,&\left[\int_0^Q\mathrm{d}\Psi\,\frac{1}{\sqrt{Q-\Psi}}\,\frac{\mathrm{d}^2}{\mathrm{d}\Psi^2}\Big[\rho(\Psi)\,(1+r^2[\Psi]/r_a^2)\Big] \right.\\&\quad \left.+\frac{1}{\sqrt{Q}}\,\frac{\mathrm{d}}{\mathrm{d}\Psi}\Big[\rho(\Psi)\,(1+r^2[\Psi]/r_a^2)\Big]\Big|_{\Psi=0}\right]\,. \end{split} \end{equation} (an expression similar to Equation 5.62 can also be derived). Obtaining the Osipkov-Merritt distribution function for a given density profile is therefore very similar to computing the ergodic distribution function with the only difference being that the density \(\rho(r)\) in the calculation is replaced by \(\rho(r)\,(1+r^2/r_a^2)\), but everything else being exactly the same. Because the density only enters as \(\rho(r)\,(1+r^2/r_a^2)\), the calculation can actually be split into a part that is identical to the Eddington formula and a part proportional to \(r_a^{-2}\) as \begin{equation}\label{eq-spherdf-omdf} \begin{split} f(Q) = f^{\beta=0}(Q)+\frac{1}{\sqrt{8}\,\pi^2\,r_a^2}\,&\left[\int_0^Q\mathrm{d}\Psi\,\frac{1}{\sqrt{Q-\Psi}}\,\frac{\mathrm{d}^2}{\mathrm{d}\Psi^2}\Big[\rho(\Psi)\,r^2(\Psi)\Big]\right.\\&\quad \left.+\frac{1}{\sqrt{Q}}\,\frac{\mathrm{d}}{\mathrm{d}\Psi}\Big[\rho(\Psi)\,r^2(\Psi)\Big]\Big|_{\Psi=0}\right]\,, \end{split} \end{equation} where \(f^{\beta=0}(Q)\) is the ergodic distribution function computed using the Eddington formula (5.63).
Sampling from Osipkov-Merritt-type distribution functions proceeds along the same lines as for ergodic distribution functions as described in Section 5.6.1, sampling positions \((r,\theta,\phi)\) in the same way. But now we need to first sample the velocity angle \(\eta\) from \(p(\eta|r) \propto \sin\eta\,[1+(r/r_a)^2\,\sin^2\eta]^{-3/2}\), which can be efficiently done using inverse transform sampling, because the cumulative distribution of \(\cos\eta\) can be analytically inverted. Then we sample \(v' = v\,\sqrt{1+(r/r_a)^2\,\sin^2\eta}\) from \(p(v'|r,\eta) \propto v'^2 f(Q) = v'^2 f(\Psi[r]-v'^2/2)\) and we transform \(v'\) to \(v\) before proceeding as before.
As an example, we can once more consider the Hernquist profile. From Equations (5.65) and (5.66), we have that \(\rho(\Psi)\,r^2(\Psi) = \tilde{\Psi}^2\,(1-\tilde{\Psi})/(2\pi a)\), where as usual \(\tilde{\Psi} = \Psi a/(GM)\). Taking the necessary derivatives and plugging this into Equation (5.110), we find the Hernquist distribution function with Osipkov-Merritt anisotropy \begin{equation} f_H(Q) = f_H^{\beta=0}(Q) + 8\,{(G\,M\,a)^{-3/2} \over \sqrt{2}\,(2\pi)^3}\,{a^2\over r_a^2}\,\sqrt{\tilde{Q}}\,(1-2\tilde{Q})\,, \end{equation} where \(f^{\beta=0}_H(Q)\) is the isotropic Hernquist distribution function from Equation (5.67) and \(\tilde{Q} = Q a/(GM)\).
As an example, we can look at the distribution of velocities at different radii in the Milky-Way dark-matter-halo-like Hernquist profile that we introduced in Section 5.6.1 above. By sampling from the Osipkov-Merritt distribution function at different \(r/r_a\), we look at the distribution of radial velocities \(v_r\) and rotational velocities \(v_\phi\) in Figure 5.12. For an isotropic distribution function, these velocity distributions would be the same.
[23]:
from galpy.potential import HernquistPotential
from galpy.df import osipkovmerrittHernquistdf
hp= HernquistPotential(amp=2.*1e12*u.Msun,a=16.*u.kpc)
dfh= osipkovmerrittHernquistdf(pot=hp,ra=40.*u.kpc)
rs= [8.*u.kpc,40.*u.kpc,80*u.kpc]
figure(figsize=(10,4))
for ii,r in enumerate(rs):
orbs= dfh.sample(R=r,z=0.,n=100000)
vrs= orbs.vr()
vTs= orbs.vT() # v_phi in galpy is vT, not vphi!
subplot(1,3,ii+1)
wr,e= numpy.histogram(vrs.to_value(u.km/u.s),bins=61,
range=[-hp.vesc(r).to_value(u.km/u.s),
hp.vesc(r).to_value(u.km/u.s)])
plotvrs= (numpy.roll(e,1)+e)[1:]/2.
ploty= wr/numpy.sum(wr)/(plotvrs[1]-plotvrs[0])
plot(plotvrs,ploty,label=r'$v_r$')
wT,e= numpy.histogram(vTs.to_value(u.km/u.s),bins=61,
range=[-hp.vesc(r).to_value(u.km/u.s),
hp.vesc(r).to_value(u.km/u.s)])
plotvTs= (numpy.roll(e,1)+e)[1:]/2.
ploty= wT/numpy.sum(wT)/(plotvTs[1]-plotvTs[0])
plot(plotvTs,ploty,label=r'$v_\phi$')
xlabel(r'$v\,(\mathrm{km\,s}^{-1})$')
ylim(0.,1.15*numpy.amax(ploty))
if ii == 0: ylabel(r'$p(v|r)$')
legend(frameon=False,fontsize=18.,loc='lower center');
galpy_plot.text(rf'$r = {r.to_value(u.kpc):.0f}\,\mathrm{{kpc}} = {(r/40.*u.kpc).value:.1f}\,r_a$',
top_left=True,size=18.)
tight_layout()
Figure 5.12: Velocity distributions in Osipkov-Merritt models.
We see that at \(r/r_a \ll 1\), the distribution of \(v_r\) and \(v_\phi\) is approximately the same, but as \(r/r_a\approx 1\) and \(r/r_a > 1\), the distribution of velocities becomes increasingly radial, with a wide distribution of \(v_r\) accompanied by a narrow distribution of \(v_\phi\) around \(v_\phi=0\).
The Osipkov-Merritt distribution functions \(f(Q)\) can be further generalized to a form \(f(\mathcal{E},L) = f(Q)\,L^{-2\beta}\) which have \(\beta(r=0) = \beta\) and can therefore be used to model systems with tangential bias in their centers that transition to a radial bias in their outskirts. Such models can be constructed using a combination of the inversion techniques from this and the previous section (Cuddeford 1991).