B.3. Special functions¶
B.3.1. The normal distribution and the error function¶
The normal distribution, also often known as a Gaussian distribution, is a probability distribution that is commonly encountered in many fields of (astro)physics. The general form of a normal distribution with mean \(\mu\) and variance \(\sigma^2\) is \begin{equation}\label{eq-math-normal} \mathcal{N}(x;\mu,\sigma^2) = {1\over \sqrt{2\pi\,\sigma^2}}\,\exp\left[-{\left(x-\mu\right)^2\over 2\,\sigma^2}\right]\,. \end{equation} It is straightforward to show that this distribution is properly normalized, that is, it integrates to one when considering all \(-\infty < x < \infty\), and that \(\mu\) and \(\sigma^2\) are the actual mean and variance of the distribution. The standard normal distribution \(\varphi(x)\) is the special case of the normal distribution when \(\mu=0\) and \(\sigma^2=1\): \begin{equation}\label{eq-math-normal-standard} \varphi(x) = {1\over \sqrt{2\pi}}\,e^{-x^2/2}\,. \end{equation} In terms of the standard normal distribution, the normal distribution is given by \begin{equation}\label{eq-math-normal-as-standard} \mathcal{N}(x;\mu,\sigma^2) = {1\over \sigma}\,\varphi\left({x-\mu\over \sigma}\right)\,. \end{equation}
The cumulative distribution function of the standard normal is \begin{equation}\label{eq-math-normal-cdf} \Phi(x) = \int_{-\infty}^x\mathrm{d}t\,\varphi(t)= {1\over \sqrt{2\pi}}\,\int_{-\infty}^x\mathrm{d}t\,e^{-t^2/2} = {1\over 2}\,\left[1+\mathrm{erf}\left({x\over \sqrt{2}}\right)\right]\,, \end{equation} where in the second equality we have expressed it in terms of the error function \(\mathrm{erf}(x)\), which is defined as \begin{equation}\label{eq-math-erf} \mathrm{erf}(x) = {2\over \sqrt{\pi}}\,\int_0^x\mathrm{d}t\,e^{-t^2}\,. \end{equation} The cumulative distribution of a normal distribution with mean \(\mu\) and variance \(\sigma^2\) is similarly given by \begin{equation}\label{eq-math-normal-cdf-as-erf-general} \Phi(x;\mu,\sigma^2) = {1\over 2}\,\left[1+\mathrm{erf}\left({x-\mu\over \sqrt{2}\,\sigma}\right)\right]\,. \end{equation} The complementary error function \(\mathrm{erfc}(x)\) is closely related to the error function and is defined as \begin{align}\label{eq-math-erfc} \mathrm{erfc}(x) & = 1-\mathrm{erf}(x) = {2\over \sqrt{\pi}}\,\int_x^\infty\mathrm{d}t\,e^{-t^2}\,. \end{align} From their definitions, the derivatives of the error function and of the complementary error function are given by \begin{equation}\label{eq-math-derf} {\mathrm{d} \,\mathrm{erf}(x)\over \mathrm{d}x} = -{\mathrm{d} \,\mathrm{erfc}(x)\over \mathrm{d}x} = {2\over \sqrt{\pi}}\,e^{-x^2}\,. \end{equation}
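As a quick numerical check (a sketch; the helper name `norm_cdf` is ours), the relation between \(\Phi\) and the error function can be verified with `scipy.special.erf` against direct numerical integration of the normal density:

```python
import numpy
from scipy import integrate, special

def norm_cdf(x, mu=0., sigma=1.):
    # Cumulative distribution of a normal with mean mu and variance sigma^2,
    # expressed through the error function
    return 0.5*(1.+special.erf((x-mu)/(numpy.sqrt(2.)*sigma)))

# Compare against direct numerical integration of the standard normal PDF
normal_pdf = lambda t: numpy.exp(-t**2/2.)/numpy.sqrt(2.*numpy.pi)
numerical_cdf, _ = integrate.quad(normal_pdf, -numpy.inf, 1.)
```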
B.3.2. The half-normal distribution¶
In Chapter 19.2.2, we need the half-normal distribution when constructing dark-matter halo trees. The half-normal distribution is the distribution of \(x = |y|\) where \(y\) follows a zero-mean normal distribution with variance \(\sigma^2\). Thus, the half-normal distribution \(f_\mathrm{hn}\) is \begin{equation}\label{eq-math-halfnormal} f_\mathrm{hn} (x;\sigma) = {1\over \sigma}\,\sqrt{2\over \pi}\,\exp\left(-{x^2\over 2\,\sigma^2}\right)\,,\quad x \geq 0\,. \end{equation} The mean of the half-normal distribution is \(\sigma\,\sqrt{2/\pi}\) and the variance is \(\sigma^2(1-2/\pi)\). Sampling \(x_i\) from a half-normal distribution is straightforwardly done using its definition: sample \(y_i\) from \(\mathcal{N}(y;0,\sigma^2)\) and obtain \(x_i = |y_i|\).
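This sampling recipe can be sketched with numpy (the function name `sample_halfnormal` and the chosen \(\sigma\) and seed are ours):

```python
import numpy

def sample_halfnormal(sigma, size, seed=None):
    # Sample y_i ~ N(0, sigma^2) and return x_i = |y_i|
    rng = numpy.random.default_rng(seed)
    return numpy.fabs(rng.normal(0., sigma, size=size))

sigma = 2.
xs = sample_halfnormal(sigma, 1000000, seed=42)
# The sample mean and variance should approach sigma*sqrt(2/pi)
# and sigma^2*(1-2/pi), respectively
```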
B.3.3. The gamma function¶
The gamma function \(\Gamma(a)\) is defined as \begin{equation}\label{eq-math-gamma-def} \Gamma(a) = \int_0^\infty\mathrm{d}t\,t^{a-1}\,e^{-t}\,. \end{equation} For integer values \(a=n\), the gamma function is related to the factorial function through \begin{equation} \Gamma(n) = (n-1)!\,. \end{equation} Replacing either the lower or the upper limit of the integration in Equation (B.36) with a finite number \(x\) defines the upper and lower incomplete gamma functions, respectively. The upper incomplete gamma function \(\Gamma(a,x)\) is given by \begin{equation}\label{eq-math-uppergamma-def} \Gamma(a,x) = \int_x^\infty\mathrm{d}t\,t^{a-1}\,e^{-t}\,, \end{equation} while the lower incomplete gamma function \(\gamma(a,x)\) is defined as \begin{equation}\label{eq-math-lowergamma-def} \gamma(a,x) = \int_0^x\mathrm{d}t\,t^{a-1}\,e^{-t}\,. \end{equation} The lower and upper incomplete gamma functions sum to \(\Gamma(a)\) for all \(x\).
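These functions are available in scipy, with the caveat that `scipy.special.gammainc` and `scipy.special.gammaincc` return the *regularized* incomplete gamma functions, that is, divided by \(\Gamma(a)\). A quick sketch (the helper names are ours) checking the definitions above:

```python
import numpy
from scipy import integrate, special

# scipy's gammainc and gammaincc are regularized (divided by Gamma(a)),
# so the unregularized incomplete gamma functions are
def lower_gamma(a, x):
    return special.gamma(a)*special.gammainc(a, x)
def upper_gamma(a, x):
    return special.gamma(a)*special.gammaincc(a, x)

a, x = 2.5, 1.3
# Direct numerical evaluation of the lower incomplete gamma integral
direct, _ = integrate.quad(lambda t: t**(a-1.)*numpy.exp(-t), 0., x)
```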
B.3.4. Elliptic integrals¶
Elliptic integrals are a set of integrals that originally appeared in the problem of finding the arc length of an ellipse, but that more generally pop up in the solution of various mathematical-physics problems. They are notorious for being defined in slightly different ways by different authors and different software packages. We use the following definitions.
The incomplete elliptic integral of the first kind, \(K(\phi,m)\), is defined as the integral \begin{equation} K(\phi,m) = \int_0^\phi \mathrm{d}\theta\,{1 \over \sqrt{1-m\,\sin^2 \theta}}\,. \end{equation} The complete elliptic integral of the first kind is \(K(m) \equiv K(\pi/2,m)\) or \begin{equation}\label{eq-math-ellipint-Km} K(m) = \int_0^{\pi/2} \mathrm{d}\theta\,{1 \over \sqrt{1-m\,\sin^2 \theta}}\,. \end{equation} In this book, we occasionally need the value of \(K(0)\), for which the integrand is one and therefore \(K(0) = \pi/2\).
Elliptic integrals of the second kind are defined similarly, with the incomplete elliptic integral of the second kind \(E(\phi,m)\) given by \begin{equation} E(\phi,m) = \int_0^\phi \mathrm{d}\theta\,\sqrt{1-m\,\sin^2 \theta}\,, \end{equation} and the complete elliptic integral of the second kind is \(E(m) \equiv E(\pi/2,m)\) or \begin{equation} E(m) = \int_0^{\pi/2} \mathrm{d}\theta\,\sqrt{1-m\,\sin^2 \theta}\,. \end{equation} There are elliptic integrals of the third kind as well, but we do not use these.
The derivatives of the complete elliptic integrals of the first and second kind are \begin{align}\label{eq-math-elliptint-dK} {\mathrm{d} K(m) \over \mathrm{d} m} & = {E(m) - (1-m)\,K(m) \over 2\,(1-m)\,m}\,;\quad {\mathrm{d} E(m) \over \mathrm{d} m} = {E(m)-K(m) \over 2\,m}\,. \end{align}
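These derivative formulae can be checked against finite differences using scipy, whose `ellipk` and `ellipe` follow the same parameter-\(m\) convention used here (a sketch; the chosen \(m\) and step size are arbitrary):

```python
import numpy
from scipy import special

# scipy uses the parameter-m convention: special.ellipk(m), special.ellipe(m)
m = 0.3
# Analytical derivatives from the formulae above
dKdm = (special.ellipe(m)-(1.-m)*special.ellipk(m))/(2.*(1.-m)*m)
dEdm = (special.ellipe(m)-special.ellipk(m))/(2.*m)
# Central finite-difference derivatives for comparison
h = 1e-6
dKdm_num = (special.ellipk(m+h)-special.ellipk(m-h))/(2.*h)
dEdm_num = (special.ellipe(m+h)-special.ellipe(m-h))/(2.*h)
```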
B.3.5. Bessel functions¶
Bessel functions, named after Friedrich Bessel who studied them in detail (and also measured the first astronomical parallax!), are solutions \(y(x)\) of the differential equation \begin{equation}\label{eq-math-bessel-diffeq} x^2\frac{\mathrm{d}^2 y}{\mathrm{d} x^2} + x\,\frac{\mathrm{d}y}{\mathrm{d}x} + \left(x^2-\alpha^2\right)\,y = 0\,. \end{equation} They often come up in physics and in gravitational dynamics in particular, because they appear in the solution of the Helmholtz equation in cylindrical coordinates (the Helmholtz equation is the Poisson equation with the density proportional to the potential, and it is used to find eigenfunctions of the Laplace operator).
Bessel functions come in various kinds. In this text, we encounter Bessel functions of the first kind \(J_{\alpha}(x)\) with \(\alpha\) an integer \(m\), which are the solutions that are finite at the origin \(x=0\). Bessel functions of the first kind are highly oscillatory, with an amplitude that falls off as \(\approx x^{-1/2}\) and a set of zero-crossings that is not exactly regularly spaced. Figure B.1 shows the first few Bessel functions of the first kind.
[5]:
import numpy
from matplotlib.pyplot import figure, plot, xlabel, ylabel, legend
from scipy import special
xs= numpy.linspace(0.,10.,101)
figure(figsize=(7,5))
plot(xs,special.jv(0,xs),label=r'$J_0$')
plot(xs,special.jv(1,xs),label=r'$J_1$')
plot(xs,special.jv(2,xs),label=r'$J_2$')
plot(xs,special.jv(3,xs),label=r'$J_3$')
xlabel(r'$x$')
ylabel(r'$J_m$')
legend(frameon=False,fontsize=18.);
Figure B.1: Bessel functions of the first kind.
Bessel functions of the first kind of different integer orders are related by relations such as \begin{equation}\label{eq-math-bessel-negm-posm} J_{-m}(x) = (-1)^m\,J_m(x)\,, \end{equation} and \begin{equation}\label{eq-math-bessel-dJmx} {\mathrm{d} \over \mathrm{d} x} \Big[ x^m\,J_m(x)\Big] = x^m\, J_{m-1}(x)\,. \end{equation} Combining these, we can derive that \begin{equation}\label{eq-math-bessel-dJ0J1} {\mathrm{d} J_0(x) \over \mathrm{d} x} = -J_1(x)\,. \end{equation} Another useful relation that follows from Equation (B.47) is \begin{equation}\label{eq-math-bessel-dJ1} {\mathrm{d} J_1(x) \over \mathrm{d} x} = J_0(x)-{J_1(x)\over x}\,. \end{equation}
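These derivative relations can be verified with `scipy.special.jvp`, which returns the derivative of \(J_m\) (a quick sketch):

```python
import numpy
from scipy import special

xs = numpy.linspace(0.5, 10., 20)
# special.jvp(m, x) returns dJ_m/dx
dJ0 = special.jvp(0, xs)
dJ1 = special.jvp(1, xs)
# These should equal -J_1(x) and J_0(x)-J_1(x)/x, respectively
```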
Bessel functions of the first kind satisfy the following orthonormality relation (the Fourier-Bessel theorem): \begin{equation}\label{eq-math-bessel-FourierBessel} \int_0^\infty\mathrm{d}R\,J_m(kR)\,J_m(k'R)R = \delta(k-k')/k,\quad \forall k, k' > 0\,. \end{equation}
The Laplace transform of \(J_m(kx)\) is \begin{equation}\label{eq-math-bessel-laplace} \mathcal{L}\left[J_m(kx)\right](s) = \int_0^\infty \,\mathrm{d} x\,e^{-sx}\,J_m(kx) = {k^m \over \sqrt{s^2+k^2}\,\left(s + \sqrt{s^2+k^2}\right)^m}\,, \end{equation} and in particular (set \(s=0\) and \(k=1\)) \begin{equation}\label{eq-math-bessel-norm} \int_0^\infty \mathrm{d} x\,J_m(x) = 1\,. \end{equation}
Bessel functions of the second kind are denoted as \(Y_\alpha(x)\) and they are, for integer \(\alpha\), the second linearly independent solution of the defining differential equation (B.45) (for non-integer \(\alpha\), \(J_{\alpha}\) and \(J_{-\alpha}\) are two linearly independent solutions, because Equation B.46 does not hold for non-integer \(\alpha\)). Bessel functions of the second kind have a singularity at the origin.
Modified Bessel functions of the first, \(I_\alpha(x)\), and second, \(K_\alpha(x)\), kind are defined in terms of the Bessel function of the first kind evaluated at purely imaginary values (Bessel functions of the first kind are defined on the entire complex plane): \begin{align}\label{eq-math-bessel-modified} I_\alpha(x) & = i^{-\alpha}\,J_\alpha(ix)\,,\\ K_\alpha(x) & = {\pi \over 2}\,{I_{-\alpha}(x)-I_\alpha(x) \over \sin (\alpha \pi)}\,. \end{align} For integer \(\alpha = m\), \(K_m\) is defined as the limit \(\alpha \rightarrow m\). Similar to how the oscillatory sine and cosine functions become a growing or decaying exponential when evaluated for imaginary arguments, the modified Bessel functions of the first and second kinds are exponentially growing and decaying functions, respectively, without any oscillatory behavior.
Combining their definition with relations for the Bessel functions of the first kind from Equation (B.46) to (B.49), one can show that \begin{align}\label{eq-math-bessel-dI0dK0} {\mathrm{d} I_0 (x) \over \mathrm{d} x} & = I_1(x)\,;\quad {\mathrm{d} K_0 (x) \over \mathrm{d} x} = -K_1(x)\,,\\ {\mathrm{d} I_1 (x) \over \mathrm{d} x} & = I_0(x)-I_1(x)/x\,;\quad {\mathrm{d} K_1 (x) \over \mathrm{d} x} = -K_0(x)-K_1(x)/x\,,\label{eq-math-bessel-dI1dK1} \end{align} which are all relations we need in deriving the rotation curve of a razor-thin exponential disk in Chapter 7.3.3.2.
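These relations for the modified Bessel functions can similarly be checked with `scipy.special.ivp` and `scipy.special.kvp`, which return the derivatives of \(I_m\) and \(K_m\) (a quick sketch):

```python
import numpy
from scipy import special

xs = numpy.linspace(0.5, 5., 10)
# special.ivp and special.kvp return the derivatives of I_m and K_m
dI0, dK0 = special.ivp(0, xs), special.kvp(0, xs)
dI1, dK1 = special.ivp(1, xs), special.kvp(1, xs)
```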
Spherical Bessel functions are the solutions of the differential equation \begin{equation}\label{eq-math-spherebessel-diffeq} x^2\frac{\mathrm{d}^2 y}{\mathrm{d} x^2} + 2x\,\frac{\mathrm{d}y}{\mathrm{d}x} + \left(x^2-n\,[n+1]\right)\,y = 0\,. \end{equation} This equation is similar to the defining equation (B.45) of the Bessel functions except for the factor of two in the second term and that we have replaced \(\alpha^2\) with \(n\,(n+1)\), where \(n\) is an integer. Like the defining equation (B.45), Equation (B.57) appears in solutions of the Helmholtz equation, but now in spherical coordinates. Equation (B.57) has two linearly-independent solutions \(j_n(x)\) and \(y_n(x)\) that are given in terms of the Bessel functions of the first and second kind through \begin{align}\label{eq-math-bessel-jn} j_n(x) & = \sqrt{{\pi\over 2x}}\,J_{n+1/2}(x)\;\quad y_n(x) = \sqrt{{\pi\over 2x}}\,Y_{n+1/2}(x)\,. \end{align}
Numerical integration over Bessel functions is complicated for Bessel functions of the first kind by their oscillatory behavior. Yet we need such integrals in, for example, the general solution for the potential of a razor-thin disk or to compute the potential and forces of a double-exponential disk. A simple strategy for integrating over a Bessel function of the first kind is to numerically integrate between the zeros for a large number of zeros, making use of the fact that the amplitude of the oscillation drops off and that functions that return the zeros of the Bessel functions are readily available in many programming languages.
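A minimal sketch of this zero-to-zero strategy (the function name `int_bessel_by_zeros` is ours; convergence is slow, so even with a thousand zeros the truncated sum is only accurate at the percent level):

```python
import numpy
from scipy import integrate, special

def int_bessel_by_zeros(m, nzeros=1000):
    # Integrate J_m(x) from 0 to its nzeros-th zero, integrating
    # between consecutive zeros and summing the alternating contributions
    zeros = numpy.concatenate(([0.], special.jn_zeros(m, nzeros)))
    return sum(integrate.quad(lambda x: special.jv(m, x), lo, hi)[0]
               for lo, hi in zip(zeros[:-1], zeros[1:]))
```

The truncation error is set by the slowly decaying \(\approx x^{-1/2}\) amplitude of the remaining oscillations, which is why the double-exponential quadrature discussed next is preferable.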
A different and better way is to use numerical quadrature methods specifically designed to deal with oscillatory functions. For Bessel functions of the first kind, the quadrature formula from Ogata (2005) is very useful. Ogata (2005) demonstrates that integrals of the type \begin{equation} \int_0^\infty \mathrm{d}x\,J_\nu(x)\,f(x) \end{equation} can be approximated as \begin{align} \int_0^\infty \mathrm{d}x\,J_\nu(x)\,f(x) & \approx \label{eq-math-besselint} \pi\,\sum_{k=1}^{\infty}{w_{\nu k}\,f\left({\pi \over h}\,\psi\left[h\,\xi_{\nu k}\right]\right)\,J_\nu\left({\pi \over h}\,\psi\left[h\,\xi_{\nu k}\right]\right)\,\psi'\left(h\,\xi_{\nu k}\right)}\,, \end{align} where \(\psi(t)\) and its derivative \(\psi'(t)\) are \begin{equation} \psi(t) = t\,\tanh\left({\pi \over 2}\,\sinh t\right)\,,\quad \psi'(t) = {\sinh\left(\pi\,\sinh t\right)+\pi\,t\,\cosh t \over \cosh\left(\pi\,\sinh t\right)+1}\,. \end{equation} The weights \(w_{\nu k}\) are given by \begin{equation}\label{eq-math-besselint-weights} w_{\nu k} = {2\over \pi^2\,\xi_{\nu k}\,J^2_{\nu+1}\left(\pi\,\xi_{\nu k}\right)}\,. \end{equation} The \(\xi_{\nu k}\) are the zeros of the Bessel function \(J_{\nu}(\pi\,x)\) and \(h\) is a numerical step size that should be set as small as possible. Because \({\pi \over h}\,\psi(h\,\xi_{\nu k}) \rightarrow \pi\,\xi_{\nu k}\) as \(k \rightarrow \infty\), and the \(\pi\,\xi_{\nu k}\) are the zeros of \(J_{\nu}(x)\), the terms in the sum quickly approach zero, such that convergence is obtained with a small number of terms. Mathematically, this approach to zero is “double exponential” (that is, fast), but it has nothing to do with, and should not be confused with, the double-exponential disk model common in galactic dynamics (although this formula is highly useful for working with double-exponential disk models!).
To illustrate this quadrature method for Bessel functions, we can apply it to some integrals with known solutions, for example, relation (B.52) above.
First, we implement the \(\psi(t)\) and \(\psi'(t)\) functions:
[7]:
def de_psi(t):
    return t*numpy.tanh(numpy.pi/2.*numpy.sinh(t))
def de_psiprime(t):
    return (numpy.sinh(numpy.pi*numpy.sinh(t))+numpy.pi*t*numpy.cosh(t))\
        /(numpy.cosh(numpy.pi*numpy.sinh(t))+1)
and then some functions that return the relevant zeros of the Bessel function and the weights from Equation (B.62):
[8]:
from scipy import special
def de_xinuk(nu,nzeros):
    return special.jn_zeros(nu,nzeros)/numpy.pi
def de_wnuk(nu,nzeros,xinuk=None):
    if xinuk is None: xinuk= de_xinuk(nu,nzeros)
    return 2./(numpy.pi**2.*xinuk*special.jv(nu+1,numpy.pi*xinuk)**2.)
and finally a function that implements the quadrature formula of Equation (B.60) (here, we stop adding terms once they have a relative contribution less than \(10^{-15}\)):
[9]:
def de_approx(fun,nu,h,nzeros):
    xinuk= de_xinuk(nu,nzeros)
    sumthis= fun(numpy.pi/h*de_psi(h*xinuk))\
        *special.jv(nu,numpy.pi/h*de_psi(h*xinuk))\
        *de_psiprime(h*xinuk)
    sumthis[numpy.fabs(sumthis/numpy.cumsum(sumthis)) < 1e-15]= 0.
    sumthis[numpy.isnan(sumthis)]= 0.
    return numpy.pi*numpy.sum(de_wnuk(nu,nzeros,xinuk=xinuk)*sumthis)
Applying the numerical quadrature formula to Equation (B.52) for \(m=0,1,2\) for a range of step sizes \(h\), we get the results in Figure B.2.
[10]:
figure(figsize=(6,4))
hs= numpy.geomspace(1e-3,1,100)
loglog(hs,[numpy.fabs(1.-de_approx(lambda x: 1.,0,h,10000)) for h in hs],
       label=r'$m=0$')
loglog(hs,[numpy.fabs(1.-de_approx(lambda x: 1.,1,h,10000)) for h in hs],
       label=r'$m=1$')
loglog(hs,[numpy.fabs(1.-de_approx(lambda x: 1.,2,h,10000)) for h in hs],
       label=r'$m=2$')
xlabel(r'$h$')
ylabel(r'$1-\int \mathrm{d}x\,J_m$')
legend(frameon=False,fontsize=18.);
Figure B.2: Double-exponential integration of integrals containing Bessel functions.
It is clear that we reach very high precision when using a suitably small step \(h\).
B.3.6. Orthogonal polynomials¶
Orthogonal polynomials come up in the solution of many partial differential equations in (astro)physics and, in this book, in particular in the multipole and basis-function expansions discussed in Chapter 12.3. Let \(P_n(x)\) denote a polynomial of degree \(n\); then a family \(\{P_n(x)\}\) of orthogonal polynomials of orders \(n=0,1,\ldots\) satisfies the following orthogonality conditions \begin{align} \int\mathrm{d}x\,W(x)\,P_n(x)\,P_m(x) & = 0\,,\quad n\neq m\,,\\ \int\mathrm{d}x\,W(x)\,P_n(x)\,P_n(x) & = A_n\,, \end{align} where \(A_n\) is a constant that can depend on the degree of the polynomial (that is, orthogonal polynomials are not usually defined to be orthonormal). The weight function \(W(x)\) defines the particular family. All of the families defined in this way are classical orthogonal polynomials (to distinguish them from more modern families defined in a more general manner). For example, for the Legendre polynomials, we have that \begin{align} W(x) = \begin{cases} 1 & -1 \leq x \leq 1\,,\\ 0 & \mathrm{otherwise}\,. \end{cases} \end{align} For the Gegenbauer polynomials \(C_n^{(\alpha)}(x)\) used in the self-consistent-field method of Chapter 12.3.2, we have that \begin{align} W(x) = \begin{cases} (1-x^2)^{\alpha-1/2} & -1 \leq x \leq 1\,,\\ 0 & \mathrm{otherwise}\,. \end{cases} \end{align} The Gegenbauer polynomials are normalized such that \(A_n = 2^{1-2\alpha}\,\pi\,\Gamma(n+2\alpha)/[\Gamma(n+1)\,(n+\alpha)\,\Gamma^2(\alpha)]\). The Legendre polynomials are a special case of the Gegenbauer polynomials with \(\alpha = 1/2\) and the Gegenbauer polynomials are themselves a special case of the Jacobi polynomials, which have \(W(x) = (1-x)^\alpha\,(1+x)^\beta\) over the same \(-1\leq x \leq 1\) range. Other families of polynomials use different weight functions and different ranges. For example, Hermite polynomials have \(W(x) = e^{-x^2}\) and extend over the entire real axis.
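The orthogonality conditions and the normalization constants \(A_n\) can be checked numerically (a sketch; the helper names are ours, and we use the known Legendre normalization \(A_n = 2/[2n+1]\)):

```python
import numpy
from scipy import integrate, special

# <P_n, P_m> for the Legendre polynomials (W(x)=1 on [-1,1]);
# for Legendre, A_n = 2/(2n+1)
def legendre_inner(n, m):
    return integrate.quad(lambda x: special.eval_legendre(n, x)
                          *special.eval_legendre(m, x), -1., 1.)[0]

# Same for the Gegenbauer polynomials with weight (1-x^2)^(alpha-1/2)
def gegenbauer_inner(n, m, alpha):
    return integrate.quad(lambda x: (1.-x**2)**(alpha-0.5)
                          *special.eval_gegenbauer(n, alpha, x)
                          *special.eval_gegenbauer(m, alpha, x), -1., 1.)[0]
```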
The sequence of polynomials \(P_n(x)\) for a given weight function can be obtained by Gram-Schmidt orthogonalization of the basic polynomial sequence \(1, x, x^2, \ldots\) with an inner product defined by \begin{equation}\label{eq-math-orthopoly-innerproduct} \langle f, g\rangle = \int\mathrm{d}x\,W(x)\,f(x)\,g(x)\,. \end{equation} Gram-Schmidt orthogonalization is the process of orthogonalizing a set of vectors in a vector space defined by an inner product by subtracting out the projection of the next vector in a sequence onto the sub-space defined by the previous vectors. Denoting the basic polynomials as \(v_i = x^i\) and the resulting orthogonal set of polynomials under the inner product from Equation (B.67) as \(u_i\), Gram-Schmidt orthogonalization computes the orthogonal polynomials starting from \(u_0 = v_0\) as \begin{equation}\label{eq-math-orthopoly-gramschmidt-general} u_n(x) = v_n(x)-{\large\sum_{k=0}^{n-1}}{u_k(x)\,{\langle v_n,u_k \rangle \over \langle u_k,u_k\rangle}}\,. 
\end{equation} For example, for the Legendre polynomials, which are normalized such that \(P_n(1) = 1\), we have that \(1, x, x^2, \ldots\) becomes \begin{align}\label{eq-math-orthopoly-gramschmidt-legendre-0} 1 & \rightarrow 1 \rightarrow P_0(x) = 1\,,\\ x & \rightarrow x-1\,{\int_{-1}^1\mathrm{d}y\,(1\times y) \over \int_{-1}^1\mathrm{d}y\,(1\times 1)} = x \rightarrow P_1(x) = x\,,\label{eq-math-orthopoly-gramschmidt-legendre-1}\\ x^2 & \rightarrow x^2-1\,{\int_{-1}^1\mathrm{d}y\,(1\times y^2) \over \int_{-1}^1\mathrm{d}y\,(1\times 1)}-x\,{\int_{-1}^1\mathrm{d}y\,(y\times y^2) \over \int_{-1}^1\mathrm{d}y\,(y\times y)} = x^2 - {1\over 3}\\ & \rightarrow P_2(x) = {3x^2-1 \over 2}\,,\label{eq-math-orthopoly-gramschmidt-legendre-2}\\ \ldots\\ x^n & \xrightarrow{\text{orthogonalize}} \tilde{P}_n(x) = x^n-{\large\sum_{k=0}^{n-1}}{P_k(x)\,{\int_{-1}^1\mathrm{d}y\,W(y)\,P_k(y)\, y^n \over \int_{-1}^1\mathrm{d}y\,W(y)\,P_k(y)\, P_k(y)}} \label{eq-math-orthopoly-gramschmidt-orthopart}\\ & \xrightarrow{\text{normalize}} P_n(x) = {\tilde{P}_n(x)\over \tilde{P}_n(1) }\,,\label{eq-math-orthopoly-gramschmidt-normpart} \end{align} where the first arrow indicates the orthogonalization and the second the \(P_n(1) = 1\) normalization. The general orthogonalization step in Equation (B.74) is the same for all families of polynomials, but the normalization step in Equation (B.75) depends on the specific normalization scheme used for each family.
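The Gram-Schmidt construction above can be sketched with numpy's `Polynomial` class, using the exact antiderivative to evaluate the inner product for \(W(x)=1\) on \([-1,1]\) (the function names are ours):

```python
import numpy
from numpy.polynomial import Polynomial

def inner(p, q):
    # <p,q> = int_{-1}^{1} dx p(x) q(x), computed exactly via the antiderivative
    P = (p*q).integ()
    return P(1.)-P(-1.)

def legendre_by_gram_schmidt(nmax):
    # Orthogonalize the sequence 1, x, x^2, ... and normalize so that P_n(1)=1
    us = []
    for n in range(nmax+1):
        v = Polynomial([0.]*n+[1.])  # v_n(x) = x^n
        u = v-sum(((inner(v, uk)/inner(uk, uk))*uk for uk in us),
                  Polynomial([0.]))
        us.append(u)
    return [u/u(1.) for u in us]
```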
All classical orthogonal polynomials satisfy a differential equation of the form \begin{equation}\label{eq-math-orthopoly-difeqdefinition} Q(x)\,{\mathrm{d}^2 f \over \mathrm{d} x^2} + L(x)\,{\mathrm{d} f \over \mathrm{d} x}+\lambda\,f = 0\,, \end{equation} where \(Q(x)\) is a polynomial of degree at most two and \(L(x)\) is a linear polynomial, neither of which depends on \(n\). Solutions of this equation are only well-behaved for certain combinations of \(\left[f,\lambda\right] = \left[P_n(x),\lambda_n\right]\) and the orthogonal polynomials in a given family are all of the well-behaved solutions of the differential equation. Equation (B.76) is a Sturm–Liouville problem and the solutions can be obtained using Sturm–Liouville theory, which we will not discuss here, but which guarantees the orthogonality and completeness of the set of solutions. Given the differential equation (B.76), we can define \(R(x) = \exp\left[\int\mathrm{d} x\,L(x)/Q(x)\right]\) and this function is related to the weight function \(W(x)\) as \(R(x) = W(x)\,Q(x)\). The domain of the polynomials is set by the roots of \(Q(x)\), of which there can be two, one, or zero for quadratic, linear, or constant \(Q(x)\). Rodrigues’ formula then provides the orthogonal polynomials as \begin{equation}\label{eq-math-orthopoly-rodrigues-poly} P_n(x) \propto {1 \over W(x)}\,{\mathrm{d}^n \over \mathrm{d} x^n}\left[W(x)\,Q^n(x)\right]\,, \end{equation} where the constant of proportionality depends on how the polynomials are normalized. The corresponding \(\lambda_n\) are given by \begin{equation}\label{eq-math-orthopoly-rodrigues-lambda} \lambda_n = -n\,\left({n-1 \over 2}\,{\mathrm{d}^2 Q\over \mathrm{d} x^2}+{\mathrm{d} L\over \mathrm{d}x}\right)\,, \end{equation} which are constants.
For example, the Legendre polynomials satisfy \begin{equation}\label{eq-math-orthopoly-legendre-diffeq} (1-x^2)\,{\mathrm{d}^2 P_n\over \mathrm{d} x^2} -2x\,{\mathrm{d} P_n\over \mathrm{d} x}+\lambda_n\,P_n = 0\,, \end{equation} so the domain is \([-1,1]\), \(R(x) = 1-x^2\), the weight function is \(W(x) = R(x)/Q(x) = 1\), and the constants are \(\lambda_n = n\,(n+1)\). It is straightforward to check that Rodrigues’ formula from Equation (B.77) generates the sequence given above (see Equation B.69 and following). Equation (B.79) can also be written as \begin{equation}\label{eq-math-orthopoly-legendre-diffeq-alt} {\mathrm{d} \over \mathrm{d} x} \left[(1-x^2)\,{\mathrm{d} P_n\over \mathrm{d} x}\right]+n\,(n+1)\,P_n = 0\,. \end{equation} The Gegenbauer polynomials satisfy a similar differential equation \begin{equation}\label{eq-math-orthopoly-gegenbauer-diffeq} (1-x^2)\,{\mathrm{d}^2 C_n^{(\alpha)}\over \mathrm{d} x^2} -(2\alpha+1)\,x\,{\mathrm{d} C_n^{(\alpha)}\over \mathrm{d} x}+n\,(n+2\alpha)\,C_n^{(\alpha)} = 0\,. \end{equation} Again the domain is \([-1,1]\), now \(R(x) = (1-x^2)^{\alpha+1/2}\) and \(W(x) = (1-x^2)^{\alpha-1/2}\), and we substituted in the constant computed using Equation (B.78).
All classical orthogonal polynomials satisfy three-term recurrence formulae between \(P_{n+1}\), \(P_n\), and \(P_{n-1}\) that make them easy to calculate, especially if all polynomials up to a certain order are required to be evaluated at the same point (as is the case when using them in basis-function expansions). These three-term recurrence formulae take the form \begin{equation} P_{n+1}(x) = \left(A_n\,x+B_n\right)\,P_n(x) + C_n\,P_{n-1}(x)\,, \end{equation} where \((A_n,B_n,C_n)\) are constants that depend on the order \(n\). Deriving that this recurrence relation must exist and determining its coefficients for any specific family of polynomials is most easily done using the monic form \(\pi_n(x)\) of the polynomials, that is, the polynomials normalized such that the coefficient of the highest-order term is one. Considering then \(x\,\pi_n(x)\), we know that this is a polynomial of order \(n+1\), which can be expressed as a linear combination of the \(\pi_i(x)\) with \(i \leq n+1\). The coefficient of the \(\pi_{n+1}(x)\) term in the linear combination must be one; the other coefficients can be obtained by projecting \(x\,\pi_n(x)\) onto each \(\pi_i(x)\) using the inner product from Equation (B.67): \begin{equation}\label{eq-math-orthopoly-recurrence-monic-general} x\,\pi_n(x) = \pi_{n+1}(x) + \sum_{i=0}^n{ \pi_i(x)\,{\langle x\,\pi_n,\pi_i\rangle \over \langle \pi_i,\pi_i\rangle}}\,. 
\end{equation} From Equation (B.67), it is clear that \(\langle x\,\pi_n,\pi_i\rangle = \langle \pi_n,x\pi_i\rangle\), but by construction \(\pi_n(x)\) is orthogonal to all lower-order polynomials, so \(\langle \pi_n,x\pi_i\rangle\) can only be non-zero for \(i=n\) and \(i=n-1\) and the expansion in Equation (B.83) reduces to \begin{equation}\label{eq-math-orthopoly-recurrence-monic} \pi_{n+1}(x) = (x-a_n)\,\pi_n(x) - b_n\,\pi_{n-1}(x)\,, \end{equation} where \(a_n = \langle x\,\pi_n,\pi_n\rangle / \langle \pi_n,\pi_n \rangle\) and \(b_n = \langle \pi_n,\pi_n\rangle / \langle \pi_{n-1},\pi_{n-1} \rangle\); to simplify \(b_n\), we again used that \(\langle x\,\pi_n,\pi_i\rangle = \langle \pi_n,x\pi_i\rangle\). Thus, \(b_n\) simply follows from how the polynomials are normalized, while \(a_n\) needs to be determined from the specific properties of the polynomial family.
As an example, the Gegenbauer polynomials satisfy the recurrence relation \begin{equation} (n+1)\,C_{n+1}^{(\alpha)}(x) = 2(n+\alpha)\,x\,C_n^{(\alpha)}(x)-(n+2\alpha-1)\,C_{n-1}^{(\alpha)}(x)\,, \end{equation} starting from \(C_0^{(\alpha)}(x) = 1\) and \(C_1^{(\alpha)}(x) = 2\alpha\,x\).
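A quick check of this recurrence against `scipy.special.eval_gegenbauer` (the chosen \(\alpha\) and \(x\) are arbitrary):

```python
import numpy
from scipy import special

def gegenbauer_next(n, alpha, x, Cn, Cnm1):
    # (n+1) C_{n+1} = 2 (n+alpha) x C_n - (n+2 alpha-1) C_{n-1}
    return (2.*(n+alpha)*x*Cn-(n+2.*alpha-1.)*Cnm1)/(n+1.)

alpha, x = 0.75, 0.3
Cs = [1., 2.*alpha*x]  # C_0 and C_1
for n in range(1, 6):
    Cs.append(gegenbauer_next(n, alpha, x, Cs[n], Cs[n-1]))
```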
To conclude this section, we discuss the associated Legendre polynomials, which are technically not polynomials but functions and which we use in Chapter 12.3. Associated Legendre functions \(P_l^m(x)\) satisfy a generalized version of Equation (B.79) \begin{equation}\label{eq-math-orthopoly-assoclegendre-diffeq} (1-x^2)\,{\mathrm{d}^2 P^m_l\over \mathrm{d} x^2} -2x\,{\mathrm{d} P^m_l\over \mathrm{d} x}+\left[l(l+1)-{m^2 \over 1-x^2}\right]\,P_l^m = 0\,. \end{equation} This equation has a non-singular solution over the interval \([-1,1]\) only if \(l\) and \(m\) are integers with \(|m| \leq l\) and we will only consider these values. For \(m \geq 0\), the associated Legendre functions can be obtained from the Legendre polynomials as \begin{equation} P_l^m(x) = (-1)^m\,(1-x^2)^{m/2}\,{\mathrm{d}^m P_l(x)\over \mathrm{d}x^m}\,. \end{equation} It is clear that for \(m=0\), the associated Legendre functions are equal to the Legendre polynomials. For negative \(m\), the associated Legendre functions are proportional to the positive-\(m\) versions through \begin{equation}\label{eq-math-orthopoly-assoclegendre-def-negm} P_l^{-m}(x) = (-1)^m\,{(l-m)!\over (l+m)!}\,P_l^m(x)\,. \end{equation} For even \(m\), the associated Legendre functions are polynomials themselves, and they are commonly referred to as polynomials for all integer values of \(m\) and \(l\), even though they are not polynomials for odd \(m\). Associated Legendre functions are not orthogonal as a single family, although orthogonality holds for specific combinations of \((l,m,l',m')\) under different weight functions. Nevertheless, associated Legendre polynomials satisfy various recurrence relations similar to those that we found for orthogonal polynomials above, which are useful for computing them in the context of multipole or basis-function expansions.
For example, \begin{equation}\label{eq-math-orthopoly-assoclegendre-recurrence-l} (l-m+1)\,P_{l+1}^m(x) = (2l+1)\,x\,P_l^m(x)-(l+m)\,P_{l-1}^m(x)\,, \end{equation} and \begin{equation}\label{eq-math-orthopoly-assoclegendre-recurrence-m} P_l^{m+1}(x) = -2m\,{x\over \sqrt{1-x^2}}\,P_l^m(x)-(l+m)\,(l-m+1)\,P_l^{m-1}(x)\,, \end{equation} which together allow one to compute all associated Legendre polynomials starting from the lowest-order ones: \begin{align} P_0^0(x) & = 1\,;\quad P_1^0(x) = x\,;\quad P_1^1(x) = -\sqrt{1-x^2}\,;\\ P_2^0(x) & = {3x^2-1 \over 2}\,;\quad P_2^1(x) = -3x\,\sqrt{1-x^2}\,;\quad P_2^2(x) = 3(1-x^2)\,,\\ \ldots \end{align} and the negative-\(m\) functions can be obtained from Equation (B.88).
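The first of these recurrences and the low-order values can be checked with `scipy.special.lpmv`, which uses the same Condon-Shortley phase convention as the definition above (a quick sketch; the chosen \(l\), \(m\), and \(x\) are arbitrary):

```python
import numpy
from scipy import special

# l-recurrence: (l-m+1) P_{l+1}^m = (2l+1) x P_l^m - (l+m) P_{l-1}^m;
# note that lpmv takes the order m first: lpmv(m, l, x)
l, m, x = 3, 1, 0.4
lhs = (l-m+1)*special.lpmv(m, l+1, x)
rhs = (2*l+1)*x*special.lpmv(m, l, x)-(l+m)*special.lpmv(m, l-1, x)
```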
B.3.7. Hypergeometric functions¶
Hypergeometric functions are a very general class of special functions that generalize the geometric power series. Whenever you encounter them in an analytical solution to an integral or other math problem, you tend to pause and wonder whether you are actually happy to see them, because they can be challenging to evaluate numerically and support for them is spotty in many programming languages. The only one that we need in this book is the Gaussian or ordinary hypergeometric function \(_2F_1(a,b;c;z)\), which is defined for \(|z| < 1\) by the series \begin{equation}\label{eq-math-hypergeo-series} _2F_1(a,b;c;z) = \sum_{n=0}^{\infty}{(a)_n\,(b)_n \over (c)_n}\,{z^n \over n!} = 1 + {ab \over c}\,{z \over 1!} + {a(a+1)\,b(b+1)\over c(c+1)}\,{z^2 \over 2!}+\ldots\,, \end{equation} where \((a)_n\) is the rising factorial or Pochhammer symbol, whose definition should be clear from the right-hand side of this equation. Many other special functions are special cases of the hypergeometric functions. For example, the complete elliptic integral of the first kind from Equation (B.41) satisfies \begin{equation} K(m) = {\pi \over 2}\, _2 F_1\left( {1\over 2}, {1\over 2}; 1; m \right)\,. \end{equation}
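scipy implements \(_2F_1\) as `scipy.special.hyp2f1`, and we can check this relation to \(K(m)\) directly (a quick sketch):

```python
import numpy
from scipy import special

ms = numpy.linspace(0., 0.9, 10)
# K(m) = (pi/2) 2F1(1/2, 1/2; 1; m)
K_from_hyp2f1 = numpy.pi/2.*special.hyp2f1(0.5, 0.5, 1., ms)
K_direct = special.ellipk(ms)
```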