4.5. Numerical orbit integration

\label{chapter-spherorb-orbitint}

We end our first dive into the properties of galactic orbits by discussing numerical orbit integration in detail. We will integrate a lot of orbits in this book, and 99.99% of the time this integration will be numerical.

For most gravitational potentials, we cannot analytically solve the equations of motion (the Kepler point-mass potential being a notable exception) or we can, but doing so is quite complicated (for example, the isochrone potential). In such cases, we can solve the equations of motion numerically, which we refer to as “integrating the orbit”. Newton’s second law, which we introduced in Chapter 3.1, is an ordinary differential equation (ODE) \begin{equation}\label{eq-newton-second-numerical} \ddot{\vec{x}} = \vec{g}(\vec{x})\,, \end{equation} where \(\vec{g}(\vec{x})\) is the gravitational field. The typical situation is that we have the initial position \(\vec{x}_0\) and velocity \(\vec{v}_0\) and want to find the position \(\vec{x}(t)\) and velocity \(\vec{v}(t)\) at a later time \(t\); this is an initial-value problem. Among differential equations, Equation (4.62) is not particularly difficult to solve. Galactic gravitational fields and the orbits within them are typically very smooth and, as we discussed in Chapter 2.3, galactic bodies have typically performed only a few dozen orbits, and a few thousand at most. Thus, we can solve the equation of motion with simple methods, like the standard Runge-Kutta method discussed below that can be coded up in a few lines, and we do not need very high precision, because numerical errors cannot build up over such short integration times.

In most situations in galactic dynamics, the equations of motion can be written as Hamilton’s equations (3.35)–(3.36) using the Hamiltonian. As we will discuss below, we can construct better numerical orbit solutions by using and respecting the Hamiltonian structure of phase space.

Before discussing particular algorithms, we note that not all problems in galactic dynamics are of the form described above. For example, to construct surfaces of section we want to find the position and velocity when the orbit starting at \((\vec{x}_0,\vec{v}_0)\) goes through the surface of section, which is a different problem from asking where the initial point ends up after a time \(\Delta t\) has passed. This problem can be expressed in the form above if we re-parameterize the system such that the independent variable becomes a coordinate, say \(\psi\), that measures the distance from the surface of section, such that the orbit goes through the surface of section at particular values of \(\psi\). The problem then has the same form as above, where \(t \rightarrow \psi\) and we ask where the orbit is at certain values of \(\psi\). We discuss this particular problem briefly in Chapter 13.1, where we will use it to study surfaces of section.
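
In practice, when using a general-purpose solver, this crossing-finding problem is often handled with event detection rather than re-parameterization; for example, SciPy’s solve_ivp can locate the zeros of a user-supplied function of the phase-space point along the solution. A minimal sketch of this alternative approach (the Plummer field, initial conditions, and tolerances are our illustrative choices, not from the text):

```python
import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, w, GM=1.0, b=0.1):
    """dw/dt = (v, g(x)) for a Plummer field, in solve_ivp's (t, w) convention."""
    x, v = w[:3], w[3:]
    return np.concatenate([v, -GM * x / (np.dot(x, x) + b**2)**1.5])

def cross_y(t, w):
    return w[1]          # zero whenever the orbit crosses the y = 0 plane
cross_y.direction = 1    # only record upward (v_y > 0) crossings

sol = solve_ivp(rhs, (0.0, 200.0), [1.0, 0.0, 0.0, 0.3, 0.8, 0.0],
                events=cross_y, rtol=1e-10, atol=1e-10)
points = sol.y_events[0]  # phase-space points on the surface of section
```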

4.5.1. General ODE solvers

\label{sec-spherorb-generalode}

We write Equation (4.62) as a set of coupled, first-order (that is, only containing first-order derivatives) ordinary differential equations for \((\vec{x},\vec{v})\) \begin{align}\label{eq-newton-second-numerical-first-order-set} \dot{\vec{x}} & = \vec{v}\,,\\ \dot{\vec{v}} & = \vec{g}(\vec{x},t)\,, \end{align} where we have made the possible dependence of the field on time explicit. To determine the time evolution \([\vec{x},\vec{v}](t)\) of an initial point \((\vec{x}_0,\vec{v}_0)\), we can solve this set of differential equations using any of a large number of standard techniques. To simplify the notation, we use \(\vec{w} = (\vec{x},\vec{v})\) and Equation (4.63) then becomes \begin{equation} \dot{\vec{w}} = \vec{f}(\vec{w},t)\,, \end{equation} with \(\vec{f}(\vec{w},t) = (\vec{v},\vec{g}[\vec{x},t])\). While we did not write down a velocity dependence for the force (because forces in galactic dynamics typically do not have any), the force can depend on velocity without changing the formalism in this section; one just needs to take the velocity into account when computing \(\vec{g}\) (this is, for example, necessary when considering dynamical friction [Chapter 19.4.1] or orbits in a rotating reference frame [Chapter 19.4.2.1]). The simplest method for solving this equation is the Euler method, in which one obtains the position and velocity at time \(t+\Delta t\) by moving along the derivative \begin{align}\label{eq-euler-step-dt} \vec{w}(t+\Delta t) = \vec{w}(t)+\Delta t\,\vec{f}(\vec{w},t)\,. \end{align} The error in this method is \(\mathcal{O}([\Delta t]^2)\) and we say that this is a first-order method. In general, we call a method an \(n\)-th-order method when the error is \(\mathcal{O}([\Delta t]^{n+1})\).
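
As a concrete illustration, a minimal Python sketch of the Euler method for an orbit in a Plummer potential might look as follows (the field, function names, and initial conditions are our illustrative choices):

```python
import numpy as np

def g_plummer(x, GM=1.0, b=0.1):
    """Gravitational field g = -GM x / (|x|^2 + b^2)^(3/2) of a Plummer sphere."""
    return -GM * x / (np.dot(x, x) + b**2)**1.5

def euler_orbit(x0, v0, dt, n_steps, g=g_plummer):
    """Integrate the orbit with first-order Euler steps, returning the trajectory."""
    x, v = np.asarray(x0, dtype=float), np.asarray(v0, dtype=float)
    traj = np.empty((n_steps + 1, 6))
    traj[0] = np.concatenate([x, v])
    for i in range(n_steps):
        x, v = x + dt * v, v + dt * g(x)  # both updates use the *old* (x, v)
        traj[i + 1] = np.concatenate([x, v])
    return traj

# A near-circular orbit at r = 1; tracking its energy along the trajectory
# makes the rapid error growth of the Euler method easy to see.
orbit = euler_orbit([1.0, 0.0, 0.0], [0.0, 0.99, 0.0], dt=1e-3, n_steps=10_000)
```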

Euler’s method is conceptually simple—one simply takes a small step along the gradient, re-evaluates the gradient, takes a small step along the updated gradient, and so forth—but the error increases with each step and quickly becomes large, even when using a small step \(\Delta t\).

A better Euler-like approximation would be to use the derivative at the mid-way point between \(\vec{w}(t)\) and \(\vec{w}(t+\Delta t)\) to move the point \(\vec{w}(t)\). Of course, we cannot know a priori what the mid-way point is! However, we can estimate the mid-way point by doing an Euler iteration (Equation 4.66) with a step \(\Delta t/2\), evaluating the gradient at the estimated mid-way point, and then using this gradient to move from \(\vec{w}(t)\) to \(\vec{w}(t+\Delta t)\). This gives the following method \begin{align} \vec{k}_1 & = \Delta t\,\vec{f}\left(\vec{w},t\right)\,\\ \vec{k}_2 & = \Delta t\,\vec{f}\left(\vec{w}+\frac{\vec{k}_1}{2},t+\frac{\Delta t}{2}\right)\,\\ \vec{w}(t+\Delta t) & = \vec{w}(t)+\vec{k}_2\,. \end{align} This is the second-order Runge-Kutta method (RK2). It has smaller errors than the Euler method, but requires two evaluations of \(\vec{f}\left(\vec{w},t\right)\) rather than one. Typically, the derivative is easy enough to evaluate that the time saved by being able to take larger steps at a given accuracy by far makes up for the cost of evaluating the function twice. However, when we discuss Hamiltonian integration below, we will see that for Hamiltonian systems we can design second-order methods that only use a single evaluation of the gradient of the potential.
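
In code, a single RK2 step is a direct transcription of these equations; here is a sketch using a helper f for the phase-space derivative (our notation, with w = (x, v) as a six-vector):

```python
import numpy as np

def f(w, t, g):
    """Phase-space derivative dw/dt = (v, g(x)) for w = (x, v)."""
    return np.concatenate([w[3:], g(w[:3])])

def rk2_step(w, t, dt, g):
    """One second-order Runge-Kutta step."""
    k1 = dt * f(w, t, g)
    k2 = dt * f(w + k1 / 2.0, t + dt / 2.0, g)  # derivative at estimated mid-point
    return w + k2
```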

We can design higher-order methods by evaluating the derivatives at more than one intermediate point and/or at an estimated endpoint and combining the information from all derivatives in such a way that the total error scales as \(\mathcal{O}([\Delta t]^{n+1})\) with \(n > 2\). The most popular such method computes the gradient \(\vec{k}_2\) at the estimated mid-way point as in the RK2 method, but then uses this gradient to provide a different estimate for the mid-way point and computes the gradient \(\vec{k}_3\) there. This gradient is then used to estimate the endpoint, and we can compute the gradient \(\vec{k}_4\) at the estimated endpoint. Finally, the gradient at the initial point, \(\vec{k}_1\), is combined with the gradients at the intermediate points and at the endpoint to give an estimate of the endpoint that is fourth-order accurate. In detail, the method looks as follows: \begin{align} \vec{k}_1 & = \Delta t\,\vec{f}\left(\vec{w},t\right)\,\\ \vec{k}_2 & = \Delta t\,\vec{f}\left(\vec{w}+\frac{\vec{k}_1}{2},t+\frac{\Delta t}{2}\right)\,\\ \vec{k}_3 & = \Delta t\,\vec{f}\left(\vec{w}+\frac{\vec{k}_2}{2},t+\frac{\Delta t}{2}\right)\,\\ \vec{k}_4 & = \Delta t\,\vec{f}\left(\vec{w}+\vec{k}_3,t+\Delta t\right)\,\\ \vec{w}(t+\Delta t) & = \vec{w}(t)+\frac{\vec{k}_1}{6}+\frac{\vec{k}_2}{3}+\frac{\vec{k}_3}{3}+\frac{\vec{k}_4}{6}\,. \end{align} The coefficients in the final line are chosen such that the error scales as \(\mathcal{O}([\Delta t]^{5})\); this is the fourth-order Runge-Kutta (RK4) method. The fourth-order Runge-Kutta method uses four gradient evaluations to estimate \(\vec{w}(t+\Delta t)\). The extra work from the four gradient evaluations is typically offset by the increased accuracy when integrating orbits in smooth galactic potentials.
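
The corresponding RK4 step, reusing the phase-space derivative f from the RK2 sketch above:

```python
def rk4_step(w, t, dt, g):
    """One fourth-order Runge-Kutta step (four evaluations of f per step)."""
    k1 = dt * f(w, t, g)
    k2 = dt * f(w + k1 / 2.0, t + dt / 2.0, g)
    k3 = dt * f(w + k2 / 2.0, t + dt / 2.0, g)
    k4 = dt * f(w + k3, t + dt, g)
    return w + k1 / 6.0 + k2 / 3.0 + k3 / 3.0 + k4 / 6.0
```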

Higher-order methods of this type can be designed by using even more mid-way estimates of the gradient; all such methods become more complicated as one goes to higher order. One advantage of certain higher-order Runge-Kutta-style methods is that one can construct multiple estimates of \(\vec{w}(t+\Delta t)\) from the estimated mid-way and endpoint gradients, and the difference between these estimates can be used to estimate the error. This error estimate can then be employed to adaptively change the step \(\Delta t\), reducing the step when the error is larger than desired. Dormand-Prince methods of fifth and eighth order are popular examples of this type of method (see Press et al. 2007 for more information on these).
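
One rarely needs to hand-code these adaptive methods: SciPy’s solve_ivp, for instance, implements the eighth-order Dormand-Prince method as method='DOP853' with built-in adaptive step-size control. A sketch (the Plummer field, initial conditions, and tolerances are illustrative choices):

```python
import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, w, GM=1.0, b=0.1):
    """dw/dt = (v, g(x)) for a Plummer field."""
    x, v = w[:3], w[3:]
    return np.concatenate([v, -GM * x / (np.dot(x, x) + b**2)**1.5])

sol = solve_ivp(rhs, (0.0, 100.0), [1.0, 0.0, 0.0, 0.0, 0.99, 0.0],
                method='DOP853', rtol=1e-10, atol=1e-12)
```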

As for the Euler method, errors accumulate with every time step when integrating typical orbits with Runge-Kutta (or Dormand-Prince) methods. If we use these methods to integrate orbits for many dynamical times, the total error becomes large unless one uses a prohibitively small time step. Therefore, these methods are only useful when integrating orbits for a relatively small number of dynamical times (e.g., hundreds for RK4, depending on the step size). For \(N\)-body simulations, the gravity calculation is sufficiently slow that in practice we can only evaluate the gradient in Equation (4.63) once per time step. The best method that uses only a single gradient evaluation for a Hamiltonian system is not the Euler method, but the method that we discuss next.

One situation where we have to use the methods from this section is when there are dissipative, non-gravitational forces that affect the dynamics of a star or galactic system. Examples of this include dynamical friction (see Chapter 19.4.1) or interactions between gravitational bodies and a gas disk (for example, in a proto-planetary disk).

4.5.2. Hamiltonian integration

\label{sec-spherorb-hamiltonianint}

The general methods for solving ordinary differential equations that we discussed in the previous section all essentially work by discretizing the differential equation. For a Hamiltonian system, we can instead discretize the Hamiltonian in a way that allows us to exactly integrate Hamilton’s equations for the discretized Hamiltonian. The advantage of such a method is that it is straightforward to discretize the Hamiltonian in such a way that the discretized and original Hamiltonian are indistinguishable on long time scales. The long-term behavior of an orbit integration will therefore remain close to the true trajectory without accumulating errors like the general ODE solvers above. We term such methods Hamiltonian integration methods.

To illustrate this, we consider the Hamiltonian in Cartesian coordinates \begin{equation} H(\vec{q},\vec{p}) = \frac{|\vec{p}|^2}{2} + \Phi(\vec{q})\,. \end{equation} Hamilton’s equations of motion for this Hamiltonian are \begin{align} \dot{\vec{q}} & = \phantom{-}\,\vec{p}\,,\\ \dot{\vec{p}} & = -\,\vec{\nabla}\Phi(\vec{q})\,. \end{align}

Let us now discretize the Hamiltonian by multiplying the potential in it with a Dirac comb \(\operatorname{III}(t;\Delta t) = \Delta t\,\sum_{j=-\infty}^{\infty} \delta\left(t-j\,\Delta t\right)\), where \(\delta(\cdot)\) is the Dirac delta function (see Appendix B.3.2). The discretized Hamiltonian \(H_{\Delta t}\) then becomes \begin{align}\label{eq-hamiltonian-discretized} H_{\Delta t}(\vec{q},\vec{p}) & = \frac{|\vec{p}|^2}{2} + \Phi(\vec{q})\,\Delta t\,\sum_{j=-\infty}^{\infty} \delta\left(t-j\,\Delta t\right)\,. \end{align} Using the Poisson summation formula, we can write this equivalently as \begin{align} H_{\Delta t}(\vec{q},\vec{p}) & = \frac{|\vec{p}|^2}{2} + \Phi(\vec{q})\,\sum_{j=-\infty}^{\infty} \cos\left(\frac{2\pi\,j}{\Delta t}\,t\right)\nonumber\,. \end{align} Thus, we have effectively replaced the potential with a time-dependent potential that fluctuates with periods \(T = \Delta t / j\), where \(j\) is an integer. On time scales \(\gg \Delta t\), all of these fluctuations average out and we have that \(H_{\Delta t} \approx H\). Thus, on time scales larger than the step \(\Delta t\), the difference between the discretized and the original Hamiltonian is small.

Hamilton’s equations for the discretized Hamiltonian in Equation (4.78) are \begin{align}\label{eq-discrete-hamiltonian-qdot} \dot{\vec{q}} & = \phantom{-}\,\vec{p}\,,\\ \dot{\vec{p}} & = -\,\vec{\nabla}\Phi(\vec{q})\,\Delta t\,\sum_{j=-\infty}^{\infty} \delta\left(t-j\,\Delta t\right)\,.\label{eq-discrete-hamiltonian-pdot} \end{align} These equations can be solved exactly, because the gradient in the second equation is now non-zero only at the discrete times \(j\,\Delta t\). Consider the first step, which ends with the kick at \(t=\Delta t\) (that is, \(j=1\)). If we are at \(t=\varepsilon\), where \(\varepsilon \ll \Delta t\) is an infinitesimal number, we can integrate analytically up to \(t=\Delta t-\varepsilon\), because \(\dot{\vec{p}}\) is zero at all these times, and we thus get the update step \begin{align}\label{eq-symplectic-drift-eps} \vec{q}(\Delta t-\varepsilon) & = \vec{q}(\varepsilon) + (\Delta t - 2\varepsilon)\,\vec{p}(\varepsilon)\,,\\ \vec{p}(\Delta t-\varepsilon) & = \vec{p}(\varepsilon)\,.\nonumber \end{align} When we then integrate from \(\Delta t-\varepsilon\) to \(\Delta t+\varepsilon\), we can integrate Equations (4.79) and (4.80) over the time interval \(2\varepsilon\) \begin{align} \vec{q}(\Delta t+\varepsilon) = \vec{q}(\Delta t-\varepsilon)+\int_{\Delta t-\varepsilon}^{\Delta t+\varepsilon}\mathrm{d}t\,\dot{\vec{q}} & = \vec{q}(\Delta t-\varepsilon)+2\varepsilon\,\vec{p}(\Delta t-\varepsilon)\,\nonumber\\ \vec{p}(\Delta t+\varepsilon) = \vec{p}(\Delta t-\varepsilon)+\int_{\Delta t-\varepsilon}^{\Delta t+\varepsilon}\mathrm{d}t\,\dot{\vec{p}} & = \vec{p}(\Delta t-\varepsilon)-\,\vec{\nabla}\Phi\left(\vec{q}[\Delta t-\varepsilon]\right)\,\Delta t\,.\label{eq-symplectic-kick-eps} \end{align} If we then let \(\varepsilon \rightarrow 0\), Equations (4.81) and (4.82) reduce to \begin{align} \vec{q}(\Delta t) & = \vec{q}(0) + \Delta t\,\vec{p}(0)\,,\label{eq-symplectic-drift}\\ \vec{p}(\Delta t) & = \vec{p}(0)-\Delta t\,\vec{\nabla}\Phi\left(\vec{q}[\Delta t]\right)\,.\label{eq-symplectic-kick} \end{align}

Thus, we first let the position move at constant momentum for a time \(\Delta t\), called the drift step, and we then update the momentum using the gradient computed at the new position, which is called the kick step. This method is similar to the Euler method from Equation (4.66), except that it uses the force at the new position, rather than at the old; this method is therefore called the modified Euler method. One can show that it is also a first-order method.
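
A sketch of one such step, a direct transcription of the drift and kick equations above (the function names are ours):

```python
def modified_euler_step(x, v, dt, g):
    """First-order modified Euler step: drift at constant momentum, then kick
    using the gradient evaluated at the *new* position."""
    x_new = x + dt * v
    v_new = v + dt * g(x_new)
    return x_new, v_new
```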

Using a similar procedure, we can construct a second-order method that is widely used. Rather than starting at \(t=\varepsilon\) in the above derivation, we start at \(t=-\Delta t/2\) and drift until \(t=-\varepsilon\); we then integrate over the kick from \(t=-\varepsilon\) to \(t=\varepsilon\), and finally drift until \(t=\Delta t/2\). This also increments the system by a step \(\Delta t\); shifting all the times by \(\Delta t/2\) in this procedure (such that the step goes from \(t=0\) to \(t=\Delta t\), with the kick at \(t=\Delta t/2\)), the actual sequence of steps is \begin{align} \vec{q}\left(\frac{\Delta t}{2}\right) & = \vec{q}(0) + \frac{\Delta t}{2}\,\vec{p}(0)\,,\\ \vec{p}\left(\frac{\Delta t}{2}\right) & = \vec{p}(0)-\Delta t\,\vec{\nabla}\Phi\left(\vec{q}\left[\frac{\Delta t}{2}\right]\right)\,,\\ \vec{q}\left(\Delta t\right) & = \vec{q}\left(\frac{\Delta t}{2}\right) + \frac{\Delta t}{2}\,\vec{p}\left(\frac{\Delta t}{2}\right)\,. \end{align} This is the drift-kick-drift leapfrog method, a second-order method with errors \(\mathcal{O}([\Delta t]^3)\). Like the Euler and modified Euler methods, it requires only a single force evaluation to advance the orbit by a time \(\Delta t\), but unlike those methods, it is second order. A further speed-up can be achieved by chaining successive drift steps together: the final half-drift of one step and the first half-drift of the next use the same momentum and can therefore be combined into a single full-step drift. As long as we do not need to know the position at every time \(t=j\,\Delta t\), we can chain the drifts in this way (for example, when we only record the orbit every ten steps, we do not need the position for nine out of ten steps and can chain their drifts together). For this reason, the leapfrog integrator is the preferred method for almost all collisionless simulations, which do not require high accuracy, because they use approximate forces anyway.
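
In code, one drift-kick-drift step reads as follows (a minimal sketch; g is a gravitational-field function as in the earlier sketches):

```python
def leapfrog_step(x, v, dt, g):
    """One second-order drift-kick-drift leapfrog step (one force evaluation)."""
    x_half = x + 0.5 * dt * v          # half drift
    v_new = v + dt * g(x_half)         # full kick at the mid-point position
    x_new = x_half + 0.5 * dt * v_new  # half drift with the updated momentum
    return x_new, v_new
```

Chaining the drifts as described above amounts to replacing the two consecutive half drifts between kicks by a single full drift.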

The main advantage of integrators that integrate an approximate Hamiltonian exactly, rather than approximately integrating the exact Hamiltonian, is that they are a legitimate phase-space mapping. Thus, Hamiltonian integration satisfies, e.g., Liouville’s theorem for the approximate Hamiltonian (see Chapter 5.3). As we discussed above, on time scales \(\gg \Delta t\) the exact and approximate Hamiltonians approximately coincide. For a small enough step \(\Delta t\), the energy error therefore typically remains bounded, and energy errors do not accumulate as they do for the standard ODE solvers above.
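
This bounded-energy behavior is easy to verify numerically. The following self-contained sketch (the Plummer field, initial conditions, and step size are illustrative choices) integrates a leapfrog orbit for roughly 150 orbital periods and prints the relative energy error, which typically oscillates at the \(\mathcal{O}([\Delta t]^2)\) level rather than growing secularly as it would for the Runge-Kutta methods above:

```python
import numpy as np

GM, b = 1.0, 0.1
g   = lambda x: -GM * x / (np.dot(x, x) + b**2)**1.5
phi = lambda x: -GM / np.sqrt(np.dot(x, x) + b**2)
E   = lambda x, v: 0.5 * np.dot(v, v) + phi(x)

x, v = np.array([1.0, 0.0, 0.0]), np.array([0.0, 0.99, 0.0])
dt, E0 = 0.01, E(x, v)
for _ in range(100_000):               # total time 1,000, ~150 orbital periods
    x_half = x + 0.5 * dt * v          # drift
    v = v + dt * g(x_half)             # kick
    x = x_half + 0.5 * dt * v          # drift
print("relative energy error:", abs(E(x, v) / E0 - 1.0))
```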

Hamiltonian integrators are often called symplectic. This name comes from the fact that these integrators are Hamiltonian maps, whose mathematical structure is that of a vector flow on a symplectic manifold. Many fine dynamicists have made great contributions to the field without delving deeply into the meaning of the previous sentence and we do not discuss this further.

Most orbit integrations that one encounters in galactic dynamics can be solved using the simple leapfrog integrator above or, for test-particle orbits in a smooth gravitational field, with the fourth-order Runge-Kutta method from the previous section. Collisionless \(N\)-body simulations typically use the leapfrog method, with particles assigned to a hierarchy of step sizes \(\Delta t = \Delta T / 2^k\) that are power-of-two subdivisions of a maximum step size \(\Delta T\); kicks are only applied at times that are integer multiples of a particle’s step size \(\Delta T / 2^k\), where \(k\) can vary between particles. This is called a block time step scheme. For further information on this scheme or on other integrators, we refer the interested reader to Dehnen & Read (2011).
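
As a small illustration of the bookkeeping involved (not of any particular simulation code), assigning a particle to a rung of the block scheme amounts to rounding its desired step down to the nearest power-of-two subdivision of \(\Delta T\):

```python
import numpy as np

def block_time_step(dt_desired, dT):
    """Largest dt = dT / 2**k (k a non-negative integer) with dt <= dt_desired."""
    k = max(0, int(np.ceil(np.log2(dT / dt_desired))))
    return dT / 2**k, k

# e.g. block_time_step(0.3, 1.0) -> (0.25, 2)
```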