3. Elements of classical mechanics: Section 4 (Dynamics and Astrophysics of Galaxies)

3.4. Hamiltonian mechanics

In the Lagrangian framework, the coordinate \(\vec{q}\) has primacy: the Lagrangian \(\mathcal{L}(\vec{q},\dot{\vec{q}},t)\) and the Lagrange equations serve to provide a (typically) second-order equation of motion for \(\vec{q}\), while the derivatives \(\dot{\vec{q}}\) only appear as a convenient short-hand for the time derivative of \(\vec{q}\), without having much meaning in and of themselves. The Lagrangian formalism makes it straightforward to derive the equations of motion in a transformed set of coordinates \(\vec{q} = \vec{G}(\vec{x})\) (where \(\vec{G}(\cdot)\) is some vector-valued, invertible function). However, much is to be gained, both in understanding the structure of the theory of classical mechanics and in its practical application to galactic dynamics, by considering broader sets of transformations.

3.4.1. Hamilton’s equations

In Hamiltonian mechanics, this is achieved by upgrading the canonical momentum \(\vec{p}\) to a prime ingredient of the theory on the same footing as the coordinate \(\vec{q}\). Because the Lagrangian is a function of \(\vec{q}\) and \(\dot{\vec{q}}\), in practice this is accomplished by performing a Legendre transformation of the Lagrangian. A Legendre transformation of a differentiable function \(f(x,y)\) of two variables where \(u = \partial f / \partial x\) and \(v = \partial f / \partial y\) is a transformation of the form \begin{equation} g = f - ux\,. \end{equation} In terms of \((x,y)\) the differential of \(f\) is \begin{equation} \mathrm{d}f = u\,\mathrm{d}x+v\,\mathrm{d}y\,. \end{equation} The differential of the Legendre transformation \(g\) is \begin{equation} \mathrm{d}g= \mathrm{d}f - u\,\mathrm{d}x - x\,\mathrm{d}u = -x\,\mathrm{d}u+v\,\mathrm{d}y\,. \end{equation} Thus, the function \(g\) is specified in terms of \((u,y)\) and we have that \begin{align} x & = -\frac{\partial g}{\partial u}\,;\quad v = \phantom{-}\frac{\partial g}{\partial y}\,. \end{align} We then obtain the Hamiltonian as the following Legendre transformation of the Lagrangian with respect to \(\dot{\vec{q}}\), with the conventional overall sign flip relative to \(g = f - ux\) (remember that \(\vec{p} = \partial \mathcal{L} / \partial \dot{\vec{q}}\) from Equation 3.26) \begin{equation}\label{eq-hamiltonian-def-lagrangian} H(\vec{q},\vec{p},t) = \dot{\vec{q}}\,\vec{p}-\mathcal{L}(\vec{q},\dot{\vec{q}},t)\,. \end{equation} In this equation, \(\dot{\vec{q}}\) is determined by \(\vec{q}\), \(\vec{p}\), \(t\) and Equation (3.26).
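As a concrete check of this construction, the sketch below performs the Legendre transformation numerically for a one-dimensional harmonic oscillator and compares the result with \(T+V\) written in terms of \(p\). The choice of system, the parameter values `M` and `K`, and all function names are assumptions made for illustration only.

```python
# Illustrative sketch: numerical Legendre transformation L(q, v) -> H(q, p)
# for a 1D harmonic oscillator. M, K and all function names are assumptions
# made for this example only.
M = 2.0  # mass
K = 3.0  # spring constant


def lagrangian(q, v):
    """L = T - V with T = M v^2 / 2 and V = K q^2 / 2."""
    return 0.5 * M * v**2 - 0.5 * K * q**2


def momentum(q, v, h=1e-6):
    """Canonical momentum p = dL/dv from a central finite difference."""
    return (lagrangian(q, v + h) - lagrangian(q, v - h)) / (2.0 * h)


def velocity_from_momentum(q, p, lo=-100.0, hi=100.0):
    """Invert p(q, v) for v by bisection (assumes p increases with v)."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if momentum(q, mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def hamiltonian(q, p):
    """H(q, p) = v p - L(q, v), with v eliminated in favor of p."""
    v = velocity_from_momentum(q, p)
    return v * p - lagrangian(q, v)


q0, p0 = 0.7, 1.3
H_numeric = hamiltonian(q0, p0)
H_expected = p0**2 / (2.0 * M) + 0.5 * K * q0**2  # T + V in terms of p
print(H_numeric, H_expected)
```

The bisection step stands in for the statement that \(\dot{\vec{q}}\) is determined by \(\vec{q}\), \(\vec{p}\), and \(t\); for this simple Lagrangian the inversion could of course be done by hand.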

Why is the Hamiltonian a useful quantity to introduce? Because it leads to a new, and in some sense simpler, set of equations of motion. The total derivative \(\mathrm{d}H\) of the Hamiltonian can be written in two forms: simply using its own variables: \begin{equation}\label{eq-dH-own} \mathrm{d}H = \frac{\partial H}{\partial \vec{q}}\,\mathrm{d}\vec{q} + \frac{\partial H}{\partial \vec{p}}\,\mathrm{d}\vec{p} + \frac{\partial H}{\partial t}\,\mathrm{d}t\,, \end{equation} and using its definition in terms of the Lagrangian in Equation (3.32) \begin{align} \mathrm{d}H & = \vec{p}\,\mathrm{d}\dot{\vec{q}}+\dot{\vec{q}}\,\mathrm{d}\vec{p} - \frac{\partial \mathcal{L}}{\partial \vec{q}}\,\mathrm{d}\vec{q} - \frac{\partial \mathcal{L}}{\partial \dot{\vec{q}}}\,\mathrm{d}\dot{\vec{q}}-\frac{\partial \mathcal{L}}{\partial t}\,\mathrm{d}t = - \frac{\partial \mathcal{L}}{\partial \vec{q}}\,\mathrm{d}\vec{q} +\dot{\vec{q}}\,\mathrm{d}\vec{p}-\frac{\partial \mathcal{L}}{\partial t}\,\mathrm{d}t\,,\label{eq-dH-L} \end{align} where the cancellation happens because of Equation (3.26). Equating the prefactors of each of the differentials in Equations (3.33) and (3.34) and using the Lagrange equations in the form of Equation (3.27) then gives the following system of equations \begin{align}\label{eq-classmech-Hamiltons-eq-qdot} \dot{\vec{q}} & = \phantom{-}\frac{\partial H}{\partial \vec{p}}\,,\\ \dot{\vec{p}} & = -\frac{\partial H}{\partial \vec{q}}\,,\label{eq-classmech-Hamiltons-eq-pdot} \end{align} known as Hamilton’s equations, as well as \begin{equation} \frac{\partial H}{\partial t} = -\frac{\partial \mathcal{L}}{\partial t}\,. \end{equation}

Hamilton’s equations are a set of equations of motion that is equivalent to the Lagrange equations, but they clearly treat the coordinates \(\vec{q}\) and momenta \(\vec{p}\) on the same footing. Rather than having a set of second-order differential equations for \(\vec{q}\), we now have a set of first-order differential equations for \((\vec{q},\vec{p})\) that is twice as large.
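The first-order structure of Hamilton’s equations makes them easy to integrate numerically. The sketch below does so for a one-dimensional harmonic oscillator with a fourth-order Runge-Kutta step and compares against the analytic solution. The system, the parameter values, and the function names are assumptions for illustration; a production orbit integrator would typically use a symplectic scheme instead.

```python
import math

# Illustrative sketch: integrate Hamilton's equations for a 1D harmonic
# oscillator, H = p^2 / (2 M) + K q^2 / 2. M, K, the step size, and the
# function names are assumptions made for this example.
M = 1.0
K = 4.0


def hamilton_rhs(w):
    """Right-hand side of Hamilton's equations for w = (q, p)."""
    q, p = w
    dq_dt = p / M    # +dH/dp
    dp_dt = -K * q   # -dH/dq
    return (dq_dt, dp_dt)


def rk4_step(w, dt):
    """One classical fourth-order Runge-Kutta step for dw/dt = rhs(w)."""
    k1 = hamilton_rhs(w)
    k2 = hamilton_rhs((w[0] + 0.5 * dt * k1[0], w[1] + 0.5 * dt * k1[1]))
    k3 = hamilton_rhs((w[0] + 0.5 * dt * k2[0], w[1] + 0.5 * dt * k2[1]))
    k4 = hamilton_rhs((w[0] + dt * k3[0], w[1] + dt * k3[1]))
    return (
        w[0] + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
        w[1] + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0,
    )


w0 = (1.0, 0.5)  # initial condition (q0, p0)
dt, n_steps = 1e-3, 5000
w = w0
for _ in range(n_steps):
    w = rk4_step(w, dt)

# Analytic solution at the final time, for comparison
omega = math.sqrt(K / M)
t = n_steps * dt
q_exact = w0[0] * math.cos(omega * t) + w0[1] / (M * omega) * math.sin(omega * t)
p_exact = w0[1] * math.cos(omega * t) - M * omega * w0[0] * math.sin(omega * t)
print(w, (q_exact, p_exact))
```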

The space spanned by \(\vec{q}\) is known as configuration space, while that spanned by \(\vec{w} = (\vec{q},\vec{p})\) is phase space. Phase space is the fundamental space of classical mechanics and Hamilton’s equations describe the dynamics in this space. For an \(N\)-dimensional configuration space, phase space is \(2N\)-dimensional and Hamilton’s equations provide \(2N\) first-order differential equations. The solution of this set of differential equations includes \(2N\) integration constants, which can be related to the initial condition \(\vec{w}_0\).

For a set of generalized coordinates that is related to \(\vec{x}\) by a transformation that does not explicitly depend on time, it can be shown that the Hamiltonian is equal to the energy: \(H = T+V\) (technically, this requires the potential to not depend on the velocities, which is the case for gravitational potentials). The total time derivative of the Hamiltonian is \begin{equation}\label{eq-classmech-hamiltonian-energy-conservation} \frac{\mathrm{d} H}{\mathrm{d} t} = \frac{\partial H}{\partial \vec{q}}\,\dot{\vec{q}}+\frac{\partial H}{\partial \vec{p}}\,\dot{\vec{p}} + \frac{\partial H}{\partial t} = \frac{\partial H}{\partial t}\,. \end{equation} The first two terms cancel because of Hamilton’s equations. Thus, we find that when the Hamiltonian does not explicitly depend on time, the Hamiltonian, or energy, is conserved. This is a more general statement than the one in Equation (3.12) that is useful when investigating, for example, dynamics in non-inertial reference frames (see Chapter 19.4.2.1).
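The relation \(\mathrm{d}H/\mathrm{d}t = \partial H/\partial t\) can be checked numerically along an orbit. The sketch below uses a driven oscillator, whose Hamiltonian depends explicitly on time, and compares the change in \(H\) along the trajectory with the time integral of \(\partial H/\partial t\). The Hamiltonian, the parameters `A` and `NU`, and the function names are assumptions chosen for illustration.

```python
import math

# Illustrative sketch: check dH/dt = (partial H / partial t) along a
# trajectory for a driven oscillator, H = p^2/2 + q^2/2 + A q cos(NU t).
# A, NU, the step size, and the function names are assumptions.
A = 0.3
NU = 0.7


def hamiltonian(q, p, t):
    return 0.5 * p**2 + 0.5 * q**2 + A * q * math.cos(NU * t)


def dH_dt_partial(q, p, t):
    """Explicit (partial) time derivative of H at fixed (q, p)."""
    return -A * NU * q * math.sin(NU * t)


def rhs(q, p, t):
    """Hamilton's equations: (dq/dt, dp/dt) = (dH/dp, -dH/dq)."""
    return p, -q - A * math.cos(NU * t)


def rk4_step(q, p, t, dt):
    k1q, k1p = rhs(q, p, t)
    k2q, k2p = rhs(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p, t + 0.5 * dt)
    k3q, k3p = rhs(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p, t + 0.5 * dt)
    k4q, k4p = rhs(q + dt * k3q, p + dt * k3p, t + dt)
    q_new = q + dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6.0
    p_new = p + dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
    return q_new, p_new


q, p, t, dt = 1.0, 0.0, 0.0, 1e-3
H_start = hamiltonian(q, p, t)
integral = 0.0  # trapezoidal estimate of the integral of partial H / partial t
for _ in range(20000):
    f_old = dH_dt_partial(q, p, t)
    q, p = rk4_step(q, p, t, dt)
    t += dt
    integral += 0.5 * dt * (f_old + dH_dt_partial(q, p, t))

delta_H = hamiltonian(q, p, t) - H_start
print(delta_H, integral)
```

Setting `A = 0` removes the explicit time dependence, in which case both printed numbers should vanish to within the integration error.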

As an example of the Hamiltonian framework, we can compute the Hamiltonian for cylindrical coordinates, starting from the results in Section 3.3 above. Recall that the Lagrangian was given by Equation (3.24). The generalized momenta are therefore \begin{align}\label{eq-hamiltonian-cylin-momenta} p_R & = m\,\dot{R}\,; \quad p_\phi = m\,R^2\,\dot{\phi}\,; \quad p_z = m\,\dot{z}\,. \end{align} We can then compute the Hamiltonian \begin{align} H(R,\phi,z,p_R,p_\phi,p_z) & = \dot{R}\,p_R+\dot{\phi}\,p_\phi+\dot{z}\,p_z-\mathcal{L}(\vec{q},\dot{\vec{q}},t)\\ & = \frac{1}{2m}\,\left(p_R^2 + \frac{p_\phi^2}{R^2}+p_z^2\right) + m\Phi(R,\phi,z)\,.\label{eq-hamiltonian-cylin} \end{align}

The Hamiltonian in two-dimensional polar coordinates is similarly \begin{align} H(R,\phi,p_R,p_\phi) & = \frac{1}{2m}\,\left(p_R^2 + \frac{p_\phi^2}{R^2}\right) + m\Phi(R,\phi)\,.\label{eq-hamiltonian-polar} \end{align}
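A quick numerical sanity check of the polar form is to evaluate it and the Cartesian Hamiltonian at the same phase-space point. In the sketch below, the logarithmic potential, the mass `M`, and the function names are assumptions made for illustration.

```python
import math

# Illustrative sketch: the polar-coordinate Hamiltonian agrees with the
# Cartesian one at the same phase-space point. The potential, the mass, and
# the function names are assumptions for this example.
M = 1.5


def potential(R):
    """Hypothetical axisymmetric potential Phi(R) (logarithmic form)."""
    return 0.5 * math.log(1.0 + R**2)


def hamiltonian_cartesian(x, y, px, py):
    return (px**2 + py**2) / (2.0 * M) + M * potential(math.hypot(x, y))


def hamiltonian_polar(R, phi, pR, pphi):
    return (pR**2 + pphi**2 / R**2) / (2.0 * M) + M * potential(R)


def to_polar(x, y, px, py):
    """Map (x, y, px, py) to (R, phi, p_R, p_phi), using p_R = M dR/dt and
    p_phi = M R^2 dphi/dt expressed through the Cartesian momenta."""
    R = math.hypot(x, y)
    phi = math.atan2(y, x)
    pR = (x * px + y * py) / R
    pphi = x * py - y * px
    return R, phi, pR, pphi


point = (0.8, -0.3, 0.2, 1.1)
H_cart = hamiltonian_cartesian(*point)
H_pol = hamiltonian_polar(*to_polar(*point))
print(H_cart, H_pol)
```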

An important theorem that we can prove using Hamilton’s equations is the following:

Poincaré invariant theorem: For any two-dimensional surface \(S(0)\) in phase space that gets mapped to \(S(t)\) under orbit integration, we have that \begin{equation}\label{eq-classmech-poincare-invariant} \int_{S(0)}\mathrm{d}\vec{q}\cdot\mathrm{d}\vec{p} = \int_{S(t)}\mathrm{d}\vec{q}\cdot\mathrm{d}\vec{p}\,. \end{equation} Such conserved quantities are known as Poincaré invariants. Using the version of Green’s theorem from Equation (B.6), this also implies that \begin{equation}\label{eq-classmech-poincare-invariant-lineint} \oint_{\gamma(0)}\mathrm{d}\vec{q}\cdot\vec{p} = \oint_{\gamma(t)}\mathrm{d}\vec{q}\cdot\vec{p}\,, \end{equation} where \(\gamma(t)\) is the closed boundary of \(S(t)\).
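The theorem can be illustrated numerically in a two-dimensional phase space, where the invariant is simply the area enclosed by a loop. The sketch below advances every point on a loop along its pendulum orbit and compares the enclosed area before and after. The pendulum Hamiltonian, the loop, the step size, and the function names are assumptions for illustration; note also that the leapfrog scheme used here is itself area-preserving, so the check reflects a property shared by the exact flow and this integrator.

```python
import math

# Illustrative sketch: the area enclosed by a loop in the (q, p) plane of a
# pendulum, H = p^2/2 - cos(q), is preserved when every point on the loop is
# advanced along its orbit. Loop shape, step size, and names are assumptions.


def leapfrog(q, p, dt, n_steps):
    """Kick-drift-kick leapfrog for H = p^2/2 - cos(q)."""
    for _ in range(n_steps):
        p -= 0.5 * dt * math.sin(q)  # half kick: dp/dt = -dH/dq = -sin(q)
        q += dt * p                  # drift:     dq/dt = +dH/dp = p
        p -= 0.5 * dt * math.sin(q)  # half kick
    return q, p


def enclosed_area(loop):
    """Signed area of a closed polygon (shoelace formula); for a closed
    curve this is the loop integral of q dp, i.e. minus that of p dq."""
    area = 0.0
    for (q1, p1), (q2, p2) in zip(loop, loop[1:] + loop[:1]):
        area += 0.5 * (q1 * p2 - q2 * p1)
    return area


n_points = 400
loop_0 = [
    (1.0 + 0.5 * math.cos(2 * math.pi * k / n_points),
     0.5 * math.sin(2 * math.pi * k / n_points))
    for k in range(n_points)
]
loop_t = [leapfrog(q, p, dt=0.01, n_steps=200) for q, p in loop_0]

area_0 = enclosed_area(loop_0)
area_t = enclosed_area(loop_t)
print(area_0, area_t)
```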

3.4.2. Canonical transformations

Just as the Lagrange equations can be derived from a variational principle (see Section 3.3 above), which is useful in generalizing Newton’s second law to arbitrary configuration-space coordinate frames, Hamilton’s equations can be derived from a variational principle as well. Hamilton’s principle states that the motion of a system in configuration space \(\vec{q}\) between times \(t_1\) and \(t_2\) is such that the action integral, \(S = \int_{t_1}^{t_2}\mathrm{d}t \,\mathcal{L}(\vec{q},\dot{\vec{q}},t)\), has an extremal value for the actual path of the system. To obtain a variational-principle formulation of Hamiltonian mechanics, we upgrade the path from a path in configuration space \(\vec{q}\) to a path in phase space \((\vec{q},\vec{p})\) and replace the Lagrangian with its equivalent in terms of the Hamiltonian using Equation (3.32). This gives the

Modified Hamilton’s principle: The trajectory of a system from time \(t_1\) to time \(t_2\) in phase space \((\vec{q},\vec{p})\) is such that the integral \begin{equation} \int_{t_1}^{t_2}\mathrm{d}t \,\left[\dot{\vec{q}}\,\vec{p}-H(\vec{q},\vec{p},t)\right]\,, \end{equation} where \(H(\vec{q},\vec{p},t)\) is the Hamiltonian, has an extremal value for the actual path of the system. The Euler-Lagrange equations applied to this integral directly lead to Hamilton’s equations.

One may then ask what the most general set of transformations of phase-space \begin{align} \vec{q}' & = \vec{q}'(\vec{q},\vec{p},t)\,;\quad \vec{p}' = \vec{p}'(\vec{q},\vec{p},t)\,, \end{align} is such that the dynamics of \((\vec{q}',\vec{p}')\) is given by Hamilton’s equations for some Hamiltonian \(K\): \begin{align} \dot{\vec{q}'} & = \phantom{-}\frac{\partial K}{\partial \vec{p}'}\,;\quad \dot{\vec{p}'} = -\frac{\partial K}{\partial \vec{q}'}\,. \end{align} These equations would hold if the path of the system is an extremum of the following integral in the modified Hamilton’s principle \begin{equation} \int_{t_1}^{t_2}\mathrm{d}t \,\left(\dot{\vec{q}'}\,\vec{p}'-K(\vec{q}',\vec{p}',t)\right)\,. \end{equation} The path of the system in the transformed phase-space coordinates is an extremum if and only if \begin{equation} \lambda\,\left[\dot{\vec{q}}\,\vec{p}-H(\vec{q},\vec{p},t)\right] = \dot{\vec{q}'}\,\vec{p}'-K(\vec{q}',\vec{p}',t) + \frac{\mathrm{d} F}{\mathrm{d}t}\,, \end{equation} because the path in the original coordinates is an extremum. The final term on the right-hand side is the total derivative of an arbitrary function of the phase-space coordinates and it is there because we can always add a total derivative, as this does not vary at the ends of the trajectory. The \(\lambda\) parameter can be set to 1, because it can be absorbed by a change of units of \((\vec{q}',\vec{p}')\) and the transformation is therefore specified by \begin{equation}\label{eq-condition-canonical-transformation} \dot{\vec{q}}\,\vec{p}-H = \dot{\vec{q}'}\,\vec{p}'-K + \frac{\mathrm{d} F}{\mathrm{d}t}\,. \end{equation}

Transformations that satisfy this relation are canonical transformations. The function \(F\) is called the generating function, because when expressed as a function of a mix of either of the configuration and momentum components of the old and new coordinates (so, e.g., \([\vec{q},\vec{p}']\) or \([\vec{q}',\vec{p}]\), but not \([\vec{q},\vec{p}]\) or \([\vec{q}',\vec{p}']\)) it (implicitly) generates the transformation equations between the old and new phase-space coordinates. For example, for \(F = F_1(\vec{q},\vec{q}',t)\) Equation (3.50) becomes \begin{align} \dot{\vec{q}}\,\vec{p}-H & = \dot{\vec{q}'}\,\vec{p}'-K + \frac{\mathrm{d} F_1(\vec{q},\vec{q}',t)}{\mathrm{d}t} = \dot{\vec{q}'}\,\vec{p}'-K + \frac{\partial F_1}{\partial t} + \frac{\partial F_1}{\partial \vec{q}}\,\dot{\vec{q}} + \frac{\partial F_1}{\partial \vec{q}'}\,\dot{\vec{q}}'\,. \end{align} Because this equation must hold everywhere and \(\vec{q}\) and \(\vec{q}'\) are independent, we must have that \begin{align}\label{eq-gen1stkind-transform} \vec{p} & = \phantom{-}\frac{\partial F_1}{\partial \vec{q}}\,;\quad \vec{p}' = -\frac{\partial F_1}{\partial \vec{q}'}\,. \end{align} The new Hamiltonian \(K\) is then also related to the old Hamiltonian \(H\) through \begin{equation}\label{eq-canonical-hamiltonian-transform} K = H + \frac{\partial F_1}{\partial t}\,. \end{equation} Equations (3.52) make it clear that the transformation defined by the generating function is an implicit transformation: we can compute one of the new and one of the old coordinates using the transformation equations, not both of the new or both of the old components.

We labeled the function as \(F_1\) in the example above, because when the generating function depends on the configuration coordinate of both the old and the new coordinate system, the function is a generating function of the first kind. In total, there are four different types of generating functions, one for each of the combinations of (old,new) and (configuration,momentum)-component. These lead to slightly different transformation equations that are easily derived in the same manner as for \(F_1(\vec{q},\vec{q}',t)\) above. For example, a generating function of the second kind is a function \(F_2(\vec{q},\vec{p}',t)\) such that in Equation (3.50) \(F = F_2(\vec{q},\vec{p}',t) - \vec{q}'\,\vec{p}'\). Substituting this into Equation (3.50) gives the transformation equations \begin{align}\label{eq-gen2ndkind-transform} \vec{p} & = \frac{\partial F_2}{\partial \vec{q}}\,;\quad \vec{q}' = \frac{\partial F_2}{\partial \vec{p}'}\,. \end{align} Useful exercises are to show that (a) the generating function of the second kind \(F_2(\vec{q},\vec{p}') = \vec{q}\cdot\vec{p}'\) generates the identity transformation \((\vec{q},\vec{p}) \rightarrow (\vec{q},\vec{p})\) and (b) that the transformation generated by \(F_2(\vec{q},\vec{p}') = \vec{f}(\vec{q})\,\vec{p}'\) is that which performs the coordinate transformation \(\vec{q} \rightarrow \vec{q}' = \vec{f}(\vec{q})\). All configuration-space coordinate transformations are thus canonical; in the context of canonical transformations, they are also known as point transformations.
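The implicit character of these transformation equations can be made concrete with a small numerical sketch. Given \((q,p)\) in one dimension, the code below solves \(p = \partial F_2/\partial q\) for \(p'\) and then evaluates \(q' = \partial F_2/\partial p'\), for the two generating functions of exercises (a) and (b). The particular function `f`, the finite-difference step, and all names are assumptions made for illustration.

```python
# Illustrative sketch: obtain the canonical transformation implied by a
# generating function of the second kind, F_2(q, p'), using
#   p = dF_2/dq,   q' = dF_2/dp'.
# The function f, the finite-difference step, and all names are assumptions.


def f(q):
    """Hypothetical invertible point transformation q' = f(q)."""
    return q**3 + q


def F2_identity(q, p_new):
    return q * p_new


def F2_point(q, p_new):
    return f(q) * p_new


def transform(F2, q, p, h=1e-5):
    """Return (q', p') from (q, p). The relation p = dF_2/dq is implicit in
    p'; it is solved with a single secant update, which is exact when F_2 is
    linear in p' (true for both examples here)."""
    def dF2_dq(p_new):
        return (F2(q + h, p_new) - F2(q - h, p_new)) / (2.0 * h)

    g0, g1 = dF2_dq(0.0), dF2_dq(1.0)
    p_new = (p - g0) / (g1 - g0)
    q_new = (F2(q, p_new + h) - F2(q, p_new - h)) / (2.0 * h)
    return q_new, p_new


q0, p0 = 0.5, 1.4
identity_result = transform(F2_identity, q0, p0)
point_result = transform(F2_point, q0, p0)
print(identity_result)                # close to (q0, p0)
print(point_result, f(q0), p0 / (3 * q0**2 + 1))
```

For the point transformation, the new momentum comes out as \(p' = p / f'(q)\), which is what keeps the combined map canonical.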

One important aspect of canonical transformations is that they have various useful invariants. If we consider canonical transformations that do not explicitly depend on time, we can show that the phase-space volume \(|\mathrm{d}\vec{q}\,\mathrm{d}\vec{p}|\) is invariant. The transformation of an infinitesimal volume is given by the absolute value of the Jacobian of the transformation \begin{equation}\label{eq-transform-phasespace-volume} |\mathrm{d}\vec{q}'\,\mathrm{d}\vec{p}'| = \left| \begin{matrix} \frac{\partial \vec{q}'}{\partial \vec{q}} & \frac{\partial \vec{q}'}{\partial \vec{p}}\\ \frac{\partial \vec{p}'}{\partial \vec{q}} & \frac{\partial \vec{p}'}{\partial \vec{p}}\end{matrix}\right|\, |\mathrm{d}\vec{q}\,\mathrm{d}\vec{p}|\,. \end{equation} For a canonical transformation that does not explicitly involve time, the Hamiltonian does not change: \(K = H\) (see Equation 3.53). Therefore, we can compute the time derivative of \(\vec{q}'\) explicitly through partial differentiation through the canonical transformation and using Hamilton’s equations for the original coordinates: \begin{align} \dot{\vec{q}}' & = \frac{\partial \vec{q}'}{\partial \vec{q}}\,\dot{\vec{q}}+\frac{\partial \vec{q}'}{\partial \vec{p}}\,\dot{\vec{p}}= \frac{\partial \vec{q}'}{\partial \vec{q}}\,\frac{\partial H}{\partial \vec{p}}-\frac{\partial \vec{q}'}{\partial \vec{p}}\,\frac{\partial H}{\partial \vec{q}}\,, \end{align} and, alternatively, through Hamilton’s equations for the transformed coordinates and partial differentiation of the inverse transformation \begin{align} \dot{\vec{q}}' & = \frac{\partial H}{\partial \vec{p}'} = \frac{\partial H}{\partial \vec{q}} \,\frac{\partial \vec{q}}{\partial \vec{p}'} +\frac{\partial H}{\partial \vec{p}} \,\frac{\partial \vec{p}}{\partial \vec{p}'}\,. 
\end{align} For these two alternatives to give the same result, we must have that \begin{align} \frac{\partial \vec{q}'}{\partial \vec{q}} & = \frac{\partial \vec{p}}{\partial \vec{p}'}\,; \quad \frac{\partial \vec{q}'}{\partial \vec{p}} = -\frac{\partial \vec{q}}{\partial \vec{p}'}\,. \end{align} Similarly, from working out \(\dot{\vec{p}}'\) in two ways, we find that \begin{align} \frac{\partial \vec{p}'}{\partial \vec{q}} & = -\frac{\partial \vec{p}}{\partial \vec{q}'}\,;\quad \frac{\partial \vec{p}'}{\partial \vec{p}} = \frac{\partial \vec{q}}{\partial \vec{q}'}\,. \end{align} If we substitute these expressions into the Jacobian in Equation (3.55), we find that (using the fact that we can bring signs out of the Jacobian) \begin{equation} \left| \begin{matrix} \frac{\partial \vec{q}'}{\partial \vec{q}} & \frac{\partial \vec{q}'}{\partial \vec{p}}\\ \frac{\partial \vec{p}'}{\partial \vec{q}} & \frac{\partial \vec{p}'}{\partial \vec{p}}\end{matrix}\right| = \left| \begin{matrix} \frac{\partial \vec{p}}{\partial \vec{p}'} & \frac{\partial \vec{q}}{\partial \vec{p}'}\\ \frac{\partial \vec{p}}{\partial \vec{q}'}& \frac{\partial \vec{q}}{\partial \vec{q}'}\end{matrix}\right| \end{equation} The determinant on the right-hand side is the determinant of the inverse transformation, which is equal to the inverse of the determinant of the original transformation. Therefore, we have that the determinant is equal to its own inverse and, thus, has to equal one (because it is positive). This means that Equation (3.55) simplifies to \(|\mathrm{d}\vec{q}'\,\mathrm{d}\vec{p}'| = |\mathrm{d}\vec{q}\,\mathrm{d}\vec{p}|\). Phase-space volume is thus conserved by canonical transformations.
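As a numerical illustration of this result, the sketch below estimates the Jacobian determinant by finite differences for two maps of a two-dimensional phase space: the map to \((\theta, J)\) with \(J = (q^2+p^2)/2\), which is canonical, and the superficially similar map to plain polar coordinates \((\theta, r)\), which is not. Both maps (written for unit mass and frequency) and all function names are assumptions chosen for illustration.

```python
import math

# Illustrative sketch: finite-difference Jacobian determinant of two maps of
# a 2D phase space (q, p). Both maps and all names are assumptions chosen
# for illustration (unit mass and frequency).


def to_angle_action(q, p):
    """(q, p) -> (theta, J) with J = (q^2 + p^2) / 2; a canonical map."""
    return math.atan2(q, p), 0.5 * (q**2 + p**2)


def to_plain_polar(q, p):
    """(q, p) -> (theta, r) with r = sqrt(q^2 + p^2); not canonical."""
    return math.atan2(q, p), math.hypot(q, p)


def jacobian_determinant(transform, q, p, h=1e-6):
    """Determinant of d(q', p') / d(q, p) from central finite differences."""
    a_plus, b_plus = transform(q + h, p)
    a_minus, b_minus = transform(q - h, p)
    c_plus, d_plus = transform(q, p + h)
    c_minus, d_minus = transform(q, p - h)
    da_dq = (a_plus - a_minus) / (2 * h)
    db_dq = (b_plus - b_minus) / (2 * h)
    da_dp = (c_plus - c_minus) / (2 * h)
    db_dp = (d_plus - d_minus) / (2 * h)
    return da_dq * db_dp - da_dp * db_dq


q0, p0 = 1.2, 0.5
det_canonical = jacobian_determinant(to_angle_action, q0, p0)
det_polar = jacobian_determinant(to_plain_polar, q0, p0)
print(det_canonical, det_polar, 1.0 / math.hypot(q0, p0))
```

The plain polar map has determinant \(1/r\), so it only preserves phase-space area on the circle \(r = 1\).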

3.4.3. The Hamilton-Jacobi equation and action-angle coordinates

Because Hamiltonian mechanics and the canonical-transformation formalism allow transformations to phase-space coordinate systems that mix traditional configuration and momentum variables, they can be used to separate dynamical systems into conserved quantities and non-conserved quantities. Using the definition of a canonical transformation above, we can seek a transformation characterized by a generating function of the second kind \(S(\vec{q},\vec{p}',t)\) such that the Hamiltonian \(K\) for the transformed coordinates is equal to zero. When this is the case, Hamilton’s equations for the transformed coordinates are simply \begin{align}\label{eq-hamilton-jacobi-ideal-eqs} \dot{\vec{q}}' & = \phantom{-}\frac{\partial K}{\partial \vec{p}'} = 0\,;\quad \dot{\vec{p}}' = -\frac{\partial K}{\partial \vec{q}'} = 0\,. \end{align} Thus, the transformed coordinates are all conserved and the solution of the equations of motion is trivial. To obtain these simple equations, we require \(K = 0\) or from Equation (3.53) \begin{equation}\label{eq-classmech-HJ-inE} H\left(\vec{q},\frac{\partial S}{\partial \vec{q}},t\right) + \frac{\partial S}{\partial t} = 0\,. \end{equation} This is the Hamilton-Jacobi equation and the generating function \(S\) is Hamilton’s principal function. This is a first-order partial differential equation in \(N+1\) variables \((\vec{q},t)\) and its solution is characterized by \(N+1\) constants of integration \(C_i\), one of which is simply a constant addition to \(S\) that is irrelevant because \(S\) only appears in the dynamical problem through its derivative. Because the other \(N\) constants \(C_i\) fully characterize the solution, we can express \(S\) as a function \(S(\vec{q},\vec{C},t)\) and, thus, we can set the transformed momentum \(\vec{p}' = \vec{C}\). That is, the transformed momentum is simply the set of \(N\) integration constants.

If the Hamiltonian does not explicitly depend on time, \(H\) and the energy are conserved. The function \(S\) can then be separated as \(S(\vec{q},\vec{C},t) = W(\vec{q};\vec{C})-Et\) and one of the \(C_i\), say \(C_0\), equals \(E\). The Hamilton-Jacobi equation then simplifies to \begin{equation}\label{eq-classmech-HJ-Econserved} H\left(\vec{q},\frac{\partial W}{\partial \vec{q}}\right)= C_0 = E\,, \end{equation} where \(W(\vec{q};\vec{C})\) is Hamilton’s characteristic function. Solving the Hamilton-Jacobi equation for general systems, galactic or otherwise, is difficult. The only systems for which it is practical to solve the Hamilton-Jacobi equation are those for which the equation can be solved using additive separation of variables. That is, systems for which one can show that \(W(\vec{q};\vec{C})\) along a dynamical trajectory can be written as a sum over functions of a single component: \(W(\vec{q};\vec{C}) = \sum_i W_i(q_i;\vec{C})\). For example, for a two-dimensional system with a potential that only depends on \(R\), we can write the Hamilton-Jacobi equation in polar coordinates using the expression for the Hamiltonian from Equation (3.42) \begin{equation} {1 \over 2m}\,\left[\left(\frac{\partial W}{\partial R}\right)^2 + \frac{1}{R^2}\,\left(\frac{\partial W}{\partial \phi}\right)^2\right] + m\Phi(R) = E\,, \end{equation} which we can separate as \(W(R,\phi) = W_\phi(\phi) + W_R(R)\) as (we sometimes drop the argument \(\vec{C}\) for notational simplicity) \begin{equation} \left(\frac{\partial W_\phi(\phi)}{\partial \phi}\right)^2 = 2m\,R^2\,\left[E-m\Phi(R)\right]-R^2\,\left(\frac{\partial W_R(R)}{\partial R}\right)^2\,. \end{equation} The left-hand side only depends on \(\phi\), while the right-hand side only depends on \(R\), so both sides must be equal to the same constant, the separation constant, which is \(m^2\,L_z^2\) (\(L_z\) is the conserved, specific angular momentum). 
The full solution of the Hamilton-Jacobi equation in this case is then (we assume that the trajectory is bound, such that it has finite bounds in each coordinate) \begin{equation}\label{eq-Wfunction-polar-coordinates} W(R,\phi;\vec{C}) = W_\phi(\phi) + W_R(R) = m\,\int_0^\phi\mathrm{d}\phi\,L_z + \int_{R_\mathrm{min}}^R\mathrm{d}R\,\sqrt{2m\,\left[E-m\Phi(R)\right]-\frac{m^2\,L_z^2}{R^2}}\,. \end{equation}

When we solve the Hamilton-Jacobi equation through separation of variables, we can write the solution in a form like this. Rather than using the set of integration constants \(\vec{C}\) as the transformed momentum, we can then instead choose to use the \(N\) quantities \(J_i\) defined as \begin{equation}\label{eq-actions-definition} J_i = \frac{1}{2\pi}\,\oint \mathrm{d}q_i\,\frac{\partial W_{q_i}(q_i;\vec{C})}{\partial q_i}= \frac{1}{2\pi}\,\oint \mathrm{d}q_i\,p_i\,, \end{equation} as the momentum. Because the only quantities on the right-hand side that are not constants of the motion, the \(q_i\), are integrated over, it is clear that \(\vec{J}\) is a function of the integration constants \(\vec{C}\) alone, \(\vec{J}\equiv \vec{J}(\vec{C})\), and the \(J_i\) are thus conserved quantities as well. Because \(C_0 = H\), from the inverse transformation we have that \(H \equiv H(\vec{J})\), that is, the Hamiltonian is a function of the \(\vec{J}\) only, not of its associated configuration coordinates. Considering \(W\) as a function of \((\vec{q},\vec{J})\) instead of \((\vec{q},\vec{C})\) then defines a new canonical transformation, one for which the very simple dynamics of Equation (3.61) no longer holds. The transformed configuration coordinates are now \begin{equation}\label{eq-classmech-aa-angles-fromW} \boldsymbol\theta = \frac{\partial W(\vec{q},\vec{J})}{\partial \vec{J}} = \sum_i \frac{\partial W_i(q_i,\vec{J})}{\partial \vec{J}}\,. \end{equation} Because \(W\) still does not depend on time, the Hamiltonian to be used in Hamilton’s equations for this new set of coordinates \((\boldsymbol\theta,\vec{J})\) is the same and, as discussed above, we have that \(H \equiv H(\vec{J})\). 
Therefore, the time evolution is given by \begin{align}\label{eq-classmech-aa-dangledt} \dot{\boldsymbol\theta} & = \phantom{-}\frac{\partial H(\vec{J})}{\partial \vec{J}} = \mathrm{constant}\,,\\ \dot{\vec{J}} & = -\frac{\partial H(\vec{J})}{\partial \boldsymbol\theta} = 0\,.\label{eq-classmech-aa-dJdt} \end{align}

Thus, while the dynamics is slightly more complicated than the ideal of Equation (3.61), it is still quite simple: the \(\vec{J}\) are constant and the \(\boldsymbol\theta\) increase linearly in time. Phase-space volume is conserved by a canonical transformation and for a separable system, this has to hold for each area spanned by a single pair \((\theta_i,J_i)\). We can then re-write the definition from Equation (3.67) of the action as \begin{align} J_i & = \frac{1}{2\pi}\,\iint \mathrm{d}q_i\,\mathrm{d}p_i = \frac{1}{2\pi}\,\iint \mathrm{d}\theta_i\,\mathrm{d}J_i = \frac{1}{2\pi}\,\oint \mathrm{d}\theta_i\,J_i = \frac{\Delta \theta_i}{2\pi}\,J_i\,, \end{align} where \(\Delta \theta_i\) is the range spanned by the \(\theta_i\) variable and we have used Green’s theorem (Equation B.6). Each \(\theta_i\) component therefore spans \(2\pi\) and these variables are therefore known as angle variables. The \(J_i\) are action variables and the set \((\boldsymbol\theta,\vec{J})\) are angle-action coordinates or action-angle coordinates. The rates at which the angles increase, \begin{equation}\label{eq-classmech-orbfreqs} \boldsymbol\Omega = {\partial H(\vec{J})\over \partial \vec{J}}\,, \end{equation} are the frequencies.
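To make the definitions of the actions and frequencies concrete, the sketch below evaluates the radial action \(J_R = (1/2\pi)\oint \mathrm{d}R\,p_R\) numerically for a planar orbit in a harmonic potential, per unit mass, and estimates the radial frequency from \(\Omega_R = (\partial J_R/\partial E)^{-1}\) at fixed \(L_z\). The choice of potential, the parameter values, and the function names are assumptions for illustration. For this potential the result \(J_R = (E/\omega - |L_z|)/2\) can be obtained in closed form, which provides the comparison value.

```python
import math

# Illustrative sketch: radial action and radial frequency for a planar orbit
# in a harmonic potential Phi(R) = OMEGA^2 R^2 / 2, per unit mass (m = 1).
# The potential, parameter values, and function names are assumptions.
OMEGA = 1.5


def radial_momentum_squared(R, E, Lz):
    """p_R^2 = 2 [E - Phi(R)] - Lz^2 / R^2 for unit mass."""
    return 2.0 * (E - 0.5 * OMEGA**2 * R**2) - Lz**2 / R**2


def turning_points(E, Lz):
    """Roots of p_R^2 = 0, analytic for this particular potential."""
    disc = math.sqrt(E**2 - OMEGA**2 * Lz**2)
    return math.sqrt((E - disc) / OMEGA**2), math.sqrt((E + disc) / OMEGA**2)


def radial_action(E, Lz, n=2000):
    """J_R = (1/pi) * integral of p_R dR from R_min to R_max, evaluated with
    the midpoint rule in u after substituting R = c + a sin(u)."""
    R_min, R_max = turning_points(E, Lz)
    c, a = 0.5 * (R_max + R_min), 0.5 * (R_max - R_min)
    du = math.pi / n
    total = 0.0
    for k in range(n):
        u = -0.5 * math.pi + (k + 0.5) * du
        R = c + a * math.sin(u)
        pR = math.sqrt(max(0.0, radial_momentum_squared(R, E, Lz)))
        total += pR * a * math.cos(u) * du
    return total / math.pi


E, Lz = 2.0, 0.8
J_R = radial_action(E, Lz)
J_R_expected = 0.5 * (E / OMEGA - abs(Lz))  # closed form for this potential

# Radial frequency Omega_R = dH/dJ_R = 1 / (dJ_R/dE) at fixed Lz
h = 1e-4
Omega_R = 2.0 * h / (radial_action(E + h, Lz) - radial_action(E - h, Lz))
print(J_R, J_R_expected, Omega_R, 2.0 * OMEGA)
```

The substitution \(R = c + a\sin u\) removes the square-root behavior of the integrand at the turning points, which is why a simple midpoint rule suffices here.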

The action variables are called the “actions” because of their relation to the action in Hamilton’s principle. The total time derivative of Hamilton’s principal function \(S\) is equal to the Lagrangian \begin{align} \frac{\mathrm{d}S}{\mathrm{d}t} & = \sum_i \frac{\partial S}{\partial q_i}\,\dot{q}_i + \frac{\partial S}{\partial t} = \sum_i p_i\,\dot{q}_i -H = \mathcal{L}\,. \end{align} The function \(S\) is therefore equal to the time integral of the Lagrangian, or the action in Hamilton’s principle, up to an additive constant. Equations (3.66) and (3.67) demonstrate that the important part of \(S\) in the case of time-independent Hamiltonians, the function \(W\), is made up of integrals that are the same as those defining the actions, but over different ranges. Thus, the actions and the action are intimately related.

One important property of the actions is that they are adiabatically invariant. What this means is that when we slowly (adiabatically) change the potential, the actions remain constant. For the change to be slow enough, the Hamiltonian must change in a way in which it does not depend on the angle coordinates, that is, \(H(\vec{J}) \rightarrow H(\vec{J}',t)\) without introducing an angle dependence, where \(\vec{J}'\) are the new actions. Surfaces of constant \(\vec{J}\) are then surfaces of constant \(\vec{J}'\), because all stars with \(\vec{J}\) get mapped to \(\vec{J}'\) regardless of their angles. The curve \(\gamma\) defining the action \(J_i\) in Equation (3.67) then gets mapped to a curve \(\gamma'\) that can be used to define the action \(J_i'\) in the changed potential. By the Poincaré invariant theorem expressed as Equation (3.44), we then have that \(J_i' = J_i\) and the action is conserved. Because the relevant time scales on which the angles matter are the periods \(2\pi/\boldsymbol{\Omega}\) associated with the frequencies, the change is slow enough when it occurs on a time scale that is \(\gg 2\pi/\Omega_i\) for the action \(J_i\).
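Adiabatic invariance is easy to see in a toy problem. The sketch below integrates a one-dimensional harmonic oscillator (unit mass) whose frequency is slowly doubled over a time much longer than the oscillation period; the energy changes substantially, while the action \(J = E/\omega\) stays nearly constant. The oscillator, the ramp shape, the parameter values, and the function names are all assumptions made for illustration.

```python
import math

# Illustrative sketch: adiabatic invariance of the action J = E / omega of a
# 1D harmonic oscillator (unit mass) whose frequency changes slowly from
# OMEGA_0 to OMEGA_1 over a time T_RAMP >> 2 pi / omega. Ramp shape,
# parameter values, and function names are assumptions.
OMEGA_0, OMEGA_1, T_RAMP = 1.0, 2.0, 500.0


def omega(t):
    """Smooth ramp of the frequency between t = 0 and t = T_RAMP."""
    x = min(max(t / T_RAMP, 0.0), 1.0)
    return OMEGA_0 + (OMEGA_1 - OMEGA_0) * 0.5 * (1.0 - math.cos(math.pi * x))


def energy(q, p, t):
    return 0.5 * p**2 + 0.5 * omega(t)**2 * q**2


q, p, t, dt = 1.0, 0.0, 0.0, 0.01
E_start = energy(q, p, t)
J_start = E_start / omega(t)
for _ in range(50000):  # integrate to t = T_RAMP
    p -= 0.5 * dt * omega(t)**2 * q  # half kick with omega(t)
    q += dt * p                      # drift
    t += dt
    p -= 0.5 * dt * omega(t)**2 * q  # half kick with omega(t + dt)

E_end = energy(q, p, t)
J_end = E_end / omega(t)
print(E_end / E_start, J_end / J_start)
```

Shortening `T_RAMP` to a value comparable to the oscillation period should make the change in the action noticeably larger, in line with the time-scale criterion above.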

Why are the Hamilton-Jacobi equation and action-angle coordinates important in galactic dynamics? Because the Hamilton-Jacobi equation can only be solved in special cases, specific solutions only have limited usefulness (although these do include all spherical potentials as discussed in Chapter 4.4 and the \(\phi\) dynamics in an axisymmetric potential, so they are not entirely useless). But Hamilton-Jacobi theory is important in the context of astrophysical dynamics, because it makes it clear that for any system and trajectory for which the Hamilton-Jacobi equation could in principle be solved—these are called regular orbits—bound orbits have three integrals of the motion, which could be chosen to be the actions, and an orbit is essentially a libration in three two-dimensional subspaces that is similar to the libration of a pendulum. For galactic orbits, these are the following three librations: (a) a rotation around the center of the mass distribution, in cylindrical coordinates associated with the angle \(\phi\) with azimuthal action \(J_\phi\) equal to the \(z\)-component \(L_z\) of the angular momentum in axisymmetric potentials, (b) a radial oscillation, with an amplitude characterized by a radial action \(J_R\), towards and away from the center of the mass distribution, and (c) an oscillation perpendicular to an average orbital plane (in a disk galaxy this would be the galaxy’s mid-plane) with an amplitude characterized by a vertical action \(J_z\). The actions, when they can be computed or estimated, quantify the oscillation amplitudes in these three different directions and therefore immediately tell us much about the characteristics of an orbit. 
Because the actions—unlike, say, the energy—are adiabatically invariant, they are the best quantities to label these amplitudes, because they are comparable between different gravitational potentials (as we can adiabatically morph orbits in one potential into orbits in another potential keeping the actions constant). Because an orbit is the combination of three oscillatory motions, it is equivalent to a three-dimensional torus and orbits are therefore also sometimes referred to as tori. The angles describe the time-dependent part of galactic motions and when considering ensembles of orbits believed to be in equilibrium, the angles can therefore not be important variables; action-angle coordinates thus naturally separate the ephemeral from the eternal in the study of galaxies. When considering orbits in galaxies, in which we believe many orbits to be regular and thus to conform to this picture, it is worth keeping this picture in mind. Action-angle coordinates are furthermore very important in the study of the stability of stellar systems and their response to perturbations.


Dynamics and Astrophysics of Galaxies

By Jo Bovy
