D.80 Derivation of perturbation theory

This note derives the perturbation theory results for the solution of the eigenvalue problem $(H_0+H_1)\psi = E\psi$ where $H_1$ is small. The considerations for degenerate problems use linear algebra.

First, “small” is not a valid mathematical term. There are no small numbers in mathematics, just numbers that become zero in some limit. Therefore, to mathematically analyze the problem, the perturbation Hamiltonian will be written as

H_1 \equiv \varepsilon H_{\varepsilon}

where $\varepsilon$ is some chosen number that physically indicates the magnitude of the perturbation potential. For example, if the perturbation is an external electric field, $\varepsilon$ could be taken as the reference magnitude of the electric field. In perturbation analysis, $\varepsilon$ is assumed to be vanishingly small.

The idea is now to start with a good eigenfunction $\psi_{{\vec n},0}$ of $H_0$ (where “good” is still to be defined), and correct it so that it becomes an eigenfunction of $H = H_0+H_1$. To do so, both the desired energy eigenfunction and its energy eigenvalue are expanded in a power series in terms of $\varepsilon$:

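\psi_{\vec n} = \psi_{{\vec n},0} + \varepsilon \psi_{{\vec n},\varepsilon} + \varepsilon^2 \psi_{{\vec n},\varepsilon^2} + \ldots \qquad E_{\vec n} = E_{{\vec n},0} + \varepsilon E_{{\vec n},\varepsilon} + \varepsilon^2 E_{{\vec n},\varepsilon^2} + \ldots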

If $\varepsilon$ is a small quantity, then $\varepsilon^2$ will be much smaller still, and can probably be ignored. If not, then surely $\varepsilon^3$ will be so small that it can be ignored. A result that forgets about powers of $\varepsilon$ higher than one is called first order perturbation theory. A result that also includes the quadratic powers, but forgets about powers higher than two is called second order perturbation theory, etcetera.

Before proceeding with the practical application, a disclaimer is needed. While it is relatively easy to see that the eigenvalues expand in whole powers of $\varepsilon$, (note that they must be real whether $\varepsilon$ is positive or negative), it is much more messy to show that the eigenfunctions must expand in whole powers. In fact, for degenerate energies $E_{{\vec n},0}$ they only do if you choose good states $\psi_{{\vec n},0}$. See Rellich’s lecture notes on Perturbation Theory [Gordon & Breach, 1969] for a proof. As a result the problem with degeneracy becomes that the good unperturbed eigenfunction $\psi_{{\vec n},0}$ is initially unknown. It leads to lots of messiness in the procedures for degenerate eigenvalues described below.

When the above power series are substituted into the eigenvalue problem to be solved,

\left(H_0+\varepsilon H_\varepsilon\right)\psi_{\vec n}
= E_{\vec n}\psi_{\vec n}

the net coefficient of every power of $\varepsilon$ must be equal in the left and right hand sides. Collecting these coefficients and rearranging them appropriately produces:

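\begin{array}{ll}
\varepsilon^0: & (H_0-E_{{\vec n},0})\psi_{{\vec n},0} = 0 \\
\varepsilon^1: & (H_0-E_{{\vec n},0})\psi_{{\vec n},\varepsilon} = - H_\varepsilon\psi_{{\vec n},0} + E_{{\vec n},\varepsilon}\psi_{{\vec n},0} \\
\varepsilon^2: & (H_0-E_{{\vec n},0})\psi_{{\vec n},\varepsilon^2} = - H_\varepsilon\psi_{{\vec n},\varepsilon} + E_{{\vec n},\varepsilon}\psi_{{\vec n},\varepsilon} + E_{{\vec n},\varepsilon^2}\psi_{{\vec n},0} \\
\varepsilon^3: & (H_0-E_{{\vec n},0})\psi_{{\vec n},\varepsilon^3} = - H_\varepsilon\psi_{{\vec n},\varepsilon^2} + E_{{\vec n},\varepsilon}\psi_{{\vec n},\varepsilon^2} + E_{{\vec n},\varepsilon^2}\psi_{{\vec n},\varepsilon} + E_{{\vec n},\varepsilon^3}\psi_{{\vec n},0} \\
\vdots & \cdots
\end{array}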

These are the equations to be solved in succession to give the various terms in the expansion for the wave function $\psi_{\vec n}$ and the energy $E_{\vec n}$. The further you go down the list, the better your combined result should be.

Note that all it takes is to solve problems of the form

(H_0-E_{{\vec n},0})\psi_{{\vec n},\ldots} = \ldots

The equations for the unknown functions are in terms of the unperturbed Hamiltonian $H_0$, with some additional but in principle knowable terms.
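One practical wrinkle of these problems is that the operator $H_0-E_{{\vec n},0}$ is singular: it produces zero when applied to $\psi_{{\vec n},0}$. The sketch below illustrates what that means numerically. The 3×3 diagonal $H_0$, the two right hand sides, and the helper name `solve_singular` are all assumptions made up for illustration. A right hand side without a $\psi_{{\vec n},0}$ component can be matched exactly; one with such a component cannot.

```python
import numpy as np

# Hypothetical unperturbed Hamiltonian with nondegenerate levels (assumption).
E0 = np.array([1.0, 2.0, 4.0])
H0 = np.diag(E0)
n = 0                                  # index of the level being corrected
A = H0 - E0[n] * np.eye(3)             # singular: A applied to psi_n0 gives zero

def solve_singular(rhs):
    """Minimum-norm least-squares solution of A x = rhs, plus the residual norm."""
    x, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return x, np.linalg.norm(A @ x - rhs)

# Right hand side without a psi_n0 component: an exact solution exists.
x_ok, res_ok = solve_singular(np.array([0.0, -1.0, 3.0]))

# Right hand side with a psi_n0 component: no exact solution exists.
x_bad, res_bad = solve_singular(np.array([1.0, -1.0, 3.0]))

print(x_ok, res_ok)
print(x_bad, res_bad)
```

In the first case the residual should vanish to rounding error; in the second it should equal the size of the offending $\psi_{{\vec n},0}$ component. The least-squares routine returns the solution with zero $\psi_{{\vec n},0}$ coefficient, but any multiple of $\psi_{{\vec n},0}$ could be added to it.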

For difficult perturbation problems like you find in engineering, the use of a small parameter $\varepsilon$ is essential to get the mathematics right. But in the simple applications in quantum mechanics, it is usually overkill. So most of the time the expansions are written without it, as

\psi_{\vec n} = \psi_{{\vec n},0} + \psi_{{\vec n},1} + \psi_{{\vec n},2} + \ldots \qquad E_{\vec n} = E_{{\vec n},0} + E_{{\vec n},1} + E_{{\vec n},2} + \ldots

where you are assumed to just imagine that $\psi_{{\vec n},1}$ and $E_{{\vec n},1}$ are “first order small,” $\psi_{{\vec n},2}$ and $E_{{\vec n},2}$ are “second order small,” etcetera. In those terms, the successive equations to solve are:
     $\displaystyle (H_0-E_{{\vec n},0})\psi_{{\vec n},0} = 0$  (D.55)
     $\displaystyle (H_0-E_{{\vec n},0})\psi_{{\vec n},1}
= - H_1\psi_{{\vec n},0}
+ E_{{\vec n},1}\psi_{{\vec n},0}$  (D.56)
     $\displaystyle (H_0-E_{{\vec n},0})\psi_{{\vec n},2}
= - H_1\psi_{{\vec n},1}
+ E_{{\vec n},1}\psi_{{\vec n},1}
+ E_{{\vec n},2}\psi_{{\vec n},0}$  (D.57)
     $\displaystyle (H_0-E_{{\vec n},0})\psi_{{\vec n},3}
= - H_1\psi_{{\vec n},2}
+ E_{{\vec n},1}\psi_{{\vec n},2}
+ E_{{\vec n},2}\psi_{{\vec n},1}
+ E_{{\vec n},3}\psi_{{\vec n},0}$  (D.58)
     $\displaystyle \cdots$   
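For a nondegenerate level, the successive equations can be solved mechanically on a computer. The following is a minimal sketch, not a definitive implementation. It assumes a real symmetric $H_1$, works in the basis of the unperturbed eigenfunctions so that $H_0$ is diagonal, and keeps the $\psi_{{\vec n},0}$ coefficient of every correction zero. The matrices and the function name `perturbation_series` are hypothetical.

```python
import numpy as np

E0 = np.array([1.0, 2.0, 4.0])           # hypothetical nondegenerate levels
H1 = 0.01 * np.array([[0.5, 1.0, 0.0],
                      [1.0, -0.3, 2.0],
                      [0.0, 2.0, 0.8]])  # hypothetical small symmetric perturbation

def perturbation_series(E0, H1, n, order):
    """Solve (D.56), (D.57), ... in succession for nondegenerate level n.

    Returns the lists [E_n0, E_n1, ...] and [psi_n0, psi_n1, ...], with the
    wave functions given as coefficient vectors in the unperturbed basis.
    """
    dim = len(E0)
    psi = [np.eye(dim)[n]]               # psi_n0
    E = [E0[n]]                          # E_n0
    gap = E0 - E0[n]                     # E_m0 - E_n0 for every state m
    others = gap != 0
    for k in range(1, order + 1):
        # The psi_n0 component of the order-k equation fixes E_nk.
        E.append(H1[n] @ psi[k - 1])
        rhs = -H1 @ psi[k - 1] + sum(E[j] * psi[k - j] for j in range(1, k + 1))
        # Divide by the energy gaps; the psi_n0 coefficient stays zero.
        corr = np.zeros(dim)
        corr[others] = rhs[others] / gap[others]
        psi.append(corr)
    return E, psi

E, psi = perturbation_series(E0, H1, n=0, order=3)
exact = np.linalg.eigvalsh(np.diag(E0) + H1)[0]
print(E)
print(sum(E) - exact)
```

The printed difference between the summed series and the exact lowest eigenvalue should be far smaller than the last energy term kept, as expected if the next term is of fourth order in the perturbation.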

Now consider each of these equations in turn. First, (D.55) is just the Hamiltonian eigenvalue problem for $H_0$ and is already satisfied by the chosen unperturbed solution $\psi_{{\vec n},0}$ and its eigenvalue $E_{{\vec n},0}$. However, the remaining equations are not trivial. To solve them, write their solutions in terms of the other eigenfunctions $\psi_{\underline{\vec n},0}$ of the unperturbed Hamiltonian $H_0$. In particular, to solve (D.56), write

\psi_{{\vec n},1} =
\sum_{\underline{\vec n}\ne{\vec n}}
c_{\underline{\vec n},1} \psi_{\underline{\vec n},0}

where the coefficients $c_{\underline{\vec n},1}$ are still to be determined. The coefficient of $\psi_{{\vec n},0}$ is zero on account of the normalization requirement. (And in fact, it is easiest to take the coefficient of $\psi_{{\vec n},0}$ also zero for $\psi_{{\vec n},2}$, $\psi_{{\vec n},3}$, ..., even if it means that the resulting wave function will no longer be normalized.)

The problem (D.56) becomes

\sum_{\underline{\vec n}\ne{\vec n}} c_{\underline{\vec n},1} (E_{\underline{\vec n},0} - E_{{\vec n},0}) \psi_{\underline{\vec n},0} = - H_1\psi_{{\vec n},0} + E_{{\vec n},1}\psi_{{\vec n},0}

where the left hand side was cleaned up using the fact that the $\psi_{\underline{\vec n},0}$ are eigenfunctions of $H_0$. To get the first order energy correction $E_{{\vec n},1}$, the trick is now to take an inner product of the entire equation with $\langle\psi_{{\vec n},0}\vert$. Because of the fact that the energy eigenfunctions of $H_0$ are orthonormal, this inner product produces zero in the left hand side, and in the right hand side it produces:

0 = - H_{{\vec n}{\vec n},1} + E_{{\vec n},1} \qquad H_{{\vec n}{\vec n},1} = \langle\psi_{{\vec n},0}\vert H_1\psi_{{\vec n},0}\rangle

And that is exactly the first order correction to the energy claimed in {A.37.1}; $E_{{\vec n},1}$ equals the Hamiltonian perturbation coefficient $H_{{\vec n}{\vec n},1}$. If the problem is not degenerate or $\psi_{{\vec n},0}$ is good, that is.
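A quick numerical check of this first order result can be sketched as follows. The 3×3 matrices and the helper name `first_order_error` are assumptions for illustration only. Here $H_0$ is taken diagonal and nondegenerate, so the unperturbed eigenfunctions are simply the unit vectors and $H_{{\vec n}{\vec n},1}$ is a diagonal entry of the perturbation matrix.

```python
import numpy as np

# Hypothetical unperturbed Hamiltonian: diagonal, nondegenerate levels.
E0 = np.array([1.0, 2.0, 4.0])
H0 = np.diag(E0)

# Hypothetical symmetric perturbation H_eps; H1 = eps * H_eps.
H_eps = np.array([[0.5, 1.0, 0.0],
                  [1.0, -0.3, 2.0],
                  [0.0, 2.0, 0.8]])

def first_order_error(eps, n=0):
    """Difference between the exact level n and E_n0 + H_nn,1."""
    H1 = eps * H_eps
    exact = np.linalg.eigvalsh(H0 + H1)[n]   # eigenvalues in ascending order
    first_order = E0[n] + H1[n, n]           # E_n0 + <psi_n0 | H1 psi_n0>
    return abs(exact - first_order)

err_a = first_order_error(1e-2)
err_b = first_order_error(1e-3)
print(err_a, err_b, err_a / err_b)
```

Reducing $\varepsilon$ by a factor 10 should reduce the error by a factor of about 100, consistent with a remaining error of order $\varepsilon^2$.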

To get the coefficients $c_{\underline{\vec n},1}$, so that you know what is the first order correction $\psi_{{\vec n},1}$ to the wave function, just take an inner product with each of the other eigenfunctions $\langle\psi_{\underline{\vec n},0}\vert$ of $H_0$ in turn. In the left hand side it only leaves the coefficient of the selected eigenfunction because of orthonormality, and for the same reason, in the right hand side the final term drops out. The result is

c_{\underline{\vec n},1} (E_{\underline{\vec n},0} - E_{{\vec n},0}) = - H_{\underline{\vec n}{\vec n},1} \qquad H_{\underline{\vec n}{\vec n},1} = \langle\psi_{\underline{\vec n},0} \vert H_1\psi_{{\vec n},0}\rangle

The coefficients $c_{\underline{\vec n},1}$ can normally be computed from this.
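As an illustration of how the coefficients might be evaluated in the nondegenerate case, consider the sketch below. The matrices and the helper names `first_order_state` and `state_error` are hypothetical. The exact eigenvector is rescaled so that its $\psi_{{\vec n},0}$ coefficient is one, to match the convention that the corrections contain no $\psi_{{\vec n},0}$.

```python
import numpy as np

E0 = np.array([1.0, 2.0, 4.0])               # hypothetical nondegenerate levels
H0 = np.diag(E0)
H_eps = np.array([[0.5, 1.0, 0.0],
                  [1.0, -0.3, 2.0],
                  [0.0, 2.0, 0.8]])          # hypothetical symmetric perturbation

def first_order_state(H1, n):
    """Coefficients of psi_n0 + psi_n1 in the unperturbed basis."""
    psi = np.zeros(len(E0))
    psi[n] = 1.0                             # psi_n0 itself
    for k in range(len(E0)):
        if k != n:
            # c_k,1 (E_k0 - E_n0) = - H_kn,1
            psi[k] = -H1[k, n] / (E0[k] - E0[n])
    return psi

def state_error(eps, n=0):
    """Distance between the exact eigenvector and the first order result."""
    H1 = eps * H_eps
    _, vecs = np.linalg.eigh(H0 + H1)
    exact = vecs[:, n] / vecs[n, n]          # make the psi_n0 coefficient one
    return np.linalg.norm(exact - first_order_state(H1, n))

print(first_order_state(0.01 * H_eps, 0))
print(state_error(1e-2) / state_error(1e-3))
```

The error ratio should again be roughly 100, suggesting that what is left out of the wave function is of order $\varepsilon^2$.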

Note however that if the problem is degenerate, there will be eigenfunctions $\psi_{\underline{\vec n},0}$ that have the same energy $E_{{\vec n},0}$ as the eigenfunction $\psi_{{\vec n},0}$ being corrected. For these the left hand side in the equation above is zero, and the equation cannot in general be satisfied. If so, it means that the assumption that an eigenfunction $\psi_{\vec n}$ of the full Hamiltonian expands in a power series in $\varepsilon$ starting from $\psi_{{\vec n},0}$ is untrue. Eigenfunction $\psi_{{\vec n},0}$ is bad. And that means that the first order energy correction derived above is simply wrong. To fix the problem, what needs to be done is to identify the submatrix of all Hamiltonian perturbation coefficients in which both unperturbed eigenfunctions have the energy $E_{{\vec n},0}$, i.e. the submatrix

\mbox{all } H_{{\vec n}_i{\vec n}_j,1} \mbox{ with } E_{{\vec n}_i,0}=E_{{\vec n}_j,0}=E_{{\vec n},0}

The eigenvalues of this submatrix are the correct first order energy changes. So, if all you want is the first order energy changes, you can stop here. Otherwise, you need to replace the unperturbed eigenfunctions that have energy $E_{{\vec n},0}$. For each orthonormal eigenvector $(c_1,c_2,\ldots)$ of the submatrix, there is a corresponding replacement unperturbed eigenfunction

c_1 \psi_{{\vec n}_1,0,{\rm old}} + c_2 \psi_{{\vec n}_2,0,{\rm old}} + \ldots

You will need to rewrite the Hamiltonian perturbation coefficients in terms of these new eigenfunctions. (Since the replacement eigenfunctions are linear combinations of the old ones, no new integrations are needed.) You then need to reselect the eigenfunction $\psi_{{\vec n},0}$ whose energy to correct from among these replacement eigenfunctions. Choose the first order energy change (eigenvalue of the submatrix) $E_{{\vec n},1}$ that is of interest to you and then choose $\psi_{{\vec n},0}$ as the replacement eigenfunction belonging to a corresponding eigenvector. If the first order energy change $E_{{\vec n},1}$ is not degenerate, the eigenvector is unique, so $\psi_{{\vec n},0}$ is now good. If not, the good eigenfunction will be some combination of the replacement eigenfunctions that have that first order energy change, and the good combination will have to be figured out later in the analysis.

In any case, the problem with the equation above for the $c_{\underline{\vec n},1}$ will be fixed, because the new submatrix will be a diagonal one: $H_{\underline{\vec n}{\vec n},1}$ will be zero when $E_{\underline{\vec n},0} = E_{{\vec n},0}$ and $\underline{\vec n} \ne {\vec n}$. The coefficients $c_{\underline{\vec n},1}$ for which $E_{\underline{\vec n},0} = E_{{\vec n},0}$ remain indeterminate at this stage. They will normally be found at a later stage in the expansion.
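The degenerate procedure can be sketched numerically as follows. The example assumes a made-up $H_0$ with a doubly degenerate level and a made-up symmetric perturbation; the function name `degenerate_first_order` is hypothetical too. The original unit-vector eigenfunctions are bad here: their diagonal perturbation coefficients are zero, which would wrongly suggest no first order energy change at all.

```python
import numpy as np

# Hypothetical H0: doubly degenerate level E = 1 plus a separate level E = 3.
E0 = np.array([1.0, 1.0, 3.0])
H0 = np.diag(E0)
eps = 1e-3
H1 = eps * np.array([[0.0, 1.0, 0.5],
                     [1.0, 0.0, 0.5],
                     [0.5, 0.5, 0.0]])       # hypothetical symmetric perturbation

def degenerate_first_order(H1, level, tol=1e-12):
    """Diagonalize the submatrix of H1 within one unperturbed energy level.

    Returns the first order energy changes and the replacement unperturbed
    eigenfunctions (as columns, in the full old basis).
    """
    idx = np.flatnonzero(np.abs(E0 - level) < tol)
    sub = H1[np.ix_(idx, idx)]               # all H_ij,1 with E_i0 = E_j0 = level
    shifts, vecs = np.linalg.eigh(sub)
    good = np.zeros((len(E0), len(idx)))
    good[idx, :] = vecs                      # c_1 psi_1,old + c_2 psi_2,old + ...
    return shifts, good

shifts, good = degenerate_first_order(H1, level=1.0)
exact = np.linalg.eigvalsh(H0 + H1)

print(shifts)                # first order energy changes
print(exact[:2] - 1.0)       # exact changes of the two split levels
print(good.T @ H1 @ good)    # new submatrix: diagonal
```

The eigenvalues of the submatrix should agree with the exact splitting up to terms of order $\varepsilon^2$, and the submatrix in terms of the replacement eigenfunctions should come out diagonal.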

With the coefficients $c_{\underline{\vec n},1}$ as found, or not found, the sum for the first order perturbation $\psi_{{\vec n},1}$ in the wave function becomes

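\psi_{{\vec n},1} = - \sum_{E_{\underline{\vec n},0}\ne E_{{\vec n},0}} \frac{H_{\underline{\vec n}{\vec n},1}}{E_{\underline{\vec n},0} - E_{{\vec n},0}} \psi_{\underline{\vec n},0} + \sum_{\textstyle{E_{\underline{\vec n},0}=E_{{\vec n},0} \atop \underline{\vec n}\ne{\vec n}}} c_{\underline{\vec n},1} \psi_{\underline{\vec n},0}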

The entire process repeats for higher order. In particular, to second order (D.57) gives, writing $\psi_{{\vec n},2}$ also in terms of the unperturbed eigenfunctions,

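\sum_{\underline{\vec n}} c_{\underline{\vec n},2} (E_{\underline{\vec n},0} - E_{{\vec n},0}) \psi_{\underline{\vec n},0} = \sum_{E_{\underline{\vec n},0}\ne E_{{\vec n},0}} \frac{H_{\underline{\vec n}{\vec n},1}}{E_{\underline{\vec n},0} - E_{{\vec n},0}} (H_1 - E_{{\vec n},1}) \psi_{\underline{\vec n},0} - \sum_{\textstyle{E_{\underline{\vec n},0}=E_{{\vec n},0} \atop \underline{\vec n}\ne{\vec n}}} c_{\underline{\vec n},1} (H_1 - E_{{\vec n},1}) \psi_{\underline{\vec n},0} + E_{{\vec n},2}\psi_{{\vec n},0}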

To get the second order contribution to the energy, take again an inner product with $\langle\psi_{{\vec n},0}\vert$. That produces, again using orthonormality, (and diagonality of the submatrix discussed above if degenerate),

0 = \sum_{E_{\underline{\vec n},0}\ne E_{{\vec n},0}} \frac{H_{\underline{\vec n}{\vec n},1} H_{{\vec n}\underline{\vec n},1}}{E_{\underline{\vec n},0} - E_{{\vec n},0}} + E_{{\vec n},2}

This gives the second order change in the energy stated in {A.37.1}, if $\psi_{{\vec n},0}$ is good. Note that since $H_1$ is Hermitian, the product of the two Hamiltonian perturbation coefficients in the expression is just the square magnitude of either.
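The second order energy can be checked numerically in the same spirit. In the sketch below, the matrices and the helper names `second_order_energy` and `second_order_error` are assumptions for illustration; the level is nondegenerate, so the sum simply runs over all other states.

```python
import numpy as np

E0 = np.array([1.0, 2.0, 4.0])               # hypothetical nondegenerate levels
H0 = np.diag(E0)
H_eps = np.array([[0.5, 1.0, 0.0],
                  [1.0, -0.3, 2.0],
                  [0.0, 2.0, 0.8]])          # hypothetical symmetric perturbation

def second_order_energy(H1, n):
    """E_n0 + E_n1 + E_n2 for a nondegenerate level n."""
    others = [k for k in range(len(E0)) if k != n]
    E1 = H1[n, n]
    # E_n2 = - sum |H_kn,1|^2 / (E_k0 - E_n0) over all other states k
    E2 = -sum(abs(H1[k, n]) ** 2 / (E0[k] - E0[n]) for k in others)
    return E0[n] + E1 + E2

def second_order_error(eps, n=0):
    H1 = eps * H_eps
    exact = np.linalg.eigvalsh(H0 + H1)[n]   # eigenvalues in ascending order
    return abs(exact - second_order_energy(H1, n))

print(second_order_energy(0.01 * H_eps, 0))
print(second_order_error(1e-2) / second_order_error(1e-3))
```

Reducing $\varepsilon$ by a factor 10 should now reduce the error by a factor of roughly 1000, as expected if the first neglected term is of order $\varepsilon^3$.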

In the degenerate case, when taking an inner product with a $\langle\psi_{\underline{\vec n},0}\vert$ for which $E_{\underline{\vec n},0} = E_{{\vec n},0}$, the equation can be satisfied through the still indeterminate $c_{\underline{\vec n},1}$ provided that the corresponding diagonal coefficient $H_{\underline{\vec n}\underline{\vec n},1}$ of the diagonalized submatrix is unequal to $E_{{\vec n},1} = H_{{\vec n}{\vec n},1}$. In other words, provided that the first order energy change is not degenerate. If that is untrue, the higher order submatrix

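\mbox{all } \sum_{E_{\underline{\vec n},0}\ne E_{{\vec n},0}} \frac{H_{{\vec n}_i\underline{\vec n},1} H_{\underline{\vec n}{\vec n}_j,1}}{E_{{\vec n},0} - E_{\underline{\vec n},0}} \mbox{ with } E_{{\vec n}_i,0}=E_{{\vec n}_j,0}=E_{{\vec n},0} \quad E_{{\vec n}_i,1}=E_{{\vec n}_j,1}=E_{{\vec n},1}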

will need to be diagonalized (the rest of the equation needs to be zero). Its eigenvalues give the correct second order energy changes. To proceed to still higher order, reselect the eigenfunctions following the same general lines as before. Obviously, in the degenerate case the entire process can become very messy. And you may never become sure about the good eigenfunction.

This problem can often be eliminated or greatly reduced if the eigenfunctions of $H_0$ are also eigenfunctions of another operator $A$, and $H_1$ commutes with $A$. Then you can arrange the eigenfunctions $\psi_{\underline{\vec n},0}$ into sets that have the same value for the good quantum number $a$ of $A$. You can analyze the perturbed eigenfunctions in each of these sets while completely ignoring the existence of eigenfunctions with different values for quantum number $a$.

To see why, consider two example eigenfunctions $\psi_1$ and $\psi_2$ of $A$ that have different eigenvalues $a_1$ and $a_2$. Since $H_0$ and $H_1$ both commute with $A$, their sum $H$ does, so

0 = \langle\psi_2\vert(H A - A H)\psi_1\rangle = \langle\psi_2\vert H A\psi_1\rangle - \langle A\psi_2\vert H\psi_1\rangle = (a_1-a_2)\langle\psi_2\vert H\vert\psi_1\rangle

and since $a_1-a_2$ is not zero, $\langle\psi_2\vert H\vert\psi_1\rangle$ must be. Now $\langle\psi_2\vert H\vert\psi_1\rangle$ is the amount of eigenfunction $\psi_2$ produced by applying $H$ on $\psi_1$. It follows that applying $H$ on an eigenfunction with an eigenvalue $a_1$ does not produce any eigenfunctions with different eigenvalues $a$. Thus an eigenfunction of $H$ satisfying

H \left(\sum_{a=a_1}c_{\vec n}\psi_{{\vec n},0} + \sum_{a\ne a_1}c_{\vec n}\psi_{{\vec n},0}\right) = E_{\vec n} \left(\sum_{a=a_1}c_{\vec n}\psi_{{\vec n},0} + \sum_{a\ne a_1}c_{\vec n}\psi_{{\vec n},0}\right)

can be replaced by just $\sum_{a=a_1}c_{\vec n}\psi_{{\vec n},0}$, since this by itself must satisfy the eigenvalue problem: the Hamiltonian of the second sum does not produce any amount of eigenfunctions in the first sum and vice versa. (There must always be at least one value of $a_1$ for which the first sum at $\varepsilon = 0$ is independent of the other eigenfunctions of $H$.) Reduce every eigenfunction of $H$ to an eigenfunction of $A$ in this way. Now the existence of eigenfunctions with different values of $a$ than the one being analyzed can be ignored since the Hamiltonian does not produce them. In terms of linear algebra, the Hamiltonian has been reduced to block diagonal form, with each block corresponding to a set of eigenfunctions with a single value of $a$. If the Hamiltonian also commutes with another operator $B$ that the $\psi_{{\vec n},0}$ are eigenfunctions of, the argument repeats for the subsets with a single value for $b$.

The Hamiltonian perturbation coefficient $\langle\psi_2\vert H_1\vert\psi_1\rangle$ is zero whenever two good quantum numbers $a_1$ and $a_2$ are unequal. The reason is the same as for $\langle\psi_2\vert H\vert\psi_1\rangle$ above. Only perturbation coefficients for which all good quantum numbers are the same can be nonzero.
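The vanishing of these coefficients can be illustrated with a small made-up example. The sketch below assumes a hypothetical four-site chain, with $A$ taken to be mirror reflection of the chain and a perturbation that is mirror symmetric; none of this comes from the text. Since this $H_0$ is nondegenerate and commutes with $A$, its eigenfunctions are automatically eigenfunctions of $A$.

```python
import numpy as np

# Hypothetical 4-site chain (assumption for illustration).
H0 = -(np.eye(4, k=1) + np.eye(4, k=-1))     # nearest-neighbor hopping
H1 = 0.1 * np.diag([0.3, -0.2, -0.2, 0.3])   # mirror-symmetric perturbation
A = np.fliplr(np.eye(4))                     # reflection: site j <-> site 3 - j

# Both H0 and H1 commute with A.
commutes = np.allclose(H0 @ A, A @ H0) and np.allclose(H1 @ A, A @ H1)

# Unperturbed eigenfunctions (columns of U) and their quantum numbers a = +1 or -1.
_, U = np.linalg.eigh(H0)
a = np.rint(np.diag(U.T @ A @ U)).astype(int)

# Hamiltonian perturbation coefficients <psi_i | H1 psi_j> in that basis.
M = U.T @ H1 @ U
mixed = a[:, None] != a[None, :]             # pairs with different values of a
largest_mixed = np.max(np.abs(M[mixed]))

print(commutes, a)
print(largest_mixed)
```

The largest coefficient between eigenfunctions of different $a$ should be zero to rounding error, so the perturbation matrix splits into one block for $a = 1$ and one for $a = -1$.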