9.1 The Variational Method

Solving the equations of quantum mechanics is typically difficult, so approximations must usually be made. One very effective way to find an approximate ground state is the variational principle. This section gives some of the basic ideas, including ways to apply it best, and how to find eigenstates of higher energy in similar ways.


9.1.1 Basic variational statement

The variational method is based on the observation that the ground state is the state among all allowable wave functions that has the lowest expectation value of energy:

\begin{displaymath}
\fbox{$\displaystyle
\big\langle E\big\rangle
\mbox{ is minimal for the ground state wave function.}
$}
\end{displaymath} (9.1)

The variational method has already been used to find the ground states for the hydrogen molecular ion, chapter 4.6, and the hydrogen molecule, chapter 5.2. The general procedure is to guess an approximate form of the wave function, invariably involving some parameters whose best values you are unsure about. Then search for the parameters that give you the lowest expectation value of the total energy; those parameters will give your best possible approximation to the true ground state {N.6}. In particular, you can be confident that the true ground state energy is no higher than what you compute, {A.7}.
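As a concrete numerical illustration (a made-up test case, not one of the molecular examples above), the sketch below applies the procedure to the one-dimensional harmonic oscillator in units where $\hbar=m=\omega=1$, using a Gaussian trial function $e^{-ax^2}$ with a single adjustable parameter $a$. The grid, the parameter range, and the helper name \verb|expectation_energy| are all illustrative choices; for this case the exact ground state energy is $\frac12$, reached at $a=\frac12$.

\begin{verbatim}
import numpy as np

# Variational sketch for the 1D harmonic oscillator (hbar = m = omega = 1).
# Trial wave function: psi(x; a) = exp(-a x^2), with one parameter a.
x = np.linspace(-8.0, 8.0, 4001)
dx = x[1] - x[0]

def expectation_energy(a):
    psi = np.exp(-a * x**2)
    dpsi = np.gradient(psi, dx)                    # d psi / dx
    kinetic = 0.5 * np.sum(dpsi**2) * dx           # <T>, integrated by parts
    potential = 0.5 * np.sum(x**2 * psi**2) * dx   # <V> for V = x^2 / 2
    return (kinetic + potential) / (np.sum(psi**2) * dx)

# Search for the parameter value that gives the lowest <E>; that value
# gives the best possible approximation to the true ground state.
a_values = np.linspace(0.1, 1.5, 141)
energies = [expectation_energy(a) for a in a_values]
best = int(np.argmin(energies))
print(f"best a = {a_values[best]:.3f}, <E> = {energies[best]:.6f}")
# Expect roughly a = 0.5 and <E> = 0.5, the exact values for this case.
\end{verbatim}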

To get the second lowest energy state, you could search for the lowest energy among all wave functions orthogonal to the ground state. But since you would not know the exact ground state, you would need to use your approximate one instead. That introduces some error, and it is no longer certain that the true second lowest energy level is no higher than what you compute. Still, the result is usually a reasonable approximation.

If you want more accurate values, you will need to increase the number of parameters. The molecular example solutions were based on the atom ground states, and you could consider adding some excited states to the mix. In general, a procedure using appropriate guessed functions is called a Rayleigh-Ritz method; a sketch of it follows below. Alternatively, you could just chop space up into little pieces, or elements, and use a simple polynomial within each piece. That is called a finite element method. In either case, you end up with a finite, but relatively large, number of unknowns: the parameters and/or coefficients of the functions, or the coefficients of the polynomials.
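For instance, a Rayleigh-Ritz sketch for the same harmonic oscillator test case as before might take a handful of Gaussians of fixed widths as the guessed functions and leave their coefficients as the unknowns. Minimizing $\langle E\rangle$ over the coefficients then turns into a generalized matrix eigenvalue problem $Hc=ESc$, with $S$ the matrix of overlap integrals. The widths and the grid below are arbitrary illustrative choices.

\begin{verbatim}
import numpy as np
from scipy.linalg import eigh

# Rayleigh-Ritz sketch for the 1D harmonic oscillator (hbar = m = omega = 1),
# an illustrative test case. Guessed functions: Gaussians exp(-a x^2) with
# fixed widths a; the unknowns are the coefficients multiplying them.
x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]
widths = np.array([0.2, 0.5, 1.0, 2.0])
basis = np.exp(-widths[:, None] * x[None, :]**2)    # one row per function

n = len(widths)
Hmat = np.empty((n, n))     # Hamiltonian matrix <f_i|H|f_j>
Smat = np.empty((n, n))     # overlap matrix <f_i|f_j>
for i in range(n):
    di = np.gradient(basis[i], dx)
    for j in range(n):
        dj = np.gradient(basis[j], dx)
        kinetic = 0.5 * np.sum(di * dj) * dx        # integrated by parts
        potential = 0.5 * np.sum(x**2 * basis[i] * basis[j]) * dx
        Hmat[i, j] = kinetic + potential
        Smat[i, j] = np.sum(basis[i] * basis[j]) * dx

# Minimizing <E> over the coefficients gives the generalized eigenvalue
# problem H c = E S c; its lowest E is the ground state estimate (~0.5).
print(eigh(Hmat, Smat, eigvals_only=True)[0])
\end{verbatim}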


9.1.2 Differential form of the statement

You might by now wonder about the wisdom of trying to find the minimum energy by searching through the countless possible combinations of a lot of parameters. Brute-force search worked fine for the hydrogen molecule examples since they really only depended nontrivially on the distance between the nuclei. But if you add some more parameters for better accuracy, you quickly get into trouble. Semi-analytical approaches like Hartree-Fock even leave whole functions unspecified. In that case, simply put, every single function value is an unknown parameter, and a function has infinitely many of them. You would be searching in an infinite-dimensional space, and could search forever. Maybe you could try some clever genetic algorithm.

Usually it is a much better idea to write some equations for the minimum energy first. From calculus, you know that if you want to find the minimum of a function, the sophisticated way to do it is to note that the partial derivatives of the function must be zero at the minimum. Less rigorously, but a lot more intuitively, at the minimum of a function the changes in the function due to small changes in the variables that it depends on must be zero. In the simplest possible example of a function $f(x)$ of one variable $x$, a rigorous mathematician would say that at a minimum, the derivative $f'(x)$ must be zero. Instead, a typical physicist would say that the change $\delta{f}$, (or ${\rm d}{f}$,) in $f$ due to a small change $\delta{x}$ in $x$ must be zero. It is the same thing, since $\delta{f}=f'\delta{x}$, so that if $f'$ is zero, then so is $\delta{f}$. But mathematicians do not like the word small, since it has no rigorous meaning. On the other hand, in physics you may not like to talk about derivatives, for if you say derivative, you must say with respect to what variable; you must say what $x$ is as well as what $f$ is, and there is often more than one possible choice for $x$, with none preferred under all circumstances. (And in practice, the word small does have an unambiguous meaning: it means that you must ignore everything that is of square magnitude or more in terms of the small quantities.)
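For example, for $f(x)=x^2$ the change due to a small change $\delta{x}$ is

\begin{displaymath}
\delta f = (x+\delta x)^2 - x^2 = 2x\,\delta x + (\delta x)^2
\approx 2x\,\delta x = f'(x)\,\delta x
\end{displaymath}

where the $(\delta x)^2$ term is of square magnitude in the small quantity and so is ignored.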

In physics terms, the fact that the expectation energy must be minimal in the ground state means that you must have:

\begin{displaymath}
\fbox{$\displaystyle
\delta \big\langle E\big\rangle = 0
\mbox{ for all acceptable small changes in wave function}
$}
\end{displaymath} (9.2)

The changes must be acceptable: the changed wave function must still be normalized. Also, if there are boundary conditions, the changed wave function should still satisfy them. (Exceptions to the latter may be allowed under some conditions, but these will be ignored here.) So, in general, you have constrained minimization; you cannot make your changes completely arbitrary.


9.1.3 Example application using Lagrangian multipliers

As an example of how you can apply the variational formulation of the previous subsection analytically, and how it can also describe eigenstates of higher energy, this subsection will work out a very basic example. The idea is to figure out what you get if you truly zero the changes in the expectation value of energy $\langle{E}\rangle=\langle\psi\vert H\vert\psi\rangle$ over all acceptable wave functions $\psi$. (Instead of just over all possible versions of a numerical approximation, say.) It will illustrate how you can deal with the constraints.

The differential statement is:

\begin{displaymath}
\delta \langle\psi\vert H\vert\psi\rangle = 0
\mbox{ for all acceptable changes $\delta\psi$\ in $\psi$}
\end{displaymath}

But acceptable is not a mathematical concept. What does it mean? Well, if it is assumed that there are no boundary conditions, (like the harmonic oscillator, but unlike the particle in a pipe,) then acceptable just means that the wave function must remain normalized under the change. So the change in $\langle\psi\vert\psi\rangle$ must be zero, and you can write more specifically:

\begin{displaymath}
\delta \langle\psi\vert H\vert\psi\rangle = 0
\mbox{ whenever } \delta\langle\psi\vert\psi\rangle = 0.
\end{displaymath}

But how do you crunch a statement like that down mathematically? Well, there is a very important mathematical trick to simplify this. Instead of rigorously trying to enforce that the changed wave function is still normalized, just allow any change in wave function. But add penalty points to the change in expectation energy if the change in wave function goes out of allowed bounds:

\begin{displaymath}
\delta \langle\psi\vert H\vert\psi\rangle
- \epsilon \delta\langle\psi\vert\psi\rangle = 0
\end{displaymath}

Here $\epsilon$ is the penalty factor; such factors are called “Lagrangian multipliers” after a famous mathematician who probably watched a lot of soccer. For a change in wave function that does not go out of bounds, the second term is zero, so nothing changes. And if the penalty factor is carefully tuned, the second term can cancel any erroneous gain or decrease in expectation energy due to going out of bounds, {D.48}.

You do not, however, have to explicitly tune the penalty factor yourself. All you need to know is that a proper one exists. In actual application, all you do, in addition to ensuring that the penalized change in expectation energy is zero, is ensure that at least the unchanged wave function is normalized. It is really a matter of counting equations versus unknowns. Compared to simply setting the change in expectation energy to zero with no constraints on the wave function, one additional unknown has been added: the penalty factor. And quite generally, if you add one more unknown to a system of equations, you need one more equation to still have a unique solution. As the one-more equation, use the normalization condition. With enough equations to solve, you will get the correct solution, which means that the implied value of the penalty factor should be OK too.
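The same bookkeeping can be seen in an ordinary calculus problem. The sketch below (a made-up toy example, not from this book) minimizes $f=x^2+2y^2$ under the constraint $x+y=1$: zero the changes of the penalized function with respect to each unknown, and use the constraint itself as the one additional equation for the multiplier.

\begin{verbatim}
import sympy as sp

# Toy constrained minimization with a Lagrangian multiplier (illustrative):
# minimize f = x^2 + 2 y^2 subject to g = x + y - 1 = 0.
x, y, eps = sp.symbols('x y epsilon', real=True)
f = x**2 + 2*y**2
g = x + y - 1

# Penalized function; its changes with respect to x and y must be zero.
L = f - eps * g
equations = [sp.diff(L, x), sp.diff(L, y), g]   # 3 equations, 3 unknowns

print(sp.solve(equations, [x, y, eps]))
# Expected: {x: 2/3, y: 1/3, epsilon: 4/3}
\end{verbatim}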

So what does this variational statement now produce? Writing out the differences explicitly, you must have

\begin{displaymath}
\Big(\langle\psi+\delta\psi\vert H\vert\psi+\delta\psi\rangle
- \langle\psi\vert H\vert\psi\rangle\Big)
- \epsilon \Big(\langle\psi+\delta\psi\vert\psi+\delta\psi\rangle
- \langle\psi\vert\psi\rangle\Big)
= 0
\end{displaymath}

Multiplying out, canceling equal terms and ignoring terms that are quadratically small in $\delta\psi$, you get

\begin{displaymath}
\langle\delta\psi\vert H\vert\psi\rangle
+ \langle\psi\vert H\vert\delta\psi\rangle
- \epsilon \Big(\langle\delta\psi\vert\psi\rangle
+ \langle\psi\vert\delta\psi\rangle\Big)
= 0
\end{displaymath}

That is not yet good enough to say something specific about. But remember that you can exchange the sides of an inner product if you add a complex conjugate, so

\begin{displaymath}
\langle\delta\psi\vert H\vert\psi\rangle
+ \langle\delta\psi\vert H\vert\psi\rangle^*
- \epsilon \Big(\langle\delta\psi\vert\psi\rangle
+ \langle\delta\psi\vert\psi\rangle^*\Big)
= 0
\end{displaymath}

Also remember that you can allow any change $\delta\psi$ you want, including the $\delta\psi$ you are now looking at times ${\rm i}$. That means that you also have:

\begin{displaymath}
\langle{\rm i}\delta\psi\vert H\vert\psi\rangle
+ \langle{\rm i}\delta\psi\vert H\vert\psi\rangle^*
- \epsilon \Big(\langle{\rm i}\delta\psi\vert\psi\rangle
+ \langle{\rm i}\delta\psi\vert\psi\rangle^*\Big)
= 0
\end{displaymath}

or using the fact that numbers come out of the left side of an inner product as complex conjugates

\begin{displaymath}
-{\rm i}\langle\delta\psi\vert H\vert\psi\rangle
+{\rm i}\langle\delta\psi\vert H\vert\psi\rangle^*
- \epsilon \Big(-{\rm i}\langle\delta\psi\vert\psi\rangle
+{\rm i}\langle\delta\psi\vert\psi\rangle^*\Big)
= 0
\end{displaymath}

If you divide out a $-{\rm i}$ and then average with the original equation, you get rid of the complex conjugates:

\begin{displaymath}
\langle\delta\psi\vert H\vert\psi\rangle
- \epsilon\langle\delta\psi\vert\psi\rangle
= 0
\end{displaymath}

You can now combine the two terms into a single inner product with $\delta\psi$ on the left:

\begin{displaymath}
\langle\delta\psi\vert H\psi-\epsilon\psi\rangle
= 0
\end{displaymath}

If this is to be zero for any change $\delta\psi$, then the right hand side of the inner product must unavoidably be zero. For example, if you take $\delta\psi$ equal to a small number $\varepsilon$ times the right hand side, you get $\varepsilon$ times the square norm of the right hand side, and that can only be zero if the right hand side is. So $H\psi-\epsilon\psi=0$, or

\begin{displaymath}
H\psi=\epsilon\psi.
\end{displaymath}

So you see that you have recovered the Hamiltonian eigenvalue problem from the requirement that the variation of the expectation energy is zero. Unavoidably then, $\epsilon$ will have to be an energy eigenvalue $E$. It often happens that Lagrangian multipliers have a physical meaning beyond being merely penalty factors. But note that there is no requirement for this to be the ground state. Any energy eigenstate would satisfy the equation; the variational principle works for them all.

Indeed, you may remember from calculus that the derivatives of a function may be zero at more than one point. For example, a function might also have a maximum, or local minima and maxima, or stationary points where the function is neither a maximum nor a minimum, but the derivatives are zero anyway. This sort of thing happens here too: the ground state is the state of lowest possible energy, but there will be other states for which $\delta\langle{E}\rangle$ is zero, and these will correspond to energy eigenstates of higher energy, {D.49}.
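This can be checked numerically in a finite-dimensional analogue, where $H$ is just a Hermitian matrix and the wave function a unit vector of coefficients (an illustrative sketch, not part of the text). Perturbing any eigenvector, not just the lowest one, and renormalizing changes the expectation energy only in proportion to the square of the perturbation size, confirming that $\delta\langle{E}\rangle$ is zero there to first order.

\begin{verbatim}
import numpy as np

# Finite-dimensional check (illustrative): for a Hermitian matrix H, EVERY
# eigenvector makes the first-order change of <E> vanish, not just the
# ground state. A perturbation of size t shifts <E> only in proportion to t^2.
rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))
H = (A + A.T) / 2                                 # real symmetric = Hermitian
eigenvalues, V = np.linalg.eigh(H)

def energy(c):
    c = c / np.linalg.norm(c)                     # keep the vector normalized
    return c @ H @ c                              # <c|H|c>

k = 3                                             # an excited state, not the lowest
d = rng.standard_normal(6)                        # arbitrary direction of change
for t in (1e-2, 1e-3, 1e-4):
    shift = energy(V[:, k] + t * d) - eigenvalues[k]
    print(f"t = {t:.0e}: shift in <E> = {shift:.3e}")   # shrinks like t**2
\end{verbatim}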