9.1 The Variational Method

Solving the equations of quantum mechanics is typically difficult, so approximations must usually be made. One very effective way to find an approximate ground state is the variational principle. This section gives some of the basic ideas, including ways to apply it best, and how to find eigenstates of higher energy in similar ways.


9.1.1 Basic variational statement

The variational method is based on the observation that the ground state is the state among all allowable wave functions that has the lowest expectation value of energy:

\begin{displaymath}
\fbox{$\displaystyle
\big\langle E\big\rangle
\mbox{ is minimal for the ground state wave function.}
$}
\end{displaymath} (9.1)

The variational method has already been used to find the ground states for the hydrogen molecular ion, chapter 4.6, and the hydrogen molecule, chapter 5.2. The general procedure is to guess an approximate form of the wave function, invariably involving some parameters whose best values you are unsure about. Then search for the parameters that give you the lowest expectation value of the total energy; those parameters will give your best possible approximation to the true ground state {N.6}. In particular, you can be confident that the true ground state energy is no higher than what you compute, {A.7}.
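As a concrete numerical illustration (a made-up test case, not one of the molecular examples above), the sketch below applies the procedure to the one-dimensional harmonic oscillator in units where $\hbar=m=\omega=1$, using a Gaussian trial function $e^{-ax^2}$ with a single adjustable parameter $a$. The grid, the parameter range, and the helper name \verb|expectation_energy| are all illustrative choices; for this case the exact ground state energy is $\frac12$, reached at $a=\frac12$.

\begin{verbatim}
import numpy as np

# Variational sketch for the 1D harmonic oscillator (hbar = m = omega = 1).
# Trial wave function: psi(x; a) = exp(-a x^2), with one parameter a.
x = np.linspace(-8.0, 8.0, 4001)
dx = x[1] - x[0]

def expectation_energy(a):
    psi = np.exp(-a * x**2)
    dpsi = np.gradient(psi, dx)                    # d psi / dx
    kinetic = 0.5 * np.sum(dpsi**2) * dx           # <T>, integrated by parts
    potential = 0.5 * np.sum(x**2 * psi**2) * dx   # <V> for V = x^2 / 2
    return (kinetic + potential) / (np.sum(psi**2) * dx)

# Search for the parameter value that gives the lowest <E>; that value
# gives the best possible approximation to the true ground state.
a_values = np.linspace(0.1, 1.5, 141)
energies = [expectation_energy(a) for a in a_values]
best = int(np.argmin(energies))
print(f"best a = {a_values[best]:.3f}, <E> = {energies[best]:.6f}")
# Expect roughly a = 0.5 and <E> = 0.5, the exact values for this case.
\end{verbatim}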

To get the second lowest energy state, you could search for the lowest energy among all wave functions orthogonal to the ground state. But since you would not know the exact ground state, you would need to use your approximate one instead. That introduces some error, and it is no longer certain that the true second lowest energy level is no higher than what you compute. Still, the result is usually a reasonable approximation.

If you want more accurate values, you will need to increase the number of parameters. The molecular example solutions were based on the atom ground states, and you could consider adding some excited states to the mix. In general, a procedure using appropriate guessed functions is called a Rayleigh-Ritz method; a sketch of it follows below. Alternatively, you could just chop space up into little pieces, or elements, and use a simple polynomial within each piece. That is called a finite element method. In either case, you end up with a finite, but relatively large, number of unknowns: the parameters and/or coefficients of the functions, or the coefficients of the polynomials.
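For instance, a Rayleigh-Ritz sketch for the same harmonic oscillator test case as before might take a handful of Gaussians of fixed widths as the guessed functions and leave their coefficients as the unknowns. Minimizing $\langle E\rangle$ over the coefficients then turns into a generalized matrix eigenvalue problem $Hc=ESc$, with $S$ the matrix of overlap integrals. The widths and the grid below are arbitrary illustrative choices.

\begin{verbatim}
import numpy as np
from scipy.linalg import eigh

# Rayleigh-Ritz sketch for the 1D harmonic oscillator (hbar = m = omega = 1),
# an illustrative test case. Guessed functions: Gaussians exp(-a x^2) with
# fixed widths a; the unknowns are the coefficients multiplying them.
x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]
widths = np.array([0.2, 0.5, 1.0, 2.0])
basis = np.exp(-widths[:, None] * x[None, :]**2)    # one row per function

n = len(widths)
Hmat = np.empty((n, n))     # Hamiltonian matrix <f_i|H|f_j>
Smat = np.empty((n, n))     # overlap matrix <f_i|f_j>
for i in range(n):
    di = np.gradient(basis[i], dx)
    for j in range(n):
        dj = np.gradient(basis[j], dx)
        kinetic = 0.5 * np.sum(di * dj) * dx        # integrated by parts
        potential = 0.5 * np.sum(x**2 * basis[i] * basis[j]) * dx
        Hmat[i, j] = kinetic + potential
        Smat[i, j] = np.sum(basis[i] * basis[j]) * dx

# Minimizing <E> over the coefficients gives the generalized eigenvalue
# problem H c = E S c; its lowest E is the ground state estimate (~0.5).
print(eigh(Hmat, Smat, eigvals_only=True)[0])
\end{verbatim}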


9.1.2 Differential form of the statement

You might by now wonder about the wisdom of trying to find the minimum energy by searching through the countless possible combinations of a lot of parameters. Brute-force search worked fine for the hydrogen molecule examples since they really only depended nontrivially on the distance between the nuclei. But if you add some more parameters for better accuracy, you quickly get into trouble. Semi-analytical approaches like Hartree-Fock even leave whole functions unspecified. In that case, simply put, every single function value is an unknown parameter, and a function has infinitely many of them. You would be searching in an infinite-dimensional space, and could search forever. Maybe you could try some clever genetic algorithm.

Usually it is a much better idea to write some equations for the minimum energy first. From calculus, you know that if you want to find the minimum of a function, the sophisticated way to do it is to note that the partial derivatives of the function must be zero at the minimum. Less rigorously, but a lot more intuitively, at the minimum of a function the changes in the function due to small changes in the variables that it depends on must be zero. In the simplest possible example of a function $f(x)$ of one variable $x$, a rigorous mathematician would say that at a minimum, the derivative $f'(x)$ must be zero. Instead, a typical physicist would say that the change $\delta{f}$, (or ${\rm d}{f}$,) in $f$ due to a small change $\delta{x}$ in $x$ must be zero. It is the same thing, since $\delta{f}=f'\delta{x}$, so that if $f'$ is zero, then so is $\delta{f}$. But mathematicians do not like the word small, since it has no rigorous meaning. On the other hand, in physics you may not like to talk about derivatives, for if you say derivative, you must say with respect to what variable; you must say what $x$ is as well as what $f$ is, and there is often more than one possible choice for $x$, with none preferred under all circumstances. (And in practice, the word small does have an unambiguous meaning: it means that you must ignore everything that is of square magnitude or more in terms of the small quantities.)
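For example, for $f(x)=x^2$ the change due to a small change $\delta{x}$ is

\begin{displaymath}
\delta f = (x+\delta x)^2 - x^2 = 2x\,\delta x + (\delta x)^2
\approx 2x\,\delta x = f'(x)\,\delta x
\end{displaymath}

where the $(\delta x)^2$ term is of square magnitude in the small quantity and so is ignored.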

In physics terms, the fact that the expectation energy must be minimal in the ground state means that you must have:

\begin{displaymath}
\fbox{$\displaystyle
\delta \big\langle E\big\rangle = 0
\mbox{ for all acceptable small changes in wave function}
$}
\end{displaymath} (9.2)

The changes must be acceptable: the changed wave function must still be normalized. Also, if there are boundary conditions, the changed wave function should still satisfy them. (Exceptions to the latter may be allowed under some conditions, but these will be ignored here.) So, in general, you have constrained minimization; you cannot make your changes completely arbitrary.


9.1.3 Example application using Lagrangian multipliers

As an example of how you can apply the variational formulation of the previous subsection analytically, and how it can also describe eigenstates of higher energy, this subsection will work out a very basic example. The idea is to figure out what you get if you truly zero the changes in the expectation value of energy $\langle{E}\rangle=\langle\psi\vert H\vert\psi\rangle$ over all acceptable wave functions $\psi$. (Instead of just over all possible versions of a numerical approximation, say.) It will illustrate how you can deal with the constraints.

The differential statement is:

\begin{displaymath}
\delta \langle\psi\vert H\vert\psi\rangle = 0
\mbox{ for all acceptable changes $\delta\psi$\ in $\psi$}
\end{displaymath}

But acceptable is not a mathematical concept. What does it mean? Well, if it is assumed that there are no boundary conditions, (like the harmonic oscillator, but unlike the particle in a pipe,) then acceptable just means that the wave function must remain normalized under the change. So the change in $\langle\psi\vert\psi\rangle$ must be zero, and you can write more specifically:

\begin{displaymath}
\delta \langle\psi\vert H\vert\psi\rangle = 0
\mbox{ whenever } \delta\langle\psi\vert\psi\rangle = 0.
\end{displaymath}

But how do you crunch a statement like that down mathematically? Well, there is a very important mathematical trick to simplify this. Instead of rigorously trying to enforce that the changed wave function is still normalized, just allow any change in wave function. But add penalty points to the change in expectation energy if the change in wave function goes out of allowed bounds:

\begin{displaymath}
\delta \langle\psi\vert H\vert\psi\rangle
- \epsilon \delta\langle\psi\vert\psi\rangle = 0
\end{displaymath}

Here $\epsilon$ is the penalty factor; such factors are called “Lagrangian multipliers” after a famous mathematician who probably watched a lot of soccer. For a change in wave function that does not go out of bounds, the second term is zero, so nothing changes. And if the penalty factor is carefully tuned, the second term can cancel any erroneous gain or decrease in expectation energy due to going out of bounds, {D.48}.

You do not, however, have to explicitly tune the penalty factor yourself. All you need to know is that a proper one exists. In actual application, all you do, in addition to ensuring that the penalized change in expectation energy is zero, is ensure that at least the unchanged wave function is normalized. It is really a matter of counting equations versus unknowns. Compared to simply setting the change in expectation energy to zero with no constraints on the wave function, one additional unknown has been added: the penalty factor. And quite generally, if you add one more unknown to a system of equations, you need one more equation to still have a unique solution. As the one-more equation, use the normalization condition. With enough equations to solve, you will get the correct solution, which means that the implied value of the penalty factor should be OK too.
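The same bookkeeping can be seen in an ordinary calculus problem. The sketch below (a made-up toy example, not from this book) minimizes $f=x^2+2y^2$ under the constraint $x+y=1$: zero the changes of the penalized function with respect to each unknown, and use the constraint itself as the one additional equation for the multiplier.

\begin{verbatim}
import sympy as sp

# Toy constrained minimization with a Lagrangian multiplier (illustrative):
# minimize f = x^2 + 2 y^2 subject to g = x + y - 1 = 0.
x, y, eps = sp.symbols('x y epsilon', real=True)
f = x**2 + 2*y**2
g = x + y - 1

# Penalized function; its changes with respect to x and y must be zero.
L = f - eps * g
equations = [sp.diff(L, x), sp.diff(L, y), g]   # 3 equations, 3 unknowns

print(sp.solve(equations, [x, y, eps]))
# Expected: {x: 2/3, y: 1/3, epsilon: 4/3}
\end{verbatim}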

So what does this variational statement now produce? Writing out the differences explicitly, you must have

\begin{displaymath}
\Big(\langle\psi+\delta\psi\vert H\vert\psi+\delta\psi\rangle
- \langle\psi\vert H\vert\psi\rangle\Big)
- \epsilon \Big(\langle\psi+\delta\psi\vert\psi+\delta\psi\rangle
- \langle\psi\vert\psi\rangle\Big)
= 0
\end{displaymath}

Multiplying out, canceling equal terms and ignoring terms that are quadratically small in $\delta\psi$, you get

\begin{displaymath}
\langle\delta\psi\vert H\vert\psi\rangle
+ \langle\psi\vert H\vert\delta\psi\rangle
- \epsilon \Big(\langle\delta\psi\vert\psi\rangle
+ \langle\psi\vert\delta\psi\rangle\Big)
= 0
\end{displaymath}

That is not yet good enough to say something specific about. But remember that you can exchange the sides of an inner product if you add a complex conjugate, so

\begin{displaymath}
\langle\delta\psi\vert H\vert\psi\rangle
+ \langle\delta\psi\vert H\vert\psi\rangle^*
- \epsilon \Big(\langle\delta\psi\vert\psi\rangle
+ \langle\delta\psi\vert\psi\rangle^*\Big)
= 0
\end{displaymath}

Also remember that you can allow any change $\delta\psi$ you want, including the $\delta\psi$ you are now looking at times ${\rm i}$. That means that you also have:

\begin{displaymath}
\langle{\rm i}\delta\psi\vert H\vert\psi\rangle
+ \langle{\rm i}\delta\psi\vert H\vert\psi\rangle^*
- \epsilon \Big(\langle{\rm i}\delta\psi\vert\psi\rangle
+ \langle{\rm i}\delta\psi\vert\psi\rangle^*\Big)
= 0
\end{displaymath}

or using the fact that numbers come out of the left side of an inner product as complex conjugates

\begin{displaymath}
-{\rm i}\langle\delta\psi\vert H\vert\psi\rangle
+{\rm i}\langle\delta\psi\vert H\vert\psi\rangle^*
- \epsilon \Big(-{\rm i}\langle\delta\psi\vert\psi\rangle
+{\rm i}\langle\delta\psi\vert\psi\rangle^*\Big)
= 0
\end{displaymath}

If you divide out a $-{\rm i}$ and then average with the original equation, you get rid of the complex conjugates:

\begin{displaymath}
\langle\delta\psi\vert H\vert\psi\rangle
- \epsilon\langle\delta\psi\vert\psi\rangle
= 0
\end{displaymath}

You can now combine the two terms into a single inner product with $\delta\psi$ on the left:

\begin{displaymath}
\langle\delta\psi\vert H\psi-\epsilon\psi\rangle
= 0
\end{displaymath}

If this is to be zero for any change $\delta\psi$, then the right hand side of the inner product must unavoidably be zero. For example, if you take $\delta\psi$ equal to a small number $\varepsilon$ times the right hand side, you get $\varepsilon$ times the square norm of the right hand side, and that can only be zero if the right hand side is. So $H\psi-\epsilon\psi=0$, or

\begin{displaymath}
H\psi=\epsilon\psi.
\end{displaymath}

So you see that you have recovered the Hamiltonian eigenvalue problem from the requirement that the variation of the expectation energy is zero. Unavoidably then, $\epsilon$ will have to be an energy eigenvalue $E$. It often happens that Lagrangian multipliers have a physical meaning beyond being merely penalty factors. But note that there is no requirement for this to be the ground state. Any energy eigenstate would satisfy the equation; the variational principle works for them all.

Indeed, you may remember from calculus that the derivatives of a function may be zero at more than one point. For example, a function might also have a maximum, or local minima and maxima, or stationary points where the function is neither a maximum nor a minimum, but the derivatives are zero anyway. This sort of thing happens here too: the ground state is the state of lowest possible energy, but there will be other states for which $\delta\langle{E}\rangle$ is zero, and these will correspond to energy eigenstates of higher energy, {D.49}.
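This can be checked numerically in a finite-dimensional analogue, where $H$ is just a Hermitian matrix and the wave function a unit vector of coefficients (an illustrative sketch, not part of the text). Perturbing any eigenvector, not just the lowest one, and renormalizing changes the expectation energy only in proportion to the square of the perturbation size, confirming that $\delta\langle{E}\rangle$ is zero there to first order.

\begin{verbatim}
import numpy as np

# Finite-dimensional check (illustrative): for a Hermitian matrix H, EVERY
# eigenvector makes the first-order change of <E> vanish, not just the
# ground state. A perturbation of size t shifts <E> only in proportion to t^2.
rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))
H = (A + A.T) / 2                                 # real symmetric = Hermitian
eigenvalues, V = np.linalg.eigh(H)

def energy(c):
    c = c / np.linalg.norm(c)                     # keep the vector normalized
    return c @ H @ c                              # <c|H|c>

k = 3                                             # an excited state, not the lowest
d = rng.standard_normal(6)                        # arbitrary direction of change
for t in (1e-2, 1e-3, 1e-4):
    shift = energy(V[:, k] + t * d) - eigenvalues[k]
    print(f"t = {t:.0e}: shift in <E> = {shift:.3e}")   # shrinks like t**2
\end{verbatim}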