D.54 Derivation of the Hartree-Fock equations

This note derives the canonical Hartree-Fock equations. It will use some linear algebra; see the Notations section under matrix for some basic concepts. The derivation will be performed under the normally stated rules of engagement that the orbitals are of the form $\pe{n}//u//$ or $\pe{n}//d//$ . So the spins are chosen, and only the spatial orbitals $\pe{n}////$ are to be found.

The derivations must allow for the fact that in restricted Hartree-Fock, it is required that pairs of spin-up and spin-down orbitals have the same spatial orbital. So there are three possible kinds of spatial orbitals. A spatial orbital may produce a single unpaired spin orbital that is spin-up, or a single unpaired spin orbital that is spin-down, or a pair of spin-up and spin-down orbitals with the same spatial orbital. These three types of spatial orbitals will be referred to as unpaired spin-up, unpaired spin-down, and restricted. Note that these names do not refer to properties of the spatial orbitals themselves, of course, but to the properties of the spin orbitals that these spatial orbitals produce.

Assume that there are $N_{\rm {u}}$ spin-up spatial orbitals, $N_{\rm {d}}$ spin-down ones, and $N_{\rm {r}}$ restricted ones. The total number of spatial orbitals, call it N, is then

$\begin{displaymath} N = N_{\rm {u}} + N_{\rm {d}} + N_{\rm {r}} \end{displaymath}$

and that is the total number of unknown spatial orbitals to find. A corresponding number of $N$ equations will be needed for them.

However, the total number of spin orbitals, $I$, is larger than $N$ by an additional amount $N_{\rm {r}}$, because the restricted spatial orbitals appear in both spin-up and spin-down versions. That makes the mathematics messy.

Things become a bit easier if the ordering of the orbitals is specified a priori. The ordering makes no difference physically. So it will be assumed that the spatial orbitals are ordered with the unpaired spin-up ones first, the unpaired spin-down ones second, and the restricted ones last. The ordering of the spin orbitals will be the same as that of the spatial orbitals, but with the restricted orbitals at the end appearing twice; first in the spin-up versions and then in the spin-down versions.
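The bookkeeping above can be sketched in a few lines of Python. The function name `spin_orbital_list` and the `(index, spin)` labels are illustrative assumptions, not notation from this note. The sketch only shows the assumed ordering and that the number of spin orbitals exceeds the number of spatial orbitals by the number of restricted ones.

```python
def spin_orbital_list(n_up, n_down, n_restricted):
    """Return (spatial index, spin) pairs in the ordering assumed in the text.

    Spatial orbitals are numbered 1..N: unpaired spin-up first, then
    unpaired spin-down, then restricted. The restricted orbitals appear
    twice at the end, first all spin-up versions, then all spin-down ones.
    """
    n_total = n_up + n_down + n_restricted
    up = [(n, "up") for n in range(1, n_up + 1)]
    down = [(n, "down") for n in range(n_up + 1, n_up + n_down + 1)]
    restricted = range(n_up + n_down + 1, n_total + 1)
    paired_up = [(n, "up") for n in restricted]
    paired_down = [(n, "down") for n in restricted]
    return up + down + paired_up + paired_down


# Hypothetical example: 1 unpaired spin-up, 0 unpaired spin-down, 2 restricted.
orbitals = spin_orbital_list(1, 0, 2)
N = 1 + 0 + 2          # number of spatial orbitals
I = len(orbitals)      # number of spin orbitals
print(orbitals)
print(N, I)
```

With these made-up counts, the three spatial orbitals give five spin orbitals, since each of the two restricted orbitals is counted once per spin direction.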

To find the spatial orbitals, the variational method as discussed in chapter 9.1.3 says that the expectation energy $\left\langle{E}\right\rangle$ must be unchanged under small changes in the orbitals, provided that the orbitals remain orthonormal. To easily enforce that orthonormality constraint requires that terms are added to the change in orbitals that penalize for any going out of bounds.

To do so, first note that $\left\langle{E}\right\rangle$ can be considered to be a real function from the real and imaginary parts of the spatial orbitals, and both these parts are real functions. The condition that any spatial orbital $\pe{n}////$ must be normalized means that the inner product of the orbital with itself must be 1,

$\begin{displaymath} \langle\pe n////\vert\pe n////\rangle=1 \end{displaymath}$

This condition is real too. However, the condition that any spatial orbital $\pe{n}////$ must be orthogonal to any other spatial orbital $\pe{\underline n}////$ means that the inner product of the two orbitals must be zero,

$\begin{displaymath} \langle\pe n////\vert\pe{\underline n}////\rangle=0 \end{displaymath}$

In general this condition has both a real and an imaginary component. But it can be written as two real conditions;

$\begin{displaymath} {\textstyle\frac{1}{2}}\Big(\langle\pe n////\vert\pe{\under... ...e - \langle\pe{\underline n}////\vert\pe n////\rangle\Big)=0. \end{displaymath}$

The reason is that if you swap the sides in an inner product, you get the complex conjugate; therefore the first equation above is the real part of the inner product and the second the imaginary part.
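As a small numerical illustration of that statement, the sketch below replaces the inner product integral by a finite sum over made-up sample values. The helper `inner` and the sample vectors are assumptions for illustration only. The second combination is written here with a factor $1/(2{\rm i})$ so that it equals the imaginary part exactly; any nonzero constant multiple gives an equivalent condition.

```python
def inner(a, b):
    """Discrete stand-in for <a|b>: conjugate the left factor, then sum."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


# Two arbitrary complex "orbitals" sampled at three points (made-up values).
psi_a = [1 + 2j, 0.5 - 1j, -0.3j]
psi_b = [2 - 1j, 1j, 0.7 + 0.2j]

z = inner(psi_a, psi_b)          # <a|b>
z_swapped = inner(psi_b, psi_a)  # <b|a>, the complex conjugate of <a|b>

real_part = 0.5 * (z + z_swapped)   # half the sum of the two orderings
imag_part = (z - z_swapped) / 2j    # difference of the two, divided by 2i

print(z, z_swapped)
print(real_part, imag_part)
```

Requiring both combinations to vanish is therefore the same as requiring the complex inner product itself to vanish.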

Since we now have a completely real problem in real independent variables, the penalty factors (the Lagrangian multipliers) in the problem will be real too. For reasons evident in a second, the penalty factor for the normalization condition above will be called $\epsilon_{nn}$ , while the ones for the two real orthogonality conditions will be called $2\epsilon_{n{\underline n},r}$ and $2\epsilon_{n{\underline n},i}$ , respectively. To avoid enforcing the same orthogonality condition twice, it is here assumed that ${\underline n}>n$ .

The reason for these notations is that in terms of them, the penalized variational condition that the spatial orbitals must satisfy, chapter 9.1.3, takes the simple form

$\begin{displaymath} \delta \left\langle{E}\right\rangle - \sum_{n=1}^N \sum_{{... ...} \delta \langle\pe n////\vert\pe{\underline n}////\rangle = 0 \end{displaymath}$

where $\delta$ denotes a small change in the following quantity, ${\underline n}$ is now allowed to be both smaller or larger than $n$, and $\epsilon_{n{\underline n}}$ is a Hermitian matrix, meaning that $\epsilon_{{\underline n}{}n}=\epsilon_{n{\underline n}}^*$.

Note however that two spatial orbitals do not have to be orthogonal if one is an unpaired spin-up one and the other an unpaired spin-down one. In that case the spins take care of orthogonality. This can be accommodated by stipulating that the penalty factors of the corresponding constraints are zero,

$\begin{displaymath} \epsilon_{n{\underline n}}=0 \quad\mbox{if $\pe{n}////$\ is spin-up and $\pe{\underline n}////$\ is spin-down, or vice versa} \end{displaymath}$

Next the variational condition is to be evaluated for a small change $\delta\pe{m}////$ in a sample spatial wave function $\pe{m}////$ where $m$ is no larger than $N$. This is straightforward for the inner products in the penalty terms. However, the expectation value of energy $\left\langle{E}\right\rangle$ was obtained in chapter 9.3.3 in terms of the spin, rather than spatial orbitals:

$\begin{eqnarray*} \left\langle{E}\right\rangle & = &\sum_{n=1}^I \langle\pe n//... ...angle{\updownarrow}_n\vert{\updownarrow}_{\underline n}\rangle^2 \end{eqnarray*}$

(From here on, the argument of the first orbital of a pair in either side of an inner product is taken to be the first inner product integration variable ${\skew0\vec r}$ and the argument of the second orbital is the second integration variable ${\underline{\skew0\vec r}}$ )

Taking that into account, the variational condition for the $\delta\pe{m}////$ takes the messy form

$\begin{eqnarray*} && [2?] \Big( \langle\delta\pe m////\vert h^{\rm e}\vert\p... ...N \epsilon_{nm} \langle\pe n////\vert\delta\pe m////\rangle = 0 \end{eqnarray*}$

Here $[2?]$ means to insert a factor 2 there if $m$ is one of the restricted spatial orbitals, because each of the two corresponding spin orbitals produces a term like that. And $[\langle{\updownarrow}_.\vert{\updownarrow}_.\rangle^2?]$ means leave away this inner product if $m$ is one of the restricted spatial orbitals, because exactly one of the two corresponding spin orbitals has that inner product equal to one, and the other has it zero.

Note that the difference between ${\underline n}$ and $n$ can from now on be ignored; the name of a summation variable makes no difference for the result, and there are no longer name conflicts in the individual terms. Note also that the sums over $n$ (or ${\underline n}$) with upper limit $I$ include the restricted spatial orbitals twice, once for each spin direction.

The second term in each row in the expression above is just the complex conjugate of the first. These second terms can be thrown out using the same trick as in chapter 9.1.3. (In other words, average with the same equation with $\delta\pe{m}////$ replaced by $-{\rm i}\delta\pe{m}////$ and divided by ${\rm i}$ .) And the integrals with the factors ${\textstyle\frac{1}{2}}$ are pairwise the same; the difference is just a name swap of the inner product integration variables. So all there is really left is

$\begin{eqnarray*} && [2?] \langle\delta\pe m////\vert h^{\rm e}\vert\pe m////\r... ...{n=1}^N \epsilon_{mn}\langle\delta\pe m////\vert\pe n////\rangle \end{eqnarray*}$

Now write out the inner product over the first position coordinate ${\skew0\vec r}$ , being the argument of $\delta\pe{m}////$ , for all terms:

$\begin{eqnarray*} \lefteqn{\int_{{\rm all}\;{\skew0\vec r}}\delta\pe m////\stru... ...psilon_{mn}\pe n//// \\ && \bigg) {\,\rm d}^3{\skew0\vec r}= 0 \end{eqnarray*}$

If this integral is to be zero for whatever is $\delta\pe{m}////$ , then the terms within the parentheses must be zero. (Otherwise just take $\delta\pe{m}////$ proportional to the parenthetical expression; you would get the norm of the expression, and that is only zero if the expression is.)

Unavoidably then, the following equations, one for each value of $m$, must be satisfied:

$\begin{displaymath}[2?]h^{\rm e}\pe m//// + [2?] \sum_{n=1}^I \langle\pe n////\... ... m////\rangle\pe n//// = \sum_{n=1}^N \epsilon_{mn}\pe n//// \end{displaymath}$

This can be cleaned up a bit by dividing by [2?]:

$\begin{displaymath} \fbox{$\displaystyle h^{\rm e}\pe m//// + \sum_{n=1}^I \l... ...space{-4pt} \right\} \sum_{n=1}^N \epsilon_{mn}\pe n//// $} \end{displaymath}$

(D.35)

These are the general Hartree-Fock equations, one for each $m \leqslant N$. The upper value between braces applies if the spatial orbital $\pe{m}////$ is not a restricted one; otherwise the lower value applies. Recall that the sums with upper limit $I$ include the restricted spatial orbitals twice, and that $\epsilon_{mn}$ is zero if spatial orbital $\pe{m}////$ is unpaired spin-up and $\pe{n}////$ unpaired spin-down or vice versa. For such index values, $\langle{\updownarrow}_m\vert{\updownarrow}_n\rangle$ is zero too.

Note that the general Hartree-Fock equation above includes eigenvalues $\epsilon_{mn}$ . The canonical equations include just a single eigenvalue $\epsilon_m$ . So to get the canonical Hartree-Fock equations, the sum in the right hand side must be further simplified to the form $\epsilon_m\pe{m}////$ .

The restricted closed-shell Hartree-Fock case will be done first, since it is the easiest one. Every spatial orbital is restricted, so the lower choice in the curly brackets always applies. The summation upper limits $I$, being the number of spin orbitals, can be reduced to the number of spatial orbitals $N$ by adding a factor 2. We can also get rid of the factor ${\textstyle\frac{1}{2}}$ in front of the $\epsilon_{mn}$ by simply redefining them by that factor. So for restricted closed-shell Hartree-Fock

$\begin{displaymath} h^{\rm e}\pe m//// + 2 \sum_{n=1}^N \langle\pe n////\vert ... ... m////\rangle\pe n//// = \sum_{n=1}^N \epsilon_{mn}\pe n//// \end{displaymath}$

Now the reason why all these $\epsilon_{mn}$ are there is that the set of spatial orbitals that gives the lowest energy state is not unique. The equation above applies to a typical set. Only a special set will get rid of the $\epsilon_{mn}$ for $m$ not equal to $n$, leaving only $\epsilon_{mm}$, which can then be defined to be $\epsilon_m$.

Each orbital in the special set will be some combination of the orbitals in the typical set above. In particular, any orbital in the special set, call it $\overline\pe\nu////$ , will be a linear combination of the orbitals $\pe{n}////$ in the typical set as follows:

$\begin{displaymath} \overline\pe\nu//// \equiv \sum_{n=1}^N c_{n,\nu} \pe n//// \qquad\mbox{for any $\nu=1,2,\ldots,N$} \end{displaymath}$

where the numbers $c_{1,\nu},c_{2,\nu},\ldots$ are the multiples of the typical orbitals $\pe1////$ , $\pe2////$ , .... The complete set of numbers $c_{n,\nu}$ for all possible values of both $n$ and $\nu$ can be written as a matrix, a table of numbers. This matrix will be indicated by $C$. The first index in $c_{n,\nu}$, $n$, says what row in $C$ that coefficient is in, and the second index, $\nu$, what column.

The multiples $c_{n,\nu}$ cannot be arbitrary, because the special orbitals must still be orthonormal. As noted earlier, they will be if they are normalized (so the inner product of any orbital with itself is 1), and mutually orthogonal (so the inner product of any orbital with any other one is zero). In short, the requirement is that

$\begin{displaymath} \langle\overline\pe\mu////\vert\overline\pe\nu////\rangle=\delta_{\mu\nu} \end{displaymath}$

where $\delta_{\mu\nu}$ is one if $\mu=\nu$, and zero otherwise. The set of numbers $\delta_{\mu\nu}$ is called the “Kronecker delta” or “unit matrix” or “identity matrix.” (The identity matrix is for matrices what the number 1 is for normal numbers; multiplying an arbitrary matrix or vector by the identity matrix does not change that matrix or vector.)

Substituting in the expression for the special orbitals above, making sure not to use the same name $\nu$ for two different indices, the requirement becomes

$\begin{displaymath} \sum_{m=1}^N \sum_{n=1}^N \langle c_{m,\mu}\pe m////\vert c_{n,\nu}\pe n//// \rangle=\delta_{\mu\nu} \end{displaymath}$

or noting that numbers come out of the left side of an inner product as complex conjugates,

$\begin{displaymath} \sum_{m=1}^N \sum_{n=1}^N c^*_{m,\mu} c_{n,\nu} \langle\pe m////\vert\pe n////\rangle=\delta_{\mu\nu} \end{displaymath}$

Now since the set of typical orbitals $\pe{n}////$ is already orthonormal, the inner product in the requirement above is only nonzero when $m$ equals $n$, and then it is one. So dropping the zero terms that have $m\ne{}n$ , the requirement on the coefficients simplifies to

$\begin{displaymath} \sum_{n=1}^N c^*_{n,\mu} c_{n,\nu} = \delta_{\mu\nu} \end{displaymath}$

What does that mean? Well, for given values of $\mu$ and $\nu$ , consider the coefficients $c_{n,\mu}$ to form a vector $\vec{u}_\mu$ , where $n$ indicates the component number of that vector. Similarly, consider the coefficients $c_{n,\nu}$ to form a vector $\vec{u}_\nu$ . Then the left hand side in the requirement above is the inner (or dot, if real) product of these two vectors. So the set of vectors must be orthonormal, just like the special orbitals must be orthonormal. So the matrix of coefficients $C$ must consist of orthonormal vectors. Mathematicians call such matrices “unitary,” rather than orthonormal, since it is easily confused with unit, and that keeps mathematicians in business explaining all the confusion.

The Hermitian adjoint matrix $C^\dagger$ of $C$ is defined as the matrix you get by swapping the order of the indices of the elements of $C$ and adding a complex conjugate. So by definition the factor $c^*_{n,\mu}$ in the requirement above equals the coefficient $c^\dagger_{\mu,n}$ of $C^\dagger$ . And matrix multiplication is defined such that then the sum over $n$ in the requirement gives exactly the coefficients of the product $C^\dagger{}C$ . So the requirement above can be written as

$\begin{displaymath} C^\dagger C = I \end{displaymath}$

where $I$ is the unit matrix. That means $C^\dagger$ is the inverse matrix to $C$, $C^\dagger=C^{-1}$. Then you also have that $C$ is the inverse of $C^\dagger$, $CC^\dagger=I$, which writes out to

$\begin{displaymath} \sum_{\nu=1}^N c_{n,\nu} c^*_{m,\nu} = \delta_{mn}. \end{displaymath}$
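The two matrix relations can be checked numerically for a small case. The sketch below is only an illustration: the $2\times2$ matrix `C` is a made-up unitary matrix (the angles `theta` and `phi` are arbitrary), and the helpers `dagger`, `matmul`, and `is_identity` are hypothetical names written with plain Python lists rather than a linear algebra library.

```python
import cmath
import math


def dagger(M):
    """Hermitian adjoint: swap the two indices and take complex conjugates."""
    return [[M[r][c].conjugate() for r in range(len(M))]
            for c in range(len(M[0]))]


def matmul(A, B):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def is_identity(M, tol=1e-12):
    """True if the square matrix M equals the unit matrix to within tol."""
    return all(abs(M[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(len(M)) for j in range(len(M)))


# Made-up 2x2 matrix whose columns are orthonormal vectors.
theta, phi = 0.4, 1.1
C = [[complex(math.cos(theta)), -cmath.exp(1j * phi) * math.sin(theta)],
     [cmath.exp(-1j * phi) * math.sin(theta), complex(math.cos(theta))]]

print(is_identity(matmul(dagger(C), C)))  # columns orthonormal
print(is_identity(matmul(C, dagger(C))))  # rows orthonormal as well
```

The second check is the matrix form of the sum over $\nu$ written out above: orthonormal columns imply orthonormal rows.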

This can be used to find the typical orbitals in terms of the special ones. To do so, premultiply the expression for the special orbitals as given earlier by $c^*_{m,\nu}$ and sum over $\nu$ :

$\begin{displaymath} \sum_{\nu=1}^N c^*_{m,\nu} \overline\pe\nu//// = \sum_{\nu=1}^N c^*_{m,\nu} \sum_{n=1}^N c_{n,\nu} \pe n//// \end{displaymath}$

As seen above, the sum over $\nu$ in the right hand side is just $\delta_{mn}$ , so in the sum over $n$, only the term with $n$ equal to $m$ is nonzero:

$\begin{displaymath} \sum_{\nu=1}^N c^*_{m,\nu} \overline\pe\nu//// = \pe m//// \end{displaymath}$

That then gives any typical orbital $\pe{m}////$ in terms of a sum of the special orbitals $\overline\pe\nu////$ .
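A minimal numerical sketch of this round trip follows. It assumes each orbital is represented by a short list of sample values (`psi`, made-up numbers) and uses an arbitrary $2\times2$ unitary coefficient table `c`; none of these names or values come from this note.

```python
import cmath
import math

# Made-up unitary coefficient table, stored as c[n][nu].
theta, phi = 0.4, 1.1
c = [[complex(math.cos(theta)), -cmath.exp(1j * phi) * math.sin(theta)],
     [cmath.exp(-1j * phi) * math.sin(theta), complex(math.cos(theta))]]

# Two orthonormal "typical" orbitals, each sampled at three points.
psi = [[1.0, 0.0, 0.0],
       [0.0, 1.0, 0.0]]
N, K = 2, 3

# Special orbitals: psi_bar[nu] = sum over n of c[n][nu] * psi[n].
psi_bar = [[sum(c[n][nu] * psi[n][k] for n in range(N)) for k in range(K)]
           for nu in range(N)]

# Typical orbitals recovered: psi[m] = sum over nu of conj(c[m][nu]) * psi_bar[nu].
psi_back = [[sum(c[m][nu].conjugate() * psi_bar[nu][k] for nu in range(N))
             for k in range(K)]
            for m in range(N)]

# Largest deviation between the recovered and the original typical orbitals.
max_error = max(abs(psi_back[m][k] - psi[m][k])
                for m in range(N) for k in range(K))
print(max_error)
```

The printed deviation is at the level of floating point round-off, as expected when the coefficient table is unitary.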

Now plug that into the non canonical restricted closed-shell Hartree-Fock equations given earlier. Be careful not to use the same summation index name twice in the same term; this derivation will use

$\begin{displaymath} \pe m//// = \sum_{\nu=1}^N c^*_{m,\nu} \overline\pe\nu//// ... ...//// = \sum_{\kappa=1}^N c^*_{n,\kappa} \overline\pe\kappa//// \end{displaymath}$

for $\pe{m}////$ , the first occurrence of $\pe{n}////$ in the terms, and the second occurrence, respectively. Then multiply it all by the coefficients $c_{m,\mu}$ and sum over $m$, i.e. put $\sum_{m=1}^{N}c_{m,\mu}$ in front of each term. That cleans up to

$\begin{displaymath} h^{\rm e}\overline\pe\mu//// + 2 \sum_{\lambda=1}^N \lang... ..._{m,\mu} \epsilon_{mn} c^*_{n,\lambda} \overline\pe\lambda//// \end{displaymath}$

Note that the only thing that has changed more than just by symbol names is the matrix in the right hand side. Now for each separate value of $\lambda$ , take $c^*_{n,\lambda}$ as the $\lambda$ -th orthonormal eigenvector of Hermitian matrix $\epsilon_{mn}$ , calling the eigenvalue $\epsilon_\lambda$ . Then by the definition of eigenvector,

$\begin{displaymath} \sum_{n=1}^N \epsilon_{mn} c^*_{n,\lambda} = \epsilon_\lambda c^*_{m,\lambda} \end{displaymath}$

So the right hand side becomes

$\begin{displaymath} \sum_{m=1}^{N} \sum_{\lambda=1}^N c_{m,\mu} \epsilon_{\lam... ...\overline\pe\lambda//// = \epsilon_{\mu} \overline\pe\mu//// \end{displaymath}$

So, in terms of the special orbitals defined by the requirement that $c^*_{m,\mu}$ gives the $\mu$ -th eigenvector of $\epsilon_{mn}$ , the right hand side simplifies to the canonical one.
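The diagonalization step can be illustrated for a $2\times2$ case in plain Python. Everything in the sketch is an assumption made for illustration: the Hermitian matrix `eps` holds made-up multiplier values, and the closed-form $2\times2$ eigenvector formula in `unit_eigenvector` (valid when the off-diagonal element is nonzero) stands in for a general eigenvalue solver.

```python
import math

# Made-up Hermitian matrix of Lagrangian multipliers eps[m][n].
eps = [[2.0, 1 - 1j],
       [1 + 1j, -1.0]]

a, b, d = eps[0][0], eps[0][1], eps[1][1]
mean = 0.5 * (a + d)
radius = math.sqrt((0.5 * (a - d)) ** 2 + abs(b) ** 2)
eigenvalues = [mean - radius, mean + radius]


def unit_eigenvector(lam):
    """Normalized eigenvector of eps for eigenvalue lam (requires b != 0)."""
    v = [b, lam - a]  # satisfies (a - lam) * v[0] + b * v[1] = 0
    norm = math.sqrt(abs(v[0]) ** 2 + abs(v[1]) ** 2)
    return [v[0] / norm, v[1] / norm]


vectors = [unit_eigenvector(lam) for lam in eigenvalues]

# As in the text: conj(c[n][lam]) is component n of the lam-th eigenvector.
c_conj = [[vectors[lam][n] for lam in range(2)] for n in range(2)]
c = [[c_conj[n][lam].conjugate() for lam in range(2)] for n in range(2)]

# transformed[mu][lam] = sum over m, n of c[m][mu] * eps[m][n] * conj(c[n][lam]).
transformed = [[sum(c[m][mu] * eps[m][n] * c_conj[n][lam]
                    for m in range(2) for n in range(2))
                for lam in range(2)]
               for mu in range(2)]

print(eigenvalues)
print(transformed)
```

Up to round-off, `transformed` is diagonal with the eigenvalues on the diagonal, which is the statement that only $\epsilon_\mu\overline\pe\mu////$ survives in the right hand side.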

Since the old typical orbitals are no longer of interest, the overlines on the special orbitals can be dropped to save typing, and the Greek index names $\mu$ and $\lambda$ can be renamed $n$ and ${\underline n}$ . That then finally produces the canonical closed-shell restricted Hartree-Fock equations:

$\begin{displaymath} \fbox{$\displaystyle h^{\rm e}\pe n//// + 2 \sum_{{\under... ...n////\rangle\pe{\underline n}//// = \epsilon_n \pe n//// $} \end{displaymath}$

(D.36)

Note that the left-hand side directly provides a Hermitian Fock operator if you identify it as ${\cal F}\pe{n}////$ ; there is no need to involve spin in the closed-shell restricted case. This also provides a much simpler explanation than all the algebra above why all the earlier $\epsilon_{mn}$ with $m\ne{}n$ were not needed; existence of a set of orthonormal eigenfunctions of a Hermitian operator is automatic. So there is no fundamental need to enforce that separately through Lagrangian multipliers.

Turning now to the case of (fully) unrestricted Hartree-Fock (UHF), you might make the same simple argument as above and be done. But it is worthwhile to go through the full mathematics anyway, to better understand open-shell restricted Hartree-Fock later. In the unrestricted case, the non canonical equations are

$\begin{displaymath} h^{\rm e}\pe m//// + \sum_{n=1}^N \langle\pe n////\vert v^... ... m////\rangle\pe n//// = \sum_{n=1}^N \epsilon_{mn}\pe n//// \end{displaymath}$

In this case, there are two different types of spatial orbitals; those appearing in spin-up spin orbitals, and those appearing in spin-down spin orbitals. You cannot just make arbitrary combinations of all these orbitals. If you combine spin-up and spin-down orbitals, they correspond to spin orbitals of uncertain spin. That would make the assumptions used to derive the Hartree-Fock equations invalid.

However, combinations of purely spin-up orbitals can still be made without problems, and so can combinations of purely spin-down orbitals. To do the mathematics, the spatial orbitals can be separated into two sets. The set of orbital numbers corresponding to spin-up spin orbitals will be indicated by U, and the set of numbers corresponding to spin-down spin orbitals by D. So you can partition (separate) the non canonical equations above into equations for $m\in{\rm {U}}$ (meaning $m$ is one of the values in set U),

$\begin{displaymath} h^{\rm e}\pe m//// + \sum_{n\in{\rm U}} \langle\pe n////\v... ...\rangle\pe n//// = \sum_{n\in{\rm U}} \epsilon_{mn}\pe n//// \end{displaymath}$

and equations for $m\in{\rm {D}}$ ,

$\begin{displaymath} h^{\rm e}\pe m//// + \sum_{n\in{\rm U}} \langle\pe n////\v... ...\rangle\pe n//// = \sum_{n\in{\rm D}} \epsilon_{mn}\pe n//// \end{displaymath}$

In these two types of equations, the fact that the up and down spin states are orthogonal was used to get rid of one pair of sums, and another pair was eliminated by the fact that there are no Lagrangian variables $\epsilon_{mn}$ linking the sets, since the spatial orbitals in the two sets are allowed to be mutually non orthogonal.

Now separately replace the orbitals of the up and down states by a modified set just like for the restricted closed-shell case above, for each using the unitary matrix of eigenvectors of the $\epsilon_{mn}$ coefficients appearing in the right hand side of the equations for that set. It leaves the equations intact except for changes in names, but gets rid of the $\epsilon_{mn}$ for $m\ne{}n$, leaving only $\epsilon_{mm}$ values, call them $\epsilon_m$ . Then combine the spin-up and spin-down equations again into a single expression. You get, in terms of revised symbol names,

$\begin{displaymath} \fbox{$\displaystyle h^{\rm e}\pe n//// + \sum_{{\underli... ...////\rangle\pe {\underline n}//// = \epsilon_n \pe n//// $} \end{displaymath}$

(D.37)

That leaves only the restricted open-shell Hartree-Fock method. Here, the partitioning also needs to include the set R of restricted orbitals besides U and D. There is now a problem, because you cannot make combinations of restricted orbitals with spin-up or spin-down orbitals. That means that the $\epsilon_{mn}$ values where either $m$ or $n$ is restricted and the other is not, cannot be eliminated. Solutions range from just ignoring the whole thing to properly accounting for these $\epsilon_{mn}$ values by enforcing that restricted and non restricted orbitals must stay orthogonal as additional equations. This (even more) elaborate case will be left to the references that you can find in [46], in particular [28, pp. 242-253].

Woof.
