D.71 Emergence of spin from relativity

This note will give a (relatively) simple derivation of the Dirac equation to show how relativity naturally gives rise to spin. The equation will be derived without ever mentioning the word spin while doing it, just to prove it can be done. Only Dirac’s assumption that Einstein's square root disappears,

\begin{displaymath}
\sqrt{\left(m c^2\right)^2 + \sum_{i=1}^3 \left({\widehat p}_i c\right)^2}
= \alpha_0 mc^2 + \sum_{i=1}^3 \alpha_i {\widehat p}_i c,
\end{displaymath}

will be used, along with a few other assumptions that have nothing to do with spin.

The conditions on the coefficient matrices $\alpha_i$ for the linear combination to equal the square root can be found by squaring both sides in the equation above and then comparing sides. They turn out to be:

\begin{displaymath}
\alpha_i^2 = 1 \mbox{ for every $i$}
\qquad
\alpha_i\alpha_j+\alpha_j\alpha_i = 0
\mbox{ for $i\ne j$} %
\end{displaymath} (D.45)
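
As a sketch of that squaring step, write $q_0 = mc^2$ and $q_i = {\widehat p}_i c$ for $i = 1,2,3$ (shorthand introduced only for this check), and assume that the $\alpha_i$ are constant matrices, so that they commute with the momentum operators, which in turn commute with each other. Then

\begin{displaymath}
\left(\sum_{i=0}^3 \alpha_i q_i\right)^2
= \sum_{i=0}^3 \alpha_i^2 q_i^2
+ \sum_{0\le i<j\le 3} \left(\alpha_i\alpha_j+\alpha_j\alpha_i\right) q_i q_j
\end{displaymath}

which reduces to $\sum_i q_i^2$, the expression under the square root, when the conditions (D.45) hold.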

Now assume that the matrices $\alpha_i$ are Hermitian, as appropriate for measurable energies, and choose to describe the wave function vector in terms of the eigenvectors of matrix $\alpha_0$. Under those conditions $\alpha_0$ will be a diagonal matrix, and its diagonal elements must be $\pm1$ for its square to be the unit matrix. So, choosing the order of the eigenvectors suitably,

\begin{displaymath}
\alpha_0=\left(\begin{array}{cc} 1 & 0 \\ 0 & -1 \end{array}\right)
\end{displaymath}

where the sizes of the positive and negative unit matrices in $\alpha_0$ are still undecided; one of the two could in principle be of zero size.

However, since $\alpha_0\alpha_i+\alpha_i\alpha_0$ must be zero for the three other Hermitian $\alpha_i$ matrices, it is seen from multiplying that out that they must be of the form

\begin{displaymath}
\alpha_1 =
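\left(\begin{array}{cc}0&\sigma^{\rm {H}}_1\\ \sigma_1&0\end{array}\right)
\quad
\alpha_2 =
\left(\begin{array}{cc}0&\sigma^{\rm {H}}_2\\ \sigma_2&0\end{array}\right)
\quad
\alpha_3 =
\left(\begin{array}{cc}0&\sigma^{\rm {H}}_3\\ \sigma_3&0\end{array}\right).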
\end{displaymath}
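
To see where that form comes from, write a general Hermitian $\alpha_i$ in blocks matching those of $\alpha_0$. The block names $A$, $B$, and $C$ are placeholders used only in this check:

\begin{displaymath}
\alpha_0\alpha_i+\alpha_i\alpha_0 =
\left(\begin{array}{cc} 1 & 0 \\ 0 & -1 \end{array}\right)
\left(\begin{array}{cc} A & B^{\rm {H}} \\ B & C \end{array}\right)
+
\left(\begin{array}{cc} A & B^{\rm {H}} \\ B & C \end{array}\right)
\left(\begin{array}{cc} 1 & 0 \\ 0 & -1 \end{array}\right)
=
\left(\begin{array}{cc} 2A & 0 \\ 0 & -2C \end{array}\right)
\end{displaymath}

For this to vanish, the diagonal blocks $A$ and $C$ must be zero. What is left is the off-diagonal block $B$, called $\sigma_i$ above, and its Hermitian conjugate.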

The $\sigma_i$ matrices, whatever they are, must be square in size or the $\alpha_i$ matrices would be singular and could not square to one. This then implies that the positive and negative unit matrices in $\alpha_0$ must be the same size.

Now try to satisfy the remaining conditions on $\alpha_1$, $\alpha_2$, and $\alpha_3$ using just complex numbers, rather than matrices, for the $\sigma_i$. By multiplying out the conditions (D.45), you see that

\begin{displaymath}
\alpha_i \alpha_i = 1
\Longrightarrow \sigma_i^{\rm {H}}\sigma_i = \sigma_i\sigma_i^{\rm {H}} = 1
\end{displaymath}


\begin{displaymath}
\alpha_i \alpha_j + \alpha_j \alpha_i = 0
\Longrightarrow \sigma_i^{\rm {H}}\sigma_j + \sigma_j^{\rm {H}}\sigma_i
= \sigma_i\sigma_j^{\rm {H}} + \sigma_j\sigma_i^{\rm {H}} = 0.
\end{displaymath}
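
The block product behind both conditions, assuming the off-diagonal form of the $\alpha_i$ found earlier, can be written out as

\begin{displaymath}
\alpha_i\alpha_j =
\left(\begin{array}{cc} 0 & \sigma_i^{\rm {H}} \\ \sigma_i & 0 \end{array}\right)
\left(\begin{array}{cc} 0 & \sigma_j^{\rm {H}} \\ \sigma_j & 0 \end{array}\right)
=
\left(\begin{array}{cc} \sigma_i^{\rm {H}}\sigma_j & 0 \\ 0 & \sigma_i\sigma_j^{\rm {H}} \end{array}\right)
\end{displaymath}

Taking $j = i$ and setting the result equal to the unit matrix gives the first condition. Adding the same product with $i$ and $j$ swapped and setting the sum to zero gives the second.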

The first condition above would require each $\sigma_i$ to be a number of magnitude one, in other words, a number that can be written as $e^{{\rm i}\phi_i}$ for some real angle $\phi_i$. The second condition is then, according to the Euler formula (2.5), equivalent to the requirement that

\begin{displaymath}
\cos\left(\phi_i - \phi_j\right) = 0 \mbox{ for $i\ne j$};
\end{displaymath}

this implies that any two of the three angles would have to be 90 degrees apart. That is impossible: if $\phi_2$ and $\phi_3$ are each 90 degrees apart from $\phi_1$, then $\phi_2$ and $\phi_3$ are either the same or apart by 180 degrees; not by 90 degrees.
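
Written out, with $\sigma_i = e^{{\rm i}\phi_i}$ and noting that the Hermitian conjugate of a number is just its complex conjugate, the step from the second condition to the cosine is

\begin{displaymath}
\sigma_i^{\rm {H}}\sigma_j + \sigma_j^{\rm {H}}\sigma_i
= e^{{\rm i}(\phi_j-\phi_i)} + e^{-{\rm i}(\phi_j-\phi_i)}
= 2\cos\left(\phi_i - \phi_j\right)
\end{displaymath}

As a concrete instance, taking $\phi_1 = 0$ allows $\phi_2 = 90^\circ$ and $\phi_3 = 270^\circ$, but then $\cos(\phi_2-\phi_3) = \cos(180^\circ) = -1$, which is not zero.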

It follows that the components $\sigma_i$ cannot be numbers, and must be matrices too. Assume, reasonably, that they correspond to some measurable quantity and are Hermitian. In that case the conditions above on the $\sigma_i$ are the same as those on the $\alpha_i$, with one critical difference: there are only three $\sigma_i$ matrices, not four. And so the analysis repeats.

Choose to describe the wave function in terms of the eigenvectors of the $\sigma_3$ matrix; this does not conflict with the earlier choice since all half wave function vectors are eigenvectors of the positive and negative unit matrices in $\alpha_0$. So you have

\begin{displaymath}
\sigma_3=\left(\begin{array}{cc} 1 & 0 \\ 0 & -1 \end{array}\right)
\end{displaymath}

and the other two matrices must then be of the form

\begin{displaymath}
\sigma_1=\left(\begin{array}{cc}0&\tau^{\rm {H}}_1\\ \tau_1&0\end{array}\right)
\qquad
\sigma_2=\left(\begin{array}{cc}0&\tau^{\rm {H}}_2\\ \tau_2&0\end{array}\right)
\end{displaymath}

But now the components $\tau_1$ and $\tau_2$ can indeed be just complex numbers, since there are only two, and two angles can be apart by 90 degrees. You can take $\tau_1 = e^{{\rm i}\phi_1}$ and then $\tau_2 = e^{{\rm i}(\phi_1+\pi/2)}$ or $e^{{\rm i}(\phi_1-\pi/2)}$. The existence of two possibilities for $\tau_2$ implies that on the wave function level, nature is not mirror symmetric; momentum in the positive $y$-direction interacts differently with the $x$ and $z$ momenta than in the opposite direction. Since the observable effects are mirror symmetric, do not worry about it and just take the first possibility.

So, the goal of finding a formulation in which Einstein's square root falls apart has been achieved. However, you can clean up some more, by redefining the value of $\tau_1$ away. If the four-dimensional wave function vector takes the form $(a_1,a_2,a_3,a_4)$, define $\bar{a}_1 = e^{{\rm i}\phi_1/2}a_1$, $\bar{a}_2 = e^{-{\rm i}\phi_1/2}a_2$, and similarly for $\bar{a}_3$ and $\bar{a}_4$.

In that case, the final cleaned-up $\sigma$ matrices are

\begin{displaymath}
\sigma_3 =
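\left(\begin{array}{rr} 1 & 0\\ 0 & -1\end{array}\right)
\quad
\sigma_1 =
\left(\begin{array}{rr} 0 & 1\\ 1 & 0\end{array}\right)
\quad
\sigma_2 =
\left(\begin{array}{rr} 0 & -{\rm i}\\ {\rm i}& 0\end{array}\right)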
\end{displaymath} (D.46)
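
As a check on the redefinition, and assuming the first possibility $\tau_2 = e^{{\rm i}(\phi_1+\pi/2)}$, let $U$ (a name used only here) be the diagonal matrix with entries $e^{{\rm i}\phi_1/2}$ and $e^{-{\rm i}\phi_1/2}$ that turns $(a_1,a_2)$ into $(\bar{a}_1,\bar{a}_2)$. The matrices acting on the barred components are then

\begin{displaymath}
U\sigma_1U^{\rm {H}} =
\left(\begin{array}{cc} 0 & e^{{\rm i}\phi_1}\tau_1^{\rm {H}} \\ e^{-{\rm i}\phi_1}\tau_1 & 0 \end{array}\right)
= \left(\begin{array}{rr} 0 & 1\\ 1 & 0\end{array}\right)
\qquad
U\sigma_2U^{\rm {H}} =
\left(\begin{array}{cc} 0 & e^{{\rm i}\phi_1}\tau_2^{\rm {H}} \\ e^{-{\rm i}\phi_1}\tau_2 & 0 \end{array}\right)
= \left(\begin{array}{rr} 0 & -{\rm i}\\ {\rm i}& 0\end{array}\right)
\end{displaymath}

while the diagonal $\sigma_3$ is unchanged. One of the remaining conditions can also be verified directly:

\begin{displaymath}
\sigma_1\sigma_2+\sigma_2\sigma_1 =
\left(\begin{array}{rr} {\rm i} & 0\\ 0 & -{\rm i}\end{array}\right)
+
\left(\begin{array}{rr} -{\rm i} & 0\\ 0 & {\rm i}\end{array}\right)
= 0
\end{displaymath}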

The s word has not been mentioned even once in this derivation. So, now please express audible surprise that the $\sigma_i$ matrices turn out to be the Pauli (it can now be said) spin matrices of chapter 12.10.

But there is more. Suppose you define a new coordinate system rotated 90 degrees around the $z$-axis. This turns the old $y$-axis into a new $x$-axis. Since $\tau_2$ has an additional factor $e^{{\rm i}\pi/2}$, to get the normalized coefficients, you must include an additional factor $e^{{\rm i}\pi/4}$ in $\bar{a}_1$, which by the fundamental definition of angular momentum discussed in addendum {A.19} means that it describes a state with angular momentum $\frac12\hbar$. Similarly $a_3$ corresponds to a state with angular momentum $\frac12\hbar$ and $a_2$ and $a_4$ to ones with $-\frac12\hbar$.

For nonzero momentum, the relativistic evolution of spin and momentum becomes coupled. But still, if you look at the eigenstates of positive energy, they take the form:

\begin{displaymath}
\left(
\begin{array}{c}
\vec a\\
\varepsilon ({\skew0\vec p}\cdot\vec\sigma) \vec a
\end{array}
\right)
\end{displaymath}

where $\varepsilon$ is a small number in the nonrelativistic limit and $\vec{a}$ is the two-component vector $(a_1,a_2)$. The operator corresponding to rotation of the coordinate system around the momentum vector commutes with ${\skew0\vec p}\cdot\vec\sigma$, hence the entire four-dimensional vector transforms as a combination of a spin $\frac12\hbar$ state and a spin $-\frac12\hbar$ state for rotation around the momentum vector.
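
The value of $\varepsilon$ can be sketched for a state of definite momentum, so that ${\skew0\vec p}$ may be treated as an ordinary vector. The symbols $E$ for the positive energy eigenvalue and $\vec b$ for the lower half of the wave function vector are introduced only for this illustration. With the block forms of the $\alpha_i$ found above and Hermitian $\sigma_i$, the lower half of the eigenvalue problem reads

\begin{displaymath}
c\,({\skew0\vec p}\cdot\vec\sigma)\,\vec a - mc^2\,\vec b = E\,\vec b
\quad\Longrightarrow\quad
\vec b = \frac{c}{E+mc^2}\,({\skew0\vec p}\cdot\vec\sigma)\,\vec a
\end{displaymath}

so that $\varepsilon = c/(E+mc^2)$. In the nonrelativistic limit $E\approx mc^2$, which makes $\varepsilon\approx 1/(2mc)$, and the lower half is smaller than the upper half by a factor of order $\vert{\skew0\vec p}\vert/(2mc)$.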