A.19 Conservation Laws and Symmetries

This note takes a closer look at the relation between conservation laws and symmetries. As an example it derives the law of conservation of angular momentum directly from the rotational symmetry of physics. It then briefly explains how the arguments carry over to other conservation laws like linear momentum and parity. A simple example of a local gauge symmetry is also given. The final subsection has a few remarks about the symmetry of physics with respect to time shifts.


A.19.1 An example symmetry transformation

The mathematician Weyl gave a simple definition of a symmetry. A symmetry exists if you do something and it does not make a difference. A circular cylinder is an axially symmetric object because if you rotate it around its axis over some arbitrary angle, it still looks exactly the same. However, this note is not concerned with symmetries of objects, but of physics. Those are symmetries where you do something, like place a system of particles at a different position or angle, and the physics stays the same. The system of particles itself does not necessarily need to be symmetric here.

As an example, this subsection and the next ones will explore one particular symmetry and its conservation law. The symmetry is that the physics is the same if a system of particles is placed under a different angle in otherwise empty space. There are no preferred directions in empty space. The angle that you place a system under does not make a difference. The corresponding conservation law will turn out to be conservation of angular momentum.

First a couple of clarifications. Empty space should really be understood to mean that there are no external effects on the system. A hydrogen atom in a vacuum container on earth is effectively in empty space. Or at least it is as far as its electronic structure is concerned. The energies associated with the gravity of earth and with collisions with the walls of the vacuum container are negligible. Atomic nuclei are normally effectively in empty space because the energies to excite them are so large compared to electronic energies. As a macroscopic example, to study the internal motion of the solar system the rest of the galaxy can presumably safely be ignored. Then the solar system too can be considered to be in empty space.

Further, placing a system under a different angle may be somewhat awkward. Don’t burn your fingers on that hot sun when placing the solar system under a different angle. And there always seems to be a vague suspicion that you will change something nontrivially by placing the system under a different angle.

There is a different, better, way. Note that you will always need a coordinate system to describe the evolution of the system of particles mathematically. Instead of putting the system of particles under a different angle, you can put that coordinate system under a different angle. It has the same effect. In empty space there is no reference direction to say which one got rotated, the particle system or the coordinate system. And rotating the coordinate system leaves the system truly untouched. That is why the view that the coordinate system gets rotated is called the “passive view.” The view that the system itself gets rotated is called the “active view.”

Figure A.7: Effect of a rotation of the coordinate system on the spherical coordinates of a particle at an arbitrary location P.

Figure A.7 shows graphically what happens to the position coordinates of a particle if the coordinate system gets rotated. The original coordinate system is indicated by primes. The $z'$-​axis has been chosen along the axis of the desired rotation. Rotation of this coordinate system over an angle $\gamma$ produces a new coordinate system indicated without primes. In terms of spherical coordinates, the radial position $r$ of the particle does not change. And neither does the polar angle $\theta$. But the azimuthal angle $\phi$ does change. As the figure shows, the relation between the azimuthal angles is

\begin{displaymath}
\phi' = \phi + \gamma
\end{displaymath}

That is the basic mathematical description of the symmetry transformation.
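The relation between the angles can be checked numerically. The following is only a sketch: the helper names `spherical` and `to_rotated_axes`, the sample point, and the value of $\gamma$ are made up for this illustration. It expresses one fixed point P in the original (primed) axes and in axes rotated by $\gamma$ around the $z$-axis, and compares the spherical coordinates.

```python
import math

def spherical(x, y, z):
    """Return (r, theta, phi) of a Cartesian point, with phi in (-pi, pi]."""
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / r)
    phi = math.atan2(y, x)
    return r, theta, phi

def to_rotated_axes(x_old, y_old, z_old, gamma):
    """Components of the same point along axes rotated by gamma around z."""
    x_new = math.cos(gamma) * x_old + math.sin(gamma) * y_old
    y_new = -math.sin(gamma) * x_old + math.cos(gamma) * y_old
    return x_new, y_new, z_old

gamma = 0.4                      # rotation angle of the axes (arbitrary choice)
p_old = (0.3, 0.5, 0.8)          # point P in the original, primed, system
p_new = to_rotated_axes(*p_old, gamma)

r_old, theta_old, phi_old = spherical(*p_old)   # primed coordinates
r_new, theta_new, phi_new = spherical(*p_new)   # unprimed coordinates

print("r unchanged:    ", math.isclose(r_old, r_new))
print("theta unchanged:", math.isclose(theta_old, theta_new))
print("phi' - phi =", phi_old - phi_new, " gamma =", gamma)
```

The sample point was chosen so that neither azimuthal angle crosses the branch cut of `atan2`; for other points the difference may come out shifted by $2\pi$.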

However, it must still be applied to the description of the physics. And in quantum mechanics, the physics is described by a wave function $\Psi$ that depends on the position coordinates of the particles:

\begin{displaymath}
\Psi(r_1,\theta_1,\phi_1,r_2,\theta_2,\phi_2,\ldots ; t)
\end{displaymath}

where 1, 2, ..., is the numbering of the particles. Particle spin will be ignored for now.

Physically absolutely nothing changes if the coordinate system is rotated. So the values $\Psi$ of the wave function in the rotated coordinate system are exactly the same as the values $\Psi'$ in the original coordinate system. But the particle coordinates corresponding to these values do change:

\begin{displaymath}
\Psi(r_1,\theta_1,\phi_1,r_2,\theta_2,\phi_2,\ldots ; t)
=
\Psi'(r_1',\theta_1',\phi_1',r_2',\theta_2',\phi_2',\ldots ; t)
\end{displaymath}

Therefore, considered as functions, $\Psi'$ and $\Psi$ are different. However, only the azimuthal angles change. In particular, putting in the relation between the azimuthal angles above gives:

\begin{displaymath}
\Psi(r_1,\theta_1,\phi_1,r_2,\theta_2,\phi_2,\ldots ; t)
=
\Psi'(r_1,\theta_1,\phi_1+\gamma,r_2,\theta_2,\phi_2+\gamma,\ldots ; t)
\end{displaymath}

Mathematically, changes in functions are most conveniently written in terms of an appropriate operator, chapter 2.4. The operator here is called the “generator of rotations around the $z$-​axis.” It will be indicated as ${\cal R}_{z,\gamma}$. What it does is add $\gamma$ to the azimuthal angles of the function. By definition:

\begin{displaymath}
{\cal R}_{z,\gamma} \Psi'(r_1,\theta_1,\phi_1,r_2,\theta_2,\phi_2,\ldots ; t)
\equiv
\Psi'(r_1,\theta_1,\phi_1+\gamma,r_2,\theta_2,\phi_2+\gamma,\ldots ; t)
\end{displaymath}

In terms of this operator, the relationship between the wave functions in the rotated and original coordinate systems can be written concisely as

\begin{displaymath}
\Psi = {\cal R}_{z,\gamma} \Psi'
\end{displaymath}

Using ${\cal R}_{z,\gamma}$, there is no longer a need for using primes on one set of coordinates. Take any wave function in terms of the original coordinates, written without primes. Application of ${\cal R}_{z,\gamma}$ will turn it into the corresponding wave function in the rotated coordinates, also written without primes.

So far, this is all mathematics. The above expression applies whether or not there is symmetry with respect to rotations. It even applies whether or not $\Psi$ is a wave function.
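Since the operator is pure mathematics, it is easy to mimic in code. The sketch below treats ${\cal R}_{z,\gamma}$ as a function that takes a function and returns a new one with $\gamma$ added to the azimuthal angle. The names `rotate` and `psi_prime` and the sample function are assumptions made for this illustration only; a single particle is used to keep it short.

```python
import math

def rotate(psi_original, gamma):
    """Return R_{z,gamma} applied to psi_original: gamma is added to phi."""
    def psi_rotated(r, theta, phi):
        return psi_original(r, theta, phi + gamma)
    return psi_rotated

def psi_prime(r, theta, phi):
    # Hypothetical function of the original coordinates; any function works.
    return r * math.exp(-r) * math.sin(theta) * (1.0 + 0.5 * math.cos(phi))

gamma = 0.4
psi = rotate(psi_prime, gamma)   # the function in the rotated coordinates

# One physical point: phi' in the original system, phi = phi' - gamma in the
# rotated system.  Both descriptions must give the same value.
r, theta, phi_prime = 1.2, 0.9, 1.0
phi = phi_prime - gamma
value_rotated = psi(r, theta, phi)
value_original = psi_prime(r, theta, phi_prime)

# Two successive rotations act like one rotation over the summed angle.
twice = rotate(rotate(psi_prime, 0.1), 0.3)(r, theta, phi)
once = rotate(psi_prime, 0.4)(r, theta, phi)
print(value_rotated, value_original, twice, once)
```

Nothing here checks for symmetry; as the text says, the construction works for any function at all.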


A.19.2 Physical description of a symmetry

The next question is what it means in terms of physics that empty space has no preferred directions. According to quantum mechanics, the Schrödinger equation describes the physics. It says that the time derivative of the wave function can be found as

\begin{displaymath}
\frac{\partial \Psi}{\partial t} = \frac{1}{{\rm i}\hbar} H \Psi
\end{displaymath}

where $H$ is the Hamiltonian. If space has no preferred directions, then the Hamiltonian must be the same regardless of angular orientation of the coordinate system used.

In particular, consider the two coordinate systems of the previous subsection. The second system differed from the first by a rotation over an arbitrary angle $\gamma$ around the $z$-​axis. If one system had a different Hamiltonian than the other, then systems of particles would be observed to evolve in a different way in that coordinate system. That would provide a fundamental distinction between the two coordinate system orientations right there.

A couple of very basic examples can make this more concrete. Consider the electronic structure of the hydrogen atom as analyzed in chapter 4.3. The electron was not in empty space in that analysis. It was around a proton, which was assumed to be at rest at the origin. However, the electric field of the proton has no preferred direction either. (Proton spin was ignored). Therefore the current analysis does apply to the electron of the hydrogen atom. In terms of Cartesian coordinates, the Hamiltonian in the original $x',y',z'$ coordinate system is

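\begin{displaymath}
H' = - \frac{\hbar^2}{2m_{\rm e}}
\left[
\frac{\partial^2}{\partial {x'}^2} +
\frac{\partial^2}{\partial {y'}^2} +
\frac{\partial^2}{\partial {z'}^2}
\right]
- \frac{e^2}{4\pi\epsilon_0}\frac{1}{\sqrt{{x'}^2+{y'}^2+{z'}^2}}
\end{displaymath}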

The first term is the kinetic energy operator. It is proportional to the Laplacian operator, inside the square brackets. Standard vector calculus says that this operator is independent of the angular orientation of the coordinate system. So to get the corresponding operator in the rotated $x,y,z$ coordinate system, simply leave away the primes. The second term is the potential energy in the field of the proton. It is inversely proportional to the distance of the electron from the origin. The expression for the distance from the origin is the same in the rotated coordinate system. Once again, just leave away the primes. The bottom line is that you cannot see a difference between the two coordinate systems by looking at their Hamiltonians. The expressions for the Hamiltonians are identical.

As a second example, consider the analysis of the complete hydrogen atom as described in addendum {A.5}. The complete atom was assumed to be in empty space; there were no external effects on the atom included. The analysis still ignored all relativistic effects, including the electron and proton spins. However, it did include the motion of the proton. That meant that the kinetic energy of the proton had to be added to the Hamiltonian. But that too is a Laplacian, now in terms of the proton coordinates $x'_{\rm {p}},y'_{\rm {p}},z'_{\rm {p}}$. Its expression too is the same regardless of angular orientation of the coordinate system. And in the potential energy term, the distance from the origin now becomes the distance between electron and proton. But the formula for the distance between two points is the same regardless of angular orientation of the coordinate system. So once again, the expression for the Hamiltonian does not depend on the angular orientation of the coordinate system.

The equality of the Hamiltonians in the original and rotated coordinate systems has a consequence. It leads to a mathematical requirement for the operator ${\cal R}_{z,\gamma}$ of the previous subsection that describes the effect of a coordinate system rotation on wave functions. This operator must commute with the Hamiltonian:

\begin{displaymath}
H {\cal R}_{z,\gamma} = {\cal R}_{z,\gamma} H
\end{displaymath}

That follows from examining the wave function of a system as seen in both the original and the rotated coordinate system. There are two ways to find the time derivative of the wave function in the rotated coordinate system. One way is to rotate the original wave function using ${\cal R}_{z,\gamma}$ to get the one in the rotated coordinate system. Then you can apply the Hamiltonian on that. The other way is to apply the Hamiltonian on the wave function in the original coordinate system to find the time derivative in the original coordinate system. Then you can use ${\cal R}_{z,\gamma}$ to convert that time derivative to the rotated system. The Hamiltonian and ${\cal R}_{z,\gamma}$ get applied in the opposite order, but the result must still be the same.
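The two orders of application can be compared in a toy model. The sketch below is an assumption-laden illustration, not the analysis of the text: it keeps only the azimuthal angle, puts it on a periodic grid, uses a finite-difference stand-in $-\partial^2/\partial\phi^2 + V(\phi)$ for the Hamiltonian with all physical constants dropped, and lets the rotation be a shift over a whole number of grid points. All names are hypothetical.

```python
import math

N = 64                           # grid points around the circle (assumed)
DPHI = 2 * math.pi / N

def shift(psi, k):
    """Discrete R_{z,gamma}, gamma = k*DPHI: (R psi)(phi_j) = psi(phi_j + gamma)."""
    return [psi[(j + k) % N] for j in range(N)]

def hamiltonian(psi, potential):
    """Apply -d^2/dphi^2 + V(phi) using periodic central differences."""
    out = []
    for j in range(N):
        second = (psi[(j + 1) % N] - 2 * psi[j] + psi[(j - 1) % N]) / DPHI**2
        out.append(-second + potential[j] * psi[j])
    return out

def max_commutator(psi, potential, k):
    """Largest entry, in magnitude, of (H R - R H) psi."""
    hr = hamiltonian(shift(psi, k), potential)
    rh = shift(hamiltonian(psi, potential), k)
    return max(abs(a - b) for a, b in zip(hr, rh))

phis = [j * DPHI for j in range(N)]
psi = [math.exp(math.cos(p)) + 0.3 * math.sin(2 * p) for p in phis]

symmetric_v = [1.5] * N                     # no preferred direction
lopsided_v = [math.cos(p) for p in phis]    # singles out the direction phi = 0

sym_mismatch = max_commutator(psi, symmetric_v, 5)
lop_mismatch = max_commutator(psi, lopsided_v, 5)
print("symmetric potential:", sym_mismatch)
print("lopsided potential: ", lop_mismatch)
```

With the direction-independent potential the two orders agree; with a potential that singles out a direction they do not, and the rotation is then no longer a symmetry.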

This observation can be inverted to define a symmetry of physics in general:

A symmetry of physics is described by a unitary operator that commutes with the Hamiltonian.
If an operator commutes with the Hamiltonian, then the same Hamiltonian applies in the changed coordinate system. So there is no physical difference in how systems evolve between the two coordinate systems.

The qualification “unitary” means that the operator should not change the magnitude of the wave function. The wave function should remain normalized. It does for the transformations of interest in this note, like rotations of the coordinate system, shifts of the coordinate system, time shifts, and spatial coordinate inversions. All of these transformations are unitary. Like Hermitian operators, unitary operators have a complete set of orthonormal eigenfunctions. However, the eigenvalues are normally not real numbers.

For those who wonder, time reversal is somewhat of a special case. To understand the difficulty, consider first the operation “take the complex conjugate of the wave function.” This operator preserves the magnitude of the wave function. And it commutes with the Hamiltonian, assuming a basic real Hamiltonian. But taking the complex conjugate is not a linear operator. For a linear operator $({\rm i}\Psi)' = {\rm i}(\Psi)'$. But $({\rm i}\Psi)^* = -{\rm i}\Psi^*$. If constants come out of an operator as complex conjugates, the operator is called “antilinear.” So taking the complex conjugate is antilinear. Another issue: a linear unitary operator preserves the inner products between any two wave functions $\Psi_1$ and $\Psi_2$. (That can be verified by expanding the square magnitudes of $\Psi_1+\Psi_2$ and $\Psi_1+{\rm i}\Psi_2$). However, taking the complex conjugate changes inner products into their complex conjugates. Operators that do that are called “antiunitary.” So taking the complex conjugate is both antilinear and antiunitary. (Of course, in normal language it is neither. The appropriate terms would have been conjugate-linear and conjugate-unitary. But if you got this far in this book, you know how much chance appropriate terms have of being used in physics.)

Now the effect of time-reversal on wave functions turns out to be antilinear and antiunitary too, [48, p. 76]. One simple way to think about it is that a straightforward time reversal would change $e^{-{{\rm i}}Et/\hbar}$ into $e^{{{\rm i}}Et/\hbar}$. Then an additional complex conjugate will take things back to positive energies. For the same reason you do not want to add a complex conjugate to spatial transformations or time shifts.


A.19.3 Derivation of the conservation law

The definition of a symmetry as an operator that commutes with the Hamiltonian may seem abstract. But it has a less abstract consequence. It implies that the eigenfunctions of the symmetry operation can be taken to be also eigenfunctions of the Hamiltonian, {D.18}. And, as chapter 7.1.4 discussed, the eigenfunctions of the Hamiltonian are stationary. They change in time by a mere scalar factor $e^{-{\rm i}Et/\hbar}$ of magnitude 1 that does not change their physical properties.

The fact that the eigenfunctions do not change is responsible for the conservation law. Consider what a conservation law really means. It means that there is some number that does not change in time. For example, conservation of angular momentum in the $z$-direction means that the net angular momentum of the system in the $z$-direction, a number, does not change.

And if the system of particles is described by an eigenfunction of the symmetry operator, then there is indeed a number that does not change: the eigenvalue of that eigenfunction. The scalar factor $e^{-{\rm i}Et/\hbar}$ changes the eigenfunction, but not the eigenvalue that would be produced by applying the symmetry operator at different times. The eigenvalue can therefore be looked upon as a specific value of some conserved quantity. In those terms, if the state of the system is given by a different eigenfunction, with a different eigenvalue, it has a different value for the conserved quantity.

The eigenvalues of a symmetry of physics describe the possible values of a conserved quantity.

Of course, the system of particles might not be described by a single eigenfunction of the symmetry operator. It might be a mixture of eigenfunctions, with different eigenvalues. But that merely means that there is quantum mechanical uncertainty in the conserved quantity. That is just like there may be uncertainty in energy. Even if there is uncertainty, still the mixture of eigenvalues does not change with time. Each eigenfunction is still stationary. Therefore the probability of getting a given value for the conserved quantity does not change with time. In particular, neither the expectation value of the conserved quantity, nor the amount of uncertainty in it changes with time.
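A few lines of code can make the statement about mixtures concrete. The sketch below assumes a made-up mixture of three joint eigenfunctions, labeled by a value $m$ of the conserved quantity, each with its own energy and a standard time factor $e^{-{\rm i}Et/\hbar}$. The coefficients, energies, and units ($\hbar=1$) are all assumptions for the illustration.

```python
import cmath

HBAR = 1.0   # natural units, assumed for this sketch

# Hypothetical mixture: m -> (coefficient at time zero, energy of that state).
mixture = {
    -1: (0.6 + 0.0j, 2.0),
    0: (0.0 + 0.48j, 1.0),
    2: (0.64 + 0.0j, 3.5),
}

def probabilities(t):
    """Probability of each value m at time t."""
    probs = {}
    for m, (c0, energy) in mixture.items():
        c_t = c0 * cmath.exp(-1j * energy * t / HBAR)   # stationary evolution
        probs[m] = abs(c_t) ** 2
    return probs

def expectation_m(t):
    """Expectation value of the conserved quantity at time t."""
    return sum(m * p for m, p in probabilities(t).items())

early = probabilities(0.0)
late = probabilities(7.3)
print(early, expectation_m(0.0))
print(late, expectation_m(7.3))
```

Each coefficient only picks up a factor of magnitude 1, so the probabilities, and with them the expectation value and the uncertainty, come out the same at both times.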

The eigenvalues of a symmetry operator may require some cleaning up. They may not directly give the conserved quantity in the desired form. Consider for example the eigenvalues of the rotation operator ${\cal R}_{z,\gamma}$ discussed in the previous subsections. You would surely expect a conserved quantity of a system to be a real quantity. But the eigenvalues of ${\cal R}_{z,\gamma}$ are in general complex numbers.

The one thing that can be said about the eigenvalues is that they are always of magnitude 1. Otherwise an eigenfunction would change in magnitude during the rotation. But a function does not change in magnitude if it is merely viewed under a different angle. And if the eigenvalues are of magnitude 1, then the Euler formula (2.5) implies that they can always be written in the form

\begin{displaymath}
e^{{\rm i}\alpha}
\end{displaymath}

where $\alpha$ is some real number. If the eigenvalue does not change with time, then neither does $\alpha$, which is basically just its logarithm.

But although $\alpha$ is real and conserved, still it is not the desired conserved quantity. Consider the possibility that you perform another rotation of the axis system. Each rotation multiplies the eigenfunction by a factor $e^{{\rm i}\alpha}$ for a total of $e^{2{\rm i}\alpha}$. In short, if you double the angle of rotation $\gamma$, you also double the value of $\alpha$. But it does not make sense to say that both $\alpha$ and $2\alpha$ are conserved. If $\alpha$ is conserved, then so is $2\alpha$; that is not a second conservation law. Since $\alpha$ is proportional to $\gamma$, it can be written in the form

\begin{displaymath}
\alpha = m \gamma
\end{displaymath}

where the constant of proportionality $m$ is independent of the amount of coordinate system rotation.

The constant $m$ is the desired conserved quantity. For historical reasons it is called the magnetic quantum number. Unfortunately, long before quantum mechanics, classical physics had already figured out that something was preserved. It called that quantity the angular momentum $L_z$. It turns out that what classical physics defines as angular momentum is simply a multiple of the magnetic quantum number:

\begin{displaymath}
L_z = m \hbar
\end{displaymath}

So conservation of angular momentum is the same thing as conservation of magnetic quantum number.

But the magnetic quantum number is more fundamental. Its possible values are pure integers, unlike those of angular momentum. To see why, note that in terms of $m$, the eigenvalues of ${\cal R}_{z,\gamma}$ are of the form

\begin{displaymath}
e^{{\rm i}m \gamma}
\end{displaymath}

Now if you rotate the coordinate system over an angle $\gamma = 2\pi$, it gets back to the exact same position as it was in before the rotation. The wave function should not change in that case, which means that the eigenvalue must be equal to one. And that requires that the value of $m$ is an integer. If $m$ were a fractional number, $e^{{\rm i}m2\pi}$ would not be 1.
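The integer requirement is easy to try out. In the sketch below, the function names and the tolerance are choices made for this illustration; it simply evaluates the eigenvalue $e^{{\rm i}m2\pi}$ of a full turn for a few trial values of $m$.

```python
import cmath
import math

def full_turn_eigenvalue(m):
    """Eigenvalue e^{i m 2 pi} of a rotation over a full turn."""
    return cmath.exp(1j * m * 2 * math.pi)

def is_single_valued(m, tol=1e-9):
    """True if a full turn leaves the eigenfunction unchanged."""
    return abs(full_turn_eigenvalue(m) - 1) < tol

for m in (3, -2, 0, 0.5, 1 / 3):
    print(m, full_turn_eigenvalue(m), is_single_valued(m))
```

The tolerance is only there to absorb floating-point round-off in the complex exponential.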

It may be interesting to see how all this works out for the two examples mentioned in the previous subsection. The first example was the electron in a hydrogen atom where the proton is assumed to be at rest at the origin. Chapter 4.3 found the electron energy eigenfunctions in the form

\begin{displaymath}
\psi_{nlm}({\skew0\vec r}) = R_{nl}(r) Y_l^m(\theta,\phi)
= R_{nl}(r) \Theta_l^m(\theta) e^{{\rm i}m\phi}
\end{displaymath}

It is the final exponential that changes by the expected factor $e^{{{\rm i}}m\gamma}$ when ${\cal R}_{z,\gamma}$ replaces $\phi$ by $\phi+\gamma$.

The second example was the complete hydrogen atom in empty space. In addendum {A.5}, the energy eigenfunctions were found in the form

\begin{displaymath}
\psi_{nlm,\rm red}({\skew0\vec r}-{\skew0\vec r}_{\rm p}) \psi_{\rm cg}({\skew0\vec r}_{\rm cg})
\end{displaymath}

The first factor is like before, except that it is computed with a reduced mass that is slightly different from the true electron mass. The argument is now the difference in position between the electron and the proton. It still produces a factor $e^{{\rm i}m\gamma}$ when ${\cal R}_{z,\gamma}$ is applied. The second factor reflects the motion of the center of gravity of the complete atom. If the center of gravity has definite angular momentum around whatever point is used as origin, it will produce an additional factor $e^{{\rm i}m_{\rm cg}\gamma}$. (See addendum {A.6} on how the energy eigenfunctions $\psi_{\rm cg}$ can be written as spherical Bessel functions of the first kind times spherical harmonics that have definite angular momentum. But also see chapter 7.9 about the nasty normalization issues with wave functions in infinite empty space.)

As a final step, it is desirable to formulate a nicer operator for angular momentum. The rotation operators ${\cal R}_{z,\gamma}$ are far from perfect. One problem is that there are infinitely many of them, one for every angle $\gamma$. And they are all related, a rotation over an angle $2\gamma$ being the same as two rotations over an angle $\gamma$.

If you define a rotation operator over a very small angle, call it ${\cal R}_{z,\varepsilon}$, then you can approximate any other operator ${\cal R}_{z,\gamma}$ by just applying ${\cal R}_{z,\varepsilon}$ sufficiently many times. To make this approximation exact, you need to make $\varepsilon$ infinitesimally small. But when $\varepsilon$ becomes zero, ${\cal R}_{z,\varepsilon}$ would become just 1. You have lost the nicer operator that you want by going to the extreme. The trick to avoid this is to subtract the limiting operator 1, and in addition, to avoid that the resulting operator then becomes zero, you must also divide by $\varepsilon$. The nicer operator is therefore

\begin{displaymath}
\lim_{\varepsilon\to0}\frac{{\cal R}_{z,\varepsilon} - 1}{\varepsilon}
\end{displaymath}

Now consider what this operator really means for a single particle with no spin:

\begin{displaymath}
\lim_{\varepsilon\to0}\frac{{\cal R}_{z,\varepsilon} - 1}{\varepsilon}
\Psi(r,\theta,\phi)
=
\lim_{\varepsilon\to0}
\frac{\Psi(r,\theta,\phi+\varepsilon)-\Psi(r,\theta,\phi)}{\varepsilon}
\end{displaymath}

By definition, the final term is the partial derivative of $\Psi$ with respect to $\phi$. So the new operator is just the operator $\partial/\partial\phi$!

You can go one better still, because the eigenvalues of the operator just defined are

\begin{displaymath}
\lim_{\varepsilon\to0}\frac{e^{{\rm i}m\varepsilon} - 1}{\varepsilon}
= {\rm i}m
\end{displaymath}

If you add a factor $\hbar/{\rm i}$ to the operator, the eigenvalues of the operator are going to be $m\hbar$, the quantity defined in classical physics as the angular momentum. So you are led to define the angular momentum operator of a single particle as:

\begin{displaymath}
\L _z \equiv \frac{\hbar}{{\rm i}} \frac{\partial}{\partial\phi}
\end{displaymath}

This agrees perfectly with what chapter 4.2.2 got from guessing that the relationship between angular and linear momentum is the same in quantum mechanics as in classical mechanics.
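The limiting process can be watched numerically. The sketch below assumes units with $\hbar=1$ and an arbitrary trial value $m=3$; the function names are invented for the illustration. It forms the eigenvalue of the scaled difference $(\hbar/{\rm i})({\cal R}_{z,\varepsilon}-1)/\varepsilon$ for a sequence of shrinking $\varepsilon$ and records how far it is from $m\hbar$.

```python
import cmath

HBAR = 1.0   # natural units, assumed for this sketch

def rotation_eigenvalue(m, gamma):
    """Eigenvalue e^{i m gamma} of the rotation operator."""
    return cmath.exp(1j * m * gamma)

def scaled_difference(m, eps):
    """Eigenvalue of (hbar/i) (R_{z,eps} - 1)/eps for magnetic quantum number m."""
    return (HBAR / 1j) * (rotation_eigenvalue(m, eps) - 1) / eps

m = 3
steps = [10.0 ** (-k) for k in range(1, 7)]
errors = [abs(scaled_difference(m, eps) - m * HBAR) for eps in steps]

for eps, err in zip(steps, errors):
    print(f"eps = {eps:.0e}   distance from m*hbar = {err:.2e}")
```

The distance shrinks roughly in proportion to $\varepsilon$, as expected for a one-sided difference quotient.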

The angular momentum operator of a general system can be defined using the same scale factor:

\begin{displaymath}
\fbox{$\displaystyle
\L _z \equiv
\frac{\hbar}{{\rm i}}
\lim_{\varepsilon\to0}\frac{{\cal R}_{z,\varepsilon} - 1}{\varepsilon}
$} %
\end{displaymath} (A.76)

The system has definite angular momentum $m\hbar$ if

\begin{displaymath}
\L _z \Psi = m \hbar \Psi
\end{displaymath}

Consider now what happens if the angular momentum operator $\L _z$ as defined above is applied to the wave function of a system of multiple particles, still without spin. It produces

\begin{displaymath}
\L _z \Psi = \frac{\hbar}{{\rm i}}
\lim_{\varepsilon\to0}
\frac{\Psi(r_1,\theta_1,\phi_1+\varepsilon,r_2,\theta_2,\phi_2+\varepsilon,\ldots)
-\Psi(r_1,\theta_1,\phi_1,r_2,\theta_2,\phi_2,\ldots)}
{\varepsilon}
\end{displaymath}

The limit in the right hand side is a total derivative. According to calculus, it can be rewritten in terms of partial derivatives to give

\begin{displaymath}
\L _z \Psi = \frac{\hbar}{{\rm i}}
\left[
\frac{\partial}{\partial\phi_1} +
\frac{\partial}{\partial\phi_2} +
\ldots
\right] \Psi
\end{displaymath}

The scaled derivatives in the new right hand side are the orbital angular momenta of the individual particles as defined above, so

\begin{displaymath}
\L _z \Psi = \left[\L _{z,1} + \L _{z,2} + \ldots\right] \Psi
\end{displaymath}

It follows that the angular momenta of the individual particles just add, like they do in classical physics.

Of course, even if the complete system has definite angular momentum, the individual particles may not. A particle numbered $i$ has definite angular momentum $m_i\hbar$ if

\begin{displaymath}
\L _{z,i} \Psi \equiv \frac{\hbar}{{\rm i}} \frac{\partial}{\partial\phi_i} \Psi
= m_i \hbar \Psi
\end{displaymath}

If every particle has definite angular momentum like that, then these momenta directly add up to the total system angular momentum. At the other extreme, if both the system and the particles have uncertain angular momentum, then the expectation values of the angular momenta of the particles still add up to that of the system.
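The difference between system and particle angular momentum shows up already for two particles. The sketch below uses a made-up two-particle function of the azimuthal angles only, a sum of two terms with $(m_1,m_2)$ equal to $(2,-5)$ and $(-4,1)$. Both terms have the same total, $-3$, but different individual values. The operators are approximated by difference quotients with a small assumed step; units have $\hbar=1$.

```python
import cmath

HBAR = 1.0   # natural units, assumed for this sketch
EPS = 1e-6   # small rotation angle standing in for the limit

def psi(phi1, phi2):
    """Hypothetical two-particle state: two terms, each with m1 + m2 = -3."""
    return (cmath.exp(1j * (2 * phi1 - 5 * phi2))
            + cmath.exp(1j * (-4 * phi1 + phi2)))

def lz_total(phi1, phi2):
    """(hbar/i)(R_eps - 1)/eps: both azimuthal angles shift together."""
    shifted = psi(phi1 + EPS, phi2 + EPS)
    return (HBAR / 1j) * (shifted - psi(phi1, phi2)) / EPS

def lz_particle_1(phi1, phi2):
    """Same quotient, but only the angle of particle 1 is shifted."""
    shifted = psi(phi1 + EPS, phi2)
    return (HBAR / 1j) * (shifted - psi(phi1, phi2)) / EPS

points = [(0.3, 1.1), (0.9, 0.2)]
total_ratios = [lz_total(*p) / psi(*p) for p in points]
particle_ratios = [lz_particle_1(*p) / psi(*p) for p in points]
print("L_z psi / psi:    ", total_ratios)
print("L_{z,1} psi / psi:", particle_ratios)
```

The first ratio is the same number, close to $-3$, at both points, so the state is an eigenfunction of the total operator. The second ratio changes from point to point, so particle 1 by itself has no definite angular momentum in this state.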

Now that the angular momentum operator has been defined, the generator of rotations ${\cal R}_{z,\gamma}$ can be identified in terms of it. It turns out to be

\begin{displaymath}
\fbox{$\displaystyle
{\cal R}_{z,\gamma}=\exp\left(\frac{{\rm i}}{\hbar}\L _z\gamma\right)
$}
\end{displaymath} (A.77)

To check that it does indeed take the form above, expand the exponential in a Taylor series. Then apply it on an eigenfunction with angular momentum $L_z = m\hbar$. The effect is seen to be to multiply the eigenfunction by the Taylor series of $e^{{\rm i}m\gamma}$, as it should. So ${\cal R}_{z,\gamma}$ as given above gets all eigenfunctions right. It must therefore be correct since the eigenfunctions are complete.
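Written out, the check is short. As a sketch, call the eigenfunction $\psi_m$ (a label introduced only for this illustration) and assume nothing more than $\L _z\psi_m = m\hbar\psi_m$. Then

\begin{displaymath}
\exp\left(\frac{{\rm i}}{\hbar}\L _z\gamma\right)\psi_m
= \sum_{n=0}^\infty \frac{1}{n!}\left(\frac{{\rm i}\gamma}{\hbar}\right)^n \L _z^n \psi_m
= \sum_{n=0}^\infty \frac{({\rm i}m\gamma)^n}{n!} \psi_m
= e^{{\rm i}m\gamma} \psi_m
\end{displaymath}

Each application of $\L _z$ pulls out one factor $m\hbar$, so the powers of $\hbar$ cancel term by term.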

Now consider the generator of rotations in terms of the individual particles. Since $\L _z$ is the sum of the angular momenta of the individual particles,

\begin{displaymath}
{\cal R}_{z,\gamma} =
\exp\left(\frac{{\rm i}}{\hbar}\L _{z,1}\gamma\right)
\exp\left(\frac{{\rm i}}{\hbar}\L _{z,2}\gamma\right)
\ldots
\end{displaymath}

So, while the contributions of the individual particles to total angular momentum add together, their contributions to the generator of rotations multiply together. In particular, if a particle $i$ has definite angular momentum $m_i\hbar$, then it contributes a factor $e^{{{\rm i}}m_i\gamma}$ to ${\cal R}_{z,\gamma}$.

How about spin? The normal angular momentum discussed so far suggests its true meaning. If a particle $i$ has definite spin angular momentum in the $z$-​direction $m_{s,i}\hbar$, then presumably the wave function changes by an additional factor $e^{{{\rm i}}m_{s,i}\gamma}$ when you rotate the axis system over an angle $\gamma$.

But there is something curious here. If the axis system is rotated over an angle $2\pi$, it is back in its original position. So you would expect that the wave function is also again the same as before the rotation. And if there is just orbital angular momentum, then that is indeed the case, because $e^{{\rm i}m2\pi} = 1$ as long as $m$ is an integer, (2.5). But for fermions the spin angular momentum $m_s$ in a given direction is half-integer, and $e^{{\rm i}\pi} = -1$. Therefore the wave function of a fermion changes sign when the coordinate system is rotated over $2\pi$ and is back in its original position. That is true even if there is uncertainty in the spin angular momentum. For example, the wave function of a fermion with spin $\frac12$ can be written as, chapter 5.5.1,

\begin{displaymath}
\Psi_+{\uparrow}+\Psi_-{\downarrow}
\end{displaymath}

where the first term has $\frac12\hbar$ angular momentum in the $z$-​direction and the second term $-\frac12\hbar$. Each term changes sign under a turn of the coordinate system by $2\pi$. So the complete wave function changes sign. More generally, for a system with an odd number of fermions the wave function changes sign when the coordinate system is rotated over $2\pi$. For a system with an even number of fermions, the wave function returns to the original value.
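The sign change is simple to reproduce. The sketch below takes the presumed rule of this subsection at face value: a component with spin $m_s\hbar$ in the $z$-direction gets a factor $e^{{\rm i}m_s\gamma}$. The spatial dependence of $\Psi_+$ and $\Psi_-$ is left out, so each is represented by a single complex number; the values and the function name are assumptions for the illustration.

```python
import cmath
import math

def rotate_spin_half(psi_up, psi_down, gamma):
    """Multiply the m_s = +1/2 and m_s = -1/2 components by e^{i m_s gamma}."""
    return (psi_up * cmath.exp(0.5j * gamma),
            psi_down * cmath.exp(-0.5j * gamma))

state = (0.6 + 0.0j, 0.0 + 0.8j)     # hypothetical (Psi_+, Psi_-) values

after_one_turn = rotate_spin_half(*state, 2 * math.pi)
after_two_turns = rotate_spin_half(*state, 4 * math.pi)
print("original:   ", state)
print("after 2 pi: ", after_one_turn)
print("after 4 pi: ", after_two_turns)
```

Both components flip sign together after one full turn, so the uncertainty in the spin direction does not prevent the overall sign change. A second full turn restores the original values.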

Now the sign of the wave function does not make a difference for the observed physics. But it is still somewhat unsettling to see that on the level of the wave function, nature is only the same when the coordinate system is rotated over $4\pi$ instead of $2\pi$. (However, it may be only a mathematical artifact. The antisymmetrization requirement implies that the true system includes all electrons in the universe. Presumably, the number of fermions in the universe is infinite. That makes the question whether the number is odd or even unanswerable. If the number of fermions does turn out to be finite, this book will reconsider the question when people finish counting.)

(Some books now raise the question why the orbital angular momentum functions could not do the same thing. Why could the quantum number of orbital angular momentum not be half-integer too? But of course, it is easy to see why not. If the spatial wave function would be multiple valued, then the momentum operators would produce infinite momentum. You would have to postulate arbitrarily that the derivatives of the wave function at a point only involve wave function values of a single branch. Half-integer spin does not have the same problem; for a given orientation of the coordinate system, the opposite wave function is not accessible by merely changing position.)


A.19.4 Other symmetries

The previous subsections derived conservation of angular momentum from the symmetry of physics with respect to rotations. Similar arguments may be used to derive other conservation laws. This subsection briefly outlines how.

Conservation of linear momentum can be derived from the symmetry of physics with respect to translations. The derivation is completely analogous to the angular momentum case. The translation operator ${\cal T}_{z,d}$ shifts the coordinate system over a distance $d$ in the $z$-​direction. Its eigenvalues are of the form

\begin{displaymath}
e^{{\rm i}k_z d}
\end{displaymath}

where $k_z$ is a real number, independent of the amount of translation $d$, that is called the wave number. Following the same arguments as for angular momentum, $k_z$ is a conserved quantity. In classical physics not $k_z$, but $p_z = \hbar k_z$ is defined as the conserved quantity. To get the operator for this quantity, form the operator
\begin{displaymath}
\fbox{$\displaystyle
{\widehat p}_z = \frac{\hbar}{{\rm i}}
\lim_{\varepsilon\to0}\frac{{\cal T}_{z,\varepsilon} - 1}{\varepsilon}
$} %
\end{displaymath} (A.78)

For a single particle, this becomes the usual linear momentum operator $\hbar\partial/{\rm i}\partial z$. For multiple particles, the linear momenta add up.

It may again be interesting to see how that works out for the two example systems introduced earlier. The first example was the electron in a hydrogen atom. In that example it is assumed that the proton is fixed at the origin. The energy eigenfunctions for the electron then were of the form

\begin{displaymath}
\psi_{nlm}({\skew0\vec r})
\end{displaymath}

with ${\skew0\vec r}$ the position of the electron. Shifting the coordinate system for this solution means replacing ${\skew0\vec r}$ by ${\skew0\vec r}+d{\hat k}$. That shifts the position of the electron without changing the position of the proton. The physics is not the same after such a shift. Correspondingly, the eigenfunctions do not change by a factor of the form $e^{{{\rm i}}k_zd}$ under the shift. Just looking at the ground state,

\begin{displaymath}
\psi_{100}({\skew0\vec r}) = \frac{1}{\sqrt{\pi a_0^3}} e^{-\vert{\skew0\vec r}\vert/a_0}
\end{displaymath}

is enough to see that. An electron around a stationary proton does not have definite linear momentum. In other words, the linear momentum of the electron is not conserved.
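
The claim can also be checked numerically. The following is a minimal sketch, not part of the original derivation: it evaluates the ratio of the shifted to the unshifted ground state at a few points. The units ($a_0$ set to 1), the sample points, and the function names `psi_100` and `shift_ratio` are assumptions made only for illustration.

```python
import math

A0 = 1.0  # Bohr radius; set to 1 here purely as an illustrative choice of units

def psi_100(x, y, z):
    """Hydrogen ground state with the proton fixed at the origin."""
    r = math.sqrt(x * x + y * y + z * z)
    return math.exp(-r / A0) / math.sqrt(math.pi * A0**3)

def shift_ratio(point, d):
    """Ratio psi(r + d k) / psi(r) at a given point (x, y, z)."""
    x, y, z = point
    return psi_100(x, y, z + d) / psi_100(x, y, z)

# Hypothetical sample points and shift distance, chosen only for illustration.
points = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 0.0, -2.0)]
d = 0.5
ratios = [shift_ratio(p, d) for p in points]

# For an eigenfunction of the shift operator, every ratio would equal the same
# number e^{i k_z d} of magnitude 1. Here the ratios differ from point to point.
for p, ratio in zip(points, ratios):
    print(p, ratio)
```

Since the ratio depends on position, no single factor $e^{{\rm i}k_zd}$ can describe the effect of the shift on this state.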

However, the physics of the complete hydrogen atom as described in addendum {A.5} is independent of coordinate shifts. A suitable choice of energy eigenfunctions in this context is

\begin{displaymath}
\psi_{nlm,\rm red}({\skew0\vec r}-{\skew0\vec r}_{\rm p}) e^{{\rm i}{\vec k}\cdot{\skew0\vec r}_{\rm cg}}
\end{displaymath}

where ${\vec k}$ is a constant wave number vector. The first factor does not change under coordinate shifts because the vector ${\skew0\vec r}-{\skew0\vec r}_{\rm {p}}$ from proton to electron does not. The exponential changes by the expected factor $e^{{{\rm i}}k_zd}$ because the position ${\skew0\vec r}_{\rm {cg}}$ of the center of gravity of the atom changes by an amount $d$ in the $z$-direction.

The derivation of linear momentum can be extended to conduction electrons in crystalline solids. In that case, the physics of the conduction electrons is unchanged if the coordinate system is translated over a crystal period $d$. (This assumes that the $z$-axis is chosen along one of the primitive vectors of the crystal structure.) The eigenvalues are still of the form $e^{{{\rm i}}k_zd}$. However, unlike for linear momentum, the translation $d$ must be the crystal period, or an integer multiple of it. Therefore, the operator ${\widehat p}_z$ is not useful; the symmetry does not continue to apply in the limit $d\to0$.

The conserved quantity in this case is just the $e^{{{\rm i}}k_zd}$ eigenvalue of ${\cal T}_{z,d}$. It is not possible from that eigenvalue to uniquely determine a value of $k_z$ and the corresponding crystal momentum ${\hbar}k_z$. Values of $k_z$ that differ by a whole multiple of $2\pi/d$ produce the same eigenvalue. But Bloch waves have the same indeterminacy in their value of $k_z$ anyway. In fact, Bloch waves are eigenfunctions of ${\cal T}_{z,d}$ as well as energy eigenfunctions.

One consequence of the indeterminacy in $k_z$ is an increased number of possible electromagnetic transitions. Typical electromagnetic radiation has a wave length that is large compared to the atomic spacing. Essentially the electromagnetic field is the same from one atom to the next. That means that it has negligible crystal momentum, using the smallest of the possible values of $k_z$ as measure. Therefore the radiation cannot change the conserved eigenvalue $e^{{{\rm i}}k_zd}$. But it can still produce electron transitions between two Bloch waves that have been assigned different $k_z$ values in some extended zone scheme, chapter 6.22.4. As long as the two $k_z$ values differ by a whole multiple of $2\pi/d$, the actual eigenvalue $e^{{{\rm i}}k_zd}$ does not change. In that case there is no violation of the conservation law in the transition. The ambiguity in $k_z$ values may be eliminated by switching to a reduced zone scheme description, chapter 6.22.4.
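
The indeterminacy in $k_z$ is easy to illustrate numerically. The sketch below is only an illustration under stated assumptions: the function names `translation_eigenvalue` and `reduce_to_first_zone`, the numbers used, and the choice of the interval from $-\pi/d$ up to $\pi/d$ as the reduced range of $k_z$ are not taken from the text above.

```python
import cmath
import math

def translation_eigenvalue(k_z, d):
    """Eigenvalue e^{i k_z d} of the shift over one crystal period d."""
    return cmath.exp(1j * k_z * d)

def reduce_to_first_zone(k_z, d):
    """Map k_z to the equivalent value in the interval [-pi/d, pi/d)."""
    period = 2.0 * math.pi / d
    return (k_z + period / 2.0) % period - period / 2.0

# Hypothetical numbers, for illustration only.
d = 2.0
k_a = 0.3
k_b = k_a + 3 * (2.0 * math.pi / d)   # differs by a whole multiple of 2 pi / d

eig_a = translation_eigenvalue(k_a, d)
eig_b = translation_eigenvalue(k_b, d)
print(eig_a, eig_b)                    # the two eigenvalues agree
print(reduce_to_first_zone(k_b, d))    # k_b mapped back to the reduced range
```

The two wave numbers produce the same conserved eigenvalue to within rounding, so a transition between them does not violate the conservation law.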

The time shift operator ${\cal U}_\tau$ shifts the time coordinate over an interval $\tau$. In empty space, its eigenfunctions are exactly the energy eigenfunctions. Its eigenvalues are of the form

\begin{displaymath}
e^{-{\rm i}\omega\tau}
\end{displaymath}

where classical physics defines $\hbar\omega$ as the energy $E$. The energy operator can be defined correspondingly, and is simply the Hamiltonian:
\begin{displaymath}
H = {\rm i}\hbar \lim_{\varepsilon\to0}\frac{{\cal U}_\varepsilon - 1}{\varepsilon}
= {\rm i}\hbar \frac{\partial}{\partial t}
\end{displaymath} (A.79)

In other words, we have reasoned in a circle and rederived the Schrödinger equation from time shift symmetry. But you could generalize the reasoning to the motion of particles in an external field that varies periodically in time.
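
As a quick check of the eigenvalue form, assume that ${\cal U}_\tau$ acts by replacing $t$ by $t+\tau$, by analogy with the coordinate shifts (an assumption about the convention). For a state of definite energy $E$, whose time dependence is $e^{-{\rm i}Et/\hbar}$,

\begin{displaymath}
{\cal U}_\tau \left(\psi e^{-{\rm i}Et/\hbar}\right) = \psi e^{-{\rm i}E(t+\tau)/\hbar} = e^{-{\rm i}E\tau/\hbar} \left(\psi e^{-{\rm i}Et/\hbar}\right)
\end{displaymath}

so that $\omega = E/\hbar$, and applying ${\rm i}\hbar\partial/\partial t$ to this state indeed returns $E$ times the state.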

Usually, nature is not just symmetric under rotating or translating it, but also under mirroring it. A transformation that creates a mirror image of a given system is called a parity transformation. The mathematically cleanest way to do it is to invert the direction of each of the three Cartesian axes. That is called spatial inversion. Physically it is equivalent to mirroring the system using some mirror passing through the origin, and then rotating the system 180$^\circ$ around the axis normal to the mirror.

(In a strictly two-dimensional system, spatial inversion does not work, since the rotation would take the system into the third dimension. In that case, mirroring can be achieved by replacing just $x$ by $-x$ in some suitably chosen $xy$-coordinate system. Subsequently replacing $y$ by $-y$ would amount to a second mirroring that would restore a nonmirror image. In those terms, in three dimensions it is replacing $z$ by $-z$ that produces the final mirror image in spatial inversion.)

The analysis of the conservation law corresponding to spatial inversion proceeds much like the one for angular momentum. One difference is that applying the spatial inversion operator a second time turns $-{\skew0\vec r}$ back into the original ${\skew0\vec r}$. Then the wave function is again the same. In other words, applying spatial inversion twice multiplies wave functions by 1. It follows that the square of every eigenvalue is 1. And if the square of an eigenvalue is 1, then the eigenvalue itself must be either 1 or $-1$. In the same notation as used for angular momentum, the eigenvalues of the spatial inversion operator can therefore be written as

\begin{displaymath}
e^{{\rm i}m'\pi} = (-1)^{m'}
\end{displaymath} (A.80)

where $m'$ must be an integer. However, it is pointless to give an actual value for $m'$; the only thing that makes a difference for the eigenvalue is whether $m'$ is even or odd. Therefore, parity is simply called “odd” or “minus one” or “negative” if the eigenvalue is $-1$, and “even” or “one” or “positive” if the eigenvalue is 1.

In a system, the $\pm1$ parity eigenvalues of the individual particles multiply together. That is just like how the eigenvalues of the generator of rotation ${\cal R}_{z,\gamma}$ multiply together for angular momentum. Any particle with even parity has no effect on the system parity; it multiplies the total eigenvalue by 1. On the other hand, each particle with odd parity flips over the total parity from odd to even or vice-versa; it multiplies the total eigenvalue by $-1$. Particles can also have intrinsic parity. However, there is no half-integer parity like there is half-integer spin.
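
The multiplication rule can be written down in a few lines. This is a minimal sketch under the assumption that each particle has already been assigned a single combined parity eigenvalue of $+1$ or $-1$; the function name `system_parity` is hypothetical.

```python
from functools import reduce
from operator import mul

def system_parity(parities):
    """Combined parity eigenvalue of particles with eigenvalues +1 or -1."""
    for p in parities:
        if p not in (1, -1):
            raise ValueError("each parity eigenvalue must be +1 or -1")
    return reduce(mul, parities, 1)

print(system_parity([1, 1, -1]))    # one odd particle: odd total
print(system_parity([-1, 1, -1]))   # two odd particles: even total
```

Only the number of odd-parity particles matters: an even number of them gives even total parity, an odd number gives odd total parity.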


A.19.5 A gauge symmetry and conservation of charge

Modern quantum theories are built upon so-called “gauge symmetries.” This subsection gives a simple introduction to some of the ideas.

Consider classical electrostatics. The force on charged particles is the product of the charge of the particle times the so-called electric field $\skew3\vec{\cal E}$. Basic physics says that the electric field is minus the derivative of a potential $\varphi$. The potential $\varphi$ is commonly known as the voltage in electrical applications. Now it too has a symmetry: adding some arbitrary constant, call it $C$, to $\varphi$ does not make a difference. Only differences in voltage can be observed physically. That is a very simple example of a gauge symmetry, a symmetry in an unobservable field, here the potential $\varphi$.

Note that this symmetry does not involve the gauges used to measure voltages in any way. Instead it is a reference point symmetry; it does not make a difference what voltage you want to declare to be zero. It is conventional to take the earth as the reference voltage, but that is a completely arbitrary choice. So the term “gauge symmetry” is misleading, like many other terms in physics. A symmetry in an unobservable quantity should of course simply have been called an unobservable symmetry.

There is a relationship between this gauge symmetry in $\varphi$ and charge conservation. Suppose that, say, a few photons create an electron and an antineutrino. That can satisfy conservation of angular momentum and of lepton number, but it would violate charge conservation. Photons have no charge, and neither have neutrinos. So the negative charge $-e$ of the electron would appear out of nothing. But so what? Photons can create electron-positron pairs, so why not electron-antineutrino pairs?

The problem is that in electrostatics an electron has an electrostatic energy $-e\varphi$. Therefore the photons would need to provide not just the rest mass and kinetic energy for the electron and antineutrino, but also an additional electrostatic energy $-e\varphi$. That additional energy could be determined from comparing the energy of the photons against that of the electron-antineutrino pair. And that would mean that the value of $\varphi$ at the point of pair creation has been determined. Not just a difference in $\varphi$ values between different points. And that would mean that the value of the constant $C$ would be fixed. So nature would not really have the gauge symmetry that a constant in the potential is arbitrary.

Conversely, if the gauge symmetry of the potential is fundamental to nature, creation of lone charges must be impossible. Each negatively charged electron that is created must be accompanied by a positively charged particle so that the net charge that is created is zero. In electron-positron pair creation, the positive charge $+e$ of the positron makes the net charge that is created zero. Similarly, in beta decay, an uncharged neutron creates an electron-antineutrino pair with charge $-e$, but also a proton with charge $+e$.

You might of course wonder whether an electrostatic energy contribution $-e\varphi$ is really needed to create an electron. It is because of energy conservation. Otherwise there would be a problem if an electron-antineutrino pair was created at a location P and disintegrated again at a different location Q. The electron would pick up a kinetic energy $-e(\varphi_{\rm {P}}-\varphi_{\rm {Q}})$ while traveling from P to Q. Without electrostatic contributions to the electron creation and annihilation energies, that kinetic energy would make the photons produced by the pair annihilation more energetic than those destroyed in the pair creation. So the complete process would create additional photon energy out of nothing.

The gauge symmetry takes on a much more profound meaning in quantum mechanics. One reason is that the Hamiltonian is based on the potential, not on the electric field itself. To appreciate the full impact, consider electrodynamics instead of just electrostatics. In electrodynamics, a charged particle does not just experience an electric field $\skew3\vec{\cal E}$ but also a magnetic field $\skew2\vec{\cal B}$. Correspondingly, there is a so-called vector potential $\skew3\vec A$ in addition to the scalar potential $\varphi$. The relation between these potentials and the electric and magnetic fields is given by, chapter 13.1:

\begin{displaymath}
\skew3\vec{\cal E}= -\nabla \varphi - \frac{\partial \skew3\vec A}{\partial t}
\qquad
\skew2\vec{\cal B}= \nabla \times \skew3\vec A
\end{displaymath}

Here $\nabla$, nabla, is the differential operator of vector calculus (calculus III in the U.S. system):

\begin{displaymath}
\nabla \equiv
{\hat\imath}\frac{\partial}{\partial x} +
{\hat\jmath}\frac{\partial}{\partial y} +
{\hat k}\frac{\partial}{\partial z}
\end{displaymath}

The gauge property now becomes more general. The constant $C$ that can be added to $\varphi$ in electrostatics no longer needs to be constant. Instead, it can be taken to be the time-derivative of any arbitrary function $\chi(x,y,z,t)$. However, the gradient of this function must also be subtracted from $\skew3\vec A$. In particular, the potentials

\begin{displaymath}
\varphi' = \varphi + \frac{\partial \chi}{\partial t}
\qquad
\skew3\vec A' = \skew3\vec A- \nabla \chi
\end{displaymath}

produce the exact same electric and magnetic fields as $\varphi$ and $\skew3\vec A$. So they are physically equivalent. They produce the same observable motion.
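
That can be checked directly, as a short sketch using only the relations above. Writing $\skew3\vec{\cal E}'$ and $\skew2\vec{\cal B}'$ for the fields computed from the primed potentials (a notation introduced here only for this check),

\begin{displaymath}
\skew3\vec{\cal E}' = -\nabla\varphi - \nabla\frac{\partial\chi}{\partial t} - \frac{\partial\skew3\vec A}{\partial t} + \frac{\partial\nabla\chi}{\partial t} = \skew3\vec{\cal E}
\qquad
\skew2\vec{\cal B}' = \nabla\times\skew3\vec A - \nabla\times\nabla\chi = \skew2\vec{\cal B}
\end{displaymath}

The two $\chi$ terms in the electric field cancel because the order of the time and space derivatives can be interchanged, and the curl of any gradient is zero.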

However, the wave function computed using the potentials $\varphi'$ and $\skew3\vec A'$ is different from the one computed using $\varphi$ and $\skew3\vec A$. The reason is that the Hamiltonian uses the potentials rather than the electric and magnetic fields. Ignoring spin, the Hamiltonian of an electron in an electromagnetic field is, chapter 13.1:

\begin{displaymath}
H =
\frac{1}{2m_{\rm e}}\left(\frac{\hbar}{{\rm i}}\nabla + e \skew3\vec A\right)^2 - e \varphi
\end{displaymath}

It can be seen by crunching it out that if $\Psi$ satisfies the Schrödinger equation in which the Hamiltonian is formed with $\varphi$ and $\skew3\vec A$, then
\begin{displaymath}
\Psi' = e^{{\rm i}e \chi/\hbar} \Psi
\end{displaymath} (A.81)

satisfies the one in which $H$ is formed with $\varphi'$ and $\skew3\vec A'$.
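
The crunching can be sketched as follows, assuming the potentials transform as given above. The product rule shows that the phase factor simply moves through the momentum-like operator,

\begin{displaymath}
\left(\frac{\hbar}{{\rm i}}\nabla + e\skew3\vec A'\right)\Psi' = e^{{\rm i}e\chi/\hbar}\left(\frac{\hbar}{{\rm i}}\nabla + e\nabla\chi + e\skew3\vec A - e\nabla\chi\right)\Psi = e^{{\rm i}e\chi/\hbar}\left(\frac{\hbar}{{\rm i}}\nabla + e\skew3\vec A\right)\Psi
\end{displaymath}

Since this holds for any function in place of $\Psi$, applying it twice takes care of the squared term. The remaining terms give

\begin{displaymath}
{\rm i}\hbar\frac{\partial\Psi'}{\partial t} + e\varphi'\Psi' = e^{{\rm i}e\chi/\hbar}\left({\rm i}\hbar\frac{\partial\Psi}{\partial t} - e\frac{\partial\chi}{\partial t}\Psi + e\varphi\Psi + e\frac{\partial\chi}{\partial t}\Psi\right) = e^{{\rm i}e\chi/\hbar}\left({\rm i}\hbar\frac{\partial\Psi}{\partial t} + e\varphi\Psi\right)
\end{displaymath}

So both sides of the Schrödinger equation for $\Psi'$ are just $e^{{\rm i}e\chi/\hbar}$ times those for $\Psi$.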

To understand what a stunning result that is, recall the physical interpretation of the wave function. According to Born, the square magnitude of the wave function $\vert\Psi\vert^2$ determines the probability per unit volume of finding the electron at a given location. But the wave function is a complex number; it can always be written in the form

\begin{displaymath}
\Psi = e^{{\rm i}\alpha}\vert\Psi\vert
\end{displaymath}

where $\alpha$ is a real quantity corresponding to a phase angle. This angle is not directly observable; it drops out of the magnitude of the wave function. And the gauge property above shows that not only is $\alpha$ not observable, it can be anything. After all, the function $\chi$ can change $\alpha$ by a completely arbitrary amount $e\chi/\hbar$ and the wave function remains a solution of the Schrödinger equation. The only variables that change are the equally unobservable potentials $\varphi$ and $\skew3\vec A$.

As noted earlier, a symmetry means that you can do something and it does not make a difference. Since $\alpha$ can be chosen completely arbitrarily, varying with both location and time, this is a very strong symmetry. Zee writes, (Quantum Field Theory in a Nutshell, 2003, p. 135): "The modern philosophy is to look at [the equations of quantum electrodynamics] as a result of [the gauge symmetry above]. If we want to construct a gauge-invariant relativistic field theory involving a spin $\frac12$ and a spin 1 field, then we are forced to quantum electrodynamics."

Geometrically, a complex number like the wave function can be shown in a two-dimensional complex plane in which the real and imaginary parts of the number form the axes. Multiplying the number by a factor $e^{{{\rm i}}e\chi/\hbar}$ corresponds to rotating it over an angle $e\chi/\hbar$ around the origin in that plane. In those terms, the wave function can be rotated over an arbitrary, varying, angle in the complex plane and it still satisfies the Schrödinger equation.
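
The geometric picture can be illustrated with a single complex number. The sketch below is only an illustration: the value of the wave function, the angle, and the function name `gauge_rotate` are hypothetical choices, with the angle standing in for $e\chi/\hbar$ at one point and time.

```python
import cmath

def gauge_rotate(psi, angle):
    """Multiply a complex wave function value by the phase factor e^{i angle}."""
    return cmath.exp(1j * angle) * psi

# Hypothetical wave function value and rotation angle e*chi/hbar at one point.
psi = 3.0 + 4.0j
angle = 0.7
psi_rotated = gauge_rotate(psi, angle)

print(abs(psi), abs(psi_rotated))                    # magnitudes agree
print(cmath.phase(psi_rotated) - cmath.phase(psi))   # change in phase angle
```

The magnitude, and with it the probability density, is unaffected; only the unobservable phase angle changes.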

For a relatively accessible derivation of how the gauge invariance produces quantum electrodynamics, see [24, pp. 358ff]. To make some sense out of it, chapter 1.2.5 gives a brief introduction to relativistic index notation, chapter 12.12 to the Dirac equation and its matrices, addendum {A.1} to Lagrangians, and {A.21} to photon wave functions. The $F^{\mu\nu}$ are derivatives of this wave function, [24, p. 239].


A.19.6 Reservations about time shift symmetry

It is not quite obvious that the evolution of a physical system in empty space is the same regardless of the time that it is started. It is certainly not as obvious as the assumption that changes in spatial position do not make a difference. Cosmology does not show any evidence for a fundamental difference between different locations in space. For each spatial location, others just like it seem to exist elsewhere. But different cosmological times do show a major physical distinction. They differ in how much later they are than the time of the creation of the universe as we know it. The universe is expanding. Spatial distances between galaxies are increasing. It is believed with quite a lot of confidence that the universe started out extremely concentrated and hot at a “Big Bang” about 13.8 billion years ago.

Consider the cosmic background radiation. It has cooled down greatly since the universe became transparent to it. The expansion stretched the wave length of the photons of the radiation. That made them less energetic. You can look upon that as a violation of energy conservation due to the expansion of the universe.

Alternatively, you could explain the discrepancy away by assuming that the missing energy goes into potential energy of expansion of the universe. However, whether this potential energy is anything better than a different name for “energy that got lost” is another question. Potential energy is normally energy that is lost but can be fully recovered. The potential energy of expansion of the universe cannot be recovered. At least not on a global scale. You cannot stop the expansion of the universe.

And a lack of exact energy conservation may not be such a bad thing for physical theories. Failure of energy conservation in the early universe could provide a possible way of explaining how the universe got all that energy in the first place.

In any case, for practical purposes nontrivial effects of time shifts seem to be negligible in the current universe. When astronomy looks at far-away clusters of galaxies, it sees them as they were billions of years ago. That is because the light that they emit takes billions of years to reach us. And while these galaxies look different from the current ones nearby, there is no evident difference in their basic laws of physics. Also, gravity is an extremely small effect in most other physics. And normal time variations are negligible compared to the age of the universe. Despite the Big Bang, conservation of energy remains one of the pillars on which physics is built.