1.2: Early Quantum Mechanics
This is just a quick review of the experimental basis for quantum mechanics, and some of the early formulations. You don’t need to know the historical facts, of course, but some of the physics arguments are worth recalling, for example Bohr’s derivation of the Rydberg constant from his model atom.
Why Do We Need Quantum Mechanics?
Just over 100 years ago, in the 1890’s, physics looked in pretty good shape. The beautiful mathematical development of Newton’s mechanics, coupled with increasingly sophisticated technology, predicted the movements of the solar system to incredible accuracy, apart from a tiny discrepancy in the orbit of Mercury. It had been less than a hundred years since it was realized that an electric current could exert a force on a magnet, but that discovery had led to power stations, electric trains and a network of telegraph wires across land and under the oceans. It had also been only a hundred years since it had been established that light was a wave, and only forty years since Maxwell’s realization that the waves in a light signal were electric and magnetic fields, satisfying a wave equation he was able to derive purely by considering electric and magnetic field phenomena. In particular, he was able to predict the speed of light by measuring the electrostatic attractive forces between charges and the magnetic forces between currents.
At about the same time, in the 1860’s, Maxwell and Boltzmann gave a brilliant account of the properties of gases by assuming that they were made up of weakly interacting molecules flying about in a container, bouncing off the sides, with a statistical distribution of energies so that the probability of a molecule having energy \(E\) was proportional to \( e^{-E/kT} \), \(k\) being a universal constant known as Boltzmann’s constant. Boltzmann generalized this result from a box of gas to any system. For example, a solid can be envisioned classically as a lattice of balls (the atoms) connected by springs, which can sustain oscillations in many different ways. With reasonable approximations concerning the properties of the springs, each such mode can be thought of as a simple harmonic oscillator. Boltzmann’s work leads to the conclusion that each such mode of oscillation, or degree of freedom, would at temperature \(T\) have average energy \(kT\), made up of \( \frac{1}{2} kT\) potential energy and \( \frac{1}{2} kT\) kinetic energy. Notice that this average energy is independent of the strength of the springs, or the masses! All modes of vibration, which will vibrate at very different rates, contain the same energy at the same temperature. This equal sharing is called the Equipartition of Energy. It is not difficult to check this for a one-dimensional classical harmonic oscillator, averaging the energy by integrating over all displacements and momenta (independently) with the weighting factor \( e^{-E/kT} \) (which of course needs to be normalized). The result doesn’t depend on the spring constant or the mass. Boltzmann’s result gave an excellent account of the specific heats of a wide range of materials over a wide temperature range, but there were some exceptions, for example hydrogen gas at low temperatures, and even solids at low enough temperatures.
Still, it was generally felt these problems could be handled within the existing framework, just as the slightly odd behavior of Mercury was likely caused by a small planet, named Vulcan, closer to the sun, and so very hard to observe.
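The equipartition check for the one-dimensional oscillator described above can be sketched numerically. This is only an illustration, not part of the original argument: the function names, the midpoint-rule integration, and the sample masses and spring constants are all assumptions chosen for the example, and the units are arbitrary.

```python
import math

def mean_quadratic_energy(coeff, kT, n_steps=20000, n_sigma=10.0):
    """Boltzmann average of an energy term E = coeff * q**2 over the coordinate q.

    Uses a plain midpoint rule over +/- n_sigma widths of the weight exp(-E/kT).
    """
    sigma = math.sqrt(kT / (2.0 * coeff))   # width of exp(-coeff * q**2 / kT)
    q_max = n_sigma * sigma
    dq = 2.0 * q_max / n_steps
    weighted_sum = 0.0
    norm = 0.0
    for i in range(n_steps):
        q = -q_max + (i + 0.5) * dq
        energy = coeff * q * q
        weight = math.exp(-energy / kT)
        weighted_sum += energy * weight
        norm += weight                       # normalization of the weighting factor
    return weighted_sum / norm

def mean_oscillator_energy(mass, spring_constant, kT):
    """Average of p**2/(2m) + (1/2) * spring_constant * x**2, integrating p and x independently."""
    kinetic = mean_quadratic_energy(1.0 / (2.0 * mass), kT)
    potential = mean_quadratic_energy(0.5 * spring_constant, kT)
    return kinetic + potential

# Illustrative (made-up) masses and spring constants, arbitrary units.
for mass, spring_constant in ((1.0, 1.0), (3.7, 0.02), (0.01, 500.0)):
    print(mass, spring_constant, mean_oscillator_energy(mass, spring_constant, kT=2.0))
```

Each printed average should come out very close to the chosen \(kT\), whatever mass and spring constant are used, which is the content of the equipartition statement.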
Blackbody Radiation
But there was one problem that was hard to get a grip on, an apparently blatant violation of the equipartition of energy. Consider an oven with a small hole in the door, through which the radiation inside is observed. This oven can be heated until it’s white hot. The radiation inside is infrared at low temperatures, becoming visible light as the temperature increases. So, the oven’s full of electromagnetic waves, satisfying Maxwell’s wave equation, with boundary conditions at the walls of the oven: the electric field has to be essentially zero there, because the walls conduct currents. Of course, the radiation originates in oscillating charges in the walls, using the same analysis of Maxwell’s equations that gives the radiation from an antenna. Anyway, there is a set of standing wave modes of electromagnetic vibrations inside the oven, just a three-dimensional version of the series of allowed standing wave modes of vibration of a string fixed at both ends. So, we should be able to find the energy density of these waves using the same ideas that worked pretty well for the specific heats of solids and gases, that is to say, assume there’s \(kT\) of energy in each mode of vibration. (This is \( \frac{1}{2} kT \) of kinetic energy, \( \frac{1}{2} kT \) of potential energy for each independent direction of vibration.)
But—this leads to disaster. The problem is that there are infinitely many modes of vibration of the electromagnetic field in an oven. There is no upper limit to the number of wiggles the wave can have between the walls. So, if we take \(kT\) in each mode, we deduce that the oven contains an infinite amount of energy, and radiates an infinite amount through our small hole. Furthermore, this analysis gives no clue as to why the color we see changes with temperature. Evidently, equipartition of energy isn’t working in this case. There’s only a finite amount of energy in the oven—and at low temperatures there’s no energy at all in the modes corresponding to visible light, although that changes as things get hotter.
In the 1890’s, German experimentalists measured the energy density as a function of wavelength to great precision; the resulting curve is called the blackbody radiation spectrum. A theorist, Planck, found a mathematical formula that fitted this curve exactly,
\[ R_T(\nu)\, d\nu = \dfrac{8\pi V h \nu^3}{c^3} \cdot \dfrac{d\nu}{e^{h\nu/kT}-1} \tag{1.2.1}\]
He did not at first have any theoretical justification for this formula, but it was a very accurate fit to some very precise experiments for a suitable value of the constant \(h\), which we discuss in a moment.
Factoring out the number of modes of oscillation in the frequency range \(d\nu\), Planck’s formula gives the average energy per mode to be
\[ \dfrac{h \nu}{e^{h\nu/kT}-1} \tag{1.2.2}\]
For low frequencies, \(h \nu \ll kT\), this correctly gives \(kT\) per mode.
But, for higher frequencies it’s clear that the oscillators are not getting their “fair share” \(kT\) of energy. Somehow, the oscillating charges in the walls are not radiating so much energy at the high frequencies. The only way Planck could derive the formula theoretically was by making a weird assumption: he assumed that the oscillating charges in the walls could not just radiate energy continuously, as Maxwell’s equations would predict (and as was known to be true for ordinary antennas) but were only allowed to radiate energy in chunks he called quanta. Furthermore, the amount of energy in one quantum depended on the frequency of the oscillation, in fact linearly: for frequency \(f\), the quantum has energy \(hf\), where \(h\) is the constant introduced into the formula above, now known as Planck’s constant. It follows that the oscillators themselves could only be oscillating with energies that form a ladder with steps \(hf\) apart, above some lowest energy which would be their energy at absolute zero temperature.
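The two regimes described above can be seen by evaluating the average energy per mode, Equation (1.2.2), for a few values of \(h\nu/kT\). This is a minimal sketch; the function name and the choice of units with \(kT = 1\) are assumptions made for the illustration.

```python
import math

def planck_mean_energy(h_nu, kT):
    """Average energy per mode, h*nu / (exp(h*nu/kT) - 1), as in Eq. (1.2.2)."""
    # math.expm1(x) computes exp(x) - 1 accurately even for very small x.
    return h_nu / math.expm1(h_nu / kT)

kT = 1.0  # arbitrary units, chosen only for illustration
for ratio in (0.001, 0.1, 1.0, 5.0, 20.0):
    mean = planck_mean_energy(ratio * kT, kT)
    print(f"h*nu/kT = {ratio:7.3f}   mean energy / kT = {mean / kT:.6g}")
```

For small \(h\nu/kT\) the printed ratio approaches 1 (the classical \(kT\) per mode), while for large \(h\nu/kT\) it falls off exponentially.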
The formula follows if we assume the oscillating field component in the oven having frequency \(f\) can only have a whole number of quanta of energy, that is to say, its energy must be one of: \(0, hf, 2hf, 3hf, …\) If we further assume that the relative probability of it having energy \(E\) is \( e^{-E/kT} \), then its relative probabilities of having energy \(0, hf, 2hf, …\) are in the ratio \(1 : e^{-hf/kT} : e^{-2hf/kT}\), etc.
The actual probabilities are given by dividing these relative probabilities by the sum of all of them. They clearly are the terms of a geometric series, so their sum is just \( 1/(1-e^{-hf/kT}) \). So, to find the average energy in the oscillator, we take the possible energies \(0, hf, 2hf, 3hf, …\) and weight each of them with their probability of occurring, that is, we must find \[0\cdot 1+hf\cdot e^{-hf/kT}+2hf\cdot e^{-2hf/kT}+..., \tag{1.2.3}\] and divide the sum by \( 1/(1-e^{-hf/kT}) \).
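For readers who want the intermediate algebra, the sum can be carried out as follows. The shorthand \(x = e^{-hf/kT}\) is introduced here only for convenience. The weighted sum is the derivative of a geometric series:

\[ \sum_{n=0}^{\infty} n\,hf\,x^n = hf\,x\,\frac{d}{dx}\sum_{n=0}^{\infty} x^n = hf\,x\,\frac{d}{dx}\left(\frac{1}{1-x}\right) = \frac{hf\,x}{(1-x)^2} \]

Dividing by the normalizing sum \(1/(1-x)\) gives the average energy

\[ \frac{hf\,x}{1-x} = \frac{hf\,e^{-hf/kT}}{1-e^{-hf/kT}} = \frac{hf}{e^{hf/kT}-1} \]

which is the average energy per mode in Equation (1.2.2), with the frequency written as \(f\) rather than \(\nu\).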
So, Planck’s quantum assumption explains the observed blackbody radiation curve. It also gives a qualitative explanation of the change in color of the radiated light as the temperature is increased. The oscillators in the walls derive their energy from the heat vibrations of neighboring molecules: typically, such a vibration has energy of order \(kT\), with probabilities of more energy going down as \( e^{-E/kT} \). This means that if the potentially radiating oscillator can only absorb energies in quanta \(hf\), then for \(kT \ll hf\) it will be very unlikely to absorb any energy, and therefore very unlikely to radiate. In the three-dimensional oven, the number of standing wave oscillations in a small frequency range \(\Delta f\) increases with \(f\) as \(f^2\), so we find that the maximum radiation intensity occurs at a frequency \(f\) such that \(hf\) is of order \(kT\). Therefore, as the temperature increases, the frequency at which the most intense radiation occurs increases, and hence the color moves from red to blue.
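The statement that the peak occurs where \(hf\) is of order \(kT\) can be made concrete. Multiplying the mode count (proportional to \(f^2\)) by the average energy per mode gives a spectrum proportional to \(x^3/(e^x-1)\) with \(x = hf/kT\). Setting its derivative to zero gives the condition \(3(1-e^{-x}) = x\). The sketch below solves this by bisection; the function names, the bracketing interval and the sample temperatures are assumptions made for the example.

```python
import math

H = 6.62607015e-34      # Planck's constant, J s
K_B = 1.380649e-23      # Boltzmann's constant, J/K

def peak_condition(x):
    """Zero of this function marks the maximum of x**3 / (exp(x) - 1)."""
    return 3.0 * (1.0 - math.exp(-x)) - x

def find_peak(lo=1.0, hi=5.0, tol=1e-12):
    """Bisection; peak_condition is positive at lo and negative at hi."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if peak_condition(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x_peak = find_peak()
print(f"peak at h f / kT = {x_peak:.4f}")

# Illustrative temperatures (not taken from the text).
for temperature in (1000.0, 3000.0, 6000.0):
    f_peak = x_peak * K_B * temperature / H
    print(f"T = {temperature:6.0f} K   peak frequency = {f_peak:.3e} Hz")
```

The peak frequency scales linearly with temperature, which is the shift from red toward blue described above.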
The Photoelectric Effect
If light shines on certain metals, electrons are emitted. This is the photoelectric effect. If the metal is in air, the electrons bounce off air molecules and are almost certainly rapidly reabsorbed, but if the metal surface is in a vacuum, the electrons can fly away, and in a vacuum tube they can be collected by another piece of metal, and light can cause a current to flow, the origin of the photoelectric cell.
In 1902, Lenard studied how the energy of the emitted photoelectrons varied with the intensity of the light. He used a carbon arc light, and could increase the intensity a thousand-fold. The ejected electrons hit another metal plate, the collector, which was connected to the cathode by a wire with a sensitive ammeter, to measure the current produced by the illumination. To measure the energy of the ejected electrons, Lenard charged the collector plate negatively, to repel the electrons coming towards it. Thus, only electrons ejected with enough kinetic energy to get up this potential hill would contribute to the current. Lenard discovered that there was a well-defined minimum voltage that stopped any electrons getting through, we’ll call it \(V_{stop}\). To his surprise, he found that \(V_{stop}\) did not depend at all on the intensity of the light! Doubling the light intensity doubled the number of electrons emitted, but did not affect the energies of the emitted electrons. He also discovered, by using light of different colors, that the maximum electron energy did increase as the frequency of the incident light increased.
Einstein Suggests an Explanation
In 1905 Einstein gave a very simple interpretation of Lenard’s results. He just assumed that the incoming radiation should be thought of as quanta of energy \(hf\), with \(f\) the frequency. In photoemission, one such quantum is absorbed by one electron. If the electron is some distance into the material of the cathode, some energy will be lost as it moves towards the surface. There will always be some electrostatic cost as the electron leaves the surface; this is usually called the work function, \(W\). The most energetic electrons emitted will be those very close to the surface, and they will leave the cathode with kinetic energy
\[ E=hf-W \tag{1.2.4} \]
On cranking up the negative voltage on the collector plate until the current just stops, that is, to \(V_{stop}\), the highest kinetic energy electrons must have had energy \(eV_{stop}\) on leaving the cathode. Thus,
\[ eV_{stop}=hf-W \tag{1.2.5} \]
Thus Einstein’s theory makes a very definite quantitative prediction: if the frequency of the incident light is varied, and \(V_{stop}\) plotted as a function of frequency, the slope of the line should be \(h/e\). It is also clear that there is a minimum light frequency for a given metal, that for which the quantum of energy is equal to the work function. Light below that frequency, no matter how bright, will not cause photoemission.
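Equation (1.2.5) is easy to evaluate numerically. The sketch below computes the stopping voltage and the threshold frequency; the function names and the work function of 2.3 eV are illustrative assumptions, not values from the text.

```python
H = 6.62607015e-34          # Planck's constant, J s
E_CHARGE = 1.602176634e-19  # elementary charge, C

def stopping_voltage(frequency_hz, work_function_ev):
    """V_stop (volts) from e*V_stop = h*f - W; zero below threshold (no photoemission)."""
    photon_energy_ev = H * frequency_hz / E_CHARGE
    return max(0.0, photon_energy_ev - work_function_ev)

def threshold_frequency(work_function_ev):
    """Minimum frequency for photoemission, where h*f equals the work function."""
    return work_function_ev * E_CHARGE / H

WORK_FUNCTION_EV = 2.3  # hypothetical metal, chosen only for illustration

f_min = threshold_frequency(WORK_FUNCTION_EV)
print(f"threshold frequency = {f_min:.3e} Hz")

f1, f2 = 8.0e14, 1.0e15  # two frequencies above threshold
slope = (stopping_voltage(f2, WORK_FUNCTION_EV) - stopping_voltage(f1, WORK_FUNCTION_EV)) / (f2 - f1)
print(f"slope of V_stop vs f = {slope:.4e} V s,  h/e = {H / E_CHARGE:.4e} V s")
```

The slope depends only on \(h/e\), not on the metal; changing the work function shifts the line without changing its slope.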
Millikan’s Attempts to Disprove Einstein’s Theory
If we accept Einstein’s theory, then, this is a completely different way to measure Planck’s constant. The American experimental physicist Robert Millikan, who did not accept Einstein’s theory, which he saw as an attack on the wave theory of light, worked for ten years, until 1916, on the photoelectric effect, to disprove Einstein’s theory. He even devised techniques for scraping clean the metal surfaces inside the vacuum tube. For all his efforts he found disappointing results (for him!): he confirmed Einstein’s theory, measuring Planck’s constant to within \(0.5\%\) by this method. One consolation was that he did get a Nobel prize for this series of experiments.
The point to be emphasized is that the same value for Planck’s constant, \(6.6 \times 10^{-34}\) J·s, emerges from two completely different experiments: the measurement of blackbody radiation, and measuring energies of emitted electrons in the photoelectric effect. This is clearly a general property of electromagnetic radiation, and is confirmed by many later experiments, for example Compton scattering, in which light scatters off electrons. By measuring the energy change and momentum change of the electron, it is found that a single quantum of light was scattered. (At very high energies, more particles may be generated.)
The Nature of Light
It is firmly established experimentally that the propagation of light is well described by a wave equation, which in fact is not difficult to derive from Maxwell’s equations:\[ \nabla^2 \vec E -\frac{1}{c^2} \frac{\partial^2 \vec E}{\partial t^2}=0 \tag{1.2.6}\]
For a plane wave moving in the x-direction this reduces to
\[ \frac{\partial^2 \vec E}{\partial x^2} -\frac{1}{c^2} \frac{\partial^2 \vec E}{\partial t^2} =0 \tag{1.2.7} \]
The monochromatic solution to this wave equation has the form
\[ \vec E (x,t)= \vec E_0 e^{i(kx-\omega t)} \tag{1.2.8} \]
(Another possible solution is proportional to \( \cos(kx-\omega t) \). We shall find that the exponential form, although a complex number, proves more convenient. The physical electric field can be taken to be the real part of the exponential for the classical case.)
Applying the wave equation differential operator to our plane wave solution gives
\[ \left(\frac{\partial^2}{\partial x^2} -\frac{1}{c^2} \frac{\partial^2}{\partial t^2} \right) \vec E_0 e^{i(kx-\omega t)}= -\left(k^2 -\frac{\omega^2}{c^2} \right) \vec E_0 e^{i(kx-\omega t)} =0 \tag{1.2.9} \]
If the plane wave is a solution to the wave equation, this must be true for all \(x\) and \(t\), so we must have \[ \omega =ck \tag{1.2.10}\]
Solving this equation for boundary conditions like an antenna can be quite challenging, but all we need consider at the moment is some illustration of diffraction. We take the case of a double slit experiment: if a plane wave encounters a barrier with two equal narrow parallel slit openings, the transmitted wave reaching a screen some distance further on will show a series of bright and dark stripes parallel to the slits. This pattern can be quantitatively accounted for. The two slits transmit radiation in phase with each other. At each point on the screen, the electric field vector from slit 1 must be added to the electric field vector from slit 2. At a point on the screen equidistant from the two slits, the electric field vectors will be equal. Moving away from that point in a direction perpendicular to the slits we will reach a point where the field from one slit is exactly out of phase with the field from the other slit—the screen will be dark.
In fact, the intensity of the light at any point on the screen is proportional to \( |E_0 |^2\), where \(E_0\) here denotes the amplitude of the total electric field at that point.
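The addition of the two fields can be sketched in a few lines. This is an idealized illustration: each slit is treated as a source of equal amplitude, the fall-off of amplitude with distance and the single-slit envelope are ignored, and the wavelength, slit separation and screen distance are made-up values.

```python
import cmath
import math

def two_slit_intensity(y, wavelength, slit_separation, screen_distance):
    """Relative intensity |E1 + E2|**2 at height y on the screen (unit amplitude per slit)."""
    k = 2.0 * math.pi / wavelength
    r1 = math.hypot(screen_distance, y - 0.5 * slit_separation)  # path from slit 1
    r2 = math.hypot(screen_distance, y + 0.5 * slit_separation)  # path from slit 2
    total_field = cmath.exp(1j * k * r1) + cmath.exp(1j * k * r2)
    return abs(total_field) ** 2

# Hypothetical setup: 500 nm light, slits 0.1 mm apart, screen 1 m away.
WAVELENGTH = 500e-9
SLIT_SEPARATION = 1.0e-4
SCREEN_DISTANCE = 1.0

for y_mm in (0.0, 1.25, 2.5, 3.75, 5.0):
    intensity = two_slit_intensity(y_mm * 1e-3, WAVELENGTH, SLIT_SEPARATION, SCREEN_DISTANCE)
    print(f"y = {y_mm:5.2f} mm   relative intensity = {intensity:.4f}")
```

With these numbers the path difference at \(y = 2.5\) mm is close to half a wavelength, so the two fields nearly cancel there, while at the center they add to give four times the single-slit intensity.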
Now consider what happens as we make the light dimmer and dimmer. How easy is it to see this diffraction pattern? Eventually we need to soup up our detection apparatus. We replace our screen and visual inspection with a series of photodetectors. Experimentally, we find that, just as in the photoelectric effect, our detectors will only detect quanta, just as if the light were made up of particles, photons. Suppose now we dim the light so that our photodetectors only detect one photon per minute coming through the slits. If we record where each photon lands, and build up a picture, we find the very same pattern of light and dark stripes that we saw with bright light.
In other words, if we send through one photon, we cannot predict where it will land, but if we send through a thousand, we will begin to discern the stripes. The best we can do for one photon is to say it will more probably land where the solution to Maxwell’s wave equation gives a large \( |E_0 |^2\). That is to say, \( |E_0 (x)|^2\) is proportional to the probability of the photon being at \(x\).
But this means each photon must have gone through both slits! The probability distribution for a single photon is given by the stripes, and the distance between the stripes depends on the distance between the slits. The photon, therefore, knows about both slits. So the bottom line is: to find where one photon will be, solve the wave equation to find the electric field everywhere on the screen. The probability of the photon landing at any particular point is proportional to \( |E_0 |^2\) at that point.
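The build-up of the pattern one photon at a time can be imitated with a small Monte Carlo sketch. It assumes an idealized probability density proportional to \(\cos^2(\pi y/s)\), with \(s\) the fringe spacing, and uses rejection sampling; the function names, the fringe spacing, the screen width and the random seed are all choices made for the illustration.

```python
import math
import random

def sample_landing_positions(n_photons, fringe_spacing, half_width, seed=0):
    """Draw landing positions with probability proportional to cos^2(pi*y/fringe_spacing)."""
    rng = random.Random(seed)
    positions = []
    while len(positions) < n_photons:
        y = rng.uniform(-half_width, half_width)
        relative_probability = math.cos(math.pi * y / fringe_spacing) ** 2  # between 0 and 1
        if rng.random() < relative_probability:   # rejection sampling
            positions.append(y)
    return positions

def count_in_bins(positions, half_width, n_bins):
    """Histogram of landing positions across the screen."""
    counts = [0] * n_bins
    bin_width = 2.0 * half_width / n_bins
    for y in positions:
        index = min(int((y + half_width) / bin_width), n_bins - 1)
        counts[index] += 1
    return counts

# Hypothetical screen: fringes 5 mm apart, screen from -10 mm to +10 mm.
positions = sample_landing_positions(2000, fringe_spacing=5.0, half_width=10.0)
counts = count_in_bins(positions, half_width=10.0, n_bins=16)
for i, c in enumerate(counts):
    print(f"{-10.0 + 1.25 * i:7.2f} mm  " + "#" * (c // 5))
```

No single landing position is predictable, but the histogram of many photons shows the stripes: the bins next to \(y = 0\) fill up much faster than those next to \(y = \pm 2.5\) mm.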
To illustrate how weird this really is, consider a beam of photons split into two by a half-silvered mirror; the two half-beams then follow widely separated paths until they are reunited by a suitable sequence of mirrors to interfere with each other. Sending one photon at a time, we will eventually build up a diffraction pattern of some sort. So if we think of the initial photon as a “wave packet” it will split into two half “wave packets” which will finally interfere with each other. Now suppose I put \(100\%\) efficient photon detectors on both paths. If I send photons through the apparatus one at a time, I get a series of clicks from the two detectors: path 1 clicks, path 1 clicks again, path 2 clicks, etc.: a random series. I never get both clicking with one photon. (We can dim the light enough so that the photons are far apart, that is, they definitely come one at a time.) What does this tell us about the nature of the wavefunction?
You might be inclined to think that the photon goes at random: half the time it goes along one path, half the time the other. That is to say, the photon really is on one of the paths, we just don’t know which until we detect it, and the wavefunction represents our ignorance. We do know that once we detect the photon on one path, there’s zero probability of finding it on the other path, so that part of the wavefunction has gone! But was it really there in the first place for that particular photon? Yes: the other half wavepacket must have been there, because if I hadn’t captured the photon with a detector in the way, the two half wavepackets would have gone on to interfere with each other to give the diffraction pattern. So this line of thinking is wrong: we cannot say that the photon “really is” on one of the two paths before we detect it.
The Nature of Matter
By the 1890’s and early 1900’s, most scientists believed in the existence of atoms. Not all: the distinguished German chemist Ostwald did not, for example. But nobody had a clear picture of even a hydrogen atom. The electron had just been discovered, and it was believed that the hydrogen atom had a single electron. It was suggested that maybe the electron went in circles around a central charge, but nobody believed that because Maxwell had established that accelerating charges radiate, so it was assumed that a circling electron would rapidly lose energy, spiral in to the center, and the atom would collapse. Instead, it was thought, the hydrogen atom (which was of course electrically neutral) was a ball of positively charged jelly with an electron inside, which would oscillate when heated, and emit radiation. Rough calculations, based on the accepted size of the atom, suggested that the radiation would be in the visible range, but no-one could remotely reproduce the known spectrum of hydrogen.
The big breakthrough came in 1909, when Rutherford tried to map the distribution of positive charge in a heavy atom (gold) by scattering alpha particles from it. To his amazement, he found the positive charge was all concentrated in a tiny nucleus, with a radius of order one ten-thousandth that of the atom. This meant that the electrons must after all be going in some kind of planetary orbits, and that the radiation predicted by Maxwell’s equations did not occur, just as Maxwell’s equations did not always apply in blackbody radiation.
The Bohr Atom
The Danish theorist Niels Bohr was visiting Manchester at the time Rutherford did this experiment, and Bohr decided that there must be certain allowed sets of electron orbits in the atom where the classical acceleration radiation did not occur: he called them “stationary states”. The lowest energy stationary state would be the ground state of the atom, the others would eventually go to that state by emitting photons corresponding to energy differences between states.
But Bohr was of the opinion that looking at the very complex spectra emitted by heated atoms would never be helpful—he remarked that it would be like trying to understand fundamental biology by studying the colors of butterfly wings.
He changed his mind in February 1913, when a casual conversation with the spectroscopist H. M. Hansen revealed that one pattern had been discerned in the apparent chaos of spectral lines. In particular, Hansen (a colleague and former classmate of Bohr) showed him Balmer's formula for hydrogen. Balmer was a math and Latin teacher at a girls’ school in Switzerland, and had found his formula in the 1880’s. Balmer's formula is:
\[ \frac{1}{\lambda}=R_H \left(\frac{1}{4} -\frac{1}{n^2} \right) \tag{1.2.11} \]
for the sequence of wavelengths of light emitted, with \(n = 3, 4, 5, 6\) being in the visible, the lines used by Balmer in finding the formula. Hansen would doubtless have informed Bohr that the \(1/4\) could be replaced by \(1/m^2\), with \(m\) another integer. The constant appearing on the right hand side is called the Rydberg constant, \(R_H = 109{,}737\ \text{cm}^{-1}\). (This is the modern value; Balmer got it right to one part in 10,000, about the limit of spectral measurements at the time.)
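As a quick check that these lines fall in the visible, Balmer's formula can be evaluated directly with the value of \(R_H\) quoted above. The function name and the optional argument `m` (for the generalization \(1/4 \to 1/m^2\)) are choices made for this sketch.

```python
R_H_PER_CM = 109737.0  # Rydberg constant as quoted in the text, in cm^-1

def balmer_wavelength_nm(n, m=2):
    """Wavelength in nm from 1/lambda = R_H * (1/m**2 - 1/n**2)."""
    if n <= m:
        raise ValueError("n must be larger than m for an emission line")
    inverse_wavelength_per_cm = R_H_PER_CM * (1.0 / m**2 - 1.0 / n**2)
    return 1.0e7 / inverse_wavelength_per_cm  # 1 cm = 1e7 nm

for n in (3, 4, 5, 6):
    print(f"n = {n}: wavelength = {balmer_wavelength_nm(n):.1f} nm")
```

The four wavelengths come out between roughly 410 nm and 660 nm, running from red (\(n = 3\)) to violet (\(n = 6\)).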
Bohr said later: “As soon as I saw Balmer's formula, the whole thing was immediately clear to me.” What he saw was that the set of allowed frequencies (proportional to inverse wavelengths) emitted by the hydrogen atom could all be expressed as differences. This immediately suggested to him a generalization of his idea of a “stationary state” lowest energy level, in which the electron did not radiate. There must be a whole sequence of these stationary states, with radiation only taking place as the atom jumps from one to another of lower energy, emitting a single quantum of frequency \(f\) such that \[ hf=E_n-E_m \tag{1.2.12}\] the difference between the energies of the two states.
Evidently, from the Balmer formula and its extension to general integers \(m\), \(n\), these allowed non-radiating orbits, the stationary states, could be labeled 1, 2, 3, ... ,\(n\), ... and had energies
\[ E_n=-hcR_H/n^2 \tag{1.2.13} \]
using \( \lambda f=c \) and the Balmer equation above.
The energies are of course negative, because these are bound states, and we take the zero of energy to be where the two particles are at rest infinitely far apart.
Bohr was very familiar with the dynamics of simple circular orbits in an inverse square field. He knew that if the energy of the orbit was \( -hcR_H/n^2 \), then the kinetic energy of the electron was \( \frac{1}{2} mv^2=hcR_H/n^2 \), and the potential energy would be
\[ -\frac{1}{4 \pi \varepsilon_0} \cdot \frac{e^2}{r_n} =-\frac{2hcR_H}{n^2} \tag{1.2.14} \]
It immediately follows that the radius of the \(n^{th}\) orbit is proportional to \(n^2\), and the speed in that orbit is proportional to \(1/n\).
It then follows that the angular momentum of the \(n^{th}\) orbit is just proportional to \(n\), and Bohr knew that Planck’s constant, the basis of quantum theory, had the dimensions of angular momentum!
Evidently, then, the angular momentum in the \(n^{th}\) orbit was \(nKh\), where \(h\) is Planck’s constant and \(K\) is some multiplying factor, the same for all the orbits, still to be determined.
In fact, the value of \(K\) follows from the results above. \(R_H\), \(m\), \(h\), and \(c\) are all known quantities (\(R_H\) being measured experimentally by observing the lines in the Balmer series) so the above formulas immediately give the electron's speed and distance from the nucleus in the \(n^{th}\) orbit, and hence its angular momentum. Therefore, by putting in these experimentally determined quantities, we can find \(K\).
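The procedure just described can be sketched numerically. The code uses the kinetic and potential energy relations of the preceding paragraphs to obtain the speed and radius of the \(n^{th}\) orbit, then forms \(K = m v_n r_n/(nh)\). The function names are assumptions for the example, the constants are standard SI values, and \(R_H\) is the value quoted earlier in the text.

```python
import math

H = 6.62607015e-34          # Planck's constant, J s
C = 2.99792458e8            # speed of light, m/s
E_CHARGE = 1.602176634e-19  # elementary charge, C
M_E = 9.1093837015e-31      # electron mass, kg
EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
R_H = 109737.0 * 100.0      # Rydberg constant from the text, converted to m^-1

def orbit_speed_and_radius(n):
    """Speed and radius of the n-th orbit from (1/2) m v^2 = h c R_H / n^2 and Eq. (1.2.14)."""
    binding_energy = H * C * R_H / n**2
    speed = math.sqrt(2.0 * binding_energy / M_E)
    radius = E_CHARGE**2 / (4.0 * math.pi * EPS0 * 2.0 * binding_energy)
    return speed, radius

def angular_momentum_factor(n):
    """K defined by m * v_n * r_n = n * K * h."""
    speed, radius = orbit_speed_and_radius(n)
    return M_E * speed * radius / (n * H)

for n in (1, 2, 5):
    print(f"n = {n}: K = {angular_momentum_factor(n):.6f}")
print(f"1/(2*pi) = {1.0 / (2.0 * math.pi):.6f}")
```

The factor comes out the same for every \(n\), and numerically very close to \(1/2\pi\).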
Bohr’s Semiclassical Argument to Fix the Quantum of Angular Momentum
However, Bohr found a clever theoretical way to determine \(K\), and hence \(R_H\), from his model. He equated his prediction of the frequency emitted when an electron goes from one orbit to the next in a very large atom with the classical prediction, which is just the orbital frequency of the electron (how many times per second it goes around). From this he deduced \(K=1/2\pi \), and with it an expression for the Rydberg constant in terms of \(h\), \(m\), \(e\) and \(c\). The rather abstract argument that the quantum predictions must match the known classical results for large slow systems actually fixes the Rydberg constant.
His argument goes as follows: for the circular orbits \[ \frac{mv^2}{r}=\frac{1}{4\pi\varepsilon_0}\cdot\frac{e^2}{r^2}\quad \text{so}\quad mv^2=\frac{1}{4\pi\varepsilon_0}\cdot\frac{e^2}{r},\quad \text{K.E.}=-\frac{1}{2}\,\text{P.E.},\quad E=-\frac{1}{4\pi\varepsilon_0}\cdot\frac{e^2}{2r}. \tag{1.2.15}\]
With the angular momentum quantized, for the \(n^{th}\) orbit: \[ mv_n r_n =nKh \tag{1.2.16} \]
where \(h\) is Planck’s constant, \(n\) an integer, \(K\) the unknown multiplying factor (ok, fixed by experiment, but we’re finding it independently).
From this quantization condition we can find the radius, and hence the energy, of the \(n^{th}\) orbit: \[ \frac{1}{4\pi \varepsilon_0} \cdot \frac{e^2}{r_n} =mv_n^2 =m \left(\frac{nKh}{mr_n} \right)^2 \tag{1.2.17} \]
Giving \[ r_n=\frac{4\pi\varepsilon_0 n^2K^2h^2}{me^2} , \, E_n =-\frac{1}{4\pi\varepsilon_0} \cdot \frac{e^2}{2r_n} =-\left(\frac{1}{4\pi\varepsilon_0} \right)^2 \cdot \frac{me^4}{2K^2 h^2} \cdot \frac{1}{n^2} \tag{1.2.18}\]
In the large \(n\) limit, \[ E_{n+1}-E_n \cong \left(\frac{1}{4\pi\varepsilon_0} \right)^2 \cdot \frac{me^4}{2K^2 h^2} \cdot \frac{2}{n^3} =h\nu \tag{1.2.19} \]
so \[ \nu =\left(\frac{1}{4\pi \varepsilon_0} \right)^2 \cdot \frac{me^4}{K^2 h^3 n^3} \tag{1.2.20} \]
where \(\nu\) is the frequency of the emitted photon on jumping down one quantum number.
In the classical limit of large \(n\), \(\nu\) must match the orbital frequency of the electron, since Maxwell’s equations will be valid. That is, \[ \nu =\frac{v_n}{2\pi r_n} =\frac{nKh}{2\pi m r_n^2} = \left(\frac{1}{4\pi\varepsilon_0} \right)^2 \cdot \frac{nKh}{2\pi m} \cdot \frac{m^2 e^4}{n^4 K^4 h^4} =\left(\frac{1}{4\pi\varepsilon_0} \right)^2 \cdot \frac{m e^4}{2\pi n^3 K^3 h^3} \tag{1.2.21} \]
Comparing the two expressions (1.2.20) and (1.2.21), we see that they agree if \(K^2=2\pi K^3\), that is, if \(K=1/2\pi \).
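The large-\(n\) matching can also be checked numerically. The following is a minimal sketch, not part of Bohr's argument: it evaluates the photon frequency from the exact level difference \(E_{n+1}-E_n\) and the classical orbital frequency of (1.2.21), both with \(K=1/2\pi\), and prints their ratio. The function names are illustrative, and the constants are hard-coded CODATA SI values.

```python
import math

# CODATA SI values, hard-coded so the sketch is self-contained.
H = 6.62607015e-34          # Planck constant, J s
M_E = 9.1093837015e-31      # electron mass, kg
E_CHARGE = 1.602176634e-19  # elementary charge, C
EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
COULOMB = 1.0 / (4.0 * math.pi * EPS0)


def photon_frequency(n, K):
    """(E_{n+1} - E_n) / h, using the exact level formula (1.2.18)."""
    prefactor = COULOMB**2 * M_E * E_CHARGE**4 / (2.0 * K**2 * H**2)
    return prefactor * (1.0 / n**2 - 1.0 / (n + 1) ** 2) / H


def orbital_frequency(n, K):
    """Classical orbital frequency v_n / (2 pi r_n), as in (1.2.21)."""
    return COULOMB**2 * M_E * E_CHARGE**4 / (2.0 * math.pi * n**3 * K**3 * H**3)


K_BOHR = 1.0 / (2.0 * math.pi)

# Ratio of quantum to classical frequency for increasingly large orbits.
ratios = {
    n: photon_frequency(n, K_BOHR) / orbital_frequency(n, K_BOHR)
    for n in (10, 100, 1000, 10000)
}

for n, ratio in ratios.items():
    print(f"n = {n:6d}   photon / orbital frequency = {ratio:.6f}")
```

With \(K=1/2\pi\) the ratio climbs toward 1 as \(n\) grows. For any other choice of \(K\) it tends to \(2\pi K\) instead, so the two frequencies would never agree.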
Putting \(K=1/2\pi \) into the energy level formula, \[ E_n=-\left( \frac{1}{4\pi\varepsilon_0}\right)^2\cdot\frac{me^4}{2K^2h^2}\cdot\frac{1}{n^2}=-\left( \frac{1}{4\pi\varepsilon_0}\right)^2\cdot\frac{2\pi^2me^4}{h^2}\cdot\frac{1}{n^2}. \tag{1.2.22}\]
Now the Rydberg constant is defined by \[ E_n =-hcR_H/n^2 \tag{1.2.23} \]
so the Bohr model predicts that \[ R_H =\left(\frac{1}{4\pi \varepsilon_0} \right)^2 \cdot \frac{2\pi^2 me^4}{ch^3} \tag{1.2.24} \]
This formula was found to be correct within the limits of experimental error in measuring the quantities on the right.
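As a quick numerical illustration of (1.2.24), the sketch below plugs modern CODATA values into the formula. It assumes an infinitely heavy nucleus, as the derivation above does, and the function name `rydberg_constant` is purely illustrative.

```python
import math

# CODATA SI values, hard-coded so the sketch is self-contained.
H = 6.62607015e-34          # Planck constant, J s
C = 299792458.0             # speed of light, m/s
M_E = 9.1093837015e-31      # electron mass, kg
E_CHARGE = 1.602176634e-19  # elementary charge, C
EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m


def rydberg_constant(mass):
    """Evaluate (1.2.24) for an orbiting particle of the given mass, in 1/m."""
    coulomb = 1.0 / (4.0 * math.pi * EPS0)
    return coulomb**2 * 2.0 * math.pi**2 * mass * E_CHARGE**4 / (C * H**3)


R = rydberg_constant(M_E)

# Ground state energy E_1 = -h c R from (1.2.23), converted from joules to eV.
ground_state_ev = -H * C * R / E_CHARGE

print(f"Rydberg constant: {R:.6e} per metre")
print(f"Ground state energy: {ground_state_ev:.4f} eV")
```

The result should come out close to \(1.097\times 10^7\) per metre, with a ground state energy near \(-13.6\) eV.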
But few people believed his theory. For one thing, it soon became apparent that in the spectra of some stars (actually including the sun) there were spectral lines apparently corresponding to half the angular momentum quantum. How could that be?
Bohr’s response was that these lines must be from ionized helium, not hydrogen. A neutral helium atom has two electrons; a singly-ionized helium atom has just one, but its nucleus has twice the charge of the hydrogen nucleus, so the factor \(e^4\) in the Rydberg constant is replaced by \(4e^4\), which leads to the observed result. But then a spectroscopist called Fowler did some very precise measurements, and found that the \(R_H\) for these new lines actually corresponded to a factor of 4.0016 rather than exactly 4. How could Bohr explain that?
Bohr pointed out that at this level of precision, the finite mass of the nucleus must be taken into account by using a reduced mass for the electron. This gives just the right factor. This result greatly impressed Einstein, who concluded that Bohr must be on the right track.
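The size of this correction is easy to estimate. The sketch below assumes the standard two-body result that the electron mass \(m\) in the Rydberg formula is replaced by the reduced mass \(mM/(m+M)\), where \(M\) is the nuclear mass. The mass ratios are CODATA values, not numbers from the discussion above, and the function names are illustrative.

```python
# CODATA mass ratios, hard-coded so the sketch is self-contained.
M_E_OVER_M_PROTON = 1.0 / 1836.15267343  # electron mass / proton mass
M_E_OVER_M_ALPHA = 1.0 / 7294.29954142   # electron mass / helium-4 nucleus mass


def reduced_mass_factor(m_over_M):
    """Reduced mass divided by electron mass: mu / m = 1 / (1 + m / M)."""
    return 1.0 / (1.0 + m_over_M)


def rydberg_ratio(z, m_over_M_ion, m_over_M_reference):
    """Ratio of the Rydberg constant of a one-electron ion of nuclear charge
    z*e to that of a reference one-electron atom with nuclear charge e."""
    return z**2 * reduced_mass_factor(m_over_M_ion) / reduced_mass_factor(
        m_over_M_reference
    )


# Singly ionized helium (z = 2) compared with hydrogen.
factor = rydberg_ratio(2, M_E_OVER_M_ALPHA, M_E_OVER_M_PROTON)
print(f"R(He+) / R(H) = {factor:.4f}")
```

The factor of 4 comes from the doubled nuclear charge. The small excess over 4 arises because the helium nucleus is heavier than the proton, so the reduced mass is slightly closer to the full electron mass.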
Remark: the Bohr Atom is Still Important!
Although, as we shall see shortly, Bohr’s semiclassical analysis has long been replaced by Schrödinger’s wave function, there are recent experiments in atomic physics where the classical approach provides valuable insight. In particular, so-called Rydberg atoms, which are atoms with one electron in a spatially large orbit (large \(n\), weakly bound), act a lot like classical systems. Such atoms can be ionized by microwave fields. For a considerable range of parameters, the onset of this ionization can be accounted for by ignoring quantum mechanics altogether, and interpreting ionization as the onset of chaotic motion in the classical driven system! (And, the standard perturbation theoretic methods of quantum mechanics don’t work for this system anyway, because the perturbing microwave electric field is of the same order of magnitude as the atom’s electric field at these large orbits.) We should mention that, counterintuitively, quantum mechanics does become important again at very large \(n\) (or high microwave frequency), where some tricks from condensed matter physics have been used successfully to interpret the experiments. This is a rich subject: qualitatively different phenomena occur as the ratio of microwave frequency to orbital frequency is varied.
Prince Louis de Broglie Gets His Ph.D.
The next real advance in understanding the atom came from an unlikely quarter—a student prince in Paris. Prince Louis de Broglie was a member of an illustrious family, prominent in politics and the military since the 1600’s. Louis began his university studies with history, but his elder brother Maurice studied x-rays in his own laboratory, and Louis became interested in physics. He worked with the very new radio telegraphy during the war.
After the war, de Broglie focused his attention on Einstein's two major achievements, the theory of special relativity and the quantization of light waves. He wondered if there could be some connection between them. Perhaps the quantum of radiation really should be thought of as a particle. It had been known for a long time that light waves carry momentum: this is famously demonstrated by the “radiometer”, a small “windmill” in a vacuum, with vanes silver on one side and blackened on the other. If the vacuum is good, the radiometer begins to rotate when exposed to light because the light bouncing off the silvered side delivers twice the momentum of the light absorbed by the blackened side. (It should be added that cheap versions of this device have poor vacua, and the heated gas near the blackened side tends to push the vanes the wrong way.)
In fact, it follows from Maxwell’s equations that the momentum density of a light beam is related to its energy density by \(E = cp\). We would therefore expect this same energy-momentum relationship to be true for the photons of which the light beam is composed. Now, from special relativity we know that all particles have an energy-momentum relationship \(E^2= m_0^2 c^4 +c^2 p^2\), where \(m_0\) is the rest mass of the particle. The only way this can be the same as \(E = cp\) is if \(m_0 = 0\), or, at least, if \(m_0\) is so small that all our observations are on particles having kinetic energy so far in excess of their rest energy that the tiny mass is not detectable. De Broglie suspected that the photon did have a very tiny nonzero rest mass, so that if the speed of a sufficiently low energy quantum could be measured, it would be found to be less than \(c\). On this point he was wrong (as far as we know!). Nevertheless, it was a very valuable conceptual breakthrough to think of the quantum of radiation as a particle, knowing full well that radiation is a wave. In fact, his incorrect idea that the photon (as we now call the light quantum) had a rest mass led him to analyze the relationship between particle properties and wave properties by transforming to the rest frame of the photon, and he discovered that the energy and momentum of the particle were related to the frequency and wavelength of the wave by: \[ E=hf \; , \; p=h/\lambda \tag{1.2.25}\]
Of course, the first condition is the Planck-Einstein quantization, and the second follows trivially from it if we take \(E = cp\) and \(\lambda f=c\). But de Broglie showed it was more generally true—it worked even if the photon had a rest mass.
Having decided that the photon might well be a particle with a rest mass, albeit very small, it dawned on de Broglie that in other respects it might not be too different from other particles, especially the very light electron. In particular, maybe the electron also had an associated wave. The obvious objection was that if the electron was wavelike, why had no diffraction or interference effects been observed? But there was an answer. If de Broglie’s relation between momentum and wavelength, \(p=h/\lambda\) also held for electrons, the wavelength was sufficiently short that these effects would be easy to miss. As de Broglie himself pointed out, the wave nature of light isn’t very evident in everyday life, or in ray tracing in geometrical optics. He suspected the apparently pure particle nature of electronic trajectories was analogous to the apparent straight-line propagation of rays of light, over distance scales much greater than the wavelength.
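To get a feel for how short these wavelengths are, here is a rough sketch that evaluates \(\lambda = h/p\) for an electron accelerated from rest through a given voltage, using the non-relativistic momentum \(p=\sqrt{2meV}\). The voltages are illustrative choices, not values taken from the text, and the function name is hypothetical.

```python
import math

# CODATA SI values, hard-coded so the sketch is self-contained.
H = 6.62607015e-34          # Planck constant, J s
M_E = 9.1093837015e-31      # electron mass, kg
E_CHARGE = 1.602176634e-19  # elementary charge, C


def electron_wavelength(volts):
    """Non-relativistic de Broglie wavelength (in metres) of an electron
    accelerated from rest through a potential difference of `volts`."""
    momentum = math.sqrt(2.0 * M_E * E_CHARGE * volts)
    return H / momentum


# Illustrative accelerating voltages (assumed values for this example only).
ILLUSTRATIVE_VOLTAGES = (1.0, 54.0, 150.0, 10000.0)

for volts in ILLUSTRATIVE_VOLTAGES:
    angstroms = electron_wavelength(volts) * 1e10
    print(f"{volts:8.0f} V  ->  wavelength = {angstroms:.3f} angstrom")
```

Even for electrons accelerated through only a few volts, the wavelength is of the order of atomic dimensions, thousands of times shorter than the wavelength of visible light. The non-relativistic formula is slightly off at the highest voltage listed, where relativistic corrections begin to matter.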
However, the wavelike properties should be important on an atomic scale. No progress had been made in a decade in understanding why the electronic orbits in the Bohr atom were restricted to integral values of the angular momentum in units of \(h/2\pi\). But if the electron were in some sense a wave, it would be very natural to restrict the orbits to those of standing waves, for otherwise the electron wave on going around the orbit would interfere with itself destructively.
Suppose now the electron, having momentum \(p\), is moving in a circular orbit of radius \(r\). Then for a standing wave, a whole number of wavelengths must fit around the circle, so for some integer \(n\), \(n\lambda =2\pi r\). Putting this together with \(p=h/\lambda\) we find: \[ 2\pi r=n\lambda =nh/p \tag{1.2.26}\]
so \[ L=pr=nh/2\pi \tag{1.2.27} \]
The “standing wave” condition immediately gives Bohr’s quantization of angular momentum!
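As a consistency check, the sketch below takes the Bohr orbit radii from (1.2.18) with \(K=1/2\pi\), finds the electron speed from the force balance (1.2.15), and counts how many de Broglie wavelengths fit around each orbit. It is only an illustration with hypothetical function names. The count must come out equal to \(n\), since the two quantization conditions are equivalent.

```python
import math

# CODATA SI values, hard-coded so the sketch is self-contained.
H = 6.62607015e-34          # Planck constant, J s
M_E = 9.1093837015e-31      # electron mass, kg
E_CHARGE = 1.602176634e-19  # elementary charge, C
EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
COULOMB = 1.0 / (4.0 * math.pi * EPS0)
K = 1.0 / (2.0 * math.pi)   # Bohr's value of the multiplying factor


def orbit_radius(n):
    """Radius of the n-th Bohr orbit from (1.2.18), in metres."""
    return n**2 * K**2 * H**2 / (COULOMB * M_E * E_CHARGE**2)


def orbit_speed(radius):
    """Speed on a circular orbit from the force balance (1.2.15), in m/s."""
    return math.sqrt(COULOMB * E_CHARGE**2 / (M_E * radius))


def wavelengths_around_orbit(n):
    """Circumference of the n-th orbit divided by the de Broglie wavelength."""
    radius = orbit_radius(n)
    wavelength = H / (M_E * orbit_speed(radius))
    return 2.0 * math.pi * radius / wavelength


for n in range(1, 6):
    print(f"n = {n}: r = {orbit_radius(n):.4e} m, "
          f"wavelengths per orbit = {wavelengths_around_orbit(n):.6f}")
```

The \(n=1\) radius printed here is the familiar Bohr radius, about 0.53 angstroms.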
This was the prince’s Ph.D. thesis, presented in 1924. His thesis advisor was somewhat taken aback, and wasn’t sure if this was sound work. He asked de Broglie for an extra copy of the thesis, which he sent to Einstein. Einstein wrote shortly afterwards: “I believe it is a first feeble ray of light on this worst of our physics enigmas”. The prince got his Ph.D.
An Accident at the Phone Company Makes Everything Crystal Clear
There was an accident at the Bell Telephone Laboratories in April 1925. Clinton Davisson and L. H. Germer, looking for ways to improve vacuum tubes, were watching how electrons from an electron gun in a vacuum tube scattered off a flat nickel surface. Suddenly, while the experiment was running and the nickel target was very hot, a bottle of liquid air near the apparatus exploded, smashing one of the vacuum pipes, and air rushed into the apparatus. The hot nickel target oxidized immediately. The layer of oxide made their target useless for further investigations. They decided to clean off the oxide by heating the nickel in a hydrogen atmosphere then in vacuum. After doing this for a prolonged period, the nickel looked good, and they resumed the investigation.
To their amazement, the pattern of electron scattering from the newly cleaned nickel target was completely different from that before the accident. What had changed? On examining their newly cleaned crystal carefully, they found a clue. The original target was polycrystalline—made up of a multitude of tiny crystals, oriented randomly. During the prolonged heating of the cleaning process, the nickel had re-crystallized into a few large crystals.
To quote from their paper: “It seemed probable to us from these results that the intensity of scattering from a single crystal would exhibit a marked dependence on crystal direction, and we set about at once preparing experiments for an investigation of this dependence. We must admit that the results obtained in these experiments have proved to be quite at variance with our expectations. It seemed likely that strong beams would be found issuing from the crystal along what may be termed its transparent directions—the directions in which the atoms in the lattice are arranged along the smallest number of lines per unit area. Strong beams are indeed found issuing from the crystal, but only when the speed of bombardment lies near one or another of a series of critical values, and then in directions quite unrelated to crystal transparency.
“The most striking characteristic of these beams is a one to one correspondence ...which the strongest of them bear to the Laue beams that would be found issuing from the same crystal if the incident beam were a beam of x-rays. Certain others appear to be analogues ... of optical diffraction beams from plane reflection gratings—the lines of these gratings being lines or rows of atoms in the surface of the crystal. Because of these similarities ... a description ... in terms of an equivalent wave radiation ... is not only possible, but most simple and natural. This involves the association of a wavelength with the incident electron beam, and this wavelength turns out to be in acceptable agreement with the value \(h/mv\) of the undulatory mechanics, Planck's action constant divided by the momentum of the electron.
“That evidence for the wave nature of particle mechanics would be found in the reaction between a beam of electrons and a single crystal was predicted by Elsasser two years ago—shortly after the appearance of L. de Broglie's original papers on wave mechanics.”
The above quotes are from Physical Review 30, 705 (1927).
It should be added that a beam of electrons does of course exhibit the two-slit diffraction pattern: it has been observed experimentally many times, and has precisely the same form as that for light. Electrons and photons generate identical interference patterns, although the short wavelength of the electrons used presents an experimental challenge! A double slit used by C. Jönsson in 1961 consisted of slits 0.5 microns wide, 1 to 2 microns apart, in copper foil. See D. Brandt and S. Hirschi, Am. J. Phys. 42, 5 (1974). (This reference is from French and Taylor’s Introduction to Quantum Physics.)
Contributor
- Michael Fowler (Beams Professor, Department of Physics, University of Virginia)