1.1: Breakdown of Classical Mechanics


What was Wrong with Classical Mechanics?

Basically, classical statistical mechanics wasn’t making sense...

Maxwell and Boltzmann developed the equipartition theorem. A physical system can have many states (a gas with particles having different velocities, or springs in different states of compression).

At nonzero temperature, energy will flow around in the system, which will constantly move from one state to another. So, what is the probability that at any instant it is in a particular state with energy \(E\)?

Maxwell and Boltzmann proved it was proportional to \(e^{-E/kT}\). This proportionality is also correct for any subsystem of the system: for example, a single molecule.

Notice this means if a system is a set of oscillators, different masses on different strength springs, for example, then in thermal equilibrium each oscillator has on average the same energy as all the others. For three-dimensional oscillators in thermal equilibrium, the average energy of each oscillator is \( 3kT\), where \(k\) is Boltzmann’s constant.
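As a worked check of this statement, consider one quadratic term in the energy of an oscillator, averaged with the Boltzmann weight \(e^{-E/kT}\). The spring-constant symbol \(\kappa\) is introduced here only for illustration (to avoid a clash with Boltzmann’s \(k\)); it is not notation from these notes.

```latex
\[
\left\langle \tfrac{1}{2}\kappa x^2 \right\rangle
= \frac{\int_{-\infty}^{+\infty} \tfrac{1}{2}\kappa x^2 \, e^{-\kappa x^2/2kT}\,dx}
       {\int_{-\infty}^{+\infty} e^{-\kappa x^2/2kT}\,dx}
= \tfrac{1}{2}kT
\]
```

The result does not depend on \(\kappa\), and the same integral applies to the kinetic term \(p^2/2m\). So a one-dimensional oscillator averages \(kT\), and a three-dimensional one \(3kT\), whatever its mass or spring strength.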

Black Body Radiation

Now put this together with Maxwell’s discovery that light is an electromagnetic wave: inside a hot oven, Maxwell’s equations can be solved yielding standing wave solutions, and the set of allowed standing waves of different wavelengths amounts to an infinite series of oscillators, with no upper limit on the frequencies as one goes far into the ultraviolet. Therefore, from the classical equipartition theorem, an oven at thermal equilibrium at a definite temperature should contain an infinite amount of energy (of order \(kT\) in each of an infinite number of modes), and if you let radiation out through a tiny hole in the side, you should see radiation of all frequencies.

This is not, of course, what is observed: as an oven is warmed, it emits infrared, then red, then yellow light, etc. This means that the higher frequency oscillators (blue, etc.) are in fact not excited at low temperatures: equipartition is not true.

Planck showed that the experimentally observed intensity/frequency curve was exactly reproduced if it was assumed that the radiation was quantized: light of frequency \(f\) could only be emitted in quanta—now photons—having energy \(hf\), \(h\) being Planck’s constant. This was the beginning of quantum mechanics.
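To see how quantization freezes out the high-frequency modes, here is a minimal numerical sketch. It assumes the standard result, not derived in these notes, that a mode restricted to energies \(0, hf, 2hf, \dots\) and weighted by \(e^{-E/kT}\) has mean energy \(hf/(e^{hf/kT}-1)\). The function names and the temperature of 1000 K are illustrative choices.

```python
import math

H = 6.62607015e-34   # Planck's constant, J s
K_B = 1.380649e-23   # Boltzmann's constant, J/K

def classical_mode_energy(temperature):
    """Equipartition: each standing-wave mode holds kT, whatever its frequency."""
    return K_B * temperature

def planck_mode_energy(frequency, temperature):
    """Mean energy of a mode restricted to 0, hf, 2hf, ... with Boltzmann weights."""
    x = H * frequency / (K_B * temperature)
    return H * frequency / math.expm1(x)  # expm1(x) = e^x - 1, accurate for small x

T = 1000.0  # illustrative oven temperature, K
for f in (1e12, 1e13, 1e14, 1e15):
    ratio = planck_mode_energy(f, T) / classical_mode_energy(T)
    print(f"f = {f:.0e} Hz: quantized / classical = {ratio:.3e}")
```

At low frequencies the ratio is close to one, so equipartition is recovered; far into the ultraviolet it is vanishingly small, which is why the total energy in the oven stays finite.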

The Photoelectric Effect

Einstein showed the same quantization of electromagnetic radiation explained the photoelectric effect. A photon of energy \(hf\) knocks an electron out of a metal; it takes a certain work \(W\) to get it out, and the rest of the photon energy goes to the kinetic energy of the electron. The fastest electrons emitted (those that come right from the surface, and so encounter no further resistance) therefore have kinetic energy \(hf - W\). Plotting the maximum electron kinetic energy as a function of incident light frequency confirms the hypothesis, giving the same value for \(h\) as that needed to explain radiation from an oven. (It had previously been assumed that more intense light would increase the kinetic energy; this turned out not to be the case.)
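The energy balance just described can be sketched in a few lines. The work function of 2.3 eV is an illustrative value chosen for this example, not one quoted in these notes, and the function name is hypothetical.

```python
H = 6.62607015e-34     # Planck's constant, J s
EV = 1.602176634e-19   # one electron volt, in joules

def max_kinetic_energy(frequency, work_function):
    """Einstein's relation hf = W + KE_max; zero below threshold (no electrons emitted)."""
    return max(H * frequency - work_function, 0.0)

W = 2.3 * EV           # illustrative work function (an assumption), J
threshold = W / H      # lowest frequency that can eject an electron, Hz

for f in (4e14, 6e14, 8e14, 1e15):
    ke_ev = max_kinetic_energy(f, W) / EV
    print(f"f = {f:.1e} Hz: KE_max = {ke_ev:.2f} eV")
```

Above threshold the plot of maximum kinetic energy against frequency is a straight line of slope \(h\), independent of the light intensity, which is how the experiment yields Planck’s constant.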

The Bohr Atom

Bohr put together this quantization of light energy with Rutherford’s discovery that the atom had a nucleus, with electrons somehow orbiting around it. For the hydrogen atom, light emitted when the atom is thermally excited has a particular pattern: the observed emitted wavelengths are given by

\[\dfrac{1}{\lambda}=R_H\left(\dfrac{1}{4}-\dfrac{1}{n^2}\right) \tag{1.1.1}\]

with \(n = 3, 4, 5, \dots\) (\(R_H\) is now called the Rydberg constant.) Bohr realized these were photons having energy equal to the energy difference between two allowed orbits of the electron circling the nucleus (the proton), \(E_n -E_m =hf\), leading to the conclusion that the allowed levels must be:

\[E_n =-\dfrac{hcR_H}{n^2} \tag{1.1.2}\]
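As a quick consistency check between Eqs. (1.1.1) and (1.1.2), the sketch below computes the Balmer wavelengths from the energy levels using \(E_n - E_2 = hf\). The numerical value of \(R_H\) is an approximate measured value supplied for illustration; it is not given in these notes.

```python
H = 6.62607015e-34   # Planck's constant, J s
C = 299792458.0      # speed of light, m/s
R_H = 1.09678e7      # Rydberg constant for hydrogen, 1/m (approximate; an assumption here)

def level_energy(n):
    """Allowed energy of level n from Eq. (1.1.2), in joules."""
    return -H * C * R_H / n**2

def balmer_wavelength(n):
    """Wavelength of the photon emitted in the jump n -> 2, using E_n - E_2 = hf."""
    frequency = (level_energy(n) - level_energy(2)) / H
    return C / frequency

for n in (3, 4, 5, 6):
    print(f"n = {n}: {balmer_wavelength(n) * 1e9:.1f} nm")
```

The \(n = 3\) line comes out near 656 nm (red) and the \(n = 4\) line near 486 nm (blue-green), the familiar visible hydrogen lines.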

How could the quantum \(hf\) restricting allowed radiation energies also restrict the allowed electron orbits? Bohr realized there must be a connection—because \(h\) has the dimensions of angular momentum! What if the electron were only allowed to be in circular orbits of angular momentum \(nKh\), with \(n\) an integer? Bohr did the math for orbits under an inverse square law, and found that the observed spectra were in fact correctly accounted for by taking \(K = 1/2\pi \).

But then he realized he did not even need the experimental results to find \(K\): quantum mechanics must agree with classical mechanics in the regime where we know experimentally that classical mechanics (including Maxwell’s equations) is correct, that is, for systems of macroscopic size. Consider a negative charge orbiting around a fixed positive charge at a radius of 10 cm., the charges being such that the speed is of order meters per second (we don’t want relativistic effects making things more complicated). Then from classical E&M, the charge will radiate at the orbital frequency. Now imagine this is actually a hydrogen atom, in a perfect vacuum, in a high state of excitation. It must be radiating at this same frequency. But Bohr’s theory can’t just be right for small orbits, so the radiation must satisfy \(E_n -E_m =hf\). The spacing between adjacent levels will vary slowly for these large orbits, so \(h\) times the orbital frequency must be the energy difference between adjacent levels. Now, that energy difference depends on the allowed angular momentum step between the adjacent levels: that is, on \(K\). Reconciling these two expressions for the radiation frequency gives \( K = 1/2\pi \).
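The classical limit argument can be checked numerically. The sketch below uses the classical circular-orbit results for an inverse square force (energy \(-m(e^2/4\pi\varepsilon_0)^2/2L^2\) and orbital frequency \(m(e^2/4\pi\varepsilon_0)^2/2\pi L^3\) for angular momentum \(L\)), sets \(L = nKh\), and compares the photon frequency for a jump between adjacent levels with the orbital frequency. The function names are hypothetical; \(n = 43000\) is chosen because it corresponds to an orbit radius of roughly 10 cm.

```python
import math

H = 6.62607015e-34          # Planck's constant, J s
M_E = 9.1093837015e-31      # electron mass, kg
E_CHARGE = 1.602176634e-19  # elementary charge, C
EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
KE2 = E_CHARGE**2 / (4 * math.pi * EPS0)  # e^2 / (4 pi eps0), J m

def orbit_energy(L):
    """Total energy of a classical circular orbit with angular momentum L."""
    return -M_E * KE2**2 / (2 * L**2)

def orbital_frequency(L):
    """Revolutions per second of a classical circular orbit with angular momentum L."""
    return M_E * KE2**2 / (2 * math.pi * L**3)

def frequency_ratio(n, K):
    """(E_{n+1} - E_n)/h divided by the orbital frequency of orbit n, with L = n K h."""
    photon_f = (orbit_energy((n + 1) * K * H) - orbit_energy(n * K * H)) / H
    return photon_f / orbital_frequency(n * K * H)

n = 43000  # macroscopic orbit, radius of order 10 cm
for K in (1 / (2 * math.pi), 1.0):
    print(f"K = {K:.4f}: photon frequency / orbital frequency = {frequency_ratio(n, K):.5f}")
```

For large \(n\) the ratio tends to \(2\pi K\), so the two frequencies agree only when \(K = 1/2\pi\).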

This classical limit argument, then, predicts the Rydberg constant in terms of already known quantities:

\[R_H= \left(\dfrac{1}{4\pi\varepsilon_0} \right)^2 \cdot\dfrac{2\pi^2 me^4}{ch^3}. \tag{1.1.3}\]
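Equation (1.1.3) is easy to evaluate with modern values of the constants. The following is a minimal sketch; the constant values are standard SI values supplied here, and the function name is hypothetical.

```python
import math

M_E = 9.1093837015e-31      # electron mass, kg
E_CHARGE = 1.602176634e-19  # elementary charge, C
EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
H = 6.62607015e-34          # Planck's constant, J s
C = 299792458.0             # speed of light, m/s

def rydberg_from_constants():
    """Evaluate Eq. (1.1.3) factor by factor; result in inverse metres."""
    coulomb_factor = 1.0 / (4.0 * math.pi * EPS0)
    return coulomb_factor**2 * 2.0 * math.pi**2 * M_E * E_CHARGE**4 / (C * H**3)

R = rydberg_from_constants()
print(f"R = {R:.6e} per metre")
```

The result is close to \(1.0974 \times 10^7\ \text{m}^{-1}\). Since the formula treats the nucleus as fixed, it lies slightly (about 0.05%) above the value measured for hydrogen, where the proton’s finite mass matters.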

What’s right about the Bohr atom?

It gives the Balmer series spectra.
The first orbit size is close to the observed size of the atom: and remember there are no adjustable parameters, the classical limit argument determines the spectra and the size.

What’s wrong with the Bohr atom?

No explanation for why angular momentum should be quantized. (This was solved by de Broglie a little later.)
Why don’t the circling electrons radiate, as predicted classically? Well, the fact that radiation is quantized means the classical picture of an accelerating charge smoothly emitting radiation cannot work if the energies involved are of order \(h\) times the frequencies involved.
The lowest state has nonzero angular momentum. This is a defect of the model, corrected in the truly quantum model (Schrödinger’s equation).
In an inverse square field, orbits are in general elliptical.

This was at first a puzzle: why should there be only circular orbits allowed? In fact, the model does allow elliptical orbits, and they do not show up in the Balmer series because, as proved by Sommerfeld, if the allowed elliptical orbits have the same allowed angular momenta as Bohr’s orbits, they have the same set of energies. This is a special property of the inverse square force.

De Broglie Waves

The first explanation of why only certain angular momenta are allowed for the circling electron was given by de Broglie: just as photons act like particles (definite energy and momentum), but undoubtedly are wave like, being light, so particles like electrons perhaps have wave like properties. For photons, the relationship between wavelength and momentum is \(p = h/\lambda\). Assuming this is also true of electrons, and that the allowed circular orbits are standing waves, Bohr’s angular momentum quantization follows.
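Spelled out, the standing wave condition is that a whole number \(n\) of wavelengths fits around a circular orbit of radius \(r\):

```latex
\[
2\pi r = n\lambda, \qquad p = \frac{h}{\lambda}
\quad\Longrightarrow\quad
L = pr = \frac{h}{\lambda}\cdot\frac{n\lambda}{2\pi} = \frac{nh}{2\pi} = n\hbar .
\]
```

This is exactly Bohr’s condition of angular momentum \(nKh\) with \(K = 1/2\pi\).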

Schrödinger’s Wave Equation

De Broglie’s idea was clearly on the right track, but waves in space are three-dimensional: thinking of the circular orbit as a string under tension cannot be right, even if the answer is.

Photon waves (electromagnetic waves) obey the equation

\[ \nabla^2 \vec E -\dfrac{1}{c^2} \dfrac{\partial^2 \vec E}{\partial t^2}=0 \tag{1.1.4}\]

A solution of definite momentum is the plane wave

\[ \left(\dfrac{\partial^2}{\partial x^2} -\dfrac{1}{c^2}\dfrac{\partial^2}{\partial t^2} \right) \vec E_0 e^{i(kx-\omega t)} = -\left(k^2 -\dfrac{\omega^2}{c^2} \right)\vec E_0 e^{i(kx-\omega t)} =0 \tag{1.1.5}\]

Notice that the last equality is essentially just \(\omega =ck\), that is, \(E = cp\) for a photon with \(E = \hbar\omega\) and \(p = \hbar k\). In other words, for a plane wave solution the energy and momentum of the photon have been translated into differential operators with respect to time and space respectively, to give a differential equation for the wave.

Schrödinger’s wave equation is equivalent to taking the (nonrelativistic) energy-momentum relation \(E = p^2/2m\) and using the same recipe to translate it into a differential equation:

\[ i\hbar \dfrac{\partial \psi(x,t)}{\partial t} =-\dfrac{\hbar^2}{2m} \dfrac{\partial^2 \psi(x,t)}{\partial x^2} \tag{1.1.6}\]
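To see the recipe at work, apply each side of this equation to a plane wave \(\psi = A e^{i(kx-\omega t)}\), using the de Broglie and Planck relations \(p = \hbar k\) and \(E = \hbar\omega\) (the amplitude \(A\) is just a constant introduced for this check):

```latex
\[
i\hbar \frac{\partial \psi}{\partial t} = \hbar\omega\,\psi = E\,\psi,
\qquad
-\frac{\hbar^2}{2m}\frac{\partial^2 \psi}{\partial x^2} = \frac{\hbar^2 k^2}{2m}\,\psi = \frac{p^2}{2m}\,\psi .
\]
```

So the plane wave satisfies the equation precisely when \(E = p^2/2m\): the operator \(i\hbar\,\partial/\partial t\) stands in for the energy and \(-i\hbar\,\partial/\partial x\) for the momentum.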

Making the natural extension to three dimensions, and assuming we can add a potential term in the most naïve way possible, that is, going from \(E = p^2/2m\) to \(E = p^2/2m + V(x,y,z)\), we get

\[ i\hbar \dfrac{\partial \psi(x,y,z,t)}{\partial t}=-\dfrac{\hbar^2}{2m} \nabla^2 \psi(x,y,z,t) +V(x,y,z) \psi(x,y,z,t) \tag{1.1.7}\]

This is the equation Schrödinger wrote down and solved. The solutions gave the same set of energies as the Bohr model, but now the ground state had zero angular momentum, and many of the details of the solutions were borne out by experiment, as we shall discuss further later.

A Conserved Current

Schrödinger also showed that a conserved current could be defined in terms of the wave function \(\psi \):

\[ \dfrac{\partial \rho}{\partial t} +\vec \nabla \cdot \vec j =0 \tag{1.1.8}\]

where

\(\rho =\psi^\ast \psi =|\psi |^2 \) and
\(\vec j =\dfrac{\hbar}{2mi} (\psi^\ast \vec \nabla \psi -\psi \vec \nabla \psi^\ast ). \)
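The conservation law can be verified directly from the Schrödinger equation (1.1.7), assuming the potential \(V\) is real. A sketch of the steps:

```latex
\[
\frac{\partial \rho}{\partial t}
= \psi^\ast \frac{\partial \psi}{\partial t} + \psi \frac{\partial \psi^\ast}{\partial t},
\qquad
\frac{\partial \psi}{\partial t} = \frac{i\hbar}{2m}\nabla^2\psi - \frac{i}{\hbar}V\psi,
\qquad
\frac{\partial \psi^\ast}{\partial t} = -\frac{i\hbar}{2m}\nabla^2\psi^\ast + \frac{i}{\hbar}V\psi^\ast
\]
\[
\frac{\partial \rho}{\partial t}
= \frac{i\hbar}{2m}\left(\psi^\ast \nabla^2 \psi - \psi \nabla^2 \psi^\ast\right)
= -\vec\nabla \cdot \left[\frac{\hbar}{2mi}\left(\psi^\ast \vec\nabla \psi - \psi \vec\nabla \psi^\ast\right)\right]
= -\vec\nabla \cdot \vec j .
\]
```

The potential terms cancel because \(V\) is real, and the cross terms \(\vec\nabla\psi^\ast \cdot \vec\nabla\psi\) cancel when the divergence is expanded.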

Schrödinger’s interpretation of his equation was that the electron was simply a wave, not a particle, and this was the wave intensity. But thinking of electromagnetic waves in this way gave no clue to the quantum photon behavior—this could not be the whole story.

Interpreting the Wave Function

The correct interpretation of the wave function (due to Born) follows from analogy to the electromagnetic case. Let’s review that briefly. The basic example is the two-slit diffraction pattern, as built up by sending through one photon at a time, to a bank of photon detectors. The pattern gradually emerges: solve the wave equation, then the predicted local energy density (proportional to \(|E(x,y,z,t)|^2 dxdydz\)) gives the probability of one photon going through the system landing at that spot.

Born suggested that similarly \( |\psi |^2 \) at any point was proportional to the probability of detecting the electron at that point. This has turned out to be correct.

Localizing the Electron

Despite its wavelike properties, we know that an electron can behave like a particle: specifically, it can move as a fairly localized entity from one place to another. What’s the wave representation of that? It’s called a wave packet: a localized wave excitation. To see how this can come about, first remember that the Schrödinger equation is a linear equation: the sum of any two or more solutions is itself a solution. If we add together two plane waves close in wavelength, we get beats, which can be regarded as a string of wave packets. To get a single wave packet, we must add together a continuous range of wavelengths.

The standard example is the Gaussian wave packet, \( \psi (x, t=0) =A e^{ik_0 x} e^{-x^2 /2 \Delta^2} \), where \( p_0 = \hbar k_0 \).
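The beats can be made explicit for two plane waves of equal amplitude at a fixed time (the wavenumbers \(k_1\) and \(k_2\) are labels introduced for this illustration):

```latex
\[
e^{ik_1 x} + e^{ik_2 x}
= 2\cos\!\left(\frac{(k_1 - k_2)\,x}{2}\right) e^{i(k_1 + k_2)x/2} .
\]
```

The result is a wave at the average wavenumber whose amplitude is modulated by a slowly varying cosine envelope. The envelope repeats indefinitely, which is why only a continuous range of wavenumbers can produce a single isolated packet.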

Using the standard result \[ \int\limits_{-\infty}^{+\infty} e^{-a x^2}\,dx = \sqrt{ \frac{ \pi}{a}} \tag{1.1.9} \]

we find \( |A|^2 =(\pi \Delta^2)^{-1/2} \) so \[ \psi (x,t=0) =\frac{1}{(\pi \Delta^2)^{1/4}} e^{ik_0 x} e^{-x^2 /2 \Delta^2} . \tag{1.1.10}\]
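The normalization is easy to confirm numerically. The sketch below integrates \(|\psi|^2\) for Eq. (1.1.10) with a simple midpoint rule; the function names, the integration range, and the sample values of \(\Delta\) and \(k_0\) are arbitrary choices for illustration.

```python
import cmath
import math

def gaussian_packet(x, delta, k0):
    """Eq. (1.1.10): the normalized Gaussian wave packet at t = 0."""
    norm = (math.pi * delta**2) ** -0.25
    return norm * cmath.exp(1j * k0 * x) * math.exp(-x**2 / (2 * delta**2))

def total_probability(delta, k0, half_width=12.0, steps=4000):
    """Midpoint-rule integral of |psi|^2 over [-half_width*delta, +half_width*delta]."""
    x_max = half_width * delta
    dx = 2 * x_max / steps
    total = 0.0
    for j in range(steps):
        x = -x_max + (j + 0.5) * dx
        total += abs(gaussian_packet(x, delta, k0)) ** 2 * dx
    return total

print(total_probability(delta=1.3, k0=2.0))
```

The integral comes out equal to one to within rounding error, and it does not depend on \(k_0\), since the phase factor \(e^{ik_0 x}\) has unit modulus.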

But how do we construct this particular wavepacket by superposing plane waves? That is to say, we need a representation of the form: \[ \psi(x) =\int\limits_{-\infty}^{+\infty} \frac{dk}{2\pi} e^{ikx} \phi (k) \tag{1.1.11} \]

The function \( \phi(k) \) represents the weighting of plane waves in the neighborhood of wavenumber \(k\). This is a particular example of a Fourier transform; we will be discussing the general case in detail a little later in the course. Note that if \( \phi(k) \) is a bounded function, any particular \(k\) value gives a vanishingly small contribution: the plane-wave contribution to \(\psi(x)\) from a range \(dk\) is \( \phi(k) dk/2\pi \). In fact, \(\phi(k)\) is given in terms of \(\psi(x)\) by \[ \phi(k) =\int\limits_{-\infty}^{+\infty} dx\, e^{-ikx} \psi(x) . \tag{1.1.12} \]

It is perhaps worth mentioning at this point that this can be understood qualitatively by observing that the plane wave prefactor \(e^{-ikx}\) will interfere destructively with all plane wave components of \( \psi(x) \) except that of wavenumber \(k\). There it may at first appear that the contribution is infinite, but recall that, as stated above, any particular \(k\) component has a vanishingly small weight, and in fact this is the right answer, as we shall show in more convincing fashion later.

In the present case, the above handwaving argument is unnecessary, because both the integrals can be carried out exactly, using the standard result: \[ \int\limits_{-\infty}^{\infty} e^{-ax^2 +bx}\, dx =e^{b^2 /4a}\sqrt{\frac{\pi}{a}} \tag{1.1.13} \]

giving \[ \phi(k) =(4\pi \Delta^2)^{\frac{1}{4}} e^{-\Delta^2 (k-k_0)^2 /2} . \tag{1.1.14} \]
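As a numerical cross-check of Eq. (1.1.14), the sketch below evaluates the integral (1.1.12) for the Gaussian packet by a midpoint rule and compares it with the closed form. The function names, grid parameters, and sample values of \(\Delta\), \(k_0\) and \(k\) are illustrative assumptions.

```python
import cmath
import math

def psi(x, delta, k0):
    """Eq. (1.1.10): normalized Gaussian wave packet at t = 0."""
    return (math.pi * delta**2) ** -0.25 * cmath.exp(1j * k0 * x) * math.exp(-x**2 / (2 * delta**2))

def phi_numeric(k, delta, k0, half_width=12.0, steps=4000):
    """Midpoint-rule estimate of Eq. (1.1.12), the integral of e^{-ikx} psi(x) dx."""
    x_max = half_width * delta
    dx = 2 * x_max / steps
    total = 0j
    for j in range(steps):
        x = -x_max + (j + 0.5) * dx
        total += cmath.exp(-1j * k * x) * psi(x, delta, k0) * dx
    return total

def phi_exact(k, delta, k0):
    """Closed form of Eq. (1.1.14)."""
    return (4 * math.pi * delta**2) ** 0.25 * math.exp(-delta**2 * (k - k0) ** 2 / 2)

delta, k0 = 1.3, 2.0
for k in (0.5, 2.0, 2.7):
    print(k, abs(phi_numeric(k, delta, k0) - phi_exact(k, delta, k0)))
```

The differences printed are at the level of rounding error, and the numerical transform is real and peaked at \(k = k_0\), as the closed form requires.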

The Uncertainty Principle

Note that the spreads in x-space and p-space are inversely related: \(\Delta x\) is of order \(\Delta\), \( \Delta p=\hbar \Delta k\sim \hbar/\Delta \). This is of course the Uncertainty Principle: localization in x-space requires a large spread in contributing momentum states.

It’s worth reviewing the undergraduate exercises on applications of the uncertainty principle. They help sharpen one’s appreciation of the wave/particle nature of quantum objects.
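For the Gaussian packet the order-of-magnitude statement can be made exact by taking \(\Delta x\) and \(\Delta k\) to be root-mean-square widths of the probability distributions \(|\psi(x)|^2\) and \(|\phi(k)|^2\) (one common convention, adopted here as an assumption):

```latex
\[
|\psi(x)|^2 \propto e^{-x^2/\Delta^2} \;\Rightarrow\; \Delta x = \frac{\Delta}{\sqrt{2}},
\qquad
|\phi(k)|^2 \propto e^{-\Delta^2 (k-k_0)^2} \;\Rightarrow\; \Delta k = \frac{1}{\sqrt{2}\,\Delta},
\qquad
\Delta x\,\Delta p = \hbar\,\Delta x\,\Delta k = \frac{\hbar}{2} .
\]
```

The product is independent of \(\Delta\): squeezing the packet in space broadens it in momentum by exactly the compensating factor.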

There’s a limit to how well the position of an electron can be determined: it is detected by bouncing a photon off of it, and the photon wavelength sets the limit on \(\Delta x\). But if the photon has enough energy to create an electron-positron pair out of the vacuum, you can’t be sure which electron you’re seeing. This limits \(\Delta x \sim \hbar/mc\) at best. (This is called the Compton wavelength, written \(\lambda_c\); it appears in Compton scattering.) How much smaller than a hydrogen atom ground state wave function is this? \(\lambda_c /a_0 =e^2/\hbar c \ (\text{CGS}) =e^2/4\pi \varepsilon_0 \hbar c \ (\text{SI}) \approx 1/137\), known as the fine structure constant. This is also the ratio of the electron speed in the first Bohr orbit to the speed of light, and so is an indication of the importance of relativistic corrections to energies of electron states. These corrections differ for circular and elliptical orbits that have the same energy when calculated nonrelativistically, and the resulting energy differences lead to fine structure in the atomic spectra.
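The ratio quoted above can be checked numerically in SI units. In this sketch \(\lambda_c = \hbar/mc\) as in the text and \(a_0 = 4\pi\varepsilon_0\hbar^2/me^2\) is the radius of the first Bohr orbit; the variable names and constant values are supplied for illustration.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J s
M_E = 9.1093837015e-31      # electron mass, kg
C = 299792458.0             # speed of light, m/s
E_CHARGE = 1.602176634e-19  # elementary charge, C
EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m

compton = HBAR / (M_E * C)                                          # lambda_c = hbar / mc, m
bohr_radius = 4 * math.pi * EPS0 * HBAR**2 / (M_E * E_CHARGE**2)    # a_0, m
alpha = E_CHARGE**2 / (4 * math.pi * EPS0 * HBAR * C)               # fine structure constant

print(f"lambda_c = {compton:.4e} m, a_0 = {bohr_radius:.4e} m")
print(f"lambda_c / a_0 = {compton / bohr_radius:.6f}, 1/alpha = {1 / alpha:.3f}")
```

The two ways of computing the ratio agree, and its inverse is close to 137.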

Contributor

Michael Fowler (Beams Professor, Department of Physics, University of Virginia)