MODERN
PHYSICS
Third edition
Kenneth S. Krane
DEPARTMENT OF PHYSICS
OREGON STATE UNIVERSITY
JOHN WILEY & SONS, INC
VP AND EXECUTIVE PUBLISHER     Kaye Pace
EXECUTIVE EDITOR               Stuart Johnson
MARKETING MANAGER              Christine Kushner
DESIGN DIRECTOR                Jeof Vita
DESIGNER                       Kristine Carney
PRODUCTION MANAGER             Janis Soo
ASSISTANT PRODUCTION EDITOR    Elaine S. Chew
PHOTO DEPARTMENT MANAGER       Hilary Newman
PHOTO EDITOR                   Sheena Goldstein
COVER DESIGNER                 Seng Ping Ngieng
COVER IMAGE                    CERN/SCIENCE PHOTO LIBRARY/Photo Researchers, Inc.
This book was set in Times by Laserwords Private Limited and printed and bound by R. R. Donnelley and Sons Company, Von
Hoffman. The cover was printed by R. R. Donnelley and Sons Company, Von Hoffman.
This book is printed on acid free paper.
Founded in 1807, John Wiley & Sons, Inc. has been a valued source of knowledge and understanding for more than 200 years,
helping people around the world meet their needs and fulfill their aspirations. Our company is built on a foundation of principles that
include responsibility to the communities we serve and where we live and work. In 2008, we launched a Corporate Citizenship
Initiative, a global effort to address the environmental, social, economic, and ethical challenges we face in our business. Among the
issues we are addressing are carbon impact, paper specifications and procurement, ethical conduct within our business and among our
vendors, and community and charitable support. For more information, please visit our website: www.wiley.com/go/citizenship.
Copyright 2012, 1996, 1983, John Wiley & Sons, Inc. All rights reserved. No part of this publication may be reproduced, stored in a
retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise,
except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of
the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc. 222
Rosewood Drive, Danvers, MA 01923, website www.copyright.com. Requests to the Publisher for permission should be addressed to
the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030-5774, (201)748–6011, fax
(201)748–6008, website http://www.wiley.com/go/permissions.
Evaluation copies are provided to qualified academics and professionals for review purposes only, for use in their courses during the
next academic year. These copies are licensed and may not be sold or transferred to a third party. Upon completion of the review
period, please return the evaluation copy to Wiley. Return instructions and a free of charge return mailing label are available at
www.wiley.com/go/returnlabel. If you have chosen to adopt this textbook for use in your course, please accept this book as your
complimentary desk copy. Outside of the United States, please contact your local sales representative.
Library of Congress Cataloging-in-Publication Data
Krane, Kenneth S.
Modern physics/Kenneth S. Krane. -- 3rd ed.
p. cm.
Includes bibliographical references and index.
ISBN 978-1-118-06114-5 (hardback)
1. Physics. I. Title.
QC21.2.K7 2012
539--dc23
2011039948
Printed in the United States of America
10 9 8 7 6 5 4 3 2 1
PREFACE
This textbook is meant to serve a first course in modern physics, including
relativity, quantum mechanics, and their applications. Such a course often follows
the standard introductory course in calculus-based classical physics. The course
addresses two different audiences: (1) Physics majors, who will later take a
more rigorous course in quantum mechanics, find an introductory modern course
helpful in providing background for the rigors of their imminent coursework
in classical mechanics, thermodynamics, and electromagnetism. (2) Nonmajors,
who may take no additional physics class, find an increasing need for concepts
from modern physics in their disciplines—a classical introductory course is not
sufficient background for chemists, computer scientists, nuclear and electrical
engineers, or molecular biologists.
Necessary prerequisites for undertaking the text include any standard calculus-based course covering mechanics, electromagnetism, thermal physics, and optics.
Calculus is used extensively, but no previous knowledge of differential equations,
complex variables, or partial derivatives is assumed (although some familiarity
with these topics would be helpful).
Chapters 1–8 constitute the core of the text. They cover special relativity and
quantum theory through atomic structure. At that point the reader may continue
with Chapters 9–11 (molecules, quantum statistics, and solids) or branch to
Chapters 12–14 (nuclei and particles). The final chapter covers cosmology and
can be considered the capstone of modern physics as it brings together topics from
relativity (special and general) as well as from nearly all of the previous material
covered in the text.
The unifying theme of the text is the empirical basis of modern physics.
Experimental tests of derived properties are discussed throughout. These include
the latest tests of special and general relativity as well as studies of wave-particle
duality for photons and material particles. Applications of basic phenomena are
extensively presented, and data from the literature are used not only to illustrate
those phenomena but to offer insight into how “real” physics is done. Students
using the text have the opportunity to study how laboratory results and the analysis
based on quantum theory go hand-in-hand to illuminate such diverse topics as
Bose-Einstein condensation, heat capacities of solids, paramagnetism, the cosmic
microwave background radiation, X-ray spectra, dilute mixtures of ³He in ⁴He,
and molecular spectroscopy of the interstellar medium.
This third edition offers many changes from the previous edition. Most of the
chapters have undergone considerable or complete rewriting. New topics have
been introduced and others have been rearranged. More experimental results are
presented and recent discoveries are highlighted, such as the WMAP microwave
background data and Bose-Einstein condensation. End-of-chapter problem sets
now include problems organized according to chapter section, which offer the
student an opportunity to gain familiarity with a particular topic, as well as general
problems, which often require the student to apply a broader array of concepts or
techniques. The number of worked examples in the chapters and the number of
end-of-chapter questions and problems have each increased by about 15% from
the previous edition. The range of abilities required to solve the problems has been
broadened, so that this edition includes both more straightforward problems that
build confidence and more difficult problems that will challenge students.
Each chapter now includes a brief summary of the important points. Some of
the end-of-chapter problems are available for assignment using the WebAssign
program (www.webassign.net).
A new development in physics teaching since the appearance of the 2nd edition
of this text has been the availability of a large and robust body of literature from
physics education research (PER). My own teaching style has been profoundly
influenced by PER findings, and in preparing this new edition I have tried
to incorporate PER results wherever possible. One of the major themes that
has emerged from PER in the past decade or two is that students can often
learn successful algorithms for solving problems while lacking a fundamental
understanding of the underlying concepts. Many approaches to addressing this
problem are based on pre-class conceptual exercises and in-class individual or
group activities that help students to reason through diverse problems that can’t be
resolved by plugging numbers into an equation. It is absolutely essential to devote
class time to these exercises and to follow through with exam questions that
require similar analysis and articulation of the conceptual reasoning. More details
regarding the application of PER to the teaching of modern physics, including
references to articles from the PER literature, are included in the Instructor’s
Manual for this text, which can be found at www.wiley.com/college/krane. The
Instructor’s Manual also includes examples of conceptual questions for in-class
discussion or exams that have been developed and class tested through the support
of a Course, Curriculum and Laboratory Improvement grant from the National
Science Foundation.
Specific changes to the chapters include the following:
Chapter 1: The sections on Units and Dimensions and on Significant Figures
have been removed. In their place, a more detailed review of applications
of classical energy and momentum conservation is offered. The need for
special relativity is briefly established with a discussion of the failures of
the classical concepts of space and time, and the need for quantum theory is
previewed in the failure of Maxwell-Boltzmann particle statistics to account
for the heat capacities of diatomic gases.
Chapter 2: Spacetime diagrams have been introduced to help illustrate relationships in the twin paradox. The application of the relativistic conservation
laws to decay and collision processes is now given a separate section to
help students learn to apply those laws. The section on tests of special
relativity has been updated to include recent results.
Chapter 3: The section on thermal radiation has been rewritten, and more
detailed derivations of the Rayleigh-Jeans and Planck formulas are now
given.
Chapter 4: New experimental results for particle diffraction and interference
are discussed. The sections on the classical uncertainty relationships and on
wave packet construction and motion have been rewritten.
Chapter 5: To help students understand the processes involved in applying
boundary conditions to solutions of the Schrödinger equation, a new section
on wave boundary conditions has been added. A new introductory section
on particle confinement introduces energy quantization and helps to build
the connection between the wave function and the uncertainty relationships.
Time dependence of the wave function is introduced more explicitly at an
earlier stage in the formalism. Graphic illustrations for step and barrier
problems now show the real and imaginary parts of the wave function as
well as its squared magnitude.
Chapter 6: The derivation of the Thomson model scattering angle has been
modified, and the section on deficiencies of the Bohr model has been
rewritten.
Chapter 7: To ease the entry into the 3-dimensional Schrödinger analysis of
the hydrogen atom in spherical coordinates, a new section on the one-dimensional hydrogen atom has been added. Angular momentum concepts
relating to the hydrogen atom are now introduced before the full solutions
to the wave equation.
Chapter 8: Much of the material has been reorganized for clarity and ease of
presentation. The screening discussion has been made more explicit.
Chapter 9: More emphasis has been given to the use of bonding and antibonding
orbitals to predict the relative stability of molecules. Sections on molecular
vibrations and rotations have been rewritten.
Chapter 10: This chapter has been extensively rewritten. A new section on
the density of states function allows statistical distributions for photons or
particles to be discussed more rigorously. New applications of quantum
statistics include Bose-Einstein condensation, white dwarf stars, and dilute
mixtures of ³He in ⁴He.
Chapter 11: The chapter has been rewritten to broaden the applications of the
quantum theory of solids to include not only electrical conductivity but also
the heat capacity of solids and paramagnetism.
Chapter 12: To emphasize the unity of various topics within modern physics,
this chapter now includes proton and neutron separation energies, a new
section on quantum states in nuclei, and nuclear vibrational and rotational
states, all of which have analogues in atomic or molecular structure.
Chapter 13: The discussion of the physics of fission has been expanded while
that of the properties of nuclear reactors has been reduced somewhat.
Because much current research in nuclear physics is related to astrophysics,
this chapter now features a section on nucleosynthesis.
Chapter 14: New material on quarkonium and neutrino oscillations has been
added.
Chapter 15: Chapters 15 and 16 of the 2nd edition have been collapsed into
a single chapter on cosmology. New results from COBE and WMAP are
included, along with discussions of the horizon and flatness problems (and
their inflationary solution).
Many reviewers and class-testers of the manuscript of this edition have offered
suggestions to improve both the physics and its presentation. I am particularly
grateful to:
David Bannon, Oregon State University
Gerald Crawford, Fort Lewis College
Luther Frommhold, University of Texas-Austin
Gary Goldstein, Tufts University
Leon Gunther, Tufts University
Gary Ihas, University of Florida
Paul Lee, California State University, Northridge
Jeff Loats, Metropolitan State College of Denver
Jay Newman, Union College
Stephen Pate, New Mexico State University
David Roundy, Oregon State University
Rich Schelp, Erskine College
Weidian Shen, Eastern Michigan University
Hongtao Shi, Sonoma State University
Janet Tate, Oregon State University
Jeffrey L. Wragg, College of Charleston
Weldon Wilson, University of Central Oklahoma
I am also grateful for the many anonymous comments from students who used
the manuscript at the test sites. I am indebted to all those reviewers and users for
their contributions to the project.
Funding for the development and testing of the supplemental exercises in the
Instructor’s Manual was provided through a grant from the National Science
Foundation. I am pleased to acknowledge their support. Two graduate students
at Oregon State University helped to test and implement the curricular reforms:
K. C. Walsh and Pornrat Wattasinawich. I appreciate their assistance in this
project.
The staff at John Wiley & Sons have been especially helpful throughout the
project. I am particularly grateful to: Executive Editor Stuart Johnson for his
patience and support in bringing the new edition into reality; Assistant Production
Editor Elaine Chew for handling a myriad of complicated composition and
illustration details with efficiency and good humor; and Photo Editor Sheena
Goldstein for helping me navigate the treacherous waters of new copyright and
permission restrictions.
In my research and other professional activities, I occasionally meet physicists
who used earlier editions of this text when they were students. Some report that
their first exposure to modern physics kindled the spark that led them to careers
in physics. For many students, this course offers their first insights into what
physicists really do and what is exciting, perplexing, and challenging about our
profession. I hope students who use this new edition will continue to find those
inspirations.
Corvallis, Oregon
August 2011
Kenneth S. Krane
CONTENTS

Preface

1. The Failures of Classical Physics
1.1 Review of Classical Physics
1.2 The Failure of Classical Concepts of Space and Time
1.3 The Failure of the Classical Theory of Particle Statistics
1.4 Theory, Experiment, Law
Questions
Problems

2. The Special Theory of Relativity
2.1 Classical Relativity
2.2 The Michelson-Morley Experiment
2.3 Einstein’s Postulates
2.4 Consequences of Einstein’s Postulates
2.5 The Lorentz Transformation
2.6 The Twin Paradox
2.7 Relativistic Dynamics
2.8 Conservation Laws in Relativistic Decays and Collisions
2.9 Experimental Tests of Special Relativity
Questions
Problems

3. The Particlelike Properties of Electromagnetic Radiation
3.1 Review of Electromagnetic Waves
3.2 The Photoelectric Effect
3.3 Thermal Radiation
3.4 The Compton Effect
3.5 Other Photon Processes
3.6 What is a Photon?
Questions
Problems

4. The Wavelike Properties of Particles
4.1 De Broglie’s Hypothesis
4.2 Experimental Evidence for De Broglie Waves
4.3 Uncertainty Relationships for Classical Waves
4.4 Heisenberg Uncertainty Relationships
4.5 Wave Packets
4.6 The Motion of a Wave Packet
4.7 Probability and Randomness
Questions
Problems

5. The Schrödinger Equation
5.1 Behavior of a Wave at a Boundary
5.2 Confining a Particle
5.3 The Schrödinger Equation
5.4 Applications of the Schrödinger Equation
5.5 The Simple Harmonic Oscillator
5.6 Steps and Barriers
Questions
Problems

6. The Rutherford-Bohr Model of the Atom
6.1 Basic Properties of Atoms
6.2 Scattering Experiments and the Thomson Model
6.3 The Rutherford Nuclear Atom
6.4 Line Spectra
6.5 The Bohr Model
6.6 The Franck-Hertz Experiment
6.7 The Correspondence Principle
6.8 Deficiencies of the Bohr Model
Questions
Problems

7. The Hydrogen Atom in Wave Mechanics
7.1 A One-Dimensional Atom
7.2 Angular Momentum in the Hydrogen Atom
7.3 The Hydrogen Atom Wave Functions
7.4 Radial Probability Densities
7.5 Angular Probability Densities
7.6 Intrinsic Spin
7.7 Energy Levels and Spectroscopic Notation
7.8 The Zeeman Effect
7.9 Fine Structure
Questions
Problems

8. Many-Electron Atoms
8.1 The Pauli Exclusion Principle
8.2 Electronic States in Many-Electron Atoms
8.3 Outer Electrons: Screening and Optical Transitions
8.4 Properties of the Elements
8.5 Inner Electrons: Absorption Edges and X Rays
8.6 Addition of Angular Momenta
8.7 Lasers
Questions
Problems

9. Molecular Structure
9.1 The Hydrogen Molecule
9.2 Covalent Bonding in Molecules
9.3 Ionic Bonding
9.4 Molecular Vibrations
9.5 Molecular Rotations
9.6 Molecular Spectra
Questions
Problems

10. Statistical Physics
10.1 Statistical Analysis
10.2 Classical and Quantum Statistics
10.3 The Density of States
10.4 The Maxwell-Boltzmann Distribution
10.5 Quantum Statistics
10.6 Applications of Bose-Einstein Statistics
10.7 Applications of Fermi-Dirac Statistics
Questions
Problems

11. Solid-State Physics
11.1 Crystal Structures
11.2 The Heat Capacity of Solids
11.3 Electrons in Metals
11.4 Band Theory of Solids
11.5 Superconductivity
11.6 Intrinsic and Impurity Semiconductors
11.7 Semiconductor Devices
11.8 Magnetic Materials
Questions
Problems

12. Nuclear Structure and Radioactivity
12.1 Nuclear Constituents
12.2 Nuclear Sizes and Shapes
12.3 Nuclear Masses and Binding Energies
12.4 The Nuclear Force
12.5 Quantum States in Nuclei
12.6 Radioactive Decay
12.7 Alpha Decay
12.8 Beta Decay
12.9 Gamma Decay and Nuclear Excited States
12.10 Natural Radioactivity
Questions
Problems

13. Nuclear Reactions and Applications
13.1 Types of Nuclear Reactions
13.2 Radioisotope Production in Nuclear Reactions
13.3 Low-Energy Reaction Kinematics
13.4 Fission
13.5 Fusion
13.6 Nucleosynthesis
13.7 Applications of Nuclear Physics
Questions
Problems

14. Elementary Particles
14.1 The Four Basic Forces
14.2 Classifying Particles
14.3 Conservation Laws
14.4 Particle Interactions and Decays
14.5 Energy and Momentum in Particle Decays
14.6 Energy and Momentum in Particle Reactions
14.7 The Quark Structure of Mesons and Baryons
14.8 The Standard Model
Questions
Problems

15. Cosmology: The Origin and Fate of the Universe
15.1 The Expansion of the Universe
15.2 The Cosmic Microwave Background Radiation
15.3 Dark Matter
15.4 The General Theory of Relativity
15.5 Tests of General Relativity
15.6 Stellar Evolution and Black Holes
15.7 Cosmology and General Relativity
15.8 The Big Bang Cosmology
15.9 The Formation of Nuclei and Atoms
15.10 Experimental Cosmology
Questions
Problems

Appendix A: Constants and Conversion Factors
Appendix B: Complex Numbers
Appendix C: Periodic Table of the Elements
Appendix D: Table of Atomic Masses
Answers to Odd-Numbered Problems
Photo Credits
Index
Index to Tables
Chapter 1
THE FAILURES OF CLASSICAL PHYSICS
[Figure: Cassini interplanetary trajectory, showing the orbits of Venus, Earth, Jupiter, and Saturn. Launch 15 Oct 1997; Venus swingby 26 Apr 1998; deep space maneuver 3 Dec 1998; Venus swingby 24 Jun 1999; Earth swingby 18 Aug 1999; Jupiter swingby 30 Dec 2000; Saturn arrival 1 Jul 2004.]
Classical physics, as postulated by Newton, has enabled us to send space probes on
trajectories involving many complicated maneuvers, such as the Cassini mission to Saturn,
which was launched in 1997 and gained speed for its trip to Saturn by performing four
“gravity-assist” flybys of Venus (twice), Earth, and Jupiter. The spacecraft arrived at Saturn
in 2004 and is expected to continue to send data through at least 2017. Planning and
executing such interplanetary voyages are great triumphs for Newtonian physics, but when
objects move at speeds close to the speed of light or when we examine matter on the atomic
or subatomic scale, Newtonian mechanics is not adequate to explain our observations, as
we discuss in this chapter.
If you were a physicist living at the end of the 19th century, you probably would
have been pleased with the progress that physics had made in understanding the
laws that govern the processes of nature. Newton’s laws of mechanics, including
gravitation, had been carefully tested, and their success had provided a framework
for understanding the interactions among objects. Electricity and magnetism
had been unified by Maxwell’s theoretical work, and the electromagnetic waves
predicted by Maxwell’s equations had been discovered and investigated in the
experiments conducted by Hertz. The laws of thermodynamics and kinetic theory
had been particularly successful in providing a unified explanation of a wide
variety of phenomena involving heat and temperature. These three successful
theories—mechanics, electromagnetism, and thermodynamics—form the basis
for what we call “classical physics.”
Beyond your 19th-century physics laboratory, the world was undergoing rapid
changes. The Industrial Revolution demanded laborers for the factories and
accelerated the transition from a rural and agrarian to an urban society. These
workers formed the core of an emerging middle class and a new economic order.
The political world was changing, too—the rising tide of militarism, the forces
of nationalism and revolution, and the gathering strength of Marxism would
soon upset established governments. The fine arts were similarly in the middle
of revolutionary change, as new ideas began to dominate the fields of painting,
sculpture, and music. The understanding of even the very fundamental aspects of
human behavior was subject to serious and critical modification by the Freudian
psychologists.
In the world of physics, too, there were undercurrents that would soon cause
revolutionary changes. Even though the overwhelming majority of experimental
evidence agreed with classical physics, several experiments gave results that were
not explainable in terms of the otherwise successful classical theories. Classical
electromagnetic theory suggested that a medium is needed to propagate electromagnetic waves, but precise experiments failed to detect this medium. Experiments
to study the emission of electromagnetic waves by hot, glowing objects gave
results that could not be explained by the classical theories of thermodynamics
and electromagnetism. Experiments on the emission of electrons from surfaces
illuminated with light also could not be understood using classical theories.
These few experiments may not seem significant, especially when viewed
against the background of the many successful and well-understood experiments
of the 19th century. However, these experiments were to have a profound and
lasting effect, not only on the world of physics, but on all of science, on the
political structure of our world, and on the way we view ourselves and our place
in the universe. Within the short span of two decades between 1905 and 1925, the
shortcomings of classical physics would lead to the special and general theories
of relativity and the quantum theory.
The designation modern physics usually refers to the developments that began
in about 1900 and led to the relativity and quantum theories, including the
applications of those theories to understanding the atom, the atomic nucleus and
the particles of which it is composed, collections of atoms in molecules and solids,
and, on a cosmic scale, the origin and evolution of the universe. Our discussion
of modern physics in this text touches on each of these areas.
We begin our study in this chapter with a brief review of some important
principles of classical physics, and we discuss some situations in which classical
physics offers either inadequate or incorrect conclusions. These situations are not
necessarily those that originally gave rise to the relativity and quantum theories,
but they do help us understand why classical physics fails to give us a complete
picture of nature.
1.1 REVIEW OF CLASSICAL PHYSICS
Although there are many areas in which modern physics differs radically from
classical physics, we frequently find the need to refer to concepts of classical
physics. Here is a brief review of some of the concepts of classical physics that we
may need.
Mechanics
A particle of mass $m$ moving with velocity $\vec{v}$ has a kinetic energy defined by
$$K = \tfrac{1}{2} m v^{2} \tag{1.1}$$
and a linear momentum $\vec{p}$ defined by
$$\vec{p} = m\vec{v} \tag{1.2}$$
In terms of the linear momentum, the kinetic energy can be written
$$K = \frac{p^{2}}{2m} \tag{1.3}$$
When one particle collides with another, we analyze the collision by applying
two fundamental conservation laws:
I. Conservation of Energy. The total energy of an isolated system (on which
no net external force acts) remains constant. In the case of a collision between
particles, this means that the total energy of the particles before the collision
is equal to the total energy of the particles after the collision.
II. Conservation of Linear Momentum. The total linear momentum of an
isolated system remains constant. For the collision, the total linear momentum
of the particles before the collision is equal to the total linear momentum of the
particles after the collision. Because linear momentum is a vector, application
of this law usually gives us two equations, one for the x components and
another for the y components.
These two conservation laws are of the most basic importance to understanding
and analyzing a wide variety of problems in classical physics. Problems 1–4 and
11–14 at the end of this chapter review the use of these laws.
The importance of these conservation laws is both so great and so fundamental
that, even though in Chapter 2 we learn that the special theory of relativity modifies
Eqs. 1.1, 1.2, and 1.3, the laws of conservation of energy and linear momentum
remain valid.
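Example 1.1
A helium atom ($m = 6.6465 \times 10^{-27}$ kg) moving at a speed of $v_{\mathrm{He}} = 1.518 \times 10^{6}$ m/s collides with an atom of nitrogen ($m = 2.3253 \times 10^{-26}$ kg) at rest. After the collision, the helium atom is found to be moving with a velocity of $v'_{\mathrm{He}} = 1.199 \times 10^{6}$ m/s at an angle of $\theta_{\mathrm{He}} = 78.75^\circ$ relative to the direction of the original motion of the helium atom. (a) Find the velocity (magnitude and direction) of the nitrogen atom after the collision. (b) Compare the kinetic energy before the collision with the total kinetic energy of the atoms after the collision.
Solution
(a) The law of conservation of momentum for this collision can be written in vector form as $\vec{p}_{\mathrm{initial}} = \vec{p}_{\mathrm{final}}$, which is equivalent to
$$p_{x,\mathrm{initial}} = p_{x,\mathrm{final}} \quad \text{and} \quad p_{y,\mathrm{initial}} = p_{y,\mathrm{final}}$$
The collision is shown in Figure 1.1. The initial values of the total momentum are, choosing the x axis to be the direction of the initial motion of the helium atom,
$$p_{x,\mathrm{initial}} = m_{\mathrm{He}} v_{\mathrm{He}} \quad \text{and} \quad p_{y,\mathrm{initial}} = 0$$
The final total momentum can be written
$$p_{x,\mathrm{final}} = m_{\mathrm{He}} v'_{\mathrm{He}} \cos\theta_{\mathrm{He}} + m_{\mathrm{N}} v'_{\mathrm{N}} \cos\theta_{\mathrm{N}}$$
$$p_{y,\mathrm{final}} = m_{\mathrm{He}} v'_{\mathrm{He}} \sin\theta_{\mathrm{He}} + m_{\mathrm{N}} v'_{\mathrm{N}} \sin\theta_{\mathrm{N}}$$
The expression for $p_{y,\mathrm{final}}$ is written in general form with a + sign even though we expect that $\theta_{\mathrm{He}}$ and $\theta_{\mathrm{N}}$ are on opposite sides of the x axis. If the equation is written in this way, $\theta_{\mathrm{N}}$ will come out to be negative. The law of conservation of momentum gives, for the x components, $m_{\mathrm{He}} v_{\mathrm{He}} = m_{\mathrm{He}} v'_{\mathrm{He}} \cos\theta_{\mathrm{He}} + m_{\mathrm{N}} v'_{\mathrm{N}} \cos\theta_{\mathrm{N}}$, and for the y components, $0 = m_{\mathrm{He}} v'_{\mathrm{He}} \sin\theta_{\mathrm{He}} + m_{\mathrm{N}} v'_{\mathrm{N}} \sin\theta_{\mathrm{N}}$. Solving for the unknown terms, we find
$$v'_{\mathrm{N}} \cos\theta_{\mathrm{N}} = \frac{m_{\mathrm{He}} (v_{\mathrm{He}} - v'_{\mathrm{He}} \cos\theta_{\mathrm{He}})}{m_{\mathrm{N}}}$$
$$= \{(6.6465 \times 10^{-27}\ \mathrm{kg})[1.518 \times 10^{6}\ \mathrm{m/s} - (1.199 \times 10^{6}\ \mathrm{m/s})(\cos 78.75^\circ)]\} \times (2.3253 \times 10^{-26}\ \mathrm{kg})^{-1}$$
$$= 3.6704 \times 10^{5}\ \mathrm{m/s}$$
$$v'_{\mathrm{N}} \sin\theta_{\mathrm{N}} = -\frac{m_{\mathrm{He}} v'_{\mathrm{He}} \sin\theta_{\mathrm{He}}}{m_{\mathrm{N}}}$$
$$= -(6.6465 \times 10^{-27}\ \mathrm{kg})(1.199 \times 10^{6}\ \mathrm{m/s})(\sin 78.75^\circ)(2.3253 \times 10^{-26}\ \mathrm{kg})^{-1}$$
$$= -3.3613 \times 10^{5}\ \mathrm{m/s}$$
We can now solve for $v'_{\mathrm{N}}$ and $\theta_{\mathrm{N}}$:
$$v'_{\mathrm{N}} = \sqrt{(v'_{\mathrm{N}} \sin\theta_{\mathrm{N}})^{2} + (v'_{\mathrm{N}} \cos\theta_{\mathrm{N}})^{2}}$$
$$= \sqrt{(-3.3613 \times 10^{5}\ \mathrm{m/s})^{2} + (3.6704 \times 10^{5}\ \mathrm{m/s})^{2}} = 4.977 \times 10^{5}\ \mathrm{m/s}$$
$$\theta_{\mathrm{N}} = \tan^{-1} \frac{v'_{\mathrm{N}} \sin\theta_{\mathrm{N}}}{v'_{\mathrm{N}} \cos\theta_{\mathrm{N}}} = \tan^{-1} \left( \frac{-3.3613 \times 10^{5}\ \mathrm{m/s}}{3.6704 \times 10^{5}\ \mathrm{m/s}} \right) = -42.48^\circ$$
(b) The initial kinetic energy is
$$K_{\mathrm{initial}} = \tfrac{1}{2} m_{\mathrm{He}} v_{\mathrm{He}}^{2} = \tfrac{1}{2} (6.6465 \times 10^{-27}\ \mathrm{kg})(1.518 \times 10^{6}\ \mathrm{m/s})^{2} = 7.658 \times 10^{-15}\ \mathrm{J}$$
and the total final kinetic energy is
$$K_{\mathrm{final}} = \tfrac{1}{2} m_{\mathrm{He}} v'^{2}_{\mathrm{He}} + \tfrac{1}{2} m_{\mathrm{N}} v'^{2}_{\mathrm{N}}$$
$$= \tfrac{1}{2} (6.6465 \times 10^{-27}\ \mathrm{kg})(1.199 \times 10^{6}\ \mathrm{m/s})^{2} + \tfrac{1}{2} (2.3253 \times 10^{-26}\ \mathrm{kg})(4.977 \times 10^{5}\ \mathrm{m/s})^{2}$$
$$= 7.658 \times 10^{-15}\ \mathrm{J}$$
Note that the initial and final kinetic energies are equal. This is the characteristic of an elastic collision, in which no energy is lost to, for example, internal excitation of the particles.
FIGURE 1.1 Example 1.1. (a) Before collision; (b) after collision.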
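The component equations of Example 1.1 can be checked numerically. The following is a minimal Python sketch and not part of the original text; the function names `nitrogen_recoil` and `kinetic_energy` are illustrative assumptions, and the input values are those quoted in the example.

```python
import math

# Input data quoted in Example 1.1 (SI units).
M_HE = 6.6465e-27               # kg, helium atom
M_N = 2.3253e-26                # kg, nitrogen atom
V_HE = 1.518e6                  # m/s, helium speed before the collision
V_HE_FINAL = 1.199e6            # m/s, helium speed after the collision
THETA_HE = math.radians(78.75)  # helium scattering angle


def nitrogen_recoil(m_he, m_n, v_he, v_he_final, theta_he):
    """Return (speed in m/s, angle in degrees) of the nitrogen atom.

    Hypothetical helper: applies conservation of momentum component by
    component, with nitrogen initially at rest.
    """
    # x components: m_he*v_he = m_he*v_he_final*cos(theta_he) + m_n*vx
    vx = m_he * (v_he - v_he_final * math.cos(theta_he)) / m_n
    # y components: 0 = m_he*v_he_final*sin(theta_he) + m_n*vy
    vy = -m_he * v_he_final * math.sin(theta_he) / m_n
    return math.hypot(vx, vy), math.degrees(math.atan2(vy, vx))


def kinetic_energy(mass, speed):
    """Classical kinetic energy, K = (1/2) m v^2 (Eq. 1.1)."""
    return 0.5 * mass * speed**2


speed_n, angle_n = nitrogen_recoil(M_HE, M_N, V_HE, V_HE_FINAL, THETA_HE)
k_initial = kinetic_energy(M_HE, V_HE)
k_final = kinetic_energy(M_HE, V_HE_FINAL) + kinetic_energy(M_N, speed_n)

print(f"nitrogen speed: {speed_n:.4e} m/s at {angle_n:.2f} degrees")
print(f"K_initial = {k_initial:.4e} J, K_final = {k_final:.4e} J")
```

Using `atan2` rather than a plain arctangent keeps the sign of the angle tied to the signs of the two components, which is why the nitrogen angle comes out negative without any special handling.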
Example 1.2
An atom of uranium ($m = 3.9529 \times 10^{-25}$ kg) at rest decays spontaneously into an atom of helium ($m = 6.6465 \times 10^{-27}$ kg) and an atom of thorium ($m = 3.8864 \times 10^{-25}$ kg). The helium atom is observed to move in the positive x direction with a velocity of $1.423 \times 10^{7}$ m/s (Figure 1.2). (a) Find the velocity (magnitude and direction) of the thorium atom. (b) Find the total kinetic energy of the two atoms after the decay.
FIGURE 1.2 Example 1.2. (a) Before decay; (b) after decay.
Solution
(a) Here we again use the law of conservation of momentum. The initial momentum before the decay is zero, so the total momentum of the two atoms after the decay must also be zero:
$$p_{x,\mathrm{initial}} = 0 \qquad p_{x,\mathrm{final}} = m_{\mathrm{He}} v'_{\mathrm{He}} + m_{\mathrm{Th}} v'_{\mathrm{Th}}$$
Setting $p_{x,\mathrm{initial}} = p_{x,\mathrm{final}}$ and solving for $v'_{\mathrm{Th}}$, we obtain
$$v'_{\mathrm{Th}} = -\frac{m_{\mathrm{He}} v'_{\mathrm{He}}}{m_{\mathrm{Th}}} = -\frac{(6.6465 \times 10^{-27}\ \mathrm{kg})(1.423 \times 10^{7}\ \mathrm{m/s})}{3.8864 \times 10^{-25}\ \mathrm{kg}} = -2.434 \times 10^{5}\ \mathrm{m/s}$$
The thorium atom moves in the negative x direction.
(b) The total kinetic energy after the decay is:
$$K = \tfrac{1}{2} m_{\mathrm{He}} v'^{2}_{\mathrm{He}} + \tfrac{1}{2} m_{\mathrm{Th}} v'^{2}_{\mathrm{Th}}$$
$$= \tfrac{1}{2} (6.6465 \times 10^{-27}\ \mathrm{kg})(1.423 \times 10^{7}\ \mathrm{m/s})^{2} + \tfrac{1}{2} (3.8864 \times 10^{-25}\ \mathrm{kg})(-2.434 \times 10^{5}\ \mathrm{m/s})^{2}$$
$$= 6.844 \times 10^{-13}\ \mathrm{J}$$
Clearly kinetic energy is not conserved in this decay, because the initial kinetic energy of the uranium atom was zero. However, total energy is conserved: if we write the total energy as the sum of kinetic energy and nuclear energy, then the total initial energy (kinetic + nuclear) is equal to the total final energy (kinetic + nuclear). The gain in kinetic energy occurs as a result of a loss in nuclear energy. This is an example of the type of radioactive decay called alpha decay, which we discuss in more detail in Chapter 12.
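The one-dimensional momentum balance of Example 1.2 is short enough to evaluate directly. The sketch below is an illustration in Python, not part of the original text; the helper name `recoil_velocity` is an assumption, and the inputs are the values quoted in the example. Small differences in the last quoted digit can arise from rounding.

```python
# Input data quoted in Example 1.2 (SI units).
M_HE = 6.6465e-27   # kg, helium atom
M_TH = 3.8864e-25   # kg, thorium atom
V_HE = 1.423e7      # m/s, helium velocity along +x after the decay


def recoil_velocity(m_emitted, v_emitted, m_recoil):
    """Velocity of the recoiling atom when the parent decays at rest.

    Hypothetical helper: total momentum is zero before the decay, so
    m_emitted*v_emitted + m_recoil*v_recoil = 0.
    """
    return -m_emitted * v_emitted / m_recoil


v_th = recoil_velocity(M_HE, V_HE, M_TH)

# Total kinetic energy after the decay (it was zero before the decay).
k_total = 0.5 * M_HE * V_HE**2 + 0.5 * M_TH * v_th**2

print(f"thorium velocity: {v_th:.4e} m/s")
print(f"total kinetic energy after decay: {k_total:.4e} J")
```

The negative sign of `v_th` carries the direction: the thorium atom recoils along the negative x axis, and nearly all of the released kinetic energy goes to the much lighter helium atom.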
Another application of the principle of conservation of energy occurs when
a particle moves subject to an external force F. Corresponding to that external
force there is often a potential energy U, defined such that (for one-dimensional
motion)
$$F = -\frac{dU}{dx} \tag{1.4}$$
The total energy E is the sum of the kinetic and potential energies:
$$E = K + U \tag{1.5}$$
As the particle moves, K and U may change, but E remains constant. (In
Chapter 2, we find that the special theory of relativity gives us a new definition of
total energy.)
When a particle moving with linear momentum $\vec{p}$ is at a displacement $\vec{r}$ from the
origin O, its angular momentum $\vec{L}$ about the point O is defined (see Figure 1.3) by
$$\vec{L} = \vec{r} \times \vec{p} \tag{1.6}$$
FIGURE 1.3 A particle of mass m, located with respect to the origin O by position
vector $\vec{r}$ and moving with linear momentum $\vec{p}$, has angular momentum $\vec{L}$ about O.
There is a conservation law for angular momentum, just as with linear momentum.
In practice this has many important applications. For example, when a charged
particle moves near, and is deflected by, another charged particle, the total
angular momentum of the system (the two particles) remains constant if no net
external torque acts on the system. If the second particle is so much more massive
than the first that its motion is essentially unchanged by the influence of the first
particle, the angular momentum of the first particle remains constant (because
the second particle acquires no angular momentum). Another application of the
conservation of angular momentum occurs when a body such as a comet moves
in the gravitational field of the Sun—the elliptical shape of the comet’s orbit is
necessary to conserve angular momentum. In this case $\vec{r}$ and $\vec{p}$ of the comet must
simultaneously change so that $\vec{L}$ remains constant.
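The definition in Eq. 1.6 can be made concrete with a small numerical sketch. The particle below (its mass, speed, and path) is hypothetical and chosen only for illustration: a free particle moving in a straight line experiences no torque, so $\vec{r} \times \vec{p}$ about O should be the same at every point on its path even though $\vec{r}$ changes.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as (x, y, z) tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def angular_momentum(r, p):
    """Angular momentum L = r x p about the origin (Eq. 1.6)."""
    return cross(r, p)


# Hypothetical free particle: mass 2 kg moving along +x at 3 m/s on the line y = 4 m.
mass = 2.0
velocity = (3.0, 0.0, 0.0)
p = tuple(mass * v for v in velocity)

# With no force (and hence no torque), L is the same at every point on the path.
L_values = []
for t in (0.0, 1.0, 5.0):
    r = (velocity[0] * t, 4.0, 0.0)
    L_values.append(angular_momentum(r, p))

print(L_values)
```

Only the z component is nonzero, and its magnitude is the momentum times the perpendicular distance from O to the line of motion.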
Velocity Addition
Another important aspect of classical physics is the rule for combining velocities.
For example, suppose a jet plane is moving at a velocity of vPG = 650 m/s,
as measured by an observer on the ground. The subscripts on the velocity
mean “velocity of the plane relative to the ground.” The plane fires a missile
in the forward direction; the velocity of the missile relative to the plane is
vMP = 250 m/s. According to the observer on the ground, the velocity of the
missile is: vMG = vMP + vPG = 250 m/s + 650 m/s = 900 m/s.
We can generalize this rule as follows. Let $\vec{v}_{AB}$ represent the velocity of A
relative to B, and let $\vec{v}_{BC}$ represent the velocity of B relative to C. Then the velocity
of A relative to C is
$$\vec{v}_{AC} = \vec{v}_{AB} + \vec{v}_{BC} \tag{1.7}$$
This equation is written in vector form to allow for the possibility that the
velocities might be in different directions; for example, the missile might be fired
not in the direction of the plane’s velocity but in some other direction. This seems
to be a very “common-sense” way of combining velocities, but we will see later
in this chapter (and in more detail in Chapter 2) that this common-sense rule can
lead to contradictions with observations when we apply it to speeds close to the
speed of light.
A common application of this rule (for speeds small compared with the
speed of light) occurs in collisions, when we want to analyze conservation of
momentum and energy in a frame of reference that is different from the one
in which the collision is observed. For example, let’s analyze the collision of
Example 1.1 in a frame of reference that is moving with the center of mass.
Suppose the initial velocity of the He atom defines the positive x direction.
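The velocity of the center of mass (relative to the laboratory) is then
$v_{CL} = (v_{\text{He}} m_{\text{He}} + v_{\text{N}} m_{\text{N}})/(m_{\text{He}} + m_{\text{N}}) = 3.374 \times 10^{5}$ m/s. We would like to find the
initial velocity of the He and N relative to the center of mass. If we start with
$v_{\text{HeL}} = v_{\text{HeC}} + v_{CL}$ and $v_{\text{NL}} = v_{\text{NC}} + v_{CL}$, then
$$v_{\text{HeC}} = v_{\text{HeL}} - v_{CL} = 1.518 \times 10^{6}\ \text{m/s} - 3.374 \times 10^{5}\ \text{m/s} = 1.181 \times 10^{6}\ \text{m/s}$$
$$v_{\text{NC}} = v_{\text{NL}} - v_{CL} = 0 - 3.374 \times 10^{5}\ \text{m/s} = -0.337 \times 10^{6}\ \text{m/s}$$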
In a similar fashion we can calculate the final velocities of the He and N.
The resulting collision as viewed from this frame of reference is illustrated in
Figure 1.4. There is a special symmetry in this view of the collision that is not
apparent from the same collision viewed in the laboratory frame of reference
(Figure 1.1); each velocity simply changes direction leaving its magnitude
unchanged, and the atoms move in opposite directions. The angles in this view of
the collision are different from those of Figure 1.1, because the velocity addition
in this case applies only to the x components and leaves the y components
unchanged, which means that the angles must change.
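The change of reference frame described above can be sketched in a few lines. This is an illustrative check rather than the text's own procedure: the function name is invented, and the nitrogen mass is an assumption (it is taken to be the value used in Example 1.1, which is not restated in this passage).

```python
def add_velocities(v_ab, v_bc):
    """Classical velocity addition (Eq. 1.7) for collinear velocities, in m/s."""
    return v_ab + v_bc


# Missile example from the text: v_MG = v_MP + v_PG.
v_missile_ground = add_velocities(250.0, 650.0)

# Center-of-mass frame for the collision of Example 1.1.
M_HE = 6.6465e-27    # kg, helium mass quoted in this chapter
M_N = 2.3253e-26     # kg, ASSUMED nitrogen mass (taken to be the Example 1.1 value)
V_HE_LAB = 1.518e6   # m/s
V_N_LAB = 0.0        # m/s, nitrogen initially at rest

v_cm_lab = (M_HE * V_HE_LAB + M_N * V_N_LAB) / (M_HE + M_N)

# v_HeL = v_HeC + v_CL  =>  v_HeC = v_HeL - v_CL (and likewise for N).
v_he_cm = V_HE_LAB - v_cm_lab
v_n_cm = V_N_LAB - v_cm_lab

# In the center-of-mass frame the total momentum is zero.
p_total_cm = M_HE * v_he_cm + M_N * v_n_cm

print(f"v_CL  = {v_cm_lab:.4e} m/s")
print(f"v_HeC = {v_he_cm:.4e} m/s, v_NC = {v_n_cm:.4e} m/s")
```

The vanishing total momentum in this frame is what forces the two atoms to move in opposite directions both before and after the collision.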
Electricity and Magnetism
The electrostatic force (Coulomb force) exerted by a charged particle q1 on
another charge q2 has magnitude
$$F = \frac{1}{4\pi\varepsilon_0}\,\frac{|q_1||q_2|}{r^2} \tag{1.8}$$
The direction of F is along the line joining the particles (Figure 1.5). In the SI
system of units, the constant $1/4\pi\varepsilon_0$ has the value
$$\frac{1}{4\pi\varepsilon_0} = 8.988 \times 10^{9}\ \mathrm{N \cdot m^2/C^2}$$
FIGURE 1.4 The collision of Figure 1.1 viewed from a frame of reference moving with
the center of mass. (a) Before collision. (b) After collision. In this frame the two
particles always move in opposite directions, and for elastic collisions the magnitude
of each particle’s velocity is unchanged.
The corresponding potential energy is
$$U = \frac{1}{4\pi\varepsilon_0}\,\frac{q_1 q_2}{r} \tag{1.9}$$
In all equations derived from Eq. 1.8 or 1.9 as starting points, the quantity 1/4πε0
must appear. In some texts and reference books, you may find electrostatic
quantities in which this constant does not appear. In such cases, the centimeter-gram-second (cgs) system has probably been used, in which the constant 1/4πε0
is defined to be 1. You should always be very careful in making comparisons
of electrostatic quantities from different references and check that the units are
identical.
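Equations 1.8 and 1.9 are connected by Eq. 1.4, with the separation r playing the role of x. The sketch below checks this numerically by differentiating the potential energy with a central difference. The function names, the 1-nm separation, and the step size are illustrative assumptions.

```python
K_COULOMB = 8.988e9     # N·m^2/C^2, value of 1/(4*pi*eps0) quoted in the text
E_CHARGE = 1.602e-19    # C, magnitude of the elementary charge used in this chapter


def coulomb_force(q1, q2, r):
    """Magnitude of the electrostatic force, Eq. 1.8 (newtons)."""
    return K_COULOMB * abs(q1) * abs(q2) / r**2


def coulomb_potential_energy(q1, q2, r):
    """Electrostatic potential energy, Eq. 1.9 (joules); the sign follows q1*q2."""
    return K_COULOMB * q1 * q2 / r


def force_from_potential(q1, q2, r, dr=1e-13):
    """Radial force F = -dU/dr estimated with a central difference (Eq. 1.4)."""
    u_plus = coulomb_potential_energy(q1, q2, r + dr)
    u_minus = coulomb_potential_energy(q1, q2, r - dr)
    return -(u_plus - u_minus) / (2 * dr)


r = 1.0e-9   # m, an illustrative atomic-scale separation (assumed)
f_direct = coulomb_force(E_CHARGE, E_CHARGE, r)
f_like = force_from_potential(E_CHARGE, E_CHARGE, r)      # like charges
f_unlike = force_from_potential(E_CHARGE, -E_CHARGE, r)   # unlike charges

print(f"Eq. 1.8: {f_direct:.4e} N, from -dU/dr: {f_like:.4e} N")
```

A positive radial force (like charges) points toward increasing r and is repulsive; a negative one (unlike charges) is attractive.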
An electrostatic potential difference $\Delta V$ can be established by a distribution of
charges. The most common example of a potential difference is that between the
two terminals of a battery. When a charge q moves through a potential difference
$\Delta V$, the change in its electrical potential energy $\Delta U$ is
$$\Delta U = q\,\Delta V \tag{1.10}$$
At the atomic or nuclear level, we usually measure charges in terms of the basic
charge of the electron or proton, whose magnitude is $e = 1.602 \times 10^{-19}$ C. If
such charges are accelerated through a potential difference $\Delta V$ that is a few volts,
the resulting loss in potential energy and corresponding gain in kinetic energy will
be of the order of $10^{-19}$ to $10^{-18}$ J. To avoid working with such small numbers,
it is common in the realm of atomic or nuclear physics to measure energies in
electron-volts (eV), defined to be the energy of a charge equal in magnitude to
that of the electron that passes through a potential difference of 1 volt:
$$\Delta U = q\,\Delta V = (1.602 \times 10^{-19}\ \text{C})(1\ \text{V}) = 1.602 \times 10^{-19}\ \text{J}$$
and thus
$$1\ \text{eV} = 1.602 \times 10^{-19}\ \text{J}$$
FIGURE 1.5 Two charged particles experience equal and opposite electrostatic forces
along the line joining their centers. If the charges have the same sign (both positive
or both negative), the force is repulsive; if the signs are different, the force is attractive.
Some convenient multiples of the electron-volt are
keV = kilo electron-volt = $10^{3}$ eV
MeV = mega electron-volt = $10^{6}$ eV
GeV = giga electron-volt = $10^{9}$ eV
(In some older works you may find reference to the BeV, for billion electron-volts;
this is a source of confusion, for in the United States a billion is $10^{9}$ while in
Europe a billion is $10^{12}$.)
Often we wish to find the potential energy of two basic charges separated by
typical atomic or nuclear dimensions, and we wish to have the result expressed
in electron-volts. Here is a convenient way of doing this. First we express the
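quantity $e^2/4\pi\varepsilon_0$ in a more convenient form:
$$\frac{e^2}{4\pi\varepsilon_0} = (8.988 \times 10^{9}\ \mathrm{N \cdot m^2/C^2})(1.602 \times 10^{-19}\ \mathrm{C})^2 = 2.307 \times 10^{-28}\ \mathrm{N \cdot m^2}$$
$$= (2.307 \times 10^{-28}\ \mathrm{N \cdot m^2})\left(\frac{1}{1.602 \times 10^{-19}\ \mathrm{J/eV}}\right)\left(\frac{10^{9}\ \mathrm{nm}}{\mathrm{m}}\right) = 1.440\ \mathrm{eV \cdot nm}$$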
With this useful combination of constants it becomes very easy to calculate
electrostatic potential energies. For two electrons separated by a typical atomic
dimension of 1.00 nm, Eq. 1.9 gives
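$$U = \frac{1}{4\pi\varepsilon_0}\,\frac{e^2}{r} = \frac{e^2}{4\pi\varepsilon_0}\,\frac{1}{r} = (1.440\ \mathrm{eV \cdot nm})\,\frac{1}{1.00\ \mathrm{nm}} = 1.44\ \mathrm{eV}$$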
For calculations at the nuclear level, the femtometer is a more convenient unit of
distance and MeV is a more appropriate energy unit:
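$$\frac{e^2}{4\pi\varepsilon_0} = (1.440\ \mathrm{eV \cdot nm})\left(\frac{1\ \mathrm{m}}{10^{9}\ \mathrm{nm}}\right)\left(\frac{10^{15}\ \mathrm{fm}}{1\ \mathrm{m}}\right)\left(\frac{1\ \mathrm{MeV}}{10^{6}\ \mathrm{eV}}\right) = 1.440\ \mathrm{MeV \cdot fm}$$
FIGURE 1.6 (a) A circular current loop produces a magnetic field $\vec{B}$ at its center.
(b) A current loop with magnetic moment $\vec{\mu}$ in an external magnetic field $\vec{B}_{\text{ext}}$.
The field exerts a torque on the loop that will tend to rotate it so that $\vec{\mu}$ lines up
with $\vec{B}_{\text{ext}}$.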
It is remarkable (and convenient to remember) that the quantity e2 /4πε0 has the
same value of 1.440 whether we use typical atomic energies and sizes (eV · nm)
or typical nuclear energies and sizes (MeV · fm).
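The unit conversions above are easy to reproduce. The following sketch uses only the constants quoted in this section; the helper `coulomb_energy_ev` is an illustrative name, not something defined in the text.

```python
K_COULOMB = 8.988e9    # N·m^2/C^2, 1/(4*pi*eps0) as quoted in the text
E_CHARGE = 1.602e-19   # C
J_PER_EV = 1.602e-19   # joules per electron-volt

e2_si = K_COULOMB * E_CHARGE**2             # N·m^2, equivalently J·m
e2_ev_nm = e2_si / J_PER_EV * 1e9           # eV·nm (1 m = 1e9 nm)
e2_mev_fm = e2_si / J_PER_EV * 1e15 / 1e6   # MeV·fm (1 m = 1e15 fm, 1 MeV = 1e6 eV)


def coulomb_energy_ev(separation_nm, z1=1, z2=1):
    """Potential energy (eV) of charges z1*e and z2*e separated by separation_nm."""
    return z1 * z2 * e2_ev_nm / separation_nm


u_atomic = coulomb_energy_ev(1.00)   # two like elementary charges 1.00 nm apart

print(f"e^2/4 pi eps0 = {e2_ev_nm:.3f} eV·nm = {e2_mev_fm:.3f} MeV·fm")
print(f"U(1.00 nm) = {u_atomic:.2f} eV")
```

The two numerical values coincide because the factor $10^{6}$ gained in going from nm to fm is exactly cancelled by the factor $10^{6}$ between eV and MeV.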
A magnetic field $\vec{B}$ can be produced by an electric current i. For example, the
magnitude of the magnetic field at the center of a circular current loop of radius r
is (see Figure 1.6a)
$$B = \frac{\mu_0 i}{2r} \tag{1.11}$$
The SI unit for magnetic field is the tesla (T), which is equivalent to a newton per
ampere-meter. The constant μ0 is
$$\mu_0 = 4\pi \times 10^{-7}\ \mathrm{N \cdot s^2/C^2}$$
Be sure to remember that i is in the direction of the conventional (positive) current,
opposite to the actual direction of travel of the negatively charged electrons that
typically produce the current in metallic wires. The direction of $\vec{B}$ is chosen
according to the right-hand rule: if you hold the wire in the right hand with the
thumb pointing in the direction of the current, the fingers point in the direction of
the magnetic field.
It is often convenient to define the magnetic moment $\vec{\mu}$ of a current loop:
$$|\vec{\mu}| = iA \tag{1.12}$$
where A is the geometrical area enclosed by the loop. The direction of $\vec{\mu}$ is
perpendicular to the plane of the loop, according to the right-hand rule.
When a current loop is placed in a uniform external magnetic field $\vec{B}_{\text{ext}}$ (as in
Figure 1.6b), there is a torque $\vec{\tau}$ on the loop that tends to line up $\vec{\mu}$ with $\vec{B}_{\text{ext}}$:
$$\vec{\tau} = \vec{\mu} \times \vec{B}_{\text{ext}} \tag{1.13}$$
Another way to describe this interaction is to assign a potential energy to the
magnetic moment $\vec{\mu}$ in the external field $\vec{B}_{\text{ext}}$:
$$U = -\vec{\mu} \cdot \vec{B}_{\text{ext}} \tag{1.14}$$
When the field $\vec{B}_{\text{ext}}$ is applied, $\vec{\mu}$ rotates so that its energy tends to a minimum
value, which occurs when $\vec{\mu}$ and $\vec{B}_{\text{ext}}$ are parallel.
It is important for us to understand the properties of magnetic moments,
because particles such as electrons or protons have magnetic moments. Although
we don’t imagine these particles to be tiny current loops, their magnetic moments
do obey Eqs. 1.13 and 1.14.
A particularly important aspect of electromagnetism is electromagnetic waves.
In Chapter 3 we discuss some properties of these waves in more detail. Electromagnetic
waves travel in free space with speed c (the speed of light), which is
related to the electromagnetic constants $\varepsilon_0$ and $\mu_0$:
$$c = (\varepsilon_0 \mu_0)^{-1/2} \tag{1.15}$$
The speed of light has the exact value of c = 299,792,458 m/s.
Electromagnetic waves have a frequency f and wavelength λ, which are
related by
$$c = \lambda f \tag{1.16}$$
The wavelengths range from the very short (nuclear gamma rays) to the very
long (radio waves). Figure 1.7 shows the electromagnetic spectrum with the
conventional names assigned to the different ranges of wavelengths.
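As a consistency check on Eqs. 1.15 and 1.16, the sketch below recovers the speed of light from the rounded electromagnetic constants quoted in this section and converts a wavelength to a frequency. The 550-nm wavelength is an illustrative choice in the visible range, not a value from the text.

```python
import math

K_COULOMB = 8.988e9         # N·m^2/C^2, 1/(4*pi*eps0) as quoted in the text
MU_0 = 4 * math.pi * 1e-7   # N·s^2/C^2
C_EXACT = 299_792_458.0     # m/s

eps_0 = 1.0 / (4 * math.pi * K_COULOMB)     # C^2/(N·m^2)
c_from_constants = (eps_0 * MU_0) ** -0.5   # Eq. 1.15


def frequency_hz(wavelength_m):
    """Frequency of an electromagnetic wave in free space, from c = lambda * f."""
    return C_EXACT / wavelength_m


f_visible = frequency_hz(550e-9)   # 550 nm: illustrative visible wavelength (assumed)

print(f"c from eps0 and mu0: {c_from_constants:.4e} m/s")
print(f"f at 550 nm: {f_visible:.3e} Hz")
```

The small difference between `c_from_constants` and the exact value reflects only the four-figure rounding of $1/4\pi\varepsilon_0$.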
FIGURE 1.7 The electromagnetic spectrum. The boundaries of the regions are not sharply defined.
[Figure: parallel scales of wavelength (m), marked from $10^{6}$ down to $10^{-12}$, and frequency (Hz), marked from $10^{2}$ up to $10^{22}$. Labeled regions: long-wave radio, AM broadcast, short-wave radio, FM and TV, microwave, infrared, visible light, ultraviolet, X rays, nuclear gamma rays.]
Kinetic Theory of Matter
An example of the successful application of classical physics to the structure
of matter is the understanding of the properties of gases at relatively low
pressures and high temperatures (so that the gas is far from the region of pressure
and temperature where it might begin to condense into a liquid). Under these
conditions, most real gases can be modeled as ideal gases and are well described
by the ideal gas equation of state
$$PV = NkT \tag{1.17}$$
where P is the pressure, V is the volume occupied by the gas, N is the number of
molecules, T is the temperature, and k is the Boltzmann constant, which has the
value
$$k = 1.381 \times 10^{-23}\ \mathrm{J/K}$$
In using this equation and most of the equations in this section, the temperature
must be measured in units of kelvins (K). Be careful not to confuse the symbol K
for the unit of temperature with the symbol K for kinetic energy.
The ideal gas equation of state can also be expressed as
$$PV = nRT \tag{1.18}$$
where n is the number of moles and R is the universal gas constant with a
value of
$$R = 8.315\ \mathrm{J/(mol \cdot K)}$$
One mole of a gas is the quantity that contains a number of fundamental entities
(atoms or molecules) equal to Avogadro’s constant $N_A$, where
$$N_A = 6.022 \times 10^{23}\ \text{per mole}$$
That is, one mole of helium contains $N_A$ atoms of He, one mole of nitrogen
contains $N_A$ molecules of N$_2$ (and thus $2N_A$ atoms of N), and one mole of
water vapor contains $N_A$ molecules of H$_2$O (and thus $2N_A$ atoms of H and $N_A$
atoms of O).
Because N = nNA (number of molecules equals number of moles times
number of molecules per mole), the relationship between the Boltzmann constant
and the universal gas constant is
$$R = kN_A \tag{1.19}$$
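The two forms of the equation of state can be compared directly. In this minimal sketch the state of the gas (one mole at 293 K in 0.0240 m^3) is an assumed, illustrative choice, and the function names are invented for the example.

```python
K_BOLTZMANN = 1.381e-23   # J/K
N_AVOGADRO = 6.022e23     # per mole
R_GAS = 8.315             # J/(mol·K)


def pressure_from_molecules(n_molecules, temperature_k, volume_m3):
    """Pressure from PV = NkT (Eq. 1.17), in pascals."""
    return n_molecules * K_BOLTZMANN * temperature_k / volume_m3


def pressure_from_moles(n_moles, temperature_k, volume_m3):
    """Pressure from PV = nRT (Eq. 1.18), in pascals."""
    return n_moles * R_GAS * temperature_k / volume_m3


# Illustrative state (assumed): one mole at 293 K in 0.0240 m^3.
p_molecules = pressure_from_molecules(N_AVOGADRO, 293.0, 0.0240)
p_moles = pressure_from_moles(1.0, 293.0, 0.0240)
r_from_k = K_BOLTZMANN * N_AVOGADRO   # Eq. 1.19

print(f"P = {p_molecules:.4e} Pa (NkT form), {p_moles:.4e} Pa (nRT form)")
print(f"k * N_A = {r_from_k:.3f} J/(mol·K)")
```

The two pressures agree to within the rounding of the quoted constants, and both are close to atmospheric pressure for this choice of volume.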
The ideal gas model is very successful for describing the properties of many
gases. It assumes that the molecules are of negligibly small volume (that is, the
gas is mostly empty space) and move randomly throughout the volume of the
container. The molecules make occasional collisions with one another and with
the walls of the container. The collisions obey Newton’s laws and are elastic and
of very short duration. The molecules exert forces on one another only during
collisions. Under these assumptions, there is no potential energy so that kinetic
energy is the only form of energy that must be considered. Because the collisions
are elastic, there is no net loss or gain of kinetic energy during the collisions.
Individual molecules may speed up or slow down due to collisions, but the
average kinetic energy of all the molecules in the container does not change. The
average kinetic energy of a molecule in fact depends only on the temperature:
$$K_{\text{av}} = \tfrac{3}{2} kT \quad \text{(per molecule)} \tag{1.20}$$
For rough estimates, the quantity kT is often used as a measure of the mean
kinetic energy per particle. For example, at room temperature (20°C = 293 K),
the mean kinetic energy per particle is approximately $4 \times 10^{-21}$ J (about 1/40 eV),
while in the interior of a star where $T \sim 10^{7}$ K, the mean energy is approximately
$10^{-16}$ J (about 1000 eV).
Sometimes it is also useful to discuss the average kinetic energy of a mole of
the gas:
average K per mole = average K per molecule × number of molecules per mole
Using Eq. 1.19 to relate the Boltzmann constant to the universal gas constant, we
find the average molar kinetic energy to be
$$K_{\text{av}} = \tfrac{3}{2} RT \quad \text{(per mole)} \tag{1.21}$$
It should be apparent from the context of the discussion whether Kav refers to the
average per molecule or the average per mole.
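The rough estimates quoted above follow from a few multiplications. The sketch below evaluates kT at the two temperatures mentioned and the molar kinetic energy of Eq. 1.21; the function names are illustrative.

```python
K_BOLTZMANN = 1.381e-23   # J/K
R_GAS = 8.315             # J/(mol·K)
J_PER_EV = 1.602e-19      # joules per electron-volt


def thermal_energy_j(temperature_k):
    """The rough energy scale kT, in joules."""
    return K_BOLTZMANN * temperature_k


def mean_kinetic_energy_per_molecule(temperature_k):
    """Average translational kinetic energy (3/2) kT of one molecule (Eq. 1.20)."""
    return 1.5 * K_BOLTZMANN * temperature_k


def mean_kinetic_energy_per_mole(temperature_k):
    """Average translational kinetic energy (3/2) RT of one mole (Eq. 1.21)."""
    return 1.5 * R_GAS * temperature_k


kt_room_ev = thermal_energy_j(293.0) / J_PER_EV     # about 1/40 eV
kt_star_ev = thermal_energy_j(1.0e7) / J_PER_EV     # of order 1000 eV
k_mole_room = mean_kinetic_energy_per_mole(293.0)   # joules per mole

print(f"kT at 293 K  = {kt_room_ev:.4f} eV")
print(f"kT at 1e7 K  = {kt_star_ev:.0f} eV")
print(f"K_av per mole at 293 K = {k_mole_room:.0f} J")
```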
1.2 THE FAILURE OF CLASSICAL CONCEPTS
OF SPACE AND TIME
In 1905, Albert Einstein proposed the special theory of relativity, which is in
essence a new way of looking at space and time, replacing the “classical” space
and time that were the basis of the physical theories of Galileo and Newton.
Einstein’s proposal was based on a “thought experiment,” but in subsequent years
experimental data have clearly indicated that the classical concepts of space and
time are incorrect. In this section we examine how experimental results support
the need for a new approach to space and time.
The Failure of the Classical Concept of Time
In high-energy collisions between two protons, many new particles can be
produced, one of which is a pi meson (also known as a pion). When the pions are
produced at rest in the laboratory, they are observed to have an average lifetime (the
time between the production of the pion and its decay into other particles) of 26.0 ns
(nanoseconds, or $10^{-9}$ s). On the other hand, pions in motion are observed to have
a very different lifetime. In one particular experiment, pions moving at a speed of
$2.737 \times 10^{8}$ m/s (91.3% of the speed of light) showed a lifetime of 63.7 ns.
Let us imagine this experiment as viewed by two different observers
(Figure 1.8). Observer #1, at rest in the laboratory, sees the pion moving relative
to the laboratory at a speed of 91.3% of the speed of light and measures its
lifetime to be 63.7 ns. Observer #2 is moving relative to the laboratory at exactly
the same velocity as the pion, so according to observer #2 the pion is at rest and
has a lifetime of 26.0 ns. The two observers measure different values for the time
interval between the same two events: the formation of the pion and its decay.
FIGURE 1.8 (a) The pion experiment according to $O_1$. Markers A and B respectively
show the locations of the pion’s creation and decay. (b) The same experiment as viewed
by $O_2$, relative to whom the pion is at rest and the laboratory moves with velocity −v.
According to Newton, time is the same for all observers. Newton’s laws are
based on this assumption. The pion experiment clearly shows that time is not
the same for all observers, which indicates the need for a new theory that relates
time intervals measured by different observers who are in motion with respect to
each other.
The Failure of the Classical Concept of Space
The pion experiment also leads to a failure of the classical ideas about space.
Suppose observer #1 erects two markers in the laboratory, one where the pion
is created and another where it decays. The distance $D_1$ between the two
markers is equal to the speed of the pion multiplied by the time interval from
its creation to its decay: $D_1 = (2.737 \times 10^{8}\ \text{m/s})(63.7 \times 10^{-9}\ \text{s}) = 17.4$ m. To
observer #2, traveling at the same velocity as the pion, the laboratory appears
to be rushing by at a speed of $2.737 \times 10^{8}$ m/s and the time between passing
the first and second markers, showing the creation and decay of the pion in the
laboratory, is 26.0 ns. According to observer #2, the distance between the markers
is $D_2 = (2.737 \times 10^{8}\ \text{m/s})(26.0 \times 10^{-9}\ \text{s}) = 7.11$ m. Once again, we have two
observers in relative motion measuring different values for the same interval, in
this case the distance between the two markers in the laboratory. The physical
theories of Galileo and Newton are based on the assumption that space is the
same for all observers, and so length measurements should not depend on relative
motion. The pion experiment again shows that this cornerstone of classical physics
is not consistent with modern experimental data.
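The two distances follow from the same relative speed and the two measured lifetimes. This short sketch (variable and function names are illustrative) reproduces them to the precision quoted and shows that the ratio of the two lengths equals the ratio of the two time intervals.

```python
PION_SPEED = 2.737e8            # m/s, relative speed of pion and laboratory
LIFETIME_LAB = 63.7e-9          # s, measured by observer #1 (laboratory)
LIFETIME_PION_FRAME = 26.0e-9   # s, measured by observer #2 (moving with the pion)


def distance_traveled(speed, time_interval):
    """Distance covered at constant speed: D = v * t."""
    return speed * time_interval


d1 = distance_traveled(PION_SPEED, LIFETIME_LAB)
d2 = distance_traveled(PION_SPEED, LIFETIME_PION_FRAME)
ratio_of_lengths = d1 / d2
ratio_of_times = LIFETIME_LAB / LIFETIME_PION_FRAME

print(f"D1 = {d1:.2f} m, D2 = {d2:.2f} m")
print(f"D1/D2 = {ratio_of_lengths:.3f}")
```

Because both observers agree on the relative speed, any disagreement about the time interval necessarily shows up as an equal disagreement about the distance between the markers.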
The Failure of the Classical Concept
of Velocity
Classical physics places no limit on the maximum velocity that a particle can
reach. One of the basic equations of kinematics, v = v0 + at, shows that if a
particle experiences an acceleration a for a long enough time t, velocities as large
as desired can be achieved, perhaps even exceeding the speed of light. For another
example, when an aircraft flying at a speed of 200 m/s relative to an observer
on the ground launches a missile at a speed of 250 m/s relative to the aircraft, a
ground-based observer would measure the missile to travel at a speed of 200 m/s +
250 m/s = 450 m/s, according to the classical velocity addition rule (Eq. 1.7). We
can apply that same reasoning to a spaceship moving at a speed of $2.0 \times 10^{8}$ m/s
(relative to an observer on a space station), which fires a missile at a speed of
$2.5 \times 10^{8}$ m/s relative to the spacecraft. We would expect that the observer on
the space station would measure a speed of $4.5 \times 10^{8}$ m/s for the missile. This
speed exceeds the speed of light ($3.0 \times 10^{8}$ m/s). Allowing speeds greater than
the speed of light leads to a number of conceptual and logical difficulties, such as
the reversal of the normal order of cause and effect for some observers.
Here again modern experimental results disagree with the classical ideas. Let’s
go back again to our experiment with the pion, which is moving through the
laboratory at a speed of $2.737 \times 10^{8}$ m/s. The pion decays into another particle,
called a muon, which is emitted in the forward direction (the direction of the
pion’s velocity) with a speed of $0.813 \times 10^{8}$ m/s relative to the pion. According to
Eq. 1.7, an observer in the laboratory should observe the muon to be moving with
a velocity of $2.737 \times 10^{8}$ m/s + $0.813 \times 10^{8}$ m/s = $3.550 \times 10^{8}$ m/s, exceeding
the speed of light. The observed velocity of the muon, however, is $2.846 \times 10^{8}$ m/s,
below the speed of light. Clearly the classical rule for velocity addition
fails in this experiment.
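The three cases discussed in this subsection can be tabulated with the classical rule of Eq. 1.7. The sketch below only applies the classical rule and compares the result with c; it does not attempt the relativistic correction, which is the subject of Chapter 2. The names used are illustrative.

```python
C_LIGHT = 299_792_458.0   # m/s


def classical_sum(v_ab, v_bc):
    """Classical velocity addition (Eq. 1.7) for collinear velocities, in m/s."""
    return v_ab + v_bc


cases = {
    "aircraft + missile": (200.0, 250.0),
    "spaceship + missile": (2.0e8, 2.5e8),
    "pion + muon": (2.737e8, 0.813e8),
}

for label, (v1, v2) in cases.items():
    total = classical_sum(v1, v2)
    verdict = "exceeds c" if total > C_LIGHT else "below c"
    print(f"{label}: {total:.3e} m/s ({verdict})")

predicted_muon = classical_sum(2.737e8, 0.813e8)
observed_muon = 2.846e8   # m/s, measured value quoted in the text
```

Only the low-speed case gives a physically acceptable result; for the muon, the classical prediction exceeds c while the measured value does not.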
The properties of time and space and the rules for combining velocities are
essential concepts of the classical physics of Newton. These concepts are derived
from observations at low speeds, which were the only speeds available to Newton
and his contemporaries. In Chapter 2, we shall discover how the special theory
of relativity provides the correct procedure for comparing measurements of time,
distance, and velocity by different observers and thereby removes the failures of
classical physics at high speed (while reducing to the classical laws at low speed,
where we know the Newtonian framework works very well).
1.3 THE FAILURE OF THE CLASSICAL THEORY
OF PARTICLE STATISTICS
Thermodynamics and statistical mechanics were among the great triumphs of
19th-century physics. Describing the behavior of complex systems of many
particles was shown to be possible using a small number of aggregate or average
properties—for example, temperature, pressure, and heat capacity. Perhaps the
crowning achievement in this field was the development of relationships between
macroscopic properties, such as temperature, and microscopic properties, such as
the molecular kinetic energy.
Despite these great successes, this statistical approach to understanding the
behavior of gases and solids also showed a spectacular failure. Although the
classical theory gave the correct heat capacities of gases at high temperatures, it
failed miserably for many gases at low temperatures. In this section we summarize
the classical theory and explain how it fails at low temperatures. This failure
directly shows the inadequacy of classical physics and the need for an approach
based on quantum theory, the second of the great theories of modern physics.
The Distribution of Molecular Energies
In addition to the average kinetic energy, it is also important to analyze the
distribution of kinetic energies—that is, what fraction of the molecules in the
container has kinetic energies between any two values K1 and K2 . For a gas in
thermal equilibrium at absolute temperature T (in kelvins), the distribution of
molecular energies is given by the Maxwell-Boltzmann distribution:
$$N(E) = \frac{2N}{\sqrt{\pi}}\,\frac{1}{(kT)^{3/2}}\,E^{1/2} e^{-E/kT} \tag{1.22}$$
In this equation, N is the total number of molecules (a pure number) while N(E)
is the distribution function (with units of energy$^{-1}$) defined so that N(E)dE is the
number of molecules dN in the energy interval dE at E (or, in other words, the
number of molecules with energies between E and E + dE):
$$dN = N(E)\,dE \tag{1.23}$$
The distribution N(E) is shown in Figure 1.9. The number dN is represented by the
area of the narrow strip between E and E + dE. If we divide the entire horizontal
axis into an infinite number of such small intervals and add the areas of all the
resulting narrow strips, we obtain the total number of molecules in the gas:
$$\int_0^\infty dN = \int_0^\infty N(E)\,dE = \frac{2N}{\sqrt{\pi}}\,\frac{1}{(kT)^{3/2}} \int_0^\infty E^{1/2} e^{-E/kT}\,dE = N \tag{1.24}$$
The final step in this calculation involves the definite integral $\int_0^\infty x^{1/2} e^{-x}\,dx$,
which you can find in tables of integrals. Also using calculus techniques (see
Problem 8), you can show that the peak of the distribution function (the most
probable energy) is $\tfrac{1}{2} kT$.
FIGURE 1.9 The Maxwell-Boltzmann energy distribution function, shown for
one mole of gas at room temperature (300 K).
[Figure: N(E) ($\times 10^{-25}$) plotted against energy (eV) from 0 to 0.20 eV. Marked on the curve: the most probable energy ($\tfrac{1}{2} kT$), the average energy ($\tfrac{3}{2} kT$), a narrow strip dN of width dE, and the shaded area N($E_1$ : $E_2$) between $E_1$ and $E_2$.]
The average energy in this distribution of molecules can also be found by
dividing the distribution into strips. To find the contribution of each strip to
the energy of the gas, we multiply the number of molecules in each strip,
dN = N(E)dE, by the energy E of the molecules in that strip, and then we add
the contributions of all the strips by integrating over all energies. This calculation
would give the total energy of the gas; to find the average we divide by the total
number of molecules N:
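$$E_{\text{av}} = \frac{1}{N} \int_0^\infty E\,N(E)\,dE = \frac{2}{\sqrt{\pi}}\,\frac{1}{(kT)^{3/2}} \int_0^\infty E^{3/2} e^{-E/kT}\,dE \tag{1.25}$$
Once again, the definite integral can be found in integral tables. The result of
carrying out the integration is
$$E_{\text{av}} = \tfrac{3}{2} kT \tag{1.26}$$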
Equation 1.26 gives the average energy of a molecule in the gas and agrees
precisely with the result given by Eq. 1.20 for the ideal gas in which kinetic
energy is the only kind of energy the gas can have.
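The results of Eqs. 1.24 and 1.26, and the location of the peak, can be confirmed numerically. This is a minimal sketch under stated assumptions: the energy is measured in units of kT (x = E/kT), the integrals are done with a simple midpoint rule, and the upper limit is cut off at x = 50, beyond which the integrand is negligible. None of these choices comes from the text.

```python
import math


def mb_density(x):
    """Maxwell-Boltzmann energy distribution per molecule, with x = E/kT.

    This is N(E) of Eq. 1.22 divided by N and multiplied by kT, so it is dimensionless.
    """
    return 2.0 / math.sqrt(math.pi) * math.sqrt(x) * math.exp(-x)


def integrate(func, lower, upper, steps=200_000):
    """Midpoint-rule integral of func from lower to upper."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))


X_MAX = 50.0   # upper cutoff in units of kT (assumed large enough)

total_fraction = integrate(mb_density, 0.0, X_MAX)                    # Eq. 1.24 divided by N
mean_energy_kt = integrate(lambda x: x * mb_density(x), 0.0, X_MAX)   # Eq. 1.25 in units of kT

# Closed forms via the gamma function: Gamma(3/2) = sqrt(pi)/2, Gamma(5/2) = 3*sqrt(pi)/4.
exact_fraction = 2.0 / math.sqrt(math.pi) * math.gamma(1.5)
exact_mean_kt = 2.0 / math.sqrt(math.pi) * math.gamma(2.5)

# Most probable energy: scan a fine grid for the maximum of the distribution.
grid = [i * 1e-4 for i in range(1, 50_001)]
most_probable_kt = max(grid, key=mb_density)

print(f"normalization = {total_fraction:.5f}")
print(f"mean energy   = {mean_energy_kt:.5f} kT")
print(f"peak position = {most_probable_kt:.3f} kT")
```

Working in units of kT makes the check independent of temperature: the same three numbers (1, 3/2, and 1/2) apply to any ideal gas in thermal equilibrium.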
Occasionally we are interested in finding the number of molecules in our
distribution with energies between any two values E1 and E2 . If the interval
between E1 and E2 is very small, Eq. 1.23 can be used, with dE = E2 − E1 and with
N(E) evaluated at the midpoint of the interval. This approximation works very well
when the interval is small enough that N(E) is either approximately flat or linear
over the interval. If the interval is large enough that this approximation is not valid,
then it is necessary to integrate to find the number of molecules in the interval:
$$N(E_1 : E_2) = \int_{E_1}^{E_2} N(E)\,dE = \frac{2N}{\sqrt{\pi}}\,\frac{1}{(kT)^{3/2}} \int_{E_1}^{E_2} E^{1/2} e^{-E/kT}\,dE \tag{1.27}$$
This number is represented by the shaded area in Figure 1.9. This integral cannot
be expressed in terms of elementary functions and is usually evaluated numerically.
Example 1.3
(a) In one mole of a gas at a temperature of 650 K ($kT = 8.97 \times 10^{-21}$ J = 0.0560 eV),
calculate the number of molecules with energies between 0.0105 eV and 0.0135 eV.
(b) In this gas, calculate the fraction of the molecules with energies in the range
of ±2.5% of the most probable energy ($\tfrac{1}{2} kT$).
Solution
(a) Figure 1.10a shows the distribution N(E) in the region
between $E_1$ = 0.0105 eV and $E_2$ = 0.0135 eV. Because the
graph is very close to linear in this region, we can use Eq.
1.23 to find the number of molecules in this range. We
take dE to be the width of the range, $dE = E_2 - E_1$ =
0.0135 eV − 0.0105 eV = 0.0030 eV, and for E we use the
energy at the midpoint of the range (0.0120 eV):
$$dN = N(E)\,dE = \frac{2N}{\sqrt{\pi}}\,\frac{1}{(kT)^{3/2}}\,E^{1/2} e^{-E/kT}\,dE$$
$$= 2(6.022 \times 10^{23})(0.0120\ \text{eV})^{1/2}\,\pi^{-1/2}\,(0.0560\ \text{eV})^{-3/2}\,e^{-(0.0120\ \text{eV})/(0.0560\ \text{eV})}\,(0.0030\ \text{eV})$$
$$= 1.36 \times 10^{22}$$
(b) Figure 1.10b shows the distribution in this region. To
find the fraction of the molecules in this energy range, we
want dN/N. The most probable energy is $\tfrac{1}{2} kT$ or 0.0280 eV,
and ±2.5% of this value corresponds to ±0.0007 eV or
a range from 0.0273 eV to 0.0287 eV. The fraction is
$$\frac{dN}{N} = \frac{N(E)\,dE}{N} = \frac{2}{\sqrt{\pi}}\,\frac{1}{(kT)^{3/2}}\,E^{1/2} e^{-E/kT}\,dE$$
$$= 2(0.0280\ \text{eV})^{1/2}\,\pi^{-1/2}\,(0.0560\ \text{eV})^{-3/2}\,e^{-(0.0280\ \text{eV})/(0.0560\ \text{eV})}\,(0.0014\ \text{eV})$$
$$= 0.0121$$
FIGURE 1.10 Example 1.3.
[Figure: two panels plotting N(E) ($\times 10^{-25}$) against energy (eV): (a) from 0.008 eV to 0.016 eV; (b) from 0.026 eV to 0.030 eV.]
Note from these examples how we use a distribution function. We do not use
Eq. 1.22 to calculate the number of molecules at a particular energy. In this way
N(E) differs from many of the functions you have encountered previously in your
study of physics and mathematics. We always use the distribution function to
calculate how many events occur in a certain interval of values rather than at an
exact particular value. There are two reasons for this: (1) Asking the question in
the form of how many molecules have a certain value of the energy implies that
the energy is known exactly (to an infinite number of decimal places), and there
is zero probability to find a molecule with that exact value of the energy. (2) Any
measurement apparatus accepts a finite range of energies (or speeds) rather than
a single exact value, and thus asking about intervals is a better representation of
what can be measured in the laboratory.
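The interval calculations of Example 1.3 can be reproduced, and the narrow-strip approximation of Eq. 1.23 compared with a numerical evaluation of Eq. 1.27. This is a sketch under stated assumptions: energies are kept in eV, the midpoint rule with 10,000 steps stands in for whatever numerical method one prefers, and the function names are illustrative.

```python
import math

N_AVOGADRO = 6.022e23
KT_EV = 0.0560   # kT at 650 K, in eV, as quoted in Example 1.3


def n_of_e(energy_ev, n_total=N_AVOGADRO, kt_ev=KT_EV):
    """Maxwell-Boltzmann distribution N(E) of Eq. 1.22, in molecules per eV."""
    return (2.0 * n_total / math.sqrt(math.pi) * kt_ev**-1.5
            * math.sqrt(energy_ev) * math.exp(-energy_ev / kt_ev))


def count_strip(e1, e2):
    """Eq. 1.23 with N(E) evaluated at the midpoint of the interval."""
    return n_of_e(0.5 * (e1 + e2)) * (e2 - e1)


def count_integral(e1, e2, steps=10_000):
    """Eq. 1.27 evaluated numerically with the midpoint rule."""
    width = (e2 - e1) / steps
    return width * sum(n_of_e(e1 + (i + 0.5) * width) for i in range(steps))


dn_a = count_strip(0.0105, 0.0135)                       # part (a)
fraction_b = count_strip(0.0273, 0.0287) / N_AVOGADRO    # part (b)
dn_a_integrated = count_integral(0.0105, 0.0135)

print(f"(a) strip: {dn_a:.3e}, integral: {dn_a_integrated:.3e}")
print(f"(b) fraction: {fraction_b:.4f}")
```

For intervals this narrow the single-strip estimate and the integral differ by only a fraction of a percent, which is why Eq. 1.23 suffices in the example.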
Note that N(E) has dimensions of energy$^{-1}$: it gives the number of molecules
per unit energy interval (for example, number of molecules per eV). To get an
actual number that can be compared with measurement, N(E) must be multiplied
by an energy interval. In our study of modern physics, we will encounter many
different types of distribution functions whose use and interpretation are similar
to that of N(E). These functions generally give a number or a probability per
some sort of unit interval (for example, probability per unit volume), and to use
the distribution function to calculate an outcome we must always multiply by an
appropriate interval (for example, an element of volume). Sometimes we will be
able to deal with small intervals using a relationship similar to Eq. 1.23, as we did
in Example 1.3, but in other cases we will find the need to evaluate an integral, as
we did in Eq. 1.27.
Polyatomic Molecules and the Equipartition
of Energy
So far we have been considering gases with only one atom per molecule
(monatomic gases). For “point” molecules with no internal structure, only one
form of energy is important: translational kinetic energy $\tfrac{1}{2} mv^2$. (We call this
“translational” kinetic energy because it describes motion as the gas particles
move from one location to another. Soon we will also consider rotational kinetic
energy.)
Let’s rewrite Eq. 1.26 in a more instructive form by recognizing that, with
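translational kinetic energy as the only form of energy, $E = K = \tfrac{1}{2} mv^2$. With
$v^2 = v_x^2 + v_y^2 + v_z^2$, we can write the energy as
$$E = \tfrac{1}{2} m v_x^2 + \tfrac{1}{2} m v_y^2 + \tfrac{1}{2} m v_z^2 \tag{1.28}$$
The average energy is then
$$\tfrac{1}{2} m (v_x^2)_{\text{av}} + \tfrac{1}{2} m (v_y^2)_{\text{av}} + \tfrac{1}{2} m (v_z^2)_{\text{av}} = \tfrac{3}{2} kT \tag{1.29}$$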
For a gas molecule there is no difference between the x, y, and z directions, so
the three terms on the left are equal and each term is equal to $\tfrac{1}{2} kT$. The three
terms on the left represent three independent contributions to the energy of the
molecule—the motion in the x direction, for example, is not affected by the y or
z motions.
We define a degree of freedom of the gas as each independent contribution to
the energy of a molecule, corresponding to one quadratic term in the expression
for the energy. There are three quadratic terms in Eq. 1.28, so in this case there
are three degrees of freedom. As you can see from Eq. 1.29, each of the three
degrees of freedom of a gas molecule contributes an energy of $\tfrac{1}{2} kT$ to its average
energy. The relationship we have obtained in this special case is an example of
the application of a general theorem, called the equipartition of energy theorem:
When the number of particles in a system is large and Newtonian mechanics
is obeyed, each molecular degree of freedom corresponds to an average
energy of $\tfrac{1}{2} kT$.
The average energy per molecule is then the number of degrees of freedom times
$\tfrac{1}{2} kT$, and the total energy is obtained by multiplying the average energy per
molecule by the number of molecules N: $E_{\text{total}} = N E_{\text{av}}$. We will refer to this total
energy as the internal energy $E_{\text{int}}$ to indicate that it represents the random motions
of the gas molecules (in contrast, for example, to the energy involved with the
motion of the entire container of gas molecules).
$$E_{\text{int}} = N(\tfrac{3}{2} kT) = \tfrac{3}{2} NkT = \tfrac{3}{2} nRT \quad \text{(translation only)} \tag{1.30}$$
where Eq. 1.19 has been used to express Eq. 1.30 in terms of either the number
of molecules or the number of moles.
(1.31)
Here we have 5 quadratic terms in the energy, and thus 5 degrees of freedom.
According to the equipartition theorem, the average total energy per molecule is
$5 \times \tfrac{1}{2} kT = \tfrac{5}{2} kT$, and the total internal energy of n moles of the gas is
$$E_{\text{int}} = \tfrac{5}{2} nRT \quad \text{(translation + rotation)} \tag{1.32}$$
If the molecule can also vibrate, we can imagine the rigid rod connecting the
atoms in Figure 1.11 to be replaced by a spring. The two atoms can then vibrate
in opposite directions along the z′ axis, with the center of mass of the molecule
remaining fixed. The vibrational motion adds two quadratic terms to the energy,
corresponding to the vibrational potential energy ($\tfrac{1}{2} k z'^2$) and the vibrational
kinetic energy ($\tfrac{1}{2} m v_{z'}^2$). Including the vibrational motion, there are now 7 degrees
of freedom, so that
$$E_{\text{int}} = \tfrac{7}{2} nRT \quad \text{(translation + rotation + vibration)} \tag{1.33}$$
Heat Capacities of an Ideal Gas
Now we examine where the classical molecular distribution theory, which gives
a very good accounting of molecular behavior under most circumstances, fails to
agree with one particular class of experiments. Suppose we have a container of
gas with a fixed volume. We transfer energy to the gas, perhaps by placing the
container in contact with a system at a higher temperature. All of this transferred
energy increases the internal energy of the gas by an amount $\Delta E_{\text{int}}$, and there is
an accompanying increase in temperature $\Delta T$.
We define the molar heat capacity for this constant-volume process as
$$C_V = \frac{\Delta E_{\text{int}}}{n\,\Delta T}$$
The situation is different for a diatomic gas (two atoms per molecule),
illustrated in Figure 1.11. There are still three degrees of freedom associated with
the translational motion of the molecule, but now two additional forms of energy
are permitted—rotational and vibrational.
First we consider the rotational motion. The molecule shown in Figure 1.11
can rotate about the x′ and y′ axes (but not about the z′ axis, because the rotational
inertia about that axis is zero for diatomic molecules in which the atoms are treated
as points). Using the general form of 21 Iω2 for the rotational kinetic energy, we
can write the energy of the molecule as
E = 21 mv2x + 12 mv2y + 21 mv2z + 21 Ix′ ωx2′ + 12 Iy′ ωy2′
17
(1.34)
(The subscript V reminds us that we are doing this measurement at constant
volume.) From Eqs. 1.30, 1.32, and 1.33, we see that the molar heat capacity
depends on the type of gas:
y′
x′
FIGURE 1.11 A diatomic molecule,
with the origin at the center of mass.
Rotations can occur about the x′ and y′
axes, and vibrations can occur along
the z′ axis.
18
Chapter 1 | The Failures of Classical Physics
CV = 23 R
(monatomic or nonrotating, nonvibrating diatomic ideal gas)
CV =
(rotating and vibrating diatomic ideal gas)
CV =
5
2R
7
2R
(rotating diatomic ideal gas)
(1.35)
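The classical predictions of Eqs. 1.30 through 1.35 all follow from counting degrees of freedom, so they can be tabulated with a few lines of code. This is only a sketch: the function names are hypothetical, and the value R = 8.314 J/(mol K) is the standard molar gas constant rather than a number quoted in this passage.

```python
R = 8.314  # molar gas constant, J/(mol K)


def molar_heat_capacity(degrees_of_freedom):
    """Classical constant-volume molar heat capacity, C_V = (f/2) R."""
    return 0.5 * degrees_of_freedom * R


def internal_energy(moles, temperature, degrees_of_freedom):
    """Classical internal energy, E_int = (f/2) n R T."""
    return molar_heat_capacity(degrees_of_freedom) * moles * temperature


cases = {
    "translation only": 3,
    "translation + rotation": 5,
    "translation + rotation + vibration": 7,
}
for name, f in cases.items():
    print(f"{name}: C_V = {molar_heat_capacity(f):.1f} J/(mol K)")
```

The three printed values, 12.5, 20.8, and 29.1 J/(mol K), are the classical numbers against which measured heat capacities can be compared.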
When we add energy to the gas, the equipartition theorem tells us that the added
energy will on the average be distributed uniformly among all the possible forms
of energy (corresponding to the number of degrees of freedom). However, only
the translational kinetic energy contributes to the temperature (as shown by Eqs.
1.20 and 1.21). Thus, if we add 7 units of energy to a diatomic gas with rotating
and vibrating molecules, on the average only 3 units go into translational kinetic
energy and so 3/7 of the added energy goes into increasing the temperature.
(To measure the temperature rise, the gas molecules must collide with the
thermometer, so energy in the rotational and vibrational motions is not recorded
by the thermometer.) Put another way, to obtain the same temperature increase
$\Delta T$, a mole of diatomic gas requires 7/3 times the energy that is needed for a
mole of monatomic gas.
Comparison with Experiment How well do these heat capacity values
agree with experiment? For monatomic gases, the agreement is very good. The
equipartition theorem predicts a value of CV = 3R/2 = 12.5 J/mol · K, which
should be the same for all monatomic gases and the same at all temperatures (as
long as the conditions of the ideal gas model are fulfilled). The heat capacity
of He gas is 12.5 J/mol · K at 100 K, 300 K (room temperature), and 1000 K, so
in this case our calculation is in perfect agreement with experiment. Other inert
gases (Ne, Ar, Xe, etc.) have identical values, as do vapors of metals (Cu, Na, Pb,
Bi, etc.) and the monatomic (dissociated) state of elements that normally form
diatomic molecules (H, N, O, Cl, Br, etc.). So over a wide variety of different
elements and a wide range of temperatures, classical statistical mechanics is in
excellent agreement with experiment.
The situation is much less satisfactory for diatomic molecules. For a rotating
and vibrating diatomic molecule, the classical calculation gives CV = 7R/2 =
29.1 J/mol · K. Table 1.1 shows some values of the heat capacities for different
diatomic gases over a range of temperatures.
TABLE 1.1 Heat Capacities of Diatomic Gases

             C_V (J/mol · K)
Element    100 K    300 K    1000 K
H2         18.7     20.5     21.9
N2         20.8     20.8     24.4
O2         20.8     21.1     26.5
F2         20.8     23.0     28.8
Cl2        21.0     25.6     29.1
Br2        22.6     27.8     29.5
I2         24.8     28.6     29.7
Sb2        -        28.1     29.0
Te2        -        28.2     29.0
Bi2        -        28.6     29.1
At high temperatures, many of the diatomic gases do indeed approach the
expected value of 7R/2, but at lower temperatures the values are much smaller.
For example, fluorine seems to behave as if it has 5 degrees of freedom
(CV = 20.8 J/mol · K) at 100 K and 7 degrees of freedom (CV = 29.1 J/mol · K)
at 1000 K.
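One way to read Table 1.1 is to invert the classical relation C_V = (f/2)R and ask how many degrees of freedom each measured value would imply. The sketch below does this for a few rows copied from the table; the function name is hypothetical and R = 8.314 J/(mol K) is the standard gas constant, assumed here.

```python
R = 8.314  # molar gas constant, J/(mol K)

TEMPERATURES = (100, 300, 1000)  # K

# C_V in J/(mol K) at 100 K, 300 K, and 1000 K, copied from Table 1.1.
TABLE_1_1 = {
    "H2": (18.7, 20.5, 21.9),
    "N2": (20.8, 20.8, 24.4),
    "F2": (20.8, 23.0, 28.8),
    "I2": (24.8, 28.6, 29.7),
}


def effective_degrees_of_freedom(c_v):
    """Invert C_V = (f/2) R to get the apparent number of degrees of freedom."""
    return 2.0 * c_v / R


for gas, values in TABLE_1_1.items():
    row = ", ".join(
        f"{T} K: f = {effective_degrees_of_freedom(c_v):.2f}"
        for T, c_v in zip(TEMPERATURES, values)
    )
    print(f"{gas}: {row}")
```

For fluorine this gives an apparent f of about 5.0 at 100 K and about 6.9 at 1000 K, the behavior described above. A non-integer value has no classical meaning; it simply signals a gas that is partway between two plateaus.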
Hydrogen behaves as if it has 5 degrees of freedom at room temperature,
but at high enough temperature (3000 K), the heat capacity of H2 approaches
29.1 J/mol · K, corresponding to 7 degrees of freedom, while at lower temperatures
(40 K) the heat capacity is 12.5 J/mol · K, corresponding to 3 degrees of freedom.
The temperature dependence of the heat capacity of H2 is shown in Figure 1.12.
There are three plateaus in the graph, corresponding to heat capacities for 3, 5, and
7 degrees of freedom. At the lowest temperatures, the rotational and vibrational
motions are “frozen” and do not contribute to the heat capacity. At about 100 K,
the molecules have enough energy to allow rotational motion to occur, and by
about 300 K the heat capacity is characteristic of 5 degrees of freedom. Starting
at about 1000 K, the vibrational motion can occur, and by about 3000 K there are
enough molecules above the vibrational threshold to allow 7 degrees of freedom.
What’s going on here? The classical calculation demands that CV should be
constant, independent of the type of gas or the temperature. The equipartition
of energy theorem, which is very successful in predicting many thermodynamic
properties, fails miserably in accounting for the heat capacities. This theorem
requires that the energy added to a gas must on average be divided equally among
all the different forms of energy, and classical physics does not permit a threshold
energy for any particular type of motion. How is it possible for 2 degrees of
freedom, corresponding to the rotational or vibrational motions, to be “turned on”
as the temperature is increased?
The solution to this dilemma can be found in quantum mechanics, according
to which there is indeed a minimum or threshold energy for the rotational and
vibrational motions. We discuss this behavior in Chapters 5 and 9. In Chapter 11
we discuss the failure of the equipartition theorem to account for the heat capacities
of solids and the corresponding need to replace the classical Maxwell-Boltzmann
energy distribution function with a different distribution that is consistent with
quantum mechanics.
FIGURE 1.12 The heat capacity of molecular hydrogen at different temperatures.
The data points disagree with the classical prediction.
[Plot of $C_V$ (J/mol · K) versus temperature (K), from 10 K to 5000 K, with plateaus labeled Translation (12.5, $\tfrac{3}{2}R$), Rotation (20.8, $\tfrac{5}{2}R$), and Vibration (29.1, $\tfrac{7}{2}R$, the classical prediction).]
1.4 THEORY, EXPERIMENT, LAW
When you first began to study science, perhaps in your elementary or high school
years, you may have learned about the “scientific method,” which was supposed
to be a sort of procedure by which scientific progress was achieved. The basic
idea of the “scientific method” was that, on reflecting over some particular aspect
of nature, the scientist would invent a hypothesis or theory, which would then be
tested by experiment and if successful would be elevated to the status of law. This
procedure is meant to emphasize the importance of doing experiments as a way of
testing hypotheses and rejecting those that do not pass the tests. For example, the
ancient Greeks had some rather definite ideas about the motion of objects, such as
projectiles, in the Earth’s gravity. Yet they tested none of these by experiment, so
convinced were they that the power of logical deduction alone could be used to
discover the hidden and mysterious laws of nature and that once logic had been
applied to understanding a problem, no experiments were necessary. If theory
and experiment were to disagree, they would argue, then there must be something
wrong with the experiment! This dominance of analysis and faith was so pervasive
that it was another 2000 years before Galileo, using an inclined plane and a crude
timer (equipment surely within the abilities of the early Greeks to construct), discovered the laws of motion, which were later organized and analyzed by Newton.
In the case of modern physics, none of the fundamental concepts is obvious
from reason alone. Only by doing often difficult and necessarily precise experiments do we learn about these unexpected and fascinating effects associated with
such modern physics topics as relativity and quantum physics. These experiments
have been done to unprecedented levels of precision, of the order of one part
in $10^6$ or better, and it can certainly be concluded that modern physics was
tested far better in the 20th century than classical physics was tested in all of the
preceding centuries.
Nevertheless, there is a persistent and often perplexing problem associated
with modern physics, one that stems directly from your previous acquaintance
with the “scientific method.” This concerns the use of the word “theory,” as in
“theory of relativity” or “quantum theory,” or even “atomic theory” or “theory
of evolution.” There are two contrasting and conflicting definitions of the word
“theory” in the dictionary:
1. A hypothesis or guess.
2. An organized body of facts or explanations.
The “scientific method” refers to the first kind of “theory,” while when we
speak of the “theory of relativity” we refer to the second kind. Yet there is
often confusion between the two definitions, and therefore relativity and quantum
physics are sometimes incorrectly regarded as mere hypotheses, on which evidence
is still being gathered, in the hope of someday submitting that evidence to some sort
of international (or intergalactic) tribunal, which in turn might elevate the “theory”
into a “law.” Thus the “theory of relativity” might someday become the “law of
relativity,” like the “law of gravity.” Nothing could be further from the truth!
The theory of relativity and the quantum theory, like the atomic theory or the
theory of evolution, are truly “organized bodies of facts and explanations” and
not “hypotheses.” There is no question of these “theories” becoming “laws”—the
“facts” (experiments, observations) of relativity and quantum physics, like those
of atomism or evolution, are accepted by virtually all scientists today. The
experimental evidence for all of these processes is so compelling that no one who
approaches them in the spirit of free and open inquiry can doubt the observational
evidence or their inferences. Whether these collections of evidence are called
theories or laws is merely a question of semantics and has nothing to do with their
scientific merits. Like all scientific principles, they will continue to develop and
change as new discoveries are made; that is the essence of scientific progress.
Chapter Summary

Classical kinetic energy: $K = \tfrac{1}{2}mv^2 = \dfrac{p^2}{2m}$ (Section 1.1)
Classical linear momentum: $\vec{p} = m\vec{v}$ (Section 1.1)
Classical angular momentum: $\vec{L} = \vec{r} \times \vec{p}$ (Section 1.1)
Classical conservation laws: In an isolated system, the energy, linear momentum, and angular momentum remain constant. (Section 1.1)
Electric force and potential energy of two interacting charges: $F = \dfrac{1}{4\pi\varepsilon_0}\dfrac{|q_1||q_2|}{r^2}$, $U = \dfrac{1}{4\pi\varepsilon_0}\dfrac{q_1 q_2}{r}$ (Section 1.1)
Relationship between electric potential energy and potential: $U = qV$ (Section 1.1)
Magnetic field of a current loop: $B = \dfrac{\mu_0 i}{2r}$ (Section 1.1)
Potential energy of magnetic dipole: $U = -\vec{\mu} \cdot \vec{B}_{\text{ext}}$ (Section 1.1)
Average kinetic energy in a gas: $K_{\text{av}} = \tfrac{3}{2}kT$ (per molecule) $= \tfrac{3}{2}RT$ (per mole) (Section 1.1)
Maxwell-Boltzmann distribution: $N(E) = \dfrac{2N}{\sqrt{\pi}}\dfrac{1}{(kT)^{3/2}}E^{1/2}e^{-E/kT}$ (Section 1.3)
Equipartition of energy: Energy per degree of freedom $= \tfrac{1}{2}kT$ (Section 1.3)
Questions
1. Under what conditions can you apply the law of conservation
of energy? Conservation of linear momentum? Conservation
of angular momentum?
2. Which of the conserved quantities are scalars and which are
vectors? Is there a difference in how we apply conservation
laws for scalar and vector quantities?
3. What other conserved quantities (besides energy, linear
momentum, and angular momentum) can you name?
4. What is the difference between potential and potential
energy? Do they have different dimensions? Different units?
5. In Section 1.1 we defined the electric force between two
charges and the magnetic field of a current. Use these quantities to define the electric field of a single charge and the
magnetic force on a moving electric charge.
6. Other than from the ranges of wavelengths shown in
Figure 1.7, can you think of a way to distinguish radio
waves from infrared waves? Visible from infrared? That is,
could you design a radio that could be tuned to infrared
waves? Could living beings “see” in the infrared region?
7. Suppose we have a mixture of an equal number N of
molecules of two different gases, whose molecular masses
are $m_1$ and $m_2$, in complete thermal equilibrium at temperature T. How do the distributions of molecular energies of the
two gases compare? How do their average kinetic energies
per molecule compare?
8. In most gases (as in the case of hydrogen) the rotational
motion begins to occur at a temperature well below the
temperature at which vibrational motion occurs. What does
this tell us about the properties of the gas molecules?
9. Suppose it were possible for a pitcher to throw a baseball
faster than the speed of light. Describe how the flight of the
ball from the pitcher’s hand to the catcher’s glove would
look to the umpire standing behind the catcher.
10. At low temperatures the molar heat capacity of carbon dioxide (CO2) is about 5R/2, and it rises to about 7R/2 at
room temperature. However, unlike the gases discussed in
Section 1.3, the heat capacity of CO2 continues to rise as the
temperature increases, reaching 11R/2 at 1000 K. How can
you explain this behavior?
11. If we double the temperature of a gas, is the number of
molecules in a narrow interval dE around the most probable
energy about the same, double, or half what it was at the
original temperature?
Problems
1.1 Review of Classical Physics
1. A hydrogen atom ($m = 1.674 \times 10^{-27}$ kg) is moving with
a velocity of $1.1250 \times 10^{7}$ m/s. It collides elastically with
a helium atom ($m = 6.646 \times 10^{-27}$ kg) at rest. After the
collision, the hydrogen atom is found to be moving with
a velocity of $-6.724 \times 10^{6}$ m/s (in a direction opposite to
its original motion). Find the velocity of the helium atom
after the collision in two different ways: (a) by applying
conservation of momentum; (b) by applying conservation
of energy.
2. A helium atom ($m = 6.6465 \times 10^{-27}$ kg) collides elastically
with an oxygen atom ($m = 2.6560 \times 10^{-26}$ kg) at rest. After
the collision, the helium atom is observed to be moving with
a velocity of $6.636 \times 10^{6}$ m/s in a direction at an angle of
84.7° relative to its original direction. The oxygen atom is
observed to move at an angle of −40.4°. (a) Find the speed
of the oxygen atom. (b) Find the speed of the helium atom
before the collision.
3. A beam of helium-3 atoms (m = 3.016 u) is incident on a
target of nitrogen-14 atoms (m = 14.003 u) at rest. During
the collision, a proton from the helium-3 nucleus passes to
the nitrogen nucleus, so that following the collision there
are two atoms: an atom of “heavy hydrogen” (deuterium,
m = 2.014 u) and an atom of oxygen-15 (m = 15.003 u).
The incident helium atoms are moving at a velocity of
$6.346 \times 10^{6}$ m/s. After the collision, the deuterium atoms
are observed to be moving forward (in the same direction as
the initial helium atoms) with a velocity of $1.531 \times 10^{7}$ m/s.
(a) What is the final velocity of the oxygen-15 atoms?
(b) Compare the total kinetic energies before and after the
collision.
4. An atom of beryllium (m = 8.00 u) splits into two atoms of
helium (m = 4.00 u) with the release of 92.2 keV of energy.
If the original beryllium atom is at rest, find the kinetic
energies and speeds of the two helium atoms.
5. A 4.15-volt battery is connected across a parallel-plate
capacitor. Illuminating the plates with ultraviolet light causes
electrons to be emitted from the plates with a speed of
$1.76 \times 10^{6}$ m/s. (a) Suppose electrons are emitted near the
center of the negative plate and travel perpendicular to that
plate toward the opposite plate. Find the speed of the electrons when they reach the positive plate. (b) Suppose instead
that electrons are emitted perpendicular to the positive plate.
Find their speed when they reach the negative plate.
1.2 The Failure of Classical Concepts of Space and Time
6. Observer A, who is at rest in the laboratory, is studying a
particle that is moving through the laboratory at a speed of
0.624c and determines its lifetime to be 159 ns. (a) Observer
A places markers in the laboratory at the locations where
the particle is produced and where it decays. How far apart
are those markers in the laboratory? (b) Observer B, who
is traveling parallel to the particle at a speed of 0.624c,
observes the particle to be at rest and measures its lifetime to
be 124 ns. According to B, how far apart are the two markers
in the laboratory?
1.3 The Failure of the Classical Theory of Particle Statistics
7. A sample of argon gas is in a container at 35.0°C and
1.22 atm pressure. The radius of an argon atom (assumed
spherical) is $0.710 \times 10^{-10}$ m. Calculate the fraction of the
container volume actually occupied by the atoms.
8. By differentiating the expression for the Maxwell-Boltzmann energy distribution, show that the peak of the
distribution occurs at an energy of $\tfrac{1}{2}kT$.
9. A container holds N molecules of nitrogen gas at T = 280 K.
Find the number of molecules with kinetic energies between
0.0300 eV and 0.0312 eV.
10. A sample of 2.37 moles of an ideal diatomic gas experiences a temperature increase of 65.2 K at constant volume.
(a) Find the increase in internal energy if only translational
and rotational motions are possible. (b) Find the increase
in internal energy if translational, rotational, and vibrational
motions are possible. (c) How much of the energy calculated
in (a) and (b) is translational kinetic energy?
General Problems
11. An atom of mass $m_1 = m$ moving in the x direction with
speed $v_1 = v$ collides elastically with an atom of mass
$m_2 = 3m$ at rest. After the collision the first atom moves in
the y direction. Find the direction of motion of the second
atom and the speeds of both atoms (in terms of v) after the
collision.
12. An atom of mass $m_1 = m$ moves in the positive x direction
with speed $v_1 = v$. It collides with and sticks to an atom of
mass $m_2 = 2m$ moving in the positive y direction with speed
$v_2 = 2v/3$. Find the resultant speed and direction of motion
of the combination, and find the kinetic energy lost in this
inelastic collision.
13. Suppose the beryllium atom of Problem 4 were not at rest,
but instead moved in the positive x direction and had a
kinetic energy of 40.0 keV. One of the helium atoms is
found to be moving in the positive x direction. Find the
direction of motion of the second helium, and find the velocity of each of the two helium atoms. Solve this problem in
two different ways: (a) by direct application of conservation
of momentum and energy; (b) by applying the results of
Problem 4 to a frame of reference moving with the original
beryllium atom and then switching to the reference frame in
which the beryllium is moving.
14. Suppose the beryllium atom of Problem 4 moves in the positive x direction and has kinetic energy 60.0 keV. One helium
atom is found to move at an angle of 30° with respect to the
x axis. Find the direction of motion of the second helium
atom and find the velocity of each helium atom. Work this
problem in two ways as you did the previous problem. (Hint:
Consider one helium to be emitted with velocity components
$v_x$ and $v_y$ in the beryllium rest frame. What is the relationship
between $v_x$ and $v_y$? How do $v_x$ and $v_y$ change when we move
in the x direction at speed v?)
15. A gas cylinder contains argon atoms (m = 40.0 u). The temperature is increased from 293 K (20°C) to 373 K (100°C).
(a) What is the change in the average kinetic energy per
atom? (b) The container is resting on a table in the Earth’s
gravity. Find the change in the vertical position of the container that produces the same change in the average energy
per atom found in part (a).
16. Calculate the fraction of the molecules in a gas that are
moving with translational kinetic energies between 0.02kT
and 0.04kT.
17. For a molecule of O2 at room temperature (300 K), calculate
the average angular velocity for rotations about the x′ or y′
axes. The distance between the O atoms in the molecule is
0.121 nm.
Chapter 2
THE SPECIAL THEORY OF RELATIVITY
This 12-foot tall statue of Albert Einstein is located at the headquarters of the National
Academy of Sciences in Washington DC. The page in his hand shows three equations that
he discovered: the fundamental equation of general relativity, which revolutionized our
understanding of gravity; the equation for the photoelectric effect, which opened the path to
the development of quantum mechanics; and the equation for mass-energy equivalence,
which is the cornerstone of his special theory of relativity.
Einstein’s special theory of relativity and Planck’s quantum theory burst forth
on the physics scene almost simultaneously during the first decade of the 20th
century. Both theories caused profound changes in the way we view our universe
at its most fundamental level.
In this chapter we study the special theory of relativity.∗ This theory has
a completely undeserved reputation as being so exotic that few people can
understand it. On the contrary, special relativity is basically a system of kinematics
and dynamics, based on a set of postulates that are different from those of classical
physics. The resulting formalism is not much more complicated than Newton’s
laws, but it does lead to several predictions that seem to go against our common
sense. Even so, the special theory of relativity has been carefully and thoroughly
tested by experiment and found to be correct in all its predictions.
We first review the classical relativity of Galileo and Newton, and then we show
why Einstein proposed to replace it. We then discuss the mathematical aspects of
special relativity, the predictions of the theory, and finally the experimental tests.
2.1 CLASSICAL RELATIVITY
A “theory of relativity” is in effect a way for observers in different frames of
reference to compare the results of their observations. For example, consider
an observer in a car parked by a highway near a large rock. To this observer,
the rock is at rest. Another observer, who is moving along the highway in a car,
sees the rock rush past as the car drives by. To this observer, the rock appears
to be moving. A theory of relativity provides the conceptual framework and
mathematical tools that enable the two observers to transform a statement such
as “rock is at rest” in one frame of reference to the statement “rock is in motion”
in another frame of reference. More generally, relativity gives a means for
expressing the laws of physics in different frames of reference.
The mathematical basis for comparing the two descriptions is called a transformation. Figure 2.1 shows an abstract representation of the situation. Two observers
O and O′ are each at rest in their own frames of reference but move relative to one
another with constant velocity $\vec{u}$. (O and O′ refer both to the observers and their
reference frames or coordinate systems.) They observe the same event, which
happens at a particular point in space and a particular time, such as a collision
between two particles. According to O, the space and time coordinates of the
event are x, y, z, t, while according to O′ the coordinates of the same event are x′,
y′, z′, t′. The two observers use calibrated meter sticks and synchronized clocks,
so any differences between the coordinates of the two events are due to their
different frames of reference and not to the measuring process. We simplify the
discussion by assuming that the relative velocity $\vec{u}$ always lies along the common
xx′ direction, as shown in Figure 2.1, and we let $\vec{u}$ represent the velocity of O′ as
measured by O (and thus O′ would measure velocity $-\vec{u}$ for O).
FIGURE 2.1 Two observers O and O′ observe the same event. O′ moves relative
to O with a constant velocity $\vec{u}$.
∗ The general theory of relativity, which is covered briefly in Chapter 15, deals with “curved”
coordinate systems, in which gravity is responsible for the curvature. Here we discuss the special case
of the more familiar “flat” coordinate systems.
In this discussion we make a particular choice of the kind of reference frames
inhabited by O and O′ . We assume that each observer has the capacity to test
Newton’s laws and finds them to hold in that frame of reference. For example, each
observer finds that an object at rest or moving with a constant velocity remains
in that state unless acted upon by an external force (Newton’s first law, the law
of inertia). Such frames of reference are called inertial frames. An observer in
interstellar space floating in a nonrotating rocket with the engines off would be in
an inertial frame of reference. An observer at rest on the surface of the Earth is
not in an inertial frame, because the Earth is rotating about its axis and orbiting
about the Sun; however, the accelerations associated with those motions are so
small that we can usually regard our reference frame as approximately inertial.
(The noninertial reference frame at the Earth’s surface does produce important
and often spectacular effects, such as the circulation of air around centers of high
or low pressure.) An observer in an accelerating car, a rotating merry-go-round,
or a descending roller coaster is not in an inertial frame of reference!
We now derive the classical or Galilean transformation that relates the
coordinates x, y, z, t to x′ , y′ , z′ , t′ . We assume as a postulate of classical physics
that t = t′ , that is, time is the same for all observers. We also assume for simplicity
that the coordinate systems are chosen so that their origins coincide at t = 0.
Consider an object in O′ at the coordinates x′ , y′ , z′ (Figure 2.2). According to O,
the y and z coordinates are the same as those in O′ . Along the x direction, O would
observe the object at x = x′ + ut. We therefore have the Galilean coordinate
transformation
$$x' = x - ut \qquad y' = y \qquad z' = z \tag{2.1}$$
To find the velocities of the object as observed by O and O′, we take the derivatives
of these expressions with respect to t′ on the left and with respect to t on the
right (which we can do because we have assumed t′ = t). This gives the Galilean
velocity transformation
$$v'_x = v_x - u \qquad v'_y = v_y \qquad v'_z = v_z \tag{2.2}$$
In a similar fashion, we can take the derivatives of Eq. 2.2 with respect to time
and obtain relationships between the accelerations
$$a'_x = a_x \qquad a'_y = a_y \qquad a'_z = a_z \tag{2.3}$$
Equation 2.3 shows again that Newton’s laws hold for both observers. As long as
u is constant (du/dt = 0), the observers measure identical accelerations and agree
on the results of applying $\vec{F} = m\vec{a}$.
FIGURE 2.2 An object or event at point P is at coordinates x′, y′, z′ with
respect to O′. The x coordinate measured by O is x = x′ + ut. The y and z
coordinates in O are the same as those in O′.
Example 2.1
Two cars are traveling at constant speed along a road in the
same direction. Car A moves at 60 km/h and car B moves
at 40 km/h, each measured relative to an observer on the
ground (Figure 2.3a). What is the speed of car A relative
to car B?
Solution
Let O be the observer on the ground, who observes car A
to move at $v_x = 60$ km/h. Assume O′ to be moving with
car B at u = 40 km/h. Then
$$v'_x = v_x - u = 60\ \text{km/h} - 40\ \text{km/h} = 20\ \text{km/h}$$
Figure 2.3b shows the situation as observed by O′.
FIGURE 2.3 Example 2.1. (a) As observed by O at rest on the
ground. (b) As observed by O′ in car B.
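The Galilean transformation of Eqs. 2.1 and 2.2 is simple enough to write as a pair of functions. The sketch below assumes, as in the text, that the relative velocity u lies along the common xx′ direction; the function names are hypothetical, and the usage lines reproduce the numbers of Example 2.1.

```python
def galilean_position(x, y, z, t, u):
    """Eq. 2.1: coordinates in O' of an event at (x, y, z, t) in O."""
    return (x - u * t, y, z)


def galilean_velocity(vx, vy, vz, u):
    """Eq. 2.2: velocity in O' of an object with velocity (vx, vy, vz) in O."""
    return (vx - u, vy, vz)


def inverse_galilean_velocity(vx_prime, vy_prime, vz_prime, u):
    """Eq. 2.2 solved for the velocity measured by O."""
    return (vx_prime + u, vy_prime, vz_prime)


# Example 2.1: car A at 60 km/h seen from car B, which moves at u = 40 km/h.
v_a_relative_to_b = galilean_velocity(60.0, 0.0, 0.0, u=40.0)
print(v_a_relative_to_b)  # (20.0, 0.0, 0.0)

# Transforming back recovers the ground-frame velocity.
print(inverse_galilean_velocity(*v_a_relative_to_b, u=40.0))  # (60.0, 0.0, 0.0)
```

Because u is constant, it drops out when the velocity is differentiated, which is why the accelerations in Eq. 2.3 are the same for both observers.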
Example 2.2
An airplane is flying due east relative to still air at a speed
of 320 km/h. There is a 65 km/h wind blowing toward
the north, as measured by an observer on the ground.
What is the velocity of the plane measured by the ground
observer?
Solution
Let O be the observer on the ground, and let O′ be an
observer who is moving with the wind, for example a
balloonist (Figure 2.4). Then u = 65 km/h, and (because
our equations are set up with $\vec{u}$ in the xx′ direction) we
must choose the xx′ direction to be to the north. In this
case we know the velocity with respect to O′; taking the y
direction to the east, we have $v'_x = 0$ and $v'_y = 320$ km/h.
Using Eq. 2.2 we obtain
$$v_x = v'_x + u = 0 + 65\ \text{km/h} = 65\ \text{km/h}$$
$$v_y = v'_y = 320\ \text{km/h}$$
Relative to the ground, the plane flies in a direction
determined by $\phi = \tan^{-1}[(65\ \text{km/h})/(320\ \text{km/h})] = 11.5°$,
or 11.5° north of east.
FIGURE 2.4 Example 2.2. As observed by O at rest on the
ground, the balloon drifts north with the wind, while the plane
flies north of east.
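The arithmetic of Example 2.2 can be reproduced in a few lines. This is a sketch that follows the axis choice of the solution (x toward the north, y toward the east); the variable names are illustrative, and the ground speed it prints is an extra quantity computed from the same two components.

```python
import math

u = 65.0            # wind speed in km/h, along x (north)
v_prime_x = 0.0     # plane's northward velocity relative to the air, km/h
v_prime_y = 320.0   # plane's eastward velocity relative to the air, km/h

# Eq. 2.2 solved for the ground-frame components.
v_x = v_prime_x + u
v_y = v_prime_y

speed = math.hypot(v_x, v_y)
angle_north_of_east = math.degrees(math.atan2(v_x, v_y))

print(f"ground speed = {speed:.1f} km/h")
print(f"direction = {angle_north_of_east:.1f} degrees north of east")
```

The direction agrees with the 11.5° found above, and the ground speed comes out slightly larger than the 320 km/h airspeed because the wind adds a perpendicular component.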
Example 2.3
A swimmer capable of swimming at a speed c in still water
is swimming in a stream in which the current is u (which
we assume to be less than c). Suppose the swimmer swims
upstream a distance L and then returns downstream to the
starting point. Find the time necessary to make the round
trip, and compare it with the time to swim across the stream
a distance L and return.
Solution
Let the frame of reference of O be the ground and
the frame of reference of O′ be the water, moving at
speed u (Figure 2.5a). The swimmer always moves at
speed c relative to the water, and thus v′x = −c for the
upstream swim. (Remember that u always defines the
positive x direction.) According to Eq. 2.2, v′x = vx − u,
2.2 | The Michelson-Morley Experiment
so vx = v′x + u = u − c. (As expected, the velocity relative to the ground has magnitude smaller than c; it
is also negative, since the swimmer is swimming in
the negative x direction, so |vx | = c − u.) Therefore,
tup = L/(c − u). For the downstream swim, v′x = c, so
vx = u + c, tdown = L/(c + u), and the total time is
29
O
u–c
u+c
O′
u
O′
u
L
L
L(c − u) + L(c + u)
t=
+
=
c+u c−u
c2 − u2
=
2Lc
2L
1
=
2
−u
c 1 − u2 /c2
c2
(a)
(2.4)
O
To swim directly across the stream, the swimmer’s efforts
must be directed somewhat upstream to counter the effect
of the current (Figure 2.5b). That is, in the frame of reference of O we would like to have vx = 0, which requires
v′x = −u according to Eq. 2.2. Since the speed relative to
the water is always c, $\sqrt{v_x'^{2} + v_y'^{2}} = c$; thus
$v_y' = \sqrt{c^2 - v_x'^{2}} = \sqrt{c^2 - u^2}$, and the round-trip time is
$$ t = 2t_{\text{across}} = \frac{2L}{\sqrt{c^2 - u^2}} = \frac{2L}{c}\,\frac{1}{\sqrt{1 - u^2/c^2}} \qquad (2.5) $$
Notice the difference in form between this result and the
result for the upstream-downstream swim, Eq. 2.4.
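The two round-trip times can be compared numerically. The following is a minimal sketch; the function name and the sample values (a 100 m leg, a 2.0 m/s swimmer, a 1.0 m/s current) are illustrative assumptions, not taken from the example.

```python
import math

def round_trip_times(L, c, u):
    """Return (upstream-downstream time, cross-stream time) for a swimmer
    with still-water speed c in a current of speed u < c."""
    if not 0 <= u < c:
        raise ValueError("requires 0 <= u < c")
    t_parallel = L / (c - u) + L / (c + u)       # Eq. 2.4
    t_across = 2 * L / math.sqrt(c**2 - u**2)    # Eq. 2.5
    return t_parallel, t_across

# Hypothetical numbers: L = 100 m, c = 2.0 m/s, u = 1.0 m/s
t_par, t_acr = round_trip_times(100.0, 2.0, 1.0)
print(f"upstream-downstream: {t_par:.1f} s, cross-stream: {t_acr:.1f} s")
print(f"ratio: {t_par / t_acr:.4f}")
```

The ratio of the two times equals 1/√(1 − u²/c²), so the upstream-downstream trip always takes longer whenever the current is nonzero.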
FIGURE 2.5 Example 2.3. The motion of a swimmer as seen
by observer O at rest on the bank of the stream. Observer O′
moves with the stream at speed u.
2.2 THE MICHELSON-MORLEY EXPERIMENT
We have seen how Newton’s laws remain valid with respect to a Galilean transformation that relates the description of the motion of an object in one reference frame
to that in another reference frame. It is then interesting to ask whether the same
transformation rules apply to the motion of a light beam. According to the Galilean
transformation, a light beam moving relative to observer O′ in the x′ direction at
speed c = 299,792,458 m/s would have a speed of c + u relative to O. Direct high-precision measurements of the speed of light beams have become possible in recent
years (as we discuss later in this chapter), but in the 19th century it was necessary
to devise a more indirect measurement of the speed of light according to different
observers in relative motion.
Suppose the swimmer in Example 2.3 is replaced by a light beam. Observer
O′ is in a frame of reference in which the speed of light is c, and the frame of
reference of observer O′ is in motion relative to observer O. What is the speed
of light as measured by observer O? If the Galilean transformation is correct, we
should expect to see a difference between the speed of the light beam according
to O and O′ and therefore a time difference between the upstream-downstream
and cross-stream times, as in Example 2.3.
Albert A. Michelson (1852–1931,
United States). He spent 50 years
doing increasingly precise experiments with light, for which he became
the first U.S. citizen to win the Nobel
Prize in physics (1907).
FIGURE 2.6 (Top) Beam diagram of
Michelson interferometer. Light from
source S is split at A by the half-silvered
mirror; one part is reflected by the
mirror at B and the other is reflected
at C. The beams are then recombined for observation of the interference. (Bottom) Michelson’s apparatus. To improve sensitivity, the beams
were reflected to travel each leg of the
apparatus eight times, rather than just
twice. To reduce vibrations from the
surroundings, the interferometer was
mounted on a 1.5-m square stone slab
floating in a pool of mercury.
Physicists in the 19th century postulated just such a situation—a preferred frame
of reference in which the speed of light has the precise value of c and other frames in
relative motion in which the speed of light would differ, according to the Galilean
transformation. The preferred frame, like that of observer O′ in Example 2.3, is
one that is at rest with respect to the medium in which light propagates at c (like
the water of that example). What is the medium of propagation for light waves? It
was inconceivable to physicists of the 19th century that a wave disturbance could
propagate without a medium (consider mechanical waves such as sound or seismic
waves, for example, which propagate due to mechanical forces in the medium).
They postulated the existence of an invisible, massless medium, called the ether,
which filled all space, was undetectable by any mechanical means, and existed solely
for the propagation of light waves. It seemed reasonable then to obtain evidence
for the ether by measuring the velocity of the Earth moving through the ether. This
could be done in the geometry of Figure 2.5 by measuring the difference between
the upstream-downstream and cross-stream times for a light wave. The calculation
based on Galilean relativity would then give the relative velocity u between O (in
the Earth’s frame of reference) and the ether.
The first detailed and precise search for the preferred frame was performed
in 1887 by the American physicist Albert A. Michelson and his associate
E. W. Morley. Their apparatus consisted of a specially designed Michelson
interferometer, illustrated in Figure 2.6. A monochromatic beam of light is split
in two; the two beams travel different paths and are then recombined. Any
phase difference between the combining beams causes bright and dark bands or
“fringes” to appear, corresponding, respectively, to constructive and destructive
interference, as shown in Figure 2.7.
There are two contributions to the phase difference between the beams. The
first contribution comes from the path difference AB − AC; one of the beams may
travel a longer distance. The second contribution, which would still be present
even if the path lengths were equal, comes from the time difference between the
upstream-downstream and cross-stream paths (as in Example 2.3) and indicates
the motion of the Earth through the ether. Michelson and Morley used a clever
method to isolate this second contribution—they rotated the entire apparatus by
90°! The rotation doesn’t change the first contribution to the phase difference
(because the lengths AB and AC don’t change), but the second contribution
changes sign, because what was an upstream-downstream path before the rotation
becomes a cross-stream path after the rotation. As the apparatus is rotated through
90°, the fringes should change from bright to dark and back again as the phase
difference changes. Each change from bright to dark represents a phase change of
180° (a half cycle), which corresponds to a time difference of a half period (about
10⁻¹⁵ s for visible light). Counting the number of fringe changes thus gives a
measure of the time difference between the paths, which in turn gives the relative
velocity u. (See Problem 3.)
When Michelson and Morley performed their experiment, there was no
observable change in the fringe pattern—they deduced a shift of less than 0.01
fringe, corresponding to a speed of the Earth through the ether of at most 5 km/s.
As a last resort, they reasoned that perhaps the orbital motion of the Earth just
happened to cancel out the overall motion through the ether. If this were true,
six months later (when the Earth would be moving in its orbit in the opposite
direction) the cancellation should not occur. When they repeated the experiment
six months later, they again obtained a null result. In no experiment were
Michelson and Morley able to detect the motion of the Earth through the ether.
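The fringe-counting argument can be sketched numerically using Eqs. 2.4 and 2.5 with the swimmer replaced by light. The effective arm length (11 m), the wavelength (550 nm), and the orbital speed of the Earth (30 km/s) used below are assumed illustrative values, not numbers given in the text, and the function names are hypothetical.

```python
C = 299_792_458.0  # speed of light, m/s

def arm_time_difference(L, u, c=C):
    """Galilean prediction for (upstream-downstream) minus (cross-stream)
    round-trip time over arms of equal length L (Eq. 2.4 minus Eq. 2.5)."""
    beta2 = (u / c) ** 2
    return (2 * L / c) * (1 / (1 - beta2) - 1 / (1 - beta2) ** 0.5)

def fringe_shift(L, u, wavelength, c=C):
    """Expected shift, in fringes, when the apparatus is rotated by 90 degrees.
    The time difference changes sign, so the total change is twice its value."""
    period = wavelength / c
    return 2 * arm_time_difference(L, u, c) / period

def speed_bound(max_shift, L, wavelength, c=C):
    """Largest ether speed compatible with a shift below max_shift,
    using the small-u approximation shift ~ 2 L u^2 / (wavelength c^2)."""
    return c * (max_shift * wavelength / (2 * L)) ** 0.5

# Assumed values: L = 11 m, wavelength = 550 nm, u = 30 km/s
print(f"expected shift: {fringe_shift(11.0, 3.0e4, 550e-9):.2f} fringe")
print(f"bound for a 0.01-fringe limit: {speed_bound(0.01, 11.0, 550e-9) / 1e3:.1f} km/s")
```

Under these assumptions the Galilean prediction is a shift of a few tenths of a fringe, while a limit of 0.01 fringe corresponds to a speed of roughly 5 km/s, in line with the bound quoted above.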
In summary, we have seen that there is a direct chain of reasoning that leads
from Galileo’s principle of inertia, through Newton’s laws with their implicit
assumptions about space and time, ending with the failure of the Michelson-Morley experiment to observe the motion of the Earth relative to the ether.
Although several explanations were offered for the unobservability of the ether
and the corresponding failure of the upstream-downstream and cross-stream
velocities to add in the expected way, the most novel, revolutionary, and ultimately
successful explanation is given by Einstein’s special theory of relativity, which
requires a serious readjustment of our traditional concepts of space and time, and
therefore alters some of the very foundations of physics.
FIGURE 2.7 Interference fringes as
observed with the Michelson interferometer of Figure 2.6. When the path
length ACA changes by one-half wavelength relative to ABA, all light areas
turn dark and all dark areas turn light.
2.3 EINSTEIN’S POSTULATES
The special theory of relativity is based on two postulates proposed by Albert
Einstein in 1905:
The principle of relativity: The laws of physics are the same in all inertial
reference frames.
The principle of the constancy of the speed of light: The speed of light in
free space has the same value c in all inertial reference frames.
The first postulate declares that the laws of physics are absolute, universal, and
the same for all inertial observers. Laws that hold for one inertial observer cannot
be violated for any inertial observer.
The second postulate is more difficult to accept because it seems to go against
our “common sense,” which is based on the Galilean kinematics we observe in
everyday experiences. Consider three observers A, B, and C. Observer B is at rest,
while A and C move away from B in opposite directions each at a speed of c/4. B
fires a light beam in the direction of A. According to the Galilean transformation,
if B measures a speed of c for the light beam, then A measures a speed of
c − c/4 = 3c/4, while C measures a speed of c + c/4 = 5c/4. Einstein’s second
postulate, on the other hand, requires all three observers to measure the same
speed of c for the light beam! This postulate immediately explains the failure of
the Michelson-Morley experiment—the upstream-downstream and cross-stream
speeds are identical (both are equal to c), so there is no phase difference between the
two beams.
The two postulates also allow us to dispose of the ether hypothesis. The first
postulate does not permit a preferred frame of reference (all inertial frames are
equivalent), and the second postulate does not permit only a single frame of
reference in which light moves at speed c, because light moves at speed c in all
frames. The ether, as a preferred reference frame in which light has a unique
speed, is therefore unnecessary.
Albert Einstein (1879–1955, Germany-United States). A gentle philosopher and pacifist, he was the intellectual
leader of two generations of theoretical
physicists and left his imprint on nearly
every field of modern physics.
2.4 CONSEQUENCES OF EINSTEIN’S POSTULATES
Among their many consequences, Einstein’s postulates require a new consideration of the fundamental nature of time and space. In this section we discuss how
the postulates affect measurements of time and length intervals by observers in
different frames of reference.
FIGURE 2.8 The clock ticks at intervals t0 determined by the time for
a light flash to travel the distance 2L0
from the light source S to the mirror
M and back to the source where it
is detected. (We assume the emission
and detection occur at the same location, so the beam travels perpendicular
to the mirror).
The Relativity of Time
To demonstrate the relativity of time, we use the timing device illustrated in
Figure 2.8. It consists of a flashing light source S that is a distance L0 from a
mirror M. A flash of light from the source is reflected by the mirror, and when
the light returns to S the clock ticks and triggers another flash. The time interval
between ticks is the distance 2L0 (assuming the light travels perpendicular to the
mirror) divided by the speed c:
$$ t_0 = \frac{2L_0}{c} \qquad (2.6) $$
This is the time interval that is measured when the clock is at rest with respect to
the observer.
We consider two observers: O is at rest on the ground, and O′ moves with speed
u. Each observer carries a timing device. Figure 2.9 shows a sequence of events that
O observes for the clock carried by O′ . According to O, the flash is emitted when
the clock of O′ is at A, reflected when it is at B, and detected at C. In this interval
t, O observes the clock to move forward a distance of ut from the point at which
the flash was emitted, and O concludes that the light beam travels a distance 2L,
where $L = \sqrt{L_0^2 + (ut/2)^2}$, as shown in Figure 2.9. Because O observes the light
beam to travel at speed c (as required by Einstein’s second postulate) the time
interval measured by O is
$$ t = \frac{2L}{c} = \frac{2\sqrt{L_0^2 + (ut/2)^2}}{c} \qquad (2.7) $$
Substituting for L0 from Eq. 2.6 and solving Eq. 2.7 for t, we obtain
$$ t = \frac{t_0}{\sqrt{1 - u^2/c^2}} \qquad (2.8) $$
FIGURE 2.9 In the frame of reference of O, the clock carried by O′ moves with speed
u. The dashed line, of length 2L, shows the path of the light beam according to O.
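The algebra that leads from Eq. 2.7 to Eq. 2.8 can be checked numerically. The sketch below solves Eq. 2.7 for t by fixed-point iteration (t appears on both sides) and compares the result with the closed form of Eq. 2.8. The clock length of 1.5 m and the speed 0.6c are assumed sample values, and the function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def dilated_interval_closed_form(t0, u, c=C):
    """Eq. 2.8: interval measured by O for a clock moving at speed u."""
    return t0 / math.sqrt(1 - (u / c) ** 2)

def dilated_interval_iterative(L0, u, c=C, tol=1e-12, max_iter=10_000):
    """Solve Eq. 2.7, t = 2*sqrt(L0^2 + (u t / 2)^2) / c, by repeated substitution.
    The iteration converges for u < c because each step shrinks the error
    by a factor of about (u/c)^2."""
    t = 2 * L0 / c  # start from the proper interval, Eq. 2.6
    for _ in range(max_iter):
        t_next = 2 * math.sqrt(L0**2 + (u * t / 2) ** 2) / c
        if abs(t_next - t) <= tol * t_next:
            return t_next
        t = t_next
    raise RuntimeError("iteration did not converge")

L0, u = 1.5, 0.6 * C           # assumed sample values
t0 = 2 * L0 / C                # Eq. 2.6
print(dilated_interval_iterative(L0, u) / t0)    # ratio t / t0 from Eq. 2.7
print(dilated_interval_closed_form(t0, u) / t0)  # ratio t / t0 from Eq. 2.8
```

Both ratios come out close to 1.25, the value of 1/√(1 − 0.6²).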
According to Eq. 2.8, observer O measures a longer time interval than O′ measures.
This is a general result of special relativity, which is known as time dilation. An
observer O′ is at rest relative to a device that produces a time interval t0 . For this
observer, the beginning and end of the time interval occur at the same location,
and so the interval t0 is known as the proper time. An observer O, relative to
whom O′ is in motion, measures a longer time interval t for the same device.
The dilated time interval t is always longer than the proper time interval t0,
no matter what the magnitude or direction of u.
This is a real effect that applies not only to clocks based on light beams but
also to time itself; all clocks run more slowly according to an observer in relative
motion, biological clocks included. Even the growth, aging, and decay of living
systems are slowed by the time dilation effect. However, note that under normal
circumstances (u ≪ c), there is no measurable difference between t and t0 ,
so we don’t notice the effect in our everyday activities. Time dilation has been
verified experimentally with decaying elementary particles as well as with precise
atomic clocks carried aboard aircraft. Some experimental tests are discussed in
the last section of this chapter.
Example 2.4
Muons are elementary particles with a (proper) lifetime of 2.2 μs. They are produced with very high
speeds in the upper atmosphere when cosmic rays (high-energy particles from space) collide with air molecules.
Take the height L0 of the atmosphere to be 100 km in the
reference frame of the Earth, and find the minimum speed
that enables the muons to survive the journey to the surface
of the Earth.
Solution
The birth and decay of the muon can be considered as the
“ticks” of a clock. In the frame of reference of the Earth
(observer O) this clock is moving, and therefore its ticks are
slowed by the time dilation effect. If the muon is moving at
a speed that is close to c, the time necessary for it to travel
from the top of the atmosphere to the surface of the Earth is
$$ t = \frac{L_0}{c} = \frac{100\ \text{km}}{3.00 \times 10^8\ \text{m/s}} = 333\ \mu\text{s} $$
If the muon is to be observed at the surface of the Earth,
it must live for at least 333 μs in the Earth’s frame of
reference. In the muon’s frame of reference, the interval
between its birth and decay is a proper time interval of
2.2 μs. The time intervals are related by Eq. 2.8:
$$ 333\ \mu\text{s} = \frac{2.2\ \mu\text{s}}{\sqrt{1 - u^2/c^2}} $$
Solving, we find
$$ u = 0.999978c $$
If it were not for the time dilation effect, muons would
not survive to reach the Earth’s surface. The observation
of these muons is a direct verification of the time dilation
effect of special relativity.
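The arithmetic of this example can be reproduced in a few lines. This is a sketch under the same approximation the solution makes (the travel time is taken as L0/c because u is close to c); the function name is hypothetical.

```python
import math

C = 3.00e8  # speed of light in m/s, rounded as in the example

def min_speed_fraction(distance, proper_lifetime, c=C):
    """Smallest u/c for which the dilated lifetime (Eq. 2.8) covers the
    travel time distance/c, assuming u is close enough to c to use c
    for the travel time."""
    t_earth = distance / c                 # about 333 microseconds here
    ratio = proper_lifetime / t_earth      # equals sqrt(1 - u^2/c^2)
    return math.sqrt(1 - ratio**2)

beta = min_speed_fraction(100e3, 2.2e-6)
print(f"u = {beta:.6f} c")
```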
The Relativity of Length
For this discussion, the moving timing device of O′ is turned sideways, so that
the light travels parallel to the direction of motion of O′ . Figure 2.10 shows the
sequence of events that O observes for the moving clock. According to O, the
length of the clock (distance between the light source and the mirror) is L; as we
shall see, this length is different from the length L0 measured by O′ , relative to
whom the clock is at rest.
FIGURE 2.10 Here the clock carried by O′ emits its light flash in the direction of
motion.
The flash of light is emitted when the clock of O′ is at A and reaches the mirror
(position B) at time t1 later. In this time interval, the light travels a distance
c t1, equal to the length L of the clock plus the additional distance u t1 that the
mirror moves forward in this interval. That is,
$$ c\,t_1 = L + u\,t_1 \qquad (2.9) $$
The flash of light travels from the mirror to the detector in a time t2 and covers
a distance of c t2 , equal to the length L of the clock less the distance u t2 that
the clock moves forward in this interval:
$$ c\,t_2 = L - u\,t_2 \qquad (2.10) $$
Solving Eqs. 2.9 and 2.10 for t1 and t2 , and adding to find the total time
interval, we obtain
$$ t = t_1 + t_2 = \frac{L}{c-u} + \frac{L}{c+u} = \frac{2L}{c}\,\frac{1}{1 - u^2/c^2} \qquad (2.11) $$
FIGURE 2.11 Some length-contracted objects. Notice that the shortening occurs only in the direction of motion.
From Eq. 2.8,
$$ t = \frac{t_0}{\sqrt{1 - u^2/c^2}} = \frac{2L_0}{c}\,\frac{1}{\sqrt{1 - u^2/c^2}} \qquad (2.12) $$
Setting Eqs. 2.11 and 2.12 equal to one another and solving, we obtain
$$ L = L_0\sqrt{1 - u^2/c^2} \qquad (2.13) $$
Equation 2.13 summarizes the effect known as length contraction. Observer O′ ,
who is at rest with respect to the object, measures the rest length L0 (also known
as the proper length, in analogy with the proper time). All observers relative to
whom O′ is in motion measure a shorter length, but only along the direction of
motion; length measurements transverse to the direction of motion are unaffected
(Figure 2.11).
For ordinary speeds (u ≪ c), the effects of length contraction are too small to
be observed. For example, a rocket of length 100 m traveling at the escape speed
from Earth (11.2 km/s) would appear to an observer on Earth to contract only by
about 70 nm, which is just several hundred atomic diameters!
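The size of the contraction at ordinary speeds can be evaluated directly from Eq. 2.13. In this sketch the function names are hypothetical, and the shortening L0 − L is computed in an algebraically equivalent form that avoids subtracting two nearly equal numbers.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def contracted_length(L0, u, c=C):
    """Eq. 2.13: length measured by an observer relative to whom the object moves."""
    return L0 * math.sqrt(1 - (u / c) ** 2)

def shortening(L0, u, c=C):
    """L0 - L, written as L0 * b2 / (1 + sqrt(1 - b2)) with b2 = (u/c)^2,
    which is numerically stable when u << c."""
    b2 = (u / c) ** 2
    return L0 * b2 / (1 + math.sqrt(1 - b2))

# Rocket of rest length 100 m at the escape speed from Earth, 11.2 km/s
delta = shortening(100.0, 11.2e3)
print(f"shortening: {delta * 1e9:.0f} nm")
```

The result is a few tens of nanometers on a 100 m rocket, far below anything an ordinary length measurement could detect.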
Length contraction suggests that objects in motion are measured to have a
shorter length than they do at rest. The objects do not actually shrink; there is
merely a difference in the length measured by different observers. For example,
to observers on Earth a high-speed rocket ship would appear to be contracted
along its direction of motion (Figure 2.12a), but to an observer on the ship it is
the passing Earth that appears to be contracted (Figure 2.12b).
These representations of length-contracted objects are somewhat idealized.
The actual appearance of a rapidly moving object is determined by the time at
which light leaves the various parts of the object and enters the eye or the camera.
The result is that the object appears distorted in shape and slightly rotated.
Example 2.5
Consider the point of view of an observer who is moving
toward the Earth at the same velocity as the muon. In
this reference frame, what is the apparent thickness of the
Earth’s atmosphere?
Solution
In this observer’s reference frame, the muon is at rest and
the Earth is rushing toward it at a speed of u = 0.999978c,
as we found in Example 2.4. To an observer on the Earth,
the height of the atmosphere is its rest length L0 of 100 km.
To the observer in the muon’s rest frame, the moving Earth
has an atmosphere of height given by Eq. 2.13:
$$ L = L_0\sqrt{1 - u^2/c^2} = (100\ \text{km})\sqrt{1 - (0.999978)^2} = 0.66\ \text{km} = 660\ \text{m} $$
This distance is small enough for the muons to reach the
Earth’s surface within their lifetime.
FIGURE 2.12 (a) The Earth views the
passing contracted rocket. (b) From
the rocket’s frame of reference, the
Earth appears contracted.
Note that what appears as a time dilation in one frame of reference (the observer
on Earth) can be regarded as a length contraction in another frame of reference
(the observer traveling with the muon). For another example of this effect, let’s
review again the example of the pion decay discussed in Section 1.2. A pion at
rest has a lifetime of 26.0 ns. According to observer O1 at rest in the laboratory
frame of reference, a pion moving through the laboratory at a speed of 0.913c
has a longer lifetime, which can be calculated to be 63.7 ns (using Eq. 2.8 for the
time dilation). According to observer O2 , who is traveling through the laboratory
at the same velocity as the pion, the pion appears to be at rest and has its proper
lifetime of 26.0 ns. Thus O1 sees a time dilation effect.
O1 erects two markers in the laboratory, at the locations where the pion is
created and decays. To O1 , the distance between those markers is the pion’s speed
times its lifetime, which works out to be 17.4 m. Suppose O1 places a stick of
length 17.4 m in the laboratory connecting the two markers. That stick is at rest
in the laboratory reference frame and so has its proper length in that frame. In
the reference frame of O2 , the stick is moving at a speed of 0.913c and has a
shorter length of 7.1 m, which we can find using the length contraction formula
(Eq. 2.13). So O2 measures a distance of 7.1 m between the locations in the
laboratory where the pion was created and where it decayed.
Note that O1 measures the proper length and the dilated time, while O2
measures the proper time and the contracted length. The proper time and proper
length must always be referred to specific observers, who might not be in the same
reference frame. The proper time is always measured by an observer according to
whom the beginning of the time interval and the end of the time interval occur at
the same location. If the time interval is the lifetime of the pion, then O2 (relative
to whom the pion does not move) sees its creation and decay at the same location
and thus measures the proper time interval. The proper length, on the other hand,
is always measured by an observer according to whom the measuring stick is at
rest (O1 in this case).
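The pion numbers quoted above can be reproduced with Eqs. 2.8 and 2.13. This is a minimal sketch; the variable names are illustrative and c is rounded to 2.998 × 10⁸ m/s as an assumption.

```python
import math

C = 2.998e8        # speed of light, m/s (rounded, assumed)
beta = 0.913       # pion speed in units of c
tau0 = 26.0e-9     # proper lifetime, s

gamma = 1 / math.sqrt(1 - beta**2)

tau_lab = gamma * tau0          # dilated lifetime seen by O1 (Eq. 2.8)
d_lab = beta * C * tau_lab      # marker separation in the lab, a proper length
d_pion = d_lab / gamma          # contracted separation seen by O2 (Eq. 2.13)

print(f"lifetime in lab:       {tau_lab * 1e9:.1f} ns")
print(f"distance in lab:       {d_lab:.1f} m")
print(f"distance in pion frame: {d_pion:.1f} m")

# Consistency check: O2 sees the lab move past at speed beta*c for the
# proper lifetime tau0, which must give the same contracted distance.
print(f"beta * c * tau0:        {beta * C * tau0:.1f} m")
```

The last line shows that the two descriptions agree: the contracted distance equals the relative speed multiplied by the proper lifetime.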
Example 2.6
An observer O is standing on a platform of length
D0 = 65 m on a space station. A rocket passes at a relative
speed of 0.80c moving parallel to the edge of the platform.
The observer O notes that the front and back of the rocket
simultaneously line up with the ends of the platform at a
particular instant (Figure 2.13a). (a) According to O, what
is the time necessary for the rocket to pass a particular
point on the platform? (b) What is the rest length L0 of the
rocket? (c) According to an observer O′ on the rocket, what
is the length D of the platform? (d) According to O′ , how
long does it take for observer O to pass the entire length
of the rocket? (e) According to O, the ends of the rocket
simultaneously line up with the ends of the platform. Are
these events simultaneous to O′ ?
FIGURE 2.13 Example 2.6. (a) From the reference frame of O
at rest on the platform, the passing rocket lines up simultaneously with the front and back of the platform. (b, c) From the
reference frame O′ in the rocket, the passing platform lines up
first with the front of the rocket and later with the rear. Note
the differing effects of length contraction in the two reference
frames.
Solution
(a) According to O, the length L of the rocket matches the
length D0 of the platform. The time for the rocket to pass a
particular point is measured by O to be
$$ t_0 = \frac{L}{0.80c} = \frac{65\ \text{m}}{2.40 \times 10^8\ \text{m/s}} = 0.27\ \mu\text{s} $$
This is a proper time interval, because O measures the
interval between two events that occur at the same point in
the frame of reference of O (the front of the rocket passes
a point, and then the back of the rocket passes the same
point).
(b) O measures the contracted length L of the rocket. We
can find its proper length L0 using Eq. 2.13:
$$ L_0 = \frac{L}{\sqrt{1 - u^2/c^2}} = \frac{65\ \text{m}}{\sqrt{1 - (0.80)^2}} = 108\ \text{m} $$
(c) According to O the platform is at rest, so 65 m is its
proper length D0. According to O′, the contracted length of
the platform is therefore
$$ D = D_0\sqrt{1 - u^2/c^2} = (65\ \text{m})\sqrt{1 - (0.80)^2} = 39\ \text{m} $$
(d) For O to pass the entire length of the rocket, O′ concludes that O must move a distance equal to its rest length,
or 108 m. The time needed to do this is
$$ t' = \frac{108\ \text{m}}{0.80c} = 0.45\ \mu\text{s} $$
Note that this is not a proper time interval for O′ , who determines this time interval using one clock at the front of the
rocket to measure the time at which O passes the front of the
rocket, and another clock on the rear of the rocket to measure the time at which O passes the rear of the rocket. The
two events therefore occur at different points in O′ and so
cannot be separated by a proper time in O′ . The corresponding time interval measured by O for the same two events,
which we calculated in part (a), is a proper time interval for
O, because the two events do occur at the same point in O.
The time intervals measured by O and O′ should be related
by the time dilation formula, as you should verify.
(e) According to O′ , the rocket has a rest length of L0 =
108 m and the platform has a contracted length of D = 39 m.
There is thus no way that O′ could observe the two ends
of both to align simultaneously. The sequence of events
according to O′ is illustrated in Figures 2.13b and c. The time
interval t′ in O′ between the two events that are simultaneous in O can be calculated by noting that, according to O′ ,
the time interval between the situations shown in Figures
2.13b and c must be that necessary for the platform to move
a distance of 108 m − 39 m = 69 m, which takes a time
$$ t' = \frac{69\ \text{m}}{0.80c} = 0.29\ \mu\text{s} $$
This result illustrates the relativity of simultaneity: two
events at different locations that are simultaneous to O (the
lining up of the two ends of the rocket with the two ends of
the platform) cannot be simultaneous to O′ .
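The five parts of Example 2.6 can be recomputed together, including the time dilation check left to the reader in part (d). This is a sketch with illustrative variable names; c is rounded to 3.00 × 10⁸ m/s as in the example.

```python
import math

C = 3.00e8                 # speed of light, m/s (rounded as in the example)
beta = 0.80                # relative speed in units of c
D0 = 65.0                  # proper length of the platform, m
u = beta * C
root = math.sqrt(1 - beta**2)   # the factor sqrt(1 - u^2/c^2)

t0 = D0 / u                # (a) rocket passes a point on the platform (proper time for O)
L0 = D0 / root             # (b) rest length of the rocket, from Eq. 2.13
D = D0 * root              # (c) contracted platform length according to O'
t_prime = L0 / u           # (d) time for O to pass the rocket, according to O'
dt_sim = (L0 - D) / u      # (e) interval in O' between the two alignment events

print(f"(a) {t0 * 1e6:.2f} us  (b) {L0:.0f} m  (c) {D:.0f} m")
print(f"(d) {t_prime * 1e6:.2f} us  (e) {dt_sim * 1e6:.2f} us")

# Time dilation check for parts (a) and (d): t' = t0 / sqrt(1 - u^2/c^2)
print(math.isclose(t_prime, t0 / root))
```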
Relativistic Velocity Addition
The timing device is now modified as shown in Figure 2.14. A source P emits
particles that travel at speed v′ according to an observer O′ at rest with respect
to the device. The flashing bulb F is triggered to flash when a particle reaches it.
The flash of light makes the return trip to the detector D, and the clock ticks. The
time interval t0 between ticks measured by O′ is composed of two parts: one for
the particle to travel the distance L0 at speed v′ and another for the light to travel
the same distance at speed c:
t0 = L0 /v′ + L0 /c
(2.14)
According to observer O, relative to whom O′ moves at speed u, the sequence of
events is similar to that shown in Figure 2.10. The emitted particle, which travels
at speed v according to O, reaches F in a time interval t1 after traveling the
distance v t1 equal to the (contracted) length L plus the additional distance u t1
moved by the clock in that interval:
v t1 = L + u t1
(2.15)
In the interval t2 , the light beam travels a distance c t2 equal to the length L
less the distance u t2 moved by the clock in that interval:
$$ c\,t_2 = L - u\,t_2 \qquad (2.16) $$
We now solve Eqs. 2.15 and 2.16 for t1 and t2 , add to find the total interval
t between ticks according to O, use the time dilation formula, Eq. 2.8, to relate
this result to t0 from Eq. 2.14, and finally use the length contraction formula,
Eq. 2.13, to relate L to L0 . After doing the algebra, we find the result
$$ v = \frac{v' + u}{1 + v'u/c^2} \qquad (2.17) $$
FIGURE 2.14 In this timing device,
a particle is emitted by P at a speed
v′ . When the particle reaches F, it
triggers the emission of a flash of light
that travels to the detector D.
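The algebra behind Eq. 2.17 can be verified numerically by following the same steps: compute t0 from Eq. 2.14, apply Eqs. 2.8 and 2.13, obtain t2 from Eq. 2.16, and solve Eq. 2.15 for v. The sketch below works in units where c = 1; the function names and sample speeds are assumptions chosen for illustration.

```python
import math

def v_from_clock(v_prime, u, L0=1.0, c=1.0):
    """Speed v of the particle according to O, reconstructed from the
    timing-device argument (Eqs. 2.14, 2.8, 2.13, 2.16, 2.15)."""
    root = math.sqrt(1 - (u / c) ** 2)
    t0 = L0 / v_prime + L0 / c      # Eq. 2.14: tick interval for O'
    t = t0 / root                   # Eq. 2.8: tick interval for O
    L = L0 * root                   # Eq. 2.13: contracted length of the device
    t2 = L / (c + u)                # Eq. 2.16 solved for t2
    t1 = t - t2                     # total interval is t1 + t2
    return L / t1 + u               # Eq. 2.15 solved for v

def v_formula(v_prime, u, c=1.0):
    """Eq. 2.17, the relativistic velocity addition law."""
    return (v_prime + u) / (1 + v_prime * u / c**2)

for v_prime, u in [(0.60, 0.80), (0.10, 0.20), (0.99, 0.50)]:
    print(v_prime, u, v_from_clock(v_prime, u), v_formula(v_prime, u))
```

For each pair of speeds the two columns agree to rounding error, and the result does not depend on the choice of L0.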
Equation 2.17 is the relativistic velocity addition law for velocity components
that are in the direction of u. Later in this chapter we use a different method to
derive the corresponding results for motion in other directions.
We can also regard Eq. 2.17 as a velocity transformation, enabling us to convert
a velocity v′ measured by O′ to a velocity v measured by O. The corresponding
classical law was given by Eq. 2.2: v = v′ + u. The difference between the
classical and relativistic results is the denominator of Eq. 2.17, which reduces to
1 in cases when the speeds are small compared with c. Example 2.7 shows how
this factor prevents the measured speeds from exceeding c.
Equation 2.17 gives an important result when O′ observes a light beam.
For v′ = c,
$$ v = \frac{c + u}{1 + cu/c^2} = c \qquad (2.18) $$
That is, when v′ = c, then v = c, independent of the value of u. All observers
measure the same speed c for light, exactly as required by Einstein’s second
postulate.
Example 2.7
A spaceship moving away from the Earth at a speed of
0.80c fires a missile parallel to its direction of motion
(Figure 2.15). The missile moves at a speed of 0.60c relative to the ship. What is the speed of the missile as measured
by an observer on the Earth?
FIGURE 2.15 Example 2.7. A spaceship moves away from
Earth at a speed of 0.80c. An observer O′ on the spaceship
fires a missile and measures its speed to be 0.60c relative to
the ship.
Solution
Here O′ is on the ship and O is on Earth; O′ moves with
a speed of u = 0.80c relative to O. The missile moves at
speed v′ = 0.60c relative to O′ , and we seek its speed v
relative to O. Using Eq. 2.17, we obtain
$$ v = \frac{v' + u}{1 + v'u/c^2} = \frac{0.60c + 0.80c}{1 + (0.60c)(0.80c)/c^2} = \frac{1.40c}{1.48} = 0.95c $$
According to classical kinematics (the numerator of
Eq. 2.17), an observer on the Earth would see the missile moving at 0.60c + 0.80c = 1.40c, thereby exceeding
the maximum relative speed of c permitted by relativity. You can see how Eq. 2.17 brings about this speed
limit. Even if v′ were 0.9999 . . . c and u were 0.9999 . . . c,
the relative speed v measured by O would remain less
than c.
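The speed limit described here is easy to see by evaluating Eq. 2.17 for a few cases. In this sketch speeds are expressed as fractions of c, and the function name is a hypothetical choice.

```python
def add_velocities(v_prime, u):
    """Eq. 2.17 with all speeds expressed as fractions of c."""
    return (v_prime + u) / (1 + v_prime * u)

# Example 2.7: missile at 0.60c relative to a ship moving at 0.80c
print(f"{add_velocities(0.60, 0.80):.2f} c")

# Both speeds very close to c: the result still stays below c
print(add_velocities(0.9999, 0.9999) < 1.0)

# A light beam (v' = c) is measured at c by O as well, as in Eq. 2.18
print(add_velocities(1.0, 0.80))
```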
The Relativistic Doppler Effect
In the classical Doppler effect for sound waves, an observer moving relative to
a source of waves (sound, for example) detects a frequency different from that
emitted by the source. The frequency f ′ heard by the observer O is related to the
frequency f emitted by the source S according to
$$ f' = f\,\frac{v \pm v_O}{v \mp v_S} \qquad (2.19) $$
where v is the speed of the waves in the medium (such as still air, in the case of
sound waves), vS is the speed of the source relative to the medium, and vO is the
speed of the observer relative to the medium. The upper signs in the numerator
and denominator are chosen whenever S moves toward O or O moves toward S,
while the lower signs apply whenever O and S move away from one another.
The classical Doppler shift for motion of the source differs from that for
motion of the observer. For example, suppose the source emits sound waves at
f = 1000 Hz. If the source moves at 30 m/s toward the observer who is at rest
in the medium (which we take to be air, in which sound moves at v = 340 m/s),
then f ′ = 1097 Hz, while if the source is at rest in the medium and the observer
moves toward the source at 30 m/s, the frequency is 1088 Hz. Other possibilities
in which the relative speed between S and O is 30 m/s, such as each moving
toward the other at 15 m/s, give still different frequencies.
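The three cases mentioned above can be evaluated with Eq. 2.19. In this sketch the function name and the sign convention for its arguments (positive values mean motion toward the other party, measured relative to the medium) are assumptions made for the illustration.

```python
def classical_doppler(f, v, v_observer=0.0, v_source=0.0):
    """Eq. 2.19 with the upper signs: positive v_observer means the observer
    moves toward the source, positive v_source means the source moves toward
    the observer. Use negative values for motion away."""
    return f * (v + v_observer) / (v - v_source)

f, v = 1000.0, 340.0  # emitted frequency in Hz, speed of sound in m/s

print(f"source moving at 30 m/s:   {classical_doppler(f, v, v_source=30.0):.0f} Hz")
print(f"observer moving at 30 m/s: {classical_doppler(f, v, v_observer=30.0):.0f} Hz")
print(f"each moving at 15 m/s:     {classical_doppler(f, v, 15.0, 15.0):.0f} Hz")
```

All three cases share the same relative speed of 30 m/s, yet each gives a different frequency, which is the point of the paragraph above.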
Here we have a situation in which it is not the relative speed of the source and
observer that determines the Doppler shift—it is the speed of each with respect
to the medium. This cannot occur for light waves, since there is no medium
(no “ether”) and no preferred reference frame by Einstein’s first postulate. We
therefore require a different approach to the Doppler effect for light waves, an
approach that does not distinguish between source motion and observer motion,
but involves only the relative motion between the source and the observer.
Consider a source of waves that is at rest in the reference frame of observer
O. Observer O′ moves relative to the source at speed u. We consider the situation
from the frame of reference of O′ , as shown in Figure 2.16. Suppose O observes
the source to emit N waves at frequency f . According to O, it takes an interval
t0 = N/f for these N waves to be emitted; this is a proper time interval in the
frame of reference of O. The corresponding time interval to O′ is t′ , during
which O moves a distance u t′ . The wavelength according to O′ is the total
length interval occupied by these waves divided by the number of waves:
$$ \lambda' = \frac{c\,t' + u\,t'}{N} = \frac{c\,t' + u\,t'}{f\,t_0} \qquad (2.20) $$
The frequency according to O′ is f ′ = c/λ′ , so
$$ f' = f\,\frac{t_0}{t'}\,\frac{1}{1 + u/c} \qquad (2.21) $$
and using the time dilation formula, Eq. 2.8, to relate t′ and t0 , we obtain
$$ f' = f\,\frac{\sqrt{1 - u^2/c^2}}{1 + u/c} = f\,\sqrt{\frac{1 - u/c}{1 + u/c}} \qquad (2.22) $$
This is the formula for the relativistic Doppler shift, for the case in which the
waves are observed in a direction parallel to u. Note that, unlike the classical
formula, it does not distinguish between source motion and observer motion; the
relativistic Doppler effect depends only on the relative speed u between the source
and observer.
FIGURE 2.16 A source of waves, in the reference frame of O, moves
at speed u away from observer O′. In the time t′ (according to O′), O
moves a distance u t′ and emits N waves.
Equation 2.22 assumes that the source and observer are separating. If the
source and observer are approaching one another, replace u by −u in the formula.
Example 2.8
A distant galaxy is moving away from the Earth at such
high speed that the blue hydrogen line at a wavelength
of 434 nm is recorded at 600 nm, in the red range of the
spectrum. What is the speed of the galaxy relative to the
Earth?
Solution
Using Eq. 2.22 with f = c/λ and f ′ = c/λ′ , we obtain
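$$\frac{c}{\lambda'} = \frac{c}{\lambda}\sqrt{\frac{1 - u/c}{1 + u/c}}$$
$$\frac{c}{600\ \text{nm}} = \frac{c}{434\ \text{nm}}\sqrt{\frac{1 - u/c}{1 + u/c}}$$
Solving, we find
$$u/c = 0.31$$
Thus the galaxy is moving away from Earth at a speed
of $0.31c = 9.4 \times 10^{7}$ m/s. Evidence obtained in this way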
indicates that nearly all the galaxies we observe are moving
away from us. This suggests that the universe is expanding,
and is usually taken to provide evidence in favor of the Big
Bang theory of cosmology (see Chapter 15).
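The algebra of Example 2.8 can be checked numerically. The following is a minimal sketch, not part of the original text: the function names `doppler_factor` and `recession_beta` are hypothetical, and the inversion assumes Eq. 2.22 for a receding source, rewritten as $\lambda'/\lambda = \sqrt{(1 + u/c)/(1 - u/c)}$.

```python
import math

def doppler_factor(beta):
    """Return f'/f from Eq. 2.22 for a source receding at beta = u/c."""
    return math.sqrt((1.0 - beta) / (1.0 + beta))

def recession_beta(lam_emitted, lam_observed):
    """Invert Eq. 2.22: (lam'/lam)^2 = (1 + beta)/(1 - beta), solved for beta."""
    r2 = (lam_observed / lam_emitted) ** 2
    return (r2 - 1.0) / (r2 + 1.0)

beta = recession_beta(434.0, 600.0)   # wavelengths in nm, from Example 2.8
print(f"u/c = {beta:.2f}")            # prints "u/c = 0.31"
```

Passing a negative `beta` to `doppler_factor` corresponds to replacing u by −u, the approaching case, and gives a ratio greater than 1 (a blue shift).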
2.5 THE LORENTZ TRANSFORMATION
We have seen that the Galilean transformation of coordinates, time, and velocity
is not consistent with Einstein’s postulates. Although the Galilean transformation
agrees with our “common-sense” experience at low speeds, it does not agree with
experiment at high speeds. We therefore need a new set of transformation equations
that replaces the Galilean set and that is capable of predicting such relativistic effects
as time dilation, length contraction, velocity addition, and the Doppler shift.
As before, we seek a transformation that enables observers O and O′ in
relative motion to compare their measurements of the space and time coordinates
of the same event. The transformation equations relate the measurements of O
(namely, x, y, z, t) to those of O′ (namely, x′ , y′ , z′ , t′ ). This new transformation
must have several properties: It must be linear (depending only on the first power
of the space and time coordinates), which follows from the homogeneity of space
and time; it must be consistent with Einstein’s postulates; and it must reduce to the
Galilean transformation when the relative speed between O and O′ is small. We
again assume that the velocity of O′ relative to O is in the positive xx′ direction.
This new transformation consistent with special relativity is called the Lorentz
transformation∗ . Its equations are
$$x' = \frac{x - ut}{\sqrt{1 - u^2/c^2}} \tag{2.23a}$$
$$y' = y \tag{2.23b}$$
$$z' = z \tag{2.23c}$$
$$t' = \frac{t - (u/c^2)x}{\sqrt{1 - u^2/c^2}} \tag{2.23d}$$
∗ H. A. Lorentz (1853–1928) was a Dutch physicist who shared the 1902 Nobel Prize for his work
on the influence of magnetic fields on light. In an unsuccessful attempt to explain the failure of the
Michelson-Morley experiment, Lorentz developed the transformation equations that are named for
him in 1904, a year before Einstein published his special theory of relativity. For a derivation of the
Lorentz transformation, see R. Resnick and D. Halliday, Basic Concepts in Relativity (New York,
Macmillan, 1992).
It is often useful to write these equations in terms of intervals of space and time
by replacing each coordinate by the corresponding interval (replace x by $\Delta x$, x′
by $\Delta x'$, t by $\Delta t$, t′ by $\Delta t'$).
These equations are written assuming that O′ moves away from O in the xx′
direction. If O′ moves toward O, replace u with −u in the equations.
The first three equations reduce directly to the Galilean transformation for
space coordinates, Eqs. 2.1, when u ≪ c. The fourth equation, which links the
time coordinates, reduces to t′ = t, which is a fundamental postulate of the
Galilean-Newtonian world.
We now use the Lorentz transformation equations to derive some of the
predictions of special relativity. The problems at the end of the chapter guide you
in some other derivations. The results derived here are identical with those we
obtained previously using Einstein’s postulates, which shows that the equations of
the Lorentz transformation are consistent with the postulates of special relativity.
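A short numerical sketch can illustrate the low-speed limit described above. The function names `lorentz` and `galilean`, and the sample event (1 km from the origin, 1 ms after t = 0, frame speed 300 m/s), are illustrative assumptions rather than anything taken from the text; only the x and t equations (2.23a and 2.23d) are coded, since y and z are unchanged.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def lorentz(x, t, u):
    """Eqs. 2.23a and 2.23d: (x', t') for O' moving at +u along the common x axis."""
    gamma = 1.0 / math.sqrt(1.0 - (u / C) ** 2)
    return gamma * (x - u * t), gamma * (t - u * x / C**2)

def galilean(x, t, u):
    """Galilean transformation (Eqs. 2.1) with absolute time t' = t."""
    return x - u * t, t

# Everyday speed: the two transformations agree far below measurement precision.
x_rel, t_rel = lorentz(1000.0, 1e-3, 300.0)
x_gal, t_gal = galilean(1000.0, 1e-3, 300.0)
print(abs(x_rel - x_gal), abs(t_rel - t_gal))

# A light pulse (x = C*t) still travels at C according to O' moving at 0.6c.
xp, tp = lorentz(C * 1.0, 1.0, 0.6 * C)
print(xp / tp)
```

The second check is a direct consequence of Einstein's second postulate: the transformed event still satisfies x′ = ct′.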
Length Contraction
A rod of length L0 is at rest in the reference frame of observer O′ . The rod extends
along the x′ axis from x′1 to x′2 ; that is, O′ measures the proper length L0 = x′2 − x′1 .
Observer O, relative to whom the rod is in motion, measures the ends of the rod
to be at coordinates x1 and x2 . For O to determine the length of the moving rod,
O must make a simultaneous determination of x1 and x2 , and then the length is
L = x2 − x1 . Suppose the first event is O′ setting off a flash bulb at one end of the
rod at x′1 and t1′ , which O observes at x1 and t1 , and the second event is O′ setting
off a flash bulb at the other end at x′2 and t2′ , which O observes at x2 and t2 . The
equations of the Lorentz transformation relate these coordinates, specifically,
$$x'_1 = \frac{x_1 - ut_1}{\sqrt{1 - u^2/c^2}} \qquad x'_2 = \frac{x_2 - ut_2}{\sqrt{1 - u^2/c^2}} \tag{2.24}$$
Subtracting these equations, we obtain
$$x'_2 - x'_1 = \frac{x_2 - x_1}{\sqrt{1 - u^2/c^2}} - \frac{u(t_2 - t_1)}{\sqrt{1 - u^2/c^2}} \tag{2.25}$$
O′ must arrange to set off the flash bulbs so that the flashes appear to be
simultaneous to O. (They will not be simultaneous to O′ , as we discuss later in this
section.) This enables O to make a simultaneous determination of the coordinates
of the endpoints of the rod. If O observes the flashes to be simultaneous, then
t2 = t1 , and Eq. 2.25 reduces to
$$x'_2 - x'_1 = \frac{x_2 - x_1}{\sqrt{1 - u^2/c^2}} \tag{2.26}$$
With $x'_2 - x'_1 = L_0$ and $x_2 - x_1 = L$, this becomes
$$L = L_0\sqrt{1 - u^2/c^2} \tag{2.27}$$
which is identical with Eq. 2.13, which we derived earlier using Einstein’s
postulates.
Velocity Transformation
If O observes a particle to travel with velocity v (components vx , vy , vz ), what
velocity v′ does O′ observe for the particle? The relationship between the velocities
measured by O and O′ is given by the Lorentz velocity transformation:
$$v'_x = \frac{v_x - u}{1 - v_x u/c^2} \tag{2.28a}$$
$$v'_y = \frac{v_y\sqrt{1 - u^2/c^2}}{1 - v_x u/c^2} \tag{2.28b}$$
$$v'_z = \frac{v_z\sqrt{1 - u^2/c^2}}{1 - v_x u/c^2} \tag{2.28c}$$
By solving Eq. 2.28a for vx , you can show that it is identical to Eq. 2.17, a result
we derived previously based on Einstein’s postulates. Note that, in the limit of
low speeds (u ≪ c), the Lorentz velocity transformation reduces to the Galilean
velocity transformation, Eq. 2.2. Note also that $v'_y \neq v_y$, even though y′ = y. This
occurs because of the way the Lorentz transformation handles the time coordinate.
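The Lorentz velocity transformation is easy to experiment with numerically. The sketch below is an illustration under stated assumptions: the helper name `transform_velocity` is hypothetical, all speeds are expressed as fractions of c, and only the x and y components (Eqs. 2.28a and 2.28b) are coded. It transforms a light beam moving along y in the frame of O and confirms that its speed is still c for O′, even though its y component changes.

```python
import math

def transform_velocity(vx, vy, u):
    """Eqs. 2.28a and 2.28b with every speed given as a fraction of c."""
    denom = 1.0 - vx * u
    return (vx - u) / denom, vy * math.sqrt(1.0 - u**2) / denom

# A light beam along y in O (vx = 0, vy = 1), seen from O' moving at u = 0.6.
vxp, vyp = transform_velocity(0.0, 1.0, 0.6)
speed = math.hypot(vxp, vyp)
print(vxp, vyp, speed)
```

In this case the y component drops from 1 to about 0.8 while the x component becomes −0.6, so the magnitude remains 1 (that is, c).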
We can derive these transformation equations for velocity from the Lorentz
coordinate transformation. By way of example, we derive the velocity transformation for v′y = dy′ /dt′ . Differentiating the coordinate transformation y′ = y,
we obtain dy′ = dy. Similarly, differentiating the time coordinate transformation
(Eq. 2.23d), we obtain
$$dt' = \frac{dt - (u/c^2)\,dx}{\sqrt{1 - u^2/c^2}}$$
So
$$\begin{aligned}
v'_y = \frac{dy'}{dt'} &= \frac{dy}{[dt - (u/c^2)\,dx]/\sqrt{1 - u^2/c^2}} = \sqrt{1 - u^2/c^2}\,\frac{dy}{dt - (u/c^2)\,dx} \\
&= \sqrt{1 - u^2/c^2}\,\frac{dy/dt}{1 - (u/c^2)\,dx/dt} = \frac{v_y\sqrt{1 - u^2/c^2}}{1 - uv_x/c^2}
\end{aligned}$$
Similar methods can be used to obtain the transformation equations for $v'_x$ and
$v'_z$. These derivations are left as exercises (Problem 14).
Simultaneity and Clock Synchronization
FIGURE 2.17 A flash of light, emitted
from a point midway between the two
clocks, starts the two clocks simultaneously according to O. Observer O′
sees clock 2 start ahead of clock 1. (Clock 1 is at x = 0, the flash lamp at x = L/2, and clock 2 at x = L.)
Under ordinary circumstances, synchronizing one clock with another is a simple
matter. But for scientific work, where timekeeping at a precision below the
nanosecond range is routine, clock synchronization can present some significant
challenges. At the very least, we need to correct for the time that it takes for the signal
showing the reading on one clock to be transmitted to the other clock. However,
for observers who are in motion with respect to each other, special relativity gives
yet another way that clocks may appear to be out of synchronization.
Consider the device shown in Figure 2.17. Two clocks are located at x = 0 and
x = L. A flash lamp is located at x = L/2, and the clocks are set running when they
receive the flash of light from the lamp. The light takes the same interval of time
to reach the two clocks, so the clocks start together precisely at a time L/2c after
the flash is emitted, and the clocks are exactly synchronized.
Now let us examine the same situation from the point of view of the moving
observer O′ . In the frame of reference of O, two events occur: the receipt of a
light signal by clock 1 at x1 = 0, t1 = L/2c and the receipt of a light signal by
clock 2 at x2 = L, t2 = L/2c. Using Eq. 2.23d, we find that O′ observes clock 1 to
receive its signal at
$$t'_1 = \frac{t_1 - (u/c^2)x_1}{\sqrt{1 - u^2/c^2}} = \frac{L/2c}{\sqrt{1 - u^2/c^2}} \tag{2.29}$$
while clock 2 receives its signal at
$$t'_2 = \frac{t_2 - (u/c^2)x_2}{\sqrt{1 - u^2/c^2}} = \frac{L/2c - (u/c^2)L}{\sqrt{1 - u^2/c^2}} \tag{2.30}$$
Thus $t'_2$ is smaller than $t'_1$ and clock 2 appears to receive its signal earlier than
clock 1, so that the clocks start at times that differ by
$$\Delta t' = t'_1 - t'_2 = \frac{uL/c^2}{\sqrt{1 - u^2/c^2}} \tag{2.31}$$
according to O′ . Keep in mind that this is not a time dilation effect—time
dilation comes from the first term of the Lorentz transformation (Eq. 2.23d) for t′ ,
while the lack of synchronization arises from the second term. O′ observes both
clocks to run slow, due to time dilation; O′ also observes clock 2 to be ahead of
clock 1.
We therefore reach the following conclusion: two events that are simultaneous
in one reference frame are not simultaneous in another reference frame moving
with respect to the first, unless the two events occur at the same point in space.
(If L = 0, Eq. 2.31 shows that the clocks are synchronized in all reference
frames.) Clocks that appear to be synchronized in one frame of reference will not
necessarily be synchronized in another frame of reference in relative motion.
It is important to note that this clock synchronization effect does not depend
on the location of observer O′ but only on the velocity of O′ . In Figure 2.17, the
location of O′ could have been drawn far to the left side of clock 1 or far to the
right side of clock 2, and the result would be the same. In those different locations,
the propagation time of the light signal showing clock 1 starting will differ from
the propagation time of the light signal showing clock 2 starting. However, O′
is assumed to be an “intelligent” observer who is aware of the locations where
the light signals showing the two clocks starting are received relative to the
locations of the clocks. O′ corrects for this time difference, which is due only to
the propagation time of the light signals, and even after making that correction
the clocks still do not appear to be synchronized!
Although the location of O′ does not appear in Eq. 2.31, the direction of
the velocity of O′ is important—if O′ is moving in the opposite direction, the
observed starting order of the two clocks is reversed.
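The clock-start gap can be reproduced directly from the time transformation. The sketch below is only an illustration: it assumes units in which c = 1, and the names `t_prime` and `start_time_gap` as well as the sample values L = 2 and u = 0.6 are hypothetical choices. It compares the result of applying Eq. 2.23d to both events with the closed form of Eq. 2.31, and shows that reversing the direction of u reverses the sign of the gap.

```python
import math

def t_prime(x, t, u):
    """Eq. 2.23d with c = 1: time of the event (x, t) according to O'."""
    return (t - u * x) / math.sqrt(1.0 - u**2)

def start_time_gap(L, u):
    """t1' - t2' for clocks at x = 0 and x = L that both start at t = L/2 in O."""
    t = L / 2.0
    return t_prime(0.0, t, u) - t_prime(L, t, u)

L, u = 2.0, 0.6
gap = start_time_gap(L, u)               # from the transformation of each event
eq_2_31 = u * L / math.sqrt(1.0 - u**2)  # closed form, Eq. 2.31 with c = 1
print(gap, eq_2_31, start_time_gap(L, -u))
```

Setting `L = 0.0` gives a gap of zero, matching the remark that events at the same point in space are simultaneous in all frames.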
Example 2.9
Two rockets are leaving their space station along perpendicular paths, as measured by an observer on the space
station. Rocket 1 moves at 0.60c and rocket 2 moves at
0.80c, both measured relative to the space station. What is
the velocity of rocket 2 as observed by rocket 1?
Solution
Observer O is the space station, observer O′ is rocket 1
(moving at u = 0.60c), and each observes rocket 2, moving
(according to O) in a direction perpendicular to rocket 1.
We take this to be the y direction of the reference frame of
O. Thus O observes rocket 2 to have velocity components
$v_x = 0$, $v_y = 0.80c$, as shown in Figure 2.18a.
We can find $v'_x$ and $v'_y$ using the Lorentz velocity transformation:
$$v'_x = \frac{v_x - u}{1 - v_x u/c^2} = \frac{0 - 0.60c}{1 - 0(0.60c)/c^2} = -0.60c$$
$$v'_y = \frac{v_y\sqrt{1 - u^2/c^2}}{1 - v_x u/c^2} = \frac{0.80c\sqrt{1 - (0.60c)^2/c^2}}{1 - 0(0.60c)/c^2} = 0.64c$$
Thus, according to O′, the situation looks like Figure 2.18b.
The speed of rocket 2 according to O′ is
$\sqrt{(0.60c)^2 + (0.64c)^2} = 0.88c$, less than c. According to
the Galilean transformation, $v'_y$ would be identical with $v_y$,
and thus the speed would be $\sqrt{(0.60c)^2 + (0.80c)^2} = c$.
Once again, the Lorentz transformation prevents relative
speeds from reaching or exceeding the speed of light.
FIGURE 2.18 Example 2.9. (a) As viewed from the reference
frame of O: rocket 1 moves at u = 0.60c and rocket 2 has $v_x = 0$, $v_y = 0.80c$.
(b) As viewed from the reference frame of O′: O moves at u = −0.60c and rocket 2 has $v'_x = -0.60c$, $v'_y = 0.64c$.
Example 2.10
In Example 2.6, two events that were simultaneous to O
(the lining up of the front and back of the rocket ship with
the ends of the platform) were not simultaneous to O′ . Find
the time interval between these events according to O′ .
Solution
According to O, the two simultaneous events are separated
by a distance of L = 65 m. For u = 0.80c, Eq. 2.31 gives
$$\Delta t' = \frac{uL/c^2}{\sqrt{1 - u^2/c^2}} = \frac{(0.80)(65\ \text{m})/(3.00 \times 10^{8}\ \text{m/s})}{\sqrt{1 - (0.80)^2}} = 0.29\ \mu\text{s}$$
which agrees with the result calculated in part (e) of
Example 2.6.
2.6 THE TWIN PARADOX
We now turn briefly to what has become known as the twin paradox. Suppose there
is a pair of twins on Earth. One, whom we shall call Casper, remains on Earth, while
his twin sister Amelia sets off in a rocket ship on a trip to a distant planet. Casper,
based on his understanding of special relativity, knows that his sister’s clocks will
run slow relative to his own and that therefore she should be younger than he when
she returns, as our discussion of time dilation would suggest. However, recalling
that discussion, we know that for two observers in relative motion, each thinks the
other’s clocks are running slow. We could therefore study this problem from the
point of view of Amelia, according to whom Casper and the Earth (accompanied
by the solar system and galaxy) make a round-trip journey away from her and
back again. Under such circumstances, she will think it is her brother’s clocks
(which are now in motion relative to her own) that are running slow, and will
therefore expect her brother to be younger than she when they meet again. While
it is possible to disagree over whose clocks are running slow relative to his or her
own, which is merely a problem of frames of reference, when Amelia returns to
Earth (or when the Earth returns to Amelia), all observers must agree as to which
twin has aged less rapidly. This is the paradox—each twin expects the other to be
younger.
The resolution of this paradox lies in considering the asymmetric role of the
two twins. The laws of special relativity apply only to inertial frames, those
moving relative to one another at constant velocity. We may supply Amelia’s
rockets with sufficient thrust so that they accelerate for a very short length of time,
bringing the ship to a speed at which it can coast to the planet, and thus during
her outward journey Amelia spends all but a negligible amount of time in a frame
of reference moving at constant speed relative to Casper. However, in order to
return to Earth, she must decelerate and reverse her motion. Although this also
may be done in a very short time interval, Amelia’s return journey occurs in a
completely different inertial frame than her outward journey. It is Amelia’s jump
from one inertial frame to another that causes the asymmetry in the ages of the
twins. Only Amelia has the necessity of jumping to a new inertial frame to return,
and therefore all observers will agree that it is Amelia who is “really” in motion,
and that it is her clocks that are “really” running slow; therefore she is indeed the
younger twin on her return.
Let us make this discussion more quantitative with a numerical example. We
assume, as discussed above, that the acceleration and deceleration take negligible
time intervals, so that all of Amelia’s aging is done during the coasting. For
simplicity, we assume the distant planet is at rest relative to the Earth; this does
not change the problem, but it avoids the need to introduce yet another frame of
reference. Suppose the planet to be 6 light-years distant from Earth, and suppose
Amelia travels at a speed of 0.6c. Then according to Casper it takes his sister
10 years (10 years ×0.6c = 6 light-years) to reach the planet and 10 years to
return, and therefore she is gone for a total of 20 years. (However, Casper doesn’t
know his sister has reached the planet until the light signal carrying news of her
arrival reaches Earth. Since light takes 6 years to make the journey, it is 16 years
after her departure when Casper sees his sister’s arrival at the planet. Four years
later she returns to Earth.) From the frame of reference of Amelia aboard the
rocket, the distance to the planet is contracted by a factor of $\sqrt{1 - (0.6)^2} = 0.8$,
and is therefore 0.8 × 6 light-years = 4.8 light-years. At a speed of 0.6c, Amelia
will measure 8 years for the trip to the planet, for a total round trip time of 16 years.
Thus Casper ages 20 years while Amelia ages only 16 years and is indeed the
younger on her return.
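The bookkeeping in this example is simple enough to script. The following sketch assumes, as the text does, that acceleration phases take negligible time; the function name `twin_trip` is hypothetical, distances are in light-years, and the speed is a fraction of c so that times come out in years.

```python
import math

def twin_trip(distance_ly, beta):
    """Return (Casper's elapsed years, Amelia's elapsed years) for a round trip."""
    earth_years = 2.0 * distance_ly / beta
    contracted = distance_ly * math.sqrt(1.0 - beta**2)  # distance in Amelia's frame
    ship_years = 2.0 * contracted / beta
    return earth_years, ship_years

casper, amelia = twin_trip(6.0, 0.6)
print(f"Casper ages {casper:.0f} years, Amelia ages {amelia:.0f} years")
# prints "Casper ages 20 years, Amelia ages 16 years"
```

The ratio of the two elapsed times is always $\sqrt{1 - u^2/c^2}$, here 0.8, whatever distance is chosen.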
We can confirm this analysis by having Casper send a light signal to his sister
each year on his birthday. We know that the frequency of the signal as received
by Amelia will be Doppler shifted. During the outward journey, she will receive
signals at the rate of
$$(1/\text{year})\sqrt{\frac{1 - u/c}{1 + u/c}} = 0.5/\text{year}$$
During the return journey, the Doppler-shifted rate will be
$$(1/\text{year})\sqrt{\frac{1 + u/c}{1 - u/c}} = 2/\text{year}$$
Thus for the first 8 years, during Amelia’s trip to the planet, she receives 4 signals,
and during the return trip of 8 years, she receives 16 signals, for a total of 20. She
receives 20 signals, indicating her brother has celebrated 20 birthdays during her
16-year journey.
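The signal count can also be obtained without the Doppler formula, by tracking each birthday signal in Casper's frame. The sketch below is an illustration only: it assumes c = 1 (years and light-years), instantaneous turnaround, and hypothetical names (`arrival_time`, `T_TURN`). Exact fractions are used so that the signal arriving exactly at turnaround is classified without rounding error.

```python
from fractions import Fraction

BETA = Fraction(3, 5)   # Amelia's speed as a fraction of c
D = 6                   # Earth-planet distance in light-years
T_TURN = D / BETA       # Casper's clock reading at turnaround (10 years)

def arrival_time(n):
    """Casper-frame time at which the signal sent on birthday n reaches Amelia."""
    t_out = n / (1 - BETA)            # outbound leg: solves t - n = BETA * t
    if t_out <= T_TURN:
        return t_out
    return (n + 2 * D) / (1 + BETA)   # return leg: solves t - n = 2*D - BETA * t

arrivals = [arrival_time(n) for n in range(1, 21)]
outbound = sum(1 for t in arrivals if t <= T_TURN)
inbound = len(arrivals) - outbound
print(outbound, inbound)              # prints "4 16"
```

The 4th signal reaches Amelia exactly at turnaround and the 20th exactly at her return, in agreement with the rates of 0.5 and 2 signals per year over her two 8-year legs.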
Spacetime Diagrams
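FIGURE 2.19 A spacetime diagram. (Axes: time, vertical; distance, horizontal. Worldlines shown: a particle at rest, a particle with v < c, and a light beam at 45°.)
FIGURE 2.20 Casper’s spacetime diagram, showing his worldline and
Amelia’s. (Axes: time in years, vertical; distance in light-years, horizontal. Amelia’s worldline makes an angle of tan⁻¹(0.6) with the vertical; the light signals are drawn at 45°.)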
A particularly helpful way of visualizing the journeys of Casper and Amelia uses
a spacetime diagram. Figure 2.19 shows an example of a spacetime diagram for
motion that involves only one spatial direction.
In your introductory physics course, you probably became familiar with
plotting motion on a graph in which distance appeared on the vertical axis and
time on the horizontal axis. On such a graph, a straight line represents motion
at constant velocity; the slope of the line is equal to the velocity. Note that the
axes of the spacetime diagram are switched from the traditional graph of particle
motion, with time on the vertical axis and space on the horizontal axis.
On a spacetime diagram, the graph that represents the motion of a particle is
called its worldline. The inverse of the slope of the particle’s worldline gives its
velocity. Equivalently, the velocity is given by the tangent of the angle that the
worldline makes with the vertical axis (rather than with the horizontal axis, as
would be the case with a conventional plot of distance vs. time). Usually, the units
of x and t are chosen so that motion at the speed of light is represented by a line
with a 45° slope. A vertical line represents a particle that is at the same spatial
location at all times; that is, a particle at rest. Permitted motions with constant
velocity are then represented by straight lines between the vertical and the 45°
line representing the maximum velocity.
Let’s draw the worldlines of Casper and Amelia according to Casper’s frame
of reference. Casper’s worldline is a vertical line, because he is at rest in this frame
(Figure 2.20). In Casper’s frame of reference, 20 years pass between Amelia’s
departure and her return, so we can follow Casper’s vertical worldline for
20 years.
Amelia is traveling at a speed of 0.6c, so her worldline makes an angle with
the vertical whose tangent is 0.6 (31°). In Casper’s frame of reference, the planet
visited by Amelia is 6 light-years from Earth. Amelia travels a distance of 6
light-years in a time of 10 years (according to Casper) so that v = 6 light-years/10
years = 0.6c.
The birthday signals that Casper sends to Amelia at the speed of light are
represented by the series of 45° lines in Figure 2.20. Amelia receives 4 birthday
signals during her outbound journey (the 4th arrives just as she reaches the planet)
and 16 birthday signals during her return journey (the 16th is sent and received
just as she returns to Earth).
It is left as an exercise (Problems 22 and 24) to consider the situation if it is
Amelia who is sending the signals.
2.7 RELATIVISTIC DYNAMICS
We have seen how Einstein’s postulates have led to a new “relative” interpretation
of such previously absolute concepts as length and time, and that the classical
concept of absolute velocity is not valid. It is reasonable then to ask how far this
revolution is to go in changing our interpretation of physical concepts. Dynamical
quantities, such as momentum and kinetic energy, depend on length, time, and
velocity. Do classical laws of momentum and energy conservation remain valid
in Einstein’s relativity?
Let’s test the conservation laws by examining the collision shown in
Figure 2.21a. Two particles collide elastically as observed in the reference
frame of O′ . Particle 1 of mass m1 = 2m is initially at rest, and particle 2 of
mass m2 = m is moving in the negative x direction with an initial velocity of
v′2i = −0.750c. Using the classical law of momentum conservation to analyze
this collision, O′ would calculate the particles to be moving with final velocities
v′1f = −0.500c and v′2f = +0.250c. According to O′ , the total initial and final
momenta of the particles would be:
p′i = m1 v′1i + m2 v′2i = (2m)(0) + (m)(−0.750c) = −0.750mc
p′f = m1 v′1f + m2 v′2f = (2m)(−0.500c) + (m)(0.250c) = −0.750mc
The initial and final momenta are equal according to O′ , demonstrating that
momentum is conserved.
FIGURE 2.21 (a) A collision between two particles as observed from the reference frame of O′.
(b) The same collision observed from the reference frame of O.
Suppose that the reference frame of O′ moves at a velocity of u = +0.550c in
the x direction relative to observer O, as in Figure 2.21b. How would observer
O analyze this collision? We can find the initial and final velocities of the two
particles according to O using the velocity transformation of Eq. 2.17, which gives
the initial velocities shown in the figure ($v_{1i} = +0.550c$ and $v_{2i} = -0.340c$)
and the final velocities $v_{1f} = +0.069c$
and $v_{2f} = +0.703c$. Observer O can now calculate the initial and final values of
the total momentum of the two particles:
pi = m1 v1i + m2 v2i = (2m)(+0.550c) + (m)(−0.340c) = +0.760mc
pf = m1 v1f + m2 v2f = (2m)(+0.069c) + (m)(+0.703c) = +0.841mc
Momentum is therefore not conserved according to observer O.
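The numbers in this argument can be reproduced with a few lines of code. This is a sketch under stated assumptions: speeds are fractions of c, masses are in units of m, Eq. 2.17 is taken in the form v = (v′ + u)/(1 + v′u/c²), and the helper names `to_frame_O` and `classical_momentum` are hypothetical.

```python
def to_frame_O(v_prime, u):
    """Eq. 2.17 with speeds in units of c: velocity in O from velocity in O'."""
    return (v_prime + u) / (1.0 + v_prime * u)

def classical_momentum(masses, velocities):
    """Total classical momentum, sum of m*v, in units of mc."""
    return sum(m * v for m, v in zip(masses, velocities))

masses = (2.0, 1.0)   # particle 1 has mass 2m, particle 2 has mass m
u = 0.550
before = [to_frame_O(v, u) for v in (0.0, -0.750)]   # O' initial velocities
after = [to_frame_O(v, u) for v in (-0.500, 0.250)]  # O' final velocities (classical)
p_i = classical_momentum(masses, before)
p_f = classical_momentum(masses, after)
print(f"p_i = {p_i:.3f} mc, p_f = {p_f:.3f} mc")
# prints "p_i = 0.760 mc, p_f = 0.841 mc"
```

The mismatch of about 0.08mc is far larger than any rounding in the quoted velocities, so it reflects a genuine failure of the classical definition rather than arithmetic noise.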
This collision experiment has shown that the law of conservation of linear
momentum, with momentum defined as $\vec{p} = m\vec{v}$, does not satisfy Einstein’s first
postulate (the law must be the same in all inertial frames). We cannot have a
law that is valid for some observers but not for others. Therefore, if we are to
retain the conservation of momentum as a general law consistent with Einstein’s
first postulate, we must find a new definition of momentum. This new definition
of momentum must have two properties: (1) It must yield a law of conservation
of momentum that satisfies the principle of relativity; that is, if momentum is
conserved according to an observer in one inertial frame, then it is conserved
according to observers in all inertial frames. (2) At low speeds, the new definition
must reduce to $\vec{p} = m\vec{v}$, which we know works perfectly well in the nonrelativistic
case.
These requirements are satisfied by defining the relativistic momentum for a
particle of mass m moving with velocity $\vec{v}$ as
$$\vec{p} = \frac{m\vec{v}}{\sqrt{1 - v^2/c^2}} \tag{2.32}$$
In terms of components, we can write Eq. 2.32 as
$$p_x = \frac{mv_x}{\sqrt{1 - v^2/c^2}} \quad \text{and} \quad p_y = \frac{mv_y}{\sqrt{1 - v^2/c^2}} \tag{2.33}$$
The velocity v that appears in the denominator of these expressions is always the
velocity of the particle as measured in a particular inertial frame. It is not the
velocity of an inertial frame. The velocity in the numerator can be any of the
components of the velocity vector.
We can now reanalyze the collision shown in Figure 2.21 using the relativistic
definition of momentum. The initial relativistic momentum according to O′ is
$$p'_i = \frac{m_1 v'_{1i}}{\sqrt{1 - v'^2_{1i}/c^2}} + \frac{m_2 v'_{2i}}{\sqrt{1 - v'^2_{2i}/c^2}} = \frac{(2m)(0)}{\sqrt{1 - 0^2}} + \frac{(m)(-0.750c)}{\sqrt{1 - (0.750)^2}} = -1.134mc$$
The final velocities according to O′ are v′1f = −0.585c and v′2f = +0.294c, and
the total final momentum is
$$p'_f = \frac{m_1 v'_{1f}}{\sqrt{1 - v'^2_{1f}/c^2}} + \frac{m_2 v'_{2f}}{\sqrt{1 - v'^2_{2f}/c^2}} = \frac{(2m)(-0.585c)}{\sqrt{1 - (0.585)^2}} + \frac{(m)(0.294c)}{\sqrt{1 - (0.294)^2}} = -1.134mc$$
Thus $p'_i = p'_f$, and observer O′ concludes that momentum is conserved. According
to O, the initial relativistic momentum is
$$p_i = \frac{m_1 v_{1i}}{\sqrt{1 - v^2_{1i}/c^2}} + \frac{m_2 v_{2i}}{\sqrt{1 - v^2_{2i}/c^2}} = \frac{(2m)(+0.550c)}{\sqrt{1 - (0.550)^2}} + \frac{(m)(-0.340c)}{\sqrt{1 - (0.340)^2}} = 0.956mc$$
Using the velocity transformation, the final velocities measured by O are v1f =
−0.051c and v2f = +0.727c, and so O calculates the final momentum to be
$$p_f = \frac{m_1 v_{1f}}{\sqrt{1 - v^2_{1f}/c^2}} + \frac{m_2 v_{2f}}{\sqrt{1 - v^2_{2f}/c^2}} = \frac{(2m)(-0.051c)}{\sqrt{1 - (0.051)^2}} + \frac{(m)(+0.727c)}{\sqrt{1 - (0.727)^2}} = 0.956mc$$
Observer O also concludes that pi = pf and that the law of conservation of
momentum is valid. Defining momentum according to Eq. 2.32 gives conservation
of momentum in all reference frames, as required by the principle of relativity.
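The relativistic totals can be recomputed from the velocities quoted above. This is a minimal sketch, assuming one-dimensional motion, speeds as fractions of c, and masses in units of m; the names `rel_momentum` and `total` are hypothetical. Because the quoted velocities are rounded to three decimal places, the initial and final totals agree only to about 0.001mc, not exactly.

```python
import math

def rel_momentum(m, v):
    """Eq. 2.32 in one dimension; v in units of c, result in units of mc."""
    return m * v / math.sqrt(1.0 - v**2)

def total(masses, velocities):
    """Total relativistic momentum of a set of particles."""
    return sum(rel_momentum(m, v) for m, v in zip(masses, velocities))

masses = (2.0, 1.0)
# Rounded velocities quoted in the text (units of c).
frame_O_prime = {"initial": (0.0, -0.750), "final": (-0.585, 0.294)}
frame_O = {"initial": (0.550, -0.340), "final": (-0.051, 0.727)}

for name, frame in (("O'", frame_O_prime), ("O", frame_O)):
    p_i = total(masses, frame["initial"])
    p_f = total(masses, frame["final"])
    print(f"{name}: p_i = {p_i:.3f} mc, p_f = {p_f:.3f} mc")
```

Contrast this with the classical totals in the frame of O, which differ by roughly 0.08mc for the same collision.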
Example 2.11
What is the momentum of a proton moving at a speed of
v = 0.86c?
Solution
Using Eq. 2.32, we obtain
$$p = \frac{mv}{\sqrt{1 - v^2/c^2}} = \frac{(1.67 \times 10^{-27}\ \text{kg})(0.86)(3.00 \times 10^{8}\ \text{m/s})}{\sqrt{1 - (0.86)^2}} = 8.44 \times 10^{-19}\ \text{kg} \cdot \text{m/s}$$
The units of kg · m/s are generally not convenient in solving problems of this type. Instead, we manipulate Eq. 2.32
to obtain
$$pc = \frac{mvc}{\sqrt{1 - v^2/c^2}} = \frac{mc^2(v/c)}{\sqrt{1 - v^2/c^2}} = \frac{(938\ \text{MeV})(0.86)}{\sqrt{1 - (0.86)^2}} = 1580\ \text{MeV}$$
Here we have used the proton’s rest energy $mc^2$, which
is defined later in this section. The momentum is obtained
from this result by dividing by the symbol c (not its
numerical value), which gives
p = 1580 MeV/c
The units of MeV/c for momentum are often used in relativistic calculations because, as we show later, the quantity
pc often appears in these calculations. You should be able
to convert MeV/c to kg · m/s and show that the two results
obtained for p are equivalent.
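The conversion suggested at the end of this example can be sketched as follows. The constants are assumptions chosen to match the rounding used in the example (1 MeV taken as 1.602 × 10⁻¹³ J and c as 3.00 × 10⁸ m/s), and the function name `mev_per_c_to_si` is hypothetical.

```python
MEV_TO_JOULE = 1.602e-13   # 1 MeV in joules (rounded)
C = 3.00e8                 # speed of light in m/s, rounded as in the example

def mev_per_c_to_si(p_mev_per_c):
    """Convert a momentum in MeV/c to kg*m/s: multiply by (J per MeV) / c."""
    return p_mev_per_c * MEV_TO_JOULE / C

p_si = mev_per_c_to_si(1580.0)
print(f"{p_si:.2e} kg m/s")   # prints "8.44e-19 kg m/s"
```

The result agrees with the value obtained directly from Eq. 2.32 in SI units.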
Relativistic Kinetic Energy
Like the classical definition of momentum, the classical definition of kinetic
energy also causes difficulties when we try to compare the interpretations of
different observers. According to O′ , the initial and final kinetic energies in the
collision shown in Figure 2.21a are:
$$K'_i = \tfrac{1}{2} m_1 v'^2_{1i} + \tfrac{1}{2} m_2 v'^2_{2i} = (0.5)(2m)(0)^2 + (0.5)(m)(-0.750c)^2 = 0.281mc^2$$
$$K'_f = \tfrac{1}{2} m_1 v'^2_{1f} + \tfrac{1}{2} m_2 v'^2_{2f} = (0.5)(2m)(-0.500c)^2 + (0.5)(m)(0.250c)^2 = 0.281mc^2$$
and so energy is conserved according to O′. The initial and final kinetic energies
observed from the reference frame of O (as in Figure 2.21b) are
$$K_i = \tfrac{1}{2} m_1 v^2_{1i} + \tfrac{1}{2} m_2 v^2_{2i} = (0.5)(2m)(0.550c)^2 + (0.5)(m)(-0.340c)^2 = 0.360mc^2$$
$$K_f = \tfrac{1}{2} m_1 v^2_{1f} + \tfrac{1}{2} m_2 v^2_{2f} = (0.5)(2m)(0.069c)^2 + (0.5)(m)(0.703c)^2 = 0.252mc^2$$
Thus energy is not conserved in the reference frame of O if we use the classical
formula for kinetic energy. This leads to a serious inconsistency—an elastic
collision for one observer would not be elastic for another observer. As in the
case of momentum, if we want to preserve the law of conservation of energy for
all observers, we must replace the classical formula for kinetic energy with an
expression that is valid in the relativistic case (but that reduces to the classical
formula for low speeds).
We can derive the relativistic expression for the kinetic energy of a particle
using essentially the same procedure used to derive the classical expression,
starting with the particle form of the work-energy theorem (see Problem 28). The
result of this calculation is
$$K = \frac{mc^2}{\sqrt{1 - v^2/c^2}} - mc^2 \tag{2.34}$$
Using Eq. 2.34, you can show that both O and O′ will conclude that kinetic energy
is conserved. In fact, all observers will agree on the applicability of the energy
conservation law using the relativistic definition for kinetic energy.
Equation 2.34 looks very different from the classical result $K = \frac{1}{2}mv^2$, but, as
you should show (see Problem 32), Eq. 2.34 reduces to the classical expression in
the limit of low speeds (v ≪ c).
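One possible route to this limit, sketched here under the assumption that the binomial expansion $(1 - x)^{-1/2} = 1 + \tfrac{1}{2}x + \tfrac{3}{8}x^2 + \cdots$ may be applied with $x = v^2/c^2 \ll 1$, is the following. It is offered as an outline and not as the worked solution to the problem cited above.

```latex
\begin{aligned}
K &= mc^2\left[\left(1 - \frac{v^2}{c^2}\right)^{-1/2} - 1\right] \\
  &= mc^2\left[1 + \frac{1}{2}\frac{v^2}{c^2} + \frac{3}{8}\frac{v^4}{c^4} + \cdots - 1\right] \\
  &= \frac{1}{2}mv^2 + \frac{3}{8}\frac{mv^4}{c^2} + \cdots
\end{aligned}
```

The leading term is the classical kinetic energy, and the first correction is smaller by a factor of order $v^2/c^2$.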
The classical expression for kinetic energy also violates the second relativity
postulate by allowing speeds in excess of the speed of light. There is no limit
(in either classical or relativistic dynamics) to the energy we can give to a
particle. Yet, if we allow the kinetic energy to increase without limit, the classical
expression $K = \frac{1}{2}mv^2$ implies that the velocity must correspondingly increase
without limit, thereby violating the second postulate. You can also see from the
first term of Eq. 2.34 that K → ∞ as v → c. Thus we can increase the relativistic
kinetic energy of a particle without limit, and its speed will not exceed c.
Relativistic Total Energy and Rest Energy
We can also express Eq. 2.34 as
K = E − E0
(2.35)
where the relativistic total energy E is defined as
$$E = \frac{mc^2}{\sqrt{1 - v^2/c^2}} \tag{2.36}$$
and the rest energy E0 is defined as
E0 = mc2
(2.37)
The rest energy is in effect the relativistic total energy of a particle measured in a
frame of reference in which the particle is at rest.
Sometimes m in Eq. 2.37 is called the rest mass $m_0$ and is distinguished from
the “relativistic mass,” which is defined as $m_0/\sqrt{1 - v^2/c^2}$. We choose not to use
relativistic mass, because it can be a misleading concept. Whenever we refer to
mass, we always mean rest mass.
Equation 2.37 suggests that mass can be expressed in units of energy divided
by $c^2$, such as MeV/$c^2$. For example, a proton has a rest energy of 938 MeV and
thus a mass of 938 MeV/$c^2$. Just like expressing momentum in units of MeV/c,
expressing mass in units of MeV/$c^2$ turns out to be very useful in calculations.
The relativistic total energy is given by Eq. 2.35 as
E = K + E0
(2.38)
Collisions of particles at high energies often result in the production of new
particles, and thus the final rest energy may not be equal to the initial rest energy
(see Example 2.18). Such collisions must be analyzed using conservation of total
relativistic energy E; kinetic energy will not be conserved when the rest energy
changes in a collision. In the special example of the elastic collision considered
in this section, the identities of the particles did not change, and so kinetic energy
was conserved. In general, collisions do not conserve kinetic energy—it is the
relativistic total energy that is conserved in collisions.
Manipulation of Eqs. 2.32 and 2.36 gives a useful relationship among the total
energy, momentum, and rest energy:
$$E = \sqrt{(pc)^2 + (mc^2)^2} \tag{2.39}$$
Figure 2.22 shows a useful mnemonic device for remembering this relationship,
which has the form of the Pythagorean theorem for the sides of a right triangle.
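The manipulation that leads to Eq. 2.39 is short enough to write out. The steps below are a sketch that uses only Eqs. 2.32 and 2.36, squaring each and subtracting.

```latex
\begin{aligned}
E^2 - (pc)^2 &= \frac{m^2c^4}{1 - v^2/c^2} - \frac{m^2v^2c^2}{1 - v^2/c^2} \\
             &= \frac{m^2c^4\,(1 - v^2/c^2)}{1 - v^2/c^2} \\
             &= (mc^2)^2
\end{aligned}
```

Rearranging gives $E^2 = (pc)^2 + (mc^2)^2$, whose positive square root is Eq. 2.39.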
When a particle travels at a speed close to the speed of light (say, v > 0.99c),
which often occurs in high-energy particle accelerators, the particle’s kinetic
energy is much greater than its rest energy; that is, K ≫ E0 . In this case, Eq. 2.39
can be written, to a very good approximation,
$$E \cong pc \tag{2.40}$$
This is called the extreme relativistic approximation and is often useful for
simplifying calculations. As v approaches c, the angle in Figure 2.22 between the
bottom leg of the triangle (representing $mc^2$) and the hypotenuse (representing E)
approaches 90°. Imagine in this case a very tall triangle, in which the vertical leg
(pc) and the hypotenuse (E) are nearly the same length.
For massless particles (such as photons), Eq. 2.39 becomes exactly
E = pc
K
(2.41)
All massless particles travel at the speed of light; otherwise, by Eqs. 2.34 and
2.36 their kinetic and total energies would be zero.
FIGURE 2.22 A useful mnemonic device for recalling the relationships among $E_0$, $p$, $K$, and $E$. (Labels in the figure: legs $E_0 = mc^2$ and $pc$, hypotenuse $E$, and $\sin\theta = v/c$.) Note that to put all variables in energy units, the quantity $pc$ must be used.
Example 2.12
What are the kinetic and relativistic total energies of a proton ($E_0 = 938$ MeV) moving at a speed of $v = 0.86c$?
Solution
In Example 2.11 we found the momentum of this particle to be $p = 1580$ MeV/$c$. The total energy can be found from Eq. 2.39:
$$E = \sqrt{(pc)^2 + (mc^2)^2} = \sqrt{(1580\ \mathrm{MeV})^2 + (938\ \mathrm{MeV})^2} = 1837\ \mathrm{MeV}$$
The kinetic energy follows from Eq. 2.35:
$$K = E - E_0 = 1837\ \mathrm{MeV} - 938\ \mathrm{MeV} = 899\ \mathrm{MeV}$$
We also could have solved this problem by finding the kinetic energy directly from Eq. 2.34.
Example 2.13
Find the velocity and momentum of an electron ($E_0 = 0.511$ MeV) with a kinetic energy of 10.0 MeV.
Solution
The total energy is $E = K + E_0 = 10.0\ \mathrm{MeV} + 0.511\ \mathrm{MeV} = 10.51\ \mathrm{MeV}$. We then can find the momentum from Eq. 2.39:
$$p = \frac{1}{c}\sqrt{E^2 - (mc^2)^2} = \frac{1}{c}\sqrt{(10.51\ \mathrm{MeV})^2 - (0.511\ \mathrm{MeV})^2} = 10.5\ \mathrm{MeV}/c$$
Note that in this problem we could have used the extreme relativistic approximation, $p \cong E/c$, from Eq. 2.40. The error we would make in this case would be only 0.1%.
The velocity can be found by solving Eq. 2.36 for $v$:
$$\frac{v}{c} = \sqrt{1 - \left(\frac{mc^2}{E}\right)^2} = \sqrt{1 - \left(\frac{0.511\ \mathrm{MeV}}{10.51\ \mathrm{MeV}}\right)^2} = 0.9988 \tag{2.42}$$
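The two steps of Example 2.13 can be reproduced in a few lines. This is an illustrative sketch only; the helper name `momentum_and_beta` is hypothetical, and energies are in MeV so that the returned momentum is $pc$ in MeV.

```python
import math

def momentum_and_beta(kinetic, rest_energy):
    """Return (pc, v/c) for a particle of given kinetic and rest energy (MeV)."""
    e = kinetic + rest_energy                  # total energy, Eq. 2.38
    pc = math.sqrt(e**2 - rest_energy**2)      # Eq. 2.39 solved for pc
    beta = math.sqrt(1.0 - (rest_energy / e)**2)  # Eq. 2.42
    return pc, beta

pc, beta = momentum_and_beta(10.0, 0.511)
print(f"pc  = {pc:.2f} MeV")
print(f"v/c = {beta:.4f}")
```

Because $pc/E = v/c$, the speed could equally be obtained as `pc / e`; the form above is kept to match Eq. 2.42.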
Example 2.14
In the Stanford Linear Collider electrons are accelerated to a kinetic energy of 50 GeV. Find the speed of such an electron as (a) a fraction of $c$, and (b) a difference from $c$. The rest energy of the electron is $0.511\ \mathrm{MeV} = 0.511 \times 10^{-3}\ \mathrm{GeV}$.
Solution
(a) First we solve Eq. 2.34 for $v$, obtaining
$$v = c\sqrt{1 - \frac{1}{(1 + K/mc^2)^2}} \tag{2.43}$$
and thus
$$v = c\sqrt{1 - \frac{1}{[1 + (50\ \mathrm{GeV})/(0.511 \times 10^{-3}\ \mathrm{GeV})]^2}} = 0.999\,999\,999\,948c$$
Calculators cannot be trusted to 12 significant digits. Here is a way to avoid this difficulty. We can write Eq. 2.43 as $v = c(1 + x)^{1/2}$, where $x = -1/(1 + K/mc^2)^2$. Because $K \gg mc^2$, we have $|x| \ll 1$, and we can use the binomial expansion to write $v \cong c(1 + \tfrac{1}{2}x)$, or
$$v \cong c\left[1 - \frac{1}{2(1 + K/mc^2)^2}\right]$$
which gives
$$v \cong c(1 - 5.2 \times 10^{-11})$$
This leads to the same value of $v$ given above.
(b) From the above result, we have
$$c - v = 5.2 \times 10^{-11}c = 0.016\ \mathrm{m/s} = 1.6\ \mathrm{cm/s}$$
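The precision issue raised in Example 2.14 can be seen directly in floating-point arithmetic. The sketch below is an illustration rather than part of the original solution: it compares the binomial estimate of $(c - v)/c$ with a direct double-precision evaluation of Eq. 2.43 and with a 50-digit reference computed using Python's standard `decimal` module. The variable names are hypothetical.

```python
import math
from decimal import Decimal, getcontext

K = 50.0           # kinetic energy in GeV
E0 = 0.511e-3      # electron rest energy in GeV
gamma = 1.0 + K / E0

# First-order binomial estimate of (c - v)/c, as in the example.
deficit_binomial = 1.0 / (2.0 * gamma**2)

# Direct evaluation of Eq. 2.43 in double precision: the subtraction
# 1 - sqrt(1 - tiny) loses most of the available significant digits.
deficit_direct = 1.0 - math.sqrt(1.0 - 1.0 / gamma**2)

# High-precision reference using 50 significant digits.
getcontext().prec = 50
g = Decimal(1) + Decimal("50") / Decimal("0.000511")
deficit_exact = Decimal(1) - (Decimal(1) - 1 / g**2).sqrt()

print(f"binomial: {deficit_binomial:.6e}")
print(f"direct:   {deficit_direct:.6e}")
print(f"exact:    {float(deficit_exact):.6e}")
```

The binomial form agrees with the high-precision reference far better than the direct double-precision subtraction does, which is the reason the example rewrites the formula before evaluating it.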
Example 2.15
At a distance equal to the radius of the Earth's orbit ($1.5 \times 10^{11}$ m), the Sun's radiation has an intensity of about $1.4 \times 10^3\ \mathrm{W/m^2}$. Find the rate at which the mass of the Sun is decreasing.
Solution
If we assume that the Sun's radiation is distributed uniformly over the surface area $4\pi r^2$ of a sphere of radius $1.5 \times 10^{11}$ m, then the total radiative power emitted by the Sun is
$$4\pi(1.5 \times 10^{11}\ \mathrm{m})^2(1.4 \times 10^3\ \mathrm{W/m^2}) = 4.0 \times 10^{26}\ \mathrm{W} = 4.0 \times 10^{26}\ \mathrm{J/s}$$
By conservation of energy, we know that the energy lost by the Sun through radiation must be accounted for by a corresponding loss in its rest energy. The change in mass $\Delta m$ corresponding to a change in rest energy $\Delta E_0$ of $4.0 \times 10^{26}$ J each second is
$$\Delta m = \frac{\Delta E_0}{c^2} = \frac{4.0 \times 10^{26}\ \mathrm{J}}{9.0 \times 10^{16}\ \mathrm{m^2/s^2}} = 4.4 \times 10^9\ \mathrm{kg}$$
The Sun loses mass at a rate of about 4 billion kilograms per second! If this rate were to remain constant, the Sun (with a present mass of $2 \times 10^{30}$ kg) would shine "only" for another $10^{13}$ years.
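The arithmetic of Example 2.15 is short enough to check in code. In this sketch the variable names are hypothetical, and the number of seconds in a year ($3.156 \times 10^7$) is an assumed conversion factor that does not appear in the text.

```python
import math

R_ORBIT = 1.5e11        # radius of Earth's orbit, m
INTENSITY = 1.4e3       # solar intensity at that distance, W/m^2
C = 3.0e8               # speed of light, m/s
M_SUN = 2e30            # present mass of the Sun, kg
SECONDS_PER_YEAR = 3.156e7  # assumed conversion, not from the text

# Total power radiated through a sphere of radius R_ORBIT.
power = 4.0 * math.pi * R_ORBIT**2 * INTENSITY

# Mass lost per second, from delta_m = delta_E0 / c^2.
mass_loss_rate = power / C**2

# Time to radiate away the whole mass at a constant rate.
lifetime_years = M_SUN / mass_loss_rate / SECONDS_PER_YEAR

print(f"power          = {power:.2e} W")
print(f"mass loss rate = {mass_loss_rate:.2e} kg/s")
print(f"lifetime       = {lifetime_years:.1e} years")
```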
2.8 CONSERVATION LAWS IN RELATIVISTIC
DECAYS AND COLLISIONS
In all decays and collisions, we must apply the law of conservation of momentum.
The only difference between applying this law for collisions at low speed (as we
did in Example 1.1) and at high speed is the use of the relativistic expression for
momentum (Eq. 2.32) instead of Eq. 1.2. The law of conservation of momentum
for relativistic motion can be stated in exactly the same way as for classical motion:
In an isolated system of particles, the total linear momentum remains
constant.
In the classical case, kinetic energy is the only form of energy that is present in
elastic collisions, so conservation of energy is equivalent to conservation of kinetic
energy. In inelastic collisions or decay processes, the kinetic energy does not
remain constant. Total energy is conserved in classical inelastic collisions, but we
did not account for the other forms of energy that might be important. This missing
energy is usually stored in the particles, perhaps as atomic or nuclear energy.
In the relativistic case, the internal stored energy contributes to the rest energy
of the particles. Usually rest energy and kinetic energy are the only two forms
of energy that we consider in atomic or nuclear processes (later we’ll add the
energy of radiation to this balance). A loss of kinetic energy in a collision is thus
accompanied by a gain in rest energy, but the total relativistic energy (kinetic
energy + rest energy) of all the particles involved in the process doesn’t change.
For example, in a reaction in which new particles are produced, the loss in kinetic
energy of the original reacting particles gives the increase in rest energy of the
product particles. On the other hand, in a nuclear decay process such as alpha
decay, the initial nucleus gives up some rest energy to account for the kinetic
energy carried by the decay products.
The law of energy conservation in the relativistic case is:
In an isolated system of particles, the relativistic total energy (kinetic energy
plus rest energy) remains constant.
In applying this law to relativistic collisions, we don’t have to worry whether the
collision is elastic or inelastic, because the inclusion of the rest energy accounts
for any loss in kinetic energy.
The following examples illustrate applications of the conservation laws for
relativistic momentum and energy.
Example 2.16
A neutral K meson (mass 497.7 MeV/$c^2$) is moving with a kinetic energy of 77.0 MeV. It decays into a pi meson (mass 139.6 MeV/$c^2$) and another particle of unknown mass. The pi meson is moving in the direction of the original K meson with a momentum of 381.6 MeV/$c$. (a) Find the momentum and total relativistic energy of the unknown particle. (b) Find the mass of the unknown particle.
Solution
(a) The total energy and momentum of the K meson are
$$E_K = K_K + m_K c^2 = 77.0\ \mathrm{MeV} + 497.7\ \mathrm{MeV} = 574.7\ \mathrm{MeV}$$
$$p_K = \frac{1}{c}\sqrt{E_K^2 - (m_K c^2)^2} = \frac{1}{c}\sqrt{(574.7\ \mathrm{MeV})^2 - (497.7\ \mathrm{MeV})^2} = 287.4\ \mathrm{MeV}/c$$
and for the pi meson
$$E_\pi = \sqrt{(cp_\pi)^2 + (m_\pi c^2)^2} = \sqrt{(381.6\ \mathrm{MeV})^2 + (139.6\ \mathrm{MeV})^2} = 406.3\ \mathrm{MeV}$$
Conservation of relativistic momentum ($p_\mathrm{initial} = p_\mathrm{final}$) gives $p_K = p_\pi + p_x$ (where $x$ represents the unknown particle), so
$$p_x = p_K - p_\pi = 287.4\ \mathrm{MeV}/c - 381.6\ \mathrm{MeV}/c = -94.2\ \mathrm{MeV}/c$$
and conservation of total relativistic energy ($E_\mathrm{initial} = E_\mathrm{final}$) gives $E_K = E_\pi + E_x$, so
$$E_x = E_K - E_\pi = 574.7\ \mathrm{MeV} - 406.3\ \mathrm{MeV} = 168.4\ \mathrm{MeV}$$
(b) We can find the mass by solving Eq. 2.39 for $mc^2$:
$$m_x c^2 = \sqrt{E_x^2 - (cp_x)^2} = \sqrt{(168.4\ \mathrm{MeV})^2 - (94.2\ \mathrm{MeV})^2} = 139.6\ \mathrm{MeV}$$
Thus the unknown particle has a mass of 139.6 MeV/$c^2$, and its mass shows that it is another pi meson.
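The bookkeeping in Example 2.16 (compute $E$ and $p$ for the known particles, subtract, then invert Eq. 2.39) translates directly into code. The following sketch uses hypothetical variable names and works in MeV with momenta expressed as $pc$; because it carries unrounded intermediate values, the final mass differs slightly in the last digit from the hand calculation.

```python
import math

M_K = 497.7      # neutral K meson rest energy, MeV
K_K = 77.0       # its kinetic energy, MeV
M_PI = 139.6     # pi meson rest energy, MeV
PC_PI = 381.6    # pi meson momentum times c, MeV (along the K direction)

# Initial state: total energy and momentum of the K meson.
E_K = K_K + M_K
pc_K = math.sqrt(E_K**2 - M_K**2)

# Known decay product.
E_pi = math.hypot(PC_PI, M_PI)

# Conservation of momentum and energy give the unknown particle's values.
pc_x = pc_K - PC_PI          # negative: it moves opposite to the K meson
E_x = E_K - E_pi

# Eq. 2.39 solved for the rest energy.
m_x = math.sqrt(E_x**2 - pc_x**2)

print(f"pc_x = {pc_x:.1f} MeV, E_x = {E_x:.1f} MeV, m_x c^2 = {m_x:.1f} MeV")
```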
Example 2.17
In the reaction $K^- + p \to \Lambda^0 + \pi^0$, a charged K meson (mass 493.7 MeV/$c^2$) collides with a proton (938.3 MeV/$c^2$) at rest, producing a lambda particle (1115.7 MeV/$c^2$) and a neutral pi meson (135.0 MeV/$c^2$), as represented in Figure 2.23. The initial kinetic energy of the K meson is 152.4 MeV. After the interaction, the pi meson has a kinetic energy of 254.8 MeV. (a) Find the kinetic energy of the lambda. (b) Find the directions of motion of the lambda and the pi meson.
FIGURE 2.23 Example 2.17. (a) A $K^-$ meson collides with a proton at rest. (b) After the collision, a $\pi^0$ meson and a $\Lambda^0$ are produced.
Solution
(a) The initial and final total energies are
$$E_\mathrm{initial} = E_K + E_p = K_K + m_K c^2 + m_p c^2$$
$$E_\mathrm{final} = E_\Lambda + E_\pi = K_\Lambda + m_\Lambda c^2 + K_\pi + m_\pi c^2$$
In these two equations, the value of every quantity is known except the kinetic energy of the lambda. Using conservation of total relativistic energy, we set $E_\mathrm{initial} = E_\mathrm{final}$ and solve for $K_\Lambda$:
$$K_\Lambda = K_K + m_K c^2 + m_p c^2 - m_\Lambda c^2 - K_\pi - m_\pi c^2$$
$$= 152.4\ \mathrm{MeV} + 493.7\ \mathrm{MeV} + 938.3\ \mathrm{MeV} - 1115.7\ \mathrm{MeV} - 254.8\ \mathrm{MeV} - 135.0\ \mathrm{MeV} = 78.9\ \mathrm{MeV}$$
(b) To find the directional information we must apply conservation of momentum. The initial momentum is just that of the K meson. From its total energy, $E_K = K_K + m_K c^2 = 152.4\ \mathrm{MeV} + 493.7\ \mathrm{MeV} = 646.1\ \mathrm{MeV}$, we can find the momentum:
$$p_\mathrm{initial} = p_K = \frac{1}{c}\sqrt{E_K^2 - (m_K c^2)^2} = \frac{1}{c}\sqrt{(646.1\ \mathrm{MeV})^2 - (493.7\ \mathrm{MeV})^2} = 416.8\ \mathrm{MeV}/c$$
A similar procedure applied to the two final particles gives $p_\Lambda = 426.9$ MeV/$c$ and $p_\pi = 365.7$ MeV/$c$. The total momentum of the two final particles is $p_{x,\mathrm{final}} = p_\Lambda\cos\theta + p_\pi\cos\phi$ and $p_{y,\mathrm{final}} = p_\Lambda\sin\theta - p_\pi\sin\phi$. Conservation of momentum in the $x$ and $y$ directions gives
$$p_\Lambda\cos\theta + p_\pi\cos\phi = p_\mathrm{initial} \quad\text{and}\quad p_\Lambda\sin\theta - p_\pi\sin\phi = 0$$
Here we have two equations with two unknowns ($\theta$ and $\phi$). We can eliminate $\theta$ by writing the first equation as $p_\Lambda\cos\theta = p_\mathrm{initial} - p_\pi\cos\phi$, then squaring both equations and adding them. The resulting equation can be solved for $\phi$:
$$\phi = \cos^{-1}\left(\frac{p_\mathrm{initial}^2 + p_\pi^2 - p_\Lambda^2}{2p_\pi p_\mathrm{initial}}\right) = \cos^{-1}\left(\frac{(416.8\ \mathrm{MeV}/c)^2 + (365.7\ \mathrm{MeV}/c)^2 - (426.9\ \mathrm{MeV}/c)^2}{2(365.7\ \mathrm{MeV}/c)(416.8\ \mathrm{MeV}/c)}\right) = 65.7^\circ$$
From the conservation of momentum equation for the $y$ components, we have
$$\theta = \sin^{-1}\left(\frac{p_\pi\sin\phi}{p_\Lambda}\right) = \sin^{-1}\left(\frac{(365.7\ \mathrm{MeV}/c)(\sin 65.7^\circ)}{426.9\ \mathrm{MeV}/c}\right) = 51.3^\circ$$
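The two-body kinematics of Example 2.17 can be sketched in code as a check on the angles. The names below are hypothetical; momenta are carried as $pc$ in MeV, and the helper uses the identity $(pc)^2 = K^2 + 2K\,mc^2$, which follows from Eqs. 2.38 and 2.39.

```python
import math

def pc_from_kinetic(kinetic, rest_energy):
    """pc (MeV) from kinetic energy and rest energy: (pc)^2 = K^2 + 2*K*mc^2."""
    return math.sqrt(kinetic * (kinetic + 2.0 * rest_energy))

M_K, M_P, M_LAMBDA, M_PI = 493.7, 938.3, 1115.7, 135.0   # rest energies, MeV
K_K, K_PI = 152.4, 254.8                                  # kinetic energies, MeV

# (a) Energy conservation gives the lambda's kinetic energy.
K_lambda = K_K + M_K + M_P - M_LAMBDA - K_PI - M_PI

# (b) Momenta of the incident K meson and of the two products.
p_init = pc_from_kinetic(K_K, M_K)
p_lambda = pc_from_kinetic(K_lambda, M_LAMBDA)
p_pi = pc_from_kinetic(K_PI, M_PI)

# Angle of the pi meson from the squared-and-added momentum equations.
phi = math.acos((p_init**2 + p_pi**2 - p_lambda**2) / (2.0 * p_pi * p_init))
# Angle of the lambda from the y-component equation.
theta = math.asin(p_pi * math.sin(phi) / p_lambda)

# Residual of the x-component equation, which should vanish.
x_residual = p_lambda * math.cos(theta) + p_pi * math.cos(phi) - p_init

print(f"K_lambda = {K_lambda:.1f} MeV")
print(f"phi = {math.degrees(phi):.1f} deg, theta = {math.degrees(theta):.1f} deg")
```

The final line of the calculation, `x_residual`, is a useful sanity check: if the angles are right, the $x$ components of the final momenta add up to the initial momentum.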
Example 2.18
The discovery of the antiproton $\bar{p}$ (a particle with the same rest energy as a proton, 938 MeV, but with the opposite electric charge) took place in 1955 through the following reaction:
$$p + p \to p + p + p + \bar{p}$$
in which accelerated protons were incident on a target of protons at rest in the laboratory. The minimum incident kinetic energy needed to produce the reaction is called the threshold kinetic energy, for which the final particles move together as if they were a single unit (Figure 2.24). Find the threshold kinetic energy to produce antiprotons in this reaction.
FIGURE 2.24 Example 2.18. (a) A proton moving with velocity $v$ collides with another proton at rest. (b) The reaction produces three protons and an antiproton, which move together as a unit.
Solution
This problem can be solved by a straightforward application of energy and momentum conservation. Let $E_p$ and $p_p$ represent the total energy and momentum of the incident proton. Thus the initial total energy of the two protons is $E_p + m_p c^2$. Let $E_p'$ and $p_p'$ represent the total energy and momentum of each of the four final particles (which move together and thus have the same energy and momentum). We can then apply conservation of total energy:
$$E_p + m_p c^2 = 4E_p'$$
and conservation of momentum:
$$p_p = 4p_p'$$
We can write the momentum equation as $\sqrt{E_p^2 - (m_p c^2)^2} = 4\sqrt{E_p'^2 - (m_p c^2)^2}$, so now we have two equations in two unknowns ($E_p$ and $E_p'$). We eliminate $E_p'$, for example by solving the energy conservation equation for $E_p'$ and substituting into the momentum equation. The result is
$$E_p = 7m_p c^2$$
from which we can calculate the kinetic energy of the incident proton:
$$K_p = E_p - m_p c^2 = 6m_p c^2 = 6(938\ \mathrm{MeV}) = 5628\ \mathrm{MeV} = 5.628\ \mathrm{GeV}$$
The Bevatron accelerator at the Lawrence Berkeley Laboratory was designed with this experiment in mind, so that it could produce a beam of protons whose energy exceeded 5.6 GeV. The discovery of the antiproton in this reaction was honored with the award of the 1959 Nobel Prize to the experimenters, Emilio Segrè and Owen Chamberlain.
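The threshold result can be cross-checked numerically. The sketch below does not follow the elimination used in the example; instead it assumes the standard invariant-mass shortcut for a fixed target, $K_\mathrm{th} = [(\sum m_\mathrm{final})^2 - (m_\mathrm{beam} + m_\mathrm{target})^2]c^2 / (2m_\mathrm{target})$, and then verifies that the answer satisfies the two conservation equations written in the solution. The function name is hypothetical, and rest energies are in MeV.

```python
import math

def threshold_kinetic_energy(m_beam, m_target, final_masses):
    """Threshold kinetic energy (MeV) for a beam on a target at rest.

    Uses the invariant-mass shortcut (an assumption of this sketch,
    not the method of the worked example).
    """
    m_final = sum(final_masses)
    return (m_final**2 - (m_beam + m_target)**2) / (2.0 * m_target)

M_P = 938.0  # proton (and antiproton) rest energy, MeV

K_th = threshold_kinetic_energy(M_P, M_P, [M_P] * 4)

# Check against the conservation equations of the example.
E_p = K_th + M_P                       # incident proton total energy
E_each = (E_p + M_P) / 4.0             # energy conservation: E_p + m c^2 = 4 E'
pc_in = math.sqrt(E_p**2 - M_P**2)     # incident momentum times c
pc_each = math.sqrt(E_each**2 - M_P**2)

print(f"K_th = {K_th:.0f} MeV = {K_th / M_P:.0f} m_p c^2")
print(f"momentum check: {pc_in:.1f} vs {4.0 * pc_each:.1f}")
```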
2.9 EXPERIMENTAL TESTS OF SPECIAL RELATIVITY
Because special relativity provided such a radical departure from the notions of
space and time in classical physics, it is important to perform detailed experimental
tests that can clearly distinguish between the predictions of special relativity and
those of classical physics. Many tests of increasing precision have been done
since the theory was originally presented, and in every case the predictions of
special relativity are upheld. Here we discuss a few of these tests.
Universality of the Speed of Light
The second relativity postulate asserts that the speed of light has the same value
c for all observers. This leads to several types of experimental tests, of which we
discuss two: (1) Does the speed of light change with the direction of travel? (2)
Does the speed of light change with relative motion between source and observer?
The Michelson-Morley experiment provides a test of the first type. This
experiment compared the upstream-downstream and cross-stream speeds of light
and concluded that they were equal within the experimental error. Equivalently,
we may say that the experiment showed that there is no preferred reference frame
(no ether) relative to which the speed of light must be measured. If there is an
ether, the speed of the Earth through the ether is less than 5 km/s, which is much
smaller than the Earth's orbital speed about the Sun, 30 km/s. We can express
their result as a difference $\Delta c$ between the upstream-downstream and cross-stream
speeds; the experiment showed that $\Delta c/c < 3 \times 10^{-10}$.
To reconcile the result of the Michelson-Morley experiment with classical
physics, Lorentz proposed the “ether drag” hypothesis, according to which
the motion of the Earth through the ether caused an electromagnetic drag that
contracted the arm of the interferometer in the direction of motion. This contraction
was just enough to compensate for the difference in the upstream-downstream
and cross-stream times predicted by the Galilean transformation. This hypothesis
succeeds only when the two arms of the interferometer are of the same length.
To test this hypothesis, a similar experiment was done in 1932 by Kennedy and
Thorndike; in their experiment, the lengths of the interferometer arms differed by
about 16 cm, the maximum distance over which light sources available at that time
could remain coherent. The Kennedy-Thorndike experiment in effect tests the
second question, whether the speed of light changes due to relative motion. Their
result was $\Delta c/c < 3 \times 10^{-8}$, which excludes the Lorentz contraction hypothesis
as an explanation for the Michelson-Morley experiment.
In recent years, these fundamental experiments have been repeated with
considerably improved precision using lasers as light sources. Experimenters
working at the Joint Institute for Laboratory Astrophysics in Boulder, Colorado,
built an apparatus that consisted of two He-Ne lasers on a rotating granite
platform. By electronically stabilizing the lasers, they improved the sensitivity
of their apparatus by several orders of magnitude. Again expressing the result
as a difference between the speeds along the two arms of the apparatus, this
experiment corresponds to $\Delta c/c < 8 \times 10^{-15}$, an improvement of about 5 orders
of magnitude over the original Michelson-Morley experiment. In a similar
repetition of the Kennedy-Thorndike experiment using He-Ne lasers, they obtained
$\Delta c/c < 1 \times 10^{-10}$, an improvement over the original experiment by a factor of
300. [See A. Brillet and J. L. Hall, Physical Review Letters 42, 549 (1979); D.
Hils and J. L. Hall, Physical Review Letters 64, 1697 (1990).] A considerable
improvement in the Kennedy-Thorndike type of experiment has been made
possible by comparing the oscillation frequency of a crystal with the frequency
of a hydrogen maser (a maser is similar to a laser, but it uses microwaves rather
than visible light). The experimenters measured for nearly one year, looking for a
change in the relative frequencies as the Earth’s velocity changed. No effect was
observed, leading to a limit of $\Delta c/c < 2 \times 10^{-12}$. [See P. Wolf et al., Physical
Review Letters 90, 060402 (2003).]
Another way of testing the second question is to measure the speed of a light
beam emitted by a source in motion. Suppose we observe this beam along the
direction of motion of the moving source, which might be moving toward us or
away from us. In the rest frame of the source, the emitted light travels at speed c.
We can express the speed of light in our reference frame as $c' = c + \Delta c$, where
$\Delta c$ is zero according to special relativity ($c' = c$) or is $\pm u$ according to classical
physics ($c' = c \pm u$ in the Galilean transformation, depending on whether the
motion is toward or away from the observer).
In one experiment of this type, the decay of pi mesons (pions) into gamma
rays (a form of electromagnetic waves traveling at c) was observed. When pions
(produced in laboratories with large accelerators) emit these gamma rays, they are
traveling at speeds close to the speed of light, relative to the laboratory. Thus if
Galilean relativity were valid, we should expect to find gamma rays emitted in the
direction of motion of the decaying pions traveling at a speed c′ in the laboratory of
nearly 2c, rather than always with c as predicted by special relativity. The observed
laboratory speed of these gamma rays in one experiment was $(2.9977 \pm 0.0004) \times 10^8$ m/s
when the decaying pions were moving at $u/c = 0.99975$. These results
give $\Delta c/c < 2 \times 10^{-4}$, and thus $c' = c$ as expected from special relativity. This
experiment shows directly that an object moving at a speed of nearly c relative to
the laboratory emits “light” that travels at a speed of c relative to both the object
and the laboratory, giving direct evidence for Einstein’s second postulate. [See T.
Alvager et al., Physics Letters 12, 260 (1964).]
Another experiment of this type is to study the X rays emitted by a binary
pulsar, a rapidly pulsating source of X rays in orbit about another star, which
would eclipse the pulsar as it rotated in its orbit. If the speed of light (in this case,
X rays) were to change as the pulsar moved first toward and later away from the
Earth in its orbit, the beginning and end of the eclipse would not be equally spaced
in time from the midpoint of the eclipse. No such effect is observed, and from
these observations it is concluded that $\Delta c/c < 2 \times 10^{-12}$, in agreement with
predictions of special relativity. These experiments were done at $u/c = 10^{-3}$.
[See K. Brecher, Physical Review Letters 39, 1051 (1977).]
A different type of test of the limit by which the speed of light changes with
direction of travel can be done using the clocks carried aboard the network of Earth
satellites that make up the Global Positioning System (GPS). By comparing the
readings of clocks on the GPS satellites with clocks on the ground at different times
of day (as the satellites move relative to the ground stations), it is possible to test
whether the change in the direction of travel affects the apparent synchronization
of the clocks. No effect was observed, and the experimenters were able to set a
limit of $\Delta c/c < 5 \times 10^{-9}$ for the difference between the one-way and round-trip
speeds of light. [See P. Wolf and G. Petit, Physical Review A 56, 4405 (1997).]
Time Dilation
We have already discussed the time dilation effect on the decay of muons produced
by cosmic rays. Muon decay can also be studied in the laboratory. Muons can be
produced following collisions in high-energy accelerators, and the decay of the
muons can be followed by observing their decay products (ordinary electrons).
These muons can either be trapped and decay at rest, or they can be placed
in a beam and decay in flight. When muons are observed at rest, their decay
lifetime is 2.198 μs. (As we discuss in Chapter 12, decays generally follow an
exponential law. The lifetime is the time after which a fraction 1/e = 0.368 of
the original muons remain.) This is the proper lifetime, measured in a frame of
reference in which the muon is at rest. In one particular experiment, muons were
trapped in a ring and circulated at a momentum of p = 3094 MeV/c. The decays
in flight occurred with a lifetime of 64.37 μs (measured in the laboratory frame
of reference). For muons of this momentum, Eq. 2.8 gives a dilated lifetime of
(see Problem 43) 64.38 μs, which is in excellent agreement with the measured
value and confirms the time dilation effect. [See J. Bailey et al., Nature 268, 301
(1977).]
Another similar experiment was done with pions. The proper lifetime, measured
for pions at rest, is known to be 26.0 ns. In one experiment, pions were observed
in flight at u/c = 0.913, and their lifetime was measured to be 63.7 ns. (Pions
decay to muons, so we can follow the exponential radioactive decay of the pions
by observing the muons emitted as a result of the decay.) For pions moving at
this speed, the expected dilated lifetime is in exact agreement with the measured
value, once again confirming the time dilation effect. [See D. S. Ayres et al.,
Physical Review D 3, 1051 (1971).]
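The dilated lifetimes quoted for the muon and pion experiments can be estimated from the time dilation formula. The sketch below uses hypothetical function names. The muon rest energy of 105.66 MeV is an assumed value that is not given in this passage, so the muon result agrees with the quoted 64.38 μs only to within the rounding of the inputs.

```python
import math

def gamma_from_momentum(pc, rest_energy):
    """Time dilation factor from pc and rest energy (MeV): gamma = E / mc^2."""
    return math.hypot(pc, rest_energy) / rest_energy

def gamma_from_beta(beta):
    """Time dilation factor from the speed u/c."""
    return 1.0 / math.sqrt(1.0 - beta**2)

MUON_REST_ENERGY = 105.66   # MeV; assumed value, not stated in the text
MUON_PROPER_LIFETIME = 2.198    # microseconds
PION_PROPER_LIFETIME = 26.0     # nanoseconds

# Muons circulating in a ring with p = 3094 MeV/c.
muon_lifetime = MUON_PROPER_LIFETIME * gamma_from_momentum(3094.0, MUON_REST_ENERGY)

# Pions observed in flight at u/c = 0.913.
pion_lifetime = PION_PROPER_LIFETIME * gamma_from_beta(0.913)

print(f"muon dilated lifetime ~ {muon_lifetime:.1f} microseconds")
print(f"pion dilated lifetime ~ {pion_lifetime:.1f} ns")
```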
The Doppler Effect
Confirmation of the relativistic Doppler effect first came from experiments done
in 1938 by Ives and Stilwell. They sent a beam of hydrogen atoms, generated
in a gas discharge, down a tube at a speed u, as shown in Figure 2.25. They
could simultaneously observe light emitted by the atoms in a direction parallel
to u (atom 1) and opposite to u (atom 2, reflected from the mirror). Using a
spectrograph, the experimenters were able to photograph the characteristic spectral
lines from these atoms and also, on the same photographic plate, from atoms at
rest. If the classical Doppler formula were valid, the wavelengths of the lines
from atoms 1 and 2 would be placed at symmetric intervals $\Delta\lambda_1 = \pm\lambda_0(u/c)$ on
either side of the line from the atoms at rest (wavelength $\lambda_0$), as in Figure 2.25b.
The relativistic Doppler formula, on the other hand, gives a small additional
asymmetric shift $\Delta\lambda_2 = +\tfrac{1}{2}\lambda_0(u/c)^2$, as in Figure 2.25c (computed for $u \ll c$, so
that higher-order terms in $u/c$ can be neglected). Figure 2.26 shows the results of
Ives and Stilwell for one of the hydrogen lines (the blue line of the Balmer series
at $\lambda_0 = 486$ nm). The agreement between the observed values and those predicted
by the relativistic formula is impressive.
FIGURE 2.25 (a) Apparatus used in the Ives-Stilwell experiment. (b) Line spectrum expected from classical Doppler effect. (c) Line spectrum expected from relativistic Doppler effect.
FIGURE 2.26 Results of the Ives-Stilwell experiment, plotted as $\Delta\lambda_2$ (in units of $10^{-3}$ nm) against $u/c$ (in units of $10^{-3}$). According to classical theory, $\Delta\lambda_2 = 0$, while according to special relativity, $\Delta\lambda_2$ depends on $(u/c)^2$. The solid line, which represents the relativistic formula, gives excellent agreement with the data points.
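The size of the asymmetric shift in the Ives-Stilwell discussion can be estimated numerically. In the sketch below, the forward and backward wavelengths are computed from the relativistic Doppler formula, and the displacement of their midpoint from $\lambda_0$ is compared with $\tfrac{1}{2}\lambda_0(u/c)^2$. The speed $u/c = 4.5 \times 10^{-3}$ is an illustrative choice within the range plotted in Figure 2.26, not a value quoted in the text, and the function name is hypothetical.

```python
import math

def doppler_wavelengths(lam0, beta):
    """Wavelengths seen from an approaching and a receding source (same units as lam0)."""
    approaching = lam0 * math.sqrt((1.0 - beta) / (1.0 + beta))
    receding = lam0 * math.sqrt((1.0 + beta) / (1.0 - beta))
    return approaching, receding

LAMBDA_0 = 486.0   # nm, the blue Balmer line
BETA = 4.5e-3      # illustrative u/c, assumed for this sketch

blue, red = doppler_wavelengths(LAMBDA_0, BETA)

# The midpoint of the two shifted lines is displaced from lambda_0.
asymmetric_shift = 0.5 * (blue + red) - LAMBDA_0

# Low-speed prediction for the same quantity.
approx_shift = 0.5 * LAMBDA_0 * BETA**2

print(f"first-order shift   = {LAMBDA_0 * BETA:.3f} nm")
print(f"asymmetric (exact)  = {asymmetric_shift * 1e3:.3f} x 10^-3 nm")
print(f"asymmetric (approx) = {approx_shift * 1e3:.3f} x 10^-3 nm")
```

A purely classical formula would place the two lines symmetrically about $\lambda_0$, so the midpoint displacement would be zero; the few-thousandths-of-a-nanometer offset is the relativistic signature.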
Recent experiments with lasers have verified the relativistic formula at greater
accuracy. These experiments are based on the absorption of laser light by an
atom; when the radiation is absorbed, the atom changes from its lowest-energy
state (the ground state) to one of its excited states. The experiment consists
essentially of comparing the laser wavelength needed to excite atoms at rest
with that needed for atoms in motion. One experiment used a beam of hydrogen
atoms with kinetic energy 800 MeV (corresponding to u/c = 0.84) produced in a
high-energy proton accelerator. An ultraviolet laser was used to excite the atoms.
This experiment verified the relativistic Doppler effect to an accuracy of about
$3 \times 10^{-4}$. [See D. W. MacArthur et al., Physical Review Letters 56, 282 (1986).]
In another experiment, a beam of neon atoms moving with a speed of u = 0.0036c
was irradiated with light from a tunable dye laser. This experiment verified the
relativistic Doppler shift to a precision of $2 \times 10^{-6}$. [See R. W. McGowan et al.,
Physical Review Letters 70, 251 (1993).] A more recent study used two tunable
dye lasers parallel and antiparallel to a beam of lithium atoms moving at 0.064c.
The results of this experiment agreed with the relativistic Doppler formula to
within a precision of $2 \times 10^{-7}$, improving on the best previous results by an order
of magnitude. [See G. Saathoff et al., Physical Review Letters 91, 190403 (2003).]
Relativistic Momentum and Energy
The earliest direct confirmation of the relativistic relationship for energy and
momentum came just a few years after Einstein’s 1905 paper. Simultaneous
measurements were made of the momentum and velocity of high-energy electrons
emitted in certain radioactive decay processes (nuclear beta decay, which is
discussed in Chapter 12). Figure 2.27 shows the results of several different
investigations plotted as p/mv, which should have the value 1 according to
classical physics. The results agree with the relativistic formula and disagree with
the classical one. Note that the relativistic and classical formulas give the same
results at low speeds, and in fact the two cannot be distinguished for speeds below
$0.1c$, which accounts for our failure to observe these effects in experiments with
ordinary laboratory objects.
FIGURE 2.27 The ratio $p/mv$ is plotted against velocity ($v/c$) for electrons of various speeds. The data agree with the relativistic result, $p/mv = 1/\sqrt{1 - v^2/c^2}$, and not at all with the nonrelativistic result ($p/mv = 1$).
Other more recent experiments, in which the kinetic energies of fast electrons
were measured, are shown in Figure 2.28. Once again, the data at high speeds
agree with special relativity and disagree with the classical equations. In a
more extreme example, experimenters at the Stanford Linear Accelerator Center
measured the speed of 20 GeV electrons, whose speed is within $5 \times 10^{-10}$ of the
speed of light (or about 0.15 m/s less than $c$). The measurement was not capable
of this level of precision, but it did determine that the speed of the electrons was
within $2 \times 10^{-7}$ of the speed of light (60 m/s). [See Z. G. T. Guiragossian et al.,
Physical Review Letters 34, 335 (1975).]
Nearly every time the nuclear or particle physicist enters the laboratory, a direct
or indirect test of the momentum and energy relationships of special relativity
is made. Principles of special relativity must be incorporated in the design of
the high-energy accelerators used by nuclear and particle physicists, so even the
construction of these projects gives testimony to the validity of the formulas of
special relativity.
FIGURE 2.28 Confirmation of relativistic kinetic energy relationships. In (a) and (b) the momentum and energy of radioactive decay electrons were measured simultaneously. In these two independent experiments, the data were plotted in different ways, but the results are clearly in good agreement with the relativistic relationships and in poor agreement with the classical, nonrelativistic relationships. In (c) electrons were accelerated to a fixed energy through a large electric field (up to 4.5 million volts, as shown) and the velocities of the electrons were determined by measuring the flight time over 8.4 m. Notice that at small kinetic energies ($K \ll mc^2$), the relativistic and nonrelativistic relationships become identical. Panels: (a) $p^2/2K$ (MeV/$c^2$) versus kinetic energy (MeV), with curves $p^2/2K = m$ (nonrelativistic) and $p^2/2K = m + K/2c^2$ (relativistic); (b) kinetic energy (MeV) versus momentum (MeV/$c$), with curves $K = p^2/2m$ (nonrelativistic) and $K = \sqrt{p^2c^2 + m^2c^4} - mc^2$ (relativistic); (c) $(v/c)^2$ versus kinetic energy (MeV), with curves $K = \tfrac{1}{2}mv^2$ (nonrelativistic) and $K = mc^2\left[1/\sqrt{1 - (v/c)^2} - 1\right]$ (relativistic). [Sources: (a) K. N. Geller and R. Kollarits, Am. J. Phys. 40, 1125 (1972); (b) S. Parker, Am. J. Phys. 40, 241 (1972); (c) W. Bertozzi, Am. J. Phys. 32, 551 (1964).]
For example, consider the capture of a neutron by an atom of hydrogen to form
an atom of deuterium or "heavy hydrogen." Energy is released in this process,
mostly in the form of electromagnetic radiation (gamma rays). The energy of the
gamma rays is measured to be 2.224 MeV. Where does this energy come from?
It comes from the difference in mass when the hydrogen and neutron combine to
form deuterium. The difference between the initial and final masses is:
$$\Delta m = m(\mathrm{hydrogen}) + m(\mathrm{neutron}) - m(\mathrm{deuterium})$$
$$= 1.007825\ \mathrm{u} + 1.008665\ \mathrm{u} - 2.014102\ \mathrm{u} = 0.002388\ \mathrm{u}$$
The initial mass of hydrogen plus neutron is greater than the final mass of
deuterium by 0.002388 u. The energy equivalent of this change in mass is
$$\Delta E = (\Delta m)c^2 = 2.224\ \mathrm{MeV}$$
which is equal to the energy released as gamma rays.
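The conversion from the mass difference to an energy can be checked in a few lines. This is a sketch only: the conversion factor 931.494 MeV per atomic mass unit is a standard value assumed here, since the passage does not state it, and the variable names are hypothetical.

```python
U_TO_MEV = 931.494   # energy equivalent of 1 u in MeV (assumed, not from the text)

M_HYDROGEN = 1.007825    # atomic masses in u, as quoted above
M_NEUTRON = 1.008665
M_DEUTERIUM = 2.014102

# Mass that disappears when the neutron is captured.
delta_m = M_HYDROGEN + M_NEUTRON - M_DEUTERIUM

# Energy equivalent of that mass difference.
delta_e = delta_m * U_TO_MEV

print(f"delta m = {delta_m:.6f} u")
print(f"delta E = {delta_e:.3f} MeV")
```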
Similar experiments have been done to test the $E = mc^2$ relationship by
measuring the energy released as gamma rays following the capture of neutrons
by atoms of silicon and sulfur, and comparing the gamma-ray energies with the
difference between the initial and final masses. These experiments are consistent
with $E = mc^2$ to a precision of about $4 \times 10^{-7}$. [See S. Rainville et al., Nature
438, 1096 (2005).]
Twin Paradox
Although we cannot perform the experiment to test the twin paradox as we have
described it, we can do an equivalent experiment. We take two clocks in our
laboratory and synchronize them carefully. We then place one of the clocks in an
airplane and fly it around the Earth. When we return the clock to the laboratory
and compare the two clocks, we expect to find, if special relativity is correct,
that the clock that has left the laboratory is the “younger” one—that is, it will
have ticked away fewer seconds and appear to run behind its stationary twin. In
this experiment, we must use very precise clocks based on the atomic vibrations
of cesium in order to measure the time differences between the clock readings,
which amount to only about $10^{-7}$ s. This experiment is complicated by several
factors, all of which can be computed rather precisely: the rotating Earth is not
an inertial frame (there is a centripetal acceleration), clocks on the surface of the
Earth are already moving because of the rotation of the Earth, and the general
theory of relativity predicts that a change in the gravitational field strength, which
our moving clock will experience as it changes altitude in its airplane flight, will
also change the rate at which the clock runs. In this experiment, as in the others
we have discussed, the results are entirely in agreement with the predictions of
special relativity. [See J. C. Hafele and R. E. Keating, Science 177, 166 (1972).]
In a similar experiment, a cesium atomic clock carried on the space shuttle was
compared with an identical clock on the Earth. The comparison was made through
a radio link between the shuttle and the ground station. At an orbital height of
about 328 km, the shuttle moves at a speed of about 7712 m/s, or $2.57 \times 10^{-5}c$. A
clock moving at this speed runs slower than an identical clock at rest by the time
dilation factor. For every second the clock is in orbit, it loses 330 ps relative to the
clock on Earth; equivalently, it loses about 1.8 μs per orbit. These time intervals
can be measured with great precision, and the predicted asymmetric aging was
verified to a precision of about 0.1%. [See E. Sappl, Naturwissenschaften 77,
325 (1990).]
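The figures quoted for the shuttle clock follow from the low-speed form of the time dilation factor, $1 - \sqrt{1 - u^2/c^2} \cong \tfrac{1}{2}(u/c)^2$. The sketch below estimates the loss per second and per orbit. The Earth radius of $6.371 \times 10^6$ m and the assumption of a circular orbit are not from the text, and only the special-relativistic effect is included.

```python
import math

C = 299_792_458.0        # speed of light, m/s
SPEED = 7712.0           # orbital speed of the shuttle, m/s
ALTITUDE = 328e3         # orbital height, m
EARTH_RADIUS = 6.371e6   # m; assumed value, not from the text

beta = SPEED / C

# Fractional rate difference, using the binomial form to avoid
# subtracting two nearly equal numbers.
fractional_loss = 0.5 * beta**2

loss_per_second_ps = fractional_loss * 1e12

# Period of an assumed circular orbit at this altitude and speed.
period = 2.0 * math.pi * (EARTH_RADIUS + ALTITUDE) / SPEED
loss_per_orbit_us = fractional_loss * period * 1e6

print(f"u/c            = {beta:.3e}")
print(f"loss per second ~ {loss_per_second_ps:.0f} ps")
print(f"orbital period  ~ {period / 60.0:.0f} min")
print(f"loss per orbit  ~ {loss_per_orbit_us:.2f} microseconds")
```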
Chapter Summary

Galilean relativity (Section 2.1): $x' = x - ut$, $v'_x = v_x - u$
Einstein’s postulates (Section 2.3): (1) The laws of physics are the same in all inertial frames. (2) The speed of light has the same value c in all inertial frames.
Time dilation (Section 2.4): $\Delta t = \dfrac{\Delta t_0}{\sqrt{1 - u^2/c^2}}$ ($\Delta t_0$ = proper time)
Length contraction (Section 2.4): $L = L_0\sqrt{1 - u^2/c^2}$ ($L_0$ = proper length)
Velocity addition (Section 2.4): $v = \dfrac{v' + u}{1 + v'u/c^2}$
Doppler effect, source and observer separating (Section 2.4): $f' = f\sqrt{\dfrac{1 - u/c}{1 + u/c}}$
Lorentz transformation (Section 2.5): $x' = \dfrac{x - ut}{\sqrt{1 - u^2/c^2}}$, $y' = y$, $z' = z$, $t' = \dfrac{t - (u/c^2)x}{\sqrt{1 - u^2/c^2}}$
Lorentz velocity transformation (Section 2.5): $v'_x = \dfrac{v_x - u}{1 - v_x u/c^2}$, $v'_y = \dfrac{v_y\sqrt{1 - u^2/c^2}}{1 - v_x u/c^2}$, $v'_z = \dfrac{v_z\sqrt{1 - u^2/c^2}}{1 - v_x u/c^2}$
Clock synchronization (Section 2.5): $\Delta t' = \dfrac{uL/c^2}{\sqrt{1 - u^2/c^2}}$
Relativistic momentum (Section 2.7): $\vec{p} = \dfrac{m\vec{v}}{\sqrt{1 - v^2/c^2}}$
Relativistic kinetic energy (Section 2.7): $K = \dfrac{mc^2}{\sqrt{1 - v^2/c^2}} - mc^2$
Rest energy (Section 2.7): $E_0 = mc^2$
Relativistic total energy (Section 2.7): $E = K + E_0 = \dfrac{mc^2}{\sqrt{1 - v^2/c^2}}$
Momentum-energy relationship (Section 2.7): $E = \sqrt{(pc)^2 + (mc^2)^2}$
Extreme relativistic approximation (Section 2.7): $E \cong pc$
Conservation laws (Section 2.8): In an isolated system of particles, the total momentum and the relativistic total energy remain constant.
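Several of the summary formulas can be collected into a few small functions and checked against one another. The following is a minimal sketch, not part of the original text; the function names and the choice of an electron moving at 0.6c are illustrative assumptions, and speeds are expressed as fractions of c.

```python
import math

def gamma(beta):
    """Lorentz factor 1/sqrt(1 - v^2/c^2) for speed v = beta * c (requires |beta| < 1)."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

def dynamics(rest_energy, beta):
    """Return (pc, K, E) in the same units as rest_energy (for example MeV)."""
    g = gamma(beta)
    pc = g * rest_energy * beta      # p = m v / sqrt(1 - v^2/c^2), multiplied by c
    total = g * rest_energy          # E = m c^2 / sqrt(1 - v^2/c^2)
    kinetic = total - rest_energy    # K = E - E0
    return pc, kinetic, total

def add_velocities(beta_prime, beta_u):
    """Velocity addition v = (v' + u) / (1 + v' u / c^2), with speeds in units of c."""
    return (beta_prime + beta_u) / (1.0 + beta_prime * beta_u)

E0 = 0.511                           # electron rest energy in MeV (illustrative choice)
pc, K, E = dynamics(E0, 0.6)
print("pc, K, E (MeV):", pc, K, E)
# The momentum-energy relationship should hold for the values just computed.
print("E^2 = (pc)^2 + (mc^2)^2 ?", math.isclose(E * E, pc * pc + E0 * E0))
# Combining two speeds of 0.6c gives a result below c.
print("0.6c combined with 0.6c:", add_velocities(0.6, 0.6))
```

Working in units where speeds are fractions of c and energies are in MeV avoids carrying factors of c through the arithmetic.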
Questions
1. Explain in your own words what is meant by the term
“relativity.” Are there different theories of relativity?
2. Suppose the two observers and the rock described in the
first paragraph of Section 2.1 were isolated in interstellar
space. Discuss the two observers’ differing perceptions of
the motion of the rock. Is there any experiment they can
do to determine whether the rock is moving in any absolute
sense?
3. Describe the situation of Figure 2.4 as it would appear from
the reference frame of O′ .
4. Does the Michelson-Morley experiment show that the ether
does not exist or that it is merely unnecessary?
5. Suppose we made a pair of shears in which the cutting blades
were many orders of magnitude longer than the handle. Let
us in fact make them so long that, when we move the handles
at angular velocity ω, a point on the tip of the blade has a
tangential velocity v = ωr that is greater than c. Does this
contradict special relativity? Justify your answer.
6. Light travels through water at a speed of about $2.25 \times 10^{8}$
m/s. Is it possible for a particle to travel through water at a
speed v greater than $2.25 \times 10^{8}$ m/s?
7. Is it possible to have particles that travel at the speed of
light? What does Eq. 2.36 require of such particles?
8. How does relativity combine space and time coordinates
into spacetime?
9. Einstein developed the relativity theory after trying unsuccessfully to imagine how a light beam would look to an
observer traveling with the beam at speed c. Why is this so
difficult to imagine?
10. Explain in your own words the terms time dilation and
length contraction.
11. Does the Moon’s disk appear to be a different size to a
space traveler approaching it at v = 0.99c, compared with
the view of a person at rest at the same location?
12. According to the time dilation effect, would the life
expectancy of someone who lives at the equator be longer
or shorter than someone who lives at the North Pole? By
how much?
13. Criticize the following argument. “Here is a way to travel faster than light. Suppose a star is 10 light-years away. A radio signal sent from Earth would need 20 years to make the round trip to the star. If I were to travel to the star in my rocket at v = 0.8c, to me the distance to the star is contracted by $\sqrt{1 - (0.8)^2}$ to 6 light-years, and at that speed it would take me 6 light-years/0.8c = 7.5 years to travel there. The round trip takes me only 15 years, and therefore I travel faster than light, which takes 20 years.”
14. Is it possible to synchronize clocks that are in motion relative to each other? Try to design a method to do so. Which observers will believe the clocks to be synchronized?
15. Suppose event A causes event B. To one observer, event A comes before event B. Is it possible that in another frame of reference event B could come before event A? Discuss.
16. Is mass a conserved quantity in classical physics? In special relativity?
17. “In special relativity, mass and energy are equivalent.” Discuss this statement and give examples.
18. Which is more massive, an object at low temperature or the same object at high temperature? A spring at its natural length or the same spring under compression? A container of gas at low pressure or at high pressure? A charged capacitor or an uncharged one?
19. Could a collision be elastic in one frame of reference and inelastic in another?
20. (a) What properties of nature would be different if there were a relativistic transformation law for electric charge? (b) What experiments could be done to prove that electric charge does not change with velocity?
Problems
2.1 Classical Relativity
1. You are piloting a small airplane in which you want to reach
a destination that is 750 km due north of your starting location. Once you are airborne, you find that (due to a strong
but steady wind) to maintain a northerly course you must
point the nose of the plane at an angle that is 22° west of
true north. From previous flights on this route in the absence
of wind, you know that it takes you 3.14 h to make the
journey. With the wind blowing, you find that it takes 4.32
h. A fellow pilot calls you to ask about the wind velocity
(magnitude and direction). What is your report?
2. A moving sidewalk 95 m in length carries passengers at
a speed of 0.53 m/s. One passenger has a normal walking
speed of 1.24 m/s. (a) If the passenger stands on the sidewalk without walking, how long does it take her to travel
the length of the sidewalk? (b) If she walks at her normal
walking speed on the sidewalk, how long does it take to
travel the full length? (c) When she reaches the end of the
sidewalk, she suddenly realizes that she left a package at the
opposite end. She walks rapidly back along the sidewalk at
double her normal walking speed to retrieve the package.
How long does it take her to reach the package?
2.2 The Michelson-Morley Experiment
3. A shift of one fringe in the Michelson-Morley experiment
corresponds to a change in the round-trip travel time along
one arm of the interferometer by one period of vibration
of light (about $2 \times 10^{-15}$ s) when the apparatus is rotated
by 90°. Based on the results of Example 2.3, what velocity through the ether would be deduced from a shift of
one fringe? (Take the length of the interferometer arm to
be 11 m.)
2.4 Consequences of Einstein’s Postulates
4. The distance from New York to Los Angeles is about
5000 km and should take about 50 h in a car driving at
100 km/h. (a) How much shorter than 5000 km is the distance according to the car travelers? (b) How much less than
50 h do they age during the trip?
5. How fast must an object move before its length appears to
be contracted to one-half its proper length?
6. An astronaut must journey to a distant planet, which is
200 light-years from Earth. What speed will be necessary
if the astronaut wishes to age only 10 years during the
round trip?
7. The proper lifetime of a certain particle is 100.0 ns. (a) How
long does it live in the laboratory if it moves at v = 0.960c?
(b) How far does it travel in the laboratory during that time?
(c) What is the distance traveled in the laboratory according
to an observer moving with the particle?
8. High-energy particles are observed in laboratories by photographing the tracks they leave in certain detectors; the
length of the track depends on the speed of the particle and its lifetime. A particle moving at 0.995c leaves
a track 1.25 mm long. What is the proper lifetime of the
particle?
9. Carry out the missing steps in the derivation of Eq. 2.17.
10. Two spaceships approach the Earth from opposite directions.
According to an observer on the Earth, ship A is moving at
a speed of 0.753c and ship B at a speed of 0.851c. What is
the velocity of ship A as observed from ship B? Of ship B as
observed from ship A?
11. Rocket A leaves a space station with a speed of 0.826c.
Later, rocket B leaves in the same direction with a speed of
0.635c. What is the velocity of rocket A as observed from
rocket B?
12. One of the strongest emission lines observed from distant
galaxies comes from hydrogen and has a wavelength of
122 nm (in the ultraviolet region). (a) How fast must a
galaxy be moving away from us in order for that line to be
observed in the visible region at 366 nm? (b) What would
be the wavelength of the line if that galaxy were moving
toward us at the same speed?
13. A physics professor claims in court that the reason he
went through the red light (λ = 650 nm) was that, due to
his motion, the red color was Doppler shifted to green
(λ = 550 nm). How fast was he going?
2.5 The Lorentz Transformation
14. Derive the Lorentz velocity transformations for v′x and v′z .
15. Observer O fires a light beam in the y direction (vy = c).
Use the Lorentz velocity transformation to find v′x and v′y
and show that O′ also measures the value c for the speed of
light. Assume that O′ moves relative to O with velocity u in
the x direction.
16. A light bulb at point x in the frame of reference of O
blinks on and off at intervals $\Delta t = t_2 - t_1$. Observer O′,
moving relative to O at speed u, measures the interval to be
$\Delta t' = t'_2 - t'_1$. Use the Lorentz transformation expressions
to derive the time dilation expression relating $\Delta t$ and $\Delta t'$.
17. A neutral K meson at rest decays into two π mesons, which
travel in opposite directions along the x axis with speeds of
0.828c. If instead the K meson were moving in the positive
x direction with a velocity of 0.486c, what would be the
velocities of the two π mesons?
18. A rod in the reference frame of observer O makes an angle
of 31° with the x axis. According to observer O′, who is in
motion in the x direction with velocity u, the rod makes an
angle of 46° with the x axis. Find the velocity u.
19. According to observer O, two events occur separated by a
time interval $\Delta t = +0.465$ μs and at locations separated by
$\Delta x = +53.4$ m. (a) According to observer O′, who is in
motion relative to O at a speed of 0.762c in the positive x
direction, what is the time interval between the two events?
(b) What is the spatial separation between the two events,
according to O′ ?
20. According to observer O, a blue flash occurs at xb = 10.4 m
when tb = 0.124 μs, and a red flash occurs at xr = 23.6 m
when tr = 0.138 μs. According to observer O′ , who is in
motion relative to O at velocity u, the two flashes appear to
be simultaneous. Find the velocity u.
2.6 The Twin Paradox
21. Suppose the speed of light were 1000 mi/h. You are traveling
on a flight from Los Angeles to Boston, a distance of 3000
mi. The plane’s speed is a constant 600 mi/h. You leave Los
Angeles at 10:00 A.M., as indicated by your wristwatch and
by a clock in the airport. (a) According to your watch, what
time is it when you land in Boston? (b) In the Boston airport
is a clock that is synchronized to read exactly the same time
as the clock in the Los Angeles airport. What time does that
clock read when you land in Boston? (c) The following day
when the Boston clock that records Los Angeles time reads
10:00 A.M., you leave Boston to return to Los Angeles on
the same airplane. When you land in Los Angeles, what are
the times read on your watch and on the airport clock?
22. Suppose rocket traveler Amelia has a clock made on Earth.
Every year on her birthday she sends a light signal to brother
Casper on Earth. (a) At what rate does Casper receive
the signals during Amelia’s outward journey? (b) At what
rate does he receive the signals during her return journey?
(c) How many of Amelia’s birthday signals does Casper
receive during the journey that he measures to last 20 years?
23. Suppose Amelia traveled at a speed of 0.80c to a star that
(according to Casper on Earth) is 8.0 light-years away.
Casper ages 20 years during Amelia’s round trip. How much
younger than Casper is Amelia when she returns to Earth?
24. Make a drawing similar to Figure 2.20 showing the worldlines of Casper and Amelia from Casper’s frame of reference.
Divide the world line for Amelia’s outward journey into 8
equal segments (for the 8 birthdays that Amelia celebrates).
For each birthday, draw a line that represents a light signal
that Amelia sends to Casper on her birthday. Do the same
for Amelia’s return journey. (a) According to Casper’s time,
when does he receive the signal showing Amelia celebrating
her 8th birthday after leaving Earth? (b) How long does
it take for Casper to receive the signals showing Amelia
celebrating birthdays 9 through 16?
2.7 Relativistic Dynamics
25. (a) Using the relativistically correct final velocities for the collision shown in Figure 2.21a (v′1f = −0.585c, v′2f = +0.294c), show that relativistic kinetic energy is conserved according to observer O′. (b) Using the relativistically correct final velocities for the collision shown in Figure 2.21b (v1f = −0.051c, v2f = +0.727c), show that relativistic kinetic energy is conserved according to observer O.
26. Find the momentum, kinetic energy, and total energy of a proton moving at a speed of 0.756c.
27. An electron is moving with a kinetic energy of 1.264 MeV. What is its speed?
28. The work-energy theorem relates the change in kinetic energy of a particle to the work done on it by an external force: $\Delta K = W = \int F\,dx$. Writing Newton’s second law as F = dp/dt, show that $W = \int v\,dp$ and integrate by parts using the relativistic momentum to obtain Eq. 2.34.
29. For what range of velocities of a particle of mass m can we use the classical expression for kinetic energy $\tfrac{1}{2}mv^2$ to within an accuracy of 1%?
30. For what range of velocities of a particle of mass m can we use the extreme relativistic approximation E = pc to within an accuracy of 1%?
31. Use Eqs. 2.32 and 2.36 to derive Eq. 2.39.
32. Use the binomial expansion $(1 + x)^n = 1 + nx + [n(n - 1)/2!]x^2 + \cdots$ to show that Eq. 2.34 for the relativistic kinetic energy reduces to the classical expression $\tfrac{1}{2}mv^2$ when $v \ll c$. This important result shows that our familiar expressions are correct at low speeds. By evaluating the first term in the expansion beyond $\tfrac{1}{2}mv^2$, find the speed necessary before the classical expression is off by 0.01%.
33. (a) According to observer O, a certain particle has a momentum of 817 MeV/c and a total relativistic energy of 1125 MeV. What is the rest energy of this particle? (b) An observer O′ in a different frame of reference measures the momentum of this particle to be 953 MeV/c. What does O′ measure for the total relativistic energy of the particle?
34. An electron is moving at a speed of 0.81c. By how much must its kinetic energy increase to raise its speed to 0.91c?
35. What is the change in mass when 1 g of copper is heated from 0 to 100°C? The specific heat capacity of copper is 0.40 J/g · K.
36. Find the kinetic energy of an electron moving at a speed of (a) $v = 1.00 \times 10^{-4}c$; (b) $v = 1.00 \times 10^{-2}c$; (c) v = 0.300c; (d) v = 0.999c.
37. An electron and a proton are each accelerated starting from rest through a potential difference of 10.0 million volts. Find the momentum (in MeV/c) and the kinetic energy (in MeV) of each, and compare with the results of using the classical formulas.
38. In a nuclear reactor, each atom of uranium (of atomic mass 235 u) releases about 200 MeV when it fissions. What is the change in mass when 1.00 kg of uranium-235 is fissioned?
2.8 Conservation Laws in Relativistic Decays and Collisions
39. A π meson of rest energy 139.6 MeV moving at a speed of
0.906c collides with and sticks to a proton of rest energy
938.3 MeV that is at rest. (a) Find the total relativistic
energy of the resulting composite particle. (b) Find the total
linear momentum of the composite particle. (c) Using the
results of (a) and (b), find the rest energy of the composite
particle.
40. An electron and a positron (an antielectron) make a head-on
collision, each moving at v = 0.99999c. In the collision
the electrons disappear and are replaced by two muons
($mc^2 = 105.7$ MeV), which move off in opposite directions.
What is the kinetic energy of each of the muons?
41. It is desired to create a particle of mass 9700 MeV/$c^2$ in a
head-on collision between a proton and an antiproton (each
having a mass of 938.3 MeV/$c^2$) traveling at the same speed.
What speed is necessary for this to occur?
42. A particle of rest energy $mc^2$ is moving with speed v in the
positive x direction. The particle decays into two particles,
each of rest energy 140 MeV. One particle, with kinetic
energy 282 MeV, moves in the positive x direction, and the
other particle, with kinetic energy 25 MeV, moves in the
negative x direction. Find the rest energy of the original
particle and its speed.
2.9 Experimental Tests of Special Relativity
43. In the muon decay experiment discussed in Section 2.9 as a
verification of time dilation, the muons move in the lab with
a momentum of 3094 MeV/c. Find the dilated lifetime in
the laboratory frame. (The proper lifetime is 2.198 μs.)
44. Derive the relativistic expression $p^2/2K = m + K/2c^2$,
which is plotted in Figure 2.28a.
General Problems
45. Suppose we want to send an astronaut on a round trip to
visit a star that is 200 light-years distant and at rest with
respect to Earth. The life support systems on the spacecraft
enable the astronaut to survive at most 20 years. (a) At what
speed must the astronaut travel to make the round trip in
20 years of spacecraft time? (b) How much time passes on
Earth during the round trip?
46. A “cause” occurs at point 1 (x1 , t1 ) and its “effect” occurs
at point 2 (x2 , t2 ). Use the Lorentz transformation to find
t2′ − t1′ , and show that t2′ − t1′ > 0; that is, O′ can never see
the “effect” coming before its “cause.”
47. Observer O sees a red flash of light at the origin at t = 0 and a blue flash of light at x = 3.26 km at a time t = 7.63 μs. What are the distance and the time interval between the flashes according to observer O′, who moves relative to O in the direction of increasing x with a speed of 0.625c? Assume that the origins of the two coordinate systems line up at t = t′ = 0.
48. Several spacecraft (A, B, C, and D) leave a space station at the same time. Relative to an observer on the station, A travels at 0.60c in the x direction, B at 0.50c in the y direction, C at 0.50c in the negative x direction, and D at 0.50c at 45° between the y and negative x directions. Find the velocity components, directions, and speeds of B, C, and D as observed from A.
49. Observer O sees a light turn on at x = 524 m when t = 1.52 μs. Observer O′ is in motion at a speed of 0.563c in the positive x direction. The two frames of reference are synchronized so that their origins match up (x = x′ = 0) at t = t′ = 0. (a) At what time does the light turn on according to O′? (b) At what location does the light turn on in the reference frame of O′?
50. Suppose an observer O measures a particle of mass m moving in the x direction to have speed v, energy E, and momentum p. Observer O′, moving at speed u in the x direction, measures v′, E′, and p′ for the same object. (a) Use the Lorentz velocity transformation to find E′ and p′ in terms of m, u, and v. (b) Reduce $E'^2 - (p'c)^2$ to its simplest form and interpret the result.
51. Repeat Problem 50 for the mass moving in the y direction according to O. The velocity u of O′ is still along the x direction.
52. Consider again the situation described in Section 2.6. Amelia’s friend Bernice leaves Earth at the same time as Amelia and travels in the same direction at the same speed, but Bernice continues in the original direction when Amelia reaches the planet and turns her ship around. (a) From Bernice’s frame of reference, Casper is moving at a velocity of −0.60c. Draw Casper’s worldline in Bernice’s frame of reference. (b) Casper celebrates 20 birthdays during Amelia’s journey. In Bernice’s frame of reference, how long does it take for Casper to celebrate 20 birthdays? (c) In Bernice’s frame of reference, draw a worldline representing Amelia’s outbound journey to the planet. (d) Calculate Amelia’s velocity during her return journey as observed from Bernice’s frame of reference, and draw a worldline showing Amelia’s return journey. Amelia’s and Casper’s worldlines should intersect when Amelia returns to Earth. (e) Divide Casper’s worldline into 20 segments, representing his birthdays. He sends a light signal to Amelia on each birthday. Amelia receives a light signal from Casper just as she arrives at the planet. On which birthday did Casper send this signal? (f) Amelia sends Casper a light signal on her 8th birthday. Draw a line on your diagram representing this light signal. When does Casper receive this signal?
53. Electrons are accelerated to high speeds by a two-stage machine. The first stage accelerates the electrons from rest to v = 0.99c. The second stage accelerates the electrons from 0.99c to 0.999c. (a) How much energy does the first stage add to the electrons? (b) How much energy does the second stage add in increasing the velocity by only 0.9%?
54. A beam of $1.35 \times 10^{11}$ electrons/s moving at a speed of 0.732c strikes a block of copper that is used as a beam stop. The copper block is a cube measuring 2.54 cm on edge. What is the temperature increase of the block after one hour?
55. An electron moving at a speed of vi = 0.960c in the positive x direction collides with another electron at rest. After the collision, one electron is observed to move with a speed of v1f = 0.956c at an angle of θ1 = 9.7° with the x axis. (a) Use conservation of momentum to find the velocity (magnitude and direction) of the second electron. (b) Based only on the original data given in the problem, use conservation of energy to find the speed of the second electron.
56. A pion has a rest energy of 135 MeV. It decays into two gamma ray photons, bursts of electromagnetic radiation that travel at the speed of light. A pion moving through the laboratory at v = 0.98c decays into two gamma ray photons of equal energies, making equal angles θ with the original direction of motion. Find the angle θ and the energies of the two gamma ray photons.
57. Consider again the decay described in Example 2.16 and determine the energies of the two pi mesons emitted in the decay of the K meson by first making a Lorentz transformation to a reference frame in which the initial K meson is at rest. When a K meson at rest decays into two pi mesons, they move in opposite directions with equal and opposite velocities, so they share the decay energy equally. Find the energies and velocities of the two pi mesons in the K meson’s rest frame. Then transform back to the lab frame to find their kinetic energies.
Chapter
3
THE PARTICLELIKE PROPERTIES OF
ELECTROMAGNETIC RADIATION
Thermal emission, the radiation emitted by all objects due to their temperatures, laid the
groundwork for the development of quantum mechanics around the beginning of the 20th
century. Today we use thermography for many applications, including the study of heat loss
by buildings, medical diagnostics, night vision and other surveillance, and monitoring
potential volcanoes.
Chapter 3 | The Particlelike Properties of Electromagnetic Radiation
We now turn to a discussion of wave mechanics, the second theory on which
modern physics is based. One consequence of wave mechanics is the breakdown
of the classical distinction between particles and waves. In this chapter we consider
the three early experiments that provided evidence that light, which we usually
regard as a wave phenomenon, has properties that we normally associate with
particles. Instead of spreading its energy smoothly over a wave front, the energy
is delivered in concentrated bundles like particles; a discrete bundle (quantum) of
electromagnetic energy is known as a photon.
Before we begin to discuss the experimental evidence that supports the
existence of the photon and the particlelike properties of light, we first review
some of the properties of electromagnetic waves.
3.1 REVIEW OF ELECTROMAGNETIC WAVES
An electromagnetic field is characterized by its electric field $\vec{E}$ and magnetic field
$\vec{B}$. For example, the electric field at a distance r from a point charge q at the
origin is
$\vec{E} = \dfrac{1}{4\pi\varepsilon_0}\,\dfrac{q}{r^2}\,\hat{r}$ (3.1)
where r̂ is a unit vector in the radial direction. The magnetic field at a distance r
from a long, straight, current-carrying wire along the z axis is
$\vec{B} = \dfrac{\mu_0 i}{2\pi r}\,\hat{\varphi}$ (3.2)
where φ̂ is the unit vector in the azimuthal direction (in the xy plane) in cylindrical
coordinates.
If the charges are accelerated, or if the current varies with time, an electromagnetic
wave is produced, in which $\vec{E}$ and $\vec{B}$ vary not only with $\vec{r}$ but also
with t. The mathematical expression that describes such a wave may have many
different forms, depending on the properties of the source of the wave and of the
medium through which the wave travels. One special form is the plane wave, in
which the wave fronts are planes. (A point source, on the other hand, produces
spherical waves, in which the wave fronts are spheres.) A plane electromagnetic
wave traveling in the positive z direction is described by the expressions
$\vec{E} = \vec{E}_0 \sin(kz - \omega t), \qquad \vec{B} = \vec{B}_0 \sin(kz - \omega t)$ (3.3)
where the wave number k is found from the wavelength λ (k = 2π/λ) and the
angular frequency ω is found from the frequency f (ω = 2π f ). Because λ and f
are related by c = λf , k and ω are also related by c = ω/k.
The polarization of the wave is represented by the vector $\vec{E}_0$; the plane of
polarization is determined by the direction of $\vec{E}_0$ and the direction of propagation,
the z axis in this case. Once we specify the direction of travel and the polarization
$\vec{E}_0$, the direction of $\vec{B}_0$ is fixed by the requirements that $\vec{B}$ must be perpendicular
to both $\vec{E}$ and the direction of travel, and that the vector product $\vec{E} \times \vec{B}$ point in
the direction of travel. For example, if $\vec{E}_0$ is in the x direction ($\vec{E}_0 = E_0\hat{\imath}$, where $\hat{\imath}$
is a unit vector in the x direction), then $\vec{B}_0$ must be in the y direction ($\vec{B}_0 = B_0\hat{\jmath}$).
Moreover, the magnitude of $\vec{B}_0$ is determined by
$B_0 = \dfrac{E_0}{c}$ (3.4)
where c is the speed of light.
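The direction rule and Eq. 3.4 can be illustrated with a small vector calculation. This is only a sketch: the amplitude of 100 V/m is a hypothetical value chosen for illustration, and the helper function `cross` is defined here rather than taken from any library.

```python
C = 299_792_458.0            # speed of light, m/s

def cross(a, b):
    """Vector product a x b for 3-component tuples (x, y, z)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

E0 = 100.0                   # electric field amplitude in V/m (hypothetical value)
B0 = E0 / C                  # magnetic field amplitude in T, from Eq. 3.4

E_vec = (E0, 0.0, 0.0)       # polarization along x
B_vec = (0.0, B0, 0.0)       # magnetic field along y

ExB = cross(E_vec, B_vec)    # should point along +z, the direction of travel
print("B0 =", B0, "T")
print("E x B =", ExB)
```

With the electric field along x and the magnetic field along y, only the z component of the vector product is nonzero and it is positive, as required for a wave traveling in the positive z direction. Reversing the direction of the magnetic field would reverse the sign of that component.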
An electromagnetic wave transmits energy from one place to another; the
energy flux is specified by the Poynting vector $\vec{S}$:
$\vec{S} = \dfrac{1}{\mu_0}\,\vec{E} \times \vec{B}$ (3.5)
For the plane wave, this reduces to
$\vec{S} = \dfrac{1}{\mu_0}\,E_0 B_0 \sin^2(kz - \omega t)\,\hat{k}$ (3.6)
where $\hat{k}$ is a unit vector in the z direction. The Poynting vector has dimensions
of power (energy per unit time) per unit area, for example, J/s/m$^2$ or W/m$^2$.
Figure 3.1 shows the orientation of the vectors $\vec{E}$, $\vec{B}$, and $\vec{S}$ for this special case.
FIGURE 3.1 An electromagnetic wave traveling in the z direction. The electric
field $\vec{E}$ lies in the xz plane and the magnetic field $\vec{B}$ lies in the yz plane.
Let us imagine the following experiment. We place a detector of electromagnetic radiation (a radio receiver or a human eye) at some point on the z axis,
and we determine the electromagnetic power that this plane wave delivers to the
receiver. The receiver is oriented with its sensitive area A perpendicular to the z
axis, so that the maximum signal is received; we can therefore drop the vector
representation of $\vec{S}$ and work only with its magnitude S. The power P entering the
receiver is then
$P = SA = \dfrac{1}{\mu_0}\,E_0 B_0 A \sin^2(kz - \omega t)$ (3.7)
which we can rewrite using Eq. 3.4 as
$P = \dfrac{1}{\mu_0 c}\,E_0^2 A \sin^2(kz - \omega t)$ (3.8)
There are two important features of this expression that you should recognize:
1. The intensity (the average power per unit area) is proportional to $E_0^2$. This
is a general property of waves: the intensity is proportional to the square of the
amplitude. We will see later that this same property also characterizes the waves
that describe the behavior of material particles.
2. The intensity fluctuates with time, with the frequency $2f = 2(\omega/2\pi)$. We
don’t usually observe this rapid fluctuation: visible light, for example, has a
frequency of about $10^{15}$ oscillations per second, and because our eye doesn’t
respond that quickly, we observe the time average of many (perhaps $10^{13}$) cycles.
If T is the observation time (perhaps $10^{-2}$ s in the case of the eye) then the average
power is
$P_{\rm av} = \dfrac{1}{T}\displaystyle\int_0^T P\,dt$ (3.9)
and using Eq. 3.8 we obtain the intensity I:
$I = \dfrac{P_{\rm av}}{A} = \dfrac{1}{2\mu_0 c}\,E_0^2$ (3.10)
because the average value of $\sin^2\theta$ is 1/2.
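The time average that leads to the intensity formula can be checked numerically. The sketch below is an illustration under stated assumptions: it averages $\sin^2$ over about a thousand cycles (far fewer than an eye would average over, but enough to show the trend), uses the classical value of $\mu_0$, and takes a hypothetical field amplitude of 100 V/m.

```python
import math

C = 299_792_458.0            # speed of light, m/s
MU0 = 4e-7 * math.pi         # permeability of free space, T·m/A (classical value)

def mean_sin_squared(total_phase, samples=200_000):
    """Average of sin^2 over the phase interval [0, total_phase], by the midpoint rule."""
    step = total_phase / samples
    return sum(math.sin((i + 0.5) * step) ** 2 for i in range(samples)) / samples

def intensity(e0):
    """Average power per unit area, I = E0^2 / (2 mu0 c), in W/m^2 for E0 in V/m."""
    return e0 ** 2 / (2.0 * MU0 * C)

cycles = 1000.3                                  # deliberately not a whole number of cycles
avg = mean_sin_squared(2.0 * math.pi * cycles)
E0 = 100.0                                       # field amplitude in V/m (hypothetical value)
peak_flux = E0 ** 2 / (MU0 * C)                  # maximum of S, in W/m^2

print("average of sin^2:", avg)
print("intensity:", intensity(E0), "W/m^2; half the peak flux:", 0.5 * peak_flux)
```

Even when the averaging interval is not a whole number of cycles, the average differs from 1/2 only by a small amount that shrinks as more cycles are included, which is why the detector responds to half the peak energy flux.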
Interference and Diffraction
FIGURE 3.2 (a) Young’s double-slit experiment. A plane wave front
passes through both slits; the wave is diffracted at the slits, and
interference occurs where the diffracted waves overlap on the screen.
(b) The interference fringes observed on the screen.
The property that makes waves a unique physical phenomenon is the principle
of superposition, which, for example, allows two waves to meet at a point, to
cause a combined disturbance at the point that might be greater or less than the
disturbance produced by either wave alone, and finally to emerge from the point
of “collision” with all of the properties of each wave totally unchanged by the
collision. To appreciate this important distinction between material objects and
waves, imagine trying that trick with two automobiles!
This special property of waves leads to the phenomena of interference and
diffraction. The simplest and best-known example of interference is Young’s
double-slit experiment, in which a monochromatic plane wave is incident on
a barrier in which two narrow slits have been cut. (This experiment was first
done with light waves, but in fact any wave will do as well, not only other
electromagnetic waves, such as microwaves, but also mechanical waves, such as
water waves or sound waves. We assume that the experiment is being done with
light waves.)
Figure 3.2 illustrates this experimental arrangement. The plane wave is
diffracted by each of the slits, so that the light passing through each slit covers a
much larger area on the screen than the geometric shadow of the slit. This causes
the light from the two slits to overlap on the screen, producing the interference.
If we move away from the center of the screen just the right distance, we reach
a point at which a wave crest passing through one slit arrives at exactly the
same time as the previous wave crest that passed through the other slit. When this
occurs, the intensity is a maximum, and a bright region appears on the screen. This
is constructive interference, and it occurs continually at the point on the screen
that is exactly one wavelength further from one slit than from the other. That is,
if X1 and X2 are the distances from the point on the screen to the two slits, then a
condition for maximum constructive interference is |X1 − X2 | = λ. Constructive
interference occurs when any wave crest from one slit arrives simultaneously
with another from the other slit, whether it is the next, or the fourth, or the
forty-seventh. The general condition for complete constructive interference is that
the difference between X1 and X2 be an integral number of wavelengths:
$|X_1 - X_2| = n\lambda \qquad n = 0, 1, 2, \ldots$ (3.11)
It is also possible for the crest of the wave from one slit to arrive at a point on
the screen simultaneously with the trough (valley) of the wave from the other slit.
When this happens, the two waves cancel, giving a dark region on the screen. This
is known as destructive interference. (The existence of destructive interference at
intensity minima immediately shows that we must add the electric field vectors E
of the waves from the two slits, and not their powers P, because P can never be
negative.) Destructive interference occurs whenever the distances X1 and X2 are
such that the phase of one wave differs from the other by one-half cycle, or by
one and one-half cycles, two and one-half cycles, and so forth:
$|X_1 - X_2| = \tfrac{1}{2}\lambda,\ \tfrac{3}{2}\lambda,\ \tfrac{5}{2}\lambda, \ldots = (n + \tfrac{1}{2})\lambda \qquad n = 0, 1, 2, \ldots$ (3.12)
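The two interference conditions can be illustrated by adding the fields of two waves and then squaring. The sketch below assumes, for simplicity, that the two waves arrive with equal amplitudes (an idealization not stated in the text) and represents each wave by a complex number of unit magnitude; the function name is illustrative.

```python
import cmath
import math

def relative_intensity(path_difference_in_wavelengths):
    """Intensity of two superposed equal-amplitude waves, in units of one wave's intensity."""
    delta = 2.0 * math.pi * path_difference_in_wavelengths   # phase difference in radians
    field = 1.0 + cmath.exp(1j * delta)                      # add the fields, not the powers
    return abs(field) ** 2

for n in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(f"|X1 - X2| = {n} wavelengths -> relative intensity {relative_intensity(n):.3f}")
```

At whole-number path differences the combined intensity is four times that of a single wave, and at half-integer path differences it vanishes. Adding the powers instead of the fields would give twice the single-wave intensity everywhere and no fringes.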
We can find the locations on the screen where the interference maxima occur in
the following way. Let d be the separation of the slits, and let D be the distance
3.1 | Review of Electromagnetic Waves
from the slits to the screen. If yn is the distance from the center of the screen to the
nth maximum, then from the geometry of Figure 3.3 we find (assuming X1 > X2 )
2
2
d
d
X12 = D2 +
and X22 = D2 +
(3.13)
+ yn
− yn
2
2
d
X1
Subtracting these equations and solving for yn , we obtain
yn =
X12 − X22
(X + X2 )(X1 − X2 )
= 1
2d
2d
D
d
X2
D
(3.14)
In experiments with light, D is of order 1 m, and yn and d are typically at most
1 mm; thus X1 ∼
= D and X2 ∼
= D, so X1 + X2 ∼
= 2D, and to a good approximation
yn = (X1 − X2 )
73
(3.15)
d
2
yn
d−y
2 n
FIGURE 3.3 The geometry of the
double-slit experiment.
Using Eq. 3.11 for the values of (X1 − X2 ) at the maxima, we find
$$y_n = n\,\frac{\lambda D}{d} \tag{3.16}$$
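The approximation leading to Eq. 3.16 can be checked numerically. The sketch below places the fringes with Eq. 3.16 and then evaluates the exact path difference X1 − X2 from the geometry of Eq. 3.13. The wavelength, slit separation, and screen distance are illustrative assumptions, not values from the text, and the function names are hypothetical.

```python
import math

def fringe_position(n, wavelength, slit_sep, screen_dist):
    """Small-angle position of the nth maximum (Eq. 3.16): y_n = n * lambda * D / d."""
    return n * wavelength * screen_dist / slit_sep

def path_difference(y, slit_sep, screen_dist):
    """Exact X1 - X2 for a point at height y on the screen (geometry of Eq. 3.13)."""
    x1 = math.hypot(screen_dist, slit_sep / 2 + y)
    x2 = math.hypot(screen_dist, slit_sep / 2 - y)
    return x1 - x2

# Assumed illustrative values (not from the text)
wavelength = 500e-9   # m
d = 0.5e-3            # m, slit separation
D = 1.0               # m, slit-to-screen distance

for n in range(4):
    y = fringe_position(n, wavelength, d, D)
    ratio = path_difference(y, d, D) / wavelength
    print(f"n = {n}: y_n = {y * 1e3:.2f} mm, exact path difference = {ratio:.6f} wavelengths")
```

For these values the exact path difference at each predicted maximum differs from nλ by less than one part in 10^5, which is why the approximation X1 + X2 ≅ 2D is safe for typical optical setups.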
Crystal Diffraction of X Rays
Another device for observing the interference of light waves is the diffraction
grating, in which the wave fronts pass through a barrier that has many slits
(often thousands or tens of thousands) and then recombine. The operation of this
device is illustrated in Figure 3.4; interference maxima corresponding to different
wavelengths appear at different angles θ, according to
d sin θ = nλ
(3.17)
where d is the slit spacing and n is the order number of the maximum
(n = 1, 2, 3, . . .).
The advantage of the diffraction grating is its superior resolution—it enables us
to get very good separation of wavelengths that are close to one another, and thus it
is a very useful device for measuring wavelengths. Notice, however, that in order
to get reasonable values of the angle θ —for example, sin θ in the range of 0.3
to 0.5—we must have d of the order of a few times the wavelength. For visible
light this is not particularly difficult, but for radiations of very short wavelength,
mechanical construction of a grating is not possible. For example, for X rays with
a wavelength of the order of 0.1 nm, we would need to construct a grating in
which the slits were less than 1 nm apart, which is roughly the same as the spacing
between the atoms of most materials.
The solution to this problem has been known since the pioneering experiments
of Laue and Bragg:∗ use the atoms themselves as a diffraction grating! A beam
of X rays sees the regular spacings of the atoms in a crystal as a sort of
three-dimensional diffraction grating.
∗ Max von Laue (1879–1960, Germany) developed the method of X-ray diffraction for the study of
crystal structures, for which he received the 1914 Nobel Prize. Lawrence Bragg (1890–1971, England)
developed the Bragg law for X-ray diffraction while he was a student at Cambridge University. He
shared the 1915 Nobel Prize with his father, William Bragg, for their research on the use of X rays to
determine crystal structures.
FIGURE 3.4 The use of a diffraction grating to analyze light into its
constituent wavelengths.
Chapter 3 | The Particlelike Properties of Electromagnetic Radiation
FIGURE 3.5 A beam of X rays
reflected from a set of crystal planes
of spacing d. The beam reflected from
the second plane travels a distance 2d
sin θ greater than the beam reflected
from the first plane.
Consider the set of atoms shown in Figure 3.5, which represents a small portion
of a two-dimensional slice of the crystal. The X rays are reflected from individual
atoms in all directions, but in only one direction will the scattered “wavelets”
constructively interfere to produce a reflected beam, and in this case we can
regard the reflection as occurring from a plane drawn through the row of atoms.
(This situation is identical with the reflection of light from a mirror—only in
one direction will there be a beam of reflected light, and in that direction we can
regard the reflection as occurring on a plane with the angle of incidence equal to
the angle of reflection.)
Suppose the rows of atoms are a distance d apart in the crystal. Then a portion
of the beam is reflected from the front plane, and a portion is reflected from the
second plane, and so forth. The wave fronts of the beam reflected from the second
plane lag behind those reflected from the front plane, because the wave reflected
from the second plane must travel an additional distance of 2d sin θ, where θ
is the angle of incidence as measured from the face of the crystal. (Note that
this is different from the usual procedure in optics, in which angles are defined
with respect to the normal to the surface.) If this path difference is a whole
number of wavelengths, the reflected beams interfere constructively and give an
intensity maximum; thus the basic expression for the interference maxima in
X-ray diffraction from a crystal is
2d sin θ = nλ
n = 1, 2, 3, . . .
(3.18)
This result is known as Bragg’s law for X-ray diffraction. Notice the factor of 2
that appears in Eq. 3.18 but does not appear in the otherwise similar expression
of Eq. 3.17 for the ordinary diffraction grating.
Example 3.1
A single crystal of table salt (NaCl) is irradiated with
a beam of X rays of wavelength 0.250 nm, and the first
Bragg reflection is observed at an angle of 26.3°. What is
the atomic spacing of NaCl?
FIGURE 3.6 An incident beam of X rays can be reflected from many different crystal planes.
Solution
Solving Bragg’s law for the spacing d, we have
$$d = \frac{n\lambda}{2\sin\theta} = \frac{0.250\ \text{nm}}{2\sin 26.3^\circ} = 0.282\ \text{nm}$$
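The arithmetic of Example 3.1 can be reproduced with a short sketch of Bragg's law (Eq. 3.18). The function names are hypothetical; the second helper inverts the law to ask at what angle a given order would appear, and returns None when no angle satisfies it.

```python
import math

def bragg_spacing(wavelength_nm, theta_deg, order=1):
    """Plane spacing d from 2 d sin(theta) = n * lambda (Eq. 3.18)."""
    return order * wavelength_nm / (2 * math.sin(math.radians(theta_deg)))

def bragg_angle_deg(wavelength_nm, spacing_nm, order=1):
    """Angle (from the crystal face) of the given order, or None if it cannot occur."""
    s = order * wavelength_nm / (2 * spacing_nm)
    if s > 1:
        return None  # sin(theta) cannot exceed 1
    return math.degrees(math.asin(s))

d = bragg_spacing(0.250, 26.3)
print(f"d = {d:.3f} nm")

for n in (1, 2, 3):
    print(n, bragg_angle_deg(0.250, d, n))
```

With these numbers only the first two orders are possible; for n = 3 the required sin θ exceeds 1.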
Our drawing of Figure 3.5 was very arbitrary—we had no basis for choosing
which set of atoms to draw the reflecting planes through. Figure 3.6 shows a larger
section of the crystal. As you can see, there are many possible reflecting planes,
each with a different value of θ and d. (Of course, di and θi are related and cannot
be varied independently.) If we used a beam of X rays of a single wavelength,
it might be difficult to find the proper angle and set of planes to observe the
interference. However, if we use a beam of X rays of a continuous range of
wavelengths, for each di and θi interference will occur for a certain wavelength
λi , and so there will be a pattern of interference maxima appearing at different
angles of reflection as shown in Figure 3.6. The pattern of interference maxima
depends on the spacing and the type of arrangement of the atoms in the crystal.
Figure 3.7 shows sample patterns (called Laue patterns) that are obtained
from X-ray scattering from two different crystals. The bright dots correspond to
interference maxima for wavelengths from the range of incident wavelengths that
happen to satisfy Eq. 3.18. The three-dimensional pattern is more complicated
than our two-dimensional drawings, but the individual dots have the same
interpretation. Figure 3.8 shows the pattern obtained from a sample that consists
of many tiny crystals, rather than one single crystal. (It looks like Figure 3.7b
or 3.7c rotated rapidly about its center.) From such pictures it is also possible to
deduce crystal structures and lattice spacing.
FIGURE 3.7 (a) Apparatus for observing X-ray scattering by a crystal. An interference maximum (dot) appears on the
film whenever a set of crystal planes happens to satisfy the Bragg condition for a particular wavelength. (b) Laue pattern
of TiO₂ crystal. (c) Laue pattern of a polyethylene crystal. The differences between the two Laue patterns are due to the
differences in the geometric structure of the two crystals.
All of the examples we have discussed in this section depend on the wave
properties of electromagnetic radiation. However, as we now begin to discuss,
there are other experiments that cannot be explained if we regard electromagnetic
radiation as waves.
3.2 THE PHOTOELECTRIC EFFECT
We’ll now turn to our discussion of the first of three experiments that cannot be
explained by the wave theory of light. When a metal surface is illuminated with
light, electrons can be emitted from the surface. This phenomenon, known as the
photoelectric effect, was discovered by Heinrich Hertz in 1887 in the process
of his research into electromagnetic radiation. The emitted electrons are called
photoelectrons.
A sample experimental arrangement for observing the photoelectric effect
is illustrated in Figure 3.9. Light falling on a metal surface (the emitter) can
release electrons, which travel to the collector. The experiment must be done
in an evacuated tube, so that the electrons do not lose energy in collisions with
molecules of the air. Among the properties that can be measured are the rate of
electron emission and the maximum kinetic energy of the photoelectrons.∗
∗ The electrons can be emitted with many different kinetic energies, depending on how tightly bound
they are to the metal. Here we are concerned only with the maximum kinetic energy, which depends
on the energy needed to remove the least tightly bound electron from the surface of the metal.
FIGURE 3.8 (a) Apparatus for observing X-ray scattering from a powdered or polycrystalline sample.
Because the individual crystals have many different orientations, each scattered ray of Figure 3.7 becomes a
cone which forms a circle on the film. (b) Diffraction pattern (known as a Debye-Scherrer pattern) of
polycrystalline gold.
FIGURE 3.9 Apparatus for observing the photoelectric effect. The flow of electrons from the emitter to the
collector is measured by the ammeter A as a current i in the external circuit. A variable voltage source Vext
establishes a potential difference between the emitter and collector, which is measured by the voltmeter V.
The rate of electron emission can be measured as an electric current i by an
ammeter in the external circuit. The maximum kinetic energy of the electrons
can be measured by applying a negative potential to the collector that is just
enough to repel the most energetic electrons, which then do not have enough
energy to “climb” the potential energy hill. That is, if the potential difference
between the emitter and the collector is ΔV (a negative quantity), then electrons
traveling from the emitter to the collector would gain a potential energy of
U = qΔV = −eΔV (a positive quantity) and would lose the same amount of
kinetic energy. Electrons leaving the emitter with a kinetic energy smaller than
this U cannot reach the collector and are pushed back toward the emitter.
As the magnitude of the potential difference is increased, at some point even the
most energetic electrons do not have enough kinetic energy to reach the collector.
This potential, called the stopping potential Vs , is determined by increasing the
magnitude of the voltage until the ammeter current drops to zero. At this point
the maximum kinetic energy Kmax of the electrons as they leave the emitter is just
equal to the kinetic energy eVs lost by the electrons in “climbing” the hill:
Kmax = eVs
(3.19)
where e is the magnitude of the electric charge of the electron. Typical values of
Vs are a few volts.∗
In the classical picture, the surface of the metal is illuminated by an electromagnetic wave of intensity I. The surface absorbs energy from the wave until
the energy exceeds the binding energy of the electron to the metal, at which
point the electron is released. The minimum quantity of energy needed to remove
an electron is called the work function φ of the material. Table 3.1 lists some
values of the work function of different materials. You can see that the values are
typically a few electron-volts.
TABLE 3.1 Some Photoelectric Work Functions

Material    φ (eV)
Na          2.28
Al          4.08
Co          3.90
Cu          4.70
Zn          4.31
Ag          4.73
Pt          6.35
Pb          4.14

The Classical Theory of the Photoelectric Effect
What does the classical wave theory predict about the properties of the emitted
photoelectrons?
1. The maximum kinetic energy of the electrons should be proportional to the
intensity of the radiation. As the brightness of the light source is increased,
more energy is delivered to the surface (the electric field is greater) and
the electrons should be released with greater kinetic energies. Equivalently,
increasing the intensity of the light source increases the electric field E of the
wave, which also increases the force F = −eE on the electron and its kinetic
energy when it eventually leaves the surface.
2. The photoelectric effect should occur for light of any frequency or wavelength.
According to the wave theory, as long as the light is intense enough to release
electrons, the photoelectric effect should occur no matter what the frequency
or wavelength.
3. The first electrons should be emitted in a time interval of the order of seconds
after the radiation begins to strike the surface. In the wave theory, the energy
of the wave is uniformly distributed over the wave front. If the electron
absorbs energy directly from the wave, the amount of energy delivered to any
electron is determined by how much radiant energy is incident on the surface
area in which the electron is confined. Assuming this area is about the size of
an atom, a rough calculation leads to an estimate that the time lag between
turning on the light and observing the first photoelectrons should be of the
order of seconds (see Example 3.2).
∗ The potential difference V read by the voltmeter is not equal to the stopping potential when the
emitter and collector are made of different materials. In that case a correction must be applied to
account for the contact potential difference between the emitter and collector.
Example 3.2
A laser beam with an intensity of 120 W/m² (roughly that
of a small helium-neon laser) is incident on a surface of
sodium. It takes a minimum energy of 2.3 eV to release
an electron from sodium (the work function φ of sodium).
Assuming the electron to be confined to an area of radius
equal to that of a sodium atom (0.10 nm), how long will it
take for the surface to absorb enough energy to release an
electron?
Solution
The average power Pav delivered by the wave of intensity
I to an area A is IA. An atom on the surface displays a “target area” of
$A = \pi r^2 = \pi(0.10 \times 10^{-9}\ \text{m})^2 = 3.1 \times 10^{-20}\ \text{m}^2$. If the entire electromagnetic power is
delivered to the electron, energy is absorbed at the rate
ΔE/Δt = Pav. The time interval Δt necessary to absorb an
energy ΔE = φ can be expressed as
$$\Delta t = \frac{\Delta E}{P_{\text{av}}} = \frac{\phi}{IA} = \frac{(2.3\ \text{eV})(1.6 \times 10^{-19}\ \text{J/eV})}{(120\ \text{W/m}^2)(3.1 \times 10^{-20}\ \text{m}^2)} = 0.10\ \text{s}$$
In reality, electrons in metals are not always bound to individual atoms but instead can be free to roam throughout the
metal. However, no matter what reasonable estimate we
make for the area over which the energy is absorbed, the
characteristic time for photoelectron emission is estimated
to have a magnitude of the order of seconds, in a range
easily accessible to measurement.
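The classical estimate of Example 3.2 is simple enough to script, which makes it easy to see how the predicted delay scales with intensity and with the assumed absorbing area. This is a minimal sketch; the function name and the extra intensities in the loop are illustrative assumptions.

```python
import math

EV_TO_J = 1.602e-19  # joules per electron-volt

def classical_time_lag(intensity_w_m2, work_function_ev, radius_m):
    """Time for a disk of the given radius to absorb an energy equal to the work function.

    Classical wave picture only: power absorbed = intensity * area.
    """
    area = math.pi * radius_m ** 2
    return work_function_ev * EV_TO_J / (intensity_w_m2 * area)

# Values of Example 3.2: sodium, 120 W/m^2, atomic radius 0.10 nm
t = classical_time_lag(120.0, 2.3, 0.10e-9)
print(f"classical time lag = {t:.2f} s")

# The delay is inversely proportional to intensity (assumed illustrative values)
for intensity in (12.0, 120.0, 1200.0):
    print(intensity, classical_time_lag(intensity, 2.3, 0.10e-9))
```

The result is about 0.1 s, consistent with the example; the small difference in the third digit comes from using unrounded constants.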
The experimental characteristics of the photoelectric effect were well known
by the year 1902. How do the predictions of the classical theory compare with the
experimental results?
1. For a fixed value of the wavelength or frequency of the light source, the
maximum kinetic energy of the emitted photoelectrons (determined from the
stopping potential) is totally independent of the intensity of the light source.
Figure 3.10 shows a representation of the experimental results. Doubling the
intensity of the source leaves the stopping potential unchanged, indicating no
change in the maximum kinetic energy of the electrons. This experimental
result disagrees with the wave theory, which predicts that the maximum
kinetic energy should depend on the intensity of the light.
2. The photoelectric effect does not occur at all if the frequency of the light
source is below a certain value. This value, which is characteristic of the
kind of metal surface used in the experiment, is called the cutoff frequency fc .
Above fc , any light source, no matter how weak, will cause the emission of
photoelectrons; below fc , no light source, no matter how strong, will cause the
emission of photoelectrons. This experimental result also disagrees with the
predictions of the wave theory.
3. The first photoelectrons are emitted virtually instantaneously (within $10^{-9}$ s)
after the light source is turned on. The wave theory predicts a measurable
time delay, so this result also disagrees with the wave theory.
These three experimental results all suggest the complete failure of the wave
theory to account for the photoelectric effect.
FIGURE 3.10 The photoelectric current i as a function of the potential
difference ΔV for two different values of the intensity of the light (I1 and I2 = 2I1). When
the intensity I is doubled, the current
is doubled (twice as many photoelectrons are emitted), but the stopping
potential Vs remains the same.
The Quantum Theory of the Photoelectric Effect
A successful theory of the photoelectric effect was developed in 1905 by Albert
Einstein. Five years earlier, in 1900, the German physicist Max Planck had
developed a theory to explain the wavelength distribution of light emitted by
hot, glowing objects (called thermal radiation, which is discussed in the next
section of this chapter). Based partly on Planck’s ideas, Einstein proposed that the
energy of electromagnetic radiation is not continuously distributed over the wave
front, but instead is concentrated in localized bundles or quanta (also known as
photons). The energy of a photon associated with an electromagnetic wave of
frequency f is
E = hf
(3.20)
where h is a proportionality constant known as Planck’s constant. The photon
energy can also be related to the wavelength of the electromagnetic wave by
substituting f = c/λ, which gives
$$E = \frac{hc}{\lambda} \tag{3.21}$$
We often speak about photons as if they were particles, and as concentrated
bundles of energy they have particlelike properties. Like the electromagnetic
waves, photons travel at the speed of light, and so they must obey the relativistic
relationship p = E/c. Combining this with Eq. 3.21, we obtain
$$p = \frac{h}{\lambda} \tag{3.22}$$
Photons carry linear momentum as well as energy, and thus they share this
characteristic property of particles.
Because a photon travels at the speed of light, it must have zero mass. Otherwise
its energy and momentum would be infinite. Similarly, a photon’s rest energy
E0 = mc² must also be zero.
In Einstein’s interpretation, a photoelectron is released as a result of an
encounter with a single photon. The entire energy of the photon is delivered
instantaneously to a single photoelectron. If the photon energy hf is greater than
the work function φ of the material, the photoelectron will be released. If the
photon energy is smaller than the work function, the photoelectric effect will not
occur. This explanation thus accounts for two of the failures of the wave theory:
the existence of the cutoff frequency and the lack of any measurable time delay.
If the photon energy hf exceeds the work function, the excess energy appears
as the kinetic energy of the electron:
Kmax = hf − φ
(3.23)
The intensity of the light source does not appear in this expression! For a fixed
frequency, doubling the intensity of the light means that twice as many photons
strike the surface and twice as many photoelectrons are released, but they all have
precisely the same maximum kinetic energy.
You can think of Eq. 3.23 as giving a relationship between energy quantities in
analogy to making a purchase at a store. The quantity hf represents the payment
you hand to the cashier, the quantity φ represents the cost of the object, and Kmax
represents the change you receive. In the photoelectric effect, hf is the amount
of energy that is available to “purchase” an electron from the surface, the work
function φ is the “cost” of removing the least tightly bound electron from the
surface, and the difference between the available energy and the removal cost is
the leftover energy that appears as the kinetic energy of the emitted electron. (The
more tightly bound electrons have a greater “cost” and so emerge with smaller
kinetic energies.)
A photon that supplies an energy equal to φ, exactly the minimum amount
needed to remove an electron, corresponds to light of frequency equal to the cutoff
frequency fc . At this frequency, there is no excess energy for kinetic energy, so
Eq. 3.23 becomes hfc = φ, or
$$f_c = \frac{\phi}{h} \tag{3.24}$$
The corresponding cutoff wavelength λc = c/fc is
$$\lambda_c = \frac{hc}{\phi} \tag{3.25}$$
The cutoff wavelength represents the largest wavelength for which the photoelectric effect can be observed for a surface with the work function φ.
The photon theory appears to explain all of the observed features of the
photoelectric effect. The most detailed test of the theory was done by Robert
Millikan in 1915. Millikan measured the maximum kinetic energy (stopping
potential) for different frequencies of the light and obtained a plot of Eq. 3.23. A
sample of his results is shown in Figure 3.11. From the slope of the line, Millikan
obtained a value for Planck’s constant of
$$h = 6.57 \times 10^{-34}\ \text{J} \cdot \text{s}$$
In part for his detailed experiments on the photoelectric effect, Millikan was
awarded the 1923 Nobel Prize in physics. Einstein was awarded the 1921 Nobel
Prize for his photon theory as applied to the photoelectric effect.
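Combining Eqs. 3.19 and 3.23 gives eVs = hf − φ, so a plot of Vs against f is a straight line of slope h/e. The sketch below converts the slope quoted in Figure 3.11 into a value of h and evaluates the line for sodium using the work function from Table 3.1. The function names, the rounded value of e, and the sample frequency are assumptions made for illustration.

```python
E_CHARGE = 1.602e-19   # C, magnitude of the electron charge
SLOPE_V_S = 4.1e-15    # V*s, slope quoted in Figure 3.11

def planck_from_slope(slope_v_s, e_charge=E_CHARGE):
    """Planck's constant in J*s from the slope h/e of the Vs-versus-f line."""
    return slope_v_s * e_charge

def stopping_potential(frequency_hz, work_function_ev, slope_v_s=SLOPE_V_S):
    """Vs = (h/e) f - phi/e in volts; 0.0 below cutoff, where no electrons are emitted."""
    vs = slope_v_s * frequency_hz - work_function_ev
    return max(vs, 0.0)

h = planck_from_slope(SLOPE_V_S)
print(f"h = {h:.3e} J*s")

# Sodium (phi = 2.28 eV from Table 3.1) at an assumed frequency of 1.0e15 Hz
print(f"Vs = {stopping_potential(1.0e15, 2.28):.2f} V")
```

Because the work function is expressed in eV, φ/e is numerically the same number in volts, which is why no unit conversion appears in `stopping_potential`.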
As we discuss in the next section, the wavelength distribution of thermal
radiation also yields a value for Planck’s constant, which is in good agreement
with Millikan’s value derived from the photoelectric effect. Planck’s constant is
one of the fundamental constants of nature; just as c is the characteristic constant
of relativity, h is the characteristic constant of quantum mechanics. The value of
Planck’s constant has been measured to great precision in a variety of experiments.
The presently accepted value is
$$h = 6.6260696 \times 10^{-34}\ \text{J} \cdot \text{s}$$
This is an experimentally determined value, with a relative uncertainty of about
$5 \times 10^{-8}$ (±3 units in the last digit).
Robert A. Millikan (1868–1953,
United States). Perhaps the best experimentalist of his era, his work included
the precise determination of Planck’s
constant using the photoelectric effect
(for which he received the 1923 Nobel
Prize) and the measurement of the
charge of the electron (using his
famous “oil-drop” apparatus).
FIGURE 3.11 Millikan’s results for the photoelectric effect in sodium, plotted as
stopping potential Vs (volts) against radiation frequency ($10^{13}$ Hz); the fitted
line has slope $4.1 \times 10^{-15}$ V·s. The slope of the line is h/e; the experimental
determination of the slope gives a way of determining Planck’s constant.
The intercept should give the cutoff frequency; however, in Millikan’s
time the contact potentials of the electrodes were not known precisely and
so the vertical scale is displaced by a few tenths of a volt. The slope is not
affected by this correction.
Example 3.3
(a) What are the energy and momentum of a photon of red
light of wavelength 650 nm? (b) What is the wavelength of
a photon of energy 2.40 eV?
Solution
(a) Using Eq. 3.21 we obtain
$$E = \frac{hc}{\lambda} = \frac{(6.63 \times 10^{-34}\ \text{J} \cdot \text{s})(3.00 \times 10^{8}\ \text{m/s})}{650 \times 10^{-9}\ \text{m}} = 3.06 \times 10^{-19}\ \text{J}$$
Converting to electron-volts, we have
$$E = \frac{3.06 \times 10^{-19}\ \text{J}}{1.60 \times 10^{-19}\ \text{J/eV}} = 1.91\ \text{eV}$$
This type of problem can be simplified if we express the
combination hc in units of eV · nm:
$$E = \frac{hc}{\lambda} = \frac{1240\ \text{eV} \cdot \text{nm}}{650\ \text{nm}} = 1.91\ \text{eV}$$
The momentum is found in a similar way, using
Eq. 3.22
$$p = \frac{h}{\lambda} = \frac{1}{c}\,\frac{hc}{\lambda} = \frac{1}{c}\,\frac{1240\ \text{eV} \cdot \text{nm}}{650\ \text{nm}} = 1.91\ \text{eV}/c$$
The momentum could also be found directly from the
energy:
$$p = \frac{E}{c} = \frac{1.91\ \text{eV}}{c} = 1.91\ \text{eV}/c$$
(It may be helpful to review the discussion in Example 2.11
about these units of momentum.)
(b) Solving Eq. 3.21 for λ, we find
$$\lambda = \frac{hc}{E} = \frac{1240\ \text{eV} \cdot \text{nm}}{2.40\ \text{eV}} = 517\ \text{nm}$$
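The shortcut hc = 1240 eV · nm used in Example 3.3 lends itself to a pair of one-line helpers. This is a sketch with hypothetical function names; it uses the same rounded value of hc as the example.

```python
HC_EV_NM = 1240.0  # hc in eV*nm, rounded as in Example 3.3

def photon_energy_ev(wavelength_nm):
    """Photon energy E = hc / lambda (Eq. 3.21), in eV."""
    return HC_EV_NM / wavelength_nm

def photon_momentum_ev_per_c(wavelength_nm):
    """Photon momentum p = h / lambda = E / c (Eq. 3.22), in eV/c.

    In units of eV/c the momentum is numerically equal to the energy in eV.
    """
    return photon_energy_ev(wavelength_nm)

def photon_wavelength_nm(energy_ev):
    """Wavelength lambda = hc / E, in nm."""
    return HC_EV_NM / energy_ev

print(f"E = {photon_energy_ev(650):.2f} eV")
print(f"p = {photon_momentum_ev_per_c(650):.2f} eV/c")
print(f"lambda = {photon_wavelength_nm(2.40):.0f} nm")
```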
Example 3.4
The work function for tungsten metal is 4.52 eV. (a) What
is the cutoff wavelength λc for tungsten? (b) What is the
maximum kinetic energy of the electrons when radiation
of wavelength 198 nm is used? (c) What is the stopping
potential in this case?
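Solution
(a) Equation 3.25 gives
$$\lambda_c = \frac{hc}{\phi} = \frac{1240\ \text{eV} \cdot \text{nm}}{4.52\ \text{eV}} = 274\ \text{nm}$$
in the ultraviolet region.
(b) At the shorter wavelength,
$$K_{\max} = hf - \phi = \frac{hc}{\lambda} - \phi = \frac{1240\ \text{eV} \cdot \text{nm}}{198\ \text{nm}} - 4.52\ \text{eV} = 1.74\ \text{eV}$$
(c) The stopping potential is the voltage corresponding
to Kmax :
$$V_s = \frac{K_{\max}}{e} = \frac{1.74\ \text{eV}}{e} = 1.74\ \text{V}$$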
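The three parts of Example 3.4 follow directly from Eqs. 3.25, 3.23, and 3.19, and can be sketched as below. The function names are hypothetical, and returning None when the photon energy does not exceed the work function is a design choice made here to signal that no photoelectrons are emitted.

```python
HC_EV_NM = 1240.0  # hc in eV*nm

def cutoff_wavelength_nm(work_function_ev):
    """Largest wavelength that can eject electrons (Eq. 3.25)."""
    return HC_EV_NM / work_function_ev

def max_kinetic_energy_ev(wavelength_nm, work_function_ev):
    """Kmax = hc/lambda - phi (Eq. 3.23), or None if no emission occurs."""
    k_max = HC_EV_NM / wavelength_nm - work_function_ev
    return k_max if k_max > 0 else None

phi_tungsten = 4.52  # eV, from Example 3.4

print(f"cutoff wavelength = {cutoff_wavelength_nm(phi_tungsten):.0f} nm")
k = max_kinetic_energy_ev(198, phi_tungsten)
print(f"Kmax = {k:.2f} eV, so Vs = {k:.2f} V")   # Vs = Kmax / e (Eq. 3.19)
print(max_kinetic_energy_ev(300, phi_tungsten))  # beyond cutoff: no photoelectrons
```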
3.3 THERMAL RADIATION
The second type of experiment we discuss that cannot be explained by the classical
wave theory is thermal radiation, which is the electromagnetic radiation emitted
by all objects because of their temperature. At room temperature the thermal
radiation is mostly in the infrared region of the spectrum, where our eyes are not
sensitive. As we heat objects to higher temperatures, they may emit visible light.
A typical experimental arrangement is shown in Figure 3.12. An object is
maintained at a temperature T1 . The radiation emitted by the object is detected
by an apparatus that is sensitive to the wavelength of the radiation. For example,
a dispersive medium such as a prism can be used so that different wavelengths
appear at different angles θ. By moving the radiation detector to different angles
θ we can measure the intensity∗ of the radiation at a specific wavelength. The
detector is not a geometrical point (hardly an efficient detector!) but instead
subtends a small range of angles Δθ, so what we really measure is the amount of
radiation in some range Δθ at θ, or, equivalently, in some range Δλ at λ.
∗ As always, intensity means energy per unit time per unit area (or power per unit area), as in Eq. 3.10.
Previously, “unit area” referred to the wave front, such as would be measured if we recorded the waves
with an antenna of a certain area. Here, “unit area” indicates the electromagnetic radiation emitted
from each unit area of the surface of the object whose thermal emissions are being observed.
FIGURE 3.12 Measurement of the spectrum of thermal radiation. A
device such as a prism is used to separate the wavelengths emitted by the
object.
Many experiments were done in the late 19th century to study the wavelength
spectrum of thermal radiation. These experiments, as we shall see, gave results that
totally disagreed with the predictions of the classical theories of thermodynamics
and electromagnetism; instead, the successful analysis of the experiments provided
the first evidence of the quantization of energy, which would eventually be seen
as the basis for the new quantum theory.
Let’s first review the experimental results. The goal of these experiments was
to measure the intensity of the radiation emitted by the object as a function of
wavelength. Figure 3.13 shows a typical set of experimental results when the
object is at a temperature T1 = 1000 K. If we now change the temperature of the
object to a different value T2 , we obtain a different curve, as shown in Figure 3.13
for T2 = 1250 K. If we repeat the measurement for many different temperatures,
we obtain systematic results for the radiation intensity that reveal two important
characteristics:
1. The total intensity radiated over all wavelengths (that is, the area under each
curve) increases as the temperature is increased. This is not a surprising result:
we commonly observe that a glowing object glows brighter and thus radiates
more energy as we increase its temperature. From careful measurement, we
find that the total intensity increases as the fourth power of the absolute or
kelvin temperature:
$$I = \sigma T^4 \tag{3.26}$$
where we have introduced the proportionality constant σ . Equation 3.26 is
called Stefan’s law and the constant σ is called the Stefan-Boltzmann constant.
Its value can be determined from experimental results such as those illustrated
in Figure 3.13:
$$\sigma = 5.67037 \times 10^{-8}\ \text{W/m}^2 \cdot \text{K}^4$$
2. The wavelength λmax at which the emitted intensity reaches its maximum
value decreases as the temperature is increased, in inverse proportion to the
temperature: λmax ∝ 1/T. From results such as those of Figure 3.13, we can
determine the proportionality constant, so that
$$\lambda_{\max} T = 2.8978 \times 10^{-3}\ \text{m} \cdot \text{K} \tag{3.27}$$
This result is known as Wien’s displacement law; the term “displacement”
refers to the way the peak is moved or displaced as the temperature is
varied. Wien’s law is qualitatively consistent with our common observation
that heated objects first begin to glow with a red color, and at higher
temperatures the color becomes more yellow. As the temperature is increased, the
wavelength at which most of the radiation is emitted moves from the
longer-wavelength (red) part of the visible region toward medium wavelengths. The
term “white hot” refers to an object that is hot enough to produce the mixture
of all wavelengths in the visible region to make white light.
FIGURE 3.13 A possible result of the
measurement of the radiation intensity
over many different wavelengths (intensity I(λ) against wavelength in μm, with curves for 1000 K and 1250 K). Each
different temperature of the emitting
body gives a different peak λmax .
FIGURE 3.14 A cavity filled with
electromagnetic radiation in thermal
equilibrium with its walls at temperature T. Some radiation escapes
through the hole, which represents an
ideal blackbody.
Example 3.5
(a) At what wavelength does a room-temperature (T =
20°C) object emit the maximum thermal radiation?
(b) To what temperature must we heat it until its peak
thermal radiation is in the red region of the spectrum
(λ = 650 nm)? (c) How many times as much thermal
radiation does it emit at the higher temperature?
Solution
(a) Using the absolute temperature, T1 = 273 + 20 =
293 K, Wien’s displacement law gives
$$\lambda_{\max} = \frac{2.8978 \times 10^{-3}\ \text{m} \cdot \text{K}}{T_1} = \frac{2.8978 \times 10^{-3}\ \text{m} \cdot \text{K}}{293\ \text{K}} = 9.89\ \mu\text{m}$$
This is in the infrared region of the electromagnetic
spectrum.
(b) For λmax = 650 nm, we again use Wien’s displacement
law to find the new temperature T2 :
$$T_2 = \frac{2.8978 \times 10^{-3}\ \text{m} \cdot \text{K}}{\lambda_{\max}} = \frac{2.8978 \times 10^{-3}\ \text{m} \cdot \text{K}}{650 \times 10^{-9}\ \text{m}} = 4460\ \text{K}$$
(c) The total intensity of radiation is proportional to T⁴, so
the ratio of the total thermal emissions will be
$$\frac{I_2}{I_1} = \frac{\sigma T_2^4}{\sigma T_1^4} = \frac{(4460\ \text{K})^4}{(293\ \text{K})^4} = 5.37 \times 10^{4}$$
Be sure to notice the use of absolute (kelvin) temperatures
in this example.
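Example 3.5 combines Wien's displacement law (Eq. 3.27) with Stefan's law (Eq. 3.26), and the steps can be sketched as follows. The function names are hypothetical. Note that the example rounds T2 to 4460 K before taking the fourth power; carrying the unrounded temperature through changes the third significant figure of the ratio.

```python
WIEN_CONSTANT = 2.8978e-3  # m*K, from Eq. 3.27

def wien_peak_wavelength_m(temperature_k):
    """Wavelength of maximum emission, lambda_max = b / T."""
    return WIEN_CONSTANT / temperature_k

def wien_temperature_k(peak_wavelength_m):
    """Temperature whose emission peaks at the given wavelength, T = b / lambda_max."""
    return WIEN_CONSTANT / peak_wavelength_m

def stefan_intensity_ratio(t2_k, t1_k):
    """Ratio I2/I1 of total emitted intensities; sigma cancels, leaving (T2/T1)^4."""
    return (t2_k / t1_k) ** 4

t1 = 273 + 20                        # K, room temperature
lam1 = wien_peak_wavelength_m(t1)    # part (a)
t2 = wien_temperature_k(650e-9)      # part (b)

print(f"(a) lambda_max = {lam1 * 1e6:.2f} micrometers")
print(f"(b) T2 = {t2:.0f} K")
print(f"(c) ratio with T2 rounded to 4460 K: {stefan_intensity_ratio(4460, t1):.3g}")
print(f"(c) ratio with unrounded T2:         {stefan_intensity_ratio(t2, t1):.3g}")
```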
The theoretical analysis of the emission of thermal radiation from an arbitrary
object is extremely complicated. It depends on details of the surface properties
of the object, and it also depends on how much radiation the object reflects from
its surroundings. To simplify our analysis, we consider a special type of object
called a blackbody, which absorbs all radiation incident on it and reflects none of
the incident radiation.
To simplify further, we consider a special type of blackbody: a hole in a
hollow metal box whose walls are in thermal equilibrium at temperature T. The
box is filled with electromagnetic radiation that is emitted and reflected by the
walls. A small hole in one wall of the box allows some of the radiation to escape
(Figure 3.14). It is the hole, and not the box itself, that is the blackbody. Radiation
from outside that is incident on the hole gets lost inside the box and has a
negligible chance of reemerging from the hole; thus no reflections occur from the
blackbody (the hole). The radiation that emerges from the hole is just a sample
of the radiation inside the box, so understanding the nature of the radiation inside
the box allows us to understand the radiation that leaves through the hole.
Let’s consider the radiation inside the box. It has an energy density (energy
per unit volume) per unit wavelength interval u(λ). That is, if we could look into
the interior of the box and measure the energy density of the electromagnetic
radiation with wavelengths between λ and λ + dλ in a small volume element,
the result would be u(λ)dλ. For the radiation in this wavelength interval, what
is the corresponding intensity (power per unit area) emerging from the hole? At
any particular instant, half of the radiation in the box will be moving away from
the hole. The other half of the radiation is moving toward the hole at velocity
of magnitude c but directed over a range of angles. Averaging over this range
of angles to evaluate the energy flowing perpendicular to the surface of the hole
introduces another factor of 1/2, so the contribution of the radiation in this small
wavelength interval to the intensity passing through the hole is
$$I(\lambda) = \frac{c}{4}\,u(\lambda) \tag{3.28}$$
The quantity I(λ)dλ is the radiant intensity in the small interval dλ at the
wavelength λ. This is the quantity whose measurement gives the results displayed
in Figure 3.13. Each data point represents a measurement of the intensity in a small
wavelength interval. The goal of the theoretical analysis is to find a mathematical
function I(λ) that gives a smooth fit through the data points of Figure 3.13.
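The two factors of 1/2 that lead to Eq. 3.28 can be checked numerically. The sketch below is only illustrative: it assumes the radiation in the box is isotropic, and the function name and step count are our own choices. It averages cos θ over the hemisphere of directions heading toward the hole, weighting each direction by its solid angle.

```python
import math

def mean_cos_over_hemisphere(steps=100_000):
    """Average of cos(theta) over directions with 0 <= theta <= pi/2.

    Each direction is weighted by the solid-angle element sin(theta) dtheta,
    which is the appropriate weight for isotropic radiation (an assumption).
    """
    d_theta = (math.pi / 2) / steps
    weighted_sum = 0.0
    weight_total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta          # midpoint of the i-th slice
        weight = math.sin(theta) * d_theta
        weighted_sum += math.cos(theta) * weight
        weight_total += weight
    return weighted_sum / weight_total

fraction_toward_hole = 0.5                    # half the radiation moves toward the hole
angular_factor = mean_cos_over_hemisphere()   # close to 1/2
print(fraction_toward_hole * angular_factor)  # close to 1/4, the factor in I = (c/4) u
```

The product of the two factors is the 1/4 that multiplies c u(λ) in Eq. 3.28.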
If we wish to find the total intensity emitted in the region between wavelengths
λ1 and λ2 , we divide the region into narrow intervals dλ and add the intensities in
each interval, which is equivalent to the integral between those limits:
$$I(\lambda_1{:}\lambda_2) = \int_{\lambda_1}^{\lambda_2} I(\lambda)\, d\lambda \tag{3.29}$$
This is similar to Eq. 1.27 for determining the number of molecules with energies
between two limits. The total emitted intensity can be found by integrating over
all wavelengths:
$$I = \int_0^{\infty} I(\lambda)\, d\lambda \tag{3.30}$$
This total intensity should work out to be proportional to the 4th power of the
temperature, as required by Stefan’s law (Eq. 3.26).
Classical Theory of Thermal Radiation
Before discussing the quantum theory of thermal radiation, let’s see what the
classical theories of electromagnetism and thermodynamics can tell us about the
dependence of I on λ. The complete derivation is not given here, only a brief
outline of the theory.∗ The derivation involves first computing the amount of
radiation (number of waves) at each wavelength and then finding the contribution
of each wave to the total energy in the box.
∗ For a more complete derivation, see R. Eisberg and R. Resnick, Quantum Theory of Atoms, Molecules,
Solids, Nuclei, and Particles, 2nd edition (Wiley, 1985), pp. 9–13.
1. The box is filled with electromagnetic standing waves. If the walls of the box
are metal, radiation is reflected back and forth with a node of the electric field
at each wall (the electric field must vanish inside a conductor). This is the
same condition that applies to other standing waves, like those on a stretched
string or a column of air in an organ pipe.
2. The number of standing waves with wavelengths between λ and λ + dλ is
$$N(\lambda)\, d\lambda = \frac{8\pi V}{\lambda^4}\, d\lambda \tag{3.31}$$
where V is the volume of the box. For one-dimensional standing waves, as
on a stretched string of length L, the allowed wavelengths are λ = 2L/n (n =
1, 2, 3, . . .). The number of possible standing waves with wavelengths between
λ₁ and λ₂ is n₂ − n₁ = 2L(1/λ₂ − 1/λ₁). In the small interval from λ to
λ + dλ, the number of standing waves is N(λ)dλ = |dn/dλ|dλ = (2L/λ²)dλ.
Equation 3.31 can be obtained by extending this approach to three dimensions.
3. Each individual wave contributes an average energy of kT to the radiation
in the box. This result follows from an analysis similar to that of Section 1.3
for the statistical mechanics of gas molecules. In this case we are interested
in the statistics of the oscillating atoms in the walls of the cavity, which are
responsible for setting up the standing electromagnetic waves in the cavity.
For a one-dimensional oscillator, the energies are distributed according to the
Maxwell-Boltzmann distribution:∗
$$N(E) = \frac{N}{kT}\, e^{-E/kT} \tag{3.32}$$
Recall from Section 1.3 that N(E) is defined so that the number of oscillators
with energies between E and E + dE is dN = N(E)dE, and thus the total
number of oscillators at all energies is $\int dN = \int_0^{\infty} N(E)\, dE$, which (as you
should show) works out to N. The average energy per oscillator is then found
in the same way as the average energy of a gas molecule (Eq. 1.25):
$$E_{\rm av} = \frac{1}{N}\int_0^{\infty} E\, N(E)\, dE = \frac{1}{kT}\int_0^{\infty} E\, e^{-E/kT}\, dE \tag{3.33}$$
which does indeed work out to Eav = kT.
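Steps 2 and 3 of this outline lend themselves to a quick numerical check. The following is a minimal sketch, not part of the derivation itself: the string length, wavelength interval, temperature, and function names are all illustrative assumptions. It counts the one-dimensional standing waves directly and compares the count with (2L/λ²)dλ, then evaluates the integral of Eq. 3.33 with a simple midpoint rule.

```python
import math

K_BOLTZMANN_EV = 8.6174e-5   # Boltzmann constant in eV/K

def count_modes(length, lam_low, lam_high):
    """Count standing waves on a string whose wavelength 2L/n lies in [lam_low, lam_high]."""
    n_max = math.floor(2 * length / lam_low)   # shortest wavelength -> largest n
    n_min = math.ceil(2 * length / lam_high)   # longest wavelength -> smallest n
    return max(0, n_max - n_min + 1)

def classical_average_energy(temperature, steps=200_000, cutoff=50.0):
    """Evaluate E_av = (1/kT) * integral of E exp(-E/kT) dE (Eq. 3.33), in eV.

    The infinite upper limit is truncated at cutoff * kT, where the
    integrand is negligible.
    """
    kT = K_BOLTZMANN_EV * temperature
    dE = cutoff * kT / steps
    total = 0.0
    for i in range(steps):
        E = (i + 0.5) * dE                     # midpoint rule
        total += E * math.exp(-E / kT) * dE
    return total / kT

# Step 2 in one dimension: direct count versus N(lambda) d_lambda = (2L / lambda^2) d_lambda
string_length = 1.0               # meters (illustrative)
lam, d_lam = 1.23e-3, 3.0e-5      # meters (illustrative)
exact = count_modes(string_length, lam, lam + d_lam)
estimate = 2 * string_length / lam**2 * d_lam
print(exact, estimate)

# Step 3: the numerical average energy should nearly equal kT
T = 1278.0                        # kelvin (illustrative)
print(classical_average_energy(T), K_BOLTZMANN_EV * T)
```

The direct count and the differential estimate agree to within one wave, which is the best that can be expected when a whole number is compared with a continuous approximation.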
Putting all these ingredients together, we can find the energy density of
radiation in the wavelength interval dλ inside the cavity: energy density = (number
of standing waves per unit volume) × (average energy per standing wave) or
$$u(\lambda)\, d\lambda = \frac{N(\lambda)\, d\lambda}{V}\, kT = \frac{8\pi}{\lambda^4}\, kT\, d\lambda \tag{3.34}$$
The corresponding intensity per unit wavelength interval dλ is
$$I(\lambda) = \frac{c}{4}\, u(\lambda) = \frac{c}{4}\,\frac{8\pi}{\lambda^4}\, kT = \frac{2\pi c}{\lambda^4}\, kT \tag{3.35}$$
This result is known as the Rayleigh-Jeans formula; based firmly on the classical
theories of electromagnetism and thermodynamics, it represents our best attempt
to apply classical physics to understanding the problem of blackbody radiation.
In Figure 3.15 the intensity calculated from the Rayleigh-Jeans formula is
compared with typical experimental results. The intensity calculated with
Eq. 3.35 approaches the data at long wavelengths, but at short wavelengths, the
classical theory (which predicts u → ∞ as λ → 0) fails miserably. The failure
of the Rayleigh-Jeans formula at short wavelengths is known as the ultraviolet
catastrophe and represents a serious problem for classical physics, because the
theories of thermodynamics and electromagnetism on which the Rayleigh-Jeans
formula is based have been carefully tested in many other circumstances and
found to give extremely good agreement with experiment. It is apparent in the
case of blackbody radiation that the classical theories do not work, and that a
new kind of physical theory is needed.
∗ The exponential part of this expression is the same as that of Eq. 1.22 for gas molecules, but the rest of
the equation is different, because the statistical behavior of one-dimensional oscillators is different from
that of gas molecules moving in three dimensions. We’ll consider these calculations in greater detail in
Chapter 10.
FIGURE 3.15 The failure of the classical Rayleigh-Jeans formula to fit
the observed intensity. At long wavelengths the theory approaches the data,
but at short wavelengths the classical
formula fails miserably. (Axes: intensity I(λ) versus wavelength, 1 to 10 μm.)
Quantum Theory of Thermal Radiation
The new physics that gave the correct interpretation of thermal radiation was
proposed by the German physicist Max Planck in 1900. The ultraviolet catastrophe
occurs because the Rayleigh-Jeans formula predicts too much intensity at short
wavelengths (or equivalently at high frequencies). What is needed is a way to
make u → 0 as λ → 0, or as f → ∞. Again considering the electromagnetic
standing waves to result from the oscillations of atoms in the walls of the cavity,
Planck tried to find a way to reduce the number of high-frequency standing
waves by reducing the number of high-frequency oscillators. He did this by a
bold assumption that formed the cornerstone of a new physical theory, quantum
physics. Associated with this theory is a new version of mechanics, known
as wave mechanics or quantum mechanics. We discuss the methods of wave
mechanics in Chapter 5; for now we show how Planck’s theory provided the
correct interpretation of the emission spectrum of thermal radiation.
Planck suggested that an oscillating atom can absorb or emit energy only in
discrete bundles. This bold suggestion was necessary to keep the average energy
of a low-frequency (long-wavelength) oscillator equal to kT (in agreement with
the Rayleigh-Jeans law at long wavelength), but it also made the average energy
of a high-frequency (short-wavelength) oscillator approach zero. Let’s see how
Planck managed this remarkable feat.
In Planck’s theory, each oscillator can emit or absorb energy only in quantities
that are integer multiples of a certain basic quantity of energy ε,
$$E_n = n\varepsilon \qquad n = 1, 2, 3, \ldots \tag{3.36}$$
where n is the number of quanta. Furthermore, the energy of each of the quanta is
determined by the frequency
ε = hf
(3.37)
where h is the constant of proportionality, now known as Planck’s constant. From
the mathematical standpoint, the difference between Planck’s calculation and
the classical calculation using Maxwell-Boltzmann statistics is that the energy
of an oscillator at a certain wavelength or frequency is no longer a continuous
variable—it is a discrete variable that takes only the values given by Eq. 3.36.
The integrals in the classical calculation are then replaced by sums, and the
number of oscillators with energy En is then
$$N_n = N\left(1 - e^{-\varepsilon/kT}\right) e^{-n\varepsilon/kT} \tag{3.38}$$
Max Planck (1858–1947, Germany).
His work on the spectral distribution
of radiation, which led to the quantum theory, was honored with the
1918 Nobel Prize. In his later years,
he wrote extensively on religious and
philosophical topics.
(Compare this result with Eq. 3.32 for the continuous case.) Here Nn represents
the number of oscillators with energy En , while N is the total number. You should
be able to show that $\sum_{n=0}^{\infty} N_n = N$, again giving the total number of oscillators when
summed over all possible energies. Planck’s calculation then gives the average
energy:
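$$E_{\rm av} = \frac{1}{N}\sum_{n=0}^{\infty} N_n E_n = \left(1 - e^{-\varepsilon/kT}\right)\sum_{n=0}^{\infty} (n\varepsilon)\, e^{-n\varepsilon/kT} \tag{3.39}$$
which gives (see Problem 14)
$$E_{\rm av} = \frac{\varepsilon}{e^{\varepsilon/kT} - 1} = \frac{hf}{e^{hf/kT} - 1} = \frac{hc/\lambda}{e^{hc/\lambda kT} - 1} \tag{3.40}$$
FIGURE 3.16 Planck’s function fits the observed data perfectly. (Axes: intensity I(λ) versus wavelength, 1 to 10 μm.)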
Note from this equation that Eav ≅ kT at small f (large λ) but that Eav → 0 at
large f (small λ). Thus the small-wavelength oscillators carry a vanishingly small
energy, and the ultraviolet catastrophe is solved!
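Both the limiting behavior of Eq. 3.40 and its equivalence to the sum in Eq. 3.39 can be confirmed numerically. The sketch below works with the dimensionless ratio x = ε/kT = hf/kT; the function names and the truncation of the infinite sum at 2000 terms are our own illustrative choices.

```python
import math

def planck_average_energy(x):
    """E_av / kT from Eq. 3.40, where x = epsilon/kT = hf/kT."""
    return x / math.expm1(x)          # expm1(x) = e^x - 1, accurate for small x

def planck_average_energy_sum(x, terms=2000):
    """E_av / kT from the sum in Eq. 3.39, truncated after `terms` terms."""
    return (1 - math.exp(-x)) * sum(n * x * math.exp(-n * x) for n in range(terms))

# Small x is the low-frequency (long-wavelength) limit; large x is the opposite limit.
for x in (0.01, 1.0, 10.0):
    print(x, planck_average_energy(x), planck_average_energy_sum(x))
```

For small x the ratio Eav/kT is close to 1, which is the classical result, while for large x it falls rapidly toward zero.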
Based on Planck’s result, the intensity of the radiation then becomes (using
Eqs. 3.28 and 3.31):
$$I(\lambda) = \frac{c}{4}\,\frac{8\pi}{\lambda^4}\,\frac{hc/\lambda}{e^{hc/\lambda kT} - 1} = \frac{2\pi hc^2}{\lambda^5}\,\frac{1}{e^{hc/\lambda kT} - 1} \tag{3.41}$$
(An alternative approach to deriving this result is given in Section 10.6.) The
perfect agreement between experiment and Planck’s formula is illustrated in
Figure 3.16.
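To see how the two formulas part company, Eqs. 3.35 and 3.41 can be evaluated side by side. This is a minimal sketch: the temperature and the sample wavelengths are arbitrary illustrative values, and the constants are rounded to four significant figures.

```python
import math

H = 6.626e-34      # Planck's constant, J*s
C = 2.998e8        # speed of light, m/s
K_B = 1.381e-23    # Boltzmann constant, J/K

def rayleigh_jeans(lam, T):
    """Classical intensity per unit wavelength, Eq. 3.35 (W/m^2 per meter)."""
    return 2 * math.pi * C * K_B * T / lam**4

def planck(lam, T):
    """Planck intensity per unit wavelength, Eq. 3.41 (W/m^2 per meter)."""
    return 2 * math.pi * H * C**2 / lam**5 / math.expm1(H * C / (lam * K_B * T))

T = 1278.0   # kelvin (illustrative)
for lam_um in (100.0, 10.0, 1.0, 0.1):
    lam = lam_um * 1e-6
    # Ratio of the quantum result to the classical one at this wavelength
    print(lam_um, planck(lam, T) / rayleigh_jeans(lam, T))
```

The ratio is near 1 at the longest wavelength and collapses toward zero at the shortest, which is the cure for the ultraviolet catastrophe expressed in numbers.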
In Problems 15 and 16 at the end of this chapter you will demonstrate that
Planck’s formula can be used to deduce Stefan’s law and Wien’s displacement
law. In fact, deducing Stefan’s law from Planck’s formula results in a relationship
between the Stefan-Boltzmann constant and Planck’s constant:
$$\sigma = \frac{2\pi^5 k^4}{15 c^2 h^3} \tag{3.42}$$
FIGURE 3.17 Data from the COBE satellite, launched in 1989 to determine the temperature of the cosmic microwave background radiation from the early universe. The data points exactly fit the Planck function corresponding to a temperature of 2.725 K. To appreciate the remarkable precision of this experiment, note that the sizes of the error bars have been increased by a factor of 400 to make them visible! (Source: NASA Office of Space Science) (Axes: intensity versus frequency.)
By determining the value of the Stefan-Boltzmann constant from the intensity
data available in 1900, Planck was able to determine a value of the constant h :
$$h = 6.56 \times 10^{-34}\ \mathrm{J \cdot s}$$
which agrees very well with the value of h that Millikan deduced 15 years later
based on the analysis of data from the photoelectric effect. The good agreement
of these two values is remarkable, because they are derived from very different
kinds of experiments—one involves the emission and the other the absorption of
electromagnetic radiation. This suggests that the quantization property is not an
accident arising from the analysis of one particular experiment, but is instead a
property of the electromagnetic field itself. Along with many other scientists of
his era, Planck was slow to accept this interpretation. However, later experimental
evidence (including the Compton effect) proved to be so compelling that it left
no doubt about Einstein’s photon theory and the particlelike structure of the
electromagnetic field.
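Equation 3.42 is easy to evaluate, and it can be inverted in the way Planck used it, to obtain h from a measured σ. The sketch below assumes modern rounded values of k and c (Planck's 1900 inputs were different), and the function names are ours.

```python
import math

H = 6.626e-34      # Planck's constant, J*s
C = 2.998e8        # speed of light, m/s
K_B = 1.381e-23    # Boltzmann constant, J/K

def stefan_boltzmann_constant(h=H):
    """sigma from Eq. 3.42: 2 pi^5 k^4 / (15 c^2 h^3), in W/(m^2 K^4)."""
    return 2 * math.pi**5 * K_B**4 / (15 * C**2 * h**3)

def planck_constant_from_sigma(sigma):
    """Invert Eq. 3.42 for h, given a value of sigma."""
    return (2 * math.pi**5 * K_B**4 / (15 * C**2 * sigma)) ** (1 / 3)

sigma = stefan_boltzmann_constant()
print(sigma)                                # roughly 5.67e-8 W/(m^2 K^4)
print(planck_constant_from_sigma(sigma))    # recovers the input value of h
```

Because h enters Eq. 3.42 as h³, a given fractional uncertainty in σ produces only one-third as large a fractional uncertainty in h.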
Planck’s formula still finds important applications today in the measurement
of temperature. By measuring the intensity of radiation emitted by an object
at a particular wavelength (or, as in actual experiments, in a small interval of
wavelengths), Eq. 3.41 can be used to deduce the temperature of the object. Note
that only one measurement, at any wavelength, is all that is required to obtain
the temperature. A radiometer is a device for measuring the intensity of thermal
radiation at selected wavelengths, enabling a determination of temperature.
Radiometers in orbiting satellites are used to measure the temperature of the land
and sea areas of the Earth and of the upper surface of clouds. Other orbiting
radiometers have been aimed toward “empty space” to measure the temperature
of the radiation from the early history of the universe (Figure 3.17).
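The temperature determination described here amounts to solving Eq. 3.41 for T, which can be done in closed form: T = hc / [λk ln(1 + 2πhc²/λ⁵I(λ))]. The following sketch shows that inversion. It assumes an ideal blackbody, and the "measurement" is generated from made-up inputs (500 K at 10 μm) purely to demonstrate the round trip.

```python
import math

H = 6.626e-34      # Planck's constant, J*s
C = 2.998e8        # speed of light, m/s
K_B = 1.381e-23    # Boltzmann constant, J/K

def planck(lam, T):
    """Planck intensity per unit wavelength, Eq. 3.41 (W/m^2 per meter)."""
    return 2 * math.pi * H * C**2 / lam**5 / math.expm1(H * C / (lam * K_B * T))

def temperature_from_intensity(lam, intensity_per_wavelength):
    """Solve Eq. 3.41 for T, given I(lambda) in W/m^2 per meter of wavelength."""
    ratio = 2 * math.pi * H * C**2 / (lam**5 * intensity_per_wavelength)
    return H * C / (lam * K_B * math.log1p(ratio))   # log1p(r) = ln(1 + r)

# Round trip with made-up inputs: generate a "measurement" at 500 K, then invert it.
lam = 10.0e-6
measured = planck(lam, 500.0)
print(temperature_from_intensity(lam, measured))
```

A real radiometer reports the intensity in a finite interval dλ, so the measured value would first be divided by dλ to obtain I(λ).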
Example 3.6
You are using a radiometer to observe the thermal radiation
from an object that is heated to maintain its temperature at
1278 K. The radiometer records radiation in a wavelength
interval of 12.6 nm. By changing the wavelength at which
you are measuring, you set the radiometer to record the
most intense radiation emission from the object. What is
the intensity of the emitted radiation in this interval?
Solution
The wavelength setting for the most intense radiation is
determined from Wien’s displacement law:
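$$\lambda_{\max} = \frac{2.8978 \times 10^{-3}\ \mathrm{m \cdot K}}{T} = \frac{2.8978 \times 10^{-3}\ \mathrm{m \cdot K}}{1278\ \mathrm{K}} = 2.267 \times 10^{-6}\ \mathrm{m} = 2267\ \mathrm{nm}$$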
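The given temperature corresponds to kT = (8.6174 × 10⁻⁵ eV/K)(1278 K) = 0.1101 eV. The radiation intensity
in this small wavelength interval is
$$I(\lambda)\, d\lambda = \frac{2\pi hc^2}{\lambda^5}\,\frac{1}{e^{hc/\lambda kT} - 1}\, d\lambda$$
$$= 2\pi (6.626 \times 10^{-34}\ \mathrm{J \cdot s})(2.998 \times 10^{8}\ \mathrm{m/s})^2 (12.6 \times 10^{-9}\ \mathrm{m})(2.267 \times 10^{-6}\ \mathrm{m})^{-5} \left(e^{(1240\ \mathrm{eV \cdot nm})/[(2267\ \mathrm{nm})(0.1101\ \mathrm{eV})]} - 1\right)^{-1}$$
$$= 552\ \mathrm{W/m^2}$$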
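The arithmetic of Example 3.6 can be reproduced in a few lines. This is a sketch using the same rounded constants as the example; the variable names are ours.

```python
import math

H = 6.626e-34        # Planck's constant, J*s
C = 2.998e8          # speed of light, m/s
HC_EV_NM = 1240.0    # hc in eV*nm
K_EV = 8.6174e-5     # Boltzmann constant, eV/K
WIEN_B = 2.8978e-3   # constant in Wien's displacement law, m*K

T = 1278.0           # temperature of the object, K
d_lam = 12.6e-9      # wavelength interval of the radiometer, m

lam_max = WIEN_B / T                          # wavelength of peak emission, m
kT = K_EV * T                                 # eV
exponent = HC_EV_NM / (lam_max * 1e9 * kT)    # hc / (lambda k T), dimensionless
intensity = 2 * math.pi * H * C**2 / lam_max**5 * d_lam / math.expm1(exponent)

print(lam_max * 1e9, kT, intensity)   # about 2267 nm, 0.1101 eV, and 552 W/m^2
```

The exponent is evaluated in eV and nm while the prefactor uses SI units; this mixing is harmless because the exponent is dimensionless.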
3.4 THE COMPTON EFFECT
Another way for radiation to interact with matter is by means of the Compton
effect, in which radiation scatters from loosely bound, nearly free electrons. Part
of the energy of the radiation is given to the electron; the remainder of the energy
is reradiated as electromagnetic radiation. According to the wave picture, the
scattered radiation is less energetic than the incident radiation (the difference
going into the kinetic energy of the electron) but has the same wavelength. As we
will see, the photon concept leads to a very different prediction for the scattered
radiation.
The scattering process is analyzed simply as an interaction (a “collision” in
the classical sense of particles) between a single photon and an electron, which
we assume to be at rest. Figure 3.18 shows the process. Initially, the photon has
energy E and linear momentum p given by
$$E = hf = \frac{hc}{\lambda} \qquad \text{and} \qquad p = \frac{E}{c} \tag{3.43}$$
FIGURE 3.18 The geometry of Compton scattering. (The incident photon has energy and momentum E, p; the scattered photon, E′, p′, leaves at angle θ; the scattered electron, Ee, pe, leaves at angle φ.)
The electron, initially at rest, has rest energy mec². After the scattering, the photon
has energy E′ = hc/λ′ and momentum p′ = E′/c, and it moves in a direction at an
angle θ with respect to the direction of the incident photon. The electron has total
final energy Ee and momentum pe and moves in a direction at an angle φ with
respect to the initial photon. (To allow for the possibility of high-energy incident
photons giving energetic scattered electrons, we use relativistic kinematics for the
electron.) The conservation laws for total relativistic energy and momentum are
then applied:
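$$E_{\rm initial} = E_{\rm final}: \qquad E + m_e c^2 = E' + E_e \tag{3.44a}$$
$$p_{x,{\rm initial}} = p_{x,{\rm final}}: \qquad p = p_e \cos\phi + p' \cos\theta \tag{3.44b}$$
$$p_{y,{\rm initial}} = p_{y,{\rm final}}: \qquad 0 = p_e \sin\phi - p' \sin\theta \tag{3.44c}$$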
We have three equations with four unknowns (θ , φ, Ee , E′ ; pe and p′ are not
independent unknowns) that cannot be solved uniquely, but we can eliminate any
two of the four unknowns by solving the equations simultaneously. If we choose
to measure the energy and direction of the scattered photon, we eliminate Ee and
φ. The angle φ is eliminated by first rewriting the momentum equations:
$$p_e \cos\phi = p - p' \cos\theta \qquad \text{and} \qquad p_e \sin\phi = p' \sin\theta \tag{3.45}$$
Squaring these equations and adding the results, we obtain
$$p_e^2 = p^2 - 2pp' \cos\theta + p'^2 \tag{3.46}$$
The relativistic relationship between energy and momentum is, according to
Eq. 2.39, $E_e^2 = c^2 p_e^2 + m_e^2 c^4$. Substituting in this equation for Ee from Eq. 3.44a
and for $p_e^2$ from Eq. 3.46, we obtain
$$(E + m_e c^2 - E')^2 = c^2 (p^2 - 2pp' \cos\theta + p'^2) + m_e^2 c^4 \tag{3.47}$$
and after a bit of algebra, we find
$$\frac{1}{E'} - \frac{1}{E} = \frac{1}{m_e c^2}(1 - \cos\theta) \tag{3.48}$$
In terms of wavelength, this equation can also be written as
$$\lambda' - \lambda = \frac{h}{m_e c}(1 - \cos\theta) \tag{3.49}$$
where λ is the wavelength of the incident photon and λ′ is the wavelength of the
scattered photon. The quantity h/mec is known as the Compton wavelength of the
electron and has a value of 0.002426 nm; however, keep in mind that it is not a
true wavelength but rather is a change of wavelength.
Arthur H. Compton (1892–1962, United States). His work on X-ray scattering verified Einstein’s photon theory
and earned him the 1927 Nobel Prize. He was a pioneer in research with X
rays and cosmic rays. During World War II he directed a portion of the U.S.
atomic bomb research.
Equations 3.48 and 3.49 give the change in energy or wavelength of the photon,
as a function of the scattering angle θ. Because the quantity on the right-hand side
is never negative, E′ is always less than E, so that the scattered photon has less
energy than the original incident photon; the difference E − E′ is just the kinetic
energy given to the electron, Ee − mec². Similarly, λ′ is greater than λ, meaning
the scattered photon always has a longer wavelength than the incident photon; the
change in wavelength ranges from 0 at θ = 0° to twice the Compton wavelength
at θ = 180°. Of course the descriptions in terms of energy and wavelength are
equivalent, and the choice of which to use is merely a matter of convenience.
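The angular dependence of the wavelength change in Eq. 3.49 can be tabulated directly. In this minimal sketch the function name and the sample angles are our own choices, and the Compton wavelength is the value quoted in the text.

```python
import math

COMPTON_WAVELENGTH_NM = 0.002426   # h / (m_e c), value quoted in the text

def compton_shift_nm(theta_deg):
    """Wavelength change (nm) from Eq. 3.49 for a scattering angle in degrees."""
    return COMPTON_WAVELENGTH_NM * (1 - math.cos(math.radians(theta_deg)))

for theta in (0, 45, 90, 135, 180):
    print(theta, compton_shift_nm(theta))
```

The shift is zero in the forward direction, equals one Compton wavelength at 90°, and reaches twice the Compton wavelength for backscattering. The incident wavelength does not appear anywhere in the calculation.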
Using Ee = Ke + mec², where Ke is the kinetic energy of the electron, conservation of energy (Eq. 3.44a) can also be written as E + mec² = E′ + Ke + mec².
Solving for Ke , we obtain
Ke = E − E′
(3.50)
That is, the kinetic energy acquired by the electron is equal to the difference
between the initial and final photon energies.
We can also find the direction of the electron’s motion by dividing the two
momentum relationships in Equation 3.45:
$$\tan\phi = \frac{p_e \sin\phi}{p_e \cos\phi} = \frac{p' \sin\theta}{p - p' \cos\theta} = \frac{E' \sin\theta}{E - E' \cos\theta} \tag{3.51}$$
where the last result comes from using p = E/c and p′ = E′ /c.
Example 3.7
X rays of wavelength 0.2400 nm are Compton-scattered,
and the scattered beam is observed at an angle of 60.0°
relative to the incident beam. Find: (a) the wavelength
of the scattered X rays, (b) the energy of the scattered
X-ray photons, (c) the kinetic energy of the scattered
electrons, and (d) the direction of travel of the scattered
electrons.
Solution
(a) λ′ can be found immediately from Eq. 3.49:
$$\lambda' = \lambda + \frac{h}{m_e c}(1 - \cos\theta) = 0.2400\ \mathrm{nm} + (0.00243\ \mathrm{nm})(1 - \cos 60^\circ) = 0.2412\ \mathrm{nm}$$
(b) The energy E′ can be found directly from λ′:
$$E' = \frac{hc}{\lambda'} = \frac{1240\ \mathrm{eV \cdot nm}}{0.2412\ \mathrm{nm}} = 5141\ \mathrm{eV}$$
(c) The initial photon energy E is hc/λ = 5167 eV, so
$$K_e = E - E' = 5167\ \mathrm{eV} - 5141\ \mathrm{eV} = 26\ \mathrm{eV}$$
(d) From Eq. 3.51,
$$\phi = \tan^{-1}\frac{E' \sin\theta}{E - E' \cos\theta} = \tan^{-1}\frac{(5141\ \mathrm{eV})(\sin 60^\circ)}{(5167\ \mathrm{eV}) - (5141\ \mathrm{eV})(\cos 60^\circ)} = 59.7^\circ$$
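The four parts of Example 3.7 can be reproduced numerically, together with a consistency check that the example does not perform. The sketch below makes one assumption that differs slightly from the example: the Compton wavelength is computed as hc/mec² from the rounded values hc = 1240 eV·nm and mec² = 0.511 MeV, so that the energy and momentum relations are mutually consistent. The final check compares pec from Eq. 3.46 with pec from the relativistic energy-momentum relation.

```python
import math

HC = 1240.0              # hc in eV*nm
ME_C2 = 0.511e6          # electron rest energy in eV
LAMBDA_C = HC / ME_C2    # Compton wavelength in nm (assumed consistent with HC and ME_C2)

lam = 0.2400             # incident wavelength, nm
theta = math.radians(60.0)

lam_prime = lam + LAMBDA_C * (1 - math.cos(theta))    # (a) Eq. 3.49
E = HC / lam                                          # incident photon energy, eV
E_prime = HC / lam_prime                              # (b) scattered photon energy, eV
K_e = E - E_prime                                     # (c) Eq. 3.50
phi = math.atan2(E_prime * math.sin(theta),
                 E - E_prime * math.cos(theta))       # (d) Eq. 3.51

# Consistency check: p_e*c from Eq. 3.46 (with p = E/c, p' = E'/c) versus
# p_e*c from E_e^2 = (p_e c)^2 + (m_e c^2)^2 with E_e = K_e + m_e c^2.
pe_c_from_momentum = math.sqrt(E**2 - 2 * E * E_prime * math.cos(theta) + E_prime**2)
pe_c_from_energy = math.sqrt((K_e + ME_C2)**2 - ME_C2**2)

print(lam_prime, E_prime, K_e, math.degrees(phi))   # compare with parts (a) to (d)
print(pe_c_from_momentum, pe_c_from_energy)         # the two values should agree
```

The agreement of the last two numbers reflects the fact that Eq. 3.49 was derived from exactly these conservation laws.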
The first experimental demonstration of this type of scattering was done by
Arthur Compton in 1923. A diagram of his experimental arrangement is shown in
Figure 3.19. A beam of X rays of a single wavelength λ is incident on a scattering
target, for which Compton used carbon. (Although no scattering target contains
actual “free” electrons, the outer or valence electrons in many materials are very
weakly attached to the atom and behave like nearly free electrons. The binding
energies of these electrons in the atom are so small compared with the energies of
the incident X-ray photons that they can be regarded as nearly “free” electrons.) A
movable detector measured the energy of the scattered X rays at various angles θ.
FIGURE 3.19 Schematic diagram of Compton-scattering apparatus. The wavelength λ′ of the scattered X rays is measured by the detector, which can be moved
to different positions θ. The wavelength difference λ′ − λ varies with θ.
Compton’s original results are illustrated in Figure 3.20. At each angle,
two peaks appear, corresponding to scattered X-ray photons with two different
energies or wavelengths. The wavelength of one peak does not change as the
angle is varied; this peak corresponds to scattering that involves “inner” electrons
of the atom, which are more tightly bound to the atom so that the photon can
scatter with no loss of energy. The wavelength of the other peak, however, varies
strongly with angle; as can be seen from Figure 3.21, this variation is exactly as
the Compton formula predicts.
Similar results can be obtained for the scattering of gamma rays, which
are higher-energy (shorter wavelength) photons emitted in various radioactive
decays. Compton also measured the variation in wavelength of scattered
gamma rays, as illustrated in Figure 3.22. The change in wavelength in the
gamma-ray measurements is identical with the change in wavelength in the X-ray
measurements, as Eq. 3.49 predicts: the change in wavelength does not depend
on the incident wavelength.
FIGURE 3.20 Compton’s original results for X-ray scattering. (Panels: 0°, λ = 0.0709 nm; 45°, λ′ = 0.0715 nm; 90°, λ′ = 0.0731 nm; 135°, λ′ = 0.0749 nm.)
FIGURE 3.21 The scattered X-ray wavelengths λ′, from Figure 3.20, for different scattering angles. The expected slope is 2.43 × 10⁻¹² m, in agreement with the measured slope of Compton’s data points. (Axes: λ′ in units of 10⁻¹² m versus 1 − cos θ; fitted slope = 2.4 × 10⁻¹² m.)
FIGURE 3.22 Compton’s results for gamma-ray scattering. The wavelengths are much smaller than for X-rays, but the slope is the same as in Figure 3.21, which the Compton formula, Eq. 3.49, predicts. (Axes: λ′ in units of 10⁻¹² m versus 1 − cos θ; fitted slope = 2.4 × 10⁻¹² m.)
3.5 OTHER PHOTON PROCESSES
Although thermal radiation, the photoelectric effect, and Compton scattering provided the earliest experimental evidence in support of the quantization (particlelike
behavior) of electromagnetic radiation, there are numerous other experiments that
can also be interpreted correctly only if we assume the existence of photons as
discrete quanta of electromagnetic radiation. In this section we discuss some of
these processes, which cannot be understood if we consider only the wave nature
of electromagnetic radiation. As you study the descriptions of these processes,
note how photons interact with atoms or electrons by delivering energy in discrete
bundles, in contrast to the wave interpretation in which the energy can be regarded
as arriving continuously.
Interactions of Photons with Atoms
The emission of electromagnetic radiation from atoms takes place in discrete
amounts characterized by one or more photons. When an atom emits a photon
of energy E, the atom loses an equivalent amount of energy. Consider an atom
at rest that has an initial energy Ei . The atom emits a photon of energy E. After
the emission, the atom is left with a final energy Ef , which we will take as the
energy associated with the internal structure of the atom. Because of conservation
of momentum, the final atom must have a momentum that is equal and opposite
to the momentum of the emitted photon, so the atom must also have a “recoil”
kinetic energy K. (Normally this kinetic energy is very small.) Conservation of
energy then gives
$$E_i = E_f + K + E \qquad \text{or} \qquad E = (E_i - E_f) - K \tag{3.52}$$
The energy of the emitted photon is equal to the net energy lost by the atom,
minus a negligibly small contribution to the recoil kinetic energy of the atom.
In the reverse process, an atom can absorb a photon of energy E. If the atom
is initially at rest, it must again acquire a small recoil kinetic energy in order to
conserve momentum. Now conservation of energy gives
$$E_i + E = E_f + K \qquad \text{or} \qquad E_f - E_i = E - K \tag{3.53}$$
The energy available to add to the atom’s internal supply of energy is the photon
energy, less a recoil kinetic energy that is usually negligible.
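How small is the recoil term? Momentum conservation gives the atom a momentum equal in magnitude to the photon's, p = E/c, so a nonrelativistic atom of mass M recoils with K = p²/2M = E²/2Mc². The sketch below evaluates this estimate. The numbers are illustrative and are not taken from the text: a 10 eV photon and an atom with a rest energy of 10⁹ eV, roughly one atomic mass unit.

```python
def recoil_kinetic_energy_ev(photon_energy_ev, atom_rest_energy_ev):
    """Recoil kinetic energy K = p^2 / (2M) with p = E/c, i.e. K = E^2 / (2 M c^2).

    Assumes a nonrelativistic atom initially at rest.
    """
    return photon_energy_ev**2 / (2 * atom_rest_energy_ev)

# Illustrative numbers (not from the text)
E_photon = 10.0      # photon energy, eV
Mc2 = 1.0e9          # atomic rest energy, eV
K = recoil_kinetic_energy_ev(E_photon, Mc2)
print(K, K / E_photon)   # recoil energy in eV, and its ratio to the photon energy
```

For these inputs the recoil energy is many orders of magnitude smaller than the photon energy, which is why K is usually dropped from Eqs. 3.52 and 3.53.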
Photon emission and absorption experiments are among the most important
techniques for acquiring information about the internal structure of atoms, as we
discuss in Chapter 6.
Bremsstrahlung and X-ray Production
When an electric charge, such as an electron, is accelerated or decelerated, it
radiates electromagnetic energy; according to the quantum interpretation, we
would say that it emits photons. Suppose we have a beam of electrons, which
has been accelerated through a potential difference ΔV, so that the electrons
experience a change in potential energy of −eΔV and thus acquire a kinetic energy
of K = eΔV (Figure 3.23). When the electrons strike a target they are slowed
down and eventually come to rest, because they make collisions with the atoms
of the target material. In such a collision, momentum is transferred to the atom,
the electron slows down, and photons are emitted. The recoil kinetic energy of
the atom is small (because the atom is so massive) and can safely be neglected. If
the electron has a kinetic energy K before the encounter and if it leaves after the
collision with a smaller kinetic energy K ′ , then the photon energy hf = hc/λ is
$$hf = \frac{hc}{\lambda} = K - K' \tag{3.54}$$
The amount of energy lost, and therefore the energy and wavelength of the
emitted photon, are not uniquely determined, because K is the only known energy
in Eq. 3.54. An electron usually will make many collisions, and therefore emit
many different photons, before it is brought to rest; the photons then will range
all the way from very small energies (large wavelengths) corresponding to small
energy losses, up to a maximum photon energy hfmax equal to K, corresponding
to an electron that loses all of its kinetic energy K in a single encounter (that is,
when K ′ = 0). The smallest emitted wavelength λmin is therefore determined by
the maximum possible energy loss,
$$\lambda_{\min} = \frac{hc}{K} = \frac{hc}{e\,\Delta V} \tag{3.55}$$
FIGURE 3.23 (a) Apparatus for producing bremsstrahlung. Electrons from a
cathode C are accelerated to the anode A through the potential difference ΔV.
When an electron encounters a target atom of the anode, it can lose energy, with the
accompanying emission of an X-ray photon. (b) A schematic representation of the
bremsstrahlung process.
For typical accelerating voltages in the range of 10,000 V, λmin is in the range of
a few tenths of nm, which corresponds to the X-ray region of the spectrum. This
continuous distribution of X rays (which is very different from the discrete X-ray
energies that are emitted in atomic transitions; more about these in Chapter 8) is
called bremsstrahlung, which is German for braking, or decelerating, radiation.
Some sample bremsstrahlung spectra are illustrated in Figure 3.24.
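The short-wavelength cutoff of Eq. 3.55 is simple to evaluate for the accelerating voltages discussed here. In this sketch the function name is ours, and hc is rounded to 1240 eV·nm as elsewhere in the chapter.

```python
HC_EV_NM = 1240.0   # hc in eV*nm

def min_wavelength_nm(accelerating_voltage_volts):
    """Shortest bremsstrahlung wavelength from Eq. 3.55.

    An electron accelerated through a potential difference of V volts gains a
    kinetic energy of V electron-volts, so lambda_min = hc / (e * V).
    """
    return HC_EV_NM / accelerating_voltage_volts

for kilovolts in (10, 20, 30, 40, 50):
    print(kilovolts, min_wavelength_nm(kilovolts * 1000.0))
```

Doubling the accelerating voltage halves the cutoff wavelength, so the cutoff of each spectrum moves to shorter wavelengths as the voltage is raised.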
Symbolically we can write the bremsstrahlung process as
electron → electron + photon
This is just the reverse process of the photoelectric effect, which is
electron + photon → electron
However, neither process occurs for free electrons. In both cases there must be a
heavy atom in the neighborhood to take care of the recoil momentum.
Pair Production and Annihilation
Another process that can occur when photons encounter atoms is pair production,
in which the photon loses all its energy and in the process two particles are
created: an electron and a positron. (A positron is a particle that is identical in
mass to the electron but has a positive electric charge; more about antiparticles
in Chapter 14.) Here we have an example of the creation of rest energy. The
electron did not exist before the encounter of the photon with the atom (it was not
an electron that was part of the atom). The photon energy hf is converted into the
relativistic total energies E+ and E− of the positron and electron:
$$hf = E_+ + E_- = (m_e c^2 + K_+) + (m_e c^2 + K_-) \tag{3.56}$$
Because K+ and K− are always positive, the photon must have an energy of
at least 2mec² = 1.02 MeV in order for this process to occur; such high-energy
photons are in the region of nuclear gamma rays. Symbolically,
photon → electron + positron
This process, like bremsstrahlung, will not occur unless there is an atom nearby
to supply the necessary recoil momentum. The reverse process,
electron + positron → photon
also occurs; this process is known as electron-positron annihilation and can occur
for free electrons and positrons as long as at least two photons are created. In
this process the electron and positron disappear and are replaced by two photons.
Conservation of energy requires that
$$(m_e c^2 + K_+) + (m_e c^2 + K_-) = E_1 + E_2 \tag{3.57}$$
where E1 and E2 are the photon energies. Usually the kinetic energies K+ and K−
are negligibly small, so we can assume the positron and electron to be essentially
at rest. Momentum conservation then requires the two photons to have equal and
opposite momenta and thus equal energies. The two annihilation photons have
equal energies of 0.511 MeV (= mec²) and move in exactly opposite directions.
FIGURE 3.24 Some typical bremsstrahlung spectra. Each spectrum is
labeled with the value of the accelerating voltage ΔV (20, 30, 40, and 50 kV). (Axes: relative intensity versus wavelength, 0 to 0.10 nm.)
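The energy bookkeeping of Eqs. 3.56 and 3.57 can be sketched as follows. The function names and the sample photon energies are illustrative assumptions, and the small recoil energy of the nearby atom in pair production is neglected.

```python
HC_EV_NM = 1240.0        # hc in eV*nm
ME_C2_EV = 0.511e6       # electron rest energy in eV

def pair_kinetic_energy_ev(photon_energy_ev):
    """Total kinetic energy K+ + K- shared by the pair (Eq. 3.56), or None below threshold."""
    available = photon_energy_ev - 2 * ME_C2_EV
    return available if available >= 0 else None

def annihilation_photon_wavelength_nm():
    """Wavelength of each photon when the pair annihilates essentially at rest."""
    return HC_EV_NM / ME_C2_EV

print(pair_kinetic_energy_ev(1.0e6))       # below the 1.022 MeV threshold: None
print(pair_kinetic_energy_ev(2.0e6))       # kinetic energy left over for the pair, eV
print(annihilation_photon_wavelength_nm())
```

The wavelength of each annihilation photon is hc/mec², which is the same quantity as the Compton wavelength of the electron.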
3.6 WHAT IS A PHOTON?
We can describe photons by giving a few of their basic properties:
• like an electromagnetic wave, photons move with the speed of light;
• they have zero mass and rest energy;
• they carry energy and momentum, which are related to the frequency and
wavelength of the electromagnetic wave by E = hf and p = h/λ ;
• they can be created or destroyed when radiation is emitted or absorbed;
• they can have particlelike collisions with other particles such as electrons.
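The relations E = hf = hc/λ and p = h/λ in this list can be put to work directly. The sketch below uses rounded constants and an arbitrary visible-light wavelength of 500 nm as an illustration.

```python
H = 6.626e-34         # Planck's constant, J*s
C = 2.998e8           # speed of light, m/s
EV = 1.602e-19        # joules per electron-volt

def photon_energy_ev(wavelength_m):
    """E = hf = hc / lambda, converted to eV."""
    return H * C / wavelength_m / EV

def photon_momentum(wavelength_m):
    """p = h / lambda, in kg*m/s."""
    return H / wavelength_m

lam = 500e-9          # illustrative visible-light wavelength, m
E = photon_energy_ev(lam)
p = photon_momentum(lam)
print(E, p)
print(E * EV / (p * C))   # E = pc for a massless particle, so this ratio is 1
```

The final ratio confirms that the two relations together reproduce E = pc, as required for a particle with zero mass.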
FIGURE 3.25 Apparatus for delayed choice experiment. Photons from the
laser strike the beam splitter and can then travel paths A or B. The switch
in path A can deflect the beam into a detector. If the switch is off, the
beam on path A recombines with the beam on path B to form an interference pattern. [Source: A. Shimony,
“The Reality of the Quantum World,” Scientific American 258, 46 (January
1988)].
In this chapter we have described some experiments that favor the photon
interpretation of electromagnetic radiation, according to which the energy of the
radiation is concentrated in small bundles. Other experiments, such as interference
and diffraction, favor the wave interpretation, according to which the energy of
the radiation is spread over its entire wavefront. For example, the explanation of
the double-slit interference experiment requires that the wavefront be divided so
that some of its intensity can pass through each slit. A particle must choose to go
through one slit or the other; only a wave can go through both.
If we regard the wave and particle pictures as valid but exclusive alternatives,
we must assume that the light emitted by a source must travel either as waves or
as particles. How does the source know what kind of light (particles or waves) to
emit? Suppose we place a double-slit apparatus on one side of the source and a
photoelectric cell on the other side. Light emitted toward the double slit behaves
like a wave and light emitted toward the photocell behaves like particles. How
did the source know in which direction to aim the waves and in which direction
to aim the particles?
Perhaps nature has a sort of “secret code” in which the kind of experiment we
are doing is signaled back to the source so that it knows whether to emit particles
or waves. Let us repeat our dual experiment with light from a distant galaxy,
light that has been traveling toward us for a time roughly equal to the age of
the universe (13 × 10⁹ years). Surely the kind of experiment we are doing could
not be signaled back to the limits of the known universe in the time it takes us
to remove the double-slit apparatus from the laboratory table and replace it with
the photoelectric apparatus. Yet we find that the starlight can produce both the
double-slit interference and also the photoelectric effect.
Figure 3.25 shows a recent experiment that was designed to test whether this
dual nature is an intrinsic property of light or of our apparatus. A light beam
from a laser goes through a beam splitter, which separates the beam into two
components (A and B). The mirrors reflect the two component beams so that they
can recombine to form an interference pattern. In path A there is a switch that can
deflect the beam into a detector. If the switch is off, beam A is not deflected and
will combine with beam B to produce the interference pattern. If the switch is on,
beam A is deflected and observed in the detector, indicating that the light traveled
a definite path, as would be characteristic of a particle. To put this another way, if
the switch is off, the light beam is observed as a wave; if it is on, the light beam
is observed as particles.
If light behaves like particles, the beam splitter sends it along either path A
or path B; either path can be randomly chosen for the particle, but each particle
can travel only one path. If light behaves like a wave, on the other hand, the beam
splitter sends it along both paths, dividing its intensity between the two. Perhaps
the beam splitter can somehow sense whether the switch is open or closed, so
that it knows whether we are doing a particle-type or a wave-type experiment.
If this were true, then the beam splitter would “know” whether to send all of
the intensity down one path (so that we would observe a particle) or to split
the intensity between the paths (so that we would observe a wave). However, in
this experiment the experimenters used a very fast optical switch whose response
time was shorter than the time it takes for light to travel through the apparatus
to the switch. That is, the state of the switch could be changed after the light
had already passed through the beam splitter, and so it was impossible for the
beam splitter to “know” how the switch was set and thus whether a particle-type
or a wave-type experiment was being done. This kind of experiment is called a
“delayed choice” experiment, because the experimenter makes the choice of what
kind of experiment to do after the light is already traveling on its way to the
observation apparatus.
In this experiment, the investigators discovered that whenever they had the
switch off, they observed the interference pattern characteristic of waves. When
they had the switch on, they observed particles in the detector and no interference
pattern. That is, whenever they did a wave-type experiment they observed waves,
and whenever they did a particle-type experiment they observed particles. The
wave and particle natures are both present simultaneously in the light, and this dual
nature is clearly associated with the light and is not characteristic of the apparatus.
Many other experiments of this type have been done, and they all produce
similar results. We are therefore trapped into an uncomfortable conclusion: Light
is not either particles or waves; it is somehow both particles and waves, and
only shows one or the other aspect, depending on the kind of experiment we are
doing. A particle-type experiment shows the particle nature, while a wave-type
experiment shows the wave nature. Our failure to classify light as either particle
or wave is not so much a failure to understand the nature of light as it is a failure of
our limited vocabulary (based on experiences with ordinary particles and waves)
to describe a phenomenon that is more elegant and mysterious than either simple
particles or waves.
Wave-Particle Duality
The dilemma of the dual particle-wave nature of light, which is called wave-particle duality, cannot be resolved with a simple explanation; physicists and
philosophers have struggled with this problem ever since the quantum theory was
introduced. The best we can do is to say that neither the wave nor the particle
picture is wholly correct all of the time, that both are needed for a complete
description of physical phenomena, and that in fact the two are complementary to
one another.
Suppose we use a photographic film to observe the double-slit interference
pattern. The film responds to individual photons. When a single photon is absorbed
by the film, a single grain of the photographic emulsion is darkened; a complete
picture requires a large number of grains to be darkened.
Let us imagine for the moment that we could see individual grains of the film
as they absorbed photons and darkened, and let us do the double-slit experiment
with a light source that is so weak that there is a relatively long time interval
between photons. We would see first one grain darken, then another, and so forth,
until after a large number of photons we would see the interference pattern begin
to emerge. Some areas of the film (the interference maxima) show evidence for
the arrival of a large number of photons, while in other areas (the interference
minima) few photons arrive.
Alternatively, the wave picture of the double-slit experiment suggests that we
could find the net electric field of the wave that strikes the screen by superimposing
the electric fields of the portions of the incident wave fronts that pass through the
two slits; the intensity or power in that combined wave could then be found by a
procedure similar to Eqs. 3.7 through 3.10, and we would expect that the resultant
intensity should show maxima and minima just like the observed double-slit
interference pattern.
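The superposition described above can be sketched numerically. The following is a minimal illustration rather than the procedure of Eqs. 3.7 through 3.10 itself: it assumes two equal-amplitude waves from idealized narrow slits and the small-angle phase difference 2πdy/(λD). The function name and the values chosen for λ, d, and D are assumptions made only for this sketch.

```python
import cmath
import math


def double_slit_intensity(y, wavelength, slit_sep, screen_dist, e0=1.0):
    """Relative intensity at screen position y from two superposed fields.

    Assumes equal amplitudes e0 from each slit and the small-angle
    approximation, so the phase difference is 2*pi*d*y/(lambda*D).
    """
    phase = 2.0 * math.pi * slit_sep * y / (wavelength * screen_dist)
    total_field = e0 + e0 * cmath.exp(1j * phase)  # phasor sum of the two waves
    return abs(total_field) ** 2  # intensity is proportional to |E|^2


if __name__ == "__main__":
    wavelength = 500e-9   # m (illustrative value)
    slit_sep = 0.5e-3     # m (illustrative value)
    screen_dist = 1.0     # m (illustrative value)
    fringe = wavelength * screen_dist / slit_sep  # spacing of adjacent maxima
    for k in range(5):
        y = k * fringe / 4.0
        intensity = double_slit_intensity(y, wavelength, slit_sep, screen_dist)
        print(f"y = {y:.2e} m  relative intensity = {intensity:.3f}")
```

The intensity is largest where the two fields arrive in phase and falls to essentially zero where they arrive half a cycle apart, which is the pattern of maxima and minima described in the text.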
In summary, the correct explanation of the origin and appearance of the
interference pattern comes from the wave picture, and the correct interpretation
of the evolution of the pattern on the film comes from the photon picture; the
two explanations, which according to our limited vocabulary and common-sense
experience cannot simultaneously be correct, must somehow be taken together to
give a complete description of the properties of electromagnetic radiation.
Keep in mind that “photon” and “wave” represent descriptions of the behavior
of electromagnetic radiation when it encounters material objects. It is not correct
to think of light as being “composed” of photons, just as we don’t think of light as
being “composed” of waves. The explanation in terms of photons applies to some
interactions of radiation with matter, while the explanation in terms of waves
applies to other interactions. For example, when we say that an atom “emits” a
photon, we don’t mean that there is a supply of photons stored within the atom;
instead, we mean that the atom has given up a quantity of its internal energy to
create an equivalent amount of energy in the form of electromagnetic radiation.
In the case of the double-slit experiment, we might reason as follows: the
interaction between a “source” of radiation and the electromagnetic field is
quantized, so that we can think of the emission of radiation by the atoms of the
source in terms of individual photons. The interaction at the opposite end of the
experiment, the photographic film, is also quantized, and we have the similarly
useful view of atoms absorbing radiation as individual photons. In between, the
electromagnetic radiation propagates smoothly and continuously as a wave and
can show wave-type behavior (interference or diffraction) when it encounters the
double slit.
Where the wave has large intensity, the film reveals the presence of many
photons; where the wave has small intensity, few photons are observed. Recalling
that the intensity of the wave is proportional to the square of its amplitude, we
then have
$$\text{probability to observe photons} \propto |\text{electric field amplitude}|^{2}$$
It is this expression that provides the ultimate connection between the wave
behavior and the particle behavior, and we will see in the next two chapters that
a similar expression connects the wave and the particle aspects of those objects,
such as electrons, which have been previously considered to behave as classical
particles.
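The connection between wave intensity and photon probability can be illustrated with a small simulation. This is a sketch under stated assumptions, not a description of any particular experiment: the screen intensity is taken to be the ideal two-slit form cos²(πy/Δy) with no single-slit envelope, each photon is treated as an independent random arrival, and the function names, fringe spacing, and photon counts are all chosen here for illustration.

```python
import math
import random


def sample_photon_hits(n_photons, fringe_spacing, half_width, rng):
    """Draw photon arrival positions with probability proportional to cos^2.

    Rejection sampling: propose a uniform position on the screen and accept
    it with probability equal to the relative wave intensity at that point.
    """
    hits = []
    while len(hits) < n_photons:
        y = rng.uniform(-half_width, half_width)
        relative_intensity = math.cos(math.pi * y / fringe_spacing) ** 2
        if rng.random() < relative_intensity:
            hits.append(y)
    return hits


def histogram(hits, half_width, n_bins):
    """Count hits in equal-width bins spanning [-half_width, half_width]."""
    counts = [0] * n_bins
    bin_width = 2.0 * half_width / n_bins
    for y in hits:
        index = min(int((y + half_width) / bin_width), n_bins - 1)
        counts[index] += 1
    return counts


if __name__ == "__main__":
    rng = random.Random(0)     # fixed seed so the run is repeatable
    fringe_spacing = 1.0e-3    # m (illustrative value)
    half_width = 2.0e-3        # m, region of the screen recorded (illustrative)
    for n in (20, 200, 5000):  # a few photons look random; many reveal fringes
        hits = sample_photon_hits(n, fringe_spacing, half_width, rng)
        print(n, histogram(hits, half_width, 16))
```

With only a few photons the counts look irregular. As the number grows, the bins near the intensity maxima fill much faster than those near the minima, so the fringe pattern emerges one photon at a time.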
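Chapter Summary
Double-slit maxima: $y_n = n\,\lambda D/d$, $n = 0, 1, 2, 3, \ldots$ (Section 3.1)
Bragg’s law for X-ray diffraction: $2d \sin\theta = n\lambda$, $n = 1, 2, 3, \ldots$ (Section 3.1)
Energy of photon: $E = hf = hc/\lambda$ (Section 3.2)
Maximum kinetic energy of photoelectrons: $K_{\max} = eV_s = hf - \phi$ (Section 3.2)
Cutoff wavelength: $\lambda_c = hc/\phi$ (Section 3.2)
Stefan’s law: $I = \sigma T^4$ (Section 3.3)
Wien’s displacement law: $\lambda_{\max} T = 2.8978 \times 10^{-3}\ \text{m·K}$ (Section 3.3)
Rayleigh-Jeans formula: $I(\lambda) = \dfrac{2\pi c}{\lambda^4}\, kT$ (Section 3.3)
Planck’s blackbody distribution: $I(\lambda) = \dfrac{2\pi h c^2}{\lambda^5}\, \dfrac{1}{e^{hc/\lambda kT} - 1}$ (Section 3.3)
Compton scattering: $\dfrac{1}{E'} - \dfrac{1}{E} = \dfrac{1}{m_e c^2}(1 - \cos\theta)$, $\lambda' - \lambda = \dfrac{h}{m_e c}(1 - \cos\theta)$ (Section 3.4)
Bremsstrahlung: $\lambda_{\min} = hc/K = hc/eV$ (Section 3.5)
Pair production: $hf = E_+ + E_- = (m_e c^2 + K_+) + (m_e c^2 + K_-)$ (Section 3.5)
Electron-positron annihilation: $(m_e c^2 + K_+) + (m_e c^2 + K_-) = E_1 + E_2$ (Section 3.5)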
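Several of the summary relations are convenient to evaluate with energies in eV and wavelengths in nm. The helpers below are a minimal sketch, not part of the text: the function names are invented here, the constants hc, m_e c², and σ are standard rounded values supplied for the sketch, and the work function in the usage lines is a hypothetical value rather than an entry from Table 3.1.

```python
import math

HC_EV_NM = 1239.84      # h*c in eV·nm (rounded standard value)
ME_C2_EV = 0.511e6      # electron rest energy m_e c^2 in eV (rounded)
WIEN_M_K = 2.8978e-3    # Wien displacement constant in m·K
SIGMA = 5.6704e-8       # Stefan-Boltzmann constant in W/(m^2·K^4) (rounded)


def photon_energy_ev(wavelength_nm):
    """E = hc/lambda, in eV."""
    return HC_EV_NM / wavelength_nm


def cutoff_wavelength_nm(work_function_ev):
    """lambda_c = hc/phi, in nm."""
    return HC_EV_NM / work_function_ev


def stopping_potential_v(wavelength_nm, work_function_ev):
    """V_s from e*V_s = hf - phi; zero when the photon energy is below phi."""
    return max(photon_energy_ev(wavelength_nm) - work_function_ev, 0.0)


def wien_peak_nm(temperature_k):
    """Peak wavelength from lambda_max * T = constant, in nm."""
    return WIEN_M_K / temperature_k * 1e9


def stefan_intensity(temperature_k):
    """Total radiated intensity I = sigma * T^4, in W/m^2."""
    return SIGMA * temperature_k ** 4


def compton_scattered_wavelength_nm(wavelength_nm, theta_deg):
    """lambda' = lambda + (h / m_e c) * (1 - cos theta), in nm."""
    compton_wavelength_nm = HC_EV_NM / ME_C2_EV  # h/(m_e c) expressed in nm
    angle = math.radians(theta_deg)
    return wavelength_nm + compton_wavelength_nm * (1.0 - math.cos(angle))


if __name__ == "__main__":
    phi = 2.0  # eV, hypothetical work function used only for illustration
    print(cutoff_wavelength_nm(phi), stopping_potential_v(400.0, phi))
    print(wien_peak_nm(3000.0), stefan_intensity(3000.0))
    print(compton_scattered_wavelength_nm(0.1, 90.0))
```

Working with hc in eV·nm avoids converting to joules and meters at every step, which is the same shortcut the text uses when energies are quoted in electron-volts.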
Questions
1. The diameter of an atomic nucleus is about $10 \times 10^{-15}$ m.
Suppose you wanted to study the diffraction of photons
by nuclei. What energy of photons would you choose?
Why?
2. How is the wave nature of light unable to account for the
observed properties of the photoelectric effect?
3. In the photoelectric effect, why do some electrons have
kinetic energies smaller than Kmax ?
4. Why doesn’t the photoelectric effect work for free electrons?
5. What does the work function tell us about the properties of a
metal? Of the metals listed in Table 3.1, which has the least
tightly bound electrons? Which has the most tightly bound?
6. Electric current is charge flowing per unit time. If we increase
the kinetic energy of the photoelectrons (by increasing
the energy of the incident photons), shouldn’t the current increase, because the charge flows more rapidly? Why
doesn’t it?
7. What might be the effects on a photoelectric effect experiment if we were to double the frequency of the incident
light? If we were to double the wavelength? If we were to
double the intensity?
8. In the photoelectric effect, how can a photon moving in one
direction eject an electron moving in a different direction?
What happens to conservation of momentum?
9. In Figure 3.10, why does the photoelectric current rise
slowly to its saturation value instead of rapidly, when the
potential difference is greater than Vs ? What does this figure
indicate about the experimental difficulties that might arise
from trying to determine Vs in this way?
10. Suppose that the frequency of a certain light source is just
above the cutoff frequency of the emitter, so that the photoelectric effect occurs. To an observer in relative motion,
the frequency might be Doppler shifted to a lower value
that is below the cutoff frequency. Would this moving
observer conclude that the photoelectric effect does not
occur? Explain.
11. Why do cavities that form in a wood fire seem to glow
brighter than the burning wood itself? Is the temperature
in such cavities hotter than the surface temperature of the
exposed burning wood?
12. What are the fields of classical physics on which the classical theory of blackbody radiation is based? Why don’t
we believe that the “ultraviolet catastrophe” suggests that
something is wrong with one of those classical theories?
13. In what region of the electromagnetic spectrum do room-temperature objects radiate? What problems would we have
if our eyes were sensitive in that region?
14. How does the total intensity of thermal radiation vary when
the temperature of an object is doubled?
15. Compton-scattered photons of wavelength λ′ are observed
at 90°. In terms of λ′, what is the scattered wavelength
observed at 180°?
16. The Compton-scattering formula suggests that objects
viewed from different angles should show scattered light
of different wavelengths. Why don’t we observe a change in
color of objects as we change the viewing angle?
17. You have a monoenergetic source of X rays of energy 84
keV, but for an experiment you need 70 keV X rays. How
would you convert the X-ray energy from 84 to 70 keV?
18. TV sets with picture tubes can be significant emitters of
X rays. What is the origin of these X rays? Estimate their
wavelengths.
19. The X-ray peaks of Figure 3.20 are not sharp but are spread
over a range of wavelengths. What reasons might account
for that spreading?
20. A beam of photons passes through a block of matter. What
are the three ways discussed in this chapter that the photons
can lose energy in interacting with the material?
21. Of the photon processes discussed in this chapter (photoelectric effect, thermal radiation, Compton scattering,
bremsstrahlung, pair production, electron-positron annihilation), which conserve momentum? Energy? Mass? Number
of photons? Number of electrons? Number of electrons
minus number of positrons?
Problems
3.1 Review of Electromagnetic Waves
1. A double-slit experiment is performed with sodium light
(λ = 589.0 nm). The slits are separated by 1.05 mm, and the
screen is 2.357 m from the slits. Find the separation between
adjacent maxima on the screen.
2. In Example 3.1, what angle of incidence will produce the
second-order Bragg peak?
3. Monochromatic X rays are incident on a crystal in the geometry of Figure 3.5. The first-order Bragg peak is observed
when the angle of incidence is 34.0°. The crystal spacing
is known to be 0.347 nm. (a) What is the wavelength of
the X rays? (b) Now consider a set of crystal planes that
makes an angle of 45° with the surface of the crystal (as
in Figure 3.6). For X rays of the same wavelength, find the
angle of incidence measured from the surface of the crystal
that produces the first-order Bragg peak. At what angle from
the surface does the emerging beam appear in this case?
4. A certain device for analyzing electromagnetic radiation is
based on the Bragg scattering of the radiation from a crystal.
For radiation of wavelength 0.149 nm, the first-order Bragg
peak appears centered at an angle of 15.15°. The aperture of
the analyzer passes radiation in the angular range of 0.015°.
What is the corresponding range of wavelengths passing
through the analyzer?
3.2 The Photoelectric Effect
5. Find the momentum of (a) a 10.0-MeV gamma ray; (b) a
25-keV X ray; (c) a 1.0-μm infrared photon; (d) a 150-MHz
radio-wave photon. Express the momentum in kg · m/s and
eV/c.
6. Radio waves have a frequency of the order of 1 to 100 MHz.
What is the range of energies of these photons? Our bodies
are continuously bombarded by these photons. Why are they
not dangerous to us?
7. (a) What is the wavelength of an X-ray photon of energy
10.0 keV? (b) What is the wavelength of a gamma-ray photon of energy 1.00 MeV? (c) What is the range of energies
of photons of visible light with wavelengths 350 to 700 nm?
8. What is the cutoff wavelength for the photoelectric effect
using an aluminum surface?
9. A metal surface has a photoelectric cutoff wavelength
of 325.6 nm. It is illuminated with light of wavelength
259.8 nm. What is the stopping potential?
10. When light of wavelength λ illuminates a copper surface,
the stopping potential is V. In terms of V, what will be
the stopping potential if the same wavelength is used to
illuminate a sodium surface?
11. The cutoff wavelength for the photoelectric effect in a certain metal is 254 nm. (a) What is the work function for
that metal? (b) Will the photoelectric effect be observed for
λ > 254 nm or for λ < 254 nm?
12. A surface of zinc is illuminated and photoelectrons are
observed. (a) What is the largest wavelength that will cause
photoelectrons to be emitted? (b) What is the stopping
potential when light of wavelength 220.0 nm is used?
3.3 Blackbody Radiation
13. (a) Show that in the classical result for the energy distribution of the cavity wall oscillators (Eq. 3.32), the total number
of oscillators at all energies is N. (b) Show that $E_{\rm av} = kT$
for the classical oscillators.
14. (a) Writing the discrete Maxwell-Boltzmann distribution
for Planck’s cavity wall oscillators as $N_n = A e^{-E_n/kT}$
(where A is a constant to be determined), show that the
condition $\sum_{n=0}^{\infty} N_n = N$ gives $A = N(1 - e^{-\varepsilon/kT})$ as in Eq.
3.38. [Hint: Use $\sum_{n=0}^{\infty} e^{nx} = (1 - e^{x})^{-1}$.] (b) By taking the
derivative with respect to x of the equation given in the
hint, show that $\sum_{n=0}^{\infty} n e^{nx} = e^{x}/(1 - e^{x})^{2}$. (c) Use this result
to derive Eq. 3.40 from Eq. 3.39. (d) Show that $E_{\rm av} \cong kT$ at
large λ and $E_{\rm av} \to 0$ for small λ.
15. By differentiating Eq. 3.41 show that I(λ) has its maximum
as expected according to Wien’s displacement law, Eq. 3.27.
16. Integrate Eq. 3.41 to obtain Eq. 3.26. Use the definite integral $\int_0^{\infty} x^3\,dx/(e^{x} - 1) = \pi^4/15$ to obtain Eq. 3.42 relating
the Stefan-Boltzmann constant to Planck’s constant.
17. Use the numerical value of the Stefan-Boltzmann constant to
find the numerical value of Planck’s constant from Eq. 3.42.
18. The surface of the Sun has a temperature of about 6000 K. At
what wavelength does the Sun emit its peak intensity? How
does this compare with the peak sensitivity of the human eye?
19. The universe is filled with thermal radiation, which has a
blackbody spectrum at an effective temperature of 2.7 K (see
Chapter 15). What is the peak wavelength of this radiation?
What is the energy (in eV) of quanta at the peak wavelength?
In what region of the electromagnetic spectrum is this peak
wavelength?
20. (a) Assuming the human body (skin temperature 34°C)
to behave like an ideal thermal radiator, find the wavelength where the intensity from the body is a maximum.
In what region of the electromagnetic spectrum is radiation
with this wavelength? (b) Making whatever (reasonable)
assumptions you may need, estimate the power radiated by
a typical person isolated from the surroundings. (c) Estimate
the radiation power absorbed by a person in a room in which
the temperature is 20°C.
21. A cavity is maintained at a temperature of 1650 K. At
what rate does energy escape from the interior of the cavity
through a hole in its wall of diameter 1.00 mm?
22. An analyzer for thermal radiation is set to accept wavelengths in an interval of 1.55 nm. What is the intensity of the
radiation in that interval at a wavelength of 875 nm emitted
from a glowing object whose temperature is 1675 K?
23. (a) Assuming the Sun to radiate like an ideal thermal source
at a temperature of 6000 K, what is the intensity of the
solar radiation emitted in the range 550.0 nm to 552.0 nm?
(b) What fraction of the total solar radiation does this
represent?
3.4 The Compton Effect
24. Show how Eq. 3.48 follows from Eq. 3.47.
25. Incident photons of energy 10.39 keV are Compton scattered, and the scattered beam is observed at 45.00° relative
to the incident beam. (a) What is the energy of the scattered
photons at that angle? (b) What is the kinetic energy of the
scattered electrons?
26. X-ray photons of wavelength 0.02480 nm are incident on a
target and the Compton-scattered photons are observed at
90.0°. (a) What is the wavelength of the scattered photons?
(b) What is the momentum of the incident photons? Of the
scattered photons? (c) What is the kinetic energy of the
scattered electrons? (d) What is the momentum (magnitude
and direction) of the scattered electrons?
27. High-energy gamma rays can reach a radiation detector
by Compton scattering from the surroundings, as shown
in Figure 3.26. This effect is known as back-scattering.
Show that, when $E \gg m_e c^2$, the back-scattered photon has
an energy of approximately 0.25 MeV, independent of the
energy of the original photon, when the scattering angle is
nearly 180°.
FIGURE 3.26 Problem 27.
28. Gamma rays of energy 0.662 MeV are Compton scattered.
(a) What is the energy of the scattered photon observed at a
scattering angle of 60.0°? (b) What is the kinetic energy of
the scattered electrons?
3.5 Other Photon Processes
29. Suppose an atom of iron at rest emits an X-ray photon
of energy 6.4 keV. Calculate the “recoil” momentum and
kinetic energy of the atom. (Hint: Do you expect to need
classical or relativistic kinetic energy for the atom? Is the
kinetic energy likely to be much smaller than the atom’s rest
energy?)
30. What is the minimum X-ray wavelength produced in
bremsstrahlung by electrons that have been accelerated
through $2.50 \times 10^{4}$ V?
31. An atom absorbs a photon of wavelength 375 nm and immediately emits another photon of wavelength 580 nm. What
is the net energy absorbed by the atom in this process?
General Problems
32. A certain green light bulb emits at a single wavelength of
550 nm. It consumes 55 W of electrical power and is 75%
efficient in converting electrical energy into light. (a) How
many photons does the bulb emit in one hour? (b) Assuming
the emitted photons to be distributed uniformly in space,
how many photons per second strike a 10 cm by 10 cm paper
held facing the bulb at a distance of 1.0 m?
33. When sodium metal is illuminated with light of wavelength
$4.20 \times 10^{2}$ nm, the stopping potential is found to be 0.65 V;
when the wavelength is changed to $3.10 \times 10^{2}$ nm, the
stopping potential is 1.69 V. Using only these data and the
values of the speed of light and the electronic charge, find the
work function of sodium and a value of Planck’s constant.
34. A photon of wavelength 192 nm strikes an aluminum surface along a line perpendicular to the surface and releases
a photoelectron traveling in the opposite direction. Assume
the recoil momentum is taken up by a single aluminum
atom on the surface. Calculate the recoil kinetic energy of
the atom. Would this recoil energy significantly affect the
kinetic energy of the photoelectron?
35. A certain cavity has a temperature of 1150 K. (a) At what
wavelength will the intensity of the radiation inside the cavity
have its maximum value? (b) As a fraction of the maximum
intensity, what is the intensity at twice the wavelength found
in part (a)?
36. In Compton scattering, calculate the maximum kinetic
energy given to the scattered electron for a given photon
energy.
37. The COBE satellite was launched in 1989 to study the
cosmic background radiation and measure its temperature.
By measuring at many different wavelengths, researchers
were able to show that the background radiation exactly
followed the spectral distribution expected for a blackbody. At a wavelength of 0.133 cm, the radiant intensity is
$1.440 \times 10^{-7}$ W/m$^2$ in a wavelength interval of 0.00833 cm.
What is the temperature of the radiation that would be
deduced from these data?
38. The WMAP satellite launched in 2001 studied the cosmic
microwave background radiation and was able to chart small
fluctuations in the temperature of different regions of the
background radiation. These fluctuations in temperature correspond to regions of large and small density in the early
universe. The satellite was able to measure differences in
temperature of $2 \times 10^{-5}$ K at a temperature of 2.7250 K. At
the peak wavelength, what is the difference in the radiation
intensity per unit wavelength interval between the “hot” and
“cold” regions of the background radiation?
39. You have been hired as an engineer on a NASA project
to design a microwave spectrometer for an orbital mission to measure the cosmic background radiation, which
has a blackbody spectrum with an effective temperature
of 2.725 K. (a) The spectrometer is to scan the sky
between wavelengths of 0.50 mm and 5.0 mm, and at each
wavelength it accepts radiation in a wavelength range of
$3.0 \times 10^{-4}$ mm. What maximum and minimum radiation
intensity do you expect to find in this region? (b) The
photon detector in the spectrometer is in the form of a
disk of diameter 0.86 cm. How many photons per second
will the spectrometer record at its maximum and minimum
intensities?
40. A photon of wavelength 7.52 pm scatters from a free electron at rest. After the interaction, the electron is observed to
be moving in the direction of the original photon. Find the
momentum of the electron.
41. A hydrogen atom is moving at a speed of 125.0 m/s. It
absorbs a photon of wavelength 97 nm that is moving in the
opposite direction. By how much does the speed of the atom
change as a result of absorbing the photon?
42. Before a positron and an electron annihilate, they form a sort
of “atom” in which each orbits about their common center
of mass with identical speeds. As a result of this motion, the
photons emitted in the annihilation show a small Doppler
shift. In one experiment, the Doppler shift in energy of the
photons was observed to be 2.41 keV. (a) What would be
the speed of the electron or positron before the annihilation
to produce this Doppler shift? (b) The positrons form these
atom-like structures with the nearly “free” electrons in a
solid. Assuming the positron and electron must have about
the same speed to form this structure, find the kinetic energy
of the electron. This technique, called “Doppler broadening,” is an important method for learning about the energies
of electrons in materials.
43. Prove that it is not possible to conserve both momentum
and total relativistic energy in the following situation: A
free electron moving at velocity $\vec{v}$ emits a photon and then
moves at a slower velocity $\vec{v}\,'$.
44. A photon of energy E interacts with an electron at rest
and undergoes pair production, producing a positive electron (positron) and an electron (in addition to the original
electron):
$$\text{photon} + e^- \to e^+ + e^- + e^-$$
The two electrons and the positron move off with identical momenta in the direction of the initial photon. Find
the kinetic energy of the three final particles and find the
energy E of the photon. (Hint: Conserve momentum and
total relativistic energy.)
Chapter 4
THE WAVELIKE PROPERTIES OF PARTICLES
Just as we produce images from light waves that scatter from objects, we can also form
images from “particle waves.” The electron microscope produces images from electron
waves that enable us to visualize objects on a scale that is much smaller than the wavelength
of light. The ability to observe individual human cells and even sub-cellular objects such as
chromosomes has revolutionized our understanding of biological processes. It is even
possible to form images of a single atom, such as this cobalt atom on a gold surface. The
ripples on the surface show electrons from gold atoms reacting to the presence of the
intruder.
Chapter 4 | The Wavelike Properties of Particles
In classical physics, the laws describing the behavior of waves and particles are fundamentally different. Projectiles obey particle-type laws, such as
Newtonian mechanics. Waves undergo interference and diffraction, which cannot
be explained by the Newtonian mechanics associated with particles. The energy
carried by a particle is confined to a small region of space; a wave, on the other
hand, distributes its energy throughout space in its wavefronts. In describing the
behavior of a particle we often want to specify its location, but this is not so easy
to do for a wave. How would you describe the exact location of a sound wave or
a water wave?
In contrast to this clear distinction found in classical physics, quantum physics
requires that particles sometimes obey the rules that we have previously established
for waves, and we shall use some of the language associated with waves to describe
particles. The system of mechanics associated with quantum systems is sometimes
called “wave mechanics” because it deals with the wavelike behavior of particles.
In this chapter we discuss the experimental evidence in support of this wavelike
behavior for particles such as electrons.
As you study this chapter, notice the frequent references to such terms as the
probability of the outcome of a measurement, the average of many repetitions
of a measurement, and the statistical behavior of a system. These terms are
fundamental to quantum mechanics, and you cannot begin to understand quantum
behavior until you feel comfortable with discarding such classical notions as fixed
trajectories and certainty of outcome, while substituting the quantum mechanical
notions of probability and statistically distributed outcomes.
4.1 DE BROGLIE’S HYPOTHESIS
Louis de Broglie (1892–1987,
France). A member of an aristocratic
family, his work contributed substantially to the early development of the
quantum theory.
Progress in physics often can be characterized by long periods of experimental
and theoretical drudgery punctuated occasionally by flashes of insight that cause
profound changes in the way we view the universe. Frequently the more profound
the insight and the bolder the initial step, the simpler it seems in historical
perspective, and the more likely we are to sit back and wonder, “Why didn’t
I think of that?” Einstein’s special theory of relativity is one example of such
insight; the hypothesis of the Frenchman Louis de Broglie is another.∗
In the previous chapter we discussed the double-slit experiment (which can be
understood only if light behaves as a wave) and the photoelectric and Compton
effects (which can be understood only if light behaves as a particle). Is this dual
particle-wave nature a property only of light or of material objects as well? In
a bold and daring hypothesis in his 1924 doctoral dissertation, de Broglie chose
the latter alternative. Examining Eq. 3.20, E = hf, and Eq. 3.22, p = h/λ, we find
some difficulty in applying the first equation in the case of particles, for we cannot
be sure whether E should be the kinetic energy, total energy, or total relativistic
energy (all, of course, are identical for light). No such difficulties arise from the
second relationship. De Broglie suggested, lacking any experimental evidence in
support of his hypothesis, that associated with any material particle moving with
momentum p there is a wave of wavelength λ, related to p according to
$$\lambda = \frac{h}{p} \tag{4.1}$$
where h is Planck’s constant. The wavelength λ of a particle computed according
to Eq. 4.1 is called its de Broglie wavelength.
∗ De Broglie’s name should be pronounced “deh-BROY” or “deh-BROY-eh,” but it is often said as
“deh-BROH-lee.”
Example 4.1
Compute the de Broglie wavelength of the following: (a) A
1000-kg automobile traveling at 100 m/s (about 200 mi/h).
(b) A 10-g bullet traveling at 500 m/s. (c) A smoke particle
of mass $10^{-9}$ g moving at 1 cm/s. (d) An electron with
a kinetic energy of 1 eV. (e) An electron with a kinetic
energy of 100 MeV.
Solution
(a) Using the classical relation between velocity and
momentum,
$$\lambda = \frac{h}{p} = \frac{h}{mv} = \frac{6.6 \times 10^{-34}\ \text{J·s}}{(10^{3}\ \text{kg})(100\ \text{m/s})} = 6.6 \times 10^{-39}\ \text{m}$$
(b) As in part (a),
$$\lambda = \frac{h}{mv} = \frac{6.6 \times 10^{-34}\ \text{J·s}}{(10^{-2}\ \text{kg})(500\ \text{m/s})} = 1.3 \times 10^{-34}\ \text{m}$$
(c)
$$\lambda = \frac{h}{mv} = \frac{6.6 \times 10^{-34}\ \text{J·s}}{(10^{-12}\ \text{kg})(10^{-2}\ \text{m/s})} = 6.6 \times 10^{-20}\ \text{m}$$
(d) The rest energy ($mc^2$) of an electron is $5.1 \times 10^{5}$ eV.
Because the kinetic energy (1 eV) is much less than the rest
energy, we can use nonrelativistic kinematics.
$$p = \sqrt{2mK} = \sqrt{2(9.1 \times 10^{-31}\ \text{kg})(1\ \text{eV})(1.6 \times 10^{-19}\ \text{J/eV})} = 5.4 \times 10^{-25}\ \text{kg·m/s}$$
Then,
$$\lambda = \frac{h}{p} = \frac{6.6 \times 10^{-34}\ \text{J·s}}{5.4 \times 10^{-25}\ \text{kg·m/s}} = 1.2 \times 10^{-9}\ \text{m} = 1.2\ \text{nm}$$
We can also find this solution in the following way, using
$p = \sqrt{2mK}$ and $hc = 1240$ eV · nm.
$$cp = c\sqrt{2mK} = \sqrt{2(mc^2)K} = \sqrt{2(5.1 \times 10^{5}\ \text{eV})(1\ \text{eV})} = 1.0 \times 10^{3}\ \text{eV}$$
$$\lambda = \frac{h}{p} = \frac{hc}{pc} = \frac{1240\ \text{eV·nm}}{1.0 \times 10^{3}\ \text{eV}} = 1.2\ \text{nm}$$
This method may seem artificial at first, but with practice it becomes quite useful, especially because energies
are usually given in electron-volts in atomic and nuclear
physics.
(e) In this case, the kinetic energy is much greater than the
rest energy, and so we are in the extreme relativistic realm,
where $K \cong E \cong pc$, as in Eq. 2.40. The wavelength is
$$\lambda = \frac{hc}{pc} = \frac{1240\ \text{MeV·fm}}{100\ \text{MeV}} = 12\ \text{fm}$$
Note that the wavelengths computed in parts (a), (b), and (c) are far too small to be
observed in the laboratory. Only in the last two cases, in which the wavelength is
of the same order as atomic or nuclear sizes, do we have any chance of observing
the wavelength. Because of the smallness of h, only for particles of atomic or
nuclear size will the wave behavior be observable.
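The calculations of Example 4.1 can be checked with a short script. This is a minimal sketch with assumptions of its own: the function names are invented here, the constants are rounded standard values, and the electron routine uses the general relativistic relation (pc)² = K² + 2K(mc²), which reduces to the nonrelativistic form of part (d) at low energy and to pc ≅ K of part (e) at high energy.

```python
import math

H_J_S = 6.626e-34     # Planck's constant in J·s (rounded standard value)
HC_EV_NM = 1239.84    # h*c in eV·nm (rounded standard value)
ME_C2_EV = 0.511e6    # electron rest energy in eV (rounded standard value)


def de_broglie_from_mass_speed(mass_kg, speed_m_s):
    """lambda = h/(m*v) in meters, for slow (nonrelativistic) objects."""
    return H_J_S / (mass_kg * speed_m_s)


def de_broglie_electron_nm(kinetic_energy_ev):
    """lambda = hc/(pc) in nm, with (pc)^2 = K^2 + 2*K*m*c^2."""
    pc_ev = math.sqrt(kinetic_energy_ev ** 2
                      + 2.0 * kinetic_energy_ev * ME_C2_EV)
    return HC_EV_NM / pc_ev


if __name__ == "__main__":
    print("automobile:", de_broglie_from_mass_speed(1000.0, 100.0), "m")
    print("bullet:", de_broglie_from_mass_speed(1.0e-2, 500.0), "m")
    print("smoke particle:", de_broglie_from_mass_speed(1.0e-12, 1.0e-2), "m")
    print("1-eV electron:", de_broglie_electron_nm(1.0), "nm")
    print("100-MeV electron:", de_broglie_electron_nm(100.0e6) * 1.0e6, "fm")
```

Using one relativistic expression for the electron avoids having to decide in advance which approximation applies; the results agree with the example to the two significant figures quoted there.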
Two questions immediately follow. First, just what sort of wave is it that has
this de Broglie wavelength? That is, what does the amplitude of the de Broglie
wave measure? We’ll discuss the answer to this question later in this chapter.
For now, we assume that, associated with the particle as it moves, there is a de
Broglie wave of wavelength λ, which shows itself when a wave-type experiment
(such as diffraction) is performed on it. The outcome of the wave-type experiment
depends on this wavelength. The de Broglie wavelength, which characterizes the
wave-type behavior of particles, is central to the quantum theory.
The second question then occurs: Why was this wavelength not directly
observed before de Broglie’s time? As parts (a), (b), and (c) of Example 4.1
showed, for ordinary objects the de Broglie wavelength is very small. Suppose we
tried to demonstrate the wave nature of these objects through a double-slit type
of experiment. Recall from Eq. 3.16 that the spacing between adjacent fringes
in a double-slit experiment is Δy = λD/d. Putting in reasonable values for the
slit separation d and slit-to-screen distance D, you will find that there is no
achievable experimental configuration that can produce an observable separation
of the fringes (see Problem 9). There is no experiment that can be done to
reveal the wave nature of macroscopic (laboratory-sized) objects. Experimental
verification of de Broglie’s hypothesis comes only from experiments with objects
on the atomic scale, which are discussed in the next section.
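To get a feeling for the numbers, we can evaluate Δy = λD/d for the objects of Example 4.1. This is only a rough sketch: the slit separation d = 1 μm and the slit-to-screen distance D = 1 m are values assumed here for illustration, not taken from the text or from Problem 9, and the function name is our own.

```python
def fringe_spacing(wavelength_m, screen_distance_m, slit_separation_m):
    """Spacing between adjacent double-slit fringes: delta_y = lambda * D / d."""
    return wavelength_m * screen_distance_m / slit_separation_m


SCREEN_DISTANCE_M = 1.0      # assumed slit-to-screen distance D
SLIT_SEPARATION_M = 1.0e-6   # assumed slit separation d

# De Broglie wavelengths from Example 4.1, in meters
wavelengths_m = {
    "automobile": 6.6e-39,
    "bullet": 1.3e-34,
    "smoke particle": 6.6e-20,
    "1-eV electron": 1.2e-9,
}

spacings_m = {
    name: fringe_spacing(lam, SCREEN_DISTANCE_M, SLIT_SEPARATION_M)
    for name, lam in wavelengths_m.items()
}

for name, spacing in spacings_m.items():
    print(f"{name:15s} fringe spacing = {spacing:.1e} m")
```

With these assumed values, even the smoke particle gives fringes separated by less than $10^{-13}$ m, far smaller than an atom, while the 1-eV electron gives a spacing of about a millimeter.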
4.2 EXPERIMENTAL EVIDENCE FOR DE BROGLIE WAVES
The indications of wave behavior come mostly from interference and diffraction
experiments. Double-slit interference, which was reviewed in Section 3.1, is
perhaps the most familiar type of interference experiment, but the experimental
difficulties of constructing double slits to do interference experiments with beams
of atomic or subatomic particles were not solved until long after the time of de
Broglie’s hypothesis. We discuss these experiments later in this section. First
we’ll discuss diffraction experiments with electrons.
Particle Diffraction Experiments
Diffraction of light waves is discussed in most introductory physics texts and is
illustrated in Figure 4.1 for light diffracted by a single slit. For light of wavelength
λ incident on a slit of width a, the diffraction minima are located at angles given by
$$a \sin\theta = n\lambda \qquad n = 1, 2, 3, \ldots \tag{4.2}$$
on either side of the central maximum. Note that most of the light intensity falls
in the central maximum.
FIGURE 4.1 Light waves (represented as plane wave fronts) are incident on a narrow slit of width a.
Diffraction causes the waves to spread after passing through the slit, and the
intensity varies along the screen. The photograph shows the resulting intensity pattern.
The experiments that first verified de Broglie’s hypothesis involve electron
diffraction, not through an artificially constructed single slit (as for the diffraction
pattern in Figure 4.1) but instead through the atoms of a crystal. The outcomes
of these experiments resemble those of the similar X-ray diffraction experiments
illustrated in Section 3.1.
In an electron diffraction experiment, a beam of electrons is accelerated from
rest through a potential difference ΔV, acquiring a nonrelativistic kinetic energy
$K = e\,\Delta V$ and a momentum $p = \sqrt{2mK}$. Wave mechanics would describe
the beam of electrons as a wave of wavelength λ = h/p. The beam strikes a
crystal, and the scattered beam is photographed (Figure 4.2). The similarity
between electron diffraction patterns (Figure 4.2) and X-ray diffraction patterns
(Figure 3.7) strongly suggests that the electrons are behaving as waves.
The “rings” produced in X-ray diffraction of polycrystalline materials
(Figure 3.8b) are also produced in electron diffraction, as shown in Figure 4.3,
again providing strong evidence for the similarity in the wave behavior of
electrons and X rays. Experiments of the type illustrated in Figure 4.3 were
first done in 1927 by G. P. Thomson, who shared the 1937 Nobel Prize for this
work. (Thomson’s father, J. J. Thomson, received the 1906 Nobel Prize for his
discovery of the electron and measurement of its charge-to-mass ratio. Thus
it can be said that Thomson, the father, discovered the particle nature of the
electron, while Thomson, the son, discovered its wave nature.)
An electron diffraction experiment gave the first experimental confirmation
of the wave nature of electrons (and the quantitative confirmation of the de
Broglie relationship λ = h/p) soon after de Broglie’s original hypothesis. In
1926, at the Bell Telephone Laboratories, Clinton Davisson and Lester Germer
were investigating the reflection of electron beams from the surface of nickel
crystals. A schematic view of their apparatus is shown in Figure 4.4. A beam of
electrons from a heated filament is accelerated through a potential difference V .
After passing through a small aperture, the beam strikes a single crystal of nickel.
Electrons are scattered in all directions by the atoms of the crystal, some of them
striking a detector, which can be moved to any angle φ relative to the incident
beam and which measures the intensity of the electron beam scattered at that angle.
Figure 4.5 shows the results of one of the experiments of Davisson and Germer.
When the accelerating voltage is set at 54 V, there is an intense reflection of the
beam at the angle φ = 50°. Let’s see how these results give confirmation of the
de Broglie wavelength.
FIGURE 4.2 (Top) Electron diffraction apparatus. (Bottom) Electron
diffraction pattern. Each bright dot is
a region of constructive interference,
as in the X-ray diffraction patterns of
Figure 3.7. The target is a crystal of
Ti$_2$Nb$_{10}$O$_{29}$.
FIGURE 4.3 Electron diffraction of
polycrystalline beryllium. Note the similarity between this pattern and the
pattern for X-ray diffraction of a polycrystalline material (Figure 3.8b).
FIGURE 4.4 Apparatus used by
Davisson and Germer to study
electron diffraction. Electrons
leave the filament F and are accelerated by the voltage V . The beam
strikes a crystal and the scattered
beam is detected at an angle φ
relative to the incident beam. The
detector can be moved in the range
0 to 90°.
FIGURE 4.5 Results of Davisson
and Germer. Each point on the plot
represents the relative intensity
when the detector in Figure 4.4 is
located at the corresponding angle
φ measured from the vertical axis.
Constructive interference causes
the intensity of the reflected beam
to reach a maximum at φ = 50° for
V = 54 V.
Each of the atoms of the crystal can act as a scatterer, so the scattered electron
waves can interfere, and we have a crystal diffraction grating for the electrons.
Figure 4.6 shows a simplified representation of the nickel crystal used in the
Davisson-Germer experiment. Because the electrons were of low energy, they did
not penetrate very far into the crystal, and it is sufficient to consider the diffraction
to take place in the plane of atoms on the surface. The situation is entirely similar
to using a reflection-type diffraction grating for light; the spacing d between the
rows of atoms on the crystal is analogous to the spacing between the slits in the
optical grating. The maxima for a diffraction grating occur at angles φ such that
the path difference between adjacent rays d sin φ is equal to a whole number of
wavelengths:
$$d \sin\phi = n\lambda \qquad n = 1, 2, 3, \ldots \tag{4.3}$$
where n is the order number of the maximum.
FIGURE 4.6 The crystal surface acts like a diffraction grating with spacing d.
From independent data, it is known that the spacing between the rows of atoms
in a nickel crystal is d = 0.215 nm. The peak at φ = 50° must be a first-order
peak (n = 1), because no peaks were observed at smaller angles. If this is indeed
an interference maximum, the corresponding wavelength is, from Eq. 4.3,
$$\lambda = d \sin\phi = (0.215\ \mathrm{nm})(\sin 50^\circ) = 0.165\ \mathrm{nm}$$
We can compare this value with that expected on the basis of the de Broglie
theory. An electron accelerated through a potential difference of 54 V has a kinetic
energy of 54 eV and therefore a momentum of
$$p = \sqrt{2mK} = \frac{1}{c}\sqrt{2mc^2K} = \frac{1}{c}\sqrt{2(511{,}000\ \mathrm{eV})(54\ \mathrm{eV})} = \frac{1}{c}(7430\ \mathrm{eV})$$
The de Broglie wavelength is λ = h/p = hc/pc. Using hc = 1240 eV · nm,
$$\lambda = \frac{hc}{pc} = \frac{1240\ \mathrm{eV \cdot nm}}{7430\ \mathrm{eV}} = 0.167\ \mathrm{nm}$$
This is in excellent agreement with the value found from the diffraction maximum, and provides strong evidence in favor of the de Broglie theory. For this
experimental work, Davisson shared the 1937 Nobel Prize with G. P. Thomson.
FIGURE 4.7 Diffraction of neutrons by a sodium chloride crystal.
FIGURE 4.8 Diffraction of 1-GeV protons by oxygen nuclei (intensity of scattered
protons versus scattering angle in degrees). The pattern of maxima and minima is similar
to that of single-slit diffraction of light waves. [Source: H. Palevsky et
al., Physical Review Letters 18, 1200 (1967).]
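The comparison made above for the Davisson-Germer data can be repeated numerically. The following is a minimal sketch with our own function names; it uses the rounded constants from the text (hc = 1240 eV · nm, mc² = 511,000 eV) and the nonrelativistic momentum, which is an excellent approximation at 54 eV.

```python
import math

HC_EV_NM = 1240.0      # h*c in eV*nm (rounded, as in the text)
ME_C2_EV = 511_000.0   # electron rest energy in eV


def grating_wavelength_nm(spacing_nm, angle_deg, order=1):
    """Wavelength from the grating condition d sin(phi) = n lambda (Eq. 4.3)."""
    return spacing_nm * math.sin(math.radians(angle_deg)) / order


def electron_wavelength_nm(kinetic_ev):
    """Nonrelativistic de Broglie wavelength: lambda = hc / sqrt(2 mc^2 K)."""
    pc_ev = math.sqrt(2.0 * ME_C2_EV * kinetic_ev)
    return HC_EV_NM / pc_ev


lambda_grating_nm = grating_wavelength_nm(0.215, 50.0)   # from the peak at 50 degrees
lambda_de_broglie_nm = electron_wavelength_nm(54.0)      # from the 54-V acceleration

print(f"from diffraction peak: {lambda_grating_nm:.3f} nm")
print(f"from de Broglie:       {lambda_de_broglie_nm:.3f} nm")
print(f"relative difference:   "
      f"{abs(lambda_grating_nm - lambda_de_broglie_nm) / lambda_de_broglie_nm:.1%}")
```

The two values differ by roughly 1%, which is the agreement described in the text.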
The wave nature of particles is not exclusive to electrons; any particle with
momentum p has de Broglie wavelength h/p. Neutrons are produced in nuclear
reactors with kinetic energies corresponding to wavelengths of roughly 0.1 nm;
these also should be suitable for diffraction by crystals. Figure 4.7 shows that
diffraction of neutrons by a salt crystal produces the same characteristic patterns
as the diffraction of electrons or X rays. Clifford Shull shared the 1994 Nobel
Prize for the development of the neutron diffraction technique.
To study the nuclei of atoms, much smaller wavelengths are needed, of the order
of $10^{-15}$ m. Figure 4.8 shows the diffraction pattern produced by the scattering
of 1-GeV kinetic energy protons by oxygen nuclei. Maxima and minima of the
diffracted intensity appear in a pattern similar to the single-slit diffraction shown
in Figure 4.1. (The intensity at the minima does not fall to zero because nuclei
do not have a sharp boundary. The determination of nuclear sizes from such
diffraction patterns is discussed in Chapter 12.)
Example 4.2
Protons of kinetic energy 1.00 GeV were diffracted by
oxygen nuclei, which have a radius of 3.0 fm, to produce
the data shown in Figure 4.8. Calculate the expected angles
where the first three diffraction minima should appear.
Solution
The total relativistic energy of the protons is $E = K + mc^2$ = 1.00 GeV + 0.94 GeV = 1.94 GeV, so their
momentum is
$$p = \frac{1}{c}\sqrt{E^2 - (mc^2)^2} = \frac{1}{c}\sqrt{(1.94\ \mathrm{GeV})^2 - (0.94\ \mathrm{GeV})^2} = 1.70\ \mathrm{GeV}/c$$
The corresponding de Broglie wavelength is
$$\lambda = \frac{h}{p} = \frac{hc}{pc} = \frac{1240\ \mathrm{MeV \cdot fm}}{1700\ \mathrm{MeV}} = 0.73\ \mathrm{fm}$$
We can represent the oxygen nuclei as circular disks, for
which the diffraction formula is a bit different from Eq. 4.2:
a sin θ = 1.22nλ, where a is the diameter of the diffracting
object. Based on this formula, the first diffraction minimum
(n = 1) should appear at the angle
$$\sin\theta = \frac{1.22n\lambda}{a} = \frac{(1.22)(1)(0.73\ \mathrm{fm})}{6.0\ \mathrm{fm}} = 0.148$$
or θ = 8.5°. Because the sine of the diffraction
angle is proportional to the index n, the n = 2
minimum should appear at the angle where
sin θ = 2 × 0.148 = 0.296 (θ = 17.2°), and the n = 3 minimum
where sin θ = 3 × 0.148 = 0.444 (θ = 26.4°).
From the data in Figure 4.8, we see the first diffraction
minimum at an angle of about 10°, the second at about
18°, and the third at about 27°, all in very good agreement
with the expected values. The data don’t exactly follow
the formula for diffraction by a disk, because nuclei don’t
behave quite like disks. In particular, they have diffuse
rather than sharp edges, which prevents the intensity at
the diffraction minima from falling to zero and also alters
slightly the locations of the minima.
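The steps of Example 4.2 can be sketched in a few lines of code. This is an illustrative check under the same assumptions as the example (nuclei treated as disks of diameter 6.0 fm, minima at sin θ = 1.22nλ/a); the function names are our own.

```python
import math

HC_MEV_FM = 1240.0   # h*c in MeV*fm (rounded, as in the text)


def relativistic_pc(kinetic_mev, rest_energy_mev):
    """Momentum times c in MeV, from (pc)^2 = E^2 - (mc^2)^2 with E = K + mc^2."""
    total_energy = kinetic_mev + rest_energy_mev
    return math.sqrt(total_energy**2 - rest_energy_mev**2)


def disk_minima_deg(wavelength, diameter, max_order=3):
    """Angles (degrees) where sin(theta) = 1.22 n lambda / a, the form used in Example 4.2."""
    angles = []
    for n in range(1, max_order + 1):
        sin_theta = 1.22 * n * wavelength / diameter
        if sin_theta <= 1.0:   # skip orders that have no real angle
            angles.append(math.degrees(math.asin(sin_theta)))
    return angles


pc_mev = relativistic_pc(1000.0, 940.0)    # 1.00-GeV protons, mc^2 = 0.94 GeV
wavelength_fm = HC_MEV_FM / pc_mev
minima_deg = disk_minima_deg(wavelength_fm, 6.0)   # diameter a = 2 x 3.0 fm

print(f"pc = {pc_mev:.0f} MeV, wavelength = {wavelength_fm:.2f} fm")
for n, theta in enumerate(minima_deg, start=1):
    print(f"n = {n}: theta = {theta:.1f} degrees")
```

Because the script carries more digits than the hand calculation, its angles can differ from the values above in the last decimal place.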
Double-Slit Experiments with Particles
The definitive evidence for the wave nature of light was deduced from the
double-slit experiment performed by Thomas Young in 1801 (discussed in
Section 3.1). In principle, it should be possible to do double-slit experiments
with particles and thereby directly observe their wavelike behavior. However, the
technological difficulties of producing double slits for particles are formidable,
and such experiments did not become possible until long after the time of de
Broglie. The first double-slit experiment with electrons was done in 1961. A
diagram of the apparatus is shown in Figure 4.9. The electrons from a hot filament
were accelerated through 50 kV (corresponding to λ = 5.4 pm) and then passed
through a double slit of separation 2.0 μm and width 0.5 μm. A photograph of
the resulting intensity pattern is shown in Figure 4.10. The similarity with the
double-slit pattern for light (Figure 3.2) is striking.
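The wavelength quoted for the 50-kV electrons can be checked as follows. At 50 keV the kinetic energy is about 10% of the electron rest energy, so this sketch compares the relativistic expression $(pc)^2 = K^2 + 2K\,mc^2$ with the nonrelativistic one. The function name and the constants (hc ≈ 1239.84 eV · nm, mc² ≈ 510,999 eV) are our own choices, not values given in the text.

```python
import math

HC_EV_NM = 1239.84     # h*c in eV*nm
ME_C2_EV = 510_999.0   # electron rest energy in eV


def electron_wavelength_pm(kinetic_ev, relativistic=True):
    """De Broglie wavelength of an electron in picometers."""
    if relativistic:
        pc_ev = math.sqrt(kinetic_ev**2 + 2.0 * kinetic_ev * ME_C2_EV)
    else:
        pc_ev = math.sqrt(2.0 * kinetic_ev * ME_C2_EV)
    return 1000.0 * HC_EV_NM / pc_ev   # 1 nm = 1000 pm


lambda_rel_pm = electron_wavelength_pm(50_000.0)
lambda_nonrel_pm = electron_wavelength_pm(50_000.0, relativistic=False)

print(f"relativistic:    {lambda_rel_pm:.2f} pm")
print(f"nonrelativistic: {lambda_nonrel_pm:.2f} pm")
```

The relativistic result is the one that rounds to the 5.4 pm given above; the nonrelativistic formula overestimates the wavelength by about 2%.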
A similar experiment can be done for neutrons. A beam of neutrons from
a nuclear reactor can be slowed to a room-temperature “thermal” energy
distribution (average K ≈ kT ≈ 0.025 eV), and a specific wavelength can be
selected by a scattering process similar to Bragg diffraction (see Eq. 3.18
and Problem 32 at the end of the present chapter). In one experiment, neutrons of kinetic energy 0.00024 eV and de Broglie wavelength 1.85 nm passed
through a gap of diameter 148 μm in a material that absorbs virtually all
of the neutrons incident on it (Figure 4.11). In the center of the gap was a
boron wire (also highly absorptive for neutrons) of diameter 104 μm. The neutrons could pass on either side of the wire through slits of width 22 μm. The
intensity of neutrons that pass through this double slit was observed by sliding
another slit across the beam and measuring the intensity of neutrons passing
through this “scanning slit.” Figure 4.12 shows the resulting pattern of intensity
maxima and minima, which leaves no doubt that interference is occurring and that
the neutrons have a corresponding wave nature. The wavelength can be deduced
from the slit separation using Eq. 3.16 to obtain the spacing between adjacent
maxima, $\Delta y = y_{n+1} - y_n$. Estimating the spacing Δy from Figure 4.12 to be about
75 μm, we obtain
$$\lambda = \frac{d\,\Delta y}{D} = \frac{(126\ \mu\mathrm{m})(75\ \mu\mathrm{m})}{5\ \mathrm{m}} = 1.89\ \mathrm{nm}$$
This result agrees very well with the de Broglie wavelength of 1.85 nm selected
for the neutron beam.
FIGURE 4.9 Double-slit apparatus for electrons. Electrons from the filament F
are accelerated through 50 kV and pass through the double slit. They
produce a visible pattern when they strike a fluorescent screen (like a TV
screen), and the resulting pattern is photographed. A photograph is shown
in Figure 4.10. [See C. Jonsson, American Journal of Physics 42, 4 (1974).]
FIGURE 4.10 Double-slit interference pattern for electrons.
FIGURE 4.11 Double-slit apparatus for neutrons. Thermal neutrons from a reactor
are incident on a crystal; scattering through a particular angle selects the energy of
the neutrons. After passing through the double slit, the neutrons are counted by the
scanning slit assembly, which moves laterally.
FIGURE 4.12 Intensity pattern observed for double-slit interference
with neutrons. The spacing between the maxima is about 75 μm. [Source:
R. Gahler and A. Zeilinger, American Journal of Physics 59, 316 (1991).]
FIGURE 4.13 Intensity pattern observed for double-slit interference
with helium atoms. [Source: O. Carnal and J. Mlynek, Physical Review
Letters 66, 2689 (1991).]
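Both neutron wavelengths discussed above, the one fixed by the kinetic energy and the one deduced from the fringe spacing, can be reproduced with a short calculation. This is a sketch with our own function names; the neutron rest energy (about 939.565 MeV) is a value we supply, and the slit separation is taken as the wire diameter plus one slit width (104 μm + 22 μm = 126 μm).

```python
import math

HC_EV_NM = 1239.84      # h*c in eV*nm
MN_C2_EV = 939.565e6    # neutron rest energy in eV


def neutron_wavelength_nm(kinetic_ev):
    """Nonrelativistic de Broglie wavelength: lambda = hc / sqrt(2 mc^2 K)."""
    return HC_EV_NM / math.sqrt(2.0 * MN_C2_EV * kinetic_ev)


def wavelength_from_fringes_nm(slit_separation_m, fringe_spacing_m, screen_distance_m):
    """Wavelength in nm from the double-slit relation lambda = d * delta_y / D."""
    return 1.0e9 * slit_separation_m * fringe_spacing_m / screen_distance_m


slit_separation_m = 104e-6 + 22e-6   # center-to-center distance of the two slits

lambda_de_broglie_nm = neutron_wavelength_nm(0.00024)
lambda_fringes_nm = wavelength_from_fringes_nm(slit_separation_m, 75e-6, 5.0)

print(f"de Broglie wavelength for K = 0.00024 eV: {lambda_de_broglie_nm:.2f} nm")
print(f"wavelength from fringe spacing:           {lambda_fringes_nm:.2f} nm")
```

The difference of about 2% is well within the precision with which the 75-μm fringe spacing can be read from the figure.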
It is also possible to do a similar experiment with atoms. In this case, a
source of helium atoms formed a beam (of velocity corresponding to a kinetic
energy of 0.020 eV) that passed through a double slit of separation 8 μm and
width 1 μm. Again a scanning slit was used to measure the intensity of the beam
passing through the double slit. Figure 4.13 shows the resulting intensity pattern.
Although the results are not as dramatic as those for electrons and neutrons, there
is clear evidence of interference maxima and minima, and the separation of the
maxima gives a wavelength that is consistent with the de Broglie wavelength (see
Problem 8).
Diffraction can be observed with even larger objects. Figure 4.14 shows the
pattern produced by fullerene molecules (C60) in passing through a diffraction
grating with a spacing of d = 100 nm. The diffraction pattern was observed at
a distance of 1.2 m from the grating. Estimating the separation of the maxima
in Figure 4.14 as 50 μm, we get the angular separation of the maxima to be
θ ≈ tan θ = (50 μm)/(1.2 m) = $4.2 \times 10^{-5}$ rad, and thus λ = d sin θ = 4.2 pm.
For C60 molecules with a speed of 117 m/s used in this experiment, the expected
de Broglie wavelength is 4.7 pm, in good agreement with our estimate from the
diffraction pattern.
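The two C60 wavelengths can be compared directly. In this sketch the molecular mass is taken as 60 × 12 u, which assumes pure carbon-12 (our simplification; the text does not state the mass used), and the variable names are our own.

```python
import math

H = 6.626e-34        # Planck's constant in J*s (rounded)
U_KG = 1.6605e-27    # atomic mass unit in kg (rounded)

# Expected de Broglie wavelength, lambda = h / (m v)
mass_c60_kg = 60 * 12.0 * U_KG     # assumes 60 atoms of carbon-12
speed_m_s = 117.0
lambda_de_broglie_pm = 1.0e12 * H / (mass_c60_kg * speed_m_s)

# Wavelength from the grating pattern, lambda = d sin(theta) for the first order
grating_spacing_m = 100e-9
theta_rad = math.atan(50e-6 / 1.2)   # maxima separated by 50 um at 1.2 m
lambda_diffraction_pm = 1.0e12 * grating_spacing_m * math.sin(theta_rad)

print(f"expected de Broglie wavelength: {lambda_de_broglie_pm:.1f} pm")
print(f"from diffraction pattern:       {lambda_diffraction_pm:.1f} pm")
```

The estimate from the pattern is limited mainly by how well the 50-μm separation of the maxima can be read from the figure.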
In this chapter we have discussed several interference and diffraction
experiments using different particles—electrons, protons, neutrons, atoms,
and molecules. These experiments are not restricted to any particular type of
particle or to any particular type of observation. They are examples of a general
phenomenon, the wave nature of particles, that was unobserved before 1920
because the necessary experiments had not yet been done. Today this wave
nature is used as a basic tool by scientists. For example, neutron diffraction
gives detailed information on the structure of solid crystals and of complex
molecules (Figure 4.15). The electron microscope uses electron waves to illuminate and form an image of objects; because the wavelength can be made thousands
of times smaller than that of visible light, it is possible to resolve and observe
small details that are not observable with visible light (Figure 4.16).
FIGURE 4.14 Diffraction grating pattern produced by C60 molecules.
[Source: O. Nairz, M. Arndt, and A. Zeilinger, American Journal of
Physics 71, 319 (2003).]
FIGURE 4.15 The atomic structure of solid benzene as deduced from
neutron diffraction. The circles indicate contours of constant density. The
black circles show the locations of the six carbon atoms that form the familiar
benzene ring. The blue circles show the locations of the hydrogen atoms.
Through Which Slit Does the Particle Pass?
When we do a double-slit experiment with particles such as electrons, it is
tempting to try to determine through which slit the particle passes. For example,
we could surround each slit with an electromagnetic loop that causes a meter to
deflect whenever a charged particle or perhaps a particle with a magnetic moment
passes through the loop (Figure 4.17). If we fired the particles through the slits at
a slow enough rate, we could track each particle as it passed through one slit or
the other and then appeared on the screen.
If we performed this imaginary experiment, the result would no longer be an
interference pattern on the screen. Instead, we would observe a pattern similar to
that shown in Figure 4.17, with “hits” in front of each slit, but no interference
fringes. No matter what sort of device we use to determine through which slit the
particle passes, the interference pattern will be destroyed. The classical particle
must pass through one slit or the other; only a wave can reveal interference,
which depends on parts of the wavefront passing through both slits and then
recombining.
When we ask through which slit the particle passed, we are investigating only
the particle aspects of its behavior, and we cannot observe its wave nature (the
interference pattern). Conversely, when we study the wave nature, we cannot
simultaneously observe the particle nature. The electron will behave as a particle
or a wave, but we cannot observe both aspects of its behavior simultaneously.
This curious aspect of quantum mechanics was also discussed for photons in
Section 3.6, where we discovered that experiments can reveal either the particle
nature of the photon or its wave nature, but not both aspects simultaneously.
FIGURE 4.16 Electron microscope
image of bacteria on the surface of
a human tongue. The magnification
here is about a factor of 5000.
FIGURE 4.17 Apparatus to record passage of electrons through slits.
Each slit is surrounded by a loop with a meter that signals the passage
of an electron through the slit. No interference fringes are seen on the
screen.
This is the basis for the principle of complementarity, which asserts that the
complete description of a photon or a particle such as an electron cannot be made
in terms of only particle properties or only wave properties, but that both aspects of
its behavior must be considered. Moreover, the particle and wave natures cannot
be observed simultaneously, and the type of behavior that we observe depends
on the kind of experiment we are doing: a particle-type experiment shows only
particlelike behavior, and a wave-type experiment shows only wavelike behavior.
4.3 UNCERTAINTY RELATIONSHIPS FOR CLASSICAL WAVES
FIGURE 4.18 (a) A pure sine wave,
which extends from −∞ to +∞.
(b) A narrow wave pulse.
In quantum mechanics, we want to use de Broglie waves to describe particles. In
particular, the amplitude of the wave will tell us something about the location of
the particle. Clearly a pure sinusoidal wave, as in Figure 4.18a, is not much use
in locating a particle—the wave extends from −∞ to +∞, so the particle might
be found anywhere in that region. On the other hand, a narrow wave pulse like
Figure 4.18b does a pretty good job of locating the particle in a small region of
space, but this wave does not have an easily identifiable wavelength. In the first
case, we know the wavelength exactly but have no knowledge of the location of
the particle, while in the second case we have a good idea of the location of the
particle but a poor knowledge of its wavelength. Because wavelength is associated
with momentum by the de Broglie relationship (Eq. 4.1), a poor knowledge of the
wavelength is associated with a poor knowledge of the particle’s momentum. For
a classical particle, we would like to know both its location and its momentum as
precisely as possible. For a quantum particle, we are going to have to make some
compromises—the better we know its momentum (or wavelength), the less we
know about its location. We can improve our knowledge of its location only at
the expense of our knowledge of its momentum.
This competition between knowledge of location and knowledge of wavelength
is not restricted to de Broglie waves—classical waves show the same effect. All
real waves can be represented as wave packets—disturbances that are localized to
a finite region of space. We will discuss more about constructing wave packets in
Section 4.5. In this section we will examine this competition between specifying
the location and the wavelength of classical waves more closely.
Figure 4.19a shows a very small wave packet. The disturbance is well localized
to a small region of space of length Δx. (Imagine listening to a very short burst
of sound, of such brief duration that it is hard for you to recognize the pitch or
frequency of the wave.) Let’s try to measure the wavelength of this wave packet.
Placing a measuring stick along the wave, we have some difficulty defining exactly
where the wave starts and where it ends. Our measurement of the wavelength is
therefore subject to a small uncertainty Δλ. Let’s represent this uncertainty as a
fraction ε of the wavelength λ, so that Δλ ∼ ελ. The fraction ε is certainly less
than 1, but it is probably greater than 0.01, so we estimate that ε ∼ 0.1 to within
an order of magnitude. (In our discussion of uncertainty, we use the ∼ symbol
to indicate a rough order-of-magnitude estimate.) That is, the uncertainty in our
measurement of the wavelength might be roughly 10% of the wavelength.
The size of this wave disturbance is roughly one wavelength, so Δx ≈ λ. For
this discussion we want to examine the product of the size of the wave packet and
the uncertainty in the wavelength, Δx times Δλ with Δx ≈ λ and Δλ ∼ ελ:
$$\Delta x\,\Delta\lambda \sim \varepsilon\lambda^2 \tag{4.4}$$
This expression shows the inverse relationship between the size of the wave
packet and the uncertainty in the wavelength: for a given wavelength, the smaller
the size of the wave packet, the greater the uncertainty in our knowledge of the
wavelength. That is, as Δx gets smaller, Δλ must become larger.
Making a larger wave packet doesn’t help us at all. Figure 4.19b shows a larger
wave packet with the same wavelength. Suppose this larger wave packet contains
N cycles of the wave, so that Δx ≈ Nλ. Again using our measuring stick, we try
to measure the size of N wavelengths, and dividing this distance by N we can then
determine the wavelength. We still have the same uncertainty of ελ in locating the
start and end of this wave packet, but when we divide by N to find the wavelength,
the uncertainty in one wavelength becomes Δλ ∼ ελ/N. For this larger wave
packet, the product of Δx and Δλ is $\Delta x\,\Delta\lambda \sim (N\lambda)(\varepsilon\lambda/N) = \varepsilon\lambda^2$, exactly the
same as in the case of the smaller wave packet. Equation 4.4 is a fundamental
property of classical waves, independent of the type of wave or the method used
to measure its wavelength. This is the first of the uncertainty relationships for
classical waves.
FIGURE 4.19 (a) Measuring the wavelength of a wave represented by a
small wave packet of length roughly one wavelength. (b) Measuring the
wavelength of a wave represented by a large wave packet consisting of N
waves.
Example 4.3
In a measurement of the wavelength of water waves, 10
wave cycles are counted in a distance of 196 cm. Estimate
the minimum uncertainty in the wavelength that might be
obtained from this experiment.
Solution
With 10 wave crests in a distance of 196 cm, the wavelength
is about (196 cm)/10 = 19.6 cm. We can take ε ∼ 0.1 as a
good order-of-magnitude estimate of the typical precision
that might be obtained. From Eq. 4.4, we can find the
uncertainty in wavelength:
$$\Delta\lambda \sim \frac{\varepsilon\lambda^2}{\Delta x} = \frac{(0.1)(19.6\ \mathrm{cm})^2}{196\ \mathrm{cm}} = 0.2\ \mathrm{cm}$$
With an uncertainty of 0.2 cm, the “true” wavelength might
range from 19.5 cm to 19.7 cm, so we might express this
result as 19.6 ± 0.1 cm.
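The estimate in Example 4.3 can be written as a small function of the wavelength, the packet length, and the fraction ε. This is a minimal sketch with names of our own choosing; ε = 0.1 is the order-of-magnitude estimate used in the text, not a measured quantity.

```python
def wavelength_uncertainty(wavelength, packet_length, eps=0.1):
    """Eq. 4.4 solved for the wavelength uncertainty: eps * lambda**2 / delta_x."""
    return eps * wavelength**2 / packet_length


n_cycles = 10
packet_length_cm = 196.0
wavelength_cm = packet_length_cm / n_cycles

delta_lambda_cm = wavelength_uncertainty(wavelength_cm, packet_length_cm)

# Equivalent form for a packet of N cycles: delta_lambda ~ eps * lambda / N
delta_lambda_from_n_cm = 0.1 * wavelength_cm / n_cycles

print(f"wavelength  = {wavelength_cm:.1f} cm")
print(f"uncertainty = {delta_lambda_cm:.2f} cm")
```

Both forms give the same number, about 0.2 cm, because Δx = Nλ for this wave packet.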
The Frequency-Time Uncertainty Relationship
We can take a different approach to uncertainty for classical waves by imagining a
measurement of the period rather than the wavelength of the wave that comprises
our wave packet. Suppose we have a timing device that we use to measure
the duration of the wave packet, as in Figure 4.20. Here we are plotting the
wave disturbance as a function of time rather than location. The “size” of the
wave packet is now its duration in time, which is roughly one period T for this
wave packet, so that Δt ≈ T. Whatever measuring device we use, we have some
difficulty locating exactly the start and end of one cycle, so we have an uncertainty
ΔT in measuring the period. As before, we’ll assume this uncertainty is some
small fraction of the period: ΔT ∼ εT. To examine the competition between the
duration of the wave packet and our ability to measure its period, we calculate the
product of Δt and ΔT:
$$\Delta t\,\Delta T \sim \varepsilon T^2 \tag{4.5}$$
This is the second of our uncertainty relationships for classical waves. It shows that
for a wave of a given period, the smaller the duration of the wave packet, the larger
is the uncertainty in our measurement of the period. Note the similarity between
Eqs. 4.4 and 4.5, one representing relationships in space and the other in time.
FIGURE 4.20 Measuring the period of a wave represented by a small wave
packet of duration roughly one period.
It will turn out to be more useful if we write Eq. 4.5 in terms of frequency
instead of period. Given that period T and frequency f are related by f = 1/T, how
is Δf related to ΔT? The correct relationship is certainly not Δf = 1/ΔT, which
would imply that a very small uncertainty in the period would lead to a very large
uncertainty in the frequency. Instead, they should be directly related: the better
we know the period, the better we know the frequency. Here is how we obtain the
relationship: Beginning with f = 1/T, we take differentials on both sides:
$$df = -\frac{1}{T^2}\,dT$$
Next we convert the infinitesimal differentials to finite intervals, and because we
are interested only in the magnitude of the uncertainties we can ignore the minus
sign:
$$\Delta f = \frac{1}{T^2}\,\Delta T \tag{4.6}$$
Combining Eqs. 4.5 and 4.6, we obtain
$$\Delta f\,\Delta t \sim \varepsilon \tag{4.7}$$
Equation 4.7 shows that the longer the duration of the wave packet, the more
precisely we can measure its frequency.
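Two points of this derivation can be illustrated numerically: that replacing differentials by finite intervals in Eq. 4.6 is a good approximation when the uncertainty in the period is small, and that Eq. 4.7 gives a frequency uncertainty that shrinks as the wave packet lasts longer. The numbers below (a 50-Hz wave with a 1% uncertainty in its period, and the list of durations) are hypothetical values chosen for illustration, as are the function names.

```python
def frequency_uncertainty_from_period(period_s, delta_period_s):
    """Eq. 4.6: delta_f = delta_T / T**2."""
    return delta_period_s / period_s**2


def frequency_uncertainty_from_duration(duration_s, eps=0.1):
    """Eq. 4.7 solved for delta_f: eps / delta_t."""
    return eps / duration_s


# Hypothetical wave: period 0.020 s (f = 50 Hz), period uncertain by 1%
period_s = 0.020
delta_period_s = 0.0002

approx_df = frequency_uncertainty_from_period(period_s, delta_period_s)
exact_df = abs(1.0 / (period_s + delta_period_s) - 1.0 / period_s)

print(f"Eq. 4.6 estimate: {approx_df:.3f} Hz, exact change in 1/T: {exact_df:.3f} Hz")

# Eq. 4.7 with eps ~ 0.1: longer wave packets allow sharper frequencies
for duration_s in (0.01, 0.1, 10.0):
    df = frequency_uncertainty_from_duration(duration_s)
    print(f"duration {duration_s:6.2f} s  ->  frequency uncertainty ~ {df:.2f} Hz")
```

For this 1% uncertainty in the period, the finite-interval form of Eq. 4.6 agrees with the exact change in 1/T to about 1%.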
Example 4.4
An electronics salesman offers to sell you a frequency-measuring device. When hooked up to a sinusoidal signal,
it automatically displays the frequency of the signal, and to
account for frequency variations, the frequency is remeasured once each second and the display is updated. The
salesman claims the device to be accurate to 0.01 Hz. Is
this claim valid?
Solution
Based on Eq. 4.7, and again estimating ε to be about 0.1, we
know that a measurement of frequency in a time Δt = 1 s
must have an associated uncertainty of about
$$\Delta f \sim \frac{\varepsilon}{\Delta t} = \frac{0.1}{1\ \mathrm{s}} = 0.1\ \mathrm{Hz}$$
It appears that the salesman may be exaggerating the
precision of this device.
4.4 HEISENBERG UNCERTAINTY RELATIONSHIPS
The uncertainty relationships discussed in the previous section apply to all waves,
and we should therefore apply them to de Broglie waves. We can use the basic de
Broglie relationship p = h/λ to relate the uncertainty in the momentum Δp to the
uncertainty in wavelength Δλ, using the same procedure that we used to obtain
Eq. 4.6. Starting with p = h/λ, we take differentials on both sides and obtain
$dp = (-h/\lambda^2)\,d\lambda$. Now we change the differentials into differences, ignoring the
minus sign:
$$\Delta p = \frac{h}{\lambda^2}\,\Delta\lambda \tag{4.8}$$
An uncertainty in the momentum of the particle is directly related to the uncertainty
in the wavelength associated with the particle’s de Broglie wave packet.
Combining Eq. 4.8 with Eq. 4.4, we obtain
$$\Delta x\,\Delta p \sim \varepsilon h \tag{4.9}$$
Just like Eq. 4.4, this equation suggests an inverse relationship between Δx and
Δp. The smaller the size of the wave packet of the particle, the larger is the
uncertainty in its momentum (and thus in its velocity).
Quantum mechanics provides a formal procedure for calculating Δx and Δp
for wave packets corresponding to different physical situations and for different
schemes for confining a particle. One outcome of these calculations gives the
wave packet with the smallest possible value of the product ΔxΔp, which turns
out to be h/4π, as we will discuss in the next chapter. Thus ε = 1/4π in this case.
All other wave packets will have larger values for ΔxΔp.
The combination h/2π occurs frequently in quantum mechanics and is given
the special symbol h− (“h-bar”)
Werner Heisenberg (1901–1976, Germany). Best known for the uncertainty
principle, he also developed a complete formulation of the quantum theory based on matrices.
Screen
l
q = sin–1 a
y
x
a
Electrons
∆x~∞
FIGURE 4.21 Single-slit diffraction
of electrons. A wide beam of electrons
is incident on a narrow slit. The electrons that pass through the slit acquire
a component of momentum in the x
direction.
h− =
h
= 1.05 × 10−34 J · s = 6.58 × 10−16 eV · s
2π
−
we can write the uncertainty relationship as
In terms of h,
xpx 12 h−
(4.10)
The x subscript has been added to the momentum to remind us that Eq. 4.10
applies to motion in a given direction and relates the uncertainties in position and
momentum in that direction only. Similar and independent relationships can be
applied in the other directions as necessary; thus $\Delta y\,\Delta p_y \ge \hbar/2$
or $\Delta z\,\Delta p_z \ge \hbar/2$.
Equation 4.10 is the first of the Heisenberg uncertainty relationships. It sets the
limit of the best we can possibly do in an experiment to measure simultaneously the
location and the momentum of a particle. Another way of interpreting this equation
is to say that the more we try to confine a particle, the less we know about its
momentum.
Because the limit of $\hbar/2$ represents the minimum value of the product $\Delta x\,\Delta p_x$,
in most cases we will do worse than this limit. It is therefore quite acceptable to
take
$$\Delta x\,\Delta p_x \sim \hbar \tag{4.11}$$
as a rough estimate of the relationship between the uncertainties in location and
momentum.
As an example, let’s consider a beam of electrons incident on a single slit, as in
Figure 4.21. We know this experiment as single-slit diffraction, which produces
the characteristic diffraction pattern illustrated in Figure 4.1. We’ll assume that
the particles are initially moving in the y direction and that we know their
momentum in that direction as precisely as possible. If the electrons initially have
no component of their momentum in the x direction, we know $p_x$ exactly (it is
exactly zero), so that $\Delta p_x = 0$; thus we know nothing about the x coordinates of
the electrons ($\Delta x = \infty$). This situation represents a very wide beam of electrons,
only a small fraction of which pass through the slit.
At the instant that some of the electrons pass through the slit, we know quite
a bit more about their x location. In order to pass through the slit, the uncertainty
in their x location is no larger than $a$, the width of the slit; thus $\Delta x = a$. This
improvement in our knowledge of the electron’s location comes at the expense of
our knowledge of its momentum, however. According to Eq. 4.11, the uncertainty
in the x component of its momentum is now $\Delta p_x \sim \hbar/a$. Measurements beyond
the slit no longer show the particle moving precisely in the y direction (for
which $p_x = 0$); the momentum now has a small x component as well, with values
distributed about zero but now with a range of roughly $\pm\hbar/a$. In passing through
the slit, a particle acquires on the average an x component of momentum of
roughly $\hbar/a$, according to the uncertainty principle.
Let us now find the angle θ that specifies where a particle with this value of px
lands on the screen. For small angles, sin θ ≈ tan θ and so
$$\sin\theta \approx \tan\theta = \frac{p_x}{p_y} = \frac{\hbar/a}{p_y} = \frac{\lambda}{2\pi a}$$
using λ = h/py for the de Broglie wavelength of the electrons. The first minimum
of the diffraction pattern of a single slit is located at sin θ = λ/a, which is larger
than the spread of angles into which most of the particles are diffracted. The
calculation shows that the distribution of transverse momentum given by the
uncertainty principle is roughly equivalent to the spreading of the beam into the
central diffraction peak, and it illustrates again the close connection between wave
behavior and uncertainty in particle location.
The diffraction (spreading) of a beam following passage through a slit is just
the effect of the uncertainty principle on our attempt to specify the location of the
particle. As we make the slit narrower, $\Delta p_x$ increases and the beam spreads even
more. In trying to obtain more precise knowledge of the location of the particle
by making the slit narrower, we have lost knowledge of the direction of its travel.
This trade-off between observations of position and momentum is the essence of
the Heisenberg uncertainty principle.
We can also apply the second of our classical uncertainty relationships (Eq. 4.7)
to de Broglie waves. If we assume the energy-frequency relationship for light,
$E = hf$, can be applied to particles, then we immediately obtain $\Delta E = h\,\Delta f$.
Combining this with Eq. 4.7, we obtain
$$\Delta E\,\Delta t \sim \varepsilon h \tag{4.12}$$
Once again, the minimum uncertainty wave packet gives $\varepsilon = 1/4\pi$, and so
$$\Delta E\,\Delta t \ge \tfrac{1}{2}\hbar \tag{4.13}$$
This is the second of the Heisenberg uncertainty relationships. It tells us that
the more precisely we try to determine the time coordinate of a particle, the less
precisely we know its energy. For example, if a particle has a very short lifetime
between its creation and decay ($\Delta t \to 0$), a measurement of its rest energy (and
thus its mass) will be very imprecise ($\Delta E \to \infty$). Conversely, the rest energy of
a stable particle (one with an infinite lifetime, so that $\Delta t = \infty$) can in principle be
measured with unlimited precision ($\Delta E = 0$).
As in the case of the first Heisenberg relationship, we can take
$$\Delta E\,\Delta t \sim \hbar \tag{4.14}$$
as a reasonable estimate for most wave packets.
The Heisenberg uncertainty relationships are the mathematical representations
of the Heisenberg uncertainty principle, which states:
It is not possible to make a simultaneous determination of the position and
the momentum of a particle with unlimited precision,
and
It is not possible to make a simultaneous determination of the energy and
the time coordinate of a particle with unlimited precision.
These relationships give an estimate of the minimum uncertainty that can result
from any experiment; measurement of the position and momentum of a particle
will give a spread of values of widths $\Delta x$ and $\Delta p_x$. We may, for other reasons, do
much worse than Eqs. 4.10 and 4.13, but we can do no better.
These relationships have a profound impact on our view of nature. It is quite
acceptable to say that there is an uncertainty in locating the position of a water
wave. It is quite another matter to make the same statement about a de Broglie
wave, because there is an implied corresponding uncertainty in the position of the
particle. Equations 4.10 and 4.13 say that nature imposes a limit on the accuracy
with which we can do experiments. To emphasize this point, the Heisenberg
relationships are sometimes called “indeterminacy” rather than “uncertainty”
principles, because the idea of uncertainty may suggest an experimental limit
that can be reduced by using better equipment or technique. In actuality, these
coordinates are indeterminate to the limits provided by Eqs. 4.10 and 4.13—no
matter how hard we try, it is simply not possible to measure more precisely.
Example 4.5
An electron moves in the x direction with a speed of
$3.6 \times 10^{6}$ m/s. We can measure its speed to a precision of
1%. With what precision can we simultaneously measure
its x coordinate?
Solution
The electron’s momentum is
$$p_x = mv_x = (9.11 \times 10^{-31}\ \mathrm{kg})(3.6 \times 10^{6}\ \mathrm{m/s}) = 3.3 \times 10^{-24}\ \mathrm{kg\cdot m/s}$$
The uncertainty $\Delta p_x$ is 1% of this value, or $3.3 \times 10^{-26}$ kg · m/s.
The uncertainty in position is then
$$\Delta x \sim \frac{\hbar}{\Delta p_x} = \frac{1.05 \times 10^{-34}\ \mathrm{J\cdot s}}{3.3 \times 10^{-26}\ \mathrm{kg\cdot m/s}} = 3.2\ \mathrm{nm}$$
which is roughly 10 atomic diameters.
Example 4.6
Repeat the calculations of the previous example in the case
of a pitched baseball (m = 0.145 kg) moving at a speed of
95 mi/h (42.5 m/s). Again assume that its speed can be
measured to a precision of 1%.
Solution
The baseball’s momentum is
px = mvx = (0.145 kg)(42.5 m/s) = 6.16 kg · m/s
The uncertainty in momentum is $6.16 \times 10^{-2}$ kg · m/s, and
the corresponding uncertainty in position is
$$\Delta x \sim \frac{\hbar}{\Delta p_x} = \frac{1.05 \times 10^{-34}\ \mathrm{J\cdot s}}{6.16 \times 10^{-2}\ \mathrm{kg\cdot m/s}} = 1.7 \times 10^{-33}\ \mathrm{m}$$
This uncertainty is 19 orders of magnitude smaller than the
size of an atomic nucleus. The uncertainty principle cannot
be blamed for the batter missing the pitch! Once again
we see that, because of the small magnitude of Planck’s
constant, quantum effects are not observable for ordinary
objects.
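The arithmetic of Examples 4.5 and 4.6 can be repeated in a few lines. The following is a minimal sketch that assumes the rough estimate of Eq. 4.11; the function name `position_uncertainty` is introduced here only for illustration.

```python
HBAR = 1.05e-34  # J*s, the value quoted in the text


def position_uncertainty(mass_kg, speed_m_s, fractional_precision):
    """Estimate dx ~ hbar / dp_x (Eq. 4.11) for a nonrelativistic object."""
    momentum = mass_kg * speed_m_s              # p_x = m * v_x
    delta_p = fractional_precision * momentum   # uncertainty in p_x
    return HBAR / delta_p


# Electron of Example 4.5 and baseball of Example 4.6, both measured to 1%.
electron_dx = position_uncertainty(9.11e-31, 3.6e6, 0.01)
baseball_dx = position_uncertainty(0.145, 42.5, 0.01)

print(f"electron: {electron_dx:.2e} m")
print(f"baseball: {baseball_dx:.2e} m")
```

The two results are about 3.2 nm and about 1.7 × 10⁻³³ m, so the same relationship that matters on the atomic scale is far too small to observe for the baseball.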
A Statistical Interpretation of Uncertainty
A diffraction pattern, such as that shown in Figure 4.21, is the result of the
passage of many particles or photons through the slit. So far, we have been
discussing the behavior of only one particle. Let’s imagine that we do an
experiment in which a large number of particles passes (one at a time) through
the slit, and we measure the transverse (x component) momentum of each
particle after it passes through the slit. We can do this experiment simply by
placing a detector at different locations on the screen where we observe the
diffraction pattern. Because the detector actually accepts particles over a finite
region on the screen, it measures in a range of deflection angles or equivalently
in a range of transverse momentum. The result of the experiment might look
something like Figure 4.22. The vertical scale shows the number of particles with
momentum in each interval corresponding to different locations of the detector
on the screen. The values are symmetrically arranged about zero, which indicates
that the mean or average value of px is zero. The width of the distribution is
characterized by $\Delta p_x$.
Figure 4.22 resembles a statistical distribution, and in fact the precise definition
of $\Delta p_x$ is similar to that of the standard deviation $\sigma_A$ of a quantity $A$ that has a
mean or average value $A_{\rm av}$:
$$\sigma_A = \sqrt{(A^2)_{\rm av} - (A_{\rm av})^2}$$
If there are $N$ individual measurements of $A$, then $A_{\rm av} = N^{-1}\sum A_i$ and
$(A^2)_{\rm av} = N^{-1}\sum A_i^2$.
By analogy, we can make a rigorous definition of the uncertainty in
momentum as
$$\Delta p_x = \sqrt{(p_x^2)_{\rm av} - (p_{x,{\rm av}})^2} \tag{4.15}$$
The average value of the transverse momentum for the situation shown in
Figure 4.22 is zero, so
$$\Delta p_x = \sqrt{(p_x^2)_{\rm av}} \tag{4.16}$$
which gives in effect a root-mean-square value of $p_x$. This can be taken to be
a rough measure of the magnitude of $p_x$. Thus it is often said that $\Delta p_x$ gives a
measure of the magnitude of the momentum of the particle. As you can see from
Figure 4.22, this is indeed true.∗
∗ The relationship between the value of $\Delta p_x$ calculated from Eq. 4.16 and the width of the distribution
shown in Figure 4.22 depends on the exact shape of the distribution. You should consider the value
from Eq. 4.16 as a rough order-of-magnitude estimate of the width of the distribution.
FIGURE 4.22 Results that might be obtained from measuring the number
of electrons in a given time interval at different locations on the screen of
Figure 4.21. The distribution is centered around $p_x = 0$ and has a width
that is characterized by $\Delta p_x$.
[Axes: number of particles recorded by detector versus momentum.]
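The statistical definition of Eq. 4.15 can be tried on a small set of numbers. The sketch below uses a hypothetical list of transverse-momentum readings (invented for illustration, in arbitrary units) that is symmetric about zero, so that Eq. 4.15 should reduce to the root-mean-square form of Eq. 4.16.

```python
import math


def uncertainty(values):
    """Return sqrt((A^2)_av - (A_av)^2), the form used in Eq. 4.15."""
    n = len(values)
    mean = sum(values) / n
    mean_sq = sum(v * v for v in values) / n
    # Guard against a tiny negative difference caused by rounding.
    return math.sqrt(max(0.0, mean_sq - mean ** 2))


# Hypothetical readings of p_x, symmetric about zero (not data from the text).
px_samples = [-2.0, -1.0, -1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 2.0]

delta_px = uncertainty(px_samples)
rms_px = math.sqrt(sum(p * p for p in px_samples) / len(px_samples))

print(f"delta p_x = {delta_px:.4f}, rms p_x = {rms_px:.4f}")
```

Because the mean of these readings is zero, the two numbers agree, which is the content of Eq. 4.16.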
Example 4.7
In nuclear beta decay, electrons are observed to be ejected
from the atomic nucleus. Suppose we assume that electrons are somehow trapped within the nucleus, and that
occasionally one escapes and is observed in the laboratory.
Take the diameter of a typical nucleus to be $1.0 \times 10^{-14}$ m,
and use the uncertainty principle to estimate the range of
kinetic energies that such an electron must have.
Solution
If the electron were trapped in a region of width $\Delta x \approx 10^{-14}$ m,
the corresponding uncertainty in its momentum would be
$$\Delta p_x \sim \frac{\hbar}{\Delta x} = \frac{1}{c}\,\frac{\hbar c}{\Delta x} = \frac{1}{c}\,\frac{197\ \mathrm{MeV\cdot fm}}{10\ \mathrm{fm}} = 19.7\ \mathrm{MeV}/c$$
Note the use of $\hbar c = 197$ MeV · fm in this calculation. This
momentum is clearly in the relativistic regime for electrons,
so we must use the relativistic formula to find the kinetic
energy for a particle of momentum 19.7 MeV/c:
$$K = \sqrt{p^2c^2 + (mc^2)^2} - mc^2 = \sqrt{(19.7\ \mathrm{MeV})^2 + (0.5\ \mathrm{MeV})^2} - 0.5\ \mathrm{MeV} = 19\ \mathrm{MeV}$$
where we have used Eq. 4.16 to relate $\Delta p_x$ to $p_x^2$. This
result gives the spread of kinetic energies corresponding to
a spread in momentum of 19.7 MeV/c.
Electrons emitted from the nucleus in nuclear beta
decay typically have kinetic energies of about 1 MeV,
much smaller than the typical spread in energy required by
the uncertainty principle for electrons confined inside the
nucleus. This suggests that beta-decay electrons of such low
energies cannot be confined in a region of the size of the
nucleus, and that another explanation must be found for the
electrons observed in nuclear beta decay. (As we discuss in
Chapter 12, these electrons cannot preexist in the nucleus,
which would violate the uncertainty principle, but are
“manufactured” by the nucleus at the instant of the decay.)
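The estimate of Example 4.7 can be checked numerically. This sketch assumes, as the example does, that the magnitude of the momentum is about $\hbar/\Delta x$; it uses 0.511 MeV for the electron rest energy where the text rounds to 0.5 MeV, and the function name is chosen here for illustration.

```python
import math

HBAR_C = 197.0         # MeV*fm, as quoted in the text
ELECTRON_MC2 = 0.511   # MeV (the text rounds this to 0.5 MeV)


def confined_kinetic_energy(width_fm, rest_energy_mev):
    """Kinetic energy for p ~ hbar / dx, using the relativistic formula."""
    pc = HBAR_C / width_fm  # momentum times c, in MeV
    return math.sqrt(pc ** 2 + rest_energy_mev ** 2) - rest_energy_mev


k_electron = confined_kinetic_energy(10.0, ELECTRON_MC2)
print(f"K ~ {k_electron:.1f} MeV")
```

The result is close to the 19 MeV found in the example, well above the roughly 1 MeV carried by typical beta-decay electrons.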
Example 4.8
(a) A charged pi meson has a rest energy of 140 MeV
and a lifetime of 26 ns. Find the energy uncertainty of the
pi meson, expressed in MeV and also as a fraction of its
rest energy. (b) Repeat for the uncharged pi meson, with
a rest energy of 135 MeV and a lifetime of $8.3 \times 10^{-17}$ s.
(c) Repeat for the rho meson, with a rest energy of 765 MeV
and a lifetime of $4.4 \times 10^{-24}$ s.
Solution
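(a) If the pi meson lives for 26 ns, we have only that much
time in which to measure its rest energy, and Eq. 4.14 tells us
that any energy measurement done in a time $\Delta t$ is uncertain
by an amount of at least
$$\Delta E = \frac{\hbar}{\Delta t} = \frac{6.58 \times 10^{-16}\ \mathrm{eV\cdot s}}{26 \times 10^{-9}\ \mathrm{s}} = 2.5 \times 10^{-8}\ \mathrm{eV} = 2.5 \times 10^{-14}\ \mathrm{MeV}$$
$$\frac{\Delta E}{E} = \frac{2.5 \times 10^{-14}\ \mathrm{MeV}}{140\ \mathrm{MeV}} = 1.8 \times 10^{-16}$$
(b) In a similar way,
$$\Delta E = \frac{\hbar}{\Delta t} = \frac{6.58 \times 10^{-16}\ \mathrm{eV\cdot s}}{8.3 \times 10^{-17}\ \mathrm{s}} = 7.9\ \mathrm{eV} = 7.9 \times 10^{-6}\ \mathrm{MeV}$$
$$\frac{\Delta E}{E} = \frac{7.9 \times 10^{-6}\ \mathrm{MeV}}{135\ \mathrm{MeV}} = 5.9 \times 10^{-8}$$
(c) For the rho meson,
$$\Delta E = \frac{\hbar}{\Delta t} = \frac{6.58 \times 10^{-16}\ \mathrm{eV\cdot s}}{4.4 \times 10^{-24}\ \mathrm{s}} = 1.5 \times 10^{8}\ \mathrm{eV} = 150\ \mathrm{MeV}$$
$$\frac{\Delta E}{E} = \frac{150\ \mathrm{MeV}}{765\ \mathrm{MeV}} = 0.20$$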
In the first case, the uncertainty principle does not give
a large enough effect to be measured: particle masses
cannot be measured to a precision of $10^{-16}$ (about $10^{-6}$
is the best precision that we can obtain). In the second
example, the uncertainty principle contributes at about the
level of $10^{-7}$, which approaches the limit of our measuring
instruments and therefore might be observable in the laboratory. In the third example, we see that the uncertainty
principle can substantially limit the precision of our
knowledge of the rest energy of the rho meson; measurements of its rest energy will show a statistical distribution
centered about 765 MeV with a spread of 150 MeV, and no
matter how precise an instrument we use to measure the
rest energy, we can never reduce that spread.
The lifetime of a very short-lived particle such as the rho
meson cannot be measured directly. In practice we reverse
the procedure of the calculation of this example: we
measure the rest energy, which gives a distribution similar
to Figure 4.22, and from the “width” $\Delta E$ of the distribution
we deduce the lifetime using Eq. 4.14. This procedure is
discussed in Chapter 14.
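The three parts of Example 4.8, and the reverse procedure of deducing a lifetime from a measured width, can be sketched as follows. The estimate $\Delta E \sim \hbar/\Delta t$ is the one used in the example; the function and dictionary names are illustrative choices, not notation from the text.

```python
HBAR_EV_S = 6.58e-16  # eV*s, the value quoted in the text

# (rest energy in MeV, lifetime in s), taken from Example 4.8
particles = {
    "charged pi": (140.0, 26e-9),
    "neutral pi": (135.0, 8.3e-17),
    "rho": (765.0, 4.4e-24),
}


def energy_width_mev(lifetime_s):
    """dE ~ hbar / dt, converted from eV to MeV."""
    return HBAR_EV_S / lifetime_s * 1e-6


def lifetime_from_width(width_mev):
    """Reverse estimate: dt ~ hbar / dE, with dE given in MeV."""
    return HBAR_EV_S / (width_mev * 1e6)


for name, (rest_mev, tau) in particles.items():
    width = energy_width_mev(tau)
    print(f"{name}: dE = {width:.2e} MeV, dE/E = {width / rest_mev:.1e}")

# Going backward: a 150-MeV width corresponds to a lifetime near 4.4e-24 s.
print(f"rho lifetime from width: {lifetime_from_width(150.0):.1e} s")
```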
Example 4.9
Estimate the minimum velocity that would be measured
for a billiard ball (m ≈ 100 g) confined to a billiard table
of dimension 1 m.
Solution
For $\Delta x \approx 1$ m, we have
$$\Delta p_x \sim \frac{\hbar}{\Delta x} = \frac{1.05 \times 10^{-34}\ \mathrm{J\cdot s}}{1\ \mathrm{m}} = 1 \times 10^{-34}\ \mathrm{kg\cdot m/s}$$
so
$$\Delta v_x = \frac{\Delta p_x}{m} = \frac{1 \times 10^{-34}\ \mathrm{kg\cdot m/s}}{0.1\ \mathrm{kg}} = 1 \times 10^{-33}\ \mathrm{m/s}$$
Thus quantum effects might result in motion of the billiard
ball with a speed distribution having a spread of about
$1 \times 10^{-33}$ m/s. At this speed, the ball would move a distance
of 1% of the diameter of an atomic nucleus in a time equal
to the age of the universe! Once again, we see that quantum
effects are not observable with macroscopic objects.
4.5 WAVE PACKETS
In Section 4.3, we described measurements of the wavelength or frequency of a
wave packet, which we consider to be a finite group of oscillations of a wave.
That is, the wave amplitude is large over a finite region of space or time and is
very small outside that region.
Before we begin our discussion, it is necessary to keep in mind that we are
discussing traveling waves, which we imagine as moving in one direction with a
uniform speed. (We’ll discuss the speed of the wave packet later.) As the wave
packet moves, individual locations in space will oscillate with the frequency or
wavelength that characterizes the wave packet. When we show a static picture of
a wave packet, it doesn’t matter that some points within the packet appear to have
positive displacement, some have negative displacement, and some may even
have zero displacement. As the wave travels, those locations are in the process
of oscillating, and our drawings may “freeze” that oscillation. What matters are
the locations in space where the overall wave packet has a large oscillation
amplitude and where it has a very small amplitude.∗
∗ By analogy, think of a radio wave traveling from the station to your receiver. At a particular instant
of time, some points in space may have instantaneous electromagnetic field values of zero, but that
doesn’t affect your reception of the signal. What is important is the overall amplitude of the traveling
wave.
In this section we will examine how to build a wave packet by adding waves
together. A pure sinusoidal wave is of no use in representing a particle: the wave
extends from −∞ to +∞, so the particle could be found anywhere. We would like
the particle to be represented by a wave packet that describes how the particle is
localized to a particular region of space, such as an atom or a nucleus.
The key to the process of building a wave packet involves adding together
waves of different wavelength. We represent our waves as A cos kx, where k is the
wave number (k = 2π/λ) and A is the amplitude. For example, let’s add together
two waves:
$$y(x) = A_1\cos k_1x + A_2\cos k_2x = A_1\cos(2\pi x/\lambda_1) + A_2\cos(2\pi x/\lambda_2) \tag{4.17}$$
This sum is illustrated in Figure 4.23a for the case A1 = A2 and λ1 = 9, λ2 = 11.
This combined wave shows the phenomenon known as beats in the case of sound
waves. So far we don’t have a result that looks anything like the wave packet we
are after, but you can see that by adding together two different waves we have
reduced the amplitude of the wave packet at some locations. This pattern repeats
endlessly from −∞ to +∞, so the particle is still not localized.
Let’s try a more detailed sum. Figure 4.23b shows the result of adding 5 waves
with wavelengths 9, 9.5, 10, 10.5, 11. Here we have been a bit more successful
in restricting the amplitude of the wave packet in some regions. By adding even
more waves with a larger range of wavelengths, we can obtain still narrower
regions of large amplitude: Figure 4.23c shows the result of adding 9 waves
with wavelengths 8, 8.5, 9, . . . , 12, and Figure 4.23d shows the result of adding
13 waves of wavelengths 7, 7.5, 8, . . . , 13. Unfortunately, all of these patterns
(including the regions of large amplitude) repeat endlessly from −∞ to +∞, so
even though we have obtained increasingly large regions where the wave packet
has small amplitude, we haven’t yet created a wave packet that might represent
a particle localized to a particular region. If these wave packets did represent
particles, then the particle would not be confined to any finite region.
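The sums described above are easy to evaluate directly. The following sketch adds equal-amplitude cosines for the wavelength sets quoted in the text and normalizes the total so that it equals 1 at x = 0; the function name `wave_sum` and the sample positions are illustrative choices.

```python
import math


def wave_sum(x, wavelengths):
    """Sum of equal-amplitude cosines, normalized so the value at x = 0 is 1."""
    total = sum(math.cos(2.0 * math.pi * x / lam) for lam in wavelengths)
    return total / len(wavelengths)


two_waves = [9.0, 11.0]                    # Figure 4.23a
five_waves = [9.0, 9.5, 10.0, 10.5, 11.0]  # Figure 4.23b

# x = 99 is a common multiple of 9 and 11, so the two-wave sum returns to
# its x = 0 value there; at x = 24.75 both component cosines vanish.
for x in (0.0, 10.0, 24.75, 99.0):
    print(x, round(wave_sum(x, two_waves), 3), round(wave_sum(x, five_waves), 3))
```

The two-wave sum returns exactly to its starting value at x = 99, which illustrates why a finite sum of discrete wavelengths cannot by itself localize a particle.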
FIGURE 4.23 (a) Adding two waves of wavelengths 9 and 11 gives beats.
(b) Adding 5 waves with wavelengths ranging from 9 to 11. (c) Adding 9
waves with wavelengths ranging from 8 to 12. (d) Adding 13 waves with
wavelengths from 7 to 13. All of the patterns repeat from −∞ to +∞.
[Four panels, (a) through (d), each plotted for x from −100 to 100 with amplitude from −1 to 1.]
The regions of large amplitude in Figures 4.23b,c,d do show how adding more
waves of a greater range of wavelengths helps to restrict the size of the wave
packet. The region of large amplitude in Figure 4.23b ranges from about x = −40
to +40, while in Figure 4.23c it is from about x = −20 to +20 and in Figure 4.23d
from about x = −15 to +15. This shows again the inverse relationship between $\Delta x$
and $\Delta\lambda$ expected for wave packets given by Eq. 4.4: as the range of wavelengths
increases from 2 to 4 to 6, the size of the “allowed” regions decreases from about
80 to 40 to 30. Once again we find that to restrict the size of the wave packet we
must sacrifice the precise knowledge of the wavelength.
Note that for all four of these wave patterns, the disturbance seems to have a
wavelength of about 10, equal to the central wavelength of the range of values
of the functions we constructed. We can therefore regard these functions as a
cosine wave with a wavelength of 10 that is shaped or modulated by the other
cosine waves included in the function. For example, for the case of A1 = A2 = A,
Eq. 4.17 can be rewritten after a bit of trigonometric manipulation as
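$$y(x) = 2A\cos\left(\frac{\pi x}{\lambda_1} - \frac{\pi x}{\lambda_2}\right)\cos\left(\frac{\pi x}{\lambda_1} + \frac{\pi x}{\lambda_2}\right) \tag{4.18}$$
If $\lambda_1$ and $\lambda_2$ are close together (that is, if $\Delta\lambda = \lambda_2 - \lambda_1 \ll \lambda_1, \lambda_2$), this can be
approximated as
$$y(x) = 2A\cos\left(\frac{\Delta\lambda\,\pi x}{\lambda_{\rm av}^2}\right)\cos\left(\frac{2\pi x}{\lambda_{\rm av}}\right) \tag{4.19}$$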
where λav = (λ1 + λ2 )/2 ≈ λ1 or λ2 . The second cosine term represents a wave
with a wavelength of 10, and the first cosine term provides the shaping envelope
that produces the beats.
Any finite combination of waves with discrete wavelengths will produce
patterns that repeat between −∞ to +∞, so this method of adding waves will
not work in constructing a finite wave packet. To construct a wave packet with a
finite width, we must replace the first cosine term in Eq. 4.19 with a function that
is large in the region where we want to confine the particle but that falls to zero
as x → ±∞. For example, the simplest function that has this property is 1/x, so
we might imagine a wave packet whose mathematical form is
$$y(x) = \frac{2A}{x}\sin\left(\frac{\Delta\lambda\,\pi x}{\lambda_0^2}\right)\cos\left(\frac{2\pi x}{\lambda_0}\right) \tag{4.20}$$
Here λ0 represents the central wavelength, replacing λav . (In going from Eq. 4.19
to Eq. 4.20, the cosine modulating term has been changed to a sine; otherwise the
function would blow up at x = 0.) This function is plotted in Figure 4.24a. It looks
more like the kind of function we are seeking—it has large amplitude only in a
small region of space, and the amplitude drops rapidly to zero outside that region.
Another function that has this property is the Gaussian modulating function:
$$y(x) = A\,e^{-2(\Delta\lambda\,\pi x/\lambda_0^2)^2}\cos\left(\frac{2\pi x}{\lambda_0}\right) \tag{4.21}$$
which is shown in Figure 4.24b.
FIGURE 4.24 (a) A wave packet in which the modulation envelope
decreases in amplitude like 1/x. (b) A wave packet with a Gaussian
modulating function. Both curves are drawn for $\lambda_0 = 10$ and $\Delta\lambda = 0.58$,
which corresponds approximately to Figure 4.23b.
[Panel (a) plots $(2A/x)\sin(\Delta\lambda\,\pi x/\lambda_0^2)\cos(2\pi x/\lambda_0)$ and panel (b) plots $A\,e^{-2(\Delta\lambda\,\pi x/\lambda_0^2)^2}\cos(2\pi x/\lambda_0)$, each for x from −150 to 150.]
Both of these functions show the characteristic inverse relationship between an
arbitrarily defined size of the wave packet $\Delta x$ and the wavelength range parameter
$\Delta\lambda$ that is used in constructing the wave packet. For example, consider the wave
packet shown in Figure 4.24a. Let’s arbitrarily define the width of the wave packet
as the distance over which the amplitude of the central region falls by 1/2. That
occurs roughly where the argument of the sine has the value $\pm\pi/2$, which gives
$\Delta x\,\Delta\lambda \sim \lambda_0^2$, consistent with our classical uncertainty estimate.
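The width estimate for the packet of Figure 4.24a can be made concrete. This sketch evaluates the envelope $(2A/x)\sin(\Delta\lambda\,\pi x/\lambda_0^2)$ of Eq. 4.20 at the point where the argument of the sine equals $\pi/2$, using the values quoted for Figure 4.24; the variable names and the choice A = 1 are assumptions made for illustration.

```python
import math


def envelope(x, lam0, dlam, amp=1.0):
    """Envelope (2A/x) sin(pi*dlam*x/lam0^2) of Eq. 4.20, with its x -> 0 limit."""
    b = math.pi * dlam / lam0 ** 2
    if x == 0.0:
        return 2.0 * amp * b        # limit of (2A/x) sin(b x) as x -> 0
    return 2.0 * amp * math.sin(b * x) / x


lam0, dlam = 10.0, 0.58                  # values quoted for Figure 4.24
half_width = lam0 ** 2 / (2.0 * dlam)    # where the sine argument equals pi/2
width = 2.0 * half_width                 # full width, so width * dlam = lam0^2
ratio = envelope(half_width, lam0, dlam) / envelope(0.0, lam0, dlam)

print(f"width ~ {width:.0f}, envelope relative to center = {ratio:.3f}")
```

At that point the envelope has fallen to 2/π, about 0.64 of its central value, which is the sense in which the text says the amplitude has fallen by roughly 1/2; the corresponding width of about 172 satisfies $\Delta x\,\Delta\lambda = \lambda_0^2$ exactly.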
These wave packets can also be constructed by adding together waves of
differing amplitude and wavelength, but the wavelengths form a continuous rather
than a discrete set. It is a bit easier to illustrate this if we work with wave number
k = 2π/λ rather than wavelength. So far we have been adding waves in the form
of A cos kx, so that
$$y(x) = \sum_i A_i\cos k_ix \tag{4.22}$$
where ki = 2π/λi . The waves plotted in Figure 4.23 represent applications of the
general formula of Eq. 4.22 carried out over different numbers of discrete waves.
If we have a continuous set of wave numbers, the sum in Eq. 4.22 becomes an
integral:
$$y(x) = \int A(k)\cos kx\,dk \tag{4.23}$$
where the integral is carried out over whatever range of wave numbers is permitted
(possibly infinite).
For example, suppose we have a range of wave numbers from $k_0 - \Delta k/2$ to
$k_0 + \Delta k/2$ that is a continuous distribution of wave numbers of width $\Delta k$ centered
at $k_0$. If all of the waves have the same amplitude $A_0$, then from Eq. 4.23 the form
of the wave packet can be shown to be (see Problem 24 at the end of the chapter)
$$y(x) = \frac{2A_0}{x}\sin\left(\frac{\Delta k}{2}x\right)\cos k_0x \tag{4.24}$$
This is identical with Eq. 4.20 with $k_0 = 2\pi/\lambda_0$ and $\Delta k = 2\pi\,\Delta\lambda/\lambda_0^2$. This
relationship between $\Delta k$ and $\Delta\lambda$ follows from a procedure similar to what was used
to obtain Eq. 4.6. With $k = 2\pi/\lambda$, taking differentials gives $dk = -(2\pi/\lambda^2)\,d\lambda$.
Replacing the differentials with differences and ignoring the minus sign gives the
relationship between $\Delta k$ and $\Delta\lambda$.
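Equation 4.24 can be checked without doing the integral by hand. The sketch below evaluates the integral of Eq. 4.23 with a simple midpoint rule for a constant amplitude $A_0$ and compares it with the closed form; the step count and the sample positions are arbitrary choices, and the parameter values are those quoted for Figure 4.24.

```python
import math


def packet_numeric(x, k0, dk, a0=1.0, steps=20000):
    """Midpoint-rule estimate of the integral of a0*cos(k*x) for k in [k0-dk/2, k0+dk/2]."""
    h = dk / steps
    start = k0 - dk / 2.0
    return a0 * h * sum(math.cos((start + (i + 0.5) * h) * x) for i in range(steps))


def packet_closed_form(x, k0, dk, a0=1.0):
    """Eq. 4.24: (2*a0/x) sin(dk*x/2) cos(k0*x), valid for x != 0."""
    return (2.0 * a0 / x) * math.sin(dk * x / 2.0) * math.cos(k0 * x)


k0 = 2.0 * math.pi / 10.0               # central wave number for lambda_0 = 10
dk = 2.0 * math.pi * 0.58 / 10.0 ** 2   # dk = 2*pi*dlam/lam0^2 with dlam = 0.58

for x in (5.0, 20.0, 60.0):
    print(x, packet_numeric(x, k0, dk), packet_closed_form(x, k0, dk))
```

The two columns agree to many decimal places, which supports the closed form without replacing the derivation asked for in Problem 24.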
A better approximation of the shape of the wave packet can be found by letting
$A(k)$ vary according to a Gaussian distribution $A(k) = A_0e^{-(k-k_0)^2/2(\Delta k)^2}$. This
gives a range of wave numbers that has its largest contribution at the central
wave number $k_0$ and falls off to zero for larger or smaller wave numbers with a
characteristic width of $\Delta k$. Applying Eq. 4.23 to this case, with $k$ ranging from
−∞ to +∞ gives (see Problem 25)
$$y(x) = A_0\,\Delta k\sqrt{2\pi}\,e^{-(\Delta k\,x)^2/2}\cos k_0x \tag{4.25}$$
which shows how the form of Eq. 4.21 originates.
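The Gaussian case can be checked in the same numerical spirit. This sketch integrates Eq. 4.23 with a Gaussian $A(k)$ over a finite range of eight widths on either side of $k_0$ (an assumed cutoff, beyond which the Gaussian is negligible) and compares the result with Eq. 4.25; the function names and step count are illustrative.

```python
import math


def gaussian_packet_numeric(x, k0, dk, a0=1.0, span=8.0, steps=40000):
    """Midpoint-rule integral of a0*exp(-(k-k0)^2/(2 dk^2))*cos(k*x) over k0 +/- span*dk."""
    lo = k0 - span * dk
    h = 2.0 * span * dk / steps
    total = 0.0
    for i in range(steps):
        k = lo + (i + 0.5) * h
        total += math.exp(-((k - k0) ** 2) / (2.0 * dk ** 2)) * math.cos(k * x)
    return a0 * h * total


def gaussian_packet_closed_form(x, k0, dk, a0=1.0):
    """Eq. 4.25: a0*dk*sqrt(2*pi)*exp(-(dk*x)^2/2)*cos(k0*x)."""
    return a0 * dk * math.sqrt(2.0 * math.pi) * math.exp(-((dk * x) ** 2) / 2.0) * math.cos(k0 * x)


k0 = 2.0 * math.pi / 10.0               # central wave number for lambda_0 = 10
dk = 2.0 * math.pi * 0.58 / 10.0 ** 2   # dk = 2*pi*dlam/lam0^2 with dlam = 0.58

for x in (0.0, 10.0, 40.0):
    print(x, gaussian_packet_numeric(x, k0, dk), gaussian_packet_closed_form(x, k0, dk))
```

With $\Delta k = 2\pi\,\Delta\lambda/\lambda_0^2$ the exponent $(\Delta k\,x)^2/2$ becomes $2(\Delta\lambda\,\pi x/\lambda_0^2)^2$, which is the exponent appearing in Eq. 4.21.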
By specifying the distribution of wavelengths, we can construct a wave packet
of any desired shape. A wave packet that restricts the particle to a region in space
of width $\Delta x$ will have a distribution of wavelengths characterized by a width $\Delta\lambda$.
The smaller we try to make $\Delta x$, the larger will be the spread $\Delta\lambda$ of the wavelength
distribution. The mathematics of this process gives a result that is consistent with
the uncertainty relationship for classical waves (Eq. 4.4).
4.6 THE MOTION OF A WAVE PACKET
Let’s consider again the “beats” wave packet represented by Eq. 4.17 and
illustrated in Figure 4.23a. We now want to turn our “static” waves into traveling
waves. It will again be more convenient for this discussion to work with the wave
number k instead of the wavelength. To turn a static wave y(x) = A cos kx into a
traveling wave moving in the positive x direction, we replace kx with kx − ωt, so
that the traveling wave is written as y(x, t) = A cos(kx − ωt). (For motion in the
negative x direction, we would replace kx with kx + ωt.) Here ω is the circular
frequency of the wave: ω = 2π f . The combined traveling wave then would be
represented as
$$y(x, t) = A_1\cos(k_1x - \omega_1t) + A_2\cos(k_2x - \omega_2t) \tag{4.26}$$
For any individual wave, the wave speed is related to its frequency and wavelength
according to v = λf . In terms of the wave number and circular frequency, we can
write this as v = (2π/k)(ω/2π ), so v = ω/k. This quantity is sometimes called
the phase speed and represents the speed of one particular phase or component
of the wave packet. In general, each individual component may have a different
phase speed. As a result, the shape of the wave packet may change with time.
For Figure 4.23a, we chose A1 = A2 and λ1 = 9, λ2 = 11. Let’s choose v1 = 6
units/s and v2 = 4 units/s. Figure 4.25 shows the waveform at a time of t = 1 s. In
that time, wave 1 will have moved a distance of 6 units in the positive x direction
and wave 2 will have moved a distance of 4 units in the positive x direction.
However, the combined wave moves a much greater distance in that time: the
center of the beat that was formerly at x = 0 has moved to x = 15 units. How is
it possible that the combined waveform moves faster than either of its component
waves?
FIGURE 4.25 The solid line shows the waveform of Figure 4.23a
at t = 1 s, and the dashed line shows the same waveform at t = 0.
Note that the peak that was originally at x = 0 has moved to x = 15
at t = 1 s.
[Plotted for x from −100 to 100 with amplitude from −1 to 1.]
To produce the peak at x = 0 and t = 0, the two component waves were exactly
in phase—their two maxima lined up exactly to produce the combined maximum
of the wave. At x = 15 units and t = 1 s, two individual maxima line up once
again to produce a combined maximum. They are not the same two maxima that
lined up to produce the maximum at t = 0, but it happens that two other maxima
are in phase at x = 15 units and t = 1 s to produce the combined maximum. If we
were to watch an animation of the wave, we would see the maximum originally
at x = 0 move gradually to x = 15 between t = 0 and t = 1 s.
We can understand how this occurs by writing Eq. 4.26 in a form similar to
Eq. 4.18 using trigonometric identities. The result is (again assuming A1 = A2 =
A, as we did in Eq. 4.18):
$$y(x, t) = 2A\cos\left(\frac{\Delta k}{2}x - \frac{\Delta\omega}{2}t\right)\cos\left(\frac{k_1 + k_2}{2}x - \frac{\omega_1 + \omega_2}{2}t\right) \tag{4.27}$$
As in Eq. 4.18, the second term in Eq. 4.27 represents the rapid variation of the
wave within the envelope given by the first term. It is the first term that dictates the
overall shape of the waveform, so it is this term that determines the speed of travel
of the waveform. For a wave that is written as cos(kx − ωt), the speed is ω/k. For
this wave envelope, the speed is $(\Delta\omega/2)/(\Delta k/2) = \Delta\omega/\Delta k$. This speed is called
the group speed of the wave packet. As we have seen, the group speed of the
wave packet can be very different from the phase speed of the component waves.
For more complicated situations than the two-component “beat” waveform, the
group speed can be generalized by turning the differences into differentials:
$$v_{\rm group} = \frac{d\omega}{dk} \tag{4.28}$$
The group speed depends on the relationship between frequency and wavelength
for the component waves. If the phase speed of all component waves is the same
and is independent of frequency or wavelength (as, for example, light waves
in empty space), then the group speed is identical to the phase speed and the
wave packet keeps its original shape as it travels. In general, the propagation
of a component wave depends on the properties of the medium, and different
component waves will travel with different speeds. Light waves in glass or sound
waves in most solids travel with a speed that varies with frequency or wavelength,
and so their wave packets change shape as they travel. De Broglie waves in general
have different phase speeds, so that their wave packets expand as they travel.
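The beat example of Figure 4.25 gives a quick numerical check of the group speed. The sketch below takes the wavelengths and phase speeds chosen in the text (9 and 11 units, 6 and 4 units/s) and forms $\Delta\omega/\Delta k$; the function name is an illustrative choice.

```python
import math


def group_and_phase_speeds(lam1, v1, lam2, v2):
    """Return (v_phase1, v_phase2, v_group) for a two-component beat pattern."""
    k1, k2 = 2.0 * math.pi / lam1, 2.0 * math.pi / lam2
    w1, w2 = v1 * k1, v2 * k2          # omega = v_phase * k for each component
    v_group = (w1 - w2) / (k1 - k2)    # delta-omega over delta-k
    return w1 / k1, w2 / k2, v_group


vp1, vp2, vg = group_and_phase_speeds(9.0, 6.0, 11.0, 4.0)
print(f"phase speeds: {vp1:.1f}, {vp2:.1f} units/s; group speed: {vg:.1f} units/s")
```

The group speed comes out as 15 units/s, matching the displacement of the beat maximum from x = 0 to x = 15 in 1 s, even though neither component wave travels faster than 6 units/s.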
Example 4.10
Certain ocean waves travel with a phase velocity $v_{\rm phase} = \sqrt{g\lambda/2\pi}$,
where $g$ is the acceleration due to gravity. What
is the group velocity of a “wave packet” of these waves?
Solution
With $k = 2\pi/\lambda$, we can write the phase velocity as a
function of $k$ as
$$v_{\rm phase} = \sqrt{g/k}$$
But with $v_{\rm phase} = \omega/k$, we have $\omega/k = \sqrt{g/k}$, so $\omega = \sqrt{gk}$
and Eq. 4.28 gives
$$v_{\rm group} = \frac{d\omega}{dk} = \frac{d}{dk}\sqrt{gk} = \frac{1}{2}\sqrt{\frac{g}{k}} = \frac{1}{2}\sqrt{\frac{g\lambda}{2\pi}}$$
Note that the group speed of the wave packet increases as
the wavelength increases.
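The result of Example 4.10 can be confirmed by differentiating the dispersion relation numerically. This sketch uses a central difference for $d\omega/dk$; the value g = 9.8 m/s² and the two sample wavelengths are illustrative assumptions, not values given in the example.

```python
import math

G = 9.8  # m/s^2, assumed value for the acceleration due to gravity


def omega(k):
    """Dispersion relation from Example 4.10: omega = sqrt(g*k)."""
    return math.sqrt(G * k)


def phase_speed(k):
    """v_phase = omega / k."""
    return omega(k) / k


def group_speed(k, rel_step=1e-6):
    """Central-difference estimate of d(omega)/dk."""
    dk = rel_step * k
    return (omega(k + dk) - omega(k - dk)) / (2.0 * dk)


for wavelength in (10.0, 100.0):   # metres; illustrative values
    k = 2.0 * math.pi / wavelength
    print(wavelength, phase_speed(k), group_speed(k))
```

For each wavelength the group speed is half the phase speed, and both grow as the wavelength increases.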
The Group Speed of de Broglie Waves
Suppose we have a localized particle, represented by a group of de Broglie
waves. For each component wave, the energy of the particle is related to the
frequency of the de Broglie wave by $E = hf = \hbar\omega$, and so $dE = \hbar\,d\omega$. Similarly,
the momentum of the particle is related to the wavelength of the de Broglie wave
by $p = h/\lambda = \hbar k$, so $dp = \hbar\,dk$. The group speed of the de Broglie wave then can
be expressed as
$$v_{\rm group} = \frac{d\omega}{dk} = \frac{dE/\hbar}{dp/\hbar} = \frac{dE}{dp} \tag{4.29}$$
For a classical particle having only kinetic energy $E = K = p^2/2m$, we can find
$dE/dp$ as
$$\frac{dE}{dp} = \frac{d}{dp}\left(\frac{p^2}{2m}\right) = \frac{p}{m} = v \tag{4.30}$$
which is the velocity of the particle.
Combining Eqs. 4.29 and 4.30 we obtain an important result:
$$v_{\rm group} = v_{\rm particle} \tag{4.31}$$
The speed of a particle is equal to the group speed of the corresponding wave
packet. The wave packet and the particle move together—wherever the particle
goes, its de Broglie wave packet moves along with it like a shadow. If we do a
wave-type experiment on the particle, the de Broglie wave packet is always there
to reveal the wave behavior of the particle. A particle can never escape its wave
nature!
The Spreading of a Moving Wave Packet
Suppose we have a wave packet that represents a confined particle at t = 0. For
example, the particle might have passed through a single-slit apparatus. Its initial
uncertainty in position is $\Delta x_0$ and its initial uncertainty in momentum is $\Delta p_{x0}$.
The wave packet moves in the x direction with velocity $v_x$, but that velocity is
not precisely known: the uncertainty in its momentum gives a corresponding
uncertainty in velocity, $\Delta v_{x0} = \Delta p_{x0}/m$. Because there is an uncertainty in the
velocity of the wave packet, we can’t be sure where it will be located at time
t. That is, its location at time t is $x = v_x t$, with velocity $v_x = v_{x0} \pm \Delta v_{x0}$. Thus
there are two contributions to the uncertainty in its location at time t: the initial
uncertainty $\Delta x_0$ and an additional amount equal to $\Delta v_{x0} t$ that represents the
spreading of the wave packet. We’ll assume that these two contributions add
quadratically, like experimental uncertainties, so that the total uncertainty in the
location of the particle is
$$\Delta x = \sqrt{(\Delta x_0)^2 + (\Delta v_{x0} t)^2} = \sqrt{(\Delta x_0)^2 + (\Delta p_{x0} t/m)^2} \tag{4.32}$$
Taking $\Delta p_{x0} = \hbar/\Delta x_0$ according to the uncertainty principle, we have
$$\Delta x = \sqrt{(\Delta x_0)^2 + (\hbar t/m\,\Delta x_0)^2} \tag{4.33}$$
If we try to make the wave packet very small at t = 0 ($\Delta x_0$ is small), then the second
term under the square root makes the wave packet expand rapidly, because $\Delta x_0$
appears in the denominator of that term. The more successful we are at confining
a wave packet, the more quickly it spreads. This reminds us of the single-slit
experiment discussed in Section 4.4: the narrower we make the slit, the more the
waves diverge after passing through the slit. Figure 4.26 shows how the size of the
wave packet expands with time for two different initial sizes, and you can see that
the smaller initial wave packet grows more rapidly than the larger initial packet.
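The growth described by Eq. 4.33 can be tabulated with a short Python sketch. It assumes units in which ħ/m = 1 (an assumption made here for illustration; the text does not state the units of Figure 4.26), and the function name `packet_width` is hypothetical.

```python
import math


def packet_width(dx0, t, hbar_over_m=1.0):
    """Total position uncertainty at time t from Eq. 4.33.

    dx0 is the initial width; hbar_over_m is hbar/m in the chosen units.
    """
    return math.sqrt(dx0 ** 2 + (hbar_over_m * t / dx0) ** 2)


# Tabulate the width at t = 0, 1, ..., 5 for three initial sizes.
for dx0 in (0.5, 1.0, 2.0):
    widths = [round(packet_width(dx0, t), 2) for t in range(6)]
    print(dx0, widths)
```

In these units the packet that starts smallest is the widest of the three by t = 5, because the spreading term varies as 1/Δx0.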
FIGURE 4.26 The smaller the initial wave packet, the more quickly it grows. (Plot of $\Delta x$ versus t, for t from 0 to 5, with curves for $\Delta x_0$ = 0.5, 1, and 2.)
4.7 PROBABILITY AND RANDOMNESS
Any single measurement of the position or momentum of a particle can be made
with as much precision as our experimental skill permits. How then does the
wavelike behavior of a particle become observable? How does the uncertainty in
position or momentum affect our experiment?
Suppose we prepare an atom by attaching an electron to a nucleus. (For
this example we regard the nucleus as being fixed in space.) Some time after
preparing our atom, we measure the position of the electron. We then repeat the
procedure, preparing the atom in an identical way, and find that a remeasurement
of the position of the electron yields a value different from that found in our
first measurement. In fact, each time we repeat the measurement, we may obtain
a different outcome. If we repeat the measurement a large number of times, we
find ourselves led to a conclusion that runs counter to a basic notion of classical
physics—systems that are prepared in identical ways do not show identical
subsequent behavior. What hope do we then have of constructing a mathematical
theory that has any usefulness at all in predicting the outcome of a measurement,
if that outcome is completely random?
The solution to this dilemma lies in the consideration of the probability of
obtaining any given result from an experiment whose possible results are subject
to the laws of statistics. We cannot predict the outcome of a single flip of a
coin or roll of the dice, because any single result is as likely as any other single
result. We can, however, predict the distribution of a large number of individual
measurements. For example, on a single flip of a coin, we cannot predict whether
the outcome will be “heads” or “tails”; the two are equally likely. If we make a
large number of trials, we expect that approximately 50% will turn up “heads” and
50% will yield “tails”; even though we cannot predict the result of any single toss
of the coin, we can predict reasonably well the result of a large number of tosses.
Our study of systems governed by the laws of quantum physics leads us to a
similar situation. We cannot predict the outcome of any single measurement of
the position of the electron in the atom we prepared, but if we do a large number
of measurements, we ought to find a statistical distribution of results. We cannot
develop a mathematical theory that predicts the result of a single measurement, but
we do have a mathematical theory that predicts the statistical behavior of a system
(or of a large number of identical systems). The quantum theory provides this
mathematical procedure, which enables us to calculate the average or probable
outcome of measurements and the distribution of individual outcomes about the
average. This is not such a disadvantage as it may seem, for in the realm of
quantum physics, we seldom do measurements with, for example, a single atom.
If we were studying the emission of light by a radiant system or the properties
of a solid or the scattering of nuclear particles, we would be dealing with a large
number of atoms, and so our concept of statistical averages is very useful.
In fact, such concepts are not as far removed from our daily lives as we might
think. For example, what is meant when the TV weather forecaster “predicts”
a 50% chance of rain tomorrow? Will it rain 50% of the time, or over 50%
of the city? The proper interpretation of the forecast is that the existing set of
atmospheric conditions will, in a large number of similar cases, result in rain in
about half the cases. A surgeon who asserts that a patient has a 50% chance of
surviving an operation means exactly the same thing—experience with a large
number of similar cases suggests recovery in about half.
Quantum mechanics uses similar language. For example, if we say that the
electron in a hydrogen atom has a 50% probability of circulating in a clockwise
direction, we mean that in observing a large collection of similarly prepared atoms
we find 50% to be circulating clockwise. Of course a single measurement shows
either clockwise or counterclockwise circulation. (Similarly, it either rains or it
doesn’t; the patient either lives or dies.)
Of course, one could argue that the flip of a coin or the roll of the dice is not
a random process, but that the apparently random nature of the outcome simply
reflects our lack of knowledge of the state of the system. For example, if we knew
exactly how the dice were thrown (magnitude and direction of initial velocity,
initial orientation, rotational speed) and precisely what the laws are that govern
their bouncing on the table, we should be able to predict exactly how they would
land. (Similarly, if we knew a great deal more about atmospheric physics or
physiology, we could predict with certainty whether or not it will rain tomorrow
or an individual patient will survive.) When we instead analyze the outcomes in
terms of probabilities, we are really admitting our inability to do the analysis
exactly. There is a school of thought that asserts that the same situation exists in
quantum physics. According to this interpretation, we could predict exactly the
behavior of the electron in our atom if only we knew the nature of a set of so-called
“hidden variables” that determine its motion. However, experimental evidence
disagrees with this theory, and so we must conclude that the random behavior of
a system governed by the laws of quantum physics is a fundamental aspect of
nature and not a result of our limited knowledge of the properties of the system.
The Probability Amplitude
What does the amplitude of the de Broglie wave represent? In any wave
phenomenon, a physical quantity such as displacement or pressure varies with
location and time. What is the physical property that varies as the de Broglie wave
propagates?
A localized particle is represented by a wave packet. If a particle is confined
to a region of space of dimension $\Delta x$, its wave packet has large amplitude only in
a region of space of dimension $\Delta x$ and has small amplitude elsewhere. That is,
the amplitude is large where the particle is likely to be found and small where the
particle is less likely to be found. The probability of finding the particle at any
point depends on the amplitude of its de Broglie wave at that point. In analogy
with classical physics, in which the intensity of any wave is proportional to the
square of its amplitude, we have
$$\text{probability to observe particles} \propto |\text{de Broglie wave amplitude}|^2$$
Compare this with the similar relationship for photons discussed in Section 3.6:
$$\text{probability to observe photons} \propto |\text{electric field amplitude}|^2$$
FIGURE 4.27 The buildup of an electron interference pattern as increasing numbers of electrons are detected: (a) 100
electrons; (b) 3000 electrons; (c) 70,000 electrons. (Reprinted with permission from Akira Tonomura, Hitachi, Ltd,
T. Matsuda and T. Kawasaki, Advanced Research Laboratory. From American Journal of Physics 57, 117. (Copyright
1989) American Association of Physics Teachers.)
Just as the electric field amplitude of an electromagnetic wave indicates regions
of high and low probability for observing photons, the de Broglie wave performs
the same function for particles. Figure 4.27 illustrates this effect, as individual
electrons in a double-slit type of experiment eventually produce the characteristic
interference fringes. The path of each electron is guided by its de Broglie wave
toward the allowed regions of high probability. This statistical effect is not
apparent for a small number of electrons, but it becomes quite apparent when a
large number of electrons has been detected.
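The statistical buildup of the pattern can be imitated with a small Python sketch. It is only an illustration under stated assumptions: the detection probability is taken to be an idealized cos² two-slit pattern (no single-slit envelope), positions are measured in units of the fringe spacing, and all function names are hypothetical. It does not model the actual apparatus of Figure 4.27.

```python
import math
import random


def intensity(x, fringe_spacing=1.0):
    """Idealized two-slit pattern: relative probability proportional to cos^2."""
    return math.cos(math.pi * x / fringe_spacing) ** 2


def detect_electrons(n, rng, half_width=2.0):
    """Draw n detection positions by rejection sampling from intensity(x)."""
    hits = []
    while len(hits) < n:
        x = rng.uniform(-half_width, half_width)
        # Accept the candidate with probability equal to the squared amplitude.
        if rng.random() < intensity(x):
            hits.append(x)
    return hits


def fraction_near_maxima(hits, fringe_spacing=1.0):
    """Fraction of hits within a quarter fringe of a bright-fringe center."""
    near = 0
    for x in hits:
        u = x / fringe_spacing
        if abs(u - round(u)) < 0.25:
            near += 1
    return near / len(hits)


rng = random.Random(0)  # fixed seed so that the run is repeatable
for n in (100, 3000, 70000):
    hits = detect_electrons(n, rng)
    print(n, round(fraction_near_maxima(hits), 3))
```

No single detection position can be predicted, but for this idealized pattern the fraction of hits near the bright fringes settles toward 1/2 + 1/π (about 0.82) as the number of detected electrons grows.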
In the next chapter we discuss the mathematical framework for computing the
wave amplitudes for a particle in various situations, and we also develop a more
rigorous mathematical definition of the probability.
Chapter Summary
Quantity                                     Expression                                                        Section
De Broglie wavelength                        $\lambda = h/p$                                                   4.1
Single slit diffraction                      $a \sin\theta = n\lambda$, $n = 1, 2, 3, \ldots$                  4.2
Classical position-wavelength uncertainty    $\Delta x\,\Delta\lambda \sim \varepsilon\lambda^2$               4.3
Classical frequency-time uncertainty         $\Delta f\,\Delta t \sim \varepsilon$                             4.3
Heisenberg position-momentum uncertainty     $\Delta x\,\Delta p_x \sim \hbar$                                 4.4
Heisenberg energy-time uncertainty           $\Delta E\,\Delta t \sim \hbar$                                   4.4
Statistical momentum uncertainty             $\Delta p_x = \sqrt{(p_x^2)_{\text{av}} - (p_{x,\text{av}})^2}$   4.4
Wave packet (discrete k)                     $y(x) = \sum A_i \cos k_i x$                                      4.5
Wave packet (continuous k)                   $y(x) = \int A(k) \cos kx\,dk$                                    4.5
Group speed of wave packet                   $v_{\text{group}} = d\omega/dk$                                   4.6
Questions
1. When an electron moves with a certain de Broglie wavelength, does any aspect of the electron’s motion vary with
that wavelength?
2. Imagine a different world in which the laws of quantum
physics still apply, but which has h = 1 J · s. What might be
some of the difficulties of life in such a world? (See Mr.
Tompkins in Paperback by George Gamow for an imaginary
account of such a world.)
3. Suppose we try to measure an unknown frequency f by listening for beats between f and a known (and controllable) frequency f′. (We assume f′ is known to arbitrarily small uncertainty.) The beat frequency is |f′ − f|. If we hear no beats, then we conclude that f = f′. (a) How long must we listen to hear “no” beats? (b) If we hear no beats in one second, how accurately have we determined f? (c) If we hear no beats in 10 s, how accurately? In 100 s? (d) How is this experiment related to Eq. 4.7?
4. What difficulties does the uncertainty principle cause in trying to pick up an electron with a pair of forceps?
5. Does the uncertainty principle apply to nature itself or only to the results of experiments? That is, is it the position and momentum that are really uncertain, or merely our knowledge of them? What is the difference between these two interpretations?
6. The uncertainty principle states in effect that the more we try to confine an object, the faster we are likely to find it moving. Is this why you can’t seem to keep money in your pocket for long? Make a numerical estimate.
7. Consider a collection of gas molecules trapped in a container. As we move the walls of the container closer together (compressing the gas) the molecules move faster (the temperature increases). Does the gas behave this way because of the uncertainty principle? Justify your answer with some numerical estimates.
8. Many nuclei are unstable and undergo radioactive decay to other nuclei. The lifetimes for these decays are typically of the order of days to years. Do you expect that the uncertainty principle will cause a measurable effect in the precision to which we can measure the masses of atoms of these nuclei?
9. Just as the classical limit of relativity can be achieved by letting c → ∞, the classical limit of quantum behavior is achieved by letting h → 0. Consider the following in the h → 0 limit and explain how they behave classically: the size of the energy quantum of an electromagnetic wave, the de Broglie wavelength of an electron, the Heisenberg uncertainty relationships.
10. Assume the electron beam in a television tube is accelerated through a potential difference of 25 kV and then passes through a deflecting capacitor of interior width 1 cm. Are diffraction effects important in this case? Justify your answer with a calculation.
11. The structure of crystals can be revealed by X-ray diffraction (Figures 3.7 and 3.8), electron diffraction (Figure 4.2), and neutron diffraction (Figure 4.7). In what ways do these experiments reveal similar structure? In what ways are they different?
12. Often it happens in physics that great discoveries are made inadvertently. What would have happened if Davisson and Germer had their accelerating voltage set below 32 V?
13. Suppose we cover one slit in the two-slit electron experiment with a very thin sheet of fluorescent material that emits a photon of light whenever an electron passes through. We then fire electrons one at a time at the double slit; whether or not we see a flash of light tells us which slit the electron went through. What effect does this have on the interference pattern? Why?
14. In another attempt to determine through which slit the electron passes, we suspend the double slit itself from a very fine spring balance and measure the “recoil” momentum of the slit as a result of the passage of the electron. Electrons that strike the screen near the center must cause recoils in opposite directions depending on which slit they pass through. Sketch such an apparatus and describe its effect on the interference pattern. (Hint: Consider the uncertainty principle $\Delta x\,\Delta p_x \sim \hbar$ as applied to the motion of the slits suspended from the spring. How precisely do we know the position of the slit?)
15. Is it possible for $v_{\text{phase}}$ to be greater than c? Can $v_{\text{group}}$ be greater than c?
16. In a nondispersive medium, $v_{\text{group}} = v_{\text{phase}}$; this is another way of saying that all waves travel with the same phase velocity, no matter what their wavelengths. Is this true for (a) de Broglie waves? (b) Light waves in glass? (c) Light waves in vacuum? (d) Sound waves in air? What difficulties would be encountered in attempting communication (by speech or by radio signals for example) in a strongly dispersive medium?
Problems
4.1 De Broglie’s Hypothesis
1. Find the de Broglie wavelength of (a) a 5-MeV proton; (b) a 50-GeV electron; (c) an electron moving at
$v = 1.00 \times 10^6$ m/s.
2. The neutrons produced in a reactor are known as thermal
neutrons, because their kinetic energies have been reduced
(by collisions) until $K = \frac{3}{2}kT$, where T is room temperature
(293 K). (a) What is the kinetic energy of such neutrons?
(b) What is their de Broglie wavelength? Because this
wavelength is of the same order as the lattice spacing of
the atoms of a solid, neutron diffraction (like X-ray and
electron diffraction) is a useful means of studying solid
lattices.
3. By doing a nuclear diffraction experiment, you measure
the de Broglie wavelength of a proton to be 9.16 fm. (a)
What is the speed of the proton? (b) Through what potential
difference must it be accelerated to achieve that speed?
4. A proton is accelerated from rest through a potential
difference of $-2.36 \times 10^5$ V. What is its de Broglie
wavelength?
4.2 Experimental Evidence for de Broglie Waves
5. Find the potential difference through which electrons must
be accelerated (as in an electron microscope, for example)
if we wish to resolve: (a) a virus of diameter 12 nm; (b) an
atom of diameter 0.12 nm; (c) a proton of diameter 1.2 fm.
6. In an electron microscope we wish to study particles of diameter about 0.10 μm (about 1000 times the size of a single
atom). (a) What should be the de Broglie wavelength of the
electrons? (b) Through what potential difference should the
electrons be accelerated to have that de Broglie wavelength?
7. In order to study the atomic nucleus, we would like to
observe the diffraction of particles whose de Broglie wavelength is about the same size as the nuclear diameter, about
14 fm for a heavy nucleus such as lead. What kinetic energy
should we use if the diffracted particles are (a) electrons?
(b) Neutrons? (c) Alpha particles (m = 4 u)?
8. In the double-slit interference pattern for helium atoms
(Figure 4.13), the kinetic energy of the beam of atoms
was 0.020 eV. (a) What is the de Broglie wavelength of a
helium atom with this kinetic energy? (b) Estimate the de
Broglie wavelength of the atoms from the fringe spacing
in Figure 4.13, and compare your estimate with the value
obtained in part (a). The distance from the double slit to the
scanning slit is 64 cm.
9. Suppose we wish to do a double-slit experiment with a beam
of the smoke particles of Example 4.1c. Assume we can
construct a double slit whose separation is about the same
size as the particles. Estimate the separation between the
fringes if the double slit and the screen were on opposite
coasts of the United States.
10. In the Davisson-Germer experiment using a Ni crystal, a
second-order beam is observed at an angle of 55°. For what
accelerating voltage does this occur?
11. A certain crystal is cut so that the rows of atoms on its
surface are separated by a distance of 0.352 nm. A beam
of electrons is accelerated through a potential difference of
175 V and is incident normally on the surface. If all possible
diffraction orders could be observed, at what angles (relative
to the incident beam) would the diffracted beams be found?
4.3 Uncertainty Relationships for Classical Waves
12. Suppose a traveling wave has a speed v (where v = λf ).
Instead of measuring waves over a distance $\Delta x$, we stay in
one place and count the number of wave crests that pass in
a time $\Delta t$. Show that Eq. 4.7 is equivalent to Eq. 4.4 for this
case.
13. Sound waves travel through air at a speed of 330 m/s. A
whistle blast at a frequency of about 1.0 kHz lasts for 2.0 s.
(a) Over what distance in space does the “wave train” representing the sound extend? (b) What is the wavelength of the
sound? (c) Estimate the precision with which an observer
could measure the wavelength. (d) Estimate the precision
with which an observer could measure the frequency.
14. A stone tossed into a body of water creates a disturbance at
the point of impact that lasts for 4.0 s. The wave speed is
25 cm/s. (a) Over what distance on the surface of the water
does the group of waves extend? (b) An observer counts 12
wave crests in the group. Estimate the precision with which
the wavelength can be determined.
15. A radar transmitter emits a pulse of electromagnetic radiation with wavelength 0.225 m. The pulses have a duration of
1.17 μs. The receiver is set to accept a range of frequencies
about the central frequency. To what range of frequencies
should the receiver be set?
16. Estimate the signal processing time that would be necessary
if you want to design a device to measure frequencies to a
precision of no worse than 10,000 Hz.
4.4 Heisenberg Uncertainty Relationships
17. The speed of an electron is measured to within an uncertainty
of $2.0 \times 10^4$ m/s. What is the size of the smallest region of
space in which the electron can be confined?
18. An electron is confined to a region of space of the size of an
atom (0.1 nm). (a) What is the uncertainty in the momentum
of the electron? (b) What is the kinetic energy of an electron with a momentum equal to $\Delta p$? (c) Does this give a
reasonable value for the kinetic energy of an electron in an
atom?
19. The $\Sigma^*$ particle has a rest energy of 1385 MeV and a lifetime
of $2.0 \times 10^{-23}$ s. What would be a typical range of outcomes
of measurements of the $\Sigma^*$ rest energy?
20. A pi meson (pion) and a proton can briefly join together to
form a $\Delta$ particle. A measurement of the energy of the $\pi p$
system (Figure 4.28) shows a peak at 1236 MeV, corresponding to the rest energy of the $\Delta$ particle, with an experimental
spread of 120 MeV. What is the lifetime of the $\Delta$?
FIGURE 4.28 Problem 20. (Plot of reaction probability versus energy from 1000 to 1600 MeV, showing a peak of width 120 MeV.)
21. A nucleus emits a gamma ray of energy 1.0 MeV from a
state that has a lifetime of 1.2 ns. What is the uncertainty
in the energy of the gamma ray? The best gamma-ray
detectors can measure gamma-ray energies to a precision of
no better than a few eV. Will this uncertainty be directly
measurable?
22. In special conditions (see Section 12.9), it is possible to
measure the energy of a gamma-ray photon to 1 part in
$10^{15}$. For a photon energy of 50 keV, estimate the maximum
lifetime that could be determined by a direct measurement
of the spread of the photon energy.
23. Alpha particles are emitted in nuclear decay processes with
typical energies of 5 MeV. In analogy with Example 4.7,
deduce whether the alpha particle can exist inside the
nucleus.
4.5 Wave Packets
24. Use a distribution of wave numbers of constant amplitude
in a range $\Delta k$ about $k_0$:
$$A(k) = \begin{cases} A_0 & k_0 - \dfrac{\Delta k}{2} \le k \le k_0 + \dfrac{\Delta k}{2} \\ 0 & \text{otherwise} \end{cases}$$
and obtain Eq. 4.24 from Eq. 4.23.
25. Use the distribution of wave numbers $A(k) = A_0 e^{-(k-k_0)^2/2(\Delta k)^2}$ for $k = -\infty$ to $+\infty$ to derive Eq. 4.25.
26. Do the trigonometric manipulation necessary to obtain
Eq. 4.18.
4.6 The Motion of a Wave Packet
27. Show that the data used in Figure 4.25 are consistent with
Eq. 4.27; that is, use λ1 = 9 and λ2 = 11, v1 = 6 and v2 = 4
to show that vgroup = 15.
28. (a) Show that the group velocity and phase velocity are
related by:
$$v_{\text{group}} = v_{\text{phase}} - \lambda \frac{dv_{\text{phase}}}{d\lambda}$$
(b) When white light travels through glass, the phase velocity of each wavelength depends on the wavelength. (This is
the origin of dispersion and the breaking up of white light
into its component colors—different wavelengths travel at
different speeds and have different indices of refraction.)
How does vphase depend on λ? Is dvphase /dλ positive or
negative? Therefore, is vgroup > vphase or < vphase ?
29. Certain surface waves in a fluid travel with phase velocity
$\sqrt{b/\lambda}$, where b is a constant. Find the group velocity of a
packet of surface waves, in terms of the phase velocity.
30. By a calculation similar to that of Eq. 4.30, show that dE/dp
= v remains valid when E represents the relativistic kinetic
energy of the particle.
General Problems
31. A free electron bounces elastically back and forth in one
dimension between two walls that are L = 0.50 nm apart.
(a) Assuming that the electron is represented by a de Broglie
standing wave with a node at each wall, show that the permitted de Broglie wavelengths are λn = 2L/n (n = 1, 2, 3, . . .).
(b) Find the values of the kinetic energy of the electron for
n = 1, 2, and 3.
32. A beam of thermal neutrons (see Problem 2) emerges from
a nuclear reactor and is incident on a crystal as shown in
Figure 4.29. The beam is Bragg scattered, as in Figure 3.5,
from a crystal whose scattering planes are separated by
0.247 nm. From the continuous energy spectrum of the
beam we wish to select neutrons of energy 0.0105 eV. Find
the Bragg-scattering angle that results in a scattered beam
of this energy. Will other energies also be present in the
scattered beam at that angle?
FIGURE 4.29 Problem 32. (Diagram labels: reactor, graphite, shielding, neutron beam, scattering crystal, and scattering angle θ.)
33. (a) Find the de Broglie wavelength of a nitrogen molecule
in air at room temperature (293 K). (b) The density of air at
room temperature and atmospheric pressure is $1.292\ \text{kg/m}^3$.
Find the average distance between air molecules at this temperature and compare with the de Broglie wavelength. What
do you conclude about the importance of quantum effects
in air at room temperature? (c) Estimate the temperature at
which quantum effects might become important.
34. In designing an experiment, you want a beam of photons and
a beam of electrons with the same wavelength of 0.281 nm,
equal to the separation of the Na and Cl ions in a crystal of
NaCl. Find the energy of the photons and the kinetic energy
of the electrons.
35. A nucleus of helium with mass 5 u breaks up from rest into
a nucleus of ordinary helium (mass = 4 u) plus a neutron
(mass = 1 u). The rest energy liberated in the break-up is
0.89 MeV, which is shared (not equally) by the products.
(a) Using energy and momentum conservation, find the
kinetic energy of the neutron. (b) The lifetime of the original nucleus is $1.0 \times 10^{-21}$ s. What range of values of the
neutron kinetic energy might we measure in the laboratory
as a result of the uncertainty relationship?
36. In a metal, the conduction electrons are not attached to any
one atom, but are relatively free to move throughout the
entire metal. Consider a cube of copper measuring 1.0 cm on
each edge. (a) What is the uncertainty in any one component
of the momentum of an electron confined to the metal?
(b) Estimate the average kinetic energy of an electron in
the metal. (Assume $\Delta p = [(\Delta p_x)^2 + (\Delta p_y)^2 + (\Delta p_z)^2]^{1/2}$.)
(c) Assuming the heat capacity of copper to be
24.5 J/mole · K, would the contribution of this motion to
the internal energy of the copper be important at room
temperature? What do you conclude from this? (See also
Problem 38.)
37. A proton or a neutron can sometimes “violate” conservation
of energy by emitting and then reabsorbing a pi meson,
which has a mass of 135 MeV/$c^2$. This is possible as long
as the pi meson is reabsorbed within a short enough time
$\Delta t$ consistent with the uncertainty principle. (a) Consider
$p \to p + \pi$. By what amount $\Delta E$ is energy conservation
violated? (Ignore any kinetic energies.) (b) For how long
a time $\Delta t$ can the pi meson exist? (c) Assuming the pi
meson to travel at very nearly the speed of light, how far
from the proton can it go? (This procedure, as we discuss in
Chapter 12, gives us an estimate of the range of the nuclear
force, because protons and neutrons are held together in the
nucleus by exchanging pi mesons.)
38. In a crystal, the atoms are a distance L apart; that is, each
atom must be localized to within a distance of at most L.
(a) What is the minimum uncertainty in the momentum of
the atoms of a solid that are 0.20 nm apart? (b) What is
the average kinetic energy of such an atom of mass 65 u?
(c) What would a collection of such atoms contribute to
the internal energy of a typical solid, such as copper? Is
this contribution important at room temperature? (See also
Problem 36.)
39. An apparatus is used to prepare an atomic beam by heating a collection of atoms to a temperature T and allowing
the beam to emerge through a hole of diameter d in one
side of the oven. The beam then travels through a straight
path of length L. Show that the uncertainty principle causes
the diameter of the beam at the end of the path to be
larger than d by an amount of order $L\hbar/(d\sqrt{3mkT})$, where
m is the mass of an atom. Make a numerical estimate
for typical values of T = 1500 K, m = 7 u (lithium atoms),
d = 3 mm, L = 2 m.
Chapter 5
THE SCHRÖDINGER EQUATION
Quantum mechanics provides a mathematical framework in which the description of a
process often includes different and possibly contradictory outcomes. A favorite illustration
of that situation is the case of Schrödinger’s cat. The cat is confined in a chamber with a
radioactive atom, the decay of which will trigger the release of poison from a vial. Because
we don’t know exactly when that decay will occur, until an observation of the condition of
the cat is made the quantum-mechanical description of the cat must include both “cat alive”
and “cat dead” components.
FIGURE 5.1 (a) A light wave in air is incident on a slab of glass, showing transmitted and reflected waves at the two boundaries (A and B). (b) A surface wave in water incident on a region of smaller depth similarly has transmitted and reflected waves. (c) The de Broglie waves of electrons moving from a region of constant zero potential to a region of constant negative potential $-V_0$ also have transmitted and reflected components.
The future behavior of a particle in a classical (nonrelativistic, nonquantum)
situation may be predicted with absolute certainty using Newton’s laws. If a
particle interacts with its environment through a known force $\vec{F}$ (which might
be associated with a potential energy U), we can do the mathematics necessary
to solve Newton’s second law, $\vec{F} = d\vec{p}/dt$ (a second-order, linear differential
equation), and find the particle’s location $\vec{r}(t)$ and velocity $\vec{v}(t)$ at all future times
t. The mathematics may be difficult, and in fact it may not be possible to solve the
equations in closed form (in which case an approximate solution can be obtained
with the help of a computer). Aside from any such mathematical difficulties, the
physics of the problem consists of writing down the original equation $\vec{F} = d\vec{p}/dt$
and interpreting its solutions $\vec{r}(t)$ and $\vec{v}(t)$. For example, a satellite or planet
moving under the influence of a $1/r^2$ gravitational force can be shown, after the
equations have been solved, to follow exactly an elliptical path.
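To make the idea of an approximate computer solution concrete, here is a minimal Python sketch that integrates Newton's second law for a body in a 1/r² force field. The velocity-Verlet scheme, the units (GM = 1), the step size, and the circular initial conditions are all choices made here for illustration; the text does not prescribe a numerical method.

```python
import math


def acceleration(x, y, gm=1.0):
    """Acceleration from an attractive 1/r^2 force directed toward the origin."""
    r = math.hypot(x, y)
    return -gm * x / r ** 3, -gm * y / r ** 3


def integrate_orbit(x, y, vx, vy, dt, steps):
    """Advance position and velocity in time with the velocity-Verlet scheme."""
    ax, ay = acceleration(x, y)
    for _ in range(steps):
        # Update position using the current velocity and acceleration.
        x += vx * dt + 0.5 * ax * dt * dt
        y += vy * dt + 0.5 * ay * dt * dt
        # Update velocity using the average of old and new accelerations.
        ax_new, ay_new = acceleration(x, y)
        vx += 0.5 * (ax + ax_new) * dt
        vy += 0.5 * (ay + ay_new) * dt
        ax, ay = ax_new, ay_new
    return x, y, vx, vy


dt = 0.001
steps = int(2.0 * math.pi / dt)  # roughly one period of the unit circular orbit
x, y, vx, vy = integrate_orbit(1.0, 0.0, 0.0, 1.0, dt, steps)
print(math.hypot(x, y))  # orbital radius after about one revolution
```

With these initial conditions the exact orbit is a circle of radius 1 (a special case of an ellipse), so the printed radius gives a rough measure of the numerical error.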
In the case of nonrelativistic quantum physics, the basic equation to be solved
is a second-order differential equation known as the Schrödinger equation. Like
Newton’s laws, the Schrödinger equation is written for a particle interacting with
its environment, although we describe the interaction in terms of the potential
energy rather than the force. Unlike Newton’s laws, the Schrödinger equation
does not give the trajectory of the particle; instead, its solution gives the wave
function of the particle, which carries information about the particle’s wavelike
behavior. In this chapter we introduce the Schrödinger equation, obtain some of its
solutions for certain potential energies, and learn how to interpret those solutions.
5.1 BEHAVIOR OF A WAVE AT A BOUNDARY
In studying wave motion, we often must analyze what occurs when a wave
moves from one region or medium to a different region or medium in which the
properties of the wave may change. For example, when a light wave moves from
air into glass, its wavelength and the amplitude of its electric field both decrease.
At every such boundary, a portion of the incident wave intensity is transmitted
into the second medium and a portion is reflected back into the first medium.
Let’s consider the case of a light wave incident on a glass plate, as in
Figure 5.1a. At boundary A, the light wave moves from air (region 1) into glass
(region 2), while at B the light wave moves from glass into air (region 3). The
wavelength in air in region 3 is the same as the original wavelength of the incident
wave in region 1, but the amplitude in region 3 is less than the amplitude in
region 1, because some of the intensity is reflected at A and at B.
Other types of waves show similar behavior. For example, Figure 5.1b shows
a surface water wave that moves into a region of shallower depth. In that region,
its wavelength is smaller (but its amplitude is larger) compared with the original
incident wave. When the wave enters region 3, in which the depth is the same as
in region 1, the wavelength returns to its original value, but the amplitude of the
wave is smaller in region 3 than in region 1 because some of the intensity was
reflected at the two boundaries.
The same type of behavior occurs for de Broglie waves that characterize
particles. Consider, for example, the apparatus shown in Figure 5.1c. Electrons
are incident from the left and move inside a narrow metal tube that is at ground
potential (V = 0). Another narrow tube in region 2 is connected to the negative
terminal of a battery, which maintains it at a uniform potential of $-V_0$. Region 3
is connected to region 1 at ground potential. The gaps between the tubes can in
principle be made so small that we can regard the changes in potential at A and B as
occurring suddenly. In region 1, the electrons have kinetic energy K, momentum
$p = \sqrt{2mK}$, and de Broglie wavelength $\lambda = h/p$. In region 2, the potential energy
for the electrons is $U = qV = (-e)(-V_0) = +eV_0$. We assume that the original
kinetic energy of the electrons in region 1 is greater than $eV_0$, so that the electrons
move into region 2 with a smaller kinetic energy (equal to $K - eV_0$), a smaller
momentum, and thus a greater wavelength. When the electrons move from region
2 into region 3, they gain back the lost kinetic energy and move with their original
kinetic energy K and thus with their original wavelength. As in the case of the
light wave or the water wave, the amplitude of the de Broglie wave in region 3
is smaller than in region 1, meaning that the current of electrons in region 3 is
smaller than the incident current, because some of the electrons are reflected at
the boundaries at A and B.
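The change in wavelength across the three regions can be sketched numerically. The following is a minimal illustration, not part of the original apparatus description: the kinetic energy K = 10 eV and the step eV0 = 4 eV are assumed values chosen only so that K > eV0, and the function name is hypothetical.

```python
import math

HC_EV_NM = 1239.841984   # h*c in eV*nm
MC2_EV = 510998.95       # electron rest energy m*c^2 in eV

def de_broglie_wavelength_nm(kinetic_energy_ev):
    """Nonrelativistic de Broglie wavelength, lambda = h / sqrt(2 m K), in nm."""
    if kinetic_energy_ev <= 0:
        raise ValueError("kinetic energy must be positive in an allowed region")
    # h / sqrt(2 m K) rewritten as (h c) / sqrt(2 (m c^2) K)
    return HC_EV_NM / math.sqrt(2.0 * MC2_EV * kinetic_energy_ev)

K = 10.0     # assumed kinetic energy in regions 1 and 3 (eV)
eV0 = 4.0    # assumed potential energy in region 2 (eV), smaller than K

lambda_1 = de_broglie_wavelength_nm(K)         # region 1
lambda_2 = de_broglie_wavelength_nm(K - eV0)   # region 2: smaller K, longer wavelength
lambda_3 = de_broglie_wavelength_nm(K)         # region 3: original wavelength restored

print(f"region 1: {lambda_1:.4f} nm")
print(f"region 2: {lambda_2:.4f} nm")
print(f"region 3: {lambda_3:.4f} nm")
```

With these assumed values the wavelength grows in region 2 and returns to its original value in region 3, as described above. The sketch says nothing about the amplitudes, which require the boundary conditions discussed later in this section.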
We can thus identify a total of 5 waves moving in the three regions: (1) a wave
moving to the right in region 1 (the incident wave); (2) a wave moving to the left
in region 1 (representing the net combination of waves reflected from boundary A
plus waves reflected from boundary B and then transmitted through boundary A
back into region 1); (3) a wave moving to the right in region 2 (representing waves
transmitted through boundary A plus waves reflected at B and then reflected again
at A); (4) a wave moving to the left in region 2 (waves reflected at B); and (5)
a wave moving to the right in region 3 (the transmitted waves at boundary B).
Because we are assuming that waves are incident from region 1, it is not possible
to have a wave moving to the left in region 3.
Penetration of the Reflected Wave
Another property of classical waves that carries over into quantum waves is
penetration of a totally reflected wave into a forbidden region. When a light
wave is completely reflected from a boundary, an exponentially decreasing wave
called the evanescent wave penetrates into the second medium. Because 100%
of the light wave intensity is reflected, the evanescent wave carries no energy
and so cannot be directly observed in the second medium. But if we make the
second medium very thin (perhaps equal to a few wavelengths of light) the light
wave can emerge on the opposite side of the second medium. We’ll discuss this
phenomenon in more detail at the end of this chapter.
The same effect occurs with de Broglie waves. Suppose we increase the battery
voltage in Figure 5.1c so that the potential energy in region 2 (equal to eV0 )
is greater than the initial kinetic energy in region 1. The electrons do not have
enough energy to enter region 2 (they would have negative kinetic energy there) and so
all electrons are reflected back into region 1.
Like light waves, de Broglie waves can also penetrate into the forbidden region
with exponentially decreasing amplitudes. However, because de Broglie waves
are associated with the motion of electrons, that means that electrons must also
penetrate a short distance into the forbidden region. The electrons cannot be
directly observed in that region, because they have negative kinetic energy there.
Nor can we do any experiment that would reveal their “real” existence in the
forbidden region, such as measuring the speed of their passage through that region
or detecting the magnetic field that their motion might produce.
One explanation for the penetration of the electrons into the forbidden region
relies on the uncertainty principle—because we can’t know exactly the energy of
the incident electrons, we can’t say with certainty that they don’t have enough
kinetic energy to penetrate into the forbidden region. For a short enough time Δt, the
energy uncertainty ΔE ∼ ħ/Δt might allow the electron to travel a short distance
into the forbidden region, but this extra energy does not “belong to” the electron
in any permanent sense. Later in this chapter we’ll discuss a more mathematical
approach to this explanation of penetration into the forbidden region.
Continuity at the Boundaries
When a wave such as a light wave or a water wave crosses a boundary as in
Figure 5.1, the mathematical function that describes the wave must have two
properties at each boundary:
1. The wave function must be continuous.
2. The slope of the wave function must be continuous, except when the boundary
height is infinite.
FIGURE 5.2 (a) A discontinuous wave. (b) A continuous wave with a discontinuous slope. (c) Two sine waves join smoothly. (d) A sine wave and an exponential join smoothly.
FIGURE 5.3 The position and velocity of a ball dropped from a height H above a springlike rubber sheet at y = 0.
Figure 5.2a shows a discontinuous wave function; the wave displacement
changes suddenly at a single location. This type of behavior is not allowed.
Figure 5.2b shows a continuous wave function (there are no gaps) with a
discontinuous slope. This type of behavior is also not allowed, unless the
boundary is of infinite height. Figures 5.2c, d show how two sine curves and an
exponential and a sine can be joined so that both the function and the slope are
continuous.
Across any non-infinite boundary, the wave must be smooth—no gaps in the
function and no sharp changes in slope. When we solve for the mathematical
form of a wave function, there are usually undetermined parameters, such as
the amplitude and phase of the wave. In order to make the wave smooth at
the boundary, we obtain the values of those coefficients by applying the two
boundary conditions to make the function and its slope continuous. For example,
at boundary A in Figure 5.1, we first evaluate the total wave function in region 1
at A and set it equal to the wave function in region 2 at A. This guarantees that the
total wave function is continuous at A. We then take the derivative of the wave
function in region 1, evaluate it at A, and set that equal to the derivative of the
wave function in region 2 evaluated at A. This step makes the slope in region 1
match the slope in region 2 at boundary A. These two steps give us two equations
relating the parameters of the waves and allow us to find relationships between
the amplitudes and phases of the waves in regions 1 and 2. The process must
be repeated at every boundary, such as at B in Figure 5.1 to match the waves in
regions 2 and 3.
We can understand the exception to the continuity of the slope for infinite
boundaries with an example from classical physics. Imagine a ball dropped from
a height y = H above a stretched rubber sheet at y = 0. The ball falls freely
under gravity until it strikes the sheet, which we assume behaves like an elastic
spring. The sheet stretches as the ball is brought to rest, after which the restoring
force propels the ball upward. The motion of the ball might be represented by
Figure 5.3. Above the sheet (y > 0) the motion is represented by parabolas, and
while the ball is in contact with the sheet (y < 0) the motion is described by sine
curves. Note how the curves join smoothly at y = 0, and note how both y(t) and
its derivative v(t) are continuous.
On the other hand, imagine a ball hitting a steel surface, which we assume to be
perfectly rigid. The ball rebounds elastically, and at the instant it is in contact with
the surface its velocity reverses direction. The motion of the ball is represented
in Figure 5.4. At the points of contact with the surface, there is a sudden change
in the velocity, corresponding to an infinite acceleration and thus to an infinite
force. The function y(t) is continuous, but its slope is not—the function has no
gaps, but it does have sharp “points” where the slope changes suddenly.
The assumption of the perfectly rigid surface is an idealization that we make
to help us understand the situation and also to help simplify the mathematics. In
reality the steel surface will flex slightly and ultimately behave somewhat like
a much stiffer version of the rubber sheet. In quantum mechanics we will also
sometimes use an assumption of a perfectly rigid or impenetrable boundary to
help us understand and simplify the analysis of a more complicated physical
situation.
FIGURE 5.4 The position and velocity of a ball dropped from a height H above a rigid surface.
In this section we have established several properties of classical waves that
also apply to quantum waves:
1. When a wave crosses a boundary between two regions, part of the wave
intensity is reflected and part is transmitted.
2. When a wave encounters a boundary to a region from which it is forbidden,
the wave will penetrate perhaps by a few wavelengths before reflecting.
3. At a finite boundary, the wave and its slope are continuous. At an infinite
boundary, the wave is continuous but its slope is discontinuous.
Example 5.1
In the geometry of Figure 5.1, the wave in region 1 is
given by y1(x) = C1 sin(2πx/λ1 − φ1), where C1 = 11.5,
λ1 = 4.97 cm, and φ1 = −65.3°. In region 2, the wavelength is λ2 = 10.5 cm. The boundary A is located at x = 0,
and the boundary B is located at x = L, where L = 20.0 cm.
Find the wave functions in regions 2 and 3.
Solution
The general form of the wave in region 2 can be represented in a form similar to that of the wave in region 1:
y2 (x) = C2 sin(2π x/λ2 − φ2 ). To find the complete wave
function in region 2, we must find the amplitude C2 and
the phase φ2 by applying the boundary conditions on
the function and its slope at boundary A (x = 0). Setting
y1(x = 0) = y2(x = 0) gives
$$-C_1 \sin\phi_1 = -C_2 \sin\phi_2$$
The slopes can be found from the derivative of the general form dy/dx = (2π/λ)C cos(2πx/λ − φ) evaluated at
x = 0:
$$\frac{2\pi}{\lambda_1} C_1 \cos\phi_1 = \frac{2\pi}{\lambda_2} C_2 \cos\phi_2$$
Dividing the first equation by the second eliminates C2
and allows us to solve for φ2:
$$\phi_2 = \tan^{-1}\left(\frac{\lambda_1}{\lambda_2}\tan\phi_1\right) = \tan^{-1}\left(\frac{4.97\ \text{cm}}{10.5\ \text{cm}}\tan(-65.3^\circ)\right) = -45.8^\circ$$
We can solve for C2 using the result from applying the first
boundary condition:
$$C_2 = C_1\,\frac{\sin\phi_1}{\sin\phi_2} = 11.5\,\frac{\sin(-65.3^\circ)}{\sin(-45.8^\circ)} = 14.6$$
To find the wave function in region 3, which we assume
to have the same form y3(x) = C3 sin(2πx/λ1 − φ3), we
must apply the boundary conditions on y2 and y3 at x = L.
Applying the two boundary conditions in the same way we
did at x = 0, we obtain
$$C_2 \sin\left(\frac{2\pi L}{\lambda_2} - \phi_2\right) = C_3 \sin\left(\frac{2\pi L}{\lambda_1} - \phi_3\right)$$
$$\frac{2\pi}{\lambda_2}\, C_2 \cos\left(\frac{2\pi L}{\lambda_2} - \phi_2\right) = \frac{2\pi}{\lambda_1}\, C_3 \cos\left(\frac{2\pi L}{\lambda_1} - \phi_3\right)$$
Proceeding as we did before, we divide these two
equations to find φ3 = −14.6°, and then from either
equation obtain C3 = 7.36. Our two solutions are
then y2(x) = 14.6 sin(2πx/10.5 + 45.8°) and y3(x) =
7.36 sin(2πx/4.97 + 14.6°), with x measured in cm.
Figure 5.5 shows the wave in all three regions. Note
how the waves join smoothly at the boundaries.
How is it possible that the amplitude of y2 can be greater
than the amplitude of y1 ? Keep in mind that y1 represents
the total wave in region 1, which includes the incident wave
and the reflected wave. Depending on the phase difference
between them, when the incident and reflected waves are
added to obtain y1 , the amplitude of the resultant can be
smaller than the amplitude of either wave.
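The matching procedure of Example 5.1 can be checked numerically. The sketch below is one possible way to do it and is not the method used in the text: instead of dividing the two boundary equations, it computes C sin θ and C cos θ separately and uses `atan2`, which selects the branch with a positive amplitude. The function name `match_sine_wave` is hypothetical.

```python
import math

def match_sine_wave(c_in, lam_in, phi_in, lam_out, x0):
    """Return (C, phi) for y = C sin(2*pi*x/lam_out - phi) that matches
    y = c_in sin(2*pi*x/lam_in - phi_in) in value and slope at x = x0."""
    theta_in = 2.0 * math.pi * x0 / lam_in - phi_in
    # Continuity of y:      C sin(theta_out) = c_in sin(theta_in)
    s = c_in * math.sin(theta_in)
    # Continuity of dy/dx:  C cos(theta_out) = c_in (lam_out / lam_in) cos(theta_in)
    c = c_in * (lam_out / lam_in) * math.cos(theta_in)
    amplitude = math.hypot(s, c)
    theta_out = math.atan2(s, c)
    phi = 2.0 * math.pi * x0 / lam_out - theta_out
    # Wrap the phase into [-pi, pi) so it is easy to compare with the text.
    phi = (phi + math.pi) % (2.0 * math.pi) - math.pi
    return amplitude, phi

# Values given in Example 5.1 (lengths in cm).
C1, lam1, phi1 = 11.5, 4.97, math.radians(-65.3)
lam2, L = 10.5, 20.0

C2, phi2 = match_sine_wave(C1, lam1, phi1, lam2, x0=0.0)   # boundary A
C3, phi3 = match_sine_wave(C2, lam2, phi2, lam1, x0=L)     # boundary B

print(f"C2 = {C2:.2f}, phi2 = {math.degrees(phi2):.1f} deg")
print(f"C3 = {C3:.2f}, phi3 = {math.degrees(phi3):.1f} deg")
```

The phases returned here enter the wave functions with a minus sign, so a negative φ corresponds to the positive phase shifts written in y2(x) and y3(x).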
FIGURE 5.5 Example 5.1.
5.2 CONFINING A PARTICLE
FIGURE 5.6 (a) Apparatus for confining an electron to the center region of length L. (b) The potential energy of an electron in this apparatus.
A free particle (that is, a particle on which no forces act anywhere) is by definition
not confined, so it can be located anywhere. It has, as we discussed in Chapter 4,
a definite wavelength, momentum, and energy (for which we can choose any
value).
A confined particle, on the other hand, is represented by a wave packet that
makes it likely to be found only in a region of space of size Δx. We construct
such a wave packet by adding together different sine or cosine waves to obtain
the desired mathematical shape.
In quantum mechanics, we often want to analyze the behavior of confined
particles, for example an electron that is attached to a specific atom or molecule.
We’ll consider the properties of atomic electrons beginning in Chapter 6, but for
now let’s look at a simpler problem: an electron moving in one dimension and
confined by a series of electric fields. Figure 5.6 shows how the apparatus of
Figure 5.1c might be modified for this purpose. The center section is grounded
(so that V = 0) and the two side sections are connected to batteries so that they
are at potentials of −V0 relative to the center section. As before, we assume that
the gaps between the center section and the side sections can be made as narrow
as possible, so we can regard the potential energy as changing instantaneously at
the boundaries A and B. This arrangement is often called a potential energy well.
The potential energy of an electron in this situation is then 0 in the center
section and U0 = qV = (−e)(−V0 ) = +eV0 in the two side sections as shown in
Figure 5.6. To confine the electron, we want to consider cases in which it moves
in the center section with a kinetic energy K that is less than U0 . For example, the
electron might have a kinetic energy of 5 eV in the center section, and the side
sections might have potential energies of 10 eV. The electron thus does not have
enough energy to “climb” the potential energy hill between the center section and
the side sections, and (at least from the classical point of view) the electron is
confined to the center section.
We’ll discuss the full solution to this problem later in this chapter, but for now
let’s simplify even further and consider the case of an infinitely high potential
energy barrier at A and B. This is a good approximation to the situation in
which the kinetic energy of the electron in the center section is much smaller
than the potential energy supplied by the batteries. In this case the penetration
into the forbidden region, which we discussed in Section 5.1, cannot occur. The
probability to find the electron in either of the side regions is therefore precisely
zero everywhere in those regions, and thus the wave amplitude is zero everywhere
in those regions, including at the boundaries (locations A and B). For the wave
function to be continuous, the wave function in the center section must have
values of zero at A and B.
Of all the possible waves that might be used to describe the particle in this
center section, the continuity condition restricts us to waves that have zero
amplitude at the boundaries. Some of those waves are illustrated in Figure 5.7.
Note that the wave function is continuous, but its slope is not (there are sharp
points in the function at locations A and B). This is an example of the exception
to the second boundary condition—the slope may be discontinuous at an infinite
barrier.
In contrast to the free particle for which the wavelength could have any value,
only certain values of the wavelength are allowed. The de Broglie relationship then
tells us that only certain values of the momentum are allowed, and consequently
only certain values of the energy are allowed. The energy is not a continuous
variable, free to take on any arbitrary value; instead, the energy is a discrete
variable that is restricted to a certain set of values. This is known as quantization
of energy.
You can see directly from Figure 5.7 that the allowed wavelengths are
2L, L, 2L/3, . . ., where L is the length of the center section. We can write these
wavelengths as
$$\lambda_n = \frac{2L}{n}, \qquad n = 1, 2, 3, \ldots \tag{5.1}$$
This set of wavelengths is identical to the wavelengths of the classical problem
of standing waves on a string stretched between two points. From the de Broglie
relationship λ = h/p we obtain
$$p_n = n\,\frac{h}{2L} \tag{5.2}$$
The energy of the particle in the center section is only kinetic energy p²/2m,
and so
$$E_n = n^2\,\frac{h^2}{8mL^2} \tag{5.3}$$
These are the allowed or quantized values of the energy of the electron.
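To get a feeling for the size of these quantized energies, Eqs. 5.1 and 5.3 can be evaluated for an electron. The sketch below assumes a well length of L = 0.10 nm (roughly atomic size); that value and the function names are illustrative choices, not taken from the text.

```python
HC_EV_NM = 1239.841984   # h*c in eV*nm
MC2_EV = 510998.95       # electron rest energy m*c^2 in eV

def allowed_wavelength_nm(n, length_nm):
    """Eq. 5.1: lambda_n = 2L / n."""
    return 2.0 * length_nm / n

def energy_level_ev(n, length_nm):
    """Eq. 5.3: E_n = n^2 h^2 / (8 m L^2), written as n^2 (hc)^2 / (8 mc^2 L^2)."""
    return n**2 * HC_EV_NM**2 / (8.0 * MC2_EV * length_nm**2)

L_NM = 0.10   # assumed length of the center section (nm)

levels = [energy_level_ev(n, L_NM) for n in (1, 2, 3)]
for n, energy in zip((1, 2, 3), levels):
    print(f"n = {n}: lambda = {allowed_wavelength_nm(n, L_NM):.3f} nm, E = {energy:.1f} eV")
```

For this assumed length the lowest level is several tens of eV, and the levels scale as n², so the spacing between adjacent levels grows with n.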
A wave packet describing the electron in this region must be a combination of
waves with the allowed values of the wavelengths. However, it is not necessary
to construct a wave packet from a combination of waves to describe this confined
particle. Even a single one of these waves represents the confined particle, because
the wave function must be zero in the forbidden regions. So the waveforms shown
in Figure 5.7 can represent wave packets of this confined electron, each wave
packet consisting of only a single wave.
The appearance of energy quantization accompanies every attempt to confine
a particle to a finite region of space. Quantization of energy is one of the principal
features of the quantum theory, and studying the quantized energy levels of
systems (such as by observing the energies of emitted photons) is an important
technique of experimental physics that gives us information about the properties
of atoms and nuclei.
FIGURE 5.7 Some possible waves that might be used to describe an electron confined by an infinite potential energy barrier to a region of length L. The panels show L = λ/2, L = λ, L = 3λ/2, and L = 2λ.
Applying the Uncertainty Principle to a Confined Particle
In Chapter 4 we constructed wave packets and showed how the uncertainty
principle related the size of the wave packet to the range of wavelengths that
was used in its construction. Let’s now see how the Heisenberg uncertainty
relationships apply in the case of a confined particle.
In the arrangement of Figure 5.6 (with infinitely high barriers on each side),
the particle is known to be somewhere in the center section of the apparatus, and
thus Δx ∼ L is a reasonable estimate of the uncertainty in its location. To find the
uncertainty in its momentum, we use the rigorous definition of uncertainty given
in Eq. 4.15: $\Delta p_x = \sqrt{(p_x^2)_{\rm av} - (p_{x,{\rm av}})^2}$. The particle moving in the center section
can be considered to be moving to the left or to the right with equal probability
(just as the classical standing-wave problem can be analyzed as the superposition
of identical waves moving to the left and to the right). Thus p_x,av = 0. If the
particle is moving with a momentum of magnitude given by Eq. 5.2, (p_x²)av = (nh/2L)² and so
Δp_x = nh/2L. Combining the uncertainties in position and momentum, we have
$$\Delta x\,\Delta p_x \sim L\,\frac{nh}{2L} = \frac{nh}{2} \tag{5.4}$$
The product of the uncertainties is certainly greater than ħ/2, and so the result
of confining the particle is entirely consistent with the Heisenberg uncertainty
relationship. Note that even the smallest possible value of the product of the
uncertainties (which is obtained for n = 1) is still much larger than the minimum
value given by the uncertainty principle.
Later in this chapter, we will evaluate the uncertainty in position more rigorously,
using a formula similar to Eq. 4.15,
and we will find that the result does not differ very much from the estimate
of Eq. 5.4.
5.3 THE SCHRÖDINGER EQUATION
Erwin Schrödinger (1887–1961, Austria). Although he disagreed with the
probabilistic interpretation that was
later given to his work, he developed the mathematical theory of wave
mechanics that for the first time permitted the wave behavior of physical
systems to be calculated.
The differential equation whose solution gives us the wave behavior of particles is
called the Schrödinger equation. It was developed in 1926 by Austrian physicist
Erwin Schrödinger. The equation cannot be derived from any previous laws
or postulates; like Newton’s equations of motion or Maxwell’s equations of
electromagnetism, it is a new and independent result whose correctness can
be determined only by comparing its predictions with experimental results.
For nonrelativistic motion, the Schrödinger equation gives results that correctly
account for observations at the atomic and subatomic level.
We can justify the form of the Schrödinger equation by examining the solution
expected for the free particle, which should give a wave whose shape at any
particular time, specified by the wave function ψ(x), is that of a simple de
Broglie wave, such as ψ(x) = A sin kx, where A is the amplitude of the wave and
k = 2π/λ. If we are looking for a differential equation, then we need to take some
derivatives:
$$\frac{d\psi}{dx} = kA\cos kx, \qquad \frac{d^2\psi}{dx^2} = -k^2 A \sin kx = -k^2\psi(x)$$
Note that the second derivative gives the original function again. With the kinetic
energy K = p²/2m = (h/λ)²/2m = ħ²k²/2m, we can then write
$$\frac{d^2\psi}{dx^2} = -k^2\psi(x) = -\frac{2m}{\hbar^2}K\psi(x) = -\frac{2m}{\hbar^2}(E - U)\psi(x)$$
where E = K + U is the nonrelativistic total energy of the particle. For a free
particle, U = 0 so E = K; however, we are using the free particle solution to try
to extend to the more general case in which there is a potential energy U(x). The
equation then becomes
$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + U(x)\psi(x) = E\psi(x) \tag{5.5}$$
Equation 5.5 is the time-independent Schrödinger equation for one-dimensional
motion.
The solution to Eq. 5.5 gives the shape of the wave at time t = 0. The
mathematical function that describes a one-dimensional traveling wave must
involve both x and t. This wave is represented by the function Ψ(x,t):
$$\Psi(x,t) = \psi(x)e^{-i\omega t} \tag{5.6}$$
The time dependence is given by the complex exponential function e^{−iωt} with
ω = E/ħ. (You can find a few useful formulas involving complex numbers in
Appendix B.) We’ll discuss the time-dependent part later in this chapter. For now,
we’ll concentrate on the time-independent function ψ(x).
We assume that we know the potential energy U(x), and we wish to obtain the
wave function ψ(x) and the energy E for that potential energy. This is a general
example of a type of problem known as an eigenvalue problem; we find that it is
possible to obtain solutions to the equation only for particular values of E, which
are known as the energy eigenvalues.
The general procedure for solving the Schrödinger equation is as follows:
1. Begin by writing Eq. 5.5 with the appropriate U(x). Note that if the potential
energy changes discontinuously [U(x) may be represented by a discontinuous
function; ψ(x) may not], we may need to write different equations for different
regions of space. Examples of this sort are given in Section 5.4.
2. Using general mathematical techniques suited to the form of the equation, find
a mathematical function ψ(x) that is a solution to the differential equation.
Because there is no one specific technique for solving differential equations,
we will study several examples to learn how to find solutions.
3. In general, several solutions may be found. By applying boundary conditions
some of these may be eliminated and some arbitrary constants may be
determined. It is generally the application of the boundary conditions that
selects out the allowed energies.
4. If you are seeking solutions for a potential energy that changes discontinuously,
you must apply the continuity conditions on ψ(x) (and usually on dψ/dx) at
the boundary between different regions.
Because the Schrödinger equation is linear, any constant multiplying a solution
is also a solution. The method to determine the amplitude of the wave function is
discussed in the next section.
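The way boundary conditions select the allowed energies can be illustrated numerically with a "shooting" calculation. This is only a sketch of one possible approach and not the analytical procedure used in this chapter. It assumes units in which ħ = m = 1, a region of length L = 1 with U(x) = 0 inside and impenetrable walls (so ψ = 0 at both ends, as in Section 5.2), and hypothetical function names. In these units Eq. 5.3 gives E_n = n²π²/2.

```python
import math

def psi_at_far_wall(E, U, L=1.0, steps=1000):
    """Integrate psi'' = -2 (E - U(x)) psi from x = 0 to x = L with a
    fourth-order Runge-Kutta scheme (units hbar = m = 1).
    Starts from psi(0) = 0, psi'(0) = 1 and returns psi(L)."""
    h = L / steps
    x, psi, dpsi = 0.0, 0.0, 1.0

    def rhs(x, psi, dpsi):
        return dpsi, -2.0 * (E - U(x)) * psi

    for _ in range(steps):
        a1, b1 = rhs(x, psi, dpsi)
        a2, b2 = rhs(x + h / 2, psi + h / 2 * a1, dpsi + h / 2 * b1)
        a3, b3 = rhs(x + h / 2, psi + h / 2 * a2, dpsi + h / 2 * b2)
        a4, b4 = rhs(x + h, psi + h * a3, dpsi + h * b3)
        psi += h / 6 * (a1 + 2 * a2 + 2 * a3 + a4)
        dpsi += h / 6 * (b1 + 2 * b2 + 2 * b3 + b4)
        x += h
    return psi

def find_allowed_energy(E_low, E_high, U, tol=1e-9):
    """Bisect on E until psi(L) = 0, i.e. until the boundary condition is met."""
    f_low = psi_at_far_wall(E_low, U)
    if f_low * psi_at_far_wall(E_high, U) > 0:
        raise ValueError("the energy interval does not bracket an allowed energy")
    while E_high - E_low > tol:
        E_mid = 0.5 * (E_low + E_high)
        f_mid = psi_at_far_wall(E_mid, U)
        if f_low * f_mid <= 0:
            E_high = E_mid
        else:
            E_low, f_low = E_mid, f_mid
    return 0.5 * (E_low + E_high)

def zero_potential(x):
    return 0.0

E1 = find_allowed_energy(4.0, 6.0, zero_potential)
E2 = find_allowed_energy(15.0, 25.0, zero_potential)
print(E1, math.pi**2 / 2)      # numerical value next to n = 1 from Eq. 5.3
print(E2, 2 * math.pi**2)      # numerical value next to n = 2 from Eq. 5.3
```

For an energy between the allowed values, `psi_at_far_wall` returns a nonzero number: the differential equation still has a solution, but that solution fails the boundary condition at x = L and must be rejected.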
Probabilities and Normalization
FIGURE 5.8 The probability to find the particle in a small region of width dx is equal to the area of the strip under the |ψ(x)|² curve. The total probability to find the particle between x1 and x2 is the sum of the areas of the strips, equal to the integral between those limits.
The remaining steps in the procedure for applying the Schrödinger recipe depend
on the physical interpretation of the solution to the differential equation. Our
original goal in solving the Schrödinger equation was to obtain the wave properties
of the particle. What does the amplitude of ψ(x) represent, and what is the physical
variable that is waving? It is certainly not a displacement, as in the case of a water
wave or a wave on a stretched piano wire, nor is it a pressure wave, as in the case
of sound. It is a very different kind of wave, whose squared absolute amplitude
gives the probability for finding the particle in a given region of space.
If we define P(x) as the probability density (probability per unit length, in one
dimension), then according to the Schrödinger recipe
$$P(x)\,dx = |\psi(x)|^2\,dx \tag{5.7}$$
as indicated in Figure 5.8. In Eq. 5.7, |ψ(x)|² dx gives the probability to find the
particle in the interval dx at x (that is, between x and x + dx).∗ Because the wave
function ψ(x) might be a complex function, it is necessary to square its absolute
magnitude to make sure that the probability is a positive real number.
The squared magnitude of the general time-dependent wave function
(Eq. 5.6) is:
$$|\Psi(x,t)|^2 = |\psi(x)|^2\,|e^{-i\omega t}|^2 = |\psi(x)|^2 \tag{5.8}$$
where the last step can be taken because the magnitude of the time-dependent
factor is 1. For this reason, the probability density associated with a solution to
the Schrödinger equation (for any allowed value of E) is independent of time.
These special quantum states are called stationary states.
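The time independence of the probability density can be verified directly with complex arithmetic. In this small sketch the spatial function ψ(x) = √2 sin(πx) and the value ω = 3.0 are arbitrary assumptions made only for illustration.

```python
import cmath
import math

def psi(x):
    """Assumed spatial wave function (illustrative choice only)."""
    return math.sqrt(2.0) * math.sin(math.pi * x)

def big_psi(x, t, omega):
    """Time-dependent wave function of Eq. 5.6: psi(x) * exp(-i omega t)."""
    return psi(x) * cmath.exp(-1j * omega * t)

omega = 3.0   # assumed angular frequency, omega = E / hbar
x = 0.3       # any fixed position

# |Psi(x,t)|^2 evaluated at several different times
densities = [abs(big_psi(x, t, omega)) ** 2 for t in (0.0, 0.5, 1.7, 10.0)]
print(densities)
print(psi(x) ** 2)
```

Every entry of `densities` agrees with |ψ(x)|² to within rounding error, because the factor e^{−iωt} has unit magnitude at all times.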
This interpretation of |ψ(x)|² helps us to understand the continuity condition
of ψ(x). The probability must not change discontinuously; like
any well-behaved wave, the probability to locate the particle varies smoothly and
continuously.
This interpretation of ψ(x) now permits us to complete the Schrödinger recipe
and to illustrate how to use the wave function to calculate quantities that we can
measure in the laboratory. Steps 1 through 4 were given previously; the recipe
continues:
5. For a wave function describing a single particle, the probability summed over
all locations must give 100%; that is, the particle must be located somewhere
between x = −∞ and x = +∞. The probability to find the particle in a small
interval was given in Eq. 5.7. The total probability to find the particle in all
such intervals must be exactly 1:
$$\int_{-\infty}^{+\infty} |\psi(x)|^2\,dx = 1 \tag{5.9}$$
The Schrödinger equation is linear, which means that if ψ(x) is a solution
then any constant times ψ(x) is also a solution. For the probability to be a
meaningful concept, this constant must be chosen so that Eq. 5.9 is satisfied.
A wave function with its multiplicative constant chosen in this way is said to
be normalized, and Eq. 5.9 is known as the normalization condition.
∗ It is not correct to speak of “the probability to find the particle at the point x.” A single point is a
mathematical abstraction with no physical dimension. The probability of finding a particle at a point
is zero, but there can be a nonzero probability of finding the particle in an interval.
6. Because the solution to the Schrödinger equation represents a probability,
any solution that becomes infinite must be discarded—it makes no sense
to have an infinite probability to find a particle in any interval. In practice,
we “discard” a solution by setting its multiplicative constant equal to zero.
For example, if the mathematical solution to the differential equation yields
ψ(x) = A e^{kx} + B e^{−kx} for the entire region x > 0, then we must require A = 0
for the solution to be physically meaningful; otherwise |ψ(x)|² would become
infinite as x goes to infinity. On the other hand, if this solution is to be valid
in the entire region x < 0, then we must set B = 0. However, if the solution is
to be valid only in a small portion of the range of x—say, 0 < x < L—then
we cannot set either A = 0 or B = 0.
7. Suppose the interval between two points x1 and x2 is divided into a series of
infinitesimal intervals of width dx (Figure 5.8). To find the total probability for
the particle to be located between x1 and x2 , which we represent as P(x1 : x2 ),
we calculate the sum of all the probabilities P(x) dx in each interval dx. This
sum can be expressed as an integral:
$$P(x_1 : x_2) = \int_{x_1}^{x_2} P(x)\,dx = \int_{x_1}^{x_2} |\psi(x)|^2\,dx \tag{5.10}$$
If the wave function has been properly normalized, Eq. 5.10 will always yield
a probability that lies between 0 and 1.
8. Because we can no longer speak with certainty about the position of the
particle, we can no longer guarantee the outcome of a single measurement of
any physical quantity that depends on its position. Instead, we can find the
average outcome of a large number of measurements. For example, suppose
we wish to find the average location of a particle by measuring its coordinate
x. From a large number of measurements, we find the value x1 a certain
number of times n1 , x2 a number of times n2 , etc., and in the usual way we
can find the average value
$$x_{\rm av} = \frac{n_1 x_1 + n_2 x_2 + \cdots}{n_1 + n_2 + \cdots} = \frac{\sum n_i x_i}{\sum n_i} \tag{5.11}$$
The number of times ni that we measure each xi is proportional to the
probability P(xi )dx to find the particle in the interval dx at xi . Making this
substitution and changing the sums to integrals, we have
$$x_{\rm av} = \frac{\int_{-\infty}^{+\infty} P(x)\,x\,dx}{\int_{-\infty}^{+\infty} P(x)\,dx} = \int_{-\infty}^{+\infty} |\psi(x)|^2\,x\,dx \tag{5.12}$$
where the last step can be made if the wave function is normalized, in which
case the denominator of Eq. 5.12 is equal to one.
By analogy, the average value of any function of x can be found:
$$[f(x)]_{\rm av} = \int_{-\infty}^{+\infty} P(x)\,f(x)\,dx = \int_{-\infty}^{+\infty} |\psi(x)|^2 f(x)\,dx \tag{5.13}$$
Average values calculated according to Eq. 5.12 or 5.13 are known as
expectation values.
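Steps 5, 7, and 8 of the recipe can be carried out numerically for a simple trial function. The sketch below assumes the wave function ψ(x) = A sin(πx/L) for 0 ≤ x ≤ L and zero elsewhere (the lowest standing wave of Figure 5.7), sets L = 1 in arbitrary units, and uses Simpson's rule for the integrals. The function names are hypothetical.

```python
import math

def integrate(f, a, b, n=2000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

L = 1.0   # assumed length of the region (arbitrary units)

def unnormalized_psi(x):
    """Assumed trial function; it is zero outside 0 <= x <= L."""
    return math.sin(math.pi * x / L)

# Step 5: choose the constant A so that Eq. 5.9 is satisfied.
A = 1.0 / math.sqrt(integrate(lambda x: unnormalized_psi(x) ** 2, 0.0, L))

def density(x):
    """Probability density P(x) = |psi(x)|^2 of Eq. 5.7."""
    return (A * unnormalized_psi(x)) ** 2

total_probability = integrate(density, 0.0, L)               # Eq. 5.9
p_first_quarter = integrate(density, 0.0, L / 4)             # Eq. 5.10
x_av = integrate(lambda x: density(x) * x, 0.0, L)           # Eq. 5.12
x2_av = integrate(lambda x: density(x) * x * x, 0.0, L)      # Eq. 5.13 with f(x) = x^2

print(A, total_probability, p_first_quarter, x_av, x2_av)
```

For this trial function the integrals can also be done by hand, giving A = √(2/L), P(0 : L/4) = 1/4 − 1/(2π), x_av = L/2, and (x²)_av = L²(1/3 − 1/(2π²)), which is a useful check on the numerical values.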
5.4 APPLICATIONS OF THE SCHRÖDINGER EQUATION
Solutions for Constant Potential Energy
First let’s examine the solutions to the Schrödinger equation for the special case
of a constant potential energy, equal to U0 . Then Eq. 5.5 becomes
$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + U_0\,\psi(x) = E\psi(x) \tag{5.14}$$
or (assuming for now that E > U0 )
$$\frac{d^2\psi}{dx^2} = -k^2\psi(x) \qquad \text{with} \qquad k = \sqrt{\frac{2m(E - U_0)}{\hbar^2}} \tag{5.15}$$
The parameter k in this equation is equal to the wave number 2π/λ.
The solution to Eq. 5.15 is a function of x that, when differentiated twice, gives
back the original function multiplied by the negative constant −k². The functions
that have this property are sin kx and cos kx. The most general solution to the equation is
ψ(x) = A sin kx + B cos kx
(5.16)
The constants A and B must be determined by applying the continuity and
normalization requirements. We can demonstrate that Eq. 5.16 satisfies Eq. 5.15
by taking two derivatives:
$$\frac{d\psi}{dx} = kA\cos kx - kB\sin kx$$
$$\frac{d^2\psi}{dx^2} = -k^2 A\sin kx - k^2 B\cos kx = -k^2(A\sin kx + B\cos kx) = -k^2\psi(x)$$
so the original equation is indeed satisfied.
To analyze the penetration of a particle into a forbidden region, we must
consider the case in which the energy E of the particle is smaller than the potential
energy U0 . For this case we can rewrite Eq. 5.14 as
$$\frac{d^2\psi}{dx^2} = k'^2\psi(x) \qquad \text{with} \qquad k' = \sqrt{\frac{2m(U_0 - E)}{\hbar^2}} \tag{5.17}$$
In this case the general solution in the forbidden regions is
$$\psi(x) = Ae^{k'x} + Be^{-k'x} \tag{5.18}$$
Once again, we can demonstrate that Eq. 5.18 is a solution of Eq. 5.17 by taking
two derivatives:
$$\frac{d\psi}{dx} = k'Ae^{k'x} - k'Be^{-k'x}$$
$$\frac{d^2\psi}{dx^2} = k'^2 Ae^{k'x} + k'^2 Be^{-k'x} = k'^2\left(Ae^{k'x} + Be^{-k'x}\right) = k'^2\psi(x)$$
We will use Eqs. 5.16 and 5.18 as our solutions to the Schrödinger equation for
constant potential energy in the allowed (E > U0 ) and forbidden (E < U0 ) regions.
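As a numerical illustration of Eqs. 5.15 and 5.17, the sketch below evaluates k or k′ for an electron. The energies are taken from the example in Section 5.2 (5 eV in the center section, 10 eV potential energy in the side sections); the function name and the use of 1/k′ as a rough penetration length scale are illustrative assumptions.

```python
import math

HBARC_EV_NM = 197.3269804   # hbar*c in eV*nm
MC2_EV = 510998.95          # electron rest energy m*c^2 in eV

def wave_parameter_per_nm(E_ev, U0_ev):
    """Return ('k', value) in an allowed region (E > U0, Eq. 5.15) or
    ("k'", value) in a forbidden region (E < U0, Eq. 5.17), in nm^-1."""
    if E_ev == U0_ev:
        raise ValueError("E = U0 is neither oscillatory nor exponential")
    value = math.sqrt(2.0 * MC2_EV * abs(E_ev - U0_ev)) / HBARC_EV_NM
    return ("k" if E_ev > U0_ev else "k'"), value

# Center section: E = 5 eV, U0 = 0, so the solution oscillates (Eq. 5.16).
kind_center, k = wave_parameter_per_nm(5.0, 0.0)
wavelength_nm = 2.0 * math.pi / k

# Side sections: E = 5 eV, U0 = 10 eV, so the solution is exponential (Eq. 5.18).
kind_side, k_prime = wave_parameter_per_nm(5.0, 10.0)
decay_length_nm = 1.0 / k_prime   # distance over which e^(-k'x) falls by a factor e

print(kind_center, k, wavelength_nm)
print(kind_side, k_prime, decay_length_nm)
```

With these particular energies |E − U0| is 5 eV in both regions, so k and k′ happen to be numerically equal; the difference lies in the form of the solution, oscillating in one case and exponential in the other.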
The Free Particle
For a free particle, the force is zero and so the potential energy is constant. We
may choose any value for that constant, so for convenience we’ll choose U0 = 0.
The solution is given by Eq. 5.16, ψ(x) = A sin kx + B cos kx. The energy of the
particle is
$$E = \frac{\hbar^2 k^2}{2m} \tag{5.19}$$
Our solution has placed no restrictions on k, so the energy is permitted to have
any value (in the language of quantum physics, we say that the energy is not quantized). We note that Eq. 5.19 is the kinetic energy of a particle with momentum
p = ħk or, equivalently, p = h/λ. This is as we would have expected, because
the free particle can be represented by a de Broglie wave with any wavelength.
Solving for A and B presents some difficulties because the normalization
integral, Eq. 5.9, cannot be evaluated from −∞ to +∞ for this wave function.
We therefore cannot determine probabilities for the free particle from the wave
function of Eq. 5.16.
It is also instructive to write the wave function in terms of complex exponentials,
using $\sin kx = (e^{ikx} - e^{-ikx})/2i$ and $\cos kx = (e^{ikx} + e^{-ikx})/2$:
$$\psi(x) = A\,\frac{e^{ikx} - e^{-ikx}}{2i} + B\,\frac{e^{ikx} + e^{-ikx}}{2} = A'e^{ikx} + B'e^{-ikx} \tag{5.20}$$
where $A' = A/2i + B/2$ and $B' = -A/2i + B/2$. To interpret this solution in terms
of waves we form the complete time-dependent wave function using Eq. 5.6:
$$\Psi(x,t) = (A'e^{ikx} + B'e^{-ikx})e^{-i\omega t} = A'e^{i(kx-\omega t)} + B'e^{-i(kx+\omega t)} \tag{5.21}$$
The dependence of the first term on kx − ωt identifies this term as representing
a wave moving to the right (in the positive x direction) with amplitude A′ , and
the second term involving kx + ωt represents a wave moving to the left (in the
negative x direction) with amplitude B′ .
If we want the wave to represent a beam of particles moving in the +x direction,
then we must set B′ = 0. The probability density associated with this wave is
then, according to Eq. 5.7,
$$P(x) = |\psi(x)|^2 = |A'|^2e^{ikx}e^{-ikx} = |A'|^2 \tag{5.22}$$
The probability density is constant, meaning the particles are equally likely to be
found anywhere along the x axis. This is consistent with our discussion of the
free-particle de Broglie wave in Chapter 4: a wave of precisely defined wavelength
extends from x = −∞ to x = +∞ and thus gives a completely unlocalized
particle.
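As a quick illustration of Eq. 5.22, the sketch below evaluates the probability density of a right-moving wave at several positions. The amplitude A′ and wave number k are arbitrary assumed values chosen so that |A′|² = 1.

```python
import cmath

# Assumed illustrative values (not from the text): |A'| = 1 by construction.
A_prime = 0.8 + 0.6j
k = 3.0

def density(x):
    """Probability density |psi(x)|^2 for psi(x) = A' exp(i k x)."""
    psi = A_prime * cmath.exp(1j * k * x)
    return abs(psi) ** 2

# The same value appears at every position: the particle is unlocalized.
densities = [density(x) for x in (-5.0, 0.0, 2.5, 40.0)]
print(densities)
```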
Infinite Potential Energy Well
Now we’ll consider the formal solution to the problem we discussed in Section 5.2:
a particle is trapped in the region between x = 0 and x = L by infinitely high
potential energy barriers. Imagine an apparatus like that of Figure 5.6, in which the
particle moves freely in this region and makes elastic collisions with the perfectly
rigid barriers that confine it. This problem is sometimes called “a particle in a
box.” For now we’ll assume that the particle moves in only one dimension; later
we’ll expand to two and three dimensions.
Chapter 5 | The Schrödinger Equation
The potential energy may be expressed as:
$$U(x) = \begin{cases} 0 & 0 \le x \le L \\ \infty & x < 0,\ x > L \end{cases} \tag{5.23}$$
FIGURE 5.9 The potential energy of a particle that moves freely (U = 0) in
the region 0 ≤ x ≤ L but is completely excluded (U = ∞) from the regions
x < 0 and x > L.
The potential energy is shown in Figure 5.9. We are free to choose any constant
value for U in the region 0 ≤ x ≤ L; we choose it to be zero for convenience.
Because the potential energy is different in the regions inside and outside the
well, we must find separate solutions in each region. We can analyze the outside
region in either of two ways. If we examine Eq. 5.5 for the region outside the well,
we find that the only way to keep the equation from becoming meaningless when
U → ∞ is to require ψ = 0, so that Uψ will not become infinite. Alternatively,
we can go back to the original statement of the problem. If the walls at the
boundaries of the well are perfectly rigid, the particle must always be in the well,
and the probability for finding it outside must be zero. To make the probability
zero everywhere outside the well, we must make ψ = 0 everywhere outside. Thus
we have
$$\psi(x) = 0 \qquad x < 0,\ x > L \tag{5.24}$$
The Schrödinger equation for 0 ≤ x ≤ L, when U(x) = 0, is identical with Eq. 5.14
with $U_0 = 0$ and has the same solution:
$$\psi(x) = A\sin kx + B\cos kx \qquad 0 \le x \le L \tag{5.25}$$
with
$$k = \sqrt{\frac{2mE}{\hbar^2}} \tag{5.26}$$
Our solution is not yet complete, for we have not evaluated A or B, nor have
we found the allowed values of the energy E. To do this, we must apply the
requirement that ψ(x) is continuous across any boundary. In this case, we require
that our solutions for x < 0 and x > 0 match up at x = 0; similarly, the solutions
for x > L and x < L must match at x = L.
Let us begin at x = 0. At x < 0, we have found that ψ = 0, and so we must set
ψ(x) of Eq. 5.25 to zero at x = 0.
$$\psi(0) = A\sin 0 + B\cos 0 = 0 \tag{5.27}$$
which gives B = 0. Because ψ = 0 for x > L, the second boundary condition is
ψ(L) = 0, so
$$\psi(L) = A\sin kL + B\cos kL = 0 \tag{5.28}$$
We have already found B = 0, so we must now have A sin kL = 0. Either
A = 0, in which case ψ = 0 everywhere, $\psi^2 = 0$ everywhere, and there is no
particle (a meaningless solution), or else sin kL = 0, which is true only when
kL = π, 2π, 3π, . . ., or
$$kL = n\pi \qquad n = 1, 2, 3, \ldots \tag{5.29}$$
With k = 2π/λ, we have λ = 2L/n; this is identical with the result obtained in
introductory mechanics for the wavelengths of the standing waves in a string of
length L fixed at both ends, which we already obtained in Section 5.2 (Eq. 5.1).
Thus the solution to the Schrödinger equation for a particle trapped in a linear
region of length L is a series of standing de Broglie waves! Not all wavelengths
are permitted; only certain values, determined from Eq. 5.29, may occur.
From Eq. 5.26 we find that, because only certain values of k are permitted by
Eq. 5.29, only certain values of E may occur—the energy is quantized! Solving
Eq. 5.29 for k and substituting into Eq. 5.26, we obtain
$$E_n = \frac{\hbar^2k^2}{2m} = \frac{\hbar^2\pi^2n^2}{2mL^2} = \frac{h^2n^2}{8mL^2} \qquad n = 1, 2, 3, \ldots \tag{5.30}$$
For convenience, let $E_0 = \hbar^2\pi^2/2mL^2 = h^2/8mL^2$; this unit of energy is
determined by the mass of the particle and the width of the well. Then $E_n = n^2E_0$,
and the only allowed energies for the particle are $E_0$, $4E_0$, $9E_0$, $16E_0$, etc. All
intermediate values, such as $3E_0$ or $6.2E_0$, are forbidden. Figure 5.10 shows the
allowed energy levels. The lowest energy state, for which n = 1, is known as the
ground state, and the states with higher energies (n > 1) are known as excited
states.
Because the energy is purely kinetic in this case, our result means that only
certain speeds are permitted for the particle. This is very different from the case
of the classical trapped particle, in which the particle can be given any initial
velocity and will move forever, back and forth, at the same speed. In the quantum
case, this is not possible; only certain initial speeds can result in sustained states of
motion; these special conditions represent the “stationary states.” Average values
calculated according to Eq. 5.13 likewise do not change with time.
From one energy state, the particle can make jumps or transitions to another
energy state by absorbing or releasing an amount of energy equal to the energy
difference between the two states. By absorbing energy the particle will move to
a higher energy state, and by releasing energy it moves to a lower energy state.
A similar effect occurs for electrons in atoms, in which the absorbed or released
energy is usually in the form of a photon of visible light or other electromagnetic
radiation. For example, from the state with n = 3 ($E_3 = 9E_0$), the particle might
absorb an energy of $\Delta E = 7E_0$ and jump upward to the n = 4 state ($E_4 = 16E_0$)
or might release energy of $\Delta E = 5E_0$ and jump downward to the n = 2 state
($E_2 = 4E_0$).
FIGURE 5.10 The first four energy
levels in a one-dimensional infinite
potential energy well.
Example 5.2
An electron is trapped in a one-dimensional region of length
1.00 × 10−10 m (a typical atomic diameter). (a) Find the
energies of the ground state and first two excited states.
(b) How much energy must be supplied to excite the electron from the ground state to the second excited state?
(c) From the second excited state, the electron drops down
to the first excited state. How much energy is released in this
process?
Solution
(a) The basic quantity of energy needed for this
calculation is
$$E_0 = \frac{h^2}{8mL^2} = \frac{(hc)^2}{8mc^2L^2} = \frac{(1240\ \text{eV}\cdot\text{nm})^2}{8(511{,}000\ \text{eV})(0.100\ \text{nm})^2} = 37.6\ \text{eV}$$
With $E_n = n^2E_0$, we can find the energy of the states:
n = 1: $E_1 = E_0 = 37.6$ eV
n = 2: $E_2 = 4E_0 = 150.4$ eV
n = 3: $E_3 = 9E_0 = 338.4$ eV
(b) The energy difference between the ground state and
the second excited state is
$$\Delta E = E_3 - E_1 = 338.4\ \text{eV} - 37.6\ \text{eV} = 300.8\ \text{eV}$$
This is the energy that must be absorbed for the electron to
make this jump.
(c) The energy difference between the second and first
excited states is
$$\Delta E = E_3 - E_2 = 338.4\ \text{eV} - 150.4\ \text{eV} = 188.0\ \text{eV}$$
This is the energy that is released when the electron makes
this jump.
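The arithmetic of Example 5.2 can be reproduced in a few lines. This is a minimal sketch; the helper names `level` and `transition` are our own, and the constants are the rounded values quoted in the example. Because the code carries the unrounded value of E0, the last digit of some results may differ slightly from the hand calculation.

```python
HC = 1240.0      # h*c in eV nm (rounded value used in the example)
MC2 = 511000.0   # electron rest energy m*c^2 in eV
L = 0.100        # width of the well in nm

# Basic energy unit E0 = (hc)^2 / (8 mc^2 L^2), in eV
E0 = HC**2 / (8.0 * MC2 * L**2)

def level(n):
    """Energy of state n in the infinite well, E_n = n^2 E0 (eV)."""
    return n**2 * E0

def transition(n_initial, n_final):
    """Energy change in eV: positive if absorbed, negative if released."""
    return level(n_final) - level(n_initial)

print(round(E0, 1))                 # ground-state energy
print(round(transition(1, 3), 1))   # ground state -> second excited state
print(round(transition(3, 2), 1))   # second -> first excited state (negative)
```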
To complete the solution for ψ(x), we must determine the constant A by using
the normalization condition given in Eq. 5.9, $\int_{-\infty}^{+\infty}|\psi(x)|^2\,dx = 1$. The integrand
is zero in the regions −∞ < x ≤ 0 and L ≤ x < +∞, so all that remains is
$$\int_0^L A^2\sin^2\frac{n\pi x}{L}\,dx = 1 \tag{5.31}$$
from which we find $A = \sqrt{2/L}$. The complete wave function for 0 ≤ x ≤ L is
then
$$\psi_n(x) = \sqrt{\frac{2}{L}}\,\sin\frac{n\pi x}{L} \qquad n = 1, 2, 3, \ldots \tag{5.32}$$
In Figure 5.11, the wave functions and probability densities $\psi^2$ are illustrated for
the lowest several states.
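The normalization constant of Eq. 5.32 can be checked by integrating the probability density numerically. The sketch below assumes a well width of 0.100 nm (as in Example 5.2) and uses a simple midpoint rule; the function names are our own.

```python
import math

def psi(n, x, L):
    """Normalized infinite-well wave function, Eq. 5.32."""
    return math.sqrt(2.0 / L) * math.sin(n * math.pi * x / L)

def integrate(f, a, b, steps=2000):
    """Composite midpoint rule for the integral of f from a to b."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

L = 0.100  # nm (assumed width, matching Example 5.2)

# Total probability for the first four states; each should equal 1.
norms = [integrate(lambda x, n=n: psi(n, x, L) ** 2, 0.0, L) for n in (1, 2, 3, 4)]
print(norms)
```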
In the ground state, the particle has the greatest probability to be found near
the middle of the well (x = L/2), and the probability falls off to zero between
the center and the sides of the well. This is very different from the behavior
of a classical particle—a classical particle moving at constant speed would be
found with equal probability at every location inside the well. The quantum
particle also has constant speed, yet it is found with differing probability
at various locations in the well. It is the wave nature of the quantum particle that
is responsible for this very nonclassical behavior.
FIGURE 5.11 The wave functions (solid lines) and probability densities (shaded regions) of the first four
states in the one-dimensional infinite potential energy well. (Panels: n = 1, 2, 3, 4; each horizontal axis
runs from x = 0 to x = L.)
Another example of nonclassical behavior occurs for the first excited state.
The probability density has a maximum at x = L/4 and another maximum at
x = 3L/4. Between the two maxima, there is zero probability to find the particle
in the center of the well at x = L/2. How can the particle travel from x = L/4 to
x = 3L/4 without ever being at x = L/2? Of course, no classical particle could
behave in such a way, but it is a common behavior for waves. For example,
the first overtone of a vibrating string has a node at its midpoint and antinodes
(vibrational maxima) at the 1/4 and 3/4 locations.
The calculation of probabilities and average values is illustrated by the
following examples.
Example 5.3
Consider again an electron trapped in a one-dimensional
region of length 1.00 × 10−10 m = 0.100 nm. (a) In the
ground state, what is the probability of finding the electron
in the region from x = 0.0090 nm to 0.0110 nm? (b) In the
first excited state, what is the probability of finding the
electron between x = 0 and x = 0.025 nm?
Solution
(a) When the interval is small, it is often simpler to
use Eq. 5.7 to find the probability, instead of using
the integration method. The width of the small interval is
dx = 0.0110 nm − 0.0090 nm = 0.0020 nm. Evaluating the wave function
at the midpoint of the interval (x = 0.0100 nm), we can use the
n = 1 wave function with Eq. 5.7 to find
$$P(x)\,dx = |\psi_1(x)|^2\,dx = \frac{2}{L}\sin^2\frac{\pi x}{L}\,dx = \frac{2}{0.100\ \text{nm}}\sin^2\frac{\pi(0.0100\ \text{nm})}{0.100\ \text{nm}}\,(0.002\ \text{nm}) = 0.0038 = 0.38\%$$
(b) For this wide interval, we must use the integration
method to find the probability:
$$P(x_1{:}x_2) = \int_{x_1}^{x_2}|\psi_2(x)|^2\,dx = \frac{2}{L}\int_{x_1}^{x_2}\sin^2\frac{2\pi x}{L}\,dx = \left[\frac{x}{L} - \frac{1}{4\pi}\sin\frac{4\pi x}{L}\right]_{x_1}^{x_2}$$
Evaluating this expression using the limits $x_1 = 0$ and
$x_2 = 0.025$ nm gives a probability of 0.25 or 25%. This
result is of course what we would expect by inspection of
the graph of $\psi^2$ for n = 2 in Figure 5.11. The interval from
x = 0 to x = L/4 contains 25% of the total area under the
$\psi^2$ curve.
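Both parts of Example 5.3 can be evaluated directly. The following sketch uses the narrow-interval approximation for part (a) and the antiderivative quoted in part (b); the function names are our own choices.

```python
import math

L = 0.100  # width of the well in nm

def density(n, x):
    """Probability density |psi_n(x)|^2 inside the well (per nm)."""
    return (2.0 / L) * math.sin(n * math.pi * x / L) ** 2

# (a) Narrow interval: density at the midpoint times the interval width.
p_a = density(1, 0.0100) * 0.0020

def cumulative_n2(x):
    """Antiderivative of |psi_2|^2: x/L - sin(4 pi x / L) / (4 pi)."""
    return x / L - math.sin(4.0 * math.pi * x / L) / (4.0 * math.pi)

# (b) Wide interval from x = 0 to x = L/4 in the first excited state.
p_b = cumulative_n2(0.025) - cumulative_n2(0.0)

print(round(p_a, 4), round(p_b, 4))
```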
Example 5.4
Show that the average value of x is L/2, independent of the
quantum state.
Solution
We use Eq. 5.12; because ψ = 0 except for 0 ≤ x ≤ L, the
limits of integration are 0 and L:
$$x_{\rm av} = \int_0^L |\psi(x)|^2\,x\,dx = \frac{2}{L}\int_0^L \sin^2\frac{n\pi x}{L}\;x\,dx$$
This can be integrated by parts or found in integral tables;
the result is
$$x_{\rm av} = \frac{L}{2}$$
Note that, as required, this result is independent of n. Thus
a measurement of the average position of the particle yields
no information about its quantum state.
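The result of Example 5.4 can be confirmed numerically for several quantum states. This is a minimal sketch using a midpoint-rule integration; the well width is an assumed value, and the function name is our own.

```python
import math

def average_x(n, L, steps=20000):
    """Midpoint-rule estimate of the integral of |psi_n|^2 * x over the well."""
    h = L / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += (2.0 / L) * math.sin(n * math.pi * x / L) ** 2 * x
    return total * h

L = 0.100  # nm (assumed width; the result scales with L)
# Ratio x_av / L for the first five states; each should be 0.5.
ratios = [average_x(n, L) / L for n in range(1, 6)]
print(ratios)
```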
Let’s now look at how the uncertainty principle applies to the motion of this
trapped particle. By solving Problems 34 and 35, you will find that the uncertainties
in position and momentum for a particle in an infinite potential well are
$\Delta x = L\sqrt{1/12 - 1/2\pi^2n^2}$ and $\Delta p = hn/2L$. The product of the uncertainties is
$$\Delta x\,\Delta p = \frac{hn}{2}\sqrt{\frac{1}{12} - \frac{1}{2\pi^2n^2}} = \frac{h}{2}\sqrt{\frac{n^2}{12} - \frac{1}{2\pi^2}}$$
Clearly the product of the uncertainties grows as n grows. The minimum value
occurs for n = 1, in which case $\Delta x\,\Delta p = 0.090h = 0.57\hbar$. The ground state
represents a fairly “compact” wave packet, but it is somewhat less compact than
the minimum possible limit of $0.50\hbar$ (Eq. 4.10). You can see from Figure 5.11
how the wave becomes less compact (spreads out more) as n increases. Even for
n = 2, the product of the uncertainties grows quickly to $1.67\hbar$.
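The numerical values quoted above follow from the product formula with h = 2πħ. A short sketch (the function name is our own) evaluates the product in units of ħ for the first few states:

```python
import math

def uncertainty_product_in_hbar(n):
    """Delta-x * Delta-p for state n of the infinite well, in units of hbar.

    Uses (h n / 2) * sqrt(1/12 - 1/(2 pi^2 n^2)) with h = 2 pi hbar.
    """
    return math.pi * n * math.sqrt(1.0 / 12.0 - 1.0 / (2.0 * math.pi**2 * n**2))

for n in (1, 2, 3):
    print(n, round(uncertainty_product_in_hbar(n), 2))
```

Every value exceeds the lower limit of 0.50ħ, and the product increases with n.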
Finite Potential Energy Well
Because the infinite potential energy well is an idealization of a technique for
confining a particle, we should examine the solution when the barriers at the
sides of the well are finite rather than infinite. The potential energy well can be
described by
$$U(x) = \begin{cases} 0 & 0 \le x \le L \\ U_0 & x < 0,\ x > L \end{cases} \tag{5.33}$$
and is sketched in Figure 5.12. We look for solutions in which the particle is
confined to this well, and thus the energies that we deduce for the particle must
be less than U0 .
The solution in the center region (between x = 0 and x = L) is exactly the
same as it was for the infinite well (Eq. 5.25):
$$\psi(x) = A\sin kx + B\cos kx \qquad (0 \le x \le L) \tag{5.34}$$
although the values that we deduced previously for the coefficients A and B are
not valid in this calculation. The region x < 0 is an example of a situation in
which the energy E of the particle is less than the potential energy $U_0$, and so we
must use the solution in the form of Eq. 5.18, $\psi(x) = Ce^{k'x} + De^{-k'x}$ with $k'$ given
in Eq. 5.17. This region includes x = −∞, for which the term with the coefficient
D becomes infinite. Because we cannot allow the probability to become infinite,
we must discard this term by setting D = 0. The solution for x < 0 is then
$$\psi(x) = Ce^{k'x} \qquad (x < 0) \tag{5.35}$$
In the region x > L, the energy E is once again smaller than $U_0$, and so the solution is
also in the form of Eq. 5.18, $\psi(x) = Fe^{k'x} + Ge^{-k'x}$. Here the region now includes
x = +∞, for which the term with the coefficient F would become infinite. We
must prevent that possibility by setting F = 0, so the solution in this region is
$$\psi(x) = Ge^{-k'x} \qquad (x > L) \tag{5.36}$$
FIGURE 5.12 The potential energy of a particle that is confined to the region
0 ≤ x ≤ L by finite barriers $U_0$ at x = 0 and x = L.
We now have 4 coefficients to determine (A, B, C, G) along with the energy E.
For this determination, we have 4 equations from the boundary conditions (the
continuity of both ψ and dψ/dx at both x = 0 and x = L) and one equation
from the normalization condition. As you might imagine, solving 5 equations
in 5 unknowns presents a straightforward but very tedious algebraic challenge.
Moreover, the resulting solution for the energy values cannot be obtained in
terms of a direct equation such as Eq. 5.30, but instead must be found numerically
by solving a transcendental equation. The result is a series of increasing energy
values, but the number of energy values is finite rather than infinite, because the
energy cannot be allowed to exceed the value of U0 .
As we did for the infinite potential energy well in Example 5.2, let’s consider a
well of width L = 0.100 nm. We’ll choose the depth of the well to be U0 = 400 eV.
Applying the boundary conditions at x = 0 and x = L, we can eliminate all of the
coefficients and find an equation that involves only k and k ′ (both of which depend
on the energy E). Solving that equation numerically, we find four possible values
of the energy: E1 = 26 eV, E2 = 104 eV, E3 = 227 eV, E4 = 375 eV. Here the
subscript just numbers the energy values, starting at the ground state; there is no
simple functional dependence of the energies on the quantum number n as there
was for the infinite well. The allowed energy levels are shown in Figure 5.13.
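The numerical solution described above can be sketched as follows. The text does not write out its equation in k and k′; this sketch assumes the standard equivalent form of the matching conditions, kL = nπ − 2 arcsin(k/k0) with k0 = √(2mU0)/ħ, and solves it by bisection. The function name and the value ħc = 197.327 eV·nm are our own choices, so the results should be read as a consistency check rather than as the author's method.

```python
import math

HBAR_C = 197.327   # hbar*c in eV nm (assumed value)
MC2 = 511000.0     # electron rest energy m*c^2 in eV

def finite_well_energies(L, U0):
    """Bound-state energies (eV) of an electron in a finite well.

    L is the width in nm and U0 the depth in eV. For each n we solve
    k L = n pi - 2 arcsin(k / k0) by bisection on 0 <= k <= k0.
    """
    k0 = math.sqrt(2.0 * MC2 * U0) / HBAR_C
    energies = []
    n = 1
    while (n - 1) * math.pi < k0 * L:   # a bound state exists for this n
        def f(k):
            return k * L + 2.0 * math.asin(k / k0) - n * math.pi
        lo, hi = 0.0, k0                # f(lo) < 0 <= f(hi), f is increasing
        for _ in range(100):
            mid = 0.5 * (lo + hi)
            if f(mid) < 0.0:
                lo = mid
            else:
                hi = mid
        k = 0.5 * (lo + hi)
        energies.append((HBAR_C * k) ** 2 / (2.0 * MC2))
        n += 1
    return energies

energies = finite_well_energies(0.100, 400.0)
print([round(E) for E in energies])
```

For L = 0.100 nm and U0 = 400 eV the sketch finds four bound states whose energies agree with the values quoted above to within about 1 eV.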
The probability densities (square of the wave functions) for these four states are
shown in Figure 5.14. In some ways they are similar to the probability densities in
the infinite well—note that each state has n maxima in its probability density, just
like the infinite well (see Figure 5.11). Unlike the infinite well, these probability
densities show the property of penetration into the classically forbidden region.
Look carefully at the continuity of the wave function and its slope at x = 0 and
x = L; see how smoothly the sine and cosine function inside the well joins the
exponentials in the forbidden regions.
The energy levels of the finite well are smaller than those of the infinite well of
the same width (38 eV, 150 eV, 338 eV, 602 eV), and the differences increase as
we go to higher states. This is consistent with the uncertainty principle: because
of the penetration into the forbidden region, $\Delta x$ is larger for the finite well and
thus $\Delta p_x$ must be smaller. As a result, the kinetic energies are smaller for the
finite well. From Figure 5.14 we see that the penetration distance increases as we
go up in energy, so the difference between $\Delta x$ for the finite well and the infinite
well increases and the energy discrepancy also increases.
FIGURE 5.13 The energy levels in a potential energy well of depth 400 eV.
There are only four energy states in this well. (Levels shown: n = 1, $E_1$ = 26 eV;
n = 2, $E_2$ = 104 eV; n = 3, $E_3$ = 227 eV; n = 4, $E_4$ = 375 eV; $U_0$ = 400 eV.)
FIGURE 5.14 The probability densities of the four states in the one-dimensional potential energy well of width
0.100 nm and depth 400 eV. (Panels: n = 1, 2, 3, 4; horizontal axis x in nm, from −0.1 to 0.2.)
For an energy close to the top of the well such as $E_4$, a smaller uncertainty
$\Delta E$ is necessary to reach the top of the well, giving a larger $\Delta t \sim \hbar/\Delta E$ and
thus a larger penetration distance. At the bottom of the well, the state $E_1$ requires
much more energy to reach the top of the well and thus needs a much larger $\Delta E$;
the smaller resulting $\Delta t$ gives a smaller penetration distance into the forbidden
region.
Two-Dimensional Infinite Potential Energy Well
When we extend the previous calculation to two and three dimensions, the
principal features of the solution remain the same, but an important new feature
is introduced. In this section we show how this occurs; this new feature, known
as degeneracy, will turn out to be very important in our study of atomic physics.
To begin with, we need a Schrödinger equation that is valid in more than one
dimension; our previous version, Eq. 5.5, included only one spatial dimension. If
the potential energy is a function of x and y, we expect that ψ also depends on
both x and y, and the derivatives with respect to x must be replaced by derivatives
with respect to x and y. In two dimensions, we then have∗
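$$-\frac{\hbar^2}{2m}\left[\frac{\partial^2\psi(x,y)}{\partial x^2} + \frac{\partial^2\psi(x,y)}{\partial y^2}\right] + U(x,y)\psi(x,y) = E\psi(x,y) \tag{5.37}$$
The two-dimensional potential energy well is:
$$U(x,y) = \begin{cases} 0 & 0 \le x \le L;\ 0 \le y \le L \\ \infty & \text{otherwise} \end{cases} \tag{5.38}$$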
The particle is confined by infinitely high barriers to the square region with the
vertices (x, y) = (0, 0), (L, 0), (L, L), (0, L), as shown in Figure 5.15. A classical
analog might be a small disk sliding without friction on a tabletop and colliding
elastically with walls at x = 0, x = L, y = 0, and y = L. (For simplicity, we have
made the allowed region square; we could have made it rectangular by setting
U = 0 when 0 ≤ x ≤ a and 0 ≤ y ≤ b.)
Solving partial differential equations requires a technique more involved than
we need to consider, so we will not give the details of the solution. We suspect
that, as in the previous case, ψ(x, y) = 0 outside the allowed region, in order to
make the probability zero there. Inside the well, we consider solutions that are
separable; that is, our function of x and y can be expressed as the product of one
function that depends only on x and another that depends only on y:
$$\psi(x,y) = f(x)\,g(y) \tag{5.39}$$
where the functions f and g are similar to Eq. 5.16:
$$f(x) = A\sin k_xx + B\cos k_xx, \qquad g(y) = C\sin k_yy + D\cos k_yy \tag{5.40}$$
FIGURE 5.15 A particle moves freely in the two-dimensional region 0 < x < L,
0 < y < L, but encounters infinite barriers beyond that region.
∗ The first two terms on the left side of this equation (Eq. 5.37) require partial derivatives; for well-behaved
functions, these involve taking the derivative with respect to one variable while keeping the other
constant. Thus if $f(x,y) = x^2 + xy + y^2$, then $\partial f/\partial x = 2x + y$ and $\partial f/\partial y = x + 2y$.
The wave number k of the one-dimensional problem has become the separate
wave numbers kx for f (x) and ky for g(y). We show later how these are related.
(See also Problem 18 at the end of this chapter.)
The continuity condition on ψ(x, y) requires that the solutions inside and
outside match at the boundary. Because ψ = 0 everywhere outside, the continuity
condition then requires that ψ = 0 everywhere on the boundary. That is,
ψ(0, y) = 0 and ψ(L, y) = 0 for all y
ψ(x, 0) = 0 and ψ(x, L) = 0 for all x
In analogy with the one-dimensional problem, the condition at x = 0 gives
f (0) = 0, which requires B = 0 in Eq. 5.40. Similarly, the condition at y = 0
gives g(0) = 0, which requires D = 0. The condition f (L) = 0 requires that
sin kx L = 0, and thus that kx L be an integer multiple of π ; the condition g(L) = 0
similarly requires that ky L be an integer multiple of π . These two integers do not
necessarily need to be the same, so we call them nx and ny . Making all these
substitutions into Eq. 5.39, we obtain
$$\psi(x,y) = A'\sin\frac{n_x\pi x}{L}\,\sin\frac{n_y\pi y}{L} \tag{5.41}$$
where we have combined A and C into $A'$. The coefficient $A'$ is once again found
by the normalization condition, which in two dimensions becomes
$$\iint \psi^2\,dx\,dy = 1 \tag{5.42}$$
For our case this gives
$$\int_0^L dy\int_0^L A'^2\sin^2\frac{n_x\pi x}{L}\,\sin^2\frac{n_y\pi y}{L}\,dx = 1 \tag{5.43}$$
from which follows
$$A' = \frac{2}{L} \tag{5.44}$$
(The solutions to this problem, which are standing de Broglie waves on a
two-dimensional surface, are similar to the solutions of the classical problem of the
vibrations of a stretched membrane such as a drumhead.)
Finally, we can substitute our solution for ψ(x, y) back into Eq. 5.37 to find the
energy:
$$E = \frac{\hbar^2\pi^2}{2mL^2}(n_x^2 + n_y^2) = \frac{h^2}{8mL^2}(n_x^2 + n_y^2) \tag{5.45}$$
Compare this result with Eq. 5.30. Once again we let $E_0 = \hbar^2\pi^2/2mL^2 = h^2/8mL^2$
so that $E = E_0(n_x^2 + n_y^2)$. In Figure 5.16 the energies of the excited states are shown.
You can see how different the energies are from those of the one-dimensional
case shown in Figure 5.10.
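The level scheme of Eq. 5.45 is easy to tabulate. The sketch below, with a function name of our own choosing, groups the quantum-number pairs by energy in units of E0 and lists the lowest levels.

```python
from collections import defaultdict

def levels_2d(n_max):
    """Map energy (in units of E0) to the list of (nx, ny) pairs producing it.

    Only quantum numbers from 1 to n_max are considered, so levels above
    n_max**2 + 1 may be missing some pairs.
    """
    levels = defaultdict(list)
    for nx in range(1, n_max + 1):
        for ny in range(1, n_max + 1):
            levels[nx * nx + ny * ny].append((nx, ny))
    return dict(sorted(levels.items()))

levels = levels_2d(5)
lowest = list(levels)[:11]   # the eleven lowest energies, in units of E0
for energy in lowest:
    print(f"{energy:3d} E0: {levels[energy]}")
```

The spacing is irregular, unlike the simple n² pattern of the one-dimensional well.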
Figure 5.17 shows the probability density $\psi^2$ for several different combinations
of the quantum numbers $n_x$ and $n_y$. The probability has maxima and minima, just
like the probability in the one-dimensional problem. For example, if we gave
the particle an energy of $8E_0$ and then made a large number of measurements
of its position, we would expect to find it most often near the four points
(x, y) = (L/4, L/4), (L/4, 3L/4), (3L/4, L/4) and (3L/4, 3L/4); we expect never
to find it near x = L/2 or y = L/2. The shape of the probability density tells us
something about the quantum numbers and therefore about the energy. Thus if we
measured the probability density and found six maxima, as shown in Figure 5.17,
we would deduce that the particle had an energy of $13E_0$ with $n_x = 2$ and $n_y = 3$,
or else $n_x = 3$, $n_y = 2$.
FIGURE 5.16 The lower permitted energy levels of a particle confined to
an infinite two-dimensional potential energy well. (Levels shown, labeled by
$(n_x, n_y)$: (1,1) at $2E_0$; (2,1) or (1,2) at $5E_0$; (2,2) at $8E_0$; (3,1) or (1,3) at $10E_0$;
(3,2) or (2,3) at $13E_0$; (4,1) or (1,4) at $17E_0$; (3,3) at $18E_0$; (4,2) or (2,4) at $20E_0$;
(4,3) or (3,4) at $25E_0$; (5,1) or (1,5) at $26E_0$; (5,2) or (2,5) at $29E_0$.)
FIGURE 5.17 The probability density for some of the lower energy levels of a particle confined to the infinite
two-dimensional potential energy well. The individual plots are labeled with the quantum numbers $(n_x, n_y)$ and
with the value of the energy E. (Panels: (1,1), $2E_0$; (2,1) and (1,2), $5E_0$; (2,2), $8E_0$; (3,1) and (1,3), $10E_0$;
(3,2) and (2,3), $13E_0$; (3,3), $18E_0$.)
FIGURE 5.18 A ring of iron atoms on a copper surface forms a “corral”
within which the probability density of trapped electrons is clearly visible.
This image was obtained with a scanning tunneling electron microscope.
(Image originally created by IBM Corporation.)
Recently it has become possible to photograph the probability densities of
electrons confined in a two-dimensional region. The tip of an electron microscope
was used to place 48 individual iron atoms on a metal surface in a ring or
“corral” of radius 7.13 nm that formed the walls of a potential well, as shown in
Figure 5.18. Inside the ring, the waves of probability density for electrons trapped
in the potential well are clearly visible. The potential well is circular, rather than
square, but otherwise the analysis follows the procedures described in this section;
when the Schrödinger equation is solved in cylindrical polar coordinates with the
potential energy for a circular well, the calculated probability density gives a close
match with the observed one. These beautiful results are a dramatic confirmation
of the wave functions obtained for the two-dimensional potential energy well.
Degeneracy Occasionally it happens that two different sets of quantum numbers
$n_x$ and $n_y$ have exactly the same energy. This situation is known as degeneracy,
and the energy levels are said to be degenerate. For example, the energy level
at $E = 13E_0$ is degenerate, because both $n_x = 2$, $n_y = 3$ and $n_x = 3$, $n_y = 2$ have
$E = 13E_0$. This degeneracy arises from interchanging $n_x$ and $n_y$ (which is the
same as interchanging the x and y axes), so the probability distributions in the
two cases are not very different. However, consider the state with $E = 50E_0$, for
which there are three sets of quantum numbers: $n_x = 7$, $n_y = 1$; $n_x = 1$, $n_y = 7$;
and $n_x = 5$, $n_y = 5$. The first two sets of quantum numbers result from the
interchange of $n_x$ and $n_y$ and so have similar probability distributions, but the third
represents a very different state of motion, as shown in Figure 5.19. The level
at $E = 13E_0$ is said to be two-fold degenerate, while the level at $E = 50E_0$ is
three-fold degenerate; we could also say that one level has a degeneracy of 2,
while the other has a degeneracy of 3.
FIGURE 5.19 Two very different probability densities with exactly the same energy.
(Panels: (1,7) and (5,5), each with $E = 50E_0$.)
Degeneracy occurs in general whenever a system is labeled by two or more
quantum numbers; as we have seen in the above calculation, different combinations
of quantum numbers often can give the same value of the energy. The number
of different quantum numbers required by a given physical problem turns out
to be exactly equal to the number of dimensions in which the problem is
being solved: one-dimensional problems need only one quantum number, two-dimensional problems need two, and so forth. When we get to three dimensions,
as in Problem 19 at the end of this chapter and especially in the hydrogen atom in
Chapter 7, we find that the effects of degeneracy become more significant; in the
case of atomic physics, the degeneracy is a major contributor to the structure and
properties of atoms.
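The degeneracy of a given level of the two-dimensional well can be found by searching for all pairs of positive integers whose squares sum to E/E0. The following sketch (the function name is our own; `math.isqrt` requires Python 3.8 or later) reproduces the two cases discussed above.

```python
import math

def states_with_energy(e_over_e0):
    """All (nx, ny) with nx, ny >= 1 and nx^2 + ny^2 equal to e_over_e0."""
    states = []
    for nx in range(1, math.isqrt(e_over_e0) + 1):
        rest = e_over_e0 - nx * nx
        if rest <= 0:
            continue
        ny = math.isqrt(rest)
        if ny * ny == rest:      # rest must be a perfect square
            states.append((nx, ny))
    return states

for level in (8, 13, 50):
    found = states_with_energy(level)
    print(f"E = {level} E0: degeneracy {len(found)}, states {found}")
```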
5.5 THE SIMPLE HARMONIC OSCILLATOR
Another situation that can be analyzed using the Schrödinger equation is the
one-dimensional simple harmonic oscillator. The classical oscillator is an object
of mass m attached to a spring of force constant k. The spring exerts a restoring
force F = −kx on the object, where x is the displacement from its equilibrium
position. Using Newton’s laws, we can analyze the oscillator and show that it has
a (circular or angular) frequency $\omega_0 = \sqrt{k/m}$ and a period $T = 2\pi\sqrt{m/k}$. The
maximum distance of the oscillating object from its equilibrium position is x0 ,
the amplitude of the oscillation. The oscillator has its maximum kinetic energy at
x = 0; its kinetic energy vanishes at the turning points x = ±x0 . At the turning
points the oscillator comes to rest for an instant and then reverses its direction of
motion. The motion is, of course, confined to the region −x0 ≤ x ≤ +x0 .
Why analyze the motion of such a system using quantum mechanics? Although
we never find in nature an example of a one-dimensional quantum oscillator, there
are systems that behave approximately as one—a vibrating diatomic molecule,
for example. In fact, any system in a smoothly varying potential energy well near
its minimum behaves approximately like a simple harmonic oscillator.
A force F = −kx has the associated potential energy $U = \tfrac{1}{2}kx^2$, and so we
have the Schrödinger equation:
$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + \frac{1}{2}kx^2\psi = E\psi \tag{5.46}$$
(Because we are working in one dimension, U and ψ are functions only of x.)
There are no boundaries between different regions of potential energy here, so
the wave function must fall to zero for both x → +∞ and x → −∞. The simplest
function that satisfies these conditions, which turns out to be the correct ground
state wave function, is $\psi(x) = Ae^{-ax^2}$. The constant a and the energy E can be
found by substituting this function into Eq. 5.46. We begin by evaluating $d^2\psi/dx^2$.
$$\frac{d\psi}{dx} = -2ax\,(Ae^{-ax^2})$$
$$\frac{d^2\psi}{dx^2} = -2a(Ae^{-ax^2}) - 2ax(-2ax)Ae^{-ax^2} = (-2a + 4a^2x^2)Ae^{-ax^2}$$
Substituting into Eq. 5.46 and canceling the common factor $Ae^{-ax^2}$ yields
$$\frac{\hbar^2a}{m} - \frac{2a^2\hbar^2}{m}x^2 + \frac{1}{2}kx^2 = E \tag{5.47}$$
Equation 5.47 is not an equation to be solved for x, because we are looking for
a solution that is valid for any x, not just for one specific value. In order for this
to hold for any x, the coefficients of $x^2$ must cancel and the remaining constants
must be equal. (That is, consider the equation $bx^2 = c$. It will be true for any and
all x only if both b = 0 and c = 0.) Thus
$$-\frac{2a^2\hbar^2}{m} + \frac{1}{2}k = 0 \quad\text{and}\quad \frac{\hbar^2a}{m} = E \tag{5.48}$$
which yield
$$a = \frac{\sqrt{km}}{2\hbar} \quad\text{and}\quad E = \frac{1}{2}\hbar\sqrt{k/m} \tag{5.49}$$
We can also write the energy in terms of the classical frequency $\omega_0 = \sqrt{k/m}$ as
$$E = \frac{1}{2}\hbar\omega_0 \tag{5.50}$$
FIGURE 5.20 The probability density $|\psi|^2$ for the ground state of the simple harmonic
oscillator. The classical turning points are at $x = \pm x_0$.
The coefficient A is found from the normalization condition (see Problem 20 at the
end of the chapter). The result, which is valid only for this ground-state wave
function, is $A = (m\omega_0/\hbar\pi)^{1/4}$. The complete wave function of the ground state is then
$$\psi(x) = \left(\frac{m\omega_0}{\hbar\pi}\right)^{1/4}e^{-(\sqrt{km}/2\hbar)x^2} \tag{5.51}$$
The probability density for this wave function is illustrated in Figure 5.20. Note
that, as in the case of the finite potential energy well, the probability density can
penetrate into the forbidden region beyond the classical turning points at x = ±x0
(in this region the potential energy is greater than E).
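The values of a and E in Eq. 5.49 can be checked by substituting the Gaussian back into Eq. 5.46 numerically. This is a minimal sketch in assumed dimensionless units (ħ = m = 1, k = 4, chosen only for convenience); the normalization constant is omitted because it cancels from the equation.

```python
import math

# Assumed convenient units (not from the text).
HBAR, M, K = 1.0, 1.0, 4.0

a = math.sqrt(K * M) / (2.0 * HBAR)    # Eq. 5.49
E = 0.5 * HBAR * math.sqrt(K / M)      # Eq. 5.49

def psi(x):
    """Unnormalized ground-state wave function exp(-a x^2)."""
    return math.exp(-a * x * x)

def residual(x, h=1e-4):
    """Left side minus right side of Eq. 5.46, using a finite-difference psi''."""
    d2 = (psi(x + h) - 2.0 * psi(x) + psi(x - h)) / (h * h)
    return -(HBAR**2 / (2.0 * M)) * d2 + 0.5 * K * x * x * psi(x) - E * psi(x)

# The residual should vanish at every x, inside or beyond the turning points.
residuals = [residual(x) for x in (-1.5, -0.3, 0.0, 0.8, 2.0)]
print(residuals)
```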
The solution we have found corresponds only to the ground state of the
oscillator. The general solution is of the form $\psi_n(x) = Af_n(x)e^{-ax^2}$, where $f_n(x)$ is
a polynomial in which the highest power of x is $x^n$. The corresponding energies are
$$E_n = \left(n + \tfrac{1}{2}\right)\hbar\omega_0 \qquad n = 0, 1, 2, \ldots \tag{5.52}$$
These levels are shown in Figure 5.21. Note that they are uniformly spaced,
in contrast to the one-dimensional infinite potential energy well. Probability
densities are shown in Figure 5.22. All of the solutions have the property of
penetration of probability density into the forbidden region beyond the classical
turning points. The probability density oscillates, somewhat like a sine wave,
between the turning points, and decreases like $e^{-ax^2}$ to zero beyond the turning
points. Note the great similarity between the probability densities for the quantum
oscillator and those of the finite potential energy well (Figure 5.14).
A sequence of vibrational excited states similar to Figure 5.21 is commonly
found in diatomic molecules such as HCl (see Chapter 9). The spacing between
the states is typically 0.1–1 eV; the states are observed when photons (in the
infrared region of the spectrum) are emitted or absorbed as the molecule jumps
from one state to another. A similar sequence is observed in nuclei, where the
spacing is 0.1–1 MeV and the radiations are in the gamma-ray region of the
spectrum.
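To connect these level spacings with the quoted spectral regions, the following sketch converts a transition energy to a photon wavelength using λ = hc/E. The value hc ≈ 1239.84 eV·nm is a standard constant assumed here (it is not quoted in this passage), and the function name is illustrative.

```python
HC_EV_NM = 1239.84  # h*c in eV·nm; standard value assumed here, not quoted in the passage

def photon_wavelength_nm(energy_ev):
    """Wavelength (nm) of a photon carrying the given energy (eV)."""
    return HC_EV_NM / energy_ev

spacings = [("molecular, 0.1 eV", 0.1), ("molecular, 1 eV", 1.0),
            ("nuclear, 0.1 MeV", 0.1e6), ("nuclear, 1 MeV", 1.0e6)]
for label, spacing_ev in spacings:
    print(f"{label:>18}: {photon_wavelength_nm(spacing_ev):.3e} nm")
```

The molecular spacings correspond to wavelengths of roughly 1 to 12 micrometers (infrared), and the nuclear spacings to roughly 1 to 12 picometers (gamma rays).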
FIGURE 5.21 Energy levels of the simple harmonic oscillator. Note that the levels have equal spacings and
that the distance between the classical turning points increases with energy. (Levels shown: n = 0, 1, 2, 3 with
E = (1/2)ħω0, (3/2)ħω0, (5/2)ħω0, (7/2)ħω0, drawn inside the potential energy curve U = (1/2)kx².)
FIGURE 5.22 Probability densities for the simple harmonic oscillator. Note how the distance between the classical
turning points (marked by the short vertical lines) increases with energy. Compare with the probability densities for
the finite potential energy well (Figure 5.14). (Panels: n = 0, 1, 2, 3; horizontal axis x.)
Example 5.5
An electron is bound to a region of space by a springlike
force with an effective spring constant of k = 95.7 eV/nm².
(a) What is its ground-state energy? (b) How much energy
must be absorbed for the electron to jump from the ground
state to the second excited state?
Solution
(a) The ground-state energy is
$$E = \tfrac{1}{2}\hbar\omega_0 = \tfrac{1}{2}\hbar\sqrt{\frac{k}{m}} = \tfrac{1}{2}\hbar c\sqrt{\frac{k}{mc^2}} = \tfrac{1}{2}(197\ \text{eV}\cdot\text{nm})\sqrt{\frac{95.7\ \text{eV/nm}^2}{0.511\times 10^6\ \text{eV}}} = 1.35\ \text{eV}$$
(b) The difference between adjacent energy levels is
$\hbar\omega_0 = 2.70$ eV for all energy levels, so the energy that
must be absorbed to go from the ground state to the second
excited state is ΔE = 2 × 2.70 eV = 5.40 eV.
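The arithmetic of Example 5.5 can be reproduced with a short sketch like the one below. It uses the same constants as the example (ħc = 197 eV·nm and mc² = 0.511 × 10⁶ eV); the function and variable names are illustrative.

```python
import math

HBAR_C = 197.0    # hbar*c in eV·nm, as used in the example
MC2 = 0.511e6     # electron rest energy in eV
K_SPRING = 95.7   # spring constant in eV/nm^2

def level_energy(n, k=K_SPRING):
    """E_n = (n + 1/2) * hbar*omega0, with hbar*omega0 = hbar*c * sqrt(k / mc^2)."""
    return (n + 0.5) * HBAR_C * math.sqrt(k / MC2)

ground = level_energy(0)
absorbed = level_energy(2) - level_energy(0)   # ground state to second excited state
print(f"E0 = {ground:.2f} eV, absorbed = {absorbed:.2f} eV")
```

Carrying unrounded intermediate values gives an absorbed energy slightly below 5.40 eV; the small difference comes only from rounding ħω0 to 2.70 eV in the worked solution.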
Example 5.6
For the electron of Example 5.5 in its ground state, what
is the probability to find it in a narrow interval of width
0.004 nm located halfway between the equilibrium position
and the classical turning point?
Solution
First we need to find the location of the turning point. At
the classical turning points x = ±x0, the kinetic energy is
zero and so the total energy is all potential. Thus $E = \tfrac{1}{2}kx_0^2$,
and so
$$x_0 = \sqrt{\frac{2E}{k}} = \sqrt{\frac{2(1.35\ \text{eV})}{95.7\ \text{eV/nm}^2}} = 0.168\ \text{nm}$$
Evaluating the parameters of the wave function $Ae^{-ax^2}$ (the
normalization constant A and the exponential coefficient
a), we have
$$A = \left(\frac{m\omega_0}{\hbar\pi}\right)^{1/4} = \left(\frac{mc^2\,\hbar\omega_0}{\hbar^2c^2\pi}\right)^{1/4} = \left(\frac{(0.511\times 10^6\ \text{eV})(2.70\ \text{eV})}{(197\ \text{eV}\cdot\text{nm})^2\,\pi}\right)^{1/4} = 1.83\ \text{nm}^{-1/2}$$
$$a = \frac{\sqrt{km}}{2\hbar} = \frac{\sqrt{kmc^2}}{2\hbar c} = \frac{\sqrt{(95.7\ \text{eV/nm}^2)(0.511\times 10^6\ \text{eV})}}{2(197\ \text{eV}\cdot\text{nm})} = 17.74\ \text{nm}^{-2}$$
The probability in the interval dx = 0.004 nm at x = x0/2
= 0.084 nm is then
$$P(x)\,dx = |\psi(x)|^2\,dx = A^2 e^{-2ax^2}\,dx = (1.83\ \text{nm}^{-1/2})^2\, e^{-2(17.74\ \text{nm}^{-2})(0.084\ \text{nm})^2}\,(0.004\ \text{nm}) = 0.0104 = 1.04\%$$
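The steps of Example 5.6 can be sketched in code as follows, again with ħc = 197 eV·nm and mc² = 0.511 × 10⁶ eV. The names are illustrative, and the narrow-interval approximation P ≈ |ψ(x)|² dx is the same one used in the worked solution.

```python
import math

HBAR_C = 197.0    # hbar*c in eV·nm
MC2 = 0.511e6     # electron rest energy in eV
K_SPRING = 95.7   # spring constant in eV/nm^2 (Example 5.5)

hbar_omega0 = HBAR_C * math.sqrt(K_SPRING / MC2)            # level spacing, eV
x0 = math.sqrt(hbar_omega0 / K_SPRING)                      # turning point from E = (1/2) k x0^2
a = math.sqrt(K_SPRING * MC2) / (2 * HBAR_C)                # exponential coefficient, nm^-2
A = (MC2 * hbar_omega0 / (HBAR_C ** 2 * math.pi)) ** 0.25   # normalization, nm^-1/2

def probability(x, dx):
    """|psi(x)|^2 dx for a narrow interval of width dx located at x (both in nm)."""
    return A ** 2 * math.exp(-2 * a * x * x) * dx

p_half = probability(x0 / 2, 0.004)
print(f"x0 = {x0:.3f} nm, a = {a:.2f} nm^-2, A = {A:.2f} nm^-1/2, P = {p_half:.4f}")
```

Because 2a x0² = 1 for the ground state, the exponential factor at x = x0/2 is exactly e^(−1/4), whatever the spring constant.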
As we did in the case of the infinite potential energy well, let’s look at the
application of the uncertainty principle to the wave packet represented by the
harmonic oscillator. Using the results of Problems 22 and 23 for the uncertainties
in position and momentum for the ground state of the oscillator, $\Delta x = \sqrt{\hbar/2m\omega_0}$
and $\Delta p = \sqrt{\hbar\omega_0 m/2}$, the product of the uncertainties is $\Delta x\,\Delta p = \hbar/2$. This is the
minimum possible value for this product, according to Eq. 4.10. The ground state
of the oscillator thus represents the most “compact” wave packet in which the
product of the uncertainties has its smallest value. You can see from Figure 5.22
that the excited states of the oscillator are much less compact (more spread out)
than the ground state.
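A minimal numerical sketch of this result is given below. It works in scaled units with ħ = m = ω0 = 1 (an assumption made for convenience), obtains (x²)av by integrating the ground-state density, and then gets (p²)av from conservation of energy in the manner suggested by Problem 23. The integrator and names are illustrative.

```python
import math

# Scaled units (assumed for convenience): hbar = m = omega0 = 1, so k = m * omega0^2 = 1
HBAR = M = OMEGA0 = 1.0
K = M * OMEGA0 ** 2
E0 = 0.5 * HBAR * OMEGA0   # ground-state energy

def density(x):
    """|psi(x)|^2 = A^2 exp(-2 a x^2) with A^2 = sqrt(m omega0 / (hbar pi)), 2a = m omega0 / hbar."""
    return math.sqrt(M * OMEGA0 / (HBAR * math.pi)) * math.exp(-M * OMEGA0 * x * x / HBAR)

def integrate(f, lo, hi, steps=20000):
    """Midpoint-rule integral of f from lo to hi (illustrative helper)."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

x2_av = integrate(lambda x: x * x * density(x), -10.0, 10.0)
p2_av = 2 * M * (E0 - 0.5 * K * x2_av)   # from (p^2)av / 2m + (1/2) k (x^2)av = E
delta_x = math.sqrt(x2_av)               # x_av = 0 by symmetry
delta_p = math.sqrt(p2_av)               # p_av = 0 by symmetry
print(f"delta_x * delta_p = {delta_x * delta_p:.4f} (in units of hbar)")
```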
5.6 STEPS AND BARRIERS
In this general type of problem, we analyze what happens when a particle moving
(again in one dimension) in a region of constant potential energy suddenly moves
into a region of different, but also constant, potential energy. We will not discuss
in detail the solutions to these problems, but the methods of solution of each are
so similar that we can outline the steps to take in the solution. In this discussion,
we let E be the (fixed) total energy of the particle and U0 will be the value of
the constant potential energy. In these calculations, the particle is not confined,
so the energy is not quantized—we are free to choose any value for the particle
energy.
Potential Energy Step, E > U0
Consider the potential energy step shown in Figure 5.23:
$$U(x) = \begin{cases} 0 & x < 0 \\ U_0 & x \ge 0 \end{cases} \tag{5.53}$$
If the total energy E of the particle is greater than U0, then we can write the
solutions to the Schrödinger equation in the two regions based on the general
form of Eq. 5.16:
$$\psi_0(x) = A\sin k_0x + B\cos k_0x \qquad k_0 = \sqrt{\frac{2mE}{\hbar^2}} \qquad x < 0 \tag{5.54a}$$
$$\psi_1(x) = C\sin k_1x + D\cos k_1x \qquad k_1 = \sqrt{\frac{2m}{\hbar^2}(E - U_0)} \qquad x > 0 \tag{5.54b}$$
FIGURE 5.23 A step of height U0. Particles are incident from the left with energy E. The kinetic energy is equal
to E in the region x < 0 and is reduced to E − U0 in the region x > 0. (Labels: K = E for x < 0; K = E − U0 for
x > 0; the step is at x = 0.)
Relationships among the four coefficients, A, B, C, and D, may be found by
applying the condition that $\psi(x)$ and $\psi'(x) = d\psi/dx$ must be continuous at the
boundary; thus $\psi_0(0) = \psi_1(0)$ and $\psi_0'(0) = \psi_1'(0)$. A typical solution might look
like Figure 5.24. Note the smooth transition between the solutions at x = 0, which
results from applying the continuity conditions.
The coefficients A, B, C, and D are in general complex, so to visualize the
complete wave we need both the real and imaginary parts of ψ. We can use the
equation $e^{i\theta} = \cos\theta + i\sin\theta$ to transform these solutions from sines and cosines
to complex exponentials:
$$\psi_0(x) = A' e^{ik_0x} + B' e^{-ik_0x} \qquad x < 0 \tag{5.55a}$$
$$\psi_1(x) = C' e^{ik_1x} + D' e^{-ik_1x} \qquad x > 0 \tag{5.55b}$$
The coefficients A′, B′, C′, D′ can be found from the coefficients A, B, C, D. The
time dependent wave functions are obtained by multiplying each term by $e^{-i\omega t}$,
which gives
$$\Psi_0(x,t) = A' e^{i(k_0x-\omega t)} + B' e^{-i(k_0x+\omega t)} \tag{5.56a}$$
$$\Psi_1(x,t) = C' e^{i(k_1x-\omega t)} + D' e^{-i(k_1x+\omega t)} \tag{5.56b}$$
FIGURE 5.24 Wave function for electrons incident from the left on a potential energy step for
E > U0. The probability density and the real and imaginary parts of the wave function are shown for
(a) t = 0 and (b) t = 1/4 period. The vertical line marks the location of the step. (Curves in each panel:
|ψ|², Re(ψ), Im(ψ).)
We can then make the following identification of the component waves, recalling
that (kx − ωt) is the phase of a wave moving in the positive x direction, while
(kx + ωt) is the phase of a wave moving in the negative x direction, and
assuming that the squared magnitude of each coefficient gives the intensity of
the corresponding component wave. In the region x < 0, Eq. 5.56a describes
the superposition of a wave $e^{i(k_0x-\omega t)}$ of intensity $|A'|^2$ moving in the positive x
direction (from −∞ to 0) and a wave $e^{-i(k_0x+\omega t)}$ of intensity $|B'|^2$ moving in the
negative x direction. Suppose we had intended our solution to describe particles
that are incident from the left on this step. Then $|A'|^2$ gives the intensity of the
incident wave (more exactly, the de Broglie wave describing the incident beam of
particles) and $|B'|^2$ gives the intensity of the reflected wave. The ratio $|B'|^2/|A'|^2$
tells us the reflected fraction of the incident wave intensity.
In the region x > 0, Eq. 5.56b describes the transmitted wave $e^{i(k_1x-\omega t)}$ of
intensity $|C'|^2$ moving to the right and a wave $e^{-i(k_1x+\omega t)}$ of intensity $|D'|^2$ moving
to the left. If particles are incident from −∞, it is not possible to have particles in
the region x > 0 moving to the left, so in this particular experimental situation we
are justified in setting D′ to zero.
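With D′ = 0, the continuity conditions at x = 0 applied to the complex-exponential solutions reduce to A′ + B′ = C′ and k0(A′ − B′) = k1C′. The sketch below solves these for B′/A′ and C′/A′ for an electron and evaluates the reflected fraction. The energies are illustrative values, the function names are hypothetical, and the transmitted fraction is weighted by k1/k0 (the ratio of particle speeds in the two regions), which is a standard flux argument not spelled out in this section.

```python
import math

HBAR_C = 197.0   # hbar*c in eV·nm
MC2 = 0.511e6    # electron rest energy in eV

def wave_number(kinetic_ev):
    """k = sqrt(2 m K) / hbar in nm^-1 for an electron with kinetic energy K (eV)."""
    return math.sqrt(2 * MC2 * kinetic_ev) / HBAR_C

def step_amplitudes(energy_ev, step_ev):
    """Return (B'/A', C'/A', k0, k1) for a step with E > U0 and D' = 0."""
    if energy_ev <= step_ev:
        raise ValueError("this sketch covers only E > U0")
    k0 = wave_number(energy_ev)
    k1 = wave_number(energy_ev - step_ev)
    b = (k0 - k1) / (k0 + k1)   # from A' + B' = C' and k0 (A' - B') = k1 C'
    c = 2 * k0 / (k0 + k1)
    return b, c, k0, k1

b, c, k0, k1 = step_amplitudes(5.0, 3.0)   # illustrative: E = 5 eV, U0 = 3 eV
reflected = b ** 2                          # |B'|^2 / |A'|^2
transmitted = (k1 / k0) * c ** 2            # flux-weighted transmitted fraction
print(f"reflected = {reflected:.3f}, transmitted = {transmitted:.3f}")
```

The two fractions add to 1, so every incident particle is accounted for as either reflected or transmitted.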
Figure 5.24a shows that the probability density has the same value everywhere
in the region x > 0. You can see this immediately from Eq. 5.56b with D′ = 0;
taking the squared magnitude of the remaining term gives a constant result,
independent of x and t. This is consistent with what we expect for the de Broglie
wave of free particles; the particles can be found anywhere in the region x > 0
with equal probability.
In the region x < 0, the incident and reflected waves combine to produce a
standing wave, for which the probability density has fixed maxima and minima.
The probability density in this region does not vary with time, as suggested by the
plots for the two different times (t = 0 and t = 1/4 period) shown in Figure 5.24.
To illustrate the propagation of the de Broglie wave, it is instructive also
to plot the real and imaginary parts of the wave function, which are shown in
Figure 5.24. Here you can see the change in wavelength (corresponding to the
change in kinetic energy or momentum) in crossing the step. You can also see
something of the time dependence—the wave propagates in both regions, but it
does so in a way that the real and imaginary parts combine to give a probability
density that remains unchanged in time.
Potential Energy Step, E < U0
If the energy of the particle is less than the height of the potential energy step,
then the solution in the region x > 0 is of the form of Eq. 5.18:
$$\psi_0(x) = A\sin k_0x + B\cos k_0x \qquad k_0 = \sqrt{\frac{2mE}{\hbar^2}} \qquad x < 0 \tag{5.57a}$$
$$\psi_1(x) = Ce^{k_1x} + De^{-k_1x} \qquad k_1 = \sqrt{\frac{2m}{\hbar^2}(U_0 - E)} \qquad x > 0 \tag{5.57b}$$
We set C = 0 to keep $\psi_1(x)$ from becoming infinite as x → ∞, and we apply
the boundary conditions on $\psi(x)$ and $\psi'(x)$ at x = 0. The resulting solution is
shown in Figure 5.25. The probability density for x > 0 shows penetration into the
classically forbidden region. All particles are reflected from the barrier; that is, if
we write $\Psi_0(x,t)$ in the form of Equation 5.56a, then we must have $|A'| = |B'|$,
indicating that the waves moving to the right (the incident wave) and to the left
(the reflected wave) in the region x < 0 have equal amplitudes.
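The equality |A′| = |B′| can be checked with a short sketch. Writing the x < 0 solution in the exponential form of Eq. 5.55a and the x > 0 solution as De^(−k1x), continuity of ψ and dψ/dx at x = 0 gives A′ + B′ = D and ik0(A′ − B′) = −k1D. The energies below are illustrative and the function name is hypothetical.

```python
import math

HBAR_C = 197.0   # hbar*c in eV·nm
MC2 = 0.511e6    # electron rest energy in eV

def step_below_amplitudes(energy_ev, step_ev):
    """Return (B'/A', D/A', k0, k1) for an electron with 0 < E < U0."""
    if not 0 < energy_ev < step_ev:
        raise ValueError("this sketch covers only 0 < E < U0")
    k0 = math.sqrt(2 * MC2 * energy_ev) / HBAR_C
    k1 = math.sqrt(2 * MC2 * (step_ev - energy_ev)) / HBAR_C   # decay constant of Eq. 5.57b
    r = (1j * k0 + k1) / (1j * k0 - k1)   # B'/A' from the two continuity conditions
    d = 1 + r                             # D/A' from continuity of psi
    return r, d, k0, k1

r, d, k0, k1 = step_below_amplitudes(2.0, 5.0)   # illustrative: E = 2 eV, U0 = 5 eV
print(f"|B'/A'| = {abs(r):.6f}, phase shift on reflection = {math.degrees(math.atan2(r.imag, r.real)):.1f} deg")
```

The reflected wave has the same magnitude as the incident wave but a shifted phase, which is what produces the standing-wave pattern for x < 0.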
FIGURE 5.25 Wave function for electrons incident from the left on a potential energy step for
E < U0. The probability density and the real and imaginary parts of the wave function are shown
for (a) t = 0 and (b) t = 1/4 period. The vertical line marks the location of the step. (Curves in each
panel: |ψ|², Re(ψ), Im(ψ).)
Figure 5.25 shows the probability density at two different times, illustrating that
the probability density does not change with time. In the region x < 0 we again
have standing waves with fixed maxima and minima. Viewing the real and imaginary parts at two different times (t = 0 and t = 1/4 period) shows that the wave
is propagating, even though the probability density does not change with time.
Penetration into the forbidden region is associated with the wave nature of the
particle and also with the uncertainty in the particle’s energy or location. The
probability density in the x > 0 region is $|\psi_1|^2$, which according to Eq. 5.57b is
proportional to $e^{-2k_1x}$. If we define a representative penetration distance Δx to be
the distance over which the probability drops by 1/e, then $e^{-2k_1\Delta x} = e^{-1}$ and so
$$\Delta x = \frac{1}{2k_1} = \frac{1}{2}\,\frac{\hbar}{\sqrt{2m(U_0 - E)}} \tag{5.58}$$
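To get a feeling for the size of this penetration distance, the sketch below evaluates Eq. 5.58 for an electron, using ħc = 197 eV·nm and mc² = 0.511 × 10⁶ eV as in the examples of Section 5.5. The energy deficits U0 − E are illustrative values, and the function name is hypothetical.

```python
import math

HBAR_C = 197.0   # hbar*c in eV·nm
MC2 = 0.511e6    # electron rest energy in eV

def penetration_distance_nm(deficit_ev):
    """Eq. 5.58 for an electron: hbar*c / (2 sqrt(2 mc^2 (U0 - E))), with U0 - E in eV."""
    return HBAR_C / (2 * math.sqrt(2 * MC2 * deficit_ev))

for deficit in (0.1, 1.0, 10.0):   # illustrative values of U0 - E
    print(f"U0 - E = {deficit:5.1f} eV  ->  penetration = {penetration_distance_nm(deficit):.4f} nm")
```

For a deficit of about 1 eV the distance is roughly 0.1 nm, comparable to atomic dimensions, and it shrinks as the inverse square root of U0 − E.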
To be able to enter the region with x > 0, the particle must gain an energy
of at least U0 − E in order to get over the potential energy step; it must in
addition gain some kinetic energy if it is to move in the region x > 0. Of course,
it is a violation of conservation of energy for the particle to spontaneously gain
any amount of energy, but according to the uncertainty relationship $\Delta E\,\Delta t \sim \hbar$
conservation of energy does not apply at times smaller than Δt except to within
an amount $\Delta E \sim \hbar/\Delta t$. That is, if the particle “borrows” an amount of energy
ΔE and “returns” the borrowed energy within a time $\Delta t \sim \hbar/\Delta E$, we observers
will still believe energy is conserved. Suppose the particle borrows an energy
sufficient to give it a kinetic energy of K in the forbidden region. How far into the
forbidden region does the particle penetrate?
The “borrowed” energy is (U0 − E) + K; the energy (U0 − E) gets the particle
to the top of the step, and the extra kinetic energy K gives it its motion. The
energy must be returned within a time
$$\Delta t = \frac{\hbar}{U_0 - E + K} \tag{5.59}$$
The particle moves with speed $v = \sqrt{2K/m}$, and so the distance it can travel is
$$\Delta x = \tfrac{1}{2}v\,\Delta t = \frac{1}{2}\sqrt{\frac{2K}{m}}\,\frac{\hbar}{U_0 - E + K} \tag{5.60}$$
(The factor of 1/2 is present because in the time Δt the particle must penetrate the
distance Δx into the forbidden region and return through that same distance to the
allowed region.)
In the limit K → 0, the penetration distance Δx goes to 0 according to Eq. 5.60
because the particle has zero velocity; similarly, Δx → 0 in the limit K → ∞,
because it moves for a vanishing time interval Δt. In between those limits, there
must be a maximum value of Δx for some particular K. Differentiating Eq. 5.60
with respect to K, we can find the maximum value
$$\Delta x_{\max} = \frac{1}{2}\,\frac{\hbar}{\sqrt{2m(U_0 - E)}} \tag{5.61}$$
This value of Δx is identical with Eq. 5.58! This demonstrates that the penetration
into the forbidden region given by the solution to the Schrödinger equation is
entirely consistent with the uncertainty relationship. (The agreement between
Eqs. 5.58 and 5.61 is really somewhat accidental, because the factor 1/e used to
obtain Eq. 5.58 was chosen arbitrarily. What we have really demonstrated is that
the estimates of uncertainty given by the Heisenberg relationships are consistent
with the wave properties of the particle obtained from the Schrödinger equation.
This should not be surprising, because the uncertainty principle can be derived as
a consequence of the Schrödinger equation.)
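The maximization that leads from Eq. 5.60 to Eq. 5.61 can also be checked numerically. The sketch below scans Eq. 5.60 over a grid of borrowed kinetic energies K for an electron and compares the largest value found with Eq. 5.61. The deficit U0 − E = 1 eV and the grid are illustrative assumptions, and the function names are hypothetical.

```python
import math

HBAR_C = 197.0   # hbar*c in eV·nm
MC2 = 0.511e6    # electron rest energy in eV

def travel_distance_nm(kinetic_ev, deficit_ev):
    """Eq. 5.60 for an electron: (1/2) sqrt(2K/mc^2) * hbar*c / (U0 - E + K)."""
    return 0.5 * math.sqrt(2 * kinetic_ev / MC2) * HBAR_C / (deficit_ev + kinetic_ev)

def max_distance_nm(deficit_ev):
    """Eq. 5.61: hbar*c / (2 sqrt(2 mc^2 (U0 - E)))."""
    return HBAR_C / (2 * math.sqrt(2 * MC2 * deficit_ev))

deficit = 1.0                                              # illustrative U0 - E in eV
grid = [deficit * i / 1000 for i in range(1, 10001)]       # K from 0.001 to 10 times the deficit
best_k = max(grid, key=lambda k: travel_distance_nm(k, deficit))
print(f"K at maximum = {best_k:.3f} eV")
print(f"scan maximum = {travel_distance_nm(best_k, deficit):.5f} nm, Eq. 5.61 = {max_distance_nm(deficit):.5f} nm")
```

The scan peaks where K equals U0 − E, which is also what setting the derivative of Eq. 5.60 to zero gives.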
Potential Energy Barrier
Consider now the potential energy barrier shown in Figure 5.26:
$$U(x) = \begin{cases} 0 & x < 0 \\ U_0 & 0 \le x \le L \\ 0 & x > L \end{cases} \tag{5.62}$$
FIGURE 5.26 A barrier of height U0 and width L. (The barrier extends from x = 0 to x = L; the particle
energy E lies below U0.)
FIGURE 5.27 The real part of the wave function of a particle of energy E < U0 encountering a barrier (the
particle is incident from the left in the figure). The wavelength λ0 is the same on both sides of the barrier, but
the amplitude beyond the barrier is much less than the original amplitude.
Particles with energy E less than U0 are incident from the left. Our experience
then leads us to expect solutions of the form shown in Figure 5.27—sinusoidal
oscillation in the region x < 0 (an incident wave and a reflected wave), exponentials in the region 0 ≤ x ≤ L, and sinusoidal oscillations in the region x > L (the
transmitted wave). Note that the intensity of the transmitted wave (x > L) is much
smaller than the intensity of the incident + reflected waves (x < 0), which means
that most of the particles are reflected and few are transmitted through the barrier.
Also note that the wavelengths are the same on either side of the barrier (because
the kinetic energies are the same).
The intensity of the transmitted wave, which can be found by application of the
continuity conditions, depends on the energy of the particle and on the height and
thickness of the barrier. Classically, the particles should never appear at x > L,
because they do not have sufficient energy to overcome the barrier. This situation
is an example of barrier penetration, sometimes called quantum mechanical
tunneling. Particles cannot be observed while they are in the classically forbidden
region 0 ≤ x ≤ L, but can “tunnel” through that region and be observed at x > L.
Every particle incident on the barrier of Figure 5.26 is either reflected or
transmitted; the number of incident particles is equal to the number reflected
back to x < 0 plus the number transmitted to x > L. None are “trapped” or ever
seen in the forbidden region 0 < x < L. How can the incident particle get from
x < 0 to x > L? As a classical particle, it can’t! However, the wave representing
the particle can penetrate through the barrier, which allows the particle to be
observed in the classically allowed region x > L.
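The dependence of the transmitted intensity on the particle energy and on the barrier height and thickness can be illustrated with the standard result for a rectangular barrier with E < U0, T = [1 + U0² sinh²(κL) / (4E(U0 − E))]⁻¹ with κ = √(2m(U0 − E))/ħ. This formula follows from the continuity conditions but is not derived in this section, so it is quoted here as an assumption; the electron energies and widths are illustrative and the function name is hypothetical.

```python
import math

HBAR_C = 197.0   # hbar*c in eV·nm
MC2 = 0.511e6    # electron rest energy in eV

def transmission(energy_ev, barrier_ev, width_nm):
    """Transmission probability of an electron through a rectangular barrier, 0 < E < U0.

    Uses the standard textbook expression (not derived in this section):
    T = 1 / (1 + U0^2 sinh^2(kappa L) / (4 E (U0 - E))).
    """
    if not 0 < energy_ev < barrier_ev:
        raise ValueError("this sketch covers only 0 < E < U0")
    kappa = math.sqrt(2 * MC2 * (barrier_ev - energy_ev)) / HBAR_C   # nm^-1
    s = math.sinh(kappa * width_nm)
    return 1.0 / (1.0 + barrier_ev ** 2 * s * s / (4 * energy_ev * (barrier_ev - energy_ev)))

for width in (0.1, 0.2, 0.5, 1.0):   # illustrative barrier widths in nm
    print(f"L = {width:.1f} nm  ->  T = {transmission(2.0, 5.0, width):.3e}")
```

The transmission falls off roughly exponentially with the barrier width, so most particles are reflected unless the barrier is very thin.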
FIGURE 5.28 (a) Total internal reflection of light waves at a glass-air boundary. (b) Frustrated total internal
reflection. The thicker the air gap, the smaller the probability to penetrate. Note that the light beam
does not appear in the gap.
This phenomenon of penetration of a forbidden region is a well-known property
of classical waves. Quantum physics provides a new aspect to this phenomenon by
associating a particle with the wave, and thus allowing a particle to pass through
a classically forbidden region. An example of the penetration effect for classical
waves occurs for total internal reflection∗ of light waves. Figure 5.28a shows a
light beam in glass incident on a boundary with air. The beam is totally reflected
in the glass. However, if a second piece of glass is brought close to the first, as
in Figure 5.28b, the beam can appear in the second piece of glass. This effect is
called frustrated total internal reflection. The intensity of the beam in the second
piece, represented by the widths of the arrows in Figure 5.28b, decreases rapidly
as the thickness of the gap increases.
Just like our unobservable quantum wave, which penetrates a few wavelengths
into the forbidden region, an unobservable light wave of exponentially decreasing
amplitude, the evanescent wave, penetrates into the air even when the light wave
undergoes total reflection in the glass. The evanescent wave carries no energy
away from the interface, so it cannot be directly observed in the air, but it can be
observed in another medium such as a second piece of glass placed close to the
first. Evanescent waves have applications in microscopy, where they enable the
production of images of individual molecules.
Although the potential energy barrier of Figure 5.26 is rather artificial, there
are many practical examples of quantum tunneling:
1. Alpha Decay. An atomic nucleus consists of protons and neutrons in a
constant state of motion; occasionally these particles form themselves into an
aggregate of two protons and two neutrons, called an alpha particle. In one
form of radioactive decay, the nucleus can emit an alpha particle, which can
be detected in the laboratory. However, in order to escape from the nucleus
the alpha particle must penetrate a barrier of the form shown in Figure 5.29.
The probability for the alpha particle to penetrate the barrier, and be detected
in the laboratory, can be computed based on the energy of the alpha particle
and the height and thickness of the barrier. The decay probability can be
measured in the laboratory, and it is found to be in excellent agreement with
the value obtained from a quantum-mechanical calculation based on barrier
penetration.
2. Ammonia Inversion. Figure 5.30 is a representation of the ammonia molecule
NH3. If we were to try to move the nitrogen atom along the axis of the molecule,
toward the plane of the hydrogen atoms, we would find repulsion caused by the three
hydrogen atoms, which produces a potential energy of the form shown in
Figure 5.31. According to classical mechanics, unless we give the nitrogen
atom sufficient energy, it should not be able to surmount the barrier and appear
on the other side of the plane of hydrogens. According to quantum mechanics,
the nitrogen can tunnel through the barrier and appear on the other side of the
molecule. In fact, the nitrogen atom actually tunnels back and forth with a
frequency in excess of $10^{10}$ oscillations per second.
3. The Tunnel Diode. A tunnel diode is an electronic device that uses the
phenomenon of tunneling. Schematically, the potential energy of an electron
in a tunnel diode can be represented by Figure 5.32. The current that flows
through the device is produced by electrons tunneling through the barrier.
The rate of tunneling, and therefore the current, can be regulated merely by
changing the height of the barrier, which can be done with an applied voltage.
This can be done rapidly, so that switching frequencies in excess of $10^9$ Hz
can be obtained. Ordinary semiconductor diodes depend on the diffusion of
electrons across a junction, and therefore operate on much longer time scales
(that is, at lower frequencies).
4. The Scanning Tunneling Microscope. Images of individual atoms on the
surface of materials (such as Figure 5.18) can be made with the scanning
tunneling microscope. Electrons are trapped in a surface by a potential energy
barrier (the work function of the material). When a needlelike probe is placed
within about 1 nm of the surface (Figure 5.33), electrons can tunnel through
the barrier between the surface and the probe and produce a current that can
be recorded in an external circuit. The current is very sensitive to the width of
the barrier (the distance from the probe to the surface). In practice, a feedback
mechanism keeps the current constant by moving the tip up and down. The
motion of the tip gives a map of the surface that reveals details smaller
than 0.01 nm, about 1/100 the diameter of an atom! For the development of
the scanning tunneling microscope, Gerd Binnig and Heinrich Rohrer were
awarded the 1986 Nobel Prize in physics.
∗ Total internal reflection occurs when a light beam is incident on a boundary between two substances,
such as glass and air, from the side with the higher index of refraction. If the angle of incidence inside
the glass exceeds a certain critical value, the light beam is totally reflected back into the glass.
FIGURE 5.29 (a) A nuclear potential energy barrier is penetrated by an alpha particle of energy E. (b) A
representation of the real part of the wave function of the alpha particle. The probability to penetrate the
barrier depends strongly on the energy of the alpha particle. (Labels: repulsive Coulomb potential; attractive
nuclear potential; energy of alpha particle, E.)
FIGURE 5.30 A schematic diagram of the ammonia molecule. The Coulomb repulsion of the three hydrogens
establishes a barrier against the nitrogen atom moving to a symmetric position (shown in dashed lines) on the
opposite side of the plane of hydrogens.
FIGURE 5.31 The potential energy seen by the nitrogen atom in an ammonia molecule. The nitrogen can
penetrate the barrier and move from one equilibrium position to another. (Vertical axis: potential energy of
nitrogen; horizontal axis: distance from plane of hydrogens. Labels: potential energy barrier; equilibrium
positions; ground state energy.)
FIGURE 5.32 The potential energy barrier seen by an electron in a tunnel diode. The conductivity of the
device is determined by the electron’s probability to penetrate the barrier, which depends on the height of the
barrier. (Labels: barrier height determined by applied voltage; energy of electrons.)
FIGURE 5.33 In a scanning tunneling microscope, a needlelike probe is scanned over a surface. The probe is
moved vertically so that the distance between the probe and the surface remains constant as the probe scans
laterally. (Labels: probe; path of probe tip; electron cloud; atom on surface.)
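The sensitivity of the scanning tunneling microscope current to the probe distance can be illustrated with a rough sketch. It assumes that the current scales like the probability density in the forbidden region, e^(−2κd) with κ = √(2mφ)/ħ, in the spirit of Eq. 5.58, and it takes a hypothetical work function φ = 4 eV; neither the simple model nor that value comes from the text.

```python
import math

HBAR_C = 197.0   # hbar*c in eV·nm
MC2 = 0.511e6    # electron rest energy in eV

def current_ratio(extra_gap_nm, work_function_ev=4.0):
    """Factor by which the tunneling current changes when the gap grows by extra_gap_nm.

    Assumes current proportional to exp(-2 kappa d); work_function_ev is a hypothetical value.
    """
    kappa = math.sqrt(2 * MC2 * work_function_ev) / HBAR_C   # nm^-1
    return math.exp(-2 * kappa * extra_gap_nm)

for extra in (0.01, 0.05, 0.1):   # illustrative changes in probe height, nm
    print(f"gap + {extra:.2f} nm  ->  current x {current_ratio(extra):.3f}")
```

In this model, moving the tip away by only 0.1 nm reduces the current to well under a fifth of its value, which is why a feedback loop holding the current constant can track surface height so finely.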
Chapter Summary
Time-independent Schrödinger equation: $-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + U(x)\psi(x) = E\psi(x)$ (Section 5.3)
Time-dependent Schrödinger equation: $\Psi(x,t) = \psi(x)e^{-i\omega t}$ (Section 5.3)
Probability density: $P(x) = |\psi(x)|^2$ (Section 5.3)
Normalization condition: $\int_{-\infty}^{+\infty} |\psi(x)|^2\,dx = 1$ (Section 5.3)
Probability in interval x1 to x2: $P(x_1\!:\!x_2) = \int_{x_1}^{x_2} |\psi(x)|^2\,dx$ (Section 5.3)
Average or expectation value of f(x): $[f(x)]_{av} = \int_{-\infty}^{+\infty} |\psi(x)|^2 f(x)\,dx$ (Section 5.3)
Constant potential energy, E > U0: $\psi(x) = A\sin kx + B\cos kx$, $k = \sqrt{2m(E - U_0)/\hbar^2}$ (Section 5.4)
Constant potential energy, E < U0: $\psi(x) = Ae^{k'x} + Be^{-k'x}$, $k' = \sqrt{2m(U_0 - E)/\hbar^2}$ (Section 5.4)
Infinite potential energy well: $\psi_n(x) = \sqrt{\frac{2}{L}}\sin\frac{n\pi x}{L}$, $E_n = \frac{h^2n^2}{8mL^2}$ (n = 1, 2, 3, . . .) (Section 5.4)
Two-dimensional infinite well: $\psi(x,y) = \frac{2}{L}\sin\frac{n_x\pi x}{L}\sin\frac{n_y\pi y}{L}$, $E = \frac{h^2}{8mL^2}(n_x^2 + n_y^2)$ (Section 5.4)
Simple harmonic oscillator ground state: $\psi(x) = (m\omega_0/\hbar\pi)^{1/4} e^{-(\sqrt{km}/2\hbar)x^2}$ (Section 5.5)
Simple harmonic oscillator energies: $E_n = (n + \tfrac{1}{2})\hbar\omega_0$ (n = 0, 1, 2, . . .) (Section 5.5)
Potential energy step, E > U0: $\psi_0(x<0) = A\sin k_0x + B\cos k_0x$, $\psi_1(x>0) = C\sin k_1x + D\cos k_1x$ (Section 5.6)
Potential energy step, E < U0: $\psi_0(x<0) = A\sin k_0x + B\cos k_0x$, $\psi_1(x>0) = Ce^{k_1x} + De^{-k_1x}$ (Section 5.6)
Questions
1. Newton’s laws can be solved to give the future behavior of
a particle. In what sense does the Schrödinger equation also
do this? In what sense does it not?
2. Why is it important for a wave function to be normalized? Is
an unnormalized wave function a solution to the Schrödinger
equation?
3. What is the physical meaning of $\int_{-\infty}^{+\infty} |\psi|^2\,dx = 1$?
4. What are the dimensions of ψ(x)? Of ψ(x, y)?
5. None of the following are permitted as solutions of the
Schrödinger equation. Give the reasons in each case.
(a) $\psi(x) = A\cos kx$ for x < 0; $\psi(x) = B\sin kx$ for x > 0
(b) $\psi(x) = Ax^{-1}e^{-kx}$ for −L ≤ x ≤ L
(c) $\psi(x) = A\sin^{-1} kx$
(d) $\psi(x) = A\tan kx$ for x > 0
6. What happens to the probability density in the infinite well
when n → ∞? Is this consistent with classical physics?
7. How would the solution to the infinite potential energy well
be different if the well extended from x = x0 to x = x0 + L,
where x0 is a nonzero value of x? Would any of the measurable properties be different?
8. How would the solution to the one-dimensional infinite
potential energy well be different if the potential energy
were not zero for 0 ≤ x ≤ L but instead had a constant value
U0 ? What would be the energies of the excited states? What
would be the wavelengths of the standing de Broglie waves?
Sketch the behavior of the lowest two wave functions.
9. Assuming a pendulum to behave like a quantum oscillator, what are the energy differences between the quantum
states of a pendulum of length 1 m? Are such differences
observable?
10. For the potential energy barrier (Figure 5.26), is the wavelength for x > L the same as the wavelength for x < 0? Is
the amplitude the same?
11. Suppose particles were incident on the potential energy step
from the positive x direction. Which of the four coefficients
of Eq. 5.56 would be set to zero? Why?
12. The energies of the excited states of the systems we have discussed in this chapter have been exact; there is no energy
uncertainty. What does this suggest about the lifetime of
particles in those excited states? Left on its own, will a
particle ever make transitions from one state to another?
13. Explain how the behavior of a particle in a one-dimensional
infinite well can be considered in terms of standing de
Broglie waves.
14. How would you design an experiment to observe barrier
penetration with sound waves? What range of thicknesses
would you choose for the barrier?
15. If U0 were negative in Figure 5.26, how would the wave
functions appear for E > 0?
16. Does Eq. 5.2 imply that we know the momentum of the
particle exactly? If so, what does the uncertainty principle
indicate about our knowledge of its location? How can you
reconcile this with our knowledge that the particle must be
in the well?
17. Do sharp boundaries and discontinuous jumps of potential
energy occur in nature? If not, how would our analysis of
potential energy steps and barriers be different?
Problems
5.1 Behavior of a Wave at a Boundary
1. A ball falls from rest at a height H above a lake. Let y = 0
at the surface of the lake. As it falls, it experiences a gravitational force −mg. When it enters the water, it experiences
a buoyant force B so the net force in the water is B − mg.
(a) Write expressions for v(t) and y(t) while the ball is falling
in air. (b) In the water, let $v_2(t) = at + b$ and $y_2(t) = \tfrac{1}{2}at^2 + bt + c$,
where a = (B − mg)/m. Use the continuity conditions at the surface of the water to find the constants b and c.
2. A wave has the form y = A cos(2π x/λ + π/3) when x < 0.
For x > 0, the wavelength is λ/2. By applying continuity
conditions at x = 0, find the amplitude (in terms of A) and
phase of the wave in the region x > 0. Sketch the wave,
showing both x < 0 and x > 0.
5.2 Confining a Particle
3. The lowest energy of a particle in an infinite one-dimensional
well is 4.4 eV. If the width of the well is doubled, what is its
lowest energy?
4. An electron is trapped in an infinite well of width 0.120 nm.
What are the three longest wavelengths permitted for the
electron’s de Broglie waves?
5. An electron is trapped in a one-dimensional region of width
0.050 nm. Find the three smallest possible values allowed
for the energy of the electron.
6. What is the minimum energy of a neutron ($mc^2$ = 940 MeV)
confined to a region of space of nuclear dimensions
($1.0 \times 10^{-14}$ m)?
5.3 The Schrödinger Equation
7. In the region 0 ≤ x ≤ a, a particle is described by
the wave function $\psi_1(x) = -b(x^2 - a^2)$. In the region
a ≤ x ≤ w, its wave function is $\psi_2(x) = (x - d)^2 - c$. For
x ≥ w, $\psi_3(x) = 0$. (a) By applying the continuity conditions
at x = a, find c and d in terms of a and b. (b) Find w in terms of
a and b.
8. A particle is described by the wave function
$\psi(x) = b(a^2 - x^2)$ for −a ≤ x ≤ +a and ψ(x) = 0 for
x ≤ −a and x ≥ +a, where a and b are positive real constants. (a) Using the normalization condition, find b in terms
of a. (b) What is the probability to find the particle at
x = +a/2 in a small interval of width 0.010a? (c) What is
the probability for the particle to be found between x = +a/2
and x = +a?
9. In a certain region of space, a particle is described by the
wave function $\psi(x) = Cxe^{-bx}$ where C and b are real constants. By substituting into the Schrödinger equation, find
the potential energy in this region and also find the energy of
the particle. (Hint: Your solution must give an energy that
is a constant everywhere in this region, independent of x.)
10. A particle is represented by the following wave function:
$$\psi(x) = \begin{cases} 0 & x < -L/2 \\ C(2x/L + 1) & -L/2 < x < 0 \\ C(-2x/L + 1) & 0 < x < +L/2 \\ 0 & x > +L/2 \end{cases}$$
(a) Use the normalization condition to find C. (b) Evaluate the probability to find the particle in an interval of
width 0.010L at x = L/4 (that is, between x = 0.245L and
x = 0.255L). (No integral is necessary for this calculation.)
(c) Evaluate the probability to find the particle between
x = 0 and x = +L/4. (d) Find the average value of x and
the rms value of x: $x_{rms} = \sqrt{(x^2)_{av}}$.
5.4 Applications of the Schrödinger Equation
11. A particle in an infinite well is in the ground state with an
energy of 1.26 eV. How much energy must be added to the
particle to reach the second excited state (n = 3)? The third
excited state (n = 4)?
12. An electron is trapped in an infinitely deep one-dimensional
well of width 0.251 nm. Initially the electron occupies the
n = 4 state. (a) Suppose the electron jumps to the ground
state with the accompanying emission of a photon. What
is the energy of the photon? (b) Find the energies of other
photons that might be emitted if the electron takes other
paths between the n = 4 state and the ground state.
13. Show that Eq. 5.31 gives the value $A = \sqrt{2/L}$.
14. A particle is trapped in an infinite one-dimensional well
of width L. If the particle is in its ground state, evaluate
the probability to find the particle (a) between x = 0 and
x = L/3; (b) between x = L/3 and x = 2L/3; (c) between
x = 2L/3 and x = L.
15. A particle is confined between rigid walls separated by a
distance L = 0.189 nm. The particle is in the second excited
state (n = 3). Evaluate the probability to find the particle in
an interval of width 1.00 pm located at: (a) x = 0.188 nm;
(b) x = 0.031 nm; (c) x = 0.079 nm. (Hint: No integrations are required for this problem; use Eq. 5.7 directly.)
What would be the corresponding results for a classical
particle?
16. What is the next level (above E = 50E0) of the two-dimensional particle in a box in which the degeneracy
is greater than 2?
17. A particle is confined to a two-dimensional box of length L
and width 2L. The energy values are $E = (\hbar^2\pi^2/2mL^2)(n_x^2 + n_y^2/4)$.
Find the two lowest degenerate levels.
18. Show by direct substitution that Eq. 5.39 gives a solution to
the two-dimensional Schrödinger equation, Eq. 5.37. Find
the relationship between $k_x$, $k_y$, and E.
19. A particle is confined to a three-dimensional region of
space of dimensions L by L by L. The energy levels
are $(\hbar^2\pi^2/2mL^2)(n_x^2 + n_y^2 + n_z^2)$, where $n_x$, $n_y$, and $n_z$ are
integers ≥ 1. Sketch an energy level diagram, showing the
energies, quantum numbers, and degeneracies for the lowest
10 energy levels.
5.5 The Simple Harmonic Oscillator
20. Using the normalization condition, show that the constant A
has the value $(m\omega_0/\hbar\pi)^{1/4}$ for the one-dimensional simple
harmonic oscillator in its ground state.
21. (a) At the classical turning points ±x0 of the simple harmonic
oscillator, K = 0 and so E = U. From this relationship, show
that $x_0 = (\hbar\omega_0/k)^{1/2}$ for an oscillator in its ground state.
(b) Find the turning points in the first and second excited
states.
22. Use the ground-state wave function of the simple harmonic
oscillator to find $x_{av}$, $(x^2)_{av}$, and Δx. Use the normalization
constant $A = (m\omega_0/\hbar\pi)^{1/4}$.
23. (a) Using a symmetry argument rather than a calculation,
determine the value of $p_{av}$ for a simple harmonic oscillator.
(b) Conservation of energy for the harmonic oscillator can
be used to relate $p^2$ to $x^2$. Use this relation, along with
the value of $(x^2)_{av}$ from Problem 22, to find $(p^2)_{av}$ for the
oscillator in its ground state. (c) Using the results of parts a
and b, show that $\Delta p = \sqrt{\hbar\omega_0 m/2}$.
24. The ground state energy of an oscillating electron is 1.24 eV.
How much energy must be added to the electron to move it
to the second excited state? The fourth excited state?
25. Compare the probabilities for an oscillating particle in its
ground state to be found in a small interval of width dx at
the center of the well and at the classical turning points.
5.6 Steps and Barriers
26. Find the value of K at which Eq. 5.60 has its maximum
value, and show that Eq. 5.61 is the maximum value of $\Delta x$.
27. For a particle with energy E < U0 incident on the potential
energy step, use ψ0 and ψ1 from Eqs. 5.57, and evaluate the
constants B and D in terms of A by applying the boundary
conditions at x = 0.
28. Using the wave functions of Eq. 5.55 for the potential energy
step, apply the boundary conditions of ψ and dψ/dx to find
B′ and C ′ in terms of A′ , for the potential step when particles
are incident from the negative x direction. Evaluate the ratios
|B′ |2 /|A′ |2 and |C ′ |2 /|A′ |2 and interpret.
29. (a) Write down the wave functions for the three regions
of the potential energy barrier (Figure 5.26) for E < U0 .
You will need six coefficients in all. Use complex exponential notation. (b) Use the boundary conditions at x = 0
and at x = L to find four relationships among the six coefficients. (Do not try to solve these relationships.) (c) Suppose
particles are incident on the barrier from the left. Which
coefficient should be set to zero? Why?
30. Repeat Problem 29 for the potential energy barrier when
E > U0 , and sketch a representative probability density that
shows several cycles of the wave function. In your sketch,
make sure the amplitude and wavelength in each region
accurately describe the situation.
FIGURE 5.34 Problem 32.
General Problems
31. An electron is trapped in a one-dimensional well of width
0.132 nm. The electron is in the n = 10 state. (a) What
is the energy of the electron? (b) What is the uncertainty
in its momentum? (Hint: Use Eq. 4.10.) (c) What is the
uncertainty in its position? How do these results change as
n → ∞? Is this consistent with classical behavior?
32. Sketch the form of a possible solution to the Schrödinger
equation for each of the potential energies shown in
Figure 5.34. The potential energies go to infinity at the
boundaries. In each case show several cycles of the wave
function. In your sketches, pay attention to the continuity conditions (where applicable) and to changes in the
wavelength and amplitude.
33. Show that the average value of $x^2$ in the one-dimensional
infinite potential energy well is $L^2(1/3 - 1/2n^2\pi^2)$.
34. Use the result of Problem 33 to show that,
for the infinite one-dimensional
well, defining $\Delta x = \sqrt{(x^2)_{\text{av}} - (x_{\text{av}})^2}$
gives $\Delta x = L\sqrt{1/12 - 1/2\pi^2 n^2}$.
35. (a) In the infinite one-dimensional well, what is
$p_{\text{av}}$? (Use a symmetry argument.) (b) What is $(p^2)_{\text{av}}$? [Hint:
What is $(p^2/2m)_{\text{av}}$?] (c) Defining $\Delta p = \sqrt{(p^2)_{\text{av}} - (p_{\text{av}})^2}$,
show that $\Delta p = hn/2L$.
36. The first excited state of the harmonic oscillator has a wave
function of the form $\psi(x) = Axe^{-ax^2}$. Follow the method
outlined in Section 5.5 to find a and the energy E. Find the
constant A from the normalization condition.
37. Using the normalization constant A from Problem 20 and
the value of a from Eq. 5.49, evaluate the probability to
find an oscillator in the ground state beyond the classical
turning points ±x0 . This problem cannot be solved in closed,
analytic form. Develop an approximate, numerical method
using a graph, calculator, or computer. Assume an electron bound to an atomic-sized region (x0 = 0.1 nm) with an
effective force constant of 1.0 eV/nm$^2$.
38. A two-dimensional harmonic oscillator has energy
$E = \hbar\omega_0(n_x + n_y + 1)$, where $n_x$ and $n_y$ are integers beginning
with zero. (a) Justify this result based on the energy of
the one-dimensional oscillator. (b) Sketch an energy-level
diagram similar to Figure 5.21, showing the lowest 4 energy
levels. For each level, show the value of E (in units of
$\hbar\omega_0$), the quantum numbers $n_x$ and $n_y$, and the degeneracy.
(c) Show that the degeneracy of each level is equal to
$n_x + n_y + 1$.
Chapter 6
THE RUTHERFORD-BOHR MODEL OF THE ATOM
This model of the atom, based on the work of Rutherford and Bohr, shows electrons
circulating about the nucleus like planets circulating about the Sun. It can be a useful model
for some purposes, but it does not represent even approximately the structure of real atoms.
In Chapters 7 and 8 we will learn more about the behavior and properties of electrons in
atoms.
Our goal in this chapter is to understand some of the details of atomic structure
that can be learned from experimental studies of atoms. In particular, we discuss
two types of experiments that are important in the development of our theory
of atomic structure: the scattering of charged particles by atoms, which tells us
about the distribution of electric charge in atoms, and the emission or absorption
of radiation by atoms, which tells us about their excited states.
We use the information obtained from these experiments to develop an atomic
model, which helps us understand and explain the properties of atoms. A model
is usually an oversimplified picture of a more complex system, which provides
some insight into its operation but may not be sufficiently detailed to explain all
of its properties.
In this chapter, we discuss the experiments that led to the Rutherford-Bohr
model (also known simply as the Bohr model), which is based on the familiar
“planetary” structure in which the electrons orbit about the nucleus like planets
about the Sun. Even though this model is not strictly valid from the standpoint of
wave mechanics, it does help us understand many atomic properties, especially
the excited states of the simplest atom, hydrogen. In Chapter 7, we show how
wave mechanics changes our picture of the hydrogen atom, and in Chapter 8 we
consider the structure of more complicated atoms.
6.1 BASIC PROPERTIES OF ATOMS
Before we begin to construct a model of the atom, it is helpful to summarize some
of the basic properties of atoms.
1. Atoms are very small, about 0.1 nm ($0.1 \times 10^{-9}$ m) in radius. Thus any
effort to “see” an atom using visible light (λ = 500 nm) is hopeless owing
to diffraction effects. We can make a crude estimate of the maximum size
of an atom in the following way. Consider a cube of elemental matter, for
example, iron. Iron has a density of about 8 g/cm$^3$ and a molar mass of
56 g. One mole of iron (56 g) contains Avogadro’s number of atoms, about
$6 \times 10^{23}$. Thus $6 \times 10^{23}$ atoms occupy about 7 cm$^3$ and so 1 atom occupies
about $10^{-23}$ cm$^3$. If we assume the atoms of a solid are packed together in the
most efficient possible way, like hard spheres in contact, then the diameter of
one atom is about $\sqrt[3]{10^{-23}\ \text{cm}^3} = 2 \times 10^{-8}$ cm = 0.2 nm.
2. Atoms are stable—they do not spontaneously break apart into smaller pieces
or collapse; therefore the internal forces that hold the atom together must be in
equilibrium. This immediately tells us that the forces that pull the parts of an
atom together must be opposed in some way; otherwise atoms would collapse.
3. Atoms contain negatively charged electrons, but are electrically neutral. If we
disturb an atom or collection of atoms with sufficient force, electrons are emitted. We learn this fact from studying the Compton effect and the photoelectric
effect. We also learned in Chapter 4 that even though electrons are emitted from
the nuclei of atoms in certain radioactive decay processes, they don’t “exist”
in those nuclei but are manufactured there by some process. Electrons were
excluded from the nucleus based on the uncertainty principle, which forbids
emitted electrons of the energies observed in the laboratory from existing in the
nucleus (see Example 4.7). The uncertainty principle places no such restriction
on the existence of electrons in a volume as large as an atom (see Problem 1).
We can also easily observe that bulk matter is electrically neutral, and we
assume that this is likewise a property of the atoms. Experiments with beams
of individual atoms support this assumption. From these experimental facts
we deduce that an atom with Z negatively charged electrons must also contain
a net positive charge of Ze.
4. Atoms emit and absorb electromagnetic radiation. This radiation may take
many forms—visible light (λ ∼ 500 nm), X rays (λ ∼ 1 nm), ultraviolet rays
(λ ∼ 10 nm), infrared rays (λ ∼ 0.1 μm), and so forth. In fact it is from observation of these emitted and absorbed radiations, which can be measured with great
precision, that we learn most of what we know about atoms. In a typical emission measurement, an electric current is passed through a glass tube containing
a small sample of the gas phase of the element under study, and radiation is
emitted when an excited atom returns to its ground state. The absorption wavelengths can be measured by passing a beam of white light through a sample of
the gas and noting which colors are removed from the white light by absorption
in the gas. One particularly curious feature of the atomic radiations is that atoms
don’t always emit and absorb radiations at the same wavelengths—some wavelengths present in the emission experiment do not also appear in the absorption
experiment. Any successful theory of atomic structure must be able to account
for these emission and absorption wavelengths.
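The size estimate in item 1 can be reproduced in a few lines. The sketch below uses only the rounded inputs quoted there (density, molar mass, Avogadro's number); the variable names are illustrative, and treating each atom as filling a small cube is the same crude assumption made in the text.

```python
# Rough estimate of atomic size from bulk density, using the rounded
# values quoted in the text for iron.
AVOGADRO = 6e23          # atoms per mole (rounded)
density_g_cm3 = 8.0      # iron, g/cm^3
molar_mass_g = 56.0      # iron, g/mol

volume_per_mole_cm3 = molar_mass_g / density_g_cm3      # about 7 cm^3
volume_per_atom_cm3 = volume_per_mole_cm3 / AVOGADRO    # about 1e-23 cm^3

# Assume each atom fills a small cube; the cube edge approximates the diameter.
diameter_cm = volume_per_atom_cm3 ** (1.0 / 3.0)
diameter_nm = diameter_cm * 1e7                         # 1 cm = 1e7 nm

print(f"volume per atom = {volume_per_atom_cm3:.1e} cm^3")
print(f"diameter = {diameter_nm:.2f} nm")
```

The result is a little over 0.2 nm, in line with the estimate in item 1.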
6.2 SCATTERING EXPERIMENTS AND THE THOMSON MODEL
An early model of the structure of the atom was proposed (in 1904) by J. J.
Thomson, who was known for his previous identification of the electron and
measurement of its charge-to-mass ratio e/m. The Thomson model incorporates
many of the known properties of atoms: size, mass, number of electrons, and
electrical neutrality. In this model, an atom contains Z electrons that are embedded
in a uniform sphere of positive charge (Figure 6.1). The total positive charge of the
sphere is Ze, the mass of the sphere is essentially the mass of the atom (the electrons
don’t contribute significantly to the total mass), and the radius R of the sphere is the
radius of the atom. (This model is sometimes known as the “plum-pudding” model,
because the electrons are distributed throughout the atom like raisins in a plum
pudding.) As we will see, the Thomson model gives predictions that disagree with
experiment, and so it is not the correct way of understanding the structure of atoms.
One way of studying atoms is by probing the distribution of electric charge in
their interior, which we can do by bombarding the atom with charged particles and
observing the angle by which particles are deflected from their original direction.
This type of experiment is called a scattering experiment. Ideally we would like
to do this experiment with a single atom, such as is represented in Figure 6.2.
The scattering angle θ depends on the impact parameter b, which measures the
distance from the center of the atom that a projectile would pass if it were not
deflected. Each different value of the impact parameter results in a different value
of the scattering angle.
The particle is deflected from its original trajectory by the electrical forces
exerted on the particle by the atom. For a positively charged particle, these forces
are: (1) a repulsive force due to the positive charge of the atom, and (2) an attractive
force due to the negatively charged electrons. We assume that the mass of the
deflected particle is much greater than the mass of an electron but also much less
than the mass of the atom. In the encounter between the projectile and an electron,
the forces exerted on each by the other are equal and opposite (by Newton’s
third law), and so the principal victim of the encounter is the much less massive
electron; the effect on the projectile is negligible. (Imagine rolling a bowling ball
through a field of Ping-Pong balls!) We thus need consider only the positively
charged atom as a cause of the deflection of the particle. By the same argument,
we neglect any possible motion of the more massive atom caused by the passage
of the projectile. The basic experiment, then, is the scattering of a positively
charged projectile by the stationary positively charged massive part of the atom.
FIGURE 6.1 The Thomson model of the atom. Z electrons are embedded in a uniform sphere of positive charge Ze and radius R. An imaginary spherical surface of radius r contains a fraction $r^3/R^3$ of the positive charge.
FIGURE 6.2 A positively charged particle is deflected by an angle θ as it passes through a positively charged sphere, representing a Thomson model atom. The scattering angle depends on the value of the impact parameter b, which varies from 0 to R.
FIGURE 6.3 Scattering by a thin foil. Some individual scatterings tend to increase θ, while others tend to decrease θ.
In practice we cannot do the experiment with one atom. Instead, we bombard a
thin foil, as in Figure 6.3. The scattering angle θ that we observe in the laboratory
is the result of scattering by many atoms, with impact parameters that we do
not know and cannot control. Let’s assume that for a single atom the average
scattering angle is θav , which represents an average over all possible impact
parameters from zero up to the atomic radius R. For a typical foil thickness of
1 μm ($10^{-6}$ m), the projectile is scattered by about $10^4$ atoms.
The total scattering angle θ is determined by statistical considerations, because
some of the individual scatterings move the projectile toward larger scattering
angles and some toward smaller angles, as represented in Figure 6.3. This is an
example of a “random walk” problem—for N scatterings, the most likely observed
net scattering angle θ is related to the average individual scattering angle by
$$\theta \simeq \sqrt{N}\,\theta_{\text{av}} \tag{6.1}$$
According to the Thomson model, the average scattering angle for a single atom
is on the order of 0.01°, and for a foil that is $10^4$ atoms thick the net scattering
angle should be about 1°. This is consistent with experimental observations.
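The random-walk estimate can be checked numerically. The sketch below is a one-dimensional toy model, not the full scattering geometry: it assumes every scattering changes the angle by exactly +θav or −θav with equal probability, and it compares the root-mean-square net angle from many simulated passages with the value from Eq. 6.1. The function names, the number of trials, and the random seed are illustrative choices.

```python
import math
import random

def net_angle_estimate(n_scatterings, theta_single_deg):
    """Eq. 6.1: net angle ~ sqrt(N) * theta_av."""
    return math.sqrt(n_scatterings) * theta_single_deg

def simulated_rms_angle(n_scatterings, theta_single_deg, trials=2000, seed=1):
    """RMS net angle when each scattering adds +theta or -theta with equal odds."""
    rng = random.Random(seed)
    total_sq = 0.0
    for _ in range(trials):
        # Number of "+" steps = number of 1 bits among n_scatterings fair random bits.
        plus = bin(rng.getrandbits(n_scatterings)).count("1")
        net = (2 * plus - n_scatterings) * theta_single_deg
        total_sq += net * net
    return math.sqrt(total_sq / trials)

N = 10**4              # atoms traversed in a 1-micrometer foil
THETA_AV_DEG = 0.01    # average single-atom scattering angle, degrees

estimate = net_angle_estimate(N, THETA_AV_DEG)
simulated = simulated_rms_angle(N, THETA_AV_DEG)
print(f"sqrt(N) * theta_av = {estimate:.2f} deg")
print(f"simulated rms net angle = {simulated:.2f} deg")
```

Both numbers come out close to 1°, the net scattering angle quoted above.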
The most critical test of the Thomson model, which it fails completely, occurs
when we examine the probability for scattering at large angles. If each individual
scattering deflects the projectile through an angle of around 0.01°, then to observe
projectiles scattered through a total angle greater than 90°, we must have about
$10^4$ successive scatterings, all of which push the projectile toward larger angles.
Because the probabilities of individual scatterings toward either larger or smaller
angles are equal, the probability of having $10^4$ successive scatterings toward
larger angles, like the probability of finding $10^4$ successive heads in tossing a
coin, is about $(1/2)^{10,000} = 10^{-3000}$.
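A probability this small underflows ordinary floating-point arithmetic, so it is easiest to work with its base-10 logarithm. The short sketch below (the function name is illustrative) counts how many 0.01° deflections are needed to reach 90° and evaluates the logarithm of the probability that all of them go the same way.

```python
import math

def log10_all_same_direction(n_scatterings):
    """log10 of (1/2)**n: the chance that n independent 50/50 deflections all add."""
    return n_scatterings * math.log10(0.5)

theta_single_deg = 0.01
target_deg = 90.0

# Number of same-direction deflections needed to reach the target angle.
n_needed = round(target_deg / theta_single_deg)

print(f"deflections needed: {n_needed}")
print(f"log10 P for {n_needed} deflections: {log10_all_same_direction(n_needed):.0f}")
print(f"log10 P for 10^4 deflections: {log10_all_same_direction(10**4):.0f}")
```

For $10^4$ deflections the logarithm is about −3010, so the rounded figure $10^{-3000}$ used in the text is the right order of magnitude for the exponent.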
An experiment to observe this scattering was performed by Hans Geiger and
Ernest Marsden in the laboratory of Ernest Rutherford at Manchester University
in 1910. For projectiles they used alpha particles, which are nuclei of helium (of
charge +2e) emitted in radioactive decay. Their results showed that the probability
of an alpha particle scattering at angles greater than 90° was about $10^{-4}$. This
remarkable discrepancy between the expected value based on the Thomson model
($10^{-3000}$) and the observed value ($10^{-4}$) was described by Rutherford in this way:
It was quite the most incredible event that ever happened to me in my life.
It was as incredible as if you fired a 15-inch shell at a piece of tissue paper
and it came back and hit you.
Ernest Rutherford (1871–1937, England). Founder of nuclear physics, he
is known for his pioneering work on
alpha-particle scattering and radioactive decays. His inspiring leadership
influenced a generation of British nuclear and atomic scientists.
The analysis of the results of such scattering experiments led Rutherford to propose
that the mass and positive charge of the atom are not distributed uniformly over
the volume of the atom, but instead are concentrated in an extremely small region,
about $10^{-14}$ m in diameter, at the center of the atom. In Section 6.3 we will see
how this proposal is consistent with the large-angle scattering results.
Scattering in the Thomson Model (Optional)
Let’s assume that a projectile of positive charge ze is incident on an atom of
radius R that we represent according to the Thomson model as a uniform sphere
of positive charge Ze. The force on the projectile when it is a distance r from the
center of the atom can be computed using Gauss’s law (see Problem 2):
$$F = \frac{zZe^2}{4\pi\varepsilon_0 R^3}\, r \tag{6.2}$$
Before discussing the scattering, we should note that this equation can also describe
(if we put z = 1) the force on an electron embedded in the Thomson atom at a
distance r from its center. This force can be written F = kr with $k = Ze^2/4\pi\varepsilon_0 R^3$.
This linear restoring force permits the electrons to oscillate about their equilibrium
positions just like a mass on a spring subject to the linear restoring force F = kx. We
therefore expect the electrons in the Thomson atom to oscillate about their equilibrium positions with a frequency $f = (2\pi)^{-1}\sqrt{k/m}$, where k is the force constant.
Because an oscillating electric charge radiates electromagnetic waves whose frequency is identical to the oscillation frequency, we might expect, based on the
Thomson model, that the radiation emitted by atoms would show this characteristic
frequency. This turns out not to be true (see Problem 3); the calculated frequencies
do not correspond to the frequencies observed for radiation emitted by atoms.
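To get a feeling for the size of this characteristic frequency, the sketch below evaluates $f = (2\pi)^{-1}\sqrt{k/m}$ with $k = Ze^2/4\pi\varepsilon_0 R^3$. The inputs Z = 1 and R = 0.1 nm are assumptions chosen only for illustration (a one-electron atom with the typical radius quoted in Section 6.1); the constant and function names are likewise illustrative. Writing $k/m = kc^2/mc^2$ lets us use the electron rest energy in eV.

```python
import math

E2_OVER_4PI_EPS0 = 1.44          # eV*nm
ELECTRON_REST_ENERGY = 0.511e6   # eV (m c^2)
C_NM_PER_S = 2.998e17            # speed of light in nm/s

def thomson_frequency_hz(Z, radius_nm):
    """Oscillation frequency of an electron in a Thomson-model atom."""
    k = Z * E2_OVER_4PI_EPS0 / radius_nm**3                   # force constant, eV/nm^2
    omega = C_NM_PER_S * math.sqrt(k / ELECTRON_REST_ENERGY)  # rad/s, using k/m = k c^2/(m c^2)
    return omega / (2.0 * math.pi)

# Assumed illustrative inputs: Z = 1, R = 0.1 nm.
frequency_hz = thomson_frequency_hz(Z=1, radius_nm=0.1)
wavelength_nm = C_NM_PER_S / frequency_hz

print(f"f = {frequency_hz:.2e} Hz")
print(f"wavelength = {wavelength_nm:.0f} nm")
```

Under these assumptions the model yields a single frequency of order $10^{15}$ Hz, rather than the many distinct wavelengths that atoms are observed to emit.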
The exact calculation of the scattering angle for different values of the impact
parameter in the Thomson model of the atom is fairly complicated, but for our
purposes we want only an estimate of the average value of the angle. As we will
find out later, it’s not very important if our estimate is off by a small factor.
Initially the projectile moves in the x direction in the geometry of Figure 6.2,
but the atom exerts a force in the y direction that produces a small component
of momentum py in that direction. Using Newton’s second law we can find the
momentum from the impulse received by the projectile due to the electrostatic
force:
$$p_y = \int F_y\, dt \tag{6.3}$$
Rather than carry out this complicated integral for a force that is changing in magnitude and direction as the projectile travels, we’ll estimate the average scattering
angle by choosing an average value for the impact parameter, namely b = R/2 (representing the middle trajectory of Figure 6.2), and we’ll assume the force acts in
the y direction for a time $\Delta t$ determined by the projectile’s flight along a line of
length roughly equal to R. This underestimates the amount of time during which the
force acts but overestimates the effect of the force (which doesn’t act purely in the
y direction along the entire trajectory), so to some extent these two effects should
cancel one another.
Making these approximations, we obtain
$$p_y \cong F\,\Delta t \cong \frac{zZe^2 (R/2)}{4\pi\varepsilon_0 R^3}\,\frac{R}{v} = \frac{zZe^2}{8\pi\varepsilon_0 R v} \tag{6.4}$$
The angle θ is small, so we can make the approximation $\tan\theta \cong \theta$, and we can
assume that $p_x$ changes very little from its initial value mv, and so the average
scattering angle is
$$\theta_{\text{av}} \cong \tan\theta_{\text{av}} = \frac{p_y}{p_x} = \frac{p_y}{mv} = \frac{zZe^2}{8\pi\varepsilon_0 R v}\,\frac{1}{mv} = \frac{zZe^2}{16\pi\varepsilon_0 R K} \tag{6.5}$$
using the nonrelativistic kinetic energy $K = \frac{1}{2}mv^2$. This gives an estimate of the
scattering angle when the impact parameter b is equal to half the radius R. Smaller
values of b will give smaller deflection angles, and larger values of b will give
larger angles, so this is a reasonable estimate for the average scattering angle for
a Thomson model atom.
Example 6.1
Using the Thomson model, estimate the average scattering angle when alpha particles (z = 2) with kinetic energy
3 MeV are scattered from gold (Z = 79). The atomic radius
of gold is 0.179 nm.
Solution
Using $e^2/4\pi\varepsilon_0 = 1.44$ eV·nm, we have
$$\theta_{\text{av}} \cong \frac{zZe^2}{16\pi\varepsilon_0 RK} = \frac{1}{4}\,\frac{e^2}{4\pi\varepsilon_0}\,\frac{zZ}{RK} = \frac{0.25\,(1.44\ \text{eV}\cdot\text{nm})(2)(79)}{(0.179\ \text{nm})(3 \times 10^6\ \text{eV})} = 1 \times 10^{-4}\ \text{rad} = 0.01^\circ$$
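The arithmetic of Example 6.1 can be reproduced with a short function based on Eq. 6.5. This is a minimal sketch; the function and variable names are illustrative, and the inputs are the ones given in the example.

```python
import math

E2_OVER_4PI_EPS0 = 1.44  # eV*nm

def thomson_average_angle_rad(z, Z, radius_nm, kinetic_energy_eV):
    """Eq. 6.5: theta_av ~ (1/4) (e^2 / 4 pi eps0) z Z / (R K), in radians."""
    return 0.25 * E2_OVER_4PI_EPS0 * z * Z / (radius_nm * kinetic_energy_eV)

# Alpha particles (z = 2) of 3 MeV on gold (Z = 79, R = 0.179 nm).
theta_rad = thomson_average_angle_rad(z=2, Z=79, radius_nm=0.179, kinetic_energy_eV=3e6)
theta_deg = math.degrees(theta_rad)

print(f"theta_av = {theta_rad:.2e} rad = {theta_deg:.4f} deg")
```

The unrounded value is close to $1.06 \times 10^{-4}$ rad, or about 0.006°, which the example rounds to $1 \times 10^{-4}$ rad and 0.01° at one significant figure.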
Even though this result represents a rough estimate of the average scattering
angle in the Thomson model of the atom, its accuracy does not affect our
conclusions about the failure of the model. Even if our estimate were too small by
as much as a factor of 10 (which is highly unlikely), we would be comparing an
expected probability of $10^{-300}$ (instead of $10^{-3000}$) with the observed $10^{-4}$, still
a spectacular disagreement. Any reasonable estimate shows the complete failure
of the Thomson model to account for these scattering experiments.
6.3 THE RUTHERFORD NUCLEAR ATOM
In analyzing the scattering of alpha particles, Rutherford concluded that the most
likely way an alpha particle (m = 4 u) can be deflected through large angles is
by a single collision with a more massive object. Rutherford therefore proposed
that the charge and mass of the atom were concentrated at its center, in a region
called the nucleus. Figure 6.4 illustrates the scattering geometry in this case. The
projectile, of charge ze, experiences a repulsive force due to the positively charged
nucleus:
$$F = \frac{1}{4\pi\varepsilon_0}\frac{|q_1||q_2|}{r^2} = \frac{(ze)(Ze)}{4\pi\varepsilon_0 r^2} \tag{6.6}$$
FIGURE 6.4 Scattering by a nuclear atom. The path of the scattered particle is a hyperbola. Smaller impact parameters give larger scattering angles.
(Compare this with Eq. 6.2, which describes a projectile that is inside the sphere
of charge Ze and so feels only a portion of the positive charge. We assume now
that the projectile is always outside the nucleus, so it feels the full nuclear charge
Ze.) The atomic electrons, with their small mass, do not appreciably affect the
path of the projectile and we neglect their effect on the scattering. We also assume
that the nucleus is so much more massive than the projectile that it does not move
during the scattering process; because no recoil motion is given to the nucleus,
the initial and final kinetic energies K of the projectile are equal.
As Figure 6.4 shows, for each impact parameter b, there is a certain scattering
angle θ, and we need the relationship between b and θ. The projectile can be
shown* to follow a hyperbolic path; in polar coordinates r and φ, the equation of
the hyperbola is
$$\frac{1}{r} = \frac{1}{b}\sin\phi + \frac{zZe^2}{8\pi\varepsilon_0 b^2 K}(\cos\phi - 1) \tag{6.7}$$
As shown in Figure 6.5, the initial position of the particle is φ = 0, r → ∞, and
the final position is φ = π − θ , r → ∞. Using the coordinates at the final position,
Eq. 6.7 reduces to
$$b = \frac{zZe^2}{8\pi\varepsilon_0 K}\cot\tfrac{1}{2}\theta = \frac{zZ}{2K}\,\frac{e^2}{4\pi\varepsilon_0}\cot\tfrac{1}{2}\theta \tag{6.8}$$
FIGURE 6.5 The hyperbolic trajectory of a scattered particle.
(This result is written in this form so that $e^2/4\pi\varepsilon_0 = 1.44$ eV·nm or MeV·fm can
be easily inserted.) A projectile that approaches the nucleus with impact parameter
b will be scattered at an angle θ; projectiles approaching with smaller values of b
will be scattered through larger angles, as shown in Figure 6.4.
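The step from Eq. 6.7 to Eq. 6.8 can be checked numerically: if b is chosen according to Eq. 6.8, then 1/r from Eq. 6.7 should vanish at the final position φ = π − θ. The sketch below does this for a few angles. The choice of 8.0-MeV alpha particles on gold is only an illustrative assumption, and the function names are not from the text.

```python
import math

E2_OVER_4PI_EPS0 = 1.44  # MeV*fm

def impact_parameter_fm(theta_deg, z, Z, K_MeV):
    """Eq. 6.8: b = (zZ/2K)(e^2 / 4 pi eps0) cot(theta/2)."""
    half_angle = math.radians(theta_deg) / 2.0
    return (z * Z / (2.0 * K_MeV)) * E2_OVER_4PI_EPS0 / math.tan(half_angle)

def inverse_r_per_fm(phi_rad, b_fm, z, Z, K_MeV):
    """Eq. 6.7: 1/r on the hyperbola; zZe^2/(8 pi eps0 b^2 K) = zZ(e^2/4 pi eps0)/(2 b^2 K)."""
    coeff = z * Z * E2_OVER_4PI_EPS0 / (2.0 * b_fm**2 * K_MeV)
    return math.sin(phi_rad) / b_fm + coeff * (math.cos(phi_rad) - 1.0)

z, Z, K_MeV = 2, 79, 8.0   # assumed example: 8.0-MeV alpha particles on gold
for theta_deg in (30.0, 90.0, 150.0):
    b = impact_parameter_fm(theta_deg, z, Z, K_MeV)
    phi_final = math.pi - math.radians(theta_deg)
    residual = inverse_r_per_fm(phi_final, b, z, Z, K_MeV)
    print(f"theta = {theta_deg:5.1f} deg  b = {b:6.2f} fm  1/r at final position = {residual:.1e} fm^-1")
```

The printed values of 1/r are zero to within rounding error, and b decreases as θ increases, as Figure 6.4 indicates.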
We divide our study of the scattering of charged projectiles by nuclei (which
is commonly called Rutherford scattering) into three parts: (1) calculation of the
fraction of projectiles scattered at angles greater than some value of θ , (2) the
Rutherford scattering formula and its experimental verification, and (3) the closest
approach of a projectile to the nucleus.
1. The Fraction of Projectiles Scattered at Angles Greater than θ. From
Figure 6.4 we see immediately that every projectile with impact parameters less
than a given value of b will be scattered at angles greater than its corresponding
θ. What is the chance of a projectile having an impact parameter less than a given
value of b? Suppose the foil were one atom thick—a single layer of atoms packed
tightly together, as in Figure 6.6. Each atom is represented by a circular disc, of
area $\pi R^2$. If the foil contains N atoms, its total area is $N\pi R^2$. For scattering at
angles greater than θ, the impact parameter must fall between zero and b; that
is, the projectile must approach the atom within a circular disc of area $\pi b^2$. If
the projectiles are spread uniformly over the area of the disc, then the fraction of
projectiles that fall within that area is just $\pi b^2/\pi R^2$.
A real scattering foil may be thousands or tens of thousands of atoms thick.
Let t be the thickness of the foil and A its area, and let ρ and M be the density and
molar mass of the material of which the foil is made. The volume of the foil is
then At, its mass is ρAt, the number of moles is ρAt/M, and the number of atoms
or nuclei per unit volume is
$$n = N_A\,\frac{\rho A t}{M}\,\frac{1}{At} = \frac{N_A \rho}{M} \tag{6.9}$$
where $N_A$ is Avogadro’s number (the number of atoms per mole). As seen by an
incident projectile, the number of nuclei per unit area is $nt = N_A\rho t/M$; that is, on
the average, each nucleus contributes an area $(N_A\rho t/M)^{-1}$ to the field of view
of the projectile. For scattering at angles greater than θ, it must once again be
true that the projectile must fall within an area $\pi b^2$ of the center of an atom; the
fraction scattered at angles greater than θ is just the fraction that approaches an
atom within the area $\pi b^2$:
$$f_{<b} = f_{>\theta} = nt\pi b^2 \tag{6.10}$$
assuming that the incident particles are spread uniformly over the area of the foil.
* See, for example, R. M. Eisberg and R. Resnick, Quantum Physics of Atoms, Molecules, Solids,
Nuclei, and Particles, 2nd ed. (New York, Wiley, 1985).
FIGURE 6.6 Scattering geometry for many atoms. For impact parameter b, the scattering angle is θ. If the particle enters the atom within the disc of area $\pi b^2$, its scattering angle will be larger than θ.
Example 6.2
A gold foil (ρ = 19.3 g/cm$^3$, M = 197 g/mole) has a thickness of $2.0 \times 10^{-4}$ cm. It is used to scatter alpha particles
of kinetic energy 8.0 MeV. (a) What fraction of the alpha
particles is scattered at angles greater than 90°? (b) What
fraction of the alpha particles is scattered at angles between
90° and 45°?
Solution
(a) For this case the number of nuclei per unit volume can
be computed as
$$n = \frac{N_A \rho}{M} = \frac{(6.02 \times 10^{23}\ \text{atoms/mole})(19.3\ \text{g/cm}^3)}{(197\ \text{g/mole})(1\ \text{m}/10^2\ \text{cm})^3} = 5.9 \times 10^{28}\ \text{m}^{-3}$$
For scattering at 90°, the impact parameter b can be found
from Eq. 6.8:
$$b = \frac{zZ}{2K}\,\frac{e^2}{4\pi\varepsilon_0}\cot\tfrac{1}{2}\theta = \frac{(2)(79)}{2(8.0\ \text{MeV})}(1.44\ \text{MeV}\cdot\text{fm})\cot 45^\circ = 14\ \text{fm} = 1.4 \times 10^{-14}\ \text{m}$$
and using Eq. 6.10 we then have
$$f_{>90^\circ} = nt\pi b^2 = (5.9 \times 10^{28}\ \text{m}^{-3})(2.0 \times 10^{-6}\ \text{m})\,\pi\,(1.4 \times 10^{-14}\ \text{m})^2 = 7.5 \times 10^{-5}$$
(b) Repeating the calculation for θ = 45°, we find
$$b = \frac{zZ}{2K}\,\frac{e^2}{4\pi\varepsilon_0}\cot\tfrac{1}{2}\theta = \frac{(2)(79)}{2(8.0\ \text{MeV})}(1.44\ \text{MeV}\cdot\text{fm})\cot 22.5^\circ = 34\ \text{fm} = 3.4 \times 10^{-14}\ \text{m}$$
$$f_{>45^\circ} = nt\pi b^2 = (5.9 \times 10^{28}\ \text{m}^{-3})(2.0 \times 10^{-6}\ \text{m})\,\pi\,(3.4 \times 10^{-14}\ \text{m})^2 = 4.4 \times 10^{-4}$$
If a total fraction of $4.4 \times 10^{-4}$ is scattered at angles
greater than 45°, and of that, $7.5 \times 10^{-5}$ is scattered at
angles greater than 90°, the fraction scattered between 45°
and 90° must be
$$4.4 \times 10^{-4} - 7.5 \times 10^{-5} = 3.6 \times 10^{-4}$$
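The steps of Example 6.2 combine Eqs. 6.8, 6.9, and 6.10, and they are easy to script. The following is a minimal sketch with illustrative function names, using the same input values as the example; intermediate values are not rounded, so the last digit can differ slightly from a hand calculation.

```python
import math

AVOGADRO = 6.02e23        # atoms per mole
E2_OVER_4PI_EPS0 = 1.44   # MeV*fm

def number_density_per_m3(density_g_cm3, molar_mass_g):
    """Eq. 6.9: n = N_A * rho / M, converted from cm^-3 to m^-3."""
    return AVOGADRO * density_g_cm3 / molar_mass_g * 1e6

def impact_parameter_m(theta_deg, z, Z, K_MeV):
    """Eq. 6.8, returned in meters (1 fm = 1e-15 m)."""
    b_fm = (z * Z / (2.0 * K_MeV)) * E2_OVER_4PI_EPS0 / math.tan(math.radians(theta_deg) / 2.0)
    return b_fm * 1e-15

def fraction_beyond(theta_deg, n_per_m3, thickness_m, z, Z, K_MeV):
    """Eq. 6.10: fraction scattered at angles greater than theta."""
    b = impact_parameter_m(theta_deg, z, Z, K_MeV)
    return n_per_m3 * thickness_m * math.pi * b**2

n = number_density_per_m3(19.3, 197.0)   # gold
t = 2.0e-6                               # foil thickness, m (2.0e-4 cm)
f_90 = fraction_beyond(90.0, n, t, z=2, Z=79, K_MeV=8.0)
f_45 = fraction_beyond(45.0, n, t, z=2, Z=79, K_MeV=8.0)

print(f"n = {n:.2e} m^-3")
print(f"f(>90 deg) = {f_90:.2e}")
print(f"f(>45 deg) = {f_45:.2e}")
print(f"f(45 to 90 deg) = {f_45 - f_90:.2e}")
```

Changing the foil thickness or the kinetic energy in the calls above shows directly how the scattered fraction scales with t and with $K^{-2}$.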
2. The Rutherford Scattering Formula and Its Experimental Verification.
In order to find the probability that a projectile will be scattered into a small
angular range at θ (between θ and θ + dθ), we require that the impact parameter
lie within a small range of values db at b (see Figure 6.7). The fraction, df, is then
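$$df = nt(2\pi b\, db) \tag{6.11}$$
from Eq. 6.10. Differentiating Eq. 6.8 we find db in terms of dθ:
$$db = \frac{zZ}{2K}\,\frac{e^2}{4\pi\varepsilon_0}\left(-\csc^2\tfrac{1}{2}\theta\right)\left(\tfrac{1}{2}\,d\theta\right) \tag{6.12}$$
and so
$$|df| = \pi nt\left(\frac{zZ}{2K}\right)^2\left(\frac{e^2}{4\pi\varepsilon_0}\right)^2\csc^2\tfrac{1}{2}\theta\,\cot\tfrac{1}{2}\theta\; d\theta \tag{6.13}$$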
(This minus sign in Eq. 6.12 is not important—it just tells us that θ increases
as b decreases.) Suppose we place a detector for the scattered projectiles at the
angle θ a distance r from the nucleus. The probability for a projectile to be
scattered into the detector depends on df , which gives the probability for scattered
particles to pass through the ring of radius r sin θ and width r dθ. The area of the
ring is dA = (2π r sin θ )r dθ. In order to calculate the rate at which projectiles
are scattered into the detector we must know the probability per unit area for
scattering into the ring. This is |df |/dA, which we call N(θ ), and, after some
manipulation, we find:
$$N(\theta) = \frac{nt}{4r^2}\left(\frac{zZ}{2K}\right)^2\left(\frac{e^2}{4\pi\varepsilon_0}\right)^2\frac{1}{\sin^4\tfrac{1}{2}\theta} \tag{6.14}$$
This is the Rutherford scattering formula.
FIGURE 6.7 Particles entering the ring between b and b + db are distributed uniformly along a ring of angular width dθ. A detector is at a distance r from the scattering foil.
FIGURE 6.8 Schematic diagram of alpha-particle scattering experiment. A radioactive source of alpha particles is in a shield with a small hole. Alpha particles strike the foil F and are scattered into the angular range dθ. Each time a scattered particle strikes the screen S a flash of light is emitted and observed with the movable microscope M.
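The "manipulation" leading to Eq. 6.14 can be checked numerically. The sketch below evaluates the closed form and compares it with |df|/dA computed directly from Eq. 6.10 by a finite difference in θ, divided by the ring area. The foil and beam values are taken from Example 6.2; the detector distance r = 0.10 m, the step size, and all function names are assumptions made only for illustration.

```python
import math

E2_OVER_4PI_EPS0 = 1.44e-15  # MeV*m (1.44 MeV*fm)

def coulomb_length_m(z, Z, K_MeV):
    """(zZ/2K)(e^2 / 4 pi eps0): the prefactor of cot(theta/2) in Eq. 6.8."""
    return (z * Z / (2.0 * K_MeV)) * E2_OVER_4PI_EPS0

def rutherford_N(theta_deg, n, t, r, z, Z, K_MeV):
    """Eq. 6.14: scattering probability per unit detector area at angle theta."""
    s = math.sin(math.radians(theta_deg) / 2.0)
    return n * t * coulomb_length_m(z, Z, K_MeV) ** 2 / (4.0 * r**2 * s**4)

def ring_N(theta_deg, n, t, r, z, Z, K_MeV, dtheta_deg=1e-4):
    """|df|/dA from Eq. 6.10 and the ring area, using a finite difference in theta."""
    c = coulomb_length_m(z, Z, K_MeV)

    def fraction_beyond(angle_deg):
        b = c / math.tan(math.radians(angle_deg) / 2.0)   # Eq. 6.8
        return n * t * math.pi * b**2                     # Eq. 6.10

    df = fraction_beyond(theta_deg - dtheta_deg / 2) - fraction_beyond(theta_deg + dtheta_deg / 2)
    theta = math.radians(theta_deg)
    dA = 2.0 * math.pi * r * math.sin(theta) * r * math.radians(dtheta_deg)
    return df / dA

# Gold foil and 8.0-MeV alpha particles as in Example 6.2; r is an assumed distance.
foil = dict(n=5.9e28, t=2.0e-6, r=0.10, z=2, Z=79, K_MeV=8.0)
closed_form = rutherford_N(25.0, **foil)
numerical = ring_N(25.0, **foil)
ratio_5_to_150 = rutherford_N(5.0, **foil) / rutherford_N(150.0, **foil)

print(f"N(25 deg), Eq. 6.14:      {closed_form:.4e} m^-2")
print(f"N(25 deg), from |df|/dA:  {numerical:.4e} m^-2")
print(f"N(5 deg) / N(150 deg) =   {ratio_5_to_150:.2e}")
```

The two values of N(25°) agree closely, and the ratio between 5° and 150° is a few times $10^5$, which illustrates how steeply the $\sin^{-4}\tfrac{1}{2}\theta$ factor falls with angle.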
In Rutherford’s laboratory, Hans Geiger and Ernest Marsden tested the predictions of this formula in a remarkable series of experiments involving the scattering
of alpha particles (z = 2) from a variety of thin metal foils. In those days before
electronic recording and processing equipment was available, Geiger and Marsden
observed and recorded the alpha particles by counting the scintillations (flashes of
light) produced when the alpha particles struck a zinc sulfide screen. A schematic
view of their apparatus is shown in Figure 6.8. In all, four predictions of the
Rutherford scattering formula were tested:
(a) N(θ) ∝ t. With a source of 8-MeV alpha particles from radioactive
decay, Geiger and Marsden used scattering foils of varying thicknesses t while
keeping the scattering angle θ fixed at about 25°. Their results are summarized
in Figure 6.9, and the linear dependence of N(θ) on t is apparent. This is
also evidence that, even at this moderate scattering angle, single scattering is
much more important than multiple scattering. (In a random statistical theory
of multiple scattering, the probability for scattering at a large angle would
be proportional to the square root of the number of single scatterings, and
we would expect $N(\theta) \propto t^{1/2}$. Figure 6.9 shows clearly that this is not true.)
Chapter 6 | The Rutherford-Bohr Model of the Atom
FIGURE 6.9 The dependence of scattering rate on foil thickness for three different scattering foils. (Horizontal axis: foil thickness.)
This result emphasizes a significant difference between scattering by a Thomson
model atom and a Rutherford nuclear atom: In the Thomson model, the projectile
is scattered by every atom along its path as it passes through the foil (see
Figure 6.3), while in the Rutherford nuclear model the nucleus is so tiny that
the chance of even a single significant encounter is small and the chance of
encountering more than one nucleus is negligible.
(b) N(θ) ∝ Z². In this experiment, Geiger and Marsden used a variety
of different scattering materials, of approximately (but not exactly) the same
thickness. This proportionality is therefore much more difficult to test than the
previous one, since it involves the comparison of different thicknesses of different
materials. However, as shown in Figure 6.10, the results are consistent with the
proportionality of N(θ) to Z².
FIGURE 6.10 The dependence of scattering rate on the nuclear charge Z for foils of different materials (Al, Cu, Ag, Au). The data are plotted against Z². (Axes: number scattered versus Z², from 0 to 6000.)
(c) N(θ) ∝ K⁻². In order to test this prediction of the Rutherford scattering
formula, Geiger and Marsden kept the thickness of the scattering foil constant and
varied the speed of the alpha particles. They accomplished this by slowing down
the alpha particles emitted from the radioactive source by passing them through
thin sheets of mica. From independent measurements they knew the effect of
different thicknesses of mica on the velocity of the alpha particles. The results of
the experiment are shown in Figure 6.11; once again we see excellent agreement
with the expected relationship.
FIGURE 6.11 The dependence of scattering rate on the kinetic energy of the incident alpha particles for scattering by a single foil. The slope of −2 on the log-log scale shows that N ∝ K⁻², as expected from the Rutherford formula. (Axes: number scattered versus relative kinetic energy of alpha particles.)
(d) N(θ) ∝ sin⁻⁴(θ/2). This dependence of N on θ is perhaps the most important
and distinctive feature of the Rutherford scattering formula. It also produces the
largest variation in N over the range accessible by experiment. In the tests
discussed so far, N varied by perhaps an order of magnitude; in this case N varies
by about five orders of magnitude from the smaller to the larger angles. Geiger and
Marsden used a gold foil and varied θ from 5° to 150°, to obtain the relationship
between N and θ plotted in Figure 6.12. The agreement with the Rutherford
formula is again very good.
Thus all predictions of the Rutherford scattering formula were confirmed by
experiment, and the “nuclear atom” was verified.
3. The Closest Approach of a Projectile to the Nucleus. A positively charged
projectile slows down as it approaches a nucleus, exchanging part of its initial
kinetic energy for the electrostatic potential energy due to the nuclear repulsion.
The closer the projectile gets to the nucleus, the more potential energy it gains,
because
$$U = \frac{1}{4\pi\varepsilon_0}\frac{q_1 q_2}{r} = \frac{1}{4\pi\varepsilon_0}\frac{zZe^2}{r} \tag{6.15}$$
The maximum potential energy, and thus the minimum kinetic energy, occurs at
the minimum value of r. We assume that U = 0 when the projectile is far from the
nucleus, where it has total energy E = K = ½mv². As the projectile approaches
the nucleus, K decreases and U increases, but U + K remains constant. At the
distance r_min, the speed is v_min and (see Figure 6.13):
$$E = \frac{1}{2}mv_{\min}^2 + \frac{1}{4\pi\varepsilon_0}\frac{zZe^2}{r_{\min}} = \frac{1}{2}mv^2 \tag{6.16}$$
6.3 | The Rutherford Nuclear Atom
FIGURE 6.12 The dependence of scattering rate on the scattering angle θ, using a gold foil. The sin⁻⁴(θ/2) dependence is exactly as predicted by the Rutherford formula. (Axes: number scattered, on a logarithmic scale from 10 to 10⁷, versus scattering angle from 0° to 160°.)
FIGURE 6.13 Closest approach of the projectile to the nucleus. Far from the nucleus, U = 0, K = ½mv², and L = mvb. At r_min, U = (1/4πε₀)(zZe²/r_min), K = ½mv_min², and L = mv_min r_min. For b = 0, the projectile comes to rest (K = 0) at the distance d, where U = (1/4πε₀)(zZe²/d).
Angular momentum is also conserved. Far from the nucleus, the angular
momentum L is mvb, and at r_min the angular momentum is mv_min r_min, so
$$mvb = mv_{\min} r_{\min} \tag{6.17}$$
which gives v_min = bv/r_min. Substituting this result into Eq. 6.16, we find
$$\frac{1}{2}mv^2 = \frac{1}{2}m\frac{b^2v^2}{r_{\min}^2} + \frac{1}{4\pi\varepsilon_0}\frac{zZe^2}{r_{\min}} \tag{6.18}$$
This expression can be solved for the value of r_min.
Notice that the kinetic energy of the projectile is not zero at r_min unless b = 0.
(See Figure 6.13.) For b = 0, the projectile loses all of its kinetic energy
and thus gets closest to the nucleus. At this point its distance from the nucleus is
d, the distance of closest approach. We find this distance by solving Eq. 6.18 for
r_min when b = 0, and obtain
$$d = \frac{1}{4\pi\varepsilon_0}\frac{zZe^2}{K} \tag{6.19}$$
Example 6.3
Find the distance of closest approach of an 8.0-MeV alpha
particle incident on a gold foil.
Solution
$$d = \frac{zZe^2}{4\pi\varepsilon_0}\frac{1}{K} = (2)(79)(1.44\ \mathrm{MeV\cdot fm})\frac{1}{8.0\ \mathrm{MeV}} = 28\ \mathrm{fm}$$
Although a distance of 28 fm is very small (much less than an atomic radius,
for example) it is larger than the nuclear radius of gold (about 7 fm). Thus the
projectile is always outside of the nuclear charge distribution, and the Rutherford
scattering law, which was derived assuming the projectile to remain outside the
nucleus, correctly describes the scattering. If we increase the kinetic energy of
the projectile, or decrease the electrostatic repulsion by using a target nucleus
with low Z, this may not be the case. Under certain circumstances, the distance
of closest approach can be less than the nuclear radius. When this happens, the
projectile no longer feels the full nuclear charge, and the Rutherford scattering law
no longer holds. In fact, as we discuss in Chapter 12, this gives us a convenient
way of measuring the size of the nucleus.
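The closest-approach calculation can be sketched numerically. The short Python example below is only an illustration: the function names are our own, and it takes e²/4πε₀ = 1.44 MeV·fm from Example 6.3. Dividing Eq. 6.18 by K = ½mv² and using Eq. 6.19 turns it into the quadratic r_min² − d·r_min − b² = 0, whose positive root gives r_min for any impact parameter b.

```python
import math

# e^2 / (4*pi*eps0) in MeV*fm, the value used in Example 6.3
COULOMB_MEV_FM = 1.44


def closest_approach_head_on(z, Z, kinetic_mev):
    """Distance of closest approach d (in fm) for b = 0, from Eq. 6.19."""
    return z * Z * COULOMB_MEV_FM / kinetic_mev


def r_min(z, Z, kinetic_mev, b_fm):
    """Minimum projectile-nucleus distance (in fm) for impact parameter b.

    Dividing Eq. 6.18 by K and substituting d from Eq. 6.19 gives
    r^2 - d*r - b^2 = 0; the positive root is returned.
    """
    d = closest_approach_head_on(z, Z, kinetic_mev)
    return 0.5 * (d + math.sqrt(d * d + 4.0 * b_fm * b_fm))


if __name__ == "__main__":
    # 8.0-MeV alpha particle (z = 2) on gold (Z = 79), as in Example 6.3
    print(f"d = {closest_approach_head_on(2, 79, 8.0):.1f} fm")
    for b in (0.0, 10.0, 50.0):
        print(f"b = {b:5.1f} fm -> r_min = {r_min(2, 79, 8.0, b):.1f} fm")
```

For b = 0 the root reduces to d, and r_min grows with b, consistent with the discussion of Figure 6.13.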
6.4 LINE SPECTRA
The radiation from atoms can be classified into continuous spectra and discrete
or line spectra. In a continuous spectrum, all wavelengths from some minimum,
perhaps 0, to some maximum, perhaps approaching ∞, are emitted. The radiation
from a hot, glowing object is an example of this category. White light is a mixture
of all of the different colors of visible light; an object that glows white hot is
emitting light at all wavelengths of the visible spectrum. If, on the other hand,
we force an electric discharge in a tube containing a small amount of the gas or
vapor of a certain element, such as mercury, sodium, or neon, light is emitted at a
few discrete wavelengths and not at any others. Examples of such emission “line”
spectra are shown in Figure 6.14. The strong 436 nm (blue) and 546 nm (green)
lines in the mercury emission spectrum give mercury-vapor street lights their
blue-green tint; the strong yellow line at 590 nm in the sodium spectrum (which
is actually a doublet —two very closely spaced lines) gives sodium-vapor street
lights a softer, yellowish color. The intense red lines of neon are responsible for
the red color of “neon signs.”
Another possible experiment is to pass a beam of white light, containing all
wavelengths, through a sample of a gas. When we do so, we find that certain
wavelengths have been absorbed from the light, and again a line spectrum results.
In this case there are dark lines, superimposed on the bright continuous spectrum,
at the wavelengths where the absorption occurred. These wavelengths correspond
to many (but not all) of the wavelengths seen in the emission spectrum. Examples
of absorption spectra are shown in Figure 6.15.
FIGURE 6.14 Apparatus for observing emission spectra. Light is emitted when an electric discharge is created in a tube
containing a vapor of an element. The light passes through a dispersive medium, such as a prism or a diffraction grating,
which displays the individual component wavelengths at different positions. Sample line spectra are shown for mercury
and sodium in the visible and near ultraviolet. (Wavelength axis: 200 nm to 700 nm.)
FIGURE 6.15 Apparatus for observing absorption spectra. A light source produces a continuous range of wavelengths,
some of which are absorbed by a gaseous element. The light is dispersed, as in Figure 6.14. The result is a continuous
“rainbow” spectrum, with dark lines at wavelengths where the light was absorbed by the gas. (Wavelength axis: 200 nm to 700 nm.)
In general, the interpretation of line spectra is very difficult in complex atoms,
and so we will deal for now with the line spectra of the simplest atom, hydrogen.
Regularities appear in both the emission and absorption spectra, as shown in
Figure 6.16. Notice that, as with the mercury and sodium spectra, some lines
present in the emission spectrum are missing from the absorption spectrum.
[Figure 6.16 panels, each with its own wavelength scale: Lyman (ultraviolet, scale marked 90 to 120 nm), absorption and emission; Balmer (visible, scale marked 400 to 600 nm), emission only; Paschen (infrared, scale marked 0.5 to 2.0 μm), emission only; Brackett (infrared, scale marked 1.0 to 4.0 μm), emission only; Pfund (infrared, scale marked 2.0 to 8.0 μm), emission only.]
FIGURE 6.16 Emission and absorption spectral series of hydrogen. Note the
regularities in the spacing of the spectral lines. The lines get closer together as the
limit of each series (dashed line) is approached. Only the Lyman series appears
in the absorption spectrum; all series are present in the emission spectrum.
In 1885 Johannes Balmer, a Swiss schoolteacher, noticed (mostly by trial and
error) that the wavelengths of the group of emission lines of hydrogen in the
visible region could be calculated very accurately from the formula
$$\lambda = (364.5\ \mathrm{nm})\,\frac{n^2}{n^2 - 4} \qquad (n = 3, 4, 5, \ldots) \tag{6.20}$$
For example, for n = 3, the formula gives λ = 656.1 nm, which corresponds
exactly to the longest wavelength of the series of hydrogen lines in the visible
region (see Figure 6.16). This formula is now known as the Balmer formula and
the series of lines that it fits is called the Balmer series. The wavelength 364.5 nm,
corresponding to n → ∞, is called the series limit (which is shown as the dashed
line at the left end of the Balmer series in Figure 6.16).
It was soon discovered that all of the groupings of lines in the hydrogen
spectrum could be fit with a similar formula of the form
$$\lambda = \lambda_{\mathrm{limit}}\,\frac{n^2}{n^2 - n_0^2} \qquad (n = n_0 + 1,\ n_0 + 2,\ n_0 + 3, \ldots) \tag{6.21}$$
where λlimit is the wavelength of the appropriate series limit. For the Balmer series,
n0 = 2. The other series are today known as Lyman (n0 = 1), Paschen (n0 = 3),
Brackett (n0 = 4), and Pfund (n0 = 5). These series of hydrogen spectral lines are
shown in Figure 6.16.
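Equation 6.21 is easy to evaluate for any series once its limit is known. The following Python sketch is illustrative only: the function name is our own, and the series limits are the values quoted in this section (91.13 nm for Lyman, 364.5 nm for Balmer, 820.1 nm for Paschen).

```python
def series_wavelengths(limit_nm, n0, count=3):
    """Return the `count` longest wavelengths (in nm) of a series, using Eq. 6.21.

    The longest wavelength corresponds to n = n0 + 1, the next to n0 + 2, etc.
    """
    return [
        limit_nm * n**2 / (n**2 - n0**2)
        for n in range(n0 + 1, n0 + 1 + count)
    ]


if __name__ == "__main__":
    series = [("Lyman", 91.13, 1), ("Balmer", 364.5, 2), ("Paschen", 820.1, 3)]
    for name, limit_nm, n0 in series:
        values = ", ".join(f"{w:.1f}" for w in series_wavelengths(limit_nm, n0))
        print(f"{name:8s} (n0 = {n0}): {values} nm")
```

As n grows, the factor n²/(n² − n0²) approaches 1, so the computed wavelengths crowd together toward the series limit, as in Figure 6.16.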
Another interesting property of the hydrogen wavelengths is summarized in
the Ritz combination principle. If we convert the hydrogen emission wavelengths
to frequencies, we find the curious property that certain pairs of frequencies added
together give other frequencies that appear in the spectrum.
Any successful model of the hydrogen atom must be able to explain the
occurrence of these interesting arithmetic regularities in the emission spectra.
Example 6.4
The series limit of the Paschen series (n0 = 3) is 820.1 nm.
What are the three longest wavelengths of the Paschen
series?
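Solution
From Eq. 6.21,
$$\lambda = (820.1\ \mathrm{nm})\,\frac{n^2}{n^2 - 3^2} \qquad (n = 4, 5, 6, \ldots)$$
The three longest wavelengths are:
$$n = 4:\quad \lambda = (820.1\ \mathrm{nm})\,\frac{4^2}{4^2 - 3^2} = 1875\ \mathrm{nm}$$
$$n = 5:\quad \lambda = (820.1\ \mathrm{nm})\,\frac{5^2}{5^2 - 3^2} = 1281\ \mathrm{nm}$$
$$n = 6:\quad \lambda = (820.1\ \mathrm{nm})\,\frac{6^2}{6^2 - 3^2} = 1094\ \mathrm{nm}$$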
These transitions are in the infrared region of the electromagnetic spectrum.
Example 6.5
Show that the longest wavelength of the Balmer series
and the longest two wavelengths of the Lyman series satisfy the Ritz combination principle. For the Lyman series,
λlimit = 91.13 nm.
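Solution
Using Eq. 6.20 with n = 3, we find the longest wavelength
of the Balmer series to be 656.1 nm. Converting this to a
frequency, we obtain
$$f = \frac{c}{\lambda} = \frac{2.998 \times 10^8\ \mathrm{m/s}}{(656.1\ \mathrm{nm})(10^{-9}\ \mathrm{m/nm})} = 4.57 \times 10^{14}\ \mathrm{Hz}$$
Using Eq. 6.21 for n = 2 and 3 with n0 = 1, we find the
longest two wavelengths of the Lyman series and their
corresponding frequencies to be
$$n = 2:\quad \lambda = (91.13\ \mathrm{nm})\,\frac{2^2}{2^2 - 1^2} = 121.5\ \mathrm{nm}$$
$$f = \frac{c}{\lambda} = \frac{2.998 \times 10^8\ \mathrm{m/s}}{(121.5\ \mathrm{nm})(10^{-9}\ \mathrm{m/nm})} = 24.67 \times 10^{14}\ \mathrm{Hz}$$
$$n = 3:\quad \lambda = (91.13\ \mathrm{nm})\,\frac{3^2}{3^2 - 1^2} = 102.5\ \mathrm{nm}$$
$$f = \frac{c}{\lambda} = \frac{2.998 \times 10^8\ \mathrm{m/s}}{(102.5\ \mathrm{nm})(10^{-9}\ \mathrm{m/nm})} = 29.24 \times 10^{14}\ \mathrm{Hz}$$
Adding the smallest frequency of the Lyman series to the
smallest frequency of the Balmer series gives the next
smallest Lyman frequency:
$$24.67 \times 10^{14}\ \mathrm{Hz} + 4.57 \times 10^{14}\ \mathrm{Hz} = 29.24 \times 10^{14}\ \mathrm{Hz}$$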
demonstrating the Ritz combination principle.
6.5 THE BOHR MODEL
Following Rutherford’s proposal that the mass and positive charge are concentrated in a very small region at the center of the atom, the Danish physicist Niels
Bohr in 1913 (while working in Rutherford’s laboratory) suggested that the atom
resembled a miniature planetary system, with the electrons circulating about the
nucleus like planets circulating about the Sun. The atom thus doesn’t collapse
under the influence of the electrostatic Coulomb force of the nucleus on the electrons for the same reason that the solar system doesn’t collapse under the influence
of the gravitational force of the Sun on the planets. In both cases, the attractive force
provides the centripetal acceleration necessary to maintain the orbital motion.
As we discuss later, the Bohr model does not give a correct view of the
actual structure and properties of atoms, but it represents an important first step
in achieving an understanding of atoms. The correct view requires methods of
quantum mechanics, which we discuss in Chapter 7.
We consider for simplicity the hydrogen atom, with a single electron
circulating about a nucleus that has a single positive charge, as in Figure 6.17.
The radius of the circular orbit is r, and the electron (of mass m) moves with
constant tangential speed v. The attractive Coulomb force provides the centripetal
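acceleration v²/r, so
$$F = \frac{1}{4\pi\varepsilon_0}\frac{|q_1||q_2|}{r^2} = \frac{1}{4\pi\varepsilon_0}\frac{e^2}{r^2} = \frac{mv^2}{r} \tag{6.22}$$
FIGURE 6.17 The Bohr model of the atom (Z = 1 for hydrogen). The electron, of charge −e, moves with speed v in a circular orbit of radius r about the nucleus of charge +Ze, which exerts the force F on it.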
Manipulating this equation, we can find the kinetic energy of the electron (we are
assuming the more massive nucleus to remain at rest—more about this later):
$$K = \frac{1}{2}mv^2 = \frac{1}{8\pi\varepsilon_0}\frac{e^2}{r} \tag{6.23}$$
The potential energy of the electron-nucleus system is the Coulomb potential
energy:
$$U = \frac{1}{4\pi\varepsilon_0}\frac{q_1 q_2}{r} = -\frac{1}{4\pi\varepsilon_0}\frac{e^2}{r} \tag{6.24}$$
The total energy E = K + U is obtained by adding Eqs. 6.23 and 6.24:
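$$E = K + U = \frac{1}{8\pi\varepsilon_0}\frac{e^2}{r} + \left(-\frac{1}{4\pi\varepsilon_0}\frac{e^2}{r}\right) = -\frac{1}{8\pi\varepsilon_0}\frac{e^2}{r} \tag{6.25}$$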
We have ignored one serious difficulty with this model thus far. Classical
physics requires that an accelerated electric charge, such as our orbiting electron,
must continuously radiate electromagnetic energy. As it radiates this energy, its
total energy would decrease, the electron would spiral in toward the nucleus,
and the atom would collapse. To overcome this difficulty, Bohr made a bold
and daring hypothesis—he proposed that there are certain special states of
motion, called stationary states, in which the electron may exist without radiating
electromagnetic energy. In these states, according to Bohr, the angular momentum
L of the electron takes values that are integer multiples of ħ. In stationary states,
the angular momentum of the electron may have magnitude ħ, 2ħ, 3ħ, . . ., but
never such values as 2.5ħ or 3.1ħ. This is called the quantization of angular
momentum.
In a circular orbit, the position vector r that locates the electron relative
to the nucleus is always perpendicular to its linear momentum p. The angular
momentum, which is defined as L = r × p, has magnitude L = rp = mvr when
r is perpendicular to p. Thus Bohr’s postulate is
$$mvr = n\hbar \tag{6.26}$$
where n is an integer (n = 1, 2, 3, . . .). We can use this expression with Eq. 6.23
for the kinetic energy
$$\frac{1}{2}mv^2 = \frac{1}{2}m\left(\frac{n\hbar}{mr}\right)^2 = \frac{1}{8\pi\varepsilon_0}\frac{e^2}{r} \tag{6.27}$$
to find a series of allowed values of the radius r:
$$r_n = \frac{4\pi\varepsilon_0\hbar^2}{me^2}\,n^2 = a_0 n^2 \qquad (n = 1, 2, 3, \ldots) \tag{6.28}$$
where the Bohr radius a0 is defined as
$$a_0 = \frac{4\pi\varepsilon_0\hbar^2}{me^2} = 0.0529\ \mathrm{nm} \tag{6.29}$$
Niels Bohr (1885–1962, Denmark). He developed a successful theory
of the radiation spectrum of atomic hydrogen and also contributed the
concepts of stationary states and complementarity to quantum mechanics.
Later he developed a successful theory of nuclear fission. The institute of
theoretical physics he founded in Copenhagen attracts scholars from around
the world.
This important result is very different from what we expect from classical
physics. A satellite may be placed into Earth orbit at any desired radius by
boosting it to the appropriate altitude and then supplying the proper tangential
speed. This is not true for an electron’s orbit—only certain radii are allowed by
the Bohr model. The radius of the electron’s orbit may be a0 , 4a0 , 9a0 , 16a0 , and
so forth, but never 3a0 or 5.3a0 .
Substituting Eq. 6.28 for r into Eq. 6.25 gives the energy:
$$E_n = -\frac{me^4}{32\pi^2\varepsilon_0^2\hbar^2}\,\frac{1}{n^2} = \frac{-13.60\ \mathrm{eV}}{n^2} \qquad (n = 1, 2, 3, \ldots) \tag{6.30}$$
The energy levels calculated from Eq. 6.30 are shown in Figure 6.18. The
electron’s energy is quantized —only certain energy values are possible. In its
lowest level, with n = 1, the electron has energy E1 = −13.60 eV and orbits with
a radius of r1 = 0.0529 nm. This state is the ground state. The higher states (n = 2
with E2 = −3.40 eV, n = 3 with E3 = −1.51 eV, etc.) are the excited states.
The excitation energy of an excited state n is the energy above the ground
state, En − E1 . Thus the first excited state (n = 2) has excitation energy
ΔE = E2 − E1 = −3.40 eV − (−13.60 eV) = 10.20 eV
the second excited state has excitation energy
ΔE = E3 − E1 = −1.51 eV − (−13.60 eV) = 12.09 eV
and so forth. The excitation energy can also be regarded as the amount of energy
that the atom must absorb for the electron to make an upward jump. For example,
if the atom absorbs an energy of 10.20 eV when the electron is in the ground state
(n = 1), the electron will jump upward to the first excited state (n = 2).
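The numerical values in Eqs. 6.29 and 6.30 can be checked directly from the fundamental constants. The Python sketch below is only an illustration: the function names are our own, and the constants are standard SI (CODATA) values supplied here as an assumption, since the text does not list them.

```python
import math

# SI values of the fundamental constants (assumed here; not quoted in the text)
M_E = 9.1093837e-31        # electron mass, kg
E_CHARGE = 1.602176634e-19  # elementary charge, C
EPS0 = 8.8541878128e-12    # permittivity of free space, F/m
HBAR = 1.054571817e-34     # reduced Planck constant, J*s


def bohr_radius_m():
    """Bohr radius a0 in meters, from Eq. 6.29."""
    return 4.0 * math.pi * EPS0 * HBAR**2 / (M_E * E_CHARGE**2)


def orbit_radius_m(n):
    """Radius of the nth Bohr orbit in meters, from Eq. 6.28."""
    return bohr_radius_m() * n**2


def energy_ev(n):
    """Energy of level n in eV, from Eq. 6.30."""
    joules = -M_E * E_CHARGE**4 / (32.0 * math.pi**2 * EPS0**2 * HBAR**2 * n**2)
    return joules / E_CHARGE  # convert J to eV


def excitation_energy_ev(n):
    """Energy of level n above the ground state, E_n - E_1, in eV."""
    return energy_ev(n) - energy_ev(1)


if __name__ == "__main__":
    print(f"a0 = {bohr_radius_m() * 1e9:.4f} nm")
    for n in range(1, 5):
        print(f"n = {n}: r = {orbit_radius_m(n) * 1e9:.4f} nm, "
              f"E = {energy_ev(n):.2f} eV, "
              f"excitation = {excitation_energy_ev(n):.2f} eV")
```

The radii scale as n² and the energies as 1/n², so the levels crowd together toward E = 0 as n increases.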
The magnitude of an electron’s energy |En | is sometimes called its binding
energy; for example, the binding energy of an electron in the n = 2 state is
3.40 eV. If the atom absorbs an amount of energy equal to the binding energy
of the electron, the electron will be removed from the atom and become a
free electron. The atom, minus its electron, is called an ion. The amount of
energy needed to remove an electron from an atom is also called the ionization
energy. Usually the ionization energy of an atom indicates the energy to remove
an electron from the ground state. If the atom absorbs more energy than the
minimum necessary to remove the electron, the excess energy appears as the
kinetic energy of the now free electron.
The binding energy can also be regarded as the energy that is released when the
atom is assembled from an electron and a nucleus that are initially separated by a
large distance. If we bring an electron from a large distance away (where E = 0)
and place it in orbit in the state n where its energy has the negative value En ,
energy amounting to |En | is released, usually in the form of one or more photons.
The Hydrogen Wavelengths in the Bohr Model
We previously discussed the emission and absorption spectra of atomic hydrogen,
and our discussion of the Bohr model is not complete without an understanding of
the origin of these spectra. Bohr postulated that, even though the electron doesn’t
radiate when it remains in any particular stationary state, it can emit radiation
when it moves to a lower energy level. In the lower level, the electron has less
energy than in the original level, and the energy difference appears as a quantum
of radiation whose energy hf is equal to the energy difference between the levels.
FIGURE 6.18 The energy levels of atomic hydrogen (E1 = −13.60 eV for n = 1, E2 = −3.40 eV for n = 2, E3 = −1.51 eV for n = 3, E4 = −0.85 eV for n = 4, and E∞ = 0 for n = ∞), showing the excitation energy of the electron from n = 1 to n = 2 and the binding energy of the n = 2 electron.
That is, if the electron jumps from n = n1 to n = n2, as in Figure 6.19, a photon
appears with energy
$$hf = E_{n_1} - E_{n_2} \tag{6.31}$$
or, using Eq. 6.30 for the energies,
$$f = \frac{me^4}{64\pi^3\varepsilon_0^2\hbar^3}\left(\frac{1}{n_2^2} - \frac{1}{n_1^2}\right) \tag{6.32}$$
FIGURE 6.19 An electron jumps from the state n1 to the state n2 as a photon is emitted.
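The wavelength of the emitted radiation is
$$\lambda = \frac{c}{f} = \frac{64\pi^3\varepsilon_0^2\hbar^3 c}{me^4}\left(\frac{n_1^2 n_2^2}{n_1^2 - n_2^2}\right) = \frac{1}{R_\infty}\left(\frac{n_1^2 n_2^2}{n_1^2 - n_2^2}\right) \tag{6.33}$$
where R∞ is called the Rydberg constant
$$R_\infty = \frac{me^4}{64\pi^3\varepsilon_0^2\hbar^3 c} \tag{6.34}$$
The presently accepted numerical value is
$$R_\infty = 1.097373 \times 10^7\ \mathrm{m^{-1}}$$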
Example 6.6
Find the wavelengths of the transitions from n1 = 3 to
n2 = 2 and from n1 = 4 to n2 = 2 in atomic hydrogen.
Solution
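For n1 = 3 and n2 = 2, Eq. 6.33 gives
$$\lambda = \frac{1}{R_\infty}\left(\frac{n_1^2 n_2^2}{n_1^2 - n_2^2}\right) = \frac{1}{1.097373 \times 10^7\ \mathrm{m^{-1}}}\left(\frac{3^2\,2^2}{3^2 - 2^2}\right) = 656.1\ \mathrm{nm}$$
and for n1 = 4 and n2 = 2,
$$\lambda = \frac{1}{R_\infty}\left(\frac{n_1^2 n_2^2}{n_1^2 - n_2^2}\right) = \frac{1}{1.097373 \times 10^7\ \mathrm{m^{-1}}}\left(\frac{4^2\,2^2}{4^2 - 2^2}\right) = 486.0\ \mathrm{nm}$$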
These wavelengths are remarkably close to the values of the two longest
wavelengths of the Balmer series (Figure 6.16). In fact, Eq. 6.33 gives
$$\lambda = (364.5\ \mathrm{nm})\,\frac{n_1^2}{n_1^2 - 4}$$
for the wavelength of a transition from any state n1 to n2 = 2. This is identical
with Eq. 6.21 for the Balmer series. Thus we see that the radiations identified as
the Balmer series correspond to transitions from higher levels to the n = 2 level.
Similar identifications can be made for other series of radiations, as shown in
Figure 6.20. This association between the transitions expected according to the
Bohr model and the observed wavelengths (as in Figure 6.16) represents a huge
triumph for the model.
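As a rough numerical check of Eq. 6.33, the following Python sketch computes transition wavelengths and series limits from the value of R∞ quoted above. The function names are our own and serve only as an illustration.

```python
R_INF = 1.097373e7  # Rydberg constant in m^-1, the value quoted in the text


def transition_wavelength_nm(n1, n2):
    """Wavelength (in nm) emitted in a jump from level n1 down to n2, Eq. 6.33."""
    if n1 <= n2:
        raise ValueError("emission requires n1 > n2")
    return 1e9 * (n1**2 * n2**2) / (R_INF * (n1**2 - n2**2))


def series_limit_nm(n2):
    """Series limit (n1 -> infinity) for transitions ending at n2, in nm.

    As n1 grows, n1^2 / (n1^2 - n2^2) tends to 1, leaving n2^2 / R_inf.
    """
    return 1e9 * n2**2 / R_INF


if __name__ == "__main__":
    for n1 in (3, 4, 5, 6):
        print(f"{n1} -> 2: {transition_wavelength_nm(n1, 2):.1f} nm")
    print(f"Balmer limit: {series_limit_nm(2):.1f} nm")
    print(f"Lyman limit:  {series_limit_nm(1):.2f} nm")
```

The Balmer and Lyman limits obtained this way agree with the empirical values 364.5 nm and 91.13 nm used with Eqs. 6.20 and 6.21.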
[Figure 6.20 data. Lyman series (transitions ending at n = 1), photon wavelength and photon energy: 121.5 nm, 10.20 eV; 102.5 nm, 12.09 eV; 97.2 nm, 12.76 eV; 94.9 nm, 13.06 eV; 93.7 nm, 13.23 eV; . . . ; series limit 91.1 nm, 13.60 eV. Balmer series (transitions ending at n = 2): 656.1 nm, 1.89 eV; 486.0 nm, 2.55 eV; 434.0 nm, 2.86 eV; 410.1 nm, 3.02 eV; . . . ; series limit 364.5 nm, 3.40 eV. Energy levels: E1 = −13.60 eV, E2 = −3.40 eV, E3 = −1.51 eV, E4 = −0.85 eV, E5 = −0.54 eV, E6 = −0.38 eV, E∞ = 0.]
FIGURE 6.20 The transitions of the Lyman and Balmer series in hydrogen. The
series limit is shown at the right of each group.
The Bohr formulas also explain the Ritz combination principle, according to
which certain frequencies in the emission spectrum can be summed to give other
frequencies. Let us consider a transition from a state n3 to a state n2 that is followed
by a transition from n2 to n1. Equation 6.32 can be used for this case to give
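$$f_{n_3 \to n_2} = cR_\infty\left(\frac{1}{n_2^2} - \frac{1}{n_3^2}\right)$$
$$f_{n_2 \to n_1} = cR_\infty\left(\frac{1}{n_1^2} - \frac{1}{n_2^2}\right)$$
Thus
$$f_{n_3 \to n_2} + f_{n_2 \to n_1} = cR_\infty\left(\frac{1}{n_2^2} - \frac{1}{n_3^2}\right) + cR_\infty\left(\frac{1}{n_1^2} - \frac{1}{n_2^2}\right) = cR_\infty\left(\frac{1}{n_1^2} - \frac{1}{n_3^2}\right)$$
which is equal to the frequency of the single photon emitted in a direct transition
from n3 to n1, so
$$f_{n_3 \to n_2} + f_{n_2 \to n_1} = f_{n_3 \to n_1} \tag{6.35}$$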
The Bohr model is thus entirely consistent with the Ritz combination principle.
The frequency of an emitted photon is related to its energy by E = hf , so the
summing of frequencies is equivalent to the summing of energies. We may thus
restate the Ritz combination principle in terms of energy: The energy of a photon
emitted in a transition that skips or crosses over one or more states is equal to the
step-by-step sum of the energies of the transitions connecting all of the individual
states. (See Problem 25.)
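Equation 6.35 can also be checked numerically for any cascade. The Python sketch below is illustrative only: the function names are our own, and it uses the values of c and R∞ quoted in this chapter. It tests every triple n3 > n2 > n1 up to a chosen maximum level.

```python
import math
from itertools import combinations

C = 2.998e8         # speed of light in m/s, as used in Example 6.5
R_INF = 1.097373e7  # Rydberg constant in m^-1, as quoted in the text


def frequency_hz(n_upper, n_lower):
    """Frequency of the photon emitted in the jump n_upper -> n_lower (Eq. 6.32)."""
    return C * R_INF * (1.0 / n_lower**2 - 1.0 / n_upper**2)


def ritz_holds(n3, n2, n1, rel_tol=1e-9):
    """True if f(n3->n2) + f(n2->n1) equals f(n3->n1), as in Eq. 6.35."""
    two_step = frequency_hz(n3, n2) + frequency_hz(n2, n1)
    return math.isclose(two_step, frequency_hz(n3, n1), rel_tol=rel_tol)


def check_all(max_level):
    """Check Eq. 6.35 for every triple n3 > n2 > n1 with levels up to max_level."""
    return all(
        ritz_holds(n3, n2, n1)
        for n1, n2, n3 in combinations(range(1, max_level + 1), 3)
    )


if __name__ == "__main__":
    print(f"f(3->2) = {frequency_hz(3, 2):.3e} Hz")
    print(f"f(2->1) = {frequency_hz(2, 1):.3e} Hz")
    print(f"f(3->1) = {frequency_hz(3, 1):.3e} Hz")
    print("Ritz principle holds up to n = 6:", check_all(6))
```

The triple (3, 2, 1) is the case worked by hand in Example 6.5.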
The Bohr model also helps us understand why the atom doesn’t absorb and
emit radiation at all the same wavelengths. Isolated atoms are normally found
only in the ground state; the excited states live for a very short time (less than
10⁻⁹ s) before decaying to the ground state. The absorption spectrum therefore
contains only transitions from the ground state. From Figure 6.20, we see that
only the radiations of the Lyman series can be found in the absorption spectrum of
hydrogen. A hydrogen atom in its ground state can absorb radiation of 10.20 eV
and reach the first excited state, or of 12.09 eV and reach the second excited state,
and so forth. A hydrogen atom cannot absorb a photon of energy 1.89 eV (the first
line of the Balmer series), because the atom is originally not in the n = 2 level.
The Balmer series is therefore not found in the absorption spectrum.
Atoms with Z > 1
The Bohr theory for hydrogen can be used for any atom with a single electron,
even if the nuclear charge Z is greater than 1. For example, we can calculate
the energy levels of singly ionized helium (helium with one electron removed),
doubly ionized lithium, and so on. The nuclear electric charge enters the Bohr
theory in only one place—in the expression for the electrostatic force between
nucleus and electron, Eq. 6.22. For a nucleus of charge Ze, the Coulomb force
acting on the electron is
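$$F = \frac{1}{4\pi\varepsilon_0}\frac{|q_1||q_2|}{r^2} = \frac{1}{4\pi\varepsilon_0}\frac{Ze^2}{r^2} \tag{6.36}$$
That is, where we had e² previously, we now have Ze². Making the same
substitution in the final results, we can find the allowed radii:
$$r_n = \frac{4\pi\varepsilon_0\hbar^2}{Ze^2 m}\,n^2 = \frac{a_0 n^2}{Z} \tag{6.37}$$
and the energies become
$$E_n = -\frac{m(Ze^2)^2}{32\pi^2\varepsilon_0^2\hbar^2}\,\frac{1}{n^2} = -(13.60\ \mathrm{eV})\,\frac{Z^2}{n^2} \tag{6.38}$$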
The orbits in the higher-Z atoms are closer to the nucleus and have larger
(negative) energies; that is, the electron is more tightly bound to the nucleus.
Example 6.7
Calculate the two longest wavelengths of the Balmer series
of triply ionized beryllium (Z = 4).
Solution
The radiations of the Balmer series end with the n = 2
level, and so the two longest wavelengths are the radiations
corresponding to n = 3 → n = 2 and n = 4 → n = 2. The
energies of the radiations and their corresponding wavelengths are
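$$E_3 - E_2 = -(13.60\ \mathrm{eV})(4^2)\left(\frac{1}{3^2} - \frac{1}{2^2}\right) = 30.2\ \mathrm{eV}$$
$$\lambda = \frac{hc}{E} = \frac{1240\ \mathrm{eV\cdot nm}}{30.2\ \mathrm{eV}} = 41.0\ \mathrm{nm}$$
$$E_4 - E_2 = -(13.60\ \mathrm{eV})(4^2)\left(\frac{1}{4^2} - \frac{1}{2^2}\right) = 40.8\ \mathrm{eV}$$
$$\lambda = \frac{hc}{E} = \frac{1240\ \mathrm{eV\cdot nm}}{40.8\ \mathrm{eV}} = 30.4\ \mathrm{nm}$$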
These radiations are in the ultraviolet region.
Note that we cannot use Eq. 6.33 to find the wavelengths,
because that equation applies only to hydrogen (Z = 1).
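For one-electron ions, Eqs. 6.37 and 6.38 can be evaluated in a few lines. The Python sketch below is an illustration under the values used in this section (13.60 eV, a0 = 0.0529 nm, hc = 1240 eV·nm); the function names are our own.

```python
RYDBERG_EV = 13.60  # magnitude of the hydrogen ground-state energy, eV
A0_NM = 0.0529      # Bohr radius, nm
HC_EV_NM = 1240.0   # hc in eV*nm, as used in Example 6.7


def level_energy_ev(n, Z=1):
    """Energy of level n for a one-electron atom of nuclear charge Z (Eq. 6.38)."""
    return -RYDBERG_EV * Z**2 / n**2


def orbit_radius_nm(n, Z=1):
    """Radius of the nth orbit for nuclear charge Z (Eq. 6.37)."""
    return A0_NM * n**2 / Z


def emitted_photon(n_upper, n_lower, Z=1):
    """Return (energy in eV, wavelength in nm) for the jump n_upper -> n_lower."""
    energy = level_energy_ev(n_upper, Z) - level_energy_ev(n_lower, Z)
    return energy, HC_EV_NM / energy


if __name__ == "__main__":
    # Triply ionized beryllium (Z = 4), as in Example 6.7
    for n_upper in (3, 4):
        energy, wavelength = emitted_photon(n_upper, 2, Z=4)
        print(f"{n_upper} -> 2: {energy:.1f} eV, {wavelength:.1f} nm")
    # Singly ionized helium (Z = 2): the ground-state orbit is half the size
    print(f"He+ ground state: r = {orbit_radius_nm(1, Z=2):.4f} nm, "
          f"E = {level_energy_ev(1, Z=2):.1f} eV")
```

Because the energies scale as Z², every wavelength of a given transition is shorter by a factor of Z² than the corresponding hydrogen wavelength.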
6.6 THE FRANCK-HERTZ EXPERIMENT
FIGURE 6.21 Franck-Hertz apparatus. Electrons leave the cathode C, are
accelerated by the voltage V toward the grid G, and reach the plate P where
they are recorded on the ammeter A. (F is the filament that heats the cathode; V0 is the retarding voltage between the grid and the plate.)
Let us imagine the following experiment, performed with the apparatus shown
schematically in Figure 6.21. A filament heats the cathode, which then emits
electrons. These electrons are accelerated toward the grid by the potential
difference V , which we control. Electrons pass through the grid and reach the
plate if V exceeds V0 , a small retarding voltage between the grid and the plate.
The current of electrons reaching the plate is measured using the ammeter A.
Now suppose the tube is filled with atomic hydrogen gas at a low pressure. As
the voltage is increased from zero, more and more electrons reach the plate, and
the current rises accordingly. The electrons inside the tube may make collisions
with atoms of hydrogen, but lose no energy in these collisions—the collisions are
perfectly elastic. The only way the electron can give up energy in a collision is if
the electron has enough energy to cause the hydrogen atom to make a transition to
an excited state. Thus, when the energy of the electrons reaches and barely exceeds
10.2 eV (or when the voltage reaches 10.2 V), the electrons can make inelastic
collisions, leaving 10.2 eV of energy with the atom (now in the n = 2 level), and
the original electron moves off with very little energy. If it should pass through the
grid, the electron might not have sufficient energy to overcome the small retarding
potential and reach the plate. Thus when V = 10.2 V, a drop in the current
is observed. As V is increased further, we begin to see the effects of multiple
collisions. That is, when V = 20.4 V, an electron can make an inelastic collision,
leaving the atom in the n = 2 state. The electron loses 10.2 eV of energy in this
process, and so it moves off after the collision with a remaining 10.2 eV of energy,
which is sufficient to excite a second hydrogen atom in an inelastic collision. Thus,
if a drop in the current is observed at V , similar drops are observed at 2V , 3V , . . . .
This experiment should thus give rather direct evidence for the existence of
atomic excited states. Unfortunately, it is not easy to do this experiment with
hydrogen, because hydrogen occurs naturally in the molecular form H2 , rather
than in atomic form. The molecules can absorb energy in a variety of ways, which
would confuse the interpretation of the experiment. A similar experiment was
done in 1914 by James Franck and Gustav Hertz, using a tube filled with mercury
vapor. Their results are shown in Figure 6.22, which gives clear evidence for an
excited state at 4.9 eV; whenever the voltage is a multiple of 4.9 V, a drop in
the current appears. Coincidentally, the emission spectrum of mercury shows an
intense ultraviolet line of wavelength 254 nm, which corresponds to an energy
of 4.9 eV; this results from a transition between the same 4.9-eV excited state
and the ground state. The Franck-Hertz experiment showed that an electron must
have a certain minimum energy to make an inelastic collision with an atom; we
now interpret that minimum energy as the energy of an excited state of the atom.
Franck and Hertz were awarded the 1925 Nobel Prize in physics for this work.
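The two numerical observations in this section, the regularly spaced current drops and the match between the 4.9-eV excitation and the ultraviolet line, can be sketched as follows. This Python example is only an illustration; the function names are our own, and hc = 1240 eV·nm is the rounded value used earlier in the chapter.

```python
HC_EV_NM = 1240.0  # hc in eV*nm (rounded value used in this chapter)


def dip_voltages(excitation_ev, v_max):
    """Voltages (in V) up to v_max at which a current drop is expected.

    An electron accelerated through k times the excitation voltage can
    excite k atoms in succession, so drops occur at integer multiples.
    """
    count = int(v_max // excitation_ev)
    return [k * excitation_ev for k in range(1, count + 1)]


def photon_wavelength_nm(energy_ev):
    """Wavelength (in nm) of a photon carrying the given energy in eV."""
    return HC_EV_NM / energy_ev


if __name__ == "__main__":
    dips = dip_voltages(4.9, 15.0)
    print("Mercury dips (V):", ", ".join(f"{v:.1f}" for v in dips))
    print(f"Photon from the 4.9-eV state: {photon_wavelength_nm(4.9):.0f} nm")
    print("Hydrogen dips (V):",
          ", ".join(f"{v:.1f}" for v in dip_voltages(10.2, 35.0)))
```

With these rounded inputs the photon wavelength comes out near 253 nm, close to the observed 254-nm mercury line.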
FIGURE 6.22 Result of Franck-Hertz experiment using mercury vapor. The
current drops at voltages of 4.9 V, 9.8 V (= 2 × 4.9 V), and 14.7 V (= 3 ×
4.9 V). (Axes: current versus voltage.)
*6.7 THE CORRESPONDENCE PRINCIPLE
We have seen how Bohr’s model permits calculations of transition wavelengths in
atomic hydrogen that are in excellent agreement with the wavelengths observed in
the emission and absorption spectra. However, in order to obtain this agreement,
Bohr had to introduce postulates that were radical departures from classical
physics. In particular, according to classical physics an accelerated charged
particle radiates electromagnetic energy, but an electron in Bohr’s atomic model,
accelerated as it moves in a circular orbit, does not radiate (unless it jumps to
another orbit). Here we have a very different case than we did in our study
of special relativity. You will recall, for example, that relativity gives us one
expression for the kinetic energy, K = E − E0, and classical physics gives us
another, K = ½mv²; however, we showed that E − E0 reduces to ½mv² when
v ≪ c. Thus these two expressions are really not very different—one is merely
a special case of the other. The dilemma associated with the accelerated electron
is not simply a matter of atomic physics (as an example of quantum physics)
being a special case of classical physics. Either the accelerated charge radiates,
or it doesn’t! Bohr’s solution to this serious dilemma was to propose the
correspondence principle, which states that
Quantum theory must agree with classical theory in the limit in which
classical theory is known to agree with experiment,
or equivalently,
Quantum theory must agree with classical theory in the limit of large
quantum numbers.
Let us see how we can apply this principle to the Bohr atom. According to
classical physics, an electric charge moving in a circle radiates at a frequency
equal to its frequency of rotation. For an atomic orbit, the period of revolution
is the distance traveled in one orbit, 2πr, divided by the orbital speed v = √(2K/m),
where K is the kinetic energy:
$$T = \frac{2\pi r}{\sqrt{2K/m}} = \frac{\sqrt{16\pi^3\varepsilon_0 m r^3}}{e} \tag{6.39}$$
where we use Eq. 6.23 for the kinetic energy. The frequency f is the inverse of
the period:
$$f = \frac{1}{T} = \frac{e}{\sqrt{16\pi^3\varepsilon_0 m r^3}} \tag{6.40}$$
Using Eq. 6.28 for the radii of the allowed orbits, we find
$$f_n = \frac{me^4}{32\pi^3\varepsilon_0^2\hbar^3}\,\frac{1}{n^3} \tag{6.41}$$
A “classical” electron moving in an orbit of radius rn would radiate at this
frequency fn .
*This is an optional section that may be skipped without loss of continuity.
If we made the radius of the Bohr atom so large that it went from a quantum-sized
object (10⁻¹⁰ m) to a laboratory-sized object (10⁻³ m), the atom should
behave classically. The radius increases with increasing n like n², so this classical
behavior should occur for n in the range 10³–10⁴. Let us then calculate the
frequency of the radiation emitted by such an atom when the electron drops from
the orbit n to the orbit n − 1. According to Eq. 6.32, the frequency is
$$f = \frac{me^4}{64\pi^3\varepsilon_0^2\hbar^3}\left(\frac{1}{(n-1)^2} - \frac{1}{n^2}\right) = \frac{me^4}{64\pi^3\varepsilon_0^2\hbar^3}\,\frac{2n - 1}{n^2 (n-1)^2} \tag{6.42}$$
If n is very large, then we can approximate n − 1 by n and 2n − 1 by 2n, which
gives
$$f \cong \frac{me^4}{64\pi^3 \varepsilon_0^2 \hbar^3}\,\frac{2n}{n^4} = \frac{me^4}{32\pi^3 \varepsilon_0^2 \hbar^3}\,\frac{1}{n^3} \tag{6.43}$$
This is identical with Eq. 6.41 for the “classical” frequency. The “classical”
electron spirals slowly in toward the nucleus, radiating at the frequency given by
Eq. 6.41, while the “quantum” electron jumps from the orbit n to the orbit n − 1
and then to the orbit n − 2, and so forth, radiating at the frequency given by the
identical Eq. 6.43. (When the circular orbits are very large, this jumping from
one circular orbit to the next smaller one looks very much like a spiral, as in
Figure 6.23.)
In the region of large n, where classical and quantum physics overlap, the
classical and quantum expressions for the radiation frequencies are identical.
This is an example of an application of Bohr’s correspondence principle. The
applications of the correspondence principle go far beyond the Bohr atom, and
this principle is important in understanding how we get from the domain in which
the laws of classical physics are valid to the domain in which the laws of quantum
physics are valid.
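The approach of the quantum frequency to the classical one can be checked numerically. The following is a minimal sketch, not part of the original derivation: it evaluates Eq. 6.41 and Eq. 6.42 for several values of n and prints their ratio. The function names and the rounded values of the physical constants are assumptions made for this illustration.

```python
import math

# Rounded SI values of the constants (assumed here for illustration).
M_E = 9.1093837e-31         # electron mass, kg
E_CHARGE = 1.602176634e-19  # elementary charge, C
EPS0 = 8.8541878128e-12     # vacuum permittivity, F/m
HBAR = 1.054571817e-34      # reduced Planck constant, J s

# Common prefactor m e^4 / (64 pi^3 eps0^2 hbar^3), in Hz
PREFACTOR = M_E * E_CHARGE**4 / (64 * math.pi**3 * EPS0**2 * HBAR**3)

def classical_frequency(n):
    """Orbital frequency of a classical electron in the nth Bohr orbit (Eq. 6.41)."""
    return 2 * PREFACTOR / n**3

def quantum_frequency(n):
    """Frequency of the photon emitted in the transition n -> n - 1 (Eq. 6.42)."""
    return PREFACTOR * (2 * n - 1) / (n**2 * (n - 1)**2)

for n in (2, 10, 100, 1000, 10000):
    ratio = quantum_frequency(n) / classical_frequency(n)
    print(f"n = {n:6d}   f_quantum / f_classical = {ratio:.6f}")
```

The ratio is far from 1 for small n and approaches 1 as n grows, which is the behavior the correspondence principle requires.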
6.8 DEFICIENCIES OF THE BOHR MODEL
The Bohr model gives us a picture of how electrons move about the nucleus,
and many of our attempts to explain the behavior of atoms refer to this picture,
even though it is not strictly correct. Our presentation ignored two effects that
must be included to improve the accuracy of the model. Other deficiencies in the
model cannot be so easily fixed, because they are inconsistent with the correct
quantum-mechanical picture, which is presented in the next chapter.
1. Motion of the Proton. Our model was based on an electron orbiting around
a fixed proton, but actually the electron and proton both orbit about their center
of mass (just as the Earth and Sun orbit around their center of mass). The kinetic
energy should thus include a term describing the motion of the proton. We can
account for this effect if the mass that appears in the equation for the energy
levels (Eq. 6.30) is not the electron mass but instead is the reduced mass of the
proton-electron system, calculated from the electron mass $m_e$ and proton mass $m_p$
according to
$$m = \frac{m_e m_p}{m_e + m_p} \tag{6.44}$$
FIGURE 6.23 (Top) A large quantum
atom. Photons are emitted in discrete
transitions as the electron jumps to
lower states. (Bottom) A classical
atom. Photons are emitted continuously by the accelerated electron.
The reduced mass is just slightly smaller than the electron mass and has the
effect of decreasing the energy and frequency or increasing the wavelength by
about 0.05%. Equivalently, in Eq. 6.33 for the wavelengths we can replace the
Rydberg constant $R_\infty$ (so called because it would be correct if the proton mass
were infinite) with the value $R = R_\infty/(1 + m_e/m_p)$.
2. Wavelengths in Air. Another small but easily fixable error occurs when we
convert the frequencies (Eq. 6.32) calculated directly from the Bohr energy levels
to wavelengths (Eq. 6.33). The wavelength measurements are normally done in
air, so we should calculate the wavelength as $\lambda = v_{\text{air}}/f$, where $v_{\text{air}}$ is the speed of
light in air. This has the effect of decreasing the calculated wavelengths by about
0.03% (to some extent offsetting the error we made by ignoring the motion of the
proton).
3. Angular Momentum. A serious failure of the Bohr model is that it gives
incorrect predictions for the angular momentum of the electron. In Bohr’s theory,
the orbital angular momentum is quantized in integer multiples of $\hbar$, which is
correct. However, for the ground state of hydrogen (n = 1), the Bohr theory gives
$L = \hbar$, while experiment clearly shows L = 0.
4. Uncertainty. Another deficiency of the model is that it violates the uncertainty relationship. (In Bohr’s defense, remember that the model was developed a
decade before the introduction of wave mechanics, with its accompanying ideas
of uncertainty.) Suppose the electron orbits in the xy plane. In this case we know
exactly its z coordinate (in the xy plane, z = 0 and so $\Delta z = 0$) and the z component
of its momentum (also precisely zero, so $\Delta p_z = 0$). Such an atom would therefore
violate the uncertainty relationship $\Delta z\,\Delta p_z \ge \hbar$. In fact, as we discuss in the next
chapter, quantum mechanics introduces a degree of “fuzziness” to the behavior
of electrons in atoms that is not consistent with any orbit in a single plane.
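The sizes of the two small corrections described in items 1 and 2 can be estimated with a few lines of arithmetic. This is a rough sketch under stated assumptions: the rounded particle masses and the refractive index of air (taken here as about 1.0003 for visible light) are values supplied for the illustration, not quantities given in the text.

```python
# Rounded values assumed for this estimate.
M_E = 9.1093837e-31    # electron mass, kg
M_P = 1.67262192e-27   # proton mass, kg
N_AIR = 1.0003         # approximate refractive index of air (assumption)

# Item 1: reduced mass of the proton-electron system (Eq. 6.44).
reduced_mass = M_E * M_P / (M_E + M_P)

# Energies scale as m, so wavelengths scale as 1/m.
energy_decrease = 1 - reduced_mass / M_E
wavelength_increase = M_E / reduced_mass - 1

# Item 2: in air the wavelength is v_air / f = (c / N_AIR) / f.
air_decrease = 1 - 1 / N_AIR

print(f"energy decrease from proton motion:     {100 * energy_decrease:.4f} %")
print(f"wavelength increase from proton motion: {100 * wavelength_increase:.4f} %")
print(f"wavelength decrease in air:             {100 * air_decrease:.4f} %")
```

The two wavelength shifts have opposite signs, which is why the text notes that the second partly offsets the first.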
In spite of its successes, the Bohr model is at best an incomplete model. It is
useful only for atoms that contain one electron (hydrogen, singly ionized helium,
doubly ionized lithium, and so forth), but not for atoms with two or more electrons,
because we have considered only the force between electron and nucleus, and
not the force between the electrons themselves. Furthermore, if we look very
carefully at the emission spectrum, we find that many lines are in fact not single
lines, but very closely spaced combinations of two or more lines; the Bohr model
is unable to account for these doublets of spectral lines. The model is also limited
in its usefulness as a basis from which to calculate other properties of the atom;
although we can accurately calculate the energies of the spectral lines, we cannot
calculate their intensities. For example, how often will an electron in the n = 3
state jump directly to the n = 1 state, emitting the corresponding photon, and how
often will it jump first to the n = 2 state and then to the n = 1 state, emitting two
photons? A complete theory should provide a way to calculate this property.
We do not wish, however, to discard the model completely. The Bohr model
provides a useful starting point in our study of atoms, and Bohr introduced several
ideas (stationary states, quantization of angular momentum, correspondence
principle) that carry over into the correct quantum-mechanical calculation. There
are many atomic properties, especially those associated with magnetism, that can
be simply modeled on the basis of Bohr orbits. Most remarkably, when we treat
the hydrogen atom correctly in the next chapter using quantum mechanics, we
find that the energy levels calculated by solving the Schrödinger equation are in
fact identical with those of the Bohr model.
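Chapter Summary
Scattering impact parameter: $b = \frac{zZ}{2K}\,\frac{e^2}{4\pi\varepsilon_0}\cot\frac{1}{2}\theta$ (Section 6.3)
Fraction scattered at angles > θ: $f_{>\theta} = nt\pi b^2$ (Section 6.3)
Rutherford scattering formula: $N(\theta) = \frac{nt}{4r^2}\left(\frac{zZ}{2K}\right)^2\left(\frac{e^2}{4\pi\varepsilon_0}\right)^2\frac{1}{\sin^4\frac{1}{2}\theta}$ (Section 6.3)
Distance of closest approach: $d = \frac{1}{4\pi\varepsilon_0}\,\frac{zZe^2}{K}$ (Section 6.3)
Balmer formula: $\lambda = (364.5\ \text{nm})\,\frac{n^2}{n^2 - 4}$ (n = 3, 4, 5, . . .) (Section 6.4)
Radii of Bohr orbits in hydrogen: $r_n = \frac{4\pi\varepsilon_0\hbar^2}{me^2}\,n^2 = a_0 n^2$ (n = 1, 2, 3, . . .) (Section 6.5)
Energies of Bohr orbits in hydrogen: $E_n = -\frac{me^4}{32\pi^2\varepsilon_0^2\hbar^2}\,\frac{1}{n^2} = \frac{-13.60\ \text{eV}}{n^2}$ (n = 1, 2, 3, . . .) (Section 6.5)
Excitation energy of level n: $E_n - E_1$ (Section 6.5)
Binding (or ionization) energy of level n: $|E_n|$ (Section 6.5)
Hydrogen wavelengths in Bohr model: $\lambda = \frac{64\pi^3\varepsilon_0^2\hbar^3 c}{me^4}\,\frac{n_1^2 n_2^2}{n_1^2 - n_2^2} = \frac{1}{R_\infty}\,\frac{n_1^2 n_2^2}{n_1^2 - n_2^2}$ (Section 6.5)
Single-electron atoms with Z > 1: $r_n = \frac{a_0 n^2}{Z}$, $E_n = -(13.60\ \text{eV})\,\frac{Z^2}{n^2}$ (Section 6.5)
Reduced mass of proton-electron system: $m = \frac{m_e m_p}{m_e + m_p}$ (Section 6.8)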
Questions
1. Does the Thomson model fail at large scattering angles or at
small scattering angles? Why?
2. What principles of physics would be violated if we scattered
a beam of alpha particles with a single impact parameter
from a single target atom at rest?
3. Could we use the Rutherford scattering formula to analyze
the scattering of (a) protons incident on iron? (b) Alpha particles incident on lithium (Z = 3)? (c) Silver nuclei incident
on gold? (d) Hydrogen atoms incident on gold? (e) Electrons
incident on gold?
4. What determines the angular range dθ in the alpha-particle
scattering experiment (Figure 6.8)?
5. Why didn’t Bohr use the concept of de Broglie waves in his
theory?
6. In which Bohr orbit does the electron have the largest
velocity? Are we justified in treating the electron nonrelativistically in that case?
7. How does an electron in hydrogen get from r = 4a0 to
r = a0 without being anywhere in between?
8. How is the quantization of the energy in the hydrogen
atom similar to the quantization of the systems discussed
in Chapter 5? How is it different? Do the quantizations
originate from similar causes?
9. In a Bohr atom, an electron jumps from state $n_1$, with angular
momentum $n_1\hbar$, to state $n_2$, with angular momentum $n_2\hbar$.
How can an isolated system change its angular momentum?
(In classical physics, a change in angular momentum requires
an external torque.) Can the photon carry away the difference in angular momentum? Estimate the maximum angular
momentum, relative to the center of the atom, that the photon can have. Does this suggest another failure of the Bohr
model?
10. The product En rn for the hydrogen atom is (1) independent
of Planck’s constant and (2) independent of the quantum
number n. Does this observation have any significance? Is
this a classical or a quantum effect?
11. (a) How does a Bohr atom violate the position-momentum
uncertainty relationship? (b) How does a Bohr atom violate the energy-time uncertainty relationship? (What is ΔE?
What does this imply about Δt? What do you conclude
about transitions between levels?)
12. List the assumptions made in deriving the Bohr theory.
Which of these are a result of neglecting small quantities?
Which of these violate basic principles of relativity or
quantum physics?
13. List the assumptions made in deriving the Rutherford scattering formula. Which of these are a result of neglecting
small quantities? Which of these violate basic principles of
relativity or quantum physics?
14. In both the Rutherford theory and the Bohr theory, we used
the classical expression for the kinetic energy. Estimate the
velocity of an electron in the Bohr atom and of an alpha particle in a typical scattering experiment, and decide whether
the use of the classical formula is justified.
15. In both the Rutherford theory and the Bohr theory, we
neglected any wave properties of the particles. Estimate the
de Broglie wavelength of an electron in a Bohr atom and
compare it with the size of the atom. Estimate the de Broglie
wavelength of an alpha particle and compare it with the
size of the nucleus. Is the wave behavior expected to be
important in either case?
16. What is the distinction between binding energy and ionization energy? Between binding energy and excitation energy?
If you were given the value of the binding energy of a level
in hydrogen, could you find its excitation energy without
knowing which level it is?
17. Why are the decreases in current in the Franck-Hertz experiment not sharp?
18. As indicated by the Franck-Hertz experiment, the first
excited state of mercury is at an energy of 4.9 eV. Do
you expect mercury to show absorption lines in the visible
spectrum?
19. Is the correspondence principle a necessary part of quantum physics or is it merely an accidental agreement of two
formulas? Where do we draw the line between the world
of quantum physics and the world of classical, nonquantum
physics?
Problems
6.1 Basic Properties of Atoms
1. Electrons in atoms are known to have kinetic energies in
the range of a few eV. Show that the uncertainty principle
allows electrons of this energy to be confined in a region the
size of an atom (0.1 nm).
6.2 Scattering Experiments and the Thomson Model
2. Consider an electron in Figure 6.1 embedded in a sphere of
positive charge Ze at a distance r from its center. (a) Using
Gauss’s law, show that the electric field on the electron due
to the positive charge is
$$E = \frac{1}{4\pi\varepsilon_0}\,\frac{Ze}{R^3}\,r$$
(b) For this electric field, show that the force on the electron
is given by Eq. 6.2.
3. (a) Compute the oscillation frequency of the electron and the
expected absorption or emission wavelength in a Thomson-model hydrogen atom. Use R = 0.053 nm. Compare with
the observed wavelength of the strongest emission and
absorption line in hydrogen, 122 nm. (b) Repeat for sodium
(Z = 11). Use R = 0.18 nm. Compare with the observed
wavelength, 590 nm.
4. Consider the Thomson model for an atom with 2 electrons.
Let the electrons be located along a diameter on opposite
sides of the center of the sphere, each a distance x from the
center. (a) Show that the configuration is stable if x = R/2.
(b) Try to construct similar stable configurations for atoms
with 3, 4, 5, and 6 electrons.
6.3 The Rutherford Nuclear Atom
5. Alpha particles of kinetic energy 5.00 MeV are scattered
at 90° by a gold foil. (a) What is the impact parameter?
(b) What is the minimum distance between alpha particles
and gold nucleus? (c) Find the kinetic and potential energies
at that minimum distance.
6. How much kinetic energy must an alpha particle have before
its distance of closest approach to a gold nucleus is equal to
the nuclear radius ($7.0 \times 10^{-15}$ m)?
7. What is the distance of closest approach when alpha particles
of kinetic energy 6.0 MeV are scattered by a thin copper foil?
8. Protons of energy 5.0 MeV are incident on a silver foil of
thickness $4.0 \times 10^{-6}$ m. What fraction of the incident protons is scattered at angles: (a) Greater than 90°? (b) Greater
than 10°? (c) Between 5° and 10°? (d) Less than 5°?
9. Protons are incident on a copper foil 12 μm thick. (a) What
should the proton kinetic energy be in order that the distance of closest approach equal the nuclear radius (5.0 fm)?
(b) If the proton energy were 7.5 MeV, what is the impact
parameter for scattering at 120°? (c) What is the minimum
distance between proton and nucleus for this case? (d) What
fraction of the protons is scattered beyond 120°?
10. Alpha particles of kinetic energy K are scattered either from
a gold foil or a silver foil of identical thickness. What is the
ratio of the number of particles scattered at angles greater
than 90° by the gold foil to the same number for the silver
foil?
11. The maximum kinetic energy given to the target nucleus will
occur in a head-on collision with b = 0. (Why?) Estimate
the maximum kinetic energy given to the target nucleus
when 8.0 MeV alpha particles are incident on a gold foil.
Are we justified in neglecting this energy?
12. The maximum kinetic energy that an alpha particle can
transmit to an electron occurs during a head-on collision.
Compute the kinetic energy lost by an alpha particle of
kinetic energy 8.0 MeV in a head-on collision with an electron at rest. Are we justified in neglecting this energy in the
Rutherford theory?
13. Alpha particles of energy 9.6 MeV are incident on a silver
foil of thickness 7.0 μm. For a certain value of the impact
parameter, the alpha particles lose exactly half their incident
kinetic energy when they reach their minimum separation
from the nucleus. Find the minimum separation, the impact
parameter, and the scattering angle.
14. Alpha particles of kinetic energy 6.0 MeV are incident at
a rate of $3.0 \times 10^7$ per second on a gold foil of thickness
$3.0 \times 10^{-6}$ m. A circular detector of diameter 1.0 cm is
placed 12 cm from the foil at an angle of 30° with the
direction of the incident alpha particles. At what rate does
the detector measure scattered alpha particles?
6.4 Line Spectra
15. The shortest wavelength of the hydrogen Lyman series is
91.13 nm. Find the three longest wavelengths in this series.
16. One of the lines in the Brackett series (series limit = 1458 nm)
has a wavelength of 1944 nm. Find the next higher and next
lower wavelengths in this series.
17. The longest wavelength in the Pfund series is 7459 nm. Find
the series limit.
6.5 The Bohr Model
18. In the n = 3 state of hydrogen, find the electron’s velocity,
kinetic energy, and potential energy.
19. Use the Bohr theory to find the series wavelength limits of
the Lyman and Paschen series of hydrogen.
20. (a) Show that the speed of an electron in the nth Bohr
orbit of hydrogen is αc/n, where α is the fine structure
constant, equal to $e^2/4\pi\varepsilon_0\hbar c$. (b) What would be the speed
in a hydrogenlike atom with a nuclear charge of Ze?
21. An electron is in the n = 5 state of hydrogen. To what states
can the electron make transitions, and what are the energies
of the emitted radiations?
22. Continue Figure 6.20, showing the transitions of the Paschen
series and computing their energies and wavelengths.
23. A collection of hydrogen atoms in the ground state is illuminated with ultraviolet light of wavelength 59.0 nm. Find
the kinetic energy of the emitted electrons.
24. Find the ionization energy of: (a) the n = 3 level of hydrogen; (b) the n = 2 level of He$^+$ (singly ionized helium); (c)
the n = 4 level of Li$^{++}$ (doubly ionized lithium).
25. Use the Bohr formula to find the energy differences
$\Delta E(n_1 \to n_2) = E_{n_1} - E_{n_2}$ and show that (a) $\Delta E(4 \to 2) = \Delta E(4 \to 3) + \Delta E(3 \to 2)$;
(b) $\Delta E(4 \to 1) = \Delta E(4 \to 2) + \Delta E(2 \to 1)$.
(c) Interpret these results based on the Ritz combination principle.
26. Find the shortest and the longest wavelengths of the Lyman
series of singly ionized helium.
27. Draw an energy-level diagram showing the lowest four levels of singly ionized helium. Show all possible transitions
from the levels and label each transition with its wavelength.
28. A long time ago, in a galaxy far, far away, electric charge
had not yet been invented, and atoms were held together
by gravitational forces. Compute the Bohr radius and the
n = 2 to n = 1 transition energy in a gravitationally bound
hydrogen atom.
29. An alternative development of the Bohr theory begins by
assuming that the stationary states are those for which the
circumference of the orbit is an integral number of de Broglie
wavelengths. (a) Show that this condition leads to standing
de Broglie waves around the orbit. (b) Show that this condition gives the angular momentum condition, Eq. 6.26, used
in the Bohr theory.
6.6 The Franck-Hertz Experiment
30. A hypothetical atom has only two excited states, at 4.0 and
7.0 eV, and has a ground-state ionization energy of 9.0 eV.
If we used a vapor of such atoms for the Franck-Hertz experiment, for what voltages would we expect to see decreases
in the current? List all voltages up to 20 V.
31. The first excited state of sodium decays to the ground state
by emitting a photon of wavelength 590 nm. If sodium vapor
is used for the Franck-Hertz experiment, at what voltage will
the first current drop be recorded?
6.7 The Correspondence Principle
32. Suppose all of the excited levels of hydrogen had lifetimes
of $10^{-8}$ s. As we go to higher and higher excited states, they
get closer and closer together, and soon they are so close
in energy that the energy uncertainty of each state becomes
as large as the energy spacing between states, and we can
no longer resolve individual states. Find the value of n for
which this occurs. What is the radius of such an atom?
33. Compare the frequency of revolution of an electron with
the frequency of the photons emitted in transitions from
n to n − 1 for (a) n = 10; (b) n = 100; (c) n = 1000;
(d) n = 10,000.
6.8 Deficiencies of the Bohr Model
34. What is the difference in wavelength between the first line of
the Balmer series in ordinary hydrogen (M = 1.007825 u)
and in “heavy” hydrogen (M = 2.014102 u)?
General Problems
35. A hydrogen atom is in the n = 6 state. (a) Counting all
possible paths, how many different photon energies can be
emitted if the atom ends up in the ground state? (b) Suppose
only Δn = 1 transitions were allowed. How many different
photon energies would be emitted? (c) How many different
photon energies would occur in a Thomson-model hydrogen
atom?
36. An electron is in the n = 8 level of ionized helium.
(a) Find the three longest wavelengths that are emitted
when the electron makes a transition from the n = 8 level to
a lower level. (b) Find the shortest wavelength that can be
emitted. (c) Find the three longest wavelengths at which the
electron in the n = 8 level will absorb a photon and move
to a higher state, if we could somehow keep it in that level
long enough to absorb. (d) Find the shortest wavelength that
can be absorbed.
37. The lifetimes of the levels in a hydrogen atom are of the
order of $10^{-8}$ s. Find the energy uncertainty of the first
excited state and compare it with the energy of the state.
38. The following wavelengths are found among the many radiations emitted by singly ionized helium: 24.30 nm, 25.63 nm,
102.5 nm, 320.4 nm. If we group the transitions in helium
as we did in hydrogen by identifying the final state n0 and
initial state n, to which series does each transition belong?
39. Adjacent wavelengths 72.90 nm and 54.00 nm are found in
one series of transitions among the radiations emitted by
doubly-ionized lithium. Find the value of n0 for this series
and find the next wavelength in the series.
40. When an atom emits a photon in a transition from a
state of energy E1 to a state of energy E2 , the photon
energy is not precisely equal to E1 − E2 . Conservation of
momentum requires that the atom must recoil, and so some
energy must go into recoil kinetic energy KR . Show that
$K_R \cong (E_1 - E_2)^2/2Mc^2$ where M is the mass of the atom.
Evaluate this recoil energy for the n = 2 to n = 1 transition
of hydrogen.
41. In a muonic atom, the electron is replaced by a negatively
charged particle called the muon. The muon mass is 207
times the electron mass. (a) Ignoring the correction for finite
nuclear mass, what is the shortest wavelength of the Lyman
series in a muonic hydrogen atom? In what region of the
electromagnetic spectrum does this belong? (b) How large
is the correction for the finite nuclear mass in this case? (See
the discussion at the beginning of Section 6.8.)
42. Consider an atom in which the single electron is replaced
by a negatively charged muon (mμ = 207me ). What is the
radius of the first Bohr orbit of a muonic lead atom (Z = 82)?
Compare with the nuclear radius of about 7 fm.
Chapter
7
THE HYDROGEN ATOM IN
WAVE MECHANICS
These computer-generated distributions represent the probability to locate the electron in
the n = 8 state of hydrogen for angular momentum quantum number l = 2 (top) and l = 6
(bottom). The nucleus is at the center, and the height at any point gives the probability to find
the electron in a small volume element at that location in the xz plane. This way of describing
the motion of an electron in hydrogen is very different from the circular orbits of the Bohr
model.
In this chapter we study the solutions of the Schrödinger equation for the
hydrogen atom. We will see that these solutions, which lead to the same energy
levels calculated in the Bohr model, differ from the Bohr model by allowing for
the uncertainty in localizing the electron.
Other deficiencies of the Bohr model are not so easily eliminated by solving
the Schrödinger equation. First, the so-called “fine structure” of the spectral lines
(the splitting of the lines into close-lying doublets) cannot be explained by our
solutions; the proper explanation of this effect requires the introduction of a new
property of the electron, the intrinsic spin. Second, the mathematical difficulties
of solving the Schrödinger equation for atoms containing two or more electrons
are formidable, so we restrict our discussion in this chapter to one-electron atoms,
in order to see how wave mechanics enables us to understand some basic atomic
properties. In the next chapter we discuss the structure of many-electron atoms.
7.1 A ONE-DIMENSIONAL ATOM
Quantum mechanics gives us a view of the structure of the hydrogen atom that
is very different from the Bohr model. In the Bohr model, the electron moves
about the proton in a circular orbit. Quantum mechanics, on the other hand, does
not allow a fixed radius or a fixed orbital plane but instead describes the electron
in terms of a probability density, which leads to an uncertainty in locating the
electron.
To analyze the hydrogen atom according to quantum mechanics, we must
solve the Schrödinger equation for the Coulomb potential energy of the proton
and the electron:
$$U(r) = -\frac{e^2}{4\pi\varepsilon_0 r} \tag{7.1}$$
Eventually we will discuss the solutions to this three-dimensional problem for
the hydrogen atom using spherical polar coordinates, but for now let’s look at the
simpler one-dimensional problem, in which a proton is fixed at the origin (x = 0)
and an electron moves along the positive x axis. (This doesn’t represent a real
atom, but it does show how some properties of electron wave functions in atoms
emerge from solving the Schrödinger equation.)
In one dimension, the Schrödinger equation for an electron with potential
energy $U(x) = -e^2/4\pi\varepsilon_0 x$ would then be
$$-\frac{\hbar^2}{2m}\,\frac{d^2\psi}{dx^2} - \frac{e^2}{4\pi\varepsilon_0 x}\,\psi(x) = E\psi(x) \tag{7.2}$$
For a bound state, the wave function must fall to zero as x → ∞. Moreover, in order
for the second term on the left side to remain finite at x = 0, the wave function must
be zero at x = 0. The simplest function that satisfies both of these requirements is
$\psi(x) = Axe^{-bx}$, where A is the normalization constant. By substituting this trial
wave function into Eq. 7.2, we find a solution when $b = me^2/4\pi\varepsilon_0\hbar^2 = 1/a_0$
(where $a_0$ is the Bohr radius defined in Eq. 6.29). The energy corresponding
to this wave function is $E = -\hbar^2 b^2/2m = -me^4/32\pi^2\varepsilon_0^2\hbar^2$, which happens by
chance to be identical to the energy of the ground state in the Bohr model
(Eq. 6.30 for n = 1).
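The substitution can be written out explicitly. The following worked steps are a sketch of one way to carry it out; the intermediate expressions are not given in the text and are supplied here for illustration.

```latex
% Trial function and its derivatives
\psi(x) = A x e^{-bx}, \qquad
\frac{d\psi}{dx} = A(1 - bx)e^{-bx}, \qquad
\frac{d^2\psi}{dx^2} = A(b^2 x - 2b)e^{-bx}

% Substitute into Eq. 7.2 and divide by A e^{-bx}
-\frac{\hbar^2}{2m}\left(b^2 x - 2b\right) - \frac{e^2}{4\pi\varepsilon_0} = E x

% The equation must hold for all x, so match the constant terms and the terms in x
\frac{\hbar^2 b}{m} = \frac{e^2}{4\pi\varepsilon_0}
\;\Rightarrow\; b = \frac{m e^2}{4\pi\varepsilon_0 \hbar^2} = \frac{1}{a_0},
\qquad
E = -\frac{\hbar^2 b^2}{2m} = -\frac{m e^4}{32\pi^2 \varepsilon_0^2 \hbar^2}
```

Matching the constant terms fixes b, and matching the terms proportional to x then fixes the energy.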
FIGURE 7.1 Wave functions and probability densities (shaded areas) for an electron bound in a one-dimensional
Coulomb potential energy. The horizontal axis represents the distance between the proton and electron in units of a0 .
(a) Ground state. (b) First excited state. (c) Second excited state.
Figure 7.1a shows this wave function and its corresponding probability density
$|\psi(x)|^2$. There is clearly an uncertainty in specifying the location of the electron.
The most probable region to find the electron is near x = a0 , but there is a nonzero
probability for the electron to be anywhere in the range 0 < x < ∞. This is very
different from the Bohr model, in which the distance between the proton and
electron is fixed at the value a0 .
Also shown in the figure are wave functions and probability densities corresponding to the first and second excited states. The wave functions have the
oscillatory or wavelike property that we expect for quantum wave functions. As
we go to higher excited states, there are more peaks in the probability density and
the region of maximum probability moves to larger distances. These same features
emerge from the solution to the three-dimensional problem. From this simple
one-dimensional calculation (which does not in any way physically represent
the real three-dimensional hydrogen atom) you can already see how quantum
mechanics will resolve some of the difficulties associated with the Bohr model.
Example 7.1
Find the normalization constant of the ground state wave
function for a particle trapped in the one-dimensional
Coulomb potential energy.
Solution
The normalization integral (with $b = 1/a_0$) is
$$\int_0^\infty |\psi(x)|^2\,dx = A^2 \int_0^\infty x^2 e^{-2x/a_0}\,dx = 1$$
The integration is in a standard form that is found in integral
tables and that we will have occasion to use frequently in
analyzing the hydrogen wave functions:
$$\int_0^\infty x^n e^{-cx}\,dx = \frac{n!}{c^{n+1}} \tag{7.3}$$
Using this standard form with n = 2 and $c = 2/a_0$, the
normalization integral becomes
$$A^2\,\frac{2!}{(2/a_0)^3} = 1 \quad\text{or}\quad A = 2a_0^{-3/2}$$
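The result of Example 7.1 can be checked by numerical integration. The sketch below is an illustration only: it works in units where $a_0 = 1$, replaces the infinite upper limit by 40 $a_0$ (where the integrand is negligible), and uses a hand-written Simpson's rule; the helper names are assumptions of this example.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson's rule on [a, b] using an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

A0 = 1.0              # Bohr radius, used as the unit of length (assumption)
A = 2 * A0**-1.5      # normalization constant found in Example 7.1

def prob_density(x):
    """|psi(x)|^2 for the ground state psi(x) = A x exp(-x / a0)."""
    return (A * x * math.exp(-x / A0))**2

# A finite upper limit stands in for infinity; the tail beyond it is negligible.
norm = simpson(prob_density, 0.0, 40.0 * A0)

# Eq. 7.3 with n = 2 and c = 2 / a0
standard_form = math.factorial(2) / (2 / A0)**3

print(f"numerical normalization integral: {norm:.8f}")
print(f"A^2 times Eq. 7.3:                {A**2 * standard_form:.8f}")
```

Both numbers should be very close to 1 if the normalization constant is correct.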
Example 7.2
In the ground state of an electron bound in a one-dimensional Coulomb potential energy, what is the probability
to find the electron located between x = 0 and x = a0 ?
Solution
The probability can be found using Eq. 5.10:
$$P(0 : a_0) = \int_0^{a_0} |\psi(x)|^2\,dx = \frac{4}{a_0^3} \int_0^{a_0} x^2 e^{-2x/a_0}\,dx$$
with the normalization constant from Example 7.1. The
integral is a standard form that we will later find useful for
analyzing the hydrogen wave functions:
$$\int x^n e^{-cx}\,dx = -\frac{e^{-cx}}{c}\left[x^n + \frac{n x^{n-1}}{c} + \frac{n(n-1)x^{n-2}}{c^2} + \cdots + \frac{n!}{c^n}\right] \tag{7.4}$$
The probability is then
$$P(0 : a_0) = \frac{4}{a_0^3}\left[-\frac{e^{-2x/a_0}}{2/a_0}\left(x^2 + \frac{2x}{2/a_0} + \frac{2}{(2/a_0)^2}\right)\right]_0^{a_0} = 0.323$$
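The standard form of Eq. 7.4 is easy to evaluate by computer, which makes it simple to find the probability for any interval. The following sketch is an illustration, with lengths measured in units of $a_0$; the function names are assumptions of this example.

```python
import math

def antiderivative(x, n, c):
    """Eq. 7.4: the indefinite integral of x^n exp(-c x) for integer n >= 0."""
    series = sum(
        math.factorial(n) / math.factorial(n - k) * x**(n - k) / c**k
        for k in range(n + 1)
    )
    return -math.exp(-c * x) / c * series

def prob_between(x1, x2, a0=1.0):
    """Probability to find the ground-state electron between x1 and x2."""
    c = 2 / a0
    return (4 / a0**3) * (antiderivative(x2, 2, c) - antiderivative(x1, 2, c))

p = prob_between(0.0, 1.0)
print(f"P(0 : a0)      = {p:.3f}")
print(f"1 - 5 exp(-2)  = {1 - 5 * math.exp(-2):.3f}")
```

For the interval from 0 to $a_0$ the bracket in Eq. 7.4 reduces to $1 - 5e^{-2}$, which reproduces the value found in Example 7.2.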
7.2 ANGULAR MOMENTUM IN THE
HYDROGEN ATOM
Angular momentum played a significant role in Bohr’s analysis of the structure of
the hydrogen atom. Bohr was able to obtain the correct energy levels by assuming
that in the orbit with quantum number n, the angular momentum of the electron
is equal to $n\hbar$. Bohr’s idea about the “quantization of angular momentum” turned
out to have some correct features, but his analysis is not consistent with the actual
quantum mechanical nature of angular momentum.
Angular Momentum of Classical Orbits
FIGURE 7.2 Planetary orbits of the
same energy but different angular
momentum L. As L decreases, the
elliptical orbits become longer and
thinner.
Before considering the angular momentum of an orbiting electron, it is helpful to
review how angular momentum affects classical orbits, such as those of planets
or comets about the Sun. Classically, the angular momentum of a particle is
represented by the vector $\vec{L} = \vec{r} \times \vec{p}$, where $\vec{r}$ is the position vector that locates
the particle and $\vec{p}$ is its linear momentum. The direction of $\vec{L}$ is perpendicular
to the plane of the orbit. Along with the energy, the angular momentum remains
constant as the planet orbits.
The total energy of the orbital motion determines the average distance of the
planet from the Sun. For a given total energy, many different orbits are possible,
from the nearly circular orbit of the Earth to the highly elongated elliptical orbits
of the comets. These orbits differ in their angular momentum L, which is largest
for the circular orbit and smallest for the elongated ellipse. Figure 7.2 shows
a variety of planetary orbits having the same total energy but different angular
momentum. The complete specification of the orbit requires that we give not
only the magnitude of the angular momentum vector but also its direction; this
direction identifies the plane of the orbit. To completely describe the angular
momentum vector requires three numbers; for example, we might give the three
components of $\vec{L}$, $(L_x, L_y, L_z)$. Equivalently, we might give the magnitude L of the
vector and two angular coordinates that give its direction (similar to latitude and
longitude on a sphere).
Angular Momentum in Quantum Mechanics
Quantum mechanics gives us a very different view of angular momentum. The
angular momentum properties of a three-dimensional wave function are described
by two quantum numbers. The first is the angular momentum quantum number l.
This quantum number determines the length of the angular momentum vector:
$$|\vec{L}| = \sqrt{l(l+1)}\,\hbar \qquad (l = 0, 1, 2, \ldots) \tag{7.5}$$
Note that this is very different from the Bohr condition $|\vec{L}| = n\hbar$. In particular, it
is possible for the quantum vector to have a length of zero, but in the Bohr model
the minimum length is $\hbar$.
The second number that we use to describe angular momentum in quantum
mechanics is the magnetic quantum number ml . This quantum number tells
us about one component of the angular momentum vector, which we usually
choose to be the z component. The relationship between the z component of
$\vec{L}$ and the magnetic quantum number is
$$L_z = m_l\hbar \qquad (m_l = 0, \pm 1, \pm 2, \ldots, \pm l) \tag{7.6}$$
Note that for each value of l there are 2l + 1 possible values of ml .
Unlike the classical angular momentum vector, for which we provide an
exact specification by giving three numbers, the quantum angular momentum is
described by only two numbers. Clearly two numbers cannot completely identify
a vector in three-dimensional space, so something is missing from our description
of the quantum angular momentum. As we discuss later, this missing part of the
description of the quantum angular momentum vector is directly related to the
application of the uncertainty principle to angular momentum.
Example 7.3
Compute the length of the angular momentum vectors that
represent the orbital motion of an electron in a quantum
state with l = 1 and in another state with l = 2.
Solution
Equation 7.5 gives the relationship between the length of
the vector and the angular momentum quantum number l.
For l = 1,
$$|\vec{L}| = \sqrt{1(1+1)}\,\hbar = \sqrt{2}\,\hbar$$
and for l = 2,
$$|\vec{L}| = \sqrt{2(2+1)}\,\hbar = \sqrt{6}\,\hbar$$
Example 7.4
What are the possible z components of the vector $\vec{L}$ that
represents the orbital angular momentum of a state with
l = 2?
Solution
The possible ml values for l = 2 are +2, +1, 0, −1,
−2, and so the $\vec{L}$ vector can have any of five possible
z components: $L_z = 2\hbar$, $\hbar$, $0$, $-\hbar$, or $-2\hbar$.
The length of the vector $\vec{L}$, as we found previously, is $\sqrt{6}\,\hbar$.
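The arithmetic of Examples 7.3 and 7.4 can be sketched in a few lines of Python. This is only an illustration of Eqs. 7.5 and 7.6; the function names are hypothetical, and all values are expressed in units of ħ.

```python
import math

def angular_momentum_length(l):
    """Length |L| in units of hbar, from Eq. 7.5."""
    return math.sqrt(l * (l + 1))

def z_components(l):
    """Allowed L_z values in units of hbar, from Eq. 7.6 (m_l = -l, ..., +l)."""
    return list(range(-l, l + 1))

for l in (1, 2):
    # Print l, the vector length, and the 2l + 1 allowed z components
    print(l, angular_momentum_length(l), z_components(l))
```

For l = 2 the list of z components has five entries, and the length is √6 in units of ħ, as in the examples.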
The components of the vector $\vec{L}$ for l = 2 are illustrated in Figure 7.3. Each
orientation in space of the vector $\vec{L}$ corresponds to a different ml value. The polar
angle θ that the vector $\vec{L}$ makes with the z axis can be found by referring to the
figure. With $L_z = |\vec{L}|\cos\theta$, we have
$$\cos\theta = \frac{L_z}{|\vec{L}|} = \frac{m_l}{\sqrt{l(l+1)}} \tag{7.7}$$
using Eq. 7.6 for $L_z$ and Eq. 7.5 for $|\vec{L}|$.
FIGURE 7.3 The orientations in space and z components of a vector with
l = 2. There are five different possible orientations. [The figure shows a vector of
length $|\vec{L}| = \sqrt{6}\,\hbar$ at five orientations, with $L_z = +2\hbar, +\hbar, 0, -\hbar, -2\hbar$
for ml = +2, +1, 0, −1, −2.]
This behavior represents a curious aspect of quantum mechanics called spatial
quantization: only certain orientations of angular momentum vectors are
allowed. The number of these orientations is equal to 2l + 1 (the number of
different possible ml values) and the magnitudes of their successive z components
always differ by $\hbar$. For example, an angular momentum state with l = 1 can have
ml values of +1, 0, or −1 (corresponding to z components $L_z = +\hbar, 0, -\hbar$) and
thus $\cos\theta = +1/\sqrt{2}$, $0$, or $-1/\sqrt{2}$. The $\vec{L}$ vector in this case can have one of
only three possible orientations relative to the z axis, corresponding to angles of
45°, 90°, or 135°. This is in contrast to a classical angular momentum vector, which
can have any possible orientation in space; that is, the angle between a classical
angular momentum vector and the z axis can take any value between 0 and 180°.
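As a rough illustration of spatial quantization, the allowed polar angles can be tabulated directly from Eq. 7.7. The helper name below is hypothetical, and the l = 0 case is treated as having no defined direction because the vector has zero length.

```python
import math

def allowed_polar_angles(l):
    """Polar angles (degrees) from Eq. 7.7 for m_l = +l down to -l."""
    if l == 0:
        return []  # |L| = 0, so no direction is defined
    length = math.sqrt(l * (l + 1))  # |L| in units of hbar
    return [math.degrees(math.acos(m / length)) for m in range(l, -l - 1, -1)]

print([round(a, 1) for a in allowed_polar_angles(1)])
print([round(a, 1) for a in allowed_polar_angles(2)])
```

For l = 1 this reproduces the three angles 45°, 90°, and 135° quoted above. For l = 2 it gives five angles, none of which is 0° or 180°.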
The Angular Momentum Uncertainty Relationship
In quantum mechanics, the maximum amount of permitted information about the
angular momentum vector is its length (given by Eq. 7.5) and its z component
(given by Eq. 7.6). Because the complete description of a vector requires three
numbers, we are always missing some information about the angular momentum
of a quantum state. If we specify $|\vec{L}|$ and $L_z$ exactly, then we have no information
about the other components of $\vec{L}$ ($L_x$ and $L_y$). Any possible outcome of a
measurement of $L_x$ or $L_y$ can therefore occur (as long as $|\vec{L}|^2 = L_x^2 + L_y^2 + L_z^2$).
In graphic terms, we can imagine that the tip of the $\vec{L}$ vector rotates or precesses
about the z axis so that $L_z$ remains fixed but $L_x$ and $L_y$ are undetermined, as in
Figure 7.4. This rotation cannot be directly measured; all we can observe is the
“smeared out” distribution of values of $L_x$ and $L_y$.
FIGURE 7.4 The vector $\vec{L}$ precesses rapidly about the z axis, so that $L_z$
stays constant, but $L_x$ and $L_y$ are indeterminate.
There is thus an uncertainty or indeterminacy in specifying $\vec{L}$ that is summarized
by another form of the uncertainty principle:
$$\Delta L_z\,\Delta\phi \geq \hbar \tag{7.8}$$
where φ is the azimuthal angle shown in Figure 7.4. If we know $L_z$ exactly
($\Delta L_z = 0$), then we have no knowledge at all of the angle φ; all values are
equally probable. This is equivalent to saying that we know nothing at all about
$L_x$ and $L_y$; whenever one component of $\vec{L}$ is determined, the other components
are completely undetermined.
On the other hand, if we try to construct an angular momentum state in which a
different component—for example, Lx —is completely specified (so that φ would
be known), the state becomes a mixture or superposition of different Lz values. In
effect, we can reduce the uncertainty in φ only at the expense of increasing the
uncertainty in Lz . This is exactly the same type of behavior that was described by
the other forms of the uncertainty principle; for example, reducing the uncertainty
in x is always accompanied by an increase in the uncertainty in px .
From this discussion you can see why the length of the angular momentum is
defined according to Eq. 7.5 and why, for example, we could not have simply
defined the length as $|\vec{L}| = l\hbar$. If this were possible, then when ml had its
maximum value (ml = +l), we would have $L_z = m_l\hbar = l\hbar$; the length of the
vector would then be equal to its z component, and so it must lie along the
z axis with $L_x = L_y = 0$. However, this simultaneous exact knowledge of all
three components of $\vec{L}$ violates the angular momentum form of the uncertainty
principle, and therefore this situation is not permitted to occur. It is therefore
necessary for the length of $\vec{L}$ to be greater than $l\hbar$.
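The argument can be made quantitative with one line of algebra. As a short worked step using only Eqs. 7.5 and 7.6, consider the state with the maximum projection, ml = +l:

```latex
% State with m_l = +l: the part of |L|^2 left over for L_x and L_y
L_x^2 + L_y^2 = |\vec{L}|^2 - L_z^2
              = l(l+1)\hbar^2 - l^2\hbar^2
              = l\,\hbar^2 > 0 \qquad (l \geq 1)
```

Even in the state of maximum $L_z$, the perpendicular components cannot both vanish, which is equivalent to the statement that $\sqrt{l(l+1)}\,\hbar > l\hbar$ for every $l \geq 1$.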
7.3 THE HYDROGEN ATOM WAVE FUNCTIONS
To find the complete spatial description of the electron in a hydrogen atom,
we must obtain three-dimensional wave functions. The Schrödinger equation in
three-dimensional Cartesian coordinates has the following form:
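$$-\frac{\hbar^2}{2m}\left(\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2}\right) + U(x, y, z)\,\psi(x, y, z) = E\,\psi(x, y, z) \tag{7.9}$$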
where ψ is a function of x, y, and z. The usual procedure for solving a partial
differential equation of this type is to separate the variables by replacing a function
of three variables with the product of three functions of one variable, for example,
$\psi(x, y, z) = X(x)Y(y)Z(z)$. However, the Coulomb potential energy (Eq. 7.1)
written in Cartesian coordinates, $U(x, y, z) = -e^2/\left(4\pi\varepsilon_0\sqrt{x^2 + y^2 + z^2}\right)$, does not
lead to a separable solution.
For this calculation, it is more convenient to work in spherical polar coordinates
(r, θ, φ) instead of Cartesian coordinates (x, y, z). The variables of spherical
polar coordinates are illustrated in Figure 7.5. This simplification in the solution
is at the expense of an increased complexity of the Schrödinger equation, which
becomes:
$$-\frac{\hbar^2}{2m}\left[\frac{\partial^2\psi}{\partial r^2} + \frac{2}{r}\frac{\partial\psi}{\partial r} + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\psi}{\partial\phi^2}\right] + U(r)\,\psi(r, \theta, \phi) = E\,\psi(r, \theta, \phi) \tag{7.10}$$
where now ψ is a function of the spherical polar coordinates r, θ, and φ. When
the potential energy depends only on r (and not on θ or φ), as is the case for the
Coulomb potential energy, we can find solutions that are separable and can be
factored as
$$\psi(r, \theta, \phi) = R(r)\,\Theta(\theta)\,\Phi(\phi) \tag{7.11}$$
where the radial function $R(r)$, the polar function $\Theta(\theta)$, and the azimuthal
function $\Phi(\phi)$ are each functions of a single variable. This procedure gives three
differential equations, each of a single variable (r, θ, φ).
FIGURE 7.5 Spherical polar coordinates for the hydrogen atom. The proton is at the origin and the electron is
at a radius r, in a direction determined by the polar angle θ and the azimuthal angle φ.
The quantum state of a particle that moves in a potential energy that depends
only on r can be described by angular momentum quantum numbers l and ml.
The polar and azimuthal solutions are given by combinations of standard trigonometric functions. The remaining radial function is then obtained from solving the
radial equation:
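$$-\frac{\hbar^2}{2m}\left(\frac{d^2R}{dr^2} + \frac{2}{r}\frac{dR}{dr}\right) + \left(-\frac{e^2}{4\pi\varepsilon_0 r} + \frac{l(l+1)\hbar^2}{2mr^2}\right)R(r) = E\,R(r) \tag{7.12}$$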
The mass that appears in this equation is the reduced mass of the proton-electron
system defined in Eq. 6.44.
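One way to build confidence in Eq. 7.12 is to check numerically that a known radial function satisfies it. The sketch below is only a sanity check under stated assumptions: it works in atomic units (ħ = m = e²/4πε₀ = 1, so that a₀ = 1 and Eq. 7.13 becomes E = −1/2n²), it approximates derivatives by central differences, and it uses the n = 1, l = 0 and n = 2, l = 1 radial functions listed in Table 7.1. The function names are hypothetical.

```python
import math

def radial_residual(R, l, E, r, h=1e-4):
    """Left side minus right side of Eq. 7.12 at radius r, in assumed
    atomic units (hbar = m = e^2/(4 pi eps0) = 1, so a0 = 1)."""
    d1 = (R(r + h) - R(r - h)) / (2 * h)           # dR/dr, central difference
    d2 = (R(r + h) - 2 * R(r) + R(r - h)) / h**2   # d^2R/dr^2, central difference
    kinetic = -0.5 * (d2 + 2 * d1 / r)
    potential = (-1 / r + l * (l + 1) / (2 * r**2)) * R(r)
    return kinetic + potential - E * R(r)

def R10(r):
    """Radial function for n = 1, l = 0 (Table 7.1) with a0 = 1."""
    return 2 * math.exp(-r)

def R21(r):
    """Radial function for n = 2, l = 1 (Table 7.1) with a0 = 1."""
    return r * math.exp(-r / 2) / (math.sqrt(3) * 2**1.5)

for r in (0.5, 1.0, 3.0):
    # Both residuals should be zero up to finite-difference error
    print(r, radial_residual(R10, 0, -0.5, r), radial_residual(R21, 1, -0.125, r))
```

The residuals vanish only when the energy supplied matches the level (−1/2 for n = 1, −1/8 for n = 2 in these units); any other trial energy leaves a residual proportional to R(r).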
Quantum Numbers and Wave Functions
When we solve a three-dimensional equation such as the Schrödinger equation,
three parameters emerge in a natural way as indices or labels for the solutions, just
as the single index n emerged from our solution of the one-dimensional infinite
well in Section 5.4. These indices are the three quantum numbers that label the
solutions. The three quantum numbers that emerge from the solutions and their
allowed values are:
n     principal quantum number            1, 2, 3, . . .
l     angular momentum quantum number     0, 1, 2, . . . , n − 1
ml    magnetic quantum number             0, ±1, ±2, . . . , ±l
The principal quantum number n is identical to the quantum number n that we
obtained in the Bohr model. It determines the quantized energy levels:
$$E_n = -\frac{me^4}{32\pi^2\varepsilon_0^2\hbar^2}\,\frac{1}{n^2} \tag{7.13}$$
which is identical to Eq. 6.30. Note that the energy depends only on n and not
on the other quantum numbers l or ml . The permitted values of the angular
momentum quantum number l are limited by n (l ranges from 0 to n − 1) and
those of the magnetic quantum number ml are limited by l.
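To see the size of these energies, Eq. 7.13 can be evaluated numerically. The following is a minimal sketch, assuming CODATA values for the constants (they are not given in the text) and using the reduced mass of the proton-electron system; the function name is hypothetical.

```python
import math

# CODATA values in SI units (assumed inputs, not taken from the text)
HBAR = 1.054571817e-34       # J s
E_CHARGE = 1.602176634e-19   # C
EPS0 = 8.8541878128e-12      # F/m
M_E = 9.1093837015e-31       # kg, electron mass
M_P = 1.67262192369e-27      # kg, proton mass

def energy_eV(n):
    """Energy of level n from Eq. 7.13, in electron-volts."""
    m = M_E * M_P / (M_E + M_P)  # reduced mass of the proton-electron system
    e_joule = -m * E_CHARGE**4 / (32 * math.pi**2 * EPS0**2 * HBAR**2 * n**2)
    return e_joule / E_CHARGE

for n in (1, 2, 3):
    print(n, f"{energy_eV(n):.2f} eV")
```

The three values come out close to −13.6 eV, −3.4 eV, and −1.5 eV, and each depends only on n.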
Complete with quantum numbers, the separated solutions of Eq. 7.10 can be
written
$$\psi_{n,l,m_l}(r, \theta, \phi) = R_{n,l}(r)\,\Theta_{l,m_l}(\theta)\,\Phi_{m_l}(\phi) \tag{7.14}$$
The indices (n, l, ml ) are the three quantum numbers that are necessary to describe
the solutions. Wave functions corresponding to some values of the quantum
numbers are shown in Table 7.1. The wave functions are written in terms of the
Bohr radius a0 defined in Eq. 6.29.
For the ground state (n = 1), only l = 0 and ml = 0 are allowed. The complete
set of quantum numbers for the ground state is then (n, l, ml ) = (1, 0, 0), and the
wave function for this state is given in the first line of Table 7.1. The first excited
state (n = 2) can have l = 0 or l = 1. For l = 0, only ml = 0 is allowed. This state
has quantum numbers (2, 0, 0), and its wave function is given in the second line
of Table 7.1. For l = 1, we can have ml = 0 or ±1. There are thus three possible
sets of quantum numbers: (2, 1, 0) and (2, 1, ±1). The wave functions for these
states are given in the third and fourth lines of Table 7.1. The second excited state
(n = 3) can have l = 0 (ml = 0), l = 1 (ml = 0, ±1), or l = 2 (ml = 0, ±1, ±2).
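TABLE 7.1 Some Hydrogen Atom Wave Functions
n   l   ml    R(r)    Θ(θ)    Φ(φ)
1   0   0     $\frac{2}{a_0^{3/2}}\,e^{-r/a_0}$    $\frac{1}{\sqrt{2}}$    $\frac{1}{\sqrt{2\pi}}$
2   0   0     $\frac{1}{(2a_0)^{3/2}}\left(2 - \frac{r}{a_0}\right)e^{-r/2a_0}$    $\frac{1}{\sqrt{2}}$    $\frac{1}{\sqrt{2\pi}}$
2   1   0     $\frac{1}{\sqrt{3}\,(2a_0)^{3/2}}\,\frac{r}{a_0}\,e^{-r/2a_0}$    $\sqrt{\frac{3}{2}}\cos\theta$    $\frac{1}{\sqrt{2\pi}}$
2   1   ±1    $\frac{1}{\sqrt{3}\,(2a_0)^{3/2}}\,\frac{r}{a_0}\,e^{-r/2a_0}$    $\mp\frac{\sqrt{3}}{2}\sin\theta$    $\frac{1}{\sqrt{2\pi}}\,e^{\pm i\phi}$
3   0   0     $\frac{2}{(3a_0)^{3/2}}\left(1 - \frac{2r}{3a_0} + \frac{2r^2}{27a_0^2}\right)e^{-r/3a_0}$    $\frac{1}{\sqrt{2}}$    $\frac{1}{\sqrt{2\pi}}$
3   1   0     $\frac{8}{9\sqrt{2}\,(3a_0)^{3/2}}\left(\frac{r}{a_0} - \frac{r^2}{6a_0^2}\right)e^{-r/3a_0}$    $\sqrt{\frac{3}{2}}\cos\theta$    $\frac{1}{\sqrt{2\pi}}$
3   1   ±1    $\frac{8}{9\sqrt{2}\,(3a_0)^{3/2}}\left(\frac{r}{a_0} - \frac{r^2}{6a_0^2}\right)e^{-r/3a_0}$    $\mp\frac{\sqrt{3}}{2}\sin\theta$    $\frac{1}{\sqrt{2\pi}}\,e^{\pm i\phi}$
3   2   0     $\frac{4}{27\sqrt{10}\,(3a_0)^{3/2}}\,\frac{r^2}{a_0^2}\,e^{-r/3a_0}$    $\sqrt{\frac{5}{8}}\,(3\cos^2\theta - 1)$    $\frac{1}{\sqrt{2\pi}}$
3   2   ±1    $\frac{4}{27\sqrt{10}\,(3a_0)^{3/2}}\,\frac{r^2}{a_0^2}\,e^{-r/3a_0}$    $\mp\sqrt{\frac{15}{4}}\sin\theta\cos\theta$    $\frac{1}{\sqrt{2\pi}}\,e^{\pm i\phi}$
3   2   ±2    $\frac{4}{27\sqrt{10}\,(3a_0)^{3/2}}\,\frac{r^2}{a_0^2}\,e^{-r/3a_0}$    $\frac{\sqrt{15}}{4}\sin^2\theta$    $\frac{1}{\sqrt{2\pi}}\,e^{\pm 2i\phi}$
For the n = 2 level, there are four different possible sets of quantum numbers
and correspondingly four different wave functions. All of these wave functions
correspond to the same energy, so the n = 2 level is degenerate. (Degeneracy was
introduced in Section 5.4.) The n = 3 level is degenerate with nine possible sets
of quantum numbers. In general, the level with principal quantum number n has a
degeneracy equal to $n^2$. Figure 7.6 illustrates the labeling of the first three levels.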
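The counting behind the $n^2$ degeneracy can be checked by listing the allowed sets of quantum numbers explicitly. This is a small illustrative sketch using the allowed ranges of l and ml stated above; the function name is hypothetical.

```python
def quantum_number_sets(n):
    """All allowed (n, l, m_l) for principal quantum number n:
    l = 0, ..., n - 1 and m_l = -l, ..., +l."""
    return [(n, l, ml) for l in range(n) for ml in range(-l, l + 1)]

for n in (1, 2, 3):
    states = quantum_number_sets(n)
    print(n, len(states), states)
```

The count for each n is the sum of 2l + 1 over l = 0 to n − 1, which equals $n^2$: one state for n = 1, four for n = 2, and nine for n = 3.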
If different combinations of quantum numbers have exactly the same energy,
what is the purpose of listing them separately? First, as we discuss in the last
section of this chapter, the levels are not precisely degenerate, but are separated
by a very small energy (about $10^{-5}$ eV). Second, in the study of the transitions
between the levels, we find that the intensities of the individual transitions
depend on the quantum numbers of the particular level from which the transition
originates. Third, and perhaps most important, each of these sets of quantum
numbers corresponds to a very different wave function, and therefore represents
a very different state of motion of the electron. These states have different spatial
probability distributions for locating the electron, and thus can affect many atomic
properties, for example, the way two atoms can form molecular bonds.
FIGURE 7.6 The lower energy levels of hydrogen, labeled with the quantum numbers (n, l, ml). The first excited state is
four-fold degenerate and the second excited state is nine-fold degenerate. [Levels shown: −13.6 eV: (1, 0, 0);
−3.4 eV: (2, 0, 0), (2, 1, 1), (2, 1, 0), (2, 1, −1); −1.5 eV: (3, 0, 0), (3, 1, 1), (3, 1, 0), (3, 1, −1),
(3, 2, 2), (3, 2, 1), (3, 2, 0), (3, 2, −1), (3, 2, −2).]
FIGURE 7.7 The radial wave functions of the n = 1, n = 2, and n = 3 states of hydrogen. The radius coordinate is measured
in units of a0. [Three panels of R(r) versus r: n = 1 (l = 0); n = 2 (l = 0, 1); n = 3 (l = 0, 1, 2).]
The radial wave functions for the states listed in Table 7.1 are plotted in
Figure 7.7. You can readily see the differences in the motion of the electron for
the different states. For example, in the n = 2 level, the l = 0 and l = 1 wave
functions have the same energy but their behavior is very different: the l = 1 wave
function falls to zero at r = 0, but the l = 0 wave function remains nonzero at
r = 0. The l = 0 electron thus has a much greater probability of being found close
to (or even inside) the nucleus, which turns out to play a large role in determining
the rates for certain radioactive decay processes.
Probability Densities
As we learned in Chapter 5, the probability of finding the electron in any spatial
interval is determined by the square of the wave function. For the hydrogen atom,
|ψ(r, θ , φ)|2 gives the volume probability density (probability per unit volume) at
the location (r, θ , φ). To compute the actual probability of finding the electron,
we multiply the probability per unit volume by the volume element dV located at
(r, θ, φ). In spherical polar coordinates (see Figure 7.8) the volume element is
$$dV = r^2\sin\theta\,dr\,d\theta\,d\phi \tag{7.15}$$
and therefore the probability to find the electron in the volume element at that
location is
$$|\psi_{n,l,m_l}(r, \theta, \phi)|^2\,dV = |R_{n,l}(r)|^2\,|\Theta_{l,m_l}(\theta)|^2\,|\Phi_{m_l}(\phi)|^2\,r^2\sin\theta\,dr\,d\theta\,d\phi \tag{7.16}$$
FIGURE 7.8 The volume element in spherical polar coordinates. [The element has sides dr, r dθ, and r sin θ dφ.]
FIGURE 7.9 Representations of $|\psi|^2$ for different sets of quantum numbers. The z axis
is the vertical direction. The diagrams represent surfaces on which the probability has
the same value. [Panels: l = 0, ml = 0; l = 1, ml = 0; l = 1, ml = 1; l = 2, ml = 0;
l = 2, ml = 1; l = 2, ml = 2.]
Some representations of the probability density |ψ(r, θ, φ)|2 are shown in
Figure 7.9. We can regard these illustrations as representing the “smeared out”
distribution of electronic charge in the atom, which results from the uncertainty
in the electron’s location. They also represent the statistical outcomes of a large
number of measurements of the location of the electron in the atom. These spatial
distributions have important consequences for the structure of atoms with many
electrons, which is discussed in Chapter 8, and also for the joining of atoms into
molecules, which is discussed in Chapter 9.
In the next two sections, we will separately examine how the probability
density depends on the radial coordinate and on the angular coordinates.
7.4 RADIAL PROBABILITY DENSITIES
Instead of asking about the complete probability density to locate the electron,
we might want to know the probability to find the electron at a particular distance
from the nucleus, no matter what the values of θ and φ might be. That is, imagine
a thin spherical shell of radius r and thickness dr. What is the probability to find
the electron in the shell between spheres of radius r and r + dr? We define the
radial probability density P(r) so that the probability to find the electron within
that shell is P(r)dr. We can determine the radial probability from the complete
probability (Eq. 7.16) by integrating over the θ and φ coordinates. In effect, this
adds up the probabilities for the volume elements at a given r for all θ and φ.
$$P(r)\,dr = |R_{n,l}(r)|^2\,r^2\,dr \int_0^{\pi} |\Theta_{l,m_l}(\theta)|^2 \sin\theta\,d\theta \int_0^{2\pi} |\Phi_{m_l}(\phi)|^2\,d\phi \tag{7.17}$$
The θ and φ integrals are each equal to unity, because each of the functions $R$, $\Theta$,
and $\Phi$ is individually normalized. Thus the radial probability density is
$$P(r) = r^2\,|R_{n,l}(r)|^2 \tag{7.18}$$
Figure 7.10 shows this function for several of the lowest levels of hydrogen.
Note that, because of the r2 factor, P(r) must be zero at r = 0 even though
R(r) might not. That is, the probability to locate the electron in a spherical shell
always goes to zero as r → 0 because the volume of the shell goes to zero, but
the probability density |ψ|2 may be nonzero at r = 0. Moreover, P(r) and |R(r)|2
convey different information about the electron’s behavior, as you can see by
comparing Figures 7.7 and 7.10. For example, the radial wave function R(r) for
n = 1, l = 0 has its maximum at r = 0, but the radial probability density for that
state has its maximum at r = a0 .
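The two statements about the n = 1, l = 0 state (that P(r) is normalized and that it peaks at r = a0) can be checked numerically. The sketch below is an illustration only: it measures r in units of a0, uses a simple midpoint rule with an assumed upper cutoff of 40a0 in place of infinity, and locates the peak by scanning a grid. The helper names are hypothetical.

```python
import math

def P_1s(r):
    """Radial probability density r^2 |R_{1,0}|^2 (Eq. 7.18), r in units of a0."""
    return r**2 * (2 * math.exp(-r))**2

def integrate(f, a, b, steps=20000):
    """Composite midpoint rule for the integral of f from a to b."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

total = integrate(P_1s, 0.0, 40.0)           # cutoff at 40 a0 stands in for infinity
grid = [i * 0.001 for i in range(1, 10001)]  # r from 0.001 to 10 in units of a0
r_peak = max(grid, key=P_1s)                 # grid point where P(r) is largest

print(total, r_peak)
```

The integral comes out equal to 1 to within the numerical error, and the grid maximum lies at r = a0, even though R(r) itself is largest at r = 0.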
Using the radial probability densities, it is possible to find the average value
of the radial coordinate, that is, the average distance between the proton and
the electron (see Problems 30 and 31). These values are indicated by markers
in Figure 7.10. Notice that the average radial coordinate is about 1.5a0 for the
n = 1 wave function and is much greater, about 5a0 , for both of the n = 2 wave
functions. The average radius is greater still, about 12a0 , for the n = 3 states.
It appears from these graphs that the average radius depends mostly on n and
not very much on l. The principal quantum number n thus determines not only
the energy level of the electron, it also determines to a great extent the average
distance of the electron from the nucleus. As in the Bohr model, this average
radius varies roughly as $n^2$, so that an n = 2 electron is on the average about 4
times farther from the nucleus than an n = 1 electron, an n = 3 electron is about
9 times farther from the nucleus than an n = 1 electron, and so forth.
FIGURE 7.10 The radial probability density P(r) for the n = 1, n = 2, and n = 3 states of hydrogen. The radius coordinate
is measured in units of a0. The markers on the horizontal axis show the values of the average radius $r_{\rm av}$ labeled with the
value of l. [Three panels of P(r) versus r: n = 1 (l = 0); n = 2 (l = 0, 1); n = 3 (l = 0, 1, 2).]
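The average radius is the integral of r P(r) over all r, and it can be estimated numerically from the radial functions of Table 7.1. The following is a minimal sketch for the n = 1 and n = 2 states, assuming r is measured in units of a0 and that a cutoff of 60a0 is large enough to stand in for infinity; the names are hypothetical.

```python
import math

# Radial functions from Table 7.1 with r in units of a0 (a0 = 1)
RADIAL = {
    (1, 0): lambda r: 2 * math.exp(-r),
    (2, 0): lambda r: (2 - r) * math.exp(-r / 2) / 2**1.5,
    (2, 1): lambda r: r * math.exp(-r / 2) / (math.sqrt(3) * 2**1.5),
}

def average_radius(n, l, r_max=60.0, steps=60000):
    """Integral of r * P(r) dr with P(r) = r^2 |R|^2, by the midpoint rule."""
    R = RADIAL[(n, l)]
    h = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += r * r**2 * R(r)**2 * h
    return total

for key in RADIAL:
    print(key, round(average_radius(*key), 3))
```

The results are 1.5a0 for n = 1, and 6a0 (l = 0) and 5a0 (l = 1) for n = 2, so the dependence on l is modest compared with the jump from n = 1 to n = 2.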
Another measure of the location of the electron is its most probable radius,
determined from the location at which P(r) has its maximum value. For each
n, P(r) for the state with l = n − 1 has only a single maximum, which occurs at
the location of the Bohr orbit, r = n2 a0 . The following example illustrates this for
the n = 2 state.
Example 7.5
Prove that the most likely distance from the origin of an
electron in the n = 2, l = 1 state is 4a0.
Solution
In the n = 2, l = 1 level, the radial probability density is
$$P(r) = r^2\,|R_{2,1}(r)|^2 = r^2\,\frac{1}{24a_0^3}\,\frac{r^2}{a_0^2}\,e^{-r/a_0}$$
We wish to find where this function has its maximum; in
the usual fashion, we take the first derivative of P(r) and
set it equal to zero:
$$\frac{dP(r)}{dr} = \frac{1}{24a_0^5}\,\frac{d}{dr}\left(r^4 e^{-r/a_0}\right) = \frac{1}{24a_0^5}\left[4r^3 e^{-r/a_0} + r^4\left(-\frac{1}{a_0}\right)e^{-r/a_0}\right] = 0$$
or
$$\frac{1}{24a_0^5}\,e^{-r/a_0}\left(4r^3 - \frac{r^4}{a_0}\right) = 0$$
The only solution that yields a maximum is r = 4a0.
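The result of Example 7.5 can be cross-checked by a direct numerical search. The sketch below rests on one observation about Table 7.1: for the l = n − 1 states listed there (n = 1, 2, 3), the radial function is proportional to $r^{n-1}e^{-r/na_0}$, so P(r) is proportional to $r^{2n}e^{-2r/na_0}$. The normalization constant is dropped because it does not move the peak, r is in units of a0, and the function name is hypothetical.

```python
import math

def most_probable_radius(n, r_max=30.0, steps=30000):
    """Grid search for the peak of P(r) for the l = n - 1 state (n = 1, 2, 3)."""
    def P(r):
        # Unnormalized r^2 |R_{n,n-1}|^2 with r in units of a0
        return r**(2 * n) * math.exp(-2 * r / n)
    grid = [i * r_max / steps for i in range(1, steps + 1)]
    return max(grid, key=P)

for n in (1, 2, 3):
    print(n, round(most_probable_radius(n), 2))
```

The peaks fall at a0, 4a0, and 9a0, matching the Bohr orbit radii $n^2 a_0$.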
Example 7.6
For the n = 2 states (l = 0 and l = 1), compare the probabilities of the electron being found inside the Bohr radius.
Solution
For the n = 2, l = 0 level,
$P(r)\,dr = r^2\,|R_{2,0}(r)|^2\,dr = r^2\,\frac{1}{8a_0^3}\left(2 - \frac{r}{a_0}\right)^2 e^{-r/a_0}\,dr$.
The total probability of finding the electron between r = 0 and r = a0 is
$$P(0:a_0) = \int_0^{a_0} P(r)\,dr = \frac{1}{8a_0^3}\int_0^{a_0}\left(4r^2 - \frac{4r^3}{a_0} + \frac{r^4}{a_0^2}\right)e^{-r/a_0}\,dr$$
Evaluating the integrals using Eq. 7.4, we obtain
$$P(0:a_0) = 0.034$$
For the n = 2, l = 1 level,
$P(r)\,dr = r^2\,|R_{2,1}(r)|^2\,dr = r^2\,\frac{1}{24a_0^3}\,\frac{r^2}{a_0^2}\,e^{-r/a_0}\,dr$.
The total probability between r = 0 and r = a0 is
$$P(0:a_0) = \int_0^{a_0} P(r)\,dr = \frac{1}{24a_0^3}\int_0^{a_0}\frac{r^4}{a_0^2}\,e^{-r/a_0}\,dr = 0.0037$$
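The two integrals in Example 7.6 can also be evaluated numerically rather than in closed form. This is a minimal sketch, assuming r is measured in units of a0 and using a midpoint rule; the helper names are hypothetical.

```python
import math

def P_2s(r):
    """r^2 |R_{2,0}|^2 with r in units of a0."""
    return r**2 * (2 - r)**2 * math.exp(-r) / 8

def P_2p(r):
    """r^2 |R_{2,1}|^2 with r in units of a0."""
    return r**4 * math.exp(-r) / 24

def integrate(f, a, b, steps=10000):
    """Composite midpoint rule for the integral of f from a to b."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

prob_2s = integrate(P_2s, 0.0, 1.0)  # probability inside a0 for l = 0
prob_2p = integrate(P_2p, 0.0, 1.0)  # probability inside a0 for l = 1
print(round(prob_2s, 4), round(prob_2p, 4), round(prob_2s / prob_2p, 1))
```

The numerical values agree with 0.034 and 0.0037 to the precision quoted, and their ratio is a little over 9.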
The results of Example 7.6 are consistent with the radial probability densities
shown in Figure 7.10—in the l = 0 state, the probability of finding the electron
inside a0 is about 10 times larger than in the l = 1 state, as suggested by the small
peak in the radial probability density for n = 2, l = 0 at small r. There is clearly
more area under the P(r) curve between r = 0 and r = a0 for n = 2, l = 0 than
there is for n = 2, l = 1.
Curiously, Figure 7.10 also shows that for n = 2 the l = 0 radial probability
density is also greater than the l = 1 probability density at large r. Thus the l = 0
electron spends more time close to the nucleus than the l = 1 electron and it also
spends more time farther away. This is a general result that holds for any value
of n: the smaller the l value, the larger is the probability to find the electron both
close to the nucleus and far from the nucleus. The classical planetary orbits of
Figure 7.2 show the same type of behavior—the orbit with L = 0 spends more
time close to the central body and also more time far away from the central body,
compared with the orbits that have larger L.
7.5 ANGULAR PROBABILITY DENSITIES
In this section, we consider the angular part of the probability density, which is
obtained from the squared magnitudes of the angular parts of the wave function:
$$P(\theta, \phi) = |\Theta_{l,m_l}(\theta)\,\Phi_{m_l}(\phi)|^2 \tag{7.19}$$
Figure 7.11 shows the angular probability densities for the l = 0 and l = 1 wave
functions listed in Table 7.1.
Note that all of the probability densities are cylindrically symmetric—there
is no dependence on the azimuthal angle φ. The l = 0 wave function is also
spherically symmetric—that is, the probability density is independent of direction.
The l = 1 probability densities have two distinct shapes. For ml = 0, the
electron is found primarily in two regions of maximum probability along the
positive and negative z axis, while for ml = ±1, the electron is found primarily
near the xy plane. For ml = 0, the electron’s angular momentum vector lies in the
xy plane (Figure 7.3). Classically, the angular momentum vector is perpendicular
to the orbital plane, so it should not be surprising that the electron is most likely
to be found in a location away from the xy plane, that is, along the z axis. For
ml = ±1, the angular momentum vector has its maximum projection along the z
axis; the electron, again orbiting perpendicular to $\vec{L}$, spends most of its time near
the xy plane. These probability densities for locating the electron are consistent
with the information given by the orientation of the angular momentum vector,
and the cylindrical symmetry of the probability densities is consistent with the
uncertainty in the knowledge of the orientation of $\vec{L}$ represented in Figure 7.4.
FIGURE 7.11 The angular dependence of the l = 0 and l = 1 probability densities.
[Panels: l = 0, ml = 0; l = 1, ml = 0; l = 1, ml = ±1.]
Example 7.7
For the n = 2, l = 1 wave functions, find the direction
in space at which the maximum probability occurs when
ml = 0 and when ml = ±1.
Solution
For l = 1, ml = 0 we have
$P(\theta, \phi) = |\Theta_{1,0}(\theta)\,\Phi_0(\phi)|^2 = \frac{3}{4\pi}\cos^2\theta$.
To find the location of the maximum, we set dP/dθ equal to zero:
$$\frac{dP}{d\theta} = \frac{3}{4\pi}\,(-2\cos\theta\sin\theta) = 0$$
There are two solutions to this equation: one for cos θ = 0,
for which θ = π/2, and another for sin θ = 0, which gives
θ = 0 or π. By taking the second derivative, we find that
θ = π/2 leads to a minimum while θ = 0 or π gives the
maximum. There are thus two regions of maximum probability, one along the positive z axis (θ = 0) and another
along the negative z axis (θ = π), as in Figure 7.11.
For l = 1, ml = ±1 the angular probability density is
$P(\theta, \phi) = |\Theta_{1,\pm 1}(\theta)\,\Phi_{\pm 1}(\phi)|^2 = \frac{3}{8\pi}\sin^2\theta$.
We can then find the location of the maximum:
$$\frac{dP}{d\theta} = \frac{3}{4\pi}\,(\sin\theta\cos\theta) = 0$$
Once again there are two solutions: θ = 0, π or θ = π/2.
However, in this case the maximum occurs for θ = π/2
and the probability maximum occurs in the xy plane, as in
Figure 7.11.
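A quick numerical check of Example 7.7 is sketched below. It uses the two l = 1 angular densities derived in the example, confirms that each integrates to 1 over the full solid angle (with dΩ = sin θ dθ dφ), and compares the values along the z axis and in the xy plane. The function names are hypothetical.

```python
import math

def P_angular(ml, theta):
    """Angular probability density for l = 1 (independent of phi)."""
    if ml == 0:
        return 3 / (4 * math.pi) * math.cos(theta)**2
    return 3 / (8 * math.pi) * math.sin(theta)**2  # ml = +1 or -1

def total_probability(ml, steps=20000):
    """Integrate P over the sphere; the phi integral contributes 2*pi."""
    h = math.pi / steps
    theta_sum = sum(
        P_angular(ml, (i + 0.5) * h) * math.sin((i + 0.5) * h) for i in range(steps)
    )
    return 2 * math.pi * h * theta_sum

for ml in (0, 1):
    on_axis = P_angular(ml, 0.0)            # along the +z axis
    in_plane = P_angular(ml, math.pi / 2)   # in the xy plane
    print(ml, round(total_probability(ml), 6), on_axis, in_plane)
```

For ml = 0 the density is largest on the z axis and essentially zero in the xy plane; for ml = ±1 the situation is reversed.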
7.6 INTRINSIC SPIN
One way of observing spatial quantization is to place the atom in an externally
applied magnetic field. From the interaction between the magnetic field and the
magnetic dipole moment of the atom (which is related to the electron’s orbital
angular momentum), it is possible both to observe the separate components of
$\vec{L}$ and also to determine l by counting the number of z components (which, as
we have seen, is equal to 2l + 1). However, when this experiment is done, a
surprising result emerges that indicates an unexpected property of the electron,
known as intrinsic spin.
Orbital Magnetic Dipole Moments
Figure 7.12 shows a classical magnetic dipole moment, which might be produced
by a current loop or the orbital motion of a charged object. The classical magnetic
dipole moment $\vec{\mu}$ is defined as a vector whose magnitude is equal to the product
of the circulating current and the area enclosed by the orbital loop. The direction
of $\vec{\mu}$ is perpendicular to the plane of the orbit, determined by the right-hand
rule: with the fingers in the direction of the conventional (positive) current, the
thumb indicates the direction of $\vec{\mu}$, as shown in Figure 7.12 for a circulating
negative charge like an electron.
FIGURE 7.12 A circulating negative charge is represented as a current loop.
Because the charge is negative, $\vec{L}$ and $\vec{\mu}$ point in opposite directions.
FIGURE 7.13 According to quantum mechanics, the vectors can be considered to precess
around the z axis, and so we can specify only the z components of $\vec{L}$ and $\vec{\mu}$.
[The figure labels the z components $m_l\hbar$ for $\vec{L}$ and $-m_l\mu_B$ for $\vec{\mu}_L$.]
As we have seen, quantum mechanics forbids exact knowledge of the direction
of $\vec{L}$ and therefore of $\vec{\mu}$. Figure 7.13 suggests the relationship between $\vec{L}$ and
$\vec{\mu}$ that is consistent with quantum mechanics. Only the z components of these
vectors can be specified. Because the electron has a negative charge, $\vec{L}$ and $\vec{\mu}$
have z components of opposite signs.
We can use the Bohr model with a circular orbit to obtain the relationship
between $\vec{L}$ and $\vec{\mu}$, which turns out to be identical with the correct quantum
mechanical result. We regard the circulating electron as a circular loop of current
i = dq/dt = q/T, where q is the charge of the electron (−e) and T is the time for
one circuit around the loop. If the electron moves with speed v = p/m around
a loop of radius r, then T = 2πr/v = 2πrm/p. The magnitude of the magnetic
moment is
$$\mu = iA = \frac{q}{2\pi rm/p}\,\pi r^2 = \frac{q}{2m}\,rp = \frac{q}{2m}\,|\vec{L}| \tag{7.20}$$
with $|\vec{L}| = rp$. Writing Eq. 7.20 in terms of vectors and putting −e for the
electronic charge, we obtain
$$\vec{\mu}_L = -\frac{e}{2m}\,\vec{L} \tag{7.21}$$
The negative sign, which is present because the electron has a negative charge,
indicates that the vectors $\vec{L}$ and $\vec{\mu}_L$ point in opposite directions. The subscript
L on $\vec{\mu}_L$ reminds us that this magnetic moment arises from the orbital angular
momentum $\vec{L}$ of the electron.
The z component of the magnetic moment is
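$$\mu_{L,z} = -\frac{e}{2m}\,L_z = -\frac{e}{2m}\,m_l\hbar = -\frac{e\hbar}{2m}\,m_l = -m_l\,\mu_B \tag{7.22}$$
The quantity $e\hbar/2m$ is defined to be the Bohr magneton
$$\mu_B = \frac{e\hbar}{2m} \tag{7.23}$$
The value of $\mu_B$ is
$$\mu_B = 9.274 \times 10^{-24}\ \text{J/T}$$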
The Bohr magneton is a convenient unit for expressing atomic magnetic moments,
which typically have values of the order of μB .
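The quoted value of the Bohr magneton follows directly from Eq. 7.23. The sketch below evaluates it and the z components of Eq. 7.22, assuming CODATA values for e, ħ, and the electron mass (these constants are not listed in the text); the function names are hypothetical.

```python
# CODATA values in SI units (assumed inputs, not taken from the text)
E_CHARGE = 1.602176634e-19   # C
HBAR = 1.054571817e-34       # J s
M_E = 9.1093837015e-31       # kg

def bohr_magneton():
    """Bohr magneton from Eq. 7.23, in J/T."""
    return E_CHARGE * HBAR / (2 * M_E)

def mu_z(ml):
    """z component of the orbital magnetic moment from Eq. 7.22, in J/T."""
    return -ml * bohr_magneton()

print(f"{bohr_magneton():.4e} J/T")
for ml in (+1, 0, -1):
    print(ml, mu_z(ml))
```

Note the sign: a state with ml = +1 has a negative z component of magnetic moment, because the electron charge is negative.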
A Dipole in an External Field
Before we consider further the behavior of $\vec{\mu}_L$, we discuss the similar behavior of
an electric dipole, which consists of two equal and opposite charges q separated
by a distance r. The electric dipole moment $\vec{p}$ has magnitude qr and points from
the negative charge to the positive charge. As shown in Figure 7.14a, in a uniform
electric field, the vertical forces $\vec{F}_+$ on the positive charge and $\vec{F}_-$ on the negative
charge are of equal magnitude. The dipole experiences a torque that tends to rotate
it into alignment with $\vec{E}$, but the net force on the dipole is zero. Suppose now
that the field is not uniform; for example, the field strength decreases from the
bottom of the figure to the top, as in Figure 7.14b. Now the downward force $\vec{F}_-$
acting on the negative charge is greater than the upward force $\vec{F}_+$ on the positive
charge. There is still a net torque that tends to rotate the dipole, but there is also a
net force that tends to move the dipole, in this case downward. On the other hand,
if we reverse the locations of the two charges (Figure 7.14c), which is equivalent
to reversing the electric dipole moment $\vec{p}$, the upward force $\vec{F}_+$ on the positive
charge is now greater than the downward force $\vec{F}_-$ on the negative charge, so the
net force on the dipole is upward.
FIGURE 7.14 (a) An electric dipole in a uniform electric field $\vec{E}$ experiences no net force. (b) In a nonuniform
electric field (decreasing from the bottom of the figure to the top), the force $\vec{F}_-$ is greater than the force $\vec{F}_+$.
There is a net downward force on the dipole. (c) If the dipole moment is reversed, the net force is in the opposite
direction.
We can state this result in another way that will be more applicable to our
discussion of magnetic dipole moments. Let the field direction define the z axis.
Then dipoles with pz > 0 (as in Figure 7.14b) experience a net negative force and
move in the negative z direction, while dipoles with pz < 0 (as in Figure 7.14c)
experience a net positive force and move in the positive z direction.
A magnetic dipole moment $\vec{\mu}$ behaves in an identical way. (In fact, if we
imagine fictitious N and S poles, the behavior of a magnetic moment would
be described by illustrations similar to Figure 7.14.) A nonuniform magnetic
field acting on the magnetic moments gives an unbalanced force that causes a
displacement. Figure 7.15 illustrates the behavior of magnetic dipole moments
having different orientations in a nonuniform magnetic field. The two different
orientations give net forces in opposite directions: if $\mu_z$ is positive the force on
the dipole is negative, and if $\mu_z$ is negative the force on the dipole is positive.
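The sign rule for the force can be summarized in one relation. As a sketch, assume the field points along the z axis and its strength varies only with z (weakening with increasing z, as in the electric case of Figure 7.14b); the standard potential energy of a dipole in a magnetic field, $U = -\vec{\mu}\cdot\vec{B}$, then gives

```latex
% Assumed geometry: B along z, varying only with z, with dB_z/dz < 0
U = -\vec{\mu}\cdot\vec{B} = -\mu_z B_z
\qquad\Longrightarrow\qquad
F_z = -\frac{\partial U}{\partial z} = \mu_z\,\frac{\partial B_z}{\partial z}
```

With $\partial B_z/\partial z < 0$, a dipole with $\mu_z > 0$ feels a force in the negative z direction and one with $\mu_z < 0$ feels a force in the positive z direction. In a uniform field the derivative vanishes and there is no net force.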
The Stern-Gerlach Experiment
Imagine the following experiment, illustrated schematically in Figure 7.16. A
beam of hydrogen atoms is prepared in the n = 2, l = 1 state. The beam consists
of equal numbers of atoms in the ml = −1, 0, and +1 states. (We assume we can
do the experiment so quickly that the n = 2 state doesn’t decay to the n = 1 state.
In practice this may not be possible.) The beam passes through a region in which
there is a nonuniform magnetic field. The atoms with ml = +1 (μL,z = −μB )
experience a net upward force and are deflected upward, while the atoms with
ml = −1 ($\mu_{L,z} = +\mu_B$) are deflected downward. The atoms with ml = 0 are
undeflected.
FIGURE 7.15 Two magnetic dipoles in a nonuniform magnetic field. Oppositely directed dipoles experience net
forces in opposite directions.
FIGURE 7.16 Schematic diagram of the Stern-Gerlach experiment. A beam of atoms
passes through a region where there is a nonuniform magnetic field. Atoms with their
magnetic dipole moments in opposite directions experience forces in opposite directions.
[Labeled elements: oven, slit, magnet, and screen; the beams are labeled ml = +1, 0, −1 along the z axis.]
FIGURE 7.17 The results of the Stern-Gerlach experiment. (a) The image of the slit with the field turned
off. (b) With the field on, two images of the slit appear. The small divisions in the scale at the left represent
0.05 mm. [Source: W. Gerlach and O. Stern, Zeitschrift für Physik 9, 349 (1922)]
After passing through the field, the beam strikes a screen where it makes a
visible image. When the field is off, we expect to see one image of the slit in the
center of the screen, because there is no deflection at all. When the field is on, we
expect three images of the slit on the screen—one in the center (corresponding to
ml = 0), one above the center (ml = +1), and one below the center (ml = −1).
If the atom were in the ground state (l = 0), we would expect to see one image on the
screen whether the field was off or on (recall that an ml = 0 atom is not deflected).
If we had prepared the beam in a state with l = 2, we would see five images with
the field on. The number of images that appears is just the number of different ml
values, which is equal to 2l + 1. With the possible values for l of 0, 1, 2, 3, . . .,
it follows that 2l + 1 has the values 1, 3, 5, 7, . . .; that is, we should always see
an odd number of images on the screen. However, if we were actually to perform
the experiment with hydrogen in the l = 1 state, we would find not three but
six images on the screen! Even more confusing, if we did the experiment with
hydrogen in the l = 0 state, we would find not one but two images on the screen,
one representing an upward deflection and one a downward deflection! In the
l = 0 state, the vector L has length zero, and so we expect that there is no magnetic
moment for the magnetic field to deflect. We observe this not to be true: even
when l = 0, the atom still has a magnetic moment, in contradiction to Eq. 7.21.
The first experiment of this type was done by O. Stern and W. Gerlach in 1921.
They used a beam of silver atoms; although the electronic structure of silver is
more complicated than that of hydrogen (as we discuss in Chapter 8), the same
basic principle applies—the silver atom must have l = 0, 1, 2, 3, . . ., and so an odd
number of images is expected to appear on the screen. In fact, they observed the
beam to split into two components, producing two images of the slit on the screen
(see Figure 7.17).
The observation of separated images was the first conclusive evidence of spatial
quantization; classical magnetic moments would have all possible orientations and
would make a continuous smeared-out pattern on the screen, but the observation
of a number of discrete images on the screen means that the atomic magnetic
moments can take only certain discrete orientations in space. These correspond to
the discrete orientations of the magnetic moment (or, equivalently, of the angular
momentum).
However, the number of discrete images on the screen does not agree with
our expectations that it be an odd number. We expect 2l + 1 images, so for
two images we should have l = 1/2, which is not permitted by the Schrödinger
equation. We can resolve this dilemma if there is another contribution to the
angular momentum of the atom, the intrinsic angular momentum of the electron.
An electron in an atom has two kinds of angular momentum, somewhat like the
Earth as it both orbits the Sun and rotates on its axis. The electron has an orbital
angular momentum L, which characterizes the motion of the electron about the
nucleus, and an intrinsic angular momentum S, which behaves as if the electron
were spinning about its axis. For this reason, S is usually called the intrinsic spin.
(However, it is not correct to use the classical analogy to think of the electron as a
tiny ball of charge spinning about an axis, because the electron is a point particle
with no physical size.) The idea of electron spin was proposed by S. A. Goudsmit
and G. E. Uhlenbeck in 1925, and P. A. M. Dirac showed in 1928 that relativistic
quantum theory for the electron gives the electron spin directly as an additional
quantum number.
In order to explain the result of the Stern-Gerlach experiment, we must assign
to the electron an intrinsic spin quantum number s of 1/2. The intrinsic spin
behaves much like the orbital angular momentum; there is the quantum number
s (which we can regard as a label arising from the mathematics), the angular
momentum vector S, a z component Sz, an associated magnetic moment μS, and a
spin magnetic quantum number ms. Figure 7.18 illustrates the vector properties of
S, and Table 7.2 compares the properties of orbital and spin angular momentum
for electrons in atoms.
The inclusion of spin gives a direct explanation for the Stern-Gerlach experiment. The outermost electron in a silver atom occupies a state with l = 0.
(The other electrons do not contribute to the magnetic properties of the atom.)
The magnetic behavior is therefore due entirely to the spin magnetic moment,
which has only two possible orientations in the magnetic field (corresponding to
ms = ± 1/2) and thus gives the two beams observed emerging from the magnet.
Every fundamental particle has a characteristic intrinsic spin and a corresponding spin magnetic moment. For example, the proton and neutron also have a spin
quantum number of 1/2. The photon has a spin quantum number of 1, while the pi
meson (pion) has s = 0.
FIGURE 7.18 The spin angular momentum of an electron and the spatial
orientation of the spin angular momentum vector. The vector S has length
|S| = √(3/4) ħ and z component Sz = +ħ/2 or Sz = −ħ/2.
TABLE 7.2 Orbital and Spin Angular Momentum of Electrons in Atoms
                           Orbital                        Spin
Quantum number             l = 0, 1, 2, . . .             s = 1/2
Length of vector           |L| = √(l(l + 1)) ħ            |S| = √(s(s + 1)) ħ = √(3/4) ħ
z component                Lz = ml ħ                      Sz = ms ħ
Magnetic quantum number    ml = 0, ±1, ±2, . . . , ±l     ms = ±1/2
Magnetic moment            μL = −(e/2m)L                  μS = −(e/m)S
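The vector lengths and z components in Table 7.2 can be checked numerically. The following is a minimal sketch, not part of the original text; the function names are illustrative assumptions, and all quantities are expressed in units of ħ. It applies the same rule, cos θ = m/√(q(q + 1)), to either the orbital (q = l) or the spin (q = s) angular momentum.

```python
import math

def vector_length(q):
    """Length of an angular momentum vector with quantum number q, in units of hbar."""
    return math.sqrt(q * (q + 1))

def z_components(q):
    """Allowed z components in units of hbar: -q, -q+1, ..., +q (2q + 1 values)."""
    count = int(round(2 * q)) + 1
    return [-q + k for k in range(count)]

def angles_deg(q):
    """Angles between the vector and the z axis, from cos(theta) = m / sqrt(q(q+1)).

    Returns an empty list for q = 0, where the vector has zero length
    and no direction is defined.
    """
    length = vector_length(q)
    if length == 0:
        return []
    return [math.degrees(math.acos(m / length)) for m in z_components(q)]

s = 0.5  # electron spin quantum number
print(vector_length(s))   # sqrt(3/4), about 0.866
print(z_components(s))    # [-0.5, 0.5]
print(angles_deg(s))      # about 125.3 and 54.7 degrees
```

The two allowed orientations of S correspond to the two beams seen in the Stern-Gerlach experiment; calling the same functions with q = 1 reproduces the three orientations of an l = 1 orbital angular momentum vector.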
Example 7.8
In a Stern-Gerlach type of experiment, the magnetic field varies with distance in
the z direction according to dBz/dz = 1.4 T/mm. The silver atoms travel a distance
x = 3.5 cm through the magnet. The most probable speed of the atoms emerging
from the oven is v = 750 m/s. Find the separation of the two beams as they leave
the magnet. The mass of a silver atom is 1.8 × 10⁻²⁵ kg, and its magnetic moment
is about 1 Bohr magneton.
Solution
The potential energy of the magnetic moment in the magnetic field is
$$U = -\vec{\mu}\cdot\vec{B} = -\mu_z B_z$$
because the field along the central axis of the magnet has only a z component.
The force on the atom can be found from the potential energy according to
$$F_z = -\frac{dU}{dz} = \mu_z \frac{dB_z}{dz}$$
The acceleration of a silver atom of mass m as it passes through the magnet is
$$a = \frac{F_z}{m} = \frac{\mu_z\,(dB_z/dz)}{m}$$
The vertical deflection Δz of either beam can be found from Δz = ½at², where t,
the time to traverse the magnet, equals x/v. Each beam is deflected by this
amount, so the net separation d is 2Δz, or
$$d = \frac{\mu_z\,(dB_z/dz)\,x^2}{m v^2} = \frac{(9.27\times10^{-24}\ \mathrm{J/T})(1.4\times10^{3}\ \mathrm{T/m})(3.5\times10^{-2}\ \mathrm{m})^2}{(1.8\times10^{-25}\ \mathrm{kg})(750\ \mathrm{m/s})^2} = 1.6\times10^{-4}\ \mathrm{m} = 0.16\ \mathrm{mm}$$
This is consistent with the separation that can be read from the scale in
Figure 7.17.
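The arithmetic of Example 7.8 can be reproduced with a few lines of code. This is a sketch under the same assumptions as the example (constant field gradient, every atom moving at the most probable speed, magnetic moment of exactly one Bohr magneton); the function name `beam_separation` is an illustrative choice, not notation from the text.

```python
MU_B = 9.27e-24  # J/T, Bohr magneton (value used in Example 7.8)

def beam_separation(mu_z, dBdz, x, m, v):
    """Separation of the two beams at the exit of the magnet, in meters.

    mu_z : z component of the magnetic moment (J/T)
    dBdz : field gradient dBz/dz (T/m)
    x    : length of the magnet region (m)
    m    : mass of one atom (kg)
    v    : speed of the atoms (m/s)
    """
    a = mu_z * dBdz / m        # acceleration from Fz = mu_z * dBz/dz
    t = x / v                  # time to traverse the magnet
    dz = 0.5 * a * t**2        # deflection of one beam
    return 2 * dz              # the two beams deflect in opposite directions

d = beam_separation(mu_z=MU_B, dBdz=1.4e3, x=3.5e-2, m=1.8e-25, v=750.0)
print(f"{d * 1e3:.2f} mm")     # 0.16 mm
```

Because d varies as x²/v², the separation is quite sensitive to the length of the magnet and to the spread of speeds in the beam, which is one reason the observed images are broadened.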
7.7 ENERGY LEVELS AND SPECTROSCOPIC NOTATION
We previously described all of the possible electronic states in hydrogen by three
quantum numbers (n, l, ml ), but as we have seen, a fourth property of the electron,
the intrinsic angular momentum or spin, requires the introduction of a fourth
quantum number. We don’t need to specify the spin s, because it is always 1/2
(we regard it as a fundamental property of the electron, like its electric charge or
its mass), but we must specify the value of the quantum number ms (+ 1/2 or − 1/2),
which tells us about the z component of the spin. Thus the complete description of
the state of an electron in an atom requires the four quantum numbers (n, l, ml , ms ).
For example, the ground state of hydrogen was previously labeled as (n, l, ml ) =
(1, 0, 0). With the addition of ms , this would become either (1, 0, 0, + 1/2)
or (1, 0, 0, − 1/2). The degeneracy of the ground state is now 2. The first
excited state would have eight possible labels: (2, 0, 0, + 1/2), (2, 0, 0, − 1/2),
(2, 1, +1, + 1/2), (2, 1, +1, − 1/2), (2, 1, 0, + 1/2), (2, 1, 0, − 1/2), (2, 1, −1, + 1/2),
and (2, 1, −1, − 1/2). There are now two possible labels for each previous single
label (each n, l, ml becomes n, l, ml, + 1/2 and n, l, ml, − 1/2), so the degeneracy of
each level is 2n² instead of n².
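The counting of labels can be verified by listing them explicitly. The sketch below is illustrative only (the function name `hydrogen_states` is an assumption); it enumerates every allowed set (n, l, ml, ms) for a given n and compares the count with 2n².

```python
from fractions import Fraction

def hydrogen_states(n):
    """All allowed (n, l, ml, ms) labels for principal quantum number n."""
    half = Fraction(1, 2)
    return [(n, l, ml, ms)
            for l in range(n)                  # l = 0, 1, ..., n - 1
            for ml in range(l, -l - 1, -1)     # ml = +l, ..., -l
            for ms in (half, -half)]           # ms = +1/2, -1/2

for n in (1, 2, 3):
    states = hydrogen_states(n)
    print(n, len(states), 2 * n**2)            # the last two numbers agree
```

For n = 2 the list contains the eight labels written out above, in the same order.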
It is important to know the direction (z component) of the angular momentum
vectors when an atom is in a magnetic field, but for most other applications the
values of ml and ms are of no significance, and it is cumbersome to write them
each time we wish to refer to a certain level of an atom. We therefore use a
different notation, known as spectroscopic notation, to label the levels. In this
system we use letters to stand for the different l values: for l = 0, we use the letter
s (do not confuse this with the quantum number s), for l = 1, we use the letter p,
and so on. The complete notation is as follows:
Value of l     0   1   2   3   4   5   6
Designation    s   p   d   f   g   h   i
(The first four letters stand for sharp, principal, diffuse, and fundamental, which
were terms used to describe atomic spectra before atomic theory was developed.)
In spectroscopic notation, the ground state of hydrogen is labeled 1s, where the
value n = 1 is specified before the s. Figure 7.19 illustrates the labeling of the
hydrogen atom levels in this notation.
Also shown in Figure 7.19 are arrows representing some different photons that
can be emitted when the atom makes a transition from one state to a lower state.
Some of the missing arrows (such as 4d to 3s) would represent transitions that are
not allowed to occur. By solving the Schrödinger equation and using the solutions
to compute transition probabilities, we find that the transitions most likely to
occur are those that change l by one unit. This restriction is called a selection rule,
and for atomic transitions the selection rule is
$$\Delta l = \pm 1 \tag{7.24}$$
FIGURE 7.19 A partial energy level diagram of hydrogen, showing the
spectroscopic notation of the levels (1s; 2s, 2p; 3s, 3p, 3d; 4s, 4p, 4d, 4f)
and some of the transitions that satisfy the Δl = ±1 selection rule.
For example, the 3s level cannot emit a photon in a transition to the 2s level
(Δl = 0), but rather must go to the 2p level (Δl = 1). There is no selection rule
for n, so the 3p level can go to 2s or 1s (but not to 2p).
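The selection rule is easy to apply mechanically. As a rough sketch (the helper names are illustrative assumptions), the code below lists the lower levels that a given level can reach in a single photon emission. It considers only levels with smaller n, because in this treatment all levels with the same n have the same energy.

```python
LETTERS = "spdfghi"  # spectroscopic designations for l = 0, 1, ..., 6

def label(n, l):
    """Spectroscopic label such as '3p' for n = 3, l = 1."""
    return f"{n}{LETTERS[l]}"

def allowed_lower_levels(n, l):
    """Levels with smaller n that satisfy the selection rule delta l = +1 or -1."""
    allowed = []
    for n_low in range(1, n):
        for l_low in range(n_low):         # l_low = 0, ..., n_low - 1
            if abs(l_low - l) == 1:
                allowed.append(label(n_low, l_low))
    return allowed

print(allowed_lower_levels(3, 0))  # ['2p']
print(allowed_lower_levels(3, 1))  # ['1s', '2s']
```

The two printed results match the examples in the text: 3s can go only to 2p, while 3p can go to 1s or 2s but not to 2p.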
∗7.8 THE ZEEMAN EFFECT
Consider for the moment a hypothetical (and less interesting) world in which
the electron has no spin, and therefore no spin magnetic moment. Suppose we
prepared a hydrogen atom in a 2p (l = 1) level and placed it in an external
uniform magnetic field B (supplied by a laboratory electromagnet, for example).
The magnetic moment μL associated with the orbital angular momentum then
interacts with the field, and the energy associated with this interaction is
$$U = -\vec{\mu}_L \cdot \vec{B} \tag{7.25}$$
That is, magnetic moments aligned in the direction of the field have less energy
than those aligned oppositely to the field. Using Eq. 7.22 for the z component of
the magnetic moment (assuming that the field is in the z direction), we have
U = −μL,z B = ml μB B
(7.26)
in terms of the Bohr magneton μB defined in Eq. 7.23. In the absence of a
magnetic field, the 2p level has a certain energy E0 (−3.4 eV). When the field
is turned on, the energy becomes E0 + U = E0 + ml μB B; that is, there are now
three different possible energies for the level, depending on the value of ml .
Figure 7.20 illustrates this situation.
∗ This is an optional section that may be skipped without loss of continuity.
FIGURE 7.20 The splitting of an l = 1 level in an external magnetic field.
(The effects of the electron’s spin angular momentum are ignored.) The
energy in a magnetic field is different for different values of ml: with the field
on, the ml = +1, 0, and −1 sublevels are separated by μB B.
Now suppose the atom emits a photon in a transition from the 2p state to the
1s ground state. In the absence of the magnetic field, a single photon is emitted
with an energy of 10.2 eV and a corresponding wavelength of 122 nm. When
the magnetic field is present, three photons can be emitted, with energies of
10.2 eV + μB B, 10.2 eV, and 10.2 eV − μB B. To determine how a small change
in energy ΔE affects the wavelength, we differentiate the expression E = hc/λ
and obtain
$$dE = -\frac{hc}{\lambda^2}\,d\lambda \tag{7.27}$$
Replacing the differentials with small differences, taking absolute magnitudes,
and solving for Δλ gives
$$\Delta\lambda = \frac{\lambda^2}{hc}\,\Delta E \tag{7.28}$$
where ΔE is the energy splitting between the levels when the field is on
(ΔE = μB B). Figure 7.21 illustrates the three transitions, and shows an example
of the result of a measurement of the emitted wavelengths.
FIGURE 7.21 The normal Zeeman effect. When the field is turned on, the
original wavelength λ becomes three separate wavelengths, λ − Δλ, λ, and
λ + Δλ.
In analyzing transitions between different ml states, often we need to use a
second selection rule: the only transitions that occur are those that change ml by
0, +1, or −1:
$$\Delta m_l = 0, \pm 1 \tag{7.29}$$
Changes in ml of two or more are not permitted.
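To see how this rule limits the number of spectral lines, we can enumerate the transitions between the ml sublevels of two levels. The sketch below assumes, as in the spinless picture above, that each sublevel is shifted by ml μB B, so the photon energy is shifted by (ml,upper − ml,lower) μB B. The function name `zeeman_shifts` is an illustrative assumption.

```python
def zeeman_shifts(l_upper, l_lower):
    """Distinct photon energy shifts, in units of mu_B * B, allowed by delta ml = 0, +1, -1.

    Each sublevel is shifted by ml * mu_B * B (spin ignored), so the photon
    energy shift is (ml_upper - ml_lower) * mu_B * B.
    """
    shifts = set()
    for ml_up in range(-l_upper, l_upper + 1):
        for ml_low in range(-l_lower, l_lower + 1):
            if abs(ml_up - ml_low) <= 1:       # selection rule, Eq. 7.29
                shifts.add(ml_up - ml_low)
    return sorted(shifts)

print(zeeman_shifts(1, 0))  # [-1, 0, 1] for the 2p to 1s transition
```

The three shifts correspond to the photon energies 10.2 eV − μB B, 10.2 eV, and 10.2 eV + μB B.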
Example 7.9
Compute the change in wavelength of the 2p → 1s photon when a hydrogen atom is
placed in a magnetic field of 2.00 T.
Solution
The energy of the photon from n = 2 to n = 1 is
E = −13.6 eV (1/2² − 1/1²) = 10.2 eV, and its wavelength
is λ = hc/E = (1240 eV · nm)/(10.2 eV) = 122 nm. The
energy change ΔE of the levels is
$$\Delta E = \mu_B B = (9.27\times10^{-24}\ \mathrm{J/T})(2.00\ \mathrm{T}) = 18.5\times10^{-24}\ \mathrm{J} = 11.6\times10^{-5}\ \mathrm{eV}$$
and so, from Eq. 7.28,
$$\Delta\lambda = \frac{\lambda^2}{hc}\,\Delta E = \frac{(122\ \mathrm{nm})^2\,(11.6\times10^{-5}\ \mathrm{eV})}{1240\ \mathrm{eV\cdot nm}} = 0.00139\ \mathrm{nm}$$
Even for a fairly large magnetic field of 2 T, the change in
wavelength is very small, but it is easily measurable using
an optical spectrometer.
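The calculation of Example 7.9 can be repeated for any field strength with a short script. This is a sketch using the rounded constants of the example (hc = 1240 eV · nm, μB = 9.27 × 10⁻²⁴ J/T); the conversion factor 1.602 × 10⁻¹⁹ J/eV and the function name are assumptions added here.

```python
HC = 1240.0                       # eV*nm, as used in Example 7.9
MU_B_EV = 9.27e-24 / 1.602e-19    # Bohr magneton in eV/T

def zeeman_delta_lambda(wavelength_nm, B):
    """Wavelength shift (nm) of a normal Zeeman component, from Eq. 7.28."""
    delta_E = MU_B_EV * B                     # energy splitting, eV
    return wavelength_nm**2 * delta_E / HC

E_photon = 13.6 * (1 / 1**2 - 1 / 2**2)       # 10.2 eV for n = 2 to n = 1
lam = HC / E_photon                           # about 121.6 nm
shift = zeeman_delta_lambda(lam, 2.00)
print(f"{shift:.5f} nm")                      # 0.00138 nm
```

The result differs from the 0.00139 nm of the example only in the last digit, because the example rounds the wavelength to 122 nm before squaring it.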
The experiment we have just considered is an example of the Zeeman
effect —the splitting of a spectral line with a single wavelength into lines with
several different wavelengths when the emitting atoms are in an externally applied
magnetic field. In the normal Zeeman effect a single spectral line splits into three
components; this occurs only in atoms without spin. (All electrons of course have
spin, unlike the hypothetical spinless electrons we considered; however, in certain
atoms with several electrons, the spins can pair off and cancel, so that the atom
behaves like a spinless one.) When spin is present, we must consider not only
the effect of the orbital magnetic moment but also the spin magnetic moment.
The resulting pattern of level splittings is more complicated, and spectral lines
may split into more than three components. This case is known as the anomalous
Zeeman effect, an example of which is shown in Figure 7.22.
∗7.9 FINE STRUCTURE
A careful inspection of the emission lines of atomic hydrogen shows that many
of them are in fact not single lines but very closely spaced combinations of two
lines. In this section we examine the origin of that effect, known as fine structure.
In this calculation it is more convenient for us to examine the hydrogen atom
from the electron’s frame of reference, in which the proton appears to travel
around the electron, just as the Sun appears to travel around the Earth. For
convenience, we treat this problem in the context of the Bohr model to obtain an
estimate of the effect.
Figure 7.23a shows the atom viewed from the ordinary frame of reference of
the proton. We assume the electron to orbit counterclockwise so that the orbital
angular momentum L is in the z direction, and we also assume that the spin S
(which could point either up or down) is also in the z direction. The same situation
is shown in Figure 7.23b from the viewpoint of the electron, with the proton now
appearing to move in a circular orbit around the electron.
In the electron reference frame the motion of the proton in a circular orbit of
radius r can be considered to be a current loop, which causes a magnetic field B
at the electron, as shown in Figure 7.23c. This magnetic field interacts with the
spin magnetic moment of the electron, μS = −(e/m)S. The interaction energy of
the magnetic moment μS in a magnetic field is
$$U = -\vec{\mu}_S \cdot \vec{B} \tag{7.30}$$
We choose the z direction to be the direction of B; with μS = −(e/m)S, we have
$$U = \frac{e}{m}\,\vec{S}\cdot\vec{B} = \frac{e}{m}\,S_z B \tag{7.31}$$
With Sz = ±½ħ, the energy is
$$U = \pm\frac{e\hbar}{2m}\,B = \pm\mu_B B \tag{7.32}$$
∗ This is an optional section that may be skipped without loss of continuity.
FIGURE 7.22 The anomalous Zeeman effect in sodium. (Top) The
so-called sodium D-lines (labeled D1 and D2), a close-lying doublet of
wavelengths 589.0 and 589.6 nm in the absence of a
magnetic field. (Bottom) Splitting of
the lines into six and four components
in a magnetic field. This image was
photographed by Peter Zeeman in
1897.
FIGURE 7.23 (a) An electron circulates about the proton in a hydrogen atom. (b)
From the point of view of the electron, the proton circulates about the electron.
(c) The apparently circulating proton is represented by the current i and causes a
magnetic field B at the location of the electron.
FIGURE 7.24 The fine-structure splitting in hydrogen. The state with L and
S parallel is slightly higher in energy than the state with L and S antiparallel.
Each state is shifted by μB B, so the two are separated by ΔE.
The situation shown in Figure 7.23 has Sz = +½ħ, and thus U = +μB B. When S
has the opposite orientation, U = −μB B. The effect is to split each level into two,
a higher state with L and S parallel and a lower state with L and S antiparallel, as
shown in Figure 7.24. The energy difference between the states is ΔE = 2μB B.
At this point, the result looks rather similar to that of our previous discussion
of the Zeeman effect, but it is important to note one significant difference: the
magnetic field B in this case is not a field in the laboratory that can be turned on
or off; it is, instead, a field that is always present, produced by the relative motion
between the proton and the electron.
We can use the Bohr model to make a rough estimate of the magnitude of this
energy splitting. A circular loop of radius r carrying current i establishes at its
center a magnetic field B = μ0 i/2r. The current i is the charge carried around the
loop (+e in this case) divided by the time T for one orbit. The time for one orbit
is the distance traveled (2πr) divided by the speed v.
$$B = \frac{\mu_0 i}{2r} = \frac{\mu_0}{2r}\,\frac{e}{T} = \frac{\mu_0}{2r}\,\frac{ev}{2\pi r} \tag{7.33}$$
The energy difference between the states is then
$$\Delta E = 2\mu_B B = \frac{\mu_0 e v}{2\pi r^2}\,\mu_B = \frac{\mu_0 e^2 \hbar^2 n}{4\pi m^2 r^3} \tag{7.34}$$
where the last result is obtained by substituting v = nħ/mr from Bohr’s angular
momentum condition (Eq. 6.26) and μB = eħ/2m from Eq. 7.23. Finally,
substituting from Eq. 6.28 for the radius of the orbits in the Bohr atom, we obtain
$$\Delta E = \frac{\mu_0 e^2 \hbar^2 n}{4\pi m^2}\left(\frac{m e^2}{4\pi\varepsilon_0 \hbar^2}\,\frac{1}{n^2}\right)^{3} = \frac{\mu_0 m e^8}{256\pi^4 \varepsilon_0^3 \hbar^4}\,\frac{1}{n^5} \tag{7.35}$$
We can rewrite this in a somewhat simpler form by recalling that c² = 1/ε0 μ0
and using the dimensionless constant α, known as the fine structure constant,
$$\alpha = \frac{e^2}{4\pi\varepsilon_0 \hbar c} \tag{7.36}$$
which gives
$$\Delta E = mc^2 \alpha^4\,\frac{1}{n^5} \tag{7.37}$$
The value of the fine structure constant is approximately 1/137. For hydrogen in
the n = 2 level, we expect the energy difference between the state with L and S
parallel and the state with L and S antiparallel to be
$$\Delta E = (0.511\ \mathrm{MeV})\left(\frac{1}{137}\right)^{4}\frac{1}{2^5} = 4.53\times10^{-5}\ \mathrm{eV}$$
We can compare this estimate with the experimental value, based on the observed
splitting of the first line of the Lyman series, which gives 4.54 × 10⁻⁵ eV. We see
that in spite of the assumptions we have made, our use of the Bohr model, and our
failure to use the hydrogen wave functions to do this calculation, the agreement
with the experimental value is remarkably good. (In fact, the agreement is so good
as to be embarrassing, for we neglected to consider the important relativistic effect
of the motion of the electron, which contributes to the fine structure about equally
to the spin-orbit interaction discussed in this section. We really should regard this
calculation as an order-of-magnitude estimate, which happens by chance to give
a numerical result close to the observed value.)
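As a numerical check on the algebra, the estimate can be evaluated in two ways: from the compact form of Eq. 7.37 with the rounded values used in the text, and directly from Eq. 7.35 in SI units. The sketch below is illustrative; the SI constants are standard CODATA values supplied here as assumptions (they do not appear in this section), and the function names are hypothetical.

```python
import math

# Rounded values used in the text
MC2_EV = 0.511e6      # electron rest energy, eV
ALPHA = 1 / 137       # fine structure constant (approximate)

# SI constants (CODATA values, assumed here for the cross-check)
MU0 = 1.25663706212e-6     # T*m/A
EPS0 = 8.8541878128e-12    # F/m
HBAR = 1.054571817e-34     # J*s
M_E = 9.1093837015e-31     # kg
E_CH = 1.602176634e-19     # C, also the number of joules in 1 eV

def fine_structure_eq_7_37(n):
    """Estimated splitting in eV from Eq. 7.37: mc^2 alpha^4 / n^5."""
    return MC2_EV * ALPHA**4 / n**5

def fine_structure_eq_7_35(n):
    """The same estimate from Eq. 7.35, evaluated in SI units and converted to eV."""
    joules = MU0 * M_E * E_CH**8 / (256 * math.pi**4 * EPS0**3 * HBAR**4 * n**5)
    return joules / E_CH

print(f"{fine_structure_eq_7_37(2):.3e} eV")   # 4.533e-05 eV
print(f"{fine_structure_eq_7_35(2):.3e} eV")   # close to the value above
```

The two results agree to within about 0.1%, the difference coming from the rounding of α to 1/137 and of mc² to 0.511 MeV.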
Chapter Summary
                                                                                    Section
Orbital angular momentum          |L| = √(l(l + 1)) ħ  (l = 0, 1, 2, . . .)             7.2
Orbital magnetic quantum number   Lz = ml ħ  (ml = 0, ±1, ±2, . . . , ±l)               7.2
Spatial quantization              cos θ = Lz/|L| = ml/√(l(l + 1))                       7.2
Angular momentum uncertainty      ΔLz Δφ ≥ ħ                                            7.2
  relationship
Hydrogen quantum numbers          n = 1, 2, 3, . . .                                    7.3
                                  l = 0, 1, 2, . . . , n − 1
                                  ml = 0, ±1, ±2, . . . , ±l
Hydrogen energy levels            En = −[me⁴/(32π² ε0² ħ²)] (1/n²)                      7.3
Hydrogen wave functions           ψn,l,ml(r, θ, φ) = Rn,l(r) Θl,ml(θ) Φml(φ)            7.3
Radial probability density        P(r) = r² |Rn,l(r)|²                                  7.4
Angular probability density       P(θ, φ) = |Θl,ml(θ) Φml(φ)|²                          7.5
Orbital magnetic dipole moment    μL = −(e/2m)L                                         7.6
Spin magnetic dipole moment       μS = −(e/m)S                                          7.6
Spin angular momentum             |S| = √(s(s + 1)) ħ = √(3/4) ħ  (for s = 1/2)         7.6
Spin magnetic quantum number      Sz = ms ħ  (ms = ±1/2)                                7.6
Spectroscopic notation            s (l = 0), p (l = 1), d (l = 2), f (l = 3), . . .     7.7
Selection rules for photon        Δl = ±1, Δml = 0, ±1                                  7.7, 7.8
  emission
Normal Zeeman effect              Δλ = (λ²/hc) ΔE = (λ²/hc) μB B                        7.8
Fine-structure estimate           ΔE = mc² α⁴/n⁵  (α ≈ 1/137)                           7.9
Questions
1. How does the quantum-mechanical interpretation of the
hydrogen atom differ from the Bohr model?
2. How does a quantized angular momentum vector differ from
a classical angular momentum vector?
3. What are the meanings of the quantum numbers n, l, ml
according to (a) the quantum-mechanical calculation;
(b) the vector model; (c) the Bohr (orbital) model?
4. List the dynamical quantities that are constant for a specific
choice of n and l. List the dynamical quantities that are not
constant. Compare these lists with the Bohr model.
5. How does the orbital angular momentum differ between the
Bohr model and the quantum-mechanical calculation?
6. What does it mean that L precesses about the z axis? Can
we observe the precession?
7. In the Bohr model, we calculated the total energy from
the potential energy and kinetic energy for each orbit. In
the quantum-mechanical calculation, is the potential energy
constant for any set of quantum numbers? Is the kinetic
energy? Is the total energy?
8. What is meant by the term spatial quantization? Is space
really quantized?
9. A deficiency of the Bohr model is the problem of angular momentum conservation in transitions between levels.
Discuss this problem in relation to the quantum-mechanical
angular momentum properties of the atom, especially the
selection rule Eq. 7.24. The photon can be considered to
carry angular momentum ħ.
10. The 2s electron has a greater probability to be close to the
nucleus than the 2p electron and also a greater probability
to be farther away (see Figure 7.10). How is this possible?
11. The probability density ψ*ψ does not depend on φ for the
wave functions listed in Table 7.1. What is the significance
of this?
12. How would the wave functions of Table 7.1 change if the
nuclear charge were Ze instead of e? (Recall how we made
the same change in the Bohr model in Section 6.5.) What
effect would this have on the radial probability densities
P(r)?
13. Can a hydrogen atom in its ground state absorb a photon (of
the proper energy) and end up in the 3d state?
14. Is it correct to think of the electron as a tiny ball of charge
spinning on its axis? Is it useful? Is this situation similar
to using the Bohr model to represent the electron’s orbital
motion?
15. The photon has a spin quantum number of 1, but its spin
magnetic moment is zero. Explain.
16. What are the similarities and differences between Zeeman
splitting and fine-structure splitting?
17. How would the calculated fine structure be different in an
atom with a single electron and a nuclear charge of Ze?
18. Does the fine structure, as we have calculated it, have any
effect on the n = 1 level?
19. How would (a) the Zeeman effect and (b) the fine structure
be different in a muonic hydrogen atom? (See Problems 41
and 42 in Chapter 6.) The muon has the same spin as the
electron, but is 207 times as massive.
20. Even though our calculation of the fine structure was based
on a very simplified model, it does yield a result similar
to the more correct calculation: the fine-structure splitting
decreases as we go to higher excited states. Give at least two
qualitative reasons for this.
Problems
7.1 A One-Dimensional Atom
1. By substituting the wave function $\psi(x) = Axe^{-bx}$ into
Eq. 7.2, show that a solution can be obtained only for
b = 1/a0, and find the ground-state energy.
2. Show that the probability density for the ground-state solution of the one-dimensional Coulomb potential energy has
its maximum at x = a0 .
3. An electron in its ground state is trapped in the one-dimensional Coulomb potential energy. What is the probability to find it in the region between x = 0.99a0 and
x = 1.01a0 ?
7.2 Angular Momentum in the Hydrogen Atom
4. An electron is in an angular momentum state with l = 3.
(a) What is the length of the electron’s angular momentum
vector? (b) How many different possible z components can
the angular momentum vector have? List the possible z
components. (c) What are the values of the angle that the L vector makes
with the z axis?
5. What angles does the L vector make with the z axis when l = 2?
7.3 The Hydrogen Atom Wave Functions
6. List the 16 possible sets of quantum numbers n, l, ml of the
n = 4 level of hydrogen (as in Figure 7.6).
7. (a) What are the possible values of l for n = 6? (b) What are
the possible values of ml for l = 6? (c) What is the smallest
possible value of n for which l can be 4? (d) What is the
smallest possible l that can have a z component of 4ħ?
8. Show that the (1, 0, 0) and (2, 0, 0) wave functions listed in
Table 7.1 are properly normalized.
9. Show by direct substitution that the n = 2, l = 0, ml = 0 and
n = 2, l = 1, ml = 0 wave functions of Table 7.1 are both
solutions of Eq. 7.10 corresponding to the energy of the first
excited state of hydrogen.
10. Show by direct substitution that the wave function corresponding to n = 1, l = 0, ml = 0 is a solution of Eq. 7.10
corresponding to the ground-state energy of hydrogen.
11. Consider a thin spherical shell located between r = 0.49a0
and 0.51a0 . For the n = 2, l = 1 state of hydrogen, find
the probability for the electron to be found in a small volume element that subtends a polar angle of 0.11° and an
azimuthal angle of 0.25° if the center of the volume element is located at: (a) θ = 0, φ = 0; (b) θ = 90°, φ = 0;
(c) θ = 90°, φ = 90°; (d) θ = 45°, φ = 0. Do the calculation
for all possible ml values.
7.4 Radial Probability Densities
12. Show that the radial probability density of the 1s level has
its maximum value at r = a0 .
13. Find the values of the radius where the n = 2, l = 0 radial
probability density has its maximum values.
14. What is the probability of finding an n = 2, l = 1 electron
between a0 and 2a0 ?
15. For a hydrogen atom in the ground state, what is the probability to find the electron between 1.00a0 and 1.01a0 ? (Hint:
It is not necessary to evaluate any integrals to solve this
problem.)
7.5 Angular Probability Densities
16. Find the directions in space where the angular probability
density for the l = 2, ml = ±1 electron in hydrogen has its
maxima and minima.
17. Find the directions in space where the angular probability
density for the l = 2, ml = 0 electron in hydrogen has its
maxima and minima.
7.6 Intrinsic Spin
18. (a) Including the electron spin, what is the degeneracy
of the n = 5 energy level of hydrogen? (b) By adding
up the number of states for each value of l permitted
for n = 5, show that the same degeneracy as part (a) is
obtained.
19. For each l value, the number of possible states is 2(2l + 1).
Show explicitly that the total number of states for each principal quantum number is
$$\sum_{l=0}^{n-1} 2(2l + 1) = 2n^2.$$
This gives the degeneracy of each energy level.
20. Explain why each of the following sets of quantum
numbers (n, l, ml , ms ) is not permitted for hydrogen.
(a) (2, 2, −1, + 1/2) (b) (3, 1, +2, − 1/2) (c) (4, 1, +1, − 3/2)
(d) (2, −1, +1, + 1/2)
7.7 Energy Levels and Spectroscopic Notation
21. List the excited states (in spectroscopic notation) to which
the 4p state can make downward transitions.
22. (a) A hydrogen atom is in an excited 5g state, from which it
makes a series of transitions by emitting photons, ending in
the 1s state. Show, on a diagram similar to Figure 7.19, the
sequence of transitions that can occur. (b) Repeat part (a) if
the atom begins in the 5d state.
23. (a) List in spectroscopic notation all levels with n = 7.
(b) An electron is initially in the state with n = 7, l = 2. List
in spectroscopic notation all lower states to which transitions
are allowed.
7.8 The Zeeman Effect
24. Consider the normal Zeeman effect applied to the 3d to 2p
transition. (a) Sketch an energy-level diagram that shows
the splitting of the 3d and 2p levels in an external magnetic field. Indicate all possible transitions from each ml
state of the 3d level to each ml state of the 2p level. (b)
Which transitions satisfy the Δml = ±1 or 0 selection rule?
(c) Show that there are only three different transition energies
emitted.
25. A collection of hydrogen atoms is placed in a magnetic field
of 3.50 T. Ignoring the effects of electron spin, find the
wavelengths of the three normal Zeeman components (a) of
the 3d to 2p transition; (b) of the 3s to 2p transition.
7.9 Fine Structure
26. Calculate the wavelengths of the components of the first line
of the Lyman series, taking the fine structure of the 2p level
into account.
27. Calculate the energies and wavelengths of the 3d to 2p transition, taking into account the fine structure of both levels.
How many component wavelengths might there be in the
transition?
General Problems
28. Show that the wave function $\psi(x) = A(x + cx^2)e^{-bx}$ gives a
solution to the Schrödinger equation for the one-dimensional
Coulomb potential energy. Evaluate the constants A, b, c, and
find the energy corresponding to this solution.
29. Find the probabilities for the n = 2, l = 0 and n = 2, l = 1
electron states in hydrogen to be further than r = 5a0 from
the nucleus. Which has the greater probability to be far from
the nucleus?
30. The mean or average value of the radius r can be found
according to $r_{\rm av} = \int_0^\infty r P(r)\,dr$. Show that the mean value
of r for the 1s state of hydrogen is (3/2)a0. Why is this greater
than the Bohr radius?
31. Find the value of rav (see Problem 30) for the 2s and 2p
levels.
32. The mean or average value of the potential energy of
the electron in a hydrogen atom can be found from
$U_{\rm av} = \int_0^\infty U(r) P(r)\,dr$. Find Uav in the 1s state and compare
with the potential energy computed with the Bohr model
when n = 1.
33. Suppose the source of atoms in a Stern-Gerlach experiment
were an oven of temperature 1000 K. Assume the magnetic
field gradient to be 10 T/m, and take the length of the magnetic field region and the field-free region between magnet
and screen to be 1 m each. Make any other assumptions
you may need and estimate the separation of the images
observed on the screen.
34. For the 1s, 2s, and 2p states of hydrogen, show that
(r⁻¹)av = 1/(n² a0). This turns out to be a general result
for any state of hydrogen. Based on this result, explain why
the Bohr model gives such a good estimate for the fine-structure splitting as well as for other magnetic effects due
to the circulating electron.
Chapter 8
MANY-ELECTRON ATOMS
This computer-generated drawing shows the structure of an atom of neon, with the electron
probability distributions surrounding the central nucleus. The bright inner sphere represents
the 1s electrons, the dark outer sphere is the 2s electrons, and the lobes are the 2p electrons.
This is a more realistic picture of an atom than the ‘‘planetary’’ view developed in Chapter 6.
Physicists often attack complex problems by trying to separate the more important
parts from the less important. For example, in analyzing the motion of the Earth
in the Solar System, we can start by ignoring all bodies other than the Sun. With
this simplification, we find that the Earth moves about the Sun in an elliptical
orbit. Now we can account for the effect of the Moon, which introduces a slight
“wobble” about the ellipse. Finally, we can introduce the much weaker effect of
the gravitational pull of the other planets.
It is tempting to try to use a similar approach to understand the motion of
electrons in atoms with more than one electron. Unfortunately, we can’t analyze
the motion of an electron in an atom with more than one electron by separating out
the more and less influential forces. For example, in a neutral atom with atomic
number Z, each electron experiences an electrostatic force due to the nucleus with
a charge of +Ze, but it also experiences an electrostatic force due to all the other
electrons with a total charge of −(Z − 1)e. The effect of the nucleus is comparable
to the effect of the other electrons, which can’t be analyzed as a small correction.
We are thus required to consider simultaneously the effect of the nucleus
and each of the other electrons. The problem of the mutual interactions of three
or more objects is an example of what physicists call the many-body problem.
Exact, closed-form solutions to the Schrödinger equation cannot be found for such
problems. The solutions must be obtained numerically using a computer. In this
chapter, we consider an approximate set of energy levels for many-electron atoms,
and we try to understand some of the properties of atoms (chemical, electrical,
magnetic, optical, etc.) based on those energy levels.
8.1 THE PAULI EXCLUSION PRINCIPLE
Wolfgang Pauli (1900–1958, Switzerland). His exclusion principle gave
the basis for understanding atomic
structure. He also contributed to the
development of quantum theory, to
the theory of nuclear beta decay, and
to the understanding of symmetry in
physical laws.
Let’s begin by considering how the Z electrons in an atom might occupy the atomic
energy levels. As a first guess, we might expect that all Z electrons will eventually
cascade down to the lowest energy level, the 1s state. If this were correct, we
would expect the properties of the atom to vary rather smoothly compared with its
neighbors having Z ± 1 electrons. Indeed, certain of the properties of atoms, such
as the energies of the emitted X rays, show this smooth variation. However, other
properties do not vary in this way and thus are not consistent with this model of
all electrons in the same level. For example, neon (with Z = 10) is an inert gas;
it is practically unreactive and does not form chemical compounds under most
conditions. Its neighbors, fluorine (Z = 9) and sodium (Z = 11), are among the
most reactive of the elements and under most conditions will combine with other
substances, sometimes violently. As another example, nickel (Z = 28) is strongly
magnetic (ferromagnetic) and, for a metal, does not have a particularly large
electrical conductivity. Copper (Z = 29) is an excellent electrical conductor but
is not magnetic. Such wide variations in properties between neighboring elements
suggest that it is not correct to assume that all electrons occupy the same energy
level.
The rule that prevents all of the electrons in an atom from falling into the 1s
level was proposed by Wolfgang Pauli in 1925, based on a study of the transitions
that are present, and those that are expected but not present, in the emission
spectra of atoms. Simply stated, the Pauli exclusion principle is as follows:
No two electrons in a single atom can have the same set of quantum numbers
(n, l, ml, ms).
The Pauli principle is the most important rule governing the structure of atoms,
and no study of the properties of atoms can be attempted without a thorough
understanding of this principle.
To illustrate how the Pauli principle works, consider the structure of helium
(Z = 2). The first electron in helium, in the 1s ground state, has quantum numbers
n = 1, l = 0, ml = 0, ms = + 1/2 or − 1/2. The second electron can have the same
n, l, and ml , but it cannot have the same ms , because the exclusion principle would
be violated. Thus if the first 1s electron has ms = + 1/2, the second 1s electron must
have ms = − 1/2. Now consider an atom of lithium (Z = 3). Just as with helium,
the first two electrons will have quantum numbers (n, l, ml , ms ) = (1, 0, 0, + 1/2)
and (1, 0, 0, − 1/2). According to the exclusion principle, the third electron cannot
have the same set of quantum numbers as the first two, so it cannot go into the n =
1 level, because there are only two different sets of quantum numbers available in
the n = 1 level, and both of those sets have already been used. The third electron
must therefore go into one of the n = 2 levels, and experiments indicate that the
2s level is the next available. Without the Pauli principle, lithium would have
three electrons in the 1s level; with the Pauli principle, we expect that lithium has
two electrons in the 1s level and one electron in the 2s level. These two different
possible structures for lithium would give very different physical properties, and
the physical properties of lithium indicate that the structure with one electron in
the 2s level is the correct one.
We can continue this process with beryllium (Z = 4). The fourth electron can
join the third electron in the 2s level, but that now completes the capacity of
the 2s level—one of the electrons might have quantum numbers (n, l, ml , ms ) =
(2, 0, 0, + 1/2) and the other might have (2, 0, 0, − 1/2). There are no other sets of
quantum numbers that an additional electron could have in the 2s level without
duplicating one of the sets that has already been assigned and thus violating the
Pauli principle. When we reach boron, with Z = 5, the fifth electron must go
into a different level—one of the 2p levels. We might therefore expect that the
properties of boron, with a 2p electron, would be different from the properties of
lithium or beryllium, which have only 2s electrons.
It is this process of first using up all of the possible quantum numbers for one
level, and then placing electrons in the next level, that accounts for the variations
in the chemical and physical properties of the elements.
Example 8.1
A certain atom has six electrons in the 3d level. (a) What
is the maximum possible total ml for the six electrons, and
what is the total ms in that configuration? (b) What is the
maximum possible total ms for the six electrons, and what
would be the largest possible total ml in that configuration?
Solution
(a) For a d state l = 2, so the possible ml values are
+2, +1, 0, −1, and −2. At most two electrons can be
assigned ml of +2 according to the Pauli principle (one
with ms = + 1/2 and one with ms = − 1/2). Similarly, two
electrons can be assigned ml of +1 (again, with ms = + 1/2
and ms = − 1/2), and the remaining two electrons can be
assigned to ml of 0. That gives a total ml of +6, with a total
ms of 0.
(b) To maximize ms , we can assign at most five electrons to
ms = + 1/2 (with corresponding ml values of +2, +1, 0, −1,
and −2). The sixth electron cannot also have ms = + 1/2,
because its ml value would be the same as one already
assigned, which would violate the Pauli principle by having two electrons with the same ml and ms labels. The sixth
electron must therefore have ms = − 1/2, giving a total ms of
+2. The first five electrons give a total ml of 0, so the largest
total ml would be obtained by assigning the sixth electron to
ml of +2, giving a total ml of +2.
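The counting in Example 8.1 can be checked by brute force. The following is a minimal sketch, not part of the original example: it lists the ten (ml, ms) states of a d subshell and tries every way of choosing six distinct states, which is exactly the restriction imposed by the Pauli principle. The function names are illustrative choices.

```python
from fractions import Fraction
from itertools import combinations


def subshell_states(l):
    """All (ml, ms) pairs available in a subshell with orbital quantum number l."""
    half = Fraction(1, 2)
    return [(ml, ms) for ml in range(-l, l + 1) for ms in (half, -half)]


def assignments(l, n_electrons):
    """Yield (total ml, total ms) for every Pauli-allowed assignment.

    Choosing distinct states guarantees that no two electrons share
    the same (ml, ms) labels within the subshell.
    """
    for combo in combinations(subshell_states(l), n_electrons):
        yield sum(ml for ml, _ in combo), sum(ms for _, ms in combo)


totals = list(assignments(l=2, n_electrons=6))

# (a) Maximize the total ml, then collect the total ms values that go with it.
max_ml = max(ml for ml, _ in totals)
ms_at_max_ml = {ms for ml, ms in totals if ml == max_ml}

# (b) Maximize the total ms, then find the largest total ml compatible with it.
max_ms = max(ms for _, ms in totals)
ml_at_max_ms = max(ml for ml, ms in totals if ms == max_ms)

print("(a) max total ml:", max_ml, "with total ms in", ms_at_max_ml)
print("(b) max total ms:", max_ms, "with largest total ml:", ml_at_max_ms)
```

The search reproduces the results of the example: a maximum total ml of +6 occurs only with a total ms of 0, and a maximum total ms of +2 allows a total ml no larger than +2.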
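[Figure 8.1: energy-level diagram with a vertical axis labeled Energy. Subshells from lowest to highest energy: 1s, 2s, 2p, 3s, 3p, 4s, 3d, 4p, 5s, 4d, 5p, 6s, 4f, 5d, 6p.]
FIGURE 8.1 Atomic subshells, in order of increasing energy. The energy
groupings are not to scale, but represent the relative energies of the
subshells.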
8.2 ELECTRONIC STATES IN MANY-ELECTRON ATOMS
Figure 8.1 illustrates the result of an approximate calculation of the order of the
filling of energy levels in many-electron atoms as the atomic number Z increases.
The 1s level is always the lowest energy level to be filled, and the 2s and 2p levels
are fairly close in energy. The 2s level always lies a bit lower in energy than the
2p level, and so the 2s level is filled before the 2p. (The fine-structure splitting
is very small on the scale of this diagram.) We can understand why the 2s level
lies lower in energy if we recall Example 7.6 and Figure 7.10. An electron in the
2s level has a greater probability to be found at small radii compared with an
electron in the 2p level. (Penetrating close to the nucleus, the 2s electron also is
attracted by the full nuclear charge +Ze, while the 2p electron spends most of
its time beyond the orbits of the 1s electrons where it is attracted by an effective
charge that is less than the full charge of the nucleus. We’ll discuss this effect,
which is called electron screening, in Section 8.3.) These two effects—closer
penetration to the nucleus and screening—are responsible for the tighter binding
of the 2s electrons compared with the 2p electrons.
A more extreme example of the tighter binding of the penetrating orbits occurs
for the n = 3 levels. The 3s electron penetrates the inner orbits (it has a large
probability density at small r; see Figure 7.10), and the 3p electron penetrates
almost as much. The 3d electron has negligible penetration of the inner orbits. As
a result, the 3s and 3p levels are more tightly bound and therefore lower in energy
than the 3d level. A similar effect occurs for the n = 4 levels—the tighter binding
of the 4s and 4p electrons pulls their energy levels down so low that they almost
coincide with the 3d level, as shown in Figure 8.1. The 3d and 4s levels are very
close in energy—for some atoms the 3d level is lower and for some atoms the
4s is lower. This small energy difference is an important factor that contributes to
the large electrical conductivity of copper, as we discuss later in this chapter.
The tighter binding of the penetrating s and p orbits also pulls the 5s and 5p
levels down close to the 4d level, and similarly causes the 6s and 6p levels to
appear at roughly the same energy as the 5d and 4f levels.
As we learned in the case of the hydrogen atom, orbits with the same value
of n all lie at about the same average distance from the nucleus. (The electrons
in the penetrating orbits spend some of their time closer to the nucleus than the
nonpenetrating orbits, but also some of their time further from the nucleus; the
average distance from the nucleus of the penetrating orbits is then about the same
as the average distance from the nucleus of the nonpenetrating orbits with the
same value of n. See Problem 31 in Chapter 7 for a verification of this property
for the hydrogen atom.) The set of orbits with a certain value of n, with about the
same average distance from the nucleus, is known as an atomic shell. The atomic
shells are designated by letter, as follows:
n       1   2   3   4   5
Shell   K   L   M   N   O
The levels with a certain value of n and l (for instance, 2s or 3d) are known
as subshells. According to the Pauli principle, the maximum number of electrons
that can be placed in each subshell is 2(2l + 1). The (2l + 1) factor comes from
the number of different ml values for each l, because ml can take the values
0, ±1, ±2, ±3, . . . , ±l. The extra factor of 2 comes from the two different ms
values; for each ml , we can have ms = + 1/2 or ms = − 1/2. According to this
scheme, the 1s subshell has a capacity of 2(2 × 0 + 1) = 2 electrons; the 3d
subshell has a capacity of 2(2 × 2 + 1) = 10 electrons. (Note that this capacity
doesn’t depend on n; any d subshell has a capacity of 10 electrons.) Table 8.1
shows the ordering and capacity of the subshells.
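The capacity rule can be confirmed by direct counting. This short sketch (the function names are illustrative, not from the text) enumerates the allowed ml and ms values for each l and compares the count with 2(2l + 1).

```python
def subshell_capacity(l):
    """Capacity from the formula in the text: 2 spin states for each of 2l + 1 ml values."""
    return 2 * (2 * l + 1)


def count_states(l):
    """Count the distinct (ml, ms) labels explicitly, without using the formula."""
    ml_values = range(-l, l + 1)   # 0, ±1, ..., ±l
    ms_values = ("+1/2", "-1/2")   # two spin orientations
    return sum(1 for _ in ml_values for _ in ms_values)


# Spectroscopic letters for l = 0, 1, 2, 3
for l, letter in enumerate("spdf"):
    print(letter, subshell_capacity(l), count_states(l))
```

The two columns agree for every l, and neither depends on n, which is why any d subshell holds 10 electrons.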
It is important to keep in mind exactly what is represented by Figure 8.1
and Table 8.1. They give the order of filling of the energy levels, and so they
represent only the “outer” or valence electrons. For example, the first 18 electrons
fill the levels up through 3p, and the energy levels (subshells) available to the
19th electron in potassium (Z = 19) or calcium (Z = 20) are well described by
Figure 8.1. However, the energy levels appropriate to the 19th electron in a heavy
element such as lead (Z = 82) would be very different. In this case it is more
correct to describe the atom in terms of shells—all of the n = 3 states (the M
shell) are grouped together, as are all of the n = 4 states (the N shell), and so
forth. When we discuss the inner structure of the atom, as in the case of X rays,
the ordering of Figure 8.1 is not appropriate, and it is more appropriate to group
the levels by shells, as we do in Section 8.5.
TABLE 8.1 Filling of Atomic Subshells

n   l   Subshell   Capacity 2(2l+1)
1   0   1s         2
2   0   2s         2
2   1   2p         6
3   0   3s         2
3   1   3p         6
4   0   4s         2
3   2   3d         10
4   1   4p         6
5   0   5s         2
4   2   4d         10
5   1   5p         6
6   0   6s         2
4   3   4f         14
5   2   5d         10
6   1   6p         6
7   0   7s         2
5   3   5f         14
6   2   6d         10

The Periodic Table
Figure 8.2 shows the periodic table, which is an orderly array of the chemical
elements, listed in order of increasing atomic number Z and arranged in such a
way that the vertical columns, called groups, contain elements with rather similar
physical and chemical properties. In this section we discuss the way in which the
filling of electronic subshells helps us understand the arrangement of the periodic
table. In later sections we examine some of the physical and chemical properties
of the elements.
[Figure 8.2: periodic table of elements 1 through 118. Rows are labeled by the subshell being filled (1s through 7p, with the 4f and 5f series set apart), and groups are labeled Alkalis, Alkaline earths, Transition metals, Halogens, Inert gases, Lanthanides (rare earths), and Actinides.]
FIGURE 8.2 The periodic table of the elements.
In attempting to understand the ordering of subshells and the periodic table,
we must follow two rules for filling the electronic subshells:
1. The capacity of each subshell is 2(2l+1). (This is of course just another way
of stating the Pauli exclusion principle.)
2. The electrons occupy the lowest energy states available.
To indicate the electron configuration of each element, we use a notation in
which the identity of the subshell and the number of electrons in it are listed. The
identity of the subshell is indicated in the usual way, and the number of electrons
in that subshell is indicated by a superscript. Thus hydrogen has the configuration
1s¹, for one electron in the 1s subshell, and helium has the configuration 1s². Helium
has both a filled subshell (the 1s) and a closed major shell (the K shell) and
thus is an extraordinarily stable and inert element. With lithium (Z = 3), we
begin to fill the 2s subshell; lithium has the configuration 1s²2s¹. With beryllium
(Z = 4, 1s²2s²) the 2s subshell is full, and the next element must begin filling
the 2p subshell (boron, Z = 5, 1s²2s²2p¹). The 2p subshell has a capacity of six
electrons, and with neon (Z = 10, 1s²2s²2p⁶) both the 2p subshell and the L shell
(n = 2) are complete.
The next row (or period) begins with sodium (Z = 11, 1s²2s²2p⁶3s¹), and the
3s and 3p subshells are filled in much the same way as the 2s and 2p subshells,
ending with the inert gas argon (Z = 18, 1s²2s²2p⁶3s²3p⁶). The elements of the
third row (period) are chemically similar to the corresponding elements of the
second row (period), and so are written directly under them. The next electron
might be expected to go into the 3d level. However, the highly penetrating orbit
of the 4s electron causes the 4s level to appear at a slightly lower energy than the
3d level, so the 4s subshell normally fills first. The configurations of potassium
(Z = 19) and calcium (Z = 20) are therefore respectively 1s²2s²2p⁶3s²3p⁶4s¹
and 1s²2s²2p⁶3s²3p⁶4s². These elements have properties similar to, and therefore
appear directly under, the corresponding elements with one and two s-subshell
electrons in the second and third periods.
We now begin to fill the 3d subshell. Because there is no 1d or 2d subshell,
we would expect the first element with a d-subshell configuration to have rather
different chemical properties from the elements we have placed previously;
thus it should not appear in any of our previously occupied groups (columns),
and so we begin a new group with scandium (Z = 21, 1s²2s²2p⁶3s²3p⁶4s²3d¹).
The 3d subshell eventually closes with zinc (Z = 30, 1s²2s²2p⁶3s²3p⁶4s²3d¹⁰).
Along the way there are some minor variations; the most important is copper,
with Z = 29. For this case the 3d level lies slightly lower than the 4s level,
and so the 3d subshell fills before the 4s, resulting in the configuration
1s²2s²2p⁶3s²3p⁶3d¹⁰4s¹. As we discuss later, this configuration is responsible
for the large electrical conductivity of copper.
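The two filling rules can be turned into a short procedure. The sketch below is an illustration under a stated simplification: it fills the subshells strictly in the order of Table 8.1, so it cannot reproduce the exceptions noted in the text, such as copper. The function names and the plain-text output format (for instance 3d10 for ten 3d electrons) are choices made for this sketch.

```python
# Filling order taken from Table 8.1
FILLING_ORDER = ["1s", "2s", "2p", "3s", "3p", "4s", "3d", "4p", "5s",
                 "4d", "5p", "6s", "4f", "5d", "6p", "7s", "5f", "6d"]
L_VALUES = {"s": 0, "p": 1, "d": 2, "f": 3}


def capacity(subshell):
    """Rule 1: a subshell holds at most 2(2l + 1) electrons."""
    return 2 * (2 * L_VALUES[subshell[-1]] + 1)


def naive_configuration(z):
    """Rule 2 applied strictly in Table 8.1 order; returns [(subshell, count), ...]."""
    config = []
    remaining = z
    for subshell in FILLING_ORDER:
        if remaining == 0:
            break
        count = min(remaining, capacity(subshell))
        config.append((subshell, count))
        remaining -= count
    if remaining:
        raise ValueError("Z exceeds the total capacity of the listed subshells")
    return config


def as_text(config):
    """Format a configuration, e.g. '1s2 2s2 2p6'."""
    return " ".join(f"{name}{count}" for name, count in config)


for z, name in [(10, "neon"), (20, "calcium"), (29, "copper"), (30, "zinc")]:
    print(z, name, as_text(naive_configuration(z)))
```

For neon, calcium, and zinc the strict ordering agrees with the configurations given in the text. For copper it ends in 4s2 3d9, whereas the actual ground state ends in 3d10 4s1, which shows why the ordering of Table 8.1 is only an approximate guide when two subshells lie very close in energy.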
In the next series of elements, the 4p subshell is filled, from gallium (Z = 31)
to the inert gas krypton (Z = 36). When we move to the next period, we fill the 5s
subshell before the 4d subshell, and the series of 10 elements corresponding to the
filling of the 4d subshell is written directly under the series that had unfilled configurations in the 3d subshell. (Silver, with Z = 47, corresponds exactly to copper in
the fourth period, with the 4d subshell filling before the 5s.) After the completion of
the 4d subshell, the 5p subshell is filled, ending with the inert gas xenon (Z = 54).
The next period begins with cesium and barium filling the 6s subshell. As was
the case in the previous periods, the 5d and 6s lie at almost the same energy. However, there is yet another subshell at about the same energy as the 6s and 5d: the
4f subshell, which now begins to fill, from lanthanum to ytterbium. This series of
elements, called the lanthanides or rare earths, is usually written separately in the
periodic table, because there have been no other f-subshell elements under which
to write them. The 4f subshell has a capacity of 14 electrons, and so there are 14
elements in the lanthanide series. Once the 4f subshell is complete, we return to
filling the 5d subshell, writing those elements in the groups under the corresponding
3d and 4d elements, and then complete the period with the filling of the 6p subshell, ending with the inert gas radon (Z = 86). The seventh period is filled much
like the sixth, with a series known as the actinides, written under the lanthanides,
corresponding to the filling of the 5f subshell.

TABLE 8.2 Electronic Configurations of Some Elements

Element  Configuration     Element  Configuration        Element  Configuration
H        1s¹               Mn       [Ar]4s²3d⁵           La       [Xe]6s²5d¹
He       1s²               Cu       [Ar]4s¹3d¹⁰          Ce       [Xe]6s²5d¹4f¹
Li       1s²2s¹            Zn       [Ar]4s²3d¹⁰          Pr       [Xe]6s²4f³
Be       1s²2s²            Ga       [Ar]4s²3d¹⁰4p¹       Gd       [Xe]6s²5d¹4f⁷
B        1s²2s²2p¹         Kr       [Ar]4s²3d¹⁰4p⁶       Dy       [Xe]6s²4f¹⁰
Ne       1s²2s²2p⁶         Rb       [Kr]5s¹              Yb       [Xe]6s²4f¹⁴
Na       [Ne]3s¹           Y        [Kr]5s²4d¹           Lu       [Xe]6s²5d¹4f¹⁴
Al       [Ne]3s²3p¹        Mo       [Kr]5s¹4d⁵           Re       [Xe]6s²5d⁵4f¹⁴
Ar       [Ne]3s²3p⁶        Ag       [Kr]5s¹4d¹⁰          Au       [Xe]6s¹5d¹⁰4f¹⁴
K        [Ar]4s¹           In       [Kr]5s²4d¹⁰5p¹       Hg       [Xe]6s²5d¹⁰4f¹⁴
Sc       [Ar]4s²3d¹        Xe       [Kr]5s²4d¹⁰5p⁶       Tl       [Xe]6s²5d¹⁰4f¹⁴6p¹
Cr       [Ar]4s¹3d⁵        Cs       [Xe]6s¹              Rn       [Xe]6s²5d¹⁰4f¹⁴6p⁶

A symbol in brackets [ ] means that the atom has the configuration of the previous inert gas plus the additional
electrons listed.
What is most remarkable about this scheme is that the arrangement of the
periodic table was known well before the introduction of atomic theory. The
elements were organized into groups and periods based on their physical and
chemical properties by Dmitri Mendeleev in 1869; understanding that organization
in terms of atomic levels is a great triumph for the atomic theory. This way of
organizing the elements gives us great insight into their physical and chemical
properties, as we discuss in the next sections.
Table 8.2 lists the electronic configurations of some of the elements.
Example 8.2
Copper has the electronic configuration [Ar]4s¹3d¹⁰ in its
ground state. By adding a small amount of energy (about
1 eV) to a copper atom, it is possible to move one of the
3d electrons to the 4s level and change the configuration
to [Ar]4s²3d⁹. By adding still more energy (about 5 eV),
one of the 3d electrons can be moved to the 4p level so
that the configuration becomes [Ar]4s¹3d⁹4p¹. For each of
these configurations, determine the maximum value of the
total ms of the electrons.
Solution
The electrons in the filled shells (Ar core) have a total ms of
zero. In fact, any filled subshell has equal numbers of electrons in ms = +1/2 and ms = −1/2 states, which also gives
a total of zero. In the 4s¹3d¹⁰ configuration, only the single
4s electron contributes to ms, and its maximum value is
+1/2. In the 4s²3d⁹ configuration, the two 4s electrons give
a total ms of zero. In the 3d subshell, there are 5 different ml
values, so we can have at most 5 electrons with ms = +1/2.
The remaining 4 electrons must have ms = −1/2, so we
have a total ms of 5 × (+1/2) + 4 × (−1/2) = +1/2. In the
4s¹3d⁹4p¹ configuration, each of the three subshells contributes a maximum ms of +1/2, so the maximum total ms
is +3/2.
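The reasoning of Example 8.2 follows one pattern for every subshell: place as many electrons as possible in ms = +1/2 (one for each ml value), and put the rest in ms = −1/2. A minimal sketch of that bookkeeping follows; the function names and the (letter, count) input format are assumptions made for illustration.

```python
from fractions import Fraction

L_VALUES = {"s": 0, "p": 1, "d": 2, "f": 3}


def max_ms_subshell(letter, n_electrons):
    """Largest total ms for n_electrons in one subshell, respecting the Pauli principle."""
    n_ml = 2 * L_VALUES[letter] + 1          # number of distinct ml values
    if not 0 <= n_electrons <= 2 * n_ml:
        raise ValueError("electron count exceeds the subshell capacity")
    spin_up = min(n_electrons, n_ml)         # at most one spin-up electron per ml
    spin_down = n_electrons - spin_up        # the remainder must be spin-down
    return Fraction(spin_up - spin_down, 2)


def max_total_ms(subshells):
    """Sum the subshell maxima; filled subshells contribute zero automatically."""
    return sum(max_ms_subshell(letter, n) for letter, n in subshells)


# Electrons outside the Ar core for the three copper configurations of Example 8.2
configurations = {
    "4s1 3d10": [("s", 1), ("d", 10)],
    "4s2 3d9": [("s", 2), ("d", 9)],
    "4s1 3d9 4p1": [("s", 1), ("d", 9), ("p", 1)],
}
for label, subshells in configurations.items():
    print(label, max_total_ms(subshells))
```

The three results are +1/2, +1/2, and +3/2, matching the solution above.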
8.3 OUTER ELECTRONS: SCREENING AND OPTICAL TRANSITIONS
[Figure 8.3: diagram labels: Nucleus (+3e); 1s electrons (−2e).]
FIGURE 8.3 Electron structure in lithium, as might be seen from the
average location of an outer (2s) electron. The dashed line represents
a spherical Gaussian surface at that location.
The electronic configurations of the alkali elements (those in the first column of
the periodic table) all show a single s electron outside an inert gas core. These
elements are very reactive, meaning they can easily give up the s electron to
another element to form a chemical bond. For example, lithium (1s²2s¹) readily
gives up its 2s electron to form the positive ion Li⁺.
It may at first seem somewhat surprising that Li gives up its electron so easily.
The ionization energy of Li is 5.39 eV. This is smaller than the ionization energy
of hydrogen (13.6 eV), even though from Eq. 6.38 we might expect that the
energies of electrons in atoms should increase in proportion to Z².
We can understand this effect from the diagram of Figure 8.3. The lithium
atom can be roughly characterized by an inner atomic shell consisting of two 1s
electrons and a single electron in the 2s subshell. As was the case in the one-electron atoms we considered in Chapters 6 and 7, the principal quantum number
n determines the average distance of an electron from the nucleus. Although there
is no simple formula that allows us to calculate the average orbital radius in atoms
with more than one electron, it is certainly reasonable to expect that the 2s electron
is most likely to be found much farther from the nucleus than the 1s electrons.
The net electric force on the 2s electron can be estimated using Gauss’s law.
Imagine a spherical surface centered at the nucleus having a radius equal to the
average orbital radius of the 2s electron. The electric field at that distance is
determined, according to Gauss’s law, by the net charge contained within the
sphere. The electrons in the n = 1 orbit have nearly a 100% probability of being
found within the sphere. Thus the net charge inside the sphere must include the
nucleus (+3e) and the two n = 1 electrons (−2e) for a total net charge of +e. To
a good approximation, for some applications a lithium atom looks very much like
a one-electron atom with the electron in the n = 2 orbit about a nucleus with an
effective charge of +e. (Recall from electrostatics that if the charge distribution is
spherically symmetric, we can replace an extended charge distribution with a point
charge at the center of the sphere.) Equation 6.38 gives the energy of such an electron in the n = 2 orbit in an atom with an effective nuclear charge of $Z_{\text{eff}}e = +e$ as

$$E_n = (-13.6\ \text{eV})\,\frac{Z_{\text{eff}}^2}{n^2} = -3.40\ \text{eV} \qquad (8.1)$$

This simple model predicts that the ionization energy of a neutral lithium atom
is 3.40 eV. The measured value is 5.39 eV. The agreement is not extremely
good, but the estimated value is off by much less than a factor of Z² = 9, so the
calculation is probably on the right track.
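The estimate of Eq. 8.1 is easy to evaluate numerically. The sketch below compares the fully screened value (Zeff = 1) and the completely unscreened value (Zeff = 3) for the n = 2 level of lithium with the measured energy quoted in the text; the function name is an illustrative choice.

```python
RYDBERG_EV = 13.6  # magnitude of the hydrogen ground-state energy used in Eq. 8.1, in eV


def screened_energy(n, z_eff=1):
    """Energy in eV from Eq. 8.1: E_n = (-13.6 eV) * Z_eff**2 / n**2."""
    return -RYDBERG_EV * z_eff**2 / n**2


fully_screened = screened_energy(n=2, z_eff=1)   # outer electron sees a net charge +e
unscreened = screened_energy(n=2, z_eff=3)       # outer electron sees the bare nucleus +3e
measured = -5.39                                 # from the measured ionization energy of Li

print(f"fully screened: {fully_screened:.2f} eV")
print(f"unscreened:     {unscreened:.2f} eV")
print(f"measured:       {measured:.2f} eV")
```

The measured energy lies between the two limits but far closer to the screened one (−3.40 eV, against −30.6 eV with no screening), which is the sense in which the calculation is on the right track.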
The difference between the measured and estimated values can be accounted
for by an effect that we have already discussed: the penetration of the s electrons
through the inner shells to be occasionally found close to the nucleus. The 2s
electron sometimes finds itself much closer to the nucleus than its average orbital
radius, and may occasionally be inside the n = 1 shell. In this case Gauss’s law
tells us that the electron feels the full +3e charge of the nucleus, which results in
an increase in the binding energy.
Let’s instead consider an excited state of lithium, in which the 2s electron
moves to the 2p state. The 2p electron penetrates the inner shell hardly at all. The
energy of the 2p electron in lithium is −3.54 eV, in almost exact agreement with
the prediction of our simple model. The small discrepancy might indicate a small
degree of penetration of the 2p electron inside some of the 1s probability distribution, which gives a small increase to the binding energy. If we instead move the
outer electron to the 3d state, the measured energy is −1.51 eV, in exact agreement
with the prediction of Eq. 8.1 for n = 3. The 3d electron has almost no penetration
inside the 1s shell, and so that electron is very well described by Zeff = 1.
This effect is called electron screening. To an outer electron, the charge of the
nucleus can be screened or shielded by the electrons in the inner shells. This is
one case in which the formulas we derived for the energies of a one-electron atom
can be used to determine approximately the energy of an electron in an atom with
more than one electron. For the outer electron in lithium, the 3 positive charges
in the nucleus are screened by the negative charges of the two inner electrons,
giving a net charge of one unit. The less penetrating is the orbit of the outer
electron, the more accurate is the prediction of Eq. 8.1. In lithium, for example,
the 3d orbit has almost no penetration of the inner shells and so the formula gives
a very accurate representation of the binding of that electron. The 2p orbit in
lithium has relatively little penetration, so again the approximate formula gives a
good prediction. It is less accurate for the 2s electron, which does occasionally
penetrate through the inner 1s orbits.
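One way to put a number on the degree of penetration is to invert Eq. 8.1 and ask what effective charge would reproduce each measured energy. This rearrangement, Zeff = n √(−E / 13.6 eV), is not given in the text; it is used here only as an illustrative sketch with the lithium energies quoted above.

```python
import math

RYDBERG_EV = 13.6  # eV, as in Eq. 8.1


def effective_charge(n, energy_ev):
    """Solve Eq. 8.1 for Z_eff, given the level's n and its (negative) energy in eV."""
    if energy_ev >= 0:
        raise ValueError("a bound state must have negative energy")
    return n * math.sqrt(-energy_ev / RYDBERG_EV)


# (n, measured energy in eV) for the outer electron of lithium, from the text
LITHIUM_LEVELS = {"2s": (2, -5.39), "2p": (2, -3.54), "3d": (3, -1.51)}

for level, (n, energy) in LITHIUM_LEVELS.items():
    print(f"{level}: Z_eff = {effective_charge(n, energy):.3f}")
```

The values come out near 1.26 for the 2s level, 1.02 for the 2p level, and 1.00 for the 3d level, so the effective charge approaches 1 as the orbit becomes less penetrating.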
Electron screening can also be used in a qualitative way to help understand the
ionization energies of atoms. Consider helium, for example. In ionized helium, the
single electron has an energy of −54.4 eV in its ground state. If we add a second
electron to make neutral helium (with both electrons in the 1s state), the ionization
energy is 24.6 eV. The screening of one electron by a portion of the probability distribution of the other is responsible for reducing the ionization energy from 54.4 eV
when no second electron is present to 24.6 eV when the second electron is present.
Example 8.3
The ground state of helium has the configuration 1s².
Use the electron screening model to predict the energies
of the following excited states of helium: (a) 1s¹2s¹ (measured value −4.0 eV); (b) 1s¹2p¹ (−3.4 eV); (c) 1s¹3d¹
(−1.5 eV).
Solution
(a) For the outer electron in helium, the nuclear charge of
+2e is screened by the single 1s electron, so the effective
charge seen by the outer electron is +e. From Eq. 8.1, we
have
$$E_n = (-13.6\ \text{eV})\,\frac{Z_{\text{eff}}^2}{n^2} = (-13.6\ \text{eV})\,\frac{1^2}{2^2} = -3.4\ \text{eV}$$
The measured value is −4.0 eV, suggesting that the 2s
electron has a small penetration through the 1s distribution
and thus experiences a somewhat tighter binding than this
simple model predicts.
(b) Because Eq. 8.1 depends on n but not l, the calculation
for the 2p excited state gives the same result as the calculation for the 2s excited state (−3.4 eV). Now the agreement
is almost exact, because the 2p has less penetration than
the 2s.
(c) For the 3d excited state, Eq. 8.1 gives
$$E_n = (-13.6\ \text{eV})\,\frac{Z_{\text{eff}}^2}{n^2} = (-13.6\ \text{eV})\,\frac{1^2}{3^2} = -1.5\ \text{eV}$$
The agreement is again very good, suggesting little penetration of the 3d electron inside the 1s probability
distribution.
Optical Transitions
[Figure 8.5: energy-level diagram for helium showing the levels 1s², 1s¹2s¹, 1s¹2p¹, 1s¹3s¹, 1s¹3p¹, and 1s¹3d¹, with energy labels E = 0 and −24.5 eV.]
FIGURE 8.5 A small portion of the energy level diagram for helium. Note
the Δl = ±1 transitions.
When we excite one of the outer electrons to a higher energy level or remove
it completely from the atom, the resulting vacancy can be filled by electrons
dropping into the empty state. The energy lost by these electrons usually appears
as emitted photons in the visible range of the spectrum, and the transitions are thus
known as optical transitions. The binding energies of the outer electrons in a
typical atom are of the order of several electron-volts, and so it takes relatively
little energy to move an outer electron and produce an optical transition. In
fact, it is the absorption and reemission of light by these outer electrons that
are responsible for the colors of material objects (although in solids the electron
energy levels are usually very different from those in isolated atoms). In contrast
with X-ray spectra, which vary slowly and smoothly from one element to the
next, optical spectra can show large variations between neighboring elements,
especially those that correspond to filled subshells.
Beyond hydrogen, the simplest energy-level diagrams to understand are those
of the alkali metals, which have a single s electron outside an inert core. Many of
the excited states then correspond to the excitation of this single electron, and the
resulting spectra are very similar to the spectrum of hydrogen, because the nuclear
charge of +Ze is screened by the other (Z − 1) electrons. Figure 8.4 shows the
energy levels of Li and Na along with some of the emitted transitions, which follow
the same Δl = ±1 selection rule as the transitions in hydrogen (see Figure 7.19).
[Figure 8.4: energy-level diagrams for (a) lithium and (b) sodium, each with a vertical axis of Energy (eV) running from 0 to −5; the s, p, d, and f levels are shown in separate columns, and transitions are labeled with wavelengths in nm.]
FIGURE 8.4 (a) Energy-level diagram of lithium, showing some of the transitions (labeled with wavelength in nm) in
the optical region. (b) Energy-level diagram for sodium. Because of the fine-structure splitting, the 3p level in sodium is
actually a very closely spaced pair of levels, so all transitions involving that level show two closely spaced wavelengths.
The wavelength shown here is the average of the two. The fine-structure splitting is negligible for the other levels in sodium
and for all of the levels in lithium.
The ground-state configuration of lithium is 1s²2s¹ and the ground-state
configuration of sodium is 1s²2s²2p⁶3s¹. The excited states in both cases can be
obtained by moving the outer electron to a higher state. For example, the first
excited state of Li is 1s²2p¹, with the 2s electron moving to the 2p level. (The
energy necessary to accomplish this can be provided by various means, such as
by absorption of a photon or by passage of an electric current through the material
as in a gas discharge tube.) The excited electron in the 2p state rapidly drops back
to the 2s state, with the emission of a photon of wavelength 670.8 nm. The inert
core doesn’t participate in this excitation or emission, so to a good approximation
we can ignore all but the outer electron in studying the levels and transitions in
the alkali elements.
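The wavelength of the lithium 2p to 2s transition can be estimated from the level energies quoted in Section 8.3 (−3.54 eV for the 2p level and −5.39 eV for the 2s level) using λ = hc/ΔE. In this sketch the value hc ≈ 1239.84 eV·nm is a standard constant that is not quoted in this section, and the function name is an illustrative choice.

```python
HC_EV_NM = 1239.84  # h*c in eV·nm (standard value; an assumption not stated in this section)


def transition_wavelength_nm(e_upper_ev, e_lower_ev):
    """Wavelength in nm of the photon emitted in a transition between two levels."""
    delta_e = e_upper_ev - e_lower_ev
    if delta_e <= 0:
        raise ValueError("the upper level must lie above the lower level")
    return HC_EV_NM / delta_e


# Lithium: 2p level at -3.54 eV, 2s level at -5.39 eV (values from Section 8.3)
wavelength = transition_wavelength_nm(-3.54, -5.39)
print(f"Li 2p -> 2s: {wavelength:.0f} nm")
```

The result, about 670 nm, agrees with the measured 670.8 nm to within the three-figure precision of the level energies.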
The ground-state configuration of helium is 1s². We can produce an excited
state by moving one of these electrons up to a higher level, and so some possible
excited-state configurations might be 1s¹2s¹, 1s¹2p¹, 1s¹3s¹, and so forth. Photons
are emitted when the excited electron drops back to the 1s level. The Δl = ±1
selection rule for transitions once again limits those that can occur. Figure 8.5
shows a portion of the energy level diagram for helium.
The phenomenon of fluorescence is responsible for the appearance of objects
under so-called “black light,” which is a source of ultraviolet radiation. Photons
in the ultraviolet region, invisible to the human eye, have higher energies than
those in the visible region, and hence if an ultraviolet photon is absorbed by an
atom, the outer electron (which is responsible for the optical transitions) can be
excited to high levels. These electrons make transitions back to their ground state,
accompanied by the emission of photons in the visible region. Objects seen in
ultraviolet light often show colors in the blue or violet end of the spectrum that
are not present when the objects are viewed in sunlight. We can understand this
effect by considering the composition of sunlight and the optical excited states of
a hypothetical atom shown in Figure 8.6. The intensity of sunlight is concentrated
in the center of the visible spectrum, in the yellow region; very little intensity
is present in the red or blue ends of the visible spectrum. The “yellow” photons
have enough energy to excite the hypothetical atom to levels 1 and 2 shown in
Figure 8.6, but not enough to reach level 3 or 4. However, the higher-energy
ultraviolet photons have sufficient energy to reach the higher levels, so the light
emitted by the atom has a stronger blue component when that atom is excited by
ultraviolet light than when excited by sunlight.
FIGURE 8.6 Excited states of a hypothetical atom. Only excited states 1 and 2 can be easily reached by exposure to sunlight; exposure to ultraviolet light populates state 4, which in turn populates state 3. Under ultraviolet light, a stronger blue or violet (413 nm) color is revealed than under sunlight. [Energy-level diagram with the ground state at 0 eV; labeled energies 1.8 eV, 2.4 eV, and 3.0 eV and labeled wavelengths 689 nm, 517 nm, and 413 nm.]
Example 8.4
Calculate the energy difference between the 3d and 2p
states in lithium, and compare with the corresponding
energy difference in hydrogen.
Solution
From Figure 8.4, the wavelength of the photon emitted in
the 3d to 2p transition is 610.4 nm. The energy difference
is then
$$\Delta E = \frac{hc}{\lambda} = \frac{1240\ \text{eV} \cdot \text{nm}}{610.4\ \text{nm}} = 2.03\ \text{eV}$$
The energy difference between corresponding levels
in hydrogen (Figure 6.20) is E3 − E2 = −1.51 eV −
(−3.40 eV) = 1.89 eV. Due to electron screening, we
expect the outer electron in lithium to behave similarly
to the electron in hydrogen, so the energy differences are
in rough agreement.
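The arithmetic of Example 8.4 can be checked with a short script. This is only a minimal sketch: the function and constant names are illustrative, and it assumes the rounded values hc = 1240 eV · nm and a hydrogen ground-state energy of −13.6 eV.

```python
HC_EV_NM = 1240.0    # hc in eV·nm (rounded value, an assumption of this sketch)
RYDBERG_EV = 13.6    # hydrogen ground-state binding energy in eV (rounded)


def photon_energy_ev(wavelength_nm):
    """Photon energy in eV for a wavelength given in nm."""
    return HC_EV_NM / wavelength_nm


def hydrogen_level_ev(n):
    """Bohr-model energy of hydrogen level n in eV."""
    return -RYDBERG_EV / n**2


# Lithium 3d -> 2p transition, wavelength taken from Example 8.4.
lithium_3d_2p = photon_energy_ev(610.4)
# Corresponding n = 3 -> n = 2 energy difference in hydrogen.
hydrogen_3_2 = hydrogen_level_ev(3) - hydrogen_level_ev(2)

print(f"Li 3d -> 2p: {lithium_3d_2p:.2f} eV")
print(f"H  n=3 -> 2: {hydrogen_3_2:.2f} eV")
```

The two printed values differ by less than 10 percent, which is the "rough agreement" expected when the inner electrons screen the nucleus almost completely.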
8.4 PROPERTIES OF THE ELEMENTS
In this section we briefly study the way our knowledge of atomic structure helps us
to understand the physical and chemical properties of the elements. Our discussion
is based on the following two principles:
1. Filled subshells are normally very stable configurations. An atom with one
electron beyond a filled shell will readily give up that electron to another atom
to form a chemical bond. Similarly, an atom lacking one electron from a filled
shell will readily accept an additional electron from another atom in forming
a chemical bond.
2. Filled subshells do not normally contribute to the chemical or physical
properties of an atom. Only the electrons in the unfilled subshells need be
considered. (X-ray energies, discussed in the next section, are an exception
to this rule.) Sometimes only a single outer electron is the primary factor
influencing the physical properties of an element.
We consider a number of different physical properties of the elements, and try to
understand those properties based on atomic theory.
TABLE 8.3 Ionization Energies (in eV) of Neutral Atoms of Some Elements
Element   Energy (eV)     Element   Energy (eV)
H         13.60           Ar        15.76
He        24.59           K          4.34
Li         5.39           Cu         7.72
Be         9.32           Kr        14.00
Ne        21.56           Rb         4.18
Na         5.14           Au         9.22
1. Atomic Radii. The radius of an atom is not a precisely defined quantity,
because the electron probability density determines the “size” of an atom. The
radii are also difficult to define experimentally, and in fact different kinds of
experiments may give different values for the radii. One way of defining the
radius is by means of the spacing between the atoms in a crystal containing
that element. Figure 8.7 shows how such typical atomic radii vary with Z.
2. Ionization Energy. Table 8.3 gives the ionization energies of some of the
elements, and Figure 8.8 shows the variation of ionization energy with atomic
number Z.
3. Electrical Resistivity. In bulk materials, an electric current flows when a
potential difference (voltage) is applied across the material. The current i and
voltage V are related according to the expression V = iR, where R is the
electrical resistance of the material. If the material is uniform with length L
and cross-sectional area A, then the resistance is
$$R = \rho \frac{L}{A} \tag{8.2}$$
The resistivity ρ is characteristic of the kind of material and is measured
in units of Ω · m (ohm · meter). A good electrical conductor has a small
resistivity (ρ = 1.7 × 10⁻⁸ Ω · m for copper); a poor conductor has a large
resistivity (ρ = 2 × 10¹⁵ Ω · m for sulfur). From the atomic point of view,
current depends on the movement of relatively loosely bound electrons, which
can be removed from their atoms by the applied potential difference, and also
on the ability of the electrons to travel from one atom to another. Thus
elements with s electrons, which are the least tightly bound and which also
travel farthest from the nucleus, are expected to have small resistivities.
Figure 8.9 shows the variation of electrical resistivity with atomic number.
FIGURE 8.7 Atomic radii, determined from atomic separations in ionic crystals. These radii are different from the mean radii of the electron cloud for free atoms. [Plot of atomic radius (nm) versus atomic number Z.]
FIGURE 8.8 Ionization energies of neutral atoms of the elements. [Plot of ionization energy (eV) versus atomic number Z.]
4. Magnetic Susceptibility. When a material is placed in a magnetic field of
intensity B, the material becomes “magnetized” and acquires a magnetization
M, which for many materials is proportional to B:
$$\mu_0 M = \chi B \tag{8.3}$$
where χ is a dimensionless constant called the magnetic susceptibility.
(Materials for which χ > 0 are known as paramagnetic, and those for which
χ < 0 are called diamagnetic; materials that remain permanently magnetized
even when B is removed are known as ferromagnetic, and χ is undefined for
such materials.)
From the atomic point of view, the magnetism of atoms depends on the
$\vec{L}$ and $\vec{S}$ of the electrons in unfilled subshells, because the atomic magnetic
moments $\vec{\mu}_L$ and $\vec{\mu}_S$ are proportional to $\vec{L}$ and $\vec{S}$ (recall Table 7.2). This
effect is responsible for paramagnetic susceptibilities and occurs in all atoms
in which $\vec{L}$ or $\vec{S}$ is nonzero. Diamagnetism is caused by the following effect:
when a varying magnetic field occurs in an area bounded by an electric
circuit, an induced current flows in the circuit; the induced current sets up a
magnetic field which tends to oppose the changes in the applied field (Lenz’s
law). In the atomic physics case, the electric circuit is the circulating electron,
and the induced current consists of a slight speeding up or slowing down of
the electron in its orbit when a magnetic field is applied. This produces a
contribution to the magnetization of the material that is opposite to the applied
field B, and so the diamagnetic contribution to χ is negative.
Figure 8.10 shows the magnetic susceptibilities of the elements.
FIGURE 8.9 Electrical resistivities of the elements. [Plot of resistivity (10⁻⁸ Ω · m) versus atomic number Z.]
FIGURE 8.10 Magnetic susceptibilities of the elements. [Plot of magnetic susceptibility (units of 10⁻⁶) versus atomic number Z.]
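As a numerical illustration of Eq. 8.2 from item 3 above, the following sketch compares the resistance of two samples of identical shape. The resistivities are the values quoted in the text; the sample dimensions (1 m long, 1 mm² cross section) and all names are hypothetical choices made only for this example.

```python
def resistance_ohm(resistivity_ohm_m, length_m, area_m2):
    """Resistance R = rho * L / A of a uniform sample (Eq. 8.2)."""
    return resistivity_ohm_m * length_m / area_m2


RHO_COPPER = 1.7e-8   # ohm·m, value quoted in the text
RHO_SULFUR = 2e15     # ohm·m, value quoted in the text

# Hypothetical sample geometry, chosen only for illustration.
length_m = 1.0
area_m2 = 1.0e-6      # 1 mm^2

r_copper = resistance_ohm(RHO_COPPER, length_m, area_m2)
r_sulfur = resistance_ohm(RHO_SULFUR, length_m, area_m2)

print(f"copper: {r_copper:.3g} ohm")
print(f"sulfur: {r_sulfur:.3g} ohm")
print(f"ratio : {r_sulfur / r_copper:.3g}")
```

Because the geometry is the same for both samples, the ratio of the resistances equals the ratio of the resistivities, more than 22 orders of magnitude.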
Just by examining Figures 8.7 to 8.10, you can see the remarkable regularities
in the properties of the elements. Notice especially how similar the properties of
the different sequences of elements are—for example, the electrical resistivity of
the d-subshell elements or the magnetic susceptibility of the p-subshell elements.
We now look at how the atomic structure is responsible for these properties.
Inert Gases
The inert gases occupy the last column of the periodic table. Because they have
only filled subshells, the inert gases do not generally combine with other elements
to form compounds; these elements are very reluctant to give up or to accept
an electron. At room temperature they are monatomic gases. Their atoms don’t
easily join together, so the boiling points are very low (typically −200 °C). Their
ionization energies are much larger than those of neighboring elements, because
of the extra energy needed to break open a filled subshell.
p-Subshell Elements
The elements of the column (group) next to the inert gases are the halogens
(F, Cl, Br, I, At). These atoms lack one electron from a closed shell and have
the configuration np⁵. A filled p subshell is a very stable configuration, so
these elements readily form compounds with other atoms that can provide an
extra electron to complete the p subshell. The halogens are therefore extremely
reactive.
As we move across the series of six elements in which the p subshell is being
filled, the atomic radius decreases. This “shrinking” occurs because the nuclear
charge is increasing and pulling all of the orbits closer to the nucleus. Notice from
Figure 8.7 that the halogens have the smallest radii within each p subshell series.
(The ionic crystal radii of the inert gases are not known.)
As we increase the nuclear charge, the p electrons also become more tightly
bound; Figure 8.8 shows how the ionization energy increases systematically as
the p subshell is filled.
From Figure 8.10 we see that each p subshell series is diamagnetic, with a
characteristic negative magnetic susceptibility.
s-Subshell Elements
The elements of the first two columns (groups) are known as the alkalis
(configuration ns¹) and alkaline earths (ns²). The single s electron makes the alkalis quite
reactive. The alkaline earths are similarly reactive, in spite of the filled s subshell.
This occurs because the s electron wave functions can extend rather far from the
nucleus, where the electrons are screened (by Z − 2 other electrons) from the
nuclear charge and therefore not tightly bound. (Notice from Figure 8.7 that the
ns¹ and ns² configurations give the largest atomic radii, and from Figure 8.8 that
they have the smallest ionization energies.) For the same reasons, the ns¹ and ns²
elements are relatively good electrical conductors. From Figure 8.10 we see that
these elements are paramagnetic; for l = 0, there is no diamagnetic contribution
to the magnetism.
Transition Metals
The three rows of elements in which the d subshell is filling (Sc to Zn, Y to Cd, Lu
to Hg) are known as the transition metals. Many of their chemical properties are
determined by the outer electrons—those whose wave functions extend furthest
from the nucleus. For the transition metals, these are always s electrons, which have
a larger mean radius than the d electrons. (Remember that the mean radius depends
mostly on n; the s electrons of the transition metals have a larger n than the d
electrons. For example, in the first row of transition metals, the 3d subshell is filling
but the 4s subshell is already filled.) As the atomic number increases across the
transition metal series, we add one d electron and one unit of nuclear charge; the net
effect on the s electron is very small, because the additional d electron screens the
s electron from the additional nuclear charge. Properties of the transition metals,
which are in large part determined by the outermost electrons, can therefore be
very similar, as the small variation in radius and ionization energy shows.
The electrical resistivity of the transition metals shows two interesting features:
a sharp rise at the center of the sequence, and a sharp drop near the end (Figure 8.9).
The sharp drop near the end of the sequence indicates the small resistivity (large
conductivity) of copper, silver, and gold. If we filled the d subshell in the expected
sequence, copper would have the configuration 4s² 3d⁹; however, the filled d
subshell is more stable than a filled s subshell, and so one of the s electrons
transfers to the d subshell, resulting in the configuration 4s¹ 3d¹⁰. This relatively
free, single s electron makes copper an excellent conductor. Silver (5s¹ 4d¹⁰) and
gold (6s¹ 5d¹⁰) behave similarly.
At the center of the sequence of transition metals there is a sharp rise in the
resistivity; apparently a half-filled shell is also a stable configuration, and so Mn
(3d⁵), Tc (4d⁵), and Re (5d⁵) have larger resistivities than their neighbors. A
similar rise in resistivity is seen at the center of the 4f sequence.
The transition metals have similar paramagnetic susceptibilities, due to the
large orbital angular momentum of the d electrons and also to the large number
of d-subshell electrons that can couple their spin magnetic moments. These two
effects are large enough to overcome the diamagnetism of the orbital motion. It is
the d electrons that are also responsible for the ferromagnetism of iron, nickel, and
cobalt. As soon as the d subshell is filled, however, the orbital and spin magnetic
moments no longer contribute to the magnetic properties (all of the $m_l$ and $m_s$
values, positive as well as negative, are taken); for this reason, copper and zinc
are diamagnetic, not paramagnetic like their transition metal neighbors.
Lanthanides (Rare Earths)
The lanthanide (or rare earth) elements are contained in the series of 14 elements
from La to Yb; this series is usually drawn at the bottom of the periodic chart
of the elements. The rare earths are rather similar to the transition metals in that
an “inner” subshell (the 4f ) is being filled after an “outer” subshell (the 6s) is
already filled. For the same reasons discussed above, the chemical properties of
the rare earths should be rather similar, because they are determined mainly by
the 6s electrons; the radii and ionization energies show that this is true.
Because of the larger orbital angular momentum of f -subshell electrons (l = 3)
and also because of the larger number of f -subshell electrons (up to 14) that can
align their spin magnetic moments, the paramagnetic susceptibilities of the rare
earths are even larger than those of the transition metals. Even the ferromagnetism
of the rare earths is substantially stronger than that of the iron group. Generally,
we think of iron as the most magnetic of the elements. The internal magnetic field
within a magnetized piece of iron is about 28 T. Magnetized holmium metal, a
rare earth, has an internal magnetic field of 800 T, roughly 30 times that of iron!
Most of the other rare earths have similar magnetic properties. (The rare earth
metals do not reveal their ferromagnetic properties at room temperature, but must
be cooled to lower temperatures. Holmium must be cooled to 20 K to reveal its
ferromagnetic properties.)
Actinides
The actinide series of elements, which corresponds to the filling of the 5f subshell,
is usually shown in the periodic table directly under the lanthanide series. These
elements should have chemical and physical properties similar to those of the rare
earths. Unfortunately, most of the actinide elements (those beyond uranium) are
radioactive and do not occur in nature. They are artificially produced elements
and are available only in microscopic quantities. We are thus unable to determine
many of their bulk properties.
8.5 INNER ELECTRONS: ABSORPTION EDGES AND X RAYS
Let’s imagine doing the Franck-Hertz experiment (see Section 6.6), in which we
accelerate a beam of electrons that then passes through a chamber filled with
mercury vapor. However, instead of using accelerating voltages in the range of
10 V, we’ll use voltages in the range of 10⁵ V. Figure 8.11 shows the current
passing through the tube as a function of the accelerating voltage. A sudden
drop in the current occurs at 83.1 kV. Low accelerating voltages correspond to
interactions that push the outer electrons in the mercury atoms to higher excited
states (or ionize the atom). The drop in the current at 83.1 kV occurs when the
mercury atom absorbs energy from the electron beam that ionizes the atom by
knocking loose one of the tightly bound inner electrons. The binding energy of
the inner electron in this case is 83.1 keV.
FIGURE 8.11 Electron current passing through mercury vapor as a function of accelerating voltage. [Plot of current versus accelerating voltage, with the sudden drop marked at 83.1 kV.]
FIGURE 8.13 K absorption edges of the elements. [Plot of K absorption edge (keV) versus atomic number Z.]
A similar experiment can be done by passing a beam of X rays through a
thin film of mercury and measuring the absorption of the photon intensity. If we
are able to vary the wavelength of the X rays, the absorption as a function of
wavelength might look like Figure 8.12. Photons are absorbed from the beam by
the photoelectric effect, in which electrons are knocked loose from mercury atoms.
As the photon wavelength is increased (or as the photon energy is decreased),
we reach a point at which the photons do not have enough energy to produce at
least one component of photoelectrons, and thus there is a sudden decrease in the
photon absorption. The wavelength at which this occurs, 0.0149 nm, corresponds
to an energy of 83.1 keV, in agreement with the value deduced from electron
scattering (Figure 8.11).
The sudden drop in the electron current or in the photoelectron emission is
called the absorption edge. It corresponds to the release of an inner electron from
the atom. In the case of mercury, the most tightly bound (1s) electrons have
a binding energy of 83.1 keV. In the electron scattering experiment, when the
energy of the electrons in the beam exceeds 83.1 keV, the collision of an electron
with a mercury atom can transfer an energy of 83.1 keV to the atom and result in
the ejection of one of the 1s electrons. Similarly, when the photon energy exceeds
83.1 keV (or when its wavelength is below 0.0149 nm), the photons can eject a
photoelectron from the 1s level, but when the photon energy is below 83.1 keV
that is not possible.
As discussed in Section 8.3, the n = 1 level is also known as the K shell. So far
we have been discussing the K absorption edge in mercury, which corresponds
to the release of an electron from the K shell. It is also possible to release a less
tightly bound electron from the L shell (n = 2), in which case we would speak
of the L absorption edge. In mercury, the L absorption edge is about 14 keV.
(Because of the fine-structure splitting, there are actually three different states in
the L shell with slightly different energies.)
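The threshold behavior described above can be sketched in a few lines: a photon can eject an electron from a shell only if its energy is at least the binding energy of that shell. The edge energies are the approximate mercury values quoted in the text; treating the L shell as a single edge at 14 keV ignores its fine-structure splitting, and the names and trial wavelengths are illustrative assumptions.

```python
HC_KEV_NM = 1.240   # hc in keV·nm (rounded)

# Approximate absorption edges of mercury quoted in the text, in keV.
# The L shell is treated as one level; its three sublevels are ignored here.
MERCURY_EDGES_KEV = {"K": 83.1, "L": 14.0}


def photon_energy_kev(wavelength_nm):
    """Photon energy in keV for a wavelength given in nm."""
    return HC_KEV_NM / wavelength_nm


def shells_ionized(photon_kev, edges_kev=MERCURY_EDGES_KEV):
    """Return the shells whose binding energy the photon can supply."""
    return [shell for shell, edge in edges_kev.items() if photon_kev >= edge]


# Trial wavelengths chosen only to fall on either side of the two edges.
for wavelength_nm in (0.0100, 0.0149, 0.0500, 0.2000):
    energy = photon_energy_kev(wavelength_nm)
    print(f"{wavelength_nm:.4f} nm -> {energy:6.1f} keV, "
          f"can ionize: {shells_ionized(energy)}")
```

Photons with wavelengths longer than the K edge can still be absorbed by releasing L electrons, which is why the absorption drops at the edge rather than vanishing.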
Figure 8.13 shows the K absorption edges of the elements. There is a very
noticeable difference between the data shown in Figure 8.13 and those shown in
Figures 8.7–8.10: the K absorption edges show no evidence for any shell effects.
Instead, there is a smooth dependence on the atomic number over the entire range
of elements. As the nuclear charge increases, the 1s electrons are pulled into
smaller and more tightly bound orbits, but this is a gradual process that is largely
unaffected by the stacking of electrons into higher energy shells. There are no
sudden changes in the 1s properties as a higher shell is filled and a still higher
shell begins filling.
FIGURE 8.12 Absorption of photons by a thin film of mercury as a function of the photon wavelength. [Plot of absorption versus photon wavelength, with the edge marked at 0.0149 nm.]
X-Ray Transitions
X rays, as we discussed in Chapter 3, are electromagnetic radiations with
wavelengths from approximately 0.01 to 10 nm (energies from 100 eV to 100 keV).
In Chapter 3 we discussed the continuous X-ray spectrum emitted by accelerated
electrons. In this section we are concerned with the discrete X-ray line spectra
emitted by atoms.
X rays are emitted in transitions between the more tightly bound inner electron
energy levels of an atom. Under normal conditions all of the inner shells of an
atom are filled, so X-ray transitions do not occur between these levels. However,
when we remove one of the inner electrons, such as by ejecting a K electron
following electron scattering or a photoelectric process, an electron from a higher
subshell will rapidly make a transition to fill that vacancy, emitting an X-ray
photon in the process. The energy of the photon is equal to the energy difference
of the initial and final atomic levels of the electron that makes the transition.
When we remove a 1s electron, we are creating a vacancy in the K shell. The X
rays that are emitted in the process of filling this vacancy are known as K-shell X
rays, or simply K X rays. (These X rays are emitted in transitions that come from
the L, M, N, . . . shells, but they are known by the vacancy that they fill, not by the
shell from which they originate.) The K X ray that originates with the n = 2 shell
(L shell) is known as the Kα X ray, and the K X rays originating from the M shell
are known as Kβ X rays. Figure 8.14 illustrates these transitions.
If the bombarding electrons or photons knock loose an electron from the L
shell, electrons from higher levels will drop down to fill this vacancy. The photons
emitted in these transitions are known as L X rays. The lowest-energy X ray of the
L series is known as Lα , and the other L X rays are labeled in order of increasing
energy as shown in Figure 8.14.
It is possible to have an L X ray emitted directly following the Kα X ray. A
vacancy in the K shell can be filled by a transition from the L shell, with the
emission of the Kα X ray. However, the electron that made the jump from the L
shell left a vacancy there, which can be filled by an electron from a higher shell,
with the accompanying emission of an L X ray.
In a similar manner, we label the other X-ray series by M, N, and so forth.
Figure 8.15 shows a sample X-ray spectrum emitted by silver.
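FIGURE 8.14 X-ray series. [Energy-level diagram showing the K shell (n = 1), L shell (n = 2), M shell (n = 3), and N shell (n = 4), with the K series (Kα, Kβ, Kγ, Kδ), L series (Lα, Lβ, Lγ), and M series (Mα, Mβ) transitions.]
Moseley’s Law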
Let us consider in more detail the Kα X ray, which (as shown in Figure 8.14) is
emitted when an electron from the L shell drops down to fill a vacancy in the
K shell. An electron in the L shell is normally screened by the two 1s electrons,
and so it sees an effective nuclear charge of $Z_{\text{eff}} = Z - 2$. When one of those
1s electrons is removed in the creation of a K-shell vacancy, only the remaining
single 1s electron shields the L shell, and so $Z_{\text{eff}} = Z - 1$. (In this calculation, we
neglect the small screening effect of the outer electrons; their probability densities
are not zero within the L-shell orbits, but they are sufficiently small that their
effect on $Z_{\text{eff}}$ can be neglected.) To a very good approximation, the Kα X ray
can thus be analyzed as a transition from the n = 2 level to the n = 1 level in a
one-electron atom with $Z_{\text{eff}} = Z - 1$. Using Eq. 6.38 for the Bohr atom, we can
find the energy of the Kα transition in an atom of atomic number Z:
$$\Delta E = E_2 - E_1 = (-13.6\ \text{eV})(Z - 1)^2 \left( \frac{1}{2^2} - \frac{1}{1^2} \right) = (10.2\ \text{eV})(Z - 1)^2 \tag{8.4}$$
FIGURE 8.15 Characteristic X-ray spectrum of silver, such as might be produced by 30 keV electrons striking a silver target. The continuous distribution is a bremsstrahlung spectrum. [Plot of intensity versus wavelength (nm), showing the Kα, Kβ, Kγ, Lα, Lβ, and Mα lines.]
Just as was the case for the K absorption edge, the energies of the Kα X
rays vary smoothly with atomic number and show no effects of atomic shells. If
we plot $\sqrt{\Delta E}$ as a function of Z, we expect to obtain a straight line with slope
$(10.2\ \text{eV})^{1/2} = 3.19\ \text{eV}^{1/2}$. Figure 8.16 is an example of such a plot. The measured
slope is $3.22\ \text{eV}^{1/2}$, in excellent agreement with what is expected from Eq. 8.4. The
straight line intersects the x axis at a value very close to 1, as we expect from Eq. 8.4.
This method gives us a powerful and direct way to determine the atomic
number Z of an atom, as was first demonstrated in 1913 by the British physicist H.
G. J. Moseley, who measured the Kα (and other) X-ray energies of the elements
and thus determined their atomic numbers. The dependence of the X-ray energies
on Z given by Eq. 8.4 is known as Moseley’s law. Moseley was the first to
demonstrate the type of linear relationship shown in Figure 8.16; such graphs
are now known as Moseley plots. His discovery provided the first direct means
of measuring the atomic numbers of the elements. Previously, the elements had
been ordered in the periodic table according to increasing mass. Moseley found
certain elements listed out of order, in which the element of higher Z happened to
have the smaller mass (for example, cobalt and nickel or iodine and tellurium).
He also found gaps corresponding to yet undiscovered elements; for example, the
naturally radioactive element technetium (Z = 43) does not exist in nature and
was not known at the time of Moseley’s work, but Moseley showed the existence
of such a gap at Z = 43.
The straight-line plot of Figure 8.16 is independent of our assumption regarding
the exact value of the screening correction. That is, we could have written
$Z_{\text{eff}} = Z - k$, where k is some unknown number, probably close to 1. The only
change in our plot would be in the intercept. We would still have a straight line
with the same slope.
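The claim that the screening constant k affects only the intercept can be checked numerically. The sketch below generates values of the square root of the Kα energy from Eq. 8.4 with Z − 1 replaced by Z − k, then fits a straight line by ordinary least squares. The points are model values computed from the formula, not measured data, and the function names and the choice k = 1.2 are illustrative assumptions.

```python
import math


def sqrt_k_alpha_energy(z, k=1.0):
    """sqrt(Delta E) in eV^(1/2) from Eq. 8.4 with Z_eff = Z - k."""
    return math.sqrt(10.2) * (z - k)


def fit_line(xs, ys):
    """Ordinary least-squares slope and x-axis intercept of y versus x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    x_intercept = mean_x - mean_y / slope
    return slope, x_intercept


zs = list(range(10, 61, 10))   # model atomic numbers 10, 20, ..., 60
for k in (1.0, 1.2):
    slope, x_intercept = fit_line(zs, [sqrt_k_alpha_energy(z, k) for z in zs])
    print(f"k = {k}: slope = {slope:.2f} eV^(1/2), x intercept = {x_intercept:.2f}")
```

Both fits return the same slope, and the x-axis intercept simply reproduces the value of k that was assumed.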
Moseley’s work was of great importance in the development of atomic physics.
Working in the same year as Rutherford and Bohr, Moseley not only provided
confirmation of the Rutherford-Bohr model but also demonstrated a direct link
between atomic structure and the periodic table, which was previously a rather
arbitrary ordering scheme of the elements but subsequently became a classification
based on their electronic configurations.
FIGURE 8.16 Moseley plot of square root of Kα X-ray energy as a function of atomic number. [Plot of $\sqrt{\Delta E}$ ($\text{eV}^{1/2}$) versus atomic number Z; slope = 3.22, intercept = 1.]
Henry G. J. Moseley (1887–1915, England). His work on X-ray spectra provided the first link between the chemical periodic table and atomic physics, but his brilliant career was cut short when he died on a World War I battlefield.
Example 8.5
Compute the energy of the Kα X ray of sodium (Z = 11).
Solution
The energy can be found with the help of Eq. 8.4,
$$\Delta E = (10.2\ \text{eV})(Z - 1)^2 = (10.2\ \text{eV})(10)^2 = 1.02\ \text{keV}$$
The measured value is 1.04 keV. The small discrepancy may be due to the
screening correction in $Z_{\text{eff}}$, which is not exactly equal to 1.
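Moseley's method runs this calculation in reverse: a measured Kα energy gives an estimate of Z. The following sketch inverts Eq. 8.4 under the same assumption of a screening correction of exactly 1. The function names are illustrative; the measured energies are the sodium value quoted above and the silver value quoted in Example 8.6.

```python
import math


def k_alpha_energy_ev(z):
    """K-alpha energy in eV from Eq. 8.4."""
    return 10.2 * (z - 1) ** 2


def atomic_number_from_k_alpha(energy_ev):
    """Estimate Z by inverting Eq. 8.4 and rounding to the nearest integer."""
    return round(1 + math.sqrt(energy_ev / 10.2))


print(f"Na predicted K-alpha: {k_alpha_energy_ev(11) / 1000:.2f} keV")
print("Z from 1.04 keV  :", atomic_number_from_k_alpha(1040.0))
print("Z from 21.990 keV:", atomic_number_from_k_alpha(21990.0))
```

Rounding absorbs the small error from the approximate screening correction, so the estimate lands on the correct integer for both sodium and silver.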
Example 8.6
Some measured X-ray energies in silver (Z = 47) are
E(Kα ) = 21.990 keV and E(Kβ ) = 25.145 keV. The
binding energy of the K electron in silver is E(K) =
25.514 keV. From these data, find: (a) the energy of the Lα
X ray, and (b) the binding energy of the L electron.
Solution
(a) From Figure 8.14, we see that the energies are related by:
$$E(L_\alpha) + E(K_\alpha) = E(K_\beta)$$
or
$$E(L_\alpha) = E(K_\beta) - E(K_\alpha) = 25.145\ \text{keV} - 21.990\ \text{keV} = 3.155\ \text{keV}$$
(b) Again from Figure 8.14, we see that
$$E(K_\alpha) = E(L) - E(K)$$
or
$$E(L) = E(K) + E(K_\alpha) = -25.514\ \text{keV} + 21.990\ \text{keV} = -3.524\ \text{keV}$$
The binding energy of the L electron is therefore 3.524 keV.
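The bookkeeping in Example 8.6 amounts to placing the K, L, and M levels on a common energy scale and taking differences. The sketch below does this with the silver values quoted in the example; the variable names are illustrative, and level energies are written as negative binding energies, as in part (b).

```python
# Measured values for silver quoted in Example 8.6, in keV.
E_K_ALPHA = 21.990   # photon from the L shell -> K shell transition
E_K_BETA = 25.145    # photon from the M shell -> K shell transition
BINDING_K = 25.514   # binding energy of a K electron

# Level energies are the negatives of the binding energies.
level_k = -BINDING_K
level_l = level_k + E_K_ALPHA   # since E(K-alpha) = E(L) - E(K)
level_m = level_k + E_K_BETA    # since E(K-beta)  = E(M) - E(K)

# The L-alpha X ray comes from the M shell -> L shell transition.
e_l_alpha = level_m - level_l

print(f"E(L-alpha)       = {e_l_alpha:.3f} keV")
print(f"L binding energy = {-level_l:.3f} keV")
```

Writing the levels explicitly makes it clear why the K binding energy cancels in part (a) but is needed in part (b).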
*8.6 ADDITION OF ANGULAR MOMENTA
The properties of an alkali atom such as sodium are determined primarily by the
single outer electron; if that electron has quantum numbers $(n, l, m_l, m_s)$ then the
entire atom behaves as if it had those same quantum numbers. In atoms with
several electrons outside of filled subshells, this is not the case. For example,
the electronic configuration of carbon (Z = 6) is 1s² 2s² 2p². To find the angular
momentum of carbon, we must combine the angular momenta of the two 2p
electrons to find the total orbital angular momentum quantum number L and total
magnetic quantum number $M_L$ that characterize the entire atom.
*This is an optional section that may be skipped without loss of continuity.
Suppose we have an atom with two electrons outside of filled subshells. These
electrons have quantum numbers $(n_1, l_1, m_{l1}, m_{s1})$ and $(n_2, l_2, m_{l2}, m_{s2})$. The total
orbital angular momentum of the atom is determined by the vector sum of the
orbital angular momenta of the two electrons:
$$\vec{L} = \vec{L}_1 + \vec{L}_2 \tag{8.5}$$
Each vector is related to its corresponding angular momentum quantum
number by
$$|\vec{L}| = \sqrt{L(L + 1)}\,\hbar \qquad |\vec{L}_1| = \sqrt{l_1(l_1 + 1)}\,\hbar \qquad |\vec{L}_2| = \sqrt{l_2(l_2 + 1)}\,\hbar \tag{8.6}$$
These vectors do not add like ordinary vectors, but have special addition rules
associated with quantized angular momentum. These rules enable us to find L and
its associated magnetic quantum number ML .
1. The maximum value of the total orbital angular momentum quantum
number is
$$L_{\max} = l_1 + l_2 \tag{8.7}$$
2. The minimum value of the total orbital angular momentum quantum
number is
$$L_{\min} = |l_1 - l_2| \tag{8.8}$$
3. The permitted values of L range from $L_{\min}$ to $L_{\max}$ in integer steps:
$$L = L_{\min},\ L_{\min} + 1,\ L_{\min} + 2,\ \ldots,\ L_{\max} \tag{8.9}$$
4. The z component of the total angular momentum vector is found from the sum
of the z components of the individual vectors:
$$L_z = L_{1z} + L_{2z} \tag{8.10}$$
or, in terms of the magnetic quantum numbers,
$$M_L = m_{l1} + m_{l2} \tag{8.11}$$
The permitted values of the total magnetic quantum number $M_L$ range from
−L to +L in integer steps:
$$M_L = -L,\ -L + 1,\ \ldots,\ -1,\ 0,\ +1,\ \ldots,\ L - 1,\ L \tag{8.12}$$
An identical set of rules holds for coupling the spin angular momentum vectors
to give the total spin angular momentum $\vec{S}$. For two electrons, each of which has
s = 1/2, the total spin quantum number S can be 0 or 1.
All filled subshells have L = 0 and S = 0, so we don’t need to consider filled
subshells in analyzing the angular momentum of an atom. For this reason, filled
subshells ordinarily do not contribute to the magnetic properties of atoms.
For coupling more than two electrons, the procedure is first to couple the
angular momenta of two electrons to give the maximum and minimum values of
L. Then couple each allowed L to the angular momentum of the third electron to
find the largest maximum and smallest minimum. This continues for all of the
electrons in the unfilled subshell.
Example 8.7
Find the total orbital and spin quantum numbers for
carbon.
Solution
Carbon has two 2p electrons outside filled subshells. Each
of these electrons has l = 1. According to the rules for
adding angular momenta, we have
$$L_{\max} = 1 + 1 = 2, \qquad L_{\min} = |1 - 1| = 0$$
Thus L = 0, 1, or 2. For the spin angular momentum, we
have
$$S_{\max} = \tfrac{1}{2} + \tfrac{1}{2} = 1, \qquad S_{\min} = \left| \tfrac{1}{2} - \tfrac{1}{2} \right| = 0$$
and so S = 0 or 1. Some combinations of L and S might
be forbidden by the Pauli principle. For example, to obtain
L = 2, the two electrons must both have $m_l = +1$. The two
electrons must therefore have different values of $m_s$, so
S = 1 is not allowed when L = 2.
Example 8.8
Find the total orbital and spin quantum numbers for
nitrogen.
Solution
Nitrogen has three 2p electrons, each with l = 1, outside
filled subshells. If we add the first two, we get $L_{\max} = 2$ and
$L_{\min} = 0$, as in Example 8.7, so that L = 0, 1, or 2. We now
couple the third l = 1 electron to each of these values to find
the largest maximum and smallest minimum, which give
$$L_{\max} = 2 + 1 = 3, \qquad L_{\min} = |1 - 1| = 0$$
and so L = 0, 1, 2, or 3. For the spin vectors, we again
couple the first two to give $S_{\max} = 1$ and $S_{\min} = 0$. Adding
the third s = 1/2 electron, we have
$$S_{\max} = 1 + \tfrac{1}{2} = \tfrac{3}{2}, \qquad S_{\min} = \left| 0 - \tfrac{1}{2} \right| = \tfrac{1}{2}$$
The resulting values of S are 1/2 and 3/2 (from the minimum to the maximum in integer steps). Once again, the
Pauli principle may forbid certain combinations of L and
S. The state with L = 3 cannot exist at all, because all three
electrons must have $m_l = +1$, and assigning $m_s$ quantum
numbers will then result in two electrons with the same $m_l$
and $m_s$, which is forbidden by the Pauli principle.
The two 2p electrons of carbon can combine to give L = 0, 1, or 2 and S = 0 or
1. The ground state of carbon will be identified by only one particular choice of L
and S. How do we know which of these combinations will be the ground state? The
rules for finding the ground state quantum numbers are known as Hund’s rules:
1. First find the maximum value of the total spin magnetic quantum number MS
consistent with the Pauli principle. Then
S = MS,max
(8.13)
2. Next, for that MS , find the maximum value of ML consistent with the Pauli
principle. Then
L = ML,max
(8.14)
In the case of carbon, the maximum value of MS is +1, obtained when the two
valence electrons both have ms = +1/2. Thus S = 1. With only two electrons in
the 2p subshell, the Pauli principle places no restrictions on S; in fact, three electrons
in the 2p subshell can be assigned ms = +1/2. Our next task is to find the maximum
value of ML . The maximum value of ml for the first p electron is +1. The second
p electron cannot also have ml = +1, because that would give both electrons the
same set of quantum numbers, in violation of the Pauli principle. The maximum
value of ml for the second electron is 0, so ML,max = +1 and L = 1. The ground
state of carbon is therefore characterized by S = 1 and L = 1.
Example 8.9
Use Hund’s rules to find the ground-state quantum numbers
of nitrogen.
Solution
The electronic configuration of nitrogen is 1s²2s²2p³. We
begin by maximizing the total MS for the three 2p electrons. Three electrons in the p subshell are permitted by
the Pauli principle to have ms = + 1/2, so the maximum
value of MS is 3/2, and therefore S is 3/2. Each of the three
electrons has quantum numbers (2, 1, ml , + 1/2). To maximize ML we assign the first electron the maximum value
of ml —namely, +1. The maximum value of ml left for
the second electron is 0, and the third electron must therefore have ml = −1. The total ML is 1 + 0 + (−1) = 0, so
L = 0. Thus L = 0, S = 3/2 are the ground-state quantum
numbers for nitrogen.
Example 8.10
Find the ground-state L and S of oxygen (Z = 8).
Solution
The electronic configuration of oxygen is 1s²2s²2p⁴.
Because only three electrons in the p subshell can
have ms = + 1/2, the fourth must have ms = − 1/2, so
MS,max = 1/2 + 1/2 + 1/2 + (− 1/2) = +1, and it follows
that S = 1. To find L, we note that, as for nitrogen, the three
electrons with ms = + 1/2 have ml = +1, 0, and −1, and
we maximize ML by giving the fourth electron ml = +1.
Thus ML,max = +1, and L = 1.
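The procedure used for carbon and in Examples 8.9 and 8.10 can be written as a short routine. This is a minimal sketch assuming a single unfilled subshell of equivalent electrons; the function name `hund_ground_state` is hypothetical, and the routine applies only the two rules stated above (maximize MS, then maximize ML).

```python
from fractions import Fraction

def hund_ground_state(l, n_electrons):
    """Return (L, S) for n equivalent electrons in a subshell with orbital quantum number l."""
    n_orbitals = 2 * l + 1
    if not 0 <= n_electrons <= 2 * n_orbitals:
        raise ValueError("a subshell holds at most 2(2l + 1) electrons")
    ml_values = list(range(l, -l - 1, -1))   # +l first, so that ML comes out as large as possible
    n_up = min(n_electrons, n_orbitals)      # rule 1: as many ms = +1/2 as the Pauli principle allows
    n_down = n_electrons - n_up              # the remaining electrons must take ms = -1/2
    S = Fraction(n_up - n_down, 2)           # S = MS,max
    L = sum(ml_values[:n_up]) + sum(ml_values[:n_down])   # rule 2: L = ML,max
    return L, S

for name, n in (("carbon", 2), ("nitrogen", 3), ("oxygen", 4)):
    L, S = hund_ground_state(1, n)
    print(f"{name}: L = {L}, S = {S}")
```

Each spin direction fills ml = +l, +l − 1, ... in turn, which is exactly the assignment made by hand in the examples.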
Let us look now at the energy levels of helium. The ground-state configuration
of helium is 1s². Both electrons are s electrons, with l = 0, and so the only possible
value of L is zero. Because both electrons have ml = 0, the Pauli principle requires
that the spin of the two electrons be opposite, so that one has ms = +1/2 and the
other has ms = −1/2. The only possible total MS is therefore zero, so the ground
state of helium has L = 0 and S = 0. The first excited state has configuration
1s¹2s¹. Both electrons still have l = 0, so we must again have L = 0. However,
the total spin S can now be 0 or 1, because the Pauli principle does not restrict ms
in this case: the two electrons already have different principal quantum numbers
n, and so there is nothing to prevent them from having the same ms. There are,
therefore, two “first excited states” of helium, one with L = 0 and S = 0, and
another with L = 0 and S = 1. (Both of these states have configuration 1s¹2s¹.)
A state with S = 0 is called a singlet state (because there is only a single possible
MS value), and a state with S = 1 is called a triplet state (because there are three
possible MS values: +1, 0, −1).
The classification of states into singlet and triplet is important when we
consider the selection rules for transitions between states; these selection rules
tell us which transitions are allowed (and therefore likely) to occur and which are
not. The selection rules, which involve both L and S, are
ΔL = 0, ±1
(8.15)
ΔS = 0
(8.16)
(There are no selection rules for n.) Of course, the selection rule Δl = ±1 for the
single electron that makes the transition still applies. For the two 1s¹2s¹ states
of helium, the Δl rule does not permit either state to make transitions to the 1s²
ground state (2s to 1s would be Δl = 0), and in addition, the ΔS rule forbids the
triplet (S = 1) states from decaying to the S = 0 ground state. These transitions
can thus occur only by violating these selection rules. Because that is a very
unlikely event, the transitions occur with very low probability. Energy levels that
have a low probability of decay must “live” for a long time before they decay;
such states are known as metastable states.
FIGURE 8.17 Energy-level diagram for helium. The states are grouped
into singlets (S = 0) and triplets (S = 1). Some of the transitions in
the optical and ultraviolet regions are shown. Transitions marked with
an X would violate the Δl = ±1 selection rule. [Vertical axis: Energy (eV).
Levels shown: 1s² and, for both S = 0 and S = 1, 1s¹2s¹, 1s¹2p¹, 1s¹3s¹,
1s¹3p¹, 1s¹3d¹, 1s¹4s¹, 1s¹4p¹, 1s¹4d¹.]
FIGURE 8.18 Energy-level diagram for carbon. Each group of levels is
labeled with the electron configuration. Each individual level is labeled
with the total L and S. [Vertical axis: Energy (eV). Configurations shown:
2p², 2p¹3s¹, 2p¹3p¹, 2p¹3d¹.]
Figure 8.17 shows the energy levels and transitions in helium. The singlet
and triplet levels are grouped separately, because transitions between singlet and
triplet levels would violate the ΔS = 0 selection rule.
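The selection rules can be checked mechanically. The following sketch is illustrative only: the function name `transition_allowed` is hypothetical, and it tests the three conditions discussed in this section (the change in L is 0 or ±1, the change in S is 0, and the change in l of the single electron that jumps is ±1).

```python
def transition_allowed(delta_L, delta_S, delta_l):
    """True if a transition satisfies Eqs. 8.15 and 8.16 and the single-electron rule on l."""
    return delta_L in (-1, 0, 1) and delta_S == 0 and delta_l in (-1, 1)

# Helium 1s2s (L = 0, S = 0) -> 1s2 (L = 0, S = 0): the electron goes 2s -> 1s, so its l does not change.
print(transition_allowed(delta_L=0, delta_S=0, delta_l=0))     # False

# Helium 1s2s (L = 0, S = 1) -> 1s2 (L = 0, S = 0): fails both the l rule and the S rule.
print(transition_allowed(delta_L=0, delta_S=-1, delta_l=0))    # False

# Helium 1s2p (L = 1, S = 0) -> 1s2 (L = 0, S = 0): the electron goes 2p -> 1s.
print(transition_allowed(delta_L=-1, delta_S=0, delta_l=-1))   # True
```

The first two cases are the metastable 1s2s states of helium; the third is an ordinary allowed decay of a singlet state.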
Figure 8.18 shows the energy-level diagram of carbon. Notice the increasing
complexity of the diagram, compared with the alkali metals and even with
helium. This follows from the coupling of two electrons, both of whose l values
may be different from zero. We have already discussed how the 2p² configuration
can give L = 0, 1, or 2 and S = 0 or 1. Only one of these (L = 1, S = 1) is
the ground state of carbon; the others are excited states. More excited states
can be obtained by promoting one of the 2p electrons to a higher level, giving
configurations of 2p¹3s¹ (L = 1, S = 0 or 1), 2p¹3p¹ (L = 0, 1, or 2; S = 0 or
1), 2p¹3d¹ (L = 1, 2, or 3; S = 0 or 1), and so forth. Imagine the difficulty of
analyzing the energy level diagram of the rare earths or actinides, which have f
subshells (l = 3) with as many as 14 electrons!
8.7 LASERS
There are three means by which radiation can interact with the energy levels of
atoms (depicted in Figure 8.19). The first two we have already discussed. In the
first kind of interaction, an atom in an excited state makes a transition to a lower
state, with the emission of a photon. (In all the examples we consider here, the
photon energy is equal to the energy difference of the two atomic states.) This is
spontaneous emission, which we represent as
atom∗ → atom + photon
where the asterisk indicates an excited state.
The second interaction, induced absorption, is responsible for absorption
spectra and resonance absorption. An atom in the ground state absorbs a photon
(of the proper energy) and makes a transition to an excited state. Symbolically:
atom + photon → atom∗
The third interaction, which is responsible for the operation of the laser, is
induced (or stimulated) emission. In this process, an atom is initially in an excited
state. A passing photon of just the right energy (again, equal to the energy
difference of the two levels) induces the atom to emit a photon and make a
transition to the lower, or ground, state. (Of course, it would eventually have
made that transition left on its own, but it makes it sooner after being prodded by
the passing photon.) Symbolically,
atom∗ + photon → atom + 2 photons
The significant detail is that the two photons that emerge are traveling in exactly the
same direction with exactly the same energy, and the associated electromagnetic
waves are perfectly in phase (coherent).
Suppose we have a collection of atoms, all in the same excited state, as shown
in Figure 8.20. A photon passes the first atom, causing induced emission and
resulting in two photons. Each of these two photons causes an induced emission
process, resulting in four photons. This process continues, doubling the number
of photons at each step, until we build up an intense beam of photons, all coherent
and moving in the same direction. In its simplest interpretation, this is the basis
of operation of the laser. (The word laser is an acronym for Light Amplification
by Stimulated Emission of Radiation.)
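The doubling argument is easy to put into numbers. The sketch below assumes the idealized picture just described, in which every photon triggers exactly one induced emission at each step; the function names are hypothetical.

```python
def photons_after(steps, start=1):
    """Number of photons after a given number of doubling steps."""
    return start * 2 ** steps

def steps_to_reach(target, start=1):
    """Smallest number of doubling steps needed to reach at least `target` photons."""
    steps, photons = 0, start
    while photons < target:
        photons *= 2
        steps += 1
    return steps

print(photons_after(10))        # 1024
print(steps_to_reach(10**15))   # 50
```

Only about 50 doublings would turn a single photon into 10¹⁵ photons, which is why the buildup of the beam can be so rapid in this simple model.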
This simple model for a laser will not work, for several reasons. First, it
is difficult to keep a collection of atoms in their excited states until they are
stimulated to emit the photon (we don’t want any spontaneous emission). A
second reason is that atoms that happen to be in their ground state undergo
absorption and thus remove photons from the beam as it builds up.
FIGURE 8.19 Interactions of radiation with atomic energy levels. [Panels:
spontaneous emission, induced absorption, induced emission.]
FIGURE 8.20 Buildup of intense beam in a laser. Each emitted
photon interacts with an excited atom and produces two photons.
FIGURE 8.21 A three-level atom. [Labels: short-lived state, metastable
state, ground state; pumping; lasing transition.]
FIGURE 8.22 A four-level atom. [Labels: short-lived state, metastable
state, short-lived state, ground state; pumping; lasing transition.]
To solve these problems, we must achieve a population inversion—in a
collection of atoms, there must be more atoms in the upper state than in the lower
state. This is called an “inversion” because under normal conditions at thermal
equilibrium, the lower state always has the greater population. The “inversion” is
thus an unnatural situation that must be achieved by artificial means, because it is
essential for the operation of the laser.
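To see how strongly thermal equilibrium favors the lower state, one can evaluate the Boltzmann factor for two levels. This sketch rests on assumptions that are not part of this section: the populations are taken to follow N_upper/N_lower = exp(−ΔE/kT), the levels are treated as nondegenerate, and the energy gap of 1.96 eV is only an illustrative value for a visible-light transition.

```python
import math

K_BOLTZMANN_EV_PER_K = 8.617e-5   # Boltzmann constant in eV/K

def equilibrium_ratio(delta_e_ev, temperature_k):
    """N_upper / N_lower for two nondegenerate levels in thermal equilibrium (assumed Boltzmann factor)."""
    return math.exp(-delta_e_ev / (K_BOLTZMANN_EV_PER_K * temperature_k))

for temperature in (300, 3000, 30000):
    ratio = equilibrium_ratio(1.96, temperature)
    print(f"T = {temperature:>5} K: N_upper/N_lower = {ratio:.3e}")
```

At every temperature the ratio stays below 1, so heating alone cannot produce more atoms in the upper state than in the lower one.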
The first laser, which was constructed by T. H. Maiman in 1960, was based on a
three-level atom (Figure 8.21). The laser medium is a solid ruby rod, in which the
chromium atoms are responsible for the action of the laser. The atoms, originally
in the ground state, are “pumped” into the excited state by an external source
of energy (a burst of light from a flash lamp that surrounds the ruby rod). The
excited state decays very rapidly (by spontaneous emission) to a lower excited
state, which is a metastable state—the atom remains in that level for a relatively
long time, perhaps 10⁻³ s, compared with 10⁻⁸ s for the short-lived states. The
transition from the metastable state to the ground state is the “lasing” transition,
resulting from stimulated emission by a passing photon.
If the pumping action is successful, there are more atoms in the metastable
state than in the ground state, and we have achieved a population inversion.
However, as the lasing transition occurs, the population of the ground state is
increased, thereby upsetting the population inversion. The growing population
of the ground state can absorb photons at the energy of the lasing transition,
removing photons that might otherwise contribute to the lasing action.
The four-level laser illustrated in Figure 8.22 relieves this remaining difficulty.
The ground state is pumped to an excited state that decays rapidly to the
metastable state, as with the three-level laser. The lasing transition proceeds from
the metastable state to yet another excited state, which in turn decays rapidly to
the ground state. The atom in its ground state thus cannot absorb at the energy of
the lasing transition, and we have a workable laser. Because the lower short-lived
state decays rapidly, its population is always smaller than that of the metastable
state, which maintains the population inversion.
A common example of the four-level laser is the familiar helium-neon laser,
which operates with a mixture of helium and neon gas (about 90% helium). The
important energy levels of He and Ne are shown in Figure 8.23. An electrical
current in the gas “pumps” the helium from its ground state to the excited state
at an energy of about 20.6 eV. This is a metastable state of helium—the atom
remains in that state for a relatively long time because a 2s electron is not permitted
to return to the 1s level by photon emission. Occasionally, an excited helium
atom collides with a ground-state neon atom. When this occurs, the 20.6 eV of
excitation energy may be transferred to the neon atom, because neon happens to
have an excited state at 20.6 eV, and the helium atom returns to its ground state.
Symbolically,
helium∗ + neon → helium + neon∗
where the excited state is indicated by the asterisk. The excited state of neon
corresponds to removing one electron from the filled 2p subshell and promoting
it to the 5s subshell. From there it decays to the 3p level and eventually returns to
the 2p ground state. Figure 8.23 illustrates this sequence of events and the level
schemes. (The level shown with a dashed line, the neon 3s level, is not important
for the basic operation of the laser, but it is necessary as an intermediate step in
the return to the neon ground state, because the Δl = 0 transition 3p → 2p is not
allowed, but the sequence 3p → 3s → 2p is permitted.)
FIGURE 8.23 Sequence of transitions in a He-Ne laser. [Levels shown: He 1s² (0)
and 1s¹2s¹ (20.61 eV); Ne 2p⁶ (0), 2p⁵3s¹, 2p⁵3p¹ (18.70 eV), and 2p⁵5s¹ (20.66 eV).
A collision transfers the He excitation to the Ne 2p⁵5s¹ level; the
2p⁵5s¹ → 2p⁵3p¹ transition emits at 632.8 nm.]
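The level energies labeled in Figure 8.23 can be checked against the wavelength of the lasing transition. The sketch below assumes the relation E = hc/λ with hc ≈ 1239.84 eV·nm; the function name is hypothetical, and the level energies 20.66 eV and 18.70 eV are the values labeled in the figure.

```python
HC_EV_NM = 1239.84   # h*c in eV·nm

def photon_energy_ev(wavelength_nm):
    """Photon energy in eV for a wavelength given in nm."""
    return HC_EV_NM / wavelength_nm

upper_ev, lower_ev = 20.66, 18.70   # neon 5s and 3p levels labeled in Figure 8.23
print(f"photon energy at 632.8 nm: {photon_energy_ev(632.8):.3f} eV")   # 1.959 eV
print(f"level difference:          {upper_ev - lower_ev:.2f} eV")       # 1.96 eV
```

The two numbers agree to within the precision of the quoted level energies, as expected if the photon carries away the energy difference between the 5s and 3p states.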
At any given time, there are more neon atoms in the 5s state than in the 3p
state, because the good energy matchup of the 5s state with the helium excited
state gives a high probability of the 5s state in neon being excited. The 3p state,
on the other hand, decays rapidly. This provides the population inversion that is
needed for the laser.
In the helium-neon laser, the gases are enclosed in a narrow tube (Figure 8.24).
Occasionally a neon atom in the 5s state spontaneously emits a photon (at a
wavelength of 632.8 nm) parallel to the axis of the tube. This photon causes
stimulated emission by other atoms, and a beam of coherent (in-phase) radiation
eventually builds up traveling along the tube axis. Mirrors are carefully aligned at
the ends of the tube to help in the formation of the coherent wave, as it bounces
back and forth between the two ends of the tube, causing additional stimulated
emission. One of the mirrors is only partially silvered, allowing a portion of the
beam to escape through one end.
FIGURE 8.24 Schematic diagram of a He-Ne laser. [Labels: V, fully silvered
mirror, partially silvered mirror, laser beam.]
The laser is not a particularly efficient device; the small helium-neon lasers
you have probably seen used for laboratory or demonstration experiments have
a light output of perhaps a few milliwatts; the electric power required to operate
such a device may be of the order of 10 to 100 W, and thus the efficiency
(power out ÷ power in) of such a device is only about 10⁻⁴ to 10⁻⁵. It is the
coherence and directionality of the laser beam and its energy density that make
the laser such a useful device: its power can be concentrated in a beam only
a few millimeters in diameter, and thus even a small laser can deliver 100 to
1000 W/m². Larger lasers in the megawatt (10⁶ W) range are presently readily
available, and research laboratories are using lasers in the 100 terawatt (10¹⁴ W)
range for special applications. These powerful lasers do not operate continuously,
but are instead pulsed, producing short (perhaps 10⁻⁹ s) pulses at rates of order
100 Hz. (Such a pulse is, in fact, an excellent example of a wave packet.)
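The efficiency and intensity estimates can be reproduced with a short calculation. The numbers used here (1 mW and 3 mW of light output, 10 W and 100 W of electrical input, a beam 3 mm in diameter) are illustrative assumptions chosen within the ranges quoted in the text, and the function names are hypothetical.

```python
import math

def efficiency(power_out_w, power_in_w):
    """Ratio of light power out to electrical power in."""
    return power_out_w / power_in_w

def beam_intensity(power_w, diameter_m):
    """Power per unit area (W/m^2) for a beam of circular cross section."""
    area = math.pi * (diameter_m / 2) ** 2
    return power_w / area

print(f"efficiency, 1 mW out for 10 W in:  {efficiency(1e-3, 10):.0e}")
print(f"efficiency, 1 mW out for 100 W in: {efficiency(1e-3, 100):.0e}")
print(f"intensity, 3 mW in a 3 mm beam:    {beam_intensity(3e-3, 3e-3):.0f} W/m^2")
```

With these inputs the efficiency falls between 10⁻⁵ and 10⁻⁴ and the intensity is a few hundred W/m², consistent with the ranges stated above.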
Chapter Summary
Pauli exclusion principle: No two electrons in a single atom can have the same set of quantum numbers (n, l, ml, ms). (Section 8.1)
Filling order of atomic subshells: 1s, 2s, 2p, 3s, 3p, 4s, 3d, 4p, 5s, 4d, 5p, 6s, 4f, 5d, 6p, 7s, 5f, 6d (Section 8.2)
Capacity of subshell nl: 2(2l + 1) (Section 8.2)
Energy of screened electron: En = (−13.6 eV) Zeff²/n² (Section 8.3)
Moseley’s law for Kα X rays: E = (10.2 eV)(Z − 1)² (Section 8.5)
Adding angular momenta l1, ml1 and l2, ml2: Lmax = l1 + l2, Lmin = |l1 − l2|, ML = ml1 + ml2 (Section 8.6)
Hund’s rules for ground state: First S = MS,max, then L = ML,max (Section 8.6)
Questions
1. Continue Figure 8.1 upward, showing the next two major
groups. What will be the atomic number of the next inert
gas below Rn? What will be the structure of the eighth row
(period) of the periodic table? Where do you expect the
first g subshell to begin filling? What properties would you
expect the g-subshell elements to have? What will be the
atomic number of the second inert gas below radon?
2. Why do the 4s and 3d subshells appear so close in energy,
when they belong to different principal quantum numbers n?
3. Would you expect element 107 to be a good conductor or
a poor conductor? How about element 111? Do you expect
element 112 to be paramagnetic or diamagnetic?
4. Zirconium frequently is present as an impurity in hafnium
metal. Why?
5. Do you expect ytterbium (Yb) to become ferromagnetic
at sufficiently low temperatures? What type of magnetic
behavior would be expected at ordinary temperatures for
polonium (Po)? For francium (Fr)?
6. As we move across the series of transition metal or rare
earth elements, we add electrons to the d or f subshells.
In chemical compounds, these elements often show valence
states of +2, which correspond to removing two s electrons.
Explain this apparent paradox.
7. Why do the rare earth (lanthanide) elements have such similar chemical properties? What property might you use to
distinguish lanthanide atoms from one another?
8. Explain why the Bohr theory gives a poor accounting of
optical transitions but does well in predicting the energies
of X-ray transitions.
9. What can you conclude about the electronic configuration of
an atom that has both L = 0 and S = 0 in the ground state?
10. Suppose we do a Stern-Gerlach experiment using an atom
that has angular momentum quantum numbers L and S in
its ground state. Into how many components will the beam
split? Do you expect them to be equally spaced?
11. What is the degeneracy of a state of total orbital angular
momentum L that has S = 0? What is the degeneracy of
a state of total spin angular momentum S that has L = 0?
What is the total degeneracy of a state in which both L and
S are nonzero?
12. What L and S values must an atom have in order to show the
normal Zeeman effect? Does this apply only to the ground
state or to excited states also? Can an atom show the normal
Zeeman effect in some transitions and the anomalous Zeeman effect in other transitions? Could the same atom even
show no Zeeman effect at all in some transitions?
13. Based on the rules for coupling electron l and s values to
give the total L and S, explain why filled subshells don’t
contribute to the magnetic properties of an atom.
14. If an atom in its ground state has S = 0, can you infer
whether it has an even or an odd number of electrons? What
if L = 0?
15. The L atomic shell actually contains three distinct levels:
a 2s level and two 2p levels (a fine-structure doublet). If
we look carefully at the Kα X ray under high resolution,
we see two, not three, different components. Explain this
discrepancy.
16. The Kα energies computed using Eq. 8.4 are about 0.1%
low for Z = 20, 1% low for Z = 40, and 10% low for
Z = 80. Why does the simple theory fail for large Z? Could
it be because the screening effect has not been handled
correctly and that Zeff is not Z − 1? If not, can you suggest
an alternative reason?
17. The first excited state in sodium is a fine-structure doublet;
the wavelengths emitted in the decay of these states
are 589.59 nm and 589.00 nm, a difference of 0.59 nm.
The excited 4s¹ state in sodium (see Figure 8.4) decays
to the 3p doublet with the emission of radiation at the
wavelengths 1138.15 nm and 1140.38 nm, a difference of
2.23 nm. Explain how the 3p fine structure can give a wavelength
difference of 0.59 nm in one case and 2.23 nm in the
other case.
18. Suppose we had a three-level atom, like that of Figure 8.21,
in which the metastable state were the higher excited state;
the lasing transition would then be the upper transition.
Does this atom solve the problem of absorption of the lasing
transition? Would such an atom make a good laser?
19. How does a laser beam differ from a point source of light?
Contrast the change in beam intensity with distance from
the source for a laser and a point source.
20. Explain what is meant by a population inversion and why it
is necessary for the operation of a laser.
21. How could you demonstrate that laser light is coherent?
What would be the result of the same experiment using an
ordinary monochromatic source? A white light source?
Problems
8.1 The Pauli Exclusion Principle
1. (a) List the six possible sets of quantum numbers (n, l, ml , ms )
of a 2p electron. (b) Suppose we have an atom such as carbon,
which has two 2p electrons. Ignoring the Pauli principle,
how many different possible combinations of quantum numbers of the two electrons are there? (c) How many of the
possible combinations of part (b) are eliminated by applying
the Pauli principle? (d) Suppose carbon is in an excited state
with configuration 2p¹3p¹. Does the Pauli principle restrict
the choice of quantum numbers for the electrons? How many
different sets of quantum numbers are possible for the two
electrons?
2. Nitrogen (Z = 7) has three electrons in the 2p level (in
addition to two electrons each in the 1s and 2s levels).
(a) Consistent with the Pauli principle, what is the maximum possible value of the total ms of all seven electrons?
(b) List the quantum numbers of the three 2p electrons that
result in the largest total ms . (c) If the electrons in the 2p
level occupy states that maximize ms , what would be the
maximum possible value for the total ml ? (d) What would
be the maximum possible total ml if the three 2p electrons
were in states that did not maximize ms ?
3. (a) How many different sets of quantum numbers
(n, l, ml , ms ) are possible for an electron in the 4f level?
(b) Suppose a certain atom has three electrons in the 4f
level. What is the maximum possible value of the total ms
of the three electrons? (c) What is the maximum possible
total ml of three 4f electrons? (d) Suppose an atom has ten
electrons in the 4f level. What is the maximum possible
value of the total ms of the ten 4f electrons? (e) What is the
maximum possible total ml of ten 4f electrons?
8.2 Electronic States in Many-Electron Atoms
4. (a) Suppose a beryllium atom (Z = 4) absorbs energy (such
as from a beam of photons) that pushes one of the electrons to an excited state. If the photon energy is set at the
minimum necessary for this to occur, from which subshell
does the electron make the transition and to which subshell
does it jump? (b) Suppose the same experiment is done with
neon (Z = 10). At the minimum energy for absorption, from
which subshell does the electron make the transition and
to which subshell does it jump? (c) Would you expect the
minimum absorption energy for beryllium to be larger or
smaller than the minimum energy for neon? Explain.
5. (a) List all elements with a p³ configuration. (b) List all
elements with a d⁷ configuration.
6. Give the electronic configuration of (a) P; (b) V; (c) Sb;
(d) Pb.
7. (a) What is the electronic configuration of Fe? (b) In its
ground state, what is the maximum possible total ms of its
electrons? (c) When the electrons have their maximum possible total ms , what is the maximum total ml ? (d) Suppose
one of the d electrons is excited to the next highest level.
What is the maximum possible total ms , and when ms has its
maximum total what is the maximum total ml ?
8.3 Outer Electrons: Screening and Optical Transitions
8. The ground state of singly ionized lithium (Z = 3) is 1s².
Use the electron screening model to predict the energies
of the 1s¹2p¹ and 1s¹3d¹ excited states in singly ionized
lithium. Compare your predictions with the measured
energies (respectively −13.4 eV and −6.0 eV).
9. The ground state of neutral beryllium (Z = 4) is 1s²2s². Use
the electron screening model to predict the energies of the
following excited states: 1s²2s¹3p¹ (measured −2.02 eV)
and 1s²2s¹4d¹ (−0.90 eV).
10. Using the wavelengths given in Figure 8.4, compute the
energy difference between the 3d and 4d states in lithium;
do the same for sodium. Compare those values with the corresponding n = 4 to n = 3 energy difference in hydrogen.
Why is the agreement so good, considering the different
values of Z?
11. (a) Using the information for lithium given in Figure 8.4,
compute the energy difference of the 3p and 3d states.
(b) Compute the energy of the 3s, 4s, and 5s states above
the ground state. (c) The ionization energy of lithium in its
ground state is 5.39 eV. What is the ionization energy of the
2p state? Of the 3s state?
8.5 Inner Electrons: Absorption Edges and X Rays
12. A certain element emits a Kα X ray of wavelength 0.1940 nm.
Identify the element.
13. Compute the Kα X ray energies of calcium (Z = 20), zirconium (Z = 40), and mercury (Z = 80). Compare with the
measured values of 3.69 keV, 15.8 keV, and 70.8 keV. (See
Question 16).
8.6 Addition of Angular Momenta
14. Chromium has the electron configuration 4s¹3d⁵ beyond the
inert argon core. What are the ground-state L and S values?
15. Use Hund’s rules to find the ground-state L and S of
(a) Ce, configuration [Xe]6s²4f¹5d¹; (b) Gd, configuration
[Xe]6s²4f⁷5d¹; (c) Pt, configuration [Xe]6s¹4f¹⁴5d⁹.
16. Using Hund’s rules, find the ground-state L and S of
(a) fluorine (Z = 9); (b) magnesium (Z = 12); (c) titanium
(Z = 22); (d) iron (Z = 26).
17. A certain excited state of an atom has the configuration
4d¹5d¹. What are the possible L and S values?
18. Use the degeneracies of the states with all possible total L
and S to find how many different levels the 2p¹3p¹ excited
state of carbon includes. (See Figure 8.18.) Compare this
result with the result of counting the individual ml and ms
values from Problem 1(d). (See also Question 11.)
8.7 Lasers
19. A small helium-neon laser produces a light beam with
an average power of 3.5 mW and a diameter of 2.4 mm.
(a) How many photons per second are emitted by the laser?
(b) What is the amplitude of the electric field of the light
wave? Compare this result with the electric field at a distance
of 1 m from an incandescent light bulb that emits 100 W of
visible light.
General Problems
20. (a) How many different possible ways are there to assign the
sets of quantum numbers to the four 2p electrons in oxygen
(Z = 8)? (b) List all possible values of the total ms for the
four electrons. (c) List all possible values of the total ml of
the four electrons. (d) If the total ms has its largest possible
value, what are the possible values of the total ml ? (e) If the
total ml has its largest possible value, what are the possible
values of the total ms ?
21. (a) The ionization energy of sodium is 5.14 eV. What is
the effective charge seen by the outer electron? (b) If the
3s electron of a sodium atom is moved to the 4f state, the
measured binding energy is 0.85 eV. What is the effective
charge seen by an electron in this state?
22. Draw a Moseley plot, similar to Figure 8.16, for the Kβ X
rays using the following energies in keV:
Ne  0.858    Mn  6.51    Zr  17.7
P   2.14     Zn  9.57    Rh  22.8
Ca  4.02     Br  13.3    Sn  28.4
Determine the slope and compare with the expected value.
(Equation 8.4 applies only to Kα X rays; you will need to
derive a similar equation for the Kβ X rays.) Determine the
z-axis intercept and give its interpretation.
23. Draw a Moseley plot, similar to Figure 8.16, for the Lα X
rays using the following energies in keV:
Mn  0.721    Rh  2.89
Zn  1.11     Sn  3.71
Br  1.60     Cs  4.65
Zr  2.06     Nd  5.72
Give interpretations of the slope and intercept.
24. Because of the fine-structure splitting of the 3p state, the
3p → 3s transition in sodium actually consists of two closely
spaced lines of wavelengths 589.00 nm and 589.59 nm.
Assuming a magnetic moment of one Bohr magneton, find
the effective magnetic field that produces the fine-structure
splitting of the 3p state of sodium.
25. (a) What is the longest wavelength of the absorption spectrum of lithium? (b) What is the longest wavelength of the
absorption spectrum of helium? In what region of the spectrum does this occur? (c) What are the shortest wavelengths
in the absorption spectra of helium and lithium? In what
region of the electromagnetic spectrum are these?
26. Using the wavelengths given in Figure 8.17, compute the
energy difference between the 1s¹4p¹ and 1s¹3p¹ singlet
(S = 0) states in helium. Compare this energy difference
with the value expected using the Bohr model, assuming
that the p electron is screened by the s electron. Repeat the
calculation for the 3d and 4d triplet (S = 1) states.
Chapter 9
MOLECULAR STRUCTURE
Molecules range from the simple with only two atoms to very complex organic molecules
such as DNA. The photo shows a computer model of C₆₀, a spherical arrangement of 60
carbon atoms in pentagons and hexagons, known as a “buckyball.”
In this chapter we consider the combination of atoms into molecules, the excited
states of molecules, and the ways that molecules can absorb and emit radiation.
From a variety of experiments we learn that the spacing of atoms in molecules is
of the order of 0.1 nm, and that the binding energy of an atom in a molecule is
of the order of electron-volts. This spacing and binding energy are characteristic
of electronic orbits, which suggests that the forces that bind molecules together
originate with the electrons. The negatively charged electrons provide the binding
that overcomes the Coulomb repulsion of the positively charged nuclei of the
atoms in the molecule.
When atoms are brought together to form molecules, the atomic states of the
electrons change into molecular states. These states are filled in the order of
increasing energy by the valence electrons of the atoms of the molecule. The
probability densities of the occupied molecular states determine the nature of
molecular bonds and the structure and properties of molecules, including their
geometrical shapes.
Just as we began to study atomic physics by looking at the simplest atom, we
begin our study of molecular physics with the simplest molecule, H₂⁺, the singly
ionized hydrogen molecule. We next turn to other simple molecules, such as H₂
and NaCl, and finally we look at how our previous knowledge of atomic wave
functions can help us to understand the molecular states that form the basis of
organic chemistry.
We will also study ways other than electronic excitations that molecules can
absorb and emit electromagnetic radiation. These radiations give a distinctive
signature of the molecule and its structure. Molecular spectroscopy, the study
of these radiations, finds application in such diverse areas as identification of
atmospheric pollutants and the search for life in outer space.
9.1 THE HYDROGEN MOLECULE
Let’s first look at how the wave functions of the atomic electrons can lead to
the binding together of atoms into stable molecules. Even though the negatively
charged electrons provide the attractive force that overcomes the Coulomb
repulsion of the positively charged atomic nuclei, it is perhaps not immediately
obvious how stable molecules form at all because there is also a Coulomb repulsion
of the electrons of one atom for those of another. The key to understanding this
problem is the existence of the spatial probability densities of atomic orbits, such
as we calculated for hydrogen and illustrated in Chapter 7. These probability
densities are frequently not spherically symmetric, and very often may show
overwhelming preferences for one spatial direction over another.
A complete understanding of the effect of the electrons on molecular binding
is in general made difficult by what also complicates atomic structure—there are
too many electrons present for us to be able to write down and solve the equations
that govern the structure of the atom or molecule. We therefore use the same
tactic to study molecular structure that we used for atomic structure: we begin
with a molecule that has only one electron. Such a molecule is H2+, the hydrogen
molecule ion, which results when we remove an electron from a molecule of
ordinary hydrogen, H2.
Before we discuss the wave mechanical properties of H2+, let’s try to guess
what holds this molecule together. We first realize that it is not correct to think
of H2+ as an atom of hydrogen (proton plus electron) joined to a second proton.
9.1 | The Hydrogen Molecule
FIGURE 9.1 The electron wave functions for two hydrogen atoms
separated by a large distance.
The atom of hydrogen in such a combination is electrically neutral, so there is
no electrostatic Coulomb force to hold the two pieces together. In this kind of
molecule, at least, it is apparently not correct to identify the electron as belonging
exclusively to one or the other of the components. The electron must somehow be
shared between the two parts. The electron must spend a significant part of its time
in the region between the two protons. In the language of quantum mechanics, the
electron’s probability density must have a large value in that region.
As we learned in Chapter 7, an electron in the ground state of hydrogen
has an energy of −13.6 eV, a wave function $\psi = (\pi a_0^3)^{-1/2} e^{-r/a_0}$, where $a_0$ is
the Bohr radius, and a probability density proportional to $\psi^2$. Figure 9.1 shows
the wave function for an electron that could be bound to either of two protons
separated by a large distance. As we bring the two protons closer together, the
wave functions begin to overlap, and we must combine them according to the
rules of quantum mechanics—first add the wave functions, then square the result
to find the combined probability density. (Note that this gives a very different
result from first squaring, then adding.)
We can combine these two wave functions in two different ways, depending on
whether they have the same signs or opposite signs. The absolute sign of a wave
function is arbitrary. When we calculate the normalization constant of a wave
function, we actually compute its square. We could choose either the positive or
the negative root; for convenience we usually choose the positive one. When we
calculate probability densities $\psi^2$ for a single wave function, the choice of sign
becomes irrelevant. However, when we combine different wave functions, their
relative signs determine whether the two functions add or subtract, which can
result in very different probability densities.
Consider the two different combinations of wave functions shown in Figure 9.2.
In one case (Figure 9.2a), the two wave functions have the same sign, and in the
other case (Figure 9.2b) they have opposite signs. This has a substantial effect
on the probability distributions, which are shown in Figure 9.3. The probability
density obtained from squaring ψ1 + ψ2 (Figure 9.3a) has relatively large values
in the region between the two protons. This suggests a concentration of negative
charge between the protons, which can supply the Coulomb attraction to pull
the two protons together and form a stable molecule. The square of ψ1 − ψ2
(Figure 9.3b), however, gives a vanishing probability density midway between
the protons and thus a small density of negative charge in the region between the
protons. There is not enough negative charge to overcome the Coulomb repulsion
of the protons, and as a result this combination of wave functions does not lead to
the formation of a stable molecule.
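The difference between "add, then square" and "square, then add" can be checked numerically at the point midway between the protons. The following is a minimal sketch, not a full treatment: it uses the hydrogen ground-state wave function quoted above, takes a0 = 0.0529 nm and a proton separation of 0.106 nm as illustrative inputs, and does not renormalize the combined wave functions. The function names are hypothetical.

```python
import math

A0_NM = 0.0529  # Bohr radius in nm (assumed standard value)

def psi_1s(r_nm):
    """Hydrogen ground-state wave function (pi a0^3)^(-1/2) exp(-r/a0)."""
    return math.exp(-r_nm / A0_NM) / math.sqrt(math.pi * A0_NM**3)

def midpoint_densities(separation_nm):
    """Evaluate the three candidate densities midway between the protons.

    Returns (|psi1 + psi2|^2, |psi1 - psi2|^2, psi1^2 + psi2^2).
    The combined wave functions are NOT renormalized in this sketch.
    """
    r = separation_nm / 2.0          # distance from the midpoint to either proton
    psi1 = psi_1s(r)
    psi2 = psi_1s(r)
    return (psi1 + psi2) ** 2, (psi1 - psi2) ** 2, psi1**2 + psi2**2

bonding, antibonding, squared_then_added = midpoint_densities(0.106)
print("add then square (same sign):    ", bonding)
print("add then square (opposite sign):", antibonding)
print("square then add:                ", squared_then_added)
```

At the midpoint the two wave functions are equal, so the same-sign combination gives twice the "square, then add" value, while the opposite-sign combination gives zero. This is the numerical counterpart of Figure 9.3.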
Binding Energy of H2+
There are two contributions to the energy of the H2+ molecule: the Coulomb
repulsion of the two positively charged protons for each other, and the attraction
of the combination of the two protons for the negatively charged electron.
FIGURE 9.2 The overlap of two hydrogenic wave functions. The wave
functions are indicated by the dashed lines, and their sum by the solid line.
In (a), the two wave functions have the same sign, while in (b) they have
opposite signs.
FIGURE 9.3 The probability densities corresponding to the two combined
wave functions of Figure 9.2: (a) |ψ1 + ψ2|², (b) |ψ1 − ψ2|².
The Coulomb repulsion energy of the protons is positive, and the energy of
the attraction of the electron by the protons is negative. For a stable molecule
to form, the total energy must be negative, so the critical question is whether
the electron can provide enough negative energy of attraction to overcome the
positive repulsion energy of the protons.
To find the conditions necessary for a stable H2+ ion to form, let’s look at how
the various contributions to the energy of the ion depend on the separation distance
R between the two protons. The Coulomb potential energy that characterizes the
repulsion of the bare protons is $U_p = e^2/4\pi\varepsilon_0 R$; this function is plotted in
Figure 9.4. To find the electron energy as a function of R, we first consider the
case when the two protons are very far apart. In this case the electron is in the
ground state orbit about one of the protons, for which E = −13.6 eV. As we
bring the protons together, the electron becomes more tightly bound (because it
is attracted by both protons) and its energy becomes more negative. As R → 0,
the system approaches a single atom with Z = 2. For the wave function ψ1 + ψ2
(Figure 9.2a), the combined wave function has a maximum at R = 0 and resembles
the ground-state wave function of an atom with Z = 2. Recalling the result from
Chapter 6 for the electron energy in a hydrogen-like atom,
$$E_n = (-13.6\ \mathrm{eV})\,\frac{Z^2}{n^2} \qquad (9.1)$$
where n is the principal quantum number, we find the energy of a ground-state
electron for Z = 2 to be −54.4 eV. The energy corresponding to the sum of the
two wave functions, which we label E+, therefore has the value −13.6 eV at large
R and approaches the value −54.4 eV at small R. The result of an exact calculation
of E+ is shown in Figure 9.4.
FIGURE 9.4 Dependence of energy on separation distance R for H2+. The plot
shows energy (eV) versus R (nm) for the proton repulsion Up, the electron
energies E+ and E−, and the sums Up + E+ and Up + E−; the minimum of
Up + E+ lies 2.7 eV below −13.6 eV.
For the combination corresponding to the difference between the two wave
functions, the energy is once again −13.6 eV for large R. As R → 0, the
combined wave function approaches 0 (Figure 9.2b). The lowest energy level
with a wave function that vanishes at R = 0 is the 2p state, for which the energy
in a Z = 2 hydrogenlike atom is −13.6 eV. The energy E− corresponding to the
wave function ψ1 − ψ2 therefore has the value −13.6 eV for both large R and
small R. Its exact form is shown in Figure 9.4.
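The limiting values of E+ and E− quoted above follow directly from Equation 9.1. A short sketch of the arithmetic, using the rounded value 13.6 eV from the text (the function name is hypothetical):

```python
RYDBERG_EV = 13.6  # magnitude of the hydrogen ground-state energy, as used in the text

def hydrogenlike_energy(z, n):
    """Energy in eV of level n in a one-electron atom of nuclear charge z (Eq. 9.1)."""
    return -RYDBERG_EV * z**2 / n**2

# Large-R limit: the electron is bound to a single proton (Z = 1, n = 1).
print("R -> infinity, E+ and E-:", hydrogenlike_energy(1, 1), "eV")
# Small-R limit of E+: ground state of a Z = 2 atom.
print("R -> 0, E+ (Z = 2, n = 1):", hydrogenlike_energy(2, 1), "eV")
# Small-R limit of E-: the n = 2 level of a Z = 2 atom.
print("R -> 0, E- (Z = 2, n = 2):", hydrogenlike_energy(2, 2), "eV")
```

The factor Z² = 4 is exactly cancelled by n² = 4 for the 2p state, which is why E− returns to −13.6 eV at small R.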
The total energy of the hydrogen molecule ion is the sum of the proton
energy Up and the electron energy E+ or E− . These two sums are also plotted
in Figure 9.4. You can see that the combination Up + E− has no minimum and
therefore no stable bound state. The wave function ψ1 − ψ2 does not lead to a
stable configuration for the hydrogen molecule ion, just as we originally suspected.
The sum Up + E+ gives the stable configuration of the ion, for which the
equilibrium condition occurs at the point where Up + E+ has its minimum value.
The minimum occurs at a separation Req = 0.106 nm and an energy of −16.3 eV.
The binding energy B of H2+ is the energy necessary to take apart the ion into H
and H+ and corresponds to the depth of the potential energy minimum of Up + E+
in Figure 9.4:
$$B = E(\mathrm{H} + \mathrm{H}^+) - E(\mathrm{H}_2^+) = -13.6\ \mathrm{eV} - (-16.3\ \mathrm{eV}) = 2.7\ \mathrm{eV} \qquad (9.2)$$
Note that we have defined molecular binding energy as the energy difference
between the separate components (H and H+) and the combined system (H2+).
It is interesting to note that the stability is achieved at Req = 2a0 . In Chapter 7
we learned that the radial probability density for the 1s state of hydrogen has its
maximum value at r = a0. Thus the stable configuration of the H2+ ion is such
that the maximum in the radial probability density for a single H atom would
fall exactly in the middle of the molecule! This is once again consistent with our
expectations for the structure of H2+: the electron must spend most of its time
between the two protons.
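The numbers quoted for the equilibrium configuration can be cross-checked with a few lines of arithmetic. This sketch assumes the standard values e²/4πε0 = 1.440 eV·nm and a0 = 0.0529 nm, which are not quoted in this section; the variable names are hypothetical, and the electron energy at equilibrium is inferred from the total energy rather than calculated from a wave function.

```python
COULOMB_EV_NM = 1.440   # e^2 / (4 pi eps0) in eV nm (assumed standard value)
A0_NM = 0.0529          # Bohr radius in nm (assumed standard value)

R_EQ_NM = 0.106         # equilibrium separation of H2+ from the text
E_TOTAL_MIN_EV = -16.3  # minimum of Up + E+ from the text
E_SEPARATED_EV = -13.6  # energy of H + H+ at large separation

def proton_repulsion_ev(r_nm):
    """Coulomb repulsion energy Up of two bare protons a distance r_nm apart."""
    return COULOMB_EV_NM / r_nm

u_p = proton_repulsion_ev(R_EQ_NM)       # positive repulsion energy at equilibrium
e_plus = E_TOTAL_MIN_EV - u_p            # electron energy implied by the total
binding = E_SEPARATED_EV - E_TOTAL_MIN_EV

print("Up at R_eq:      %.1f eV" % u_p)
print("implied E+:      %.1f eV" % e_plus)
print("binding energy:  %.1f eV" % binding)
print("R_eq / a0:       %.2f" % (R_EQ_NM / A0_NM))
```

The implied electron energy falls between the large-R limit (−13.6 eV) and the small-R limit (−54.4 eV), as it must, and the ratio R_eq/a0 comes out very close to 2.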
In summary, from our study of this simple molecule we have learned that an
important feature of molecular bonding concerns the sharing of a single electron
by two atoms of the molecule. This sharing is responsible for the stability of the
molecule. With this in mind we can now add a second electron and consider the
H2 molecule.
The H2 Molecule
Suppose we have two hydrogen atoms separated by a very large distance.
Associated with each atom there is a 1s electronic state, at an energy of −13.6 eV,
because the atoms are so far apart that there is no interaction between the electrons.
As we bring the atoms closer together to form a H2 molecule, the electron wave
functions begin to overlap, so that the electrons are “shared” between the two
atoms. As we have seen in the previous discussion, this can occur in such a way
that the two electron wave functions add in the region between the two protons,
giving a stable molecule, or subtract, leading to no stable molecule. The separate,
individual electronic states of the atoms now become molecular states.
Notice that, as shown in Figure 9.5, the number of states does not change as
the separation R is reduced. When the atoms are separated by a large distance,
there are two states, each at −13.6 eV, so the total energy at R = ∞ is −27.2 eV.
When the separation is reduced, there are still two states, but now at different
energies. One state corresponds to the sum of the two wave functions and leads
to a stable H2 molecule; the other state corresponds to the difference of the two
wave functions and does not give a stable molecule. The molecular state that
leads to a stable molecule is known as a bonding state, and the one that does not
lead to a stable molecule is an antibonding state.
As we found previously for H2+, in order to form a molecule, the electron
probability distribution must be large in the region between the two protons. In the
case of H2, this is true for both electrons, and it is certainly our expectation, based
on the Pauli principle, that for the two electrons both to occupy the molecular state
leading to the large probability in the central region, their spins must be oppositely
directed; that is, one must have ms = +1/2 and the other ms = −1/2. As long as
the two electrons have opposite spins, they can both occupy the bonding state,
leading to a stable molecule.
The energy of the bonding state for H2 is shown in Figure 9.6; as you can see,
there is a minimum with E = −31.7 eV at R = 0.074 nm. The molecular binding
energy of H2 is the difference between the energy of the separated neutral H
atoms and the energy of the combined system:
$$B = E(\mathrm{H} + \mathrm{H}) - E(\mathrm{H}_2) = 2(-13.6\ \mathrm{eV}) - (-31.7\ \mathrm{eV}) = 4.5\ \mathrm{eV} \qquad (9.3)$$
Comparing Figures 9.4 and 9.6, you can immediately see the effect of adding
an additional electron to H2+: the binding energy is greater (the molecule is more
tightly bound), and the protons are drawn closer together. Both of these effects
are due to the presence of the increased electron density in the region between the
two protons.
FIGURE 9.5 Energy of different combinations of wave functions in H2. At R = ∞
both states lie at a total energy of −27.2 eV; as R decreases they split into a
bonding state (ψ1 + ψ2, lower energy) and an antibonding state (ψ1 − ψ2,
higher energy).
FIGURE 9.6 Bonding and antibonding in H2. The plot shows energy (eV) versus
R (nm); the bonding curve has its minimum at R = 0.074 nm, 4.5 eV below the
−27.2 eV energy of the separated atoms.
We can also understand why He does not form the molecule He2 —as two
He atoms are brought together, the bonding and antibonding states are formed in
much the same way as with H2 . The He2 molecule would have four electrons; at
most two can be in the bonding state, so the other two must be in the antibonding
state. The net effect is that no stable molecule forms. (However, He2+ is stable,
with two bonding electrons and only one antibonding electron. The binding
energy of He2+ is 3.1 eV and the separation is 0.108 nm, remarkably close to the
corresponding values of H2+.)
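The electron counting used above for H2+, H2, He2+, and He2 can be written as a small bookkeeping routine. This is only a sketch of the counting argument: it assumes two 1s molecular states (bonding below antibonding), each holding at most two electrons, and the "net bonding" number it reports is simply bonding minus antibonding electrons, a label introduced here for illustration.

```python
def fill_1s_states(n_electrons):
    """Place electrons in the 1s bonding state first, then the 1s antibonding state.

    Each molecular state holds at most two electrons (Pauli principle).
    Returns (bonding_electrons, antibonding_electrons).
    """
    if not 0 <= n_electrons <= 4:
        raise ValueError("the two 1s molecular states hold at most four electrons")
    bonding = min(n_electrons, 2)
    antibonding = n_electrons - bonding
    return bonding, antibonding

for name, n in [("H2+", 1), ("H2", 2), ("He2+", 3), ("He2", 4)]:
    b, a = fill_1s_states(n)
    print(f"{name:5s} bonding={b} antibonding={a} net bonding={b - a}")
```

H2+ and He2+ both end up with a net of one bonding electron, which fits the observation that their binding energies and separations are similar, while He2 has a net of zero and does not form.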
9.2 COVALENT BONDING IN MOLECULES
The sharing of electrons in a molecule such as H2 is the origin of the covalent
bond; this type of bonding occurs commonly in molecules containing two identical
atoms, in which case it is called homopolar or homonuclear bonding.
The essential features of covalent bonding are:
1. As two atoms are brought together, the electrons interact and the separate
atomic states and energy levels are transformed into molecular states.
2. In one of the molecular states, the electron wave functions overlap in such
a way as to give a lower energy than the separated atoms had; this is the
bonding state that leads to the formation of stable molecules.
3. The other molecular state (the antibonding state) has an increased energy
relative to the separated atoms and does not lead to the formation of stable
molecules.
4. The restrictions of the Pauli principle apply to molecular states just as they
do to atomic states; each molecular state has a maximum occupancy of two
electrons, corresponding to the two different orientations of electron spin.
Other hydrogenlike atoms with a single s electron can also form stable
molecules through covalent bonding. For example, two Li atoms (Z = 3, configuration $1s^2 2s^1$) can form a molecule of Li2. The four 1s electrons (two from
each atom) fill the 1s bonding and antibonding states, and the remaining two 2s
electrons can both occupy the 2s bonding state. The binding energy of Li2 is
1.10 eV, which is considerably smaller than the binding energy of H2 (4.52 eV),
and the equilibrium separation distance of the atoms in the molecule is 0.267 nm,
much larger than that of H2 (0.074 nm).
Other homonuclear molecules formed from s-state bonds are listed in Table 9.1.
It is customary to characterize the molecular bond strength in terms of the
dissociation energy rather than the binding energy; the two terms are usually
equivalent and indicate the energy needed to break the molecule into neutral atoms.
The dissociation energy is weakly temperature dependent. Some tabulations list
the values at room temperature (as in Table 9.1), while others list values at 0 K.
The room-temperature values are higher than the 0 K values by about 1.5kT =
0.04 eV.
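The size of the temperature correction can be verified directly. The sketch below assumes a room temperature of 300 K and the Boltzmann constant k = 8.617 × 10⁻⁵ eV/K; neither value is stated in the text, and the function name is hypothetical.

```python
BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann constant in eV/K (assumed standard value)

def thermal_correction_ev(temperature_k):
    """Approximate excess of the dissociation energy over its 0 K value: 1.5 kT."""
    return 1.5 * BOLTZMANN_EV_PER_K * temperature_k

correction = thermal_correction_ev(300.0)  # 300 K taken as "room temperature"
print("1.5 kT at 300 K = %.3f eV" % correction)
```

The result rounds to the 0.04 eV quoted above, which is small compared with every dissociation energy in Table 9.1.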
As Z increases, meaning that the s electrons are associated with increasing
principal quantum numbers n, the dissociation energy decreases and the equilibrium separation increases. This is consistent with the behavior of the s electron
in atoms as n increases—as Figure 8.7 shows, the radius of the orbit of the s
electron increases with increasing n for the alkali elements.
9.2 | Covalent Bonding in Molecules
TABLE 9.1 Properties of s-Bonded Molecules*
Molecule    Dissociation Energy (eV)    Equilibrium Separation (nm)
H2          4.52                        0.074
Li2         1.10                        0.267
Na2         0.80                        0.308
K2          0.59                        0.392
Rb2         0.47                        0.422
Cs2         0.43                        0.450
LiH         2.43                        0.160
LiNa        0.91                        0.281
NaH         2.09                        0.189
KNa         0.66                        0.347
NaRb        0.61                        0.359
* Values taken from the Handbook of Chemistry and Physics and the American Institute of
Physics Handbook.
We can also form molecular bonds between two different atoms that each have a
single s valence electron (hydrogen or an alkali element). Some of
these are listed in Table 9.1. The dissociation energies and equilibrium separations
are consistent with those of the corresponding homonuclear molecules. For
example, the dissociation energy and equilibrium separation of LiH lie roughly midway
between those of H2 and Li2.
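One way to see how closely the mixed molecules follow their homonuclear "parents" is to compare each Table 9.1 entry with the simple average of the two parent values. The sketch below does only that comparison; the averaging rule is an illustrative assumption, not a result derived in the text, and the dictionary and function names are hypothetical.

```python
# (dissociation energy in eV, equilibrium separation in nm), from Table 9.1
TABLE_9_1 = {
    "H2": (4.52, 0.074), "Li2": (1.10, 0.267), "Na2": (0.80, 0.308),
    "K2": (0.59, 0.392), "Rb2": (0.47, 0.422), "Cs2": (0.43, 0.450),
    "LiH": (2.43, 0.160), "LiNa": (0.91, 0.281), "NaH": (2.09, 0.189),
    "KNa": (0.66, 0.347), "NaRb": (0.61, 0.359),
}

# Homonuclear molecules built from the same two atoms as each mixed molecule.
PARENTS = {
    "LiH": ("H2", "Li2"), "LiNa": ("Li2", "Na2"), "NaH": ("H2", "Na2"),
    "KNa": ("K2", "Na2"), "NaRb": ("Na2", "Rb2"),
}

def parent_average(molecule):
    """Average the parent values: returns (mean energy in eV, mean separation in nm)."""
    first, second = PARENTS[molecule]
    return tuple((x + y) / 2 for x, y in zip(TABLE_9_1[first], TABLE_9_1[second]))

for molecule in PARENTS:
    energy, separation = TABLE_9_1[molecule]
    mean_energy, mean_separation = parent_average(molecule)
    print(f"{molecule:5s} D = {energy:.2f} eV (average {mean_energy:.2f}), "
          f"R = {separation:.3f} nm (average {mean_separation:.3f})")
```

Every mixed molecule in the table lies between its two parents in both quantities, and the separations in particular track the averages closely.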
Atoms with valence electrons in p states can also form diatomic molecules
through covalent bonds—oxygen and nitrogen, for example. There are three
atomic p states, so there will be six molecular p states, and the classification of
levels can become quite tedious, but we can understand the structure of molecules
composed of atoms with p electrons based on the geometry of atomic p states.
In Chapter 7 we solved the Schrödinger equation for the H atom and showed
the spatial probability distributions for the various possible electronic wave
functions. Of course, these solutions for hydrogen will not be correct for other
atoms, but the essential features of the geometry of the atomic states remain
correct. We identified three different p states, corresponding to ml = −1, 0, and
+1. The probability distributions corresponding to these ml values were shown in
Figure 7.11.
We can imagine these distributions to have a sort of “figure-eight” shape with
two distinct lobes of large probability. In the ml = 0 case, the figure eight has
its long axis along the z axis, and the two lobes of maximum probability occur
in the +z and −z directions. In the ml = ±1 cases, the probability distribution
can be regarded as arising from a figure-eight probability distribution in the
xy plane that is rotating about the z axis, counterclockwise for ml = +1 and
clockwise for ml = −1. Because of the uncertainty principle, we can’t observe
the two probability lobes in the xy plane; all we can observe is the smeared-out
“donut-shaped” distribution shown in Figure 7.11.
For our purposes here, it is not as convenient to use the ml notation as it is to
use a different representation in which we assign each of the three possible p states
a label that gives the direction in space corresponding to the lobes of maximum
probability. Thus pz is the state with regions of large electron probability along
the z axis, and similarly for px and py. Figure 9.7 shows a schematic representation
of these probability distributions. (The pz state corresponds exactly to ml = 0; px
and py correspond to mixtures of ml = +1 and ml = −1.) Just as the uncertainty
principle does not allow us to observe the two lobes of probability in the xy plane,
it also forbids us from observing the separate px and py probability distributions.
However, the distributions do exist (even though we can’t observe them), and
two atoms can interact with one another by means of these electron clouds.
FIGURE 9.7 Probability distributions of three different p electrons (px, py, and
pz, with lobes along the x, y, and z axes).
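The statement that px and py are mixtures of ml = +1 and ml = −1 can be made concrete with the angular parts of the wave functions. The sketch below assumes the usual spherical harmonics for l = 1 with the Condon-Shortley sign convention, and one common choice of mixing coefficients; other conventions differ by overall signs. The function names are hypothetical.

```python
import cmath
import math

def y1(m, theta, phi):
    """Spherical harmonic Y_1^m(theta, phi), Condon-Shortley phase convention."""
    if m == 0:
        return complex(math.sqrt(3 / (4 * math.pi)) * math.cos(theta))
    if m in (1, -1):
        c = math.sqrt(3 / (8 * math.pi))
        return -m * c * math.sin(theta) * cmath.exp(1j * m * phi)
    raise ValueError("m must be -1, 0, or +1")

def p_x(theta, phi):
    """Mixture of m = -1 and m = +1 with lobes along the x axis."""
    return (y1(-1, theta, phi) - y1(1, theta, phi)) / math.sqrt(2)

def p_y(theta, phi):
    """Mixture of m = -1 and m = +1 with lobes along the y axis."""
    return 1j * (y1(-1, theta, phi) + y1(1, theta, phi)) / math.sqrt(2)

def p_z(theta, phi):
    """The m = 0 state, with lobes along the z axis."""
    return y1(0, theta, phi)

# Probability along the +x direction (theta = 90 degrees, phi = 0).
theta, phi = math.pi / 2, 0.0
for name, state in [("px", p_x), ("py", p_y), ("pz", p_z)]:
    print(name, "probability along +x:", abs(state(theta, phi)) ** 2)
```

Along the x axis only px has an appreciable probability, as the figure-eight picture suggests. By contrast, the magnitude of a single ml = ±1 state does not depend on the angle φ at all, which corresponds to the smeared-out donut shape.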
We consider the structure of molecules containing p electrons based on this
model of the three mutually perpendicular p states px , py , and pz . We discuss three
applications of this type of covalent bonding: pp bonds, sp directed bonds, and sp
hybrid states.
pp Covalent Bonds
Consider what happens when we bring together two p-shell atoms, whose
probability distributions are each similar to Figure 9.7. We assume that the atoms
approach along the x axis, as in Figure 9.8. As the atoms are brought together,
the px states overlap (Figure 9.9a), giving (if the two wave functions add) an
increased electron charge density between the two nuclei and contributing to the
bonding of the atoms in the molecule. There is a much weaker overlap between
the py states (Figure 9.9b) and also between the pz states (which are not shown in
the figure). Because the overlap of the py states is not along the line connecting
the nuclei, there are components to the binding force that oppose one another,
and only a much smaller resultant force acts along the line connecting the nuclei
(Figure 9.9b). In addition, there is less overlap of the py states. The net result is
that the py states (and also the pz states) are less effective in binding the molecule
than the px states.
This somewhat oversimplified model suggests that the px state should have a
much greater bonding effect (and also a greater antibonding effect) than the py
and pz states. It also suggests that the bonding and antibonding effects of py states
should be the same as those of pz states.
Now we can consider the energies of the molecular states as a function of the
nuclear separation distance R. We assume that we are dealing with two atoms
having filled 1s and 2s states and valence electrons in the 2p shell. When the 1s
states of the two atoms overlap, the result is 1s bonding and antibonding molecular
states, just as in the case of H2. There are altogether four 1s electrons in the
molecule, and with two in each state the 1s bonding and antibonding molecular
states are filled to capacity. The same is true of the 2s states. The atomic 2s levels
form bonding and antibonding molecular states; because each atom has a filled 2s
shell, the four 2s electrons fill both the bonding and antibonding molecular states.
FIGURE 9.8 Two atoms with p electrons. The pz probability distribution, which
extends perpendicular to the page, is not shown.
FIGURE 9.9 (a) Overlap of px probability distributions. The vectors indicate the
force on the nuclei due to the overlap. (b) Overlap of py probability
distributions. The off-axis forces give a smaller resultant force along the axis.
The atoms have partially filled 2p shells, so the final molecular bonding
depends critically on the molecular 2p states. For each atomic p state (px , py , pz )
there are corresponding bonding and antibonding molecular states. However, the
bonding and antibonding effects of these states are not equivalent, as Figure 9.9
illustrates. The p state that happens to lie along the line of approach (px ) has an
effect that is significantly greater than the p states that lie off the line of approach
(py , pz ). The px bonding state must therefore lie lower in energy than the py and
pz bonding states, and the px antibonding state must lie higher in energy than the
py and pz antibonding states. Figure 9.10 illustrates the energies of the molecular
states. The relative stability of a molecule can be determined based on the filling
of the bonding and antibonding states with electrons (two per state, corresponding
to spin up and spin down electrons). The following example illustrates how these
states are filled.
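FIGURE 9.10 Bonding and antibonding 2p states. The diagram shows electron energy
(not to scale) versus separation distance: starting from the atomic 1s, 2s, and
2p levels at R = ∞, each level splits into bonding and antibonding states; among
the 2p states the px bonding state lies lowest and the px antibonding state
highest, with the py, pz states in between.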
Example 9.1
Based on the filling of the bonding and antibonding states,
predict the relative stability of the molecules (a) N2 ,
(b) O2 , and (c) F2 .
Solution
(a) Nitrogen ($1s^2 2s^2 2p^3$) has seven electrons: two each in
the filled 1s and 2s shells, and three electrons in the 2p shell.
In the N2 molecule, there are therefore 14 electrons. We
start in Figure 9.10 with two in the bonding 1s state, then
two in the antibonding 1s, then two more in the bonding
2s, followed by two in the antibonding 2s for a total of
eight electrons in the s states. That leaves six 2p electrons
for the 2p molecular states. We can place two each in the
three lowest 2p bonding molecular states, thus filling those
states. No electrons go into the 2p antibonding states. With
only bonding 2p electrons, N2 forms a very stable diatomic
molecule.
(b) Oxygen ($1s^2 2s^2 2p^4$) has eight electrons, so the O2
molecule has a total of 16 electrons. As in N2 , the first eight
electrons fill the 1s and 2s states, leaving eight additional
electrons for the 2p states. The first six of those fill the three
bonding states, and so the remaining two must go into 2p
antibonding states. With six bonding and two antibonding
valence electrons, we would expect that O2 is less stable
than N2 , which has only bonding valence electrons.
(c) Fluorine ($1s^2 2s^2 2p^5$) has nine electrons, so of the 18
electrons in the F2 molecule 10 must be placed in the 2p
states: six in the bonding states and four in the antibonding
states. Thus F2 should be less stable than O2 .
How well do the properties of these molecules agree with our predictions? N2
has a dissociation energy of 9.8 eV and is not reactive under most circumstances.
O2 has a smaller dissociation energy (5.1 eV); the O2 molecular bonds can be
broken by relatively modest chemical reactions, as, for example, the oxidation
of metals exposed to air. F2 has an even smaller dissociation energy (1.6 eV);
fluorine gas reacts quite violently with many substances, and the F2 molecule
can be broken apart by exposure to visible light (which has photon energies of
2–4 eV) in a process known as photodissociation. The properties of these 2p
molecules are thus quite consistent with expectations based on the filling of the
bonding and antibonding states. Similar relationships occur for the 3p, 4p, 5p, and
6p homonuclear molecules.
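The filling procedure of Example 9.1 can be sketched as a short routine. It assumes the simplified level ordering of Figure 9.10 (1s bonding, 1s antibonding, 2s bonding, 2s antibonding, the three 2p bonding states, then the three 2p antibonding states) with two electrons per state; the list and function names are hypothetical, and the dissociation energies are the values quoted above.

```python
# (label, number of molecular states, +1 for bonding / -1 for antibonding),
# listed in order of increasing energy following the simplified Figure 9.10 scheme.
LEVELS = [
    ("1s bonding", 1, +1), ("1s antibonding", 1, -1),
    ("2s bonding", 1, +1), ("2s antibonding", 1, -1),
    ("2p bonding", 3, +1), ("2p antibonding", 3, -1),
]

def fill_levels(n_electrons):
    """Fill the levels in order, two electrons per state; return occupancy by label."""
    remaining = n_electrons
    occupancy = {}
    for label, n_states, _ in LEVELS:
        placed = min(remaining, 2 * n_states)
        occupancy[label] = placed
        remaining -= placed
    if remaining:
        raise ValueError("more electrons than available states")
    return occupancy

def net_bonding(n_electrons):
    """Bonding electrons minus antibonding electrons, summed over all levels."""
    occupancy = fill_levels(n_electrons)
    return sum(sign * occupancy[label] for label, _, sign in LEVELS)

DISSOCIATION_EV = {"N2": 9.8, "O2": 5.1, "F2": 1.6}  # values quoted in the text
for molecule, n in [("N2", 14), ("O2", 16), ("F2", 18)]:
    occ = fill_levels(n)
    print(f"{molecule}: 2p bonding={occ['2p bonding']}, "
          f"2p antibonding={occ['2p antibonding']}, net={net_bonding(n)}, "
          f"D={DISSOCIATION_EV[molecule]} eV")
```

The net number of bonding electrons falls from six to four to two along the series N2, O2, F2, in the same order as the measured dissociation energies.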
sp Molecular Bonds
FIGURE 9.11 Overlap of s and p wave functions.
It is often the case that a stable molecule is formed from two atoms, one with an
s-state valence electron and the other with one or more p-state valence electrons.
Consider, for example, the HF molecule. The F atom has five electrons in the
p shell, so of the three 2p atomic states, two will each have their capacity of
two electrons, and the third will have a single electron. We ignore the four
paired p electrons, which do not significantly affect the molecular bonding, and
concentrate instead on the single unpaired p electron. The two-lobed probability
distribution corresponds to a two-lobed p-state wave function, in which the signs
of ψ are opposite for the two lobes. The 1s wave function of H has only one sign
(Figure 9.11). As the H and F atoms approach each other from a large distance,
the H wave function and the F wave function can combine to give an increased
electron probability in the region between the nuclei, and hence a bonding sp state
is formed. It is also possible to have antibonding sp states, which result from the
H and F wave functions having opposite signs and producing a reduced electron
probability density between the nuclei.
Table 9.2 gives dissociation energies and nuclear separation distances for some
sp-bonded diatomic molecules.
TABLE 9.2 Properties of sp-Bonded Molecules
Molecule    Dissociation Energy (eV)    Equilibrium Separation (nm)
HF          5.90                        0.092
HCl         4.48                        0.128
HBr         3.79                        0.141
HI          3.10                        0.160
LiF         5.98                        0.156
LiCl        4.86                        0.202
NaF         4.99                        0.193
NaCl        4.26                        0.236
KF          5.15                        0.217
KCl         4.43                        0.267
Consider now the structure of the water molecule, H2O. Oxygen has eight
electrons, four of which occupy the 2p shell. When we place these electrons in the
2p atomic states, we begin with one electron each in the px, py, and pz states, and
then the fourth 2p electron must pair with one of the first three. An oxygen atom
therefore has two unpaired 2p electrons, each of which can form a bond with the
1s electron of H to form a molecule of H2O. Figure 9.12 shows a representation
of the electron probability distributions we might expect for an oxygen atom and
for a molecule of H2O. Such a molecule has directed bonds, which have a fixed,
measurable relative direction in space. The expected angle between the two bonds
is 90°; this angle can be measured experimentally by, for example, measuring the
electric dipole moment of the molecule, and the result, 104.5°, is somewhat larger
than we expect. This discrepancy can be interpreted as arising from the Coulomb
repulsion of the two H atoms, which tends to spread the bond angle somewhat.
As another example, consider the NH3 (ammonia) molecule. With Z = 7, the
nitrogen atom has three unpaired p electrons, one each in the px , py , and pz atomic
states. Each of these can form a bond with a H atom to form the NH3 molecule,
and we expect to find three mutually perpendicular sp bonds (Figure 9.13). The
measured bond angle is 107.3°, again indicating some repulsion between the
H atoms.
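The counting of unpaired p electrons used for F, O, and N follows a simple rule: put one electron in each of px, py, and pz before pairing any of them. A minimal sketch of that rule (the function name is hypothetical) is:

```python
def unpaired_p_electrons(n_p_electrons):
    """Count unpaired electrons when n_p_electrons occupy the px, py, pz states.

    Electrons go one per state first; only the fourth, fifth, and sixth
    electrons pair up with ones already present.
    """
    if not 0 <= n_p_electrons <= 6:
        raise ValueError("a p shell holds at most six electrons")
    occupancy = [0, 0, 0]                 # electrons in px, py, pz
    for i in range(n_p_electrons):
        occupancy[i % 3] += 1             # cycle through the three states
    return sum(1 for count in occupancy if count == 1)

for atom, n_p in [("N", 3), ("O", 4), ("F", 5)]:
    print(f"{atom}: {n_p} p electrons, {unpaired_p_electrons(n_p)} unpaired")
```

The unpaired counts (three for N, two for O, one for F) match the number of hydrogen atoms in NH3, H2O, and HF.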
Table 9.3 lists some bond angles measured for other molecules that have sp
directed bonds. As you can see, the bond angle does indeed approach 90° in many
cases. Based on the discussion given above, you should be able to explain why
this happens as the Z of the central atom increases.
FIGURE 9.12 Overlap of electronic wave functions in H2O.
sp Hybrid States
One example of a 2p atom we have so far not considered is carbon, and for a
special reason: carbon forms a great variety of molecular bonds, with a resulting
diversity in the type and complexity of molecules containing carbon. It is this
diversity that is the basis for the many kinds of organic molecules that can form,
based on various kinds of carbon molecular bonds, and so an understanding of
the physics of carbon molecular bonds is essential to the understanding of many
fundamental questions of structure and processes in molecular biology.
Carbon, with six electrons, has the configuration $1s^2 2s^2 2p^2$, so we expect
carbon under ordinary circumstances to show a valence of 2, with the two 2p
electrons contributing to the structure, and we might therefore expect to form
stable molecules such as CH2, with directed sp bonding (similar to H2O) and a
bond angle of roughly 90°. Instead, what forms is CH4 (methane) in a tetrahedral
structure (Figure 9.14), with four equivalent bonds. For another example, the
elements of the third column of the periodic table (boron, aluminum, gallium, ...)
have the outer configuration $ns^2 np$ (n = 2 for boron, n = 3 for aluminum, etc.),
and we expect these elements to form compounds as if they had a single valence
p electron. We therefore expect halides such as BCl or GaF, oxides such as B2O
or Al2O, nitrides such as B3N or Al3N, hydrides such as BH or GaH, and so
forth. Instead we find that boron, aluminum, and gallium generally behave as if
they had three valence electrons, and form compounds such as BCl3, Al2O3, AlN,
and B2H6. Furthermore, the three valence electrons seem to be equivalent; there
seems to be no way, for example, to associate two of the valence electrons with
s states and one with a p state. The bonds formed by the three electrons make
equivalent angles of 120° with one another.
It is the effect of sp hybridization that is responsible for the valence of three
(rather than one) in boron and four (rather than two) in carbon. The four bonds in
CH4 are equivalent and identical, which would not be expected if we had two ss
bonds and two sp bonds; similarly, in BF3 or BCl3 , the three bonds are identical
and are clearly not identified with two sp bonds and one pp bond.
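The geometry implied by equivalent bonds can be checked with elementary vector arithmetic. The sketch below places four bond directions toward alternate corners of a cube (a standard way to build a tetrahedron) and three bond directions 120° apart in a plane; these coordinates are an illustrative choice, and the tetrahedral angle it prints is a geometric consequence that is not quoted in this section.

```python
import math

def angle_deg(u, v):
    """Angle in degrees between two three-component vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return math.degrees(math.acos(dot / (norm_u * norm_v)))

# Four equivalent bond directions, as in CH4: alternate corners of a cube.
tetrahedral = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]

# Three equivalent bond directions in a plane, as in BCl3.
planar = [(math.cos(a), math.sin(a), 0.0)
          for a in (0.0, 2 * math.pi / 3, 4 * math.pi / 3)]

print("tetrahedral bond angle: %.1f degrees" % angle_deg(tetrahedral[0], tetrahedral[1]))
print("planar bond angle:      %.1f degrees" % angle_deg(planar[0], planar[1]))
```

Every pair of tetrahedral directions makes the same angle, about 109.5°, which is noticeably larger than the 90° expected from unhybridized p states; the three planar directions reproduce the 120° quoted above.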