
Modern Physics - Kenneth Krane

MODERN PHYSICS
Third edition

Kenneth S. Krane
DEPARTMENT OF PHYSICS
OREGON STATE UNIVERSITY

JOHN WILEY & SONS, INC

VP AND EXECUTIVE PUBLISHER: Kaye Pace
EXECUTIVE EDITOR: Stuart Johnson
MARKETING MANAGER: Christine Kushner
DESIGN DIRECTOR: Jeof Vita
DESIGNER: Kristine Carney
PRODUCTION MANAGER: Janis Soo
ASSISTANT PRODUCTION EDITOR: Elaine S. Chew
PHOTO DEPARTMENT MANAGER: Hilary Newman
PHOTO EDITOR: Sheena Goldstein
COVER DESIGNER: Seng Ping Ngieng
COVER IMAGE: CERN/SCIENCE PHOTO LIBRARY/Photo Researchers, Inc.

This book was set in Times by Laserwords Private Limited and printed and bound by R. R. Donnelley and Sons Company, Von Hoffman. The cover was printed by R. R. Donnelley and Sons Company, Von Hoffman.

This book is printed on acid free paper.

Founded in 1807, John Wiley & Sons, Inc. has been a valued source of knowledge and understanding for more than 200 years, helping people around the world meet their needs and fulfill their aspirations. Our company is built on a foundation of principles that include responsibility to the communities we serve and where we live and work. In 2008, we launched a Corporate Citizenship Initiative, a global effort to address the environmental, social, economic, and ethical challenges we face in our business. Among the issues we are addressing are carbon impact, paper specifications and procurement, ethical conduct within our business and among our vendors, and community and charitable support. For more information, please visit our website: www.wiley.com/go/citizenship.

Copyright 2012, 1996, 1983, John Wiley & Sons, Inc. All rights reserved.
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc. 222 Rosewood Drive, Danvers, MA 01923, website www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030-5774, (201)748–6011, fax (201)748–6008, website http://www.wiley.com/go/permissions.

Evaluation copies are provided to qualified academics and professionals for review purposes only, for use in their courses during the next academic year. These copies are licensed and may not be sold or transferred to a third party. Upon completion of the review period, please return the evaluation copy to Wiley. Return instructions and a free of charge return mailing label are available at www.wiley.com/go/returnlabel. If you have chosen to adopt this textbook for use in your course, please accept this book as your complimentary desk copy. Outside of the United States, please contact your local sales representative.

Library of Congress Cataloging-in-Publication Data

Krane, Kenneth S.
Modern physics/Kenneth S. Krane. -- 3rd ed.
p. cm.
Includes bibliographical references and index.
ISBN 978-1-118-06114-5 (hardback)
1. Physics. I. Title.
QC21.2.K7 2012
539--dc23
2011039948

Printed in the United States of America

10 9 8 7 6 5 4 3 2 1

PREFACE

This textbook is meant to serve a first course in modern physics, including relativity, quantum mechanics, and their applications. Such a course often follows the standard introductory course in calculus-based classical physics.
The course addresses two different audiences: (1) Physics majors, who will later take a more rigorous course in quantum mechanics, find an introductory modern course helpful in providing background for the rigors of their imminent coursework in classical mechanics, thermodynamics, and electromagnetism. (2) Nonmajors, who may take no additional physics class, find an increasing need for concepts from modern physics in their disciplines; a classical introductory course is not sufficient background for chemists, computer scientists, nuclear and electrical engineers, or molecular biologists.

Necessary prerequisites for undertaking the text include any standard calculus-based course covering mechanics, electromagnetism, thermal physics, and optics. Calculus is used extensively, but no previous knowledge of differential equations, complex variables, or partial derivatives is assumed (although some familiarity with these topics would be helpful).

Chapters 1–8 constitute the core of the text. They cover special relativity and quantum theory through atomic structure. At that point the reader may continue with Chapters 9–11 (molecules, quantum statistics, and solids) or branch to Chapters 12–14 (nuclei and particles). The final chapter covers cosmology and can be considered the capstone of modern physics as it brings together topics from relativity (special and general) as well as from nearly all of the previous material covered in the text.

The unifying theme of the text is the empirical basis of modern physics. Experimental tests of derived properties are discussed throughout. These include the latest tests of special and general relativity as well as studies of wave-particle duality for photons and material particles. Applications of basic phenomena are extensively presented, and data from the literature are used not only to illustrate those phenomena but to offer insight into how “real” physics is done.
Students using the text have the opportunity to study how laboratory results and the analysis based on quantum theory go hand-in-hand to illuminate such diverse topics as Bose-Einstein condensation, heat capacities of solids, paramagnetism, the cosmic microwave background radiation, X-ray spectra, dilute mixtures of ³He in ⁴He, and molecular spectroscopy of the interstellar medium.

This third edition offers many changes from the previous edition. Most of the chapters have undergone considerable or complete rewriting. New topics have been introduced and others have been rearranged. More experimental results are presented and recent discoveries are highlighted, such as the WMAP microwave background data and Bose-Einstein condensation. End-of-chapter problem sets now include problems organized according to chapter section, which offer the student an opportunity to gain familiarity with a particular topic, as well as general problems, which often require the student to apply a broader array of concepts or techniques. The number of worked examples in the chapters and the number of end-of-chapter questions and problems have each increased by about 15% from the previous edition. The range of abilities required to solve the problems has been broadened, so that this edition includes both more straightforward problems that build confidence as well as more difficult problems that will challenge students. Each chapter now includes a brief summary of the important points. Some of the end-of-chapter problems are available for assignment using the WebAssign program (www.webassign.net).

A new development in physics teaching since the appearance of the 2nd edition of this text has been the availability of a large and robust body of literature from physics education research (PER). My own teaching style has been profoundly influenced by PER findings, and in preparing this new edition I have tried to incorporate PER results wherever possible.
One of the major themes that has emerged from PER in the past decade or two is that students can often learn successful algorithms for solving problems while lacking a fundamental understanding of the underlying concepts. Many approaches to addressing this problem are based on pre-class conceptual exercises and in-class individual or group activities that help students to reason through diverse problems that can’t be resolved by plugging numbers into an equation. It is absolutely essential to devote class time to these exercises and to follow through with exam questions that require similar analysis and articulation of the conceptual reasoning. More details regarding the application of PER to the teaching of modern physics, including references to articles from the PER literature, are included in the Instructor’s Manual for this text, which can be found at www.wiley.com/college/krane. The Instructor’s Manual also includes examples of conceptual questions for in-class discussion or exams that have been developed and class tested through the support of a Course, Curriculum and Laboratory Improvement grant from the National Science Foundation.

Specific changes to the chapters include the following:

Chapter 1: The sections on Units and Dimensions and on Significant Figures have been removed. In their place, a more detailed review of applications of classical energy and momentum conservation is offered. The need for special relativity is briefly established with a discussion of the failures of the classical concepts of space and time, and the need for quantum theory is previewed in the failure of Maxwell-Boltzmann particle statistics to account for the heat capacities of diatomic gases.

Chapter 2: Spacetime diagrams have been introduced to help illustrate relationships in the twin paradox. The application of the relativistic conservation laws to decay and collision processes is now given a separate section to help students learn to apply those laws.
The section on tests of special relativity has been updated to include recent results.

Chapter 3: The section on thermal radiation has been rewritten, and more detailed derivations of the Rayleigh-Jeans and Planck formulas are now given.

Chapter 4: New experimental results for particle diffraction and interference are discussed. The sections on the classical uncertainty relationships and on wave packet construction and motion have been rewritten.

Chapter 5: To help students understand the processes involved in applying boundary conditions to solutions of the Schrödinger equation, a new section on wave boundary conditions has been added. A new introductory section on particle confinement introduces energy quantization and helps to build the connection between the wave function and the uncertainty relationships. Time dependence of the wave function is introduced more explicitly at an earlier stage in the formalism. Graphic illustrations for step and barrier problems now show the real and imaginary parts of the wave function as well as its squared magnitude.

Chapter 6: The derivation of the Thomson model scattering angle has been modified, and the section on deficiencies of the Bohr model has been rewritten.

Chapter 7: To ease the entry into the 3-dimensional Schrödinger analysis of the hydrogen atom in spherical coordinates, a new section on the one-dimensional hydrogen atom has been added. Angular momentum concepts relating to the hydrogen atom are now introduced before the full solutions to the wave equation.

Chapter 8: Much of the material has been reorganized for clarity and ease of presentation. The screening discussion has been made more explicit.

Chapter 9: More emphasis has been given to the use of bonding and antibonding orbitals to predict the relative stability of molecules. Sections on molecular vibrations and rotations have been rewritten.

Chapter 10: This chapter has been extensively rewritten.
A new section on the density of states function allows statistical distributions for photons or particles to be discussed more rigorously. New applications of quantum statistics include Bose-Einstein condensation, white dwarf stars, and dilute mixtures of ³He in ⁴He.

Chapter 11: The chapter has been rewritten to broaden the applications of the quantum theory of solids to include not only electrical conductivity but also the heat capacity of solids and paramagnetism.

Chapter 12: To emphasize the unity of various topics within modern physics, this chapter now includes proton and neutron separation energies, a new section on quantum states in nuclei, and nuclear vibrational and rotational states, all of which have analogues in atomic or molecular structure.

Chapter 13: The discussion of the physics of fission has been expanded while that of the properties of nuclear reactors has been reduced somewhat. Because much current research in nuclear physics is related to astrophysics, this chapter now features a section on nucleosynthesis.

Chapter 14: New material on quarkonium and neutrino oscillations has been added.

Chapter 15: Chapters 15 and 16 of the 2nd edition have been collapsed into a single chapter on cosmology. New results from COBE and WMAP are included, along with discussions of the horizon and flatness problems (and their inflationary solution).

Many reviewers and class-testers of the manuscript of this edition have offered suggestions to improve both the physics and its presentation.
I am particularly grateful to:

David Bannon, Oregon State University
Gerald Crawford, Fort Lewis College
Luther Frommhold, University of Texas-Austin
Gary Goldstein, Tufts University
Leon Gunther, Tufts University
Gary Ihas, University of Florida
Paul Lee, California State University, Northridge
Jeff Loats, Metropolitan State College of Denver
Jay Newman, Union College
Stephen Pate, New Mexico State University
David Roundy, Oregon State University
Rich Schelp, Erskine College
Weidian Shen, Eastern Michigan University
Hongtao Shi, Sonoma State University
Janet Tate, Oregon State University
Jeffrey L. Wragg, College of Charleston
Weldon Wilson, University of Central Oklahoma

I am also grateful for the many anonymous comments from students who used the manuscript at the test sites. I am indebted to all those reviewers and users for their contributions to the project.

Funding for the development and testing of the supplemental exercises in the Instructor’s Manual was provided through a grant from the National Science Foundation. I am pleased to acknowledge their support. Two graduate students at Oregon State University helped to test and implement the curricular reforms: K. C. Walsh and Pornrat Wattasinawich. I appreciate their assistance in this project.

The staff at John Wiley & Sons have been especially helpful throughout the project. I am particularly grateful to: Executive Editor Stuart Johnson for his patience and support in bringing the new edition into reality; Assistant Production Editor Elaine Chew for handling a myriad of complicated composition and illustration details with efficiency and good humor; and Photo Editor Sheena Goldstein for helping me navigate the treacherous waters of new copyright and permission restrictions.

In my research and other professional activities, I occasionally meet physicists who used earlier editions of this text when they were students.
Some report that their first exposure to modern physics kindled the spark that led them to careers in physics. For many students, this course offers their first insights into what physicists really do and what is exciting, perplexing, and challenging about our profession. I hope students who use this new edition will continue to find those inspirations.

Corvallis, Oregon
August 2011

Kenneth S. Krane
[email protected]

CONTENTS

Preface

1. The Failures of Classical Physics
1.1 Review of Classical Physics
1.2 The Failure of Classical Concepts of Space and Time
1.3 The Failure of the Classical Theory of Particle Statistics
1.4 Theory, Experiment, Law
Questions
Problems

2. The Special Theory of Relativity
2.1 Classical Relativity
2.2 The Michelson-Morley Experiment
2.3 Einstein’s Postulates
2.4 Consequences of Einstein’s Postulates
2.5 The Lorentz Transformation
2.6 The Twin Paradox
2.7 Relativistic Dynamics
2.8 Conservation Laws in Relativistic Decays and Collisions
2.9 Experimental Tests of Special Relativity
Questions
Problems

3. The Particlelike Properties of Electromagnetic Radiation
3.1 Review of Electromagnetic Waves
3.2 The Photoelectric Effect
3.3 Thermal Radiation
3.4 The Compton Effect
3.5 Other Photon Processes
3.6 What is a Photon?
Questions
Problems

4. The Wavelike Properties of Particles
4.1 De Broglie’s Hypothesis
4.2 Experimental Evidence for De Broglie Waves
4.3 Uncertainty Relationships for Classical Waves
4.4 Heisenberg Uncertainty Relationships
4.5 Wave Packets
4.6 The Motion of a Wave Packet
4.7 Probability and Randomness
Questions
Problems

5. The Schrödinger Equation
5.1 Behavior of a Wave at a Boundary
5.2 Confining a Particle
5.3 The Schrödinger Equation
5.4 Applications of the Schrödinger Equation
5.5 The Simple Harmonic Oscillator
5.6 Steps and Barriers
Questions
Problems

6. The Rutherford-Bohr Model of the Atom
6.1 Basic Properties of Atoms
6.2 Scattering Experiments and the Thomson Model
6.3 The Rutherford Nuclear Atom
6.4 Line Spectra
6.5 The Bohr Model
6.6 The Franck-Hertz Experiment
6.7 The Correspondence Principle
6.8 Deficiencies of the Bohr Model
Questions
Problems

7. The Hydrogen Atom in Wave Mechanics
7.1 A One-Dimensional Atom
7.2 Angular Momentum in the Hydrogen Atom
7.3 The Hydrogen Atom Wave Functions
7.4 Radial Probability Densities
7.5 Angular Probability Densities
7.6 Intrinsic Spin
7.7 Energy Levels and Spectroscopic Notation
7.8 The Zeeman Effect
7.9 Fine Structure
Questions
Problems

8. Many-Electron Atoms
8.1 The Pauli Exclusion Principle
8.2 Electronic States in Many-Electron Atoms
8.3 Outer Electrons: Screening and Optical Transitions
8.4 Properties of the Elements
8.5 Inner Electrons: Absorption Edges and X Rays
8.6 Addition of Angular Momenta
8.7 Lasers
Questions
Problems

9. Molecular Structure
9.1 The Hydrogen Molecule
9.2 Covalent Bonding in Molecules
9.3 Ionic Bonding
9.4 Molecular Vibrations
9.5 Molecular Rotations
9.6 Molecular Spectra
Questions
Problems

10. Statistical Physics
10.1 Statistical Analysis
10.2 Classical and Quantum Statistics
10.3 The Density of States
10.4 The Maxwell-Boltzmann Distribution
10.5 Quantum Statistics
10.6 Applications of Bose-Einstein Statistics
10.7 Applications of Fermi-Dirac Statistics
Questions
Problems

11. Solid-State Physics
11.1 Crystal Structures
11.2 The Heat Capacity of Solids
11.3 Electrons in Metals
11.4 Band Theory of Solids
11.5 Superconductivity
11.6 Intrinsic and Impurity Semiconductors
11.7 Semiconductor Devices
11.8 Magnetic Materials
Questions
Problems

12. Nuclear Structure and Radioactivity
12.1 Nuclear Constituents
12.2 Nuclear Sizes and Shapes
12.3 Nuclear Masses and Binding Energies
12.4 The Nuclear Force
12.5 Quantum States in Nuclei
12.6 Radioactive Decay
12.7 Alpha Decay
12.8 Beta Decay
12.9 Gamma Decay and Nuclear Excited States
12.10 Natural Radioactivity
Questions
Problems

13. Nuclear Reactions and Applications
13.1 Types of Nuclear Reactions
13.2 Radioisotope Production in Nuclear Reactions
13.3 Low-Energy Reaction Kinematics
13.4 Fission
13.5 Fusion
13.6 Nucleosynthesis
13.7 Applications of Nuclear Physics
Questions
Problems

14. Elementary Particles
14.1 The Four Basic Forces
14.2 Classifying Particles
14.3 Conservation Laws
14.4 Particle Interactions and Decays
14.5 Energy and Momentum in Particle Decays
14.6 Energy and Momentum in Particle Reactions
14.7 The Quark Structure of Mesons and Baryons
14.8 The Standard Model
Questions
Problems

15. Cosmology: The Origin and Fate of the Universe
15.1 The Expansion of the Universe
15.2 The Cosmic Microwave Background Radiation
15.3 Dark Matter
15.4 The General Theory of Relativity
15.5 Tests of General Relativity
15.6 Stellar Evolution and Black Holes
15.7 Cosmology and General Relativity
15.8 The Big Bang Cosmology
15.9 The Formation of Nuclei and Atoms
15.10 Experimental Cosmology
Questions
Problems

Appendix A: Constants and Conversion Factors
Appendix B: Complex Numbers
Appendix C: Periodic Table of the Elements
Appendix D: Table of Atomic Masses
Answers to Odd-Numbered Problems
Photo Credits
Index
Index to Tables

Chapter 1
THE FAILURES OF CLASSICAL PHYSICS

[Figure: Cassini interplanetary trajectory, from launch (15 Oct 1997) through swingbys of Venus (26 Apr 1998 and 24 Jun 1999), Earth (18 Aug 1999), and Jupiter (30 Dec 2000) to Saturn arrival (1 Jul 2004), shown against the orbits of Venus, Earth, Jupiter, and Saturn.]

Classical physics, as postulated by Newton, has enabled us to send space probes on trajectories involving many complicated maneuvers, such as the Cassini mission to Saturn, which was launched in 1997 and gained speed for its trip to Saturn by performing four “gravity-assist” flybys of Venus (twice), Earth, and Jupiter. The spacecraft arrived at Saturn in 2004 and is expected to continue to send data through at least 2017. Planning and executing such interplanetary voyages are great triumphs for Newtonian physics, but when objects move at speeds close to the speed of light or when we examine matter on the atomic or subatomic scale, Newtonian mechanics is not adequate to explain our observations, as we discuss in this chapter.
If you were a physicist living at the end of the 19th century, you probably would have been pleased with the progress that physics had made in understanding the laws that govern the processes of nature. Newton’s laws of mechanics, including gravitation, had been carefully tested, and their success had provided a framework for understanding the interactions among objects. Electricity and magnetism had been unified by Maxwell’s theoretical work, and the electromagnetic waves predicted by Maxwell’s equations had been discovered and investigated in the experiments conducted by Hertz. The laws of thermodynamics and kinetic theory had been particularly successful in providing a unified explanation of a wide variety of phenomena involving heat and temperature. These three successful theories (mechanics, electromagnetism, and thermodynamics) form the basis for what we call “classical physics.”

Beyond your 19th-century physics laboratory, the world was undergoing rapid changes. The Industrial Revolution demanded laborers for the factories and accelerated the transition from a rural and agrarian to an urban society. These workers formed the core of an emerging middle class and a new economic order. The political world was changing, too: the rising tide of militarism, the forces of nationalism and revolution, and the gathering strength of Marxism would soon upset established governments. The fine arts were similarly in the middle of revolutionary change, as new ideas began to dominate the fields of painting, sculpture, and music. The understanding of even the very fundamental aspects of human behavior was subject to serious and critical modification by the Freudian psychologists.

In the world of physics, too, there were undercurrents that would soon cause revolutionary changes.
Even though the overwhelming majority of experimental evidence agreed with classical physics, several experiments gave results that were not explainable in terms of the otherwise successful classical theories. Classical electromagnetic theory suggested that a medium is needed to propagate electromagnetic waves, but precise experiments failed to detect this medium. Experiments to study the emission of electromagnetic waves by hot, glowing objects gave results that could not be explained by the classical theories of thermodynamics and electromagnetism. Experiments on the emission of electrons from surfaces illuminated with light also could not be understood using classical theories.

These few experiments may not seem significant, especially when viewed against the background of the many successful and well-understood experiments of the 19th century. However, these experiments were to have a profound and lasting effect, not only on the world of physics, but on all of science, on the political structure of our world, and on the way we view ourselves and our place in the universe. Within the short span of two decades between 1905 and 1925, the shortcomings of classical physics would lead to the special and general theories of relativity and the quantum theory.

The designation modern physics usually refers to the developments that began in about 1900 and led to the relativity and quantum theories, including the applications of those theories to understanding the atom, the atomic nucleus and the particles of which it is composed, collections of atoms in molecules and solids, and, on a cosmic scale, the origin and evolution of the universe. Our discussion of modern physics in this text touches on each of these areas.

We begin our study in this chapter with a brief review of some important principles of classical physics, and we discuss some situations in which classical physics offers either inadequate or incorrect conclusions.
These situations are not necessarily those that originally gave rise to the relativity and quantum theories, but they do help us understand why classical physics fails to give us a complete picture of nature.

1.1 REVIEW OF CLASSICAL PHYSICS

Although there are many areas in which modern physics differs radically from classical physics, we frequently find the need to refer to concepts of classical physics. Here is a brief review of some of the concepts of classical physics that we may need.

Mechanics

A particle of mass $m$ moving with velocity $\vec{v}$ has a kinetic energy defined by

$$K = \tfrac{1}{2}mv^2 \tag{1.1}$$

and a linear momentum $\vec{p}$ defined by

$$\vec{p} = m\vec{v} \tag{1.2}$$

In terms of the linear momentum, the kinetic energy can be written

$$K = \frac{p^2}{2m} \tag{1.3}$$

When one particle collides with another, we analyze the collision by applying two fundamental conservation laws:

I. Conservation of Energy. The total energy of an isolated system (on which no net external force acts) remains constant. In the case of a collision between particles, this means that the total energy of the particles before the collision is equal to the total energy of the particles after the collision.

II. Conservation of Linear Momentum. The total linear momentum of an isolated system remains constant. For the collision, the total linear momentum of the particles before the collision is equal to the total linear momentum of the particles after the collision. Because linear momentum is a vector, application of this law usually gives us two equations, one for the x components and another for the y components.

These two conservation laws are of the most basic importance to understanding and analyzing a wide variety of problems in classical physics. Problems 1–4 and 11–14 at the end of this chapter review the use of these laws. The importance of these conservation laws is both so great and so fundamental that, even though in Chapter 2 we learn that the special theory of relativity modifies Eqs.
1.1, 1.2, and 1.3, the laws of conservation of energy and linear momentum remain valid.

Example 1.1

A helium atom ($m = 6.6465 \times 10^{-27}$ kg) moving at a speed of $v_{\rm He} = 1.518 \times 10^{6}$ m/s collides with an atom of nitrogen ($m = 2.3253 \times 10^{-26}$ kg) at rest. After the collision, the helium atom is found to be moving with a velocity of $v'_{\rm He} = 1.199 \times 10^{6}$ m/s at an angle of $\theta_{\rm He} = 78.75^\circ$ relative to the direction of the original motion of the helium atom. (a) Find the velocity (magnitude and direction) of the nitrogen atom after the collision. (b) Compare the kinetic energy before the collision with the total kinetic energy of the atoms after the collision.

FIGURE 1.1 Example 1.1. (a) Before collision; (b) after collision.

Solution

(a) The law of conservation of momentum for this collision can be written in vector form as $\vec{p}_{\rm initial} = \vec{p}_{\rm final}$, which is equivalent to

$$p_{x,\rm initial} = p_{x,\rm final} \quad \text{and} \quad p_{y,\rm initial} = p_{y,\rm final}$$

The collision is shown in Figure 1.1. The initial values of the total momentum are, choosing the x axis to be the direction of the initial motion of the helium atom,

$$p_{x,\rm initial} = m_{\rm He} v_{\rm He} \quad \text{and} \quad p_{y,\rm initial} = 0$$

The final total momentum can be written

$$p_{x,\rm final} = m_{\rm He} v'_{\rm He} \cos\theta_{\rm He} + m_{\rm N} v'_{\rm N} \cos\theta_{\rm N}$$

$$p_{y,\rm final} = m_{\rm He} v'_{\rm He} \sin\theta_{\rm He} + m_{\rm N} v'_{\rm N} \sin\theta_{\rm N}$$

The expression for $p_{y,\rm final}$ is written in general form with a + sign even though we expect that $\theta_{\rm He}$ and $\theta_{\rm N}$ are on opposite sides of the x axis. If the equation is written in this way, $\theta_{\rm N}$ will come out to be negative. The law of conservation of momentum gives, for the x components, $m_{\rm He} v_{\rm He} = m_{\rm He} v'_{\rm He} \cos\theta_{\rm He} + m_{\rm N} v'_{\rm N} \cos\theta_{\rm N}$, and for the y components, $0 = m_{\rm He} v'_{\rm He} \sin\theta_{\rm He} + m_{\rm N} v'_{\rm N} \sin\theta_{\rm N}$. Solving for the unknown terms, we find

$$v'_{\rm N} \cos\theta_{\rm N} = \frac{m_{\rm He}(v_{\rm He} - v'_{\rm He} \cos\theta_{\rm He})}{m_{\rm N}}$$

$$= \{(6.6465 \times 10^{-27}\ \text{kg})[1.518 \times 10^{6}\ \text{m/s} - (1.199 \times 10^{6}\ \text{m/s})(\cos 78.75^\circ)]\} \times (2.3253 \times 10^{-26}\ \text{kg})^{-1}$$

$$= 3.6704 \times 10^{5}\ \text{m/s}$$

$$v'_{\rm N} \sin\theta_{\rm N} = -\frac{m_{\rm He} v'_{\rm He} \sin\theta_{\rm He}}{m_{\rm N}}$$

$$= -(6.6465 \times 10^{-27}\ \text{kg})(1.199 \times 10^{6}\ \text{m/s})(\sin 78.75^\circ)(2.3253 \times 10^{-26}\ \text{kg})^{-1}$$

$$= -3.3613 \times 10^{5}\ \text{m/s}$$

We can now solve for $v'_{\rm N}$ and $\theta_{\rm N}$:

$$v'_{\rm N} = \sqrt{(v'_{\rm N} \sin\theta_{\rm N})^2 + (v'_{\rm N} \cos\theta_{\rm N})^2} = \sqrt{(-3.3613 \times 10^{5}\ \text{m/s})^2 + (3.6704 \times 10^{5}\ \text{m/s})^2} = 4.977 \times 10^{5}\ \text{m/s}$$

$$\theta_{\rm N} = \tan^{-1}\frac{v'_{\rm N} \sin\theta_{\rm N}}{v'_{\rm N} \cos\theta_{\rm N}} = \tan^{-1}\left(\frac{-3.3613 \times 10^{5}\ \text{m/s}}{3.6704 \times 10^{5}\ \text{m/s}}\right) = -42.48^\circ$$

(b) The initial kinetic energy is

$$K_{\rm initial} = \tfrac{1}{2} m_{\rm He} v_{\rm He}^2 = \tfrac{1}{2}(6.6465 \times 10^{-27}\ \text{kg})(1.518 \times 10^{6}\ \text{m/s})^2 = 7.658 \times 10^{-15}\ \text{J}$$

and the total final kinetic energy is

$$K_{\rm final} = \tfrac{1}{2} m_{\rm He} v'^{2}_{\rm He} + \tfrac{1}{2} m_{\rm N} v'^{2}_{\rm N}$$

$$= \tfrac{1}{2}(6.6465 \times 10^{-27}\ \text{kg})(1.199 \times 10^{6}\ \text{m/s})^2 + \tfrac{1}{2}(2.3253 \times 10^{-26}\ \text{kg})(4.977 \times 10^{5}\ \text{m/s})^2 = 7.658 \times 10^{-15}\ \text{J}$$

Note that the initial and final kinetic energies are equal. This is the characteristic of an elastic collision, in which no energy is lost to, for example, internal excitation of the particles.

Example 1.2

An atom of uranium ($m = 3.9529 \times 10^{-25}$ kg) at rest decays spontaneously into an atom of helium ($m = 6.6465 \times 10^{-27}$ kg) and an atom of thorium ($m = 3.8864 \times 10^{-25}$ kg). The helium atom is observed to move in the positive x direction with a velocity of $1.423 \times 10^{7}$ m/s (Figure 1.2). (a) Find the velocity (magnitude and direction) of the thorium atom. (b) Find the total kinetic energy of the two atoms after the decay.

FIGURE 1.2 Example 1.2. (a) Before decay; (b) after decay.

Solution

(a) Here we again use the law of conservation of momentum. The initial momentum before the decay is zero, so the total momentum of the two atoms after the decay must also be zero:

$$p_{x,\rm initial} = 0 \qquad p_{x,\rm final} = m_{\rm He} v'_{\rm He} + m_{\rm Th} v'_{\rm Th}$$

Setting $p_{x,\rm initial} = p_{x,\rm final}$ and solving for $v'_{\rm Th}$, we obtain

$$v'_{\rm Th} = -\frac{m_{\rm He} v'_{\rm He}}{m_{\rm Th}} = -\frac{(6.6465 \times 10^{-27}\ \text{kg})(1.423 \times 10^{7}\ \text{m/s})}{3.8864 \times 10^{-25}\ \text{kg}} = -2.432 \times 10^{5}\ \text{m/s}$$

The thorium atom moves in the negative x direction.

(b) The total kinetic energy after the decay is:

$$K = \tfrac{1}{2} m_{\rm He} v'^{2}_{\rm He} + \tfrac{1}{2} m_{\rm Th} v'^{2}_{\rm Th}$$

$$= \tfrac{1}{2}(6.6465 \times 10^{-27}\ \text{kg})(1.423 \times 10^{7}\ \text{m/s})^2 + \tfrac{1}{2}(3.8864 \times 10^{-25}\ \text{kg})(-2.432 \times 10^{5}\ \text{m/s})^2 = 6.844 \times 10^{-13}\ \text{J}$$

Clearly kinetic energy is not conserved in this decay, because the initial kinetic energy of the uranium atom was zero. However, total energy is conserved: if we write the total energy as the sum of kinetic energy and nuclear energy, then the total initial energy (kinetic + nuclear) is equal to the total final energy (kinetic + nuclear). Clearly the gain in kinetic energy occurs as a result of a loss in nuclear energy. This is an example of the type of radioactive decay called alpha decay, which we discuss in more detail in Chapter 12.

Another application of the principle of conservation of energy occurs when a particle moves subject to an external force $F$. Corresponding to that external force there is often a potential energy $U$, defined such that (for one-dimensional motion)

$$F = -\frac{dU}{dx} \tag{1.4}$$

The total energy $E$ is the sum of the kinetic and potential energies:

$$E = K + U \tag{1.5}$$

As the particle moves, $K$ and $U$ may change, but $E$ remains constant. (In Chapter 2, we find that the special theory of relativity gives us a new definition of total energy.)
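The arithmetic in Examples 1.1 and 1.2 can be checked numerically. The following is a minimal Python sketch, not part of the original text: the function names (`kinetic_energy`, `nitrogen_recoil`, `thorium_recoil`) are hypothetical helpers introduced only for illustration, and the masses and speeds are the values quoted in the two examples.

```python
import math

# Masses in kg, as quoted in Examples 1.1 and 1.2.
M_HE = 6.6465e-27
M_N = 2.3253e-26
M_TH = 3.8864e-25


def kinetic_energy(mass, speed):
    """Classical kinetic energy K = (1/2) m v^2 (Eq. 1.1)."""
    return 0.5 * mass * speed ** 2


def nitrogen_recoil(v_he, v_he_final, theta_he_deg):
    """Example 1.1: solve the x and y momentum equations for the nitrogen atom.

    Returns (speed in m/s, angle in degrees) of the nitrogen atom.
    """
    theta = math.radians(theta_he_deg)
    vx = M_HE * (v_he - v_he_final * math.cos(theta)) / M_N  # v'_N cos(theta_N)
    vy = -M_HE * v_he_final * math.sin(theta) / M_N          # v'_N sin(theta_N)
    return math.hypot(vx, vy), math.degrees(math.atan2(vy, vx))


def thorium_recoil(v_he_final):
    """Example 1.2: total momentum is zero, so m_He v'_He + m_Th v'_Th = 0."""
    return -M_HE * v_he_final / M_TH


# Example 1.1: He-N collision.
v_n, theta_n = nitrogen_recoil(1.518e6, 1.199e6, 78.75)
k_initial = kinetic_energy(M_HE, 1.518e6)
k_final = kinetic_energy(M_HE, 1.199e6) + kinetic_energy(M_N, v_n)

# Example 1.2: decay of a uranium atom at rest.
v_th = thorium_recoil(1.423e7)
k_decay = kinetic_energy(M_HE, 1.423e7) + kinetic_energy(M_TH, v_th)

print(f"Example 1.1: v'_N = {v_n:.4g} m/s at {theta_n:.2f} degrees")
print(f"Example 1.1: K_initial = {k_initial:.4g} J, K_final = {k_final:.4g} J")
print(f"Example 1.2: v'_Th = {v_th:.4g} m/s, K = {k_decay:.4g} J")
```

Because the input data are rounded to four or five significant figures, the computed values should be expected to agree with the worked solutions only to about that precision.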
When a particle moving with linear momentum $\vec{p}$ is at a displacement $\vec{r}$ from the origin O, its angular momentum $\vec{L}$ about the point O is defined (see Figure 1.3) by

$$\vec{L} = \vec{r} \times \vec{p} \tag{1.6}$$

FIGURE 1.3 A particle of mass m, located with respect to the origin O by position vector $\vec{r}$ and moving with linear momentum $\vec{p}$, has angular momentum $\vec{L}$ about O.

There is a conservation law for angular momentum, just as with linear momentum. In practice this has many important applications. For example, when a charged particle moves near, and is deflected by, another charged particle, the total angular momentum of the system (the two particles) remains constant if no net external torque acts on the system. If the second particle is so much more massive than the first that its motion is essentially unchanged by the influence of the first particle, the angular momentum of the first particle remains constant (because the second particle acquires no angular momentum). Another application of the conservation of angular momentum occurs when a body such as a comet moves in the gravitational field of the Sun: the elliptical shape of the comet's orbit is necessary to conserve angular momentum. In this case $\vec{r}$ and $\vec{p}$ of the comet must simultaneously change so that $\vec{L}$ remains constant.

Velocity Addition

Another important aspect of classical physics is the rule for combining velocities. For example, suppose a jet plane is moving at a velocity of $v_\mathrm{PG} = 650$ m/s, as measured by an observer on the ground. The subscripts on the velocity mean "velocity of the plane relative to the ground." The plane fires a missile in the forward direction; the velocity of the missile relative to the plane is $v_\mathrm{MP} = 250$ m/s. According to the observer on the ground, the velocity of the missile is $v_\mathrm{MG} = v_\mathrm{MP} + v_\mathrm{PG} = 250\ \mathrm{m/s} + 650\ \mathrm{m/s} = 900\ \mathrm{m/s}$.

We can generalize this rule as follows. Let $\vec{v}_\mathrm{AB}$ represent the velocity of A relative to B, and let $\vec{v}_\mathrm{BC}$ represent the velocity of B relative to C. Then the velocity of A relative to C is

$$\vec{v}_\mathrm{AC} = \vec{v}_\mathrm{AB} + \vec{v}_\mathrm{BC} \tag{1.7}$$

This equation is written in vector form to allow for the possibility that the velocities might be in different directions; for example, the missile might be fired not in the direction of the plane's velocity but in some other direction. This seems to be a very "common-sense" way of combining velocities, but we will see later in this chapter (and in more detail in Chapter 2) that this common-sense rule can lead to contradictions with observations when we apply it to speeds close to the speed of light.

A common application of this rule (for speeds small compared with the speed of light) occurs in collisions, when we want to analyze conservation of momentum and energy in a frame of reference that is different from the one in which the collision is observed. For example, let's analyze the collision of Example 1.1 in a frame of reference that is moving with the center of mass. Suppose the initial velocity of the He atom defines the positive x direction. The velocity of the center of mass (relative to the laboratory) is then $v_\mathrm{CL} = (v_\mathrm{He} m_\mathrm{He} + v_\mathrm{N} m_\mathrm{N})/(m_\mathrm{He} + m_\mathrm{N}) = 3.374\times10^{5}$ m/s. We would like to find the initial velocity of the He and N relative to the center of mass. If we start with $v_\mathrm{HeL} = v_\mathrm{HeC} + v_\mathrm{CL}$ and $v_\mathrm{NL} = v_\mathrm{NC} + v_\mathrm{CL}$, then

$$v_\mathrm{HeC} = v_\mathrm{HeL} - v_\mathrm{CL} = 1.518\times10^{6}\ \mathrm{m/s} - 3.374\times10^{5}\ \mathrm{m/s} = 1.181\times10^{6}\ \mathrm{m/s}$$

$$v_\mathrm{NC} = v_\mathrm{NL} - v_\mathrm{CL} = 0 - 3.374\times10^{5}\ \mathrm{m/s} = -0.337\times10^{6}\ \mathrm{m/s}$$

In a similar fashion we can calculate the final velocities of the He and N. The resulting collision as viewed from this frame of reference is illustrated in Figure 1.4. There is a special symmetry in this view of the collision that is not apparent from the same collision viewed in the laboratory frame of reference (Figure 1.1); each velocity simply changes direction leaving its magnitude unchanged, and the atoms move in opposite directions.
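The transformation to the center-of-mass frame can be sketched in a few lines. This is an illustrative calculation under stated assumptions: the helper `to_cm_frame` and the dictionary names are hypothetical, and the lab-frame speeds and angles are the rounded values from Example 1.1, so the speeds before and after the collision agree only to within rounding.

```python
import math

m_he, m_n = 6.6465e-27, 2.3253e-26    # kg
# Center-of-mass velocity along +x (nitrogen is initially at rest).
v_cm = m_he * 1.518e6 / (m_he + m_n)  # m/s

def to_cm_frame(speed, angle_deg):
    """Return (vx, vy) in the center-of-mass frame for a lab-frame velocity."""
    a = math.radians(angle_deg)
    # Eq. 1.7 changes only the x component; the y component is unchanged.
    return speed * math.cos(a) - v_cm, speed * math.sin(a)

# Lab-frame velocities from Example 1.1: (speed in m/s, angle in degrees).
lab = {
    "He before": (1.518e6, 0.0),
    "N before": (0.0, 0.0),
    "He after": (1.199e6, 78.75),
    "N after": (4.977e5, -42.48),
}

# Speed of each atom as seen from the center-of-mass frame.
cm_speed = {name: math.hypot(*to_cm_frame(*v)) for name, v in lab.items()}
for name, s in cm_speed.items():
    print(f"{name}: {s:.4g} m/s")
```

In this frame each atom's speed is the same before and after the collision, which is the symmetry described in the text.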
The angles in this view of the collision are different from those of Figure 1.1, because the velocity addition in this case applies only to the x components and leaves the y components unchanged, which means that the angles must change.

FIGURE 1.4 The collision of Figure 1.1 viewed from a frame of reference moving with the center of mass. (a) Before collision. (b) After collision. In this frame the two particles always move in opposite directions, and for elastic collisions the magnitude of each particle's velocity is unchanged.

Electricity and Magnetism

The electrostatic force (Coulomb force) exerted by a charged particle $q_1$ on another charge $q_2$ has magnitude

$$F = \frac{1}{4\pi\varepsilon_0}\frac{|q_1||q_2|}{r^2} \tag{1.8}$$

The direction of F is along the line joining the particles (Figure 1.5). In the SI system of units, the constant $1/4\pi\varepsilon_0$ has the value

$$\frac{1}{4\pi\varepsilon_0} = 8.988\times10^{9}\ \mathrm{N\cdot m^2/C^2}$$

The corresponding potential energy is

$$U = \frac{1}{4\pi\varepsilon_0}\frac{q_1 q_2}{r} \tag{1.9}$$

In all equations derived from Eq. 1.8 or 1.9 as starting points, the quantity $1/4\pi\varepsilon_0$ must appear. In some texts and reference books, you may find electrostatic quantities in which this constant does not appear. In such cases, the centimeter-gram-second (cgs) system has probably been used, in which the constant $1/4\pi\varepsilon_0$ is defined to be 1. You should always be very careful in making comparisons of electrostatic quantities from different references and check that the units are identical.

An electrostatic potential difference $\Delta V$ can be established by a distribution of charges. The most common example of a potential difference is that between the two terminals of a battery. When a charge q moves through a potential difference $\Delta V$, the change in its electrical potential energy $\Delta U$ is

$$\Delta U = q\,\Delta V \tag{1.10}$$

At the atomic or nuclear level, we usually measure charges in terms of the basic charge of the electron or proton, whose magnitude is $e = 1.602\times10^{-19}$ C.
If such charges are accelerated through a potential difference $\Delta V$ that is a few volts, the resulting loss in potential energy and corresponding gain in kinetic energy will be of the order of $10^{-19}$ to $10^{-18}$ J. To avoid working with such small numbers, it is common in the realm of atomic or nuclear physics to measure energies in electron-volts (eV), defined to be the energy of a charge equal in magnitude to that of the electron that passes through a potential difference of 1 volt:

$$\Delta U = q\,\Delta V = (1.602\times10^{-19}\ \mathrm{C})(1\ \mathrm{V}) = 1.602\times10^{-19}\ \mathrm{J}$$

and thus

$$1\ \mathrm{eV} = 1.602\times10^{-19}\ \mathrm{J}$$

FIGURE 1.5 Two charged particles experience equal and opposite electrostatic forces along the line joining their centers. If the charges have the same sign (both positive or both negative), the force is repulsive; if the signs are different, the force is attractive.

Some convenient multiples of the electron-volt are

keV = kilo electron-volt = $10^{3}$ eV
MeV = mega electron-volt = $10^{6}$ eV
GeV = giga electron-volt = $10^{9}$ eV

(In some older works you may find reference to the BeV, for billion electron-volts; this is a source of confusion, for in the United States a billion is $10^{9}$ while in Europe a billion is $10^{12}$.)

Often we wish to find the potential energy of two basic charges separated by typical atomic or nuclear dimensions, and we wish to have the result expressed in electron-volts. Here is a convenient way of doing this. First we express the quantity $e^2/4\pi\varepsilon_0$ in a more convenient form:

$$\frac{e^2}{4\pi\varepsilon_0} = (8.988\times10^{9}\ \mathrm{N\cdot m^2/C^2})(1.602\times10^{-19}\ \mathrm{C})^2 = 2.307\times10^{-28}\ \mathrm{N\cdot m^2}$$

$$= (2.307\times10^{-28}\ \mathrm{N\cdot m^2})\left(\frac{1}{1.602\times10^{-19}\ \mathrm{J/eV}}\right)\left(\frac{10^{9}\ \mathrm{nm}}{\mathrm{m}}\right) = 1.440\ \mathrm{eV\cdot nm}$$

With this useful combination of constants it becomes very easy to calculate electrostatic potential energies. For two electrons separated by a typical atomic dimension of 1.00 nm, Eq.
1.9 gives

$$U = \frac{1}{4\pi\varepsilon_0}\frac{e^2}{r} = \frac{e^2}{4\pi\varepsilon_0}\frac{1}{r} = (1.440\ \mathrm{eV\cdot nm})\frac{1}{1.00\ \mathrm{nm}} = 1.44\ \mathrm{eV}$$

For calculations at the nuclear level, the femtometer is a more convenient unit of distance and MeV is a more appropriate energy unit:

$$\frac{e^2}{4\pi\varepsilon_0} = (1.440\ \mathrm{eV\cdot nm})\left(\frac{1\ \mathrm{m}}{10^{9}\ \mathrm{nm}}\right)\left(\frac{10^{15}\ \mathrm{fm}}{1\ \mathrm{m}}\right)\left(\frac{1\ \mathrm{MeV}}{10^{6}\ \mathrm{eV}}\right) = 1.440\ \mathrm{MeV\cdot fm}$$

It is remarkable (and convenient to remember) that the quantity $e^2/4\pi\varepsilon_0$ has the same value of 1.440 whether we use typical atomic energies and sizes (eV · nm) or typical nuclear energies and sizes (MeV · fm).

A magnetic field $\vec{B}$ can be produced by an electric current i. For example, the magnitude of the magnetic field at the center of a circular current loop of radius r is (see Figure 1.6a)

$$B = \frac{\mu_0 i}{2r} \tag{1.11}$$

FIGURE 1.6 (a) A circular current loop produces a magnetic field $\vec{B}$ at its center. (b) A current loop with magnetic moment $\vec{\mu}$ in an external magnetic field $\vec{B}_\mathrm{ext}$. The field exerts a torque on the loop that will tend to rotate it so that $\vec{\mu}$ lines up with $\vec{B}_\mathrm{ext}$.

The SI unit for magnetic field is the tesla (T), which is equivalent to a newton per ampere-meter. The constant $\mu_0$ is

$$\mu_0 = 4\pi\times10^{-7}\ \mathrm{N\cdot s^2/C^2}$$

Be sure to remember that i is in the direction of the conventional (positive) current, opposite to the actual direction of travel of the negatively charged electrons that typically produce the current in metallic wires. The direction of $\vec{B}$ is chosen according to the right-hand rule: if you hold the wire in the right hand with the thumb pointing in the direction of the current, the fingers point in the direction of the magnetic field.

It is often convenient to define the magnetic moment $\vec{\mu}$ of a current loop:

$$|\vec{\mu}| = iA \tag{1.12}$$

where A is the geometrical area enclosed by the loop. The direction of $\vec{\mu}$ is perpendicular to the plane of the loop, according to the right-hand rule.
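The unit conversions for $e^2/4\pi\varepsilon_0$ and the current-loop formulas (Eqs. 1.11 and 1.12) lend themselves to a quick numerical check. The sketch below is illustrative only: the function names are assumptions, and the loop current and radius (1 A, 1 cm) are hypothetical values chosen for the example, not taken from the text.

```python
import math

K_COULOMB = 8.988e9        # N·m²/C², the value of 1/(4πε0) quoted in the text
E_CHARGE = 1.602e-19       # C, magnitude of the electron charge
J_PER_EV = 1.602e-19       # J per eV
MU_0 = 4 * math.pi * 1e-7  # N·s²/C²

# e²/4πε0 in SI units, then converted to atomic and nuclear units.
e2_si = K_COULOMB * E_CHARGE**2             # N·m² (equivalently J·m)
e2_ev_nm = e2_si / J_PER_EV * 1e9           # eV·nm
e2_mev_fm = e2_si / J_PER_EV / 1e6 * 1e15   # MeV·fm

def coulomb_energy_ev(r_nm, z1=1, z2=1):
    """Eq. 1.9 in eV for charges z1*e and z2*e separated by r_nm nanometers."""
    return z1 * z2 * e2_ev_nm / r_nm

def loop_center_field(current_a, radius_m):
    """Eq. 1.11: field magnitude (T) at the center of a circular current loop."""
    return MU_0 * current_a / (2 * radius_m)

def loop_moment(current_a, radius_m):
    """Eq. 1.12: magnetic moment magnitude (A·m²) of a circular loop."""
    return current_a * math.pi * radius_m**2

print(f"e^2/4πε0 = {e2_ev_nm:.3f} eV·nm = {e2_mev_fm:.3f} MeV·fm")
print(f"U at 1.00 nm = {coulomb_energy_ev(1.00):.2f} eV")
# Hypothetical loop: 1 A circulating in a loop of radius 1 cm.
print(f"B = {loop_center_field(1.0, 0.01):.3e} T, mu = {loop_moment(1.0, 0.01):.3e} A·m²")
```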
When a current loop is placed in a uniform external magnetic field $\vec{B}_\mathrm{ext}$ (as in Figure 1.6b), there is a torque $\vec{\tau}$ on the loop that tends to line up $\vec{\mu}$ with $\vec{B}_\mathrm{ext}$:

$$\vec{\tau} = \vec{\mu} \times \vec{B}_\mathrm{ext} \tag{1.13}$$

Another way to describe this interaction is to assign a potential energy to the magnetic moment $\vec{\mu}$ in the external field $\vec{B}_\mathrm{ext}$:

$$U = -\vec{\mu} \cdot \vec{B}_\mathrm{ext} \tag{1.14}$$

When the field $\vec{B}_\mathrm{ext}$ is applied, $\vec{\mu}$ rotates so that its energy tends to a minimum value, which occurs when $\vec{\mu}$ and $\vec{B}_\mathrm{ext}$ are parallel.

It is important for us to understand the properties of magnetic moments, because particles such as electrons or protons have magnetic moments. Although we don't imagine these particles to be tiny current loops, their magnetic moments do obey Eqs. 1.13 and 1.14.

A particularly important aspect of electromagnetism is electromagnetic waves. In Chapter 3 we discuss some properties of these waves in more detail. Electromagnetic waves travel in free space with speed c (the speed of light), which is related to the electromagnetic constants $\varepsilon_0$ and $\mu_0$:

$$c = (\varepsilon_0 \mu_0)^{-1/2} \tag{1.15}$$

The speed of light has the exact value of c = 299,792,458 m/s.

Electromagnetic waves have a frequency f and wavelength λ, which are related by

$$c = \lambda f \tag{1.16}$$

The wavelengths range from the very short (nuclear gamma rays) to the very long (radio waves). Figure 1.7 shows the electromagnetic spectrum with the conventional names assigned to the different ranges of wavelengths.

FIGURE 1.7 The electromagnetic spectrum. The boundaries of the regions are not sharply defined. (Axes: wavelength in m and frequency in Hz. Labeled regions include long-wave radio, AM broadcast, short-wave radio, FM and TV, microwave, infrared, visible light, ultraviolet, X rays, and nuclear gamma rays.)
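Equations 1.13 through 1.16 can be explored numerically. The following sketch rests on a few assumptions: $\varepsilon_0$ is derived from the rounded value of $1/4\pi\varepsilon_0$ quoted earlier in the section, so the resulting speed of light is only approximate; the function names are illustrative; and the magnetic moment, field, and wavelength passed in at the end are hypothetical example values.

```python
import math

K_COULOMB = 8.988e9                        # N·m²/C², 1/(4πε0) as quoted in the text
MU_0 = 4 * math.pi * 1e-7                  # N·s²/C²
EPSILON_0 = 1 / (4 * math.pi * K_COULOMB)  # C²/(N·m²), derived from K_COULOMB
C_EXACT = 299_792_458                      # m/s

# Eq. 1.15: speed of light from the electromagnetic constants.
c_estimate = 1 / math.sqrt(EPSILON_0 * MU_0)

def frequency_hz(wavelength_m):
    """Eq. 1.16 solved for f, using the exact value of c."""
    return C_EXACT / wavelength_m

def torque_magnitude(mu, b_ext, angle_deg):
    """Magnitude of Eq. 1.13 for an angle between the moment and the field."""
    return mu * b_ext * math.sin(math.radians(angle_deg))

def moment_energy(mu, b_ext, angle_deg):
    """Eq. 1.14 for an angle between the moment and the field."""
    return -mu * b_ext * math.cos(math.radians(angle_deg))

print(f"c from constants: {c_estimate:.4e} m/s")
print(f"f for a 500 nm wave: {frequency_hz(500e-9):.3e} Hz")
# Hypothetical moment of 1 A·m² in a 1 T field: energy is lowest when aligned.
for angle in (0, 90, 180):
    print(angle, moment_energy(1.0, 1.0, angle), torque_magnitude(1.0, 1.0, angle))
```

The torque vanishes at 0° and 180°, but only the parallel orientation is an energy minimum, consistent with the statement that the moment tends to line up with the field.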
Kinetic Theory of Matter

An example of the successful application of classical physics to the structure of matter is the understanding of the properties of gases at relatively low pressures and high temperatures (so that the gas is far from the region of pressure and temperature where it might begin to condense into a liquid). Under these conditions, most real gases can be modeled as ideal gases and are well described by the ideal gas equation of state

$$PV = NkT \tag{1.17}$$

where P is the pressure, V is the volume occupied by the gas, N is the number of molecules, T is the temperature, and k is the Boltzmann constant, which has the value

$$k = 1.381\times10^{-23}\ \mathrm{J/K}$$

In using this equation and most of the equations in this section, the temperature must be measured in units of kelvins (K). Be careful not to confuse the symbol K for the unit of temperature with the symbol K for kinetic energy. The ideal gas equation of state can also be expressed as

$$PV = nRT \tag{1.18}$$

where n is the number of moles and R is the universal gas constant with a value of

$$R = 8.315\ \mathrm{J/mol\cdot K}$$

One mole of a gas is the quantity that contains a number of fundamental entities (atoms or molecules) equal to Avogadro's constant $N_\mathrm{A}$, where

$$N_\mathrm{A} = 6.022\times10^{23}\ \text{per mole}$$

That is, one mole of helium contains $N_\mathrm{A}$ atoms of He, one mole of nitrogen contains $N_\mathrm{A}$ molecules of N$_2$ (and thus $2N_\mathrm{A}$ atoms of N), and one mole of water vapor contains $N_\mathrm{A}$ molecules of H$_2$O (and thus $2N_\mathrm{A}$ atoms of H and $N_\mathrm{A}$ atoms of O).

Because $N = nN_\mathrm{A}$ (number of molecules equals number of moles times number of molecules per mole), the relationship between the Boltzmann constant and the universal gas constant is

$$R = kN_\mathrm{A} \tag{1.19}$$

The ideal gas model is very successful for describing the properties of many gases. It assumes that the molecules are of negligibly small volume (that is, the gas is mostly empty space) and move randomly throughout the volume of the container.
The molecules make occasional collisions with one another and with the walls of the container. The collisions obey Newton's laws and are elastic and of very short duration. The molecules exert forces on one another only during collisions. Under these assumptions, there is no potential energy so that kinetic energy is the only form of energy that must be considered. Because the collisions are elastic, there is no net loss or gain of kinetic energy during the collisions.

Individual molecules may speed up or slow down due to collisions, but the average kinetic energy of all the molecules in the container does not change. The average kinetic energy of a molecule in fact depends only on the temperature:

$$K_\mathrm{av} = \tfrac{3}{2}kT \quad \text{(per molecule)} \tag{1.20}$$

For rough estimates, the quantity kT is often used as a measure of the mean kinetic energy per particle. For example, at room temperature (20°C = 293 K), the mean kinetic energy per particle is approximately $4\times10^{-21}$ J (about 1/40 eV), while in the interior of a star where $T \sim 10^{7}$ K, the mean energy is approximately $10^{-16}$ J (about 1000 eV).

Sometimes it is also useful to discuss the average kinetic energy of a mole of the gas:

average K per mole = average K per molecule × number of molecules per mole

Using Eq. 1.19 to relate the Boltzmann constant to the universal gas constant, we find the average molar kinetic energy to be

$$K_\mathrm{av} = \tfrac{3}{2}RT \quad \text{(per mole)} \tag{1.21}$$

It should be apparent from the context of the discussion whether $K_\mathrm{av}$ refers to the average per molecule or the average per mole.

1.2 THE FAILURE OF CLASSICAL CONCEPTS OF SPACE AND TIME

In 1905, Albert Einstein proposed the special theory of relativity, which is in essence a new way of looking at space and time, replacing the "classical" space and time that were the basis of the physical theories of Galileo and Newton.
Einstein's proposal was based on a "thought experiment," but in subsequent years experimental data have clearly indicated that the classical concepts of space and time are incorrect. In this section we examine how experimental results support the need for a new approach to space and time.

The Failure of the Classical Concept of Time

In high-energy collisions between two protons, many new particles can be produced, one of which is a pi meson (also known as a pion). When the pions are produced at rest in the laboratory, they are observed to have an average lifetime (the time between the production of the pion and its decay into other particles) of 26.0 ns (nanoseconds, or $10^{-9}$ s). On the other hand, pions in motion are observed to have a very different lifetime. In one particular experiment, pions moving at a speed of $2.737\times10^{8}$ m/s (91.3% of the speed of light) showed a lifetime of 63.7 ns.

Let us imagine this experiment as viewed by two different observers (Figure 1.8). Observer #1, at rest in the laboratory, sees the pion moving relative to the laboratory at a speed of 91.3% of the speed of light and measures its lifetime to be 63.7 ns. Observer #2 is moving relative to the laboratory at exactly the same velocity as the pion, so according to observer #2 the pion is at rest and has a lifetime of 26.0 ns. The two observers measure different values for the time interval between the same two events: the formation of the pion and its decay.

FIGURE 1.8 (a) The pion experiment according to O1. Markers A and B respectively show the locations of the pion's creation and decay. (b) The same experiment as viewed by O2, relative to whom the pion is at rest and the laboratory moves with velocity −v.

According to Newton, time is the same for all observers. Newton's laws are based on this assumption.
The pion experiment clearly shows that time is not the same for all observers, which indicates the need for a new theory that relates time intervals measured by different observers who are in motion with respect to each other.

The Failure of the Classical Concept of Space

The pion experiment also leads to a failure of the classical ideas about space. Suppose observer #1 erects two markers in the laboratory, one where the pion is created and another where it decays. The distance $D_1$ between the two markers is equal to the speed of the pion multiplied by the time interval from its creation to its decay: $D_1 = (2.737\times10^{8}\ \mathrm{m/s})(63.7\times10^{-9}\ \mathrm{s}) = 17.4$ m. To observer #2, traveling at the same velocity as the pion, the laboratory appears to be rushing by at a speed of $2.737\times10^{8}$ m/s and the time between passing the first and second markers, showing the creation and decay of the pion in the laboratory, is 26.0 ns. According to observer #2, the distance between the markers is $D_2 = (2.737\times10^{8}\ \mathrm{m/s})(26.0\times10^{-9}\ \mathrm{s}) = 7.11$ m. Once again, we have two observers in relative motion measuring different values for the same interval, in this case the distance between the two markers in the laboratory. The physical theories of Galileo and Newton are based on the assumption that space is the same for all observers, and so length measurements should not depend on relative motion. The pion experiment again shows that this cornerstone of classical physics is not consistent with modern experimental data.

The Failure of the Classical Concept of Velocity

Classical physics places no limit on the maximum velocity that a particle can reach. One of the basic equations of kinematics, $v = v_0 + at$, shows that if a particle experiences an acceleration a for a long enough time t, velocities as large as desired can be achieved, perhaps even exceeding the speed of light.
For another example, when an aircraft flying at a speed of 200 m/s relative to an observer on the ground launches a missile at a speed of 250 m/s relative to the aircraft, a ground-based observer would measure the missile to travel at a speed of 200 m/s + 250 m/s = 450 m/s, according to the classical velocity addition rule (Eq. 1.7). We can apply that same reasoning to a spaceship moving at a speed of $2.0\times10^{8}$ m/s (relative to an observer on a space station), which fires a missile at a speed of $2.5\times10^{8}$ m/s relative to the spacecraft. We would expect that the observer on the space station would measure a speed of $4.5\times10^{8}$ m/s for the missile. This speed exceeds the speed of light ($3.0\times10^{8}$ m/s). Allowing speeds greater than the speed of light leads to a number of conceptual and logical difficulties, such as the reversal of the normal order of cause and effect for some observers.

Here again modern experimental results disagree with the classical ideas. Let's go back again to our experiment with the pion, which is moving through the laboratory at a speed of $2.737\times10^{8}$ m/s. The pion decays into another particle, called a muon, which is emitted in the forward direction (the direction of the pion's velocity) with a speed of $0.813\times10^{8}$ m/s relative to the pion. According to Eq. 1.7, an observer in the laboratory should observe the muon to be moving with a velocity of $2.737\times10^{8}\ \mathrm{m/s} + 0.813\times10^{8}\ \mathrm{m/s} = 3.550\times10^{8}$ m/s, exceeding the speed of light. The observed velocity of the muon, however, is $2.846\times10^{8}$ m/s, below the speed of light. Clearly the classical rule for velocity addition fails in this experiment.

The properties of time and space and the rules for combining velocities are essential concepts of the classical physics of Newton. These concepts are derived from observations at low speeds, which were the only speeds available to Newton and his contemporaries.
In Chapter 2, we shall discover how the special theory of relativity provides the correct procedure for comparing measurements of time, distance, and velocity by different observers and thereby removes the failures of classical physics at high speed (while reducing to the classical laws at low speed, where we know the Newtonian framework works very well).

1.3 THE FAILURE OF THE CLASSICAL THEORY OF PARTICLE STATISTICS

Thermodynamics and statistical mechanics were among the great triumphs of 19th-century physics. Describing the behavior of complex systems of many particles was shown to be possible using a small number of aggregate or average properties, for example temperature, pressure, and heat capacity. Perhaps the crowning achievement in this field was the development of relationships between macroscopic properties, such as temperature, and microscopic properties, such as the molecular kinetic energy.

Despite these great successes, this statistical approach to understanding the behavior of gases and solids also showed a spectacular failure. Although the classical theory gave the correct heat capacities of gases at high temperatures, it failed miserably for many gases at low temperatures. In this section we summarize the classical theory and explain how it fails at low temperatures. This failure directly shows the inadequacy of classical physics and the need for an approach based on quantum theory, the second of the great theories of modern physics.

The Distribution of Molecular Energies

In addition to the average kinetic energy, it is also important to analyze the distribution of kinetic energies, that is, what fraction of the molecules in the container has kinetic energies between any two values $K_1$ and $K_2$.
For a gas in thermal equilibrium at absolute temperature T (in kelvins), the distribution of molecular energies is given by the Maxwell-Boltzmann distribution:

$$N(E) = \frac{2N}{\sqrt{\pi}}\frac{1}{(kT)^{3/2}}E^{1/2}e^{-E/kT} \tag{1.22}$$

In this equation, N is the total number of molecules (a pure number) while N(E) is the distribution function (with units of energy$^{-1}$) defined so that N(E) dE is the number of molecules dN in the energy interval dE at E (or, in other words, the number of molecules with energies between E and E + dE):

$$dN = N(E)\,dE \tag{1.23}$$

The distribution N(E) is shown in Figure 1.9. The number dN is represented by the area of the narrow strip between E and E + dE. If we divide the entire horizontal axis into an infinite number of such small intervals and add the areas of all the resulting narrow strips, we obtain the total number of molecules in the gas:

$$\int_0^\infty dN = \int_0^\infty N(E)\,dE = \frac{2N}{\sqrt{\pi}}\frac{1}{(kT)^{3/2}}\int_0^\infty E^{1/2}e^{-E/kT}\,dE = N \tag{1.24}$$

FIGURE 1.9 The Maxwell-Boltzmann energy distribution function, shown for one mole of gas at room temperature (300 K). (Axes: N(E) (× 10–25) versus energy in eV. The plot marks the most probable energy $\tfrac{1}{2}kT$, the average energy $\tfrac{3}{2}kT$, a narrow strip dN of width dE, and the shaded area $N(E_1 : E_2)$ between $E_1$ and $E_2$.)

The final step in this calculation involves the definite integral $\int_0^\infty x^{1/2}e^{-x}\,dx$, which you can find in tables of integrals. Also using calculus techniques (see Problem 8), you can show that the peak of the distribution function (the most probable energy) is $\tfrac{1}{2}kT$.

The average energy in this distribution of molecules can also be found by dividing the distribution into strips. To find the contribution of each strip to the energy of the gas, we multiply the number of molecules in each strip, dN = N(E) dE, by the energy E of the molecules in that strip, and then we add the contributions of all the strips by integrating over all energies.
This calculation would give the total energy of the gas; to find the average we divide by the total number of molecules N:

$$E_\mathrm{av} = \frac{1}{N}\int_0^\infty E\,N(E)\,dE = \frac{2}{\sqrt{\pi}}\frac{1}{(kT)^{3/2}}\int_0^\infty E^{3/2}e^{-E/kT}\,dE \tag{1.25}$$

Once again, the definite integral can be found in integral tables. The result of carrying out the integration is

$$E_\mathrm{av} = \tfrac{3}{2}kT \tag{1.26}$$

Equation 1.26 gives the average energy of a molecule in the gas and agrees precisely with the result given by Eq. 1.20 for the ideal gas in which kinetic energy is the only kind of energy the gas can have.

Occasionally we are interested in finding the number of molecules in our distribution with energies between any two values $E_1$ and $E_2$. If the interval between $E_1$ and $E_2$ is very small, Eq. 1.23 can be used, with $dE = E_2 - E_1$ and with N(E) evaluated at the midpoint of the interval. This approximation works very well when the interval is small enough that N(E) is either approximately flat or linear over the interval. If the interval is large enough that this approximation is not valid, then it is necessary to integrate to find the number of molecules in the interval:

$$N(E_1 : E_2) = \int_{E_1}^{E_2} N(E)\,dE = \frac{2N}{\sqrt{\pi}}\frac{1}{(kT)^{3/2}}\int_{E_1}^{E_2} E^{1/2}e^{-E/kT}\,dE \tag{1.27}$$

This number is represented by the shaded area in Figure 1.9. This integral cannot be evaluated directly and must be found numerically.

Example 1.3

(a) In one mole of a gas at a temperature of 650 K ($kT = 8.97\times10^{-21}$ J = 0.0560 eV), calculate the number of molecules with energies between 0.0105 eV and 0.0135 eV. (b) In this gas, calculate the fraction of the molecules with energies in the range of ±2.5% of the most probable energy ($\tfrac{1}{2}kT$).

Solution

(a) Figure 1.10a shows the distribution N(E) in the region between $E_1 = 0.0105$ eV and $E_2 = 0.0135$ eV. Because the graph is very close to linear in this region, we can use Eq. 1.23 to find the number of molecules in this range.
We take dE to be the width of the range, $dE = E_2 - E_1 = 0.0135\ \mathrm{eV} - 0.0105\ \mathrm{eV} = 0.0030$ eV, and for E we use the energy at the midpoint of the range (0.0120 eV):

$$dN = N(E)\,dE = \frac{2N}{\sqrt{\pi}}\frac{1}{(kT)^{3/2}}E^{1/2}e^{-E/kT}\,dE$$

$$= 2(6.022\times10^{23})(0.0120\ \mathrm{eV})^{1/2}\pi^{-1/2}(0.0560\ \mathrm{eV})^{-3/2}\,e^{-(0.0120\ \mathrm{eV})/(0.0560\ \mathrm{eV})}(0.0030\ \mathrm{eV}) = 1.36\times10^{22}$$

(b) Figure 1.10b shows the distribution in this region. To find the fraction of the molecules in this energy range, we want dN/N. The most probable energy is $\tfrac{1}{2}kT$ or 0.0280 eV, and ±2.5% of this value corresponds to ±0.0007 eV or a range from 0.0273 eV to 0.0287 eV. The fraction is

$$\frac{dN}{N} = \frac{N(E)\,dE}{N} = \frac{2}{\sqrt{\pi}}\frac{1}{(kT)^{3/2}}E^{1/2}e^{-E/kT}\,dE$$

$$= 2(0.0280\ \mathrm{eV})^{1/2}\pi^{-1/2}(0.0560\ \mathrm{eV})^{-3/2}\,e^{-(0.0280\ \mathrm{eV})/(0.0560\ \mathrm{eV})}(0.0014\ \mathrm{eV}) = 0.0121$$

FIGURE 1.10 Example 1.3. (Panels (a) and (b) plot N(E) (× 10–25) versus energy in eV over the ranges used in parts (a) and (b).)

Note from these examples how we use a distribution function. We do not use Eq. 1.22 to calculate the number of molecules at a particular energy. In this way N(E) differs from many of the functions you have encountered previously in your study of physics and mathematics. We always use the distribution function to calculate how many events occur in a certain interval of values rather than at an exact particular value. There are two reasons for this: (1) Asking the question in the form of how many molecules have a certain value of the energy implies that the energy is known exactly (to an infinite number of decimal places), and there is zero probability to find a molecule with that exact value of the energy. (2) Any measurement apparatus accepts a finite range of energies (or speeds) rather than a single exact value, and thus asking about intervals is a better representation of what can be measured in the laboratory.
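The two ways of counting molecules in an interval, the narrow-strip estimate of Eq. 1.23 and the integral of Eq. 1.27, can be compared directly. The sketch below is one possible numerical approach, not the method used in the text: composite Simpson's rule is an assumed choice of integrator, and the function names are illustrative. The inputs are those of Example 1.3.

```python
import math

N_A = 6.022e23  # molecules in one mole
KT = 0.0560     # eV, the value of kT at 650 K used in Example 1.3

def n_of_e(energy, n_total=N_A, kt=KT):
    """Maxwell-Boltzmann distribution N(E) of Eq. 1.22, in molecules per eV."""
    prefactor = 2 * n_total / math.sqrt(math.pi) * kt**-1.5
    return prefactor * math.sqrt(energy) * math.exp(-energy / kt)

def strip_estimate(e1, e2):
    """Eq. 1.23 with N(E) evaluated at the midpoint of the interval."""
    return n_of_e(0.5 * (e1 + e2)) * (e2 - e1)

def simpson_integral(e1, e2, steps=200):
    """Eq. 1.27 evaluated with composite Simpson's rule (steps must be even)."""
    h = (e2 - e1) / steps
    total = n_of_e(e1) + n_of_e(e2)
    for i in range(1, steps):
        # Interior points alternate weights 4 (odd index) and 2 (even index).
        total += (4 if i % 2 else 2) * n_of_e(e1 + i * h)
    return total * h / 3

part_a = strip_estimate(0.0105, 0.0135)             # number of molecules
part_a_integrated = simpson_integral(0.0105, 0.0135)
part_b = strip_estimate(0.0273, 0.0287) / N_A       # fraction of molecules

print(f"(a) strip estimate: {part_a:.3e}, integral: {part_a_integrated:.3e}")
print(f"(b) fraction near the most probable energy: {part_b:.4f}")
```

For intervals this narrow the two results differ only slightly, which is why the strip estimate is adequate in the example; for wide intervals the integral is the safer choice.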
Note that N(E) has dimensions of energy$^{-1}$: it gives the number of molecules per unit energy interval (for example, number of molecules per eV). To get an actual number that can be compared with measurement, N(E) must be multiplied by an energy interval. In our study of modern physics, we will encounter many different types of distribution functions whose use and interpretation are similar to that of N(E). These functions generally give a number or a probability per some sort of unit interval (for example, probability per unit volume), and to use the distribution function to calculate an outcome we must always multiply by an appropriate interval (for example, an element of volume). Sometimes we will be able to deal with small intervals using a relationship similar to Eq. 1.23, as we did in Example 1.3, but in other cases we will find the need to evaluate an integral, as we did in Eq. 1.27.

Polyatomic Molecules and the Equipartition of Energy

So far we have been considering gases with only one atom per molecule (monatomic gases). For "point" molecules with no internal structure, only one form of energy is important: translational kinetic energy $\tfrac{1}{2}mv^2$. (We call this "translational" kinetic energy because it describes motion as the gas particles move from one location to another. Soon we will also consider rotational kinetic energy.)

Let's rewrite Eq. 1.26 in a more instructive form by recognizing that, with translational kinetic energy as the only form of energy, $E = K = \tfrac{1}{2}mv^2$. With $v^2 = v_x^2 + v_y^2 + v_z^2$, we can write the energy as

$$E = \tfrac{1}{2}mv_x^2 + \tfrac{1}{2}mv_y^2 + \tfrac{1}{2}mv_z^2 \tag{1.28}$$

The average energy is then

$$\tfrac{1}{2}m(v_x^2)_\mathrm{av} + \tfrac{1}{2}m(v_y^2)_\mathrm{av} + \tfrac{1}{2}m(v_z^2)_\mathrm{av} = \tfrac{3}{2}kT \tag{1.29}$$

For a gas molecule there is no difference between the x, y, and z directions, so the three terms on the left are equal and each term is equal to $\tfrac{1}{2}kT$.
The three terms on the left represent three independent contributions to the energy of the molecule: the motion in the x direction, for example, is not affected by the y or z motions. We define a degree of freedom of the gas as each independent contribution to the energy of a molecule, corresponding to one quadratic term in the expression for the energy. There are three quadratic terms in Eq. 1.28, so in this case there are three degrees of freedom. As you can see from Eq. 1.29, each of the three degrees of freedom of a gas molecule contributes an energy of $\tfrac{1}{2}kT$ to its average energy. The relationship we have obtained in this special case is an example of the application of a general theorem, called the equipartition of energy theorem:

When the number of particles in a system is large and Newtonian mechanics is obeyed, each molecular degree of freedom corresponds to an average energy of $\tfrac{1}{2}kT$.

The average energy per molecule is then the number of degrees of freedom times $\tfrac{1}{2}kT$, and the total energy is obtained by multiplying the average energy per molecule by the number of molecules N: $E_{\rm total} = NE_{\rm av}$. We will refer to this total energy as the internal energy $E_{\rm int}$ to indicate that it represents the random motions of the gas molecules (in contrast, for example, to the energy involved with the motion of the entire container of gas molecules).

$E_{\rm int} = N(\tfrac{3}{2}kT) = \tfrac{3}{2}NkT = \tfrac{3}{2}nRT$ (translation only) (1.30)

where Eq. 1.19 has been used to express Eq. 1.30 in terms of either the number of molecules or the number of moles.

The situation is different for a diatomic gas (two atoms per molecule), illustrated in Figure 1.11. There are still three degrees of freedom associated with the translational motion of the molecule, but now two additional forms of energy are permitted: rotational and vibrational. First we consider the rotational motion. The molecule shown in Figure 1.11 can rotate about the x′ and y′ axes (but not about the z′ axis, because the rotational inertia about that axis is zero for diatomic molecules in which the atoms are treated as points). Using the general form of $\tfrac{1}{2}I\omega^2$ for the rotational kinetic energy, we can write the energy of the molecule as

$E = \tfrac{1}{2}mv_x^2 + \tfrac{1}{2}mv_y^2 + \tfrac{1}{2}mv_z^2 + \tfrac{1}{2}I_{x'}\omega_{x'}^2 + \tfrac{1}{2}I_{y'}\omega_{y'}^2$ (1.31)

Here we have 5 quadratic terms in the energy, and thus 5 degrees of freedom. According to the equipartition theorem, the average total energy per molecule is $5 \times \tfrac{1}{2}kT = \tfrac{5}{2}kT$, and the total internal energy of n moles of the gas is

$E_{\rm int} = \tfrac{5}{2}nRT$ (translation + rotation) (1.32)

FIGURE 1.11 A diatomic molecule, with the origin at the center of mass. Rotations can occur about the x′ and y′ axes, and vibrations can occur along the z′ axis.

If the molecule can also vibrate, we can imagine the rigid rod connecting the atoms in Figure 1.11 to be replaced by a spring. The two atoms can then vibrate in opposite directions along the z′ axis, with the center of mass of the molecule remaining fixed. The vibrational motion adds two quadratic terms to the energy, corresponding to the vibrational potential energy ($\tfrac{1}{2}kz'^2$) and the vibrational kinetic energy ($\tfrac{1}{2}mv_{z'}^2$). Including the vibrational motion, there are now 7 degrees of freedom, so that

$E_{\rm int} = \tfrac{7}{2}nRT$ (translation + rotation + vibration) (1.33)

Heat Capacities of an Ideal Gas

Now we examine where the classical molecular distribution theory, which gives a very good accounting of molecular behavior under most circumstances, fails to agree with one particular class of experiments. Suppose we have a container of gas with a fixed volume. We transfer energy to the gas, perhaps by placing the container in contact with a system at a higher temperature. All of this transferred energy increases the internal energy of the gas by an amount $\Delta E_{\rm int}$, and there is an accompanying increase in temperature $\Delta T$. We define the molar heat capacity for this constant-volume process as

$C_V = \dfrac{\Delta E_{\rm int}}{n\,\Delta T}$ (1.34)

(The subscript V reminds us that we are doing this measurement at constant volume.) From Eqs. 1.30, 1.32, and 1.33, we see that the molar heat capacity depends on the type of gas:

$C_V = \tfrac{3}{2}R$ (monatomic or nonrotating, nonvibrating diatomic ideal gas)
$C_V = \tfrac{5}{2}R$ (rotating diatomic ideal gas)
$C_V = \tfrac{7}{2}R$ (rotating and vibrating diatomic ideal gas) (1.35)

When we add energy to the gas, the equipartition theorem tells us that the added energy will on the average be distributed uniformly among all the possible forms of energy (corresponding to the number of degrees of freedom). However, only the translational kinetic energy contributes to the temperature (as shown by Eqs. 1.20 and 1.21). Thus, if we add 7 units of energy to a diatomic gas with rotating and vibrating molecules, on the average only 3 units go into translational kinetic energy and so 3/7 of the added energy goes into increasing the temperature. (To measure the temperature rise, the gas molecules must collide with the thermometer, so energy in the rotational and vibrational motions is not recorded by the thermometer.) Put another way, to obtain the same temperature increase $\Delta T$, a mole of diatomic gas requires 7/3 times the energy that is needed for a mole of monatomic gas.

Comparison with Experiment

How well do these heat capacity values agree with experiment? For monatomic gases, the agreement is very good.
The equipartition theorem predicts a value of $C_V = 3R/2 = 12.5$ J/mol·K, which should be the same for all monatomic gases and the same at all temperatures (as long as the conditions of the ideal gas model are fulfilled). The heat capacity of He gas is 12.5 J/mol·K at 100 K, 300 K (room temperature), and 1000 K, so in this case our calculation is in perfect agreement with experiment. Other inert gases (Ne, Ar, Xe, etc.) have identical values, as do vapors of metals (Cu, Na, Pb, Bi, etc.) and the monatomic (dissociated) state of elements that normally form diatomic molecules (H, N, O, Cl, Br, etc.). So over a wide variety of different elements and a wide range of temperatures, classical statistical mechanics is in excellent agreement with experiment.

The situation is much less satisfactory for diatomic molecules. For a rotating and vibrating diatomic molecule, the classical calculation gives $C_V = 7R/2 = 29.1$ J/mol·K. Table 1.1 shows some values of the heat capacities for different diatomic gases over a range of temperatures.

TABLE 1.1 Heat Capacities of Diatomic Gases

             C_V (J/mol·K)
Element    100 K    300 K    1000 K
H2         18.7     20.5     21.9
N2         20.8     20.8     24.4
O2         20.8     21.1     26.5
F2         20.8     23.0     28.8
Cl2        21.0     25.6     29.1
Br2        22.6     27.8     29.5
I2         24.8     28.6     29.7
Sb2                 28.1     29.0
Te2                 28.2     29.0
Bi2                 28.6     29.1

At high temperatures, many of the diatomic gases do indeed approach the expected value of 7R/2, but at lower temperatures the values are much smaller. For example, fluorine seems to behave as if it has 5 degrees of freedom ($C_V = 20.8$ J/mol·K) at 100 K and 7 degrees of freedom ($C_V = 29.1$ J/mol·K) at 1000 K. Hydrogen behaves as if it has 5 degrees of freedom at room temperature, but at high enough temperature (3000 K), the heat capacity of H2 approaches 29.1 J/mol·K, corresponding to 7 degrees of freedom, while at lower temperatures (40 K) the heat capacity is 12.5 J/mol·K, corresponding to 3 degrees of freedom.
The temperature dependence of the heat capacity of H2 is shown in Figure 1.12. There are three plateaus in the graph, corresponding to heat capacities for 3, 5, and 7 degrees of freedom. At the lowest temperatures, the rotational and vibrational motions are “frozen” and do not contribute to the heat capacity. At about 100 K, the molecules have enough energy to allow rotational motion to occur, and by about 300 K the heat capacity is characteristic of 5 degrees of freedom. Starting about 1000 K, the vibrational motion can occur, and by about 3000 K there are enough molecules above the vibrational threshold to allow 7 degrees of freedom. What’s going on here? The classical calculation demands that CV should be constant, independent of the type of gas or the temperature. The equipartition of energy theorem, which is very successful in predicting many thermodynamic properties, fails miserably in accounting for the heat capacities. This theorem requires that the energy added to a gas must on average be divided equally among all the different forms of energy, and classical physics does not permit a threshold energy for any particular type of motion. How is it possible for 2 degrees of freedom, corresponding to the rotational or vibrational motions, to be “turned on” as the temperature is increased? The solution to this dilemma can be found in quantum mechanics, according to which there is indeed a minimum or threshold energy for the rotational and vibrational motions. We discuss this behavior in Chapters 5 and 9. In Chapter 11 we discuss the failure of the equipartition theorem to account for the heat capacities of solids and the corresponding need to replace the classical Maxwell-Boltzmann energy distribution function with a different distribution that is consistent with quantum mechanics. 
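The comparison between the equipartition prediction and the measured values can be checked numerically. The sketch below is illustrative only: it assumes the standard value R = 8.314 J/mol·K, uses the relation $C_V = (f/2)R$ for f degrees of freedom, and inverts it to find the "effective" number of degrees of freedom implied by the H2 entries of Table 1.1. The function names are hypothetical and chosen for this example.

```python
# Equipartition prediction C_V = (f/2) R compared with the H2 entries of Table 1.1.
R = 8.314  # molar gas constant in J/(mol K); assumed standard value


def molar_heat_capacity(dof):
    """Classical constant-volume molar heat capacity for `dof` degrees of freedom."""
    return 0.5 * dof * R


def effective_dof(c_v):
    """Invert C_V = (f/2) R: degrees of freedom implied by a measured C_V."""
    return 2.0 * c_v / R


# Classical predictions for 3, 5, and 7 degrees of freedom.
predicted = {f: molar_heat_capacity(f) for f in (3, 5, 7)}

# Measured C_V of H2 in J/(mol K), copied from Table 1.1 (keys are temperatures in K).
h2_measured = {100: 18.7, 300: 20.5, 1000: 21.9}
h2_dof = {T: effective_dof(c_v) for T, c_v in h2_measured.items()}

for f, c_v in predicted.items():
    print(f"f = {f}: C_V = {c_v:.1f} J/(mol K)")
for T, f_eff in h2_dof.items():
    print(f"H2 at {T} K behaves as if f = {f_eff:.2f}")
```

The predicted values round to 12.5, 20.8, and 29.1 J/mol·K, matching the plateaus quoted in the text. The effective number of degrees of freedom for H2 comes out noninteger and temperature dependent, which is exactly what the classical theory cannot accommodate.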
FIGURE 1.12 The heat capacity of molecular hydrogen at different temperatures. The data points disagree with the classical prediction. [Plot of $C_V$ (J/mol·K) versus temperature (K) from 10 K to 5000 K, with plateaus labeled Translation ($\tfrac{3}{2}R$ = 12.5), Rotation ($\tfrac{5}{2}R$ = 20.8), and Vibration ($\tfrac{7}{2}R$ = 29.1); the classical prediction is the constant value 29.1.]

1.4 THEORY, EXPERIMENT, LAW

When you first began to study science, perhaps in your elementary or high school years, you may have learned about the “scientific method,” which was supposed to be a sort of procedure by which scientific progress was achieved. The basic idea of the “scientific method” was that, on reflecting over some particular aspect of nature, the scientist would invent a hypothesis or theory, which would then be tested by experiment and if successful would be elevated to the status of law. This procedure is meant to emphasize the importance of doing experiments as a way of testing hypotheses and rejecting those that do not pass the tests. For example, the ancient Greeks had some rather definite ideas about the motion of objects, such as projectiles, in the Earth’s gravity. Yet they tested none of these by experiment, so convinced were they that the power of logical deduction alone could be used to discover the hidden and mysterious laws of nature and that once logic had been applied to understanding a problem, no experiments were necessary. If theory and experiment were to disagree, they would argue, then there must be something wrong with the experiment! This dominance of analysis and faith was so pervasive that it was another 2000 years before Galileo, using an inclined plane and a crude timer (equipment surely within the abilities of the early Greeks to construct), discovered the laws of motion, which were later organized and analyzed by Newton.

In the case of modern physics, none of the fundamental concepts is obvious from reason alone.
Only by doing often difficult and necessarily precise experiments do we learn about these unexpected and fascinating effects associated with such modern physics topics as relativity and quantum physics. These experiments have been done to unprecedented levels of precision (of the order of one part in $10^6$ or better), and it can certainly be concluded that modern physics was tested far better in the 20th century than classical physics was tested in all of the preceding centuries.

Nevertheless, there is a persistent and often perplexing problem associated with modern physics, one that stems directly from your previous acquaintance with the “scientific method.” This concerns the use of the word “theory,” as in “theory of relativity” or “quantum theory,” or even “atomic theory” or “theory of evolution.” There are two contrasting and conflicting definitions of the word “theory” in the dictionary:

1. A hypothesis or guess.
2. An organized body of facts or explanations.

The “scientific method” refers to the first kind of “theory,” while when we speak of the “theory of relativity” we refer to the second kind. Yet there is often confusion between the two definitions, and therefore relativity and quantum physics are sometimes incorrectly regarded as mere hypotheses, on which evidence is still being gathered, in the hope of someday submitting that evidence to some sort of international (or intergalactic) tribunal, which in turn might elevate the “theory” into a “law.” Thus the “theory of relativity” might someday become the “law of relativity,” like the “law of gravity.” Nothing could be further from the truth!
The theory of relativity and the quantum theory, like the atomic theory or the theory of evolution, are truly “organized bodies of facts and explanations” and not “hypotheses.” There is no question of these “theories” becoming “laws”: the “facts” (experiments, observations) of relativity and quantum physics, like those of atomism or evolution, are accepted by virtually all scientists today. The experimental evidence for all of these processes is so compelling that no one who approaches them in the spirit of free and open inquiry can doubt the observational evidence or their inferences. Whether these collections of evidence are called theories or laws is merely a question of semantics and has nothing to do with their scientific merits. Like all scientific principles, they will continue to develop and change as new discoveries are made; that is the essence of scientific progress.

Chapter Summary

Classical kinetic energy: $K = \tfrac{1}{2}mv^2 = \dfrac{p^2}{2m}$ (Section 1.1)
Classical linear momentum: $\vec{p} = m\vec{v}$ (Section 1.1)
Classical angular momentum: $\vec{L} = \vec{r} \times \vec{p}$ (Section 1.1)
Classical conservation laws: In an isolated system, the energy, linear momentum, and angular momentum remain constant. (Section 1.1)
Electric force and potential energy of two interacting charges: $F = \dfrac{1}{4\pi\varepsilon_0}\dfrac{|q_1||q_2|}{r^2}$, $U = \dfrac{1}{4\pi\varepsilon_0}\dfrac{q_1 q_2}{r}$ (Section 1.1)
Relationship between electric potential energy and potential: $U = qV$ (Section 1.1)
Magnetic field of a current loop: $B = \dfrac{\mu_0 i}{2r}$ (Section 1.1)
Potential energy of magnetic dipole: $U = -\vec{\mu} \cdot \vec{B}_{\rm ext}$ (Section 1.1)
Average kinetic energy in a gas: $K_{\rm av} = \tfrac{3}{2}kT$ (per molecule) $= \tfrac{3}{2}RT$ (per mole) (Section 1.1)
Maxwell-Boltzmann distribution: $N(E) = \dfrac{2N}{\sqrt{\pi}}\dfrac{1}{(kT)^{3/2}}E^{1/2}e^{-E/kT}$ (Section 1.3)
Equipartition of energy: Energy per degree of freedom $= \tfrac{1}{2}kT$ (Section 1.3)

Questions

1. Under what conditions can you apply the law of conservation of energy? Conservation of linear momentum? Conservation of angular momentum?
2. Which of the conserved quantities are scalars and which are vectors? Is there a difference in how we apply conservation laws for scalar and vector quantities?
3. What other conserved quantities (besides energy, linear momentum, and angular momentum) can you name?
4. What is the difference between potential and potential energy? Do they have different dimensions? Different units?
5. In Section 1.1 we defined the electric force between two charges and the magnetic field of a current. Use these quantities to define the electric field of a single charge and the magnetic force on a moving electric charge.
6. Other than from the ranges of wavelengths shown in Figure 1.7, can you think of a way to distinguish radio waves from infrared waves? Visible from infrared? That is, could you design a radio that could be tuned to infrared waves? Could living beings “see” in the infrared region?
7. Suppose we have a mixture of an equal number N of molecules of two different gases, whose molecular masses are $m_1$ and $m_2$, in complete thermal equilibrium at temperature T. How do the distributions of molecular energies of the two gases compare? How do their average kinetic energies per molecule compare?
8. In most gases (as in the case of hydrogen) the rotational motion begins to occur at a temperature well below the temperature at which vibrational motion occurs. What does this tell us about the properties of the gas molecules?
9. Suppose it were possible for a pitcher to throw a baseball faster than the speed of light. Describe how the flight of the ball from the pitcher’s hand to the catcher’s glove would look to the umpire standing behind the catcher.
10. At low temperatures the molar heat capacity of carbon dioxide (CO2) is about 5R/2, and it rises to about 7R/2 at room temperature. However, unlike the gases discussed in Section 1.3, the heat capacity of CO2 continues to rise as the temperature increases, reaching 11R/2 at 1000 K. How can you explain this behavior?
11. If we double the temperature of a gas, is the number of molecules in a narrow interval dE around the most probable energy about the same, double, or half what it was at the original temperature?

Problems

1.1 Review of Classical Physics

1. A hydrogen atom ($m = 1.674 \times 10^{-27}$ kg) is moving with a velocity of $1.1250 \times 10^{7}$ m/s. It collides elastically with a helium atom ($m = 6.646 \times 10^{-27}$ kg) at rest. After the collision, the hydrogen atom is found to be moving with a velocity of $-6.724 \times 10^{6}$ m/s (in a direction opposite to its original motion). Find the velocity of the helium atom after the collision in two different ways: (a) by applying conservation of momentum; (b) by applying conservation of energy.
2. A helium atom ($m = 6.6465 \times 10^{-27}$ kg) collides elastically with an oxygen atom ($m = 2.6560 \times 10^{-26}$ kg) at rest. After the collision, the helium atom is observed to be moving with a velocity of $6.636 \times 10^{6}$ m/s in a direction at an angle of 84.7° relative to its original direction. The oxygen atom is observed to move at an angle of −40.4°. (a) Find the speed of the oxygen atom. (b) Find the speed of the helium atom before the collision.
3. A beam of helium-3 atoms (m = 3.016 u) is incident on a target of nitrogen-14 atoms (m = 14.003 u) at rest. During the collision, a proton from the helium-3 nucleus passes to the nitrogen nucleus, so that following the collision there are two atoms: an atom of “heavy hydrogen” (deuterium, m = 2.014 u) and an atom of oxygen-15 (m = 15.003 u). The incident helium atoms are moving at a velocity of $6.346 \times 10^{6}$ m/s. After the collision, the deuterium atoms are observed to be moving forward (in the same direction as the initial helium atoms) with a velocity of $1.531 \times 10^{7}$ m/s. (a) What is the final velocity of the oxygen-15 atoms? (b) Compare the total kinetic energies before and after the collision.
4. An atom of beryllium (m = 8.00 u) splits into two atoms of helium (m = 4.00 u) with the release of 92.2 keV of energy. If the original beryllium atom is at rest, find the kinetic energies and speeds of the two helium atoms.
5. A 4.15-volt battery is connected across a parallel-plate capacitor. Illuminating the plates with ultraviolet light causes electrons to be emitted from the plates with a speed of $1.76 \times 10^{6}$ m/s. (a) Suppose electrons are emitted near the center of the negative plate and travel perpendicular to that plate toward the opposite plate. Find the speed of the electrons when they reach the positive plate. (b) Suppose instead that electrons are emitted perpendicular to the positive plate. Find their speed when they reach the negative plate.

1.2 The Failure of Classical Concepts of Space and Time

6. Observer A, who is at rest in the laboratory, is studying a particle that is moving through the laboratory at a speed of 0.624c and determines its lifetime to be 159 ns. (a) Observer A places markers in the laboratory at the locations where the particle is produced and where it decays. How far apart are those markers in the laboratory? (b) Observer B, who is traveling parallel to the particle at a speed of 0.624c, observes the particle to be at rest and measures its lifetime to be 124 ns. According to B, how far apart are the two markers in the laboratory?

1.3 The Failure of the Classical Theory of Particle Statistics

7. A sample of argon gas is in a container at 35.0°C and 1.22 atm pressure. The radius of an argon atom (assumed spherical) is $0.710 \times 10^{-10}$ m. Calculate the fraction of the container volume actually occupied by the atoms.
8. By differentiating the expression for the Maxwell-Boltzmann energy distribution, show that the peak of the distribution occurs at an energy of $\tfrac{1}{2}kT$.
9. A container holds N molecules of nitrogen gas at T = 280 K. Find the number of molecules with kinetic energies between 0.0300 eV and 0.0312 eV.
10. A sample of 2.37 moles of an ideal diatomic gas experiences a temperature increase of 65.2 K at constant volume. (a) Find the increase in internal energy if only translational and rotational motions are possible. (b) Find the increase in internal energy if translational, rotational, and vibrational motions are possible. (c) How much of the energy calculated in (a) and (b) is translational kinetic energy?

General Problems

11. An atom of mass $m_1 = m$ moving in the x direction with speed $v_1 = v$ collides elastically with an atom of mass $m_2 = 3m$ at rest. After the collision the first atom moves in the y direction. Find the direction of motion of the second atom and the speeds of both atoms (in terms of v) after the collision.
12. An atom of mass $m_1 = m$ moves in the positive x direction with speed $v_1 = v$. It collides with and sticks to an atom of mass $m_2 = 2m$ moving in the positive y direction with speed $v_2 = 2v/3$. Find the resultant speed and direction of motion of the combination, and find the kinetic energy lost in this inelastic collision.
13. Suppose the beryllium atom of Problem 4 were not at rest, but instead moved in the positive x direction and had a kinetic energy of 40.0 keV. One of the helium atoms is found to be moving in the positive x direction. Find the direction of motion of the second helium, and find the velocity of each of the two helium atoms. Solve this problem in two different ways: (a) by direct application of conservation of momentum and energy; (b) by applying the results of Problem 4 to a frame of reference moving with the original beryllium atom and then switching to the reference frame in which the beryllium is moving.
14. Suppose the beryllium atom of Problem 4 moves in the positive x direction and has kinetic energy 60.0 keV. One helium atom is found to move at an angle of 30° with respect to the x axis. Find the direction of motion of the second helium atom and find the velocity of each helium atom. Work this problem in two ways as you did the previous problem.
(Hint: Consider one helium to be emitted with velocity components $v_x$ and $v_y$ in the beryllium rest frame. What is the relationship between $v_x$ and $v_y$? How do $v_x$ and $v_y$ change when we move in the x direction at speed v?)
15. A gas cylinder contains argon atoms (m = 40.0 u). The temperature is increased from 293 K (20°C) to 373 K (100°C). (a) What is the change in the average kinetic energy per atom? (b) The container is resting on a table in the Earth’s gravity. Find the change in the vertical position of the container that produces the same change in the average energy per atom found in part (a).
16. Calculate the fraction of the molecules in a gas that are moving with translational kinetic energies between 0.02kT and 0.04kT.
17. For a molecule of O2 at room temperature (300 K), calculate the average angular velocity for rotations about the x′ or y′ axes. The distance between the O atoms in the molecule is 0.121 nm.

Chapter 2

THE SPECIAL THEORY OF RELATIVITY

[Photo caption] This 12-foot tall statue of Albert Einstein is located at the headquarters of the National Academy of Sciences in Washington DC. The page in his hand shows three equations that he discovered: the fundamental equation of general relativity, which revolutionized our understanding of gravity; the equation for the photoelectric effect, which opened the path to the development of quantum mechanics; and the equation for mass-energy equivalence, which is the cornerstone of his special theory of relativity.

Einstein’s special theory of relativity and Planck’s quantum theory burst forth on the physics scene almost simultaneously during the first decade of the 20th century. Both theories caused profound changes in the way we view our universe at its most fundamental level. In this chapter we study the special theory of relativity.* This theory has a completely undeserved reputation as being so exotic that few people can understand it.
On the contrary, special relativity is basically a system of kinematics and dynamics, based on a set of postulates that are different from those of classical physics. The resulting formalism is not much more complicated than Newton’s laws, but it does lead to several predictions that seem to go against our common sense. Even so, the special theory of relativity has been carefully and thoroughly tested by experiment and found to be correct in all its predictions. We first review the classical relativity of Galileo and Newton, and then we show why Einstein proposed to replace it. We then discuss the mathematical aspects of special relativity, the predictions of the theory, and finally the experimental tests.

* The general theory of relativity, which is covered briefly in Chapter 15, deals with “curved” coordinate systems, in which gravity is responsible for the curvature. Here we discuss the special case of the more familiar “flat” coordinate systems.

2.1 CLASSICAL RELATIVITY

A “theory of relativity” is in effect a way for observers in different frames of reference to compare the results of their observations. For example, consider an observer in a car parked by a highway near a large rock. To this observer, the rock is at rest. Another observer, who is moving along the highway in a car, sees the rock rush past as the car drives by. To this observer, the rock appears to be moving. A theory of relativity provides the conceptual framework and mathematical tools that enable the two observers to transform a statement such as “rock is at rest” in one frame of reference to the statement “rock is in motion” in another frame of reference. More generally, relativity gives a means for expressing the laws of physics in different frames of reference.

The mathematical basis for comparing the two descriptions is called a transformation. Figure 2.1 shows an abstract representation of the situation. Two observers O and O′ are each at rest in their own frames of reference but move relative to one another with constant velocity $\vec{u}$. (O and O′ refer both to the observers and their reference frames or coordinate systems.) They observe the same event, which happens at a particular point in space and a particular time, such as a collision between two particles. According to O, the space and time coordinates of the event are x, y, z, t, while according to O′ the coordinates of the same event are x′, y′, z′, t′. The two observers use calibrated meter sticks and synchronized clocks, so any differences between the coordinates of the two events are due to their different frames of reference and not to the measuring process. We simplify the discussion by assuming that the relative velocity $\vec{u}$ always lies along the common xx′ direction, as shown in Figure 2.1, and we let $\vec{u}$ represent the velocity of O′ as measured by O (and thus O′ would measure velocity $-\vec{u}$ for O).

FIGURE 2.1 Two observers O and O′ observe the same event. O′ moves relative to O with a constant velocity $\vec{u}$.

In this discussion we make a particular choice of the kind of reference frames inhabited by O and O′. We assume that each observer has the capacity to test Newton’s laws and finds them to hold in that frame of reference. For example, each observer finds that an object at rest or moving with a constant velocity remains in that state unless acted upon by an external force (Newton’s first law, the law of inertia). Such frames of reference are called inertial frames. An observer in interstellar space floating in a nonrotating rocket with the engines off would be in an inertial frame of reference.
An observer at rest on the surface of the Earth is not in an inertial frame, because the Earth is rotating about its axis and orbiting about the Sun; however, the accelerations associated with those motions are so small that we can usually regard our reference frame as approximately inertial. (The noninertial reference frame at the Earth’s surface does produce important and often spectacular effects, such as the circulation of air around centers of high or low pressure.) An observer in an accelerating car, a rotating merry-go-round, or a descending roller coaster is not in an inertial frame of reference!

We now derive the classical or Galilean transformation that relates the coordinates x, y, z, t to x′, y′, z′, t′. We assume as a postulate of classical physics that t = t′, that is, time is the same for all observers. We also assume for simplicity that the coordinate systems are chosen so that their origins coincide at t = 0. Consider an object in O′ at the coordinates x′, y′, z′ (Figure 2.2). According to O, the y and z coordinates are the same as those in O′. Along the x direction, O would observe the object at x = x′ + ut. We therefore have the Galilean coordinate transformation

$x' = x - ut \qquad y' = y \qquad z' = z$ (2.1)

To find the velocities of the object as observed by O and O′, we take the derivatives of these expressions with respect to t′ on the left and with respect to t on the right (which we can do because we have assumed t′ = t). This gives the Galilean velocity transformation

$v'_x = v_x - u \qquad v'_y = v_y \qquad v'_z = v_z$ (2.2)

In a similar fashion, we can take the derivatives of Eq. 2.2 with respect to time and obtain relationships between the accelerations

$a'_x = a_x \qquad a'_y = a_y \qquad a'_z = a_z$ (2.3)

Equation 2.3 shows again that Newton’s laws hold for both observers. As long as u is constant (du/dt = 0), the observers measure identical accelerations and agree on the results of applying $\vec{F} = m\vec{a}$.

FIGURE 2.2 An object or event at point P is at coordinates x′, y′, z′ with respect to O′. The x coordinate measured by O is x = x′ + ut. The y and z coordinates in O are the same as those in O′.

Example 2.1

Two cars are traveling at constant speed along a road in the same direction. Car A moves at 60 km/h and car B moves at 40 km/h, each measured relative to an observer on the ground (Figure 2.3a). What is the speed of car A relative to car B?

Solution

Let O be the observer on the ground, who observes car A to move at $v_x = 60$ km/h. Assume O′ to be moving with car B at u = 40 km/h. Then

$v'_x = v_x - u = 60\ \text{km/h} - 40\ \text{km/h} = 20\ \text{km/h}$

Figure 2.3b shows the situation as observed by O′.

FIGURE 2.3 Example 2.1. (a) As observed by O at rest on the ground. (b) As observed by O′ in car B.

Example 2.2

An airplane is flying due east relative to still air at a speed of 320 km/h. There is a 65 km/h wind blowing toward the north, as measured by an observer on the ground. What is the velocity of the plane measured by the ground observer?

Solution

Let O be the observer on the ground, and let O′ be an observer who is moving with the wind, for example a balloonist (Figure 2.4). Then u = 65 km/h, and (because our equations are set up with $\vec{u}$ in the xx′ direction) we must choose the xx′ direction to be to the north. In this case we know the velocity with respect to O′; taking the y direction to the east, we have $v'_x = 0$ and $v'_y = 320$ km/h. Using Eq. 2.2 we obtain

$v_x = v'_x + u = 0 + 65\ \text{km/h} = 65\ \text{km/h}$
$v_y = v'_y = 320\ \text{km/h}$

Relative to the ground, the plane flies in a direction determined by $\phi = \tan^{-1}[(65\ \text{km/h})/(320\ \text{km/h})] = 11.5°$, or 11.5° north of east.

FIGURE 2.4 Example 2.2. As observed by O at rest on the ground, the balloon drifts north with the wind, while the plane flies north of east.
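The arithmetic of Examples 2.1 and 2.2 can be reproduced with a few lines of code. The following is a minimal sketch of the Galilean velocity transformation of Eq. 2.2, assuming (as in the text) that the relative velocity u lies along the common xx′ direction. The function names `to_moving_frame` and `to_rest_frame` are hypothetical, introduced only for this illustration.

```python
import math


def to_moving_frame(v, u):
    """Eq. 2.2: velocity components seen by O' given (vx, vy, vz) seen by O.

    u is the speed of O' relative to O along the common x direction.
    """
    vx, vy, vz = v
    return (vx - u, vy, vz)


def to_rest_frame(v_prime, u):
    """Inverse of Eq. 2.2: velocity seen by O given (v'x, v'y, v'z) seen by O'."""
    vx_p, vy_p, vz_p = v_prime
    return (vx_p + u, vy_p, vz_p)


# Example 2.1: car A at 60 km/h seen from car B (u = 40 km/h).
car_a_rel_b = to_moving_frame((60.0, 0.0, 0.0), 40.0)

# Example 2.2: x axis points north (along the wind), y axis points east.
plane_ground = to_rest_frame((0.0, 320.0, 0.0), 65.0)
ground_speed = math.hypot(plane_ground[0], plane_ground[1])
# Angle measured from east (y axis) toward north (x axis), in degrees.
angle_north_of_east = math.degrees(math.atan2(plane_ground[0], plane_ground[1]))

print(car_a_rel_b)
print(plane_ground, ground_speed, angle_north_of_east)
```

The relative speed of the cars comes out as 20 km/h and the plane's heading as about 11.5° north of east, in agreement with the worked solutions. The ground speed, which the example does not quote, follows from the same two components.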
Example 2.3

A swimmer capable of swimming at a speed c in still water is swimming in a stream in which the current is u (which we assume to be less than c). Suppose the swimmer swims upstream a distance L and then returns downstream to the starting point. Find the time necessary to make the round trip, and compare it with the time to swim across the stream a distance L and return.

Solution

Let the frame of reference of O be the ground and the frame of reference of O′ be the water, moving at speed u (Figure 2.5a). The swimmer always moves at speed c relative to the water, and thus $v'_x = -c$ for the upstream swim. (Remember that u always defines the positive x direction.) According to Eq. 2.2, $v'_x = v_x - u$, so $v_x = v'_x + u = u - c$. (As expected, the velocity relative to the ground has magnitude smaller than c; it is also negative, since the swimmer is swimming in the negative x direction, so $|v_x| = c - u$.) Therefore, $t_{\rm up} = L/(c - u)$. For the downstream swim, $v'_x = c$, so $v_x = u + c$, $t_{\rm down} = L/(c + u)$, and the total time is

$t = \dfrac{L}{c+u} + \dfrac{L}{c-u} = \dfrac{L(c-u) + L(c+u)}{c^2 - u^2} = \dfrac{2Lc}{c^2 - u^2} = \dfrac{2L}{c}\,\dfrac{1}{1 - u^2/c^2}$ (2.4)

To swim directly across the stream, the swimmer’s efforts must be directed somewhat upstream to counter the effect of the current (Figure 2.5b). That is, in the frame of reference of O we would like to have $v_x = 0$, which requires $v'_x = -u$ according to Eq. 2.2. Since the speed relative to the water is always c, $\sqrt{v'^2_x + v'^2_y} = c$; thus $v'_y = \sqrt{c^2 - v'^2_x} = \sqrt{c^2 - u^2}$, and the round-trip time is

$t = 2t_{\rm across} = \dfrac{2L}{\sqrt{c^2 - u^2}} = \dfrac{2L}{c}\,\dfrac{1}{\sqrt{1 - u^2/c^2}}$ (2.5)

Notice the difference in form between this result and the result for the upstream-downstream swim, Eq. 2.4.

FIGURE 2.5 Example 2.3. The motion of a swimmer as seen by observer O at rest on the bank of the stream. Observer O′ moves with the stream at speed u. (a) Upstream-downstream swim. (b) Cross-stream swim.
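To make the difference between Eqs. 2.4 and 2.5 concrete, the two round-trip times can be evaluated for sample numbers. This is only an illustrative sketch: the values L = 100, c = 2, and u = 1 (in arbitrary consistent units) are assumptions chosen for the example and do not come from the text, and the function names are hypothetical.

```python
import math


def round_trip_along_stream(L, c, u):
    """Upstream-downstream round trip: t = L/(c - u) + L/(c + u), as in Eq. 2.4."""
    return L / (c - u) + L / (c + u)


def round_trip_across_stream(L, c, u):
    """Cross-stream round trip: t = 2L / sqrt(c^2 - u^2), as in Eq. 2.5."""
    return 2.0 * L / math.sqrt(c * c - u * u)


# Hypothetical sample values (arbitrary consistent units, with u < c).
L, c, u = 100.0, 2.0, 1.0

t_along = round_trip_along_stream(L, c, u)
t_across = round_trip_across_stream(L, c, u)

# Closed forms on the right-hand sides of Eqs. 2.4 and 2.5.
beta_sq = (u / c) ** 2
t_along_closed = (2.0 * L / c) / (1.0 - beta_sq)
t_across_closed = (2.0 * L / c) / math.sqrt(1.0 - beta_sq)

print(t_along, t_across, t_along / t_across)
```

For any u between 0 and c the upstream-downstream trip takes longer than the cross-stream trip, and the ratio of the two times equals $1/\sqrt{1 - u^2/c^2}$. It is this kind of time difference that the light-beam version of the experiment, discussed next, is designed to detect.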
2.2 THE MICHELSON-MORLEY EXPERIMENT

We have seen how Newton's laws remain valid with respect to a Galilean transformation that relates the description of the motion of an object in one reference frame to that in another reference frame. It is then interesting to ask whether the same transformation rules apply to the motion of a light beam. According to the Galilean transformation, a light beam moving relative to observer O′ in the x′ direction at speed c = 299,792,458 m/s would have a speed of c + u relative to O. Direct high-precision measurements of the speed of light beams have become possible in recent years (as we discuss later in this chapter), but in the 19th century it was necessary to devise a more indirect measurement of the speed of light according to different observers in relative motion.

Suppose the swimmer in Example 2.3 is replaced by a light beam. Observer O′ is in a frame of reference in which the speed of light is c, and the frame of reference of observer O′ is in motion relative to observer O. What is the speed of light as measured by observer O? If the Galilean transformation is correct, we should expect to see a difference between the speed of the light beam according to O and O′ and therefore a time difference between the upstream-downstream and cross-stream times, as in Example 2.3.

Albert A. Michelson (1852–1931, United States). He spent 50 years doing increasingly precise experiments with light, for which he became the first U.S. citizen to win the Nobel Prize in physics (1907).

FIGURE 2.6 (Top) Beam diagram of Michelson interferometer. Light from source S is split at A by the half-silvered mirror; one part is reflected by the mirror at B and the other is reflected at C. The beams are then recombined for observation of the interference. (Bottom) Michelson's apparatus.
To improve sensitivity, the beams were reflected to travel each leg of the apparatus eight times, rather than just twice. To reduce vibrations from the surroundings, the interferometer was mounted on a 1.5-m square stone slab floating in a pool of mercury.

Physicists in the 19th century postulated just such a situation: a preferred frame of reference in which the speed of light has the precise value of c, and other frames in relative motion in which the speed of light would differ, according to the Galilean transformation. The preferred frame, like that of observer O′ in Example 2.3, is one that is at rest with respect to the medium in which light propagates at c (like the water of that example).

What is the medium of propagation for light waves? It was inconceivable to physicists of the 19th century that a wave disturbance could propagate without a medium (consider mechanical waves such as sound or seismic waves, for example, which propagate due to mechanical forces in the medium). They postulated the existence of an invisible, massless medium, called the ether, which filled all space, was undetectable by any mechanical means, and existed solely for the propagation of light waves. It seemed reasonable then to obtain evidence for the ether by measuring the velocity of the Earth moving through the ether. This could be done in the geometry of Figure 2.5 by measuring the difference between the upstream-downstream and cross-stream times for a light wave. The calculation based on Galilean relativity would then give the relative velocity $\vec{u}$ between O (in the Earth's frame of reference) and the ether.

The first detailed and precise search for the preferred frame was performed in 1887 by the American physicist Albert A. Michelson and his associate E. W. Morley. Their apparatus consisted of a specially designed Michelson interferometer, illustrated in Figure 2.6. A monochromatic beam of light is split in two; the two beams travel different paths and are then recombined.
Any phase difference between the combining beams causes bright and dark bands or "fringes" to appear, corresponding, respectively, to constructive and destructive interference, as shown in Figure 2.7.

There are two contributions to the phase difference between the beams. The first contribution comes from the path difference AB − AC; one of the beams may travel a longer distance. The second contribution, which would still be present even if the path lengths were equal, comes from the time difference between the upstream-downstream and cross-stream paths (as in Example 2.3) and indicates the motion of the Earth through the ether. Michelson and Morley used a clever method to isolate this second contribution: they rotated the entire apparatus by 90°! The rotation doesn't change the first contribution to the phase difference (because the lengths AB and AC don't change), but the second contribution changes sign, because what was an upstream-downstream path before the rotation becomes a cross-stream path after the rotation. As the apparatus is rotated through 90°, the fringes should change from bright to dark and back again as the phase difference changes. Each change from bright to dark represents a phase change of 180° (a half cycle), which corresponds to a time difference of a half period (about $10^{-15}$ s for visible light). Counting the number of fringe changes thus gives a measure of the time difference between the paths, which in turn gives the relative velocity u. (See Problem 3.)

When Michelson and Morley performed their experiment, there was no observable change in the fringe pattern; they deduced a shift of less than 0.01 fringe, corresponding to a speed of the Earth through the ether of at most 5 km/s. As a last resort, they reasoned that perhaps the orbital motion of the Earth just happened to cancel out the overall motion through the ether.
If this were true, six months later (when the Earth would be moving in its orbit in the opposite direction) the cancellation should not occur. When they repeated the experiment six months later, they again obtained a null result. In no experiment were Michelson and Morley able to detect the motion of the Earth through the ether.

In summary, we have seen that there is a direct chain of reasoning that leads from Galileo's principle of inertia, through Newton's laws with their implicit assumptions about space and time, ending with the failure of the Michelson-Morley experiment to observe the motion of the Earth relative to the ether. Although several explanations were offered for the unobservability of the ether and the corresponding failure of the upstream-downstream and cross-stream velocities to add in the expected way, the most novel, revolutionary, and ultimately successful explanation is given by Einstein's special theory of relativity, which requires a serious readjustment of our traditional concepts of space and time, and therefore alters some of the very foundations of physics.

FIGURE 2.7 Interference fringes as observed with the Michelson interferometer of Figure 2.6. When the path length ACA changes by one-half wavelength relative to ABA, all light areas turn dark and all dark areas turn light.

2.3 EINSTEIN'S POSTULATES

The special theory of relativity is based on two postulates proposed by Albert Einstein in 1905:

The principle of relativity: The laws of physics are the same in all inertial reference frames.
The principle of the constancy of the speed of light: The speed of light in free space has the same value c in all inertial reference frames.

The first postulate declares that the laws of physics are absolute, universal, and the same for all inertial observers. Laws that hold for one inertial observer cannot be violated for any inertial observer.
The second postulate is more difficult to accept because it seems to go against our "common sense," which is based on the Galilean kinematics we observe in everyday experiences. Consider three observers A, B, and C. Observer B is at rest, while A and C move away from B in opposite directions each at a speed of c/4. B fires a light beam in the direction of A. According to the Galilean transformation, if B measures a speed of c for the light beam, then A measures a speed of c − c/4 = 3c/4, while C measures a speed of c + c/4 = 5c/4. Einstein's second postulate, on the other hand, requires all three observers to measure the same speed of c for the light beam! This postulate immediately explains the failure of the Michelson-Morley experiment: the upstream-downstream and cross-stream speeds are identical (both are equal to c), so there is no phase difference between the two beams.

The two postulates also allow us to dispose of the ether hypothesis. The first postulate does not permit a preferred frame of reference (all inertial frames are equivalent), and the second postulate does not permit only a single frame of reference in which light moves at speed c, because light moves at speed c in all frames. The ether, as a preferred reference frame in which light has a unique speed, is therefore unnecessary.

Albert Einstein (1879–1955, Germany-United States). A gentle philosopher and pacifist, he was the intellectual leader of two generations of theoretical physicists and left his imprint on nearly every field of modern physics.

2.4 CONSEQUENCES OF EINSTEIN'S POSTULATES

Among their many consequences, Einstein's postulates require a new consideration of the fundamental nature of time and space. In this section we discuss how the postulates affect measurements of time and length intervals by observers in different frames of reference.
FIGURE 2.8 The clock ticks at intervals $\Delta t_0$ determined by the time for a light flash to travel the distance $2L_0$ from the light source S to the mirror M and back to the source, where it is detected. (We assume the emission and detection occur at the same location, so the beam travels perpendicular to the mirror.)

The Relativity of Time

To demonstrate the relativity of time, we use the timing device illustrated in Figure 2.8. It consists of a flashing light source S that is a distance $L_0$ from a mirror M. A flash of light from the source is reflected by the mirror, and when the light returns to S the clock ticks and triggers another flash. The time interval between ticks is the distance $2L_0$ (assuming the light travels perpendicular to the mirror) divided by the speed c:

$$\Delta t_0 = 2L_0/c \qquad (2.6)$$

This is the time interval that is measured when the clock is at rest with respect to the observer.

We consider two observers: O is at rest on the ground, and O′ moves with speed u. Each observer carries a timing device. Figure 2.9 shows a sequence of events that O observes for the clock carried by O′. According to O, the flash is emitted when the clock of O′ is at A, reflected when it is at B, and detected at C. In this interval $\Delta t$, O observes the clock to move forward a distance of $u\,\Delta t$ from the point at which the flash was emitted, and O concludes that the light beam travels a distance 2L, where $L = \sqrt{L_0^2 + (u\,\Delta t/2)^2}$, as shown in Figure 2.9. Because O observes the light beam to travel at speed c (as required by Einstein's second postulate), the time interval measured by O is

$$\Delta t = \frac{2L}{c} = \frac{2\sqrt{L_0^2 + (u\,\Delta t/2)^2}}{c} \qquad (2.7)$$

Substituting for $L_0$ from Eq. 2.6 and solving Eq. 2.7 for $\Delta t$, we obtain

$$\Delta t = \frac{\Delta t_0}{\sqrt{1 - u^2/c^2}} \qquad (2.8)$$

FIGURE 2.9 In the frame of reference of O, the clock carried by O′ moves with speed u. The dashed line, of length 2L, shows the path of the light beam according to O.

According to Eq.
2.8, observer O measures a longer time interval than O′ measures. This is a general result of special relativity, which is known as time dilation. An observer O′ is at rest relative to a device that produces a time interval $\Delta t_0$. For this observer, the beginning and end of the time interval occur at the same location, and so the interval $\Delta t_0$ is known as the proper time. An observer O, relative to whom O′ is in motion, measures a longer time interval $\Delta t$ for the same device. The dilated time interval $\Delta t$ is always longer than the proper time interval $\Delta t_0$, no matter what the magnitude or direction of $\vec{u}$.

This is a real effect that applies not only to clocks based on light beams but also to time itself; all clocks run more slowly according to an observer in relative motion, biological clocks included. Even the growth, aging, and decay of living systems are slowed by the time dilation effect. However, note that under normal circumstances ($u \ll c$), there is no measurable difference between $\Delta t$ and $\Delta t_0$, so we don't notice the effect in our everyday activities. Time dilation has been verified experimentally with decaying elementary particles as well as with precise atomic clocks carried aboard aircraft. Some experimental tests are discussed in the last section of this chapter.

Example 2.4
Muons are elementary particles with a (proper) lifetime of 2.2 μs. They are produced with very high speeds in the upper atmosphere when cosmic rays (high-energy particles from space) collide with air molecules. Take the height $L_0$ of the atmosphere to be 100 km in the reference frame of the Earth, and find the minimum speed that enables the muons to survive the journey to the surface of the Earth.

Solution
The birth and decay of the muon can be considered as the "ticks" of a clock. In the frame of reference of the Earth (observer O) this clock is moving, and therefore its ticks are slowed by the time dilation effect. If the muon is moving at a speed that is close to c, the time necessary for it to travel from the top of the atmosphere to the surface of the Earth is

$$\Delta t = \frac{L_0}{c} = \frac{100\ \text{km}}{3.00 \times 10^8\ \text{m/s}} = 333\ \mu\text{s}$$

If the muon is to be observed at the surface of the Earth, it must live for at least 333 μs in the Earth's frame of reference. In the muon's frame of reference, the interval between its birth and decay is a proper time interval of 2.2 μs. The time intervals are related by Eq. 2.8:

$$333\ \mu\text{s} = \frac{2.2\ \mu\text{s}}{\sqrt{1 - u^2/c^2}}$$

Solving, we find

$$u = 0.999978c$$

If it were not for the time dilation effect, muons would not survive to reach the Earth's surface. The observation of these muons is a direct verification of the time dilation effect of special relativity.

The Relativity of Length

For this discussion, the moving timing device of O′ is turned sideways, so that the light travels parallel to the direction of motion of O′. Figure 2.10 shows the sequence of events that O observes for the moving clock. According to O, the length of the clock (distance between the light source and the mirror) is L; as we shall see, this length is different from the length $L_0$ measured by O′, relative to whom the clock is at rest.

FIGURE 2.10 Here the clock carried by O′ emits its light flash in the direction of motion.

The flash of light is emitted when the clock of O′ is at A and reaches the mirror (position B) at time $\Delta t_1$ later. In this time interval, the light travels a distance $c\,\Delta t_1$, equal to the length L of the clock plus the additional distance $u\,\Delta t_1$ that the mirror moves forward in this interval. That is,

$$c\,\Delta t_1 = L + u\,\Delta t_1 \qquad (2.9)$$

The flash of light travels from the mirror to the detector in a time $\Delta t_2$ and covers a distance of $c\,\Delta t_2$, equal to the length L of the clock less the distance $u\,\Delta t_2$ that the clock moves forward in this interval:

$$c\,\Delta t_2 = L - u\,\Delta t_2 \qquad (2.10)$$

Solving Eqs.
2.9 and 2.10 for $\Delta t_1$ and $\Delta t_2$, and adding to find the total time interval, we obtain

$$\Delta t = \Delta t_1 + \Delta t_2 = \frac{L}{c-u} + \frac{L}{c+u} = \frac{2L}{c}\,\frac{1}{1 - u^2/c^2} \qquad (2.11)$$

From Eq. 2.8,

$$\Delta t = \frac{\Delta t_0}{\sqrt{1 - u^2/c^2}} = \frac{2L_0}{c}\,\frac{1}{\sqrt{1 - u^2/c^2}} \qquad (2.12)$$

Setting Eqs. 2.11 and 2.12 equal to one another and solving, we obtain

$$L = L_0\sqrt{1 - u^2/c^2} \qquad (2.13)$$

Equation 2.13 summarizes the effect known as length contraction. Observer O′, who is at rest with respect to the object, measures the rest length $L_0$ (also known as the proper length, in analogy with the proper time). All observers relative to whom O′ is in motion measure a shorter length, but only along the direction of motion; length measurements transverse to the direction of motion are unaffected (Figure 2.11).

FIGURE 2.11 Some length-contracted objects. Notice that the shortening occurs only in the direction of motion.

For ordinary speeds ($u \ll c$), the effects of length contraction are too small to be observed. For example, a rocket of length 100 m traveling at the escape speed from Earth (11.2 km/s) would appear to an observer on Earth to contract by only about $7 \times 10^{-8}$ m, less than one part in a billion of its length.

Length contraction suggests that objects in motion are measured to have a shorter length than they do at rest. The objects do not actually shrink; there is merely a difference in the length measured by different observers. For example, to observers on Earth a high-speed rocket ship would appear to be contracted along its direction of motion (Figure 2.12a), but to an observer on the ship it is the passing Earth that appears to be contracted (Figure 2.12b).

These representations of length-contracted objects are somewhat idealized. The actual appearance of a rapidly moving object is determined by the time at which light leaves the various parts of the object and enters the eye or the camera. The result is that the object appears distorted in shape and slightly rotated.
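To get a feel for how small the contraction of Eq. 2.13 is at everyday speeds, the sketch below evaluates it for the 100 m rocket at escape speed and, for contrast, at a hypothetical speed of 0.80c. The function names are illustrative assumptions. The shortening is computed in the algebraically equivalent form $L_0\beta^2/(1 + \sqrt{1 - \beta^2})$, with $\beta = u/c$, because subtracting two nearly equal lengths directly loses precision when $u \ll c$.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def contracted_length(L0, u):
    """Eq. 2.13: length measured by an observer relative to whom the object moves at speed u."""
    return L0 * math.sqrt(1.0 - (u / C) ** 2)

def shortening(L0, u):
    """L0 - L, rewritten as L0*b2/(1 + sqrt(1 - b2)) to avoid cancellation at small u/c."""
    b2 = (u / C) ** 2
    return L0 * b2 / (1.0 + math.sqrt(1.0 - b2))

# 100 m rocket at the escape speed from Earth, 11.2 km/s.
rocket_shortening = shortening(100.0, 11.2e3)

# Same rocket at a hypothetical 0.80c, where the effect is large.
fast_length = contracted_length(100.0, 0.80 * C)

print(rocket_shortening, fast_length)
```

At escape speed the shortening is a few times $10^{-8}$ m, while at 0.80c the measured length drops to 60 m.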
Example 2.5
Consider the point of view of an observer who is moving toward the Earth at the same velocity as the muon. In this reference frame, what is the apparent thickness of the Earth's atmosphere?

Solution
In this observer's reference frame, the muon is at rest and the Earth is rushing toward it at a speed of u = 0.999978c, as we found in Example 2.4. To an observer on the Earth, the height of the atmosphere is its rest length $L_0$ of 100 km. To the observer in the muon's rest frame, the moving Earth has an atmosphere of height given by Eq. 2.13:

$$L = L_0\sqrt{1 - u^2/c^2} = (100\ \text{km})\sqrt{1 - (0.999978)^2} = 0.66\ \text{km} = 660\ \text{m}$$

This distance is small enough for the muons to reach the Earth's surface within their lifetime.

Note that what appears as a time dilation in one frame of reference (the observer on Earth) can be regarded as a length contraction in another frame of reference (the observer traveling with the muon).

FIGURE 2.12 (a) The Earth views the passing contracted rocket. (b) From the rocket's frame of reference, the Earth appears contracted.

For another example of this effect, let's review again the example of the pion decay discussed in Section 1.2. A pion at rest has a lifetime of 26.0 ns. According to observer $O_1$ at rest in the laboratory frame of reference, a pion moving through the laboratory at a speed of 0.913c has a longer lifetime, which can be calculated to be 63.7 ns (using Eq. 2.8 for the time dilation). According to observer $O_2$, who is traveling through the laboratory at the same velocity as the pion, the pion appears to be at rest and has its proper lifetime of 26.0 ns. Thus $O_1$ sees a time dilation effect.

$O_1$ erects two markers in the laboratory, at the locations where the pion is created and decays. To $O_1$, the distance between those markers is the pion's speed times its lifetime, which works out to be 17.4 m. Suppose $O_1$ places a stick of length 17.4 m in the laboratory connecting the two markers.
That stick is at rest in the laboratory reference frame and so has its proper length in that frame. In the reference frame of $O_2$, the stick is moving at a speed of 0.913c and has a shorter length of 7.1 m, which we can find using the length contraction formula (Eq. 2.13). So $O_2$ measures a distance of 7.1 m between the locations in the laboratory where the pion was created and where it decayed.

Note that $O_1$ measures the proper length and the dilated time, while $O_2$ measures the proper time and the contracted length. The proper time and proper length must always be referred to specific observers, who might not be in the same reference frame. The proper time is always measured by an observer according to whom the beginning of the time interval and the end of the time interval occur at the same location. If the time interval is the lifetime of the pion, then $O_2$ (relative to whom the pion does not move) sees its creation and decay at the same location and thus measures the proper time interval. The proper length, on the other hand, is always measured by an observer according to whom the measuring stick is at rest ($O_1$ in this case).

Example 2.6
An observer O is standing on a platform of length $D_0 = 65$ m on a space station. A rocket passes at a relative speed of 0.80c moving parallel to the edge of the platform. The observer O notes that the front and back of the rocket simultaneously line up with the ends of the platform at a particular instant (Figure 2.13a). (a) According to O, what is the time necessary for the rocket to pass a particular point on the platform? (b) What is the rest length $L_0$ of the rocket? (c) According to an observer O′ on the rocket, what is the length D of the platform? (d) According to O′, how long does it take for observer O to pass the entire length of the rocket? (e) According to O, the ends of the rocket simultaneously line up with the ends of the platform. Are these events simultaneous to O′?
FIGURE 2.13 Example 2.6. (a) From the reference frame of O at rest on the platform, the passing rocket lines up simultaneously with the front and back of the platform. (b, c) From the reference frame O′ in the rocket, the passing platform lines up first with the front of the rocket and later with the rear. Note the differing effects of length contraction in the two reference frames.

Solution
(a) According to O, the length L of the rocket matches the length $D_0$ of the platform. The time for the rocket to pass a particular point is measured by O to be

$$\Delta t_0 = \frac{L}{0.80c} = \frac{65\ \text{m}}{2.40 \times 10^8\ \text{m/s}} = 0.27\ \mu\text{s}$$

This is a proper time interval, because O measures the interval between two events that occur at the same point in the frame of reference of O (the front of the rocket passes a point, and then the back of the rocket passes the same point).

(b) O measures the contracted length L of the rocket. We can find its proper length $L_0$ using Eq. 2.13:

$$L_0 = \frac{L}{\sqrt{1 - u^2/c^2}} = \frac{65\ \text{m}}{\sqrt{1 - (0.80)^2}} = 108\ \text{m}$$

(c) According to O the platform is at rest, so 65 m is its proper length $D_0$. According to O′, the contracted length of the platform is therefore

$$D = D_0\sqrt{1 - u^2/c^2} = (65\ \text{m})\sqrt{1 - (0.80)^2} = 39\ \text{m}$$

(d) For O to pass the entire length of the rocket, O′ concludes that O must move a distance equal to its rest length, or 108 m. The time needed to do this is

$$\Delta t' = \frac{108\ \text{m}}{0.80c} = 0.45\ \mu\text{s}$$

Note that this is not a proper time interval for O′, who determines this time interval using one clock at the front of the rocket to measure the time at which O passes the front of the rocket, and another clock on the rear of the rocket to measure the time at which O passes the rear of the rocket. The two events therefore occur at different points in O′ and so cannot be separated by a proper time in O′.
The corresponding time interval measured by O for the same two events, which we calculated in part (a), is a proper time interval for O, because the two events do occur at the same point in O. The time intervals measured by O and O′ should be related by the time dilation formula, as you should verify.

(e) According to O′, the rocket has a rest length of $L_0 = 108$ m and the platform has a contracted length of D = 39 m. There is thus no way that O′ could observe the two ends of both to align simultaneously. The sequence of events according to O′ is illustrated in Figures 2.13b and c. The time interval $\Delta t'$ in O′ between the two events that are simultaneous in O can be calculated by noting that, according to O′, the time interval between the situations shown in Figures 2.13b and c must be that necessary for the platform to move a distance of 108 m − 39 m = 69 m, which takes a time

$$\Delta t' = \frac{69\ \text{m}}{0.80c} = 0.29\ \mu\text{s}$$

This result illustrates the relativity of simultaneity: two events at different locations that are simultaneous to O (the lining up of the two ends of the rocket with the two ends of the platform) cannot be simultaneous to O′.

Relativistic Velocity Addition

The timing device is now modified as shown in Figure 2.14. A source P emits particles that travel at speed v′ according to an observer O′ at rest with respect to the device. The flashing bulb F is triggered to flash when a particle reaches it. The flash of light makes the return trip to the detector D, and the clock ticks. The time interval $\Delta t_0$ between ticks measured by O′ is composed of two parts: one for the particle to travel the distance $L_0$ at speed v′ and another for the light to travel the same distance at speed c:

$$\Delta t_0 = L_0/v' + L_0/c \qquad (2.14)$$

According to observer O, relative to whom O′ moves at speed u, the sequence of events is similar to that shown in Figure 2.10.
The emitted particle, which travels at speed v according to O, reaches F in a time interval $\Delta t_1$ after traveling the distance $v\,\Delta t_1$ equal to the (contracted) length L plus the additional distance $u\,\Delta t_1$ moved by the clock in that interval:

$$v\,\Delta t_1 = L + u\,\Delta t_1 \qquad (2.15)$$

In the interval $\Delta t_2$, the light beam travels a distance $c\,\Delta t_2$ equal to the length L less the distance $u\,\Delta t_2$ moved by the clock in that interval:

$$c\,\Delta t_2 = L - u\,\Delta t_2 \qquad (2.16)$$

We now solve Eqs. 2.15 and 2.16 for $\Delta t_1$ and $\Delta t_2$, add to find the total interval $\Delta t$ between ticks according to O, use the time dilation formula, Eq. 2.8, to relate this result to $\Delta t_0$ from Eq. 2.14, and finally use the length contraction formula, Eq. 2.13, to relate L to $L_0$. After doing the algebra, we find the result

$$v = \frac{v' + u}{1 + v'u/c^2} \qquad (2.17)$$

FIGURE 2.14 In this timing device, a particle is emitted by P at a speed v′. When the particle reaches F, it triggers the emission of a flash of light that travels to the detector D.

Equation 2.17 is the relativistic velocity addition law for velocity components that are in the direction of $\vec{u}$. Later in this chapter we use a different method to derive the corresponding results for motion in other directions.

We can also regard Eq. 2.17 as a velocity transformation, enabling us to convert a velocity v′ measured by O′ to a velocity v measured by O. The corresponding classical law was given by Eq. 2.2: v = v′ + u. The difference between the classical and relativistic results is the denominator of Eq. 2.17, which reduces to 1 in cases when the speeds are small compared with c. Example 2.7 shows how this factor prevents the measured speeds from exceeding c.

Equation 2.17 gives an important result when O′ observes a light beam. For v′ = c,

$$v = \frac{c + u}{1 + cu/c^2} = c \qquad (2.18)$$

That is, when v′ = c, then v = c, independent of the value of u. All observers measure the same speed c for light, exactly as required by Einstein's second postulate.
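The behavior of Eq. 2.17 in its limiting cases can be explored with a few lines of code. This is a minimal sketch with a hypothetical function name; speeds are expressed in units of c (so c = 1 by default), and the sample inputs are illustrative assumptions rather than values from the text.

```python
def add_velocities(v_prime, u, c=1.0):
    """Relativistic velocity addition, Eq. 2.17, for motion along the direction of u."""
    return (v_prime + u) / (1.0 + v_prime * u / c**2)

# Light observed by O' (v' = c) is also measured at c by O, whatever u is (Eq. 2.18).
light = add_velocities(1.0, 0.5)

# Two moderately relativistic speeds combine to less than their classical sum.
half_and_half = add_velocities(0.5, 0.5)

# At low speeds the denominator is nearly 1 and the classical law v = v' + u is recovered.
slow = add_velocities(1e-6, 2e-6)

# Even two speeds very close to c combine to a result below c.
near_limit = add_velocities(0.9999, 0.9999)

print(light, half_and_half, slow, near_limit)
```

Passing a negative u corresponds to O′ moving in the negative x direction relative to O, under the same sign convention as Eq. 2.17.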
Example 2.7
A spaceship moving away from the Earth at a speed of 0.80c fires a missile parallel to its direction of motion (Figure 2.15). The missile moves at a speed of 0.60c relative to the ship. What is the speed of the missile as measured by an observer on the Earth?

FIGURE 2.15 Example 2.7. A spaceship moves away from Earth at a speed of 0.80c. An observer O′ on the spaceship fires a missile and measures its speed to be 0.60c relative to the ship.

Solution
Here O′ is on the ship and O is on Earth; O′ moves with a speed of u = 0.80c relative to O. The missile moves at speed v′ = 0.60c relative to O′, and we seek its speed v relative to O. Using Eq. 2.17, we obtain

$$v = \frac{v' + u}{1 + v'u/c^2} = \frac{0.60c + 0.80c}{1 + (0.60c)(0.80c)/c^2} = \frac{1.40c}{1.48} = 0.95c$$

According to classical kinematics (the numerator of Eq. 2.17), an observer on the Earth would see the missile moving at 0.60c + 0.80c = 1.40c, thereby exceeding the maximum relative speed of c permitted by relativity. You can see how Eq. 2.17 brings about this speed limit. Even if v′ were 0.9999...c and u were 0.9999...c, the relative speed v measured by O would remain less than c.

The Relativistic Doppler Effect

In the classical Doppler effect for sound waves, an observer moving relative to a source of waves (sound, for example) detects a frequency different from that emitted by the source. The frequency f′ heard by the observer O is related to the frequency f emitted by the source S according to

$$f' = f\,\frac{v \pm v_O}{v \mp v_S} \qquad (2.19)$$

where v is the speed of the waves in the medium (such as still air, in the case of sound waves), $v_S$ is the speed of the source relative to the medium, and $v_O$ is the speed of the observer relative to the medium. The upper signs in the numerator and denominator are chosen whenever S moves toward O or O moves toward S, while the lower signs apply whenever O and S move away from one another.
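Equation 2.19 can be evaluated for a few cases to see how the classical shift depends on who is moving. In this sketch the function name and the sign convention (positive speeds mean motion toward the other party, which selects the upper signs) are assumptions, and the numbers (a 1000 Hz source, sound speed 340 m/s, closing speed 30 m/s) are chosen for illustration.

```python
def classical_doppler(f, v, v_obs=0.0, v_src=0.0):
    """Eq. 2.19 with the upper signs: positive v_obs or v_src means approaching.
    Use negative values for motion away (the lower signs)."""
    return f * (v + v_obs) / (v - v_src)

f, v = 1000.0, 340.0  # source frequency in Hz, wave speed in the medium in m/s

# Source approaches a stationary observer at 30 m/s.
f_source_moves = classical_doppler(f, v, v_src=30.0)

# Observer approaches a stationary source at 30 m/s.
f_observer_moves = classical_doppler(f, v, v_obs=30.0)

# Each moves toward the other at 15 m/s: same closing speed, different result again.
f_both_move = classical_doppler(f, v, v_obs=15.0, v_src=15.0)

print(round(f_source_moves), round(f_observer_moves), round(f_both_move))
```

All three cases have the same relative speed between source and observer, yet the observed frequencies differ, because Eq. 2.19 depends on each speed relative to the medium.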
The classical Doppler shift for motion of the source differs from that for motion of the observer. For example, suppose the source emits sound waves at f = 1000 Hz. If the source moves at 30 m/s toward the observer who is at rest in the medium (which we take to be air, in which sound moves at v = 340 m/s), then f′ = 1097 Hz, while if the source is at rest in the medium and the observer moves toward the source at 30 m/s, the frequency is 1088 Hz. Other possibilities in which the relative speed between S and O is 30 m/s, such as each moving toward the other at 15 m/s, give still different frequencies.

Here we have a situation in which it is not the relative speed of the source and observer that determines the Doppler shift; it is the speed of each with respect to the medium. This cannot occur for light waves, since there is no medium (no "ether") and no preferred reference frame by Einstein's first postulate. We therefore require a different approach to the Doppler effect for light waves, an approach that does not distinguish between source motion and observer motion, but involves only the relative motion between the source and the observer.

Consider a source of waves that is at rest in the reference frame of observer O. Observer O′ moves relative to the source at speed u. We consider the situation from the frame of reference of O′, as shown in Figure 2.16. Suppose O observes the source to emit N waves at frequency f. According to O, it takes an interval $\Delta t_0 = N/f$ for these N waves to be emitted; this is a proper time interval in the frame of reference of O. The corresponding time interval to O′ is $\Delta t'$, during which O moves a distance $u\,\Delta t'$. The wavelength according to O′ is the total length interval occupied by these waves divided by the number of waves:

$$\lambda' = \frac{c\,\Delta t' + u\,\Delta t'}{N} = \frac{c\,\Delta t' + u\,\Delta t'}{f\,\Delta t_0} \qquad (2.20)$$

The frequency according to O′ is $f' = c/\lambda'$, so

$$f' = f\,\frac{\Delta t_0}{\Delta t'}\,\frac{1}{1 + u/c} \qquad (2.21)$$

and using the time dilation formula, Eq.
2.8, to relate $\Delta t'$ and $\Delta t_0$, we obtain

$$f' = f\,\frac{\sqrt{1 - u^2/c^2}}{1 + u/c} = f\sqrt{\frac{1 - u/c}{1 + u/c}} \qquad (2.22)$$

This is the formula for the relativistic Doppler shift, for the case in which the waves are observed in a direction parallel to $\vec{u}$. Note that, unlike the classical formula, it does not distinguish between source motion and observer motion; the relativistic Doppler effect depends only on the relative speed u between the source and observer. Equation 2.22 assumes that the source and observer are separating. If the source and observer are approaching one another, replace u by −u in the formula.

FIGURE 2.16 A source of waves, in the reference frame of O, moves at speed u away from observer O′. In the time $\Delta t'$ (according to O′), O moves a distance $u\,\Delta t'$ and emits N waves.

Example 2.8
A distant galaxy is moving away from the Earth at such high speed that the blue hydrogen line at a wavelength of 434 nm is recorded at 600 nm, in the red range of the spectrum. What is the speed of the galaxy relative to the Earth?

Solution
Using Eq. 2.22 with $f = c/\lambda$ and $f' = c/\lambda'$, we obtain

$$\frac{c}{\lambda'} = \frac{c}{\lambda}\sqrt{\frac{1 - u/c}{1 + u/c}}$$

$$\frac{c}{600\ \text{nm}} = \frac{c}{434\ \text{nm}}\sqrt{\frac{1 - u/c}{1 + u/c}}$$

Solving, we find

$$u/c = 0.31$$

Thus the galaxy is moving away from Earth at a speed of $0.31c = 9.4 \times 10^7$ m/s. Evidence obtained in this way indicates that nearly all the galaxies we observe are moving away from us. This suggests that the universe is expanding, and is usually taken to provide evidence in favor of the Big Bang theory of cosmology (see Chapter 15).

2.5 THE LORENTZ TRANSFORMATION

We have seen that the Galilean transformation of coordinates, time, and velocity is not consistent with Einstein's postulates. Although the Galilean transformation agrees with our "common-sense" experience at low speeds, it does not agree with experiment at high speeds.
We therefore need a new set of transformation equations that replaces the Galilean set and that is capable of predicting such relativistic effects as time dilation, length contraction, velocity addition, and the Doppler shift. As before, we seek a transformation that enables observers O and O′ in relative motion to compare their measurements of the space and time coordinates of the same event. The transformation equations relate the measurements of O (namely, x, y, z, t) to those of O′ (namely, x′, y′, z′, t′). This new transformation must have several properties: It must be linear (depending only on the first power of the space and time coordinates), which follows from the homogeneity of space and time; it must be consistent with Einstein’s postulates; and it must reduce to the Galilean transformation when the relative speed between O and O′ is small. We again assume that the velocity of O′ relative to O is in the positive xx′ direction.

This new transformation consistent with special relativity is called the Lorentz transformation.∗ Its equations are

$$x' = \frac{x - ut}{\sqrt{1 - u^2/c^2}} \tag{2.23a}$$

$$y' = y \tag{2.23b}$$

$$z' = z \tag{2.23c}$$

$$t' = \frac{t - (u/c^2)\,x}{\sqrt{1 - u^2/c^2}} \tag{2.23d}$$

∗ H. A. Lorentz (1853–1928) was a Dutch physicist who shared the 1902 Nobel Prize for his work on the influence of magnetic fields on light. In an unsuccessful attempt to explain the failure of the Michelson-Morley experiment, Lorentz developed the transformation equations that are named for him in 1904, a year before Einstein published his special theory of relativity. For a derivation of the Lorentz transformation, see R. Resnick and D. Halliday, Basic Concepts in Relativity (New York, Macmillan, 1992).

It is often useful to write these equations in terms of intervals of space and time by replacing each coordinate by the corresponding interval (replace x by $\Delta x$, x′ by $\Delta x'$, t by $\Delta t$, t′ by $\Delta t'$). These equations are written assuming that O′ moves away from O in the xx′ direction.
If O′ moves toward O, replace u with −u in the equations.

The first three equations reduce directly to the Galilean transformation for space coordinates, Eqs. 2.1, when u ≪ c. The fourth equation, which links the time coordinates, reduces to t′ = t, which is a fundamental postulate of the Galilean-Newtonian world.

We now use the Lorentz transformation equations to derive some of the predictions of special relativity. The problems at the end of the chapter guide you in some other derivations. The results derived here are identical with those we obtained previously using Einstein’s postulates, which shows that the equations of the Lorentz transformation are consistent with the postulates of special relativity.

Length Contraction

A rod of length $L_0$ is at rest in the reference frame of observer O′. The rod extends along the x′ axis from $x'_1$ to $x'_2$; that is, O′ measures the proper length $L_0 = x'_2 - x'_1$. Observer O, relative to whom the rod is in motion, measures the ends of the rod to be at coordinates $x_1$ and $x_2$. For O to determine the length of the moving rod, O must make a simultaneous determination of $x_1$ and $x_2$, and then the length is $L = x_2 - x_1$. Suppose the first event is O′ setting off a flash bulb at one end of the rod at $x'_1$ and $t'_1$, which O observes at $x_1$ and $t_1$, and the second event is O′ setting off a flash bulb at the other end at $x'_2$ and $t'_2$, which O observes at $x_2$ and $t_2$. The equations of the Lorentz transformation relate these coordinates, specifically,

$$x'_1 = \frac{x_1 - ut_1}{\sqrt{1 - u^2/c^2}} \qquad x'_2 = \frac{x_2 - ut_2}{\sqrt{1 - u^2/c^2}} \tag{2.24}$$

Subtracting these equations, we obtain

$$x'_2 - x'_1 = \frac{x_2 - x_1}{\sqrt{1 - u^2/c^2}} - \frac{u(t_2 - t_1)}{\sqrt{1 - u^2/c^2}} \tag{2.25}$$

O′ must arrange to set off the flash bulbs so that the flashes appear to be simultaneous to O. (They will not be simultaneous to O′, as we discuss later in this section.) This enables O to make a simultaneous determination of the coordinates of the endpoints of the rod.
If O observes the flashes to be simultaneous, then $t_2 = t_1$, and Eq. 2.25 reduces to

$$x'_2 - x'_1 = \frac{x_2 - x_1}{\sqrt{1 - u^2/c^2}} \tag{2.26}$$

With $x'_2 - x'_1 = L_0$ and $x_2 - x_1 = L$, this becomes

$$L = L_0\sqrt{1 - u^2/c^2} \tag{2.27}$$

which is identical with Eq. 2.13, which we derived earlier using Einstein’s postulates.

Velocity Transformation

If O observes a particle to travel with velocity $\vec{v}$ (components $v_x$, $v_y$, $v_z$), what velocity $\vec{v}'$ does O′ observe for the particle? The relationship between the velocities measured by O and O′ is given by the Lorentz velocity transformation:

$$v'_x = \frac{v_x - u}{1 - v_x u/c^2} \tag{2.28a}$$

$$v'_y = \frac{v_y\sqrt{1 - u^2/c^2}}{1 - v_x u/c^2} \tag{2.28b}$$

$$v'_z = \frac{v_z\sqrt{1 - u^2/c^2}}{1 - v_x u/c^2} \tag{2.28c}$$

By solving Eq. 2.28a for $v_x$, you can show that it is identical to Eq. 2.17, a result we derived previously based on Einstein’s postulates. Note that, in the limit of low speeds (u ≪ c), the Lorentz velocity transformation reduces to the Galilean velocity transformation, Eq. 2.2. Note also that $v'_y \neq v_y$, even though y′ = y. This occurs because of the way the Lorentz transformation handles the time coordinate.

We can derive these transformation equations for velocity from the Lorentz coordinate transformation. By way of example, we derive the velocity transformation for $v'_y = dy'/dt'$. Differentiating the coordinate transformation y′ = y, we obtain dy′ = dy. Similarly, differentiating the time coordinate transformation (Eq. 2.23d), we obtain

$$dt' = \frac{dt - (u/c^2)\,dx}{\sqrt{1 - u^2/c^2}}$$

So

$$v'_y = \frac{dy'}{dt'} = \frac{dy}{[dt - (u/c^2)\,dx]/\sqrt{1 - u^2/c^2}} = \sqrt{1 - u^2/c^2}\,\frac{dy}{dt - (u/c^2)\,dx}$$

$$= \sqrt{1 - u^2/c^2}\,\frac{dy/dt}{1 - (u/c^2)\,dx/dt} = \frac{v_y\sqrt{1 - u^2/c^2}}{1 - uv_x/c^2}$$

Similar methods can be used to obtain the transformation equations for $v'_x$ and $v'_z$. These derivations are left as exercises (Problem 14).
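The velocity transformation of Eqs. 2.28 is easy to check numerically. The following is a minimal sketch, not part of the original text: the function name `lorentz_velocity` is an illustrative choice, all speeds are expressed in units of c, and the sample values are those of Example 2.9 (u = 0.60c, with the observed object moving at 0.80c along y).

```python
import math

def lorentz_velocity(vx, vy, vz, u):
    """Transform velocity components from frame O to frame O' (Eqs. 2.28).

    All speeds are in units of c. O' moves at speed u along the common
    x axis relative to O. The function name is a hypothetical helper.
    """
    root = math.sqrt(1.0 - u**2)   # sqrt(1 - u^2/c^2)
    denom = 1.0 - vx * u           # 1 - vx*u/c^2
    return ((vx - u) / denom, vy * root / denom, vz * root / denom)

# Values of Example 2.9: rocket 2 as seen from rocket 1
vxp, vyp, vzp = lorentz_velocity(0.0, 0.80, 0.0, 0.60)
speed = math.sqrt(vxp**2 + vyp**2 + vzp**2)
print(f"v'x = {vxp:.2f}c, v'y = {vyp:.2f}c, speed = {speed:.2f}c")
```

Working in units of c keeps the formulas free of large numerical factors; to use SI units instead, divide every speed by c before calling the function and multiply the results by c afterward.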
Simultaneity and Clock Synchronization

Under ordinary circumstances, synchronizing one clock with another is a simple matter. But for scientific work, where timekeeping at a precision below the nanosecond range is routine, clock synchronization can present some significant challenges. At the very least, we need to correct for the time that it takes for the signal showing the reading on one clock to be transmitted to the other clock. However, for observers who are in motion with respect to each other, special relativity gives yet another way that clocks may appear to be out of synchronization.

Consider the device shown in Figure 2.17. Two clocks are located at x = 0 and x = L. A flash lamp is located at x = L/2, and the clocks are set running when they receive the flash of light from the lamp. The light takes the same interval of time to reach the two clocks, so the clocks start together precisely at a time L/2c after the flash is emitted, and the clocks are exactly synchronized.

FIGURE 2.17 A flash of light, emitted from a point midway between the two clocks, starts the two clocks simultaneously according to O. Observer O′ sees clock 2 start ahead of clock 1.

Now let us examine the same situation from the point of view of the moving observer O′. In the frame of reference of O, two events occur: the receipt of a light signal by clock 1 at $x_1 = 0$, $t_1 = L/2c$ and the receipt of a light signal by clock 2 at $x_2 = L$, $t_2 = L/2c$. Using Eq. 2.23d, we find that O′ observes clock 1 to receive its signal at

$$t'_1 = \frac{t_1 - (u/c^2)\,x_1}{\sqrt{1 - u^2/c^2}} = \frac{L/2c}{\sqrt{1 - u^2/c^2}} \tag{2.29}$$

while clock 2 receives its signal at

$$t'_2 = \frac{t_2 - (u/c^2)\,x_2}{\sqrt{1 - u^2/c^2}} = \frac{L/2c - (u/c^2)\,L}{\sqrt{1 - u^2/c^2}} \tag{2.30}$$

Thus $t'_2$ is smaller than $t'_1$ and clock 2 appears to receive its signal earlier than clock 1, so that the clocks start at times that differ by

$$\Delta t' = t'_1 - t'_2 = \frac{uL/c^2}{\sqrt{1 - u^2/c^2}} \tag{2.31}$$

according to O′.
Keep in mind that this is not a time dilation effect—time dilation comes from the first term of the Lorentz transformation (Eq. 2.23d) for t′ , while the lack of synchronization arises from the second term. O′ observes both clocks to run slow, due to time dilation; O′ also observes clock 2 to be ahead of clock 1. We therefore reach the following conclusion: two events that are simultaneous in one reference frame are not simultaneous in another reference frame moving with respect to the first, unless the two events occur at the same point in space. (If L = 0, Eq. 2.31 shows that the clocks are synchronized in all reference frames.) Clocks that appear to be synchronized in one frame of reference will not necessarily be synchronized in another frame of reference in relative motion. It is important to note that this clock synchronization effect does not depend on the location of observer O′ but only on the velocity of O′ . In Figure 2.17, the location of O′ could have been drawn far to the left side of clock 1 or far to the right side of clock 2, and the result would be the same. In those different locations, the propagation time of the light signal showing clock 1 starting will differ from the propagation time of the light signal showing clock 2 starting. However, O′ is assumed to be an “intelligent” observer who is aware of the locations where the light signals showing the two clocks starting are received relative to the locations of the clocks. O′ corrects for this time difference, which is due only to the propagation time of the light signals, and even after making that correction the clocks still do not appear to be synchronized! Although the location of O′ does not appear in Eq. 2.31, the direction of the velocity of O′ is important—if O′ is moving in the opposite direction, the observed starting order of the two clocks is reversed. 
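The loss of synchronization can be illustrated by transforming the two clock-start events directly with Eq. 2.23d and comparing the result with Eq. 2.31. The sketch below is an illustration rather than part of the original text: the helper name `lorentz_time` is hypothetical, and the numbers (L = 65 m, u = 0.80c) are borrowed from Example 2.10.

```python
import math

C = 3.00e8  # speed of light in m/s (rounded value used in the text)

def lorentz_time(t, x, u):
    """Time coordinate t' in O' of an event at (x, t) in O, from Eq. 2.23d.

    u is the speed of O' relative to O in m/s. Hypothetical helper name.
    """
    return (t - u * x / C**2) / math.sqrt(1.0 - (u / C)**2)

L = 65.0               # clock separation in m (value from Example 2.10)
u = 0.80 * C           # speed of O' relative to O
t_start = L / (2 * C)  # both clocks start at this time according to O

t1p = lorentz_time(t_start, 0.0, u)  # clock 1 at x = 0
t2p = lorentz_time(t_start, L, u)    # clock 2 at x = L

dt_direct = t1p - t2p                                       # from the two events
dt_formula = (u * L / C**2) / math.sqrt(1.0 - (u / C)**2)   # Eq. 2.31
print(f"direct: {dt_direct * 1e6:.2f} us, Eq. 2.31: {dt_formula * 1e6:.2f} us")

# Reversing the direction of motion reverses the starting order of the clocks
t1p_rev = lorentz_time(t_start, 0.0, -u)
t2p_rev = lorentz_time(t_start, L, -u)
print("clock 2 starts first for +u:", t2p < t1p, "| for -u:", t2p_rev < t1p_rev)
```

Note that the position of O′ never enters the calculation; only the velocity u does, which is the point made in the preceding paragraph.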
Example 2.9
Two rockets are leaving their space station along perpendicular paths, as measured by an observer on the space station. Rocket 1 moves at 0.60c and rocket 2 moves at 0.80c, both measured relative to the space station. What is the velocity of rocket 2 as observed by rocket 1?

Solution
Observer O is the space station, observer O′ is rocket 1 (moving at u = 0.60c), and each observes rocket 2, moving (according to O) in a direction perpendicular to rocket 1. We take this to be the y direction of the reference frame of O. Thus O observes rocket 2 to have velocity components $v_x = 0$, $v_y = 0.80c$, as shown in Figure 2.18a. We can find $v'_x$ and $v'_y$ using the Lorentz velocity transformation:

$$v'_x = \frac{v_x - u}{1 - v_x u/c^2} = \frac{0 - 0.60c}{1 - 0(0.60c)/c^2} = -0.60c$$

$$v'_y = \frac{v_y\sqrt{1 - u^2/c^2}}{1 - v_x u/c^2} = \frac{0.80c\sqrt{1 - (0.60c)^2/c^2}}{1 - 0(0.60c)/c^2} = 0.64c$$

Thus, according to O′, the situation looks like Figure 2.18b. The speed of rocket 2 according to O′ is $\sqrt{(0.60c)^2 + (0.64c)^2} = 0.88c$, less than c. According to the Galilean transformation, $v'_y$ would be identical with $v_y$, and thus the speed would be $\sqrt{(0.60c)^2 + (0.80c)^2} = c$. Once again, the Lorentz transformation prevents relative speeds from reaching or exceeding the speed of light.

FIGURE 2.18 Example 2.9. (a) As viewed from the reference frame of O. (b) As viewed from the reference frame of O′.

Example 2.10
In Example 2.6, two events that were simultaneous to O (the lining up of the front and back of the rocket ship with the ends of the platform) were not simultaneous to O′. Find the time interval between these events according to O′.

Solution
According to O, the two simultaneous events are separated by a distance of L = 65 m. For u = 0.80c, Eq.
2.31 gives

$$\Delta t' = \frac{uL/c^2}{\sqrt{1 - u^2/c^2}} = \frac{(0.80)(65\ \text{m})/(3.00 \times 10^8\ \text{m/s})}{\sqrt{1 - (0.80)^2}} = 0.29\ \mu\text{s}$$

which agrees with the result calculated in part (e) of Example 2.6.

2.6 THE TWIN PARADOX

We now turn briefly to what has become known as the twin paradox. Suppose there is a pair of twins on Earth. One, whom we shall call Casper, remains on Earth, while his twin sister Amelia sets off in a rocket ship on a trip to a distant planet. Casper, based on his understanding of special relativity, knows that his sister’s clocks will run slow relative to his own and that therefore she should be younger than he when she returns, as our discussion of time dilation would suggest. However, recalling that discussion, we know that for two observers in relative motion, each thinks the other’s clocks are running slow. We could therefore study this problem from the point of view of Amelia, according to whom Casper and the Earth (accompanied by the solar system and galaxy) make a round-trip journey away from her and back again. Under such circumstances, she will think it is her brother’s clocks (which are now in motion relative to her own) that are running slow, and will therefore expect her brother to be younger than she when they meet again. While it is possible to disagree over whose clocks are running slow relative to his or her own, which is merely a problem of frames of reference, when Amelia returns to Earth (or when the Earth returns to Amelia), all observers must agree as to which twin has aged less rapidly. This is the paradox: each twin expects the other to be younger.

The resolution of this paradox lies in considering the asymmetric role of the two twins. The laws of special relativity apply only to inertial frames, those moving relative to one another at constant velocity.
We may supply Amelia’s rockets with sufficient thrust so that they accelerate for a very short length of time, bringing the ship to a speed at which it can coast to the planet, and thus during her outward journey Amelia spends all but a negligible amount of time in a frame of reference moving at constant speed relative to Casper. However, in order to return to Earth, she must decelerate and reverse her motion. Although this also may be done in a very short time interval, Amelia’s return journey occurs in a completely different inertial frame than her outward journey. It is Amelia’s jump from one inertial frame to another that causes the asymmetry in the ages of the twins. Only Amelia has the necessity of jumping to a new inertial frame to return, and therefore all observers will agree that it is Amelia who is “really” in motion, and that it is her clocks that are “really” running slow; therefore she is indeed the younger twin on her return. Let us make this discussion more quantitative with a numerical example. We assume, as discussed above, that the acceleration and deceleration take negligible time intervals, so that all of Amelia’s aging is done during the coasting. For simplicity, we assume the distant planet is at rest relative to the Earth; this does not change the problem, but it avoids the need to introduce yet another frame of reference. Suppose the planet to be 6 light-years distant from Earth, and suppose Amelia travels at a speed of 0.6c. Then according to Casper it takes his sister 10 years (10 years ×0.6c = 6 light-years) to reach the planet and 10 years to return, and therefore she is gone for a total of 20 years. (However, Casper doesn’t know his sister has reached the planet until the light signal carrying news of her arrival reaches Earth. Since light takes 6 years to make the journey, it is 16 years after her departure when Casper sees his sister’s arrival at the planet. Four years later she returns to Earth.) 
From the frame of reference of Amelia aboard the rocket, the distance to the planet is contracted by a factor of $\sqrt{1 - (0.6)^2} = 0.8$, and is therefore 0.8 × 6 light-years = 4.8 light-years. At a speed of 0.6c, Amelia will measure 8 years for the trip to the planet, for a total round trip time of 16 years. Thus Casper ages 20 years while Amelia ages only 16 years and is indeed the younger on her return.

We can confirm this analysis by having Casper send a light signal to his sister each year on his birthday. We know that the frequency of the signal as received by Amelia will be Doppler shifted. During the outward journey, she will receive signals at the rate of

$$(1/\text{year})\sqrt{\frac{1 - u/c}{1 + u/c}} = 0.5/\text{year}$$

During the return journey, the Doppler-shifted rate will be

$$(1/\text{year})\sqrt{\frac{1 + u/c}{1 - u/c}} = 2/\text{year}$$

Thus for the first 8 years, during Amelia’s trip to the planet, she receives 4 signals, and during the return trip of 8 years, she receives 16 signals, for a total of 20, indicating that her brother has celebrated 20 birthdays during her 16-year journey.

Spacetime Diagrams

FIGURE 2.19 A spacetime diagram.

FIGURE 2.20 Casper’s spacetime diagram, showing his worldline and Amelia’s.

A particularly helpful way of visualizing the journeys of Casper and Amelia uses a spacetime diagram. Figure 2.19 shows an example of a spacetime diagram for motion that involves only one spatial direction. In your introductory physics course, you probably became familiar with plotting motion on a graph in which distance appeared on the vertical axis and time on the horizontal axis. On such a graph, a straight line represents motion at constant velocity; the slope of the line is equal to the velocity.
Note that the axes of the spacetime diagram are switched from the traditional graph of particle motion, with time on the vertical axis and space on the horizontal axis.

On a spacetime diagram, the graph that represents the motion of a particle is called its worldline. The inverse of the slope of the particle’s worldline gives its velocity. Equivalently, the velocity is given by the tangent of the angle that the worldline makes with the vertical axis (rather than with the horizontal axis, as would be the case with a conventional plot of distance vs. time). Usually, the units of x and t are chosen so that motion at the speed of light is represented by a line with a 45° slope. A vertical line represents a particle that is at the same spatial location at all times; that is, a particle at rest. Permitted motions with constant velocity are then represented by straight lines between the vertical and the 45° line representing the maximum velocity.

Let’s draw the worldlines of Casper and Amelia according to Casper’s frame of reference. Casper’s worldline is a vertical line, because he is at rest in this frame (Figure 2.20). In Casper’s frame of reference, 20 years pass between Amelia’s departure and her return, so we can follow Casper’s vertical worldline for 20 years. Amelia is traveling at a speed of 0.6c, so her worldline makes an angle with the vertical whose tangent is 0.6 (31°). In Casper’s frame of reference, the planet visited by Amelia is 6 light-years from Earth. Amelia travels a distance of 6 light-years in a time of 10 years (according to Casper) so that v = 6 light-years/10 years = 0.6c.

The birthday signals that Casper sends to Amelia at the speed of light are represented by the series of 45° lines in Figure 2.20. Amelia receives 4 birthday signals during her outbound journey (the 4th arrives just as she reaches the planet) and 16 birthday signals during her return journey (the 16th is sent and received just as she returns to Earth).
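The bookkeeping of the twin-paradox example (elapsed times for each twin and the number of birthday signals Amelia receives) can be reproduced in a few lines. This is a minimal sketch under the same assumptions as the text: negligible acceleration time, a planet at rest relative to Earth, and years and light-years as units so that c = 1. The function name `doppler_factor` is an illustrative choice.

```python
import math

def doppler_factor(beta, approaching):
    """Ratio f'/f from Eq. 2.22, with beta = u/c.

    For an approaching source and observer, u is replaced by -u.
    Hypothetical helper name.
    """
    if approaching:
        return math.sqrt((1 + beta) / (1 - beta))
    return math.sqrt((1 - beta) / (1 + beta))

beta = 0.6       # Amelia's speed in units of c
distance = 6.0   # Earth-planet distance in light-years (Casper's frame)

# Casper's frame: round trip at speed beta
casper_years = 2 * distance / beta

# Amelia's frame: each leg covers the contracted distance
contraction = math.sqrt(1 - beta**2)
amelia_leg_years = distance * contraction / beta
amelia_years = 2 * amelia_leg_years

# Casper sends one signal per year; Amelia receives them at Doppler-shifted rates
signals_out = doppler_factor(beta, approaching=False) * amelia_leg_years
signals_back = doppler_factor(beta, approaching=True) * amelia_leg_years

print(f"Casper ages {casper_years:.0f} y, Amelia ages {amelia_years:.0f} y")
print(f"signals received: {signals_out:.0f} outbound + {signals_back:.0f} return")
```

The total number of signals Amelia receives equals the number of years Casper ages, which is the consistency check described in the text.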
It is left as an exercise (Problems 22 and 24) to consider the situation if it is Amelia who is sending the signals.

2.7 RELATIVISTIC DYNAMICS

We have seen how Einstein’s postulates have led to a new “relative” interpretation of such previously absolute concepts as length and time, and that the classical concept of absolute velocity is not valid. It is reasonable then to ask how far this revolution is to go in changing our interpretation of physical concepts. Dynamical quantities, such as momentum and kinetic energy, depend on length, time, and velocity. Do classical laws of momentum and energy conservation remain valid in Einstein’s relativity?

Let’s test the conservation laws by examining the collision shown in Figure 2.21a. Two particles collide elastically as observed in the reference frame of O′. Particle 1 of mass $m_1 = 2m$ is initially at rest, and particle 2 of mass $m_2 = m$ is moving in the negative x direction with an initial velocity of $v'_{2i} = -0.750c$. Using the classical law of momentum conservation to analyze this collision, O′ would calculate the particles to be moving with final velocities $v'_{1f} = -0.500c$ and $v'_{2f} = +0.250c$. According to O′, the total initial and final momenta of the particles would be:

$$p'_i = m_1 v'_{1i} + m_2 v'_{2i} = (2m)(0) + (m)(-0.750c) = -0.750mc$$

$$p'_f = m_1 v'_{1f} + m_2 v'_{2f} = (2m)(-0.500c) + (m)(0.250c) = -0.750mc$$

The initial and final momenta are equal according to O′, demonstrating that momentum is conserved.

Suppose that the reference frame of O′ moves at a velocity of u = +0.550c in the x direction relative to observer O, as in Figure 2.21b. How would observer O analyze this collision? We can find the initial and final velocities of the two particles according to O using the velocity transformation of Eq.
2.17, which gives the initial velocities shown in the figure ($v_{1i} = +0.550c$ and $v_{2i} = -0.340c$) and the final velocities $v_{1f} = +0.069c$ and $v_{2f} = +0.703c$.

FIGURE 2.21 (a) A collision between two particles as observed from the reference frame of O′. (b) The same collision observed from the reference frame of O.

Observer O can now calculate the initial and final values of the total momentum of the two particles:

$$p_i = m_1 v_{1i} + m_2 v_{2i} = (2m)(+0.550c) + (m)(-0.340c) = +0.760mc$$

$$p_f = m_1 v_{1f} + m_2 v_{2f} = (2m)(+0.069c) + (m)(+0.703c) = +0.841mc$$

Momentum is therefore not conserved according to observer O.

This collision experiment has shown that the law of conservation of linear momentum, with momentum defined as $\vec{p} = m\vec{v}$, does not satisfy Einstein’s first postulate (the law must be the same in all inertial frames). We cannot have a law that is valid for some observers but not for others. Therefore, if we are to retain the conservation of momentum as a general law consistent with Einstein’s first postulate, we must find a new definition of momentum. This new definition of momentum must have two properties: (1) It must yield a law of conservation of momentum that satisfies the principle of relativity; that is, if momentum is conserved according to an observer in one inertial frame, then it is conserved according to observers in all inertial frames. (2) At low speeds, the new definition must reduce to $\vec{p} = m\vec{v}$, which we know works perfectly well in the nonrelativistic case.

These requirements are satisfied by defining the relativistic momentum for a particle of mass m moving with velocity $\vec{v}$ as

$$\vec{p} = \frac{m\vec{v}}{\sqrt{1 - v^2/c^2}} \tag{2.32}$$

In terms of components, we can write Eq.
2.32 as

$$p_x = \frac{mv_x}{\sqrt{1 - v^2/c^2}} \quad\text{and}\quad p_y = \frac{mv_y}{\sqrt{1 - v^2/c^2}} \tag{2.33}$$

The velocity v that appears in the denominator of these expressions is always the velocity of the particle as measured in a particular inertial frame. It is not the velocity of an inertial frame. The velocity in the numerator can be any of the components of the velocity vector.

We can now reanalyze the collision shown in Figure 2.21 using the relativistic definition of momentum. The initial relativistic momentum according to O′ is

$$p'_i = \frac{m_1 v'_{1i}}{\sqrt{1 - v'^2_{1i}/c^2}} + \frac{m_2 v'_{2i}}{\sqrt{1 - v'^2_{2i}/c^2}} = \frac{(2m)(0)}{\sqrt{1 - 0^2}} + \frac{(m)(-0.750c)}{\sqrt{1 - (0.750)^2}} = -1.134mc$$

The final velocities according to O′ are $v'_{1f} = -0.585c$ and $v'_{2f} = +0.294c$, and the total final momentum is

$$p'_f = \frac{m_1 v'_{1f}}{\sqrt{1 - v'^2_{1f}/c^2}} + \frac{m_2 v'_{2f}}{\sqrt{1 - v'^2_{2f}/c^2}} = \frac{(2m)(-0.585c)}{\sqrt{1 - (0.585)^2}} + \frac{(m)(0.294c)}{\sqrt{1 - (0.294)^2}} = -1.134mc$$

Thus $p'_i = p'_f$, and observer O′ concludes that momentum is conserved. According to O, the initial relativistic momentum is

$$p_i = \frac{m_1 v_{1i}}{\sqrt{1 - v^2_{1i}/c^2}} + \frac{m_2 v_{2i}}{\sqrt{1 - v^2_{2i}/c^2}} = \frac{(2m)(+0.550c)}{\sqrt{1 - (0.550)^2}} + \frac{(m)(-0.340c)}{\sqrt{1 - (0.340)^2}} = 0.956mc$$

Using the velocity transformation, the final velocities measured by O are $v_{1f} = -0.051c$ and $v_{2f} = +0.727c$, and so O calculates the final momentum to be

$$p_f = \frac{m_1 v_{1f}}{\sqrt{1 - v^2_{1f}/c^2}} + \frac{m_2 v_{2f}}{\sqrt{1 - v^2_{2f}/c^2}} = \frac{(2m)(-0.051c)}{\sqrt{1 - (0.051)^2}} + \frac{(m)(+0.727c)}{\sqrt{1 - (0.727)^2}} = 0.956mc$$

Observer O also concludes that $p_i = p_f$ and that the law of conservation of momentum is valid. Defining momentum according to Eq. 2.32 gives conservation of momentum in all reference frames, as required by the principle of relativity.

Example 2.11
What is the momentum of a proton moving at a speed of v = 0.86c?

Solution
Using Eq. 2.32, we obtain

$$p = \frac{mv}{\sqrt{1 - v^2/c^2}} = \frac{(1.67 \times 10^{-27}\ \text{kg})(0.86)(3.00 \times 10^8\ \text{m/s})}{\sqrt{1 - (0.86)^2}} = 8.44 \times 10^{-19}\ \text{kg} \cdot \text{m/s}$$

The units of kg · m/s are generally not convenient in solving problems of this type. Instead, we manipulate Eq.
2.32 to obtain

$$pc = \frac{mvc}{\sqrt{1 - v^2/c^2}} = \frac{mc^2(v/c)}{\sqrt{1 - v^2/c^2}} = \frac{(938\ \text{MeV})(0.86)}{\sqrt{1 - (0.86)^2}} = 1580\ \text{MeV}$$

Here we have used the proton’s rest energy $mc^2$, which is defined later in this section. The momentum is obtained from this result by dividing by the symbol c (not its numerical value), which gives

$$p = 1580\ \text{MeV}/c$$

The units of MeV/c for momentum are often used in relativistic calculations because, as we show later, the quantity pc often appears in these calculations. You should be able to convert MeV/c to kg · m/s and show that the two results obtained for p are equivalent.

Relativistic Kinetic Energy

Like the classical definition of momentum, the classical definition of kinetic energy also causes difficulties when we try to compare the interpretations of different observers. According to O′, the initial and final kinetic energies in the collision shown in Figure 2.21a are:

$$K'_i = \tfrac{1}{2}m_1 v'^2_{1i} + \tfrac{1}{2}m_2 v'^2_{2i} = (0.5)(2m)(0)^2 + (0.5)(m)(-0.750c)^2 = 0.281mc^2$$

$$K'_f = \tfrac{1}{2}m_1 v'^2_{1f} + \tfrac{1}{2}m_2 v'^2_{2f} = (0.5)(2m)(-0.500c)^2 + (0.5)(m)(0.250c)^2 = 0.281mc^2$$

and so energy is conserved according to O′. The initial and final kinetic energies observed from the reference frame of O (as in Figure 2.21b) are

$$K_i = \tfrac{1}{2}m_1 v^2_{1i} + \tfrac{1}{2}m_2 v^2_{2i} = (0.5)(2m)(0.550c)^2 + (0.5)(m)(-0.340c)^2 = 0.360mc^2$$

$$K_f = \tfrac{1}{2}m_1 v^2_{1f} + \tfrac{1}{2}m_2 v^2_{2f} = (0.5)(2m)(0.069c)^2 + (0.5)(m)(0.703c)^2 = 0.252mc^2$$

Thus energy is not conserved in the reference frame of O if we use the classical formula for kinetic energy. This leads to a serious inconsistency: an elastic collision for one observer would not be elastic for another observer. As in the case of momentum, if we want to preserve the law of conservation of energy for all observers, we must replace the classical formula for kinetic energy with an expression that is valid in the relativistic case (but that reduces to the classical formula for low speeds).
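The frame dependence of the classical conservation laws, and the frame independence of relativistic momentum conservation, can be checked numerically for the collision of Figure 2.21. The sketch below is illustrative only: the function names are hypothetical, speeds are in units of c, masses in units of m, and the relativistic final velocities in O′ (−0.585c and +0.294c) are the rounded values quoted in the text, so the relativistic momenta agree only to about three significant figures.

```python
import math

def to_frame_O(v_prime, u):
    """Velocity in O of a particle with velocity v_prime in O' (units of c).

    O' moves at +u relative to O; this is Eq. 2.28a solved for vx.
    """
    return (v_prime + u) / (1.0 + v_prime * u)

def p_classical(m, v):
    return m * v

def p_relativistic(m, v):
    return m * v / math.sqrt(1.0 - v**2)   # Eq. 2.32 with c = 1

def ke_classical(m, v):
    return 0.5 * m * v**2

u = 0.550
masses = (2.0, 1.0)                        # m1 = 2m, m2 = m
v_init_prime = (0.0, -0.750)               # initial velocities in O'
v_final_classical_prime = (-0.500, 0.250)  # classical analysis in O'
v_final_rel_prime = (-0.585, 0.294)        # relativistic analysis in O' (rounded)

def total(func, velocities):
    """Sum a one-particle quantity over both particles."""
    return sum(func(m, v) for m, v in zip(masses, velocities))

v_init = [to_frame_O(v, u) for v in v_init_prime]
v_final_cl = [to_frame_O(v, u) for v in v_final_classical_prime]
v_final_rel = [to_frame_O(v, u) for v in v_final_rel_prime]

# Classical quantities in frame O: neither is conserved
pi_cl, pf_cl = total(p_classical, v_init), total(p_classical, v_final_cl)
Ki_cl, Kf_cl = total(ke_classical, v_init), total(ke_classical, v_final_cl)

# Relativistic momentum in frame O: conserved to within rounding
pi_rel, pf_rel = total(p_relativistic, v_init), total(p_relativistic, v_final_rel)

print(f"classical p:    {pi_cl:.3f} -> {pf_cl:.3f} (units of mc)")
print(f"classical K:    {Ki_cl:.3f} -> {Kf_cl:.3f} (units of mc^2)")
print(f"relativistic p: {pi_rel:.3f} -> {pf_rel:.3f} (units of mc)")
```

The small residual difference between the two relativistic momenta comes entirely from using final velocities rounded to three decimal places.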
We can derive the relativistic expression for the kinetic energy of a particle using essentially the same procedure used to derive the classical expression, starting with the particle form of the work-energy theorem (see Problem 28). The result of this calculation is

$$K = \frac{mc^2}{\sqrt{1 - v^2/c^2}} - mc^2 \tag{2.34}$$

Using Eq. 2.34, you can show that both O and O′ will conclude that kinetic energy is conserved. In fact, all observers will agree on the applicability of the energy conservation law using the relativistic definition for kinetic energy.

Equation 2.34 looks very different from the classical result $K = \tfrac{1}{2}mv^2$, but, as you should show (see Problem 32), Eq. 2.34 reduces to the classical expression in the limit of low speeds (v ≪ c).

The classical expression for kinetic energy also violates the second relativity postulate by allowing speeds in excess of the speed of light. There is no limit (in either classical or relativistic dynamics) to the energy we can give to a particle. Yet, if we allow the kinetic energy to increase without limit, the classical expression $K = \tfrac{1}{2}mv^2$ implies that the velocity must correspondingly increase without limit, thereby violating the second postulate. You can also see from the first term of Eq. 2.34 that K → ∞ as v → c. Thus we can increase the relativistic kinetic energy of a particle without limit, and its speed will not exceed c.

Relativistic Total Energy and Rest Energy

We can also express Eq. 2.34 as

$$K = E - E_0 \tag{2.35}$$

where the relativistic total energy E is defined as

$$E = \frac{mc^2}{\sqrt{1 - v^2/c^2}} \tag{2.36}$$

and the rest energy $E_0$ is defined as

$$E_0 = mc^2 \tag{2.37}$$

The rest energy is in effect the relativistic total energy of a particle measured in a frame of reference in which the particle is at rest.

Sometimes m in Eq. 2.37 is called the rest mass $m_0$ and is distinguished from the “relativistic mass,” which is defined as $m_0/\sqrt{1 - v^2/c^2}$. We choose not to use relativistic mass, because it can be a misleading concept.
Whenever we refer to mass, we always mean rest mass.

Equation 2.37 suggests that mass can be expressed in units of energy divided by $c^2$, such as MeV/$c^2$. For example, a proton has a rest energy of 938 MeV and thus a mass of 938 MeV/$c^2$. Just like expressing momentum in units of MeV/c, expressing mass in units of MeV/$c^2$ turns out to be very useful in calculations.

The relativistic total energy is given by Eq. 2.35 as

$$E = K + E_0 \tag{2.38}$$

Collisions of particles at high energies often result in the production of new particles, and thus the final rest energy may not be equal to the initial rest energy (see Example 2.18). Such collisions must be analyzed using conservation of total relativistic energy E; kinetic energy will not be conserved when the rest energy changes in a collision. In the special example of the elastic collision considered in this section, the identities of the particles did not change, and so kinetic energy was conserved. In general, collisions do not conserve kinetic energy; it is the relativistic total energy that is conserved in collisions.

Manipulation of Eqs. 2.32 and 2.36 gives a useful relationship among the total energy, momentum, and rest energy:

$$E = \sqrt{(pc)^2 + (mc^2)^2} \tag{2.39}$$

Figure 2.22 shows a useful mnemonic device for remembering this relationship, which has the form of the Pythagorean theorem for the sides of a right triangle.

When a particle travels at a speed close to the speed of light (say, v > 0.99c), which often occurs in high-energy particle accelerators, the particle’s kinetic energy is much greater than its rest energy; that is, $K \gg E_0$. In this case, Eq. 2.39 can be written, to a very good approximation,

$$E \cong pc \tag{2.40}$$

This is called the extreme relativistic approximation and is often useful for simplifying calculations. As v approaches c, the angle in Figure 2.22 between the bottom leg of the triangle (representing $mc^2$) and the hypotenuse (representing E) approaches 90°.
Imagine in this case a very tall triangle, in which the vertical leg (pc) and the hypotenuse (E) are nearly the same length.

For massless particles (such as photons), Eq. 2.39 becomes exactly

$$E = pc \tag{2.41}$$

All massless particles travel at the speed of light; otherwise, by Eqs. 2.34 and 2.36 their kinetic and total energies would be zero.

FIGURE 2.22 A useful mnemonic device for recalling the relationships among $E_0$, p, K, and E. Note that to put all variables in energy units, the quantity pc must be used. (Labels in the figure: legs $E_0 = mc^2$ and pc, hypotenuse E, and $\sin\theta = v/c$.)

Example 2.12
What are the kinetic and relativistic total energies of a proton ($E_0$ = 938 MeV) moving at a speed of v = 0.86c?

Solution
In Example 2.11 we found the momentum of this particle to be p = 1580 MeV/c. The total energy can be found from Eq. 2.39:

$$E = \sqrt{(pc)^2 + (mc^2)^2} = \sqrt{(1580\ \text{MeV})^2 + (938\ \text{MeV})^2} = 1837\ \text{MeV}$$

The kinetic energy follows from Eq. 2.35:

$$K = E - E_0 = 1837\ \text{MeV} - 938\ \text{MeV} = 899\ \text{MeV}$$

We also could have solved this problem by finding the kinetic energy directly from Eq. 2.34.

Example 2.13
Find the velocity and momentum of an electron ($E_0$ = 0.511 MeV) with a kinetic energy of 10.0 MeV.

Solution
The total energy is $E = K + E_0$ = 10.0 MeV + 0.511 MeV = 10.51 MeV. We then can find the momentum from Eq. 2.39:

$$p = \frac{1}{c}\sqrt{E^2 - (mc^2)^2} = \frac{1}{c}\sqrt{(10.51\ \text{MeV})^2 - (0.511\ \text{MeV})^2} = 10.5\ \text{MeV}/c$$

Note that in this problem we could have used the extreme relativistic approximation, $p \cong E/c$, from Eq. 2.40. The error we would make in this case would be only 0.1%.

The velocity can be found by solving Eq. 2.36 for v.

$$\frac{v}{c} = \sqrt{1 - \left(\frac{mc^2}{E}\right)^2} = \sqrt{1 - \left(\frac{0.511\ \text{MeV}}{10.51\ \text{MeV}}\right)^2} = 0.9988 \tag{2.42}$$

Example 2.14
In the Stanford Linear Collider electrons are accelerated to a kinetic energy of 50 GeV. Find the speed of such an electron as (a) a fraction of c, and (b) a difference from c. The rest energy of the electron is 0.511 MeV = $0.511 \times 10^{-3}$ GeV.

Solution
(a) First we solve Eq.
2.34 for v, obtaining v=c 1− 1 (1 + K/mc2 )2  v=c 1− 1 [1 + (50 GeV)/(0.511 × 10−3 GeV)]2 = 0.999 999 999 948c Solution  and thus (2.43) Calculators cannot be trusted to 12 significant digits. Here is a way to avoid this difficulty. We can write Eq. 2.43 as v = c(1 + x)1/2 , where x = −1/(1 + K/mc2 )2 . Because K ≫ mc2 , we have x ≪ 1, and we can use the binomial 2.8 | Conservation Laws in Relativistic Decays and Collisions expansion to write v ∼ = c(1 + 12 x), or 1 v∼ =c 1− 2(1 + K/mc2 )2 53 This leads to the same value of v given above. (b) From the above result, we have c − v = 5.2 × 10−11 c which gives = 0.016 m/s v∼ = c(1 − 5.2 × 10−11 ) = 1.6 cm/s Example 2.15 At a distance equal to the radius of the Earth’s orbit (1.5 × 1011 m), the Sun’s radiation has an intensity of about 1.4 × 103 W/m2 . Find the rate at which the mass of the Sun is decreasing. Solution If we assume that the Sun’s radiation is distributed uniformly over the surface area 4π r2 of a sphere of radius 1.5 × 1011 m, then the total radiative power emitted by the Sun is 11 2 3 2 4π(1.5 × 10 m) (1.4 × 10 W/m ) = 4.0 × 1026 W = 4.0 × 1026 J/s By conservation of energy, we know that the energy lost by the Sun through radiation must be accounted for by a corresponding loss in its rest energy. The change in mass m corresponding to a change in rest energy E0 of 4.0 × 1026 J each second is m = 4.0 × 1026 J E0 = = 4.4 × 109 kg c2 9.0 × 1016 m2 /s2 The Sun loses mass at a rate of about 4 billion kilograms per second! If this rate were to remain constant, the Sun (with a present mass of 2 × 1030 kg) would shine “only” for another 1013 years. 2.8 CONSERVATION LAWS IN RELATIVISTIC DECAYS AND COLLISIONS In all decays and collisions, we must apply the law of conservation of momentum. The only difference between applying this law for collisions at low speed (as we did in Example 1.1) and at high speed is the use of the relativistic expression for momentum (Eq. 2.32) instead of Eq. 1.2. 
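As a rough numerical illustration of how much the two momentum expressions can differ, the sketch below compares the classical momentum $mv$ with the relativistic expression of Eq. 2.32 for the proton of Example 2.11 ($v = 0.86c$, $mc^2 = 938$ MeV). The function names and the choice to work with $pc$ in MeV are assumptions made for this illustration and do not come from the text.

```python
import math

def classical_momentum(rest_energy_mev, beta):
    """Classical p = m v, returned as pc in MeV (beta = v/c)."""
    return rest_energy_mev * beta

def relativistic_momentum(rest_energy_mev, beta):
    """Relativistic p = m v / sqrt(1 - v^2/c^2) (Eq. 2.32), returned as pc in MeV."""
    return rest_energy_mev * beta / math.sqrt(1.0 - beta**2)

proton_rest_energy = 938.0  # MeV
beta = 0.86                 # v/c for the proton of Example 2.11

pc_classical = classical_momentum(proton_rest_energy, beta)
pc_relativistic = relativistic_momentum(proton_rest_energy, beta)

print(f"classical pc    = {pc_classical:.0f} MeV")
print(f"relativistic pc = {pc_relativistic:.0f} MeV")
```

At this speed the relativistic momentum is nearly twice the classical value, so using Eq. 1.2 in a momentum balance would give badly wrong results.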
The law of conservation of momentum for relativistic motion can be stated in exactly the same way as for classical motion: In an isolated system of particles, the total linear momentum remains constant.

In the classical case, kinetic energy is the only form of energy that is present in elastic collisions, so conservation of energy is equivalent to conservation of kinetic energy. In inelastic collisions or decay processes, the kinetic energy does not remain constant. Total energy is conserved in classical inelastic collisions, but we did not account for the other forms of energy that might be important. This missing energy is usually stored in the particles, perhaps as atomic or nuclear energy.

In the relativistic case, the internal stored energy contributes to the rest energy of the particles. Usually rest energy and kinetic energy are the only two forms of energy that we consider in atomic or nuclear processes (later we'll add the energy of radiation to this balance). A loss of kinetic energy in a collision is thus accompanied by a gain in rest energy, but the total relativistic energy (kinetic energy + rest energy) of all the particles involved in the process doesn't change. For example, in a reaction in which new particles are produced, the loss in kinetic energy of the original reacting particles gives the increase in rest energy of the product particles. On the other hand, in a nuclear decay process such as alpha decay, the initial nucleus gives up some rest energy to account for the kinetic energy carried by the decay products.

The law of energy conservation in the relativistic case is: In an isolated system of particles, the relativistic total energy (kinetic energy plus rest energy) remains constant.

In applying this law to relativistic collisions, we don't have to worry whether the collision is elastic or inelastic, because the inclusion of the rest energy accounts for any loss in kinetic energy.
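In symbols, the two conservation laws stated above can be written compactly as follows. The summation notation over initial ($i$) and final ($f$) particles is our own shorthand for illustration and is not the text's notation:

$$\sum_i \vec{p}_i = \sum_f \vec{p}_f, \qquad \sum_i \left(K_i + m_i c^2\right) = \sum_f \left(K_f + m_f c^2\right)$$

Rearranging the energy equation gives

$$\sum_f K_f - \sum_i K_i = \left(\sum_i m_i - \sum_f m_f\right)c^2$$

so any gain in total kinetic energy equals the loss in total rest energy, and vice versa. In an elastic collision both sides are zero.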
The following examples illustrate applications of the conservation laws for relativistic momentum and energy.

Example 2.16
A neutral K meson (mass 497.7 MeV/$c^2$) is moving with a kinetic energy of 77.0 MeV. It decays into a pi meson (mass 139.6 MeV/$c^2$) and another particle of unknown mass. The pi meson is moving in the direction of the original K meson with a momentum of 381.6 MeV/$c$. (a) Find the momentum and total relativistic energy of the unknown particle. (b) Find the mass of the unknown particle.

Solution
(a) The total energy and momentum of the K meson are

$$E_K = K_K + m_K c^2 = 77.0\ \text{MeV} + 497.7\ \text{MeV} = 574.7\ \text{MeV}$$

$$p_K = \frac{1}{c}\sqrt{E_K^2 - (m_K c^2)^2} = \frac{1}{c}\sqrt{(574.7\ \text{MeV})^2 - (497.7\ \text{MeV})^2} = 287.4\ \text{MeV}/c$$

and for the pi meson

$$E_\pi = \sqrt{(cp_\pi)^2 + (m_\pi c^2)^2} = \sqrt{(381.6\ \text{MeV})^2 + (139.6\ \text{MeV})^2} = 406.3\ \text{MeV}$$

Conservation of relativistic momentum ($p_{\text{initial}} = p_{\text{final}}$) gives $p_K = p_\pi + p_x$ (where $x$ represents the unknown particle), so

$$p_x = p_K - p_\pi = 287.4\ \text{MeV}/c - 381.6\ \text{MeV}/c = -94.2\ \text{MeV}/c$$

and conservation of total relativistic energy ($E_{\text{initial}} = E_{\text{final}}$) gives $E_K = E_\pi + E_x$, so

$$E_x = E_K - E_\pi = 574.7\ \text{MeV} - 406.3\ \text{MeV} = 168.4\ \text{MeV}$$

(b) We can find the mass by solving Eq. 2.39 for $mc^2$:

$$m_x c^2 = \sqrt{E_x^2 - (cp_x)^2} = \sqrt{(168.4\ \text{MeV})^2 - (94.2\ \text{MeV})^2} = 139.6\ \text{MeV}$$

Thus the unknown particle has a mass of 139.6 MeV/$c^2$, and its mass shows that it is another pi meson.

Example 2.17
In the reaction $K^- + p \rightarrow \Lambda^0 + \pi^0$, a charged K meson (mass 493.7 MeV/$c^2$) collides with a proton (938.3 MeV/$c^2$) at rest, producing a lambda particle (1115.7 MeV/$c^2$) and a neutral pi meson (135.0 MeV/$c^2$), as represented in Figure 2.23. The initial kinetic energy of the K meson is 152.4 MeV. After the interaction, the pi meson has a kinetic energy of 254.8 MeV. (a) Find the kinetic energy of the lambda. (b) Find the directions of motion of the lambda and the pi meson.

FIGURE 2.23 Example 2.17. (a) A $K^-$ meson collides with a proton at rest. (b) After the collision, a $\pi^0$ meson and a $\Lambda^0$ are produced.

Solution
(a) The initial and final total energies are

$$E_{\text{initial}} = E_K + E_p = K_K + m_K c^2 + m_p c^2$$

$$E_{\text{final}} = E_\Lambda + E_\pi = K_\Lambda + m_\Lambda c^2 + K_\pi + m_\pi c^2$$

In these two equations, the value of every quantity is known except the kinetic energy of the lambda. Using conservation of total relativistic energy, we set $E_{\text{initial}} = E_{\text{final}}$ and solve for $K_\Lambda$:

$$K_\Lambda = K_K + m_K c^2 + m_p c^2 - m_\Lambda c^2 - K_\pi - m_\pi c^2$$
$$= 152.4\ \text{MeV} + 493.7\ \text{MeV} + 938.3\ \text{MeV} - 1115.7\ \text{MeV} - 254.8\ \text{MeV} - 135.0\ \text{MeV} = 78.9\ \text{MeV}$$

(b) To find the directional information we must apply conservation of momentum. The initial momentum is just that of the K meson. From its total energy, $E_K = K_K + m_K c^2 = 152.4\ \text{MeV} + 493.7\ \text{MeV} = 646.1\ \text{MeV}$, we can find the momentum:

$$p_{\text{initial}} = p_K = \frac{1}{c}\sqrt{E_K^2 - (m_K c^2)^2} = \frac{1}{c}\sqrt{(646.1\ \text{MeV})^2 - (493.7\ \text{MeV})^2} = 416.8\ \text{MeV}/c$$

A similar procedure applied to the two final particles gives $p_\Lambda = 426.9$ MeV/$c$ and $p_\pi = 365.7$ MeV/$c$. The total momentum of the two final particles is $p_{x,\text{final}} = p_\Lambda \cos\theta + p_\pi \cos\phi$ and $p_{y,\text{final}} = p_\Lambda \sin\theta - p_\pi \sin\phi$. Conservation of momentum in the $x$ and $y$ directions gives

$$p_\Lambda \cos\theta + p_\pi \cos\phi = p_{\text{initial}} \quad \text{and} \quad p_\Lambda \sin\theta - p_\pi \sin\phi = 0$$

Here we have two equations with two unknowns ($\theta$ and $\phi$). We can eliminate $\theta$ by writing the first equation as $p_\Lambda \cos\theta = p_{\text{initial}} - p_\pi \cos\phi$, then squaring both equations and adding them. The resulting equation can be solved for $\phi$:

$$\phi = \cos^{-1}\left(\frac{p_{\text{initial}}^2 + p_\pi^2 - p_\Lambda^2}{2p_\pi p_{\text{initial}}}\right) = \cos^{-1}\left(\frac{(416.8\ \text{MeV}/c)^2 + (365.7\ \text{MeV}/c)^2 - (426.9\ \text{MeV}/c)^2}{2(365.7\ \text{MeV}/c)(416.8\ \text{MeV}/c)}\right) = 65.7°$$

From the conservation of momentum equation for the $y$ components, we have

$$\theta = \sin^{-1}\left(\frac{p_\pi \sin\phi}{p_\Lambda}\right) = \sin^{-1}\left(\frac{(365.7\ \text{MeV}/c)(\sin 65.7°)}{426.9\ \text{MeV}/c}\right) = 51.3°$$

Example 2.18
The discovery of the antiproton $\bar{p}$ (a particle with the same rest energy as a proton, 938 MeV, but with the opposite electric charge) took place in 1955 through the following reaction:

$$p + p \rightarrow p + p + p + \bar{p}$$

in which accelerated protons were incident on a target of protons at rest in the laboratory. The minimum incident kinetic energy needed to produce the reaction is called the threshold kinetic energy, for which the final particles move together as if they were a single unit (Figure 2.24). Find the threshold kinetic energy to produce antiprotons in this reaction.

FIGURE 2.24 Example 2.18. (a) A proton moving with velocity $v$ collides with another proton at rest. (b) The reaction produces three protons and an antiproton, which move together as a unit.

Solution
This problem can be solved by a straightforward application of energy and momentum conservation. Let $E_p$ and $p_p$ represent the total energy and momentum of the incident proton. Thus the initial total energy of the two protons is $E_p + m_p c^2$. Let $E_p'$ and $p_p'$ represent the total energy and momentum of each of the four final particles (which move together and thus have the same energy and momentum). We can then apply conservation of total energy:

$$E_p + m_p c^2 = 4E_p'$$

and conservation of momentum:

$$p_p = 4p_p'$$

We can write the momentum equation as $\sqrt{E_p^2 - (m_p c^2)^2} = 4\sqrt{E_p'^2 - (m_p c^2)^2}$, so now we have two equations in two unknowns ($E_p$ and $E_p'$). We eliminate $E_p'$, for example by solving the energy conservation equation for $E_p'$ and substituting into the momentum equation. The result is

$$E_p = 7m_p c^2$$

from which we can calculate the kinetic energy of the incident proton:

$$K_p = E_p - m_p c^2 = 6m_p c^2 = 6(938\ \text{MeV}) = 5628\ \text{MeV} = 5.628\ \text{GeV}$$

The Bevatron accelerator at the Lawrence Berkeley Laboratory was designed with this experiment in mind, so that it could produce a beam of protons whose energy exceeded 5.6 GeV. The discovery of the antiproton in this reaction was honored with the award of the 1959 Nobel Prize to the experimenters, Emilio Segrè and Owen Chamberlain.
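The two conservation equations of Example 2.18 can also be solved numerically rather than algebraically. The sketch below is only an illustration: the function names, the bisection bracket, and the tolerance are our own choices. It uses energy conservation to fix the energy of each final particle, then searches for the incident total energy at which momentum conservation is also satisfied.

```python
import math

def momentum_mismatch(e_incident, rest_energy):
    """pc of the incident proton minus the total pc of the four co-moving final particles.

    Energy conservation, E_p + m c^2 = 4 E_p', fixes the energy of each final
    particle; Eq. 2.39 then gives each momentum. All energies are in MeV.
    """
    e_final_each = (e_incident + rest_energy) / 4.0
    pc_incident = math.sqrt(e_incident**2 - rest_energy**2)
    pc_final_each = math.sqrt(e_final_each**2 - rest_energy**2)
    return pc_incident - 4.0 * pc_final_each

def threshold_total_energy(rest_energy, tol=1e-9):
    """Bisection search for the incident total energy where the mismatch vanishes."""
    low = 3.0 * rest_energy     # smallest E_p for which E_p' >= m c^2
    high = 100.0 * rest_energy  # assumed upper bracket, far above threshold
    while high - low > tol:
        mid = 0.5 * (low + high)
        if momentum_mismatch(mid, rest_energy) > 0.0:
            low = mid   # mismatch is positive below the threshold energy
        else:
            high = mid
    return 0.5 * (low + high)

proton_rest_energy = 938.0  # MeV
e_threshold = threshold_total_energy(proton_rest_energy)
k_threshold = e_threshold - proton_rest_energy

print(f"E_p = {e_threshold / proton_rest_energy:.3f} m_p c^2")
print(f"K_threshold = {k_threshold:.0f} MeV")
```

The search should land on the algebraic result of the example, $E_p = 7m_p c^2$ and $K_p = 5628$ MeV.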
2.9 EXPERIMENTAL TESTS OF SPECIAL RELATIVITY

Because special relativity provided such a radical departure from the notions of space and time in classical physics, it is important to perform detailed experimental tests that can clearly distinguish between the predictions of special relativity and those of classical physics. Many tests of increasing precision have been done since the theory was originally presented, and in every case the predictions of special relativity are upheld. Here we discuss a few of these tests.

Universality of the Speed of Light

The second relativity postulate asserts that the speed of light has the same value $c$ for all observers. This leads to several types of experimental tests, of which we discuss two: (1) Does the speed of light change with the direction of travel? (2) Does the speed of light change with relative motion between source and observer?

The Michelson-Morley experiment provides a test of the first type. This experiment compared the upstream-downstream and cross-stream speeds of light and concluded that they were equal within the experimental error. Equivalently, we may say that the experiment showed that there is no preferred reference frame (no ether) relative to which the speed of light must be measured. If there is an ether, the speed of the Earth through the ether is less than 5 km/s, which is much smaller than the Earth's orbital speed about the Sun, 30 km/s. We can express their result as a difference $\Delta c$ between the upstream-downstream and cross-stream speeds; the experiment showed that $\Delta c/c < 3 \times 10^{-10}$.

To reconcile the result of the Michelson-Morley experiment with classical physics, Lorentz proposed the "ether drag" hypothesis, according to which the motion of the Earth through the ether caused an electromagnetic drag that contracted the arm of the interferometer in the direction of motion.
This contraction was just enough to compensate for the difference in the upstream-downstream and cross-stream times predicted by the Galilean transformation. This hypothesis succeeds only when the two arms of the interferometer are of the same length. To test this hypothesis, a similar experiment was done in 1932 by Kennedy and Thorndike; in their experiment, the lengths of the interferometer arms differed by about 16 cm, the maximum distance over which light sources available at that time could remain coherent. The Kennedy-Thorndike experiment in effect tests the second question, whether the speed of light changes due to relative motion. Their result was $\Delta c/c < 3 \times 10^{-8}$, which excludes the Lorentz contraction hypothesis as an explanation for the Michelson-Morley experiment.

In recent years, these fundamental experiments have been repeated with considerably improved precision using lasers as light sources. Experimenters working at the Joint Institute for Laboratory Astrophysics in Boulder, Colorado, built an apparatus that consisted of two He-Ne lasers on a rotating granite platform. By electronically stabilizing the lasers, they improved the sensitivity of their apparatus by several orders of magnitude. Again expressing the result as a difference between the speeds along the two arms of the apparatus, this experiment corresponds to $\Delta c/c < 8 \times 10^{-15}$, an improvement of about 5 orders of magnitude over the original Michelson-Morley experiment. In a similar repetition of the Kennedy-Thorndike experiment using He-Ne lasers, they obtained $\Delta c/c < 1 \times 10^{-10}$, an improvement over the original experiment by a factor of 300. [See A. Brillet and J. L. Hall, Physical Review Letters 42, 549 (1979); D. Hils and J. L. Hall, Physical Review Letters 64, 1697 (1990).]
A considerable improvement in the Kennedy-Thorndike type of experiment has been made possible by comparing the oscillation frequency of a crystal with the frequency of a hydrogen maser (a maser is similar to a laser, but it uses microwaves rather than visible light). The experimenters measured for nearly one year, looking for a change in the relative frequencies as the Earth's velocity changed. No effect was observed, leading to a limit of $\Delta c/c < 2 \times 10^{-12}$. [See P. Wolf et al., Physical Review Letters 90, 060402 (2003).]

Another way of testing the second question is to measure the speed of a light beam emitted by a source in motion. Suppose we observe this beam along the direction of motion of the moving source, which might be moving toward us or away from us. In the rest frame of the source, the emitted light travels at speed $c$. We can express the speed of light in our reference frame as $c' = c + \Delta c$, where $\Delta c$ is zero according to special relativity ($c' = c$) or is $\pm u$ according to classical physics ($c' = c \pm u$ in the Galilean transformation, depending on whether the motion is toward or away from the observer).

In one experiment of this type, the decay of pi mesons (pions) into gamma rays (a form of electromagnetic waves traveling at $c$) was observed. When pions (produced in laboratories with large accelerators) emit these gamma rays, they are traveling at speeds close to the speed of light, relative to the laboratory. Thus if Galilean relativity were valid, we should expect to find gamma rays emitted in the direction of motion of the decaying pions traveling at a speed $c'$ in the laboratory of nearly $2c$, rather than always with $c$ as predicted by special relativity. The observed laboratory speed of these gamma rays in one experiment was $(2.9977 \pm 0.0004) \times 10^8$ m/s when the decaying pions were moving at $u/c = 0.99975$. These results give $\Delta c/c < 2 \times 10^{-4}$, and thus $c' = c$ as expected from special relativity.
This experiment shows directly that an object moving at a speed of nearly $c$ relative to the laboratory emits "light" that travels at a speed of $c$ relative to both the object and the laboratory, giving direct evidence for Einstein's second postulate. [See T. Alvager et al., Physics Letters 12, 260 (1964).]

Another experiment of this type is to study the X rays emitted by a binary pulsar, a rapidly pulsating source of X rays in orbit about another star, which would eclipse the pulsar as it rotated in its orbit. If the speed of light (in this case, X rays) were to change as the pulsar moved first toward and later away from the Earth in its orbit, the beginning and end of the eclipse would not be equally spaced in time from the midpoint of the eclipse. No such effect is observed, and from these observations it is concluded that $\Delta c/c < 2 \times 10^{-12}$, in agreement with predictions of special relativity. These experiments were done at $u/c = 10^{-3}$. [See K. Brecher, Physical Review Letters 39, 1051 (1977).]

A different type of test of the limit by which the speed of light changes with direction of travel can be done using the clocks carried aboard the network of Earth satellites that make up the Global Positioning System (GPS). By comparing the readings of clocks on the GPS satellites with clocks on the ground at different times of day (as the satellites move relative to the ground stations), it is possible to test whether the change in the direction of travel affects the apparent synchronization of the clocks. No effect was observed, and the experimenters were able to set a limit of $\Delta c/c < 5 \times 10^{-9}$ for the difference between the one-way and round-trip speeds of light. [See P. Wolf and G. Petit, Physical Review A 56, 4405 (1997).]

Time Dilation

We have already discussed the time dilation effect on the decay of muons produced by cosmic rays. Muon decay can also be studied in the laboratory.
Muons can be produced following collisions in high-energy accelerators, and the decay of the muons can be followed by observing their decay products (ordinary electrons). These muons can either be trapped and decay at rest, or they can be placed in a beam and decay in flight. When muons are observed at rest, their decay lifetime is 2.198 μs. (As we discuss in Chapter 12, decays generally follow an exponential law. The lifetime is the time after which a fraction $1/e = 0.368$ of the original muons remain.) This is the proper lifetime, measured in a frame of reference in which the muon is at rest. In one particular experiment, muons were trapped in a ring and circulated at a momentum of $p = 3094$ MeV/$c$. The decays in flight occurred with a lifetime of 64.37 μs (measured in the laboratory frame of reference). For muons of this momentum, Eq. 2.8 gives a dilated lifetime of 64.38 μs (see Problem 43), which is in excellent agreement with the measured value and confirms the time dilation effect. [See J. Bailey et al., Nature 268, 301 (1977).]

Another similar experiment was done with pions. The proper lifetime, measured for pions at rest, is known to be 26.0 ns. In one experiment, pions were observed in flight at $u/c = 0.913$, and their lifetime was measured to be 63.7 ns. (Pions decay to muons, so we can follow the exponential radioactive decay of the pions by observing the muons emitted as a result of the decay.) For pions moving at this speed, the expected dilated lifetime is in exact agreement with the measured value, once again confirming the time dilation effect. [See D. S. Ayres et al., Physical Review D 3, 1051 (1971).]

The Doppler Effect

Confirmation of the relativistic Doppler effect first came from experiments done in 1938 by Ives and Stilwell. They sent a beam of hydrogen atoms, generated in a gas discharge, down a tube at a speed $u$, as shown in Figure 2.25.
They could simultaneously observe light emitted by the atoms in a direction parallel to $u$ (atom 1) and opposite to $u$ (atom 2, reflected from the mirror). Using a spectrograph, the experimenters were able to photograph the characteristic spectral lines from these atoms and also, on the same photographic plate, from atoms at rest. If the classical Doppler formula were valid, the wavelengths of the lines from atoms 1 and 2 would be placed at symmetric intervals $\Delta\lambda_1 = \pm\lambda_0(u/c)$ on either side of the line from the atoms at rest (wavelength $\lambda_0$), as in Figure 2.25b. The relativistic Doppler formula, on the other hand, gives a small additional asymmetric shift $\Delta\lambda_2 = +\frac{1}{2}\lambda_0(u/c)^2$, as in Figure 2.25c (computed for $u \ll c$, so that higher-order terms in $u/c$ can be neglected). Figure 2.26 shows the results of Ives and Stilwell for one of the hydrogen lines (the blue line of the Balmer series at $\lambda_0 = 486$ nm). The agreement between the observed values and those predicted by the relativistic formula is impressive.

FIGURE 2.25 (a) Apparatus used in the Ives-Stilwell experiment. (b) Line spectrum expected from classical Doppler effect. (c) Line spectrum expected from relativistic Doppler effect.

FIGURE 2.26 Results of the Ives-Stilwell experiment. According to classical theory, $\Delta\lambda_2 = 0$, while according to special relativity, $\Delta\lambda_2$ depends on $(u/c)^2$. The solid line, which represents the relativistic formula, gives excellent agreement with the data points. (Axes: $\Delta\lambda_2$ in units of $10^{-3}$ nm versus $u/c$ in units of $10^{-3}$.)

Recent experiments with lasers have verified the relativistic formula at greater accuracy. These experiments are based on the absorption of laser light by an atom; when the radiation is absorbed, the atom changes from its lowest-energy state (the ground state) to one of its excited states.
The experiment consists essentially of comparing the laser wavelength needed to excite atoms at rest with that needed for atoms in motion. One experiment used a beam of hydrogen atoms with kinetic energy 800 MeV (corresponding to $u/c = 0.84$) produced in a high-energy proton accelerator. An ultraviolet laser was used to excite the atoms. This experiment verified the relativistic Doppler effect to an accuracy of about $3 \times 10^{-4}$. [See D. W. MacArthur et al., Physical Review Letters 56, 282 (1986).] In another experiment, a beam of neon atoms moving with a speed of $u = 0.0036c$ was irradiated with light from a tunable dye laser. This experiment verified the relativistic Doppler shift to a precision of $2 \times 10^{-6}$. [See R. W. McGowan et al., Physical Review Letters 70, 251 (1993).] A more recent study used two tunable dye lasers parallel and antiparallel to a beam of lithium atoms moving at $0.064c$. The results of this experiment agreed with the relativistic Doppler formula to within a precision of $2 \times 10^{-7}$, improving on the best previous results by an order of magnitude. [See G. Saathoff et al., Physical Review Letters 91, 190403 (2003).]

Relativistic Momentum and Energy

The earliest direct confirmation of the relativistic relationship for energy and momentum came just a few years after Einstein's 1905 paper. Simultaneous measurements were made of the momentum and velocity of high-energy electrons emitted in certain radioactive decay processes (nuclear beta decay, which is discussed in Chapter 12). Figure 2.27 shows the results of several different investigations plotted as $p/mv$, which should have the value 1 according to classical physics. The results agree with the relativistic formula and disagree with the classical one.
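To get a feel for the size of the effect plotted in Figure 2.27, the following sketch tabulates the relativistic prediction $p/mv = 1/\sqrt{1 - v^2/c^2}$ at a few speeds; the classical prediction is 1 at every speed. The function name and the sample speeds are choices made for this illustration only.

```python
import math

def momentum_ratio(beta):
    """Relativistic p/(m v) = 1 / sqrt(1 - beta^2), with beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta**2)

# Tabulate the ratio and its excess over the classical value of 1.
for beta in (0.01, 0.1, 0.3, 0.5, 0.8):
    ratio = momentum_ratio(beta)
    excess_percent = 100.0 * (ratio - 1.0)
    print(f"v/c = {beta:4.2f}   p/mv = {ratio:.4f}   excess over classical = {excess_percent:.2f}%")
```

The excess is about half a percent at $v = 0.1c$ and grows to roughly 67% at $v = 0.8c$, the upper end of the speed range in Figure 2.27.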
Note that the relativistic and classical formulas give the same results at low speeds, and in fact the two cannot be distinguished for speeds below $0.1c$, which accounts for our failure to observe these effects in experiments with ordinary laboratory objects.

FIGURE 2.27 The ratio $p/mv$ is plotted for electrons of various speeds. The data agree with the relativistic result and not at all with the nonrelativistic result ($p/mv = 1$). (Curves: relativistic, $p/mv = 1/\sqrt{1 - v^2/c^2}$; nonrelativistic, $p/mv = 1$. Horizontal axis: velocity $v/c$.)

Other more recent experiments, in which the kinetic energies of fast electrons were measured, are shown in Figure 2.28. Once again, the data at high speeds agree with special relativity and disagree with the classical equations. In a more extreme example, experimenters at the Stanford Linear Accelerator Center measured the speed of 20 GeV electrons, whose speed is within $5 \times 10^{-10}$ of the speed of light (or about 0.15 m/s less than $c$). The measurement was not capable of this level of precision, but it did determine that the speed of the electrons was within $2 \times 10^{-7}$ of the speed of light (60 m/s). [See Z. G. T. Guiragossian et al., Physical Review Letters 34, 335 (1975).]

Nearly every time the nuclear or particle physicist enters the laboratory, a direct or indirect test of the momentum and energy relationships of special relativity is made. Principles of special relativity must be incorporated in the design of the high-energy accelerators used by nuclear and particle physicists, so even the construction of these projects gives testimony to the validity of the formulas of special relativity.

For example, consider the capture of a neutron by an atom of hydrogen to form an atom of deuterium or "heavy hydrogen." Energy is released in this process, mostly in the form of electromagnetic radiation (gamma rays).
The energy of the gamma rays is measured to be 2.224 MeV. Where does this energy come from? It comes from the difference in mass when the hydrogen and neutron combine to form deuterium. The difference between the initial and final masses is:

$$\Delta m = m(\text{hydrogen}) + m(\text{neutron}) - m(\text{deuterium}) = 1.007825\ \text{u} + 1.008665\ \text{u} - 2.014102\ \text{u} = 0.002388\ \text{u}$$

The initial mass of hydrogen plus neutron is greater than the final mass of deuterium by 0.002388 u. The energy equivalent of this change in mass is

$$\Delta E = (\Delta m)c^2 = 2.224\ \text{MeV}$$

which is equal to the energy released as gamma rays.

FIGURE 2.28 Confirmation of relativistic kinetic energy relationships. In (a) and (b) the momentum and energy of radioactive decay electrons were measured simultaneously. In these two independent experiments, the data were plotted in different ways, but the results are clearly in good agreement with the relativistic relationships and in poor agreement with the classical, nonrelativistic relationships. In (c) electrons were accelerated to a fixed energy through a large electric field (up to 4.5 million volts, as shown) and the velocities of the electrons were determined by measuring the flight time over 8.4 m. Notice that at small kinetic energies ($K \ll mc^2$), the relativistic and nonrelativistic relationships become identical. (Panels: (a) $p^2/2K$ versus kinetic energy, with nonrelativistic $p^2/2K = m$ and relativistic $p^2/2K = m + K/2c^2$; (b) kinetic energy versus momentum, with nonrelativistic $K = p^2/2m$ and relativistic $K = \sqrt{p^2c^2 + m^2c^4} - mc^2$; (c) $(v/c)^2$ versus kinetic energy, with nonrelativistic $K = \frac{1}{2}mv^2$ and relativistic $K = mc^2\left[1/\sqrt{1 - (v/c)^2} - 1\right]$.) [Sources: (a) K. N. Geller and R. Kollarits, Am. J. Phys. 40, 1125 (1972); (b) S. Parker, Am. J. Phys. 40, 241 (1972); (c) W. Bertozzi, Am. J. Phys. 32, 551 (1964).]
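The mass-difference arithmetic for neutron capture on hydrogen can be checked in a few lines. The sketch below assumes the conversion 1 u = 931.494 MeV/$c^2$, a standard value that is not quoted in this passage; the variable names are illustrative.

```python
U_TO_MEV = 931.494  # MeV per atomic mass unit (assumed standard conversion, not from the text)

mass_hydrogen = 1.007825   # u
mass_neutron = 1.008665    # u
mass_deuterium = 2.014102  # u

# Initial mass minus final mass, then its energy equivalent (Delta m) c^2.
delta_m = mass_hydrogen + mass_neutron - mass_deuterium  # u
energy_released = delta_m * U_TO_MEV                     # MeV

print(f"mass difference = {delta_m:.6f} u")
print(f"energy released = {energy_released:.3f} MeV")
```

To the precision quoted, the result agrees with the measured gamma-ray energy of 2.224 MeV.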
Similar experiments have been done to test the $E = mc^2$ relationship by measuring the energy released as gamma rays following the capture of neutrons by atoms of silicon and sulfur, and comparing the gamma-ray energies with the difference between the initial and final masses. These experiments are consistent with $E = mc^2$ to a precision of about $4 \times 10^{-7}$. [See S. Rainville et al., Nature 438, 1096 (2005).]

Twin Paradox

Although we cannot perform the experiment to test the twin paradox as we have described it, we can do an equivalent experiment. We take two clocks in our laboratory and synchronize them carefully. We then place one of the clocks in an airplane and fly it around the Earth. When we return the clock to the laboratory and compare the two clocks, we expect to find, if special relativity is correct, that the clock that has left the laboratory is the "younger" one; that is, it will have ticked away fewer seconds and appear to run behind its stationary twin. In this experiment, we must use very precise clocks based on the atomic vibrations of cesium in order to measure the time differences between the clock readings, which amount to only about $10^{-7}$ s. This experiment is complicated by several factors, all of which can be computed rather precisely: the rotating Earth is not an inertial frame (there is a centripetal acceleration), clocks on the surface of the Earth are already moving because of the rotation of the Earth, and the general theory of relativity predicts that a change in the gravitational field strength, which our moving clock will experience as it changes altitude in its airplane flight, will also change the rate at which the clock runs. In this experiment, as in the others we have discussed, the results are entirely in agreement with the predictions of special relativity. [See J. C. Hafele and R. E. Keating, Science 177, 166 (1972).]
In a similar experiment, a cesium atomic clock carried on the space shuttle was compared with an identical clock on the Earth. The comparison was made through a radio link between the shuttle and the ground station. At an orbital height of about 328 km, the shuttle moves at a speed of about 7712 m/s, or $2.5 \times 10^{-5}c$. A clock moving at this speed runs slower than an identical clock at rest by the time dilation factor. For every second the clock is in orbit, it loses 330 ps relative to the clock on Earth; equivalently, it loses about 1.8 μs per orbit. These time intervals can be measured with great precision, and the predicted asymmetric aging was verified to a precision of about 0.1%. [See E. Sappl, Naturwissenschaften 77, 325 (1990).]

Chapter Summary

Galilean relativity: $x' = x - ut$, $v_x' = v_x - u$ (Section 2.1)
Einstein's postulates: (1) The laws of physics are the same in all inertial frames. (2) The speed of light has the same value $c$ in all inertial frames. (Section 2.3)
Time dilation: $\Delta t = \dfrac{\Delta t_0}{\sqrt{1 - u^2/c^2}}$ ($\Delta t_0$ = proper time) (Section 2.4)
Length contraction: $L = L_0\sqrt{1 - u^2/c^2}$ ($L_0$ = proper length) (Section 2.4)
Velocity addition: $v = \dfrac{v' + u}{1 + v'u/c^2}$ (Section 2.4)
Doppler effect (source and observer separating): $f' = f\sqrt{\dfrac{1 - u/c}{1 + u/c}}$ (Section 2.4)
Lorentz transformation: $x' = \dfrac{x - ut}{\sqrt{1 - u^2/c^2}}$, $y' = y$, $z' = z$, $t' = \dfrac{t - (u/c^2)x}{\sqrt{1 - u^2/c^2}}$ (Section 2.5)
Lorentz velocity transformation: $v_x' = \dfrac{v_x - u}{1 - v_x u/c^2}$, $v_y' = \dfrac{v_y\sqrt{1 - u^2/c^2}}{1 - v_x u/c^2}$, $v_z' = \dfrac{v_z\sqrt{1 - u^2/c^2}}{1 - v_x u/c^2}$ (Section 2.5)
Clock synchronization: $\Delta t' = \dfrac{uL/c^2}{\sqrt{1 - u^2/c^2}}$ (Section 2.5)
Relativistic momentum: $\vec{p} = \dfrac{m\vec{v}}{\sqrt{1 - v^2/c^2}}$ (Section 2.7)
Relativistic kinetic energy: $K = \dfrac{mc^2}{\sqrt{1 - v^2/c^2}} - mc^2$ (Section 2.7)
Rest energy: $E_0 = mc^2$ (Section 2.7)
Relativistic total energy: $E = K + E_0 = \dfrac{mc^2}{\sqrt{1 - v^2/c^2}}$ (Section 2.7)
Momentum-energy relationship: $E = \sqrt{(pc)^2 + (mc^2)^2}$ (Section 2.7)
Extreme relativistic approximation: $E \cong pc$ (Section 2.7)
Conservation laws: In an isolated system of particles, the total momentum and the relativistic total energy remain constant. (Section 2.8)

Questions

1.
Explain in your own words what is meant by the term "relativity." Are there different theories of relativity?
2. Suppose the two observers and the rock described in the first paragraph of Section 2.1 were isolated in interstellar space. Discuss the two observers' differing perceptions of the motion of the rock. Is there any experiment they can do to determine whether the rock is moving in any absolute sense?
3. Describe the situation of Figure 2.4 as it would appear from the reference frame of $O'$.
4. Does the Michelson-Morley experiment show that the ether does not exist or that it is merely unnecessary?
5. Suppose we made a pair of shears in which the cutting blades were many orders of magnitude longer than the handle. Let us in fact make them so long that, when we move the handles at angular velocity $\omega$, a point on the tip of the blade has a tangential velocity $v = \omega r$ that is greater than $c$. Does this contradict special relativity? Justify your answer.
6. Light travels through water at a speed of about $2.25 \times 10^8$ m/s. Is it possible for a particle to travel through water at a speed $v$ greater than $2.25 \times 10^8$ m/s?
7. Is it possible to have particles that travel at the speed of light? What does Eq. 2.36 require of such particles?
8. How does relativity combine space and time coordinates into spacetime?
9. Einstein developed the relativity theory after trying unsuccessfully to imagine how a light beam would look to an observer traveling with the beam at speed $c$. Why is this so difficult to imagine?
10. Explain in your own words the terms time dilation and length contraction.
11. Does the Moon's disk appear to be a different size to a space traveler approaching it at $v = 0.99c$, compared with the view of a person at rest at the same location?
12. According to the time dilation effect, would the life expectancy of someone who lives at the equator be longer or shorter than someone who lives at the North Pole? By how much?
13. Criticize the following argument. “Here is a way to travel faster than light. Suppose a star is 10 light-years away. A radio signal sent from Earth would need 20 years to make the round trip to the star. If I were to travel to the star in my rocket at v = 0.8c, to me the distance to the star is contracted by $\sqrt{1 - (0.8)^2}$ to 6 light-years, and at that speed it would take me 6 light-years/0.8c = 7.5 years to travel there. The round trip takes me only 15 years, and therefore I travel faster than light, which takes 20 years.”
14. Is it possible to synchronize clocks that are in motion relative to each other? Try to design a method to do so. Which observers will believe the clocks to be synchronized?
15. Suppose event A causes event B. To one observer, event A comes before event B. Is it possible that in another frame of reference event B could come before event A? Discuss.
16. Is mass a conserved quantity in classical physics? In special relativity?
17. “In special relativity, mass and energy are equivalent.” Discuss this statement and give examples.
18. Which is more massive, an object at low temperature or the same object at high temperature? A spring at its natural length or the same spring under compression? A container of gas at low pressure or at high pressure? A charged capacitor or an uncharged one?
19. Could a collision be elastic in one frame of reference and inelastic in another?
20. (a) What properties of nature would be different if there were a relativistic transformation law for electric charge? (b) What experiments could be done to prove that electric charge does not change with velocity?

Problems

2.1 Classical Relativity

1. You are piloting a small airplane in which you want to reach a destination that is 750 km due north of your starting location. Once you are airborne, you find that (due to a strong but steady wind) to maintain a northerly course you must point the nose of the plane at an angle that is 22° west of true north.
From previous flights on this route in the absence of wind, you know that it takes you 3.14 h to make the journey. With the wind blowing, you find that it takes 4.32 h. A fellow pilot calls you to ask about the wind velocity (magnitude and direction). What is your report?
2. A moving sidewalk 95 m in length carries passengers at a speed of 0.53 m/s. One passenger has a normal walking speed of 1.24 m/s. (a) If the passenger stands on the sidewalk without walking, how long does it take her to travel the length of the sidewalk? (b) If she walks at her normal walking speed on the sidewalk, how long does it take to travel the full length? (c) When she reaches the end of the sidewalk, she suddenly realizes that she left a package at the opposite end. She walks rapidly back along the sidewalk at double her normal walking speed to retrieve the package. How long does it take her to reach the package?

2.2 The Michelson-Morley Experiment

3. A shift of one fringe in the Michelson-Morley experiment corresponds to a change in the round-trip travel time along one arm of the interferometer by one period of vibration of light (about $2 \times 10^{-15}$ s) when the apparatus is rotated by 90°. Based on the results of Example 2.3, what velocity through the ether would be deduced from a shift of one fringe? (Take the length of the interferometer arm to be 11 m.)

2.4 Consequences of Einstein’s Postulates

4. The distance from New York to Los Angeles is about 5000 km and should take about 50 h in a car driving at 100 km/h. (a) How much shorter than 5000 km is the distance according to the car travelers? (b) How much less than 50 h do they age during the trip?
5. How fast must an object move before its length appears to be contracted to one-half its proper length?
6. An astronaut must journey to a distant planet, which is 200 light-years from Earth. What speed will be necessary if the astronaut wishes to age only 10 years during the round trip?
7.
The proper lifetime of a certain particle is 100.0 ns. (a) How long does it live in the laboratory if it moves at v = 0.960c? (b) How far does it travel in the laboratory during that time? (c) What is the distance traveled in the laboratory according to an observer moving with the particle?
8. High-energy particles are observed in laboratories by photographing the tracks they leave in certain detectors; the length of the track depends on the speed of the particle and its lifetime. A particle moving at 0.995c leaves a track 1.25 mm long. What is the proper lifetime of the particle?
9. Carry out the missing steps in the derivation of Eq. 2.17.
10. Two spaceships approach the Earth from opposite directions. According to an observer on the Earth, ship A is moving at a speed of 0.753c and ship B at a speed of 0.851c. What is the velocity of ship A as observed from ship B? Of ship B as observed from ship A?
11. Rocket A leaves a space station with a speed of 0.826c. Later, rocket B leaves in the same direction with a speed of 0.635c. What is the velocity of rocket A as observed from rocket B?
12. One of the strongest emission lines observed from distant galaxies comes from hydrogen and has a wavelength of 122 nm (in the ultraviolet region). (a) How fast must a galaxy be moving away from us in order for that line to be observed in the visible region at 366 nm? (b) What would be the wavelength of the line if that galaxy were moving toward us at the same speed?
13. A physics professor claims in court that the reason he went through the red light (λ = 650 nm) was that, due to his motion, the red color was Doppler shifted to green (λ = 550 nm). How fast was he going?

2.5 The Lorentz Transformation

14. Derive the Lorentz velocity transformations for $v'_x$ and $v'_z$.
15. Observer O fires a light beam in the y direction ($v_y = c$). Use the Lorentz velocity transformation to find $v'_x$ and $v'_y$ and show that O′ also measures the value c for the speed of light.
Assume that O′ moves relative to O with velocity u in the x direction.
16. A light bulb at point x in the frame of reference of O blinks on and off at intervals $\Delta t = t_2 - t_1$. Observer O′, moving relative to O at speed u, measures the interval to be $\Delta t' = t'_2 - t'_1$. Use the Lorentz transformation expressions to derive the time dilation expression relating $\Delta t$ and $\Delta t'$.
17. A neutral K meson at rest decays into two π mesons, which travel in opposite directions along the x axis with speeds of 0.828c. If instead the K meson were moving in the positive x direction with a velocity of 0.486c, what would be the velocities of the two π mesons?
18. A rod in the reference frame of observer O makes an angle of 31° with the x axis. According to observer O′, who is in motion in the x direction with velocity u, the rod makes an angle of 46° with the x axis. Find the velocity u.
19. According to observer O, two events occur separated by a time interval $\Delta t = +0.465$ μs and at locations separated by $\Delta x = +53.4$ m. (a) According to observer O′, who is in motion relative to O at a speed of 0.762c in the positive x direction, what is the time interval between the two events? (b) What is the spatial separation between the two events, according to O′?
20. According to observer O, a blue flash occurs at $x_b = 10.4$ m when $t_b = 0.124$ μs, and a red flash occurs at $x_r = 23.6$ m when $t_r = 0.138$ μs. According to observer O′, who is in motion relative to O at velocity u, the two flashes appear to be simultaneous. Find the velocity u.

2.6 The Twin Paradox

21. Suppose the speed of light were 1000 mi/h. You are traveling on a flight from Los Angeles to Boston, a distance of 3000 mi. The plane’s speed is a constant 600 mi/h. You leave Los Angeles at 10:00 A.M., as indicated by your wristwatch and by a clock in the airport. (a) According to your watch, what time is it when you land in Boston?
(b) In the Boston airport is a clock that is synchronized to read exactly the same time as the clock in the Los Angeles airport. What time does that clock read when you land in Boston? (c) The following day when the Boston clock that records Los Angeles time reads 10:00 A.M., you leave Boston to return to Los Angeles on the same airplane. When you land in Los Angeles, what are the times read on your watch and on the airport clock?
22. Suppose rocket traveler Amelia has a clock made on Earth. Every year on her birthday she sends a light signal to brother Casper on Earth. (a) At what rate does Casper receive the signals during Amelia’s outward journey? (b) At what rate does he receive the signals during her return journey? (c) How many of Amelia’s birthday signals does Casper receive during the journey that he measures to last 20 years?
23. Suppose Amelia traveled at a speed of 0.80c to a star that (according to Casper on Earth) is 8.0 light-years away. Casper ages 20 years during Amelia’s round trip. How much younger than Casper is Amelia when she returns to Earth?
24. Make a drawing similar to Figure 2.20 showing the worldlines of Casper and Amelia from Casper’s frame of reference. Divide the worldline for Amelia’s outward journey into 8 equal segments (for the 8 birthdays that Amelia celebrates). For each birthday, draw a line that represents a light signal that Amelia sends to Casper on her birthday. Do the same for Amelia’s return journey. (a) According to Casper’s time, when does he receive the signal showing Amelia celebrating her 8th birthday after leaving Earth? (b) How long does it take for Casper to receive the signals showing Amelia celebrating birthdays 9 through 16?

2.7 Relativistic Dynamics

25. (a) Using the relativistically correct final velocities for the collision shown in Figure 2.21a ($v'_{1f} = -0.585c$, $v'_{2f} = +0.294c$), show that relativistic kinetic energy is conserved according to observer O′. (b) Using the relativistically correct final velocities for the collision shown in Figure 2.21b ($v_{1f} = -0.051c$, $v_{2f} = +0.727c$), show that relativistic kinetic energy is conserved according to observer O.
26. Find the momentum, kinetic energy, and total energy of a proton moving at a speed of 0.756c.
27. An electron is moving with a kinetic energy of 1.264 MeV. What is its speed?
28. The work-energy theorem relates the change in kinetic energy of a particle to the work done on it by an external force: $\Delta K = W = \int F\,dx$. Writing Newton’s second law as $F = dp/dt$, show that $W = \int v\,dp$ and integrate by parts using the relativistic momentum to obtain Eq. 2.34.
29. For what range of velocities of a particle of mass m can we use the classical expression for kinetic energy $\tfrac{1}{2}mv^2$ to within an accuracy of 1%?
30. For what range of velocities of a particle of mass m can we use the extreme relativistic approximation E = pc to within an accuracy of 1%?
31. Use Eqs. 2.32 and 2.36 to derive Eq. 2.39.
32. Use the binomial expansion $(1 + x)^n = 1 + nx + [n(n - 1)/2!]x^2 + \cdots$ to show that Eq. 2.34 for the relativistic kinetic energy reduces to the classical expression $\tfrac{1}{2}mv^2$ when $v \ll c$. This important result shows that our familiar expressions are correct at low speeds. By evaluating the first term in the expansion beyond $\tfrac{1}{2}mv^2$, find the speed necessary before the classical expression is off by 0.01%.
33. (a) According to observer O, a certain particle has a momentum of 817 MeV/c and a total relativistic energy of 1125 MeV. What is the rest energy of this particle? (b) An observer O′ in a different frame of reference measures the momentum of this particle to be 953 MeV/c. What does O′ measure for the total relativistic energy of the particle?
34. An electron is moving at a speed of 0.81c. By how much must its kinetic energy increase to raise its speed to 0.91c?
35. What is the change in mass when 1 g of copper is heated from 0 to 100°C? The specific heat capacity of copper is 0.40 J/g · K.
36. Find the kinetic energy of an electron moving at a speed of (a) $v = 1.00 \times 10^{-4}c$; (b) $v = 1.00 \times 10^{-2}c$; (c) v = 0.300c; (d) v = 0.999c.
37. An electron and a proton are each accelerated starting from rest through a potential difference of 10.0 million volts. Find the momentum (in MeV/c) and the kinetic energy (in MeV) of each, and compare with the results of using the classical formulas.
38. In a nuclear reactor, each atom of uranium (of atomic mass 235 u) releases about 200 MeV when it fissions. What is the change in mass when 1.00 kg of uranium-235 is fissioned?

2.8 Conservation Laws in Relativistic Decays and Collisions

39. A π meson of rest energy 139.6 MeV moving at a speed of 0.906c collides with and sticks to a proton of rest energy 938.3 MeV that is at rest. (a) Find the total relativistic energy of the resulting composite particle. (b) Find the total linear momentum of the composite particle. (c) Using the results of (a) and (b), find the rest energy of the composite particle.
40. An electron and a positron (an antielectron) make a head-on collision, each moving at v = 0.99999c. In the collision the electrons disappear and are replaced by two muons ($mc^2 = 105.7$ MeV), which move off in opposite directions. What is the kinetic energy of each of the muons?
41. It is desired to create a particle of mass 9700 MeV/$c^2$ in a head-on collision between a proton and an antiproton (each having a mass of 938.3 MeV/$c^2$) traveling at the same speed. What speed is necessary for this to occur?
42. A particle of rest energy $mc^2$ is moving with speed v in the positive x direction. The particle decays into two particles, each of rest energy 140 MeV. One particle, with kinetic energy 282 MeV, moves in the positive x direction, and the other particle, with kinetic energy 25 MeV, moves in the negative x direction. Find the rest energy of the original particle and its speed.

2.9 Experimental Tests of Special Relativity

43.
In the muon decay experiment discussed in Section 2.9 as a verification of time dilation, the muons move in the lab with a momentum of 3094 MeV/c. Find the dilated lifetime in the laboratory frame. (The proper lifetime is 2.198 μs.)
44. Derive the relativistic expression $p^2/2K = m + K/2c^2$, which is plotted in Figure 2.28a.

General Problems

45. Suppose we want to send an astronaut on a round trip to visit a star that is 200 light-years distant and at rest with respect to Earth. The life support systems on the spacecraft enable the astronaut to survive at most 20 years. (a) At what speed must the astronaut travel to make the round trip in 20 years of spacecraft time? (b) How much time passes on Earth during the round trip?
46. A “cause” occurs at point 1 ($x_1$, $t_1$) and its “effect” occurs at point 2 ($x_2$, $t_2$). Use the Lorentz transformation to find $t'_2 - t'_1$, and show that $t'_2 - t'_1 > 0$; that is, O′ can never see the “effect” coming before its “cause.”
47. Observer O sees a red flash of light at the origin at t = 0 and a blue flash of light at x = 3.26 km at a time t = 7.63 μs. What are the distance and the time interval between the flashes according to observer O′, who moves relative to O in the direction of increasing x with a speed of 0.625c? Assume that the origins of the two coordinate systems line up at t = t′ = 0.
48. Several spacecraft (A, B, C, and D) leave a space station at the same time. Relative to an observer on the station, A travels at 0.60c in the x direction, B at 0.50c in the y direction, C at 0.50c in the negative x direction, and D at 0.50c at 45° between the y and negative x directions. Find the velocity components, directions, and speeds of B, C, and D as observed from A.
49. Observer O sees a light turn on at x = 524 m when t = 1.52 μs. Observer O′ is in motion at a speed of 0.563c in the positive x direction. The two frames of reference are synchronized so that their origins match up (x = x′ = 0) at t = t′ = 0. (a) At what time does the light turn on according to O′? (b) At what location does the light turn on in the reference frame of O′?
50. Suppose an observer O measures a particle of mass m moving in the x direction to have speed v, energy E, and momentum p. Observer O′, moving at speed u in the x direction, measures v′, E′, and p′ for the same object. (a) Use the Lorentz velocity transformation to find E′ and p′ in terms of m, u, and v. (b) Reduce $E'^2 - (p'c)^2$ to its simplest form and interpret the result.
51. Repeat Problem 50 for the mass moving in the y direction according to O. The velocity u of O′ is still along the x direction.
52. Consider again the situation described in Section 2.6. Amelia’s friend Bernice leaves Earth at the same time as Amelia and travels in the same direction at the same speed, but Bernice continues in the original direction when Amelia reaches the planet and turns her ship around. (a) From Bernice’s frame of reference, Casper is moving at a velocity of −0.60c. Draw Casper’s worldline in Bernice’s frame of reference. (b) Casper celebrates 20 birthdays during Amelia’s journey. In Bernice’s frame of reference, how long does it take for Casper to celebrate 20 birthdays? (c) In Bernice’s frame of reference, draw a worldline representing Amelia’s outbound journey to the planet. (d) Calculate Amelia’s velocity during her return journey as observed from Bernice’s frame of reference, and draw a worldline showing Amelia’s return journey. Amelia’s and Casper’s worldlines should intersect when Amelia returns to Earth. (e) Divide Casper’s worldline into 20 segments, representing his birthdays. He sends a light signal to Amelia on each birthday. Amelia receives a light signal from Casper just as she arrives at the planet. On which birthday did Casper send this signal? (f) Amelia sends Casper a light signal on her 8th birthday. Draw a line on your diagram representing this light signal. When does Casper receive this signal?
53. Electrons are accelerated to high speeds by a two-stage machine. The first stage accelerates the electrons from rest to v = 0.99c. The second stage accelerates the electrons from 0.99c to 0.999c. (a) How much energy does the first stage add to the electrons? (b) How much energy does the second stage add in increasing the velocity by only 0.9%?
54. A beam of $1.35 \times 10^{11}$ electrons/s moving at a speed of 0.732c strikes a block of copper that is used as a beam stop. The copper block is a cube measuring 2.54 cm on edge. What is the temperature increase of the block after one hour?
55. An electron moving at a speed of $v_i = 0.960c$ in the positive x direction collides with another electron at rest. After the collision, one electron is observed to move with a speed of $v_{1f} = 0.956c$ at an angle of $\theta_1 = 9.7°$ with the x axis. (a) Use conservation of momentum to find the velocity (magnitude and direction) of the second electron. (b) Based only on the original data given in the problem, use conservation of energy to find the speed of the second electron.
56. A pion has a rest energy of 135 MeV. It decays into two gamma ray photons, bursts of electromagnetic radiation that travel at the speed of light. A pion moving through the laboratory at v = 0.98c decays into two gamma ray photons of equal energies, making equal angles θ with the original direction of motion. Find the angle θ and the energies of the two gamma ray photons.
57. Consider again the decay described in Example 2.16 and determine the energies of the two pi mesons emitted in the decay of the K meson by first making a Lorentz transformation to a reference frame in which the initial K meson is at rest. When a K meson at rest decays into two pi mesons, they move in opposite directions with equal and opposite velocities, so they share the decay energy equally. Find the energies and velocities of the two pi mesons in the K meson’s rest frame. Then transform back to the lab frame to find their kinetic energies.
Chapter 3 THE PARTICLELIKE PROPERTIES OF ELECTROMAGNETIC RADIATION

Thermal emission, the radiation emitted by all objects due to their temperatures, laid the groundwork for the development of quantum mechanics around the beginning of the 20th century. Today we use thermography for many applications, including the study of heat loss by buildings, medical diagnostics, night vision and other surveillance, and monitoring potential volcanoes.

We now turn to a discussion of wave mechanics, the second theory on which modern physics is based. One consequence of wave mechanics is the breakdown of the classical distinction between particles and waves. In this chapter we consider the three early experiments that provided evidence that light, which we usually regard as a wave phenomenon, has properties that we normally associate with particles. Instead of spreading its energy smoothly over a wave front, the energy is delivered in concentrated bundles like particles; a discrete bundle (quantum) of electromagnetic energy is known as a photon.

Before we begin to discuss the experimental evidence that supports the existence of the photon and the particlelike properties of light, we first review some of the properties of electromagnetic waves.

3.1 REVIEW OF ELECTROMAGNETIC WAVES

An electromagnetic field is characterized by its electric field $\vec{E}$ and magnetic field $\vec{B}$. For example, the electric field at a distance r from a point charge q at the origin is

$$\vec{E} = \frac{1}{4\pi\varepsilon_0}\frac{q}{r^2}\,\hat{r} \tag{3.1}$$

where $\hat{r}$ is a unit vector in the radial direction. The magnetic field at a distance r from a long, straight, current-carrying wire along the z axis is

$$\vec{B} = \frac{\mu_0 i}{2\pi r}\,\hat{\phi} \tag{3.2}$$

where $\hat{\phi}$ is the unit vector in the azimuthal direction (in the xy plane) in cylindrical coordinates.

If the charges are accelerated, or if the current varies with time, an electromagnetic wave is produced, in which $\vec{E}$ and $\vec{B}$ vary not only with $\vec{r}$ but also with t.
The mathematical expression that describes such a wave may have many different forms, depending on the properties of the source of the wave and of the medium through which the wave travels. One special form is the plane wave, in which the wave fronts are planes. (A point source, on the other hand, produces spherical waves, in which the wave fronts are spheres.) A plane electromagnetic wave traveling in the positive z direction is described by the expressions

$$\vec{E} = \vec{E}_0 \sin(kz - \omega t), \qquad \vec{B} = \vec{B}_0 \sin(kz - \omega t) \tag{3.3}$$

where the wave number k is found from the wavelength λ (k = 2π/λ) and the angular frequency ω is found from the frequency f (ω = 2πf). Because λ and f are related by c = λf, k and ω are also related by c = ω/k.

The polarization of the wave is represented by the vector $\vec{E}_0$; the plane of polarization is determined by the direction of $\vec{E}_0$ and the direction of propagation, the z axis in this case. Once we specify the direction of travel and the polarization $\vec{E}_0$, the direction of $\vec{B}_0$ is fixed by the requirements that $\vec{B}$ must be perpendicular to both $\vec{E}$ and the direction of travel, and that the vector product $\vec{E} \times \vec{B}$ point in the direction of travel. For example, if $\vec{E}_0$ is in the x direction ($\vec{E}_0 = E_0\hat{\imath}$, where $\hat{\imath}$ is a unit vector in the x direction), then $\vec{B}_0$ must be in the y direction ($\vec{B}_0 = B_0\hat{\jmath}$). Moreover, the magnitude of $\vec{B}_0$ is determined by

$$B_0 = \frac{E_0}{c} \tag{3.4}$$

where c is the speed of light.

An electromagnetic wave transmits energy from one place to another; the energy flux is specified by the Poynting vector $\vec{S}$:

$$\vec{S} = \frac{1}{\mu_0}\,\vec{E} \times \vec{B} \tag{3.5}$$

For the plane wave, this reduces to

$$\vec{S} = \frac{1}{\mu_0} E_0 B_0 \sin^2(kz - \omega t)\,\hat{k} \tag{3.6}$$

where $\hat{k}$ is a unit vector in the z direction. The Poynting vector has dimensions of power (energy per unit time) per unit area, for example, J/s/m² or W/m². Figure 3.1 shows the orientation of the vectors $\vec{E}$, $\vec{B}$, and $\vec{S}$ for this special case.

FIGURE 3.1 An electromagnetic wave traveling in the z direction. The electric field $\vec{E}$ lies in the xz plane and the magnetic field $\vec{B}$ lies in the yz plane.

Let us imagine the following experiment. We place a detector of electromagnetic radiation (a radio receiver or a human eye) at some point on the z axis, and we determine the electromagnetic power that this plane wave delivers to the receiver. The receiver is oriented with its sensitive area A perpendicular to the z axis, so that the maximum signal is received; we can therefore drop the vector representation of $\vec{S}$ and work only with its magnitude S. The power P entering the receiver is then

$$P = SA = \frac{1}{\mu_0} E_0 B_0 A \sin^2(kz - \omega t) \tag{3.7}$$

which we can rewrite using Eq. 3.4 as

$$P = \frac{1}{\mu_0 c} E_0^2 A \sin^2(kz - \omega t) \tag{3.8}$$

There are two important features of this expression that you should recognize:

1. The intensity (the average power per unit area) is proportional to $E_0^2$. This is a general property of waves: the intensity is proportional to the square of the amplitude. We will see later that this same property also characterizes the waves that describe the behavior of material particles.

2. The intensity fluctuates with time, with the frequency 2f = 2(ω/2π). We don’t usually observe this rapid fluctuation: visible light, for example, has a frequency of about $10^{15}$ oscillations per second, and because our eye doesn’t respond that quickly, we observe the time average of many (perhaps $10^{13}$) cycles. If T is the observation time (perhaps $10^{-2}$ s in the case of the eye) then the average power is

$$P_{\text{av}} = \frac{1}{T}\int_0^T P\,dt \tag{3.9}$$

and using Eq. 3.8 we obtain the intensity I:

$$I = \frac{P_{\text{av}}}{A} = \frac{1}{2\mu_0 c} E_0^2 \tag{3.10}$$

because the average value of $\sin^2\theta$ is 1/2.

Interference and Diffraction

FIGURE 3.2 (a) Young’s double-slit experiment.
A plane wave front passes through both slits; the wave is diffracted at the slits, and interference occurs where the diffracted waves overlap on the screen. (b) The interference fringes observed on the screen.

The property that makes waves a unique physical phenomenon is the principle of superposition, which, for example, allows two waves to meet at a point, to cause a combined disturbance at the point that might be greater or less than the disturbance produced by either wave alone, and finally to emerge from the point of “collision” with all of the properties of each wave totally unchanged by the collision. To appreciate this important distinction between material objects and waves, imagine trying that trick with two automobiles!

This special property of waves leads to the phenomena of interference and diffraction. The simplest and best-known example of interference is Young’s double-slit experiment, in which a monochromatic plane wave is incident on a barrier in which two narrow slits have been cut. (This experiment was first done with light waves, but in fact any wave will do as well, not only other electromagnetic waves, such as microwaves, but also mechanical waves, such as water waves or sound waves. We assume that the experiment is being done with light waves.) Figure 3.2 illustrates this experimental arrangement. The plane wave is diffracted by each of the slits, so that the light passing through each slit covers a much larger area on the screen than the geometric shadow of the slit. This causes the light from the two slits to overlap on the screen, producing the interference. If we move away from the center of the screen just the right distance, we reach a point at which a wave crest passing through one slit arrives at exactly the same time as the previous wave crest that passed through the other slit. When this occurs, the intensity is a maximum, and a bright region appears on the screen.
This is constructive interference, and it occurs continually at the point on the screen that is exactly one wavelength further from one slit than from the other. That is, if $X_1$ and $X_2$ are the distances from the point on the screen to the two slits, then a condition for maximum constructive interference is $|X_1 - X_2| = \lambda$. Constructive interference occurs when any wave crest from one slit arrives simultaneously with another from the other slit, whether it is the next, or the fourth, or the forty-seventh. The general condition for complete constructive interference is that the difference between $X_1$ and $X_2$ be an integral number of wavelengths:

$$|X_1 - X_2| = n\lambda \qquad n = 0, 1, 2, \ldots \tag{3.11}$$

It is also possible for the crest of the wave from one slit to arrive at a point on the screen simultaneously with the trough (valley) of the wave from the other slit. When this happens, the two waves cancel, giving a dark region on the screen. This is known as destructive interference. (The existence of destructive interference at intensity minima immediately shows that we must add the electric field vectors $\vec{E}$ of the waves from the two slits, and not their powers P, because P can never be negative.) Destructive interference occurs whenever the distances $X_1$ and $X_2$ are such that the phase of one wave differs from the other by one-half cycle, or by one and one-half cycles, two and one-half cycles, and so forth:

$$|X_1 - X_2| = \tfrac{1}{2}\lambda,\ \tfrac{3}{2}\lambda,\ \tfrac{5}{2}\lambda, \ldots = \left(n + \tfrac{1}{2}\right)\lambda \qquad n = 0, 1, 2, \ldots \tag{3.12}$$

We can find the locations on the screen where the interference maxima occur in the following way. Let d be the separation of the slits, and let D be the distance from the slits to the screen.
If $y_n$ is the distance from the center of the screen to the nth maximum, then from the geometry of Figure 3.3 we find (assuming $X_1 > X_2$)

$$X_1^2 = D^2 + \left(\frac{d}{2} + y_n\right)^2 \quad\text{and}\quad X_2^2 = D^2 + \left(\frac{d}{2} - y_n\right)^2 \tag{3.13}$$

Subtracting these equations and solving for $y_n$, we obtain

$$y_n = \frac{X_1^2 - X_2^2}{2d} = \frac{(X_1 + X_2)(X_1 - X_2)}{2d} \tag{3.14}$$

In experiments with light, D is of order 1 m, and $y_n$ and d are typically at most 1 mm; thus $X_1 \cong D$ and $X_2 \cong D$, so $X_1 + X_2 \cong 2D$, and to a good approximation

$$y_n = (X_1 - X_2)\frac{D}{d} \tag{3.15}$$

Using Eq. 3.11 for the values of $(X_1 - X_2)$ at the maxima, we find

$$y_n = n\frac{\lambda D}{d} \tag{3.16}$$

FIGURE 3.3 The geometry of the double-slit experiment.

Crystal Diffraction of X Rays

Another device for observing the interference of light waves is the diffraction grating, in which the wave fronts pass through a barrier that has many slits (often thousands or tens of thousands) and then recombine. The operation of this device is illustrated in Figure 3.4; interference maxima corresponding to different wavelengths appear at different angles θ, according to

$$d \sin\theta = n\lambda \tag{3.17}$$

where d is the slit spacing and n is the order number of the maximum (n = 1, 2, 3, . . .). The advantage of the diffraction grating is its superior resolution: it enables us to get very good separation of wavelengths that are close to one another, and thus it is a very useful device for measuring wavelengths. Notice, however, that in order to get reasonable values of the angle θ (for example, sin θ in the range of 0.3 to 0.5) we must have d of the order of a few times the wavelength. For visible light this is not particularly difficult, but for radiations of very short wavelength, mechanical construction of a grating is not possible. For example, for X rays with a wavelength of the order of 0.1 nm, we would need to construct a grating in which the slits were less than 1 nm apart, which is roughly the same as the spacing between the atoms of most materials.
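The double-slit result of Eq. 3.16 and the grating condition of Eq. 3.17 are easy to check numerically. The short Python sketch below is only an illustration: the function names and the sample values (550-nm light, D = 1 m, d = 0.5 mm) are assumptions chosen for this example, not values taken from the text. It compares the approximate fringe position with the exact geometry of Eq. 3.13, and then estimates the slit spacing a grating would need for sin θ between 0.3 and 0.5.

```python
import math

def fringe_position(n, wavelength, D, d):
    """Distance y_n from the screen center to the nth maximum (Eq. 3.16)."""
    return n * wavelength * D / d

def exact_path_difference(y, D, d):
    """X1 - X2 computed from the exact geometry of Eq. 3.13."""
    x1 = math.hypot(D, d / 2 + y)
    x2 = math.hypot(D, d / 2 - y)
    return x1 - x2

def grating_spacing(n, wavelength, sin_theta):
    """Slit spacing d that places the nth-order maximum at sin(theta) (Eq. 3.17)."""
    return n * wavelength / sin_theta

# Assumed sample values (SI units): 550 nm light, D = 1 m, d = 0.5 mm
lam, D, d = 550e-9, 1.0, 0.5e-3

y1 = fringe_position(1, lam, D, d)
ratio = exact_path_difference(y1, D, d) / lam
print(f"first maximum at y_1 = {y1 * 1e3:.2f} mm")
print(f"exact path difference there = {ratio:.6f} wavelengths")

# Spacing needed for a first-order maximum with sin(theta) between 0.3 and 0.5
for label, wl in [("visible light, 550 nm", 550e-9), ("X rays, 0.1 nm", 0.1e-9)]:
    d_min = grating_spacing(1, wl, 0.5)
    d_max = grating_spacing(1, wl, 0.3)
    print(f"{label}: d between {d_min * 1e9:.3g} nm and {d_max * 1e9:.3g} nm")
```

With these assumed values the exact path difference at $y_1$ differs from one wavelength by less than one part in $10^5$, which is why replacing $X_1 + X_2$ by 2D is a safe approximation. The last loop shows the scale of the difficulty for X rays: the required spacing is a fraction of a nanometer.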
The solution to this problem has been known since the pioneering experiments of Laue and Bragg:∗ use the atoms themselves as a diffraction grating! A beam of X rays sees the regular spacings of the atoms in a crystal as a sort of three-dimensional diffraction grating.

∗ Max von Laue (1879–1960, Germany) developed the method of X-ray diffraction for the study of crystal structures, for which he received the 1914 Nobel Prize. Lawrence Bragg (1890–1971, England) developed the Bragg law for X-ray diffraction while he was a student at Cambridge University. He shared the 1915 Nobel Prize with his father, William Bragg, for their research on the use of X rays to determine crystal structures.

FIGURE 3.4 The use of a diffraction grating to analyze light into its constituent wavelengths.

FIGURE 3.5 A beam of X rays reflected from a set of crystal planes of spacing d. The beam reflected from the second plane travels a distance 2d sin θ greater than the beam reflected from the first plane.

Consider the set of atoms shown in Figure 3.5, which represents a small portion of a two-dimensional slice of the crystal. The X rays are reflected from individual atoms in all directions, but in only one direction will the scattered “wavelets” constructively interfere to produce a reflected beam, and in this case we can regard the reflection as occurring from a plane drawn through the row of atoms. (This situation is identical with the reflection of light from a mirror: only in one direction will there be a beam of reflected light, and in that direction we can regard the reflection as occurring on a plane with the angle of incidence equal to the angle of reflection.) Suppose the rows of atoms are a distance d apart in the crystal. Then a portion of the beam is reflected from the front plane, and a portion is reflected from the second plane, and so forth.
The wave fronts of the beam reflected from the second plane lag behind those reflected from the front plane, because the wave reflected from the second plane must travel an additional distance of 2d sin θ, where θ is the angle of incidence as measured from the face of the crystal. (Note that this is different from the usual procedure in optics, in which angles are defined with respect to the normal to the surface.) If this path difference is a whole number of wavelengths, the reflected beams interfere constructively and give an intensity maximum; thus the basic expression for the interference maxima in X-ray diffraction from a crystal is

$$2d \sin\theta = n\lambda \qquad n = 1, 2, 3, \ldots \tag{3.18}$$

This result is known as Bragg’s law for X-ray diffraction. Notice the factor of 2 that appears in Eq. 3.18 but does not appear in the otherwise similar expression of Eq. 3.17 for the ordinary diffraction grating.

Example 3.1

A single crystal of table salt (NaCl) is irradiated with a beam of X rays of wavelength 0.250 nm, and the first Bragg reflection is observed at an angle of 26.3°. What is the atomic spacing of NaCl?

Solution

Solving Bragg’s law for the spacing d, with n = 1 for the first reflection, we have

$$d = \frac{n\lambda}{2\sin\theta} = \frac{0.250\ \text{nm}}{2\sin 26.3^\circ} = 0.282\ \text{nm}$$

FIGURE 3.6 An incident beam of X rays can be reflected from many different crystal planes.

Our drawing of Figure 3.5 was arbitrary: we had no basis for choosing which set of atoms to draw the reflecting planes through. Figure 3.6 shows a larger section of the crystal. As you can see, there are many possible reflecting planes, each with a different value of θ and d. (Of course, $d_i$ and $\theta_i$ are related and cannot be varied independently.) If we used a beam of X rays of a single wavelength, it might be difficult to find the proper angle and set of planes to observe the interference.
However, if we use a beam of X rays of a continuous range of wavelengths, for each $d_i$ and $\theta_i$ interference will occur for a certain wavelength $\lambda_i$, and so there will be a pattern of interference maxima appearing at different angles of reflection as shown in Figure 3.6. The pattern of interference maxima depends on the spacing and the type of arrangement of the atoms in the crystal. Figure 3.7 shows sample patterns (called Laue patterns) that are obtained from X-ray scattering from two different crystals. The bright dots correspond to interference maxima for wavelengths from the range of incident wavelengths that happen to satisfy Eq. 3.18. The three-dimensional pattern is more complicated than our two-dimensional drawings, but the individual dots have the same interpretation. Figure 3.8 shows the pattern obtained from a sample that consists of many tiny crystals, rather than one single crystal. (It looks like Figure 3.7b or 3.7c rotated rapidly about its center.) From such pictures it is also possible to deduce crystal structures and lattice spacing.

FIGURE 3.7 (a) Apparatus for observing X-ray scattering by a crystal. An interference maximum (dot) appears on the film whenever a set of crystal planes happens to satisfy the Bragg condition for a particular wavelength. (b) Laue pattern of TiO2 crystal. (c) Laue pattern of a polyethylene crystal. The differences between the two Laue patterns are due to the differences in the geometric structure of the two crystals.

All of the examples we have discussed in this section depend on the wave properties of electromagnetic radiation. However, as we now begin to discuss, there are other experiments that cannot be explained if we regard electromagnetic radiation as waves.
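Bragg’s law (Eq. 3.18) lends itself to a quick numerical check. The sketch below reproduces the spacing found in Example 3.1 and then lists every order n that can appear for a given wavelength and plane spacing; the function names are hypothetical and chosen only for illustration.

```python
import math


def bragg_spacing_nm(wavelength_nm, theta_deg, order=1):
    """Plane spacing d from 2 d sin(theta) = n lambda (Eq. 3.18)."""
    return order * wavelength_nm / (2.0 * math.sin(math.radians(theta_deg)))


def bragg_angles_deg(wavelength_nm, spacing_nm):
    """Angles (measured from the crystal face) of all allowed orders n = 1, 2, ..."""
    angles = []
    n = 1
    # A reflection of order n exists only while n * lambda <= 2 d.
    while n * wavelength_nm <= 2.0 * spacing_nm:
        angles.append(math.degrees(math.asin(n * wavelength_nm / (2.0 * spacing_nm))))
        n += 1
    return angles


# Inputs of Example 3.1: 0.250 nm X rays, first reflection at 26.3 degrees.
d_nacl_nm = bragg_spacing_nm(0.250, 26.3)
allowed = bragg_angles_deg(0.250, d_nacl_nm)

print("spacing (nm):", round(d_nacl_nm, 3))
print("allowed Bragg angles (deg):", [round(a, 1) for a in allowed])
```

For a fixed wavelength only a small number of orders satisfy the condition, which illustrates why a single-wavelength beam makes the reflections hard to find.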
FIGURE 3.8 (a) Apparatus for observing X-ray scattering from a powdered or polycrystalline sample. Because the individual crystals have many different orientations, each scattered ray of Figure 3.7 becomes a cone which forms a circle on the film. (b) Diffraction pattern (known as Debye-Scherrer pattern) of polycrystalline gold.

3.2 THE PHOTOELECTRIC EFFECT

We’ll now turn to our discussion of the first of three experiments that cannot be explained by the wave theory of light. When a metal surface is illuminated with light, electrons can be emitted from the surface. This phenomenon, known as the photoelectric effect, was discovered by Heinrich Hertz in 1887 in the process of his research into electromagnetic radiation. The emitted electrons are called photoelectrons.

A sample experimental arrangement for observing the photoelectric effect is illustrated in Figure 3.9. Light falling on a metal surface (the emitter) can release electrons, which travel to the collector. The experiment must be done in an evacuated tube, so that the electrons do not lose energy in collisions with molecules of the air.

FIGURE 3.9 Apparatus for observing the photoelectric effect. The flow of electrons from the emitter to the collector is measured by the ammeter A as a current i in the external circuit. A variable voltage source $V_{ext}$ establishes a potential difference between the emitter and collector, which is measured by the voltmeter V.

Among the properties that can be measured are the rate of electron emission and the maximum kinetic energy of the photoelectrons.∗ The rate of electron emission can be measured as an electric current i by an ammeter in the external circuit.

∗ The electrons can be emitted with many different kinetic energies, depending on how tightly bound they are to the metal. Here we are concerned only with the maximum kinetic energy, which depends on the energy needed to remove the least tightly bound electron from the surface of the metal.

The maximum kinetic energy of the electrons can be measured by applying a negative potential to the collector that is just enough to repel the most energetic electrons, which then do not have enough energy to “climb” the potential energy hill. That is, if the potential difference between the emitter and the collector is ΔV (a negative quantity), then electrons traveling from the emitter to the collector would gain a potential energy of U = q ΔV = −e ΔV (a positive quantity) and would lose the same amount of kinetic energy. Electrons leaving the emitter with a kinetic energy smaller than this U cannot reach the collector and are pushed back toward the emitter. As the magnitude of the potential difference is increased, at some point even the most energetic electrons do not have enough kinetic energy to reach the collector. This potential, called the stopping potential $V_s$, is determined by increasing the magnitude of the voltage until the ammeter current drops to zero. At this point the maximum kinetic energy $K_{max}$ of the electrons as they leave the emitter is just equal to the kinetic energy $eV_s$ lost by the electrons in “climbing” the hill:

$$K_{max} = eV_s \tag{3.19}$$

where e is the magnitude of the electric charge of the electron. Typical values of $V_s$ are a few volts.∗

In the classical picture, the surface of the metal is illuminated by an electromagnetic wave of intensity I. The surface absorbs energy from the wave until the energy exceeds the binding energy of the electron to the metal, at which point the electron is released. The minimum quantity of energy needed to remove an electron is called the work function φ of the material. Table 3.1 lists some values of the work function of different materials.
You can see that the values are typically a few electron-volts.

TABLE 3.1 Some Photoelectric Work Functions

Material    φ (eV)
Na          2.28
Al          4.08
Co          3.90
Cu          4.70
Zn          4.31
Ag          4.73
Pt          6.35
Pb          4.14

∗ The potential difference ΔV read by the voltmeter is not equal to the stopping potential when the emitter and collector are made of different materials. In that case a correction must be applied to account for the contact potential difference between the emitter and collector.

The Classical Theory of the Photoelectric Effect

What does the classical wave theory predict about the properties of the emitted photoelectrons?

1. The maximum kinetic energy of the electrons should be proportional to the intensity of the radiation. As the brightness of the light source is increased, more energy is delivered to the surface (the electric field is greater) and the electrons should be released with greater kinetic energies. Equivalently, increasing the intensity of the light source increases the electric field $\vec{E}$ of the wave, which also increases the force $\vec{F} = -e\vec{E}$ on the electron and its kinetic energy when it eventually leaves the surface.

2. The photoelectric effect should occur for light of any frequency or wavelength. According to the wave theory, as long as the light is intense enough to release electrons, the photoelectric effect should occur no matter what the frequency or wavelength.

3. The first electrons should be emitted in a time interval of the order of seconds after the radiation begins to strike the surface. In the wave theory, the energy of the wave is uniformly distributed over the wave front. If the electron absorbs energy directly from the wave, the amount of energy delivered to any electron is determined by how much radiant energy is incident on the surface area in which the electron is confined.
Assuming this area is about the size of an atom, a rough calculation leads to an estimate that the time lag between turning on the light and observing the first photoelectrons should be of the order of seconds (see Example 3.2).

Example 3.2

A laser beam with an intensity of 120 W/m² (roughly that of a small helium-neon laser) is incident on a surface of sodium. It takes a minimum energy of 2.3 eV to release an electron from sodium (the work function φ of sodium). Assuming the electron to be confined to an area of radius equal to that of a sodium atom (0.10 nm), how long will it take for the surface to absorb enough energy to release an electron?

Solution

The average power $P_{av}$ delivered by the wave of intensity I to an area A is IA. An atom on the surface displays a “target area” of $A = \pi r^2 = \pi(0.10 \times 10^{-9}\ \text{m})^2 = 3.1 \times 10^{-20}\ \text{m}^2$. If the entire electromagnetic power is delivered to the electron, energy is absorbed at the rate $\Delta E/\Delta t = P_{av}$. The time interval Δt necessary to absorb an energy ΔE = φ can be expressed as

$$\Delta t = \frac{\Delta E}{P_{av}} = \frac{\phi}{IA} = \frac{(2.3\ \text{eV})(1.6 \times 10^{-19}\ \text{J/eV})}{(120\ \text{W/m}^2)(3.1 \times 10^{-20}\ \text{m}^2)} = 0.10\ \text{s}$$

In reality, electrons in metals are not always bound to individual atoms but instead can be free to roam throughout the metal. However, no matter what reasonable estimate we make for the area over which the energy is absorbed, the characteristic time for photoelectron emission is estimated to have a magnitude of the order of seconds, in a range easily accessible to measurement.

The experimental characteristics of the photoelectric effect were well known by the year 1902. How do the predictions of the classical theory compare with the experimental results?

1. For a fixed value of the wavelength or frequency of the light source, the maximum kinetic energy of the emitted photoelectrons (determined from the stopping potential) is totally independent of the intensity of the light source. Figure 3.10 shows a representation of the experimental results. Doubling the intensity of the source leaves the stopping potential unchanged, indicating no change in the maximum kinetic energy of the electrons. This experimental result disagrees with the wave theory, which predicts that the maximum kinetic energy should depend on the intensity of the light.

2. The photoelectric effect does not occur at all if the frequency of the light source is below a certain value. This value, which is characteristic of the kind of metal surface used in the experiment, is called the cutoff frequency $f_c$. Above $f_c$, any light source, no matter how weak, will cause the emission of photoelectrons; below $f_c$, no light source, no matter how strong, will cause the emission of photoelectrons. This experimental result also disagrees with the predictions of the wave theory.

3. The first photoelectrons are emitted virtually instantaneously (within $10^{-9}$ s) after the light source is turned on. The wave theory predicts a measurable time delay, so this result also disagrees with the wave theory.

These three experimental results all suggest the complete failure of the wave theory to account for the photoelectric effect.

FIGURE 3.10 The photoelectric current i as a function of the potential difference ΔV for two different values of the intensity of the light. When the intensity I is doubled ($I_2 = 2I_1$), the current is doubled (twice as many photoelectrons are emitted), but the stopping potential $V_s$ remains the same.

The Quantum Theory of the Photoelectric Effect

A successful theory of the photoelectric effect was developed in 1905 by Albert Einstein. Five years earlier, in 1900, the German physicist Max Planck had developed a theory to explain the wavelength distribution of light emitted by hot, glowing objects (called thermal radiation, which is discussed in the next section of this chapter).
Based partly on Planck’s ideas, Einstein proposed that the energy of electromagnetic radiation is not continuously distributed over the wave front, but instead is concentrated in localized bundles or quanta (also known as photons). The energy of a photon associated with an electromagnetic wave of frequency f is

$$E = hf \tag{3.20}$$

where h is a proportionality constant known as Planck’s constant. The photon energy can also be related to the wavelength of the electromagnetic wave by substituting f = c/λ, which gives

$$E = \frac{hc}{\lambda} \tag{3.21}$$

We often speak about photons as if they were particles, and as concentrated bundles of energy they have particlelike properties. Like the electromagnetic waves, photons travel at the speed of light, and so they must obey the relativistic relationship p = E/c. Combining this with Eq. 3.21, we obtain

$$p = \frac{h}{\lambda} \tag{3.22}$$

Photons carry linear momentum as well as energy, and thus they share this characteristic property of particles.

Because a photon travels at the speed of light, it must have zero mass. Otherwise its energy and momentum would be infinite. Similarly, a photon’s rest energy $E_0 = mc^2$ must also be zero.

In Einstein’s interpretation, a photoelectron is released as a result of an encounter with a single photon. The entire energy of the photon is delivered instantaneously to a single photoelectron. If the photon energy hf is greater than the work function φ of the material, the photoelectron will be released. If the photon energy is smaller than the work function, the photoelectric effect will not occur. This explanation thus accounts for two of the failures of the wave theory: the existence of the cutoff frequency and the lack of any measurable time delay.

If the photon energy hf exceeds the work function, the excess energy appears as the kinetic energy of the electron:

$$K_{max} = hf - \phi \tag{3.23}$$

The intensity of the light source does not appear in this expression!
For a fixed frequency, doubling the intensity of the light means that twice as many photons strike the surface and twice as many photoelectrons are released, but they all have precisely the same maximum kinetic energy.

You can think of Eq. 3.23 as giving a relationship between energy quantities in analogy to making a purchase at a store. The quantity hf represents the payment you hand to the cashier, the quantity φ represents the cost of the object, and $K_{max}$ represents the change you receive. In the photoelectric effect, hf is the amount of energy that is available to “purchase” an electron from the surface, the work function φ is the “cost” of removing the least tightly bound electron from the surface, and the difference between the available energy and the removal cost is the leftover energy that appears as the kinetic energy of the emitted electron. (The more tightly bound electrons have a greater “cost” and so emerge with smaller kinetic energies.)

A photon that supplies an energy equal to φ, exactly the minimum amount needed to remove an electron, corresponds to light of frequency equal to the cutoff frequency $f_c$. At this frequency, there is no excess energy for kinetic energy, so Eq. 3.23 becomes $hf_c = \phi$, or

$$f_c = \frac{\phi}{h} \tag{3.24}$$

The corresponding cutoff wavelength $\lambda_c = c/f_c$ is

$$\lambda_c = \frac{hc}{\phi} \tag{3.25}$$

The cutoff wavelength represents the largest wavelength for which the photoelectric effect can be observed for a surface with the work function φ.

The photon theory appears to explain all of the observed features of the photoelectric effect. The most detailed test of the theory was done by Robert Millikan in 1915. Millikan measured the maximum kinetic energy (stopping potential) for different frequencies of the light and obtained a plot of Eq. 3.23. A sample of his results is shown in Figure 3.11.
From the slope of the line, Millikan obtained a value for Planck’s constant of

$$h = 6.57 \times 10^{-34}\ \text{J·s}$$

In part for his detailed experiments on the photoelectric effect, Millikan was awarded the 1923 Nobel Prize in physics. Einstein was awarded the 1921 Nobel Prize for his photon theory as applied to the photoelectric effect.

As we discuss in the next section, the wavelength distribution of thermal radiation also yields a value for Planck’s constant, which is in good agreement with Millikan’s value derived from the photoelectric effect. Planck’s constant is one of the fundamental constants of nature; just as c is the characteristic constant of relativity, h is the characteristic constant of quantum mechanics. The value of Planck’s constant has been measured to great precision in a variety of experiments. The presently accepted value is

$$h = 6.6260696 \times 10^{-34}\ \text{J·s}$$

This is an experimentally determined value, with a relative uncertainty of about $5 \times 10^{-8}$ (±3 units in the last digit).

Robert A. Millikan (1868–1953, United States). Perhaps the best experimentalist of his era, his work included the precise determination of Planck’s constant using the photoelectric effect (for which he received the 1923 Nobel Prize) and the measurement of the charge of the electron (using his famous “oil-drop” apparatus).

FIGURE 3.11 Millikan’s results for the photoelectric effect in sodium, plotted as stopping potential $V_s$ (volts) against radiation frequency ($10^{13}$ Hz); the fitted line has slope $4.1 \times 10^{-15}$ V·s. The slope of the line is h/e; the experimental determination of the slope gives a way of determining Planck’s constant. The intercept should give the cutoff frequency; however, in Millikan’s time the contact potentials of the electrodes were not known precisely and so the vertical scale is displaced by a few tenths of a volt. The slope is not affected by this correction.
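The relations of this section can be gathered into a short Python sketch. It is an illustration under stated assumptions rather than a reproduction of any calculation in the text: the function names are hypothetical, hc is taken as the rounded value 1240 eV·nm, e is taken as 1.602 × 10⁻¹⁹ C, and the 400 nm and 700 nm wavelengths are assumed inputs. The work function of sodium comes from Table 3.1, the slope from Figure 3.11, and the classical time-lag inputs from Example 3.2.

```python
import math

HC_EV_NM = 1240.0                 # h*c in eV*nm (rounded, assumed)
ELEMENTARY_CHARGE_C = 1.602e-19   # magnitude of the electron charge (assumed value)
EV_TO_J = 1.602e-19               # joules per electron-volt (assumed value)


def max_kinetic_energy_ev(wavelength_nm, work_function_ev):
    """K_max = hc/lambda - phi (Eqs. 3.21 and 3.23); None below the cutoff."""
    k_max = HC_EV_NM / wavelength_nm - work_function_ev
    return k_max if k_max > 0 else None


def cutoff_wavelength_nm(work_function_ev):
    """lambda_c = hc / phi (Eq. 3.25)."""
    return HC_EV_NM / work_function_ev


def planck_constant_from_slope(slope_volt_seconds):
    """The V_s versus f line has slope h/e, so h = slope * e."""
    return slope_volt_seconds * ELEMENTARY_CHARGE_C


def classical_delay_s(intensity_w_m2, radius_m, work_function_ev):
    """Classical wave estimate t = phi / (I * A), with A = pi r^2."""
    area_m2 = math.pi * radius_m ** 2
    return work_function_ev * EV_TO_J / (intensity_w_m2 * area_m2)


PHI_SODIUM_EV = 2.28   # Table 3.1

k_max_400 = max_kinetic_energy_ev(400.0, PHI_SODIUM_EV)   # assumed wavelength
k_max_700 = max_kinetic_energy_ev(700.0, PHI_SODIUM_EV)   # beyond the cutoff
lambda_c = cutoff_wavelength_nm(PHI_SODIUM_EV)
h_estimate = planck_constant_from_slope(4.1e-15)          # slope in Figure 3.11
delay = classical_delay_s(120.0, 0.10e-9, 2.3)            # Example 3.2 inputs

print("K_max at 400 nm (eV):", k_max_400)
print("K_max at 700 nm:", k_max_700)
print("cutoff wavelength (nm):", round(lambda_c, 1))
print("h from slope (J s):", h_estimate)
print("classical delay (s):", round(delay, 3))
```

Because $K_{max} = eV_s$ (Eq. 3.19), the value of $K_{max}$ in electron-volts is numerically equal to the stopping potential in volts.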
Example 3.3

(a) What are the energy and momentum of a photon of red light of wavelength 650 nm? (b) What is the wavelength of a photon of energy 2.40 eV?

Solution

(a) Using Eq. 3.21 we obtain

$$E = \frac{hc}{\lambda} = \frac{(6.63 \times 10^{-34}\ \text{J·s})(3.00 \times 10^{8}\ \text{m/s})}{650 \times 10^{-9}\ \text{m}} = 3.06 \times 10^{-19}\ \text{J}$$

Converting to electron-volts, we have

$$E = \frac{3.06 \times 10^{-19}\ \text{J}}{1.60 \times 10^{-19}\ \text{J/eV}} = 1.91\ \text{eV}$$

This type of problem can be simplified if we express the combination hc in units of eV·nm:

$$E = \frac{hc}{\lambda} = \frac{1240\ \text{eV·nm}}{650\ \text{nm}} = 1.91\ \text{eV}$$

The momentum is found in a similar way, using Eq. 3.22:

$$p = \frac{h}{\lambda} = \frac{1}{c}\,\frac{hc}{\lambda} = \frac{1}{c}\left(\frac{1240\ \text{eV·nm}}{650\ \text{nm}}\right) = 1.91\ \text{eV}/c$$

The momentum could also be found directly from the energy:

$$p = \frac{E}{c} = \frac{1.91\ \text{eV}}{c} = 1.91\ \text{eV}/c$$

(It may be helpful to review the discussion in Example 2.11 about these units of momentum.)

(b) Solving Eq. 3.21 for λ, we find

$$\lambda = \frac{hc}{E} = \frac{1240\ \text{eV·nm}}{2.40\ \text{eV}} = 517\ \text{nm}$$

Example 3.4

The work function for tungsten metal is 4.52 eV. (a) What is the cutoff wavelength $\lambda_c$ for tungsten? (b) What is the maximum kinetic energy of the electrons when radiation of wavelength 198 nm is used? (c) What is the stopping potential in this case?

Solution

(a) Equation 3.25 gives

$$\lambda_c = \frac{hc}{\phi} = \frac{1240\ \text{eV·nm}}{4.52\ \text{eV}} = 274\ \text{nm}$$

in the ultraviolet region.

(b) At the shorter wavelength,

$$K_{max} = hf - \phi = \frac{hc}{\lambda} - \phi = \frac{1240\ \text{eV·nm}}{198\ \text{nm}} - 4.52\ \text{eV} = 1.74\ \text{eV}$$

(c) The stopping potential is the voltage corresponding to $K_{max}$:

$$V_s = \frac{K_{max}}{e} = \frac{1.74\ \text{eV}}{e} = 1.74\ \text{V}$$

3.3 THERMAL RADIATION

The second type of experiment we discuss that cannot be explained by the classical wave theory is thermal radiation, which is the electromagnetic radiation emitted by all objects because of their temperature. At room temperature the thermal radiation is mostly in the infrared region of the spectrum, where our eyes are not sensitive. As we heat objects to higher temperatures, they may emit visible light.
A typical experimental arrangement is shown in Figure 3.12. An object is maintained at a temperature $T_1$. The radiation emitted by the object is detected by an apparatus that is sensitive to the wavelength of the radiation. For example, a dispersive medium such as a prism can be used so that different wavelengths appear at different angles θ. By moving the radiation detector to different angles θ we can measure the intensity∗ of the radiation at a specific wavelength. The detector is not a geometrical point (hardly an efficient detector!) but instead subtends a small range of angles Δθ, so what we really measure is the amount of radiation in some range Δθ at θ, or, equivalently, in some range Δλ at λ.

∗ As always, intensity means energy per unit time per unit area (or power per unit area), as in Eq. 3.10. Previously, “unit area” referred to the wave front, such as would be measured if we recorded the waves with an antenna of a certain area. Here, “unit area” indicates the electromagnetic radiation emitted from each unit area of the surface of the object whose thermal emissions are being observed.

FIGURE 3.12 Measurement of the spectrum of thermal radiation. A device such as a prism is used to separate the wavelengths emitted by the object.

Many experiments were done in the late 19th century to study the wavelength spectrum of thermal radiation. These experiments, as we shall see, gave results that totally disagreed with the predictions of the classical theories of thermodynamics and electromagnetism; instead, the successful analysis of the experiments provided the first evidence of the quantization of energy, which would eventually be seen as the basis for the new quantum theory.

FIGURE 3.13 A possible result of the measurement of the radiation intensity I(λ) over many different wavelengths (1 to 10 μm), shown for 1000 K and 1250 K. Each different temperature of the emitting body gives a different peak $\lambda_{max}$.

Let’s first review the experimental results. The goal of these experiments was to measure the intensity of the radiation emitted by the object as a function of wavelength. Figure 3.13 shows a typical set of experimental results when the object is at a temperature $T_1$ = 1000 K. If we now change the temperature of the object to a different value $T_2$, we obtain a different curve, as shown in Figure 3.13 for $T_2$ = 1250 K. If we repeat the measurement for many different temperatures, we obtain systematic results for the radiation intensity that reveal two important characteristics:

1. The total intensity radiated over all wavelengths (that is, the area under each curve) increases as the temperature is increased. This is not a surprising result: we commonly observe that a glowing object glows brighter and thus radiates more energy as we increase its temperature. From careful measurement, we find that the total intensity increases as the fourth power of the absolute or kelvin temperature:

$$I = \sigma T^4 \tag{3.26}$$

where we have introduced the proportionality constant σ. Equation 3.26 is called Stefan’s law and the constant σ is called the Stefan-Boltzmann constant. Its value can be determined from experimental results such as those illustrated in Figure 3.13:

$$\sigma = 5.67037 \times 10^{-8}\ \text{W/m}^2\cdot\text{K}^4$$

2. The wavelength $\lambda_{max}$ at which the emitted intensity reaches its maximum value decreases as the temperature is increased, in inverse proportion to the temperature: $\lambda_{max} \propto 1/T$. From results such as those of Figure 3.13, we can determine the proportionality constant, so that

$$\lambda_{max} T = 2.8978 \times 10^{-3}\ \text{m·K} \tag{3.27}$$

This result is known as Wien’s displacement law; the term “displacement” refers to the way the peak is moved or displaced as the temperature is varied.
Wien’s law is qualitatively consistent with our common observation that heated objects first begin to glow with a red color, and at higher temperatures the color becomes more yellow. As the temperature is increased, the wavelength at which most of the radiation is emitted moves from the longer-wavelength (red) part of the visible region toward medium wavelengths. The term “white hot” refers to an object that is hot enough to produce the mixture of all wavelengths in the visible region to make white light.

Example 3.5

(a) At what wavelength does a room-temperature (T = 20°C) object emit the maximum thermal radiation? (b) To what temperature must we heat it until its peak thermal radiation is in the red region of the spectrum (λ = 650 nm)? (c) How many times as much thermal radiation does it emit at the higher temperature?

Solution

(a) Using the absolute temperature, $T_1$ = 273 + 20 = 293 K, Wien’s displacement law gives

$$\lambda_{max} = \frac{2.8978 \times 10^{-3}\ \text{m·K}}{T_1} = \frac{2.8978 \times 10^{-3}\ \text{m·K}}{293\ \text{K}} = 9.89\ \mu\text{m}$$

This is in the infrared region of the electromagnetic spectrum.

(b) For $\lambda_{max}$ = 650 nm, we again use Wien’s displacement law to find the new temperature $T_2$:

$$T_2 = \frac{2.8978 \times 10^{-3}\ \text{m·K}}{\lambda_{max}} = \frac{2.8978 \times 10^{-3}\ \text{m·K}}{650 \times 10^{-9}\ \text{m}} = 4460\ \text{K}$$

(c) The total intensity of radiation is proportional to $T^4$, so the ratio of the total thermal emissions will be

$$\frac{I_2}{I_1} = \frac{\sigma T_2^4}{\sigma T_1^4} = \frac{(4460\ \text{K})^4}{(293\ \text{K})^4} = 5.37 \times 10^{4}$$

Be sure to notice the use of absolute (kelvin) temperatures in this example.

FIGURE 3.14 A cavity filled with electromagnetic radiation in thermal equilibrium with its walls at temperature T. Some radiation escapes through the hole, which represents an ideal blackbody.

The theoretical analysis of the emission of thermal radiation from an arbitrary object is extremely complicated. It depends on details of the surface properties of the object, and it also depends on how much radiation the object reflects from its surroundings.
To simplify our analysis, we consider a special type of object called a blackbody, which absorbs all radiation incident on it and reflects none of the incident radiation. To simplify further, we consider a special type of blackbody: a hole in a hollow metal box whose walls are in thermal equilibrium at temperature T. The box is filled with electromagnetic radiation that is emitted and reflected by the walls. A small hole in one wall of the box allows some of the radiation to escape (Figure 3.14). It is the hole, and not the box itself, that is the blackbody. Radiation from outside that is incident on the hole gets lost inside the box and has a negligible chance of reemerging from the hole; thus no reflections occur from the blackbody (the hole). The radiation that emerges from the hole is just a sample of the radiation inside the box, so understanding the nature of the radiation inside the box allows us to understand the radiation that leaves through the hole.

Let’s consider the radiation inside the box. It has an energy density (energy per unit volume) per unit wavelength interval u(λ). That is, if we could look into the interior of the box and measure the energy density of the electromagnetic radiation with wavelengths between λ and λ + dλ in a small volume element, the result would be u(λ)dλ. For the radiation in this wavelength interval, what is the corresponding intensity (power per unit area) emerging from the hole? At any particular instant, half of the radiation in the box will be moving away from the hole. The other half of the radiation is moving toward the hole at velocity of magnitude c but directed over a range of angles.
Averaging over this range of angles to evaluate the energy flowing perpendicular to the surface of the hole introduces another factor of 1/2, so the contribution of the radiation in this small wavelength interval to the intensity passing through the hole is

$$I(\lambda) = \frac{c}{4}\,u(\lambda) \tag{3.28}$$

The quantity I(λ)dλ is the radiant intensity in the small interval dλ at the wavelength λ. This is the quantity whose measurement gives the results displayed in Figure 3.13. Each data point represents a measurement of the intensity in a small wavelength interval. The goal of the theoretical analysis is to find a mathematical function I(λ) that gives a smooth fit through the data points of Figure 3.13.

If we wish to find the total intensity emitted in the region between wavelengths $\lambda_1$ and $\lambda_2$, we divide the region into narrow intervals dλ and add the intensities in each interval, which is equivalent to the integral between those limits:

$$I(\lambda_1{:}\lambda_2) = \int_{\lambda_1}^{\lambda_2} I(\lambda)\,d\lambda \tag{3.29}$$

This is similar to Eq. 1.27 for determining the number of molecules with energies between two limits. The total emitted intensity can be found by integrating over all wavelengths:

$$I = \int_0^{\infty} I(\lambda)\,d\lambda \tag{3.30}$$

This total intensity should work out to be proportional to the 4th power of the temperature, as required by Stefan’s law (Eq. 3.26).

Classical Theory of Thermal Radiation

Before discussing the quantum theory of thermal radiation, let’s see what the classical theories of electromagnetism and thermodynamics can tell us about the dependence of I on λ. The complete derivation is not given here, only a brief outline of the theory.∗ The derivation involves first computing the amount of radiation (number of waves) at each wavelength and then finding the contribution of each wave to the total energy in the box.

∗ For a more complete derivation, see R. Eisberg and R. Resnick, Quantum Theory of Atoms, Molecules, Solids, Nuclei, and Particles, 2nd edition (Wiley, 1985), pp. 9–13.
1. The box is filled with electromagnetic standing waves. If the walls of the box are metal, radiation is reflected back and forth with a node of the electric field at each wall (the electric field must vanish inside a conductor). This is the same condition that applies to other standing waves, like those on a stretched string or a column of air in an organ pipe.

2. The number of standing waves with wavelengths between λ and λ + dλ is

$$N(\lambda)\,d\lambda = \frac{8\pi V}{\lambda^4}\,d\lambda \tag{3.31}$$

where V is the volume of the box. For one-dimensional standing waves, as on a stretched string of length L, the allowed wavelengths are λ = 2L/n (n = 1, 2, 3, . . .). The number of possible standing waves with wavelengths between $\lambda_1$ and $\lambda_2$ is $n_2 - n_1 = 2L(1/\lambda_2 - 1/\lambda_1)$. In the small interval from λ to λ + dλ, the number of standing waves is $N(\lambda)\,d\lambda = |dn/d\lambda|\,d\lambda = (2L/\lambda^2)\,d\lambda$. Equation 3.31 can be obtained by extending this approach to three dimensions.

3. Each individual wave contributes an average energy of kT to the radiation in the box. This result follows from an analysis similar to that of Section 1.3 for the statistical mechanics of gas molecules. In this case we are interested in the statistics of the oscillating atoms in the walls of the cavity, which are responsible for setting up the standing electromagnetic waves in the cavity. For a one-dimensional oscillator, the energies are distributed according to the Maxwell-Boltzmann distribution:∗

$$N(E) = \frac{N}{kT}\,e^{-E/kT} \tag{3.32}$$

Recall from Section 1.3 that N(E) is defined so that the number of oscillators with energies between E and E + dE is dN = N(E)dE, and thus the total number of oscillators at all energies is $\int dN = \int_0^{\infty} N(E)\,dE$, which (as you should show) works out to N. The average energy per oscillator is then found in the same way as the average energy of a gas molecule (Eq. 1.25):

$$E_{av} = \frac{1}{N}\int_0^{\infty} E\,N(E)\,dE = \frac{1}{kT}\int_0^{\infty} E\,e^{-E/kT}\,dE \tag{3.33}$$

which does indeed work out to $E_{av} = kT$.
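The two integrals in step 3 can be checked numerically. The sketch below is one possible way to do so, not part of the derivation: it works in the dimensionless variable x = E/kT, uses a simple midpoint rule written for this purpose (the helper name is hypothetical), and cuts the integrals off at x = 50, where the exponential factor is negligible.

```python
import math


def midpoint_integral(f, a, b, steps=100_000):
    """Approximate the integral of f from a to b with the midpoint rule."""
    width = (b - a) / steps
    return width * sum(f(a + (i + 0.5) * width) for i in range(steps))


X_MAX = 50.0  # upper cutoff in units of kT (an assumption; exp(-50) is negligible)

# (1/N) * integral of N(E) dE, with N(E) from Eq. 3.32, becomes integral of exp(-x) dx.
normalization = midpoint_integral(lambda x: math.exp(-x), 0.0, X_MAX)

# E_av / kT from Eq. 3.33 becomes integral of x * exp(-x) dx.
average_over_kt = midpoint_integral(lambda x: x * math.exp(-x), 0.0, X_MAX)

print("normalization / N:", round(normalization, 6))
print("E_av / kT:", round(average_over_kt, 6))
```

Both results come out equal to 1 within the accuracy of the numerical integration, consistent with a total of N oscillators and an average energy of kT.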
Putting all these ingredients together, we can find the energy density of radiation in the wavelength interval $d\lambda$ inside the cavity:

energy density = (number of standing waves per unit volume) × (average energy per standing wave)

or

$$u(\lambda)\,d\lambda = \frac{N(\lambda)\,d\lambda}{V}\,kT = \frac{8\pi}{\lambda^4}\,kT\,d\lambda \tag{3.34}$$

The corresponding intensity per unit wavelength interval $d\lambda$ is

$$I(\lambda) = \frac{c}{4}\,u(\lambda) = \frac{c}{4}\,\frac{8\pi}{\lambda^4}\,kT = \frac{2\pi c}{\lambda^4}\,kT \tag{3.35}$$

This result is known as the Rayleigh-Jeans formula; based firmly on the classical theories of electromagnetism and thermodynamics, it represents our best attempt to apply classical physics to understanding the problem of blackbody radiation. In Figure 3.15 the intensity calculated from the Rayleigh-Jeans formula is compared with typical experimental results. The intensity calculated with Eq. 3.35 approaches the data at long wavelengths, but at short wavelengths, the classical theory (which predicts $u \to \infty$ as $\lambda \to 0$) fails miserably. The failure of the Rayleigh-Jeans formula at short wavelengths is known as the ultraviolet catastrophe and represents a serious problem for classical physics, because the theories of thermodynamics and electromagnetism on which the Rayleigh-Jeans formula is based have been carefully tested in many other circumstances and found to give extremely good agreement with experiment. It is apparent in the case of blackbody radiation that the classical theories do not work, and that a new kind of physical theory is needed.

∗ The exponential part of this expression is the same as that of Eq. 1.22 for gas molecules, but the rest of the equation is different, because the statistical behavior of one-dimensional oscillators is different from that of gas molecules moving in three dimensions. We’ll consider these calculations in greater detail in Chapter 10.

FIGURE 3.15 The failure of the classical Rayleigh-Jeans formula to fit the observed intensity $I(\lambda)$, plotted against wavelength (μm). At long wavelengths the theory approaches the data, but at short wavelengths the classical formula fails miserably.

Quantum Theory of Thermal Radiation

The new physics that gave the correct interpretation of thermal radiation was proposed by the German physicist Max Planck in 1900. The ultraviolet catastrophe occurs because the Rayleigh-Jeans formula predicts too much intensity at short wavelengths (or equivalently at high frequencies). What is needed is a way to make $u \to 0$ as $\lambda \to 0$, or as $f \to \infty$. Again considering the electromagnetic standing waves to result from the oscillations of atoms in the walls of the cavity, Planck tried to find a way to reduce the number of high-frequency standing waves by reducing the number of high-frequency oscillators. He did this by a bold assumption that formed the cornerstone of a new physical theory, quantum physics. Associated with this theory is a new version of mechanics, known as wave mechanics or quantum mechanics. We discuss the methods of wave mechanics in Chapter 5; for now we show how Planck’s theory provided the correct interpretation of the emission spectrum of thermal radiation.

Planck suggested that an oscillating atom can absorb or emit energy only in discrete bundles. This bold suggestion was necessary to keep the average energy of a low-frequency (long-wavelength) oscillator equal to kT (in agreement with the Rayleigh-Jeans law at long wavelength), but it also made the average energy of a high-frequency (short-wavelength) oscillator approach zero. Let’s see how Planck managed this remarkable feat.

In Planck’s theory, each oscillator can emit or absorb energy only in quantities that are integer multiples of a certain basic quantity of energy ε,

$$E_n = n\varepsilon \qquad n = 1, 2, 3, \ldots \tag{3.36}$$

where n is the number of quanta. Furthermore, the energy of each of the quanta is determined by the frequency

$$\varepsilon = hf \tag{3.37}$$

where h is the constant of proportionality, now known as Planck’s constant.
From the mathematical standpoint, the difference between Planck’s calculation and the classical calculation using Maxwell-Boltzmann statistics is that the energy of an oscillator at a certain wavelength or frequency is no longer a continuous variable; it is a discrete variable that takes only the values given by Eq. 3.36. The integrals in the classical calculation are then replaced by sums, and the number of oscillators with energy $E_n$ is then

$$N_n = N\left(1 - e^{-\varepsilon/kT}\right)e^{-n\varepsilon/kT} \tag{3.38}$$

(Compare this result with Eq. 3.32 for the continuous case.) Here $N_n$ represents the number of oscillators with energy $E_n$, while N is the total number. You should be able to show that $\sum_{n=0}^{\infty} N_n = N$, again giving the total number of oscillators when summed over all possible energies. Planck’s calculation then gives the average energy:

$$E_{\rm av} = \frac{1}{N}\sum_{n=0}^{\infty} N_n E_n = \left(1 - e^{-\varepsilon/kT}\right)\sum_{n=0}^{\infty} (n\varepsilon)\,e^{-n\varepsilon/kT} \tag{3.39}$$

which gives (see Problem 14)

$$E_{\rm av} = \frac{\varepsilon}{e^{\varepsilon/kT} - 1} = \frac{hf}{e^{hf/kT} - 1} = \frac{hc/\lambda}{e^{hc/\lambda kT} - 1} \tag{3.40}$$

Note from this equation that $E_{\rm av} \cong kT$ at small f (large λ) but that $E_{\rm av} \to 0$ at large f (small λ). Thus the small-wavelength oscillators carry a vanishingly small energy, and the ultraviolet catastrophe is solved!

Max Planck (1858–1947, Germany). His work on the spectral distribution of radiation, which led to the quantum theory, was honored with the 1918 Nobel Prize. In his later years, he wrote extensively on religious and philosophical topics.

Based on Planck’s result, the intensity of the radiation then becomes (using Eqs. 3.28 and 3.31):

$$I(\lambda) = \frac{c}{4}\,\frac{8\pi}{\lambda^4}\left[\frac{hc}{\lambda}\,\frac{1}{e^{hc/\lambda kT} - 1}\right] = \frac{2\pi hc^2}{\lambda^5}\,\frac{1}{e^{hc/\lambda kT} - 1} \tag{3.41}$$

(An alternative approach to deriving this result is given in Section 10.6.) The perfect agreement between experiment and Planck’s formula is illustrated in Figure 3.16.

FIGURE 3.16 Planck’s function fits the observed data perfectly. The plot shows intensity $I(\lambda)$ against wavelength (μm).
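The limiting behavior of Planck’s formula can be made concrete by comparing Eq. 3.41 with the Rayleigh-Jeans formula (Eq. 3.35) at a few wavelengths. This is a minimal sketch rather than a definitive implementation: the function names, the chosen temperature of 1000 K, and the sample wavelengths are assumptions made for illustration.

```python
import math

H = 6.626e-34    # Planck's constant, J*s
C = 2.998e8      # speed of light, m/s
K_B = 1.381e-23  # Boltzmann constant, J/K

def rayleigh_jeans(lam, T):
    """Eq. 3.35: I(lambda) = 2*pi*c*k*T / lambda^4, with lambda in meters."""
    return 2 * math.pi * C * K_B * T / lam**4

def planck(lam, T):
    """Eq. 3.41: I(lambda) = (2*pi*h*c^2 / lambda^5) / (exp(hc/(lambda*k*T)) - 1).

    math.expm1(x) computes exp(x) - 1 accurately for small x; it overflows
    for x above roughly 700, which is outside the range used here.
    """
    x = H * C / (lam * K_B * T)
    return 2 * math.pi * H * C**2 / lam**5 / math.expm1(x)

T = 1000.0  # illustrative temperature in kelvin
for lam_um in (0.5, 2.0, 10.0, 100.0, 1000.0):
    lam = lam_um * 1e-6
    ratio = planck(lam, T) / rayleigh_jeans(lam, T)
    print(f"lambda = {lam_um:7.1f} um   Planck / Rayleigh-Jeans = {ratio:.3e}")
```

The ratio approaches 1 at long wavelengths, where the two formulas agree, and falls toward zero at short wavelengths, where the classical formula diverges.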
In Problems 15 and 16 at the end of this chapter you will demonstrate that Planck’s formula can be used to deduce Stefan’s law and Wien’s displacement law. In fact, deducing Stefan’s law from Planck’s formula results in a relationship between the Stefan-Boltzmann constant and Planck’s constant:

$$\sigma = \frac{2\pi^5 k^4}{15 c^2 h^3} \tag{3.42}$$

By determining the value of the Stefan-Boltzmann constant from the intensity data available in 1900, Planck was able to determine a value of the constant h:

$$h = 6.56 \times 10^{-34}\ \mathrm{J\cdot s}$$

which agrees very well with the value of h that Millikan deduced 15 years later based on the analysis of data from the photoelectric effect. The good agreement of these two values is remarkable, because they are derived from very different kinds of experiments: one involves the emission and the other the absorption of electromagnetic radiation. This suggests that the quantization property is not an accident arising from the analysis of one particular experiment, but is instead a property of the electromagnetic field itself. Along with many other scientists of his era, Planck was slow to accept this interpretation. However, later experimental evidence (including the Compton effect) proved to be so compelling that it left no doubt about Einstein’s photon theory and the particlelike structure of the electromagnetic field.

FIGURE 3.17 Data from the COBE satellite, launched in 1989 to determine the temperature of the cosmic microwave background radiation from the early universe. The data points (intensity versus frequency) exactly fit the Planck function corresponding to a temperature of 2.725 K. To appreciate the remarkable precision of this experiment, note that the sizes of the error bars have been increased by a factor of 400 to make them visible! (Source: NASA Office of Space Science)

Planck’s formula still finds important applications today in the measurement of temperature.
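As a rough sketch of how such a temperature measurement might work, Eq. 3.41 can be solved for T, giving $T = hc / \{\lambda k \ln[1 + 2\pi hc^2/(\lambda^5 I(\lambda))]\}$. The function names and the round-trip check below are assumptions made for illustration; a real instrument records $I(\lambda)\,d\lambda$ over a finite interval, which would first have to be divided by $d\lambda$.

```python
import math

H = 6.626e-34    # Planck's constant, J*s
C = 2.998e8      # speed of light, m/s
K_B = 1.381e-23  # Boltzmann constant, J/K

def planck_intensity(lam, T):
    """Eq. 3.41, intensity per unit wavelength (W/m^3), lambda in meters."""
    x = H * C / (lam * K_B * T)
    return 2 * math.pi * H * C**2 / lam**5 / math.expm1(x)

def temperature_from_intensity(lam, intensity):
    """Invert Eq. 3.41 for T, given I(lambda) at a single wavelength."""
    if intensity <= 0:
        raise ValueError("intensity must be positive")
    y = 2 * math.pi * H * C**2 / (lam**5 * intensity)  # equals exp(hc/(lambda k T)) - 1
    return H * C / (lam * K_B * math.log1p(y))

# Round trip: generate an intensity at a known (hypothetical) temperature,
# then recover that temperature from the single "measurement".
lam = 2.0e-6      # 2 micrometers
T_true = 1500.0   # kelvin
measured = planck_intensity(lam, T_true)
print(temperature_from_intensity(lam, measured))
```

Because Eq. 3.41 is monotonic in T at fixed wavelength, one intensity value at one wavelength fixes the temperature uniquely.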
By measuring the intensity of radiation emitted by an object at a particular wavelength (or, as in actual experiments, in a small interval of wavelengths), Eq. 3.41 can be used to deduce the temperature of the object. Note that a single measurement, at any wavelength, is all that is required to obtain the temperature. A radiometer is a device for measuring the intensity of thermal radiation at selected wavelengths, enabling a determination of temperature. Radiometers in orbiting satellites are used to measure the temperature of the land and sea areas of the Earth and of the upper surface of clouds. Other orbiting radiometers have been aimed toward “empty space” to measure the temperature of the radiation from the early history of the universe (Figure 3.17).

Example 3.6

You are using a radiometer to observe the thermal radiation from an object that is heated to maintain its temperature at 1278 K. The radiometer records radiation in a wavelength interval of 12.6 nm. By changing the wavelength at which you are measuring, you set the radiometer to record the most intense radiation emission from the object. What is the intensity of the emitted radiation in this interval?

Solution

The wavelength setting for the most intense radiation is determined from Wien’s displacement law:

$$\lambda_{\max} = \frac{2.8978 \times 10^{-3}\ \mathrm{m\cdot K}}{T} = \frac{2.8978 \times 10^{-3}\ \mathrm{m\cdot K}}{1278\ \mathrm{K}} = 2.267 \times 10^{-6}\ \mathrm{m} = 2267\ \mathrm{nm}$$

The given temperature corresponds to $kT = (8.6174 \times 10^{-5}\ \mathrm{eV/K})(1278\ \mathrm{K}) = 0.1101\ \mathrm{eV}$. The radiation intensity in this small wavelength interval is

$$\begin{aligned}
I(\lambda)\,d\lambda &= \frac{2\pi hc^2}{\lambda^5}\,\frac{1}{e^{hc/\lambda kT} - 1}\,d\lambda \\
&= 2\pi(6.626 \times 10^{-34}\ \mathrm{J\cdot s})(2.998 \times 10^{8}\ \mathrm{m/s})^2 (12.6 \times 10^{-9}\ \mathrm{m})(2.267 \times 10^{-6}\ \mathrm{m})^{-5} \\
&\quad \times \left(e^{(1240\ \mathrm{eV\cdot nm})/[(2267\ \mathrm{nm})(0.1101\ \mathrm{eV})]} - 1\right)^{-1} \\
&= 552\ \mathrm{W/m^2}
\end{aligned}$$

3.4 THE COMPTON EFFECT

Another way for radiation to interact with matter is by means of the Compton effect, in which radiation scatters from loosely bound, nearly free electrons.
Part of the energy of the radiation is given to the electron; the remainder of the energy is reradiated as electromagnetic radiation. According to the wave picture, the scattered radiation is less energetic than the incident radiation (the difference going into the kinetic energy of the electron) but has the same wavelength. As we will see, the photon concept leads to a very different prediction for the scattered radiation.

The scattering process is analyzed simply as an interaction (a “collision” in the classical sense of particles) between a single photon and an electron, which we assume to be at rest. Figure 3.18 shows the process. Initially, the photon has energy E and linear momentum p given by

$$E = hf = \frac{hc}{\lambda} \qquad \text{and} \qquad p = \frac{E}{c} \tag{3.43}$$

FIGURE 3.18 The geometry of Compton scattering. The incident photon has energy and momentum E, p; the scattered photon has $E'$, $p'$ and leaves at angle θ; the scattered electron has $E_e$, $p_e$ and leaves at angle φ.

The electron, initially at rest, has rest energy $m_e c^2$. After the scattering, the photon has energy $E' = hc/\lambda'$ and momentum $p' = E'/c$, and it moves in a direction at an angle θ with respect to the direction of the incident photon. The electron has total final energy $E_e$ and momentum $p_e$ and moves in a direction at an angle φ with respect to the initial photon. (To allow for the possibility of high-energy incident photons giving energetic scattered electrons, we use relativistic kinematics for the electron.) The conservation laws for total relativistic energy and momentum are then applied:

$$E_{\rm initial} = E_{\rm final}: \qquad E + m_e c^2 = E' + E_e \tag{3.44a}$$

$$p_{x,\rm initial} = p_{x,\rm final}: \qquad p = p_e \cos\phi + p' \cos\theta \tag{3.44b}$$

$$p_{y,\rm initial} = p_{y,\rm final}: \qquad 0 = p_e \sin\phi - p' \sin\theta \tag{3.44c}$$

We have three equations with four unknowns (θ, φ, $E_e$, $E'$; $p_e$ and $p'$ are not independent unknowns) that cannot be solved uniquely, but we can eliminate any two of the four unknowns by solving the equations simultaneously.
If we choose to measure the energy and direction of the scattered photon, we eliminate $E_e$ and φ. The angle φ is eliminated by first rewriting the momentum equations:

$$p_e \cos\phi = p - p' \cos\theta \qquad \text{and} \qquad p_e \sin\phi = p' \sin\theta \tag{3.45}$$

Squaring these equations and adding the results, we obtain

$$p_e^2 = p^2 - 2pp' \cos\theta + p'^2 \tag{3.46}$$

The relativistic relationship between energy and momentum is, according to Eq. 2.39, $E_e^2 = c^2 p_e^2 + m_e^2 c^4$. Substituting in this equation for $E_e$ from Eq. 3.44a and for $p_e^2$ from Eq. 3.46, we obtain

$$(E + m_e c^2 - E')^2 = c^2(p^2 - 2pp' \cos\theta + p'^2) + m_e^2 c^4 \tag{3.47}$$

and after a bit of algebra, we find

$$\frac{1}{E'} - \frac{1}{E} = \frac{1}{m_e c^2}(1 - \cos\theta) \tag{3.48}$$

In terms of wavelength, this equation can also be written as

$$\lambda' - \lambda = \frac{h}{m_e c}(1 - \cos\theta) \tag{3.49}$$

where λ is the wavelength of the incident photon and $\lambda'$ is the wavelength of the scattered photon. The quantity $h/m_e c$ is known as the Compton wavelength of the electron and has a value of 0.002426 nm; however, keep in mind that it is not a true wavelength but rather is a change of wavelength.

Arthur H. Compton (1892–1962, United States). His work on X-ray scattering verified Einstein’s photon theory and earned him the 1927 Nobel Prize. He was a pioneer in research with X rays and cosmic rays. During World War II he directed a portion of the U.S. atomic bomb research.

Equations 3.48 and 3.49 give the change in energy or wavelength of the photon, as a function of the scattering angle θ. Because the quantity on the right-hand side is never negative, $E'$ is always less than E, so that the scattered photon has less energy than the original incident photon; the difference $E - E'$ is just the kinetic energy given to the electron, $E_e - m_e c^2$. Similarly, $\lambda'$ is greater than λ, meaning the scattered photon always has a longer wavelength than the incident photon; the change in wavelength ranges from 0 at θ = 0° to twice the Compton wavelength at θ = 180°.
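The “bit of algebra” connecting Eq. 3.47 to Eq. 3.48 can be spot-checked numerically: if $E'$ is computed from Eq. 3.48, the electron energy from Eq. 3.44a and the electron momentum from Eq. 3.46 should satisfy the relativistic energy-momentum relation. The sketch below assumes hypothetical function names and an arbitrary set of test energies and angles; the constants are the rounded values quoted in this chapter.

```python
import math

ME_C2 = 511.0e3        # electron rest energy m_e c^2, in eV
HC = 1240.0            # h*c in eV*nm
COMPTON_NM = 0.002426  # Compton wavelength h/(m_e c), in nm

def scattered_wavelength(lam_nm, theta_deg):
    """Eq. 3.49: wavelength of the scattered photon, in nm."""
    return lam_nm + COMPTON_NM * (1 - math.cos(math.radians(theta_deg)))

def scattered_energy(E_eV, theta_deg):
    """Eq. 3.48 solved for E', in eV."""
    return 1.0 / (1.0 / E_eV + (1 - math.cos(math.radians(theta_deg))) / ME_C2)

def energy_momentum_residual(E_eV, theta_deg):
    """Relative mismatch in E_e^2 = (c p_e)^2 + (m_e c^2)^2.

    E_e comes from energy conservation (Eq. 3.44a) and (c p_e)^2 from
    Eq. 3.46 multiplied by c^2, using cp = E and cp' = E' for the photons.
    """
    E_prime = scattered_energy(E_eV, theta_deg)
    cos_t = math.cos(math.radians(theta_deg))
    E_e = E_eV + ME_C2 - E_prime
    cp_e_sq = E_eV**2 - 2 * E_eV * E_prime * cos_t + E_prime**2
    return (E_e**2 - cp_e_sq - ME_C2**2) / E_e**2

for E in (5.0e3, 100.0e3, 1.0e6):          # incident photon energies in eV
    for theta in (0.0, 45.0, 90.0, 180.0):  # scattering angles in degrees
        print(E, theta, scattered_energy(E, theta), energy_momentum_residual(E, theta))
```

The residual is zero to within floating-point rounding for every energy and angle, which is what Eq. 3.48 being an exact consequence of Eq. 3.47 requires.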
Of course the descriptions in terms of energy and wavelength are equivalent, and the choice of which to use is merely a matter of convenience.

Using $E_e = K_e + m_e c^2$, where $K_e$ is the kinetic energy of the electron, conservation of energy (Eq. 3.44a) can also be written as $E + m_e c^2 = E' + K_e + m_e c^2$. Solving for $K_e$, we obtain

$$K_e = E - E' \tag{3.50}$$

That is, the kinetic energy acquired by the electron is equal to the difference between the initial and final photon energies.

We can also find the direction of the electron’s motion by dividing the two momentum relationships in Equation 3.45:

$$\tan\phi = \frac{p_e \sin\phi}{p_e \cos\phi} = \frac{p' \sin\theta}{p - p' \cos\theta} = \frac{E' \sin\theta}{E - E' \cos\theta} \tag{3.51}$$

where the last result comes from using $p = E/c$ and $p' = E'/c$.

Example 3.7

X rays of wavelength 0.2400 nm are Compton-scattered, and the scattered beam is observed at an angle of 60.0° relative to the incident beam. Find: (a) the wavelength of the scattered X rays, (b) the energy of the scattered X-ray photons, (c) the kinetic energy of the scattered electrons, and (d) the direction of travel of the scattered electrons.

Solution

(a) $\lambda'$ can be found immediately from Eq. 3.49:

$$\lambda' = \lambda + \frac{h}{m_e c}(1 - \cos\theta) = 0.2400\ \mathrm{nm} + (0.00243\ \mathrm{nm})(1 - \cos 60^\circ) = 0.2412\ \mathrm{nm}$$

(b) The energy $E'$ can be found directly from $\lambda'$:

$$E' = \frac{hc}{\lambda'} = \frac{1240\ \mathrm{eV\cdot nm}}{0.2412\ \mathrm{nm}} = 5141\ \mathrm{eV}$$

(c) The initial photon energy E is $hc/\lambda = 5167$ eV, so

$$K_e = E - E' = 5167\ \mathrm{eV} - 5141\ \mathrm{eV} = 26\ \mathrm{eV}$$

(d) From Eq. 3.51,

$$\phi = \tan^{-1}\frac{E' \sin\theta}{E - E' \cos\theta} = \tan^{-1}\frac{(5141\ \mathrm{eV})(\sin 60^\circ)}{(5167\ \mathrm{eV}) - (5141\ \mathrm{eV})(\cos 60^\circ)} = 59.7^\circ$$

The first experimental demonstration of this type of scattering was done by Arthur Compton in 1923. A diagram of his experimental arrangement is shown in Figure 3.19. A beam of X rays of a single wavelength λ is incident on a scattering target, for which Compton used carbon.
(Although no scattering target contains actual “free” electrons, the outer or valence electrons in many materials are very weakly attached to the atom and behave like nearly free electrons. The binding energies of these electrons in the atom are so small compared with the energies of the incident X-ray photons that they can be regarded as nearly “free” electrons.) A movable detector measured the energy of the scattered X rays at various angles θ.

FIGURE 3.19 Schematic diagram of Compton-scattering apparatus. The wavelength $\lambda'$ of the scattered X rays is measured by the detector, which can be moved to different positions θ. The wavelength difference $\lambda' - \lambda$ varies with θ.

Compton’s original results are illustrated in Figure 3.20. At each angle, two peaks appear, corresponding to scattered X-ray photons with two different energies or wavelengths. The wavelength of one peak does not change as the angle is varied; this peak corresponds to scattering that involves “inner” electrons of the atom, which are more tightly bound to the atom so that the photon can scatter with no loss of energy. The wavelength of the other peak, however, varies strongly with angle; as can be seen from Figure 3.21, this variation is exactly as the Compton formula predicts.

FIGURE 3.20 Compton’s original results for X-ray scattering. The panels show scattering angles of 0° (λ = 0.0709 nm), 45° ($\lambda'$ = 0.0715 nm), 90° ($\lambda'$ = 0.0731 nm), and 135° ($\lambda'$ = 0.0749 nm).

FIGURE 3.21 The scattered X-ray wavelengths $\lambda'$ (in units of $10^{-12}$ m), from Figure 3.20, plotted against 1 − cos θ for different scattering angles. The expected slope is $2.43 \times 10^{-12}$ m, in agreement with the measured slope of Compton’s data points (slope = $2.4 \times 10^{-12}$ m).

Similar results can be obtained for the scattering of gamma rays, which are higher-energy (shorter wavelength) photons emitted in various radioactive decays. Compton also measured the variation in wavelength of scattered gamma rays, as illustrated in Figure 3.22. The change in wavelength in the gamma-ray measurements is identical with the change in wavelength in the X-ray measurements, as Eq. 3.49 predicts: the change in wavelength does not depend on the incident wavelength.

FIGURE 3.22 Compton’s results for gamma-ray scattering, with $\lambda'$ (in units of $10^{-12}$ m) plotted against 1 − cos θ (slope = $2.4 \times 10^{-12}$ m). The wavelengths are much smaller than for X-rays, but the slope is the same as in Figure 3.21, which the Compton formula, Eq. 3.49, predicts.

3.5 OTHER PHOTON PROCESSES

Although thermal radiation, the photoelectric effect, and Compton scattering provided the earliest experimental evidence in support of the quantization (particlelike behavior) of electromagnetic radiation, there are numerous other experiments that can also be interpreted correctly only if we assume the existence of photons as discrete quanta of electromagnetic radiation. In this section we discuss some of these processes, which cannot be understood if we consider only the wave nature of electromagnetic radiation. As you study the descriptions of these processes, note how photons interact with atoms or electrons by delivering energy in discrete bundles, in contrast to the wave interpretation in which the energy can be regarded as arriving continuously.

Interactions of Photons with Atoms

The emission of electromagnetic radiation from atoms takes place in discrete amounts characterized by one or more photons. When an atom emits a photon of energy E, the atom loses an equivalent amount of energy. Consider an atom at rest that has an initial energy $E_i$. The atom emits a photon of energy E.
After the emission, the atom is left with a final energy $E_f$, which we will take as the energy associated with the internal structure of the atom. Because of conservation of momentum, the final atom must have a momentum that is equal and opposite to the momentum of the emitted photon, so the atom must also have a “recoil” kinetic energy K. (Normally this kinetic energy is very small.) Conservation of energy then gives

$$E_i = E_f + K + E \qquad \text{or} \qquad E = (E_i - E_f) - K \tag{3.52}$$

The energy of the emitted photon is equal to the net energy lost by the atom, minus a negligibly small contribution to the recoil kinetic energy of the atom.

In the reverse process, an atom can absorb a photon of energy E. If the atom is initially at rest, it must again acquire a small recoil kinetic energy in order to conserve momentum. Now conservation of energy gives

$$E_i + E = E_f + K \qquad \text{or} \qquad E_f - E_i = E - K \tag{3.53}$$

The energy available to add to the atom’s internal supply of energy is the photon energy, less a recoil kinetic energy that is usually negligible.

Photon emission and absorption experiments are among the most important techniques for acquiring information about the internal structure of atoms, as we discuss in Chapter 6.

Bremsstrahlung and X-ray Production

When an electric charge, such as an electron, is accelerated or decelerated, it radiates electromagnetic energy; according to the quantum interpretation, we would say that it emits photons. Suppose we have a beam of electrons, which has been accelerated through a potential difference $\Delta V$, so that the electrons experience a change in potential energy of $-e\,\Delta V$ and thus acquire a kinetic energy of $K = e\,\Delta V$ (Figure 3.23). When the electrons strike a target they are slowed down and eventually come to rest, because they make collisions with the atoms of the target material. In such a collision, momentum is transferred to the atom, the electron slows down, and photons are emitted.
The recoil kinetic energy of the atom is small (because the atom is so massive) and can safely be neglected. If the electron has a kinetic energy K before the encounter and if it leaves after the collision with a smaller kinetic energy $K'$, then the photon energy $hf = hc/\lambda$ is

$$hf = \frac{hc}{\lambda} = K - K' \tag{3.54}$$

The amount of energy lost, and therefore the energy and wavelength of the emitted photon, are not uniquely determined, because K is the only known energy in Eq. 3.54. An electron usually will make many collisions, and therefore emit many different photons, before it is brought to rest; the photons then will range all the way from very small energies (large wavelengths) corresponding to small energy losses, up to a maximum photon energy $hf_{\max}$ equal to K, corresponding to an electron that loses all of its kinetic energy K in a single encounter (that is, when $K' = 0$). The smallest emitted wavelength $\lambda_{\min}$ is therefore determined by the maximum possible energy loss,

$$\lambda_{\min} = \frac{hc}{K} = \frac{hc}{e\,\Delta V} \tag{3.55}$$

FIGURE 3.23 (a) Apparatus for producing bremsstrahlung. Electrons from a cathode C are accelerated to the anode A through the potential difference $\Delta V$. When an electron encounters a target atom of the anode, it can lose energy, with the accompanying emission of an X-ray photon. (b) A schematic representation of the bremsstrahlung process.

For typical accelerating voltages in the range of 10,000 V, $\lambda_{\min}$ is in the range of a few tenths of nm, which corresponds to the X-ray region of the spectrum. This continuous distribution of X rays (which is very different from the discrete X-ray energies that are emitted in atomic transitions; more about these in Chapter 8) is called bremsstrahlung, which is German for braking, or decelerating, radiation. Some sample bremsstrahlung spectra are illustrated in Figure 3.24.
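Equation 3.55 is easy to evaluate for a range of accelerating voltages. The sketch below assumes the rounded value hc = 1240 eV·nm used in this chapter; the function name and the list of voltages are illustrative choices, not taken from the text.

```python
HC_EV_NM = 1240.0  # h*c in eV*nm (rounded value used in this chapter)

def min_wavelength_nm(delta_v_volts):
    """Eq. 3.55: lambda_min = hc / (e * delta_V).

    An electron accelerated through delta_V volts gains K = e * delta_V,
    which is numerically delta_V when K is expressed in eV.
    """
    if delta_v_volts <= 0:
        raise ValueError("accelerating voltage must be positive")
    return HC_EV_NM / delta_v_volts

for dv in (10e3, 20e3, 30e3, 40e3, 50e3):
    print(f"{dv / 1e3:4.0f} kV  ->  lambda_min = {min_wavelength_nm(dv):.4f} nm")
```

Raising the voltage shortens the cutoff wavelength in inverse proportion, so doubling $\Delta V$ halves $\lambda_{\min}$.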
Symbolically we can write the bremsstrahlung process as

electron → electron + photon

This is just the reverse process of the photoelectric effect, which is

electron + photon → electron

However, neither process occurs for free electrons. In both cases there must be a heavy atom in the neighborhood to take care of the recoil momentum.

FIGURE 3.24 Some typical bremsstrahlung spectra, shown as relative intensity versus wavelength (nm). Each spectrum is labeled with the value of the accelerating voltage $\Delta V$ (20 kV, 30 kV, 40 kV, and 50 kV).

Pair Production and Annihilation

Another process that can occur when photons encounter atoms is pair production, in which the photon loses all its energy and in the process two particles are created: an electron and a positron. (A positron is a particle that is identical in mass to the electron but has a positive electric charge; more about antiparticles in Chapter 14.) Here we have an example of the creation of rest energy. The electron did not exist before the encounter of the photon with the atom (it was not an electron that was part of the atom). The photon energy hf is converted into the relativistic total energies $E_+$ and $E_-$ of the positron and electron:

$$hf = E_+ + E_- = (m_e c^2 + K_+) + (m_e c^2 + K_-) \tag{3.56}$$

Because $K_+$ and $K_-$ are always positive, the photon must have an energy of at least $2m_e c^2 = 1.02$ MeV in order for this process to occur; such high-energy photons are in the region of nuclear gamma rays. Symbolically,

photon → electron + positron

This process, like bremsstrahlung, will not occur unless there is an atom nearby to supply the necessary recoil momentum. The reverse process,

electron + positron → photon

also occurs; this process is known as electron-positron annihilation and can occur for free electrons and positrons as long as at least two photons are created. In this process the electron and positron disappear and are replaced by two photons. Conservation of energy requires that

$$(m_e c^2 + K_+) + (m_e c^2 + K_-) = E_1 + E_2 \tag{3.57}$$
where $E_1$ and $E_2$ are the photon energies. Usually the kinetic energies $K_+$ and $K_-$ are negligibly small, so we can assume the positron and electron to be essentially at rest. Momentum conservation then requires the two photons to have equal and opposite momenta and thus equal energies. The two annihilation photons have equal energies of 0.511 MeV ($= m_e c^2$) and move in exactly opposite directions.

3.6 WHAT IS A PHOTON?

We can describe photons by giving a few of their basic properties:

• like an electromagnetic wave, photons move with the speed of light;
• they have zero mass and rest energy;
• they carry energy and momentum, which are related to the frequency and wavelength of the electromagnetic wave by $E = hf$ and $p = h/\lambda$;
• they can be created or destroyed when radiation is emitted or absorbed;
• they can have particlelike collisions with other particles such as electrons.

FIGURE 3.25 Apparatus for delayed choice experiment. Photons from the laser strike the beam splitter and can then travel paths A or B. The switch in path A can deflect the beam into a detector. If the switch is off, the beam on path A recombines with the beam on path B to form an interference pattern. [Source: A. Shimony, “The Reality of the Quantum World,” Scientific American 258, 46 (January 1988)].

In this chapter we have described some experiments that favor the photon interpretation of electromagnetic radiation, according to which the energy of the radiation is concentrated in small bundles. Other experiments, such as interference and diffraction, favor the wave interpretation, according to which the energy of the radiation is spread over its entire wavefront. For example, the explanation of the double-slit interference experiment requires that the wavefront be divided so that some of its intensity can pass through each slit.
A particle must choose to go through one slit or the other; only a wave can go through both. If we regard the wave and particle pictures as valid but exclusive alternatives, we must assume that the light emitted by a source must travel either as waves or as particles. How does the source know what kind of light (particles or waves) to emit? Suppose we place a double-slit apparatus on one side of the source and a photoelectric cell on the other side. Light emitted toward the double slit behaves like a wave and light emitted toward the photocell behaves like particles. How did the source know in which direction to aim the waves and in which direction to aim the particles?

Perhaps nature has a sort of “secret code” in which the kind of experiment we are doing is signaled back to the source so that it knows whether to emit particles or waves. Let us repeat our dual experiment with light from a distant galaxy, light that has been traveling toward us for a time roughly equal to the age of the universe ($13 \times 10^{9}$ years). Surely the kind of experiment we are doing could not be signaled back to the limits of the known universe in the time it takes us to remove the double-slit apparatus from the laboratory table and replace it with the photoelectric apparatus. Yet we find that the starlight can produce both the double-slit interference and also the photoelectric effect.

Figure 3.25 shows a recent experiment that was designed to test whether this dual nature is an intrinsic property of light or of our apparatus. A light beam from a laser goes through a beam splitter, which separates the beam into two components (A and B). The mirrors reflect the two component beams so that they can recombine to form an interference pattern. In path A there is a switch that can deflect the beam into a detector. If the switch is off, beam A is not deflected and will combine with beam B to produce the interference pattern. If the switch is on,
beam A is deflected and observed in the detector, indicating that the light traveled a definite path, as would be characteristic of a particle. To put this another way, if the switch is off, the light beam is observed as a wave; if it is on, the light beam is observed as particles. If light behaves like particles, the beam splitter sends it along either path A or path B; either path can be randomly chosen for the particle, but each particle can travel only one path. If light behaves like a wave, on the other hand, the beam splitter sends it along both paths, dividing its intensity between the two. Perhaps the beam splitter can somehow sense whether the switch is open or closed, so that it knows whether we are doing a particle-type or a wave-type experiment. If this were true, then the beam splitter would “know” whether to send all of the intensity down one path (so that we would observe a particle) or to split the intensity between the paths (so that we would observe a wave). However, in this experiment the experimenters used a very fast optical switch whose response time was shorter than the time it takes for light to travel through the apparatus to the switch. That is, the state of the switch could be changed after the light had already passed through the beam splitter, and so it was impossible for the beam splitter to “know” how the switch was set and thus whether a particle-type or a wave-type experiment was being done. This kind of experiment is called a “delayed choice” experiment, because the experimenter makes the choice of what kind of experiment to do after the light is already traveling on its way to the observation apparatus. In this experiment, the investigators discovered that whenever they had the switch off, they observed the interference pattern characteristic of waves. When they had the switch on, they observed particles in the detector and no interference pattern. 
That is, whenever they did a wave-type experiment they observed waves, and whenever they did a particle-type experiment they observed particles. The wave and particle natures are both present simultaneously in the light, and this dual nature is clearly associated with the light and is not characteristic of the apparatus. Many other experiments of this type have been done, and they all produce similar results. We are therefore trapped into an uncomfortable conclusion: Light is not either particles or waves; it is somehow both particles and waves, and only shows one or the other aspect, depending on the kind of experiment we are doing. A particle-type experiment shows the particle nature, while a wave-type experiment shows the wave nature. Our failure to classify light as either particle or wave is not so much a failure to understand the nature of light as it is a failure of our limited vocabulary (based on experiences with ordinary particles and waves) to describe a phenomenon that is more elegant and mysterious than either simple particles or waves.

Wave-Particle Duality

The dilemma of the dual particle and wave nature of light, which is called wave-particle duality, cannot be resolved with a simple explanation; physicists and philosophers have struggled with this problem ever since the quantum theory was introduced. The best we can do is to say that neither the wave nor the particle picture is wholly correct all of the time, that both are needed for a complete description of physical phenomena, and that in fact the two are complementary to one another.

Suppose we use a photographic film to observe the double-slit interference pattern. The film responds to individual photons. When a single photon is absorbed by the film, a single grain of the photographic emulsion is darkened; a complete picture requires a large number of grains to be darkened.
Let us imagine for the moment that we could see individual grains of the film as they absorbed photons and darkened, and let us do the double-slit experiment with a light source that is so weak that there is a relatively long time interval between photons. We would see first one grain darken, then another, and so forth, until after a large number of photons we would see the interference pattern begin to emerge. Some areas of the film (the interference maxima) show evidence for the arrival of a large number of photons, while in other areas (the interference minima) few photons arrive. Alternatively, the wave picture of the double-slit experiment suggests that we could find the net electric field of the wave that strikes the screen by superimposing the electric fields of the portions of the incident wave fronts that pass through the two slits; the intensity or power in that combined wave could then be found by a procedure similar to Eqs. 3.7 through 3.10, and we would expect that the resultant intensity should show maxima and minima just like the observed double-slit interference pattern. In summary, the correct explanation of the origin and appearance of the interference pattern comes from the wave picture, and the correct interpretation of the evolution of the pattern on the film comes from the photon picture; the two explanations, which according to our limited vocabulary and common-sense experience cannot simultaneously be correct, must somehow be taken together to give a complete description of the properties of electromagnetic radiation. Keep in mind that “photon” and “wave” represent descriptions of the behavior of electromagnetic radiation when it encounters material objects. It is not correct to think of light as being “composed” of photons, just as we don’t think of light as being “composed” of waves. 
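The grain-by-grain buildup of the pattern described above can be imitated with a short simulation. The sketch below is only an illustration under stated assumptions: it uses an idealized two-slit intensity proportional to cos²(πyd/λD) with no single-slit envelope, it treats each photon arrival as an independent random draw whose probability follows that intensity, and the wavelength, slit separation, and screen distance are made-up values. All function and variable names are hypothetical.

```python
import math
import random

def fringe_intensity(y, wavelength, screen_distance, slit_separation):
    """Relative two-slit intensity at screen position y (idealized: no single-slit envelope)."""
    phase = math.pi * slit_separation * y / (wavelength * screen_distance)
    return math.cos(phase) ** 2

def sample_photon_hits(n_photons, wavelength, screen_distance, slit_separation,
                       half_width, seed=0):
    """Draw photon arrival positions with probability proportional to the wave intensity."""
    rng = random.Random(seed)
    hits = []
    while len(hits) < n_photons:
        y = rng.uniform(-half_width, half_width)
        # Rejection sampling: keep y with probability equal to the relative intensity there.
        if rng.random() < fringe_intensity(y, wavelength, screen_distance, slit_separation):
            hits.append(y)
    return hits

def histogram(hits, half_width, n_bins):
    """Count hits in n_bins equal bins spanning [-half_width, +half_width]."""
    counts = [0] * n_bins
    for y in hits:
        i = min(int((y + half_width) / (2 * half_width) * n_bins), n_bins - 1)
        counts[i] += 1
    return counts

# Illustrative values only (assumptions, not taken from the text).
WAVELENGTH = 500e-9          # m
SCREEN_DISTANCE = 1.0        # m
SLIT_SEPARATION = 0.5e-3     # m
SPACING = WAVELENGTH * SCREEN_DISTANCE / SLIT_SEPARATION  # distance between adjacent maxima
HALF_WIDTH = 2 * SPACING     # the "film" covers four fringe spacings

for n in (20, 200, 20000):
    hits = sample_photon_hits(n, WAVELENGTH, SCREEN_DISTANCE, SLIT_SEPARATION, HALF_WIDTH)
    print(n, histogram(hits, HALF_WIDTH, 16))
```

With only a few photons the counts look random; as the number grows, the bins near the maxima fill while those near the minima stay nearly empty, which is the behavior the text describes for the film.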
The explanation in terms of photons applies to some interactions of radiation with matter, while the explanation in terms of waves applies to other interactions. For example, when we say that an atom “emits” a photon, we don’t mean that there is a supply of photons stored within the atom; instead, we mean that the atom has given up a quantity of its internal energy to create an equivalent amount of energy in the form of electromagnetic radiation. In the case of the double-slit experiment, we might reason as follows: the interaction between a “source” of radiation and the electromagnetic field is quantized, so that we can think of the emission of radiation by the atoms of the source in terms of individual photons. The interaction at the opposite end of the experiment, the photographic film, is also quantized, and we have the similarly useful view of atoms absorbing radiation as individual photons. In between, the electromagnetic radiation propagates smoothly and continuously as a wave and can show wave-type behavior (interference or diffraction) when it encounters the double slit. Where the wave has large intensity, the film reveals the presence of many photons; where the wave has small intensity, few photons are observed. Recalling that the intensity of the wave is proportional to the square of its amplitude, we then have

$$\text{probability to observe photons} \propto |\text{electric field amplitude}|^2$$

It is this expression that provides the ultimate connection between the wave behavior and the particle behavior, and we will see in the next two chapters that a similar expression connects the wave and the particle aspects of those objects, such as electrons, which have been previously considered to behave as classical particles.

Chapter Summary

Double-slit maxima (Section 3.1): $y_n = n\,\lambda D/d$, $n = 0, 1, 2, 3, \ldots$
Bragg’s law for X-ray diffraction (Section 3.1): $2d \sin\theta = n\lambda$, $n = 1, 2, 3, \ldots$
Energy of photon (Section 3.2): $E = hf = hc/\lambda$
Maximum kinetic energy of photoelectrons (Section 3.2): $K_{\max} = eV_s = hf - \phi$
Cutoff wavelength (Section 3.2): $\lambda_c = hc/\phi$
Stefan’s law (Section 3.3): $I = \sigma T^4$
Wien’s displacement law (Section 3.3): $\lambda_{\max} T = 2.8978 \times 10^{-3}\ \mathrm{m \cdot K}$
Rayleigh-Jeans formula (Section 3.3): $I(\lambda) = \dfrac{2\pi c}{\lambda^4}\,kT$
Planck’s blackbody distribution (Section 3.3): $I(\lambda) = \dfrac{2\pi h c^2}{\lambda^5}\,\dfrac{1}{e^{hc/\lambda kT} - 1}$
Compton scattering (Section 3.4): $\dfrac{1}{E'} - \dfrac{1}{E} = \dfrac{1}{m_e c^2}(1 - \cos\theta)$, $\quad \lambda' - \lambda = \dfrac{h}{m_e c}(1 - \cos\theta)$
Bremsstrahlung (Section 3.5): $\lambda_{\min} = hc/K = hc/eV$
Pair production (Section 3.5): $hf = E_+ + E_- = (m_e c^2 + K_+) + (m_e c^2 + K_-)$
Electron-positron annihilation (Section 3.5): $(m_e c^2 + K_+) + (m_e c^2 + K_-) = E_1 + E_2$

Questions

1. The diameter of an atomic nucleus is about $10 \times 10^{-15}$ m. Suppose you wanted to study the diffraction of photons by nuclei. What energy of photons would you choose? Why?
2. How is the wave nature of light unable to account for the observed properties of the photoelectric effect?
3. In the photoelectric effect, why do some electrons have kinetic energies smaller than $K_{\max}$?
4. Why doesn’t the photoelectric effect work for free electrons?
5. What does the work function tell us about the properties of a metal? Of the metals listed in Table 3.1, which has the least tightly bound electrons? Which has the most tightly bound?
6. Electric current is charge flowing per unit time. If we increase the kinetic energy of the photoelectrons (by increasing the energy of the incident photons), shouldn’t the current increase, because the charge flows more rapidly? Why doesn’t it?
7. What might be the effects on a photoelectric effect experiment if we were to double the frequency of the incident light? If we were to double the wavelength? If we were to double the intensity?
8. In the photoelectric effect, how can a photon moving in one direction eject an electron moving in a different direction? What happens to conservation of momentum?
9. In Figure 3.10, why does the photoelectric current rise slowly to its saturation value instead of rapidly, when the potential difference is greater than $V_s$?
What does this figure indicate about the experimental difficulties that might arise from trying to determine $V_s$ in this way?
10. Suppose that the frequency of a certain light source is just above the cutoff frequency of the emitter, so that the photoelectric effect occurs. To an observer in relative motion, the frequency might be Doppler shifted to a lower value that is below the cutoff frequency. Would this moving observer conclude that the photoelectric effect does not occur? Explain.
11. Why do cavities that form in a wood fire seem to glow brighter than the burning wood itself? Is the temperature in such cavities hotter than the surface temperature of the exposed burning wood?
12. What are the fields of classical physics on which the classical theory of blackbody radiation is based? Why don’t we believe that the “ultraviolet catastrophe” suggests that something is wrong with one of those classical theories?
13. In what region of the electromagnetic spectrum do room-temperature objects radiate? What problems would we have if our eyes were sensitive in that region?
14. How does the total intensity of thermal radiation vary when the temperature of an object is doubled?
15. Compton-scattered photons of wavelength λ′ are observed at 90°. In terms of λ′, what is the scattered wavelength observed at 180°?
16. The Compton-scattering formula suggests that objects viewed from different angles should show scattered light of different wavelengths. Why don’t we observe a change in color of objects as we change the viewing angle?
17. You have a monoenergetic source of X rays of energy 84 keV, but for an experiment you need 70 keV X rays. How would you convert the X-ray energy from 84 to 70 keV?
18. TV sets with picture tubes can be significant emitters of X rays. What is the origin of these X rays? Estimate their wavelengths.
19.
The X-ray peaks of Figure 3.20 are not sharp but are spread over a range of wavelengths. What reasons might account for that spreading?
20. A beam of photons passes through a block of matter. What are the three ways discussed in this chapter that the photons can lose energy in interacting with the material?
21. Of the photon processes discussed in this chapter (photoelectric effect, thermal radiation, Compton scattering, bremsstrahlung, pair production, electron-positron annihilation), which conserve momentum? Energy? Mass? Number of photons? Number of electrons? Number of electrons minus number of positrons?

Problems

3.1 Review of Electromagnetic Waves

1. A double-slit experiment is performed with sodium light (λ = 589.0 nm). The slits are separated by 1.05 mm, and the screen is 2.357 m from the slits. Find the separation between adjacent maxima on the screen.
2. In Example 3.1, what angle of incidence will produce the second-order Bragg peak?
3. Monochromatic X rays are incident on a crystal in the geometry of Figure 3.5. The first-order Bragg peak is observed when the angle of incidence is 34.0°. The crystal spacing is known to be 0.347 nm. (a) What is the wavelength of the X rays? (b) Now consider a set of crystal planes that makes an angle of 45° with the surface of the crystal (as in Figure 3.6). For X rays of the same wavelength, find the angle of incidence measured from the surface of the crystal that produces the first-order Bragg peak. At what angle from the surface does the emerging beam appear in this case?
4. A certain device for analyzing electromagnetic radiation is based on the Bragg scattering of the radiation from a crystal. For radiation of wavelength 0.149 nm, the first-order Bragg peak appears centered at an angle of 15.15°. The aperture of the analyzer passes radiation in the angular range of 0.015°. What is the corresponding range of wavelengths passing through the analyzer?

3.2 The Photoelectric Effect

5. Find the momentum of (a) a 10.0-MeV gamma ray; (b) a 25-keV X ray; (c) a 1.0-μm infrared photon; (d) a 150-MHz radio-wave photon. Express the momentum in kg · m/s and eV/c.
6. Radio waves have a frequency of the order of 1 to 100 MHz. What is the range of energies of these photons? Our bodies are continuously bombarded by these photons. Why are they not dangerous to us?
7. (a) What is the wavelength of an X-ray photon of energy 10.0 keV? (b) What is the wavelength of a gamma-ray photon of energy 1.00 MeV? (c) What is the range of energies of photons of visible light with wavelengths 350 to 700 nm?
8. What is the cutoff wavelength for the photoelectric effect using an aluminum surface?
9. A metal surface has a photoelectric cutoff wavelength of 325.6 nm. It is illuminated with light of wavelength 259.8 nm. What is the stopping potential?
10. When light of wavelength λ illuminates a copper surface, the stopping potential is V. In terms of V, what will be the stopping potential if the same wavelength is used to illuminate a sodium surface?
11. The cutoff wavelength for the photoelectric effect in a certain metal is 254 nm. (a) What is the work function for that metal? (b) Will the photoelectric effect be observed for λ > 254 nm or for λ < 254 nm?
12. A surface of zinc is illuminated and photoelectrons are observed. (a) What is the largest wavelength that will cause photoelectrons to be emitted? (b) What is the stopping potential when light of wavelength 220.0 nm is used?

3.3 Blackbody Radiation

13. (a) Show that in the classical result for the energy distribution of the cavity wall oscillators (Eq. 3.32), the total number of oscillators at all energies is N. (b) Show that $E_{\mathrm{av}} = kT$ for the classical oscillators.
14. (a) Writing the discrete Maxwell-Boltzmann distribution for Planck’s cavity wall oscillators as $N_n = A e^{-E_n/kT}$ (where A is a constant to be determined), show that the condition $\sum_{n=0}^{\infty} N_n = N$ gives $A = N(1 - e^{-\varepsilon/kT})$ as in Eq. 3.38. [Hint: Use $\sum_{n=0}^{\infty} e^{nx} = (1 - e^{x})^{-1}$.] (b) By taking the derivative with respect to x of the equation given in the hint, show that $\sum_{n=0}^{\infty} n e^{nx} = e^{x}/(1 - e^{x})^{2}$. (c) Use this result to derive Eq. 3.40 from Eq. 3.39. (d) Show that $E_{\mathrm{av}} \cong kT$ at large λ and $E_{\mathrm{av}} \to 0$ for small λ.
15. By differentiating Eq. 3.41 show that I(λ) has its maximum as expected according to Wien’s displacement law, Eq. 3.27.
16. Integrate Eq. 3.41 to obtain Eq. 3.26. Use the definite integral $\int_0^{\infty} x^3\,dx/(e^{x} - 1) = \pi^4/15$ to obtain Eq. 3.42 relating the Stefan-Boltzmann constant to Planck’s constant.
17. Use the numerical value of the Stefan-Boltzmann constant to find the numerical value of Planck’s constant from Eq. 3.42.
18. The surface of the Sun has a temperature of about 6000 K. At what wavelength does the Sun emit its peak intensity? How does this compare with the peak sensitivity of the human eye?
19. The universe is filled with thermal radiation, which has a blackbody spectrum at an effective temperature of 2.7 K (see Chapter 15). What is the peak wavelength of this radiation? What is the energy (in eV) of quanta at the peak wavelength? In what region of the electromagnetic spectrum is this peak wavelength?
20. (a) Assuming the human body (skin temperature 34°C) to behave like an ideal thermal radiator, find the wavelength where the intensity from the body is a maximum. In what region of the electromagnetic spectrum is radiation with this wavelength? (b) Making whatever (reasonable) assumptions you may need, estimate the power radiated by a typical person isolated from the surroundings. (c) Estimate the radiation power absorbed by a person in a room in which the temperature is 20°C.
21. A cavity is maintained at a temperature of 1650 K. At what rate does energy escape from the interior of the cavity through a hole in its wall of diameter 1.00 mm?
22. An analyzer for thermal radiation is set to accept wavelengths in an interval of 1.55 nm.
What is the intensity of the radiation in that interval at a wavelength of 875 nm emitted from a glowing object whose temperature is 1675 K?
23. (a) Assuming the Sun to radiate like an ideal thermal source at a temperature of 6000 K, what is the intensity of the solar radiation emitted in the range 550.0 nm to 552.0 nm? (b) What fraction of the total solar radiation does this represent?

3.4 The Compton Effect

24. Show how Eq. 3.48 follows from Eq. 3.47.
25. Incident photons of energy 10.39 keV are Compton scattered, and the scattered beam is observed at 45.00° relative to the incident beam. (a) What is the energy of the scattered photons at that angle? (b) What is the kinetic energy of the scattered electrons?
26. X-ray photons of wavelength 0.02480 nm are incident on a target and the Compton-scattered photons are observed at 90.0°. (a) What is the wavelength of the scattered photons? (b) What is the momentum of the incident photons? Of the scattered photons? (c) What is the kinetic energy of the scattered electrons? (d) What is the momentum (magnitude and direction) of the scattered electrons?
27. High-energy gamma rays can reach a radiation detector by Compton scattering from the surroundings, as shown in Figure 3.26. This effect is known as back-scattering. Show that, when $E \gg m_e c^2$, the back-scattered photon has an energy of approximately 0.25 MeV, independent of the energy of the original photon, when the scattering angle is nearly 180°.
FIGURE 3.26 Problem 27.
28. Gamma rays of energy 0.662 MeV are Compton scattered. (a) What is the energy of the scattered photon observed at a scattering angle of 60.0°? (b) What is the kinetic energy of the scattered electrons?

3.5 Other Photon Processes

29. Suppose an atom of iron at rest emits an X-ray photon of energy 6.4 keV. Calculate the “recoil” momentum and kinetic energy of the atom. (Hint: Do you expect to need classical or relativistic kinetic energy for the atom?
Is the kinetic energy likely to be much smaller than the atom’s rest energy?)
30. What is the minimum X-ray wavelength produced in bremsstrahlung by electrons that have been accelerated through $2.50 \times 10^{4}$ V?
31. An atom absorbs a photon of wavelength 375 nm and immediately emits another photon of wavelength 580 nm. What is the net energy absorbed by the atom in this process?

General Problems

32. A certain green light bulb emits at a single wavelength of 550 nm. It consumes 55 W of electrical power and is 75% efficient in converting electrical energy into light. (a) How many photons does the bulb emit in one hour? (b) Assuming the emitted photons to be distributed uniformly in space, how many photons per second strike a 10 cm by 10 cm paper held facing the bulb at a distance of 1.0 m?
33. When sodium metal is illuminated with light of wavelength $4.20 \times 10^{2}$ nm, the stopping potential is found to be 0.65 V; when the wavelength is changed to $3.10 \times 10^{2}$ nm, the stopping potential is 1.69 V. Using only these data and the values of the speed of light and the electronic charge, find the work function of sodium and a value of Planck’s constant.
34. A photon of wavelength 192 nm strikes an aluminum surface along a line perpendicular to the surface and releases a photoelectron traveling in the opposite direction. Assume the recoil momentum is taken up by a single aluminum atom on the surface. Calculate the recoil kinetic energy of the atom. Would this recoil energy significantly affect the kinetic energy of the photoelectron?
35. A certain cavity has a temperature of 1150 K. (a) At what wavelength will the intensity of the radiation inside the cavity have its maximum value? (b) As a fraction of the maximum intensity, what is the intensity at twice the wavelength found in part (a)?
36. In Compton scattering, calculate the maximum kinetic energy given to the scattered electron for a given photon energy.
37. The COBE satellite was launched in 1989 to study the cosmic background radiation and measure its temperature. By measuring at many different wavelengths, researchers were able to show that the background radiation exactly followed the spectral distribution expected for a blackbody. At a wavelength of 0.133 cm, the radiant intensity is $1.440 \times 10^{-7}$ W/m² in a wavelength interval of 0.00833 cm. What is the temperature of the radiation that would be deduced from these data?
38. The WMAP satellite launched in 2001 studied the cosmic microwave background radiation and was able to chart small fluctuations in the temperature of different regions of the background radiation. These fluctuations in temperature correspond to regions of large and small density in the early universe. The satellite was able to measure differences in temperature of $2 \times 10^{-5}$ K at a temperature of 2.7250 K. At the peak wavelength, what is the difference in the radiation intensity per unit wavelength interval between the “hot” and “cold” regions of the background radiation?
39. You have been hired as an engineer on a NASA project to design a microwave spectrometer for an orbital mission to measure the cosmic background radiation, which has a blackbody spectrum with an effective temperature of 2.725 K. (a) The spectrometer is to scan the sky between wavelengths of 0.50 mm and 5.0 mm, and at each wavelength it accepts radiation in a wavelength range of $3.0 \times 10^{-4}$ mm. What maximum and minimum radiation intensity do you expect to find in this region? (b) The photon detector in the spectrometer is in the form of a disk of diameter 0.86 cm. How many photons per second will the spectrometer record at its maximum and minimum intensities?
40. A photon of wavelength 7.52 pm scatters from a free electron at rest.
After the interaction, the electron is observed to be moving in the direction of the original photon. Find the momentum of the electron.
41. A hydrogen atom is moving at a speed of 125.0 m/s. It absorbs a photon of wavelength 97 nm that is moving in the opposite direction. By how much does the speed of the atom change as a result of absorbing the photon?
42. Before a positron and an electron annihilate, they form a sort of “atom” in which each orbits about their common center of mass with identical speeds. As a result of this motion, the photons emitted in the annihilation show a small Doppler shift. In one experiment, the Doppler shift in energy of the photons was observed to be 2.41 keV. (a) What would be the speed of the electron or positron before the annihilation to produce this Doppler shift? (b) The positrons form these atom-like structures with the nearly “free” electrons in a solid. Assuming the positron and electron must have about the same speed to form this structure, find the kinetic energy of the electron. This technique, called “Doppler broadening,” is an important method for learning about the energies of electrons in materials.
43. Prove that it is not possible to conserve both momentum and total relativistic energy in the following situation: A free electron moving at velocity $\vec{v}$ emits a photon and then moves at a slower velocity $\vec{v}\,'$.
44. A photon of energy E interacts with an electron at rest and undergoes pair production, producing a positive electron (positron) and an electron (in addition to the original electron):

$$\text{photon} + e^- \to e^+ + e^- + e^-$$

The two electrons and the positron move off with identical momenta in the direction of the initial photon. Find the kinetic energy of the three final particles and find the energy E of the photon. (Hint: Conserve momentum and total relativistic energy.)

Chapter 4
THE WAVELIKE PROPERTIES OF PARTICLES

Just as we produce images from light waves that scatter from objects, we can also form images from “particle waves”.
The electron microscope produces images from electron waves that enable us to visualize objects on a scale that is much smaller than the wavelength of light. The ability to observe individual human cells and even sub-cellular objects such as chromosomes has revolutionized our understanding of biological processes. It is even possible to form images of a single atom, such as this cobalt atom on a gold surface. The ripples on the surface show electrons from gold atoms reacting to the presence of the intruder.

In classical physics, the laws describing the behavior of waves and particles are fundamentally different. Projectiles obey particle-type laws, such as Newtonian mechanics. Waves undergo interference and diffraction, which cannot be explained by the Newtonian mechanics associated with particles. The energy carried by a particle is confined to a small region of space; a wave, on the other hand, distributes its energy throughout space in its wavefronts. In describing the behavior of a particle we often want to specify its location, but this is not so easy to do for a wave. How would you describe the exact location of a sound wave or a water wave? In contrast to this clear distinction found in classical physics, quantum physics requires that particles sometimes obey the rules that we have previously established for waves, and we shall use some of the language associated with waves to describe particles. The system of mechanics associated with quantum systems is sometimes called “wave mechanics” because it deals with the wavelike behavior of particles. In this chapter we discuss the experimental evidence in support of this wavelike behavior for particles such as electrons. As you study this chapter, notice the frequent references to such terms as the probability of the outcome of a measurement, the average of many repetitions of a measurement, and the statistical behavior of a system.
These terms are fundamental to quantum mechanics, and you cannot begin to understand quantum behavior until you feel comfortable with discarding such classical notions as fixed trajectories and certainty of outcome, while substituting the quantum mechanical notions of probability and statistically distributed outcomes.

4.1 DE BROGLIE’S HYPOTHESIS

Louis de Broglie (1892–1987, France). A member of an aristocratic family, his work contributed substantially to the early development of the quantum theory.

Progress in physics often can be characterized by long periods of experimental and theoretical drudgery punctuated occasionally by flashes of insight that cause profound changes in the way we view the universe. Frequently the more profound the insight and the bolder the initial step, the simpler it seems in historical perspective, and the more likely we are to sit back and wonder, “Why didn’t I think of that?” Einstein’s special theory of relativity is one example of such insight; the hypothesis of the Frenchman Louis de Broglie is another.∗

In the previous chapter we discussed the double-slit experiment (which can be understood only if light behaves as a wave) and the photoelectric and Compton effects (which can be understood only if light behaves as a particle). Is this dual particle-wave nature a property only of light or of material objects as well? In a bold and daring hypothesis in his 1924 doctoral dissertation, de Broglie chose the latter alternative. Examining Eq. 3.20, E = hf, and Eq. 3.22, p = h/λ, we find some difficulty in applying the first equation in the case of particles, for we cannot be sure whether E should be the kinetic energy, total energy, or total relativistic energy (all, of course, are identical for light). No such difficulties arise from the second relationship.
De Broglie suggested, lacking any experimental evidence in support of his hypothesis, that associated with any material particle moving with momentum p there is a wave of wavelength λ, related to p according to

$$\lambda = \frac{h}{p} \qquad (4.1)$$

where h is Planck’s constant. The wavelength λ of a particle computed according to Eq. 4.1 is called its de Broglie wavelength.

∗ De Broglie’s name should be pronounced “deh-BROY” or “deh-BROY-eh,” but it is often said as “deh-BROH-lee.”

Example 4.1

Compute the de Broglie wavelength of the following: (a) A 1000-kg automobile traveling at 100 m/s (about 200 mi/h). (b) A 10-g bullet traveling at 500 m/s. (c) A smoke particle of mass $10^{-9}$ g moving at 1 cm/s. (d) An electron with a kinetic energy of 1 eV. (e) An electron with a kinetic energy of 100 MeV.

Solution

(a) Using the classical relation between velocity and momentum,

$$\lambda = \frac{h}{p} = \frac{h}{mv} = \frac{6.6 \times 10^{-34}\ \mathrm{J \cdot s}}{(10^{3}\ \mathrm{kg})(100\ \mathrm{m/s})} = 6.6 \times 10^{-39}\ \mathrm{m}$$

(b) As in part (a),

$$\lambda = \frac{h}{mv} = \frac{6.6 \times 10^{-34}\ \mathrm{J \cdot s}}{(10^{-2}\ \mathrm{kg})(500\ \mathrm{m/s})} = 1.3 \times 10^{-34}\ \mathrm{m}$$

(c)

$$\lambda = \frac{h}{mv} = \frac{6.6 \times 10^{-34}\ \mathrm{J \cdot s}}{(10^{-12}\ \mathrm{kg})(10^{-2}\ \mathrm{m/s})} = 6.6 \times 10^{-20}\ \mathrm{m}$$

(d) The rest energy ($mc^2$) of an electron is $5.1 \times 10^{5}$ eV. Because the kinetic energy (1 eV) is much less than the rest energy, we can use nonrelativistic kinematics.

$$p = \sqrt{2mK} = \sqrt{2(9.1 \times 10^{-31}\ \mathrm{kg})(1\ \mathrm{eV})(1.6 \times 10^{-19}\ \mathrm{J/eV})} = 5.4 \times 10^{-25}\ \mathrm{kg \cdot m/s}$$

Then,

$$\lambda = \frac{h}{p} = \frac{6.6 \times 10^{-34}\ \mathrm{J \cdot s}}{5.4 \times 10^{-25}\ \mathrm{kg \cdot m/s}} = 1.2 \times 10^{-9}\ \mathrm{m} = 1.2\ \mathrm{nm}$$

We can also find this solution in the following way, using $p = \sqrt{2mK}$ and $hc = 1240\ \mathrm{eV \cdot nm}$.

$$cp = c\sqrt{2mK} = \sqrt{2(mc^2)K} = \sqrt{2(5.1 \times 10^{5}\ \mathrm{eV})(1\ \mathrm{eV})} = 1.0 \times 10^{3}\ \mathrm{eV}$$

$$\lambda = \frac{h}{p} = \frac{hc}{pc} = \frac{1240\ \mathrm{eV \cdot nm}}{1.0 \times 10^{3}\ \mathrm{eV}} = 1.2\ \mathrm{nm}$$

This method may seem artificial at first, but with practice it becomes quite useful, especially because energies are usually given in electron-volts in atomic and nuclear physics.

(e) In this case, the kinetic energy is much greater than the rest energy, and so we are in the extreme relativistic realm, where $K \cong E \cong pc$, as in Eq. 2.40.
The wavelength is

$$\lambda = \frac{hc}{pc} = \frac{1240\ \mathrm{MeV \cdot fm}}{100\ \mathrm{MeV}} = 12\ \mathrm{fm}$$

Note that the wavelengths computed in parts (a), (b), and (c) are far too small to be observed in the laboratory. Only in the last two cases, in which the wavelength is of the same order as atomic or nuclear sizes, do we have any chance of observing the wavelength. Because of the smallness of h, only for particles of atomic or nuclear size will the wave behavior be observable.

Two questions immediately follow. First, just what sort of wave is it that has this de Broglie wavelength? That is, what does the amplitude of the de Broglie wave measure? We’ll discuss the answer to this question later in this chapter. For now, we assume that, associated with the particle as it moves, there is a de Broglie wave of wavelength λ, which shows itself when a wave-type experiment (such as diffraction) is performed on it. The outcome of the wave-type experiment depends on this wavelength. The de Broglie wavelength, which characterizes the wave-type behavior of particles, is central to the quantum theory.

The second question then occurs: Why was this wavelength not directly observed before de Broglie’s time? As parts (a), (b), and (c) of Example 4.1 showed, for ordinary objects the de Broglie wavelength is very small. Suppose we tried to demonstrate the wave nature of these objects through a double-slit type of experiment. Recall from Eq. 3.16 that the spacing between adjacent fringes in a double-slit experiment is Δy = λD/d. Putting in reasonable values for the slit separation d and slit-to-screen distance D, you will find that there is no achievable experimental configuration that can produce an observable separation of the fringes (see Problem 9). There is no experiment that can be done to reveal the wave nature of macroscopic (laboratory-sized) objects.
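The arithmetic of Example 4.1 and the fringe-spacing argument can be checked with a few lines of code. The sketch below is a minimal illustration, not part of the text: the function names are hypothetical, the constants are rounded values, the energy-based function uses the general relativistic relation $pc = \sqrt{K^2 + 2K\,mc^2}$ (which reduces to the two limits used in parts (d) and (e)), and the slit separation and screen distance in the last lines are assumed purely for illustration.

```python
import math

H = 6.626e-34              # Planck's constant in J*s (rounded)
HC_EV_NM = 1240.0          # hc in eV*nm, the rounded value used in the text
ELECTRON_REST_EV = 5.11e5  # electron rest energy mc^2 in eV (rounded)

def wavelength_from_mv(mass_kg, speed_m_s):
    """de Broglie wavelength in meters from lambda = h/(m v); valid for speeds far below c."""
    return H / (mass_kg * speed_m_s)

def wavelength_from_kinetic_energy_nm(kinetic_eV, rest_energy_eV):
    """de Broglie wavelength in nm from lambda = hc/(pc), with pc = sqrt(K^2 + 2 K mc^2)."""
    pc_eV = math.sqrt(kinetic_eV ** 2 + 2.0 * kinetic_eV * rest_energy_eV)
    return HC_EV_NM / pc_eV

def fringe_spacing(wavelength_m, screen_distance_m, slit_separation_m):
    """Spacing between adjacent double-slit maxima, lambda * D / d, in meters."""
    return wavelength_m * screen_distance_m / slit_separation_m

# Parts (a)-(c): macroscopic objects
print("automobile:", wavelength_from_mv(1.0e3, 100.0), "m")
print("bullet:    ", wavelength_from_mv(1.0e-2, 500.0), "m")
print("smoke:     ", wavelength_from_mv(1.0e-12, 1.0e-2), "m")

# Parts (d) and (e): electrons
print("1 eV electron:   ", wavelength_from_kinetic_energy_nm(1.0, ELECTRON_REST_EV), "nm")
print("100 MeV electron:", wavelength_from_kinetic_energy_nm(1.0e8, ELECTRON_REST_EV) * 1.0e6, "fm")

# Fringe spacing for the bullet with an assumed D = 1 m and d = 1 micrometer
print("bullet fringes:", fringe_spacing(wavelength_from_mv(1.0e-2, 500.0), 1.0, 1.0e-6), "m")
```

Even with a slit separation as small as a micrometer, the fringe spacing for the bullet comes out many orders of magnitude smaller than an atomic nucleus, which is the point of the argument above.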
Experimental verification of de Broglie’s hypothesis comes only from experiments with objects on the atomic scale, which are discussed in the next section.

4.2 EXPERIMENTAL EVIDENCE FOR DE BROGLIE WAVES

The indications of wave behavior come mostly from interference and diffraction experiments. Double-slit interference, which was reviewed in Section 3.1, is perhaps the most familiar type of interference experiment, but the experimental difficulties of constructing double slits to do interference experiments with beams of atomic or subatomic particles were not solved until long after the time of de Broglie’s hypothesis. We discuss these experiments later in this section. First we’ll discuss diffraction experiments with electrons.

Particle Diffraction Experiments

Diffraction of light waves is discussed in most introductory physics texts and is illustrated in Figure 4.1 for light diffracted by a single slit. For light of wavelength λ incident on a slit of width a, the diffraction minima are located at angles given by

$$a \sin\theta = n\lambda \qquad n = 1, 2, 3, \ldots \qquad (4.2)$$

on either side of the central maximum. Note that most of the light intensity falls in the central maximum.

FIGURE 4.1 Light waves (represented as plane wave fronts) are incident on a narrow slit of width a. Diffraction causes the waves to spread after passing through the slit, and the intensity varies along the screen. The photograph shows the resulting intensity pattern.

The experiments that first verified de Broglie’s hypothesis involve electron diffraction, not through an artificially constructed single slit (as for the diffraction pattern in Figure 4.1) but instead through the atoms of a crystal. The outcomes of these experiments resemble those of the similar X-ray diffraction experiments illustrated in Section 3.1.
In an electron diffraction experiment, a beam of electrons is accelerated from rest through a potential difference V, acquiring a nonrelativistic kinetic energy K = eV and a momentum $p = \sqrt{2mK}$. Wave mechanics would describe the beam of electrons as a wave of wavelength λ = h/p. The beam strikes a crystal, and the scattered beam is photographed (Figure 4.2). The similarity between electron diffraction patterns (Figure 4.2) and X-ray diffraction patterns (Figure 3.7) strongly suggests that the electrons are behaving as waves. The “rings” produced in X-ray diffraction of polycrystalline materials (Figure 3.8b) are also produced in electron diffraction, as shown in Figure 4.3, again providing strong evidence for the similarity in the wave behavior of electrons and X rays. Experiments of the type illustrated in Figure 4.3 were first done in 1927 by G. P. Thomson, who shared the 1937 Nobel Prize for this work. (Thomson’s father, J. J. Thomson, received the 1906 Nobel Prize for his discovery of the electron and measurement of its charge-to-mass ratio. Thus it can be said that Thomson, the father, discovered the particle nature of the electron, while Thomson, the son, discovered its wave nature.)

An electron diffraction experiment gave the first experimental confirmation of the wave nature of electrons (and the quantitative confirmation of the de Broglie relationship λ = h/p) soon after de Broglie’s original hypothesis. In 1926, at the Bell Telephone Laboratories, Clinton Davisson and Lester Germer were investigating the reflection of electron beams from the surface of nickel crystals. A schematic view of their apparatus is shown in Figure 4.4. A beam of electrons from a heated filament is accelerated through a potential difference V. After passing through a small aperture, the beam strikes a single crystal of nickel.
Electrons are scattered in all directions by the atoms of the crystal, some of them striking a detector, which can be moved to any angle φ relative to the incident beam and which measures the intensity of the electron beam scattered at that angle. Figure 4.5 shows the results of one of the experiments of Davisson and Germer. When the accelerating voltage is set at 54 V, there is an intense reflection of the beam at the angle φ = 50°. Let's see how these results give confirmation of the de Broglie wavelength.

FIGURE 4.2 (Top) Electron diffraction apparatus. (Bottom) Electron diffraction pattern. Each bright dot is a region of constructive interference, as in the X-ray diffraction patterns of Figure 3.7. The target is a crystal of Ti2Nb10O29.

FIGURE 4.3 Electron diffraction of polycrystalline beryllium. Note the similarity between this pattern and the pattern for X-ray diffraction of a polycrystalline material (Figure 3.8b).

FIGURE 4.4 Apparatus used by Davisson and Germer to study electron diffraction. Electrons leave the filament F and are accelerated by the voltage V. The beam strikes a crystal and the scattered beam is detected at an angle φ relative to the incident beam. The detector can be moved in the range 0 to 90°.

FIGURE 4.5 Results of Davisson and Germer. Each point on the plot represents the relative intensity when the detector in Figure 4.4 is located at the corresponding angle φ measured from the vertical axis. Constructive interference causes the intensity of the reflected beam to reach a maximum at φ = 50° for V = 54 V.

Each of the atoms of the crystal can act as a scatterer, so the scattered electron waves can interfere, and we have a crystal diffraction grating for the electrons.
Figure 4.6 shows a simplified representation of the nickel crystal used in the Davisson-Germer experiment. Because the electrons were of low energy, they did not penetrate very far into the crystal, and it is sufficient to consider the diffraction to take place in the plane of atoms on the surface. The situation is entirely similar to using a reflection-type diffraction grating for light; the spacing d between the rows of atoms on the crystal is analogous to the spacing between the slits in the optical grating. The maxima for a diffraction grating occur at angles φ such that the path difference between adjacent rays, d sin φ, is equal to a whole number of wavelengths:

$$d \sin\phi = n\lambda \qquad n = 1, 2, 3, \ldots \tag{4.3}$$

where n is the order number of the maximum.

FIGURE 4.6 The crystal surface acts like a diffraction grating with spacing d.

From independent data, it is known that the spacing between the rows of atoms in a nickel crystal is d = 0.215 nm. The peak at φ = 50° must be a first-order peak (n = 1), because no peaks were observed at smaller angles. If this is indeed an interference maximum, the corresponding wavelength is, from Eq. 4.3,

$$\lambda = d \sin\phi = (0.215\ \text{nm})(\sin 50^\circ) = 0.165\ \text{nm}$$

We can compare this value with that expected on the basis of the de Broglie theory. An electron accelerated through a potential difference of 54 V has a kinetic energy of 54 eV and therefore a momentum of

$$p = \sqrt{2mK} = \frac{1}{c}\sqrt{2mc^2 K} = \frac{1}{c}\sqrt{2(511{,}000\ \text{eV})(54\ \text{eV})} = \frac{1}{c}(7430\ \text{eV})$$

The de Broglie wavelength is λ = h/p = hc/pc. Using hc = 1240 eV · nm,

$$\lambda = \frac{hc}{pc} = \frac{1240\ \text{eV}\cdot\text{nm}}{7430\ \text{eV}} = 0.167\ \text{nm}$$

This is in excellent agreement with the value found from the diffraction maximum, and provides strong evidence in favor of the de Broglie theory. For this experimental work, Davisson shared the 1937 Nobel Prize with G. P. Thomson.

The wave nature of particles is not exclusive to electrons; any particle with momentum p has de Broglie wavelength h/p. Neutrons are produced in nuclear reactors with kinetic energies corresponding to wavelengths of roughly 0.1 nm; these also should be suitable for diffraction by crystals. Figure 4.7 shows that diffraction of neutrons by a salt crystal produces the same characteristic patterns as the diffraction of electrons or X rays. Clifford Shull shared the 1994 Nobel Prize for the development of the neutron diffraction technique.

FIGURE 4.7 Diffraction of neutrons by a sodium chloride crystal.

To study the nuclei of atoms, much smaller wavelengths are needed, of the order of $10^{-15}$ m. Figure 4.8 shows the diffraction pattern produced by the scattering of 1-GeV kinetic energy protons by oxygen nuclei. Maxima and minima of the diffracted intensity appear in a pattern similar to the single-slit diffraction shown in Figure 4.1. (The intensity at the minima does not fall to zero because nuclei do not have a sharp boundary. The determination of nuclear sizes from such diffraction patterns is discussed in Chapter 12.)

FIGURE 4.8 Diffraction of 1-GeV protons by oxygen nuclei. The pattern of maxima and minima is similar to that of single-slit diffraction of light waves. (Axes: intensity of scattered protons, on a logarithmic scale, versus scattering angle in degrees.) [Source: H. Palevsky et al., Physical Review Letters 18, 1200 (1967).]

Example 4.2

Protons of kinetic energy 1.00 GeV were diffracted by oxygen nuclei, which have a radius of 3.0 fm, to produce the data shown in Figure 4.8. Calculate the expected angles where the first three diffraction minima should appear.
Solution

The total relativistic energy of the protons is E = K + mc² = 1.00 GeV + 0.94 GeV = 1.94 GeV, so their momentum is

$$p = \frac{1}{c}\sqrt{E^2 - (mc^2)^2} = \frac{1}{c}\sqrt{(1.94\ \text{GeV})^2 - (0.94\ \text{GeV})^2} = 1.70\ \text{GeV}/c$$

The corresponding de Broglie wavelength is

$$\lambda = \frac{h}{p} = \frac{hc}{pc} = \frac{1240\ \text{MeV}\cdot\text{fm}}{1700\ \text{MeV}} = 0.73\ \text{fm}$$

We can represent the oxygen nuclei as circular disks, for which the diffraction formula is a bit different from Eq. 4.2: a sin θ = 1.22nλ, where a is the diameter of the diffracting object. Based on this formula, the first diffraction minimum (n = 1) should appear at the angle

$$\sin\theta = \frac{1.22 n\lambda}{a} = \frac{(1.22)(1)(0.73\ \text{fm})}{6.0\ \text{fm}} = 0.148$$

or θ = 8.5°. Because the sine of the diffraction angle is proportional to the index n, the n = 2 minimum should appear at the angle where sin θ = 2 × 0.148 = 0.296 (θ = 17.2°), and the n = 3 minimum where sin θ = 3 × 0.148 = 0.444 (θ = 26.4°). From the data in Figure 4.8, we see the first diffraction minimum at an angle of about 10°, the second at about 18°, and the third at about 27°, all in very good agreement with the expected values. The data don't exactly follow the formula for diffraction by a disk, because nuclei don't behave quite like disks. In particular, they have diffuse rather than sharp edges, which prevents the intensity at the diffraction minima from falling to zero and also alters slightly the locations of the minima.

Double-Slit Experiments with Particles

The definitive evidence for the wave nature of light was deduced from the double-slit experiment performed by Thomas Young in 1801 (discussed in Section 3.1). In principle, it should be possible to do double-slit experiments with particles and thereby directly observe their wavelike behavior. However, the technological difficulties of producing double slits for particles are formidable, and such experiments did not become possible until long after the time of de Broglie. The first double-slit experiment with electrons was done in 1961. A diagram of the apparatus is shown in Figure 4.9.
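As a rough numerical check of the two diffraction comparisons made so far (the Davisson-Germer peak and Example 4.2), the following Python sketch repeats the arithmetic with the values quoted in the text. The function names and the rounded constants (hc = 1240 eV · nm, electron rest energy 511 keV, proton rest energy 0.94 GeV) are choices made here for illustration only.

```python
import math

HC_EV_NM = 1240.0             # hc in eV*nm (rounded value used in the text)
HC_MEV_FM = 1240.0            # hc in MeV*fm (same number, different units)
ELECTRON_MC2_EV = 511_000.0   # electron rest energy, eV
PROTON_MC2_MEV = 940.0        # proton rest energy, MeV (0.94 GeV, as in Example 4.2)


def nonrel_wavelength_nm(kinetic_ev, rest_energy_ev):
    """de Broglie wavelength (nm) using the nonrelativistic pc = sqrt(2 mc^2 K)."""
    pc = math.sqrt(2.0 * rest_energy_ev * kinetic_ev)
    return HC_EV_NM / pc


def rel_wavelength_fm(kinetic_mev, rest_energy_mev):
    """de Broglie wavelength (fm) using the relativistic pc = sqrt(E^2 - (mc^2)^2)."""
    total_energy = kinetic_mev + rest_energy_mev
    pc = math.sqrt(total_energy**2 - rest_energy_mev**2)
    return HC_MEV_FM / pc


def disk_minima_deg(wavelength, diameter, orders=(1, 2, 3)):
    """Angles (degrees) of the minima given by a sin(theta) = 1.22 n lambda."""
    return [math.degrees(math.asin(1.22 * n * wavelength / diameter))
            for n in orders]


# Davisson-Germer: 54-eV electrons, d = 0.215 nm, first-order peak at 50 degrees
lam_de_broglie = nonrel_wavelength_nm(54.0, ELECTRON_MC2_EV)
lam_grating = 0.215 * math.sin(math.radians(50.0))
print(f"electron: h/p = {lam_de_broglie:.3f} nm, d sin(phi) = {lam_grating:.3f} nm")

# Example 4.2: 1.00-GeV protons, oxygen nucleus of radius 3.0 fm (diameter 6.0 fm)
lam_proton = rel_wavelength_fm(1000.0, PROTON_MC2_MEV)
angles = disk_minima_deg(lam_proton, 6.0)
print(f"proton: lambda = {lam_proton:.2f} fm, minima at "
      + ", ".join(f"{a:.1f}" for a in angles) + " degrees")
```

Because the sketch carries unrounded intermediate values, the second and third angles come out about 0.1° larger than the rounded figures in the worked example.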
The electrons from a hot filament were accelerated through 50 kV (corresponding to λ = 5.4 pm) and then passed through a double slit of separation 2.0 μm and width 0.5 μm. A photograph of the resulting intensity pattern is shown in Figure 4.10. The similarity with the double-slit pattern for light (Figure 3.2) is striking.

FIGURE 4.9 Double-slit apparatus for electrons. Electrons from the filament F are accelerated through 50 kV and pass through the double slit. They produce a visible pattern when they strike a fluorescent screen (like a TV screen), and the resulting pattern is photographed. A photograph is shown in Figure 4.10. [See C. Jonsson, American Journal of Physics 42, 4 (1974).]

FIGURE 4.10 Double-slit interference pattern for electrons.

A similar experiment can be done for neutrons. A beam of neutrons from a nuclear reactor can be slowed to a room-temperature "thermal" energy distribution (average K ≈ kT ≈ 0.025 eV), and a specific wavelength can be selected by a scattering process similar to Bragg diffraction (see Eq. 3.18 and Problem 32 at the end of the present chapter). In one experiment, neutrons of kinetic energy 0.00024 eV and de Broglie wavelength 1.85 nm passed through a gap of diameter 148 μm in a material that absorbs virtually all of the neutrons incident on it (Figure 4.11). In the center of the gap was a boron wire (also highly absorptive for neutrons) of diameter 104 μm. The neutrons could pass on either side of the wire through slits of width 22 μm. The intensity of neutrons that pass through this double slit was observed by sliding another slit across the beam and measuring the intensity of neutrons passing through this "scanning slit." Figure 4.12 shows the resulting pattern of intensity maxima and minima, which leaves no doubt that interference is occurring and that the neutrons have a corresponding wave nature. The wavelength can be deduced from the slit separation using Eq. 3.16 to obtain the spacing between adjacent maxima, $\Delta y = y_{n+1} - y_n$. Estimating the spacing $\Delta y$ from Figure 4.12 to be about 75 μm, and taking the center-to-center slit separation as d = 126 μm (the 104-μm wire plus one 22-μm slit width), we obtain

$$\lambda = \frac{d\,\Delta y}{D} = \frac{(126\ \mu\text{m})(75\ \mu\text{m})}{5\ \text{m}} = 1.89\ \text{nm}$$

This result agrees very well with the de Broglie wavelength of 1.85 nm selected for the neutron beam.

FIGURE 4.11 Double-slit apparatus for neutrons. Thermal neutrons from a reactor are incident on a crystal; scattering through a particular angle selects the energy of the neutrons. After passing through the double slit, the neutrons are counted by the scanning slit assembly, which moves laterally. (The figure labels the distance D = 5 m.)

FIGURE 4.12 Intensity pattern observed for double-slit interference with neutrons. The spacing between the maxima is about 75 μm. (Axes: intensity versus scanning slit position.) [Source: R. Gahler and A. Zeilinger, American Journal of Physics 59, 316 (1991).]

It is also possible to do a similar experiment with atoms. In this case, a source of helium atoms formed a beam (of velocity corresponding to a kinetic energy of 0.020 eV) that passed through a double slit of separation 8 μm and width 1 μm. Again a scanning slit was used to measure the intensity of the beam passing through the double slit. Figure 4.13 shows the resulting intensity pattern. Although the results are not as dramatic as those for electrons and neutrons, there is clear evidence of interference maxima and minima, and the separation of the maxima gives a wavelength that is consistent with the de Broglie wavelength (see Problem 8).

FIGURE 4.13 Intensity pattern observed for double-slit interference with helium atoms. (Axes: intensity versus scanning slit position.) [Source: O. Carnal and J. Mlynek, Physical Review Letters 66, 2689 (1991).]

Diffraction can be observed with even larger objects.
Figure 4.14 shows the pattern produced by fullerene molecules (C60) in passing through a diffraction grating with a spacing of d = 100 nm. The diffraction pattern was observed at a distance of 1.2 m from the grating. Estimating the separation of the maxima in Figure 4.14 as 50 μm, we get the angular separation of the maxima to be θ ≈ tan θ = (50 μm)/(1.2 m) = $4.2 \times 10^{-5}$ rad, and thus λ = d sin θ = 4.2 pm. For C60 molecules with a speed of 117 m/s used in this experiment, the expected de Broglie wavelength is 4.7 pm, in good agreement with our estimate from the diffraction pattern.

FIGURE 4.14 Diffraction grating pattern produced by C60 molecules. (Axes: intensity versus detector position.) [Source: O. Nairz, M. Arndt, and A. Zeilinger, American Journal of Physics 71, 319 (2003).]

FIGURE 4.15 The atomic structure of solid benzene as deduced from neutron diffraction. The circles indicate contours of constant density. The black circles show the locations of the six carbon atoms that form the familiar benzene ring. The blue circles show the locations of the hydrogen atoms.

In this chapter we have discussed several interference and diffraction experiments using different particles: electrons, protons, neutrons, atoms, and molecules. These experiments are not restricted to any particular type of particle or to any particular type of observation. They are examples of a general phenomenon, the wave nature of particles, that was unobserved before 1920 because the necessary experiments had not yet been done. Today this wave nature is used as a basic tool by scientists. For example, neutron diffraction gives detailed information on the structure of solid crystals and of complex molecules (Figure 4.15).
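The wavelength comparisons quoted in this section for the neutron double slit and the C60 grating can be reproduced with a short Python sketch. It is only an illustration: the function names are invented here, the constants are standard rounded values, and the C60 mass is assumed to be 60 carbon-12 atoms (720 u).

```python
import math

H = 6.626e-34              # Planck's constant, J*s
U = 1.6605e-27             # atomic mass unit, kg
EV = 1.602e-19             # joules per electron-volt
NEUTRON_MASS = 1.675e-27   # neutron mass, kg


def wavelength_from_energy(mass_kg, kinetic_ev):
    """Nonrelativistic de Broglie wavelength (m): lambda = h / sqrt(2 m K)."""
    return H / math.sqrt(2.0 * mass_kg * kinetic_ev * EV)


def wavelength_from_speed(mass_kg, speed):
    """Nonrelativistic de Broglie wavelength (m): lambda = h / (m v)."""
    return H / (mass_kg * speed)


def wavelength_from_fringes(spacing_d, fringe_spacing, distance):
    """Small-angle estimate: lambda = d * (spacing of maxima) / (distance)."""
    return spacing_d * fringe_spacing / distance


# Neutron double slit: K = 0.00024 eV, d = 126 um, maxima 75 um apart, D = 5 m
neutron_expected = wavelength_from_energy(NEUTRON_MASS, 0.00024)
neutron_measured = wavelength_from_fringes(126e-6, 75e-6, 5.0)

# C60 grating: v = 117 m/s, d = 100 nm, maxima 50 um apart at 1.2 m
c60_mass = 60 * 12.0 * U   # assumption: pure carbon-12, i.e. 720 u
c60_expected = wavelength_from_speed(c60_mass, 117.0)
c60_measured = wavelength_from_fringes(100e-9, 50e-6, 1.2)

print(f"neutron: {neutron_expected * 1e9:.2f} nm expected, "
      f"{neutron_measured * 1e9:.2f} nm from fringes")
print(f"C60:     {c60_expected * 1e12:.1f} pm expected, "
      f"{c60_measured * 1e12:.1f} pm from fringes")
```

In both cases the wavelength read off the fringe spacing depends on an estimate taken from a figure, so agreement at the level of 10% or so is about what one can expect.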
The electron microscope uses electron waves to illuminate and form an image of objects; because the wavelength can be made thousands of times smaller than that of visible light, it is possible to resolve and observe small details that are not observable with visible light (Figure 4.16).

Through Which Slit Does the Particle Pass?

When we do a double-slit experiment with particles such as electrons, it is tempting to try to determine through which slit the particle passes. For example, we could surround each slit with an electromagnetic loop that causes a meter to deflect whenever a charged particle or perhaps a particle with a magnetic moment passes through the loop (Figure 4.17). If we fired the particles through the slits at a slow enough rate, we could track each particle as it passed through one slit or the other and then appeared on the screen.

If we performed this imaginary experiment, the result would no longer be an interference pattern on the screen. Instead, we would observe a pattern similar to that shown in Figure 4.17, with "hits" in front of each slit, but no interference fringes. No matter what sort of device we use to determine through which slit the particle passes, the interference pattern will be destroyed. The classical particle must pass through one slit or the other; only a wave can reveal interference, which depends on parts of the wavefront passing through both slits and then recombining.

When we ask through which slit the particle passed, we are investigating only the particle aspects of its behavior, and we cannot observe its wave nature (the interference pattern). Conversely, when we study the wave nature, we cannot simultaneously observe the particle nature. The electron will behave as a particle or a wave, but we cannot observe both aspects of its behavior simultaneously.
This curious aspect of quantum mechanics was also discussed for photons in Section 3.6, where we discovered that experiments can reveal either the particle nature of the photon or its wave nature, but not both aspects simultaneously.

FIGURE 4.16 Electron microscope image of bacteria on the surface of a human tongue. The magnification here is about a factor of 5000.

FIGURE 4.17 Apparatus to record passage of electrons through slits. Each slit is surrounded by a loop with a meter that signals the passage of an electron through the slit. No interference fringes are seen on the screen.

This is the basis for the principle of complementarity, which asserts that the complete description of a photon or a particle such as an electron cannot be made in terms of only particle properties or only wave properties, but that both aspects of its behavior must be considered. Moreover, the particle and wave natures cannot be observed simultaneously, and the type of behavior that we observe depends on the kind of experiment we are doing: a particle-type experiment shows only particle-like behavior, and a wave-type experiment shows only wavelike behavior.

4.3 UNCERTAINTY RELATIONSHIPS FOR CLASSICAL WAVES

In quantum mechanics, we want to use de Broglie waves to describe particles. In particular, the amplitude of the wave will tell us something about the location of the particle. Clearly a pure sinusoidal wave, as in Figure 4.18a, is not much use in locating a particle; the wave extends from −∞ to +∞, so the particle might be found anywhere in that region. On the other hand, a narrow wave pulse like Figure 4.18b does a pretty good job of locating the particle in a small region of space, but this wave does not have an easily identifiable wavelength.

FIGURE 4.18 (a) A pure sine wave, which extends from −∞ to +∞. (b) A narrow wave pulse.
In the first case, we know the wavelength exactly but have no knowledge of the location of the particle, while in the second case we have a good idea of the location of the particle but a poor knowledge of its wavelength. Because wavelength is associated with momentum by the de Broglie relationship (Eq. 4.1), a poor knowledge of the wavelength is associated with a poor knowledge of the particle's momentum. For a classical particle, we would like to know both its location and its momentum as precisely as possible. For a quantum particle, we are going to have to make some compromises: the better we know its momentum (or wavelength), the less we know about its location. We can improve our knowledge of its location only at the expense of our knowledge of its momentum.

This competition between knowledge of location and knowledge of wavelength is not restricted to de Broglie waves; classical waves show the same effect. All real waves can be represented as wave packets, disturbances that are localized to a finite region of space. We will discuss more about constructing wave packets in Section 4.5. In this section we will examine this competition between specifying the location and the wavelength of classical waves more closely.

Figure 4.19a shows a very small wave packet. The disturbance is well localized to a small region of space of length $\Delta x$. (Imagine listening to a very short burst of sound, of such brief duration that it is hard for you to recognize the pitch or frequency of the wave.) Let's try to measure the wavelength of this wave packet. Placing a measuring stick along the wave, we have some difficulty defining exactly where the wave starts and where it ends. Our measurement of the wavelength is therefore subject to a small uncertainty $\Delta\lambda$. Let's represent this uncertainty as a fraction ε of the wavelength λ, so that $\Delta\lambda \sim \varepsilon\lambda$.
The fraction ε is certainly less than 1, but it is probably greater than 0.01, so we estimate that ε ∼ 0.1 to within an order of magnitude. (In our discussion of uncertainty, we use the ∼ symbol to indicate a rough order-of-magnitude estimate.) That is, the uncertainty in our measurement of the wavelength might be roughly 10% of the wavelength. The size of this wave disturbance is roughly one wavelength, so $\Delta x \approx \lambda$. For this discussion we want to examine the product of the size of the wave packet and the uncertainty in the wavelength, $\Delta x$ times $\Delta\lambda$, with $\Delta x \approx \lambda$ and $\Delta\lambda \sim \varepsilon\lambda$:

$$\Delta x\,\Delta\lambda \sim \varepsilon\lambda^2 \tag{4.4}$$

This expression shows the inverse relationship between the size of the wave packet and the uncertainty in the wavelength: for a given wavelength, the smaller the size of the wave packet, the greater the uncertainty in our knowledge of the wavelength. That is, as $\Delta x$ gets smaller, $\Delta\lambda$ must become larger.

FIGURE 4.19 (a) Measuring the wavelength of a wave represented by a small wave packet of length roughly one wavelength ($\Delta x \approx \lambda$). (b) Measuring the wavelength of a wave represented by a large wave packet consisting of N waves ($\Delta x \approx N\lambda$).

Making a larger wave packet doesn't help us at all. Figure 4.19b shows a larger wave packet with the same wavelength. Suppose this larger wave packet contains N cycles of the wave, so that $\Delta x \approx N\lambda$. Again using our measuring stick, we try to measure the size of N wavelengths, and dividing this distance by N we can then determine the wavelength. We still have the same uncertainty of ελ in locating the start and end of this wave packet, but when we divide by N to find the wavelength, the uncertainty in one wavelength becomes $\Delta\lambda \sim \varepsilon\lambda/N$. For this larger wave packet, the product of $\Delta x$ and $\Delta\lambda$ is $\Delta x\,\Delta\lambda \sim (N\lambda)(\varepsilon\lambda/N) = \varepsilon\lambda^2$, exactly the same as in the case of the smaller wave packet.
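The argument that the product of packet size and wavelength uncertainty does not depend on the number of cycles can be made concrete with a few lines of Python. This is only a sketch of the bookkeeping in the text: the function names, the value ε = 0.1, and the wavelength of 2.0 (arbitrary units) are illustrative choices.

```python
def wavelength_uncertainty(wavelength, n_cycles, epsilon=0.1):
    """Uncertainty in one wavelength when N cycles are measured together.

    The ends of the packet are located to within epsilon * wavelength,
    and that error is shared among the N cycles.
    """
    return epsilon * wavelength / n_cycles


def packet_product(wavelength, n_cycles, epsilon=0.1):
    """Product (packet size) * (wavelength uncertainty) for an N-cycle packet."""
    size = n_cycles * wavelength
    return size * wavelength_uncertainty(wavelength, n_cycles, epsilon)


wavelength = 2.0   # arbitrary units, chosen only for illustration
for n in (1, 5, 50):
    # The product stays at epsilon * wavelength**2 whatever the value of n
    print(n, packet_product(wavelength, n))
```

A longer packet gives a sharper wavelength, but only in exact proportion to its increased length, which is why the product is unchanged.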
Equation 4.4 is a fundamental property of classical waves, independent of the type of wave or the method used to measure its wavelength. This is the first of the uncertainty relationships for classical waves.

Example 4.3

In a measurement of the wavelength of water waves, 10 wave cycles are counted in a distance of 196 cm. Estimate the minimum uncertainty in the wavelength that might be obtained from this experiment.

Solution

With 10 wave crests in a distance of 196 cm, the wavelength is about (196 cm)/10 = 19.6 cm. We can take ε ∼ 0.1 as a good order-of-magnitude estimate of the typical precision that might be obtained. From Eq. 4.4, we can find the uncertainty in wavelength:

$$\Delta\lambda \sim \frac{\varepsilon\lambda^2}{\Delta x} = \frac{(0.1)(19.6\ \text{cm})^2}{196\ \text{cm}} = 0.2\ \text{cm}$$

With an uncertainty of 0.2 cm, the "true" wavelength might range from 19.5 cm to 19.7 cm, so we might express this result as 19.6 ± 0.1 cm.

The Frequency-Time Uncertainty Relationship

We can take a different approach to uncertainty for classical waves by imagining a measurement of the period rather than the wavelength of the wave that comprises our wave packet. Suppose we have a timing device that we use to measure the duration of the wave packet, as in Figure 4.20. Here we are plotting the wave disturbance as a function of time rather than location. The "size" of the wave packet is now its duration in time, which is roughly one period T for this wave packet, so that $\Delta t \approx T$. Whatever measuring device we use, we have some difficulty locating exactly the start and end of one cycle, so we have an uncertainty $\Delta T$ in measuring the period. As before, we'll assume this uncertainty is some small fraction of the period: $\Delta T \sim \varepsilon T$. To examine the competition between the duration of the wave packet and our ability to measure its period, we calculate the product of $\Delta t$ and $\Delta T$:

$$\Delta t\,\Delta T \sim \varepsilon T^2 \tag{4.5}$$

FIGURE 4.20 Measuring the period of a wave represented by a small wave packet of duration roughly one period ($\Delta t \approx T$).

This is the second of our uncertainty relationships for classical waves. It shows that for a wave of a given period, the smaller the duration of the wave packet, the larger is the uncertainty in our measurement of the period. Note the similarity between Eqs. 4.4 and 4.5, one representing relationships in space and the other in time.

It will turn out to be more useful if we write Eq. 4.5 in terms of frequency instead of period. Given that period T and frequency f are related by f = 1/T, how is $\Delta f$ related to $\Delta T$? The correct relationship is certainly not $\Delta f = 1/\Delta T$, which would imply that a very small uncertainty in the period would lead to a very large uncertainty in the frequency. Instead, they should be directly related: the better we know the period, the better we know the frequency. Here is how we obtain the relationship. Beginning with f = 1/T, we take differentials on both sides:

$$df = -\frac{1}{T^2}\,dT$$

Next we convert the infinitesimal differentials to finite intervals, and because we are interested only in the magnitude of the uncertainties we can ignore the minus sign:

$$\Delta f = \frac{1}{T^2}\,\Delta T \tag{4.6}$$

Combining Eqs. 4.5 and 4.6, we obtain

$$\Delta f\,\Delta t \sim \varepsilon \tag{4.7}$$

Equation 4.7 shows that the longer the duration of the wave packet, the more precisely we can measure its frequency.

Example 4.4

An electronics salesman offers to sell you a frequency-measuring device. When hooked up to a sinusoidal signal, it automatically displays the frequency of the signal, and to account for frequency variations, the frequency is remeasured once each second and the display is updated. The salesman claims the device to be accurate to 0.01 Hz. Is this claim valid?

Solution

Based on Eq. 4.7, and again estimating ε to be about 0.1, we know that a measurement of frequency in a time $\Delta t$ = 1 s must have an associated uncertainty of about

$$\Delta f \sim \frac{\varepsilon}{\Delta t} = \frac{0.1}{1\ \text{s}} = 0.1\ \text{Hz}$$

It appears that the salesman may be exaggerating the precision of this device.

4.4 HEISENBERG UNCERTAINTY RELATIONSHIPS

The uncertainty relationships discussed in the previous section apply to all waves, and we should therefore apply them to de Broglie waves. We can use the basic de Broglie relationship p = h/λ to relate the uncertainty in the momentum $\Delta p$ to the uncertainty in wavelength $\Delta\lambda$, using the same procedure that we used to obtain Eq. 4.6. Starting with p = h/λ, we take differentials on both sides and obtain $dp = (-h/\lambda^2)\,d\lambda$. Now we change the differentials into differences, ignoring the minus sign:

$$\Delta p = \frac{h}{\lambda^2}\,\Delta\lambda \tag{4.8}$$

An uncertainty in the momentum of the particle is directly related to the uncertainty in the wavelength associated with the particle's de Broglie wave packet. Combining Eq. 4.8 with Eq. 4.4, we obtain

$$\Delta x\,\Delta p \sim \varepsilon h \tag{4.9}$$

Just like Eq. 4.4, this equation suggests an inverse relationship between $\Delta x$ and $\Delta p$. The smaller the size of the wave packet of the particle, the larger is the uncertainty in its momentum (and thus in its velocity).

Quantum mechanics provides a formal procedure for calculating $\Delta x$ and $\Delta p$ for wave packets corresponding to different physical situations and for different schemes for confining a particle. One outcome of these calculations gives the wave packet with the smallest possible value of the product $\Delta x\,\Delta p$, which turns out to be h/4π, as we will discuss in the next chapter. Thus ε = 1/4π in this case. All other wave packets will have larger values for $\Delta x\,\Delta p$.

Werner Heisenberg (1901–1976, Germany). Best known for the uncertainty principle, he also developed a complete formulation of the quantum theory based on matrices.

The combination h/2π occurs frequently in quantum mechanics and is given the special symbol $\hbar$ ("h-bar"):

$$\hbar = \frac{h}{2\pi} = 1.05 \times 10^{-34}\ \text{J}\cdot\text{s} = 6.58 \times 10^{-16}\ \text{eV}\cdot\text{s}$$

In terms of $\hbar$, we can write the uncertainty relationship as

$$\Delta x\,\Delta p_x \ge \tfrac{1}{2}\hbar \tag{4.10}$$

The x subscript has been added to the momentum to remind us that Eq. 4.10 applies to motion in a given direction and relates the uncertainties in position and momentum in that direction only. Similar and independent relationships can be applied in the other directions as necessary; thus $\Delta y\,\Delta p_y \ge \hbar/2$ or $\Delta z\,\Delta p_z \ge \hbar/2$.

Equation 4.10 is the first of the Heisenberg uncertainty relationships. It sets the limit of the best we can possibly do in an experiment to measure simultaneously the location and the momentum of a particle. Another way of interpreting this equation is to say that the more we try to confine a particle, the less we know about its momentum.

Because the limit of $\hbar/2$ represents the minimum value of the product $\Delta x\,\Delta p_x$, in most cases we will do worse than this limit. It is therefore quite acceptable to take

$$\Delta x\,\Delta p_x \sim \hbar \tag{4.11}$$

as a rough estimate of the relationship between the uncertainties in location and momentum.

As an example, let's consider a beam of electrons incident on a single slit, as in Figure 4.21. We know this experiment as single-slit diffraction, which produces the characteristic diffraction pattern illustrated in Figure 4.1. We'll assume that the particles are initially moving in the y direction and that we know their momentum in that direction as precisely as possible. If the electrons initially have no component of their momentum in the x direction, we know $p_x$ exactly (it is exactly zero), so that $\Delta p_x = 0$; thus we know nothing about the x coordinates of the electrons ($\Delta x = \infty$). This situation represents a very wide beam of electrons, only a small fraction of which pass through the slit.

FIGURE 4.21 Single-slit diffraction of electrons. A wide beam of electrons is incident on a narrow slit. The electrons that pass through the slit acquire a component of momentum in the x direction.
At the instant that some of the electrons pass through the slit, we know quite a bit more about their x location. For electrons that pass through the slit, the uncertainty in their x location is no larger than a, the width of the slit; thus $\Delta x = a$. This improvement in our knowledge of the electron's location comes at the expense of our knowledge of its momentum, however. According to Eq. 4.11, the uncertainty in the x component of its momentum is now $\Delta p_x \sim \hbar/a$. Measurements beyond the slit no longer show the particle moving precisely in the y direction (for which $p_x = 0$); the momentum now has a small x component as well, with values distributed about zero but now with a range of roughly $\pm\hbar/a$. In passing through the slit, a particle acquires on the average an x component of momentum of roughly $\hbar/a$, according to the uncertainty principle.

Let us now find the angle θ that specifies where a particle with this value of $p_x$ lands on the screen. For small angles, sin θ ≈ tan θ and so

$$\sin\theta \approx \tan\theta = \frac{p_x}{p_y} = \frac{\hbar/a}{p_y} = \frac{\lambda}{2\pi a}$$

using $\lambda = h/p_y$ for the de Broglie wavelength of the electrons. The first minimum of the diffraction pattern of a single slit is located at sin θ = λ/a, which is larger than the spread of angles into which most of the particles are diffracted. The calculation shows that the distribution of transverse momentum given by the uncertainty principle is roughly equivalent to the spreading of the beam into the central diffraction peak, and it illustrates again the close connection between wave behavior and uncertainty in particle location.

The diffraction (spreading) of a beam following passage through a slit is just the effect of the uncertainty principle on our attempt to specify the location of the particle. As we make the slit narrower, $\Delta p_x$ increases and the beam spreads even more.
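The comparison between the uncertainty-principle spread and the position of the first diffraction minimum can be sketched numerically. In the Python fragment below, the slit widths are hypothetical values picked for illustration, the wavelength is the 54-eV electron value from Section 4.2, and the function names are not from the text.

```python
import math

H = 6.626e-34                 # Planck's constant, J*s
HBAR = H / (2.0 * math.pi)    # reduced Planck constant, J*s


def uncertainty_spread_angle(wavelength, slit_width):
    """Angle (rad) from tan(theta) = p_x / p_y, with p_x ~ hbar/a and p_y = h/lambda."""
    p_x = HBAR / slit_width
    p_y = H / wavelength
    return math.atan(p_x / p_y)


def first_minimum_angle(wavelength, slit_width):
    """Angle (rad) of the first single-slit minimum, sin(theta) = lambda / a."""
    return math.asin(wavelength / slit_width)


wavelength = 0.167e-9   # m; de Broglie wavelength of 54-eV electrons
for slit_width in (10e-9, 5e-9, 1e-9):   # hypothetical slit widths, m
    spread = uncertainty_spread_angle(wavelength, slit_width)
    minimum = first_minimum_angle(wavelength, slit_width)
    print(f"a = {slit_width * 1e9:4.0f} nm: "
          f"spread ~ {math.degrees(spread):.3f} deg, "
          f"first minimum at {math.degrees(minimum):.3f} deg")
```

For small angles the two results differ by the fixed factor 2π, so the estimated spread always falls inside the central diffraction peak, and both angles grow as the slit is made narrower.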
In trying to obtain more precise knowledge of the location of the particle by making the slit narrower, we have lost knowledge of the direction of its travel. This trade-off between observations of position and momentum is the essence of the Heisenberg uncertainty principle.

We can also apply the second of our classical uncertainty relationships (Eq. 4.7) to de Broglie waves. If we assume the energy-frequency relationship for light, E = hf, can be applied to particles, then we immediately obtain $\Delta E = h\,\Delta f$. Combining this with Eq. 4.7, we obtain

$$\Delta E\,\Delta t \sim \varepsilon h \tag{4.12}$$

Once again, the minimum uncertainty wave packet gives ε = 1/4π, and so

$$\Delta E\,\Delta t \ge \tfrac{1}{2}\hbar \tag{4.13}$$

This is the second of the Heisenberg uncertainty relationships. It tells us that the more precisely we try to determine the time coordinate of a particle, the less precisely we know its energy. For example, if a particle has a very short lifetime between its creation and decay ($\Delta t \to 0$), a measurement of its rest energy (and thus its mass) will be very imprecise ($\Delta E \to \infty$). Conversely, the rest energy of a stable particle (one with an infinite lifetime, so that $\Delta t = \infty$) can in principle be measured with unlimited precision ($\Delta E = 0$).

As in the case of the first Heisenberg relationship, we can take

$$\Delta E\,\Delta t \sim \hbar \tag{4.14}$$

as a reasonable estimate for most wave packets.

The Heisenberg uncertainty relationships are the mathematical representations of the Heisenberg uncertainty principle, which states:

It is not possible to make a simultaneous determination of the position and the momentum of a particle with unlimited precision,

and

It is not possible to make a simultaneous determination of the energy and the time coordinate of a particle with unlimited precision.

These relationships give an estimate of the minimum uncertainty that can result from any experiment; measurement of the position and momentum of a particle will give a spread of values of widths $\Delta x$ and $\Delta p_x$.
We may, for other reasons, do much worse than Eqs. 4.10 and 4.13, but we can do no better. These relationships have a profound impact on our view of nature. It is quite acceptable to say that there is an uncertainty in locating the position of a water wave. It is quite another matter to make the same statement about a de Broglie wave, because there is an implied corresponding uncertainty in the position of the particle. Equations 4.10 and 4.13 say that nature imposes a limit on the accuracy with which we can do experiments. To emphasize this point, the Heisenberg relationships are sometimes called “indeterminacy” rather than “uncertainty” principles, because the idea of uncertainty may suggest an experimental limit that can be reduced by using better equipment or technique. In actuality, these coordinates are indeterminate to the limits provided by Eqs. 4.10 and 4.13; no matter how hard we try, it is simply not possible to measure more precisely.

Example 4.5
An electron moves in the x direction with a speed of $3.6 \times 10^{6}$ m/s. We can measure its speed to a precision of 1%. With what precision can we simultaneously measure its x coordinate?

Solution
The electron’s momentum is
$$p_x = mv_x = (9.11 \times 10^{-31}\ \text{kg})(3.6 \times 10^{6}\ \text{m/s}) = 3.3 \times 10^{-24}\ \text{kg·m/s}$$
The uncertainty $\Delta p_x$ is 1% of this value, or $3.3 \times 10^{-26}$ kg·m/s. The uncertainty in position is then
$$\Delta x \sim \frac{\hbar}{\Delta p_x} = \frac{1.05 \times 10^{-34}\ \text{J·s}}{3.3 \times 10^{-26}\ \text{kg·m/s}} = 3.2\ \text{nm}$$
which is roughly 10 atomic diameters.

Example 4.6
Repeat the calculations of the previous example in the case of a pitched baseball (m = 0.145 kg) moving at a speed of 95 mi/h (42.5 m/s). Again assume that its speed can be measured to a precision of 1%.
Solution
The baseball’s momentum is
$$p_x = mv_x = (0.145\ \text{kg})(42.5\ \text{m/s}) = 6.16\ \text{kg·m/s}$$
The uncertainty in momentum is $6.16 \times 10^{-2}$ kg·m/s, and the corresponding uncertainty in position is
$$\Delta x \sim \frac{\hbar}{\Delta p_x} = \frac{1.05 \times 10^{-34}\ \text{J·s}}{6.16 \times 10^{-2}\ \text{kg·m/s}} = 1.7 \times 10^{-33}\ \text{m}$$
This uncertainty is 19 orders of magnitude smaller than the size of an atomic nucleus. The uncertainty principle cannot be blamed for the batter missing the pitch! Once again we see that, because of the small magnitude of Planck’s constant, quantum effects are not observable for ordinary objects.

A Statistical Interpretation of Uncertainty
A diffraction pattern, such as that shown in Figure 4.21, is the result of the passage of many particles or photons through the slit. So far, we have been discussing the behavior of only one particle. Let’s imagine that we do an experiment in which a large number of particles pass (one at a time) through the slit, and we measure the transverse (x component) momentum of each particle after it passes through the slit. We can do this experiment simply by placing a detector at different locations on the screen where we observe the diffraction pattern. Because the detector actually accepts particles over a finite region on the screen, it measures in a range of deflection angles or equivalently in a range of transverse momentum. The result of the experiment might look something like Figure 4.22. The vertical scale shows the number of particles with momentum in each interval corresponding to different locations of the detector on the screen. The values are symmetrically arranged about zero, which indicates that the mean or average value of $p_x$ is zero. The width of the distribution is characterized by $\Delta p_x$.
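The thought experiment above can be mimicked with a short simulation. This is only a sketch under stated assumptions: the slit width `a`, the number of particles, and in particular the Gaussian shape of the momentum distribution are illustrative choices that do not come from the text (the true single-slit distribution is not Gaussian), and the function names are hypothetical. It shows a set of transverse momenta that averages to zero while having a spread of order $\hbar/a$, computed as the root-mean-square deviation from the mean.

```python
import math
import random

HBAR = 1.05e-34  # J*s, the value of hbar used in the worked examples


def simulate_transverse_momenta(slit_width_m, n_particles, seed=0):
    """Draw hypothetical p_x values (kg*m/s) for particles leaving a slit.

    ASSUMPTION: a Gaussian of width hbar/a stands in for the real
    single-slit distribution; only the order of magnitude is meaningful.
    """
    rng = random.Random(seed)
    width = HBAR / slit_width_m
    return [rng.gauss(0.0, width) for _ in range(n_particles)]


def mean_and_spread(values):
    """Return the average and the root-mean-square deviation from it."""
    n = len(values)
    mean = sum(values) / n
    mean_sq = sum(v * v for v in values) / n
    return mean, math.sqrt(mean_sq - mean * mean)


a = 1.0e-9  # hypothetical slit width of 1 nm
samples = simulate_transverse_momenta(a, 20000)
mean_px, delta_px = mean_and_spread(samples)

print("mean p_x  / (hbar/a):", mean_px / (HBAR / a))
print("delta p_x / (hbar/a):", delta_px / (HBAR / a))
```

No single simulated particle has a predictable momentum, but the mean and the width of the whole set are stable from run to run, which is the sense in which $\Delta p_x$ characterizes the distribution.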
Figure 4.22 resembles a statistical distribution, and in fact the precise definition of $\Delta p_x$ is similar to that of the standard deviation $\sigma_A$ of a quantity A that has a mean or average value $A_{\text{av}}$:
$$\sigma_A = \sqrt{(A^2)_{\text{av}} - (A_{\text{av}})^2}$$
If there are N individual measurements of A, then $A_{\text{av}} = N^{-1}\sum A_i$ and $(A^2)_{\text{av}} = N^{-1}\sum A_i^2$. By analogy, we can make a rigorous definition of the uncertainty in momentum as
$$\Delta p_x = \sqrt{(p_x^2)_{\text{av}} - (p_{x,\text{av}})^2} \tag{4.15}$$
The average value of the transverse momentum for the situation shown in Figure 4.22 is zero, so
$$\Delta p_x = \sqrt{(p_x^2)_{\text{av}}} \tag{4.16}$$
which gives in effect a root-mean-square value of $p_x$. This can be taken to be a rough measure of the magnitude of $p_x$. Thus it is often said that $\Delta p_x$ gives a measure of the magnitude of the momentum of the particle. As you can see from Figure 4.22, this is indeed true.∗

∗ The relationship between the value of $\Delta p_x$ calculated from Eq. 4.16 and the width of the distribution shown in Figure 4.22 depends on the exact shape of the distribution. You should consider the value from Eq. 4.16 as a rough order-of-magnitude estimate of the width of the distribution.

FIGURE 4.22 Results that might be obtained from measuring the number of electrons in a given time interval at different locations on the screen of Figure 4.21. The distribution is centered around $p_x = 0$ and has a width that is characterized by $\Delta p_x$. (Axes: number of particles recorded by detector versus momentum.)

Example 4.7
In nuclear beta decay, electrons are observed to be ejected from the atomic nucleus. Suppose we assume that electrons are somehow trapped within the nucleus, and that occasionally one escapes and is observed in the laboratory. Take the diameter of a typical nucleus to be $1.0 \times 10^{-14}$ m, and use the uncertainty principle to estimate the range of kinetic energies that such an electron must have.
Solution
If the electron were trapped in a region of width $\Delta x \approx 10^{-14}$ m, the corresponding uncertainty in its momentum would be
$$\Delta p_x \sim \frac{\hbar}{\Delta x} = \frac{1}{c}\,\frac{\hbar c}{\Delta x} = \frac{1}{c}\,\frac{197\ \text{MeV·fm}}{10\ \text{fm}} = 19.7\ \text{MeV}/c$$
Note the use of $\hbar c = 197$ MeV·fm in this calculation. This momentum is clearly in the relativistic regime for electrons, so we must use the relativistic formula to find the kinetic energy for a particle of momentum 19.7 MeV/c:
$$K = \sqrt{p^2c^2 + (mc^2)^2} - mc^2 = \sqrt{(19.7\ \text{MeV})^2 + (0.5\ \text{MeV})^2} - 0.5\ \text{MeV} = 19\ \text{MeV}$$
where we have used Eq. 4.16 to relate $\Delta p_x$ to $(p_x^2)_{\text{av}}$. This result gives the spread of kinetic energies corresponding to a spread in momentum of 19.7 MeV/c. Electrons emitted from the nucleus in nuclear beta decay typically have kinetic energies of about 1 MeV, much smaller than the typical spread in energy required by the uncertainty principle for electrons confined inside the nucleus. This suggests that beta-decay electrons of such low energies cannot be confined in a region of the size of the nucleus, and that another explanation must be found for the electrons observed in nuclear beta decay. (As we discuss in Chapter 12, these electrons cannot preexist in the nucleus, which would violate the uncertainty principle, but are “manufactured” by the nucleus at the instant of the decay.)

Example 4.8
(a) A charged pi meson has a rest energy of 140 MeV and a lifetime of 26 ns. Find the energy uncertainty of the pi meson, expressed in MeV and also as a fraction of its rest energy. (b) Repeat for the uncharged pi meson, with a rest energy of 135 MeV and a lifetime of $8.3 \times 10^{-17}$ s. (c) Repeat for the rho meson, with a rest energy of 765 MeV and a lifetime of $4.4 \times 10^{-24}$ s.

Solution
(a) If the pi meson lives for 26 ns, we have only that much time in which to measure its rest energy, and Eq.
4.8 tells us that any energy measurement done in a time $\Delta t$ is uncertain by an amount of at least
$$\Delta E = \frac{\hbar}{\Delta t} = \frac{6.58 \times 10^{-16}\ \text{eV·s}}{26 \times 10^{-9}\ \text{s}} = 2.5 \times 10^{-8}\ \text{eV} = 2.5 \times 10^{-14}\ \text{MeV}$$
$$\frac{\Delta E}{E} = \frac{2.5 \times 10^{-14}\ \text{MeV}}{140\ \text{MeV}} = 1.8 \times 10^{-16}$$
(b) In a similar way,
$$\Delta E = \frac{\hbar}{\Delta t} = \frac{6.58 \times 10^{-16}\ \text{eV·s}}{8.3 \times 10^{-17}\ \text{s}} = 7.9\ \text{eV} = 7.9 \times 10^{-6}\ \text{MeV}$$
$$\frac{\Delta E}{E} = \frac{7.9 \times 10^{-6}\ \text{MeV}}{135\ \text{MeV}} = 5.9 \times 10^{-8}$$
(c) For the rho meson,
$$\Delta E = \frac{\hbar}{\Delta t} = \frac{6.58 \times 10^{-16}\ \text{eV·s}}{4.4 \times 10^{-24}\ \text{s}} = 1.5 \times 10^{8}\ \text{eV} = 150\ \text{MeV}$$
$$\frac{\Delta E}{E} = \frac{150\ \text{MeV}}{765\ \text{MeV}} = 0.20$$
In the first case, the uncertainty principle does not give a large enough effect to be measured: particle masses cannot be measured to a precision of $10^{-16}$ (about $10^{-6}$ is the best precision that we can obtain). In the second example, the uncertainty principle contributes at about the level of $10^{-7}$, which approaches the limit of our measuring instruments and therefore might be observable in the laboratory. In the third example, we see that the uncertainty principle can contribute substantially to the precision of our knowledge of the rest energy of the rho meson; measurements of its rest energy will show a statistical distribution centered about 765 MeV with a spread of 150 MeV, and no matter how precise an instrument we use to measure the rest energy, we can never reduce that spread.

The lifetime of a very short-lived particle such as the rho meson cannot be measured directly. In practice we reverse the procedure of the calculation of this example: we measure the rest energy, which gives a distribution similar to Figure 4.22, and from the “width” $\Delta E$ of the distribution we deduce the lifetime using Eq. 4.8. This procedure is discussed in Chapter 14.

Example 4.9
Estimate the minimum velocity that would be measured for a billiard ball (m ≈ 100 g) confined to a billiard table of dimension 1 m.

Solution
For $\Delta x \approx 1$ m, we have
$$\Delta p_x \sim \frac{\hbar}{\Delta x} = \frac{1.05 \times 10^{-34}\ \text{J·s}}{1\ \text{m}} = 1 \times 10^{-34}\ \text{kg·m/s}$$
so
$$\Delta v_x = \frac{\Delta p_x}{m} = \frac{1 \times 10^{-34}\ \text{kg·m/s}}{0.1\ \text{kg}} = 1 \times 10^{-33}\ \text{m/s}$$
Thus quantum effects might result in motion of the billiard ball with a speed distribution having a spread of about $1 \times 10^{-33}$ m/s. At this speed, the ball would move a distance of 1% of the diameter of an atomic nucleus in a time equal to the age of the universe! Once again, we see that quantum effects are not observable with macroscopic objects.

4.5 WAVE PACKETS
In Section 4.3, we described measurements of the wavelength or frequency of a wave packet, which we consider to be a finite group of oscillations of a wave. That is, the wave amplitude is large over a finite region of space or time and is very small outside that region. Before we begin our discussion, it is necessary to keep in mind that we are discussing traveling waves, which we imagine as moving in one direction with a uniform speed. (We’ll discuss the speed of the wave packet later.) As the wave packet moves, individual locations in space will oscillate with the frequency or wavelength that characterizes the wave packet. When we show a static picture of a wave packet, it doesn’t matter that some points within the packet appear to have positive displacement, some have negative displacement, and some may even have zero displacement. As the wave travels, those locations are in the process of oscillating, and our drawings may “freeze” that oscillation. What is important is the locations in space where the overall wave packet has a large oscillation amplitude and where it has a very small amplitude.∗

∗ By analogy, think of a radio wave traveling from the station to your receiver. At a particular instant of time, some points in space may have instantaneous electromagnetic field values of zero, but that doesn’t affect your reception of the signal. What is important is the overall amplitude of the traveling wave.

In this section we will examine how to build a wave packet by adding waves together. A pure sinusoidal wave is of no use in representing a particle: the wave
extends from −∞ to +∞, so the particle could be found anywhere. We would like the particle to be represented by a wave packet that describes how the particle is localized to a particular region of space, such as an atom or a nucleus.

The key to the process of building a wave packet involves adding together waves of different wavelength. We represent our waves as $A\cos kx$, where k is the wave number ($k = 2\pi/\lambda$) and A is the amplitude. For example, let’s add together two waves:
$$y(x) = A_1\cos k_1x + A_2\cos k_2x = A_1\cos(2\pi x/\lambda_1) + A_2\cos(2\pi x/\lambda_2) \tag{4.17}$$
This sum is illustrated in Figure 4.23a for the case $A_1 = A_2$ and $\lambda_1 = 9$, $\lambda_2 = 11$. This combined wave shows the phenomenon known as beats in the case of sound waves. So far we don’t have a result that looks anything like the wave packet we are after, but you can see that by adding together two different waves we have reduced the amplitude of the wave packet at some locations. This pattern repeats endlessly from −∞ to +∞, so the particle is still not localized.

Let’s try a more detailed sum. Figure 4.23b shows the result of adding 5 waves with wavelengths 9, 9.5, 10, 10.5, 11. Here we have been a bit more successful in restricting the amplitude of the wave packet in some regions. By adding even more waves with a larger range of wavelengths, we can obtain still narrower regions of large amplitude: Figure 4.23c shows the result of adding 9 waves with wavelengths 8, 8.5, 9, . . . , 12, and Figure 4.23d shows the result of adding 13 waves of wavelengths 7, 7.5, 8, . . . , 13. Unfortunately, all of these patterns (including the regions of large amplitude) repeat endlessly from −∞ to +∞, so even though we have obtained increasingly large regions where the wave packet has small amplitude, we haven’t yet created a wave packet that might represent a particle localized to a particular region.
If these wave packets did represent particles, then the particle would not be confined to any finite region.

FIGURE 4.23 (a) Adding two waves of wavelengths 9 and 11 gives beats. (b) Adding 5 waves with wavelengths ranging from 9 to 11. (c) Adding 9 waves with wavelengths ranging from 8 to 12. (d) Adding 13 waves with wavelengths from 7 to 13. All of the patterns repeat from −∞ to +∞. (Each panel is plotted for x from −100 to 100 with amplitude from −1 to 1.)

The regions of large amplitude in Figures 4.23b,c,d do show how adding more waves of a greater range of wavelengths helps to restrict the size of the wave packet. The region of large amplitude in Figure 4.23b ranges from about x = −40 to +40, while in Figure 4.23c it is from about x = −20 to +20 and in Figure 4.23d from about x = −15 to +15. This shows again the inverse relationship between $\Delta x$ and $\Delta\lambda$ expected for wave packets given by Eq. 4.4: as the range of wavelengths increases from 2 to 4 to 6, the size of the “allowed” regions decreases from about 80 to 40 to 30. Once again we find that to restrict the size of the wave packet we must sacrifice the precise knowledge of the wavelength.

Note that for all four of these wave patterns, the disturbance seems to have a wavelength of about 10, equal to the central wavelength of the range of values of the functions we constructed. We can therefore regard these functions as a cosine wave with a wavelength of 10 that is shaped or modulated by the other cosine waves included in the function. For example, for the case of $A_1 = A_2 = A$, Eq. 4.17 can be rewritten after a bit of trigonometric manipulation as
$$y(x) = 2A\cos\left(\frac{\pi x}{\lambda_1} - \frac{\pi x}{\lambda_2}\right)\cos\left(\frac{\pi x}{\lambda_1} + \frac{\pi x}{\lambda_2}\right) \tag{4.18}$$
If $\lambda_1$ and $\lambda_2$ are close together (that is, if $\Delta\lambda = \lambda_2 - \lambda_1 \ll \lambda_1, \lambda_2$), this can be approximated as
$$y(x) = 2A\cos\left(\frac{\Delta\lambda\,\pi x}{\lambda_{\text{av}}^2}\right)\cos\left(\frac{2\pi x}{\lambda_{\text{av}}}\right) \tag{4.19}$$
where $\lambda_{\text{av}} = (\lambda_1 + \lambda_2)/2 \approx \lambda_1$ or $\lambda_2$.
The second cosine term represents a wave with a wavelength of 10, and the first cosine term provides the shaping envelope that produces the beats. Any finite combination of waves with discrete wavelengths will produce patterns that repeat from −∞ to +∞, so this method of adding waves will not work in constructing a finite wave packet. To construct a wave packet with a finite width, we must replace the first cosine term in Eq. 4.19 with a function that is large in the region where we want to confine the particle but that falls to zero as x → ±∞. For example, the simplest function that has this property is 1/x, so we might imagine a wave packet whose mathematical form is
$$y(x) = \frac{2A}{x}\sin\left(\frac{\Delta\lambda\,\pi x}{\lambda_0^2}\right)\cos\left(\frac{2\pi x}{\lambda_0}\right) \tag{4.20}$$
Here $\lambda_0$ represents the central wavelength, replacing $\lambda_{\text{av}}$. (In going from Eq. 4.19 to Eq. 4.20, the cosine modulating term has been changed to a sine; otherwise the function would blow up at x = 0.) This function is plotted in Figure 4.24a. It looks more like the kind of function we are seeking: it has large amplitude only in a small region of space, and the amplitude drops rapidly to zero outside that region. Another function that has this property is the Gaussian modulating function:
$$y(x) = A\,e^{-2(\Delta\lambda\,\pi x/\lambda_0^2)^2}\cos\left(\frac{2\pi x}{\lambda_0}\right) \tag{4.21}$$
which is shown in Figure 4.24b.

FIGURE 4.24 (a) A wave packet in which the modulation envelope decreases in amplitude like 1/x. (b) A wave packet with a Gaussian modulating function. Both curves are drawn for $\lambda_0 = 10$ and $\Delta\lambda = 0.58$, which corresponds approximately to Figure 4.23b. (Each panel is plotted for x from −150 to 150 with amplitude from −1 to 1.)

Both of these functions show the characteristic inverse relationship between an arbitrarily defined size of the wave packet $\Delta x$ and the wavelength range parameter $\Delta\lambda$ that is used in constructing the wave packet.
For example, consider the wave packet shown in Figure 4.24a. Let’s arbitrarily define the width of the wave packet as the distance over which the amplitude of the central region falls by 1/2. That occurs roughly where the argument of the sine has the value ±π/2, which gives $\Delta x\,\Delta\lambda \sim \lambda_0^2$, consistent with our classical uncertainty estimate.

These wave packets can also be constructed by adding together waves of differing amplitude and wavelength, but the wavelengths form a continuous rather than a discrete set. It is a bit easier to illustrate this if we work with wave number $k = 2\pi/\lambda$ rather than wavelength. So far we have been adding waves in the form of $A\cos kx$, so that
$$y(x) = \sum A_i\cos k_ix \tag{4.22}$$
where $k_i = 2\pi/\lambda_i$. The waves plotted in Figure 4.23 represent applications of the general formula of Eq. 4.22 carried out over different numbers of discrete waves. If we have a continuous set of wave numbers, the sum in Eq. 4.22 becomes an integral:
$$y(x) = \int A(k)\cos kx\,dk \tag{4.23}$$
where the integral is carried out over whatever range of wave numbers is permitted (possibly infinite). For example, suppose we have a range of wave numbers from $k_0 - \Delta k/2$ to $k_0 + \Delta k/2$ that is a continuous distribution of wave numbers of width $\Delta k$ centered at $k_0$. If all of the waves have the same amplitude $A_0$, then from Eq. 4.23 the form of the wave packet can be shown to be (see Problem 24 at the end of the chapter)
$$y(x) = \frac{2A_0}{x}\sin\left(\frac{\Delta k}{2}x\right)\cos k_0x \tag{4.24}$$
This is identical with Eq. 4.20 with $k_0 = 2\pi/\lambda_0$ and $\Delta k = 2\pi\,\Delta\lambda/\lambda_0^2$. This relationship between $\Delta k$ and $\Delta\lambda$ follows from a procedure similar to what was used to obtain Eq. 4.6. With $k = 2\pi/\lambda$, taking differentials gives $dk = -(2\pi/\lambda^2)\,d\lambda$. Replacing the differentials with differences and ignoring the minus sign gives the relationship between $\Delta k$ and $\Delta\lambda$.

A better approximation of the shape of the wave packet can be found by letting A(k) vary according to a Gaussian distribution $A(k) = A_0e^{-(k-k_0)^2/2(\Delta k)^2}$.
This gives a range of wave numbers that has its largest contribution at the central wave number $k_0$ and falls off to zero for larger or smaller wave numbers with a characteristic width of $\Delta k$. Applying Eq. 4.23 to this case, with k ranging from −∞ to +∞ gives (see Problem 25)
$$y(x) = A_0\,\Delta k\sqrt{2\pi}\,e^{-(\Delta k\,x)^2/2}\cos k_0x \tag{4.25}$$
which shows how the form of Eq. 4.21 originates.

By specifying the distribution of wavelengths, we can construct a wave packet of any desired shape. A wave packet that restricts the particle to a region in space of width $\Delta x$ will have a distribution of wavelengths characterized by a width $\Delta\lambda$. The smaller we try to make $\Delta x$, the larger will be the spread $\Delta\lambda$ of the wavelength distribution. The mathematics of this process gives a result that is consistent with the uncertainty relationship for classical waves (Eq. 4.4).

4.6 THE MOTION OF A WAVE PACKET
Let’s consider again the “beats” wave packet represented by Eq. 4.17 and illustrated in Figure 4.23a. We now want to turn our “static” waves into traveling waves. It will again be more convenient for this discussion to work with the wave number k instead of the wavelength. To turn a static wave $y(x) = A\cos kx$ into a traveling wave moving in the positive x direction, we replace kx with $kx - \omega t$, so that the traveling wave is written as $y(x, t) = A\cos(kx - \omega t)$. (For motion in the negative x direction, we would replace kx with $kx + \omega t$.) Here ω is the circular frequency of the wave: $\omega = 2\pi f$. The combined traveling wave then would be represented as
$$y(x, t) = A_1\cos(k_1x - \omega_1t) + A_2\cos(k_2x - \omega_2t) \tag{4.26}$$
For any individual wave, the wave speed is related to its frequency and wavelength according to $v = \lambda f$. In terms of the wave number and circular frequency, we can write this as $v = (2\pi/k)(\omega/2\pi)$, so $v = \omega/k$. This quantity is sometimes called the phase speed and represents the speed of one particular phase or component of the wave packet.
In general, each individual component may have a different phase speed. As a result, the shape of the wave packet may change with time. For Figure 4.23a, we chose $A_1 = A_2$ and $\lambda_1 = 9$, $\lambda_2 = 11$. Let’s choose $v_1 = 6$ units/s and $v_2 = 4$ units/s. Figure 4.25 shows the waveform at a time of t = 1 s. In that time, wave 1 will have moved a distance of 6 units in the positive x direction and wave 2 will have moved a distance of 4 units in the positive x direction. However, the combined wave moves a much greater distance in that time: the center of the beat that was formerly at x = 0 has moved to x = 15 units. How is it possible that the combined waveform moves faster than either of its component waves?

FIGURE 4.25 The solid line shows the waveform of Figure 4.23a at t = 1 s, and the dashed line shows the same waveform at t = 0. Note that the peak that was originally at x = 0 has moved to x = 15 at t = 1 s. (Plotted for x from −100 to 100 with amplitude from −1 to 1.)

To produce the peak at x = 0 and t = 0, the two component waves were exactly in phase: their two maxima lined up exactly to produce the combined maximum of the wave. At x = 15 units and t = 1 s, two individual maxima line up once again to produce a combined maximum. They are not the same two maxima that lined up to produce the maximum at t = 0, but it happens that two other maxima are in phase at x = 15 units and t = 1 s to produce the combined maximum. If we were to watch an animation of the wave, we would see the maximum originally at x = 0 move gradually to x = 15 between t = 0 and t = 1 s.

We can understand how this occurs by writing Eq. 4.26 in a form similar to Eq. 4.18 using trigonometric identities. The result is (again assuming $A_1 = A_2 = A$, as we did in Eq. 4.18):
$$y(x, t) = 2A\cos\left(\frac{\Delta k}{2}x - \frac{\Delta\omega}{2}t\right)\cos\left(\frac{k_1 + k_2}{2}x - \frac{\omega_1 + \omega_2}{2}t\right) \tag{4.27}$$
As in Eq. 4.18, the second term in Eq. 4.27 represents the rapid variation of the wave within the envelope given by the first term.
It is the first term that dictates the overall shape of the waveform, so it is this term that determines the speed of travel of the waveform. For a wave that is written as $\cos(kx - \omega t)$, the speed is ω/k. For this wave envelope, the speed is $(\Delta\omega/2)/(\Delta k/2) = \Delta\omega/\Delta k$. This speed is called the group speed of the wave packet. As we have seen, the group speed of the wave packet can be very different from the phase speed of the component waves. For more complicated situations than the two-component “beat” waveform, the group speed can be generalized by turning the differences into differentials:
$$v_{\text{group}} = \frac{d\omega}{dk} \tag{4.28}$$
The group speed depends on the relationship between frequency and wavelength for the component waves. If the phase speed of all component waves is the same and is independent of frequency or wavelength (as, for example, light waves in empty space), then the group speed is identical to the phase speed and the wave packet keeps its original shape as it travels. In general, the propagation of a component wave depends on the properties of the medium, and different component waves will travel with different speeds. Light waves in glass or sound waves in most solids travel with a speed that varies with frequency or wavelength, and so their wave packets change shape as they travel. De Broglie waves in general have different phase speeds, so that their wave packets expand as they travel.

Example 4.10
Certain ocean waves travel with a phase velocity $v_{\text{phase}} = \sqrt{g\lambda/2\pi}$, where g is the acceleration due to gravity. What is the group velocity of a “wave packet” of these waves?

Solution
With $k = 2\pi/\lambda$, we can write the phase velocity as a function of k as
$$v_{\text{phase}} = \sqrt{g/k}$$
But with $v_{\text{phase}} = \omega/k$, we have $\omega/k = \sqrt{g/k}$, so $\omega = \sqrt{gk}$ and Eq. 4.28 gives
$$v_{\text{group}} = \frac{d\omega}{dk} = \frac{d}{dk}\sqrt{gk} = \frac{1}{2}\sqrt{\frac{g}{k}} = \frac{1}{2}\sqrt{\frac{g\lambda}{2\pi}}$$
Note that the group speed of the wave packet increases as the wavelength increases.
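The result of Example 4.10 can be checked numerically. The sketch below assumes g = 9.8 m/s² and a few arbitrary wavelengths, none of which come from the text; it estimates dω/dk for ω = √(gk) with a central finite difference and compares it with the phase speed ω/k. The function names are illustrative.

```python
import math

G = 9.8  # m/s^2, assumed value of the acceleration due to gravity


def omega(k):
    """Dispersion relation from Example 4.10: omega = sqrt(g k)."""
    return math.sqrt(G * k)


def phase_speed(k):
    """Phase speed omega/k of a single component wave."""
    return omega(k) / k


def group_speed(k, rel_step=1e-6):
    """Central-difference estimate of d(omega)/dk (Eq. 4.28)."""
    dk = rel_step * k
    return (omega(k + dk) - omega(k - dk)) / (2.0 * dk)


wavelengths = (1.0, 10.0, 100.0)  # metres, arbitrary illustrative values
ratios = []
group_speeds = []
for wavelength in wavelengths:
    k = 2.0 * math.pi / wavelength
    ratios.append(group_speed(k) / phase_speed(k))
    group_speeds.append(group_speed(k))
    print(wavelength, phase_speed(k), group_speed(k))
```

For each wavelength the ratio of group speed to phase speed comes out close to 1/2, and the group speed grows with wavelength, in agreement with the analytic result.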
The Group Speed of de Broglie Waves
Suppose we have a localized particle, represented by a group of de Broglie waves. For each component wave, the energy of the particle is related to the frequency of the de Broglie wave by $E = hf = \hbar\omega$, and so $dE = \hbar\,d\omega$. Similarly, the momentum of the particle is related to the wavelength of the de Broglie wave by $p = h/\lambda = \hbar k$, so $dp = \hbar\,dk$. The group speed of the de Broglie wave then can be expressed as
$$v_{\text{group}} = \frac{d\omega}{dk} = \frac{dE/\hbar}{dp/\hbar} = \frac{dE}{dp} \tag{4.29}$$
For a classical particle having only kinetic energy $E = K = p^2/2m$, we can find dE/dp as
$$\frac{dE}{dp} = \frac{d}{dp}\left(\frac{p^2}{2m}\right) = \frac{p}{m} = v \tag{4.30}$$
which is the velocity of the particle. Combining Eqs. 4.29 and 4.30 we obtain an important result:
$$v_{\text{group}} = v_{\text{particle}} \tag{4.31}$$
The speed of a particle is equal to the group speed of the corresponding wave packet. The wave packet and the particle move together: wherever the particle goes, its de Broglie wave packet moves along with it like a shadow. If we do a wave-type experiment on the particle, the de Broglie wave packet is always there to reveal the wave behavior of the particle. A particle can never escape its wave nature!

The Spreading of a Moving Wave Packet
Suppose we have a wave packet that represents a confined particle at t = 0. For example, the particle might have passed through a single-slit apparatus. Its initial uncertainty in position is $\Delta x_0$ and its initial uncertainty in momentum is $\Delta p_{x0}$. The wave packet moves in the x direction with velocity $v_x$, but that velocity is not precisely known: the uncertainty in its momentum gives a corresponding uncertainty in velocity, $\Delta v_{x0} = \Delta p_{x0}/m$. Because there is an uncertainty in the velocity of the wave packet, we can’t be sure where it will be located at time t. That is, its location at time t is $x = v_xt$, with velocity $v_x = v_{x0} \pm \Delta v_{x0}$.
Thus there are two contributions to the uncertainty in its location at time t: the initial uncertainty $\Delta x_0$ and an additional amount equal to $\Delta v_{x0}t$ that represents the spreading of the wave packet. We’ll assume that these two contributions add quadratically, like experimental uncertainties, so that the total uncertainty in the location of the particle is
$$\Delta x = \sqrt{(\Delta x_0)^2 + (\Delta v_{x0}t)^2} = \sqrt{(\Delta x_0)^2 + (\Delta p_{x0}t/m)^2} \tag{4.32}$$
Taking $\Delta p_{x0} = \hbar/\Delta x_0$ according to the uncertainty principle, we have
$$\Delta x = \sqrt{(\Delta x_0)^2 + (\hbar t/m\,\Delta x_0)^2} \tag{4.33}$$
If we try to make the wave packet very small at t = 0 ($\Delta x_0$ is small), then the second term under the square root makes the wave packet expand rapidly, because $\Delta x_0$ appears in the denominator of that term. The more successful we are at confining a wave packet, the more quickly it spreads. This reminds us of the single-slit experiment discussed in Section 4.4: the narrower we make the slit, the more the waves diverge after passing through the slit. Figure 4.26 shows how the size of the wave packet expands with time for different initial sizes, and you can see that the smaller initial wave packet grows more rapidly than the larger initial packet.

FIGURE 4.26 The smaller the initial wave packet, the more quickly it grows. (Axes: $\Delta x$ versus t; curves labeled $\Delta x_0 = 0.5$, $\Delta x_0 = 1$, and $\Delta x_0 = 2$.)

4.7 PROBABILITY AND RANDOMNESS
Any single measurement of the position or momentum of a particle can be made with as much precision as our experimental skill permits. How then does the wavelike behavior of a particle become observable? How does the uncertainty in position or momentum affect our experiment? Suppose we prepare an atom by attaching an electron to a nucleus. (For this example we regard the nucleus as being fixed in space.) Some time after preparing our atom, we measure the position of the electron.
We then repeat the procedure, preparing the atom in an identical way, and find that a remeasurement of the position of the electron yields a value different from that found in our first measurement. In fact, each time we repeat the measurement, we may obtain a different outcome. If we repeat the measurement a large number of times, we find ourselves led to a conclusion that runs counter to a basic notion of classical physics—systems that are prepared in identical ways do not show identical subsequent behavior. What hope do we then have of constructing a mathematical theory that has any usefulness at all in predicting the outcome of a measurement, if that outcome is completely random? The solution to this dilemma lies in the consideration of the probability of obtaining any given result from an experiment whose possible results are subject to the laws of statistics. We cannot predict the outcome of a single flip of a coin or roll of the dice, because any single result is as likely as any other single result. We can, however, predict the distribution of a large number of individual measurements. For example, on a single flip of a coin, we cannot predict whether the outcome will be “heads” or “tails”; the two are equally likely. If we make a large number of trials, we expect that approximately 50% will turn up “heads” and 50% will yield “tails”; even though we cannot predict the result of any single toss of the coin, we can predict reasonably well the result of a large number of tosses. Our study of systems governed by the laws of quantum physics leads us to a similar situation. We cannot predict the outcome of any single measurement of the position of the electron in the atom we prepared, but if we do a large number of measurements, we ought to find a statistical distribution of results. 
We cannot develop a mathematical theory that predicts the result of a single measurement, but we do have a mathematical theory that predicts the statistical behavior of a system (or of a large number of identical systems). The quantum theory provides this mathematical procedure, which enables us to calculate the average or probable outcome of measurements and the distribution of individual outcomes about the average. This is not such a disadvantage as it may seem, for in the realm of quantum physics, we seldom do measurements with, for example, a single atom. If we were studying the emission of light by a radiant system or the properties of a solid or the scattering of nuclear particles, we would be dealing with a large number of atoms, and so our concept of statistical averages is very useful. In fact, such concepts are not as far removed from our daily lives as we might think. For example, what is meant when the TV weather forecaster “predicts” a 50% chance of rain tomorrow? Will it rain 50% of the time, or over 50% of the city? The proper interpretation of the forecast is that the existing set of atmospheric conditions will, in a large number of similar cases, result in rain in about half the cases. A surgeon who asserts that a patient has a 50% chance of surviving an operation means exactly the same thing: experience with a large number of similar cases suggests recovery in about half. Quantum mechanics uses similar language. For example, if we say that the electron in a hydrogen atom has a 50% probability of circulating in a clockwise direction, we mean that in observing a large collection of similarly prepared atoms we find 50% to be circulating clockwise. Of course a single measurement shows either clockwise or counterclockwise circulation. (Similarly, it either rains or it doesn’t; the patient either lives or dies.)
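The coin-flip analogy can be illustrated with a few lines of code. This is only a classical sketch: a seeded pseudo-random generator stands in for the coin, the numbers of flips are arbitrary, and nothing here models a quantum measurement. It shows that the outcome of a small number of trials fluctuates, while the fraction of “heads” in a large number of trials settles near 50%.

```python
import random


def heads_fraction(n_flips, rng):
    """Flip a fair (simulated) coin n_flips times; return the fraction of heads."""
    heads = sum(1 for _ in range(n_flips) if rng.random() < 0.5)
    return heads / n_flips


rng = random.Random(1)  # fixed seed so that the run is repeatable
fractions = {}
for n_flips in (10, 1000, 100000):  # arbitrary illustrative sample sizes
    fractions[n_flips] = heads_fraction(n_flips, rng)
    print(n_flips, fractions[n_flips])
```

The individual flips are unpredictable in the same practical sense as in the text, yet the distribution over many flips is predictable to within a statistical scatter that shrinks roughly as one over the square root of the number of trials.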
Of course, one could argue that the flip of a coin or the roll of the dice is not a random process, but that the apparently random nature of the outcome simply reflects our lack of knowledge of the state of the system. For example, if we knew exactly how the dice were thrown (magnitude and direction of initial velocity, initial orientation, rotational speed) and precisely what the laws are that govern their bouncing on the table, we should be able to predict exactly how they would land. (Similarly, if we knew a great deal more about atmospheric physics or physiology, we could predict with certainty whether or not it will rain tomorrow or an individual patient will survive.) When we instead analyze the outcomes in terms of probabilities, we are really admitting our inability to do the analysis exactly. There is a school of thought that asserts that the same situation exists in quantum physics. According to this interpretation, we could predict exactly the behavior of the electron in our atom if only we knew the nature of a set of so-called “hidden variables” that determine its motion. However, experimental evidence disagrees with this theory, and so we must conclude that the random behavior of a system governed by the laws of quantum physics is a fundamental aspect of nature and not a result of our limited knowledge of the properties of the system.

The Probability Amplitude

What does the amplitude of the de Broglie wave represent? In any wave phenomenon, a physical quantity such as displacement or pressure varies with location and time. What is the physical property that varies as the de Broglie wave propagates? A localized particle is represented by a wave packet. If a particle is confined to a region of space of dimension Δx, its wave packet has large amplitude only in a region of space of dimension Δx and has small amplitude elsewhere. That is, the amplitude is large where the particle is likely to be found and small where the particle is less likely to be found.
The probability of finding the particle at any point depends on the amplitude of its de Broglie wave at that point. In analogy with classical physics, in which the intensity of any wave is proportional to the square of its amplitude, we have

probability to observe particles ∝ |de Broglie wave amplitude|²

Compare this with the similar relationship for photons discussed in Section 3.6:

probability to observe photons ∝ |electric field amplitude|²

Just as the electric field amplitude of an electromagnetic wave indicates regions of high and low probability for observing photons, the de Broglie wave performs the same function for particles. Figure 4.27 illustrates this effect, as individual electrons in a double-slit type of experiment eventually produce the characteristic interference fringes. The path of each electron is guided by its de Broglie wave toward the allowed regions of high probability. This statistical effect is not apparent for a small number of electrons, but it becomes quite apparent when a large number of electrons has been detected.

FIGURE 4.27 The buildup of an electron interference pattern as increasing numbers of electrons are detected: (a) 100 electrons; (b) 3000 electrons; (c) 70,000 electrons. (Reprinted with permission from Akira Tonomura, Hitachi, Ltd, T. Matsuda and T. Kawasaki, Advanced Research Laboratory. From American Journal of Physics 57, 117. (Copyright 1989) American Association of Physics Teachers.)

In the next chapter we discuss the mathematical framework for computing the wave amplitudes for a particle in various situations, and we also develop a more rigorous mathematical definition of the probability.
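The buildup shown in Figure 4.27 can be imitated with a small simulation in which each electron lands at a random position drawn from a probability proportional to the squared wave amplitude. This is only a sketch under stated assumptions: the idealized cos² fringe pattern (with no single-slit envelope), the unit fringe spacing, the screen half-width, and the function names are all illustrative choices and are not taken from the experiment.

```python
import math
import random


def intensity(x, fringe_spacing=1.0):
    """Relative detection probability, proportional to |amplitude|^2.

    Assumes an idealized two-slit pattern: the amplitude varies as
    cos(pi * x / fringe_spacing), so maxima fall at integer multiples
    of the fringe spacing and minima fall halfway between them.
    """
    amplitude = math.cos(math.pi * x / fringe_spacing)
    return amplitude ** 2


def detect_electrons(n, half_width=2.0, seed=0):
    """Return n detection positions drawn by rejection sampling."""
    rng = random.Random(seed)
    hits = []
    while len(hits) < n:
        x = rng.uniform(-half_width, half_width)
        # Keep the candidate position with probability equal to the intensity.
        if rng.random() < intensity(x):
            hits.append(x)
    return hits


def histogram(hits, half_width=2.0, bins=16):
    """Count detections in equal-width bins across the screen."""
    counts = [0] * bins
    width = 2 * half_width / bins
    for x in hits:
        index = min(int((x + half_width) / width), bins - 1)
        counts[index] += 1
    return counts


if __name__ == "__main__":
    # The same electron counts as the three panels of Figure 4.27.
    for n in (100, 3000, 70_000):
        print(n, histogram(detect_electrons(n)))
```

With only 100 detections the bin counts fluctuate strongly and the fringes are hard to recognize; with 70,000 detections the bins near the maxima clearly hold more counts than the bins near the minima, even though every individual position was chosen at random.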
Chapter Summary

De Broglie wavelength: $\lambda = h/p$ (Section 4.1)
Single slit diffraction: $a \sin\theta = n\lambda$, $n = 1, 2, 3, \ldots$ (Section 4.2)
Classical position-wavelength uncertainty: $\Delta x\,\Delta\lambda \sim \varepsilon\lambda^2$ (Section 4.3)
Classical frequency-time uncertainty: $\Delta f\,\Delta t \sim \varepsilon$ (Section 4.3)
Heisenberg position-momentum uncertainty: $\Delta x\,\Delta p_x \sim \hbar$ (Section 4.4)
Heisenberg energy-time uncertainty: $\Delta E\,\Delta t \sim \hbar$ (Section 4.4)
Statistical momentum uncertainty: $\Delta p_x = \sqrt{(p_x^2)_{\rm av} - (p_{x,{\rm av}})^2}$ (Section 4.4)
Wave packet (discrete k): $y(x) = \sum A_i \cos k_i x$ (Section 4.5)
Wave packet (continuous k): $y(x) = \int A(k) \cos kx \, dk$ (Section 4.5)
Group speed of wave packet: $v_{\rm group} = d\omega/dk$ (Section 4.6)

Questions

1. When an electron moves with a certain de Broglie wavelength, does any aspect of the electron’s motion vary with that wavelength?
2. Imagine a different world in which the laws of quantum physics still apply, but which has h = 1 J · s. What might be some of the difficulties of life in such a world? (See Mr. Tompkins in Paperback by George Gamow for an imaginary account of such a world.)
3. Suppose we try to measure an unknown frequency f by listening for beats between f and a known (and controllable) frequency f′. (We assume f′ is known to arbitrarily small uncertainty.) The beat frequency is |f′ − f|. If we hear no beats, then we conclude that f = f′. (a) How long must we listen to hear “no” beats? (b) If we hear no beats in one second, how accurately have we determined f? (c) If we hear no beats in 10 s, how accurately? In 100 s? (d) How is this experiment related to Eq. 4.7?
4. What difficulties does the uncertainty principle cause in trying to pick up an electron with a pair of forceps?
5. Does the uncertainty principle apply to nature itself or only to the results of experiments? That is, is it the position and momentum that are really uncertain, or merely our knowledge of them? What is the difference between these two interpretations?
6. The uncertainty principle states in effect that the more we try to confine an object, the faster we are likely to find it moving.
Is this why you can’t seem to keep money in your pocket for long? Make a numerical estimate.
7. Consider a collection of gas molecules trapped in a container. As we move the walls of the container closer together (compressing the gas) the molecules move faster (the temperature increases). Does the gas behave this way because of the uncertainty principle? Justify your answer with some numerical estimates.
8. Many nuclei are unstable and undergo radioactive decay to other nuclei. The lifetimes for these decays are typically of the order of days to years. Do you expect that the uncertainty principle will cause a measurable effect in the precision to which we can measure the masses of atoms of these nuclei?
9. Just as the classical limit of relativity can be achieved by letting c → ∞, the classical limit of quantum behavior is achieved by letting h → 0. Consider the following in the h → 0 limit and explain how they behave classically: the size of the energy quantum of an electromagnetic wave, the de Broglie wavelength of an electron, the Heisenberg uncertainty relationships.
10. Assume the electron beam in a television tube is accelerated through a potential difference of 25 kV and then passes through a deflecting capacitor of interior width 1 cm. Are diffraction effects important in this case? Justify your answer with a calculation.
11. The structure of crystals can be revealed by X-ray diffraction (Figures 3.7 and 3.8), electron diffraction (Figure 4.2), and neutron diffraction (Figure 4.7). In what ways do these experiments reveal similar structure? In what ways are they different?
12. Often it happens in physics that great discoveries are made inadvertently. What would have happened if Davisson and Germer had their accelerating voltage set below 32 V?
13. Suppose we cover one slit in the two-slit electron experiment with a very thin sheet of fluorescent material that emits a photon of light whenever an electron passes through.
We then fire electrons one at a time at the double slit; whether or not we see a flash of light tells us which slit the electron went through. What effect does this have on the interference pattern? Why?
14. In another attempt to determine through which slit the electron passes, we suspend the double slit itself from a very fine spring balance and measure the “recoil” momentum of the slit as a result of the passage of the electron. Electrons that strike the screen near the center must cause recoils in opposite directions depending on which slit they pass through. Sketch such an apparatus and describe its effect on the interference pattern. (Hint: Consider the uncertainty principle Δx Δp_x ∼ ħ as applied to the motion of the slits suspended from the spring. How precisely do we know the position of the slit?)
15. Is it possible for v_phase to be greater than c? Can v_group be greater than c?
16. In a nondispersive medium, v_group = v_phase; this is another way of saying that all waves travel with the same phase velocity, no matter what their wavelengths. Is this true for (a) de Broglie waves? (b) Light waves in glass? (c) Light waves in vacuum? (d) Sound waves in air? What difficulties would be encountered in attempting communication (by speech or by radio signals for example) in a strongly dispersive medium?

Problems

4.1 De Broglie’s Hypothesis

1. Find the de Broglie wavelength of (a) a 5-MeV proton; (b) a 50-GeV electron; (c) an electron moving at v = 1.00 × 10⁶ m/s.
2. The neutrons produced in a reactor are known as thermal neutrons, because their kinetic energies have been reduced (by collisions) until K = (3/2)kT, where T is room temperature (293 K). (a) What is the kinetic energy of such neutrons? (b) What is their de Broglie wavelength? Because this wavelength is of the same order as the lattice spacing of the atoms of a solid, neutron diffraction (like X-ray and electron diffraction) is a useful means of studying solid lattices.
3.
By doing a nuclear diffraction experiment, you measure the de Broglie wavelength of a proton to be 9.16 fm. (a) What is the speed of the proton? (b) Through what potential difference must it be accelerated to achieve that speed?
4. A proton is accelerated from rest through a potential difference of −2.36 × 10⁵ V. What is its de Broglie wavelength?

4.2 Experimental Evidence for de Broglie Waves

5. Find the potential difference through which electrons must be accelerated (as in an electron microscope, for example) if we wish to resolve: (a) a virus of diameter 12 nm; (b) an atom of diameter 0.12 nm; (c) a proton of diameter 1.2 fm.
6. In an electron microscope we wish to study particles of diameter about 0.10 μm (about 1000 times the size of a single atom). (a) What should be the de Broglie wavelength of the electrons? (b) Through what potential difference should the electrons be accelerated to have that de Broglie wavelength?
7. In order to study the atomic nucleus, we would like to observe the diffraction of particles whose de Broglie wavelength is about the same size as the nuclear diameter, about 14 fm for a heavy nucleus such as lead. What kinetic energy should we use if the diffracted particles are (a) electrons? (b) Neutrons? (c) Alpha particles (m = 4 u)?
8. In the double-slit interference pattern for helium atoms (Figure 4.13), the kinetic energy of the beam of atoms was 0.020 eV. (a) What is the de Broglie wavelength of a helium atom with this kinetic energy? (b) Estimate the de Broglie wavelength of the atoms from the fringe spacing in Figure 4.13, and compare your estimate with the value obtained in part (a). The distance from the double slit to the scanning slit is 64 cm.
9. Suppose we wish to do a double-slit experiment with a beam of the smoke particles of Example 4.1c. Assume we can construct a double slit whose separation is about the same size as the particles.
Estimate the separation between the fringes if the double slit and the screen were on opposite coasts of the United States.
10. In the Davisson-Germer experiment using a Ni crystal, a second-order beam is observed at an angle of 55°. For what accelerating voltage does this occur?
11. A certain crystal is cut so that the rows of atoms on its surface are separated by a distance of 0.352 nm. A beam of electrons is accelerated through a potential difference of 175 V and is incident normally on the surface. If all possible diffraction orders could be observed, at what angles (relative to the incident beam) would the diffracted beams be found?

4.3 Uncertainty Relationships for Classical Waves

12. Suppose a traveling wave has a speed v (where v = λf). Instead of measuring waves over a distance Δx, we stay in one place and count the number of wave crests that pass in a time Δt. Show that Eq. 4.7 is equivalent to Eq. 4.4 for this case.
13. Sound waves travel through air at a speed of 330 m/s. A whistle blast at a frequency of about 1.0 kHz lasts for 2.0 s. (a) Over what distance in space does the “wave train” representing the sound extend? (b) What is the wavelength of the sound? (c) Estimate the precision with which an observer could measure the wavelength. (d) Estimate the precision with which an observer could measure the frequency.
14. A stone tossed into a body of water creates a disturbance at the point of impact that lasts for 4.0 s. The wave speed is 25 cm/s. (a) Over what distance on the surface of the water does the group of waves extend? (b) An observer counts 12 wave crests in the group. Estimate the precision with which the wavelength can be determined.
15. A radar transmitter emits a pulse of electromagnetic radiation with wavelength 0.225 m. The pulses have a duration of 1.17 μs. The receiver is set to accept a range of frequencies about the central frequency. To what range of frequencies should the receiver be set?
16.
Estimate the signal processing time that would be necessary if you want to design a device to measure frequencies to a precision of no worse than 10,000 Hz.

4.4 Heisenberg Uncertainty Relationships

17. The speed of an electron is measured to within an uncertainty of 2.0 × 10⁴ m/s. What is the size of the smallest region of space in which the electron can be confined?
18. An electron is confined to a region of space of the size of an atom (0.1 nm). (a) What is the uncertainty in the momentum of the electron? (b) What is the kinetic energy of an electron with a momentum equal to Δp? (c) Does this give a reasonable value for the kinetic energy of an electron in an atom?
19. The Σ* particle has a rest energy of 1385 MeV and a lifetime of 2.0 × 10⁻²³ s. What would be a typical range of outcomes of measurements of the Σ* rest energy?
20. A pi meson (pion) and a proton can briefly join together to form a Δ particle. A measurement of the energy of the πp system (Figure 4.28) shows a peak at 1236 MeV, corresponding to the rest energy of the Δ particle, with an experimental spread of 120 MeV. What is the lifetime of the Δ?

FIGURE 4.28 Problem 20. (Plot of reaction probability versus energy in MeV, from 1000 to 1600 MeV, showing a peak of width 120 MeV.)

21. A nucleus emits a gamma ray of energy 1.0 MeV from a state that has a lifetime of 1.2 ns. What is the uncertainty in the energy of the gamma ray? The best gamma-ray detectors can measure gamma-ray energies to a precision of no better than a few eV. Will this uncertainty be directly measurable?
22. In special conditions (see Section 12.9), it is possible to measure the energy of a gamma-ray photon to 1 part in 10¹⁵. For a photon energy of 50 keV, estimate the maximum lifetime that could be determined by a direct measurement of the spread of the photon energy.
23. Alpha particles are emitted in nuclear decay processes with typical energies of 5 MeV. In analogy with Example 4.7, deduce whether the alpha particle can exist inside the nucleus.

4.5 Wave Packets

24.
Use a distribution of wave numbers of constant amplitude in a range Δk about k0: $A(k) = A_0$ for $k_0 - \Delta k/2 \le k \le k_0 + \Delta k/2$, and $A(k) = 0$ otherwise, and obtain Eq. 4.24 from Eq. 4.23.
25. Use the distribution of wave numbers $A(k) = A_0 e^{-(k-k_0)^2/2(\Delta k)^2}$ for k = −∞ to +∞ to derive Eq. 4.25.
26. Do the trigonometric manipulation necessary to obtain Eq. 4.18.

4.6 The Motion of a Wave Packet

27. Show that the data used in Figure 4.25 are consistent with Eq. 4.27; that is, use λ1 = 9 and λ2 = 11, v1 = 6 and v2 = 4 to show that v_group = 15.
28. (a) Show that the group velocity and phase velocity are related by: $v_{\rm group} = v_{\rm phase} - \lambda \, dv_{\rm phase}/d\lambda$ (b) When white light travels through glass, the phase velocity of each wavelength depends on the wavelength. (This is the origin of dispersion and the breaking up of white light into its component colors: different wavelengths travel at different speeds and have different indices of refraction.) How does v_phase depend on λ? Is dv_phase/dλ positive or negative? Therefore, is v_group > v_phase or < v_phase?
29. Certain surface waves in a fluid travel with phase velocity $\sqrt{b/\lambda}$, where b is a constant. Find the group velocity of a packet of surface waves, in terms of the phase velocity.
30. By a calculation similar to that of Eq. 4.30, show that dE/dp = v remains valid when E represents the relativistic kinetic energy of the particle.

General Problems

31. A free electron bounces elastically back and forth in one dimension between two walls that are L = 0.50 nm apart. (a) Assuming that the electron is represented by a de Broglie standing wave with a node at each wall, show that the permitted de Broglie wavelengths are λn = 2L/n (n = 1, 2, 3, . . .). (b) Find the values of the kinetic energy of the electron for n = 1, 2, and 3.
32. A beam of thermal neutrons (see Problem 2) emerges from a nuclear reactor and is incident on a crystal as shown in Figure 4.29.
The beam is Bragg scattered, as in Figure 3.5, from a crystal whose scattering planes are separated by 0.247 nm. From the continuous energy spectrum of the beam we wish to select neutrons of energy 0.0105 eV. Find the Bragg-scattering angle that results in a scattered beam of this energy. Will other energies also be present in the scattered beam at that angle?

FIGURE 4.29 Problem 32. (Diagram labels: reactor, shielding, graphite, neutron beam, scattering crystal, angle θ.)

33. (a) Find the de Broglie wavelength of a nitrogen molecule in air at room temperature (293 K). (b) The density of air at room temperature and atmospheric pressure is 1.292 kg/m³. Find the average distance between air molecules at this temperature and compare with the de Broglie wavelength. What do you conclude about the importance of quantum effects in air at room temperature? (c) Estimate the temperature at which quantum effects might become important.
34. In designing an experiment, you want a beam of photons and a beam of electrons with the same wavelength of 0.281 nm, equal to the separation of the Na and Cl ions in a crystal of NaCl. Find the energy of the photons and the kinetic energy of the electrons.
35. A nucleus of helium with mass 5 u breaks up from rest into a nucleus of ordinary helium (mass = 4 u) plus a neutron (mass = 1 u). The rest energy liberated in the break-up is 0.89 MeV, which is shared (not equally) by the products. (a) Using energy and momentum conservation, find the kinetic energy of the neutron. (b) The lifetime of the original nucleus is 1.0 × 10⁻²¹ s. What range of values of the neutron kinetic energy might we measure in the laboratory as a result of the uncertainty relationship?
36. In a metal, the conduction electrons are not attached to any one atom, but are relatively free to move throughout the entire metal. Consider a cube of copper measuring 1.0 cm on each edge. (a) What is the uncertainty in any one component of the momentum of an electron confined to the metal?
(b) Estimate the average kinetic energy of an electron in the metal. (Assume Δp = [(Δp_x)² + (Δp_y)² + (Δp_z)²]^(1/2).) (c) Assuming the heat capacity of copper to be 24.5 J/mole · K, would the contribution of this motion to the internal energy of the copper be important at room temperature? What do you conclude from this? (See also Problem 38.)
37. A proton or a neutron can sometimes “violate” conservation of energy by emitting and then reabsorbing a pi meson, which has a mass of 135 MeV/c². This is possible as long as the pi meson is reabsorbed within a short enough time Δt consistent with the uncertainty principle. (a) Consider p → p + π. By what amount ΔE is energy conservation violated? (Ignore any kinetic energies.) (b) For how long a time Δt can the pi meson exist? (c) Assuming the pi meson to travel at very nearly the speed of light, how far from the proton can it go? (This procedure, as we discuss in Chapter 12, gives us an estimate of the range of the nuclear force, because protons and neutrons are held together in the nucleus by exchanging pi mesons.)
38. In a crystal, the atoms are a distance L apart; that is, each atom must be localized to within a distance of at most L. (a) What is the minimum uncertainty in the momentum of the atoms of a solid that are 0.20 nm apart? (b) What is the average kinetic energy of such an atom of mass 65 u? (c) What would a collection of such atoms contribute to the internal energy of a typical solid, such as copper? Is this contribution important at room temperature? (See also Problem 36.)
39. An apparatus is used to prepare an atomic beam by heating a collection of atoms to a temperature T and allowing the beam to emerge through a hole of diameter d in one side of the oven. The beam then travels through a straight path of length L.
Show that the uncertainty principle causes the diameter of the beam at the end of the path to be larger than d by an amount of order $L\hbar/(d\sqrt{3mkT})$, where m is the mass of an atom. Make a numerical estimate for typical values of T = 1500 K, m = 7 u (lithium atoms), d = 3 mm, L = 2 m.

Chapter 5 THE SCHRÖDINGER EQUATION

Quantum mechanics provides a mathematical framework in which the description of a process often includes different and possibly contradictory outcomes. A favorite illustration of that situation is the case of Schrödinger’s cat. The cat is confined in a chamber with a radioactive atom, the decay of which will trigger the release of poison from a vial. Because we don’t know exactly when that decay will occur, until an observation of the condition of the cat is made the quantum-mechanical description of the cat must include both “cat alive” and “cat dead” components.

FIGURE 5.1 (a) A light wave in air is incident on a slab of glass, showing transmitted and reflected waves at the two boundaries (A and B). (b) A surface wave in water incident on a region of smaller depth similarly has transmitted and reflected waves. (c) The de Broglie waves of electrons moving from a region of constant zero potential to a region of constant negative potential V0 also have transmitted and reflected components.

The future behavior of a particle in a classical (nonrelativistic, nonquantum) situation may be predicted with absolute certainty using Newton’s laws.
If a particle interacts with its environment through a known force $\vec{F}$ (which might be associated with a potential energy U), we can do the mathematics necessary to solve Newton’s second law, $\vec{F} = d\vec{p}/dt$ (a second-order, linear differential equation), and find the particle’s location $\vec{r}(t)$ and velocity $\vec{v}(t)$ at all future times t. The mathematics may be difficult, and in fact it may not be possible to solve the equations in closed form (in which case an approximate solution can be obtained with the help of a computer). Aside from any such mathematical difficulties, the physics of the problem consists of writing down the original equation $\vec{F} = d\vec{p}/dt$ and interpreting its solutions $\vec{r}(t)$ and $\vec{v}(t)$. For example, a satellite or planet moving under the influence of a 1/r² gravitational force can be shown, after the equations have been solved, to follow exactly an elliptical path. In the case of nonrelativistic quantum physics, the basic equation to be solved is a second-order differential equation known as the Schrödinger equation. Like Newton’s laws, the Schrödinger equation is written for a particle interacting with its environment, although we describe the interaction in terms of the potential energy rather than the force. Unlike Newton’s laws, the Schrödinger equation does not give the trajectory of the particle; instead, its solution gives the wave function of the particle, which carries information about the particle’s wavelike behavior. In this chapter we introduce the Schrödinger equation, obtain some of its solutions for certain potential energies, and learn how to interpret those solutions.

5.1 BEHAVIOR OF A WAVE AT A BOUNDARY

In studying wave motion, we often must analyze what occurs when a wave moves from one region or medium to a different region or medium in which the properties of the wave may change. For example, when a light wave moves from air into glass, its wavelength and the amplitude of its electric field both decrease.
At every such boundary, a portion of the incident wave intensity is transmitted into the second medium and a portion is reflected back into the first medium. Let’s consider the case of a light wave incident on a glass plate, as in Figure 5.1a. At boundary A, the light wave moves from air (region 1) into glass (region 2), while at B the light wave moves from glass into air (region 3). The wavelength in air in region 3 is the same as the original wavelength of the incident wave in region 1, but the amplitude in region 3 is less than the amplitude in region 1, because some of the intensity is reflected at A and at B. Other types of waves show similar behavior. For example, Figure 5.1b shows a surface water wave that moves into a region of shallower depth. In that region, its wavelength is smaller (but its amplitude is larger) compared with the original incident wave. When the wave enters region 3, in which the depth is the same as in region 1, the wavelength returns to its original value, but the amplitude of the wave is smaller in region 3 than in region 1 because some of the intensity was reflected at the two boundaries. The same type of behavior occurs for de Broglie waves that characterize particles. Consider, for example, the apparatus shown in Figure 5.1c. Electrons are incident from the left and move inside a narrow metal tube that is at ground potential (V = 0). Another narrow tube in region 2 is connected to the negative terminal of a battery, which maintains it at a uniform potential of −V0. Region 3 is connected to region 1 at ground potential. The gaps between the tubes can in principle be made so small that we can regard the changes in potential at A and B as occurring suddenly. In region 1, the electrons have kinetic energy K, momentum p = √(2mK), and de Broglie wavelength λ = h/p. In region 2, the potential energy for the electrons is U = qV = (−e)(−V0) = +eV0.
We assume that the original kinetic energy of the electrons in region 1 is greater than eV0, so that the electrons move into region 2 with a smaller kinetic energy (equal to K − eV0), a smaller momentum, and thus a greater wavelength. When the electrons move from region 2 into region 3, they gain back the lost kinetic energy and move with their original kinetic energy K and thus with their original wavelength. As in the case of the light wave or the water wave, the amplitude of the de Broglie wave in region 3 is smaller than in region 1, meaning that the current of electrons in region 3 is smaller than the incident current, because some of the electrons are reflected at the boundaries at A and B. We can thus identify a total of five waves moving in the three regions: (1) a wave moving to the right in region 1 (the incident wave); (2) a wave moving to the left in region 1 (representing the net combination of waves reflected from boundary A plus waves reflected from boundary B and then transmitted through boundary A back into region 1); (3) a wave moving to the right in region 2 (representing waves transmitted through boundary A plus waves reflected at B and then reflected again at A); (4) a wave moving to the left in region 2 (waves reflected at B); and (5) a wave moving to the right in region 3 (the transmitted waves at boundary B). Because we are assuming that waves are incident from region 1, it is not possible to have a wave moving to the left in region 3.

Penetration of the Reflected Wave

Another property of classical waves that carries over into quantum waves is penetration of a totally reflected wave into a forbidden region. When a light wave is completely reflected from a boundary, an exponentially decreasing wave called the evanescent wave penetrates into the second medium. Because 100% of the light wave intensity is reflected, the evanescent wave carries no energy and so cannot be directly observed in the second medium.
But if we make the second medium very thin (perhaps equal to a few wavelengths of light) the light wave can emerge on the opposite side of the second medium. We’ll discuss this phenomenon in more detail at the end of this chapter. The same effect occurs with de Broglie waves. Suppose we increase the battery voltage in Figure 5.1c so that the potential energy in region 2 (equal to eV0) is greater than the initial kinetic energy in region 1. The electrons do not have enough energy to enter region 2 (they would have negative kinetic energy there) and so all electrons are reflected back into region 1. Like light waves, de Broglie waves can also penetrate into the forbidden region with exponentially decreasing amplitudes. However, because de Broglie waves are associated with the motion of electrons, that means that electrons must also penetrate a short distance into the forbidden region. The electrons cannot be directly observed in that region, because they have negative kinetic energy there. Nor can we do any experiment that would reveal their “real” existence in the forbidden region, such as measuring the speed of their passage through that region or detecting the magnetic field that their motion might produce. One explanation for the penetration of the electrons into the forbidden region relies on the uncertainty principle: because we can’t know exactly the energy of the incident electrons, we can’t say with certainty that they don’t have enough kinetic energy to penetrate into the forbidden region. For short enough time Δt, the energy uncertainty ΔE ∼ ħ/Δt might allow the electron to travel a short distance into the forbidden region, but this extra energy does not “belong to” the electron in any permanent sense. Later in this chapter we’ll discuss a more mathematical approach to this explanation of penetration into the forbidden region.
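The kinetic energies and wavelengths described for the three regions of Figure 5.1c can be checked numerically. The sketch below uses p = √(2mK) and λ = h/p with the kinetic energy reduced by the potential energy in each region. The incident kinetic energy of 10 eV, the values of eV0, the rounded constants, and the function name are assumptions chosen only for illustration; when the potential energy exceeds the incident kinetic energy the function simply reports the region as forbidden and does not attempt to describe the exponentially decreasing wave.

```python
import math

H = 6.626e-34    # Planck's constant in J·s (rounded)
M_E = 9.109e-31  # electron mass in kg (rounded)
EV = 1.602e-19   # joules per electron-volt (rounded)


def wavelength_nm(incident_kinetic_ev, potential_ev):
    """de Broglie wavelength in nm for an electron in one region.

    incident_kinetic_ev is the kinetic energy K in region 1 (where U = 0);
    potential_ev is the potential energy U of the region in question.
    Returns None when K - U is not positive (a classically forbidden region).
    """
    local_kinetic_ev = incident_kinetic_ev - potential_ev
    if local_kinetic_ev <= 0:
        return None
    momentum = math.sqrt(2 * M_E * local_kinetic_ev * EV)  # p = sqrt(2mK)
    return H / momentum * 1e9                              # lambda = h/p


if __name__ == "__main__":
    K = 10.0  # assumed incident kinetic energy in eV
    for label, U in (("region 1", 0.0), ("region 2", 4.0), ("region 3", 0.0)):
        print(label, wavelength_nm(K, U))
    # Raising eV0 above K makes region 2 forbidden.
    print("region 2 with eV0 = 12 eV:", wavelength_nm(K, 12.0))
```

With these assumed values the wavelength is longer in region 2 than in regions 1 and 3, and regions 1 and 3 share the same wavelength, as the text describes.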
Continuity at the Boundaries

When a wave such as a light wave or a water wave crosses a boundary as in Figure 5.1, the mathematical function that describes the wave must have two properties at each boundary:

1. The wave function must be continuous.
2. The slope of the wave function must be continuous, except when the boundary height is infinite.

FIGURE 5.2 (a) A discontinuous wave. (b) A continuous wave with a discontinuous slope. (c) Two sine waves join smoothly. (d) A sine wave and an exponential join smoothly.

FIGURE 5.3 The position and velocity of a ball dropped from a height H above a springlike rubber sheet at y = 0.

Figure 5.2a shows a discontinuous wave function; the wave displacement changes suddenly at a single location. This type of behavior is not allowed. Figure 5.2b shows a continuous wave function (there are no gaps) with a discontinuous slope. This type of behavior is also not allowed, unless the boundary is of infinite height. Figures 5.2c, d show how two sine curves and an exponential and a sine can be joined so that both the function and the slope are continuous. Across any non-infinite boundary, the wave must be smooth: no gaps in the function and no sharp changes in slope. When we solve for the mathematical form of a wave function, there are usually undetermined parameters, such as the amplitude and phase of the wave. In order to make the wave smooth at the boundary, we obtain the values of those coefficients by applying the two boundary conditions to make the function and its slope continuous. For example, at boundary A in Figure 5.1, we first evaluate the total wave function in region 1 at A and set it equal to the wave function in region 2 at A. This guarantees that the total wave function is continuous at A. We then take the derivative of the wave function in region 1, evaluate it at A, and set that equal to the derivative of the wave function in region 2 evaluated at A.
This step makes the slope in region 1 match the slope in region 2 at boundary A. These two steps give us two equations relating the parameters of the waves and allow us to find relationships between the amplitudes and phases of the waves in regions 1 and 2. The process must be repeated at every boundary, such as at B in Figure 5.1 to match the waves in regions 2 and 3.

We can understand the exception to the continuity of the slope for infinite boundaries with an example from classical physics. Imagine a ball dropped from a height $y = H$ above a stretched rubber sheet at $y = 0$. The ball falls freely under gravity until it strikes the sheet, which we assume behaves like an elastic spring. The sheet stretches as the ball is brought to rest, after which the restoring force propels the ball upward. The motion of the ball might be represented by Figure 5.3. Above the sheet ($y > 0$) the motion is represented by parabolas, and while the ball is in contact with the sheet ($y < 0$) the motion is described by sine curves. Note how the curves join smoothly at $y = 0$, and note how both $y(t)$ and its derivative $v(t)$ are continuous.

On the other hand, imagine a ball hitting a steel surface, which we assume to be perfectly rigid. The ball rebounds elastically, and at the instant it is in contact with the surface its velocity reverses direction. The motion of the ball is represented in Figure 5.4. At the points of contact with the surface, there is a sudden change in the velocity, corresponding to an infinite acceleration and thus to an infinite force. The function $y(t)$ is continuous, but its slope is not: the function has no gaps, but it does have sharp “points” where the slope changes suddenly.

The assumption of the perfectly rigid surface is an idealization that we make to help us understand the situation and also to help simplify the mathematics.
In reality the steel surface will flex slightly and ultimately behave somewhat like a much stiffer version of the rubber sheet. In quantum mechanics we will also sometimes use an assumption of a perfectly rigid or impenetrable boundary to help us understand and simplify the analysis of a more complicated physical situation.

FIGURE 5.4 The position and velocity of a ball dropped from a height $H$ above a rigid surface.

In this section we have established several properties of classical waves that also apply to quantum waves:

1. When a wave crosses a boundary between two regions, part of the wave intensity is reflected and part is transmitted.
2. When a wave encounters a boundary to a region from which it is forbidden, the wave will penetrate perhaps by a few wavelengths before reflecting.
3. At a finite boundary, the wave and its slope are continuous. At an infinite boundary, the wave is continuous but its slope is discontinuous.

Example 5.1

In the geometry of Figure 5.1, the wave in region 1 is given by $y_1(x) = C_1 \sin(2\pi x/\lambda_1 - \phi_1)$, where $C_1 = 11.5$, $\lambda_1 = 4.97$ cm, and $\phi_1 = -65.3^\circ$. In region 2, the wavelength is $\lambda_2 = 10.5$ cm. The boundary A is located at $x = 0$, and the boundary B is located at $x = L$, where $L = 20.0$ cm. Find the wave functions in regions 2 and 3.

Solution

The general form of the wave in region 2 can be represented in a form similar to that of the wave in region 1: $y_2(x) = C_2 \sin(2\pi x/\lambda_2 - \phi_2)$. To find the complete wave function in region 2, we must find the amplitude $C_2$ and the phase $\phi_2$ by applying the boundary conditions on the function and its slope at boundary A ($x = 0$).
Setting $y_1(x = 0) = y_2(x = 0)$ gives

$$-C_1 \sin\phi_1 = -C_2 \sin\phi_2$$

The slopes can be found from the derivative of the general form, $dy/dx = (2\pi/\lambda)\,C\cos(2\pi x/\lambda - \phi)$, evaluated at $x = 0$:

$$\frac{2\pi}{\lambda_1}\,C_1 \cos\phi_1 = \frac{2\pi}{\lambda_2}\,C_2 \cos\phi_2$$

Dividing the first equation by the second eliminates $C_2$ and allows us to solve for $\phi_2$:

$$\phi_2 = \tan^{-1}\left(\frac{\lambda_1}{\lambda_2}\tan\phi_1\right) = \tan^{-1}\left(\frac{4.97\ \text{cm}}{10.5\ \text{cm}}\tan(-65.3^\circ)\right) = -45.8^\circ$$

We can solve for $C_2$ using the result from applying the first boundary condition:

$$C_2 = C_1\,\frac{\sin\phi_1}{\sin\phi_2} = 11.5\,\frac{\sin(-65.3^\circ)}{\sin(-45.8^\circ)} = 14.6$$

To find the wave function in region 3, which we assume to have the same form $y_3(x) = C_3 \sin(2\pi x/\lambda_1 - \phi_3)$, we must apply the boundary conditions on $y_2$ and $y_3$ at $x = L$. Applying the two boundary conditions in the same way we did at $x = 0$, we obtain

$$C_2 \sin\left(\frac{2\pi L}{\lambda_2} - \phi_2\right) = C_3 \sin\left(\frac{2\pi L}{\lambda_1} - \phi_3\right)$$

$$\frac{2\pi}{\lambda_2}\,C_2 \cos\left(\frac{2\pi L}{\lambda_2} - \phi_2\right) = \frac{2\pi}{\lambda_1}\,C_3 \cos\left(\frac{2\pi L}{\lambda_1} - \phi_3\right)$$

Proceeding as we did before, we divide these two equations to find $\phi_3 = -14.6^\circ$, and then from either equation obtain $C_3 = 7.36$. Our two solutions are then $y_2(x) = 14.6 \sin(2\pi x/10.5 + 45.8^\circ)$ and $y_3(x) = 7.36 \sin(2\pi x/4.97 + 14.6^\circ)$, with $x$ measured in cm. Figure 5.5 shows the wave in all three regions. Note how the waves join smoothly at the boundaries.

How is it possible that the amplitude of $y_2$ can be greater than the amplitude of $y_1$? Keep in mind that $y_1$ represents the total wave in region 1, which includes the incident wave and the reflected wave. Depending on the phase difference between them, when the incident and reflected waves are added to obtain $y_1$, the amplitude of the resultant can be smaller than the amplitude of either wave.

FIGURE 5.5 Example 5.1.

5.2 CONFINING A PARTICLE

FIGURE 5.6 (a) Apparatus for confining an electron to the center region of length $L$. (b) The potential energy of an electron in this apparatus.
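Before turning to confined particles, the boundary matching of Example 5.1 can be checked numerically. The sketch below is only an illustration: the helper name `match_sine_wave` and its argument names are assumptions introduced here, not notation from the text. It imposes continuity of the wave and of its slope at a boundary and returns the amplitude and phase on the far side.

```python
import math


def match_sine_wave(c_in, lam_in, phi_in_deg, lam_out, x_boundary):
    """Continue y = C sin(2*pi*x/lam - phi) across a boundary.

    Both the wave and its slope are required to be continuous at
    x_boundary. Returns (C_out, phi_out_deg) for the far side.
    """
    theta_in = 2 * math.pi * x_boundary / lam_in - math.radians(phi_in_deg)
    # Continuity of the value: C_out sin(theta_out) = C_in sin(theta_in)
    s = math.sin(theta_in)
    # Continuity of the slope:
    # C_out cos(theta_out) = C_in (lam_out / lam_in) cos(theta_in)
    c = (lam_out / lam_in) * math.cos(theta_in)
    c_out = c_in * math.hypot(s, c)
    theta_out = math.atan2(s, c)
    phi_out = 2 * math.pi * x_boundary / lam_out - theta_out
    # Report the phase in degrees, folded into [-180, 180).
    phi_out_deg = (math.degrees(phi_out) + 180.0) % 360.0 - 180.0
    return c_out, phi_out_deg


# Numbers from Example 5.1 (lengths in cm, phases in degrees).
C1, LAM1, PHI1, LAM2, L = 11.5, 4.97, -65.3, 10.5, 20.0
C2, PHI2 = match_sine_wave(C1, LAM1, PHI1, LAM2, 0.0)   # boundary A
C3, PHI3 = match_sine_wave(C2, LAM2, PHI2, LAM1, L)     # boundary B
print(f"C2 = {C2:.1f}, phi2 = {PHI2:.1f} deg")
print(f"C3 = {C3:.2f}, phi3 = {PHI3:.1f} deg")
```

Under these assumptions the amplitudes come out near 14.6 and 7.36, and the region 3 phase is close to $-14.6^\circ$, consistent with the $+14.6^\circ$ that appears inside the sine in the final expression for $y_3(x)$. Using `atan2` rather than a bare arctangent keeps the returned amplitude positive whatever the quadrant of the incoming phase.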
A free particle (that is, a particle on which no forces act anywhere) is by definition not confined, so it can be located anywhere. It has, as we discussed in Chapter 4, a definite wavelength, momentum, and energy (for which we can choose any value). A confined particle, on the other hand, is represented by a wave packet that makes it likely to be found only in a region of space of size $\Delta x$. We construct such a wave packet by adding together different sine or cosine waves to obtain the desired mathematical shape.

In quantum mechanics, we often want to analyze the behavior of confined particles, for example an electron that is attached to a specific atom or molecule. We’ll consider the properties of atomic electrons beginning in Chapter 6, but for now let’s look at a simpler problem: an electron moving in one dimension and confined by a series of electric fields. Figure 5.6 shows how the apparatus of Figure 5.1c might be modified for this purpose. The center section is grounded (so that $V = 0$) and the two side sections are connected to batteries so that they are at potentials of $-V_0$ relative to the center section. As before, we assume that the gaps between the center section and the side sections can be made as narrow as possible, so we can regard the potential energy as changing instantaneously at the boundaries A and B. This arrangement is often called a potential energy well.

The potential energy of an electron in this situation is then 0 in the center section and $U_0 = qV = (-e)(-V_0) = +eV_0$ in the two side sections, as shown in Figure 5.6. To confine the electron, we want to consider cases in which it moves in the center section with a kinetic energy $K$ that is less than $U_0$. For example, the electron might have a kinetic energy of 5 eV in the center section, and the side sections might have potential energies of 10 eV.
The electron thus does not have enough energy to “climb” the potential energy hill between the center section and the side sections, and (at least from the classical point of view) the electron is confined to the center section.

We’ll discuss the full solution to this problem later in this chapter, but for now let’s simplify even further and consider the case of an infinitely high potential energy barrier at A and B. This is a good approximation to the situation in which the kinetic energy of the electron in the center section is much smaller than the potential energy supplied by the batteries. In this case the penetration into the forbidden region, which we discussed in Section 5.1, cannot occur. The probability to find the electron in either of the side regions is therefore precisely zero everywhere in those regions, and thus the wave amplitude is zero everywhere in those regions, including at the boundaries (locations A and B). For the wave function to be continuous, the wave function in the center section must have values of zero at A and B.

Of all the possible waves that might be used to describe the particle in this center section, the continuity condition restricts us to waves that have zero amplitude at the boundaries. Some of those waves are illustrated in Figure 5.7. Note that the wave function is continuous, but its slope is not (there are sharp points in the function at locations A and B). This is an example of the exception to the second boundary condition: the slope may be discontinuous at an infinite barrier.

In contrast to the free particle for which the wavelength could have any value, only certain values of the wavelength are allowed. The de Broglie relationship then tells us that only certain values of the momentum are allowed, and consequently only certain values of the energy are allowed.
The energy is not a continuous variable, free to take on any arbitrary value; instead, the energy is a discrete variable that is restricted to a certain set of values. This is known as quantization of energy.

You can see directly from Figure 5.7 that the allowed wavelengths are $2L$, $L$, $2L/3$, . . ., where $L$ is the length of the center section. We can write these wavelengths as

$$\lambda_n = \frac{2L}{n} \qquad n = 1, 2, 3, \ldots \tag{5.1}$$

This set of wavelengths is identical to the wavelengths of the classical problem of standing waves on a string stretched between two points. From the de Broglie relationship $\lambda = h/p$ we obtain

$$p_n = n\,\frac{h}{2L} \tag{5.2}$$

The energy of the particle in the center section is only kinetic energy $p^2/2m$, and so

$$E_n = n^2\,\frac{h^2}{8mL^2} \tag{5.3}$$

These are the allowed or quantized values of the energy of the electron.

A wave packet describing the electron in this region must be a combination of waves with the allowed values of the wavelengths. However, it is not necessary to construct a wave packet from a combination of waves to describe this confined particle. Even a single one of these waves represents the confined particle, because the wave function must be zero in the forbidden regions. So the waveforms shown in Figure 5.7 can represent wave packets of this confined electron, each wave packet consisting of only a single wave.

The appearance of energy quantization accompanies every attempt to confine a particle to a finite region of space. Quantization of energy is one of the principal features of the quantum theory, and studying the quantized energy levels of systems (such as by observing the energies of emitted photons) is an important technique of experimental physics that gives us information about the properties of atoms and nuclei.

FIGURE 5.7 Some possible waves that might be used to describe an electron confined by an infinite potential energy barrier to a region of length $L$. (Panels, each drawn between boundaries A and B: $L = \tfrac{1}{2}\lambda$, $L = \lambda$, $L = \tfrac{3}{2}\lambda$, $L = 2\lambda$.)
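The chain from Eq. 5.1 to Eq. 5.3 can be traced numerically. The following is a minimal sketch, assuming an electron in a well of width 1.0 nm (a value chosen here purely for illustration, not taken from the text); the function name `allowed_states` is likewise hypothetical. The physical constants are the exact SI and CODATA 2018 values.

```python
H_PLANCK = 6.62607015e-34       # Planck constant, J*s
M_ELECTRON = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19            # joules per electron-volt


def allowed_states(length_m, mass_kg, n_max):
    """Return (n, wavelength, momentum, energy) for n = 1..n_max, SI units."""
    states = []
    for n in range(1, n_max + 1):
        wavelength = 2.0 * length_m / n          # Eq. 5.1
        momentum = H_PLANCK / wavelength         # de Broglie relation, Eq. 5.2
        energy = momentum**2 / (2.0 * mass_kg)   # purely kinetic, Eq. 5.3
        states.append((n, wavelength, momentum, energy))
    return states


# Illustrative (assumed) well width of 1.0 nm.
for n, lam, p, e in allowed_states(1.0e-9, M_ELECTRON, 4):
    print(f"n={n}: lambda={lam:.3e} m, p={p:.3e} kg*m/s, E={e / EV:.3f} eV")
```

Each energy is $n^2$ times the lowest one, so the spacing between neighboring levels grows with $n$.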
Applying the Uncertainty Principle to a Confined Particle

In Chapter 4 we constructed wave packets and showed how the uncertainty principle related the size of the wave packet to the range of wavelengths that was used in its construction. Let’s now see how the Heisenberg uncertainty relationships apply in the case of a confined particle.

In the arrangement of Figure 5.6 (with infinitely high barriers on each side), the particle is known to be somewhere in the center section of the apparatus, and thus $\Delta x \sim L$ is a reasonable estimate of the uncertainty in its location. To find the uncertainty in its momentum, we use the rigorous definition of uncertainty given in Eq. 4.15: $\Delta p_x = \sqrt{(p_x^2)_{\rm av} - (p_{x,{\rm av}})^2}$. The particle moving in the center section can be considered to be moving to the left or to the right with equal probability (just as the classical standing-wave problem can be analyzed as the superposition of identical waves moving to the left and to the right). Thus $p_{x,{\rm av}} = 0$. If the particle is moving with a momentum given by Eq. 5.2, $p_x^2 = (nh/2L)^2$ and so $\Delta p_x = nh/2L$. Combining the uncertainties in position and momentum, we have

$$\Delta x\,\Delta p_x \sim L\,\frac{nh}{2L} = \frac{nh}{2} \tag{5.4}$$

The product of the uncertainties is certainly greater than $\hbar/2$, and so the result of confining the particle is entirely consistent with the Heisenberg uncertainty relationship. Note that even the smallest possible value of the product of the uncertainties (which is obtained for $n = 1$) is still larger than the minimum value given by the uncertainty principle, by a factor of $2\pi$. Later in this chapter, we will evaluate the uncertainty in position more rigorously using a formula similar to Eq. 4.15, and we will find that the result does not differ very much from the estimate of Eq. 5.4.

5.3 THE SCHRÖDINGER EQUATION

Erwin Schrödinger (1887–1961, Austria).
Although he disagreed with the probabilistic interpretation that was later given to his work, he developed the mathematical theory of wave mechanics that for the first time permitted the wave behavior of physical systems to be calculated.

The differential equation whose solution gives us the wave behavior of particles is called the Schrödinger equation. It was developed in 1926 by Austrian physicist Erwin Schrödinger. The equation cannot be derived from any previous laws or postulates; like Newton’s equations of motion or Maxwell’s equations of electromagnetism, it is a new and independent result whose correctness can be determined only by comparing its predictions with experimental results. For nonrelativistic motion, the Schrödinger equation gives results that correctly account for observations at the atomic and subatomic level.

We can justify the form of the Schrödinger equation by examining the solution expected for the free particle, which should give a wave whose shape at any particular time, specified by the wave function $\psi(x)$, is that of a simple de Broglie wave, such as $\psi(x) = A \sin kx$, where $A$ is the amplitude of the wave and $k = 2\pi/\lambda$. If we are looking for a differential equation, then we need to take some derivatives:

$$\frac{d\psi}{dx} = kA\cos kx, \qquad \frac{d^2\psi}{dx^2} = -k^2 A \sin kx = -k^2\psi(x)$$

Note that the second derivative gives the original function again. With the kinetic energy $K = p^2/2m = (h/\lambda)^2/2m = \hbar^2 k^2/2m$, we can then write

$$\frac{d^2\psi}{dx^2} = -k^2\psi(x) = -\frac{2m}{\hbar^2}\,K\psi(x) = -\frac{2m}{\hbar^2}\,(E - U)\psi(x)$$

where $E = K + U$ is the nonrelativistic total energy of the particle. For a free particle, $U = 0$ so $E = K$; however, we are using the free particle solution to try to extend to the more general case in which there is a potential energy $U(x)$. The equation then becomes

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + U(x)\psi(x) = E\psi(x) \tag{5.5}$$

Equation 5.5 is the time-independent Schrödinger equation for one-dimensional motion. The solution to Eq.
5.5 gives the shape of the wave at time $t = 0$. The mathematical function that describes a one-dimensional traveling wave must involve both $x$ and $t$. This wave is represented by the function $\Psi(x,t)$:

$$\Psi(x,t) = \psi(x)\,e^{-i\omega t} \tag{5.6}$$

The time dependence is given by the complex exponential function $e^{-i\omega t}$ with $\omega = E/\hbar$. (You can find a few useful formulas involving complex numbers in Appendix B.) We’ll discuss the time-dependent part later in this chapter. For now, we’ll concentrate on the time-independent function $\psi(x)$.

We assume that we know the potential energy $U(x)$, and we wish to obtain the wave function $\psi(x)$ and the energy $E$ for that potential energy. This is a general example of a type of problem known as an eigenvalue problem; we find that it is possible to obtain solutions to the equation only for particular values of $E$, which are known as the energy eigenvalues.

The general procedure for solving the Schrödinger equation is as follows:

1. Begin by writing Eq. 5.5 with the appropriate $U(x)$. Note that if the potential energy changes discontinuously [$U(x)$ may be represented by a discontinuous function; $\psi(x)$ may not], we may need to write different equations for different regions of space. Examples of this sort are given in Section 5.4.
2. Using general mathematical techniques suited to the form of the equation, find a mathematical function $\psi(x)$ that is a solution to the differential equation. Because there is no one specific technique for solving differential equations, we will study several examples to learn how to find solutions.
3. In general, several solutions may be found. By applying boundary conditions some of these may be eliminated and some arbitrary constants may be determined. It is generally the application of the boundary conditions that selects out the allowed energies.
4.
If you are seeking solutions for a potential energy that changes discontinuously, you must apply the continuity conditions on $\psi(x)$ (and usually on $d\psi/dx$) at the boundary between different regions.

Because the Schrödinger equation is linear, any constant multiple of a solution is also a solution. The method to determine the amplitude of the wave function is discussed in the next section.

Probabilities and Normalization

FIGURE 5.8 The probability to find the particle in a small region of width $dx$ is equal to the area of the strip under the $|\psi(x)|^2$ curve. The total probability to find the particle between $x_1$ and $x_2$ is the sum of the areas of the strips, equal to the integral between those limits.

The remaining steps in the procedure for applying the Schrödinger recipe depend on the physical interpretation of the solution to the differential equation. Our original goal in solving the Schrödinger equation was to obtain the wave properties of the particle. What does the amplitude of $\psi(x)$ represent, and what is the physical variable that is waving? It is certainly not a displacement, as in the case of a water wave or a wave on a stretched piano wire, nor is it a pressure wave, as in the case of sound. It is a very different kind of wave, whose squared absolute amplitude gives the probability for finding the particle in a given region of space.

If we define $P(x)$ as the probability density (probability per unit length, in one dimension), then according to the Schrödinger recipe

$$P(x)\,dx = |\psi(x)|^2\,dx \tag{5.7}$$

as indicated in Figure 5.8. In Eq. 5.7, $|\psi(x)|^2\,dx$ gives the probability to find the particle in the interval $dx$ at $x$ (that is, between $x$ and $x + dx$).∗ Because the wave function $\psi(x)$ might be a complex function, it is necessary to square its absolute magnitude to make sure that the probability is a positive real number.

The squared magnitude of the general time-dependent wave function (Eq.
5.6) is:

$$|\Psi(x,t)|^2 = |\psi(x)|^2\,|e^{-i\omega t}|^2 = |\psi(x)|^2 \tag{5.8}$$

where the last step can be taken because the magnitude of the time-dependent factor is 1. For this reason, the probability density associated with a solution to the Schrödinger equation (for any allowed value of $E$) is independent of time. These special quantum states are called stationary states.

∗ It is not correct to speak of “the probability to find the particle at the point $x$.” A single point is a mathematical abstraction with no physical dimension. The probability of finding a particle at a point is zero, but there can be a nonzero probability of finding the particle in an interval.

This interpretation of $|\psi(x)|^2$ helps us to understand the continuity condition of $\psi(x)$. We must not allow the probability to change discontinuously, but, like any well-behaved wave, the probability to locate the particle varies smoothly and continuously.

This interpretation of $\psi(x)$ now permits us to complete the Schrödinger recipe and to illustrate how to use the wave function to calculate quantities that we can measure in the laboratory. Steps 1 through 4 were given previously; the recipe continues:

5. For a wave function describing a single particle, the probability summed over all locations must give 100%; that is, the particle must be located somewhere between $x = -\infty$ and $x = +\infty$. The probability to find the particle in a small interval was given in Eq. 5.7. The total probability to find the particle in all such intervals must be exactly 1:

$$\int_{-\infty}^{+\infty} |\psi(x)|^2\,dx = 1 \tag{5.9}$$

The Schrödinger equation is linear, which means that if $\psi(x)$ is a solution then any constant times $\psi(x)$ is also a solution. For the probability to be a meaningful concept, this constant must be chosen so that Eq. 5.9 is satisfied. A wave function with its multiplicative constant chosen in this way is said to be normalized, and Eq. 5.9 is known as the normalization condition.
6.
Because the solution to the Schrödinger equation represents a probability, any solution that becomes infinite must be discarded; it makes no sense to have an infinite probability to find a particle in any interval. In practice, we “discard” a solution by setting its multiplicative constant equal to zero. For example, if the mathematical solution to the differential equation yields $\psi(x) = Ae^{kx} + Be^{-kx}$ for the entire region $x > 0$, then we must require $A = 0$ for the solution to be physically meaningful; otherwise $|\psi(x)|^2$ would become infinite as $x$ goes to infinity. On the other hand, if this solution is to be valid in the entire region $x < 0$, then we must set $B = 0$. However, if the solution is to be valid only in a small portion of the range of $x$ (say, $0 < x < L$) then we cannot set either $A = 0$ or $B = 0$.
7. Suppose the interval between two points $x_1$ and $x_2$ is divided into a series of infinitesimal intervals of width $dx$ (Figure 5.8). To find the total probability for the particle to be located between $x_1$ and $x_2$, which we represent as $P(x_1 : x_2)$, we calculate the sum of all the probabilities $P(x)\,dx$ in each interval $dx$. This sum can be expressed as an integral:

$$P(x_1 : x_2) = \int_{x_1}^{x_2} P(x)\,dx = \int_{x_1}^{x_2} |\psi(x)|^2\,dx \tag{5.10}$$

If the wave function has been properly normalized, Eq. 5.10 will always yield a probability that lies between 0 and 1.
8. Because we can no longer speak with certainty about the position of the particle, we can no longer guarantee the outcome of a single measurement of any physical quantity that depends on its position. Instead, we can find the average outcome of a large number of measurements. For example, suppose we wish to find the average location of a particle by measuring its coordinate $x$.
From a large number of measurements, we find the value $x_1$ a certain number of times $n_1$, $x_2$ a number of times $n_2$, etc., and in the usual way we can find the average value

$$x_{\rm av} = \frac{n_1 x_1 + n_2 x_2 + \cdots}{n_1 + n_2 + \cdots} = \frac{\sum n_i x_i}{\sum n_i} \tag{5.11}$$

The number of times $n_i$ that we measure each $x_i$ is proportional to the probability $P(x_i)\,dx$ to find the particle in the interval $dx$ at $x_i$. Making this substitution and changing the sums to integrals, we have

$$x_{\rm av} = \frac{\int_{-\infty}^{+\infty} P(x)\,x\,dx}{\int_{-\infty}^{+\infty} P(x)\,dx} = \int_{-\infty}^{+\infty} |\psi(x)|^2\,x\,dx \tag{5.12}$$

where the last step can be made if the wave function is normalized, in which case the denominator of Eq. 5.12 is equal to one. By analogy, the average value of any function of $x$ can be found:

$$[f(x)]_{\rm av} = \int_{-\infty}^{+\infty} P(x)\,f(x)\,dx = \int_{-\infty}^{+\infty} |\psi(x)|^2 f(x)\,dx \tag{5.13}$$

Average values calculated according to Eq. 5.12 or 5.13 are known as expectation values.

5.4 APPLICATIONS OF THE SCHRÖDINGER EQUATION

Solutions for Constant Potential Energy

First let’s examine the solutions to the Schrödinger equation for the special case of a constant potential energy, equal to $U_0$. Then Eq. 5.5 becomes

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + U_0\psi(x) = E\psi(x) \tag{5.14}$$

or (assuming for now that $E > U_0$)

$$\frac{d^2\psi}{dx^2} = -k^2\psi(x) \quad\text{with}\quad k = \sqrt{\frac{2m(E - U_0)}{\hbar^2}} \tag{5.15}$$

The parameter $k$ in this equation is equal to the wave number $2\pi/\lambda$. The solution to Eq. 5.15 is a function of $x$ that, when differentiated twice, gives back the original function multiplied by the negative constant $-k^2$. The function that has this property is $\sin kx$ or $\cos kx$. The most general solution to the equation is

$$\psi(x) = A\sin kx + B\cos kx \tag{5.16}$$

The constants $A$ and $B$ must be determined by applying the continuity and normalization requirements. We can demonstrate that Eq. 5.16 satisfies Eq. 5.15 by taking two derivatives:

$$\frac{d\psi}{dx} = kA\cos kx - kB\sin kx$$

$$\frac{d^2\psi}{dx^2} = -k^2 A\sin kx - k^2 B\cos kx = -k^2(A\sin kx + B\cos kx) = -k^2\psi(x)$$

so the original equation is indeed satisfied.
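The same check can be made numerically rather than by hand. The sketch below is an illustration only: the helper names and the values of $A$, $B$, and $k$ are arbitrary choices made here, not values from the text. It estimates the second derivative of Eq. 5.16 with a central difference and confirms that $\psi'' + k^2\psi$ vanishes to within the accuracy of the estimate.

```python
import math


def second_derivative(f, x, h=1e-4):
    """Central-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2


def make_psi(a, b, k):
    """Return psi(x) = a sin(kx) + b cos(kx), the form of Eq. 5.16."""
    return lambda x: a * math.sin(k * x) + b * math.cos(k * x)


def residual(a, b, k, x):
    """psi''(x) + k^2 psi(x), which Eq. 5.15 says should vanish."""
    psi = make_psi(a, b, k)
    return second_derivative(psi, x) + k**2 * psi(x)


# Arbitrary illustrative values (not taken from the text).
for x in (-1.0, 0.3, 2.0):
    print(f"x = {x:+.1f}: residual = {residual(0.7, -1.3, 2.5, x):.2e}")
```

The residuals are not exactly zero because the finite difference has truncation and rounding error, but they are many orders of magnitude smaller than $k^2\psi$ itself.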
To analyze the penetration of a particle into a forbidden region, we must consider the case in which the energy $E$ of the particle is smaller than the potential energy $U_0$. For this case we can rewrite Eq. 5.14 as

$$\frac{d^2\psi}{dx^2} = k'^2\psi(x) \quad\text{with}\quad k' = \sqrt{\frac{2m(U_0 - E)}{\hbar^2}} \tag{5.17}$$

In this case the general solution in the forbidden regions is

$$\psi(x) = Ae^{k'x} + Be^{-k'x} \tag{5.18}$$

Once again, we can demonstrate that Eq. 5.18 is a solution of Eq. 5.17 by taking two derivatives:

$$\frac{d\psi}{dx} = k'Ae^{k'x} - k'Be^{-k'x}$$

$$\frac{d^2\psi}{dx^2} = k'^2Ae^{k'x} + k'^2Be^{-k'x} = k'^2(Ae^{k'x} + Be^{-k'x}) = k'^2\psi(x)$$

We will use Eqs. 5.16 and 5.18 as our solutions to the Schrödinger equation for constant potential energy in the allowed ($E > U_0$) and forbidden ($E < U_0$) regions.

The Free Particle

For a free particle, the force is zero and so the potential energy is constant. We may choose any value for that constant, so for convenience we’ll choose $U_0 = 0$. The solution is given by Eq. 5.16, $\psi(x) = A\sin kx + B\cos kx$. The energy of the particle is

$$E = \frac{\hbar^2 k^2}{2m} \tag{5.19}$$

Our solution has placed no restrictions on $k$, so the energy is permitted to have any value (in the language of quantum physics, we say that the energy is not quantized). We note that Eq. 5.19 is the kinetic energy of a particle with momentum $p = \hbar k$ or, equivalently, $p = h/\lambda$. This is as we would have expected, because the free particle can be represented by a de Broglie wave with any wavelength.

Solving for $A$ and $B$ presents some difficulties because the normalization integral, Eq. 5.9, cannot be evaluated from $-\infty$ to $+\infty$ for this wave function. We therefore cannot determine probabilities for the free particle from the wave function of Eq. 5.16.
It is also instructive to write the wave function in terms of complex exponentials, using $\sin kx = (e^{ikx} - e^{-ikx})/2i$ and $\cos kx = (e^{ikx} + e^{-ikx})/2$:

$$\psi(x) = A\left(\frac{e^{ikx} - e^{-ikx}}{2i}\right) + B\left(\frac{e^{ikx} + e^{-ikx}}{2}\right) = A'e^{ikx} + B'e^{-ikx} \tag{5.20}$$

where $A' = A/2i + B/2$ and $B' = -A/2i + B/2$. To interpret this solution in terms of waves we form the complete time-dependent wave function using Eq. 5.6:

$$\Psi(x,t) = (A'e^{ikx} + B'e^{-ikx})\,e^{-i\omega t} = A'e^{i(kx - \omega t)} + B'e^{-i(kx + \omega t)} \tag{5.21}$$

The dependence of the first term on $kx - \omega t$ identifies this term as representing a wave moving to the right (in the positive $x$ direction) with amplitude $A'$, and the second term involving $kx + \omega t$ represents a wave moving to the left (in the negative $x$ direction) with amplitude $B'$.

If we want the wave to represent a beam of particles moving in the $+x$ direction, then we must set $B' = 0$. The probability density associated with this wave is then, according to Eq. 5.7,

$$P(x) = |\psi(x)|^2 = |A'|^2\,e^{ikx}e^{-ikx} = |A'|^2 \tag{5.22}$$

The probability density is constant, meaning the particles are equally likely to be found anywhere along the $x$ axis. This is consistent with our discussion of the free-particle de Broglie wave in Chapter 4: a wave of precisely defined wavelength extends from $x = -\infty$ to $x = +\infty$ and thus gives a completely unlocalized particle.

Infinite Potential Energy Well

Now we’ll consider the formal solution to the problem we discussed in Section 5.2: a particle is trapped in the region between $x = 0$ and $x = L$ by infinitely high potential energy barriers. Imagine an apparatus like that of Figure 5.6, in which the particle moves freely in this region and makes elastic collisions with the perfectly rigid barriers that confine it. This problem is sometimes called “a particle in a box.” For now we’ll assume that the particle moves in only one dimension; later we’ll expand to two and three dimensions.
The potential energy may be expressed as:

$$U(x) = \begin{cases} 0 & 0 \le x \le L \\ \infty & x < 0,\ x > L \end{cases} \tag{5.23}$$

FIGURE 5.9 The potential energy of a particle that moves freely ($U = 0$) in the region $0 \le x \le L$ but is completely excluded ($U = \infty$) from the regions $x < 0$ and $x > L$.

The potential energy is shown in Figure 5.9. We are free to choose any constant value for $U$ in the region $0 \le x \le L$; we choose it to be zero for convenience.

Because the potential energy is different in the regions inside and outside the well, we must find separate solutions in each region. We can analyze the outside region in either of two ways. If we examine Eq. 5.5 for the region outside the well, we find that the only way to keep the equation from becoming meaningless when $U \to \infty$ is to require $\psi = 0$, so that $U\psi$ will not become infinite. Alternatively, we can go back to the original statement of the problem. If the walls at the boundaries of the well are perfectly rigid, the particle must always be in the well, and the probability for finding it outside must be zero. To make the probability zero everywhere outside the well, we must make $\psi = 0$ everywhere outside. Thus we have

$$\psi(x) = 0 \qquad x < 0,\ x > L \tag{5.24}$$

The Schrödinger equation for $0 \le x \le L$, when $U(x) = 0$, is identical with Eq. 5.14 with $U_0 = 0$ and has the same solution:

$$\psi(x) = A\sin kx + B\cos kx \qquad 0 \le x \le L \tag{5.25}$$

with

$$k = \sqrt{\frac{2mE}{\hbar^2}} \tag{5.26}$$

Our solution is not yet complete, for we have not evaluated $A$ or $B$, nor have we found the allowed values of the energy $E$. To do this, we must apply the requirement that $\psi(x)$ is continuous across any boundary. In this case, we require that our solutions for $x < 0$ and $x > 0$ match up at $x = 0$; similarly, the solutions for $x > L$ and $x < L$ must match at $x = L$.

Let us begin at $x = 0$. At $x < 0$, we have found that $\psi = 0$, and so we must set $\psi(x)$ of Eq. 5.25 to zero at $x = 0$:

$$\psi(0) = A\sin 0 + B\cos 0 = 0 \tag{5.27}$$

which gives $B = 0$.
Because $\psi = 0$ for $x > L$, the second boundary condition is $\psi(L) = 0$, so

$$\psi(L) = A\sin kL + B\cos kL = 0 \tag{5.28}$$

We have already found $B = 0$, so we must now have $A\sin kL = 0$. Either $A = 0$, in which case $\psi = 0$ everywhere, $\psi^2 = 0$ everywhere, and there is no particle (a meaningless solution), or else $\sin kL = 0$, which is true only when $kL = \pi, 2\pi, 3\pi, \ldots$, or

$$kL = n\pi \qquad n = 1, 2, 3, \ldots \tag{5.29}$$

With $k = 2\pi/\lambda$, we have $\lambda = 2L/n$; this is identical with the result obtained in introductory mechanics for the wavelengths of the standing waves in a string of length $L$ fixed at both ends, which we already obtained in Section 5.2 (Eq. 5.1). Thus the solution to the Schrödinger equation for a particle trapped in a linear region of length $L$ is a series of standing de Broglie waves! Not all wavelengths are permitted; only certain values, determined from Eq. 5.29, may occur.

From Eq. 5.26 we find that, because only certain values of $k$ are permitted by Eq. 5.29, only certain values of $E$ may occur: the energy is quantized! Solving Eq. 5.29 for $k$ and substituting into Eq. 5.26, we obtain

$$E_n = \frac{\hbar^2 k^2}{2m} = \frac{\hbar^2\pi^2 n^2}{2mL^2} = \frac{h^2 n^2}{8mL^2} \qquad n = 1, 2, 3, \ldots \tag{5.30}$$

For convenience, let $E_0 = \hbar^2\pi^2/2mL^2 = h^2/8mL^2$; this unit of energy is determined by the mass of the particle and the width of the well. Then $E_n = n^2 E_0$, and the only allowed energies for the particle are $E_0$, $4E_0$, $9E_0$, $16E_0$, etc. All intermediate values, such as $3E_0$ or $6.2E_0$, are forbidden. Figure 5.10 shows the allowed energy levels. The lowest energy state, for which $n = 1$, is known as the ground state, and the states with higher energies ($n > 1$) are known as excited states.

[Energy-level diagram: $n = 1$, $E_1 = E_0$; $n = 2$, $E_2 = 4E_0$; $n = 3$, $E_3 = 9E_0$; $n = 4$, $E_4 = 16E_0$.]

Because the energy is purely kinetic in this case, our result means that only certain speeds are permitted for the particle.
This is very different from the case of the classical trapped particle, in which the particle can be given any initial velocity and will move forever, back and forth, at the same speed. In the quantum case, this is not possible; only certain initial speeds can result in sustained states of motion; these special conditions represent the “stationary states.” Average values calculated according to Eq. 5.13 likewise do not change with time.

From one energy state, the particle can make jumps or transitions to another energy state by absorbing or releasing an amount of energy equal to the energy difference between the two states. By absorbing energy the particle will move to a higher energy state, and by releasing energy it moves to a lower energy state. A similar effect occurs for electrons in atoms, in which the absorbed or released energy is usually in the form of a photon of visible light or other electromagnetic radiation. For example, from the state with $n = 3$ ($E_3 = 9E_0$), the particle might absorb an energy of $\Delta E = 7E_0$ and jump upward to the $n = 4$ state ($E_4 = 16E_0$) or might release energy of $\Delta E = 5E_0$ and jump downward to the $n = 2$ state ($E_2 = 4E_0$).

FIGURE 5.10 The first four energy levels in a one-dimensional infinite potential energy well.

Example 5.2

An electron is trapped in a one-dimensional region of length $1.00 \times 10^{-10}$ m (a typical atomic diameter). (a) Find the energies of the ground state and first two excited states. (b) How much energy must be supplied to excite the electron from the ground state to the second excited state? (c) From the second excited state, the electron drops down to the first excited state. How much energy is released in this process?
Solution

(a) The basic quantity of energy needed for this calculation is

$$E_0 = \frac{h^2}{8mL^2} = \frac{(hc)^2}{8mc^2L^2} = \frac{(1240\ \text{eV}\cdot\text{nm})^2}{8(511{,}000\ \text{eV})(0.100\ \text{nm})^2} = 37.6\ \text{eV}$$

With $E_n = n^2 E_0$, we can find the energy of the states:

$$n = 1:\quad E_1 = E_0 = 37.6\ \text{eV}$$
$$n = 2:\quad E_2 = 4E_0 = 150.4\ \text{eV}$$
$$n = 3:\quad E_3 = 9E_0 = 338.4\ \text{eV}$$

(b) The energy difference between the ground state and the second excited state is

$$\Delta E = E_3 - E_1 = 338.4\ \text{eV} - 37.6\ \text{eV} = 300.8\ \text{eV}$$

This is the energy that must be absorbed for the electron to make this jump.

(c) The energy difference between the second and first excited states is

$$\Delta E = E_3 - E_2 = 338.4\ \text{eV} - 150.4\ \text{eV} = 188.0\ \text{eV}$$

This is the energy that is released when the electron makes this jump.

To complete the solution for $\psi(x)$, we must determine the constant $A$ by using the normalization condition given in Eq. 5.9, $\int_{-\infty}^{+\infty} |\psi(x)|^2\,dx = 1$. The integrand is zero in the regions $-\infty < x \le 0$ and $L \le x < +\infty$, so all that remains is

$$\int_0^L A^2 \sin^2 \frac{n\pi x}{L}\,dx = 1 \tag{5.31}$$

from which we find $A = \sqrt{2/L}$. The complete wave function for $0 \le x \le L$ is then

$$\psi_n(x) = \sqrt{\frac{2}{L}} \sin \frac{n\pi x}{L} \qquad n = 1, 2, 3, \ldots \tag{5.32}$$

In Figure 5.11, the wave functions and probability densities $\psi^2$ are illustrated for the lowest several states. In the ground state, the particle has the greatest probability to be found near the middle of the well ($x = L/2$), and the probability falls off to zero between the center and the sides of the well. This is very different from the behavior of a classical particle: a classical particle moving at constant speed would be found with equal probability at every location inside the well. The quantum particle also has constant speed but yet is still found with differing probability at various locations in the well. It is the wave nature of the quantum particle that is responsible for this very nonclassical behavior.
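The arithmetic of Example 5.2 and the normalization result $A = \sqrt{2/L}$ can be checked numerically. The following is a minimal sketch, not part of the text's solution; the function names `infinite_well_energy` and `norm_integral` are hypothetical, and the constants are the rounded values used in the example ($hc = 1240$ eV·nm, $mc^2 = 511{,}000$ eV).

```python
import math

HC_EV_NM = 1240.0    # h*c in eV*nm (rounded value used in Example 5.2)
MC2_EV = 511_000.0   # electron rest energy in eV (rounded)


def infinite_well_energy(n, width_nm, mc2_ev=MC2_EV):
    """Energy of level n from Eq. 5.30, E_n = n^2 (hc)^2 / (8 mc^2 L^2), in eV."""
    return n**2 * HC_EV_NM**2 / (8.0 * mc2_ev * width_nm**2)


def norm_integral(n, width_nm, steps=10_000):
    """Midpoint-rule estimate of the integral of |psi_n|^2 over 0 <= x <= L (Eq. 5.32)."""
    dx = width_nm / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        psi = math.sqrt(2.0 / width_nm) * math.sin(n * math.pi * x / width_nm)
        total += psi * psi * dx
    return total


L = 0.100  # well width in nm
levels = [infinite_well_energy(n, L) for n in (1, 2, 3)]
absorbed = levels[2] - levels[0]   # part (b): ground state -> second excited state
released = levels[2] - levels[1]   # part (c): second -> first excited state

print("E1, E2, E3 (eV):", [round(e, 2) for e in levels])
print("absorbed (eV):", round(absorbed, 2), " released (eV):", round(released, 2))
print("normalization:", [round(norm_integral(n, L), 6) for n in (1, 2, 3)])
```

Because the sketch carries $E_0$ at full precision rather than rounding it to 37.6 eV before multiplying by $n^2$, its values can differ from those quoted in the example in the last digit.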
FIGURE 5.11 The wave functions (solid lines) and probability densities (shaded regions) of the first four states ($n = 1$ to $n = 4$) in the one-dimensional infinite potential energy well, plotted from $x = 0$ to $x = L$.

Another example of nonclassical behavior occurs for the first excited state. The probability density has a maximum at $x = L/4$ and another maximum at $x = 3L/4$. Between the two maxima, there is zero probability to find the particle in the center of the well at $x = L/2$. How can the particle travel from $x = L/4$ to $x = 3L/4$ without ever being at $x = L/2$? Of course, no classical particle could behave in such a way, but it is a common behavior for waves. For example, the first overtone of a vibrating string has a node at its midpoint and antinodes (vibrational maxima) at the 1/4 and 3/4 locations.

The calculation of probabilities and average values is illustrated by the following examples.

Example 5.3

Consider again an electron trapped in a one-dimensional region of length $1.00 \times 10^{-10}$ m = 0.100 nm. (a) In the ground state, what is the probability of finding the electron in the region from $x = 0.0090$ nm to 0.0110 nm? (b) In the first excited state, what is the probability of finding the electron between $x = 0$ and $x = 0.025$ nm?

Solution

(a) When the interval is small, it is often simpler to use Eq. 5.7 to find the probability, instead of using the integration method. The width of the small interval is $dx = 0.0110\ \text{nm} - 0.0090\ \text{nm} = 0.0020$ nm. Evaluating the wave function at the midpoint of the interval ($x = 0.0100$ nm), we can use the $n = 1$ wave function with Eq.
5.7 to find

$$P(x)\,dx = |\psi_1(x)|^2\,dx = \frac{2}{L} \sin^2 \frac{\pi x}{L}\,dx = \frac{2}{0.100\ \text{nm}} \sin^2 \frac{\pi(0.0100\ \text{nm})}{0.100\ \text{nm}}\,(0.0020\ \text{nm}) = 0.0038 = 0.38\%$$

(b) For this wide interval, we must use the integration method to find the probability:

$$P(x_1 : x_2) = \int_{x_1}^{x_2} |\psi_2(x)|^2\,dx = \frac{2}{L} \int_{x_1}^{x_2} \sin^2 \frac{2\pi x}{L}\,dx = \left[ \frac{x}{L} - \frac{1}{4\pi} \sin \frac{4\pi x}{L} \right]_{x_1}^{x_2}$$

Evaluating this expression using the limits $x_1 = 0$ and $x_2 = 0.025$ nm gives a probability of 0.25 or 25%. This result is of course what we would expect by inspection of the graph of $\psi^2$ for $n = 2$ in Figure 5.11. The interval from $x = 0$ to $x = L/4$ contains 25% of the total area under the $\psi^2$ curve.

Example 5.4

Show that the average value of $x$ is $L/2$, independent of the quantum state.

Solution

We use Eq. 5.12; because $\psi = 0$ except for $0 \le x \le L$, the limits of integration are 0 and $L$:

$$x_{\text{av}} = \int_0^L |\psi(x)|^2\,x\,dx = \frac{2}{L} \int_0^L \sin^2 \frac{n\pi x}{L}\,x\,dx$$

This can be integrated by parts or found in integral tables; the result is

$$x_{\text{av}} = \frac{L}{2}$$

Note that, as required, this result is independent of $n$. Thus a measurement of the average position of the particle yields no information about its quantum state.

Let’s now look at how the uncertainty principle applies to the motion of this trapped particle. By solving Problems 34 and 35, you will find that the uncertainties in position and momentum for a particle in an infinite potential well are $\Delta x = L\sqrt{1/12 - 1/2\pi^2 n^2}$ and $\Delta p = hn/2L$. The product of the uncertainties is

$$\Delta x\,\Delta p = \frac{hn}{2} \sqrt{\frac{1}{12} - \frac{1}{2\pi^2 n^2}} = \frac{h}{2} \sqrt{\frac{n^2}{12} - \frac{1}{2\pi^2}}$$

Clearly the product of the uncertainties grows as $n$ grows. The minimum value occurs for $n = 1$, in which case $\Delta x\,\Delta p = 0.090h = 0.57\hbar$. The ground state represents a fairly “compact” wave packet, but it is somewhat less compact than the minimum possible limit of $0.50\hbar$ (Eq. 4.10). You can see from Figure 5.11 how the wave becomes less compact (spreads out more) as $n$ increases. Even for $n = 2$, the product of the uncertainties grows quickly to $1.67\hbar$.
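The results of Examples 5.3 and 5.4 and the uncertainty products quoted above can be reproduced with a few lines of numerical integration. This is a rough sketch under stated assumptions: the helper names (`psi_sq`, `integrate`, `uncertainty_product_in_hbar`) are hypothetical, and a simple midpoint rule stands in for the analytic integrals used in the text.

```python
import math


def psi_sq(n, x, L):
    """Probability density |psi_n(x)|^2 = (2/L) sin^2(n pi x / L) inside the well."""
    return (2.0 / L) * math.sin(n * math.pi * x / L) ** 2


def integrate(f, a, b, steps=20_000):
    """Midpoint-rule estimate of the integral of f from a to b."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h


def uncertainty_product_in_hbar(n):
    """Delta-x Delta-p = (h n / 2) sqrt(1/12 - 1/(2 pi^2 n^2)), expressed in units of hbar."""
    # h = 2 pi hbar, so (h n / 2) becomes pi * n in units of hbar.
    return math.pi * n * math.sqrt(1.0 / 12.0 - 1.0 / (2.0 * math.pi**2 * n**2))


L = 0.100  # well width in nm

# Example 5.3(a): narrow interval, density at the midpoint times the width.
p_a = psi_sq(1, 0.0100, L) * 0.0020

# Example 5.3(b): wide interval, integrate the n = 2 density from 0 to L/4.
p_b = integrate(lambda x: psi_sq(2, x, L), 0.0, 0.025)

# Example 5.4: the average position for several states.
x_av = [integrate(lambda x, n=n: x * psi_sq(n, x, L), 0.0, L) for n in (1, 2, 3)]

print("P(a) =", round(p_a, 4), " P(b) =", round(p_b, 4))
print("x_av (nm):", [round(v, 5) for v in x_av])
print("dx*dp / hbar:", [round(uncertainty_product_in_hbar(n), 2) for n in (1, 2, 3)])
```

The average position comes out at $L/2 = 0.050$ nm for every $n$ tried, which is the point of Example 5.4.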
Finite Potential Energy Well

Because the infinite potential energy well is an idealization of a technique for confining a particle, we should examine the solution when the barriers at the sides of the well are finite rather than infinite. The potential energy well can be described by

$$U(x) = \begin{cases} 0 & 0 \le x \le L \\ U_0 & x < 0,\ x > L \end{cases} \tag{5.33}$$

and is sketched in Figure 5.12. We look for solutions in which the particle is confined to this well, and thus the energies that we deduce for the particle must be less than $U_0$.

FIGURE 5.12 The potential energy of a particle that is confined to the region $0 \le x \le L$ by finite barriers $U_0$ at $x = 0$ and $x = L$.

The solution in the center region (between $x = 0$ and $x = L$) is exactly the same as it was for the infinite well (Eq. 5.25):

$$\psi(x) = A \sin kx + B \cos kx \qquad (0 \le x \le L) \tag{5.34}$$

although the values that we deduced previously for the coefficients $A$ and $B$ are not valid in this calculation. The region $x < 0$ is an example of a situation in which the energy $E$ of the particle is less than the potential energy $U_0$, and so we must use the solution in the form of Eq. 5.18, $\psi(x) = Ce^{k'x} + De^{-k'x}$ with $k'$ given in Eq. 5.17. This region includes $x = -\infty$, for which the term with the coefficient $D$ becomes infinite. Because we cannot allow the probability to become infinite, we must discard this term by setting $D = 0$. The solution for $x < 0$ is then

$$\psi(x) = Ce^{k'x} \qquad (x < 0) \tag{5.35}$$

In the region $x > L$, the energy $E$ is once again smaller than $U_0$, and so the solution is also in the form of Eq. 5.18, $\psi(x) = Fe^{k'x} + Ge^{-k'x}$. Here the region now includes $x = +\infty$, for which the term with the coefficient $F$ would become infinite. We must prevent that possibility by setting $F = 0$, so the solution in this region is

$$\psi(x) = Ge^{-k'x} \qquad (x > L) \tag{5.36}$$

We now have 4 coefficients to determine ($A$, $B$, $C$, $G$) along with the energy $E$.
For this determination, we have 4 equations from the boundary conditions (the continuity of both $\psi$ and $d\psi/dx$ at both $x = 0$ and $x = L$) and one equation from the normalization condition. As you might imagine, solving 5 equations in 5 unknowns presents a straightforward but very tedious algebraic challenge. Moreover, the resulting solution for the energy values cannot be obtained in terms of a direct equation such as Eq. 5.30, but instead must be found numerically by solving a transcendental equation. The result is a series of increasing energy values, but the number of energy values is finite rather than infinite, because the energy cannot be allowed to exceed the value of $U_0$.

As we did for the infinite potential energy well in Example 5.2, let’s consider a well of width $L = 0.100$ nm. We’ll choose the depth of the well to be $U_0 = 400$ eV. Applying the boundary conditions at $x = 0$ and $x = L$, we can eliminate all of the coefficients and find an equation that involves only $k$ and $k'$ (both of which depend on the energy $E$). Solving that equation numerically, we find four possible values of the energy: $E_1 = 26$ eV, $E_2 = 104$ eV, $E_3 = 227$ eV, $E_4 = 375$ eV. Here the subscript just numbers the energy values, starting at the ground state; there is no simple functional dependence of the energies on the quantum number $n$ as there was for the infinite well. The allowed energy levels are shown in Figure 5.13.

The probability densities (square of the wave functions) for these four states are shown in Figure 5.14. In some ways they are similar to the probability densities in the infinite well: note that each state has $n$ maxima in its probability density, just like the infinite well (see Figure 5.11). Unlike the infinite well, these probability densities show the property of penetration into the classically forbidden region.
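One way to carry out the numerical step is sketched below. It is not the route described in the text: instead of eliminating the coefficients for a well running from 0 to $L$, the sketch assumes the origin is moved to the center of the well, so that the solutions are either even or odd and the matching conditions take the standard forms $k \tan(kL/2) = k'$ (even) and $-k \cot(kL/2) = k'$ (odd). The energies do not depend on where the origin is placed. The function name `finite_well_energies`, the bisection search, and the constants $\hbar c = 197.327$ eV·nm and $mc^2 = 510{,}998.95$ eV are choices made for this sketch.

```python
import math

HBAR_C = 197.327    # hbar*c in eV*nm
MC2 = 510_998.95    # electron rest energy in eV


def finite_well_energies(width_nm, depth_ev, mc2_ev=MC2):
    """Bound-state energies (eV) of a finite square well of the given width and depth.

    Uses z = k L / 2 and z0 = (L/2) sqrt(2 m U0) / hbar, so that k' L / 2 = sqrt(z0^2 - z^2).
    The n-th state has its root in the interval ((n-1) pi/2, n pi/2), cut off at z0.
    """
    half = width_nm / 2.0
    z0 = half * math.sqrt(2.0 * mc2_ev * depth_ev) / HBAR_C

    def mismatch(z, n):
        kp_half = math.sqrt(max(z0 * z0 - z * z, 0.0))
        if n % 2 == 1:
            # even-parity state: k sin(kL/2) - k' cos(kL/2) = 0
            return z * math.sin(z) - kp_half * math.cos(z)
        # odd-parity state: k cos(kL/2) + k' sin(kL/2) = 0
        return z * math.cos(z) + kp_half * math.sin(z)

    energies = []
    n = 1
    while (n - 1) * math.pi / 2.0 < z0:
        lo = (n - 1) * math.pi / 2.0
        hi = min(n * math.pi / 2.0, z0)
        f_lo = mismatch(lo, n)
        for _ in range(200):             # plain bisection on the sign change
            mid = 0.5 * (lo + hi)
            f_mid = mismatch(mid, n)
            if (f_mid > 0) == (f_lo > 0):
                lo, f_lo = mid, f_mid
            else:
                hi = mid
        k = 0.5 * (lo + hi) / half       # wave number inside the well, in 1/nm
        energies.append((HBAR_C * k) ** 2 / (2.0 * mc2_ev))
        n += 1
    return energies


levels = finite_well_energies(0.100, 400.0)
print([round(e, 1) for e in levels])
```

For $L = 0.100$ nm and $U_0 = 400$ eV the search finds four bound states, and each value rounds to within about 1 eV of the energies quoted above.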
Look carefully at the continuity of the wave function and its slope at $x = 0$ and $x = L$; see how smoothly the sine and cosine function inside the well joins the exponentials in the forbidden regions.

The energy levels of the finite well are smaller than those of the infinite well of the same width (38 eV, 150 eV, 338 eV, 602 eV), and the differences increase as we go to higher states. This is consistent with the uncertainty principle: because of the penetration into the forbidden region, $\Delta x$ is larger for the finite well and thus $\Delta p_x$ must be smaller. As a result, the kinetic energies are smaller for the finite well. From Figure 5.14 we see that the penetration distance increases as we go up in energy, so the difference between $\Delta x$ for the finite well and the infinite well increases and the energy discrepancy also increases.

FIGURE 5.13 The energy levels in a potential energy well of depth 400 eV. There are only four energy states in this well.

FIGURE 5.14 The probability densities of the four states in the one-dimensional potential energy well of width 0.100 nm and depth 400 eV.

For an energy close to the top of the well such as $E_4$, a smaller uncertainty $\Delta E$ is necessary to reach the top of the well, giving a larger $\Delta t \sim \hbar/\Delta E$ and thus a larger penetration distance. At the bottom of the well, the state $E_1$ requires much more energy to reach the top of the well and thus needs a much larger $\Delta E$; the smaller resulting $\Delta t$ gives a smaller penetration distance into the forbidden region.

Two-Dimensional Infinite Potential Energy Well

When we extend the previous calculation to two and three dimensions, the principal features of the solution remain the same, but an important new feature is introduced.
In this section we show how this occurs; this new feature, known as degeneracy, will turn out to be very important in our study of atomic physics.

To begin with, we need a Schrödinger equation that is valid in more than one dimension; our previous version, Eq. 5.5, included only one spatial dimension. If the potential energy is a function of $x$ and $y$, we expect that $\psi$ also depends on both $x$ and $y$, and the derivatives with respect to $x$ must be replaced by derivatives with respect to $x$ and $y$. In two dimensions, we then have∗

$$-\frac{\hbar^2}{2m} \left( \frac{\partial^2 \psi(x, y)}{\partial x^2} + \frac{\partial^2 \psi(x, y)}{\partial y^2} \right) + U(x, y)\,\psi(x, y) = E\,\psi(x, y) \tag{5.37}$$

∗ The first two terms on the left side of this equation require partial derivatives; for well-behaved functions, these involve taking the derivative with respect to one variable while keeping the other constant. Thus if $f(x, y) = x^2 + xy + y^2$, then $\partial f/\partial x = 2x + y$ and $\partial f/\partial y = x + 2y$.

The two-dimensional potential energy well is:

$$U(x, y) = \begin{cases} 0 & 0 \le x \le L;\ 0 \le y \le L \\ \infty & \text{otherwise} \end{cases} \tag{5.38}$$

The particle is confined by infinitely high barriers to the square region with the vertices $(x, y) = (0, 0)$, $(L, 0)$, $(L, L)$, $(0, L)$, as shown in Figure 5.15. A classical analog might be a small disk sliding without friction on a tabletop and colliding elastically with walls at $x = 0$, $x = L$, $y = 0$, and $y = L$. (For simplicity, we have made the allowed region square; we could have made it rectangular by setting $U = 0$ when $0 \le x \le a$ and $0 \le y \le b$.)

FIGURE 5.15 A particle moves freely in the two-dimensional region $0 < x < L$, $0 < y < L$, but encounters infinite barriers beyond that region.

Solving partial differential equations requires a technique more involved than we need to consider, so we will not give the details of the solution. We suspect that, as in the previous case, $\psi(x, y) = 0$ outside the allowed region, in order to make the probability zero there. Inside the well, we consider solutions that are separable; that is, our function of $x$ and $y$ can be expressed as the product of one function that depends only on $x$ and another that depends only on $y$:

$$\psi(x, y) = f(x)\,g(y) \tag{5.39}$$

where the functions $f$ and $g$ are similar to Eq. 5.16:

$$f(x) = A \sin k_x x + B \cos k_x x, \qquad g(y) = C \sin k_y y + D \cos k_y y \tag{5.40}$$

The wave number $k$ of the one-dimensional problem has become the separate wave numbers $k_x$ for $f(x)$ and $k_y$ for $g(y)$. We show later how these are related. (See also Problem 18 at the end of this chapter.)

The continuity condition on $\psi(x, y)$ requires that the solutions inside and outside match at the boundary. Because $\psi = 0$ everywhere outside, the continuity condition then requires that $\psi = 0$ everywhere on the boundary. That is,

$$\psi(0, y) = 0 \ \text{and}\ \psi(L, y) = 0 \quad \text{for all } y$$
$$\psi(x, 0) = 0 \ \text{and}\ \psi(x, L) = 0 \quad \text{for all } x$$

In analogy with the one-dimensional problem, the condition at $x = 0$ gives $f(0) = 0$, which requires $B = 0$ in Eq. 5.40. Similarly, the condition at $y = 0$ gives $g(0) = 0$, which requires $D = 0$. The condition $f(L) = 0$ requires that $\sin k_x L = 0$, and thus that $k_x L$ be an integer multiple of $\pi$; the condition $g(L) = 0$ similarly requires that $k_y L$ be an integer multiple of $\pi$. These two integers do not necessarily need to be the same, so we call them $n_x$ and $n_y$. Making all these substitutions into Eq. 5.39, we obtain

$$\psi(x, y) = A' \sin \frac{n_x \pi x}{L} \sin \frac{n_y \pi y}{L} \tag{5.41}$$

where we have combined $A$ and $C$ into $A'$.
The coefficient $A'$ is once again found by the normalization condition, which in two dimensions becomes

$$\iint \psi^2\,dx\,dy = 1 \tag{5.42}$$

For our case this gives

$$\int_0^L dy \int_0^L A'^2 \sin^2 \frac{n_x \pi x}{L} \sin^2 \frac{n_y \pi y}{L}\,dx = 1 \tag{5.43}$$

from which follows

$$A' = \frac{2}{L} \tag{5.44}$$

(The solutions to this problem, which are standing de Broglie waves on a two-dimensional surface, are similar to the solutions of the classical problem of the vibrations of a stretched membrane such as a drumhead.)

Finally, we can substitute our solution for $\psi(x, y)$ back into the Schrödinger equation, Eq. 5.37, to find the energy:

$$E = \frac{\hbar^2 \pi^2}{2mL^2}(n_x^2 + n_y^2) = \frac{h^2}{8mL^2}(n_x^2 + n_y^2) \tag{5.45}$$

Compare this result with Eq. 5.30. Once again we let $E_0 = \hbar^2 \pi^2/2mL^2 = h^2/8mL^2$ so that $E = E_0(n_x^2 + n_y^2)$. In Figure 5.16 the energies of the excited states are shown. You can see how different the energies are from those of the one-dimensional case shown in Figure 5.10.

FIGURE 5.16 The lower permitted energy levels of a particle confined to an infinite two-dimensional potential energy well. The levels shown, labeled by $(n_x, n_y)$, are: (1,1) at $2E_0$; (2,1) or (1,2) at $5E_0$; (2,2) at $8E_0$; (3,1) or (1,3) at $10E_0$; (3,2) or (2,3) at $13E_0$; (4,1) or (1,4) at $17E_0$; (3,3) at $18E_0$; (4,2) or (2,4) at $20E_0$; (4,3) or (3,4) at $25E_0$; (5,1) or (1,5) at $26E_0$; (5,2) or (2,5) at $29E_0$.

Figure 5.17 shows the probability density $\psi^2$ for several different combinations of the quantum numbers $n_x$ and $n_y$. The probability has maxima and minima, just like the probability in the one-dimensional problem. For example, if we gave the particle an energy of $8E_0$ and then made a large number of measurements of its position, we would expect to find it most often near the four points $(x, y) = (L/4, L/4)$, $(L/4, 3L/4)$, $(3L/4, L/4)$ and $(3L/4, 3L/4)$; we expect never to find it near $x = L/2$ or $y = L/2$. The shape of the probability density tells us something about the quantum numbers and therefore about the energy. Thus if we measured the probability density and found six maxima, as shown in Figure 5.17, we would deduce that the particle had an energy of $13E_0$ with $n_x = 2$ and $n_y = 3$, or else $n_x = 3$, $n_y = 2$.

FIGURE 5.17 The probability density for some of the lower energy levels of a particle confined to the infinite two-dimensional potential energy well. The individual plots are labeled with the quantum numbers $(n_x, n_y)$ and with the value of the energy $E$: (1,1) $2E_0$; (2,1) and (1,2) $5E_0$; (2,2) $8E_0$; (3,1) and (1,3) $10E_0$; (3,2) and (2,3) $13E_0$; (3,3) $18E_0$.

Recently it has become possible to photograph the probability densities of electrons confined in a two-dimensional region. The tip of an electron microscope was used to place 48 individual iron atoms on a metal surface in a ring or “corral” of radius 7.13 nm that formed the walls of a potential well, as shown in Figure 5.18. Inside the ring, the waves of probability density for electrons trapped in the potential well are clearly visible. The potential well is circular, rather than square, but otherwise the analysis follows the procedures described in this section; when the Schrödinger equation is solved in cylindrical polar coordinates with the potential energy for a circular well, the calculated probability density gives a close match with the observed one. These beautiful results are a dramatic confirmation of the wave functions obtained for the two-dimensional potential energy well.

FIGURE 5.18 A ring of iron atoms on a copper surface forms a “corral” within which the probability density of trapped electrons is clearly visible. This image was obtained with a scanning tunneling electron microscope. (Image originally created by IBM Corporation.)

Degeneracy

Occasionally it happens that two different sets of quantum numbers $n_x$ and $n_y$ have exactly the same energy. This situation is known as degeneracy, and the energy levels are said to be degenerate.
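The definition can be explored by listing every pair $(n_x, n_y)$ and grouping the pairs by the value of $n_x^2 + n_y^2$, which is the energy in units of $E_0$. The short sketch below does this; the function name `levels_2d` and the cutoff `n_max` are hypothetical choices for illustration.

```python
from collections import defaultdict


def levels_2d(n_max):
    """Group quantum-number pairs (nx, ny), 1 <= nx, ny <= n_max, by E/E0 = nx^2 + ny^2.

    Only levels with E/E0 <= n_max^2 + 1 are guaranteed to be complete, because
    higher levels may involve a quantum number larger than the cutoff.
    """
    levels = defaultdict(list)
    for nx in range(1, n_max + 1):
        for ny in range(1, n_max + 1):
            levels[nx * nx + ny * ny].append((nx, ny))
    return dict(sorted(levels.items()))


table = levels_2d(7)
for energy, states in table.items():
    if energy <= 50:
        print(f"{energy:3d} E0  degeneracy {len(states)}  {states}")
```

The number of pairs listed for a level is its degeneracy; levels with a single pair, such as (2,2), are not degenerate.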
For example, the energy level at $E = 13E_0$ is degenerate, because both $n_x = 2$, $n_y = 3$ and $n_x = 3$, $n_y = 2$ have $E = 13E_0$. This degeneracy arises from interchanging $n_x$ and $n_y$ (which is the same as interchanging the $x$ and $y$ axes), so the probability distributions in the two cases are not very different. However, consider the state with $E = 50E_0$, for which there are three sets of quantum numbers: $n_x = 7$, $n_y = 1$; $n_x = 1$, $n_y = 7$; and $n_x = 5$, $n_y = 5$. The first two sets of quantum numbers result from the interchange of $n_x$ and $n_y$ and so have similar probability distributions, but the third represents a very different state of motion, as shown in Figure 5.19. The level at $E = 13E_0$ is said to be two-fold degenerate, while the level at $E = 50E_0$ is three-fold degenerate; we could also say that one level has a degeneracy of 2, while the other has a degeneracy of 3.

FIGURE 5.19 Two very different probability densities with exactly the same energy: $(n_x, n_y) = (1, 7)$ and $(5, 5)$, both with $E = 50E_0$.

Degeneracy occurs in general whenever a system is labeled by two or more quantum numbers; as we have seen in the above calculation, different combinations of quantum numbers often can give the same value of the energy. The number of different quantum numbers required by a given physical problem turns out to be exactly equal to the number of dimensions in which the problem is being solved: one-dimensional problems need only one quantum number, two-dimensional problems need two, and so forth. When we get to three dimensions, as in Problem 19 at the end of this chapter and especially in the hydrogen atom in Chapter 7, we find that the effects of degeneracy become more significant; in the case of atomic physics, the degeneracy is a major contributor to the structure and properties of atoms.

5.5 THE SIMPLE HARMONIC OSCILLATOR

Another situation that can be analyzed using the Schrödinger equation is the one-dimensional simple harmonic oscillator.
The classical oscillator is an object of mass $m$ attached to a spring of force constant $k$. The spring exerts a restoring force $F = -kx$ on the object, where $x$ is the displacement from its equilibrium position. Using Newton’s laws, we can analyze the oscillator and show that it has a (circular or angular) frequency $\omega_0 = \sqrt{k/m}$ and a period $T = 2\pi\sqrt{m/k}$. The maximum distance of the oscillating object from its equilibrium position is $x_0$, the amplitude of the oscillation. The oscillator has its maximum kinetic energy at $x = 0$; its kinetic energy vanishes at the turning points $x = \pm x_0$. At the turning points the oscillator comes to rest for an instant and then reverses its direction of motion. The motion is, of course, confined to the region $-x_0 \le x \le +x_0$.

Why analyze the motion of such a system using quantum mechanics? Although we never find in nature an example of a one-dimensional quantum oscillator, there are systems that behave approximately as one, for example a vibrating diatomic molecule. In fact, any system in a smoothly varying potential energy well near its minimum behaves approximately like a simple harmonic oscillator.

A force $F = -kx$ has the associated potential energy $U = \frac{1}{2}kx^2$, and so we have the Schrödinger equation:

$$-\frac{\hbar^2}{2m} \frac{d^2\psi}{dx^2} + \frac{1}{2}kx^2\psi = E\psi \tag{5.46}$$

(Because we are working in one dimension, $U$ and $\psi$ are functions only of $x$.) There are no boundaries between different regions of potential energy here, so the wave function must fall to zero for both $x \to +\infty$ and $x \to -\infty$. The simplest function that satisfies these conditions, which turns out to be the correct ground state wave function, is $\psi(x) = Ae^{-ax^2}$. The constant $a$ and the energy $E$ can be found by substituting this function into Eq. 5.46. We begin by evaluating $d^2\psi/dx^2$.

$$\frac{d\psi}{dx} = -2ax\,(Ae^{-ax^2})$$

$$\frac{d^2\psi}{dx^2} = -2a\,(Ae^{-ax^2}) - 2ax(-2ax)Ae^{-ax^2} = (-2a + 4a^2x^2)Ae^{-ax^2}$$

Substituting into Eq.
5.46 and canceling the common factor $Ae^{-ax^2}$ yields

$$\frac{\hbar^2 a}{m} - \frac{2a^2\hbar^2}{m}x^2 + \frac{1}{2}kx^2 = E \tag{5.47}$$

Equation 5.47 is not an equation to be solved for $x$, because we are looking for a solution that is valid for any $x$, not just for one specific value. In order for this to hold for any $x$, the coefficients of $x^2$ must cancel and the remaining constants must be equal. (That is, consider the equation $bx^2 = c$. It will be true for any and all $x$ only if both $b = 0$ and $c = 0$.) Thus

$$-\frac{2a^2\hbar^2}{m} + \frac{1}{2}k = 0 \qquad \text{and} \qquad \frac{\hbar^2 a}{m} = E \tag{5.48}$$

which yield

$$a = \frac{\sqrt{km}}{2\hbar} \qquad \text{and} \qquad E = \frac{1}{2}\hbar\sqrt{k/m} \tag{5.49}$$

We can also write the energy in terms of the classical frequency $\omega_0 = \sqrt{k/m}$ as

$$E = \frac{1}{2}\hbar\omega_0 \tag{5.50}$$

The coefficient $A$ is found from the normalization condition (see Problem 20 at the end of the chapter). The result, which is valid only for this ground-state wave function, is $A = (m\omega_0/\hbar\pi)^{1/4}$. The complete wave function of the ground state is then

$$\psi(x) = \left( \frac{m\omega_0}{\hbar\pi} \right)^{1/4} e^{-(\sqrt{km}/2\hbar)x^2} \tag{5.51}$$

The probability density for this wave function is illustrated in Figure 5.20. Note that, as in the case of the finite potential energy well, the probability density can penetrate into the forbidden region beyond the classical turning points at $x = \pm x_0$ (in this region the potential energy is greater than $E$).

FIGURE 5.20 The probability density for the ground state of the simple harmonic oscillator. The classical turning points are at $x = \pm x_0$.

The solution we have found corresponds only to the ground state of the oscillator. The general solution is of the form $\psi_n(x) = Af_n(x)e^{-ax^2}$, where $f_n(x)$ is a polynomial in which the highest power of $x$ is $x^n$. The corresponding energies are

$$E_n = \left( n + \frac{1}{2} \right)\hbar\omega_0 \qquad n = 0, 1, 2, \ldots \tag{5.52}$$

These levels are shown in Figure 5.21. Note that they are uniformly spaced, in contrast to the one-dimensional infinite potential energy well. Probability densities are shown in Figure 5.22.
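As a quick numerical check of Eqs. 5.49 and 5.51, one can substitute $\psi(x) = e^{-ax^2}$ into Eq. 5.46 and confirm that the two sides agree at several values of $x$, not just one. The sketch below assumes units in which $\hbar = m = k = 1$ (so $a = 1/2$ and $E = 1/2$) and approximates the second derivative with a central difference; the function name `schrodinger_residual` and the step size are choices made for this illustration.

```python
import math


def schrodinger_residual(x, hbar=1.0, m=1.0, k=1.0, step=1e-3):
    """Left side minus right side of Eq. 5.46 for psi = exp(-a x^2), with a and E from Eq. 5.49."""
    a = math.sqrt(k * m) / (2.0 * hbar)
    energy = 0.5 * hbar * math.sqrt(k / m)

    def psi(u):
        return math.exp(-a * u * u)

    # Central-difference estimate of the second derivative of psi at x.
    d2psi = (psi(x + step) - 2.0 * psi(x) + psi(x - step)) / step**2
    return -(hbar**2 / (2.0 * m)) * d2psi + 0.5 * k * x * x * psi(x) - energy * psi(x)


points = (-2.0, -0.5, 0.0, 0.7, 1.5)
residuals = [schrodinger_residual(x) for x in points]
print(["%.1e" % r for r in residuals])
```

The residuals are zero to within the accuracy of the finite-difference approximation at every point tried, which is what "valid for any $x$" requires.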
All of the solutions have the property of penetration of probability density into the forbidden region beyond the classical turning points. The probability density oscillates, somewhat like a sine wave, between the turning points, and decreases like $e^{-ax^2}$ to zero beyond the turning points. Note the great similarity between the probability densities for the quantum oscillator and those of the finite potential energy well (Figure 5.14).

A sequence of vibrational excited states similar to Figure 5.21 is commonly found in diatomic molecules such as HCl (see Chapter 9). The spacing between the states is typically 0.1–1 eV; the states are observed when photons (in the infrared region of the spectrum) are emitted or absorbed as the molecule jumps from one state to another. A similar sequence is observed in nuclei, where the spacing is 0.1–1 MeV and the radiations are in the gamma-ray region of the spectrum.

FIGURE 5.21 Energy levels of the simple harmonic oscillator in the potential $U = \frac{1}{2}kx^2$: $E = \frac{1}{2}\hbar\omega_0$ ($n = 0$), $\frac{3}{2}\hbar\omega_0$ ($n = 1$), $\frac{5}{2}\hbar\omega_0$ ($n = 2$), $\frac{7}{2}\hbar\omega_0$ ($n = 3$). Note that the levels have equal spacings and that the distance between the classical turning points increases with energy.

FIGURE 5.22 Probability densities for the simple harmonic oscillator for $n = 0$ to $n = 3$. Note how the distance between the classical turning points (marked by the short vertical lines) increases with energy. Compare with the probability densities for the finite potential energy well (Figure 5.14).

Example 5.5

An electron is bound to a region of space by a springlike force with an effective spring constant of $k = 95.7$ eV/nm². (a) What is its ground-state energy? (b) How much energy must be absorbed for the electron to jump from the ground state to the second excited state?
Solution

(a) The ground-state energy is

$$E = \frac{1}{2}\hbar\omega_0 = \frac{1}{2}\hbar\sqrt{\frac{k}{m}} = \frac{1}{2}\hbar c\sqrt{\frac{k}{mc^2}} = \frac{1}{2}(197\ \text{eV}\cdot\text{nm})\sqrt{\frac{95.7\ \text{eV/nm}^2}{0.511 \times 10^6\ \text{eV}}} = 1.35\ \text{eV}$$

(b) The difference between adjacent energy levels is $\hbar\omega_0 = 2.70$ eV for all energy levels, so the energy that must be absorbed to go from the ground state to the second excited state is $\Delta E = 2 \times 2.70\ \text{eV} = 5.40$ eV.

Example 5.6

For the electron of Example 5.5 in its ground state, what is the probability to find it in a narrow interval of width 0.004 nm located halfway between the equilibrium position and the classical turning point?

Solution

First we need to find the location of the turning point. At the classical turning points $x = \pm x_0$, the kinetic energy is zero and so the total energy is all potential. Thus $E = \frac{1}{2}kx_0^2$, and so

$$x_0 = \sqrt{\frac{2E}{k}} = \sqrt{\frac{2(1.35\ \text{eV})}{95.7\ \text{eV/nm}^2}} = 0.168\ \text{nm}$$

Evaluating the parameters of the wave function $Ae^{-ax^2}$ (the normalization constant $A$ and the exponential coefficient $a$), we have

$$A = \left( \frac{m\omega_0}{\hbar\pi} \right)^{1/4} = \left( \frac{mc^2\,\hbar\omega_0}{\hbar^2c^2\pi} \right)^{1/4} = \left( \frac{(0.511 \times 10^6\ \text{eV})(2.70\ \text{eV})}{(197\ \text{eV}\cdot\text{nm})^2\,\pi} \right)^{1/4} = 1.83\ \text{nm}^{-1/2}$$

$$a = \frac{\sqrt{km}}{2\hbar} = \frac{\sqrt{kmc^2}}{2\hbar c} = \frac{\sqrt{(95.7\ \text{eV/nm}^2)(0.511 \times 10^6\ \text{eV})}}{2(197\ \text{eV}\cdot\text{nm})} = 17.74\ \text{nm}^{-2}$$

The probability in the interval $dx = 0.004$ nm at $x = x_0/2 = 0.084$ nm is then

$$P(x)\,dx = |\psi(x)|^2\,dx = A^2 e^{-2ax^2}\,dx = (1.83\ \text{nm}^{-1/2})^2\,e^{-2(17.74\ \text{nm}^{-2})(0.084\ \text{nm})^2}\,(0.004\ \text{nm}) = 0.0104 = 1.04\%$$

As we did in the case of the infinite potential energy well, let’s look at the application of the uncertainty principle to the wave packet represented by the harmonic oscillator. Using the results of Problems 22 and 23 for the uncertainties in position and momentum for the ground state of the oscillator, $\Delta x = \sqrt{\hbar/2m\omega_0}$ and $\Delta p = \sqrt{\hbar\omega_0 m/2}$, the product of the uncertainties is $\Delta x\,\Delta p = \hbar/2$. This is the minimum possible value for this product, according to Eq. 4.10.
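The numbers in Examples 5.5 and 5.6 can be reproduced directly from the formulas above. The following sketch uses the same rounded constants as the examples ($\hbar c = 197$ eV·nm, $mc^2 = 0.511 \times 10^6$ eV); the variable names are hypothetical. Because it does not round intermediate values, its results may differ from the quoted ones in the last digit.

```python
import math

HBAR_C = 197.0     # hbar*c in eV*nm (rounded value used in Example 5.5)
MC2 = 0.511e6      # electron rest energy in eV
K_SPRING = 95.7    # effective spring constant in eV/nm^2

# Example 5.5: level spacing and ground-state energy.
hbar_omega0 = HBAR_C * math.sqrt(K_SPRING / MC2)
e_ground = 0.5 * hbar_omega0
e_absorbed = 2.0 * hbar_omega0           # ground state -> second excited state

# Example 5.6: turning point, wave-function parameters, and probability.
x0 = math.sqrt(2.0 * e_ground / K_SPRING)
a = math.sqrt(K_SPRING * MC2) / (2.0 * HBAR_C)
amp = (MC2 * hbar_omega0 / (HBAR_C**2 * math.pi)) ** 0.25
x = x0 / 2.0
exponent = 2.0 * a * x * x               # equals 1/4 exactly, since a * x0^2 = 1/2
prob = amp**2 * math.exp(-exponent) * 0.004

print("E0 =", round(e_ground, 3), "eV;  absorbed =", round(e_absorbed, 2), "eV")
print("x0 =", round(x0, 4), "nm;  a =", round(a, 2), "nm^-2;  A =", round(amp, 3), "nm^-1/2")
print("P dx =", round(prob, 4))
```

One detail that the sketch makes visible: at $x = x_0/2$ the exponent $2ax^2$ is exactly $1/4$ for any oscillator, because $a x_0^2 = (m\omega_0/2\hbar)(\hbar\omega_0/k) = 1/2$.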
The ground state of the oscillator thus represents the most “compact” wave packet in which the product of the uncertainties has its smallest value. You can see from Figure 5.22 that the excited states of the oscillator are much less compact (more spread out) than the ground state.

5.6 STEPS AND BARRIERS

In this general type of problem, we analyze what happens when a particle moving (again in one dimension) in a region of constant potential energy suddenly moves into a region of different, but also constant, potential energy. We will not discuss in detail the solutions to these problems, but the methods of solution of each are so similar that we can outline the steps to take in the solution. In this discussion, we let $E$ be the (fixed) total energy of the particle and $U_0$ will be the value of the constant potential energy. In these calculations, the particle is not confined, so the energy is not quantized; we are free to choose any value for the particle energy.

Potential Energy Step, $E > U_0$

Consider the potential energy step shown in Figure 5.23:

$$U(x) = \begin{cases} 0 & x < 0 \\ U_0 & x \ge 0 \end{cases} \tag{5.53}$$

FIGURE 5.23 A step of height $U_0$. Particles are incident from the left with energy $E$. The kinetic energy is equal to $E$ in the region $x < 0$ and is reduced to $E - U_0$ in the region $x > 0$.

If the total energy $E$ of the particle is greater than $U_0$, then we can write the solutions to the Schrödinger equation in the two regions based on the general form of Eq. 5.16:

$$\psi_0(x) = A \sin k_0 x + B \cos k_0 x \qquad k_0 = \sqrt{\frac{2mE}{\hbar^2}} \qquad x < 0 \tag{5.54a}$$

$$\psi_1(x) = C \sin k_1 x + D \cos k_1 x \qquad k_1 = \sqrt{\frac{2m}{\hbar^2}(E - U_0)} \qquad x > 0 \tag{5.54b}$$

Relationships among the four coefficients, $A$, $B$, $C$, and $D$, may be found by applying the condition that $\psi(x)$ and $\psi'(x) = d\psi/dx$ must be continuous at the boundary; thus $\psi_0(0) = \psi_1(0)$ and $\psi_0'(0) = \psi_1'(0)$. A typical solution might look like Figure 5.24.
Note the smooth transition between the solutions at $x = 0$, which results from applying the continuity conditions.

FIGURE 5.24 Wave function for electrons incident from the left on a potential energy step for $E > U_0$. The probability density and the real and imaginary parts of the wave function are shown for (a) $t = 0$ and (b) $t = 1/4$ period. The vertical line marks the location of the step.

The coefficients $A$, $B$, $C$, and $D$ are in general complex, so to visualize the complete wave we need both the real and imaginary parts of $\psi$. We can use the equation $e^{i\theta} = \cos\theta + i\sin\theta$ to transform these solutions from sines and cosines to complex exponentials:

$$\psi_0(x) = A'e^{ik_0x} + B'e^{-ik_0x} \qquad x < 0 \tag{5.55a}$$

$$\psi_1(x) = C'e^{ik_1x} + D'e^{-ik_1x} \qquad x > 0 \tag{5.55b}$$

The coefficients $A'$, $B'$, $C'$, $D'$ can be found from the coefficients $A$, $B$, $C$, $D$. The time dependent wave functions are obtained by multiplying each term by $e^{-i\omega t}$, which gives

$$\Psi_0(x, t) = A'e^{i(k_0x - \omega t)} + B'e^{-i(k_0x + \omega t)} \tag{5.56a}$$

$$\Psi_1(x, t) = C'e^{i(k_1x - \omega t)} + D'e^{-i(k_1x + \omega t)} \tag{5.56b}$$

We can then make the following identification of the component waves, recalling that $(kx - \omega t)$ is the phase of a wave moving in the positive $x$ direction, while $(kx + \omega t)$ is the phase of a wave moving in the negative $x$ direction, and assuming that the squared magnitude of each coefficient gives the intensity of the corresponding component wave. In the region $x < 0$, Eq. 5.56a describes the superposition of a wave $e^{i(k_0x - \omega t)}$ of intensity $|A'|^2$ moving in the positive $x$ direction (from $-\infty$ to 0) and a wave $e^{-i(k_0x + \omega t)}$ of intensity $|B'|^2$ moving in the negative $x$ direction. Suppose we had intended our solution to describe particles that are incident from the left on this step. Then $|A'|^2$ gives the intensity of the incident wave (more exactly, the de Broglie wave describing the incident beam of particles) and $|B'|^2$ gives the intensity of the reflected wave.
The ratio $|B'|^2/|A'|^2$ tells us the reflected fraction of the incident wave intensity. In the region $x > 0$, Eq. 5.56b describes the transmitted wave $e^{i(k_1x - \omega t)}$ of intensity $|C'|^2$ moving to the right and a wave $e^{-i(k_1x + \omega t)}$ of intensity $|D'|^2$ moving to the left. If particles are incident from $-\infty$, it is not possible to have particles in the region $x > 0$ moving to the left, so in this particular experimental situation we are justified in setting $D'$ to zero.

Figure 5.24a shows that the probability density has the same value everywhere in the region $x > 0$. You can see this immediately from Eq. 5.56b with $D' = 0$; taking the squared magnitude of the remaining term gives a constant result, independent of $x$ and $t$. This is consistent with what we expect for the de Broglie wave of free particles; the particles can be found anywhere in the region $x > 0$ with equal probability. In the region $x < 0$, the incident and reflected waves combine to produce a standing wave, for which the probability density has fixed maxima and minima. The probability density in this region does not vary with time, as suggested by the plots for the two different times ($t = 0$ and $t = 1/4$ period) shown in Figure 5.24.

To illustrate the propagation of the de Broglie wave, it is instructive also to plot the real and imaginary parts of the wave function, which are shown in Figure 5.24. Here you can see the change in wavelength (corresponding to the change in kinetic energy or momentum) in crossing the step. You can also see something of the time dependence: the wave propagates in both regions, but it does so in a way that the real and imaginary parts combine to give a probability density that remains unchanged in time.

Potential Energy Step, $E < U_0$

If the energy of the particle is less than the height of the potential energy step, then the solution in the region $x > 0$ is of the form of Eq.
5.18:  2mE k0 = x < 0 (5.57a) ψ0 (x) = A sin k0 x + B cos k0 x h−2  2m k1 x −k1 x k1 = −2 (U0 − E) x > 0 (5.57b) ψ1 (x) = Ce + De h We set C = 0 to keep ψ1 (x) from becoming infinite as x → ∞, and we apply the boundary conditions on ψ(x) and ψ ′ (x) at x = 0. The resulting solution is shown in Figure 5.25. The probability density for x > 0 shows penetration into the classically forbidden region. All particles are reflected from the barrier; that is, if we write 0 (x,t) in the form of Equation 5.56a, then we must have |A′ | = |B′ |, indicating that the waves moving to the right (the incident wave) and to the left (the reflected wave) in the region x < 0 have equal amplitudes. 5.6 | Steps and Barriers |ψ |2 Re(ψ ) |ψ |2 Re(ψ ) Im(ψ ) (a) Im(ψ ) (b) FIGURE 5.25 Wave function for electrons incident from the left on a potential energy step for E < U0 . The probability density and the real and imaginary parts of the wavefunction are shown for (a) t = 0 and (b) t = 1/4 period. The vertical line marks the location of the step. Figure 5.25 shows the probability density at two different times, illustrating that the probability density does not change with time. In the region x < 0 we again have standing waves with fixed maxima and minima. Viewing the real and imaginary parts at two different times (t = 0 and t = 1/4 period) shows that the wave is propagating, even though the probability density does not change with time. Penetration into the forbidden region is associated with the wave nature of the particle and also with the uncertainty in the particle’s energy or location. The probability density in the x > 0 region is |ψ1 |2 , which according to Eq. 5.57b is proportional to e−2k1 x . 
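To get a feel for how quickly the factor $e^{-2k_1 x}$ falls off, the following minimal sketch evaluates $k_1$ from Eq. 5.57b and tabulates the probability density relative to its value at the step. The particle (an electron), the value $U_0 - E = 1$ eV, and the function names are illustrative assumptions, not taken from the text.

```python
import math

HBAR_C_EV_NM = 197.327      # hbar * c in eV*nm
ELECTRON_MC2_EV = 0.511e6   # electron rest energy in eV (assumed particle)

def decay_constant(barrier_minus_energy_ev, mc2_ev=ELECTRON_MC2_EV):
    """Return k1 = sqrt(2 m (U0 - E)) / hbar in nm^-1 (Eq. 5.57b)."""
    if barrier_minus_energy_ev <= 0:
        raise ValueError("this form of the solution requires E < U0")
    # Multiply numerator and denominator by c to work in eV and nm.
    return math.sqrt(2.0 * mc2_ev * barrier_minus_energy_ev) / HBAR_C_EV_NM

def relative_density(x_nm, k1):
    """Return |psi1(x)|^2 / |psi1(0)|^2 = exp(-2 k1 x) for x >= 0."""
    return math.exp(-2.0 * k1 * x_nm)

# Assumed example: an electron with U0 - E = 1 eV.
k1 = decay_constant(1.0)
print(f"k1 = {k1:.3f} nm^-1")
for x in (0.0, 0.05, 0.1, 0.2, 0.5):
    print(f"x = {x:4.2f} nm   relative density = {relative_density(x, k1):.4f}")
```

With these assumed numbers the density falls to a small fraction of its value at the step within a few tenths of a nanometer, which is comparable to atomic dimensions.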
If we define a representative penetration distance $\Delta x$ to be the distance over which the probability drops by $1/e$, then $e^{-2k_1 \Delta x} = e^{-1}$ and so

$$\Delta x = \frac{1}{2k_1} = \frac{1}{2}\,\frac{\hbar}{\sqrt{2m(U_0 - E)}} \tag{5.58}$$

To be able to enter the region with $x > 0$, the particle must gain an energy of at least $U_0 - E$ in order to get over the potential energy step; it must in addition gain some kinetic energy if it is to move in the region $x > 0$. Of course, it is a violation of conservation of energy for the particle to spontaneously gain any amount of energy, but according to the uncertainty relationship $\Delta E\,\Delta t \sim \hbar$ conservation of energy does not apply at times smaller than $\Delta t$ except to within an amount $\Delta E \sim \hbar/\Delta t$. That is, if the particle “borrows” an amount of energy $\Delta E$ and “returns” the borrowed energy within a time $\Delta t \sim \hbar/\Delta E$, we observers will still believe energy is conserved.

Suppose the particle borrows an energy sufficient to give it a kinetic energy of $K$ in the forbidden region. How far into the forbidden region does the particle penetrate? The “borrowed” energy is $(U_0 - E) + K$; the energy $(U_0 - E)$ gets the particle to the top of the step, and the extra kinetic energy $K$ gives it its motion. The energy must be returned within a time

$$\Delta t = \frac{\hbar}{U_0 - E + K} \tag{5.59}$$

The particle moves with speed $v = \sqrt{2K/m}$, and so the distance it can travel is

$$\Delta x = \frac{1}{2} v\,\Delta t = \frac{1}{2}\sqrt{\frac{2K}{m}}\,\frac{\hbar}{U_0 - E + K} \tag{5.60}$$

(The factor of 1/2 is present because in the time $\Delta t$ the particle must penetrate the distance $\Delta x$ into the forbidden region and return through that same distance to the allowed region.)

In the limit $K \to 0$, the penetration distance $\Delta x$ goes to 0 according to Eq. 5.60 because the particle has zero velocity; similarly, $\Delta x \to 0$ in the limit $K \to \infty$, because it moves for a vanishing time interval $\Delta t$. In between those limits, there must be a maximum value of $\Delta x$ for some particular $K$. Differentiating Eq. 5.60 with respect to $K$, we can find the maximum value

$$\Delta x_{\max} = \frac{1}{2}\,\frac{\hbar}{\sqrt{2m(U_0 - E)}} \tag{5.61}$$

This value of $\Delta x$ is identical with Eq. 5.58! This demonstrates that the penetration into the forbidden region given by the solution to the Schrödinger equation is entirely consistent with the uncertainty relationship. (The agreement between Eqs. 5.58 and 5.61 is really somewhat accidental, because the factor $1/e$ used to obtain Eq. 5.58 was chosen arbitrarily. What we have really demonstrated is that the estimates of uncertainty given by the Heisenberg relationships are consistent with the wave properties of the particle obtained from the Schrödinger equation. This should not be surprising, because the uncertainty principle can be derived as a consequence of the Schrödinger equation.)

Potential Energy Barrier

Consider now the potential energy barrier shown in Figure 5.26:

$$U(x) = \begin{cases} 0 & x < 0 \\ U_0 & 0 \le x \le L \\ 0 & x > L \end{cases} \tag{5.62}$$

FIGURE 5.26 A barrier of height $U_0$ and width $L$.

FIGURE 5.27 The real part of the wave function of a particle of energy $E < U_0$ encountering a barrier (the particle is incident from the left in the figure). The wavelength $\lambda_0$ is the same on both sides of the barrier, but the amplitude beyond the barrier is much less than the original amplitude.

Particles with energy $E$ less than $U_0$ are incident from the left. Our experience then leads us to expect solutions of the form shown in Figure 5.27: sinusoidal oscillation in the region $x < 0$ (an incident wave and a reflected wave), exponentials in the region $0 \le x \le L$, and sinusoidal oscillations in the region $x > L$ (the transmitted wave). Note that the intensity of the transmitted wave ($x > L$) is much smaller than the intensity of the incident + reflected waves ($x < 0$), which means that most of the particles are reflected and few are transmitted through the barrier. Also note that the wavelengths are the same on either side of the barrier (because the kinetic energies are the same).
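How small the transmitted intensity becomes can be sketched numerically. The example below is a minimal illustration that assumes the standard closed-form transmission probability for a rectangular barrier with $E < U_0$, $T = [1 + U_0^2 \sinh^2(k_1 L)/4E(U_0 - E)]^{-1}$, which follows from the continuity conditions but is quoted here rather than derived in this section. The electron mass, the energies, the widths, and the function name are assumed for illustration only.

```python
import math

HBAR_C_EV_NM = 197.327      # hbar * c in eV*nm
ELECTRON_MC2_EV = 0.511e6   # electron rest energy in eV (assumed particle)

def transmission(energy_ev, u0_ev, width_nm, mc2_ev=ELECTRON_MC2_EV):
    """Transmission probability through a rectangular barrier for 0 < E < U0.

    Uses the standard result T = 1 / (1 + U0^2 sinh^2(k1 L) / (4 E (U0 - E))),
    with k1 = sqrt(2 m (U0 - E)) / hbar as in Eq. 5.57b.
    """
    if not 0.0 < energy_ev < u0_ev:
        raise ValueError("this expression applies only for 0 < E < U0")
    k1 = math.sqrt(2.0 * mc2_ev * (u0_ev - energy_ev)) / HBAR_C_EV_NM
    s = math.sinh(k1 * width_nm)
    return 1.0 / (1.0 + (u0_ev ** 2) * s ** 2 / (4.0 * energy_ev * (u0_ev - energy_ev)))

# Assumed example: 1 eV electrons on a 2 eV barrier of increasing width.
for width in (0.1, 0.2, 0.5, 1.0):
    print(f"L = {width:3.1f} nm   T = {transmission(1.0, 2.0, width):.3e}")
```

The steep drop of $T$ with increasing width reflects the exponential form of the wave function inside the barrier.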
The intensity of the transmitted wave, which can be found by application of the continuity conditions, depends on the energy of the particle and on the height and thickness of the barrier. Classically, the particles should never appear at $x > L$, because they do not have sufficient energy to overcome the barrier. This situation is an example of barrier penetration, sometimes called quantum mechanical tunneling. Particles cannot be observed while they are in the classically forbidden region $0 \le x \le L$, but can “tunnel” through that region and be observed at $x > L$.

Every particle incident on the barrier of Figure 5.26 is either reflected or transmitted; the number of incident particles is equal to the number reflected back to $x < 0$ plus the number transmitted to $x > L$. None are “trapped” or ever seen in the forbidden region $0 < x < L$. How can the incident particle get from $x < 0$ to $x > L$? As a classical particle, it can’t! However, the wave representing the particle can penetrate through the barrier, which allows the particle to be observed in the classically allowed region $x > L$.

FIGURE 5.28 (a) Total internal reflection of light waves at a glass-air boundary. (b) Frustrated total internal reflection. The thicker the air gap, the smaller the probability to penetrate. Note that the light beam does not appear in the gap.

This phenomenon of penetration of a forbidden region is a well-known property of classical waves. Quantum physics provides a new aspect to this phenomenon by associating a particle with the wave, and thus allowing a particle to pass through a classically forbidden region. An example of the penetration effect for classical waves occurs for total internal reflection∗ of light waves. Figure 5.28a shows a light beam in glass incident on a boundary with air. The beam is totally reflected in the glass.
However, if a second piece of glass is brought close to the first, as in Figure 5.28b, the beam can appear in the second piece of glass. This effect is called frustrated total internal reflection. The intensity of the beam in the second piece, represented by the widths of the arrows in Figure 5.28b, decreases rapidly as the thickness of the gap increases. Just like our unobservable quantum wave, which penetrates a few wavelengths into the forbidden region, an unobservable light wave of exponentially decreasing amplitude, the evanescent wave, penetrates into the air even when the light wave undergoes total reflection in the glass. The evanescent wave carries no energy away from the interface, so it cannot be directly observed in the air, but it can be observed in another medium such as a second piece of glass placed close to the first. Evanescent waves have applications in microscopy, where they enable the production of images of individual molecules.

∗ Total internal reflection occurs when a light beam is incident on a boundary between two substances, such as glass and air, from the side with the higher index of refraction. If the angle of incidence inside the glass exceeds a certain critical value, the light beam is totally reflected back into the glass.

Although the potential energy barrier of Figure 5.26 is rather artificial, there are many practical examples of quantum tunneling:

1. Alpha Decay. An atomic nucleus consists of protons and neutrons in a constant state of motion; occasionally these particles form themselves into an aggregate of two protons and two neutrons, called an alpha particle. In one form of radioactive decay, the nucleus can emit an alpha particle, which can be detected in the laboratory. However, in order to escape from the nucleus the alpha particle must penetrate a barrier of the form shown in Figure 5.29. The probability for the alpha particle to penetrate the barrier, and be detected in the laboratory, can be computed based on the energy of the alpha particle and the height and thickness of the barrier. The decay probability can be measured in the laboratory, and it is found to be in excellent agreement with the value obtained from a quantum-mechanical calculation based on barrier penetration.

FIGURE 5.29 (a) A nuclear potential energy barrier (an attractive nuclear potential inside, a repulsive Coulomb potential outside) is penetrated by an alpha particle of energy $E$. (b) A representation of the real part of the wave function of the alpha particle. The probability to penetrate the barrier depends strongly on the energy of the alpha particle.

2. Ammonia Inversion. Figure 5.30 is a representation of the ammonia molecule NH$_3$. If we were to try to move the nitrogen atom along the axis of the molecule, toward the plane of the hydrogen atoms, we would find a repulsion caused by the three hydrogen atoms, which produces a potential energy of the form shown in Figure 5.31. According to classical mechanics, unless we give the nitrogen atom sufficient energy, it should not be able to surmount the barrier and appear on the other side of the plane of hydrogens. According to quantum mechanics, the nitrogen can tunnel through the barrier and appear on the other side of the molecule. In fact, the nitrogen atom actually tunnels back and forth with a frequency in excess of $10^{10}$ oscillations per second.

3. The Tunnel Diode. A tunnel diode is an electronic device that uses the phenomenon of tunneling. Schematically, the potential energy of an electron in a tunnel diode can be represented by Figure 5.32. The current that flows through the device is produced by electrons tunneling through the barrier. The rate of tunneling, and therefore the current, can be regulated merely by changing the height of the barrier, which can be done with an applied voltage.
This can be done rapidly, so that switching frequencies in excess of $10^9$ Hz can be obtained. Ordinary semiconductor diodes depend on the diffusion of electrons across a junction, and therefore operate on much longer time scales (that is, at lower frequencies).

FIGURE 5.30 A schematic diagram of the ammonia molecule. The Coulomb repulsion of the three hydrogens establishes a barrier against the nitrogen atom moving to a symmetric position (shown in dashed lines) on the opposite side of the plane of hydrogens.

FIGURE 5.31 The potential energy seen by the nitrogen atom in an ammonia molecule, plotted against distance from the plane of hydrogens. The nitrogen can penetrate the barrier and move from one equilibrium position to another.

FIGURE 5.32 The potential energy barrier seen by an electron in a tunnel diode. The conductivity of the device is determined by the electron’s probability to penetrate the barrier, which depends on the height of the barrier (determined by the applied voltage).

4. The Scanning Tunneling Microscope. Images of individual atoms on the surface of materials (such as Figure 5.18) can be made with the scanning tunneling microscope. Electrons are trapped in a surface by a potential energy barrier (the work function of the material). When a needlelike probe is placed within about 1 nm of the surface (Figure 5.33), electrons can tunnel through the barrier between the surface and the probe and produce a current that can be recorded in an external circuit. The current is very sensitive to the width of the barrier (the distance from the probe to the surface). In practice, a feedback mechanism keeps the current constant by moving the tip up and down. The motion of the tip gives a map of the surface that reveals details smaller than 0.01 nm, about 1/100 the diameter of an atom!
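The sensitivity of the tunneling current to the gap width can be illustrated with a rough estimate. The sketch below assumes the current scales as $e^{-2k_1 d}$, the same exponential factor that governs the probability density in a forbidden region, with $k_1$ evaluated for a barrier height equal to the work function. The 4 eV work function and the function names are illustrative assumptions, and a real instrument depends on further details (tip shape, applied bias) that are ignored here.

```python
import math

HBAR_C_EV_NM = 197.327      # hbar * c in eV*nm
ELECTRON_MC2_EV = 0.511e6   # electron rest energy in eV

def decay_constant(barrier_ev):
    """Return k1 = sqrt(2 m * barrier) / hbar in nm^-1 for an electron."""
    return math.sqrt(2.0 * ELECTRON_MC2_EV * barrier_ev) / HBAR_C_EV_NM

def current_ratio(delta_gap_nm, work_function_ev=4.0):
    """Return I(d + delta) / I(d), assuming I is proportional to exp(-2 k1 d).

    The 4 eV default work function is an assumed, typical-order value.
    """
    return math.exp(-2.0 * decay_constant(work_function_ev) * delta_gap_nm)

for delta in (0.01, 0.05, 0.1):
    print(f"gap increase {delta:4.2f} nm -> current falls to "
          f"{100.0 * current_ratio(delta):.1f}% of its original value")
```

Under these assumptions even a 0.01 nm change in the gap changes the current by a readily measurable amount, which is why holding the current constant lets the tip follow the surface so closely.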
For the development of the scanning tunneling microscope, Gerd Binnig and Heinrich Rohrer were awarded the 1986 Nobel Prize in physics.

FIGURE 5.33 In a scanning tunneling microscope, a needlelike probe is scanned over a surface. The probe is moved vertically so that the distance between the probe and the surface remains constant as the probe scans laterally.

Chapter Summary

Time-independent Schrödinger equation: $-\dfrac{\hbar^2}{2m}\dfrac{d^2\psi}{dx^2} + U(x)\psi(x) = E\psi(x)$ (Section 5.3)
Time-dependent Schrödinger equation: $\Psi(x,t) = \psi(x)e^{-i\omega t}$ (Section 5.3)
Probability density: $P(x) = |\psi(x)|^2$ (Section 5.3)
Normalization condition: $\int_{-\infty}^{+\infty} |\psi(x)|^2\,dx = 1$ (Section 5.3)
Probability in interval $x_1$ to $x_2$: $P(x_1\!:\!x_2) = \int_{x_1}^{x_2} |\psi(x)|^2\,dx$ (Section 5.3)
Average or expectation value of $f(x)$: $[f(x)]_{\text{av}} = \int_{-\infty}^{+\infty} |\psi(x)|^2 f(x)\,dx$ (Section 5.3)
Constant potential energy, $E > U_0$: $\psi(x) = A\sin kx + B\cos kx$, $k = \sqrt{2m(E - U_0)/\hbar^2}$ (Section 5.4)
Constant potential energy, $E < U_0$: $\psi(x) = Ae^{k'x} + Be^{-k'x}$, $k' = \sqrt{2m(U_0 - E)/\hbar^2}$ (Section 5.4)
Infinite potential energy well: $\psi_n(x) = \sqrt{\dfrac{2}{L}}\sin\dfrac{n\pi x}{L}$, $E_n = \dfrac{h^2 n^2}{8mL^2}$ $(n = 1, 2, 3, \ldots)$ (Section 5.4)
Two-dimensional infinite well: $\psi(x,y) = \dfrac{2}{L}\sin\dfrac{n_x\pi x}{L}\sin\dfrac{n_y\pi y}{L}$, $E = \dfrac{h^2}{8mL^2}(n_x^2 + n_y^2)$ (Section 5.4)
Simple harmonic oscillator ground state: $\psi(x) = (m\omega_0/\hbar\pi)^{1/4}\,e^{-(\sqrt{km}/2\hbar)x^2}$ (Section 5.5)
Simple harmonic oscillator energies: $E_n = (n + \tfrac{1}{2})\hbar\omega_0$ $(n = 0, 1, 2, \ldots)$ (Section 5.5)
Potential energy step, $E > U_0$: $\psi_0(x < 0) = A\sin k_0 x + B\cos k_0 x$, $\psi_1(x > 0) = C\sin k_1 x + D\cos k_1 x$ (Section 5.6)
Potential energy step, $E < U_0$: $\psi_0(x < 0) = A\sin k_0 x + B\cos k_0 x$, $\psi_1(x > 0) = Ce^{k_1 x} + De^{-k_1 x}$ (Section 5.6)

Questions

1. Newton’s laws can be solved to give the future behavior of a particle. In what sense does the Schrödinger equation also do this? In what sense does it not?
2. Why is it important for a wave function to be normalized? Is an unnormalized wave function a solution to the Schrödinger equation?
3. What is the physical meaning of $\int_{-\infty}^{+\infty} |\psi|^2\,dx = 1$?
4.
What are the dimensions of $\psi(x)$? Of $\psi(x, y)$?
5. None of the following are permitted as solutions of the Schrödinger equation. Give the reasons in each case.
(a) $\psi(x) = A\cos kx$ for $x < 0$; $\psi(x) = B\sin kx$ for $x > 0$
(b) $\psi(x) = Ax^{-1}e^{-kx}$ for $-L \le x \le L$
(c) $\psi(x) = A\sin^{-1} kx$
(d) $\psi(x) = A\tan kx$ for $x > 0$
6. What happens to the probability density in the infinite well when $n \to \infty$? Is this consistent with classical physics?
7. How would the solution to the infinite potential energy well be different if the well extended from $x = x_0$ to $x = x_0 + L$, where $x_0$ is a nonzero value of $x$? Would any of the measurable properties be different?
8. How would the solution to the one-dimensional infinite potential energy well be different if the potential energy were not zero for $0 \le x \le L$ but instead had a constant value $U_0$? What would be the energies of the excited states? What would be the wavelengths of the standing de Broglie waves? Sketch the behavior of the lowest two wave functions.
9. Assuming a pendulum to behave like a quantum oscillator, what are the energy differences between the quantum states of a pendulum of length 1 m? Are such differences observable?
10. For the potential energy barrier (Figure 5.26), is the wavelength for $x > L$ the same as the wavelength for $x < 0$? Is the amplitude the same?
11. Suppose particles were incident on the potential energy step from the positive $x$ direction. Which of the four coefficients of Eq. 5.56 would be set to zero? Why?
12. The energies of the excited states of the systems we have discussed in this chapter have been exact; there is no energy uncertainty. What does this suggest about the lifetime of particles in those excited states? Left on its own, will a particle ever make transitions from one state to another?
13. Explain how the behavior of a particle in a one-dimensional infinite well can be considered in terms of standing de Broglie waves.
14. How would you design an experiment to observe barrier penetration with sound waves? What range of thicknesses would you choose for the barrier?
15. If $U_0$ were negative in Figure 5.26, how would the wave functions appear for $E > 0$?
16. Does Eq. 5.2 imply that we know the momentum of the particle exactly? If so, what does the uncertainty principle indicate about our knowledge of its location? How can you reconcile this with our knowledge that the particle must be in the well?
17. Do sharp boundaries and discontinuous jumps of potential energy occur in nature? If not, how would our analysis of potential energy steps and barriers be different?

Problems

5.1 Behavior of a Wave at a Boundary

1. A ball falls from rest at a height $H$ above a lake. Let $y = 0$ at the surface of the lake. As it falls, it experiences a gravitational force $-mg$. When it enters the water, it experiences a buoyant force $B$ so the net force in the water is $B - mg$. (a) Write expressions for $v(t)$ and $y(t)$ while the ball is falling in air. (b) In the water, let $v_2(t) = at + b$ and $y_2(t) = \tfrac{1}{2}at^2 + bt + c$ where $a = (B - mg)/m$. Use the continuity conditions at the surface of the water to find the constants $b$ and $c$.
2. A wave has the form $y = A\cos(2\pi x/\lambda + \pi/3)$ when $x < 0$. For $x > 0$, the wavelength is $\lambda/2$. By applying continuity conditions at $x = 0$, find the amplitude (in terms of $A$) and phase of the wave in the region $x > 0$. Sketch the wave, showing both $x < 0$ and $x > 0$.

5.2 Confining a Particle

3. The lowest energy of a particle in an infinite one-dimensional well is 4.4 eV. If the width of the well is doubled, what is its lowest energy?
4. An electron is trapped in an infinite well of width 0.120 nm. What are the three longest wavelengths permitted for the electron’s de Broglie waves?
5. An electron is trapped in a one-dimensional region of width 0.050 nm. Find the three smallest possible values allowed for the energy of the electron.
6. What is the minimum energy of a neutron ($mc^2 = 940$ MeV) confined to a region of space of nuclear dimensions ($1.0 \times 10^{-14}$ m)?
5.3 The Schrödinger Equation

7. In the region $0 \le x \le a$, a particle is described by the wave function $\psi_1(x) = -b(x^2 - a^2)$. In the region $a \le x \le w$, its wave function is $\psi_2(x) = (x - d)^2 - c$. For $x \ge w$, $\psi_3(x) = 0$. (a) By applying the continuity conditions at $x = a$, find $c$ and $d$ in terms of $a$ and $b$. (b) Find $w$ in terms of $a$ and $b$.
8. A particle is described by the wave function $\psi(x) = b(a^2 - x^2)$ for $-a \le x \le +a$ and $\psi(x) = 0$ for $x \le -a$ and $x \ge +a$, where $a$ and $b$ are positive real constants. (a) Using the normalization condition, find $b$ in terms of $a$. (b) What is the probability to find the particle at $x = +a/2$ in a small interval of width $0.010a$? (c) What is the probability for the particle to be found between $x = +a/2$ and $x = +a$?
9. In a certain region of space, a particle is described by the wave function $\psi(x) = Cxe^{-bx}$ where $C$ and $b$ are real constants. By substituting into the Schrödinger equation, find the potential energy in this region and also find the energy of the particle. (Hint: Your solution must give an energy that is a constant everywhere in this region, independent of $x$.)
10. A particle is represented by the following wave function:

$$\psi(x) = \begin{cases} 0 & x < -L/2 \\ C(2x/L + 1) & -L/2 < x < 0 \\ C(-2x/L + 1) & 0 < x < +L/2 \\ 0 & x > +L/2 \end{cases}$$

(a) Use the normalization condition to find $C$. (b) Evaluate the probability to find the particle in an interval of width $0.010L$ at $x = L/4$ (that is, between $x = 0.245L$ and $x = 0.255L$). (No integral is necessary for this calculation.) (c) Evaluate the probability to find the particle between $x = 0$ and $x = +L/4$. (d) Find the average value of $x$ and the rms value of $x$: $x_{\text{rms}} = \sqrt{(x^2)_{\text{av}}}$.

5.4 Applications of the Schrödinger Equation

11. A particle in an infinite well is in the ground state with an energy of 1.26 eV. How much energy must be added to the particle to reach the second excited state ($n = 3$)? The third excited state ($n = 4$)?
12. An electron is trapped in an infinitely deep one-dimensional well of width 0.251 nm.
Initially the electron occupies the $n = 4$ state. (a) Suppose the electron jumps to the ground state with the accompanying emission of a photon. What is the energy of the photon? (b) Find the energies of other photons that might be emitted if the electron takes other paths between the $n = 4$ state and the ground state.
13. Show that Eq. 5.31 gives the value $A = \sqrt{2/L}$.
14. A particle is trapped in an infinite one-dimensional well of width $L$. If the particle is in its ground state, evaluate the probability to find the particle (a) between $x = 0$ and $x = L/3$; (b) between $x = L/3$ and $x = 2L/3$; (c) between $x = 2L/3$ and $x = L$.
15. A particle is confined between rigid walls separated by a distance $L = 0.189$ nm. The particle is in the second excited state ($n = 3$). Evaluate the probability to find the particle in an interval of width 1.00 pm located at: (a) $x = 0.188$ nm; (b) $x = 0.031$ nm; (c) $x = 0.079$ nm. (Hint: No integrations are required for this problem; use Eq. 5.7 directly.) What would be the corresponding results for a classical particle?
16. What is the next level (above $E = 50E_0$) of the two-dimensional particle in a box in which the degeneracy is greater than 2?
17. A particle is confined to a two-dimensional box of length $L$ and width $2L$. The energy values are $E = (\hbar^2\pi^2/2mL^2)(n_x^2 + n_y^2/4)$. Find the two lowest degenerate levels.
18. Show by direct substitution that Eq. 5.39 gives a solution to the two-dimensional Schrödinger equation, Eq. 5.37. Find the relationship between $k_x$, $k_y$, and $E$.
19. A particle is confined to a three-dimensional region of space of dimensions $L$ by $L$ by $L$. The energy levels are $(\hbar^2\pi^2/2mL^2)(n_x^2 + n_y^2 + n_z^2)$, where $n_x$, $n_y$, and $n_z$ are integers $\ge 1$. Sketch an energy level diagram, showing the energies, quantum numbers, and degeneracies for the lowest 10 energy levels.

5.5 The Simple Harmonic Oscillator

20.
Using the normalization condition, show that the constant $A$ has the value $(m\omega_0/\hbar\pi)^{1/4}$ for the one-dimensional simple harmonic oscillator in its ground state.
21. (a) At the classical turning points $\pm x_0$ of the simple harmonic oscillator, $K = 0$ and so $E = U$. From this relationship, show that $x_0 = (\hbar\omega_0/k)^{1/2}$ for an oscillator in its ground state. (b) Find the turning points in the first and second excited states.
22. Use the ground-state wave function of the simple harmonic oscillator to find $x_{\text{av}}$, $(x^2)_{\text{av}}$, and $\Delta x$. Use the normalization constant $A = (m\omega_0/\hbar\pi)^{1/4}$.
23. (a) Using a symmetry argument rather than a calculation, determine the value of $p_{\text{av}}$ for a simple harmonic oscillator. (b) Conservation of energy for the harmonic oscillator can be used to relate $p^2$ to $x^2$. Use this relation, along with the value of $(x^2)_{\text{av}}$ from Problem 22, to find $(p^2)_{\text{av}}$ for the oscillator in its ground state. (c) Using the results of parts a and b, show that $\Delta p = \sqrt{\hbar\omega_0 m/2}$.
24. The ground state energy of an oscillating electron is 1.24 eV. How much energy must be added to the electron to move it to the second excited state? The fourth excited state?
25. Compare the probabilities for an oscillating particle in its ground state to be found in a small interval of width $dx$ at the center of the well and at the classical turning points.

5.6 Steps and Barriers

26. Find the value of $K$ at which Eq. 5.60 has its maximum value, and show that Eq. 5.61 is the maximum value of $\Delta x$.
27. For a particle with energy $E < U_0$ incident on the potential energy step, use $\psi_0$ and $\psi_1$ from Eqs. 5.57, and evaluate the constants $B$ and $D$ in terms of $A$ by applying the boundary conditions at $x = 0$.
28. Using the wave functions of Eq. 5.55 for the potential energy step, apply the boundary conditions of $\psi$ and $d\psi/dx$ to find $B'$ and $C'$ in terms of $A'$, for the potential step when particles are incident from the negative $x$ direction.
Evaluate the ratios $|B'|^2/|A'|^2$ and $|C'|^2/|A'|^2$ and interpret.
29. (a) Write down the wave functions for the three regions of the potential energy barrier (Figure 5.26) for $E < U_0$. You will need six coefficients in all. Use complex exponential notation. (b) Use the boundary conditions at $x = 0$ and at $x = L$ to find four relationships among the six coefficients. (Do not try to solve these relationships.) (c) Suppose particles are incident on the barrier from the left. Which coefficient should be set to zero? Why?
30. Repeat Problem 29 for the potential energy barrier when $E > U_0$, and sketch a representative probability density that shows several cycles of the wave function. In your sketch, make sure the amplitude and wavelength in each region accurately describe the situation.

FIGURE 5.34 Problem 32.

General Problems

31. An electron is trapped in a one-dimensional well of width 0.132 nm. The electron is in the $n = 10$ state. (a) What is the energy of the electron? (b) What is the uncertainty in its momentum? (Hint: Use Eq. 4.10.) (c) What is the uncertainty in its position? How do these results change as $n \to \infty$? Is this consistent with classical behavior?
32. Sketch the form of a possible solution to the Schrödinger equation for each of the potential energies shown in Figure 5.34. The potential energies go to infinity at the boundaries. In each case show several cycles of the wave function. In your sketches, pay attention to the continuity conditions (where applicable) and to changes in the wavelength and amplitude.
33. Show that the average value of $x^2$ in the one-dimensional infinite potential energy well is $L^2(1/3 - 1/2n^2\pi^2)$.
34. Use the result of Problem 33 to show that, for the infinite one-dimensional well, defining $\Delta x = \sqrt{(x^2)_{\text{av}} - (x_{\text{av}})^2}$ gives $\Delta x = L\sqrt{1/12 - 1/2\pi^2 n^2}$.
35. (a) In the infinite one-dimensional well, what is $p_{\text{av}}$? (Use a symmetry argument.) (b) What is $(p^2)_{\text{av}}$? [Hint: What is $(p^2/2m)_{\text{av}}$?]
(c) Defining $\Delta p = \sqrt{(p^2)_{\text{av}} - (p_{\text{av}})^2}$, show that $\Delta p = hn/2L$.
36. The first excited state of the harmonic oscillator has a wave function of the form $\psi(x) = Axe^{-ax^2}$. Follow the method outlined in Section 5.5 to find $a$ and the energy $E$. Find the constant $A$ from the normalization condition.
37. Using the normalization constant $A$ from Problem 20 and the value of $a$ from Eq. 5.49, evaluate the probability to find an oscillator in the ground state beyond the classical turning points $\pm x_0$. This problem cannot be solved in closed, analytic form. Develop an approximate, numerical method using a graph, calculator, or computer. Assume an electron bound to an atomic-sized region ($x_0 = 0.1$ nm) with an effective force constant of 1.0 eV/nm$^2$.
38. A two-dimensional harmonic oscillator has energy $E = \hbar\omega_0(n_x + n_y + 1)$, where $n_x$ and $n_y$ are integers beginning with zero. (a) Justify this result based on the energy of the one-dimensional oscillator. (b) Sketch an energy-level diagram similar to Figure 5.21, showing the lowest 4 energy levels. For each level, show the value of $E$ (in units of $\hbar\omega_0$), the quantum numbers $n_x$ and $n_y$, and the degeneracy. (c) Show that the degeneracy of each level is equal to $n_x + n_y + 1$.

Chapter 6

THE RUTHERFORD-BOHR MODEL OF THE ATOM

This model of the atom, based on the work of Rutherford and Bohr, shows electrons circulating about the nucleus like planets circulating about the Sun. It can be a useful model for some purposes, but it does not represent even approximately the structure of real atoms. In Chapters 7 and 8 we will learn more about the behavior and properties of electrons in atoms.

Our goal in this chapter is to understand some of the details of atomic structure that can be learned from experimental studies of atoms.
In particular, we discuss two types of experiments that are important in the development of our theory of atomic structure: the scattering of charged particles by atoms, which tells us about the distribution of electric charge in atoms, and the emission or absorption of radiation by atoms, which tells us about their excited states.

We use the information obtained from these experiments to develop an atomic model, which helps us understand and explain the properties of atoms. A model is usually an oversimplified picture of a more complex system, which provides some insight into its operation but may not be sufficiently detailed to explain all of its properties. In this chapter, we discuss the experiments that led to the Rutherford-Bohr model (also known simply as the Bohr model), which is based on the familiar “planetary” structure in which the electrons orbit about the nucleus like planets about the Sun. Even though this model is not strictly valid from the standpoint of wave mechanics, it does help us understand many atomic properties, especially the excited states of the simplest atom, hydrogen. In Chapter 7, we show how wave mechanics changes our picture of the hydrogen atom, and in Chapter 8 we consider the structure of more complicated atoms.

6.1 BASIC PROPERTIES OF ATOMS

Before we begin to construct a model of the atom, it is helpful to summarize some of the basic properties of atoms.

1. Atoms are very small, about 0.1 nm ($0.1 \times 10^{-9}$ m) in radius. Thus any effort to “see” an atom using visible light ($\lambda = 500$ nm) is hopeless owing to diffraction effects. We can make a crude estimate of the maximum size of an atom in the following way. Consider a cube of elemental matter, for example iron. Iron has a density of about 8 g/cm$^3$ and a molar mass of 56 g. One mole of iron (56 g) contains Avogadro’s number of atoms, about $6 \times 10^{23}$. Thus $6 \times 10^{23}$ atoms occupy about 7 cm$^3$ and so 1 atom occupies about $10^{-23}$ cm$^3$.
If we assume the atoms of a solid are packed together in the most efficient possible way, like hard spheres in contact, then the diameter of one atom is about $\sqrt[3]{10^{-23}\ \text{cm}^3} = 2 \times 10^{-8}$ cm $= 0.2$ nm.

2. Atoms are stable: they do not spontaneously break apart into smaller pieces or collapse; therefore the internal forces that hold the atom together must be in equilibrium. This immediately tells us that the forces that pull the parts of an atom together must be opposed in some way; otherwise atoms would collapse.

3. Atoms contain negatively charged electrons, but are electrically neutral. If we disturb an atom or collection of atoms with sufficient force, electrons are emitted. We learn this fact from studying the Compton effect and the photoelectric effect. We also learned in Chapter 4 that even though electrons are emitted from the nuclei of atoms in certain radioactive decay processes, they don’t “exist” in those nuclei but are manufactured there by some process. Electrons were excluded from the nucleus based on the uncertainty principle, which forbids emitted electrons of the energies observed in the laboratory from existing in the nucleus (see Example 4.7). The uncertainty principle places no such restriction on the existence of electrons in a volume as large as an atom (see Problem 1). We can also easily observe that bulk matter is electrically neutral, and we assume that this is likewise a property of the atoms. Experiments with beams of individual atoms support this assumption. From these experimental facts we deduce that an atom with $Z$ negatively charged electrons must also contain a net positive charge of $Ze$.

4. Atoms emit and absorb electromagnetic radiation. This radiation may take many forms: visible light ($\lambda \sim 500$ nm), X rays ($\lambda \sim 1$ nm), ultraviolet rays ($\lambda \sim 10$ nm), infrared rays ($\lambda \sim 0.1\ \mu$m), and so forth.
In fact it is from observation of these emitted and absorbed radiations, which can be measured with great precision, that we learn most of what we know about atoms. In a typical emission measurement, an electric current is passed through a glass tube containing a small sample of the gas phase of the element under study, and radiation is emitted when an excited atom returns to its ground state. The absorption wavelengths can be measured by passing a beam of white light through a sample of the gas and noting which colors are removed from the white light by absorption in the gas. One particularly curious feature of the atomic radiations is that atoms don’t always emit and absorb radiations at the same wavelengths: some wavelengths present in the emission experiment do not also appear in the absorption experiment. Any successful theory of atomic structure must be able to account for these emission and absorption wavelengths.

6.2 SCATTERING EXPERIMENTS AND THE THOMSON MODEL

An early model of the structure of the atom was proposed (in 1904) by J. J. Thomson, who was known for his previous identification of the electron and measurement of its charge-to-mass ratio e/m. The Thomson model incorporates many of the known properties of atoms: size, mass, number of electrons, and electrical neutrality. In this model, an atom contains Z electrons that are embedded in a uniform sphere of positive charge (Figure 6.1). The total positive charge of the sphere is Ze, the mass of the sphere is essentially the mass of the atom (the electrons don’t contribute significantly to the total mass), and the radius R of the sphere is the radius of the atom. (This model is sometimes known as the “plum-pudding” model, because the electrons are distributed throughout the atom like raisins in a plum pudding.) As we will see, the Thomson model gives predictions that disagree with experiment, and so it is not the correct way of understanding the structure of atoms.
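The radius R that enters the Thomson model is the atomic size estimated from bulk density in Section 6.1. As a minimal sketch, that crude estimate for iron can be reproduced as follows. The function name `atom_diameter_cm` is illustrative only, and the inputs are the rounded values quoted in Section 6.1 (density about 8 g/cm$^3$, molar mass 56 g, Avogadro's number about $6 \times 10^{23}$).

```python
AVOGADRO = 6.0e23  # atoms per mole, rounded as in Section 6.1


def atom_diameter_cm(density_g_cm3, molar_mass_g):
    """Edge length (cm) of the cube of volume available to one atom.

    This treats the solid as tightly packed, so the result is an upper
    bound on the atomic diameter, not a measurement of it.
    """
    molar_volume = molar_mass_g / density_g_cm3   # cm^3 occupied by one mole
    volume_per_atom = molar_volume / AVOGADRO     # cm^3 available to one atom
    return volume_per_atom ** (1.0 / 3.0)


# Iron, with the rounded values used in the text.
d_cm = atom_diameter_cm(density_g_cm3=8.0, molar_mass_g=56.0)
d_nm = d_cm * 1.0e7                               # 1 cm = 1e7 nm
print(f"estimated diameter: {d_nm:.2f} nm")       # close to the 0.2 nm quoted
```

Other elements can be tried by changing the two inputs; the point of the sketch is only that any ordinary solid gives a few tenths of a nanometer.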
FIGURE 6.1 The Thomson model of the atom. Z electrons are embedded in a uniform sphere of positive charge Ze and radius R. An imaginary spherical surface of radius r contains a fraction $r^3/R^3$ of the positive charge.

One way of studying atoms is by probing the distribution of electric charge in their interior, which we can do by bombarding the atom with charged particles and observing the angle by which particles are deflected from their original direction. This type of experiment is called a scattering experiment. Ideally we would like to do this experiment with a single atom, such as is represented in Figure 6.2. The scattering angle θ depends on the impact parameter b, which measures the distance from the center of the atom that a projectile would pass if it were not deflected. Each different value of the impact parameter results in a different value of the scattering angle.

FIGURE 6.2 A positively charged particle is deflected by an angle θ as it passes through a positively charged sphere, representing a Thomson model atom. The scattering angle depends on the value of the impact parameter b, which varies from 0 to R.

FIGURE 6.3 Scattering by a thin foil. Some individual scatterings tend to increase θ, while others tend to decrease θ.

The particle is deflected from its original trajectory by the electrical forces exerted on the particle by the atom. For a positively charged particle, these forces are: (1) a repulsive force due to the positive charge of the atom, and (2) an attractive force due to the negatively charged electrons. We assume that the mass of the deflected particle is much greater than the mass of an electron but also much less than the mass of the atom. In the encounter between the projectile and an electron,
the forces exerted on each by the other are equal and opposite (by Newton’s third law), and so the principal victim of the encounter is the much less massive electron; the effect on the projectile is negligible. (Imagine rolling a bowling ball through a field of Ping-Pong balls!) We thus need consider only the positively charged atom as a cause of the deflection of the particle. By the same argument, we neglect any possible motion of the more massive atom caused by the passage of the projectile. The basic experiment, then, is the scattering of a positively charged projectile by the stationary positively charged massive part of the atom.

In practice we cannot do the experiment with one atom. Instead, we bombard a thin foil, as in Figure 6.3. The scattering angle θ that we observe in the laboratory is the result of scattering by many atoms, with impact parameters that we do not know and cannot control. Let’s assume that for a single atom the average scattering angle is $\theta_{\text{av}}$, which represents an average over all possible impact parameters from zero up to the atomic radius R. For a typical foil thickness of 1 μm ($10^{-6}$ m), the projectile is scattered by about $10^4$ atoms. The total scattering angle θ is determined by statistical considerations, because some of the individual scatterings move the projectile toward larger scattering angles and some toward smaller angles, as represented in Figure 6.3. This is an example of a “random walk” problem: for N scatterings, the most likely observed net scattering angle θ is related to the average individual scattering angle by

$$\theta \simeq \sqrt{N}\,\theta_{\text{av}} \tag{6.1}$$

According to the Thomson model, the average scattering angle for a single atom is on the order of 0.01°, and for a foil that is $10^4$ atoms thick the net scattering angle should be about 1°. This is consistent with experimental observations. The most critical test of the Thomson model, which it fails completely, occurs when we examine the probability for scattering at large angles.
If each individual scattering deflects the projectile through an angle of around 0.01°, then to observe projectiles scattered through a total angle greater than 90°, we must have about $10^4$ successive scatterings, all of which push the projectile toward larger angles. Because the probabilities of individual scatterings toward either larger or smaller angles are equal, the probability of having $10^4$ successive scatterings toward larger angles, like the probability of finding $10^4$ successive heads in tossing a coin, is about $(1/2)^{10{,}000} = 10^{-3000}$. An experiment to observe this scattering was performed by Hans Geiger and Ernest Marsden in the laboratory of Ernest Rutherford at Manchester University in 1910. For projectiles they used alpha particles, which are nuclei of helium (of charge +2e) emitted in radioactive decay. Their results showed that the probability of an alpha particle scattering at angles greater than 90° was about $10^{-4}$. This remarkable discrepancy between the expected value based on the Thomson model ($10^{-3000}$) and the observed value ($10^{-4}$) was described by Rutherford in this way:

It was quite the most incredible event that ever happened to me in my life. It was as incredible as if you fired a 15-inch shell at a piece of tissue paper and it came back and hit you.

[Photo caption] Ernest Rutherford (1871–1937, England). Founder of nuclear physics, he is known for his pioneering work on alpha-particle scattering and radioactive decays. His inspiring leadership influenced a generation of British nuclear and atomic scientists.

The analysis of the results of such scattering experiments led Rutherford to propose that the mass and positive charge of the atom are not distributed uniformly over the volume of the atom, but instead are concentrated in an extremely small region, about $10^{-14}$ m in diameter, at the center of the atom. In Section 6.3 we will see how this proposal is consistent with the large-angle scattering results.
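The two numerical claims in this argument, the random-walk net angle of Eq. 6.1 and the coin-toss probability, can be checked with a few lines of arithmetic. The sketch below is only illustrative: the function names are made up for this example, and the inputs (an average single-atom angle of 0.01° and $10^4$ scatterings) are the order-of-magnitude values quoted in the text. The probability is handled as a base-10 logarithm because $(1/2)^{10{,}000}$ is far too small to represent as a floating-point number.

```python
import math


def net_angle_deg(theta_av_deg, n_scatterings):
    """Random-walk estimate of Eq. 6.1: theta ~ sqrt(N) * theta_av."""
    return math.sqrt(n_scatterings) * theta_av_deg


def log10_prob_all_same_way(n_scatterings):
    """log10 of (1/2)**N, the chance that N successive deflections all add up."""
    return n_scatterings * math.log10(0.5)


theta_av = 0.01   # degrees per atom, Thomson-model estimate quoted in the text
n = 10**4         # atoms traversed in a foil about 1 micrometer thick

print("net angle (deg):", net_angle_deg(theta_av, n))          # about 1 degree
print("log10 P(Thomson):", log10_prob_all_same_way(n))         # about -3010
print("log10 P(observed):", math.log10(1e-4))                  # -4
```

The exponent comes out near −3010 rather than exactly −3000; the text rounds it, and the roughly three-thousand-order-of-magnitude gap from the observed value is unaffected.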
Scattering in the Thomson Model (Optional)

Let’s assume that a projectile of positive charge ze is incident on an atom of radius R that we represent according to the Thomson model as a uniform sphere of positive charge Ze. The force on the projectile when it is a distance r from the center of the atom can be computed using Gauss’s law (see Problem 2):

$$F = \frac{zZe^2}{4\pi\varepsilon_0 R^3}\, r \tag{6.2}$$

Before discussing the scattering, we should note that this equation can also describe (if we put z = 1) the force on an electron embedded in the Thomson atom at a distance r from its center. This force can be written F = kr with $k = Ze^2/4\pi\varepsilon_0 R^3$. This linear restoring force permits the electrons to oscillate about their equilibrium positions just like a mass on a spring subject to the linear restoring force F = kx. We therefore expect the electrons in the Thomson atom to oscillate about their equilibrium positions with a frequency $f = (2\pi)^{-1}\sqrt{k/m}$, where k is the force constant. Because an oscillating electric charge radiates electromagnetic waves whose frequency is identical to the oscillation frequency, we might expect, based on the Thomson model, that the radiation emitted by atoms would show this characteristic frequency. This turns out not to be true (see Problem 3); the calculated frequencies do not correspond to the frequencies observed for radiation emitted by atoms.

The exact calculation of the scattering angle for different values of the impact parameter in the Thomson model of the atom is fairly complicated, but for our purposes we want only an estimate of the average value of the angle. As we will find out later, it’s not very important if our estimate is off by a small factor. Initially the projectile moves in the x direction in the geometry of Figure 6.2, but the atom exerts a force in the y direction that produces a small component of momentum $\Delta p_y$ in that direction.
Using Newton’s second law we can find the momentum from the impulse received by the projectile due to the electrostatic force:

$$\Delta p_y = \int F_y\, dt \tag{6.3}$$

Rather than carry out this complicated integral for a force that is changing in magnitude and direction as the projectile travels, we’ll estimate the average scattering angle by choosing an average value for the impact parameter, namely b = R/2 (representing the middle trajectory of Figure 6.2), and we’ll assume the force acts in the y direction for a time $\Delta t$ determined by the projectile’s flight along a line of length roughly equal to R. This underestimates the amount of time during which the force acts but overestimates the effect of the force (which doesn’t act purely in the y direction along the entire trajectory), so to some extent these two effects should cancel one another. Making these approximations, we obtain

$$\Delta p_y \cong F\,\Delta t \cong \frac{zZe^2\,(R/2)}{4\pi\varepsilon_0 R^3}\,\frac{R}{v} = \frac{zZe^2}{8\pi\varepsilon_0 R v} \tag{6.4}$$

The angle θ is small, so we can make the approximation $\tan\theta \cong \theta$, and we can assume that $p_x$ changes very little from its initial value mv, and so the average scattering angle is

$$\theta_{\text{av}} \cong \tan\theta_{\text{av}} = \frac{\Delta p_y}{p_x} = \frac{\Delta p_y}{mv} = \frac{zZe^2}{8\pi\varepsilon_0 R v}\,\frac{1}{mv} = \frac{zZe^2}{16\pi\varepsilon_0 R K} \tag{6.5}$$

using the nonrelativistic kinetic energy $K = \tfrac{1}{2}mv^2$. This gives an estimate of the scattering angle when the impact parameter b is equal to half the radius R. Smaller values of b will give smaller deflection angles, and larger values of b will give larger angles, so this is a reasonable estimate for the average scattering angle for a Thomson model atom.

Example 6.1

Using the Thomson model, estimate the average scattering angle when alpha particles (z = 2) with kinetic energy 3 MeV are scattered from gold (Z = 79). The atomic radius of gold is 0.179 nm.
Solution

Using $e^2/4\pi\varepsilon_0 = 1.44$ eV·nm, we have

$$\theta_{\text{av}} \cong \frac{zZe^2}{16\pi\varepsilon_0 R K} = \frac{1}{4}\,\frac{e^2}{4\pi\varepsilon_0}\,\frac{zZ}{RK} = \frac{0.25\,(1.44\ \text{eV}\cdot\text{nm})(2)(79)}{(0.179\ \text{nm})(3 \times 10^{6}\ \text{eV})} = 1 \times 10^{-4}\ \text{rad} \approx 0.01^\circ$$

Even though this result represents a rough estimate of the average scattering angle in the Thomson model of the atom, its accuracy does not affect our conclusions about the failure of the model. Even if our estimate were too small by as much as a factor of 10 (which is highly unlikely), we would be comparing an expected probability of $10^{-300}$ (instead of $10^{-3000}$) with the observed $10^{-4}$, still a spectacular disagreement. Any reasonable estimate shows the complete failure of the Thomson model to account for these scattering experiments.

6.3 THE RUTHERFORD NUCLEAR ATOM

In analyzing the scattering of alpha particles, Rutherford concluded that the most likely way an alpha particle (m = 4 u) can be deflected through large angles is by a single collision with a more massive object. Rutherford therefore proposed that the charge and mass of the atom were concentrated at its center, in a region called the nucleus. Figure 6.4 illustrates the scattering geometry in this case. The projectile, of charge ze, experiences a repulsive force due to the positively charged nucleus:

$$F = \frac{1}{4\pi\varepsilon_0}\,\frac{|q_1||q_2|}{r^2} = \frac{(ze)(Ze)}{4\pi\varepsilon_0 r^2} \tag{6.6}$$

FIGURE 6.4 Scattering by a nuclear atom. The path of the scattered particle is a hyperbola. Smaller impact parameters give larger scattering angles.

(Compare this with Eq. 6.2, which describes a projectile that is inside the sphere of charge Ze and so feels only a portion of the positive charge. We assume now that the projectile is always outside the nucleus, so it feels the full nuclear charge Ze.) The atomic electrons, with their small mass, do not appreciably affect the path of the projectile and we neglect their effect on the scattering.
We also assume that the nucleus is so much more massive than the projectile that it does not move during the scattering process; because no recoil motion is given to the nucleus, the initial and final kinetic energies K of the projectile are equal.

As Figure 6.4 shows, for each impact parameter b, there is a certain scattering angle θ, and we need the relationship between b and θ. The projectile can be shown* to follow a hyperbolic path; in polar coordinates r and φ, the equation of the hyperbola is

$$\frac{1}{r} = \frac{1}{b}\sin\phi + \frac{zZe^2}{8\pi\varepsilon_0 b^2 K}\,(\cos\phi - 1) \tag{6.7}$$

As shown in Figure 6.5, the initial position of the particle is φ = 0, r → ∞, and the final position is φ = π − θ, r → ∞. Using the coordinates at the final position, Eq. 6.7 reduces to

$$b = \frac{zZe^2}{8\pi\varepsilon_0 K}\cot\tfrac{1}{2}\theta = \frac{zZ}{2K}\,\frac{e^2}{4\pi\varepsilon_0}\cot\tfrac{1}{2}\theta \tag{6.8}$$

FIGURE 6.5 The hyperbolic trajectory of a scattered particle.

(This result is written in this form so that $e^2/4\pi\varepsilon_0 = 1.44$ eV·nm or MeV·fm can be easily inserted.) A projectile that approaches the nucleus with impact parameter b will be scattered at an angle θ; projectiles approaching with smaller values of b will be scattered through larger angles, as shown in Figure 6.4.

We divide our study of the scattering of charged projectiles by nuclei (which is commonly called Rutherford scattering) into three parts: (1) calculation of the fraction of projectiles scattered at angles greater than some value of θ, (2) the Rutherford scattering formula and its experimental verification, and (3) the closest approach of a projectile to the nucleus.

1. The Fraction of Projectiles Scattered at Angles Greater than θ. From Figure 6.4 we see immediately that every projectile with impact parameters less than a given value of b will be scattered at angles greater than its corresponding θ. What is the chance of a projectile having an impact parameter less than a given value of b?
Suppose the foil were one atom thick, a single layer of atoms packed tightly together, as in Figure 6.6. Each atom is represented by a circular disc, of area $\pi R^2$. If the foil contains N atoms, its total area is $N\pi R^2$. For scattering at angles greater than θ, the impact parameter must fall between zero and b; that is, the projectile must approach the atom within a circular disc of area $\pi b^2$. If the projectiles are spread uniformly over the area of the disc, then the fraction of projectiles that fall within that area is just $\pi b^2/\pi R^2$.

FIGURE 6.6 Scattering geometry for many atoms. For impact parameter b, the scattering angle is θ. If the particle enters the atom within the disc of area $\pi b^2$, its scattering angle will be larger than θ.

* See, for example, R. M. Eisberg and R. Resnick, Quantum Physics of Atoms, Molecules, Solids, Nuclei, and Particles, 2nd ed. (New York, Wiley, 1985).

A real scattering foil may be thousands or tens of thousands of atoms thick. Let t be the thickness of the foil and A its area, and let ρ and M be the density and molar mass of the material of which the foil is made. The volume of the foil is then At, its mass is ρAt, the number of moles is ρAt/M, and the number of atoms or nuclei per unit volume is

$$n = N_A\,\frac{\rho A t}{M}\,\frac{1}{At} = \frac{N_A\,\rho}{M} \tag{6.9}$$

where $N_A$ is Avogadro’s number (the number of atoms per mole). As seen by an incident projectile, the number of nuclei per unit area is $nt = N_A\rho t/M$; that is, on the average, each nucleus contributes an area $(N_A\rho t/M)^{-1}$ to the field of view of the projectile. For scattering at angles greater than θ, it must once again be true that the projectile must fall within an area $\pi b^2$ of the center of an atom; the fraction scattered at angles greater than θ is just the fraction that approaches an atom within the area $\pi b^2$:

$$f_{<b} = f_{>\theta} = nt\,\pi b^2 \tag{6.10}$$

assuming that the incident particles are spread uniformly over the area of the foil.
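Equations 6.8, 6.9, and 6.10 chain together naturally, and a short script makes the unit handling explicit. The following is a sketch under stated assumptions rather than a definitive implementation: the function names are hypothetical, $e^2/4\pi\varepsilon_0$ is taken as 1.44 MeV·fm, and the sample inputs are the gold-foil values of Example 6.2 (8.0-MeV alpha particles, ρ = 19.3 g/cm$^3$, M = 197 g/mole, t = $2.0 \times 10^{-6}$ m) so that the output can be compared with the worked solution.

```python
import math

E2_OVER_4PI_EPS0 = 1.44   # MeV*fm, the combination e^2 / (4 pi eps0)
N_A = 6.02e23             # atoms per mole


def impact_parameter_fm(z, Z, K_MeV, theta_deg):
    """Eq. 6.8: impact parameter (fm) that scatters the projectile through theta."""
    half_angle = math.radians(theta_deg) / 2.0
    return (z * Z / (2.0 * K_MeV)) * E2_OVER_4PI_EPS0 / math.tan(half_angle)


def number_density_per_m3(rho_g_cm3, molar_mass_g):
    """Eq. 6.9: n = N_A * rho / M, converted from cm^-3 to m^-3."""
    return N_A * rho_g_cm3 / molar_mass_g * 1.0e6


def fraction_beyond(theta_deg, z, Z, K_MeV, rho_g_cm3, molar_mass_g, t_m):
    """Eq. 6.10: fraction of projectiles scattered through more than theta."""
    b_m = impact_parameter_fm(z, Z, K_MeV, theta_deg) * 1.0e-15   # fm -> m
    return number_density_per_m3(rho_g_cm3, molar_mass_g) * t_m * math.pi * b_m**2


# Gold foil and 8.0-MeV alpha particles (inputs of Example 6.2).
gold = dict(z=2, Z=79, K_MeV=8.0, rho_g_cm3=19.3, molar_mass_g=197.0, t_m=2.0e-6)
f90 = fraction_beyond(90.0, **gold)
f45 = fraction_beyond(45.0, **gold)
print(f"b(90 deg) = {impact_parameter_fm(2, 79, 8.0, 90.0):.1f} fm")
print(f"f(>90 deg) = {f90:.2e}")
print(f"f(45 to 90 deg) = {f45 - f90:.2e}")
```

Keeping b in femtometers until the last step mirrors the way Eq. 6.8 is written for use with 1.44 MeV·fm; the only conversions needed are fm to m and cm$^{-3}$ to m$^{-3}$.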
Example 6.2

A gold foil (ρ = 19.3 g/cm$^3$, M = 197 g/mole) has a thickness of $2.0 \times 10^{-4}$ cm. It is used to scatter alpha particles of kinetic energy 8.0 MeV. (a) What fraction of the alpha particles is scattered at angles greater than 90°? (b) What fraction of the alpha particles is scattered at angles between 90° and 45°?

Solution

(a) For this case the number of nuclei per unit volume can be computed as

$$n = \frac{N_A\,\rho}{M} = \frac{(6.02 \times 10^{23}\ \text{atoms/mole})(19.3\ \text{g/cm}^3)}{(197\ \text{g/mole})(1\ \text{m}/10^{2}\ \text{cm})^3} = 5.9 \times 10^{28}\ \text{m}^{-3}$$

For scattering at 90°, the impact parameter b can be found from Eq. 6.8:

$$b = \frac{zZ}{2K}\,\frac{e^2}{4\pi\varepsilon_0}\cot\tfrac{1}{2}\theta = \frac{(2)(79)}{2(8.0\ \text{MeV})}\,(1.44\ \text{MeV}\cdot\text{fm})\cot 45^\circ = 14\ \text{fm} = 1.4 \times 10^{-14}\ \text{m}$$

and using Eq. 6.10 we then have

$$f_{>90^\circ} = nt\,\pi b^2 = (5.9 \times 10^{28}\ \text{m}^{-3})(2.0 \times 10^{-6}\ \text{m})\,\pi\,(1.4 \times 10^{-14}\ \text{m})^2 = 7.5 \times 10^{-5}$$

(b) Repeating the calculation for θ = 45°, we find

$$b = \frac{zZ}{2K}\,\frac{e^2}{4\pi\varepsilon_0}\cot\tfrac{1}{2}\theta = \frac{(2)(79)}{2(8.0\ \text{MeV})}\,(1.44\ \text{MeV}\cdot\text{fm})\cot 22.5^\circ = 34\ \text{fm} = 3.4 \times 10^{-14}\ \text{m}$$

$$f_{>45^\circ} = nt\,\pi b^2 = (5.9 \times 10^{28}\ \text{m}^{-3})(2.0 \times 10^{-6}\ \text{m})\,\pi\,(3.4 \times 10^{-14}\ \text{m})^2 = 4.4 \times 10^{-4}$$

If a total fraction of $4.4 \times 10^{-4}$ is scattered at angles greater than 45°, and of that, $7.5 \times 10^{-5}$ is scattered at angles greater than 90°, the fraction scattered between 45° and 90° must be

$$4.4 \times 10^{-4} - 7.5 \times 10^{-5} = 3.6 \times 10^{-4}$$

2. The Rutherford Scattering Formula and Its Experimental Verification. In order to find the probability that a projectile will be scattered into a small angular range at θ (between θ and θ + dθ), we require that the impact parameter lie within a small range of values db at b (see Figure 6.7). The fraction, df, is then

$$df = nt\,(2\pi b\, db) \tag{6.11}$$

from Eq. 6.10. Differentiating Eq. 6.8 we find db in terms of dθ:

$$db = \frac{zZ}{2K}\,\frac{e^2}{4\pi\varepsilon_0}\left(-\csc^2\tfrac{1}{2}\theta\right)\left(\tfrac{1}{2}\,d\theta\right) \tag{6.12}$$

and so

$$|df| = \pi nt \left(\frac{zZ}{2K}\right)^{2} \left(\frac{e^2}{4\pi\varepsilon_0}\right)^{2} \csc^2\tfrac{1}{2}\theta\,\cot\tfrac{1}{2}\theta\; d\theta \tag{6.13}$$

(The minus sign in Eq. 6.12 is not important; it just tells us that θ increases as b decreases.)
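As a consistency check on Eq. 6.13, integrating $|df|$ from a chosen angle θ up to π should reproduce the fraction $nt\,\pi b^2$ of Eq. 6.10, with b given by Eq. 6.8. The sketch below does this numerically with a simple midpoint rule. It is an illustration only: the function names are made up, and the single parameter `nt_c2` stands for the dimensionless product $nt\,[(zZ/2K)(e^2/4\pi\varepsilon_0)]^2$, set here to an assumed value of about $2.4 \times 10^{-5}$ (roughly what the gold-foil numbers of Example 6.2 give).

```python
import math


def df_dtheta(theta, nt_c2):
    """|df|/dtheta from Eq. 6.13; theta in radians.

    nt_c2 is the dimensionless product n*t*[(zZ/2K)(e^2/4 pi eps0)]^2.
    csc^2(x) * cot(x) is written as 1 / (sin(x)**2 * tan(x)).
    """
    half = theta / 2.0
    return math.pi * nt_c2 / (math.sin(half) ** 2 * math.tan(half))


def fraction_by_integration(theta_min, nt_c2, steps=20_000):
    """Midpoint-rule integral of Eq. 6.13 from theta_min to pi."""
    h = (math.pi - theta_min) / steps
    total = sum(df_dtheta(theta_min + (i + 0.5) * h, nt_c2) for i in range(steps))
    return total * h


def fraction_closed_form(theta_min, nt_c2):
    """Eq. 6.10 with b from Eq. 6.8: f = n t pi b^2 = pi * nt_c2 * cot^2(theta/2)."""
    return math.pi * nt_c2 / math.tan(theta_min / 2.0) ** 2


nt_c2 = 2.4e-5   # assumed illustrative value, see the lead-in
for theta_deg in (45.0, 90.0, 135.0):
    theta = math.radians(theta_deg)
    print(theta_deg,
          fraction_by_integration(theta, nt_c2),
          fraction_closed_form(theta, nt_c2))
```

The two columns agree because $\int \csc^2\tfrac{1}{2}\theta\,\cot\tfrac{1}{2}\theta\,d\theta = -\cot^2\tfrac{1}{2}\theta$, which vanishes at θ = π. The midpoint rule is used so that the integrand is never evaluated exactly at the endpoint.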
Suppose we place a detector for the scattered projectiles at the angle θ a distance r from the nucleus. The probability for a projectile to be scattered into the detector depends on df, which gives the probability for scattered particles to pass through the ring of radius r sin θ and width r dθ. The area of the ring is dA = (2πr sin θ) r dθ. In order to calculate the rate at which projectiles are scattered into the detector we must know the probability per unit area for scattering into the ring. This is |df|/dA, which we call N(θ), and, after some manipulation, we find:

$$N(\theta) = \frac{nt}{4r^2} \left(\frac{zZ}{2K}\right)^{2} \left(\frac{e^2}{4\pi\varepsilon_0}\right)^{2} \frac{1}{\sin^4\tfrac{1}{2}\theta} \tag{6.14}$$

This is the Rutherford scattering formula.

FIGURE 6.7 Particles entering the ring between b and b + db are distributed uniformly along a ring of angular width dθ. A detector is at a distance r from the scattering foil.

FIGURE 6.8 Schematic diagram of alpha-particle scattering experiment. A radioactive source of alpha particles is in a shield with a small hole. Alpha particles strike the foil F and are scattered into the angular range dθ. Each time a scattered particle strikes the screen S a flash of light is emitted and observed with the movable microscope M.

In Rutherford’s laboratory, Hans Geiger and Ernest Marsden tested the predictions of this formula in a remarkable series of experiments involving the scattering of alpha particles (z = 2) from a variety of thin metal foils. In those days before electronic recording and processing equipment was available, Geiger and Marsden observed and recorded the alpha particles by counting the scintillations (flashes of light) produced when the alpha particles struck a zinc sulfide screen. A schematic view of their apparatus is shown in Figure 6.8. In all, four predictions of the Rutherford scattering formula were tested:

(a) $N(\theta) \propto t$.
With a source of 8-MeV alpha particles from radioactive decay, Geiger and Marsden used scattering foils of varying thicknesses t while keeping the scattering angle θ fixed at about 25°. Their results are summarized in Figure 6.9, and the linear dependence of N(θ) on t is apparent. This is also evidence that, even at this moderate scattering angle, single scattering is much more important than multiple scattering. (In a random statistical theory of multiple scattering, the probability for scattering at a large angle would be proportional to the square root of the number of single scatterings, and we would expect $N(\theta) \propto t^{1/2}$. Figure 6.9 shows clearly that this is not true.) This result emphasizes a significant difference between scattering by a Thomson model atom and a Rutherford nuclear atom: In the Thomson model, the projectile is scattered by every atom along its path as it passes through the foil (see Figure 6.3), while in the Rutherford nuclear model the nucleus is so tiny that the chance of even a single significant encounter is small and the chance of encountering more than one nucleus is negligible.

FIGURE 6.9 The dependence of scattering rate on foil thickness for three different scattering foils (silver, copper, and aluminum).

(b) $N(\theta) \propto Z^2$. In this experiment, Geiger and Marsden used a variety of different scattering materials, of approximately (but not exactly) the same thickness. This proportionality is therefore much more difficult to test than the previous one, since it involves the comparison of different thicknesses of different materials. However, as shown in Figure 6.10, the results are consistent with the proportionality of N(θ) to $Z^2$.

FIGURE 6.10 The dependence of scattering rate on the nuclear charge Z for foils of different materials (Au, Ag, Cu, Al). The data are plotted against $Z^2$.

(c) $N(\theta) \propto K^{-2}$. In order to test this prediction of the Rutherford scattering formula, Geiger and Marsden kept the thickness of the scattering foil constant and varied the speed of the alpha particles. They accomplished this by slowing down the alpha particles emitted from the radioactive source by passing them through thin sheets of mica. From independent measurements they knew the effect of different thicknesses of mica on the velocity of the alpha particles. The results of the experiment are shown in Figure 6.11; once again we see excellent agreement with the expected relationship.

FIGURE 6.11 The dependence of scattering rate on the kinetic energy of the incident alpha particles for scattering by a single foil. The slope of −2 on the log-log scale shows that $N \propto K^{-2}$, as expected from the Rutherford formula.

(d) $N(\theta) \propto \sin^{-4}\tfrac{1}{2}\theta$. This dependence of N on θ is perhaps the most important and distinctive feature of the Rutherford scattering formula. It also produces the largest variation in N over the range accessible by experiment. In the tests discussed so far, N varied by perhaps an order of magnitude; in this case N varies by about five orders of magnitude from the smaller to the larger angles. Geiger and Marsden used a gold foil and varied θ from 5° to 150°, to obtain the relationship between N and θ plotted in Figure 6.12. The agreement with the Rutherford formula is again very good.

FIGURE 6.12 The dependence of scattering rate on the scattering angle θ, using a gold foil. The $\sin^{-4}(\theta/2)$ dependence is exactly as predicted by the Rutherford formula.

Thus all predictions of the Rutherford scattering formula were confirmed by experiment, and the “nuclear atom” was verified.

3. The Closest Approach of a Projectile to the Nucleus. A positively charged projectile slows down as it approaches a nucleus, exchanging part of its initial kinetic energy for the electrostatic potential energy due to the nuclear repulsion. The closer the projectile gets to the nucleus, the more potential energy it gains, because

$$U = \frac{1}{4\pi\varepsilon_0}\,\frac{q_1 q_2}{r} = \frac{1}{4\pi\varepsilon_0}\,\frac{zZe^2}{r} \tag{6.15}$$

The maximum potential energy, and thus the minimum kinetic energy, occurs at the minimum value of r. We assume that U = 0 when the projectile is far from the nucleus, where it has total energy $E = K = \tfrac{1}{2}mv^2$. As the projectile approaches the nucleus, K decreases and U increases, but U + K remains constant. At the distance $r_{\min}$, the speed is $v_{\min}$ and:

$$E = \frac{1}{2}mv_{\min}^2 + \frac{1}{4\pi\varepsilon_0}\,\frac{zZe^2}{r_{\min}} = \frac{1}{2}mv^2 \tag{6.16}$$

(See Figure 6.13.)

FIGURE 6.13 Closest approach of the projectile to the nucleus.

Angular momentum is also conserved. Far from the nucleus, the angular momentum L is mvb, and at $r_{\min}$, the angular momentum is $mv_{\min}r_{\min}$, so

$$mvb = mv_{\min}r_{\min} \tag{6.17}$$

which gives $v_{\min} = bv/r_{\min}$. Substituting this result into Eq. 6.16, we find

$$\frac{1}{2}mv^2 = \frac{1}{2}m\,\frac{b^2v^2}{r_{\min}^2} + \frac{1}{4\pi\varepsilon_0}\,\frac{zZe^2}{r_{\min}} \tag{6.18}$$

This expression can be solved for the value of $r_{\min}$. Notice that the kinetic energy of the projectile is not zero at $r_{\min}$, unless b = 0. (See Figure 6.13.) In that head-on case, the projectile would lose all of its kinetic energy, and thus get closest to the nucleus. At this point its distance from the nucleus is d, the distance of closest approach. We find this distance by solving Eq. 6.18 for $r_{\min}$ when b = 0, and obtain

$$d = \frac{1}{4\pi\varepsilon_0}\,\frac{zZe^2}{K} \tag{6.19}$$

Example 6.3

Find the distance of closest approach of an 8.0-MeV alpha particle incident on a gold foil.
Solution

$$d = \frac{1}{4\pi\varepsilon_0}\,\frac{zZe^2}{K} = (2)(79)(1.44\ \text{MeV}\cdot\text{fm})\,\frac{1}{8.0\ \text{MeV}} = 28\ \text{fm}$$

Although a distance of 28 fm is very small (much less than an atomic radius, for example) it is larger than the nuclear radius of gold (about 7 fm). Thus the projectile is always outside of the nuclear charge distribution, and the Rutherford scattering law, which was derived assuming the projectile to remain outside the nucleus, correctly describes the scattering. If we increase the kinetic energy of the projectile, or decrease the electrostatic repulsion by using a target nucleus with low Z, this may not be the case. Under certain circumstances, the distance of closest approach can be less than the nuclear radius. When this happens, the projectile no longer feels the full nuclear charge, and the Rutherford scattering law no longer holds. In fact, as we discuss in Chapter 12, this gives us a convenient way of measuring the size of the nucleus.

6.4 LINE SPECTRA

The radiation from atoms can be classified into continuous spectra and discrete or line spectra. In a continuous spectrum, all wavelengths from some minimum, perhaps 0, to some maximum, perhaps approaching ∞, are emitted. The radiation from a hot, glowing object is an example of this category. White light is a mixture of all of the different colors of visible light; an object that glows white hot is emitting light at all wavelengths of the visible spectrum. If, on the other hand, we force an electric discharge in a tube containing a small amount of the gas or vapor of a certain element, such as mercury, sodium, or neon, light is emitted at a few discrete wavelengths and not at any others. Examples of such emission “line” spectra are shown in Figure 6.14.
The strong 436 nm (blue) and 546 nm (green) lines in the mercury emission spectrum give mercury-vapor street lights their blue-green tint; the strong yellow line at 590 nm in the sodium spectrum (which is actually a doublet, two very closely spaced lines) gives sodium-vapor street lights a softer, yellowish color. The intense red lines of neon are responsible for the red color of “neon signs.”

FIGURE 6.14 Apparatus for observing emission spectra. Light is emitted when an electric discharge is created in a tube containing a vapor of an element. The light passes through a dispersive medium, such as a prism or a diffraction grating, which displays the individual component wavelengths at different positions. Sample line spectra are shown for mercury and sodium in the visible and near ultraviolet.

Another possible experiment is to pass a beam of white light, containing all wavelengths, through a sample of a gas. When we do so, we find that certain wavelengths have been absorbed from the light, and again a line spectrum results. In this case there are dark lines, superimposed on the bright continuous spectrum, at the wavelengths where the absorption occurred. These wavelengths correspond to many (but not all) of the wavelengths seen in the emission spectrum. Examples of absorption spectra are shown in Figure 6.15.

FIGURE 6.15 Apparatus for observing absorption spectra. A light source produces a continuous range of wavelengths, some of which are absorbed by a gaseous element. The light is dispersed, as in Figure 6.14. The result is a continuous “rainbow” spectrum, with dark lines at wavelengths where the light was absorbed by the gas.
In general, the interpretation of line spectra is very difficult in complex atoms, and so we will deal for now with the line spectra of the simplest atom, hydrogen. Regularities appear in both the emission and absorption spectra, as shown in Figure 6.16. Notice that, as with the mercury and sodium spectra, some lines present in the emission spectrum are missing from the absorption spectrum.

FIGURE 6.16 Emission and absorption spectral series of hydrogen. Panels: Lyman (ultraviolet, 90–120 nm; absorption and emission), Balmer (visible, 400–600 nm; emission only), Paschen (infrared, 0.5–2.0 μm; emission only), Brackett (infrared, 1.0–4.0 μm; emission only), Pfund (infrared, 2.0–8.0 μm; emission only). Note the regularities in the spacing of the spectral lines. The lines get closer together as the limit of each series (dashed line) is approached. Only the Lyman series appears in the absorption spectrum; all series are present in the emission spectrum.

Chapter 6 | The Rutherford-Bohr Model of the Atom

In 1885 Johannes Balmer, a Swiss schoolteacher, noticed (mostly by trial and error) that the wavelengths of the group of emission lines of hydrogen in the visible region could be calculated very accurately from the formula

$$\lambda = (364.5\ \text{nm})\,\frac{n^2}{n^2 - 4} \qquad (n = 3, 4, 5, \ldots) \tag{6.20}$$

For example, for $n = 3$, the formula gives $\lambda = 656.1$ nm, which corresponds exactly to the longest wavelength of the series of hydrogen lines in the visible region (see Figure 6.16). This formula is now known as the Balmer formula and the series of lines that it fits is called the Balmer series. The wavelength 364.5 nm, corresponding to $n \to \infty$, is called the series limit (which is shown as the dashed line at the left end of the Balmer series in Figure 6.16).

It was soon discovered that all of the groupings of lines in the hydrogen spectrum could be fit with a similar formula of the form

$$\lambda = \lambda_{\text{limit}}\,\frac{n^2}{n^2 - n_0^2} \qquad (n = n_0 + 1,\ n_0 + 2,\ n_0 + 3, \ldots) \tag{6.21}$$

where $\lambda_{\text{limit}}$ is the wavelength of the appropriate series limit. For the Balmer series, $n_0 = 2$. The other series are today known as Lyman ($n_0 = 1$), Paschen ($n_0 = 3$), Brackett ($n_0 = 4$), and Pfund ($n_0 = 5$). These series of hydrogen spectral lines are shown in Figure 6.16.

Another interesting property of the hydrogen wavelengths is summarized in the Ritz combination principle. If we convert the hydrogen emission wavelengths to frequencies, we find the curious property that certain pairs of frequencies added together give other frequencies that appear in the spectrum.

Any successful model of the hydrogen atom must be able to explain the occurrence of these interesting arithmetic regularities in the emission spectra.

Example 6.4
The series limit of the Paschen series ($n_0 = 3$) is 820.1 nm. What are the three longest wavelengths of the Paschen series?

Solution
From Eq. 6.21,

$$\lambda = (820.1\ \text{nm})\,\frac{n^2}{n^2 - 3^2} \qquad (n = 4, 5, 6, \ldots)$$

The three longest wavelengths are:

$$n = 4:\quad \lambda = (820.1\ \text{nm})\,\frac{4^2}{4^2 - 3^2} = 1875\ \text{nm}$$

$$n = 5:\quad \lambda = (820.1\ \text{nm})\,\frac{5^2}{5^2 - 3^2} = 1281\ \text{nm}$$

$$n = 6:\quad \lambda = (820.1\ \text{nm})\,\frac{6^2}{6^2 - 3^2} = 1094\ \text{nm}$$

These transitions are in the infrared region of the electromagnetic spectrum.

Example 6.5
Show that the longest wavelength of the Balmer series and the longest two wavelengths of the Lyman series satisfy the Ritz combination principle. For the Lyman series, $\lambda_{\text{limit}} = 91.13$ nm.

Solution
Using Eq. 6.20 with $n = 3$, we find the longest wavelength of the Balmer series to be 656.1 nm. Converting this to a frequency, we obtain

$$f = \frac{c}{\lambda} = \frac{2.998 \times 10^{8}\ \text{m/s}}{(656.1\ \text{nm})(10^{-9}\ \text{m/nm})} = 4.57 \times 10^{14}\ \text{Hz}$$

Using Eq.
6.21 for $n = 2$ and 3 with $n_0 = 1$, we find the longest two wavelengths of the Lyman series and their corresponding frequencies to be

$$n = 2:\quad \lambda = (91.13\ \text{nm})\,\frac{2^2}{2^2 - 1^2} = 121.5\ \text{nm}, \qquad f = \frac{c}{\lambda} = \frac{2.998 \times 10^{8}\ \text{m/s}}{(121.5\ \text{nm})(10^{-9}\ \text{m/nm})} = 24.67 \times 10^{14}\ \text{Hz}$$

$$n = 3:\quad \lambda = (91.13\ \text{nm})\,\frac{3^2}{3^2 - 1^2} = 102.5\ \text{nm}, \qquad f = \frac{c}{\lambda} = \frac{2.998 \times 10^{8}\ \text{m/s}}{(102.5\ \text{nm})(10^{-9}\ \text{m/nm})} = 29.24 \times 10^{14}\ \text{Hz}$$

Adding the smallest frequency of the Lyman series to the smallest frequency of the Balmer series gives the next smallest Lyman frequency:

$$24.67 \times 10^{14}\ \text{Hz} + 4.57 \times 10^{14}\ \text{Hz} = 29.24 \times 10^{14}\ \text{Hz}$$

demonstrating the Ritz combination principle.

6.5 THE BOHR MODEL

Following Rutherford’s proposal that the mass and positive charge are concentrated in a very small region at the center of the atom, the Danish physicist Niels Bohr in 1913 (while working in Rutherford’s laboratory) suggested that the atom resembled a miniature planetary system, with the electrons circulating about the nucleus like planets circulating about the Sun. The atom thus doesn’t collapse under the influence of the electrostatic Coulomb force of the nucleus on the electrons for the same reason that the solar system doesn’t collapse under the influence of the gravitational force of the Sun on the planets. In both cases, the attractive force provides the centripetal acceleration necessary to maintain the orbital motion.

As we discuss later, the Bohr model does not give a correct view of the actual structure and properties of atoms, but it represents an important first step in achieving an understanding of atoms. The correct view requires methods of quantum mechanics, which we discuss in Chapter 7.

We consider for simplicity the hydrogen atom, with a single electron circulating about a nucleus that has a single positive charge, as in Figure 6.17. The radius of the circular orbit is $r$, and the electron (of mass $m$) moves with constant tangential speed $v$.
The attractive Coulomb force provides the centripetal acceleration $v^2/r$, so

$$F = \frac{1}{4\pi\varepsilon_0}\frac{|q_1||q_2|}{r^2} = \frac{1}{4\pi\varepsilon_0}\frac{e^2}{r^2} = \frac{mv^2}{r} \tag{6.22}$$

FIGURE 6.17 The Bohr model of the atom ($Z = 1$ for hydrogen).

Manipulating this equation, we can find the kinetic energy of the electron (we are assuming the more massive nucleus to remain at rest; more about this later):

$$K = \frac{1}{2}mv^2 = \frac{1}{8\pi\varepsilon_0}\frac{e^2}{r} \tag{6.23}$$

The potential energy of the electron-nucleus system is the Coulomb potential energy:

$$U = \frac{1}{4\pi\varepsilon_0}\frac{q_1 q_2}{r} = -\frac{1}{4\pi\varepsilon_0}\frac{e^2}{r} \tag{6.24}$$

The total energy $E = K + U$ is obtained by adding Eqs. 6.23 and 6.24:

$$E = K + U = \frac{1}{8\pi\varepsilon_0}\frac{e^2}{r} + \left(-\frac{1}{4\pi\varepsilon_0}\frac{e^2}{r}\right) = -\frac{1}{8\pi\varepsilon_0}\frac{e^2}{r} \tag{6.25}$$

We have ignored one serious difficulty with this model thus far. Classical physics requires that an accelerated electric charge, such as our orbiting electron, must continuously radiate electromagnetic energy. As it radiates this energy, its total energy would decrease, the electron would spiral in toward the nucleus, and the atom would collapse.

To overcome this difficulty, Bohr made a bold and daring hypothesis: he proposed that there are certain special states of motion, called stationary states, in which the electron may exist without radiating electromagnetic energy. In these states, according to Bohr, the angular momentum $L$ of the electron takes values that are integer multiples of $\hbar$. In stationary states, the angular momentum of the electron may have magnitude $\hbar$, $2\hbar$, $3\hbar$, . . ., but never such values as $2.5\hbar$ or $3.1\hbar$. This is called the quantization of angular momentum.

In a circular orbit, the position vector $\vec{r}$ that locates the electron relative to the nucleus is always perpendicular to its linear momentum $\vec{p}$. The angular momentum, which is defined as $\vec{L} = \vec{r} \times \vec{p}$, has magnitude $L = rp = mvr$ when $\vec{r}$ is perpendicular to $\vec{p}$. Thus Bohr’s postulate is

$$mvr = n\hbar \tag{6.26}$$

where $n$ is an integer ($n = 1, 2, 3, \ldots$). We can use this expression with Eq.
6.23 for the kinetic energy

$$\frac{1}{2}mv^2 = \frac{1}{2}m\left(\frac{n\hbar}{mr}\right)^2 = \frac{1}{8\pi\varepsilon_0}\frac{e^2}{r} \tag{6.27}$$

to find a series of allowed values of the radius $r$:

$$r_n = \frac{4\pi\varepsilon_0\hbar^2}{me^2}\,n^2 = a_0 n^2 \qquad (n = 1, 2, 3, \ldots) \tag{6.28}$$

where the Bohr radius $a_0$ is defined as

$$a_0 = \frac{4\pi\varepsilon_0\hbar^2}{me^2} = 0.0529\ \text{nm} \tag{6.29}$$

Niels Bohr (1885–1962, Denmark). He developed a successful theory of the radiation spectrum of atomic hydrogen and also contributed the concepts of stationary states and complementarity to quantum mechanics. Later he developed a successful theory of nuclear fission. The institute of theoretical physics he founded in Copenhagen attracts scholars from around the world.

This important result is very different from what we expect from classical physics. A satellite may be placed into Earth orbit at any desired radius by boosting it to the appropriate altitude and then supplying the proper tangential speed. This is not true for an electron’s orbit: only certain radii are allowed by the Bohr model. The radius of the electron’s orbit may be $a_0$, $4a_0$, $9a_0$, $16a_0$, and so forth, but never $3a_0$ or $5.3a_0$.

Substituting Eq. 6.28 for $r$ into Eq. 6.25 gives the energy:

$$E_n = -\frac{me^4}{32\pi^2\varepsilon_0^2\hbar^2}\,\frac{1}{n^2} = \frac{-13.60\ \text{eV}}{n^2} \qquad (n = 1, 2, 3, \ldots) \tag{6.30}$$

The energy levels calculated from Eq. 6.30 are shown in Figure 6.18. The electron’s energy is quantized: only certain energy values are possible. In its lowest level, with $n = 1$, the electron has energy $E_1 = -13.60$ eV and orbits with a radius of $r_1 = 0.0529$ nm. This state is the ground state. The higher states ($n = 2$ with $E_2 = -3.40$ eV, $n = 3$ with $E_3 = -1.51$ eV, etc.) are the excited states.

The excitation energy of an excited state $n$ is the energy above the ground state, $E_n - E_1$. Thus the first excited state ($n = 2$) has excitation energy

$$\Delta E = E_2 - E_1 = -3.40\ \text{eV} - (-13.60\ \text{eV}) = 10.20\ \text{eV}$$

the second excited state has excitation energy

$$\Delta E = E_3 - E_1 = -1.51\ \text{eV} - (-13.60\ \text{eV}) = 12.09\ \text{eV}$$

and so forth.
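The numbers quoted for Eqs. 6.29 and 6.30 can be reproduced with a few lines of code. The following is only a minimal sketch: the SI constant values and all function names are assumptions introduced here for illustration and are not part of the text.

```python
import math

# SI values of the constants (assumed here; not quoted in the text)
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
HBAR = 1.054571817e-34       # reduced Planck constant, J s
M_E = 9.1093837015e-31       # electron mass, kg
E_CHARGE = 1.602176634e-19   # elementary charge, C


def bohr_radius_nm():
    """Bohr radius a0 = 4*pi*eps0*hbar^2 / (m e^2), Eq. 6.29, in nm."""
    a0_m = 4 * math.pi * EPS0 * HBAR**2 / (M_E * E_CHARGE**2)
    return a0_m * 1e9


def orbit_radius_nm(n):
    """Allowed radius r_n = a0 n^2, Eq. 6.28, in nm."""
    return bohr_radius_nm() * n**2


def energy_ev(n):
    """Energy E_n = -m e^4 / (32 pi^2 eps0^2 hbar^2 n^2), Eq. 6.30, in eV."""
    e1_joule = M_E * E_CHARGE**4 / (32 * math.pi**2 * EPS0**2 * HBAR**2)
    return -e1_joule / E_CHARGE / n**2


def excitation_energy_ev(n):
    """Energy of level n above the ground state, E_n - E_1, in eV."""
    return energy_ev(n) - energy_ev(1)


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(n, round(orbit_radius_nm(n), 4), round(energy_ev(n), 2),
              round(excitation_energy_ev(n), 2))
```

Evaluating the levels from the constants rather than from the rounded value 13.60 eV makes it easy to see how the quoted three- and four-figure results arise.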
The excitation energy can also be regarded as the amount of energy that the atom must absorb for the electron to make an upward jump. For example, if the atom absorbs an energy of 10.20 eV when the electron is in the ground state (n = 1), the electron will jump upward to the first excited state (n = 2). The magnitude of an electron’s energy |En | is sometimes called its binding energy; for example, the binding energy of an electron in the n = 2 state is 3.40 eV. If the atom absorbs an amount of energy equal to the binding energy of the electron, the electron will be removed from the atom and become a free electron. The atom, minus its electron, is called an ion. The amount of energy needed to remove an electron from an atom is also called the ionization energy. Usually the ionization energy of an atom indicates the energy to remove an electron from the ground state. If the atom absorbs more energy than the minimum necessary to remove the electron, the excess energy appears as the kinetic energy of the now free electron. The binding energy can also be regarded as the energy that is released when the atom is assembled from an electron and a nucleus that are initially separated by a large distance. If we bring an electron from a large distance away (where E = 0) and place it in orbit in the state n where its energy has the negative value En , energy amounting to |En | is released, usually in the form of one or more photons. The Hydrogen Wavelengths in the Bohr Model We previously discussed the emission and absorption spectra of atomic hydrogen, and our discussion of the Bohr model is not complete without an understanding of the origin of these spectra. Bohr postulated that, even though the electron doesn’t radiate when it remains in any particular stationary state, it can emit radiation when it moves to a lower energy level. 
In the lower level, the electron has less energy than in the original level, and the energy difference appears as a quantum of radiation whose energy $hf$ is equal to the energy difference between the levels.

FIGURE 6.18 The energy levels of atomic hydrogen, showing the excitation energy of the electron from $n = 1$ to $n = 2$ and the binding energy of the $n = 2$ electron. Levels shown: $E_1 = -13.60$ eV ($n = 1$), $E_2 = -3.40$ eV ($n = 2$), $E_3 = -1.51$ eV ($n = 3$), $E_4 = -0.85$ eV ($n = 4$), $E_\infty = 0$ ($n = \infty$).

That is, if the electron jumps from $n = n_1$ to $n = n_2$, as in Figure 6.19, a photon appears with energy

$$hf = E_{n_1} - E_{n_2} \tag{6.31}$$

or, using Eq. 6.30 for the energies,

$$f = \frac{me^4}{64\pi^3\varepsilon_0^2\hbar^3}\left(\frac{1}{n_2^2} - \frac{1}{n_1^2}\right) \tag{6.32}$$

FIGURE 6.19 An electron jumps from the state $n_1$ to the state $n_2$ as a photon is emitted.

The wavelength of the emitted radiation is

$$\lambda = \frac{c}{f} = \frac{64\pi^3\varepsilon_0^2\hbar^3 c}{me^4}\left(\frac{n_1^2 n_2^2}{n_1^2 - n_2^2}\right) = \frac{1}{R_\infty}\left(\frac{n_1^2 n_2^2}{n_1^2 - n_2^2}\right) \tag{6.33}$$

where $R_\infty$ is called the Rydberg constant

$$R_\infty = \frac{me^4}{64\pi^3\varepsilon_0^2\hbar^3 c} \tag{6.34}$$

The presently accepted numerical value is

$$R_\infty = 1.097373 \times 10^{7}\ \text{m}^{-1}$$

Example 6.6
Find the wavelengths of the transitions from $n_1 = 3$ to $n_2 = 2$ and from $n_1 = 4$ to $n_2 = 2$ in atomic hydrogen.

Solution
For $n_1 = 3$ and $n_2 = 2$, Eq. 6.33 gives

$$\lambda = \frac{1}{R_\infty}\left(\frac{n_1^2 n_2^2}{n_1^2 - n_2^2}\right) = \frac{1}{1.097 \times 10^{7}\ \text{m}^{-1}}\left(\frac{3^2\, 2^2}{3^2 - 2^2}\right) = 656.1\ \text{nm}$$

and for $n_1 = 4$ and $n_2 = 2$,

$$\lambda = \frac{1}{R_\infty}\left(\frac{n_1^2 n_2^2}{n_1^2 - n_2^2}\right) = \frac{1}{1.097 \times 10^{7}\ \text{m}^{-1}}\left(\frac{4^2\, 2^2}{4^2 - 2^2}\right) = 486.0\ \text{nm}$$

These wavelengths are remarkably close to the values of the two longest wavelengths of the Balmer series (Figure 6.16). In fact, Eq. 6.33 gives

$$\lambda = (364.5\ \text{nm})\left(\frac{n_1^2}{n_1^2 - 4}\right)$$

for the wavelength of a transition from any state $n_1$ to $n_2 = 2$. This is identical with Eq. 6.21 for the Balmer series. Thus we see that the radiations identified as the Balmer series correspond to transitions from higher levels to the $n = 2$ level.
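Equation 6.33 is easy to evaluate for any pair of levels. The sketch below is an illustration only; the function names are hypothetical, and it uses the full quoted value of $R_\infty$ rather than the rounded $1.097 \times 10^{7}\ \text{m}^{-1}$ shown in Example 6.6.

```python
R_INF = 1.097373e7  # Rydberg constant in m^-1, value quoted after Eq. 6.34


def wavelength_nm(n_upper, n_lower, rydberg=R_INF):
    """Wavelength (nm) of the photon emitted in n_upper -> n_lower, Eq. 6.33."""
    if n_upper <= n_lower:
        raise ValueError("emission requires n_upper > n_lower")
    ratio = (n_upper**2 * n_lower**2) / (n_upper**2 - n_lower**2)
    return 1e9 * ratio / rydberg


def series_limit_nm(n_lower, rydberg=R_INF):
    """Series limit (n_upper -> infinity) for transitions ending on n_lower."""
    return 1e9 * n_lower**2 / rydberg


if __name__ == "__main__":
    # Balmer lines (n_lower = 2) and the Balmer series limit
    for n in (3, 4, 5, 6):
        print(n, round(wavelength_nm(n, 2), 1))
    print("limit", round(series_limit_nm(2), 1))
```

Setting `n_lower` to 1, 3, 4, or 5 gives the Lyman, Paschen, Brackett, and Pfund series in the same way, which is the content of Eq. 6.21 with $\lambda_{\text{limit}} = n_0^2/R_\infty$.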
Similar identifications can be made for other series of radiations, as shown in Figure 6.20. This association between the transitions expected according to the Bohr model and the observed wavelengths (as in Figure 6.16) represents a huge triumph for the model.

FIGURE 6.20 The transitions of the Lyman and Balmer series in hydrogen. The series limit is shown at the right of each group. Levels shown: $E_1 = -13.60$ eV, $E_2 = -3.40$ eV, $E_3 = -1.51$ eV, $E_4 = -0.85$ eV, $E_5 = -0.54$ eV, $E_6 = -0.38$ eV, $E_\infty = 0$. Lyman series (photon wavelength, photon energy): 121.5 nm, 10.20 eV; 102.5 nm, 12.09 eV; 97.2 nm, 12.76 eV; 94.9 nm, 13.06 eV; 93.7 nm, 13.23 eV; limit 91.1 nm, 13.60 eV. Balmer series: 656.1 nm, 1.89 eV; 486.0 nm, 2.55 eV; 434.0 nm, 2.86 eV; 410.1 nm, 3.02 eV; limit 364.5 nm, 3.40 eV.

The Bohr formulas also explain the Ritz combination principle, according to which certain frequencies in the emission spectrum can be summed to give other frequencies. Let us consider a transition from a state $n_3$ to a state $n_2$ that is followed by a transition from $n_2$ to $n_1$. Equation 6.32 (in which $me^4/64\pi^3\varepsilon_0^2\hbar^3 = cR_\infty$ by Eq. 6.34) can be used for this case to give

$$f_{n_3 \to n_2} = cR_\infty\left(\frac{1}{n_2^2} - \frac{1}{n_3^2}\right)$$

$$f_{n_2 \to n_1} = cR_\infty\left(\frac{1}{n_1^2} - \frac{1}{n_2^2}\right)$$

Thus

$$f_{n_3 \to n_2} + f_{n_2 \to n_1} = cR_\infty\left(\frac{1}{n_2^2} - \frac{1}{n_3^2}\right) + cR_\infty\left(\frac{1}{n_1^2} - \frac{1}{n_2^2}\right) = cR_\infty\left(\frac{1}{n_1^2} - \frac{1}{n_3^2}\right)$$

which is equal to the frequency of the single photon emitted in a direct transition from $n_3$ to $n_1$, so

$$f_{n_3 \to n_2} + f_{n_2 \to n_1} = f_{n_3 \to n_1} \tag{6.35}$$

The Bohr model is thus entirely consistent with the Ritz combination principle. The frequency of an emitted photon is related to its energy by $E = hf$, so the summing of frequencies is equivalent to the summing of energies. We may thus restate the Ritz combination principle in terms of energy:

The energy of a photon emitted in a transition that skips or crosses over one or more states is equal to the step-by-step sum of the energies of the transitions connecting all of the individual states.

(See Problem 25.)
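Equation 6.35 can also be checked numerically. The following sketch assumes the values of $c$ and $R_\infty$ quoted earlier in the chapter; the function names are hypothetical and chosen only for this illustration.

```python
import math

C = 2.998e8          # speed of light in m/s, as used in Example 6.5
R_INF = 1.097373e7   # Rydberg constant in m^-1


def frequency_hz(n_upper, n_lower):
    """Frequency of the photon emitted in n_upper -> n_lower (Eq. 6.32)."""
    return C * R_INF * (1.0 / n_lower**2 - 1.0 / n_upper**2)


def cascade_frequency_sum(levels):
    """Sum of the step-by-step frequencies along a chain of levels,
    for example [5, 3, 2, 1] means 5 -> 3, then 3 -> 2, then 2 -> 1."""
    return sum(frequency_hz(a, b) for a, b in zip(levels, levels[1:]))


if __name__ == "__main__":
    direct = frequency_hz(3, 1)
    stepwise = cascade_frequency_sum([3, 2, 1])
    print(direct, stepwise, math.isclose(direct, stepwise, rel_tol=1e-12))
```

The intermediate terms cancel pairwise in the sum, so the agreement holds for any chain of levels, not only for the 3 → 2 → 1 case of Example 6.5.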
The Bohr model also helps us understand why the atom doesn’t absorb and emit radiation at all the same wavelengths. Isolated atoms are normally found only in the ground state; the excited states live for a very short time (less than $10^{-9}$ s) before decaying to the ground state. The absorption spectrum therefore contains only transitions from the ground state. From Figure 6.20, we see that only the radiations of the Lyman series can be found in the absorption spectrum of hydrogen. A hydrogen atom in its ground state can absorb radiation of 10.20 eV and reach the first excited state, or of 12.09 eV and reach the second excited state, and so forth. A hydrogen atom cannot absorb a photon of energy 1.89 eV (the first line of the Balmer series), because the atom is originally not in the $n = 2$ level. The Balmer series is therefore not found in the absorption spectrum.

Atoms with Z > 1

The Bohr theory for hydrogen can be used for any atom with a single electron, even if the nuclear charge $Z$ is greater than 1. For example, we can calculate the energy levels of singly ionized helium (helium with one electron removed), doubly ionized lithium, and so on. The nuclear electric charge enters the Bohr theory in only one place: in the expression for the electrostatic force between nucleus and electron, Eq. 6.22. For a nucleus of charge $Ze$, the Coulomb force acting on the electron is

$$F = \frac{1}{4\pi\varepsilon_0}\frac{|q_1||q_2|}{r^2} = \frac{1}{4\pi\varepsilon_0}\frac{Ze^2}{r^2} \tag{6.36}$$

That is, where we had $e^2$ previously, we now have $Ze^2$. Making the same substitution in the final results, we can find the allowed radii:

$$r_n = \frac{4\pi\varepsilon_0\hbar^2}{Ze^2 m}\,n^2 = \frac{a_0 n^2}{Z} \tag{6.37}$$

and the energies become

$$E_n = -\frac{m(Ze^2)^2}{32\pi^2\varepsilon_0^2\hbar^2}\,\frac{1}{n^2} = -(13.60\ \text{eV})\,\frac{Z^2}{n^2} \tag{6.38}$$

The orbits in the higher-$Z$ atoms are closer to the nucleus and have larger (negative) energies; that is, the electron is more tightly bound to the nucleus.
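The scaling in Eqs. 6.37 and 6.38 can be illustrated with a short calculation. This is a minimal sketch that takes the rounded values 13.60 eV, 0.0529 nm, and $hc = 1240$ eV·nm as given; the function names are hypothetical.

```python
RYDBERG_ENERGY_EV = 13.60   # |E_1| for hydrogen, from Eq. 6.30
A0_NM = 0.0529              # Bohr radius in nm, from Eq. 6.29
HC_EV_NM = 1240.0           # hc in eV nm (rounded value, assumed here)


def radius_nm(n, z=1):
    """Orbit radius r_n = a0 n^2 / Z (Eq. 6.37) for a one-electron atom."""
    return A0_NM * n**2 / z


def energy_ev(n, z=1):
    """Level energy E_n = -(13.60 eV) Z^2 / n^2 (Eq. 6.38)."""
    return -RYDBERG_ENERGY_EV * z**2 / n**2


def emission_wavelength_nm(n_upper, n_lower, z=1):
    """Wavelength of the photon emitted in n_upper -> n_lower."""
    photon_energy = energy_ev(n_upper, z) - energy_ev(n_lower, z)
    return HC_EV_NM / photon_energy


if __name__ == "__main__":
    # Ground states of H, He+, Li2+ and Be3+
    for z in (1, 2, 3, 4):
        print(z, round(radius_nm(1, z), 4), round(energy_ev(1, z), 1))
    # Two longest Balmer-type (n_lower = 2) wavelengths for Z = 4
    print(round(emission_wavelength_nm(3, 2, z=4), 1),
          round(emission_wavelength_nm(4, 2, z=4), 1))
```

Because every energy scales as $Z^2$, each wavelength of a one-electron ion is the corresponding hydrogen wavelength divided by $Z^2$.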
Example 6.7
Calculate the two longest wavelengths of the Balmer series of triply ionized beryllium ($Z = 4$).

Solution
The radiations of the Balmer series end with the $n = 2$ level, and so the two longest wavelengths are the radiations corresponding to $n = 3 \to n = 2$ and $n = 4 \to n = 2$. The energies of the radiations and their corresponding wavelengths are

$$E_3 - E_2 = -(13.60\ \text{eV})(4^2)\left(\frac{1}{3^2} - \frac{1}{2^2}\right) = 30.2\ \text{eV}, \qquad \lambda = \frac{hc}{E} = \frac{1240\ \text{eV}\cdot\text{nm}}{30.2\ \text{eV}} = 41.0\ \text{nm}$$

$$E_4 - E_2 = -(13.60\ \text{eV})(4^2)\left(\frac{1}{4^2} - \frac{1}{2^2}\right) = 40.8\ \text{eV}, \qquad \lambda = \frac{hc}{E} = \frac{1240\ \text{eV}\cdot\text{nm}}{40.8\ \text{eV}} = 30.4\ \text{nm}$$

These radiations are in the ultraviolet region. Note that we cannot use Eq. 6.33 to find the wavelengths, because that equation applies only to hydrogen ($Z = 1$).

6.6 THE FRANCK-HERTZ EXPERIMENT

Let us imagine the following experiment, performed with the apparatus shown schematically in Figure 6.21. A filament heats the cathode, which then emits electrons. These electrons are accelerated toward the grid by the potential difference $V$, which we control. Electrons pass through the grid and reach the plate if $V$ exceeds $V_0$, a small retarding voltage between the grid and the plate. The current of electrons reaching the plate is measured using the ammeter A.

FIGURE 6.21 Franck-Hertz apparatus. Electrons leave the cathode C, are accelerated by the voltage $V$ toward the grid G, and reach the plate P where they are recorded on the ammeter A.

Now suppose the tube is filled with atomic hydrogen gas at a low pressure. As the voltage is increased from zero, more and more electrons reach the plate, and the current rises accordingly. The electrons inside the tube may make collisions with atoms of hydrogen, but lose no energy in these collisions; the collisions are perfectly elastic. The only way the electron can give up energy in a collision is if the electron has enough energy to cause the hydrogen atom to make a transition to an excited state.
Thus, when the energy of the electrons reaches and barely exceeds 10.2 eV (or when the voltage reaches 10.2 V), the electrons can make inelastic collisions, leaving 10.2 eV of energy with the atom (now in the n = 2 level), and the original electron moves off with very little energy. If it should pass through the grid, the electron might not have sufficient energy to overcome the small retarding potential and reach the plate. Thus when V = 10.2 V, a drop in the current is observed. As V is increased further, we begin to see the effects of multiple collisions. That is, when V = 20.4 V, an electron can make an inelastic collision, leaving the atom in the n = 2 state. The electron loses 10.2 eV of energy in this process, and so it moves off after the collision with a remaining 10.2 eV of energy, which is sufficient to excite a second hydrogen atom in an inelastic collision. Thus, if a drop in the current is observed at V , similar drops are observed at 2V , 3V , . . . . This experiment should thus give rather direct evidence for the existence of atomic excited states. Unfortunately, it is not easy to do this experiment with hydrogen, because hydrogen occurs naturally in the molecular form H2 , rather than in atomic form. The molecules can absorb energy in a variety of ways, which would confuse the interpretation of the experiment. A similar experiment was done in 1914 by James Franck and Gustav Hertz, using a tube filled with mercury vapor. Their results are shown in Figure 6.22, which gives clear evidence for an excited state at 4.9 eV; whenever the voltage is a multiple of 4.9 V, a drop in the current appears. Coincidentally, the emission spectrum of mercury shows an intense ultraviolet line of wavelength 254 nm, which corresponds to an energy of 4.9 eV; this results from a transition between the same 4.9-eV excited state and the ground state. 
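As a quick check of the correspondence just stated between the 254-nm ultraviolet line and the 4.9-eV excited state, the photon energy follows from $E = hc/\lambda$, taking the rounded value $hc = 1240$ eV·nm (the same value used in Example 6.7):

$$E = \frac{hc}{\lambda} = \frac{1240\ \text{eV}\cdot\text{nm}}{254\ \text{nm}} = 4.88\ \text{eV} \approx 4.9\ \text{eV}$$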
The Franck-Hertz experiment showed that an electron must have a certain minimum energy to make an inelastic collision with an atom; we now interpret that minimum energy as the energy of an excited state of the atom. Franck and Hertz were awarded the 1925 Nobel Prize in physics for this work.

FIGURE 6.22 Result of Franck-Hertz experiment using mercury vapor (current plotted against voltage). The current drops at voltages of 4.9 V, 9.8 V (= 2 × 4.9 V), 14.7 V (= 3 × 4.9 V).

*6.7 THE CORRESPONDENCE PRINCIPLE

We have seen how Bohr’s model permits calculations of transition wavelengths in atomic hydrogen that are in excellent agreement with the wavelengths observed in the emission and absorption spectra. However, in order to obtain this agreement, Bohr had to introduce postulates that were radical departures from classical physics. In particular, according to classical physics an accelerated charged particle radiates electromagnetic energy, but an electron in Bohr’s atomic model, accelerated as it moves in a circular orbit, does not radiate (unless it jumps to another orbit).

Here we have a very different case than we did in our study of special relativity. You will recall, for example, that relativity gives us one expression for the kinetic energy, $K = E - E_0$, and classical physics gives us another, $K = \frac{1}{2}mv^2$; however, we showed that $E - E_0$ reduces to $\frac{1}{2}mv^2$ when $v \ll c$. Thus these two expressions are really not very different: one is merely a special case of the other. The dilemma associated with the accelerated electron is not simply a matter of atomic physics (as an example of quantum physics) being a special case of classical physics. Either the accelerated charge radiates, or it doesn’t!
Bohr’s solution to this serious dilemma was to propose the correspondence principle, which states that

Quantum theory must agree with classical theory in the limit in which classical theory is known to agree with experiment,

or equivalently,

Quantum theory must agree with classical theory in the limit of large quantum numbers.

Let us see how we can apply this principle to the Bohr atom. According to classical physics, an electric charge moving in a circle radiates at a frequency equal to its frequency of rotation. For an atomic orbit, the period of revolution is the distance traveled in one orbit, $2\pi r$, divided by the orbital speed $v = \sqrt{2K/m}$, where $K$ is the kinetic energy:

$$T = \frac{2\pi r}{\sqrt{2K/m}} = \frac{\sqrt{16\pi^3\varepsilon_0 m r^3}}{e} \tag{6.39}$$

where we use Eq. 6.23 for the kinetic energy. The frequency $f$ is the inverse of the period:

$$f = \frac{1}{T} = \frac{e}{\sqrt{16\pi^3\varepsilon_0 m r^3}} \tag{6.40}$$

Using Eq. 6.28 for the radii of the allowed orbits, we find

$$f_n = \frac{me^4}{32\pi^3\varepsilon_0^2\hbar^3}\,\frac{1}{n^3} \tag{6.41}$$

A “classical” electron moving in an orbit of radius $r_n$ would radiate at this frequency $f_n$.

* This is an optional section that may be skipped without loss of continuity.

If we made the radius of the Bohr atom so large that it went from a quantum-sized object ($10^{-10}$ m) to a laboratory-sized object ($10^{-3}$ m), the atom should behave classically. The radius increases with increasing $n$ like $n^2$, so this classical behavior should occur for $n$ in the range $10^3$ to $10^4$. Let us then calculate the frequency of the radiation emitted by such an atom when the electron drops from the orbit $n$ to the orbit $n - 1$. According to Eq. 6.32, the frequency is

$$f = \frac{me^4}{64\pi^3\varepsilon_0^2\hbar^3}\left(\frac{1}{(n-1)^2} - \frac{1}{n^2}\right) = \frac{me^4}{64\pi^3\varepsilon_0^2\hbar^3}\,\frac{2n - 1}{n^2 (n-1)^2} \tag{6.42}$$

If $n$ is very large, then we can approximate $n - 1$ by $n$ and $2n - 1$ by $2n$, which gives

$$f \cong \frac{me^4}{64\pi^3\varepsilon_0^2\hbar^3}\,\frac{2n}{n^4} = \frac{me^4}{32\pi^3\varepsilon_0^2\hbar^3}\,\frac{1}{n^3} \tag{6.43}$$

This is identical with Eq. 6.41 for the “classical” frequency.
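How quickly the exact quantum frequency of Eq. 6.42 approaches the classical frequency of Eq. 6.41 can be seen numerically. In the sketch below both frequencies are expressed in units of the common prefactor $me^4/64\pi^3\varepsilon_0^2\hbar^3$; the function names are hypothetical and introduced only for illustration.

```python
def quantum_frequency(n):
    """Frequency of the n -> n-1 transition (Eq. 6.42), in units of
    m e^4 / (64 pi^3 eps0^2 hbar^3)."""
    if n < 2:
        raise ValueError("need n >= 2 for a transition n -> n-1")
    return 1.0 / (n - 1)**2 - 1.0 / n**2


def classical_frequency(n):
    """Orbital frequency in the orbit of radius r_n (Eq. 6.41), in the same
    units: m e^4 / (32 pi^3 ...) / n^3 equals 2 / n^3 times the prefactor."""
    return 2.0 / n**3


def frequency_ratio(n):
    """Ratio of the quantum to the classical frequency for level n."""
    return quantum_frequency(n) / classical_frequency(n)


if __name__ == "__main__":
    for n in (2, 10, 100, 1000, 10000):
        print(n, frequency_ratio(n))
```

The ratio equals $n(2n-1)/2(n-1)^2$, which is 3 for $n = 2$ and differs from 1 by roughly $3/2n$ for large $n$, so the two frequencies agree to about 0.15% at $n = 10^3$.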
The “classical” electron spirals slowly in toward the nucleus, radiating at the frequency given by Eq. 6.41, while the “quantum” electron jumps from the orbit n to the orbit n − 1 and then to the orbit n − 2, and so forth, radiating at the frequency given by the identical Eq. 6.43. (When the circular orbits are very large, this jumping from one circular orbit to the next smaller one looks very much like a spiral, as in Figure 6.23.) In the region of large n, where classical and quantum physics overlap, the classical and quantum expressions for the radiation frequencies are identical. This is an example of an application of Bohr’s correspondence principle. The applications of the correspondence principle go far beyond the Bohr atom, and this principle is important in understanding how we get from the domain in which the laws of classical physics are valid to the domain in which the laws of quantum physics are valid. 6.8 DEFICIENCIES OF THE BOHR MODEL The Bohr model gives us a picture of how electrons move about the nucleus, and many of our attempts to explain the behavior of atoms refer to this picture, even though it is not strictly correct. Our presentation ignored two effects that must be included to improve the accuracy of the model. Other deficiencies in the model cannot be so easily fixed, because they are inconsistent with the correct quantum-mechanical picture, which is presented in the next chapter. 1. Motion of the Proton. Our model was based on an electron orbiting around a fixed proton, but actually the electron and proton both orbit about their center of mass (just as the Earth and Sun orbit around their center of mass). The kinetic energy should thus include a term describing the motion of the proton. We can account for this effect if the mass that appears in the equation for the energy levels (Eq. 
6.30) is not the electron mass but instead is the reduced mass of the proton-electron system, calculated from the electron mass $m_e$ and proton mass $m_p$ according to

$$m = \frac{m_e m_p}{m_e + m_p} \tag{6.44}$$

FIGURE 6.23 (Top) A large quantum atom. Photons are emitted in discrete transitions as the electron jumps to lower states. (Bottom) A classical atom. Photons are emitted continuously by the accelerated electron.

The reduced mass is just slightly smaller than the electron mass and has the effect of decreasing the energy and frequency or increasing the wavelength by about 0.05%. Equivalently, in Eq. 6.33 for the wavelengths we can replace the Rydberg constant $R_\infty$ (so called because it would be correct if the proton mass were infinite) with the value $R = R_\infty/(1 + m_e/m_p)$, which follows because the Rydberg constant is proportional to the mass $m$ (Eq. 6.34) and $m = m_e/(1 + m_e/m_p)$.

2. Wavelengths in Air. Another small but easily fixable error occurs when we convert the frequencies (Eq. 6.32) calculated directly from the Bohr energy levels to wavelengths (Eq. 6.33). The wavelength measurements are normally done in air, so we should calculate the wavelength as $\lambda = v_{\text{air}}/f$, where $v_{\text{air}}$ is the speed of light in air. This has the effect of decreasing the calculated wavelengths by about 0.03% (to some extent offsetting the error we made by ignoring the motion of the proton).

3. Angular Momentum. A serious failure of the Bohr model is that it gives incorrect predictions for the angular momentum of the electron. In Bohr’s theory, the orbital angular momentum is quantized in integer multiples of $\hbar$, which is correct. However, for the ground state of hydrogen ($n = 1$), the Bohr theory gives $L = \hbar$, while experiment clearly shows $L = 0$.

4. Uncertainty. Another deficiency of the model is that it violates the uncertainty relationship. (In Bohr’s defense, remember that the model was developed a decade before the introduction of wave mechanics, with its accompanying ideas of uncertainty.) Suppose the electron orbits in the $xy$ plane.
In this case we know exactly its $z$ coordinate (in the $xy$ plane, $z = 0$ and so $\Delta z = 0$) and the $z$ component of its momentum (also precisely zero, so $\Delta p_z = 0$). Such an atom would therefore violate the uncertainty relationship $\Delta z\,\Delta p_z \geq \hbar$. In fact, as we discuss in the next chapter, quantum mechanics introduces a degree of “fuzziness” to the behavior of electrons in atoms that is not consistent with any orbit in a single plane.

In spite of its successes, the Bohr model is at best an incomplete model. It is useful only for atoms that contain one electron (hydrogen, singly ionized helium, doubly ionized lithium, and so forth), but not for atoms with two or more electrons, because we have considered only the force between electron and nucleus, and not the force between the electrons themselves. Furthermore, if we look very carefully at the emission spectrum, we find that many lines are in fact not single lines, but very closely spaced combinations of two or more lines; the Bohr model is unable to account for these doublets of spectral lines. The model is also limited in its usefulness as a basis from which to calculate other properties of the atom; although we can accurately calculate the energies of the spectral lines, we cannot calculate their intensities. For example, how often will an electron in the $n = 3$ state jump directly to the $n = 1$ state, emitting the corresponding photon, and how often will it jump first to the $n = 2$ state and then to the $n = 1$ state, emitting two photons? A complete theory should provide a way to calculate this property.

We do not wish, however, to discard the model completely. The Bohr model provides a useful starting point in our study of atoms, and Bohr introduced several ideas (stationary states, quantization of angular momentum, correspondence principle) that carry over into the correct quantum-mechanical calculation.
There are many atomic properties, especially those associated with magnetism, that can be simply modeled on the basis of Bohr orbits. Most remarkably, when we treat the hydrogen atom correctly in the next chapter using quantum mechanics, we find that the energy levels calculated by solving the Schrödinger equation are in fact identical with those of the Bohr model.

Chapter Summary

Scattering impact parameter (Section 6.3): $b = \dfrac{zZ}{2K}\,\dfrac{e^2}{4\pi\varepsilon_0}\cot\dfrac{\theta}{2}$
Fraction scattered at angles $> \theta$ (Section 6.3): $f_{>\theta} = nt\pi b^2$
Rutherford scattering formula (Section 6.3): $N(\theta) = \dfrac{nt}{4r^2}\left(\dfrac{zZ}{2K}\right)^2\left(\dfrac{e^2}{4\pi\varepsilon_0}\right)^2\dfrac{1}{\sin^4\frac{1}{2}\theta}$
Distance of closest approach (Section 6.3): $d = \dfrac{1}{4\pi\varepsilon_0}\,\dfrac{zZe^2}{K}$
Balmer formula (Section 6.4): $\lambda = (364.5\ \text{nm})\,\dfrac{n^2}{n^2 - 4}$ $(n = 3, 4, 5, \ldots)$
Radii of Bohr orbits in hydrogen (Section 6.5): $r_n = \dfrac{4\pi\varepsilon_0\hbar^2}{me^2}\,n^2 = a_0 n^2$ $(n = 1, 2, 3, \ldots)$
Energies of Bohr orbits in hydrogen (Section 6.5): $E_n = -\dfrac{me^4}{32\pi^2\varepsilon_0^2\hbar^2}\,\dfrac{1}{n^2} = \dfrac{-13.60\ \text{eV}}{n^2}$ $(n = 1, 2, 3, \ldots)$
Excitation energy of level $n$ (Section 6.5): $E_n - E_1$
Binding (or ionization) energy of level $n$ (Section 6.5): $|E_n|$
Hydrogen wavelengths in Bohr model (Section 6.5): $\lambda = \dfrac{64\pi^3\varepsilon_0^2\hbar^3 c}{me^4}\left(\dfrac{n_1^2 n_2^2}{n_1^2 - n_2^2}\right) = \dfrac{1}{R_\infty}\left(\dfrac{n_1^2 n_2^2}{n_1^2 - n_2^2}\right)$
Single-electron atoms with $Z > 1$ (Section 6.5): $r_n = \dfrac{a_0 n^2}{Z}$, $E_n = -(13.60\ \text{eV})\,\dfrac{Z^2}{n^2}$
Reduced mass of proton-electron system (Section 6.8): $m = \dfrac{m_e m_p}{m_e + m_p}$

Questions

1. Does the Thomson model fail at large scattering angles or at small scattering angles? Why?
2. What principles of physics would be violated if we scattered a beam of alpha particles with a single impact parameter from a single target atom at rest?
3. Could we use the Rutherford scattering formula to analyze the scattering of (a) protons incident on iron? (b) Alpha particles incident on lithium ($Z = 3$)? (c) Silver nuclei incident on gold? (d) Hydrogen atoms incident on gold? (e) Electrons incident on gold?
4. What determines the angular range $d\theta$ in the alpha-particle scattering experiment (Figure 6.8)?
5. Why didn’t Bohr use the concept of de Broglie waves in his theory?
6.
In which Bohr orbit does the electron have the largest velocity? Are we justified in treating the electron nonrelativistically in that case?
7. How does an electron in hydrogen get from $r = 4a_0$ to $r = a_0$ without being anywhere in between?
8. How is the quantization of the energy in the hydrogen atom similar to the quantization of the systems discussed in Chapter 5? How is it different? Do the quantizations originate from similar causes?
9. In a Bohr atom, an electron jumps from state $n_1$, with angular momentum $n_1\hbar$, to state $n_2$, with angular momentum $n_2\hbar$. How can an isolated system change its angular momentum? (In classical physics, a change in angular momentum requires an external torque.) Can the photon carry away the difference in angular momentum? Estimate the maximum angular momentum, relative to the center of the atom, that the photon can have. Does this suggest another failure of the Bohr model?
10. The product $E_n r_n$ for the hydrogen atom is (1) independent of Planck’s constant and (2) independent of the quantum number $n$. Does this observation have any significance? Is this a classical or a quantum effect?
11. (a) How does a Bohr atom violate the position-momentum uncertainty relationship? (b) How does a Bohr atom violate the energy-time uncertainty relationship? (What is $\Delta E$? What does this imply about $\Delta t$? What do you conclude about transitions between levels?)
12. List the assumptions made in deriving the Bohr theory. Which of these are a result of neglecting small quantities? Which of these violate basic principles of relativity or quantum physics?
13. List the assumptions made in deriving the Rutherford scattering formula. Which of these are a result of neglecting small quantities? Which of these violate basic principles of relativity or quantum physics?
14. In both the Rutherford theory and the Bohr theory, we used the classical expression for the kinetic energy.
Estimate the velocity of an electron in the Bohr atom and of an alpha particle in a typical scattering experiment, and decide whether the use of the classical formula is justified.
15. In both the Rutherford theory and the Bohr theory, we neglected any wave properties of the particles. Estimate the de Broglie wavelength of an electron in a Bohr atom and compare it with the size of the atom. Estimate the de Broglie wavelength of an alpha particle and compare it with the size of the nucleus. Is the wave behavior expected to be important in either case?
16. What is the distinction between binding energy and ionization energy? Between binding energy and excitation energy? If you were given the value of the binding energy of a level in hydrogen, could you find its excitation energy without knowing which level it is?
17. Why are the decreases in current in the Franck-Hertz experiment not sharp?
18. As indicated by the Franck-Hertz experiment, the first excited state of mercury is at an energy of 4.9 eV. Do you expect mercury to show absorption lines in the visible spectrum?
19. Is the correspondence principle a necessary part of quantum physics or is it merely an accidental agreement of two formulas? Where do we draw the line between the world of quantum physics and the world of classical, nonquantum physics?

Problems

6.1 Basic Properties of Atoms
1. Electrons in atoms are known to have kinetic energies in the range of a few eV. Show that the uncertainty principle allows electrons of this energy to be confined in a region the size of an atom (0.1 nm).

6.2 Scattering Experiments and the Thomson Model
2. Consider an electron in Figure 6.1 embedded in a sphere of positive charge Ze at a distance r from its center. (a) Using Gauss’s law, show that the electric field on the electron due to the positive charge is
$$E = \frac{1}{4\pi\varepsilon_0}\,\frac{Ze}{R^3}\,r$$
(b) For this electric field, show that the force on the electron is given by Eq. 6.2.
3. (a) Compute the oscillation frequency of the electron and the expected absorption or emission wavelength in a Thomson-model hydrogen atom. Use R = 0.053 nm. Compare with the observed wavelength of the strongest emission and absorption line in hydrogen, 122 nm. (b) Repeat for sodium (Z = 11). Use R = 0.18 nm. Compare with the observed wavelength, 590 nm.
4. Consider the Thomson model for an atom with 2 electrons. Let the electrons be located along a diameter on opposite sides of the center of the sphere, each a distance x from the center. (a) Show that the configuration is stable if x = R/2. (b) Try to construct similar stable configurations for atoms with 3, 4, 5, and 6 electrons.

6.3 The Rutherford Nuclear Atom
5. Alpha particles of kinetic energy 5.00 MeV are scattered at 90° by a gold foil. (a) What is the impact parameter? (b) What is the minimum distance between alpha particles and gold nucleus? (c) Find the kinetic and potential energies at that minimum distance.
6. How much kinetic energy must an alpha particle have before its distance of closest approach to a gold nucleus is equal to the nuclear radius ($7.0 \times 10^{-15}$ m)?
7. What is the distance of closest approach when alpha particles of kinetic energy 6.0 MeV are scattered by a thin copper foil?
8. Protons of energy 5.0 MeV are incident on a silver foil of thickness $4.0 \times 10^{-6}$ m. What fraction of the incident protons is scattered at angles: (a) Greater than 90°? (b) Greater than 10°? (c) Between 5° and 10°? (d) Less than 5°?
9. Protons are incident on a copper foil 12 μm thick. (a) What should the proton kinetic energy be in order that the distance of closest approach equal the nuclear radius (5.0 fm)? (b) If the proton energy were 7.5 MeV, what is the impact parameter for scattering at 120°? (c) What is the minimum distance between proton and nucleus for this case? (d) What fraction of the protons is scattered beyond 120°?
10.
Alpha particles of kinetic energy K are scattered either from a gold foil or a silver foil of identical thickness. What is the ratio of the number of particles scattered at angles greater than 90° by the gold foil to the same number for the silver foil?
11. The maximum kinetic energy given to the target nucleus will occur in a head-on collision with b = 0. (Why?) Estimate the maximum kinetic energy given to the target nucleus when 8.0 MeV alpha particles are incident on a gold foil. Are we justified in neglecting this energy?
12. The maximum kinetic energy that an alpha particle can transmit to an electron occurs during a head-on collision. Compute the kinetic energy lost by an alpha particle of kinetic energy 8.0 MeV in a head-on collision with an electron at rest. Are we justified in neglecting this energy in the Rutherford theory?
13. Alpha particles of energy 9.6 MeV are incident on a silver foil of thickness 7.0 μm. For a certain value of the impact parameter, the alpha particles lose exactly half their incident kinetic energy when they reach their minimum separation from the nucleus. Find the minimum separation, the impact parameter, and the scattering angle.
14. Alpha particles of kinetic energy 6.0 MeV are incident at a rate of $3.0 \times 10^{7}$ per second on a gold foil of thickness $3.0 \times 10^{-6}$ m. A circular detector of diameter 1.0 cm is placed 12 cm from the foil at an angle of 30° with the direction of the incident alpha particles. At what rate does the detector measure scattered alpha particles?

6.4 Line Spectra
15. The shortest wavelength of the hydrogen Lyman series is 91.13 nm. Find the three longest wavelengths in this series.
16. One of the lines in the Brackett series (series limit = 1458 nm) has a wavelength of 1944 nm. Find the next higher and next lower wavelengths in this series.
17. The longest wavelength in the Pfund series is 7459 nm. Find the series limit.

6.5 The Bohr Model
18. In the n = 3 state of hydrogen, find the electron’s velocity, kinetic energy, and potential energy.
19. Use the Bohr theory to find the series wavelength limits of the Lyman and Paschen series of hydrogen.
20. (a) Show that the speed of an electron in the nth Bohr orbit of hydrogen is $\alpha c/n$, where $\alpha$ is the fine structure constant, equal to $e^2/4\pi\varepsilon_0\hbar c$. (b) What would be the speed in a hydrogenlike atom with a nuclear charge of Ze?
21. An electron is in the n = 5 state of hydrogen. To what states can the electron make transitions, and what are the energies of the emitted radiations?
22. Continue Figure 6.20, showing the transitions of the Paschen series and computing their energies and wavelengths.
23. A collection of hydrogen atoms in the ground state is illuminated with ultraviolet light of wavelength 59.0 nm. Find the kinetic energy of the emitted electrons.
24. Find the ionization energy of: (a) the n = 3 level of hydrogen; (b) the n = 2 level of He$^{+}$ (singly ionized helium); (c) the n = 4 level of Li$^{++}$ (doubly ionized lithium).
25. Use the Bohr formula to find the energy differences $\Delta E(n_1 \to n_2) = E_{n_1} - E_{n_2}$ and show that (a) $\Delta E(4 \to 2) = \Delta E(4 \to 3) + \Delta E(3 \to 2)$; (b) $\Delta E(4 \to 1) = \Delta E(4 \to 2) + \Delta E(2 \to 1)$. (c) Interpret these results based on the Ritz combination principle.
26. Find the shortest and the longest wavelengths of the Lyman series of singly ionized helium.
27. Draw an energy-level diagram showing the lowest four levels of singly ionized helium. Show all possible transitions from the levels and label each transition with its wavelength.
28. A long time ago, in a galaxy far, far away, electric charge had not yet been invented, and atoms were held together by gravitational forces. Compute the Bohr radius and the n = 2 to n = 1 transition energy in a gravitationally bound hydrogen atom.
29. An alternative development of the Bohr theory begins by assuming that the stationary states are those for which the circumference of the orbit is an integral number of de Broglie wavelengths. (a) Show that this condition leads to standing de Broglie waves around the orbit. (b) Show that this condition gives the angular momentum condition, Eq. 6.26, used in the Bohr theory.

6.6 The Franck-Hertz Experiment
30. A hypothetical atom has only two excited states, at 4.0 and 7.0 eV, and has a ground-state ionization energy of 9.0 eV. If we used a vapor of such atoms for the Franck-Hertz experiment, for what voltages would we expect to see decreases in the current? List all voltages up to 20 V.
31. The first excited state of sodium decays to the ground state by emitting a photon of wavelength 590 nm. If sodium vapor is used for the Franck-Hertz experiment, at what voltage will the first current drop be recorded?

6.7 The Correspondence Principle
32. Suppose all of the excited levels of hydrogen had lifetimes of $10^{-8}$ s. As we go to higher and higher excited states, they get closer and closer together, and soon they are so close in energy that the energy uncertainty of each state becomes as large as the energy spacing between states, and we can no longer resolve individual states. Find the value of n for which this occurs. What is the radius of such an atom?
33. Compare the frequency of revolution of an electron with the frequency of the photons emitted in transitions from n to n − 1 for (a) n = 10; (b) n = 100; (c) n = 1000; (d) n = 10,000.

6.8 Deficiencies of the Bohr Model
34. What is the difference in wavelength between the first line of the Balmer series in ordinary hydrogen (M = 1.007825 u) and in “heavy” hydrogen (M = 2.014102 u)?

General Problems
35. A hydrogen atom is in the n = 6 state. (a) Counting all possible paths, how many different photon energies can be emitted if the atom ends up in the ground state? (b) Suppose only $\Delta n = 1$ transitions were allowed. How many different photon energies would be emitted?
(c) How many different photon energies would occur in a Thomson-model hydrogen atom?
36. An electron is in the n = 8 level of ionized helium. (a) Find the three longest wavelengths that are emitted when the electron makes a transition from the n = 8 level to a lower level. (b) Find the shortest wavelength that can be emitted. (c) Find the three longest wavelengths at which the electron in the n = 8 level will absorb a photon and move to a higher state, if we could somehow keep it in that level long enough to absorb. (d) Find the shortest wavelength that can be absorbed.
37. The lifetimes of the levels in a hydrogen atom are of the order of $10^{-8}$ s. Find the energy uncertainty of the first excited state and compare it with the energy of the state.
38. The following wavelengths are found among the many radiations emitted by singly ionized helium: 24.30 nm, 25.63 nm, 102.5 nm, 320.4 nm. If we group the transitions in helium as we did in hydrogen by identifying the final state $n_0$ and initial state n, to which series does each transition belong?
39. Adjacent wavelengths 72.90 nm and 54.00 nm are found in one series of transitions among the radiations emitted by doubly ionized lithium. Find the value of $n_0$ for this series and find the next wavelength in the series.
40. When an atom emits a photon in a transition from a state of energy $E_1$ to a state of energy $E_2$, the photon energy is not precisely equal to $E_1 - E_2$. Conservation of momentum requires that the atom must recoil, and so some energy must go into recoil kinetic energy $K_R$. Show that $K_R \cong (E_1 - E_2)^2/2Mc^2$, where M is the mass of the atom. Evaluate this recoil energy for the n = 2 to n = 1 transition of hydrogen.
41. In a muonic atom, the electron is replaced by a negatively charged particle called the muon. The muon mass is 207 times the electron mass. (a) Ignoring the correction for finite nuclear mass, what is the shortest wavelength of the Lyman series in a muonic hydrogen atom?
In what region of the electromagnetic spectrum does this belong? (b) How large is the correction for the finite nuclear mass in this case? (See the discussion at the beginning of Section 6.8.)
42. Consider an atom in which the single electron is replaced by a negatively charged muon ($m_\mu = 207m_e$). What is the radius of the first Bohr orbit of a muonic lead atom (Z = 82)? Compare with the nuclear radius of about 7 fm.

Chapter 7 THE HYDROGEN ATOM IN WAVE MECHANICS

These computer-generated distributions represent the probability to locate the electron in the n = 8 state of hydrogen for angular momentum quantum number l = 2 (top) and l = 6 (bottom). The nucleus is at the center, and the height at any point gives the probability to find the electron in a small volume element at that location in the xz plane. This way of describing the motion of an electron in hydrogen is very different from the circular orbits of the Bohr model.

In this chapter we study the solutions of the Schrödinger equation for the hydrogen atom. We will see that these solutions, which lead to the same energy levels calculated in the Bohr model, differ from the Bohr model by allowing for the uncertainty in localizing the electron. Other deficiencies of the Bohr model are not so easily eliminated by solving the Schrödinger equation. First, the so-called “fine structure” of the spectral lines (the splitting of the lines into close-lying doublets) cannot be explained by our solutions; the proper explanation of this effect requires the introduction of a new property of the electron, the intrinsic spin. Second, the mathematical difficulties of solving the Schrödinger equation for atoms containing two or more electrons are formidable, so we restrict our discussion in this chapter to one-electron atoms, in order to see how wave mechanics enables us to understand some basic atomic properties.
In the next chapter we discuss the structure of many-electron atoms.

7.1 A ONE-DIMENSIONAL ATOM

Quantum mechanics gives us a view of the structure of the hydrogen atom that is very different from the Bohr model. In the Bohr model, the electron moves about the proton in a circular orbit. Quantum mechanics, on the other hand, does not allow a fixed radius or a fixed orbital plane but instead describes the electron in terms of a probability density, which leads to an uncertainty in locating the electron.

To analyze the hydrogen atom according to quantum mechanics, we must solve the Schrödinger equation for the Coulomb potential energy of the proton and the electron:
$$U(r) = -\frac{e^2}{4\pi\varepsilon_0 r} \tag{7.1}$$
Eventually we will discuss the solutions to this three-dimensional problem for the hydrogen atom using spherical polar coordinates, but for now let’s look at the simpler one-dimensional problem, in which a proton is fixed at the origin (x = 0) and an electron moves along the positive x axis. (This doesn’t represent a real atom, but it does show how some properties of electron wave functions in atoms emerge from solving the Schrödinger equation.) In one dimension, the Schrödinger equation for an electron with potential energy $U(x) = -e^2/4\pi\varepsilon_0 x$ would then be
$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} - \frac{e^2}{4\pi\varepsilon_0 x}\,\psi(x) = E\psi(x) \tag{7.2}$$
For a bound state, the wave function must fall to zero as $x \to \infty$. Moreover, in order for the second term on the left side to remain finite at x = 0, the wave function must be zero at x = 0. The simplest function that satisfies both of these requirements is $\psi(x) = Axe^{-bx}$, where A is the normalization constant. By substituting this trial wave function into Eq. 7.2, we find a solution when $b = me^2/4\pi\varepsilon_0\hbar^2 = 1/a_0$ (where $a_0$ is the Bohr radius defined in Eq. 6.29). The energy corresponding to this wave function is $E = -\hbar^2 b^2/2m = -me^4/32\pi^2\varepsilon_0^2\hbar^2$, which happens by chance to be identical to the energy of the ground state in the Bohr model (Eq. 6.30 for n = 1).
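The substitution of the trial wave function into Eq. 7.2 can also be checked numerically. The sketch below is only an illustration under stated assumptions: lengths are measured in units of $a_0$ and energies in units of $\hbar^2/ma_0^2$ (twice the 13.60 eV of the Bohr ground state), so that Eq. 7.2 reads $-\tfrac{1}{2}\psi'' - \psi/x = E\psi$. The function names `psi` and `local_energy` are hypothetical helpers, and the second derivative is approximated by a central difference rather than computed exactly.

```python
import math

# Assumed scaled units for this sketch: x in units of a0, energy in units of
# hbar^2 / (m a0^2). In these units Eq. 7.2 becomes -(1/2) psi'' - psi/x = E psi.
ENERGY_UNIT_EV = 2 * 13.60  # twice the Bohr ground-state binding energy


def psi(x, b=1.0):
    """Unnormalized trial wave function x * exp(-b x)."""
    return x * math.exp(-b * x)


def local_energy(x, b=1.0, h=1e-4):
    """Return (H psi)/psi at x, using a central-difference second derivative."""
    d2 = (psi(x + h, b) - 2.0 * psi(x, b) + psi(x - h, b)) / h**2
    return (-0.5 * d2 - psi(x, b) / x) / psi(x, b)


if __name__ == "__main__":
    points = [0.5, 1.0, 2.0, 5.0]
    for b in (1.0, 0.8):
        values = [local_energy(x, b) for x in points]
        # A true solution gives the same value of E at every x.
        print("b =", b, [round(v * ENERGY_UNIT_EV, 2) for v in values], "eV")
```

For b = 1 (that is, $b = 1/a_0$) the ratio $(H\psi)/\psi$ comes out the same at every sampled point, which is what it means for $\psi$ to solve Eq. 7.2 with a single energy E. For any other b, such as the 0.8 used here, the ratio changes with x, so the function is not a solution.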
FIGURE 7.1 Wave functions and probability densities (shaded areas) for an electron bound in a one-dimensional Coulomb potential energy. The horizontal axis represents the distance between the proton and electron in units of $a_0$. (a) Ground state. (b) First excited state. (c) Second excited state.

Figure 7.1a shows this wave function and its corresponding probability density $|\psi(x)|^2$. There is clearly an uncertainty in specifying the location of the electron. The most probable region to find the electron is near $x = a_0$, but there is a nonzero probability for the electron to be anywhere in the range $0 < x < \infty$. This is very different from the Bohr model, in which the distance between the proton and electron is fixed at the value $a_0$.

Also shown in the figure are wave functions and probability densities corresponding to the first and second excited states. The wave functions have the oscillatory or wavelike property that we expect for quantum wave functions. As we go to higher excited states, there are more peaks in the probability density and the region of maximum probability moves to larger distances. These same features emerge from the solution to the three-dimensional problem. From this simple one-dimensional calculation (which does not in any way physically represent the real three-dimensional hydrogen atom) you can already see how quantum mechanics will resolve some of the difficulties associated with the Bohr model.

Example 7.1
Find the normalization constant of the ground state wave function for a particle trapped in the one-dimensional Coulomb potential energy.

Solution
The normalization integral (with $b = 1/a_0$) is
$$\int_0^\infty |\psi(x)|^2\,dx = A^2\int_0^\infty x^2 e^{-2x/a_0}\,dx = 1$$
The integration is in a standard form that is found in integral tables and that we will have occasion to use frequently in analyzing the hydrogen wave functions:
$$\int_0^\infty x^n e^{-cx}\,dx = \frac{n!}{c^{n+1}} \tag{7.3}$$
Using this standard form with n = 2 and $c = 2/a_0$, the normalization integral becomes
$$A^2\,\frac{2!}{(2/a_0)^3} = 1 \quad\text{or}\quad A = 2a_0^{-3/2}$$

Example 7.2
In the ground state of an electron bound in a one-dimensional Coulomb potential energy, what is the probability to find the electron located between x = 0 and $x = a_0$?

Solution
The probability can be found using Eq. 5.10:
$$P(0 : a_0) = \int_0^{a_0} |\psi(x)|^2\,dx = \frac{4}{a_0^3}\int_0^{a_0} x^2 e^{-2x/a_0}\,dx$$
with the normalization constant from Example 7.1. The integral is a standard form that we will later find useful for analyzing the hydrogen wave functions:
$$\int x^n e^{-cx}\,dx = -\frac{e^{-cx}}{c}\left[x^n + \frac{n x^{n-1}}{c} + \frac{n(n-1)x^{n-2}}{c^2} + \cdots + \frac{n!}{c^n}\right] \tag{7.4}$$
The probability is then
$$P(0 : a_0) = \frac{4}{a_0^3}\left[-\frac{e^{-2x/a_0}}{2/a_0}\left(x^2 + \frac{2x}{2/a_0} + \frac{2}{(2/a_0)^2}\right)\right]_0^{a_0} = 0.323$$

7.2 ANGULAR MOMENTUM IN THE HYDROGEN ATOM

Angular momentum played a significant role in Bohr’s analysis of the structure of the hydrogen atom. Bohr was able to obtain the correct energy levels by assuming that in the orbit with quantum number n, the angular momentum of the electron is equal to $n\hbar$. Bohr’s idea about the “quantization of angular momentum” turned out to have some correct features, but his analysis is not consistent with the actual quantum mechanical nature of angular momentum.

Angular Momentum of Classical Orbits

FIGURE 7.2 Planetary orbits of the same energy but different angular momentum L, ranging from L = maximum to L = 0. As L decreases, the elliptical orbits become longer and thinner.

Before considering the angular momentum of an orbiting electron, it is helpful to review how angular momentum affects classical orbits, such as those of planets or comets about the Sun. Classically, the angular momentum of a particle is represented by the vector $\vec{L} = \vec{r} \times \vec{p}$, where $\vec{r}$ is the position vector that locates the particle and $\vec{p}$ is its linear momentum. The direction of $\vec{L}$ is perpendicular to the plane of the orbit.
Along with the energy, the angular momentum remains constant as the planet orbits. The total energy of the orbital motion determines the average distance of the planet from the Sun. For a given total energy, many different orbits are possible, from the nearly circular orbit of the Earth to the highly elongated elliptical orbits of the comets. These orbits differ in their angular momentum L, which is largest for the circular orbit and smallest for the elongated ellipse. Figure 7.2 shows a variety of planetary orbits having the same total energy but different angular momentum.

The complete specification of the orbit requires that we give not only the magnitude of the angular momentum vector but also its direction; this direction identifies the plane of the orbit. To completely describe the angular momentum vector requires three numbers; for example, we might give the three components of $\vec{L}$ $(L_x, L_y, L_z)$. Equivalently, we might give the magnitude L of the vector and two angular coordinates that give its direction (similar to latitude and longitude on a sphere).

Angular Momentum in Quantum Mechanics

Quantum mechanics gives us a very different view of angular momentum. The angular momentum properties of a three-dimensional wave function are described by two quantum numbers. The first is the angular momentum quantum number l. This quantum number determines the length of the angular momentum vector:
$$|\vec{L}| = \sqrt{l(l+1)}\,\hbar \qquad (l = 0, 1, 2, \ldots) \tag{7.5}$$
Note that this is very different from the Bohr condition $|\vec{L}| = n\hbar$. In particular, it is possible for the quantum vector to have a length of zero, but in the Bohr model the minimum length is $\hbar$.

The second number that we use to describe angular momentum in quantum mechanics is the magnetic quantum number $m_l$. This quantum number tells us about one component of the angular momentum vector, which we usually choose to be the z component.
The relationship between the z component of $\vec{L}$ and the magnetic quantum number is
$$L_z = m_l\hbar \qquad (m_l = 0, \pm 1, \pm 2, \ldots, \pm l) \tag{7.6}$$
Note that for each value of l there are 2l + 1 possible values of $m_l$.

Unlike the classical angular momentum vector, for which we provide an exact specification by giving three numbers, the quantum angular momentum is described by only two numbers. Clearly two numbers cannot completely identify a vector in three-dimensional space, so something is missing from our description of the quantum angular momentum. As we discuss later, this missing part of the description of the quantum angular momentum vector is directly related to the application of the uncertainty principle to angular momentum.

Example 7.3
Compute the length of the angular momentum vectors that represent the orbital motion of an electron in a quantum state with l = 1 and in another state with l = 2.

Solution
Equation 7.5 gives the relationship between the length of the vector and the angular momentum quantum number l. For l = 1,
$$|\vec{L}| = \sqrt{1(1+1)}\,\hbar = \sqrt{2}\,\hbar$$
and for l = 2,
$$|\vec{L}| = \sqrt{2(2+1)}\,\hbar = \sqrt{6}\,\hbar$$

Example 7.4
What are the possible z components of the vector $\vec{L}$ that represents the orbital angular momentum of a state with l = 2?

Solution
The possible $m_l$ values for l = 2 are +2, +1, 0, −1, −2, and so the $\vec{L}$ vector can have any of five possible z components: $L_z = 2\hbar$, $\hbar$, 0, $-\hbar$, or $-2\hbar$. The length of the vector $\vec{L}$, as we found previously, is $\sqrt{6}\,\hbar$.

FIGURE 7.3 The orientations in space and z components of a vector with l = 2 and $|\vec{L}| = \sqrt{6}\,\hbar$. There are five different possible orientations, with $L_z = +2\hbar, +\hbar, 0, -\hbar, -2\hbar$ for $m_l = +2, +1, 0, -1, -2$.

The components of the vector $\vec{L}$ for l = 2 are illustrated in Figure 7.3. Each orientation in space of the vector $\vec{L}$ corresponds to a different $m_l$ value. The polar angle θ that the vector $\vec{L}$ makes with the z axis can be found by referring to the figure. With $L_z = |\vec{L}|\cos\theta$, we have
$$\cos\theta = \frac{L_z}{|\vec{L}|} = \frac{m_l}{\sqrt{l(l+1)}} \tag{7.7}$$
using Eq. 7.6 for $L_z$ and Eq. 7.5 for $|\vec{L}|$.

This behavior represents a curious aspect of quantum mechanics called spatial quantization: only certain orientations of angular momentum vectors are allowed. The number of these orientations is equal to 2l + 1 (the number of different possible $m_l$ values) and the magnitudes of their successive z components always differ by $\hbar$. For example, an angular momentum state with l = 1 can have $m_l$ values of +1, 0, or −1 (corresponding to z components $L_z = +\hbar$, 0, $-\hbar$) and thus $\cos\theta = +1/\sqrt{2}$, 0, or $-1/\sqrt{2}$. The $\vec{L}$ vector in this case can have one of only three possible orientations relative to the z axis, corresponding to angles of 45°, 90°, or 135°. This is in contrast to a classical angular momentum vector, which can have any possible orientation in space; that is, the angle between a classical angular momentum vector and the z axis can take any value between 0 and 180°.

The Angular Momentum Uncertainty Relationship

In quantum mechanics, the maximum amount of permitted information about the angular momentum vector is its length (given by Eq. 7.5) and its z component (given by Eq. 7.6). Because the complete description of a vector requires three numbers, we are always missing some information about the angular momentum of a quantum state. If we specify $|\vec{L}|$ and $L_z$ exactly, then we have no information about the other components of $\vec{L}$ ($L_x$ and $L_y$). Any possible outcome of a measurement of $L_x$ or $L_y$ can therefore occur (as long as $|\vec{L}|^2 = L_x^2 + L_y^2 + L_z^2$). In graphic terms, we can imagine that the tip of the $\vec{L}$ vector rotates or precesses about the z axis so that $L_z$ remains fixed but $L_x$ and $L_y$ are undetermined, as in Figure 7.4. This rotation cannot be directly measured; all we can observe is the “smeared out” distribution of values of $L_x$ and $L_y$.
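The allowed orientations described by Eq. 7.7 are easy to tabulate for any l. The following is a minimal sketch, not part of the original text: the helper name `orientations` is hypothetical, angular momenta are expressed in units of $\hbar$, and the case l = 0 (a vector of zero length, for which no angle is defined) is reported as `None` by assumption.

```python
import math


def orientations(l):
    """Return a list of (m_l, polar angle in degrees) for every allowed m_l.

    Uses |L| = sqrt(l(l+1)) hbar (Eq. 7.5), L_z = m_l hbar (Eq. 7.6) and
    cos(theta) = m_l / sqrt(l(l+1)) (Eq. 7.7). For l = 0 the vector has zero
    length, so the angle is returned as None.
    """
    length = math.sqrt(l * (l + 1))        # |L| in units of hbar
    rows = []
    for ml in range(l, -l - 1, -1):        # m_l = +l, l-1, ..., -l
        theta = math.degrees(math.acos(ml / length)) if length > 0 else None
        rows.append((ml, theta))
    return rows


if __name__ == "__main__":
    for l in (1, 2):
        print("l =", l, "|L|/hbar =", round(math.sqrt(l * (l + 1)), 4))
        for ml, theta in orientations(l):
            print("   m_l = %+d   L_z = %+d hbar   theta = %.2f deg" % (ml, ml, theta))
```

For l = 1 this reproduces the three angles quoted in the text (45°, 90°, and 135°, up to floating-point rounding), and for l = 2 it lists the five orientations of Figure 7.3. Because $m_l$ never exceeds l while the length is $\sqrt{l(l+1)}$, the smallest angle is always greater than zero.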
There is thus an uncertainty or indeterminacy in specifying $\vec{L}$ that is summarized by another form of the uncertainty principle:
$$\Delta L_z\,\Delta\phi \ge \hbar \tag{7.8}$$
where φ is the azimuthal angle shown in Figure 7.4. If we know $L_z$ exactly ($\Delta L_z = 0$), then we have no knowledge at all of the angle φ; all values are equally probable. This is equivalent to saying that we know nothing at all about $L_x$ and $L_y$; whenever one component of $\vec{L}$ is determined, the other components are completely undetermined.

FIGURE 7.4 The vector $\vec{L}$ precesses rapidly about the z axis, so that $L_z$ stays constant, but $L_x$ and $L_y$ are indeterminate.

On the other hand, if we try to construct an angular momentum state in which a different component (for example, $L_x$) is completely specified (so that φ would be known), the state becomes a mixture or superposition of different $L_z$ values. In effect, we can reduce the uncertainty in φ only at the expense of increasing the uncertainty in $L_z$. This is exactly the same type of behavior that was described by the other forms of the uncertainty principle; for example, reducing the uncertainty in x is always accompanied by an increase in the uncertainty in $p_x$.

From this discussion you can see why the length of the angular momentum is defined according to Eq. 7.5 and why, for example, we could not have simply defined the length as $|\vec{L}| = l\hbar$. If this were possible, then when $m_l$ had its maximum value ($m_l = +l$), we would have $L_z = m_l\hbar = l\hbar$; the length of the vector would then be equal to its z component, and so it must lie along the z axis with $L_x = L_y = 0$. However, this simultaneous exact knowledge of all three components of $\vec{L}$ violates the angular momentum form of the uncertainty principle, and therefore this situation is not permitted to occur. It is therefore necessary for the length of $\vec{L}$ to be greater than $l\hbar$.

7.3 THE HYDROGEN ATOM WAVE FUNCTIONS

To find the complete spatial description of the electron in a hydrogen atom, we must obtain three-dimensional wave functions. The Schrödinger equation in three-dimensional Cartesian coordinates has the following form:
$$-\frac{\hbar^2}{2m}\left(\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2}\right) + U(x, y, z)\,\psi(x, y, z) = E\psi(x, y, z) \tag{7.9}$$
where ψ is a function of x, y, and z. The usual procedure for solving a partial differential equation of this type is to separate the variables by replacing a function of three variables with the product of three functions of one variable, for example, $\psi(x, y, z) = X(x)Y(y)Z(z)$. However, the Coulomb potential energy (Eq. 7.1) written in Cartesian coordinates, $U(x, y, z) = -e^2/4\pi\varepsilon_0\sqrt{x^2 + y^2 + z^2}$, does not lead to a separable solution.

For this calculation, it is more convenient to work in spherical polar coordinates (r, θ, φ) instead of Cartesian coordinates (x, y, z). The variables of spherical polar coordinates are illustrated in Figure 7.5. This simplification in the solution is at the expense of an increased complexity of the Schrödinger equation, which becomes:
$$-\frac{\hbar^2}{2m}\left[\frac{\partial^2\psi}{\partial r^2} + \frac{2}{r}\frac{\partial\psi}{\partial r} + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\psi}{\partial\phi^2}\right] + U(r)\,\psi(r, \theta, \phi) = E\psi(r, \theta, \phi) \tag{7.10}$$
where now ψ is a function of the spherical polar coordinates r, θ, and φ. When the potential energy depends only on r (and not on θ or φ), as is the case for the Coulomb potential energy, we can find solutions that are separable and can be factored as
$$\psi(r, \theta, \phi) = R(r)\,\Theta(\theta)\,\Phi(\phi) \tag{7.11}$$
where the radial function R(r), the polar function Θ(θ), and the azimuthal function Φ(φ) are each functions of a single variable. This procedure gives three differential equations, each of a single variable (r, θ, φ).

FIGURE 7.5 Spherical polar coordinates for the hydrogen atom. The proton is at the origin and the electron is at a radius r, in a direction determined by the polar angle θ and the azimuthal angle φ.

The quantum state of a particle that moves in a potential energy that depends only on r can be described by angular momentum quantum numbers l and $m_l$. The polar and azimuthal solutions are given by combinations of standard trigonometric functions. The remaining radial function is then obtained from solving the radial equation:
$$-\frac{\hbar^2}{2m}\left(\frac{d^2R}{dr^2} + \frac{2}{r}\frac{dR}{dr}\right) + \left(-\frac{e^2}{4\pi\varepsilon_0 r} + \frac{l(l+1)\hbar^2}{2mr^2}\right)R(r) = ER(r) \tag{7.12}$$
The mass that appears in this equation is the reduced mass of the proton-electron system defined in Eq. 6.44.

Quantum Numbers and Wave Functions

When we solve a three-dimensional equation such as the Schrödinger equation, three parameters emerge in a natural way as indices or labels for the solutions, just as the single index n emerged from our solution of the one-dimensional infinite well in Section 5.4. These indices are the three quantum numbers that label the solutions. The three quantum numbers that emerge from the solutions and their allowed values are:

n | principal quantum number | 1, 2, 3, . . .
l | angular momentum quantum number | 0, 1, 2, . . . , n − 1
$m_l$ | magnetic quantum number | 0, ±1, ±2, . . . , ±l

The principal quantum number n is identical to the quantum number n that we obtained in the Bohr model. It determines the quantized energy levels:
$$E_n = -\frac{me^4}{32\pi^2\varepsilon_0^2\hbar^2}\,\frac{1}{n^2} \tag{7.13}$$
which is identical to Eq. 6.30. Note that the energy depends only on n and not on the other quantum numbers l or $m_l$. The permitted values of the angular momentum quantum number l are limited by n (l ranges from 0 to n − 1) and those of the magnetic quantum number $m_l$ are limited by l. Complete with quantum numbers, the separated solutions of Eq. 7.10 can be written
$$\psi_{n,l,m_l}(r, \theta, \phi) = R_{n,l}(r)\,\Theta_{l,m_l}(\theta)\,\Phi_{m_l}(\phi) \tag{7.14}$$
The indices (n, l, $m_l$) are the three quantum numbers that are necessary to describe the solutions.
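The rules for the allowed quantum numbers can be turned directly into a short enumeration. The sketch below is an illustration only: the function names `allowed_states` and `energy_ev` are hypothetical, and the energy uses the numerical form $E_n = -13.60\ \text{eV}/n^2$ quoted for the Bohr levels, which Eq. 7.13 reproduces.

```python
def allowed_states(n):
    """Return every (n, l, m_l) permitted for principal quantum number n.

    l runs from 0 to n - 1, and for each l, m_l runs from -l to +l.
    """
    return [(n, l, ml) for l in range(n) for ml in range(-l, l + 1)]


def energy_ev(n):
    """Energy of level n in eV; it depends on n only, not on l or m_l."""
    return -13.60 / n**2


if __name__ == "__main__":
    for n in (1, 2, 3):
        states = allowed_states(n)
        print("n = %d  E = %.2f eV  %d state(s): %s"
              % (n, energy_ev(n), len(states), states))
```

Running the loop lists one set of quantum numbers for n = 1, four for n = 2, and nine for n = 3. Every set sharing the same n has the same energy.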
TABLE 7.1 Some Hydrogen Atom Wave Functions

n | l | $m_l$ | R(r) | Θ(θ) | Φ(φ)
1 | 0 | 0 | $\dfrac{2}{a_0^{3/2}}\,e^{-r/a_0}$ | $\dfrac{1}{\sqrt{2}}$ | $\dfrac{1}{\sqrt{2\pi}}$
2 | 0 | 0 | $\dfrac{1}{(2a_0)^{3/2}}\left(2 - \dfrac{r}{a_0}\right)e^{-r/2a_0}$ | $\dfrac{1}{\sqrt{2}}$ | $\dfrac{1}{\sqrt{2\pi}}$
2 | 1 | 0 | $\dfrac{1}{\sqrt{3}\,(2a_0)^{3/2}}\,\dfrac{r}{a_0}\,e^{-r/2a_0}$ | $\sqrt{\dfrac{3}{2}}\cos\theta$ | $\dfrac{1}{\sqrt{2\pi}}$
2 | 1 | ±1 | $\dfrac{1}{\sqrt{3}\,(2a_0)^{3/2}}\,\dfrac{r}{a_0}\,e^{-r/2a_0}$ | $\mp\dfrac{\sqrt{3}}{2}\sin\theta$ | $\dfrac{1}{\sqrt{2\pi}}\,e^{\pm i\phi}$
3 | 0 | 0 | $\dfrac{2}{(3a_0)^{3/2}}\left(1 - \dfrac{2r}{3a_0} + \dfrac{2r^2}{27a_0^2}\right)e^{-r/3a_0}$ | $\dfrac{1}{\sqrt{2}}$ | $\dfrac{1}{\sqrt{2\pi}}$
3 | 1 | 0 | $\dfrac{8}{9\sqrt{2}\,(3a_0)^{3/2}}\left(\dfrac{r}{a_0} - \dfrac{r^2}{6a_0^2}\right)e^{-r/3a_0}$ | $\sqrt{\dfrac{3}{2}}\cos\theta$ | $\dfrac{1}{\sqrt{2\pi}}$
3 | 1 | ±1 | $\dfrac{8}{9\sqrt{2}\,(3a_0)^{3/2}}\left(\dfrac{r}{a_0} - \dfrac{r^2}{6a_0^2}\right)e^{-r/3a_0}$ | $\mp\dfrac{\sqrt{3}}{2}\sin\theta$ | $\dfrac{1}{\sqrt{2\pi}}\,e^{\pm i\phi}$
3 | 2 | 0 | $\dfrac{4}{27\sqrt{10}\,(3a_0)^{3/2}}\,\dfrac{r^2}{a_0^2}\,e^{-r/3a_0}$ | $\sqrt{\dfrac{5}{8}}\,(3\cos^2\theta - 1)$ | $\dfrac{1}{\sqrt{2\pi}}$
3 | 2 | ±1 | $\dfrac{4}{27\sqrt{10}\,(3a_0)^{3/2}}\,\dfrac{r^2}{a_0^2}\,e^{-r/3a_0}$ | $\mp\sqrt{\dfrac{15}{4}}\,\sin\theta\cos\theta$ | $\dfrac{1}{\sqrt{2\pi}}\,e^{\pm i\phi}$
3 | 2 | ±2 | $\dfrac{4}{27\sqrt{10}\,(3a_0)^{3/2}}\,\dfrac{r^2}{a_0^2}\,e^{-r/3a_0}$ | $\dfrac{\sqrt{15}}{4}\,\sin^2\theta$ | $\dfrac{1}{\sqrt{2\pi}}\,e^{\pm 2i\phi}$

Wave functions corresponding to some values of the quantum numbers are shown in Table 7.1. The wave functions are written in terms of the Bohr radius $a_0$ defined in Eq. 6.29. For the ground state (n = 1), only l = 0 and $m_l = 0$ are allowed. The complete set of quantum numbers for the ground state is then (n, l, $m_l$) = (1, 0, 0), and the wave function for this state is given in the first line of Table 7.1. The first excited state (n = 2) can have l = 0 or l = 1. For l = 0, only $m_l = 0$ is allowed. This state has quantum numbers (2, 0, 0), and its wave function is given in the second line of Table 7.1. For l = 1, we can have $m_l = 0$ or ±1. There are thus three possible sets of quantum numbers: (2, 1, 0) and (2, 1, ±1). The wave functions for these states are given in the third and fourth lines of Table 7.1. The second excited state (n = 3) can have l = 0 ($m_l = 0$), l = 1 ($m_l = 0, \pm 1$), or l = 2 ($m_l = 0, \pm 1, \pm 2$). For the n = 2 level, there are four different possible sets of quantum numbers and correspondingly four different wave functions. All of these wave functions correspond to the same energy, so the n = 2 level is degenerate. (Degeneracy was introduced in Section 5.4.) The n = 3 level is degenerate with nine possible sets of quantum numbers.
In general, the level with principal quantum number n has a degeneracy equal to $n^2$. Figure 7.6 illustrates the labeling of the first three levels.

FIGURE 7.6 The lower energy levels of hydrogen, labeled with the quantum numbers $(n, l, m_l)$. The first excited state is four-fold degenerate and the second excited state is nine-fold degenerate.
Levels shown in the figure:
−1.5 eV: (3, 0, 0); (3, 1, 1), (3, 1, 0), (3, 1, −1); (3, 2, 2), (3, 2, 1), (3, 2, 0), (3, 2, −1), (3, 2, −2)
−3.4 eV: (2, 0, 0); (2, 1, 1), (2, 1, 0), (2, 1, −1)
−13.6 eV: (1, 0, 0)

If different combinations of quantum numbers have exactly the same energy, what is the purpose of listing them separately? First, as we discuss in the last section of this chapter, the levels are not precisely degenerate, but are separated by a very small energy (about $10^{-5}$ eV). Second, in the study of the transitions between the levels, we find that the intensities of the individual transitions depend on the quantum numbers of the particular level from which the transition originates. Third, and perhaps most important, each of these sets of quantum numbers corresponds to a very different wave function, and therefore represents a very different state of motion of the electron. These states have different spatial probability distributions for locating the electron, and thus can affect many atomic properties, for example the way two atoms can form molecular bonds.

FIGURE 7.7 The radial wave functions R(r) of the n = 1, n = 2, and n = 3 states of hydrogen. The radius coordinate is measured in units of $a_0$.

The radial wave functions for the states listed in Table 7.1 are plotted in Figure 7.7. You can readily see the differences in the motion of the electron for the different states.
For example, in the n = 2 level, the l = 0 and l = 1 wave functions have the same energy but their behavior is very different: the l = 1 wave function falls to zero at r = 0, but the l = 0 wave function remains nonzero at r = 0. The l = 0 electron thus has a much greater probability of being found close to (or even inside) the nucleus, which turns out to play a large role in determining the rates for certain radioactive decay processes.

Probability Densities

As we learned in Chapter 5, the probability of finding the electron in any spatial interval is determined by the square of the wave function. For the hydrogen atom, $|\psi(r, \theta, \phi)|^2$ gives the volume probability density (probability per unit volume) at the location $(r, \theta, \phi)$. To compute the actual probability of finding the electron, we multiply the probability per unit volume by the volume element dV located at $(r, \theta, \phi)$. In spherical polar coordinates (see Figure 7.8) the volume element is

$$dV = r^2 \sin\theta\,dr\,d\theta\,d\phi \tag{7.15}$$

and therefore the probability to find the electron in the volume element at that location is

$$|\psi_{n,l,m_l}(r, \theta, \phi)|^2\,dV = |R_{n,l}(r)|^2\,|\Theta_{l,m_l}(\theta)|^2\,|\Phi_{m_l}(\phi)|^2\,r^2 \sin\theta\,dr\,d\theta\,d\phi \tag{7.16}$$

FIGURE 7.8 The volume element in spherical polar coordinates.

FIGURE 7.9 Representations of $|\psi|^2$ for different sets of quantum numbers. The z axis is the vertical direction. The diagrams represent surfaces on which the probability has the same value. Panels: l = 0, $m_l = 0$; l = 1, $m_l = 0$; l = 1, $m_l = 1$; l = 2, $m_l = 0$; l = 2, $m_l = 1$; l = 2, $m_l = 2$.

Some representations of the probability density $|\psi(r, \theta, \phi)|^2$ are shown in Figure 7.9. We can regard these illustrations as representing the “smeared out” distribution of electronic charge in the atom, which results from the uncertainty in the electron’s location. They also represent the statistical outcomes of a large number of measurements of the location of the electron in the atom.
These spatial distributions have important consequences for the structure of atoms with many electrons, which is discussed in Chapter 8, and also for the joining of atoms into molecules, which is discussed in Chapter 9. In the next two sections, we will separately examine how the probability density depends on the radial coordinate and on the angular coordinates.

7.4 RADIAL PROBABILITY DENSITIES

Instead of asking about the complete probability density to locate the electron, we might want to know the probability to find the electron at a particular distance from the nucleus, no matter what the values of θ and φ might be. That is, imagine a thin spherical shell of radius r and thickness dr. What is the probability to find the electron in the shell between spheres of radius r and r + dr? We define the radial probability density P(r) so that the probability to find the electron within that shell is P(r) dr. We can determine the radial probability from the complete probability (Eq. 7.16) by integrating over the θ and φ coordinates. In effect, this adds up the probabilities for the volume elements at a given r for all θ and φ.

$$P(r)\,dr = |R_{n,l}(r)|^2\,r^2\,dr \int_0^{\pi} |\Theta_{l,m_l}(\theta)|^2 \sin\theta\,d\theta \int_0^{2\pi} |\Phi_{m_l}(\phi)|^2\,d\phi \tag{7.17}$$

The θ and φ integrals are each equal to unity, because each of the functions R, Θ, and Φ is individually normalized. Thus the radial probability density is

$$P(r) = r^2\,|R_{n,l}(r)|^2 \tag{7.18}$$

Figure 7.10 shows this function for several of the lowest levels of hydrogen. Note that, because of the $r^2$ factor, P(r) must be zero at r = 0 even though R(r) might not. That is, the probability to locate the electron in a spherical shell always goes to zero as r → 0 because the volume of the shell goes to zero, but the probability density $|\psi|^2$ may be nonzero at r = 0. Moreover, P(r) and $|R(r)|^2$ convey different information about the electron’s behavior, as you can see by comparing Figures 7.7 and 7.10.
For example, the radial wave function R(r) for n = 1, l = 0 has its maximum at r = 0, but the radial probability density for that state has its maximum at $r = a_0$.

Using the radial probability densities, it is possible to find the average value of the radial coordinate, that is, the average distance between the proton and the electron (see Problems 30 and 31). These values are indicated by markers in Figure 7.10. Notice that the average radial coordinate is about $1.5a_0$ for the n = 1 wave function and is much greater, about $5a_0$, for both of the n = 2 wave functions. The average radius is greater still, about $12a_0$, for the n = 3 states. It appears from these graphs that the average radius depends mostly on n and not very much on l. The principal quantum number n thus determines not only the energy level of the electron but also, to a great extent, the average distance of the electron from the nucleus. As in the Bohr model, this average radius varies roughly as $n^2$, so that an n = 2 electron is on the average about 4 times farther from the nucleus than an n = 1 electron, an n = 3 electron is about 9 times farther from the nucleus than an n = 1 electron, and so forth.

FIGURE 7.10 The radial probability density P(r) for the n = 1, n = 2, and n = 3 states of hydrogen. The radius coordinate is measured in units of $a_0$. The markers on the horizontal axis show the values of the average radius $r_{\rm av}$ labeled with the value of l.

Another measure of the location of the electron is its most probable radius, determined from the location at which P(r) has its maximum value. For each n, P(r) for the state with l = n − 1 has only a single maximum, which occurs at the location of the Bohr orbit, $r = n^2 a_0$. The following example illustrates this for the n = 2 state.
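The two measures of the electron's location discussed above can also be checked numerically. The sketch below, for the n = 2, l = 1 state only, uses P(r) from Eq. 7.18 with $R_{2,1}$ from Table 7.1 and works in units of $a_0$. The trapezoid helper, the cutoff `R_MAX`, and the grid spacing are illustrative choices made here, not anything prescribed by the text.

```python
import math


def radial_density_2p(r):
    """P(r) = r^2 |R_21(r)|^2 for n = 2, l = 1, with r in units of a0."""
    return r**4 * math.exp(-r) / 24.0


def trapezoid(f, a, b, steps=20000):
    """Plain trapezoid-rule integral of f from a to b."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


R_MAX = 60.0  # assumed cutoff in units of a0; the integrand is negligible beyond it

norm = trapezoid(radial_density_2p, 0.0, R_MAX)
r_avg = trapezoid(lambda r: r * radial_density_2p(r), 0.0, R_MAX)

# Most probable radius: scan a fine grid for the largest value of P(r).
grid = [i * 0.001 for i in range(1, 20001)]
r_peak = max(grid, key=radial_density_2p)

print(f"normalization = {norm:.4f}")
print(f"average radius = {r_avg:.3f} a0")
print(f"most probable radius = {r_peak:.3f} a0")
```

Under these assumptions the normalization integral is 1, the average radius is $5a_0$, and the peak of P(r) sits at $4a_0 = n^2 a_0$, in line with the statements above.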
Example 7.5

Prove that the most likely distance from the origin of an electron in the n = 2, l = 1 state is $4a_0$.

Solution

In the n = 2, l = 1 level, the radial probability density is

$$P(r) = r^2\,|R_{2,1}(r)|^2 = r^2\,\frac{1}{24a_0^3}\,\frac{r^2}{a_0^2}\,e^{-r/a_0}$$

We wish to find where this function has its maximum; in the usual fashion, we take the first derivative of P(r) and set it equal to zero:

$$\frac{dP(r)}{dr} = \frac{1}{24a_0^5}\,\frac{d}{dr}\left(r^4 e^{-r/a_0}\right) = \frac{1}{24a_0^5}\left[4r^3 e^{-r/a_0} + r^4\left(-\frac{1}{a_0}\right)e^{-r/a_0}\right] = 0$$

or

$$\frac{1}{24a_0^5}\,e^{-r/a_0}\left(4r^3 - \frac{r^4}{a_0}\right) = 0$$

The only solution that yields a maximum is $r = 4a_0$.

Example 7.6

For the n = 2 states (l = 0 and l = 1), compare the probabilities of the electron being found inside the Bohr radius.

Solution

For the n = 2, l = 0 level,

$$P(r)\,dr = r^2\,|R_{2,0}(r)|^2\,dr = r^2\,\frac{1}{8a_0^3}\left(2 - \frac{r}{a_0}\right)^2 e^{-r/a_0}\,dr$$

The total probability of finding the electron between r = 0 and $r = a_0$ is

$$P(0 : a_0) = \int_0^{a_0} P(r)\,dr = \frac{1}{8a_0^3}\int_0^{a_0}\left(4r^2 - \frac{4r^3}{a_0} + \frac{r^4}{a_0^2}\right)e^{-r/a_0}\,dr$$

Evaluating the integrals using Eq. 7.4, we obtain

$$P(0 : a_0) = 0.034$$

For the n = 2, l = 1 level,

$$P(r)\,dr = r^2\,|R_{2,1}(r)|^2\,dr = r^2\,\frac{1}{24a_0^3}\,\frac{r^2}{a_0^2}\,e^{-r/a_0}\,dr$$

The total probability between r = 0 and $r = a_0$ is

$$P(0 : a_0) = \int_0^{a_0} P(r)\,dr = \frac{1}{24a_0^3}\int_0^{a_0}\frac{r^4}{a_0^2}\,e^{-r/a_0}\,dr = 0.0037$$

The results of Example 7.6 are consistent with the radial probability densities shown in Figure 7.10: in the l = 0 state, the probability of finding the electron inside $a_0$ is about 10 times larger than in the l = 1 state, as suggested by the small peak in the radial probability density for n = 2, l = 0 at small r. There is clearly more area under the P(r) curve between r = 0 and $r = a_0$ for n = 2, l = 0 than there is for n = 2, l = 1. Curiously, Figure 7.10 also shows that for n = 2 the l = 0 radial probability density is also greater than the l = 1 probability density at large r. Thus the l = 0 electron spends more time close to the nucleus than the l = 1 electron and it also spends more time farther away.
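The comparison in Example 7.6, and the remark about large r, can be reproduced by direct numerical integration. This is a minimal sketch in units of $a_0$: the Simpson-rule helper, the choice of $10a_0$ as the threshold for "far from the nucleus", and the upper cutoff of $80a_0$ are assumptions made for illustration only.

```python
import math


def p_2s(r):
    """Radial probability density for n = 2, l = 0 (r in units of a0)."""
    return r**2 * (2.0 - r)**2 * math.exp(-r) / 8.0


def p_2p(r):
    """Radial probability density for n = 2, l = 1 (r in units of a0)."""
    return r**4 * math.exp(-r) / 24.0


def simpson(f, a, b, steps=2000):
    """Composite Simpson's rule; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


# Probability of finding the electron inside the Bohr radius (Example 7.6).
inside_2s = simpson(p_2s, 0.0, 1.0)
inside_2p = simpson(p_2p, 0.0, 1.0)

# Probability of finding it beyond 10 a0 (threshold chosen for illustration).
far_2s = simpson(p_2s, 10.0, 80.0)
far_2p = simpson(p_2p, 10.0, 80.0)

print(f"inside a0:    l=0 {inside_2s:.4f}   l=1 {inside_2p:.4f}")
print(f"beyond 10 a0: l=0 {far_2s:.4f}   l=1 {far_2p:.4f}")
```

With these choices the l = 0 state has the larger probability both inside $a_0$ and beyond $10a_0$, which matches the qualitative statement that the l = 0 electron spends more time both close to and far from the nucleus.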
This is a general result that holds for any value of n: the smaller the l value, the larger is the probability to find the electron both close to the nucleus and far from the nucleus. The classical planetary orbits of Figure 7.2 show the same type of behavior: the orbit with L = 0 spends more time close to the central body and also more time far away from the central body, compared with the orbits that have larger L.

7.5 ANGULAR PROBABILITY DENSITIES

In this section, we consider the angular part of the probability density, which is obtained from the squared magnitudes of the angular parts of the wave function:

$$P(\theta, \phi) = |\Theta_{l,m_l}(\theta)\,\Phi_{m_l}(\phi)|^2 \tag{7.19}$$

Figure 7.11 shows the angular probability densities for the l = 0 and l = 1 wave functions listed in Table 7.1. Note that all of the probability densities are cylindrically symmetric: there is no dependence on the azimuthal angle φ. The l = 0 wave function is also spherically symmetric; that is, the probability density is independent of direction.

FIGURE 7.11 The angular dependence of the l = 0 and l = 1 probability densities. Panels: l = 0, $m_l = 0$; l = 1, $m_l = 0$; l = 1, $m_l = \pm 1$.

The l = 1 probability densities have two distinct shapes. For $m_l = 0$, the electron is found primarily in two regions of maximum probability along the positive and negative z axis, while for $m_l = \pm 1$, the electron is found primarily near the xy plane. For $m_l = 0$, the electron’s angular momentum vector lies in the xy plane (Figure 7.3). Classically, the angular momentum vector is perpendicular to the orbital plane, so it should not be surprising that the electron is most likely to be found in a location away from the xy plane, that is, along the z axis. For $m_l = \pm 1$, the angular momentum vector has its maximum projection along the z axis; the electron, again orbiting perpendicular to $\vec{L}$, spends most of its time near the xy plane.
These probability densities for locating the electron are consistent with the information given by the orientation of the angular momentum vector, and the cylindrical symmetry of the probability densities is consistent with the uncertainty in the knowledge of the orientation of $\vec{L}$ represented in Figure 7.4.

Example 7.7

For the n = 2, l = 1 wave functions, find the direction in space at which the maximum probability occurs when $m_l = 0$ and when $m_l = \pm 1$.

Solution

For l = 1, $m_l = 0$ we have

$$P(\theta, \phi) = |\Theta_{1,0}(\theta)\,\Phi_0(\phi)|^2 = \frac{3}{4\pi}\cos^2\theta$$

To find the location of the maximum, we set dP/dθ equal to zero:

$$\frac{dP}{d\theta} = \frac{3}{4\pi}\,(-2\cos\theta\sin\theta) = 0$$

There are two solutions to this equation: one for cos θ = 0, for which θ = π/2, and another for sin θ = 0, which gives θ = 0 or π. By taking the second derivative, we find that θ = π/2 leads to a minimum while θ = 0 or π gives the maximum. There are thus two regions of maximum probability, one along the positive z axis (θ = 0) and another along the negative z axis (θ = π), as in Figure 7.11.

For l = 1, $m_l = \pm 1$ the angular probability density is

$$P(\theta, \phi) = |\Theta_{1,\pm 1}(\theta)\,\Phi_{\pm 1}(\phi)|^2 = \frac{3}{8\pi}\sin^2\theta$$

We can then find the location of the maximum:

$$\frac{dP}{d\theta} = \frac{3}{4\pi}\,(\sin\theta\cos\theta) = 0$$

Once again there are two solutions: θ = 0, π or θ = π/2. However, in this case the maximum occurs for θ = π/2 and the probability maximum occurs in the xy plane, as in Figure 7.11.

7.6 INTRINSIC SPIN

One way of observing spatial quantization is to place the atom in an externally applied magnetic field. From the interaction between the magnetic field and the magnetic dipole moment of the atom (which is related to the electron’s orbital angular momentum), it is possible both to observe the separate components of $\vec{L}$ and also to determine l by counting the number of z components (which, as we have seen, is equal to 2l + 1). However, when this experiment is done, a surprising result emerges that indicates an unexpected property of the electron, known as intrinsic spin.
Orbital Magnetic Dipole Moments

Figure 7.12 shows a classical magnetic dipole moment, which might be produced by a current loop or the orbital motion of a charged object. The classical magnetic dipole moment $\vec{\mu}$ is defined as a vector whose magnitude is equal to the product of the circulating current and the area enclosed by the orbital loop. The direction of $\vec{\mu}$ is perpendicular to the plane of the orbit, determined by the right-hand rule: with the fingers in the direction of the conventional (positive) current, the thumb indicates the direction of $\vec{\mu}$, as shown in Figure 7.12 for a circulating negative charge like an electron.

FIGURE 7.12 A circulating negative charge is represented as a current loop. Because the charge is negative, $\vec{L}$ and $\vec{\mu}$ point in opposite directions.

FIGURE 7.13 According to quantum mechanics, the vectors can be considered to precess around the z axis, and so we can specify only the z components of $\vec{L}$ and $\vec{\mu}$. (The figure marks the z components $m_l\hbar$ of $\vec{L}$ and $-m_l\mu_B$ of $\vec{\mu}_L$.)

As we have seen, quantum mechanics forbids exact knowledge of the direction of $\vec{L}$ and therefore of $\vec{\mu}$. Figure 7.13 suggests the relationship between $\vec{L}$ and $\vec{\mu}$ that is consistent with quantum mechanics. Only the z components of these vectors can be specified. Because the electron has a negative charge, $\vec{L}$ and $\vec{\mu}$ have z components of opposite signs.

We can use the Bohr model with a circular orbit to obtain the relationship between $\vec{L}$ and $\vec{\mu}$, which turns out to be identical with the correct quantum mechanical result. We regard the circulating electron as a circular loop of current i = dq/dt = q/T, where q is the charge of the electron (−e) and T is the time for one circuit around the loop. If the electron moves with speed v = p/m around a loop of radius r, then T = 2πr/v = 2πrm/p. The magnitude of the magnetic moment is

$$\mu = iA = \frac{q}{2\pi rm/p}\,\pi r^2 = \frac{q}{2m}\,rp = \frac{q}{2m}\,|\vec{L}| \tag{7.20}$$

with $|\vec{L}| = rp$. Writing Eq.
7.20 in terms of vectors and putting −e for the electronic charge, we obtain

$$\vec{\mu}_L = -\frac{e}{2m}\,\vec{L} \tag{7.21}$$

The negative sign, which is present because the electron has a negative charge, indicates that the vectors $\vec{L}$ and $\vec{\mu}_L$ point in opposite directions. The subscript L on $\vec{\mu}_L$ reminds us that this magnetic moment arises from the orbital angular momentum $\vec{L}$ of the electron.

The z component of the magnetic moment is

$$\mu_{L,z} = -\frac{e}{2m}\,L_z = -\frac{e}{2m}\,m_l\hbar = -\frac{e\hbar}{2m}\,m_l = -m_l\,\mu_B \tag{7.22}$$

The quantity $e\hbar/2m$ is defined to be the Bohr magneton

$$\mu_B = \frac{e\hbar}{2m} \tag{7.23}$$

The value of $\mu_B$ is

$$\mu_B = 9.274 \times 10^{-24}\ \text{J/T}$$

The Bohr magneton is a convenient unit for expressing atomic magnetic moments, which typically have values of the order of $\mu_B$.

A Dipole in an External Field

Before we consider further the behavior of $\vec{\mu}_L$, we discuss the similar behavior of an electric dipole, which consists of two equal and opposite charges q separated by a distance r. The electric dipole moment $\vec{p}$ has magnitude qr and points from the negative charge to the positive charge. As shown in Figure 7.14a, in a uniform electric field, the vertical forces $\vec{F}_+$ on the positive charge and $\vec{F}_-$ on the negative charge are of equal magnitude. The dipole experiences a torque that tends to rotate it into alignment with $\vec{E}$, but the net force on the dipole is zero. Suppose now that the field is not uniform; for example, the field strength decreases from the bottom of the figure to the top, as in Figure 7.14b. Now the downward force $\vec{F}_-$ acting on the negative charge is greater than the upward force $\vec{F}_+$ on the positive charge. There is still a net torque that tends to rotate the dipole, but there is also a net force that tends to move the dipole, in this case downward. On the other hand, if we reverse the locations of the two charges (Figure 7.14c), which is equivalent to reversing the electric dipole moment $\vec{p}$, the upward force $\vec{F}_+$ on the positive charge is now greater than the downward force $\vec{F}_-$ on the negative charge, so the net force on the dipole is upward.

FIGURE 7.14 (a) An electric dipole in a uniform electric field $\vec{E}$ experiences no net force. (b) In a nonuniform electric field (decreasing from the bottom of the figure to the top), the force $\vec{F}_-$ is greater than the force $\vec{F}_+$. There is a net downward force on the dipole. (c) If the dipole moment is reversed, the net force is in the opposite direction.

We can state this result in another way that will be more applicable to our discussion of magnetic dipole moments. Let the field direction define the z axis. Then dipoles with $p_z > 0$ (as in Figure 7.14b) experience a net negative force and move in the negative z direction, while dipoles with $p_z < 0$ (as in Figure 7.14c) experience a net positive force and move in the positive z direction.

A magnetic dipole moment $\vec{\mu}$ behaves in an identical way. (In fact, if we imagine fictitious N and S poles, the behavior of a magnetic moment would be described by illustrations similar to Figure 7.14.) A nonuniform magnetic field acting on the magnetic moments gives an unbalanced force that causes a displacement. Figure 7.15 illustrates the behavior of magnetic dipole moments having different orientations in a nonuniform magnetic field. The two different orientations give net forces in opposite directions: if $\mu_z$ is positive the force on the dipole is negative, and if $\mu_z$ is negative the force on the dipole is positive.

The Stern-Gerlach Experiment

Imagine the following experiment, illustrated schematically in Figure 7.16. A beam of hydrogen atoms is prepared in the n = 2, l = 1 state. The beam consists of equal numbers of atoms in the $m_l = -1$, 0, and +1 states. (We assume we can do the experiment so quickly that the n = 2 state doesn’t decay to the n = 1 state. In practice this may not be possible.)
The beam passes through a region in which there is a nonuniform magnetic field. The atoms with $m_l = +1$ ($\mu_{L,z} = -\mu_B$) experience a net upward force and are deflected upward, while the atoms with $m_l = -1$ ($\mu_{L,z} = +\mu_B$) are deflected downward. The atoms with $m_l = 0$ are undeflected.

FIGURE 7.15 Two magnetic dipoles in a nonuniform magnetic field. Oppositely directed dipoles experience net forces in opposite directions.

FIGURE 7.16 Schematic diagram of the Stern-Gerlach experiment. A beam of atoms passes through a region where there is a nonuniform magnetic field. Atoms with their magnetic dipole moments in opposite directions experience forces in opposite directions.

FIGURE 7.17 The results of the Stern-Gerlach experiment. (a) The image of the slit with the field turned off. (b) With the field on, two images of the slit appear. The small divisions in the scale at the left represent 0.05 mm. [Source: W. Gerlach and O. Stern, Zeitschrift für Physik 9, 349 (1922)]

After passing through the field, the beam strikes a screen where it makes a visible image. When the field is off, we expect to see one image of the slit in the center of the screen, because there is no deflection at all. When the field is on, we expect three images of the slit on the screen: one in the center (corresponding to $m_l = 0$), one above the center ($m_l = +1$), and one below the center ($m_l = -1$). If the atoms were in the ground state (l = 0), we would expect to see one image on the screen whether the field was off or on (recall that an $m_l = 0$ atom is not deflected). If we had prepared the beam in a state with l = 2, we would see five images with the field on. The number of images that appears is just the number of different $m_l$ values, which is equal to 2l + 1. With the possible values for l of 0, 1, 2, 3, . . ., it follows that 2l + 1 has the values 1, 3, 5, 7, . .
.; that is, we should always see an odd number of images on the screen. However, if we were actually to perform the experiment with hydrogen in the l = 1 state, we would find not three but six images on the screen! Even more confusing, if we did the experiment with hydrogen in the l = 0 state, we would find not one but two images on the screen, one representing an upward deflection and one a downward deflection! In the l = 0 state, the vector $\vec{L}$ has length zero, and so we expect that there is no magnetic moment for the magnetic field to deflect. We observe this not to be true: even when l = 0, the atom still has a magnetic moment, in contradiction to Eq. 7.21.

The first experiment of this type was done by O. Stern and W. Gerlach in 1921. They used a beam of silver atoms; although the electronic structure of silver is more complicated than that of hydrogen (as we discuss in Chapter 8), the same basic principle applies: the silver atom must have l = 0, 1, 2, 3, . . ., and so an odd number of images is expected to appear on the screen. In fact, they observed the beam to split into two components, producing two images of the slit on the screen (see Figure 7.17).

The observation of separated images was the first conclusive evidence of spatial quantization; classical magnetic moments would have all possible orientations and would make a continuous smeared-out pattern on the screen, but the observation of a number of discrete images on the screen means that the atomic magnetic moments can take only certain discrete orientations in space. These correspond to the discrete orientations of the magnetic moment (or, equivalently, of the angular momentum).

However, the number of discrete images on the screen does not agree with our expectations that it be an odd number. We expect 2l + 1 images, so for two images we should have l = 1/2, which is not permitted by the Schrödinger equation.
We can resolve this dilemma if there is another contribution to the angular momentum of the atom, the intrinsic angular momentum of the electron. An electron in an atom has two kinds of angular momentum, somewhat like the Earth as it both orbits the Sun and rotates on its axis. The electron has an orbital angular momentum $\vec{L}$, which characterizes the motion of the electron about the nucleus, and an intrinsic angular momentum $\vec{S}$, which behaves as if the electron were spinning about its axis. For this reason, $\vec{S}$ is usually called the intrinsic spin. (However, it is not correct to use the classical analogy to think of the electron as a tiny ball of charge spinning about an axis, because the electron is a point particle with no physical size.)

The idea of electron spin was proposed by S. A. Goudsmit and G. E. Uhlenbeck in 1925, and P. A. M. Dirac showed in 1928 that relativistic quantum theory for the electron gives the electron spin directly as an additional quantum number.

In order to explain the result of the Stern-Gerlach experiment, we must assign to the electron an intrinsic spin quantum number s of 1/2. The intrinsic spin behaves much like the orbital angular momentum; there is the quantum number s (which we can regard as a label arising from the mathematics), the angular momentum vector $\vec{S}$, a z component $S_z$, an associated magnetic moment $\vec{\mu}_S$, and a spin magnetic quantum number $m_s$. Figure 7.18 illustrates the vector properties of $\vec{S}$, and Table 7.2 compares the properties of orbital and spin angular momentum for electrons in atoms.

The inclusion of spin gives a direct explanation for the Stern-Gerlach experiment. The outermost electron in a silver atom occupies a state with l = 0. (The other electrons do not contribute to the magnetic properties of the atom.)
The magnetic behavior is therefore due entirely to the spin magnetic moment, which has only two possible orientations in the magnetic field (corresponding to $m_s = \pm 1/2$) and thus gives the two beams observed emerging from the magnet.

Every fundamental particle has a characteristic intrinsic spin and a corresponding spin magnetic moment. For example, the proton and neutron also have a spin quantum number of 1/2. The photon has a spin quantum number of 1, while the pi meson (pion) has s = 0.

FIGURE 7.18 The spin angular momentum of an electron and the spatial orientation of the spin angular momentum vector. (The figure shows $|\vec{S}| = \sqrt{3/4}\,\hbar$ with $S_z = +\hbar/2$ or $S_z = -\hbar/2$.)

TABLE 7.2 Orbital and Spin Angular Momentum of Electrons in Atoms

                          | Orbital | Spin
Quantum number            | l = 0, 1, 2, . . . | s = 1/2
Length of vector          | $|\vec{L}| = \sqrt{l(l+1)}\,\hbar$ | $|\vec{S}| = \sqrt{s(s+1)}\,\hbar = \sqrt{3/4}\,\hbar$
z component               | $L_z = m_l\hbar$ | $S_z = m_s\hbar$
Magnetic quantum number   | $m_l = 0, \pm 1, \pm 2, \ldots, \pm l$ | $m_s = \pm 1/2$
Magnetic moment           | $\vec{\mu}_L = -(e/2m)\vec{L}$ | $\vec{\mu}_S = -(e/m)\vec{S}$

Example 7.8

In a Stern-Gerlach type of experiment, the magnetic field varies with distance in the z direction according to $dB_z/dz = 1.4$ T/mm. The silver atoms travel a distance x = 3.5 cm through the magnet. The most probable speed of the atoms emerging from the oven is v = 750 m/s. Find the separation of the two beams as they leave the magnet. The mass of a silver atom is $1.8 \times 10^{-25}$ kg, and its magnetic moment is about 1 Bohr magneton.

Solution

The potential energy of the magnetic moment in the magnetic field is

$$U = -\vec{\mu}\cdot\vec{B} = -\mu_z B_z$$

because the field along the central axis of the magnet has only a z component. The force on the atom can be found from the potential energy according to

$$F_z = -\frac{dU}{dz} = \mu_z\,\frac{dB_z}{dz}$$

The acceleration of a silver atom of mass m as it passes through the magnet is

$$a = \frac{F_z}{m} = \frac{\mu_z\,(dB_z/dz)}{m}$$

The vertical deflection Δz of either beam can be found from $\Delta z = \frac{1}{2}at^2$, where t, the time to traverse the magnet, equals x/v. Each beam is deflected by this amount, so the net separation d is 2Δz, or

$$d = \frac{\mu_z\,(dB_z/dz)\,x^2}{mv^2} = \frac{(9.27 \times 10^{-24}\ \text{J/T})(1.4 \times 10^{3}\ \text{T/m})(3.5 \times 10^{-2}\ \text{m})^2}{(1.8 \times 10^{-25}\ \text{kg})(750\ \text{m/s})^2} = 1.6 \times 10^{-4}\ \text{m} = 0.16\ \text{mm}$$

This is consistent with the separation that can be read from the scale in Figure 7.17.

7.7 ENERGY LEVELS AND SPECTROSCOPIC NOTATION

We previously described all of the possible electronic states in hydrogen by three quantum numbers $(n, l, m_l)$, but as we have seen, a fourth property of the electron, the intrinsic angular momentum or spin, requires the introduction of a fourth quantum number. We don’t need to specify the spin s, because it is always 1/2 (we regard it as a fundamental property of the electron, like its electric charge or its mass), but we must specify the value of the quantum number $m_s$ (+1/2 or −1/2), which tells us about the z component of the spin. Thus the complete description of the state of an electron in an atom requires the four quantum numbers $(n, l, m_l, m_s)$. For example, the ground state of hydrogen was previously labeled as $(n, l, m_l) = (1, 0, 0)$. With the addition of $m_s$, this would become either (1, 0, 0, +1/2) or (1, 0, 0, −1/2). The degeneracy of the ground state is now 2. The first excited state would have eight possible labels: (2, 0, 0, +1/2), (2, 0, 0, −1/2), (2, 1, +1, +1/2), (2, 1, +1, −1/2), (2, 1, 0, +1/2), (2, 1, 0, −1/2), (2, 1, −1, +1/2), and (2, 1, −1, −1/2). There are now two possible labels for each previous single label (each $n, l, m_l$ becomes $n, l, m_l, +1/2$ and $n, l, m_l, -1/2$), so the degeneracy of each level is $2n^2$ instead of $n^2$.
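The counting of labels can be made concrete with a short enumeration. This is only an illustrative sketch: the function name `states_with_spin` is introduced here, and `Fraction` is used simply to keep $m_s = \pm 1/2$ exact.

```python
from fractions import Fraction


def states_with_spin(n):
    """All (n, l, m_l, m_s) labels for principal quantum number n."""
    half = Fraction(1, 2)
    return [
        (n, l, ml, ms)
        for l in range(n)                 # l = 0, 1, ..., n - 1
        for ml in range(l, -l - 1, -1)    # m_l = +l, ..., -l
        for ms in (half, -half)           # m_s = +1/2, -1/2
    ]


for n in (1, 2, 3):
    labels = states_with_spin(n)
    print(f"n = {n}: {len(labels)} labels (2n^2 = {2 * n * n})")

# The eight labels of the first excited state, in the order used in the text.
for label in states_with_spin(2):
    print(label)
```

For n = 2 the enumeration yields the eight labels listed above, and the count equals $2n^2$ for each n tried.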
It is important to know the direction (z component) of the angular momentum vectors when an atom is in a magnetic field, but for most other applications the values of $m_l$ and $m_s$ are of no significance, and it is cumbersome to write them each time we wish to refer to a certain level of an atom. We therefore use a different notation, known as spectroscopic notation, to label the levels. In this system we use letters to stand for the different l values: for l = 0, we use the letter s (do not confuse this with the quantum number s), for l = 1, we use the letter p, and so on. The complete notation is as follows:

Value of l     0   1   2   3   4   5   6
Designation    s   p   d   f   g   h   i

(The first four letters stand for sharp, principal, diffuse, and fundamental, which were terms used to describe atomic spectra before atomic theory was developed.)

In spectroscopic notation, the ground state of hydrogen is labeled 1s, where the value n = 1 is specified before the s. Figure 7.19 illustrates the labeling of the hydrogen atom levels in this notation.

FIGURE 7.19 A partial energy level diagram of hydrogen, showing the spectroscopic notation of the levels and some of the transitions that satisfy the Δl = ±1 selection rule. Levels shown: 4s, 4p, 4d, 4f (−0.8 eV); 3s, 3p, 3d (−1.5 eV); 2s, 2p (−3.4 eV); 1s (−13.6 eV).

Also shown in Figure 7.19 are arrows representing some different photons that can be emitted when the atom makes a transition from one state to a lower state. Some of the missing arrows (such as 4d to 3s) would represent transitions that are not allowed to occur. By solving the Schrödinger equation and using the solutions to compute transition probabilities, we find that the transitions most likely to occur are those that change l by one unit. This restriction is called a selection rule, and for atomic transitions the selection rule is

$$\Delta l = \pm 1 \tag{7.24}$$

For example, the 3s level cannot emit a photon in a transition to the 2s level (Δl = 0), but rather must go to the 2p level (Δl = 1).
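The selection rule of Eq. 7.24 can be illustrated by filtering candidate downward transitions. This is a hedged sketch: the helper names `label` and `allowed_transitions` are hypothetical, and "lower state" is taken to mean a smaller n, which assumes the energy depends only on n as in Eq. 7.13.

```python
L_LETTERS = "spdfghi"  # spectroscopic letters for l = 0 ... 6


def label(n, l):
    """Spectroscopic label such as '3p' for n = 3, l = 1."""
    return f"{n}{L_LETTERS[l]}"


def allowed_transitions(n_max):
    """Pairs (initial, final) with lower final n that satisfy delta l = +1 or -1."""
    levels = [(n, l) for n in range(1, n_max + 1) for l in range(n)]
    pairs = []
    for n_i, l_i in levels:
        for n_f, l_f in levels:
            if n_f < n_i and abs(l_f - l_i) == 1:
                pairs.append((label(n_i, l_i), label(n_f, l_f)))
    return pairs


transitions = allowed_transitions(4)
for initial, final in transitions:
    print(f"{initial} -> {final}")
```

In this listing 3s → 2p appears while 3s → 2s and 4d → 3s do not, consistent with the examples given in the text.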
There is no selection rule for n, so the 3p level can go to 2s or 1s (but not to 2p).

∗ 7.8 THE ZEEMAN EFFECT

∗ This is an optional section that may be skipped without loss of continuity.

Consider for the moment a hypothetical (and less interesting) world in which the electron has no spin, and therefore no spin magnetic moment. Suppose we prepared a hydrogen atom in a 2p (l = 1) level and placed it in an external uniform magnetic field $\vec{B}$ (supplied by a laboratory electromagnet, for example). The magnetic moment $\vec{\mu}_L$ associated with the orbital angular momentum then interacts with the field, and the energy associated with this interaction is

$$U = -\vec{\mu}_L \cdot \vec{B} \tag{7.25}$$

That is, magnetic moments aligned in the direction of the field have less energy than those aligned oppositely to the field. Using Eq. 7.22 for the z component of the magnetic moment (assuming that the field is in the z direction), we have

$$U = -\mu_{L,z}B = m_l\,\mu_B B \tag{7.26}$$

in terms of the Bohr magneton $\mu_B$ defined in Eq. 7.23. In the absence of a magnetic field, the 2p level has a certain energy $E_0$ (−3.4 eV). When the field is turned on, the energy becomes $E_0 + U = E_0 + m_l\mu_B B$; that is, there are now three different possible energies for the level, depending on the value of $m_l$. Figure 7.20 illustrates this situation.

FIGURE 7.20 The splitting of an l = 1 level in an external magnetic field. (The effects of the electron’s spin angular momentum are ignored.) The energy in a magnetic field is different for different values of $m_l$.

Now suppose the atom emits a photon in a transition from the 2p state to the 1s ground state. In the absence of the magnetic field, a single photon is emitted with an energy of 10.2 eV and a corresponding wavelength of 122 nm.
When the magnetic field is present, three photons can be emitted, with energies of 10.2 eV + μB B, 10.2 eV, and 10.2 eV − μB B. To determine how a small change in energy E affects the wavelength, we differentiate the expression E = hc/λ and obtain FIGURE 7.21 The normal Zeeman effect. When the field is turned on, the original wavelength λ becomes three separate wavelengths. (7.27) Replacing the differentials with small differences, taking absolute magnitudes, and solving for λ gives λ = λ − ∆λ λ λ + ∆λ λ hc dλ λ2 λ2 E hc (7.28) where E is the energy splitting between the levels when the field is on (E = μB B). Figure 7.21 illustrates the three transitions, and shows an example of the result of a measurement of the emitted wavelengths. In analyzing transitions between different ml states, often we need to use a second selection rule: the only transitions that occur are those that change ml by 0, +1, or −1: ml = 0, ±1 (7.29) Changes in ml of two or more are not permitted. Example 7.9 Compute the change in wavelength of the 2p → 1s photon when a hydrogen atom is placed in a magnetic field of 2.00 T. and so, from Eq. 7.28, λ = λ2 E hc Solution The energy of the photon from n = 2 to n = 1 is E = −13.6 eV( 212 − 112 ) = 10.2 eV, and its wavelength is λ = hc/E = (1240 eV · nm)/(10.2 eV) = 122 nm. The energy change E of the levels is E = μB B = (9.27 × 10−24 J/T)(2.00 T) = 18.5 × 10−24 J = 11.6 × 10−5 eV = (122 nm)2 11.6 × 10−5 eV 1240 eV · nm = 0.00139 nm Even for a fairly large magnetic field of 2 T, the change in wavelength is very small, but it is easily measurable using an optical spectrometer. The experiment we have just considered is an example of the Zeeman effect —the splitting of a spectral line with a single wavelength into lines with 7.9 | Fine Structure several different wavelengths when the emitting atoms are in an externally applied magnetic field. 
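The arithmetic of Example 7.9 can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original text: the function name `zeeman_shift` is hypothetical, and the constants are the rounded values quoted in the example plus an assumed conversion factor of $1.602 \times 10^{-19}$ J/eV.

```python
# Minimal sketch of Eq. 7.28 with the rounded constants of Example 7.9.
HC_EV_NM = 1240.0          # hc in eV*nm
MU_B_J_PER_T = 9.27e-24    # Bohr magneton in J/T
J_PER_EV = 1.602e-19       # assumed conversion factor, joules per eV


def zeeman_shift(photon_energy_ev, b_tesla):
    """Return (Delta E in eV, wavelength in nm, Delta lambda in nm)."""
    delta_e = MU_B_J_PER_T * b_tesla / J_PER_EV        # Delta E = mu_B * B
    wavelength = HC_EV_NM / photon_energy_ev           # lambda = hc / E
    delta_lambda = wavelength**2 * delta_e / HC_EV_NM  # Eq. 7.28
    return delta_e, wavelength, delta_lambda


delta_e, wavelength, delta_lambda = zeeman_shift(10.2, 2.00)
print(f"Delta E      = {delta_e:.3e} eV")
print(f"lambda       = {wavelength:.1f} nm")
print(f"Delta lambda = {delta_lambda:.5f} nm")
```

Because the sketch keeps the unrounded wavelength (about 121.6 nm) instead of 122 nm, its result for $\Delta\lambda$ is close to 0.00138 nm, slightly below the 0.00139 nm of the worked example. The difference comes only from rounding.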
In the normal Zeeman effect a single spectral line splits into three components; this occurs only in atoms without spin. (All electrons of course have spin, unlike the hypothetical spinless electrons we considered; however, in certain atoms with several electrons, the spins can pair off and cancel, so that the atom behaves like a spinless one.) When spin is present, we must consider not only the effect of the orbital magnetic moment but also the spin magnetic moment. The resulting pattern of level splittings is more complicated, and spectral lines may split into more than three components. This case is known as the anomalous Zeeman effect, an example of which is shown in Figure 7.22.

∗ 7.9 FINE STRUCTURE

A careful inspection of the emission lines of atomic hydrogen shows that many of them are in fact not single lines but very closely spaced combinations of two lines. In this section we examine the origin of that effect, known as fine structure.

In this calculation it is more convenient for us to examine the hydrogen atom from the electron's frame of reference, in which the proton appears to travel around the electron, just as the Sun appears to travel around the Earth. For convenience, we treat this problem in the context of the Bohr model to obtain an estimate of the effect.

Figure 7.23a shows the atom viewed from the ordinary frame of reference of the proton. We assume the electron to orbit counterclockwise so that the orbital angular momentum $\vec{L}$ is in the z direction, and we also assume that the spin $\vec{S}$ (which could point either up or down) is also in the z direction. The same situation is shown in Figure 7.23b from the viewpoint of the electron, with the proton now appearing to move in a circular orbit around the electron.

In the electron reference frame the motion of the proton in a circular orbit of radius $r$ can be considered to be a current loop, which causes a magnetic field $\vec{B}$ at the electron, as shown in Figure 7.23c.
This magnetic field interacts with the spin magnetic moment of the electron, $\vec{\mu}_S = -(e/m)\vec{S}$. The interaction energy of the magnetic moment $\vec{\mu}_S$ in a magnetic field is

$$U = -\vec{\mu}_S \cdot \vec{B}$$ (7.30)

We choose the z direction to be the direction of $\vec{B}$; with $\vec{\mu}_S = -(e/m)\vec{S}$, we have

$$U = \frac{e}{m}\, \vec{S} \cdot \vec{B} = \frac{e}{m}\, S_z B$$ (7.31)

With $S_z = \pm\frac{1}{2}\hbar$, the energy is

$$U = \pm\frac{e\hbar}{2m}\, B = \pm\mu_B B$$ (7.32)

∗ This is an optional section that may be skipped without loss of continuity.

FIGURE 7.22 The anomalous Zeeman effect in sodium. (Top) The so-called sodium D-lines, a close-lying doublet of wavelengths 589.0 and 589.6 nm in the absence of a magnetic field. (Bottom) Splitting of the lines into six and four components in a magnetic field. This image was photographed by Peter Zeeman in 1897.

FIGURE 7.23 (a) An electron circulates about the proton in a hydrogen atom. (b) From the point of view of the electron, the proton circulates about the electron. (c) The apparently circulating proton is represented by the current $i$ and causes a magnetic field $\vec{B}$ at the location of the electron.

FIGURE 7.24 The fine-structure splitting in hydrogen. The state with $\vec{L}$ and $\vec{S}$ parallel is slightly higher in energy than the state with $\vec{L}$ and $\vec{S}$ antiparallel. [Diagram: one level split into two levels, each displaced by $\mu_B B$ from the original, with total separation $\Delta E$.]

The situation shown in Figure 7.23 has $S_z = +\frac{1}{2}\hbar$, and thus $U = +\mu_B B$. When $\vec{S}$ has the opposite orientation, $U = -\mu_B B$. The effect is to split each level into two, a higher state with $\vec{L}$ and $\vec{S}$ parallel and a lower state with $\vec{L}$ and $\vec{S}$ antiparallel, as shown in Figure 7.24. The energy difference between the states is $\Delta E = 2\mu_B B$.
At this point, the result looks rather similar to that of our previous discussion of the Zeeman effect, but it is important to note one significant difference: the magnetic field $B$ in this case is not a field in the laboratory that can be turned on or off; it is, instead, a field that is always present, produced by the relative motion between the proton and the electron.

We can use the Bohr model to make a rough estimate of the magnitude of this energy splitting. A circular loop of radius $r$ carrying current $i$ establishes at its center a magnetic field $B = \mu_0 i/2r$. The current $i$ is the charge carried around the loop ($+e$ in this case) divided by the time $T$ for one orbit. The time for one orbit is the distance traveled ($2\pi r$) divided by the speed $v$.

$$B = \frac{\mu_0 i}{2r} = \frac{\mu_0}{2r}\,\frac{e}{T} = \frac{\mu_0}{2r}\,\frac{ev}{2\pi r}$$ (7.33)

The energy difference between the states is then

$$\Delta E = 2\mu_B B = \frac{\mu_0 e v}{2\pi r^2}\, \mu_B = \frac{\mu_0 e^2 \hbar^2 n}{4\pi m^2 r^3}$$ (7.34)

where the last result is obtained by substituting $v = n\hbar/mr$ from Bohr's angular momentum condition (Eq. 6.26) and $\mu_B = e\hbar/2m$ from Eq. 7.23. Finally, substituting from Eq. 6.28 for the radius of the orbits in the Bohr atom, we obtain

$$\Delta E = \frac{\mu_0 e^2 \hbar^2 n}{4\pi m^2} \left( \frac{m e^2}{4\pi\varepsilon_0 \hbar^2}\, \frac{1}{n^2} \right)^3 = \frac{\mu_0 m e^8}{256 \pi^4 \varepsilon_0^3 \hbar^4}\, \frac{1}{n^5}$$ (7.35)

We can rewrite this in a somewhat simpler form by recalling that $c^2 = 1/\varepsilon_0 \mu_0$ and using the dimensionless constant $\alpha$, known as the fine structure constant,

$$\alpha = \frac{e^2}{4\pi\varepsilon_0 \hbar c}$$ (7.36)

which gives

$$\Delta E = mc^2 \alpha^4\, \frac{1}{n^5}$$ (7.37)

The value of the fine structure constant is approximately 1/137. For hydrogen in the $n = 2$ level, we expect the energy difference between the state with $\vec{L}$ and $\vec{S}$ parallel and the state with $\vec{L}$ and $\vec{S}$ antiparallel to be

$$\Delta E = (0.511\ \text{MeV}) \left( \frac{1}{137} \right)^4 \frac{1}{2^5} = 4.53 \times 10^{-5}\ \text{eV}$$

We can compare this estimate with the experimental value, based on the observed splitting of the first line of the Lyman series, which gives $4.54 \times 10^{-5}$ eV.
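Equation 7.37 is simple enough to evaluate directly. The sketch below is an illustration only: the function name is hypothetical, and it uses the same rounded inputs as the text, $mc^2 = 0.511$ MeV and $\alpha = 1/137$. It reproduces the $n = 2$ estimate and shows how quickly the $1/n^5$ factor reduces the splitting for higher levels.

```python
# Sketch of the Bohr-model fine-structure estimate, Eq. 7.37,
# using the rounded values quoted in the text.
ELECTRON_REST_ENERGY_EV = 0.511e6   # mc^2 for the electron, in eV
ALPHA = 1 / 137                     # approximate fine structure constant


def fine_structure_estimate_ev(n):
    """Delta E = m c^2 alpha^4 / n^5, in eV."""
    return ELECTRON_REST_ENERGY_EV * ALPHA**4 / n**5


for n in (2, 3, 4):
    print(f"n = {n}: Delta E ~ {fine_structure_estimate_ev(n):.2e} eV")
```

For $n = 2$ this gives about $4.53 \times 10^{-5}$ eV, the value computed above.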
We see that in spite of the assumptions we have made, our use of the Bohr model, and our failure to use the hydrogen wave functions to do this calculation, the agreement with the experimental value is remarkably good. (In fact, the agreement is so good as to be embarrassing, for we neglected to consider the important relativistic effect of the motion of the electron, which contributes to the fine structure about equally to the spin-orbit interaction discussed in this section. We really should regard this calculation as an order-of-magnitude estimate, which happens by chance to give a numerical result close to the observed value.)

Chapter Summary

Orbital angular momentum: $|\vec{L}| = \sqrt{l(l+1)}\,\hbar$ ($l = 0, 1, 2, \ldots$) (Section 7.2)
Orbital magnetic quantum number: $L_z = m_l \hbar$ ($m_l = 0, \pm 1, \pm 2, \ldots, \pm l$) (Section 7.2)
Spatial quantization: $\cos\theta = \dfrac{L_z}{|\vec{L}|} = \dfrac{m_l}{\sqrt{l(l+1)}}$ (Section 7.2)
Angular momentum uncertainty relationship: $\Delta L_z\, \Delta\phi \geq \hbar$ (Section 7.2)
Hydrogen quantum numbers: $n = 1, 2, 3, \ldots$; $l = 0, 1, 2, \ldots, n - 1$; $m_l = 0, \pm 1, \pm 2, \ldots, \pm l$ (Section 7.3)
Hydrogen energy levels: $E_n = -\dfrac{m e^4}{32\pi^2 \varepsilon_0^2 \hbar^2}\, \dfrac{1}{n^2}$ (Section 7.3)
Hydrogen wave functions: $\psi_{n,l,m_l}(r, \theta, \phi) = R_{n,l}(r)\, \Theta_{l,m_l}(\theta)\, \Phi_{m_l}(\phi)$ (Section 7.3)
Radial probability density: $P(r) = r^2 |R_{n,l}(r)|^2$ (Section 7.4)
Angular probability density: $P(\theta, \phi) = |\Theta_{l,m_l}(\theta)\, \Phi_{m_l}(\phi)|^2$ (Section 7.5)
Orbital magnetic dipole moment: $\vec{\mu}_L = -(e/2m)\vec{L}$ (Section 7.6)
Spin magnetic dipole moment: $\vec{\mu}_S = -(e/m)\vec{S}$ (Section 7.6)
Spin angular momentum: $|\vec{S}| = \sqrt{s(s+1)}\,\hbar = \sqrt{3/4}\,\hbar$ (for $s = 1/2$) (Section 7.6)
Spin magnetic quantum number: $S_z = m_s \hbar$ ($m_s = \pm 1/2$) (Section 7.6)
Spectroscopic notation: s ($l = 0$), p ($l = 1$), d ($l = 2$), f ($l = 3$), . . . (Section 7.7)
Selection rules for photon emission: $\Delta l = \pm 1$, $\Delta m_l = 0, \pm 1$ (Sections 7.7, 7.8)
Normal Zeeman effect: $\Delta\lambda = \dfrac{\lambda^2}{hc}\, \Delta E = \dfrac{\lambda^2}{hc}\, \mu_B B$ (Section 7.8)
Fine-structure estimate: $\Delta E = mc^2 \alpha^4 / n^5$ ($\alpha \approx 1/137$) (Section 7.9)

Questions

1. How does the quantum-mechanical interpretation of the hydrogen atom differ from the Bohr model?
2.
How does a quantized angular momentum vector differ from a classical angular momentum vector?
3. What are the meanings of the quantum numbers $n$, $l$, $m_l$ according to (a) the quantum-mechanical calculation; (b) the vector model; (c) the Bohr (orbital) model?
4. List the dynamical quantities that are constant for a specific choice of $n$ and $l$. List the dynamical quantities that are not constant. Compare these lists with the Bohr model.
5. How does the orbital angular momentum differ between the Bohr model and the quantum-mechanical calculation?
6. What does it mean that $\vec{L}$ precesses about the z axis? Can we observe the precession?
7. In the Bohr model, we calculated the total energy from the potential energy and kinetic energy for each orbit. In the quantum-mechanical calculation, is the potential energy constant for any set of quantum numbers? Is the kinetic energy? Is the total energy?
8. What is meant by the term spatial quantization? Is space really quantized?
9. A deficiency of the Bohr model is the problem of angular momentum conservation in transitions between levels. Discuss this problem in relation to the quantum-mechanical angular momentum properties of the atom, especially the selection rule Eq. 7.24. The photon can be considered to carry angular momentum $\hbar$.
10. The 2s electron has a greater probability to be close to the nucleus than the 2p electron and also a greater probability to be farther away (see Figure 7.10). How is this possible?
11. The probability density $\psi^*\psi$ does not depend on $\phi$ for the wave functions listed in Table 7.1. What is the significance of this?
12. How would the wave functions of Table 7.1 change if the nuclear charge were $Ze$ instead of $e$? (Recall how we made the same change in the Bohr model in Section 6.5.) What effect would this have on the radial probability densities $P(r)$?
13. Can a hydrogen atom in its ground state absorb a photon (of the proper energy) and end up in the 3d state?
14.
Is it correct to think of the electron as a tiny ball of charge spinning on its axis? Is it useful? Is this situation similar to using the Bohr model to represent the electron's orbital motion?
15. The photon has a spin quantum number of 1, but its spin magnetic moment is zero. Explain.
16. What are the similarities and differences between Zeeman splitting and fine-structure splitting?
17. How would the calculated fine structure be different in an atom with a single electron and a nuclear charge of $Ze$?
18. Does the fine structure, as we have calculated it, have any effect on the $n = 1$ level?
19. How would (a) the Zeeman effect and (b) the fine structure be different in a muonic hydrogen atom? (See Problems 41 and 42 in Chapter 6.) The muon has the same spin as the electron, but is 207 times as massive.
20. Even though our calculation of the fine structure was based on a very simplified model, it does yield a result similar to the more correct calculation: the fine-structure splitting decreases as we go to higher excited states. Give at least two qualitative reasons for this.

Problems

7.1 A One-Dimensional Atom
1. By substituting the wave function $\psi(x) = Axe^{-bx}$ into Eq. 7.2, show that a solution can be obtained only for $b = 1/a_0$, and find the ground-state energy.
2. Show that the probability density for the ground-state solution of the one-dimensional Coulomb potential energy has its maximum at $x = a_0$.
3. An electron in its ground state is trapped in the one-dimensional Coulomb potential energy. What is the probability to find it in the region between $x = 0.99a_0$ and $x = 1.01a_0$?

7.2 Angular Momentum in the Hydrogen Atom
4. An electron is in an angular momentum state with $l = 3$. (a) What is the length of the electron's angular momentum vector? (b) How many different possible z components can the angular momentum vector have? List the possible z components. (c) What are the values of the angle that the $\vec{L}$ vector makes with the z axis?
5. What angles does the $\vec{L}$ vector make with the z axis when $l = 2$?

7.3 The Hydrogen Atom Wave Functions
6. List the 16 possible sets of quantum numbers $n$, $l$, $m_l$ of the $n = 4$ level of hydrogen (as in Figure 7.6).
7. (a) What are the possible values of $l$ for $n = 6$? (b) What are the possible values of $m_l$ for $l = 6$? (c) What is the smallest possible value of $n$ for which $l$ can be 4? (d) What is the smallest possible $l$ that can have a z component of $4\hbar$?
8. Show that the (1, 0, 0) and (2, 0, 0) wave functions listed in Table 7.1 are properly normalized.
9. Show by direct substitution that the $n = 2$, $l = 0$, $m_l = 0$ and $n = 2$, $l = 1$, $m_l = 0$ wave functions of Table 7.1 are both solutions of Eq. 7.10 corresponding to the energy of the first excited state of hydrogen.
10. Show by direct substitution that the wave function corresponding to $n = 1$, $l = 0$, $m_l = 0$ is a solution of Eq. 7.10 corresponding to the ground-state energy of hydrogen.
11. Consider a thin spherical shell located between $r = 0.49a_0$ and $0.51a_0$. For the $n = 2$, $l = 1$ state of hydrogen, find the probability for the electron to be found in a small volume element that subtends a polar angle of 0.11° and an azimuthal angle of 0.25° if the center of the volume element is located at: (a) $\theta = 0$, $\phi = 0$; (b) $\theta = 90°$, $\phi = 0$; (c) $\theta = 90°$, $\phi = 90°$; (d) $\theta = 45°$, $\phi = 0$. Do the calculation for all possible $m_l$ values.

7.4 Radial Probability Densities
12. Show that the radial probability density of the 1s level has its maximum value at $r = a_0$.
13. Find the values of the radius where the $n = 2$, $l = 0$ radial probability density has its maximum values.
14. What is the probability of finding an $n = 2$, $l = 1$ electron between $a_0$ and $2a_0$?
15. For a hydrogen atom in the ground state, what is the probability to find the electron between $1.00a_0$ and $1.01a_0$? (Hint: It is not necessary to evaluate any integrals to solve this problem.)

7.5 Angular Probability Densities
16.
Find the directions in space where the angular probability density for the $l = 2$, $m_l = \pm 1$ electron in hydrogen has its maxima and minima.
17. Find the directions in space where the angular probability density for the $l = 2$, $m_l = 0$ electron in hydrogen has its maxima and minima.

7.6 Intrinsic Spin
18. (a) Including the electron spin, what is the degeneracy of the $n = 5$ energy level of hydrogen? (b) By adding up the number of states for each value of $l$ permitted for $n = 5$, show that the same degeneracy as part (a) is obtained.
19. For each $l$ value, the number of possible states is $2(2l + 1)$. Show explicitly that the total number of states for each principal quantum number is $\sum_{l=0}^{n-1} 2(2l + 1) = 2n^2$. This gives the degeneracy of each energy level.
20. Explain why each of the following sets of quantum numbers ($n$, $l$, $m_l$, $m_s$) is not permitted for hydrogen. (a) (2, 2, −1, +1/2) (b) (3, 1, +2, −1/2) (c) (4, 1, +1, −3/2) (d) (2, −1, +1, +1/2)

7.7 Energy Levels and Spectroscopic Notation
21. List the excited states (in spectroscopic notation) to which the 4p state can make downward transitions.
22. (a) A hydrogen atom is in an excited 5g state, from which it makes a series of transitions by emitting photons, ending in the 1s state. Show, on a diagram similar to Figure 7.19, the sequence of transitions that can occur. (b) Repeat part (a) if the atom begins in the 5d state.
23. (a) List in spectroscopic notation all levels with $n = 7$. (b) An electron is initially in the state with $n = 7$, $l = 2$. List in spectroscopic notation all lower states to which transitions are allowed.

7.8 The Zeeman Effect
24. Consider the normal Zeeman effect applied to the 3d to 2p transition. (a) Sketch an energy-level diagram that shows the splitting of the 3d and 2p levels in an external magnetic field. Indicate all possible transitions from each $m_l$ state of the 3d level to each $m_l$ state of the 2p level. (b) Which transitions satisfy the $\Delta m_l = \pm 1$ or 0 selection rule?
(c) Show that there are only three different transition energies emitted.
25. A collection of hydrogen atoms is placed in a magnetic field of 3.50 T. Ignoring the effects of electron spin, find the wavelengths of the three normal Zeeman components (a) of the 3d to 2p transition; (b) of the 3s to 2p transition.

7.9 Fine Structure
26. Calculate the wavelengths of the components of the first line of the Lyman series, taking the fine structure of the 2p level into account.
27. Calculate the energies and wavelengths of the 3d to 2p transition, taking into account the fine structure of both levels. How many component wavelengths might there be in the transition?

General Problems
28. Show that the wave function $\psi(x) = A(x + cx^2)e^{-bx}$ gives a solution to the Schrödinger equation for the one-dimensional Coulomb potential energy. Evaluate the constants $A$, $b$, $c$, and find the energy corresponding to this solution.
29. Find the probabilities for the $n = 2$, $l = 0$ and $n = 2$, $l = 1$ electron states in hydrogen to be further than $r = 5a_0$ from the nucleus. Which has the greater probability to be far from the nucleus?
30. The mean or average value of the radius $r$ can be found according to $r_{av} = \int_0^\infty r P(r)\, dr$. Show that the mean value of $r$ for the 1s state of hydrogen is $\frac{3}{2}a_0$. Why is this greater than the Bohr radius?
31. Find the value of $r_{av}$ (see Problem 30) for the 2s and 2p levels.
32. The mean or average value of the potential energy of the electron in a hydrogen atom can be found from $U_{av} = \int_0^\infty U(r) P(r)\, dr$. Find $U_{av}$ in the 1s state and compare with the potential energy computed with the Bohr model when $n = 1$.
33. Suppose the source of atoms in a Stern-Gerlach experiment were an oven of temperature 1000 K. Assume the magnetic field gradient to be 10 T/m, and take the length of the magnetic field region and the field-free region between magnet and screen to be 1 m each.
Make any other assumptions you may need and estimate the separation of the images observed on the screen.
34. For the 1s, 2s, and 2p states of hydrogen, show that $(r^{-1})_{av} = 1/n^2 a_0$. This turns out to be a general result for any state of hydrogen. Based on this result, explain why the Bohr model gives such a good estimate for the fine-structure splitting as well as for other magnetic effects due to the circulating electron.

Chapter 8 MANY-ELECTRON ATOMS

[Chapter-opening image caption:] This computer-generated drawing shows the structure of an atom of neon, with the electron probability distributions surrounding the central nucleus. The bright inner sphere represents the 1s electrons, the dark outer sphere is the 2s electrons, and the lobes are the 2p electrons. This is a more realistic picture of an atom than the "planetary" view developed in Chapter 6.

Physicists often attack complex problems by trying to separate the more important parts from the less important. For example, in analyzing the motion of the Earth in the Solar System, we can start by ignoring all bodies other than the Sun. With this simplification, we find that the Earth moves about the Sun in an elliptical orbit. Now we can account for the effect of the Moon, which introduces a slight "wobble" about the ellipse. Finally, we can introduce the much weaker effect of the gravitational pull of the other planets.

It is tempting to try to use a similar approach to understand the motion of electrons in atoms with more than one electron. Unfortunately, we can't analyze the motion of an electron in an atom with more than one electron by separating out the more and less influential forces. For example, in a neutral atom with atomic number $Z$, each electron experiences an electrostatic force due to the nucleus with a charge of $+Ze$, but it also experiences an electrostatic force due to all the other electrons with a total charge of $-(Z - 1)e$.
The effect of the nucleus is comparable to the effect of the other electrons, which can't be analyzed as a small correction. We are thus required to consider simultaneously the effect of the nucleus and each of the other electrons.

The problem of the mutual interactions of three or more objects is an example of what physicists call the many-body problem. Exact, closed-form solutions to the Schrödinger equation cannot be found for such problems. The solutions must be obtained numerically using a computer. In this chapter, we consider an approximate set of energy levels for many-electron atoms, and we try to understand some of the properties of atoms (chemical, electrical, magnetic, optical, etc.) based on those energy levels.

8.1 THE PAULI EXCLUSION PRINCIPLE

[Photo caption:] Wolfgang Pauli (1900–1958, Switzerland). His exclusion principle gave the basis for understanding atomic structure. He also contributed to the development of quantum theory, to the theory of nuclear beta decay, and to the understanding of symmetry in physical laws.

Let's begin by considering how the $Z$ electrons in an atom might occupy the atomic energy levels. As a first guess, we might expect that all $Z$ electrons will eventually cascade down to the lowest energy level, the 1s state. If this were correct, we would expect the properties of the atom to vary rather smoothly compared with its neighbors having $Z \pm 1$ electrons. Indeed, certain of the properties of atoms, such as the energies of the emitted X rays, show this smooth variation. However, other properties do not vary in this way and thus are not consistent with this model of all electrons in the same level. For example, neon (with $Z = 10$) is an inert gas; it is practically unreactive and does not form chemical compounds under most conditions. Its neighbors, fluorine ($Z = 9$) and sodium ($Z = 11$), are among the most reactive of the elements and under most conditions will combine with other substances, sometimes violently.
As another example, nickel ($Z = 28$) is strongly magnetic (ferromagnetic) and, for a metal, does not have a particularly large electrical conductivity. Copper ($Z = 29$) is an excellent electrical conductor but is not magnetic. Such wide variations in properties between neighboring elements suggest that it is not correct to assume that all electrons occupy the same energy level.

The rule that prevents all of the electrons in an atom from falling into the 1s level was proposed by Wolfgang Pauli in 1925, based on a study of the transitions that are present, and those that are expected but not present, in the emission spectra of atoms. Simply stated, the Pauli exclusion principle is as follows:

No two electrons in a single atom can have the same set of quantum numbers ($n$, $l$, $m_l$, $m_s$).

The Pauli principle is the most important rule governing the structure of atoms, and no study of the properties of atoms can be attempted without a thorough understanding of this principle.

To illustrate how the Pauli principle works, consider the structure of helium ($Z = 2$). The first electron in helium, in the 1s ground state, has quantum numbers $n = 1$, $l = 0$, $m_l = 0$, $m_s = +1/2$ or $-1/2$. The second electron can have the same $n$, $l$, and $m_l$, but it cannot have the same $m_s$, because the exclusion principle would be violated. Thus if the first 1s electron has $m_s = +1/2$, the second 1s electron must have $m_s = -1/2$.

Now consider an atom of lithium ($Z = 3$). Just as with helium, the first two electrons will have quantum numbers ($n$, $l$, $m_l$, $m_s$) = (1, 0, 0, +1/2) and (1, 0, 0, −1/2). According to the exclusion principle, the third electron cannot have the same set of quantum numbers as the first two, so it cannot go into the $n = 1$ level, because there are only two different sets of quantum numbers available in the $n = 1$ level, and both of those sets have already been used.
The third electron must therefore go into one of the $n = 2$ levels, and experiments indicate that the 2s level is the next available. Without the Pauli principle, lithium would have three electrons in the 1s level; with the Pauli principle, we expect that lithium has two electrons in the 1s level and one electron in the 2s level. These two different possible structures for lithium would give very different physical properties, and the physical properties of lithium indicate that the structure with one electron in the 2s level is the correct one.

We can continue this process with beryllium ($Z = 4$). The fourth electron can join the third electron in the 2s level, but that now completes the capacity of the 2s level: one of the electrons might have quantum numbers ($n$, $l$, $m_l$, $m_s$) = (2, 0, 0, +1/2) and the other might have (2, 0, 0, −1/2). There are no other sets of quantum numbers that an additional electron could have in the 2s level without duplicating one of the sets that has already been assigned and thus violating the Pauli principle.

When we reach boron, with $Z = 5$, the fifth electron must go into a different level, one of the 2p levels. We might therefore expect that the properties of boron, with a 2p electron, would be different from the properties of lithium or beryllium, which have only 2s electrons. It is this process of first using up all of the possible quantum numbers for one level, and then placing electrons in the next level, that accounts for the variations in the chemical and physical properties of the elements.

Example 8.1
A certain atom has six electrons in the 3d level. (a) What is the maximum possible total $m_l$ for the six electrons, and what is the total $m_s$ in that configuration? (b) What is the maximum possible total $m_s$ for the six electrons, and what would be the largest possible total $m_l$ in that configuration?

Solution
(a) For a d state $l = 2$, so the possible $m_l$ values are +2, +1, 0, −1, and −2.
At most two electrons can be assigned $m_l$ of +2 according to the Pauli principle (one with $m_s = +1/2$ and one with $m_s = -1/2$). Similarly, two electrons can be assigned $m_l$ of +1 (again, with $m_s = +1/2$ and $m_s = -1/2$), and the remaining two electrons can be assigned to $m_l$ of 0. That gives a total $m_l$ of +6, with a total $m_s$ of 0.
(b) To maximize $m_s$, we can assign at most five electrons to $m_s = +1/2$ (with corresponding $m_l$ values of +2, +1, 0, −1, and −2). The sixth electron cannot also have $m_s = +1/2$, because its $m_l$ value would be the same as one already assigned, which would violate the Pauli principle by having two electrons with the same $m_l$ and $m_s$ labels. The sixth electron must therefore have $m_s = -1/2$, giving a total $m_s$ of +2. The first five electrons give a total $m_l$ of 0, so the largest total $m_l$ would be obtained by assigning the sixth electron to $m_l$ of +2, giving a total $m_l$ of +2.

FIGURE 8.1 Atomic subshells, in order of increasing energy. The energy groupings are not to scale, but represent the relative energies of the subshells. [Order shown, from lowest to highest energy: 1s, 2s, 2p, 3s, 3p, 4s, 3d, 4p, 5s, 4d, 5p, 6s, 4f, 5d, 6p.]

8.2 ELECTRONIC STATES IN MANY-ELECTRON ATOMS

Figure 8.1 illustrates the result of an approximate calculation of the order of the filling of energy levels in many-electron atoms as the atomic number $Z$ increases. The 1s level is always the lowest energy level to be filled, and the 2s and 2p levels are fairly close in energy. The 2s level always lies a bit lower in energy than the 2p level, and so the 2s level is filled before the 2p. (The fine-structure splitting is very small on the scale of this diagram.)

We can understand why the 2s level lies lower in energy if we recall Example 7.6 and Figure 7.10. An electron in the 2s level has a greater probability to be found at small radii compared with an electron in the 2p level.
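The penetration claim can be made quantitative for hydrogen. The sketch below is an illustration under stated assumptions, not a calculation for a many-electron atom: it uses the standard hydrogen radial probability densities for the 2s and 2p states (the $n = 2$ functions of the kind listed in Table 7.1), written in terms of $x = r/a_0$, and a hand-written Simpson's-rule integrator. The function names are hypothetical.

```python
import math

# Hydrogen radial probability densities P(r) = r^2 |R(r)|^2, expressed
# with x = r/a0 so that P is in units of 1/a0 (standard n = 2 functions).


def p_2s(x):
    """2s state: P = x^2 (2 - x)^2 exp(-x) / 8."""
    return x**2 * (2 - x)**2 * math.exp(-x) / 8


def p_2p(x):
    """2p state: P = x^4 exp(-x) / 24."""
    return x**4 * math.exp(-x) / 24


def simpson(f, a, b, steps=1000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


# Probability to find the electron inside one Bohr radius (r < a0).
prob_2s = simpson(p_2s, 0.0, 1.0)
prob_2p = simpson(p_2p, 0.0, 1.0)
print(f"P(r < a0), 2s: {prob_2s:.4f}")
print(f"P(r < a0), 2p: {prob_2p:.4f}")
```

With these hydrogen functions the 2s electron is found inside $r = a_0$ about 3.4% of the time, compared with about 0.4% for the 2p electron, roughly a factor of nine. The numbers would differ in a many-electron atom, but the qualitative ordering is the point of the comparison.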
(Penetrating close to the nucleus, the 2s electron also is attracted by the full nuclear charge +Ze, while the 2p electron spends most of its time beyond the orbits of the 1s electrons where it is attracted by an effective charge that is less than the full charge of the nucleus. We’ll discuss this effect, which is called electron screening, in Section 8.3.) These two effects—closer penetration to the nucleus and screening—are responsible for the tighter binding of the 2s electrons compared with the 2p electrons. A more extreme example of the tighter binding of the penetrating orbits occurs for the n = 3 levels. The 3s electron penetrates the inner orbits (it has a large probability density at small r; see Figure 7.10), and the 3p electron penetrates almost as much. The 3d electron has negligible penetration of the inner orbits. As a result, the 3s and 3p levels are more tightly bound and therefore lower in energy than the 3d level. A similar effect occurs for the n = 4 levels—the tighter binding of the 4s and 4p electrons pulls their energy levels down so low that they almost coincide with the 3d level, as shown in Figure 8.1. The 3d and 4s levels are very close in energy—for some atoms the 3d level is lower and for some atoms the 4s is lower. This small energy difference is an important factor that contributes to the large electrical conductivity of copper, as we discuss later in this chapter. The tighter binding of the penetrating s and p orbits also pulls the 5s and 5p levels down close to the 4d level, and similarly causes the 6s and 6p levels to appear at roughly the same energy as the 5d and 4f levels. As we learned in the case of the hydrogen atom, orbits with the same value of n all lie at about the same average distance from the nucleus. 
(The electrons in the penetrating orbits spend some of their time closer to the nucleus than the nonpenetrating orbits, but also some of their time further from the nucleus; the average distance from the nucleus of the penetrating orbits is then about the same as the average distance from the nucleus of the nonpenetrating orbits with the same value of $n$. See Problem 31 in Chapter 7 for a verification of this property for the hydrogen atom.) The set of orbits with a certain value of $n$, with about the same average distance from the nucleus, is known as an atomic shell. The atomic shells are designated by letter, as follows:

n:       1   2   3   4   5
Shell:   K   L   M   N   O

The levels with a certain value of $n$ and $l$ (for instance, 2s or 3d) are known as subshells. According to the Pauli principle, the maximum number of electrons that can be placed in each subshell is $2(2l + 1)$. The $(2l + 1)$ factor comes from the number of different $m_l$ values for each $l$, because $m_l$ can take the values $0, \pm 1, \pm 2, \pm 3, \ldots, \pm l$. The extra factor of 2 comes from the two different $m_s$ values; for each $m_l$, we can have $m_s = +1/2$ or $m_s = -1/2$. According to this scheme, the 1s subshell has a capacity of $2(2 \times 0 + 1) = 2$ electrons; the 3d subshell has a capacity of $2(2 \times 2 + 1) = 10$ electrons. (Note that this capacity doesn't depend on $n$; any d subshell has a capacity of 10 electrons.) Table 8.1 shows the ordering and capacity of the subshells.

It is important to keep in mind exactly what is represented by Figure 8.1 and Table 8.1. They give the order of filling of the energy levels, and so they represent only the "outer" or valence electrons. For example, the first 18 electrons fill the levels up through 3p, and the energy levels (subshells) available to the 19th electron in potassium ($Z = 19$) or calcium ($Z = 20$) are well described by Figure 8.1.
However, the energy levels appropriate to the 19th electron in a heavy element such as lead (Z = 82) would be very different. In this case it is more correct to describe the atom in terms of shells: all of the n = 3 states (the M shell) are grouped together, as are all of the n = 4 states (the N shell), and so forth. When we discuss the inner structure of the atom, as in the case of X rays, the ordering of Figure 8.1 is not appropriate, and it is better to group the levels by shells, as we do in Section 8.5.

TABLE 8.1 Filling of Atomic Subshells

n   l   Subshell   Capacity 2(2l + 1)
1   0   1s          2
2   0   2s          2
2   1   2p          6
3   0   3s          2
3   1   3p          6
4   0   4s          2
3   2   3d         10
4   1   4p          6
5   0   5s          2
4   2   4d         10
5   1   5p          6
6   0   6s          2
4   3   4f         14
5   2   5d         10
6   1   6p          6
7   0   7s          2
5   3   5f         14
6   2   6d         10

The Periodic Table

Figure 8.2 shows the periodic table, which is an orderly array of the chemical elements, listed in order of increasing atomic number Z and arranged in such a way that the vertical columns, called groups, contain elements with rather similar physical and chemical properties. In this section we discuss the way in which the filling of electronic subshells helps us understand the arrangement of the periodic table. In later sections we examine some of the physical and chemical properties of the elements.

[FIGURE 8.2 The periodic table of the elements. Labeled groups: alkalis, alkaline earths, transition metals, halogens, inert gases, lanthanides (rare earths), and actinides.]

In attempting to understand the ordering of subshells and the periodic table, we must follow two rules for filling the electronic subshells:

1. The capacity of each subshell is 2(2l + 1). (This is of course just another way of stating the Pauli exclusion principle.)
2. The electrons occupy the lowest energy states available.

To indicate the electron configuration of each element, we use a notation in which the identity of the subshell and the number of electrons in it are listed. The identity of the subshell is indicated in the usual way, and the number of electrons in that subshell is indicated by a superscript. Thus hydrogen has the configuration 1s¹, for one electron in the 1s subshell, and helium has the configuration 1s². Helium has both a filled subshell (the 1s) and a closed major shell (the K shell) and thus is an extraordinarily stable and inert element. With lithium (Z = 3), we begin to fill the 2s subshell; lithium has the configuration 1s² 2s¹.
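The two filling rules can be tried out in a few lines of code. The following is only a minimal sketch: it assumes the idealized subshell ordering of Table 8.1, the names `FILL_ORDER`, `capacity`, and `configuration` are invented here for illustration, and configurations are printed in plain text (1s2 rather than a superscript). Because it follows the idealized ordering strictly, it will not reproduce the small variations that occur when two levels lie very close in energy.

```python
# Subshell filling order taken from Table 8.1, as (n, l) pairs, lowest energy first.
FILL_ORDER = [
    (1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (4, 0), (3, 2), (4, 1), (5, 0),
    (4, 2), (5, 1), (6, 0), (4, 3), (5, 2), (6, 1), (7, 0), (5, 3), (6, 2),
]
L_LETTERS = "spdf"  # spectroscopic letters for l = 0, 1, 2, 3


def capacity(l):
    """Rule 1: a subshell with quantum number l holds 2(2l + 1) electrons."""
    return 2 * (2 * l + 1)


def configuration(z):
    """Rule 2: place z electrons in the lowest available subshells."""
    remaining = z
    parts = []
    for n, l in FILL_ORDER:
        if remaining == 0:
            break
        count = min(remaining, capacity(l))
        parts.append(f"{n}{L_LETTERS[l]}{count}")
        remaining -= count
    if remaining > 0:
        raise ValueError("z exceeds the total capacity of the listed subshells")
    return " ".join(parts)


if __name__ == "__main__":
    for z, name in [(1, "H"), (2, "He"), (3, "Li")]:
        print(name, configuration(z))
```

For hydrogen, helium, and lithium the sketch gives 1s1, 1s2, and 1s2 2s1, matching the configurations quoted above.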
With beryllium (Z = 4, 1s² 2s²) the 2s subshell is full, and the next element must begin filling the 2p subshell (boron, Z = 5, 1s² 2s² 2p¹). The 2p subshell has a capacity of six electrons, and with neon (Z = 10, 1s² 2s² 2p⁶) both the 2p subshell and the L shell (n = 2) are complete. The next row (or period) begins with sodium (Z = 11, 1s² 2s² 2p⁶ 3s¹), and the 3s and 3p subshells are filled in much the same way as the 2s and 2p subshells, ending with the inert gas argon (Z = 18, 1s² 2s² 2p⁶ 3s² 3p⁶). The elements of the third row (period) are chemically similar to the corresponding elements of the second row (period), and so are written directly under them.

The next electron might be expected to go into the 3d level. However, the highly penetrating orbit of the 4s electron causes the 4s level to appear at a slightly lower energy than the 3d level, so the 4s subshell normally fills first. The configurations of potassium (Z = 19) and calcium (Z = 20) are therefore respectively 1s² 2s² 2p⁶ 3s² 3p⁶ 4s¹ and 1s² 2s² 2p⁶ 3s² 3p⁶ 4s². These elements have properties similar to, and therefore appear directly under, the corresponding elements with one and two s-subshell electrons in the second and third periods.

We now begin to fill the 3d subshell. Because there is no 1d or 2d subshell, we would expect the first element with a d-subshell configuration to have rather different chemical properties from the elements we have placed previously; thus it should not appear in any of our previously occupied groups (columns), and so we begin a new group with scandium (Z = 21, 1s² 2s² 2p⁶ 3s² 3p⁶ 4s² 3d¹). The 3d subshell eventually closes with zinc (Z = 30, 1s² 2s² 2p⁶ 3s² 3p⁶ 4s² 3d¹⁰). Along the way there are some minor variations; the most important is copper, with Z = 29. For this case the 3d level lies slightly lower than the 4s level, and so the 3d subshell fills before the 4s, resulting in the configuration 1s² 2s² 2p⁶ 3s² 3p⁶ 3d¹⁰ 4s¹.
As we discuss later, this configuration is responsible for the large electrical conductivity of copper. In the next series of elements, the 4p subshell is filled, from gallium (Z = 31) to the inert gas krypton (Z = 36).

When we move to the next period, we fill the 5s subshell before the 4d subshell, and the series of 10 elements corresponding to the filling of the 4d subshell is written directly under the series that had unfilled configurations in the 3d subshell. (Silver, with Z = 47, corresponds exactly to copper in the fourth period, with the 4d subshell filling before the 5s.) After the completion of the 4d subshell, the 5p subshell is filled, ending with the inert gas xenon (Z = 54).

TABLE 8.2 Electronic Configurations of Some Elements

H    1s¹              Mn   [Ar]4s² 3d⁵          La   [Xe]6s² 5d¹
He   1s²              Cu   [Ar]4s¹ 3d¹⁰         Ce   [Xe]6s² 5d¹ 4f¹
Li   1s² 2s¹          Zn   [Ar]4s² 3d¹⁰         Pr   [Xe]6s² 4f³
Be   1s² 2s²          Ga   [Ar]4s² 3d¹⁰ 4p¹     Gd   [Xe]6s² 5d¹ 4f⁷
B    1s² 2s² 2p¹      Kr   [Ar]4s² 3d¹⁰ 4p⁶     Dy   [Xe]6s² 4f¹⁰
Ne   1s² 2s² 2p⁶      Rb   [Kr]5s¹              Yb   [Xe]6s² 4f¹⁴
Na   [Ne]3s¹          Y    [Kr]5s² 4d¹          Lu   [Xe]6s² 5d¹ 4f¹⁴
Al   [Ne]3s² 3p¹      Mo   [Kr]5s¹ 4d⁵          Re   [Xe]6s² 5d⁵ 4f¹⁴
Ar   [Ne]3s² 3p⁶      Ag   [Kr]5s¹ 4d¹⁰         Au   [Xe]6s¹ 5d¹⁰ 4f¹⁴
K    [Ar]4s¹          In   [Kr]5s² 4d¹⁰ 5p¹     Hg   [Xe]6s² 5d¹⁰ 4f¹⁴
Sc   [Ar]4s² 3d¹      Xe   [Kr]5s² 4d¹⁰ 5p⁶     Tl   [Xe]6s² 5d¹⁰ 4f¹⁴ 6p¹
Cr   [Ar]4s¹ 3d⁵      Cs   [Xe]6s¹              Rn   [Xe]6s² 5d¹⁰ 4f¹⁴ 6p⁶

A symbol in brackets [ ] means that the atom has the configuration of the previous inert gas plus the additional electrons listed.

The next period begins with cesium and barium filling the 6s subshell. As was the case in the previous periods, the 5d and 6s lie at almost the same energy. However, there is yet another subshell at about the same energy as the 6s and 5d: the 4f subshell, which now begins to fill, from lanthanum to ytterbium. This series of
elements, called the lanthanides or rare earths, is usually written separately in the periodic table, because there have been no other f-subshell elements under which to write them. The 4f subshell has a capacity of 14 electrons, and so there are 14 elements in the lanthanide series. Once the 4f subshell is complete, we return to filling the 5d subshell, writing those elements in the groups under the corresponding 3d and 4d elements, and then complete the period with the filling of the 6p subshell, ending with the inert gas radon (Z = 86). The seventh period is filled much like the sixth, with a series known as the actinides, written under the lanthanides, corresponding to the filling of the 5f subshell.

What is most remarkable about this scheme is that the arrangement of the periodic table was known well before the introduction of atomic theory. The elements were organized into groups and periods based on their physical and chemical properties by Dmitri Mendeleev in 1869; understanding that organization in terms of atomic levels is a great triumph for the atomic theory. This way of organizing the elements gives us great insight into their physical and chemical properties, as we discuss in the next sections. Table 8.2 lists the electronic configurations of some of the elements.

Example 8.2
Copper has the electronic configuration [Ar]4s¹ 3d¹⁰ in its ground state. Adding a small amount of energy (about 1 eV) to a copper atom can move one of the 3d electrons to the 4s level and change the configuration to [Ar]4s² 3d⁹. Adding still more energy (about 5 eV) can move one of the 3d electrons to the 4p level, so that the configuration becomes [Ar]4s¹ 3d⁹ 4p¹. For each of these configurations, determine the maximum value of the total m_s of the electrons.

Solution
The electrons in the filled shells (Ar core) have a total m_s of zero.
In fact, any filled subshell has equal numbers of electrons in m_s = +1/2 and m_s = −1/2 states, which also gives a total of zero. In the 4s¹ 3d¹⁰ configuration, only the single 4s electron contributes to m_s, and its maximum value is +1/2. In the 4s² 3d⁹ configuration, the two 4s electrons give a total m_s of zero. In the 3d subshell, there are 5 different m_l values, so we can have at most 5 electrons with m_s = +1/2. The remaining 4 electrons must have m_s = −1/2, so we have a total m_s of 5 × (+1/2) + 4 × (−1/2) = +1/2. In the 4s¹ 3d⁹ 4p¹ configuration, each of the three subshells contributes a maximum m_s of +1/2, so the maximum total m_s is +3/2.

8.3 OUTER ELECTRONS: SCREENING AND OPTICAL TRANSITIONS

[FIGURE 8.3 Electron structure in lithium, as might be seen from the average location of an outer (2s) electron: the nucleus (+3e) surrounded by the 1s electrons (−2e). The dashed line represents a spherical Gaussian surface at that location.]

The electronic configurations of the alkali elements (those in the first column of the periodic table) all show a single s electron outside an inert gas core. These elements are very reactive, meaning they can easily give up the s electron to another element to form a chemical bond. For example, lithium (1s² 2s¹) readily gives up its 2s electron to form the positive ion Li⁺.

It may at first seem somewhat surprising that Li gives up its electron so easily. The ionization energy of Li is 5.39 eV. This is smaller than the ionization energy of hydrogen (13.6 eV), even though from Eq. 6.38 we might expect that the energies of electrons in atoms should increase in proportion to Z². We can understand this effect from the diagram of Figure 8.3. The lithium atom can be roughly characterized by an inner atomic shell consisting of two 1s electrons and a single electron in the 2s subshell.
As was the case in the one-electron atoms we considered in Chapters 6 and 7, the principal quantum number n determines the average distance of an electron from the nucleus. Although there is no simple formula that allows us to calculate the average orbital radius in atoms with more than one electron, it is certainly reasonable to expect that the 2s electron is most likely to be found much farther from the nucleus than the 1s electrons.

The net electric force on the 2s electron can be estimated using Gauss’s law. Imagine a spherical surface centered at the nucleus having a radius equal to the average orbital radius of the 2s electron. The electric field at that distance is determined, according to Gauss’s law, by the net charge contained within the sphere. The electrons in the n = 1 orbit have nearly a 100% probability of being found within the sphere. Thus the net charge inside the sphere must include the nucleus (+3e) and the two n = 1 electrons (−2e) for a total net charge of +e. To a good approximation, for some applications a lithium atom looks very much like a one-electron atom with the electron in the n = 2 orbit about a nucleus with an effective charge of +e. (Recall from electrostatics that if the charge distribution is spherically symmetric, we can replace an extended charge distribution with a point charge at the center of the sphere.) Equation 6.38 gives the energy of such an electron in the n = 2 orbit in an atom with an effective nuclear charge of Z_eff e = +e as

$$E_n = (-13.6\ \text{eV})\,\frac{Z_{\text{eff}}^2}{n^2} = -3.40\ \text{eV} \qquad (8.1)$$

This simple model predicts that the ionization energy of a neutral lithium atom is 3.40 eV. The measured value is 5.39 eV. The agreement is not extremely good, but the estimated value is off by much less than a factor of Z² = 9, so the calculation is probably on the right track.
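As a quick numerical check of Eq. 8.1, the short sketch below evaluates the screened estimate (Z_eff = 1) and, for contrast, the unscreened estimate (Z_eff = 3) for the n = 2 electron in lithium. The function name `screened_energy` and the variable names are illustrative choices, not notation from the text; the 13.6 eV constant and the 5.39 eV measured ionization energy are the values quoted above.

```python
RYDBERG_EV = 13.6  # hydrogen ground-state binding energy in eV, as used in Eq. 8.1


def screened_energy(z_eff, n):
    """Energy (in eV) of an electron in level n around an effective charge z_eff * e."""
    return -RYDBERG_EV * z_eff**2 / n**2


if __name__ == "__main__":
    measured_ionization_ev = 5.39  # measured ionization energy of lithium

    fully_screened = screened_energy(1, 2)  # nucleus screened by both 1s electrons
    unscreened = screened_energy(3, 2)      # no screening at all

    print(f"screened estimate:   {-fully_screened:.2f} eV")
    print(f"measured value:      {measured_ionization_ev:.2f} eV")
    print(f"unscreened estimate: {-unscreened:.2f} eV")
```

The measured value lies between the two estimates but much closer to the fully screened one, which is the sense in which the simple model is on the right track.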
The difference between the measured and estimated values can be accounted for by an effect that we have already discussed: the penetration of the s electrons through the inner shells to be occasionally found close to the nucleus. The 2s electron sometimes finds itself much closer to the nucleus than its average orbital radius, and may occasionally be inside the n = 1 shell. In this case Gauss’s law tells us that the electron feels the full +3e charge of the nucleus, which results in an increase in the binding energy.

Let’s instead consider an excited state of lithium, in which the 2s electron moves to the 2p state. The 2p electron penetrates the inner shell hardly at all. The energy of the 2p electron in lithium is −3.54 eV, in almost exact agreement with the prediction of our simple model. The small discrepancy might indicate a small degree of penetration of the 2p electron inside some of the 1s probability distribution, which gives a small increase to the binding energy. If we instead move the outer electron to the 3d state, the measured energy is −1.51 eV, in exact agreement with the prediction of Eq. 8.1 for n = 3. The 3d electron has almost no penetration inside the 1s shell, and so that electron is very well described by Z_eff = 1.

This effect is called electron screening. To an outer electron, the charge of the nucleus can be screened or shielded by the electrons in the inner shells. This is one case in which the formulas we derived for the energies of a one-electron atom can be used to determine approximately the energy of an electron in an atom with more than one electron. For the outer electron in lithium, the 3 positive charges in the nucleus are screened by the negative charges of the two inner electrons, giving a net charge of one unit. The less penetrating the orbit of the outer electron, the more accurate the prediction of Eq. 8.1.
In lithium, for example, the 3d orbit has almost no penetration of the inner shells and so the formula gives a very accurate representation of the binding of that electron. The 2p orbit in lithium has relatively little penetration, so again the approximate formula gives a good prediction. It is less accurate for the 2s electron, which does occasionally penetrate through the inner 1s orbits.

Electron screening can also be used in a qualitative way to help understand the ionization energies of atoms. Consider helium, for example. In ionized helium, the single electron has an energy of −54.4 eV in its ground state. If we add a second electron to make neutral helium (with both electrons in the 1s state), the ionization energy is 24.6 eV. The screening of one electron by a portion of the probability distribution of the other is responsible for reducing the ionization energy from 54.4 eV when no second electron is present to 24.6 eV when the second electron is present.

Example 8.3
The ground state of helium has the configuration 1s². Use the electron screening model to predict the energies of the following excited states of helium: (a) 1s¹ 2s¹ (measured value −4.0 eV); (b) 1s¹ 2p¹ (−3.4 eV); (c) 1s¹ 3d¹ (−1.5 eV).

Solution
(a) For the outer electron in helium, the nuclear charge of +2e is screened by the single 1s electron, so the effective charge seen by the outer electron is +e. From Eq. 8.1, we have

$$E_n = (-13.6\ \text{eV})\,\frac{Z_{\text{eff}}^2}{n^2} = (-13.6\ \text{eV})\,\frac{1^2}{2^2} = -3.4\ \text{eV}$$

The measured value is −4.0 eV, suggesting that the 2s electron has a small penetration through the 1s distribution and thus experiences a somewhat tighter binding than this simple model predicts.
(b) Because Eq. 8.1 depends on n but not l, the calculation for the 2p excited state gives the same result as the calculation for the 2s excited state (−3.4 eV). Now the agreement is almost exact, because the 2p has less penetration than the 2s.
(c) For the 3d excited state, Eq.
8.1 gives

$$E_n = (-13.6\ \text{eV})\,\frac{Z_{\text{eff}}^2}{n^2} = (-13.6\ \text{eV})\,\frac{1^2}{3^2} = -1.5\ \text{eV}$$

The agreement is again very good, suggesting little penetration of the 3d electron inside the 1s probability distribution.

Optical Transitions

[FIGURE 8.5 A small portion of the energy level diagram for helium, showing the 1s², 1s¹2s¹, 1s¹2p¹, 1s¹3s¹, 1s¹3p¹, and 1s¹3d¹ levels. Note the Δl = ±1 transitions.]

When we excite one of the outer electrons to a higher energy level or remove it completely from the atom, the resulting vacancy can be filled by electrons dropping into the empty state. The energy lost by these electrons usually appears as emitted photons in the visible range of the spectrum, and the transitions are thus known as optical transitions. The binding energies of the outer electrons in a typical atom are of the order of several electron-volts, and so it takes relatively little energy to move an outer electron and produce an optical transition. In fact, it is the absorption and reemission of light by these outer electrons that are responsible for the colors of material objects (although in solids the electron energy levels are usually very different from those in isolated atoms).

In contrast with X-ray spectra, which vary slowly and smoothly from one element to the next, optical spectra can show large variations between neighboring elements, especially those that correspond to filled subshells. Beyond hydrogen, the simplest energy-level diagrams to understand are those of the alkali metals, which have a single s electron outside an inert core. Many of the excited states then correspond to the excitation of this single electron, and the resulting spectra are very similar to the spectrum of hydrogen, because the nuclear charge of +Ze is screened by the other (Z − 1) electrons.
Figure 8.4 shows the energy levels of Li and Na along with some of the emitted transitions, which follow the same Δl = ±1 selection rule as the transitions in hydrogen (see Figure 7.19). The ground-state configuration of lithium is 1s² 2s¹ and the ground-state configuration of sodium is 1s² 2s² 2p⁶ 3s¹. The excited states in both cases can be obtained by moving the outer electron to a higher state. For example, the first excited state of Li is 1s² 2p¹, with the 2s electron moving to the 2p level. (The energy necessary to accomplish this can be provided by various means, such as by absorption of a photon or by passage of an electric current through the material as in a gas discharge tube.) The excited electron in the 2p state rapidly drops back to the 2s state, with the emission of a photon of wavelength 670.8 nm. The inert core doesn’t participate in this excitation or emission, so to a good approximation we can ignore all but the outer electron in studying the levels and transitions in the alkali elements.

[FIGURE 8.4 (a) Energy-level diagram of lithium, showing some of the transitions (labeled with wavelength in nm) in the optical region. (b) Energy-level diagram for sodium. Because of the fine-structure splitting, the 3p level in sodium is actually a very closely spaced pair of levels, so all transitions involving that level show two closely spaced wavelengths. The wavelength shown here is the average of the two. The fine-structure splitting is negligible for the other levels in sodium and for all of the levels in lithium.]

The ground-state configuration of helium is 1s². We can produce an excited state by moving one of these electrons up to a higher level, and so some possible excited-state configurations might be 1s¹ 2s¹, 1s¹ 2p¹, 1s¹ 3s¹, and so forth.
Photons are emitted when the excited electron drops back to the 1s level. The Δl = ±1 selection rule for transitions once again limits those that can occur. Figure 8.5 shows a portion of the energy level diagram for helium.

The phenomenon of fluorescence is responsible for the appearance of objects under so-called “black light,” which is a source of ultraviolet radiation. Photons in the ultraviolet region, invisible to the human eye, have higher energies than those in the visible region, and hence if an ultraviolet photon is absorbed by an atom, the outer electron (which is responsible for the optical transitions) can be excited to high levels. These electrons make transitions back to their ground state, accompanied by the emission of photons in the visible region.

Objects seen in ultraviolet light often show colors in the blue or violet end of the spectrum that are not present when the objects are viewed in sunlight. We can understand this effect by considering the composition of sunlight and the optical excited states of a hypothetical atom shown in Figure 8.6. The intensity of sunlight is concentrated in the center of the visible spectrum, in the yellow region; very little intensity is present in the red or blue ends of the visible spectrum. The “yellow” photons have enough energy to excite the hypothetical atom to levels 1 and 2 shown in Figure 8.6, but not enough to reach level 3 or 4. However, the higher-energy ultraviolet photons have sufficient energy to reach the higher levels, so the light emitted by the atom has a stronger blue component when that atom is excited by ultraviolet light than when excited by sunlight.

[FIGURE 8.6 Excited states of a hypothetical atom, with levels labeled 1.8 eV, 2.4 eV, and 3.0 eV above the ground state (0 eV) and transitions labeled 689 nm, 517 nm, and 413 nm. Only excited states 1 and 2 can be easily reached by exposure to sunlight; exposure to ultraviolet light populates state 4, which in turn populates state 3. Under ultraviolet light, a stronger blue or violet (413 nm) color is revealed than under sunlight.]
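The wavelengths that label Figure 8.6 can be checked from the level energies with λ = hc/E. The sketch below uses the approximate value hc ≈ 1240 eV · nm and assumes, as a reading of the figure labels rather than something stated in the text, that states 1, 2, and 3 lie 1.8 eV, 2.4 eV, and 3.0 eV above the ground state; the names `HC_EV_NM` and `wavelength_nm` are chosen here for illustration.

```python
HC_EV_NM = 1240.0  # approximate value of hc in eV * nm


def wavelength_nm(energy_ev):
    """Wavelength (in nm) of a photon carrying the given energy (in eV)."""
    return HC_EV_NM / energy_ev


if __name__ == "__main__":
    # Assumed assignment of the energy labels in Figure 8.6 to excited states 1, 2, 3.
    level_energies_ev = {1: 1.8, 2: 2.4, 3: 3.0}
    for state, energy in level_energies_ev.items():
        print(f"state {state} -> ground: {wavelength_nm(energy):.0f} nm")
```

Under that assumption the 3.0 eV level gives the violet 413 nm line, which is the one that becomes prominent under ultraviolet illumination.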
Example 8.4
Calculate the energy difference between the 3d and 2p states in lithium, and compare with the corresponding energy difference in hydrogen.

Solution
From Figure 8.4, the wavelength of the photon emitted in the 3d to 2p transition is 610.4 nm. The energy difference is then

$$\Delta E = \frac{hc}{\lambda} = \frac{1240\ \text{eV}\cdot\text{nm}}{610.4\ \text{nm}} = 2.03\ \text{eV}$$

The energy difference between corresponding levels in hydrogen (Figure 6.20) is E_3 − E_2 = −1.51 eV − (−3.40 eV) = 1.89 eV. Due to electron screening, we expect the outer electron in lithium to behave similarly to the electron in hydrogen, so the energy differences are in rough agreement.

8.4 PROPERTIES OF THE ELEMENTS

In this section we briefly study the way our knowledge of atomic structure helps us to understand the physical and chemical properties of the elements. Our discussion is based on the following two principles:

1. Filled subshells are normally very stable configurations. An atom with one electron beyond a filled shell will readily give up that electron to another atom to form a chemical bond. Similarly, an atom lacking one electron from a filled shell will readily accept an additional electron from another atom in forming a chemical bond.
2. Filled subshells do not normally contribute to the chemical or physical properties of an atom. Only the electrons in the unfilled subshells need be considered. (X-ray energies, discussed in the next section, are an exception to this rule.) Sometimes only a single outer electron is the primary factor influencing the physical properties of an element.

We consider a number of different physical properties of the elements, and try to understand those properties based on atomic theory.

TABLE 8.3 Ionization Energies (in eV) of Neutral Atoms of Some Elements

H    13.60      Ar   15.76
He   24.59      K     4.34
Li    5.39      Cu    7.72
Be    9.32      Kr   14.00
Ne   21.56      Rb    4.18
Na    5.14      Au    9.22

1. Atomic Radii.
The radius of an atom is not a precisely defined quantity, because the electron probability density determines the “size” of an atom. The radii are also difficult to define experimentally, and in fact different kinds of experiments may give different values for the radii. One way of defining the radius is by means of the spacing between the atoms in a crystal containing that element. Figure 8.7 shows how such typical atomic radii vary with Z.

2. Ionization Energy. Table 8.3 gives the ionization energies of some of the elements, and Figure 8.8 shows the variation of ionization energy with atomic number Z.

[FIGURE 8.7 Atomic radii (in nm) as a function of Z, determined from atomic separations in ionic crystals. These radii are different from the mean radii of the electron cloud for free atoms.]

[FIGURE 8.8 Ionization energies (in eV) of neutral atoms of the elements as a function of Z.]

3. Electrical Resistivity. In bulk materials, an electric current flows when a potential difference (voltage) is applied across the material. The current i and voltage V are related according to the expression V = iR, where R is the electrical resistance of the material. If the material is uniform with length L and cross-sectional area A, then the resistance is

$$R = \rho\,\frac{L}{A} \qquad (8.2)$$

The resistivity ρ is characteristic of the kind of material and is measured in units of Ω · m (ohm · meter). A good electrical conductor has a small resistivity (ρ = 1.7 × 10⁻⁸ Ω · m for copper); a poor conductor has a large resistivity (ρ = 2 × 10¹⁵ Ω · m for sulfur).
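To get a feel for the range of values in Eq. 8.2, the sketch below compares the resistance of two samples with the same shape, one of copper and one of sulfur. The sample dimensions (1 m long, 1 mm² in cross section) are hypothetical values picked only for illustration; the resistivities are the ones quoted above, and the function name `resistance` is likewise an illustrative choice.

```python
def resistance(resistivity_ohm_m, length_m, area_m2):
    """Resistance (in ohms) of a uniform sample, R = rho * L / A (Eq. 8.2)."""
    return resistivity_ohm_m * length_m / area_m2


if __name__ == "__main__":
    RHO_COPPER = 1.7e-8  # ohm * m, from the text
    RHO_SULFUR = 2e15    # ohm * m, from the text

    # Hypothetical sample: 1 m long with a 1 mm^2 cross section.
    length_m = 1.0
    area_m2 = 1.0e-6

    print(f"copper: {resistance(RHO_COPPER, length_m, area_m2):.3g} ohm")
    print(f"sulfur: {resistance(RHO_SULFUR, length_m, area_m2):.3g} ohm")
```

For identical geometry the ratio of the two resistances is simply the ratio of the resistivities, about 23 orders of magnitude.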
From the atomic point of view, current depends on the movement of relatively loosely bound electrons, which can be removed from their atoms by the applied potential difference, and also on the ability of the electrons to travel from one atom to another. Thus elements with s electrons, which are the least tightly bound and which also travel farthest from the nucleus, are expected to have small resistivities. Figure 8.9 shows the variation of electrical resistivity with atomic number.

4. Magnetic Susceptibility. When a material is placed in a magnetic field of intensity B, the material becomes “magnetized” and acquires a magnetization M, which for many materials is proportional to B:

$$\mu_0 M = \chi B \qquad (8.3)$$

where χ is a dimensionless constant called the magnetic susceptibility. (Materials for which χ > 0 are known as paramagnetic, and those for which χ < 0 are called diamagnetic; materials that remain permanently magnetized even when B is removed are known as ferromagnetic, and χ is undefined for such materials.) From the atomic point of view, the magnetism of atoms depends on the angular momenta L and S of the electrons in unfilled subshells, because the atomic magnetic moments μ_L and μ_S are proportional to L and S (recall Table 7.2). This effect is responsible for paramagnetic susceptibilities and occurs in all atoms in which L or S is nonzero. Diamagnetism is caused by the following effect: when a varying magnetic field occurs in an area bounded by an electric circuit, an induced current flows in the circuit; the induced current sets up a magnetic field which tends to oppose the changes in the applied field (Lenz’s law). In the atomic physics case, the electric circuit is the circulating electron, and the induced current consists of a slight speeding up or slowing down of the electron in its orbit when a magnetic field is applied.
This produces a contribution to the magnetization of the material that is opposite to the applied field B, and so the diamagnetic contribution to χ is negative. Figure 8.10 shows the magnetic susceptibilities of the elements.

[FIGURE 8.9 Electrical resistivities of the elements (in units of 10⁻⁸ Ω · m) as a function of Z.]

[FIGURE 8.10 Magnetic susceptibilities of the elements (in units of 10⁻⁶) as a function of Z.]

Just by examining Figures 8.7 to 8.10, you can see the remarkable regularities in the properties of the elements. Notice especially how similar the properties of the different sequences of elements are (for example, the electrical resistivity of the d-subshell elements or the magnetic susceptibility of the p-subshell elements). We now look at how the atomic structure is responsible for these properties.

Inert Gases

The inert gases occupy the last column of the periodic table. Because they have only filled subshells, the inert gases do not generally combine with other elements to form compounds; these elements are very reluctant to give up or to accept an electron. At room temperature they are monatomic gases. Their atoms don’t easily join together, so the boiling points are very low (typically −200°C). Their ionization energies are much larger than those of neighboring elements, because of the extra energy needed to break open a filled subshell.

p-Subshell Elements

The elements of the column (group) next to the inert gases are the halogens (F, Cl, Br, I, At). These atoms lack one electron from a closed shell and have the configuration np⁵. A filled p subshell is a very stable configuration, so these elements readily form compounds with other atoms that can provide an extra electron to complete the p subshell.
The halogens are therefore extremely reactive.

As we move across the series of six elements in which the p subshell is being filled, the atomic radius decreases. This “shrinking” occurs because the nuclear charge is increasing and pulling all of the orbits closer to the nucleus. Notice from Figure 8.7 that the halogens have the smallest radii within each p subshell series. (The ionic crystal radii of the inert gases are not known.) As we increase the nuclear charge, the p electrons also become more tightly bound; Figure 8.8 shows how the ionization energy increases systematically as the p subshell is filled. From Figure 8.10 we see that each p subshell series is diamagnetic, with a characteristic negative magnetic susceptibility.

s-Subshell Elements

The elements of the first two columns (groups) are known as the alkalis (configuration ns¹) and alkaline earths (ns²). The single s electron makes the alkalis quite reactive. The alkaline earths are similarly reactive, in spite of the filled s subshell. This occurs because the s electron wave functions can extend rather far from the nucleus, where the electrons are screened (by Z − 2 other electrons) from the nuclear charge and therefore not tightly bound. (Notice from Figure 8.7 that the ns¹ and ns² configurations give the largest atomic radii, and from Figure 8.8 that they have the smallest ionization energies.) For the same reasons, the ns¹ and ns² elements are relatively good electrical conductors. From Figure 8.10 we see that these elements are paramagnetic; for l = 0, there is no diamagnetic contribution to the magnetism.

Transition Metals

The three rows of elements in which the d subshell is filling (Sc to Zn, Y to Cd, Lu to Hg) are known as the transition metals. Many of their chemical properties are determined by the outer electrons, those whose wave functions extend farthest from the nucleus.
For the transition metals, these are always s electrons, which have a larger mean radius than the d electrons. (Remember that the mean radius depends mostly on n; the s electrons of the transition metals have a larger n than the d electrons. For example, in the first row of transition metals, the 3d subshell is filling but the 4s subshell is already filled.) As the atomic number increases across the transition metal series, we add one d electron and one unit of nuclear charge; the net effect on the s electron is very small, because the additional d electron screens the s electron from the additional nuclear charge. Properties of the transition metals, which are in large part determined by the outermost electrons, can therefore be very similar, as the small variation in radius and ionization energy shows.

The electrical resistivity of the transition metals shows two interesting features: a sharp rise at the center of the sequence, and a sharp drop near the end (Figure 8.9). The sharp drop near the end of the sequence indicates the small resistivity (large conductivity) of copper, silver, and gold. If we filled the d subshell in the expected sequence, copper would have the configuration 4s² 3d⁹; however, the filled d subshell is more stable than a filled s subshell, and so one of the s electrons transfers to the d subshell, resulting in the configuration 4s¹ 3d¹⁰. This relatively free, single s electron makes copper an excellent conductor. Silver (5s¹ 4d¹⁰) and gold (6s¹ 5d¹⁰) behave similarly. At the center of the sequence of transition metals there is a sharp rise in the resistivity; apparently a half-filled shell is also a stable configuration, and so Mn (3d⁵), Tc (4d⁵), and Re (5d⁵) have larger resistivities than their neighbors. A similar rise in resistivity is seen at the center of the 4f sequence.
The transition metals have similar paramagnetic susceptibilities, due to the large orbital angular momentum of the d electrons and also to the large number of d-subshell electrons that can couple their spin magnetic moments. These two effects are large enough to overcome the diamagnetism of the orbital motion. It is the d electrons that are also responsible for the ferromagnetism of iron, nickel, and cobalt. As soon as the d subshell is filled, however, the orbital and spin magnetic moments no longer contribute to the magnetic properties (all of the $m_l$ and $m_s$ values, positive as well as negative, are taken); for this reason, copper and zinc are diamagnetic, not paramagnetic like their transition metal neighbors.

Lanthanides (Rare Earths)

The lanthanide (or rare earth) elements are contained in the series of 14 elements from La to Yb; this series is usually drawn at the bottom of the periodic chart of the elements. The rare earths are rather similar to the transition metals in that an “inner” subshell (the 4f) is being filled after an “outer” subshell (the 6s) is already filled. For the same reasons discussed above, the chemical properties of the rare earths should be rather similar, because they are determined mainly by the 6s electrons; the radii and ionization energies show that this is true.

Because of the larger orbital angular momentum of f-subshell electrons ($l = 3$) and also because of the larger number of f-subshell electrons (up to 14) that can align their spin magnetic moments, the paramagnetic susceptibilities of the rare earths are even larger than those of the transition metals. Even the ferromagnetism of the rare earths is substantially stronger than that of the iron group. Generally, we think of iron as the most magnetic of the elements. The internal magnetic field within a magnetized piece of iron is about 28 T.
Magnetized holmium metal, a rare earth, has an internal magnetic field of 800 T, roughly 30 times that of iron! Most of the other rare earths have similar magnetic properties. (The rare earth metals do not reveal their ferromagnetic properties at room temperature, but must be cooled to lower temperatures. Holmium must be cooled to 20 K to reveal its ferromagnetic properties.)

Actinides

The actinide series of elements, which corresponds to the filling of the 5f subshell, is usually shown in the periodic table directly under the lanthanide series. These elements should have chemical and physical properties similar to those of the rare earths. Unfortunately, most of the actinide elements (those beyond uranium) are radioactive and do not occur in nature. They are artificially produced elements and are available only in microscopic quantities. We are thus unable to determine many of their bulk properties.

8.5 INNER ELECTRONS: ABSORPTION EDGES AND X RAYS

[Figure 8.11: Electron current passing through mercury vapor as a function of accelerating voltage; the current drops at 83.1 kV.]

[Figure 8.13: K absorption edges of the elements (K absorption edge in keV versus Z).]

Let’s imagine doing the Franck-Hertz experiment (see Section 6.6), in which we accelerate a beam of electrons that then passes through a chamber filled with mercury vapor. However, instead of using accelerating voltages in the range of 10 V, we’ll use voltages in the range of $10^5$ V. Figure 8.11 shows the current passing through the tube as a function of the accelerating voltage. A sudden drop in the current occurs at 83.1 kV. Low accelerating voltages correspond to interactions that push the outer electrons in the mercury atoms to higher excited states (or ionize the atom). The drop in the current at 83.1 kV occurs when the mercury atom absorbs energy from the electron beam that ionizes the atom by knocking loose one of the tightly bound inner electrons.
The binding energy of the inner electron in this case is 83.1 keV. A similar experiment can be done by passing a beam of X rays through a thin film of mercury and measuring the absorption of the photon intensity. If we are able to vary the wavelength of the X rays, the absorption as a function of wavelength might look like Figure 8.12. Photons are absorbed from the beam by the photoelectric effect, in which electrons are knocked loose from mercury atoms. As the photon wavelength is increased (or as the photon energy is decreased), we reach a point at which the photons do not have enough energy to produce at least one component of photoelectrons, and thus there is a sudden decrease in the photon absorption. The wavelength at which this occurs, 0.0149 nm, corresponds to an energy of 83.1 keV, in agreement with the value deduced from electron scattering (Figure 8.11). The sudden drop in the electron current or in the photoelectron emission is called the absorption edge. It corresponds to the release of an inner electron from the atom. In the case of mercury, the most tightly bound (1s) electrons have a binding energy of 83.1 keV. In the electron scattering experiment, when the energy of the electrons in the beam exceeds 83.1 keV, the collision of an electron with a mercury atom can transfer an energy of 83.1 keV to the atom and result in the ejection of one of the 1s electrons. Similarly, when the photon energy exceeds 83.1 keV (or when its wavelength is below 0.0149 nm), the photons can eject a photoelectron from the 1s level, but when the photon energy is below 83.1 keV that is not possible. As discussed in Section 8.3, the n = 1 level is also known as the K shell. So far we have been discussing the K absorption edge in mercury, which corresponds to the release of an electron from the K shell. It is also possible to release a less tightly bound electron from the L shell (n = 2), in which case we would speak of the L absorption edge. 
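The correspondence between the edge wavelength and the edge energy quoted above follows from $E = hc/\lambda$. The short sketch below checks it numerically; the function names and the rounded constant $hc \approx 1239.84$ eV·nm are assumptions introduced here for illustration and are not part of the text.

```python
# Sketch: convert between photon wavelength and photon energy, E = hc / lambda.
# HC_EV_NM is a rounded value of hc in eV*nm (an assumption of this example).
HC_EV_NM = 1239.84


def photon_energy_kev(wavelength_nm: float) -> float:
    """Photon energy in keV for a wavelength given in nm."""
    return HC_EV_NM / wavelength_nm / 1000.0


def photon_wavelength_nm(energy_kev: float) -> float:
    """Photon wavelength in nm for an energy given in keV."""
    return HC_EV_NM / (energy_kev * 1000.0)


# K edge of mercury: the text quotes 0.0149 nm and 83.1 keV.
k_edge_kev = photon_energy_kev(0.0149)

# L edge of mercury: the text quotes about 14 keV.
l_edge_nm = photon_wavelength_nm(14.0)

print(f"K edge energy from 0.0149 nm: {k_edge_kev:.1f} keV")
print(f"L edge wavelength from 14 keV: {l_edge_nm:.4f} nm")
```

The computed K-edge energy agrees with 83.1 keV to within the three significant figures of the quoted wavelength.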
In mercury, the L absorption edge is about 14 keV. (Because of the fine-structure splitting, there are actually three different states in the L shell with slightly different energies.)

[Figure 8.12: Absorption of photons by a thin film of mercury as a function of the photon wavelength; the edge lies at 0.0149 nm.]

Figure 8.13 shows the K absorption edges of the elements. There is a very noticeable difference between the data shown in Figure 8.13 and those shown in Figures 8.7–8.10: the K absorption edges show no evidence for any shell effects. Instead, there is a smooth dependence on the atomic number over the entire range of elements. As the nuclear charge increases, the 1s electrons are pulled into smaller and more tightly bound orbits, but this is a gradual process that is largely unaffected by the stacking of electrons into higher energy shells. There are no sudden changes in the 1s properties as a higher shell is filled and a still higher shell begins filling.

X-Ray Transitions

X rays, as we discussed in Chapter 3, are electromagnetic radiations with wavelengths from approximately 0.01 to 10 nm (energies from 100 eV to 100 keV). In Chapter 3 we discussed the continuous X-ray spectrum emitted by accelerated electrons. In this section we are concerned with the discrete X-ray line spectra emitted by atoms.

X rays are emitted in transitions between the more tightly bound inner electron energy levels of an atom. Under normal conditions all of the inner shells of an atom are filled, so X-ray transitions do not occur between these levels.
However, when we remove one of the inner electrons, such as by ejecting a K electron following electron scattering or a photoelectric process, an electron from a higher subshell will rapidly make a transition to fill that vacancy, emitting an X-ray photon in the process. The energy of the photon is equal to the energy difference of the initial and final atomic levels of the electron that makes the transition.

When we remove a 1s electron, we are creating a vacancy in the K shell. The X rays that are emitted in the process of filling this vacancy are known as K-shell X rays, or simply K X rays. (These X rays are emitted in transitions that come from the L, M, N, ... shells, but they are known by the vacancy that they fill, not by the shell from which they originate.) The K X ray that originates with the $n = 2$ shell (L shell) is known as the Kα X ray, and the K X rays originating from the M shell are known as Kβ X rays. Figure 8.14 illustrates these transitions.

[Figure 8.14: X-ray series. Levels shown: n = 1 (K shell), n = 2 (L shell), n = 3 (M shell), n = 4 (N shell). Transitions shown: K series (Kα, Kβ, Kγ, Kδ), L series (Lα, Lβ, Lγ), M series (Mα, Mβ), and Nα.]

If the bombarding electrons or photons knock loose an electron from the L shell, electrons from higher levels will drop down to fill this vacancy. The photons emitted in these transitions are known as L X rays. The lowest-energy X ray of the L series is known as Lα, and the other L X rays are labeled in order of increasing energy as shown in Figure 8.14.

It is possible to have an L X ray emitted directly following the Kα X ray. A vacancy in the K shell can be filled by a transition from the L shell, with the emission of the Kα X ray. However, the electron that made the jump from the L shell left a vacancy there, which can be filled by an electron from a higher shell, with the accompanying emission of an L X ray.

In a similar manner, we label the other X-ray series by M, N, and so forth. Figure 8.15 shows a sample X-ray spectrum emitted by silver.

[Figure 8.15: Characteristic X-ray spectrum of silver (intensity versus wavelength in nm), such as might be produced by 30 keV electrons striking a silver target. The continuous distribution is a bremsstrahlung spectrum.]

Moseley’s Law

Let us consider in more detail the Kα X ray, which (as shown in Figure 8.14) is emitted when an electron from the L shell drops down to fill a vacancy in the K shell. An electron in the L shell is normally screened by the two 1s electrons, and so it sees an effective nuclear charge of $Z_{\text{eff}} = Z - 2$. When one of those 1s electrons is removed in the creation of a K-shell vacancy, only the remaining single 1s electron shields the L shell, and so $Z_{\text{eff}} = Z - 1$. (In this calculation, we neglect the small screening effect of the outer electrons; their probability densities are not zero within the L-shell orbits, but they are sufficiently small that their effect on $Z_{\text{eff}}$ can be neglected.) To a very good approximation, the Kα X ray can thus be analyzed as a transition from the $n = 2$ level to the $n = 1$ level in a one-electron atom with $Z_{\text{eff}} = Z - 1$. Using Eq. 6.38 for the Bohr atom, we can find the energy of the Kα transition in an atom of atomic number Z:

$$\Delta E = E_2 - E_1 = (-13.6\ \text{eV})(Z - 1)^2\left(\frac{1}{2^2} - \frac{1}{1^2}\right) = (10.2\ \text{eV})(Z - 1)^2 \tag{8.4}$$

Just as was the case for the K absorption edge, the energies of the Kα X rays vary smoothly with atomic number and show no effects of atomic shells. If we plot $\sqrt{\Delta E}$ as a function of Z, we expect to obtain a straight line with slope $(10.2\ \text{eV})^{1/2} = 3.19\ \text{eV}^{1/2}$. Figure 8.16 is an example of such a plot. The measured slope is $3.22\ \text{eV}^{1/2}$, in excellent agreement with what is expected from Eq. 8.4. The straight line intersects the x axis at a value very close to 1, as we expect from Eq. 8.4.
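The relation in Eq. 8.4 is easy to evaluate numerically. The following is a minimal sketch, assuming the simple screening model of the text ($Z_{\text{eff}} = Z - k$ with $k = 1$ by default); the function names and the `screening` parameter are hypothetical names chosen for this illustration.

```python
import math

RYDBERG_EV = 13.6  # hydrogen ground-state binding energy in eV, as used in Eq. 8.4


def k_alpha_energy_ev(z: int, screening: float = 1.0) -> float:
    """K-alpha energy from the screened Bohr model: n = 2 -> n = 1 with Z_eff = Z - screening."""
    z_eff = z - screening
    return RYDBERG_EV * z_eff**2 * (1.0 / 1**2 - 1.0 / 2**2)


def atomic_number_from_k_alpha(energy_ev: float, screening: float = 1.0) -> float:
    """Invert the same relation: estimate Z from a measured K-alpha energy."""
    return screening + math.sqrt(energy_ev / (0.75 * RYDBERG_EV))


# Expected slope of a plot of sqrt(Delta E) against Z, in eV^(1/2).
slope = math.sqrt(0.75 * RYDBERG_EV)

print(f"Moseley slope: {slope:.2f} eV^(1/2)")
print(f"Sodium (Z = 11) K-alpha: {k_alpha_energy_ev(11) / 1000:.2f} keV")

# Measured silver K-alpha energy (21.990 keV) as quoted in Example 8.6 of the text.
z_estimate = atomic_number_from_k_alpha(21990.0)
print(f"Z estimated from 21.990 keV: {z_estimate:.2f}")
```

Changing `screening` shifts the intercept of the straight line but leaves the slope untouched, which is why the Moseley plot does not depend on the exact value of the screening correction.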
This method gives us a powerful and direct way to determine the atomic number Z of an atom, as was first demonstrated in 1913 by the British physicist H. G. J. Moseley, who measured the Kα (and other) X-ray energies of the elements and thus determined their atomic numbers. The dependence of the X-ray energies on Z given by Eq. 8.4 is known as Moseley’s law. Moseley was the first to demonstrate the type of linear relationship shown in Figure 8.16; such graphs are now known as Moseley plots.

[Figure 8.16: Moseley plot of the square root of the Kα X-ray energy, $\sqrt{\Delta E}$ (eV$^{1/2}$), as a function of atomic number Z. Slope = 3.22; intercept = 1.]

His discovery provided the first direct means of measuring the atomic numbers of the elements. Previously, the elements had been ordered in the periodic table according to increasing mass. Moseley found certain elements listed out of order, in which the element of higher Z happened to have the smaller mass (for example, cobalt and nickel or iodine and tellurium). He also found gaps corresponding to yet undiscovered elements; for example, the radioactive element technetium (Z = 43) does not occur in nature and was not known at the time of Moseley’s work, but Moseley showed the existence of such a gap at Z = 43.

The straight-line plot of Figure 8.16 is independent of our assumption regarding the exact value of the screening correction. That is, we could have written $Z_{\text{eff}} = Z - k$, where k is some unknown number, probably close to 1. The only change in our plot would be in the intercept. We would still have a straight line with the same slope.

Moseley’s work was of great importance in the development of atomic physics. Working in the same year as Rutherford and Bohr, Moseley not only provided confirmation of the Rutherford-Bohr model but also demonstrated a direct link between atomic structure and the periodic table, which was previously a rather arbitrary ordering scheme of the elements but subsequently became a classification based on their electronic configurations.

[Photo: Henry G. J. Moseley (1887–1915, England). His work on X-ray spectra provided the first link between the chemical periodic table and atomic physics, but his brilliant career was cut short when he died on a World War I battlefield.]

Example 8.5

Compute the energy of the Kα X ray of sodium (Z = 11).

Solution

The energy can be found with the help of Eq. 8.4,

$$\Delta E = (10.2\ \text{eV})(Z - 1)^2 = (10.2\ \text{eV})(10)^2 = 1.02\ \text{keV}$$

The measured value is 1.04 keV. The small discrepancy may be due to the screening correction in $Z_{\text{eff}}$, which is not exactly equal to 1.

Example 8.6

Some measured X-ray energies in silver (Z = 47) are E(Kα) = 21.990 keV and E(Kβ) = 25.145 keV. The binding energy of the K electron in silver is E(K) = 25.514 keV. From these data, find: (a) the energy of the Lα X ray, and (b) the binding energy of the L electron.

Solution

(a) From Figure 8.14, we see that the energies are related by

E(Lα) + E(Kα) = E(Kβ)

or

E(Lα) = E(Kβ) − E(Kα) = 25.145 keV − 21.990 keV = 3.155 keV

(b) Again from Figure 8.14, we see that

E(Kα) = E(L) − E(K)

or

E(L) = E(K) + E(Kα) = −25.514 keV + 21.990 keV = −3.524 keV

The binding energy of the L electron is therefore 3.524 keV.

∗8.6 ADDITION OF ANGULAR MOMENTA

The properties of an alkali atom such as sodium are determined primarily by the single outer electron; if that electron has quantum numbers $(n, l, m_l, m_s)$ then the entire atom behaves as if it had those same quantum numbers. In atoms with several electrons outside of filled subshells, this is not the case. For example, the electronic configuration of carbon (Z = 6) is $1s^2 2s^2 2p^2$. To find the angular momentum of carbon, we must combine the angular momenta of the two 2p electrons to find the total orbital angular momentum quantum number $L$ and total magnetic quantum number $M_L$ that characterize the entire atom.

∗ This is an optional section that may be skipped without loss of continuity.
Suppose we have an atom with two electrons outside of filled subshells. These electrons have quantum numbers $(n_1, l_1, m_{l1}, m_{s1})$ and $(n_2, l_2, m_{l2}, m_{s2})$. The total orbital angular momentum of the atom is determined by the vector sum of the orbital angular momenta of the two electrons:

$$\vec{L} = \vec{L}_1 + \vec{L}_2 \tag{8.5}$$

Each vector is related to its corresponding angular momentum quantum number by

$$|\vec{L}| = \sqrt{L(L + 1)}\,\hbar \qquad |\vec{L}_1| = \sqrt{l_1(l_1 + 1)}\,\hbar \qquad |\vec{L}_2| = \sqrt{l_2(l_2 + 1)}\,\hbar \tag{8.6}$$

These vectors do not add like ordinary vectors, but have special addition rules associated with quantized angular momentum. These rules enable us to find $L$ and its associated magnetic quantum number $M_L$.

1. The maximum value of the total orbital angular momentum quantum number is

$$L_{\max} = l_1 + l_2 \tag{8.7}$$

2. The minimum value of the total orbital angular momentum quantum number is

$$L_{\min} = |l_1 - l_2| \tag{8.8}$$

3. The permitted values of $L$ range from $L_{\min}$ to $L_{\max}$ in integer steps:

$$L = L_{\min},\ L_{\min} + 1,\ L_{\min} + 2,\ \ldots,\ L_{\max} \tag{8.9}$$

4. The z component of the total angular momentum vector is found from the sum of the z components of the individual vectors:

$$L_z = L_{1z} + L_{2z} \tag{8.10}$$

or, in terms of the magnetic quantum numbers,

$$M_L = m_{l1} + m_{l2} \tag{8.11}$$

The permitted values of the total magnetic quantum number $M_L$ range from $-L$ to $+L$ in integer steps:

$$M_L = -L,\ -L + 1,\ \ldots,\ -1,\ 0,\ +1,\ \ldots,\ L - 1,\ L \tag{8.12}$$

An identical set of rules holds for coupling the spin angular momentum vectors to give the total spin angular momentum $\vec{S}$. For two electrons, each of which has $s = 1/2$, the total spin quantum number $S$ can be 0 or 1.

All filled subshells have $L = 0$ and $S = 0$, so we don’t need to consider filled subshells in analyzing the angular momentum of an atom. For this reason, filled subshells ordinarily do not contribute to the magnetic properties of atoms.
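The addition rules above (Eqs. 8.7 to 8.12) can be sketched as a small enumeration. This is an illustrative sketch only: the function names are hypothetical, exact fractions are used so that half-integer spins are handled without rounding, and the Pauli principle is deliberately not applied, so some of the listed combinations may not occur for equivalent electrons.

```python
from fractions import Fraction


def couple(j1, j2):
    """Allowed totals when coupling two angular momenta: |j1 - j2| ... j1 + j2 in integer steps."""
    j1, j2 = Fraction(j1), Fraction(j2)
    lowest, highest = abs(j1 - j2), j1 + j2
    return [lowest + k for k in range(int(highest - lowest) + 1)]


def projections(j):
    """Allowed magnetic quantum numbers: -j ... +j in integer steps."""
    j = Fraction(j)
    return [-j + k for k in range(int(2 * j) + 1)]


def couple_many(values):
    """Couple several angular momenta one at a time, collecting every allowed total."""
    totals = {Fraction(values[0])}
    for value in values[1:]:
        totals = {t for previous in totals for t in couple(previous, value)}
    return sorted(totals)


half = Fraction(1, 2)
print("Two l = 1 electrons, L =", [str(x) for x in couple(1, 1)])
print("Two s = 1/2 electrons, S =", [str(x) for x in couple(half, half)])
print("M_L values for L = 2:", [str(x) for x in projections(2)])
print("Three l = 1 electrons, L =", [str(x) for x in couple_many([1, 1, 1])])
```

Coupling one pair at a time, as `couple_many` does, is the same step-by-step procedure the text uses for more than two electrons.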
For coupling more than two electrons, the procedure is first to couple the angular momenta of two electrons to give the maximum and minimum values of $L$. Then couple each allowed $L$ to the angular momentum of the third electron to find the largest maximum and smallest minimum. This continues for all of the electrons in the unfilled subshell.

Example 8.7

Find the total orbital and spin quantum numbers for carbon.

Solution

Carbon has two 2p electrons outside filled subshells. Each of these electrons has $l = 1$. According to the rules for adding angular momenta, we have

$$L_{\max} = 1 + 1 = 2, \qquad L_{\min} = |1 - 1| = 0$$

Thus $L$ = 0, 1, or 2. For the spin angular momentum, we have

$$S_{\max} = \tfrac{1}{2} + \tfrac{1}{2} = 1, \qquad S_{\min} = \left|\tfrac{1}{2} - \tfrac{1}{2}\right| = 0$$

and so $S$ = 0 or 1.

Some combinations of $L$ and $S$ might be forbidden by the Pauli principle. For example, to obtain $L = 2$, the two electrons must both have $m_l = +1$. The two electrons must therefore have different values of $m_s$, so $S = 1$ is not allowed when $L = 2$.

Example 8.8

Find the total orbital and spin quantum numbers for nitrogen.

Solution

Nitrogen has three 2p electrons, each with $l = 1$, outside filled subshells. If we add the first two, we get $L_{\max} = 2$ and $L_{\min} = 0$, as in Example 8.7, so that $L$ = 0, 1, or 2. We now couple the third $l = 1$ electron to each of these values to find the largest maximum and smallest minimum, which give

$$L_{\max} = 2 + 1 = 3, \qquad L_{\min} = |1 - 1| = 0$$

and so $L$ = 0, 1, 2, or 3. For the spin vectors, we again couple the first two to give $S_{\max} = 1$ and $S_{\min} = 0$. Adding the third $s = 1/2$ electron, we have

$$S_{\max} = 1 + \tfrac{1}{2} = \tfrac{3}{2}, \qquad S_{\min} = \left|0 - \tfrac{1}{2}\right| = \tfrac{1}{2}$$

The resulting values of $S$ are 1/2 and 3/2 (from the minimum to the maximum in integer steps).

Once again, the Pauli principle may forbid certain combinations of $L$ and $S$. The state with $L = 3$ cannot exist at all, because all three electrons must have $m_l = +1$, and assigning $m_s$ quantum numbers will then result in two electrons with the same $m_l$ and $m_s$, which is forbidden by the Pauli principle.

The two 2p electrons of carbon can combine to give $L$ = 0, 1, or 2 and $S$ = 0 or 1. The ground state of carbon will be identified by only one particular choice of $L$ and $S$. How do we know which of these combinations will be the ground state? The rules for finding the ground state quantum numbers are known as Hund’s rules:

1. First find the maximum value of the total spin magnetic quantum number $M_S$ consistent with the Pauli principle. Then

$$S = M_{S,\max} \tag{8.13}$$

2. Next, for that $M_S$, find the maximum value of $M_L$ consistent with the Pauli principle. Then

$$L = M_{L,\max} \tag{8.14}$$

In the case of carbon, the maximum value of $M_S$ is +1, obtained when the two valence electrons both have $m_s = +1/2$. Thus $S = 1$. With only two electrons in the 2p subshell, the Pauli principle places no restrictions on $S$; in fact, three electrons in the 2p subshell can be assigned $m_s = +1/2$. Our next task is to find the maximum value of $M_L$. The maximum value of $m_l$ for the first p electron is +1. The second p electron cannot also have $m_l = +1$, because that would give both electrons the same set of quantum numbers, in violation of the Pauli principle. The maximum value of $m_l$ for the second electron is 0, so $M_{L,\max} = +1$ and $L = 1$. The ground state of carbon is therefore characterized by $S = 1$ and $L = 1$.

Example 8.9

Use Hund’s rules to find the ground-state quantum numbers of nitrogen.

Solution

The electronic configuration of nitrogen is $1s^2 2s^2 2p^3$. We begin by maximizing the total $M_S$ for the three 2p electrons. Three electrons in the p subshell are permitted by the Pauli principle to have $m_s = +1/2$, so the maximum value of $M_S$ is 3/2, and therefore $S$ is 3/2. Each of the three electrons has quantum numbers $(2, 1, m_l, +1/2)$. To maximize $M_L$ we assign the first electron the maximum value of $m_l$, namely +1. The maximum value of $m_l$ left for the second electron is 0, and the third electron must therefore have $m_l = -1$.
The total $M_L$ is $1 + 0 + (-1) = 0$, so $L = 0$. Thus $L = 0$, $S = 3/2$ are the ground-state quantum numbers for nitrogen.

Example 8.10

Find the ground-state $L$ and $S$ of oxygen (Z = 8).

Solution

The electronic configuration of oxygen is $1s^2 2s^2 2p^4$. Because only three electrons in the p subshell can have $m_s = +1/2$, the fourth must have $m_s = -1/2$, so $M_{S,\max} = 1/2 + 1/2 + 1/2 + (-1/2) = +1$, and it follows that $S = 1$. To find $L$, we note that, as for nitrogen, the three electrons with $m_s = +1/2$ have $m_l$ = +1, 0, and −1, and we maximize $M_L$ by giving the fourth electron $m_l = +1$. Thus $M_{L,\max} = +1$, and $L = 1$.

Let us look now at the energy levels of helium. The ground-state configuration of helium is $1s^2$. Both electrons are s electrons, with $l = 0$, and so the only possible value of $L$ is zero. Because both electrons have $m_l = 0$, the Pauli principle requires that the spin of the two electrons be opposite, so that one has $m_s = +1/2$ and the other has $m_s = -1/2$. The only possible total $M_S$ is therefore zero, so the ground state of helium has $L = 0$ and $S = 0$.

The first excited state has configuration $1s^1 2s^1$. Both electrons still have $l = 0$, so we must again have $L = 0$. However, the total spin $S$ can now be 0 or 1, because the Pauli principle does not restrict $m_s$ in this case: the two electrons already have different principal quantum numbers n, and so there is nothing to prevent them from having the same $m_s$. There are, therefore, two “first excited states” of helium, one with $L = 0$ and $S = 0$, and another with $L = 0$ and $S = 1$. (Both of these states have configuration $1s^1 2s^1$.) A state with $S = 0$ is called a singlet state (because there is only a single possible $M_S$ value), and a state with $S = 1$ is called a triplet state (because there are three possible $M_S$ values: +1, 0, −1).
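Hund’s rules, as applied in Examples 8.9 and 8.10 and to the helium ground state above, can be sketched for a single unfilled subshell as follows. This is a minimal illustration under stated assumptions: the function name `hund_ground_state` is hypothetical, only equivalent electrons in one subshell are treated, and filled subshells are ignored because they contribute $L = 0$ and $S = 0$.

```python
from fractions import Fraction


def hund_ground_state(l: int, n_electrons: int):
    """Return (L, S) for n_electrons equivalent electrons in a subshell of orbital quantum number l."""
    orbitals = 2 * l + 1
    if not 0 <= n_electrons <= 2 * orbitals:
        raise ValueError("electron count exceeds the subshell capacity 2(2l + 1)")

    # Rule 1: maximize M_S. At most 2l + 1 electrons can share m_s = +1/2,
    # one per m_l value; any remaining electrons must take m_s = -1/2.
    n_up = min(n_electrons, orbitals)
    n_down = n_electrons - n_up
    s_total = Fraction(n_up - n_down, 2)

    # Rule 2: for that M_S, maximize M_L by filling m_l from +l downward,
    # separately for the spin-up and the spin-down electrons.
    ml_values = list(range(l, -l - 1, -1))
    l_total = sum(ml_values[:n_up]) + sum(ml_values[:n_down])
    return l_total, s_total


for name, l, n in [("carbon 2p^2", 1, 2), ("nitrogen 2p^3", 1, 3),
                   ("oxygen 2p^4", 1, 4), ("helium 1s^2", 0, 2)]:
    big_l, big_s = hund_ground_state(l, n)
    print(f"{name}: L = {big_l}, S = {big_s}")
```

The same function reproduces the filled-subshell result: for any completely filled subshell it returns $L = 0$ and $S = 0$.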
The classification of states into singlet and triplet is important when we consider the selection rules for transitions between states; these selection rules tell us which transitions are allowed (and therefore likely) to occur and which are not. The selection rules, which involve both $L$ and $S$, are

$$\Delta L = 0, \pm 1 \tag{8.15}$$

$$\Delta S = 0 \tag{8.16}$$

(There are no selection rules for n.) Of course, the selection rule $\Delta l = \pm 1$ for the single electron that makes the transition still applies. For the two $1s^1 2s^1$ states of helium, the $\Delta l$ rule does not permit either state to make transitions to the $1s^2$ ground state (2s to 1s would be $\Delta l = 0$), and in addition, the $\Delta S$ rule forbids the triplet ($S = 1$) states from decaying to the $S = 0$ ground state. These transitions can thus occur only by violating these selection rules. Because that is a very unlikely event, the transitions occur with very low probability. Energy levels that have a low probability of decay must “live” for a long time before they decay; such states are known as metastable states.

[Figure 8.17: Energy-level diagram for helium (energy in eV). The states are grouped into singlets (S = 0) and triplets (S = 1). Some of the transitions in the optical and ultraviolet regions are shown. Transitions marked with an X would violate the Δl = ±1 selection rule.]

[Figure 8.18: Energy-level diagram for carbon (energy in eV). Each group of levels is labeled with the electron configuration. Each individual level is labeled with the total L and S.]

Figure 8.17 shows the energy levels and transitions in helium.
The singlet and triplet levels are grouped separately, because transitions between singlet and triplet levels would violate the $\Delta S = 0$ selection rule.

Figure 8.18 shows the energy-level diagram of carbon. Notice the increasing complexity of the diagram, compared with the alkali metals and even with helium. This follows from the coupling of two electrons, both of whose $l$ values may be different from zero. We have already discussed how the $2p^2$ configuration can give $L$ = 0, 1, or 2 and $S$ = 0 or 1. Only one of these ($L = 1$, $S = 1$) is the ground state of carbon; the others are excited states. More excited states can be obtained by promoting one of the 2p electrons to a higher level, giving configurations of $2p^1 3s^1$ ($L = 1$; $S$ = 0 or 1), $2p^1 3p^1$ ($L$ = 0, 1, or 2; $S$ = 0 or 1), $2p^1 3d^1$ ($L$ = 1, 2, or 3; $S$ = 0 or 1), and so forth. Imagine the difficulty of analyzing the energy level diagram of the rare earths or actinides, which have f subshells ($l = 3$) with as many as 14 electrons!

8.7 LASERS

There are three means by which radiation can interact with the energy levels of atoms (depicted in Figure 8.19). The first two we have already discussed. In the first kind of interaction, an atom in an excited state makes a transition to a lower state, with the emission of a photon. (In all the examples we consider here, the photon energy is equal to the energy difference of the two atomic states.) This is spontaneous emission, which we represent as

atom* → atom + photon (spontaneous emission)

where the asterisk indicates an excited state. The second interaction, induced absorption, is responsible for absorption spectra and resonance absorption. An atom in the ground state absorbs a photon (of the proper energy) and makes a transition to an excited state. Symbolically:

atom + photon → atom* (induced absorption)

The third interaction, which is responsible for the operation of the laser, is induced (or stimulated) emission.
In this process, an atom is initially in an excited state. A passing photon of just the right energy (again, equal to the energy difference of the two levels) induces the atom to emit a photon and make a transition to the lower, or ground, state. (Of course, it would eventually have made that transition left on its own, but it makes it sooner after being prodded by the passing photon.) Symbolically,

atom* + photon → atom + 2 photons (induced emission)

The significant detail is that the two photons that emerge are traveling in exactly the same direction with exactly the same energy, and the associated electromagnetic waves are perfectly in phase (coherent).

[Figure 8.19: Interactions of radiation with atomic energy levels (spontaneous emission, induced absorption, induced emission).]

Suppose we have a collection of atoms, all in the same excited state, as shown in Figure 8.20. A photon passes the first atom, causing induced emission and resulting in two photons. Each of these two photons causes an induced emission process, resulting in four photons. This process continues, doubling the number of photons at each step, until we build up an intense beam of photons, all coherent and moving in the same direction. In its simplest interpretation, this is the basis of operation of the laser. (The word laser is an acronym for Light Amplification by Stimulated Emission of Radiation.)

[Figure 8.20: Buildup of intense beam in a laser. Each emitted photon interacts with an excited atom and produces two photons.]

This simple model for a laser will not work, for several reasons. First, it is difficult to keep a collection of atoms in their excited states until they are stimulated to emit the photon (we don’t want any spontaneous emission). A second reason is that atoms that happen to be in their ground state undergo absorption and thus remove photons from the beam as it builds up. To solve these problems, we must achieve a population inversion: in a collection of atoms, there must be more atoms in the upper state than in the lower state. This is called an “inversion” because under normal conditions at thermal equilibrium, the lower state always has the greater population. The “inversion” is thus an unnatural situation that must be achieved by artificial means, because it is essential for the operation of the laser.

The first laser, which was constructed by T. H. Maiman in 1960, was based on a three-level atom (Figure 8.21). The laser medium is a solid ruby rod, in which the chromium atoms are responsible for the action of the laser. The atoms, originally in the ground state, are “pumped” into the excited state by an external source of energy (a burst of light from a flash lamp that surrounds the ruby rod). The excited state decays very rapidly (by spontaneous emission) to a lower excited state, which is a metastable state; the atom remains in that level for a relatively long time, perhaps $10^{-3}$ s, compared with $10^{-8}$ s for the short-lived states. The transition from the metastable state to the ground state is the “lasing” transition, resulting from stimulated emission by a passing photon. If the pumping action is successful, there are more atoms in the metastable state than in the ground state, and we have achieved a population inversion. However, as the lasing transition occurs, the population of the ground state is increased, thereby upsetting the population inversion. This excess of population in the ground state allows absorption of the lasing transition, thereby removing photons that might contribute to the lasing action.

[Figure 8.21: A three-level atom. Levels: ground state, short-lived state, metastable state. Transitions: pumping, lasing transition.]

[Figure 8.22: A four-level atom. Levels: ground state, upper short-lived state, metastable state, lower short-lived state. Transitions: pumping, lasing transition.]

The four-level laser illustrated in Figure 8.22 relieves this remaining difficulty.
The ground state is pumped to an excited state that decays rapidly to the metastable state, as with the three-level laser. The lasing transition proceeds from the metastable state to yet another excited state, which in turn decays rapidly to the ground state. The atom in its ground state thus cannot absorb at the energy of the lasing transition, and we have a workable laser. Because the lower short-lived state decays rapidly, its population is always smaller than that of the metastable state, which maintains the population inversion.

A common example of the four-level laser is the familiar helium-neon laser, which operates with a mixture of helium and neon gas (about 90% helium). The important energy levels of He and Ne are shown in Figure 8.23. An electrical current in the gas “pumps” the helium from its ground state to the excited state at an energy of about 20.6 eV. This is a metastable state of helium: the atom remains in that state for a relatively long time because a 2s electron is not permitted to return to the 1s level by photon emission. Occasionally, an excited helium atom collides with a ground-state neon atom. When this occurs, the 20.6 eV of excitation energy may be transferred to the neon atom, because neon happens to have an excited state at 20.6 eV, and the helium atom returns to its ground state. Symbolically,

helium* + neon → helium + neon*

where the excited state is indicated by the asterisk. The excited state of neon corresponds to removing one electron from the filled 2p subshell and promoting it to the 5s subshell. From there it decays to the 3p level and eventually returns to the 2p ground state. Figure 8.23 illustrates this sequence of events and the level schemes. (The level shown with a dashed line, the neon 3s level, is not important for the basic operation of the laser, but it is necessary as an intermediate step in the return to the neon ground state, because the $\Delta l = 0$ transition 3p → 2p is not allowed, but the sequence 3p → 3s → 2p is permitted.)

FIGURE 8.23 Sequence of transitions in a He-Ne laser. (Levels shown: He $1s^2$ at 0 and $1s^1 2s^1$ at 20.61 eV; Ne $2p^6$ at 0, $2p^5 3s^1$, $2p^5 3p^1$ at 18.70 eV, and $2p^5 5s^1$ at 20.66 eV. A collision transfers the He excitation to Ne; the lasing transition is at 632.8 nm.)

At any given time, there are more neon atoms in the 5s state than in the 3p state, because the good energy matchup of the 5s state with the helium excited state gives a high probability of the 5s state in neon being excited. The 3p state, on the other hand, decays rapidly. This provides the population inversion that is needed for the laser.

In the helium-neon laser, the gases are enclosed in a narrow tube (Figure 8.24). Occasionally a neon atom in the 5s state spontaneously emits a photon (at a wavelength of 632.8 nm) parallel to the axis of the tube. This photon causes stimulated emission by other atoms, and a beam of coherent (in-phase) radiation eventually builds up traveling along the tube axis. Mirrors are carefully aligned at the ends of the tube to help in the formation of the coherent wave, as it bounces back and forth between the two ends of the tube, causing additional stimulated emission. One of the mirrors is only partially silvered, allowing a portion of the beam to escape through one end.

FIGURE 8.24 Schematic diagram of a He-Ne laser. (Labels: fully silvered mirror, partially silvered mirror, laser beam.)

The laser is not a particularly efficient device; the small helium-neon lasers you have probably seen used for laboratory or demonstration experiments have a light output of perhaps a few milliwatts; the electric power required to operate such a device may be of the order of 10 to 100 W, and thus the efficiency (power out ÷ power in) of such a device is only about $10^{-4}$ to $10^{-5}$.
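As a rough numerical check of the He-Ne figures quoted above, the following sketch compares the photon energy at 632.8 nm with the 5s to 3p level spacing read from Figure 8.23, and evaluates the efficiency estimate. The 1 mW output power is an assumed representative value for "a few milliwatts", and the function names are illustrative only.

```python
# h*c in eV*nm (standard value, rounded).
HC_EV_NM = 1239.84

def photon_energy_ev(wavelength_nm):
    """Photon energy E = hc / wavelength, in eV."""
    return HC_EV_NM / wavelength_nm

# Neon level energies read from Figure 8.23 (eV above the Ne ground state).
E_5S = 20.66
E_3P = 18.70

photon_ev = photon_energy_ev(632.8)
level_gap_ev = E_5S - E_3P

def efficiency(power_out_w, power_in_w):
    """Efficiency as defined in the text: power out / power in."""
    return power_out_w / power_in_w

# Assumption: 1 mW of light output; electrical input of 10 W and 100 W.
eff_high = efficiency(1e-3, 10.0)
eff_low = efficiency(1e-3, 100.0)

print(f"photon energy: {photon_ev:.2f} eV, level gap: {level_gap_ev:.2f} eV")
print(f"efficiency range: {eff_low:.0e} to {eff_high:.0e}")
```

Under these assumptions the photon energy and the level spacing agree to within about 0.01 eV, and the efficiency falls in the range quoted in the text.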
It is the coherence and directionality of the laser beam and its energy density that make the laser such a useful device: its power can be concentrated in a beam only a few millimeters in diameter, and thus even a small laser can deliver 100 to 1000 $\mathrm{W/m^2}$. Larger lasers in the megawatt ($10^6$ W) range are presently readily available, and research laboratories are using lasers in the 100 terawatt ($10^{14}$ W) range for special applications. These powerful lasers do not operate continuously, but are instead pulsed, producing short (perhaps $10^{-9}$ s) pulses at rates of order 100 Hz. (Such a pulse is, in fact, an excellent example of a wave packet.)

Chapter Summary

Pauli exclusion principle (Section 8.1): No two electrons in a single atom can have the same set of quantum numbers $(n, l, m_l, m_s)$.
Filling order of atomic subshells (Section 8.2): 1s, 2s, 2p, 3s, 3p, 4s, 3d, 4p, 5s, 4d, 5p, 6s, 4f, 5d, 6p, 7s, 5f, 6d
Capacity of subshell nl (Section 8.2): $2(2l + 1)$
Energy of screened electron (Section 8.3): $E_n = (-13.6\ \mathrm{eV})\, Z_\mathrm{eff}^2 / n^2$
Moseley’s law for Kα X rays (Section 8.5): $E = (10.2\ \mathrm{eV})(Z - 1)^2$
Adding angular momenta $l_1, m_{l1}$ and $l_2, m_{l2}$ (Section 8.6): $L_\mathrm{max} = l_1 + l_2$, $L_\mathrm{min} = |l_1 - l_2|$, $M_L = m_{l1} + m_{l2}$
Hund’s rules for ground state (Section 8.6): First $S = M_{S,\mathrm{max}}$, then $L = M_{L,\mathrm{max}}$

Questions

1. Continue Figure 8.1 upward, showing the next two major groups. What will be the atomic number of the next inert gas below Rn? What will be the structure of the eighth row (period) of the periodic table? Where do you expect the first g subshell to begin filling? What properties would you expect the g-subshell elements to have? What will be the atomic number of the second inert gas below radon?
2. Why do the 4s and 3d subshells appear so close in energy, when they belong to different principal quantum numbers n?
3. Would you expect element 107 to be a good conductor or a poor conductor? How about element 111? Do you expect element 112 to be paramagnetic or diamagnetic?
4. Zirconium frequently is present as an impurity in hafnium metal. Why?
5.
Do you expect ytterbium (Yb) to become ferromagnetic at sufficiently low temperatures? What type of magnetic behavior would be expected at ordinary temperatures for polonium (Po)? For francium (Fr)?
6. As we move across the series of transition metal or rare earth elements, we add electrons to the d or f subshells. In chemical compounds, these elements often show valence states of +2, which correspond to removing two s electrons. Explain this apparent paradox.
7. Why do the rare earth (lanthanide) elements have such similar chemical properties? What property might you use to distinguish lanthanide atoms from one another?
8. Explain why the Bohr theory gives a poor accounting of optical transitions but does well in predicting the energies of X-ray transitions.
9. What can you conclude about the electronic configuration of an atom that has both L = 0 and S = 0 in the ground state?
10. Suppose we do a Stern-Gerlach experiment using an atom that has angular momentum quantum numbers L and S in its ground state. Into how many components will the beam split? Do you expect them to be equally spaced?
11. What is the degeneracy of a state of total orbital angular momentum L that has S = 0? What is the degeneracy of a state of total spin angular momentum S that has L = 0? What is the total degeneracy of a state in which both L and S are nonzero?
12. What L and S values must an atom have in order to show the normal Zeeman effect? Does this apply only to the ground state or to excited states also? Can an atom show the normal Zeeman effect in some transitions and the anomalous Zeeman effect in other transitions? Could the same atom even show no Zeeman effect at all in some transitions?
13. Based on the rules for coupling electron l and s values to give the total L and S, explain why filled subshells don’t contribute to the magnetic properties of an atom.
14. If an atom in its ground state has S = 0, can you infer whether it has an even or an odd number of electrons?
What if L = 0?
15. The L atomic shell actually contains three distinct levels: a 2s level and two 2p levels (a fine-structure doublet). If we look carefully at the Kα X ray under high resolution, we see two, not three, different components. Explain this discrepancy.
16. The Kα energies computed using Eq. 8.4 are about 0.1% low for Z = 20, 1% low for Z = 40, and 10% low for Z = 80. Why does the simple theory fail for large Z? Could it be because the screening effect has not been handled correctly and that $Z_\mathrm{eff}$ is not Z − 1? If not, can you suggest an alternative reason?
17. The first excited state in sodium is a fine-structure doublet; the wavelengths emitted in the decay of these states are 589.59 nm and 589.00 nm, a difference of 0.59 nm. The excited $4s^1$ state in sodium (see Figure 8.4) decays to the 3p doublet with the emission of radiation at the wavelengths 1138.15 nm and 1140.38 nm, a difference of 2.23 nm. Explain how the 3p fine structure can give a wavelength difference of 0.59 nm in one case and 2.23 nm in the other case.
18. Suppose we had a three-level atom, like that of Figure 8.21, in which the metastable state were the higher excited state; the lasing transition would then be the upper transition. Does this atom solve the problem of absorption of the lasing transition? Would such an atom make a good laser?
19. How does a laser beam differ from a point source of light? Contrast the change in beam intensity with distance from the source for a laser and a point source.
20. Explain what is meant by a population inversion and why it is necessary for the operation of a laser.
21. How could you demonstrate that laser light is coherent? What would be the result of the same experiment using an ordinary monochromatic source? A white light source?

Problems

8.1 The Pauli Exclusion Principle

1. (a) List the six possible sets of quantum numbers $(n, l, m_l, m_s)$ of a 2p electron. (b) Suppose we have an atom such as carbon, which has two 2p electrons.
Ignoring the Pauli principle, how many different possible combinations of quantum numbers of the two electrons are there? (c) How many of the possible combinations of part (b) are eliminated by applying the Pauli principle? (d) Suppose carbon is in an excited state with configuration $2p^1 3p^1$. Does the Pauli principle restrict the choice of quantum numbers for the electrons? How many different sets of quantum numbers are possible for the two electrons?
2. Nitrogen (Z = 7) has three electrons in the 2p level (in addition to two electrons each in the 1s and 2s levels). (a) Consistent with the Pauli principle, what is the maximum possible value of the total $m_s$ of all seven electrons? (b) List the quantum numbers of the three 2p electrons that result in the largest total $m_s$. (c) If the electrons in the 2p level occupy states that maximize $m_s$, what would be the maximum possible value for the total $m_l$? (d) What would be the maximum possible total $m_l$ if the three 2p electrons were in states that did not maximize $m_s$?
3. (a) How many different sets of quantum numbers $(n, l, m_l, m_s)$ are possible for an electron in the 4f level? (b) Suppose a certain atom has three electrons in the 4f level. What is the maximum possible value of the total $m_s$ of the three electrons? (c) What is the maximum possible total $m_l$ of three 4f electrons? (d) Suppose an atom has ten electrons in the 4f level. What is the maximum possible value of the total $m_s$ of the ten 4f electrons? (e) What is the maximum possible total $m_l$ of ten 4f electrons?

8.2 Electronic States in Many-Electron Atoms

4. (a) Suppose a beryllium atom (Z = 4) absorbs energy (such as from a beam of photons) that pushes one of the electrons to an excited state. If the photon energy is set at the minimum necessary for this to occur, from which subshell does the electron make the transition and to which subshell does it jump? (b) Suppose the same experiment is done with neon (Z = 10).
At the minimum energy for absorption, from which subshell does the electron make the transition and to which subshell does it jump? (c) Would you expect the minimum absorption energy for beryllium to be larger or smaller than the minimum energy for neon? Explain.
5. (a) List all elements with a $p^3$ configuration. (b) List all elements with a $d^7$ configuration.
6. Give the electronic configuration of (a) P; (b) V; (c) Sb; (d) Pb.
7. (a) What is the electronic configuration of Fe? (b) In its ground state, what is the maximum possible total $m_s$ of its electrons? (c) When the electrons have their maximum possible total $m_s$, what is the maximum total $m_l$? (d) Suppose one of the d electrons is excited to the next highest level. What is the maximum possible total $m_s$, and when $m_s$ has its maximum total what is the maximum total $m_l$?

8.3 Outer Electrons: Screening and Optical Transitions

8. The ground state of singly ionized lithium (Z = 3) is $1s^2$. Use the electron screening model to predict the energies of the $1s^1 2p^1$ and $1s^1 3d^1$ excited states in singly ionized lithium. Compare your predictions with the measured energies (respectively −13.4 eV and −6.0 eV).
9. The ground state of neutral beryllium (Z = 4) is $1s^2 2s^2$. Use the electron screening model to predict the energies of the following excited states: $1s^2 2s^1 3p^1$ (measured −2.02 eV) and $1s^2 2s^1 4d^1$ (−0.90 eV).
10. Using the wavelengths given in Figure 8.4, compute the energy difference between the 3d and 4d states in lithium; do the same for sodium. Compare those values with the corresponding n = 4 to n = 3 energy difference in hydrogen. Why is the agreement so good, considering the different values of Z?
11. (a) Using the information for lithium given in Figure 8.4, compute the energy difference of the 3p and 3d states. (b) Compute the energy of the 3s, 4s, and 5s states above the ground state. (c) The ionization energy of lithium in its ground state is 5.39 eV. What is the ionization energy of the 2p state?
Of the 3s state?

8.5 Inner Electrons: Absorption Edges and X Rays

12. A certain element emits a Kα X ray of wavelength 0.1940 nm. Identify the element.
13. Compute the Kα X ray energies of calcium (Z = 20), zirconium (Z = 40), and mercury (Z = 80). Compare with the measured values of 3.69 keV, 15.8 keV, and 70.8 keV. (See Question 16.)

8.6 Addition of Angular Momenta

14. Chromium has the electron configuration $4s^1 3d^5$ beyond the inert argon core. What are the ground-state L and S values?
15. Use Hund’s rules to find the ground-state L and S of (a) Ce, configuration [Xe]$6s^2 4f^1 5d^1$; (b) Gd, configuration [Xe]$6s^2 4f^7 5d^1$; (c) Pt, configuration [Xe]$6s^1 4f^{14} 5d^9$.
16. Using Hund’s rules, find the ground-state L and S of (a) fluorine (Z = 9); (b) magnesium (Z = 12); (c) titanium (Z = 22); (d) iron (Z = 26).
17. A certain excited state of an atom has the configuration $4d^1 5d^1$. What are the possible L and S values?
18. Use the degeneracies of the states with all possible total L and S to find how many different levels the $2p^1 3p^1$ excited state of carbon includes. (See Figure 8.18.) Compare this result with the result of counting the individual $m_l$ and $m_s$ values from Problem 1(d). (See also Question 11.)

8.7 Lasers

19. A small helium-neon laser produces a light beam with an average power of 3.5 mW and a diameter of 2.4 mm. (a) How many photons per second are emitted by the laser? (b) What is the amplitude of the electric field of the light wave? Compare this result with the electric field at a distance of 1 m from an incandescent light bulb that emits 100 W of visible light.

General Problems

20. (a) How many different possible ways are there to assign the sets of quantum numbers to the four 2p electrons in oxygen (Z = 8)? (b) List all possible values of the total $m_s$ for the four electrons. (c) List all possible values of the total $m_l$ of the four electrons. (d) If the total $m_s$ has its largest possible value, what are the possible values of the total $m_l$?
(e) If the total $m_l$ has its largest possible value, what are the possible values of the total $m_s$?
21. (a) The ionization energy of sodium is 5.14 eV. What is the effective charge seen by the outer electron? (b) If the 3s electron of a sodium atom is moved to the 4f state, the measured binding energy is 0.85 eV. What is the effective charge seen by an electron in this state?
22. Draw a Moseley plot, similar to Figure 8.16, for the Kβ X rays using the following energies in keV:

Element        Ne     P      Ca     Mn     Zn     Br     Zr     Rh     Sn
Energy (keV)   0.858  2.14   4.02   6.51   9.57   13.3   17.7   22.8   28.4

Determine the slope and compare with the expected value. (Equation 8.4 applies only to Kα X rays; you will need to derive a similar equation for the Kβ X rays.) Determine the Z-axis intercept and give its interpretation.
23. Draw a Moseley plot, similar to Figure 8.16, for the Lα X rays using the following energies in keV:

Element        Mn     Zn     Br     Zr     Rh     Sn     Cs     Nd
Energy (keV)   0.721  1.11   1.60   2.06   2.89   3.71   4.65   5.72

Give interpretations of the slope and intercept.
24. Because of the fine-structure splitting of the 3p state, the 3p → 3s transition in sodium actually consists of two closely spaced lines of wavelengths 589.00 nm and 589.59 nm. Assuming a magnetic moment of one Bohr magneton, find the effective magnetic field that produces the fine-structure splitting of the 3p state of sodium.
25. (a) What is the longest wavelength of the absorption spectrum of lithium? (b) What is the longest wavelength of the absorption spectrum of helium? In what region of the spectrum does this occur? (c) What are the shortest wavelengths in the absorption spectra of helium and lithium? In what region of the electromagnetic spectrum are these?
26. Using the wavelengths given in Figure 8.17, compute the energy difference between the $1s^1 4p^1$ and $1s^1 3p^1$ singlet (S = 0) states in helium. Compare this energy difference with the value expected using the Bohr model, assuming that the p electron is screened by the s electron.
Repeat the calculation for the 3d and 4d triplet (S = 1) states.

Chapter 9 MOLECULAR STRUCTURE

Molecules range from the simple with only two atoms to very complex organic molecules such as DNA. The photo shows a computer model of C₆₀, a spherical arrangement of 60 carbon atoms in pentagons and hexagons, known as a “buckyball.”

In this chapter we consider the combination of atoms into molecules, the excited states of molecules, and the ways that molecules can absorb and emit radiation. From a variety of experiments we learn that the spacing of atoms in molecules is of the order of 0.1 nm, and that the binding energy of an atom in a molecule is of the order of electron-volts. This spacing and binding energy are characteristic of electronic orbits, which suggests that the forces that bind molecules together originate with the electrons. The negatively charged electrons provide the binding that overcomes the Coulomb repulsion of the positively charged nuclei of the atoms in the molecule.

When atoms are brought together to form molecules, the atomic states of the electrons change into molecular states. These states are filled in the order of increasing energy by the valence electrons of the atoms of the molecule. The probability densities of the occupied molecular states determine the nature of molecular bonds and the structure and properties of molecules, including their geometrical shapes.

Just as we began to study atomic physics by looking at the simplest atom, we begin our study of molecular physics with the simplest molecule, H₂⁺, the singly ionized hydrogen molecule. We next turn to other simple molecules, such as H₂ and NaCl, and finally we look at how our previous knowledge of atomic wave functions can help us to understand the molecular states that form the basis of organic chemistry.

We will also study ways, other than electronic excitations, in which molecules can absorb and emit electromagnetic radiation.
These radiations give a distinctive signature of the molecule and its structure. Molecular spectroscopy, the study of these radiations, finds application in such diverse areas as identification of atmospheric pollutants and the search for life in outer space.

9.1 THE HYDROGEN MOLECULE

Let’s first look at how the wave functions of the atomic electrons can lead to the binding together of atoms into stable molecules. Even though the negatively charged electrons provide the attractive force that overcomes the Coulomb repulsion of the positively charged atomic nuclei, it is perhaps not immediately obvious how stable molecules form at all because there is also a Coulomb repulsion of the electrons of one atom for those of another. The key to understanding this problem is the existence of the spatial probability densities of atomic orbits, such as we calculated for hydrogen and illustrated in Chapter 7. These probability densities are frequently not spherically symmetric, and very often may show overwhelming preferences for one spatial direction over another.

A complete understanding of the effect of the electrons on molecular binding is in general made difficult by what also complicates atomic structure: there are too many electrons present for us to be able to write down and solve the equations that govern the structure of the atom or molecule. We therefore use the same tactic to study molecular structure that we used for atomic structure: we begin with a molecule that has only one electron. Such a molecule is H₂⁺, the hydrogen molecule ion, which results when we remove an electron from a molecule of ordinary hydrogen, H₂.

Before we discuss the wave mechanical properties of H₂⁺, let’s try to guess what holds this molecule together. We first realize that it is not correct to think of H₂⁺ as an atom of hydrogen (proton plus electron) joined to a second proton.
The atom of hydrogen in such a combination is electrically neutral, so there is no electrostatic Coulomb force to hold the two pieces together. In this kind of molecule, at least, it is apparently not correct to identify the electron as belonging exclusively to one or the other of the components. The electron must somehow be shared between the two parts. The electron must spend a significant part of its time in the region between the two protons. In the language of quantum mechanics, the electron’s probability density must have a large value in that region.

FIGURE 9.1 The electron wave functions for two hydrogen atoms separated by a large distance.

As we learned in Chapter 7, an electron in the ground state of hydrogen has an energy of −13.6 eV, a wave function $\psi = (\pi a_0^3)^{-1/2} e^{-r/a_0}$, where $a_0$ is the Bohr radius, and a probability density proportional to $\psi^2$. Figure 9.1 shows the wave function for an electron that could be bound to either of two protons separated by a large distance. As we bring the two protons closer together, the wave functions begin to overlap, and we must combine them according to the rules of quantum mechanics: first add the wave functions, then square the result to find the combined probability density. (Note that this gives a very different result from first squaring, then adding.)

We can combine these two wave functions in two different ways, depending on whether they have the same signs or opposite signs. The absolute sign of a wave function is arbitrary. When we calculate the normalization constant of a wave function, we actually compute its square. We could choose either the positive or the negative root; for convenience we usually choose the positive one. When we calculate probability densities $\psi^2$ for a single wave function, the choice of sign becomes irrelevant.
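The difference between "add, then square" and "square, then add" can be made concrete with a short numerical sketch. It evaluates two hydrogen 1s wave functions on the line joining the protons, at the point midway between them. The separation of 0.106 nm is only an illustrative choice, the combined wave functions are left unnormalized, and the function and variable names are assumptions made for this example.

```python
import math

A0 = 0.0529  # Bohr radius in nm

def psi_1s(x, center):
    """Hydrogen 1s wave function on the internuclear axis (units nm^-3/2)."""
    r = abs(x - center)
    return math.exp(-r / A0) / math.sqrt(math.pi * A0**3)

R = 0.106      # illustrative proton separation in nm
x_mid = 0.0    # point midway between protons located at -R/2 and +R/2

psi1 = psi_1s(x_mid, -R / 2)
psi2 = psi_1s(x_mid, +R / 2)

add_then_square = (psi1 + psi2) ** 2     # same-sign combination
square_then_add = psi1**2 + psi2**2      # no interference term
difference_squared = (psi1 - psi2) ** 2  # opposite-sign combination

print(add_then_square / square_then_add)  # ratio at the midpoint
print(difference_squared)
```

At the midpoint the two wave functions are equal, so the same-sign combination gives twice the density obtained by squaring first, while the opposite-sign combination gives zero.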
However, when we combine different wave functions, their relative signs determine whether the two functions add or subtract, which can result in very different probability densities.

Consider the two different combinations of wave functions shown in Figure 9.2. In one case (Figure 9.2a), the two wave functions have the same sign, and in the other case (Figure 9.2b) they have opposite signs. This has a substantial effect on the probability distributions, which are shown in Figure 9.3. The probability density obtained from squaring $\psi_1 + \psi_2$ (Figure 9.3a) has relatively large values in the region between the two protons. This suggests a concentration of negative charge between the protons, which can supply the Coulomb attraction to pull the two protons together and form a stable molecule. The square of $\psi_1 - \psi_2$ (Figure 9.3b), however, gives a vanishing probability density midway between the protons and thus a small density of negative charge in the region between the protons. There is not enough negative charge to overcome the Coulomb repulsion of the protons, and as a result this combination of wave functions does not lead to the formation of a stable molecule.

FIGURE 9.2 The overlap of two hydrogenic wave functions. The wave functions are indicated by the dashed lines, and their sum by the solid line. In (a), the two wave functions have the same sign, while in (b) they have opposite signs.

FIGURE 9.3 The probability densities corresponding to the two combined wave functions of Figure 9.2: (a) $|\psi_1 + \psi_2|^2$; (b) $|\psi_1 - \psi_2|^2$.

Binding Energy of H₂⁺

There are two contributions to the energy of the H₂⁺ molecule: the Coulomb repulsion of the two positively charged protons for each other, and the attraction of the combination of the two protons for the negatively charged electron.
The Coulomb repulsion energy of the protons is positive, and the energy of the attraction of the electron by the protons is negative. For a stable molecule to form, the total energy must be negative, so the critical question is whether the electron can provide enough negative energy of attraction to overcome the positive repulsion energy of the protons.

To find the conditions necessary for a stable H₂⁺ ion to form, let’s look at how the various contributions to the energy of the ion depend on the separation distance R between the two protons. The Coulomb potential energy that characterizes the repulsion of the bare protons is $U_p = e^2/4\pi\varepsilon_0 R$; this function is plotted in Figure 9.4.

To find the electron energy as a function of R, we first consider the case when the two protons are very far apart. In this case the electron is in the ground state orbit about one of the protons, for which E = −13.6 eV. As we bring the protons together, the electron becomes more tightly bound (because it is attracted by both protons) and its energy becomes more negative. As R → 0, the system approaches a single atom with Z = 2. For the wave function $\psi_1 + \psi_2$ (Figure 9.2a), the combined wave function has a maximum at R = 0 and resembles the ground-state wave function of an atom with Z = 2. Recalling the result from Chapter 6 for the electron energy in a hydrogen-like atom,

$$E_n = (-13.6\ \mathrm{eV})\,\frac{Z^2}{n^2} \tag{9.1}$$

where n is the principal quantum number, we find the energy of a ground-state electron for Z = 2 to be −54.4 eV. The energy corresponding to the sum of the two wave functions, which we label $E_+$, therefore has the value −13.6 eV at large R and approaches the value −54.4 eV at small R. The result of an exact calculation of $E_+$ is shown in Figure 9.4.

FIGURE 9.4 Dependence of energy on separation distance R for H₂⁺. (Curves shown: $U_p$, $E_+$, $E_-$, $U_p + E_+$, and $U_p + E_-$, with energy in eV plotted against R in nm; the minimum of $U_p + E_+$ lies 2.7 eV below −13.6 eV.)
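The two limiting values of $E_+$ and the proton repulsion energy can be evaluated directly from Eq. 9.1 and the expression for $U_p$. The sketch below is a minimal illustration; it uses the standard combination $e^2/4\pi\varepsilon_0 \approx 1.440$ eV·nm, and the function names are chosen here for convenience rather than taken from the text.

```python
COULOMB_EV_NM = 1.440  # e^2 / (4 pi eps0) in eV*nm (standard value)

def hydrogenlike_energy_ev(Z, n):
    """Eq. 9.1: E_n = (-13.6 eV) Z^2 / n^2."""
    return -13.6 * Z**2 / n**2

def proton_repulsion_ev(R_nm):
    """U_p = e^2 / (4 pi eps0 R) for two protons a distance R apart."""
    return COULOMB_EV_NM / R_nm

# Limits of E_+ : isolated hydrogen atom (large R) and Z = 2 atom (R -> 0).
E_plus_large_R = hydrogenlike_energy_ev(Z=1, n=1)
E_plus_small_R = hydrogenlike_energy_ev(Z=2, n=1)
print(E_plus_large_R, E_plus_small_R)

# Repulsion energy at the separations marked on the axis of Figure 9.4.
for R in (0.1, 0.2, 0.3):
    print(f"R = {R:.1f} nm: U_p = {proton_repulsion_ev(R):.1f} eV")
```

The repulsion energy grows without limit as R shrinks, whereas $E_+$ is bounded below by −54.4 eV, which is why the total energy must eventually rise at small separations.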
For the combination corresponding to the difference between the two wave functions, the energy is once again −13.6 eV for large R. As R → 0, the combined wave function approaches 0 (Figure 9.2b). The lowest energy level with a wave function that vanishes at R = 0 is the 2p state, for which the energy in a Z = 2 hydrogenlike atom is −13.6 eV. The energy $E_-$ corresponding to the wave function $\psi_1 - \psi_2$ therefore has the value −13.6 eV for both large R and small R. Its exact form is shown in Figure 9.4.

The total energy of the hydrogen molecule ion is the sum of the proton energy $U_p$ and the electron energy $E_+$ or $E_-$. These two sums are also plotted in Figure 9.4. You can see that the combination $U_p + E_-$ has no minimum and therefore no stable bound state. The wave function $\psi_1 - \psi_2$ does not lead to a stable configuration for the hydrogen molecule ion, just as we originally suspected.

The sum $U_p + E_+$ gives the stable configuration of the ion, for which the equilibrium condition occurs at the point where $U_p + E_+$ has its minimum value. The minimum occurs at a separation $R_\mathrm{eq} = 0.106$ nm and an energy of −16.3 eV. The binding energy B of H₂⁺ is the energy necessary to take apart the ion into H and H⁺ and corresponds to the depth of the potential energy minimum of $U_p + E_+$ in Figure 9.4:

$$B = E(\mathrm{H} + \mathrm{H^+}) - E(\mathrm{H_2^+}) = -13.6\ \mathrm{eV} - (-16.3\ \mathrm{eV}) = 2.7\ \mathrm{eV} \tag{9.2}$$

Note that we have defined molecular binding energy as the energy difference between the separate components (H and H⁺) and the combined system (H₂⁺).

It is interesting to note that the stability is achieved at $R_\mathrm{eq} = 2a_0$. In Chapter 7 we learned that the radial probability density for the 1s state of hydrogen has its maximum value at $r = a_0$. Thus the stable configuration of the H₂⁺ ion is such that the maximum in the radial probability density for a single H atom would fall exactly in the middle of the molecule!
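The numbers in this passage can be cross-checked with a few lines of arithmetic. The sketch below recomputes the binding energy of Eq. 9.2, compares $R_\mathrm{eq}$ with $2a_0$, and infers the electron energy $E_+$ at equilibrium by subtracting the proton repulsion from the total. The constants $a_0 = 0.0529$ nm and $e^2/4\pi\varepsilon_0 = 1.440$ eV·nm are standard values; the variable names are illustrative.

```python
A0_NM = 0.0529         # Bohr radius in nm
COULOMB_EV_NM = 1.440  # e^2 / (4 pi eps0) in eV*nm

R_EQ_NM = 0.106        # equilibrium separation of H2+ quoted in the text
E_MIN_EV = -16.3       # minimum of U_p + E_+ quoted in the text
E_SEPARATED_EV = -13.6 # energy of H + H+ (one ground-state hydrogen atom)

# Eq. 9.2: binding energy is the depth of the minimum below the separated system.
binding_ev = E_SEPARATED_EV - E_MIN_EV

# Proton repulsion at equilibrium, and the electron energy it implies.
U_p_eq = COULOMB_EV_NM / R_EQ_NM
E_plus_eq = E_MIN_EV - U_p_eq

ratio_to_a0 = R_EQ_NM / A0_NM

print(f"B = {binding_ev:.1f} eV")
print(f"U_p(R_eq) = {U_p_eq:.1f} eV, implied E_+(R_eq) = {E_plus_eq:.1f} eV")
print(f"R_eq / a0 = {ratio_to_a0:.2f}")
```

The implied electron energy at equilibrium, close to −30 eV, falls between the limiting values of −13.6 eV and −54.4 eV found earlier for $E_+$, as it should.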
This is once again consistent with our expectations for the structure of H₂⁺: the electron must spend most of its time between the two protons.

In summary, from our study of this simple molecule we have learned that an important feature of molecular bonding concerns the sharing of a single electron by two atoms of the molecule. This sharing is responsible for the stability of the molecule. With this in mind we can now add a second electron and consider the H₂ molecule.

The H₂ Molecule

Suppose we have two hydrogen atoms separated by a very large distance. Associated with each atom there is a 1s electronic state, at an energy of −13.6 eV, because the atoms are so far apart that there is no interaction between the electrons. As we bring the atoms closer together to form a H₂ molecule, the electron wave functions begin to overlap, so that the electrons are “shared” between the two atoms. As we have seen in the previous discussion, this can occur in such a way that the two electron wave functions add in the region between the two protons, giving a stable molecule, or subtract, leading to no stable molecule.

The separate, individual electronic states of the atoms now become molecular states. Notice that, as shown in Figure 9.5, the number of states does not change as the separation R is reduced. When the atoms are separated by a large distance, there are two states, each at −13.6 eV, so the total energy at R = ∞ is −27.2 eV. When the separation is reduced, there are still two states, but now at different energies. One state corresponds to the sum of the two wave functions and leads to a stable H₂ molecule; the other state corresponds to the difference of the two wave functions and does not give a stable molecule. The molecular state that leads to a stable molecule is known as a bonding state, and the one that does not lead to a stable molecule is an antibonding state.

FIGURE 9.5 Energy of different combinations of wave functions in H₂. (Levels shown: both states at −27.2 eV for R = ∞; the antibonding state $\psi_1 - \psi_2$ above and the bonding state $\psi_1 + \psi_2$ below.)

As we found previously for H₂⁺, in order to form a molecule, the electron probability distribution must be large in the region between the two protons. In the case of H₂, this is true for both electrons, and it is certainly our expectation, based on the Pauli principle, that for the two electrons both to occupy the molecular state leading to the large probability in the central region, their spins must be oppositely directed; that is, one must have $m_s = +\tfrac{1}{2}$ and the other $m_s = -\tfrac{1}{2}$. As long as the two electrons have opposite spins, they can both occupy the bonding state, leading to a stable molecule.

The energy of the bonding state for H₂ is shown in Figure 9.6; as you can see, there is a minimum with E = −31.7 eV at R = 0.074 nm. The molecular binding energy of H₂ is the difference between the energy of the separated neutral H atoms and the energy of the combined system:

$$B = E(\mathrm{H} + \mathrm{H}) - E(\mathrm{H_2}) = 2(-13.6\ \mathrm{eV}) - (-31.7\ \mathrm{eV}) = 4.5\ \mathrm{eV} \tag{9.3}$$

Comparing Figures 9.4 and 9.6, you can immediately see the effect of adding an additional electron to H₂⁺: the binding energy is greater (the molecule is more tightly bound), and the protons are drawn closer together. Both of these effects are due to the presence of the increased electron density in the region between the two protons.

FIGURE 9.6 Bonding and antibonding in H₂. (Energy in eV plotted against R in nm; the bonding curve has its minimum at 0.074 nm, 4.5 eV below −27.2 eV.)

We can also understand why He does not form the molecule He₂: as two He atoms are brought together, the bonding and antibonding states are formed in much the same way as with H₂. The He₂ molecule would have four electrons; at most two can be in the bonding state, so the other two must be in the antibonding state. The net effect is that no stable molecule forms.
(However, He₂⁺ is stable, with two bonding electrons and only one antibonding electron. The binding energy of He₂⁺ is 3.1 eV and the separation is 0.108 nm, remarkably close to the corresponding values of H₂⁺.)

9.2 COVALENT BONDING IN MOLECULES

The sharing of electrons in a molecule such as H₂ is the origin of the covalent bond; this type of bonding occurs commonly in molecules containing two identical atoms, in which case it is called homopolar or homonuclear bonding. The essential features of covalent bonding are:

1. As two atoms are brought together, the electrons interact and the separate atomic states and energy levels are transformed into molecular states.
2. In one of the molecular states, the electron wave functions overlap in such a way as to give a lower energy than the separated atoms had; this is the bonding state that leads to the formation of stable molecules.
3. The other molecular state (the antibonding state) has an increased energy relative to the separated atoms and does not lead to the formation of stable molecules.
4. The restrictions of the Pauli principle apply to molecular states just as they do to atomic states; each molecular state has a maximum occupancy of two electrons, corresponding to the two different orientations of electron spin.

Other hydrogenlike atoms with a single s electron can also form stable molecules through covalent bonding. For example, two Li atoms (Z = 3, configuration $1s^2 2s^1$) can form a molecule of Li₂. The four 1s electrons (two from each atom) fill the 1s bonding and antibonding states, and the remaining two 2s electrons can both occupy the 2s bonding state. The binding energy of Li₂ is 1.10 eV, which is considerably smaller than the binding energy of H₂ (4.52 eV), and the equilibrium separation distance of the atoms in the molecule is 0.267 nm, much larger than that of H₂ (0.074 nm). Other homonuclear molecules formed from s-state bonds are listed in Table 9.1.
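The electron-counting argument used for H₂, He₂, He₂⁺, and Li₂ can be sketched as a simple filling procedure. The code below places electrons two at a time into an assumed ordering of molecular states (1s bonding, 1s antibonding, 2s bonding, 2s antibonding) and reports the number of bonding electrons minus the number of antibonding electrons. Treating a positive net count as a sign of stability is a simplification adopted for this illustration, not a result derived in the text, and all names are hypothetical.

```python
# Assumed ordering of molecular states, lowest energy first.
# Each entry is (label, is_bonding); each state holds at most two electrons.
LEVELS = [
    ("1s bonding", True),
    ("1s antibonding", False),
    ("2s bonding", True),
    ("2s antibonding", False),
]

def fill_states(n_electrons, levels=LEVELS):
    """Fill states in order of increasing energy, two electrons per state."""
    occupancy = []
    remaining = n_electrons
    for label, bonding in levels:
        n = min(2, remaining)
        occupancy.append((label, bonding, n))
        remaining -= n
    return occupancy

def net_bonding_electrons(occupancy):
    """Bonding electrons minus antibonding electrons."""
    return sum(n if bonding else -n for _, bonding, n in occupancy)

molecules = {"H2+": 1, "H2": 2, "He2+": 3, "He2": 4, "Li2": 6}
net = {name: net_bonding_electrons(fill_states(n)) for name, n in molecules.items()}

for name, value in net.items():
    print(f"{name}: net bonding electrons = {value}")
```

In this counting He₂ comes out with no net bonding electrons, while He₂⁺ and H₂⁺ each have one, which matches the qualitative conclusions stated above.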
It is customary to characterize the molecular bond strength in terms of the dissociation energy rather than the binding energy; the two terms are usually equivalent and indicate the energy needed to break the molecule into neutral atoms. The dissociation energy is weakly temperature dependent. Some tabulations list the values at room temperature (as in Table 9.1), while others list values at 0 K. The room-temperature values are higher than the 0 K values by about 1.5kT = 0.04 eV.

As Z increases, meaning that the s electrons are associated with increasing principal quantum numbers n, the dissociation energy decreases and the equilibrium separation increases. This is consistent with the behavior of the s electron in atoms as n increases: as Figure 8.7 shows, the radius of the orbit of the s electron increases with increasing n for the alkali elements.

TABLE 9.1 Properties of s-Bonded Molecules*

Molecule   Dissociation Energy (eV)   Equilibrium Separation (nm)
H2         4.52                       0.074
Li2        1.10                       0.267
Na2        0.80                       0.308
K2         0.59                       0.392
Rb2        0.47                       0.422
Cs2        0.43                       0.450
LiH        2.43                       0.160
LiNa       0.91                       0.281
NaH        2.09                       0.189
KNa        0.66                       0.347
NaRb       0.61                       0.359

* Values taken from the Handbook of Chemistry and Physics and the American Institute of Physics Handbook.

We can also form molecular bonds between two different elements that each have a single s valence electron (hydrogen or the alkali elements). Some of these are listed in Table 9.1. The dissociation energies and equilibrium separations are consistent with those of the corresponding homonuclear molecules. For example, the dissociation energy and equilibrium separation of LiH are roughly midway between those of H2 and Li2.

Atoms with valence electrons in p states can also form diatomic molecules through covalent bonds; oxygen and nitrogen are examples.
There are three atomic p states, so there will be six molecular p states, and the classification of levels can become quite tedious, but we can understand the structure of molecules composed of atoms with p electrons based on the geometry of atomic p states.

In Chapter 7 we solved the Schrödinger equation for the H atom and showed the spatial probability distributions for the various possible electronic wave functions. Of course, these solutions for hydrogen will not be correct for other atoms, but the essential features of the geometry of the atomic states remain correct. We identified three different p states, corresponding to ml = −1, 0, and +1. The probability distributions corresponding to these ml values were shown in Figure 7.11. We can imagine these distributions to have a sort of “figure-eight” shape with two distinct lobes of large probability. In the ml = 0 case, the figure eight has its long axis along the z axis, and the two lobes of maximum probability occur in the +z and −z directions. In the ml = ±1 cases, the probability distribution can be regarded as arising from a figure-eight probability distribution in the xy plane that is rotating about the z axis, counterclockwise for ml = +1 and clockwise for ml = −1. Because of the uncertainty principle, we can’t observe the two probability lobes in the xy plane; all we can observe is the smeared-out “donut-shaped” distribution shown in Figure 7.11.

FIGURE 9.7 Probability distributions of three different p electrons.

For our purposes here, it is not as convenient to use the ml notation as it is to use a different representation in which we assign each of the three possible p states a label that gives the direction in space corresponding to the lobes of maximum probability. Thus pz is the state with regions of large electron probability along the z axis, and similarly for px and py.
Figure 9.7 shows a schematic representation of these probability distributions. (The pz state corresponds exactly to ml = 0; px and py correspond to mixtures of ml = +1 and ml = −1.) Just as the uncertainty principle does not allow us to observe the two lobes of probability in the xy plane, it also forbids us from observing the separate px and py probability distributions. However, the distributions do exist (even though we can’t observe them), and two atoms can interact with one another by means of these electron clouds.

We consider the structure of molecules containing p electrons based on this model of the three mutually perpendicular p states px, py, and pz. We discuss three applications of this type of covalent bonding: pp bonds, sp directed bonds, and sp hybrid states.

pp Covalent Bonds

Consider what happens when we bring together two p-shell atoms, whose probability distributions are each similar to Figure 9.7. We assume that the atoms approach along the x axis, as in Figure 9.8. As the atoms are brought together, the px states overlap (Figure 9.9a), giving (if the two wave functions add) an increased electron charge density between the two nuclei and contributing to the bonding of the atoms in the molecule. There is a much weaker overlap between the py states (Figure 9.9b) and also between the pz states (which are not shown in the figure). Because the overlap of the py states is not along the line connecting the nuclei, there are components to the binding force that oppose one another, and only a much smaller resultant force acts along the line connecting the nuclei (Figure 9.9b). In addition, there is less overlap of the py states. The net result is that the py states (and also the pz states) are less effective in binding the molecule than the px states. This somewhat oversimplified model suggests that the px state should have a much greater bonding effect (and also a greater antibonding effect) than the py and pz states.
It also suggests that the bonding and antibonding effects of py states should be the same as those of pz states.

FIGURE 9.8 Two atoms with p electrons. The pz probability distribution, which extends perpendicular to the page, is not shown.

FIGURE 9.9 (a) Overlap of px probability distributions. The vectors indicate the force on the nuclei due to the overlap. (b) Overlap of py probability distributions. The off-axis forces give a smaller resultant force along the axis.

Now we can consider the energies of the molecular states as a function of the nuclear separation distance R. We assume that we are dealing with two atoms having filled 1s and 2s states and valence electrons in the 2p shell. When the 1s states of the two atoms overlap, the result is 1s bonding and antibonding molecular states, just as in the case of H2. There are altogether four 1s electrons in the molecule, and with two in each state the 1s bonding and antibonding molecular states are filled to capacity. The same is true of the 2s states. The atomic 2s levels form bonding and antibonding molecular states; because each atom has a filled 2s shell, the four 2s electrons fill both the bonding and antibonding molecular states.

The atoms have partially filled 2p shells, so the final molecular bonding depends critically on the molecular 2p states. For each atomic p state (px, py, pz) there are corresponding bonding and antibonding molecular states. However, the bonding and antibonding effects of these states are not equivalent, as Figure 9.9 illustrates. The p state that happens to lie along the line of approach (px) has an effect that is significantly greater than the p states that lie off the line of approach (py, pz). The px bonding state must therefore lie lower in energy than the py and pz bonding states, and the px antibonding state must lie higher in energy than the py and pz antibonding states.
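The level ordering just described can be turned into a small filling sketch. The names `LEVELS`, `fill_levels`, and `count_2p` are hypothetical, and the ordering encoded in the list is simply the one argued in the text from this simplified overlap model; it is an assumption of the sketch, not a general result for all diatomic molecules. The electron counts 14, 16, and 18 are those of N2, O2, and F2, which the example that follows works through by hand.

```python
# Molecular levels in order of increasing energy, following the ordering
# argued in the text. Each entry: (label, capacity in electrons, is_bonding).
LEVELS = [
    ("1s bonding", 2, True),
    ("1s antibonding", 2, False),
    ("2s bonding", 2, True),
    ("2s antibonding", 2, False),
    ("2p px bonding", 2, True),
    ("2p py,pz bonding", 4, True),        # two states, two electrons each
    ("2p py,pz antibonding", 4, False),   # two states, two electrons each
    ("2p px antibonding", 2, False),
]


def fill_levels(n_electrons):
    """Fill the levels from the bottom up; return (label, electrons, is_bonding)."""
    occupancy = []
    remaining = n_electrons
    for label, capacity, is_bonding in LEVELS:
        placed = min(capacity, remaining)
        occupancy.append((label, placed, is_bonding))
        remaining -= placed
    if remaining > 0:
        raise ValueError("more electrons than the listed levels can hold")
    return occupancy


def count_2p(n_electrons):
    """Return (bonding, antibonding) electron counts in the 2p molecular states."""
    occupancy = fill_levels(n_electrons)
    bonding = sum(n for label, n, b in occupancy if label.startswith("2p") and b)
    antibonding = sum(
        n for label, n, b in occupancy if label.startswith("2p") and not b
    )
    return bonding, antibonding


for total in (14, 16, 18):
    bonding, antibonding = count_2p(total)
    print(f"{total} electrons: {bonding} bonding 2p, {antibonding} antibonding 2p")
```

Grouping py and pz into a single level of capacity four reflects the statement that their bonding and antibonding effects should be the same.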
Figure 9.10 illustrates the energies of the molecular states. The relative stability of a molecule can be determined based on the filling of the bonding and antibonding states with electrons (two per state, corresponding to spin-up and spin-down electrons). The following example illustrates how these states are filled.

FIGURE 9.10 Bonding and antibonding 2p states.

Example 9.1

Based on the filling of the bonding and antibonding states, predict the relative stability of the molecules (a) N2, (b) O2, and (c) F2.

Solution

(a) Nitrogen (1s² 2s² 2p³) has seven electrons: two each in the filled 1s and 2s shells, and three electrons in the 2p shell. In the N2 molecule, there are therefore 14 electrons. We start in Figure 9.10 with two in the bonding 1s state, then two in the antibonding 1s, then two more in the bonding 2s, followed by two in the antibonding 2s for a total of eight electrons in the s states. That leaves six 2p electrons for the 2p molecular states. We can place two each in the three lowest 2p bonding molecular states, thus filling those states. No electrons go into the 2p antibonding states. With only bonding 2p electrons, N2 forms a very stable diatomic molecule.

(b) Oxygen (1s² 2s² 2p⁴) has eight electrons, so the O2 molecule has a total of 16 electrons. As in N2, the first eight electrons fill the 1s and 2s states, leaving eight additional electrons for the 2p states. The first six of those fill the three bonding states, and so the remaining two must go into 2p antibonding states. With six bonding and two antibonding valence electrons, we would expect that O2 is less stable than N2, which has only bonding valence electrons.

(c) Fluorine (1s² 2s² 2p⁵) has nine electrons, so of the 18 electrons in the F2 molecule 10 must be placed in the 2p states: six in the bonding states and four in the antibonding states.
Thus F2 should be less stable than O2.

How well do the properties of these molecules agree with our predictions? N2 has a dissociation energy of 9.8 eV and is not reactive under most circumstances. O2 has a smaller dissociation energy (5.1 eV); the O2 molecular bonds can be broken by relatively modest chemical reactions, as, for example, the oxidation of metals exposed to air. F2 has an even smaller dissociation energy (1.6 eV); fluorine gas reacts quite violently with many substances, and the F2 molecule can be broken apart by exposure to visible light (which has photon energies of 2–4 eV) in a process known as photodissociation. The properties of these 2p molecules are thus quite consistent with expectations based on the filling of the bonding and antibonding states. Similar relationships occur for the 3p, 4p, 5p, and 6p homonuclear molecules.

sp Molecular Bonds

It is often the case that a stable molecule is formed from two atoms, one with an s-state valence electron and the other with one or more p-state valence electrons. Consider, for example, the HF molecule. The F atom has five electrons in the p shell, so of the three 2p atomic states, two will each have their capacity of two electrons, and the third will have a single electron. We ignore the four paired p electrons, which do not significantly affect the molecular bonding, and concentrate instead on the single unpaired p electron. The two-lobed probability distribution corresponds to a two-lobed p-state wave function, in which the signs of ψ are opposite for the two lobes. The 1s wave function of H has only one sign (Figure 9.11).

FIGURE 9.11 Overlap of s and p wave functions.

As the H and F atoms approach each other from a large distance, the H wave function and the F wave function can combine to give an increased electron probability in the region between the nuclei, and hence a bonding sp state is formed.
It is also possible to have antibonding sp states, which result from the H and F wave functions having opposite signs and producing a reduced electron probability density between the nuclei. Table 9.2 gives dissociation energies and nuclear separation distances for some sp-bonded diatomic molecules.

TABLE 9.2 Properties of sp-Bonded Molecules

Molecule   Dissociation Energy (eV)   Equilibrium Separation (nm)
HF         5.90                       0.092
HCl        4.48                       0.128
HBr        3.79                       0.141
HI         3.10                       0.160
LiF        5.98                       0.156
LiCl       4.86                       0.202
NaF        4.99                       0.193
NaCl       4.26                       0.236
KF         5.15                       0.217
KCl        4.43                       0.267

Consider now the structure of the water molecule, H2O. Oxygen has eight electrons, four of which occupy the 2p shell. When we place these electrons in the 2p atomic states, we begin with one electron each in the px, py, and pz states, and then the fourth 2p electron must pair with one of the first three. An oxygen atom therefore has two unpaired 2p electrons, each of which can form a bond with the 1s electron of H to form a molecule of H2O. Figure 9.12 shows a representation of the electron probability distributions we might expect for an oxygen atom and for a molecule of H2O. Such a molecule has directed bonds, which have a fixed, measurable relative direction in space. The expected angle between the two bonds is 90°; this angle can be measured experimentally by, for example, measuring the electric dipole moment of the molecule, and the result, 104.5°, is somewhat larger than we expect. This discrepancy can be interpreted as arising from the Coulomb repulsion of the two H atoms, which tends to spread the bond angle somewhat.

As another example, consider the NH3 (ammonia) molecule. With Z = 7, the nitrogen atom has three unpaired p electrons, one each in the px, py, and pz atomic states. Each of these can form a bond with a H atom to form the NH3 molecule, and we expect to find three mutually perpendicular sp bonds (Figure 9.13).
The measured bond angle is 107.3°, again indicating some repulsion between the H atoms. Table 9.3 lists some bond angles measured for other molecules that have sp directed bonds. As you can see, the bond angle does indeed approach 90° in many cases. Based on the discussion given above, you should be able to explain why this happens as the Z of the central atom increases.

FIGURE 9.12 Overlap of electronic wave functions in H2O.

sp Hybrid States

One example of a 2p atom we have so far not considered is carbon, and for a special reason: carbon forms a great variety of molecular bonds, with a resulting diversity in the type and complexity of molecules containing carbon. It is this diversity that is the basis for the many kinds of organic molecules that can form, based on various kinds of carbon molecular bonds, and so an understanding of the physics of carbon molecular bonds is essential to the understanding of many fundamental questions of structure and processes in molecular biology.

Carbon, with six electrons, has the configuration 1s² 2s² 2p², so we expect carbon under ordinary circumstances to show a valence of 2, with the two 2p electrons contributing to the structure, and we might therefore expect to form stable molecules such as CH2, with directed sp bonding (similar to H2O) and a bond angle of roughly 90°. Instead, what forms is CH4 (methane) in a tetrahedral structure (Figure 9.14), with four equivalent bonds.

For another example, the elements of the third column of the periodic table (boron, aluminum, gallium, . . .) have the outer configuration ns² np (n = 2 for boron, n = 3 for aluminum, etc.), and we expect these elements to form compounds as if they had a single valence p electron. We therefore expect halides such as BCl or GaF, oxides such as B2O or Al2O, nitrides such as B3N or Al3N, hydrides such as BH or GaH, and so forth.
Instead we find that boron, aluminum, and gallium generally behave as if they had three valence electrons, and form compounds such as BCl3, Al2O3, AlN, and B2H6. Furthermore, the three valence electrons seem to be equivalent; there seems to be no way, for example, to associate two of the valence electrons with s states and one with a p state. The bonds formed by the three electrons make equivalent angles of 120° with one another.

It is the effect of sp hybridization that is responsible for the valence of three (rather than one) in boron and four (rather than two) in carbon. The four bonds in CH4 are equivalent and identical, which would not be expected if we had two ss bonds and two sp bonds; similarly, in BF3 or BCl3, the three bonds are identical and are clearly not identified with two sp bonds and one pp bond.
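The angles between equivalent bonds can be checked with a short geometric sketch. It assumes, as a simplification that is not taken from the text, that n equivalent bonds point along unit vectors that all make the same angle θ with one another and sum to zero; taking the dot product of that sum with any one of the vectors gives 1 + (n − 1) cos θ = 0. The function name `equivalent_bond_angle` is hypothetical.

```python
import math


def equivalent_bond_angle(n_bonds):
    """Angle in degrees between any two of n equivalent, symmetric bonds.

    Assumes unit bond vectors with equal mutual angles that sum to zero,
    so that cos(theta) = -1 / (n - 1). In three dimensions this
    arrangement exists only for n = 2, 3, or 4.
    """
    if n_bonds not in (2, 3, 4):
        raise ValueError("equal mutual angles are possible only for 2, 3, or 4 bonds")
    return math.degrees(math.acos(-1.0 / (n_bonds - 1)))


for n in (3, 4):
    print(f"{n} equivalent bonds: {equivalent_bond_angle(n):.1f} degrees")
```

For three bonds the sketch reproduces the 120° angle quoted above for the boron-group compounds; for four bonds it gives the tetrahedral angle of about 109.5°, the geometry of CH4.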